The neuroimaging and neurophysiological literature on inner speech in healthy participants and those who experience auditory verbal hallucinations (AVHs) is reviewed. AVH-hearers in remission and controls do not differ neurologically on tasks involving low levels of verbal self-monitoring (VSM), such as reciting sentences in inner speech. In contrast, on tasks involving high levels of VSM, such as auditory verbal imagery, AVH-hearers in remission show less activation in areas including the middle and superior temporal gyri. This pattern of findings leads to a conundrum, given that mentation involving low levels of VSM is typically held to form the raw material for AVHs. We address this by noting that existing neuroimaging and neurophysiological studies have been based on unexamined assumptions about the form and developmental significance of inner speech. We set out a Vygotskian approach to AVHs that can account for why they are generally experienced as the voice of another person, have specific acoustic properties, and tend to take the form of commands. On this approach, which we argue is consistent with the evidence on neural correlates, AVHs result from abnormalities in the transition between condensed and expanded dialogic inner speech. Further potential empirical tests of this model are discussed.