By any standard we human beings are unusual creatures. Just for a start, our bizarre upright locomotion is reflected in a whole laundry list of physical peculiarities that range from strange globular heads, precariously balanced above sinuous vertical spines, to a dismal propensity for hernias, wrenched knees, and fallen arches. Still, if pressed to identify the single feature that most sharply distinguishes us from the rest of the living world, most of us would probably opt for our ability to produce and understand language, a curious form of communication in which a limited vocabulary of vocal or gestural symbols can generate an unlimited number of readily recognizable meanings.
Even scientists, a famously disputatious breed, pretty much agree on this: the archaeologist Colin Renfrew’s remark that language is “the most remarkable and the most characteristic of all human creations” has been closely echoed by savants as diverse as the cognitive psychologist Steven Pinker (“in any natural history of the human species, language would stand out as the preeminent trait”) and the ecologist and entomologist E.O. Wilson (language is “the one capacity that distinguishes Homo sapiens absolutely from other creatures”). It is, in other words, uncontroversial to claim that modern humankind’s most radically defining attribute is the language that permits us not only to communicate ideas and situations with unique precision but also to evoke intended thoughts in the minds of others.
Yet, although it may be our signature characteristic today, language clearly has not always been a feature of our lineage. We can’t know this directly, because prior to the invention of writing a mere five millennia ago language left no direct traces in the archaeological record. But since our species is alone among all its many living relatives in possessing language in the sense familiar to us, it follows that we must, at some remove, be descended from a nonlinguistic ancestor. The question then becomes one of just when our first linguistic ancestor emerged, who that ancestor was, and exactly how the transition from a nonlinguistic to a linguistic state took place. And that is where all the trouble begins—or rather began, because the field of linguistics had barely come into existence before the origin of language became so toxically divisive that, in 1866, the newly established Société de Linguistique de Paris hastily revised its statutes to ban any discussion of the subject.
As with most bans, that one ultimately failed. And as with most debates, polarization only increased. In our own time some experts still believe that language is so complex, and so central to the essence of humanity, that on both counts it must necessarily have deep roots within our hominin lineage. Early advocates of this view were Pinker and his psychologist colleague Paul Bloom, who back in 1990 famously summarized this “gradualist” argument as follows:
There must have been a series of steps leading from no language at all to language as we now find it, each step small enough to have been produced by a random mutation or recombination…. Every detail of grammatical competence that we wish to ascribe to selection must have conferred a reproductive advantage on its speakers…. And [for language to have evolved] there must be enough evolutionary time and genomic space separating our species from nonlinguistic primate ancestors.
This neatly reductive approach predictably appealed to the broad spectrum of scientists who were beguiled at the time by the powerful notion that evolution consists of little more than slow change under the guiding hand of natural selection, and it was embraced with particular enthusiasm by evolutionary psychologists.
In the opposite corner were those, led by Noam Chomsky, who saw language as qualitatively distinct from all other kinds of vocal communication, and thus as much more than a mere refinement of any preexisting form. In fact, Chomsky himself didn’t see language as primarily a form of vocal communication at all; to him, its principal function was not as an externalized conveyor of information but rather as an internalized organizer of thought. And it is certainly true that no other human attribute maps better than language does onto the “symbolic” way in which human beings process information in their minds. Uniquely, as far as we can tell, we humans deconstruct our interior and exterior worlds into a vocabulary of discrete mental symbols. Once we have done this, we can shuffle those symbols around, according to rules, to produce statements not only about those worlds as they are but as they might be.
The practical results of our shift to symbolic reasoning, from what one may assume to have been a more intuitive earlier cognitive style, are dramatic. Other organisms appear to occupy broadly holistic worlds. They live in the environment more or less as it presents itself to them and respond to stimuli from it in ways that may be highly sophisticated, but are not transformative. In stark contrast, we human beings live for much of the time in the worlds that we individually reconstruct in our heads, while our clever hands also give us an extraordinary ability to modify the environment to suit our desires, if not to avoid the law of unintended consequences.
This unusual combination of competencies has made human beings not only unique but also uniquely dangerous to the other species with which we share the planet, and it is hard not to conclude that our possession of language is deeply implicated in this. What’s more, Chomsky and his colleagues now reckon that, for all the remarkable consequences of our linguistic—and by extension cognitive—abilities, they depend on an extremely simple mental algorithm. If they are correct, the transition from a nonlinguistic to a linguistic state could have been achieved more or less instantaneously, rather than over the extended periods of time expected for change under natural selection.
With no general agreement in sight, the argument between the long-timescale and short-timescale folks rages on and continues to attract scientists from a wide range of backgrounds. The most recent expert to enter the fray at book length is the English archaeologist Steven Mithen, who already has volumes on the evolution of human cognition and musical abilities under his belt. Those who know Mithen’s writings will be aware that he naturally inclines toward a broadly linear outlook on the question of how our unusual human characteristics emerged. Given his background in paleoanthropology this is hardly surprising, because the science of human origins has always stood somewhat apart from the mainstream of evolutionary biology.
The other major subspecialties of paleontology arose from the disciplines of geology and comparative anatomy, each of which in its own way depended on understanding how living things had so richly diversified in the past. In contrast, paleoanthropology developed from the study of human anatomy, as early archaeologists sent the human bones they unearthed during their excavations to physicians and the like for description and publication. That was logical enough: for who, after all, knew human bones better than the physicians and anatomists? But at the same time those scientists were inward-looking, concerned with the minutiae of variation within the single species, Homo sapiens. Understanding the riotous diversity of organisms in the living world was beyond their bailiwick, leaving them ill-equipped to answer the most central question in paleoanthropology, namely how modern humans and their fossil relatives fit into that complex world.
That legacy lingers today. For even though it has long been clear that Homo sapiens is actually the sole surviving twig of a luxuriantly branching evolutionary bush that begs to be properly characterized, many paleoanthropologists still prefer to begin with that familiar lone survivor and to project it back in time, as if it were simply the outcome of a single-minded process of transformation by natural selection. From that perspective, the long timescale is the default.
In his new book, The Language Puzzle, Mithen likens the search for the origins of language to putting together a jigsaw puzzle. The pieces of this particular brainteaser come from the various academic disciplines—linguistics, neuroscience, anthropology, archaeology, primate behavior, and so forth—that contribute to our overall understanding of language and its context, and the book itself is generally organized along those disciplinary lines. Two chapters aimed at “framing” the puzzle—basically, establishing context—are followed by a dozen devoted to characterizing each of what Mithen sees as its major interior pieces. We are assured from the start that, once the facts have been completely assembled, all the complex interrelationships among the components of the puzzle will be grandly revealed; and, given this declaration, it is worth taking a quick look at what we learn along the way.
Mithen begins his chapter on human origins—the first half of the jigsaw’s frame—with the appearance of bipedal hominins some seven million years ago. He says rather little, however, about the poorly documented earliest forms and their better-known successors, the bipedal but still diminutive and partly arboreal “australopiths.” He effectively starts his story with the larger-brained early species of our own genus Homo, which he associates with simple stone tools (though he does note that australopiths had probably invented such tools even earlier). He mentions the utility of those cutting tools for quickly removing limbs from scavenged carnivore kills and points out that it was the addition of animal fats and proteins to the hominin diet that underwrote the metabolically expensive enlargement of the hominin brain that began in earnest around two million years ago.
After discussing the history of the Acheulean hand ax, the first consciously shaped stone tool, Mithen launches into an account of how hominins spread out of Africa around 1.8 million years ago, rightly emphasizing the complex effects upon this dispersal of the wild climatic oscillations of the ice ages during which it occurred. A consideration of the initial occupation of Europe by early Homo sapiens then gives him the opportunity to contrast the invading modern humans with the resident Homo neanderthalensis—who, he later concludes, left “no compelling evidence that they made and used visual symbols.” Finally, Mithen outlines the Old World–wide diaspora of our own species and the accompanying cultural innovations that reflect the possession of full-fledged language.
That leaves the puzzle frame to be completed by delving into the characteristics of “fully modern language” itself. Mithen notes the distinction between lexical and grammatical words, and between the arbitrary words in which sound and meaning are not related and the so-called iconic ones in which they are. He takes a vigorous swipe at Chomsky’s notion that a “universal grammar” underlies children’s ability to absorb language, though he still has to accept that human babies have an innate ability to acquire any language given the appropriate cultural cues. He also dwells on the tendency of languages to diversify, finding lots of reasons why this should occur despite the resulting barriers to communication. What he does not consider is the question of why, if selection reigns supreme and language is all about communication, those barriers should not be selected out. Is it possible that, in such an intensely social species, language is somehow more important as a badge of group membership than as a way of communicating about the physical and social environment? Or might there be some merit in Chomsky’s speculation that language’s most important role is its internalized one?
Despite such lingering uncertainties, Mithen is ready at this point to begin filling in his jigsaw. And because “evolution works slowly, especially for something as complex as language,” he starts with the vocal systems of our closest but still fairly distant living relatives, the great apes, in whose “minds, calls and actions…some pieces of the language puzzle do indeed reside.” Still, following a survey of laboratory ape-language studies and field investigations of ape and monkey calls, he is forced to the rather modest conclusion that “although monkeys and apes entirely lack words, their calls sometimes have word-like qualities.” That qualifying adjective is surely a significant one, but it is not enough to dissuade Mithen from declaring that “chimpanzee word-like calls provide a starting point for the evolution of fully modern language.”
Next is a chapter on the vocal and auditory systems, both said to be “finely attuned” to the sound frequencies we use in speech. Speech sounds are shaped in the upper vocal tract, the roof of which is formed in human babies and nonhuman primates of all ages by a flat skull base. In contrast, the cranial base flexes in maturing humans as the tongue and larynx descend down the neck, producing a long pharyngeal area that can be distorted by the surrounding musculature to modulate the sounds passing through it. Mithen explains how this sound system works and then surveys the fossil record, only to discover that the modern adult proportions of the vocal tract are of remarkably recent origin. So in Mithen’s timeline none of this turns out to be particularly helpful in assessing the origin of speech, let alone that of language. And although the Neanderthal auditory system apparently had a range of detection similar enough to that of modern humans to evoke the claim that “major developments in human spoken communication had occurred by 500,000 years ago,” just what those developments might have been remains unclear.
After speculating about the evolutionary relevance of arbitrary versus iconic words, Mithen examines the phenomenon of synesthesia, in which unusual connections in the brain enable experiences in one sensory modality to be felt in another, as when some musicians “see” music in colors. All of this makes for some entertaining digressions into linguistics and neurobiology, but it also leaves a lot of questions open, as does the next chapter on stone-tool making and the putative implications for the brain and language of the succession of technological traditions we see in the archaeological record. A consideration of what we can learn from computer simulations then leads to some reflections on how languages are structured and on how children contrive to learn them and to acquire theory of mind (the ability to infer what others are thinking).
Some conjecturing about the role of ancient fire control in promoting the use of language (stories around campfires) is then followed by an excursion into neurobiology, in which Mithen correctly observes that language functions are widely distributed in the human brain and that, size apart, the main difference between ape and human brains lies in the vastly more elaborate internal connectivity of the latter. That central feature is something we cannot observe in fossils, but Mithen is nonetheless prepared to hazard that “modern humans and Neanderthals may have adopted some of the sounds, words, phrases and structures of each other’s languages into their own”—even though there is precious little to support the idea that the large-brained but now-extinct Neanderthals, though doubtless highly vocal, possessed language in any sense a linguist would recognize today.
Without ever quite revealing how it might relate to language origins, Mithen then proceeds to consider how languages change over time, as a prelude to the critical question of whether language might be about more than mere communication. Oddly, while he agrees that the language we use deeply affects how we perceive the world, he never sees fit to consider this issue from a Chomskyan perspective, turning instead to some concluding thoughts on how language might relate to the late outburst of creativity and overt symbolism (art, music, notation) that we see best reflected in the European cave art of the last Ice Age.
All of this ricocheting around among the diverse facets of language might sound a bit like a game of intellectual hopscotch. By its end, however, not only has the reader been treated to an accessible account of an intrinsically fascinating subject, but Mithen feels that he has all of his pieces “on the table” and is ready to explain “why, when and how” language evolved. He accordingly presents us with a detailed scenario of human evolution in which the “word-like and syntax-like” vocalizations of the earliest bipeds eventually gave way to the use of iconic words and gestures by the australopiths and later on by tool-wielding members of Homo.
“Cortical leakage” in the expanding hominin brain subsequently made “synesthesia the normal and ubiquitous state of mind,” promoting cross-modal connections even as iconic sounds increased in variety and began to be used in combination. At the same time the vocal tract was modified as the skull changed, and breath control became more precise, signaling that “language had arrived.” An “instinctive desire for…effective communication” then sidelined synesthesia, and linguistic diversity expanded along with the brain’s storage capacity and the use of arbitrary words. By around 600,000 years ago, when Homo heidelbergensis (a precursor of both H. neanderthalensis and H. sapiens) appeared, humans were already able to “share [the] ideas and knowledge” that eventually allowed them to spread widely across the Old World. And from that substrate, modern language was gradually refined.
This is all so confidently delivered, with so much circumstantial detail, that the reader is tempted to forget that there is precious little evidence for any of it and that the whole construct is held together by nothing more than a blind faith that language must have emerged gradually over a prolonged period of time. In effect, Mithen has been able to complete his jigsaw puzzle only by placing his pieces on top of one another in a neat pile, rather than by carefully fitting them together side by side. This suggests that he has been using the wrong metaphor all along. A better one might be the architectural arch, which cannot function until it is fully complete. Half an arch, or even nine tenths of one, is useless; similarly, to deprive language of any of its interlocking elements would rob it of the “discrete infinity” that makes it qualitatively unique among the many other systems of vocal and/or gestural communication. Yet, just as Chomsky’s minimalist algorithm predicts, for all its many complexities language appears to be easily invented if you happen already to have a “language-ready” brain—as was the case, for example, for the deaf and language-naive Nicaraguan schoolkids who spontaneously invented and elaborated a rule-bound sign language when housed together for the first time in the 1970s.
So, what actually happened? The fossil record shows that, after a long history of hominin brain enlargement and doubtless also of complexifying vocal/gestural communication, by around 230,000 years ago our anatomically distinctive species Homo sapiens had emerged in Africa. We know from direct evidence that these ancient humans boasted a modern vocal tract, and we can reasonably infer that they also possessed brains with the internal connectivity required for language—which, after all, they could never otherwise have acquired. However, we have no evidence either direct or indirect that those earliest Homo sapiens used language as we know it, and we have to wait until about 100,000 years ago for the first convincing proxy evidence of language use to show up. This comes in the form of overtly symbolic objects such as abstract engravings, items of bodily adornment, and, before long, representational paintings.
Such objects allow us to infer pretty confidently that the human brain had by this time shifted from the ancestral intuitive algorithm to the symbolic one we use today. What is more, that change must have been precipitated by a purely behavioral stimulus, because the requisite biology had to have been there already. And since it is difficult to conceive of either language or symbolic ability in the absence of the other, the cultural stimulus concerned was most plausibly the spontaneous invention of language. People (quite likely children, to start with) began to associate particular sounds with specific meanings, and thereby to give things the names that particularize them in the modern human mind and make symbolic reasoning possible. Once that mental feedback between sound and symbol had been established, the keystone of the symbolic arch was in place.
It may seem remarkable that everything needed for a sudden cognitive transition just happened to be in the right place at the right time. But significantly, in evolutionary terms the “exaptation” of the human brain that was involved—its potential to master language and symbolic thought where no such cognitive features had existed before—would have been nothing special. Ancestral birds, for example, had possessed feathers for many millions of years before they ever used them to fly. And once the new way of processing information had kicked in, the rest was history, as the newly symbolic and linguistic (though otherwise manifestly unperfected) Homo sapiens spread beyond Africa, eventually invented farming, and for better or for worse began vigorously devoting its new cognitive capacities to the domination of the planet.