Since the last common ancestor shared by modern humans, chimpanzees and bonobos, the lineage leading to Homo sapiens has undergone a substantial change in brain size and organization. As a result, modern humans display striking differences from the living apes in the realm of cognition and linguistic expression. In this article, we review the evolutionary changes that occurred in the descent of Homo sapiens by reconstructing the neural and cognitive traits that would have characterized the last common ancestor and comparing these with the modern human condition. The last common ancestor can be reconstructed to have had a brain of approximately 300–400 g that displayed several unique phylogenetic specializations of development, anatomical organization, and biochemical function. These neuroanatomical substrates contributed to the enhancement of behavioral flexibility and social cognition. With this evolutionary history as precursor, the modern human mind may be conceived as a mosaic of traits inherited from a common ancestry with our close relatives, along with the addition of evolutionary specializations within particular domains. These modern human-specific cognitive and linguistic adaptations appear to be correlated with enlargement of the neocortex and related structures. Accompanying this general neocortical expansion, certain higher-order unimodal and multimodal cortical areas have grown disproportionately relative to primary cortical areas. Anatomical and molecular changes have also been identified that might relate to the greater metabolic demand and enhanced synaptic plasticity of modern human brains. Finally, the unique brain growth trajectory of modern humans has made a significant contribution to our species’ cognitive and linguistic abilities.
‘What would have to be true – not only of the quaint folk across the river, but of chimpanzees, dolphins, gaseous extraterrestrials, or digital computers (things in many ways quite different from the rest of us) – for them to be correctly counted among us? ... When we ask, Who are we? or What sort of thing are we? the answers can vary without competing. Each one defines a different way of saying “we”; each kind of “we”-saying defines a different community, and we find ourselves in many communities. This thought suggests that we think of ourselves in broadest terms as the ones who say “we”.’ (Brandom, 1994)
‘We’ are Homo sapiens and our species’ intellectual abilities distinguish us from all other animals. Our technological sophistication, capacity for introspection, and ability to create and manipulate symbols are unrivalled. We engage in behaviors that are profoundly unique, such as the production of personal ornamentation, language, art and music, and the performance of religious rituals. This behavioral discontinuity has prompted many to regard modern humans as standing apart from the rest of nature. Yet, despite our distinctiveness, ‘we’ are also one among several species of great ape, displaying more than 99% nonsynonymous DNA sequence similarity with chimpanzees (Wildman et al. 2003), having diverged from each other approximately 4–8 Ma (Bradley, 2008). Consequently, modern humans share many phenotypic traits with these close relatives through common descent. The tension between striking behavioral divergence and phylogenetic continuity presents a puzzle. Although many authors have discussed the possible selective advantages and evolutionary processes underlying the emergence of modern human cognition (e.g. Holloway, 1967; Calvin, 1994; Dunbar, 1996; Tomasello, 1999; Tooby & Cosmides, 2005), it remains a serious challenge to understand how the unique features of modern human behavior are mapped onto evolutionary changes in neural structure.
Considering the dramatic behavioral differences between modern humans and other animals, it is reasonable to expect similarly remarkable alterations in brain organization. As Darwin noted in The Descent of Man (1871), there appears to be a link between our intelligence and our expanded brain, which increased in size by roughly threefold since the last common ancestor (LCA) shared by hominins (the lineage including modern humans and our fossil close relatives and ancestors) and panins (the lineage including common chimpanzees, bonobos, and their fossil close relatives and ancestors). Because a large brain size so clearly distinguishes modern humans, many theories of human cognitive evolution consider only this single anatomical variable to account for the myriad specialized behaviors we exhibit (e.g. Jerison, 1973; Dunbar, 1996).
However, modern human-specific traits have been described at many different levels of neural organization, including gross brain size, the relative extent of neocortical areas, asymmetry, developmental patterning, the distribution of cell types, histology, and gene expression. Thus, while increased brain size, comprising mostly growth of the neocortex (Finlay & Darlington, 1995), undoubtedly has been central to the evolution of modern human cognition, other modifications to brain development, structure, and function are also certain to be significant. Furthermore, explaining modern human behavioral distinctiveness simply as a secondary byproduct of brain enlargement leaves unanswered fundamental questions regarding the computational substrates of our species-specific behavioral capacities (Holloway, 1968). Does it even make sense to ask how many ‘extra’ grams of neocortical tissue are necessary for the development of recursive syntax, pair-bondedness, or ‘theory of mind’? Indeed, abundant data from the neurosciences show that changes in structural modularity and connectivity interact with variation in molecular and neurochemical signaling to determine brain function. Subtle modifications in neural microstructure and gene expression can have a significant impact on behavior, even in the absence of large-scale changes in the size of brain parts (e.g. Hammock & Young, 2005). Evolutionary processes, therefore, can mold behavioral phenotypes using a host of strategies.
In this context, the aim of this article is to examine how changes in brain anatomy and physiology articulate with unique aspects of modern human cognition. We employ a multidisciplinary approach to trace evolutionary changes in mind and brain from the LCA to modern Homo sapiens, incorporating evidence from comparative psychology, neuroscience, genetics, paleoanthropology, and linguistics. By providing a detailed contrast between the mind of the LCA and Homo sapiens, it is our intent to bring into relief the distinctive characteristics of modern humans against the background of what is inherited from our most recent ancestry.
Although it would be desirable to trace the course of mental evolution through the succession of extinct species that fall along the lineage leading to modern humans, the fossil evidence is frustratingly scant. Unlike some other adaptations in human evolution that show reliable hard tissue correlates, such as the transition to habitual bipedalism (Lovejoy, 2005), behavior and soft tissue do not fossilize. Therefore, the paleontological record for these traits in human evolution is limited to what can be gleaned from endocranial casts and archaeological evidence (Mithen, 1996; Holloway et al. 2004). As endocasts preserve only an impression of the external morphology of the brain, critical information regarding internal neuroanatomical organization cannot be determined. Behavioral abilities, furthermore, can only be glimpsed opaquely through material remains. However, the paleontological and archaeological records constitute the only direct evidence of temporal change in morphology and behavior, providing crucial insight regarding their association. Paleontological evidence, in fact, indicates that major innovations in cultural behavior were not always linked to upsurges in cranial capacity of fossil hominins. For example, early signs of animal butchery are found in association with Australopithecus garhi (Asfaw et al. 1999), suggesting that a small-brained (450 cm3 cranial capacity) East African hominin from 2.5 Ma might have had the capacity to fashion simple stone tools.
The comparative approach, although an indirect source of information regarding evolution, provides a greater opportunity to explore the relationship between biological diversification and its correlates. In addition, by analyzing the distribution of neuroanatomical, behavioral, and genetic character states that are present in contemporary species within an established phylogeny, the principle of parsimony may be used to make reasonable inferences concerning the condition of extinct ancestral taxa (Johnson et al. 1984; Northcutt & Kaas, 1995). Of course, all living species are the product of their own evolutionary trajectory and cannot be considered stand-ins for fossil ancestors. Nevertheless, when a character state is observed only in modern humans and not in any of the other extant hominids (the clade that includes the living great apes and modern humans), then it is reasonable to conclude that the modern human condition is derived compared to the symplesiomorphic state seen in the great apes. In this review, we rely heavily on comparative data and the types of inference used in cladistic analysis to reconstruct the natural history of the modern human mind (as such, unless stated otherwise, any subsequent reference to ‘humans’ pertains to modern humans).
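The parsimony reasoning described above can be made concrete with a small worked example. The sketch below is only an illustration, not the procedure used in any of the cited studies: it applies a simplified two-pass Fitch parsimony algorithm to the branching order of the living hominids, using a hypothetical binary character that is coded '1' in humans and '0' in every great ape. The function names and the character coding are assumptions introduced for this example.

```python
# Illustrative sketch of parsimony-based ancestral state inference
# (simplified two-pass Fitch algorithm). The character coding below is
# hypothetical and is not data taken from the studies cited in the text.

def tips(tree):
    """Return the tuple of tip names under `tree`, left to right."""
    if isinstance(tree, str):
        return (tree,)
    return tips(tree[0]) + tips(tree[1])


def fitch_up(tree, tip_states, sets):
    """Bottom-up pass: store the candidate state set for every node."""
    if isinstance(tree, str):
        node_set = {tip_states[tree]}
    else:
        left = fitch_up(tree[0], tip_states, sets)
        right = fitch_up(tree[1], tip_states, sets)
        # Use the intersection when the children share a state, else the union.
        node_set = (left & right) or (left | right)
    sets[tips(tree)] = node_set
    return node_set


def fitch_down(tree, parent_state, sets, assigned):
    """Top-down pass: keep the parent's state whenever it is a candidate."""
    candidates = sets[tips(tree)]
    # Ties (including at the root) are broken arbitrarily with min().
    state = parent_state if parent_state in candidates else min(candidates)
    assigned[tips(tree)] = state
    if not isinstance(tree, str):
        fitch_down(tree[0], state, sets, assigned)
        fitch_down(tree[1], state, sets, assigned)


def reconstruct(tree, tip_states):
    """Return one most-parsimonious state per node, keyed by its tips."""
    sets, assigned = {}, {}
    fitch_up(tree, tip_states, sets)
    fitch_down(tree, None, sets, assigned)
    return assigned


# Branching order of the living hominids, written as nested pairs.
HOMINIDS = ("orangutan", ("gorilla", ("human", ("chimpanzee", "bonobo"))))

# Hypothetical character observed only in modern humans.
human_only_trait = {
    "orangutan": "0", "gorilla": "0", "human": "1",
    "chimpanzee": "0", "bonobo": "0",
}

states = reconstruct(HOMINIDS, human_only_trait)
# The hominin-panin LCA is the node that subtends these three tips.
lca_state = states[("human", "chimpanzee", "bonobo")]
```

With this coding, the LCA node is assigned the ape-like state and the single change is placed on the human branch, which is the formal counterpart of calling the human condition derived. A character shared by humans and only some great apes could yield an ambiguous reconstruction, which the arbitrary tie-break in this sketch would hide.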
To clearly distinguish what features are evolutionarily derived in recent human evolution, in the first part of this article we reconstruct the behavioral, cognitive, and neuroanatomical characteristics that we predict would have been present in the LCA. We draw particular attention to the features that generally distinguish hominids from other primates. In so doing, we highlight the cognitive and neural features that were derived character traits in our most recent evolutionary ancestry and which set the stage for the dramatic further modifications that were to take place in the Plio-Pleistocene within the lineage leading to Homo sapiens. This narrow focus on only recent human evolutionary history means that our account does not detail many important characteristics that arose at deeper nodes in our family tree. For example, the orientation of human cognition towards social problem solving is the product of a long primate heritage (Cheney & Seyfarth, 1990, 2007). Similarly, our capacity for fine-grained hand–eye coordination derives from selection long ago for visually guided reaching performance in stem primates (Cartmill, 1992).
Next, we review the cognitive and neural features that are uniquely present in Homo sapiens and which differ from the conditions that characterized the LCA. Because language is such a fundamental component of the human behavioral phenotype, we focus special attention on examining how this communication system differs from that used by other species. At the outset it should be acknowledged that establishing a clear causal link between evolutionary changes in brain structure and the emergence of species-specific behavior is complicated for a number of reasons – anatomical homology does not necessarily entail functional similarity; also, our present understanding of transcriptional, cyto- and chemoarchitectural scaling allometry is extremely rudimentary and some apparent human-specific differences may simply be related to maintaining functional equivalence in the context of biochemical, physiological, and geometric constraints at larger overall brain size. To compound the problem, because there are so few detailed studies of neuroanatomical organization comparing humans and great apes, many of the unique characteristics of the human brain are currently hidden from view. Therefore, we do not expect there to be a straightforward correspondence between every anatomical and every behavioral characteristic that we discuss.
We conclude by outlining a preliminary model to explain how changes in brain size and other aspects of neuroanatomical reorganization might yield domain-general cognitive specializations with emergent domain-specific skills.
It has proven easier to distinguish great apes from other primates based on dietary and ecological variables than on cognitive specializations. Because of the generalized dental anatomy of living hominids, these species rely heavily on mature, nonfibrous fruits with high sugar and calorie content. As a consequence of this diet, the great apes occupy a fairly narrow range of ecological habitats, being largely restricted to tropical and woodland forests (Foley & Lee, 1989; Potts, 2004). Paleoecological and dental evidence suggests that the middle Miocene hominids, presumably including the common ancestor of living great apes, consumed a varied frugivorous diet that incorporated opportunistic, perhaps seasonal, utilization of hard objects such as nuts and seeds (Singleton, 2004).
The social organization of modern panins reflects one solution to the foraging challenge posed by the hominid diet. Chimpanzee societies were first described as ‘fission-fusion’ by Jane Goodall (1986) to highlight the fluid nature of their associations. The unique fission-fusion social grouping of chimpanzees affords individuals the benefits of gregarious living, including predator defense, access to mates, and the opportunity to locate widely distributed foods, while simultaneously minimizing direct contest competition over food. In fission-fusion societies, individuals see each other infrequently, with some intervals of separation lasting as long as a week. Yet all individuals recognize each other and maintain their affiliations and alliances despite these relatively long separations (Goodall, 1986). Within the fission-fusion organization of chimpanzees certain age- and sex-specific subgroups appear to have particular social functions. For instance, groups of adult males may serve as territory patrols for the community (Wrangham & Peterson, 1996; Mitani, 2006), adult males also form hunting parties (Boesch, 2002; Mitani, 2006), and females and their offspring are largely associated with tool-use and other subsistence technologies such as nut-cracking (Boesch & Boesch, 1990). The segregation of individuals by sex and age for specific social activities is arguably unique to the great apes – specifically, chimpanzees – and virtually absent in monkeys, including those with societies that resemble a fission-fusion social organization, such as the hamadryas baboon.
Another aspect of social behavior in great apes that appears to be unique among primates concerns regional traditions (Subiaul, 2007). All great apes, but no monkey species, possess a suite of behaviors that include gestures and styles of object manipulations that are distinctive for a given social group/community, persist from generation to generation, and are transmitted horizontally through social learning (Whiten, 2005; Horner et al. 2006). The traditions of chimpanzees are by far the best documented and also appear to be the most widespread and diverse of any nonhuman primate species. Whiten and colleagues (1999, 2001) reported 39 different traditions in various African chimpanzee communities that included tool use, grooming, and mating practices. Some of these traditions were customary or habitual in some chimpanzee groups but absent in others after controlling for ecological constraints (e.g. availability of certain raw materials). Using the same systematic approach to the documentation of traditions in nonhuman primates, van Schaik and colleagues (2003) reported at least 19 clearly defined traditions in orangutans. This stands in contrast to reports of traditions in monkeys, whales, birds, and fish, where at most a handful (usually only one or two) have been identified (e.g. song ‘dialect’ in birds and whales). In none of these cases has the number of behavioral variants reached double digits (Rendell & Whitehead, 2001; Fragaszy & Perry, 2003). Although there is always the possibility that such a result is due to over-representation of great apes in the sample, it is notable that despite many years of research on several monkey species (including macaques, baboons, and capuchin monkeys), only capuchin monkeys evidence behaviors that potentially meet the criteria for traditions (Panger et al. 2002; Perry et al. 2003, but see Subiaul, 2007). 
Previous claims of ‘proto-culture’ in Japanese macaques (Kawai, 1965), for example, are no longer considered to be ‘cultural’ as they do not conform to contemporary standards of non-human ‘traditions’ (e.g. Whiten et al. 1999; van Schaik et al. 2003).
One lingering question concerns how such complex and unique behaviors as hunting, patrolling, and cultural traditions map onto the various cognitive abilities known to distinguish the great apes from other primates. Below we discuss some cognitive skills that appear to be unique to great apes and that may shed some light on this question (for a more extensive review, please see Subiaul et al. 2006).
Starting in the 1970s, a number of studies explored chimpanzees’ kinesthetic perception of the self via mirror self-recognition (Gallup, 1970). In these studies, it was demonstrated that chimpanzees use their reflections to explore body parts, such as the underarms, teeth, and anogenital region, which are difficult to see without the aid of a mirror. In contrast, after lengthy exposures to mirrors, monkeys continue to display social behaviors toward their mirror image, which suggests that they fail to see their reflections as representations of themselves. Additional research has reported mirror self-recognition in orangutans (Lethmate & Dücker, 1973; Suarez & Gallup, 1981), but most gorillas fail to recognize their mirror image (Suarez & Gallup, 1981; Ledbetter & Basen, 1982; Shilito et al. 1999), with one exception (Patterson & Cohn, 1994). Subsequent studies with monkeys confirmed Gallup's initial negative findings (e.g. Suarez & Gallup, 1981; Hauser et al. 2001; de Waal et al. 2005).
Povinelli & Cant (1995) have hypothesized that mirror self-recognition in great apes may be an emergent property of being a large-bodied primate that spends a significant amount of time navigating the complex three-dimensional environment of trees, constantly monitoring where to place limbs to support the body during travel. However, Povinelli & Cant's (1995) ‘clambering hypothesis’ has been challenged by evidence of mirror self-recognition in animals whose habitats do not require arboreal locomotion. Today there are reports that bottlenose dolphins (Reiss & Marino, 2001) may recognize themselves in the mirror – at the very least, they do not seem to treat their reflection as if it were another individual. Studies on elephants, however, are more equivocal. One study reported that elephants engaged in mirror-directed reaching but did not identify themselves in the mirror and behaved aggressively toward their image (Povinelli, 1989). However, Plotnik and colleagues (2006) reported that one of three elephants studied showed evidence of mirror self-recognition. The evidence that different mammalian lineages may be ‘self-aware’ suggests at least two explanations: (1) mirror self-recognition is an emergent property present in species with large brain size and a complex social organization or (2) there are multiple adaptive functions to the cognitive ability that is measured by mirror self-recognition and, consequently, this skill emerged independently in numerous mammalian species.
Great apes are acutely sensitive to the direction of others’ gaze. Determining the precise direction of another's attention is an important ability because it can provide salient information about the location of objects such as food and predators. In social settings, a great deal of information is communicated by following other individuals’ gaze to specific individuals or by using gaze to call attention to specific events.
Many primate species engage in social activities, including tracking allies, that likely require following the gaze of conspecifics (e.g. Chance, 1967; Menzel & Halperin, 1975; Whiten & Byrne, 1988; Mitani, 2006). However, in field studies, it is often difficult to identify which object, individual, or event is the focus of two individuals’ attention and whether they arrived at the focal point by following one another's gaze. Laboratory studies have confirmed that great apes and, to a lesser extent, monkeys follow the gaze of others to objects (e.g. chimpanzees, mangabeys, and macaques; Emery et al. 1997; Itakura & Tanaka, 1998; Tomasello et al. 1993, 2001; Tomonaga, 1999). In one of the few explicitly comparative studies of this behavior, Itakura (1996) examined the ability of various primate species to follow a human experimenter's gaze. Notably, in this paradigm only chimpanzees and one orangutan responded above chance levels. Okamoto-Barth and colleagues (2007) have extended these results with a refined method that included barriers with and without windows. They conclude that chimpanzees, bonobos, and gorillas are more sensitive to another's line of sight than are orangutans.
One method commonly used in the laboratory to investigate nonhuman primates’ ability to use gaze cues is the ‘object-choice task’. In this task, an experimenter attends to one of two containers (controls usually include directing the face and eyes to one container or looking askance at one container, while facing forward) while subjects are given the opportunity to choose a container, only one of which is baited. The available research suggests that there is a significant difference between monkeys’ and great apes’ understanding of gaze cues in the object-choice task (see also Itakura & Anderson, 1996). Monkeys generally cannot be trained to use only the human experimenter's gaze cues to retrieve the concealed reward (Anderson et al. 1995, 1996), whereas great apes can (Itakura & Anderson, 1996). Povinelli & Eddy (1996a,b) have hypothesized that great apes outperform monkeys on this task because they possess an automatic response that forms part of a primitive orienting reflex triggered by a reward. This reflex, however, does not require the attribution of a mental state or an understanding of the psychological state underpinning ‘seeing’. Another possibility is that sensitivity to eyes, in particular, co-evolved with the ability to make inferences about certain psychological states such as seeing. In support of this latter hypothesis, Hare and colleagues (Hare et al. 2000, 2001, 2006; Hare & Tomasello, 2004) have argued that chimpanzees use the direction of gaze to reason about the intentions of conspecifics. Santos and her colleagues (Flombaum & Santos, 2005) have reached similar conclusions with rhesus macaques based on a comparable experimental paradigm. Although controversial, these tasks and results raise the possibility that catarrhine primates (including Old World monkeys and apes) either share a system that binds observable features (e.g. eyes) with unobservable concepts such as ‘seeing’ or that all primates share a primitive system that can only construct concepts based on observable features but not unobservable causes (Povinelli, 2000; Povinelli & Vonk, 2003).
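What it means to respond ‘above chance’ in a two-container task of this kind can be illustrated with a short calculation. The sketch below is a minimal example under stated assumptions, not the analysis used in any of the cited studies: it treats each trial as an independent guess with a 0.5 probability of success and computes the one-sided binomial probability of doing at least as well as observed. The function name and the trial counts are hypothetical.

```python
from math import comb


def p_at_least(successes, trials, chance=0.5):
    """One-sided binomial tail: P(X >= successes) under pure guessing."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical session: 18 correct choices in 24 two-container trials.
p_value = p_at_least(18, 24)

# Hypothetical session at the guessing rate: 12 correct choices in 24 trials.
p_guessing = p_at_least(12, 24)
```

A small tail probability, as in the first hypothetical session, indicates that guessing alone would rarely produce such a score. It says nothing about which of the competing cognitive interpretations discussed above accounts for the performance.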
Some have argued that chimpanzees and other great apes have a more sophisticated understanding of physical causality than monkeys, as reflected by their tool-use in the wild (Visalberghi, 1990; Limongelli et al. 1995; Visalberghi et al. 1995; Westergaard, 1999). This conclusion is buttressed by the fact that traditions as they exist in chimpanzees and orangutans are mostly absent in monkeys. And where they exist, as appears to be the case in capuchin monkeys (Panger et al. 2002; Perry et al. 2003), they comprise only two or three behaviors, which lack the diversity and complexity that characterize chimpanzee and orangutan behavioral traditions (Subiaul, 2007; Whiten & van Schaik, 2007). But is there any evidence of differences in the physical cognition skills of apes and monkeys? Research with monkeys has shown that they disregard non-functional surface features, such as color and shape, when choosing a tool, but they fail to appreciate how changes in shape affect changes in function (Hauser, 1997a; Hauser et al. 1999, 2002b, 2002c; Santos & Hauser, 2002; Fujita et al. 2003; Santos et al. 2005; but see Santos et al. 2006). Povinelli (2000) reported similar results for chimpanzees. In a series of studies, Povinelli (2000) presented chimpanzees with tasks that involved actions commonly seen in the wild such as pulling, pushing, and poking. Following training, subjects were presented with a choice of method: one was consistent with a theory of intrinsic connection (transfer of force), whereas the other choice was consistent with a theory of superficial contact. With very few exceptions, superficial and/or perceptual contact guided the chimpanzees’ responses across the various tool tasks. Thus, great apes’ understanding of simple mechanics may not differ substantially from that of monkeys.
An important facet of physical cognition is the ability to quantify objects in one's environment. Many animals (birds, rodents and primates) have evidenced numerical knowledge (Brannon, 2006). Some of the most important work in this area has demonstrated that primates likely share a non-verbal system for ordering small and large numerosities (Cantlon & Brannon, 2006). Specifically, research suggests that monkeys, apes, and humans share a system for adding (chimpanzees: Rumbaugh et al. 1987; Boysen & Berntson, 1995; Boysen et al. 1993, 1996; Beran, 2001; Herrmann et al. 2007; rhesus macaques: Cantlon & Brannon, 2007) as well as subtracting quantities (chimpanzees: Beran, 2001; monkeys: Sulkowski & Hauser, 2001). Additionally, research with rhesus monkeys and chimpanzees has demonstrated that the ability to represent and quantify objects in one's environment is modality independent. In one study, rhesus monkeys in a laboratory setting matched the number of vocalizations that they heard with the number of faces that they saw (e.g. 2 vs. 3) (Jordan et al. 2005). This result corresponds with field research demonstrating that wild chimpanzees on patrol compare the number of vocalizations generated by ‘foreign’ chimpanzees with the number of individuals in their own group, retreating if the number of vocalizations exceeds the number of individuals in their own group (Wilson et al. 2002).
Such experiments on physical and numerical cognition suggest that there are no significant qualitative differences between chimpanzees’ and monkeys’ understanding of the non-verbal aspects of number or of unobservable physical causes. Therefore, the differences between great apes’ and monkeys’ tool-use in the wild may reflect factors such as greater manual dexterity and fine motor coordination, as well as social-cognitive variables such as the ability to benefit from social conventions and copy novel motor rules.
Affective and temperamental differences may also contribute to phylogenetic variation in behavioral performance (Hauser, 2003; Beran & Evans, 2006; Evans & Beran, 2007). In particular, great apes appear to be able to delay gratification longer than rhesus monkeys (Beran & Evans, 2006; Evans & Beran, 2007) and may be more tolerant of conspecifics than monkeys (Goodall, 1986; van Schaik et al. 1999; Brosnan et al. 2005). But there are also differences in tolerance among the great apes. These differences affect cognitive performance in certain domains. For instance, tolerance appears to play a major role in the frequency and diversity of cooperative behaviors in chimpanzees and bonobos. Specifically, in a cooperative feeding task, bonobos were found to be more tolerant of co-feeding than chimpanzees (Hare et al. 2007). However, when the task involved retrieving food that was difficult to monopolize, there were no differences between chimpanzees and bonobos. Aside from cooperation, tolerance and inhibitory control may also affect performance in physical and spatial cognition tasks (Herrmann et al. 2007).
Such results point to subtle temperamental variables that in some instances have a significant effect on cognitive functioning. These results suggest that the great apes’ more sophisticated social and physical cognition skills rest in part on greater inhibitory control than is seen in monkeys. Inhibitory control, an executive function of the prefrontal cortex, may allow them to better focus attention, in turn enhancing learning and memory (Call, 2007). Temperament and inhibitory control may be the target of directional selection insofar as they may result in greater behavioral flexibility and learning, offering individuals the opportunity to exploit new niches in their social or physical environments, resulting in increased fitness. It is possible that such subtle changes in temperament yield qualitatively distinct behavioral repertoires between species.
Given the differences between monkeys and great apes discussed above, what may be said about the LCA? We hypothesize that changes in the paleoenvironment in the late Miocene resulting in a wider distribution of woodlands produced a trend among the African great apes toward more stable relationships between the sexes and stronger associations between male kin (Foley & Lee, 1989). The LCA likely lived in an environment where the patchiness of food led to a wide distribution of females, selecting for male kin to form cooperative coalitions to defend an expansive home range (Foley & Lee, 1989). Given the evidence reviewed above, we conclude that the LCA displayed regional variation in certain behavioral traditions, ‘self-awareness’, and an enhanced ability to follow the gaze of other social agents. We further hypothesize that these behavioral characteristics are related to increased capacities of executive control to inhibit conventional responses in favor of social tolerance and seeking novel and flexible solutions to problems. These behavioral abilities would have been advantageous for a large-bodied, frugivorous, social primate in the late Miocene.
As a catarrhine primate, the LCA's brain would have been specialized for social cognition (Cheney & Seyfarth, 2007). In both macaque monkeys and humans, ventral premotor and inferior parietal cortex contain neurons that fire when an individual either performs or observes different goal-directed actions (Rizzolatti et al. 2002; Arbib, 2005). This ‘mirror neuron system’ potentially serves as a substrate for understanding others’ actions, imitating new skills, and simulating others’ intentions. Furthermore, catarrhine brains also contain separate populations of neurons in the temporal cortex that are selective for the direction of others’ gaze, facial expressions and identity (reviewed in Hauser, 1997), as well as species-specific vocal calls (Poremba et al. 2004; Gil-da-Costa et al. 2006). In addition to these, there are several features of extant great ape brains that distinguish them collectively from other primates (MacLeod, 2004; Sherwood & Hof, 2007). It is most parsimonious to conclude that these traits evolved in the great ape stem approximately 14 Ma and therefore would have also been present in the LCA. Here we focus on the neocortical phenotype of living hominids, with some reference to other brain systems, drawing particular attention to reorganization at the histological and molecular level (Fig. 1a).
All living hominids have large brain sizes in absolute terms and in some measures of relative size (Passingham, 1975a; Holloway, 1983; Martin, 1983). Similarly, the LCA is expected to have had a brain mass of approximately 300–400 g. This size is within the range of extant great apes and is close to the cranial capacity of the earliest hominins (Holloway et al. 2004). Furthermore, the cranial capacities of some late Miocene apes, such as Dryopithecus, are within the lower end of this range (300–330 cm3) (Begun, 2004). Comparative studies suggest that positive selection occurred in genes associated with primary microcephaly (i.e. ASPM and microcephalin) along the phylogenetic lineage leading to the LCA (Kouprina et al. 2004; Wang & Su, 2004); however, it is less clear whether these genetic variants are related to normal brain size variation and evolution (Woods et al. 2006). Accompanying enlarged absolute brain size, the LCA probably also had a high degree of cerebral cortical gyrification as compared with other primates (Connolly, 1950). Although not differing significantly from allometric scaling predictions (Zilles et al. 1989), the amount of gyral folding in living great apes suggests that there is relatively more associational connectivity between neighboring cortical regions, as gyri are thought to form due to tension-based mechanisms that bring strongly interconnected regions more closely together, achieving spatially compact wiring (van Essen, 1997).
The human brain displays strong population-level left hemisphere dominance for language functions, especially among right-handed individuals. Because language is clearly a unique behavioral innovation in humans, it has historically been thought that such lateralization is a requisite for the expression of this complex adaptation. Recent studies have shown that some of the cortical areas associated with language function in humans also display asymmetries in extant great apes, suggesting that they were present in the LCA. Specifically, population-level left hemisphere dominant asymmetry of the planum temporale, a surface feature of the cerebral cortex in the region of Wernicke's area, is shared by humans, chimpanzees, bonobos, gorillas, and orangutans (Gannon et al. 1998; Hopkins et al. 1998). A recent study, in fact, indicates that the volume of cytoarchitecturally defined area Tpt (the homologue of part of Wernicke's area) in macaque monkeys shows leftward dominance (Gannon et al. 2008). This finding suggests that the basis of planum temporale asymmetry can be traced back earlier to a common ancestor of catarrhine primates. Similarly, the sulci within the inferior frontal cortex, which contains Broca's area, display left hemisphere dominant asymmetry in their depth and length in humans, chimpanzees, bonobos, and gorillas (Cantalupo & Hopkins, 2001). Given the poor correspondence between external morphological landmarks and cytoarchitectural area borders in this region of great apes (Sherwood et al. 2003) and humans (Amunts et al. 1999), however, it remains to be determined how these gross asymmetries relate to the underlying organization of neural tissue.
Nonetheless, these neuroanatomical asymmetries suggest that some aspects of functional processing were already lateralized in the brain of the LCA prior to the evolution of language. This might be due to the computational demand to process sequential information, such as species-specific vocal calls and dexterous manual actions, with temporal fidelity by reducing conduction delay across hemispheres (Ringo et al. 1994). In fact, functional hemispheric dominance for processing communication signals was probably established much earlier in primate evolution. Species-specific vocalizations elicit responses in the homologues of Broca's area and Wernicke's area in macaque monkeys (Gil-da-Costa et al. 2006) and produce left-hemisphere dominant activity in the temporal cortex (Poremba et al. 2004). Taken together, these findings indicate that the neural machinery for processing complex acoustic signals contained in species-specific communication was present and lateralized in the LCA, providing a scaffold upon which language functions later evolved.
The frontal cortex (which includes primary motor cortex, premotor cortex, supplementary motor area, and prefrontal cortex) is involved in numerous processes, ranging from simple motor execution to higher-order cognition. Regions of the prefrontal cortex are implicated in functions such as decision-making, planning, working memory, and emotional regulation. As a fraction of total neocortex volume, the frontal cortex of hominids is large (36%) in comparison with other primates (29% in gibbons and 31% in capuchin and macaque monkeys) (Semendeferi et al. 2002). Analyses of scaling relative to the rest of the neocortex indicate that the frontal cortex shows a positive allometric relationship in primates, such that it comprises an increasing percentage of the neocortex as overall brain size enlarges (Bush & Allman, 2004). Thus, proportionally larger frontal cortex is to be expected in primates with big brains such as hominids.
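The scaling argument above can be made explicit with a simple power-law sketch. The symbols are illustrative assumptions, not quantities reported by the cited studies: $F$ is frontal cortex volume, $R$ is the volume of the rest of the neocortex, and $a$ and $b$ are fitted constants, with $b > 1$ denoting positive allometry.

```latex
\begin{align}
F &= a R^{b}, \qquad b > 1 \\
\frac{F}{F + R} &= \frac{a R^{b}}{a R^{b} + R} = \frac{1}{1 + \dfrac{1}{a}\, R^{\,1-b}}
\end{align}
```

Because $1 - b < 0$, the term $R^{1-b}$ shrinks as $R$ grows, so under these assumptions the frontal fraction of total neocortex rises with overall size even when no lineage departs from the common scaling rule.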
In addition, the dorsal frontal cortex of hominids has a more complex pattern of gyral folding than in other primates, with distinct precentral and superior frontal sulci evident. Comparisons of macaques, chimpanzees, and humans based on sulcal landmark registration suggest that the enlargement of frontal cortex in hominids has involved mostly the dorsolateral prefrontal cortex (van Essen, 2007). Recent studies have shown that the signaling protein Fgf17, which regulates the expression of transcription factors in the developing neuroepithelium, is particularly important for specifying the size of dorsolateral frontal cortex in mammals (Cholfin & Rubenstein, 2007), pointing to one possible developmental mechanism responsible for the relatively increased size of this cortical region in great apes and humans.
Even though it is predicted by allometric scaling, the relatively increased frontal cortex size of hominids could have significant functional implications. It has been hypothesized that, through axon competition and sorting in development, brain regions that are relatively enlarged might influence activity in more widespread targets (Deacon, 1997; Striedter, 2005). Given the role of dorsolateral prefrontal cortex in executive functions, such as selection among alternative cognitive strategies when faced with novel problems, it is an intriguing possibility that the enhanced capacity of great apes to inhibit their behavioral responses, delay gratification, and demonstrate a higher degree of behavioral flexibility might be related to the increased size of this part of frontal cortex relative to the rest of the neocortex.
Several different neuron types have evolved biochemical and morphological specializations in the hominid lineage (Sherwood & Hof, 2007). For example, in layer V of anterior cingulate and frontoinsular cortex, neurons with a spindle-shaped cell body are only found in great apes and humans among primates (Nimchinsky et al. 1995, 1999; Allman et al. 2005). This class of cell, called the von Economo neuron, has a very large soma that displays a distinctive tapering towards the apical dendrite and basal axon. Golgi impregnation studies in human brains show that von Economo neurons have a narrow dendritic field and a thick axon that descends into the white matter (Watson et al. 2006). Based on the location, neurochemistry, and morphological characteristics of von Economo neurons, it has been hypothesized that they transmit rapid outputs to subcortical regions (Allman et al. 2005). It is interesting that these specialized projection neuron types have been identified in cortical areas that are positioned at the interface between emotional and cognitive processing. Given their characteristics, it has been speculated that von Economo neurons are designed for quick signaling of an appropriate response in the context of social ambiguity (Allman et al. 2005). Enhancements of this ability would be particularly important in the context of fission-fusion communities, such as those of panins and possibly the LCA, with complex networks of social interactions and potential uncertainties at reunions. Quite interestingly, von Economo neurons have now also been identified in large-brained cetaceans (Hof & van der Gucht, 2007), indicating that they have independently evolved in multiple lineages. Given the distribution of mammalian species in which they are found, it seems that they may differentiate from a common precursor pool to perform important social cognitive functions in species that have both large brain size and complex social organization.
Diffusely projecting neurotransmitter systems in the neocortex have an enormous influence on cognitive processing by modulating attention, working memory, states of arousal, and motivation. These systems have the capacity to adjust the excitability and synchronicity of large ensembles of postsynaptic neurons. Due to their widespread effects, deficits in neocortical innervation by these neuromodulators result in the decline of cognitive functions, notably learning and memory, and are associated with neurodegenerative conditions in humans such as Alzheimer's and Parkinson's disease. A recent comparative examination of axons that contain serotonin, dopamine, and acetylcholine in the frontal cortex of humans, chimpanzees, and macaque monkeys has revealed significant phylogenetic variation (Raghanti et al. 2008a; Raghanti et al. 2008b; Raghanti et al. in review). In general, these neuromodulators provide more extensive innervation to layers V and VI in prefrontal cortical areas associated with cognition (Brodmann's areas 9 and 32) of chimpanzees and humans relative to macaques, but there are no species differences in primary motor cortex (Brodmann's area 4). Given the role of these neuromodulatory afferents in attention and learning, it is tempting to speculate that the increased prefrontal innervation of hominids is involved in the unique behavioral characteristics of this clade, particularly enhanced gaze-following and the existence of socially transmitted traditions.
In addition, a remarkable morphological feature present in chimpanzee and human neocortex, but not in macaques, is the accumulation of ‘coils’ of varicose axons for serotonin, dopamine, and acetylcholine. Based on previous observations of such cholinergic axon ‘coils’ in human neocortex, these morphological features might represent episodes of plasticity and synaptic reorganization (Mesulam et al. 1992). These findings suggest that enhanced facilitation of intracortical processing in prefrontal cortex by neuromodulatory systems would have characterized the LCA.
Brain tissue is metabolically expensive because of the high energetic costs associated with ion pumps and the synthesis and packaging of neurotransmitters (Aiello & Wheeler, 1995). Over and above these general costs, available evidence suggests that the mass-specific energetic demands of neocortex might be even higher in apes (including hylobatids) than in other primates. For example, a duplication of the glutamate dehydrogenase gene took place in the stem hominoid (Burki & Kaessmann, 2004). The hominoid-specific isoform of this enzyme, GLUD2, which is expressed by astrocytes, is activated during high glutamate flux. These modifications, which provide a greater capacity for glutamate metabolism, presumably occurred to enable relatively high levels of excitatory neurotransmission in ape brains. Additionally, several electron transport chain genes (COX4-1, COX5A, COX8A, ISP, NDUFA1, NDUFC2, and NDUFA4) underwent positive selection in the hominoid stem (Grossman et al. 2004; Uddin et al. 2008). The amino acid substitutions characterizing the hominoid variants of these proteins have been demonstrated in some cases to involve changes in electrical charge, suggesting that they confer functional enhancements of the mitochondrial aerobic metabolism pathway to provide energy substrates to cells that are highly active, such as neurons. Taken together, these studies suggest that relatively high overall levels of neuronal activity in the LCA would have been supported by evolutionary changes in biochemical pathways for glutamate uptake and energy production.
Outside of the neocortex, several other neuroanatomical traits distinguish apes (including hylobatids) from other primates. Specifically, the lateral cerebellar hemisphere of hominoids is larger than would be predicted by allometry in monkeys (Rilling & Insel, 1998; MacLeod et al. 2003). This portion of the cerebellum participates in a variety of functions including planning of complex motor patterns, sensory discrimination, attention shifting, and procedural learning (Leiner et al. 1993). The cerebellar specializations of apes are most likely associated with their suspensory mode of locomotion to allow for feeding on fruits at the tips of branches given a large body size. Such adaptations require excellent physical coordination in addition to visuo-spatial mapping skills to track seasonal resource availability, both functions involving lateral cerebellar connections. Hence, the relatively greater size of the lateral cerebellum in the LCA might have provided a neuroanatomical basis for refining motor behavior and consequently also supporting other higher-order cognitive processes.
Even the brainstem of apes exhibits distinctive features in comparison with other primates. Perhaps in conjunction with the enlarged neocortex subsuming a greater diversity of functions, the dorsal cochlear nucleus of hominoids has lost the granular layer and stratified cellular organization that is evident in other primates (Moore, 1980). The functional significance of this simplified dorsal cochlear nucleus in hominoids is unclear. Conversely, the facial nucleus of the great ape and human clade displays a larger volume and a greater number of neurons than would be expected for a non-hominid primate of the same medulla size (Sherwood, 2005; Sherwood et al. 2005a). These changes in facial nucleus anatomy might subserve the increased facial mobility of hominids compared to other primates (Dobson, 2006). Additionally, a neurochemically distinct nucleus has been described in the brainstem of humans, the nucleus paramedianus dorsalis, which is absent in all other mammals investigated, including macaque monkeys – great apes have not yet been studied (Baizer et al. 2007). It is hypothesized that this nucleus receives vestibular inputs related to eye movements.
Finally, many life history characteristics relevant to brain growth show modifications in hominoids relative to other primates. Skeletal, dental, and sexual maturation are delayed and the lifespan is elongated. In fact, Miocene fossil hominoids (Sivapithecus and Dryopithecus) appear to have matured at rates and durations similar to those of the living apes (Kelley, 2004). Prolonged life history phases have been shown to relate to the evolution of large brain size (Sacher, 1982; Smith, 1989; Allman et al. 1993; Deaner et al. 2002). The extended period of offspring dependency, combined with more slowly maturing neuronal pathways in the LCA, would have provided an opportunity for learning in a social environment to strongly shape plastic changes in the developing brain.
Comparative research has shed light on the cognitive specializations of modern humans (Subiaul et al. 2006). Here we will focus on those findings that have received the most empirical support, including those pertaining to joint or shared attention, the understanding of minds, and imitation learning.
Various studies (for reviews see Povinelli, 2000; Subiaul et al. 2006) have revealed that chimpanzees and humans share many aspects of gaze-following behavior including: (1) the ability to extract specific information about the direction of gaze from others; (2) displaying the gaze-following response whether it is initiated by movements of the hand and eyes in concert or by the eyes alone; (3) using another's gaze to visually search into spaces outside their immediate visual field in response to eye plus head/upper torso movement, eye plus head movement or eye movement alone; (4) not requiring direct visual perception of the shifts in another's gaze direction to follow it into a space outside their immediate visual field; and (5) possessing at a minimum a tacit understanding of how another's gaze is interrupted by solid, opaque surfaces.
But important differences exist alongside these similarities in the gaze-following behavior of humans and great apes. For example, Okamoto and colleagues (2002, 2004) reported a case study in which an infant chimpanzee failed to look back at the experimenter after following her gaze to an object located behind him. This type of triadic interaction, which exists between mother, child, and an object of interest, has been widely reported in the human developmental literature but it is largely absent in the animal literature. Researchers have offered various explanations for these differences. Among humans, a number of changes in social communication occur around 9 months of age (Carpenter et al. 1998). For instance, by 6 months, human infants interact dyadically with objects or with a person in a reciprocal, turn-taking sequence. However, they do not interact with a person who is manipulating objects (Tomasello, 1999). From 9 months on, infants start to engage in triadic exchanges with others. Their interactions involve both objects and persons, resulting in the formation of a referential triangle of infant, adult, and object to which they share attention (Tomasello, 1999; Rochat, 2001).
Mirror self-recognition (Gallup, 1970) and the many parallels between human and chimpanzee gaze-following (Povinelli & Eddy, 1996c) have been widely interpreted to indicate that chimpanzees and other great apes have a ‘theory of mind’ (Premack & Woodruff, 1978). One way researchers have attempted to assess whether an agent possesses an understanding of unobservable mental states is to measure whether individuals associate certain observable features such as eyes with unobservable psychological states such as ‘seeing.’ In a series of classic studies, Povinelli & Eddy (1996c) used the chimpanzee's natural begging gesture – an out-stretched hand, palm facing up – to make requests to a human experimenter. When chimpanzees were confronted with two experimenters, one whose eyes were visible and therefore could respond to their gestures, and another whose eyes were covered or closed and therefore could not respond to their gestures, results revealed that chimpanzees showed no preference for gesturing toward the experimenter who could see them. There was one exception. Subjects responded correctly in the condition where one experimenter faced forward and another faced backwards. In a follow-up experiment to exclude the possibility that chimpanzees were using a global cue (e.g. body orientation) to guide their responses, Povinelli & Eddy (1996c) had two human experimenters turn their backs to a chimpanzee subject, but one looked over their shoulder. In this condition, performance dropped to chance. Many aspects of these results have now been independently replicated by other comparative psychologists working with captive chimpanzees (e.g. Kaminsky et al. 2004).
This pattern of performance sharply contrasted with the performance of human children. In an experiment similar to the one described above, children were trained to gesture to an experimenter to request brightly colored stickers (Povinelli & Eddy, 1996c). They were tested on several of the same conditions used with the chimpanzees, and the youngest children (2-year-olds) were correct in most or all of the conditions from their very first trial.
Hare, Call, Tomasello and their associates have challenged Povinelli & Eddy's (1996c) conclusion that theory of mind is unique to modern humans (Hare et al. 2000, 2001, 2006; Hare, 2001). Hare and colleagues used a ‘competitive paradigm’, where individuals must compete with conspecifics or human experimenters for food, because they argue that this paradigm is more ecologically valid than the ‘cooperative paradigm’, where subjects gesture to an experimenter, used by Povinelli and Eddy (Hare, 2001; Hare & Tomasello, 2004). In the competitive paradigm, a dominant and a subordinate chimpanzee were placed on opposite sides of a large enclosure. In certain trials both the subordinate and the dominant animal were in view of one another and food was placed in a position that was visible to both. In other trials, food was strategically placed in a position that was visible only to the subordinate. In this experiment, the subordinate animals avoided the food that was visible to the dominant animal but not the food that had been positioned so that only the subordinate animal could see it. The authors interpret these results as evidence that chimpanzees have the capacity to infer some aspects of mental states such as ‘seeing’ (Hare et al. 2000, 2001, 2006; Hare, 2001). These results have been further supported by Hostetter and colleagues (2007) who presented over 100 chimpanzees with treatment conditions similar to those presented by Povinelli and Eddy (1996c). They report that chimpanzees produced more overt behaviors (e.g. vocalizations) when the experimenter's eyes were visible than when the experimenter's eyes were covered or not visible.
However, the see/not-see paradigm (whether competitive or cooperative) poses several distinct problems as a way of addressing whether nonhuman animals are capable of mental state attribution (for a review see Vonk & Povinelli, 2006; Subiaul, 2007). The main problem involves whether or not Povinelli and Eddy's (1996c) and Hostetter et al.'s (2007) ‘cooperative paradigm’ or Hare et al.'s (2000, 2001, 2006) ‘competitive paradigm’ can adequately isolate nonhuman primates’ understanding of unobservable psychological states such as ‘seeing’ from their use of non-psychological, observable cues such as eyes. Certainly, reasoning about mental states or any other ‘unobservable’ is premised on a correlation between a behavioral or physical feature (e.g. eyes) and an underlying cognitive state (e.g. ‘seeing’). However, it is possible that these are decoupled in the minds of nonhuman primates. This presents the distinct possibility that an individual could reason adeptly about the visibility of eyes (or other observable features) without simultaneously reasoning about what eyes do (‘see’). In such an instance, an individual that reasons only about eyes would behave remarkably like someone who associates eyes with the psychological state of ‘seeing.’ For these reasons, Povinelli and colleagues (Povinelli, 2000; Vonk & Povinelli, 2006) have argued that the see/not-see paradigms (e.g. Hare et al. 2000, 2001; Hostetter et al. 2007; Povinelli & Eddy, 1997) are fundamentally flawed and cannot be used as evidence that animals have an understanding of a mental state such as seeing. The development of an experimental paradigm that overcomes this interpretational challenge continues to elude researchers.
Given the varied and dynamic ability of modern humans to learn new behaviors, is it possible that our species deploys a dramatically different cognitive mechanism when learning new skills? To what extent is ‘learning by imitation’ (i.e. ‘novel imitation’) unique to humans? To date, 10 studies have directly compared novel imitation – or imitation learning, where individuals must copy responses or rules that do not already exist in their behavioral repertoire – in human and nonhuman adult primates using analogous procedures (Nagell et al. 1993; Tomasello et al. 1993; Call & Tomasello, 1995; Whiten et al. 1996; Horowitz, 2003; Horner & Whiten, 2005; Herrmann et al. 2007; Subiaul et al. 2007). Only one study has compared novel imitation in monkeys and children (Subiaul et al. 2007). Six studies have reported that on an operational task, where a tool or object had to be manipulated in a certain manner to retrieve a reward, humans reproduced the demonstrator's actions with greater fidelity than did great apes, who either reproduced only the outcome of the modeled actions or did not imitate at all (Nagell et al. 1993; Tomasello et al. 1993; Call & Tomasello, 1995; Call et al. 2005; Herrmann et al. 2007; Horner & Whiten, 2007). The other four studies reported both similarities and differences between humans and chimpanzees when executing specific actions on an object following a demonstration (Whiten et al. 1996; Horner & Whiten, 2004). Two studies, one that involved an operational-tool task (Horowitz, 2003) and another that used a cognitive imitation paradigm (Subiaul et al. 2007), found no differences between the performance of humans and nonhuman primates. Thus, ‘imitation’ is not a singular cognitive mechanism. Different aspects of the imitation faculty in humans are shared with chimpanzees and other primates, whereas other characteristics appear to be unique in our species (Subiaul, 2007).
The newborn's ability to copy the orofacial expressions of a model, for instance, appears to be a behavioral trait that is shared among humans, chimpanzees, and rhesus macaques. Neonatal chimpanzees and rhesus macaques, like human infants (e.g. Meltzoff & Moore, 1977), reproduce tongue protrusions and mouth openings in response to a model displaying the same expression (Myowa-Yamakoshi et al. 2004; Ferrari et al. 2006). There are also striking parallels in the developmental trajectory of orofacial imitation in these species. Similar to humans (Abravanel & Sigafoos, 1984), the incidence of orofacial imitation in chimpanzees slowly disappears after 9 weeks of age (Myowa-Yamakoshi et al. 2004).
Recent experiments, however, appear to show that both cooperation and imitation come more ‘naturally’ to human children than to young chimpanzees (Horner & Whiten, 2005, 2007; Herrmann & Tomasello, 2006). For example, it has been shown that human children, but not nonhuman animals, learn in a ‘ghost condition’, which is an experimental social learning control condition where the actions of a model are removed and the target object and the consequent results occur automatically (i.e. as if executed by a ghost) (Subiaul et al. 2004; Thompson & Russell, 2004; Huang & Charman, 2005). Other studies have also demonstrated that human children learn from others’ mistakes (Want & Harris, 2001; Subiaul et al. in review) but nonhuman primates do not (Horner & Whiten, 2007). These results may be indicative of fundamental differences between species where only humans are capable of differentiating between a correct ‘intentional’ response and an incorrect ‘unintentional’ response, and of engaging in counterfactual reasoning (Want & Harris, 2002; Subiaul et al. in review). As such, the human imitation faculty likely has added more functions or possibly become functionally linked with other psychological faculties, such as theory of mind, granting it greater flexibility and power to copy a broad range of rules and responses, including the ability to reason about physical capability and counterfactual information (learning from others’ mistakes).
Language and the thought that it expresses constitute arguably the most distinctive feature of human behavior. Yet, consensus about the course of language evolution is elusive. Controversies pervade not only speculation about the phylogeny of distinctively human language, but also the characterization of what has evolved, the linguistic component of the human behavioral phenotype. There is at least this much agreement about human language – it is a form of communication that is unique in the natural world (Hockett, 1960; Bickerton, 1990; Hauser, 1997b; Christiansen & Kirby, 2003a,b). Unlike systems of communication employed by other species, human language is said to have: (1) modality/stimulus independence, (2) duality of patterning, (3) shared, arbitrary symbols, capable of displaced reference, (4) generalized systematicity/domain independence, and (5) hierarchical/recursive structure or syntax.
Unlike nonhuman animal communication systems, human language is both modality and stimulus independent (Hockett, 1960; Hauser, 1997b; Hauser et al. 2002a). It is modality independent because it can take oral, visual, gestural, and even tactile forms. In other words, human language is abstract enough to be communicated in radically different media. In contrast, the communication signals of nonhuman animals do not display similar flexibility of modality. For example, mating or threat displays always employ the same sequences of bodily movements, and cannot be conveyed in different modalities.
The stimulus independence of human language refers to its freedom from specific environmental triggers. Human language users can speak about anything in virtually any circumstances. For example, we can talk of food even without food present or without being hungry. This contrasts with animal communication systems, which for the most part are under the control of very specific environmental or endogenous triggers. Both the modality and stimulus independence of human language imply that it is under far more voluntary control than typical, nonhuman animal communication systems (Deacon, 1997).
Although no form of nonhuman animal communication approaches the extreme stimulus- and modality independence of human language, there is evidence that nonhuman primate species, and even some non-primates, are capable of a certain degree of voluntary control over communicative acts. The best-studied examples are the use of gestures to indicate intentions in various great ape species, and the use of referential vocalizations in various monkey species (Tomasello & Call, 1997). Chimpanzees, bonobos, and gorillas have been shown to use ‘ontogenetically ritualized’ gestures to express intentions to conspecifics. These gestures begin as direct behavioral manipulations that are gradually truncated as interactions recur. For example, the initiation of play by a chimpanzee might start off as a play-hit, and then become ritualized, such that merely raising an arm comes to indicate the intention to play (Tomasello & Call, 1997). Such gestures are often used for very different communicative purposes, indicating some intentional, voluntary control, and stimulus independence.
Perhaps the best-known example of intentional vocal communication in nonhuman primates is the vervet monkey's system of warning calls (Cheney & Seyfarth, 1990; Tomasello & Call, 1997). Vervets produce different vocalizations depending on whether they see predatory cats, birds, or snakes. Other members of a vervet troop respond to such signals with avoidance behavior appropriate to the indicated predator. The proper use of these signals must be learned during subadulthood. Also, there is anecdotal evidence that vervets are capable of tailoring these calls to specific audiences, indicating stimulus independence, and some degree of intentional, voluntary control (Cheney & Seyfarth, 1990; Tomasello & Call, 1997). It is worth noting that several other nonhuman primate species, as well as non-primates, including domestic chickens, are capable of comparable control over predator-warning vocalizations (Hauser, 1997b; Tomasello & Call, 1997; Zuberbühler, 2006).
Although it is clear that various nonhuman species are capable of voluntary, intentional, stimulus- and modality independent communication, these capacities are extremely limited when compared with the human capacity for language. The ontogenetically ritualized gestures observed in various great ape species are limited to intention expression, attention getting, or other conspecific-manipulative functions, so the range of stimuli that can elicit them is relatively limited. The modalities available for expressing such communicative acts are also limited to ritualized versions of the kinds of manipulative behaviors from which they originate. There is very little evidence that, for example, chimpanzees can accomplish with vocalization the same sorts of communicative acts as they can accomplish with gesture (Pollick & de Waal, 2007). Monkey vocalizations appear even more limited in these respects. Vervet referential vocalizations are limited to the function of warning conspecifics about predators, so the range of stimuli that elicit them is also relatively narrow. Furthermore, vervets do not use gestures, for example, to communicate the same warning signals, indicating a lack of modality independence.
Alone among natural communication systems, human language is compositional at two levels (Hockett, 1960; Pinker & Jackendoff, 2005). At the first level, neural representations of a finite set of meaningless gestures, known as ‘phonemes’, can be systematically combined into a much larger set of meaningful units, like representations of words, called ‘morphemes.’ At the second level, this set of morphemes can be systematically combined into an infinite set of larger meaningful units, e.g. phrases, clauses, and sentences. The meanings of these larger units are systematic functions of the meanings of the morphemes that compose them, computed via recursive, syntactic rules.
Some argue that certain nonhuman animal communication systems, like birdsong and whalesong, have an analog of the first compositional level (Hauser, 1997b; Okanoya, 2002; Fitch, 2004, 2005; Pinker & Jackendoff, 2005) – a finite set of meaningless sounds can be systematically combined into a much larger set of sequences of such sounds. However, these sequences of sounds are not meaningful in the way that morphemes of human language are, i.e. they are not words with referential meaning, and they are not themselves combined into higher-order meaningful units like phrases, clauses, and sentences (Fitch, 2004, 2005; Pinker & Jackendoff, 2005).
Human language consists of shared, arbitrary symbols capable of displaced reference, i.e. the capacity to refer to events and objects that are not perceptually present, like spatially or temporally distant objects and events (Hauser, 1997b; Hauser et al. 2002a; Pinker & Jackendoff, 2005). For example, a word like ‘star’ is an arbitrary symbol that has no connection, e.g. resemblance or correlation, with what it stands for. It is shared in that both the producer and the consumer of the symbol ‘star’ understand it to stand for the same object. This is made possible by the consumer's capacity to infer the communicative intention of the producer, premised on a well-developed theory of mind. Finally, the reference of the symbol need not be perceptually salient or restricted to the here-and-now. In fact, language can be used to talk about things with which it is not possible to have perceptual contact, either because they are spatio-temporally too removed, e.g. the beginning of the universe, or because they are not located in space and time, e.g. abstractions like the number two.
There is some controversy about the degree to which nonhuman animals are capable of shared, symbolic, displaced reference (Hauser et al. 2002a; Pinker & Jackendoff, 2005). Honeybees appear capable of encoding spatially removed locations using a ‘dance-language’, components of which correspond systematically to distance and direction (Von Frisch, 1967; Hauser, 1997b). However, the honeybee dance is not symbolic, i.e. it stands in a non-arbitrary relation to the locations it encodes. Because components of the dance, e.g. ‘waggles’, correspond systematically to distance and direction (Hauser, 1997b), the dance is more of an iconic representation, like a map, than a symbolic representation.
Another example of nonhuman shared, symbolic, displaced reference comes from studies of language-trained great apes. The most famous of these, the bonobo Kanzi, has learned to use visual symbols, with arbitrary referents, to communicate with his handlers (Savage-Rumbaugh et al. 1998). Kanzi appears capable of inferring simple communicative intentions, and can use symbols to refer to objects that are spatially displaced. However, this feat was heavily reliant on scaffolding by Kanzi's handlers (Tomasello & Call, 1997). There is very little evidence of such capacity in wild populations. So even if nonhuman animals are capable of some rudimentary, symbolic communication, it does not come easily or naturally.
Human language is characterized by two distinctive, related and, as far as we know, unparalleled semantic properties. Firstly, human language is generally systematic: it can represent any object for which it has a term as possessing any property for which it has a term (Evans, 1982; Fodor & Pylyshyn, 1988). This property gives language a boundless, creative capacity for representing unobserved and unobservable situations, for example, cats that walk upright in boots and talk, etc. Such talk and the thought it expresses also enable the discovery of the hidden mechanisms of nature – light can be thought of as a wave, for example (see Mithen, 1996; Carruthers, 2003; Camp, 2004). Second, human language is task-domain neutral; it represents information about the world as independent of any task to which it might be put. For example, the sentence ‘There are fruit trees beyond the hill’ has no immediate implications for action; it puts no constraints on what the speaker or hearer can do with this information. These two properties are closely related. A sentence that systematically combines words for concepts from two different cognitive domains cannot itself be domain specific. It must transcend the proprietary task domains of its components, and achieve a kind of task-domain neutrality.
There are good reasons to doubt that nonhuman animals are capable of generally systematic and task-domain neutral communication or thought. Nonhuman primates show little evidence of it (Cheney & Seyfarth, 1990; Tomasello & Call, 1997; Hurley, 2003) and it is difficult to imagine ecological circumstances that could have selected for this kind of thought in nonhuman species. Indeed, there is little archaeological evidence that even our relatively recent hominin ancestor, Homo erectus, was capable of such thought (Mithen, 1996). For example, prior to the origin of modern Homo sapiens, hominin tools lacked totem-like, symbolic decoration. By contrast, the archaeological record that dates roughly from the speciation of Homo sapiens shows clear evidence of domain-neutral cognition – tools incorporate animal products, and display increasingly complex, symbolic decoration, often depicting fauna. Mithen (1996) argues that this demonstrates a capacity to integrate information from disparate cognitive domains, including knowledge of the natural world, tool-making capacity, and social cognition. It is unclear how and why such domain-neutral cognition evolved in humans. One suggestion is that once the capacity to encode complex information in a public medium, i.e. complex language, evolved, a ‘common code’ became available in which information from previously isolated cognitive modules could be integrated. On this view, domain-neutral, generally systematic thought is a kind of ‘thinking for speaking’ (Slobin, 1991) made possible by the prior evolution of a capacity for complex, public language (Mithen, 1996; Zawidzki, 2006).
Since Chomsky's early work (Chomsky, 1957), the claim that natural language exhibits hierarchical, recursive structure has had the status of orthodoxy. Sentences of natural language are composed of nested hierarchies of sub-sentential units. For example, consider the sentence, ‘The herd, beyond the woods, north of the plain, west of the hills is on the move.’ This sentence consists of more than just words; it consists of sub-sentential compounds of these words known as phrases. Because phrases can be nested within other phrases, we can construct sentences of arbitrary length, conveying information of arbitrarily precise specificity. This indefinitely extendable hierarchical nesting of phrases constitutes the recursive, syntactic structure of human language (Pinker & Jackendoff, 2005).
Many theorists consider recursive syntax the most distinctive and important property of human language, on which many of its other unique properties depend (Bickerton, 1990, 1998, 2000, 2003; Hauser et al. 2002a). This is currently a topic of great controversy (Hauser et al. 2002a; Fitch et al. 2005; Jackendoff & Pinker, 2005; Pinker & Jackendoff, 2005). Bickerton (1990) argues that recursive syntax explains the creativity, or the ‘generalized systematicity/domain independence’ of language. Deacon (1997, 2003) argues that communication is truly symbolic only if communicative acts constitute a system governed by internal structural relations, much like recursive syntax. Similarities between the syntactic structure of human language and the hierarchically organized action sequences of Acheulean toolmaking in later hominins have been noted (Holloway, 1969), with recent functional neuroimaging evidence suggesting a shared neural substrate related to complex, goal-directed action (Stout et al. in press). However, others downplay the importance of syntax – Wray (2000), Hurford (2003), Tomasello (2003), and Christiansen & Kirby (2003b) all suggest that syntax is an artifact of cultural evolution since the emergence of Homo sapiens and that it is not as biologically central as some of the other properties of language discussed here.
There is evidence that nonhuman animals are incapable of both comprehending and producing hierarchically recursive structures (Fitch & Hauser, 2004). Indeed, Kanzi, the most successful of the language-trained apes, is conspicuously incapable of acquiring syntax (Corballis, 2002).
The single most obvious neuroanatomical specialization of Homo sapiens is the large absolute and relative size of the brain. Figure 1b summarizes this and other unique features of the human brain. Averaging about 1400 g, human brains are approximately three times larger than those of great apes. This indicates that a significant amount of brain mass increase occurred along the hominin lineage since its origination from the LCA. Humans also outrank all other animals in measures of encephalization, the degree to which brain mass exceeds expectations based on allometric scaling for body mass (Fig. 2a) (Holloway & Post, 1982; Martin, 1990). Fossil evidence indicates that in the hominin lineage since the LCA there have been periods of gradual increases in cranial capacity that were occasionally accompanied by increases in body mass. However, starting at about 1.8 Ma, beginning with Homo erectus, brain expansion in hominins occurred at a much more rapid pace (Holloway et al. 2004).
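The notion of encephalization used above can be written compactly. The following is only a generic sketch of how such a measure is usually formulated; the symbols $E$, $M$, $k$, $\alpha$, and $\mathrm{EQ}$ are introduced here for illustration and do not appear in the text, and no particular values of the fitted constants are implied.

```latex
% Expected brain mass from an allometric (power-law) fit to body mass M,
% where k and \alpha are constants estimated from a reference sample:
E_{\mathrm{exp}} = k \, M^{\alpha}
% Encephalization is then the ratio of observed to expected brain mass:
\mathrm{EQ} = \frac{E_{\mathrm{obs}}}{E_{\mathrm{exp}}} = \frac{E_{\mathrm{obs}}}{k \, M^{\alpha}}
```

A value of $\mathrm{EQ} > 1$ indicates a brain larger than predicted for that body mass. Because $k$ and $\alpha$ depend on the reference sample used for the fit, encephalization values from different studies are not directly comparable.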
Recent progress has been made in identifying possible genetic mechanisms underlying the expansion of the neocortex in human evolution (Bradley, 2008). Several genes that are known to participate in regulating the dynamics of proliferation and programmed death of cerebral precursor cells show evidence of positive selection in the hominin clade since the LCA (Evans et al. 2004; Gilbert et al. 2005; Vallender & Lahn, 2006). These putative changes in the division, differentiation, and migration of cerebral progenitor cells might explain why the modern human brain growth trajectory deviates from the pattern typical of most primates. While the neonatal brain in Homo sapiens makes up only about 27% of its adult size, other primates have relatively more mature brains at birth – newborn macaque brains are approximately 70% of adult size and newborn chimpanzee brains are 36% of adult size (Martin, 1983; Robson & Wood, 2008). At the time of birth, human brains are already about two times larger than great ape brains (Martin, 1983; Robson & Wood, 2008). Subsequently, postnatal brain growth in humans continues at its fetal rate through the first year, whereas in other primates, brain growth rates decrease shortly after birth (Fig. 2b) (Leigh, 2004). This unique human brain growth schedule is critical to achieving a high level of encephalization in the face of the obstetric constraints associated with pelvic adaptations for bipedality and provides a richer set of social and environmental stimuli to the developing infant while the brain's connections are still highly malleable. In this context, it is notable that the onset of joint attention in human infants occurs within this critical first year of life, providing the opportunity for intensive social facilitation of learning to influence synapse establishment during this period.
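The neonatal figures quoted above can be checked against each other with simple arithmetic. The sketch below uses only numbers given in the text (an adult human brain of about 1400 g, roughly three times the great ape adult mass, and neonatal fractions of 27% and 36%). Applying the chimpanzee neonatal fraction to a generic great ape adult mass derived from the "three times" statement is an assumption made here purely for illustration, and all variable and function names are hypothetical.

```python
# Back-of-envelope consistency check using only figures quoted in the text.
# Assumption (not from the source): the chimpanzee neonatal fraction is applied
# to a generic great-ape adult brain mass of one third the human adult mass.

HUMAN_ADULT_G = 1400.0            # approximate adult human brain mass (g)
HUMAN_TO_APE_ADULT_RATIO = 3.0    # humans are roughly three times great apes
HUMAN_NEONATAL_FRACTION = 0.27    # neonatal brain as a fraction of adult size
CHIMP_NEONATAL_FRACTION = 0.36


def neonatal_mass(adult_mass_g: float, neonatal_fraction: float) -> float:
    """Return the implied neonatal brain mass in grams."""
    return adult_mass_g * neonatal_fraction


ape_adult_g = HUMAN_ADULT_G / HUMAN_TO_APE_ADULT_RATIO
human_neonate_g = neonatal_mass(HUMAN_ADULT_G, HUMAN_NEONATAL_FRACTION)
ape_neonate_g = neonatal_mass(ape_adult_g, CHIMP_NEONATAL_FRACTION)

# Ratio of implied human to implied great-ape neonatal brain mass.
neonatal_ratio = human_neonate_g / ape_neonate_g

print(f"implied human neonatal brain mass: {human_neonate_g:.0f} g")
print(f"implied great-ape neonatal brain mass: {ape_neonate_g:.0f} g")
print(f"human / great-ape neonatal ratio: {neonatal_ratio:.2f}")
```

Under these assumptions the implied ratio is about 2.25, which is in line with the statement that human brains are already about two times larger than great ape brains at birth.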
Across mammals, as overall brain size enlarges, various parts do not increase at the same rate. Because of the particularly steep allometric scaling slope of the neocortex relative to other brain parts, larger brains are composed of progressively more neocortex (Finlay & Darlington, 1995). Going even beyond this general trend, the human neocortex (including both gray and white matter) exceeds predictions for a hominoid of the same total brain size (Rilling, 2006). Thus, brain size enlargement in human evolution might have led to a greater degree of functional ‘neocorticalization’, with this structure taking on more direct influence over other brain regions, allowing for greater voluntary control over actions (Deacon, 1997; Striedter, 2005). There is some evidence to support this hypothesis. First, as overall brain size increases in primates, a larger proportion of the brainstem becomes occupied by structures related to or receiving descending neocortical projections, such as the pyramidal tract, red nucleus, and pontine nuclei (Tilney, 1928). Second, tracing studies have shown that neocortical axons form synapses directly onto motoneurons of the vocal folds in the nucleus ambiguus in humans (Kuypers, 1958a; Iwatsubo et al. 1990), but not in other primates (Kuypers, 1958b; Simonyan & Jürgens, 2003). Accordingly, functional MRI studies in humans have demonstrated the existence of an expanded representation of the intrinsic muscles of the larynx located in the dorsal precentral gyrus adjacent to the representation of the lips (Brown et al. in press). In squirrel monkeys and rhesus macaques, the cortical laryngeal area occupies a position anterior to the precentral gyrus, in the opercular part of the premotor cortex, and it does not contribute to vocalization (Simonyan & Jürgens, 2003).
This greater extent of direct cortical involvement in the activation of the vocal folds may be important in the voluntary motor control needed to learn and execute the articulatory sequences of speech. More generally, such enhanced involvement of the neocortex in voluntary control over actions might contribute to other human-specific behavioral abilities, such as the modality and stimulus independence of language.
A large number of neocortical areas in humans have been shown to have functional and structural homologues in macaque monkeys, including many higher-order multimodal and language-related areas (Preuss & Goldman-Rakic, 1991; Petrides & Pandya, 1994, 1999; Grefkes & Fink, 2005; Petrides et al. 2005). Currently, the most compelling evidence for ‘new’ neocortical areas in humans that lack homologues in macaques involves regions within posterior parietal cortex which provide additional central visual field representations and greater sensitivity for extracting three-dimensional form from motion (Orban et al. 2006). Interestingly, these very same regions in the dorsal intraparietal sulcus are activated in positron emission tomography imaging of humans learning how to fashion Oldowan-style stone tools (Stout & Chaminade, 2007). In the absence of comparable data from apes, however, it is not clear whether these posterior parietal areas are specific to Homo sapiens or whether they are shared with our close relatives.
Although the layout of the human cortical map shares many similarities with other catarrhine primates, it is possible that some territories have changed in relative size. One long-held idea is that human neocortical expansion has involved enlargement of higher-order multimodal areas to a greater degree than primary sensory and motor areas. Anatomically, this can be seen as an increase in the relative amount of ‘generalized’ or eulaminate cortex. However, humans have as much total eulaminate cortex as expected for a primate of our brain size, with slightly fewer eulaminate neurons (Shariff, 1953; Passingham, 1975b; Armstrong, 1990). Despite this finding, it remains possible that particular higher-order unimodal and multimodal cortical fields have enlarged disproportionately in humans.
Because parts of frontal cortex are implicated in executive control functions, it has long been assumed that this region was a focal point for volumetric expansion in human evolution. Recent data, however, have shown that total frontal cortex size in humans is no greater than expected based on apelike scaling trends for brain size and that it occupies a similar fraction of the cerebral cortex as in great apes (Semendeferi et al. 2002; Bush & Allman, 2004). But is the prefrontal portion of the frontal cortex relatively large in humans? Circumstantial evidence suggests that this is the case. The human prefrontal cortex exhibits more gyrification than expected for an anthropoid primate of the same brain size (Rilling & Insel, 1999). Furthermore, primary motor and premotor cortex in humans occupy a smaller proportion of the frontal lobe compared with other primates, suggesting that the remainder is comprised of a relatively large prefrontal cortex (Preuss, 2004). Yet, comparative studies that have directly examined whether prefrontal cortex or any of its subdivisions are enlarged in humans have yielded contradictory results (Brodmann, 1912; Blinkov & Glezer, 1968; Holloway, 1968, 2002; Passingham, 1973; Uylings & van Eden, 1990; Deacon, 1997; Semendeferi et al. 2001; Schoenemann et al. 2005; Sherwood et al. 2005b). Unfortunately, these studies are typically based on small sample sizes of only one or two individuals per species. No doubt, the current lack of consensus on this critical topic will be alleviated once a rigorous comparative study of prefrontal cortex volume is performed with larger samples.
Besides the prefrontal cortex, other cortical regions appear to have undergone human-specific reorganization in size. For example, primary visual cortex is only about one and a half times larger in humans than in great apes, while the rest of neocortex is about three times larger (Stephan et al. 1981). Human primary visual cortex is also substantially smaller than predicted by allometry for total human brain size (Holloway, 1996). The relatively small size of primary visual cortex in humans suggests that adjacent areas of the posterior parietal cortex have disproportionately increased in volume (Holloway, 1996; Holloway et al. 2004). Several endocasts of Australopithecus afarensis and Australopithecus africanus from approximately 4–2.5 Ma show evidence that the lunate sulcus, which marks the border between primary visual cortex and parietal cortex, had already shifted to a more humanlike configuration (Holloway et al. 2004). This suggests that early hominins had evolved a relatively enhanced representation of visuospatial and sensorimotor integration in the posterior parietal cortex prior to dramatic brain size expansion. Because posterior parietal cortex is active in object manipulation tasks and motor planning (Shibata & Ioannides, 2001; Stout & Chaminade, 2007), it is possible that this cortical reorganization opened the door to stone tool production in later hominins. Further geometric expansion of the parietal lobes appears to distinguish modern Homo sapiens endocasts from other hominins (Bruner, 2004).
Another region that shows extraordinary enlargement in humans is the temporal lobe. Based on measurements of MRI scans, Semendeferi & Damasio (2000) and Rilling & Seligman (2002) demonstrated that the human temporal lobe, especially the underlying white matter, exceeds allometric predictions based on hominoids. The relative increase in white matter interconnectivity in humans appears to be concentrated in the region immediately beneath gyri rather than within the central core (Schenker et al. 2005), suggesting that the human temporal lobe is specifically characterized by increased local connectivity between neighboring cortical fields. The relative increase in the volume of the human temporal lobe, furthermore, is related to coordinated reorganization of nuclei in the amygdala (Barger et al. in press). Enlargement of the temporal lobe and its axonal connectivity in humans is intriguing in light of the key role of this region in functions such as language comprehension, naming, verbal memory, and face recognition. Compared with great apes, an anteriorly expanded and laterally pointing temporal lobe characterizes endocasts of Australopithecus africanus, suggesting that reorganization of the cortical areas involved in some aspects of these multimodal functions might have preceded brain size enlargement in human evolution (Falk et al. 2000).
In parallel with these changes in the size of higher-order cortical areas in humans, there has also been differential enlargement of certain thalamic nuclei with which they share reciprocal connections. Humans have more neurons than other hominoids in several dorsal thalamic nuclei, including the anterior principal (anteroventral) nucleus, mediodorsal nucleus, and pulvinar, while neuron numbers in sensory relay nuclei are generally similar across these species (Armstrong, 1982). Recent data indicate that pulvinar size scales against brain volume in anthropoid primates with positive allometry, explaining the proportionally larger nucleus complex in humans (Chalfin et al. 2007). During human brain development, the pulvinar and other dorsal thalamic nuclei attract migrating neurons from the telencephalic ganglionic eminence, which mature to become GABAergic interneurons (Letinic & Rakic, 2001). This migration stream has not been observed in any other species, including macaque monkeys, where cells of exclusively diencephalic origin take up positions in the dorsal thalamus. The possibility that greater numbers of interneurons uniquely characterize the human thalamus is supported by the observation that only humans show a bimodal distribution of neuron sizes in the pulvinar, whereas apes show a unimodal distribution (Armstrong, 1981). Thus, fundamental developmental processes might have been modified in the evolution of the human brain to accommodate expanded representation of higher-order systems in the thalamus.
Also concomitant with the elaboration of certain neocortical areas in humans, the cerebellum is large relative to body size (Rilling, 2006). Cerebellar enlargement in humans is not surprising given that it is linked by extensive connections with the neocortex and these two structures have evolved as a coordinated system across primates (Whiting & Barton, 2003). Interestingly, the ventral portion of the cerebellum's dentate nucleus is relatively larger in humans than in the great apes (Matano, 2001). This part of the dentate nucleus projects to non-motor regions of the frontal lobe by way of the ventrolateral thalamus. Therefore, the human cortico-cerebellar circuit may be distinguished from other primates in having a greater development of the connections with frontal association areas that play a role in cognition and language (Leiner et al. 1993).
As described above, many neuroanatomical asymmetries would have been present in the LCA. However, humans have elaborated on these asymmetries and evolved a much greater degree of hemispheric lateralization. Across Homo sapiens the incidence of right-handedness is approximately 90%, particularly for fine motor tasks involving precision grip and manipulation of tools. In contrast, most other primates do not display such pronounced bias for hand use at the population level (McGrew & Marchant, 1997). When tested with an experimental coordinated bimanual task, for example, captive common chimpanzees (n = 467) display population-level right-handedness at an incidence of approximately 67%, captive gorillas (n = 31) show a non-significant trend towards right-handedness, and 79% of captive orangutans (n = 19) are significantly left-handed (Hopkins et al. 2003, 2004). Thus, to date there is no evidence that any great ape species displays the same high degree of population-level handedness that is present in humans.
An additional feature of asymmetry that has increased in human evolution is a pattern of combined left-occipital and right-frontal petalias. These asymmetric lateral protrusions at the frontal and occipital poles of the cerebral hemispheres are a common feature of human brains. Although recent voxel-based MRI studies do not support previous findings of an association between this petalial torque pattern in humans and right-handedness (Good et al. 2001; Herve et al. 2006), it is nonetheless significant that this pattern of cerebral asymmetry is not consistently present in nonhuman primates or in hominin fossil brain endocasts until Homo erectus (Holloway & De La Coste-Lareymondie, 1982).
Beyond these asymmetries of the human brain's gross morphology, there are additional aspects of neuroanatomical asymmetry in the cellular organization of the neocortex that have not yet been found in other species. Human brains tend to have a greater proportion of neuropil in the left hemisphere of Broca's area (Brodmann areas 44 and 45) (Amunts et al. 1999, 2003), Wernicke's area (area Tpt) (Buxhoeveden et al. 2001), primary motor cortex hand area (Amunts et al. 1996, 1997), as well as primary visual cortex and extrastriate areas (Amunts et al. 2007). Neuropil is the space between cell bodies occupied by dendrites, axons, and synapses. In contrast, investigations of chimpanzee brains have not revealed such interhemispheric differences in the amount of neuropil in area Tpt (Buxhoeveden et al. 2001) or the primary motor cortex representation of the hand (Sherwood et al. 2007). Thus, this aspect of histological asymmetry appears to be unique to humans. Although the adaptive significance of such brain asymmetry is poorly understood, it is possible that humans have evolved a greater extent of cerebral lateralization in the context of specialization for computationally demanding functions, such as language, to avoid bilateral duplication of circuitry and interhemispheric conflict (Corballis, 1991).
Increased neocortex size in humans is not the result of a simple multiplication of uniform processing units. Shariff (1953) reported that human cerebral cortex volume is 2.75 times larger than in chimpanzees, but has only 1.25 times more neurons. This suggests that much of the increased mass of the neocortex derives from alterations within the space between cell bodies. Recent studies that make detailed comparisons of the fine structure of the neocortex among humans and their close relatives indicate that microanatomical changes have occurred in the course of human brain evolution. For example, the patterned arrangement of dendrites and local-circuit interneurons in layer IVA of primary visual cortex of humans is distinctive relative to other hominids (Preuss et al. 1999; Preuss & Coleman, 2002), potentially relating to changes in the motion-processing pathway. Additionally, in humans the spindle-shaped von Economo neurons located in layer V of anterior cingulate and frontoinsular cortex are especially large in size, more numerous, and show a greater tendency to be aggregated in clusters than in great apes (Nimchinsky et al. 1995, 1999). If, as hypothesized, von Economo neurons furnish a projection pathway that integrates interoceptive feedback and cognitive monitoring of conflict to mediate rapid non-rational behavioral selection in ambiguous social interactions, then these anatomical differences might be important in allowing humans to navigate ever more complex social networks (Allman et al. 2005).
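The inference drawn from Shariff's (1953) figures can be made explicit with a short calculation. This is a worked sketch only: the symbols $V$ (cortical volume), $N$ (neuron number), and the subscripts $h$ (human) and $c$ (chimpanzee) are introduced here for illustration, and the calculation assumes that both reported ratios refer to the same cortical tissue.

```latex
% Ratios reported in the text:
\frac{V_h}{V_c} = 2.75, \qquad \frac{N_h}{N_c} = 1.25
% Implied ratio of neuron densities (neurons per unit volume):
\frac{N_h / V_h}{N_c / V_c} = \frac{1.25}{2.75} \approx 0.45
% Equivalently, the implied ratio of cortical volume per neuron:
\frac{V_h / N_h}{V_c / N_c} = \frac{2.75}{1.25} = 2.2
```

On these figures, each neuron in human cortex is associated with roughly twice as much tissue volume as in chimpanzees, which is why the added mass is attributed largely to the space between cell bodies rather than to additional neurons.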
Studies of gene expression using microarray techniques have shown that the human cerebral cortex is also distinguished from chimpanzees and other primates in displaying up-regulation of genes related to neuronal signaling, plasticity, and metabolic activity (Cáceres et al. 2003; Preuss et al. 2004; Uddin et al. 2004). These observations are further supported by findings that various subunits of the mitochondrial electron transport chain show evidence of natural selection in the human terminal lineage (Grossman et al. 2004; Uddin et al. 2008a). Some of the increased mass-specific metabolic demand of human neocortex is expected given the energetic costs of maintaining membrane potentials in neurons that have expanded dendritic arbors and longer axonal projections in a large brain (Elston et al. 2006). Congruent with this idea, there are increasing numbers of glial cells relative to neurons in the primate neocortex as a function of brain size, and humans have the highest glia-neuron ratio (Sherwood et al. 2006). Other findings indicate that two thrombospondins, THBS2 and THBS4, have elevated expression in the neuropil of the adult human neocortex and striatum (Cáceres et al. 2007). These proteins are astrocyte-secreted factors that have the capacity to induce synapse formation. Therefore, their increased expression suggests that the human brain might be distinguished by enhanced synaptic plasticity in adulthood, comprising a possible molecular substrate of greater flexibility of behavior and capacity for learning.
In recent years, a tremendous amount of progress has been made in understanding the underpinning of human brain evolution through the examination of changes in gene sequences (Enard et al. 2002a; Dorus et al. 2004; Fisher & Marcus, 2006; Bradley, 2008). These comparative studies have revealed many intriguing human-specific genetic differences relative to chimpanzees and other primates. While some have clear implications for the human brain phenotype, the significance of others remains mysterious.
One of the first brain-important genes reported to show sequence changes in human evolution was the transcription factor FOXP2 (Enard et al. 2002b). Based on the association of certain point mutations of this gene with grammatical impairment and orofacial dyspraxia, accompanied by functional under-activation and structural abnormalities of language-related brain regions, FOXP2 has been suggested to play a role in the development of language and speech in humans (Fisher & Marcus, 2006). Although the FOXP2 gene is highly conserved across mammals (with identical amino acid sequences in rhesus macaques, gorillas, and chimpanzees), humans have fixed mutations that yield two amino acid substitutions in comparison with other primates, suggesting positive selection for its function. The human variant of FOXP2 has been reported to also be present in Neandertals (Krause et al. 2007). As of yet, however, it is not clear how these genetic changes might relate to modifications in human neuroanatomy relevant to language or speech.
AHI1 is another gene that shows evidence of positive selection in human evolution (Ferland et al. 2004). This gene is required for normal axonal pathfinding in development that leads to decussation of the corticospinal tract and superior cerebellar peduncles. Deleterious mutation of this gene in human patients leads to Joubert syndrome, a condition that presents with abnormalities of motor coordination and gait, mental retardation, and antisocial behavior. The evidence for AHI1 gene evolution indicates that some particular aspects of neuronal interconnectivity were selectively modified in human evolution, possibly in support of our species’ derived mode of gait and posture.
Using novel search strategies, other studies are identifying new genomic regions that might play a role in the evolution of human brain structure and function. For example, a previously unknown gene was recently identified that is markedly amplified in the human brain (Popesco et al. 2006). This gene, which encodes the DUF1220 protein domain, shows much higher copy number in humans than in other primate species. Although it has been demonstrated that the DUF1220 protein is expressed in neuronal somata and dendrites, its function is not yet understood. In another study, Pollard and colleagues (2006) identified several genomic regions that show rapid evolution in the human lineage since the LCA but are otherwise highly conserved across mammals. The most highly accelerated of these, dubbed HAR1, is a 118-bp region in the last band of chromosome 20q that encodes a stable secondary RNA structure expressed in Cajal-Retzius cells during weeks 7–9 of gestation in humans. These cell types, which also express reelin, are critical in the early specification and migration of cerebral cortical neurons into their correct layers. Further studies of the products of these genes might shed light on the mechanisms leading to the development of distinctive microanatomical features of the human brain.
Finally, recent genome-wide surveys indicate that non-coding cis-regulatory sequences in close proximity to genes involved in neuronal cell adhesion (Prabhakar et al. 2006) and neurogenesis (Haygood et al. 2007) have undergone accelerated evolution in both human and chimpanzee lineages. Of special significance, despite the overrepresentation of these gene categories in both human and chimpanzee terminal lineages, these studies have found little overlap among the specific genes underlying these shared enriched annotations. These findings suggest that independent accelerated evolution of sequences in these categories since the divergence of the human and chimpanzee lineages has contributed to distinct neural and cognitive phenotypes through differential regulation of genes involved in coordinating developmental processes.
Our current view of human mental evolution is like a jigsaw puzzle where many of the pieces have been taken out of the box, but they have not yet been put together to form a coherent picture. In the preceding sections, we have enumerated many changes that have taken place in the descent of humans from the LCA. However, while many differences can be described that distinguish humans at genetic, behavioral, and neuroanatomical levels, we are still woefully ignorant about how these apparent specializations are bound together. Modern humans differ from the reconstructed LCA, and from all other living animals, dramatically in cognition and language. In many other lineages that show such profound divergence in behavior or sensory capacities there is a comparably obvious specialization at the neuroanatomical level. Well-described examples include the increased allocation of cortical somatosensory representation for the bill in the platypus or the magnification of auditory cortex in echolocating bats (Krubitzer, 2007). Can such a direct parallel between structure and function be drawn in the case of modern humans? In the concluding section of this article, we offer a preliminary model to account for some links between brain and behavior in human evolution.
We hypothesize that subtle shifts in the genetically programmed processes underlying brain development, cellular physiology, and neurochemistry, in conjunction with biases in temperament, perception, and sensation that epigenetically affect learning processes, are sufficient to yield significant changes in the architecture and function of the modern human brain (Fig. 3). According to our model, domain-specific behavioral capacities emerge from deep shifts in the weight of basic behavioral processes like attention, executive control, working memory, and inhibition, as they guide the individual's interactions with the environment. In humans, this process unfolds through a unique postnatal ontogeny that includes especially rapid brain growth and synaptogenesis in the first year of life. The neural substrates of these changes in bias are, to some extent, emergent from allometric changes in microstructure related to brain enlargement and, to some extent, determined by non-allometric mosaic changes in neural structure and physiology.
Given the high energetic cost of neural tissue (Aiello & Wheeler, 1995), brain enlargement could only have arisen in human evolution if offset by significant fitness benefits (Barrickman et al. in press). The fact that the human neocortex tripled in size since the LCA is almost certainly related to increased general intelligence in our species. Absolute brain size has been shown to predict interspecific variation on measures of cognitive flexibility (Rumbaugh, 1997; Gibson et al. 2001; Reader & Laland, 2002; Deaner et al. 2007), that is, the ability to explore novel tactics when reward contingencies change. It has been suggested that such neural machinery for an enhanced ‘cognitive reserve’ might be required in long-lived species, such as humans, where individuals are likely to face unpredictable socioecological challenges over a long lifespan (Allen et al. 2005).
So, what computational advantage does increasing the absolute size of the brain confer? First, larger brains contain a greater total number of neurons available for data encoding and integration (Roth & Dicke, 2005). In addition, theoretical models and comparative data suggest that neocortical enlargement yields a proliferation of functionally discrete modules, which are involved in specialized information processing (Kaas, 2000; Changizi & Shimojo, 2005) and could be manifest as increased neuroanatomical lateralization (Ringo et al. 1994). In this context, it is interesting that certain characteristics of anatomical brain asymmetry are unique to humans, such as increased neuropil space in the left hemisphere and left-occipital right-frontal petalia torque. Increased anatomical modularity would seem to be congruent with the evolutionary psychology model of human cognitive evolution, which posits that new, genetically specified cognitive modules have accumulated in the modern human mind (Cosmides & Tooby, 1994; Buss, 2005; Tooby & Cosmides, 2005). Here, we have discussed several unique psychological specializations in humans, such as the capacity to reason about unobservable causes and the faculty of language. However, as discussed above, because macaque monkey homologues have been described for the majority of human neocortical areas, it seems that the large-scale modular organization of human neocortex is retained from a catarrhine primate ancestor. Although it remains a possibility that improved techniques for cortical mapping may reveal human-specific areas in the future, at the present time, we see no compelling evidence that novel domain-specific cognitive capacities in humans can be readily linked to the addition of genetically specified, specialized processing modules in the neocortex or elsewhere in the brain.
Are there other general features of neuroanatomical organization that change with increasing brain size and might account for our species’ enhanced cognitive powers? One correlate of brain enlargement, at least in anthropoid primates, is an increase in the proportion of neocortical neurons that show molecular adaptations for long-range associational axon projections (Sherwood et al. 2004; Sherwood & Hof, 2007). This suggests that the outputs of local neocortical processing in larger brains are fed forward to a greater diversity of targets. One unfortunate consequence is that these same neuron classes are especially susceptible to degeneration in aging and Alzheimer's disease in humans (Hof et al. 2002; Hof & Morrison, 2004). Other long-range projecting neuronal types might also be especially vulnerable to degenerative neuropathology. The von Economo neurons, for example, are severely and selectively affected in human frontotemporal dementia cases (Seeley et al. 2006).
Increased numbers of corticocortical association connections might, in part, contribute to the evolution of new functions in human brain regions that have homologues in other species, such as cortical ‘language’ areas. For example, in humans, the cortex of the fronto-operculum, known as Broca's area, mediates phonological and syntactical aspects of speech and language production. In addition to classical language processes, neuroimaging studies in humans have revealed that this cortical region participates in several other functions as well, such as object manipulation and grasping, imagery of motion, imitation of movements, and movement preparation and planning (Nishitani et al. 2005). Many of these non-linguistic functions have also been defined for the ventrolateral prefrontal cortex of macaque monkeys (Petrides, 2005). Thus, this region in both humans and macaques serves as an interface for the perception and orchestration of sensory and motor sequencing essential to planning, observation, understanding, and imitation of actions (Arbib, 2005).
We hypothesize that positively selected alterations in axon guidance and cell adhesion mechanisms of humans (Prabhakar et al. 2006; Uddin et al. 2008b) in combination with the general tendency for long-association projections to form in larger brains provide the inferior frontal cortex of humans with access to a greater diversity of afferents, particularly regions of the temporoparietal cortex containing multimodal semantic and lexical representations. Indeed, new diffusion tensor magnetic resonance imaging data suggest that the arcuate fasciculus, the axon pathway which links temporal cortex to the inferior frontal cortex, has a more prominent projection to middle temporal gyrus semantic processing areas in humans as compared with chimpanzees and macaques (Rilling et al. 2007). Such diversified connectivity might explain a central feature of human language and the thought it expresses – the ‘generalized systematicity’ described above, by virtue of which we can think and express thoughts that systematically combine concepts from diverse domains. This capacity explains the human facility with analogical thinking, e.g. thinking of light as a wave. More generally, enhanced corticocortical associational connectivity might contribute to other distinctive aspects of modern human cognition, such as the capacity of the imitation faculty to incorporate inferences about others’ mental states and intentions.
Many genes related to brain development and function show signs of accelerated evolution exclusively in humans (Dorus et al. 2004; Uddin et al. 2008b), suggesting that beyond brain size expansion, many other aspects of neurobiology have been differentially altered in human evolution.
Changes in the balance among brain region sizes that are established early in development might have dramatic implications for the dynamic competitive processes that ultimately result in the connectivity of the adult brain (Deacon, 1997). Shifts in the relative size of cortical and subcortical regions, or the strength of their connections, could contribute to significant alterations in the flow of information during ontogeny, with long-term repercussions for setting biases in perception and learning via modifications to domain-general features such as attention, motivation, working memory, and inhibitory control. One dramatic example of the manner in which early developmental experience can powerfully mold learning and behavior is the ‘enculturation’ experiments in apes (see Tomasello & Carpenter, 2005; but also Bering, 2004 for an alternative perspective). Our model hypothesizes that shifting biases among higher-order processing circuits involving multimodal neocortical areas and their connections in the striatum, thalamus, and cerebellum, could give rise to striking behavioral changes. Furthermore, alterations at the level of histological architecture and gene expression could also determine the shape of activity in networks of neurons.
For example, we reason that dramatic changes during ontogeny in the flow and direction of attention could potentially result in powerful domain-specific mechanisms; that is, mechanisms that perform computation on specific types of stimuli and generate specific types of behaviors. Some of these include edge detection mechanisms, categorical perception, sophisticated types of imitation learning mechanisms (e.g. novel motor and vocal imitation), the ability to make inferences about different unobserved causes (e.g. psychological, physical), and extremely flexible syntactically structured and semantically creative language use. These different cognitive skills, though modularized, are likely to exert an effect on the development and contours of other modules enhancing ‘cognitive self-control’, the ability to inhibit automatic responses based on sophisticated social understanding. For instance, cognitive self-control could help account for inferences about unobservable causes, as such capacities require the ability to inhibit stereotyped responses to superficially similar stimuli, treating them differently if there are reasons to expect different hidden causes. Similarly, some, though not all, of the social learning differences between chimpanzees and humans might amount to a capacity for attentional control – perhaps involving inhibition of stimulus-capture forms of attention that cause distraction (Tomasello & Carpenter, 2005). And our capacity to use language to talk about anything in any circumstances clearly requires sophisticated voluntary control.
In each case, domain-general processes and primary domain-specific mechanisms that are highly encapsulated (i.e. resistant to extraneous information), such as those that mediate the processing of different types of visual and auditory information (e.g. the perception of lines and phonemes), likely result in the generation of other specialized mechanisms, such as those that mediate the identification of letters and spoken words. Other domain-specific cognitive functions are likely responsible for the development of domain-specific skills that mediate the imitation of different types of stimuli and our species’ ability to flexibly and robustly reason about unobservable causes across contexts (Subiaul et al. 2006; Vonk & Povinelli, 2006). These more sophisticated cognitive skills are less encapsulated than the primary domain-specific mechanisms, as they have to be more malleable and responsive to environmental changes in order to effectively solve the species-typical problems that arise in the course of development.
Thus, the dynamic interaction between domain-general and domain-specific mechanisms could result in the proliferation of many specialized cognitive operations that ultimately become highly modularized (e.g. reading, writing). We believe that the neurobiological profile of humans, which includes a relatively large neocortex, increased relative size of certain higher-order multimodal cortical areas, specializations of projection neuron classes, and modifications to the rate of postnatal brain growth, makes such a process more powerful in our species than in other primates. Such a framework gives credence to the popular saying among evolutionary psychologists (Cosmides & Tooby, 1994) that humans have more, not fewer, ‘instincts’ than nonhuman animals.
One of the central puzzles about human cognition is its curious combination of flexibility and efficiency. While the human mind is certainly characterized by an abundance of highly efficient, domain-specific, modular capacities, many of these, like reading, writing, game playing, and musical ability, are clearly recent cultural products, and therefore indicate extraordinary ontogenetic flexibility. In tracing these capacities to ontogenetic effects of evolved, genetically determined modifications to domain-general capacities we share with the LCA, e.g. voluntary control, attention, and perceptual biases, our model accounts for the balance between flexibility and efficiency that characterizes human cognition.
We suggest that ‘descent with modification’ aptly describes the construction of the human mind. The studies we have reviewed demonstrate that, although humans have certainly acquired many novel cognitive and neural specializations in the course of evolution, a large number of features are shared exclusively with our fellow great apes. Our hypothesized model explains how incremental changes in brain development and organization might yield apparent discontinuity in our species-specific behavioral repertoire. Taken together, these observations make it inescapable to recognize the continuity in mentality between the LCA and us, despite the significant disparity in phenotypes.
The authors thank Drs Bernard Wood and Sarah Elton for inviting us to present this paper at the ‘Human Evolution’ symposium at the 2007 Proceedings of the Anatomical Society of Great Britain and Ireland. We also thank Drs M. A. Raghanti, N. M. Schenker, J. Vonk, J. M. Allman, P. R. Hof, R. L. Holloway, J. K. Rilling, and M. Uddin for providing useful comments on an earlier draft of this manuscript. This work was supported in part by the National Science Foundation (BCS-0515484 and BCS-0549117), the National Institutes of Health (NS-42867), and the James S. McDonnell Foundation (22002078).