Joint Attention

Human Uniqueness Compared to "Great Apes": Likely Difference
Human Universality: Population Universal (Some Individuals Everywhere)
MOCA Domain: 
MOCA Topic Authors: 

Joint attention is the capacity to participate in the referential triangle of self, social partner, and external object or event by jointly attending to a stimulus while simultaneously demonstrating awareness of this shared attention. It is a human-specific capacity that underlies theory of mind and cultural learning. For attention to be truly joint, the individuals involved must not only look at the same stimulus at the same time, but must also actively monitor their social partner’s attention to the object or event. In human infants, the capacity to follow gaze emerges in the second half of the first year, and by around their first birthday they are able to participate actively in episodes of true coordinated joint attention. Infant chimpanzees can follow the gaze and pointing gestures of social partners, and gorillas and bonobos have demonstrated gaze alternation and referential gestures to direct others’ attention. However, given that these primates do not look at their social partner’s face when interacting with objects and do not attempt to reengage a social partner who has ceased interacting with them, the capacity for true joint attention is thought to be unique to humans.

Though it developed along the same evolutionary trajectory as social learning, gaze following, and referential communication, true joint attention developed after our split from chimpanzees and bonobos, as this capacity is not present in our primate relatives. Given that joint visual attention allows individuals to share intentionality and participate in collaborative communication with conspecifics, this skill likely emerged in Homo erectus, when larger brains and the strong selection pressures of an increasingly complex social environment facilitated the development of more complex social cognition. One potential mechanism underlying this capacity is early socialization: chimpanzees raised by humans have developed typical joint visual attention, whereas human infants raised in many traditional cultures do not show the typical Western pattern of joint visual attention.


The timing of the appearance of the difference in the hominin lineage is given as a defined date or a lineage separation event. The point in time associated with lineage separation events may change in the future as the scientific community agrees upon better time estimates. As of 2017, lineage separation events are defined as:

  • the Last Common Ancestor (LCA) of humans and Old World monkeys was 25,000 - 30,000 thousand (25 - 30 million) years ago
  • the Last Common Ancestor (LCA) of humans and chimpanzees was 6,000 - 8,000 thousand (6 - 8 million) years ago
  • the emergence of the genus Homo was 2,000 thousand (2 million) years ago
  • the Last Common Ancestor (LCA) of humans and Neanderthals was 500 thousand years ago
  • the common ancestor of modern humans was 100 - 300 thousand years ago

Probable Appearance (Lineage Separation Event): 
Definite Appearance: 2,000 thousand (2 million) years ago
Background Information: 

Joint attention is measured by an individual either directing (e.g., pointing) or following (e.g., gaze following) another individual’s attention and then sharing the experience of attending to the object or event by spontaneously looking at the social partner’s face. This look demonstrates that the individual is motivated not only by the exciting stimulus, but also by the social experience of jointly attending to that stimulus. Beyond the ability to simply learn via observation, joint visual attention allows individuals to share intentionality and participate in collaborative communication. When this capacity comes online toward the end of the first year of life, human infants are able to gain valuable social information about their environment and later begin to develop more complex knowledge of the social environment, making joint attention a key component of the human-specific capacity for theory of mind.

The Human Difference: 

Our closest primate relatives seem to clearly identify the goals of conspecifics in order to work together toward the same goal, yet the awareness of, and desire to maintain, shared attention on that goal seems to be uniquely human. Some studies report instances of wild and captive bonobos attempting to share attention through referential gestures (e.g., pointing), and other studies report observations of gorillas engaging in triadic play interactions, yet the consensus in the literature is still that humans are the only animal demonstrating true joint engagement.

Universality in Human Populations: 

The capacity for joint attention is present in all Western, Educated, Industrialized, Rich, and Democratic (“WEIRD”) populations. However, evidence from other cultures suggests that visual joint attention is not universal among humans. For example, some cultures do not engage in mutual eye contact at all, as it represents a cultural taboo or even a sign of aggression. Because monitoring the gaze and facial expression of the social partner is a primary characteristic of traditional shared visual attention, these cultures would not be categorized as having the capacity for visual joint attention. Many cultures emphasize modalities other than gaze, such as physical contact, as cues to attention. Such cultures (e.g., in Vanuatu) initiate shared attention on an object through shared touch, suggesting that we may need to redefine joint attention in a way that better accounts for global variation in attention-sharing strategies.

Mechanisms Responsible for the Difference: 

One of the best-supported mechanisms underlying the development of visual joint attention is culturally mediated socialization. Studies have shown that chimpanzees raised by humans develop similar social cognitive abilities for initiating and maintaining visual joint attention, suggesting that early social experience leads to the acquisition of the competency for joint attention. Cultural and individual variation in the emergence of joint attention is associated with differences in parenting strategies and in the emphasis parents place on sharing visual attention, further demonstrating the importance of early social experience. There is also evidence that certain brain regions are selectively recruited during triadic interactions. Specifically, the dorsal medial prefrontal cortex (dMPFC) has been shown across many studies to be selectively activated when monitoring the attention of others, when making judgments that combine self and other, and when engaging in coordinated attention with another individual.

Possible Selection Processes Responsible for the Difference: 

It is well established that one of the strongest selection pressures leading to the Homo lineage was the increasingly complex social environment and the subsequent increase in competition for food, mates, etc. Sharing attention represents the skill and desire to work collaboratively with others, and may also underlie even more complex social tools such as theory of mind. Though other primates were seemingly on a similar trajectory of developing increasingly complex skills to help them understand others’ intentions and actions, the desire to share the experience of an external entity or event brought our ancestors in the Homo lineage a step further in the acquisition of the most complex social toolkit.

Implications for Understanding Modern Humans: 

This difference provides insight into a possible mechanism that allowed humans to overcome the challenges of an increasingly competitive social environment through enhanced cooperation and social reward mechanisms.

Occurrence in Other Animals: 

Other animals demonstrate sensitivity to social cues, most notably head orientation, that allow them to orient to the same external entity as another animal. Dogs, horses, and other domesticated animals can follow human gaze, an ability thought to have emerged during their long history of human-controlled selective breeding.

References:

  1. The function and mechanism of vocal accommodation in humans and other primates, Ruch, Hanna, Zürcher Yvonne, and Burkart Judith M., Biol Rev Camb Philos Soc, 2018 May, Volume 93, Issue 2, p.996-1013, (2018)
  2. On Privileging the Role of Gaze in Infant Social Cognition, Akhtar, Nameera, and Gernsbacher Morton Ann, Child Dev Perspect, 2008 Aug, Volume 2, Issue 2, p.59-65, (2008)
  3. Cooperative activities in young children and chimpanzees, Warneken, Felix, Chen Frances, and Tomasello Michael, Child Dev, 2006 May-Jun, Volume 77, Issue 3, p.640-63, (2006)
  4. Development of social cognition in infant chimpanzees (Pan troglodytes): Face recognition, smiling, gaze, and the lack of triadic interactions, Tomonaga, Masaki, Tanaka Masayuki, Matsuzawa Tetsuro, Myowa-Yamakoshi Masako, Kosugi Daisuke, Mizuno Yuu, Okamoto Sanae, Yamaguchi Masami K., and Bard Kim A., Japanese Psychological Research, Volume 46, p.227–235, (2004)
  5. Gaze-following and joint visual attention in nonhuman animals, Itakura, Shoji, Japanese Psychological Research, Volume 46, p.216–226, (2004)
  6. Social cognition, joint attention, and communicative competence from 9 to 15 months of age, Carpenter, M, Nagell K, and Tomasello M, Monogr Soc Res Child Dev, 1998, Volume 63, Issue 4, p.i-vi, 1-143, (1998)
  7. Joint attention and imitative learning in children, chimpanzees, and enculturated chimpanzees, Carpenter, Malinda, and Tomasello Michael, Social Development, Volume 4, p.217–237, (1995)
  8. Joint attention as social cognition, Tomasello, Michael, Joint attention: Its origins and role in development, p.103–130, (1995)