Using transcribed child-directed conversations and a computational model trained on that kind of naturalistic speech, the team measures how predictable each semantic role is at every position in an utterance, first validating the model against experimental data from each language. The main pattern is clear: agents are predictable early and reliably, regardless of the language or its word order. Identifying patients (the ones acted upon) requires additional context and varies with each language's grammar. That pattern points to a learning environment in which doer roles are salient and easy to map even from limited input.
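To make the measurement concrete, below is a minimal sketch of how incremental predictability is often operationalized: the surprisal of each word given the words before it, computed from a language model. Everything in it is illustrative; the gpt2 checkpoint and the example sentence are stand-ins, not the study's model or data, which involved child-directed speech in Tagalog, English, and Turkish.

```python
# A minimal sketch of "incremental predictability": per-word surprisal
# under a causal language model, where low surprisal = high predictability.
# Assumptions: "gpt2" is a stand-in model and the sentence is invented;
# the study's own model was trained on child-directed speech.
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

ids = tokenizer("The dog chased the ball", return_tensors="pt").input_ids

with torch.no_grad():
    # Next-token log-probabilities at every position: shape (1, seq_len, vocab).
    log_probs = torch.log_softmax(model(ids).logits, dim=-1)

# Surprisal of token t given tokens 1..t-1: -log2 P(token_t | context).
for t in range(1, ids.size(1)):
    nats = -log_probs[0, t - 1, ids[0, t]].item()
    print(f"{tokenizer.decode(ids[0, t])!r}: {nats / math.log(2):.2f} bits")
```

On this view, a role that listeners can anticipate with little preceding context, as the paper reports for agents, would show low surprisal even early in the sentence.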

For parents, educators, and designers of language-learning tools, these results matter because they point to which meanings children grasp first and which require richer context. If agent roles stand out across languages, early interactions that highlight actions and actors could scaffold learning efficiently. The article connects this bias to broader questions about human cognition, development, and inclusion: how do universal tendencies interact with language-specific cues to shape who gets noticed in conversation? Read the full article to explore how these findings might change approaches to supporting language growth and equitable communication.

Abstract
Language comprehension unfolds incrementally, requiring listeners to continually predict and revise interpretations. Comprehenders across very diverse languages show a consistent preference for agents, anticipating the agent (“the doer” of an action) more strongly than the patient (“the undergoer”). An unresolved question is how the preference develops in children given incomplete utterances and argument omission in their input. Here, we approach this question by quantifying the incremental predictability of semantic roles (agents vs. patients), probing specifically what kind of contextual information impacts ease of learning. We use transcribed utterances from child-directed speech in three languages, differing in critical conditions of word order and argument omission: Tagalog (verb-initial), English (verb-medial), and Turkish (verb-final). To quantify incremental predictability at each position in the sentence, we use a computational model trained on naturalistic child-directed speech, which is first validated against experimental data in each language. Our results show that agents are highly predictable irrespective of sentence position or language, requiring barely any contextual information. In contrast, patient prediction requires additional information, varying by language. These findings suggest that the assignment of agent roles in child-directed speech is an easier task across typologically distinct languages, possibly reflecting the more general preference for agents outside language. Patients, by contrast, appear to be contextually induced roles that develop in ways that are largely shaped by the affordances of each language.

Read Full Article (External Site)