Measuring Children’s Early Vocabulary in Low‐Resource Languages Using a Swadesh‐Style Word List

The study offers a practical route to vocabulary assessment in languages that lack full inventories: it identifies a compact set of 100 concepts that children find similarly easy or hard to learn across many languages. Drawing on 32 datasets from existing parent-report inventories, the authors selected concepts whose learning difficulty varies little from language to language, so they can serve as reliable probes in low-resource settings. This approach mirrors the historical idea behind Swadesh lists but refocuses it on child language assessment, offering an evidence-based starting point for creating new screening tools where full inventories are not possible.

For anyone interested in human potential, this work matters because early vocabulary relates to later learning, health, and social outcomes. A smaller, more universal checklist could make assessments fairer and more inclusive, letting practitioners and communities gather meaningful data where none existed before. The full article explains how this compact list was built and how it might change who gets counted in the science of early development.

Abstract
Early language skill is predictive of many later life outcomes and is thus of great interest to developmental psychologists and clinicians. The MacArthur-Bates Communicative Development Inventories (CDIs), parent report instruments typically containing inventories of hundreds of children’s vocabulary words, have proven to be valid and reliable instruments for measuring children’s early language skill. The CDIs have been adapted to many dozens of languages, and cross-linguistic comparisons show both consistency and variability in language acquisition trajectories. However, thousands of languages do not yet have CDIs, nor the early language corpora needed to create them, posing a significant barrier to increasing the diversity of languages that are studied. Here, we propose a method for selecting candidate words to include on new CDIs through analyzing psychometric properties of the translation-equivalent concepts that are frequently included on existing CDIs. Leveraging 32 datasets from existing CDIs, we propose a list of 100 concepts that have low variability in their cross-linguistic learning difficulty. This pool of common concepts—analogous to the Swadesh lists, which are basic vocabulary lists used in glottochronology for cross-language comparison—can be used as a starting point for future CDI adaptations. We show that the proposed Swadesh-CDI list generalizes well to data from 10 additional languages.
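
The selection idea in the abstract can be illustrated with a minimal sketch. It assumes that a difficulty score has already been estimated for each concept in each language (the paper derives these from psychometric analysis of CDI data, which is not reproduced here) and it uses the population standard deviation across languages as the variability measure, which is a simplifying assumption rather than the authors' stated procedure. The function name, the `min_languages` threshold, and the toy numbers are all hypothetical.

```python
from statistics import pstdev


def select_stable_concepts(difficulty, k, min_languages=2):
    """Return the k concepts whose difficulty varies least across languages.

    difficulty: dict mapping concept -> {language: difficulty score}.
    Concepts attested in fewer than `min_languages` languages are skipped,
    because variability cannot be judged from a single language.
    """
    scored = []
    for concept, by_language in difficulty.items():
        values = list(by_language.values())
        if len(values) < min_languages:
            continue
        # Lower spread means more consistent difficulty across languages.
        scored.append((pstdev(values), concept))
    scored.sort()
    return [concept for _, concept in scored[:k]]


# Invented toy scores for illustration only; these are not results from the study.
toy_difficulty = {
    "dog": {"lang_a": -1.0, "lang_b": -0.9, "lang_c": -1.1},
    "ball": {"lang_a": -1.2, "lang_b": -0.2, "lang_c": -2.0},
    "water": {"lang_a": -0.5, "lang_b": -0.7, "lang_c": -0.3},
    "sled": {"lang_a": 1.5},
}

stable = select_stable_concepts(toy_difficulty, k=2)
print(stable)
```

In this toy setup the concept that appears in only one language is excluded, and the remaining concepts are ranked by how consistent their scores are. A real adaptation would presumably also weigh how often a concept appears on existing CDIs, as the abstract suggests.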
