The Emotional Content of Children’s Writing: A Data‐Driven Approach

Published on March 18, 2024

Abstract
Emotion is closely associated with language, but we know very little about how children express emotion in their own writing. We used a large-scale, cross-sectional, data-driven approach to investigate how children of different ages express emotion in writing, and whether that expression differs between boys and girls. We first used a lexicon-based bag-of-words approach to identify emotional content in a large corpus of stories (N > 100,000) written by 7- to 13-year-old children. Generalized Additive Models were then used to model changes in sentiment across age and gender. Two other machine learning approaches (BERT and TextBlob) validated and extended these analyses, converging on the finding that positive sentiment in children’s writing decreases with age. These findings echo reports from previous studies showing a decrease in mood and an increased use of negative emotion words with age. We also found that stories by girls contained more positive sentiment than stories by boys. Our study shows the utility of large-scale data-driven approaches for revealing the content and nature of children’s writing. Future experimental work should build on these observations to understand the likely complex relationships between written language and emotion, and how these change over development.
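
As a rough illustration of what a lexicon-based bag-of-words approach involves, the sketch below scores a story by averaging the valence of its words, ignoring word order. It is not the authors' pipeline: the lexicon `TOY_LEXICON`, its valence values, and the helper names `tokenize` and `sentiment_score` are all hypothetical, and a real study would rely on a published, validated emotion lexicon.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (an assumption for illustration only):
# each word maps to a valence, positive above zero, negative below.
TOY_LEXICON = {
    "happy": 1.0,
    "love": 1.0,
    "fun": 0.5,
    "sad": -1.0,
    "scared": -1.0,
    "angry": -1.0,
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text, lexicon=TOY_LEXICON):
    """Return the mean valence per token, treating the text as a bag of words.

    Words missing from the lexicon contribute zero valence but still
    count toward the length of the text.
    """
    counts = Counter(tokenize(text))  # bag of words: order is discarded
    total = sum(counts.values())
    if total == 0:
        return 0.0
    valence = sum(lexicon.get(word, 0.0) * n for word, n in counts.items())
    return valence / total


if __name__ == "__main__":
    for story in ["I love my happy dog", "The sad monster was scared"]:
        print(story, "->", sentiment_score(story))
```

Dividing by the total token count keeps scores comparable across stories of different lengths, which matters when older children tend to write longer texts. Per-story scores of this kind could then serve as the outcome variable in a model of sentiment across age and gender.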
