Can Language Models Trained on Written Monologue Learn to Predict Spoken Dialogue?

Published on November 26, 2024

Abstract
Transformer-based Large Language Models (LLMs) have recently increased in popularity, in part due to their impressive performance on a number of language tasks. While LLMs can produce human-like writing, the extent to which these models can learn to predict spoken language in natural interaction remains unclear. This is a nontrivial question, as spoken and written language differ in syntax, pragmatics, and norms that interlocutors follow. Previous work suggests that while LLMs may develop an understanding of linguistic rules based on statistical regularities, they fail to acquire the knowledge required for language use. This implies that LLMs may not learn the normative structure underlying interactive spoken language, but may instead only model superficial regularities in speech. In this paper, we aim to evaluate LLMs as models of spoken dialogue. Specifically, we investigate whether LLMs can learn that the identity of a speaker in spoken dialogue influences what is likely to be said. To answer this question, we first fine-tuned two variants of a specific LLM (GPT-2) on transcripts of natural spoken dialogue in English. Then, we used these models to compute surprisal values for two-turn sequences with the same first-turn but different second-turn speakers and compared the output to human behavioral data. While the predictability of words in all fine-tuned models was influenced by speaker identity information, the models did not replicate humans' use of this information. Our findings suggest that although LLMs may learn to generate text conforming to normative linguistic structure, they do not (yet) faithfully replicate human behavior in natural conversation.
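The core measurement in the abstract, surprisal of a second turn given a first turn under a GPT-2 language model, can be illustrated concretely. The sketch below is not the authors' code: the checkpoint name ("gpt2" standing in for their dialogue-fine-tuned variants), the speaker labels, and the example turns are all hypothetical, and it shows only the standard way of computing per-token surprisal with the Hugging Face transformers library.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# "gpt2" is a stand-in for the paper's dialogue-fine-tuned checkpoints.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def turn_surprisal(context: str, turn: str) -> float:
    """Mean per-token surprisal (in bits) of `turn` given `context`."""
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    turn_ids = tokenizer(turn, return_tensors="pt").input_ids
    input_ids = torch.cat([ctx_ids, turn_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    log_probs = torch.log_softmax(logits, dim=-1)
    start = ctx_ids.size(1)
    # Position i predicts token i+1, so the turn tokens at positions
    # start..end are scored by the logits at positions start-1..end-1.
    token_lp = log_probs[0, start - 1 : -1].gather(
        1, input_ids[0, start:].unsqueeze(1)
    ).squeeze(1)
    # Surprisal = -log2 p(token | context); convert nats to bits.
    return (-token_lp / torch.log(torch.tensor(2.0))).mean().item()

# Same first turn, different (hypothetical) second-turn speakers.
first_turn = "A: Did you see the game last night?\n"
print(turn_surprisal(first_turn, "A: It went into overtime."))
print(turn_surprisal(first_turn, "B: It went into overtime."))
```

Comparing the two printed values shows how a speaker label alone can shift a model's estimate of a turn's predictability, which is the contrast the paper evaluates against human behavioral data.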
