A Computational Framework to Study Hierarchical Processing in Visual Narratives

Published on May 2, 2025

Abstract
Theories of visual narrative comprehension have advocated for a hierarchical grammar-based comprehension mechanism, but only limited work has investigated this hierarchy. Here, we provide a computational framework inspired by computational psycholinguistics to address hierarchy in visual narratives. The predictions generated by this framework were compared against behavioral data to draw inferences about the hierarchical properties of visual narratives. A segmentation task, in which participants ranked all possible segmental boundaries, demonstrated that participants’ preferences were predicted by visual narrative grammar. Three kinds of models using surprisal theory (an Earley parser, a hidden Markov model (HMM), and an n-gram model) were then used to generate segmentation preferences for the same task. The Earley parser’s preferences were based on a hierarchical grammar with recursive properties, while the HMM and the n-gram model used a flattened grammar for visual narrative comprehension. Given the differences in the mechanics of these models, contrasting their predictions against behavioral data could provide crucial insights into the underlying mechanisms of visual narrative comprehension. By investigating grammatical systems outside of language, this research provides new directions to explore the generic makeup of the cognitive structure of mental representations.
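To make the surprisal idea concrete, the following is a minimal sketch of how an n-gram model might turn surprisal into segmentation preferences. It is not the authors' implementation. The bigram order, the add-one smoothing, the toy corpus, the panel labels "A", "B", "C", and the rule that higher surprisal marks a more preferred boundary are all assumptions made for illustration. Surprisal is computed in its standard form, the negative log probability of a panel category given its predecessor.

```python
import math
from collections import Counter


def train_bigram(sequences):
    """Count bigrams over panel-category sequences (hypothetical labels)."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for seq in sequences:
        padded = ["<s>"] + list(seq)  # "<s>" marks the sequence start
        vocab.update(seq)
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab


def surprisal(prev, cur, model):
    """Surprisal in bits: -log2 P(cur | prev), with add-one smoothing (assumed)."""
    context_counts, bigram_counts, vocab = model
    p = (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + len(vocab))
    return -math.log2(p)


def rank_boundaries(seq, model):
    """Rank boundaries by surprisal of the panel that follows each one.

    Boundary i sits between panel i-1 and panel i (0-indexed panels).
    Treating high surprisal as a preferred boundary is an assumption.
    """
    scores = {i: surprisal(seq[i - 1], seq[i], model) for i in range(1, len(seq))}
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


# Toy corpus invented for illustration; not data from the study.
toy_corpus = [
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["A", "B", "C", "A", "B", "C"],
]
model = train_bigram(toy_corpus)
ranking, scores = rank_boundaries(["A", "B", "C", "A", "B", "C"], model)
print(ranking[0], scores)
```

In this toy setup the transition from "C" back to "A" is the least frequent one, so the boundary before the fourth panel receives the highest surprisal. A flat model like this one sees only adjacent categories, whereas a parser over a hierarchical grammar would assign probabilities using the constituent structure built so far.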
