Thursday, May 08, 2025

The vocabulary, semantics, and syntax of prosody

Matalon et al. (open access) offer a fascinating study illustrating the linguistic structuring of prosody, the communication of meaning through the tone and inflection of our speech:

Significance

In conversation, prosody complements words, forming a structured communication system distinct from, yet connected to, text. By analyzing large datasets of spontaneous conversations and clustering similar snippets of speech, we identify the fundamental building blocks of this system. Our findings reveal a prosodic vocabulary of a few hundred patterns (far fewer than the thousands of words in a core verbal vocabulary), which fulfill interactional and attitudinal functions. Just as syntax governs word combinations, we observe recurring prosodic structures where certain patterns follow others more frequently than chance. Such ubiquitous pairs were not detected in scripted speech. These results provide data-driven support for the analogy of prosody to a linguistic system with its own vocabulary, semantics, and a simple syntax.

Abstract

Prosody, the musical facet of speech, is pivotal in human communication, and its structure and meaning remain subjects of ongoing research. In this study, we introduce a data-driven model for English prosody, based on large-scale analysis of spontaneous conversations. As a first step, we identify approximately 200 discernible prosodic patterns—which we view as building blocks of the prosodic vocabulary—and outline their properties and range of meanings. Next, we reveal a Markovian logic, akin to a syntax, for concatenating these elementary building blocks into coherent utterances. We identify distinct compound functions associated with pairs of consecutive patterns and show that the Markovian syntax is more prevalent in spontaneous prosody, as compared to scripted speech. These findings offer invaluable insights into the underlying mechanisms of conversational prosody: They empirically inform and refine existing theoretical concepts. The methodology we present, combining unsupervised analysis of large datasets of spontaneous speech with manual sampling of the results, could guide future research aimed at refining our model and expanding it to other languages.
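To make the idea of "certain patterns follow others more frequently than chance" concrete, here is a minimal Python sketch. It is not the authors' method: the pattern labels (P1, P2, ...), the toy utterances, and the function name bigram_lift are all hypothetical, and the simple observed-to-expected ratio stands in for whatever statistics the paper actually uses. It only assumes that each utterance has already been reduced to a sequence of prosodic pattern labels.

```python
from collections import Counter


def bigram_lift(sequences):
    """Return {(a, b): observed / expected} for consecutive label pairs.

    observed = share of all consecutive pairs that are (a, b)
    expected = share that would arise if labels were drawn independently
               according to their overall frequencies
    A ratio well above 1 means b follows a more often than chance.
    """
    unigrams = Counter()
    bigrams = Counter()
    for seq in sequences:
        unigrams.update(seq)
        bigrams.update(zip(seq, seq[1:]))  # consecutive pairs only

    total_unigrams = sum(unigrams.values())
    total_bigrams = sum(bigrams.values())

    lift = {}
    for (a, b), count in bigrams.items():
        observed = count / total_bigrams
        expected = (unigrams[a] / total_unigrams) * (unigrams[b] / total_unigrams)
        lift[(a, b)] = observed / expected
    return lift


# Toy data: hypothetical pattern IDs, not taken from the paper.
utterances = [
    ["P1", "P2", "P3"],
    ["P1", "P2", "P4"],
    ["P3", "P1", "P2"],
    ["P4", "P3", "P1"],
]

lift = bigram_lift(utterances)
for pair, ratio in sorted(lift.items(), key=lambda item: -item[1]):
    print(pair, round(ratio, 2))
```

In this toy data the pair (P1, P2) gets the highest ratio, because P2 follows P1 in three of the four utterances. With so few sequences every ratio is inflated, so a real analysis would need far more data and a proper significance test.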

 
