Conversational Implicatures Through the Lens of LLMs

May 13, 2026
Agnese Lombardi, Alessandro Lenci
Abstract
Recent research has explored the capacity of Large Language Models (LLMs; e.g., Hu et al. 2023) to perform pragmatic reasoning and to interpret complex pragmatic phenomena. However, such phenomena are inherently ambiguous, and even human evaluations of them are highly variable. Many existing studies directly compare human and model responses while assuming a single “correct” interpretation, thereby overlooking the natural variability that characterizes human pragmatic understanding. This raises two key issues: (1) the need for novel evaluation methods that account for interpretive variability and allow for meaningful comparison between humans and models, and (2) the potential limitations of current linguistic theories in capturing the richness of human pragmatic behavior. We propose that LLMs can serve not only as benchmarks for human-model alignment, but also as tools for investigating the nature of pragmatic phenomena and their relationship to linguistic theory. To this end, we developed a handcrafted dataset encompassing eight types of conversational implicatures and applied a new evaluation method designed to capture interpretive diversity. Our study addresses three main research questions: (1) Do LLMs process conversational implicatures differently from humans? (2) If so, how do these differences manifest? (3) What do these findings reveal about the cognitive capacities of LLMs and the explanatory adequacy of pragmatic theory?
Type
Publication
LREC 2026 - Main conference