Human Speech Perception in Noise: Can Large Language Models Paraphrase to Improve It?

Chingacham, Anupama; Zhang, Miaoran; Demberg, Vera; Klakow, Dietrich


arXiv:2408.04029 (cs)

[Submitted on 7 Aug 2024]



Abstract: Large Language Models (LLMs) can generate text by transferring style attributes such as formality, resulting in formal or informal text. However, instructing LLMs to generate text that, when spoken, is more intelligible in an acoustically difficult environment is an under-explored topic. We conduct the first study to evaluate LLMs on a novel task of generating acoustically intelligible paraphrases for better human speech perception in noise. Our experiments in English demonstrated that with standard prompting, LLMs struggle to control the non-textual attribute, i.e., acoustic intelligibility, while efficiently capturing the desired textual attributes like semantic equivalence. To remedy this issue, we propose a simple prompting approach, prompt-and-select, which generates paraphrases by decoupling the desired textual and non-textual attributes in the text generation pipeline. Our approach resulted in a 40% relative improvement in human speech perception, by paraphrasing utterances that are highly distorted in a listening condition with babble noise at a signal-to-noise ratio (SNR) of -5 dB. This study reveals the limitation of LLMs in capturing non-textual attributes, and our proposed method showcases the potential of using LLMs for better human speech perception in noise.
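
As an illustration of the listening condition mentioned in the abstract, the following is a minimal sketch of mixing a signal with noise at a target SNR, using the standard definition SNR_dB = 10 * log10(P_speech / P_noise). It is not the authors' code: the function names `power`, `mix_at_snr`, and `measure_snr_db` are hypothetical, and the toy sine tone and Gaussian noise only stand in for real speech and babble recordings.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that the speech-to-noise power ratio equals `snr_db`,
    then add it to `speech`. Returns (mixture, scaled_noise)."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    # SNR_dB = 10 * log10(P_speech / P_noise)
    # => target P_noise = P_speech / 10^(SNR_dB / 10)
    target_noise_power = power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise


def measure_snr_db(speech, noise):
    """SNR in dB between a clean signal and the noise added to it."""
    return 10 * math.log10(power(speech) / power(noise))


if __name__ == "__main__":
    random.seed(0)
    sample_rate = 16000  # assumed sampling rate for the toy signals
    # Toy stand-ins: a 220 Hz tone for "speech", Gaussian noise for "babble".
    speech = [math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(sample_rate)]
    noise = [random.gauss(0.0, 1.0) for _ in range(sample_rate)]

    mixture, scaled_noise = mix_at_snr(speech, noise, snr_db=-5.0)
    print(f"achieved SNR: {measure_snr_db(speech, scaled_noise):.2f} dB")
```

At -5 dB the noise carries roughly three times the power of the speech (10^(5/10) ≈ 3.16), which is why this condition is acoustically difficult.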

Comments:	Accepted at HuCLLM @ ACL 2024
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2408.04029 [cs.CL]
	(or arXiv:2408.04029v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2408.04029

Submission history

From: Anupama Chingacham
[v1] Wed, 7 Aug 2024 18:24:23 UTC (8,256 KB)
