Closing the Curious Case of Neural Text Degeneration

Finlayson, Matthew; Hewitt, John; Koller, Alexander; Swayamdipta, Swabha; Sabharwal, Ashish

Abstract:Despite their ubiquity in language generation, it remains unknown why truncation sampling heuristics like nucleus sampling are so effective. We provide a theoretical explanation for the effectiveness of the truncation sampling by proving that truncation methods that discard tokens below some probability threshold (the most common type of truncation) can guarantee that all sampled tokens have nonzero true probability. However, thresholds are a coarse heuristic, and necessarily discard some tokens with nonzero true probability as well. In pursuit of a more precise sampling strategy, we show that we can leverage a known source of model errors, the softmax bottleneck, to prove that certain tokens have nonzero true probability, without relying on a threshold. Based on our findings, we develop an experimental truncation strategy and the present pilot studies demonstrating the promise of this type of algorithm. Our evaluations show that our method outperforms its threshold-based counterparts under automatic and human evaluation metrics for low-entropy (i.e., close to greedy) open-ended text generation. Our theoretical findings and pilot experiments provide both insight into why truncation sampling works, and make progress toward more expressive sampling algorithms that better surface the generative capabilities of large language models.

Subjects:	Computation and Language (cs.CL)
MSC classes:	68T50
ACM classes:	I.2.7
Cite as:	arXiv:2310.01693 [cs.CL]
	(or arXiv:2310.01693v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2310.01693

Computer Science > Computation and Language

Title:Closing the Curious Case of Neural Text Degeneration

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators