LOT: A Story-Centric Benchmark for Evaluating Chinese Long Text Understanding and Generation

Guan, Jian; Feng, Zhuoer; Chen, Yamei; He, Ruilin; Mao, Xiaoxi; Fan, Changjie; Huang, Minlie

arXiv:2108.12960 (cs)

[Submitted on 30 Aug 2021 (v1), last revised 17 Jan 2022 (this version, v2)]

Abstract: Standard multi-task benchmarks are essential for developing pretraining models that can generalize to various downstream tasks. Existing benchmarks for natural language processing (NLP) usually focus only on understanding or generating short texts. However, long text modeling requires many distinct abilities in contrast to short texts, such as the modeling of long-range discourse and commonsense relations, and the coherence and controllability of generation. The lack of standardized benchmarks makes it difficult to assess these abilities of a model and fairly compare different models, especially Chinese models. Therefore, we propose a story-centric benchmark named LOT for evaluating Chinese long text modeling, which aggregates two understanding tasks and two generation tasks. We construct new datasets for these tasks based on human-written Chinese stories with hundreds of words. Furthermore, we release an encoder-decoder-based Chinese long text pretraining model named LongLM with up to 1 billion parameters. We pretrain LongLM on 120G Chinese novels with two generative tasks including text infilling and conditional continuation. Extensive experiments show that LongLM outperforms similar-sized pretraining models substantially on both the understanding and generation tasks in LOT.

Comments:	Accepted by TACL 2022. Benchmark datasets, pretraining models, appendix url: this https URL
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2108.12960 [cs.CL]
	(or arXiv:2108.12960v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2108.12960

Submission history

From: Jian Guan
[v1] Mon, 30 Aug 2021 02:38:32 UTC (249 KB)
[v2] Mon, 17 Jan 2022 08:52:46 UTC (5,514 KB)
