FairytaleQA Translated: Enabling Educational Question and Answer Generation in Less-Resourced Languages

Leite, Bernardo; Osório, Tomás Freitas; Cardoso, Henrique Lopes

doi:10.1007/978-3-031-72315-5_16

arXiv:2406.04233 (cs)

[Submitted on 6 Jun 2024 (v1), last revised 24 Jun 2024 (this version, v2)]

Abstract: Question Answering (QA) datasets are crucial in assessing reading comprehension skills for both machines and humans. While numerous datasets have been developed in English for this purpose, a noticeable void exists in less-resourced languages. To alleviate this gap, our paper introduces machine-translated versions of FairytaleQA, a renowned QA dataset designed to assess and enhance narrative comprehension skills in young children. By employing fine-tuned, modest-scale models, we establish benchmarks for both Question Generation (QG) and QA tasks within the translated datasets. In addition, we present a case study proposing a model for generating question-answer pairs, with an evaluation incorporating quality metrics such as question well-formedness, answerability, relevance, and children suitability. Our evaluation prioritizes quantifying and describing error cases, along with providing directions for future work. This paper contributes to the advancement of QA and QG research in less-resourced languages, promoting accessibility and inclusivity in the development of these models for reading comprehension. The code and data are publicly available at this http URL.

Comments:	Preprint - Accepted for publication at EC-TEL 2024
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2406.04233 [cs.CL]
	(or arXiv:2406.04233v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2406.04233
Journal reference:	EC-TEL 2024, Lecture Notes in Computer Science, vol. 15159, pp. 222-236, Springer, 2024
Related DOI:	https://doi.org/10.1007/978-3-031-72315-5_16

Submission history

From: Bernardo Leite
[v1] Thu, 6 Jun 2024 16:31:47 UTC (498 KB)
[v2] Mon, 24 Jun 2024 15:39:17 UTC (499 KB)
