Showing 1–4 of 4 results for author: Ma, W A

Searching in archive cs.
  1. arXiv:2506.08260  [pdf, ps, other]

    cs.CL cs.AI

    Automatic Generation of Inference Making Questions for Reading Comprehension Assessments

    Authors: Wanjing Anya Ma, Michael Flor, Zuowei Wang

    Abstract: Inference making is an essential but complex skill in reading comprehension (RC). Some inferences require resolving references across sentences, and some rely on using prior knowledge to fill in the detail that is not explicitly written in the text. Diagnostic RC questions can help educators provide more effective and targeted reading instruction and interventions for school-age students. We intro…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: Accepted to the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025), co-located with ACL 2025

  2. arXiv:2407.15645  [pdf, other]

    cs.CL cs.AI

    Psychometric Alignment: Capturing Human Knowledge Distributions via Language Models

    Authors: Joy He-Yueya, Wanjing Anya Ma, Kanishk Gandhi, Benjamin W. Domingue, Emma Brunskill, Noah D. Goodman

    Abstract: Language models (LMs) are increasingly used to simulate human-like responses in scenarios where accurately mimicking a population's behavior can guide decision-making, such as in developing educational materials and designing public policies. The objective of these simulations is for LMs to capture the variations in human responses, rather than merely providing the expected correct answers. Prior…

    Submitted 22 July, 2024; originally announced July 2024.

    Comments: Code and data: https://github.com/joyheyueya/psychometric-alignment

  3. arXiv:2406.10215  [pdf, other]

    cs.CL cs.LG

    DevBench: A multimodal developmental benchmark for language learning

    Authors: Alvin Wei Ming Tan, Sunny Yu, Bria Long, Wanjing Anya Ma, Tonya Murray, Rebecca D. Silverman, Jason D. Yeatman, Michael C. Frank

    Abstract: How (dis)similar are the learning trajectories of vision-language models and children? Recent modeling work has attempted to understand the gap between models' and humans' data efficiency by constructing models trained on less data, especially multimodal naturalistic data. However, such models are often evaluated on adult-level benchmarks, with limited breadth in language abilities tested, and wit…

    Submitted 6 December, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: Accepted at NeurIPS 2024 (Oral)

  4. arXiv:2310.06837  [pdf, other]

    cs.CL cs.LG

    Generating and Evaluating Tests for K-12 Students with Language Model Simulations: A Case Study on Sentence Reading Efficiency

    Authors: Eric Zelikman, Wanjing Anya Ma, Jasmine E. Tran, Diyi Yang, Jason D. Yeatman, Nick Haber

    Abstract: Developing an educational test can be expensive and time-consuming, as each item must be written by experts and then evaluated by collecting hundreds of student responses. Moreover, many tests require multiple distinct sets of questions administered throughout the school year to closely monitor students' progress, known as parallel tests. In this study, we focus on tests of silent sentence reading…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 (Main)