Showing 1–50 of 93 results for author: Li, J J

Searching in archive cs.

  1. arXiv:2507.00439  [pdf, ps, other]

    cs.CL

    Beyond Sociodemographic Prompting: Using Supervision to Align LLMs with Human Response Distributions

    Authors: Gauri Kambhatla, Sanjana Gautam, Angela Zhang, Alex Liu, Ravi Srinivasan, Junyi Jessy Li, Matthew Lease

    Abstract: The ability to accurately predict how different population groups would answer subjective questions would have great value. In this work, we show that use of relatively simple supervision can greatly improve language model alignment with diverse population groups, as measured over three datasets spanning various topics. Beyond evaluating average performance, we also report how alignment varies acr…

    Submitted 1 July, 2025; originally announced July 2025.

  2. arXiv:2506.20876  [pdf, ps, other]

    cs.CL

    Decide less, communicate more: On the construct validity of end-to-end fact-checking in medicine

    Authors: Sebastian Joseph, Lily Chen, Barry Wei, Michael Mackert, Iain J. Marshall, Paul Pu Liang, Ramez Kouzy, Byron C. Wallace, Junyi Jessy Li

    Abstract: Technological progress has led to concrete advancements in tasks that were regarded as challenging, such as automatic fact-checking. Interest in adopting these systems for public health and medicine has grown due to the high-stakes nature of medical decisions and challenges in critically appraising a vast and diverse medical literature. Evidence-based medicine connects to every individual, and yet…

    Submitted 28 June, 2025; v1 submitted 25 June, 2025; originally announced June 2025.

    Comments: Flattened Figure 1 PDF for compatibility with Mac Preview

  3. arXiv:2506.04534  [pdf, other]

    cs.CL cs.AI

    Is It JUST Semantics? A Case Study of Discourse Particle Understanding in LLMs

    Authors: William Sheffield, Kanishka Misra, Valentina Pyatkin, Ashwini Deo, Kyle Mahowald, Junyi Jessy Li

    Abstract: Discourse particles are crucial elements that subtly shape the meaning of text. These words, often polyfunctional, give rise to nuanced and often quite disparate semantic/discourse effects, as exemplified by the diverse uses of the particle "just" (e.g., exclusive, temporal, emphatic). This work investigates the capacity of LLMs to distinguish the fine-grained senses of English "just", a well-stud…

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: To be published in Findings of The 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025). The main paper is 5 pages and contains 3 figures and 1 table. In total, the paper is 12 pages and contains 8 figures and 5 tables (References + Appendix)

  4. arXiv:2506.01195  [pdf, ps, other]

    cs.CL

    Strategic Discourse Assessment: The Crooked Path to Innocence

    Authors: Anshun Asher Zheng, Junyi Jessy Li, David I. Beaver

    Abstract: Language is often used strategically, particularly in high-stakes, adversarial settings, yet most work on pragmatics and LLMs centers on cooperativity. This leaves a gap in the systematic understanding of strategic communication in adversarial settings. To address this, we introduce SDA (Strategic Discourse Assessment), a framework grounded in Gricean and game-theoretic pragmatics to assess strate…

    Submitted 2 September, 2025; v1 submitted 1 June, 2025; originally announced June 2025.

    Comments: 49 pages. Substantially revised and expanded. Title changed

  5. A Tool for Generating Exceptional Behavior Tests With Large Language Models

    Authors: Linghan Zhong, Samuel Yuan, Jiyang Zhang, Yu Liu, Pengyu Nie, Junyi Jessy Li, Milos Gligoric

    Abstract: Exceptional behavior tests (EBTs) are crucial in software development for verifying that code correctly handles unwanted events and throws appropriate exceptions. However, prior research has shown that developers often prioritize testing "happy paths", e.g., paths without unwanted events over exceptional scenarios. We present exLong, a framework that automatically generates EBTs to address this ga…

    Submitted 28 May, 2025; originally announced May 2025.

    Comments: FSE 2025 Demo (Camera Ready)

  6. arXiv:2505.20538  [pdf, ps, other]

    cs.CL astro-ph.IM cs.LG

    AstroVisBench: A Code Benchmark for Scientific Computing and Visualization in Astronomy

    Authors: Sebastian Antony Joseph, Syed Murtaza Husain, Stella S. R. Offner, Stéphanie Juneau, Paul Torrey, Adam S. Bolton, Juan P. Farias, Niall Gaffney, Greg Durrett, Junyi Jessy Li

    Abstract: Large Language Models (LLMs) are being explored for applications in scientific research, including their capabilities to synthesize literature, answer research questions, generate research ideas, and even conduct computational experiments. Ultimately, our goal is for these to help scientists derive novel scientific insights. In many areas of science, such insights often arise from processing and v…

    Submitted 3 June, 2025; v1 submitted 26 May, 2025; originally announced May 2025.

  7. arXiv:2504.15219  [pdf, ps, other]

    cs.CL

    EvalAgent: Discovering Implicit Evaluation Criteria from the Web

    Authors: Manya Wadhwa, Zayne Sprague, Chaitanya Malaviya, Philippe Laban, Junyi Jessy Li, Greg Durrett

    Abstract: Evaluation of language model outputs on structured writing tasks is typically conducted with a number of desirable criteria presented to human evaluators or large language models (LLMs). For instance, on a prompt like "Help me draft an academic talk on coffee intake vs research productivity", a model response may be evaluated for criteria like accuracy and coherence. However, high-quality response…

    Submitted 17 August, 2025; v1 submitted 21 April, 2025; originally announced April 2025.

    Comments: Published at COLM 2025

  8. arXiv:2504.09373  [pdf, ps, other]

    cs.CL

    QUDsim: Quantifying Discourse Similarities in LLM-Generated Text

    Authors: Ramya Namuduri, Yating Wu, Anshun Asher Zheng, Manya Wadhwa, Greg Durrett, Junyi Jessy Li

    Abstract: As large language models become increasingly capable at various writing tasks, their weakness at generating unique and creative content becomes a major liability. Although LLMs have the ability to generate text covering diverse topics, there is an overall sense of repetitiveness across texts that we aim to formalize and quantify via a similarity metric. The familiarity between documents arises fro…

    Submitted 11 August, 2025; v1 submitted 12 April, 2025; originally announced April 2025.

    Comments: COLM 2025 Camera Ready

  9. arXiv:2504.01370  [pdf, other]

    cs.DC cs.CE cs.GT

    Accelerating Blockchain Scalability: New Models for Parallel Transaction Execution in the EVM

    Authors: Souradeep Das, Konpat Preechakul, Jonas Bäumer, Riddhi Patel, Jefferson Jinchuan Li

    Abstract: As the number of decentralized applications and users on Ethereum grows, the ability of the blockchain to efficiently handle a growing number of transactions becomes increasingly strained. Ethereum's current execution model relies heavily on sequential processing, meaning that operations are processed one after the other, which creates significant bottlenecks to future scalability demands. While sc…

    Submitted 2 April, 2025; originally announced April 2025.

  10. arXiv:2502.14613  [pdf, other]

    cs.CL

    Behavioral Analysis of Information Salience in Large Language Models

    Authors: Jan Trienes, Jörg Schlötterer, Junyi Jessy Li, Christin Seifert

    Abstract: Large Language Models (LLMs) excel at text summarization, a task that requires models to select content based on its importance. However, the exact notion of salience that LLMs have internalized remains unclear. To bridge this gap, we introduce an explainable framework to systematically derive and investigate information salience in LLMs through their summarization behavior. Using length-controlle…

    Submitted 27 May, 2025; v1 submitted 20 February, 2025; originally announced February 2025.

    Comments: Accepted at ACL 2025 (Findings)

  11. arXiv:2502.07963  [pdf, other]

    cs.CL cs.AI

    Caught in the Web of Words: Do LLMs Fall for Spin in Medical Literature?

    Authors: Hye Sun Yun, Karen Y. C. Zhang, Ramez Kouzy, Iain J. Marshall, Junyi Jessy Li, Byron C. Wallace

    Abstract: Medical research faces well-documented challenges in translating novel treatments into clinical practice. Publishing incentives encourage researchers to present "positive" findings, even when empirical results are equivocal. Consequently, it is well-documented that authors often spin study results, especially in article abstracts. Such spin can influence clinician interpretation of evidence and ma…

    Submitted 5 May, 2025; v1 submitted 11 February, 2025; originally announced February 2025.

    Comments: 22 pages, 12 figures, 4 tables, CHIL 2025

  12. arXiv:2502.03397  [pdf, ps, other]

    cs.CL cs.AI

    SPRI: Aligning Large Language Models with Context-Situated Principles

    Authors: Hongli Zhan, Muneeza Azmat, Raya Horesh, Junyi Jessy Li, Mikhail Yurochkin

    Abstract: Aligning Large Language Models to integrate and reflect human values, especially for tasks that demand intricate human oversight, is arduous since it is resource-intensive and time-consuming to depend on human expertise for context-specific guidance. Prior work has utilized predefined sets of rules or principles to steer the behavior of models (Bai et al., 2022; Sun et al., 2023). However, these p…

    Submitted 29 May, 2025; v1 submitted 5 February, 2025; originally announced February 2025.

    Comments: Forty-Second International Conference on Machine Learning (ICML 2025) Camera-Ready Version

  13. arXiv:2411.17967  [pdf, other]

    cs.CL

    QuaLLM-Health: An Adaptation of an LLM-Based Framework for Quantitative Data Extraction from Online Health Discussions

    Authors: Ramez Kouzy, Roxanna Attar-Olyaee, Michael K. Rooney, Comron J. Hassanzadeh, Junyi Jessy Li, Osama Mohamad

    Abstract: Health-related discussions on social media like Reddit offer valuable insights, but extracting quantitative data from unstructured text is challenging. In this work, we present an adapted framework from QuaLLM into QuaLLM-Health for extracting clinically relevant quantitative data from Reddit discussions about glucagon-like peptide-1 (GLP-1) receptor agonists using large language models (LLMs). We…

    Submitted 26 November, 2024; originally announced November 2024.

  14. arXiv:2409.01568  [pdf, other]

    cs.LG

    Quantifying Emergence in Neural Networks: Insights from Pruning and Training Dynamics

    Authors: Faisal AlShinaifi, Zeyad Almoaigel, Johnny Jingze Li, Abdulla Kuleib, Gabriel A. Silva

    Abstract: Emergence, where complex behaviors develop from the interactions of simpler components within a network, plays a crucial role in enhancing neural network capabilities. We introduce a quantitative framework to measure emergence during the training process and examine its impact on network performance, particularly in relation to pruning and training dynamics. Our hypothesis posits that the degree o…

    Submitted 2 September, 2024; originally announced September 2024.

  15. arXiv:2407.19044  [pdf, other]

    cs.LG cs.CV

    Advancing Neural Network Performance through Emergence-Promoting Initialization Scheme

    Authors: Johnny Jingze Li, Vivek Kurien George, Gabriel A. Silva

    Abstract: Emergence in machine learning refers to the spontaneous appearance of complex behaviors or capabilities that arise from the scale and structure of training data and model architectures, despite not being explicitly programmed. We introduce a novel yet straightforward neural network initialization scheme that aims at achieving greater potential for emergence. Measuring emergence as a kind of struct…

    Submitted 3 January, 2025; v1 submitted 26 July, 2024; originally announced July 2024.

  16. arXiv:2407.02397  [pdf, ps, other]

    cs.CL

    Learning to Refine with Fine-Grained Natural Language Feedback

    Authors: Manya Wadhwa, Xinyu Zhao, Junyi Jessy Li, Greg Durrett

    Abstract: Recent work has explored the capability of large language models (LLMs) to identify and correct errors in LLM-generated responses. These refinement approaches frequently evaluate what sizes of models are able to do refinement for what problems, but less attention is paid to what effective feedback for refinement looks like. In this work, we propose looking at refinement with feedback as a composit…

    Submitted 19 June, 2025; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: Code and models available at: https://github.com/ManyaWadhwa/DCR; Findings of EMNLP 2024

  17. arXiv:2407.00211  [pdf, other]

    cs.CL

    Detection and Measurement of Syntactic Templates in Generated Text

    Authors: Chantal Shaib, Yanai Elazar, Junyi Jessy Li, Byron C. Wallace

    Abstract: Recent work on evaluating the diversity of text generated by LLMs has focused on word-level features. Here we offer an analysis of syntactic features to characterize general repetition in models, beyond frequent n-grams. Specifically, we define syntactic templates and show that models tend to produce templated text in downstream tasks at a higher rate than what is found in human-reference texts. W…

    Submitted 6 October, 2024; v1 submitted 28 June, 2024; originally announced July 2024.

    Comments: EMNLP 2024

  18. arXiv:2406.17947  [pdf, other]

    cs.CL

    Do they mean 'us'? Interpreting Referring Expressions in Intergroup Bias

    Authors: Venkata S Govindarajan, Matianyu Zang, Kyle Mahowald, David Beaver, Junyi Jessy Li

    Abstract: The variations between in-group and out-group speech (intergroup bias) are subtle and could underlie many social phenomena like stereotype perpetuation and implicit bias. In this paper, we model the intergroup bias as a tagging task on English sports comments from forums dedicated to fandom for NFL teams. We curate a unique dataset of over 6 million game-time comments from opposing perspectives (t…

    Submitted 31 October, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

    Comments: Accepted to Findings@EMNLP 2024

  19. arXiv:2405.20179  [pdf, ps, other]

    cs.CL cs.AI cs.RO

    Robo-Instruct: Simulator-Augmented Instruction Alignment For Finetuning Code LLMs

    Authors: Zichao Hu, Junyi Jessy Li, Arjun Guha, Joydeep Biswas

    Abstract: Code LLMs have shown promising results with converting tasks in natural language to programs that can be executed by service robots. We are interested in finetuning small, specialized LLMs for this purpose, but collecting datasets of task-program pairs specific to each robot is time-consuming and expensive. While approaches such as SELF-INSTRUCT and EVOL-INSTRUCT are capable of generating novel ta…

    Submitted 12 August, 2025; v1 submitted 30 May, 2024; originally announced May 2024.

  20. arXiv:2405.14619  [pdf, other]

    cs.SE cs.AI

    exLong: Generating Exceptional Behavior Tests with Large Language Models

    Authors: Jiyang Zhang, Yu Liu, Pengyu Nie, Junyi Jessy Li, Milos Gligoric

    Abstract: Many popular programming languages, including C#, Java, and Python, support exceptions. Exceptions are thrown during program execution if an unwanted event happens, e.g., a method is invoked with an illegal argument value. Software developers write exceptional behavior tests (EBTs) to check that their code detects unwanted events and throws appropriate exceptions. Prior research studies have shown…

    Submitted 24 December, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: ICSE 2025 (camera ready)

  21. arXiv:2405.00723  [pdf, other]

    eess.SP cs.AI cs.LG

    EEG_RL-Net: Enhancing EEG MI Classification through Reinforcement Learning-Optimised Graph Neural Networks

    Authors: Htoo Wai Aung, Jiao Jiao Li, Yang An, Steven W. Su

    Abstract: Brain-Computer Interfaces (BCIs) rely on accurately decoding electroencephalography (EEG) motor imagery (MI) signals for effective device control. Graph Neural Networks (GNNs) outperform Convolutional Neural Networks (CNNs) in this regard, by leveraging the spatial relationships between EEG electrodes through adjacency matrices. The EEG_GLT-Net framework, featuring the state-of-the-art EEG_GLT adj…

    Submitted 26 April, 2024; originally announced May 2024.

  22. arXiv:2404.11075  [pdf, other]

    cs.LG cs.AI eess.SP

    EEG_GLT-Net: Optimising EEG Graphs for Real-time Motor Imagery Signals Classification

    Authors: Htoo Wai Aung, Jiao Jiao Li, Yang An, Steven W. Su

    Abstract: Brain-Computer Interfaces connect the brain to external control devices, necessitating the accurate translation of brain signals such as from electroencephalography (EEG) into executable commands. Graph Neural Networks (GCN) have been increasingly applied for classifying EEG Motor Imagery signals, primarily because they incorporate the spatial relationships among EEG channels, resulting in improv…

    Submitted 17 April, 2024; originally announced April 2024.

  23. arXiv:2404.10917  [pdf, other]

    cs.CL

    Which questions should I answer? Salience Prediction of Inquisitive Questions

    Authors: Yating Wu, Ritika Mangla, Alexandros G. Dimakis, Greg Durrett, Junyi Jessy Li

    Abstract: Inquisitive questions -- open-ended, curiosity-driven questions people ask as they read -- are an integral part of discourse processing (Kehler and Rohde, 2017; Onea, 2016) and comprehension (Prince, 2004). Recent work in NLP has taken advantage of question generation capabilities of LLMs to enhance a wide range of applications. But the space of inquisitive questions is vast: many questions can be…

    Submitted 3 October, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: Camera Ready for EMNLP 2024 Main Conference

  24. arXiv:2404.01288  [pdf, other]

    cs.CL

    Large Language Models are Capable of Offering Cognitive Reappraisal, if Guided

    Authors: Hongli Zhan, Allen Zheng, Yoon Kyung Lee, Jina Suh, Junyi Jessy Li, Desmond C. Ong

    Abstract: Large language models (LLMs) have offered new opportunities for emotional support, and recent work has shown that they can produce empathic responses to people in distress. However, long-term mental well-being requires emotional self-regulation, where a one-time empathic response falls short. This work takes a first step by engaging with cognitive reappraisals, a strategy from psychology practitio…

    Submitted 8 August, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: Accepted to COLM 2024

  25. arXiv:2403.18148  [pdf, other]

    cs.CL cs.AI

    Large Language Models Produce Responses Perceived to be Empathic

    Authors: Yoon Kyung Lee, Jina Suh, Hongli Zhan, Junyi Jessy Li, Desmond C. Ong

    Abstract: Large Language Models (LLMs) have demonstrated surprising performance on many tasks, including writing supportive messages that display empathy. Here, we had these models generate empathic messages in response to posts describing common life experiences, such as workplace situations, parenting, relationships, and other anxiety- and anger-eliciting situations. Across two studies (N=192, 202), we sh…

    Submitted 26 March, 2024; originally announced March 2024.

  26. arXiv:2402.11456  [pdf, other]

    cs.CL

    FactPICO: Factuality Evaluation for Plain Language Summarization of Medical Evidence

    Authors: Sebastian Antony Joseph, Lily Chen, Jan Trienes, Hannah Louisa Göke, Monika Coers, Wei Xu, Byron C Wallace, Junyi Jessy Li

    Abstract: Plain language summarization with LLMs can be useful for improving textual accessibility of technical content. But how factual are these summaries in a high-stakes domain like medicine? This paper presents FactPICO, a factuality benchmark for plain language summarization of medical texts describing randomized controlled trials (RCTs), which are the basis of evidence-based medicine and can directly…

    Submitted 4 June, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

    Comments: Preprint has been updated to match the final revision for ACL 2024

  27. arXiv:2401.16475  [pdf, other]

    cs.CL

    InfoLossQA: Characterizing and Recovering Information Loss in Text Simplification

    Authors: Jan Trienes, Sebastian Joseph, Jörg Schlötterer, Christin Seifert, Kyle Lo, Wei Xu, Byron C. Wallace, Junyi Jessy Li

    Abstract: Text simplification aims to make technical texts more accessible to laypeople but often results in deletion of information and vagueness. This work proposes InfoLossQA, a framework to characterize and recover simplification-induced information loss in form of question-and-answer (QA) pairs. Building on the theory of Question Under Discussion, the QA pairs are designed to help readers deepen their…

    Submitted 4 June, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: Accepted at ACL 2024 (main conference)

  28. arXiv:2311.09602  [pdf, other]

    cs.CL

    Language Models (Mostly) Do Not Consider Emotion Triggers When Predicting Emotion

    Authors: Smriti Singh, Cornelia Caragea, Junyi Jessy Li

    Abstract: Situations and events evoke emotions in humans, but to what extent do they inform the prediction of emotion detection models? This work investigates how well human-annotated emotion triggers correlate with features that models deemed salient in their prediction of emotions. First, we introduce a novel dataset EmoTrigger, consisting of 900 social media posts sourced from three different datasets; t…

    Submitted 25 March, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: NAACL 2024 Camera Ready

  29. arXiv:2311.08644  [pdf, other]

    cs.LG cs.AI cs.HC

    Wrapper Boxes: Faithful Attribution of Model Predictions to Training Data

    Authors: Yiheng Su, Junyi Jessy Li, Matthew Lease

    Abstract: Can we preserve the accuracy of neural models while also providing faithful explanations of model decisions to training data? We propose a "wrapper box" pipeline: training a neural model as usual and then using its learned feature representation in classic, interpretable models to perform prediction. Across seven language models of varying sizes, including four large language models (LLMs), two d…

    Submitted 4 October, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Journal ref: The seventh edition of BlackboxNLP Workshop at EMNLP 2024

  30. arXiv:2310.14520  [pdf, other]

    cs.CL

    QUDEVAL: The Evaluation of Questions Under Discussion Discourse Parsing

    Authors: Yating Wu, Ritika Mangla, Greg Durrett, Junyi Jessy Li

    Abstract: Questions Under Discussion (QUD) is a versatile linguistic framework in which discourse progresses as continuously asking questions and answering them. Automatic parsing of a discourse to produce a QUD structure thus entails a complex question generation task: given a document and an answer sentence, generate a question that satisfies linguistic constraints of QUD and can be grounded in an anchor…

    Submitted 1 November, 2023; v1 submitted 22 October, 2023; originally announced October 2023.

    Comments: Camera Ready for EMNLP Main Conference

  31. arXiv:2310.14389  [pdf, other]

    cs.CL

    Evaluating Subjective Cognitive Appraisals of Emotions from Large Language Models

    Authors: Hongli Zhan, Desmond C. Ong, Junyi Jessy Li

    Abstract: The emotions we experience involve complex processes; besides physiological aspects, research in psychology has studied cognitive appraisals where people assess their situations subjectively, according to their own values (Scherer, 2005). Thus, the same situation can often result in different emotional experiences. While the detection of emotion is a well-established task, there is very limited wo…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 (Findings) Camera-Ready Version

  32. arXiv:2308.08156  [pdf, other]

    cs.CL cs.LG

    Sarcasm Detection in a Disaster Context

    Authors: Tiberiu Sosea, Junyi Jessy Li, Cornelia Caragea

    Abstract: During natural disasters, people often use social media platforms such as Twitter to ask for help, to provide information about the disaster situation, or to express contempt about the unfolding event or public policies and guidelines. This contempt is in some cases expressed as sarcasm or irony. Understanding this form of speech in a disaster-centric context is essential to improving natural lang…

    Submitted 16 August, 2023; originally announced August 2023.

  33. arXiv:2307.14991  [pdf, other]

    cs.SE cs.AI

    Multilingual Code Co-Evolution Using Large Language Models

    Authors: Jiyang Zhang, Pengyu Nie, Junyi Jessy Li, Milos Gligoric

    Abstract: Many software projects implement APIs and algorithms in multiple programming languages. Maintaining such projects is tiresome, as developers have to ensure that any change (e.g., a bug fix or a new feature) is being propagated, timely and without errors, to implementations in other programming languages. In the world of ever-changing software, using rule-based translation tools (i.e., transpilers)…

    Submitted 11 September, 2023; v1 submitted 27 July, 2023; originally announced July 2023.

    Comments: FSE 2023 (camera ready)

  34. arXiv:2306.01444  [pdf, other]

    cs.CL

    Unsupervised Extractive Summarization of Emotion Triggers

    Authors: Tiberiu Sosea, Hongli Zhan, Junyi Jessy Li, Cornelia Caragea

    Abstract: Understanding what leads to emotions during large-scale crises is important as it can provide groundings for expressed emotions and subsequently improve the understanding of ongoing disasters. Recent approaches trained supervised models to both detect emotions and explain emotion triggers (events and appraisals) via abstractive summarization. However, obtaining timely and qualitative abstractive s…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: ACL 2023 Camera-Ready

  35. arXiv:2305.16409  [pdf, other]

    cs.CL cs.CY

    Counterfactual Probing for the Influence of Affect and Specificity on Intergroup Bias

    Authors: Venkata S Govindarajan, Kyle Mahowald, David I. Beaver, Junyi Jessy Li

    Abstract: While existing work on studying bias in NLP focuses on negative or pejorative language use, Govindarajan et al. (2023) offer a revised framing of bias in terms of intergroup social context, and its effects on language behavior. In this paper, we investigate if two pragmatic features (specificity and affect) systematically vary in different intergroup contexts -- thus connecting this new framing of…

    Submitted 2 June, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: To appear in Findings of ACL 2023

  36. arXiv:2305.14770  [pdf, ps, other]

    cs.CL

    Using Natural Language Explanations to Rescale Human Judgments

    Authors: Manya Wadhwa, Jifan Chen, Junyi Jessy Li, Greg Durrett

    Abstract: The rise of large language models (LLMs) has brought a critical need for high-quality human-labeled data, particularly for processes like human feedback and evaluation. A common practice is to label data via consensus annotation over human judgments. However, annotators' judgments for subjective tasks can differ in many ways: they may reflect different qualitative judgments about an example, and t…

    Submitted 19 June, 2025; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted to COLM 2024; code and data: https://github.com/ManyaWadhwa/explanation_based_rescaling

  37. arXiv:2305.12532  [pdf, other]

    cs.CL

    Multilingual Simplification of Medical Texts

    Authors: Sebastian Joseph, Kathryn Kazanas, Keziah Reina, Vishnesh J. Ramanathan, Wei Xu, Byron C. Wallace, Junyi Jessy Li

    Abstract: Automated text simplification aims to produce simple versions of complex texts. This task is especially useful in the medical domain, where the latest medical findings are typically communicated via complex and technical articles. This creates barriers for laypeople seeking access to up-to-date medical findings, consequently impeding progress on health literacy. Most existing work on medical text…

    Submitted 18 October, 2023; v1 submitted 21 May, 2023; originally announced May 2023.

    Comments: This version will be in EMNLP 2023 main

  38. arXiv:2305.10387  [pdf, other]

    cs.CL

    Elaborative Simplification as Implicit Questions Under Discussion

    Authors: Yating Wu, William Sheffield, Kyle Mahowald, Junyi Jessy Li

    Abstract: Automated text simplification, a technique useful for making text more accessible to people such as children and emergent bilinguals, is often thought of as a monolingual translation task from complex sentences to simplified sentences using encoder-decoder models. This view fails to account for elaborative simplification, where new information is added into the simplified text. This paper proposes…

    Submitted 24 October, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

    Comments: Equal contribution by Yating Wu and William Sheffield. This is the EMNLP 2023 Main camera-ready version

  39. arXiv:2305.06299  [pdf, other]

    cs.CL

    Summarizing, Simplifying, and Synthesizing Medical Evidence Using GPT-3 (with Varying Success)

    Authors: Chantal Shaib, Millicent L. Li, Sebastian Joseph, Iain J. Marshall, Junyi Jessy Li, Byron C. Wallace

    Abstract: Large language models, particularly GPT-3, are able to produce high quality summaries of general domain news articles in few- and zero-shot settings. However, it is unclear if such models are similarly capable in more specialized, high-stakes domains such as biomedicine. In this paper, we enlist domain experts (individuals with medical training) to evaluate summaries of biomedical articles generat…

    Submitted 11 May, 2023; v1 submitted 10 May, 2023; originally announced May 2023.

    Comments: Accepted short paper to ACL 2023

  40. arXiv:2302.10166  [pdf, other]

    cs.SE cs.CL cs.LG

    Learning Deep Semantics for Test Completion

    Authors: Pengyu Nie, Rahul Banerjee, Junyi Jessy Li, Raymond J. Mooney, Milos Gligoric

    Abstract: Writing tests is a time-consuming yet essential task during software development. We propose to leverage recent advances in deep learning for text and code generation to assist developers in writing tests. We formalize the novel task of test completion to automatically complete the next statement in a test method based on the context of prior statements and the code under test. We develop TeCo --…

    Submitted 7 March, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: Accepted as a conference paper in ICSE 2023

  41. arXiv:2211.06335  [pdf]

    cs.SE cs.CL

    Using Developer Discussions to Guide Fixing Bugs in Software

    Authors: Sheena Panthaplackel, Milos Gligoric, Junyi Jessy Li, Raymond J. Mooney

    Abstract: Automatically fixing software bugs is a challenging task. While recent work showed that natural language context is useful in guiding bug-fixing models, the approach required prompting developers to provide this context, which was simulated through commit messages written after the bug-fixing code changes were made. We instead propose using bug report discussions, which are available before the ta…

    Submitted 11 November, 2022; originally announced November 2022.

    Comments: Accepted in the Findings of EMNLP 2022

  42. arXiv:2210.12531  [pdf, other]

    cs.CL cs.SI

    Why Do You Feel This Way? Summarizing Triggers of Emotions in Social Media Posts

    Authors: Hongli Zhan, Tiberiu Sosea, Cornelia Caragea, Junyi Jessy Li

    Abstract: Crises such as the COVID-19 pandemic continuously threaten our world and emotionally affect billions of people worldwide in distinct ways. Understanding the triggers leading to people's emotions is of crucial importance. Social media posts can be a good source of such analysis, yet these texts tend to be charged with multiple emotions, with triggers scattering across multiple sentences. This paper…

    Submitted 22 October, 2022; originally announced October 2022.

    Comments: EMNLP 2022 Camera Ready Version

    Journal ref: https://aclanthology.org/2022.emnlp-main.642/

  43. arXiv:2210.05905  [pdf, other]

    cs.CL

    Discourse Analysis via Questions and Answers: Parsing Dependency Structures of Questions Under Discussion

    Authors: Wei-Jen Ko, Yating Wu, Cutter Dalton, Dananjay Srinivas, Greg Durrett, Junyi Jessy Li

    Abstract: Automatic discourse processing is bottlenecked by data: current discourse formalisms pose highly demanding annotation tasks involving large taxonomies of discourse relations, making them inaccessible to lay annotators. This work instead adopts the linguistic framework of Questions Under Discussion (QUD) for discourse analysis and seeks to derive QUD structures automatically. QUD views each sentenc…

    Submitted 12 May, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

    Comments: Findings of ACL 2023

  44. arXiv:2210.02197  [pdf, other]

    cs.LG stat.AP

    Hierarchical Neyman-Pearson Classification for Prioritizing Severe Disease Categories in COVID-19 Patient Data

    Authors: Lijia Wang, Y. X. Rachel Wang, Jingyi Jessica Li, Xin Tong

    Abstract: COVID-19 has a spectrum of disease severity, ranging from asymptomatic to requiring hospitalization. Understanding the mechanisms driving disease severity is crucial for developing effective treatments and reducing mortality rates. One way to gain such understanding is using a multi-class classification framework, in which patients' biological features are used to predict patients' severity classe…

    Submitted 29 September, 2023; v1 submitted 1 October, 2022; originally announced October 2022.

  45. arXiv:2209.12356  [pdf, other]

    cs.CL

    News Summarization and Evaluation in the Era of GPT-3

    Authors: Tanya Goyal, Junyi Jessy Li, Greg Durrett

    Abstract: The recent success of prompting large language models like GPT-3 has led to a paradigm shift in NLP research. In this paper, we study its impact on text summarization, focusing on the classic benchmark domain of news summarization. First, we investigate how GPT-3 compares against fine-tuned models trained on large summarization datasets. We show that not only do humans overwhelmingly prefer GPT-3…

    Submitted 23 May, 2023; v1 submitted 25 September, 2022; originally announced September 2022.

    Comments: All data shared at: https://tagoyal.github.io/zeroshot-news-annotations.html

  46. arXiv:2209.06687  [pdf, other]

    cs.CL

    How people talk about each other: Modeling Generalized Intergroup Bias and Emotion

    Authors: Venkata S Govindarajan, Katherine Atwell, Barea Sinno, Malihe Alikhani, David I. Beaver, Junyi Jessy Li

    Abstract: Current studies of bias in NLP rely mainly on identifying (unwanted or negative) bias towards a specific demographic group. While this has led to progress recognizing and mitigating negative bias, and having a clear notion of the targeted group is necessary, it is not always practical. In this work we extrapolate to a broader notion of bias, rooted in social science and psychology literature. We m…

    Submitted 13 February, 2023; v1 submitted 14 September, 2022; originally announced September 2022.

    Comments: To be presented at EACL 2023

  47. arXiv:2209.04529  [pdf, other]

    cs.CL

    Text Simplification of College Admissions Instructions: A Professionally Simplified and Verified Corpus

    Authors: Zachary W. Taylor, Maximus H. Chu, Junyi Jessy Li

    Abstract: Access to higher education is critical for minority populations and emergent bilingual students. However, the language used by higher education institutions to communicate with prospective students is often too complex; concretely, many institutions in the US publish admissions application instructions far above the average reading level of a typical high school graduate, often near the 13th or 14…

    Submitted 9 September, 2022; originally announced September 2022.

    Comments: International Conference on Computational Linguistics (COLING) 2022

  48. arXiv:2208.05446  [pdf, ps, other]

    cs.SE cs.LG

    CoditT5: Pretraining for Source Code and Natural Language Editing

    Authors: Jiyang Zhang, Sheena Panthaplackel, Pengyu Nie, Junyi Jessy Li, Milos Gligoric

    Abstract: Pretrained language models have been shown to be effective in many software-related generation tasks; however, they are not well-suited for editing tasks as they are not designed to reason about edits. To address this, we propose a novel pretraining objective which explicitly models edits and use it to build CoditT5, a large language model for software-related editing tasks that is pretrained on l…

    Submitted 14 September, 2022; v1 submitted 10 August, 2022; originally announced August 2022.

    Comments: ASE 2022 (camera ready)

  49. arXiv:2206.14729  [pdf, other]

    cs.CL cs.AI cs.HC

    longhorns at DADC 2022: How many linguists does it take to fool a Question Answering model? A systematic approach to adversarial attacks

    Authors: Venelin Kovatchev, Trina Chatterjee, Venkata S Govindarajan, Jifan Chen, Eunsol Choi, Gabriella Chronis, Anubrata Das, Katrin Erk, Matthew Lease, Junyi Jessy Li, Yating Wu, Kyle Mahowald

    Abstract: Developing methods to adversarially challenge NLP systems is a promising avenue for improving both model performance and interpretability. Here, we describe the approach of the team "longhorns" on Task 1 of the First Workshop on Dynamic Adversarial Data Collection (DADC), which asked teams to manually fool a model on an Extractive Question Answering task. Our team finished first, with a model…

    Submitted 29 June, 2022; originally announced June 2022.

    Comments: Accepted at DADC2022

  50. arXiv:2205.09641  [pdf, other]

    cs.CL

    SNaC: Coherence Error Detection for Narrative Summarization

    Authors: Tanya Goyal, Junyi Jessy Li, Greg Durrett

    Abstract: Progress in summarizing long texts is inhibited by the lack of appropriate evaluation frameworks. When a long summary must be produced to appropriately cover the facets of that text, that summary needs to present a coherent narrative to be understandable by a reader, but current automatic and human evaluation methods fail to identify gaps in coherence. In this work, we introduce SNaC, a narrative…

    Submitted 28 October, 2022; v1 submitted 19 May, 2022; originally announced May 2022.

    Comments: EMNLP 2022