
Showing 1–35 of 35 results for author: Flek, L

Searching in archive cs.
  1. arXiv:2503.13621

    cs.AI

    Superalignment with Dynamic Human Values

    Authors: Florian Mai, David Kaczér, Nicholas Kluge Corrêa, Lucie Flek

    Abstract: Two core challenges of alignment are 1) scalable oversight and 2) accounting for the dynamic nature of human values. While solutions like recursive reward modeling address 1), they do not simultaneously account for 2). We sketch a roadmap for a novel algorithmic framework that trains a superhuman reasoning model to decompose complex tasks into subtasks that are still amenable to human-level guidan…

    Submitted 17 March, 2025; originally announced March 2025.

    Comments: Published at the ICLR 2025 Workshop on Bidirectional Human-AI Alignment (BiAlign)

  2. arXiv:2501.14981

    cs.CL

    The Muddy Waters of Modeling Empathy in Language: The Practical Impacts of Theoretical Constructs

    Authors: Allison Lahnala, Charles Welch, David Jurgens, Lucie Flek

    Abstract: Conceptual operationalizations of empathy in NLP are varied, with some having specific behaviors and properties, while others are more abstract. How these variations relate to one another and capture properties of empathy observable in text remains unclear. To provide insight into this, we analyze the transfer performance of empathy models adapted to empathy tasks with different theoretical ground…

    Submitted 24 January, 2025; originally announced January 2025.

  3. arXiv:2501.14719

    cs.CL cs.AI cs.HC cs.IR

    Do LLMs Provide Consistent Answers to Health-Related Questions across Languages?

    Authors: Ipek Baris Schlicht, Zhixue Zhao, Burcu Sayin, Lucie Flek, Paolo Rosso

    Abstract: Equitable access to reliable health information is vital for public health, but the quality of online health resources varies by language, raising concerns about inconsistencies in Large Language Models (LLMs) for healthcare. In this study, we examine the consistency of responses provided by LLMs to health-related questions across English, German, Turkish, and Chinese. We largely expand the Health…

    Submitted 24 January, 2025; originally announced January 2025.

    Comments: 9 pages. Short paper appeared at 47th European Conference on Information Retrieval (ECIR 2025)

  4. arXiv:2501.14617

    cs.CL

    Funzac at CoMeDi Shared Task: Modeling Annotator Disagreement from Word-In-Context Perspectives

    Authors: Olufunke O. Sarumi, Charles Welch, Lucie Flek, Jörg Schlötterer

    Abstract: In this work, we evaluate annotator disagreement in Word-in-Context (WiC) tasks exploring the relationship between contextual meaning and disagreement as part of the CoMeDi shared task competition. While prior studies have modeled disagreement by analyzing annotator attributes with single-sentence inputs, this shared task incorporates WiC to bridge the gap between sentence-level semantic represent…

    Submitted 24 January, 2025; originally announced January 2025.

    Comments: Accepted to CoMeDi Shared Task at COLING 2025

  5. arXiv:2501.08322

    cs.CL

    Exploring Robustness of Multilingual LLMs on Real-World Noisy Data

    Authors: Amirhossein Aliakbarzadeh, Lucie Flek, Akbar Karimi

    Abstract: Large Language Models (LLMs) are trained on Web data that might contain spelling errors made by humans. But do they become robust to similar real-world noise? In this paper, we investigate the effect of real-world spelling mistakes on the performance of 9 language models, with parameters ranging from 0.2B to 13B, in 3 different NLP tasks, namely Natural Language Inference (NLI), Name Entity Recogn…

    Submitted 14 January, 2025; originally announced January 2025.

  6. arXiv:2501.08276

    cs.CL

    Exploring Robustness of LLMs to Sociodemographically-Conditioned Paraphrasing

    Authors: Pulkit Arora, Akbar Karimi, Lucie Flek

    Abstract: Large Language Models (LLMs) have shown impressive performance in various NLP tasks. However, there are concerns about their reliability in different domains of linguistic variations. Many works have proposed robustness evaluation measures for local adversarial attacks, but we need globally robust models unbiased to different language styles. We take a broader approach to explore a wider range of…

    Submitted 14 January, 2025; originally announced January 2025.

  7. arXiv:2501.08203

    cs.CL

    ArithmAttack: Evaluating Robustness of LLMs to Noisy Context in Math Problem Solving

    Authors: Zain Ul Abedin, Shahzeb Qamar, Lucie Flek, Akbar Karimi

    Abstract: While Large Language Models (LLMs) have shown impressive capabilities in math problem-solving tasks, their robustness to noisy inputs is not well-studied. In this work, we propose ArithmAttack to examine how robust the LLMs are when they encounter noisy prompts that contain extra noise in the form of punctuation marks. While being easy to implement, ArithmAttack does not cause any information loss…

    Submitted 14 January, 2025; originally announced January 2025.

  8. arXiv:2501.05588

    cs.LG hep-ex

    Enforcing Fundamental Relations via Adversarial Attacks on Input Parameter Correlations

    Authors: Timo Saala, Lucie Flek, Alexander Jung, Akbar Karimi, Alexander Schmidt, Matthias Schott, Philipp Soldin, Christopher Wiebusch

    Abstract: Correlations between input parameters play a crucial role in many scientific classification tasks, since these are often related to fundamental laws of nature. For example, in high energy physics, one of the common deep learning use-cases is the classification of signal and background processes in particle collisions. In many such cases, the fundamental principles of the correlations between obser…

    Submitted 9 January, 2025; originally announced January 2025.

    Comments: 12 pages, 8 figures (Without appendix)

  9. arXiv:2501.04820

    cs.SI cs.CL cs.CY

    Unifying the Extremes: Developing a Unified Model for Detecting and Predicting Extremist Traits and Radicalization

    Authors: Allison Lahnala, Vasudha Varadarajan, Lucie Flek, H. Andrew Schwartz, Ryan L. Boyd

    Abstract: The proliferation of ideological movements into extremist factions via social media has become a global concern. While radicalization has been studied extensively within the context of specific ideologies, our ability to accurately characterize extremism in more generalizable terms remains underdeveloped. In this paper, we propose a novel method for extracting and analyzing extremist discourse acr…

    Submitted 8 January, 2025; originally announced January 2025.

    Comments: 17 pages, 7 figures, 4 tables

  10. arXiv:2411.13800

    cs.CL

    Explaining GPT-4's Schema of Depression Using Machine Behavior Analysis

    Authors: Adithya V Ganesan, Vasudha Varadarajan, Yash Kumar Lal, Veerle C. Eijsbroek, Katarina Kjell, Oscar N. E. Kjell, Tanuja Dhanasekaran, Elizabeth C. Stade, Johannes C. Eichstaedt, Ryan L. Boyd, H. Andrew Schwartz, Lucie Flek

    Abstract: Use of large language models such as ChatGPT (GPT-4) for mental health support has grown rapidly, emerging as a promising route to assess and help people with mood disorders, like depression. However, we have a limited understanding of GPT-4's schema of mental disorders, that is, how it internally associates and interprets symptoms. In this work, we leveraged contemporary measurement theory to dec…

    Submitted 20 November, 2024; originally announced November 2024.

    Comments: 21 pages, 3 tables, 6 figures, 1 supplementary table, 83 references

  11. arXiv:2410.19221

    cs.CL cs.AI cs.LG

    Can Stories Help LLMs Reason? Curating Information Space Through Narrative

    Authors: Vahid Sadiri Javadi, Johanne R. Trippas, Yash Kumar Lal, Lucie Flek

    Abstract: Narratives are widely recognized as a powerful tool for structuring information and facilitating comprehension of complex ideas in various domains such as science communication. This paper investigates whether incorporating narrative elements can assist Large Language Models (LLMs) in solving complex problems more effectively. We propose a novel approach, Story of Thought (SoT), integrating narrat…

    Submitted 24 October, 2024; originally announced October 2024.

  12. arXiv:2410.06271

    cs.CL cs.AI

    Probing the Robustness of Theory of Mind in Large Language Models

    Authors: Christian Nickel, Laura Schrewe, Lucie Flek

    Abstract: With the success of ChatGPT and other similarly sized SotA LLMs, claims of emergent human like social reasoning capabilities, especially Theory of Mind (ToM), in these models have appeared in the scientific literature. On the one hand those ToM-capabilities have been successfully tested using tasks styled similar to those used in psychology (Kosinski, 2023). On the other hand, follow up studies sh…

    Submitted 8 October, 2024; originally announced October 2024.

  13. arXiv:2408.14398

    cs.CL cs.AI cs.LG

    Investigating Language-Specific Calibration For Pruning Multilingual Large Language Models

    Authors: Simon Kurz, Jian-Jia Chen, Lucie Flek, Zhixue Zhao

    Abstract: Recent advances in large language model (LLM) pruning have shown state-of-the-art (SotA) compression results in post-training and retraining-free settings while maintaining high predictive performance. However, previous research mainly considered calibrating based on English text, despite the multilingual nature of modern LLMs and their frequent use in non-English languages. In this paper, we set…

    Submitted 29 October, 2024; v1 submitted 26 August, 2024; originally announced August 2024.

  14. arXiv:2407.05740

    cs.CL

    Do Multilingual Large Language Models Mitigate Stereotype Bias?

    Authors: Shangrui Nie, Michael Fromm, Charles Welch, Rebekka Görge, Akbar Karimi, Joan Plepi, Nazia Afsan Mowmita, Nicolas Flores-Herr, Mehdi Ali, Lucie Flek

    Abstract: While preliminary findings indicate that multilingual LLMs exhibit reduced bias compared to monolingual ones, a comprehensive understanding of the effect of multilingual training on bias mitigation, is lacking. This study addresses this gap by systematically training six LLMs of identical size (2.6B parameters) and architecture: five monolingual models (English, German, French, Italian, and Spanis…

    Submitted 9 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: 19 pages, 8 figures, C3NLP 2024

  15. arXiv:2406.19071

    cs.CL cs.AI

    EmPO: Emotion Grounding for Empathetic Response Generation through Preference Optimization

    Authors: Ondrej Sotolar, Vojtech Formanek, Alok Debnath, Allison Lahnala, Charles Welch, Lucie Flek

    Abstract: Empathetic response generation is a desirable aspect of conversational agents, crucial for facilitating engaging and emotionally intelligent multi-turn conversations between humans and machines. Leveraging large language models for this task has shown promising results, yet challenges persist in ensuring both the empathetic quality of the responses and retention of the generalization performance o…

    Submitted 17 September, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: v02, 8 pages long paper, EMNLP ACL style

    MSC Class: 68T50 ACM Class: I.2.7

  16. arXiv:2406.16833

    cs.CL cs.AI cs.LG

    USDC: A Dataset of User Stance and Dogmatism in Long Conversations

    Authors: Mounika Marreddy, Subba Reddy Oota, Venkata Charan Chinni, Manish Gupta, Lucie Flek

    Abstract: Identifying user's opinions and stances in long conversation threads on various topics can be extremely critical for enhanced personalization, market research, political campaigns, customer service, conflict resolution, targeted advertising, and content moderation. Hence, training language models to automate this task is critical. However, to train such models, gathering manual annotations has mul…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 32 pages, 18 figures

  17. arXiv:2404.06488

    cs.CL cs.AI

    Pitfalls of Conversational LLMs on News Debiasing

    Authors: Ipek Baris Schlicht, Defne Altiok, Maryanne Taouk, Lucie Flek

    Abstract: This paper addresses debiasing in news editing and evaluates the effectiveness of conversational Large Language Models in this task. We designed an evaluation checklist tailored to news editors' perspectives, obtained generated texts from three popular conversational models using a subset of a publicly available dataset in media bias, and evaluated the texts according to the designed checklist. Fu…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: The paper is accepted at the DELITE workshop which is co-located at COLING/LREC

  18. arXiv:2404.02340

    cs.CL

    Corpus Considerations for Annotator Modeling and Scaling

    Authors: Olufunke O. Sarumi, Béla Neuendorf, Joan Plepi, Lucie Flek, Jörg Schlötterer, Charles Welch

    Abstract: Recent trends in natural language processing research and annotation tasks affirm a paradigm shift from the traditional reliance on a single ground truth to a focus on individual perspectives, particularly in subjective tasks. In scenarios where annotation tasks are meant to encompass diversity, models that solely rely on the majority class labels may inadvertently disregard valuable minority pers…

    Submitted 17 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted at NAACL 2024

    ACM Class: F.2.2; I.2.7

  19. arXiv:2311.00475

    cs.CL

    Style Locality for Controllable Generation with kNN Language Models

    Authors: Gilles Nawezi, Lucie Flek, Charles Welch

    Abstract: Recent language models have been improved by the addition of external memory. Nearest neighbor language models retrieve similar contexts to assist in word prediction. The addition of locality levels allows a model to learn how to weight neighbors based on their relative location to the current text in source documents, and have been shown to further improve model performance. Nearest neighbor mode…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: Accepted to TamingLLM Workshop at SIGDIAL 2023

  20. arXiv:2308.14641

    cs.CL

    Challenges of GPT-3-based Conversational Agents for Healthcare

    Authors: Fabian Lechner, Allison Lahnala, Charles Welch, Lucie Flek

    Abstract: The potential to provide patients with faster information access while allowing medical specialists to concentrate on critical tasks makes medical domain dialog agents appealing. However, the integration of large-language models (LLMs) into these agents presents certain limitations that may result in serious consequences. This paper investigates the challenges and risks of using GPT-3-based models…

    Submitted 29 August, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

    Comments: 12 pages, 9 Tables, accepted to RANLP 2023

  21. arXiv:2308.04226

    cs.HC cs.CL cs.IR cs.LG

    OpinionConv: Conversational Product Search with Grounded Opinions

    Authors: Vahid Sadiri Javadi, Martin Potthast, Lucie Flek

    Abstract: When searching for products, the opinions of others play an important role in making informed decisions. Subjective experiences about a product can be a valuable source of information. This is also true in sales conversations, where a customer and a sales assistant exchange facts and opinions about products. However, training an AI for such conversations is complicated by the fact that language mo…

    Submitted 8 August, 2023; originally announced August 2023.

  22. arXiv:2306.00346

    cs.CL cs.LG

    CAISA at SemEval-2023 Task 8: Counterfactual Data Augmentation for Mitigating Class Imbalance in Causal Claim Identification

    Authors: Akbar Karimi, Lucie Flek

    Abstract: The class imbalance problem can cause machine learning models to produce an undesirable performance on the minority class as well as the whole dataset. Using data augmentation techniques to increase the number of samples is one way to tackle this problem. We introduce a novel counterfactual data augmentation by verb replacement for the identification of medical claims. In addition, we investigate…

    Submitted 1 June, 2023; originally announced June 2023.

  23. arXiv:2301.05494

    cs.CL cs.IR

    Multilingual Detection of Check-Worthy Claims using World Languages and Adapter Fusion

    Authors: Ipek Baris Schlicht, Lucie Flek, Paolo Rosso

    Abstract: Check-worthiness detection is the task of identifying claims, worthy to be investigated by fact-checkers. Resource scarcity for non-world languages and model learning costs remain major challenges for the creation of models supporting multilingual check-worthiness detection. This paper proposes cross-training adapters on a subset of world languages, combined by adapter fusion, to detect claims eme…

    Submitted 13 January, 2023; originally announced January 2023.

    Comments: 17 pages, 11 tables. Accepted as a full paper at ECIR 2023

  24. arXiv:2210.16604

    cs.CL

    A Critical Reflection and Forward Perspective on Empathy and Natural Language Processing

    Authors: Allison Lahnala, Charles Welch, David Jurgens, Lucie Flek

    Abstract: We review the state of research on empathy in natural language processing and identify the following issues: (1) empathy definitions are absent or abstract, which (2) leads to low construct validity and reproducibility. Moreover, (3) emotional empathy is overemphasized, skewing our focus to a narrow subset of simplified tasks. We believe these issues hinder research progress and argue that current…

    Submitted 29 October, 2022; originally announced October 2022.

    Comments: To appear at Findings of EMNLP 2022

  25. arXiv:2210.15762

    cs.CL

    Nearest Neighbor Language Models for Stylistic Controllable Generation

    Authors: Severino Trotta, Lucie Flek, Charles Welch

    Abstract: Recent language modeling performance has been greatly improved by the use of external memory. This memory encodes the context so that similar contexts can be recalled during decoding. This similarity depends on how the model learns to encode context, which can be altered to include other attributes, such as style. We construct and evaluate an architecture for this purpose, using corpora annotated…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: Accepted to GEM workshop at EMNLP 2022

  26. arXiv:2210.14531

    cs.CL

    Unifying Data Perspectivism and Personalization: An Application to Social Norms

    Authors: Joan Plepi, Béla Neuendorf, Lucie Flek, Charles Welch

    Abstract: Instead of using a single ground truth for language processing tasks, several recent studies have examined how to represent and predict the labels of the set of annotators. However, often little or no information about annotators is known, or the set of annotators is small. In this work, we examine a corpus of social media posts about conflict from a set of 13k annotators and 210k judgements of so…

    Submitted 22 October, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

  27. arXiv:2209.02022

    cs.CL cs.CR

    How Much User Context Do We Need? Privacy by Design in Mental Health NLP Application

    Authors: Ramit Sawhney, Atula Tejaswi Neerkaje, Ivan Habernal, Lucie Flek

    Abstract: Clinical NLP tasks such as mental health assessment from text, must take social constraints into account - the performance maximization must be constrained by the utmost importance of guaranteeing privacy of user data. Consumer protection regulations, such as GDPR, generally handle privacy by restricting data availability, such as requiring to limit user data to 'what is necessary' for a given pur…

    Submitted 5 September, 2022; originally announced September 2022.

    Comments: Accepted to ICWSM 2023

  28. arXiv:2208.08758

    cs.CL

    Understanding Interpersonal Conflict Types and their Impact on Perception Classification

    Authors: Charles Welch, Joan Plepi, Béla Neuendorf, Lucie Flek

    Abstract: Studies on interpersonal conflict have a long history and contain many suggestions for conflict typology. We use this as the basis of a novel annotation scheme and release a new dataset of situations and conflict aspect annotations. We then build a classifier to predict whether someone will perceive the actions of one individual as right or wrong in a given situation. Our analyses include conflict…

    Submitted 27 October, 2022; v1 submitted 18 August, 2022; originally announced August 2022.

  29. arXiv:2205.07233

    cs.CL cs.AI

    Mitigating Toxic Degeneration with Empathetic Data: Exploring the Relationship Between Toxicity and Empathy

    Authors: Allison Lahnala, Charles Welch, Béla Neuendorf, Lucie Flek

    Abstract: Large pre-trained neural language models have supported the effectiveness of many NLP tasks, yet are still prone to generating toxic language hindering the safety of their use. Using empathetic data, we improve over recent work on controllable text generation that aims to reduce the toxicity of generated text. We find we are able to dramatically reduce the size of fine-tuning data to 7.5-30k sampl…

    Submitted 15 May, 2022; originally announced May 2022.

    Comments: Accepted to NAACL 2022

  30. arXiv:2205.06181

    cs.SI

    FACTOID: A New Dataset for Identifying Misinformation Spreaders and Political Bias

    Authors: Flora Sakketou, Joan Plepi, Riccardo Cervero, Henri-Jacques Geiss, Paolo Rosso, Lucie Flek

    Abstract: Proactively identifying misinformation spreaders is an important step towards mitigating the impact of fake news on our society. In this paper, we introduce a new contemporary Reddit dataset for fake news spreader analysis, called FACTOID, monitoring political discussions on Reddit since the beginning of 2020. The dataset contains over 4K users with 3.4M Reddit posts, and includes, beyond the user…

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: Accepted to LREC 2022

  31. arXiv:2204.13329

    cs.AI

    Refining Diagnosis Paths for Medical Diagnosis based on an Augmented Knowledge Graph

    Authors: Niclas Heilig, Jan Kirchhoff, Florian Stumpe, Joan Plepi, Lucie Flek, Heiko Paulheim

    Abstract: Medical diagnosis is the process of making a prediction of the disease a patient is likely to have, given a set of symptoms and observations. This requires extensive expert knowledge, in particular when covering a large variety of diseases. Such knowledge can be coded in a knowledge graph -- encompassing diseases, symptoms, and diagnosis paths. Since both the knowledge itself and its encoding can…

    Submitted 28 April, 2022; originally announced April 2022.

    Comments: Accepted at the 5th Workshop on Semantic Web solutions for large-scale biomedical data analytics

  32. arXiv:2204.10190

    cs.CL cs.SI

    Investigating User Radicalization: A Novel Dataset for Identifying Fine-Grained Temporal Shifts in Opinion

    Authors: Flora Sakketou, Allison Lahnala, Liane Vogel, Lucie Flek

    Abstract: There is an increasing need for the ability to model fine-grained opinion shifts of social media users, as concerns about the potential polarizing social effects increase. However, the lack of publicly available datasets that are suitable for the task presents a major challenge. In this paper, we introduce an innovative annotated dataset for modeling subtle opinion fluctuations and detecting fine-…

    Submitted 29 April, 2022; v1 submitted 16 April, 2022; originally announced April 2022.

    Comments: Accepted to LREC 2022

  33. arXiv:2203.02745

    cs.CR cs.CL cs.LG

    The Impact of Differential Privacy on Group Disparity Mitigation

    Authors: Victor Petrén Bach Hansen, Atula Tejaswi Neerkaje, Ramit Sawhney, Lucie Flek, Anders Søgaard

    Abstract: The performance cost of differential privacy has, for some applications, been shown to be higher for minority groups; fairness, conversely, has been shown to disproportionally compromise the privacy of members of such groups. Most work in this area has been restricted to computer vision and risk assessment. In this paper, we evaluate the impact of differential privacy on fairness across four tasks…

    Submitted 5 March, 2022; originally announced March 2022.

  34. arXiv:2110.08011

    cs.CL

    Modeling Proficiency with Implicit User Representations

    Authors: Kim Breitwieser, Allison Lahnala, Charles Welch, Lucie Flek, Martin Potthast

    Abstract: We introduce the problem of proficiency modeling: Given a user's posts on a social media platform, the task is to identify the subset of posts or topics for which the user has some level of proficiency. This enables the filtering and ranking of social media posts on a given topic as per user proficiency. Unlike experts on a given topic, proficient users may not have received formal training and po…

    Submitted 15 October, 2021; originally announced October 2021.

  35. arXiv:2110.04001

    cs.CL

    Perceived and Intended Sarcasm Detection with Graph Attention Networks

    Authors: Joan Plepi, Lucie Flek

    Abstract: Existing sarcasm detection systems focus on exploiting linguistic markers, context, or user-level priors. However, social studies suggest that the relationship between the author and the audience can be equally relevant for the sarcasm usage and interpretation. In this work, we propose a framework jointly leveraging (1) a user context from their historical tweets together with (2) the social infor…

    Submitted 8 October, 2021; originally announced October 2021.