Showing 1–24 of 24 results for author: Jaidka, K

Searching in archive cs.

  1. arXiv:2506.00334  [pdf]

    cs.CL

    Beyond Context to Cognitive Appraisal: Emotion Reasoning as a Theory of Mind Benchmark for Large Language Models

    Authors: Gerard Christopher Yeo, Kokil Jaidka

    Abstract: Datasets used for emotion recognition tasks typically contain overt cues that can be used in predicting the emotions expressed in a text. However, one challenge is that texts sometimes contain covert contextual cues that are rich in affective semantics, which warrant higher-order reasoning abilities to infer emotional states, not simply the emotions conveyed. This study advances beyond surface-lev…

    Submitted 30 May, 2025; originally announced June 2025.

    Comments: 9 pages, 3 figures

  2. arXiv:2506.00332  [pdf]

    cs.CL cs.SI

    Disentangling Codemixing in Chats: The NUS ABC Codemixed Corpus

    Authors: Svetlana Churina, Akshat Gupta, Insyirah Mujtahid, Kokil Jaidka

    Abstract: Code-mixing involves the seamless integration of linguistic elements from multiple languages within a single discourse, reflecting natural multilingual communication patterns. Despite its prominence in informal interactions such as social media, chat messages and instant-messaging exchanges, there has been a lack of publicly available corpora that are author-labeled and suitable for modeling human…

    Submitted 15 June, 2025; v1 submitted 30 May, 2025; originally announced June 2025.

    Comments: 19 pages, 5 figures, 8 tables

  3. arXiv:2505.17413  [pdf]

    cs.CL

    Conversations: Love Them, Hate Them, Steer Them

    Authors: Niranjan Chebrolu, Gerard Christopher Yeo, Kokil Jaidka

    Abstract: Large Language Models (LLMs) demonstrate increasing conversational fluency, yet instilling them with nuanced, human-like emotional expression remains a significant challenge. Current alignment techniques often address surface-level output or require extensive fine-tuning. This paper demonstrates that targeted activation engineering can steer LLaMA 3.1-8B to exhibit more human-like emotional nuance…

    Submitted 22 May, 2025; originally announced May 2025.

    Comments: 11 pages, 8 figures, 7 tables

  4. arXiv:2505.01162  [pdf]

    cs.CL cs.AI

    On the Limitations of Steering in Language Model Alignment

    Authors: Chebrolu Niranjan, Kokil Jaidka, Gerard Christopher Yeo

    Abstract: Steering vectors are a promising approach to aligning language model behavior at inference time. In this paper, we propose a framework to assess the limitations of steering vectors as alignment mechanisms. Using a framework of transformer hook interventions and antonym-based function vectors, we evaluate the role of prompt structure and context complexity in steering effectiveness. Our findings in…

    Submitted 2 May, 2025; originally announced May 2025.

  5. arXiv:2503.21000  [pdf, other]

    cs.LG cs.AI

    Improving User Behavior Prediction: Leveraging Annotator Metadata in Supervised Machine Learning Models

    Authors: Lynnette Hui Xian Ng, Kokil Jaidka, Kaiyuan Tay, Hansin Ahuja, Niyati Chhaya

    Abstract: Supervised machine-learning models often underperform in predicting user behaviors from conversational text, hindered by poor crowdsourced label quality and low NLP task accuracy. We introduce the Metadata-Sensitive Weighted-Encoding Ensemble Model (MSWEEM), which integrates annotator meta-features like fatigue and speeding. First, our results show MSWEEM outperforms standard ensembles by 14% on h…

    Submitted 27 May, 2025; v1 submitted 26 March, 2025; originally announced March 2025.

    Comments: Accepted at CSCW 2025

  6. arXiv:2412.02271  [pdf]

    cs.CL

    MediaSpin: Exploring Media Bias Through Fine-Grained Analysis of News Headlines

    Authors: Preetika Verma, Kokil Jaidka

    Abstract: The editability of online news content has become a significant factor in shaping public perception, as social media platforms introduce new affordances for dynamic and adaptive news framing. Edits to news headlines can refocus audience attention, add or remove emotional language, and shift the framing of events in subtle yet impactful ways. What types of media bias are editorialized in and out of…

    Submitted 22 May, 2025; v1 submitted 3 December, 2024; originally announced December 2024.

    Comments: 8 pages, 3 figures, 8 tables

  7. arXiv:2411.16813  [pdf]

    cs.CL cs.AI

    Incivility and Rigidity: The Risks of Fine-Tuning LLMs for Political Argumentation

    Authors: Svetlana Churina, Kokil Jaidka

    Abstract: The incivility prevalent on platforms like Twitter (now X) and Reddit poses a challenge for developing AI systems that can support productive and rhetorically sound political argumentation. In this study, we report experiments with GPT-3.5 Turbo, fine-tuned on two contrasting datasets of political discussions: high-variance, high-incivility Twitter replies to U.S. Congress, and low-variance, low-i…

    Submitted 20 June, 2025; v1 submitted 25 November, 2024; originally announced November 2024.

  8. arXiv:2411.10480  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.MM

    Hateful Meme Detection through Context-Sensitive Prompting and Fine-Grained Labeling

    Authors: Rongxin Ouyang, Kokil Jaidka, Subhayan Mukerjee, Guangyu Cui

    Abstract: The prevalence of multi-modal content on social media complicates automated moderation strategies. This calls for an enhancement in multi-modal classification and a deeper understanding of understated meanings in images and memes. Although previous efforts have aimed at improving model performance through fine-tuning, few have explored an end-to-end optimization pipeline that accounts for modaliti…

    Submitted 13 November, 2024; originally announced November 2024.

    Comments: AAAI-25 Student Abstract, Oral Presentation

    MSC Class: 68T45; 68T50; 68T07

    ACM Class: I.2.10; I.2.7; I.2.6

  9. arXiv:2407.19526  [pdf, other]

    cs.CL

    Impact of Decoding Methods on Human Alignment of Conversational LLMs

    Authors: Shaz Furniturewala, Kokil Jaidka, Yashvardhan Sharma

    Abstract: To be included into chatbot systems, Large language models (LLMs) must be aligned with human conversational conventions. However, being trained mainly on web-scraped data gives existing LLMs a voice closer to informational text than actual human speech. In this paper, we examine the effect of decoding methods on the alignment between LLM-generated and human conversations, including Beam Search, To…

    Submitted 28 July, 2024; originally announced July 2024.

  10. arXiv:2407.08607  [pdf, other]

    cs.CL

    Turn-Level Empathy Prediction Using Psychological Indicators

    Authors: Shaz Furniturewala, Kokil Jaidka

    Abstract: For the WASSA 2024 Empathy and Personality Prediction Shared Task, we propose a novel turn-level empathy detection method that decomposes empathy into six psychological indicators: Emotional Language, Perspective-Taking, Sympathy and Compassion, Extroversion, Openness, and Agreeableness. A pipeline of text enrichment using a Large Language Model (LLM) followed by DeBERTA fine-tuning demonstrates a…

    Submitted 11 July, 2024; originally announced July 2024.

  11. arXiv:2407.08182  [pdf, other]

    cs.CL

    Beyond Text: Leveraging Multi-Task Learning and Cognitive Appraisal Theory for Post-Purchase Intention Analysis

    Authors: Gerard Christopher Yeo, Shaz Furniturewala, Kokil Jaidka

    Abstract: Supervised machine-learning models for predicting user behavior offer a challenging classification problem with lower average prediction performance scores than other text classification tasks. This study evaluates multi-task learning frameworks grounded in Cognitive Appraisal Theory to predict user behavior as a function of users' self-expression and psychological attributes. Our experiments show…

    Submitted 11 July, 2024; originally announced July 2024.

  12. arXiv:2405.10431  [pdf, other]

    cs.CL

    Thinking Fair and Slow: On the Efficacy of Structured Prompts for Debiasing Language Models

    Authors: Shaz Furniturewala, Surgan Jandial, Abhinav Java, Pragyan Banerjee, Simra Shahid, Sumit Bhatia, Kokil Jaidka

    Abstract: Existing debiasing techniques are typically training-based or require access to the model's internals and output distributions, so they are inaccessible to end-users looking to adapt LLM outputs for their particular needs. In this study, we examine whether structured prompting techniques can offer opportunities for fair text generation. We evaluate a comprehensive end-user-focused iterative framew…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: The first two authors have equal contribution

  13. arXiv:2403.02246  [pdf, other]

    cs.CL

    PHAnToM: Persona-based Prompting Has An Effect on Theory-of-Mind Reasoning in Large Language Models

    Authors: Fiona Anting Tan, Gerard Christopher Yeo, Kokil Jaidka, Fanyou Wu, Weijie Xu, Vinija Jain, Aman Chadha, Yang Liu, See-Kiong Ng

    Abstract: The use of LLMs in natural language reasoning has shown mixed results, sometimes rivaling or even surpassing human performance in simpler classification tasks while struggling with social-cognitive reasoning, a domain where humans naturally excel. These differences have been attributed to many factors, such as variations in prompting and the specific LLMs used. However, no reasons appear conclusiv…

    Submitted 22 October, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: 8 pages

  14. arXiv:2402.08498  [pdf]

    cs.CL

    "Reasoning" with Rhetoric: On the Style-Evidence Tradeoff in LLM-Generated Counter-Arguments

    Authors: Preetika Verma, Kokil Jaidka, Svetlana Churina

    Abstract: Large language models (LLMs) play a key role in generating evidence-based and stylistic counter-arguments, yet their effectiveness in real-world applications has been underexplored. Previous research often neglects the balance between evidentiality and style, which are crucial for persuasive arguments. To address this, we evaluated the effectiveness of stylized evidence-based counter-argument gene…

    Submitted 23 May, 2025; v1 submitted 13 February, 2024; originally announced February 2024.

    Comments: 24 pages, 9 figures, 13 tables

  15. arXiv:2311.08666  [pdf]

    cs.CL cs.GT cs.LG

    It Takes Two to Negotiate: Modeling Social Exchange in Online Multiplayer Games

    Authors: Kokil Jaidka, Hansin Ahuja, Lynnette Ng

    Abstract: Online games are dynamic environments where players interact with each other, which offers a rich setting for understanding how players negotiate their way through the game to an ultimate victory. This work studies online player interactions during the turn-based strategy game, Diplomacy. We annotated a dataset of over 10,000 chat messages for different negotiation strategies and empirically exami…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: 28 pages, 11 figures. Accepted to CSCW '24 and forthcoming in the Proceedings of ACM HCI '24

  16. arXiv:2310.18964  [pdf]

    cs.CL

    LLMs and Finetuning: Benchmarking cross-domain performance for hate speech detection

    Authors: Ahmad Nasir, Aadish Sharma, Kokil Jaidka, Saifuddin Ahmed

    Abstract: In the evolving landscape of online communication, hate speech detection remains a formidable challenge, further compounded by the diversity of digital platforms. This study investigates the effectiveness and adaptability of pre-trained and fine-tuned Large Language Models (LLMs) in identifying hate speech, to address two central questions: (1) To what extent does the model performance depend on t…

    Submitted 30 April, 2025; v1 submitted 29 October, 2023; originally announced October 2023.

    Comments: 18 pages, 3 figures, 5 tables

  17. arXiv:2301.11850  [pdf, other]

    cs.CL

    Predicting Sentence-Level Factuality of News and Bias of Media Outlets

    Authors: Francielle Vargas, Kokil Jaidka, Thiago A. S. Pardo, Fabrício Benevenuto

    Abstract: Automated news credibility and fact-checking at scale require accurately predicting news factuality and media bias. This paper introduces a large sentence-level dataset, titled "FactNews", composed of 6,191 sentences expertly annotated according to factuality and media bias definitions proposed by AllSides. We use FactNews to assess the overall reliability of news sources, by formulating two text…

    Submitted 13 September, 2024; v1 submitted 27 January, 2023; originally announced January 2023.

    Comments: Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing (RANLP 2023). https://aclanthology.org/2023.ranlp-1.127

  18. arXiv:2301.11429  [pdf, other]

    cs.SI cs.CY

    Just Another Day on Twitter: A Complete 24 Hours of Twitter Data

    Authors: Juergen Pfeffer, Daniel Matter, Kokil Jaidka, Onur Varol, Afra Mashhadi, Jana Lasser, Dennis Assenmacher, Siqi Wu, Diyi Yang, Cornelia Brantner, Daniel M. Romero, Jahna Otterbacher, Carsten Schwemmer, Kenneth Joseph, David Garcia, Fred Morstatter

    Abstract: At the end of October 2022, Elon Musk concluded his acquisition of Twitter. In the weeks and months before that, several questions were publicly discussed that were not only of interest to the platform's future buyers, but also of high relevance to the Computational Social Science research community. For example, how many active users does the platform have? What percentage of accounts on the site…

    Submitted 11 April, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

  19. arXiv:2112.15331  [pdf, other]

    cs.CL cs.CY

    Using Graph-Aware Reinforcement Learning to Identify Winning Strategies in Diplomacy Games (Student Abstract)

    Authors: Hansin Ahuja, Lynnette Hui Xian Ng, Kokil Jaidka

    Abstract: This abstract proposes an approach towards goal-oriented modeling of the detection and modeling complex social phenomena in multiparty discourse in an online political strategy game. We developed a two-tier approach that first encodes sociolinguistic behavior as linguistic features then use reinforcement learning to estimate the advantage afforded to any player. In the first tier, sociolinguistic…

    Submitted 3 January, 2022; v1 submitted 31 December, 2021; originally announced December 2021.

  20. arXiv:2110.15726  [pdf, other]

    cs.CL cs.AI cs.CY cs.SI

    Social Media Reveals Urban-Rural Differences in Stress across China

    Authors: Jesse Cui, Tingdan Zhang, Kokil Jaidka, Dandan Pang, Garrick Sherman, Vinit Jakhetiya, Lyle Ungar, Sharath Chandra Guntuku

    Abstract: Modeling differential stress expressions in urban and rural regions in China can provide a better understanding of the effects of urbanization on psychological well-being in a country that has rapidly grown economically in the last two decades. This paper studies linguistic differences in the experiences and expressions of stress in urban-rural China from Weibo posts from over 65,000 users across…

    Submitted 3 November, 2021; v1 submitted 19 October, 2021; originally announced October 2021.

    Comments: Accepted at AAAI Conference on Web and Social Media (ICWSM) 2022

  21. arXiv:1909.00764  [pdf, ps, other]

    cs.CL cs.IR

    The CL-SciSumm Shared Task 2018: Results and Key Insights

    Authors: Kokil Jaidka, Michihiro Yasunaga, Muthu Kumar Chandrasekaran, Dragomir Radev, Min-Yen Kan

    Abstract: This overview describes the official results of the CL-SciSumm Shared Task 2018 -- the first medium-scale shared task on scientific document summarization in the computational linguistics (CL) domain. This year, the dataset comprised 60 annotated sets of citing and reference papers from the open access research papers in the CL domain. The Shared Task was organized as a part of the 41st Annual Con…

    Submitted 2 September, 2019; originally announced September 2019.

    Comments: BIRNDL @ SIGIR 2018. arXiv admin note: substantial text overlap with arXiv:1907.09854

  22. arXiv:1812.00427  [pdf, ps, other]

    cs.IR cs.DL

    Report on the 3rd Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2018)

    Authors: Philipp Mayr, Muthu Kumar Chandrasekaran, Kokil Jaidka

    Abstract: The $3^{rd}$ joint BIRNDL workshop was held at the 41st ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2018) in Ann Arbor, USA. BIRNDL 2018 intended to stimulate IR researchers and digital library professionals to elaborate on new approaches in natural language processing, information retrieval, scientometrics, and recommendation techniques that can advance the st…

    Submitted 2 December, 2018; originally announced December 2018.

    Comments: 6 pages, to appear in SIGIR Forum

  23. arXiv:1811.07430  [pdf, other]

    cs.CL cs.CY

    Understanding and Measuring Psychological Stress using Social Media

    Authors: Sharath Chandra Guntuku, Anneke Buffone, Kokil Jaidka, Johannes Eichstaedt, Lyle Ungar

    Abstract: A body of literature has demonstrated that users' mental health conditions, such as depression and anxiety, can be predicted from their social media language. There is still a gap in the scientific understanding of how psychological stress is expressed on social media. Stress is one of the primary underlying causes and correlates of chronic physical illnesses and mental health conditions. In this…

    Submitted 4 April, 2019; v1 submitted 18 November, 2018; originally announced November 2018.

    Comments: Accepted for publication in the proceedings of ICWSM 2019

  24. arXiv:1706.02509  [pdf, ps, other]

    cs.DL cs.IR

    Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2017)

    Authors: Muthu Kumar Chandrasekaran, Kokil Jaidka, Philipp Mayr

    Abstract: The large scale of scholarly publications poses a challenge for scholars in information seeking and sensemaking. Bibliometrics, information retrieval (IR), text mining and NLP techniques could help in these search and look-up activities, but are not yet widely used. This workshop is intended to stimulate IR researchers and digital library professionals to elaborate on new approaches in natural lan…

    Submitted 8 June, 2017; originally announced June 2017.

    Comments: 2 pages, workshop paper accepted at SIGIR 2017