Showing 1–21 of 21 results for author: Venkit, P

  1. arXiv:2506.11255  [pdf, ps, other]

    cs.CY

    Social Scientists on the Role of AI in Research

    Authors: Tatiana Chakravorti, Xinyu Wang, Pranav Narayanan Venkit, Sai Koneru, Kevin Munger, Sarah Rajtmajer

    Abstract: The integration of artificial intelligence (AI) into social science research practices raises significant technological, methodological, and ethical issues. We present a community-centric study drawing on 284 survey responses and 15 semi-structured interviews with social scientists, describing their familiarity with, perceptions of the usefulness of, and ethical concerns about the use of AI in the…

    Submitted 12 June, 2025; originally announced June 2025.

  2. arXiv:2505.07850  [pdf, other]

    cs.CL cs.AI cs.CY

    A Tale of Two Identities: An Ethical Audit of Human and AI-Crafted Personas

    Authors: Pranav Narayanan Venkit, Jiayi Li, Yingfan Zhou, Sarah Rajtmajer, Shomir Wilson

    Abstract: As LLMs (large language models) are increasingly used to generate synthetic personas, particularly in data-limited domains such as health, privacy, and HCI, it becomes necessary to understand how these narratives represent identity, especially that of minority communities. In this paper, we audit synthetic personas generated by 3 LLMs (GPT4o, Gemini 1.5 Pro, Deepseek 2.5) through the lens of repres…

    Submitted 7 May, 2025; originally announced May 2025.

  3. arXiv:2504.18673  [pdf, other]

    cs.CL

    Can Third-parties Read Our Emotions?

    Authors: Jiayi Li, Yingfan Zhou, Pranav Narayanan Venkit, Halima Binte Islam, Sneha Arya, Shomir Wilson, Sarah Rajtmajer

    Abstract: Natural Language Processing tasks that aim to infer an author's private states, e.g., emotions and opinions, from their written text, typically rely on datasets annotated by third-party annotators. However, the assumption that third-party annotators can accurately capture authors' private states remains largely unexamined. In this study, we present human subjects experiments on emotion recognition…

    Submitted 25 April, 2025; originally announced April 2025.

  4. arXiv:2410.22349  [pdf, other]

    cs.IR cs.AI cs.CL cs.CY cs.HC

    Search Engines in an AI Era: The False Promise of Factual and Verifiable Source-Cited Responses

    Authors: Pranav Narayanan Venkit, Philippe Laban, Yilun Zhou, Yixin Mao, Chien-Sheng Wu

    Abstract: Large Language Model (LLM)-based applications are graduating from research prototypes to products serving millions of users, influencing how people write and consume information. A prominent example is the appearance of Answer Engines: LLM-based generative search engines supplanting traditional search engines. Answer engines not only retrieve relevant sources to a user query but synthesize answer…

    Submitted 14 October, 2024; originally announced October 2024.

  5. arXiv:2410.15467  [pdf, other]

    cs.CL cs.AI cs.HC

    Hey GPT, Can You be More Racist? Analysis from Crowdsourced Attempts to Elicit Biased Content from Generative AI

    Authors: Hangzhi Guo, Pranav Narayanan Venkit, Eunchae Jang, Mukund Srinath, Wenbo Zhang, Bonam Mingole, Vipul Gupta, Kush R. Varshney, S. Shyam Sundar, Amulya Yadav

    Abstract: The widespread adoption of large language models (LLMs) and generative AI (GenAI) tools across diverse applications has amplified the importance of addressing societal biases inherent within these technologies. While the NLP community has extensively studied LLM bias, research investigating how non-expert users perceive and interact with biases from these systems remains limited. As these technolo…

    Submitted 20 October, 2024; originally announced October 2024.

  6. arXiv:2407.14779  [pdf, other]

    cs.CY cs.AI cs.HC

    Do Generative AI Models Output Harm while Representing Non-Western Cultures: Evidence from A Community-Centered Approach

    Authors: Sourojit Ghosh, Pranav Narayanan Venkit, Sanjana Gautam, Shomir Wilson, Aylin Caliskan

    Abstract: Our research investigates the impact of Generative Artificial Intelligence (GAI) models, specifically text-to-image generators (T2Is), on the representation of non-Western cultures, with a focus on Indian contexts. Despite the transformative potential of T2Is in content creation, concerns have arisen regarding biases that may lead to misrepresentations and marginalizations. Through a community-cen…

    Submitted 3 August, 2024; v1 submitted 20 July, 2024; originally announced July 2024.

    Comments: This is the pre-peer reviewed version, which has been accepted at the 7th AAAI ACM Conference on AI, Ethics, and Society, Oct. 21, 2024, California, USA

  7. arXiv:2407.01817  [pdf, other]

    cs.CL cs.CY cs.HC

    Race and Privacy in Broadcast Police Communications

    Authors: Pranav Narayanan Venkit, Christopher Graziul, Miranda Ardith Goodman, Samantha Nicole Kenny, Shomir Wilson

    Abstract: Radios are essential for the operations of modern police departments, and they function as both a collaborative communication technology and a sociotechnical system. However, little prior research has examined their usage or their connections to individual privacy and the role of race in policing, two growing topics of concern in the US. As a case study, we examine the Chicago Police Department's…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted in the 27th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW '24)

  8. arXiv:2406.16253  [pdf, other]

    cs.CL

    LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing

    Authors: Jiangshu Du, Yibo Wang, Wenting Zhao, Zhongfen Deng, Shuaiqi Liu, Renze Lou, Henry Peng Zou, Pranav Narayanan Venkit, Nan Zhang, Mukund Srinath, Haoran Ranran Zhang, Vipul Gupta, Yinghui Li, Tao Li, Fei Wang, Qin Liu, Tianlin Liu, Pengzhi Gao, Congying Xia, Chen Xing, Jiayang Cheng, Zhaowei Wang, Ying Su, Raj Sanjay Shah, Ruohao Guo, et al. (15 additional authors not shown)

    Abstract: This work is motivated by two key trends. On one hand, large language models (LLMs) have shown remarkable versatility in various generative tasks such as writing, drawing, and question answering, significantly reducing the time required for many routine tasks. On the other hand, researchers, whose work is not only time-consuming but also highly expertise-demanding, face increasing challenges as th…

    Submitted 2 October, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

    Comments: Accepted by EMNLP 2024 main conference

  9. arXiv:2405.11030  [pdf, other]

    cs.CL

    The Unappreciated Role of Intent in Algorithmic Moderation of Social Media Content

    Authors: Xinyu Wang, Sai Koneru, Pranav Narayanan Venkit, Brett Frischmann, Sarah Rajtmajer

    Abstract: As social media has become a predominant mode of communication globally, the rise of abusive content threatens to undermine civil discourse. Recognizing the critical nature of this issue, a significant body of research has been dedicated to developing language models that can detect various types of online abuse, e.g., hate speech, cyberbullying. However, there exists a notable disconnect between…

    Submitted 17 May, 2024; originally announced May 2024.

  10. arXiv:2404.07461  [pdf, other]

    cs.CL cs.AI

    An Audit on the Perspectives and Challenges of Hallucinations in NLP

    Authors: Pranav Narayanan Venkit, Tatiana Chakravorti, Vipul Gupta, Heidi Biggs, Mukund Srinath, Koustava Goswami, Sarah Rajtmajer, Shomir Wilson

    Abstract: We audit how hallucination in large language models (LLMs) is characterized in peer-reviewed literature, using a critical examination of 103 publications across NLP research. Through the examination of the literature, we identify a lack of agreement with the term 'hallucination' in the field of NLP. Additionally, to complement our audit, we conduct a survey with 171 practitioners from the field of…

    Submitted 13 September, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

  11. arXiv:2403.10776  [pdf, other]

    cs.HC cs.AI cs.CY cs.LG

    From Melting Pots to Misrepresentations: Exploring Harms in Generative AI

    Authors: Sanjana Gautam, Pranav Narayanan Venkit, Sourojit Ghosh

    Abstract: With the widespread adoption of advanced generative models such as Gemini and GPT, there has been a notable increase in the incorporation of such models into sociotechnical systems, categorized under AI-as-a-Service (AIaaS). Despite their versatility across diverse sectors, concerns persist regarding discriminatory tendencies within these models, particularly favoring selected 'majority' demograph…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: In CHI 2024: Generative AI and HCI workshop (GenAICHI 24)

  12. arXiv:2402.11006  [pdf, other]

    cs.CR cs.LG

    Automated Detection and Analysis of Data Practices Using A Real-World Corpus

    Authors: Mukund Srinath, Pranav Venkit, Maria Badillo, Florian Schaub, C. Lee Giles, Shomir Wilson

    Abstract: Privacy policies are crucial for informing users about data practices, yet their length and complexity often deter users from reading them. In this paper, we propose an automated approach to identify and visualize data practices within privacy policies at different levels of detail. Leveraging crowd-sourced annotations from the ToS;DR platform, we experiment with various methods to match policy ex…

    Submitted 16 February, 2024; originally announced February 2024.

  13. arXiv:2310.12318  [pdf, other]

    cs.CL cs.AI cs.CY cs.HC

    The Sentiment Problem: A Critical Survey towards Deconstructing Sentiment Analysis

    Authors: Pranav Narayanan Venkit, Mukund Srinath, Sanjana Gautam, Saranya Venkatraman, Vipul Gupta, Rebecca J. Passonneau, Shomir Wilson

    Abstract: We conduct an inquiry into the sociotechnical aspects of sentiment analysis (SA) by critically examining 189 peer-reviewed papers on their applications, models, and datasets. Our investigation stems from the recognition that SA has become an integral component of diverse sociotechnical systems, exerting influence on both social and technical users. By delving into sociological and technological li…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: This paper has been accepted and will appear at the EMNLP 2023 Main Conference

  14. Towards a Holistic Approach: Understanding Sociodemographic Biases in NLP Models using an Interdisciplinary Lens

    Authors: Pranav Narayanan Venkit

    Abstract: The rapid growth in the usage and applications of Natural Language Processing (NLP) in various sociotechnical solutions has highlighted the need for a comprehensive understanding of bias and its impact on society. While research on bias in NLP has expanded, several challenges persist that require attention. These include the limited focus on sociodemographic biases beyond race and gender, the narr…

    Submitted 24 August, 2023; originally announced August 2023.

  15. arXiv:2308.12539  [pdf, other]

    cs.CL cs.AI cs.LG

    CALM : A Multi-task Benchmark for Comprehensive Assessment of Language Model Bias

    Authors: Vipul Gupta, Pranav Narayanan Venkit, Hugo Laurençon, Shomir Wilson, Rebecca J. Passonneau

    Abstract: As language models (LMs) become increasingly powerful and widely used, it is important to quantify them for sociodemographic bias with potential for harm. Prior measures of bias are sensitive to perturbations in the templates designed to compare performance across social groups, due to factors such as low diversity or limited number of templates. Also, most previous work considers only one NLP tas…

    Submitted 7 August, 2024; v1 submitted 23 August, 2023; originally announced August 2023.

  16. arXiv:2308.04346  [pdf, other]

    cs.CL cs.CY

    Unmasking Nationality Bias: A Study of Human Perception of Nationalities in AI-Generated Articles

    Authors: Pranav Narayanan Venkit, Sanjana Gautam, Ruchi Panchanadikar, Ting-Hao 'Kenneth' Huang, Shomir Wilson

    Abstract: We investigate the potential for nationality biases in natural language processing (NLP) models using human evaluation methods. Biased NLP models can perpetuate stereotypes and lead to algorithmic discrimination, posing a significant challenge to the fairness and justice of AI systems. Our study employs a two-step mixed-methods approach that includes both quantitative and qualitative analysis to i…

    Submitted 8 August, 2023; originally announced August 2023.

  17. arXiv:2307.09209  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Automated Ableism: An Exploration of Explicit Disability Biases in Sentiment and Toxicity Analysis Models

    Authors: Pranav Narayanan Venkit, Mukund Srinath, Shomir Wilson

    Abstract: We analyze sentiment analysis and toxicity detection models to detect the presence of explicit bias against people with disability (PWD). We employ the bias identification framework of Perturbation Sensitivity Analysis to examine conversations related to PWD on social media platforms, specifically Twitter and Reddit, in order to gain insight into how disability bias is disseminated in real-world s…

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: TrustNLP at ACL 2023

    Journal ref: Proceedings at The Third Workshop on Trustworthy Natural Language Processing collocated at the 61st Annual Meeting Of The Association For Computational Linguistics. 2023

  18. arXiv:2306.08158  [pdf, other]

    cs.CL cs.AI cs.LG

    Sociodemographic Bias in Language Models: A Survey and Forward Path

    Authors: Vipul Gupta, Pranav Narayanan Venkit, Shomir Wilson, Rebecca J. Passonneau

    Abstract: Sociodemographic bias in language models (LMs) has the potential for harm when deployed in real-world settings. This paper presents a comprehensive survey of the past decade of research on sociodemographic bias in LMs, organized into a typology that facilitates examining the different aims: types of bias, quantifying bias, and debiasing techniques. We track the evolution of the latter two question…

    Submitted 13 August, 2024; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: 23 pages, 3 figures

  19. arXiv:2302.02463  [pdf, other]

    cs.CL cs.AI

    Nationality Bias in Text Generation

    Authors: Pranav Narayanan Venkit, Sanjana Gautam, Ruchi Panchanadikar, Ting-Hao 'Kenneth' Huang, Shomir Wilson

    Abstract: Little attention is placed on analyzing nationality bias in language models, especially when nationality is highly used as a factor in increasing the performance of social NLP models. This paper examines how a text generation model, GPT-2, accentuates pre-existing societal biases about country-based demonyms. We generate stories using GPT-2 for various nationalities and use sensitivity analysis to…

    Submitted 14 February, 2023; v1 submitted 5 February, 2023; originally announced February 2023.

    Comments: Paper accepted in the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL2023)

  20. arXiv:2111.13259  [pdf, other]

    cs.CL cs.AI

    Identification of Bias Against People with Disabilities in Sentiment Analysis and Toxicity Detection Models

    Authors: Pranav Narayanan Venkit, Shomir Wilson

    Abstract: Sociodemographic biases are a common problem for natural language processing, affecting the fairness and integrity of its applications. Within sentiment analysis, these biases may undermine sentiment predictions for texts that mention personal attributes that unbiased human readers would consider neutral. Such discrimination can have great consequences in the applications of sentiment analysis bot…

    Submitted 25 November, 2021; originally announced November 2021.

  21. arXiv:2103.07833  [pdf, other]

    cs.CL

    A 'Sourceful' Twist: Emoji Prediction Based on Sentiment, Hashtags and Application Source

    Authors: Pranav Venkit, Zeba Karishma, Chi-Yang Hsu, Rahul Katiki, Kenneth Huang, Shomir Wilson, Patrick Dudas

    Abstract: We widely use emojis in social networking to heighten, mitigate or negate the sentiment of the text. Emoji suggestions already exist in many cross-platform applications, but an emoji is predicted solely based on a few prominent words instead of understanding the subject and substance of the text. Through this paper, we showcase the importance of using Twitter features to help the model understand the…

    Submitted 13 March, 2021; originally announced March 2021.