
Showing 1–50 of 197 results for author: Ferrara, E

Searching in archive cs.
  1. arXiv:2506.12349 [pdf, ps, other]

    cs.CY cs.AI cs.CL

    Information Suppression in Large Language Models: Auditing, Quantifying, and Characterizing Censorship in DeepSeek

    Authors: Peiran Qiu, Siyi Zhou, Emilio Ferrara

    Abstract: This study examines information suppression mechanisms in DeepSeek, an open-source large language model (LLM) developed in China. We propose an auditing framework and use it to analyze the model's responses to 646 politically sensitive prompts by comparing its final output with intermediate chain-of-thought (CoT) reasoning. Our audit unveils evidence of semantic-level information suppression in De…

    Submitted 14 June, 2025; originally announced June 2025.

  2. arXiv:2506.05670 [pdf, ps, other]

    cs.CL

    Can LLMs Express Personality Across Cultures? Introducing CulturalPersonas for Evaluating Trait Alignment

    Authors: Priyanka Dey, Yugal Khanter, Aayush Bothra, Jieyu Zhao, Emilio Ferrara

    Abstract: As LLMs become central to interactive applications, ranging from tutoring to mental health, the ability to express personality in culturally appropriate ways is increasingly important. While recent works have explored personality evaluation of LLMs, they largely overlook the interplay between culture and personality. To address this, we introduce CulturalPersonas, the first large-scale benchmark w…

    Submitted 5 June, 2025; originally announced June 2025.

  3. arXiv:2505.21729 [pdf, ps, other]

    cs.SI cs.CY

    Bridging the Narrative Divide: Cross-Platform Discourse Networks in Fragmented Ecosystems

    Authors: Patrick Gerard, Hans W. A. Hanley, Luca Luceri, Emilio Ferrara

    Abstract: Political discourse has grown increasingly fragmented across different social platforms, making it challenging to trace how narratives spread and evolve within such a fragmented information ecosystem. Reconstructing social graphs and information diffusion networks is challenging, and available strategies typically depend on platform-specific features and behavioral signals which are often incompat…

    Submitted 22 May, 2025; originally announced May 2025.

    Comments: 22 pages, 5 figures

  4. arXiv:2505.20067 [pdf, ps, other]

    cs.SI cs.AI cs.CY

    Community Moderation and the New Epistemology of Fact Checking on Social Media

    Authors: Isabelle Augenstein, Michiel Bakker, Tanmoy Chakraborty, David Corney, Emilio Ferrara, Iryna Gurevych, Scott Hale, Eduard Hovy, Heng Ji, Irene Larraz, Filippo Menczer, Preslav Nakov, Paolo Papotti, Dhruv Sahnan, Greta Warren, Giovanni Zagni

    Abstract: Social media platforms have traditionally relied on internal moderation teams and partnerships with independent fact-checking organizations to identify and flag misleading content. Recently, however, platforms including X (formerly Twitter) and Meta have shifted towards community-driven content moderation by launching their own versions of crowd-sourced fact-checking -- Community Notes. If effecti…

    Submitted 26 May, 2025; originally announced May 2025.

    Comments: 1 Figure, 2 tables

  5. arXiv:2505.10867 [pdf, other]

    cs.SI

    Coordinated Inauthentic Behavior on TikTok: Challenges and Opportunities for Detection in a Video-First Ecosystem

    Authors: Luca Luceri, Tanishq Vijay Salkar, Ashwin Balasubramanian, Gabriela Pinto, Chenning Sun, Emilio Ferrara

    Abstract: Detecting coordinated inauthentic behavior (CIB) is central to the study of online influence operations. However, most methods focus on text-centric platforms, leaving video-first ecosystems like TikTok largely unexplored. To address this gap, we develop and evaluate a computational framework for detecting CIB on TikTok, leveraging a network-based approach adapted to the platform's unique content…

    Submitted 16 May, 2025; originally announced May 2025.

  6. arXiv:2505.02250 [pdf, ps, other]

    cs.SI

    EDTok: A Dataset for Eating Disorder Content on TikTok

    Authors: Charles Bickham, Bryan Ramirez-Gonzalez, Minh Duc Chu, Kristina Lerman, Emilio Ferrara

    Abstract: Eating disorders, which include anorexia nervosa and bulimia nervosa, have been exacerbated by the COVID-19 pandemic, with increased diagnoses linked to heightened exposure to idealized body images online. TikTok, a platform with over a billion predominantly adolescent users, has become a key space where eating disorder content is shared, raising concerns about its impact on vulnerable populations…

    Submitted 4 May, 2025; originally announced May 2025.

    Comments: 10 pages, 6 figures

  7. arXiv:2503.02328 [pdf, other]

    cs.CL cs.CY cs.HC cs.SI

    Limited Effectiveness of LLM-based Data Augmentation for COVID-19 Misinformation Stance Detection

    Authors: Eun Cheol Choi, Ashwin Balasubramanian, Jinhu Qi, Emilio Ferrara

    Abstract: Misinformation surrounding emerging outbreaks poses a serious societal threat, making robust countermeasures essential. One promising approach is stance detection (SD), which identifies whether social media posts support or oppose misleading claims. In this work, we finetune classifiers on COVID-19 misinformation SD datasets consisting of claims and corresponding tweets. Specifically, we test cont…

    Submitted 4 March, 2025; originally announced March 2025.

  8. arXiv:2502.17344 [pdf, other]

    cs.SI

    Beyond Interaction Patterns: Assessing Claims of Coordinated Inter-State Information Operations on Twitter/X

    Authors: Valeria Pantè, David Axelrod, Alessandro Flammini, Filippo Menczer, Emilio Ferrara, Luca Luceri

    Abstract: Social media platforms have become key tools for coordinated influence operations, enabling state actors to manipulate public opinion through strategic, collective actions. While previous research has suggested collaboration between states, such research failed to leverage state-of-the-art coordination indicators or control datasets. In this study, we investigate inter-state coordination by analyz…

    Submitted 24 February, 2025; originally announced February 2025.

  9. arXiv:2502.11248 [pdf, other]

    cs.SI cs.CY

    Synthetic Politics: Prevalence, Spreaders, and Emotional Reception of AI-Generated Political Images on X

    Authors: Zhiyi Chen, Jinyi Ye, Beverlyn Tsai, Emilio Ferrara, Luca Luceri

    Abstract: Despite widespread concerns about the risks of AI-generated content (AIGC) to the integrity of social media discourse, little is known about its scale and scope, the actors responsible for its dissemination online, and the user responses it elicits. In this work, we measure and characterize the prevalence, spreaders, and emotional reception of AI-generated political images. Analyzing a large-scale…

    Submitted 13 May, 2025; v1 submitted 16 February, 2025; originally announced February 2025.

  10. arXiv:2501.11849 [pdf, other]

    cs.CL cs.AI cs.SI

    Network-informed Prompt Engineering against Organized Astroturf Campaigns under Extreme Class Imbalance

    Authors: Nikos Kanakaris, Heng Ping, Xiongye Xiao, Nesreen K. Ahmed, Luca Luceri, Emilio Ferrara, Paul Bogdan

    Abstract: Detecting organized political campaigns is of paramount importance in fighting against disinformation on social media. Existing approaches for the identification of such organized actions employ techniques mostly from network science, graph machine learning and natural language processing. Their ultimate goal is to analyze the relationships and interactions (e.g. re-posting) among users and the te…

    Submitted 17 February, 2025; v1 submitted 20 January, 2025; originally announced January 2025.

  11. arXiv:2412.15721 [pdf, other]

    cs.SI cs.CY cs.HC

    Safe Spaces or Toxic Places? Content Moderation and Social Dynamics of Online Eating Disorder Communities

    Authors: Kristina Lerman, Minh Duc Chu, Charles Bickham, Luca Luceri, Emilio Ferrara

    Abstract: Social media platforms have become critical spaces for discussing mental health concerns, including eating disorders. While these platforms can provide valuable support networks, they may also amplify harmful content that glorifies disordered cognition and self-destructive behaviors. While social media platforms have implemented various content moderation strategies, from stringent to laissez-fair…

    Submitted 20 December, 2024; originally announced December 2024.

    Comments: arXiv admin note: text overlap with arXiv:2401.09647

  12. arXiv:2412.15583 [pdf, other]

    cs.SI

    Tracking the 2024 US Presidential Election Chatter on TikTok: A Public Multimodal Dataset

    Authors: Gabriela Pinto, Charles Bickham, Tanishq Salkar, Joyston Menezes, Luca Luceri, Emilio Ferrara

    Abstract: This paper presents the TikTok 2024 U.S. Presidential Election Dataset, a large-scale resource designed to advance research into political communication and social media dynamics. The dataset comprises 3.14 million videos published on TikTok between November 1, 2023, and October 16, 2024, encompassing video ids and transcripts. Data collection was conducted using the TikTok Research API with a co…

    Submitted 20 December, 2024; originally announced December 2024.

  13. arXiv:2412.15291 [pdf, other]

    cs.CL cs.SI

    A Large-Scale Simulation on Large Language Models for Decision-Making in Political Science

    Authors: Chenxiao Yu, Jinyi Ye, Yuangang Li, Zheng Li, Emilio Ferrara, Xiyang Hu, Yue Zhao

    Abstract: While LLMs have demonstrated remarkable capabilities in text generation and reasoning, their ability to simulate human decision-making -- particularly in political contexts -- remains an open question. However, modeling voter behavior presents unique challenges due to limited voter-level data, evolving political landscapes, and the complexity of human reasoning. In this study, we develop a theory-…

    Submitted 9 April, 2025; v1 submitted 19 December, 2024; originally announced December 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2411.03321. This version adds a new model to our experimental setup, modifies the paper's main discussion, and updates the authorship list

  14. arXiv:2412.14663 [pdf, other]

    cs.SI cs.AI cs.LG

    IOHunter: Graph Foundation Model to Uncover Online Information Operations

    Authors: Marco Minici, Luca Luceri, Francesco Fabbri, Emilio Ferrara

    Abstract: Social media platforms have become vital spaces for public discourse, serving as modern agoràs where a wide range of voices influence societal narratives. However, their open nature also makes them vulnerable to exploitation by malicious actors, including state-sponsored entities, who can conduct information operations (IOs) to manipulate public opinion. The spread of misinformation, false news, a…

    Submitted 3 March, 2025; v1 submitted 19 December, 2024; originally announced December 2024.

    Comments: Accepted at AAAI 2025

  15. arXiv:2412.10981 [pdf, other]

    cs.CY cs.AI cs.HC cs.LG

    Hybrid Forecasting of Geopolitical Events

    Authors: Daniel M. Benjamin, Fred Morstatter, Ali E. Abbas, Andres Abeliuk, Pavel Atanasov, Stephen Bennett, Andreas Beger, Saurabh Birari, David V. Budescu, Michele Catasta, Emilio Ferrara, Lucas Haravitch, Mark Himmelstein, KSM Tozammel Hossain, Yuzhong Huang, Woojeong Jin, Regina Joseph, Jure Leskovec, Akira Matsui, Mehrnoosh Mirtaheri, Xiang Ren, Gleb Satyukov, Rajiv Sethi, Amandeep Singh, Rok Sosic, et al. (4 additional authors not shown)

    Abstract: Sound decision-making relies on accurate prediction for tangible outcomes ranging from military conflict to disease outbreaks. To improve crowdsourced forecasting accuracy, we developed SAGE, a hybrid forecasting system that combines human and machine generated forecasts. The system provides a platform where users can interact with machine models and thus anchor their judgments on an objective ben…

    Submitted 14 December, 2024; originally announced December 2024.

    Comments: 20 pages, 6 figures, 4 tables

    Journal ref: AI Magazine, Volume 44, Issue 1, Pages 112-128, Spring 2023

  16. arXiv:2412.06864 [pdf, other]

    cs.CL cs.AI

    Political-LLM: Large Language Models in Political Science

    Authors: Lincan Li, Jiaqi Li, Catherine Chen, Fred Gui, Hongjia Yang, Chenxiao Yu, Zhengguang Wang, Jianing Cai, Junlong Aaron Zhou, Bolin Shen, Alex Qian, Weixin Chen, Zhongkai Xue, Lichao Sun, Lifang He, Hanjie Chen, Kaize Ding, Zijian Du, Fangzhou Mu, Jiaxin Pei, Jieyu Zhao, Swabha Swayamdipta, Willie Neiswanger, Hua Wei, Xiyang Hu, et al. (22 additional authors not shown)

    Abstract: In recent years, large language models (LLMs) have been widely adopted in political science tasks such as election prediction, sentiment analysis, policy impact assessment, and misinformation detection. Meanwhile, the need to systematically understand how LLMs can further revolutionize the field also becomes urgent. In this work, we--a multidisciplinary team of researchers spanning computer scienc…

    Submitted 9 December, 2024; originally announced December 2024.

    Comments: 54 Pages, 9 Figures

  17. arXiv:2411.08250 [pdf, other]

    cs.SI

    What Are The Risks of Living in a GenAI Synthetic Reality? The Generative AI Paradox

    Authors: Emilio Ferrara

    Abstract: Generative AI (GenAI) technologies possess unprecedented potential to reshape our world and our perception of reality. These technologies can amplify traditionally human-centered capabilities, such as creativity and complex problem-solving in socio-technical contexts. By fostering human-AI collaboration, GenAI could enhance productivity, dismantle communication barriers across abilities and cultur…

    Submitted 12 November, 2024; originally announced November 2024.

    Comments: HUMANS Lab -- Working Paper No. 2024.2 -- The 2024 Election Integrity Initiative -- University of Southern California

  18. arXiv:2411.05192 [pdf, other]

    cs.CL cs.AI

    Explaining Mixtures of Sources in News Articles

    Authors: Alexander Spangher, James Youn, Matt DeButts, Nanyun Peng, Emilio Ferrara, Jonathan May

    Abstract: Human writers plan, then write. For large language models (LLMs) to play a role in longer-form article generation, we must understand the planning steps humans make before writing. We explore one kind of planning, source-selection in news, as a case-study for evaluating plans in long-form generation. We ask: why do specific stories call for specific kinds of sources? We imagine a generative proces…

    Submitted 7 November, 2024; originally announced November 2024.

    Comments: 9 pages

  19. Auditing Political Exposure Bias: Algorithmic Amplification on Twitter/X During the 2024 U.S. Presidential Election

    Authors: Jinyi Ye, Luca Luceri, Emilio Ferrara

    Abstract: Approximately 50% of tweets in X's user timelines are personalized recommendations from accounts they do not follow. This raises a critical question: What political content are users exposed to beyond their established networks, and what implications does this have for democratic discourse online? In this paper, we present a six-week audit of X's algorithmic content recommendations during the 2024…

    Submitted 20 March, 2025; v1 submitted 4 November, 2024; originally announced November 2024.

    Journal ref: Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency (FAccT '25)

  20. arXiv:2411.01330 [pdf, other]

    cs.SI

    Unfiltered Conversations: A Dataset of 2024 U.S. Presidential Election Discourse on Truth Social

    Authors: Kashish Shah, Patrick Gerard, Luca Luceri, Emilio Ferrara

    Abstract: Truth Social, launched as a social media platform with a focus on free speech, has become a prominent space for political discourse, attracting a user base with diverse, yet often conservative, viewpoints. As an emerging platform with minimal content moderation, Truth Social has facilitated discussions around contentious social and political issues but has also seen the spread of conspiratorial an…

    Submitted 2 November, 2024; originally announced November 2024.

    Comments: HUMANS Lab -- Working Paper No. 2024.8 -- The 2024 Election Integrity Initiative -- University of Southern California

  21. arXiv:2411.00376 [pdf, other]

    cs.SI

    A Public Dataset Tracking Social Media Discourse about the 2024 U.S. Presidential Election on Twitter/X

    Authors: Ashwin Balasubramanian, Vito Zou, Hitesh Narayana, Christina You, Luca Luceri, Emilio Ferrara

    Abstract: In this paper, we introduce the first release of a large-scale dataset capturing discourse on $\mathbb{X}$ (a.k.a., Twitter) related to the upcoming 2024 U.S. Presidential Election. Our dataset comprises 22 million publicly available posts on X.com, collected from May 1, 2024, to July 31, 2024, using a custom-built scraper, which we describe in detail. By employing targeted keywords linked to key…

    Submitted 1 November, 2024; originally announced November 2024.

  22. arXiv:2410.23638 [pdf, other]

    cs.SI

    Unearthing a Billion Telegram Posts about the 2024 U.S. Presidential Election: Development of a Public Dataset

    Authors: Leonardo Blas, Luca Luceri, Emilio Ferrara

    Abstract: With its lenient moderation policies and long-standing associations with potentially unlawful activities, Telegram has become an incubator for problematic content, frequently featuring conspiratorial, hyper-partisan, and fringe narratives. In the political sphere, these concerns are amplified by reports of Telegram channels being used to organize violent acts, such as those that occurred during th…

    Submitted 31 October, 2024; originally announced October 2024.

    Comments: HUMANS Lab -- Working Paper No. 2024.5 -- The 2024 Election Integrity Initiative -- University of Southern California

  23. arXiv:2410.22716 [pdf, other]

    cs.SI

    Exposing Cross-Platform Coordinated Inauthentic Activity in the Run-Up to the 2024 U.S. Election

    Authors: Federico Cinus, Marco Minici, Luca Luceri, Emilio Ferrara

    Abstract: Coordinated information operations remain a persistent challenge on social media, despite platform efforts to curb them. While previous research has primarily focused on identifying these operations within individual platforms, this study shows that coordination frequently transcends platform boundaries. Leveraging newly collected data of online conversations related to the 2024 U.S. Election acro…

    Submitted 24 April, 2025; v1 submitted 30 October, 2024; originally announced October 2024.

    Comments: HUMANS Lab -- Working Paper No. 2024.7 -- The 2024 Election Integrity Initiative -- University of Southern California - Updated Version of WWW '25 Submission

  24. arXiv:2409.15402 [pdf, other]

    cs.SI cs.CY

    Uncovering Coordinated Cross-Platform Information Operations Threatening the Integrity of the 2024 U.S. Presidential Election Online Discussion

    Authors: Marco Minici, Luca Luceri, Federico Cinus, Emilio Ferrara

    Abstract: Information Operations (IOs) pose a significant threat to the integrity of democratic processes, with the potential to influence election-related online discourse. In anticipation of the 2024 U.S. presidential election, we present a study aimed at uncovering the digital traces of coordinated IOs on $\mathbb{X}$ (formerly Twitter). Using our machine learning framework for detecting online coordinat…

    Submitted 30 October, 2024; v1 submitted 23 September, 2024; originally announced September 2024.

    Comments: First Monday 29(11), 2024

    Report number: The 2024 Election Integrity Initiative: HUMANS Lab - Working Paper No. 2024.4 - University of Southern California

  25. arXiv:2407.07196 [pdf, ps, other]

    cs.HC

    Large Language Models for Wearable Sensor-Based Human Activity Recognition, Health Monitoring, and Behavioral Modeling: A Survey of Early Trends, Datasets, and Challenges

    Authors: Emilio Ferrara

    Abstract: The proliferation of wearable technology enables the generation of vast amounts of sensor data, offering significant opportunities for advancements in health monitoring, activity recognition, and personalized medicine. However, the complexity and volume of this data present substantial challenges in data modeling and analysis, which have been tamed with approaches spanning time series modeling to…

    Submitted 31 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Journal ref: Sensors, 2024

  26. arXiv:2407.01471 [pdf, other]

    cs.SI

    Tracking the 2024 US Presidential Election Chatter on Tiktok: A Public Multimodal Dataset

    Authors: Gabriela Pinto, Charles Bickham, Tanishq Salkar, Luca Luceri, Emilio Ferrara

    Abstract: This paper documents our release of a large-scale data collection of TikTok posts related to the upcoming 2024 U.S. Presidential Election. Our current data comprises 1.8 million videos published between November 1, 2023, and May 26, 2024. Its exploratory analysis identifies the most common keywords, hashtags, and bigrams in both Spanish and English posts, focusing on the election and the two main…

    Submitted 2 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

    Comments: The 2024 Election Integrity Initiative

    Report number: HUMANS Lab -- Working Paper No. 2024.3

  27. Tracing the Unseen: Uncovering Human Trafficking Patterns in Job Listings

    Authors: Siyi Zhou, Jiankun Peng, Emilio Ferrara

    Abstract: In the shadow of the digital revolution, the insidious issue of human trafficking has found new breeding grounds within the realms of social media and online job boards. Previous research efforts have predominantly centered on identifying victims via the analysis of escort advertisements. However, our work shifts the focus towards enabling a proactive approach: pinpointing potential traffickers be…

    Submitted 18 June, 2024; originally announced June 2024.

  28. arXiv:2406.11553 [pdf, other]

    cs.SI

    The Susceptibility Paradox in Online Social Influence

    Authors: Luca Luceri, Jinyi Ye, Julie Jiang, Emilio Ferrara

    Abstract: Understanding susceptibility to online influence is crucial for mitigating the spread of misinformation and protecting vulnerable audiences. This paper investigates susceptibility to influence within social networks, focusing on the differential effects of influence-driven versus spontaneous behaviors on user content adoption. Our analysis reveals that influence-driven adoption exhibits high homop…

    Submitted 13 May, 2025; v1 submitted 17 June, 2024; originally announced June 2024.

    Journal ref: Proceedings of the Nineteenth International AAAI Conference on Web and Social Media (ICWSM 2025)

  29. arXiv:2406.01862 [pdf, other]

    cs.CY

    Charting the Landscape of Nefarious Uses of Generative Artificial Intelligence for Online Election Interference

    Authors: Emilio Ferrara

    Abstract: Generative Artificial Intelligence (GenAI) and Large Language Models (LLMs) pose significant risks, particularly in the realm of online election interference. This paper explores the nefarious applications of GenAI, highlighting their potential to disrupt democratic processes through deepfakes, botnets, targeted misinformation campaigns, and synthetic identities. By examining recent case studies a…

    Submitted 4 April, 2025; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: First Monday, 2025

    Journal ref: First Monday 30(1), June 2025

  30. arXiv:2404.15457 [pdf, other]

    cs.SI

    Hidden in Plain Sight: Exploring the Intersections of Mental Health, Eating Disorders, and Content Moderation on TikTok

    Authors: Charles Bickham, Kia Kazemi-Nia, Luca Luceri, Kristina Lerman, Emilio Ferrara

    Abstract: Social media platforms actively moderate content glorifying harmful behaviors like eating disorders, which include anorexia and bulimia. However, users have adapted to evade moderation by using coded hashtags. Our study investigates the prevalence of moderation evaders on the popular social media platform TikTok and contrasts their use and emotional valence with mainstream hashtags. We notice that…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 10 pages, 5 figures, 2 tables

  31. arXiv:2402.05904 [pdf, other]

    cs.CL cs.CY cs.HC cs.SI

    FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs

    Authors: Eun Cheol Choi, Emilio Ferrara

    Abstract: Our society is facing rampant misinformation harming public health and trust. To address the societal challenge, we introduce FACT-GPT, a system leveraging Large Language Models (LLMs) to automate the claim matching stage of fact-checking. FACT-GPT, trained on a synthetic dataset, identifies social media content that aligns with, contradicts, or is irrelevant to previously debunked claims. Our eva…

    Submitted 8 February, 2024; originally announced February 2024.

  32. arXiv:2402.05882 [pdf, other]

    cs.SI cs.CY cs.HC

    GET-Tok: A GenAI-Enriched Multimodal TikTok Dataset Documenting the 2022 Attempted Coup in Peru

    Authors: Gabriela Pinto, Keith Burghardt, Kristina Lerman, Emilio Ferrara

    Abstract: TikTok is one of the largest and fastest-growing social media sites in the world. TikTok features, however, such as voice transcripts, are often missing and other important features, such as OCR or video descriptions, do not exist. We introduce the Generative AI Enriched TikTok (GET-Tok) data, a pipeline for collecting TikTok videos and enriched data by augmenting the TikTok Research API with gene…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: Github repository: https://github.com/gabbypinto/GET-Tok-Peru

  33. arXiv:2402.05873 [pdf, other]

    cs.SI cs.CY

    Coordinated Activity Modulates the Behavior and Emotions of Organic Users: A Case Study on Tweets about the Gaza Conflict

    Authors: Priyanka Dey, Luca Luceri, Emilio Ferrara

    Abstract: Social media has become a crucial conduit for the swift dissemination of information during global crises. However, this also paves the way for the manipulation of narratives by malicious actors. This research delves into the interaction dynamics between coordinated (malicious) entities and organic (regular) users on Twitter amidst the Gaza conflict. Through the analysis of approximately 3.5 milli…

    Submitted 8 February, 2024; originally announced February 2024.

  34. arXiv:2402.05865 [pdf, other]

    cs.HC cs.CY

    "Can You Play Anything Else?" Understanding Play Style Flexibility in League of Legends

    Authors: Emily Chen, Alexander Bisberg, Emilio Ferrara

    Abstract: This study investigates the concept of flexibility within League of Legends, a popular online multiplayer game, focusing on the relationship between user adaptability and team success. Utilizing a dataset encompassing players of varying skill levels and play styles, we calculate two measures of flexibility for each player: overall flexibility and temporal flexibility. Our findings suggest that the…

    Submitted 10 July, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  35. arXiv:2401.08789 [pdf, other]

    cs.SI

    Moral Values Underpinning COVID-19 Online Communication Patterns

    Authors: Julie Jiang, Luca Luceri, Emilio Ferrara

    Abstract: The COVID-19 pandemic has triggered profound societal changes, extending beyond its health impacts to the moralization of behaviors. Leveraging insights from moral psychology, this study delves into the moral fabric shaping online discussions surrounding COVID-19 over a span of nearly two years. Our investigation identifies four distinct user groups characterized by differences in morality, politi…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: 11 pages, 8 figures, 2 tables

  36. arXiv:2401.00893 [pdf, other]

    cs.SI cs.AI

    Social-LLM: Modeling User Behavior at Scale using Language Models and Social Network Data

    Authors: Julie Jiang, Emilio Ferrara

    Abstract: The proliferation of social network data has unlocked unprecedented opportunities for extensive, data-driven exploration of human behavior. The structural intricacies of social networks offer insights into various computational social science issues, particularly concerning social influence and information diffusion. However, modeling large-scale social network data comes with computational challe…

    Submitted 31 December, 2023; originally announced January 2024.

    Comments: 10 pages, 5 figures, 2 tables

  37. arXiv:2312.17423 [pdf, other]

    cs.SI

    Social Bots: Detection and Challenges

    Authors: Kai-Cheng Yang, Onur Varol, Alexander C. Nwala, Mohsen Sayyadiharikandeh, Emilio Ferrara, Alessandro Flammini, Filippo Menczer

    Abstract: While social media are a key source of data for computational social science, their ease of manipulation by malicious actors threatens the integrity of online information exchanges and their analysis. In this Chapter, we focus on malicious social bots, a prominent vehicle for such manipulation. We start by discussing recent studies about the presence and actions of social bots in various online di…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: This is a draft of the chapter. The final version will be available in the Handbook of Computational Social Science edited by Taha Yasseri, forthcoming 2024, Edward Elgar Publishing Ltd. The material cannot be used for any other purpose without further permission of the publisher and is for private use only

  38. arXiv:2311.10781 [pdf, other]

    cs.CL cs.AI

    Can Language Model Moderators Improve the Health of Online Discourse?

    Authors: Hyundong Cho, Shuai Liu, Taiwei Shi, Darpan Jain, Basem Rizk, Yuyang Huang, Zixun Lu, Nuan Wen, Jonathan Gratch, Emilio Ferrara, Jonathan May

    Abstract: Conversational moderation of online communities is crucial to maintaining civility for a constructive environment, but it is challenging to scale and harmful to moderators. The inclusion of sophisticated natural language generation modules as a force multiplier to aid human moderators is a tantalizing prospect, but adequate evaluation approaches have so far been elusive. In this paper, we establis…

    Submitted 6 May, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: 9 pages, NAACL 2024 Main

  39. arXiv:2311.09734 [pdf, other]

    cs.CL

    Tracking the Newsworthiness of Public Documents

    Authors: Alexander Spangher, Emilio Ferrara, Ben Welsh, Nanyun Peng, Serdar Tumgoren, Jonathan May

    Abstract: Journalists must find stories in huge amounts of textual data (e.g. leaks, bills, press releases) as part of their jobs: determining when and why text becomes news can help us understand coverage patterns and help us build assistive tools. Yet, this is challenging because very few labelled links exist, language use between corpora is very different, and text may be covered for a variety of reasons…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: 9 pages, 7 pages appendix

  40. arXiv:2311.07816 [pdf, other]

    cs.SI cs.AI

    Leveraging Large Language Models to Detect Influence Campaigns in Social Media

    Authors: Luca Luceri, Eric Boniardi, Emilio Ferrara

    Abstract: Social media influence campaigns pose significant challenges to public discourse and democracy. Traditional detection methods fall short due to the complexity and dynamic nature of social media. Addressing this, we propose a novel detection method using Large Language Models (LLMs) that incorporates both user metadata and network structures. By converting these elements into a text format, our app…

    Submitted 13 November, 2023; originally announced November 2023.

  41. Susceptibility to Unreliable Information Sources: Swift Adoption with Minimal Exposure

    Authors: Jinyi Ye, Luca Luceri, Julie Jiang, Emilio Ferrara

    Abstract: Misinformation proliferation on social media platforms is a pervasive threat to the integrity of online public discourse. Genuine users, susceptible to others' influence, often unknowingly engage with, endorse, and re-share questionable pieces of information, collectively amplifying the spread of misinformation. In this study, we introduce an empirical framework to investigate users' susceptibilit…

    Submitted 29 January, 2025; v1 submitted 9 November, 2023; originally announced November 2023.

    Journal ref: Proceedings of the ACM Web Conference 2024

  42. arXiv:2310.09884  [pdf, other]

    cs.SI

    Unmasking the Web of Deceit: Uncovering Coordinated Activity to Expose Information Operations on Twitter

    Authors: Luca Luceri, Valeria Pantè, Keith Burghardt, Emilio Ferrara

    Abstract: Social media platforms, particularly Twitter, have become pivotal arenas for influence campaigns, often orchestrated by state-sponsored information operations (IOs). This paper delves into the detection of key players driving IOs by employing similarity graphs constructed from behavioral pattern data. We unveil that well-known, yet underutilized network properties can help accurately identify coor…

    Submitted 15 October, 2023; originally announced October 2023.

    Comments: Accepted at the 2024 ACM Web Conference

  43. arXiv:2310.09223  [pdf, other]

    cs.CL cs.CY cs.HC

    Automated Claim Matching with Large Language Models: Empowering Fact-Checkers in the Fight Against Misinformation

    Authors: Eun Cheol Choi, Emilio Ferrara

    Abstract: In today's digital era, the rapid spread of misinformation poses threats to public well-being and societal trust. As online misinformation proliferates, manual verification by fact checkers becomes increasingly challenging. We introduce FACT-GPT (Fact-checking Augmentation with Claim matching Task-oriented Generative Pre-trained Transformer), a framework designed to automate the claim matching pha…

    Submitted 13 October, 2023; originally announced October 2023.

  44. arXiv:2310.07779  [pdf, other]

    cs.SI

    Social Approval and Network Homophily as Motivators of Online Toxicity

    Authors: Julie Jiang, Luca Luceri, Joseph B. Walther, Emilio Ferrara

    Abstract: Online hate messaging is a pervasive issue plaguing the well-being of social media users. This research empirically investigates a novel theory positing that online hate may be driven primarily by the pursuit of social approval rather than a direct desire to harm the targets. Results show that toxicity is homophilous in users' social networks and that a user's propensity for hostility can be predi…

    Submitted 29 February, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

  45. arXiv:2310.05189  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Factuality Challenges in the Era of Large Language Models

    Authors: Isabelle Augenstein, Timothy Baldwin, Meeyoung Cha, Tanmoy Chakraborty, Giovanni Luca Ciampaglia, David Corney, Renee DiResta, Emilio Ferrara, Scott Hale, Alon Halevy, Eduard Hovy, Heng Ji, Filippo Menczer, Ruben Miguez, Preslav Nakov, Dietram Scheufele, Shivam Sharma, Giovanni Zagni

    Abstract: The emergence of tools based on Large Language Models (LLMs), such as OpenAI's ChatGPT, Microsoft's Bing Chat, and Google's Bard, has garnered immense public attention. These incredibly useful, natural-sounding tools mark significant advances in natural language generation, yet they exhibit a propensity to generate false, erroneous, or misleading content -- commonly referred to as "hallucinations.…

    Submitted 9 October, 2023; v1 submitted 8 October, 2023; originally announced October 2023.

    Comments: Our article offers a comprehensive examination of the challenges and risks associated with Large Language Models (LLMs), focusing on their potential impact on the veracity of information in today's digital landscape

  46. arXiv:2310.00737  [pdf, other]

    cs.CY cs.AI cs.CL cs.HC

    GenAI Against Humanity: Nefarious Applications of Generative Artificial Intelligence and Large Language Models

    Authors: Emilio Ferrara

    Abstract: Generative Artificial Intelligence (GenAI) and Large Language Models (LLMs) are marvels of technology; celebrated for their prowess in natural language processing and multimodal content generation, they promise a transformative future. But as with all powerful tools, they come with their shadows. Picture living in a world where deepfakes are indistinguishable from reality, where synthetic identiti…

    Submitted 22 January, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

    Comments: Accepted in: Journal of Computational Social Science

    Journal ref: J Comput Soc Sc (2024)

  47. The Butterfly Effect in Artificial Intelligence Systems: Implications for AI Bias and Fairness

    Authors: Emilio Ferrara

    Abstract: The Butterfly Effect, a concept originating from chaos theory, underscores how small changes can have significant and unpredictable impacts on complex systems. In the context of AI fairness and bias, the Butterfly Effect can stem from a variety of sources, such as small biases or skewed data inputs during algorithm development, saddle points in training, or distribution shifts in data between trai…

    Submitted 2 February, 2024; v1 submitted 11 July, 2023; originally announced July 2023.

    Comments: Cite as: Machine Learning with Applications, Volume 15, 2024, 100525, doi:10.1016/j.mlwa.2024.100525

    Journal ref: Machine Learning with Applications, Volume 15, 2024, 100525

  48. arXiv:2305.19230  [pdf, other]

    cs.CL cs.AI

    Controlled Text Generation with Hidden Representation Transformations

    Authors: Vaibhav Kumar, Hana Koorehdavoudi, Masud Moshtaghi, Amita Misra, Ankit Chadha, Emilio Ferrara

    Abstract: We propose CHRT (Control Hidden Representation Transformation) - a controlled language generation framework that steers large language models to generate text pertaining to certain attributes (such as toxicity). CHRT gains attribute control by modifying the hidden representation of the base model through learned transformations. We employ a contrastive-learning framework to learn these transformat…

    Submitted 31 May, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL 2023 as a long paper (Findings)

  49. arXiv:2305.14904  [pdf, other]

    cs.CL cs.AI cs.CY

    Identifying Informational Sources in News Articles

    Authors: Alexander Spangher, Nanyun Peng, Jonathan May, Emilio Ferrara

    Abstract: News articles are driven by the informational sources journalists use in reporting. Modeling when, how and why sources get used together in stories can help us better understand the information we consume and even help journalists with the task of producing it. In this work, we take steps toward this goal by constructing the largest and widest-ranging annotated dataset, to date, of informational s…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: 13 pages

  50. Fairness And Bias in Artificial Intelligence: A Brief Survey of Sources, Impacts, And Mitigation Strategies

    Authors: Emilio Ferrara

    Abstract: The significant advancements in applying Artificial Intelligence (AI) to healthcare decision-making, medical diagnosis, and other domains have simultaneously raised concerns about the fairness and bias of AI systems. This is particularly critical in areas like healthcare, employment, criminal justice, credit scoring, and increasingly, in generative AI models (GenAI) that produce synthetic media. S…

    Submitted 7 December, 2023; v1 submitted 15 April, 2023; originally announced April 2023.

    Journal ref: Sci 2024, 6(1), 3