-
J-Guard: Journalism Guided Adversarially Robust Detection of AI-generated News
Authors:
Tharindu Kumarage,
Amrita Bhattacharjee,
Djordje Padejski,
Kristy Roschke,
Dan Gillmor,
Scott Ruston,
Huan Liu,
Joshua Garland
Abstract:
The rapid proliferation of AI-generated text online is profoundly reshaping the information landscape. Among various types of AI-generated text, AI-generated news presents a significant threat as it can be a prominent source of misinformation online. While several recent efforts have focused on detecting AI-generated text in general, these methods require enhanced reliability, given concerns about their vulnerability to simple adversarial attacks. Furthermore, due to the eccentricities of news writing, applying these detection methods for AI-generated news can produce false positives, potentially damaging the reputation of news organizations. To address these challenges, we leverage the expertise of an interdisciplinary team to develop a framework, J-Guard, capable of steering existing supervised AI text detectors for detecting AI-generated news while boosting adversarial robustness. By incorporating stylistic cues inspired by the unique journalistic attributes, J-Guard effectively distinguishes between real-world journalism and AI-generated news articles. Our experiments on news articles generated by a vast array of AI models, including ChatGPT (GPT3.5), demonstrate the effectiveness of J-Guard in enhancing detection capabilities while maintaining an average performance decrease of as low as 7% when faced with adversarial attacks.
Submitted 6 September, 2023;
originally announced September 2023.
-
Stylometric Detection of AI-Generated Text in Twitter Timelines
Authors:
Tharindu Kumarage,
Joshua Garland,
Amrita Bhattacharjee,
Kirill Trapeznikov,
Scott Ruston,
Huan Liu
Abstract:
Recent advancements in pre-trained language models have enabled convenient methods for generating human-like text at a large scale. Though these generation capabilities hold great potential for breakthrough applications, they can also be a tool for an adversary to generate misinformation. In particular, social media platforms like Twitter are highly susceptible to AI-generated misinformation. A potential threat scenario is when an adversary hijacks a credible user account and incorporates a natural language generator to generate misinformation. Such threats necessitate automated detectors for AI-generated tweets in a given user's Twitter timeline. However, tweets are inherently short, thus making it difficult for current state-of-the-art pre-trained language model-based detectors to accurately detect at what point the AI starts to generate tweets in a given Twitter timeline. In this paper, we present a novel algorithm using stylometric signals to aid in detecting AI-generated tweets. We propose models that quantify stylistic changes in human and AI tweets in two related tasks: Task 1 - discriminate between human and AI-generated tweets, and Task 2 - detect if and when an AI starts to generate tweets in a given Twitter timeline. Our extensive experiments demonstrate that the stylometric features are effective in augmenting the state-of-the-art AI-generated text detectors.
Submitted 7 March, 2023;
originally announced March 2023.
-
An Agenda for Disinformation Research
Authors:
Nadya Bliss,
Elizabeth Bradley,
Joshua Garland,
Filippo Menczer,
Scott W. Ruston,
Kate Starbird,
Chris Wiggins
Abstract:
In the 21st Century information environment, adversarial actors use disinformation to manipulate public opinion. The distribution of false, misleading, or inaccurate information with the intent to deceive is an existential threat to the United States--distortion of information erodes trust in the socio-political institutions that are the fundamental fabric of democracy: legitimate news sources, scientists, experts, and even fellow citizens. As a result, it becomes difficult for society to come together within a shared reality, the common ground needed to function effectively as an economy and a nation. Computing and communication technologies have facilitated the exchange of information at unprecedented speeds and scales. This has had countless benefits to society and the economy, but it has also played a fundamental role in the rising volume, variety, and velocity of disinformation. Technological advances have created new opportunities for manipulation, influence, and deceit. They have effectively lowered the barriers to reaching large audiences, diminishing the role of traditional mass media along with the editorial oversight they provided. The digitization of information exchange, however, also makes the practices of disinformation detectable, the networks of influence discernible, and suspicious content characterizable. New tools and approaches must be developed to leverage these affordances to understand and address this growing challenge.
Submitted 15 December, 2020;
originally announced December 2020.
-
Leveraging Multi-Source Weak Social Supervision for Early Detection of Fake News
Authors:
Kai Shu,
Guoqing Zheng,
Yichuan Li,
Subhabrata Mukherjee,
Ahmed Hassan Awadallah,
Scott Ruston,
Huan Liu
Abstract:
Social media has greatly enabled people to participate in online activities at an unprecedented rate. However, this unrestricted access also exacerbates the spread of misinformation and fake news online, which might cause confusion and chaos unless detected early and mitigated. Given the rapidly evolving nature of news events and the limited amount of annotated data, state-of-the-art systems for fake news detection face challenges because the large numbers of annotated training instances they need are hard to come by for early detection. In this work, we exploit multiple weak signals from different sources given by user and content engagements (referred to as weak social supervision), and their complementary utilities to detect fake news. We jointly leverage the limited amount of clean data along with weak signals from social engagements to train deep neural networks in a meta-learning framework to estimate the quality of different weak instances. Experiments on real-world datasets demonstrate that the proposed framework outperforms state-of-the-art baselines for early detection of fake news without using any user engagements at prediction time.
Submitted 3 April, 2020;
originally announced April 2020.
-
A Feature-Driven Approach for Identifying Pathogenic Social Media Accounts
Authors:
Hamidreza Alvari,
Ghazaleh Beigi,
Soumajyoti Sarkar,
Scott W. Ruston,
Steven R. Corman,
Hasan Davulcu,
Paulo Shakarian
Abstract:
Over the past few years, we have observed different media outlets' attempts to shift public opinion by framing information to support a narrative that facilitates their goals. Malicious users referred to as "pathogenic social media" (PSM) accounts are more likely to amplify this phenomenon by spreading misinformation to viral proportions. Understanding the spread of misinformation from an account-level perspective is thus a pressing problem. In this work, we aim to present a feature-driven approach to detect PSM accounts in social media. Inspired by the literature, we set out to assess PSMs from three broad perspectives: (1) user-related information (e.g., user activity, profile characteristics), (2) source-related information (i.e., information linked via URLs shared by users) and (3) content-related information (e.g., tweet characteristics). For the user-related information, we investigate malicious signals using causality analysis (i.e., whether a user is frequently a cause of viral cascades) and profile characteristics (e.g., number of followers). For the source-related information, we explore various malicious properties linked to URLs (e.g., URL address, content of the associated website). Finally, for the content-related information, we examine attributes (e.g., number of hashtags, suspicious hashtags) from tweets posted by users. Experiments on real-world Twitter data from different countries demonstrate the effectiveness of the proposed approach in identifying PSM users.
Submitted 13 January, 2020;
originally announced January 2020.