-
Characterizing AI-Generated Misinformation on Social Media
Authors:
Chiara Drolsbach,
Nicolas Pröllochs
Abstract:
AI-generated misinformation (e.g., deepfakes) poses a growing threat to information integrity on social media. However, prior research has largely focused on its potential societal consequences rather than its real-world prevalence. In this study, we conduct a large-scale empirical analysis of AI-generated misinformation on the social media platform X. Specifically, we analyze a dataset comprising N=91,452 misleading posts, both AI-generated and non-AI-generated, that have been identified and flagged through X's Community Notes platform. Our analysis yields four main findings: (i) AI-generated misinformation is more often centered on entertaining content and tends to exhibit a more positive sentiment than conventional forms of misinformation, (ii) it is more likely to originate from smaller user accounts, (iii) despite this, it is significantly more likely to go viral, and (iv) it is slightly less believable and harmful compared to conventional misinformation. Altogether, our findings highlight the unique characteristics of AI-generated misinformation on social media. We discuss important implications for platforms and future research.
Submitted 15 May, 2025;
originally announced May 2025.
-
Community Fact-Checks Do Not Break Follower Loyalty
Authors:
Michelle Bobek,
Nicolas Pröllochs
Abstract:
Major social media platforms increasingly adopt community-based fact-checking to address misinformation on their platforms. While previous research has largely focused on its effect on engagement (e.g., reposts, likes), an understanding of how fact-checking affects a user's follower base is missing. In this study, we employ quasi-experimental methods to causally assess whether users lose followers after their posts are corrected via community fact-checks. Based on time-series data on follower counts for N=3516 community fact-checked posts from X, we find that community fact-checks do not lead to meaningful declines in the follower counts of users who post misleading content. This suggests that followers of spreaders of misleading posts tend to remain loyal and do not view community fact-checks as a sufficient reason to disengage. Our findings underscore the need for complementary interventions to more effectively disincentivize the production of misinformation on social media.
Submitted 15 May, 2025;
originally announced May 2025.
-
References to unbiased sources increase the helpfulness of community fact-checks
Authors:
Kirill Solovev,
Nicolas Pröllochs
Abstract:
Community-based fact-checking is a promising approach to address misinformation on social media at scale. However, an understanding of what makes community-created fact-checks helpful to users is still in its infancy. In this paper, we analyze the determinants of the helpfulness of community-created fact-checks. For this purpose, we draw upon a unique dataset of real-world community-created fact-checks and helpfulness ratings from X's (formerly Twitter) Community Notes platform. Our empirical analysis implies that the key determinant of helpfulness in community-based fact-checking is whether users provide links to external sources to underpin their assertions. On average, the odds for community-created fact-checks to be perceived as helpful are 2.70 times higher if they provide links to external sources. Furthermore, we demonstrate that the helpfulness of community-created fact-checks varies depending on their level of political bias. Here, we find that community-created fact-checks linking to high-bias sources (of either political side) are perceived as significantly less helpful. This suggests that the rating mechanism on the Community Notes platform successfully penalizes one-sidedness and politically motivated reasoning. These findings have important implications for social media platforms, which can utilize our results to optimize their community-based fact-checking systems.
Submitted 13 March, 2025;
originally announced March 2025.
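As a worked illustration of the odds figure reported in the abstract above: an odds ratio of 2.70 multiplies the odds, not the probability, that a note is rated helpful. The sketch below converts a baseline probability to odds, applies the ratio, and converts back. The 10% baseline and the function name `apply_odds_ratio` are assumptions made for illustration; they do not come from the paper.

```python
def apply_odds_ratio(p_baseline: float, odds_ratio: float) -> float:
    """Return the probability implied by scaling the baseline odds by odds_ratio."""
    if not 0.0 < p_baseline < 1.0:
        raise ValueError("p_baseline must lie strictly between 0 and 1")
    baseline_odds = p_baseline / (1.0 - p_baseline)  # probability -> odds
    new_odds = baseline_odds * odds_ratio            # apply the odds ratio
    return new_odds / (1.0 + new_odds)               # odds -> probability


# Hypothetical baseline: 10% of notes without links are rated helpful.
p_with_links = apply_odds_ratio(0.10, 2.70)
print(f"{p_with_links:.3f}")  # prints 0.231
```

Under this assumed baseline, the probability rises from 10% to about 23%, which is less than 2.70 times the baseline because the ratio applies to odds rather than probabilities.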
-
Community Fact-Checks Trigger Moral Outrage in Replies to Misleading Posts on Social Media
Authors:
Yuwei Chuai,
Anastasia Sergeeva,
Gabriele Lenzini,
Nicolas Pröllochs
Abstract:
Displaying community fact-checks is a promising approach to reduce engagement with misinformation on social media. However, how users respond to misleading content emotionally after community fact-checks are displayed on posts is unclear. Here, we employ quasi-experimental methods to causally analyze changes in sentiments and (moral) emotions in replies to misleading posts following the display of community fact-checks. Our evaluation is based on a large-scale panel dataset comprising N=2,225,260 replies across 1841 source posts from X's Community Notes platform. We find that informing users about falsehoods through community fact-checks significantly increases negativity (by 7.3%), anger (by 13.2%), disgust (by 4.7%), and moral outrage (by 16.0%) in the corresponding replies. These results indicate that users perceive spreading misinformation as a violation of social norms and that those who spread misinformation should expect negative reactions once their content is debunked. We derive important implications for the design of community-based fact-checking systems.
Submitted 26 January, 2025; v1 submitted 13 September, 2024;
originally announced September 2024.
-
Community-based fact-checking reduces the spread of misleading posts on social media
Authors:
Yuwei Chuai,
Moritz Pilarski,
Thomas Renault,
David Restrepo-Amariles,
Aurore Troussel-Clément,
Gabriele Lenzini,
Nicolas Pröllochs
Abstract:
Community-based fact-checking is a promising approach to verify social media content and correct misleading posts at scale. Yet, causal evidence regarding its effectiveness in reducing the spread of misinformation on social media is missing. Here, we performed a large-scale empirical study to analyze whether community notes reduce the spread of misleading posts on X. Using a Difference-in-Differences design and repost time series data for N=237,677 (community fact-checked) cascades that had been reposted more than 431 million times, we found that exposing users to community notes reduced the spread of misleading posts by, on average, 62.0%. Furthermore, community notes increased the odds that users delete their misleading posts by 103.4%. However, our findings also suggest that community notes might be too slow to intervene in the early (and most viral) stage of the diffusion. Our work offers important implications to enhance the effectiveness of community-based fact-checking approaches on social media.
Submitted 13 September, 2024;
originally announced September 2024.
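The Difference-in-Differences design named in the abstract above compares the change in an outcome for a treated group against the change for a control group over the same period. A minimal sketch of the basic two-group, two-period estimator follows. The toy repost counts and the function name `did_estimate` are hypothetical, and the paper's actual model specification is not given in the abstract.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences on group means."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    # The control group's change stands in for what would have happened
    # to the treated group without the intervention.
    return treated_change - control_change


# Invented hourly repost counts, for illustration only.
treated_pre = [100, 120, 110]   # posts that later receive a note, before display
treated_post = [40, 50, 45]     # the same posts after the note is displayed
control_pre = [90, 100, 95]     # comparison posts, same early window
control_post = [80, 85, 90]     # comparison posts, same late window

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
# A change of -65 for treated minus a change of -10 for control gives -55.
print(effect)
```

The estimate is only interpretable as causal if the two groups would have followed parallel trends in the absence of the intervention, which is the key assumption of this design.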
-
A Fused Large Language Model for Predicting Startup Success
Authors:
Abdurahman Maarouf,
Stefan Feuerriegel,
Nicolas Pröllochs
Abstract:
Investors are continuously seeking profitable investment opportunities in startups and, hence, for effective decision-making, need to predict a startup's probability of success. Nowadays, investors can use not only various fundamental information about a startup (e.g., the age of the startup, the number of founders, and the business sector) but also textual description of a startup's innovation and business model, which is widely available through online venture capital (VC) platforms such as Crunchbase. To support the decision-making of investors, we develop a machine learning approach with the aim of locating successful startups on VC platforms. Specifically, we develop, train, and evaluate a tailored, fused large language model to predict startup success. Thereby, we assess to what extent self-descriptions on VC platforms are predictive of startup success. Using 20,172 online profiles from Crunchbase, we find that our fused large language model can predict startup success, with textual self-descriptions being responsible for a significant part of the predictive power. Our work provides a decision support tool for investors to find profitable investment opportunities.
Submitted 5 September, 2024;
originally announced September 2024.
-
Content Moderation on Social Media in the EU: Insights From the DSA Transparency Database
Authors:
Chiara Drolsbach,
Nicolas Pröllochs
Abstract:
The Digital Services Act (DSA) requires large social media platforms in the EU to provide clear and specific information whenever they remove or restrict access to certain content. These "Statements of Reasons" (SoRs) are collected in the DSA Transparency Database to ensure transparency and scrutiny of content moderation decisions of the providers of online platforms. In this work, we empirically analyze 156 million SoRs within an observation period of two months to provide an early look at content moderation decisions of social media platforms in the EU. Our empirical analysis yields the following main findings: (i) There are vast differences in the frequency of content moderation across platforms. For instance, TikTok performs more than 350 times more content moderation decisions per user than X/Twitter. (ii) Content moderation is most commonly applied to text and videos, whereas images and other content formats undergo moderation less frequently. (iii) The primary reasons for moderation include content falling outside the platform's scope of service, illegal/harmful speech, and pornography/sexualized content, with moderation of misinformation being relatively uncommon. (iv) The majority of rule-breaking content is detected and decided upon via automated means rather than manual intervention. However, X/Twitter reports that it relies solely on non-automated methods. (v) There is significant variation in the content moderation actions taken across platforms. Altogether, our study implies inconsistencies in how social media platforms implement their obligations under the DSA -- resulting in a fragmented outcome that the DSA is meant to avoid. Our findings have important implications for regulators to clarify existing guidelines or lay out more specific rules that ensure common standards on how social media providers handle rule-breaking content on their platforms.
Submitted 7 December, 2023;
originally announced December 2023.
-
Which linguistic cues make people fall for fake news? A comparison of cognitive and affective processing
Authors:
Bernhard Lutz,
Marc Adam,
Stefan Feuerriegel,
Nicolas Pröllochs,
Dirk Neumann
Abstract:
Fake news on social media has large, negative implications for society. However, little is known about what linguistic cues make people fall for fake news and, hence, how to design effective countermeasures for social media. In this study, we seek to understand which linguistic cues make people fall for fake news. Linguistic cues (e.g., adverbs, personal pronouns, positive emotion words, negative emotion words) are important characteristics of any text and also affect how people process real vs. fake news. Specifically, we compare the role of linguistic cues across both cognitive processing (related to careful thinking) and affective processing (related to unconscious automatic evaluations). To this end, we performed a within-subject experiment where we collected neurophysiological measurements of 42 subjects while they read a sample of 40 real and fake news articles. During our experiment, we measured cognitive processing through eye fixations, and affective processing in situ through heart rate variability. We find that users engage more in cognitive processing for longer fake news articles, while affective processing is more pronounced for fake news written in analytic words. To the best of our knowledge, this is the first work studying the role of linguistic cues in fake news processing. Altogether, our findings have important implications for designing online platforms that encourage users to engage in careful thinking and thus prevent them from falling for fake news.
Submitted 2 December, 2023;
originally announced December 2023.
-
Is Fact-Checking Politically Neutral? Asymmetries in How U.S. Fact-Checking Organizations Pick Up False Statements Mentioning Political Elites
Authors:
Yuwei Chuai,
Jichang Zhao,
Nicolas Pröllochs,
Gabriele Lenzini
Abstract:
Political elites play an important role in the proliferation of online misinformation. However, an understanding of how fact-checking platforms pick up politicized misinformation for fact-checking is still in its infancy. Here, we conduct an empirical analysis of mentions of U.S. political elites within fact-checked statements. For this purpose, we collect a comprehensive dataset consisting of 35,014 true and false statements that have been fact-checked by two major fact-checking organizations (Snopes, PolitiFact) in the U.S. between 2008 and 2023, i.e., within an observation period of 15 years. Subsequently, we perform content analysis and explanatory regression modeling to analyze how veracity is linked to mentions of U.S. political elites in fact-checked statements. Our analysis yields the following main findings: (i) Fact-checked false statements are, on average, 20% more likely to mention political elites than true fact-checked statements. (ii) There is a partisan asymmetry such that fact-checked false statements are 88.1% more likely to mention Democrats, but 26.5% less likely to mention Republicans, compared to fact-checked true statements. (iii) Mentions of political elites in fact-checked false statements reach the highest level during the months preceding elections. (iv) Fact-checked false statements that mention political elites carry stronger other-condemning emotions and are more likely to be pro-Republican, compared to fact-checked true statements. In sum, our study offers new insights into understanding mentions of political elites in false statements on U.S. fact-checking platforms, and bridges important findings at the intersection between misinformation and politicization.
Submitted 13 September, 2024; v1 submitted 12 October, 2023;
originally announced October 2023.
-
Did the Roll-Out of Community Notes Reduce Engagement With Misinformation on X/Twitter?
Authors:
Yuwei Chuai,
Haoye Tian,
Nicolas Pröllochs,
Gabriele Lenzini
Abstract:
Developing interventions that successfully reduce engagement with misinformation on social media is challenging. One intervention that has recently gained great attention is X/Twitter's Community Notes (previously known as "Birdwatch"). Community Notes is a crowdsourced fact-checking approach that allows users to write textual notes to inform others about potentially misleading posts on X/Twitter. Yet, empirical evidence regarding its effectiveness in reducing engagement with misinformation on social media is missing. In this paper, we perform a large-scale empirical study to analyze whether the introduction of the Community Notes feature and its roll-out to users in the U.S. and around the world have reduced engagement with misinformation on X/Twitter in terms of retweet volume and likes. We employ Difference-in-Differences (DiD) models and Regression Discontinuity Design (RDD) to analyze a comprehensive dataset consisting of all fact-checking notes and corresponding source tweets since the launch of Community Notes in early 2021. Although we observe a significant increase in the volume of fact-checks carried out via Community Notes, particularly for tweets from verified users with many followers, we find no evidence that the introduction of Community Notes significantly reduced engagement with misleading tweets on X/Twitter. Rather, our findings suggest that Community Notes might be too slow to effectively reduce engagement with misinformation in the early (and most viral) stage of diffusion. Our work emphasizes the importance of evaluating fact-checking interventions in the field and offers important implications to enhance crowdsourced fact-checking strategies on social media.
Submitted 23 August, 2024; v1 submitted 16 July, 2023;
originally announced July 2023.
-
Community Notes vs. Snoping: How the Crowd Selects Fact-Checking Targets on Social Media
Authors:
Moritz Pilarski,
Kirill Solovev,
Nicolas Pröllochs
Abstract:
Deploying links to fact-checking websites (so-called "snoping") is a common intervention that can be used by social media users to refute misleading claims. However, its real-world effect may be limited as it suffers from low visibility and distrust towards professional fact-checkers. As a remedy, Twitter launched its community-based fact-checking system Community Notes on which fact-checks are carried out by actual Twitter users and directly shown on the fact-checked tweets. Yet, an understanding of how fact-checking via Community Notes differs from snoping is absent. In this study, we analyze differences in how contributors to Community Notes and Snopers select their targets when fact-checking social media posts. For this purpose, we analyze two unique datasets from Twitter: (a) 25,912 community-created fact-checks from Twitter's Community Notes platform; and (b) 52,505 "snopes" that debunk tweets via fact-checking replies linking to professional fact-checking websites. We find that Notes contributors and Snopers focus on different targets when fact-checking social media content. For instance, Notes contributors tend to fact-check posts from larger accounts with higher social influence and are relatively less likely to endorse/emphasize the accuracy of not misleading posts. Fact-checking targets of Notes contributors and Snopers rarely overlap; however, those overlapping exhibit a high level of agreement in the fact-checking assessment. Moreover, we demonstrate that Snopers fact-check social media posts at a higher speed. Altogether, our findings imply that different fact-checking approaches -- carried out on the same social media platform -- can result in vastly different social media posts getting fact-checked. This has important implications for future research on misinformation, which should not rely on a single fact-checking approach when compiling misinformation datasets.
Submitted 16 September, 2023; v1 submitted 16 May, 2023;
originally announced May 2023.
-
Believability and Harmfulness Shape the Virality of Misleading Social Media Posts
Authors:
Chiara Drolsbach,
Nicolas Pröllochs
Abstract:
Misinformation on social media presents a major threat to modern societies. While previous research has analyzed the virality across true and false social media posts, not every misleading post is necessarily equally viral. Rather, misinformation has different characteristics and varies in terms of its believability and harmfulness - which might influence its spread. In this work, we study how the perceived believability and harmfulness of misleading posts are associated with their virality on social media. Specifically, we analyze (and validate) a large sample of crowd-annotated social media posts from Twitter's Birdwatch platform, on which users can rate the believability and harmfulness of misleading tweets. To address our research questions, we implement an explanatory regression model and link the crowd ratings for believability and harmfulness to the virality of misleading posts on Twitter. Our findings imply that misinformation that is (i) easily believable and (ii) not particularly harmful is associated with more viral resharing cascades. These results offer insights into how different kinds of crowd fact-checked misinformation spread and suggest that the most viral misleading posts are often not the ones that are particularly concerning from the perspective of public safety. From a practical view, our findings may help platforms to develop more effective strategies to curb the proliferation of misleading posts on social media.
Submitted 7 March, 2023; v1 submitted 10 February, 2023;
originally announced February 2023.
-
New threats to society from free-speech social media platforms
Authors:
Dominik Bär,
Nicolas Pröllochs,
Stefan Feuerriegel
Abstract:
In recent years, several free-speech social media platforms (so-called "alt-techs") have emerged, such as Parler, Gab, and Telegram. These platforms market themselves as alternatives to mainstream social media and proclaim "free-speech" due to the absence of content moderation, which has been attracting a large base of partisan users, extremists, and supporters of conspiracy theories. In this comment, we discuss some of the threats that emerge from such social media platforms and call for more policy efforts directed at understanding and countering the risks for society.
Submitted 2 February, 2023;
originally announced February 2023.
-
Russian propaganda on social media during the 2022 invasion of Ukraine
Authors:
Dominique Geissler,
Dominik Bär,
Nicolas Pröllochs,
Stefan Feuerriegel
Abstract:
The Russian invasion of Ukraine in February 2022 was accompanied by practices of information warfare, yet existing evidence is largely anecdotal while large-scale empirical evidence is lacking. Here, we analyze the spread of pro-Russian support on social media. For this, we collected N = 349,455 messages from Twitter with pro-Russian support. Our findings suggest that pro-Russian messages received ~251,000 retweets and thereby reached around 14.4 million users. We further provide evidence that bots played a disproportionate role in the dissemination of pro-Russian messages and amplified their proliferation in early-stage diffusion. Countries that abstained from voting on the United Nations Resolution ES-11/1 such as India, South Africa, and Pakistan showed pronounced activity of bots. Overall, 20.28% of the spreaders are classified as bots, most of which were created at the beginning of the invasion. Together, our findings suggest the presence of a large-scale Russian propaganda campaign on social media and highlight the new threats to society that originate from it. Our results also suggest that curbing bots may be an effective strategy to mitigate such campaigns.
Submitted 25 August, 2023; v1 submitted 8 November, 2022;
originally announced November 2022.
-
The Virality of Hate Speech on Social Media
Authors:
Abdurahman Maarouf,
Nicolas Pröllochs,
Stefan Feuerriegel
Abstract:
Online hate speech is responsible for violent attacks such as the Pittsburgh synagogue shooting in 2018, thereby posing a significant threat to vulnerable groups and society in general. However, little is known about what makes hate speech on social media go viral. In this paper, we collect N = 25,219 cascades with 65,946 retweets from X (formerly known as Twitter) and classify them as hateful vs. normal. Using a generalized linear regression, we then estimate differences in the spread of hateful vs. normal content based on author and content variables. We thereby identify important determinants that explain differences in the spreading of hateful vs. normal content. For example, hateful content authored by verified users is disproportionately more likely to go viral than hateful content from non-verified ones: hateful content from a verified user (as opposed to normal content) has a 3.5 times larger cascade size, a 3.2 times longer cascade lifetime, and a 1.2 times larger structural virality. Altogether, we offer novel insights into the virality of hate speech on social media.
Submitted 25 November, 2024; v1 submitted 25 October, 2022;
originally announced October 2022.
-
Mechanisms of True and False Rumor Sharing in Social Media: Collective Intelligence or Herd Behavior?
Authors:
Nicolas Pröllochs,
Stefan Feuerriegel
Abstract:
Social media platforms disseminate extensive volumes of online content, including true and, in particular, false rumors. Previous literature has studied the diffusion of offline rumors, yet more research is needed to understand the diffusion of online rumors. In this paper, we examine the role of lifetime and crowd effects in social media sharing behavior for true vs. false rumors. Based on 126,301 Twitter cascades, we find that the sharing behavior is characterized by lifetime and crowd effects that explain differences in the spread of true as opposed to false rumors. All else equal, we find that a longer lifetime is associated with less sharing activity, yet the reduction in sharing is larger for false than for true rumors. Hence, lifetime is an important determinant explaining why false rumors die out. Furthermore, we find that the spread of false rumors is characterized by herding tendencies (rather than collective intelligence), whereby false rumors proliferate more at a larger retweet depth. These findings explain differences in the diffusion dynamics of true and false rumors and further offer practical implications for social media platforms.
Submitted 21 March, 2023; v1 submitted 6 July, 2022;
originally announced July 2022.
-
Diffusion of Community Fact-Checked Misinformation on Twitter
Authors:
Chiara Drolsbach,
Nicolas Pröllochs
Abstract:
The spread of misinformation on social media is a pressing societal problem that platforms, policymakers, and researchers continue to grapple with. As a countermeasure, recent works have proposed to employ non-expert fact-checkers in the crowd to fact-check social media content. While experimental studies suggest that crowds might be able to accurately assess the veracity of social media content, an understanding of how crowd fact-checked (mis-)information spreads is missing. In this work, we empirically analyze the spread of misleading vs. not misleading community fact-checked posts on social media. For this purpose, we employ a dataset of community-created fact-checks from Twitter's Birdwatch pilot and map them to resharing cascades on Twitter. Different from earlier studies analyzing the spread of misinformation listed on third-party fact-checking websites (e.g., Snopes), we find that community fact-checked misinformation is less viral. Specifically, misleading posts are estimated to receive 36.62% fewer retweets than not misleading posts. A partial explanation may lie in differences in the fact-checking targets: community fact-checkers tend to fact-check posts from influential user accounts with many followers, while expert fact-checks tend to target posts that are shared by less influential users. We further find that there are significant differences in virality across different sub-types of misinformation (e.g., factual errors, missing context, manipulated media). Moreover, we conduct a user study to assess the perceived reliability of (real-world) community-created fact-checks. Here, we find that users, to a large extent, agree with community-created fact-checks. Altogether, our findings offer insights into how misleading vs. not misleading posts spread and highlight the crucial role of sample selection when studying misinformation on social media.
Submitted 24 March, 2023; v1 submitted 26 May, 2022;
originally announced May 2022.
-
Finding Qs: Profiling QAnon Supporters on Parler
Authors:
Dominik Bär,
Nicolas Pröllochs,
Stefan Feuerriegel
Abstract:
The social media platform "Parler" has emerged as a prominent fringe community where a significant part of the user base are self-reported supporters of QAnon, a far-right conspiracy theory alleging that a cabal of elites controls global politics. QAnon is considered to have had an influential role in the public discourse during the 2020 U.S. presidential election. However, little is known about QAnon supporters on Parler and what sets them apart from other users. Building on social identity theory, we aim to profile the characteristics of QAnon supporters on Parler. We analyze a large-scale dataset with more than 600,000 profiles of English-speaking users on Parler. Based on users' profiles, posts, and comments, we then extract a comprehensive set of user features, linguistic features, network features, and content features. This allows us to perform user profiling and understand to what extent these features discriminate between QAnon and non-QAnon supporters on Parler. Our analysis is three-fold: (1) We quantify the number of QAnon supporters on Parler, finding that 34,913 users (5.5% of all users) openly report supporting the conspiracy. (2) We examine differences between QAnon vs. non-QAnon supporters. We find that QAnon supporters differ statistically significantly from non-QAnon supporters across multiple dimensions. For example, they have, on average, a larger number of followers, followees, and posts, and thus have a large impact on the Parler network. (3) We use machine learning to identify which user characteristics discriminate QAnon from non-QAnon supporters. We find that user features, linguistic features, network features, and content features can - to a large extent - discriminate QAnon vs. non-QAnon supporters on Parler. In particular, we find that user features are highly discriminatory, followed by content features and linguistic features.
Submitted 24 April, 2023; v1 submitted 18 May, 2022;
originally announced May 2022.
-
Online Emotions During the Storming of the U.S. Capitol: Evidence from the Social Media Network Parler
Authors:
Johannes Jakubik,
Michael Vössing,
Dominik Bär,
Nicolas Pröllochs,
Stefan Feuerriegel
Abstract:
The storming of the U.S. Capitol on January 6, 2021 has led to the killing of 5 people and is widely regarded as an attack on democracy. The storming was largely coordinated through social media networks such as Parler. Yet little is known regarding how users interacted on Parler during the storming of the Capitol. In this work, we examine the emotion dynamics on Parler during the storming with regard to heterogeneity across time and users. For this, we segment the user base into different groups (e.g., Trump supporters and QAnon supporters). We use affective computing (Kratzwald et al. 2018) to infer the emotions in the contents, thereby allowing us to provide a comprehensive assessment of online emotions. Our evaluation is based on a large-scale dataset from Parler, comprising 717,300 posts from 144,003 users. We find that the user base responded to the storming of the Capitol with an overall negative sentiment. Akin to this, Trump supporters also expressed a negative sentiment and high levels of unbelief. In contrast, QAnon supporters did not express a more negative sentiment during the storming. We further provide a cross-platform analysis and compare the emotion dynamics on Parler and Twitter. Our findings point to a less negative response to the incidents on Parler than on Twitter, accompanied by higher levels of disapproval and outrage. Our contribution to research is three-fold: (1) We identify online emotions that were characteristic of the storming; (2) we assess emotion dynamics across different user groups on Parler; (3) we compare the emotion dynamics on Parler and Twitter. Thereby, our work offers important implications for actively managing online emotions to prevent similar incidents in the future.
Submitted 19 July, 2022; v1 submitted 8 April, 2022;
originally announced April 2022.
-
Moral Emotions Shape the Virality of COVID-19 Misinformation on Social Media
Authors:
Kirill Solovev,
Nicolas Pröllochs
Abstract:
While false rumors pose a threat to the successful overcoming of the COVID-19 pandemic, an understanding of how rumors diffuse in online social networks is - even for non-crisis situations - still in its infancy. Here we analyze a large sample consisting of COVID-19 rumor cascades from Twitter that have been fact-checked by third-party organizations. The data comprises N=10,610 rumor cascades that have been retweeted more than 24 million times. We investigate whether COVID-19 misinformation spreads more virally than the truth and whether the differences in the diffusion of true vs. false rumors can be explained by the moral emotions they carry. We observe that, on average, COVID-19 misinformation is more likely to go viral than truthful information. However, the veracity effect is moderated by moral emotions: false rumors are more viral than the truth if the source tweets embed a high number of other-condemning emotion words, whereas a higher number of self-conscious emotion words is linked to a less viral spread. The effects are pronounced both for health misinformation and false political rumors. These findings offer insights into how true vs. false rumors spread and highlight the importance of considering emotions from the moral emotion families in social media content.
Submitted 7 February, 2022;
originally announced February 2022.
-
Hate Speech in the Political Discourse on Social Media: Disparities Across Parties, Gender, and Ethnicity
Authors:
Kirill Solovev,
Nicolas Pröllochs
Abstract:
Social media has become an indispensable channel for political communication. However, the political discourse is increasingly characterized by hate speech, which affects not only the reputation of individual politicians but also the functioning of society at large. In this work, we empirically analyze how the amount of hate speech in replies to posts from politicians on Twitter depends on personal characteristics, such as their party affiliation, gender, and ethnicity. For this purpose, we employ Twitter's Historical API to collect every tweet posted by members of the 117th U.S. Congress for an observation period of more than six months. Additionally, we gather replies for each tweet and use machine learning to predict the amount of hate speech they embed. Subsequently, we implement hierarchical regression models to analyze whether politicians with certain characteristics receive more hate speech. We find that tweets are particularly likely to receive hate speech in replies if they are authored by (i) persons of color from the Democratic party, (ii) white Republicans, and (iii) women. Furthermore, our analysis reveals that more negative sentiment (in the source tweet) is associated with more hate speech (in replies). However, the association varies across parties: negative sentiment attracts more hate speech for Democrats (vs. Republicans). Altogether, our empirical findings imply significant differences in how politicians are treated on social media depending on their party affiliation, gender, and ethnicity.
Submitted 17 January, 2022;
originally announced January 2022.
-
Community-Based Fact-Checking on Twitter's Birdwatch Platform
Authors:
Nicolas Pröllochs
Abstract:
Misinformation undermines the credibility of social media and poses significant threats to modern societies. As a countermeasure, Twitter has recently introduced "Birdwatch," a community-driven approach to address misinformation on Twitter. On Birdwatch, users can identify tweets they believe are misleading, write notes that provide context to the tweet, and rate the quality of other users' notes. In this work, we empirically analyze how users interact with this new feature. For this purpose, we collect all Birdwatch notes and ratings between the introduction of the feature in early 2021 and the end of July 2021. We then map each Birdwatch note to the fact-checked tweet using Twitter's historical API. In addition, we use text mining methods to extract content characteristics from the text explanations in the Birdwatch notes (e.g., sentiment). Our empirical analysis yields the following main findings: (i) users more frequently file Birdwatch notes for misleading than not misleading tweets. These misleading tweets are primarily reported because of factual errors, lack of important context, or because they treat unverified claims as facts. (ii) Birdwatch notes are more helpful to other users if they link to trustworthy sources and if they embed a more positive sentiment. (iii) The social influence of the author of the source tweet is associated with differences in the level of user consensus. For influential users with many followers, Birdwatch notes yield a lower level of consensus among users and community-created fact checks are more likely to be seen as being incorrect and argumentative. Altogether, our findings can help social media platforms to formulate guidelines for users on how to write more helpful fact checks. At the same time, our analysis suggests that community-based fact-checking faces challenges regarding opinion speculation and polarization among the user base.
Submitted 13 December, 2021; v1 submitted 14 April, 2021;
originally announced April 2021.
-
Integrating Floor Plans into Hedonic Models for Rent Price Appraisal
Authors:
Kirill Solovev,
Nicolas Pröllochs
Abstract:
Online real estate platforms have become significant marketplaces facilitating users' search for an apartment or a house. Yet it remains challenging to accurately appraise a property's value. Prior works have primarily studied real estate valuation based on hedonic price models that take structured data into account while accompanying unstructured data is typically ignored. In this study, we investigate to what extent an automated visual analysis of apartment floor plans on online real estate platforms can enhance hedonic rent price appraisal. We propose a tailored two-staged deep learning approach to learn price-relevant designs of floor plans from historical price data. Subsequently, we integrate the floor plan predictions into hedonic rent price models that account for both structural and locational characteristics of an apartment. Our empirical analysis based on a unique dataset of 9174 real estate listings suggests that current hedonic models underutilize the available data. We find that (1) the visual design of floor plans has significant explanatory power regarding rent prices - even after controlling for structural and locational apartment characteristics, and (2) harnessing floor plans results in an up to 10.56% lower out-of-sample prediction error. We further find that floor plans yield a particularly high gain in prediction performance for older and smaller apartments. Altogether, our empirical findings contribute to the existing research body by establishing the link between the visual design of floor plans and real estate prices. Moreover, our approach has important implications for online real estate platforms, which can use our findings to enhance user experience in their real estate listings.
Submitted 16 February, 2021;
originally announced February 2021.
-
The Longer the Better? The Interplay Between Review Length and Line of Argumentation in Online Consumer Reviews
Authors:
Bernhard Lutz,
Nicolas Pröllochs,
Dirk Neumann
Abstract:
Review helpfulness serves as a focal point in understanding customers' purchase decision-making process on online retailer platforms. An overwhelming majority of previous works find longer reviews to be more helpful than short reviews. In this paper, we propose that longer reviews should not be assumed to be uniformly more helpful; instead, we argue that the effect depends on the line of argumentation in the review text. To test this idea, we use a large dataset of customer reviews from Amazon in combination with a state-of-the-art approach from natural language processing that allows us to study argumentation lines at sentence level. Our empirical analysis suggests that the frequency of argumentation changes moderates the effect of review length on helpfulness. Altogether, we disprove the prevailing narrative that longer reviews are uniformly perceived as more helpful. Our findings allow retailer platforms to improve their customer feedback systems and to feature more useful product reviews.
Submitted 26 September, 2019; v1 submitted 10 September, 2019;
originally announced September 2019.
-
Sentence-Level Sentiment Analysis of Financial News Using Distributed Text Representations and Multi-Instance Learning
Authors:
Bernhard Lutz,
Nicolas Pröllochs,
Dirk Neumann
Abstract:
Researchers and financial professionals require robust computerized tools that allow users to rapidly operationalize and assess the semantic textual content in financial news. However, existing methods commonly work at the document-level while deeper insights into the actual structure and the sentiment of individual sentences remain blurred. As a result, investors are required to apply the utmost attention and detailed, domain-specific knowledge in order to assess the information on a fine-grained basis. To facilitate this manual process, this paper proposes the use of distributed text representations and multi-instance learning to transfer information from the document-level to the sentence-level. Compared to alternative approaches, this method features superior predictive performance while preserving context and interpretability. Our analysis of a manually-labeled dataset yields a predictive accuracy of up to 69.90%, exceeding the performance of alternative approaches by at least 3.80 percentage points. Accordingly, this study not only benefits investors with regard to their financial decision-making, but also helps companies to communicate their messages as intended.
Submitted 31 December, 2018;
originally announced January 2019.
-
Understanding the Role of Two-Sided Argumentation in Online Consumer Reviews: A Language-Based Perspective
Authors:
Bernhard Lutz,
Nicolas Pröllochs,
Dirk Neumann
Abstract:
This paper examines the effect of two-sided argumentation on the perceived helpfulness of online consumer reviews. In contrast to previous works, our analysis thereby sheds light on the reception of reviews from a language-based perspective. For this purpose, we propose an intriguing text analysis approach based on distributed text representations and multi-instance learning to operationalize the two-sidedness of argumentation in review texts. A subsequent empirical analysis using a large corpus of Amazon reviews suggests that two-sided argumentation in reviews significantly increases their helpfulness. We find this effect to be stronger for positive reviews than for negative reviews, whereas a higher degree of emotional language weakens the effect. Our findings have immediate implications for retailer platforms, which can utilize our results to optimize their customer feedback system and to present more useful product reviews.
Submitted 24 December, 2018; v1 submitted 25 October, 2018;
originally announced October 2018.
-
Reinforcement Learning in R
Authors:
Nicolas Pröllochs,
Stefan Feuerriegel
Abstract:
Reinforcement learning refers to a group of methods from artificial intelligence where an agent performs learning through trial and error. It differs from supervised learning, since reinforcement learning requires no explicit labels; instead, the agent interacts continuously with its environment. That is, the agent starts in a specific state and then performs an action, based on which it transitions to a new state and, depending on the outcome, receives a reward. Different strategies (e.g. Q-learning) have been proposed to maximize the overall reward, resulting in a so-called policy, which defines the best possible action in each state. Mathematically, this process can be formalized by a Markov decision process and it has been implemented by packages in R; however, there is currently no package available for reinforcement learning. As a remedy, this paper demonstrates how to perform reinforcement learning in R and, for this purpose, introduces the ReinforcementLearning package. The package provides a remarkably flexible framework and is easily applied to a wide range of different problems. We demonstrate its use by drawing upon common examples from the literature (e.g. finding optimal game strategies).
Submitted 29 September, 2018;
originally announced October 2018.
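The state-action-reward loop and the Q-learning strategy mentioned in the abstract above can be illustrated with a minimal tabular sketch. This is written in Python purely for illustration and does not use or reproduce the API of the ReinforcementLearning R package; the five-state corridor environment, the function name `q_learning`, and all parameter values are assumptions chosen for the example.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical corridor of n_states cells.

    The agent starts in cell 0 and receives a reward of 1 when it
    reaches the last cell (the goal); every other step yields 0.
    """
    rng = random.Random(seed)
    moves = (-1, +1)                    # action 0 = left, action 1 = right
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy choice; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # Deterministic transition, clipped to the corridor.
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state

    # Greedy policy for every non-terminal state.
    policy = [max(range(2), key=lambda a: q[s][a]) for s in range(goal)]
    return q, policy


if __name__ == "__main__":
    q_table, policy = q_learning()
    print(policy)
```

The returned policy lists, for each non-terminal state, the index of the action with the highest learned value, which corresponds to the "best possible action in each state" described in the abstract.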
-
Investor Reaction to Financial Disclosures Across Topics: An Application of Latent Dirichlet Allocation
Authors:
Stefan Feuerriegel,
Nicolas Pröllochs
Abstract:
This paper provides a holistic study of how stock prices vary in their response to financial disclosures across different topics. Thereby, we specifically shed light on the large number of filings for which no a priori categorization of their content exists. For this purpose, we utilize an approach from data mining - namely, latent Dirichlet allocation - as a means of topic modeling. This technique facilitates our task of automatically categorizing, ex ante, the content of more than 70,000 regulatory 8-K filings from U.S. companies. We then evaluate the subsequent stock market reaction. Our empirical evidence suggests a considerable discrepancy among various types of news stories in terms of their relevance and impact on financial markets. For instance, we find a statistically significant abnormal return in response to earnings results and credit rating, but also for disclosures regarding business strategy, the health sector, as well as mergers and acquisitions. Our results yield findings that benefit managers, investors and policy-makers by indicating how regulatory filings should be structured and the topics most likely to precede changes in stock valuations.
Submitted 8 May, 2018;
originally announced May 2018.
-
Statistical Inferences for Polarity Identification in Natural Language
Authors:
Nicolas Pröllochs,
Stefan Feuerriegel,
Dirk Neumann
Abstract:
Information forms the basis for all human behavior, including the ubiquitous decision-making that people constantly perform in their everyday lives. It is thus the mission of researchers to understand how humans process information to reach decisions. In order to facilitate this task, this work proposes a novel method of studying the reception of granular expressions in natural language. The approach utilizes LASSO regularization as a statistical tool to extract decisive words from textual content and draw statistical inferences based on the correspondence between the occurrences of words and an exogenous response variable. Accordingly, the method immediately suggests significant implications for social sciences and Information Systems research: everyone can now identify text segments and word choices that are statistically relevant to authors or readers and, based on this knowledge, test hypotheses from behavioral research. We demonstrate the contribution of our method by examining how authors communicate subjective information through narrative materials. This allows us to answer the question of which words to choose when communicating negative information. On the other hand, we show that investors trade not only upon facts in financial disclosures but are distracted by filler words and non-informative language. Practitioners - for example those in the fields of investor communications or marketing - can exploit our insights to enhance their writings based on the true perception of word choice.
Submitted 5 April, 2018; v1 submitted 21 June, 2017;
originally announced June 2017.
-
Understanding Negations in Information Processing: Learning from Replicating Human Behavior
Authors:
Nicolas Pröllochs,
Stefan Feuerriegel,
Dirk Neumann
Abstract:
Information systems experience an ever-growing volume of unstructured data, particularly in the form of textual materials. This represents a rich source of information from which one can create value for people, organizations and businesses. For instance, recommender systems can benefit from automatically understanding preferences based on user reviews or social media. However, it is difficult for computer programs to correctly infer meaning from narrative content. One major challenge is negations that invert the interpretation of words and sentences. As a remedy, this paper proposes a novel learning strategy to detect negations: we apply reinforcement learning to find a policy that replicates the human perception of negations based on an exogenous response, such as a user rating for reviews. Our method yields several benefits, as it eliminates the former need for expensive and subjective manual labeling in an intermediate stage. Moreover, the inferred policy can be used to derive statistical inferences and implications regarding how humans process and act on negations.
Submitted 18 April, 2017;
originally announced April 2017.