-
Exploring the Benefits of Domain-Pretraining of Generative Large Language Models for Chemistry
Authors:
Anurag Acharya,
Shivam Sharma,
Robin Cosbey,
Megha Subramanian,
Scott Howland,
Maria Glenski
Abstract:
A proliferation of Large Language Models (the GPT series, BLOOM, LLaMA, and more) is driving forward novel development of multipurpose AI for a variety of tasks, particularly natural language processing (NLP) tasks. These models demonstrate strong performance on a range of tasks; however, there has been evidence of brittleness when applied to more niche or narrow domains where hallucinations or fluent but incorrect responses reduce performance. Given the complex nature of scientific domains, it is prudent to investigate the trade-offs of leveraging off-the-shelf versus more targeted foundation models for scientific domains. In this work, we examine the benefits of in-domain pre-training for a given scientific domain, chemistry, and compare these to open-source, off-the-shelf models with zero-shot and few-shot prompting. Our results show that not only do in-domain base models perform reasonably well on in-domain tasks in a zero-shot setting but that further adaptation using instruction fine-tuning yields impressive performance on chemistry-specific tasks such as named entity recognition and molecular formula generation.
Submitted 5 November, 2024;
originally announced November 2024.
-
EXPERT: Public Benchmarks for Dynamic Heterogeneous Academic Graphs
Authors:
Sameera Horawalavithana,
Ellyn Ayton,
Anastasiya Usenko,
Shivam Sharma,
Jasmine Eshun,
Robin Cosbey,
Maria Glenski,
Svitlana Volkova
Abstract:
Machine learning models that learn from dynamic graphs face nontrivial challenges in learning and inference as both nodes and edges change over time. The existing large-scale graph benchmark datasets that are widely used by the community primarily focus on homogeneous node and edge attributes and are static. In this work, we present a variety of large scale, dynamic heterogeneous academic graphs to test the effectiveness of models developed for multi-step graph forecasting tasks. Our novel datasets cover both context and content information extracted from scientific publications across two communities: Artificial Intelligence (AI) and Nuclear Nonproliferation (NN). In addition, we propose a systematic approach to improve the existing evaluation procedures used in the graph forecasting models.
Submitted 14 April, 2022;
originally announced April 2022.
-
Unsupervised Keyphrase Extraction via Interpretable Neural Networks
Authors:
Rishabh Joshi,
Vidhisha Balachandran,
Emily Saldanha,
Maria Glenski,
Svitlana Volkova,
Yulia Tsvetkov
Abstract:
Keyphrase extraction aims at automatically extracting a list of "important" phrases representing the key concepts in a document. Prior approaches for unsupervised keyphrase extraction resorted to heuristic notions of phrase importance via embedding clustering or graph centrality, requiring extensive domain expertise. Our work presents a simple alternative approach which defines keyphrases as document phrases that are salient for predicting the topic of the document. To this end, we propose INSPECT -- an approach that uses self-explaining models for identifying influential keyphrases in a document by measuring the predictive impact of input phrases on the downstream task of the document topic classification. We show that this novel method not only alleviates the need for ad-hoc heuristics but also achieves state-of-the-art results in unsupervised keyphrase extraction in four datasets across two domains: scientific publications and news articles.
Submitted 17 February, 2023; v1 submitted 15 March, 2022;
originally announced March 2022.
-
Identifying Causal Influences on Publication Trends and Behavior: A Case Study of the Computational Linguistics Community
Authors:
Maria Glenski,
Svitlana Volkova
Abstract:
Drawing causal conclusions from observational real-world data is a very much desired but challenging task. In this paper we present mixed-method analyses to investigate causal influences of publication trends and behavior on the adoption, persistence, and retirement of certain research foci -- methodologies, materials, and tasks that are of interest to the computational linguistics (CL) community. Our key findings highlight evidence of the transition to rapidly emerging methodologies in the research community (e.g., adoption of bidirectional LSTMs influencing the retirement of LSTMs), the persistent engagement with trending tasks and techniques (e.g., deep learning, embeddings, generative, and language models), the effect of scientist location outside the US (e.g., China) on the propensity to research languages beyond English, and the potential impact of funding for large-scale research programs. We anticipate this work to provide useful insights about publication trends and behavior and raise awareness of the potential for causal inference in computational linguistics and the broader scientific community.
Submitted 15 October, 2021;
originally announced October 2021.
-
VAINE: Visualization and AI for Natural Experiments
Authors:
Grace Guo,
Maria Glenski,
ZhuanYi Shaw,
Emily Saldanha,
Alex Endert,
Svitlana Volkova,
Dustin Arendt
Abstract:
Natural experiments are observational studies where the assignment of treatment conditions to different populations occurs by chance "in the wild". Researchers from fields such as economics, healthcare, and the social sciences leverage natural experiments to conduct hypothesis testing and causal effect estimation for treatment and outcome variables that would otherwise be costly, infeasible, or unethical. In this paper, we introduce VAINE (Visualization and AI for Natural Experiments), a visual analytics tool for identifying and understanding natural experiments from observational data. We then demonstrate how VAINE can be used to validate causal relationships, estimate average treatment effects, and identify statistical phenomena such as Simpson's paradox through two usage scenarios.
Submitted 9 September, 2021;
originally announced September 2021.
-
Leveraging Community and Author Context to Explain the Performance and Bias of Text-Based Deception Detection Models
Authors:
Galen Weld,
Ellyn Ayton,
Tim Althoff,
Maria Glenski
Abstract:
Deceptive news posts shared in online communities can be detected with NLP models, and much recent research has focused on the development of such models. In this work, we use characteristics of online communities and authors -- the context of how and where content is posted -- to explain the performance of a neural network deception detection model and identify sub-populations who are disproportionately affected by model accuracy or failure. We examine who is posting the content, and where the content is posted to. We find that while author characteristics are better predictors of deceptive content than community characteristics, both characteristics are strongly correlated with model performance. Traditional performance metrics such as F1 score may fail to capture poor model performance on isolated sub-populations such as specific authors, and as such, more nuanced evaluation of deception detection models is critical.
Submitted 27 April, 2021;
originally announced April 2021.
-
Towards Trustworthy Deception Detection: Benchmarking Model Robustness across Domains, Modalities, and Languages
Authors:
Maria Glenski,
Ellyn Ayton,
Robin Cosbey,
Dustin Arendt,
Svitlana Volkova
Abstract:
Evaluating model robustness is critical when developing trustworthy models not only to gain deeper understanding of model behavior, strengths, and weaknesses, but also to develop future models that are generalizable and robust across expected environments a model may encounter in deployment. In this paper we present a framework for measuring model robustness for an important but difficult text classification task - deceptive news detection. We evaluate model robustness to out-of-domain data, modality-specific features, and languages other than English.
Our investigation focuses on three types of models: LSTM models trained on multiple datasets (Cross-Domain), several fusion LSTM models trained with images and text and evaluated with three state-of-the-art embeddings, BERT, ELMo, and GloVe (Cross-Modality), and character-level CNN models trained on multiple languages (Cross-Language). Our analyses reveal a significant drop in performance when testing neural models on out-of-domain data and non-English languages that may be mitigated using diverse training data. We find that with additional image content as input, ELMo embeddings yield significantly fewer errors compared to BERT or GloVe. Most importantly, this work not only carefully analyzes deception model robustness but also provides a framework of these analyses that can be applied to new models or extended datasets in the future.
Submitted 23 April, 2021;
originally announced April 2021.
-
Evaluating Deception Detection Model Robustness To Linguistic Variation
Authors:
Maria Glenski,
Ellyn Ayton,
Robin Cosbey,
Dustin Arendt,
Svitlana Volkova
Abstract:
With the increasing use of machine-learning driven algorithmic judgements, it is critical to develop models that are robust to evolving or manipulated inputs. We propose an extensive analysis of model robustness against linguistic variation in the setting of deceptive news detection, an important task in the context of misinformation spread online. We consider two prediction tasks and compare three state-of-the-art embeddings to highlight consistent trends in model performance, high confidence misclassifications, and high impact failures. By measuring the effectiveness of adversarial defense strategies and evaluating model susceptibility to adversarial attacks using character- and word-perturbed text, we find that character or mixed ensemble models are the most effective defenses and that character perturbation-based attack tactics are more successful.
Submitted 23 April, 2021;
originally announced April 2021.
-
Political Bias and Factualness in News Sharing across more than 100,000 Online Communities
Authors:
Galen Weld,
Maria Glenski,
Tim Althoff
Abstract:
As civil discourse increasingly takes place online, misinformation and the polarization of news shared in online communities have become ever more relevant concerns with real world harms across our society. Studying online news sharing at scale is challenging due to the massive volume of content which is shared by millions of users across thousands of communities. Therefore, existing research has largely focused on specific communities or specific interventions, such as bans. However, understanding the prevalence and spread of misinformation and polarization more broadly, across thousands of online communities, is critical for the development of governance strategies, interventions, and community design. Here, we conduct the largest study of news sharing on reddit to date, analyzing more than 550 million links spanning 4 years. We use non-partisan news source ratings from Media Bias/Fact Check to annotate links to news sources with their political bias and factualness. We find that, compared to left-leaning communities, right-leaning communities have 105% more variance in the political bias of their news sources, and more links to relatively-more biased sources, on average. We observe that reddit users' voting and re-sharing behaviors generally decrease the visibility of extremely biased and low factual content, which receives 20% fewer upvotes and 30% fewer exposures from crossposts than more neutral or more factual content. This suggests that reddit is more resilient to low factual content than Twitter. We show that extremely biased and low factual content is very concentrated, with 99% of such content being shared in only 0.5% of communities, giving credence to the recent strategy of community-wide bans and quarantines.
Submitted 9 May, 2022; v1 submitted 16 February, 2021;
originally announced February 2021.
-
Behavior Change in Response to Subreddit Bans and External Events
Authors:
Pamela Bilo Thomas,
Daniel Riehm,
Maria Glenski,
Tim Weninger
Abstract:
As more people flock to social media to connect with others and form virtual communities, it is important to research how members of these groups interact to understand human behavior on the Web. In response to an increase in hate speech, harassment and other antisocial behaviors, many social media companies have implemented different content and user moderation policies. On Reddit, for example, communities, i.e., subreddits, are occasionally banned for violating these policies. We study the effect of these regulatory actions as well as when a community experiences a significant external event like a political election or a market crash. Overall, we find that most subreddit bans prompt a small, but statistically significant, number of active users to leave the platform; the effect of external events varies with the type of event. We conclude with a discussion on the effectiveness of the bans and wider implications for online content moderation.
Submitted 5 January, 2021;
originally announced January 2021.
-
Measure Utility, Gain Trust: Practical Advice for XAI Researcher
Authors:
Brittany Davis,
Maria Glenski,
William Sealy,
Dustin Arendt
Abstract:
Research into the explanation of machine learning models, i.e., explainable AI (XAI), has seen a commensurate exponential growth alongside deep artificial neural networks throughout the past decade. For historical reasons, explanation and trust have been intertwined. However, the focus on trust is too narrow, and has led the research community astray from tried and true empirical methods that produced more defensible scientific knowledge about people and explanations. To address this, we contribute a practical path forward for researchers in the XAI field. We recommend researchers focus on the utility of machine learning explanations instead of trust. We outline five broad use cases where explanations are useful and, for each, we describe pseudo-experiments that rely on objective empirical measurements and falsifiable hypotheses. We believe that this experimental rigor is necessary to contribute to scientific knowledge in the field of XAI.
Submitted 27 September, 2020;
originally announced September 2020.
-
Adjusting for Confounders with Text: Challenges and an Empirical Evaluation Framework for Causal Inference
Authors:
Galen Weld,
Peter West,
Maria Glenski,
David Arbour,
Ryan Rossi,
Tim Althoff
Abstract:
Causal inference studies using textual social media data can provide actionable insights on human behavior. Making accurate causal inferences with text requires controlling for confounding which could otherwise impart bias. Recently, many different methods for adjusting for confounders have been proposed, and we show that these existing methods disagree with one another on two datasets inspired by previous social media studies. Evaluating causal methods is challenging, as ground truth counterfactuals are almost never available. Presently, no empirical evaluation framework for causal methods using text exists, and as such, practitioners must select their methods without guidance. We contribute the first such framework, which consists of five tasks drawn from real world studies. Our framework enables the evaluation of any causal inference method using text. Across 648 experiments and two datasets, we evaluate every commonly used causal inference method and identify their strengths and weaknesses to inform social media researchers seeking to use such methods, and guide future improvements. We make all tasks, data, and models public to inform applications and encourage additional research.
Submitted 6 May, 2022; v1 submitted 21 September, 2020;
originally announced September 2020.
-
CrossCheck: Rapid, Reproducible, and Interpretable Model Evaluation
Authors:
Dustin Arendt,
Zhuanyi Huang,
Prasha Shrestha,
Ellyn Ayton,
Maria Glenski,
Svitlana Volkova
Abstract:
Evaluation beyond aggregate performance metrics, e.g. F1-score, is crucial to both establish an appropriate level of trust in machine learning models and identify future model improvements. In this paper we demonstrate CrossCheck, an interactive visualization tool for rapid cross-model comparison and reproducible error analysis. We describe the tool and discuss design and implementation details. We then present three use cases (named entity recognition, reading comprehension, and clickbait detection) that show the benefits of using the tool for model evaluation. CrossCheck allows data scientists to make informed decisions to choose between multiple models, identify when the models are correct and for which examples, investigate whether the models are making the same mistakes as humans, evaluate models' generalizability and highlight models' limitations, strengths and weaknesses. Furthermore, CrossCheck is implemented as a Jupyter widget, which allows rapid and convenient integration into data scientists' model development workflows.
Submitted 16 April, 2020;
originally announced April 2020.
-
Multilingual Multimodal Digital Deception Detection and Disinformation Spread across Social Platforms
Authors:
Maria Glenski,
Ellyn Ayton,
Josh Mendoza,
Svitlana Volkova
Abstract:
Our main contribution in this work is novel results of multilingual models that go beyond typical applications of rumor or misinformation detection in English social news content to identify fine-grained classes of digital deception across multiple languages (e.g. Russian, Spanish, etc.). In addition, we present models for multimodal deception detection from images and text and discuss the limitations of image only and text only models. Finally, we elaborate on the ongoing work on measuring deceptive content (in particular disinformation) spread across social platforms.
Submitted 12 September, 2019;
originally announced September 2019.
-
Improved Forecasting of Cryptocurrency Price using Social Signals
Authors:
Maria Glenski,
Tim Weninger,
Svitlana Volkova
Abstract:
Social media signals have been successfully used to develop large-scale predictive and anticipatory analytics, for example, forecasting stock market prices and influenza outbreaks. Recently, social data has been explored to forecast price fluctuations of cryptocurrencies, which are a novel disruptive technology with significant political and economic implications. In this paper we leverage and contrast the predictive power of social signals, specifically user behavior and communication patterns, from multiple social platforms, GitHub and Reddit, to forecast prices for three cryptocurrencies with high developer and community interest - Bitcoin, Ethereum, and Monero. We evaluate the performance of neural network models that rely on long short-term memory units (LSTMs) trained on historical price data and social data against price only LSTMs and baseline autoregressive integrated moving average (ARIMA) models, commonly used to predict stock prices. Our results not only demonstrate that social signals reduce error when forecasting daily coin price, but also show that the language used in comments within the official communities on Reddit (r/Bitcoin, r/Ethereum, and r/Monero) is the best predictor overall. We observe that models are more accurate in forecasting price one day ahead for Bitcoin (4% root mean squared percent error) compared to Ethereum (7%) and Monero (8%).
Submitted 1 July, 2019;
originally announced July 2019.
-
Propagation from Deceptive News Sources: Who Shares, How Much, How Evenly, and How Quickly?
Authors:
Maria Glenski,
Tim Weninger,
Svitlana Volkova
Abstract:
As people rely on social media as their primary sources of news, the spread of misinformation has become a significant concern. In this large-scale study of news in social media we analyze eleven million posts and investigate propagation behavior of users that directly interact with news accounts identified as spreading trusted versus malicious content. Unlike previous work, which looks at specific rumors, topics, or events, we consider all content propagated by various news sources. Moreover, we analyze and contrast population versus sub-population behaviour (by demographics) when spreading misinformation, and distinguish between two types of propagation, i.e., direct retweets and mentions. Our evaluation examines how evenly, how many, how quickly, and which users propagate content from various types of news sources on Twitter.
Our analysis has identified several key differences in propagation behavior from trusted versus suspicious news sources. These include high inequity in the diffusion rate based on the source of disinformation, with a small group of highly active users responsible for the majority of disinformation spread overall and within each demographic. Analysis by demographics showed that users with lower annual income and education share more from disinformation sources compared to their counterparts. News content is shared significantly more quickly from trusted, conspiracy, and disinformation sources compared to clickbait and propaganda. Older users propagate news from trusted sources more quickly than younger users, but they share from suspicious sources after longer delays. Finally, users who interact with clickbait and conspiracy sources are likely to share from propaganda accounts, but not the other way around.
Submitted 9 December, 2018;
originally announced December 2018.
-
GuessTheKarma: A Game to Assess Social Rating Systems
Authors:
Maria Glenski,
Greg Stoddard,
Paul Resnick,
Tim Weninger
Abstract:
Popularity systems, like Twitter retweets, Reddit upvotes, and Pinterest pins have the potential to guide people toward posts that others liked. That, however, creates a feedback loop that reduces their informativeness: items marked as more popular get more attention, so that additional upvotes and retweets may simply reflect the increased attention and not independent information about the fraction of people that like the items. How much information remains? For example, how confident can we be that more people prefer item A to item B if item A had hundreds of upvotes on Reddit and item B had only a few? We investigate using an Internet game called GuessTheKarma that collects independent preference judgments (N=20,674) for 400 pairs of images, approximately 50 per pair. Unlike the rating systems that dominate social media services, GuessTheKarma is devoid of social and ranking effects that influence ratings. Overall, Reddit scores were not very good predictors of the true population preferences for items as measured by GuessTheKarma: the image with higher score was preferred by a majority of independent raters only 68% of the time. However, when one image had a low score and the other was one of the highest scoring in its subreddit, the higher scoring image was preferred nearly 90% of the time by the majority of independent raters. Similarly, Imgur view counts for the images were poor predictors except when there were orders of magnitude differences between the pairs. We conclude that popularity systems marked by feedback loops may convey a strong signal about population preferences, but only when comparing items that received vastly different popularity scores.
Submitted 3 September, 2018;
originally announced September 2018.
-
How Humans versus Bots React to Deceptive and Trusted News Sources: A Case Study of Active Users
Authors:
Maria Glenski,
Tim Weninger,
Svitlana Volkova
Abstract:
Society's reliance on social media as a primary source of news has spawned a renewed focus on the spread of misinformation. In this work, we identify the differences in how social media accounts identified as bots react to news sources of varying credibility, regardless of the veracity of the content those sources have shared. We analyze bot and human responses annotated using a fine-grained model that labels responses as being an answer, appreciation, agreement, disagreement, an elaboration, humor, or a negative reaction. We present key findings of our analysis into the prevalence of bots, the variety and speed of bot and human reactions, and the disparity in authorship of reaction tweets between these two sub-populations. We observe that bots are responsible for 9-15% of the reactions to sources of any given type but comprise only 7-10% of accounts responsible for reaction-tweets; trusted news sources have the highest proportion of humans who reacted; bots respond with significantly shorter delays than humans when posting answer-reactions in response to sources identified as propaganda. Finally, we report significantly different inequality levels in reaction rates for accounts identified as bots vs not.
Submitted 13 July, 2018;
originally announced July 2018.
-
Identifying and Understanding User Reactions to Deceptive and Trusted Social News Sources
Authors:
Maria Glenski,
Tim Weninger,
Svitlana Volkova
Abstract:
In the age of social news, it is important to understand the types of reactions that are evoked from news sources with various levels of credibility. In the present work we seek to better understand how users react to trusted and deceptive news sources across two popular, and very different, social media platforms. To that end, (1) we develop a model to classify user reactions into one of nine types, such as answer, elaboration, and question, and (2) we measure the speed and the type of reaction for trusted and deceptive news sources for 10.8M Twitter posts and 6.2M Reddit comments. We show that there are significant differences in the speed and the type of reactions between trusted and deceptive news sources on Twitter, but far smaller differences on Reddit.
Submitted 30 May, 2018;
originally announced May 2018.
-
Fishing for Clickbaits in Social Images and Texts with Linguistically-Infused Neural Network Models
Authors:
Maria Glenski,
Ellyn Ayton,
Dustin Arendt,
Svitlana Volkova
Abstract:
This paper presents the results and conclusions of our participation in the Clickbait Challenge 2017 on automatic clickbait detection in social media. We first describe linguistically-infused neural network models and identify informative representations to predict the level of clickbaiting present in Twitter posts. Our models answer not only whether a post is clickbait, but also to what extent (e.g., not at all, slightly, considerably, or heavily clickbaity), using a score ranging from 0 to 1. We evaluate the predictive power of models trained on varied text and image representations extracted from tweets. Our best performing model, which relies on the tweet text and linguistic markers of biased language extracted from the tweet and the corresponding page, yields a mean squared error (MSE) of 0.04, a mean absolute error (MAE) of 0.16, and an R2 of 0.43 on the held-out test data. For the binary classification setup (clickbait vs. non-clickbait), our model achieved an F1 score of 0.69. We have not yet found that image representations combined with text yield a significant performance improvement. Nevertheless, this work is the first to present a preliminary analysis of objects extracted using the Google TensorFlow object detection API from images in clickbait vs. non-clickbait Twitter posts. Finally, we outline several steps to improve model performance as part of future work.
Submitted 17 October, 2017;
originally announced October 2017.
-
Predicting User-Interactions on Reddit
Authors:
Maria Glenski,
Tim Weninger
Abstract:
In order to keep up with the demand of curating the deluge of crowd-sourced content, social media platforms leverage user interaction feedback to make decisions about which content to display, highlight, and hide. User interactions such as likes, votes, clicks, and views are assumed to be a proxy of a content's quality, popularity, or news-worthiness. In this paper we ask: how predictable are the interactions of a user on social media? To answer this question we recorded the clicking, browsing, and voting behavior of 186 Reddit users over a year. We present interesting descriptive statistics about their combined 339,270 interactions, and we find that relatively simple models are able to predict users' individual browse- or vote-interactions with reasonable accuracy.
Submitted 1 July, 2017;
originally announced July 2017.
-
Consumers and Curators: Browsing and Voting Patterns on Reddit
Authors:
Maria Glenski,
Corey Pennycuff,
Tim Weninger
Abstract:
As crowd-sourced curation of news and information becomes the norm, it is important to understand not only how individuals consume information through social news Web sites, but also how they contribute to their ranking systems. In the present work, we introduce and make available a new dataset containing the activity logs that recorded all activity for 309 Reddit users for one year. Using this newly collected data, we present findings that highlight the browsing and voting behavior of the study's participants. We find that most users do not read the article that they vote on, and that, in total, 73% of posts were rated (i.e., upvoted or downvoted) without first viewing the content. We also show evidence of cognitive fatigue in the browsing sessions of users that are most likely to vote.
Submitted 15 March, 2017;
originally announced March 2017.
-
Rating Effects on Social News Posts and Comments
Authors:
Maria Glenski,
Tim Weninger
Abstract:
At a time when information seekers first turn to digital sources for news and opinion, it is critical that we understand the role that social media plays in human behavior. This is especially true when information consumers also act as information producers and editors through their online activity. In order to better understand the effects that editorial ratings have on online human behavior, we report the results of two large-scale in-vivo experiments in social media. We find that small, random rating manipulations on social media posts and comments created significant changes in downstream ratings resulting in significantly different final outcomes. We found positive herding effects for positive treatments on posts, increasing the final rating by 11.02% on average, but not for positive treatments on comments. Contrary to the results of related work, we found negative herding effects for negative treatments on posts and comments, decreasing the final ratings, on average, of posts by 5.15% and of comments by 37.4%. Compared to the control group, the probability of reaching a high rating (>=2000) for posts is increased by 24.6% when posts receive the positive treatment and for comments is decreased by 46.6% when comments receive the negative treatment.
Submitted 20 June, 2016;
originally announced June 2016.
-
Random Voting Effects in Social-Digital Spaces: A case study of Reddit Post Submissions
Authors:
Maria Glenski,
Thomas J. Johnston,
Tim Weninger
Abstract:
At a time when information seekers first turn to digital sources for news and opinion, it is critical that we understand the role that social media plays in human behavior. This is especially true when information consumers also act as information producers and editors through their online activity. In order to better understand the effects that editorial ratings have on online human behavior, we report the results of a large-scale in-vivo experiment in social media. We find that small, random rating manipulations on social media submissions created significant changes in downstream ratings resulting in significantly different final outcomes. Positive treatment resulted in a positive effect that increased the final rating by 11.02% on average. Compared to the control group, positive treatment also increased the probability of reaching a high rating (>=2000) by 24.6%. Contrary to the results of related work, we also find that negative treatment resulted in a negative effect that decreased the final rating by 5.15% on average.
Submitted 5 June, 2015;
originally announced June 2015.