-
Quantifying the Influence of User Behaviors on the Dissemination of Fake News on Twitter with Multivariate Hawkes Processes
Authors:
Yichen Jiang,
Michael D. Porter
Abstract:
Fake news has emerged as a pervasive problem within Online Social Networks, leading to a surge of research interest in this area. Understanding the dissemination mechanisms of fake news is crucial in comprehending the propagation of disinformation/misinformation and its impact on users in Online Social Networks. This knowledge can facilitate the development of interventions to curtail the spread of false information and inform affected users to remain vigilant against fraudulent/malicious content. In this paper, we specifically target the Twitter platform and propose a Multivariate Hawkes Point Processes model that incorporates essential factors such as user networks, response tweet types, and user stances as model parameters. Our objective is to investigate and quantify their influence on the dissemination process of fake news. We derive parameter estimation expressions using an Expectation Maximization algorithm and validate them on a simulated dataset. Furthermore, we conduct a case study using a real dataset of fake news collected from Twitter to explore the impact of user stances and tweet types on dissemination patterns. This analysis provides valuable insights into how users are influenced by or influence the dissemination process of disinformation/misinformation, and demonstrates how our model can aid in intervening in this process.
Submitted 26 August, 2023;
originally announced August 2023.
-
A tale of two metrics: Polling and financial contributions as a measure of performance
Authors:
Moeen Mostafavi,
Maria Phillips,
Yichen Jiang,
Michael D. Porter,
Paul Freedman
Abstract:
Campaign analysis is an integral part of American democracy and has many complexities in its dynamics. Experts have long sought to understand these dynamics and evaluate campaign performance using a variety of techniques. We explore campaign financing and standing in the polls as two components of campaign performance in the context of the 2020 Democratic primaries. We show where these measures exhibit similar dynamics and where they differ. We focus on identifying change points in the trend for all candidates using joinpoint regression models. We find that these change points identify major events such as failure or success in a debate. Joinpoint regression reveals who the voters support when they stop supporting a specific candidate. This study demonstrates the value of joinpoint regression in political campaign analysis and represents a crossover of this technique into the political domain, building a foundation for continued exploration and use of this method.
Submitted 24 March, 2021;
originally announced March 2021.
-
Detecting, identifying, and localizing radiological material in urban environments using scan statistics
Authors:
Michael D. Porter,
Alphonse Akakpo
Abstract:
A method is proposed, based on scan statistics, to detect, identify, and localize illicit radiological material using mobile sensors in an urban environment. Our method handles varying levels of background radiation that change according to an (unknown) environment. Our method can accurately determine if a source is present along a street segment as well as identify which of six possible sources generated the radiation. Our method can also localize the source, when detected, to within a few seconds. We present our results across a range of decision thresholds, allowing stakeholders to evaluate the performance at different false alarm rates. Due to the simplicity of our approach, our models can be trained in a few minutes with very little training data and hold the potential to score a run in real time. Our method was one of the top performing submissions in the 'Detecting Radiological Threats in Urban Areas' competition.
Submitted 8 February, 2020;
originally announced February 2020.
-
Optimal Bayesian clustering using non-negative matrix factorization
Authors:
Ketong Wang,
Michael D. Porter
Abstract:
Bayesian model-based clustering is a widely applied procedure for discovering groups of related observations in a dataset. These approaches use Bayesian mixture models, estimated with MCMC, which provide posterior samples of the model parameters and clustering partition. While inference on model parameters is well established, inference on the clustering partition is less developed. A new method is developed for estimating the optimal partition from the pairwise posterior similarity matrix generated by a Bayesian cluster model. This approach uses non-negative matrix factorization (NMF) to provide a low-rank approximation to the similarity matrix. The factorization permits hard or soft partitions and is shown to perform better than several popular alternatives under a variety of penalty functions.
Submitted 20 September, 2018;
originally announced September 2018.
-
Modelling the Proliferation of Terrorism via Diffusion and Contagion
Authors:
Gentry White,
Fabrizio Ruggeri,
Michael D. Porter
Abstract:
The proliferation of terrorism is a serious concern in national and international security, as its spread is seen as an existential threat to Western liberal democracies. Understanding and effectively modelling the spread of terrorism provides useful insight into formulating effective responses. A mathematical model capturing the theoretical constructs of contagion and diffusion is constructed for explaining the spread of terrorist activity and used to analyse data from the Global Terrorism Database from 2000–2016 for Afghanistan, Iraq, and Israel.
Submitted 11 February, 2019; v1 submitted 7 December, 2016;
originally announced December 2016.
-
A Statistical Approach to Crime Linkage
Authors:
Michael D. Porter
Abstract:
The object of this paper is to develop a statistical approach to criminal linkage analysis that discovers and groups crime events that share a common offender and prioritizes suspects for further investigation. Bayes factors are used to describe the strength of evidence that two crimes are linked. Using concepts from agglomerative hierarchical clustering, the Bayes factors for crime pairs are combined to provide similarity measures for comparing two crime series. This facilitates crime series clustering, crime series identification, and suspect prioritization. The ability of our models to make correct linkages and predictions is demonstrated under a variety of real-world scenarios with a large number of solved and unsolved breaking and entering crimes. For example, a naïve Bayes model for pairwise case linkage can identify 82% of actual linkages with a 5% false positive rate. For crime series identification, 77%–89% of the additional crimes in a crime series can be identified from a ranked list of 50 incidents.
Submitted 8 October, 2014;
originally announced October 2014.
-
Discussion of "Estimating the historical and future probabilities of large terrorist events" by Aaron Clauset and Ryan Woodard
Authors:
Brian J. Reich,
Michael D. Porter
Abstract:
Discussion of "Estimating the historical and future probabilities of large terrorist events" by Aaron Clauset and Ryan Woodard [arXiv:1209.0089].
Submitted 10 January, 2014;
originally announced January 2014.
-
Self-exciting hurdle models for terrorist activity
Authors:
Michael D. Porter,
Gentry White
Abstract:
A predictive model of terrorist activity is developed by examining the daily number of terrorist attacks in Indonesia from 1994 through 2007. The dynamic model employs a shot noise process to explain the self-exciting nature of the terrorist activities. This estimates the probability of future attacks as a function of the times since the past attacks. In addition, the excess of nonattack days coupled with the presence of multiple coordinated attacks on the same day compelled the use of hurdle models to jointly model the probability of an attack day and corresponding number of attacks. A power law distribution with a shot noise driven parameter best modeled the number of attacks on an attack day. Interpretation of the model parameters is discussed and predictive performance of the models is evaluated.
Submitted 16 March, 2012;
originally announced March 2012.
-
Mixture Likelihood Ratio Scan Statistic for Disease Outbreak Detection
Authors:
Michael D. Porter,
Jarad B. Niemi,
Brian J. Reich
Abstract:
Early detection of disease outbreaks is of paramount importance to implementing intervention strategies to mitigate the severity and duration of the outbreak. We build methodology that utilizes the characteristic profile of disease outbreaks to reduce the time to detection and false positive rate. We model daily counts through a Poisson distribution with additive background plus outbreak components. The outbreak component has a parametric form with unknown underlying parameters. A mixture likelihood ratio scan statistic is developed to maximize parameters over a window in time. This provides an alert statistic with early time to detection and low false positive rate. The methodology is demonstrated on three simulated data sets meant to represent E. coli, Cryptosporidium, and Influenza outbreaks.
Submitted 18 January, 2012;
originally announced January 2012.