Fame for sale: efficient detection of fake Twitter followers

Cresci, Stefano; Di Pietro, Roberto; Petrocchi, Marinella; Spognardi, Angelo; Tesconi, Maurizio

doi:10.1016/j.dss.2015.09.003

arXiv:1509.04098 (cs)

[Submitted on 14 Sep 2015 (v1), last revised 10 Nov 2015 (this version, v2)]

Abstract: $\textit{Fake followers}$ are Twitter accounts created specifically to inflate the follower count of a target account. They are dangerous for the social platform and beyond, since they may distort notions such as popularity and influence in the Twittersphere, thereby affecting the economy, politics, and society. In this paper, we contribute along several dimensions. First, we review some of the most relevant existing features and rules (proposed by Academia and Media) for detecting anomalous Twitter accounts. Second, we create a baseline dataset of verified human and fake follower accounts, which is publicly available to the scientific community. Then, we use the baseline dataset to train a set of machine-learning classifiers built over the reviewed rules and features. Our results show that most of the rules proposed by Media perform unsatisfactorily in revealing fake followers, while features proposed in the past by Academia for spam detection provide good results. Building on the most promising features, we revise the classifiers to reduce both overfitting and the cost of gathering the data needed to compute the features. The final result is a novel $\textit{Class A}$ classifier, general enough to thwart overfitting, lightweight thanks to its use of the less costly features, and still able to correctly classify more than 95% of the accounts of the original training set. Finally, we perform an information fusion-based sensitivity analysis to assess the global sensitivity of each feature employed by the classifier. The findings reported in this paper, besides being supported by a thorough experimental methodology and interesting in their own right, also pave the way for further investigation of the novel issue of fake Twitter followers.
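
The evaluation idea in the abstract (scoring a detection rule against a labeled baseline of human and fake accounts) can be sketched roughly as follows. This is a minimal illustration, not the paper's method: the `Account` fields, the `toy_rule` threshold, and the four synthetic records are all hypothetical, and the features are assumed to be the kind computable from profile data alone.

```python
from dataclasses import dataclass


@dataclass
class Account:
    # Hypothetical profile fields; the paper's actual feature set is not reproduced here.
    followers: int
    friends: int      # accounts this profile follows
    tweets: int
    has_bio: bool
    is_fake: bool     # ground-truth label from a baseline dataset


def profile_features(a: Account) -> dict:
    """Derive cheap, profile-only features (no timeline or graph crawling)."""
    return {
        "friends_followers_ratio": a.friends / max(a.followers, 1),
        "tweets": a.tweets,
        "has_bio": a.has_bio,
    }


def toy_rule(f: dict) -> bool:
    """Hypothetical rule: follows many, followed by few, and has no bio."""
    return f["friends_followers_ratio"] > 10 and not f["has_bio"]


def evaluate(accounts, rule) -> dict:
    """Score a rule against labeled accounts with accuracy, precision, recall."""
    tp = fp = tn = fn = 0
    for a in accounts:
        predicted_fake = rule(profile_features(a))
        if predicted_fake and a.is_fake:
            tp += 1
        elif predicted_fake and not a.is_fake:
            fp += 1
        elif not predicted_fake and a.is_fake:
            fn += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / len(accounts),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Synthetic records invented for this sketch; they are not from the paper's dataset.
SYNTHETIC = [
    Account(followers=3, friends=900, tweets=0, has_bio=False, is_fake=True),
    Account(followers=10, friends=500, tweets=2, has_bio=True, is_fake=True),
    Account(followers=250, friends=300, tweets=1200, has_bio=True, is_fake=False),
    Account(followers=5, friends=80, tweets=40, has_bio=False, is_fake=False),
]

metrics = evaluate(SYNTHETIC, toy_rule)
print(metrics)
```

On these made-up records the single rule misses one fake account and flags one human, which hints at why a trained classifier over several features may do better than any one hand-written rule.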

Subjects:	Social and Information Networks (cs.SI); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
ACM classes:	H.2.8
Cite as:	arXiv:1509.04098 [cs.SI]
	(or arXiv:1509.04098v2 [cs.SI] for this version)
	https://doi.org/10.48550/arXiv.1509.04098
Journal reference:	Decision Support Systems, 80, 56-71, 2015
Related DOI:	https://doi.org/10.1016/j.dss.2015.09.003

Submission history

From: Stefano Cresci
[v1] Mon, 14 Sep 2015 13:59:11 UTC (44 KB)
[v2] Tue, 10 Nov 2015 17:31:40 UTC (44 KB)
