Building High-Quality Auction Fraud Dataset

Elshaar, Sulaf; Sadaoui, Samira

doi:10.5539/cis.v12n4p1

arXiv:1906.04272 (cs)

[Submitted on 10 Jun 2019 (v1), last revised 23 Aug 2019 (this version, v3)]

Title: Building High-Quality Auction Fraud Dataset

Authors: Sulaf Elshaar, Samira Sadaoui

Abstract: Given the magnitude of online auction transactions, it is difficult to safeguard consumers from dishonest sellers, such as shill bidders. To date, the application of Machine Learning Techniques (MLTs) to auction fraud has been limited, unlike their applications for combatting other types of fraud. Shill Bidding (SB) is a severe auction fraud, which is driven by modern-day technologies and clever scammers. The difficulty of identifying the behavior of sophisticated fraudsters and the unavailability of training datasets hinder the research on SB detection. In this study, we developed a high-quality SB dataset. To do so, first, we crawled a large number of commercial auctions as well as the bidders' histories. We thoroughly preprocessed both datasets to make them usable for the computation of the SB metrics. This operation requires a deep understanding of the behavior of auctions and bidders. Second, we introduced two new SB patterns and implemented other existing SB patterns. Finally, we removed outliers to improve the quality of training SB data.
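
The abstract's final step, removing outliers from the computed SB data, can be illustrated with a minimal sketch. The abstract does not say which outlier method the authors used, so the example below swaps in a common interquartile-range (Tukey fence) filter as an assumption. The record layout, the `bids_per_auction` field, and the sample values are hypothetical and are not taken from the paper.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return the (low, high) Tukey fences for a list of numbers."""
    # quantiles() sorts the data itself; n=4 yields the three quartile cut points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def drop_outliers(records, metric, k=1.5):
    """Keep only the records whose `metric` value lies inside the fences."""
    low, high = iqr_bounds([r[metric] for r in records], k)
    return [r for r in records if low <= r[metric] <= high]


# Hypothetical per-bidder rows; the field names are illustrative only.
rows = [
    {"bidder": "a", "bids_per_auction": 2.0},
    {"bidder": "b", "bids_per_auction": 3.0},
    {"bidder": "c", "bids_per_auction": 2.5},
    {"bidder": "d", "bids_per_auction": 3.5},
    {"bidder": "e", "bids_per_auction": 40.0},
]

cleaned = drop_outliers(rows, "bids_per_auction")
print([r["bidder"] for r in cleaned])
```

The multiplier `k=1.5` is the conventional default for Tukey fences. Whether a given extreme value is noise or a genuine fraud signal is a judgment the paper itself would have to settle, so this filter is only a stand-in for that step.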

Comments:	10 pages
Subjects:	Machine Learning (cs.LG); Computers and Society (cs.CY)
Cite as:	arXiv:1906.04272 [cs.LG]
	(or arXiv:1906.04272v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1906.04272
Journal reference:	Computer and Information Science; Vol. 12, No. 4; 2019 ISSN 1913-8989 E-ISSN 1913-8997 Published by Canadian Center of Science and Education
Related DOI:	https://doi.org/10.5539/cis.v12n4p1

Submission history

From: Sulaf Elshaar
[v1] Mon, 10 Jun 2019 21:03:30 UTC (382 KB)
[v2] Fri, 28 Jun 2019 22:54:32 UTC (382 KB)
[v3] Fri, 23 Aug 2019 08:39:17 UTC (844 KB)
