A General-Purpose Crowdsourcing Computational Quality Control Toolkit for Python

Ustalov, Dmitry; Pavlichenko, Nikita; Losev, Vladimir; Giliazev, Iulian; Tulin, Evgeny

arXiv:2109.08584v2 (cs)

[Submitted on 17 Sep 2021 (v1), revised 8 Oct 2021 (this version, v2), latest version 6 Apr 2024 (v4)]

Abstract: Quality control is a crux of crowdsourcing. While most means for quality control are organizational and imply worker selection, golden tasks, and post-acceptance, computational quality control techniques allow parameterizing the whole crowdsourcing process of workers, tasks, and labels, inferring and revealing relationships between them. In this paper, we demonstrate Crowd-Kit, a general-purpose crowdsourcing computational quality control toolkit. It provides efficient implementations in Python of computational quality control algorithms for crowdsourcing, including uncertainty measures and crowd consensus methods. We focus on aggregation methods for all the major annotation tasks, from categorical annotation, in which the latent label assumption is met, to more complex tasks like image and sequence aggregation. We perform an extensive evaluation of our toolkit on several datasets of different nature, enabling the benchmarking of computational quality control methods in a uniform, systematic, and reproducible way using the same codebase. We release our code and data under an open-source license.
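
To make the idea of a crowd consensus method for categorical annotation concrete, the following is a minimal sketch of per-task majority voting, arguably the simplest aggregation baseline. It is written in plain Python and does not use or reproduce Crowd-Kit's API; the function name `majority_vote`, the `(task, worker, label)` triple layout, and the tie-breaking rule are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def majority_vote(annotations):
    """Aggregate (task, worker, label) triples by per-task majority vote.

    Hypothetical helper for illustration; not part of Crowd-Kit.
    Ties are broken by picking the lexicographically smallest label,
    so the result is deterministic.
    """
    votes = defaultdict(Counter)
    for task, _worker, label in annotations:
        votes[task][label] += 1

    result = {}
    for task, counts in votes.items():
        top = max(counts.values())
        # Among the labels with the highest count, choose the smallest one.
        result[task] = min(label for label, n in counts.items() if n == top)
    return result


# Toy data: three workers label two tasks.
annotations = [
    ("t1", "w1", "cat"),
    ("t1", "w2", "cat"),
    ("t1", "w3", "dog"),
    ("t2", "w1", "dog"),
    ("t2", "w2", "dog"),
    ("t2", "w3", "cat"),
]
aggregated = majority_vote(annotations)
print(aggregated)
```

Majority voting treats every worker as equally reliable. The computational quality control methods the abstract refers to go further by modelling workers, tasks, and labels jointly, so this sketch should be read only as a baseline against which such methods are typically compared.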

Comments:	accepted at HCOMP 2021 Works-in-Progress and Demonstration Track
Subjects:	Human-Computer Interaction (cs.HC); Software Engineering (cs.SE)
ACM classes:	G.4
Cite as:	arXiv:2109.08584 [cs.HC]
	(or arXiv:2109.08584v2 [cs.HC] for this version)
	https://doi.org/10.48550/arXiv.2109.08584

Submission history

From: Dmitry Ustalov
[v1] Fri, 17 Sep 2021 15:01:56 UTC (20 KB)
[v2] Fri, 8 Oct 2021 17:13:34 UTC (19 KB)
[v3] Thu, 9 Feb 2023 12:59:49 UTC (19 KB)
[v4] Sat, 6 Apr 2024 08:53:26 UTC (13 KB)
