-
Practice of Efficient Data Collection via Crowdsourcing at Large-Scale
Authors:
Alexey Drutsa,
Viktoriya Farafonova,
Valentina Fedorova,
Olga Megorskaya,
Evfrosiniya Zerminova,
Olga Zhilinskaya
Abstract:
Modern machine learning algorithms need large datasets to be trained. Crowdsourcing has become a popular approach to labelling large datasets in a shorter time and at a lower cost than relying on a limited number of experts. However, as crowdsourcing performers are non-professional and vary in levels of expertise, such labels are much noisier than those obtained from experts. For this reason, in order to collect good-quality data within a limited budget, special techniques such as incremental relabelling, aggregation and pricing need to be used. We introduce data labelling via public crowdsourcing marketplaces and present key components of efficient label collection. We show how to choose one of several real label collection tasks, experiment with selecting settings for the labelling process, and launch a label collection project on Yandex.Toloka, one of the largest crowdsourcing marketplaces. The projects will be run on real crowds. We also present the main algorithms for aggregation, incremental relabelling, and pricing in crowdsourcing. In particular, we first discuss how to connect these three components to build an efficient label collection process, and second, share rich industrial experience of applying these algorithms and constructing large-scale label collection pipelines (emphasizing best practices and common pitfalls).
Submitted 9 December, 2019;
originally announced December 2019.
-
Latent Distribution Assumption for Unbiased and Consistent Consensus Modelling
Authors:
Valentina Fedorova,
Gleb Gusev,
Pavel Serdyukov
Abstract:
We study the problem of aggregating noisy labels. Usually, it is solved by proposing a stochastic model for the process of generating noisy labels and then estimating the model parameters using the observed noisy labels. A traditional assumption underlying previously introduced generative models is that each object has one latent true label. In contrast, we introduce a novel latent distribution assumption, implying that a unique true label for an object might not exist, but rather each object might have a specific distribution generating a latent subjective label each time the object is observed. Our experiments showed that the novel assumption is more suitable for difficult tasks, when there is an ambiguity in choosing a "true" label for certain objects.
Submitted 20 June, 2019;
originally announced June 2019.
-
Aggregation of pairwise comparisons with reduction of biases
Authors:
Nadezhda Bugakova,
Valentina Fedorova,
Gleb Gusev,
Alexey Drutsa
Abstract:
We study the problem of ranking from crowdsourced pairwise comparisons. Answers to pairwise tasks are known to be affected by the position of items on the screen; however, previous models for aggregation of pairwise comparisons do not focus on modeling this kind of bias. We introduce a new aggregation model, factorBT, for pairwise comparisons, which accounts for certain factors of pairwise tasks that are known to be irrelevant to the result of comparisons but may affect workers' answers due to perceptual reasons. By modeling biases that influence workers, factorBT is able to reduce the effect of biased pairwise comparisons on the resulting ranking. Our empirical studies on real-world data sets showed that factorBT produces more accurate rankings from crowdsourced pairwise comparisons than previously established models.
Submitted 9 June, 2019;
originally announced June 2019.
-
Criteria of efficiency for conformal prediction
Authors:
Vladimir Vovk,
Ilia Nouretdinov,
Valentina Fedorova,
Ivan Petej,
Alex Gammerman
Abstract:
We study optimal conformity measures for various criteria of efficiency of classification in an idealised setting. This leads to an important class of criteria of efficiency that we call probabilistic; it turns out that the most standard criteria of efficiency used in the literature on conformal prediction are not probabilistic unless the problem of classification is binary. We consider both unconditional and label-conditional conformal prediction.
Submitted 14 September, 2016; v1 submitted 14 March, 2016;
originally announced March 2016.
-
Large-scale probabilistic predictors with and without guarantees of validity
Authors:
Vladimir Vovk,
Ivan Petej,
Valentina Fedorova
Abstract:
This paper studies theoretically and empirically a method of turning machine-learning algorithms into probabilistic predictors that automatically enjoys a property of validity (perfect calibration) and is computationally efficient. The price to pay for perfect calibration is that these probabilistic predictors produce imprecise (in practice, almost precise for large data sets) probabilities. When these imprecise probabilities are merged into precise probabilities, the resulting predictors, while losing the theoretical property of perfect calibration, are consistently more accurate than the existing methods in empirical studies.
Submitted 13 November, 2015; v1 submitted 1 November, 2015;
originally announced November 2015.
-
From conformal to probabilistic prediction
Authors:
Vladimir Vovk,
Ivan Petej,
Valentina Fedorova
Abstract:
This paper proposes a new method of probabilistic prediction, which is based on conformal prediction. The method is applied to the standard USPS data set and gives encouraging results.
Submitted 21 June, 2014;
originally announced June 2014.
-
Plug-in martingales for testing exchangeability on-line
Authors:
Valentina Fedorova,
Alex Gammerman,
Ilia Nouretdinov,
Vladimir Vovk
Abstract:
A standard assumption in machine learning is the exchangeability of data, which is equivalent to assuming that the examples are generated from the same probability distribution independently. This paper is devoted to testing the assumption of exchangeability on-line: the examples arrive one by one, and after receiving each example we would like to have a valid measure of the degree to which the assumption of exchangeability has been falsified. Such measures are provided by exchangeability martingales. We extend known techniques for constructing exchangeability martingales and show that our new method is competitive with the martingales introduced before. Finally, we investigate the performance of our testing method on two benchmark datasets, USPS and Statlog Satellite data; for the former, the known techniques give satisfactory results, but for the latter our new, more flexible method becomes necessary.
Submitted 28 June, 2012; v1 submitted 15 April, 2012;
originally announced April 2012.