-
Specialists Outperform Generalists in Ensemble Classification
Authors:
Sascha Meyen,
Frieder Göppert,
Helen Alber,
Ulrike von Luxburg,
Volker H. Franz
Abstract:
Consider an ensemble of $k$ individual classifiers whose accuracies are known. Upon receiving a test point, each classifier outputs a predicted label and a confidence in its prediction for this particular test point. In this paper, we address the question of whether we can determine the accuracy of the ensemble. Surprisingly, even when classifiers are combined in the statistically optimal way in this setting, the accuracy of the resulting ensemble classifier cannot be computed from the accuracies of the individual classifiers, as would be the case in the standard setting of confidence-weighted majority voting. We prove tight upper and lower bounds on the ensemble accuracy. We explicitly construct the individual classifiers that attain the upper and lower bounds: specialists and generalists. Our theoretical results have very practical consequences: (1) If we use ensemble methods and have the choice to construct our individual (independent) classifiers from scratch, then we should aim for specialist classifiers rather than generalists. (2) Our bounds can be used to determine the minimum number of classifiers required to achieve a desired ensemble accuracy. Finally, we improve our bounds by considering the mutual information between the true label and the individual classifier's output.
Submitted 9 July, 2021;
originally announced July 2021.
-
Group Decisions based on Confidence Weighted Majority Voting
Authors:
Sascha Meyen,
Dorothee M. B. Sigg,
Ulrike von Luxburg,
Volker H. Franz
Abstract:
Background: It has repeatedly been reported that when making decisions under uncertainty, groups outperform individuals. In a lab setting, real groups are often replaced by simulated groups: Instead of performing an actual group discussion, individual responses are aggregated by a numerical computation. While studies typically use unweighted majority voting (MV) for this aggregation, the theoretically optimal method is confidence weighted majority voting (CWMV) -- if confidence ratings for the individual responses are available. However, it is not entirely clear how well the theoretically derived CWMV method predicts real group decisions and confidences. Therefore, we compared simulated group responses using CWMV and MV to real group responses.
Results: Simulated group decisions based on CWMV matched the accuracy of real group decisions very well, while simulated group decisions based on MV showed lower accuracy. CWMV also predicted well the confidence that groups placed in their group decisions. Yet individuals and real groups showed a bias towards under-confidence, whereas CWMV did not.
Conclusion: Our results highlight the importance of taking into account individual confidences when investigating group decisions: We found that real groups can aggregate individual confidences such that they match the optimal aggregation given by CWMV. This implies that research using simulated group decisions should use CWMV and not MV.
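The difference between the two aggregation rules can be illustrated with a minimal sketch. It assumes binary decisions coded as +1/-1, independent individuals, and confidences expressed as the probability that a response is correct; under these assumptions the usual form of CWMV weights each vote by its log odds. The function names `cwmv` and `mv` are hypothetical and are not taken from the paper.

```python
import math

def cwmv(votes, confidences):
    """Confidence weighted majority vote (sketch, assumptions as stated above).

    votes: iterable of +1 / -1 responses.
    confidences: probability (strictly between 0 and 1) that each response is correct.
    Returns (group decision, group confidence). A tie resolves to +1.
    """
    # Each vote is weighted by the log odds of being correct.
    total = sum(v * math.log(c / (1.0 - c)) for v, c in zip(votes, confidences))
    decision = 1 if total >= 0 else -1
    # The absolute weighted sum is the log odds that the group decision is correct.
    group_confidence = 1.0 / (1.0 + math.exp(-abs(total)))
    return decision, group_confidence

def mv(votes):
    """Unweighted majority vote. A tie resolves to +1."""
    return 1 if sum(votes) >= 0 else -1

# One highly confident individual against two weakly confident ones.
votes = [1, -1, -1]
confidences = [0.95, 0.60, 0.60]
print(cwmv(votes, confidences))
print(mv(votes))
```

In this example the single confident response outweighs the two weak ones under CWMV, whereas MV follows the majority.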
Submitted 7 June, 2021; v1 submitted 30 April, 2020;
originally announced May 2020.
-
Advancing Research on Unconscious Priming: When can Scientists Claim an Indirect Task Advantage?
Authors:
Sascha Meyen,
Iris A. Zerweck,
Catarina Amado,
Ulrike von Luxburg,
Volker H. Franz
Abstract:
Current literature holds that many cognitive functions can be performed outside consciousness. Evidence for this view comes from unconscious priming. In a typical experiment, visual stimuli are masked, such that participants are close to chance when directly asked to which of two categories the stimuli belong. This close-to-zero sensitivity is seen as evidence that participants cannot consciously report the category of the masked stimuli. Nevertheless, the category of the masked stimuli can indirectly affect responses to other stimuli (e.g., reaction times or brain activity). Priming is therefore seen as evidence that there is still some (albeit unconscious) sensitivity to the stimulus categories, thereby indicating processing outside consciousness. Although this "standard reasoning of unconscious priming" has been used in many studies, we show that it is flawed: Sensitivities are not calculated appropriately, thereby creating the wrong impression that priming indicates better sensitivity than the close-to-zero sensitivity of the direct discrimination. We describe the appropriate way to determine sensitivities, replicate the behavioral part of a landmark study, develop a method to estimate sensitivities for published studies from reported summary statistics, and use this method to reanalyze 15 highly influential studies. Results show that the interpretations of many studies need to be changed and that a community effort to reassess the vast literature on unconscious priming is needed. This process will allow scientists to learn more about the true boundary conditions of unconscious priming, thereby advancing the scientific understanding of consciousness.
Submitted 7 June, 2021; v1 submitted 30 April, 2020;
originally announced April 2020.
-
Unconscious lie detection as an example of a widespread fallacy in the Neurosciences
Authors:
Volker H. Franz,
Ulrike von Luxburg
Abstract:
Neuroscientists frequently use a certain statistical reasoning to establish the existence of distinct neuronal processes in the brain. We show that this reasoning is flawed and that the large corresponding literature needs reconsideration. We illustrate the fallacy with a recent study that received enormous press coverage because it concluded that humans detect deceit better if they use unconscious processes instead of conscious deliberations. The study was published under a new open-data policy that enabled us to reanalyze the data with more appropriate methods. We found that unconscious performance was close to chance, just like conscious performance. This illustrates the flaws of this widely used statistical reasoning, the benefits of open-data practices, and the need for careful reconsideration of studies using the same rationale.
Submitted 16 July, 2014;
originally announced July 2014.
-
A Geometric Approach to Confidence Sets for Ratios: Fieller's Theorem, Generalizations, and Bootstrap
Authors:
Ulrike von Luxburg,
Volker H. Franz
Abstract:
We present a geometric method to determine confidence sets for the ratio E(Y)/E(X) of the means of random variables X and Y. This method reduces the problem of constructing confidence sets for the ratio of two random variables to the problem of constructing confidence sets for the means of one-dimensional random variables. It is valid in a large variety of circumstances. In the case of normally distributed random variables, the confidence sets constructed in this way coincide with the standard Fieller confidence sets. Generalizations of our construction lead to definitions of exact and conservative confidence sets for very general classes of distributions, provided the joint expectation of (X,Y) exists and the linear combinations of the form aX + bY are well-behaved. Finally, our geometric method allows us to derive a very simple bootstrap approach for constructing conservative confidence sets for ratios which perform favorably in certain situations, in particular in the asymmetric heavy-tailed regime.
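For orientation, the standard Fieller confidence set mentioned above can be sketched as follows. This is the textbook quadratic-inequality form for paired samples, not the geometric construction of the paper. The function name `fieller_set` and its return convention are hypothetical, and the critical value is supplied by the caller (a Student t quantile in the classical setting; the usage example substitutes a normal quantile as a large-sample approximation).

```python
import math
from statistics import NormalDist, mean, variance

def fieller_set(x, y, crit):
    """Fieller confidence set for E(Y)/E(X) from paired samples (sketch).

    Solves (my - r*mx)^2 <= crit^2 * Var(mean(y) - r*mean(x)) for r.
    Returns ("interval", lo, hi)    for the bounded set [lo, hi],
            ("complement", lo, hi)  for (-inf, lo] together with [hi, inf),
            ("unbounded", None, None) otherwise (whole line or a half-line).
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    vx = variance(x) / n  # estimated variance of the sample mean of x
    vy = variance(y) / n  # estimated variance of the sample mean of y
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1) / n
    # Coefficients of the quadratic a*r^2 - 2*b*r + c <= 0.
    a = mx * mx - crit * crit * vx
    b = mx * my - crit * crit * cov
    c = my * my - crit * crit * vy
    disc = b * b - a * c
    if a > 0:
        # Denominator clearly differs from zero: bounded interval.
        root = math.sqrt(max(disc, 0.0))
        return ("interval", (b - root) / a, (b + root) / a)
    if a < 0 and disc > 0:
        root = math.sqrt(disc)
        lo, hi = sorted(((b - root) / a, (b + root) / a))
        return ("complement", lo, hi)
    return ("unbounded", None, None)

# Usage with a normal quantile as a large-sample stand-in for the t quantile.
crit = NormalDist().inv_cdf(0.975)
x = [4.0, 5.0, 6.0, 5.5, 4.5]
y = [2.0, 2.6, 3.1, 2.4, 2.5]
print(fieller_set(x, y, crit))
```

When the mean of X is not clearly different from zero, the set is no longer a bounded interval, which is why the sketch reports the shape of the set along with its endpoints.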
Submitted 1 November, 2007;
originally announced November 2007.
-
Ratios: A short guide to confidence limits and proper use
Authors:
Volker H. Franz
Abstract:
Researchers often calculate ratios of measured quantities. Specifying confidence limits for ratios is difficult and the appropriate methods are often unknown. Appropriate methods are described (Fieller, Taylor, special bootstrap methods). For the Fieller method a simple geometrical interpretation is given. Monte Carlo simulations show when these methods are appropriate and that the most frequently used methods (index method and zero-variance method) can lead to large liberal deviations from the desired confidence level. It is discussed when we can use standard regression or measurement error models and when we have to resort to specific models for heteroscedastic data. Finally, an old warning is repeated that we should be aware of the problems of spurious correlations if we use ratios.
Submitted 10 October, 2007;
originally announced October 2007.