-
Skeptical binary inferences in multi-label problems with sets of probabilities
Authors:
Yonatan Carlos Carranza Alarcón,
Sébastien Destercke
Abstract:
In this paper, we consider the problem of making distributionally robust, skeptical inferences for the multi-label problem, or more generally for Boolean vectors. By distributionally robust, we mean that we consider a set of possible probability distributions, and by skeptical we understand that we consider as valid only those inferences that are true for every distribution within this set. Such inferences will provide partial predictions whenever the considered set is sufficiently big. We study in particular the Hamming loss case, a common loss function in multi-label problems, showing how skeptical inferences can be made in this setting. Our experimental results are organised in three parts: (1) the first indicates the computational gain obtained from our theoretical results, using synthetic data sets; (2) the second indicates that our approaches produce relevant cautiousness on those hard-to-predict instances where their precise counterparts fail; and (3) the last demonstrates experimentally that our approach copes with imperfect information (generated by a downsampling procedure) better than partial abstention [31] and rejection rules.
Submitted 2 May, 2022;
originally announced May 2022.
-
Multi-label Chaining with Imprecise Probabilities
Authors:
Yonatan Carlos Carranza Alarcón,
Sébastien Destercke
Abstract:
We present two different strategies to extend the classical multi-label chaining approach to handle imprecise probability estimates. These estimates use convex sets of distributions (or credal sets), rather than a single precise distribution, to describe our uncertainty. The main reasons one could have for using such estimates are (1) to make cautious predictions (or no decision at all) when high uncertainty is detected in the chaining and (2) to make better precise predictions by avoiding biases caused by early decisions in the chaining. We adapt both strategies to the case of the naive credal classifier, showing that these adaptations are computationally efficient. Our experimental results on missing labels, which investigate how reliable these predictions are in both approaches, indicate that our approaches produce relevant cautiousness on those hard-to-predict instances where the precise models fail.
Submitted 19 July, 2021; v1 submitted 15 July, 2021;
originally announced July 2021.
-
Copula-based conformal prediction for Multi-Target Regression
Authors:
Soundouss Messoudi,
Sébastien Destercke,
Sylvain Rousseau
Abstract:
There are relatively few works dealing with conformal prediction for multi-task learning issues, and this is particularly true for multi-target regression. This paper focuses on the problem of providing valid (i.e., frequency calibrated) multi-variate predictions. To do so, we propose to use copula functions applied to deep neural networks for inductive conformal prediction. We show that the proposed method ensures efficiency and validity for multi-target regression problems on various data sets.
Submitted 28 January, 2021;
originally announced January 2021.
-
Epistemic Uncertainty Sampling
Authors:
Vu-Linh Nguyen,
Sébastien Destercke,
Eyke Hüllermeier
Abstract:
Various strategies for active learning have been proposed in the machine learning literature. In uncertainty sampling, which is among the most popular approaches, the active learner sequentially queries the label of those instances for which its current prediction is maximally uncertain. The predictions as well as the measures used to quantify the degree of uncertainty, such as entropy, are almost exclusively of a probabilistic nature. In this paper, we advocate a distinction between two different types of uncertainty, referred to as epistemic and aleatoric, in the context of active learning. Roughly speaking, these notions capture the reducible and the irreducible part of the total uncertainty in a prediction, respectively. We conjecture that, in uncertainty sampling, the usefulness of an instance is better reflected by its epistemic than by its aleatoric uncertainty. This leads us to suggest the principle of "epistemic uncertainty sampling", which we instantiate by means of a concrete approach for measuring epistemic and aleatoric uncertainty. In experimental studies, epistemic uncertainty sampling does indeed show promising performance.
Submitted 31 August, 2019;
originally announced September 2019.
-
Probability boxes on totally preordered spaces for multivariate modelling
Authors:
Matthias C. M. Troffaes,
Sebastien Destercke
Abstract:
A pair of lower and upper cumulative distribution functions, also called a probability box or p-box, is among the most popular models used in imprecise probability theory. P-boxes arise naturally in expert elicitation, for instance in cases where bounds are specified on the quantiles of a random variable, or when quantiles are specified only at a finite number of points. Many practical and formal results concerning p-boxes already exist in the literature. In this paper, we provide new efficient tools to construct multivariate p-boxes and develop algorithms to draw inferences from them. For this purpose, we formalise and extend the theory of p-boxes using Walley's behavioural theory of imprecise probabilities, and rely heavily on its notion of natural extension and on existing results about independence modeling. In particular, we allow p-boxes to be defined on arbitrary totally preordered spaces, thereby also admitting multivariate p-boxes via probability bounds over any collection of nested sets. We focus on the cases of independence (using the factorization property) and of unknown dependence (using the Fréchet bounds), and we show that our approach extends the probabilistic arithmetic of Williamson and Downs. Two design problems, a damped oscillator and a river dike, demonstrate the practical feasibility of our results.
Submitted 29 March, 2011; v1 submitted 9 March, 2011;
originally announced March 2011.