-
Calibrated Perception Uncertainty Across Objects and Regions in Bird's-Eye-View
Authors:
Markus Kängsepp,
Meelis Kull
Abstract:
In driving scenarios with poor visibility or occlusions, it is important that the autonomous vehicle takes all uncertainties into account when making driving decisions, including the choice of a safe speed. The grid-based perception outputs, such as occupancy grids, and object-based outputs, such as lists of detected objects, must then be accompanied by well-calibrated uncertainty estimates. We highlight limitations in the state-of-the-art and propose a more complete set of uncertainties to be reported, particularly including undetected-object-ahead probabilities. We suggest a novel way to get these probabilistic outputs from bird's-eye-view probabilistic semantic segmentation, using the FIERY model as an example. We demonstrate that the obtained probabilities are not calibrated out-of-the-box and propose methods to achieve well-calibrated uncertainties.
Submitted 8 November, 2022;
originally announced November 2022.
-
On the Usefulness of the Fit-on-the-Test View on Evaluating Calibration of Classifiers
Authors:
Markus Kängsepp,
Kaspar Valk,
Meelis Kull
Abstract:
Every uncalibrated classifier has a corresponding true calibration map that calibrates its confidence. Deviations of this idealistic map from the identity map reveal miscalibration. Such calibration errors can be reduced with many post-hoc calibration methods which fit some family of calibration maps on a validation dataset. In contrast, evaluation of calibration with the expected calibration error (ECE) on the test set does not explicitly involve fitting. However, as we demonstrate, ECE can still be viewed as if fitting a family of functions on the test data. This motivates the fit-on-the-test view on evaluation: first, approximate a calibration map on the test data, and second, quantify its distance from the identity. Exploiting this view allows us to unlock missed opportunities: (1) use the plethora of post-hoc calibration methods for evaluating calibration; (2) tune the number of bins in ECE with cross-validation. Furthermore, we introduce: (3) benchmarking on pseudo-real data where the true calibration map can be estimated very precisely; and (4) novel calibration and evaluation methods using new calibration map families PL and PL3.
Submitted 26 February, 2025; v1 submitted 16 March, 2022;
originally announced March 2022.
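As an illustration of the binned ECE that the abstract above refers to, here is a minimal sketch in plain Python. It assumes equal-width bins and the common confidence-ECE definition (bin-weighted absolute gap between mean accuracy and mean confidence); the function name and the default of 10 bins are illustrative choices, not taken from the paper. Read through the fit-on-the-test view, the per-bin mean accuracy acts as a piecewise-constant estimate of the calibration map, and the result measures its distance from the identity.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width binned ECE (a sketch, not the paper's implementation).

    confidences: predicted confidence in [0, 1] for each test instance.
    correct:     1 if the prediction was right, 0 otherwise.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one instance is required")

    conf_sums = [0.0] * n_bins  # sum of confidences per bin
    acc_sums = [0.0] * n_bins   # number of correct predictions per bin

    for conf, hit in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
        # A confidence of exactly 1.0 is placed in the last bin.
        b = min(int(conf * n_bins), n_bins - 1)
        conf_sums[b] += conf
        acc_sums[b] += hit

    # (count_b / n) * |mean_acc_b - mean_conf_b| simplifies to
    # |acc_sum_b - conf_sum_b| / n, so empty bins contribute zero.
    return sum(abs(a - c) for a, c in zip(acc_sums, conf_sums)) / n


if __name__ == "__main__":
    # Four predictions at 90% confidence, three of them correct:
    # accuracy 0.75 versus confidence 0.9, so ECE is about 0.15.
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0]))
```

Because the bin count is a free parameter of this estimator, it could be chosen by cross-validation on the test data, which is the kind of tuning the abstract mentions.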
-
Correlated daily time series and forecasting in the M4 competition
Authors:
Anti Ingel,
Novin Shahroudi,
Markus Kängsepp,
Andre Tättar,
Viacheslav Komisarenko,
Meelis Kull
Abstract:
We participated in the M4 competition for time series forecasting and describe here our methods for forecasting daily time series. We used an ensemble of five statistical forecasting methods and a method that we refer to as the correlator. Our retrospective analysis using the ground truth values published by the M4 organisers after the competition demonstrates that the correlator was responsible for most of our gains over the naive constant forecasting method. We identify data leakage as one reason for its success, partly due to test data selected from different time intervals, and partly due to quality issues in the original time series. We suggest that future forecasting competitions should provide actual dates for the time series so that some of those leakages could be avoided by the participants.
Submitted 31 March, 2020; v1 submitted 28 March, 2020;
originally announced March 2020.
-
Beyond temperature scaling: Obtaining well-calibrated multiclass probabilities with Dirichlet calibration
Authors:
Meelis Kull,
Miquel Perello-Nieto,
Markus Kängsepp,
Telmo Silva Filho,
Hao Song,
Peter Flach
Abstract:
Class probabilities predicted by most multiclass classifiers are uncalibrated, often tending towards over-confidence. With neural networks, calibration can be improved by temperature scaling, a method to learn a single corrective multiplicative factor for inputs to the last softmax layer. On non-neural models the existing methods apply binary calibration in a pairwise or one-vs-rest fashion.
We propose a natively multiclass calibration method applicable to classifiers from any model class, derived from Dirichlet distributions and generalising the beta calibration method from binary classification. It is easily implemented with neural nets since it is equivalent to log-transforming the uncalibrated probabilities, followed by one linear layer and softmax. Experiments demonstrate improved probabilistic predictions according to multiple measures (confidence-ECE, classwise-ECE, log-loss, Brier score) across a wide range of datasets and classifiers. Parameters of the learned Dirichlet calibration map provide insights into the biases in the uncalibrated model.
Submitted 28 October, 2019;
originally announced October 2019.
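The form of the calibration map described in the abstract above (log-transform, one linear layer, softmax) can be sketched in a few lines of plain Python. This is only the forward pass under stated assumptions: the function names, the `eps` clamp, and the example weights are hypothetical, and fitting the weights and bias on validation data (for example by minimising log-loss) is not shown.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dirichlet_calibration_map(probs, weights, bias, eps=1e-12):
    """Apply softmax(W @ log(p) + b) to one probability vector.

    probs:   uncalibrated class probabilities (length k).
    weights: k x k matrix as a list of rows (assumed already fitted).
    bias:    length-k bias vector (assumed already fitted).
    eps:     hypothetical clamp to avoid log(0).
    """
    log_p = [math.log(max(p, eps)) for p in probs]
    logits = [
        sum(w * lp for w, lp in zip(row, log_p)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


if __name__ == "__main__":
    p = [0.7, 0.2, 0.1]
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    zero_bias = [0.0, 0.0, 0.0]
    # Identity weights and zero bias leave the probabilities unchanged.
    print(dirichlet_calibration_map(p, identity, zero_bias))
    # Weights (1/T) * I with T = 2 soften the distribution, which matches
    # temperature scaling because softmax ignores the shared normaliser.
    tempered = [[0.5 * v for v in row] for row in identity]
    print(dirichlet_calibration_map(p, tempered, zero_bias))
```

With identity weights the map is a no-op, and with a scaled identity it reduces to temperature scaling, so a full weight matrix gives strictly more freedom than a single multiplicative factor.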
-
Change Blindness in 3D Virtual Reality
Authors:
Madis Vasser,
Markus Kängsepp,
Jaan Aru
Abstract:
In the present change blindness study, subjects explored stereoscopic three-dimensional (3D) environments through a virtual reality (VR) headset. A novel method that tracked the subjects' head movements was used to induce changes in the scene whenever the changing object was out of the field of view. The effect of change location (foreground or background in 3D depth) on change blindness was investigated. Two experiments were conducted, one in the lab (n = 50) and the other online (n = 25). Up to 25% of the changes were undetected and the mean overall search time was 27 seconds in the lab study. Results indicated significantly lower change detection success and more change cycles if the changes occurred in the background, with no differences in overall search times. The results confirm findings from previous studies and extend them to 3D environments. The study also demonstrates the feasibility of online VR experiments.
Submitted 24 August, 2015;
originally announced August 2015.