Forecast dominance testing via sign randomization
Authors:
Werner Ehm,
Fabian Krüger
Abstract:
We propose randomization tests of whether forecast 1 outperforms forecast 2 across a class of scoring functions. This hypothesis is of applied interest: While the prediction context often prescribes a certain class of scoring functions, it is typically hard to motivate a specific choice on statistical or substantive grounds. We investigate the asymptotic behavior of the test statistics under mild conditions, avoiding the need to assume particular dynamic properties of forecasts and realizations. The properties of the one-sided tests depend on a corresponding version of Anderson's inequality, which we state as a conjecture of independent interest. Numerical experiments and a data example indicate that the tests have good size and power properties in practically relevant situations.
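As a rough illustration of the sign randomization idea, the sketch below computes a one-sided p-value from a matrix of score differences (forecast 1 minus forecast 2), with one column per scoring function. The abstract does not specify the test statistic, so the maximum of the mean score differences used here, the function name `sign_randomization_pvalue`, and its parameters are all assumptions made for illustration, not the authors' procedure.

```python
import numpy as np

def sign_randomization_pvalue(score_diffs, n_draws=2000, seed=0):
    """One-sided sign randomization p-value (illustrative sketch only).

    score_diffs: array of shape (T,) or (T, K) holding
        S_k(forecast 1, y_t) - S_k(forecast 2, y_t)
    for T time points and K scoring functions. Large positive values speak
    against forecast 1 being at least as good under every scoring function.
    """
    d = np.asarray(score_diffs, dtype=float)
    if d.ndim == 1:
        d = d[:, None]
    n_obs = d.shape[0]
    rng = np.random.default_rng(seed)

    # Assumed statistic: largest mean score difference over the K functions.
    observed = d.mean(axis=0).max()

    # Flip the sign of each time point's whole row of differences, so the
    # dependence across scoring functions is kept intact within a draw.
    signs = rng.choice([-1.0, 1.0], size=(n_draws, n_obs))
    randomized = (signs @ d / n_obs).max(axis=1)

    # Add-one correction keeps the p-value strictly positive.
    return (1 + np.sum(randomized >= observed)) / (1 + n_draws)
```

Flipping one sign per time point, rather than per entry, is a design choice of this sketch: it treats the vector of score differences at a given time as a single unit.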
Submitted 22 October, 2018; v1 submitted 10 July, 2017;
originally announced July 2017.
Of Quantiles and Expectiles: Consistent Scoring Functions, Choquet Representations, and Forecast Rankings
Authors:
Werner Ehm,
Tilmann Gneiting,
Alexander Jordan,
Fabian Krüger
Abstract:
In the practice of point prediction, it is desirable that forecasters receive a directive in the form of a statistical functional, such as the mean or a quantile of the predictive distribution. When evaluating and comparing competing forecasts, it is then critical that the scoring function used for these purposes be consistent for the functional at hand, in the sense that the expected score is minimized when following the directive.
We show that any scoring function that is consistent for a quantile or an expectile functional, respectively, can be represented as a mixture of extremal scoring functions that form a linearly parameterized family. Scoring functions for the mean value and probability forecasts of binary events constitute important examples. The quantile and expectile functionals along with the respective extremal scoring functions admit appealing economic interpretations in terms of thresholds in decision making.
The Choquet type mixture representations give rise to simple checks of whether a forecast dominates another in the sense that it is preferable under any consistent scoring function. In empirical settings it suffices to compare the average scores for only a finite number of extremal elements. Plots of the average scores with respect to the extremal scoring functions, which we call Murphy diagrams, permit detailed comparisons of the relative merits of competing forecasts.
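A minimal sketch of the empirical dominance check and the curve behind a Murphy diagram, for quantile forecasts. The abstract does not give the extremal scoring function explicitly; the form used below, S(x, y) = (1{y < x} - alpha) (1{theta < x} - 1{theta < y}), is the elementary quantile score usually associated with this representation and should be verified against the paper. The function names are hypothetical.

```python
import numpy as np

def elementary_quantile_score(x, y, theta, alpha=0.5):
    """Assumed extremal score for the alpha-quantile at threshold theta.

    Equals 1 - alpha if y <= theta < x, alpha if x <= theta < y, else 0.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    ind_yx = np.asarray(y < x, dtype=float)
    ind_tx = np.asarray(theta < x, dtype=float)
    ind_ty = np.asarray(theta < y, dtype=float)
    return (ind_yx - alpha) * (ind_tx - ind_ty)

def murphy_curve(x, y, thetas, alpha=0.5):
    """Average extremal score at each threshold; plot against thetas."""
    return np.array(
        [elementary_quantile_score(x, y, t, alpha).mean() for t in thetas]
    )

def empirically_dominates(x1, x2, y, alpha=0.5):
    """True if forecast x1 has an average extremal score no larger than x2
    at every threshold.

    The average score is a step function of theta that only changes at
    forecast or realization values, so checking those points is enough.
    """
    thetas = np.unique(np.concatenate([np.ravel(x1), np.ravel(x2), np.ravel(y)]))
    curve1 = murphy_curve(x1, y, thetas, alpha)
    curve2 = murphy_curve(x2, y, thetas, alpha)
    return bool(np.all(curve1 <= curve2))
```

For example, with `y = np.array([0.0, 1.0, 2.0])`, a forecast equal to `y` scores zero at every threshold and therefore dominates the shifted forecast `y + 1.0`, while the reverse comparison fails.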
Submitted 17 April, 2015; v1 submitted 27 March, 2015;
originally announced March 2015.