-
Measuring Dependence between Events
Authors:
Marc-Oliver Pohle,
Timo Dimitriadis,
Jan-Lukas Wermuth
Abstract:
Measuring dependence between two events, or equivalently between two binary random variables, amounts to expressing the dependence structure inherent in a $2\times 2$ contingency table as a real number between $-1$ and $1$. Countless such dependence measures exist, but there is little theoretical guidance on how they compare and on their advantages and shortcomings. Thus, practitioners might be overwhelmed by the problem of choosing a suitable measure. We provide a set of natural desirable properties that a proper dependence measure should fulfill. We show that Yule's Q and the little-known Cole coefficient are proper, while the most widely used measures, the phi coefficient and all contingency coefficients, are improper. They have a severe attainability problem, that is, even under perfect dependence they can be very far away from $-1$ and $1$, and often differ substantially from the proper measures in that they understate the strength of dependence. The structural reason is that these are measures for equality of events rather than of dependence. We derive the (in some instances non-standard) limiting distributions of the measures and illustrate how asymptotically valid confidence intervals can be constructed. In a case study on drug consumption we demonstrate how misleading conclusions may arise from the use of improper dependence measures.
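The attainability problem described in this abstract can be illustrated with a small sketch. The table below is a made-up example, not data from the paper, in which event A never occurs without event B. The phi coefficient and Yule's Q use their textbook formulas; the form of the Cole coefficient (covariance of the two indicators divided by its Fréchet-Hoeffding bound) is an assumption about the definition the authors use, and the function names are hypothetical.

```python
import math

# Counts of a 2x2 table: a = A and B, b = A and not B,
# c = not A and B, d = neither. Degenerate tables (zero margins)
# are not handled in this sketch.

def phi(a, b, c, d):
    """Phi coefficient (Pearson correlation of the two indicators)."""
    num = a * d - b * c
    den = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return num / den

def yule_q(a, b, c, d):
    """Yule's Q."""
    return (a * d - b * c) / (a * d + b * c)

def cole_c(a, b, c, d):
    """Cole coefficient, assumed here to be the covariance of the
    indicators normalised by its Frechet-Hoeffding bound."""
    n = a + b + c + d
    p, q, r = (a + b) / n, (a + c) / n, a / n  # P(A), P(B), P(A and B)
    cov = r - p * q
    if cov >= 0:
        return cov / (min(p, q) - p * q)        # upper bound
    return cov / (p * q - max(0.0, p + q - 1))  # lower bound (absolute value)

# Hypothetical table with b = 0: A implies B, but P(A) = 0.1 and P(B) = 0.5.
table = (10, 0, 40, 50)
print(round(phi(*table), 3))  # 0.333
print(yule_q(*table))         # 1.0
print(cole_c(*table))         # 1.0
```

In this example phi stays at one third although A is a subset of B, because phi can only reach 1 when the two events coincide, which requires equal marginal probabilities.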
Submitted 4 November, 2024; v1 submitted 26 March, 2024;
originally announced March 2024.
-
Generalised Covariances and Correlations
Authors:
Tobias Fissler,
Marc-Oliver Pohle
Abstract:
The covariance of two random variables measures the average joint deviations from their respective means. We generalise this well-known measure by replacing the means with other statistical functionals such as quantiles, expectiles, or thresholds. Deviations from these functionals are defined via generalised errors, often induced by identification or moment functions. As a normalised measure of dependence, a generalised correlation is constructed. Replacing the common Cauchy-Schwarz normalisation by a novel Fréchet-Hoeffding normalisation, we obtain attainability of the entire interval $[-1, 1]$ for any given marginals. We uncover favourable properties of these new dependence measures. The families of quantile and threshold correlations give rise to function-valued distributional correlations, exhibiting the entire dependence structure. They lead to tail correlations, which should arguably supersede the coefficients of tail dependence. Finally, we construct summary covariances (correlations), which arise as (normalised) weighted averages of distributional covariances. We retrieve Pearson covariance and Spearman correlation as special cases. The applicability and usefulness of our new dependence measures are illustrated on demographic data from the Panel Study of Income Dynamics.
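A rough sketch of what a quantile correlation with Fréchet-Hoeffding normalisation might look like in a sample is given below. It assumes the generalised errors are the quantile identification functions $1\{X \le q_\alpha\} - \alpha$, estimates the quantiles by order statistics, and divides the resulting covariance by the largest (or smallest) value it could take given the empirical marginals. The function name and these estimation choices are assumptions for illustration, not the estimator from the paper.

```python
import math
import random

def quantile_correlation(x, y, alpha, beta):
    """Sample quantile covariance at levels (alpha, beta), normalised by
    its Frechet-Hoeffding bound so that the result lies in [-1, 1]."""
    n = len(x)
    # Empirical quantiles taken as order statistics (an assumed choice).
    qx = sorted(x)[max(0, math.ceil(alpha * n) - 1)]
    qy = sorted(y)[max(0, math.ceil(beta * n) - 1)]
    ix = [xi <= qx for xi in x]
    iy = [yi <= qy for yi in y]
    p = sum(ix) / n                                        # share with X <= qx
    q = sum(iy) / n                                        # share with Y <= qy
    r = sum(1 for u, v in zip(ix, iy) if u and v) / n      # joint share
    cov = r - p * q
    if cov >= 0:
        return cov / (min(p, q) - p * q)
    return cov / (p * q - max(0.0, p + q - 1))

rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(2000)]
noise = [rng.gauss(0, 1) for _ in range(2000)]

same = quantile_correlation(x, x, 0.1, 0.5)                     # comonotone
opposite = quantile_correlation(x, [-v for v in x], 0.1, 0.5)   # countermonotone
unrelated = quantile_correlation(x, noise, 0.5, 0.5)            # independent
print(same, opposite, round(unrelated, 2))
```

With this normalisation the comonotone and countermonotone cases reach the ends of the interval even though the two levels differ, which is the attainability property the abstract refers to.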
Submitted 21 September, 2023; v1 submitted 7 July, 2023;
originally announced July 2023.
-
Testing Quantile Forecast Optimality
Authors:
Jack Fosten,
Daniel Gutknecht,
Marc-Oliver Pohle
Abstract:
Quantile forecasts made across multiple horizons have become an important output of many financial institutions, central banks and international organisations. This paper proposes misspecification tests for such quantile forecasts that assess optimality over a set of multiple forecast horizons and/or quantiles. The tests build on multiple Mincer-Zarnowitz quantile regressions cast in a moment equality framework. Our main test is for the null hypothesis of autocalibration, a concept which assesses optimality with respect to the information contained in the forecasts themselves. We provide an extension that allows testing for optimality with respect to larger information sets, as well as a multivariate extension. Importantly, our tests do not just inform about general violations of optimality, but may also provide useful insights into specific forms of sub-optimality. A simulation study investigates the finite sample performance of our tests, and two empirical applications to financial returns and U.S. macroeconomic series illustrate that our tests can yield interesting insights into quantile forecast sub-optimality and its causes.
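To give a feel for the moment equalities behind a Mincer-Zarnowitz quantile regression, the sketch below evaluates the sample analogues of $E[(1\{Y \le X\} - \tau)] = 0$ and $E[(1\{Y \le X\} - \tau)\,X] = 0$, which are the first-order conditions of a quantile regression of the outcome $Y$ on the forecast $X$ at intercept 0 and slope 1. It covers a single horizon and a single quantile level on simulated data, and it only computes the moments; it is not the test statistic proposed in the paper. All names and the data-generating process are assumptions.

```python
import random
from statistics import NormalDist

def mz_moments(forecasts, outcomes, tau):
    """Sample moments of the quantile identification function, alone and
    interacted with the forecast. Both should be near zero for an
    autocalibrated tau-quantile forecast."""
    n = len(outcomes)
    v = [(1.0 if y <= x else 0.0) - tau for x, y in zip(forecasts, outcomes)]
    m_const = sum(v) / n
    m_slope = sum(vi * x for vi, x in zip(v, forecasts)) / n
    return m_const, m_slope

rng = random.Random(0)
tau = 0.1
z_tau = NormalDist().inv_cdf(tau)                 # standard normal tau-quantile

mu = [rng.gauss(0, 1) for _ in range(20000)]      # predictable component
y = [m + rng.gauss(0, 1) for m in mu]             # outcome is N(mu, 1) given mu
optimal = [m + z_tau for m in mu]                 # true conditional quantile
biased = [x + 0.5 for x in optimal]               # systematically too high

print(mz_moments(optimal, y, tau))   # both moments close to zero
print(mz_moments(biased, y, tau))    # first moment clearly positive
```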
Submitted 12 October, 2023; v1 submitted 6 February, 2023;
originally announced February 2023.
-
Score-based calibration testing for multivariate forecast distributions
Authors:
Malte Knüppel,
Fabian Krüger,
Marc-Oliver Pohle
Abstract:
Calibration tests based on the probability integral transform (PIT) are routinely used to assess the quality of univariate distributional forecasts. However, PIT-based calibration tests for multivariate distributional forecasts face various challenges. We propose two new types of tests based on proper scoring rules, which overcome these challenges. They arise from a general framework for calibration testing in the multivariate case, introduced in this work. The new tests have good size and power properties in simulations and solve various problems of existing tests. We apply the tests to forecast distributions for macroeconomic and financial time series data.
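As background for the univariate case mentioned at the start of this abstract, the following sketch computes PIT values, that is, the forecast CDF evaluated at the realised outcome, and bins them into a histogram. Under a calibrated forecast the PITs are uniform on $[0, 1]$; an overdispersed forecast produces a hump-shaped histogram. The simulated data and function names are assumptions, and the score-based multivariate tests proposed in the paper are not implemented here.

```python
import random
from statistics import NormalDist

def pit_values(outcomes, forecast_dist):
    """Probability integral transform: forecast CDF at each outcome."""
    return [forecast_dist.cdf(y) for y in outcomes]

def bin_shares(pits, bins=10):
    """Share of PIT values falling into each of `bins` equal-width bins."""
    counts = [0] * bins
    for u in pits:
        counts[min(int(u * bins), bins - 1)] += 1
    return [c / len(pits) for c in counts]

rng = random.Random(1)
outcomes = [rng.gauss(0, 1) for _ in range(5000)]   # truth: N(0, 1)

calibrated = bin_shares(pit_values(outcomes, NormalDist(0, 1)))
overdispersed = bin_shares(pit_values(outcomes, NormalDist(0, 2)))

print([round(s, 2) for s in calibrated])      # roughly flat, near 0.1 each
print([round(s, 2) for s in overdispersed])   # mass piles up in the middle
```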
Submitted 12 December, 2023; v1 submitted 29 November, 2022;
originally announced November 2022.
-
Unlucky Number 13? Manipulating Evidence Subject to Snooping
Authors:
Uwe Hassler,
Marc-Oliver Pohle
Abstract:
Questionable research practices like HARKing or p-hacking have generated considerable recent interest throughout and beyond the scientific community. We subsume such practices involving secret data snooping that influences subsequent statistical inference under the term MESSing (manipulating evidence subject to snooping) and discuss, illustrate and quantify the possibly dramatic effects of several forms of MESSing using an empirical and a simple theoretical example. The empirical example uses numbers from the most popular German lottery, which seem to suggest that 13 is an unlucky number.
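The effect of snooping on a subsequent test can be sketched with a small simulation. It assumes a fair 6-out-of-49 lottery (the format is an assumption, as the abstract does not state it) and compares two procedures: testing whether the pre-specified number 13 is drawn too rarely, and first picking the rarest number from the data and then testing it as if it had been chosen in advance. The one-sided z-test, the simulation sizes and all names are illustrative choices, not taken from the paper.

```python
import random
from statistics import NormalDist

N_NUMBERS, PICKS = 49, 6   # assumed lottery format

def simulate_counts(n_draws, rng):
    """How often each number 1..49 appears in n_draws fair draws."""
    counts = [0] * (N_NUMBERS + 1)            # index 0 is unused
    for _ in range(n_draws):
        for k in rng.sample(range(1, N_NUMBERS + 1), PICKS):
            counts[k] += 1
    return counts

def z_score(count, n_draws):
    """Normal approximation to the binomial count of a single number."""
    p = PICKS / N_NUMBERS
    return (count - n_draws * p) / (n_draws * p * (1 - p)) ** 0.5

def rejection_rates(n_sims=200, n_draws=500, level=0.05, seed=13):
    """Share of simulations in which 'this number is unlucky' is accepted."""
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(level)        # one-sided critical value
    fixed = snooped = 0
    for _ in range(n_sims):
        counts = simulate_counts(n_draws, rng)
        fixed += z_score(counts[13], n_draws) < crit          # chosen in advance
        snooped += z_score(min(counts[1:]), n_draws) < crit   # chosen after looking
    return fixed / n_sims, snooped / n_sims

fixed_rate, snooped_rate = rejection_rates()
print(fixed_rate, snooped_rate)
```

Although every number is equally likely by construction, the snooped procedure rejects fairness in most simulations, while the pre-specified test rejects at roughly its nominal level.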
Submitted 4 September, 2020;
originally announced September 2020.
-
The Murphy Decomposition and the Calibration-Resolution Principle: A New Perspective on Forecast Evaluation
Authors:
Marc-Oliver Pohle
Abstract:
I provide a unifying perspective on forecast evaluation, characterizing accurate forecasts of all types, from simple point to complete probabilistic forecasts, in terms of two fundamental underlying properties, autocalibration and resolution, which can be interpreted as describing a lack of systematic mistakes and a high information content. This "calibration-resolution principle" gives new insight into the nature of forecasting and generalizes the famous sharpness principle of Gneiting et al. (2007) from probabilistic forecasts to all types of forecasts. Among other things, it exposes the shortcomings of several widely used forecast evaluation methods. The principle is based on a fully general version of the Murphy decomposition of loss functions, which I provide. Special cases of this decomposition are well known and widely used in meteorology.
I first introduce the decomposition and the underlying properties in a proper theoretical framework, accompanied by an illustrative example. Besides using the decomposition in this new theoretical way, I also employ it in its classical sense as a forecast evaluation method, as meteorologists do. As such, it unveils the driving forces behind forecast errors and complements classical forecast evaluation methods. I discuss estimation of the decomposition via kernel regression and then apply it to popular economic forecasts. Analyses of mean forecasts from the US Survey of Professional Forecasters and of quantile forecasts derived from Bank of England fan charts yield interesting new insights and highlight the potential of the method.
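For squared error loss and mean forecasts, the Murphy decomposition takes the familiar form of mean squared error = miscalibration - resolution + uncertainty. The sketch below verifies this identity on simulated data. It assumes forecasts that take only a few distinct values, so that the conditional mean of the outcome given the forecast can be estimated by simple group averages; this replaces the kernel regression mentioned in the abstract and is a simplification for illustration. The data-generating process and names are made up.

```python
import random
from collections import defaultdict

def murphy_decomposition(forecasts, outcomes):
    """Return (mse, miscalibration, resolution, uncertainty) for mean
    forecasts under squared error, grouping by distinct forecast value."""
    n = len(outcomes)
    y_bar = sum(outcomes) / n
    groups = defaultdict(list)
    for x, y in zip(forecasts, outcomes):
        groups[x].append(y)

    mcb = res = 0.0
    for x, ys in groups.items():
        cond_mean = sum(ys) / len(ys)         # estimate of E[Y | forecast = x]
        weight = len(ys) / n
        mcb += weight * (x - cond_mean) ** 2      # systematic mistakes
        res += weight * (cond_mean - y_bar) ** 2  # information content
    unc = sum((y - y_bar) ** 2 for y in outcomes) / n
    mse = sum((x - y) ** 2 for x, y in zip(forecasts, outcomes)) / n
    return mse, mcb, res, unc

rng = random.Random(2)
signal = [rng.choice([-1, 0, 1]) for _ in range(3000)]
outcomes = [s + rng.gauss(0, 1) for s in signal]
# A forecaster who sees the signal but shrinks it and adds a bias.
forecasts = [0.5 * s + 0.2 for s in signal]

mse, mcb, res, unc = murphy_decomposition(forecasts, outcomes)
print(round(mse, 4), round(mcb - res + unc, 4))   # the two numbers agree
```

Here the forecast has positive resolution because it uses the signal, and positive miscalibration because of the shrinkage and the bias; recalibrating it would lower the mean squared error by the miscalibration term.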
Submitted 4 May, 2020;
originally announced May 2020.