-
Dual Interpretation of Machine Learning Forecasts
Authors:
Philippe Goulet Coulombe,
Maximilian Goebel,
Karin Klieber
Abstract:
Machine learning predictions are typically interpreted as the sum of contributions of predictors. Yet, each out-of-sample prediction can also be expressed as a linear combination of in-sample values of the predicted variable, with weights corresponding to pairwise proximity scores between current and past economic events. While this dual route leads nowhere in some contexts (e.g., large cross-sectional datasets), it provides sparser interpretations in settings with many regressors and little training data, such as macroeconomic forecasting. In this case, the sequence of contributions can be visualized as a time series, allowing analysts to explain predictions as quantifiable combinations of historical analogies. Moreover, the weights can be viewed as those of a data portfolio, inspiring new diagnostic measures such as forecast concentration, short position, and turnover. We show how weights can be retrieved seamlessly for (kernel) ridge regression, random forest, boosted trees, and neural networks. Then, we apply these tools to analyze post-pandemic forecasts of inflation, GDP growth, and recession probabilities. In all cases, the approach opens the black box from a new angle and demonstrates how machine learning models leverage history partly repeating itself.
Submitted 17 December, 2024;
originally announced December 2024.
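The dual representation described in the abstract above can be illustrated for the simplest listed case, plain ridge regression. The following is a minimal sketch and not the authors' implementation: the function name `ridge_dual_weights`, the simulated data, and the penalty value `lam` are all assumptions made for illustration, and the model has no intercept. It shows that a ridge forecast can be written both as predictor contributions (`x_new @ beta`) and as a weighted sum of in-sample target values (`w @ y`).

```python
import numpy as np


def ridge_dual_weights(X, x_new, lam):
    """Return weights w such that the ridge forecast at x_new equals w @ y.

    Hypothetical helper for illustration; assumes no intercept.
    """
    n_features = X.shape[1]
    # Primal form: beta = (X'X + lam*I)^{-1} X'y, forecast = x_new @ beta.
    # Regrouping terms: forecast = [X (X'X + lam*I)^{-1} x_new] @ y.
    return X @ np.linalg.solve(X.T @ X + lam * np.eye(n_features), x_new)


# Simulated data (an assumption, not taken from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))            # 40 in-sample observations, 5 predictors
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=40)
x_new = rng.normal(size=5)              # out-of-sample predictor vector
lam = 1.0                               # ridge penalty, chosen arbitrarily

# Dual view: one weight per in-sample observation.
w = ridge_dual_weights(X, x_new, lam)
dual_forecast = float(w @ y)

# Primal view: one coefficient per predictor.
beta = np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ y)
primal_forecast = float(x_new @ beta)
```

Both forecasts agree up to floating-point error. The vector `w` has one entry per training observation, so it can be plotted against time to see which past periods drive a given prediction.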
-
Unveiling True Talent: The Soccer Factor Model for Skill Evaluation
Authors:
Alexandre Andorra,
Maximilian Göbel
Abstract:
Evaluating a soccer player's performance can be challenging due to the high costs and small margins involved in recruitment decisions. Raw observational statistics further complicate an accurate individual skill assessment as they do not abstract from the potentially confounding factor of team strength. We introduce the Soccer Factor Model (SFM), which corrects this bias by isolating a player's true skill from the team's influence. We compile a novel data set, web-scraped from publicly available data sources. Our empirical application draws on information on 144 players, playing a total of over 33,000 matches, in seasons 2000/01 through 2023/24. Not only does the SFM allow for a structural interpretation of a player's skill, but it also stands out against more reduced-form benchmarks in terms of forecast accuracy. Moreover, we propose Skill- and Performance Above Replacement as metrics for fair cross-player comparisons. These, for example, allow us to settle the discussion about the GOAT of soccer in the first quarter of the twenty-first century.
Submitted 8 December, 2024;
originally announced December 2024.
-
Maximally Forward-Looking Core Inflation
Authors:
Philippe Goulet Coulombe,
Karin Klieber,
Christophe Barrette,
Maximilian Goebel
Abstract:
Timely monetary policy decision-making requires timely core inflation measures. We create a new core inflation series that is explicitly designed to succeed at that goal. Precisely, we introduce the Assemblage Regression, a generalized nonnegative ridge regression problem that optimizes the price index's subcomponent weights such that the aggregate is maximally predictive of future headline inflation. Ordering subcomponents according to their rank in each period switches the algorithm to be learning supervised trimmed inflation - or, put differently, the maximally forward-looking summary statistic of the realized price changes distribution. In an extensive out-of-sample forecasting experiment for the US and the euro area, we find substantial improvements for signaling medium-term inflation developments in both the pre- and post-Covid years. Those coming from the supervised trimmed version are particularly striking, and are attributable to a highly asymmetric trimming which contrasts with conventional indicators. We also find that this metric was indicating first upward pressures on inflation as early as mid-2020 and quickly captured the turning point in 2022. We also consider extensions, like assembling inflation from geographical regions, trimmed temporal aggregation, and building core measures specialized for either upside or downside inflation risks.
Submitted 8 April, 2024;
originally announced April 2024.
-
Maximally Machine-Learnable Portfolios
Authors:
Philippe Goulet Coulombe,
Maximilian Goebel
Abstract:
When it comes to stock returns, any form of predictability can bolster risk-adjusted profitability. We develop a collaborative machine learning algorithm that optimizes portfolio weights so that the resulting synthetic security is maximally predictable. Precisely, we introduce MACE, a multivariate extension of Alternating Conditional Expectations that achieves the aforementioned goal by wielding a Random Forest on one side of the equation, and a constrained Ridge Regression on the other. There are two key improvements with respect to Lo and MacKinlay's original maximally predictable portfolio approach. First, it accommodates any (nonlinear) forecasting algorithm and predictor set. Second, it handles large portfolios. We conduct exercises at the daily and monthly frequency and report significant increases in predictability and profitability using very little conditioning information. Interestingly, predictability is found in bad as well as good times, and MACE successfully navigates the debacle of 2022.
Submitted 4 April, 2024; v1 submitted 8 June, 2023;
originally announced June 2023.
-
Assessing and Comparing Fixed-Target Forecasts of Arctic Sea Ice: Glide Charts for Feature-Engineered Linear Regression and Machine Learning Models
Authors:
Francis X. Diebold,
Maximilian Goebel,
Philippe Goulet Coulombe
Abstract:
We use "glide charts" (plots of sequences of root mean squared forecast errors as the target date is approached) to evaluate and compare fixed-target forecasts of Arctic sea ice. We first use them to evaluate the simple feature-engineered linear regression (FELR) forecasts of Diebold and Goebel (2021), and to compare FELR forecasts to naive pure-trend benchmark forecasts. Then we introduce a much more sophisticated feature-engineered machine learning (FEML) model, and we use glide charts to evaluate FEML forecasts and compare them to a FELR benchmark. Our substantive results include the frequent appearance of predictability thresholds, which differ across months, meaning that accuracy initially fails to improve as the target date is approached but then increases progressively once a threshold lead time is crossed. Also, we find that FEML can improve appreciably over FELR when forecasting "turning point" months in the annual cycle at horizons of one to three months ahead.
Submitted 8 June, 2023; v1 submitted 21 June, 2022;
originally announced June 2022.
-
When Will Arctic Sea Ice Disappear? Projections of Area, Extent, Thickness, and Volume
Authors:
Francis X. Diebold,
Glenn D. Rudebusch,
Maximilian Goebel,
Philippe Goulet Coulombe,
Boyuan Zhang
Abstract:
Rapidly diminishing Arctic summer sea ice is a strong signal of the pace of global climate change. We provide point, interval, and density forecasts for four measures of Arctic sea ice: area, extent, thickness, and volume. Importantly, we enforce the joint constraint that these measures must simultaneously arrive at an ice-free Arctic. We apply this constrained joint forecast procedure to models relating sea ice to atmospheric carbon dioxide concentration and models relating sea ice directly to time. The resulting "carbon-trend" and "time-trend" projections are mutually consistent and predict a nearly ice-free summer Arctic Ocean by the mid-2030s with an 80% probability. Moreover, the carbon-trend projections show that global adoption of a lower carbon path would likely delay the arrival of a seasonally ice-free Arctic by only a few years.
Submitted 23 May, 2023; v1 submitted 8 March, 2022;
originally announced March 2022.
-
Zombie-Lending in the United States -- Prevalence versus Relevance
Authors:
Maximilian Göbel,
Nuno Tavares
Abstract:
Extraordinary fiscal and monetary interventions in response to the COVID-19 pandemic have revived concerns about zombie prevalence in advanced economies. Within a sample of publicly listed U.S. companies, we find zombie prevalence and zombie-lending not to be a widespread phenomenon per se. Nevertheless, our results reveal negative spillovers of zombie-lending on productivity, capital-growth, and employment-growth of non-zombies as well as on overall business dynamism. It is predominantly the class of healthy small- and medium-sized companies that is sensitive to zombie-lending activities, with financial constraints further amplifying these effects.
Submitted 3 July, 2022; v1 submitted 23 January, 2022;
originally announced January 2022.
-
On Spurious Causality, CO2, and Global Temperature
Authors:
Philippe Goulet Coulombe,
Maximilian Göbel
Abstract:
Stips, Macias, Coughlan, Garcia-Gorriz, and Liang (2016, Nature Scientific Reports) use information flows (Liang, 2008, 2014) to establish causality from various forcings to global temperature. We show that the formulas being used hinge on a simplifying assumption that is nearly always rejected by the data. We propose an adequate measure of information flow based on Vector Autoregressions, and find that most results in Stips et al. (2016) cannot be corroborated. Then, it is discussed which modeling choices (e.g., the choice of CO2 series and assumptions about simultaneous relationships) may help in extracting credible estimates of causal flows and the transient climate response simply by looking at the joint dynamics of two climatic time series.
Submitted 18 March, 2021;
originally announced March 2021.
-
A Benchmark Model for Fixed-Target Arctic Sea Ice Forecasting
Authors:
Francis X. Diebold,
Maximilian Gobel
Abstract:
We propose a reduced-form benchmark predictive model (BPM) for fixed-target forecasting of Arctic sea ice extent, and we provide a case study of its real-time performance for target date September 2020. We visually detail the evolution of the statistically-optimal point, interval, and density forecasts as time passes, new information arrives, and the end of September approaches. Comparison to the BPM may prove useful for evaluating and selecting among various more sophisticated dynamical sea ice models, which are widely used to quantify the likely future evolution of Arctic conditions and their two-way interaction with economic activity.
Submitted 2 January, 2022; v1 submitted 25 January, 2021;
originally announced January 2021.
-
Arctic Amplification of Anthropogenic Forcing: A Vector Autoregressive Analysis
Authors:
Philippe Goulet Coulombe,
Maximilian Göbel
Abstract:
On September 15th 2020, Arctic sea ice extent (SIE) ranked second-to-lowest in history and keeps trending downward. The understanding of how feedback loops amplify the effects of external CO2 forcing is still limited. We propose the VARCTIC, which is a Vector Autoregression (VAR) designed to capture and extrapolate Arctic feedback loops. VARs are dynamic simultaneous systems of equations, routinely estimated to predict and understand the interactions of multiple macroeconomic time series. The VARCTIC is a parsimonious compromise between full-blown climate models and purely statistical approaches that usually offer little explanation of the underlying mechanism. Our completely unconditional forecast has SIE hitting 0 in September by the 2060s. Impulse response functions reveal that anthropogenic CO2 emission shocks have an unusually durable effect on SIE -- a property shared by no other shock. We find Albedo- and Thickness-based feedbacks to be the main amplification channels through which CO2 anomalies impact SIE in the short/medium run. Further, conditional forecast analyses reveal that the future path of SIE crucially depends on the evolution of CO2 emissions, with outcomes ranging from recovering SIE to it reaching 0 in the 2050s. Finally, Albedo and Thickness feedbacks are shown to play an important role in accelerating the speed at which predicted SIE is heading towards 0.
Submitted 9 March, 2021; v1 submitted 5 May, 2020;
originally announced May 2020.
-
Optimal Combination of Arctic Sea Ice Extent Measures: A Dynamic Factor Modeling Approach
Authors:
Francis X. Diebold,
Maximilian Göbel,
Philippe Goulet Coulombe,
Glenn D. Rudebusch,
Boyuan Zhang
Abstract:
The diminishing extent of Arctic sea ice is a key indicator of climate change as well as an accelerant for future global warming. Since 1978, Arctic sea ice has been measured using satellite-based microwave sensing; however, different measures of Arctic sea ice extent have been made available based on differing algorithmic transformations of the raw satellite data. We propose and estimate a dynamic factor model that combines four of these measures in an optimal way that accounts for their differing volatility and cross-correlations. We then use the Kalman smoother to extract an optimal combined measure of Arctic sea ice extent. It turns out that almost all weight is put on the NSIDC Sea Ice Index, confirming and enhancing confidence in the Sea Ice Index and the NASA Team algorithm on which it is based.
Submitted 12 August, 2020; v1 submitted 31 March, 2020;
originally announced March 2020.