MIDAS-QR with 2-Dimensional Structure

Authors: Tibor Szendrei, Arnab Bhattacharjee, Mark E. Schaffer

Abstract: Mixed frequency data has been shown to improve the performance of growth-at-risk models in the literature. Most of the research has focused on imposing structure on the high-frequency lags when estimating MIDAS-QR models, akin to what is done in mean models. However, only imposing structure on the lag dimension can potentially induce quantile variation that would otherwise not be there. In this paper we extend the framework by introducing structure on both the lag dimension and the quantile dimension. In this way we are able to shrink unnecessary quantile variation in the high-frequency variables. This leads to more gradual lag profiles in both dimensions compared to the MIDAS-QR and UMIDAS-QR. We show that this proposed method leads to further gains in nowcasting and forecasting on a pseudo-out-of-sample exercise on US data.

Submitted 21 June, 2024; originally announced June 2024.

arXiv:2403.14036

Fused LASSO as Non-Crossing Quantile Regression

Authors: Tibor Szendrei, Arnab Bhattacharjee, Mark E. Schaffer

Abstract: Growth-at-Risk is vital for empirical macroeconomics but is often susceptible to quantile crossing due to data limitations. While existing literature addresses this through post-processing of the fitted quantiles, these methods do not correct the estimated coefficients. We advocate for imposing non-crossing constraints during estimation and demonstrate their equivalence to fused LASSO with quantile-specific shrinkage parameters. By re-examining Growth-at-Risk through an interquantile shrinkage lens, we achieve improved left-tail forecasts and better identification of variables that drive quantile variation. We show that these improvements have ramifications for policy tools such as Expected Shortfall and Quantile Local Projections.

Submitted 18 April, 2025; v1 submitted 20 March, 2024; originally announced March 2024.
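
Illustration (not from the paper): the abstract above links non-crossing constraints to a fused LASSO penalty with quantile-specific shrinkage parameters. The sketch below only shows the ingredients such an objective is usually built from: the check (pinball) loss, an L1 penalty on differences between coefficients at adjacent quantile levels, and a test for crossing. The function names, the choice to penalize every coefficient (including any intercept), and the layout of `betas` are assumptions made for illustration, not the authors' implementation.

```python
from typing import Sequence


def pinball_loss(y: float, fitted: float, tau: float) -> float:
    """Check (pinball) loss of one observation at quantile level tau."""
    u = y - fitted
    return u * tau if u >= 0 else u * (tau - 1.0)


def fused_penalty(betas: Sequence[Sequence[float]],
                  lambdas: Sequence[float]) -> float:
    """Sum over adjacent quantiles q of lambdas[q] * ||beta[q+1] - beta[q]||_1.

    betas[q][j] is the coefficient on regressor j at the q-th quantile level
    (hypothetical layout); lambdas[q] is a quantile-specific shrinkage weight.
    """
    total = 0.0
    for q in range(len(betas) - 1):
        l1_diff = sum(abs(b1 - b0) for b0, b1 in zip(betas[q], betas[q + 1]))
        total += lambdas[q] * l1_diff
    return total


def has_crossing(fitted_by_quantile: Sequence[float]) -> bool:
    """True if fitted quantiles, ordered by increasing tau, are not monotone."""
    pairs = zip(fitted_by_quantile, fitted_by_quantile[1:])
    return any(upper < lower for lower, upper in pairs)
```

A large `lambdas[q]` pulls the coefficients at quantiles q and q+1 together, which is the sense in which interquantile shrinkage limits quantile variation.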

arXiv:2401.01645

Model Averaging and Double Machine Learning

Authors: Achim Ahrens, Christian B. Hansen, Mark E. Schaffer, Thomas Wiemann

Abstract: This paper discusses pairing double/debiased machine learning (DDML) with stacking, a model averaging method for combining multiple candidate learners, to estimate structural parameters. In addition to conventional stacking, we consider two stacking variants available for DDML: short-stacking exploits the cross-fitting step of DDML to substantially reduce the computational burden and pooled stacking enforces common stacking weights over cross-fitting folds. Using calibrated simulation studies and two applications estimating gender gaps in citations and wages, we show that DDML with stacking is more robust to partially unknown functional forms than common alternative approaches based on single pre-selected learners. We provide Stata and R software implementing our proposals.

Submitted 25 September, 2024; v1 submitted 3 January, 2024; originally announced January 2024.
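
Illustration (not from the paper): the cross-fitting step mentioned in the abstract above can be sketched for a partially linear model y = theta*d + g(x) + e. Each fold's nuisance predictions come from a learner fitted on the other folds, and theta is then estimated by regressing the outcome residuals on the treatment residuals. The univariate OLS learner, the interleaved fold assignment, the simulated data, and all names below are assumptions chosen to keep the sketch dependency-free; the paper's proposals use stacked learners in place of the single learner shown here.

```python
import random
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Univariate OLS (intercept, slope); stands in for any machine learner."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def cross_fit_residuals(x: Sequence[float], target: Sequence[float],
                        n_folds: int) -> List[float]:
    """Out-of-fold residuals: each fold is predicted by a fit on the others."""
    n = len(x)
    resid = [0.0] * n
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        a, b = fit_line([x[i] for i in train], [target[i] for i in train])
        for i in range(k, n, n_folds):  # held-out observations of fold k
            resid[i] = target[i] - (a + b * x[i])
    return resid


def ddml_partial_linear(y: Sequence[float], d: Sequence[float],
                        x: Sequence[float], n_folds: int = 5) -> float:
    """Residual-on-residual estimate of theta in y = theta*d + g(x) + e."""
    ry = cross_fit_residuals(x, y, n_folds)
    rd = cross_fit_residuals(x, d, n_folds)
    return sum(a * b for a, b in zip(rd, ry)) / sum(b * b for b in rd)


# Simulated data with a true theta of 2.0 (illustrative values only).
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
d = [0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]
y = [2.0 * di + 3.0 * xi + rng.gauss(0.0, 0.1) for di, xi in zip(d, x)]
theta_hat = ddml_partial_linear(y, d, x)
```

With this data-generating process `theta_hat` should land close to 2.0, since the nuisance function is linear and the learner is correctly specified.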

arXiv:2301.09397

ddml: Double/debiased machine learning in Stata

Authors: Achim Ahrens, Christian B. Hansen, Mark E. Schaffer, Thomas Wiemann

Abstract: We introduce the package ddml for Double/Debiased Machine Learning (DDML) in Stata. Estimators of causal parameters for five different econometric models are supported, allowing for flexible estimation of causal effects of endogenous variables in settings with unknown functional forms and/or many exogenous variables. ddml is compatible with many existing supervised machine learning programs in Stata. We recommend using DDML in combination with stacking estimation, which combines multiple machine learners into a final predictor. We provide Monte Carlo evidence to support our recommendation.

Submitted 6 January, 2024; v1 submitted 23 January, 2023; originally announced January 2023.

Comments: Tutorials and installations can be found at https://statalasso.github.io/

arXiv:2208.10896

pystacked: Stacking generalization and machine learning in Stata

Authors: Achim Ahrens, Christian B. Hansen, Mark E. Schaffer

Abstract: pystacked implements stacked generalization (Wolpert, 1992) for regression and binary classification via Python's scikit-learn. Stacking combines multiple supervised machine learners -- the "base" or "level-0" learners -- into a single learner. The currently supported base learners include regularized regression, random forest, gradient boosted trees, support vector machines, and feed-forward neural nets (multi-layer perceptron). pystacked can also be used as a "regular" machine learning program to fit a single base learner and, thus, provides an easy-to-use API for scikit-learn's machine learning algorithms.

Submitted 6 March, 2023; v1 submitted 23 August, 2022; originally announced August 2022.

Comments: The pystacked package is available here: https://github.com/aahrens1/pystacked
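
Illustration (not from the package): the idea of stacking described in the abstract above can be sketched with two toy base learners. Each learner produces out-of-fold predictions, and a weight in [0, 1] is chosen to minimize the squared error of the weighted combination. This is plain Python rather than pystacked or scikit-learn; the learners, the two-learner closed-form weight, and every name below are assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence

Model = Callable[[float], float]
Learner = Callable[[Sequence[float], Sequence[float]], Model]


def mean_learner(x: Sequence[float], y: Sequence[float]) -> Model:
    """Base learner 1: always predicts the training mean."""
    m = sum(y) / len(y)
    return lambda _x: m


def ols_learner(x: Sequence[float], y: Sequence[float]) -> Model:
    """Base learner 2: univariate least-squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return lambda x_new: my + slope * (x_new - mx)


def out_of_fold(learner: Learner, x: Sequence[float], y: Sequence[float],
                n_folds: int) -> List[float]:
    """Cross-validated predictions: fold k is predicted from the other folds."""
    n = len(x)
    preds = [0.0] * n
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        model = learner([x[i] for i in train], [y[i] for i in train])
        for i in range(k, n, n_folds):
            preds[i] = model(x[i])
    return preds


def stack_two(x: Sequence[float], y: Sequence[float], first: Learner,
              second: Learner, n_folds: int = 5) -> float:
    """Weight w on `first` (1 - w on `second`) minimizing out-of-fold squared
    error, with both weights non-negative and summing to one."""
    p1 = out_of_fold(first, x, y, n_folds)
    p2 = out_of_fold(second, x, y, n_folds)
    num = sum((yi - b) * (a - b) for yi, a, b in zip(y, p1, p2))
    den = sum((a - b) ** 2 for a, b in zip(p1, p2))
    if den == 0.0:  # identical predictions: any weight is optimal
        return 0.5
    return min(1.0, max(0.0, num / den))


# Simulated data where the linear learner should dominate (illustrative only).
rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [2.0 * xi + rng.gauss(0.0, 0.1) for xi in x]
w_ols = stack_two(x, y, ols_learner, mean_learner)
```

Because the simulated outcome is nearly linear in `x`, almost all of the weight should go to the OLS learner. With more than two learners the weights would be found with a constrained least-squares solver instead of the closed form used here.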

arXiv:1901.05397

lassopack: Model selection and prediction with regularized regression in Stata

Authors: Achim Ahrens, Christian B. Hansen, Mark E. Schaffer

Abstract: This article introduces lassopack, a suite of programs for regularized regression in Stata. lassopack implements lasso, square-root lasso, elastic net, ridge regression, adaptive lasso and post-estimation OLS. The methods are suitable for the high-dimensional setting where the number of predictors $p$ may be large and possibly greater than the number of observations, $n$. We offer three different approaches for selecting the penalization ("tuning") parameters: information criteria (implemented in lasso2), $K$-fold cross-validation and $h$-step ahead rolling cross-validation for cross-section, panel and time-series data (cvlasso), and theory-driven ("rigorous") penalization for the lasso and square-root lasso for cross-section and panel data (rlasso). We discuss the theoretical framework and practical considerations for each approach. We also present Monte Carlo results to compare the performance of the penalization approaches.

Submitted 16 January, 2019; originally announced January 2019.

Comments: 52 pages, 6 figures, 6 tables; submitted to Stata Journal; for more information see https://statalasso.github.io/
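
Illustration (not from the package): the lasso named in the abstract above can be sketched with cyclic coordinate descent and soft-thresholding. The objective assumed here is (1/(2n)) * RSS + lam * ||beta||_1 with no intercept and no predictor standardization; lassopack's own scaling of the penalty and its algorithms may differ, and the function names are hypothetical.

```python
from typing import List, Sequence


def soft_threshold(z: float, lam: float) -> float:
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X: Sequence[Sequence[float]], y: Sequence[float],
             lam: float, n_iter: int = 200) -> List[float]:
    """Cyclic coordinate descent for (1/(2n))*||y - X b||^2 + lam*||b||_1.

    Assumes y and the columns of X are already centered (no intercept)
    and that no column of X is identically zero.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals y - X beta, kept up to date
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(c * c for c in col) / n
            # Correlation of column j with the partial residual
            # (the residual with coordinate j's contribution added back).
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / z
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new
    return beta


# Small orthogonal design: the second predictor has a weak signal.
X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
y = [2.0, -2.0, 0.1, -0.1]
beta_lasso = lasso_cd(X, y, lam=0.1)  # weak coefficient is set to zero
beta_ols = lasso_cd(X, y, lam=0.0)    # no penalty: least-squares solution
```

In practice `lam` would be chosen by one of the approaches the abstract lists (information criteria, cross-validation, or theory-driven penalization) rather than fixed by hand as it is here.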

Showing 1–6 of 6 results for author: Schaffer, M E