-
Latent Factor Analysis in Short Panels
Authors:
Alain-Philippe Fortin,
Patrick Gagliardini,
Olivier Scaillet
Abstract:
We develop inferential tools for latent factor analysis in short panels. The pseudo maximum likelihood setting under a large cross-sectional dimension n and a fixed time series dimension T relies on a diagonal TxT covariance matrix of the errors without imposing sphericity or Gaussianity. We outline the asymptotic distributions of the latent factor and error covariance estimates as well as of an asymptotically uniformly most powerful invariant (AUMPI) test for the number of factors based on the likelihood ratio statistic. We derive the AUMPI characterization from inequalities ensuring the monotone likelihood ratio property for positive definite quadratic forms in normal variables. An empirical application to a large panel of monthly U.S. stock returns separates systematic and idiosyncratic risks month by month in short subperiods of bear vs. bull markets based on the selected number of factors. We observe an uptrend in the paths of total and idiosyncratic volatilities, while the systematic risk explains a large part of the cross-sectional total variance in bear markets but is not driven by a single factor. Rank tests show that observed factors struggle to span latent factors, with the discrepancy between the dimensions of the two factor spaces decreasing over time.
Submitted 30 May, 2024; v1 submitted 24 June, 2023;
originally announced June 2023.
-
Eigenvalue tests for the number of latent factors in short panels
Authors:
Alain-Philippe Fortin,
Patrick Gagliardini,
Olivier Scaillet
Abstract:
This paper studies new tests for the number of latent factors in a large cross-sectional factor model with small time dimension. These tests are based on the eigenvalues of variance-covariance matrices of (possibly weighted) asset returns, and rely on either the assumption of spherical errors or instrumental variables for factor betas. We establish the asymptotic distributional results using expansion theorems based on perturbation theory for symmetric matrices. Our framework accommodates semi-strong factors in the systematic components. We propose a novel statistical test for weak factors against strong or semi-strong factors. We provide an empirical application to US equity data. Evidence for a different number of latent factors in market downturns and market upturns is statistically ambiguous in the considered subperiods. In particular, our results contradict the common wisdom of a single factor model in bear markets.
Submitted 28 October, 2022;
originally announced October 2022.
-
A penalized two-pass regression to predict stock returns with time-varying risk premia
Authors:
Gaetan Bakalli,
Stéphane Guerrier,
Olivier Scaillet
Abstract:
We develop a penalized two-pass regression with time-varying factor loadings. The penalization in the first pass enforces sparsity for the time-variation drivers while also maintaining compatibility with the no-arbitrage restrictions by regularizing appropriate groups of coefficients. The second pass delivers risk premia estimates to predict equity excess returns. Our Monte Carlo results and our empirical results on a large cross-sectional data set of US individual stocks show that penalization without grouping can lead to nearly all estimated time-varying models violating the no-arbitrage restrictions. Moreover, our results demonstrate that the proposed method reduces the prediction errors compared to a penalized approach without appropriate grouping or a time-invariant factor model.
Submitted 1 August, 2022;
originally announced August 2022.
-
SWAG: A Wrapper Method for Sparse Learning
Authors:
Roberto Molinari,
Gaetan Bakalli,
Stéphane Guerrier,
Cesare Miglioli,
Samuel Orso,
Mucyo Karemera,
Olivier Scaillet
Abstract:
The majority of machine learning methods and algorithms give high priority to prediction performance which may not always correspond to the priority of the users. In many cases, practitioners and researchers in different fields, going from engineering to genetics, require interpretability and replicability of the results especially in settings where, for example, not all attributes may be available to them. As a consequence, there is the need to make the outputs of machine learning algorithms more interpretable and to deliver a library of "equivalent" learners (in terms of prediction performance) that users can select based on attribute availability in order to test and/or make use of these learners for predictive/diagnostic purposes. To address these needs, we propose to study a procedure that combines screening and wrapper approaches which, based on a user-specified learning method, greedily explores the attribute space to find a library of sparse learners with consequent low data collection and storage costs. This new method (i) delivers a low-dimensional network of attributes that can be easily interpreted and (ii) increases the potential replicability of results based on the diversity of attribute combinations defining strong learners with equivalent predictive power. We call this algorithm "Sparse Wrapper AlGorithm" (SWAG).
Submitted 31 October, 2021; v1 submitted 23 June, 2020;
originally announced June 2020.
-
Spanning analysis of stock market anomalies under Prospect Stochastic Dominance
Authors:
Stelios Arvanitis,
Olivier Scaillet,
Nikolas Topaloglou
Abstract:
We develop and implement methods for determining whether introducing new securities or relaxing investment constraints improves the investment opportunity set for prospect investors. We formulate a new testing procedure for prospect spanning for two nested portfolio sets based on subsampling and Linear Programming. In an application, we use the prospect spanning framework to evaluate whether well-known anomalies are spanned by standard factors. We find that of the strategies considered, many expand the opportunity set of the prospect type investors, thus have real economic value for them. In-sample and out-of-sample results prove remarkably consistent in identifying genuine anomalies for prospect investors.
Submitted 6 April, 2020;
originally announced April 2020.
-
Saddlepoint approximations for spatial panel data models
Authors:
Chaonan Jiang,
Davide La Vecchia,
Elvezio Ronchetti,
Olivier Scaillet
Abstract:
We develop new higher-order asymptotic techniques for the Gaussian maximum likelihood estimator in a spatial panel data model, with fixed effects, time-varying covariates, and spatially correlated errors. Our saddlepoint density and tail area approximation feature relative error of order $O(1/(n(T-1)))$ with $n$ being the cross-sectional dimension and $T$ the time-series dimension. The main theoretical tool is the tilted-Edgeworth technique in a non-identically distributed setting. The density approximation is always non-negative, does not need resampling, and is accurate in the tails. Monte Carlo experiments on density approximation and testing in the presence of nuisance parameters illustrate the good performance of our approximation over first-order asymptotics and Edgeworth expansions. An empirical application to the investment-saving relationship in OECD (Organisation for Economic Co-operation and Development) countries shows disagreement between testing results based on first-order asymptotics and saddlepoint techniques.
Submitted 12 July, 2021; v1 submitted 22 January, 2020;
originally announced January 2020.
-
A Higher-Order Correct Fast Moving-Average Bootstrap for Dependent Data
Authors:
Davide La Vecchia,
Alban Moor,
Olivier Scaillet
Abstract:
We develop and implement a novel fast bootstrap for dependent data. Our scheme is based on the i.i.d. resampling of the smoothed moment indicators. We characterize the class of parametric and semi-parametric estimation problems for which the method is valid. We show the asymptotic refinements of the proposed procedure, proving that it is higher-order correct under mild assumptions on the time series, the estimating functions, and the smoothing kernel. We illustrate the applicability and the advantages of our procedure for Generalized Empirical Likelihood estimation. As a by-product, our fast bootstrap provides higher-order correct asymptotic confidence distributions. Monte Carlo simulations on an autoregressive conditional duration model provide numerical evidence that the novel bootstrap yields higher-order accurate confidence intervals. A real-data application on dynamics of trading volume of stocks illustrates the advantage of our method over the routinely-applied first-order asymptotic theory, when the underlying distribution of the test statistic is skewed or fat-tailed.
Submitted 17 January, 2022; v1 submitted 14 January, 2020;
originally announced January 2020.
-
A diagnostic criterion for approximate factor structure
Authors:
Patrick Gagliardini,
Elisa Ossola,
Olivier Scaillet
Abstract:
We build a simple diagnostic criterion for approximate factor structure in large cross-sectional equity datasets. Given a model for asset returns with observable factors, the criterion checks whether the error terms are weakly cross-sectionally correlated or share at least one unobservable common factor. It only requires computing the largest eigenvalue of the empirical cross-sectional covariance matrix of the residuals of a large unbalanced panel. A general version of this criterion allows us to determine the number of omitted common factors. The panel data model accommodates both time-invariant and time-varying factor structures. The theory applies to random coefficient panel models with interactive fixed effects under large cross-section and time-series dimensions. The empirical analysis runs on monthly and quarterly returns for about ten thousand US stocks from January 1968 to December 2011 for several time-invariant and time-varying specifications. For monthly returns, we can choose either among time-invariant specifications with at least four financial factors, or a scaled three-factor specification. For quarterly returns, we cannot select macroeconomic models without the market factor.
Submitted 7 August, 2017; v1 submitted 15 December, 2016;
originally announced December 2016.
-
Assessing multivariate predictors of financial market movements: A latent factor framework for ordinal data
Authors:
Philippe Huber,
Olivier Scaillet,
Maria-Pia Victoria-Feser
Abstract:
Much of the trading activity in equity markets is directed to brokerage houses. In exchange they provide so-called "soft dollars," which basically are amounts spent on "research" for identifying profitable trading opportunities. Soft dollars represent about USD 1 out of every USD 10 paid in commissions. Obviously they are costly, and it is interesting for an institutional investor to determine whether soft dollar inputs are worth being used (and indirectly paid for) or not, from a statistical point of view. To address this question, we develop association measures between what broker-dealers predict and what markets realize. Our data are ordinal predictions by two broker-dealers and realized values on several markets, on the same ordinal scale. We develop a structural equation model with latent variables in an ordinal setting which allows us to test broker-dealer predictive ability of financial market movements. We use a multivariate logit model in a latent factor framework, develop a tractable estimator based on a Laplace approximation, and show its consistency and asymptotic normality. Monte Carlo experiments reveal that both the estimation method and the testing procedure perform well in small samples. The method is then used to analyze our dataset.
Submitted 5 June, 2009;
originally announced June 2009.