-
Non-linear shrinkage of the price return covariance matrix is far from optimal for portfolio optimisation
Authors:
Christian Bongiorno,
Damien Challet
Abstract:
Portfolio optimization requires sophisticated covariance estimators that are able to filter out estimation noise. Non-linear shrinkage is a popular estimator based on how the Oracle eigenvalues can be computed using only data from the calibration window. Contrary to common belief, NLS is not optimal for portfolio optimization because it does not minimize the right cost function when the asset dependence structure is non-stationary. We instead derive the optimal target. Using historical data, we quantify by how much NLS can be improved. Our findings reopen the question of how to build the covariance matrix estimator for portfolio optimization in realistic conditions.
Submitted 13 October, 2022; v1 submitted 14 December, 2021;
originally announced December 2021.
-
Cleaning the covariance matrix of strongly nonstationary systems with time-independent eigenvalues
Authors:
Christian Bongiorno,
Damien Challet,
Grégoire Loeper
Abstract:
We propose a data-driven way to reduce the noise of covariance matrices of nonstationary systems. In the case of stationary systems, asymptotic approaches were proved to converge to the optimal solutions. Such methods produce eigenvalues that are highly dependent on the inputs, as common sense would suggest. Our approach proposes instead to use a set of eigenvalues totally independent from the inputs and that encode the long-term averaging of the influence of the future on present eigenvalues. Such an influence can be the predominant factor in nonstationary systems. Using real and synthetic data, we show that our data-driven method outperforms optimal methods designed for stationary systems for the filtering of both the covariance matrix and its inverse, as illustrated by financial portfolio variance minimization, which makes our method generically relevant to many problems of multivariate inference.
Submitted 9 March, 2023; v1 submitted 25 November, 2021;
originally announced November 2021.
-
Financial factors selection with knockoffs: fund replication, explanatory and prediction networks
Authors:
Damien Challet,
Christian Bongiorno,
Guillaume Pelletier
Abstract:
We apply the knockoff procedure to factor selection in finance. By building fake but realistic factors, this procedure makes it possible to control the fraction of false discovery in a given set of factors. To show its versatility, we apply it to fund replication and to the inference of explanatory and prediction networks.
Submitted 10 March, 2021;
originally announced March 2021.
-
Reactive Global Minimum Variance Portfolios with $k-$BAHC covariance cleaning
Authors:
Christian Bongiorno,
Damien Challet
Abstract:
We introduce a $k$-fold boosted version of our Bootstrapped Average Hierarchical Clustering cleaning procedure for correlation and covariance matrices. We then apply this method to global minimum variance portfolios for various values of $k$ and compare their performance with other state-of-the-art methods. Generally, we find that our method yields better Sharpe ratios after transaction costs than competing filtering methods, despite requiring a larger turnover.
Submitted 9 March, 2023; v1 submitted 18 May, 2020;
originally announced May 2020.
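For readers unfamiliar with the global minimum variance (GMV) portfolios mentioned in this abstract, the standard fully invested, unconstrained GMV weights are $w = \Sigma^{-1}\mathbf{1} / (\mathbf{1}^\top \Sigma^{-1} \mathbf{1})$. The sketch below only illustrates that textbook formula; the function name `gmv_weights` and the toy covariance matrix are assumptions made for illustration, and no covariance cleaning step (such as $k$-BAHC) is implemented here.

```python
import numpy as np


def gmv_weights(cov):
    """Textbook GMV weights w = S^{-1} 1 / (1' S^{-1} 1).

    Hypothetical helper: `cov` is any symmetric positive-definite
    covariance estimate (raw or cleaned); short positions are allowed.
    """
    cov = np.asarray(cov, dtype=float)
    ones = np.ones(cov.shape[0])
    # Solve S x = 1 rather than forming the inverse explicitly.
    x = np.linalg.solve(cov, ones)
    return x / x.sum()


# Toy two-asset covariance matrix (made-up numbers, not from the paper).
toy_cov = np.array([[0.04, 0.01],
                    [0.01, 0.09]])
weights = gmv_weights(toy_cov)
```

In practice the quality of the result depends on how `cov` is estimated, which is the question the listed papers address.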
-
Covariance matrix filtering with bootstrapped hierarchies
Authors:
Christian Bongiorno,
Damien Challet
Abstract:
Statistical inference of the dependence between objects often relies on covariance matrices. Unless the number of features (e.g. data points) is much larger than the number of objects, covariance matrix cleaning is necessary to reduce estimation noise. We propose a method that is robust yet flexible enough to account for fine details of the structure of the covariance matrix. Robustness comes from using a hierarchical ansatz and dependence averaging between clusters; flexibility comes from a bootstrap procedure. This method finds several possible hierarchical structures in DNA microarray gene expression data, and leads to lower realized risk in global minimum variance portfolios than current filtering methods when the number of data points is relatively small.
Submitted 12 March, 2020;
originally announced March 2020.
-
Nonparametric sign prediction of high-dimensional correlation matrix coefficients
Authors:
Christian Bongiorno,
Damien Challet
Abstract:
We introduce a method to predict which correlation matrix coefficients are likely to change their signs in the future in the high-dimensional regime, i.e. when the number of features is larger than the number of samples per feature. The stability of correlation signs, two-by-two relationships, is found to depend on three-by-three relationships inspired by Heider social cohesion theory in this regime. We apply our method to US and Hong Kong equities historical data to illustrate how the structure of correlation matrices influences the stability of the sign of their coefficients.
Submitted 30 January, 2020;
originally announced January 2020.
-
Deep Prediction of Investor Interest: a Supervised Clustering Approach
Authors:
Baptiste Barreau,
Laurent Carlier,
Damien Challet
Abstract:
We propose a novel deep learning architecture suitable for the prediction of investor interest for a given asset in a given time frame. This architecture performs both investor clustering and modelling at the same time. We first verify its superior performance on a synthetic scenario inspired by real data and then apply it to two real-world databases, a publicly available dataset about the position of investors in the Spanish stock market and proprietary data from BNP Paribas Corporate and Institutional Banking.
Submitted 26 February, 2021; v1 submitted 11 September, 2019;
originally announced September 2019.
-
The market nanostructure origin of asset price time reversal asymmetry
Authors:
Marcus Cordi,
Damien Challet,
Serge Kassibrakis
Abstract:
We introduce a framework to infer lead-lag networks between the states of elements of complex systems, determined at different timescales. As such networks encode the causal structure of a system, inferring lead-lag networks for many pairs of timescales provides a global picture of the mutual influence between timescales. We apply our method to two trader-resolved FX data sets and document strong and complex asymmetric influence of timescales on the structure of lead-lag networks. Expectedly, this asymmetry extends to trader activity: for institutional clients in our dataset, past activity on timescales longer than 3 hours is more correlated with future activity at shorter timescales than the opposite (Zumbach effect), while a reverse Zumbach effect is found for past timescales shorter than 3 hours; retail clients have a totally different, and much more intricate, structure of asymmetric timescale influence. The causality structures are clearly caused by markedly different behaviors of the two types of traders. Hence, market nanostructure, i.e., market dynamics at the individual trader level, provides an unprecedented insight into the causality structure of financial markets, which is much more complex than previously thought.
Submitted 7 April, 2020; v1 submitted 3 January, 2019;
originally announced January 2019.
-
Testing the causality of Hawkes processes with time reversal
Authors:
Marcus Cordi,
Damien Challet,
Ioane Muni Toke
Abstract:
We show that univariate and symmetric multivariate Hawkes processes are only weakly causal: the true log-likelihoods of real and reversed event time vectors are almost equal, thus parameter estimation via maximum likelihood only weakly depends on the direction of the arrow of time. In ideal (synthetic) conditions, tests of goodness of parametric fit unambiguously reject backward event times, which implies that inferring kernels from time-symmetric quantities, such as the autocovariance of the event rate, only rarely produces statistically significant fits. Finally, we find that fitting financial data with many-parameter kernels may yield significant fits for both arrows of time for the same event time vector, sometimes favouring the backward time direction. This goes to show that a significant fit of Hawkes processes to real data with flexible kernels does not imply a definite arrow of time unless one tests it.
Submitted 25 September, 2017;
originally announced September 2017.
-
Statistically validated network of portfolio overlaps and systemic risk
Authors:
Stanislao Gualdi,
Giulio Cimini,
Kevin Primicerio,
Riccardo Di Clemente,
Damien Challet
Abstract:
Common asset holding by financial institutions, namely portfolio overlap, is nowadays regarded as an important channel for financial contagion with the potential to trigger fire sales and thus severe losses at the systemic level. In this paper we propose a method to assess the statistical significance of the overlap between pairs of heterogeneously diversified portfolios, which then allows us to build a validated network of financial institutions where links indicate potential contagion channels due to realized portfolio overlaps. The method is implemented on a historical database of institutional holdings ranging from 1999 to the end of 2013, but can be in general applied to any bipartite network where the presence of similar sets of neighbors is of interest. We find that the proportion of validated network links (i.e., of statistically significant overlaps) increased steadily before the 2007-2008 global financial crisis and reached a maximum when the crisis occurred. We argue that the nature of this measure implies that systemic risk from fire sales liquidation was maximal at that time. After a sharp drop in 2008, systemic risk resumed its growth in 2009, with a notable acceleration in 2013, reaching levels not seen since 2007. We finally show that market trends tend to be amplified in the portfolios identified by the algorithm, such that it is possible to have an informative signal about financial institutions that are about to suffer (enjoy) the most significant losses (gains).
Submitted 27 September, 2016; v1 submitted 18 March, 2016;
originally announced March 2016.
-
Sharper asset ranking from total drawdown durations
Authors:
Damien Challet
Abstract:
The total duration of drawdowns is shown to provide a moment-free, unbiased, efficient and robust estimator of Sharpe ratios both for Gaussian and heavy-tailed price returns. We then use this quantity to infer an analytic expression of the bias of moment-based Sharpe ratio estimators as a function of the return distribution tail exponent. The heterogeneity of tail exponents at any given time among assets implies that our new method yields significantly different asset rankings than those of moment-based methods, especially in periods of large volatility. This is fully confirmed by using 20 years of historical data on 3449 liquid US equities.
Submitted 8 February, 2017; v1 submitted 6 May, 2015;
originally announced May 2015.
-
One- and two-sample nonparametric tests for the signal-to-noise ratio based on record statistics
Authors:
Damien Challet
Abstract:
A new family of nonparametric statistics, the r-statistics, is introduced. It consists of counting the number of records of the cumulative sum of the sample. The single-sample r-statistic is almost as powerful as Student's t-statistic for Gaussian and uniformly distributed variables, and more powerful than the sign and Wilcoxon signed-rank statistics as long as the data are not too heavy-tailed.
Three two-sample parametric r-statistics are proposed, one with a higher specificity but a smaller sensitivity than Mann-Whitney U-test and the other one a higher sensitivity but a smaller specificity. A nonparametric two-sample r-statistic is introduced, whose power is very close to that of Welch statistic for Gaussian or uniformly distributed variables.
Submitted 13 July, 2015; v1 submitted 18 February, 2015;
originally announced February 2015.
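The record counting described in this abstract can be sketched in a few lines. This is a minimal illustration, not the paper's exact definition: it assumes the cumulative sum starts from 0, that a record is a strict new maximum (or minimum) of that sum, and that a signed statistic can be formed as upper records minus lower records. The names `record_counts` and `signed_record_stat` are hypothetical.

```python
from itertools import accumulate


def record_counts(sample):
    """Count upper and lower records of the cumulative sum of `sample`.

    Assumption: the running extremes start at 0, the value of the
    cumulative sum before any observation is added.
    """
    upper = lower = 0
    running_max = running_min = 0.0
    for partial_sum in accumulate(sample):
        if partial_sum > running_max:
            upper += 1
            running_max = partial_sum
        elif partial_sum < running_min:
            lower += 1
            running_min = partial_sum
    return upper, lower


def signed_record_stat(sample):
    """Upper minus lower record counts (an assumed form of the statistic)."""
    upper, lower = record_counts(sample)
    return upper - lower


# Cumulative sums are 1, -1, 2, -3: two upper and two lower records.
counts = record_counts([1, -2, 3, -5])
```

Because only the ordering of partial sums matters, such a count does not rely on sample moments, which is consistent with the nonparametric framing in the abstract.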