-
FDR-Controlled Portfolio Optimization for Sparse Financial Index Tracking
Authors:
Jasin Machkour,
Daniel P. Palomar,
Michael Muma
Abstract:
In high-dimensional data analysis, such as financial index tracking or biomedical applications, it is crucial to select the few relevant variables while maintaining control over the false discovery rate (FDR). In these applications, strong dependencies often exist among the variables (e.g., stock returns), which can undermine the FDR control property of existing methods like the model-X knockoff method or the T-Rex selector. To address this issue, we have expanded the T-Rex framework to accommodate overlapping groups of highly correlated variables. This is achieved by integrating a nearest neighbors penalization mechanism into the framework, which provably controls the FDR at the user-defined target level. A real-world example of sparse index tracking demonstrates the proposed method's ability to accurately track the S&P 500 index over the past 20 years based on a small number of stocks. An open-source implementation is provided within the R package TRexSelector on CRAN.
Submitted 30 January, 2024; v1 submitted 26 January, 2024;
originally announced January 2024.
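As a point of reference for the abstract above, the false discovery rate is the expected value of the false discovery proportion (FDP), i.e., the share of selected variables that are not truly relevant. The sketch below computes the FDP of a single selection in plain Python. The function name and the ticker lists are hypothetical illustrations; this is neither the T-Rex selector nor the TRexSelector API.

```python
def false_discovery_proportion(selected, relevant):
    """Share of selected variables that are not in the truly relevant set."""
    selected = set(selected)
    relevant = set(relevant)
    false_discoveries = len(selected - relevant)
    # max(..., 1) avoids division by zero when nothing is selected
    return false_discoveries / max(len(selected), 1)


# Hypothetical example: four stocks selected, two of them truly relevant
selected = ["AAPL", "MSFT", "XOM", "KO"]
relevant = ["AAPL", "MSFT", "JPM"]
fdp = false_discovery_proportion(selected, relevant)
print(f"FDP = {fdp:.2f}")  # prints "FDP = 0.50"
```

Averaging this quantity over repeated experiments would give an empirical estimate of the FDR.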
-
Algorithms for Learning Graphs in Financial Markets
Authors:
José Vinícius de Miranda Cardoso,
Jiaxi Ying,
Daniel Perez Palomar
Abstract:
In the past two decades, the field of applied finance has tremendously benefited from graph theory. As a result, novel methods ranging from asset network estimation to hierarchical asset selection and portfolio allocation are now part of practitioners' toolboxes. In this paper, we investigate the fundamental problem of learning undirected graphical models under Laplacian structural constraints from the point of view of financial market time series data. In particular, we present natural justifications, supported by empirical evidence, for the usage of the Laplacian matrix as a model for the precision matrix of financial assets, while also establishing a direct link that reveals how Laplacian constraints are coupled to meaningful physical interpretations related to the market index factor and to conditional correlations between stocks. Those interpretations lead to a set of guidelines that practitioners should be aware of when estimating graphs in financial markets. In addition, we design numerical algorithms based on the alternating direction method of multipliers to learn undirected, weighted graphs that take into account stylized facts that are intrinsic to financial data such as heavy tails and modularity. We illustrate how to leverage the learned graphs in practical scenarios such as stock time series clustering and foreign exchange network estimation. The proposed graph learning algorithms outperform the state-of-the-art methods in an extensive set of practical experiments. Furthermore, we obtain theoretical and empirical convergence results for the proposed algorithms. Along with the developed methodologies for graph learning in financial markets, we release an R package, called fingraph, accommodating the code and data to obtain all the experimental results.
Submitted 30 December, 2020;
originally announced December 2020.
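To make the Laplacian structure mentioned in the abstract above concrete: for an undirected graph with nonnegative edge weights W, the combinatorial Laplacian is L = D - W, where D is the diagonal matrix of weighted degrees, so every row of L sums to zero and the off-diagonal entries are nonpositive. The sketch below builds L from a given weight matrix in plain Python. The function name and the three-asset weights are hypothetical; this only illustrates the constraint set and is not the ADMM-based estimator or the fingraph API.

```python
def graph_laplacian(weights):
    """Combinatorial Laplacian L = D - W of a symmetric, nonnegative weight matrix."""
    n = len(weights)
    laplacian = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                laplacian[i][j] = -weights[i][j]
        # the diagonal holds the weighted degree of node i
        laplacian[i][i] = sum(weights[i][j] for j in range(n) if j != i)
    return laplacian


# Hypothetical edge weights between three assets
W = [
    [0.0, 0.5, 0.2],
    [0.5, 0.0, 0.1],
    [0.2, 0.1, 0.0],
]
L = graph_laplacian(W)
for row in L:
    print(row)
```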
-
Solving High-Order Portfolios via Successive Convex Approximation Algorithms
Authors:
Rui Zhou,
Daniel P. Palomar
Abstract:
The first moment and the second central moment of the portfolio return, a.k.a. mean and variance, have been widely employed to assess the expected profit and risk of the portfolio. Investors pursue higher mean and lower variance when designing portfolios. The two moments can well describe the distribution of the portfolio return when it follows the Gaussian distribution. However, the real-world distribution of asset returns is usually asymmetric and heavy-tailed, which is far from being a Gaussian distribution. The asymmetry and the heavy-tailedness are characterized by the third and fourth central moments, i.e., skewness and kurtosis, respectively. Higher skewness and lower kurtosis are preferred to reduce the probability of extreme losses. However, incorporating high-order moments in the portfolio design is very difficult due to their non-convexity and rapidly increasing computational cost with the dimension. In this paper, we propose a very efficient and provably convergent algorithmic framework based on the successive convex approximation (SCA) algorithm to solve high-order portfolios. The efficiency of the proposed algorithm framework is demonstrated by the numerical experiments.
Submitted 3 August, 2020;
originally announced August 2020.
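The four moments discussed in the abstract above can be estimated from data for a fixed weight vector. The sketch below computes the sample mean, variance, and the standardized third and fourth central moments (skewness and kurtosis) of the portfolio return in plain Python. The function name, the simulated returns, and the equal weights are hypothetical; it only evaluates the moments and does not implement the SCA-based portfolio design.

```python
import random


def portfolio_moments(returns, weights):
    """Sample mean, variance, skewness, and kurtosis of the portfolio return w^T r_t."""
    r = [sum(w * x for w, x in zip(weights, row)) for row in returns]
    n = len(r)
    mean = sum(r) / n
    centered = [x - mean for x in r]
    var = sum(c ** 2 for c in centered) / n
    # standardized third and fourth central moments
    skew = sum(c ** 3 for c in centered) / n / var ** 1.5
    kurt = sum(c ** 4 for c in centered) / n / var ** 2
    return mean, var, skew, kurt


# Hypothetical data: 500 periods of returns for 4 assets, equal weights
random.seed(0)
returns = [[random.gauss(0.0, 0.01) for _ in range(4)] for _ in range(500)]
weights = [0.25] * 4
mean, var, skew, kurt = portfolio_moments(returns, weights)
print(mean, var, skew, kurt)
```

Because the third and fourth moments are cubic and quartic in the weights, evaluating them is cheap for fixed weights, while optimizing over the weights is the hard, non-convex part the abstract refers to.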
-
Learning Undirected Graphs in Financial Markets
Authors:
José Vinícius de Miranda Cardoso,
Daniel P. Palomar
Abstract:
We investigate the problem of learning undirected graphical models under Laplacian structural constraints from the point of view of financial market data. We show that Laplacian constraints have meaningful physical interpretations related to the market index factor and to the conditional correlations between stocks. Those interpretations lead to a set of guidelines that users should be aware of when estimating graphs in financial markets. In addition, we propose algorithms to learn undirected graphs that account for stylized facts and tasks intrinsic to financial data such as non-stationarity and stock clustering.
Submitted 9 November, 2020; v1 submitted 20 May, 2020;
originally announced May 2020.
-
Parameter Estimation of Heavy-Tailed AR Model with Missing Data via Stochastic EM
Authors:
Junyan Liu,
Sandeep Kumar,
Daniel P. Palomar
Abstract:
The autoregressive (AR) model is a widely used model to understand time series data. Traditionally, the innovation noise of the AR is modeled as Gaussian. However, many time series applications, for example, financial time series data, are non-Gaussian; therefore, the AR model with more general heavy-tailed innovations is preferred. Another issue that frequently occurs in time series is missing values, due to system data record failure or unexpected data loss. Although there are numerous works about Gaussian AR time series with missing values, as far as we know, there does not exist any work addressing the issue of missing data for the heavy-tailed AR model. In this paper, we consider this issue for the first time, and propose an efficient framework for parameter estimation from incomplete heavy-tailed time series based on a stochastic approximation expectation maximization (SAEM) coupled with a Markov Chain Monte Carlo (MCMC) procedure. The proposed algorithm is computationally cheap and easy to implement. The convergence of the proposed algorithm to a stationary point of the observed data likelihood is rigorously proved. Extensive simulations and analyses of real datasets demonstrate the efficacy of the proposed framework.
Submitted 9 February, 2019; v1 submitted 19 September, 2018;
originally announced September 2018.
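The data setting described in the abstract above can be illustrated by simulation. The sketch below generates an AR(1) series with Student-t innovations and then hides observations at random, marking them as None. The AR order of one, the missing-at-random mechanism, the function name, and all parameter values are assumptions made for illustration; the sketch only produces incomplete heavy-tailed data and does not implement the SAEM-MCMC estimator.

```python
import math
import random


def simulate_t_ar1(n, phi, df, scale, miss_prob, seed=0):
    """AR(1) series with Student-t innovations; missing observations are None."""
    rng = random.Random(seed)
    series = []
    prev = 0.0
    for _ in range(n):
        # Student-t draw: standard normal divided by sqrt(chi-square / df)
        z = rng.gauss(0.0, 1.0)
        v = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
        innovation = scale * z / math.sqrt(v / df)
        # the latent process keeps evolving even when the value is not recorded
        prev = phi * prev + innovation
        series.append(None if rng.random() < miss_prob else prev)
    return series


# Hypothetical parameters: 200 samples, 10% of the observations missing
y = simulate_t_ar1(n=200, phi=0.5, df=3.0, scale=1.0, miss_prob=0.1, seed=1)
n_missing = sum(value is None for value in y)
print(f"{n_missing} of {len(y)} observations are missing")
```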
-
Sparse Reduced Rank Regression With Nonconvex Regularization
Authors:
Ziping Zhao,
Daniel P. Palomar
Abstract:
In this paper, the estimation problem for the sparse reduced rank regression (SRRR) model is considered. The SRRR model is widely used for dimension reduction and variable selection with applications in signal processing, econometrics, etc. The problem is formulated to minimize the least squares loss with a sparsity-inducing penalty considering an orthogonality constraint. Convex sparsity-inducing functions have been used for SRRR in the literature. In this work, a nonconvex function is proposed to induce sparsity more effectively. An efficient algorithm is developed based on the alternating minimization (or projection) method to solve the nonconvex optimization problem. Numerical simulations show that the proposed algorithm is much more efficient compared to the benchmark methods and the nonconvex function can result in a better estimation accuracy.
Submitted 20 March, 2018;
originally announced March 2018.
-
Optimal Portfolio Design for Statistical Arbitrage in Finance
Authors:
Ziping Zhao,
Rui Zhou,
Zhongju Wang,
Daniel P. Palomar
Abstract:
In this paper, the optimal mean-reverting portfolio (MRP) design problem is considered, which plays an important role for the statistical arbitrage (a.k.a. pairs trading) strategy in financial markets. The target of the optimal MRP design is to construct a portfolio from the underlying assets that can exhibit a satisfactory mean reversion property and a desirable variance property. A general problem formulation is proposed by considering these two targets and an investment leverage constraint. To solve this problem, a successive convex approximation method is used. The performance of the proposed model and algorithms is verified by numerical simulations.
Submitted 8 March, 2018;
originally announced March 2018.
-
Robust Maximum Likelihood Estimation of Sparse Vector Error Correction Model
Authors:
Ziping Zhao,
Daniel P. Palomar
Abstract:
In econometrics and finance, the vector error correction model (VECM) is an important time series model for cointegration analysis, which is used to estimate the long-run equilibrium variable relationships. The traditional analysis and estimation methodologies assume an underlying Gaussian distribution but, in practice, heavy-tailed data and outliers can lead to the inapplicability of these methods. In this paper, we propose a robust model estimation method based on the Cauchy distribution to tackle this issue. In addition, sparse cointegration relations are considered to realize feature selection and dimension reduction. An efficient algorithm based on the majorization-minimization (MM) method is applied to solve the proposed nonconvex problem. The performance of this algorithm is shown through numerical simulations.
Submitted 16 October, 2017;
originally announced October 2017.
-
Mean-Reverting Portfolio Design with Budget Constraint
Authors:
Ziping Zhao,
Daniel P. Palomar
Abstract:
This paper considers the mean-reverting portfolio design problem arising from statistical arbitrage in the financial markets. We first propose a general problem formulation aimed at finding a portfolio of underlying component assets by optimizing a mean-reversion criterion characterizing the mean-reversion strength, taking into consideration the variance of the portfolio and an investment budget constraint. Then several specific problems are considered based on the general formulation, and efficient algorithms are proposed. Numerical results on both synthetic and market data show that our proposed mean-reverting portfolio design methods can generate consistent profits and outperform the traditional design methods and the benchmark methods in the literature.
Submitted 18 January, 2017;
originally announced January 2017.
-
Mean-Reverting Portfolio Design via Majorization-Minimization Method
Authors:
Ziping Zhao,
Daniel P. Palomar
Abstract:
This paper considers the mean-reverting portfolio design problem arising from statistical arbitrage in the financial markets. The problem is formulated by optimizing a criterion characterizing the mean-reversion strength of the portfolio and taking into consideration the variance of the portfolio and an investment budget constraint at the same time. An efficient algorithm based on the majorization-minimization (MM) method is proposed to solve the problem. Numerical results show that our proposed mean-reverting portfolio design method can significantly outperform every underlying single spread and the benchmark method in the literature.
Submitted 25 November, 2016;
originally announced November 2016.
-
Performance analysis and optimal selection of large mean-variance portfolios under estimation risk
Authors:
Francisco Rubio,
Xavier Mestre,
Daniel P. Palomar
Abstract:
We study the consistency of sample mean-variance portfolios of arbitrarily high dimension that are based on Bayesian or shrinkage estimation of the input parameters as well as weighted sampling. In an asymptotic setting where the number of assets remains comparable in magnitude to the sample size, we provide a characterization of the estimation risk by providing deterministic equivalents of the portfolio out-of-sample performance in terms of the underlying investment scenario. The previous estimates represent a means of quantifying the amount of risk underestimation and return overestimation of improved portfolio constructions beyond standard ones. Well-known for the latter, if not corrected, these deviations lead to inaccurate and overly optimistic Sharpe-based investment decisions. Our results are based on recent contributions in the field of random matrix theory. Along with the asymptotic analysis, the analytical framework allows us to find bias corrections improving on the achieved out-of-sample performance of typical portfolio constructions. Some numerical simulations validate our theoretical findings.
Submitted 16 October, 2011;
originally announced October 2011.