-
Missing Data in Signal Processing and Machine Learning: Models, Methods and Modern Approaches
Authors:
Alexandre Hippert-Ferrer,
Aude Sportisse,
Amirhossein Javaheri,
Mohammed Nabil El Korso,
Daniel P. Palomar
Abstract:
This tutorial aims to provide signal processing (SP) and machine learning (ML) practitioners with vital tools, in an accessible way, to answer the question: How to deal with missing data? There are many strategies to handle incomplete signals. In this paper, we propose to group these strategies based on three common tasks: i) missing-data imputation, ii) estimation with missing values and iii) prediction with missing values. We focus on methodological and experimental results through specific case studies on real-world applications. Promising and future research directions, including a better integration of informative missingness, are also discussed. We hope that the proposed conceptual framework and the presentation of recent related missing-data problems will encourage researchers of the SP and ML communities to develop original methods and to efficiently deal with new applications involving missing data.
Submitted 3 June, 2025; v1 submitted 2 June, 2025;
originally announced June 2025.
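As an illustration of the first task listed in the abstract above (missing-data imputation), the following is a minimal sketch of column-mean imputation with NumPy. It is a simple baseline chosen here for illustration, not a method attributed to the tutorial; the function name `impute_column_means` and the toy matrix are assumptions.

```python
import numpy as np


def impute_column_means(X):
    """Replace each NaN by the mean of the observed entries in its column."""
    X = np.asarray(X, dtype=float)
    # nanmean ignores NaNs; a column with no observed entry would stay NaN.
    col_means = np.nanmean(X, axis=0)
    # col_means has shape (p,) and broadcasts across the rows of X.
    return np.where(np.isnan(X), col_means, X)


# Hypothetical toy data with two missing entries.
X = [[1.0, np.nan],
     [3.0, 4.0],
     [np.nan, 8.0]]
# Column means of the observed entries are 2.0 and 6.0.
X_filled = impute_column_means(X)
print(X_filled)
```

Mean imputation ignores dependencies between variables and the missingness mechanism, which is one reason the abstract distinguishes imputation from estimation and prediction with missing values.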
-
The Informed Elastic Net for Fast Grouped Variable Selection and FDR Control in Genomics Research
Authors:
Jasin Machkour,
Michael Muma,
Daniel P. Palomar
Abstract:
Modern genomics research relies on genome-wide association studies (GWAS) to identify the few genetic variants among potentially millions that are associated with diseases of interest. Only reproducible discoveries of groups of associations improve our understanding of complex polygenic diseases and enable the development of new drugs and personalized medicine. Thus, fast multivariate variable selection methods that have a high true positive rate (TPR) while controlling the false discovery rate (FDR) are crucial. Recently, the T-Rex+GVS selector, a version of the T-Rex selector that uses the elastic net (EN) as a base selector to perform grouped variable selection, was proposed. Although it significantly increased the TPR in simulated GWAS compared to the original T-Rex, its comparatively high computational cost limits scalability. Therefore, we propose the informed elastic net (IEN), a new base selector that significantly reduces computation time while retaining the grouped variable selection property. We quantify its grouping effect and derive its formulation as a Lasso-type optimization problem, which is solved efficiently within the T-Rex framework by the terminated LARS algorithm. Numerical simulations and a GWAS study demonstrate that the proposed T-Rex+GVS (IEN) exhibits the desired grouping effect, reduces computation time, and achieves the same TPR as T-Rex+GVS (EN) but with lower FDR, which makes it a promising method for large-scale GWAS.
Submitted 7 October, 2024;
originally announced October 2024.
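The abstract above evaluates selectors by their TPR and FDR. As a minimal sketch of how these quantities are computed for a single selection result in a simulation where the true active set is known, the snippet below returns the false discovery proportion (FDP, whose expectation is the FDR) and the TPR. The function name `fdp_and_tpr` and the toy index sets are assumptions for illustration and do not come from the T-Rex software.

```python
def fdp_and_tpr(selected, true_support):
    """Return (FDP, TPR) for one selection result.

    FDP = (# selected null variables) / max(1, # selected variables)
    TPR = (# selected active variables) / max(1, # active variables)
    """
    selected = set(selected)
    true_support = set(true_support)
    true_positives = len(selected & true_support)
    false_positives = len(selected - true_support)
    fdp = false_positives / max(1, len(selected))
    tpr = true_positives / max(1, len(true_support))
    return fdp, tpr


# Hypothetical toy result: variables 0..4 are truly active,
# and a selector returned {0, 1, 2, 7}.
fdp, tpr = fdp_and_tpr(selected=[0, 1, 2, 7], true_support=[0, 1, 2, 3, 4])
print(fdp, tpr)  # 0.25 0.6
```

Averaging the FDP over many independent simulation runs gives an empirical estimate of the FDR, which is then compared with the target level.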
-
False Discovery Rate Control for Fast Screening of Large-Scale Genomics Biobanks
Authors:
Jasin Machkour,
Michael Muma,
Daniel P. Palomar
Abstract:
Genomics biobanks are information treasure troves with thousands of phenotypes (e.g., diseases, traits) and millions of single nucleotide polymorphisms (SNPs). The development of methodologies that provide reproducible discoveries is essential for the understanding of complex diseases and precision drug development. Without statistical reproducibility guarantees, valuable efforts are spent on researching false positives. Therefore, scalable multivariate and high-dimensional false discovery rate (FDR)-controlling variable selection methods are urgently needed, especially for complex polygenic diseases and traits. In this work, we propose the Screen-T-Rex selector, a fast FDR-controlling method based on the recently developed T-Rex selector. The method is tailored to screening large-scale biobanks and it does not require choosing additional parameters (sparsity parameter, target FDR level, etc.). Numerical simulations and a real-world HIV-1 drug resistance example demonstrate that the performance of the Screen-T-Rex selector is superior, and its computation time is multiple orders of magnitude lower compared to current benchmark knockoff methods.
Submitted 7 October, 2024;
originally announced October 2024.
-
High-Dimensional False Discovery Rate Control for Dependent Variables
Authors:
Jasin Machkour,
Michael Muma,
Daniel P. Palomar
Abstract:
Algorithms that ensure reproducible findings from large-scale, high-dimensional data are pivotal in numerous signal processing applications. In recent years, multivariate false discovery rate (FDR) controlling methods have emerged, providing guarantees even in high-dimensional settings where the number of variables surpasses the number of samples. However, these methods often fail to reliably control the FDR in the presence of highly dependent variable groups, a common characteristic in fields such as genomics and finance. To tackle this critical issue, we introduce a novel framework that accounts for general dependency structures. Our proposed dependency-aware T-Rex selector integrates hierarchical graphical models within the T-Rex framework to effectively harness the dependency structure among variables. Leveraging martingale theory, we prove that our variable penalization mechanism ensures FDR control. We further generalize the FDR-controlling framework by stating and proving a clear condition necessary for designing both graphical and non-graphical models that capture dependencies. Additionally, we formulate a fully integrated optimal calibration algorithm that concurrently determines the parameters of the graphical model and the T-Rex framework, such that the FDR is controlled while maximizing the number of selected variables. Numerical experiments and a breast cancer survival analysis use-case demonstrate that the proposed method is the only one among the state-of-the-art benchmark methods that controls the FDR and reliably detects genes that have been previously identified to be related to breast cancer. An open-source implementation is available within the R package TRexSelector on CRAN.
Submitted 30 January, 2024; v1 submitted 28 January, 2024;
originally announced January 2024.
-
FDR-Controlled Portfolio Optimization for Sparse Financial Index Tracking
Authors:
Jasin Machkour,
Daniel P. Palomar,
Michael Muma
Abstract:
In high-dimensional data analysis, such as financial index tracking or biomedical applications, it is crucial to select the few relevant variables while maintaining control over the false discovery rate (FDR). In these applications, strong dependencies often exist among the variables (e.g., stock returns), which can undermine the FDR control property of existing methods like the model-X knockoff method or the T-Rex selector. To address this issue, we have expanded the T-Rex framework to accommodate overlapping groups of highly correlated variables. This is achieved by integrating a nearest neighbors penalization mechanism into the framework, which provably controls the FDR at the user-defined target level. A real-world example of sparse index tracking demonstrates the proposed method's ability to accurately track the S&P 500 index over the past 20 years based on a small number of stocks. An open-source implementation is provided within the R package TRexSelector on CRAN.
Submitted 30 January, 2024; v1 submitted 26 January, 2024;
originally announced January 2024.
-
Sparse PCA with False Discovery Rate Controlled Variable Selection
Authors:
Jasin Machkour,
Arnaud Breloy,
Michael Muma,
Daniel P. Palomar,
Frédéric Pascal
Abstract:
Sparse principal component analysis (PCA) aims at mapping large dimensional data to a linear subspace of lower dimension. By imposing loading vectors to be sparse, it performs the double duty of dimension reduction and variable selection. Sparse PCA algorithms are usually expressed as a trade-off between explained variance and sparsity of the loading vectors (i.e., number of selected variables). As a high explained variance is not necessarily synonymous with relevant information, these methods are prone to select irrelevant variables. To overcome this issue, we propose an alternative formulation of sparse PCA driven by the false discovery rate (FDR). We then leverage the Terminating-Random Experiments (T-Rex) selector to automatically determine an FDR-controlled support of the loading vectors. A major advantage of the resulting T-Rex PCA is that no sparsity parameter tuning is required. Numerical experiments and a stock market data example demonstrate a significant performance improvement.
Submitted 16 January, 2024;
originally announced January 2024.
-
Affine equivariant Tyler's M-estimator applied to tail parameter learning of elliptical distributions
Authors:
Esa Ollila,
Daniel P. Palomar,
Frederic Pascal
Abstract:
We propose estimating the scale parameter (mean of the eigenvalues) of the scatter matrix of an unspecified elliptically symmetric distribution using weights obtained by solving Tyler's M-estimator of the scatter matrix. The proposed Tyler's weights-based estimate (TWE) of scale is then used to construct an affine equivariant Tyler's M-estimator as a weighted sample covariance matrix using normalized Tyler's weights. We then develop a unified framework for estimating the unknown tail parameter of the elliptical distribution (such as the degrees of freedom (d.o.f.) $ν$ of the multivariate $t$ (MVT) distribution). Using the proposed TWE of scale, a new robust estimate of the d.o.f. parameter of the MVT distribution is proposed with excellent performance in heavy-tailed scenarios, outperforming other competing methods. An R package that implements the proposed method is available.
Submitted 7 May, 2023;
originally announced May 2023.
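The abstract above builds on the weights obtained from Tyler's M-estimator of scatter. The snippet below is a minimal sketch of the standard fixed-point iteration for Tyler's estimator on zero-mean data, with the scale fixed by a trace normalization. It is not the paper's TWE of scale or its tail-parameter estimator; the function name `tyler_scatter`, the stopping rule, and the simulated data are assumptions for illustration.

```python
import numpy as np


def tyler_scatter(X, max_iter=500, tol=1e-10):
    """Fixed-point iteration for Tyler's M-estimator of scatter.

    X: (n, p) array of zero-mean observations with n > p and no zero rows.
    Returns a (p, p) shape matrix normalized so that its trace equals p.
    """
    n, p = X.shape
    sigma = np.eye(p)
    for _ in range(max_iter):
        # Quadratic forms x_i^T sigma^{-1} x_i, one per observation.
        d = np.einsum("ij,ij->i", X @ np.linalg.inv(sigma), X)
        w = p / d                             # Tyler's weights
        new = (X * w[:, None]).T @ X / n      # weighted sample covariance
        new *= p / np.trace(new)              # remove the scale ambiguity
        converged = np.linalg.norm(new - sigma, "fro") < tol
        sigma = new
        if converged:
            break
    return sigma


# Hypothetical heavy-tailed sample (independent Student-t entries, 3 d.o.f.).
rng = np.random.default_rng(0)
X = rng.standard_t(df=3, size=(300, 3))
print(tyler_scatter(X))
```

Because each weight divides by the quadratic form of its own observation, rescaling the data leaves the estimate unchanged, which is why a separate scale estimate is needed to obtain a full scatter matrix.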
-
Adaptive Estimation of Graphical Models under Total Positivity
Authors:
Jiaxi Ying,
José Vinícius de M. Cardoso,
Daniel P. Palomar
Abstract:
We consider the problem of estimating (diagonally dominant) M-matrices as precision matrices in Gaussian graphical models. These models exhibit intriguing properties, such as the existence of the maximum likelihood estimator with merely two observations for M-matrices \citep{lauritzen2019maximum,slawski2015estimation} and even one observation for diagonally dominant M-matrices \citep{truell2021maximum}. We propose an adaptive multiple-stage estimation method that refines the estimate by solving a weighted $\ell_1$-regularized problem at each stage. Furthermore, we develop a unified framework based on the gradient projection method to solve the regularized problem, incorporating distinct projections to handle the constraints of M-matrices and diagonally dominant M-matrices. A theoretical analysis of the estimation error is provided. Our method outperforms state-of-the-art methods in precision matrix estimation and graph edge identification, as evidenced by synthetic and financial time-series data sets.
Submitted 8 June, 2023; v1 submitted 27 October, 2022;
originally announced October 2022.
-
The Terminating-Random Experiments Selector: Fast High-Dimensional Variable Selection with False Discovery Rate Control
Authors:
Jasin Machkour,
Michael Muma,
Daniel P. Palomar
Abstract:
We propose the Terminating-Random Experiments (T-Rex) selector, a fast variable selection method for high-dimensional data. The T-Rex selector controls a user-defined target false discovery rate (FDR) while maximizing the number of selected variables. This is achieved by fusing the solutions of multiple early terminated random experiments. The experiments are conducted on a combination of the original predictors and multiple sets of randomly generated dummy predictors. A finite sample proof based on martingale theory for the FDR control property is provided. Numerical simulations confirm that the FDR is controlled at the target level while allowing for high power. We prove that the dummies can be sampled from any univariate probability distribution with finite expectation and variance. The computational complexity of the proposed method is linear in the number of variables. The T-Rex selector outperforms state-of-the-art methods for FDR control in numerical experiments and on a simulated genome-wide association study (GWAS), while its sequential computation time is more than two orders of magnitude lower than that of the strongest benchmark methods. The open source R package TRexSelector containing the implementation of the T-Rex selector is available on CRAN.
Submitted 12 March, 2024; v1 submitted 12 October, 2021;
originally announced October 2021.
-
Does the $\ell_1$-norm Learn a Sparse Graph under Laplacian Constrained Graphical Models?
Authors:
Jiaxi Ying,
José Vinícius de M. Cardoso,
Daniel P. Palomar
Abstract:
We consider the problem of learning a sparse graph under the Laplacian constrained Gaussian graphical models. This problem can be formulated as a penalized maximum likelihood estimation of the Laplacian constrained precision matrix. As in the classical graphical lasso problem, recent works made use of the $\ell_1$-norm regularization with the goal of promoting sparsity in Laplacian constrained precision matrix estimation. However, we find that the widely used $\ell_1$-norm is not effective in imposing a sparse solution in this problem. Through empirical evidence, we observe that the number of nonzero graph weights grows with the increase of the regularization parameter. From a theoretical perspective, we prove that a large regularization parameter will surprisingly lead to a complete graph, i.e., every pair of vertices is connected by an edge. To address this issue, we introduce a nonconvex sparsity penalty, and propose a new estimator by solving a sequence of weighted $\ell_1$-norm penalized sub-problems. We establish the non-asymptotic optimization performance guarantees on both optimization error and statistical error, and prove that the proposed estimator can recover the edges correctly with a high probability. To solve each sub-problem, we develop a projected gradient descent algorithm which enjoys a linear convergence rate. Finally, an extension to learn disconnected graphs is proposed by imposing an additional rank constraint. We propose a numerical algorithm based on the alternating direction method of multipliers, and establish its theoretical sequence convergence. Numerical experiments involving synthetic and real-world data sets demonstrate the effectiveness of the proposed method.
Submitted 5 September, 2023; v1 submitted 26 June, 2020;
originally announced June 2020.
-
Shrinking the eigenvalues of M-estimators of covariance matrix
Authors:
Esa Ollila,
Daniel P. Palomar,
Frédéric Pascal
Abstract:
A highly popular regularized (shrinkage) covariance matrix estimator is the shrinkage sample covariance matrix (SCM) which shares the same set of eigenvectors as the SCM but shrinks its eigenvalues toward the grand mean of the eigenvalues of the SCM. In this paper, a more general approach is considered in which the SCM is replaced by an M-estimator of scatter matrix and a fully automatic data adaptive method to compute the optimal shrinkage parameter with minimum mean squared error is proposed. Our approach permits the use of any weight function such as Gaussian, Huber's, Tyler's, or t-weight functions, all of which are commonly used in the M-estimation framework. Our simulation examples illustrate that shrinkage M-estimators based on the proposed optimal tuning combined with a robust weight function do not lose in performance to the shrinkage SCM estimator when the data is Gaussian, but provide significantly improved performance when the data is sampled from an unspecified heavy-tailed elliptically symmetric distribution. Also, real-world and synthetic stock market data validate the performance of the proposed method in practical applications.
Submitted 28 October, 2020; v1 submitted 17 June, 2020;
originally announced June 2020.
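The shrinkage SCM described in the abstract above can be sketched in a few lines: a convex combination of the SCM and a scaled identity keeps the eigenvectors of the SCM and pulls each eigenvalue toward the grand mean of the eigenvalues. In this sketch the shrinkage weight `beta` is supplied by the user, whereas the paper proposes a data-adaptive choice that is not reproduced here; the function name `shrinkage_scm` and the simulated data are assumptions.

```python
import numpy as np


def shrinkage_scm(X, beta):
    """Linear shrinkage of the SCM toward a scaled identity matrix.

    X: (n, p) data matrix whose rows are observations.
    beta: shrinkage weight in [0, 1]; beta = 1 returns the SCM itself.
    """
    S = np.cov(X, rowvar=False)        # sample covariance matrix, (p, p)
    p = S.shape[0]
    eta = np.trace(S) / p              # grand mean of the eigenvalues of S
    # Each eigenvalue lam of S becomes beta * lam + (1 - beta) * eta.
    return beta * S + (1.0 - beta) * eta * np.eye(p)


# Hypothetical Gaussian sample with few observations relative to p.
rng = np.random.default_rng(1)
X = rng.standard_normal((20, 5))
Sigma_hat = shrinkage_scm(X, beta=0.3)
print(np.linalg.eigvalsh(Sigma_hat))
```

The trace is preserved for every `beta`, so the shrinkage changes the spread of the eigenvalues but not their mean.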
-
Learning Undirected Graphs in Financial Markets
Authors:
José Vinícius de Miranda Cardoso,
Daniel P. Palomar
Abstract:
We investigate the problem of learning undirected graphical models under Laplacian structural constraints from the point of view of financial market data. We show that Laplacian constraints have meaningful physical interpretations related to the market index factor and to the conditional correlations between stocks. Those interpretations lead to a set of guidelines that users should be aware of when estimating graphs in financial markets. In addition, we propose algorithms to learn undirected graphs that account for stylized facts and tasks intrinsic to financial data such as non-stationarity and stock clustering.
Submitted 9 November, 2020; v1 submitted 20 May, 2020;
originally announced May 2020.
-
M-estimators of scatter with eigenvalue shrinkage
Authors:
Esa Ollila,
Daniel P. Palomar,
Frederic Pascal
Abstract:
A popular regularized (shrinkage) covariance estimator is the shrinkage sample covariance matrix (SCM) which shares the same set of eigenvectors as the SCM but shrinks its eigenvalues toward its grand mean. In this paper, a more general approach is considered in which the SCM is replaced by an M-estimator of scatter matrix and a fully automatic data adaptive method to compute the optimal shrinkage parameter with minimum mean squared error is proposed. Our approach permits the use of any weight function such as Gaussian, Huber's, or $t$ weight functions, all of which are commonly used in the M-estimation framework. Our simulation examples illustrate that shrinkage M-estimators based on the proposed optimal tuning combined with a robust weight function do not lose in performance to the shrinkage SCM estimator when the data is Gaussian, but provide significantly improved performance when the data is sampled from a heavy-tailed distribution.
Submitted 12 February, 2020;
originally announced February 2020.
-
Robust Factor Analysis Parameter Estimation
Authors:
Rui Zhou,
Junyan Liu,
Sandeep Kumar,
Daniel P. Palomar
Abstract:
This paper considers the problem of robustly estimating the parameters of a heavy-tailed multivariate distribution when the covariance matrix is known to have the structure of a low-rank matrix plus a diagonal matrix as considered in factor analysis (FA). By assuming the observed data to follow the multivariate Student's t distribution, we can robustly estimate the parameters via maximum likelihood estimation (MLE). However, the MLE of parameters becomes an intractable problem when the multivariate Student's t distribution and the FA structure are both introduced. In this paper, we propose an algorithm based on the generalized expectation maximization (GEM) method to obtain estimators. The robustness of our proposed method is further enhanced to cope with missing values. Finally, we show the performance of our proposed algorithm using both synthetic data and real financial data.
Submitted 27 September, 2019;
originally announced September 2019.
-
Structured Graph Learning Via Laplacian Spectral Constraints
Authors:
Sandeep Kumar,
Jiaxi Ying,
José Vinícius de M. Cardoso,
Daniel P. Palomar
Abstract:
Learning a graph with a specific structure is essential for interpretability and identification of the relationships among data. It is well known that structured graph learning from observed samples is an NP-hard combinatorial problem. In this paper, we first show that for a set of important graph families it is possible to convert the structural constraints into eigenvalue constraints of the graph Laplacian matrix. Then we introduce a unified graph learning framework, lying at the integration of the spectral properties of the Laplacian matrix with Gaussian graphical modeling, that is capable of learning structures of a large class of graph families. The proposed algorithms are provably convergent and practically amenable for large-scale semi-supervised and unsupervised graph-based learning tasks. Extensive numerical experiments with both synthetic and real data sets demonstrate the effectiveness of the proposed methods. An R package containing code for all the experimental results is available at https://cran.r-project.org/package=spectralGraphTopology.
Submitted 24 September, 2019;
originally announced September 2019.
-
Distributed Inexact Successive Convex Approximation ADMM: Analysis-Part I
Authors:
Sandeep Kumar,
Ketan Rajawat,
Daniel P. Palomar
Abstract:
In this two-part work, we propose an algorithmic framework for solving non-convex problems whose objective function is the sum of a number of smooth component functions plus a convex (possibly non-smooth) and/or smooth (possibly non-convex) regularization function. The proposed algorithm incorporates ideas from several existing approaches such as alternating direction method of multipliers (ADMM), successive convex approximation (SCA), distributed and asynchronous algorithms, and inexact gradient methods. Different from a number of existing approaches, however, the proposed framework is flexible enough to incorporate a class of non-convex objective functions, allow distributed operation with and without a fusion center, and include variance-reduced methods as special cases. Remarkably, the proposed algorithms are robust to uncertainties arising from random, deterministic, and adversarial sources. Part I of the paper develops two variants of the algorithm under very mild assumptions and establishes first-order convergence rate guarantees. The proof developed here allows for generic errors and delays, paving the way for different variance-reduced, asynchronous, and stochastic implementations, outlined and evaluated in Part II.
Submitted 21 July, 2019;
originally announced July 2019.
-
Parameter Estimation of Heavy-Tailed AR Model with Missing Data via Stochastic EM
Authors:
Junyan Liu,
Sandeep Kumar,
Daniel P. Palomar
Abstract:
The autoregressive (AR) model is a widely used model to understand time series data. Traditionally, the innovation noise of the AR is modeled as Gaussian. However, many time series applications, for example, financial time series data, are non-Gaussian; therefore, the AR model with more general heavy-tailed innovations is preferred. Another issue that frequently occurs in time series is missing values, due to system data record failure or unexpected data loss. Although there are numerous works about Gaussian AR time series with missing values, as far as we know, there does not exist any work addressing the issue of missing data for the heavy-tailed AR model. In this paper, we consider this issue for the first time, and propose an efficient framework for parameter estimation from incomplete heavy-tailed time series based on a stochastic approximation expectation maximization (SAEM) coupled with a Markov Chain Monte Carlo (MCMC) procedure. The proposed algorithm is computationally cheap and easy to implement. The convergence of the proposed algorithm to a stationary point of the observed data likelihood is rigorously proved. Extensive simulations and analyses of real datasets demonstrate the efficacy of the proposed framework.
Submitted 9 February, 2019; v1 submitted 19 September, 2018;
originally announced September 2018.
-
Sparse Reduced Rank Regression With Nonconvex Regularization
Authors:
Ziping Zhao,
Daniel P. Palomar
Abstract:
In this paper, the estimation problem for the sparse reduced rank regression (SRRR) model is considered. The SRRR model is widely used for dimension reduction and variable selection with applications in signal processing, econometrics, etc. The problem is formulated to minimize the least squares loss with a sparsity-inducing penalty considering an orthogonality constraint. Convex sparsity-inducing functions have been used for SRRR in the literature. In this work, a nonconvex function is proposed to induce sparsity more effectively. An efficient algorithm is developed based on the alternating minimization (or projection) method to solve the nonconvex optimization problem. Numerical simulations show that the proposed algorithm is much more efficient compared to the benchmark methods and the nonconvex function can result in a better estimation accuracy.
Submitted 20 March, 2018;
originally announced March 2018.
-
Robust Maximum Likelihood Estimation of Sparse Vector Error Correction Model
Authors:
Ziping Zhao,
Daniel P. Palomar
Abstract:
In econometrics and finance, the vector error correction model (VECM) is an important time series model for cointegration analysis, which is used to estimate the long-run equilibrium relationships between variables. The traditional analysis and estimation methodologies assume an underlying Gaussian distribution but, in practice, heavy-tailed data and outliers can render these methods inapplicable. In this paper, we propose a robust model estimation method based on the Cauchy distribution to tackle this issue. In addition, sparse cointegration relations are considered to realize feature selection and dimension reduction. An efficient algorithm based on the majorization-minimization (MM) method is applied to solve the proposed nonconvex problem. The performance of this algorithm is shown through numerical simulations.
Submitted 16 October, 2017;
originally announced October 2017.
-
Orthogonal Sparse PCA and Covariance Estimation via Procrustes Reformulation
Authors:
Konstantinos Benidis,
Ying Sun,
Prabhu Babu,
Daniel P. Palomar
Abstract:
The problem of estimating sparse eigenvectors of a symmetric matrix attracts a lot of attention in many applications, especially those involving high-dimensional data sets. While classical eigenvectors can be obtained as the solution of a maximization problem, existing approaches formulate this problem by adding a penalty term to the objective function that encourages a sparse solution. However, the resulting methods achieve sparsity at the expense of the orthogonality property. In this paper, we develop a new method to estimate dominant sparse eigenvectors without trading off their orthogonality. The problem is highly non-convex and hard to handle. We apply the MM framework, where we iteratively maximize a tight lower bound (surrogate function) of the objective function over the Stiefel manifold. The inner maximization problem turns out to be a rectangular Procrustes problem, which has a closed-form solution. In addition, we propose a method to improve covariance estimation when the underlying eigenvectors are known to be sparse. We use the eigenvalue decomposition of the covariance matrix to formulate an optimization problem where we impose sparsity on the corresponding eigenvectors. Numerical experiments show that the proposed eigenvector extraction algorithm matches or outperforms existing algorithms in terms of support recovery and explained variance, while the covariance estimation algorithms significantly improve on the sample covariance estimator.
Submitted 12 February, 2016;
originally announced February 2016.
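
The closed-form step mentioned in the abstract above can be illustrated with one standard form of the Procrustes problem over the Stiefel manifold: maximize trace(U^T A) subject to U^T U = I, whose maximizer is built from the thin SVD of A. The following is a minimal sketch under that assumption; the function name `procrustes_stiefel` and the random test matrix are hypothetical, and the paper's exact inner problem may be stated differently.

```python
import numpy as np

def procrustes_stiefel(A):
    """Maximize trace(U.T @ A) over matrices U with orthonormal columns.

    With the thin SVD A = L @ diag(s) @ Rt, the maximizer is U = L @ Rt.
    """
    left, _, right_t = np.linalg.svd(A, full_matrices=False)
    return left @ right_t

# Hypothetical example data: a random tall matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))
U = procrustes_stiefel(A)

# The attained objective equals the sum of the singular values of A.
objective = np.trace(U.T @ A)
```

Because the solution only needs one thin SVD, each MM iteration that reduces to this problem stays cheap relative to a general manifold optimization step.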
-
Robust Estimation of Structured Covariance Matrix for Heavy-Tailed Elliptical Distributions
Authors:
Ying Sun,
Prabhu Babu,
Daniel P. Palomar
Abstract:
This paper considers the problem of robustly estimating a structured covariance matrix with an elliptical underlying distribution with known mean. In applications where the covariance matrix naturally possesses a certain structure, taking the prior structure information into account in the estimation procedure improves the estimation accuracy. We propose incorporating the prior structure information into Tyler's M-estimator and formulate the problem as minimizing the cost function of Tyler's estimator under the prior structural constraint. First, the estimation under a general convex structural constraint is introduced, with an efficient algorithm for finding the estimator derived from the majorization-minimization (MM) algorithm framework. Then, the algorithm is tailored to several special structures that enjoy a wide range of applications in signal processing-related fields, namely, sum of rank-one matrices, Toeplitz, and banded Toeplitz structure. In addition, two types of non-convex structures, i.e., the Kronecker structure and the spiked covariance structure, are also discussed, where it is shown that simple algorithms can be derived under the guidelines of MM. Numerical results show that the proposed estimator achieves a smaller estimation error than the benchmark estimators at a lower computational cost.
Submitted 17 June, 2015;
originally announced June 2015.
-
Optimization Methods for Designing Sequences with Low Autocorrelation Sidelobes
Authors:
Junxiao Song,
Prabhu Babu,
Daniel P. Palomar
Abstract:
Unimodular sequences with low autocorrelations are desired in many applications, especially in the area of radar and code-division multiple access (CDMA). In this paper, we propose a new algorithm to design unimodular sequences with low integrated sidelobe level (ISL), which is a widely used measure of the goodness of a sequence's correlation property. The algorithm falls into the general framework of majorization-minimization (MM) algorithms and thus shares the monotonic property of such algorithms. In addition, the algorithm can be implemented via fast Fourier transform (FFT) operations and thus is computationally efficient. Furthermore, after some modifications the algorithm can be adapted to incorporate spectral constraints, which makes the design more flexible. Numerical experiments show that the proposed algorithms outperform existing algorithms in terms of both the quality of designed sequences and the computational complexity.
Submitted 26 December, 2014;
originally announced January 2015.
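
To make the ISL metric in the abstract above concrete, here is a minimal sketch that evaluates it with FFT operations. It assumes the common one-sided definition, ISL = sum over lags k = 1..N-1 of |r_k|^2 with r_k the aperiodic autocorrelation; the function names are hypothetical, and this only evaluates the metric, it is not the paper's design algorithm.

```python
import numpy as np

def isl_fft(x):
    """One-sided integrated sidelobe level, computed via zero-padded FFTs."""
    x = np.asarray(x, dtype=complex)
    n = x.size
    # Zero-padding to length 2n turns circular correlation into aperiodic correlation.
    spectrum = np.fft.fft(x, 2 * n)
    r = np.fft.ifft(np.abs(spectrum) ** 2)[:n]  # lags 0 .. n-1
    return float(np.sum(np.abs(r[1:]) ** 2))

def isl_direct(x):
    """Same quantity from the definition, used here as a cross-check."""
    x = np.asarray(x, dtype=complex)
    n = x.size
    lags = [np.sum(x[k:] * np.conj(x[:n - k])) for k in range(1, n)]
    return float(np.sum(np.abs(lags) ** 2))

# Barker-13 code: a classical binary sequence with very low sidelobes.
barker13 = np.array([1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1])
isl_value = isl_fft(barker13)
```

The FFT route costs O(N log N) per evaluation instead of O(N^2) for the direct sum, which is the kind of saving the abstract alludes to.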
-
Sparse Generalized Eigenvalue Problem via Smooth Optimization
Authors:
Junxiao Song,
Prabhu Babu,
Daniel P. Palomar
Abstract:
In this paper, we consider an $\ell_{0}$-norm penalized formulation of the generalized eigenvalue problem (GEP), aimed at extracting the leading sparse generalized eigenvector of a matrix pair. The formulation involves maximization of a discontinuous nonconcave objective function over a nonconvex constraint set, and is therefore computationally intractable. To tackle the problem, we first approximate the $\ell_{0}$-norm by a continuous surrogate function. Then an algorithm is developed via iteratively majorizing the surrogate function by a quadratic separable function, which at each iteration reduces to a regular generalized eigenvalue problem. A preconditioned steepest ascent algorithm for finding the leading generalized eigenvector is provided. A systematic way based on smoothing is proposed to deal with the "singularity issue" that arises when a quadratic function is used to majorize the nondifferentiable surrogate function. For sparse GEPs with special structure, algorithms that admit a closed-form solution at every iteration are derived. Numerical experiments show that the proposed algorithms match or outperform existing algorithms in terms of computational complexity and support recovery.
Submitted 18 November, 2014; v1 submitted 28 August, 2014;
originally announced August 2014.
-
Regularized Tyler's Scatter Estimator: Existence, Uniqueness, and Algorithms
Authors:
Ying Sun,
Prabhu Babu,
Daniel P. Palomar
Abstract:
This paper considers the regularized Tyler's scatter estimator for elliptical distributions, which has received considerable attention recently. Various types of shrinkage Tyler's estimators have been proposed in the literature and proved to work effectively in the "small n large p" scenario. Nevertheless, the existence and uniqueness properties of the estimators have not been thoroughly studied, and in certain cases the algorithms may fail to converge. In this work, we provide a general result that analyzes the sufficient condition for the existence of a family of shrinkage Tyler's estimators, which quantitatively shows that regularization indeed reduces the number of samples required both for estimation and for the convergence of the algorithms for the estimators. For two specific shrinkage Tyler's estimators, we also prove that the condition is necessary and the estimator is unique. Finally, we show that the two estimators are actually equivalent. Numerical algorithms are also derived based on the majorization-minimization framework, under which the convergence is analyzed systematically.
Submitted 11 July, 2014;
originally announced July 2014.
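
As an illustration of what a shrinkage Tyler's estimator looks like in practice, the sketch below runs a fixed-point iteration that mixes Tyler's weighted sample covariance with an identity target. The specific update, the identity target, the parameter name `alpha`, and the function name `shrinkage_tyler` are assumptions chosen for illustration; the abstract does not spell out the paper's exact estimators or algorithms.

```python
import numpy as np

def shrinkage_tyler(X, alpha=0.5, n_iter=200, tol=1e-10):
    """Fixed-point iteration for one assumed shrinkage Tyler's estimator.

    Update: Sigma <- ( (p/n) * sum_i x_i x_i^T / (x_i^T Sigma^{-1} x_i) + alpha * I ) / (1 + alpha)
    X has shape (n, p): n samples, assumed zero-mean, in dimension p.
    """
    n, p = X.shape
    sigma = np.eye(p)
    for _ in range(n_iter):
        inv = np.linalg.inv(sigma)
        # Quadratic forms x_i^T Sigma^{-1} x_i for every sample.
        w = np.einsum("ij,jk,ik->i", X, inv, X)
        weighted_cov = (p / n) * (X.T * (1.0 / w)) @ X
        new_sigma = (weighted_cov + alpha * np.eye(p)) / (1.0 + alpha)
        converged = np.linalg.norm(new_sigma - sigma) < tol * np.linalg.norm(sigma)
        sigma = new_sigma
        if converged:
            break
    return sigma

# Hypothetical heavy-tailed data: Student-t entries with 3 degrees of freedom.
rng = np.random.default_rng(0)
X = rng.standard_t(df=3, size=(200, 5))
sigma_hat = shrinkage_tyler(X)
```

For this particular update, a fixed point satisfies trace(Sigma^{-1}) = p, which gives a simple sanity check on convergence.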