Showing 1–24 of 24 results for author: Palomar, D P

Searching in archive stat.
  1. arXiv:2506.01696

    eess.SP stat.ML

    Missing Data in Signal Processing and Machine Learning: Models, Methods and Modern Approaches

    Authors: Alexandre Hippert-Ferrer, Aude Sportisse, Amirhossein Javaheri, Mohammed Nabil El Korso, Daniel P. Palomar

    Abstract: This tutorial aims to provide signal processing (SP) and machine learning (ML) practitioners with vital tools, in an accessible way, to answer the question: How to deal with missing data? There are many strategies to handle incomplete signals. In this paper, we propose to group these strategies based on three common tasks: i) missing-data imputation, ii) estimation with missing values and iii) pre…

    Submitted 3 June, 2025; v1 submitted 2 June, 2025; originally announced June 2025.

  2. arXiv:2410.05211

    stat.ME stat.ML

    The Informed Elastic Net for Fast Grouped Variable Selection and FDR Control in Genomics Research

    Authors: Jasin Machkour, Michael Muma, Daniel P. Palomar

    Abstract: Modern genomics research relies on genome-wide association studies (GWAS) to identify the few genetic variants among potentially millions that are associated with diseases of interest. Only reproducible discoveries of groups of associations improve our understanding of complex polygenic diseases and enable the development of new drugs and personalized medicine. Thus, fast multivariate variable sel…

    Submitted 7 October, 2024; originally announced October 2024.

    Comments: Published in IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), 10-13 December 2023, Los Sueños, Costa Rica

  3. arXiv:2410.05169

    stat.ME

    False Discovery Rate Control for Fast Screening of Large-Scale Genomics Biobanks

    Authors: Jasin Machkour, Michael Muma, Daniel P. Palomar

    Abstract: Genomics biobanks are information treasure troves with thousands of phenotypes (e.g., diseases, traits) and millions of single nucleotide polymorphisms (SNPs). The development of methodologies that provide reproducible discoveries is essential for the understanding of complex diseases and precision drug development. Without statistical reproducibility guarantees, valuable efforts are spent on rese…

    Submitted 7 October, 2024; originally announced October 2024.

    Comments: Published in IEEE Statistical Signal Processing Workshop (SSP), 2-5 July 2023, Hanoi, Vietnam

  4. arXiv:2401.15796

    stat.ME stat.ML

    High-Dimensional False Discovery Rate Control for Dependent Variables

    Authors: Jasin Machkour, Michael Muma, Daniel P. Palomar

    Abstract: Algorithms that ensure reproducible findings from large-scale, high-dimensional data are pivotal in numerous signal processing applications. In recent years, multivariate false discovery rate (FDR) controlling methods have emerged, providing guarantees even in high-dimensional settings where the number of variables surpasses the number of samples. However, these methods often fail to reliably cont…

    Submitted 30 January, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

  5. arXiv:2401.15139

    q-fin.PM cs.LG stat.ME stat.ML

    FDR-Controlled Portfolio Optimization for Sparse Financial Index Tracking

    Authors: Jasin Machkour, Daniel P. Palomar, Michael Muma

    Abstract: In high-dimensional data analysis, such as financial index tracking or biomedical applications, it is crucial to select the few relevant variables while maintaining control over the false discovery rate (FDR). In these applications, strong dependencies often exist among the variables (e.g., stock returns), which can undermine the FDR control property of existing methods like the model-X knockoff m…

    Submitted 30 January, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

  6. arXiv:2401.08375

    stat.ML cs.LG

    Sparse PCA with False Discovery Rate Controlled Variable Selection

    Authors: Jasin Machkour, Arnaud Breloy, Michael Muma, Daniel P. Palomar, Frédéric Pascal

    Abstract: Sparse principal component analysis (PCA) aims at mapping large dimensional data to a linear subspace of lower dimension. By imposing loading vectors to be sparse, it performs the double duty of dimension reduction and variable selection. Sparse PCA algorithms are usually expressed as a trade-off between explained variance and sparsity of the loading vectors (i.e., number of selected variables). A…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: Published in ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), scheduled for 14-19 April 2024 in Seoul, Korea

  7. arXiv:2305.04330

    stat.ME stat.CO

    Affine equivariant Tyler's M-estimator applied to tail parameter learning of elliptical distributions

    Authors: Esa Ollila, Daniel P. Palomar, Frederic Pascal

    Abstract: We propose estimating the scale parameter (mean of the eigenvalues) of the scatter matrix of an unspecified elliptically symmetric distribution using weights obtained by solving Tyler's M-estimator of the scatter matrix. The proposed Tyler's weights-based estimate (TWE) of scale is then used to construct an affine equivariant Tyler's M-estimator as a weighted sample covariance matrix using normali…

    Submitted 7 May, 2023; originally announced May 2023.

  8. arXiv:2210.15471

    stat.ML cs.LG eess.SP

    Adaptive Estimation of Graphical Models under Total Positivity

    Authors: Jiaxi Ying, José Vinícius de M. Cardoso, Daniel P. Palomar

    Abstract: We consider the problem of estimating (diagonally dominant) M-matrices as precision matrices in Gaussian graphical models. These models exhibit intriguing properties, such as the existence of the maximum likelihood estimator with merely two observations for M-matrices \citep{lauritzen2019maximum,slawski2015estimation} and even one observation for diagonally dominant M-matrices \citep{truell2021max…

    Submitted 8 June, 2023; v1 submitted 27 October, 2022; originally announced October 2022.

    Comments: 26 pages

  9. arXiv:2110.06048

    stat.ME eess.SP math.ST stat.ML

    The Terminating-Random Experiments Selector: Fast High-Dimensional Variable Selection with False Discovery Rate Control

    Authors: Jasin Machkour, Michael Muma, Daniel P. Palomar

    Abstract: We propose the Terminating-Random Experiments (T-Rex) selector, a fast variable selection method for high-dimensional data. The T-Rex selector controls a user-defined target false discovery rate (FDR) while maximizing the number of selected variables. This is achieved by fusing the solutions of multiple early terminated random experiments. The experiments are conducted on a combination of the orig…

    Submitted 12 March, 2024; v1 submitted 12 October, 2021; originally announced October 2021.

    Comments: R packages 'TRexSelector' and 'tlars' on CRAN, 33 pages, 21 figures, 2 tables

  10. arXiv:2006.14925

    cs.LG stat.ML

    Does the $\ell_1$-norm Learn a Sparse Graph under Laplacian Constrained Graphical Models?

    Authors: Jiaxi Ying, José Vinícius de M. Cardoso, Daniel P. Palomar

    Abstract: We consider the problem of learning a sparse graph under the Laplacian constrained Gaussian graphical models. This problem can be formulated as a penalized maximum likelihood estimation of the Laplacian constrained precision matrix. Like in the classical graphical lasso problem, recent works made use of the $\ell_1$-norm regularization with the goal of promoting sparsity in Laplacian constrained p…

    Submitted 5 September, 2023; v1 submitted 26 June, 2020; originally announced June 2020.

  11. arXiv:2006.10005

    stat.ME

    Shrinking the eigenvalues of M-estimators of covariance matrix

    Authors: Esa Ollila, Daniel P. Palomar, Frédéric Pascal

    Abstract: A highly popular regularized (shrinkage) covariance matrix estimator is the shrinkage sample covariance matrix (SCM) which shares the same set of eigenvectors as the SCM but shrinks its eigenvalues toward the grand mean of the eigenvalues of the SCM. In this paper, a more general approach is considered in which the SCM is replaced by an M-estimator of scatter matrix and a fully automatic data adap…

    Submitted 28 October, 2020; v1 submitted 17 June, 2020; originally announced June 2020.

    Comments: A supplementary report is available at: http://users.spa.aalto.fi/esollila/shrinkM/supplement.pdf

  12. arXiv:2005.09958

    stat.ML q-fin.CP q-fin.ST

    Learning Undirected Graphs in Financial Markets

    Authors: José Vinícius de Miranda Cardoso, Daniel P. Palomar

    Abstract: We investigate the problem of learning undirected graphical models under Laplacian structural constraints from the point of view of financial market data. We show that Laplacian constraints have meaningful physical interpretations related to the market index factor and to the conditional correlations between stocks. Those interpretations lead to a set of guidelines that users should be aware of wh…

    Submitted 9 November, 2020; v1 submitted 20 May, 2020; originally announced May 2020.

    Comments: 5 pages, 13 figures, accepted at Asilomar Conference on Signals, Systems, and Computers, 2020

  13. arXiv:2002.04996

    stat.ME stat.ML

    M-estimators of scatter with eigenvalue shrinkage

    Authors: Esa Ollila, Daniel P. Palomar, Frederic Pascal

    Abstract: A popular regularized (shrinkage) covariance estimator is the shrinkage sample covariance matrix (SCM) which shares the same set of eigenvectors as the SCM but shrinks its eigenvalues toward its grand mean. In this paper, a more general approach is considered in which the SCM is replaced by an M-estimator of scatter matrix and a fully automatic data adaptive method to compute the optimal shrinkage…

    Submitted 12 February, 2020; originally announced February 2020.

    Comments: To appear in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2020), May 4 - 8, Barcelona, Spain, 2020

  14. arXiv:1909.12530

    stat.CO stat.AP

    Robust Factor Analysis Parameter Estimation

    Authors: Rui Zhou, Junyan Liu, Sandeep Kumar, Daniel P. Palomar

    Abstract: This paper considers the problem of robustly estimating the parameters of a heavy-tailed multivariate distribution when the covariance matrix is known to have the structure of a low-rank matrix plus a diagonal matrix as considered in factor analysis (FA). By assuming the observed data to follow the multivariate Student's t distribution, we can robustly estimate the parameters via maximum likelihoo…

    Submitted 27 September, 2019; originally announced September 2019.

    Comments: Presented at Eurocast 2019

  15. arXiv:1909.11594

    stat.ML cs.LG cs.SI math.OC stat.AP

    Structured Graph Learning Via Laplacian Spectral Constraints

    Authors: Sandeep Kumar, Jiaxi Ying, José Vinícius de M. Cardoso, Daniel P. Palomar

    Abstract: Learning a graph with a specific structure is essential for interpretability and identification of the relationships among data. It is well known that structured graph learning from observed samples is an NP-hard combinatorial problem. In this paper, we first show that for a set of important graph families it is possible to convert the structural constraints of structure into eigenvalue constraint…

    Submitted 24 September, 2019; originally announced September 2019.

    Comments: 12 Pages, Accepted for NIPS 2019. arXiv admin note: substantial text overlap with arXiv:1904.09792

  16. arXiv:1907.08969

    math.OC cs.DC stat.ML

    Distributed Inexact Successive Convex Approximation ADMM: Analysis-Part I

    Authors: Sandeep Kumar, Ketan Rajawat, Daniel P. Palomar

    Abstract: In this two-part work, we propose an algorithmic framework for solving non-convex problems whose objective function is the sum of a number of smooth component functions plus a convex (possibly non-smooth) or/and smooth (possibly non-convex) regularization function. The proposed algorithm incorporates ideas from several existing approaches such as alternate direction method of multipliers (ADMM), s…

    Submitted 21 July, 2019; originally announced July 2019.

  17. arXiv:1809.07203

    stat.AP eess.SP math.OC q-fin.ST

    Parameter Estimation of Heavy-Tailed AR Model with Missing Data via Stochastic EM

    Authors: Junyan Liu, Sandeep Kumar, Daniel P. Palomar

    Abstract: The autoregressive (AR) model is a widely used model to understand time series data. Traditionally, the innovation noise of the AR is modeled as Gaussian. However, many time series applications, for example, financial time series data, are non-Gaussian, therefore, the AR model with more general heavy-tailed innovations is preferred. Another issue that frequently occurs in time series is missing va…

    Submitted 9 February, 2019; v1 submitted 19 September, 2018; originally announced September 2018.

    Comments: This is a companion document to a paper accepted to IEEE Transactions on Signal Processing, 2019, complemented with the supplementary material

  18. arXiv:1803.07247

    stat.ML cs.LG q-fin.CP stat.ME

    Sparse Reduced Rank Regression With Nonconvex Regularization

    Authors: Ziping Zhao, Daniel P. Palomar

    Abstract: In this paper, the estimation problem for sparse reduced rank regression (SRRR) model is considered. The SRRR model is widely used for dimension reduction and variable selection with applications in signal processing, econometrics, etc. The problem is formulated to minimize the least squares loss with a sparsity-inducing penalty considering an orthogonality constraint. Convex sparsity-inducing fun…

    Submitted 20 March, 2018; originally announced March 2018.

    Comments: 13 pages, 5 figures

  19. arXiv:1710.05513

    stat.ML math.NA q-fin.ST stat.AP stat.CO

    Robust Maximum Likelihood Estimation of Sparse Vector Error Correction Model

    Authors: Ziping Zhao, Daniel P. Palomar

    Abstract: In econometrics and finance, the vector error correction model (VECM) is an important time series model for cointegration analysis, which is used to estimate the long-run equilibrium variable relationships. The traditional analysis and estimation methodologies assume the underlying Gaussian distribution but, in practice, heavy-tailed data and outliers can lead to the inapplicability of these metho…

    Submitted 16 October, 2017; originally announced October 2017.

    Comments: 5 pages, 3 figures, to appear in Proc. of the 2017 5th IEEE Global Conference on Signal and Information Processing (GlobalSIP)

  20. arXiv:1602.03992

    stat.ML cs.LG math.OC stat.AP

    Orthogonal Sparse PCA and Covariance Estimation via Procrustes Reformulation

    Authors: Konstantinos Benidis, Ying Sun, Prabhu Babu, Daniel P. Palomar

    Abstract: The problem of estimating sparse eigenvectors of a symmetric matrix attracts a lot of attention in many applications, especially those with high dimensional data set. While classical eigenvectors can be obtained as the solution of a maximization problem, existing approaches formulate this problem by adding a penalty term into the objective function that encourages a sparse solution. However, the r…

    Submitted 12 February, 2016; originally announced February 2016.

  21. Robust Estimation of Structured Covariance Matrix for Heavy-Tailed Elliptical Distributions

    Authors: Ying Sun, Prabhu Babu, Daniel P. Palomar

    Abstract: This paper considers the problem of robustly estimating a structured covariance matrix with an elliptical underlying distribution with known mean. In applications where the covariance matrix naturally possesses a certain structure, taking the prior structure information into account in the estimation procedure is beneficial to improve the estimation accuracy. We propose incorporating the prior str…

    Submitted 17 June, 2015; originally announced June 2015.

  22. arXiv:1501.02252

    math.OC cs.IT stat.ME

    Optimization Methods for Designing Sequences with Low Autocorrelation Sidelobes

    Authors: Junxiao Song, Prabhu Babu, Daniel P. Palomar

    Abstract: Unimodular sequences with low autocorrelations are desired in many applications, especially in the area of radar and code-division multiple access (CDMA). In this paper, we propose a new algorithm to design unimodular sequences with low integrated sidelobe level (ISL), which is a widely used measure of the goodness of a sequence's correlation property. The algorithm falls into the general framewor…

    Submitted 26 December, 2014; originally announced January 2015.

  23. Sparse Generalized Eigenvalue Problem via Smooth Optimization

    Authors: Junxiao Song, Prabhu Babu, Daniel P. Palomar

    Abstract: In this paper, we consider an $\ell_{0}$-norm penalized formulation of the generalized eigenvalue problem (GEP), aimed at extracting the leading sparse generalized eigenvector of a matrix pair. The formulation involves maximization of a discontinuous nonconcave objective function over a nonconvex constraint set, and is therefore computationally intractable. To tackle the problem, we first approxim…

    Submitted 18 November, 2014; v1 submitted 28 August, 2014; originally announced August 2014.

  24. Regularized Tyler's Scatter Estimator: Existence, Uniqueness, and Algorithms

    Authors: Ying Sun, Prabhu Babu, Daniel P. Palomar

    Abstract: This paper considers the regularized Tyler's scatter estimator for elliptical distributions, which has received considerable attention recently. Various types of shrinkage Tyler's estimators have been proposed in the literature and proved work effectively in the "small n large p" scenario. Nevertheless, the existence and uniqueness properties of the estimators are not thoroughly studied, and in ce…

    Submitted 11 July, 2014; originally announced July 2014.