High Dimensional Robust Estimation of Sparse Models via Trimmed Hard Thresholding

Liu, Liu; Li, Tianyang; Caramanis, Constantine

arXiv:1901.08237v1 (cs)

[Submitted on 24 Jan 2019 (this version), latest version 29 May 2019 (v2)]

Abstract: We study sparsity-constrained $M$-estimation with arbitrary corruptions to both explanatory and response variables in the high-dimensional regime, where the number of variables $d$ is larger than the sample size $n$. Our main contribution is a highly efficient gradient-based optimization algorithm that we call Trimmed Hard Thresholding, a robust variant of Iterative Hard Thresholding (IHT) that uses the trimmed mean in gradient computations. Our algorithm handles a wide class of sparsity-constrained $M$-estimation problems and tolerates a nearly dimension-independent fraction of arbitrarily corrupted samples. More specifically, when the corrupted fraction satisfies $\epsilon \lesssim 1 / \left(\sqrt{k} \log (nd)\right)$, where $k$ is the sparsity of the parameter, we obtain accurate estimation and model selection guarantees with optimal sample complexity. Furthermore, we extend our algorithm to sparse Gaussian graphical model (precision matrix) estimation via a neighborhood selection approach. We demonstrate the effectiveness of robust estimation in sparse linear regression, sparse logistic regression, and sparse precision matrix estimation on synthetic and real-world US equities data.
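
The abstract describes the algorithm only at a high level: an IHT-style iteration whose gradient is aggregated with a trimmed mean. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a squared loss (sparse linear regression), a coordinate-wise trimmed mean over per-sample gradients, and a fixed step size; the function names and the `trim_frac`, `step`, and `iters` parameters are hypothetical choices made for illustration. It uses only the Python standard library so that it runs without dependencies.

```python
def trimmed_mean(values, trim_frac):
    """Mean after discarding the trim_frac smallest and largest entries.

    Assumes 0 <= trim_frac < 0.5 so that at least one value is kept.
    """
    m = int(trim_frac * len(values))
    kept = sorted(values)[m:len(values) - m]
    return sum(kept) / len(kept)


def hard_threshold(beta, k):
    """Keep the k largest-magnitude entries of beta and zero out the rest."""
    order = sorted(range(len(beta)), key=lambda j: abs(beta[j]), reverse=True)
    keep = set(order[:k])
    return [b if j in keep else 0.0 for j, b in enumerate(beta)]


def trimmed_hard_thresholding(X, y, k, trim_frac=0.1, step=0.5, iters=100):
    """IHT for least squares with a coordinate-wise trimmed-mean gradient.

    X is a list of n rows of length d, y a list of n responses, and k the
    target sparsity. All defaults are illustrative assumptions.
    """
    n, d = len(X), len(X[0])
    beta = [0.0] * d
    for _ in range(iters):
        # Residual of each sample under the current estimate.
        resid = [sum(x * b for x, b in zip(X[i], beta)) - y[i] for i in range(n)]
        # Per-sample gradient of 0.5 * resid^2 is resid_i * X[i][j];
        # aggregate each coordinate with a trimmed mean instead of a mean.
        grad = [
            trimmed_mean([resid[i] * X[i][j] for i in range(n)], trim_frac)
            for j in range(d)
        ]
        # Gradient step followed by projection onto k-sparse vectors.
        beta = hard_threshold([b - step * g for b, g in zip(beta, grad)], k)
    return beta


# Toy usage: three clean samples with response 2 and one corrupted response.
X_toy = [[1.0, 0.0]] * 4
y_toy = [2.0, 2.0, 2.0, 1000.0]
beta_toy = trimmed_hard_thresholding(X_toy, y_toy, k=1, trim_frac=0.25, iters=50)
print(beta_toy)
```

In the toy data, the corrupted sample produces the most extreme per-sample gradient at every iteration, so trimming one value from each end discards it and the first coordinate of the estimate approaches 2. A plain mean in place of `trimmed_mean` would instead be pulled toward the corrupted response.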

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Optimization and Control (math.OC); Statistics Theory (math.ST); Machine Learning (stat.ML)
Cite as:	arXiv:1901.08237 [cs.LG]
	(or arXiv:1901.08237v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1901.08237

Submission history

From: Tianyang Li
[v1] Thu, 24 Jan 2019 05:20:29 UTC (517 KB)
[v2] Wed, 29 May 2019 19:02:20 UTC (938 KB)
