
Showing 1–28 of 28 results for author: Raskutti, G

  1. arXiv:2412.01120 [pdf, other]

    stat.ML cs.LG

    Reliable and scalable variable importance estimation via warm-start and early stopping

    Authors: Zexuan Sun, Garvesh Raskutti

    Abstract: As opaque black-box predictive models become more prevalent, the need to develop interpretations for these models is of great interest. Variable importance and Shapley values are interpretability measures that apply to any predictive model and assess how much a variable or set of variables improves prediction performance. When the number of variables is large, estimating variabl…

    Submitted 7 March, 2025; v1 submitted 1 December, 2024; originally announced December 2024.

    Comments: Preliminary version accepted in AISTATS, 2025

  2. arXiv:2306.06582 [pdf, other]

    stat.ML cs.LG

    Fast, Distribution-free Predictive Inference for Neural Networks with Coverage Guarantees

    Authors: Yue Gao, Garvesh Raskutti, Rebecca Willett

    Abstract: This paper introduces a novel, computationally-efficient algorithm for predictive inference (PI) that requires no distributional assumptions on the data and can be computed faster than existing bootstrap-type methods for neural networks. Specifically, if there are $n$ training samples, bootstrap methods require training a model on each of the $n$ subsamples of size $n-1$; for large models like neu…

    Submitted 11 June, 2023; originally announced June 2023.

  3. arXiv:2304.09305 [pdf, other]

    stat.ME math.ST stat.AP

    High-dimensional Multi-class Classification with Presence-only Data

    Authors: Lili Zheng, Garvesh Raskutti

    Abstract: Classification with positive and unlabeled (PU) data frequently arises in bioinformatics, clinical data, and ecological studies, where collecting negative samples can be prohibitively expensive. While prior works on PU data focus on binary classification, in this paper we consider multiple positive labels, a practically important and common setting. We introduce a multinomial-PU model and an ordin…

    Submitted 18 April, 2023; originally announced April 2023.

  4. arXiv:2207.09097 [pdf, other]

    stat.ML cs.LG

    Lazy Estimation of Variable Importance for Large Neural Networks

    Authors: Yue Gao, Abby Stevens, Rebecca Willett, Garvesh Raskutti

    Abstract: As opaque predictive models increasingly impact many areas of modern life, interest in quantifying the importance of a given input variable for making a specific prediction has grown. Recently, there has been a proliferation of model-agnostic methods to measure variable importance (VI) that analyze the difference in predictive power between a full model trained on all variables and a reduced model…

    Submitted 19 July, 2022; originally announced July 2022.

    Comments: Accepted to ICML'22

  5. arXiv:2111.10461 [pdf, other]

    stat.ML cs.LG

    Gaussian Process Inference Using Mini-batch Stochastic Gradient Descent: Convergence Guarantees and Empirical Benefits

    Authors: Hao Chen, Lili Zheng, Raed Al Kontar, Garvesh Raskutti

    Abstract: Stochastic gradient descent (SGD) and its variants have established themselves as the go-to algorithms for large-scale machine learning problems with independent samples due to their generalization performance and intrinsic computational advantage. However, the fact that the stochastic gradient is a biased estimator of the full gradient with correlated samples has led to the lack of theoretical un…

    Submitted 19 November, 2021; originally announced November 2021.

    Report number: 23(227):1-59

    Journal ref: Journal of Machine Learning Research (JMLR), 2022

  6. arXiv:2106.14630 [pdf, other]

    stat.ML cs.LG stat.ME

    Improved Prediction and Network Estimation Using the Monotone Single Index Multi-variate Autoregressive Model

    Authors: Yue Gao, Garvesh Raskutti

    Abstract: Network estimation from multi-variate point process or time series data is a problem of fundamental importance. Prior work has focused on parametric approaches that require a known parametric model, which makes estimation procedures less robust to model mis-specification, non-linearities and heterogeneities. In this paper, we develop a semi-parametric approach based on the monotone single-index mu…

    Submitted 28 June, 2021; v1 submitted 28 June, 2021; originally announced June 2021.

  7. arXiv:2105.07587 [pdf, other]

    math.ST stat.ME

    Convergence guarantee for the sparse monotone single index model

    Authors: Ran Dai, Hyebin Song, Rina Foygel Barber, Garvesh Raskutti

    Abstract: We consider a high-dimensional monotone single index model (hdSIM), which is a semiparametric extension of a high-dimensional generalized linear model (hdGLM), where the link function is unknown, but constrained with monotone and non-decreasing shape. We develop a scalable projection-based iterative approach, the "Sparse Orthogonal Descent Single-Index Model" (SOD-SIM), which alternates between spa…

    Submitted 16 May, 2021; originally announced May 2021.

    MSC Class: 62G08

  8. arXiv:2103.13555 [pdf, other]

    stat.ML cs.LG

    Prediction in the presence of response-dependent missing labels

    Authors: Hyebin Song, Garvesh Raskutti, Rebecca Willett

    Abstract: In a variety of settings, limitations of sensing technologies or other sampling mechanisms result in missing labels, where the likelihood of a missing label in the training set is an unknown function of the data. For example, satellites used to detect forest fires cannot sense fires below a certain size threshold. In such cases, training datasets consist of positive and pseudo-negative observation…

    Submitted 24 March, 2021; originally announced March 2021.

  9. arXiv:2008.02437 [pdf, other]

    math.ST cs.LG math.NA stat.ML

    A Sharp Blockwise Tensor Perturbation Bound for Orthogonal Iteration

    Authors: Yuetian Luo, Garvesh Raskutti, Ming Yuan, Anru R. Zhang

    Abstract: In this paper, we develop novel perturbation bounds for the high-order orthogonal iteration (HOOI) [DLDMV00b]. Under mild regularity conditions, we establish blockwise tensor perturbation bounds for HOOI with guarantees for both tensor reconstruction in Hilbert-Schmidt norm $\|\widehat{\bcT} - \bcT \|_{\tHS}$ and mode-$k$ singular subspace estimation in Schatten-$q$ norm…

    Submitted 5 June, 2021; v1 submitted 5 August, 2020; originally announced August 2020.

  10. arXiv:2003.07429 [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    Context-dependent self-exciting point processes: models, methods, and risk bounds in high dimensions

    Authors: Lili Zheng, Garvesh Raskutti, Rebecca Willett, Benjamin Mark

    Abstract: High-dimensional autoregressive point processes model how current events trigger or inhibit future events, such as how activity by one member of a social network can affect the future activity of his or her neighbors. While past work has focused on estimating the underlying network structure based solely on the times at which events occur on each node of the network, this paper examines the more nuanc…

    Submitted 16 March, 2020; originally announced March 2020.

  11. arXiv:1911.03804 [pdf, other]

    stat.ML cs.LG math.NA math.ST stat.ME

    ISLET: Fast and Optimal Low-rank Tensor Regression via Importance Sketching

    Authors: Anru Zhang, Yuetian Luo, Garvesh Raskutti, Ming Yuan

    Abstract: In this paper, we develop a novel procedure for low-rank tensor regression, namely Importance Sketching Low-rank Estimation for Tensors (ISLET). The central idea behind ISLET is importance sketching, i.e., carefully designed sketches based on both the responses and low-dimensional structure of the parameter of interest. We s…

    Submitted 8 May, 2020; v1 submitted 9 November, 2019; originally announced November 2019.

  12. arXiv:1910.02348 [pdf, other]

    stat.ME

    Convex and Non-convex Approaches for Statistical Inference with Class-Conditional Noisy Labels

    Authors: Hyebin Song, Ran Dai, Garvesh Raskutti, Rina Foygel Barber

    Abstract: We study the problem of estimation and testing in logistic regression with class-conditional noise in the observed labels, which has an important implication in the Positive-Unlabeled (PU) learning setting. With the key observation that the label noise problem belongs to a special sub-class of generalized linear models (GLM), we discuss convex and non-convex approaches that address this problem. A…

    Submitted 12 August, 2020; v1 submitted 5 October, 2019; originally announced October 2019.

  13. Minimizing Negative Transfer of Knowledge in Multivariate Gaussian Processes: A Scalable and Regularized Approach

    Authors: Raed Kontar, Garvesh Raskutti, Shiyu Zhou

    Abstract: Recently there has been an increasing interest in the multivariate Gaussian process (MGP) which extends the Gaussian process (GP) to deal with multiple outputs. One approach to construct the MGP and account for non-trivial commonalities amongst outputs employs a convolution process (CP). The CP is based on the idea of sharing latent functions across several convolutions. Despite the elegance of th…

    Submitted 31 March, 2019; v1 submitted 31 January, 2019; originally announced January 2019.

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020

  14. arXiv:1812.03659 [pdf, other]

    math.ST stat.ME

    Testing for high-dimensional network parameters in auto-regressive models

    Authors: Lili Zheng, Garvesh Raskutti

    Abstract: High-dimensional auto-regressive models provide a natural way to model influence between $M$ actors given multi-variate time series data for $T$ time intervals. While there has been considerable work on network estimation, there is limited work in the context of inference and hypothesis testing. In particular, prior work on hypothesis testing in time series has been restricted to linear Gaussian a…

    Submitted 11 December, 2018; v1 submitted 10 December, 2018; originally announced December 2018.

  15. arXiv:1811.02979 [pdf, other]

    stat.ML cs.LG

    Estimating Network Structure from Incomplete Event Data

    Authors: Benjamin Mark, Garvesh Raskutti, Rebecca Willett

    Abstract: Multivariate Bernoulli autoregressive (BAR) processes model time series of events in which the likelihood of current events is determined by the times and locations of past events. These processes can be used to model nonlinear dynamical systems corresponding to criminal activity, responses of patients to different medical treatment plans, opinion dynamics across social networks, epidemic spread,…

    Submitted 7 November, 2018; originally announced November 2018.

  16. arXiv:1803.07658 [pdf, other]

    stat.ML cs.LG

    Graph-based regularization for regression problems with alignment and highly-correlated designs

    Authors: Yuan Li, Benjamin Mark, Garvesh Raskutti, Rebecca Willett, Hyebin Song, David Neiman

    Abstract: Sparse models for high-dimensional linear regression and machine learning have received substantial attention over the past two decades. Model selection, or determining which features or covariates are the best explanatory variables, is critical to the interpretability of a learned model. Much of the current literature assumes that covariates are only mildly correlated. However, in many modern app…

    Submitted 13 October, 2019; v1 submitted 20 March, 2018; originally announced March 2018.

  17. arXiv:1802.04838 [pdf, other]

    stat.ML cs.IT math.ST

    Network Estimation from Point Process Data

    Authors: Benjamin Mark, Garvesh Raskutti, Rebecca Willett

    Abstract: Consider observing a collection of discrete events within a network that reflect how network nodes influence one another. Such data are common in spike trains recorded from biological neural networks, interactions within a social network, and a variety of other settings. Data of this form may be modeled as self-exciting point processes, in which the likelihood of future events depends on the past…

    Submitted 13 February, 2018; originally announced February 2018.

    Comments: Submitted to IEEE Transactions on Information Theory

  18. arXiv:1801.07644 [pdf, other]

    stat.ML math.ST

    Non-parametric Sparse Additive Auto-regressive Network Models

    Authors: Hao Henry Zhou, Garvesh Raskutti

    Abstract: Consider a multi-variate time series $(X_t)_{t=0}^{T}$ where $X_t \in \mathbb{R}^d$ which may represent spike train responses for multiple neurons in a brain, crime event data across multiple regions, and many others. An important challenge associated with these time series models is to estimate an influence network between the $d$ variables, especially when the number of variables $d$ is large me…

    Submitted 24 January, 2018; v1 submitted 23 January, 2018; originally announced January 2018.

  19. arXiv:1711.08129 [pdf, other]

    stat.ME

    PULasso: High-dimensional variable selection with presence-only data

    Authors: Hyebin Song, Garvesh Raskutti

    Abstract: In various real-world problems, we are presented with classification problems with positive and unlabeled data, referred to as presence-only responses. In this paper, we study variable selection in the context of presence-only responses where the number of features or covariates $p$ is large. The combination of presence-only responses and high dimensionality presents both statistical and computation…

    Submitted 30 October, 2018; v1 submitted 21 November, 2017; originally announced November 2017.

  20. arXiv:1704.08783 [pdf, other]

    stat.ML cs.LG

    Learning Quadratic Variance Function (QVF) DAG models via OverDispersion Scoring (ODS)

    Authors: Gunwoong Park, Garvesh Raskutti

    Abstract: Learning DAG or Bayesian network models is an important problem in multi-variate causal inference. However, a number of challenges arise in learning large-scale DAG models, including model identifiability and computational complexity, since the space of directed graphs is huge. In this paper, we address these issues in a number of steps for a broad class of DAG models where the noise or variance is…

    Submitted 27 April, 2017; originally announced April 2017.

    Comments: 41 pages, 7 figures

  21. arXiv:1611.10349 [pdf, other]

    stat.ML

    Non-Convex Projected Gradient Descent for Generalized Low-Rank Tensor Regression

    Authors: Han Chen, Garvesh Raskutti, Ming Yuan

    Abstract: In this paper, we consider the problem of learning high-dimensional tensor regression problems with low-rank structure. One of the core challenges associated with learning high-dimensional models is computation since the underlying optimization problems are often non-convex. While convex relaxations could lead to polynomial-time algorithms, they are often slow in practice. On the other hand, limite…

    Submitted 30 November, 2016; originally announced November 2016.

    Comments: 42 pages, 6 figures

  22. arXiv:1605.02693 [pdf, other]

    stat.ML cs.IT math.ST

    Inference of High-dimensional Autoregressive Generalized Linear Models

    Authors: Eric C. Hall, Garvesh Raskutti, Rebecca Willett

    Abstract: Vector autoregressive models characterize a variety of time series in which linear combinations of current and past observations can be used to accurately predict future observations. For instance, each element of an observation vector could correspond to a different node in a network, and the parameters of an autoregressive model would correspond to the impact of the network structure on the time…

    Submitted 24 June, 2017; v1 submitted 9 May, 2016; originally announced May 2016.

    Comments: Submitted to IEEE Transactions on Information Theory

  23. arXiv:1602.04418 [pdf, other]

    stat.ML cs.LG

    Identifiability Assumptions and Algorithm for Directed Graphical Models with Feedback

    Authors: Gunwoong Park, Garvesh Raskutti

    Abstract: Directed graphical models provide a useful framework for modeling causal or directional relationships for multivariate data. Prior work has largely focused on identifiability and search algorithms for directed acyclic graphical (DAG) models. In many applications, feedback naturally arises and directed graphical models that permit cycles occur. In this paper we address the issue of identifiability…

    Submitted 6 July, 2016; v1 submitted 14 February, 2016; originally announced February 2016.

    Comments: 28 pages, 17 figures

  24. arXiv:1505.06659 [pdf, ps, other]

    stat.ML

    Statistical and Algorithmic Perspectives on Randomized Sketching for Ordinary Least-Squares -- ICML

    Authors: Garvesh Raskutti, Michael Mahoney

    Abstract: We consider statistical and algorithmic aspects of solving large-scale least-squares (LS) problems using randomized sketching algorithms. Prior results show that, from an algorithmic perspective, when using sketching matrices constructed from random projections and leverage-score sampling, if the number of samples $r$ is much smaller than the original sample size $n$, then the worst-case (WC)…

    Submitted 25 May, 2015; originally announced May 2015.

    Comments: 9 pages, Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 2015. JMLR

  25. arXiv:1406.5986 [pdf, other]

    stat.ML

    A Statistical Perspective on Randomized Sketching for Ordinary Least-Squares

    Authors: Garvesh Raskutti, Michael Mahoney

    Abstract: We consider statistical as well as algorithmic aspects of solving large-scale least-squares (LS) problems using randomized sketching algorithms. For an LS problem with input data $(X, Y) \in \mathbb{R}^{n \times p} \times \mathbb{R}^n$, sketching algorithms use a sketching matrix, $S\in\mathbb{R}^{r \times n}$ with $r \ll n$. Then, rather than solving the LS problem using the full data $(X,Y)$, ske…

    Submitted 25 August, 2015; v1 submitted 23 June, 2014; originally announced June 2014.

    Comments: 27 pages, 5 figures

  26. arXiv:1310.7780 [pdf, ps, other]

    stat.ML cs.LG

    The Information Geometry of Mirror Descent

    Authors: Garvesh Raskutti, Sayan Mukherjee

    Abstract: Information geometry applies concepts in differential geometry to probability and statistics and is especially useful for parameter estimation in exponential families where parameters are known to lie on a Riemannian manifold. Connections between the geometric properties of the induced manifold and statistical properties of the estimation problem are well-established. However, developing first-orde…

    Submitted 29 April, 2014; v1 submitted 29 October, 2013; originally announced October 2013.

    Comments: 9 pages

  27. arXiv:1306.3574 [pdf, ps, other]

    stat.ML

    Early stopping and non-parametric regression: An optimal data-dependent stopping rule

    Authors: Garvesh Raskutti, Martin J. Wainwright, Bin Yu

    Abstract: The strategy of early stopping is a regularization technique based on choosing a stopping time for an iterative algorithm. Focusing on non-parametric regression in a reproducing kernel Hilbert space, we analyze the early stopping strategy for a form of gradient-descent applied to the least-squares loss function. We propose a data-dependent stopping rule that does not involve hold-out or cross-vali…

    Submitted 15 June, 2013; originally announced June 2013.

    Comments: 29 pages, 4 figures

  28. arXiv:0811.3628 [pdf, ps, other]

    stat.ML math.ST

    High-dimensional covariance estimation by minimizing $\ell_1$-penalized log-determinant divergence

    Authors: Pradeep Ravikumar, Martin J. Wainwright, Garvesh Raskutti, Bin Yu

    Abstract: Given i.i.d. observations of a random vector $X \in \mathbb{R}^p$, we study the problem of estimating both its covariance matrix $\Sigma^*$ and its inverse covariance or concentration matrix $\Theta^* = (\Sigma^*)^{-1}$. We estimate $\Theta^*$ by minimizing an $\ell_1$-penalized log-determinant Bregman divergence; in the multivariate Gaussian case, this approach corresponds to $\ell_1$-penalized maximum likeliho…

    Submitted 21 November, 2008; originally announced November 2008.

    Comments: 35 pages, 9 figures