Showing 1–28 of 28 results for author: Kong, W

  1. arXiv:2407.05237  [pdf, other]

    cs.LG cs.CR cs.DS math.OC stat.ML

    Privacy of the last iterate in cyclically-sampled DP-SGD on nonconvex composite losses

    Authors: Weiwei Kong, Mónica Ribero

    Abstract: Differentially-private stochastic gradient descent (DP-SGD) is a family of iterative machine learning training algorithms that privatize gradients to generate a sequence of differentially-private (DP) model parameters. It is also the standard tool used to train DP models in practice, even though most users are only interested in protecting the privacy of the final model. Tight DP accounting for th…

    Submitted 10 February, 2025; v1 submitted 6 July, 2024; originally announced July 2024.

    MSC Class: 65K10 (Primary); 60G15; 68P27 ACM Class: G.3; G.1.6

  2. arXiv:2404.15409  [pdf, ps, other]

    cs.LG cs.CR stat.ML

    Insufficient Statistics Perturbation: Stable Estimators for Private Least Squares

    Authors: Gavin Brown, Jonathan Hayase, Samuel Hopkins, Weihao Kong, Xiyang Liu, Sewoong Oh, Juan C. Perdomo, Adam Smith

    Abstract: We present a sample- and time-efficient differentially private algorithm for ordinary least squares, with error that depends linearly on the dimension and is independent of the condition number of $X^\top X$, where $X$ is the design matrix. All prior private algorithms for this task require either $d^{3/2}$ examples, error growing polynomially with the condition number, or exponential time. Our ne…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 42 pages, 3 figures

  3. arXiv:2311.16416  [pdf, other]

    cs.DS cs.LG stat.ML

    A Combinatorial Approach to Robust PCA

    Authors: Weihao Kong, Mingda Qiao, Rajat Sen

    Abstract: We study the problem of recovering Gaussian data under adversarial corruptions when the noises are low-rank and the corruptions are on the coordinate level. Concretely, we assume that the Gaussian noises lie in an unknown $k$-dimensional subspace $U \subseteq \mathbb{R}^d$, and $s$ randomly chosen coordinates of each data point fall into the control of an adversary. This setting models the scenari…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: To appear at ITCS 2024

  4. arXiv:2311.08362  [pdf, other]

    cs.LG stat.ML

    Transformers can optimally learn regression mixture models

    Authors: Reese Pathak, Rajat Sen, Weihao Kong, Abhimanyu Das

    Abstract: Mixture models arise in many regression problems, but most methods have seen limited adoption partly due to these algorithms' highly-tailored and model-specific nature. On the other hand, transformers are flexible, neural sequence models that present the intriguing possibility of providing general-purpose prediction methods, even in this mixture setting. In this work, we investigate the hypothesis…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: 24 pages, 9 figures

  5. arXiv:2309.01973  [pdf, other]

    cs.LG cs.AI cs.IT stat.ML

    Linear Regression using Heterogeneous Data Batches

    Authors: Ayush Jain, Rajat Sen, Weihao Kong, Abhimanyu Das, Alon Orlitsky

    Abstract: In many learning applications, data are collected from multiple sources, each providing a \emph{batch} of samples that by itself is insufficient to learn its input-output relationship. A common approach assumes that the sources fall in one of several unknown subgroups, each with an unknown input distribution and input-output relationship. We consider one of this setup's most fundamental and import…

    Submitted 5 September, 2023; originally announced September 2023.

  6. arXiv:2304.08424  [pdf, other]

    stat.ML cs.LG

    Long-term Forecasting with TiDE: Time-series Dense Encoder

    Authors: Abhimanyu Das, Weihao Kong, Andrew Leach, Shaan Mathur, Rajat Sen, Rose Yu

    Abstract: Recent work has shown that simple linear models can outperform several Transformer based approaches in long term time-series forecasting. Motivated by this, we propose a Multi-layer Perceptron (MLP) based encoder-decoder model, Time-series Dense Encoder (TiDE), for long-term time-series forecasting that enjoys the simplicity and speed of linear models while also being able to handle covariates and…

    Submitted 4 April, 2024; v1 submitted 17 April, 2023; originally announced April 2023.

  7. arXiv:2302.09451  [pdf, other]

    cs.LG stat.ML

    Estimating Optimal Policy Value in General Linear Contextual Bandits

    Authors: Jonathan N. Lee, Weihao Kong, Aldo Pacchiano, Vidya Muthukumar, Emma Brunskill

    Abstract: In many bandit problems, the maximal reward achievable by a policy is often unknown in advance. We consider the problem of estimating the optimal policy value in the sublinear data regime before the optimal policy is even learnable. We refer to this as $V^*$ estimation. It was recently shown that fast $V^*$ estimation is possible but only in disjoint linear bandits with Gaussian covariates. Whethe…

    Submitted 18 February, 2023; originally announced February 2023.

  8. arXiv:2301.13273  [pdf, other]

    cs.LG cs.CR math.ST stat.ML

    Near Optimal Private and Robust Linear Regression

    Authors: Xiyang Liu, Prateek Jain, Weihao Kong, Sewoong Oh, Arun Sai Suggala

    Abstract: We study the canonical statistical estimation problem of linear regression from $n$ i.i.d.~examples under $(\varepsilon,δ)$-differential privacy when some response variables are adversarially corrupted. We propose a variant of the popular differentially private stochastic gradient descent (DP-SGD) algorithm with two innovations: a full-batch gradient descent to improve sample complexity and a nove…

    Submitted 30 January, 2023; originally announced January 2023.

  9. arXiv:2211.12743  [pdf, ps, other]

    cs.LG cs.IT stat.ML

    Efficient List-Decodable Regression using Batches

    Authors: Abhimanyu Das, Ayush Jain, Weihao Kong, Rajat Sen

    Abstract: We begin the study of list-decodable linear regression using batches. In this setting only an $α\in (0,1]$ fraction of the batches are genuine. Each genuine batch contains $\ge n$ i.i.d. samples from a common unknown distribution and the remaining batches may contain arbitrary or even adversarial samples. We derive a polynomial time algorithm that for any $n\ge \tilde Ω(1/α)$ returns a list of siz…

    Submitted 23 November, 2022; originally announced November 2022.

    Comments: First draft

  10. arXiv:2206.04777  [pdf, ps, other]

    cs.LG stat.ML

    Trimmed Maximum Likelihood Estimation for Robust Learning in Generalized Linear Models

    Authors: Pranjal Awasthi, Abhimanyu Das, Weihao Kong, Rajat Sen

    Abstract: We study the problem of learning generalized linear models under adversarial corruptions. We analyze a classical heuristic called the iterative trimmed maximum likelihood estimator which is known to be effective against label corruptions in practice. Under label corruptions, we prove that this simple estimator achieves minimax near-optimal risk on a wide range of generalized linear models, includi…

    Submitted 23 October, 2022; v1 submitted 9 June, 2022; originally announced June 2022.

  11. arXiv:2205.13709  [pdf, other]

    cs.LG cs.CR cs.IT math.ST stat.ML

    DP-PCA: Statistically Optimal and Differentially Private PCA

    Authors: Xiyang Liu, Weihao Kong, Prateek Jain, Sewoong Oh

    Abstract: We study the canonical statistical task of computing the principal component from $n$ i.i.d.~data in $d$ dimensions under $(\varepsilon,δ)$-differential privacy. Although extensively studied in literature, existing solutions fall short on two key aspects: ($i$) even for Gaussian data, existing private algorithms require the number of samples $n$ to scale super-linearly with $d$, i.e.,…

    Submitted 26 May, 2022; originally announced May 2022.

  12. arXiv:2204.10414  [pdf, other]

    cs.LG stat.ML

    Dirichlet Proportions Model for Hierarchically Coherent Probabilistic Forecasting

    Authors: Abhimanyu Das, Weihao Kong, Biswajit Paria, Rajat Sen

    Abstract: Probabilistic, hierarchically coherent forecasting is a key problem in many practical forecasting applications -- the goal is to obtain coherent probabilistic predictions for a large number of time series arranged in a pre-specified tree hierarchy. In this paper, we present an end-to-end deep probabilistic model for hierarchical forecasting that is motivated by a classical top-down strategy. It jo…

    Submitted 1 March, 2023; v1 submitted 21 April, 2022; originally announced April 2022.

  13. arXiv:2111.06578  [pdf, ps, other]

    math.ST cs.CR cs.IT cs.LG stat.ML

    Differential privacy and robust statistics in high dimensions

    Authors: Xiyang Liu, Weihao Kong, Sewoong Oh

    Abstract: We introduce a universal framework for characterizing the statistical efficiency of a statistical estimation problem with differential privacy guarantees. Our framework, which we call High-dimensional Propose-Test-Release (HPTR), builds upon three crucial components: the exponential mechanism, robust statistics, and the Propose-Test-Release mechanism. Gluing all these together is the concept of re…

    Submitted 12 November, 2021; originally announced November 2021.

  14. arXiv:2106.03022  [pdf, other]

    stat.ME cs.LG stat.ML

    Fisher-Pitman permutation tests based on nonparametric Poisson mixtures with application to single cell genomics

    Authors: Zhen Miao, Weihao Kong, Ramya Korlakai Vinayak, Wei Sun, Fang Han

    Abstract: This paper investigates the theoretical and empirical performance of Fisher-Pitman-type permutation tests for assessing the equality of unknown Poisson mixture distributions. Building on nonparametric maximum likelihood estimators (NPMLEs) of the mixing distribution, these tests are theoretically shown to be able to adapt to complicated unspecified structures of count data and also consistent agai…

    Submitted 5 June, 2021; originally announced June 2021.

    Comments: 52 pages

  15. arXiv:2104.11315  [pdf, other]

    cs.LG cs.AI stat.ML

    SPECTRE: Defending Against Backdoor Attacks Using Robust Statistics

    Authors: Jonathan Hayase, Weihao Kong, Raghav Somani, Sewoong Oh

    Abstract: Modern machine learning increasingly requires training on a large collection of data from multiple sources, not all of which can be trusted. A particularly concerning scenario is when a small fraction of poisoned data changes the behavior of the trained model when triggered by an attacker-specified watermark. Such a compromised model will be deployed unnoticed as the model is accurate otherwise. T…

    Submitted 22 April, 2021; originally announced April 2021.

    Comments: 29 pages, 19 figures

  16. arXiv:2102.09159  [pdf, other]

    cs.LG cs.CR cs.IT stat.ML

    Robust and Differentially Private Mean Estimation

    Authors: Xiyang Liu, Weihao Kong, Sham Kakade, Sewoong Oh

    Abstract: In statistical learning and analysis from shared data, which is increasingly widely adopted in platforms such as federated learning and meta-learning, there are two major concerns: privacy and robustness. Each participating individual should be able to contribute without the fear of leaking one's sensitive information. At the same time, the system should be robust in the presence of malicious part…

    Submitted 24 November, 2021; v1 submitted 18 February, 2021; originally announced February 2021.

    Comments: 58 pages, 2 figures, both exponential time and efficient algorithms no longer require a known bound on the true mean

  17. arXiv:2011.09750  [pdf, ps, other]

    cs.LG stat.ML

    Online Model Selection for Reinforcement Learning with Function Approximation

    Authors: Jonathan N. Lee, Aldo Pacchiano, Vidya Muthukumar, Weihao Kong, Emma Brunskill

    Abstract: Deep reinforcement learning has achieved impressive successes yet often requires a very large amount of interaction data. This result is perhaps unsurprising, as using complicated function approximation often requires more data to fit, and early theoretical results on linear Markov decision processes provide regret bounds that scale with the dimension of the linear approximation. Ideally, we would…

    Submitted 19 November, 2020; originally announced November 2020.

  18. arXiv:2007.03746  [pdf, ps, other]

    eess.SP cs.HC cs.LG stat.ML

    Transfer Learning for Motor Imagery Based Brain-Computer Interfaces: A Complete Pipeline

    Authors: Dongrui Wu, Xue Jiang, Ruimin Peng, Wanzeng Kong, Jian Huang, Zhigang Zeng

    Abstract: Transfer learning (TL) has been widely used in motor imagery (MI) based brain-computer interfaces (BCIs) to reduce the calibration effort for a new subject, and demonstrated promising performance. While a closed-loop MI-based BCI system, after electroencephalogram (EEG) signal acquisition and temporal filtering, includes spatial filtering, feature engineering, and classification blocks before send…

    Submitted 22 January, 2021; v1 submitted 3 July, 2020; originally announced July 2020.

    Journal ref: Neural Networks, 153:235-253, 2022

  19. arXiv:2006.09702  [pdf, other]

    cs.LG stat.ML

    Robust Meta-learning for Mixed Linear Regression with Small Batches

    Authors: Weihao Kong, Raghav Somani, Sham Kakade, Sewoong Oh

    Abstract: A common challenge faced in practical supervised learning, such as medical image processing and robotic interactions, is that there are plenty of tasks but each task cannot afford to collect enough labeled examples to be learned in isolation. However, by exploiting the similarities across those tasks, one can hope to overcome such data scarcity. Under a canonical scenario where each task is drawn…

    Submitted 18 June, 2020; v1 submitted 17 June, 2020; originally announced June 2020.

    Comments: 52 pages, 2 figures

  20. arXiv:2002.08936  [pdf, other]

    cs.LG stat.ML

    Meta-learning for mixed linear regression

    Authors: Weihao Kong, Raghav Somani, Zhao Song, Sham Kakade, Sewoong Oh

    Abstract: In modern supervised learning, there are a large number of tasks, but many of them are associated with only a small amount of labeled data. These include data from medical image processing and robotic interaction. Even though each individual task cannot be meaningfully trained in isolation, one seeks to meta-learn across the tasks from past experiences by exploiting some similarities. We study a f…

    Submitted 20 February, 2020; originally announced February 2020.

  21. arXiv:1912.06111  [pdf, other]

    cs.LG stat.ML

    Sublinear Optimal Policy Value Estimation in Contextual Bandits

    Authors: Weihao Kong, Gregory Valiant, Emma Brunskill

    Abstract: We study the problem of estimating the expected reward of the optimal policy in the stochastic disjoint linear bandit setting. We prove that for certain settings it is possible to obtain an accurate estimate of the optimal policy value even with a number of samples that is sublinear in the number that would be required to \emph{find} a policy that realizes a value close to this optima. We establis…

    Submitted 13 December, 2019; v1 submitted 12 December, 2019; originally announced December 2019.

    Comments: Extended to the mixture of Gaussians setting

  22. arXiv:1912.05903   

    q-bio.QM cs.LG q-bio.BM stat.ML

    Prediction and optimization of NaV1.7 inhibitors based on machine learning methods

    Authors: Weikaixin Kong, Xinyu Tu, Zhengwei Xie, Zhuo Huang

    Abstract: We used machine learning methods to predict NaV1.7 inhibitors and found the model RF-CDK that performed best on the imbalanced dataset. Using the RF-CDK model for screening drugs, we got effective compounds K1. We use the cell patch clamp method to verify K1. However, because the model evaluation method in this article is not comprehensive enough, there is still a lot of research work to be perfor…

    Submitted 15 February, 2020; v1 submitted 29 November, 2019; originally announced December 2019.

    Comments: The evaluation of the model in the results section of this article is not comprehensive enough. We will carry out further work. The article needs to be polished. There are certain disadvantages to the molecular optimization method. The discussion part is not deep enough, so withdrawal is needed

  23. arXiv:1911.12568  [pdf, other]

    cs.LG math.ST stat.ML

    Optimal Estimation of Change in a Population of Parameters

    Authors: Ramya Korlakai Vinayak, Weihao Kong, Sham M. Kakade

    Abstract: Paired estimation of change in parameters of interest over a population plays a central role in several application domains including those in the social sciences, epidemiology, medicine and biology. In these domains, the size of the population under study is often very large, however, the number of observations available per individual in the population is very small (\emph{sparse observations})…

    Submitted 28 November, 2019; originally announced November 2019.

  24. arXiv:1902.04553  [pdf, ps, other]

    math.ST cs.LG stat.ML

    Maximum Likelihood Estimation for Learning Populations of Parameters

    Authors: Ramya Korlakai Vinayak, Weihao Kong, Gregory Valiant, Sham M. Kakade

    Abstract: Consider a setting with $N$ independent individuals, each with an unknown parameter, $p_i \in [0, 1]$ drawn from some unknown distribution $P^\star$. After observing the outcomes of $t$ independent Bernoulli trials, i.e., $X_i \sim \text{Binomial}(t, p_i)$ per individual, our objective is to accurately estimate $P^\star$. This problem arises in numerous domains, including the social sciences, psyc…

    Submitted 12 February, 2019; originally announced February 2019.

  25. arXiv:1808.06347  [pdf, other]

    cs.LG stat.ML

    A Distribution Similarity Based Regularizer for Learning Bayesian Networks

    Authors: Weirui Kong, Wenyi Wang

    Abstract: Probabilistic graphical models compactly represent joint distributions by decomposing them into factors over subsets of random variables. In Bayesian networks, the factors are conditional probability distributions. For many problems, common information exists among those factors. Adding similarity restrictions can be viewed as imposing prior knowledge for model regularization. With proper restrict…

    Submitted 20 August, 2018; originally announced August 2018.

  26. arXiv:1806.00040  [pdf, ps, other]

    cs.LG cs.CC cs.DS math.ST stat.ML

    Efficient Algorithms and Lower Bounds for Robust Linear Regression

    Authors: Ilias Diakonikolas, Weihao Kong, Alistair Stewart

    Abstract: We study the problem of high-dimensional linear regression in a robust model where an $ε$-fraction of the samples can be adversarially corrupted. We focus on the fundamental setting where the covariates of the uncorrupted samples are drawn from a Gaussian distribution $\mathcal{N}(0, Σ)$ on $\mathbb{R}^d$. We give nearly tight upper bounds and computational lower bounds for this problem. Specifica…

    Submitted 31 May, 2018; originally announced June 2018.

  27. arXiv:1805.01626  [pdf, other]

    cs.LG stat.ML

    Estimating Learnability in the Sublinear Data Regime

    Authors: Weihao Kong, Gregory Valiant

    Abstract: We consider the problem of estimating how well a model class is capable of fitting a distribution of labeled data. We show that it is often possible to accurately estimate this "learnability" even when given an amount of data that is too small to reliably learn any accurate model. Our first result applies to the setting where the data is drawn from a $d$-dimensional distribution with isotropic cov…

    Submitted 25 March, 2019; v1 submitted 4 May, 2018; originally announced May 2018.

  28. arXiv:1602.00061  [pdf, other]

    cs.LG stat.ML

    Spectrum Estimation from Samples

    Authors: Weihao Kong, Gregory Valiant

    Abstract: We consider the problem of approximating the set of eigenvalues of the covariance matrix of a multivariate distribution (equivalently, the problem of approximating the "population spectrum"), given access to samples drawn from the distribution. The eigenvalues of the covariance of a distribution contain basic information about the distribution, including the presence or lack of structure in the di…

    Submitted 16 July, 2017; v1 submitted 29 January, 2016; originally announced February 2016.

    MSC Class: 62H12; 62H10