Showing 1–32 of 32 results for author: Ha, Y

Searching in archive stat.
  1. arXiv:2501.14197  [pdf, other]

    cs.LG cs.SI stat.ML

    Bi-directional Curriculum Learning for Graph Anomaly Detection: Dual Focus on Homogeneity and Heterogeneity

    Authors: Yitong Hao, Enbo He, Yue Zhang, Guisheng Yin

    Abstract: Graph anomaly detection (GAD) aims to identify nodes from a graph that are significantly different from normal patterns. Most previous studies are model-driven, focusing on enhancing the detection effect by improving the model structure. However, these approaches often treat all nodes equally, neglecting the different contributions of various nodes to the training. Therefore, we introduce graph cu…

    Submitted 23 January, 2025; originally announced January 2025.

    Comments: 8 pages, 5 figures

  2. arXiv:2410.20650  [pdf, other]

    cs.LG cs.AI stat.ML

    NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks

    Authors: Yongchang Hao, Yanshuai Cao, Lili Mou

    Abstract: The performance of neural networks improves when more parameters are used. However, the model sizes are constrained by the available on-device memory during training and inference. Although applying techniques like quantization can alleviate the constraint, they suffer from performance degradation. In this work, we introduce NeuZip, a new weight compression scheme based on the entropy of floating-…

    Submitted 27 October, 2024; originally announced October 2024.

  3. arXiv:2409.12173  [pdf, other]

    stat.ME

    Poisson approximate likelihood compared to the particle filter

    Authors: Yize Hao, Aaron A. Abkemeier, Edward L. Ionides

    Abstract: Filtering algorithms are fundamental for inference on partially observed stochastic dynamic systems, since they provide access to the likelihood function and hence enable likelihood-based or Bayesian inference. A novel Poisson approximate likelihood (PAL) filter was introduced by Whitehouse et al. (2023). PAL employs a Poisson approximation to conditional densities, offering a fast approximation t…

    Submitted 18 September, 2024; originally announced September 2024.

  4. arXiv:2409.11134  [pdf, other]

    stat.ME

    E-Values for Exponential Families: the General Case

    Authors: Yunda Hao, Peter Grünwald

    Abstract: We analyze common types of e-variables and e-processes for composite exponential family nulls: the optimal e-variable based on the reverse information projection (RIPr), the conditional (COND) e-variable, and the universal inference (UI) and sequentialized RIPr e-processes. We characterize the RIPr prior for simple and Bayes-mixture based alternatives, either precisely (for Gaussian nulls and al…

    Submitted 17 September, 2024; originally announced September 2024.

  5. arXiv:2404.19465  [pdf, other]

    stat.ME math.ST

    Optimal E-Values for Exponential Families: the Simple Case

    Authors: Peter Grünwald, Tyron Lardy, Yunda Hao, Shaul K. Bar-Lev, Martijn de Jong

    Abstract: We provide a general condition under which e-variables in the form of a simple-vs.-simple likelihood ratio exist when the null hypothesis is a composite, multivariate exponential family. Such `simple' e-variables are easy to compute and expected-log-optimal with respect to any stopping time. Simple e-variables were previously only known to exist in quite specific settings, but we offer a unifying…

    Submitted 1 April, 2025; v1 submitted 30 April, 2024; originally announced April 2024.

  6. arXiv:2403.17592  [pdf, other]

    cs.LG stat.ML

    On the Benefits of Over-parameterization for Out-of-Distribution Generalization

    Authors: Yifan Hao, Yong Lin, Difan Zou, Tong Zhang

    Abstract: In recent years, machine learning models have achieved success based on the independently and identically distributed assumption. However, this assumption can be easily violated in real-world applications, leading to the Out-of-Distribution (OOD) problem. Understanding how modern over-parameterized DNNs behave under non-trivial natural distributional shifts is essential, as current theoretical und…

    Submitted 26 March, 2024; originally announced March 2024.

  7. arXiv:2402.03295  [pdf, other]

    cs.LG cs.AI math.OC stat.ML

    Ginger: An Efficient Curvature Approximation with Linear Complexity for General Neural Networks

    Authors: Yongchang Hao, Yanshuai Cao, Lili Mou

    Abstract: Second-order optimization approaches like the generalized Gauss-Newton method are considered more powerful as they utilize the curvature information of the objective function with preconditioning matrices. Albeit offering tempting theoretical benefits, they are not easily applicable to modern deep learning. The major reason is due to the quadratic memory and cubic time complexity to compute the in…

    Submitted 5 February, 2024; originally announced February 2024.

  8. arXiv:2402.03293  [pdf, other]

    cs.LG cs.AI stat.ML

    Flora: Low-Rank Adapters Are Secretly Gradient Compressors

    Authors: Yongchang Hao, Yanshuai Cao, Lili Mou

    Abstract: Despite large neural networks demonstrating remarkable abilities to complete different tasks, they require excessive memory usage to store the optimization states for training. To alleviate this, the low-rank adaptation (LoRA) is proposed to reduce the optimization states by training fewer parameters. However, LoRA restricts overall weight update matrices to be low-rank, limiting the model perform…

    Submitted 12 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Accepted @ ICML 2024

  9. arXiv:2401.12236  [pdf, ps, other]

    cs.LG cs.CR stat.ML

    The Surprising Harmfulness of Benign Overfitting for Adversarial Robustness

    Authors: Yifan Hao, Tong Zhang

    Abstract: Recent empirical and theoretical studies have established the generalization capabilities of large machine learning models that are trained to (approximately or exactly) fit noisy data. In this work, we prove a surprising result that even if the ground truth itself is robust to adversarial examples, and the benignly overfitted model is benign in terms of the ``standard'' out-of-sample risk objecti…

    Submitted 25 January, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

  10. arXiv:2307.02037  [pdf, other]

    stat.ML cs.LG math.OC

    Reverse Diffusion Monte Carlo

    Authors: Xunpeng Huang, Hanze Dong, Yifan Hao, Yi-An Ma, Tong Zhang

    Abstract: We propose a Monte Carlo sampler from the reverse diffusion process. Unlike the practice of diffusion models, where the intermediary updates -- the score functions -- are learned with a neural network, we transform the score matching problem into a mean estimation one. By estimating the means of the regularized posterior distributions, we derive a novel Monte Carlo sampling algorithm called revers…

    Submitted 13 March, 2024; v1 submitted 5 July, 2023; originally announced July 2023.

    Comments: 44 pages, 16 figures, ICLR 2024

  11. arXiv:2303.00471  [pdf, other]

    stat.ME math.ST

    E-values for k-Sample Tests With Exponential Families

    Authors: Yunda Hao, Peter Grünwald, Tyron Lardy, Long Long, Reuben Adams

    Abstract: We develop and compare e-variables for testing whether $k$ samples of data are drawn from the same distribution, the alternative being that they come from different elements of an exponential family. We consider the GRO (growth-rate optimal) e-variables for (1) a `small' null inside the same exponential family, and (2) a `large' nonparametric null, as well as (3) an e-variable arrived at by condit…

    Submitted 8 January, 2024; v1 submitted 1 March, 2023; originally announced March 2023.

  12. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  13. arXiv:2202.11968  [pdf]

    stat.AP

    Combining the target trial and estimand frameworks to define the causal estimand: an application using real-world data to contextualize a single-arm trial

    Authors: Lisa V Hampson, Jufen Chu, Aiesha Zia, Jie Zhang, Wei-Chun Hsu, Craig Parzynski, Yanni Hao, Evgeny Degtyarev

    Abstract: Single-arm trials (SATs) may be used to support regulatory submissions in settings where there is a high unmet medical need and highly promising early efficacy data undermine the equipoise needed for randomization. In this context, patient-level real-world data (RWD) may be used to create an external control arm (ECA) to contextualize the SAT results. However, naive comparisons of the SAT with its…

    Submitted 24 February, 2022; originally announced February 2022.

  14. arXiv:2202.07172  [pdf, other]

    stat.ML cs.LG math.ST

    TURF: A Two-factor, Universal, Robust, Fast Distribution Learning Algorithm

    Authors: Yi Hao, Ayush Jain, Alon Orlitsky, Vaishakh Ravindrakumar

    Abstract: Approximating distributions from their samples is a canonical statistical-learning problem. One of its most powerful and successful modalities approximates every distribution to an $\ell_1$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d$ polynomial, where $t\ge1$ and $d\ge0$. Letting $c_{t,d}$ denote the smallest such factor, clearly $c_{1,0}=1$, and it c…

    Submitted 17 June, 2022; v1 submitted 14 February, 2022; originally announced February 2022.

    Comments: 19 pages, 12 figures

  15. arXiv:2110.14374  [pdf, other]

    physics.comp-ph cond-mat.dis-nn stat.ML

    A2I Transformer: Permutation-equivariant attention network for pairwise and many-body interactions with minimal featurization

    Authors: Ji Woong Yu, Min Young Ha, Bumjoon Seo, Won Bo Lee

    Abstract: The combination of neural network potential (NNP) with molecular simulations plays an important role in an efficient and thorough understanding of a molecular system's potential energy surface (PES). However, grasping the interplay between input features and their local contribution to NNP is growingly evasive due to heavy featurization. In this work, we suggest an end-to-end model which directly…

    Submitted 27 October, 2021; originally announced October 2021.

  16. arXiv:2010.16055  [pdf, other]

    cs.LG stat.ML

    Unsupervised Embedding of Hierarchical Structure in Euclidean Space

    Authors: Jinyu Zhao, Yi Hao, Cyrus Rashtchian

    Abstract: Deep embedding methods have influenced many areas of unsupervised learning. However, the best methods for learning hierarchical structure use non-Euclidean representations, whereas Euclidean geometry underlies the theory behind many hierarchical clustering algorithms. To bridge the gap between these two areas, we consider learning a non-linear embedding of data into Euclidean space as a way to imp…

    Submitted 29 October, 2020; originally announced October 2020.

  17. arXiv:2007.08053  [pdf, other]

    cs.LG cs.SI stat.ML

    Inductive Link Prediction for Nodes Having Only Attribute Information

    Authors: Yu Hao, Xin Cao, Yixiang Fang, Xike Xie, Sibo Wang

    Abstract: Predicting the link between two nodes is a fundamental problem for graph data analytics. In attributed graphs, both the structure and attribute information can be utilized for link prediction. Most existing studies focus on transductive link prediction where both nodes are already in the graph. However, many real-world applications require inductive prediction for new nodes having only attribute i…

    Submitted 15 July, 2020; originally announced July 2020.

    Comments: IJCAI2020

  18. arXiv:2004.11934  [pdf, other]

    cs.LG stat.ML

    Correlation-aware Unsupervised Change-point Detection via Graph Neural Networks

    Authors: Ruohong Zhang, Yu Hao, Donghan Yu, Wei-Cheng Chang, Guokun Lai, Yiming Yang

    Abstract: Change-point detection (CPD) aims to detect abrupt changes over time series data. Intuitively, effective CPD over multivariate time series should require explicit modeling of the dependencies across input variables. However, existing CPD methods either ignore the dependency structures entirely or rely on the (unrealistic) assumption that the correlation structures are static over time. In this pap…

    Submitted 13 September, 2020; v1 submitted 24 April, 2020; originally announced April 2020.

    Comments: Accepted for publication in the International Conference on Neural Information Processing (ICONIP) 2020. Original paper is 12 pages; additional appendix is available on arXiv

    MSC Class: I.2.6

    Journal ref: ICONIP 2020: Neural Information Processing

  19. arXiv:2003.09660  [pdf, other]

    cs.LG cs.AI stat.ML

    NeuCrowd: Neural Sampling Network for Representation Learning with Crowdsourced Labels

    Authors: Yang Hao, Wenbiao Ding, Zitao Liu

    Abstract: Representation learning approaches require a massive amount of discriminative training data, which is unavailable in many scenarios, such as healthcare, smart city, education, etc. In practice, people refer to crowdsourcing to get annotated labels. However, due to issues like data privacy, budget limitation, shortage of domain-specific annotators, the number of crowdsourced labels is still very li…

    Submitted 15 December, 2021; v1 submitted 21 March, 2020; originally announced March 2020.

    Comments: Accepted in Knowledge and Information Systems

  20. arXiv:2002.11665  [pdf, ps, other]

    stat.ML cs.IT cs.LG math.ST

    Profile Entropy: A Fundamental Measure for the Learnability and Compressibility of Discrete Distributions

    Authors: Yi Hao, Alon Orlitsky

    Abstract: The profile of a sample is the multiset of its symbol frequencies. We show that for samples of discrete distributions, profile entropy is a fundamental measure unifying the concepts of estimation, inference, and compression. Specifically, profile entropy a) determines the speed of estimating the distribution relative to the best natural estimator; b) characterizes the rate of inferring all symmetr…

    Submitted 26 February, 2020; originally announced February 2020.

    Comments: 56 pages

  21. arXiv:2002.09589  [pdf, other]

    stat.ML cs.IT cs.LG math.ST

    SURF: A Simple, Universal, Robust, Fast Distribution Learning Algorithm

    Authors: Yi Hao, Ayush Jain, Alon Orlitsky, Vaishakh Ravindrakumar

    Abstract: Sample- and computationally-efficient distribution estimation is a fundamental tenet in statistics and machine learning. We present SURF, an algorithm for approximating distributions by piecewise polynomials. SURF is: simple, replacing prior complex optimization techniques by straight-forward {empirical probability} approximation of each potential polynomial piece {through simple empirical-probabi…

    Submitted 11 February, 2021; v1 submitted 21 February, 2020; originally announced February 2020.

    Comments: 27 pages, 9 figures, 3 tables

  22. arXiv:1911.03105  [pdf, ps, other]

    cs.LG math.ST stat.ML

    Unified Sample-Optimal Property Estimation in Near-Linear Time

    Authors: Yi Hao, Alon Orlitsky

    Abstract: We consider the fundamental learning problem of estimating properties of distributions over large domains. Using a novel piecewise-polynomial approximation technique, we derive the first unified methodology for constructing sample- and time-efficient estimators for all sufficiently smooth, symmetric and non-symmetric, additive properties. This technique yields near-linear-time computable estimator…

    Submitted 17 March, 2020; v1 submitted 8 November, 2019; originally announced November 2019.

    Comments: Appeared at NeurIPS 2019. Fixed a few typos and minor issues in corner cases

  23. arXiv:1911.00776  [pdf, other]

    cs.LG stat.ML

    Ten-year Survival Prediction for Breast Cancer Patients

    Authors: Changmao Li, Han He, Yunze Hao, Caleb Ziems

    Abstract: This report assesses different machine learning approaches to 10-year survival prediction of breast cancer patients.

    Submitted 2 November, 2019; originally announced November 2019.

  24. arXiv:1906.03794  [pdf, other]

    stat.ML cs.LG math.ST

    The Broad Optimality of Profile Maximum Likelihood

    Authors: Yi Hao, Alon Orlitsky

    Abstract: We study three fundamental statistical-learning problems: distribution estimation, property estimation, and property testing. We establish the profile maximum likelihood (PML) estimator as the first unified sample-optimal approach to a wide range of learning tasks. In particular, for every alphabet size $k$ and desired accuracy $\varepsilon$: $\textbf{Distribution estimation}$ Under $\ell_1$ dis…

    Submitted 11 July, 2019; v1 submitted 10 June, 2019; originally announced June 2019.

    Comments: Added a new section (Section 8) about truncated PML (TPML) and derived several new results

  25. arXiv:1905.13550  [pdf]

    cs.LG eess.SP stat.AP stat.ML

    A novel hybrid model based on multi-objective Harris hawks optimization algorithm for daily PM2.5 and PM10 forecasting

    Authors: Pei Du, Jianzhou Wang, Yan Hao, Tong Niu, Wendong Yang

    Abstract: High levels of air pollution may seriously affect people's living environment and even endanger their lives. In order to reduce air pollution concentrations, and warn the public before the occurrence of hazardous air pollutants, it is urgent to design an accurate and reliable air pollutant forecasting model. However, most previous research has many deficiencies, such as ignoring the importance of…

    Submitted 30 May, 2019; originally announced May 2019.

    Comments: 24 pages, 4 figures

    MSC Class: 68U20

  26. arXiv:1904.00070  [pdf, other]

    stat.ML cs.LG math.ST

    Data Amplification: A Unified and Competitive Approach to Property Estimation

    Authors: Yi Hao, Alon Orlitsky, Ananda T. Suresh, Yihong Wu

    Abstract: Estimating properties of discrete distributions is a fundamental problem in statistical learning. We design the first unified, linear-time, competitive, property estimator that for a wide class of properties and for all underlying distributions uses just $2n$ samples to achieve the performance attained by the empirical estimator with $n\sqrt{\log n}$ samples. This provides off-the-shelf, distribut…

    Submitted 29 March, 2019; originally announced April 2019.

    Comments: In NeurIPS 2018

  27. arXiv:1903.01432  [pdf, other]

    math.ST cs.LG stat.ML

    Data Amplification: Instance-Optimal Property Estimation

    Authors: Yi Hao, Alon Orlitsky

    Abstract: The best-known and most commonly used distribution-property estimation technique uses a plug-in estimator, with empirical frequency replacing the underlying distribution. We present novel linear-time-computable estimators that significantly "amplify" the effective amount of data available. For a large variety of distribution properties including four of the most popular ones and for every underlyi…

    Submitted 5 March, 2019; v1 submitted 4 March, 2019; originally announced March 2019.

    Comments: In this new version, we strengthened the previous results by eliminating unnecessary assumptions

  28. arXiv:1810.11754  [pdf, other]

    cs.LG stat.ML

    On Learning Markov Chains

    Authors: Yi Hao, Alon Orlitsky, Venkatadheeraj Pichapati

    Abstract: The problem of estimating an unknown discrete distribution from its samples is a fundamental tenet of statistical learning. Over the past decade, it attracted significant research effort and has been solved for a variety of divergence measures. Surprisingly, an equally important problem, estimating an unknown Markov chain from its samples, is still far from understood. We consider two problems rel…

    Submitted 27 October, 2018; originally announced October 2018.

    Comments: To appear at NIPS 2018

  29. arXiv:1806.00740  [pdf]

    stat.AP

    How does climate change influence regional stability

    Authors: Tianyu Shi, Jiayan Guo, Xuxin Cheng, Yu Hao

    Abstract: Nowadays, different places have different regional stability, which is influenced by many factors. This paper aims to analyze the influence of climate change on regional stability. Several factors that may influence regional stability are proposed. Then Principal Component Analysis (PCA) was used to select the most relevant factors. After that, a BP neural network is established con…

    Submitted 5 August, 2018; v1 submitted 3 June, 2018; originally announced June 2018.

  30. arXiv:1704.05041  [pdf, other]

    cs.LG stat.ML

    Fast multi-output relevance vector regression

    Authors: Youngmin Ha

    Abstract: This paper aims to decrease the time complexity of multi-output relevance vector regression from O(VM^3) to O(V^3+M^3), where V is the number of output dimensions, M is the number of basis functions, and V<M. The experimental results demonstrate that the proposed method is more competitive than the existing method, with regard to computation time. MATLAB codes are available at http://www.mathworks…

    Submitted 17 April, 2017; originally announced April 2017.

  31. arXiv:1704.04137  [pdf, other]

    stat.ML cs.CY

    Fashion Conversation Data on Instagram

    Authors: Yu-I Ha, Sejeong Kwon, Meeyoung Cha, Jungseock Joo

    Abstract: The fashion industry is establishing its presence on a number of visual-centric social media like Instagram. This creates an interesting clash as fashion brands that have traditionally practiced highly creative and editorialized image marketing now have to engage with people on the platform that epitomizes impromptu, realtime conversation. What kinds of fashion images do brands and individuals sha…

    Submitted 13 April, 2017; originally announced April 2017.

    Comments: 10 pages, 6 figures, This paper will be presented at ICWSM'17

  32. arXiv:1311.1040  [pdf]

    stat.ML cs.LG

    Combined Independent Component Analysis and Canonical Polyadic Decomposition via Joint Diagonalization

    Authors: Xiao-Feng Gong, Cheng-Yuan Wang, Ya-Na Hao, Qiu-Hua Lin

    Abstract: Recently, there has been a trend to combine independent component analysis and canonical polyadic decomposition (ICA-CPD) for an enhanced robustness for the computation of CPD, and ICA-CPD could be further converted into CPD of a 5th-order partially symmetric tensor, by calculating the eigenmatrices of the 4th-order cumulant slices of a trilinear mixture. In this study, we propose a new 5th-order…

    Submitted 27 December, 2016; v1 submitted 5 November, 2013; originally announced November 2013.

    Comments: IEEE China Summit & International Conference on Signal and Information Processing. IEEE, 2014:804 - 808