
Showing 1–50 of 297 results for author: Zhang, T

  1. arXiv:2506.20533  [pdf, ps, other]

    stat.ML cs.LG math.OC

    Global Convergence of Iteratively Reweighted Least Squares for Robust Subspace Recovery

    Authors: Gilad Lerman, Kang Li, Tyler Maunu, Teng Zhang

    Abstract: Robust subspace estimation is fundamental to many machine learning and data analysis tasks. Iteratively Reweighted Least Squares (IRLS) is an elegant and empirically effective approach to this problem, yet its theoretical properties remain poorly understood. This paper establishes that, under deterministic conditions, a variant of IRLS with dynamic smoothing regularization converges linearly to th…

    Submitted 29 June, 2025; v1 submitted 25 June, 2025; originally announced June 2025.

  2. arXiv:2505.21892  [pdf, ps, other]

    stat.ML cs.LG

    Almost Linear Convergence under Minimal Score Assumptions: Quantized Transition Diffusion

    Authors: Xunpeng Huang, Yingyu Lin, Nikki Lijing Kuang, Hanze Dong, Difan Zou, Yian Ma, Tong Zhang

    Abstract: Continuous diffusion models have demonstrated remarkable performance in data generation across various domains, yet their efficiency remains constrained by two critical limitations: (1) the local adjacency structure of the forward Markov process, which restricts long-range transitions in the data space, and (2) inherent biases introduced during the simulation of time-inhomogeneous reverse denoisin…

    Submitted 27 May, 2025; originally announced May 2025.

    Comments: 37 pages, 3 figures, 3 tables

  3. arXiv:2505.12113  [pdf, ps, other]

    stat.ME stat.CO

    Cyclic-Shift Sparse Kronecker Tensor Classifier for Signal-Region Detection in Neuroimaging

    Authors: Hsin-Hsiung Huang, Yuh-Haur Chen, Teng Zhang

    Abstract: This study proposes a cyclic-shift logistic sparse Kronecker product decomposition (SKPD) model for high-dimensional tensor data, enhancing the SKPD framework with a cyclic-shift mechanism for binary classification. The method enables interpretable and scalable analysis of brain MRI data, detecting disease-relevant regions through a structured low-rank factorization. By incorporating a second spat…

    Submitted 17 May, 2025; originally announced May 2025.

    MSC Class: 62H35; 65C60

  4. arXiv:2505.06487  [pdf]

    stat.AP math.OC

    Data Envelopment Analysis with Robust and Closest Targets: Integrating Full-Dimensional Efficient Facets for Risk-Resilient Benchmarking

    Authors: Xiuquan Huang, Xi Wang, Tao Zhang, Xiaocang Xu, Ali Emrouznejad

    Abstract: As the external environment becomes increasingly volatile and unpredictable, the selection of benchmarking targets in data envelopment analysis should account for their ability to consider risks; however, this aspect has not received sufficient attention. We propose a robust benchmarking target defined by the intersection of the maximum number of full-dimensional efficient facets, each representing…

    Submitted 18 June, 2025; v1 submitted 9 May, 2025; originally announced May 2025.

  5. arXiv:2505.01049  [pdf, other]

    cs.LG math.AP math.ST stat.ML

    Multi-Step Consistency Models: Fast Generation with Theoretical Guarantees

    Authors: Nishant Jain, Xunpeng Huang, Yian Ma, Tong Zhang

    Abstract: Consistency models have recently emerged as a compelling alternative to traditional SDE-based diffusion models. They offer a significant acceleration in generation by producing high-quality samples in very few steps. Despite their empirical success, a proper theoretic justification for their speed-up is still lacking. In this work, we address the gap by providing a theoretical analysis of consiste…

    Submitted 25 May, 2025; v1 submitted 2 May, 2025; originally announced May 2025.

    Comments: 31 pages

  6. arXiv:2504.21314  [pdf, other]

    cs.LG stat.ML

    Capturing Conditional Dependence via Auto-regressive Diffusion Models

    Authors: Xunpeng Huang, Yujin Han, Difan Zou, Yian Ma, Tong Zhang

    Abstract: Diffusion models have demonstrated appealing performance in both image and video generation. However, many works discover that they struggle to capture important, high-level relationships that are present in the real world. For example, they fail to learn physical laws from data, and even fail to understand that the objects in the world exist in a stable fashion. This is due to the fact that impor…

    Submitted 30 April, 2025; originally announced April 2025.

  7. arXiv:2504.11775  [pdf, other]

    stat.ML cs.CY cs.LG q-fin.RM

    Discrimination-free Insurance Pricing with Privatized Sensitive Attributes

    Authors: Tianhe Zhang, Suhan Liu, Peng Shi

    Abstract: Fairness has emerged as a critical consideration in the landscape of machine learning algorithms, particularly as AI continues to transform decision-making across societal domains. To ensure that these algorithms are free from bias and do not discriminate against individuals based on sensitive attributes such as gender and race, the field of algorithmic bias has introduced various fairness concept…

    Submitted 16 April, 2025; originally announced April 2025.

  8. arXiv:2504.11343  [pdf, ps, other]

    cs.LG cs.AI cs.CL stat.ML

    A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce

    Authors: Wei Xiong, Jiarui Yao, Yuhui Xu, Bo Pang, Lei Wang, Doyen Sahoo, Junnan Li, Nan Jiang, Tong Zhang, Caiming Xiong, Hanze Dong

    Abstract: Reinforcement learning (RL) has become a prevailing approach for fine-tuning large language models (LLMs) on complex reasoning tasks. Among recent methods, GRPO stands out for its empirical success in training models such as DeepSeek-R1, yet the sources of its effectiveness remain poorly understood. In this work, we revisit GRPO from a reinforce-like algorithm perspective and analyze its core comp…

    Submitted 12 June, 2025; v1 submitted 15 April, 2025; originally announced April 2025.

  9. arXiv:2504.08743  [pdf, other]

    cs.IR cs.LG eess.SY math.OC stat.AP

    Dynamic Topic Analysis in Academic Journals using Convex Non-negative Matrix Factorization Method

    Authors: Yang Yang, Tong Zhang, Jian Wu, Lijie Su

    Abstract: With the rapid advancement of large language models, academic topic identification and topic evolution analysis are crucial for enhancing AI's understanding capabilities. Dynamic topic analysis provides a powerful approach to capturing and understanding the temporal evolution of topics in large-scale datasets. This paper presents a two-stage dynamic topic analysis framework that incorporates conve…

    Submitted 23 March, 2025; originally announced April 2025.

    Comments: 11 pages, 7 figures, 6 tables

  10. arXiv:2503.21352  [pdf]

    cs.AI stat.AP

    Using large language models to produce literature reviews: Usages and systematic biases of microphysics parametrizations in 2699 publications

    Authors: Tianhang Zhang, Shengnan Fu, David M. Schultz, Zhonghua Zheng

    Abstract: Large language models afford opportunities for using computers for intensive tasks, realizing research opportunities that have not been considered before. One such opportunity could be a systematic interrogation of the scientific literature. Here, we show how a large language model can be used to construct a literature review of 2699 publications associated with microphysics parametrizations in th…

    Submitted 27 March, 2025; originally announced March 2025.

  11. arXiv:2503.00156  [pdf, other]

    astro-ph.IM cs.CV cs.LG eess.IV stat.AP

    Neural Posterior Estimation for Cataloging Astronomical Images with Spatially Varying Backgrounds and Point Spread Functions

    Authors: Aakash Patel, Tianqing Zhang, Camille Avestruz, Jeffrey Regier, the LSST Dark Energy Science Collaboration

    Abstract: Neural posterior estimation (NPE), a type of amortized variational inference, is a computationally efficient means of constructing probabilistic catalogs of light sources from astronomical images. To date, NPE has not been used to perform inference in models with spatially varying covariates. However, ground-based astronomical images have spatially varying sky backgrounds and point spread function…

    Submitted 28 February, 2025; originally announced March 2025.

    MSC Class: 85A35; 62F15 ACM Class: J.2; I.2.10

  12. arXiv:2502.13747  [pdf, other]

    cs.LG stat.ME stat.ML

    Reverse Markov Learning: Multi-Step Generative Models for Complex Distributions

    Authors: Xinwei Shen, Nicolai Meinshausen, Tong Zhang

    Abstract: Learning complex distributions is a fundamental challenge in contemporary applications. Generative models, such as diffusion models, have demonstrated remarkable success in overcoming many limitations of traditional statistical methods. Shen and Meinshausen (2024) introduced engression, a generative approach based on scoring rules that maps noise (and covariates, if available) directly to data. Wh…

    Submitted 19 February, 2025; originally announced February 2025.

  13. arXiv:2502.08136  [pdf, ps, other]

    cs.LG stat.ML

    In-Context Learning of Linear Dynamical Systems with Transformers: Error Bounds and Depth-Separation

    Authors: Frank Cole, Yulong Lu, Tianhao Zhang, Yuxuan Zhao

    Abstract: This paper investigates approximation-theoretic aspects of the in-context learning capability of the transformers in representing a family of noisy linear dynamical systems. Our first theoretical result establishes an upper bound on the approximation error of multi-layer transformers with respect to an $L^2$-testing loss uniformly defined across tasks. This result demonstrates that transformers wi…

    Submitted 13 February, 2025; v1 submitted 12 February, 2025; originally announced February 2025.

  14. arXiv:2502.07460  [pdf, ps, other]

    cs.LG stat.ML

    Logarithmic Regret for Online KL-Regularized Reinforcement Learning

    Authors: Heyang Zhao, Chenlu Ye, Wei Xiong, Quanquan Gu, Tong Zhang

    Abstract: Recent advances in Reinforcement Learning from Human Feedback (RLHF) have shown that KL-regularization plays a pivotal role in improving the efficiency of RL fine-tuning for large language models (LLMs). Despite its empirical advantage, the theoretical difference between KL-regularized RL and standard RL remains largely under-explored. While there is a recent line of work on the theoretical analys…

    Submitted 30 May, 2025; v1 submitted 11 February, 2025; originally announced February 2025.

  15. arXiv:2502.06051  [pdf, ps, other]

    cs.LG cs.AI math.ST stat.ML

    Towards a Sharp Analysis of Offline Policy Learning for $f$-Divergence-Regularized Contextual Bandits

    Authors: Qingyue Zhao, Kaixuan Ji, Heyang Zhao, Tong Zhang, Quanquan Gu

    Abstract: Although many popular reinforcement learning algorithms are underpinned by $f$-divergence regularization, their sample complexity with respect to the \emph{regularized objective} still lacks a tight characterization. In this paper, we analyze $f$-divergence-regularized offline policy learning. For reverse Kullback-Leibler (KL) divergence, arguably the most commonly used one, we give the first…

    Submitted 30 May, 2025; v1 submitted 9 February, 2025; originally announced February 2025.

    Comments: 38 pages

  16. arXiv:2502.02486  [pdf, ps, other]

    stat.ML cs.LG

    Catoni Contextual Bandits are Robust to Heavy-tailed Rewards

    Authors: Chenlu Ye, Yujia Jin, Alekh Agarwal, Tong Zhang

    Abstract: Typical contextual bandit algorithms assume that the rewards at each round lie in some fixed range $[0, R]$, and their regret scales polynomially with this reward range $R$. However, many practical scenarios naturally involve heavy-tailed rewards or rewards where the worst-case range can be substantially larger than the variance. In this paper, we develop an algorithmic approach building on Catoni…

    Submitted 4 February, 2025; originally announced February 2025.

  17. arXiv:2502.01763  [pdf, other]

    cs.LG math.OC stat.ML

    On The Concurrence of Layer-wise Preconditioning Methods and Provable Feature Learning

    Authors: Thomas T. Zhang, Behrad Moniri, Ansh Nagwekar, Faraz Rahman, Anton Xue, Hamed Hassani, Nikolai Matni

    Abstract: Layer-wise preconditioning methods are a family of memory-efficient optimization algorithms that introduce preconditioners per axis of each layer's weight tensors. These methods have seen a recent resurgence, demonstrating impressive performance relative to entry-wise ("diagonal") preconditioning methods such as Adam(W) on a wide range of neural network optimization tasks. Complementary to their p…

    Submitted 3 February, 2025; originally announced February 2025.

  18. arXiv:2501.15034  [pdf, other]

    cs.LG cs.AI stat.ML

    Divergence-Augmented Policy Optimization

    Authors: Qing Wang, Yingru Li, Jiechao Xiong, Tong Zhang

    Abstract: In deep reinforcement learning, policy optimization methods need to deal with issues such as function approximation and the reuse of off-policy data. Standard policy gradient methods do not handle off-policy data well, leading to premature convergence and instability. This paper introduces a method to stabilize policy optimization when off-policy data are reused. The idea is to include a Bregman d…

    Submitted 24 January, 2025; originally announced January 2025.

    Comments: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada

  19. arXiv:2412.13122  [pdf, other]

    stat.ME

    Fréchet Sufficient Dimension Reduction for Metric Space-Valued Data via Distance Covariance

    Authors: Hsin-Hsiung Huang, Feng Yu, Kang Li, Teng Zhang

    Abstract: We propose a novel Fréchet sufficient dimension reduction (SDR) method based on kernel distance covariance, tailored for metric space-valued responses such as count data, probability densities, and other complex structures. The method leverages a kernel-based transformation to map metric space-valued responses into a feature space, enabling efficient dimension reduction. By incorporating kernel di…

    Submitted 17 December, 2024; originally announced December 2024.

  20. arXiv:2412.03918  [pdf, ps, other]

    stat.ME

    Selection of Ultrahigh-Dimensional Interactions Using $L_0$ Penalty

    Authors: Tonglin Zhang

    Abstract: Selecting interactions from an ultrahigh-dimensional statistical model with $n$ observations and $p$ variables when $p\gg n$ is difficult because the number of candidates for interactions is $p(p-1)/2$ and a selected model should satisfy the strong hierarchical (SH) restriction. A new method called the SHL0 is proposed to overcome the difficulty. The objective function of the SHL0 method is compos…

    Submitted 5 December, 2024; originally announced December 2024.

    Comments: Preprint

    MSC Class: 62J07; 62J05

  21. arXiv:2411.19448  [pdf, ps, other]

    stat.ME

    Unsupervised Variable Selection for Ultrahigh-Dimensional Clustering Analysis

    Authors: Tonglin Zhang, Huyunting Huang

    Abstract: Compared to supervised variable selection, research on unsupervised variable selection lags far behind. A forward partial-variable clustering full-variable loss (FPCFL) method is proposed for the corresponding challenges. An advantage is that the FPCFL method can distinguish active, redundant, and uninformative variables, which the previous methods cannot achieve. Theoretical and simulation stud…

    Submitted 28 November, 2024; originally announced November 2024.

    Comments: Preprint

    MSC Class: 62H30; 62J07; 62H35

  22. arXiv:2411.18830  [pdf, other]

    q-fin.PM math.ST stat.ME

    Double Descent in Portfolio Optimization: Dance between Theoretical Sharpe Ratio and Estimation Accuracy

    Authors: Yonghe Lu, Yanrong Yang, Terry Zhang

    Abstract: We study the relationship between model complexity and out-of-sample performance in the context of mean-variance portfolio optimization. Representing model complexity by the number of assets, we find that the performance of low-dimensional models initially improves with complexity but then declines due to overfitting. As model complexity becomes sufficiently high, the performance improves with com…

    Submitted 27 November, 2024; originally announced November 2024.

  23. arXiv:2411.04625  [pdf, other]

    cs.LG stat.ML

    Sharp Analysis for KL-Regularized Contextual Bandits and RLHF

    Authors: Heyang Zhao, Chenlu Ye, Quanquan Gu, Tong Zhang

    Abstract: Reverse-Kullback-Leibler (KL) regularization has emerged as a predominant technique used to enhance policy optimization in reinforcement learning (RL) and reinforcement learning from human feedback (RLHF), which forces the learned policy to stay close to a reference policy. While the effectiveness and necessity of KL-regularization have been empirically demonstrated in various practical scenari…

    Submitted 11 February, 2025; v1 submitted 7 November, 2024; originally announced November 2024.

  24. arXiv:2410.20006  [pdf, other]

    cs.CV cs.LG stat.ML

    Unsupervised Machine Learning for Detecting and Locating Human-Made Objects in 3D Point Cloud

    Authors: Hong Zhao, Huyunting Huang, Tonglin Zhang, Baijian Yang, Jin Wei-Kocsis, Songlin Fei

    Abstract: A 3D point cloud is an unstructured, sparse, and irregular dataset, typically collected by airborne LiDAR systems over a geological region. Laser pulses emitted from these systems reflect off objects both on and above the ground, resulting in a dataset containing the longitude, latitude, and elevation of each point, as well as information about the corresponding laser pulse strengths. A widely stu…

    Submitted 25 October, 2024; originally announced October 2024.

  25. arXiv:2410.11227  [pdf, other]

    stat.ML cs.LG eess.SY

    Guarantees for Nonlinear Representation Learning: Non-identical Covariates, Dependent Data, Fewer Samples

    Authors: Thomas T. Zhang, Bruce D. Lee, Ingvar Ziemann, George J. Pappas, Nikolai Matni

    Abstract: A driving force behind the diverse applicability of modern machine learning is the ability to extract meaningful features across many sources. However, many practical domains involve data that are non-identically distributed across sources, and statistically dependent within its source, violating vital assumptions in existing theoretical studies. Toward addressing these issues, we establish statis…

    Submitted 14 October, 2024; originally announced October 2024.

    Comments: Appeared at ICML 2024

  26. arXiv:2410.08492  [pdf, ps, other]

    stat.ME

    Exact MLE for Generalized Linear Mixed Models

    Authors: Tonglin Zhang

    Abstract: Exact MLE for generalized linear mixed models (GLMMs) is a long-standing unsolved problem. The proposed research solves the problem. In this problem, the main difficulty is caused by intractable integrals in the likelihood function when the response does not follow a normal distribution and the prior distribution for the random effects is specified as normal. Previous methods use Laplace approximatio…

    Submitted 10 October, 2024; originally announced October 2024.

    Comments: Preprint. arXiv admin note: text overlap with arXiv:2409.09310

    MSC Class: 62F15; 62J05; 62J12

  27. arXiv:2409.12293  [pdf, other]

    cs.LG math.NA stat.ML

    In-Context Learning of Linear Systems: Generalization Theory and Applications to Operator Learning

    Authors: Frank Cole, Yulong Lu, Wuzhe Xu, Tianhao Zhang

    Abstract: We study theoretical guarantees for solving linear systems in-context using a linear transformer architecture. For in-domain generalization, we provide neural scaling laws that bound the generalization error in terms of the number of tasks and sizes of samples used in training and inference. For out-of-domain generalization, we find that the behavior of trained transformers under task distribution…

    Submitted 23 May, 2025; v1 submitted 18 September, 2024; originally announced September 2024.

    Comments: Includes new results for in-domain generalization and operator learning, code available at https://github.com/LuGroupUMN/ICL_Linear_Systems

  28. arXiv:2409.10001  [pdf, other]

    stat.ME

    Generalized Matrix Factor Model

    Authors: Xinbing Kong, Tong Zhang

    Abstract: This article introduces a nonlinear generalized matrix factor model (GMFM) that allows for mixed-type variables, extending the scope of linear matrix factor models (LMFM) that are so far limited to handling continuous variables. We introduce a novel augmented Lagrange multiplier method, equivalent to the constraint maximum likelihood estimation, and carefully tailored to be locally concave around…

    Submitted 16 September, 2024; originally announced September 2024.

  29. arXiv:2409.09310  [pdf, ps, other]

    stat.ME

    Exact Posterior Mean and Covariance for Generalized Linear Mixed Models

    Authors: Tonglin Zhang

    Abstract: A novel method is proposed for the exact posterior mean and covariance of the random effects given the response in a generalized linear mixed model (GLMM) when the response does not follow a normal distribution. The research solves a long-standing problem in Bayesian statistics when an intractable integral appears in the posterior distribution. It is well-known that the posterior distribution of the random effec…

    Submitted 14 September, 2024; originally announced September 2024.

    Comments: Manuscript under review

  30. arXiv:2409.02392  [pdf, other]

    cs.LG stat.ML

    Building Math Agents with Multi-Turn Iterative Preference Learning

    Authors: Wei Xiong, Chengshuai Shi, Jiaming Shen, Aviv Rosenberg, Zhen Qin, Daniele Calandriello, Misha Khalman, Rishabh Joshi, Bilal Piot, Mohammad Saleh, Chi Jin, Tong Zhang, Tianqi Liu

    Abstract: Recent studies have shown that large language models' (LLMs) mathematical problem-solving capabilities can be enhanced by integrating external tools, such as code interpreters, and employing multi-turn Chain-of-Thought (CoT) reasoning. While current methods focus on synthetic data generation and Supervised Fine-Tuning (SFT), this paper studies the complementary direct preference learning approach…

    Submitted 27 February, 2025; v1 submitted 3 September, 2024; originally announced September 2024.

    Comments: A multi-turn direct preference learning framework for tool-integrated reasoning tasks

  31. arXiv:2408.02060  [pdf, other]

    math.ST stat.ME stat.ML

    Winners with Confidence: Discrete Argmin Inference with an Application to Model Selection

    Authors: Tianyu Zhang, Hao Lee, Jing Lei

    Abstract: We study the problem of finding the index of the minimum value of a vector from noisy observations. This problem is relevant in population/policy comparison, discrete maximum likelihood, and model selection. We develop an asymptotically normal test statistic, even in high-dimensional settings and with potentially many ties in the population mean vector, by integrating concepts and tools from cross…

    Submitted 4 December, 2024; v1 submitted 4 August, 2024; originally announced August 2024.

  32. arXiv:2407.19078  [pdf, other]

    cs.LG stat.ML

    Practical Marketplace Optimization at Uber Using Causally-Informed Machine Learning

    Authors: Bobby Chen, Siyu Chen, Jason Dowlatabadi, Yu Xuan Hong, Vinayak Iyer, Uday Mantripragada, Rishabh Narang, Apoorv Pandey, Zijun Qin, Abrar Sheikh, Hongtao Sun, Jiaqi Sun, Matthew Walker, Kaichen Wei, Chen Xu, Jingnan Yang, Allen T. Zhang, Guoqing Zhang

    Abstract: Budget allocation of marketplace levers, such as incentives for drivers and promotions for riders, has long been a technical and business challenge at Uber; understanding lever budget changes' impact and estimating cost efficiency to achieve predefined budgets is crucial, with the goal of optimal allocations that maximize business value; we introduce an end-to-end machine learning and optimization…

    Submitted 26 July, 2024; originally announced July 2024.

    Comments: To be published in the 2nd Workshop on Causal Inference and Machine Learning in Practice, KDD 2024, August 25 to 29, 2024, Barcelona, Spain, 10 pages

    MSC Class: 62J99

  33. arXiv:2407.17466  [pdf, other]

    cs.LG math.OC stat.ML

    Traversing Pareto Optimal Policies: Provably Efficient Multi-Objective Reinforcement Learning

    Authors: Shuang Qiu, Dake Zhang, Rui Yang, Boxiang Lyu, Tong Zhang

    Abstract: This paper investigates multi-objective reinforcement learning (MORL), which focuses on learning Pareto optimal policies in the presence of multiple reward functions. Despite MORL's significant empirical success, there is still a lack of satisfactory understanding of various MORL optimization targets and efficient learning algorithms. Our work offers a systematic analysis of several optimization t…

    Submitted 24 July, 2024; originally announced July 2024.

    Comments: Initially submitted in May 2024

  34. arXiv:2407.07631  [pdf, other]

    cs.LG math.OC math.ST stat.ML

    Pessimism Meets Risk: Risk-Sensitive Offline Reinforcement Learning

    Authors: Dake Zhang, Boxiang Lyu, Shuang Qiu, Mladen Kolar, Tong Zhang

    Abstract: We study risk-sensitive reinforcement learning (RL), a crucial field due to its ability to enhance decision-making in scenarios where it is essential to manage uncertainty and minimize potential adverse outcomes. Particularly, our work focuses on applying the entropic risk measure to RL problems. While existing literature primarily investigates the online setting, there remains a large gap in unde…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: ICML 2024

  35. arXiv:2407.03558  [pdf, ps, other]

    stat.ME

    Aggregated Sure Independence Screening for Variable Selection with Interaction Structures

    Authors: Tonglin Zhang

    Abstract: A new method called the aggregated sure independence screening is proposed for the computational challenges in variable selection of interactions when the number of explanatory variables is much higher than the number of observations (i.e., $p\gg n$). In this problem, the two main challenges are the strong hierarchical restriction and the number of candidates for the main effects and interactions.…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Preprint

    MSC Class: 62J07; 62J05

  36. arXiv:2406.01380  [pdf, other]

    cs.CV stat.AP

    Convolutional Unscented Kalman Filter for Multi-Object Tracking with Outliers

    Authors: Shiqi Liu, Wenhan Cao, Chang Liu, Tianyi Zhang, Shengbo Eben Li

    Abstract: Multi-object tracking (MOT) is an essential technique for navigation in autonomous driving. In tracking-by-detection systems, biases, false positives, and misses, which are referred to as outliers, are inevitable due to complex traffic scenarios. Recent tracking methods are based on filtering algorithms that overlook these outliers, leading to reduced tracking accuracy or even loss of the objects…

    Submitted 15 September, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: IEEE Transactions on Intelligent Vehicles

  37. arXiv:2405.16780  [pdf, other]

    stat.ME

    Analysis of Broken Randomized Experiments by Principal Stratification

    Authors: Qinqing Liu, Xiang Peng, Tao Zhang, Yuhao Deng

    Abstract: Although randomized controlled trials have long been regarded as the "gold standard" for evaluating treatment effects, there is no natural prevention from post-treatment events. For example, non-compliance makes the actual treatment different from the assigned treatment, truncation-by-death renders the outcome undefined or ill-defined, and missingness prevents the outcomes from being measured. I…

    Submitted 23 April, 2025; v1 submitted 26 May, 2024; originally announced May 2024.

  38. arXiv:2405.16734  [pdf, other]

    stat.ML cs.LG

    Faster Sampling via Stochastic Gradient Proximal Sampler

    Authors: Xunpeng Huang, Difan Zou, Yi-An Ma, Hanze Dong, Tong Zhang

    Abstract: Stochastic gradients have been widely integrated into Langevin-based methods to improve their scalability and efficiency in solving large-scale sampling problems. However, the proximal sampler, which exhibits much faster convergence than Langevin-based algorithms in the deterministic setting (Lee et al., 2021), has yet to be explored in its stochastic variants. In this paper, we study the Stochasti…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: 48 pages, 2 figures, 5 tables

  39. arXiv:2405.16387  [pdf, other]

    stat.ML cs.LG

    Reverse Transition Kernel: A Flexible Framework to Accelerate Diffusion Inference

    Authors: Xunpeng Huang, Difan Zou, Hanze Dong, Yi Zhang, Yi-An Ma, Tong Zhang

    Abstract: To generate data from trained diffusion models, most inference algorithms, such as DDPM, DDIM, and other variants, rely on discretizing the reverse SDEs or their equivalent ODEs. In this paper, we view such approaches as decomposing the entire denoising diffusion process into several segments, each corresponding to a reverse transition kernel (RTK) sampling subproblem. Specifically, DDPM uses a Ga…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: 68 pages, 2 figures

  40. arXiv:2405.07863  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    RLHF Workflow: From Reward Modeling to Online RLHF

    Authors: Hanze Dong, Wei Xiong, Bo Pang, Haoxiang Wang, Han Zhao, Yingbo Zhou, Nan Jiang, Doyen Sahoo, Caiming Xiong, Tong Zhang

    Abstract: We present the workflow of Online Iterative Reinforcement Learning from Human Feedback (RLHF) in this technical report, which is widely reported to outperform its offline counterpart by a large margin in the recent large language model (LLM) literature. However, existing open-source RLHF projects are still largely confined to the offline learning setting. In this technical report, we aim to fill i…

    Submitted 12 November, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

    Comments: Published in Transactions on Machine Learning Research (09/2024)

  41. arXiv:2405.01010  [pdf, other]

    cs.LG stat.ML

    Efficient and Adaptive Posterior Sampling Algorithms for Bandits

    Authors: Bingshan Hu, Zhiming Huang, Tianyue H. Zhang, Mathias Lécuyer, Nidhi Hegde

    Abstract: We study Thompson Sampling-based algorithms for stochastic bandits with bounded rewards. As the existing problem-dependent regret bound for Thompson Sampling with Gaussian priors [Agrawal and Goyal, 2017] is vacuous when $T \le 288 e^{64}$, we derive a more practical bound that tightens the coefficient of the leading term from $288 e^{64}$ to $1270$. Additionally, motivated by large-scale real-wo…

    Submitted 2 May, 2024; originally announced May 2024.

  42. arXiv:2404.03578  [pdf, ps, other]

    cs.LG stat.ML

    Distributionally Robust Reinforcement Learning with Interactive Data Collection: Fundamental Hardness and Near-Optimal Algorithm

    Authors: Miao Lu, Han Zhong, Tong Zhang, Jose Blanchet

    Abstract: The sim-to-real gap, which represents the disparity between training and testing environments, poses a significant challenge in reinforcement learning (RL). A promising approach to addressing this challenge is distributionally robust RL, often framed as a robust Markov decision process (RMDP). In this framework, the objective is to find a robust policy that achieves good performance under the wors…

    Submitted 4 November, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

  43. arXiv:2403.18658  [pdf, ps, other]

    math.ST stat.ML

    Theoretical Guarantees for the Subspace-Constrained Tyler's Estimator

    Authors: Gilad Lerman, Feng Yu, Teng Zhang

    Abstract: This work analyzes the subspace-constrained Tyler's estimator (STE) designed for recovering a low-dimensional subspace within a dataset that may be highly corrupted with outliers. It assumes a weak inlier-outlier model and allows the fraction of inliers to be smaller than a fraction that leads to computational hardness of the robust subspace recovery problem. It shows that in this setting, if the…

    Submitted 12 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

  44. arXiv:2403.17592  [pdf, other]

    cs.LG stat.ML

    On the Benefits of Over-parameterization for Out-of-Distribution Generalization

    Authors: Yifan Hao, Yong Lin, Difan Zou, Tong Zhang

    Abstract: In recent years, machine learning models have achieved success based on the independently and identically distributed assumption. However, this assumption can be easily violated in real-world applications, leading to the Out-of-Distribution (OOD) problem. Understanding how modern over-parameterized DNNs behave under non-trivial natural distributional shifts is essential, as current theoretical und…

    Submitted 26 March, 2024; originally announced March 2024.

  45. arXiv:2403.11497  [pdf, other]

    cs.CV cs.LG stat.ML

    A Sober Look at the Robustness of CLIPs to Spurious Features

    Authors: Qizhou Wang, Yong Lin, Yongqiang Chen, Ludwig Schmidt, Bo Han, Tong Zhang

    Abstract: Large vision language models, such as CLIP, demonstrate more impressive robustness to spurious features than single-modal models trained on ImageNet. However, existing test datasets are typically curated based on ImageNet-trained models, which aim to capture the spurious features inherited in ImageNet. Benchmarking CLIP models based on the ImageNet-oriented spurious features may not be sufficient to re…

    Submitted 2 November, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: NeurIPS 2024; Qizhou Wang, Yong Lin, and Yongqiang Chen contributed equally; Project page: https://counteranimal.github.io

  46. arXiv:2403.06183  [pdf, other]

    cs.LG math.OC math.ST stat.ML

    An Improved Analysis of Langevin Algorithms with Prior Diffusion for Non-Log-Concave Sampling

    Authors: Xunpeng Huang, Hanze Dong, Difan Zou, Tong Zhang

    Abstract: Understanding the dimension dependency of computational complexity in high-dimensional sampling problems is a fundamental question, both from a practical and theoretical perspective. Compared with samplers with an unbiased stationary distribution, e.g., Metropolis-adjusted Langevin algorithm (MALA), biased samplers, e.g., Underdamped Langevin Dynamics (ULD), perform better in low-accuracy cases just be…

    Submitted 10 March, 2024; originally announced March 2024.

    Comments: 32 pages

  47. arXiv:2403.05679  [pdf, ps, other]

    stat.ME math.ST stat.AP

    Adaptive Projected Two-Sample Comparisons for Single-Cell Gene Expression Data

    Authors: Tianyu Zhang, Jing Lei, Kathryn Roeder

    Abstract: We study high-dimensional two-sample mean comparison and address the curse of dimensionality through data-adaptive projections. Leveraging the low-dimensional and localized signal structures commonly seen in single-cell genomics data, our first proposed method identifies a sparse, informative low-dimensional subspace and then performs statistical inference restricted to this subspace. To address t…

    Submitted 10 June, 2025; v1 submitted 8 March, 2024; originally announced March 2024.

  48. arXiv:2402.18571  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards

    Authors: Haoxiang Wang, Yong Lin, Wei Xiong, Rui Yang, Shizhe Diao, Shuang Qiu, Han Zhao, Tong Zhang

    Abstract: Fine-grained control over large language models (LLMs) remains a significant challenge, hindering their adaptability to diverse user needs. While Reinforcement Learning from Human Feedback (RLHF) shows promise in aligning LLMs, its reliance on scalar rewards often limits its ability to capture diverse user preferences in real-world applications. To address this limitation, we introduce the Directi…

    Submitted 6 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: The code and model are released at https://github.com/Haoxiang-Wang/directional-preference-alignment

  49. arXiv:2402.18149  [pdf, ps, other]

    cs.LG stat.ML

    Provably Efficient Partially Observable Risk-Sensitive Reinforcement Learning with Hindsight Observation

    Authors: Tonghe Zhang, Yu Chen, Longbo Huang

    Abstract: This work pioneers regret analysis of risk-sensitive reinforcement learning in partially observable environments with hindsight observation, addressing a gap in theoretical exploration. We introduce a novel formulation that integrates hindsight observations into a Partially Observable Markov Decision Process (POMDP) framework, where the goal is to optimize accumulated reward under the entropic ris…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 38 pages

  50. arXiv:2402.08991  [pdf, ps, other]

    stat.ML cs.LG

    Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption

    Authors: Chenlu Ye, Jiafan He, Quanquan Gu, Tong Zhang

    Abstract: This study tackles the challenges of adversarial corruption in model-based reinforcement learning (RL), where the transition dynamics can be corrupted by an adversary. Existing studies on corruption-robust RL mostly focus on the setting of model-free RL, where robust least-square regression is often employed for value function estimation. However, these techniques cannot be directly applied to mod…

    Submitted 20 July, 2024; v1 submitted 14 February, 2024; originally announced February 2024.