Showing 1–50 of 149 results for author: Zhou, D

Searching in archive stat.
  1. arXiv:2506.14899

    stat.ML cs.LG

    Optimal Convergence Rates of Deep Neural Network Classifiers

    Authors: Zihan Zhang, Lei Shi, Ding-Xuan Zhou

    Abstract: In this paper, we study the binary classification problem on $[0,1]^d$ under the Tsybakov noise condition (with exponent $s \in [0,\infty]$) and the compositional assumption. This assumption requires the conditional class probability function of the data distribution to be the composition of $q+1$ vector-valued multivariate functions, where each component function is either a maximum value functio…

    Submitted 17 June, 2025; originally announced June 2025.

  2. arXiv:2506.12751

    stat.ML cs.LG

    Single Index Bandits: Generalized Linear Contextual Bandits with Unknown Reward Functions

    Authors: Yue Kang, Mingshuo Liu, Bongsoo Yi, Jing Lyu, Zhi Zhang, Doudou Zhou, Yao Li

    Abstract: Generalized linear bandits have been extensively studied due to their broad applicability in real-world online decision-making problems. However, these methods typically assume that the expected reward function is known to the users, an assumption that is often unrealistic in practice. Misspecification of this link function can lead to the failure of all existing algorithms. In this work, we addre…

    Submitted 15 June, 2025; originally announced June 2025.

  3. arXiv:2506.12412

    cs.LG stat.ML

    Cross-Domain Conditional Diffusion Models for Time Series Imputation

    Authors: Kexin Zhang, Baoyu Jing, K. Selçuk Candan, Dawei Zhou, Qingsong Wen, Han Liu, Kaize Ding

    Abstract: Cross-domain time series imputation is an underexplored data-centric research task that presents significant challenges, particularly when the target domain suffers from high missing rates and domain shifts in temporal dynamics. Existing time series imputation approaches primarily focus on the single-domain setting, which cannot effectively adapt to a new domain with domain shifts. Meanwhile, conv…

    Submitted 14 June, 2025; originally announced June 2025.

    Comments: Accepted by ECML-PKDD 2025

  4. arXiv:2506.00818

    stat.ML cs.LG

    Generalized Linear Markov Decision Process

    Authors: Sinian Zhang, Kaicheng Zhang, Ziping Xu, Tianxi Cai, Doudou Zhou

    Abstract: The linear Markov Decision Process (MDP) framework offers a principled foundation for reinforcement learning (RL) with strong theoretical guarantees and sample efficiency. However, its restrictive assumption (that both transition dynamics and reward functions are linear in the same feature space) limits its applicability in real-world domains, where rewards often exhibit nonlinear or discrete struct…

    Submitted 31 May, 2025; originally announced June 2025.

    Comments: 34 pages, 9 figures

  5. arXiv:2505.19209

    cs.CL cs.AI cs.CE stat.ML

    MOOSE-Chem2: Exploring LLM Limits in Fine-Grained Scientific Hypothesis Discovery via Hierarchical Search

    Authors: Zonglin Yang, Wanhao Liu, Ben Gao, Yujie Liu, Wei Li, Tong Xie, Lidong Bing, Wanli Ouyang, Erik Cambria, Dongzhan Zhou

    Abstract: Large language models (LLMs) have shown promise in automating scientific hypothesis generation, yet existing approaches primarily yield coarse-grained hypotheses lacking critical methodological and experimental details. We introduce and formally define the novel task of fine-grained scientific hypothesis discovery, which entails generating detailed, experimentally actionable hypotheses from coarse…

    Submitted 25 May, 2025; originally announced May 2025.

  6. arXiv:2505.12023

    stat.ME

    Model-X Change-Point Detection of Conditional Distribution

    Authors: Yiwen Huang, Yan Dong, Mengying Yan, Ziye Tian, Chuan Hong, Doudou Zhou, Molei Liu

    Abstract: The dynamic nature of many real-world systems can lead to temporal outcome model shifts, causing a deterioration in model accuracy and reliability over time. This requires change-point detection on the outcome models to guide model retraining and adjustments. However, inferring the change point of conditional models is more prone to loss of validity or power than classic detection problems for mar…

    Submitted 20 May, 2025; v1 submitted 17 May, 2025; originally announced May 2025.

    Comments: 13 pages, 5 figures

  7. arXiv:2505.08198

    stat.ML cs.LG

    SIM-Shapley: A Stable and Computationally Efficient Approach to Shapley Value Approximation

    Authors: Wangxuan Fan, Siqi Li, Doudou Zhou, Yohei Okada, Chuan Hong, Molei Liu, Nan Liu

    Abstract: Explainable artificial intelligence (XAI) is essential for trustworthy machine learning (ML), particularly in high-stakes domains such as healthcare and finance. Shapley value (SV) methods provide a principled framework for feature attribution in complex models but incur high computational costs, limiting their scalability in high-dimensional settings. We propose Stochastic Iterative Momentum for…

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: 21 pages, 6 figures, 5 tables

  8. arXiv:2503.07976

    stat.ML cs.LG

    Two-Dimensional Deep ReLU CNN Approximation for Korobov Functions: A Constructive Approach

    Authors: Qin Fang, Lei Shi, Min Xu, Ding-Xuan Zhou

    Abstract: This paper investigates approximation capabilities of two-dimensional (2D) deep convolutional neural networks (CNNs), with Korobov functions serving as a benchmark. We focus on 2D CNNs, comprising multi-channel convolutional layers with zero-padding and ReLU activations, followed by a fully connected layer. We propose a fully constructive approach for building 2D CNNs to approximate Korobov functi…

    Submitted 10 March, 2025; originally announced March 2025.

  9. arXiv:2412.16684

    stat.ME math.ST

    MATES: Multi-view Aggregated Two-Sample Test

    Authors: Zexi Cai, Wenbo Fei, Doudou Zhou

    Abstract: The two-sample test is a fundamental problem in statistics with a wide range of applications. In the realm of high-dimensional data, nonparametric methods have gained prominence due to their flexibility and minimal distributional assumptions. However, many existing methods tend to be more effective when the two distributions differ primarily in their first and/or second moments. In many real-world…

    Submitted 21 December, 2024; originally announced December 2024.

  10. arXiv:2410.23450

    cs.LG cs.AI cs.RO stat.ML

    Return Augmented Decision Transformer for Off-Dynamics Reinforcement Learning

    Authors: Ruhan Wang, Yu Yang, Zhishuai Liu, Dongruo Zhou, Pan Xu

    Abstract: We study offline off-dynamics reinforcement learning (RL) to utilize data from an easily accessible source domain to enhance policy learning in a target domain with limited data. Our approach centers on return-conditioned supervised learning (RCSL), particularly focusing on the decision transformer (DT), which can predict actions conditioned on desired return guidance and complete trajectory histo…

    Submitted 30 October, 2024; originally announced October 2024.

    Comments: 26 pages, 10 tables, 10 figures

  11. arXiv:2410.14092

    cs.LG math.OC stat.ML

    Efficient Sparse PCA via Block-Diagonalization

    Authors: Alberto Del Pia, Dekun Zhou, Yinglun Zhu

    Abstract: Sparse Principal Component Analysis (Sparse PCA) is a pivotal tool in data analysis and dimensionality reduction. However, Sparse PCA is a challenging problem in both theory and practice: it is known to be NP-hard and current exact methods generally require exponential runtime. In this paper, we propose a novel framework to efficiently approximate Sparse PCA by (i) approximating the general input…

    Submitted 4 March, 2025; v1 submitted 17 October, 2024; originally announced October 2024.

    Comments: 29 pages, 1 figure

  12. arXiv:2410.06484

    stat.ME

    Model-assisted and Knowledge-guided Transfer Regression for the Underrepresented Population

    Authors: Doudou Zhou, Mengyan Li, Tianxi Cai, Molei Liu

    Abstract: Covariate shift and outcome model heterogeneity are two prominent challenges in leveraging external sources to improve risk modeling for underrepresented cohorts in paucity of accurate labels. We consider the transfer learning problem targeting some unlabeled minority sample encountering (i) covariate shift to the labeled source sample collected on a different cohort; and (ii) outcome model hetero…

    Submitted 8 October, 2024; originally announced October 2024.

  13. arXiv:2410.01047

    cs.LG math.FA stat.ML

    Spherical Analysis of Learning Nonlinear Functionals

    Authors: Zhenyu Yang, Shuo Huang, Han Feng, Ding-Xuan Zhou

    Abstract: In recent years, there has been growing interest in the field of functional neural networks. They have been proposed and studied with the aim of approximating continuous functionals defined on sets of functions on Euclidean domains. In this paper, we consider functionals defined on sets of functions on spheres. The approximation ability of deep ReLU neural networks is investigated by novel spheric…

    Submitted 1 October, 2024; originally announced October 2024.

  14. arXiv:2409.07745

    stat.ME math.ST

    Generalized Independence Test for Modern Data

    Authors: Mingshuo Liu, Doudou Zhou, Hao Chen

    Abstract: The test of independence is a crucial component of modern data analysis. However, traditional methods often struggle with the complex dependency structures found in high-dimensional data. To overcome this challenge, we introduce a novel test statistic that captures intricate relationships using similarity and dissimilarity information derived from the data. The statistic exhibits strong power acro…

    Submitted 12 September, 2024; originally announced September 2024.

  15. arXiv:2405.06415

    stat.ML cs.LG

    Generalization analysis with deep ReLU networks for metric and similarity learning

    Authors: Junyu Zhou, Puyu Wang, Ding-Xuan Zhou

    Abstract: While considerable theoretical progress has been devoted to the study of metric and similarity learning, the generalization mystery is still missing. In this paper, we study the generalization performance of metric and similarity learning by leveraging the specific structure of the true metric (the target function). Specifically, by deriving the explicit form of the true metric for metric and simi…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: 15 pages, 1 figure

  16. arXiv:2403.16459

    cs.LG math.ST stat.ML

    On the rates of convergence for learning with convolutional neural networks

    Authors: Yunfei Yang, Han Feng, Ding-Xuan Zhou

    Abstract: We study approximation and learning capacities of convolutional neural networks (CNNs) with one-side zero-padding and multiple channels. Our first result proves a new approximation bound for CNNs with certain constraint on the weights. Our second result gives new analysis on the covering number of feed-forward neural networks with CNNs as special cases. The analysis carefully takes into account th…

    Submitted 8 April, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

  17. arXiv:2403.14926

    stat.ML cs.LG

    Contrastive Learning on Multimodal Analysis of Electronic Health Records

    Authors: Tianxi Cai, Feiqing Huang, Ryumei Nakada, Linjun Zhang, Doudou Zhou

    Abstract: Electronic health record (EHR) systems contain a wealth of multimodal clinical data including structured data like clinical codes and unstructured data such as clinical notes. However, many existing EHR-focused studies have traditionally either concentrated on an individual modality or merged different modalities in a rather rudimentary fashion. This approach often results in the perception of stru…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: 34 pages

  18. arXiv:2403.12284

    math.ST q-bio.QM stat.AP stat.ME

    The Wreaths of KHAN: Uniform Graph Feature Selection with False Discovery Rate Control

    Authors: Jiajun Liang, Yue Liu, Doudou Zhou, Sinian Zhang, Junwei Lu

    Abstract: Graphical models find numerous applications in biology, chemistry, sociology, neuroscience, etc. While substantial progress has been made in graph estimation, it remains largely unexplored how to select significant graph signals with uncertainty assessment, especially those graph features related to topological structures including cycles (i.e., wreaths), cliques, hubs, etc. These features play a…

    Submitted 18 March, 2024; originally announced March 2024.

  19. Causality-Aware Spatiotemporal Graph Neural Networks for Spatiotemporal Time Series Imputation

    Authors: Baoyu Jing, Dawei Zhou, Kan Ren, Carl Yang

    Abstract: Spatiotemporal time series are usually collected via monitoring sensors placed at different locations, which usually contain missing values due to various failures, such as mechanical damages and Internet outages. Imputing the missing values is crucial for analyzing time series. When recovering a specific data point, most existing methods consider all the information relevant to that point regardl…

    Submitted 23 October, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: Accepted by CIKM'2024. Fixed typos

  20. arXiv:2402.12875

    cs.LG cs.CC stat.ML

    Chain of Thought Empowers Transformers to Solve Inherently Serial Problems

    Authors: Zhiyuan Li, Hong Liu, Denny Zhou, Tengyu Ma

    Abstract: Instructing the model to generate a sequence of intermediate steps, a.k.a. a chain of thought (CoT), is a highly effective method to improve the accuracy of large language models (LLMs) on arithmetic and symbolic reasoning tasks. However, the mechanism behind CoT remains unclear. This work provides a theoretical understanding of the power of CoT for decoder-only transformers through the lens of…

    Submitted 21 September, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: 38 pages, 10 figures. Accepted by ICLR 2024

  21. arXiv:2402.08998

    cs.LG stat.ML

    Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path

    Authors: Qiwei Di, Jiafan He, Dongruo Zhou, Quanquan Gu

    Abstract: We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel, where an agent repeatedly interacts with a stochastic environment and seeks to reach a certain goal state while minimizing the cumulative cost. Existing works often assume a strictly positive lower bound of the cost function or an upper bound of the expected length for the optimal policy. In this paper, we p…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: 28 pages, 1 figure, In ICML 2023

  22. arXiv:2401.02890

    stat.ML cs.LG

    Nonlinear functional regression by functional deep neural network with kernel embedding

    Authors: Zhongjie Shi, Jun Fan, Linhao Song, Ding-Xuan Zhou, Johan A. K. Suykens

    Abstract: Recently, deep learning has been widely applied in functional data analysis (FDA) with notable empirical success. However, the infinite dimensionality of functional data necessitates an effective dimension reduction approach for functional learning tasks, particularly in nonlinear functional regression. In this paper, we introduce a functional deep neural network with an adaptive and discretizatio…

    Submitted 12 May, 2025; v1 submitted 5 January, 2024; originally announced January 2024.

  23. arXiv:2312.15611

    stat.ME stat.ML

    Inference of Dependency Knowledge Graph for Electronic Health Records

    Authors: Zhiwei Xu, Ziming Gan, Doudou Zhou, Shuting Shen, Junwei Lu, Tianxi Cai

    Abstract: The effective analysis of high-dimensional Electronic Health Record (EHR) data, with substantial potential for healthcare research, presents notable methodological challenges. Employing predictive modeling guided by a knowledge graph (KG), which enables efficient feature selection, can enhance both statistical efficiency and interpretability. While various methods have emerged for constructing KGs…

    Submitted 24 December, 2023; originally announced December 2023.

  24. arXiv:2311.14222

    cs.LG math.OC stat.ML

    Risk Bounds of Accelerated SGD for Overparameterized Linear Regression

    Authors: Xuheng Li, Yihe Deng, Jingfeng Wu, Dongruo Zhou, Quanquan Gu

    Abstract: Accelerated stochastic gradient descent (ASGD) is a workhorse in deep learning and often achieves better generalization performance than SGD. However, existing optimization theory can only explain the faster convergence of ASGD, but cannot explain its better generalization. In this paper, we study the generalization of ASGD for overparameterized linear regression, which is possibly the simplest se…

    Submitted 23 November, 2023; originally announced November 2023.

    Comments: 85 pages, 5 figures

  25. arXiv:2311.11563

    stat.ME stat.AP

    Time-varying effect in the competing risks based on restricted mean time lost

    Authors: Zhiyin Yu, Zhaojin Li, Chengfeng Zhang, Yawen Hou, Derun Zhou, Zheng Chen

    Abstract: Patients with breast cancer tend to die from other diseases, so for studies that focus on breast cancer, a competing risks model is more appropriate. Considering that the subdistribution hazard ratio, which is often used, is limited by model assumptions and clinical interpretation, we aimed to quantify the effects of prognostic factors by an absolute indicator, the difference in restricted mean time lost (RMT…

    Submitted 20 November, 2023; originally announced November 2023.

  26. arXiv:2309.04236

    cs.LG stat.ML

    Adaptive Distributed Kernel Ridge Regression: A Feasible Distributed Learning Scheme for Data Silos

    Authors: Di Wang, Xiaotong Liu, Shao-Bo Lin, Ding-Xuan Zhou

    Abstract: Data silos, mainly caused by privacy and interoperability, significantly constrain collaborations among different organizations with similar data for the same purpose. Distributed learning based on divide-and-conquer provides a promising way to settle the data silos, but it suffers from several challenges, including autonomy, privacy guarantees, and the necessity of collaborations. This paper focu…

    Submitted 8 September, 2023; originally announced September 2023.

    Comments: 46 pages, 13 figures

  27. arXiv:2308.09605

    math.NA cs.LG math.ST stat.ML

    Solving PDEs on Spheres with Physics-Informed Convolutional Neural Networks

    Authors: Guanhang Lei, Zhen Lei, Lei Shi, Chenyu Zeng, Ding-Xuan Zhou

    Abstract: Physics-informed neural networks (PINNs) have been demonstrated to be efficient in solving partial differential equations (PDEs) from a variety of experimental perspectives. Some recent studies have also proposed PINN algorithms for PDEs on surfaces, including spheres. However, theoretical understanding of the numerical performance of PINNs, especially PINNs on surfaces or manifolds, is still lack…

    Submitted 5 August, 2024; v1 submitted 18 August, 2023; originally announced August 2023.

  28. arXiv:2308.08562

    stat.AP q-bio.CB

    Bayesian Inference of Phenotypic Plasticity of Cancer Cells Based on Dynamic Model for Temporal Cell Proportion Data

    Authors: Shuli Chen, Yuman Wang, Da Zhou, Jie Hu

    Abstract: Mounting evidence underscores the prevalent hierarchical organization of cancer tissues. At the foundation of this hierarchy reside cancer stem cells, a subset of cells endowed with the pivotal role of engendering the entire cancer tissue through cell differentiation. In recent times, substantial attention has been directed towards the phenomenon of cancer cell plasticity, where the dynamic interc…

    Submitted 14 August, 2023; originally announced August 2023.

  29. arXiv:2307.16792

    stat.ML cs.LG

    Classification with Deep Neural Networks and Logistic Loss

    Authors: Zihan Zhang, Lei Shi, Ding-Xuan Zhou

    Abstract: Deep neural networks (DNNs) trained with the logistic loss (i.e., the cross entropy loss) have made impressive advancements in various binary classification tasks. However, generalization analysis for binary classification with DNNs and logistic loss remains scarce. The unboundedness of the target function for the logistic loss is the main obstacle to deriving satisfactory generalization bounds. I…

    Submitted 21 April, 2024; v1 submitted 31 July, 2023; originally announced July 2023.

  30. arXiv:2307.12461

    cs.LG stat.ML

    Rates of Approximation by ReLU Shallow Neural Networks

    Authors: Tong Mao, Ding-Xuan Zhou

    Abstract: Neural networks activated by the rectified linear unit (ReLU) play a central role in the recent development of deep learning. The topic of approximating functions from Hölder spaces by these networks is crucial for understanding the efficiency of the induced learning algorithms. Although the topic has been well investigated in the setting of deep neural networks with many layers of hidden neurons,…

    Submitted 23 July, 2023; originally announced July 2023.

  31. arXiv:2307.03487

    stat.ML cs.LG

    Learning Theory of Distribution Regression with Neural Networks

    Authors: Zhongjie Shi, Zhan Yu, Ding-Xuan Zhou

    Abstract: In this paper, we aim at establishing an approximation theory and a learning theory of distribution regression via a fully connected neural network (FNN). In contrast to the classical regression methods, the input variables of distribution regression are probability measures. Then we often need to perform a second-stage sampling process to approximate the actual information of the distribution. On…

    Submitted 7 July, 2023; originally announced July 2023.

  32. arXiv:2306.08321

    stat.ML cs.LG math.ST

    Nonparametric regression using over-parameterized shallow ReLU neural networks

    Authors: Yunfei Yang, Ding-Xuan Zhou

    Abstract: It is shown that over-parameterized neural networks can achieve minimax optimal rates of convergence (up to logarithmic factors) for learning functions from certain smooth function classes, if the weights are suitably constrained or regularized. Specifically, we consider the nonparametric regression of estimating an unknown $d$-variate function by using shallow ReLU neural networks. It is assumed…

    Submitted 15 May, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

    Journal ref: Journal of Machine Learning Research, 25(165):1-35, 2024

  33. arXiv:2305.19640

    stat.ML cs.LG

    Fine-grained analysis of non-parametric estimation for pairwise learning

    Authors: Junyu Zhou, Shuo Huang, Han Feng, Puyu Wang, Ding-Xuan Zhou

    Abstract: In this paper, we are concerned with the generalization performance of non-parametric estimation for pairwise learning. Most of the existing work requires the hypothesis space to be convex or a VC-class, and the loss to be convex. However, these restrictive assumptions limit the applicability of the results in studying many popular methods, especially kernel methods and neural networks. We signifi…

    Submitted 21 June, 2024; v1 submitted 31 May, 2023; originally announced May 2023.

    Comments: 30 pages, 1 figure

  34. arXiv:2305.17126

    cs.LG cs.AI cs.CL stat.ML

    Large Language Models as Tool Makers

    Authors: Tianle Cai, Xuezhi Wang, Tengyu Ma, Xinyun Chen, Denny Zhou

    Abstract: Recent research has highlighted the potential of large language models (LLMs) to improve their problem-solving capabilities with the aid of suitable external tools. In our work, we further advance this concept by introducing a closed-loop framework, referred to as LLMs As Tool Makers (LATM), where LLMs create their own reusable tools for problem-solving. Our approach consists of two phases: 1) to…

    Submitted 10 March, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: Code available at https://github.com/ctlllll/LLM-ToolMaker

  35. arXiv:2305.16891

    cs.LG stat.ML

    Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks

    Authors: Puyu Wang, Yunwen Lei, Di Wang, Yiming Ying, Ding-Xuan Zhou

    Abstract: Recently, significant progress has been made in understanding the generalization of neural networks (NNs) trained by gradient descent (GD) using the algorithmic stability approach. However, most of the existing research has focused on one-hidden-layer NNs and has not addressed the impact of different network scaling parameters. In this paper, we greatly extend the previous work \cite{lei2022stabil…

    Submitted 29 September, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: 38 pages, 2 figures

  36. arXiv:2305.11965

    cs.LG cs.AI math.OC stat.ML

    Not All Semantics are Created Equal: Contrastive Self-supervised Learning with Automatic Temperature Individualization

    Authors: Zi-Hao Qiu, Quanqi Hu, Zhuoning Yuan, Denny Zhou, Lijun Zhang, Tianbao Yang

    Abstract: In this paper, we aim to optimize a contrastive loss with individualized temperatures in a principled and systematic manner for self-supervised learning. The common practice of using a global temperature parameter $\tau$ ignores the fact that "not all semantics are created equal", meaning that different anchor data may have different numbers of samples with similar semantics, especially when data ex…

    Submitted 19 May, 2023; originally announced May 2023.

    Comments: 33 pages, 11 figures, accepted by ICML2023

  37. arXiv:2305.07408

    stat.ML cs.LG

    Distributed Gradient Descent for Functional Learning

    Authors: Zhan Yu, Jun Fan, Zhongjie Shi, Ding-Xuan Zhou

    Abstract: In recent years, different types of distributed and parallel learning schemes have received increasing attention for their strong advantages in handling large-scale data information. In the information era, to face the big data challenges that stem from functional data analysis very recently, we propose a novel distributed gradient descent functional learning (DGDFL) algorithm to tackle function…

    Submitted 21 July, 2024; v1 submitted 12 May, 2023; originally announced May 2023.

    Comments: 48 pages

  38. arXiv:2304.04443

    stat.ML cs.LG

    Approximation of Nonlinear Functionals Using Deep ReLU Networks

    Authors: Linhao Song, Jun Fan, Di-Rong Chen, Ding-Xuan Zhou

    Abstract: In recent years, functional neural networks have been proposed and studied in order to approximate nonlinear continuous functionals defined on $L^p([-1, 1]^s)$ for integers $s\ge1$ and $1\le p<\infty$. However, their theoretical properties are largely unknown beyond universality of approximation or the existing analysis does not apply to the rectified linear unit (ReLU) activation function. To fil…

    Submitted 10 April, 2023; originally announced April 2023.

  39. Optimal rates of approximation by shallow ReLU$^k$ neural networks and applications to nonparametric regression

    Authors: Yunfei Yang, Ding-Xuan Zhou

    Abstract: We study the approximation capacity of some variation spaces corresponding to shallow ReLU$^k$ neural networks. It is shown that sufficiently smooth functions are contained in these spaces with finite variation norms. For functions with less smoothness, the approximation rates in terms of the variation norm are established. Using these results, we are able to prove the optimal approximation rates…

    Submitted 8 January, 2024; v1 submitted 4 April, 2023; originally announced April 2023.

    Comments: Version 3 improves some approximation bounds by using recent results from arXiv:2307.15285

  40. arXiv:2302.10371

    cs.LG math.OC stat.ML

    Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency

    Authors: Heyang Zhao, Jiafan He, Dongruo Zhou, Tong Zhang, Quanquan Gu

    Abstract: Recently, several studies (Zhou et al., 2021a; Zhang et al., 2021b; Kim et al., 2021; Zhou and Gu, 2022) have provided variance-dependent regret bounds for linear contextual bandits, which interpolate the regret for the worst-case regime and the deterministic reward regime. However, these algorithms are either computationally intractable or unable to handle unknown variance of the noise. In this…

    Submitted 20 February, 2023; originally announced February 2023.

    Comments: 43 pages, 2 tables

  41. arXiv:2212.06132

    cs.LG math.OC stat.ML

    Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes

    Authors: Jiafan He, Heyang Zhao, Dongruo Zhou, Quanquan Gu

    Abstract: We study reinforcement learning (RL) with linear function approximation. For episodic time-inhomogeneous linear Markov decision processes (linear MDPs) whose transition probability can be parameterized as a linear function of a given feature mapping, we propose the first computationally efficient algorithm that achieves the nearly minimax optimal regret $\tilde O(d\sqrt{H^3K})$, where $d$ is the d…

    Submitted 3 November, 2023; v1 submitted 12 December, 2022; originally announced December 2022.

    Comments: 33 pages, 1 table. In ICML 2023

  42. arXiv:2210.10643

    cs.LG cs.AI stat.ML

    Towards Accurate Subgraph Similarity Computation via Neural Graph Pruning

    Authors: Linfeng Liu, Xu Han, Dawei Zhou, Li-Ping Liu

    Abstract: Subgraph similarity search, one of the core problems in graph search, concerns whether a target graph approximately contains a query graph. The problem is recently touched by neural methods. However, current neural methods do not consider pruning the target graph, though pruning is critically important in traditional calculations of subgraph similarities. One obstacle to applying pruning in neural…

    Submitted 19 October, 2022; originally announced October 2022.

    Journal ref: Transactions on Machine Learning Research (TMLR) October 2022

  43. arXiv:2209.13762

    stat.ML cs.LG

    Consensus Knowledge Graph Learning via Multi-view Sparse Low Rank Block Model

    Authors: Tianxi Cai, Dong Xia, Luwan Zhang, Doudou Zhou

    Abstract: Network analysis has been a powerful tool to unveil relationships and interactions among a large number of objects. Yet its effectiveness in accurately identifying important node-node interactions is challenged by the rapidly growing network size, with data being collected at an unprecedented granularity and scale. Common wisdom to overcome such high dimensionality is collapsing nodes into smaller…

    Submitted 4 October, 2024; v1 submitted 27 September, 2022; originally announced September 2022.

  44. arXiv:2209.08005

    stat.ML cs.LG

    Stability and Generalization for Markov Chain Stochastic Gradient Methods

    Authors: Puyu Wang, Yunwen Lei, Yiming Ying, Ding-Xuan Zhou

    Abstract: Recently there is a large amount of work devoted to the study of Markov chain stochastic gradient methods (MC-SGMs) which mainly focus on their convergence analysis for solving minimization problems. In this paper, we provide a comprehensive generalization analysis of MC-SGMs for both minimization and minimax problems through the lens of algorithmic stability in the framework of statistical learni…

    Submitted 16 September, 2022; originally announced September 2022.

  45. arXiv:2209.04188  [pdf, ps, other]

    stat.ML cs.CR cs.LG

    Differentially Private Stochastic Gradient Descent with Low-Noise

    Authors: Puyu Wang, Yunwen Lei, Yiming Ying, Ding-Xuan Zhou

    Abstract: Modern machine learning algorithms aim to extract fine-grained information from data to provide accurate predictions, which often conflicts with the goal of privacy protection. This paper addresses the practical and theoretical importance of developing privacy-preserving machine learning algorithms that ensure good performance while preserving privacy. In this paper, we focus on the privacy and ut…

    Submitted 14 July, 2023; v1 submitted 9 September, 2022; originally announced September 2022.

  46. arXiv:2208.06972  [pdf, other]

    stat.AP econ.GN

    Is the NFL's franchise tag fair to players?

    Authors: Darwin Zhou

    Abstract: There has been a consistent criticism over the past decade of the NFL franchise tag's monetary limitations due to its biased institutions in favor of the team rather than the player. But the question whether the NFL's franchise tag is fair or unfair to players has never been systematically studied. In this paper, I investigate the effects of NFL players' contract extensions when on a franchise tag…

    Submitted 15 August, 2022; v1 submitted 14 August, 2022; originally announced August 2022.

  47. arXiv:2208.06528  [pdf, ps, other]

    stat.ME stat.ML

    Dynamic Bayesian Learning for Spatiotemporal Mechanistic Models

    Authors: Sudipto Banerjee, Xiang Chen, Ian Frankenburg, Daniel Zhou

    Abstract: We develop an approach for Bayesian learning of spatiotemporal dynamical mechanistic models. Such learning consists of statistical emulation of the mechanistic system that can efficiently interpolate the output of the system from arbitrary inputs. The emulated learner can then be used to train the system from noisy data achieved by melding information from observed data with the emulated mechanist…

    Submitted 9 July, 2025; v1 submitted 12 August, 2022; originally announced August 2022.

  48. arXiv:2208.05363  [pdf, ps, other]

    cs.LG cs.AI cs.GT math.OC stat.ML

    Learning Two-Player Mixture Markov Games: Kernel Function Approximation and Correlated Equilibrium

    Authors: Chris Junchi Li, Dongruo Zhou, Quanquan Gu, Michael I. Jordan

    Abstract: We consider learning Nash equilibria in two-player zero-sum Markov Games with nonlinear function approximation, where the action-value function is approximated by a function in a Reproducing Kernel Hilbert Space (RKHS). The key challenge is how to do exploration in the high-dimensional function space. We propose a novel online learning algorithm to find a Nash equilibrium by minimizing the duality…

    Submitted 10 August, 2022; originally announced August 2022.

    Comments: 42 pages

  49. arXiv:2208.05134  [pdf, other]

    stat.ME

    Doubly Robust Augmented Model Accuracy Transfer Inference with High Dimensional Features

    Authors: Doudou Zhou, Molei Liu, Mengyan Li, Tianxi Cai

    Abstract: Due to label scarcity and covariate shift happening frequently in real-world studies, transfer learning has become an essential technique to train models generalizable to some target populations using existing labeled source data. Most existing transfer learning research has been focused on model estimation, while there is a paucity of literature on transfer inference for model accuracy despite it…

    Submitted 8 November, 2022; v1 submitted 10 August, 2022; originally announced August 2022.

  50. arXiv:2206.05581  [pdf, other]

    stat.ML cs.LG stat.ME

    Federated Offline Reinforcement Learning

    Authors: Doudou Zhou, Yufeng Zhang, Aaron Sonabend-W, Zhaoran Wang, Junwei Lu, Tianxi Cai

    Abstract: Evidence-based or data-driven dynamic treatment regimes are essential for personalized medicine, which can benefit from offline reinforcement learning (RL). Although massive healthcare data are available across medical institutions, they are prohibited from sharing due to privacy constraints. Besides, heterogeneity exists in different sites. As a result, federated offline RL algorithms are necessa…

    Submitted 27 January, 2024; v1 submitted 11 June, 2022; originally announced June 2022.