
Showing 1–50 of 298 results for author: Xue, J

Searching in archive stat.
  1. arXiv:2505.05269  [pdf, other]

    stat.ML cs.LG

    A Two-Sample Test of Text Generation Similarity

    Authors: Jingbin Xu, Chen Qian, Meimei Liu, Feng Guo

    Abstract: The surge in digitized text data requires reliable inferential methods on observed textual patterns. This article proposes a novel two-sample text test for comparing similarity between two groups of documents. The hypothesis is whether the probabilistic mapping generating the textual data is identical across two groups of documents. The proposed test aims to assess text similarity by comparing the…

    Submitted 8 May, 2025; originally announced May 2025.

  2. arXiv:2505.04795  [pdf, other]

    stat.ME math.PR stat.AP

    Assessing Risk Heterogeneity through Heavy-Tailed Frequency and Severity Mixtures

    Authors: Michael R. Powers, Jiaxin Xu

    Abstract: In operational risk management and actuarial finance, the analysis of risk often begins by dividing a random damage-generation process into its separate frequency and severity components. In the present article, we construct canonical families of mixture distributions for each of these components, based on a Negative Binomial kernel for frequency and a Gamma kernel for severity. The mixtures are e…

    Submitted 7 May, 2025; originally announced May 2025.

    MSC Class: 60E05; 60E10

  3. arXiv:2505.01467  [pdf, other]

    stat.CO

    sae4health: An R Shiny Application for Small Area Estimation in Low- and Middle-Income Countries

    Authors: Yunhan Wu, Qianyu Dong, Jieyi Xu, Zehang Richard Li, Jon Wakefield

    Abstract: Accurate subnational estimation of health indicators is critical for public health planning, especially in low- and middle-income countries (LMICs), where data and tools are often limited. The sae4health R shiny app, built on the surveyPrev package, provides a user-friendly tool for prevalence mapping using small area estimation (SAE) methods. Both area- and unit-level models with spatial random e…

    Submitted 1 May, 2025; originally announced May 2025.

  4. arXiv:2504.09481  [pdf, other]

    cs.LG stat.ME

    Rethinking the generalization of drug target affinity prediction algorithms via similarity aware evaluation

    Authors: Chenbin Zhang, Zhiqiang Hu, Chuchu Jiang, Wen Chen, Jie Xu, Shaoting Zhang

    Abstract: Drug-target binding affinity prediction is a fundamental task for drug discovery. It has been extensively explored in the literature and promising results have been reported. However, in this paper, we demonstrate that the results may be misleading and cannot be well generalized to real practice. The core observation is that the canonical randomized split of a test set in conventional evaluation leaves the…

    Submitted 13 April, 2025; originally announced April 2025.

    Comments: ICLR 2025 Oral

  5. arXiv:2503.16321  [pdf]

    stat.ME

    Balancing the effective sample size in prior across different doses in the curve-free Bayesian decision-theoretic design for dose-finding trials

    Authors: Jiapeng Xu, Dehua Bi, Shenghua Kelly Fan, Bee Leng Lee, Ying Lu

    Abstract: The primary goal of dose allocation in phase I trials is to minimize patient exposure to subtherapeutic or excessively toxic doses, while accurately recommending a phase II dose that is as close as possible to the maximum tolerated dose (MTD). Fan et al. (2012) introduced a curve-free Bayesian decision-theoretic design (CFBD), which leverages the assumption of a monotonic dose-toxicity relationshi…

    Submitted 20 March, 2025; originally announced March 2025.

    Comments: 24 pages

  6. arXiv:2503.11637  [pdf, other]

    stat.ME

    Gradient-bridged Posterior: Bayesian Inference for Models with Implicit Functions

    Authors: Cheng Zeng, Yaozhi Yang, Jason Xu, Leo L Duan

    Abstract: Many statistical problems include model parameters that are defined as the solutions to optimization sub-problems. These include classical approaches such as profile likelihood as well as modern applications involving flow networks or Procrustes distances. In such cases, the likelihood of the data involves an implicit function, often complicating inferential procedures and entailing prohibitive co…

    Submitted 14 March, 2025; originally announced March 2025.

    Comments: 31 pages, 13 figures

  7. arXiv:2503.06381  [pdf, other]

    stat.ML cs.LG stat.ME

    Bayesian Optimization for Robust Identification of Ornstein-Uhlenbeck Model

    Authors: Jinwen Xu, Qin Lu, Yaakov Bar-Shalom

    Abstract: This paper deals with the identification of the stochastic Ornstein-Uhlenbeck (OU) process error model, which is characterized by an inverse time constant, and the unknown variances of the process and observation noises. Although the availability of the explicit expression of the log-likelihood function allows one to obtain the maximum likelihood estimator (MLE), this entails evaluating the nontri…

    Submitted 8 March, 2025; originally announced March 2025.

  8. arXiv:2503.06009  [pdf, ps, other]

    cs.LG stat.ML

    Nearly Optimal Differentially Private ReLU Regression

    Authors: Meng Ding, Mingxi Lei, Shaowei Wang, Tianhang Zheng, Di Wang, Jinhui Xu

    Abstract: In this paper, we investigate one of the most fundamental nonconvex learning problems, ReLU regression, in the Differential Privacy (DP) model. Previous studies on private ReLU regression heavily rely on stringent assumptions, such as constant bounded norms for feature vectors and labels. We relax these assumptions to a more standard setting, where data can be i.i.d. sampled from $O(1)$-sub-Gaussi…

    Submitted 10 June, 2025; v1 submitted 7 March, 2025; originally announced March 2025.

    Comments: 47 pages (UAI2025)

  9. arXiv:2503.03536  [pdf, other]

    stat.ML math.PR stat.AP

    A Criterion for Extending Continuous-Mixture Identifiability Results

    Authors: Michael R. Powers, Jiaxin Xu

    Abstract: For continuous mixtures of random variables, we provide a simple criterion -- generating-function accessibility -- to extend previously known kernel-based identifiability (or unidentifiability) results to new kernel distributions. This criterion, based on functional relationships between the relevant kernels' moment-generating functions or Laplace transforms, may be applied to continuous mixtures…

    Submitted 5 March, 2025; originally announced March 2025.

    MSC Class: 62F99; 60E05

  10. arXiv:2502.17814  [pdf, other]

    stat.ML cs.AI cs.CL cs.LG

    An Overview of Large Language Models for Statisticians

    Authors: Wenlong Ji, Weizhe Yuan, Emily Getzen, Kyunghyun Cho, Michael I. Jordan, Song Mei, Jason E Weston, Weijie J. Su, Jing Xu, Linjun Zhang

    Abstract: Large Language Models (LLMs) have emerged as transformative tools in artificial intelligence (AI), exhibiting remarkable capabilities across diverse tasks such as text generation, reasoning, and decision-making. While their success has primarily been driven by advances in computational power and deep learning architectures, emerging problems -- in areas such as uncertainty quantification, decision…

    Submitted 24 February, 2025; originally announced February 2025.

  11. arXiv:2502.10793  [pdf, other]

    stat.ML cs.AI cs.LG

    Dynamic Influence Tracker: Measuring Time-Varying Sample Influence During Training

    Authors: Jie Xu, Zihan Wu

    Abstract: Existing methods for measuring training sample influence on models only provide static, overall measurements, overlooking how sample influence changes during training. We propose Dynamic Influence Tracker (DIT), which captures the time-varying sample influence across arbitrary time windows during training. DIT offers three key insights: 1) Samples show different time-varying influence patterns,…

    Submitted 15 February, 2025; originally announced February 2025.

  12. arXiv:2502.10409  [pdf, other]

    cs.CY cs.AI cs.ET stat.AP

    Data Science Students Perspectives on Learning Analytics: An Application of Human-Led and LLM Content Analysis

    Authors: Raghda Zahran, Jianfei Xu, Huizhi Liang, Matthew Forshaw

    Abstract: Objective: This study is part of a series of initiatives at a UK university designed to cultivate a deep understanding of students' perspectives on analytics that resonate with their unique learning needs. It explores collaborative data processing undertaken by postgraduate students who examined an Open University Learning Analytics Dataset (OULAD). Methods: A qualitative approach was adopted, int…

    Submitted 22 January, 2025; originally announced February 2025.

    Comments: 17 Pages, 2 Tables, 1 Figure

  13. arXiv:2502.06168  [pdf, other]

    stat.ML cs.LG econ.EM math.OC

    Dynamic Pricing with Adversarially-Censored Demands

    Authors: Jianyu Xu, Yining Wang, Xi Chen, Yu-Xiang Wang

    Abstract: We study an online dynamic pricing problem where the potential demand at each time period $t=1,2,\ldots, T$ is stochastic and dependent on the price. However, a perishable inventory is imposed at the beginning of each time $t$, censoring the potential demand if it exceeds the inventory level. To address this problem, we introduce a pricing algorithm based on the optimistic estimates of derivatives…

    Submitted 10 February, 2025; originally announced February 2025.

    Comments: 33 pages, 1 figure

    MSC Class: 91B06; 91B24; 62P20; 62C20; 90B50 ACM Class: I.2.6

  14. arXiv:2502.00126  [pdf, other]

    stat.ME math.ST

    A Bayesian decision-theoretic approach to sparse estimation

    Authors: Aihua Li, Surya T. Tokdar, Jason Xu

    Abstract: We extend the work of Hahn and Carvalho (2015) and develop a doubly-regularized sparse regression estimator by synthesizing Bayesian regularization with penalized least squares within a decision-theoretic framework. In contrast to existing Bayesian decision-theoretic formulations, which rely chiefly on the symmetric 0-1 loss, the new method -- which we call Bayesian Decoupling -- employs a family of…

    Submitted 31 January, 2025; originally announced February 2025.

    Comments: Submitted to Biometrika

  15. arXiv:2501.18049  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Joint Pricing and Resource Allocation: An Optimal Online-Learning Approach

    Authors: Jianyu Xu, Xuan Wang, Yu-Xiang Wang, Jiashuo Jiang

    Abstract: We study an online learning problem on dynamic pricing and resource allocation, where we make joint pricing and inventory decisions to maximize the overall net profit. We consider the stochastic dependence of demands on the price, which complicates the resource allocation process and introduces significant non-convexity and non-smoothness to the problem. To solve this problem, we develop an effici…

    Submitted 21 May, 2025; v1 submitted 29 January, 2025; originally announced January 2025.

    MSC Class: 91B06; 90B22; 91B24; 90B50; 90B80; 62P20 ACM Class: I.2.6

  16. arXiv:2501.12453  [pdf]

    stat.ME stat.AP

    On the two-step hybrid design for augmenting randomized trials using real-world data

    Authors: Jiapeng Xu, Ruben P. A. van Eijk, Alicia Ellis, Tianyu Pan, Lorene M. Nelson, Kit C. B. Roes, Marc van Dijk, Maria Sarno, Leonard H. van den Berg, Lu Tian, Ying Lu

    Abstract: Hybrid clinical trials that borrow real-world data (RWD) are gaining interest, especially for rare diseases. They assume that the RWD and the randomized control arm are exchangeable, but violations can bias results, inflate type I error, or reduce power. A two-step hybrid design first tests exchangeability, reducing inappropriate borrowing but potentially inflating type I error (Yuan et al., 2019). We propose…

    Submitted 21 January, 2025; originally announced January 2025.

    MSC Class: 62 ACM Class: G.3

  17. arXiv:2501.06540  [pdf, other]

    cs.CV math.ST stat.AP stat.ME

    CeViT: Copula-Enhanced Vision Transformer in multi-task learning and bi-group image covariates with an application to myopia screening

    Authors: Chong Zhong, Yang Li, Jinfeng Xu, Xiang Fu, Yunhao Liu, Qiuyi Huang, Danjuan Yang, Meiyan Li, Aiyi Liu, Alan H. Welsh, Xingtao Zhou, Bo Fu, Catherine C. Liu

    Abstract: We aim to assist image-based myopia screening by resolving two longstanding problems, "how to integrate the information of ocular images of a pair of eyes" and "how to incorporate the inherent dependence among high-myopia status and axial length for both eyes." The classification-regression task is modeled as a novel 4-dimensional multi-response regression, where discrete responses are allowed, tha…

    Submitted 11 January, 2025; originally announced January 2025.

  18. arXiv:2501.01657  [pdf, other]

    stat.ME

    Change Point Detection for Random Objects with Possibly Periodic Behavior

    Authors: Jiazhen Xu, Andrew T. A. Wood, Tao Zou

    Abstract: Time-varying random objects have been increasingly encountered in modern data analysis. Moreover, in a substantial number of these applications, periodic behavior of the random objects has been observed. We introduce a new, powerful scan statistic and corresponding test for the precise identification and localization of abrupt changes in the distribution of non-Euclidean random objects with possib…

    Submitted 3 January, 2025; originally announced January 2025.

    Comments: arXiv admin note: text overlap with arXiv:2311.16025 by other authors

  19. arXiv:2411.17728  [pdf, other]

    cond-mat.str-el cs.LG eess.SP physics.comp-ph stat.ML

    Analytic Continuation by Feature Learning

    Authors: Zhe Zhao, Jingping Xu, Ce Wang, Yaping Yang

    Abstract: Analytic continuation aims to reconstruct real-time spectral functions from imaginary-time Green's functions; however, this process is notoriously ill-posed and challenging to solve. We propose a novel neural network architecture, named the Feature Learning Network (FL-net), to enhance the prediction accuracy of spectral functions, achieving an improvement of at least $20\%$ over traditional metho…

    Submitted 22 November, 2024; originally announced November 2024.

    Comments: 8 pages, 9 figures

  20. arXiv:2411.15567  [pdf, ps, other]

    stat.AP

    Regional consistency evaluation and sample size calculation under two MRCTs

    Authors: Kunhai Qing, Xinru Ren, Jin Xu

    Abstract: The multi-regional clinical trial (MRCT) has become common practice for drug development and global registration. The FDA guidance "Demonstrating Substantial Evidence of Effectiveness for Human Drug and Biological Products Guidance for Industry" (FDA, 2019) requires that substantial evidence of effectiveness of a drug/biologic product be demonstrated for market approval. In the situations where two p…

    Submitted 23 November, 2024; originally announced November 2024.

  21. arXiv:2411.01780  [pdf, other]

    cs.LG stat.ML

    Clustering Based on Density Propagation and Subcluster Merging

    Authors: Feiping Nie, Yitao Song, Jingjing Xue, Rong Wang, Xuelong Li

    Abstract: We propose the DPSM method, a density-based node clustering approach that automatically determines the number of clusters and can be applied in both data space and graph space. Unlike traditional density-based clustering methods, which necessitate calculating the distance between any two nodes, our proposed technique determines density through a propagation process, thereby making it suitable for…

    Submitted 3 November, 2024; originally announced November 2024.

  22. arXiv:2411.00075  [pdf, other]

    cs.LG stat.ML

    μP$^2$: Effective Sharpness Aware Minimization Requires Layerwise Perturbation Scaling

    Authors: Moritz Haas, Jin Xu, Volkan Cevher, Leena Chennuru Vankadara

    Abstract: Sharpness Aware Minimization (SAM) enhances performance across various neural architectures and datasets. As models are continually scaled up to improve performance, a rigorous understanding of SAM's scaling behaviour is paramount. To this end, we study the infinite-width limit of neural networks trained with SAM, using the Tensor Programs framework. Our findings reveal that the dynamics of standa…

    Submitted 10 February, 2025; v1 submitted 31 October, 2024; originally announced November 2024.

    Comments: Final NeurIPS 2024 camera-ready version. Differences to v1: Cleaner Figure 1, added Appendix H.3.2 showing that even MLPs can transfer optimal HPs in some versions of SP on CIFAR-10, small improvements in writing

  23. arXiv:2410.17392  [pdf, other]

    stat.AP math.OC

    Experimental Designs for Optimizing Last-Mile Delivery

    Authors: Nicholas Rios, Jie Xu

    Abstract: Companies like Amazon and UPS are heavily invested in last-mile delivery problems. Optimizing last-mile delivery operations not only creates tremendous cost savings for these companies but also generates broader societal and environmental benefits in terms of better delivery service and reduced air pollutants and greenhouse gas emissions. Last-mile delivery is readily formulated as the Travelling Salesm…

    Submitted 22 October, 2024; originally announced October 2024.

    Comments: 22 Pages, 2 Figures with 4 subfigure panels each, To be submitted to Quality Engineering

  24. arXiv:2410.05444  [pdf, other]

    cs.LG stat.ME stat.ML

    Online scalable Gaussian processes with conformal prediction for guaranteed coverage

    Authors: Jinwen Xu, Qin Lu, Georgios B. Giannakis

    Abstract: The Gaussian process (GP) is a Bayesian nonparametric paradigm that is widely adopted for uncertainty quantification (UQ) in a number of safety-critical applications, including robotics, healthcare, and surveillance. The consistency of the resulting uncertainty values, however, hinges on the premise that the learning function conforms to the properties specified by the GP model, such as smoo…

    Submitted 7 October, 2024; originally announced October 2024.

  25. arXiv:2410.03937  [pdf, other]

    cs.LG cs.CV eess.IV stat.ML

    Clustering Alzheimer's Disease Subtypes via Similarity Learning and Graph Diffusion

    Authors: Tianyi Wei, Shu Yang, Davoud Ataee Tarzanagh, Jingxuan Bao, Jia Xu, Patryk Orzechowski, Joost B. Wagenaar, Qi Long, Li Shen

    Abstract: Alzheimer's disease (AD) is a complex neurodegenerative disorder that affects millions of people worldwide. Due to the heterogeneous nature of AD, its diagnosis and treatment pose critical challenges. Consequently, research interest in identifying homogeneous AD subtypes that can help address these challenges has grown in recent years. In this study, we aim to identify subtypes of…

    Submitted 4 October, 2024; originally announced October 2024.

    Comments: ICIBM'23: International Conference on Intelligent Biology and Medicine, Tampa, FL, USA, July 16-19, 2023

  26. arXiv:2410.03833  [pdf, other]

    cs.LG stat.ML

    Understanding Fine-tuning in Approximate Unlearning: A Theoretical Perspective

    Authors: Meng Ding, Rohan Sharma, Changyou Chen, Jinhui Xu, Kaiyi Ji

    Abstract: Machine Unlearning has emerged as a significant area of research, focusing on 'removing' specific subsets of data from a trained model. Fine-tuning (FT) methods have become one of the fundamental approaches for approximating unlearning, as they effectively retain model performance. However, it is consistently observed that naive FT methods struggle to forget the targeted data. In this paper, we pr…

    Submitted 7 February, 2025; v1 submitted 4 October, 2024; originally announced October 2024.

    Comments: 23 pages, 5 figures

  27. arXiv:2409.06530  [pdf, other]

    math.OC cs.LG stat.ML

    Functionally Constrained Algorithm Solves Convex Simple Bilevel Problems

    Authors: Huaqing Zhang, Lesi Chen, Jing Xu, Jingzhao Zhang

    Abstract: This paper studies simple bilevel problems, where a convex upper-level function is minimized over the optimal solutions of a convex lower-level problem. We first show the fundamental difficulty of simple bilevel problems, that the approximate optimal value of such problems is not obtainable by first-order zero-respecting algorithms. Then we follow recent works to pursue the weak approximate soluti…

    Submitted 27 January, 2025; v1 submitted 10 September, 2024; originally announced September 2024.

    Comments: Accepted at NeurIPS 2024

  28. arXiv:2409.04919  [pdf, other]

    cs.LG stat.ML

    Learning with Shared Representations: Statistical Rates and Efficient Algorithms

    Authors: Xiaochun Niu, Lili Su, Jiaming Xu, Pengkun Yang

    Abstract: Collaborative learning through latent shared feature representations enables heterogeneous clients to train personalized models with enhanced performance while reducing sample complexity. Despite its empirical success and extensive research, the theoretical understanding of statistical error rates remains incomplete, even for shared representations constrained to low-dimensional linear subspaces.…

    Submitted 21 January, 2025; v1 submitted 7 September, 2024; originally announced September 2024.

  29. arXiv:2409.00407  [pdf, other]

    stat.CO

    Response probability distribution estimation of expensive computer simulators: A Bayesian active learning perspective using Gaussian process regression

    Authors: Chao Dang, Marcos A. Valdebenito, Nataly A. Manque, Jun Xu, Matthias G. R. Faes

    Abstract: Estimation of the response probability distributions of computer simulators in the presence of randomness is a crucial task in many fields. However, achieving this task with guaranteed accuracy remains an open computational challenge, especially for expensive-to-evaluate computer simulators. In this work, a Bayesian active learning perspective is presented to address the challenge, which is based…

    Submitted 31 August, 2024; originally announced September 2024.

  30. arXiv:2408.14625  [pdf, other]

    stat.CO stat.AP stat.ME

    A Bayesian approach for fitting semi-Markov mixture models of cancer latency to individual-level data

    Authors: Raphael Morsomme, Shannon Holloway, Marc Ryser, Jason Xu

    Abstract: Multi-state models of cancer natural history are widely used for designing and evaluating cancer early detection strategies. Calibrating such models against longitudinal data from screened cohorts is challenging, especially when fitting non-Markovian mixture models against individual-level data. Here, we consider a family of semi-Markov mixture models of cancer natural history and introduce an efficie…

    Submitted 26 August, 2024; originally announced August 2024.

    Comments: Submitted for review

  31. arXiv:2408.10996  [pdf, ps, other]

    stat.ML cs.LG math.NA

    Approximation Rates for Shallow ReLU$^k$ Neural Networks on Sobolev Spaces via the Radon Transform

    Authors: Tong Mao, Jonathan W. Siegel, Jinchao Xu

    Abstract: Let $Ω\subset \mathbb{R}^d$ be a bounded domain. We consider the problem of how efficiently shallow neural networks with the ReLU$^k$ activation function can approximate functions from Sobolev spaces $W^s(L_p(Ω))$ with error measured in the $L_q(Ω)$-norm. Utilizing the Radon transform and recent results from discrepancy theory, we provide a simple proof of nearly optimal approximation rates in a v…

    Submitted 20 August, 2024; originally announced August 2024.

    MSC Class: 62M45; 41A25; 41A30

  32. arXiv:2408.06710  [pdf, other]

    cs.LG cs.AI stat.ML

    Variational Learning of Gaussian Process Latent Variable Models through Stochastic Gradient Annealed Importance Sampling

    Authors: Jian Xu, Shian Du, Junmei Yang, Qianli Ma, Delu Zeng

    Abstract: Gaussian Process Latent Variable Models (GPLVMs) have become increasingly popular for unsupervised tasks such as dimensionality reduction and missing data recovery due to their flexibility and non-linear nature. An importance-weighted version of the Bayesian GPLVMs has been proposed to obtain a tighter variational bound. However, this version of the approach is primarily limited to analyzing simpl…

    Submitted 13 August, 2024; originally announced August 2024.

  33. arXiv:2408.03746  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Flexible Bayesian Last Layer Models Using Implicit Priors and Diffusion Posterior Sampling

    Authors: Jian Xu, Zhiqi Lin, Shigui Li, Min Chen, Junmei Yang, Delu Zeng, John Paisley

    Abstract: Bayesian Last Layer (BLL) models focus solely on uncertainty in the output layer of neural networks, demonstrating comparable performance to more complex Bayesian models. However, the use of Gaussian priors for last layer weights in BLL models limits their expressive capacity when faced with non-Gaussian, outlier-rich, or high-dimensional datasets. To address this shortfall,…

    Submitted 7 August, 2024; originally announced August 2024.

  34. arXiv:2407.19218  [pdf, other]

    stat.AP cs.IT

    A Versatility Measure for Parametric Risk Models

    Authors: Michael R. Powers, Jiaxin Xu

    Abstract: Parametric statistical methods play a central role in analyzing risk through its underlying frequency and severity components. Given the wide availability of numerical algorithms and high-speed computers, researchers and practitioners often model these separate (although possibly statistically dependent) random variables by fitting a large number of parametric probability distributions to historic…

    Submitted 15 March, 2025; v1 submitted 27 July, 2024; originally announced July 2024.

    MSC Class: 62F07; 62E10

  35. arXiv:2407.17033  [pdf, other]

    cs.LG cs.AI stat.ML

    Sparse Inducing Points in Deep Gaussian Processes: Enhancing Modeling with Denoising Diffusion Variational Inference

    Authors: Jian Xu, Delu Zeng, John Paisley

    Abstract: Deep Gaussian processes (DGPs) provide a robust paradigm for Bayesian deep learning. In DGPs, a set of sparse integration locations called inducing points are selected to approximate the posterior distribution of the model. This is done to reduce computational complexity and improve model efficiency. However, inferring the posterior distribution of inducing points is not straightforward. Tradition…

    Submitted 24 July, 2024; originally announced July 2024.

  36. arXiv:2407.13195  [pdf, other]

    cs.LG cs.AI cs.HC cs.IT stat.ML

    Scalable Exploration via Ensemble++

    Authors: Yingru Li, Jiawei Xu, Baoxiang Wang, Zhi-Quan Luo

    Abstract: Thompson Sampling is a principled method for balancing exploration and exploitation, but its real-world adoption faces computational challenges in large-scale or non-conjugate settings. While ensemble-based approaches offer partial remedies, they typically require prohibitively large ensemble sizes. We propose Ensemble++, a scalable exploration framework using a novel shared-factor ensemble archit…

    Submitted 18 May, 2025; v1 submitted 18 July, 2024; originally announced July 2024.

    Comments: 53 pages

  37. arXiv:2405.20970  [pdf, other]

    stat.ML cs.LG

    PUAL: A Classifier on Trifurcate Positive-Unlabeled Data

    Authors: Xiaoke Wang, Xiaochen Yang, Rui Zhu, Jing-Hao Xue

    Abstract: Positive-unlabeled (PU) learning aims to train a classifier using data containing only labeled-positive instances and unlabeled instances. However, existing PU learning methods generally struggle to achieve satisfactory performance on trifurcate data, where the positive instances are distributed on both sides of the negative instances. To address this issue, firstly we propose a PU classifier with…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: 24 pages, 6 figures

  38. arXiv:2405.17479  [pdf, other]

    cs.LG cs.NE stat.ML

    A rationale from frequency perspective for grokking in training neural network

    Authors: Zhangchen Zhou, Yaoyu Zhang, Zhi-Qin John Xu

    Abstract: Grokking is the phenomenon where neural networks (NNs) initially fit the training data and later generalize to the test data during training. In this paper, we empirically provide a frequency perspective to explain the emergence of this phenomenon in NNs. The core insight is that the networks initially learn the less salient frequency components present in the test data. We observe this phenomenon a…

    Submitted 24 May, 2024; originally announced May 2024.

  39. arXiv:2403.00968  [pdf, other]

    stat.ME

    The Bridged Posterior: Optimization, Profile Likelihood and a New Approach to Generalized Bayes

    Authors: Cheng Zeng, Eleni Dilma, Jason Xu, Leo L Duan

    Abstract: Optimization is widely used in statistics, thanks to its efficiency for delivering point estimates on useful spaces, such as those satisfying low cardinality or combinatorial structure. To quantify uncertainty, the Gibbs posterior exponentiates the negative loss function to form a posterior density. Nevertheless, Gibbs posteriors are supported in a high-dimensional space, and do not inherit the comput…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: 42 pages, 8 figures

  40. arXiv:2402.10228  [pdf, other]

    cs.LG cs.AI stat.ML

    Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent

    Authors: Yingru Li, Jiawei Xu, Lei Han, Zhi-Quan Luo

    Abstract: We propose HyperAgent, a reinforcement learning (RL) algorithm based on the hypermodel framework for exploration in RL. HyperAgent allows for the efficient incremental approximation of posteriors associated with an optimal action-value function ($Q^\star$) without the need for conjugacy and follows the greedy policies w.r.t. these approximate posterior samples. We demonstrate that HyperAgent offer…

    Submitted 14 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Proceedings of the $\mathit{41}^{st}$ International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024. Copyright 2024 by the author(s). Invited talk in Informs Optimization Conference 2024 and International Symposium on Mathematical Programming 2024

  41. arXiv:2402.08493  [pdf, other]

    cs.LG stat.ML

    Sparsity via Sparse Group $k$-max Regularization

    Authors: Qinghua Tao, Xiangming Xi, Jun Xu, Johan A. K. Suykens

    Abstract: For the linear inverse problem with sparsity constraints, the $l_0$ regularized problem is NP-hard, and existing approaches either utilize greedy algorithms to find almost-optimal solutions or approximate the $l_0$ regularization with its convex counterparts. In this paper, we propose a novel and concise regularization, namely the sparse group $k$-max regularization, which can not only simultan…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: 7 pages, accepted to American Control Conference 2024

  42. arXiv:2401.16421  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation

    Authors: Zhenyu He, Guhao Feng, Shengjie Luo, Kai Yang, Liwei Wang, Jingjing Xu, Zhi Zhang, Hongxia Yang, Di He

    Abstract: In this work, we leverage the intrinsic segmentation of language sequences and design a new positional encoding method called Bilevel Positional Encoding (BiPE). For each position, our BiPE blends an intra-segment encoding and an inter-segment encoding. The intra-segment encoding identifies the locations within a segment and helps the model capture the semantic information therein via absolute pos…

    Submitted 17 June, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: 17 pages, 7 figures, 8 tables; ICML 2024 Camera Ready version; Code: https://github.com/zhenyuhe00/BiPE

  43. arXiv:2312.15999  [pdf, other]

    cs.LG econ.EM stat.ML

    Pricing with Contextual Elasticity and Heteroscedastic Valuation

    Authors: Jianyu Xu, Yu-Xiang Wang

    Abstract: We study an online contextual dynamic pricing problem, where customers decide whether to purchase a product based on its features and price. We introduce a novel approach to modeling a customer's expected demand by incorporating feature-based price elasticity, which can be equivalently represented as a valuation with heteroscedastic noise. To solve the problem, we propose a computationally efficie…

    Submitted 26 December, 2023; originally announced December 2023.

    Comments: 29 pages

    MSC Class: 91B06; 91B24; 62P20; 62C20; 90B50 ACM Class: I.2.6

  44. arXiv:2312.13875  [pdf, other]

    stat.ML cs.LG stat.ME

    Best Arm Identification in Batched Multi-armed Bandit Problems

    Authors: Shengyu Cao, Simai He, Ruoqing Jiang, Jin Xu, Hongsong Yuan

    Abstract: Recently, the multi-armed bandit problem has arisen in many real-life scenarios where arms must be sampled in batches, due to the limited time the agent can wait for feedback. Such applications include biological experimentation and online marketing. The problem is further complicated when the number of arms is large and the number of batches is small. We consider pure exploration in a batched multi-armed…

    Submitted 21 December, 2023; originally announced December 2023.

  45. arXiv:2312.13484  [pdf, other]

    stat.ML cs.LG

    Bayesian Transfer Learning

    Authors: Piotr M. Suder, Jason Xu, David B. Dunson

    Abstract: Transfer learning is a burgeoning concept in statistical machine learning that seeks to improve inference and/or predictive accuracy on a domain of interest by leveraging data from related domains. While the term "transfer learning" has garnered much recent interest, its foundational principles have existed for years under various guises. Prior literature reviews in computer science and electrical…

    Submitted 20 December, 2023; originally announced December 2023.

  46. arXiv:2312.07741  [pdf, other]

    stat.ME math.ST stat.AP

    Robust Functional Principal Component Analysis for Non-Euclidean Random Objects

    Authors: Jiazhen Xu, Andrew T. A. Wood, Tao Zou

    Abstract: Functional data analysis offers a diverse toolkit of statistical methods tailored for analyzing samples of real-valued random functions. Recently, samples of time-varying random objects, such as time-varying networks, have been increasingly encountered in modern data analysis. These data structures represent elements within general metric spaces that lack local or global linear structures, renderi…

    Submitted 6 March, 2025; v1 submitted 28 November, 2023; originally announced December 2023.

  47. arXiv:2311.05806  [pdf, other]

    math.ST stat.ME

    Likelihood ratio tests in random graph models with increasing dimensions

    Authors: Ting Yan, Yuanzhang Li, Jinfeng Xu, Yaning Yang, Ji Zhu

    Abstract: We explore the Wilks phenomena in two random graph models: the $β$-model and the Bradley-Terry model. For two increasing dimensional null hypotheses, including a specified null $H_0: β_i=β_i^0$ for $i=1,\ldots, r$ and a homogeneous null $H_0: β_1=\cdots=β_r$, we reveal high dimensional Wilks' phenomena that the normalized log-likelihood ratio statistic,…

    Submitted 17 March, 2025; v1 submitted 9 November, 2023; originally announced November 2023.

    Comments: Major revisions. This paper supersedes arxiv article arXiv:2211.10055 titled "Wilks' theorems in the $β$-model" by T. Yan, Y. Zhang, J. Xu, Y. Yang and J. Zhu

  48. arXiv:2309.15809  [pdf, other]

    cs.LG stat.ML

    Fair Canonical Correlation Analysis

    Authors: Zhuoping Zhou, Davoud Ataee Tarzanagh, Bojian Hou, Boning Tong, Jia Xu, Yanbo Feng, Qi Long, Li Shen

    Abstract: This paper investigates fairness and bias in Canonical Correlation Analysis (CCA), a widely used statistical technique for examining the relationship between two sets of variables. We present a framework that alleviates unfairness by minimizing the correlation disparity error associated with protected attributes. Our approach enables CCA to learn global projection matrices from all data points whi…

    Submitted 27 September, 2023; originally announced September 2023.

    Comments: Accepted for publication at NeurIPS 2023, 31 Pages, 14 Figures

  49. arXiv:2309.12658  [pdf, other]

    cs.LG stat.ML

    Neural Operator Variational Inference based on Regularized Stein Discrepancy for Deep Gaussian Processes

    Authors: Jian Xu, Shian Du, Junmei Yang, Qianli Ma, Delu Zeng

    Abstract: Deep Gaussian Process (DGP) models offer a powerful nonparametric approach for Bayesian inference, but exact inference is typically intractable, motivating the use of various approximations. However, existing approaches, such as mean-field Gaussian assumptions, limit the expressiveness and efficacy of DGP models, while stochastic approximation can be computationally expensive. To tackle these chal…

    Submitted 22 September, 2023; originally announced September 2023.

  50. arXiv:2309.11764  [pdf, other]

    stat.ME

    Causal inference with outcome dependent sampling and mismeasured outcome

    Authors: Min Zeng, Zeyang Jia, Zijian Sui, Jinfeng Xu, Hong Zhang

    Abstract: Outcome-dependent sampling designs are extensively utilized in various scientific disciplines, including epidemiology, ecology, and economics, with retrospective case-control studies being specific examples of such designs. Additionally, if the outcome used for sample selection is also mismeasured, then it is even more challenging to estimate the average treatment effect (ATE) accurately. To our k…

    Submitted 20 September, 2023; originally announced September 2023.

    Comments: 49 pages, 5 figures