Showing 1–50 of 312 results for author: Li, S

  1. arXiv:2507.00312  [pdf, ps, other]

    stat.ME

    Optimal Targeting in Dynamic Systems

    Authors: Yuchen Hu, Shuangning Li, Stefan Wager

    Abstract: Modern treatment targeting methods often rely on estimating the conditional average treatment effect (CATE) using machine learning tools. While effective in identifying who benefits from treatment on the individual level, these approaches typically overlook system-level dynamics that may arise when treatments induce strain on shared capacity. We study the problem of targeting in Markovian systems,…

    Submitted 30 June, 2025; originally announced July 2025.

  2. arXiv:2506.17718  [pdf, ps, other]

    cs.LG stat.ML

    Learning Time-Aware Causal Representation for Model Generalization in Evolving Domains

    Authors: Zhuo He, Shuang Li, Wenze Song, Longhui Yuan, Jian Liang, Han Li, Kun Gai

    Abstract: Endowing deep models with the ability to generalize in dynamic scenarios is of vital significance for real-world deployment, given the continuous and complex changes in data distribution. Recently, evolving domain generalization (EDG) has emerged to address distribution shifts over time, aiming to capture evolving patterns for improved model generalization. However, existing EDG methods may suffer…

    Submitted 28 June, 2025; v1 submitted 21 June, 2025; originally announced June 2025.

    Comments: ICML 2025

  3. arXiv:2506.12741  [pdf, ps, other]

    stat.ME stat.CO

    Efficient Implementation of a Semiparametric Joint Model for Multivariate Longitudinal Biomarkers and Competing Risks Time-to-Event Data

    Authors: Shanpeng Li, Emily Ouyang, Jin Zhou, Xinping Cui, Gang Li

    Abstract: Joint modeling has become increasingly popular for characterizing the association between one or more longitudinal biomarkers and competing risks time-to-event outcomes. However, semiparametric multivariate joint modeling for large-scale data encounters substantial statistical and computational challenges, primarily due to the high dimensionality of random effects and the complexity of estimating n…

    Submitted 15 June, 2025; originally announced June 2025.

  4. arXiv:2506.06919  [pdf, ps, other]

    stat.ME

    Tensor Stochastic Regression for High-dimensional Time Series via CP Decomposition

    Authors: Shibo Li, Yao Zheng

    Abstract: As tensor-valued data become increasingly common in time series analysis, there is a growing need for flexible and interpretable models that can handle high-dimensional predictors and responses across multiple modes. We propose a unified framework for high-dimensional tensor stochastic regression based on CANDECOMP/PARAFAC (CP) decomposition, which encompasses vector, matrix, and tensor responses…

    Submitted 7 June, 2025; originally announced June 2025.

  5. arXiv:2506.05544  [pdf, ps, other]

    stat.ML cs.LG

    Online Conformal Model Selection for Nonstationary Time Series

    Authors: Shibo Li, Yao Zheng

    Abstract: This paper introduces the MPS (Model Prediction Set), a novel framework for online model selection for nonstationary time series. Classical model selection methods, such as information criteria and cross-validation, rely heavily on the stationarity assumption and often fail in dynamic environments which undergo gradual or abrupt changes over time. Yet real-world data are rarely stationary, and mod…

    Submitted 5 June, 2025; originally announced June 2025.

  6. arXiv:2506.05295  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Sample Complexity and Representation Ability of Test-time Scaling Paradigms

    Authors: Baihe Huang, Shanda Li, Tianhao Wu, Yiming Yang, Ameet Talwalkar, Kannan Ramchandran, Michael I. Jordan, Jiantao Jiao

    Abstract: Test-time scaling paradigms have significantly advanced the capabilities of large language models (LLMs) on complex tasks. Despite their empirical success, theoretical understanding of the sample efficiency of various test-time strategies -- such as self-consistency, best-of-$n$, and self-correction -- remains limited. In this work, we first establish a separation result between two repeated sampl…

    Submitted 12 June, 2025; v1 submitted 5 June, 2025; originally announced June 2025.

  7. arXiv:2506.00561  [pdf, ps, other]

    stat.AP stat.ME

    Assessing Climate-Driven Mortality Risk: A Stochastic Approach with Distributed Lag Non-Linear Models

    Authors: Jiacheng Min, Han Li, Thomas Nagler, Shuanming Li

    Abstract: Assessing climate-driven mortality risk has become an emerging area of research in recent decades. In this paper, we propose a novel approach to explicitly incorporate climate-driven effects into both single- and multi-population stochastic mortality models. The new model consists of two components: a stochastic mortality model, and a distributed lag non-linear model (DLNM). The first component ca…

    Submitted 31 May, 2025; originally announced June 2025.

  8. arXiv:2505.16019  [pdf, other]

    stat.ME q-fin.ST stat.AP

    Quantile Predictions for Equity Premium using Penalized Quantile Regression with Consistent Variable Selection across Multiple Quantiles

    Authors: Shaobo Li, Ben Sherwood

    Abstract: This paper considers equity premium prediction, for which mean regression can be problematic due to heteroscedasticity and heavy-tails of the error. We show advantages of quantile predictions using a novel penalized quantile regression that offers a model for a full spectrum analysis on the equity premium distribution. To enhance model interpretability and address the well-known issue of crossing…

    Submitted 21 May, 2025; originally announced May 2025.

  9. arXiv:2505.08198  [pdf, ps, other]

    stat.ML cs.LG

    SIM-Shapley: A Stable and Computationally Efficient Approach to Shapley Value Approximation

    Authors: Wangxuan Fan, Siqi Li, Doudou Zhou, Yohei Okada, Chuan Hong, Molei Liu, Nan Liu

    Abstract: Explainable artificial intelligence (XAI) is essential for trustworthy machine learning (ML), particularly in high-stakes domains such as healthcare and finance. Shapley value (SV) methods provide a principled framework for feature attribution in complex models but incur high computational costs, limiting their scalability in high-dimensional settings. We propose Stochastic Iterative Momentum for…

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: 21 pages, 6 figures, 5 tables

  10. arXiv:2505.05034  [pdf, ps, other]

    cs.LG stat.ML

    Dequantified Diffusion-Schrödinger Bridge for Density Ratio Estimation

    Authors: Wei Chen, Shigui Li, Jiacheng Li, Junmei Yang, John Paisley, Delu Zeng

    Abstract: Density ratio estimation is fundamental to tasks involving f-divergences, yet existing methods often fail under significantly different distributions or inadequately overlapping supports -- the density-chasm and the support-chasm problems. Additionally, prior approaches yield divergent time scores near boundaries, leading to instability. We design $\textbf{D}^3\textbf{RE}$, a unified framework for…

    Submitted 29 May, 2025; v1 submitted 8 May, 2025; originally announced May 2025.

    Journal ref: ICML 2025: Proceedings of the 42nd International Conference on Machine Learning, 2025

  11. arXiv:2503.19763  [pdf, other]

    stat.ML cs.LG math.ST

    Interpretable Deep Regression Models with Interval-Censored Failure Time Data

    Authors: Changhui Yuan, Shishun Zhao, Shuwei Li, Xinyuan Song, Zhao Chen

    Abstract: Deep neural networks (DNNs) have become powerful tools for modeling complex data structures through sequentially integrating simple functions in each hidden layer. In survival analysis, recent advances of DNNs primarily focus on enhancing model capabilities, especially in exploring nonlinear covariate effects under right censoring. However, deep learning methods for interval-censored data, where t…

    Submitted 25 March, 2025; originally announced March 2025.

  12. arXiv:2503.18256  [pdf, other]

    stat.ME

    Efficient Inference for Covariate-adjusted Bradley-Terry Model with Covariate Shift

    Authors: Xiudi Li, Sijia Li

    Abstract: We propose a general framework for statistical inference on the overall strengths of players in pairwise comparisons, allowing for potential shifts in the covariate distribution. These covariates capture important contextual information that may impact the winning probability of each player. We measure the overall strengths of players under a target distribution through its Kullback-Leibler projec…

    Submitted 8 April, 2025; v1 submitted 23 March, 2025; originally announced March 2025.

    Comments: 19 pages

  13. arXiv:2503.11774  [pdf]

    cs.LG stat.ML

    UBMF: Uncertainty-Aware Bayesian Meta-Learning Framework for Fault Diagnosis with Imbalanced Industrial Data

    Authors: Zhixuan Lian, Shangyu Li, Qixuan Huang, Zijian Huang, Haifei Liu, Jianan Qiu, Puyu Yang, Laifa Tao

    Abstract: Fault diagnosis of mechanical equipment involves data collection, feature extraction, and pattern recognition but is often hindered by the imbalanced nature of industrial data, introducing significant uncertainty and reducing diagnostic reliability. To address these challenges, this study proposes the Uncertainty-Aware Bayesian Meta-Learning Framework (UBMF), which integrates four key modules: dat…

    Submitted 14 March, 2025; originally announced March 2025.

  14. arXiv:2503.04483  [pdf, ps, other]

    stat.ML cs.LG q-bio.QM

    InfoSEM: A Deep Generative Model with Informative Priors for Gene Regulatory Network Inference

    Authors: Tianyu Cui, Song-Jun Xu, Artem Moskalev, Shuwei Li, Tommaso Mansi, Mangal Prakash, Rui Liao

    Abstract: Inferring Gene Regulatory Networks (GRNs) from gene expression data is crucial for understanding biological processes. While supervised models are reported to achieve high performance for this task, they rely on costly ground truth (GT) labels and risk learning gene-specific biases, such as class imbalances of GT interactions, rather than true regulatory mechanisms. To address these issues, we int…

    Submitted 8 June, 2025; v1 submitted 6 March, 2025; originally announced March 2025.

    Comments: ICML 2025

  15. arXiv:2502.14166  [pdf, ps, other]

    stat.ML cs.LG stat.ME

    Prediction-Powered Adaptive Shrinkage Estimation

    Authors: Sida Li, Nikolaos Ignatiadis

    Abstract: Prediction-Powered Inference (PPI) is a powerful framework for enhancing statistical estimates by combining limited gold-standard data with machine learning (ML) predictions. While prior work has demonstrated PPI's benefits for individual statistical problems, modern applications require answering numerous parallel statistical questions. We introduce Prediction-Powered Adaptive Shrinkage (PAS), a…

    Submitted 7 June, 2025; v1 submitted 19 February, 2025; originally announced February 2025.

    Comments: Accepted as poster in ICML 2025

  16. arXiv:2412.20992  [pdf, ps, other]

    cs.LG cs.PL stat.ML

    Verified Lifting of Deep learning Operators

    Authors: Qi Zhan, Xing Hu, Xin Xia, Shanping Li

    Abstract: Deep learning operators are fundamental components of modern deep learning frameworks. With the growing demand for customized operators, it has become increasingly common for developers to create their own. However, designing and implementing operators is complex and error-prone, due to hardware-specific optimizations and the need for numerical stability. There is a pressing need for tools that ca…

    Submitted 30 December, 2024; originally announced December 2024.

  17. arXiv:2412.11744  [pdf, other]

    stat.ML cs.LG

    Conditional Diffusion Models Based Conditional Independence Testing

    Authors: Yanfeng Yang, Shuai Li, Yingjie Zhang, Zhuoran Sun, Hai Shu, Ziqi Chen, Renming Zhang

    Abstract: Conditional independence (CI) testing is a fundamental task in modern statistics and machine learning. The conditional randomization test (CRT) was recently introduced to test whether two random variables, $X$ and $Y$, are conditionally independent given a potentially high-dimensional set of random variables, $Z$. The CRT operates exceptionally well under the assumption that the conditional distri…

    Submitted 18 December, 2024; v1 submitted 16 December, 2024; originally announced December 2024.

    Comments: 17 pages, 7 figures, aaai 2025

  18. arXiv:2411.17472  [pdf, other]

    cs.CV cs.LG stat.ML

    Unlocking the Potential of Text-to-Image Diffusion with PAC-Bayesian Theory

    Authors: Eric Hanchen Jiang, Yasi Zhang, Zhi Zhang, Yixin Wan, Andrew Lizarraga, Shufan Li, Ying Nian Wu

    Abstract: Text-to-image (T2I) diffusion models have revolutionized generative modeling by producing high-fidelity, diverse, and visually realistic images from textual prompts. Despite these advances, existing models struggle with complex prompts involving multiple objects and attributes, often misaligning modifiers with their corresponding nouns or neglecting certain elements. Recent attention-based methods…

    Submitted 25 November, 2024; originally announced November 2024.

  19. arXiv:2411.08911  [pdf, other]

    physics.comp-ph cond-mat.mtrl-sci cs.LG stat.ML

    A Message Passing Neural Network Surrogate Model for Bond-Associated Peridynamic Material Correspondence Formulation

    Authors: Xuan Hu, Qijun Chen, Nicholas H. Luo, Richy J. Zheng, Shaofan Li

    Abstract: Peridynamics is a non-local continuum mechanics theory that offers unique advantages for modeling problems involving discontinuities and complex deformations. Within the peridynamic framework, various formulations exist, among which the material correspondence formulation stands out for its ability to directly incorporate traditional continuum material models, making it highly applicable to a rang…

    Submitted 29 October, 2024; originally announced November 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2410.00934

  20. arXiv:2411.06697  [pdf, ps, other]

    cs.LG cs.DS math.OC stat.ML

    Learning a Single Neuron Robustly to Distributional Shifts and Adversarial Label Noise

    Authors: Shuyao Li, Sushrut Karmalkar, Ilias Diakonikolas, Jelena Diakonikolas

    Abstract: We study the problem of learning a single neuron with respect to the $L_2^2$-loss in the presence of adversarial distribution shifts, where the labels can be arbitrary, and the goal is to find a ``best-fit'' function. More precisely, given training samples from a reference distribution $\mathcal{p}_0$, the goal is to approximate the vector $\mathbf{w}^*$ which minimizes the squared loss with respe…

    Submitted 10 November, 2024; originally announced November 2024.

  21. arXiv:2411.04852  [pdf, other]

    stat.ML cs.LG

    Conformalized Credal Regions for Classification with Ambiguous Ground Truth

    Authors: Michele Caprio, David Stutz, Shuo Li, Arnaud Doucet

    Abstract: An open question in \emph{Imprecise Probabilistic Machine Learning} is how to empirically derive a credal region (i.e., a closed and convex family of probabilities on the output space) from the available data, without any prior knowledge or assumption. In classification problems, credal regions are a tool that is able to provide provable guarantees under realistic assumptions by characterizing the…

    Submitted 27 January, 2025; v1 submitted 7 November, 2024; originally announced November 2024.

    Journal ref: TMLR 2025

  22. arXiv:2411.01956  [pdf, other]

    cs.LG cs.CY stat.ML

    EXAGREE: Towards Explanation Agreement in Explainable Machine Learning

    Authors: Sichao Li, Quanling Deng, Amanda S. Barnard

    Abstract: Explanations in machine learning are critical for trust, transparency, and fairness. Yet, complex disagreements among these explanations limit the reliability and applicability of machine learning models, especially in high-stakes environments. We formalize four fundamental ranking-based explanation disagreement problems and introduce a novel framework, EXplanation AGREEment (EXAGREE), to bridge d…

    Submitted 4 November, 2024; originally announced November 2024.

  23. arXiv:2410.10051  [pdf, other]

    cs.LG stat.ML

    Towards Bridging Generalization and Expressivity of Graph Neural Networks

    Authors: Shouheng Li, Floris Geerts, Dongwoo Kim, Qing Wang

    Abstract: Expressivity and generalization are two critical aspects of graph neural networks (GNNs). While significant progress has been made in studying the expressivity of GNNs, much less is known about their generalization capabilities, particularly when dealing with the inherent complexity of graph-structured data. In this work, we address the intricate relationship between expressivity and generalizatio…

    Submitted 13 October, 2024; originally announced October 2024.

    Comments: 17 pages, 2 figures, 2 tables

  24. arXiv:2410.09766  [pdf, ps, other]

    cs.LG stat.ML

    Stability and Sharper Risk Bounds with Convergence Rate $O(1/n^2)$

    Authors: Bowei Zhu, Shaojie Li, Yong Liu

    Abstract: The sharpest known high probability excess risk bounds are up to $O\left( 1/n \right)$ for empirical risk minimization and projected gradient descent via algorithmic stability (Klochkov \& Zhivotovskiy, 2021). In this paper, we show that high probability excess risk bounds of order up to $O\left( 1/n^2 \right)$ are possible. We discuss how high probability excess risk bounds reach…

    Submitted 13 October, 2024; originally announced October 2024.

  25. Towards Sharper Risk Bounds for Minimax Problems

    Authors: Bowei Zhu, Shaojie Li, Yong Liu

    Abstract: Minimax problems have achieved success in machine learning such as adversarial training, robust optimization, reinforcement learning. For theoretical analysis, current optimal excess risk bounds, which are composed by generalization error and optimization error, present 1/n-rates in strongly-convex-strongly-concave (SC-SC) settings. Existing studies mainly focus on minimax problems with specific a…

    Submitted 10 October, 2024; originally announced October 2024.

  26. arXiv:2410.03477  [pdf, other]

    cs.LG cs.CC math.ST stat.ML

    On the Hardness of Learning One Hidden Layer Neural Networks

    Authors: Shuchen Li, Ilias Zadik, Manolis Zampetakis

    Abstract: In this work, we consider the problem of learning one hidden layer ReLU neural networks with inputs from $\mathbb{R}^d$. We show that this learning problem is hard under standard cryptographic assumptions even when: (1) the size of the neural network is polynomial in $d$, (2) its input distribution is a standard Gaussian, and (3) the noise is Gaussian and polynomially small in $d$. Our hardness re…

    Submitted 4 October, 2024; originally announced October 2024.

    Comments: 18 pages

  27. arXiv:2410.03229  [pdf, other]

    stat.ML cs.LG

    Elucidating the Design Choice of Probability Paths in Flow Matching for Forecasting

    Authors: Soon Hoe Lim, Yijin Wang, Annan Yu, Emma Hart, Michael W. Mahoney, Xiaoye S. Li, N. Benjamin Erichson

    Abstract: Flow matching has recently emerged as a powerful paradigm for generative modeling and has been extended to probabilistic time series forecasting in latent spaces. However, the impact of the specific choice of probability path model on forecasting performance remains under-explored. In this work, we demonstrate that forecasting spatio-temporal data with flow matching is highly sensitive to the sele…

    Submitted 18 January, 2025; v1 submitted 4 October, 2024; originally announced October 2024.

    Comments: 33 pages

  28. arXiv:2410.02941  [pdf, other]

    stat.ME

    Efficient collaborative learning of the average treatment effect under data sharing constraints

    Authors: Sijia Li, Rui Duan

    Abstract: Driven by the need to generate real-world evidence from multi-site collaborative studies, we introduce an efficient collaborative learning approach to evaluate average treatment effect in a multi-site setting under data sharing constraints. Specifically, the proposed method operates in a federated manner, using individual-level data from a user-defined target population and summary statistics from…

    Submitted 3 October, 2024; originally announced October 2024.

    Comments: 17 pages, 3 figures

  29. arXiv:2409.11167  [pdf, other]

    stat.ME math.ST

    Poisson and Gamma Model Marginalisation and Marginal Likelihood calculation using Moment-generating Functions

    Authors: Si-Yang R. Y. Li, David A. van Dyk, Maximilian Autenrieth

    Abstract: We present a new analytical method to derive the likelihood function that has the population of parameters marginalised out in Bayesian hierarchical models. This method is also useful to find the marginal likelihoods in Bayesian models or in random-effect linear mixed models. The key to this method is to take high-order (sometimes fractional) derivatives of the prior moment-generating function if…

    Submitted 27 November, 2024; v1 submitted 17 September, 2024; originally announced September 2024.

  30. arXiv:2409.07431  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Synthetic continued pretraining

    Authors: Zitong Yang, Neil Band, Shuangping Li, Emmanuel Candès, Tatsunori Hashimoto

    Abstract: Pretraining on large-scale, unstructured internet text enables language models to acquire a significant amount of world knowledge. However, this knowledge acquisition is data-inefficient--to learn a given fact, models must be trained on hundreds to thousands of diverse representations of it. This poses a challenge when adapting a pretrained model to a small corpus of domain-specific documents, whe…

    Submitted 3 October, 2024; v1 submitted 11 September, 2024; originally announced September 2024.

    Comments: Updated organization of experimental results and methods introduction. Released the dataset and model weights artifact

  31. arXiv:2409.05798  [pdf, other]

    cs.LG cs.AI cs.HC econ.EM stat.ML

    Enhancing Preference-based Linear Bandits via Human Response Time

    Authors: Shen Li, Yuyang Zhang, Zhaolin Ren, Claire Liang, Na Li, Julie A. Shah

    Abstract: Interactive preference learning systems infer human preferences by presenting queries as pairs of options and collecting binary choices. Although binary choices are simple and widely used, they provide limited information about preference strength. To address this, we leverage human response times, which are inversely related to preference strength, as an additional signal. We propose a computatio…

    Submitted 2 January, 2025; v1 submitted 9 September, 2024; originally announced September 2024.

    Comments: NeurIPS 2024 (Oral) camera ready

  32. arXiv:2408.03746  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Flexible Bayesian Last Layer Models Using Implicit Priors and Diffusion Posterior Sampling

    Authors: Jian Xu, Zhiqi Lin, Shigui Li, Min Chen, Junmei Yang, Delu Zeng, John Paisley

    Abstract: Bayesian Last Layer (BLL) models focus solely on uncertainty in the output layer of neural networks, demonstrating comparable performance to more complex Bayesian models. However, the use of Gaussian priors for last layer weights in Bayesian Last Layer (BLL) models limits their expressive capacity when faced with non-Gaussian, outlier-rich, or high-dimensional datasets. To address this shortfall,…

    Submitted 7 August, 2024; originally announced August 2024.

  33. arXiv:2407.16832  [pdf, other]

    stat.AP

    Real-time risk estimation for active road safety: Leveraging Waymo AV sensor data with hierarchical Bayesian extreme value models

    Authors: Mohammad Anis, Sixu Li, Srinivas R. Geedipally, Yang Zhou, Dominique Lord

    Abstract: This study develops a real-time framework for estimating the risk of near-misses by using high-fidelity two-dimensional (2D) risk indicator time-to-collision (TTC), which is calculated from high-resolution data collected by autonomous vehicles (AVs). The framework utilizes extreme value theory (EVT) to derive near-miss risk based on observed TTC data. Most existing studies employ a generalized ext…

    Submitted 14 October, 2024; v1 submitted 23 July, 2024; originally announced July 2024.

    Comments: 29 pages, 13 figures

  34. arXiv:2407.14065  [pdf, other]

    cs.LG stat.ML

    MSCT: Addressing Time-Varying Confounding with Marginal Structural Causal Transformer for Counterfactual Post-Crash Traffic Prediction

    Authors: Shuang Li, Ziyuan Pu, Nan Zhang, Duxin Chen, Lu Dong, Daniel J. Graham, Yinhai Wang

    Abstract: Traffic crashes profoundly impede traffic efficiency and pose economic challenges. Accurate prediction of post-crash traffic status provides essential information for evaluating traffic perturbations and developing effective solutions. Previous studies have established a series of deep learning models to predict post-crash traffic conditions, however, these correlation-based methods cannot accommo…

    Submitted 19 July, 2024; originally announced July 2024.

    Comments: 13 pages, 9 figures

  35. A Partially Pooled NSUM Model: Detailed estimation of CSEM trafficking prevalence in Philippine municipalities

    Authors: Albert Nyarko-Agyei, Scott Moser, Rowland G Seymour, Ben Brewster, Sabrina Li, Esther Weir, Todd Landman, Emily Wyman, Christine Belle Torres, Imogen Fell, Doreen Boyd

    Abstract: Effective policy and intervention strategies to combat human trafficking for child sexual exploitation material (CSEM) production require accurate prevalence estimates. Traditional Network Scale Up Method (NSUM) models often necessitate standalone surveys for each geographic region, escalating costs and complexity. This study introduces a partially pooled NSUM model, using a hierarchical Bayesian…

    Submitted 10 April, 2025; v1 submitted 18 July, 2024; originally announced July 2024.

    Comments: Accepted for publication in the journal of the Royal Statistical Society: Series C

  36. arXiv:2407.11646  [pdf, other]

    stat.ME

    Discovery and inference of possibly bi-directional causal relationships with invalid instrumental variables

    Authors: Wei Li, Rui Duan, Sai Li

    Abstract: Learning causal relationships between pairs of complex traits from observational studies is of great interest across various scientific domains. However, most existing methods assume the absence of unmeasured confounding and restrict causal relationships between two traits to be uni-directional, which may be violated in real-world systems. In this paper, we address the challenge of causal discover…

    Submitted 16 July, 2024; originally announced July 2024.

  37. arXiv:2407.01607  [pdf, other]

    cs.LG cs.IR stat.ML

    Multi-Epoch learning with Data Augmentation for Deep Click-Through Rate Prediction

    Authors: Zhongxiang Fan, Zhaocheng Liu, Jian Liang, Dongying Kong, Han Li, Peng Jiang, Shuang Li, Kun Gai

    Abstract: This paper investigates the one-epoch overfitting phenomenon in Click-Through Rate (CTR) models, where performance notably declines at the start of the second epoch. Despite extensive research, the efficacy of multi-epoch training over the conventional one-epoch approach remains unclear. We identify the overfitting of the embedding layer, caused by high-dimensional data sparsity, as the primary is…

    Submitted 27 June, 2024; originally announced July 2024.

  38. Causal Inference with Latent Variables: Recent Advances and Future Prospectives

    Authors: Yaochen Zhu, Yinhan He, Jing Ma, Mengxuan Hu, Sheng Li, Jundong Li

    Abstract: Causality lays the foundation for the trajectory of our world. Causal inference (CI), which aims to infer intrinsic causal relations among variables of interest, has emerged as a crucial research topic. Nevertheless, the lack of observation of important variables (e.g., confounders, mediators, exogenous variables, etc.) severely compromises the reliability of CI methods. The issue may arise from t…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted by KDD'24 Survey Track

  39. arXiv:2406.01380  [pdf, other]

    cs.CV stat.AP

    Convolutional Unscented Kalman Filter for Multi-Object Tracking with Outliers

    Authors: Shiqi Liu, Wenhan Cao, Chang Liu, Tianyi Zhang, Shengbo Eben Li

    Abstract: Multi-object tracking (MOT) is an essential technique for navigation in autonomous driving. In tracking-by-detection systems, biases, false positives, and misses, which are referred to as outliers, are inevitable due to complex traffic scenarios. Recent tracking methods are based on filtering algorithms that overlook these outliers, leading to reduced tracking accuracy or even loss of the objects…

    Submitted 15 September, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: IEEE Transactions on Intelligent Vehicles

  40. arXiv:2405.19544  [pdf, other]

    cs.AI cs.CL cs.LG math.OC stat.ML

    One-Shot Safety Alignment for Large Language Models via Optimal Dualization

    Authors: Xinmeng Huang, Shuo Li, Edgar Dobriban, Osbert Bastani, Hamed Hassani, Dongsheng Ding

    Abstract: The growing safety concerns surrounding large language models raise an urgent need to align them with diverse human preferences to simultaneously enhance their helpfulness and safety. A promising approach is to enforce safety constraints through Reinforcement Learning from Human Feedback (RLHF). For such constrained RLHF, typical Lagrangian-based primal-dual policy optimization methods are computa…

    Submitted 22 November, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: 32 pages, 6 figures, 8 tables

  41. arXiv:2405.19231  [pdf, other]

    stat.ME

    Covariate Shift Corrected Conditional Randomization Test

    Authors: Bowen Xu, Yiwen Huang, Chuan Hong, Shuangning Li, Molei Liu

    Abstract: Conditional independence tests are crucial across various disciplines in determining the independence of an outcome variable $Y$ from a treatment variable $X$, conditioning on a set of confounders $Z$. The Conditional Randomization Test (CRT) offers a powerful framework for such testing by assuming known distributions of $X \mid Z$; it controls the Type-I error exactly, allowing for the use of fle…

    Submitted 29 May, 2024; originally announced May 2024.

  42. arXiv:2404.08927  [pdf, other]

    stat.AP

    PDXpower: A Power Analysis Tool for Experimental Design in Pre-clinical Xenograft Studies for Uncensored and Censored Outcomes

    Authors: Shanpeng Li, Donatello Telesca, Harley I. Kornblum, David Nathanson, Frank Pajonk, Elvis Han Cui, Joycelynne Palmer, Gang Li

    Abstract: In cancer research, leveraging patient-derived xenografts (PDXs) in pre-clinical experiments is a crucial approach for assessing innovative therapeutic strategies. Addressing the inherent variability in treatment response among and within individual PDX lines is essential. However, the current literature lacks a user-friendly statistical power analysis tool capable of concurrently determining the…

    Submitted 13 April, 2024; originally announced April 2024.

  43. arXiv:2404.03163  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    Uncertainty in Language Models: Assessment through Rank-Calibration

    Authors: Xinmeng Huang, Shuo Li, Mengxin Yu, Matteo Sesia, Hamed Hassani, Insup Lee, Osbert Bastani, Edgar Dobriban

    Abstract: Language Models (LMs) have shown promising performance in natural language generation. However, as LMs often generate incorrect or hallucinated responses, it is crucial to correctly quantify their uncertainty in responding to given inputs. In addition to verbalized confidence elicited via prompting, many uncertainty measures ($e.g.$, semantic entropy and affinity-graph-based measures) have been pr…

    Submitted 13 September, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

  44. arXiv:2404.01608  [pdf, ps, other

    stat.ML cs.LG stat.ME

    FAIRM: Learning invariant representations for algorithmic fairness and domain generalization with minimax optimality

    Authors: Sai Li, Linjun Zhang

    Abstract: Machine learning methods often assume that the test data have the same distribution as the training data. However, this assumption may not hold due to multiple levels of heterogeneity in applications, raising issues in algorithmic fairness and domain generalization. In this work, we address the problem of fair and generalizable machine learning by invariant principles. We propose a training enviro…

    Submitted 1 April, 2024; originally announced April 2024.

  45. arXiv:2404.00481  [pdf, other

    stat.ML cs.LG eess.SY

    Convolutional Bayesian Filtering

    Authors: Wenhan Cao, Shiqi Liu, Chang Liu, Zeyu He, Stephen S. -T. Yau, Shengbo Eben Li

    Abstract: Bayesian filtering serves as the mainstream framework of state estimation in dynamic systems. Its standard version utilizes the total probability rule and Bayes' law alternately, where how to define and compute conditional probability is critical to state distribution inference. Previously, the conditional probability is assumed to be exactly known, which represents a measure of the occurrence proba…

    Submitted 30 March, 2024; originally announced April 2024.

  46. arXiv:2403.07213  [pdf, other

    cs.LG stat.ML

    Which LLM to Play? Convergence-Aware Online Model Selection with Time-Increasing Bandits

    Authors: Yu Xia, Fang Kong, Tong Yu, Liya Guo, Ryan A. Rossi, Sungchul Kim, Shuai Li

    Abstract: Web-based applications such as chatbots, search engines and news recommendations continue to grow in scale and complexity with the recent surge in the adoption of LLMs. Online model selection has thus garnered increasing attention due to the need to choose the best model among a diverse set while balancing task reward and exploration cost. Organizations face decisions like whether to employ a cos…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: Accepted by WWW'24 (Oral)

  47. arXiv:2403.04246  [pdf, other

    stat.ML cs.AI cs.LG

    Efficient CNN-LSTM based Parameter Estimation of Levy Driven Stochastic Differential Equations

    Authors: Shuaiyu Li, Yang Ruan, Changzhou Long, Yuzhong Cheng

    Abstract: This study addresses the challenges in parameter estimation of stochastic differential equations driven by non-Gaussian noises, which are critical in understanding dynamic phenomena such as price fluctuations and the spread of infectious diseases. Previous research highlighted the potential of LSTM networks in estimating parameters of alpha stable Levy driven SDEs but faced limitations including h…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: 2023 International Conference on Machine Learning and Applications (ICMLA)

  48. arXiv:2402.19209  [pdf, other

    stat.AP

    Call center data analysis and model validation

    Authors: Ger Koole, Siqiao Li, Sihan Ding

    Abstract: We analyze call center data on properties such as agent heterogeneity, customer patience and breaks. Then we compare simulation models that are different in the ways these properties are modeled. We classify them according to the extent to which they approach the actual service level and average waiting times. We obtain a theoretical understanding on how to distinguish between the model error and…

    Submitted 29 February, 2024; originally announced February 2024.

  49. arXiv:2402.02329  [pdf, ps, other

    stat.ME

    Leveraging Local Distributions in Mendelian Randomization: Uncertain Opinions are Invalid

    Authors: Ziya Xu, Sai Li

    Abstract: Mendelian randomization (MR) considers using genetic variants as instrumental variables (IVs) to infer causal effects in observational studies. However, the validity of causal inference in MR can be compromised when the IVs are potentially invalid. In this work, we propose a new method, MR-Local, to infer the causal effect in the existence of possibly invalid IVs. By leveraging the distribution of…

    Submitted 3 February, 2024; originally announced February 2024.

  50. arXiv:2401.04900  [pdf, other

    astro-ph.SR astro-ph.IM cs.LG stat.ML

    SPT: Spectral Transformer for Red Giant Stars Age and Mass Estimation

    Authors: Mengmeng Zhang, Fan Wu, Yude Bu, Shanshan Li, Zhenping Yi, Meng Liu, Xiaoming Kong

    Abstract: The age and mass of red giants are essential for understanding the structure and evolution of the Milky Way. Traditional isochrone methods for these estimations are inherently limited due to overlapping isochrones in the Hertzsprung-Russell diagram, while asteroseismology, though more precise, requires high-precision, long-term observations. In response to these challenges, we developed a novel fr…

    Submitted 9 January, 2024; originally announced January 2024.

    Comments: Accepted by A&A