Showing 1–50 of 439 results for author: Chen, J

Searching in archive stat.
  1. arXiv:2507.04668  [pdf, ps, other]

    stat.ME econ.EM

    Forward Variable Selection in Ultra-High Dimensional Linear Regression Using Gram-Schmidt Orthogonalization

    Authors: Jialuo Chen, Zhaoxing Gao, Ruey S. Tsay

    Abstract: We investigate forward variable selection for ultra-high dimensional linear regression using a Gram-Schmidt orthogonalization procedure. Unlike the commonly used Forward Regression (FR) method, which computes regression residuals using an increasing number of selected features, or the Orthogonal Greedy Algorithm (OGA), which selects variables based on their marginal correlations with the residuals…

    Submitted 7 July, 2025; originally announced July 2025.

  2. arXiv:2506.17968  [pdf, ps, other]

    cs.LG cs.AI cs.CV math.PR stat.ML

    h-calibration: Rethinking Classifier Recalibration with Probabilistic Error-Bounded Objective

    Authors: Wenjian Huang, Guiping Cao, Jiahao Xia, Jingkun Chen, Hao Wang, Jianguo Zhang

    Abstract: Deep neural networks have demonstrated remarkable performance across numerous learning tasks but often suffer from miscalibration, resulting in unreliable probability outputs. This has inspired many recent works on mitigating miscalibration, particularly through post-hoc recalibration methods that aim to obtain calibrated probabilities without sacrificing the classification performance of pre-trai…

    Submitted 22 June, 2025; originally announced June 2025.

  3. arXiv:2506.13955  [pdf, ps, other]

    stat.ML cs.CR cs.LG stat.AP

    Bridging Unsupervised and Semi-Supervised Anomaly Detection: A Theoretically-Grounded and Practical Framework with Synthetic Anomalies

    Authors: Matthew Lau, Tian-Yi Zhou, Xiangchi Yuan, Jizhou Chen, Wenke Lee, Xiaoming Huo

    Abstract: Anomaly detection (AD) is a critical task across domains such as cybersecurity and healthcare. In the unsupervised setting, an effective and theoretically-grounded principle is to train classifiers to distinguish normal data from (synthetic) anomalies. We extend this principle to semi-supervised AD, where training data also include a limited labeled subset of anomalies possibly present in test tim…

    Submitted 16 June, 2025; originally announced June 2025.

  4. arXiv:2506.01502  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning of Population Dynamics: Inverse Optimization Meets JKO Scheme

    Authors: Mikhail Persiianov, Jiawei Chen, Petr Mokrov, Alexander Tyurin, Evgeny Burnaev, Alexander Korotin

    Abstract: Learning population dynamics involves recovering the underlying process that governs particle evolution, given evolutionary snapshots of samples at discrete time points. Recent methods frame this as an energy minimization problem in probability space and leverage the celebrated JKO scheme for efficient time discretization. In this work, we introduce $\texttt{iJKOnet}$, an approach that combines th…

    Submitted 2 June, 2025; originally announced June 2025.

  5. arXiv:2506.00044  [pdf, ps, other]

    stat.AP cs.LG stat.ML

    Probabilistic intraday electricity price forecasting using generative machine learning

    Authors: Jieyu Chen, Sebastian Lerch, Melanie Schienle, Tomasz Serafin, Rafał Weron

    Abstract: The growing importance of intraday electricity trading in Europe calls for improved price forecasting and tailored decision-support tools. In this paper, we propose a novel generative neural network model to generate probabilistic path forecasts for intraday electricity prices and use them to construct effective trading strategies for Germany's continuous-time intraday market. Our method demonstra…

    Submitted 28 May, 2025; originally announced June 2025.

  6. arXiv:2505.08128  [pdf, other]

    stat.ME cs.LG math.ST stat.CO

    Beyond Basic A/B testing: Improving Statistical Efficiency for Business Growth

    Authors: Changshuai Wei, Phuc Nguyen, Benjamin Zelditch, Joyce Chen

    Abstract: Standard A/B testing approaches in large-scale industry applications are mostly based on the t-test. These standard approaches, however, suffer from low statistical power in business settings, owing to small sample sizes, non-Gaussian distributions, or return-on-investment (ROI) considerations. In this paper, we propose several approaches to address these challenges: (i) regression adjustme…

    Submitted 12 May, 2025; originally announced May 2025.

  7. arXiv:2505.00110  [pdf, ps, other]

    stat.ML cs.LG math.NA

    On the expressivity of deep Heaviside networks

    Authors: Insung Kong, Juntong Chen, Sophie Langer, Johannes Schmidt-Hieber

    Abstract: We show that deep Heaviside networks (DHNs) have limited expressiveness but that this can be overcome by including either skip connections or neurons with linear activation. We provide lower and upper bounds for the Vapnik-Chervonenkis (VC) dimensions and approximation rates of these network classes. As an application, we derive statistical convergence rates for DHN fits in the nonparametric regre…

    Submitted 30 April, 2025; originally announced May 2025.

    Comments: 61 pages, 16 figures

  8. arXiv:2504.19979  [pdf, other]

    cs.LG stat.ME

    Transfer Learning Under High-Dimensional Network Convolutional Regression Model

    Authors: Liyuan Wang, Jiachen Chen, Kathryn L. Lunetta, Danyang Huang, Huimin Cheng, Debarghya Mukherjee

    Abstract: Transfer learning enhances model performance by utilizing knowledge from related domains, particularly when labeled data is scarce. While existing research addresses transfer learning under various distribution shifts in independent settings, handling dependencies in networked data remains challenging. To address this challenge, we propose a high-dimensional transfer learning framework based on ne…

    Submitted 29 April, 2025; v1 submitted 28 April, 2025; originally announced April 2025.

  9. arXiv:2504.19953  [pdf, ps, other]

    q-fin.RM math.ST stat.AP

    Marginal expected shortfall: Systemic risk measurement under dependence uncertainty

    Authors: Jinghui Chen, Edward Furman, X. Sheldon Lin

    Abstract: Measuring the contribution of a bank or an insurance company to the overall systemic risk of the market is an important issue, especially in the aftermath of the 2007-2009 financial crisis and the financial downturn of 2020. In this paper, we derive the worst-case and best-case bounds for marginal expected shortfall (MES) -- a key measure of systemic risk contribution -- under the assumption of kn…

    Submitted 28 April, 2025; originally announced April 2025.

  10. arXiv:2504.10373  [pdf, other]

    cs.LG math.DS math.NA stat.ML

    DUE: A Deep Learning Framework and Library for Modeling Unknown Equations

    Authors: Junfeng Chen, Kailiang Wu, Dongbin Xiu

    Abstract: Equations, particularly differential equations, are fundamental for understanding natural phenomena and predicting complex dynamics across various scientific and engineering disciplines. However, the governing equations for many complex systems remain unknown due to intricate underlying mechanisms. Recent advancements in machine learning and data science offer a new paradigm for modeling unknown e…

    Submitted 14 April, 2025; originally announced April 2025.

    Comments: 28 pages

  11. arXiv:2504.03753  [pdf, other]

    cs.LG stat.ME

    MMCE: A Framework for Deep Monotonic Modeling of Multiple Causal Effects

    Authors: Juhua Chen, Karson shi, Jialing He, North Chen, Kele Jiang

    Abstract: When we plan to use money as an incentive to change a person's behavior (such as getting riders to deliver more orders or consumers to buy more items), the common approach to this problem is to adopt a two-stage framework in order to maximize ROI under cost constraints. In the first stage, the individual price response curve is obtained. In the second stage, business goals and resource…

    Submitted 1 April, 2025; originally announced April 2025.

  12. arXiv:2503.23524  [pdf, ps, other]

    econ.EM stat.ME

    Reinterpreting demand estimation

    Authors: Jiafeng Chen

    Abstract: This paper connects the literature on demand estimation to the literature on causal inference by interpreting nonparametric structural assumptions as restrictions on counterfactual outcomes. It offers nontrivial and equivalent restatements of key demand estimation assumptions in the Neyman-Rubin potential outcomes model, for both settings with market-level data (Berry and Haile, 2014) and settings…

    Submitted 30 March, 2025; originally announced March 2025.

  13. arXiv:2503.19095  [pdf, other]

    econ.EM stat.ME

    Empirical Bayes shrinkage (mostly) does not correct the measurement error in regression

    Authors: Jiafeng Chen, Jiaying Gu, Soonwoo Kwon

    Abstract: In the value-added literature, it is often claimed that regressing on empirical Bayes shrinkage estimates corrects for the measurement error problem in linear regression. We clarify the conditions needed; we argue that these conditions are stronger than those needed for classical measurement error correction, which we advocate for instead. Moreover, we show that the classical estimator cannot…

    Submitted 24 March, 2025; originally announced March 2025.

  14. arXiv:2503.07664  [pdf]

    q-bio.QM cs.IR cs.LG stat.AP

    Antibiotic Resistance Microbiology Dataset (ARMD): A De-identified Resource for Studying Antimicrobial Resistance Using Electronic Health Records

    Authors: Fateme Nateghi Haredasht, Fatemeh Amrollahi, Manoj Maddali, Nicholas Marshall, Stephen P. Ma, Lauren N. Cooper, Richard J. Medford, Sanjat Kanjilal, Niaz Banaei, Stanley Deresinski, Mary K. Goldstein, Steven M. Asch, Amy Chang, Jonathan H. Chen

    Abstract: The Antibiotic Resistance Microbiology Dataset (ARMD) is a de-identified resource derived from electronic health records (EHR) that facilitates research into antimicrobial resistance (AMR). ARMD encompasses data from adult patients, focusing on microbiological cultures, antibiotic susceptibilities, and associated clinical and demographic features. Key attributes include organism identification, su…

    Submitted 8 March, 2025; originally announced March 2025.

  15. arXiv:2503.04453  [pdf]

    stat.ML cs.LG physics.med-ph

    Reproducibility Assessment of Magnetic Resonance Spectroscopy of Pregenual Anterior Cingulate Cortex across Sessions and Vendors via the Cloud Computing Platform CloudBrain-MRS

    Authors: Runhan Chen, Meijin Lin, Jianshu Chen, Liangjie Lin, Jiazheng Wang, Xiaoqing Li, Jianhua Wang, Xu Huang, Ling Qian, Shaoxing Liu, Yuan Long, Di Guo, Xiaobo Qu, Haiwei Han

    Abstract: Given the need to elucidate the mechanisms underlying illnesses and their treatment, as well as the lack of harmonization of acquisition and post-processing protocols among different magnetic resonance system vendors, this work aims to determine whether metabolite concentrations obtained from different sessions, machine models and even different vendors of 3 T scanners can be highly reproducible and be p…

    Submitted 6 March, 2025; originally announced March 2025.

  16. arXiv:2502.09467  [pdf, other]

    stat.ME

    Just Trial Once: Ongoing Causal Validation of Machine Learning Models

    Authors: Jacob M. Chen, Michael Oberst

    Abstract: Machine learning (ML) models are increasingly used as decision-support tools in high-risk domains. Evaluating the causal impact of deploying such models can be done with a randomized controlled trial (RCT) that randomizes users to ML vs. control groups and assesses the effect on relevant outcomes. However, ML models are inevitably updated over time, and we often lack evidence for the causal impact…

    Submitted 13 February, 2025; originally announced February 2025.

    Comments: 27 pages

  17. arXiv:2502.06398  [pdf, other]

    cs.LG stat.ML

    Learning Counterfactual Outcomes Under Rank Preservation

    Authors: Peng Wu, Haoxuan Li, Chunyuan Zheng, Yan Zeng, Jiawei Chen, Yang Liu, Ruocheng Guo, Kun Zhang

    Abstract: Counterfactual inference aims to estimate the counterfactual outcome at the individual level given knowledge of an observed treatment and the factual outcome, with broad applications in fields such as epidemiology, econometrics, and management science. Previous methods rely on a known structural causal model (SCM) or assume the homogeneity of the exogenous variable and strict monotonicity between…

    Submitted 10 February, 2025; originally announced February 2025.

  18. arXiv:2501.15955  [pdf, other]

    cs.LG cs.CV stat.ML

    Rethinking the Bias of Foundation Model under Long-tailed Distribution

    Authors: Jiahao Chen, Bin Qin, Jiangmeng Li, Hao Chen, Bing Su

    Abstract: Long-tailed learning has garnered increasing attention due to its practical significance. Among the various approaches, the fine-tuning paradigm has gained considerable interest with the advent of foundation models. However, most existing methods primarily focus on leveraging knowledge from these models, overlooking the inherent biases introduced by the imbalanced training data they rely on. In th…

    Submitted 4 May, 2025; v1 submitted 27 January, 2025; originally announced January 2025.

    Comments: Published as a conference paper in ICML 2025

  19. arXiv:2501.14107  [pdf, other]

    stat.ML cs.LG

    EFiGP: Eigen-Fourier Physics-Informed Gaussian Process for Inference of Dynamic Systems

    Authors: Jianhong Chen, Shihao Yang

    Abstract: Parameter estimation and trajectory reconstruction for data-driven dynamical systems governed by ordinary differential equations (ODEs) are essential tasks in fields such as biology, engineering, and physics. These inverse problems -- estimating ODE parameters from observational data -- are particularly challenging when the data are noisy, sparse, and the dynamics are nonlinear. We propose the Eig…

    Submitted 23 January, 2025; originally announced January 2025.

  20. arXiv:2501.06559  [pdf, other]

    stat.ME

    Design and analysis for constrained order-of-addition experiments

    Authors: Jianbin Chen, Dennis K. J. Lin, Nicholas Rios, Xueru Zhang

    Abstract: In an order-of-addition (OofA) experiment, the sequence of m different components can significantly impact the experiment's response. In many OofA experiments, the components are subject to constraints, where certain orders are impossible. For example, in survey design and job scheduling, the components are often arranged into groups, and these groups of components must be placed in a fixed order.…

    Submitted 11 January, 2025; originally announced January 2025.

  21. arXiv:2501.03155  [pdf, other]

    stat.ME

    powerROC: An Interactive Web Tool for Sample Size Calculation in Assessing Models' Discriminative Abilities

    Authors: François Grolleau, Robert Tibshirani, Jonathan H. Chen

    Abstract: Rigorous external validation is crucial for assessing the generalizability of prediction models, particularly by evaluating their discrimination (AUROC) on new data. This often involves comparing a new model's AUROC to that of an established reference model. However, many studies rely on arbitrary rules of thumb for sample size calculations, often resulting in underpowered analyses and unreliable…

    Submitted 6 January, 2025; originally announced January 2025.

  22. arXiv:2501.02197  [pdf, other]

    stat.ML cs.LG stat.CO

    Majorization-Minimization Dual Stagewise Algorithm for Generalized Lasso

    Authors: Jianmin Chen, Kun Chen

    Abstract: The generalized lasso is a natural generalization of the celebrated lasso approach to handle structural regularization problems. Many important methods and applications fall into this framework, including fused lasso, clustered lasso, and constrained lasso. To elevate its effectiveness in large-scale problems, extensive research has been conducted on the computational strategies of generalized las…

    Submitted 4 January, 2025; originally announced January 2025.

  23. arXiv:2412.09814  [pdf, other]

    cs.LG cs.AI stat.CO

    Federated Learning of Dynamic Bayesian Network via Continuous Optimization from Time Series Data

    Authors: Jianhong Chen, Ying Ma, Xubo Yue

    Abstract: Traditionally, learning the structure of a Dynamic Bayesian Network has been centralized, requiring all data to be pooled in one location. However, in real-world scenarios, data are often distributed across multiple entities (e.g., companies, devices) that seek to collaboratively learn a Dynamic Bayesian Network while preserving data privacy and security. More importantly, due to the presence of d…

    Submitted 5 February, 2025; v1 submitted 12 December, 2024; originally announced December 2024.

    Comments: 34 pages

  24. arXiv:2412.07081  [pdf, other]

    stat.ML cs.AI cs.LG

    Sequential Controlled Langevin Diffusions

    Authors: Junhua Chen, Lorenz Richter, Julius Berner, Denis Blessing, Gerhard Neumann, Anima Anandkumar

    Abstract: An effective approach for sampling from unnormalized densities is based on the idea of gradually transporting samples from an easy prior to the complicated target distribution. Two popular methods are (1) Sequential Monte Carlo (SMC), where the transport is performed through successive annealed densities via prescribed Markov chains and resampling steps, and (2) recently developed diffusion-based…

    Submitted 9 December, 2024; originally announced December 2024.

  25. arXiv:2411.19666  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG stat.AP

    Multimodal Whole Slide Foundation Model for Pathology

    Authors: Tong Ding, Sophia J. Wagner, Andrew H. Song, Richard J. Chen, Ming Y. Lu, Andrew Zhang, Anurag J. Vaidya, Guillaume Jaume, Muhammad Shaban, Ahrong Kim, Drew F. K. Williamson, Bowen Chen, Cristina Almagro-Perez, Paul Doucet, Sharifa Sahai, Chengkuan Chen, Daisuke Komura, Akihiro Kawabe, Shumpei Ishikawa, Georg Gerber, Tingying Peng, Long Phi Le, Faisal Mahmood

    Abstract: The field of computational pathology has been transformed with recent advances in foundation models that encode histopathology region-of-interests (ROIs) into versatile and transferable feature representations via self-supervised learning (SSL). However, translating these advancements to address complex clinical challenges at the patient and slide level remains constrained by limited clinical data…

    Submitted 29 November, 2024; originally announced November 2024.

    Comments: The code is accessible at https://github.com/mahmoodlab/TITAN

  26. arXiv:2411.16666  [pdf, ps, other]

    stat.ML cs.AI cs.LG q-fin.ST

    CatNet: Controlling the False Discovery Rate in LSTM with SHAP Feature Importance and Gaussian Mirrors

    Authors: Jiaan Han, Junxiao Chen, Yanzhe Fu

    Abstract: We introduce CatNet, an algorithm that effectively controls False Discovery Rate (FDR) and selects significant features in LSTM. CatNet employs the derivative of SHAP values to quantify the feature importance, and constructs a vector-formed mirror statistic for FDR control with the Gaussian Mirror algorithm. To avoid instability due to nonlinear or temporal correlations among features, we also pro…

    Submitted 4 June, 2025; v1 submitted 25 November, 2024; originally announced November 2024.

  27. arXiv:2411.16277  [pdf, other]

    econ.GN cs.CE cs.CR q-fin.CP stat.ML

    FinML-Chain: A Blockchain-Integrated Dataset for Enhanced Financial Machine Learning

    Authors: Jingfeng Chen, Wanlin Deng, Dangxing Chen, Luyao Zhang

    Abstract: Machine learning is critical for innovation and efficiency in financial markets, offering predictive models and data-driven decision-making. However, challenges such as missing data, lack of transparency, untimely updates, insecurity, and incompatible data sources limit its effectiveness. Blockchain technology, with its transparency, immutability, and real-time updates, addresses these challenges.…

    Submitted 25 November, 2024; originally announced November 2024.

  28. arXiv:2411.12726  [pdf, other]

    math.NA cs.LG stat.CO stat.ML

    LazyDINO: Fast, scalable, and efficiently amortized Bayesian inversion via structure-exploiting and surrogate-driven measure transport

    Authors: Lianghao Cao, Joshua Chen, Michael Brennan, Thomas O'Leary-Roseberry, Youssef Marzouk, Omar Ghattas

    Abstract: We present LazyDINO, a transport map variational inference method for fast, scalable, and efficiently amortized solutions of high-dimensional nonlinear Bayesian inverse problems with expensive parameter-to-observable (PtO) maps. Our method consists of an offline phase in which we construct a derivative-informed neural surrogate of the PtO map using joint samples of the PtO map and its Jacobian. Du…

    Submitted 19 November, 2024; originally announced November 2024.

  29. arXiv:2411.01326  [pdf, other]

    cs.LG stat.ML

    Generalized Eigenvalue Problems with Generative Priors

    Authors: Zhaoqiang Liu, Wen Li, Junren Chen

    Abstract: Generalized eigenvalue problems (GEPs) find applications in various fields of science and engineering. For example, principal component analysis, Fisher's discriminant analysis, and canonical correlation analysis are specific instances of GEPs and are widely used in statistical data processing. In this work, we study GEPs under generative priors, assuming that the underlying leading generalized ei…

    Submitted 2 November, 2024; originally announced November 2024.

  30. arXiv:2410.23087  [pdf, other]

    cs.DS cs.LG stat.ML

    Statistical-Computational Trade-offs for Density Estimation

    Authors: Anders Aamand, Alexandr Andoni, Justin Y. Chen, Piotr Indyk, Shyam Narayanan, Sandeep Silwal, Haike Xu

    Abstract: We study the density estimation problem defined as follows: given $k$ distributions $p_1, \ldots, p_k$ over a discrete domain $[n]$, as well as a collection of samples chosen from a "query" distribution $q$ over $[n]$, output $p_i$ that is "close" to $q$. Recently, \cite{aamand2023data} gave the first and only known result that achieves sublinear bounds in both the sampling complexity and…

    Submitted 30 October, 2024; originally announced October 2024.

    Comments: To appear at NeurIPS 2024

  31. arXiv:2410.20068  [pdf, other]

    cs.LG math.ST stat.ML

    Understanding the Effect of GCN Convolutions in Regression Tasks

    Authors: Juntong Chen, Johannes Schmidt-Hieber, Claire Donnat, Olga Klopp

    Abstract: Graph Convolutional Networks (GCNs) have become a pivotal method in machine learning for modeling functions over graphs. Despite their widespread success across various applications, their statistical properties (e.g., consistency, convergence rates) remain ill-characterized. To begin addressing this knowledge gap, we consider networks for which the graph structure implies that neighboring nodes e…

    Submitted 16 April, 2025; v1 submitted 26 October, 2024; originally announced October 2024.

    Comments: 25 pages

    MSC Class: 62G08; 68R10

  32. arXiv:2410.06591  [pdf]

    stat.AP

    Decentralized Clinical Trials in the Era of Real-World Evidence: A Statistical Perspective

    Authors: Jie Chen, Junrui Di, Nadia Daizadeh, Ying Lu, Hongwei Wang, Yuan-Li Shen, Jennifer Kirk, Frank W. Rockhold, Herbert Pang, Jing Zhao, Weili He, Andrew Potter, Hana Lee

    Abstract: There has been a growing trend for activities relating to clinical trials to take place at locations other than traditional trial sites (hence decentralized clinical trials or DCTs), some of which are in settings of real-world clinical practice. Although DCTs offer numerous benefits, they also have implications for a number of issues relating to the design, conduct, and analysis of DCTs…

    Submitted 9 October, 2024; originally announced October 2024.

  33. arXiv:2410.06586  [pdf]

    stat.AP

    Use of Real-World Data and Real-World Evidence in Rare Disease Drug Development: A Statistical Perspective

    Authors: Jie Chen, Susan Gruber, Hana Lee, Haitao Chu, Shiowjen Lee, Haijun Tian, Yan Wang, Weili He, Thomas Jemielita, Yang Song, Roy Tamura, Lu Tian, Yihua Zhao, Yong Chen, Mark van der Laan, Lei Nie

    Abstract: Real-world data (RWD) and real-world evidence (RWE) have been increasingly used in medical product development and regulatory decision-making, especially for rare diseases. After outlining the challenges and possible strategies to address the challenges in rare disease drug development (see the accompanying paper), the Real-World Evidence (RWE) Scientific Working Group of the American Statistical…

    Submitted 9 October, 2024; originally announced October 2024.

  34. arXiv:2410.06585  [pdf]

    stat.AP

    Challenges and Possible Strategies to Address Them in Rare Disease Drug Development: A Statistical Perspective

    Authors: Jie Chen, Lei Nie, Shiowjen Lee, Haitao Chu, Haijun Tian, Yan Wang, Weili He, Thomas Jemielita, Susan Gruber, Yang Song, Roy Tamura, Lu Tian, Yihua Zhao, Yong Chen, Mark van der Laan, Hana Lee

    Abstract: Developing drugs for rare diseases presents unique challenges from a statistical perspective. These challenges may include slowly progressive diseases with unmet medical needs, poorly understood natural history, small population size, diversified phenotypes and genotypes within a disorder, and lack of appropriate surrogate endpoints to measure clinical benefits. The Real-World Evidence (RWE) Scie…

    Submitted 9 October, 2024; originally announced October 2024.

  35. arXiv:2410.05760  [pdf, other]

    cs.CV cs.AI cs.LG math.OC stat.ML

    Training-free Diffusion Model Alignment with Sampling Demons

    Authors: Po-Hung Yeh, Kuang-Huei Lee, Jun-Cheng Chen

    Abstract: Aligning diffusion models with user preferences has been a key challenge. Existing methods for aligning diffusion models either require retraining or are limited to differentiable reward functions. To address these limitations, we propose a stochastic optimization approach, dubbed Demon, to guide the denoising process at inference time without backpropagation through reward functions or model retr…

    Submitted 27 February, 2025; v1 submitted 8 October, 2024; originally announced October 2024.

    Comments: 35 pages

    Journal ref: Proceedings of the Thirteenth International Conference on Learning Representations (ICLR 2025)

  36. arXiv:2410.04870  [pdf, other]

    cs.LG stat.ML

    On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent

    Authors: Bingrui Li, Wei Huang, Andi Han, Zhanpeng Zhou, Taiji Suzuki, Jun Zhu, Jianfei Chen

    Abstract: The Adam optimizer is widely used for transformer optimization in practice, which makes understanding the underlying optimization mechanisms an important problem. However, due to Adam's complexity, theoretical analysis of how it optimizes transformers remains a challenging task. Fortunately, Sign Gradient Descent (SignGD) serves as an effective surrogate for Adam. Despite its simplicity, theor…

    Submitted 2 March, 2025; v1 submitted 7 October, 2024; originally announced October 2024.

    Comments: 79 pages, 19 figures, ICLR 2025 Spotlight

  37. arXiv:2410.00373  [pdf, other]

    cs.LG cs.AI cs.DB stat.ML

    Robust Traffic Forecasting against Spatial Shift over Years

    Authors: Hongjun Wang, Jiyuan Chen, Tong Pan, Zheng Dong, Lingyu Zhang, Renhe Jiang, Xuan Song

    Abstract: Recent advancements in Spatiotemporal Graph Neural Networks (ST-GNNs) and Transformers have demonstrated promising potential for traffic forecasting by effectively capturing both temporal and spatial correlations. The generalization ability of spatiotemporal models has received considerable attention in recent scholarly discourse. However, no substantive datasets specifically addressing traffic ou…

    Submitted 30 September, 2024; originally announced October 2024.

  38. arXiv:2409.18371  [pdf, other]

    stat.ML cs.LG stat.CO

    A Model-Constrained Discontinuous Galerkin Network (DGNet) for Compressible Euler Equations with Out-of-Distribution Generalization

    Authors: Hai V. Nguyen, Jau-Uei Chen, Tan Bui-Thanh

    Abstract: Real-time accurate solutions of large-scale complex dynamical systems are critically needed for control, optimization, uncertainty quantification, and decision-making in practical engineering and science applications, particularly in digital twin contexts. In this work, we develop a model-constrained discontinuous Galerkin Network (DGNet) approach, a significant extension to our previous work [Mod…

    Submitted 4 December, 2024; v1 submitted 26 September, 2024; originally announced September 2024.

  39. arXiv:2409.08521  [pdf, other]

    stat.ML cs.CR cs.LG math.ST

    Optimal Classification-based Anomaly Detection with Neural Networks: Theory and Practice

    Authors: Tian-Yi Zhou, Matthew Lau, Jizhou Chen, Wenke Lee, Xiaoming Huo

    Abstract: Anomaly detection is an important problem in many application areas, such as network security. Many deep learning methods for unsupervised anomaly detection produce good empirical performance but lack theoretical guarantees. By casting anomaly detection into a binary classification problem, we establish non-asymptotic upper bounds and a convergence rate on the excess risk on rectified linear unit…

    Submitted 12 September, 2024; originally announced September 2024.

  40. arXiv:2408.06615  [pdf, other]

    math.NA stat.CO

    Gaussian mixture Taylor approximations of risk measures constrained by PDEs with Gaussian random field inputs

    Authors: Dingcheng Luo, Joshua Chen, Peng Chen, Omar Ghattas

    Abstract: This work considers the computation of risk measures for quantities of interest governed by PDEs with Gaussian random field parameters using Taylor approximations. While efficient, Taylor approximations are local to the point of expansion, and hence may degrade in accuracy when the variances of the input parameters are large. To address this challenge, we approximate the underlying Gaussian measur…

    Submitted 13 August, 2024; originally announced August 2024.

    Comments: 34 Pages, 13 Figures, 1 Table

    MSC Class: 65D32 (Primary) 35R60; 41A30; 65C20; 68U05 (Secondary)

  41. arXiv:2407.21119  [pdf, other]

    econ.EM stat.ME

    Potential weights and implicit causal designs in linear regression

    Authors: Jiafeng Chen

    Abstract: When we interpret linear regression estimates as causal effects justified by quasi-experiments, what do we mean? This paper characterizes the necessary implications when researchers ascribe a design-based interpretation to a given regression. To do so, we define a notion of potential weights, which encode counterfactual decisions a given regression makes to unobserved potential outcomes. A plausib…

    Submitted 13 January, 2025; v1 submitted 30 July, 2024; originally announced July 2024.

  42. arXiv:2407.13980  [pdf, other]

    stat.ME cs.LG stat.ML

    Byzantine-tolerant distributed learning of finite mixture models

    Authors: Qiong Zhang, Yan Shuo Tan, Jiahua Chen

    Abstract: Traditional statistical methods need to be updated to work with modern distributed data storage paradigms. A common approach is the split-and-conquer framework, which involves learning models on local machines and averaging their parameter estimates. However, this does not work for the important problem of learning finite mixture models, because subpopulation indices on each local machine may be a…

    Submitted 10 March, 2025; v1 submitted 18 July, 2024; originally announced July 2024.

    ACM Class: G.3; I.5.3

  43. arXiv:2407.08382  [pdf, ps, other]

    stat.ME

    Adjusting for Participation Bias in Case-Control Genetic Association Studies for Rare Diseases

    Authors: Le Wang, Zhengbang Li, Ben Fitzpatrick, Clarice Weinberg, Jinbo Chen

    Abstract: Collection of genotype data in case-control genetic association studies may often be incomplete for reasons related to genes themselves. This non-ignorable missingness structure, if not appropriately accounted for, can result in participation bias in association analyses. To deal with this issue, Chen et al. (2016) proposed to collect additional genetic information from family members of individua…

    Submitted 11 July, 2024; originally announced July 2024.

  44. arXiv:2407.00224  [pdf, other]

    cs.CV stat.AP

    Multimodal Prototyping for cancer survival prediction

    Authors: Andrew H. Song, Richard J. Chen, Guillaume Jaume, Anurag J. Vaidya, Alexander S. Baras, Faisal Mahmood

    Abstract: Multimodal survival methods combining gigapixel histology whole-slide images (WSIs) and transcriptomic profiles are particularly promising for patient prognostication and stratification. Current approaches involve tokenizing the WSIs into smaller patches (>10,000 patches) and transcriptomics into gene groups, which are then integrated using a Transformer for predicting outcomes. However, this proc…

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: ICML 2024

  45. arXiv:2406.16988  [pdf, other]

    cs.LG stat.ML

    MD tree: a model-diagnostic tree grown on loss landscape

    Authors: Yefan Zhou, Jianlong Chen, Qinxue Cao, Konstantin Schürholt, Yaoqing Yang

    Abstract: This paper considers "model diagnosis", which we formulate as a classification problem. Given a pre-trained neural network (NN), the goal is to predict the source of failure from a set of failure modes (such as a wrong hyperparameter, inadequate model size, and insufficient data) without knowing the training configuration of the pre-trained NN. The conventional diagnosis approach uses training and…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: ICML 2024, first two authors contributed equally

    Journal ref: Proceedings of the 41st International Conference on Machine Learning, PMLR 235:61825-61853, 2024

  46. arXiv:2406.12531  [pdf, other]

    cs.LG stat.ML

    TREE: Tree Regularization for Efficient Execution

    Authors: Lena Schmid, Daniel Biebert, Christian Hakert, Kuan-Hsun Chen, Michel Lang, Markus Pauly, Jian-Jia Chen

    Abstract: The rise of machine learning methods on heavily resource constrained devices requires not only the choice of a suitable model architecture for the target platform, but also the optimization of the chosen model with regard to execution time consumption for inference in order to optimally utilize the available resources. Random forests and decision trees are shown to be a suitable model for such a s…

    Submitted 18 June, 2024; originally announced June 2024.

  47. arXiv:2406.05592  [pdf, other]

    stat.ME

    Constrained Design of a Binary Instrument in a Partially Linear Model

    Authors: Tim Morrison, Minh Nguyen, Jonathan Chen, Michael Baiocchi, Art B. Owen

    Abstract: We study the question of how best to assign an encouragement in a randomized encouragement study. In our setting, units arrive with covariates, receive a nudge toward treatment or control, acquire one of those statuses in a way that need not align with the nudge, and finally have a response observed. The nudge can be modeled as a binary instrument if one assumes that it affects the response only v…

    Submitted 8 May, 2025; v1 submitted 8 June, 2024; originally announced June 2024.

    Comments: 29 pages, 6 figures

  48. arXiv:2406.01252  [pdf, other]

    cs.CL cs.AI stat.ML

    Towards Scalable Automated Alignment of LLMs: A Survey

    Authors: Boxi Cao, Keming Lu, Xinyu Lu, Jiawei Chen, Mengjie Ren, Hao Xiang, Peilin Liu, Yaojie Lu, Ben He, Xianpei Han, Le Sun, Hongyu Lin, Bowen Yu

    Abstract: Alignment is the most critical step in building large language models (LLMs) that meet human needs. With the rapid development of LLMs gradually surpassing human capabilities, traditional alignment methods based on human annotation are increasingly unable to meet the scalability demands. Therefore, there is an urgent need to explore new sources of automated alignment signals and technical approach…

    Submitted 3 September, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Paper List: https://github.com/cascip/awesome-auto-alignment

  49. arXiv:2405.17490  [pdf, other]

    cs.LG stat.ML

    Revisit, Extend, and Enhance Hessian-Free Influence Functions

    Authors: Ziao Yang, Han Yue, Jian Chen, Hongfu Liu

    Abstract: Influence functions serve as crucial tools for assessing sample influence in model interpretation, subset training set selection, noisy label detection, and more. By employing the first-order Taylor expansion, influence functions can estimate sample influence without the need for expensive model retraining. However, applying influence functions directly to deep models presents challenges, primaril…

    Submitted 20 October, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

  50. arXiv:2405.16672  [pdf, other]

    stat.ML cs.LG stat.ME

    Transfer Learning Under High-Dimensional Graph Convolutional Regression Model for Node Classification

    Authors: Jiachen Chen, Danyang Huang, Liyuan Wang, Kathryn L. Lunetta, Debarghya Mukherjee, Huimin Cheng

    Abstract: Node classification is a fundamental task, but obtaining node classification labels can be challenging and expensive in many real-world scenarios. Transfer learning has emerged as a promising solution to address this challenge by leveraging knowledge from source domains to enhance learning in a target domain. Existing transfer learning methods for node classification primarily focus on integrating…

    Submitted 26 May, 2024; originally announced May 2024.