Showing 1–50 of 186 results for author: Zhang, B

Searching in archive stat.
  1. arXiv:2507.02929  [pdf, ps, other]

    cs.CV cs.AI cs.LG stat.ML

    OBSER: Object-Based Sub-Environment Recognition for Zero-Shot Environmental Inference

    Authors: Won-Seok Choi, Dong-Sig Han, Suhyung Choi, Hyeonseo Yang, Byoung-Tak Zhang

    Abstract: We present the Object-Based Sub-Environment Recognition (OBSER) framework, a novel Bayesian framework that infers three fundamental relationships between sub-environments and their constituent objects. In the OBSER framework, metric and self-supervised learning models estimate the object distributions of sub-environments on the latent space to compute these measures. Both theoretically and empiric…

    Submitted 26 June, 2025; originally announced July 2025.

    Comments: This manuscript was initially submitted to ICCV 2025 and is now made available as a preprint

  2. arXiv:2505.23456  [pdf, ps, other]

    math.NA stat.CO

    Particle exchange Monte Carlo methods for eigenfunction and related nonlinear problems

    Authors: Paul Dupuis, Benjamin J. Zhang

    Abstract: We introduce and develop a novel particle exchange Monte Carlo method. Whereas existing methods apply to eigenfunction problems where the eigenvalue is known (e.g., integrals with respect to a Gibbs measure, which can be interpreted as corresponding to eigenvalue zero), here the focus is on problems where the eigenvalue is not known a priori. To obtain an appropriate particle exchange rule we must…

    Submitted 29 May, 2025; originally announced May 2025.

  3. arXiv:2505.12695  [pdf, other]

    stat.ME

    Pseudo-Likelihood Ratio Screening based on Network Data with Applications

    Authors: Wei Hu, Danyang Huang, Bo Zhang

    Abstract: Social network platforms today generate vast amounts of data, including network structures and a large number of user-defined tags, which reflect users' interests. The dimensionality of these personalized tags can be ultra-high, posing challenges for model analysis in targeted preference analysis. Traditional categorical feature screening methods overlook the network structure, which can lead to i…

    Submitted 19 May, 2025; originally announced May 2025.

  4. arXiv:2505.12097  [pdf, ps, other]

    math.OC math.PR stat.ME stat.ML

    Proximal optimal transport divergences

    Authors: Ricardo Baptista, Panagiota Birmpa, Markos A. Katsoulakis, Luc Rey-Bellet, Benjamin J. Zhang

    Abstract: We introduce proximal optimal transport divergence, a novel discrepancy measure that interpolates between information divergences and optimal transport distances via an infimal convolution formulation. This divergence provides a principled foundation for optimal transport proximals and proximal optimization methods frequently used in generative modeling. We explore its mathematical properties, inc…

    Submitted 17 May, 2025; originally announced May 2025.

  5. arXiv:2505.01357  [pdf, other]

    stat.ME math.ST

    Weight-calibrated estimation for factor models of high-dimensional time series

    Authors: Xinghao Qiao, Zihan Wang, Qiwei Yao, Bo Zhang

    Abstract: The factor modeling for high-dimensional time series is powerful in discovering latent common components for dimension reduction and information extraction. Most available estimation methods can be divided into two categories: the covariance-based under asymptotically-identifiable assumption and the autocovariance-based with white idiosyncratic noise. This paper follows the autocovariance-based fr…

    Submitted 4 May, 2025; v1 submitted 2 May, 2025; originally announced May 2025.

    Comments: This version includes the supplementary material of the paper

  6. arXiv:2504.09210  [pdf, ps, other]

    cs.LG cs.AI cs.SI stat.ML

    FairACE: Achieving Degree Fairness in Graph Neural Networks via Contrastive and Adversarial Group-Balanced Training

    Authors: Jiaxin Liu, Xiaoqian Jiang, Xiang Li, Bohan Zhang, Jing Zhang

    Abstract: Fairness has been a significant challenge in graph neural networks (GNNs) since degree biases often result in unequal prediction performance among nodes with varying degrees. Existing GNN models focus on prediction accuracy, frequently overlooking fairness across different degree groups. To address this issue, we propose a novel GNN framework, namely Fairness-Aware Asymmetric Contrastive Ensemble…

    Submitted 14 April, 2025; v1 submitted 12 April, 2025; originally announced April 2025.

  7. arXiv:2504.04020  [pdf, other]

    stat.ME

    Leveraging Shared Factor Structures for Enhanced Matrix Completion with Nonconvex Penalty Regularization

    Authors: Yuanhong A, Xinyan Fan, Bingyi Jing, Bo Zhang

    Abstract: This article investigates the problem of noisy low-rank matrix completion with a shared factor structure, leveraging the auxiliary information from the missing indicator matrix to enhance prediction accuracy. Despite decades of development in matrix completion, the potential relationship between observed data and missing indicators has largely been overlooked. To address this gap, we propose a joi…

    Submitted 4 April, 2025; originally announced April 2025.

  8. arXiv:2504.04016  [pdf, ps, other]

    stat.ML cs.LG

    Computational Efficient and Minimax Optimal Nonignorable Matrix Completion

    Authors: Yuanhong A, Guoyu Zhang, Yongcheng Zeng, Bo Zhang

    Abstract: While the matrix completion problem has attracted considerable attention over the decades, few works address the nonignorable missing issue and all have their limitations. In this article, we propose a nuclear norm regularized row- and column-wise matrix U-statistic loss function for the generalized nonignorable missing mechanism, a flexible and generally applicable missing mechanism which contain…

    Submitted 26 June, 2025; v1 submitted 4 April, 2025; originally announced April 2025.

  9. arXiv:2504.02618  [pdf, other]

    cs.LG stat.ML

    Variational Online Mirror Descent for Robust Learning in Schrödinger Bridge

    Authors: Dong-Sig Han, Jaein Kim, Hee Bin Yoo, Byoung-Tak Zhang

    Abstract: Schrödinger bridge (SB) has evolved into a universal class of probabilistic generative models. In practice, however, estimated learning signals are often uncertain, and the reliability promised by existing methods is often based on speculative optimal-case scenarios. Recent studies regarding the Sinkhorn algorithm through mirror descent (MD) have gained attention, revealing geometric insights into…

    Submitted 8 April, 2025; v1 submitted 3 April, 2025; originally announced April 2025.

  10. arXiv:2503.24012  [pdf, other]

    cs.LG stat.CO

    Tree-Guided $L_1$-Convex Clustering

    Authors: Bingyuan Zhang, Yoshikazu Terada

    Abstract: Convex clustering is a modern clustering framework that guarantees globally optimal solutions and performs comparably to other advanced clustering methods. However, obtaining a complete dendrogram (clusterpath) for large-scale datasets remains computationally challenging due to the extensive costs associated with iterative optimization approaches. To address this limitation, we develop a novel con…

    Submitted 31 March, 2025; originally announced March 2025.

  11. arXiv:2502.18582  [pdf, ps, other]

    stat.ML cs.GT cs.LG

    Learning and Computation of $Φ$-Equilibria at the Frontier of Tractability

    Authors: Brian Hu Zhang, Ioannis Anagnostides, Emanuel Tewolde, Ratip Emin Berker, Gabriele Farina, Vincent Conitzer, Tuomas Sandholm

    Abstract: $Φ$-equilibria -- and the associated notion of $Φ$-regret -- are a powerful and flexible framework at the heart of online learning and game theory, whereby enriching the set of deviations $Φ$ begets stronger notions of rationality. Recently, Daskalakis, Farina, Fishelson, Pipis, and Schneider (STOC '24) -- abbreviated as DFFPS -- settled the existence of efficient algorithms when $Φ…

    Submitted 27 February, 2025; v1 submitted 25 February, 2025; originally announced February 2025.

  12. arXiv:2502.17214  [pdf, ps, other]

    cs.CL cs.LG stat.ML

    CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought

    Authors: Boxuan Zhang, Ruqi Zhang

    Abstract: Large language models (LLMs) excel in many tasks but struggle to accurately quantify uncertainty in their generated responses. This limitation makes it challenging to detect misinformation and ensure reliable decision-making. Existing uncertainty quantification (UQ) methods for LLMs are primarily prompt-wise rather than response-wise, often requiring multiple response samples, which incurs high co…

    Submitted 3 June, 2025; v1 submitted 24 February, 2025; originally announced February 2025.

    Comments: Accepted by ACL 2025 Findings

  13. arXiv:2502.16888  [pdf, ps, other]

    stat.ME stat.ML

    Functional BART with Shape Priors: A Bayesian Tree Approach to Constrained Functional Regression

    Authors: Jiahao Cao, Shiyuan He, Bohai Zhang

    Abstract: Motivated by the remarkable success of Bayesian additive regression trees (BART) in regression modelling, we propose a novel nonparametric Bayesian method, termed Functional BART (FBART), tailored specifically for function-on-scalar regression. FBART leverages spline-based representations for functional responses coupled with a flexible tree-based partitioning structure, effectively capturing comp…

    Submitted 1 June, 2025; v1 submitted 24 February, 2025; originally announced February 2025.

  14. arXiv:2502.08776  [pdf, other]

    stat.ME cs.LG stat.ML

    Treatment response as a latent variable

    Authors: Christopher Tosh, Boyuan Zhang, Wesley Tansey

    Abstract: Scientists often need to analyze the samples in a study that responded to treatment in order to refine their hypotheses and find potential causal drivers of response. Natural variation in outcomes makes teasing apart responders from non-responders a statistical inference problem. To handle latent responses, we introduce the causal two-groups (C2G) model, a causal extension of the classical two-gro…

    Submitted 12 February, 2025; originally announced February 2025.

  15. arXiv:2502.02552  [pdf, other]

    cs.LG q-bio.BM stat.AP stat.CO stat.ME

    Hierarchical Sparse Bayesian Multitask Model with Scalable Inference for Microbiome Analysis

    Authors: Haonan Zhu, Andre R. Goncalves, Camilo Valdes, Hiranmayi Ranganathan, Boya Zhang, Jose Manuel Martí, Car Reen Kok, Monica K. Borucki, Nisha J. Mulakken, James B. Thissen, Crystal Jaing, Alfred Hero, Nicholas A. Be

    Abstract: This paper proposes a hierarchical Bayesian multitask learning model that is applicable to the general multi-task binary classification learning problem where the model assumes a shared sparsity structure across different tasks. We derive a computationally efficient inference algorithm based on variational inference to approximate the posterior distribution. We demonstrate the potential of the new…

    Submitted 4 February, 2025; originally announced February 2025.

  16. arXiv:2501.10440  [pdf, other]

    stat.ME cs.LG math.NA stat.CO stat.ML

    Median of Means Sampling for the Keister Function

    Authors: Bocheng Zhang

    Abstract: This study investigates the performance of median-of-means sampling compared to traditional mean-of-means sampling for computing the Keister function integral using Randomized Quasi-Monte Carlo (RQMC) methods. The research tests both lattice points and digital nets as point distributions across dimensions 2, 3, 5, and 8, with sample sizes ranging from 2^8 to 2^19 points. Results demonstrate that m…

    Submitted 13 January, 2025; originally announced January 2025.

  17. Distributed Pseudo-Likelihood Method for Community Detection in Large-Scale Networks

    Authors: Jiayi Deng, Danyang Huang, Bo Zhang

    Abstract: This paper proposes a distributed pseudo-likelihood method (DPL) to conveniently identify the community structure of large-scale networks. Specifically, we first propose a block-wise splitting method to divide large-scale network data into several subnetworks and distribute them among multiple workers. For simplicity, we assume the classical stochastic block model. Then, the DPL algorithm is itera…

    Submitted 2 November, 2024; originally announced November 2024.

  18. arXiv:2410.08078  [pdf, other]

    stat.ME

    Negative Control Outcome Adjustment in Early-Phase Randomized Trials: Estimating Vaccine Effects on Immune Responses in HIV Exposed Uninfected Infants

    Authors: Ethan Ashby, Bo Zhang, Genevieve G Fouda, Youyi Fong, Holly Janes

    Abstract: Adjustment for prognostic baseline variables can reduce bias due to covariate imbalance and increase efficiency in randomized trials. While the use of covariate adjustment in late-phase trials is justified by favorable large-sample properties, it is seldom used in small, early-phase studies, due to uncertainty in which variables are prognostic and the potential for precision loss, type I error rat…

    Submitted 4 April, 2025; v1 submitted 10 October, 2024; originally announced October 2024.

    Comments: 30 pages, 8 figures, 4 in main text

  19. arXiv:2410.07135  [pdf]

    stat.AP cs.LG stat.ML

    Causal Inference with Double/Debiased Machine Learning for Evaluating the Health Effects of Multiple Mismeasured Pollutants

    Authors: Gang Xu, Xin Zhou, Molin Wang, Boya Zhang, Wenhao Jiang, Francine Laden, Helen H. Suh, Adam A. Szpiro, Donna Spiegelman, Zuoheng Wang

    Abstract: One way to quantify exposure to air pollution and its constituents in epidemiologic studies is to use an individual's nearest monitor. This strategy results in potential inaccuracy in the actual personal exposure, introducing bias in estimating the health effects of air pollution and its constituents, especially when evaluating the causal effects of correlated multi-pollutant constituents measured…

    Submitted 21 September, 2024; originally announced October 2024.

  20. arXiv:2410.06281  [pdf, other]

    stat.ME

    Sequential Design with Derived Win Statistics

    Authors: Baoshan Zhang, Yuan Wu

    Abstract: The Win Ratio has gained significant traction in cardiovascular trials as a novel method for analyzing composite endpoints (Pocock and others, 2012). Compared with conventional approaches based on time to the first event, the Win Ratio accommodates the varying priorities and types of outcomes among components, potentially offering greater statistical power by fully utilizing the information contai…

    Submitted 8 October, 2024; originally announced October 2024.

    Comments: 36 pages, 1 figure, 5 tables

  21. arXiv:2410.01244  [pdf, other]

    stat.ML cs.LG

    Equivariant score-based generative models provably learn distributions with symmetries efficiently

    Authors: Ziyu Chen, Markos A. Katsoulakis, Benjamin J. Zhang

    Abstract: Symmetry is ubiquitous in many real-world phenomena and tasks, such as physics, images, and molecular simulations. Empirical studies have demonstrated that incorporating symmetries into generative models can provide better generalization and sampling efficiency when the underlying data distribution has group symmetry. In this work, we provide the first theoretical analysis and guarantees of score-…

    Submitted 2 October, 2024; originally announced October 2024.

  22. arXiv:2408.00920  [pdf, other]

    cs.LG stat.ML

    Towards Certified Unlearning for Deep Neural Networks

    Authors: Binchi Zhang, Yushun Dong, Tianhao Wang, Jundong Li

    Abstract: In the field of machine unlearning, certified unlearning has been extensively studied in convex machine learning models due to its high efficiency and strong theoretical guarantees. However, its application to deep neural networks (DNNs), known for their highly nonconvex nature, still poses challenges. To bridge the gap between certified unlearning and DNNs, we propose several simple techniques to…

    Submitted 7 May, 2025; v1 submitted 1 August, 2024; originally announced August 2024.

    Comments: ICML 2024 (errata)

  23. arXiv:2407.11901  [pdf, other]

    stat.ML cs.LG stat.CO stat.ME

    Combining Wasserstein-1 and Wasserstein-2 proximals: robust manifold learning via well-posed generative flows

    Authors: Hyemin Gu, Markos A. Katsoulakis, Luc Rey-Bellet, Benjamin J. Zhang

    Abstract: We formulate well-posed continuous-time generative flows for learning distributions that are supported on low-dimensional manifolds through Wasserstein proximal regularizations of $f$-divergences. Wasserstein-1 proximal operators regularize $f$-divergences so that singular distributions can be compared. Meanwhile, Wasserstein-2 proximal operators regularize the paths of the generative flows by add…

    Submitted 16 July, 2024; originally announced July 2024.

  24. arXiv:2405.18782  [pdf, other]

    eess.IV cs.CV stat.ML

    Principled Probabilistic Imaging using Diffusion Models as Plug-and-Play Priors

    Authors: Zihui Wu, Yu Sun, Yifan Chen, Bingliang Zhang, Yisong Yue, Katherine L. Bouman

    Abstract: Diffusion models (DMs) have recently shown outstanding capabilities in modeling complex image distributions, making them expressive image priors for solving Bayesian inverse problems. However, most existing DM-based methods rely on approximations in the generative process to be generic to different inverse problems, leading to inaccurate sample distributions that deviate from the target posterior…

    Submitted 6 November, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted to NeurIPS 2024

  25. arXiv:2405.15754  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Score-based generative models are provably robust: an uncertainty quantification perspective

    Authors: Nikiforos Mimikos-Stamatopoulos, Benjamin J. Zhang, Markos A. Katsoulakis

    Abstract: Through an uncertainty quantification (UQ) perspective, we show that score-based generative models (SGMs) are provably robust to the multiple sources of error in practical implementation. Our primary tool is the Wasserstein uncertainty propagation (WUP) theorem, a model-form UQ bound that describes how the $L^2$ error from learning the score function propagates to a Wasserstein-1 ($\mathbf{d}_1$)…

    Submitted 24 May, 2024; originally announced May 2024.

  26. arXiv:2405.15625  [pdf, other]

    stat.ML cs.LG

    Nonlinear denoising score matching for enhanced learning of structured distributions

    Authors: Jeremiah Birrell, Markos A. Katsoulakis, Luc Rey-Bellet, Benjamin Zhang, Wei Zhu

    Abstract: We present a novel method for training score-based generative models which uses nonlinear noising dynamics to improve learning of structured distributions. Generalizing to a nonlinear drift allows for additional structure to be incorporated into the dynamics, thus making the training better adapted to the data, e.g., in the case of multimodality or (approximate) symmetries. Such structure can be o…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 14 pages, 8 figures

  27. arXiv:2405.07220  [pdf, other]

    cs.LG cs.AI stat.ML

    On Discovery of Local Independence over Continuous Variables via Neural Contextual Decomposition

    Authors: Inwoo Hwang, Yunhyeok Kwak, Yeon-Ji Song, Byoung-Tak Zhang, Sanghack Lee

    Abstract: Conditional independence provides a way to understand causal relationships among the variables of interest. An underlying system may exhibit more fine-grained causal relationships especially between a variable and its parents, which will be called the local independence relationships. One of the most widely studied local relationships is Context-Specific Independence (CSI), which holds in a specif…

    Submitted 12 May, 2024; originally announced May 2024.

    Comments: Conference on Causal Learning and Reasoning (CLeaR), 2023

  28. arXiv:2405.07102  [pdf, other]

    stat.ME stat.AP stat.OT

    Nested Instrumental Variables Design: Switcher Average Treatment Effect, Identification, Efficient Estimation and Generalizability

    Authors: Rui Wang, Ying-Qi Zhao, Oliver Dukes, Bo Zhang

    Abstract: Instrumental variables (IV) are a commonly used tool to estimate causal effects from non-randomized data. An archetype of an IV is a randomized trial with non-compliance where the randomized treatment assignment serves as an IV for the non-ignorable treatment received. Under a monotonicity assumption, a valid IV non-parametrically identifies the average treatment effect among a non-identified, lat…

    Submitted 14 March, 2025; v1 submitted 11 May, 2024; originally announced May 2024.

  29. arXiv:2404.17734  [pdf, other]

    stat.ME stat.AP

    Manipulating a Continuous Instrumental Variable in an Observational Study of Premature Babies: Algorithm, Partial Identification Bounds, and Inference under Randomization and Biased Randomization Assumptions

    Authors: Zhe Chen, Min Haeng Cho, Bo Zhang

    Abstract: Regionalization of intensive care for premature babies refers to a triage system of mothers with high-risk pregnancies to hospitals of varied capabilities based on risks faced by infants. Due to the limited capacity of high-level hospitals, which are equipped with advanced expertise to provide critical care, understanding the effect of delivering premature babies at such hospitals on infant mortal…

    Submitted 27 September, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

  30. arXiv:2404.08331  [pdf, other]

    stat.ME

    A Balanced Statistical Boosting Approach for GAMLSS via New Step Lengths

    Authors: Alexandra Daub, Andreas Mayr, Boyao Zhang, Elisabeth Bergherr

    Abstract: Component-wise gradient boosting algorithms are popular for their intrinsic variable selection and implicit regularization, which can be especially beneficial for very flexible model classes. When estimating generalized additive models for location, scale and shape (GAMLSS) by means of a component-wise gradient boosting algorithm, an important part of the estimation procedure is to determine the r…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 34 pages, 26 figures

  31. arXiv:2404.06064  [pdf, other]

    stat.ME

    Constructing hierarchical time series through clustering: Is there an optimal way for forecasting?

    Authors: Bohan Zhang, Anastasios Panagiotelis, Han Li

    Abstract: Forecast reconciliation has attracted significant research interest in recent years, with most studies taking the hierarchy of time series as given. We extend existing work that uses time series clustering to construct hierarchies, with the goal of improving forecast accuracy, in three ways. First, we investigate multiple approaches to clustering, including not only different clustering algorithms…

    Submitted 7 September, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

    Comments: 28 pages, 13 figures

  32. arXiv:2402.13934  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Do Efficient Transformers Really Save Computation?

    Authors: Kai Yang, Jan Ackermann, Zhenyu He, Guhao Feng, Bohang Zhang, Yunzhen Feng, Qiwei Ye, Di He, Liwei Wang

    Abstract: As transformer-based language models are trained on increasingly large datasets and with vast numbers of parameters, finding more efficient alternatives to the standard Transformer has become very valuable. While many efficient Transformers and Transformer alternatives have been proposed, none provide theoretical guarantees that they are a suitable replacement for the standard Transformer. This ma…

    Submitted 8 November, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: 20 pages, ICML 2024 Camera Ready Version

  33. arXiv:2402.06162  [pdf, other]

    stat.ML cs.LG

    Wasserstein proximal operators describe score-based generative models and resolve memorization

    Authors: Benjamin J. Zhang, Siting Liu, Wuchen Li, Markos A. Katsoulakis, Stanley J. Osher

    Abstract: We focus on the fundamental mathematical structure of score-based generative models (SGMs). We first formulate SGMs in terms of the Wasserstein proximal operator (WPO) and demonstrate that, via mean-field games (MFGs), the WPO formulation reveals mathematical structure that describes the inductive bias of diffusion and score-based models. In particular, MFGs yield optimality conditions in the form…

    Submitted 8 February, 2024; originally announced February 2024.

  34. arXiv:2401.03106  [pdf, other]

    stat.ME

    Contrastive linear regression

    Authors: Boyang Zhang, Sarah Nyquist, Andrew Jones, Barbara E. Engelhardt, Didong Li

    Abstract: Contrastive dimension reduction methods have been developed for case-control study data to identify variation that is enriched in the foreground (case) data X relative to the background (control) data Y. Here, we develop contrastive regression for the setting when there is a response variable r associated with each foreground observation. This situation occurs frequently when, for example, the una…

    Submitted 5 January, 2024; originally announced January 2024.

  35. arXiv:2312.15217  [pdf, other]

    stat.ME stat.AP

    Constructing a T-test for Value Function Comparison of Individualized Treatment Regimes in the Presence of Multiple Imputation for Missing Data

    Authors: Minxin Lu, Annie Green Howard, Penny Gordon-Larsen, Katie A. Meyer, Hsiao-Chuan Tien, Shufa Du, Huijun Wang, Bing Zhang, Michael R. Kosorok

    Abstract: Optimal individualized treatment decision-making has improved health outcomes in recent years. The value function is commonly used to evaluate the goodness of an individualized treatment decision rule. Despite recent advances, comparing value functions between different treatment decision rules or constructing confidence intervals around value functions remains difficult. We propose a t-test based…

    Submitted 19 April, 2025; v1 submitted 23 December, 2023; originally announced December 2023.

  36. arXiv:2311.02757  [pdf, other]

    cs.LG cs.CR stat.ML

    Certified Defense on the Fairness of Graph Neural Networks

    Authors: Yushun Dong, Binchi Zhang, Hanghang Tong, Jundong Li

    Abstract: Graph Neural Networks (GNNs) have emerged as a prominent graph learning model in various graph-based tasks over the years. Nevertheless, due to the vulnerabilities of GNNs, it has been empirically proved that malicious attackers could easily corrupt the fairness level of their predictions by adding perturbations to the input graph data. In this paper, we take crucial steps to study a novel problem…

    Submitted 4 April, 2025; v1 submitted 5 November, 2023; originally announced November 2023.

  37. arXiv:2310.14399  [pdf, other]

    stat.ME stat.AP

    The role of randomization inference in unraveling individual treatment effects in early phase vaccine trials

    Authors: Zhe Chen, Xinran Li, Bo Zhang

    Abstract: Randomization inference is a powerful tool in early phase vaccine trials when estimating the causal effect of a regimen against a placebo or another regimen. Randomization-based inference often focuses on testing either Fisher's sharp null hypothesis of no treatment effect for any participant or Neyman's weak null hypothesis of no sample average treatment effect. Many recent efforts have explored…

    Submitted 26 February, 2024; v1 submitted 22 October, 2023; originally announced October 2023.

  38. arXiv:2309.04354  [pdf, other]

    cs.CV cs.LG stat.ML

    Mobile V-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts

    Authors: Erik Daxberger, Floris Weers, Bowen Zhang, Tom Gunter, Ruoming Pang, Marcin Eichner, Michael Emmersberger, Yinfei Yang, Alexander Toshev, Xianzhi Du

    Abstract: Sparse Mixture-of-Experts models (MoEs) have recently gained popularity due to their ability to decouple model size from inference efficiency by only activating a small subset of the model parameters for any given input token. As such, sparse MoEs have enabled unprecedented scalability, resulting in tremendous successes across domains such as natural language processing and computer vision. In thi…

    Submitted 8 September, 2023; originally announced September 2023.

  39. arXiv:2308.08217  [pdf, other]

    stat.AP

    Matching with multiple criteria and its application to health disparities research

    Authors: Chang Chen, Zhiyu Qian, Bo Zhang

    Abstract: Matching is a popular nonparametric covariate adjustment strategy in empirical health services research. Matching helps construct two groups comparable in many baseline covariates but different in some key aspects under investigation. In health disparities research, it is desirable to understand the contributions of various modifiable factors, like income and insurance type, to the observed dispar…

    Submitted 16 August, 2023; originally announced August 2023.

  40. arXiv:2306.04952  [pdf, ps, other]

    stat.ML cs.LG

    Entropy-based Training Methods for Scalable Neural Implicit Sampler

    Authors: Weijian Luo, Boya Zhang, Zhihua Zhang

    Abstract: Efficiently sampling from un-normalized target distributions is a fundamental problem in scientific computing and machine learning. Traditional approaches such as Markov Chain Monte Carlo (MCMC) guarantee asymptotically unbiased samples from such distributions but suffer from computational inefficiency, particularly when dealing with high-dimensional targets, as they require numerous iterations to…

    Submitted 5 June, 2025; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: Revision: The paper was accepted by NeurIPS2023

  41. arXiv:2305.18809  [pdf, other]

    stat.ME

    Discrete forecast reconciliation

    Authors: Bohan Zhang, Anastasios Panagiotelis, Yanfei Kang

    Abstract: This paper presents a formal framework and proposes algorithms to extend forecast reconciliation to discrete-valued data, including low counts. A novel method is introduced based on recasting the optimisation of scoring rules as an assignment problem, which is solved using quadratic programming. The proposed framework produces coherent join…

    Submitted 14 April, 2024; v1 submitted 30 May, 2023; originally announced May 2023.

  42. arXiv:2305.15408  [pdf, other]

    cs.LG cs.CC cs.CL stat.ML

    Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective

    Authors: Guhao Feng, Bohang Zhang, Yuntian Gu, Haotian Ye, Di He, Liwei Wang

    Abstract: Recent studies have discovered that Chain-of-Thought prompting (CoT) can dramatically improve the performance of Large Language Models (LLMs), particularly when dealing with complex tasks involving mathematics or reasoning. Despite the enormous empirical success, the underlying mechanisms behind CoT and how it unlocks the potential of LLMs remain elusive. In this paper, we take a first step toward…

    Submitted 22 December, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: 42 pages; Camera-ready version for NeurIPS 2023 (Oral Presentation)

  43. arXiv:2305.05759  [pdf, ps, other]

    cs.LG cs.AI cs.CL stat.ML

    Ranking & Reweighting Improves Group Distributional Robustness

    Authors: Yachuan Liu, Bohan Zhang, Qiaozhu Mei, Paramveer Dhillon

    Abstract: Recent work has shown that standard training via empirical risk minimization (ERM) can produce models that achieve high accuracy on average but low accuracy on underrepresented groups due to the prevalence of spurious features. A predominant approach to tackle this group robustness problem minimizes the worst group error (akin to a minimax strategy) on the training data, hoping it will generalize…

    Submitted 9 May, 2023; originally announced May 2023.

  44. arXiv:2304.13534  [pdf, other

    stat.ML cs.LG

    A mean-field games laboratory for generative modeling

    Authors: Benjamin J. Zhang, Markos A. Katsoulakis

    Abstract: We demonstrate the versatility of mean-field games (MFGs) as a mathematical framework for explaining, enhancing, and designing generative models. In generative flows, a Lagrangian formulation is used where each particle (generated sample) aims to minimize a loss function over its simulated path. The loss, however, is dependent on the paths of other particles, which leads to a competition among the…

    Submitted 24 October, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

    Comments: 56 pages, 10 figures. Version 5 has a slightly modified version of the normalizing flow and improved introduction and conclusions

  45. arXiv:2304.06900  [pdf, other

    stat.ME

    Subsampling-Based Modified Bayesian Information Criterion for Large-Scale Stochastic Block Models

    Authors: Jiayi Deng, Danyang Huang, Xiangyu Chang, Bo Zhang

    Abstract: Identifying the number of communities is a fundamental problem in community detection, which has received increasing attention recently. However, rapid advances in technology have led to the emergence of large-scale networks in various disciplines, thereby making existing methods computationally infeasible. To address this challenge, we propose a novel subsampling-based modified Bayesian informati…

    Submitted 13 April, 2023; originally announced April 2023.

  46. Prediction method of cigarette draw resistance based on correlation analysis

    Authors: Linsheng Chen, Zhonghua Yu, Bo Zhang, Qiang Zhu, Hu Fan, Yucan Qiu

    Abstract: The cigarette draw resistance monitoring method is incomplete and single, and it lacks correlation analysis and preventive modeling, resulting in substandard cigarettes in the market. To address this problem without increasing the hardware cost, in this paper, multi-indicator correlation analysis is used to predict cigarette draw resistance. First, the monitoring process of draw resistance is ana…

    Submitted 12 April, 2023; originally announced April 2023.

    Comments: Preprint, submitted to Computers and Electronics in Agriculture. For any suggestions or improvements, please contact me directly by e-mail

  47. arXiv:2304.03476  [pdf, other

    stat.ME

    Generalizing the intention-to-treat effect of an active control against placebo from historical placebo-controlled trials to an active-controlled trial: A case study of the efficacy of daily oral TDF/FTC in the HPTN 084 study

    Authors: Qijia He, Fei Gao, Oliver Dukes, Sinead Delany-Moretlwe, Bo Zhang

    Abstract: In many clinical settings, an active-controlled trial design (e.g., a non-inferiority or superiority design) is often used to compare an experimental medicine to an active control (e.g., an FDA-approved, standard therapy). One prominent example is a recent phase 3 efficacy trial, HIV Prevention Trials Network Study 084 (HPTN 084), comparing long-acting cabotegravir, a new HIV pre-exposure prophyla…

    Submitted 29 December, 2023; v1 submitted 7 April, 2023; originally announced April 2023.

  48. arXiv:2302.07227  [pdf, other

    stat.ME math.PR stat.ML

    Transport map unadjusted Langevin algorithms: learning and discretizing perturbed samplers

    Authors: Benjamin J. Zhang, Youssef M. Marzouk, Konstantinos Spiliopoulos

    Abstract: Langevin dynamics are widely used in sampling high-dimensional, non-Gaussian distributions whose densities are known up to a normalizing constant. In particular, there is strong interest in unadjusted Langevin algorithms (ULA), which directly discretize Langevin dynamics to estimate expectations over the target distribution. We study the use of transport maps that approximately normalize a target…

    Submitted 22 October, 2024; v1 submitted 14 February, 2023; originally announced February 2023.

    Comments: 29 pages, 12 figures

    MSC Class: 62D99; 60H35

  49. arXiv:2301.11562  [pdf, other

    cs.LG cs.AI cs.CY stat.ML

    Arbitrariness and Social Prediction: The Confounding Role of Variance in Fair Classification

    Authors: A. Feder Cooper, Katherine Lee, Madiha Zahrah Choksi, Solon Barocas, Christopher De Sa, James Grimmelmann, Jon Kleinberg, Siddhartha Sen, Baobao Zhang

    Abstract: Variance in predictions across different trained models is a significant, under-explored source of error in fair binary classification. In practice, the variance on some data examples is so large that decisions can be effectively arbitrary. To investigate this problem, we take an experimental approach and make four overarching contributions. We: 1) Define a metric called self-consistency, derived…

    Submitted 6 March, 2024; v1 submitted 27 January, 2023; originally announced January 2023.

    Comments: AAAI '24 (received a Best Paper Honorable Mention designation)

  50. arXiv:2301.09505  [pdf, other

    cs.LG stat.ML

    Rethinking the Expressive Power of GNNs via Graph Biconnectivity

    Authors: Bohang Zhang, Shengjie Luo, Liwei Wang, Di He

    Abstract: Designing expressive Graph Neural Networks (GNNs) is a central topic in learning graph-structured data. While numerous approaches have been proposed to improve GNNs in terms of the Weisfeiler-Lehman (WL) test, generally there is still a lack of deep understanding of what additional power they can systematically and provably gain. In this paper, we take a fundamentally different perspective to stud…

    Submitted 10 February, 2024; v1 submitted 23 January, 2023; originally announced January 2023.

    Comments: Extended from ICLR 2023 Outstanding Paper; 60 pages, 12 figures. Fix typos in the previous version