Showing 1–50 of 63 results for author: Wang, A

  1. arXiv:2505.22622  [pdf, other]

    stat.ML cs.LG math.ST

    Principled Out-of-Distribution Generalization via Simplicity

    Authors: Jiawei Ge, Amanda Wang, Shange Tang, Chi Jin

    Abstract: Modern foundation models exhibit remarkable out-of-distribution (OOD) generalization, solving tasks far beyond the support of their training data. However, the theoretical principles underpinning this phenomenon remain elusive. This paper investigates this problem by examining the compositional generalization abilities of diffusion models in image generation. Our analysis reveals that while neural…

    Submitted 28 May, 2025; originally announced May 2025.

  2. arXiv:2504.20879  [pdf, other]

    cs.AI cs.CL cs.LG stat.ME

    The Leaderboard Illusion

    Authors: Shivalika Singh, Yiyang Nan, Alex Wang, Daniel D'Souza, Sayash Kapoor, Ahmet Üstün, Sanmi Koyejo, Yuntian Deng, Shayne Longpre, Noah A. Smith, Beyza Ermis, Marzieh Fadaee, Sara Hooker

    Abstract: Measuring progress is fundamental to the advancement of any scientific field. As benchmarks play an increasingly central role, they also grow more susceptible to distortion. Chatbot Arena has emerged as the go-to leaderboard for ranking the most capable AI systems. Yet, in this work we identify systematic issues that have resulted in a distorted playing field. We find that undisclosed private test…

    Submitted 12 May, 2025; v1 submitted 29 April, 2025; originally announced April 2025.

    Comments: 68 pages, 18 figures, 9 tables

  3. arXiv:2504.18409  [pdf, other]

    stat.CO math.PR

    Analysis of Multiple-try Metropolis via Poincaré inequalities

    Authors: Rocco Caprio, Sam Power, Andi Wang

    Abstract: We study the Multiple-try Metropolis algorithm using the framework of Poincaré inequalities. We describe the Multiple-try Metropolis as an auxiliary variable implementation of a resampling approximation to an ideal Metropolis--Hastings algorithm. Under suitable moment conditions on the importance weights, we derive explicit Poincaré comparison results between the Multiple-try algorithm and the ide…

    Submitted 25 April, 2025; originally announced April 2025.

  4. arXiv:2504.04829  [pdf, other]

    cs.LG eess.SP stat.ML

    Attentional Graph Meta-Learning for Indoor Localization Using Extremely Sparse Fingerprints

    Authors: Wenzhong Yan, Feng Yin, Jun Gao, Ao Wang, Yang Tian, Ruizhi Chen

    Abstract: Fingerprint-based indoor localization is often labor-intensive due to the need for dense grids and repeated measurements across time and space. Maintaining high localization accuracy with extremely sparse fingerprints remains a persistent challenge. Existing benchmark methods primarily rely on the measured fingerprints, while neglecting valuable spatial and environmental characteristics. In this p…

    Submitted 7 April, 2025; originally announced April 2025.

  5. arXiv:2502.00470  [pdf, other]

    math.OC cs.LG stat.ML

    Distributed Primal-Dual Algorithms: Unification, Connections, and Insights

    Authors: Runxiong Wu, Dong Liu, Xueqin Wang, Andi Wang

    Abstract: We study primal-dual algorithms for general empirical risk minimization problems in distributed settings, focusing on two prominent classes of algorithms. The first class is the communication-efficient distributed dual coordinate ascent (CoCoA), derived from the coordinate ascent method for solving the dual problem. The second class is the alternating direction method of multipliers (ADMM), includ…

    Submitted 1 February, 2025; originally announced February 2025.

    Comments: 15 pages, 4 figures, 1 table

  6. arXiv:2501.12352  [pdf, other]

    cs.LG cs.AI cs.NE stat.ML

    Test-time regression: a unifying framework for designing sequence models with associative memory

    Authors: Ke Alexander Wang, Jiaxin Shi, Emily B. Fox

    Abstract: Sequence models lie at the heart of modern deep learning. However, rapid advancements have produced a diversity of seemingly unrelated architectures, such as Transformers and recurrent alternatives. In this paper, we introduce a unifying framework to understand and derive these sequence models, inspired by the empirical importance of associative recall, the capability to retrieve contextually rele…

    Submitted 1 May, 2025; v1 submitted 21 January, 2025; originally announced January 2025.

  7. arXiv:2501.02128  [pdf, other]

    stat.AP

    Transfer Learning for Individualized Treatment Rules: Application to Sepsis Patients Data from eICU-CRD and MIMIC-III Databases

    Authors: Andong Wang, Kelly Wentzlof, Johnny Rajala, Miontranese Green, Yunshu Zhang, Shu Yang

    Abstract: Modern precision medicine aims to utilize real-world data to provide the best treatment for an individual patient. An individualized treatment rule (ITR) maps each patient's characteristics to a recommended treatment scheme that maximizes the expected outcome of the patient. A challenge precision medicine faces is population heterogeneity, as studies on treatment effects are often conducted on sou…

    Submitted 3 January, 2025; originally announced January 2025.

    Comments: 23 pages, 4 figures

  8. arXiv:2412.13574  [pdf]

    cs.HC stat.AP

    Revisiting Interactions of Multiple Driver States in Heterogenous Population and Cognitive Tasks

    Authors: Jiyao Wang, Ange Wang, Song Yan, Dengbo He, Kaishun Wu

    Abstract: In real-world driving scenarios, multiple states occur simultaneously due to individual differences and environmental factors, complicating the analysis and estimation of driver states. Previous studies, limited by experimental design and analytical methods, may not be able to disentangle the relationships among multiple driver states and environmental factors. This paper introduces the Double Mac…

    Submitted 19 December, 2024; v1 submitted 18 December, 2024; originally announced December 2024.

  9. arXiv:2409.20175  [pdf, ps, other]

    cs.LG stat.ML

    Ensemble Kalman Diffusion Guidance: A Derivative-free Method for Inverse Problems

    Authors: Hongkai Zheng, Wenda Chu, Austin Wang, Nikola Kovachki, Ricardo Baptista, Yisong Yue

    Abstract: When solving inverse problems, one increasingly popular approach is to use pre-trained diffusion models as plug-and-play priors. This framework can accommodate different forward models without re-training while preserving the generative capability of diffusion models. Despite their success in many imaging inverse problems, most existing methods rely on privileged information such as derivative, ps…

    Submitted 2 June, 2025; v1 submitted 30 September, 2024; originally announced September 2024.

    Journal ref: Transactions on Machine Learning Research, 2025

  10. arXiv:2407.16033  [pdf, ps, other]

    math.PR math.AP stat.CO

    Explicit convergence rates of underdamped Langevin dynamics under weighted and weak Poincaré--Lions inequalities

    Authors: Giovanni Brigati, Gabriel Stoltz, Andi Q. Wang, Lihan Wang

    Abstract: We study the long-time behavior of the underdamped Langevin dynamics, in the case of so-called weak confinement. Indeed, any $\mathrm{L}^\infty$ distribution (in position and velocity) relaxes to equilibrium over time, and we quantify the convergence rate. In our situation, the spatial equilibrium distribution does not satisfy a Poincaré inequality. Instead, we assume a weighted Poincaré in…

    Submitted 17 June, 2025; v1 submitted 22 July, 2024; originally announced July 2024.

    Comments: first version submitted to journal

  11. arXiv:2405.17248  [pdf, other]

    stat.ML cs.LG

    On Understanding Attention-Based In-Context Learning for Categorical Data

    Authors: Aaron T. Wang, William Convertino, Xiang Cheng, Ricardo Henao, Lawrence Carin

    Abstract: In-context learning based on attention models is examined for data with categorical outcomes, with inference in such models viewed from the perspective of functional gradient descent (GD). We develop a network composed of attention blocks, with each block employing a self-attention layer followed by a cross-attention layer, with associated skip connections. This model can exactly perform multi-ste…

    Submitted 6 May, 2025; v1 submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted to ICML 2025

  12. arXiv:2403.18540  [pdf, other]

    stat.ML cs.LG stat.CO

    skscope: Fast Sparsity-Constrained Optimization in Python

    Authors: Zezhi Wang, Jin Zhu, Peng Chen, Huiyang Peng, Xiaoke Zhang, Anran Wang, Junxian Zhu, Xueqin Wang

    Abstract: Applying iterative solvers on sparsity-constrained optimization (SCO) requires tedious mathematical deduction and careful programming/debugging that hinders these solvers' broad impact. In the paper, the library skscope is introduced to overcome such an obstacle. With skscope, users can solve the SCO by just programming the objective function. The convenience of skscope is demonstrated through two…

    Submitted 11 October, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: 4 pages; add experiment

  13. arXiv:2402.13678  [pdf, other]

    stat.CO math.PR stat.ME

    Weak Poincaré inequality comparisons for ideal and hybrid slice sampling

    Authors: Sam Power, Daniel Rudolf, Björn Sprungk, Andi Q. Wang

    Abstract: Using the framework of weak Poincaré inequalities, we provide a general comparison between the Hybrid and Ideal Slice Sampling Markov chains in terms of their Dirichlet forms. In particular, under suitable assumptions Hybrid Slice Sampling will inherit fast convergence from Ideal Slice Sampling and conversely. We apply our results to analyse the convergence of the Independent Metropolis--Hasting…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: 35 pages, 2 figures

    MSC Class: 65C05; 60J22

  14. arXiv:2312.11689  [pdf, ps, other]

    math.PR stat.CO

    Weak Poincaré Inequalities for Markov chains: theory and applications

    Authors: Christophe Andrieu, Anthony Lee, Sam Power, Andi Q. Wang

    Abstract: We investigate the application of Weak Poincaré Inequalities (WPI) to Markov chains to study their rates of convergence and to derive complexity bounds. At a theoretical level we investigate the necessity of the existence of WPIs to ensure $\mathrm{L}^{2}$-convergence, in particular by establishing equivalence with the Resolvent Uniform Positivity-Improving (RUPI) condition and providing a counterex…

    Submitted 18 December, 2023; originally announced December 2023.

  15. arXiv:2312.07636  [pdf, other]

    cs.LG cs.CV stat.ML

    Go beyond End-to-End Training: Boosting Greedy Local Learning with Context Supply

    Authors: Chengting Yu, Fengzhao Zhang, Hanzhi Ma, Aili Wang, Erping Li

    Abstract: Traditional end-to-end (E2E) training of deep networks necessitates storing intermediate activations for back-propagation, resulting in a large memory footprint on GPUs and restricted model parallelization. As an alternative, greedy local learning partitions the network into gradient-isolated modules and trains them with supervision based on local preliminary losses, thereby providing asynchronous and paral…

    Submitted 3 December, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: 9 figures, 12 tables

  16. arXiv:2312.03344  [pdf, other]

    cs.LG math.DS stat.AP stat.ML

    Interpretable Mechanistic Representations for Meal-level Glycemic Control in the Wild

    Authors: Ke Alexander Wang, Emily B. Fox

    Abstract: Diabetes encompasses a complex landscape of glycemic control that varies widely among individuals. However, current methods do not faithfully capture this variability at the meal level. On the one hand, expert-crafted features lack the flexibility of data-driven methods; on the other hand, learned representations tend to be uninterpretable which hampers clinical adoption. In this paper, we propose…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: Proceedings of Machine Learning for Health (ML4H) 2023. Code available at: https://github.com/KeAWang/interpretable-cgm-representations

  17. arXiv:2309.11120  [pdf, other]

    stat.ML cs.LG

    Ano-SuPs: Multi-size anomaly detection for manufactured products by identifying suspected patches

    Authors: Hao Xu, Juan Du, Andi Wang, YingCong Chen

    Abstract: Image-based systems have gained popularity owing to their capacity to provide rich manufacturing status information, low implementation costs and high acquisition rates. However, the complexity of the image background and various anomaly patterns pose new challenges to existing matrix decomposition methods, which are inadequate for modeling requirements. Moreover, the uncertainty of the anomaly ca…

    Submitted 3 January, 2025; v1 submitted 20 September, 2023; originally announced September 2023.

    Comments: accepted oral presentation at the 18th INFORMS DMDA Workshop

  18. arXiv:2307.02052  [pdf]

    stat.ME

    Replicability of Simulation Studies for the Investigation of Statistical Methods: The RepliSims Project

    Authors: K. Luijken, A. Lohmann, U. Alter, J. Claramunt Gonzalez, F. J. Clouth, J. L. Fossum, L. Hesen, A. H. J. Huizing, J. Ketelaar, A. K. Montoya, L. Nab, R. C. C. Nijman, B. B. L. Penning de Vries, T. D. Tibbe, Y. A. Wang, R. H. H. Groenwold

    Abstract: Results of simulation studies evaluating the performance of statistical methods are often considered actionable and thus can have a major impact on the way empirical research is implemented. However, so far there is limited evidence about the reproducibility and replicability of statistical simulation studies. Therefore, eight highly cited statistical simulation studies were selected, and their re…

    Submitted 5 July, 2023; originally announced July 2023.

    Comments: 36 pages, 0 figures

  19. arXiv:2306.13794  [pdf, other]

    stat.ML cs.LG

    Tensor Dirichlet Process Multinomial Mixture Model for Passenger Trajectory Clustering

    Authors: Ziyue Li, Hao Yan, Chen Zhang, Andi Wang, Wolfgang Ketter, Lijun Sun, Fugee Tsung

    Abstract: Passenger clustering based on travel records is essential for transportation operators. However, existing methods cannot easily cluster the passengers due to the hierarchical structure of the passenger trip information, namely: each passenger has multiple trips, and each trip contains multi-dimensional multi-mode information. Furthermore, existing approaches rely on an accurate specification of th…

    Submitted 23 June, 2023; originally announced June 2023.

    Comments: Under Review of Transportation Research Part C: Emerging Technologies

  20. arXiv:2306.08343  [pdf]

    stat.AP

    A Unified Probabilistic Framework for Spatiotemporal Passenger Crowdedness Inference within Urban Rail Transit Network

    Authors: Min Jiang, Andi Wang, Ziyue Li, Fugee Tsung

    Abstract: This paper proposes the Spatio-Temporal Crowdedness Inference Model (STCIM), a framework to infer the passenger distribution inside the whole urban rail transit (URT) system in real-time. Our model is practical since the model is designed in a probabilistic manner and only based on the entry and exit timestamps information collected by the automatic fare collection (AFC) system. Firstly, the entir…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: Accepted to IEEE CASE 2023

  21. arXiv:2305.14767  [pdf]

    stat.ME

    Interpretation and visualization of distance covariance through additive decomposition of correlations formula

    Authors: Andi Wang, Hao Yan, Juan Du

    Abstract: Distance covariance is a widely used statistical methodology for testing the dependency between two groups of variables. Despite the appealing properties of consistency and superior testing power, the testing results of distance covariance are often hard to interpret. This paper presents an elementary interpretation of the mechanism of distance covariance through an additive decomposition of…

    Submitted 24 May, 2023; originally announced May 2023.

  22. arXiv:2305.01638  [pdf, other]

    cs.LG cs.CV stat.ML

    Sequence Modeling with Multiresolution Convolutional Memory

    Authors: Jiaxin Shi, Ke Alexander Wang, Emily B. Fox

    Abstract: Efficiently capturing the long-range patterns in sequential data sources salient to a given task -- such as classification and generative modeling -- poses a fundamental challenge. Popular approaches in the space trade off between the memory burden of brute-force enumeration and comparison, as in transformers, the computational burden of complicated sequential dependencies, as in recurrent neural n…

    Submitted 1 November, 2023; v1 submitted 2 May, 2023; originally announced May 2023.

    Comments: ICML 2023, Source code: https://github.com/thjashin/multires-conv

  23. arXiv:2303.13865  [pdf, ps, other]

    math.CT stat.ME

    Compositionality in algorithms for smoothing

    Authors: Moritz Schauer, Frank van der Meulen, Andi Q. Wang

    Abstract: Backward Filtering Forward Guiding (BFFG) is a bidirectional algorithm proposed in Mider et al. [2021] and studied more in depth in a general setting in Van der Meulen and Schauer [2022]. In category theory, optics have been proposed for modelling systems with bidirectional data flow. We connect BFFG with optics by demonstrating that the forward and backwards map together define a functor from a c…

    Submitted 16 April, 2025; v1 submitted 24 March, 2023; originally announced March 2023.

    MSC Class: Primary: 62M05; 18M35; Secondary: 18M05

  24. arXiv:2301.03847  [pdf]

    stat.AP

    Evaluating the Performance of Low-Cost PM2.5 Sensors in Mobile Settings

    Authors: Priyanka deSouza, An Wang, Yuki Machida, Tiffany Duhl, Simone Mora, Prashant Kumar, Ralph Kahn, Carlo Ratti, John L. Durant, Neelakshi Hudda

    Abstract: Low-cost sensors (LCS) for measuring air pollution are increasingly being deployed in mobile applications but questions concerning the quality of the measurements remain unanswered. For example, what is the best way to correct LCS data in a mobile setting? Which factors most significantly contribute to differences between mobile LCS data and higher-quality instruments? Can data from LCS be used to…

    Submitted 10 January, 2023; originally announced January 2023.

    Comments: 43 pages

  25. Assessing long-term medical remanufacturing emissions with Life Cycle Analysis

    Authors: Julia A. Meister, Jack Sharp, Yan Wang, Khuong An Nguyen

    Abstract: The unsustainable take-make-dispose linear economy prevalent in healthcare contributes 4.4% to global Greenhouse Gas emissions. A popular but not yet widely-embraced solution is to remanufacture common single-use medical devices like electrophysiology catheters, significantly extending their lifetimes by enabling a circular life cycle. To support the adoption of catheter remanufacturing, we propos…

    Submitted 9 February, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

    Comments: 29 pages, 10 figures, 8 tables

    Journal ref: Processes 2023, 11, 36

  26. arXiv:2211.13937  [pdf, other]

    cs.LG cs.AI eess.SY math.OC stat.ML

    Operator Splitting Value Iteration

    Authors: Amin Rakhsha, Andrew Wang, Mohammad Ghavamzadeh, Amir-massoud Farahmand

    Abstract: We introduce new planning and reinforcement learning algorithms for discounted MDPs that utilize an approximate model of the environment to accelerate the convergence of the value function. Inspired by the splitting approach in numerical linear algebra, we introduce Operator Splitting Value Iteration (OS-VI) for both Policy Evaluation and Control problems. OS-VI achieves a much faster convergence…

    Submitted 25 November, 2022; originally announced November 2022.

    Comments: Accepted to NeurIPS 2022

  27. arXiv:2211.08959  [pdf, other]

    math.PR stat.CO

    Explicit convergence bounds for Metropolis Markov chains: isoperimetry, spectral gaps and profiles

    Authors: Christophe Andrieu, Anthony Lee, Sam Power, Andi Q. Wang

    Abstract: We derive the first explicit bounds for the spectral gap of a random walk Metropolis algorithm on $R^d$ for any value of the proposal variance, which when scaled appropriately recovers the correct $d^{-1}$ dependence on dimension for suitably regular invariant distributions. We also obtain explicit bounds on the ${\rm L}^2$-mixing time for a broad class of models. In obtaining these results, we re…

    Submitted 31 October, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

    Journal ref: Ann. Appl. Probab. 34(4): 4022-4071 (August 2024)

  28. arXiv:2210.09901  [pdf, other]

    stat.CO

    Sampling using Adaptive Regenerative Processes

    Authors: Hector McKimm, Andi Q Wang, Murray Pollock, Christian P Robert, Gareth O Roberts

    Abstract: Enriching Brownian motion with regenerations from a fixed regeneration distribution $μ$ at a particular regeneration rate $κ$ results in a Markov process that has a target distribution $π$ as its invariant distribution. For the purpose of Monte Carlo inference, implementing such a scheme requires firstly selection of regeneration distribution $μ$, and secondly computation of a specific constant…

    Submitted 20 February, 2024; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: 43 pages, 10 figures

  29. arXiv:2210.05241  [pdf, other]

    cs.NE q-bio.NC stat.ML

    STSC-SNN: Spatio-Temporal Synaptic Connection with Temporal Convolution and Attention for Spiking Neural Networks

    Authors: Chengting Yu, Zheming Gu, Da Li, Gaoang Wang, Aili Wang, Erping Li

    Abstract: Spiking Neural Networks (SNNs), as one of the algorithmic models in neuromorphic computing, have gained a great deal of research attention owing to temporal information processing capability, low power consumption, and high biological plausibility. The potential to efficiently extract spatio-temporal features makes it suitable for processing the event streams. However, existing synaptic structures…

    Submitted 11 October, 2022; originally announced October 2022.

    Journal ref: Frontiers in neuroscience, 2022, 12

  30. arXiv:2210.01019  [pdf, other]

    stat.ML cs.LG

    Plateau in Monotonic Linear Interpolation -- A "Biased" View of Loss Landscape for Deep Networks

    Authors: Xiang Wang, Annie N. Wang, Mo Zhou, Rong Ge

    Abstract: Monotonic linear interpolation (MLI) - on the line connecting a random initialization with the minimizer it converges to, the loss and accuracy are monotonic - is a phenomenon that is commonly observed in the training of neural networks. Such a phenomenon may seem to suggest that optimization of neural networks is easy. In this paper, we show that the MLI property is not necessarily related to the…

    Submitted 14 February, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: ICLR 2023

  31. arXiv:2208.05239  [pdf, ps, other]

    math.PR stat.CO

    Poincaré inequalities for Markov chains: a meeting with Cheeger, Lyapunov and Metropolis

    Authors: Christophe Andrieu, Anthony Lee, Sam Power, Andi Q. Wang

    Abstract: We develop a theory of weak Poincaré inequalities to characterize convergence rates of ergodic Markov chains. Motivated by the application of Markov chains in the context of algorithms, we develop a relevant set of tools which enable the practical study of convergence rates in the setting of Markov chain Monte Carlo methods, but also well beyond.

    Submitted 10 August, 2022; originally announced August 2022.

    Comments: 80 pages

    MSC Class: 60J22; 65C05

  32. arXiv:2207.00559  [pdf, other]

    cs.LG hep-ex physics.ins-det stat.ML

    Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml

    Authors: Elham E Khoda, Dylan Rankin, Rafael Teixeira de Lima, Philip Harris, Scott Hauck, Shih-Chieh Hsu, Michael Kagan, Vladimir Loncar, Chaitanya Paikara, Richa Rao, Sioni Summers, Caterina Vernieri, Aaron Wang

    Abstract: Recurrent neural networks have been shown to be effective architectures for many tasks in high energy physics, and thus have been widely adopted. Their use in low-latency environments has, however, been limited as a result of the difficulties of implementing recurrent architectures on field-programmable gate arrays (FPGAs). In this paper we present an implementation of two types of recurrent neura…

    Submitted 1 July, 2022; originally announced July 2022.

    Comments: 12 pages, 6 figures, 5 tables

  33. arXiv:2203.02433  [pdf, ps, other]

    cs.LG cs.NE math.OC stat.ML

    The Machine Learning for Combinatorial Optimization Competition (ML4CO): Results and Insights

    Authors: Maxime Gasse, Quentin Cappart, Jonas Charfreitag, Laurent Charlin, Didier Chételat, Antonia Chmiela, Justin Dumouchelle, Ambros Gleixner, Aleksandr M. Kazachkov, Elias Khalil, Pawel Lichocki, Andrea Lodi, Miles Lubin, Chris J. Maddison, Christopher Morris, Dimitri J. Papageorgiou, Augustin Parjadis, Sebastian Pokutta, Antoine Prouvost, Lara Scavuzzo, Giulia Zarpellon, Linxin Yang, Sha Lai, Akang Wang, Xiaodong Luo, et al. (16 additional authors not shown)

    Abstract: Combinatorial optimization is a well-established area in operations research and computer science. Until recently, its methods have focused on solving problem instances in isolation, ignoring that they often stem from related data distributions in practice. However, recent years have seen a surge of interest in using machine learning as a new approach for solving combinatorial problems, either dir…

    Submitted 17 March, 2022; v1 submitted 4 March, 2022; originally announced March 2022.

    Comments: NeurIPS 2021 competition. arXiv admin note: text overlap with arXiv:2112.12251 by other authors

  34. arXiv:2201.02967  [pdf, other]

    stat.ML cs.LG stat.AP

    Robust classification with flexible discriminant analysis in heterogeneous data

    Authors: Pierre Houdouin, Frédéric Pascal, Matthieu Jonckheere, Andrew Wang

    Abstract: Linear and Quadratic Discriminant Analysis are well-known classical methods but can heavily suffer from non-Gaussian distributions and/or contaminated datasets, mainly because of the underlying Gaussian assumption that is not robust. To fill this gap, this paper presents a new robust discriminant analysis where each data point is drawn by its own arbitrary Elliptically Symmetrical (ES) distributio…

    Submitted 9 January, 2022; originally announced January 2022.

    Comments: ICASSP conference paper, 5 pages

  35. arXiv:2112.12986  [pdf, other]

    cs.LG stat.ML

    Is Importance Weighting Incompatible with Interpolating Classifiers?

    Authors: Ke Alexander Wang, Niladri S. Chatterji, Saminul Haque, Tatsunori Hashimoto

    Abstract: Importance weighting is a classic technique to handle distribution shifts. However, prior work has presented strong empirical and theoretical evidence demonstrating that importance weights can have little to no effect on overparameterized neural networks. Is importance weighting truly incompatible with the training of overparameterized neural networks? Our paper answers this in the negative. We sh…

    Submitted 4 March, 2022; v1 submitted 24 December, 2021; originally announced December 2021.

    Comments: International Conference on Learning Representations (ICLR), 2022

  36. arXiv:2112.05605  [pdf, other]

    stat.CO cs.LG

    Comparison of Markov chains via weak Poincaré inequalities with application to pseudo-marginal MCMC

    Authors: Christophe Andrieu, Anthony Lee, Sam Power, Andi Q. Wang

    Abstract: We investigate the use of a certain class of functional inequalities known as weak Poincaré inequalities to bound convergence of Markov chains to equilibrium. We show that this enables the straightforward and transparent derivation of subgeometric convergence bounds for methods such as the Independent Metropolis--Hastings sampler and pseudo-marginal methods for intractable likelihoods, the latter…

    Submitted 9 August, 2022; v1 submitted 10 December, 2021; originally announced December 2021.

    Comments: Revised manuscript; includes additional results

    MSC Class: 65C40; 65C05; 62J10

    Journal ref: Ann. Statist. 50(6): 3592-3618 (December 2022)

  37. arXiv:2111.05859  [pdf, other]

    math.ST math.PR stat.CO stat.ME

    PDMP Monte Carlo methods for piecewise-smooth densities

    Authors: Augustin Chevallier, Sam Power, Andi Q. Wang, Paul Fearnhead

    Abstract: There has been substantial interest in developing Markov chain Monte Carlo algorithms based on piecewise-deterministic Markov processes. However, existing algorithms can only be used if the target distribution of interest is differentiable everywhere. The key to adapting these algorithms so that they can sample from densities with discontinuities is defining appropriate dynamics for the process…

    Submitted 10 November, 2021; originally announced November 2021.

  38. arXiv:2109.13819  [pdf, other]

    math.PR stat.ME

    Perturbation theory for killed Markov processes and quasi-stationary distributions

    Authors: Daniel Rudolf, Andi Q. Wang

    Abstract: Motivated by recent developments of quasi-stationary Monte Carlo methods, we investigate the stability of quasi-stationary distributions of killed Markov processes under perturbations of the generator. We first consider a general bounded self-adjoint perturbation operator, and after that, study a particular unbounded perturbation corresponding to truncation of the killing rate. In both scenarios,…

    Submitted 24 September, 2024; v1 submitted 28 September, 2021; originally announced September 2021.

    Comments: 32 pages, 1 figure

    MSC Class: 60J70; 47A55; 65C05; 60J22

  39. arXiv:2106.06695  [pdf, other]

    cs.LG stat.ML

    SKIing on Simplices: Kernel Interpolation on the Permutohedral Lattice for Scalable Gaussian Processes

    Authors: Sanyam Kapoor, Marc Finzi, Ke Alexander Wang, Andrew Gordon Wilson

    Abstract: State-of-the-art methods for scalable Gaussian processes use iterative algorithms, requiring fast matrix vector multiplies (MVMs) with the covariance kernel. The Structured Kernel Interpolation (SKI) framework accelerates these MVMs by performing efficient MVMs on a grid and interpolating back to the original space. In this work, we develop a connection between SKI and the permutohedral lattice us…

    Submitted 12 June, 2021; originally announced June 2021.

    Comments: International Conference on Machine Learning (ICML), 2021

  40. arXiv:2104.09460  [pdf, other]

    stat.ML cs.AI cs.IT cs.LG cs.NE

    Bayesian Algorithm Execution: Estimating Computable Properties of Black-box Functions Using Mutual Information

    Authors: Willie Neiswanger, Ke Alexander Wang, Stefano Ermon

    Abstract: In many real-world problems, we want to infer some property of an expensive black-box function $f$, given a budget of $T$ function evaluations. One example is budget constrained global optimization of $f$, for which Bayesian optimization is a popular method. Other properties of interest include local optima, level sets, integrals, or graph-structured information induced by $f$. Often, we can find…

    Submitted 6 July, 2021; v1 submitted 19 April, 2021; originally announced April 2021.

    Comments: Appears in Proceedings of the 38th International Conference on Machine Learning (ICML), 2021

  41. arXiv:2103.02877  [pdf, other]

    stat.ME

    A Two-Sample Robust Bayesian Mendelian Randomization Method Accounting for Linkage Disequilibrium and Idiosyncratic Pleiotropy with Applications to the COVID-19 Outcome

    Authors: Anqi Wang, Zhonghua Liu

    Abstract: Mendelian randomization (MR) is a statistical method exploiting genetic variants as instrumental variables to estimate the causal effect of modifiable risk factors on an outcome of interest. Despite wide uses of various popular two-sample MR methods based on genome-wide association study summary level data, however, those methods could suffer from potential power loss or/and biased inference when…

    Submitted 16 November, 2021; v1 submitted 4 March, 2021; originally announced March 2021.

  42. arXiv:2011.09341  [pdf, other]

    math.PR stat.CO

    Subgeometric hypocoercivity for piecewise-deterministic Markov process Monte Carlo methods

    Authors: Christophe Andrieu, Paul Dobson, Andi Q. Wang

    Abstract: We extend the hypocoercivity framework for piecewise-deterministic Markov process (PDMP) Monte Carlo established in [Andrieu et al. (2018)] to heavy-tailed target distributions, which exhibit subgeometric rates of convergence to equilibrium. We make use of weak Poincaré inequalities, as developed in the work of [Grothaus and Wang (2019)], the ideas of which we adapt to the PDMPs of interest. On t…

    Submitted 29 April, 2021; v1 submitted 18 November, 2020; originally announced November 2020.

    Comments: 33 pages, 1 figure. Minor revisions made

    Journal ref: Electron. J. Probab. 26 1 - 26, 2021

  43. arXiv:2010.13581  [pdf, other]

    cs.LG math.DS physics.comp-ph physics.data-an stat.ML

    Simplifying Hamiltonian and Lagrangian Neural Networks via Explicit Constraints

    Authors: Marc Finzi, Ke Alexander Wang, Andrew Gordon Wilson

    Abstract: Reasoning about the physical world requires models that are endowed with the right inductive biases to learn the underlying dynamics. Recent works improve generalization for predicting trajectories by learning the Hamiltonian or Lagrangian of a system rather than the differential equations directly. While these methods encode the constraints of the systems using generalized coordinates, we show th…

    Submitted 26 October, 2020; originally announced October 2020.

    Comments: NeurIPS 2020. Code available at https://github.com/mfinzi/constrained-hamiltonian-neural-networks

  44. arXiv:2010.04261  [pdf, other]

    cs.LG cs.NE stat.ML

    Dissecting Hessian: Understanding Common Structure of Hessian in Neural Networks

    Authors: Yikai Wu, Xingyu Zhu, Chenwei Wu, Annie Wang, Rong Ge

    Abstract: Hessian captures important properties of the deep neural network loss landscape. Previous works have observed low rank structure in the Hessians of neural networks. In this paper, we propose a decoupling conjecture that decomposes the layer-wise Hessians of a network as the Kronecker product of two smaller matrices. We can analyze the properties of these smaller matrices and prove the structure of…

    Submitted 21 October, 2022; v1 submitted 8 October, 2020; originally announced October 2020.

    Comments: 72 pages, 31 figures. Main text: 10 pages, 7 figures. First two authors have equal contribution and are in alphabetical order

    ACM Class: I.2.6

  45. arXiv:2009.04806  [pdf, other]

    cs.CV cs.LG cs.NE stat.ML

    SketchEmbedNet: Learning Novel Concepts by Imitating Drawings

    Authors: Alexander Wang, Mengye Ren, Richard S. Zemel

    Abstract: Sketch drawings capture the salient information of visual concepts. Previous work has shown that neural networks are capable of producing sketches of natural objects drawn from a small number of classes. While earlier approaches focus on generation quality or retrieval, we explore properties of image representations learned by training a model to produce sketches of images. We show that this gener…

    Submitted 22 June, 2021; v1 submitted 27 August, 2020; originally announced September 2020.

    Comments: ICML 2021

  46. arXiv:2009.03859  [pdf, other]

    cs.LG stat.ML

    Trajectory Based Podcast Recommendation

    Authors: Greg Benton, Ghazal Fazelnia, Alice Wang, Ben Carterette

    Abstract: Podcast recommendation is a growing area of research that presents new challenges and opportunities. Individuals interact with podcasts in a way that is distinct from most other media; and primary to our concerns is distinct from music consumption. We show that successful and consistent recommendations can be made by viewing users as moving through the podcast library sequentially. Recommendations…

    Submitted 8 September, 2020; originally announced September 2020.

  47. arXiv:2008.02883  [pdf, other]

    cs.LG stat.ML

    Stronger and Faster Wasserstein Adversarial Attacks

    Authors: Kaiwen Wu, Allen Houze Wang, Yaoliang Yu

    Abstract: Deep models, while being extremely flexible and accurate, are surprisingly vulnerable to "small, imperceptible" perturbations known as adversarial attacks. While the majority of existing attacks focus on measuring perturbations under the $\ell_p$ metric, Wasserstein distance, which takes geometry in pixel space into account, has long been known to be a suitable metric for measuring image quality a…

    Submitted 6 August, 2020; originally announced August 2020.

    Comments: 30 pages, accepted to ICML 2020

  48. arXiv:2007.13860  [pdf]

    stat.ML cs.LG eess.IV

    Additive Tensor Decomposition Considering Structural Data Information

    Authors: Shancong Mou, Andi Wang, Chuck Zhang, Jianjun Shi

    Abstract: Tensor data with rich structural information becomes increasingly important in process modeling, monitoring, and diagnosis. Here structural information refers to structural properties such as sparsity, smoothness, low-rank, and piecewise constancy. To reveal useful information from tensor data, we propose to decompose the tensor into the summation of multiple components based on different str…

    Submitted 27 July, 2020; originally announced July 2020.

    Comments: This work has been submitted to the IEEE for possible publication

  49. arXiv:2006.05267  [pdf]

    cs.CY cs.LG stat.ML

    Quantum Criticism: A Tagged News Corpus Analysed for Sentiment and Named Entities

    Authors: Ashwini Badgujar, Sheng Chen, Andrew Wang, Kai Yu, Paul Intrevado, David Guy Brizan

    Abstract: In this research, we continuously collect data from the RSS feeds of traditional news sources. We apply several pre-trained implementations of named entity recognition (NER) tools, quantifying the success of each implementation. We also perform sentiment analysis of each news article at the document, paragraph and sentence level, with the goal of creating a corpus of tagged news articles that is m…

    Submitted 5 June, 2020; originally announced June 2020.

  50. arXiv:2005.13458  [pdf, other]

    cs.RO cs.LG stat.ML

    Fast Risk Assessment for Autonomous Vehicles Using Learned Models of Agent Futures

    Authors: Allen Wang, Xin Huang, Ashkan Jasour, Brian Williams

    Abstract: This paper presents fast non-sampling based methods to assess the risk of trajectories for autonomous vehicles when probabilistic predictions of other agents' futures are generated by deep neural networks (DNNs). The presented methods address a wide range of representations for uncertain predictions including both Gaussian and non-Gaussian mixture models for predictions of both agent positions and…

    Submitted 3 June, 2020; v1 submitted 27 May, 2020; originally announced May 2020.

    Comments: To appear in Robotics: Science and Systems