Showing 1–50 of 89 results for author: Gao, X

Searching in archive stat.
  1. arXiv:2506.12489  [pdf, ps, other]

    stat.ME stat.AP

    Truncated Cauchy Combination Test: a Robust and Powerful P-value Combination Method with Arbitrary Correlations

    Authors: Bo Chen, Wei Xu, Xin Gao

    Abstract: The Cauchy combination test has been widely used for combining correlated p-values, but it may fail to work under certain scenarios. We propose a truncated Cauchy combination test (TCCT) which focuses on combining p-values with arbitrary correlations, and demonstrate that our proposed test solves the limitations of the Cauchy combination test and always has higher power. We prove that the tail probability o…

    Submitted 14 June, 2025; originally announced June 2025.

  2. arXiv:2505.21074  [pdf, ps, other]

    cs.LG cs.AI cs.CR cs.CV stat.ML

    Red-Teaming Text-to-Image Systems by Rule-based Preference Modeling

    Authors: Yichuan Cao, Yibo Miao, Xiao-Shan Gao, Yinpeng Dong

    Abstract: Text-to-image (T2I) models raise ethical and safety concerns due to their potential to generate inappropriate or harmful images. Evaluating these models' security through red-teaming is vital, yet white-box approaches are limited by their need for internal access, complicating their use with closed-source models. Moreover, existing black-box methods often assume knowledge about the model's specifi…

    Submitted 27 May, 2025; originally announced May 2025.

  3. arXiv:2505.09783  [pdf]

    stat.AP cs.LG

    Pure Component Property Estimation Framework Using Explainable Machine Learning Methods

    Authors: Jianfeng Jiao, Xi Gao, Jie Li

    Abstract: Accurate prediction of pure component physiochemical properties is crucial for process integration, multiscale modeling, and optimization. In this work, an enhanced framework for pure component property prediction by using explainable machine learning methods is proposed. In this framework, the molecular representation method based on the connectivity matrix effectively considers atomic bonding re…

    Submitted 14 May, 2025; originally announced May 2025.

  4. arXiv:2502.17684  [pdf, other]

    stat.ME

    High-Dimensional Covariate-Dependent Gaussian Graphical Models

    Authors: Jiacheng Wang, Xin Gao

    Abstract: Motivated by dynamic biologic network analysis, we propose a covariate-dependent Gaussian graphical model (cdexGGM) for capturing network structure that varies with covariates through a novel parameterization. Utilizing a likelihood framework, our methodology jointly estimates all dynamic edge and vertex parameters. We further develop statistical inference procedures to test the dynamic nature of…

    Submitted 24 February, 2025; originally announced February 2025.

  5. arXiv:2412.15315  [pdf, other]

    stat.ML cs.LG

    Enhancing Masked Time-Series Modeling via Dropping Patches

    Authors: Tianyu Qiu, Yi Xie, Yun Xiong, Hao Niu, Xiaofeng Gao

    Abstract: This paper explores how to enhance existing masked time-series modeling by randomly dropping sub-sequence level patches of time series. On this basis, a simple yet effective method named DropPatch is proposed, which has two remarkable advantages: 1) It improves the pre-training efficiency by a square-level advantage; 2) It provides additional advantages for modeling in scenarios such as in-domain,…

    Submitted 19 December, 2024; originally announced December 2024.

  6. arXiv:2412.08604  [pdf, other]

    cs.IR cs.AI cs.LG stat.ML

    Preference Discerning with LLM-Enhanced Generative Retrieval

    Authors: Fabian Paischer, Liu Yang, Linfeng Liu, Shuai Shao, Kaveh Hassani, Jiacheng Li, Ricky Chen, Zhang Gabriel Li, Xialo Gao, Wei Shao, Xue Feng, Nima Noorshams, Sem Park, Bo Long, Hamid Eghbalzadeh

    Abstract: Sequential recommendation systems aim to provide personalized recommendations for users based on their interaction history. To achieve this, they often incorporate auxiliary information, such as textual descriptions of items and auxiliary tasks, like predicting user preferences and intent. Despite numerous efforts to enhance these models, they still suffer from limited personalization. To address…

    Submitted 11 December, 2024; originally announced December 2024.

    Comments: 11 pages + references and appendix

  7. arXiv:2409.03986  [pdf, other]

    cs.LG stat.ML

    An Efficient and Generalizable Symbolic Regression Method for Time Series Analysis

    Authors: Yi Xie, Tianyu Qiu, Yun Xiong, Xiuqi Huang, Xiaofeng Gao, Chao Chen

    Abstract: Time series analysis and prediction methods currently excel in quantitative analysis, offering accurate future predictions and diverse statistical indicators, but generally falling short in elucidating the underlying evolution patterns of time series. To gain a more comprehensive understanding and provide insightful explanations, we utilize symbolic regression techniques to derive explicit express…

    Submitted 5 September, 2024; originally announced September 2024.

  8. arXiv:2405.19098  [pdf, other]

    cs.LG cs.AI cs.CR cs.CV stat.ML

    Efficient Black-box Adversarial Attacks via Bayesian Optimization Guided by a Function Prior

    Authors: Shuyu Cheng, Yibo Miao, Yinpeng Dong, Xiao Yang, Xiao-Shan Gao, Jun Zhu

    Abstract: This paper studies the challenging black-box adversarial attack that aims to generate adversarial examples against a black-box model by only using output feedback of the model to input queries. Some previous methods improve the query efficiency by incorporating the gradient of a surrogate white-box model into query-based attacks due to the adversarial transferability. However, the localized gradie…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  9. arXiv:2402.04010  [pdf, other]

    cs.LG stat.ML

    Efficient Availability Attacks against Supervised and Contrastive Learning Simultaneously

    Authors: Yihan Wang, Yifan Zhu, Xiao-Shan Gao

    Abstract: Availability attacks can prevent the unauthorized use of private data and commercial datasets by generating imperceptible noise and making unlearnable examples before release. Ideally, the obtained unlearnability prevents algorithms from training usable models. When supervised learning (SL) algorithms have failed, a malicious data collector possibly resorts to contrastive learning (CL) algorithms…

    Submitted 6 February, 2024; originally announced February 2024.

  10. arXiv:2401.17958  [pdf, ps, other]

    stat.ML cs.LG math.PR

    Convergence Analysis for General Probability Flow ODEs of Diffusion Models in Wasserstein Distances

    Authors: Xuefeng Gao, Lingjiong Zhu

    Abstract: Score-based generative modeling with probability flow ordinary differential equations (ODEs) has achieved remarkable success in a variety of applications. While various fast ODE-based samplers have been proposed in the literature and employed in practice, the theoretical understandings about convergence properties of the probability flow ODE are still quite limited. In this paper, we provide the f…

    Submitted 15 February, 2025; v1 submitted 31 January, 2024; originally announced January 2024.

    Comments: 48 pages, 2 tables

  11. arXiv:2401.17523  [pdf, other]

    cs.LG cs.CR stat.ML

    Game-Theoretic Unlearnable Example Generator

    Authors: Shuang Liu, Yihan Wang, Xiao-Shan Gao

    Abstract: Unlearnable example attacks are data poisoning attacks aiming to degrade the clean test accuracy of deep learning by adding imperceptible perturbations to the training samples, which can be formulated as a bi-level optimization problem. However, directly solving this optimization problem is intractable for deep neural networks. In this paper, we investigate unlearnable example attacks from a game-…

    Submitted 30 January, 2024; originally announced January 2024.

  12. arXiv:2311.11003  [pdf, other]

    cs.LG math.PR stat.ML

    Wasserstein Convergence Guarantees for a General Class of Score-Based Generative Models

    Authors: Xuefeng Gao, Hoang M. Nguyen, Lingjiong Zhu

    Abstract: Score-based generative models (SGMs) are a recent class of deep generative models with state-of-the-art performance in many applications. In this paper, we establish convergence guarantees for a general class of SGMs in 2-Wasserstein distance, assuming accurate score estimates and smooth log-concave data distribution. We specialize our result to several concrete SGMs with specific choices of forwar…

    Submitted 15 February, 2025; v1 submitted 18 November, 2023; originally announced November 2023.

  13. arXiv:2311.02221  [pdf, other]

    cs.LG stat.ML

    Structured Neural Networks for Density Estimation and Causal Inference

    Authors: Asic Q. Chen, Ruian Shi, Xiang Gao, Ricardo Baptista, Rahul G. Krishnan

    Abstract: Injecting structure into neural networks enables learning functions that satisfy invariances with respect to subsets of inputs. For instance, when learning generative models using neural networks, it is advantageous to encode the conditional independence structure of observed variables, often in the form of Bayesian networks. We propose the Structured Neural Network (StrNN), which injects structur…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: 10 pages with 5 figures, to be published in Neural Information Processing Systems 2023

  14. arXiv:2309.05359  [pdf]

    stat.ME stat.AP

    A Note on Location Parameter Estimation using the Weighted Hodges-Lehmann Estimator

    Authors: Xuehong Gao, Zhijin Chen, Bosung Kim, Chanseok Park

    Abstract: Robust design is one of the main tools employed by engineers for the facilitation of the design of high-quality processes. However, most real-world processes invariably contend with external uncontrollable factors, often denoted as outliers or contaminated data, which exert a substantial distorting effect upon the computed sample mean. In pursuit of mitigating the inherent bias entailed by outlier…

    Submitted 11 September, 2023; originally announced September 2023.

  15. arXiv:2307.01389  [pdf, other]

    cs.LG stat.ME

    Identification of Causal Relationship between Amyloid-beta Accumulation and Alzheimer's Disease Progression via Counterfactual Inference

    Authors: Haixing Dai, Mengxuan Hu, Qing Li, Lu Zhang, Lin Zhao, Dajiang Zhu, Ibai Diez, Jorge Sepulcre, Fan Zhang, Xingyu Gao, Manhua Liu, Quanzheng Li, Sheng Li, Tianming Liu, Xiang Li

    Abstract: Alzheimer's disease (AD) is a neurodegenerative disorder that begins with amyloidosis, followed by neuronal loss and deterioration in structure, function, and cognition. The accumulation of amyloid-beta in the brain, measured through 18F-florbetapir (AV45) positron emission tomography (PET) imaging, has been widely used for early diagnosis of AD. However, the relationship between amyloid-bet…

    Submitted 3 July, 2023; originally announced July 2023.

  16. arXiv:2306.09882  [pdf, other]

    cs.LG stat.ML stat.OT

    Uncertainty Quantification via Spatial-Temporal Tweedie Model for Zero-inflated and Long-tail Travel Demand Prediction

    Authors: Xinke Jiang, Dingyi Zhuang, Xianghui Zhang, Hao Chen, Jiayuan Luo, Xiaowei Gao

    Abstract: Understanding Origin-Destination (O-D) travel demand is crucial for transportation management. However, traditional spatial-temporal deep learning models grapple with addressing the sparse and long-tail characteristics in high-resolution O-D matrices and quantifying prediction uncertainty. This dilemma arises from the numerous zeros and over-dispersed demand patterns within these matrices, which c…

    Submitted 30 January, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: In Proceedings of CIKM 2023. DOI: https://dl.acm.org/doi/10.1145/3583780.3615215

  17. arXiv:2304.12460  [pdf, other]

    stat.ME stat.AP

    Functional Causal Inference with Time-to-Event Data

    Authors: Xiyuan Gao, Jiayi Wang, Guanyu Hu, Jianguo Sun

    Abstract: Functional data is a powerful tool for capturing and analyzing complex patterns and relationships in a variety of fields, allowing for more precise modeling, visualization, and decision-making. For example, in healthcare, functional data such as medical images can help doctors make more accurate diagnoses and develop more effective treatment plans. However, understanding the causal relationships b…

    Submitted 24 April, 2023; originally announced April 2023.

    Comments: 22 pages with references, 2 figures, and 3 tables in the main paper. Supplementary material is included

  18. arXiv:2210.15291  [pdf, other]

    cs.CV cs.AI cs.CR cs.LG stat.ML

    Isometric 3D Adversarial Examples in the Physical World

    Authors: Yibo Miao, Yinpeng Dong, Jun Zhu, Xiao-Shan Gao

    Abstract: 3D deep learning models are shown to be as vulnerable to adversarial examples as 2D models. However, existing attack methods are still far from stealthy and suffer from severe performance degradation in the physical world. Although 3D data is highly structured, it is difficult to bound the perturbations with simple metrics in the Euclidean space. In this paper, we propose a novel $ε$-isometric (…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022

  19. arXiv:2207.02346  [pdf, other]

    quant-ph cond-mat.dis-nn cond-mat.stat-mech cs.LG stat.ML

    Many-body localized hidden generative models

    Authors: Weishun Zhong, Xun Gao, Susanne F. Yelin, Khadijeh Najafi

    Abstract: Born machines are quantum-inspired generative models that leverage the probabilistic nature of quantum states. Here, we present a new architecture called many-body localized (MBL) hidden Born machine that utilizes both MBL dynamics and hidden units as learning resources. We show that the hidden units act as an effective thermal bath that enhances the trainability of the system, while the MBL dynam…

    Submitted 28 December, 2023; v1 submitted 5 July, 2022; originally announced July 2022.

    Comments: 13 pages, 11 figures; added references

  20. arXiv:2205.11168  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Logarithmic regret bounds for continuous-time average-reward Markov decision processes

    Authors: Xuefeng Gao, Xun Yu Zhou

    Abstract: We consider reinforcement learning for continuous-time Markov decision processes (MDPs) in the infinite-horizon, average-reward setting. In contrast to discrete-time MDPs, a continuous-time process moves to a state and stays there for a random holding time after an action is taken. With unknown transition probabilities and rates of exponential holding times, we derive instance-dependent regret low…

    Submitted 2 July, 2024; v1 submitted 23 May, 2022; originally announced May 2022.

  21. arXiv:2111.04404  [pdf, other]

    cs.LG cs.CR stat.ML

    Robust and Information-theoretically Safe Bias Classifier against Adversarial Attacks

    Authors: Lijia Yu, Xiao-Shan Gao

    Abstract: In this paper, the bias classifier is introduced, that is, the bias part of a DNN with Relu as the activation function is used as a classifier. The work is motivated by the fact that the bias part is a piecewise constant function with zero gradient and hence cannot be directly attacked by gradient-based methods to generate adversaries, such as FGSM. The existence of the bias classifier is proved a…

    Submitted 14 February, 2022; v1 submitted 8 November, 2021; originally announced November 2021.

  22. arXiv:2106.15927  [pdf, other]

    cs.LG stat.ML

    A Robust Classification-autoencoder to Defend Outliers and Adversaries

    Authors: Lijia Yu, Xiao-Shan Gao

    Abstract: In this paper, a robust classification-autoencoder (CAE) is proposed, which has strong ability to recognize outliers and defend adversaries. The main idea is to change the autoencoder from an unsupervised learning model into a classifier, where the encoder is used to compress samples with different labels into disjoint compression spaces and the decoder is used to recover samples from their compre…

    Submitted 7 June, 2022; v1 submitted 30 June, 2021; originally announced June 2021.

  23. arXiv:2106.09910  [pdf, other]

    cs.LG cs.AI eess.SP stat.ML

    Message Passing in Graph Convolution Networks via Adaptive Filter Banks

    Authors: Xing Gao, Wenrui Dai, Chenglin Li, Junni Zou, Hongkai Xiong, Pascal Frossard

    Abstract: Graph convolution networks, like message passing graph convolution networks (MPGCNs), have been a powerful tool in representation learning of networked data. However, when data is heterogeneous, most architectures are limited as they employ a single strategy to handle multi-channel graph signals and they typically focus on low-frequency information. In this paper, we present a novel graph convolut…

    Submitted 18 June, 2021; originally announced June 2021.

  24. arXiv:2105.00045  [pdf, other]

    stat.ME

    Estimation and Selection Properties of the LAD Fused Lasso Signal Approximator

    Authors: Xiaoli Gao

    Abstract: The fused lasso is an important method for signal processing when the hidden signals are sparse and blocky. It is often used in combination with the squared loss function. However, the squared loss is not suitable for heavy tail error distributions nor is robust against outliers which arise often in practice. The least absolute deviations (LAD) loss provides a robust alternative to the squared los…

    Submitted 30 April, 2021; originally announced May 2021.

  25. arXiv:2104.14827  [pdf, other]

    stat.ME

    Joint Linear Trend Recovery Using L1 Regularization

    Authors: Xiaoli Gao, Ejaz Ahmed

    Abstract: This paper studies the recovery of a joint piece-wise linear trend from a time series using L1 regularization approach, called L1 trend filtering (Kim, Koh and Boyd, 2009). We provide some sufficient conditions under which a L1 trend filter can be well-behaved in terms of mean estimation and change point detection. The result is two-fold: for the mean estimation, an almost optimal consistent rate…

    Submitted 30 April, 2021; originally announced April 2021.

  26. arXiv:2101.08354  [pdf, other]

    quant-ph cond-mat.stat-mech cs.LG stat.ML

    Enhancing Generative Models via Quantum Correlations

    Authors: Xun Gao, Eric R. Anschuetz, Sheng-Tao Wang, J. Ignacio Cirac, Mikhail D. Lukin

    Abstract: Generative modeling using samples drawn from the probability distribution constitutes a powerful approach for unsupervised machine learning. Quantum mechanical systems can produce probability distributions that exhibit quantum correlations which are difficult to capture using classical models. We show theoretically that such quantum correlations provide a powerful resource for generative modeling.…

    Submitted 20 January, 2021; originally announced January 2021.

    Comments: 25 pages, 13 figures

    Journal ref: Phys. Rev. X 12, 021037 (2022)

  27. arXiv:2101.02439  [pdf, other]

    stat.ME

    Addressing patient heterogeneity in disease predictive model development

    Authors: Xu Gao, Weining Shen, Jing Ning, Ziding Feng, Jianhua Hu

    Abstract: This paper addresses patient heterogeneity associated with prediction problems in biomedical applications. We propose a systematic hypothesis testing approach to determine the existence of patient subgroup structure and the number of subgroups in patient population if subgroups exist. A mixture of generalized linear models is considered to model the relationship between the disease outcome and pat…

    Submitted 7 January, 2021; originally announced January 2021.

  28. arXiv:2010.04912  [pdf, other]

    stat.ML cs.LG

    Improve the Robustness and Accuracy of Deep Neural Network with $L_{2,\infty}$ Normalization

    Authors: Lijia Yu, Xiao-Shan Gao

    Abstract: In this paper, the robustness and accuracy of the deep neural network (DNN) was enhanced by introducing the $L_{2,\infty}$ normalization of the weight matrices of the DNN with Relu as the activation function. It is proved that the $L_{2,\infty}$ normalization leads to large dihedral angles between two adjacent faces of the polyhedron graph of the DNN function and hence smoother DNN functions, whic…

    Submitted 10 October, 2020; originally announced October 2020.

  29. arXiv:2010.01986  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    A Spherical Hidden Markov Model for Semantics-Rich Human Mobility Modeling

    Authors: Wanzheng Zhu, Chao Zhang, Shuochao Yao, Xiaobin Gao, Jiawei Han

    Abstract: We study the problem of modeling human mobility from semantic trace data, wherein each GPS record in a trace is associated with a text message that describes the user's activity. Existing methods fall short in unveiling human movement regularities, because they either do not model the text data at all or suffer from text sparsity severely. We propose SHMM, a multi-modal spherical hidden Markov mod…

    Submitted 5 October, 2020; originally announced October 2020.

  30. arXiv:2007.16031  [pdf, other]

    stat.ME stat.AP

    Decomposition of the Total Effect for Two Mediators: A Natural Counterfactual Interaction Effect Framework

    Authors: Xin Gao, Li Li, Li Luo

    Abstract: Mediation analysis has been used in many disciplines to explain the mechanism or process that underlies an observed relationship between an exposure variable and an outcome variable via the inclusion of mediators. Decompositions of the total causal effect of an exposure variable into effects characterizing mediation pathways and interactions have gained an increasing amount of interest in the last…

    Submitted 30 July, 2020; originally announced July 2020.

    Comments: 112 pages, 6 figures. arXiv admin note: text overlap with arXiv:2004.06054

  31. arXiv:2007.00590  [pdf, other]

    stat.ML cs.LG math.OC

    Decentralized Stochastic Gradient Langevin Dynamics and Hamiltonian Monte Carlo

    Authors: Mert Gürbüzbalaban, Xuefeng Gao, Yuanhan Hu, Lingjiong Zhu

    Abstract: Stochastic gradient Langevin dynamics (SGLD) and stochastic gradient Hamiltonian Monte Carlo (SGHMC) are two popular Markov Chain Monte Carlo (MCMC) algorithms for Bayesian inference that can scale to large datasets, allowing to sample from the posterior distribution of the parameters of a statistical model given the input data and the prior distribution over the model parameters. However, these a…

    Submitted 26 August, 2021; v1 submitted 1 July, 2020; originally announced July 2020.

    MSC Class: Primary: 68W15; 62F15; 65C05; 62D05; 62L20; secondary: 60J20; 90C15

  32. arXiv:2006.11118  [pdf, other]

    cs.LG eess.SP stat.ML

    Graph Pooling with Node Proximity for Hierarchical Representation Learning

    Authors: Xing Gao, Wenrui Dai, Chenglin Li, Hongkai Xiong, Pascal Frossard

    Abstract: Graph neural networks have attracted wide attention for enabling representation learning of graph data in recent works. In complement to graph convolution operators, graph pooling is crucial for extracting hierarchical representation of graph data. However, most recent graph pooling methods still fail to efficiently exploit the geometry of graph data. In this paper, we propose a novel graph pooling…

    Submitted 19 June, 2020; originally announced June 2020.

  33. arXiv:2006.05664  [pdf, other]

    cs.LG cs.NE stat.ML

    OpEvo: An Evolutionary Method for Tensor Operator Optimization

    Authors: Xiaotian Gao, Cui Wei, Lintao Zhang, Mao Yang

    Abstract: Training and inference efficiency of deep neural networks highly rely on the performance of tensor operators on hardware platforms. Manually optimizing tensor operators has limitations in terms of supporting new operators or hardware platforms. Therefore, automatically optimizing device code configurations of tensor operators is getting increasingly attractive. However, current methods for tensor…

    Submitted 21 December, 2020; v1 submitted 10 June, 2020; originally announced June 2020.

    Comments: Accepted at AAAI 2021

  34. arXiv:2006.05082  [pdf, other]

    cs.LG stat.ML

    Learning to Stop While Learning to Predict

    Authors: Xinshi Chen, Hanjun Dai, Yu Li, Xin Gao, Le Song

    Abstract: There is a recent surge of interest in designing deep architectures based on the update steps in traditional algorithms, or learning neural networks to improve and replace traditional algorithms. While traditional algorithms have certain stopping criteria for outputting results at different iterations, many algorithm-inspired deep models are restricted to a "fixed-depth" for all inputs. Similar…

    Submitted 9 June, 2020; originally announced June 2020.

    Comments: Proceedings of the 37th International Conference on Machine Learning

  35. arXiv:2005.12901  [pdf, other]

    cs.LG stat.ML

    A Framework for Behavioral Biometric Authentication using Deep Metric Learning on Mobile Devices

    Authors: Cong Wang, Yanru Xiao, Xing Gao, Li Li, Jun Wang

    Abstract: Mobile authentication using behavioral biometrics has been an active area of research. Existing research relies on building machine learning classifiers to recognize an individual's unique patterns. However, these classifiers are not powerful enough to learn the discriminative features. When implemented on the mobile devices, they face new challenges from the behavioral dynamics, data privacy and…

    Submitted 17 August, 2020; v1 submitted 26 May, 2020; originally announced May 2020.

  36. arXiv:2004.06054  [pdf, other]

    stat.ME stat.AP stat.OT

    Decomposition of Total Effect with the Notion of Natural Counterfactual Interaction Effect

    Authors: Xin Gao, Li Li, Li Luo

    Abstract: Mediation analysis serves as a crucial tool to obtain causal inference based on directed acyclic graphs, which has been widely employed in the areas of biomedical science, social science, epidemiology and psychology. Decomposition of total effect provides a deep insight to fully understand the causal contribution from each path and interaction term. Since the four-way decomposition method was prop…

    Submitted 13 April, 2020; originally announced April 2020.

    Comments: 72 pages in total, 12 figures

  37. arXiv:2004.04092  [pdf, other]

    cs.CL cs.LG stat.ML

    Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space

    Authors: Chunyuan Li, Xiang Gao, Yuan Li, Baolin Peng, Xiujun Li, Yizhe Zhang, Jianfeng Gao

    Abstract: When trained effectively, the Variational Autoencoder (VAE) can be both a powerful generative model and an effective representation learning framework for natural language. In this paper, we propose the first large-scale language VAE model, Optimus. A universal latent embedding space for sentences is first pre-trained on large text corpus, and then fine-tuned for various language generation and un…

    Submitted 11 October, 2020; v1 submitted 5 April, 2020; originally announced April 2020.

    Comments: Accepted in EMNLP 2020; Code: https://github.com/ChunyuanLI/Optimus Demo: http://aka.ms/optimus

  38. arXiv:2004.02823  [pdf, other]

    math.OC stat.ML

    Non-Convex Optimization via Non-Reversible Stochastic Gradient Langevin Dynamics

    Authors: Yuanhan Hu, Xiaoyu Wang, Xuefeng Gao, Mert Gurbuzbalaban, Lingjiong Zhu

    Abstract: Stochastic Gradient Langevin Dynamics (SGLD) is a powerful algorithm for optimizing a non-convex objective, where a controlled and properly scaled Gaussian noise is added to the stochastic gradients to steer the iterates towards a global minimum. SGLD is based on the overdamped Langevin diffusion which is reversible in time. By adding an anti-symmetric matrix to the drift term of the overdamped La…

    Submitted 2 June, 2020; v1 submitted 6 April, 2020; originally announced April 2020.

    Comments: 45 pages

  39. arXiv:2004.00764  [pdf, other]

    stat.ME

    Bayesian model selection approach for colored graphical Gaussian models

    Authors: Qiong Li, Xin Gao, Helene Massam

    Abstract: We consider a class of colored graphical Gaussian models obtained by placing symmetry constraints on the precision matrix in a Bayesian framework. The prior distribution on the precision matrix is the colored $G$-Wishart prior which is the Diaconis-Ylvisaker conjugate prior. In this paper, we develop a computationally efficient model search algorithm which combines linear regression with a double…

    Submitted 1 April, 2020; originally announced April 2020.

    Comments: 35 pages

    MSC Class: 62H12 (Primary); 62F15 (Secondary)

  40. arXiv:2003.09676  [pdf, other]

    cs.LG stat.ML

    Probabilistic Dual Network Architecture Search on Graphs

    Authors: Yiren Zhao, Duo Wang, Xitong Gao, Robert Mullins, Pietro Lio, Mateja Jamnik

    Abstract: We present the first differentiable Network Architecture Search (NAS) for Graph Neural Networks (GNNs). GNNs show promising performance on a wide range of tasks, but require a large amount of architecture engineering. First, graphs are inherently a non-Euclidean and sophisticated data structure, leading to poor adaptivity of GNN architectures across different datasets. Second, a typical graph bloc…

    Submitted 21 March, 2020; originally announced March 2020.

  41. arXiv:2002.05810  [pdf, other]

    cs.LG stat.ML

    RNA Secondary Structure Prediction By Learning Unrolled Algorithms

    Authors: Xinshi Chen, Yu Li, Ramzan Umarov, Xin Gao, Le Song

    Abstract: In this paper, we propose an end-to-end deep learning model, called E2Efold, for RNA secondary structure prediction which can effectively take into account the inherent constraints in the problem. The key idea of E2Efold is to directly predict the RNA base-pairing matrix, and use an unrolled algorithm for constrained programming as the template for deep architectures to enforce constraints. With c…

    Submitted 13 February, 2020; originally announced February 2020.

    Comments: International Conference on Learning Representations 2020

    Journal ref: International Conference on Learning Representations 2020, https://openreview.net/forum?id=S1eALyrYDH

  42. arXiv:2001.09390  [pdf, ps, other]

    cs.LG stat.ML

    Regime Switching Bandits

    Authors: Xiang Zhou, Yi Xiong, Ningyuan Chen, Xuefeng Gao

    Abstract: We study a multi-armed bandit problem where the rewards exhibit regime switching. Specifically, the distributions of the random rewards generated from all arms are modulated by a common underlying state modeled as a finite-state Markov chain. The agent does not observe the underlying state and has to learn the transition matrix and the reward distributions. We propose a learning algorithm for this…

    Submitted 1 February, 2021; v1 submitted 25 January, 2020; originally announced January 2020.

  43. arXiv:2001.08277  [pdf, ps, other]

    cs.LG stat.ML

    Intermittent Pulling with Local Compensation for Communication-Efficient Federated Learning

    Authors: Haozhao Wang, Zhihao Qu, Song Guo, Xin Gao, Ruixuan Li, Baoliu Ye

    Abstract: Federated Learning is a powerful machine learning paradigm to cooperatively train a global model with highly distributed data. A major bottleneck on the performance of distributed Stochastic Gradient Descent (SGD) algorithm for large-scale Federated Learning is the communication overhead on pushing local gradients and pulling global model. In this paper, to reduce the communication complexity of F…

    Submitted 22 January, 2020; originally announced January 2020.

  44. arXiv:1912.06269  [pdf, other]

    math.OC eess.SY stat.ML

    Learning and Optimization with Bayesian Hybrid Models

    Authors: Elvis A. Eugene, Xian Gao, Alexander W. Dowling

    Abstract: Bayesian hybrid models fuse physics-based insights with machine learning constructs to correct for systematic bias. In this paper, we compare Bayesian hybrid models against physics-based glass-box and Gaussian process black-box surrogate models. We consider ballistic firing as an illustrative case study for a Bayesian decision-making workflow. First, Bayesian calibration is performed to estimate m…

    Submitted 12 December, 2019; originally announced December 2019.

    Comments: Submitted to 2020 American Control Conference

  45. arXiv:1911.12216  [pdf, other]

    cs.LG stat.ML

    ConCare: Personalized Clinical Feature Embedding via Capturing the Healthcare Context

    Authors: Liantao Ma, Chaohe Zhang, Yasha Wang, Wenjie Ruan, Jiantao Wang, Wen Tang, Xinyu Ma, Xin Gao, Junyi Gao

    Abstract: Predicting the patient's clinical outcome from the historical electronic medical records (EMR) is a fundamental research problem in medical informatics. Most deep learning-based solutions for EMR analysis concentrate on learning the clinical visit embedding and exploring the relations between visits. Although those works have shown superior performances in healthcare prediction, they fail to explo…

    Submitted 27 November, 2019; originally announced November 2019.

  46. arXiv:1911.12205  [pdf, other]

    cs.LG stat.ML

    AdaCare: Explainable Clinical Health Status Representation Learning via Scale-Adaptive Feature Extraction and Recalibration

    Authors: Liantao Ma, Junyi Gao, Yasha Wang, Chaohe Zhang, Jiangtao Wang, Wenjie Ruan, Wen Tang, Xin Gao, Xinyu Ma

    Abstract: Deep learning-based health status representation learning and clinical prediction have raised much research interest in recent years. Existing models have shown superior performance, but there are still several major issues that have not been fully taken into consideration. First, the historical variation pattern of the biomarker in diverse time scales plays a vital role in indicating the health s…

    Submitted 27 November, 2019; originally announced November 2019.

  47. A High-dimensional M-estimator Framework for Bi-level Variable Selection

    Authors: Bin Luo, Xiaoli Gao

    Abstract: In high-dimensional data analysis, bi-level sparsity is often assumed when covariates function group-wisely and sparsity can appear either at the group level or within certain groups. In such cases, an ideal model should be able to encourage the bi-level variable selection consistently. Bi-level variable selection has become even more challenging when data have heavy-tailed distribution or outlier…

    Submitted 10 September, 2021; v1 submitted 26 November, 2019; originally announced November 2019.

    Comments: Ann Inst Stat Math (2021). arXiv admin note: text overlap with arXiv:1910.09493

  48. arXiv:1910.12469  [pdf, other]

    cs.LG stat.ML

    Learning Latent Process from High-Dimensional Event Sequences via Efficient Sampling

    Authors: Qitian Wu, Zixuan Zhang, Xiaofeng Gao, Junchi Yan, Guihai Chen

    Abstract: We target modeling latent dynamics in high-dimension marked event sequences without any prior knowledge about marker relations. Such problem has been rarely studied by previous works which would have fundamental difficulty to handle the arisen challenges: 1) the high-dimensional markers and unknown relation network among them pose intractable obstacles for modeling the latent dynamic process; 2) o…

    Submitted 28 October, 2019; originally announced October 2019.

  49. arXiv:1910.08945  [pdf, ps, other]

    cs.LG stat.ML

    Online Bagging for Anytime Transfer Learning

    Authors: Guokun Chi, Min Jiang, Xing Gao, Weizhen Hu, Shihui Guo, Kay Chen Tan

    Abstract: Transfer learning techniques have been widely used in the reality that it is difficult to obtain sufficient labeled data in the target domain, but a large amount of auxiliary data can be obtained in the relevant source domain. But most of the existing methods are based on offline data. In practical applications, it is often necessary to face online learning problems in which the data samples are a…

    Submitted 20 October, 2019; originally announced October 2019.

    Comments: 7 pages; SSCI2019

  50. arXiv:1910.08211  [pdf, other]

    cs.LG stat.ML

    Differentiable Combinatorial Losses through Generalized Gradients of Linear Programs

    Authors: Xi Gao, Han Zhang, Aliakbar Panahi, Tom Arodz

    Abstract: When samples have internal structure, we often see a mismatch between the objective optimized during training and the model's goal during inference. For example, in sequence-to-sequence modeling we are interested in high-quality translated sentences, but training typically uses maximum likelihood at the word level. The natural training-time loss would involve a combinatorial problem -- dynamic pro…

    Submitted 2 October, 2020; v1 submitted 17 October, 2019; originally announced October 2019.