Showing 1–27 of 27 results for author: Dai, M

Searching in archive stat.
  1. arXiv:2205.04151  [pdf, other]

    stat.ML cs.LG

    Learning effective dynamics from data-driven stochastic systems

    Authors: Lingyu Feng, Ting Gao, Min Dai, Jinqiao Duan

    Abstract: Multiscale stochastic dynamical systems have been widely adopted to a variety of scientific and engineering problems due to their capability of depicting complex phenomena in many real world applications. This work is devoted to investigating the effective dynamics for slow-fast stochastic dynamical systems. Given observation data on a short-term period satisfying some unknown slow-fast stochastic…

    Submitted 29 December, 2023; v1 submitted 9 May, 2022; originally announced May 2022.

  2. arXiv:2111.07109  [pdf, other]

    cs.LG stat.ML

    Nyström Regularization for Time Series Forecasting

    Authors: Zirui Sun, Mingwei Dai, Yao Wang, Shao-Bo Lin

    Abstract: This paper focuses on learning rate analysis of Nyström regularization with sequential sub-sampling for $τ$-mixing time series. Using a recently developed Banach-valued Bernstein inequality for $τ$-mixing sequences and an integral operator approach based on second-order decomposition, we succeed in deriving almost optimal learning rates of Nyström regularization with sequential sub-sampling for…

    Submitted 13 November, 2021; originally announced November 2021.

    Comments: 35 pages

  3. arXiv:2109.03378  [pdf, other]

    stat.ML cs.LG

    Rethinking Multidimensional Discriminator Output for Generative Adversarial Networks

    Authors: Mengyu Dai, Haibin Hang, Anuj Srivastava

    Abstract: The study of multidimensional discriminator (critic) output for Generative Adversarial Networks has been underexplored in the literature. In this paper, we generalize the Wasserstein GAN framework to take advantage of multidimensional critic output and explore its properties. We also introduce a square-root velocity transformation (SRVT) block which favors training in the multidimensional setting.…

    Submitted 14 July, 2022; v1 submitted 7 September, 2021; originally announced September 2021.

    Comments: Frontiers in Adversarial Machine Learning ICML 2022

  4. arXiv:2103.15023  [pdf, other]

    stat.ME

    Nonparametric tests for treatment effect heterogeneity in observational studies

    Authors: Maozhu Dai, Weining Shen, Hal S. Stern

    Abstract: We consider the problem of testing for treatment effect heterogeneity in observational studies, and propose a nonparametric test based on multisample U-statistics. To account for potential confounders, we use reweighted data where the weights are determined by estimated propensity scores. The proposed method does not require any parametric assumptions on the outcomes and bypasses the need for mode…

    Submitted 27 March, 2021; originally announced March 2021.

  5. arXiv:2012.03432  [pdf, other]

    stat.ME

    A U-statistic-based test of treatment effect heterogeneity

    Authors: Maozhu Dai, Hal S. Stern

    Abstract: Many studies include a goal of determining whether there is treatment effect heterogeneity across different subpopulations. In this paper, we propose a U-statistic-based non-parametric test of the null hypothesis that the treatment effects are identical in different subgroups. The proposed test provides more power than the standard parametric test when the underlying distribution assumptions of th…

    Submitted 6 December, 2020; originally announced December 2020.

  6. arXiv:2010.06610  [pdf, other]

    cs.LG cs.CV stat.ML

    Training independent subnetworks for robust prediction

    Authors: Marton Havasi, Rodolphe Jenatton, Stanislav Fort, Jeremiah Zhe Liu, Jasper Snoek, Balaji Lakshminarayanan, Andrew M. Dai, Dustin Tran

    Abstract: Recent approaches to efficiently ensemble neural networks have shown that strong robustness and uncertainty performance can be achieved with a negligible gain in parameters over the original network. However, these methods still require multiple forward passes for prediction, leading to a significant computational cost. In this work, we show a surprising result: the benefits of using multiple pred…

    Submitted 4 August, 2021; v1 submitted 13 October, 2020; originally announced October 2020.

    Comments: Updated to the ICLR camera ready version, added reference to Soflaei et al. 2020

  7. arXiv:2007.05189  [pdf, other]

    cs.LG math.OC stat.ML

    Learning Unstable Dynamical Systems with Time-Weighted Logarithmic Loss

    Authors: Kamil Nar, Yuan Xue, Andrew M. Dai

    Abstract: When training the parameters of a linear dynamical model, the gradient descent algorithm is likely to fail to converge if the squared-error loss is used as the training loss function. Restricting the parameter space to a smaller subset and running the gradient descent algorithm within this subset can allow learning stable dynamical systems, but this strategy does not work for unstable systems. In…

    Submitted 10 July, 2020; originally announced July 2020.

  8. arXiv:1912.00589  [pdf, other]

    stat.ML cs.CV cs.LG

    Flow Contrastive Estimation of Energy-Based Models

    Authors: Ruiqi Gao, Erik Nijkamp, Diederik P. Kingma, Zhen Xu, Andrew M. Dai, Ying Nian Wu

    Abstract: This paper studies a training method to jointly estimate an energy-based model and a flow-based model, in which the two models are iteratively updated based on a shared adversarial value function. This joint training method has the following traits. (1) The update of the energy-based model is based on noise contrastive estimation, with the flow model serving as a strong noise distribution. (2) The…

    Submitted 1 April, 2020; v1 submitted 2 December, 2019; originally announced December 2019.

  9. arXiv:1911.06410  [pdf, other]

    cs.LG cs.CY stat.ML

    Modelling EHR timeseries by restricting feature interaction

    Authors: Kun Zhang, Yuan Xue, Gerardo Flores, Alvin Rajkomar, Claire Cui, Andrew M. Dai

    Abstract: Time series data are prevalent in electronic health records, mostly in the form of physiological parameters such as vital signs and lab tests. The patterns of these values may be significant indicators of patients' clinical states and there might be patterns that are unknown to clinicians but are highly predictive of some outcomes. Many of these values are also missing which makes it difficult to…

    Submitted 14 November, 2019; originally announced November 2019.

    Comments: Machine Learning for Health (ML4H) at NeurIPS 2019 - Extended Abstract

  10. arXiv:1911.05861  [pdf, other]

    cs.LG stat.ML

    Federated and Differentially Private Learning for Electronic Health Records

    Authors: Stephen R. Pfohl, Andrew M. Dai, Katherine Heller

    Abstract: The use of collaborative and decentralized machine learning techniques such as federated learning has the potential to enable the development and deployment of clinical risk prediction models in low-resource settings without requiring sensitive data be shared or stored in a central repository. This process necessitates communication of model weights or updates between collaborating entities, but…

    Submitted 13 November, 2019; originally announced November 2019.

    Comments: Machine Learning for Health (ML4H) at NeurIPS 2019 - Extended Abstract

  11. arXiv:1910.11424  [pdf, other]

    cs.CL cs.AI cs.LG cs.MA stat.ML

    Capacity, Bandwidth, and Compositionality in Emergent Language Learning

    Authors: Cinjon Resnick, Abhinav Gupta, Jakob Foerster, Andrew M. Dai, Kyunghyun Cho

    Abstract: Many recent works have discussed the propensity, or lack thereof, for emergent languages to exhibit properties of natural languages. A favorite in the literature is learning compositionality. We note that most of those works have focused on communicative bandwidth as being of primary importance. While important, it is not the only contributing factor. In this paper, we investigate the learning bia…

    Submitted 15 April, 2020; v1 submitted 24 October, 2019; originally announced October 2019.

    Comments: The first two authors contributed equally. Accepted at AAMAS 2020

  12. arXiv:1909.09712  [pdf, other]

    cs.LG stat.ML

    Learning an Adaptive Learning Rate Schedule

    Authors: Zhen Xu, Andrew M. Dai, Jonas Kemp, Luke Metz

    Abstract: The learning rate is one of the most important hyper-parameters for model training and generalization. However, current hand-designed parametric learning rate schedules offer limited flexibility and the predefined schedule may not match the training dynamics of high dimensional and non-convex optimization problems. In this paper, we propose a reinforcement learning based framework that can automat…

    Submitted 20 September, 2019; originally announced September 2019.

  13. arXiv:1909.03039  [pdf, other]

    cs.LG cs.CL stat.ML

    Improved Hierarchical Patient Classification with Language Model Pretraining over Clinical Notes

    Authors: Jonas Kemp, Alvin Rajkomar, Andrew M. Dai

    Abstract: Clinical notes in electronic health records contain highly heterogeneous writing styles, including non-standard terminology or abbreviations. Using these notes in predictive modeling has traditionally required preprocessing (e.g. taking frequent terms or topic modeling) that removes much of the richness of the source data. We propose a pretrained hierarchical recurrent neural network model that pa…

    Submitted 14 November, 2019; v1 submitted 6 September, 2019; originally announced September 2019.

    Comments: Machine Learning for Health (ML4H) at NeurIPS 2019 - extended abstract

  14. arXiv:1906.04716  [pdf, other]

    cs.LG stat.ML

    Learning the Graphical Structure of Electronic Health Records with Graph Convolutional Transformer

    Authors: Edward Choi, Zhen Xu, Yujia Li, Michael W. Dusenberry, Gerardo Flores, Yuan Xue, Andrew M. Dai

    Abstract: Effective modeling of electronic health records (EHR) is rapidly becoming an important topic in both academia and industry. A recent study showed that using the graphical structure underlying EHR data (e.g. relationship between diagnoses and treatments) improves the performance of prediction tasks such as heart failure prediction. However, EHR data do not always contain complete structure informat…

    Submitted 19 January, 2020; v1 submitted 11 June, 2019; originally announced June 2019.

    Comments: To be presented at AAAI 2020

  15. Analyzing the Role of Model Uncertainty for Electronic Health Records

    Authors: Michael W. Dusenberry, Dustin Tran, Edward Choi, Jonas Kemp, Jeremy Nixon, Ghassen Jerfel, Katherine Heller, Andrew M. Dai

    Abstract: In medicine, both ethical and monetary costs of incorrect predictions can be significant, and the complexity of the problems often necessitates increasingly complex models. Recent work has shown that changing just the random seed is enough for otherwise well-tuned deep neural networks to vary in their individual predicted probabilities. In light of this, we investigate the role of model uncertaint…

    Submitted 25 March, 2020; v1 submitted 10 June, 2019; originally announced June 2019.

    Comments: Published in the ACM Conference on Health, Inference, and Learning (CHIL) 2020. Code available at https://github.com/Google-Health/records-research

  16. arXiv:1902.03525  [pdf, other]

    stat.ME

    BOLT-SSI: A Statistical Approach to Screening Interaction Effects for Ultra-High Dimensional Data

    Authors: Min Zhou, Mingwei Dai, Yuan Yao, Jin Liu, Can Yang, Heng Peng

    Abstract: Detecting interaction effects among predictors on the response variable is a crucial step in various applications. In this paper, we first propose a simple method for sure screening interactions (SSI). Although its computational complexity is $O(p^2n)$, SSI works well for problems of moderate dimensionality (e.g., $p=10^3\sim10^4$), without the heredity assumption. To ultra-high dimensional problems…

    Submitted 15 December, 2020; v1 submitted 9 February, 2019; originally announced February 2019.

    Comments: 56 pages, 7 figures

  17. arXiv:1809.04281  [pdf, other]

    cs.LG cs.SD eess.AS stat.ML

    Music Transformer

    Authors: Cheng-Zhi Anna Huang, Ashish Vaswani, Jakob Uszkoreit, Noam Shazeer, Ian Simon, Curtis Hawthorne, Andrew M. Dai, Matthew D. Hoffman, Monica Dinculescu, Douglas Eck

    Abstract: Music relies heavily on repetition to build structure and meaning. Self-reference occurs on multiple timescales, from motifs to phrases to reusing of entire sections of music, such as in pieces with ABA structure. The Transformer (Vaswani et al., 2017), a sequence model based on self-attention, has achieved compelling results in many generation tasks that require maintaining long-range coherence.…

    Submitted 12 December, 2018; v1 submitted 12 September, 2018; originally announced September 2018.

    Comments: Improved skewing section and accompanying figures. Previous titles are "An Improved Relative Self-Attention Mechanism for Transformer with Application to Music Generation" and "Music Transformer"

  18. arXiv:1808.06576  [pdf, other]

    q-bio.QM stat.ML

    Peptide-Spectra Matching from Weak Supervision

    Authors: Samuel S. Schoenholz, Sean Hackett, Laura Deming, Eugene Melamud, Navdeep Jaitly, Fiona McAllister, Jonathon O'Brien, George Dahl, Bryson Bennett, Andrew M. Dai, Daphne Koller

    Abstract: As in many other scientific domains, we face a fundamental problem when using machine learning to identify proteins from mass spectrometry data: large ground truth datasets mapping inputs to correct outputs are extremely difficult to obtain. Instead, we have access to imperfect hand-coded models crafted by domain experts. In this paper, we apply deep neural networks to an important step of the pro…

    Submitted 22 August, 2018; v1 submitted 20 August, 2018; originally announced August 2018.

  19. arXiv:1804.11011  [pdf, other]

    q-bio.GN stat.ME

    Joint Analysis of Individual-level and Summary-level GWAS Data by Leveraging Pleiotropy

    Authors: Mingwei Dai, Xiang Wan, Hao Peng, Yao Wang, Yue Liu, Jin Liu, Zongben Xu, Can Yang

    Abstract: A large number of recent genome-wide association studies (GWASs) for complex phenotypes confirm the early conjecture for polygenicity, suggesting the presence of a large number of variants with only tiny or moderate effects. However, due to the limited sample size of a single GWAS, many associated genetic variants are too weak to achieve genome-wide significance. These undiscovered variants furt…

    Submitted 29 April, 2018; originally announced April 2018.

    Comments: 32 pages, 11 figures, 2 tables

  20. arXiv:1803.10439  [pdf, other]

    stat.AP

    BIVAS: A scalable Bayesian method for bi-level variable selection with applications

    Authors: Mingxuan Cai, Mingwei Dai, Jingsi Ming, Heng Peng, Jin Liu, Can Yang

    Abstract: In this paper, we consider a Bayesian bi-level variable selection problem in high-dimensional regressions. In many practical situations, it is natural to assign group membership to each predictor. Examples include that genetic variants can be grouped at the gene level and a covariate from different tasks naturally forms a group. Thus, it is of interest to select important groups as well as importa…

    Submitted 28 March, 2018; originally announced March 2018.

  21. arXiv:1803.00144  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Learning Longer-term Dependencies in RNNs with Auxiliary Losses

    Authors: Trieu H. Trinh, Andrew M. Dai, Minh-Thang Luong, Quoc V. Le

    Abstract: Despite recent advances in training recurrent neural networks (RNNs), capturing long-term dependencies in sequences remains a fundamental challenge. Most approaches use backpropagation through time (BPTT), which is difficult to scale to very long sequences. This paper proposes a simple method that improves the ability to capture long term dependencies in RNNs by adding an unsupervised auxiliary lo…

    Submitted 13 June, 2018; v1 submitted 28 February, 2018; originally announced March 2018.

    Comments: ICML 2018

  22. arXiv:1801.07736  [pdf, other]

    stat.ML cs.AI cs.LG

    MaskGAN: Better Text Generation via Filling in the______

    Authors: William Fedus, Ian Goodfellow, Andrew M. Dai

    Abstract: Neural text generation models are often autoregressive language models or seq2seq models. These models generate text by sampling words sequentially, with each word conditioned on the previous word, and are state-of-the-art for several machine translation and summarization benchmarks. These benchmarks are often defined by validation perplexity even though this is not a direct measure of the quality…

    Submitted 1 March, 2018; v1 submitted 23 January, 2018; originally announced January 2018.

    Comments: 16 pages, ICLR 2018

  23. arXiv:1710.09551  [pdf, other]

    stat.ME

    LPG: a four-groups probabilistic approach to leveraging pleiotropy in genome-wide association studies

    Authors: Yi Yang, Mingwei Dai, Jian Huang, Xinyi Lin, Can Yang, Jin Liu, Min Chen

    Abstract: To date, genome-wide association studies (GWAS) have successfully identified tens of thousands of genetic variants among a variety of traits/diseases, shedding light on the genetic architecture of complex diseases. Polygenicity of complex diseases, which refers to the phenomenon that a vast number of risk variants collectively contribute to the heritability of complex diseases with modest indivi…

    Submitted 26 October, 2017; originally announced October 2017.

    Comments: 81 pages (including supplementary)

  24. arXiv:1710.08446  [pdf, other]

    stat.ML cs.LG

    Many Paths to Equilibrium: GANs Do Not Need to Decrease a Divergence At Every Step

    Authors: William Fedus, Mihaela Rosca, Balaji Lakshminarayanan, Andrew M. Dai, Shakir Mohamed, Ian Goodfellow

    Abstract: Generative adversarial networks (GANs) are a family of generative models that do not minimize a single training criterion. Unlike other generative models, the data distribution is learned via a game between a generator (the generative model) and a discriminator (a teacher providing training signal) that each minimize their own cost. GANs are designed to reach a Nash equilibrium at which each playe…

    Submitted 20 February, 2018; v1 submitted 23 October, 2017; originally announced October 2017.

    Comments: 18 pages

  25. arXiv:1710.07201  [pdf, other]

    stat.ME q-bio.GN

    LSMM: A statistical approach to integrating functional annotations with genome-wide association studies

    Authors: Jingsi Ming, Mingwei Dai, Mingxuan Cai, Xiang Wan, Jin Liu, Can Yang

    Abstract: Thousands of risk variants underlying complex phenotypes (quantitative traits and diseases) have been identified in genome-wide association studies (GWAS). However, there are still two major challenges towards deepening our understanding of the genetic architectures of complex phenotypes. First, the majority of GWAS hits are in the non-coding region and their biological interpretation is still unc…

    Submitted 19 October, 2017; originally announced October 2017.

  26. arXiv:1605.07725  [pdf, ps, other]

    stat.ML cs.LG

    Adversarial Training Methods for Semi-Supervised Text Classification

    Authors: Takeru Miyato, Andrew M. Dai, Ian Goodfellow

    Abstract: Adversarial training provides a means of regularizing supervised learning algorithms while virtual adversarial training is able to extend supervised learning algorithms to the semi-supervised setting. However, both methods require making small perturbations to numerous entries of the input vector, which is inappropriate for sparse high-dimensional inputs such as one-hot word representations. We ex…

    Submitted 16 November, 2021; v1 submitted 25 May, 2016; originally announced May 2016.

    Comments: Published as a conference paper at ICLR 2017

  27. The supervised hierarchical Dirichlet process

    Authors: Andrew M. Dai, Amos J. Storkey

    Abstract: We propose the supervised hierarchical Dirichlet process (sHDP), a nonparametric generative model for the joint distribution of a group of observations and a response variable directly associated with that whole group. We compare the sHDP with another leading method for regression on grouped data, the supervised latent Dirichlet allocation (sLDA) model. We evaluate our method on two real-world cla…

    Submitted 16 December, 2014; originally announced December 2014.

    Comments: 14 pages