Showing 1–4 of 4 results for author: Ng, N

  1. arXiv:2209.05364  [pdf, other]

    cs.LG stat.ML

    If Influence Functions are the Answer, Then What is the Question?

    Authors: Juhan Bae, Nathan Ng, Alston Lo, Marzyeh Ghassemi, Roger Grosse

    Abstract: Influence functions efficiently estimate the effect of removing a single training data point on a model's learned parameters. While influence estimates align well with leave-one-out retraining for linear models, recent works have shown this alignment is often poor in neural networks. In this work, we investigate the specific factors that cause this discrepancy by decomposing it into five separate…

    Submitted 12 September, 2022; originally announced September 2022.

    Comments: 28 pages, 6 figures

  2. arXiv:2207.02093  [pdf, other]

    cs.LG stat.ML

    Predicting Out-of-Domain Generalization with Neighborhood Invariance

    Authors: Nathan Ng, Neha Hulkund, Kyunghyun Cho, Marzyeh Ghassemi

    Abstract: Developing and deploying machine learning models safely depends on the ability to characterize and compare their abilities to generalize to new environments. Although recent work has proposed a variety of methods that can directly predict or theoretically bound the generalization capacity of a model, they rely on strong assumptions such as matching train/test distributions and access to model grad…

    Submitted 17 July, 2023; v1 submitted 5 July, 2022; originally announced July 2022.

    Comments: 38 pages, 5 figures, 28 tables

  3. arXiv:2009.10195  [pdf, other]

    cs.CL cs.LG stat.ML

    SSMBA: Self-Supervised Manifold Based Data Augmentation for Improving Out-of-Domain Robustness

    Authors: Nathan Ng, Kyunghyun Cho, Marzyeh Ghassemi

    Abstract: Models that perform well on a training domain often fail to generalize to out-of-domain (OOD) examples. Data augmentation is a common method used to prevent overfitting and improve OOD generalization. However, in natural language, it is difficult to generate new examples that stay on the underlying data manifold. We introduce SSMBA, a data augmentation method for generating synthetic training exam…

    Submitted 4 October, 2020; v1 submitted 21 September, 2020; originally announced September 2020.

    Comments: 16 pages, 8 figures, to be published in EMNLP 2020

  4. arXiv:1702.05386  [pdf, other]

    stat.ML cs.LG cs.NE

    Predicting Surgery Duration with Neural Heteroscedastic Regression

    Authors: Nathan Ng, Rodney A Gabriel, Julian McAuley, Charles Elkan, Zachary C Lipton

    Abstract: Scheduling surgeries is a challenging task due to the fundamental uncertainty of the clinical environment, as well as the risks and costs associated with under- and over-booking. We investigate neural regression algorithms to estimate the parameters of surgery case durations, focusing on the issue of heteroscedasticity. We seek to simultaneously estimate the duration of each surgery, as well as a…

    Submitted 12 July, 2017; v1 submitted 17 February, 2017; originally announced February 2017.