Showing 1–3 of 3 results for author: Bouthillier, X

  1. arXiv:2103.03098

    cs.LG stat.ML

    Accounting for Variance in Machine Learning Benchmarks

    Authors: Xavier Bouthillier, Pierre Delaunay, Mirko Bronzi, Assya Trofimov, Brennan Nichyporuk, Justin Szeto, Naz Sepah, Edward Raff, Kanika Madan, Vikram Voleti, Samira Ebrahimi Kahou, Vincent Michalski, Dmitriy Serdyuk, Tal Arbel, Chris Pal, Gaël Varoquaux, Pascal Vincent

    Abstract: Strong empirical evidence that one machine-learning algorithm A outperforms another one B ideally calls for multiple trials optimizing the learning pipeline over sources of variation such as data sampling, data augmentation, parameter initialization, and hyperparameters choices. This is prohibitively expensive, and corners are cut to reach conclusions. We model the whole benchmarking process, reve…

    Submitted 1 March, 2021; originally announced March 2021.

    Comments: Submitted to MLSys2021

  2. arXiv:1806.03884

    cs.LG stat.ML

    Fast Approximate Natural Gradient Descent in a Kronecker-factored Eigenbasis

    Authors: Thomas George, César Laurent, Xavier Bouthillier, Nicolas Ballas, Pascal Vincent

    Abstract: Optimization algorithms that leverage gradient covariance information, such as variants of natural gradient descent (Amari, 1998), offer the prospect of yielding more effective descent directions. For models with many parameters, the covariance matrix they are based on becomes gigantic, making them inapplicable in their original form. This has motivated research into both simple diagonal approxima…

    Submitted 26 July, 2021; v1 submitted 11 June, 2018; originally announced June 2018.

    Journal ref: Advances in Neural Information Processing Systems 2018

  3. arXiv:1506.08700

    stat.ML cs.LG

    Dropout as data augmentation

    Authors: Xavier Bouthillier, Kishore Konda, Pascal Vincent, Roland Memisevic

    Abstract: Dropout is typically interpreted as bagging a large number of models sharing parameters. We show that using dropout in a network can also be interpreted as a kind of data augmentation in the input space without domain knowledge. We present an approach to projecting the dropout noise within a network back into the input space, thereby generating augmented versions of the training data, and we show…

    Submitted 7 January, 2016; v1 submitted 29 June, 2015; originally announced June 2015.