Search | arXiv e-print repository

Spherical Inducing Features for Orthogonally-Decoupled Gaussian Processes

Authors: Louis C. Tiao, Vincent Dutordoir, Victor Picheny

Abstract: Despite their many desirable properties, Gaussian processes (GPs) are often compared unfavorably to deep neural networks (NNs) for lacking the ability to learn representations. Recent efforts to bridge the gap between GPs and deep NNs have yielded a new class of inter-domain variational GPs in which the inducing variables correspond to hidden units of a feedforward NN. In this work, we examine som… ▽ More Despite their many desirable properties, Gaussian processes (GPs) are often compared unfavorably to deep neural networks (NNs) for lacking the ability to learn representations. Recent efforts to bridge the gap between GPs and deep NNs have yielded a new class of inter-domain variational GPs in which the inducing variables correspond to hidden units of a feedforward NN. In this work, we examine some practical issues associated with this approach and propose an extension that leverages the orthogonal decomposition of GPs to mitigate these limitations. In particular, we introduce spherical inter-domain features to construct more flexible data-dependent basis functions for both the principal and orthogonal components of the GP approximation and show that incorporating NN activation features under this framework not only alleviates these shortcomings but is more scalable than alternative strategies. Experiments on multiple benchmark datasets demonstrate the effectiveness of our approach. △ Less

Submitted 15 June, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

Comments: To appear in the Proceedings of the 40th International Conference on Machine Learning (ICML2023)

arXiv:2102.09009 [pdf, other]

BORE: Bayesian Optimization by Density-Ratio Estimation

Authors: Louis C. Tiao, Aaron Klein, Matthias Seeger, Edwin V. Bonilla, Cedric Archambeau, Fabio Ramos

Abstract: Bayesian optimization (BO) is among the most effective and widely-used blackbox optimization methods. BO proposes solutions according to an explore-exploit trade-off criterion encoded in an acquisition function, many of which are computed from the posterior predictive of a probabilistic surrogate model. Prevalent among these is the expected improvement (EI) function. The need to ensure analytical… ▽ More Bayesian optimization (BO) is among the most effective and widely-used blackbox optimization methods. BO proposes solutions according to an explore-exploit trade-off criterion encoded in an acquisition function, many of which are computed from the posterior predictive of a probabilistic surrogate model. Prevalent among these is the expected improvement (EI) function. The need to ensure analytical tractability of the predictive often poses limitations that can hinder the efficiency and applicability of BO. In this paper, we cast the computation of EI as a binary classification problem, building on the link between class-probability estimation and density-ratio estimation, and the lesser-known link between density-ratios and EI. By circumventing the tractability constraints, this reformulation provides numerous advantages, not least in terms of expressiveness, versatility, and scalability. △ Less

Submitted 17 February, 2021; originally announced February 2021.

Comments: preprint, under review

arXiv:2003.10865 [pdf, other]

Model-based Asynchronous Hyperparameter and Neural Architecture Search

Authors: Aaron Klein, Louis C. Tiao, Thibaut Lienart, Cedric Archambeau, Matthias Seeger

Abstract: We introduce a model-based asynchronous multi-fidelity method for hyperparameter and neural architecture search that combines the strengths of asynchronous Hyperband and Gaussian process-based Bayesian optimization. At the heart of our method is a probabilistic model that can simultaneously reason across hyperparameters and resource levels, and supports decision-making in the presence of pending e… ▽ More We introduce a model-based asynchronous multi-fidelity method for hyperparameter and neural architecture search that combines the strengths of asynchronous Hyperband and Gaussian process-based Bayesian optimization. At the heart of our method is a probabilistic model that can simultaneously reason across hyperparameters and resource levels, and supports decision-making in the presence of pending evaluations. We demonstrate the effectiveness of our method on a wide range of challenging benchmarks, for tabular data, image classification and language modelling, and report substantial speed-ups over current state-of-the-art methods. Our new methods, along with asynchronous baselines, are implemented in a distributed framework which will be open sourced along with this publication. △ Less

Submitted 30 June, 2020; v1 submitted 24 March, 2020; originally announced March 2020.

arXiv:1806.01771 [pdf, other]

Cycle-Consistent Adversarial Learning as Approximate Bayesian Inference

Authors: Louis C. Tiao, Edwin V. Bonilla, Fabio Ramos

Abstract: We formalize the problem of learning interdomain correspondences in the absence of paired data as Bayesian inference in a latent variable model (LVM), where one seeks the underlying hidden representations of entities from one domain as entities from the other domain. First, we introduce implicit latent variable models, where the prior over hidden representations can be specified flexibly as an imp… ▽ More We formalize the problem of learning interdomain correspondences in the absence of paired data as Bayesian inference in a latent variable model (LVM), where one seeks the underlying hidden representations of entities from one domain as entities from the other domain. First, we introduce implicit latent variable models, where the prior over hidden representations can be specified flexibly as an implicit distribution. Next, we develop a new variational inference (VI) algorithm for this model based on minimization of the symmetric Kullback-Leibler (KL) divergence between a variational joint and the exact joint distribution. Lastly, we demonstrate that the state-of-the-art cycle-consistent adversarial learning (CYCLEGAN) models can be derived as a special case within our proposed VI framework, thus establishing its connection to approximate Bayesian inference methods. △ Less

Submitted 24 August, 2018; v1 submitted 5 June, 2018; originally announced June 2018.

Comments: Presented at the ICML 2018 Workshop on Theoretical Foundations and Applications of Deep Generative Models. Stockholm, Sweden, 2018

Showing 1–4 of 4 results for author: Tiao, L C