Showing 1–19 of 19 results for author: Seeger, M

  1. arXiv:2410.22854  [pdf, other]

    stat.ML cs.LG

    Hyperparameter Optimization in Machine Learning

    Authors: Luca Franceschi, Michele Donini, Valerio Perrone, Aaron Klein, Cédric Archambeau, Matthias Seeger, Massimiliano Pontil, Paolo Frasconi

    Abstract: Hyperparameters are configuration variables controlling the behavior of machine learning algorithms. They are ubiquitous in machine learning and artificial intelligence and the choice of their values determines the effectiveness of systems based on these technologies. Manual hyperparameter search is often unsatisfactory and becomes infeasible when the number of hyperparameters is large. Automating…

    Submitted 8 April, 2025; v1 submitted 30 October, 2024; originally announced October 2024.

    Comments: Preprint

  2. arXiv:2402.09947  [pdf, other]

    cs.LG

    Explaining Probabilistic Models with Distributional Values

    Authors: Luca Franceschi, Michele Donini, Cédric Archambeau, Matthias Seeger

    Abstract: A large branch of explainable machine learning is grounded in cooperative game theory. However, research indicates that game-theoretic explanations may mislead or be hard to interpret. We argue that often there is a critical mismatch between what one wishes to explain (e.g. the output of a classifier) and what current methods such as SHAP explain (e.g. the scalar probability of a class). This pape…

    Submitted 25 October, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: ICML 2024 (spotlight paper). Code: https://github.com/amazon-science/explaining-probabilistic-models-with-distributinal-values. v2: updated references

  3. arXiv:2305.03623  [pdf, other]

    cs.LG stat.ML

    Optimizing Hyperparameters with Conformal Quantile Regression

    Authors: David Salinas, Jacek Golebiowski, Aaron Klein, Matthias Seeger, Cedric Archambeau

    Abstract: Many state-of-the-art hyperparameter optimization (HPO) algorithms rely on model-based optimizers that learn surrogate models of the target function to guide the search. Gaussian processes are the de facto surrogate model due to their ability to capture uncertainty but they make strong assumptions about the observation noise, which might not be warranted in practice. In this work, we propose to le…

    Submitted 5 May, 2023; originally announced May 2023.

  4. arXiv:2302.04019  [pdf, other]

    cs.LG stat.ML

    Fortuna: A Library for Uncertainty Quantification in Deep Learning

    Authors: Gianluca Detommaso, Alberto Gasparin, Michele Donini, Matthias Seeger, Andrew Gordon Wilson, Cedric Archambeau

    Abstract: We present Fortuna, an open-source library for uncertainty quantification in deep learning. Fortuna supports a range of calibration techniques, such as conformal prediction that can be applied to any trained neural network to generate reliable uncertainty estimates, and scalable Bayesian inference methods that can be applied to Flax-based deep neural networks trained from scratch for improved unce…

    Submitted 8 February, 2023; originally announced February 2023.

  5. arXiv:2111.03418  [pdf, other]

    cs.LG cs.AI stat.ML

    Meta-Forecasting by combining Global Deep Representations with Local Adaptation

    Authors: Riccardo Grazzi, Valentin Flunkert, David Salinas, Tim Januschowski, Matthias Seeger, Cedric Archambeau

    Abstract: While classical time series forecasting considers individual time series in isolation, recent advances based on deep learning showed that jointly learning from a large pool of related time series can boost the forecasting accuracy. However, the accuracy of these methods suffers greatly when modeling out-of-sample time series, significantly limiting their applicability compared to classical forecas…

    Submitted 12 November, 2021; v1 submitted 5 November, 2021; originally announced November 2021.

  6. arXiv:2106.06079  [pdf, other]

    cs.LG stat.ML

    A Nonmyopic Approach to Cost-Constrained Bayesian Optimization

    Authors: Eric Hans Lee, David Eriksson, Valerio Perrone, Matthias Seeger

    Abstract: Bayesian optimization (BO) is a popular method for optimizing expensive-to-evaluate black-box functions. BO budgets are typically given in iterations, which implicitly assumes each evaluation has the same cost. In fact, in many BO applications, evaluation costs vary significantly in different regions of the search space. In hyperparameter optimization, the time spent on neural network training inc…

    Submitted 10 June, 2021; originally announced June 2021.

    Comments: To appear in UAI 2021

  7. arXiv:2104.08166  [pdf, other]

    cs.LG cs.AI stat.ML

    Automatic Termination for Hyperparameter Optimization

    Authors: Anastasia Makarova, Huibin Shen, Valerio Perrone, Aaron Klein, Jean Baptiste Faddoul, Andreas Krause, Matthias Seeger, Cedric Archambeau

    Abstract: Bayesian optimization (BO) is a widely popular approach for the hyperparameter optimization (HPO) in machine learning. At its core, BO iteratively evaluates promising configurations until a user-defined budget, such as wall-clock time or number of iterations, is exhausted. While the final performance after tuning heavily depends on the provided budget, it is hard to pre-specify an optimal value in…

    Submitted 22 July, 2022; v1 submitted 16 April, 2021; originally announced April 2021.

    Comments: Accepted at AutoML Conference 2022

  8. arXiv:2102.09009  [pdf, other]

    cs.LG stat.ML

    BORE: Bayesian Optimization by Density-Ratio Estimation

    Authors: Louis C. Tiao, Aaron Klein, Matthias Seeger, Edwin V. Bonilla, Cedric Archambeau, Fabio Ramos

    Abstract: Bayesian optimization (BO) is among the most effective and widely-used blackbox optimization methods. BO proposes solutions according to an explore-exploit trade-off criterion encoded in an acquisition function, many of which are computed from the posterior predictive of a probabilistic surrogate model. Prevalent among these is the expected improvement (EI) function. The need to ensure analytical…

    Submitted 17 February, 2021; originally announced February 2021.

    Comments: preprint, under review

  9. arXiv:2012.08489  [pdf, other]

    cs.LG cs.AI stat.ML

    Amazon SageMaker Automatic Model Tuning: Scalable Gradient-Free Optimization

    Authors: Valerio Perrone, Huibin Shen, Aida Zolic, Iaroslav Shcherbatyi, Amr Ahmed, Tanya Bansal, Michele Donini, Fela Winkelmolen, Rodolphe Jenatton, Jean Baptiste Faddoul, Barbara Pogorzelska, Miroslav Miladinovic, Krishnaram Kenthapadi, Matthias Seeger, Cédric Archambeau

    Abstract: Tuning complex machine learning systems is challenging. Machine learning typically requires to set hyperparameters, be it regularization, architecture, or optimization parameters, whose tuning is critical to achieve good predictive performance. To democratize access to machine learning systems, it is essential to automate the tuning. This paper presents Amazon SageMaker Automatic Model Tuning (AMT…

    Submitted 18 June, 2021; v1 submitted 15 December, 2020; originally announced December 2020.

  10. arXiv:2012.08483  [pdf, other]

    cs.LG

    Amazon SageMaker Autopilot: a white box AutoML solution at scale

    Authors: Piali Das, Valerio Perrone, Nikita Ivkin, Tanya Bansal, Zohar Karnin, Huibin Shen, Iaroslav Shcherbatyi, Yotam Elor, Wilton Wu, Aida Zolic, Thibaut Lienart, Alex Tang, Amr Ahmed, Jean Baptiste Faddoul, Rodolphe Jenatton, Fela Winkelmolen, Philip Gautier, Leo Dirac, Andre Perunicic, Miroslav Miladinovic, Giovanni Zappella, Cédric Archambeau, Matthias Seeger, Bhaskar Dutt, Laurence Rouesnel

    Abstract: AutoML systems provide a black-box solution to machine learning problems by selecting the right way of processing features, choosing an algorithm and tuning the hyperparameters of the entire pipeline. Although these systems perform well on many datasets, there is still a non-negligible number of datasets for which the one-shot solution produced by each particular system would provide sub-par perfo…

    Submitted 16 December, 2020; v1 submitted 15 December, 2020; originally announced December 2020.

  11. arXiv:2003.10870  [pdf, other]

    cs.LG stat.ML

    Cost-aware Bayesian Optimization

    Authors: Eric Hans Lee, Valerio Perrone, Cedric Archambeau, Matthias Seeger

    Abstract: Bayesian optimization (BO) is a class of global optimization algorithms, suitable for minimizing an expensive objective function in as few function evaluations as possible. While BO budgets are typically given in iterations, this implicitly measures convergence in terms of iteration count and assumes each evaluation has identical cost. In practice, evaluation costs may vary in different regions of…

    Submitted 22 March, 2020; originally announced March 2020.

  12. arXiv:2003.10865  [pdf, other]

    cs.LG stat.ML

    Model-based Asynchronous Hyperparameter and Neural Architecture Search

    Authors: Aaron Klein, Louis C. Tiao, Thibaut Lienart, Cedric Archambeau, Matthias Seeger

    Abstract: We introduce a model-based asynchronous multi-fidelity method for hyperparameter and neural architecture search that combines the strengths of asynchronous Hyperband and Gaussian process-based Bayesian optimization. At the heart of our method is a probabilistic model that can simultaneously reason across hyperparameters and resource levels, and supports decision-making in the presence of pending e…

    Submitted 30 June, 2020; v1 submitted 24 March, 2020; originally announced March 2020.

  13. arXiv:2002.12462  [pdf, other]

    cs.LG cs.CV stat.ML

    LEEP: A New Measure to Evaluate Transferability of Learned Representations

    Authors: Cuong V. Nguyen, Tal Hassner, Matthias Seeger, Cedric Archambeau

    Abstract: We introduce a new measure to evaluate the transferability of representations learned by classifiers. Our measure, the Log Expected Empirical Prediction (LEEP), is simple and easy to compute: when given a classifier trained on a source data set, it only requires running the target data set through this classifier once. We analyze the properties of LEEP theoretically and demonstrate its effectivene…

    Submitted 13 August, 2020; v1 submitted 27 February, 2020; originally announced February 2020.

    Comments: Published at the International Conference on Machine Learning (ICML) 2020

  14. arXiv:1910.07003  [pdf, other]

    stat.ML cs.LG

    Constrained Bayesian Optimization with Max-Value Entropy Search

    Authors: Valerio Perrone, Iaroslav Shcherbatyi, Rodolphe Jenatton, Cedric Archambeau, Matthias Seeger

    Abstract: Bayesian optimization (BO) is a model-based approach to sequentially optimize expensive black-box functions, such as the validation error of a deep neural network with respect to its hyperparameters. In many real-world scenarios, the optimization is further subject to a priori unknown constraints. For example, training a deep network configuration may fail with an out-of-memory error when the mode…

    Submitted 15 October, 2019; originally announced October 2019.

  15. arXiv:1909.12552  [pdf, other]

    stat.ML cs.LG

    Learning search spaces for Bayesian optimization: Another view of hyperparameter transfer learning

    Authors: Valerio Perrone, Huibin Shen, Matthias Seeger, Cedric Archambeau, Rodolphe Jenatton

    Abstract: Bayesian optimization (BO) is a successful methodology to optimize black-box functions that are expensive to evaluate. While traditional methods optimize each black-box function in isolation, there has been recent interest in speeding up BO by transferring knowledge across multiple related black-box functions. In this work, we introduce a method to automatically design the BO search space by relyi…

    Submitted 27 September, 2019; originally announced September 2019.

  16. arXiv:1710.08717  [pdf, other]

    cs.MS cs.LG stat.ML

    Auto-Differentiating Linear Algebra

    Authors: Matthias Seeger, Asmus Hetzel, Zhenwen Dai, Eric Meissner, Neil D. Lawrence

    Abstract: Development systems for deep learning (DL), such as Theano, Torch, TensorFlow, or MXNet, are easy-to-use tools for creating complex neural network models. Since gradient computations are automatically baked in, and execution is mapped to high performance hardware, these models can be trained end-to-end on large amounts of data. However, it is currently not easy to implement many basic machine lear…

    Submitted 14 August, 2019; v1 submitted 24 October, 2017; originally announced October 2017.

  17. arXiv:1709.07638  [pdf, other]

    stat.ML cs.LG

    Approximate Bayesian Inference in Linear State Space Models for Intermittent Demand Forecasting at Scale

    Authors: Matthias Seeger, Syama Rangapuram, Yuyang Wang, David Salinas, Jan Gasthaus, Tim Januschowski, Valentin Flunkert

    Abstract: We present a scalable and robust Bayesian inference method for linear state space models. The method is applied to demand forecasting in the context of a large e-commerce platform, paying special attention to intermittent and bursty target statistics. Inference is approximated by the Newton-Raphson algorithm, reduced to linear-time Kalman smoothing, which allows us to operate on several orders of…

    Submitted 22 September, 2017; originally announced September 2017.

  18. arXiv:1206.6437  [pdf]

    cs.CV cs.LG stat.ML

    Large Scale Variational Bayesian Inference for Structured Scale Mixture Models

    Authors: Young Jun Ko, Matthias Seeger

    Abstract: Natural image statistics exhibit hierarchical dependencies across multiple scales. Representing such prior knowledge in non-factorial latent tree models can boost performance of image denoising, inpainting, deconvolution or reconstruction substantially, beyond standard factorial "sparse" methodology. We derive a large scale approximate Bayesian inference algorithm for linear models with non-factor…

    Submitted 27 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)

  19. Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design

    Authors: Niranjan Srinivas, Andreas Krause, Sham M. Kakade, Matthias Seeger

    Abstract: Many applications require optimizing an unknown, noisy function that is expensive to evaluate. We formalize this task as a multi-armed bandit problem, where the payoff function is either sampled from a Gaussian process (GP) or has low RKHS norm. We resolve the important open problem of deriving regret bounds for this setting, which imply novel convergence rates for GP optimization. We analyze GP-U…

    Submitted 9 June, 2010; v1 submitted 20 December, 2009; originally announced December 2009.