
Showing 1–23 of 23 results for author: Wenzel, F

  1. arXiv:2410.17772  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Scaling Robot Policy Learning via Zero-Shot Labeling with Foundation Models

    Authors: Nils Blank, Moritz Reuss, Marcel Rühle, Ömer Erdinç Yağmurlu, Fabian Wenzel, Oier Mees, Rudolf Lioutikov

    Abstract: A central challenge towards developing robots that can relate human language to their perception and actions is the scarcity of natural language annotations in diverse robot datasets. Moreover, robot policies that follow natural language instructions are typically trained on either templated language or expensive human-labeled instructions, hindering their scalability. To this end, we introduce NI…

    Submitted 26 October, 2024; v1 submitted 23 October, 2024; originally announced October 2024.

    Comments: Project Website at https://robottasklabeling.github.io/

  2. arXiv:2407.05996  [pdf, other]

    cs.RO

    Multimodal Diffusion Transformer: Learning Versatile Behavior from Multimodal Goals

    Authors: Moritz Reuss, Ömer Erdinç Yağmurlu, Fabian Wenzel, Rudolf Lioutikov

    Abstract: This work introduces the Multimodal Diffusion Transformer (MDT), a novel diffusion policy framework, that excels at learning versatile behavior from multimodal goal specifications with few language annotations. MDT leverages a diffusion-based multimodal transformer backbone and two self-supervised auxiliary objectives to master long-horizon manipulation tasks based on multimodal goals. The vast ma…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: RSS 2024

  3. arXiv:2403.00025  [pdf, ps, other]

    cs.LG cs.AI

    On the Challenges and Opportunities in Generative AI

    Authors: Laura Manduchi, Kushagra Pandey, Clara Meister, Robert Bamler, Ryan Cotterell, Sina Däubener, Sophie Fellenz, Asja Fischer, Thomas Gärtner, Matthias Kirchler, Marius Kloft, Yingzhen Li, Christoph Lippert, Gerard de Melo, Eric Nalisnick, Björn Ommer, Rajesh Ranganath, Maja Rudolph, Karen Ullrich, Guy Van den Broeck, Julia E Vogt, Yixin Wang, Florian Wenzel, Frank Wood, Stephan Mandt, et al. (1 additional author not shown)

    Abstract: The field of deep generative modeling has grown rapidly in the last few years. With the availability of massive amounts of training data coupled with advances in scalable unsupervised learning paradigms, recent large-scale generative models show tremendous promise in synthesizing high-resolution images and text, as well as structured data such as videos and molecules. However, we argue that curren…

    Submitted 20 March, 2025; v1 submitted 28 February, 2024; originally announced March 2024.

  4. arXiv:2310.11867  [pdf, other]

    cs.CV cs.CY cs.LG

    Evaluating the Fairness of Discriminative Foundation Models in Computer Vision

    Authors: Junaid Ali, Matthaeus Kleindessner, Florian Wenzel, Kailash Budhathoki, Volkan Cevher, Chris Russell

    Abstract: We propose a novel taxonomy for bias evaluation of discriminative foundation models, such as Contrastive Language-Pretraining (CLIP), that are used for labeling tasks. We then systematically evaluate existing methods for mitigating bias in these models with respect to our taxonomy. Specifically, we evaluate OpenAI's CLIP and OpenCLIP models for key applications, such as zero-shot classification, i…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: Accepted at AIES'23

  5. arXiv:2304.10253  [pdf, other]

    cs.CV cs.LG

    Image retrieval outperforms diffusion models on data augmentation

    Authors: Max F. Burg, Florian Wenzel, Dominik Zietlow, Max Horn, Osama Makansi, Francesco Locatello, Chris Russell

    Abstract: Many approaches have been proposed to use diffusion models to augment training datasets for downstream tasks, such as classification. However, diffusion models are themselves trained on large datasets, often with noisy annotations, and it remains an open question to which extent these models contribute to downstream classification performance. In particular, it remains unclear if they generalize e…

    Submitted 30 November, 2023; v1 submitted 20 April, 2023; originally announced April 2023.

  6. arXiv:2304.07939  [pdf, other]

    cs.LG

    Leveraging sparse and shared feature activations for disentangled representation learning

    Authors: Marco Fumero, Florian Wenzel, Luca Zancato, Alessandro Achille, Emanuele Rodolà, Stefano Soatto, Bernhard Schölkopf, Francesco Locatello

    Abstract: Recovering the latent factors of variation of high dimensional data has so far focused on simple synthetic settings. Mostly building on unsupervised and weakly-supervised objectives, prior work missed out on the positive implications for representation learning on real world data. In this work, we propose to leverage knowledge extracted from a diversified set of supervised tasks to learn a common…

    Submitted 12 December, 2023; v1 submitted 16 April, 2023; originally announced April 2023.

  7. arXiv:2303.02484  [pdf, other]

    cs.LG cs.AI cs.CV

    Multi-Symmetry Ensembles: Improving Diversity and Generalization via Opposing Symmetries

    Authors: Charlotte Loh, Seungwook Han, Shivchander Sudalairaj, Rumen Dangovski, Kai Xu, Florian Wenzel, Marin Soljacic, Akash Srivastava

    Abstract: Deep ensembles (DE) have been successful in improving model performance by learning diverse members via the stochasticity of random initialization. While recent works have attempted to promote further diversity in DE via hyperparameters or regularizing loss functions, these methods primarily still rely on a stochastic approach to explore the hypothesis space. In this work, we present Multi-Symmetr…

    Submitted 19 June, 2023; v1 submitted 4 March, 2023; originally announced March 2023.

    Comments: Camera Ready Revision. ICML 2023

  8. arXiv:2212.08044  [pdf, other]

    cs.CV

    Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift

    Authors: Jielin Qiu, Yi Zhu, Xingjian Shi, Florian Wenzel, Zhiqiang Tang, Ding Zhao, Bo Li, Mu Li

    Abstract: Multimodal image-text models have shown remarkable performance in the past few years. However, evaluating robustness against distribution shifts is crucial before adopting them in real-world applications. In this work, we investigate the robustness of 12 popular open-sourced image-text models under common perturbations on five tasks (image-text retrieval, visual reasoning, visual entailment, image…

    Submitted 19 January, 2024; v1 submitted 15 December, 2022; originally announced December 2022.

    Comments: Accepted by Journal of Data-centric Machine Learning Research (DMLR) 2024

  9. arXiv:2207.09239  [pdf, other]

    cs.LG stat.ML

    Assaying Out-Of-Distribution Generalization in Transfer Learning

    Authors: Florian Wenzel, Andrea Dittadi, Peter Vincent Gehler, Carl-Johann Simon-Gabriel, Max Horn, Dominik Zietlow, David Kernert, Chris Russell, Thomas Brox, Bernt Schiele, Bernhard Schölkopf, Francesco Locatello

    Abstract: Since out-of-distribution generalization is a generally ill-posed problem, various proxy targets (e.g., calibration, adversarial robustness, algorithmic corruptions, invariance across shifts) were studied across different research programs resulting in different recommendations. While sharing the same aspirational goal, these approaches have never been tested under the same experimental conditions…

    Submitted 21 October, 2022; v1 submitted 19 July, 2022; originally announced July 2022.

  10. arXiv:2110.03360  [pdf, other]

    cs.LG cs.CV stat.ML

    Sparse MoEs meet Efficient Ensembles

    Authors: James Urquhart Allingham, Florian Wenzel, Zelda E Mariet, Basil Mustafa, Joan Puigcerver, Neil Houlsby, Ghassen Jerfel, Vincent Fortuin, Balaji Lakshminarayanan, Jasper Snoek, Dustin Tran, Carlos Riquelme Ruiz, Rodolphe Jenatton

    Abstract: Machine learning models based on the aggregated outputs of submodels, either at the activation or prediction levels, often exhibit strong performance compared to individual models. We study the interplay of two popular classes of such models: ensembles of neural networks and sparse mixture of experts (sparse MoEs). First, we show that the two approaches have complementary features whose combinatio…

    Submitted 9 July, 2023; v1 submitted 7 October, 2021; originally announced October 2021.

    Comments: 59 pages, 26 figures, 36 tables. Accepted at TMLR

  11. arXiv:2110.02609  [pdf, other]

    stat.ML cs.LG

    Deep Classifiers with Label Noise Modeling and Distance Awareness

    Authors: Vincent Fortuin, Mark Collier, Florian Wenzel, James Allingham, Jeremiah Liu, Dustin Tran, Balaji Lakshminarayanan, Jesse Berent, Rodolphe Jenatton, Effrosyni Kokiopoulou

    Abstract: Uncertainty estimation in deep learning has recently emerged as a crucial area of interest to advance reliability and robustness in safety-critical applications. While there have been many proposed methods that either focus on distance-aware model uncertainties for out-of-distribution detection or on input-dependent label uncertainties for in-distribution calibration, both of these types of uncert…

    Submitted 8 August, 2022; v1 submitted 6 October, 2021; originally announced October 2021.

    Comments: Published in TMLR

  12. arXiv:2106.10760  [pdf, other]

    cs.LG stat.ML

    On Stein Variational Neural Network Ensembles

    Authors: Francesco D'Angelo, Vincent Fortuin, Florian Wenzel

    Abstract: Ensembles of deep neural networks have achieved great success recently, but they do not offer a proper Bayesian justification. Moreover, while they allow for averaging of predictions over several hypotheses, they do not provide any guarantees for their diversity, leading to redundant solutions in function space. In contrast, particle-based inference methods, such as Stein variational gradient desc…

    Submitted 22 June, 2021; v1 submitted 20 June, 2021; originally announced June 2021.

  13. arXiv:2106.04015  [pdf, other]

    cs.LG

    Uncertainty Baselines: Benchmarks for Uncertainty & Robustness in Deep Learning

    Authors: Zachary Nado, Neil Band, Mark Collier, Josip Djolonga, Michael W. Dusenberry, Sebastian Farquhar, Qixuan Feng, Angelos Filos, Marton Havasi, Rodolphe Jenatton, Ghassen Jerfel, Jeremiah Liu, Zelda Mariet, Jeremy Nixon, Shreyas Padhy, Jie Ren, Tim G. J. Rudner, Faris Sbahi, Yeming Wen, Florian Wenzel, Kevin Murphy, D. Sculley, Balaji Lakshminarayanan, Jasper Snoek, Yarin Gal, et al. (1 additional author not shown)

    Abstract: High-quality estimates of uncertainty and robustness are crucial for numerous real-world applications, especially for deep learning which underlies many deployed ML systems. The ability to compare techniques for improving these estimates is therefore very important for research and practice alike. Yet, competitive comparisons of methods are often lacking due to a range of reasons, including: compu…

    Submitted 5 January, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

  14. arXiv:2102.06571  [pdf, other]

    stat.ML cs.LG

    Bayesian Neural Network Priors Revisited

    Authors: Vincent Fortuin, Adrià Garriga-Alonso, Sebastian W. Ober, Florian Wenzel, Gunnar Rätsch, Richard E. Turner, Mark van der Wilk, Laurence Aitchison

    Abstract: Isotropic Gaussian priors are the de facto standard for modern Bayesian neural network inference. However, it is unclear whether these priors accurately reflect our true beliefs about the weight distributions or give optimal performance. To find better priors, we study summary statistics of neural network weights in networks trained using stochastic gradient descent (SGD). We find that convolution…

    Submitted 16 March, 2022; v1 submitted 12 February, 2021; originally announced February 2021.

    Comments: Accepted at ICLR 2022

  15. arXiv:2006.13570  [pdf, other]

    cs.LG stat.ML

    Hyperparameter Ensembles for Robustness and Uncertainty Quantification

    Authors: Florian Wenzel, Jasper Snoek, Dustin Tran, Rodolphe Jenatton

    Abstract: Ensembles over neural network weights trained from different random initialization, known as deep ensembles, achieve state-of-the-art accuracy and calibration. The recently introduced batch ensembles provide a drop-in replacement that is more parameter efficient. In this paper, we design ensembles not only over weights, but over hyperparameters to improve the state of the art in both settings. For…

    Submitted 8 January, 2021; v1 submitted 24 June, 2020; originally announced June 2020.

    Comments: Accepted at NeurIPS 2020

  16. arXiv:2002.11451  [pdf, other]

    stat.ML cs.LG

    Automated Augmented Conjugate Inference for Non-conjugate Gaussian Process Models

    Authors: Théo Galy-Fajou, Florian Wenzel, Manfred Opper

    Abstract: We propose automated augmented conjugate inference, a new inference method for non-conjugate Gaussian processes (GP) models. Our method automatically constructs an auxiliary variable augmentation that renders the GP model conditionally conjugate. Building on the conjugate structure of the augmented model, we develop two inference methods. First, a fast and scalable stochastic variational inference…

    Submitted 26 February, 2020; originally announced February 2020.

    Comments: Accepted at AISTATS 2020

  17. arXiv:2002.02405  [pdf, other]

    stat.ML cs.LG stat.CO

    How Good is the Bayes Posterior in Deep Neural Networks Really?

    Authors: Florian Wenzel, Kevin Roth, Bastiaan S. Veeling, Jakub Świątkowski, Linh Tran, Stephan Mandt, Jasper Snoek, Tim Salimans, Rodolphe Jenatton, Sebastian Nowozin

    Abstract: During the past five years the Bayesian deep learning community has developed increasingly accurate and efficient approximate inference procedures that allow for Bayesian inference in deep neural networks. However, despite this algorithmic progress and the promise of improved uncertainty quantification and sample efficiency there are---as of early 2020---no publicized deployments of Bayesian neura…

    Submitted 2 July, 2020; v1 submitted 6 February, 2020; originally announced February 2020.

    Comments: Full version (main paper and appendix) of the ICML 2020 publication

  18. arXiv:1905.09670  [pdf, other]

    stat.ML cs.LG

    Multi-Class Gaussian Process Classification Made Conjugate: Efficient Inference via Data Augmentation

    Authors: Théo Galy-Fajou, Florian Wenzel, Christian Donner, Manfred Opper

    Abstract: We propose a new scalable multi-class Gaussian process classification approach building on a novel modified softmax likelihood function. The new likelihood has two benefits: it leads to well-calibrated uncertainty estimates and allows for an efficient latent variable augmentation. The augmented model has the advantage that it is conditionally conjugate leading to a fast variational inference metho…

    Submitted 23 May, 2019; originally announced May 2019.

    Comments: Accepted at UAI 2019

  19. arXiv:1807.01604  [pdf, other]

    stat.ML cs.LG

    Quasi-Monte Carlo Variational Inference

    Authors: Alexander Buchholz, Florian Wenzel, Stephan Mandt

    Abstract: Many machine learning problems involve Monte Carlo gradient estimators. As a prominent example, we focus on Monte Carlo variational inference (MCVI) in this paper. The performance of MCVI crucially depends on the variance of its stochastic gradients. We propose variance reduction by means of Quasi-Monte Carlo (QMC) sampling. QMC replaces N i.i.d. samples from a uniform probability distribution by…

    Submitted 4 July, 2018; originally announced July 2018.

    Journal ref: Published in the proceedings of the 35th International Conference on Machine Learning (ICML 2018)

  20. arXiv:1803.07868  [pdf, other]

    stat.ML cs.LG

    Scalable Generalized Dynamic Topic Models

    Authors: Patrick Jähnichen, Florian Wenzel, Marius Kloft, Stephan Mandt

    Abstract: Dynamic topic models (DTMs) model the evolution of prevalent themes in literature, online media, and other forms of text over time. DTMs assume that word co-occurrence statistics change continuously and therefore impose continuous stochastic process priors on their model parameters. These dynamical priors make inference much harder than in regular topic models, and also limit scalability. In this…

    Submitted 21 March, 2018; originally announced March 2018.

    Comments: Published version, International Conference on Artificial Intelligence and Statistics (AISTATS 2018)

  21. arXiv:1802.06383  [pdf, other]

    stat.ML cs.LG

    Efficient Gaussian Process Classification Using Polya-Gamma Data Augmentation

    Authors: Florian Wenzel, Theo Galy-Fajou, Christian Donner, Marius Kloft, Manfred Opper

    Abstract: We propose a scalable stochastic variational approach to GP classification building on Polya-Gamma data augmentation and inducing points. Unlike former approaches, we obtain closed-form updates based on natural gradients that lead to efficient optimization. We evaluate the algorithm on real-world datasets containing up to 11 million data points and demonstrate that it is up to two orders of magnit…

    Submitted 27 November, 2018; v1 submitted 18 February, 2018; originally announced February 2018.

  22. arXiv:1707.05532  [pdf, other]

    stat.ML cs.LG

    Bayesian Nonlinear Support Vector Machines for Big Data

    Authors: Florian Wenzel, Theo Galy-Fajou, Matthaeus Deutsch, Marius Kloft

    Abstract: We propose a fast inference method for Bayesian nonlinear support vector machines that leverages stochastic variational inference and inducing points. Our experiments show that the proposed method is faster than competing Bayesian approaches and scales easily to millions of data points. It provides additional features over frequentist competitors such as accurate predictive uncertainty estimates a…

    Submitted 18 July, 2017; originally announced July 2017.

    Comments: accepted as conference paper at ECML-PKDD 2017

  23. Sparse Probit Linear Mixed Model

    Authors: Stephan Mandt, Florian Wenzel, Shinichi Nakajima, John P. Cunningham, Christoph Lippert, Marius Kloft

    Abstract: Linear Mixed Models (LMMs) are important tools in statistical genetics. When used for feature selection, they allow to find a sparse set of genetic traits that best predict a continuous phenotype of interest, while simultaneously correcting for various confounding factors such as age, ethnicity and population structure. Formulated as models for linear regression, LMMs have been restricted to conti…

    Submitted 17 July, 2017; v1 submitted 16 July, 2015; originally announced July 2015.

    Comments: Published version, 21 pages, 6 figures

    Journal ref: Machine Learning, 106(9), 1621-1642 (2017)