
Showing 1–14 of 14 results for author: Wenzel, F

  1. arXiv:2207.09239

    cs.LG stat.ML

    Assaying Out-Of-Distribution Generalization in Transfer Learning

    Authors: Florian Wenzel, Andrea Dittadi, Peter Vincent Gehler, Carl-Johann Simon-Gabriel, Max Horn, Dominik Zietlow, David Kernert, Chris Russell, Thomas Brox, Bernt Schiele, Bernhard Schölkopf, Francesco Locatello

    Abstract: Since out-of-distribution generalization is a generally ill-posed problem, various proxy targets (e.g., calibration, adversarial robustness, algorithmic corruptions, invariance across shifts) were studied across different research programs resulting in different recommendations. While sharing the same aspirational goal, these approaches have never been tested under the same experimental conditions…

    Submitted 21 October, 2022; v1 submitted 19 July, 2022; originally announced July 2022.

  2. arXiv:2110.03360

    cs.LG cs.CV stat.ML

    Sparse MoEs meet Efficient Ensembles

    Authors: James Urquhart Allingham, Florian Wenzel, Zelda E Mariet, Basil Mustafa, Joan Puigcerver, Neil Houlsby, Ghassen Jerfel, Vincent Fortuin, Balaji Lakshminarayanan, Jasper Snoek, Dustin Tran, Carlos Riquelme Ruiz, Rodolphe Jenatton

    Abstract: Machine learning models based on the aggregated outputs of submodels, either at the activation or prediction levels, often exhibit strong performance compared to individual models. We study the interplay of two popular classes of such models: ensembles of neural networks and sparse mixture of experts (sparse MoEs). First, we show that the two approaches have complementary features whose combinatio…

    Submitted 9 July, 2023; v1 submitted 7 October, 2021; originally announced October 2021.

    Comments: 59 pages, 26 figures, 36 tables. Accepted at TMLR

  3. arXiv:2110.02609

    stat.ML cs.LG

    Deep Classifiers with Label Noise Modeling and Distance Awareness

    Authors: Vincent Fortuin, Mark Collier, Florian Wenzel, James Allingham, Jeremiah Liu, Dustin Tran, Balaji Lakshminarayanan, Jesse Berent, Rodolphe Jenatton, Effrosyni Kokiopoulou

    Abstract: Uncertainty estimation in deep learning has recently emerged as a crucial area of interest to advance reliability and robustness in safety-critical applications. While there have been many proposed methods that either focus on distance-aware model uncertainties for out-of-distribution detection or on input-dependent label uncertainties for in-distribution calibration, both of these types of uncert…

    Submitted 8 August, 2022; v1 submitted 6 October, 2021; originally announced October 2021.

    Comments: Published in TMLR

  4. arXiv:2106.10760

    cs.LG stat.ML

    On Stein Variational Neural Network Ensembles

    Authors: Francesco D'Angelo, Vincent Fortuin, Florian Wenzel

    Abstract: Ensembles of deep neural networks have achieved great success recently, but they do not offer a proper Bayesian justification. Moreover, while they allow for averaging of predictions over several hypotheses, they do not provide any guarantees for their diversity, leading to redundant solutions in function space. In contrast, particle-based inference methods, such as Stein variational gradient desc…

    Submitted 22 June, 2021; v1 submitted 20 June, 2021; originally announced June 2021.

  5. arXiv:2102.06571

    stat.ML cs.LG

    Bayesian Neural Network Priors Revisited

    Authors: Vincent Fortuin, Adrià Garriga-Alonso, Sebastian W. Ober, Florian Wenzel, Gunnar Rätsch, Richard E. Turner, Mark van der Wilk, Laurence Aitchison

    Abstract: Isotropic Gaussian priors are the de facto standard for modern Bayesian neural network inference. However, it is unclear whether these priors accurately reflect our true beliefs about the weight distributions or give optimal performance. To find better priors, we study summary statistics of neural network weights in networks trained using stochastic gradient descent (SGD). We find that convolution…

    Submitted 16 March, 2022; v1 submitted 12 February, 2021; originally announced February 2021.

    Comments: Accepted at ICLR 2022

  6. arXiv:2006.13570

    cs.LG stat.ML

    Hyperparameter Ensembles for Robustness and Uncertainty Quantification

    Authors: Florian Wenzel, Jasper Snoek, Dustin Tran, Rodolphe Jenatton

    Abstract: Ensembles over neural network weights trained from different random initialization, known as deep ensembles, achieve state-of-the-art accuracy and calibration. The recently introduced batch ensembles provide a drop-in replacement that is more parameter efficient. In this paper, we design ensembles not only over weights, but over hyperparameters to improve the state of the art in both settings. For…

    Submitted 8 January, 2021; v1 submitted 24 June, 2020; originally announced June 2020.

    Comments: Accepted at NeurIPS 2020

  7. arXiv:2002.11451

    stat.ML cs.LG

    Automated Augmented Conjugate Inference for Non-conjugate Gaussian Process Models

    Authors: Théo Galy-Fajou, Florian Wenzel, Manfred Opper

    Abstract: We propose automated augmented conjugate inference, a new inference method for non-conjugate Gaussian processes (GP) models. Our method automatically constructs an auxiliary variable augmentation that renders the GP model conditionally conjugate. Building on the conjugate structure of the augmented model, we develop two inference methods. First, a fast and scalable stochastic variational inference…

    Submitted 26 February, 2020; originally announced February 2020.

    Comments: Accepted at AISTATS 2020

  8. arXiv:2002.02405

    stat.ML cs.LG stat.CO

    How Good is the Bayes Posterior in Deep Neural Networks Really?

    Authors: Florian Wenzel, Kevin Roth, Bastiaan S. Veeling, Jakub Świątkowski, Linh Tran, Stephan Mandt, Jasper Snoek, Tim Salimans, Rodolphe Jenatton, Sebastian Nowozin

    Abstract: During the past five years the Bayesian deep learning community has developed increasingly accurate and efficient approximate inference procedures that allow for Bayesian inference in deep neural networks. However, despite this algorithmic progress and the promise of improved uncertainty quantification and sample efficiency there are---as of early 2020---no publicized deployments of Bayesian neura…

    Submitted 2 July, 2020; v1 submitted 6 February, 2020; originally announced February 2020.

    Comments: Full version (main paper and appendix) of the ICML 2020 publication

  9. arXiv:1905.09670

    stat.ML cs.LG

    Multi-Class Gaussian Process Classification Made Conjugate: Efficient Inference via Data Augmentation

    Authors: Théo Galy-Fajou, Florian Wenzel, Christian Donner, Manfred Opper

    Abstract: We propose a new scalable multi-class Gaussian process classification approach building on a novel modified softmax likelihood function. The new likelihood has two benefits: it leads to well-calibrated uncertainty estimates and allows for an efficient latent variable augmentation. The augmented model has the advantage that it is conditionally conjugate leading to a fast variational inference metho…

    Submitted 23 May, 2019; originally announced May 2019.

    Comments: Accepted at UAI 2019

  10. arXiv:1807.01604

    stat.ML cs.LG

    Quasi-Monte Carlo Variational Inference

    Authors: Alexander Buchholz, Florian Wenzel, Stephan Mandt

    Abstract: Many machine learning problems involve Monte Carlo gradient estimators. As a prominent example, we focus on Monte Carlo variational inference (MCVI) in this paper. The performance of MCVI crucially depends on the variance of its stochastic gradients. We propose variance reduction by means of Quasi-Monte Carlo (QMC) sampling. QMC replaces N i.i.d. samples from a uniform probability distribution by…

    Submitted 4 July, 2018; originally announced July 2018.

    Journal ref: Published in the proceedings of the 35th International Conference on Machine Learning (ICML 2018)

  11. arXiv:1803.07868

    stat.ML cs.LG

    Scalable Generalized Dynamic Topic Models

    Authors: Patrick Jähnichen, Florian Wenzel, Marius Kloft, Stephan Mandt

    Abstract: Dynamic topic models (DTMs) model the evolution of prevalent themes in literature, online media, and other forms of text over time. DTMs assume that word co-occurrence statistics change continuously and therefore impose continuous stochastic process priors on their model parameters. These dynamical priors make inference much harder than in regular topic models, and also limit scalability. In this…

    Submitted 21 March, 2018; originally announced March 2018.

    Comments: Published version, International Conference on Artificial Intelligence and Statistics (AISTATS 2018)

  12. arXiv:1802.06383

    stat.ML cs.LG

    Efficient Gaussian Process Classification Using Polya-Gamma Data Augmentation

    Authors: Florian Wenzel, Theo Galy-Fajou, Christian Donner, Marius Kloft, Manfred Opper

    Abstract: We propose a scalable stochastic variational approach to GP classification building on Polya-Gamma data augmentation and inducing points. Unlike former approaches, we obtain closed-form updates based on natural gradients that lead to efficient optimization. We evaluate the algorithm on real-world datasets containing up to 11 million data points and demonstrate that it is up to two orders of magnit…

    Submitted 27 November, 2018; v1 submitted 18 February, 2018; originally announced February 2018.

  13. arXiv:1707.05532

    stat.ML cs.LG

    Bayesian Nonlinear Support Vector Machines for Big Data

    Authors: Florian Wenzel, Theo Galy-Fajou, Matthaeus Deutsch, Marius Kloft

    Abstract: We propose a fast inference method for Bayesian nonlinear support vector machines that leverages stochastic variational inference and inducing points. Our experiments show that the proposed method is faster than competing Bayesian approaches and scales easily to millions of data points. It provides additional features over frequentist competitors such as accurate predictive uncertainty estimates a…

    Submitted 18 July, 2017; originally announced July 2017.

    Comments: Accepted as a conference paper at ECML-PKDD 2017

  14. Sparse Probit Linear Mixed Model

    Authors: Stephan Mandt, Florian Wenzel, Shinichi Nakajima, John P. Cunningham, Christoph Lippert, Marius Kloft

    Abstract: Linear Mixed Models (LMMs) are important tools in statistical genetics. When used for feature selection, they allow to find a sparse set of genetic traits that best predict a continuous phenotype of interest, while simultaneously correcting for various confounding factors such as age, ethnicity and population structure. Formulated as models for linear regression, LMMs have been restricted to conti…

    Submitted 17 July, 2017; v1 submitted 16 July, 2015; originally announced July 2015.

    Comments: Published version, 21 pages, 6 figures

    Journal ref: Machine Learning, 106(9), 1621-1642 (2017)