Showing 1–50 of 55 results for author: Hein, M

Searching in archive stat.
  1. arXiv:2402.12336  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models

    Authors: Christian Schlarmann, Naman Deep Singh, Francesco Croce, Matthias Hein

    Abstract: Multi-modal foundation models like OpenFlamingo, LLaVA, and GPT-4 are increasingly used for various real-world tasks. Prior work has shown that these models are highly vulnerable to adversarial attacks on the vision modality. These attacks can be leveraged to spread fake information or defraud users, and thus pose a significant risk, which makes the robustness of large multi-modal foundation model…

    Submitted 5 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: ICML 2024 Oral

  2. arXiv:2106.10065  [pdf, other]

    cs.LG stat.ML

    Being a Bit Frequentist Improves Bayesian Neural Networks

    Authors: Agustinus Kristiadi, Matthias Hein, Philipp Hennig

    Abstract: Despite their compelling theoretical properties, Bayesian neural networks (BNNs) tend to perform worse than frequentist methods in classification-based uncertainty quantification (UQ) tasks such as out-of-distribution (OOD) detection. In this paper, based on empirical findings in prior works, we hypothesize that this issue is because even recent Bayesian methods have never considered OOD data in t…

    Submitted 2 February, 2022; v1 submitted 18 June, 2021; originally announced June 2021.

    Comments: AISTATS 2022

  3. arXiv:2104.04448  [pdf, other]

    cs.LG cs.CV stat.ML

    Relating Adversarially Robust Generalization to Flat Minima

    Authors: David Stutz, Matthias Hein, Bernt Schiele

    Abstract: Adversarial training (AT) has become the de-facto standard to obtain models robust against adversarial examples. However, AT exhibits severe robust overfitting: cross-entropy loss on adversarial examples, so-called robust loss, decreases continuously on training examples, while eventually increasing on test examples. In practice, this leads to poor robust generalization, i.e., adversarial robustne…

    Submitted 6 October, 2021; v1 submitted 9 April, 2021; originally announced April 2021.

    Comments: ICCV'21

  4. arXiv:2010.09670  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    RobustBench: a standardized adversarial robustness benchmark

    Authors: Francesco Croce, Maksym Andriushchenko, Vikash Sehwag, Edoardo Debenedetti, Nicolas Flammarion, Mung Chiang, Prateek Mittal, Matthias Hein

    Abstract: As a research community, we are still lacking a systematic understanding of the progress on adversarial robustness which often makes it hard to identify the most promising ideas in training robust models. A key challenge in benchmarking robustness is that its evaluation is often error-prone leading to robustness overestimation. Our goal is to establish a standardized benchmark of adversarial robus…

    Submitted 31 October, 2021; v1 submitted 19 October, 2020; originally announced October 2020.

    Comments: The camera-ready version accepted at the NeurIPS'21 Datasets and Benchmarks Track: 120+ evaluations, 80+ models, 7 leaderboards (Linf, L2, common corruptions; CIFAR-10, CIFAR-100, ImageNet), significantly expanded analysis part (calibration, fairness, privacy leakage, smoothness, transferability)

  5. arXiv:2010.02709  [pdf, other]

    cs.LG stat.ML

    An Infinite-Feature Extension for Bayesian ReLU Nets That Fixes Their Asymptotic Overconfidence

    Authors: Agustinus Kristiadi, Matthias Hein, Philipp Hennig

    Abstract: A Bayesian treatment can mitigate overconfidence in ReLU nets around the training data. But far away from them, ReLU Bayesian neural networks (BNNs) can still underestimate uncertainty and thus be asymptotically overconfident. This issue arises since the output variance of a BNN with finitely many features is quadratic in the distance from the data region. Meanwhile, Bayesian linear models with Re…

    Submitted 24 January, 2022; v1 submitted 6 October, 2020; originally announced October 2020.

    Comments: NeurIPS 2021

  6. arXiv:2007.08473  [pdf, other]

    cs.LG cs.CV stat.ML

    Certifiably Adversarially Robust Detection of Out-of-Distribution Data

    Authors: Julian Bitterwolf, Alexander Meinke, Matthias Hein

    Abstract: Deep neural networks are known to be overconfident when applied to out-of-distribution (OOD) inputs which clearly do not belong to any class. This is a problem in safety-critical applications since a reliable assessment of the uncertainty of a classifier is a key property, allowing the system to trigger human intervention or to transfer into a safe state. In this paper, we aim for certifiable wors…

    Submitted 10 March, 2021; v1 submitted 16 July, 2020; originally announced July 2020.

    Comments: Published and presented at NeurIPS 2020. Code available at https://gitlab.com/Bitterwolf/GOOD v3: added missing acknowledgement

    Journal ref: Advances in Neural Information Processing Systems 33 (NeurIPS 2020)

  7. arXiv:2006.13977  [pdf, other]

    cs.LG cs.AR cs.CR cs.CV stat.ML

    Bit Error Robustness for Energy-Efficient DNN Accelerators

    Authors: David Stutz, Nandhini Chandramoorthy, Matthias Hein, Bernt Schiele

    Abstract: Deep neural network (DNN) accelerators received considerable attention in past years due to saved energy compared to mainstream hardware. Low-voltage operation of DNN accelerators allows to further reduce energy consumption significantly, however, causes bit-level failures in the memory storing the quantized DNN weights. In this paper, we show that a combination of robust fixed-point quantization,…

    Submitted 9 April, 2021; v1 submitted 24 June, 2020; originally announced June 2020.

  8. arXiv:2006.12834  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Sparse-RS: a versatile framework for query-efficient sparse black-box adversarial attacks

    Authors: Francesco Croce, Maksym Andriushchenko, Naman D. Singh, Nicolas Flammarion, Matthias Hein

    Abstract: We propose a versatile framework based on random search, Sparse-RS, for score-based sparse targeted and untargeted attacks in the black-box setting. Sparse-RS does not rely on substitute models and achieves state-of-the-art success rate and query efficiency for multiple sparse attack models: $l_0$-bounded perturbations, adversarial patches, and adversarial frames. The $l_0$-version of untargeted S…

    Submitted 7 February, 2022; v1 submitted 23 June, 2020; originally announced June 2020.

    Comments: Accepted at AAAI 2022. This version contains considerably extended results in the L0 threat model

  9. arXiv:2003.09461  [pdf, other]

    cs.LG cs.CV stat.ML

    Adversarial Robustness on In- and Out-Distribution Improves Explainability

    Authors: Maximilian Augustin, Alexander Meinke, Matthias Hein

    Abstract: Neural networks have led to major improvements in image classification but suffer from being non-robust to adversarial changes, unreliable uncertainty estimates on out-distribution samples and their inscrutable black-box decisions. In this work we propose RATIO, a training procedure for Robustness via Adversarial Training on In- and Out-distribution, which leads to robust models with reliable and…

    Submitted 29 July, 2020; v1 submitted 20 March, 2020; originally announced March 2020.

  10. arXiv:2003.01690  [pdf, other]

    cs.LG cs.CV stat.ML

    Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks

    Authors: Francesco Croce, Matthias Hein

    Abstract: The field of defense strategies against adversarial attacks has significantly grown over the last years, but progress is hampered as the evaluation of adversarial defenses is often insufficient and thus gives a wrong impression of robustness. Many promising defenses could be broken later on, making it difficult to identify the state-of-the-art. Frequent pitfalls in the evaluation are improper tuni…

    Submitted 4 August, 2020; v1 submitted 3 March, 2020; originally announced March 2020.

    Comments: In ICML 2020

  11. arXiv:2002.10118  [pdf, other]

    stat.ML cs.LG

    Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks

    Authors: Agustinus Kristiadi, Matthias Hein, Philipp Hennig

    Abstract: The point estimates of ReLU classification networks---arguably the most widely used neural network architecture---have been shown to yield arbitrarily high confidence far away from the training data. This architecture, in conjunction with a maximum a posteriori estimation scheme, is thus not calibrated nor robust. Approximate Bayesian inference has been empirically demonstrated to improve predicti…

    Submitted 17 July, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

    Comments: ICML 2020

  12. arXiv:1912.00049  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Square Attack: a query-efficient black-box adversarial attack via random search

    Authors: Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion, Matthias Hein

    Abstract: We propose the Square Attack, a score-based black-box $l_2$- and $l_\infty$-adversarial attack that does not rely on local gradient information and thus is not affected by gradient masking. Square Attack is based on a randomized search scheme which selects localized square-shaped updates at random positions so that at each iteration the perturbation is situated approximately at the boundary of the…

    Submitted 29 July, 2020; v1 submitted 29 November, 2019; originally announced December 2019.

    Comments: Accepted at ECCV 2020; added imperceptible perturbations, analysis of examples that require more queries, results on dilated CNNs

  13. arXiv:1910.13951  [pdf, other]

    cs.LG math.NA stat.ML

    Generalized Matrix Means for Semi-Supervised Learning with Multilayer Graphs

    Authors: Pedro Mercado, Francesco Tudisco, Matthias Hein

    Abstract: We study the task of semi-supervised learning on multilayer graphs by taking into account both labeled and unlabeled observations together with the information encoded by each individual graph layer. We propose a regularizer based on the generalized matrix mean, which is a one-parameter family of matrix means that includes the arithmetic, geometric and harmonic means as particular cases. We analyz…

    Submitted 30 October, 2019; originally announced October 2019.

    Comments: Accepted in NeurIPS 2019

  14. arXiv:1910.06259  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Confidence-Calibrated Adversarial Training: Generalizing to Unseen Attacks

    Authors: David Stutz, Matthias Hein, Bernt Schiele

    Abstract: Adversarial training yields robust models against a specific threat model, e.g., $L_\infty$ adversarial examples. Typically robustness does not generalize to previously unseen threat models, e.g., other $L_p$ norms, or larger perturbations. Our confidence-calibrated adversarial training (CCAT) tackles this problem by biasing the model towards low confidence predictions on adversarial examples. By…

    Submitted 30 June, 2020; v1 submitted 14 October, 2019; originally announced October 2019.

  15. arXiv:1909.12180  [pdf, ps, other]

    cs.LG cs.CV stat.ML

    Towards neural networks that provably know when they don't know

    Authors: Alexander Meinke, Matthias Hein

    Abstract: It has recently been shown that ReLU networks produce arbitrarily over-confident predictions far away from the training data. Thus, ReLU networks do not know when they don't know. However, this is a highly important property in safety critical applications. In the context of out-of-distribution detection (OOD) there have been a number of proposals to mitigate this problem but none of them are able…

    Submitted 21 February, 2020; v1 submitted 26 September, 2019; originally announced September 2019.

  16. arXiv:1909.05040  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Sparse and Imperceivable Adversarial Attacks

    Authors: Francesco Croce, Matthias Hein

    Abstract: Neural networks have been proven to be vulnerable to a variety of adversarial attacks. From a safety perspective, highly sparse adversarial attacks are particularly dangerous. On the other hand the pixelwise perturbations of sparse attacks are typically large and thus can be potentially detected. We propose a new black-box technique to craft adversarial examples aiming at minimizing $l_0$-distance…

    Submitted 11 September, 2019; originally announced September 2019.

    Comments: Accepted to ICCV 2019

  17. arXiv:1907.02044  [pdf, other]

    cs.LG cs.CR cs.CV stat.ML

    Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack

    Authors: Francesco Croce, Matthias Hein

    Abstract: The evaluation of robustness against adversarial manipulation of neural networks-based classifiers is mainly tested with empirical attacks as methods for the exact computation, even when available, do not scale to large networks. We propose in this paper a new white-box adversarial attack wrt the $l_p$-norms for $p \in \{1,2,\infty\}$ aiming at finding the minimal perturbation necessary to change…

    Submitted 20 July, 2020; v1 submitted 3 July, 2019; originally announced July 2019.

  18. arXiv:1906.03526  [pdf, other]

    cs.LG cs.CR stat.ML

    Provably Robust Boosted Decision Stumps and Trees against Adversarial Attacks

    Authors: Maksym Andriushchenko, Matthias Hein

    Abstract: The problem of adversarial robustness has been studied extensively for neural networks. However, for boosted decision trees and decision stumps there are almost no results, even though they are widely used in practice (e.g. XGBoost) due to their accuracy, interpretability, and efficiency. We show in this paper that for boosted decision stumps the \textit{exact} min-max robust loss and test error f…

    Submitted 30 October, 2019; v1 submitted 8 June, 2019; originally announced June 2019.

    Comments: Camera-ready version (accepted at NeurIPS 2019)

  19. arXiv:1905.11213  [pdf, other]

    cs.LG cs.CR stat.ML

    Provable robustness against all adversarial $l_p$-perturbations for $p\geq 1$

    Authors: Francesco Croce, Matthias Hein

    Abstract: In recent years several adversarial attacks and defenses have been proposed. Often seemingly robust models turn out to be non-robust when more sophisticated attacks are used. One way out of this dilemma are provable robustness guarantees. While provably robust models for specific $l_p$-perturbation models have been developed, we show that they do not come with any guarantee against other $l_q$-per…

    Submitted 24 April, 2020; v1 submitted 27 May, 2019; originally announced May 2019.

  20. arXiv:1905.06230  [pdf, other]

    cs.LG stat.ML

    Spectral Clustering of Signed Graphs via Matrix Power Means

    Authors: Pedro Mercado, Francesco Tudisco, Matthias Hein

    Abstract: Signed graphs encode positive (attractive) and negative (repulsive) relations between nodes. We extend spectral clustering to signed graphs via the one-parameter family of Signed Power Mean Laplacians, defined as the matrix power mean of normalized standard and signless Laplacians of positive and negative edges. We provide a thorough analysis of the proposed approach in the setting of a general St…

    Submitted 15 May, 2019; originally announced May 2019.

    Comments: final version accepted at ICML 2019

  21. arXiv:1903.11359  [pdf, other]

    cs.LG cs.CR cs.CV cs.NE stat.ML

    Scaling up the randomized gradient-free adversarial attack reveals overestimation of robustness using established attacks

    Authors: Francesco Croce, Jonas Rauber, Matthias Hein

    Abstract: Modern neural networks are highly non-robust against adversarial manipulation. A significant amount of work has been invested in techniques to compute lower bounds on robustness through formal guarantees and to build provably robust models. However, it is still difficult to get guarantees for larger networks or robustness against larger perturbations. Thus attack strategies are needed to provide t…

    Submitted 25 September, 2019; v1 submitted 27 March, 2019; originally announced March 2019.

    Comments: Accepted at International Journal of Computer Vision

  22. arXiv:1812.05720  [pdf, other]

    cs.LG cs.CV stat.ML

    Why ReLU networks yield high-confidence predictions far away from the training data and how to mitigate the problem

    Authors: Matthias Hein, Maksym Andriushchenko, Julian Bitterwolf

    Abstract: Classifiers used in the wild, in particular for safety-critical systems, should not only have good generalization properties but also should know when they don't know, in particular make low confidence predictions far away from the training data. We show that ReLU type neural networks which yield a piecewise linear classifier function fail in this regard as they produce almost always high confiden…

    Submitted 7 May, 2019; v1 submitted 13 December, 2018; originally announced December 2018.

    Comments: Slight update of the CVPR 2019 final version [accepted with an oral presentation]

  23. arXiv:1812.00740  [pdf, other]

    cs.CV cs.CR cs.LG stat.ML

    Disentangling Adversarial Robustness and Generalization

    Authors: David Stutz, Matthias Hein, Bernt Schiele

    Abstract: Obtaining deep networks that are robust against adversarial examples and generalize well is an open problem. A recent hypothesis even states that both robust and accurate models are impossible, i.e., adversarial robustness and generalization are conflicting goals. In an effort to clarify the relationship between robustness and generalization, we assume an underlying, low-dimensional data manifold…

    Submitted 10 April, 2019; v1 submitted 3 December, 2018; originally announced December 2018.

    Comments: Conference on Computer Vision and Pattern Recognition 2019

  24. arXiv:1811.11493  [pdf, other]

    cs.LG cs.CR stat.ML

    A randomized gradient-free attack on ReLU networks

    Authors: Francesco Croce, Matthias Hein

    Abstract: It has recently been shown that neural networks but also other classifiers are vulnerable to so called adversarial attacks e.g. in object recognition an almost non-perceivable change of the image changes the decision of the classifier. Relatively fast heuristics have been proposed to produce these adversarial inputs but the problem of finding the optimal adversarial input, that is with the minimal…

    Submitted 28 November, 2018; originally announced November 2018.

    Comments: In GCPR 2018

  25. arXiv:1810.12042  [pdf, other]

    cs.LG cs.CR stat.ML

    Logit Pairing Methods Can Fool Gradient-Based Attacks

    Authors: Marius Mosbach, Maksym Andriushchenko, Thomas Trost, Matthias Hein, Dietrich Klakow

    Abstract: Recently, Kannan et al. [2018] proposed several logit regularization methods to improve the adversarial robustness of classifiers. We show that the computationally fast methods they propose - Clean Logit Pairing (CLP) and Logit Squeezing (LSQ) - just make the gradient-based optimization problem of crafting adversarial examples harder without providing actual robustness. We find that Adversarial Lo…

    Submitted 12 March, 2019; v1 submitted 29 October, 2018; originally announced October 2018.

    Comments: Accepted to NeurIPS 2018 Workshop on Security in Machine Learning

  26. arXiv:1810.07481  [pdf, other]

    cs.LG stat.ML

    Provable Robustness of ReLU networks via Maximization of Linear Regions

    Authors: Francesco Croce, Maksym Andriushchenko, Matthias Hein

    Abstract: It has been shown that neural network classifiers are not robust. This raises concerns about their usage in safety-critical systems. We propose in this paper a regularization scheme for ReLU networks which provably improves the robustness of the classifier by maximizing the linear regions of the classifier as well as the distance to the decision boundary. Our techniques allow even to find the mini…

    Submitted 8 March, 2019; v1 submitted 17 October, 2018; originally announced October 2018.

    Comments: In AISTATS 2019. Conference version with the following modifications: improved readability, comparison to Xiao et al (2018) added, section on visualizations extended

  27. arXiv:1809.10749  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    On the loss landscape of a class of deep neural networks with no bad local valleys

    Authors: Quynh Nguyen, Mahesh Chandra Mukkamala, Matthias Hein

    Abstract: We identify a class of over-parameterized deep neural networks with standard activation functions and cross-entropy loss which provably have no bad local valley, in the sense that from any point in parameter space there exists a continuous path on which the cross-entropy loss is non-increasing and gets arbitrarily close to zero. This implies that these networks have no sub-optimal strict local min…

    Submitted 23 December, 2018; v1 submitted 27 September, 2018; originally announced September 2018.

    Comments: Accepted at ICLR 2019

  28. arXiv:1803.00491  [pdf, other]

    stat.ML cs.LG math.NA

    The Power Mean Laplacian for Multilayer Graph Clustering

    Authors: Pedro Mercado, Antoine Gautier, Francesco Tudisco, Matthias Hein

    Abstract: Multilayer graphs encode different kind of interactions between the same set of entities. When one wants to cluster such a multilayer graph, the natural question arises how one should merge the information different layers. We introduce in this paper a one-parameter family of matrix power means for merging the Laplacians from different layers and analyze it in expectation in the stochastic block m…

    Submitted 1 March, 2018; originally announced March 2018.

    Comments: 19 pages, 3 figures. Accepted in Artificial Intelligence and Statistics (AISTATS), 2018

  29. arXiv:1803.00094  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Neural Networks Should Be Wide Enough to Learn Disconnected Decision Regions

    Authors: Quynh Nguyen, Mahesh Chandra Mukkamala, Matthias Hein

    Abstract: In the recent literature the important role of depth in deep learning has been emphasized. In this paper we argue that sufficient width of a feedforward network is equally important by answering the simple question under which conditions the decision regions of a neural network are connected. It turns out that for a class of activation functions including leaky ReLU, neural networks having a pyram…

    Submitted 8 June, 2018; v1 submitted 28 February, 2018; originally announced March 2018.

    Comments: Accepted at ICML 2018. Added discussion for non-pyramidal networks and ReLU activation function

  30. arXiv:1801.10108  [pdf, ps, other]

    stat.ML math.AP math.DG math.PR

    Error estimates for spectral convergence of the graph Laplacian on random geometric graphs towards the Laplace--Beltrami operator

    Authors: Nicolas Garcia Trillos, Moritz Gerlach, Matthias Hein, Dejan Slepcev

    Abstract: We study the convergence of the graph Laplacian of a random geometric graph generated by an i.i.d. sample from a $m$-dimensional submanifold $M$ in $R^d$ as the sample size $n$ increases and the neighborhood size $h$ tends to zero. We show that eigenvalues and eigenvectors of the graph Laplacian converge with a rate of $O\Big(\big(\frac{\log n}{n}\big)^\frac{1}{2m}\Big)$ to the eigenvalues and eig…

    Submitted 30 January, 2018; originally announced January 2018.

    MSC Class: 62G20; 65N25; 60D05; 58J50; 68R10; 05C50

  31. arXiv:1710.10928  [pdf, other]

    cs.LG cs.AI cs.CV math.OC stat.ML

    Optimization Landscape and Expressivity of Deep CNNs

    Authors: Quynh Nguyen, Matthias Hein

    Abstract: We analyze the loss landscape and expressiveness of practical deep convolutional neural networks (CNNs) with shared weights and max pooling layers. We show that such CNNs produce linearly independent features at a "wide" layer which has more neurons than the number of training samples. This condition holds e.g. for the VGG network. Furthermore, we provide for such wide CNNs necessary and sufficien…

    Submitted 6 June, 2018; v1 submitted 30 October, 2017; originally announced October 2017.

    Comments: Accepted at ICML 2018

  32. arXiv:1708.05569  [pdf, other]

    cs.SI math.OC stat.ML

    Community detection in networks via nonlinear modularity eigenvectors

    Authors: Francesco Tudisco, Pedro Mercado, Matthias Hein

    Abstract: Revealing a community structure in a network or dataset is a central problem arising in many scientific areas. The modularity function $Q$ is an established measure quantifying the quality of a community, being identified as a set of nodes having high modularity. In our terminology, a set of nodes with positive modularity is called a \textit{module} and a set that maximizes $Q$ is thus called \tex…

    Submitted 12 September, 2018; v1 submitted 18 August, 2017; originally announced August 2017.

    MSC Class: 05C50; 05C70; 47H30; 68R10

    Journal ref: SIAM J. Applied Mathematics, 78:2393--2419, 2018

  33. arXiv:1706.05507  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE stat.ML

    Variants of RMSProp and Adagrad with Logarithmic Regret Bounds

    Authors: Mahesh Chandra Mukkamala, Matthias Hein

    Abstract: Adaptive gradient methods have become recently very popular, in particular as they have been shown to be useful in the training of deep neural networks. In this paper we have analyzed RMSProp, originally proposed for the training of deep neural networks, in the context of online convex optimization and show $\sqrt{T}$-type regret bounds. Moreover, we propose two variants SC-Adagrad and SC-RMSProp…

    Submitted 28 November, 2017; v1 submitted 17 June, 2017; originally announced June 2017.

    Comments: ICML 2017, 16 pages, 23 figures

  34. arXiv:1705.08475  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation

    Authors: Matthias Hein, Maksym Andriushchenko

    Abstract: Recent work has shown that state-of-the-art classifiers are quite brittle, in the sense that a small adversarial change of an originally with high confidence correctly classified input leads to a wrong classification again with high confidence. This raises concerns that such classifiers are vulnerable to attacks and calls into question their usage in safety-critical systems. We show in this paper…

    Submitted 5 November, 2017; v1 submitted 23 May, 2017; originally announced May 2017.

    Comments: final version accepted at NIPS 2017, fixed bug in implementation of Cross-Lipschitz regularization and lower bound computation, now results are better

  35. arXiv:1704.08045  [pdf, ps, other]

    cs.LG cs.AI cs.CV cs.NE stat.ML

    The loss surface of deep and wide neural networks

    Authors: Quynh Nguyen, Matthias Hein

    Abstract: While the optimization problem behind deep neural networks is highly non-convex, it is frequently observed in practice that training deep networks seems possible without getting stuck in suboptimal points. It has been argued that this is the case as all local minima are close to being globally optimal. We show that this is (almost) true, in fact almost all local minima are globally optimal, for a…

    Submitted 12 June, 2017; v1 submitted 26 April, 2017; originally announced April 2017.

    Comments: ICML 2017. Main results now hold for larger classes of loss functions

  36. arXiv:1701.00757  [pdf, other]

    stat.ML cs.LG math.NA

    Clustering Signed Networks with the Geometric Mean of Laplacians

    Authors: Pedro Mercado, Francesco Tudisco, Matthias Hein

    Abstract: Signed networks allow to model positive and negative relationships. We analyze existing extensions of spectral clustering to signed networks. It turns out that existing approaches do not recover the ground truth clustering in several situations where either the positive or the negative network structures contain no noise. Our analysis shows that these problems arise as existing approaches take som…

    Submitted 3 January, 2017; originally announced January 2017.

    Comments: 14 pages, 5 figures. Accepted in Neural Information Processing Systems (NIPS), 2016

    Journal ref: Advances in Neural Information Processing Systems 29, pp.4421--4429, 2016

  37. arXiv:1612.03663  [pdf, other]

    cs.CV cs.LG stat.ML

    Analysis and Optimization of Loss Functions for Multiclass, Top-k, and Multilabel Classification

    Authors: Maksim Lapin, Matthias Hein, Bernt Schiele

    Abstract: Top-k error is currently a popular performance measure on large scale image classification benchmarks such as ImageNet and Places. Despite its wide acceptance, our understanding of this metric is limited as most of the previous research is focused on its special case, the top-1 error. In this work, we explore two directions that shed more light on the top-k error. First, we provide an in-depth ana…

    Submitted 12 December, 2016; originally announced December 2016.

  38. arXiv:1610.09300  [pdf, other]

    cs.LG math.OC stat.ML

    Globally Optimal Training of Generalized Polynomial Neural Networks with Nonlinear Spectral Methods

    Authors: Antoine Gautier, Quynh Nguyen, Matthias Hein

    Abstract: The optimization problem behind neural networks is highly non-convex. Training with stochastic gradient descent and variants requires careful parameter tuning and provides no guarantee to achieve the global optimum. In contrast we show under quite weak assumptions on the data that a particular class of feedforward neural networks can be trained globally optimal with a linear convergence rate with…

    Submitted 28 October, 2016; originally announced October 2016.

    Comments: Long version of NIPS 2016 paper

  39. arXiv:1512.00486  [pdf, other]

    stat.ML cs.CV cs.LG

    Loss Functions for Top-k Error: Analysis and Insights

    Authors: Maksim Lapin, Matthias Hein, Bernt Schiele

    Abstract: In order to push the performance on realistic computer vision tasks, the number of classes in modern benchmark datasets has significantly increased in recent years. This increase in the number of classes comes along with increased ambiguity between the class labels, raising the question if top-1 error is the right performance measure. In this paper, we provide an extensive comparison and evaluatio…

    Submitted 13 April, 2016; v1 submitted 1 December, 2015; originally announced December 2015.

    Comments: In Computer Vision and Pattern Recognition (CVPR), 2016

  40. arXiv:1511.06683  [pdf, other]

    stat.ML cs.CV cs.LG

    Top-k Multiclass SVM

    Authors: Maksim Lapin, Matthias Hein, Bernt Schiele

    Abstract: Class ambiguity is typical in image classification problems with a large number of classes. When classes are difficult to discriminate, it makes sense to allow k guesses and evaluate classifiers based on the top-k error instead of the standard zero-one loss. We propose top-k multiclass SVM as a direct method to optimize for top-k performance. Our generalization of the well-known multiclass SVM is…

    Submitted 20 November, 2015; originally announced November 2015.

    Comments: NIPS 2015

  41. arXiv:1511.05706  [pdf, other]

    stat.ML cs.LG

    Efficient Output Kernel Learning for Multiple Tasks

    Authors: Pratik Jawanpuria, Maksim Lapin, Matthias Hein, Bernt Schiele

    Abstract: The paradigm of multi-task learning is that one can achieve better generalization by learning tasks jointly and thus exploiting the similarity between the tasks rather than learning them independently of each other. While previously the relationship between tasks had to be user-defined in the form of an output kernel, recent approaches jointly learn the tasks and the output kernel. As the output k…

    Submitted 18 November, 2015; originally announced November 2015.

  42. arXiv:1506.00323  [pdf, other]

    stat.ML cs.LG

    Robust PCA: Optimization of the Robust Reconstruction Error over the Stiefel Manifold

    Authors: Anastasia Podosinnikova, Simon Setzer, Matthias Hein

    Abstract: It is well known that Principal Component Analysis (PCA) is strongly affected by outliers and a lot of effort has been put into robustification of PCA. In this paper we present a new algorithm for robust PCA minimizing the trimmed reconstruction error. By directly minimizing over the Stiefel manifold, we avoid deflation as often used by projection pursuit methods. In distinction to other methods f…

    Submitted 31 May, 2015; originally announced June 2015.

    Comments: long version of GCPR 2014 paper

  43. arXiv:1505.06485  [pdf, other]

    stat.ML cs.LG

    Constrained 1-Spectral Clustering

    Authors: Syama Sundar Rangapuram, Matthias Hein

    Abstract: An important form of prior information in clustering comes in form of cannot-link and must-link constraints. We present a generalization of the popular spectral clustering technique which integrates such constraints. Motivated by the recently proposed $1$-spectral clustering for the unconstrained problem, our method is based on a tight relaxation of the constrained normalized cut into a continuous…

    Submitted 24 May, 2015; originally announced May 2015.

    Comments: Long version of paper accepted at AISTATS 2012

  44. arXiv:1505.06478  [pdf, other]

    stat.ML cs.LG

    Tight Continuous Relaxation of the Balanced $k$-Cut Problem

    Authors: Syama Sundar Rangapuram, Pramod Kaushik Mudrakarta, Matthias Hein

    Abstract: Spectral Clustering as a relaxation of the normalized/ratio cut has become one of the standard graph-based clustering methods. Existing methods for the computation of multiple clusters, corresponding to a balanced $k$-cut of the graph, are either based on greedy techniques or heuristics which have weak connection to the original motivation of minimizing the normalized cut. In this paper we propose…

    Submitted 24 May, 2015; originally announced May 2015.

    Comments: Long version of paper accepted at NIPS 2014

  45. arXiv:1504.06305  [pdf, ps, other]

    stat.ML cs.LG stat.ME

    Regularization-free estimation in trace regression with symmetric positive semidefinite matrices

    Authors: Martin Slawski, Ping Li, Matthias Hein

    Abstract: Over the past few years, trace regression models have received considerable attention in the context of matrix completion, quantum state tomography, and compressed sensing. Estimation of the underlying matrix from regularization-based approaches promoting low-rankedness, notably nuclear norm regularization, have enjoyed great popularity. In the present paper, we argue that such regularization may…

    Submitted 23 April, 2015; originally announced April 2015.

  46. arXiv:1404.6640  [pdf, ps, other]

    math.ST stat.ML

    Estimation of positive definite M-matrices and structure learning for attractive Gaussian Markov Random fields

    Authors: Martin Slawski, Matthias Hein

    Abstract: Consider a random vector with finite second moments. If its precision matrix is an M-matrix, then all partial correlations are non-negative. If that random vector is additionally Gaussian, the corresponding Markov random field (GMRF) is called attractive. We study estimation of M-matrices taking the role of inverse second moment or precision matrices using sign-constrained log-determinant divergen…

    Submitted 26 April, 2014; originally announced April 2014.

    Comments: long version of a manuscript accepted for publication in Linear Algebra and its Applications

  47. arXiv:1401.6024  [pdf, ps, other]

    stat.ML cs.LG

    Matrix factorization with Binary Components

    Authors: Martin Slawski, Matthias Hein, Pavlo Lutsik

    Abstract: Motivated by an application in computational biology, we consider low-rank matrix factorization with $\{0,1\}$-constraints on one of the factors and optionally convex constraints on the second one. In addition to the non-convexity shared with other matrix factorization schemes, our problem is further complicated by a combinatorial constraint set of size $2^{m \cdot r}$, where $m$ is the dimension…

    Submitted 23 January, 2014; originally announced January 2014.

    Comments: appeared in NIPS 2013

  48. arXiv:1312.5192  [pdf, ps, other]

    stat.ML cs.LG math.OC

    Nonlinear Eigenproblems in Data Analysis - Balanced Graph Cuts and the RatioDCA-Prox

    Authors: Leonardo Jost, Simon Setzer, Matthias Hein

    Abstract: It has been recently shown that a large class of balanced graph cuts allows for an exact relaxation into a nonlinear eigenproblem. We review briefly some of these results and propose a family of algorithms to compute nonlinear eigenvectors which encompasses previous work as special cases. We provide a detailed analysis of the properties and the convergence behavior of these algorithms and then dis…

    Submitted 24 March, 2014; v1 submitted 18 December, 2013; originally announced December 2013.

  49. arXiv:1312.5179  [pdf, other]

    stat.ML cs.LG math.OC

    The Total Variation on Hypergraphs - Learning on Hypergraphs Revisited

    Authors: Matthias Hein, Simon Setzer, Leonardo Jost, Syama Sundar Rangapuram

    Abstract: Hypergraphs allow one to encode higher-order relationships in data and are thus a very flexible modeling tool. Current learning methods are either based on approximations of the hypergraphs via graphs or on tensor methods which are only applicable under special conditions. In this paper, we present a new learning framework on hypergraphs which fully uses the hypergraph structure. The key element i…

    Submitted 18 December, 2013; originally announced December 2013.

    Comments: Long version of paper accepted at NIPS 2013

  50. arXiv:1306.3409  [pdf, other]

    stat.ML cs.LG math.OC

    Constrained fractional set programs and their application in local clustering and community detection

    Authors: Thomas Bühler, Syama Sundar Rangapuram, Simon Setzer, Matthias Hein

    Abstract: The (constrained) minimization of a ratio of set functions is a problem frequently occurring in clustering and community detection. As these optimization problems are typically NP-hard, one uses convex or spectral relaxations in practice. While these relaxations can be solved globally optimally, they are often too loose and thus lead to results far away from the optimum. In this paper we show that…

    Submitted 14 June, 2013; originally announced June 2013.

    Comments: Long version of paper accepted at ICML 2013