-
Learning to Generate Noise for Multi-Attack Robustness
Authors:
Divyam Madaan,
Jinwoo Shin,
Sung Ju Hwang
Abstract:
Adversarial learning has emerged as one of the successful techniques to circumvent the susceptibility of existing methods to adversarial perturbations. However, the majority of existing defense methods are tailored to defend against a single category of adversarial perturbation (e.g. $\ell_\infty$-attack). In safety-critical applications, this makes these methods insufficient, as the attacker can adopt diverse adversaries to deceive the system. Moreover, training on multiple perturbations simultaneously significantly increases the computational overhead during training. To address these challenges, we propose a novel meta-learning framework that explicitly learns to generate noise to improve the model's robustness against multiple types of attacks. Its key component is the Meta Noise Generator (MNG), which outputs optimal noise to stochastically perturb a given sample such that it helps lower the error on diverse adversarial perturbations. By utilizing samples generated by MNG, we train a model by enforcing label consistency across multiple perturbations. We validate the robustness of models trained by our scheme on various datasets and against a wide variety of perturbations, demonstrating that it significantly outperforms the baselines across multiple perturbations at a marginal computational cost.
Submitted 24 June, 2021; v1 submitted 22 June, 2020;
originally announced June 2020.
-
Adversarial Neural Pruning with Latent Vulnerability Suppression
Authors:
Divyam Madaan,
Jinwoo Shin,
Sung Ju Hwang
Abstract:
Despite the remarkable performance of deep neural networks on various computer vision tasks, they are known to be susceptible to adversarial perturbations, which makes it challenging to deploy them in real-world safety-critical applications. In this paper, we conjecture that the leading cause of adversarial vulnerability is the distortion in the latent feature space, and provide methods to suppress it effectively. Explicitly, we define \emph{vulnerability} for each latent feature and then propose a new loss for adversarial learning, \emph{Vulnerability Suppression (VS)} loss, that aims to minimize the feature-level vulnerability during training. We further propose a Bayesian framework to prune features with high vulnerability to reduce both vulnerability and loss on adversarial samples. We validate our \emph{Adversarial Neural Pruning with Vulnerability Suppression (ANP-VS)} method on multiple benchmark datasets, on which it not only obtains state-of-the-art adversarial robustness but also improves the performance on clean examples, using only a fraction of the parameters used by the full network. Further qualitative analysis suggests that the improvements come from the suppression of feature-level vulnerability.
Submitted 2 July, 2020; v1 submitted 12 August, 2019;
originally announced August 2019.
-
Learning Sparse Networks Using Targeted Dropout
Authors:
Aidan N. Gomez,
Ivan Zhang,
Siddhartha Rao Kamalakara,
Divyam Madaan,
Kevin Swersky,
Yarin Gal,
Geoffrey E. Hinton
Abstract:
Neural networks are easier to optimise when they have many more weights than are required for modelling the mapping from inputs to outputs. This suggests a two-stage learning procedure that first learns a large net and then prunes away connections or hidden units. But standard training does not necessarily encourage nets to be amenable to pruning. We introduce targeted dropout, a method for training a neural network so that it is robust to subsequent pruning. Before computing the gradients for each weight update, targeted dropout stochastically selects a set of units or weights to be dropped using a simple self-reinforcing sparsity criterion and then computes the gradients for the remaining weights. The resulting network is robust to post hoc pruning of weights or units that frequently occur in the dropped sets. The method improves upon more complicated sparsifying regularisers while being simple to implement and easy to tune.
Submitted 9 September, 2019; v1 submitted 31 May, 2019;
originally announced May 2019.
-
Calibration for Weak Variance-Alpha-Gamma Processes
Authors:
Boris Buchmann,
Kevin W. Lu,
Dilip B. Madan
Abstract:
The weak variance-alpha-gamma process is a multivariate Lévy process constructed by weakly subordinating Brownian motion, possibly with correlated components, with an alpha-gamma subordinator. It generalises the variance-alpha-gamma process of Semeraro constructed by traditional subordination. We compare three calibration methods for the weak variance-alpha-gamma process: method of moments, maximum likelihood estimation (MLE), and digital moment estimation (DME). We derive a condition for Fourier invertibility needed to apply MLE and show in our simulations that MLE produces a better fit when this condition holds, while DME produces a better fit when it is violated. We also find that the weak variance-alpha-gamma process exhibits a wider range of dependence and produces a significantly better fit than the variance-alpha-gamma process on an S&P500-FTSE100 data set, and that DME produces the best fit in this situation.
Submitted 27 July, 2018; v1 submitted 26 January, 2018;
originally announced January 2018.