arXiv:2006.12135

Learning to Generate Noise for Multi-Attack Robustness

Authors: Divyam Madaan, Jinwoo Shin, Sung Ju Hwang

Abstract: Adversarial learning has emerged as one of the successful techniques to circumvent the susceptibility of existing methods against adversarial perturbations. However, the majority of existing defense methods are tailored to defend against a single category of adversarial perturbation (e.g. $\ell_\infty$-attack). In safety-critical applications, this makes these methods extraneous as the attacker can adopt diverse adversaries to deceive the system. Moreover, training on multiple perturbations simultaneously significantly increases the computational overhead during training. To address these challenges, we propose a novel meta-learning framework that explicitly learns to generate noise to improve the model's robustness against multiple types of attacks. Its key component is Meta Noise Generator (MNG) that outputs optimal noise to stochastically perturb a given sample, such that it helps lower the error on diverse adversarial perturbations. By utilizing samples generated by MNG, we train a model by enforcing the label consistency across multiple perturbations. We validate the robustness of models trained by our scheme on various datasets and against a wide variety of perturbations, demonstrating that it significantly outperforms the baselines across multiple perturbations with a marginal computational cost.

Submitted 24 June, 2021; v1 submitted 22 June, 2020; originally announced June 2020.

Comments: Accepted to ICML 2021. Code available at https://github.com/divyam3897/MNG_AC

arXiv:1908.04355

Adversarial Neural Pruning with Latent Vulnerability Suppression

Authors: Divyam Madaan, Jinwoo Shin, Sung Ju Hwang

Abstract: Despite the remarkable performance of deep neural networks on various computer vision tasks, they are known to be susceptible to adversarial perturbations, which makes it challenging to deploy them in real-world safety-critical applications. In this paper, we conjecture that the leading cause of adversarial vulnerability is the distortion in the latent feature space, and provide methods to suppress them effectively. Explicitly, we define \emph{vulnerability} for each latent feature and then propose a new loss for adversarial learning, \emph{Vulnerability Suppression (VS)} loss, that aims to minimize the feature-level vulnerability during training. We further propose a Bayesian framework to prune features with high vulnerability to reduce both vulnerability and loss on adversarial samples. We validate our \emph{Adversarial Neural Pruning with Vulnerability Suppression (ANP-VS)} method on multiple benchmark datasets, on which it not only obtains state-of-the-art adversarial robustness but also improves the performance on clean examples, using only a fraction of the parameters used by the full network. Further qualitative analysis suggests that the improvements come from the suppression of feature-level vulnerability.

Submitted 2 July, 2020; v1 submitted 12 August, 2019; originally announced August 2019.

Comments: Accepted to ICML 2020. Code available at https://github.com/divyam3897/ANP_VS

arXiv:1905.13678

Learning Sparse Networks Using Targeted Dropout

Authors: Aidan N. Gomez, Ivan Zhang, Siddhartha Rao Kamalakara, Divyam Madaan, Kevin Swersky, Yarin Gal, Geoffrey E. Hinton

Abstract: Neural networks are easier to optimise when they have many more weights than are required for modelling the mapping from inputs to outputs. This suggests a two-stage learning procedure that first learns a large net and then prunes away connections or hidden units. But standard training does not necessarily encourage nets to be amenable to pruning. We introduce targeted dropout, a method for training a neural network so that it is robust to subsequent pruning. Before computing the gradients for each weight update, targeted dropout stochastically selects a set of units or weights to be dropped using a simple self-reinforcing sparsity criterion and then computes the gradients for the remaining weights. The resulting network is robust to post hoc pruning of weights or units that frequently occur in the dropped sets. The method improves upon more complicated sparsifying regularisers while being simple to implement and easy to tune.

Submitted 9 September, 2019; v1 submitted 31 May, 2019; originally announced May 2019.
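
Illustrative note: the selection step described in the abstract above (stochastically choosing weights to drop before each gradient computation) might be sketched roughly as follows. This is a minimal sketch and not the authors' implementation. The abstract does not spell out the sparsity criterion, so weight magnitude is assumed here; the function name `targeted_weight_dropout` and the parameters `targ_rate` (fraction of each column treated as drop candidates) and `drop_rate` (probability that a candidate is actually dropped) are hypothetical names chosen for this example.

```python
import random


def targeted_weight_dropout(weights, targ_rate=0.5, drop_rate=0.5, rng=None):
    """Return a copy of `weights` with some low-magnitude entries set to zero.

    `weights` is a list of rows (n_in x n_out). Assumption for this sketch:
    the sparsity criterion is weight magnitude, applied per output column.
    """
    rng = rng or random.Random()
    n_in = len(weights)
    n_out = len(weights[0]) if n_in else 0
    k = int(targ_rate * n_in)  # number of drop candidates per column

    out = [row[:] for row in weights]  # leave the caller's weights untouched
    for j in range(n_out):
        # Rank the rows of column j from smallest to largest magnitude.
        rows_by_magnitude = sorted(range(n_in), key=lambda i: abs(weights[i][j]))
        # Only the k smallest-magnitude weights are candidates; each one is
        # dropped independently with probability `drop_rate`.
        for i in rows_by_magnitude[:k]:
            if rng.random() < drop_rate:
                out[i][j] = 0.0
    return out
```

In a training loop, the masked copy would be used for the forward and backward pass while the underlying weights are kept, so a weight dropped in one step can still be updated in later steps. With `drop_rate=1.0` the function reduces to deterministic magnitude pruning of the lowest `targ_rate` fraction in each column.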

arXiv:1801.08852

Calibration for Weak Variance-Alpha-Gamma Processes

Authors: Boris Buchmann, Kevin W. Lu, Dilip B. Madan

Abstract: The weak variance-alpha-gamma process is a multivariate Lévy process constructed by weakly subordinating Brownian motion, possibly with correlated components with an alpha-gamma subordinator. It generalises the variance-alpha-gamma process of Semeraro constructed by traditional subordination. We compare three calibration methods for the weak variance-alpha-gamma process, method of moments, maximum likelihood estimation (MLE) and digital moment estimation (DME). We derive a condition for Fourier invertibility needed to apply MLE and show in our simulations that MLE produces a better fit when this condition holds, while DME produces a better fit when it is violated. We also find that the weak variance-alpha-gamma process exhibits a wider range of dependence and produces a significantly better fit than the variance-alpha-gamma process on an S&P500-FTSE100 data set, and that DME produces the best fit in this situation.

Submitted 27 July, 2018; v1 submitted 26 January, 2018; originally announced January 2018.

MSC Class: 60G51; 62F10; 60E10

Showing 1–4 of 4 results for author: Madaan, D