Search | arXiv e-print repository

Mitigating Backdoor Attacks using Activation-Guided Model Editing

Authors: Felix Hsieh, Huy H. Nguyen, AprilPyone MaungMaung, Dmitrii Usynin, Isao Echizen

Abstract: Backdoor attacks compromise the integrity and reliability of machine learning models by embedding a hidden trigger during the training process, which can later be activated to cause unintended misbehavior. We propose a novel backdoor mitigation approach via machine unlearning to counter such backdoor attacks. The proposed method utilizes model activation of domain-equivalent unseen data to guide t… ▽ More Backdoor attacks compromise the integrity and reliability of machine learning models by embedding a hidden trigger during the training process, which can later be activated to cause unintended misbehavior. We propose a novel backdoor mitigation approach via machine unlearning to counter such backdoor attacks. The proposed method utilizes model activation of domain-equivalent unseen data to guide the editing of the model's weights. Unlike the previous unlearning-based mitigation methods, ours is computationally inexpensive and achieves state-of-the-art performance while only requiring a handful of unseen samples for unlearning. In addition, we also point out that unlearning the backdoor may cause the whole targeted class to be unlearned, thus introducing an additional repair step to preserve the model's utility after editing the model. Experiment results show that the proposed method is effective in unlearning the backdoor on different datasets and trigger patterns. △ Less

Submitted 30 September, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

arXiv:2402.08200 [pdf, other]

Fine-Tuning Text-To-Image Diffusion Models for Class-Wise Spurious Feature Generation

Authors: AprilPyone MaungMaung, Huy H. Nguyen, Hitoshi Kiya, Isao Echizen

Abstract: We propose a method for generating spurious features by leveraging large-scale text-to-image diffusion models. Although the previous work detects spurious features in a large-scale dataset like ImageNet and introduces Spurious ImageNet, we found that not all spurious images are spurious across different classifiers. Although spurious images help measure the reliance of a classifier, filtering many… ▽ More We propose a method for generating spurious features by leveraging large-scale text-to-image diffusion models. Although the previous work detects spurious features in a large-scale dataset like ImageNet and introduces Spurious ImageNet, we found that not all spurious images are spurious across different classifiers. Although spurious images help measure the reliance of a classifier, filtering many images from the Internet to find more spurious features is time-consuming. To this end, we utilize an existing approach of personalizing large-scale text-to-image diffusion models with available discovered spurious images and propose a new spurious feature similarity loss based on neural features of an adversarially robust model. Precisely, we fine-tune Stable Diffusion with several reference images from Spurious ImageNet with a modified objective incorporating the proposed spurious-feature similarity loss. Experiment results show that our method can generate spurious images that are consistently spurious across different classifiers. Moreover, the generated spurious images are visually similar to reference images from Spurious ImageNet. △ Less

Submitted 12 February, 2024; originally announced February 2024.

arXiv:2401.07441 [pdf, other]

Stability Analysis of ChatGPT-based Sentiment Analysis in AI Quality Assurance

Authors: Tinghui Ouyang, AprilPyone MaungMaung, Koichi Konishi, Yoshiki Seo, Isao Echizen

Abstract: In the era of large AI models, the complex architecture and vast parameters present substantial challenges for effective AI quality management (AIQM), e.g. large language model (LLM). This paper focuses on investigating the quality assurance of a specific LLM-based AI product--a ChatGPT-based sentiment analysis system. The study delves into stability issues related to both the operation and robust… ▽ More In the era of large AI models, the complex architecture and vast parameters present substantial challenges for effective AI quality management (AIQM), e.g. large language model (LLM). This paper focuses on investigating the quality assurance of a specific LLM-based AI product--a ChatGPT-based sentiment analysis system. The study delves into stability issues related to both the operation and robustness of the expansive AI model on which ChatGPT is based. Experimental analysis is conducted using benchmark datasets for sentiment analysis. The results reveal that the constructed ChatGPT-based sentiment analysis system exhibits uncertainty, which is attributed to various operational factors. It demonstrated that the system also exhibits stability issues in handling conventional small text attacks involving robustness. △ Less

Submitted 14 January, 2024; originally announced January 2024.

arXiv:2311.16577 [pdf, other]

Efficient Key-Based Adversarial Defense for ImageNet by Using Pre-trained Model

Authors: AprilPyone MaungMaung, Isao Echizen, Hitoshi Kiya

Abstract: In this paper, we propose key-based defense model proliferation by leveraging pre-trained models and utilizing recent efficient fine-tuning techniques on ImageNet-1k classification. First, we stress that deploying key-based models on edge devices is feasible with the latest model deployment advancements, such as Apple CoreML, although the mainstream enterprise edge artificial intelligence (Edge AI… ▽ More In this paper, we propose key-based defense model proliferation by leveraging pre-trained models and utilizing recent efficient fine-tuning techniques on ImageNet-1k classification. First, we stress that deploying key-based models on edge devices is feasible with the latest model deployment advancements, such as Apple CoreML, although the mainstream enterprise edge artificial intelligence (Edge AI) has been focused on the Cloud. Then, we point out that the previous key-based defense on on-device image classification is impractical for two reasons: (1) training many classifiers from scratch is not feasible, and (2) key-based defenses still need to be thoroughly tested on large datasets like ImageNet. To this end, we propose to leverage pre-trained models and utilize efficient fine-tuning techniques to proliferate key-based models even on limited computing resources. Experiments were carried out on the ImageNet-1k dataset using adaptive and non-adaptive attacks. The results show that our proposed fine-tuned key-based models achieve a superior classification accuracy (more than 10% increase) compared to the previous key-based models on classifying clean and adversarial examples. △ Less

Submitted 28 November, 2023; originally announced November 2023.

arXiv:2309.01620 [pdf, other]

Hindering Adversarial Attacks with Multiple Encrypted Patch Embeddings

Authors: AprilPyone MaungMaung, Isao Echizen, Hitoshi Kiya

Abstract: In this paper, we propose a new key-based defense focusing on both efficiency and robustness. Although the previous key-based defense seems effective in defending against adversarial examples, carefully designed adaptive attacks can bypass the previous defense, and it is difficult to train the previous defense on large datasets like ImageNet. We build upon the previous defense with two major impro… ▽ More In this paper, we propose a new key-based defense focusing on both efficiency and robustness. Although the previous key-based defense seems effective in defending against adversarial examples, carefully designed adaptive attacks can bypass the previous defense, and it is difficult to train the previous defense on large datasets like ImageNet. We build upon the previous defense with two major improvements: (1) efficient training and (2) optional randomization. The proposed defense utilizes one or more secret patch embeddings and classifier heads with a pre-trained isotropic network. When more than one secret embeddings are used, the proposed defense enables randomization on inference. Experiments were carried out on the ImageNet dataset, and the proposed defense was evaluated against an arsenal of state-of-the-art attacks, including adaptive ones. The results show that the proposed defense achieves a high robust accuracy and a comparable clean accuracy compared to the previous key-based defense. △ Less

Submitted 4 September, 2023; originally announced September 2023.

Comments: To appear in APSIPA ASC 2023

arXiv:2303.05036 [pdf, other]

Generative Model-Based Attack on Learnable Image Encryption for Privacy-Preserving Deep Learning

Authors: AprilPyone MaungMaung, Hitoshi Kiya

Abstract: In this paper, we propose a novel generative model-based attack on learnable image encryption methods proposed for privacy-preserving deep learning. Various learnable encryption methods have been studied to protect the sensitive visual information of plain images, and some of them have been investigated to be robust enough against all existing attacks. However, previous attacks on image encryption… ▽ More In this paper, we propose a novel generative model-based attack on learnable image encryption methods proposed for privacy-preserving deep learning. Various learnable encryption methods have been studied to protect the sensitive visual information of plain images, and some of them have been investigated to be robust enough against all existing attacks. However, previous attacks on image encryption focus only on traditional cryptanalytic attacks or reverse translation models, so these attacks cannot recover any visual information if a block-scrambling encryption step, which effectively destroys global information, is applied. Accordingly, in this paper, generative models are explored to evaluate whether such models can restore sensitive visual information from encrypted images for the first time. We first point out that encrypted images have some similarity with plain images in the embedding space. By taking advantage of leaked information from encrypted images, we propose a guided generative model as an attack on learnable image encryption to recover personally identifiable visual information. We implement the proposed attack in two ways by utilizing two state-of-the-art generative models: a StyleGAN-based model and latent diffusion-based one. Experiments were carried out on the CelebA-HQ and ImageNet datasets. Results show that images reconstructed by the proposed method have perceptual similarities to plain images. △ Less

Submitted 9 March, 2023; originally announced March 2023.

Comments: arXiv admin note: text overlap with arXiv:2209.07953

arXiv:2302.06883 [pdf, other]

Text-Guided Scene Sketch-to-Photo Synthesis

Authors: AprilPyone MaungMaung, Makoto Shing, Kentaro Mitsui, Kei Sawada, Fumio Okura

Abstract: We propose a method for scene-level sketch-to-photo synthesis with text guidance. Although object-level sketch-to-photo synthesis has been widely studied, whole-scene synthesis is still challenging without reference photos that adequately reflect the target style. To this end, we leverage knowledge from recent large-scale pre-trained generative models, resulting in text-guided sketch-to-photo synt… ▽ More We propose a method for scene-level sketch-to-photo synthesis with text guidance. Although object-level sketch-to-photo synthesis has been widely studied, whole-scene synthesis is still challenging without reference photos that adequately reflect the target style. To this end, we leverage knowledge from recent large-scale pre-trained generative models, resulting in text-guided sketch-to-photo synthesis without the need for reference images. To train our model, we use self-supervised learning from a set of photographs. Specifically, we use a pre-trained edge detector that maps both color and sketch images into a standardized edge domain, which reduces the gap between photograph-based edge images (during training) and hand-drawn sketch images (during inference). We implement our method by fine-tuning a latent diffusion model (i.e., Stable Diffusion) with sketch and text conditions. Experiments show that the proposed method translates original sketch images that are not extracted from color images into photos with compelling visual quality. △ Less

Submitted 14 February, 2023; originally announced February 2023.

arXiv:2301.04875 [pdf, other]

Color-NeuraCrypt: Privacy-Preserving Color-Image Classification Using Extended Random Neural Networks

Authors: Zheng Qi, AprilPyone MaungMaung, Hitoshi Kiya

Abstract: In recent years, with the development of cloud computing platforms, privacy-preserving methods for deep learning have become an urgent problem. NeuraCrypt is a private random neural network for privacy-preserving that allows data owners to encrypt the medical data before the data uploading, and data owners can train and then test their models in a cloud server with the encrypted data directly. How… ▽ More In recent years, with the development of cloud computing platforms, privacy-preserving methods for deep learning have become an urgent problem. NeuraCrypt is a private random neural network for privacy-preserving that allows data owners to encrypt the medical data before the data uploading, and data owners can train and then test their models in a cloud server with the encrypted data directly. However, we point out that the performance of NeuraCrypt is heavily degraded when using color images. In this paper, we propose a Color-NeuraCrypt to solve this problem. Experiment results show that our proposed Color-NeuraCrypt can achieve a better classification accuracy than the original one and other privacy-preserving methods. △ Less

Submitted 12 January, 2023; originally announced January 2023.

arXiv:2209.14831 [pdf, other]

doi 10.1587/transinf.2022MUP0002

Access Control with Encrypted Feature Maps for Object Detection Models

Authors: Teru Nagamori, Hiroki Ito, AprilPyone MaungMaung, Hitoshi Kiya

Abstract: In this paper, we propose an access control method with a secret key for object detection models for the first time so that unauthorized users without a secret key cannot benefit from the performance of trained models. The method enables us not only to provide a high detection performance to authorized users but to also degrade the performance for unauthorized users. The use of transformed images… ▽ More In this paper, we propose an access control method with a secret key for object detection models for the first time so that unauthorized users without a secret key cannot benefit from the performance of trained models. The method enables us not only to provide a high detection performance to authorized users but to also degrade the performance for unauthorized users. The use of transformed images was proposed for the access control of image classification models, but these images cannot be used for object detection models due to performance degradation. Accordingly, in this paper, selected feature maps are encrypted with a secret key for training and testing models, instead of input images. In an experiment, the protected models allowed authorized users to obtain almost the same performance as that of non-protected models but also with robustness against unauthorized access without a key. △ Less

Submitted 29 September, 2022; originally announced September 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2206.05422

arXiv:2209.07953 [pdf, other]

StyleGAN Encoder-Based Attack for Block Scrambled Face Images

Authors: AprilPyone MaungMaung, Hitoshi Kiya

Abstract: In this paper, we propose an attack method to block scrambled face images, particularly Encryption-then-Compression (EtC) applied images by utilizing the existing powerful StyleGAN encoder and decoder for the first time. Instead of reconstructing identical images as plain ones from encrypted images, we focus on recovering styles that can reveal identifiable information from the encrypted images. T… ▽ More In this paper, we propose an attack method to block scrambled face images, particularly Encryption-then-Compression (EtC) applied images by utilizing the existing powerful StyleGAN encoder and decoder for the first time. Instead of reconstructing identical images as plain ones from encrypted images, we focus on recovering styles that can reveal identifiable information from the encrypted images. The proposed method trains an encoder by using plain and encrypted image pairs with a particular training strategy. While state-of-the-art attack methods cannot recover any perceptual information from EtC images, the proposed method discloses personally identifiable information such as hair color, skin color, eyeglasses, gender, etc. Experiments were carried out on the CelebA dataset, and results show that reconstructed images have some perceptual similarities compared to plain images. △ Less

Submitted 16 September, 2022; originally announced September 2022.

Comments: To appear in APSIPA ASC 2022

arXiv:2208.02556 [pdf, other]

Privacy-Preserving Image Classification Using ConvMixer with Adaptive Permutation Matrix

Authors: Zheng Qi, AprilPyone MaungMaung, Hitoshi Kiya

Abstract: In this paper, we propose a privacy-preserving image classification method using encrypted images under the use of the ConvMixer structure. Block-wise scrambled images, which are robust enough against various attacks, have been used for privacy-preserving image classification tasks, but the combined use of a classification network and an adaptation network is needed to reduce the influence of imag… ▽ More In this paper, we propose a privacy-preserving image classification method using encrypted images under the use of the ConvMixer structure. Block-wise scrambled images, which are robust enough against various attacks, have been used for privacy-preserving image classification tasks, but the combined use of a classification network and an adaptation network is needed to reduce the influence of image encryption. However, images with a large size cannot be applied to the conventional method with an adaptation network because the adaptation network has so many parameters. Accordingly, we propose a novel method, which allows us not only to apply block-wise scrambled images to ConvMixer for both training and testing without the adaptation network, but also to provide a higher classification accuracy than conventional methods. △ Less

Submitted 4 August, 2022; originally announced August 2022.

Comments: arXiv admin note: text overlap with arXiv:2205.12041

arXiv:2206.05422 [pdf, other]

Access Control of Semantic Segmentation Models Using Encrypted Feature Maps

Authors: Hiroki Ito, AprilPyone MaungMaung, Sayaka Shiota, Hitoshi Kiya

Abstract: In this paper, we propose an access control method with a secret key for semantic segmentation models for the first time so that unauthorized users without a secret key cannot benefit from the performance of trained models. The method enables us not only to provide a high segmentation performance to authorized users but to also degrade the performance for unauthorized users. We first point out tha… ▽ More In this paper, we propose an access control method with a secret key for semantic segmentation models for the first time so that unauthorized users without a secret key cannot benefit from the performance of trained models. The method enables us not only to provide a high segmentation performance to authorized users but to also degrade the performance for unauthorized users. We first point out that, for the application of semantic segmentation, conventional access control methods which use encrypted images for classification tasks are not directly applicable due to performance degradation. Accordingly, in this paper, selected feature maps are encrypted with a secret key for training and testing models, instead of input images. In an experiment, the protected models allowed authorized users to obtain almost the same performance as that of non-protected models but also with robustness against unauthorized access without a key. △ Less

Submitted 11 June, 2022; originally announced June 2022.

arXiv:2205.12041 [pdf, other]

Privacy-Preserving Image Classification Using Vision Transformer

Authors: Zheng Qi, AprilPyone MaungMaung, Yuma Kinoshita, Hitoshi Kiya

Abstract: In this paper, we propose a privacy-preserving image classification method that is based on the combined use of encrypted images and the vision transformer (ViT). The proposed method allows us not only to apply images without visual information to ViT models for both training and testing but to also maintain a high classification accuracy. ViT utilizes patch embedding and position embedding for im… ▽ More In this paper, we propose a privacy-preserving image classification method that is based on the combined use of encrypted images and the vision transformer (ViT). The proposed method allows us not only to apply images without visual information to ViT models for both training and testing but to also maintain a high classification accuracy. ViT utilizes patch embedding and position embedding for image patches, so this architecture is shown to reduce the influence of block-wise image transformation. In an experiment, the proposed method for privacy-preserving image classification is demonstrated to outperform state-of-the-art methods in terms of classification accuracy and robustness against various attacks. △ Less

Submitted 24 May, 2022; originally announced May 2022.

arXiv:2204.07707 [pdf, other]

Privacy-Preserving Image Classification Using Isotropic Network

Authors: AprilPyone MaungMaung, Hitoshi Kiya

Abstract: In this paper, we propose a privacy-preserving image classification method that uses encrypted images and an isotropic network such as the vision transformer. The proposed method allows us not only to apply images without visual information to deep neural networks (DNNs) for both training and testing but also to maintain a high classification accuracy. In addition, compressible encrypted images, c… ▽ More In this paper, we propose a privacy-preserving image classification method that uses encrypted images and an isotropic network such as the vision transformer. The proposed method allows us not only to apply images without visual information to deep neural networks (DNNs) for both training and testing but also to maintain a high classification accuracy. In addition, compressible encrypted images, called encryption-then-compression (EtC) images, can be used for both training and testing without any adaptation network. Previously, to classify EtC images, an adaptation network was required before a classification network, so methods with an adaptation network have been only tested on small images. To the best of our knowledge, previous privacy-preserving image classification methods have never considered image compressibility and patch embedding-based isotropic networks. In an experiment, the proposed privacy-preserving image classification was demonstrated to outperform state-of-the-art methods even when EtC images were used in terms of classification accuracy and robustness against various attacks under the use of two isotropic networks: vision transformer and ConvMixer. △ Less

Submitted 15 April, 2022; originally announced April 2022.

arXiv:2201.11006 [pdf, other]

An Overview of Compressible and Learnable Image Transformation with Secret Key and Its Applications

Authors: Hitoshi Kiya, AprilPyone MaungMaung, Yuma Kinoshita, Shoko Imaizumi, Sayaka Shiota

Abstract: This article presents an overview of image transformation with a secret key and its applications. Image transformation with a secret key enables us not only to protect visual information on plain images but also to embed unique features controlled with a key into images. In addition, numerous encryption methods can generate encrypted images that are compressible and learnable for machine learning.… ▽ More This article presents an overview of image transformation with a secret key and its applications. Image transformation with a secret key enables us not only to protect visual information on plain images but also to embed unique features controlled with a key into images. In addition, numerous encryption methods can generate encrypted images that are compressible and learnable for machine learning. Various applications of such transformation have been developed by using these properties. In this paper, we focus on a class of image transformation referred to as learnable image encryption, which is applicable to privacy-preserving machine learning and adversarially robust defense. Detailed descriptions of both transformation algorithms and performances are provided. Moreover, we discuss robustness against various attacks. △ Less

Submitted 15 April, 2022; v1 submitted 26 January, 2022; originally announced January 2022.

arXiv:2111.08927 [pdf, other]

Protection of SVM Model with Secret Key from Unauthorized Access

Authors: Ryota Iijima, AprilPyone MaungMaung, Hitoshi Kiya

Abstract: In this paper, we propose a block-wise image transformation method with a secret key for support vector machine (SVM) models. Models trained by using transformed images offer a poor performance to unauthorized users without a key, while they can offer a high performance to authorized users with a key. The proposed method is demonstrated to be robust enough against unauthorized access even under th… ▽ More In this paper, we propose a block-wise image transformation method with a secret key for support vector machine (SVM) models. Models trained by using transformed images offer a poor performance to unauthorized users without a key, while they can offer a high performance to authorized users with a key. The proposed method is demonstrated to be robust enough against unauthorized access even under the use of kernel functions in a facial recognition experiment. △ Less

Submitted 17 November, 2021; originally announced November 2021.

Comments: To appear in IWAIT 2022

arXiv:2105.14756 [pdf, other]

A Protection Method of Trained CNN Model with Secret Key from Unauthorized Access

Authors: AprilPyone MaungMaung, Hitoshi Kiya

Abstract: In this paper, we propose a novel method for protecting convolutional neural network (CNN) models with a secret key set so that unauthorized users without the correct key set cannot access trained models. The method enables us to protect not only from copyright infringement but also the functionality of a model from unauthorized access without any noticeable overhead. We introduce three block-wise… ▽ More In this paper, we propose a novel method for protecting convolutional neural network (CNN) models with a secret key set so that unauthorized users without the correct key set cannot access trained models. The method enables us to protect not only from copyright infringement but also the functionality of a model from unauthorized access without any noticeable overhead. We introduce three block-wise transformations with a secret key set to generate learnable transformed images: pixel shuffling, negative/positive transformation, and FFX encryption. Protected models are trained by using transformed images. The results of experiments with the CIFAR and ImageNet datasets show that the performance of a protected model was close to that of non-protected models when the key set was correct, while the accuracy severely dropped when an incorrect key set was given. The protected model was also demonstrated to be robust against various attacks. Compared with the state-of-the-art model protection with passports, the proposed method does not have any additional layers in the network, and therefore, there is no overhead during training and inference processes. △ Less

Submitted 31 May, 2021; originally announced May 2021.

Showing 1–17 of 17 results for author: MaungMaung, A