ARGAN: Adversarially Robust Generative Adversarial Networks for Deep Neural Networks Against Adversarial Examples
ABSTRACT An adversarial example, an input instance with small, intentional feature perturbations that mislead machine learning models, is a concrete problem in artificial intelligence safety. Generative adversarial network (GAN)-based defense methods have recently been studied as an emerging way to defend against adversarial examples. However, the performance of the state-of-the-art GAN-based defense methods is limited because the target deep neural network models protected by these methods become robust against adversarial examples but make false decisions for legitimate input data. To solve this accuracy degradation for legitimate input data, we propose a new GAN-based defense method called Adversarially Robust Generative Adversarial Networks (ARGAN). While converting the input data of machine learning models through a two-step transformation architecture, ARGAN trains the generator to reflect the vulnerability of the target deep neural network model to adversarial examples and optimizes the generator parameters with a joint loss function. From experimental results on various datasets collected from diverse applications, we show that the accuracy of ARGAN for legitimate input data remains high while keeping the target deep neural network model robust against adversarial examples, and that ARGAN outperforms the state-of-the-art GAN-based defense methods in accuracy.
INDEX TERMS Adversarial examples, adversarial perturbation, deep neural networks (DNNs), security.
effective defense technology [9], [10]. The existing GAN-based defense methods are grouped into one of two types of architectures according to the design purpose of the generator, i.e., the noise generation architecture and the noise reduction architecture. Note that, given a training dataset, a GAN learns to generate new data with the same statistics as the training dataset using two deep networks, the generator and the discriminator. The generator in the noise generation architecture produces corrupted input data. By training DNN models on both the corrupted input data and the legitimate input data, the noise generation architecture makes DNN models robust against adversarial examples [11], [12]. On the other hand, the generator in the noise reduction architecture produces purified input data whose data distribution is close to the legitimate input data distribution [13], [14]. Thus, the noise reduction architecture reduces the perturbation of adversarial examples before the input data are fed into the target DNN model.

However, the accuracy of such GAN-based defense methods can decrease when predicting or classifying legitimate input data. This is because GAN-based defense methods using the noise generation architecture or the noise reduction architecture work with slightly modified input data, i.e., the corrupted input data in the noise generation architecture and the purified input data in the noise reduction architecture. While the target DNN model is robust against adversarial examples thanks to the modified input data, it can make wrong decisions for legitimate input data and thus generates false positives for legitimate input data. If such GAN-based defense methods, which cause wrong decisions, are used in self-driving systems, bio-medicine systems, or user authentication systems [6]–[8], which are sensitive to small accuracy variations, significant side effects can result.

To solve the accuracy degradation of the existing GAN-based defense methods for legitimate input data while keeping the target DNN model robust against adversarial examples, we propose a new GAN-based defense method called Adversarially Robust GAN (ARGAN). Similar to EEJE [15], the proposed ARGAN architecture follows a two-step transformation of the input data to the target DNN model. That is, in the first transformation step, noise data is added to the input data of the target DNN model, and in the second transformation step, the inverse noise data is added to eliminate the influence of the noise from the first transformation step on the output. However, ARGAN is designed using a black-box transformation method, unlike the white-box transformation method of EEJE. That is, while EEJE requires knowledge of the network architecture, weight values, and other parameters of the target DNN model when generating the noise data in the first transformation step, ARGAN does not require complete knowledge of the model because it uses a pre-trained generator as the transformation method. Specifically, in the first transformation step, the generator produces robust input data against adversarial examples by reflecting the vulnerability of the target DNN model to adversarial examples, and a new joint loss function is presented to optimize the parameter values of the generator. In the second transformation step, the robust input data are transformed into the feeding input data of the target DNN model using the additive inverse of the generator.

Compared to the other state-of-the-art GAN-based defense methods, ARGAN shows good accuracy for both legitimate input data and adversarial examples. From experimental results on various datasets collected from diverse applications, we show that the accuracy of ARGAN for legitimate input data remains good enough while keeping the DNN model robust against adversarial examples. We also show that the accuracy of ARGAN outperforms the accuracy of the state-of-the-art GAN-based defense methods using the noise generation architecture and the noise reduction architecture, e.g., GanDef [12], Defense-GAN [13], and APE-GAN [14].

The rest of the paper is organized as follows. In Section II, we overview the well-known adversarial attacks and the state-of-the-art GAN-based defense methods, and describe the motivation of this paper. In Section III, we describe the threat model, the overall operation, and the details of ARGAN. In Section IV, we verify the effectiveness (accuracy) of ARGAN from various experimental results under different adversarial attacks, different datasets, and so on. Finally, we conclude this paper in Section V.

II. PRELIMINARIES AND RELATED WORKS
In this section, after introducing well-known adversarial attacks, we overview previous GAN-based defense methods. We also describe the motivation for considering a new GAN-based defense method by explaining the limitations of the previous GAN-based defense methods.

A. ADVERSARIAL ATTACKS
In this section, we summarize the characteristics of four adversarial attacks [16]–[19], which are frequently used for performance verification of many defense methods.

Based on a linearization of the cost function, the Fast Gradient Sign Method (FGSM) generates adversarial examples using the sign of the gradient to increase the loss of DNN models [16]. FGSM is a simple and fast adversarial attack, but it often produces sub-optimal perturbations. To resolve this problem of FGSM, Projected Gradient Descent (PGD) generates adversarial examples using several gradient updates for finer optimization [17]. To perform this fine optimization efficiently, PGD performs iterative gradient updates from a randomly selected initial point. To calculate a minimal perturbation, DeepFool uses an iterative linearization of the target DNN model [18]. In each iteration, the adversarial perturbation is updated to reach the decision boundary closest to the input data X. To increase the attack success rate while calculating the minimum perturbation, C&W generates adversarial examples based on various distance metrics such as the L0, L∞, and L2 norms [19]. In this paper, only the L2 type of
the C&W method (CW) is considered as a representative method, because it is the variant most frequently mentioned in other works [20], [21].
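The following minimal PyTorch sketch illustrates the FGSM and PGD attacks summarized above; the classifier `f`, the loss, and the PGD step size `alpha` are placeholders rather than the exact configuration used in this paper.

```python
import torch
import torch.nn.functional as F

def fgsm(f, x, y, eps=0.3):
    """One-step FGSM: perturb x along the sign of the loss gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(f(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def pgd(f, x, y, eps=0.3, alpha=0.03, n_iter=10):
    """PGD: iterative gradient-sign steps from a random start,
    projected back onto the eps-ball around x after every step."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(n_iter):
        x_adv = x_adv.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(f(x_adv), y)
        loss.backward()
        x_adv = x_adv + alpha * x_adv.grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1).detach()
    return x_adv
```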
B. GAN-BASED DEFENSE METHODS

1) GAN-BASED DEFENSE METHODS USING NOISE GENERATION ARCHITECTURE
To make DNN models more robust against adversarial examples, the noise generation architecture uses GANs to train a robust DNN model or to retrain the currently deployed DNN model.

Lee et al. used a GAN framework to make a DNN model more robust against adversarial examples [22]. They trained the generator to produce data corresponding to the weaknesses of the target DNN model, and retrained the DNN model using the data produced by the generator. Liu and Hsieh proposed a training framework, called Rob-GAN, based on the insight that adversarial training and GANs complement each other [23]. In Rob-GAN, the generator and the discriminator are jointly optimized in the presence of adversarial attacks. Sun et al. used a GAN-based data augmentation approach to improve adversarial training [11]. They designed the generator to produce boundary samples for the target DNN model, and used the produced boundary samples for adversarial training. Liu et al. proposed GanDef, a GAN-based adversarial training method that utilizes a discriminator as a regularizer for feature selection [12]. Even though these defense methods provide good robustness against adversarial examples, they incur large costs for training new robust DNN models or retraining the currently deployed DNN models.

2) GAN-BASED DEFENSE METHODS USING NOISE REDUCTION ARCHITECTURE
To make adversarial examples less threatening, the noise reduction architecture reduces the perturbation of adversarial examples before they are fed into the DNN model.

Samangouei et al. proposed a defense method utilizing GANs, called Defense-GAN, to make the generator distribution closer to the data distribution [13]. To find the input data of the generator, they performed gradient descent iteratively. Defense-GAN worked especially well for black-box attacks as well as white-box attacks. Shen et al. proposed APE-GAN, which eliminates the adversarial perturbation of the input data using GANs [14]. They designed a generator loss that combines a content loss and an adversarial loss to make the adversarial examples highly consistent with the legitimate data distribution. APE-GAN works without knowledge of the architecture and parameters of the target DNN model. Santhanam and Grnarova proposed a GAN-based defense method, called cowboy, for the detection and purification of adversarial examples [24]. They used a discriminator to detect the adversarial examples, and a generator to purify the detected adversarial examples. They also showed that adversarial examples lie outside of the data distribution. Su et al. proposed a new training framework for GANs to remove the adversarial perturbation in the face verification field [25]. To improve the performance of the generator, they used two pre-trained classification networks instead of a discriminator. Hwang et al. proposed a Variational Autoencoder (VAE)-based defense method, called PuVAE, to purify adversarial examples [26]. They used dilated Convolutional Neural Networks (CNNs) as a generator to prevent the information loss caused by feature selection. Given a reasonable time constraint, PuVAE showed outstanding performance compared to Defense-GAN.

3) LIMITATION OF PREVIOUS GAN-BASED DEFENSE METHODS
Some artificial intelligence systems, such as a cloud-based user authentication system, can be exposed to adversarial examples. In other words, the adversary can perform an adversarial attack to change the output of the DNN model f(·). When the adversarial attack h(·) and a legitimate input data X are given, the objective function of the adversary can be expressed as:

  min_{h(·)} Pr(l′ = l)
  s.t. l′ = f ◦ h(X),
       l  = f(X).                                   (1)

To defend against the adversarial example h(X) in Equation (1), the artificial intelligence system can apply a defense method. For example, let us consider an artificial intelligence system that applies a GAN-based defense method using the noise reduction architecture as its defense method. To preserve the output of the DNN model f(·), the generator G in the noise reduction architecture transforms the adversarial example into purified input data. When the adversarial attack h(·) and a legitimate input data X are given, the objective function of the defense method can be expressed as:

  max_{G(·)} Pr(l′ = l)
  s.t. l′ = f ◦ G ◦ h(X),
       l  = f ◦ G(X).                               (2)

However, since such a defense method transforms even the legitimate input data X while removing the perturbations of the adversarial examples, it may decrease the accuracy of the artificial intelligence system for the legitimate input data. The decrease in accuracy for the legitimate input data may matter more than the increase in robustness against the adversarial examples in specific applications, such as self-driving cars and user authentication. In other words, even a small decrease in accuracy for the legitimate input data can determine whether a defense method against adversarial examples is applied at all.
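As a small illustration of Equations (1) and (2), the sketch below (hypothetical names, assuming PyTorch tensors) shows how a noise-reduction defense G is placed in front of the classifier f; because G is applied to every input, legitimate inputs are also modified, which is the source of the clean-accuracy drop discussed above.

```python
def predict_with_purifier(f, G, x):
    """Noise-reduction defense: every input, clean or adversarial,
    is purified by G before it reaches the classifier f."""
    return f(G(x)).argmax(dim=1)

# Equation (2): l' = f(G(h(X))) for an attacked input, l = f(G(X)) for a clean one.
# Since G(X) != X in general, even clean inputs may be misclassified after purification.
```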
For a more specific explanation of why the accuracy for legitimate input data decreases, let us consider an artificial intelligence system that applies APE-GAN, which is a representative method of the noise reduction architecture,
as its defense method. Some examples of APE-GAN on the CIFAR-10 test dataset [27] under no adversarial attack are shown in Fig. 1. Here, the legitimate input data represents the randomly selected samples for each class, the feeding input data represents the input data transformed by APE-GAN, and the difference represents the difference between the legitimate input data and the feeding input data. From the ''difference'' row in Fig. 1, it is observed that most pixels of the legitimate input data are transformed by APE-GAN even though no adversarial attack occurs. Note that the transformed pixels corrupt key features that directly affect classification. In Fig. 2, we show the data distribution of the legitimate input data and the feeding input data. It is observed that the data distribution of the feeding input data, transformed by APE-GAN, differs from that of the legitimate input data. Such differences cause the degradation of classification accuracy. These observations motivate us to seek a new GAN-based defense method that provides good accuracy not only for adversarial examples but also for legitimate input data.

FIGURE 1. Ten legitimate images randomly selected from the CIFAR-10 test dataset, where each image is transformed using APE-GAN.

FIGURE 2. Data distribution of legitimate input data and feeding input data to the target DNN model under no adversarial attacks. Here, 1000 test data randomly selected from the CIFAR-10 test dataset were used as input data on the ResNet-20 model, and the term 'Euclidean similarity' represents the average Euclidean similarity between the legitimate input data and the feeding input data.

III. PROPOSED METHOD
In this section, we overview the operation of ARGAN in detail. First, after introducing the targeted threat model, the overall operation and the objective function of ARGAN are described.

A. THREAT MODEL
Let us consider a cloud-based user authentication system for Machine Learning as a Service (MLaaS). By using the trained DNN model, cloud-based user authentication systems such as SMARTLET [6], FACECUBE [7], and MOCHA [8] predict the identity of the transmitted face image. In the legitimate situation without adversarial attacks, a user face image is taken by an IP camera or smartphone at the client. The user face image is transmitted to the cloud-based face recognition server through either an unencrypted or an encrypted session. As a result, the cloud-based face recognition server returns a normal prediction result. However, under the man-in-the-middle (MITM) threat, an adversary secretly alters the communications between two parties who believe that they are directly communicating with each other [28]. Even though the user face image is transmitted through an encrypted session, an adversary can still easily bypass it using state-of-the-art MITM threats based on Renegotiation, Version Rollback, and so on [29], [30]. After the legitimate session between the client and the cloud-based face recognition server is altered through an MITM attack, the adversary modifies the legitimate face image with slight perturbations. Here, it is assumed that the adversary has complete access to the target DNN model in the cloud server. In other words, the adversary can obtain the architecture and parameters of the target DNN model. As a result, the adversary can relay the perturbed face image to the cloud-based face recognition server instead of the legitimate face image. Since the outputs given by neural networks for the legitimate face image and the perturbed face image are largely different, the cloud-based face recognition server returns an abnormal prediction result.

In this threat model, the goal of the defender is to provide robustness against adversarial attacks. It is assumed that the defender can be the administrator of the cloud-based face recognition server. It is also assumed that the defender
has trained the defense model for their system and has pre-deployed it to the IP camera or smartphone at the client.

B. OVERALL OPERATION OF ARGAN
Different from the previous GAN-based defense methods, ARGAN performs a two-step input transformation before feeding the input data into the DNN model. The first transformation is performed on the client side and the second transformation is performed on the cloud server side. The overall defense operation proceeds as follows:
1) The client performs the first transformation, which transforms the legitimate input data X into the robust input data X_r using the generator G, before transmitting the data to the cloud server.
2) The cloud server performs the second transformation, which restores the key features corrupted by the generator G in the first transformation, before feeding the data to the DNN model.
When the adversarial attack h(·) and a legitimate input data X are given, the objective function of ARGAN can be expressed as:

  min_{X_f} max_{G(·)}  Pr(l′ = l) + ‖X_f − X‖₂
  s.t. l′ = f ◦ G⁻¹ ◦ h ◦ G(X),
       l  = f ◦ G⁻¹ ◦ G(X) = f(X),                  (3)

where G(·) is the first transformation performed by the generator on the client side, G⁻¹(·) is the second transformation performed on the server side, and ‖X_f − X‖₂ is the difference between X_f and the legitimate input data X. Here, X_f represents the feeding input data of the DNN model and is defined as follows, depending on whether the adversarial attack h(·) occurs:

  X_f = { G⁻¹ ◦ h ◦ G(X),   if h(·) occurs;
          G⁻¹ ◦ G(X) = X,   otherwise.              (4)

Note that, different from the previous GAN-based defense methods, ARGAN shows good accuracy for legitimate input data. As shown in Equations (3) and (4), if the adversarial attack h(·) does not occur, the DNN model in ARGAN returns the same result as a DNN model with no defense method. This is because the second transformation, G⁻¹(·), is the additive inverse of the first transformation G(·) for a legitimate input data X.
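A minimal sketch of the two transformation steps implied by Equations (3) and (4) is given below; the function names are hypothetical, the generator G and the classifier f are assumed to be given PyTorch modules, and the form of the transmitted perturbation is inferred from Equations (3), (4), and (9) rather than taken from the authors' implementation.

```python
import torch

def client_transform(G, x):
    """First transformation (client): produce the robust input Xr = G(x) and the
    additive-inverse perturbation Pr = -(G(x) - x) needed to undo it later."""
    with torch.no_grad():
        x_r = G(x)
    p_r = -(x_r - x)          # additive inverse of the first transformation
    return x_r, p_r           # both are sent to the cloud server

def server_transform(f, x_received, p_r):
    """Second transformation (server): add Pr back, then classify.
    If no attack occurred, x_received == G(x) and Xf == x exactly (Equation (4))."""
    x_f = x_received + p_r
    return f(x_f).argmax(dim=1)
```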
C. GAN ARCHITECTURE OF ARGAN
The goal of ARGAN is to provide good accuracy for legitimate input data while keeping the DNN model robust against adversarial examples. To achieve this, the generator of ARGAN is designed to reflect the vulnerability of the target DNN model. That is, the generator of ARGAN learns the distribution of the robust input data X_r and produces the robust input data X_r for a given input X. Here, the robust input data X_r is defined as follows:

Definition 1 (Robust Input Data): Let X be a given input data and f be a target DNN model. Let H be a set of adversarial attacks, say H = {FGSM, PGD, DeepFool, C&W}. X_r is said to be robust input data if it has the following characteristic:

  arg max f ◦ p⁻¹ ◦ h(X_r) = arg max f(X)  for ∀h ∈ H
  s.t. X_r = p(X),                                   (5)

where p(·) is a conversion function that adds random noise to the input data X, and p⁻¹(·) is the inverse function of p(·), which restores the input data X transformed by p(·). Note that it is difficult for random noise to satisfy Definition 1 for all of H. Therefore, we obtained X_r through several experiments using various sizes and types of noise.
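The following sketch illustrates one way such an empirical search for X_r could look; the noise scales, trial counts, and the interface of the attack set H are assumptions, not the authors' procedure.

```python
import torch

def make_robust_input(f, x, attacks, noise_scales=(0.05, 0.1, 0.2), n_trials=20):
    """Search for additive random noise p(x) = x + n such that, for every attack h in H,
    f(p^-1(h(p(x)))) still predicts the same label as f(x) (Definition 1).
    `attacks` is assumed to be a list of callables mapping an input batch to its
    adversarial version."""
    y_ref = f(x).argmax(dim=1)
    for scale in noise_scales:
        for _ in range(n_trials):
            n = scale * torch.randn_like(x)   # p(x) = x + n
            x_r = x + n
            ok = all(
                torch.equal(f(h(x_r) - n).argmax(dim=1), y_ref)  # p^-1 removes n again
                for h in attacks
            )
            if ok:
                return x_r, n
    raise RuntimeError("no noise satisfying Definition 1 was found")
```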
On the other hand, the discriminator of ARGAN is trained to distinguish the transformed input data G(X) produced by the generator from the robust input data X_r. In other words, the discriminator encourages the transformed input data G(X) to better reflect the features of the robust input data X_r. Thus, the generator and discriminator of ARGAN are defined to solve the following adversarial zero-sum problem:

  min_{θ_G} max_{θ_D} [ E_{X_r∼p_data(X_r)} log D_{θ_D}(X_r) − E_{X∼p_data(X)} log(D_{θ_D}(G_{θ_G}(X))) ],   (6)

where θ_G and θ_D represent the parameters of the generator G and the discriminator D, respectively.

In Equation (6), the parameter θ_G can be obtained by optimizing the generator loss function. Here, the generator loss function l_G is defined as a joint loss consisting of an adversarial loss, a distance loss, and a target loss, as shown in Fig. 3.

The adversarial loss l_adv measures the discriminator error for the transformed input data G(X) and is calculated as follows:

  l_adv = 1 − log(D(G(X))),                           (7)

which encourages the generator distribution to match the robust input data distribution over the N training data. The distance loss l_dist measures the difference between the transformed input data G(X) and the robust input data X_r and is calculated as follows:

  l_dist = d(G(X), X_r),                              (8)

where d(·) is a distance function that measures the similarity between the transformed input data G(X) and the robust input data X_r. The L2 distance is used so that G(X) matches X_r even in the key features. That is, the distance loss l_dist encourages the internal representation of the transformed input data G(X) to match that of the robust input data X_r.

The target loss l_target measures the error of the target DNN model and is calculated as follows:

  l_target = log(f(r(G(X)) − (G(X) − X), y)),          (9)

where f(·) is the target DNN model, y is the class of the legitimate input data X, and r(·) is a random noise addition function.
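A compact sketch of the joint generator loss in Equations (7)–(9) is shown below, assuming PyTorch modules for G, D, and f, with D producing a probability; the loss weights, the noise scale of r(·), and the exact realization of the target term (here, cross-entropy on the simulated feeding input) are assumptions.

```python
import torch
import torch.nn.functional as F

def generator_loss(G, D, f, x, x_r, y, weights=(1.0, 1.0, 1.0)):
    """Joint generator loss: adversarial + distance + target terms (Eqs. (7)-(9))."""
    g_x = G(x)
    l_adv = 1.0 - torch.log(D(g_x) + 1e-8).mean()            # Eq. (7), D(.) in (0, 1)
    l_dist = F.mse_loss(g_x, x_r)                             # Eq. (8), (squared) L2 distance
    x_f = (g_x + 0.05 * torch.randn_like(g_x)) - (g_x - x)    # r(G(x)) - (G(x) - x)
    l_target = F.cross_entropy(f(x_f), y)                     # Eq. (9), target-model error
    w_adv, w_dist, w_tgt = weights
    return w_adv * l_adv + w_dist * l_dist + w_tgt * l_target
```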
The operational details of the cloud server are shown in Algorithm 2. After decrypting (or not) the input data X_t transmitted from the client (Lines 1 to 4), the cloud server performs the second transformation using the perturbation P_r (Line 5). P_r is the additive inverse of the first transformation and is used to restore the key features. Here, the cloud server does not know whether an adversarial attack has occurred. If the adversarial attack does not occur, the legitimate input data X is restored due to the inverse relationship between the first transformation and the second transformation. Even if the adversarial attack occurs, most of the key features are restored by the second transformation. This is because the key features which affect classification are already corrupted by the generator, and this causes the magnitude of the perturbation added by the adversary to the robust input data X_r to be smaller than the magnitude of the perturbation that would be added to the legitimate input data X. After analyzing the feeding input data X_f using the DNN model (Line 6), the cloud server transmits the encrypted analysis result R_t to the client (Lines 7 to 11).
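The server-side procedure described above can be condensed into the following sketch; it is not the paper's Algorithm 2, and the request object and the encryption helpers are placeholders.

```python
def cloud_server_handle(request, f, decrypt, encrypt):
    """Server side: optionally decrypt, apply the second transformation with Pr,
    classify the feeding input Xf, and return the encrypted analysis result."""
    x_t, p_r = decrypt(request) if request.encrypted else (request.x_t, request.p_r)
    x_f = x_t + p_r                      # second transformation: additive inverse of G
    label = f(x_f).argmax(dim=1)         # analyze the feeding input with the target DNN
    return encrypt(label)                # encrypted analysis result Rt sent back
```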
F. OPERATIONAL EXAMPLE
In this section, we show some examples of how ARGAN works using the CIFAR-10 test dataset.

In Fig. 4a, we show an example of ARGAN for a truck sample under no adversarial attack. Before transmitting to the cloud server, the client transforms the truck sample and calculates the perturbation. Then, the client transmits the transformed truck sample and the calculated perturbation to the cloud server. The cloud server restores the truck sample in the second transformation and feeds it to the DNN model to produce a prediction result. Here, the DNN model produces the correct prediction result because the truck sample is completely restored by the second transformation. In Fig. 4b, we also show some examples of ARGAN on the CIFAR-10 test dataset under no adversarial attack. Here, the legitimate input data represents the randomly selected samples for each class, and the feeding input data represents the input data restored by ARGAN. From the 'difference' row in Fig. 4b, it is observed that there is no difference between the legitimate input data and the feeding input data because each legitimate input data is completely restored by the second transformation.

In Fig. 5a, we show an example of ARGAN for a truck sample under the DeepFool adversarial attack. The operation of the client is the same as in Fig. 4a. However, in this example, the adversary generates an adversarial example for the truck sample using DeepFool and transmits it to the cloud server. Here, the magnitude of the perturbation added by the adversary is very small because the key features which affect prediction are already corrupted when the client performs the first transformation. Then, the cloud server performs the second transformation on the adversarial example and feeds it to the DNN model. Even though the cloud server performs the second transformation on the adversarial example, the DNN model produces the correct prediction result. This is because most of the key features are restored by the second transformation. In Fig. 5b, we also show some examples of ARGAN on the CIFAR-10 test dataset under the DeepFool adversarial attack. As shown in the 'difference' row in Fig. 5b, it is also observed that there is still no large difference between the legitimate input data and the feeding input data because the legitimate input data are restored by the second transformation.
IV. EVALUATION RESULTS
To show the effectiveness of ARGAN, we measured the performance of ARGAN under various conditions, including various datasets and different adversarial attacks [16]–[19]. Specifically, the performance of ARGAN is evaluated by answering the following questions:
• Does ARGAN show better performance than the other state-of-the-art defense methods?
• Does ARGAN show good performance on various datasets collected from diverse applications?
• How do different combinations of loss functions influence the performance of ARGAN?
• Does ARGAN satisfy the basic sanity tests?
• Does ARGAN show good performance under an adaptive attack scenario?

A. EXPERIMENTAL ENVIRONMENT
When evaluating the performance of ARGAN, the experiments are performed using the CIFAR-10 color image dataset [27]. The CIFAR-10 dataset consists of 50,000 training images and 10,000 testing images corresponding to 10 classes. We used 1000 data instances randomly selected from the CIFAR-10 test dataset.

When measuring the influence of different adversarial attacks on ARGAN and the other defense methods, the parameter values are set as follows: (1) 0.3 for the magnitude of perturbation (ε) in FGSM; (2) 10 and 0.3 for the number of iterations (N) and ε, respectively, in PGD; (3) 50 and 0.02 for the maximum number of iterations and the overshoot that prevents updates from vanishing, respectively, in DeepFool; and (4) 0 for the parameter that controls the confidence value (κ) in the C&W method. These parameter values follow the recommended configurations from the cleverhans library [32] and some representative works [20], [33].

The classification models are implemented using TensorFlow-gpu version 1.15.1 and Python version 2.7.15, and the adversarial attacks are performed using the cleverhans software library, which provides standardized reference implementations of adversarial examples [32]. For efficient experiments, the performance is measured on an Ubuntu 18.04.1 LTS machine with kernel version 4.15.0-36-generic, a 2.40 GHz CPU (Intel Xeon CPU E5-2630 v3), a GeForce GTX 2080 Ti, and 32 GB memory.
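For reference, the attack hyper-parameters listed above can be collected into a single configuration, for example as below; the dictionary keys are illustrative and do not follow the cleverhans API.

```python
# Attack settings used in the experiments (Section IV-A); keys are illustrative.
ATTACK_PARAMS = {
    "FGSM":     {"eps": 0.3},
    "PGD":      {"eps": 0.3, "n_iter": 10},
    "DeepFool": {"max_iter": 50, "overshoot": 0.02},
    "CW_L2":    {"confidence": 0.0},   # kappa = 0
}
```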
TABLE 1. Comparison results with GanDef and APE-GAN using the CIFAR-10 dataset.
FIGURE 6. Data distribution of legitimate input data and feeding input data under various adversarial attacks.
TABLE 2. Comparison results with GanDef and APE-GAN using the MNIST dataset.
TABLE 3. Comparison results with Defense-GAN and APE-GAN using the CelebA dataset.

A similarity of 1 means that the legitimate input data and the feeding input data are perfectly the same; in other words, it means that ARGAN can restore the legitimate input data perfectly. It is also observed that ARGAN shows better similarity of the data distribution than APE-GAN for most adversarial attacks. Especially, as shown in Fig. 6(d) and Fig. 6(e), the data distribution of the feeding input data in ARGAN under the state-of-the-art adversarial example generation methods, such as the DeepFool and C&W methods, is almost identical to that of the legitimate input data.
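A sketch of how such a similarity score could be computed is given below; the paper does not spell out the formula in this excerpt, so the common definition 1/(1 + Euclidean distance), which equals 1 for identical inputs, is assumed here.

```python
import numpy as np

def avg_euclidean_similarity(x_legit, x_feed):
    """Average Euclidean similarity between legitimate and feeding inputs.
    Per sample: similarity = 1 / (1 + ||x - x'||_2), so identical inputs give 1."""
    d = np.linalg.norm((x_legit - x_feed).reshape(len(x_legit), -1), axis=1)
    return float(np.mean(1.0 / (1.0 + d)))
```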
Result 1: ARGAN showed better outputs for legitimate input data as well as adversarial examples than the state-of-the-art defense methods. These observations imply that ARGAN can be used as a stand-alone defense method against various adversarial attacks.
TABLE 4. Experimental results with the combinations of various loss functions on ARGAN using CIFAR-10 dataset.
TABLE 5. Classification accuracy of ARGAN under the various denoising techniques of the adversary.
perturbation was greater than 0.2. This means that ARGAN reached the level of random guessing on the CIFAR-10 dataset, which consists of 10 classes.

Result 4: ARGAN satisfies the basic sanity test while providing better robustness than the non-defense model.

5) DOES ARGAN SHOW GOOD PERFORMANCE UNDER AN ADAPTIVE ATTACK SCENARIO?
To evaluate the performance of ARGAN under the worst-case adversary, the performance of ARGAN is measured on ResNet-20 for 1000 data instances randomly selected from the CIFAR-10 test dataset under the scenario where the adversary is aware of the ARGAN architecture. Note that an adversary may assume that the input data has been processed at the client, but does not know the details of how the input data is processed. In other words, the adversary has no direct access to the first transformation. Instead, the adversary can mitigate the effectiveness of the first transformation by applying denoising techniques. On the other hand, an adversary can control the second transformation only when he or she has complete access to the server. Thus, the classification accuracy is measured under two attack scenarios: (1) a white-box attack and (2) a gray-box attack. In the white-box attack, the adversary has complete access to the server and can bypass the second transformation. In the gray-box attack, the adversary only knows the parameter values used for training the deep neural networks and cannot bypass the second transformation. When measuring the classification accuracy of ARGAN under the two attack scenarios, five denoising techniques are used to mitigate the first transformation.
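As an example of such a denoising mitigation, the sketch below implements bit-depth reduction (a standard feature-squeezing filter [33]); the exact set of five techniques used in Table 5 is not reproduced here.

```python
import numpy as np

def reduce_bit_depth(x, bits=5):
    """Quantize inputs in [0, 1] to the given bit depth (e.g. 5 bits -> 32 levels),
    one of the denoising filters an adaptive adversary could apply against the
    client-side transformation."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels
```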
As shown in Table 5, the classification accuracy of ARGAN decreased under both the white-box and gray-box attacks. For example, while the stand-alone ARGAN showed a classification accuracy of 40.65%, ARGAN under the white-box attack and the gray-box attack showed classification accuracies of 13.14% and 28.48% on average, respectively. For the gray-box attack, it is observed that the key features in the input data are restored by the second transformation even though the first transformation is mitigated by the adversary. For example, even when the adversary mitigates the first transformation with a 5-bit bit-depth filter, ARGAN showed a classification accuracy of as much as 83.9% against DeepFool. For the white-box attack, the classification accuracy of ARGAN significantly decreased, to 13.14%, but it is still higher than that of the non-defense architecture. Also, ARGAN against DeepFool under the white-box attack showed lower accuracy than ARGAN against DeepFool under the gray-box attack. For example, while ARGAN against DeepFool under the gray-box attack showed a classification accuracy of 71.12% on average, ARGAN against DeepFool under the white-box attack showed a classification accuracy of 16.46% on average.

Result 6: ARGAN works effectively even under the adaptive attack scenario. In particular, ARGAN shows good enough performance under the gray-box attack.

V. CONCLUSION
With the evolution of deep learning technology, the adversarial example has been highlighted as one of the most severe problems of deep learning technology. In particular, adversarial examples can cause severe damage in cloud-based deep learning environments where an MITM attack can occur. To defend against such adversarial examples, two types of GAN-based defense methods have been actively studied as emerging defense methods: (1) the noise generation architecture and (2) the noise reduction architecture. However, the accuracy of such GAN-based defense methods decreases when predicting or classifying legitimate input data.

In this paper, we propose ARGAN, a new GAN-based defense method that provides good outputs for adversarial examples as well as legitimate input data. The generator in ARGAN produces robust input data by reflecting the
vulnerability of the target DNN model while being trained jointly with the discriminator. From evaluation results under various experimental conditions, it is observed that ARGAN provides robustness to target DNN models against various state-of-the-art adversarial attacks while maintaining high accuracy even for legitimate input data. It is also observed that ARGAN shows better performance than the state-of-the-art GAN-based defense methods such as GanDef, Defense-GAN, and APE-GAN. From such results, we believe that ARGAN demonstrates the need for further studies on GAN-based defense architectures.

REFERENCES
[1] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, "Intriguing properties of neural networks," in Proc. Int. Conf. Learn. Representations, 2014.
[2] T. Ray. (2020). Deep Learning Godfathers Bengio, Hinton, and LeCun Say the Field Can Fix its Flaws. ZDNet. [Online]. Available: https://www.zdnet.com/article/deep-learning-godfathers-bengio-hinton-and-lecun-say-the-field-can-fix-its-flaws/
[3] M. Bojarski, D. D. Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Müller, J. Zhang, X. Zhang, J. Zhao, and K. Zieba, "End to end learning for self-driving cars," 2016, arXiv:1604.07316.
[4] F. Amato, A. López, E. M. Peña-Méndez, P. Vanhara, A. Hampl, and J. Havel, "Artificial neural networks in medical diagnosis," J. Appl. Biomed., vol. 11, no. 2, pp. 47–58, 2013. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1214021X14600570
[5] M. Shirvanian and N. Saxena, "Stethoscope: Crypto phones with transparent & robust fingerprint comparisons using inter text-speech transformations," in Proc. 17th Int. Conf. Privacy, Secur. Trust (PST), Aug. 2019, pp. 1–10.
[6] M. F. R. M. Billah and M. A. Adnan, "SMARTLET: A dynamic architecture for real time face recognition in smartphone using cloudlets and cloud," Big Data Res., vol. 17, pp. 45–55, Sep. 2019.
[7] G. Ofualagba, O. Osas, I. Orobor, I. Oseikhuemen, and O. Etse, "Automated student attendance management system using face recognition," Int. J. Educ. Res. Inf. Sci., vol. 5, pp. 31–37, Sep. 2018.
[8] T. Soyata, R. Muraleedharan, C. Funai, M. Kwon, and W. Heinzelman, "Cloud-vision: Real-time face recognition using a mobile-cloudlet-cloud acceleration architecture," in Proc. IEEE Symp. Comput. Commun. (ISCC), Jul. 2012, pp. 59–66. [Online]. Available: http://dblp.uni-trier.de/db/conf/iscc/iscc2012.html#SoyataMFKH12
[9] I. Rosenberg, A. Shabtai, Y. Elovici, and L. Rokach, "Defense methods against adversarial examples for recurrent neural networks," 2019, arXiv:1901.09963.
[10] W. Hu and Y. Tan, "Generating adversarial malware examples for black-box attacks based on GAN," 2017, arXiv:1702.05983.
[11] K. Sun, Z. Zhu, and Z. Lin, "Enhancing the robustness of deep neural networks by boundary conditional GAN," 2019, arXiv:1902.11029.
[12] G. Liu, I. Khalil, and A. Khreishah, "GanDef: A GAN based adversarial training defense for neural network classifier," 2019, arXiv:1903.02585.
[13] P. Samangouei, M. Kabkab, and R. Chellappa, "Defense-GAN: Protecting classifiers against adversarial attacks using generative models," 2018, arXiv:1805.06605.
[14] S. Shen, G. Jin, K. Gao, and Y. Zhang, "APE-GAN: Adversarial perturbation elimination with GAN," 2017, arXiv:1707.05474.
[15] S.-H. Choi, J. Shin, P. Liu, and Y.-H. Choi, "EEJE: Two-step input transformation for robust DNN against adversarial examples," IEEE Trans. Netw. Sci. Eng., vol. 8, no. 2, pp. 908–920, Apr. 2021.
[16] I. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," in Proc. Int. Conf. Learn. Represent., 2015, pp. 1–11.
[17] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, "Towards deep learning models resistant to adversarial attacks," 2017, arXiv:1706.06083.
[18] S.-M. Moosavi-Dezfooli, A. Fawzi, and P. Frossard, "DeepFool: A simple and accurate method to fool deep neural networks," 2015, arXiv:1511.04599.
[19] N. Carlini and D. Wagner, "Towards evaluating the robustness of neural networks," 2016, arXiv:1608.04644.
[20] C. Guo, M. Rana, M. Cisse, and L. van der Maaten, "Countering adversarial images using input transformations," 2017, arXiv:1711.00117.
[21] A. Prakash, N. Moran, S. Garber, A. DiLillo, and J. Storer, "Deflecting adversarial attacks with pixel deflection," 2018, arXiv:1801.08926.
[22] H. Lee, S. Han, and J. Lee, "Generative adversarial trainer: Defense to adversarial perturbations with GAN," 2017, arXiv:1705.03387.
[23] X. Liu and C.-J. Hsieh, "Rob-GAN: Generator, discriminator, and adversarial attacker," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2019, pp. 11234–11243.
[24] G. K. Santhanam and P. Grnarova, "Defending against adversarial attacks by leveraging an entire GAN," 2018, arXiv:1805.10652.
[25] Y. Su, G. Sun, W. Fan, X. Lu, and Z. Liu, "Cleaning adversarial perturbations via residual generative network for face verification," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), May 2019, pp. 2597–2601.
[26] U. Hwang, J. Park, H. Jang, S. Yoon, and N. Ik Cho, "PuVAE: A variational autoencoder to purify adversarial examples," 2019, arXiv:1903.00585.
[27] A. Krizhevsky, V. Nair, and G. Hinton. (2009). CIFAR-10 (Canadian Institute for Advanced Research). [Online]. Available: http://www.cs.toronto.edu/~kriz/cifar.html
[28] Z. Whittaker. (2017). Dozens of Popular iPhone Apps Vulnerable to Man-in-the-Middle Attacks. ZDNet. [Online]. Available: https://www.zdnet.com/article/dozens-of-popular-iphone-apps-vulnerable-to-man-in-the-middle-attacks/
[29] F. Giesen, F. Kohlar, and D. Stebila, "On the security of TLS renegotiation," in Proc. ACM SIGSAC Conf. Comput. Commun. Secur. (CCS), 2013, pp. 387–398.
[30] B. Möller, T. Duong, and K. Kotowicz, "This POODLE bites: Exploiting the SSL 3.0 fallback," Google, Mountain View, CA, USA, Tech. Rep., Sep. 2014. Accessed: Mar. 22, 2022. [Online]. Available: https://www.openssl.org/~bodo/ssl-poodle.pdf
[31] G. Liu, I. Khalil, and A. Khreishah, "ZK-GanDef: A GAN based zero knowledge adversarial training defense for neural networks," in Proc. 49th Annu. IEEE/IFIP Int. Conf. Dependable Syst. Netw. (DSN), Jun. 2019, pp. 64–75, doi: 10.1109/DSN.2019.00021.
[32] N. Papernot, I. Goodfellow, R. Sheatsley, R. Feinman, and P. McDaniel, "Technical report on the CleverHans v2.1.0 adversarial examples library," 2016, arXiv:1610.00768.
[33] W. Xu, D. Evans, and Y. Qi, "Feature squeezing: Detecting adversarial examples in deep neural networks," 2017, arXiv:1704.01155.
[34] Y. LeCun and C. Cortes. (2010). MNIST Handwritten Digit Database. [Online]. Available: http://yann.lecun.com/exdb/mnist/
[35] Z. Liu, P. Luo, X. Wang, and X. Tang, "Deep learning face attributes in the wild," in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Dec. 2015, pp. 3730–3738.
[36] N. Carlini, A. Athalye, N. Papernot, W. Brendel, J. Rauber, D. Tsipras, I. Goodfellow, A. Madry, and A. Kurakin, "On evaluating adversarial robustness," 2019, arXiv:1902.06705.

SEOK-HWAN CHOI received the B.E. degree from Pusan National University, Busan, South Korea, in 2016, where he is currently pursuing the Ph.D. degree in computer science and engineering. His research interests include security for artificial intelligence, adversarial examples, and intrusion detection.

JIN-MYEONG SHIN received the B.E. degree from Pusan National University, Busan, South Korea, in 2017, where he is currently pursuing the Ph.D. degree in computer science and engineering. His research interests include security for artificial intelligence, homomorphic encryption, and intrusion detection.

served on more than 100 program committees and reviewed papers for numerous journals. He received the DOE Early Career Principle Investigator Award. He has co-led the effort to make Penn State an NSA-Certified National Center of Excellence in Information Assurance Education and Research. He has advised or co-advised more than 30 Ph.D. dissertations to completion.