Hierarchical Adversarial Attacks Against Graph-Neural-Network-Based IoT Network Intrusion Detection System
Abstract: The advancement of Internet of Things (IoT) technologies leads to a wide penetration and large-scale deployment of IoT systems across an entire city or even country. While IoT systems are capable of providing intelligent services, the large amount of data collected and processed in IoT systems also raises serious security concerns. Many research efforts have been devoted to designing intelligent network intrusion detection systems (NIDS) to prevent misuse of IoT data across smart applications. However, existing approaches may suffer from limited and imbalanced attack data when training the detection model, which makes the system vulnerable, especially to unknown types of attacks. In this study, a novel hierarchical adversarial attack (HAA) generation method is introduced to realize a level-aware black-box adversarial attack strategy, targeting graph neural network (GNN)-based intrusion detection in IoT systems with a limited budget. By constructing a shadow GNN model, an intelligent mechanism based on a saliency map technique is designed to generate adversarial examples by effectively identifying and modifying the critical feature elements with minimal perturbations. A hierarchical node selection algorithm based on random walk with restart (RWR) is developed to select a set of more vulnerable nodes with high attack priority, considering their structural features and overall loss changes within the targeted IoT network. The proposed HAA generation method is evaluated using the open-source data set UNSW-SOSR2019 with three baseline methods. Comparison results demonstrate its ability to degrade the classification precision by more than 30% in two state-of-the-art GNN models, GCN and JK-Net, for NIDS in IoT environments.

Index Terms: Adversarial attack, deep learning, graph neural network (GNN), Internet of Things (IoT), network intrusion detection.

Manuscript received April 2, 2021; revised June 1, 2021, July 15, 2021, and September 20, 2021; accepted November 2, 2021. Date of publication November 24, 2021; date of current version June 7, 2022. This work was supported in part by the National Key Research and Development Program of China under Grant 2017YFE0117500 and Grant 2019YFE0190500; in part by the National Natural Science Foundation of China under Grant 62072171; in part by the Natural Science Foundation of Hunan Province of China under Grant 2020SK2089; and in part by the Open Fund of Key Laboratory of Hunan Province under Grant 2017TP1026. (Corresponding author: Wei Liang.)
Xiaokang Zhou and Shohei Shimizu are with the Faculty of Data Science, Shiga University, Hikone 522-8522, Japan, and also with the RIKEN Center for Advanced Intelligence Project, RIKEN, Tokyo 103-0027, Japan (e-mail: [email protected]; [email protected]).
Wei Liang is with the Base of International Science and Technology Innovation and Cooperation on Big Data Technology and Management, Hunan University of Technology and Business, Changsha 410205, China (e-mail: [email protected]).
Weimin Li is with the School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China (e-mail: [email protected]).
Ke Yan is with the Department of the Built Environment, College of Design and Engineering, National University of Singapore, Singapore 117566 (e-mail: [email protected]).
Kevin I-Kai Wang is with the Department of Electrical, Computer, and Software Engineering, The University of Auckland, Auckland 1010, New Zealand (e-mail: [email protected]).
Digital Object Identifier 10.1109/JIOT.2021.3130434
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

I. INTRODUCTION

Internet of Things (IoT) technologies and systems are proliferating at an unprecedented rate. The scale of modern IoT systems goes far beyond the individual level, with interconnected IoT devices that are widely spread across entire cities or even countries. Supported by increasing communication speed and bandwidth, IoT devices are capable of collecting, transmitting, and processing an enormous amount of data [1], [2]. These IoT systems, associated with the collected data, offer great opportunities in designing and providing intelligent services in different applications, such as intelligent transportation, automated surveillance, and smart cyber–physical systems [3], [4]. However, the collected IoT data also contain sensitive information and therefore require more attention to privacy protection and reliable data security.

To deal with such increasing privacy and security concerns, modern IoT or distributed systems need to be able to detect and prevent network intrusions in a more intelligent way. Many research efforts have been devoted to developing machine learning or deep learning-based approaches for network intrusion detection systems (NIDS), in order to prevent any deviation or misuse in IoT systems and infrastructures [5]–[7]. Although NIDS has been well exploited in detecting malicious network activities, one of the main vulnerabilities of existing NIDS is the lack of ability to detect unknown types of network intrusion, due to the limited or imbalanced intrusion data during the model training process [8], [9]. In addition, existing machine learning approaches are not able to handle multidomain intrusion detection, which calls for further exploration of hybrid deep learning architectures [6], [10], [11].

As a typical type of neural network in deep learning models, the graph neural network (GNN) has demonstrated promising performance in dealing with graph or network data [12]. However, it still suffers when facing limited or imbalanced training data, and can also be vulnerable to adversarial attacks. In recent years, adversarial attacks or examples have proved to be a significant tool in analyzing deep neural networks in terms of their theoretical property and practical
ZHOU et al.: HIERARCHICAL ADVERSARIAL ATTACKS AGAINST GNN-BASED IoT NIDS 9311
performance [13]. They can affect deep graph learning algorithms with small and imperceptible perturbations and lead to inaccurate classifications or wrong decisions [14]. Therefore, further investigations are necessary in GNN-based NIDS. In general, adversarial attacks can be classified into three types, namely, white-box attacks, gray-box attacks, and black-box attacks, according to how much the attacker knows about the learning model. In white-box attacks, the entire model structure, parameters, input data, and labels are completely exposed to the attacker. In gray-box attacks, the attacker has partial information about the training model. In black-box attacks, the attacker knows almost nothing about the network except the input data, which offers a more realistic representation of real threat scenarios.

In this study, a new hierarchical adversarial attack (HAA) generation method is proposed, which can be used to examine the robustness and generality of an NIDS designed for typical IoT applications. Considering the black-box attack scenario, a shadow GNN model is constructed with the intercepted network packets and extracted input data features to imitate the original model. The saliency map technique is used to find the critical elements in the feature vector, following which adversarial examples can be generated to flip the classification labels with minimal modifications to those identified critical elements. In addition, a random walk with restart (RWR)-based algorithm is developed to select a set of nodes with high attack priority based on structural features and overall loss changes within the targeted IoT network. The main contributions of this article can be summarized as follows.
  1) An integrated framework for the level-aware black-box adversarial attack strategy is designed and constructed to compromise the GNN-based NIDS in typical IoT environments with a limited budget.
  2) An intelligent adversarial example generation mechanism is developed based on a constructed shadow GNN model, which can effectively modify the critical feature elements identified using saliency mapping with minimal perturbations.
  3) An RWR-based hierarchical node selection algorithm, which considers both the link analysis and loss change in initializing and updating the transfer matrix, is designed to efficiently identify and select a set of more vulnerable nodes to attack the GNN model.

The remainder of this article is organized as follows. Section II presents the summary of related works on GNN-based modeling and adversarial attacks against GNN in modern IoT systems. Section III introduces the overall application scenario and problem formulations, followed by the proposed HAA generation mechanism explained in Section IV. Section V presents and discusses the evaluations using the open-source data set, and Section VI concludes this study and gives a promising perspective on future research.

A. GNN-Based Network Modeling With IoT

With the rapid evolution of deep learning techniques in various smart applications for classification or prediction tasks, GNN has become an emerging learning paradigm when dealing with interdependent data with complex relationships in network modeling [12]. Several studies have explored the use of GNN in big data mining, machine learning, and IoT applications. Zhou et al. [15] introduced a so-called reinforced spatial-temporal attention GNN model for traffic prediction, which utilized the diffusion convolution neural network and a temporal attention mechanism to analyze spatial dependencies and temporal dynamics from traffic sensor networks. Zhang et al. [16] applied GNN in the modeling of IoT equipment. They reconstructed the input data using a variational autoencoder to analyze the temporal and inner logic relations of data. Rusek et al. [17] employed GNN to model the graph-structured information and designed a message-passing function to extract complex relationships from network topologies and routing configurations based on the generalized linear models, which could be applied for routing optimization and network planning. Guo and Wang [18] built a recommendation framework based on deep GNN for future IoT. They modeled feature spaces into two graph networks and used matrix factorization to improve the missing rating values in a user-item rating matrix. Cui et al. [19] presented a deep learning framework, in which the traffic network was modeled by a graph convolutional long short-term memory neural network. They designed a graph convolution operator to learn the spatial and temporal dependency and defined two regularization terms to optimize loss functions in model training.

To identify graph patterns in directed role-based conceptual attributed graphs, Krleža and Fertalj [20] proposed a fuzzy GNN for graph matching. They built this model based on the combination of graph element comparison using fuzzy logic and graph structure verification using a recursive neural network. Shen et al. [21] applied GNN to large-scale radio resource management, formulated as a graph optimization problem. They designed a so-called message passing GNN, where agents were considered as nodes and communication channels were considered as edges, to achieve a low-complexity neural network operation. Gama et al. [22] introduced two improvements of GNN architectures. One, called selection GNN, replaced the linear time-invariant filter for convolutional feature generation; the other, called aggregation GNN, used a temporal structure to capture the graph topology. Both of them were applied in synthetic networks for source localization. Zhu et al. [23] constructed a hierarchical unsupervised model based on cycle adversarial networks for graph alignment, in which an optimization module for group structure aggregation was developed to recognize similar IoT devices in different networks.
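As a rough illustration of the message-passing idea shared by the GNN models surveyed above, the following Python sketch performs one round of neighborhood aggregation on a toy device graph. It is only a minimal sketch under stated assumptions, not the implementation of any cited work: the function name `mean_aggregate`, the three-device topology, and the feature values are all hypothetical, and the learnable weights and nonlinearity of a real GNN layer are deliberately left out.

```python
from typing import Dict, List


def mean_aggregate(features: Dict[str, List[float]],
                   neighbors: Dict[str, List[str]]) -> Dict[str, List[float]]:
    """One round of mean-aggregation message passing (self-loop included).

    Illustrative only: a real GNN layer would also apply a learned weight
    matrix and a nonlinearity after this aggregation step.
    """
    updated = {}
    for node, own in features.items():
        # Messages are the node's own vector plus each neighbor's vector.
        messages = [own] + [features[n] for n in neighbors.get(node, [])]
        updated[node] = [
            sum(m[k] for m in messages) / len(messages)
            for k in range(len(own))
        ]
    return updated


# Hypothetical topology: one hub connected to two end devices.
neighbors = {"hub": ["cam", "plug"], "cam": ["hub"], "plug": ["hub"]}
features = {"hub": [0.0, 3.0], "cam": [3.0, 0.0], "plug": [6.0, 0.0]}

smoothed = mean_aggregate(features, neighbors)
print(smoothed)
```

On this toy graph the hub's vector becomes [3.0, 1.0], the average of its own features and those of its two neighbors, which shows how each node's representation comes to depend on the features of the devices it is connected to.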
is a nonempty set of nodes in the network, $E = \{e_1, e_2, \ldots, e_Q\}$ is the edge set associated with $V$, and $D = \{d_1, d_2, \ldots, d_P\}$ denotes a set of measures to quantify the hierarchical relations based on link analysis among all the nodes in $V$. $P = |V|$ and $Q = |E|$ denote the total number of nodes and edges, respectively. $X = \{x_1, x_2, \ldots, x_P\}$ is the set of features that corresponds to each individual node in $V$, where each $x_i \in \mathbb{R}^L$ in $X$ is the $L$-dimensional feature vector for each node $v_i \in V$.

Given an NIDS deployed with a GNN-based classification model, the GNN model may classify all the nodes into $C$ classes. We define $y_i \in \{1, 2, \ldots, C\}$ as the ground truth class for node $v_i$, and $\hat{y}_i = f(X; \theta)$ as the predicted result for $v_i$, where $\theta$ indicates the set of parameters in the GNN model. As discussed above, a practical black-box attack scenario is adopted, in which the NIDS's integrity is protected and thus the attacker cannot directly obtain the detailed model structure and parameters. Accordingly, the attacker may attempt to degrade the classification performance of $f(\ast)$ by adding perturbations into the original $G$, turning it into a perturbed graph $G'$. Specifically, we consider that the perturbation only occurs in nodes of the graph model, which results in a perturbed $G' = (V', E, D', X')$.

The problem definition is given as follows. Assuming an attacker tries to degrade the classification performance of a GNN model by adding perturbation to the original $G$, the attacker needs to confuse the GNN model by perturbing the original feature vector set $X$ to $X'$. Therefore, given a predicted classification result $\hat{y}$ compared with the ground truth $y$, the goal is to obtain an optimal perturbed $X'$ based on the loss optimization [34], [35], which can be described as follows:

$$\underset{|X'|<h}{\arg\max}\; L_{attack}\big(f(X'; \theta^{*}), y\big) \quad \text{s.t.} \quad \theta^{*} = \underset{|X'|<h}{\arg\min}\; L_{predict}\big(f(X; \theta), y\big) \tag{1}$$

where $L_{attack}$ and $L_{predict}$ are the cross-entropy losses of the attacker and the original GNN, respectively. $\theta^{*}$ denotes the optimal parameters and $f(X'; \theta^{*})$ denotes the prediction result based on $\theta^{*}$ using $X'$. In particular, we set $|X'| < h$, in which $h$ is the maximum number of nodes to be attacked initially due to the limited budget.

IV. HAA GENERATION AGAINST GNN-BASED NIDS

In this section, we first introduce the overview of HAA generation against the GNN-based NIDS. The modeling of the shadow GNN for HAA generation is then addressed. Then, we discuss how to effectively generate the adversarial examples and how to efficiently compromise the GNN-based NIDS using a hierarchical node selection strategy.

A. Overview of HAA Generation

The overview of the proposed HAA generation method is shown in Fig. 2, which includes three essential parts. To generate black-box attacks, it is important to imitate the original GNN model. Thus, the first part is to construct a shadow GNN model, based on the intercepted network packets and extracted features from the input data of the original model. Then, the second part is to select an optimum node to attack. An RWR-based mechanism is designed to measure each node in the constructed shadow model, and a node with a higher weight is more likely to be selected as the attack node due to the limited budget. Finally, the third part is to generate adversarial examples based on the shadow GNN model, which aim to perturb critical features and alter the classification labels. In summary, the adversarial examples are generated based on the constructed shadow GNN model with the intercepted network packets, which can efficiently mislead the detection or prediction result, and ensure the attack damage to the target system via selecting some more vulnerable nodes.

B. Generation of the Shadow GNN Model

As mentioned previously, in a black-box attack scenario, attackers can only intercept the network traffic data, and assume a known GNN structure for training. They have no access to the actual parameters of the GNN model used in the NIDS. Therefore, it is necessary to construct a shadow model
9314 IEEE INTERNET OF THINGS JOURNAL, VOL. 9, NO. 12, JUNE 15, 2022
to assist in generating adversarial examples that can confuse the GNN-based NIDS. In particular, assuming the attacker can monitor the traffic flow in or out of the target model and collect a certain number of network packets, these intercepted network packets, including the detailed information of source IP, destination IP, timestamp, traffic flows, etc., can be utilized to learn a shadow GNN model with an existing GNN network structure. The process of generating the shadow GNN model can be split into two parts: 1) feature extraction and 2) shadow model training.

First, a feature extractor is constructed to extract the critical information from the intercepted network packets, transform it into a feature vector $x_i$, and form a feature set $X = \{x_1, x_2, \ldots, x_P\}$. A shadow model is then initialized based on the extracted $X$ and original $y$, to reproduce the original $f(\ast)$ as closely as possible.

Second, we go further to learn the temporary parameters $\theta$ in the shadow model, so that it may output the same or similar prediction results as the original GNN model. Specifically, given the predicted result of the shadow model, $\hat{y}_i = f(x_i; \theta)$ computed with these temporary parameters, the goal is to minimize the error between this new prediction and the corresponding prediction of the original GNN model. The root mean square error (RMSE) is used to measure the error between the two predictions, and a classical gradient descent method is employed to optimize the parameters to make sure that the shadow model is as close to the original model as possible.

C. Adversarial Example Generation

As the core issue in the HAA model, adversarial examples are generated by the shadow GNN to modify the feature set $X' = \{x'_1, x'_2, \ldots, x'_N\}$ from $X$, so as to disguise malicious packets as normal ones or vice versa. Thus, the key idea is to learn the decision boundary from the GNN discriminator, then modify the features based on the original data packet and move it across the decision boundary with minimal modifications. Specifically, the modified feature $x'_i$ for node $v_i$ can be defined as follows:

$$x'_i = x_i + \varepsilon_i. \tag{2}$$

Obviously, the goal is to minimize $\|\varepsilon_i\|$ while satisfying $f(x'_i) \neq f(x_i)$, and how to distinguish the critical feature elements in the feature space becomes essential to generate adversarial examples with minimal perturbations to alter the labels.

In particular, the saliency map [37] is utilized to identify the critical elements from the feature space in a gradient-based backpropagation process. By revising these identified key elements, adversarial examples are generated by adding some perturbations according to (2).

To identify the critical feature elements with the saliency map, the derivative weight $\omega$ for each element $z$ in $x$ is introduced and calculated based on backpropagation as follows:

$$\omega_z = \left.\frac{\partial \hat{y}}{\partial x}\right|_{z} \tag{3}$$

where $\hat{y}$ is the corresponding predicted result. Each $\omega_z$ indicates the sensitivity of the corresponding $z$ in $x$, in terms of its influence on the output $\hat{y}$.

Accordingly, we can investigate the sensitivity of each feature element $z$ in $x$, and obtain the top-$k$ critical elements according to the rank of $\omega_z$. Then, the cross-entropy loss of the attacker can be calculated via (1) based on the perturbation $\varepsilon_i$, which is applied to those identified critical elements of the original $x_i$. Specifically, we set a trivial perturbation stride of 0.001 for $\varepsilon_i$ over a maximum of 20 episodes to estimate $L_{attack}(f(X'; \theta^{*}), y)$ during the training process.

D. Hierarchical Node Selection Strategy

As discussed before, it is impossible for an attacker to compromise the whole network by modifying the whole feature set $X$ for all nodes $V$. Considering the limited budget, the attacker usually chooses to compromise a subset of nodes with relatively smaller cost. Thus, an RWR-based algorithm on the GNN model is employed to capture the hierarchical structure feature in a weighted graph, which conducts the node selection task and generates a node set to be attacked with high priority.

RWR has proved to be an efficient way to calculate the weighting score in terms of the connections among networked nodes in a constructed graph model [36]. Basically, the RWR that measures the importance on the edge set $E$ can be defined and expressed as follows:

$$HR^{(t+1)} = \lambda M \cdot HR^{(t)} + (1 - \lambda) HR^{0} \tag{4}$$

where $\lambda$, ranging from 0 to 1, is a damping coefficient for the random navigation during the iteration. $HR^{t}$ denotes a score vector in terms of the feature importance at step $t$. Particularly, $HR^{0} = [0, \ldots, 1, \ldots, 0]$ is the initial vector when starting the RWR, in which the element of value "1" denotes that the corresponding node $v_i$ is selected as a target of attack at the beginning. The transfer matrix $M \in \mathbb{R}^{P \times P}$ stores the probability of each node transferring to the others. Specifically, each $m_{ij} \in M$ can be initially measured according to the outlink $d_i$ of $v_i$, as the probability of transfer from $v_i$ to $v_j$, which can be calculated as follows:

$$m_{ij} = \begin{cases} 1/d_i, & \text{if } \exists\, e_{ij} \in E, \text{ or } i = j \\ 0, & \text{otherwise} \end{cases} \tag{5}$$

where $d_i$ is quantified by the outlink of node $v_i$.

To measure the hierarchical feature of each node, a level-threshold $\tau$ is introduced to empirically evaluate $d_i$ for each $v_i$ in the network, which can measure and identify the level of each node as exemplified in Fig. 1. Given a randomly selected node $v_i$ as an initial node, it will iteratively transmit to its neighborhood node based on $M$ by calculating the corresponding $\lambda m_{ij}$, while it may transmit back to itself with a probability of $(1 - \lambda) HR^{0}$.

Furthermore, the overall loss change is considered and investigated in terms of the update of the transfer matrix $M$ when we choose to attack different nodes with a limited budget. In particular, we evaluate the loss based on $x_i$ when perturbing a node $v_i$ in the graph. Thus, the loss change for a selected node $x_i$ can be defined and described as follows:

$$\Delta_i(x) = L_{attack}\big(f(x'_i), y\big) - L_{attack}\big(f(x_i), y\big). \tag{6}$$

Specifically, we can measure this loss change based on the first-order Taylor approximation
$\Delta_i(x) \approx \Delta_i(x) + \nabla\delta_i^{T}(x - x_i)$ [13], which can be related to the column sum of the $i$th column in the transfer matrix $M$, and further be used to update $M$ during the training process.

The RWR-based hierarchical node selection strategy for adversarial example generation is shown in Algorithm 1. Given a typical IoT network represented by a graph model $G = (V, E, D, X)$ with the ground truth label set $y$, a set of adversarial examples $\{x'_i \mid i \in \{1, 2, \ldots, N\}\}$ can be generated based on the cooperation of the hierarchical node selection and critical feature identification. In particular, we initialize the transfer matrix $M$ based on the outlink of a specific node $v_i$, and iteratively update $M$ considering the corresponding loss changes of the nodes, so as to select a set of nodes with high attack priority. Perturbations are added to the selected nodes to generate adversarial examples when the loss criterion described in (1) is satisfied. Finally, we can efficiently select a set of more vulnerable nodes and attack the GNN-based NIDS using the generated adversarial examples.

Algorithm 1 RWR-Based Hierarchical Adversarial Example Generation
Input: Graph $G = (V, E, D, X)$, ground truth class label $y$
Output: A set of adversarial examples generated according to the top-n selected nodes
 1: Initialize $HR^{0}$, Max Iteration $\pi$, Threshold $\delta_{rwr}$, $\delta_{loss}$
 2: for each $e_{ij}$ in $M$ do
 3:     Initialize $m_{ij}$ via Eq. (5)
 4: end for
 5: for $t = 1$ to $\pi$ do
 6:     Compute $HR^{(t+1)} = \lambda M \cdot HR^{(t)} + (1 - \lambda) HR^{0}$
 7:     Compute error $e^{(t)} = HR^{(t+1)} - HR^{(t)}$
 8:     if $e^{(t)} < \delta_{rwr}$: break
 9:     Compute loss change $\Delta_i(x)$ via Eq. (6) for the temporal top-n nodes in $M$ and update $M$
10: end for
11: Select top-n nodes based on their ranking scores
12: for each $v_i$ in the selected top-n nodes do
13:     Compute critical features for $v_i$ via Eq. (3)
14:     Compute loss $L_{attack}(f(X'; \theta^{*}), y)$ via Eq. (1)
15:     while $L_{attack}(f(X'; \theta^{*}), y) < \delta_{loss}$ do
16:         Generate adversarial examples $x'_i$ by adding perturbation to $v_i$ via Eq. (2)
17:     end while
18: end for
19: return adversarial examples $\{x'_i \mid i \in \{1, 2, \ldots, N\}\}$

V. EVALUATIONS AND DISCUSSION

In this section, we evaluate and compare the proposed HAA generation method with several baseline methods. Experiments are designed and conducted based on an open-source IoT data set, to demonstrate the effectiveness and usefulness of our method.

A. Data Set

An open data set, UNSW-SOSR2019, which was collected by the security laboratory at the University of New South Wales using the tcpdump tool, is adopted for our evaluations. It contains packet traces of different kinds of attack and benign traffic from ten IoT devices in total [38]. Table I summarizes the overall data set, including the IoT devices used and their corresponding training, testing, and attack sample sizes.

TABLE I
DATA SET DESCRIPTION

In addition, a set of preprocessing steps is conducted on the raw data set before training the model.
  1) Construct the training set with 80% benign traffic and 20% attack traffic.
  2) Remove the unreasonable traffic from the training set.
  3) Build the graph model with the input data selected randomly from the training set.

B. Experiment Design

We evaluate the proposed HAA generation method on two typical GNN models, namely, GCN [39] and JK-Net [40]. The number of layers is set to 3 for GCN and 7 for JK-Net. The other hyper-parameters closely follow the setups in [33] and [37]. Specifically, to reflect the actual hierarchical network structure in IoT systems, three levels of IoT nodes are configured as low level (outlinks: 0–5), medium level (outlinks: 5–10), and top level (outlinks: more than 10), with percentages 60%, 25%, and 15%, respectively, according to a typical real-world embedded Industrial IoT scenario. Experiments are conducted on a server with CentOS 8, GTX 1070, G39030 Dual Core, 16-GB RAM, Python 3.6, and PyTorch 1.4, and all the results are generated with 40 repeated and independent trials.

The following three strategies are considered as the baseline methods when compromising the targeted GNN models.
  1) Improved Random Walk With Restart (iRWR) [36]: This method took the time-varying features into consideration to find a navigation based on the importance score of nodes across network connections.
  2) Resistive Switching Memory (RSM) [41]: This method used a cross-point array of RSM with a feedback configuration to solve the eigenvector calculation and webpage ranking tasks, which calculated the importance score similar to the PageRank strategy.
  3) Greedily Corrected Random Walk (GCRW) [13]: This method generated black-box attacks based on analyzing the connection between the backward propagation of GNNs and random walks, which calculated the importance score in a greedy correction procedure.

C. Attack Effectiveness Evaluation

We evaluate the proposed method and compare its attack effectiveness with the mentioned three baseline methods, by
measuring the classification performance degradation based on two different GNN models in IoT environments.

First, we investigate the critical feature identification during the adversarial example generation process. The saliency map is utilized to illustrate the critical features of different IoT devices. Fig. 3 shows a set of generated feature saliency maps according to eight network traffic classes, after principal component analysis (PCA) dimension reduction.

Fig. 3. Feature saliency map generated for different IoT devices. (a) WeMo motion. (b) Samsung cam. (c) TP-Link plug. (d) Netatmo camera. (e) Chromecast Ultra. (f) Amazon Echo. (g) Phillips Hue bulb. (h) iHome plug.

Furthermore, we investigate attack effectiveness for all the methods. We first evaluate how these methods perform with different levels of perturbation. The performance of the GNN models is evaluated based on its classification loss and precision. A method that has a larger impact (i.e., large performance degradation) to the GNN models is expected to result in a higher loss and a lower precision. Fig. 4 presents the loss and precision results for JK-Net [Fig. 4(a) and (b)] and GCN [Fig. 4(c) and (d)], respectively, under varying percentages of the perturbed nodes from 2% to 30%.

Fig. 4. Performance degradation in loss and classification precision in JK-Net and GCN with varying percentages of perturbed nodes and training data. (a) Loss with varying perturbation in JK-Net. (b) Precision with varying perturbation in JK-Net. (c) Loss with varying perturbation in GCN. (d) Precision with varying perturbation in GCN. (e) Precision with varying training data ratio in JK-Net. (f) Precision with varying training data ratio in GCN.

In Fig. 4(a) and (b), the expected trends can be observed, where higher loss and greater precision degradation are
resulted when more nodes are perturbed (i.e., higher level of perturbation). Referring to Fig. 4(a), it can be found that the proposed HAA results in the highest loss in the JK-Net model in comparison with the other three methods. In addition, instead of having a linear increment in loss with respect to the increasing level of perturbation, the HAA generation method demonstrates a more exponential increase in loss. This indicates that the method is more effective when the number of perturbed nodes increases. Similar results are also reflected in Fig. 4(b), where both iRWR and RSM do not demonstrate very effective attack strength, and hence the precision of the GNN model remains reasonably high. In comparison, HAA is able to reduce the classification precision to close to 0.5 at 30% perturbation, which is a significant reduction and impact on the model performance. Overall, the proposed HAA method outperforms the other three methods and is able to achieve a 42.5% reduction in classification precision between 2% and 30% perturbed nodes.

Fig. 4(c) and (d) present the loss and classification precision for the GCN model, respectively, under the four attack models. The results follow the general trend, where higher loss and greater precision degradation will be achieved when more nodes are perturbed, although it can be observed that the GCN model offers a slightly better performance under our targeted scenarios. The performance degradation in GCN exhibits behavior similar to the results for JK-Net. The HAA can achieve the best loss increment and classification precision reduction. Overall, the proposed method has achieved more than 30% reduction in classification precision in the GCN model.

We go further to analyze the effectiveness of the attack methods under varying sizes of training data from 2% to 30% of the original training data set. Fig. 4(e) and (f) show the model performance with varying sizes of training data set for JK-Net and GCN, respectively. The general trend can be observed, where more training data will result in better GNN model performance. It can also be observed that, with 30% training data, the model performance has already reached a level comparable to that with the complete training data set in the cases of iRWR and RSM. This clearly indicates

ensure the security of IoT systems. However, existing NIDS approaches all suffer from the fact that there is only a limited amount of very imbalanced training data, which leads to the vulnerability against unknown types of malicious attack.

In this article, we introduced an HAA generation method, targeting the state-of-the-art GNN-based NIDS in black-box attack scenarios. Specifically, we presented an integrated framework for the level-aware black-box adversarial attack strategy, which could generate adversarial examples based on the constructed shadow GNN model with a limited budget. The saliency map technique was utilized to facilitate the generation mechanism, based on which we could effectively identify those critical feature elements, so as to modify them with minimal perturbations. The RWR algorithm was employed to realize the hierarchical node selection, in which both structural features and overall loss changes within the targeted IoT network were considered to improve the transfer matrix, so as to efficiently select a set of more vulnerable nodes to attack the GNN-based NIDS using the generated adversarial examples. Evaluations were conducted using the open-source data set UNSW-SOSR2019. The results compared with three baseline methods demonstrate the ability of the proposed method in reducing the classification precision by more than 30% in two state-of-the-art GNN models, GCN and JK-Net, respectively.

In the future, we will go further to study more efficient and effective adversarial attack strategies. More evaluations in different IoT network application scenarios will be investigated to improve the adaptability of our method.

REFERENCES

[1] Z. Cai and Z. He, "Trading private range counting over big IoT data," in Proc. 39th IEEE Int. Conf. Distrib. Comput. Syst., 2019, pp. 144–153.
[2] X. Zhou et al., "Intelligent small object detection for digital twin in smart manufacturing with industrial cyber-physical systems," IEEE Trans. Ind. Informat., vol. 18, no. 2, pp. 1377–1386, Feb. 2022.
[3] Z. Cai and X. Zheng, "A private and efficient mechanism for data uploading in smart cyber-physical systems," IEEE Trans. Netw. Sci. Eng., vol. 7, no. 2, pp. 766–775, Apr./Jun. 2020.
[4] X. Zhou, X. Xu, W. Liang, Z. Zeng, and Z. Yan, "Deep-learning-enhanced multitarget detection for end–edge–cloud surveillance in smart IoT," IEEE Internet Things J., vol. 8, no. 16, pp. 12588–12596,
Aug. 2021.
that the attack strength of iRWR and RSM is not very strong. [5] K. Sha, T. A. Yang, W. Wei, and S. Davari, “A survey of edge computing-
With GCRW, the model performance is also reaching close to based designs for IoT security,” Digit. Commun. Netw., vol. 6, no. 2,
0.76, whereas in the case of HAA, the classification precision pp. 195–202, 2020.
[6] A. Aldweesh, A. Derhab, and A. Z. Emam, “Deep learning approaches
remains to be close to or lower than 0.6 when up to 30% of for anomaly-based intrusion detection systems: A survey, taxonomy, and
the training data is used. This gives a strong indication of the open issues,” Knowl. Based Syst., vol. 189, pp. 105–124, Feb. 2020.
attack strength achieved by the proposed HAA method, which [7] X. Zhou, X. Yang, J. Ma, and K. I.-K. Wang, “Energy efficient smart
routing based on link correlation mining for wireless edge comput-
performs more consistently with varying sizes of training data. ing in IoT,” IEEE Internet Things J., early access, May 6, 2021,
doi: 10.1109/JIOT.2021.3077937.
[8] B. B. Zarpelão, R. S. Miani, C. T. Kawakani, and S. C. de Alvarenga,
VI. C ONCLUSION “A survey of intrusion detection in Internet of Things,” J. Netw. Comput.
Appl., vol. 84, pp. 25–37, Apr. 2017.
Advanced IoT networks and systems are growing at an [9] X. Zhou, W. Liang, S. Shimizu, J. Ma, and Q. Jin, “Siamese neural
unforeseen rate, reaching every corner of our cities and coun- network based few-shot learning for anomaly detection in industrial
cyber-physical systems,” IEEE Trans. Ind. Informat., vol. 17, no. 8,
tries, to collect useful data, and to offer intelligent services. pp. 5790–5798, Aug. 2021.
Considering the amount of data collected and processed by [10] X. Zheng and Z. Cai, “Privacy-preserved data sharing towards multiple
modern IoT systems, it is of critical importance to make parties in industrial IoTs,” IEEE J. Sel. Areas Commun., vol. 38, no. 5,
sure that those systems are secure and not to be misused pp. 968–979, May 2020.
[11] Z. Cai, Z. Xiong, H. Xu, P. Wang, W. Li, and Y. Pan, “Generative
for any malicious purposes. To address this issue, tremendous adversarial networks: A survey toward private and secure applications,”
amount of research effort is devoted to design robust NIDS to ACM Comput. Surv., vol. 54, no. 6, pp. 1–38, 2021.
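The RWR-based node selection summarized in this section can be illustrated
with a minimal sketch. The code below implements plain random walk with
restart on a small adjacency matrix and ranks nodes by their relevance to a
seed node. It does not include the structural-feature and loss-change
adjustments to the transfer matrix that HAA applies. The function names,
the restart probability of 0.15, and the four-node path graph are
assumptions made for illustration only.

```python
def random_walk_with_restart(adj, seed, restart=0.15, tol=1e-10, max_iter=1000):
    """Return plain RWR relevance scores of every node with respect to `seed`.

    Illustrative only: the transfer matrix here is the column-normalised
    adjacency matrix, without the adjustments described for HAA.
    """
    n = len(adj)
    # Column-normalise so that column j holds the transition probabilities
    # out of node j (an isolated node keeps an all-zero column).
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    restart_vec = [1.0 if i == seed else 0.0 for i in range(n)]
    scores = restart_vec[:]
    for _ in range(max_iter):
        # One step: walk along an edge with prob. (1 - restart),
        # or jump back to the seed with prob. restart.
        new = [
            (1 - restart) * sum(trans[i][j] * scores[j] for j in range(n))
            + restart * restart_vec[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(new, scores)) < tol:
            return new
        scores = new
    return scores


def select_nodes(adj, seed, budget):
    """Pick the `budget` non-seed nodes with the highest RWR score."""
    scores = random_walk_with_restart(adj, seed)
    ranked = sorted(
        (i for i in range(len(adj)) if i != seed),
        key=lambda i: -scores[i],
    )
    return ranked[:budget]


# Hypothetical 4-node path graph 0 - 1 - 2 - 3 (not taken from the data set).
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(adj, seed=0)
targets = select_nodes(adj, seed=0, budget=2)
print(targets)
```

In this sketch the budget simply caps how many nodes are returned. Under
the approach described in the paper, the ranking would additionally reflect
the loss change observed on the shadow GNN model.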
9318 IEEE INTERNET OF THINGS JOURNAL, VOL. 9, NO. 12, JUNE 15, 2022
Xiaokang Zhou (Member, IEEE) received the Ph.D. degree in human sciences
from Waseda University, Tokyo, Japan, in 2014.
He is currently an Associate Professor with the Faculty of Data Science,
Shiga University, Hikone, Japan. From 2012 to 2015, he was a Research
Associate with the Faculty of Human Sciences, Waseda University. He has
been working as a Visiting Researcher with the RIKEN Center for Advanced
Intelligence Project, RIKEN, Tokyo, since 2017. He has been engaged in
interdisciplinary research work in the fields of computer science and
engineering, information systems, and social and human informatics. His
recent research interests include ubiquitous computing, big data, machine
learning, behavior and cognitive informatics, cyber-physical-social
systems, cyber intelligence, and security.
Dr. Zhou is a member of IEEE CS and ACM, USA; IPSJ and JSAI, Japan; and
CCF, China.

Wei Liang (Member, IEEE) received the M.S. and Ph.D. degrees in computer
science from Central South University, Changsha, China, in 2005 and 2016,
respectively.
From 2014 to 2015, he was a Researcher with the Department of Human
Informatics and Cognitive Sciences, Waseda University, Tokyo, Japan. He is
currently working with the Base of International Science and Technology
Innovation and Cooperation on Big Data Technology and Management, Hunan
University of Technology and Business, Changsha. He has published more than
20 papers at various conferences and journals. His research interests
include information retrieval, data mining, and artificial intelligence.
Dr. Liang is a member of IEEE CS and CCF, China.

Weimin Li (Member, IEEE) received the Ph.D. degree in control theory and
control engineering from Donghua University, Shanghai, China, in 2008.
He is a Professor with the School of Computer Engineering and Science,
Shanghai University, Shanghai, China. He was a JSPS Research Fellow with
the Department of Human Informatics and Cognitive Sciences, Waseda
University, Tokyo, Japan, from 2012 to 2013. He was a Visiting Scholar with
the Department of Computer Science, University of California at Santa
Barbara, Santa Barbara, CA, USA, from 2015 to 2016. He has been involved in
extensive research work in the fields of computer science, service
computing, group behavior, and database technology. His current research
interests include social computing, data mining and analytics, group
behavior modeling and simulating, and service recommendations.
Ke Yan (Member, IEEE) received the bachelor's and Ph.D. degrees in computer
science from the School of Computing, National University of Singapore
(NUS), Singapore, in 2006 and 2012, respectively.
He is currently an Assistant Professor with NUS. He has published more than
70 full-length papers with highly ranked conferences and journals,
including Association for the Advancement of Artificial Intelligence, IEEE
TRANSACTIONS ON INDUSTRIAL INFORMATICS, IEEE TRANSACTIONS ON SUSTAINABLE
ENERGY, IEEE TRANSACTIONS ON SYSTEMS, MAN AND CYBERNETICS: SYSTEMS, and
Applied Energy. He is actively engaged in cross-discipline research fields,
including machine learning, artificial intelligence, cyber intelligence,
applied mathematics, sustainability, and applied energy.

Kevin I-Kai Wang (Member, IEEE) received the Bachelor of Engineering
(Hons.) degree in computer systems engineering and the Ph.D. degree in
electrical and electronics engineering from the Department of Electrical
and Computer Engineering, University of Auckland, Auckland, New Zealand, in
2004 and 2009, respectively.
He is currently a Senior Lecturer with the Department of Electrical and
Computer Engineering, University of Auckland. He was also a Research
Engineer designing commercial home automation systems and traffic sensing
systems from 2009 to 2011. His current research interests include wireless
sensor network-based ambient intelligence, pervasive healthcare systems,
human activity recognition, behavior data analytics, and bio-cybernetic
systems.