Research Article
ARTICLE INFO

Keywords: Blockchain; Machine learning; IoT; Security; Integration; Smart contracts

ABSTRACT

Cyber-attacks pose a significant challenge to the security of Internet of Things (IoT) sensor networks, necessitating the development of robust countermeasures tailored to their unique characteristics and limitations. Various prevention and detection techniques have been proposed to mitigate these attacks. In this paper, we propose an integrated security framework using blockchain and Machine Learning (ML) to protect IoT sensor networks. The framework consists of two modules: a blockchain prevention module and an ML detection module. The blockchain prevention module has two lightweight mechanisms: identity management and trust management. Identity management employs a lightweight Smart Contract (SC) to manage node registration and authentication, ensuring that unauthorized entities are prohibited from engaging in any tasks, while trust management uses a lightweight SC that is responsible for maintaining trust and credibility between sensor nodes throughout the network’s lifetime and tracking historical node behaviors. Consensus and transaction validation are achieved through a Verifiable Byzantine Fault Tolerance (VBFT) mechanism to ensure network reliability and integrity. The ML detection module utilizes the Light Gradient Boosting Machine (LightGBM) algorithm to classify malicious nodes and notify the blockchain network if it must make decisions to mitigate their impacts. We investigate the performance of several off-the-shelf ML algorithms, including Logistic Regression, Complement Naive Bayes, Nearest Centroid, and Stacking, using the WSN-DS dataset. LightGBM is selected following a detailed comparative analysis conducted using accuracy, precision, recall, F1-score, processing time, training time, prediction time, computational complexity, and Matthews Correlation Coefficient (MCC) evaluation metrics.
* Corresponding author.
E-mail address: [email protected] (S. Ismail).
https://doi.org/10.1016/j.bcra.2023.100174
Received 19 July 2023; Received in revised form 12 October 2023; Accepted 27 November 2023
Available online 30 November 2023
2096-7209/© 2023 THE AUTHORS. Published by Elsevier B.V. on behalf of Zhejiang University Press. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/4.0/).
S. Ismail, M. Nouman, D.W. Dawoud et al. Blockchain: Research and Applications 5 (2024) 100174
Table 1
Mapping of the proposed framework modules with key remarks.
Blockchain prevention | Lightweight identity management Smart Contract (SC) | Identity management and secure authentication mechanism | Registration, authentication, and revocation phases
 | Lightweight trust management SC | Trust evaluation mechanism | Trust score calculation for each of the nodes using a set of evaluation metrics
 | Consensus | Proof-of-Work (PoW) vs. Verifiable Byzantine Fault Tolerance (VBFT) | VBFT achieves overall lower transaction cost than PoW
Machine learning detection | Light Gradient Boosting Machine (LightGBM) | LightGBM, Logistic Regression (LR), Complement Naive Bayes (CompNB), Nearest Centroid (NC), and Stacking performance analysis | LightGBM has the superiority in terms of accuracy, precision, recall, and F1-score, while NC performs better in terms of processing time
more robust and immune to a single point of failure [6], and data are safe and immutable; therefore, data will never be tampered with by any malicious party once added to the blockchain ledger [7].

The purpose of ML is to analyze the data generated by IoT devices and make classifications and predictions based on that data [8]. ML-based approaches for cyber-attack detection are effective at identifying and mitigating threats, while blockchain can be recognized as a trusted layer for IoT network participants. Combining blockchain and IoT devices with the ability to automatically record data and transfer it over a network ensures that unauthorized entities are prohibited from engaging in any tasks and can maintain trust and credibility between sensor nodes throughout the network’s lifetime.

ML and blockchain integration has recently emerged as a promising security approach for safeguarding IoT sensor networks against malicious nodes. The decentralized nature of the blockchain contributes to the network’s resilience by eliminating a single point of failure [6]. Data become tamper-proof once added to the blockchain ledger, assuring that they cannot be manipulated by malicious parties [7,9]. ML applications focus on detecting and classifying malicious nodes in IoT sensor networks. Trained ML classifiers are deployed to analyze node behavior, enabling the network to respond appropriately and mitigate their impacts [10]. The response can take the form of generating alarms within the blockchain network, which can lead to isolating the attacker node or revoking its identity, preventing further transactions within the network.

Unlike the related literature discussed in Section 3, this study presents a lightweight, integrated security framework that combines the power of blockchain and ML technologies to strengthen IoT sensor network security from the time of network node deployment and throughout the network’s lifetime. This study has several contributions, including:

• Deploying a permissionless blockchain on the Base Station (BS) and Cluster Heads (CHs) to register and authenticate the Monitor Nodes (MNs) within their vicinity using their credentials. Identity information is stored on the public blockchain network after authentication. VBFT, one of the preferable low-complexity protocols developed for distributed systems with connected unreliable wireless nodes [11], is the consensus mechanism of the proposed framework.
• Proposing a lightweight blockchain prevention module that consists of two mechanisms: identity management and trust management. The identity management mechanism employs a lightweight Smart Contract (SC) that is responsible for verifying and registering nodes at network node deployment, while the trust management mechanism has a lightweight SC that maintains the nodes’ trust and credibility throughout the network’s lifetime and helps track their historical behaviors. This SC calculates a Trust Score (TS) value for each node that indicates whether the node is normal or misbehaving. The ML detection module is triggered to perform malicious node detection once the trust management SC determines that a node is misbehaving.
• Proposing an ML detection module that will be deployed on the BS and CHs to classify malicious nodes and notify the blockchain network. A malicious node’s registration is revoked from the blockchain network using the identity management SC once identified and classified. We compared the performance of several off-the-shelf supervised ML algorithms, including Light Gradient Boosting Machine (LightGBM), Logistic Regression (LR), Complement Naive Bayes (CompNB), Nearest Centroid (NC), and Stacking, for cyber-attack detection to implement the ML malicious node detection module and select the classifier with the appropriate performance. The comparative analysis is conducted using the following evaluation metrics: accuracy, precision, recall, F1-score, processing time, training time, prediction time, computational complexity, and Matthews Correlation Coefficient (MCC). Table 1 summarizes the key contributions of the proposed framework and its key remarks.

The rest of this paper is organized as follows. Section 2 outlines some necessary preliminaries for integrating ML and blockchain to secure IoT sensor networks. Section 3 reviews the literature and identifies the recent work on ML and blockchain integrated solutions. Section 4 discusses the proposed integrated security framework in detail. Section 5 presents the system implementation and discusses the results. Section 6 concludes this paper with key remarks.

2. Preliminary

Integrating ML and blockchain for IoT sensor network security is a promising approach to address vulnerabilities and possible threats associated with IoT devices and data; however, the potential benefits and challenges involved in combining these two technologies to enhance IoT sensor network security have not been widely investigated in existing literature, primarily because this represents a relatively novel research direction.

Blockchain offers a decentralized and trustless approach to managing transactions and data, eliminating the need for a third-party central authority. Blockchain records all committed transactions on a distributed ledger, making it particularly valuable for enhancing cryptocurrency system security. The blockchain architecture is well-suited for applications involving distributed transactions, decentralized computation, and management in a trustless environment [12].

Integrating blockchain with IoT sensor networks can mitigate security risks associated with data storage, resource access, routing, and identity authentication [13]. The blockchain’s Peer-To-Peer (P2P) distributed ledger, which supports scalability and faster settlement for coordinating and securing nodes, makes it a promising solution for securing data and authenticating identities in IoT networks, as discussed in Ref. [14].

Applying blockchain in IoT sensor networks presents challenges due to high storage and computational demands, resulting in increased delays and reduced network throughput. Blockchain often incurs costs
related to communication, memory usage, and power consumption, which can be at odds with the resource constraints of the sensor devices; however, leveraging blockchain can help reduce the costs associated with setting up and maintaining a centralized database, making efficient use of node idle states in terms of computational, storage, and bandwidth capabilities, ultimately lowering network calculations and storage costs.

On the other hand, ML algorithms provide efficient classification models to identify cyber-attacks, which enhance the network nodes’ ability to learn without being explicitly programmed. The models are used to make future predictions with new input data. ML algorithms are currently used in various IoT sensor network applications. One approach is using ML to design lightweight detection and mitigation systems to secure IoT sensor networks against cyber-attacks. ML models can be trained to recognize patterns associated with known cyber-attack vectors and detect unauthorized access attempts or malicious activities within the IoT network [15].

In this way, the sensor network can detect possible attacks and immediately take appropriate actions to mitigate the impact by triggering an alarm, determining the degree of the risk, or isolating the attacker node from the next round of network progress [16].

Integrating ML and blockchain can significantly enhance IoT sensor network security by providing data integrity, device identity, access control, and real-time attack detection [17,18]; however, it is necessary to develop lightweight security mechanisms that carefully consider the trade-offs among ML, blockchain, and the design factors and security requirements of IoT sensor networks, including device resource limitations, particularly in terms of power consumption and latency [19].

Our approach has two lines of defense that utilize blockchain and ML integration. The first line of defense is attack prevention using blockchain, while the second one is attack detection using ML. In the proposed framework, the first line of defense is represented through two lightweight mechanisms: 1) handling registration and authentication, which prevents any node that fails to prove its identity from joining the network, and 2) maintaining trust and credibility between sensor nodes by calculating a trust value for each node to select the trustworthy nodes as reliable data sources. The second line of defense is an ML detection module that is responsible for verifying and examining the incoming traffic for any malicious behavior, alerting the network to the presence of an attacker node.

3. Related work

Previous studies have explored integrating ML and blockchain technologies to enhance IoT-Wireless Sensor Networks (WSNs) security via various approaches. These approaches encompass secure routing, identity authentication, attack localization, malicious node detection, and trust management (Table 2).

Integrating blockchain and ML to enhance routing protocol security in IoT-WSNs has been discussed in several studies [20–23]. Yang et al. [20] proposed a framework that leverages a Proof of Authority (PoA)-blockchain network to securely record routing information using registration and token contracts. A reinforcement learning (RL) algorithm was applied to dynamically select the trusted routes. The results revealed an 81% reduction in average packet delay compared to existing techniques, attributed to the trusted queue length information provided in their framework. Revanesh and Sridhar [21] proposed trusted routing using blockchain and the Salp Swarm Optimization algorithm. A Deep Learning (DL)-Convolutional Neural Network (CNN) was implemented to manage the decision of routing link selection based on the trusted routing information obtained from the blockchain. The work of Ref. [22] also employed PoA-based blockchain and introduced a DL model using CNN to determine validators for the PoA-SC. By pre-selecting and limiting the number of validators, their PoA-DL consensus mechanism demonstrated lower latency and enhanced transaction processing capacity compared to conventional approaches. In Ref. [23], a novel routing protocol was proposed called Generative Adversarial Network (GAN) and Blockchain-based Secured Routing Protocol (GBCRP), which combines the Fully Distributed Generative Adversarial Networks (FDGAN)-DL method with Intrusion Detection System (IDS) and blockchain. The integration of GAN, IDS, and blockchain in their protocol revealed an improvement in the overall security and efficiency of routing in WSNs.

Malicious node detection based on ML-blockchain integrated approaches has been explored in Refs. [24–27]. For instance, in Ref. [26], we proposed a blockchain-based identity management and secure authentication mechanism to be deployed on a hybrid blockchain architecture along with a Naive Bayes (NB) detection module. The objective was to mitigate potential insider Denial of Service (DoS) attacks targeting CH nodes; however, the work did not implement a mechanism to perform registration and authentication for the nodes before joining the network. Yang et al. [24] used the Isolation Forest as an anomaly detection model. This model was chosen for its computational efficiency and high detection performance, particularly when managing large volumes of dimensional data. The blockchain component ensured secure storage and updates for the global detection model by providing trusted blocks (isolation trees) for model formation. The reported results demonstrated that the proposed integrated blockchain-Isolation Forest IDS model achieved high detection accuracy for various attacks, while requiring lower communication and storage overhead compared to similar blockchain-based models; however, it only stores the detection model itself and not the detection results, which eliminates any record of node behavior. Sajid et al. [25] proposed a joint identity management and secure routing model. The authors examined ML techniques such as the Genetic Algorithm-based Support Vector Machine (GA-SVM) and the Genetic Algorithm-based Decision Tree (GA-DT) to detect malicious nodes, and the results showed that GA-SVM outperformed GA-DT in terms of detection accuracy. The node’s involvement in the routing process or its registration revocation from the blockchain network was determined based on the outcome of the GA-DT process. The security of the routing transactions was ensured using PoA consensus. Removing malicious nodes resulted in a packet delivery rate increase to 99.72%. This work [25] improved malicious node isolation; however, it only targeted routing process security. Nouman et al. [27] used the VBFT-blockchain network for node registration and authentication. The authors also proposed a Histogram Gradient Boost (HGB) classifier for detecting DoS attacks. Data associated with normal nodes were stored in an InterPlanetary File System (IPFS) to generate hashed chunks that could be stored in the blockchain ledger. Performance comparisons demonstrated high precision (at least 98%) achieved by HGB, surpassing its counterparts. The transaction costs of VBFT were lower than those of Proof-of-Work (PoW); however, the proposed model eliminated any records of previous node behaviors, similar to Ref. [24].

In addition to malicious node detection, trust evaluation was proposed in Ref. [28] considering blockchain-ML integration. In Ref. [28], a Sybil attack detection scheme and a blockchain-based trust model were introduced. The trust model was able to identify the Sybil nodes by computing a trust value for each node using the Hidden Markov Model (HMM), and these trust values were added to the blockchain for legitimate node reference; however, this only mitigates Sybil attacks in underwater sensor networks.

In Ref. [18], Gebremariam et al. used the blockchain-ML integration for attack localization combined with a trust evaluation mechanism. The authors specifically proposed an attack localization and detection technique incorporated with cascade encryption and trust evaluation using blockchain and hybrid Federated Learning (FL) to secure large-scale IoT-WSNs [18]. As is known in the literature, DL techniques are more demanding in terms of computational complexity and processing power than ML. We have selected supervised ML classification for the attack detection module in this work for that reason. ML classification algorithms often have a simpler structure and require less data for training. Collecting a large, labeled dataset to train DL models can be
Table 2
Existing work on integrated blockchain-machine learning systems for wireless sensor networks security (N/S means not specified).
Public | DL-CNN | [21] | Insider attacks | PoA | Registration contract and token contract | Trustworthy cluster-based routing using blockchain and DL-CNN for optimal routing nodes selection
 | HMM | [28] | Sybil attacks | N/S | N/S | Trust management model for the detection of Sybil nodes using blockchain and HMM
 | HGB | [27] | DoS attacks | VBFT vs. PoW | Registration and authentication contract | Authentication mechanism to mitigate DoS attacks using blockchain and HGB integrated with IPFS for data storage
Private | Isolation Forest | [24] | Insider attacks | IOTA tangle | N/S | Distributed anomaly detection using Isolation Forest algorithm and blockchain
 | GA-SVM and GA-DT | [25] | Grayhole, mistreatment, and MITM attacks | PoA vs. PoW | Agreement SC | Nodes registration and authentication and routing data storage using blockchain and malicious node detection using GA-SVM and GA-DT
Consortium | RL | [20] | Blackhole attacks | PoA | Registration contract and token contract | Trusted routing using blockchain and RL
 | CNN | [22] | Routing attacks (Blackhole) | PoA | Registration contract and token contract | Trusted routing using blockchain and CNN
Hybrid | Gaussian NB | [26] | Insider attacks | N/S | Registration and authentication contract | Identity management and secure authentication mechanism using blockchain and malicious node detection using Gaussian NB
 | FL | [18] | Routing attacks | PoA vs. PoW | Registration and authentication contract | Attack localization and detection incorporated with cascade encryption and trust evaluation using blockchain and FL
N/S | GAN | [23] | Network layer attacks | N/S | N/S | Secure routing using blockchain and authentication and validation of routing procedures using GAN
Note: SC: Smart Contract, DL-CNN: Deep Learning-Convolutional Neural Network, PoA: Proof-of-Authority, HMM: Hidden Markov Model, HGB:
Histogram Gradient Boost, DoS: Denial of Service, PoW: Proof-of-Work, GA-SVM: Genetic Algorithm-based Support Vector Machine, GA-DT:
Genetic Algorithm-based Decision Tree, MITM: Man-in-the-middle, RL: Reinforcement Learning, CNN: Convolutional Neural Network, Gaussian
NB: Gaussian Naive Bayes, FL: Federated Learning, GAN: Generative Adversarial Network.
challenging in IoT environments. ML classifiers can often perform well with smaller datasets, making them more practical for many IoT use cases. This simplicity can be advantageous when designing solutions for IoT networks, making them less vulnerable to numerous attacks since it reduces the system’s complexity. We consider our results previously published in Ref. [29] as a motivation to use LightGBM for the ML detection module. Moreover, the mechanisms employed within the blockchain module are considered lightweight SCs in terms of opcodes and calculations, which help reduce the gas cost of calling SC functions.

4. Proposed framework

The proposed framework uses a permissionless decentralized blockchain structure on a hierarchical cluster-based architecture to benefit from blockchain immutability and, at the same time, reduce its complexity, allowing it to be used in IoT networks.

The proposed framework comprises two modules: the blockchain prevention module and the ML detection module. The blockchain prevention module employs two mechanisms: identity management and trust management. Identity management uses an SC to manage node registration and verification, ensuring that unauthorized entities are prohibited from engaging in the blockchain network, while trust management has an SC that periodically computes a TS for each MN to evaluate its behavior throughout the network’s operation. Transaction validation is performed using the VBFT consensus algorithm.

The ML detection module identifies and classifies any malicious node using an efficient ML model for the blockchain network to be notified to take appropriate actions and isolate this node. Extensive performance comparisons are conducted to select the appropriate ML algorithm to classify the detected malicious nodes. Fig. 1 depicts the proposed framework that deploys both modules in a permissionless blockchain network consisting of multiple clusters through the BS and CH nodes.

4.1. Blockchain prevention module

Identity management and trust evaluation are two important means for preserving IoT sensor network security, ensuring that legitimate nodes can access network services or resources and maintain trustworthiness between them throughout the network’s lifetime. This module is responsible for preventing attacks within a blockchain-based IoT sensor network that can harm network services and stop legitimate traffic from accessing the network. The module can achieve this by implementing the following measures: registering and authenticating nodes before they are granted permission to transact over the network. This process guarantees that only authorized nodes can participate in network activities. Each registered node is assigned a unique identity nameplate, referred to as an IDCard, which is generated by the identity management SC. The module then uses VBFT to validate transactions before they are added to the blockchain ledger, which ensures the integrity and reliability of recorded transactions.
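The interplay between the two modules (registration first, periodic trust scoring next, ML detection only for nodes flagged by the trust score, and revocation of confirmed attackers) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names MonitorNode and assess_node, the 0.5 threshold, and the callables standing in for the trust management SC and the ML classifier are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MonitorNode:
    """Hypothetical off-chain view of a monitor node."""
    sn_id: int
    registered: bool = False
    history: List[str] = field(default_factory=list)


def assess_node(
    node: MonitorNode,
    trust_score: Callable[[MonitorNode], float],
    is_malicious: Callable[[MonitorNode], bool],
    risky_below: float = 0.5,  # assumed threshold, not taken from the paper
) -> str:
    """Run one assessment round and return the action taken for the node."""
    if not node.registered:
        action = "rejected"        # unregistered nodes may not transact
    elif trust_score(node) >= risky_below:
        action = "trustworthy"     # no ML detection needed this round
    elif is_malicious(node):
        node.registered = False    # stands in for identity revocation
        action = "revoked"
    else:
        action = "risky"           # keep monitoring in later rounds
    node.history.append(action)    # stands in for the on-chain record
    return action


# Hypothetical usage with fixed scores and stub classifiers.
good = MonitorNode(sn_id=1, registered=True)
bad = MonitorNode(sn_id=2, registered=True)
scores = {1: 0.9, 2: 0.2}
outcome_good = assess_node(good, lambda n: scores[n.sn_id], lambda n: False)
outcome_bad = assess_node(bad, lambda n: scores[n.sn_id], lambda n: True)
```

Passing the trust score and the classifier as callables keeps the sketch independent of how either is computed, which this section does not specify.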
The trust management SC establishes trustworthiness among the nodes by periodically calculating the TS value for each node. Each node should be classified into one of two levels of trust: trustworthy or risky. If a node is classified as risky, the ML detection module is triggered to identify and classify its malicious behavior. The result of the ML detection will be recorded on the blockchain. Subsequently, the blockchain network utilizes the identity management SC to revoke the identity of the identified malicious node, which ensures that the node will no longer participate in the network operation throughout the network’s lifetime and consequently mitigates the potential damage caused by its malicious actions.

4.1.1. Lightweight identity management mechanism

Centralized security and authentication authorities, such as identity providers or central access servers, have limitations in terms of single point of failure and scalability. Using permissionless decentralized blockchain for identity management and secure authentication should avoid single point of failure and support network scalability regardless of the number of managed identities [30].

In this work, lightweight identity management is employed to facilitate the registration and authentication of nodes and record their identities into the blockchain ledger. During the initial deployment, the node’s credentials will be registered on the blockchain to be able to transact and communicate over the network. Each node is assigned a unique IDCard that should include SN_id, SN_password, CH_id, BS_id, and SN_time. The identity management SC encompasses several functions, including RegisterNode() for node registration, AuthNode() for node authentication, RevokeNode() for node revocation, InfoNode() for querying node information, and TotalNode() for querying the total number of registered and authenticated nodes.

4.1.2. Lightweight trust management mechanism

The trust management mechanism proposed in this study aims to ensure the selection of reliable data sources in a lightweight manner. The proposed trust management SC is responsible for maintaining node trust and detecting malicious nodes within the network by periodically evaluating the TS value for each sensor node using a set of assessment metrics that are determined during the network’s operations. These metrics include node status, Transmitted Signal Strength (TSS), Packet Sending Rate (PSR), Packet Forwarding Rate (PFR), Forwarding Delay (FD), and Energy Consumption Amount (ECA). Each node is equipped with a TrustCard that contains SN_id, Status, TS, TSS, PSR, PFR, FD, EnergyLevel, and ECA. Nodes are classified into two trust levels: trustworthy and risky. The trustworthy node is deemed safe for sending, receiving, and forwarding packets within the network, while the risky node should be recorded on the blockchain to be closely monitored in subsequent assessment rounds. We adopted the same trust mechanism discussed in Ref. [31], but with two trust levels.

4.1.3. Blockchain consensus

Data consistency can be assured by blockchain-based consensus on the data without involving a central authority or a third party [32]. Examples of consensus algorithms include PoA, PoW, Proof of Stake (PoS), and RAFT. In this study, VBFT is adopted as the consensus mechanism for the proposed framework. VBFT enhances the traditional Byzantine Fault Tolerance (BFT) by introducing verifiable randomness in the selection of consensus peers for the next block. This randomness, achieved by applying a random function to the current block, fortifies the algorithm against malicious attacks. VBFT combines Verifiable Random Function (VRF), BFT, and PoS, making it a hybrid consensus algorithm. PoW is used as a benchmark scheme, where the nodes compete to solve a mathematical puzzle, which typically requires significant computational resources. The node that successfully solves the puzzle first is granted the authority to add the new block to the blockchain.

4.2. Machine learning detection module

The proposed framework deploys an ML detection model on both the BS and CHs in order to effectively identify and classify malicious nodes. We conduct a comprehensive performance comparison of various supervised ML algorithms to detect cyber-attacks in IoT sensor networks, specifically LightGBM, LR, CompNB, NC, and Stacking.

4.2.1. Dataset and data preprocessing

The ML models are trained using the specialized imbalanced dataset, WSN-DS, which contains samples of four insider DoS attacks: Blackhole, Grayhole, Flooding, and TDMA scheduling [33]. Data preprocessing involves cleaning, normalization, handling duplicates or missing data, feature encoding, dimensionality reduction, and labeling [34].

The RandomOverSampler technique is applied to balance the original dataset using the imbalanced-learn Python library. This technique
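$$\Phi = [\theta_1 \; \theta_2 \; \cdots \; \theta_n]$$

where

$$0 \le \theta_j \le 1, \qquad \sum_j \theta_j = 1,$$

and $\boldsymbol{\Delta} = [\Delta'_1 \; \Delta'_2 \; \cdots \; \Delta'_n]$ is the $n \times c$ class distribution vector for the $n$ classifiers.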
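To make the roles of the weight vector Φ and the class distributions Δ concrete, the following Python sketch combines per-classifier class probabilities with normalized weights. It assumes that the stacked output is the weighted sum ΦΔ, which is a common choice but is not confirmed by the part of the derivation shown here; the function name combine_distributions and all numeric values are hypothetical.

```python
from typing import List, Sequence


def combine_distributions(
    weights: Sequence[float],
    distributions: Sequence[Sequence[float]],
) -> List[float]:
    """Return the weighted sum Phi * Delta as a list of c class scores.

    weights       -- Phi, one weight theta_j per classifier (length n)
    distributions -- Delta, one class distribution per classifier (n x c)
    """
    if len(weights) != len(distributions):
        raise ValueError("need exactly one weight per classifier")
    if any(w < 0.0 or w > 1.0 for w in weights):
        raise ValueError("each weight must lie in [0, 1]")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n_classes = len(distributions[0])
    return [
        sum(w * row[k] for w, row in zip(weights, distributions))
        for k in range(n_classes)
    ]


# Hypothetical example: n = 3 classifiers, c = 2 classes (normal, attack).
phi = [0.5, 0.3, 0.2]
delta = [
    [0.7, 0.3],
    [0.4, 0.6],
    [0.1, 0.9],
]
combined = combine_distributions(phi, delta)
predicted_class = max(range(len(combined)), key=combined.__getitem__)
```

Because the weights are non-negative and sum to one, the combined scores remain a valid probability distribution whenever each row of Δ is one.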
Fig. 2. RegisterNode() inputs from identity management smart contract.
5. Experimental results and discussion
Algorithm 1: Identity management smart contract.
In this section, we discuss the proposed framework’s experimen- 1: Initialize identity management SC structure.
tal results. We first declare the blockchain prevention module over 2: Declare IDCard variables: SN_id, SN_password, SN_registered, CH_id, BS_id, SN_
time.
Ethereum, which is a permissionless blockchain deployed on BS and
3: Define a modifier onlyCH to restrict functions execution to the CH. If the sender
CH nodes. We design and evaluate the two proposed SCs using Remix is not a CH, revert the transaction.
IDE web integrated with Ethereum. The ML detection module is de- 4: Function RegisterNode()
signed using the LightGBM algorithm, which is selected following an • Create a new SN record with the provided initials and set SN_registered to
true.
extensive comparative analysis with other algorithms, including LR, CompNB, NC, and Stacking. We use Google Colaboratory and Python to carry out the performance comparison on the balanced and imbalanced WSN-DS datasets.

5.1. Blockchain prevention module

The proposed blockchain prevention module is implemented using Ethereum, which is a permissionless blockchain and distributed application platform commonly known for its virtual cryptocurrency, Ether or ETH. Ether is the token that powers Ethereum. The web-based Remix IDE, integrated with an Ethereum wallet created using a JavaScript injector called MetaMask, is used to develop the SCs and to evaluate the performance of the consensus algorithm.
Two SCs are built in the proposed framework: the identity management SC and the trust management SC. In general, an SC can be defined as program code that encodes an automated legal agreement [7]. The Vyper, Bamboo, Serpent, and Mutan programming languages have been used to develop SC code on various blockchain platforms; however, Solidity is the most popular object-oriented high-level programming language adopted for writing SCs. Implementing SCs within the blockchain makes them immutable and tamper-proof; therefore, a deployed contract cannot be changed or removed. keccak256, the hashing function used by Ethereum and built into Solidity, is used to generate the hash of a node's unique IDCard using the following formula:

SN_id = keccak256(PA)

where SN_id represents the hashed sensor identity and PA is the node's physical or MAC address. The keccak256 function also incurs a lower cost than other hashing algorithms.

5.1.1. Identity management smart contract

Algorithm 1 describes the identity management SC, which is written in Solidity and mainly consists of RegisterNode(), AuthNode(), and RevokeNode() as core payable functions and InfoNode() and TotalNode() as non-payable functions.

Algorithm 1 (continued)
• Add the new SN to the SN array.
5: Function AuthNode()
• Accept SN_id as the index.
• Authenticate an SN based on provided credentials.
• If successfully authenticated, return “Sensor Authenticated” and set SN_registered to true.
• If not authenticated, return “Node revoked” and set SN_registered to false.
6: Function RevokeNode()
• Accept SN_id as the index.
• Check if SN_id is valid in the SN array.
• Set SN_registered to false at the specified SN_id.
7: Function InfoNode()
• Accept SN_id as the index.
• If found, return SN_password, SN_registered status, CH_id, BS_id, and SN_time.
• Otherwise, return “Node revoked” and set SN_registered to false.
8: Function TotalNode()
• Return the total number of registered SNs.
• Return the length of the SN array.

Each contract call invoking the identity management SC generates a transaction that appears on the terminal and has the following fields: status, transaction hash, from, to, input, decoded input, decoded output, logs, etc. The following log depicts an example of the output for calling RegisterNode() (Fig. 2).

from: 0x5B3...eddC4
to: IdentityManagement.RegisterNode(uint256,string,uint256,uint256,uint256) 0xd91...39138
value: 0 wei
data: 0xe61...00000
logs: 0
hash: 0x3ac...9d2a9
status true Transaction mined and execution succeed
transaction hash: 0x3ac2...9d2a9
block hash: 0x615...68649
block number 2
from 0x5B38Da6a701c568545dCfcB03FcB875f56beddC4
to IdentityManagement.RegisterNode(uint256,string,uint256,uint256,uint256)
gas 236872 gas
transaction cost 205975 gas
execution cost 183799 gas
input 0xe61...00000
decoded input {
  "uint256 SN_id": "1258625",
  "string SN_password": "Sensor_123",
  "uint256 CH_id": "10",
  "uint256 BS_id": "1",
  "uint256 SN_time": "1"
}
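The behavior of the identity management functions can be illustrated off-chain with a minimal Python sketch. It is not the authors' Solidity contract. The function names and fields (SN_id, SN_password, CH_id, BS_id, SN_time, SN_registered) follow Algorithm 1 and the decoded input above, while the helper derive_sn_id, the SensorNode class, the dictionary used as the SN array, the duplicate check, and the use of the password as the credential are assumptions made for illustration. The hash is also a stand-in: Python's hashlib.sha3_256 is not Ethereum's keccak256, so its digests differ from on-chain values.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Tuple, Union


def derive_sn_id(physical_address: str) -> int:
    """Illustrates SN_id = keccak256(PA) with a stand-in hash.

    ASSUMPTION: hashlib.sha3_256 is used only because it ships with Python.
    It is not Ethereum's keccak256 (the padding rule differs), so the result
    will not match a value computed on-chain.
    """
    digest = hashlib.sha3_256(physical_address.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")  # 256-bit integer, like a uint256


@dataclass
class SensorNode:
    SN_password: str
    CH_id: int
    BS_id: int
    SN_time: int
    SN_registered: bool = True


class IdentityManagement:
    """Off-chain model of the five functions named in Algorithm 1."""

    def __init__(self) -> None:
        # The "SN array" is modelled as a dict keyed by SN_id (assumption).
        self._nodes: Dict[int, SensorNode] = {}

    def RegisterNode(self, SN_id: int, SN_password: str,
                     CH_id: int, BS_id: int, SN_time: int) -> None:
        if SN_id in self._nodes:
            # Rejecting duplicates is an assumption, not stated in the source.
            raise ValueError("SN_id already registered")
        self._nodes[SN_id] = SensorNode(SN_password, CH_id, BS_id, SN_time)

    def AuthNode(self, SN_id: int, SN_password: str) -> str:
        node = self._nodes.get(SN_id)
        if node is None:
            return "Node revoked"
        # ASSUMPTION: the "provided credentials" are the stored password.
        node.SN_registered = (node.SN_password == SN_password)
        return "Sensor Authenticated" if node.SN_registered else "Node revoked"

    def RevokeNode(self, SN_id: int) -> bool:
        node = self._nodes.get(SN_id)
        if node is None:
            return False  # SN_id is not valid in the SN array
        node.SN_registered = False
        return True

    def InfoNode(self, SN_id: int) -> Union[Tuple[str, bool, int, int, int], str]:
        node = self._nodes.get(SN_id)
        if node is None:
            return "Node revoked"
        return (node.SN_password, node.SN_registered,
                node.CH_id, node.BS_id, node.SN_time)

    def TotalNode(self) -> int:
        # Length of the SN array; revoked nodes remain stored in this sketch.
        return len(self._nodes)
```

Calling RegisterNode(1258625, "Sensor_123", 10, 1, 1) on an instance reproduces the decoded input of the log above. A later AuthNode() call with a wrong password flips SN_registered to false, as the algorithm describes.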
S. Ismail, M. Nouman, D.W. Dawoud et al. Blockchain: Research and Applications 5 (2024) 100174
from: 0x5B3...eddC4
to: Trust.AddNodeTrust(uint256,bool,uint256,uint256,uint256,uint256,uint256,uint256,uint256) 0xd91...39138
value: 0 wei
data: 0x113...00019
logs: 0
hash: 0xf36...43018
status true Transaction mined and execution succeed
transaction hash 0xf36...43018
block hash 0xb5f...940b4
block number 2
from 0x5B38Da6a701c568545dCfcB03FcB875f56beddC4
to Trust.AddNodeTrust(uint256,bool,uint256,uint256,uint256,uint256,uint256,uint256,uint256) 0xd9145CCE52D386f254917e481eB44e9943F39138
gas 309437 gas
transaction cost 269075 gas
execution cost 246727 gas
input 0x113...00019
decoded input {
  "uint256 _ID": "1258625",
  "bool _Status": true,
  "uint256 _TSS": "85",
  "uint256 _RSS": "65",
  "uint256 _PSR": "80",
  "uint256 _PFR": "58",
  "uint256 _FD": "62",
  "uint256 _EnergyLevel": "81",
  "uint256 _ECA": "25"
}

Fig. 3. AddNodeTrust() inputs from trust management smart contract.
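One way to picture how the trust management SC could keep historical node behavior is an append-only list of records per node. The Python sketch below is an off-chain illustration only. The field names mirror the decoded input of AddNodeTrust() shown above, whereas the TrustRecord and TrustLedger classes, the latest() and history() helpers, and the storage layout are hypothetical and are not taken from the authors' contract. No trust score is computed here, since the scoring rule is not part of this log.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TrustRecord:
    # Field names mirror the decoded input of AddNodeTrust().
    ID: int
    Status: bool
    TSS: int
    RSS: int
    PSR: int
    PFR: int
    FD: int
    EnergyLevel: int
    ECA: int


class TrustLedger:
    """Hypothetical append-only store of trust records, keyed by node ID."""

    def __init__(self) -> None:
        self._history: Dict[int, List[TrustRecord]] = {}

    def AddNodeTrust(self, ID: int, Status: bool, TSS: int, RSS: int, PSR: int,
                     PFR: int, FD: int, EnergyLevel: int, ECA: int) -> TrustRecord:
        numeric = (ID, TSS, RSS, PSR, PFR, FD, EnergyLevel, ECA)
        if any(value < 0 for value in numeric):
            # The contract parameters are uint256, so negatives are invalid.
            raise ValueError("uint256 fields cannot be negative")
        record = TrustRecord(ID, Status, TSS, RSS, PSR, PFR, FD, EnergyLevel, ECA)
        # Earlier records are kept, so past behavior stays available.
        self._history.setdefault(ID, []).append(record)
        return record

    def latest(self, ID: int) -> Optional[TrustRecord]:
        records = self._history.get(ID)
        return records[-1] if records else None

    def history(self, ID: int) -> List[TrustRecord]:
        # Return a copy so callers cannot rewrite stored history.
        return list(self._history.get(ID, []))
```

With this layout, the call AddNodeTrust(1258625, True, 85, 65, 80, 58, 62, 81, 25) stores the values from the log above as the first entry in the history of node 1258625.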
Fig. 6. Performance metrics achieved by LightGBM, LR, CompNB, NC, and Stacking classifiers for balanced and original imbalanced WSN-DS dataset.
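As a reminder of how the classification metrics in this comparison are defined, the following Python sketch computes accuracy, precision, recall, F1-score, and MCC from predicted and true labels. It is a simplified illustration under stated assumptions: labels are binary (1 for malicious, 0 for normal), the function name binary_metrics is hypothetical, and undefined ratios are mapped to 0.0. The paper's evaluation may use per-class or averaged variants instead.

```python
import math
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute standard metrics for binary labels (1 = malicious, 0 = normal)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def safe_div(num: float, den: float) -> float:
        # ASSUMPTION: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "mcc": safe_div(tp * tn - fp * fn, mcc_den),
    }
```

Unlike accuracy, MCC uses all four confusion-matrix counts, which makes it informative on imbalanced data such as the original WSN-DS dataset.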
This work proposes an integrated security framework to protect IoT sensor networks against insider cyber-attacks using blockchain and ML. A permissionless blockchain network is deployed on BS and CH nodes. Each MN should belong to one cluster and be registered with its current CH. The proposed framework consists of two distinct modules: blockchain prevention and ML detection. The blockchain prevention module has two key mechanisms: identity management and trust management. Each mechanism is associated with an SC that is implemented using Solidity and the Remix IDE integrated with an Ethereum wallet using MetaMask. The identity management mechanism is responsible for verifying and registering the nodes on the blockchain network, while the trust management mechanism evaluates the trustworthiness of each sensor node and helps track the historical behavior of the nodes throughout the network’s lifetime. VBFT is the consensus mechanism used to validate the transactions and reduce the transaction costs compared to PoW. The ML detection module identifies and classifies the detected malicious nodes. LightGBM is selected following a detailed comparative analysis conducted using accuracy, precision, recall, F1-score, processing time, training time, prediction time, computational complexity, and MCC evaluation metrics.

CRediT authorship contribution statement

Shereen Ismail: Investigation, Methodology, Writing – original draft. Muhammad Nouman: Investigation, Methodology, Writing – original draft. Diana W. Dawoud: Investigation, Methodology, Writing – original draft. Hassan Reza: Supervision, Writing – review & editing.
Table 4
Evaluation metrics comparison for LightGBM, LR, CompNB, NC, Stacking for different training ratios.
Training ratio ML classifier Accuracy Precision Recall F1-score Processing time Training time Prediction time
[20] J. Yang, S. He, Y. Xu, et al., A trusted routing scheme using blockchain and reinforcement learning for wireless sensor networks, Sensors (Switzerland) 19 (4) (2019), https://doi.org/10.3390/s19040970.
[21] M. Revanesh, V. Sridhar, A trusted distributed routing scheme for wireless sensor networks using blockchain and meta-heuristics-based deep learning technique, Trans. Emerg. Telecommun. Technol. 32 (9) (2021) e4259, https://doi.org/10.1002/ett.4259.
[22] I.A. Abd El-Moghith, S.M. Darwish, Towards designing a trusted routing scheme in wireless sensor networks: a new deep blockchain approach, IEEE Access 9 (2021) 103822–103834, https://doi.org/10.1109/ACCESS.2021.3098933.
[23] S. Rajasoundaran, S. Kumar, M. Selvi, et al., Machine learning based volatile block chain construction for secure routing in decentralized military sensor networks, Wirel. Netw. 27 (7) (2021) 4513–4534, https://doi.org/10.1007/s11276-021-02748-2.
[24] X. Yang, Y. Chen, X. Qian, et al., BCEAD: a blockchain-empowered ensemble anomaly detection for wireless sensor network via isolation forest, Secur. Commun. Netw. 2021 (2021), https://doi.org/10.1155/2021/9430132.
[25] M.B.E. Sajid, S. Ullah, N. Javaid, et al., Exploiting machine learning to detect malicious nodes in intelligent sensor-based systems using blockchain, Wirel. Commun. Mob. Comput. 2022 (2022), https://doi.org/10.1155/2022/7386049.
[26] S. Ismail, D. Dawoud, H. Reza, Towards a lightweight identity management and secure authentication for IoT using blockchain, in: 2022 IEEE World AI IoT Congress (AIIoT), IEEE, 2022, pp. 77–83, https://doi.org/10.1109/AIIoT54504.2022.9817349.
[27] M. Nouman, U. Qasim, H. Nasir, et al., Malicious node detection using machine learning and distributed data storage using blockchain in WSNs, IEEE Access 11 (2023), https://doi.org/10.1109/ACCESS.2023.3236983.
[28] M.M. Arifeen, A. Al Mamun, T. Ahmed, et al., A blockchain-based scheme for Sybil attack detection in underwater wireless sensor networks, in: M.S. Kaiser, A. Bandyopadhyay, M. Mahmud, et al. (Eds.), Proceedings of International Conference on Trends in Computational and Cognitive Engineering, Springer Singapore, Singapore, 2021, pp. 467–476, https://doi.org/10.1007/978-981-33-4673-4_37.
[29] S. Ismail, T.T. Khoei, R. Marsh, et al., A comparative study of machine learning models for cyber-attacks detection in wireless sensor networks, in: 2021 IEEE 12th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), IEEE, 2021, pp. 313–318, https://doi.org/10.1109/UEMCON53757.2021.9666581.
[30] K. Gilani, E. Bertin, J. Hatin, et al., A survey on blockchain-based identity management and decentralized privacy for personal data, in: 2020 2nd Conference on Blockchain Research & Applications for Innovative Networks and Services (BRAINS), IEEE, 2020, pp. 97–101, https://doi.org/10.1109/BRAINS49436.2020.9223312.
[31] S. Ismail, D.W. Dawoud, T. Al-Zyoud, et al., Towards blockchain-based adaptive trust management in wireless sensor networks, in: 2023 IEEE International Conference on Electro Information Technology (eIT), IEEE, 2023, pp. 163–168, https://doi.org/10.1109/eIT57321.2023.10187278.
[32] X. Fu, H. Wang, P. Shi, A survey of blockchain consensus algorithms: mechanism, design and applications, Sci. China Inf. Sci. 64 (2021) 1–15, https://doi.org/10.1007/s11432-019-2790-1.
[33] I. Almomani, B. Al-Kasasbeh, M. Al-Akhras, WSN-DS: a dataset for intrusion detection systems in wireless sensor networks, J. Sens. 2016 (2016), https://doi.org/10.1155/2016/4731953.
[34] S. Ismail, Z. El Mrabet, H. Reza, An ensemble-based machine learning approach for cyber-attacks detection in wireless sensor networks, Appl. Sci. 13 (1) (2023), https://doi.org/10.3390/app13010030.
[35] G. Ke, Q. Meng, T. Finley, et al., LightGBM: a highly efficient gradient boosting decision tree, Adv. Neural Inf. Process. Syst. 30 (2017).
[36] C.-Y.J. Peng, K.L. Lee, G.M. Ingersoll, An introduction to logistic regression analysis and reporting, J. Educ. Res. 96 (1) (2002) 3–14, https://doi.org/10.1080/00220670209598786.
[37] J.D.M. Rennie, L. Shih, J. Teevan, et al., Tackling the poor assumptions of naive Bayes text classifiers, in: Proceedings of the Twentieth International Conference on International Conference on Machine Learning, Ser. ICML’03, AAAI Press, 2003, pp. 616–623.
[38] M. Thulasidas, Nearest Centroid: a bridge between statistics and machine learning, in: IEEE International Conference on Teaching, Assessment, and Learning for Engineering, IEEE, 2020, pp. 9–16, https://doi.org/10.1109/TALE48869.2020.9368396.
[39] R. Sikora, O.H. Al-laymoun, A modified stacking ensemble machine learning algorithm using genetic algorithms, J. Int. Technol. Inf. Manag. 23 (1) (2014), https://doi.org/10.58729/1941-6679.1061.