Final Doc Mine
BACHELOR OF TECHNOLOGY
In
Dr B. SRINIVAS RAJA
Professor
DEPARTMENT OF
ELECTRONICS & COMMUNICATION ENGINEERING
GODAVARI INSTITUTE OF ENGINEERING & TECHNOLOGY (A)
CHAITANYA KNOWLEDGE CITY, NH-16, RAJAMAHENDRAVARAM, AP
DECLARATION
We, the undersigned, solemnly declare that the main project report “Design of Intrusion
Detection System for IOT Environment with Machine Learning Approach” is
based on our own work carried out during the course of our study under the supervision of
Dr. B. SRINIVAS RAJA, Professor and Head of the Department, ECE.
We assert that the statements made and conclusions drawn are an outcome of our research
work. We further certify that,
➢ The work contained in this report is original and has been done by us under the
general supervision of our guide.
➢ The work has not been submitted to any other institution for any other
degree/diploma/certificate in this university or any other university of India or
abroad.
➢ We have followed the guidelines provided by the university in writing the report.
➢ Whenever we have used materials (data, theoretical analysis, and text) from other
sources, we have given due credit to them in the text of the report and given
their details in the references.
D. AKSHAYA (20551A0419)
V. ROHIT (20551A0459)
N. NIKHIL (20551A0438)
GODAVARI INSTITUTE OF ENGINEERING & TECHNOLOGY (A)
CHAITANYA KNOWLEDGE CITY, NH-16, RAJAMAHENDRAVARAM 533295,
AP
BONAFIDE CERTIFICATE
This is to certify that the project entitled “Design of Intrusion Detection System for
IOT Environment with Machine Learning Approach” is the project work of D. AKSHAYA
(20551A0419), CH. AYYAPPA REDDY (20551A0413), V. ROHIT (20551A0459), and N. NIKHIL
(20551A0438), who carried out the project work under my supervision during the academic
year 2023-2024, towards partial fulfillment of the requirements of the Degree of Bachelor of
Technology in Electronics & Communication Engineering as administered under the
Regulation of Godavari Institute of Engineering & Technology, Rajamahendravaram, A.P,
India and award of the Degree from Jawaharlal Nehru Technological University, Kakinada.
The results embodied in this report have not been submitted to any other University for the
award of any degree.
Date:
CERTIFICATE OF AUTHENTICATION
It is further certified that this work has not been submitted, either in part or in full, to
any other department of the Jawaharlal Nehru Technological University Kakinada or any
other University, institution, or elsewhere, in India or abroad, or for publication in any form.
Date:
Signature of the students
D. AKSHAYA (20551A0419)
V. ROHIT (20551A0459)
N. NIKHIL (20551A0438)
ACKNOWLEDGEMENT
We are grateful to our guide Dr. B. SRINIVAS RAJA, Professor, for having given us
the opportunity to carry out this project work. We take this opportunity to express our
profound and wholehearted thanks to our guide, who with his patient support and sincere
guidance helped us in the successful completion of the project. We are particularly indebted
to him for his innovative ideas, valuable suggestions, and guidance during the entire period of
work; without his unfathomable energy and enthusiasm, this project would not have been
completed.
We would like to thank Dr. B. SRINIVAS RAJA, Professor and Head of the
Department, for his valuable suggestions throughout our project, which have helped in giving a
defined shape to this work.
We would also like to thank all the faculty members and non-teaching staff of the
Department of Electronics and Communication Engineering, GIET (A) for their direct and
indirect help during the project work.
We owe our special thanks to the MANAGEMENT of our college for providing the
necessary arrangements to carry out this project.
The euphoria and satisfaction of completing this project will not be complete until
we thank all the people who have helped us complete this task.
D. AKSHAYA (20551A0419)
V. ROHIT (20551A0459)
N. NIKHIL (20551A0438)
ABSTRACT
This project presents novel ideas on strengthening security and thus reducing risk in Internet of
Things (IoT) environments by implementing an Intrusion Detection System (IDS) that leverages
machine learning. The project details the design of two IoT devices based on DHT11 sensors that
measure temperature and humidity levels. These devices use message-based communication, in which
data is transmitted through a Mosquitto broker using the MQTT protocol. On the server side, a machine
learning model built from Random Forest and Support Vector Machine (SVM) algorithms analyzes
the incoming data for patterns hinting at intrusion or unusual behavior. Whenever
a possible security threat is identified, notifications are forwarded to the relevant authorities
via the Telegram platform. This project integrates IoT, machine learning, and secure
communication technologies in order to enhance the robustness of connected
environments, so that threats can be detected and responded to in a timely manner.
CONTENTS
Title
Declaration
Bonafide Certificate
Certificate Of Authentication
Acknowledgement
Abstract
1 INTRODUCTION
1.1 MOTIVATION
1.2 BACKGROUND STUDY
1.3 OBJECTIVES
2 LITERATURE SURVEY
3 METHODOLOGY
4 HARDWARE AND SOFTWARE REQUIREMENTS
4.1 HARDWARE MODULES
4.2.1 ARDUINO
4.2.2 VS CODE
4.2.3 PYTHON
4.2.4 C LANGUAGE
5 WORKING OF THE PROPOSED SYSTEM
6 ADVANTAGES
RESULT
CONCLUSION
FUTURE SCOPE
REFERENCES
List of Figures
The fundamental promise of the Internet of Things is its capacity to gather, process,
and act upon massive volumes of data in real-time, providing previously unheard-of insights and
optimization opportunities. IoT systems can identify patterns, anticipate trends, and automate
decision-making by utilizing sensor data and machine learning algorithms. This results in
increased productivity, lower costs, and better user experiences. IoT devices in the healthcare
industry remotely monitor patients' vital signs, allowing for the early identification of health
problems and individualized interventions. Sensors in agriculture monitor environmental factors
and soil moisture content to maximize crop yields and irrigation schedules. IoT-enabled
automobiles in the transportation sector interact with other cars and infrastructure to increase
road safety and efficiency.
1.1 MOTIVATION
The drive for this project is to meet the pressing necessity of strengthening security in the
quickly developing Internet of Things (IoT) space. The increasing prevalence of IoT devices in
our daily lives—from smart homes to industrial automation—also means that the inherent
vulnerabilities in these networked systems grow. IoT network security breaches can have
disastrous effects, from invasions of privacy to serious interruptions of vital infrastructure and
services. This study intends to proactively manage these risks by utilizing intrusion detection
systems (IDS) and machine learning. Because IoT ecosystems are dynamic and diverse,
traditional security solutions frequently find it difficult to keep up with them, leaving them
vulnerable to new attacks. As a result, adaptive solutions that can evolve alongside
changing attack vectors are urgently needed. One practical application of this security-enhancing
framework is the monitoring of temperature and humidity levels by IoT devices using DHT11
sensors. The real-time data these sensors provide allows for ongoing environmental condition
monitoring. Effective and dependable data transfer is ensured by message-based
communication over the MQTT protocol, which is essential for preserving the integrity of the
monitoring system.
A machine learning model that combines the strengths of Random Forest and Support
Vector Machine techniques is at the core of the project. By scanning incoming data streams for
unusual patterns suggestive of possible security breaches, the model acts as a watchful defender.
By detecting aberrant data patterns, unwanted access attempts, and other
deviations from normal operating behavior, the IDS is able to rapidly raise alarms that
notify the appropriate authorities through the Telegram platform. In the end, combining IoT,
machine intelligence, and secure communication technologies provides a comprehensive defense
against new dangers in linked environments. This project aims to foster trust in the dependability
and resilience of IoT ecosystems, assuring their future viability in an increasingly interconnected
world, by facilitating quick detection and reaction to security incidents.
1.2 BACKGROUND STUDY
In order to understand the reasons and difficulties behind the proposed solution, the
background research for this project includes looking at the areas where machine learning,
security, and the Internet of Things converge. First, a strong security architecture is required
due to the exponential expansion of IoT devices across all sectors. The risks posed by cyber
threats are significant since IoT devices are present in critical infrastructure, healthcare,
manufacturing, and consumer applications. Because of factors such as resource limitations, varied
communication protocols, and heterogeneous device architectures, traditional security
mechanisms frequently fail in Internet of Things scenarios. Second, the constantly changing
threat landscape emphasizes how urgent it is to improve IoT security. Malicious actors exploit IoT
ecosystem vulnerabilities to conduct a range of attacks, such as network intrusions, device
tampering, and data breaches. These attacks may have detrimental effects, such as jeopardizing
user privacy or halting the provision of essential services. Proactive security techniques that
can quickly identify and neutralize new threats are therefore urgently needed.
Third, strengthening IoT security may be possible with the incorporation of machine
learning techniques. Machine learning algorithms can find patterns and anomalies in large
datasets, which makes them well suited to intrusion detection tasks in dynamic
Internet of Things environments. Machine learning models are able to detect anomalies that may
point to possible security breaches or unusual behavior by continuously evaluating data from IoT
sensors. This allows for prompt risk mitigation. Furthermore, the decision to use DHT11 sensors
to track humidity and temperature reflects the practical requirements of actual IoT
installations. The overall security posture of IoT systems is improved when machine learning-based
intrusion detection is integrated with the useful environmental data that these sensors offer.
In addition, data transmission between IoT devices and the server is made efficient and
dependable by using the MQTT protocol and the Mosquitto broker. This choice of communication
technology prioritizes interoperability and scalability while adhering to industry
standards. Overall, the background study emphasizes how critical it is to handle security
issues in Internet of Things environments by developing solutions that make use of
sensor technologies, secure communication protocols, and machine learning to efficiently
identify and neutralize threats.
A. Introduction
A novel strategy for improving cybersecurity is the combination of machine learning,
secure communication technologies, and the Internet of Things (IoT). The goal of this project is
to strengthen Internet of Things environments by deploying Intrusion Detection Systems (IDS)
that make use of machine learning methods. The project intends to quickly identify and mitigate
security threats through the design of Internet of Things devices equipped with DHT11 sensors
for temperature and humidity monitoring, coupled with message-based communication via the MQTT
protocol and analysis by a machine learning model consisting of Random Forest and Support
Vector Machine (SVM) algorithms. Telegram is used to send notifications of any breaches,
guaranteeing that the appropriate authorities act quickly.
Early Threat Detection: Leveraging machine learning algorithms enables the IDS to detect
potential security threats at an early stage, allowing for timely intervention before significant
damage occurs.
Reduced Risk: With the ability to swiftly identify and respond to security incidents, the IDS
helps mitigate risks associated with data breaches, unauthorized access, and other malicious
activities, thereby safeguarding sensitive information and critical infrastructure.
Proactive Defense: The IDS provides a proactive defense mechanism against emerging and
evolving cyber threats, adapting its detection capabilities to new attack vectors and
vulnerabilities as they arise.
Efficient Resource Utilization: By automating the monitoring and detection process, the IDS
optimizes resource utilization, allowing organizations to allocate human resources more
effectively and focus on higher-priority security tasks.
Compliance Assurance: Implementing an IDS aligns with regulatory requirements and industry
standards for cybersecurity, ensuring compliance with data protection laws and regulations.
Improved Incident Response: By promptly notifying relevant authorities via the Telegram
platform, the IDS facilitates rapid incident response, enabling stakeholders to take appropriate
action to mitigate potential risks and minimize the impact of security breaches.
Lack of Data Encryption: Private information sent between Internet of Things devices and the
cloud may be intercepted and accessed by unauthorized parties if encryption methods are not in
place. This leaves the system more vulnerable to privacy abuses and data breaches.
Lack of Real-time Monitoring and Alerts: The existing system may lack the
tools necessary to monitor security breaches or irregularities in environmental data in real time
or to send out alerts to administrators. This delay in detection and response may worsen the
impact of security issues.
Ensuring the security and integrity of IoT devices is crucial due to their interconnected
nature and their significance in multiple sectors. A single hacked device may have far-reaching
effects, including the loss of confidential information, interruption of business activities, and
even danger to physical safety. The importance of integrating security measures into IoT
systems and devices from the design stage onward has increased in recent years. To safeguard
data while it is in transit and at rest, this entails putting strong authentication procedures,
encryption standards, and secure communication channels into place. Furthermore, continuous
monitoring and threat detection capabilities are critical for identifying and mitigating
security threats in real time.
Conventional intrusion detection systems are vulnerable to evasion tactics and zero-day attacks
because they rely on predefined rules and signatures to detect possible threats. On the other hand,
machine learning provides a more dynamic and adaptable method of intrusion detection, able to
identify new threats instantly and learn from historical data. Large datasets can be analyzed by
machine learning algorithms like Random Forest and Support Vector Machine (SVM), which
can also be used to find patterns and anomalies that conventional rule-based detection
techniques can miss. Machine learning models can be trained to differentiate between benign
and malicious activity and adjust to changing risks over time by using labeled datasets that
contain instances of both normal and harmful behavior.
Organizations may detect and mitigate security problems before they escalate by integrating
IDS with IoT devices, which is a proactive approach to cybersecurity. Organizations can discover
possible risks and unauthorized access attempts by monitoring network traffic, device behavior,
and system activity in real-time by strategically placing IDS sensors throughout the IoT network.
IDS may be quite helpful in the Internet of Things (IoT) setting when it comes to protecting
linked devices, spotting unusual activity, and preventing illegal access to private information.
IDS enables enterprises to react quickly and efficiently to new threats by using
machine learning algorithms to scan massive volumes of sensor data and spot patterns suggestive
of malicious activity or security breaches.
Maintaining the security and integrity of IoT systems and devices is crucial as the Internet of
Things spreads and changes our environment. Organizations may improve their cybersecurity
posture, proactively identify and mitigate security threats, and protect their IoT deployments
against changing cyber threats by incorporating machine learning-powered intrusion detection
systems. We can create an IoT ecosystem that is more secure and resilient via constant
innovation and cooperation, allowing connected technologies to reach their full potential while
reducing the dangers that come with being linked.
1.3 OBJECTIVES
The initiative being presented aims to improve the security of Internet of Things (IoT)-based
environmental monitoring systems through the deployment of a machine learning-powered
intrusion detection system (IDS). Using intrusion detection systems (IDS) and sophisticated
algorithms such as Support Vector Machine (SVM) and Random Forest, the initiative seeks to
identify and eliminate security risks instantly. The effort aims to protect vital infrastructure and
data from cyber threats by ensuring the integrity and resilience of linked IoT devices through
secure communication protocols and timely notifications via the Telegram platform.
In order to identify and address security threats aimed at Internet of Things (IoT) devices
utilized in environmental monitoring applications, the project will entail developing and
deploying an intrusion detection system (IDS). The inbound data streams from sensors like the
DHT11 sensor will be analyzed by this IDS, which will be able to spot unusual patterns that
could point to security lapses or cyberattacks.
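As an illustration of how an incoming sensor stream might be prepared for such analysis, the following minimal Python sketch summarizes a short window of readings into numeric features. The function name `window_features`, the tuple layout, and the chosen features are assumptions made for illustration; they are not taken from the project's implementation. A classifier such as Random Forest or SVM could then be trained on feature vectors of this kind.

```python
from statistics import mean


def window_features(readings):
    """Summarize a window of sensor readings as a feature dictionary.

    readings: list of (timestamp_s, temperature_c, humidity_pct) tuples,
    ordered oldest first. This layout is an assumption for illustration.
    """
    if len(readings) < 2:
        raise ValueError("need at least two readings to compute changes")

    times = [r[0] for r in readings]
    temps = [r[1] for r in readings]
    hums = [r[2] for r in readings]

    # Time gaps between consecutive messages; a sudden change in message
    # rate can be as informative as a change in the values themselves.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]

    return {
        "temp_mean": mean(temps),
        "temp_max_jump": max(abs(b - a) for a, b in zip(temps, temps[1:])),
        "hum_mean": mean(hums),
        "hum_max_jump": max(abs(b - a) for a, b in zip(hums, hums[1:])),
        "mean_gap_s": mean(gaps),
    }


# Hypothetical window: three readings taken two seconds apart.
window = [(0, 25.0, 60.0), (2, 25.0, 61.0), (4, 26.0, 60.0)]
features = window_features(window)
```

Large jumps between consecutive values or irregular message gaps are the kind of deviation that a trained model could learn to associate with tampering or injected traffic.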
Utilize machine learning algorithms to enhance the effectiveness of the IDS:
To improve the IDS's capabilities, machine learning algorithms like Support Vector
Machine (SVM) and Random Forest will be used. With the help of these algorithms, the IDS
will be able to learn from past data and adapt its detection capabilities to counter new
threats. The IDS will be better equipped to recognize and address new and developing
security risks by utilizing machine learning, improving the overall security posture of IoT
environmental monitoring systems.
In addition to identifying and addressing security concerns, the project aims to protect data
transfer between Internet of Things devices and backend systems by establishing secure
communication protocols. Data transmissions will be encrypted to prevent
unauthorized access to important information, and MQTT will be used as the messaging protocol for
this purpose. The project intends to guard against eavesdropping, tampering, and other harmful
acts that could jeopardize the integrity of environmental data by putting strong security
mechanisms in place at the communication layer.
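MQTT itself does not encrypt payloads; encryption is normally obtained by running the protocol over TLS. The sketch below, using only Python's standard `ssl` module, shows one way a client-side TLS context could be prepared before it is handed to an MQTT client library. The helper name `make_tls_context`, the minimum TLS version, and the use of port 8883 (the port conventionally used for MQTT over TLS) are assumptions for illustration, not details of the project's configuration.

```python
import ssl

# Port conventionally used for MQTT over TLS (assumed here, not taken
# from the project's broker configuration).
MQTT_TLS_PORT = 8883


def make_tls_context(ca_file=None):
    """Create a client-side TLS context that verifies the broker.

    ca_file: optional path to a CA certificate bundle; when None, the
    system's default trusted certificates are used.
    """
    # create_default_context enables certificate verification and
    # hostname checking for client connections by default.
    context = ssl.create_default_context(cafile=ca_file)
    # Refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_tls_context()
```

With certificate verification and hostname checking left enabled, a device would refuse to connect to a broker that cannot prove its identity, which addresses both eavesdropping and impersonation at the communication layer.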
IoT environmental monitoring systems must promptly detect and respond to security incidents
in order to minimize the impact of cyberattacks. In order to rapidly inform system administrators
and pertinent stakeholders of potential security issues, the project will develop real-time alerting
and notification mechanisms. Stakeholders will be able to respond quickly to security problems
and stop additional harm by receiving alerts through channels like the Telegram messaging app.
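A minimal sketch of such a notification step is shown below. It builds an HTTP request for the `sendMessage` method of the Telegram Bot API using only the Python standard library. The bot token, chat ID, and message text are placeholders, and the helper name `build_alert_request` is hypothetical. The request is only constructed here; actually sending it is left as a commented-out line.

```python
import urllib.parse
import urllib.request


def build_alert_request(bot_token, chat_id, text):
    """Build a POST request for the Telegram Bot API sendMessage method."""
    url = f"https://api.telegram.org/bot{bot_token}/sendMessage"
    # sendMessage accepts the target chat and the message text as form fields.
    body = urllib.parse.urlencode({"chat_id": chat_id, "text": text})
    return urllib.request.Request(url, data=body.encode("utf-8"), method="POST")


# Placeholder credentials; a real token is issued by Telegram's BotFather.
request = build_alert_request(
    bot_token="123456:EXAMPLE",
    chat_id="42",
    text="IDS alert: anomalous reading from device-1",
)

# To actually deliver the alert (requires network access and a valid token):
# urllib.request.urlopen(request, timeout=10)
```

Keeping request construction separate from sending makes the alert path easy to test without contacting the Telegram servers.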
1.4 PROBLEM STATEMENT
IoT has enormous potential, but it also comes with a lot of difficulties and worries, especially
when it comes to security and privacy. Because of their interconnectedness, Internet of Things
devices have a larger attack surface, which leaves them open to online dangers including
malware outbreaks, hacking, and data breaches. Furthermore, privacy and data protection are put
at risk by the massive volumes of sensitive data created by IoT devices, which makes strong
security protocols and legal frameworks necessary to secure user data. Standards compliance
and interoperability are two more issues that the IoT ecosystem must deal with. Making sure
that a multitude of devices from various manufacturers work together seamlessly and with a
variety of communication protocols can be a challenging undertaking. Furthermore, as IoT
installations continue to increase in size and complexity, scalability and sustainability become
more difficult to achieve. To reduce their negative effects on the environment, effective
resource management and infrastructure optimization are needed.
Addressing significant gaps and difficulties in the security of Internet of Things (IoT) environments
is the rationale for carrying out this study and implementing the project, especially
in light of the proliferation of connected devices and new cyberthreats. First, the speed at
which IoT devices are spreading across industries emphasizes how urgent it is to
strengthen security protocols. IoT ecosystems are becoming more and more vulnerable to
cyberattacks because of their interconnectedness and large attack surface, which makes them
attractive targets in smart homes, healthcare facilities, industrial automation, and other applications.
Because traditional security systems frequently cannot keep up with the constantly changing
threat landscape, IoT deployments are left open to potential exploits. In order to proactively
identify and manage security vulnerabilities, there is a pressing need for creative approaches,
such as the integration of Intrusion Detection Systems (IDS) employing machine learning.
Second, there are particular cybersecurity challenges because of the complexity and variety
of IoT environments. IoT devices are exposed to a wide range of vulnerabilities due to their
different architectures, resource limits, and communication protocols. Moreover, the abundance
and variety of data produced by Internet of Things sensors make it difficult for traditional
security tools to evaluate and identify unusual activity. By utilizing machine learning
techniques, the proposed IDS can automatically learn from data patterns, adjust to changing
threats, and provide more resilient defense mechanisms suited to the complexities of IoT
deployments. The project is made even more relevant by its emphasis on secure
communication over protocols like MQTT and timely notifications over the Telegram platform. Secure
communication channels are necessary to guarantee the confidentiality, integrity, and
availability of data transmitted between IoT devices and servers, reducing the possibility of
adversaries eavesdropping on, altering, or intercepting it. Furthermore, real-time
notifications help stakeholders minimize the effects of security breaches and improve overall
incident response capabilities by enabling them to act quickly in reaction to possible incidents.
Moreover, the use of DHT11 sensors for monitoring temperature and humidity serves
practical objectives and is consistent with the overarching objective of strengthening IoT security.
In many applications, including building automation, agriculture, and healthcare, environmental
monitoring is essential. The research not only improves security but also highlights how crucial
it is to include security concerns in the design and development of IoT solutions by integrating
security elements into IoT devices that are in charge of gathering sensitive data. In conclusion,
the necessity of addressing cybersecurity issues in IoT environments urgently, the potential for
machine learning-driven intrusion detection systems to counter new threats, and the real-world
consequences of protecting IoT deployments with cutting-edge technologies and processes all
justify the research and execution of the project as described. The initiative adds to the resilience,
dependability, and credibility of interconnected systems in a world growing more digitally
connected by pushing the boundaries of IoT security.
1.6 PROJECT OVERVIEW
CHAPTER 2
2. LITERATURE SURVEY
Paper 1: Sajad M. Khan, Faheem Syeed Masoodi, "Intrusion Detection System for IoT
Environment using Ensemble Approaches", proposed a system in February 2013.
The first step in the suggested ensemble IDS approach is the dimensionality reduction process
using Random Forest (RF). RF is a flexible machine learning algorithm that is used to choose the
best subset of features from the original dataset. It is renowned for its scalability and resilience.
To lessen the effects of dimensionality and improve the IDS's capacity for discrimination, the
feature selection procedure is essential. Following that, an ensemble learning technique is used
to identify and detect intrusions. The ensemble intrusion detection system (IDS) is capable of
accurately capturing the intricacy and unpredictability of cyber threats by combining predictions
from several classifiers that were trained on various subsets of the data. Compared to
conventional single-model approaches, this ensemble approach has a number of benefits, such
as increased accuracy, resilience, and generalization capacity. The experimental findings on
the IoTID20 dataset show how well the suggested RF-based ensemble IDS technique
performs. The strategy achieves a 99% accuracy rate and beats current state-of-the-art
algorithms in terms of accuracy, precision, recall, and F1-score, among other
performance metrics. Notably, the accuracy rates of the ensemble classifiers are greater
than those of the individual models, demonstrating the effectiveness of ensemble learning
methodologies in improving IDS performance.
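The idea of combining predictions from several classifiers can be sketched in a few lines of Python. The paper's exact combination rule is not reproduced here; simple majority voting is used as a stand-in, and the three threshold rules and their feature names are hypothetical placeholders for trained models.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def ensemble_predict(classifiers, sample):
    """Ask every classifier for a label and combine them by majority vote."""
    return majority_vote([classify(sample) for classify in classifiers])


# Hypothetical stand-ins for trained models: each one is a simple
# threshold rule on a single (assumed) traffic feature.
classifiers = [
    lambda s: "attack" if s["packet_rate"] > 100 else "normal",
    lambda s: "attack" if s["failed_logins"] > 3 else "normal",
    lambda s: "attack" if s["payload_bytes"] > 1024 else "normal",
]

sample = {"packet_rate": 250, "failed_logins": 5, "payload_bytes": 200}
prediction = ensemble_predict(classifiers, sample)  # two of three vote "attack"
```

Because a single mistaken classifier is outvoted by the others, the combined prediction can be more reliable than any individual model, which is the effect the paper reports.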
Furthermore, the results of the experimental assessment show that the suggested framework
considerably raises the Intrusion Detection System's efficiency, resulting in an accuracy rate of
0.9863. The improvement in the dependability of intrusion detection predictions and the reduction in
classification errors can be ascribed to the synergistic benefits of ensemble learning algorithms,
which leverage the complementary strengths of multiple classifiers. In conclusion, the
paper offers strong evidence of the suggested ensemble IDS approach's effectiveness in
addressing the shortcomings of current IDS algorithms. Utilizing machine learning methods
like ensemble learning and Random Forest, the strategy presents a viable way to improve
intrusion detection efficacy and efficiency in intricate and dynamic cybersecurity contexts.
Paper 2: Cristiano Antonio de Souza, Carlos Becker Westphall, "Two-step ensemble approach
for intrusion detection and identification in IoT and fog computing environments", proposed a
system in March 2022.
Explanation: The widespread adoption of Internet of Things (IoT) devices has resulted in
unparalleled connectedness and convenience across a range of everyday activities. Nevertheless,
security frequently suffers because of these devices' intrinsic resource limits. Thus, effective intrusion
detection strategies are essential for detecting and stopping possible threats, protecting IoT devices
as well as the private information they manage. The work presents a two-step
intrusion detection and identification method that is specifically designed to address the
special difficulties associated with IoT contexts. An Extra Tree binary classifier is used to
analyze traffic in the first step. Finding potentially intrusive events in the network traffic data is
the goal of this preliminary analysis. Events that are deemed intrusive are then examined
in further detail using an ensemble technique in the second step. The Extra Tree, Random
Forest, and Deep Neural Network classifiers work together in this ensemble, combining their
respective advantages to improve detection accuracy and robustness. Extensive tests were performed
utilizing multiple intrusion datasets, including Bot-IoT, IoTID20, NSL-KDD, and CICIDS2018,
to assess the effectiveness of the suggested technique. These datasets provide a thorough
evaluation of the approach's performance in a variety of contexts by representing a wide range of
scenarios and attack types that are frequently seen in IoT environments.
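The control flow of the two-step method described above can be sketched as follows. This is only a structural illustration: the binary detector and the three identifiers are hypothetical threshold rules standing in for the Extra Tree, Random Forest, and Deep Neural Network models, the feature names are invented, and majority voting is assumed as the way the second step combines its classifiers.

```python
from collections import Counter


def two_step_detect(event, is_intrusive, identifiers):
    """Step 1 filters benign traffic; step 2 names the attack type."""
    # Step 1: a single binary classifier screens every event.
    if not is_intrusive(event):
        return "benign"
    # Step 2: only intrusive events reach the ensemble, whose members
    # each propose an attack type; the most common proposal wins.
    votes = Counter(identify(event) for identify in identifiers)
    return votes.most_common(1)[0][0]


# Hypothetical stand-in for the first-step binary classifier.
def is_intrusive(event):
    return event["packets_per_s"] > 100


# Hypothetical stand-ins for the second-step ensemble members.
identifiers = [
    lambda e: "dos" if e["packets_per_s"] > 500 else "scan",
    lambda e: "dos" if e["distinct_ports"] < 5 else "scan",
    lambda e: "scan" if e["distinct_ports"] > 50 else "dos",
]

label_quiet = two_step_detect(
    {"packets_per_s": 10, "distinct_ports": 2}, is_intrusive, identifiers
)
label_flood = two_step_detect(
    {"packets_per_s": 900, "distinct_ports": 1}, is_intrusive, identifiers
)
```

Screening with a single lightweight model first means the more expensive ensemble runs only on the small share of traffic that looks suspicious, which suits resource-constrained IoT settings.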
The experimental findings show that, on all assessed datasets, the suggested method
performs comparably to or better than alternative machine learning techniques and
state-of-the-art methods. This strong result highlights how well the two-step intrusion
detection methodology detects and mitigates security risks in Internet of Things networks. Since
internet-connected devices are ingrained in many facets of society, it is impossible to dispute the
increasing importance of IoT in day-to-day living. The necessity for strong security and privacy
safeguards is further highlighted by the fact that users can communicate and exchange data with
these devices through social networking platforms. However, there are many obstacles to
implementing sufficient security measures because of the intrinsic features of IoT devices,
namely their small size, constrained resources, and limited processing capacity.
The storage, processing, and analysis of generated data frequently require offloading to devices
with more computational capacity due to the resource limits of IoT devices. Unfortunately, data
transfer to cloud computing platforms is made more difficult by the enormous traffic volume that
IoT devices create in addition to latency problems. This calls for the creation of specialized
intrusion detection systems that can function well in the limited settings of Internet of Things
devices. The literature review concludes by emphasizing the vital significance of intrusion
detection in Internet of Things environments and by introducing a brand-new, two-step method
created to handle the particular difficulties brought about by IoT device limitations. The
suggested approach shows strong performance and efficacy in detecting and resolving security
threats within IoT networks, hence adding to the general security and resilience of IoT
ecosystems, through examination utilizing a variety of intrusion datasets.
Paper 3: Umaira Ahad, Yashwant Singh, Pooja Anand, "Intrusion Detection System Model for
IoT Networks Using Ensemble Learning", proposed a system in 2022
Explanation: The increasing dependence on Internet of Things (IoT) devices and services
highlights how crucial it is to spot security flaws and malicious activity in IoT networks in order
to maintain the resilience of network infrastructure. The fundamental elements of network
security are intrusion detection systems (IDS), which identify intrusions by continuously
monitoring and evaluating system behavior. But current intrusion detection systems frequently
have trouble processing massive amounts of data that contain redundant and
irrelevant features, which results in longer detection times and lower accuracy. This work
provides an IDS model based on a Random Forest (RF) classifier in response to these problems.
The model's performance is tested on the NSL-KDD dataset, and the findings show that the
model performs satisfactorily in terms of accuracy, detection rate, and false alarm rate. Notably,
the suggested model successfully classifies binary data with an average accuracy of 99.3% and
multiclass data with an average accuracy of 98%.
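The three metrics named above can be computed from the four counts of a binary confusion matrix. The sketch below uses the standard definitions; the function name `ids_metrics` is hypothetical and the counts are invented for illustration, not taken from the paper.

```python
def ids_metrics(tp, fp, tn, fn):
    """Compute common IDS metrics from binary confusion-matrix counts.

    tp: attacks correctly flagged      fp: normal traffic wrongly flagged
    tn: normal traffic correctly passed  fn: attacks that were missed
    """
    total = tp + fp + tn + fn
    return {
        # Share of all events that were classified correctly.
        "accuracy": (tp + tn) / total,
        # Share of real attacks that were detected (also called recall).
        "detection_rate": tp / (tp + fn),
        # Share of normal events that raised a false alarm.
        "false_alarm_rate": fp / (fp + tn),
    }


# Hypothetical counts for 100 attack events and 100 normal events.
metrics = ids_metrics(tp=90, fp=5, tn=95, fn=10)
```

A high accuracy figure alone can hide a poor detection rate when attacks are rare, which is why the detection rate and false alarm rate are reported alongside it.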
In order to demonstrate the effectiveness of the suggested model even more, a comparison
with other methods that use other models—like AIDS, ELM and PCA, MapReduce-based hybrid
architecture, and DRNN—is carried out. This comparison analysis shows the superiority of the
RF-based model in terms of accuracy and performance and offers insightful information about
the relative advantages and disadvantages of various IDS approaches. The suggested IDS model
is significant because it can efficiently filter out irrelevant input features, which reduces
processing overhead, speeds up reaction times, and simplifies the detection process. This
increased effectiveness is especially useful in Internet of Things scenarios where a large number
of linked devices are producing enormous amounts of data.
The dynamic nature of threats and difficulties in IoT contexts highlights the necessity of
implementing strong security measures. The complexity and variety of potential security risks
are increasing along with the number of devices linked to the internet. As a result, IDSs are
essential for defending IoT environments from intrusions and cyberattacks. The literature review
concludes by highlighting the vital role that IDSs play in boosting the security and resilience of
IoT networks. With better accuracy and performance than current methods, the suggested RF-
based intrusion detection system (IDS) model constitutes a substantial development in intrusion
detection capabilities. The approach exhibits promise to enhance the overall security posture of
IoT environments by facilitating faster processing and response times for IoT devices by
efficiently resolving the constraints related to data volume and relevancy.
CHAPTER 3
3. METHODOLOGY
The proposed project's methodology entails a number of crucial elements that will improve
the security of Internet of Things (IoT) environmental monitoring devices by integrating an
Intrusion Detection System (IDS) that uses machine learning techniques.
1. Device Selection and Configuration: The two IoT devices chosen as the main parts of the
environmental monitoring system are fitted with DHT11 sensors to measure temperature and
humidity. These devices are configured to gather and send sensor data for analysis.
2. Communication Protocol Enhancement: To enable more secure data transmission, the
connectivity between IoT devices has been improved. The project uses a Mosquitto broker and
the MQTT protocol to maintain a publish/subscribe connection. Compared to earlier techniques, this
protocol offers a more reliable and secure communication channel, improving the confidentiality
and integrity of data transmission.
3. Integration of an Intrusion Detection System (IDS): In order to continuously watch
network traffic and system behavior for indications of suspicious activities or intrusions, the IDS
component is embedded into the system design. Incoming data streams are analyzed using
machine learning techniques like Random Forest and Support Vector Machine algorithms to look
for patterns that could be signs of security issues.
4. Model Training and Evaluation: Using labeled data, the machine learning model is trained
to identify patterns linked to both possible intrusions and typical system activity. A number of
measures are used to assess how well the IDS performs in detecting
and identifying security risks, including accuracy, precision, recall, and F1-score.
5. Deployment and Testing: The IDS is incorporated into the IoT environmental monitoring
system after it has been trained and assessed. The efficacy and dependability of the IDS in
identifying and addressing security threats in practical settings are rigorously tested.
6. Continuous Monitoring and Optimization: To keep up with changing threats and sustain
peak performance over time, the IDS is constantly monitored and optimized. The machine
learning model is updated and improved on a regular basis in response to input from continuous
monitoring and assessment procedures.
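The evaluation measures listed in step 4 can be illustrated with a short sketch. It assumes binary labels (1 = intrusion, 0 = normal). The two prediction lists are hypothetical stand-ins for the outputs of two candidate classifiers (for example Random Forest and SVM) and are not results from this project.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1-score for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical labels and predictions, for illustration only.
labels = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
predictions = {
    "model_a": [1, 0, 1, 0, 0, 0, 1, 0, 1, 1],
    "model_b": [1, 1, 1, 1, 0, 1, 1, 0, 0, 1],
}

scores = {name: precision_recall_f1(labels, pred)
          for name, pred in predictions.items()}
# Pick the candidate with the highest F1-score (index 2 of each tuple).
best = max(scores, key=lambda name: scores[name][2])
print(scores, best)
```

In this toy case the second candidate misses no intrusions but raises more false alarms, which is the kind of trade-off that precision and recall make visible.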
Fig 3.1: Block Diagram for Proposed System
The block diagram of the project represents a comprehensive system architecture that
integrates advanced features to strengthen the security of Internet of Things environmental
monitoring sensors. Two Internet of Things (IoT) devices that are major data sources and have
DHT11 sensors for measuring humidity and temperature are at the center of the system. One
important improvement is the communication protocol: MQTT is used to secure data
transmission between devices via a Mosquitto broker, guaranteeing integrity and confidentiality.
The Intrusion Detection System (IDS), which is the brains of the system, is based on machine
learning techniques like Random Forest and Support Vector Machine algorithms. This intrusion
detection system (IDS) keeps a close eye on system activity and network traffic, examining
incoming data streams for any unusual patterns that might point to security risks. The IDS is
implemented into the system architecture for comprehensive testing to confirm its effectiveness
in real-world circumstances after training and evaluation using labeled data. Constant
optimization and monitoring provide flexibility in response to changing risks, and the machine
learning model is updated and improved upon on a regular basis in response to user feedback.
The project's goal is to greatly improve the security and resilience of IoT environments against
any intrusions and malicious behavior by using this integrated approach.
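The publish/subscribe data path in the block diagram can be sketched without any hardware. The example below is only an in-memory stand-in for the broker, written to show how a sensor reading published on one topic reaches a subscriber such as the IDS. The topic names, payload fields, and the InMemoryBroker class are hypothetical; a real deployment would use a Mosquitto broker with an MQTT client library, and the wildcard matching here is a simplified version of the MQTT rules.

```python
import json


def topic_matches(pattern, topic):
    """Simplified MQTT-style matching: '+' is one level, '#' is the rest."""
    p_levels = pattern.split("/")
    t_levels = topic.split("/")
    for i, level in enumerate(p_levels):
        if level == "#":
            # '#' is only valid as the last level of a pattern.
            return i == len(p_levels) - 1
        if i >= len(t_levels):
            return False
        if level != "+" and level != t_levels[i]:
            return False
    return len(p_levels) == len(t_levels)


class InMemoryBroker:
    """Toy broker: delivers each message to every matching subscriber."""

    def __init__(self):
        self._subscriptions = []

    def subscribe(self, pattern, callback):
        self._subscriptions.append((pattern, callback))

    def publish(self, topic, payload):
        delivered = 0
        for pattern, callback in self._subscriptions:
            if topic_matches(pattern, topic):
                callback(topic, payload)
                delivered += 1
        return delivered


received = []
broker = InMemoryBroker()
# The IDS side subscribes to DHT11 readings from any device.
broker.subscribe("sensors/+/dht11",
                 lambda topic, payload: received.append((topic, json.loads(payload))))

# A device publishes one reading; the status message is not delivered.
reading = {"temperature_c": 27, "humidity_pct": 61}
broker.publish("sensors/node1/dht11", json.dumps(reading))
broker.publish("sensors/node1/status", json.dumps({"online": True}))
print(received)
```

Routing every reading through one broker gives the IDS a single point at which all device traffic can be observed.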
Fig 3.2: Flow Chart for Proposed System
The project's flowchart starts at the initiation step, denoting the beginning of the effort to
improve the security of Internet of Things environmental monitoring devices. The gathering of
environmental data by IoT devices using DHT11 sensors for temperature and humidity
measurement is a crucial first step in the process. Next, utilizing the MQTT protocol, this data is
safely transferred over a Mosquitto broker, guaranteeing the integrity and secrecy of device-to-device
communication. The Intrusion Detection System (IDS), which is based on machine
learning techniques like Random Forest and Support Vector Machine algorithms, then analyzes
the transmitted data. The intrusion detection system (IDS) keeps an eye on system activity and
network traffic to spot any unusual activity or intrusions. After that, labeled data is used for
training and assessment of the machine learning model inside the IDS in order to identify
patterns linked to both possible intrusions and typical system behavior. The IDS determines
whether there are any security threats or intrusions in the IoT environment based on the analysis
results. An alert is generated to advise authorities or relevant stakeholders about a potential
security breach if a security threat or incursion is detected. Appropriate reaction measures are
started to mitigate the security risk and stop additional harm, depending on how
serious the discovered danger is. The flowchart culminates with the process of enhancing
security for IoT environmental monitoring devices, highlighting the significance of
incorporating sophisticated features like machine learning-based intrusion detection systems
and secure communication protocols to protect IoT ecosystems from possible security risks.
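The decision steps of the flowchart (analyze, detect, alert, respond according to severity) can be sketched as below. The range_check_model function is only a placeholder for the trained Random Forest or SVM model: it flags readings outside the nominal DHT11 range (0 to 50 °C, 20 to 90 % RH). The score thresholds and response strings are assumptions chosen for illustration.

```python
def range_check_model(reading):
    """Placeholder classifier: returns (is_intrusion, score in [0, 1])."""
    bad_temp = not (0 <= reading["temperature_c"] <= 50)
    bad_hum = not (20 <= reading["humidity_pct"] <= 90)
    violations = int(bad_temp) + int(bad_hum)
    # Hypothetical scores: more violated ranges means higher confidence.
    score = {0: 0.0, 1: 0.7, 2: 0.95}[violations]
    return violations > 0, score


def severity(score):
    """Map a detection score to a severity level (assumed thresholds)."""
    if score >= 0.9:
        return "high"
    if score >= 0.6:
        return "medium"
    return "low"


RESPONSES = {
    "low": "log event",
    "medium": "notify administrator",
    "high": "block device and notify administrator",
}


def handle_reading(reading, model):
    """Run one reading through detection, alerting and response selection."""
    is_intrusion, score = model(reading)
    if not is_intrusion:
        return None  # normal behavior: no alert is raised
    level = severity(score)
    return {
        "alert": True,
        "device": reading["device"],
        "severity": level,
        "response": RESPONSES[level],
    }


normal = handle_reading(
    {"device": "node1", "temperature_c": 27, "humidity_pct": 61}, range_check_model)
suspect = handle_reading(
    {"device": "node2", "temperature_c": 120, "humidity_pct": 61}, range_check_model)
critical = handle_reading(
    {"device": "node2", "temperature_c": 120, "humidity_pct": 5}, range_check_model)
print(normal, suspect, critical)
```

Because the model is passed in as a function, the placeholder can be swapped for a trained classifier without changing the alerting logic.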
A number of crucial elements and procedures are included in the project's methodology, which
aims to improve the security of IoT environmental monitoring devices by integrating cutting-
edge features, such as an Intrusion Detection System (IDS) based on machine learning
techniques. First, the project entails choosing and setting up two Internet of Things (IoT) devices
that have DHT11 sensors installed in order to assess humidity and temperature. These units are
responsible for gathering vital information from their surroundings and are the backbone of the
environmental monitoring system. To provide more secure data transmission, the connection protocol
of these IoT devices is significantly improved. The project uses a Mosquitto broker and the
MQTT protocol to maintain a publish/subscribe connection. Compared to earlier techniques, this
protocol provides a reliable and secure communication channel, improving the confidentiality
and integrity of data transmission. The intrusion detection system (IDS), which is the main
component of the project, is integrated into the system architecture after the Internet of Things
devices and communication protocols have been set up. This intrusion detection system (IDS)
continuously monitors network traffic and system behavior for indications of suspicious activity
or intrusions. It is based on machine learning techniques including Random Forest and Support
Vector Machine algorithms.
Using labeled data, the machine learning model within the IDS is trained and evaluated as part
of the approach. The goal of this stage is to make it possible for the IDS to efficiently identify
patterns linked to typical system behavior and possible intrusions. A range of performance
indicators, including F1-score, recall, accuracy, and precision, are used to evaluate how well an
intrusion detection system (IDS) detects and identifies security threats. In order to depict the
architecture and workflow of the project, a block diagram and a flowchart are created. The block
diagram highlights the integration of environmental monitoring devices, communication
protocols, and the machine learning-based IDS, while also illuminating the main elements and
their interconnections within the system. On the other hand, the flowchart outlines the project's
sequential phases, including data collection and transmission, intrusion detection, alert creation,
and response activities. The project's goal is to greatly improve the security of IoT environmental
monitoring devices by implementing cutting-edge features like machine learning-based intrusion
detection systems and secure communication protocols through the use of this thorough
methodology. The project aims to protect IoT ecosystems from malicious actions and potential
security concerns by utilizing these technologies in an efficient manner.
CHAPTER 4
4. HARDWARE AND SOFTWARE REQUIREMENTS
The hardware requirements for the proposed system, SECURING IOT
ENVIRONMENTS WITH MACHINE LEARNING-DRIVEN INTRUSION DETECTION
SYSTEMS, are as follows:
● NODEMCU ESP8266
● DHT11 SENSOR
● JUMPER WIRES
NODE MCU
The NodeMCU is a flexible development board that is popular in Internet of Things (IoT)
applications because of its low cost, large feature set, and ease of use. Based on the
ESP8266 microcontroller, the NodeMCU is a common choice for professionals and hobbyists
alike because of its many features and capabilities. The ESP8266 microcontroller, which powers
the NodeMCU, has a 32-bit Tensilica L106 CPU that typically runs at 80 MHz and can be clocked at up to 160 MHz. The
NodeMCU's potent processor and integrated Wi-Fi enable it to easily connect to wireless
networks and communicate with other devices via the internet. Firmware, program code, and
other data are stored in the 4MB of flash memory that the NodeMCU normally has available. It
also has 96 KB of data RAM and 64KB of instruction RAM, which is plenty of memory for
storing variables and executing programs. One of the main characteristics of the NodeMCU is its
support for the Lua programming language, which makes it easier to construct Internet of Things
applications by offering a high-level, user-friendly scripting language. Without the need for
complicated programming environments, developers can quickly prototype and implement
Internet of Things projects with the pre-flashed Lua interpreter in the NodeMCU firmware.
The NodeMCU enables C/C++ programming with the Arduino IDE (Integrated Development
Environment), in addition to Lua scripting. This allows developers to select the development
environment and programming language that best fits their requirements and tastes. With
regard to networking, the NodeMCU has built-in Wi-Fi functionality, which enables it to
establish wireless network connections and establish online communication with other devices.
Fig : NODEMCU
It is simple to connect to a computer for development and debugging because it has a USB
connector for power and programming. Due to its modest size (about 2.7 x 4.8 cm), the
NodeMCU board can be easily integrated into prototypes and small-scale IoT projects. Its
capabilities can be further increased by connecting sensors, actuators, and other peripherals to
the board via a set of GPIO (General Purpose Input/Output) pins. All things considered, the
NodeMCU is a strong and adaptable development board that works well with a variety of
Internet of Things applications. Because of its low cost, simplicity of use, and wide range of
features, it has grown in popularity among developers who want to make Internet of Things
smart solutions and linked products.
SPECIFICATIONS:
Microcontroller: ESP8266EX
Processor: Tensilica L106 32-bit RISC microcontroller
Operating Frequency: 80 MHz (default), up to 160 MHz
Flash Memory: 4 MB (32 Mb)
RAM: 128 KB
WiFi Connectivity:
WiFi Standards: 802.11 b/g/n
Frequency Range: 2.4 GHz
Modes: Station, Access Point, Station + Access Point
Security: WEP/WPA-PSK/WPA2-PSK
GPIO Pins: 16 pins, including:
Digital I/O: GPIO 0 to GPIO 16
Analog Input: A0 (supports 10-bit ADC)
SPI, I2C, UART interfaces
USB Interface: Micro USB port for power supply, programming, and debugging
Power Supply:
Arduino IDE
Lua-based firmware (NodeMCU firmware)
Programming Language: C/C++ (Arduino IDE), Lua (NodeMCU firmware)
Dimensions: Typically around 49 mm x 24 mm
Operating Temperature: -40°C to +125°C
Compatibility: Compatible with various sensors, actuators, and modules commonly used in IoT
projects
Firmware: Open-source firmware with extensive community support
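As a small worked example of the 10-bit ADC listed in the specifications, a raw reading between 0 and 1023 can be converted to a voltage as sketched below. The full-scale value of 3.3 V is an assumption that holds for typical NodeMCU boards, which place a voltage divider in front of the ESP8266 ADC input; it should be checked against the board in use.

```python
ADC_BITS = 10
ADC_MAX = (1 << ADC_BITS) - 1  # 1023, the largest 10-bit value


def adc_to_volts(raw, full_scale_volts=3.3):
    """Convert a raw 10-bit ADC count to volts.

    full_scale_volts is an assumed board-level value, not a measured one.
    """
    if not 0 <= raw <= ADC_MAX:
        raise ValueError("raw ADC value must be between 0 and 1023")
    return raw / ADC_MAX * full_scale_volts


half_scale = adc_to_volts(512)
print(round(half_scale, 3))
```

With 10 bits the resolution is the full-scale voltage divided by 1023, which is about 3.2 mV per step under the 3.3 V assumption.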
DHT11 SENSOR
The DHT11 sensor is a digital temperature and humidity sensor module that is widely utilized
in many different applications, such as home automation systems and weather stations. It is a
popular option for amateurs, do-it-yourselfers, and novices in the field of electronics due to its
simplicity, affordability, and ease of use. The analog-to-digital converter (ADC), temperature
sensor, and humidity sensor are the three main components of the DHT11 sensor, which are all
combined into one single package. The sensor's interface with microcontrollers and other
electronic equipment is made simpler by its single-wire digital operation. A humidity sensing
element based on a moisture-sensitive substrate is used in the DHT11 sensor's operation. The
electrical resistance of this substrate varies in response to changes in relative humidity. The
resistance is measured by the sensor, which then transforms it into a digital signal that indicates
the relative humidity (RH) level.
A thermistor is also incorporated within the DHT11 sensor to sense temperature. A resistor
type whose resistance changes with temperature is called a thermistor. The sensor can detect the
outside temperature and translate it into a digital signal by measuring the thermistor's resistance.
Through a single-wire digital interface, the DHT11 sensor can communicate with external
devices like microcontrollers or development boards. Through the use of this interface and a
particular communication protocol, the host device receives data packets from the sensor
that contain temperature and humidity readings. After that, the host device decodes these
data packets and handles the data appropriately. A typical exchange proceeds as follows:
1. Initialization: A start signal is sent by the host device to the DHT11 sensor to begin
communication.
2. Data Collection: The DHT11 sensor starts the data collection procedure as soon as it receives
the start signal. Its built-in sensing elements are used to measure the temperature and humidity
levels.
3. Data Transmission: After the measurements are finished, the sensor sends a digital signal
containing the data to the host device. Temperature and humidity readings are usually included
in the data packet, along with checksum bits for error detection.
4. Data Interpretation: After the host device gets the sensor's data packet, it parses the contents.
It extracts the temperature and humidity readings from the packet and does any calculations or
processing that's required.
5. Error Handling: To guarantee data integrity, the DHT11 data packet carries a checksum.
If the host device finds a checksum mismatch or a timing error during transmission, it can
discard the reading and request a new one by sending another start signal.
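The interpretation and error-handling steps can be sketched as follows. The example assumes the commonly documented DHT11 frame layout of five bytes (humidity integer part, humidity decimal part, temperature integer part, temperature decimal part, checksum), where the checksum is the low 8 bits of the sum of the first four bytes. The decimal bytes are treated as tenths, which is an assumption; many DHT11 units always report zero there. The function name and sample frames are hypothetical.

```python
def decode_dht11_frame(frame):
    """Decode a 5-byte DHT11 frame into (humidity_pct, temperature_c).

    Raises ValueError if the frame length or checksum is wrong, so the
    caller can discard the reading and trigger a new measurement.
    """
    if len(frame) != 5:
        raise ValueError("a DHT11 frame must contain exactly 5 bytes")
    hum_int, hum_dec, temp_int, temp_dec, checksum = frame
    # The checksum is the low byte of the sum of the four data bytes.
    if (hum_int + hum_dec + temp_int + temp_dec) & 0xFF != checksum:
        raise ValueError("checksum mismatch")
    humidity = hum_int + hum_dec / 10
    temperature = temp_int + temp_dec / 10
    return humidity, temperature


# Hypothetical frame: 61 % RH, 27 °C, checksum 61 + 0 + 27 + 0 = 88.
good_frame = bytes([61, 0, 27, 0, 88])
humidity, temperature = decode_dht11_frame(good_frame)
print(humidity, temperature)

# A corrupted frame is rejected instead of producing a wrong reading.
try:
    decode_dht11_frame(bytes([61, 0, 27, 0, 89]))
except ValueError as error:
    print("discarded:", error)
```

On the NodeMCU itself this decoding is normally done by a sensor library; the sketch only makes the checksum rule explicit.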
Overall, the DHT11 sensor offers a reasonably priced way to monitor temperature and
humidity levels for a wide range of simple applications where high-precision measurements are not
necessary.
Fig: DHT11 Sensor
SPECIFICATIONS:
JUMPER WIRES
Jumper wires are crucial parts of electronics and electrical projects because they act as flexible
connectors that make it easier to transfer power or signals between different components, like
integrated circuits, sensors, LEDs, and microcontrollers. Usually composed of insulated, flexible
conductive material like copper, these wires come in a range of gauges, colors, and lengths to
accommodate a variety of needs and applications. The adaptability and simplicity of usage of
jumper wires are two of their main benefits. They make it possible to quickly and easily connect
several components on a circuit board, prototype board, or breadboard, facilitating
experimentation and prototyping without the need for soldering. This makes them essential tools
for professionals, students, and enthusiasts alike, allowing them to effectively test circuit designs,
troubleshoot problems, and iterate on projects. Additionally, jumper wires are essential for
electronics projects because they enable modular design and reusability.
Fig: Jumper Wires
Instead of needing to redesign the entire circuit, designers can quickly swap out or replace
specific modules by employing jumper wires to make modular connections between components.
This encourages adaptability, scalability, and ease of maintenance—especially for intricate
projects or systems where parts might need to be changed or improved on a regular basis.
Furthermore, jumper wires are quite helpful in electronic circuit debugging and defect detection.
Because of their flexibility, engineers and technicians can precisely probe circuit
nodes to swiftly discover and isolate problems. With jumper wires, you may easily
and dependably probe and access circuit connections for continuity testing, voltage testing, and
signal integrity monitoring. In conclusion, jumper wires are incredibly useful tools for electrical
and electronic projects because of their modularity, usability, and variety. These flexible
connectors, whether they are utilized for modular design, troubleshooting, or prototyping, are
essential for the successful development, experimentation, and upkeep of electrical circuits and
systems.
ARDUINO IDE
Writing, compiling, and uploading code to Arduino boards is made easy with the help of the
cross-platform Arduino Integrated Development Environment (IDE). When creating and
implementing projects based on Arduino microcontroller boards, professionals, hobbyists, and
enthusiasts use it as their main software tool. Its main features are as follows:
1. Code Editor: To make writing and modifying Arduino sketches (programs) easier, the IDE
has a user-friendly code editor with syntax highlighting, auto-indentation, and code completion
functions.
2. Sketch Management: The Arduino IDE's sketchbook is where sketches, or programs, are
arranged. Within the IDE, users may create, open, save, and manage sketches.
3. Library Manager: Users may install, update, and manage libraries—groups of pre-written
code—used in their projects with ease thanks to the IDE's built-in library manager. Arduino
libraries offer ready-to-use code for common tasks, which streamlines the development process
and adds additional functionality.
4. Board Manager: The Arduino IDE is compatible with a large number of boards, both official
and third-party, that are Arduino-compatible. With the integrated board manager, users may
choose the right board type and install firmware and drivers as well as other board-specific
support packages right from the IDE.
5. Serial Monitor: The serial monitor utility that comes with the IDE lets users communicate
with an Arduino board over the serial port. It makes it possible to monitor data sent and
received by the Arduino in real time, which is extremely helpful for debugging and
troubleshooting.
6. Integrated Examples: A range of example sketches covering many facets of Arduino
programming are included with the Arduino IDE. These range from fundamental input/output
(I/O) operations to more sophisticated features like sensors, displays, and communication
protocols. These examples serve as excellent teaching tools and starting points for new
projects.
7. Code Compilation and Upload: Using USB or other compatible interfaces, Arduino
sketches can be seamlessly compiled into machine code (binary) and uploaded to Arduino
boards using the IDE's user-friendly workflow. With a few clicks, users can verify their code
for errors and upload it to their target board.
All things considered, the Arduino IDE offers an easy-to-use and intuitive development
environment for Arduino projects, enabling creators and developers to easily realize their ideas.
Whether you are developing intricate IoT applications or just starting out
in the world of electronics, the Arduino IDE provides the materials and tools you need to bring
your ideas to life.
VS CODE
Microsoft created Visual Studio Code (VS Code), a lightweight, open-source code editor. It has swiftly
become one of the preferred options for developers writing code on a variety of
platforms and in many programming languages. Below is a summary of VS Code:
1. User Interface: With an emphasis on productivity and simplicity, VS Code offers a clear and
easy-to-use user interface. With features like split views, tabbed editing, and an integrated
terminal, its layout can be customized to give developers a flexible workplace that suits their
needs.
2. Language Support: One of VS Code's best qualities is its wide range of language support.
Numerous programming languages, including JavaScript, TypeScript, Python, Java, C++, and
many more, are supported either out of the box or through extensions. To further improve the coding experience, VS
Code provides syntax highlighting, code completion, and intelligent code suggestions tailored to
each language.
3. Extensions: With VS Code's extensive ecosystem of extensions, developers can alter and
expand its features to suit their requirements. Thousands of community-developed extensions in
areas including themes, language support, debugging tools, source control integrations, and
productivity boosts are available in the VS Code Marketplace. There is probably an extension
that can help you improve your productivity, integrate with other tools, or make
development tasks easier.
5. Cross-Platform Compatibility: Windows, macOS, and Linux are just a few of the operating
systems on which VS Code is designed to function flawlessly. The platform's feature set and
consistent performance across many scenarios make it a popular choice for developers working
in a variety of environments.
6. Community and Support: Thanks to a sizable and vibrant developer community, Visual
Studio Code continues to receive contributions, feedback, and support from users all around
the world. The official documentation offers resources for learning, debugging, and mastering
VS Code, in addition to community forums, GitHub repositories, and online tutorials.
Visual Studio Code provides a strong and adaptable coding environment that fuses the ease of
use of a light-weight editor with the features of an integrated development environment. Because
of its broad language support, rich ecosystem of extensions, and cross-platform compatibility, it
is a vital tool for developers working in a variety of programming disciplines and at all skill
levels.
PYTHON
Python is a popular high-level interpreted programming language that is easy to learn and
highly versatile. Python was conceived by Guido van Rossum in the late 1980s, first released in 1991, and has since
grown to be one of the most widely used programming languages globally. Developers use
Python for a variety of purposes, including data analysis, web development, artificial
intelligence, and scientific computing.
1. Easy to Read and Write: Python's syntax is intended to be simple and straightforward,
making it easy for developers to write and understand code. Using indentation to delimit code
blocks improves readability and encourages consistent coding practices.
2. Interpreted and Dynamically Typed: Python is an interpreted language, which means that
an interpreter runs code without first compiling it into machine code. Additionally, it has
dynamic typing, which allows a variable to be bound to a value of any type without explicitly
declaring its data type.
3. Large Standard Library: Python includes a large standard library that offers many modules
and functions for handling typical tasks like data processing, networking, file input and output,
and string manipulation. Development is sped up and fewer external dependencies are required
thanks to this large library.
4. Platform Independence: Python code can operate on a variety of operating systems without
the need for change because it is platform-independent. Python is a great option for creating
cross-platform apps because of its portability.
5. Strong Community Support: Python boasts a thriving and dynamic developer community
that actively contributes to the ecosystem by building tools, frameworks, and libraries.
Thousands of third-party packages that expand Python's capabilities and cover a wide range of
topics, including web development, data analysis, machine learning, and more, are available on
the Python Package Index (PyPI).
7. Wide Range of Applications: Python is used in many different fields and sectors, such as
scientific computing, automation, web development (using frameworks like Django and Flask),
data science and analytics (using libraries like NumPy, pandas, and scikit-learn), artificial
intelligence and machine learning (using frameworks like TensorFlow and PyTorch), and more.
8. Ease of Learning: Python is a great option for novices learning to code because of its
straightforward syntax and readability. Thanks to its gentle learning curve, copious documentation,
and online resources, people of different backgrounds and skill levels can use it.
Python is a powerful and well-liked programming language for a wide range of applications
because of its ease of use, adaptability, and robust ecosystem. Python delivers the flexibility and
resources you need to realize your ideas, whether you are designing
complicated systems or learning to code.
C LANGUAGE
C is a robust and popular programming language that is well-known for its portability, efficiency,
and versatility. C, one of the most widely used programming languages today, was created by
Dennis Ritchie at Bell Laboratories in the early 1970s. It has impacted numerous other
programming languages. This is a summary of C:
1. Mid-level Language: Because C combines the features of low-level languages like assembly
language and high-level languages like Python, it is frequently referred to as a "mid-level"
programming language. For systems programming and embedded development, it offers a nice
compromise between direct hardware manipulation and high-level abstraction.
3. Efficiency and Performance: C has an excellent reputation for both. Because it permits low-
level access to system resources and direct memory manipulation, developers can create highly
optimized code for jobs demanding speed and resource efficiency.
4. Portability: C programs can run on a variety of hardware platforms and operating systems
with little to no modification because of its high degree of portability. Using standard libraries
and adhering to the ANSI C standard's common coding conventions help accomplish this
portability.
5. Static Typing: Because C is a statically typed language, variable types are fixed at compile time
and cannot change while the program runs. Although it necessitates explicit type declarations
for variables and functions, this enables efficient memory allocation and early error
detection.
6. Rich Standard Library: C comes with a rich standard library that includes functions for
handling memory, input/output, string manipulation, and mathematical computations, among
other typical tasks. These built-in features help C programs to be developed more quickly and to be more
portable.
8. Tradition and Industry Acceptance: In the software business, C has a lengthy history and
a solid reputation. It is extensively utilized in fields including scientific computing, device
drivers, game development, embedded systems, and operating systems. Many
widely used software systems and applications are written in C or contain components written in C.
9. Learning Curve: Because of its low-level syntax and manual memory management, C
can be harder to learn than higher-level languages, despite its unmatched control and
efficiency. Nonetheless, learning C offers a strong basis for comprehending the fundamentals of
computer architecture and programming.
C is a strong, flexible programming language with a long history and a variety of uses. For
developers working on system-level and performance-critical applications, its low-level features,
portability, and efficiency make it a preferred option. Even though learning C may require more
work than learning higher-level languages, the knowledge and abilities gained from doing
so are crucial for comprehending the foundations of programming and creating reliable, effective
software systems.
MACHINE LEARNING
Within the field of artificial intelligence (AI), machine learning focuses on creating statistical models
and algorithms that let computers learn from data and make judgments or predictions without needing to
be explicitly programmed to do so. Machine learning aims to develop systems that can automatically pick
up new skills and grow with experience, making them more capable and precise over time. An outline of
machine learning is provided here:
Supervised Learning: Here, the algorithm is trained on a labeled dataset, meaning that every input
has a matching target output. The aim is to learn a mapping from inputs to outputs so that
the model can make predictions on new, unseen data.
Reinforcement Learning: This technique teaches an agent how to interact with its surroundings
and pick up knowledge from feedback that takes the shape of incentives or punishments.
Through repeated trial and error, the agent eventually learns how to behave in a way that
maximizes cumulative reward.
2. Key Concepts: -
Features and Labels: In supervised learning, labels are the intended outputs that need to be
predicted, while features are the input variables or traits that are used to make predictions.
Training and Testing: The dataset is usually split into two sets, one for training and one for
testing. To determine the model's performance and capacity for generalization, it is trained on
the training set and tested on the testing set.
Loss Function and Optimization: The difference between the true labels and the predicted
outputs is quantified by the loss function. During training, optimization procedures like gradient
descent are employed to update the model parameters and minimize the loss function.
Model Evaluation: Machine learning models' performance on tasks like regression,
classification, and others is assessed using a range of measures, including accuracy, precision,
recall, and F1-score.
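The key concepts above (features and labels, a train/test split, a loss function, and gradient descent) can be seen together in a small sketch. It fits a one-feature linear model by minimizing the mean squared error. The data set is synthetic and noise-free (y = 2x + 1), and the learning rate and number of epochs are assumptions chosen so that this toy problem converges.

```python
def mse(xs, ys, w, b):
    """Loss function: mean squared error of the model w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_linear(xs, ys, lr=0.1, epochs=2000):
    """Fit w and b by batch gradient descent on the MSE loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the MSE with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the loss.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic data: feature x, label y = 2x + 1 (no noise).
data = [(k / 10, 2 * (k / 10) + 1) for k in range(20)]
train, test = data[:15], data[15:]          # simple 75/25 split
train_x, train_y = zip(*train)
test_x, test_y = zip(*test)

w, b = fit_linear(train_x, train_y)
test_loss = mse(test_x, test_y, w, b)
print(round(w, 3), round(b, 3), test_loss)
```

The loss on the held-out test set, not the training loss, is what indicates how well the model generalizes.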
Image and Speech Recognition: Computers can comprehend and analyze visual and aural
information thanks to machine learning algorithms, which underpin applications like object
detection, image classification, and speech recognition.
Natural Language Processing (NLP): NLP techniques enable computers to comprehend and
produce human language, and are used for tasks like text classification, sentiment analysis,
machine translation, and chatbots.
Predictive Analytics: A variety of industries, including marketing, e-commerce, banking, and
healthcare, use machine learning for predictive analytics. Based on past data, predictive models
can project future patterns, consumer behavior, and results.
Recommendation Systems: These systems provide tailored suggestions for goods, movies,
music, and other content by analyzing user preferences and behavior using machine learning
algorithms.
4. Difficulties and Considerations: -
Data Quantity and Quality: To achieve precise predictions and generalization, machine learning
models need to be trained on sizable, high-quality datasets.
Model Complexity and Interpretability: It is important to strike a balance between model
complexity and interpretability, since overly complex models run the risk of being opaque and
overfitting the training set.
Ethical and Bias Concerns: Machine learning algorithms may reinforce pre-existing biases in
the data, producing unfair or discriminatory results. Ensuring fairness and addressing bias in
machine learning models is a continuous challenge.
Support Vector Machine (SVM) is one of the most popular supervised learning algorithms. It can
be used for both classification and regression problems, but it is primarily used for
classification. The goal of the SVM algorithm is to find the best line or decision boundary that
separates n-dimensional space into classes, so that new data points can easily be placed in the
correct category. This best decision boundary is called a hyperplane. SVM chooses the extreme
points (vectors) that help in creating the hyperplane. These extreme cases are called support
vectors, and hence the algorithm is termed Support Vector Machine.
Linear SVM: Linear SVM is applied to data that can be classified into two groups using only
one straight line; this type of data is known as linearly separable data, and the classifier used to
classify it is known as a linear SVM classifier.
Non-linear SVM: Non-linear SVM is used for data that is not linearly separable, that is, data
that cannot be classified using a straight line. The classifier used in this situation is
referred to as a non-linear SVM classifier.
Hyperplane: To separate the classes in n-dimensional space, there might be more than one line
or decision boundary. The key is to determine which decision boundary best separates the data
points. This optimal boundary is called the SVM hyperplane.
The number of features in the dataset determines the hyperplane's dimensions: if there are two
features, the hyperplane is a straight line, and if there are three features, the hyperplane is
a two-dimensional plane.
We always create the hyperplane with the maximum margin, that is, the maximum distance between
the hyperplane and the nearest data points of each class.
Support Vectors:
Support vectors are the data points or vectors that are closest to the hyperplane and have an
impact on its position. These vectors are known as support vectors because they provide support
to the hyperplane.
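The relationship between the hyperplane, the margin, and the support vectors can be illustrated with a small calculation. The sketch below does not train an SVM; it assumes a hand-picked hyperplane w.x + b = 0 in two dimensions and four hypothetical points, then measures each point's distance to the hyperplane. The points at the smallest distance are the support vectors.

```python
import math


def signed_distance(w, b, x):
    """Signed distance of point x from the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


# Assumed hyperplane x1 + x2 - 6 = 0 and hypothetical labeled points
w, b = [1.0, 1.0], -6.0
points = {(1.0, 2.0): -1, (2.0, 2.0): -1, (4.0, 4.0): +1, (5.0, 5.0): +1}

distances = {p: signed_distance(w, b, p) for p in points}

# Every point lies on the side of the hyperplane that matches its label
correctly_separated = all(d * label > 0
                          for (p, label), d in zip(points.items(), distances.values()))

# The margin is the smallest absolute distance; points attaining it are support vectors
margin = min(abs(d) for d in distances.values())
support_vectors = sorted(p for p, d in distances.items()
                         if math.isclose(abs(d), margin))

print("Separated correctly:", correctly_separated)
print("Margin:", margin)
print("Support vectors:", support_vectors)
```

Moving any point other than (2, 2) or (4, 4) slightly would leave this hyperplane unchanged, which is why only the support vectors determine its position.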
RANDOM FOREST
In machine learning, random forest is a potent ensemble learning technique that is applied to
both classification and regression problems. It is a member of the decision tree-based algorithm
family and is used extensively because of its efficiency, simplicity, and resilience. This is a
summary of random forest:
1. Ensemble Learning: Random forest is an ensemble learning method that builds a more
reliable and accurate model by combining several decision trees. Each decision tree in the
random forest is trained independently on a random subset of the training data, and each tree
then generates predictions using its own set of rules.
2. Decision Trees: Based on the values of input features, decision trees are straightforward yet
effective prediction models that divide the feature space into regions recursively. Every leaf node
in the tree indicates a predicted output or class label, whereas every internal node in the tree
represents a choice made based on a feature.
3. Voting Method: In classification problems, random forests combine the predictions of the
individual decision trees by majority voting. Every tree votes for a class label, and the class
with the highest number of votes becomes the final prediction. For regression tasks, the final
prediction is the mean of the values predicted by each tree.
4. Hyperparameters: To maximize model performance and generalization, Random Forest
provides a number of hyperparameters that can be adjusted. The number of trees in the forest,
the trees' maximum depth, the minimum number of samples needed to split a node, and the
maximum number of features taken into consideration for splitting are a few important
hyperparameters.
6. Applications: - Random forests are frequently utilized for tasks like feature selection,
regression, and classification in a variety of areas, including bioinformatics, finance, healthcare,
and marketing.
- It is frequently used in practical applications like recommendation systems, fraud detection,
disease diagnosis, credit scoring, and customer churn prediction.
Random forest is a strong and adaptable machine learning algorithm that performs well when
handling complex datasets and producing accurate predictions. It is a popular choice for many
different types of predictive modeling tasks because of its capacity to reduce overfitting,
manage high-dimensional data, and offer insights into feature importance.
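The way a forest combines its trees can be sketched in a few lines of plain Python. The per-tree outputs below are hypothetical values chosen for illustration; the function names majority_vote and average_prediction are likewise assumptions, not part of any library.

```python
from collections import Counter
from statistics import mean


def majority_vote(tree_predictions):
    """Classification: return the class label predicted by the most trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


def average_prediction(tree_predictions):
    """Regression: return the mean of the per-tree predictions."""
    return mean(tree_predictions)


# Hypothetical outputs of five classification trees for one sensor reading
votes = ["intrusion", "normal", "intrusion", "intrusion", "normal"]
print(majority_vote(votes))

# Hypothetical outputs of four regression trees for one sensor reading
print(average_prediction([27.5, 28.0, 29.0, 27.5]))
```

Note that scikit-learn's RandomForestClassifier combines trees by averaging their predicted class probabilities rather than by counting hard votes, so the sketch shows the textbook voting scheme described above, not that library's exact procedure.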
CHAPTER 5
5. WORKING OF THE PROPOSED SYSTEM
DHT11 Sensors: These sensors measure real-time temperature and humidity data within the environment
where they are deployed.
Datagram Communication: Devices likely transmit small, independent packets of sensor data for efficient
exchange.
MQTT Protocol: MQTT acts as a lightweight messaging protocol designed for constrained IoT devices,
facilitating data transfer.
Mosquitto Broker: This central broker manages the communications, receiving data from the IoT devices and
making it available to the server.
Intrusion Detection: The core task of the server is to analyze incoming data from the devices to
detect anomalies that might suggest security breaches.
Random Forest: This ensemble algorithm constructs multiple decision trees, using collective
decision-making to enhance accuracy and pinpoint security threats.
Support Vector Machine (SVM): The SVM algorithm classifies data by finding optimal boundaries
between different data categories (normal vs. compromised in this case), further identifying
intrusions.
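The program listing later in this report splits each incoming message on commas, which suggests that every MQTT message carries one comma-separated reading. The sketch below shows how such a payload could be built on the device side and parsed on the server side. The field order (temp1, humidity1, temp2, humidity2) and the function names encode_reading and decode_reading are assumptions made for illustration.

```python
def encode_reading(temp1, humidity1, temp2, humidity2):
    """Build the comma-separated payload a device would publish."""
    return f"{temp1},{humidity1},{temp2},{humidity2}"


def decode_reading(payload):
    """Parse a payload into four floats; MQTT delivers payloads as bytes."""
    if isinstance(payload, bytes):
        payload = payload.decode("utf-8")
    fields = payload.split(",")
    if len(fields) != 4:
        raise ValueError(f"expected 4 comma-separated values, got {len(fields)}")
    return tuple(float(field) for field in fields)


message = encode_reading(28.5, 61.0, 29.0, 63.5)
print(message)
print(decode_reading(message.encode("utf-8")))
```

Rejecting payloads with the wrong number of fields keeps malformed or truncated messages from reaching the classifiers.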
Alarm Systems & Notification
PROGRAM CODE
import paho.mqtt.client as mqtt
import csv
from datetime import datetime
import time
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
import numpy as np
import asyncio
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from telegram import Bot
TELEGRAM_BOT_TOKEN = "6456846815:AAEC98GLh4sJqexakjcYDqIih-X8BkYopmc"
TELEGRAM_CHAT_ID = "772685223"
broker_address = "localhost"
port = 1883
topic = "sensor/environment"
csv_file_path = "labeled_sensor_data.csv"
# Send an alert message to the configured Telegram chat
async def send_to_telegram(message):
    bot = Bot(token=TELEGRAM_BOT_TOKEN)
    await bot.send_message(chat_id=TELEGRAM_CHAT_ID, text=message)
    print("Message sent to Telegram:", message)
temperature_drop_threshold = 5.0
humidity_drop_threshold = 10.0
temperature_hike_threshold = 2
humidity_hike_threshold = 5
retrain_interval = 30
svm_model = SVC()
random_forest_model = RandomForestClassifier()
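with open(csv_file_path, "r") as file:
    reader = csv.reader(file)
    next(reader)  # skip the header row
    data = [row for row in reader]
# Build features and labels the same way as the main loop of this listing
X = [
    [float(row[1]), float(row[2]), float(row[3]), float(row[4])]
    for row in data
]
y = [int((float(row[1]) + float(row[3])) / 2) for row in data]
svm_model.fit(X, y)
random_forest_model.fit(X, y)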
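# Column means of X (order: temp1, humidity1, temp2, humidity2); the first two are
# used by the hike/drop checks, but their definitions are missing from the listing
temp1_mean = np.mean([row[0] for row in X])
humidity1_mean = np.mean([row[1] for row in X])
temp2_mean = np.mean([row[2] for row in X])
humidity2_mean = np.mean([row[3] for row in X])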
last_retrain_time = time.time()
def train_model_and_adjust_threshold(X, y, temp_mean, humidity_mean):
    X = np.array(X)
    y = np.array(y)
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X, y)
    y_pred = model.predict(X)
    temp_deviation = 0.2
    humidity_deviation = 0.2
    # The lines that compute precision, recall, f1_score, temp_threshold and
    # humidity_threshold are not preserved in this listing.
    print(f"Precision: {precision}, Recall: {recall}, F1 Score: {f1_score}")
    print(
        f"Temperature Threshold: {temp_threshold}, Humidity Threshold: {humidity_threshold}"
    )
    print("Threshold Values Updated With Current Dataset")
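readings = message.split(",")
# Unpacking reconstructed (assumed): only the trailing "readings)" of this line survives
temp1, humidity1, temp2, humidity2 = map(float, readings)
anomaly_detected = (
    temp1 > temp_threshold
    or humidity1 > humidity_threshold
    or temp2 > temp_threshold
    or humidity2 > humidity_threshold
)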
def on_message(client, userdata, msg):
    # Callback header and payload decoding are reconstructed (assumed); the
    # listing only preserves the body from the timestamp line onward.
    message = msg.payload.decode()
    # Date part of the format is assumed; only "%H:%M:%S" survives in the listing
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    new_data = preprocess_and_label_data(message)
    # print(is_sudden_hike(new_data[0], temp1_mean, temperature_hike_threshold))
    # print(is_sudden_hike(new_data[1], humidity1_mean, humidity_hike_threshold))
    # print(is_sudden_hike(new_data[2], temp2_mean, temperature_hike_threshold))
    # print(is_sudden_hike(new_data[3], humidity2_mean, humidity_hike_threshold))
    if (
        is_sudden_hike(new_data[0], temp1_mean, temperature_hike_threshold)
        or is_sudden_hike(new_data[1], humidity1_mean, humidity_hike_threshold)
        or is_sudden_hike(new_data[2], temp2_mean, temperature_hike_threshold)
        or is_sudden_hike(new_data[3], humidity2_mean, humidity_hike_threshold)
    ):
        svm_prediction = svm_model.predict([new_data[:4]])[0]
        random_forest_prediction = random_forest_model.predict([new_data[:4]])[0]
        print(timestamp)
        print(new_data[:4])
        print(svm_prediction)
        print(type(svm_prediction))
        if svm_prediction == 28 or random_forest_prediction == 8:
            asyncio.run(
                send_to_telegram(
                    f"Sudden hike Detected - Possible Intrusion: {timestamp} - {new_data[:4]}"
                )
            )
            print(f"Sudden hike Detected at {timestamp} - {new_data[:4]}")
        else:
            print(
                f"Sudden hike Detected, but models did not confirm intrusion at {timestamp} - {new_data[:4]}"
            )
    # elif instead of a second if, so a reading flagged as a hike is not also reported as normal
    elif (
        is_sudden_drop(new_data[0], temp1_mean, temperature_drop_threshold)
        or is_sudden_drop(new_data[1], humidity1_mean, humidity_drop_threshold)
        or is_sudden_drop(new_data[2], temp2_mean, temperature_drop_threshold)
        or is_sudden_drop(new_data[3], humidity2_mean, humidity_drop_threshold)
    ):
        svm_prediction = svm_model.predict([new_data[:4]])[0]
        random_forest_prediction = random_forest_model.predict([new_data[:4]])[0]
        print(timestamp)
        print(new_data[:4])
        print(svm_prediction)
        print(type(svm_prediction))
        if svm_prediction == 28 or random_forest_prediction == 8:
            asyncio.run(
                send_to_telegram(
                    f"Sudden Drop Detected - Possible Intrusion: {timestamp} - {new_data[:4]}"
                )
            )
            print(f"Sudden Drop Detected at {timestamp} - {new_data[:4]}")
        else:
            print(
                f"Sudden Drop Detected, but models did not confirm intrusion at {timestamp} - {new_data[:4]}"
            )
    else:
        print(f"Normal Data at {timestamp} - {new_data[:4]}")
client = mqtt.Client()
client.on_message = on_message
client.connect(broker_address, port=port)
client.subscribe(topic)
client.loop_start()
def load_data_from_csv(file_path):
    with open(file_path, "r") as file:
        reader = csv.reader(file)
        next(reader)  # skip the header row
        data = [row for row in reader]
    return data
try:
    while True:
        current_time = time.time()
        # Reload the labeled data and refit both models on every pass
        data = load_data_from_csv(csv_file_path)
        X = [
            [float(row[1]), float(row[2]), float(row[3]), float(row[4])]
            for row in data
        ]
        y = [int((float(row[1]) + float(row[3])) / 2) for row in data]
        svm_model.fit(X, y)
        random_forest_model.fit(X, y)
        time.sleep(1)
except KeyboardInterrupt:
    print("Stopping the script.")
    client.disconnect()
    client.loop_stop()
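The listing calls is_sudden_hike and is_sudden_drop and defines retrain_interval and last_retrain_time, but the helper definitions and the retraining check do not appear in it. The following is one plausible sketch of these pieces, assuming that a hike or drop means the reading deviates from the running mean by more than the threshold. The function bodies, and the name should_retrain, are assumptions and may differ from the project's actual implementation.

```python
def is_sudden_hike(value, mean, threshold):
    """Assumed rule: the reading exceeds the mean by more than the threshold."""
    return (value - mean) > threshold


def is_sudden_drop(value, mean, threshold):
    """Assumed rule: the reading falls below the mean by more than the threshold."""
    return (mean - value) > threshold


def should_retrain(current_time, last_retrain_time, retrain_interval):
    """Hypothetical helper: True once retrain_interval seconds have elapsed."""
    return (current_time - last_retrain_time) >= retrain_interval


# Threshold values taken from the listing; the mean and readings are illustrative
temperature_hike_threshold = 2
temperature_drop_threshold = 5.0
temp_mean = 28.0

print(is_sudden_hike(31.0, temp_mean, temperature_hike_threshold))
print(is_sudden_drop(20.0, temp_mean, temperature_drop_threshold))
print(should_retrain(current_time=130.0, last_retrain_time=100.0, retrain_interval=30))
```

With a check like should_retrain in the main loop, the models could be refitted every retrain_interval seconds instead of on every one-second pass.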
CHAPTER 6
6. ADVANTAGES
1. Enhanced Security: The suggested solution significantly improves the security of IoT
environmental monitoring devices by integrating advanced features such as an Intrusion Detection
System (IDS) based on machine learning techniques. This preserves the integrity and privacy of
the gathered environmental data by promptly identifying and averting possible security risks or
breaches.
3. Enhanced Data Integrity: The suggested system safeguards the accuracy of the gathered
environmental data by putting strong security measures and secure communication protocols into
place. By preventing unauthorized access or alteration, the system preserves the integrity and
dependability of the environmental data, facilitating informed analysis and decision-making.
4. Real-time Monitoring and Alerting: Real-time monitoring of IoT devices and their data
streams is made possible by the incorporation of an IDS based on machine learning techniques.
The IDS can identify unusual or suspicious behavior and promptly send out alerts, enabling rapid
response and mitigation of incidents or breaches.
5. Adherence to Regulations: The increased security features and protocols incorporated in the
suggested system aid in guaranteeing adherence to standards and regulations concerning data
security and privacy. This is especially crucial in fields or industries like healthcare and
environmental monitoring where stringent compliance with laws is required.
6. Flexibility and Scalability: The suggested system is made to be flexible and scalable in order
to meet changing demands. The system's long-term viability and efficacy can be ensured by its
easy expansion or updating to meet changing environmental monitoring needs or new security
concerns.
● Lack of Data Encryption: Private information sent between Internet of Things devices
and the cloud may be intercepted and accessed by unauthorized parties if encryption
methods are not in place. This leaves the system more vulnerable to privacy abuses and data
breaches.
● Lack of Real-time Monitoring and Alerts: It's possible that the current system doesn't
have the tools necessary to monitor security breaches or irregularities in environmental
data in real-time or to send out alerts to administrators. The impact of security issues may
worsen as a result of this delay in discovery and reaction.
RESULT
In the experiment, accuracy, a widely used multi-class performance metric, is used to assess the
performance of the implemented models. Precision, recall, and F1-score are also computed. The
results indicate that, of the two models, Random Forest (RF) and SVM, SVM attains the highest
test accuracy, at 98%. Additionally, the MQTT protocol and datagram communication provide
secure and effective data transfer between IoT devices and the server. The system’s responsiveness and
efficacy in identifying and addressing security events are supported by the framework’s dependable
communication architecture. The implementation of sophisticated machine learning techniques on the
server side, such as Random Forest and Support Vector Machine (SVM), is the central component of
the project. These algorithms enable the system to proactively detect and mitigate potential security
threats by analyzing incoming data streams for patterns suggestive of intrusion or anomalous behavior.
Graph of temperature data over time. The y-axis is labeled "Temperature" and is scaled from 0 degrees
to 35 degrees. The x-axis is labeled "Timestamp".
There are two lines on the graph. One line, labeled "temp1", is blue; the other, labeled "temp2", is
green. Both appear to be generally increasing in temperature over time.
The graph, titled “Humidity Over Time”, displays two humidity series: humidity1 and humidity2.
The green line (humidity1) remains fairly constant throughout the time period. The orange line
(humidity2) shows a significant increase in humidity over time, reaching close to 90 by 03/09.
The x-axis (Timestamp) covers dates from 03/04 to 03/09, and the y-axis (Humidity) ranges from
40 to 90. Both lines start at a similar humidity level of around 70 on 03/04.
The image shows a heatmap titled “Temperature Correlation Heatmap”, which visually depicts the
correlation between the two temperature variables temp1 and temp2. It consists of a 2x2 grid, with
each cell representing a correlation value. Red cells (diagonal): the top-left and bottom-right
squares are colored red. These indicate a perfect correlation (correlation coefficient of 1.00) of
each temperature variable with itself. Blue cells (off-diagonal): the top-right and bottom-left
squares are blue. These represent a very high positive correlation (correlation coefficient of
0.99) between temp1 and temp2. The x-axis and y-axis are labeled as “temp1” and “temp2”,
respectively, indicating the variables being compared. On the right side of the heatmap, a color
scale ranges from red (correlation coefficient 1.000) to blue (approximately 0.995), showing how
colors correspond to different correlation values.
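The values in such a heatmap are Pearson correlation coefficients. The sketch below computes the 2x2 correlation matrix for two short temperature series in plain Python. The sample readings are hypothetical and are not taken from the project's dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical readings from two sensors placed in the same environment
temp1 = [27.0, 27.5, 28.1, 28.9, 29.4]
temp2 = [27.2, 27.6, 28.4, 29.0, 29.7]

series = (temp1, temp2)
matrix = [[pearson(a, b) for b in series] for a in series]
for row in matrix:
    print([round(value, 3) for value in row])
```

The diagonal entries are 1 because every series is perfectly correlated with itself, and the matrix is symmetric, which is why the heatmap shows only two distinct values.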
CONCLUSION
The project is an excellent example of a comprehensive and progressive strategy for enhancing
security in the context of the Internet of Things (IoT). Through the integration of DHT11-
equipped IoT devices, machine learning-powered Intrusion Detection Systems (IDS), and
secure communication protocols, the system creates a mutually beneficial synergy that improves
the resilience and robustness of networked settings. The project addresses major IoT security
concerns, most notably the identification and mitigation of potential threats and aberrant activity,
through careful design and implementation. By using DHT11 sensors, it is possible to monitor
vital environmental parameters like humidity and temperature in real-time, which paves the way
for proactive surveillance and control. Additionally, the MQTT protocol and datagram
connection provide secure and effective data transfer between IoT devices and the server. The
system's responsiveness and efficacy in identifying and addressing security events are supported
by the framework's dependable communication architecture. The implementation of sophisticated
machine learning techniques on the server side, such as Random Forest and Support Vector
Machine (SVM), is the central component of the project. These algorithms enable the system to
proactively detect and mitigate potential security threats by analyzing incoming data streams for
patterns suggestive of intrusion or anomalous behavior.
Rapid threat identification and response are made possible by the combination of secure
communication technologies and machine learning-based intrusion detection systems, which
strengthens the overall security posture of IoT settings. Through the utilization of machine
learning and the combined intelligence of IoT devices, the system improves situational
awareness and gives stakeholders the ability to take proactive measures to protect against new
risks. Furthermore, the project's Telegram integration for alert notifications ensures prompt
and efficient communication with the relevant authorities. This
functionality minimizes possible risks and improves the overall resilience of IoT infrastructures
by enabling prompt intervention and resolution of security events. To sum up, the project offers
a scalable and adaptable framework for identifying, assessing, and responding to security
threats in interconnected environments, which constitutes a significant improvement in IoT
security.
FUTURE SCOPE
The project's future scope offers numerous chances for improvement and advancement, which
will enable the development of an intrusion detection and environmental monitoring system that
is complete and more effective. First off, adding more types of sensors to the project would
greatly improve its monitoring capabilities. A more thorough awareness of environmental
conditions would be made possible by the incorporation of sensors for air quality, pollution, or
particular environmental contaminants. This would allow for more informed decision-making
and preventative intervention measures. Furthermore, exploring the domain of sophisticated
machine learning models has the potential to enhance the precision and effectiveness of intrusion
detection. The system can more accurately identify anomalous patterns in the data by using
methods like deep learning, anomaly detection, or ensemble learning. This will improve threat
detection and mitigation. Scalability and adaptability are important factors to take into account
for the project's future growth.
The system's flexibility to adapt to changing needs and growing monitoring networks would
be ensured by designing it to support a larger number of IoT devices and different environmental
monitoring configurations. The system would be future-proofed by its scalability, which would
enable it to expand smoothly in tandem with rising data quantities and monitoring requirements.
Effective monitoring and control of IoT devices require an interface that is easy to use. System
administrators and end users can obtain important insights from the data by constructing an
accessible interface with visualization tools like graphs, charts, and maps. This enables informed
decision-making and prompt response to environmental risks. Another important factor to think
about is energy optimization, especially for Internet of Things devices placed in hard-to-reach or
isolated areas. By putting power-saving techniques like duty cycling, sleep modes, and
energy-efficient communication protocols into practice, devices' battery life can be extended
and their energy consumption and maintenance needs can be minimized, guaranteeing long-term
operation. Finally, the system would be able to predict future trends and changes in
environmental conditions if it used past data for environmental forecasting.
REFERENCES
[1] M. Frustaci, P. Pace, G. Aloi and G. Fortino, "Evaluating critical security issues of the IoT
world: Present and future challenges", IEEE Internet Things J, vol. 5, no. 4, pp. 2483-2495,
2018.
[3] G. Yang et al., "IoT-Based Remote Pain Monitoring System: From Device to Cloud
Platform", IEEE J Biomed Health Inform, vol. 22, no. 6, pp. 1711-1719, 2018.
[4] T. A. Teli, F. Masoodi and R. Yousuf, "Security Concerns and Privacy Preservation in
Blockchain based IoT Systems: Opportunities and Challenges", in Proc. of the ICICNIS 2020,
2021.
[5] I. S. Thaseen, B. Poorva and P. S. Ushasree, "Network Intrusion Detection using Machine
Learning Techniques", Proc. of the International Conference on Emerging Trends in Information
Technology and Engineering, 2020.
[6] F. Masoodi, S. Alam and S. T. Siddiqui, "Security & Privacy Threats Attacks and
Countermeasures in Internet of Things", Int. J. Netw. Secur. Its Appl., vol. 11, no. 02, pp. 67-77,
2019.
[8] F. S. Masoodi, I. Abrar and A. M. Bamhdi, "An Effective Intrusion Detection System using
Homogeneous Ensemble Techniques", Int. J. Inf. Secur. Priv., vol. 16, no. 1, pp. 1-18, 2021.
[9] A. Kumar, K. Abhishek, M. R. Ghalib, A. Shankar and X. Cheng, "Intrusion detection and
prevention system for an IoT environment", Digital Communications and Networks, vol. 8, no.
4, pp. 540-551, 2022.
[10] X. Li, W. Chen, Q. Zhang and L. Wu, "Building Auto-Encoder Intrusion Detection System
based on random forest feature selection", Comput Secur, vol. 95,
2020.
[11] A. S. Talita, O. S. Nataza and Z. Rustam, "Naive Bayes Classifier and Particle Swarm
Optimization Feature Selection Method for Classifying Intrusion Detection System Dataset", J
Phys Conf Ser, vol. 1752, no. 1, 2021.
[12] H. Jiang, Z. He, G. Ye and H. Zhang, "Network Intrusion Detection Based on PSO-Xgboost
Model", IEEE Access, vol. 8, pp. 58392-58401, 2020.
[13] S. Hosseini, "A new machine learning method consisting of GA-LR and ANN for attack
detection", Wireless Networks, vol. 26, no. 6, pp. 4149-4162, 2020.
[14] F. Masoodi, "Machine Learning for Classification analysis of Intrusion Detection on NSL-
KDD Dataset", Turkish Journal of Computer and Mathematics Education (TURCOMAT), vol.
12, no. 10, pp. 2286-2293, 2021.
[15] N. Abdalgawad and S. Member, "Generative Deep Learning to Detect Cyberattacks for the
IoT -23 Dataset", IEEE Access, vol. 10, pp. 6430-6441, 2022.
[16] A. M. Bamhdi, I. Abrar and F. Masoodi, "An ensemble based approach for effective
intrusion detection using majority voting", Telkomnika (Telecommunication Computing
Electronics and Control), vol. 19, no. 2, pp. 664-671, 2021.
[17] I. Ullah and Q. H. Mahmoud, "A Scheme for Generating a Dataset for Anomalous Activity
Detection in IoT Networks", in Lecture Notes in Computer Science, pp. 508-520, 2020.
[18] I. Abrar, Z. Ayub, F. Masoodi and A. M. Bamhdi, "A Machine Learning Approach for
Intrusion Detection System on NSL-KDD Dataset", In Proc. of the Int. Conf. Smart Electron
Commun. (ICOSEC-2020), pp. 919-924, 2020.
[19] C. Ambikavathi and S. K. Srivatsa, "Predictor Selection and Attack Classification using
Random Forest for Intrusion Detection", Journal of Scientific and Industrial Research, vol. 79,
no. 05, pp. 365-368, 2020.
[20] S. S. Dhaliwal, A. al Nahid and R. Abbas, "Effective intrusion detection system using
XGBoost", Information (Switzerland), vol. 9, no. 7, 2018.
[21] A. S. Ahanger, S. M. Khan and F. Masoodi, "An Effective Intrusion Detection System using
Supervised Machine Learning Techniques", In Proc. of the 5th International Conference on
Computing Methodologies and Communication (ICCMC-2021), pp. 1639-1644, 2021.
[22] V. Hassija, V. Chamola, V. Saxena, D. Jain, P. Goyal and B. Sikdar, "A Survey on IoT
Security: Application Areas Security Threats and Solution Architectures", IEEE Access, vol. 7,
pp. 82721-82743, 2019.
[23] M. Grandini, E. Bagli and G. Visani, Metrics for Multi-Class Classification: an Overview,
2020, [online] Available: http://arxiv.org/abs/2008.05756.
www.ijcrt.org © 20XX IJCRT | Volume X, Issue X Month Year | ISSN: 2320-2882
Abstract: This project presents novel ideas for boosting security and reducing risk in the Internet of Things (IoT)
by implementing an Intrusion Detection System that leverages machine learning. The project details the design of
two IoT devices based on DHT11 sensors that measure temperature and humidity levels. These devices
use datagram communication, with data transmitted through a Mosquitto broker using the MQTT protocol. On
the server side, a machine learning model built from Random Forest and Support Vector Machine (SVM) algorithms
analyzes the incoming data for patterns hinting at intrusion or unusual behavior. Whenever a possible security
threat raises an alarm, notifications are forwarded to the relevant authorities via the Telegram platform.
This project integrates IoT, machine learning, and secure communication technologies in order
to enhance the robustness of connected environments, so that threats can be detected quickly and responded to in
a timely manner.
Index Terms – Intrusion Detection, Feature Selection, Machine Learning, Random Forest Ensemble Approach, Support
Vector Machine.
I. INTRODUCTION
The way we engage with and use technology has undergone a radical paradigm shift with the introduction of the Internet
of Things (IoT). It consists of an extensive network of linked sensors, devices, and systems that exchange data and
communicate with each other in a seamless manner to automate tasks, boost productivity, and facilitate better
decision-making. This interconnection has completely changed the way we live, work, and interact with our surroundings. It has
revolutionized a number of industries, including healthcare, transportation, agriculture, and smart homes.
The origins of the Internet of Things can be found in the early 2000s, when early sensor networks and RFID (Radio
Frequency Identification) technology first came into being. But the Internet of Things didn’t really take off until the
development of low-cost, low-power microcontrollers, together with improvements in wireless communication protocols and
cloud computing.
IoT growth was further propelled by the widespread use of smartphones and high-speed internet connectivity, which provided
the infrastructure and connectivity required to support a wide range of networked devices. These days, Internet of Things (IoT)
gadgets are present in almost every area of our life, ranging from smart thermostats and wearable fitness trackers to industrial
machinery and self-driving cars.
The fundamental promise of the Internet of Things is its capacity to gather, process, and act
upon massive volumes of data in real-time, providing previously unheard-of insights and optimization opportunities. IoT
systems can identify patterns, anticipate trends, and automate decision-making by utilizing sensor data and machine learning
algorithms. This results in increased productivity, lower costs, and better user experiences.
IoT devices in the healthcare industry remotely monitor patients’ vital signs, allowing for the early identification of health
problems and individualized interventions. Sensors in agriculture monitor environmental factors and soil moisture content to
maximize crop yields and irrigation schedules. IoT-enabled automobiles in the transportation sector interact with other cars
and infrastructure to increase road safety and efficiency.
[1] Sajad M. Khan and Faheem Syeed Masoodi, “Intrusion Detection System for IoT Environment using Ensemble Approaches”,
proposed a system in Feb 2013. To ensure that important assets are available and secure within a protected network architecture,
Intrusion Detection Systems (IDS) are commonly used. However, current IDS algorithms often struggle to perform effectively.
To address this, machine learning has been employed to enhance IDS efficiency. The main challenge with IDS classification is
the large amount of irrelevant and redundant data in high-dimensional datasets, making it impossible for a single classifier to
identify all types of attacks effectively. Thus, a novel ensemble IDS approach was proposed in this study. The approach involved
using Random Forest (RF) for dimensionality reduction to select the optimal subset of the initial dataset. An ensemble learning
method was then used for intrusion detection and identification. The proposed RF method outperformed other state-of-the-art
approaches in several parameters with an accuracy of 99%, as demonstrated by experimental results on the IoTID20 dataset. The
approach was evaluated using several performance criteria, including Accuracy, Precision, Recall, and F1-score. The accuracy of
the ensemble classifiers was higher than that of the individual models. This improvement can be attributed to ensemble learning
strategies that leverage diverse learning mechanisms with varying capabilities. By combining these strategies, the authors
enhanced the reliability of the predictions while reducing the occurrence of classification errors. The experimental results show
that the framework can improve the efficiency of the Intrusion Detection System, achieving an accuracy rate of 0.9863.
[2] Cristiano Antonio de Souza,Carlos Becker Westphall ”Two-step ensemble approach for intrusion detection and identification
in IoT and fog computing environments”, proposed a system in march 2022 Due to Internet of Things devices resource
limitations, security often does not receive enough attention. Intrusion detection approaches are important for identifying attacks
and taking appropriate countermeasures for each specific threat. This work presents a two-step approach for intrusion detection
and identification. The first step performs a traffic analysis with an Extra Tree binary classifier. Events detected as intrusive are
analyzed in the second stage by an ensemble approach consisting of Extra Tree, Random Forest, and Deep Neural Network. An
extensive evaluation was performed with the Bot-IoT, IoTID20, NSL-KDD, and CICIDS2018 intrusion datasets. The
experiments demonstrated that the proposed approach could achieve similar or superior performance to other machine learning
techniques and state-of-the-art approaches in all databases, demonstrating the robustness of the proposed approach. The Internet
of Things (IoT) is expanding quickly and is becoming increasingly important in our daily lives. IoT nodes can connect to the
internet by using an IP address. As a result, users of various social networking platforms will be able to connect to and share
devices. There is a concern about security and privacy with this broad range of IoT applications. The IoT is made up of a wide
variety of devices that generally have a small size and internet connectivity as their main characteristics [1].
Due to their small size, these devices generally have limited resources, low processing capacity, and limited memory. Thus, to
carry out the storage, processing, and analysis of the generated data, it is necessary to send them to a device with greater
computational power. The high traffic generated by these devices and the latency hamper sending this data to the cloud.
[3] Umaira Ahad, Yashwant Singh, and Pooja Anand, “Intrusion Detection System Model for IoT Networks Using Ensemble
Learning”, proposed a system in 2022. The capacity to identify breaches and malicious activity inside Internet of Things (IoT)
networks is important for network infrastructure resilience as the dependence on IoT devices and services grows. Intrusion
detection systems (IDS) are basic components of network security. IDSs monitor and analyze the activity of a system in a network
to identify intrusions. Existing intrusion detection systems (IDS) gather and utilize large amounts of data with irrelevant,
unnecessary, and unsuitable characteristics, resulting in long detection times and low accuracy. In this paper, we present an IDS
model based on a Random Forest (RF) classifier. The NSL-KDD dataset is used to test the performance of the model, and
satisfactory performance is obtained in terms of accuracy, detection rate, and false alarm rate. The proposed model has attained an average
accuracy of 99.3% and 98% for binary classification and multiclass classification, respectively. To demonstrate the efficacy of
the suggested model, its accuracy was compared with some existing approaches that utilize other models such as AIDS, ELM
and PCA, MapReduce-based hybrid architecture, and DRNN. In this way, it can provide faster processing and response for
IoT devices. Currently, many electronic devices can be connected to the Internet and provide data and services to users. The
Internet of Things (IoT) environments are evolving and becoming popular. The number of devices connected to the Internet
continues to grow.
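The works surveyed above report their results as accuracy, precision, recall, F1-score, detection rate, and false alarm rate. As a minimal sketch of how these criteria are usually computed for a binary detector, the following Python functions derive them from a confusion matrix. The label names ("attack", "normal") and the sample predictions are illustrative assumptions and are not taken from any of the cited datasets.

```python
def confusion_counts(y_true, y_pred, positive="attack"):
    """Count true/false positives and negatives for a binary detector."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def classification_metrics(tp, fp, fn, tn):
    """Return the usual IDS evaluation criteria from confusion counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # also called detection rate
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "false_alarm_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Illustrative labels only (hypothetical, not experimental data).
y_true = ["attack", "attack", "attack", "normal", "normal", "normal", "normal", "normal"]
y_pred = ["attack", "attack", "normal", "attack", "normal", "normal", "normal", "normal"]
metrics = classification_metrics(*confusion_counts(y_true, y_pred))
```

Recall and detection rate are the same quantity here, while the false alarm rate is the share of normal traffic that is wrongly flagged.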
IoT has enormous potential, but it also comes with a lot of difficulties and worries, especially when it comes to security and
privacy. Because of their interconnectedness, Internet of Things devices have a larger attack surface, which leaves them open to
online dangers including malware outbreaks, hacking, and data breaches. Furthermore, privacy and data protection are put at risk
by the massive volumes of sensitive data created by IoT devices, which makes strong security protocols and legal frameworks
necessary to secure user data.
Standards compliance and interoperability are two more issues that the IoT ecosystem must deal with. Making sure that a
multitude of devices from various manufacturers work together seamlessly and with a variety of communication protocols can
be a challenging undertaking. Furthermore, as IoT installations continue to increase in size and complexity, scalability and
sustainability become more difficult to achieve. To reduce their negative effects on the environment, effective resource
management and infrastructure optimization are needed.
Support Vector Machine Algorithm - Support Vector Machine or SVM is one of the most popular Supervised Learning
algorithms, which is used for Classification as well as Regression problems. However, primarily, it is used for Classification
problems in Machine Learning. The goal of the SVM algorithm is to create the best line or decision boundary that can segregate
n-dimensional space into classes so that we can easily put the new data point in the correct category in the future. This best
decision boundary is called a hyperplane.
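To make the idea of a separating hyperplane concrete, the following is a minimal sketch of a linear SVM in plain Python, trained by sub-gradient descent on the regularised hinge loss. It is an illustration under stated assumptions, not the implementation used in this project: the two-feature training points are hypothetical and already scaled around zero, the labels are +1 (intruder) and -1 (authorized), and the hyperparameters `lam`, `lr`, and `epochs` are arbitrary choices.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=200):
    """Fit a hyperplane w.x + b = 0 by sub-gradient descent on the hinge loss.

    Labels in y must be +1 or -1.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:
                # Point is misclassified or inside the margin:
                # move the hyperplane away from it.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Point is safely classified: only apply the regulariser.
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def svm_predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x lies."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical, already-scaled feature vectors (not project data).
X_train = [(-2, -1), (-1, -2), (-2, -2), (2, 1), (1, 2), (2, 2)]
y_train = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(X_train, y_train)
```

A new data point is then assigned to a class simply by the sign of `w.x + b`, which is what `svm_predict` returns.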
Random Forest Algorithm - In machine learning, random forest is a potent ensemble learning technique that is applied to both
classification and regression problems. It is a member of the decision tree based algorithm family and is used extensively because
of its efficiency, simplicity, and resilience.
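The ensemble idea behind random forest can be sketched in a few lines of plain Python. The toy version below is a simplification for illustration only: each "tree" is a one-level decision stump, trained on a bootstrap sample and a single randomly chosen feature, and the forest predicts by majority vote. The feature meanings (message rate, payload size), the sample values, and the function names are hypothetical assumptions, not the features or code used in this project.

```python
import random
from collections import Counter


def fit_stump(X, y, feature):
    """Pick the threshold on one feature that misclassifies the fewest samples."""
    majority = Counter(y).most_common(1)[0][0]
    best = (len(y) + 1, None, majority, majority)
    values = sorted({row[feature] for row in X})
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [lab for row, lab in zip(X, y) if row[feature] <= threshold]
        right = [lab for row, lab in zip(X, y) if row[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = (sum(lab != left_label for lab in left)
                  + sum(lab != right_label for lab in right))
        if errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    return (feature,) + best[1:]


def stump_predict(stump, row):
    feature, threshold, left_label, right_label = stump
    if threshold is None:  # the sample had a single feature value
        return left_label
    return left_label if row[feature] <= threshold else right_label


def fit_forest(X, y, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each using one random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # bootstrap sample
        feature = rng.randrange(len(X[0]))                    # random feature
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest


def forest_predict(forest, row):
    """Majority vote over all stumps."""
    return Counter(stump_predict(s, row) for s in forest).most_common(1)[0][0]


# Hypothetical samples: (messages per second, payload size in bytes).
X_train = [(1, 20), (2, 22), (1, 25), (2, 21), (3, 24),
           (40, 200), (55, 180), (60, 220), (45, 210), (50, 190)]
y_train = ["authorized"] * 5 + ["intruder"] * 5
forest = fit_forest(X_train, y_train)
```

A full random forest grows deeper trees and samples a feature subset at every split, but the bootstrap sampling and voting steps are the same as in this sketch.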
The present system for our environmental monitoring equipment is based on traditional communication protocols with no
sophisticated security features. Currently, two IoT devices equipped with DHT11 sensors for temperature and humidity
monitoring are connected in a simple manner. These devices interact using a normal datagram protocol, passing data through a
Mosquitto broker that uses the MQTT protocol. The devices collect data from their surroundings and exchange it with the cloud.
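As a rough illustration of the data exchanged through the broker, the sketch below encodes one DHT11 reading as a JSON payload and decodes it on the receiving side. The topic name, the field names, and the use of JSON are assumptions made for this example; the report does not specify the message format. The actual publish call is left to whichever MQTT client library is used.

```python
import json

TOPIC = "sensors/dht11"  # hypothetical topic name


def encode_reading(device_id, temperature_c, humidity_pct):
    """Serialise one sensor reading as a UTF-8 JSON payload."""
    message = {
        "device": device_id,
        "temperature": temperature_c,
        "humidity": humidity_pct,
    }
    return json.dumps(message).encode("utf-8")


def decode_reading(payload):
    """Parse a payload produced by encode_reading back into a dictionary."""
    data = json.loads(payload.decode("utf-8"))
    return {
        "device": str(data["device"]),
        "temperature": float(data["temperature"]),
        "humidity": float(data["humidity"]),
    }


payload = encode_reading("node-1", 29.0, 61.0)
reading = decode_reading(payload)
```

Keeping the payload self-describing in this way lets the subscriber tell the two devices apart without relying on separate topics.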
V. PROPOSED SYSTEM
The proposed system aims to greatly improve the security of our IoT environmental monitoring devices by including sophisticated
features, notably an Intrusion Detection System (IDS) based on machine learning methods. Two IoT devices equipped with
DHT11 sensors for temperature and humidity measurement remain at the heart of environmental monitoring. However, the
connectivity between these devices undergoes significant improvement. The datagram connection is preserved; however, data is now
transmitted over a Mosquitto broker using the MQTT protocol, giving a more secure communication channel than the previous
method.
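[Figure: Flowchart of the proposed system. Labels: Start, Publish Data (MQTT Protocol), Process Data with ML Model, Prediction?, Authorized, Intruder, Send Alert.]
[Figure: Block diagram of the proposed system. Labels: Publisher, Telegram App, DHT11 Sensor (two), MQTT, WiFi, NodeMCU, PC (ML Code), Subscriber.]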
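The subscriber-side flow (process the data with the ML model, check the prediction, and send an alert for an intruder) can be sketched as a single function. This is a minimal sketch under stated assumptions: `placeholder_predict` is a hypothetical stand-in for the trained SVM or Random Forest model, and the threshold it uses is arbitrary; `send_alert` is any callable that delivers the message.

```python
def handle_message(reading, predict, send_alert):
    """Classify one received reading and raise an alert if it looks intrusive."""
    label = predict(reading)
    if label == "intruder":
        send_alert(f"Intrusion suspected: {reading}")
    return label


def placeholder_predict(reading):
    """Hypothetical stand-in for the trained model (not the project's classifier)."""
    return "intruder" if reading["temperature"] > 60 else "authorized"


# Collect alerts in a list instead of sending them, for demonstration.
alerts = []
first = handle_message({"temperature": 28.0, "humidity": 60.0},
                       placeholder_predict, alerts.append)
second = handle_message({"temperature": 95.0, "humidity": 5.0},
                        placeholder_predict, alerts.append)
```

Passing the model and the alert channel in as arguments keeps the decision logic independent of both, so either can be replaced without touching the flow.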
The detailed list of software requirements for the proposed system is as follows:
● Operating System: Windows 10 or Windows 11 (64-bit preferred).
● Programming Language: Python 3.x, C.
● Machine Learning algorithms: SVM, Random Forest
When the system detects any intrusion, it alerts us via Telegram, as shown in the diagram below.
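One way to deliver such an alert is the `sendMessage` method of the Telegram Bot API, called over HTTPS. The sketch below, using only the Python standard library, is an assumption about how the alert could be sent and is not taken from the project code. The bot token and chat ID shown are placeholders that must be replaced with real values.

```python
import urllib.parse
import urllib.request


def build_alert_request(bot_token, chat_id, text):
    """Build the HTTPS POST request for the Bot API sendMessage method."""
    url = f"https://api.telegram.org/bot{bot_token}/sendMessage"
    data = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(url, data=data, method="POST")


def send_alert(bot_token, chat_id, text, timeout=10):
    """Send the alert and report whether Telegram answered with HTTP 200."""
    request = build_alert_request(bot_token, chat_id, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status == 200


# Placeholder credentials (hypothetical); nothing is sent until send_alert is called.
request = build_alert_request("TOKEN", "123", "Intrusion detected")
```

Separating request construction from sending makes the alert format easy to check without network access.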
The figure shows the temperature over time. The x-axis is labeled "Time" and the y-axis is labeled "Temperature (°C)".
The figure shows the humidity over time. The x-axis is labeled "Time" and the y-axis is labeled "Humidity (%)".
IX. CONCLUSION
The project demonstrates a comprehensive strategy for enhancing security in the context of the Internet of Things (IoT). By
integrating DHT11-equipped IoT devices, machine learning-powered Intrusion Detection Systems (IDS), and secure communication
protocols, the system improves the resilience and robustness of networked settings. Additionally, the MQTT protocol and datagram
connection provide secure and effective data transfer between IoT devices and the server. The system’s responsiveness and
efficiency in identifying and addressing security events are supported by this dependable communication architecture. The machine
learning algorithms enable the system to proactively detect and mitigate potential security threats by analyzing incoming data
streams for patterns suggestive of intrusion or anomalous behavior. Through the utilization of machine learning and the combined
intelligence of IoT devices, the system improves situational awareness and gives stakeholders the ability to take proactive
measures to protect against new risks.
REFERENCES
[1] Sajad M. Khan and Faheem Syeed Masoodi, “Intrusion Detection System for IoT Environment using Ensemble Approaches,”
presented 15-17 March 2023, added to IEEE Xplore 04 May 2023. Electronic ISBN: 978-93-80544-47-2; Print on Demand (PoD)
ISBN: 978-1-6654-7703-1.
[2] Cristiano Antonio de Souza, Carlos Becker Westphall, and Renato Bobsin Machado, “Two-step ensemble approach for
intrusion detection and identification in IoT and fog computing environments,” Computers & Electrical Engineering, Elsevier,
2022.
[3] Umaira Ahad, Yashwant Singh, Pooja Anand, Zakir Ahmad Sheikh, and Pradeep Kumar Singh, “Intrusion Detection
System Model for IoT Networks Using Ensemble Learning,” Journal of Interconnection Networks, vol. 22, no. 03.