
Project Report 02

The project report titled 'Cyber Threat Intelligence Mining using AI' explores the application of Artificial Intelligence to enhance the analysis of cyber threat intelligence (CTI) data for improved threat detection and response. By employing machine learning and natural language processing, the project aims to automate the extraction of actionable insights from vast datasets, thereby empowering security teams to proactively counter cyber threats. The report highlights the significance of AI in transforming CTI mining to address the challenges posed by the evolving cyber threat landscape.

CYBER THREAT INTELLIGENCE

MINING USING AI

A PROJECT REPORT

Submitted by

ANANDHI PRIYA T (723720104004)


NITHYA M (723720104038)
SOWNDARYA M (723720104056)

In partial fulfillment for the award of the degree

of

BACHELOR OF ENGINEERING

IN

COMPUTER SCIENCE AND ENGINEERING

VSB COLLEGE OF ENGINEERING TECHNICAL CAMPUS

ANNA UNIVERSITY: CHENNAI 600 025

MAY – 2024
ANNA UNIVERSITY: CHENNAI 600 025

BONAFIDE CERTIFICATE

Certified that this project report “Cyber Threat Intelligence Mining using AI” is the bonafide
work of the following students, ANANDHI PRIYA T [723720104004], NITHYA M [723720104038],
and SOWNDARYA M [723720104056], who carried out the project work under my supervision.

SIGNATURE
Mrs. V. RADHA, M.E.,
HEAD OF THE DEPARTMENT,
Department of Computer Science and Engineering,
V.S.B College of Engineering Technical Campus,
Coimbatore-642109

SIGNATURE
Mrs. M. RAMADEVI, M.E.,
ASSISTANT PROFESSOR,
Department of Computer Science and Engineering,
V.S.B College of Engineering Technical Campus,
Coimbatore-642109

Submitted for the university project viva voce held on......................................at


V.S.B College of Engineering Technical Campus Coimbatore.

INTERNAL EXAMINER EXTERNAL EXAMINER


ACKNOWLEDGEMENT

First of all, we extend our heartfelt gratitude to the management of V.S.B
College of Engineering Technical Campus and its Chairman, Mr. V. S. Balsamy,
B.Sc., L.L.B., for providing us with all sorts of support in the completion of this project.

We record our indebtedness to our principal, Dr. V. Velmurugan, Ph.D., for his
guidance and sustained encouragement towards the successful completion of the project.

We are highly grateful to Mrs. V. Radha, M.E., Head of the Department,
Department of Computer Science and Engineering, for her valuable suggestions
throughout the course of this project.

We owe our gratitude to our project coordinator, Dr. T. Kalaikumaran,
Professor, Department of Computer Science and Engineering, for his unstinted
encouragement and for his timely support and guidance till the completion of
the project work.

We also extend our sincere thanks to our project guide, Mrs. M. Ramadevi, M.E.,
Assistant Professor, Department of Computer Science and Engineering, for her kind
support and timely assistance, which led us to the successful completion of our
project.

We extend our heartfelt salutation to our beloved parents and friends, who have
always been an integral part in helping us through tough times, and to all teaching
and non-teaching staff for their moral support, which made our project a success.
ABSTRACT

Cybersecurity professionals face a relentless barrage of ever-evolving threats.


Extracting valuable insights from vast amounts of cyber threat intelligence (CTI)
data is crucial for proactive defense. However, manual analysis of this complex
data is laborious and hinders timely threat detection. This project explores the
application of Artificial Intelligence (AI) for mining actionable intelligence from
CTI data. We propose leveraging AI techniques like machine learning and natural
language processing to automate data processing, identify hidden patterns, and
accelerate threat analysis. By empowering security teams with faster and more
comprehensive insights, this AI-powered CTI mining approach aims to fortify
defense against the ever-present cyber threat landscape. By leveraging AI
techniques, we aim to automate critical tasks associated with CTI processing and
analysis. Machine learning algorithms will be employed to sift through massive
data sets, identifying subtle nuances, anomalies, and previously unknown threats
that might escape human detection. This automation will free up security analysts
to focus on higher-level tasks like threat hunting and incident response.
Furthermore, AI's ability to analyse data at lightning speed will significantly
accelerate the process of extracting actionable intelligence from CTI. This
empowers security teams to make informed decisions and respond swiftly to
emerging threats, ultimately fortifying their defences against a constantly evolving
cyber threat landscape. In essence, this project bridges the gap between the vast
potential of CTI data and the practical limitations of human analysis. By
employing AI-powered CTI mining, we aim to transform this data into a powerful
weapon against cyber threats, enabling security teams to proactively counter
malicious actors and safeguard critical systems and information.
TABLE OF CONTENTS

CHAPTER    TITLE

           ABSTRACT
           LIST OF TABLES
           LIST OF FIGURES
           LIST OF ABBREVIATIONS
1          INTRODUCTION
2          LITERATURE SURVEY
3          SYSTEM SPECIFICATION
           3.1 HARDWARE REQUIREMENTS
           3.2 SOFTWARE REQUIREMENTS
           3.3 HARDWARE DESCRIPTION
               3.3.1 Hard Disk
               3.3.2 RAM
               3.3.3 Processor
           3.4 SOFTWARE REQUIREMENT
               3.4.1 Windows 10 (64 bit)
               3.4.2 Python
               3.4.3 Anaconda
4          SYSTEM ANALYSIS
           4.1 EXISTING SYSTEM
               4.1.1 Disadvantage
           4.2 PROPOSED SYSTEM
               4.2.1 Advantage
               4.2.2 Block Diagram
               4.2.3 System Architecture
5          MODULE DESCRIPTION
           5.1 LIST OF MODULES
           5.2 DATA COLLECTION
           5.3 PRE-PROCESSING
           5.4 FEATURE EXTRACTION
           5.5 MODULE PRESCRIPTION
           5.6 USER-DRIVEN THREAT PREDICTION
6          RESULT
7          CONCLUSION
8          FUTURE ENHANCEMENT
           REFERENCE
           APPENDIX 1
           APPENDIX 2
LIST OF FIGURES

FIG. NO    TITLE

3.3.2      RAM
3.3.3      Processor
3.4.1      Windows 10
3.4.2      Python
3.4.3      Anaconda
4.2.2      Threat Intelligence Life Cycle
4.2.3      System Architecture
6.1        Home Page
6.2        Create User Account
6.3        Sign Page
6.4        Phishing Prediction Page
6.5        SQL Injection Prediction Page
6.6        DoS Attack Prediction Page
6.7        Ransomware Prediction Page


LIST OF ABBREVIATIONS

AI Artificial Intelligence
CTI Cyber Threat Intelligence
ML Machine Learning
NLP Natural Language Processing
CHAPTER 1

INTRODUCTION

The ever-expanding digital landscape has unfortunately become a breeding ground for
cyber threats. The complexity and volume of cyberattacks are constantly evolving,
making it increasingly challenging for security professionals to keep pace. Traditional
methods of threat detection are often overwhelmed by the sheer amount of data
generated on networks. This project explores the power of Artificial Intelligence (AI)
in Cyber Threat Intelligence (CTI) mining.

By leveraging AI techniques, we aim to develop a more robust and automated
approach to extracting valuable insights from vast amounts of security data. This
intelligence can then be used to proactively identify and predict cyber threats, enhance
threat actor profiling, and automate threat response. AI can analyze historical attack data
and identify patterns that indicate impending threats. AI can uncover connections
between different attacks and threat actors, providing a more comprehensive
understanding of their motivations and tactics. By automating the analysis of security
alerts, AI can expedite threat response times, minimizing potential damage. This
project holds the potential to revolutionize the way we approach cybersecurity. By
harnessing the power of AI, we can gain a significant advantage in the fight against
cybercrime. In this rapidly evolving digital landscape, the battle against cyber threats
requires more than just human vigilance; it demands the power of intelligent machines.
Cyber Threat Intelligence Mining, driven by AI, represents a paradigm shift in how we
detect, analyze, and mitigate cyber risks. At its core, it leverages advanced algorithms
and machine learning models to sift through vast troves of data, extracting actionable
insights that empower organizations to stay ahead of malicious actors. The need for
robust and proactive cybersecurity measures has never been more
critical. Amidst this escalating threat landscape, the fusion of Artificial Intelligence
(AI) with cyber threat intelligence (CTI) mining emerges as a formidable tool in the
hands of defenders. Cyber threat intelligence mining using AI represents a paradigm
shift in cybersecurity operations. It empowers organizations to extract actionable
insights from vast volumes of disparate data sources, enabling them to anticipate,
detect, and mitigate threats with unparalleled speed and accuracy. By harnessing the
capabilities of AI, such as machine learning, natural language processing, and
predictive analytics, CTI mining transcends traditional approaches, offering a
proactive defense mechanism against evolving cyber threats. This paper explores the
synergy between AI and CTI mining, delving into the methodologies, technologies,
and applications that underpin this innovative approach. From threat detection and
attribution to vulnerability assessment and incident response, AI-driven CTI mining
revolutionizes every facet of the cybersecurity lifecycle. Moreover, it adapts and
evolves in real-time, continuously learning from new data patterns and emerging
threats, thereby fortifying the resilience of organizations against cyber adversaries.

CHAPTER 2

LITERATURE SURVEY

Nan Sun, Ming Ding, Jiaojiao Jiang, Weifang Xu, Xiaoxing Mo, Yonghang Tai, and
Jun Zhang, “Cyber Threat Intelligence Mining for Proactive Cybersecurity Defense:
A Survey and New Perspectives,” IEEE Communications Surveys & Tutorials,
vol. 25, no. 3, Third Quarter 2023.

Today’s cyber-attacks have become more severe and frequent, which calls for a new
line of security defenses to protect against them. The dynamic nature of new-
generation threats, which are evasive, resilient, and complex, makes traditional
security systems based on heuristics and signatures struggle to match. Organizations
aim to gather and share real-time cyber threat information and then turn it into threat
intelligence for preventing attacks or, at the very least, responding quickly in a
proactive manner. Cyber Threat Intelligence (CTI) mining, which uncovers,
processes, and analyzes valuable information about cyber threats, is booming.
However, most organizations today mainly focus on basic use cases, such as
integrating threat data feeds with existing network and firewall systems, intrusion
prevention systems, and Security Information and Event Management systems
(SIEMs), without taking advantage of the insights that such new intelligence can
deliver. In order to make the most of CTI so as to significantly strengthen security
postures, we present a comprehensive review of recent research efforts on CTI
mining from multiple data sources in this article. Specifically, we provide and
devise a taxonomy to summarize the studies on CTI mining based on the intended
purposes (i.e., cybersecurity-related entities and events, cyber-attack tactics,
techniques and procedures, profiles of hackers, indicators of compromise,
vulnerability exploits and malware implementation, and threat hunting), along with
a comprehensive review of the current state-of-the-art. Lastly, we discuss research
challenges and possible future research directions for CTI mining.
Shivangi Gupta, A. Sai Sabitha, and Ritu Punhani, “Cyber Security Threat
Intelligence using Data Mining Techniques and Artificial Intelligence,”
International Journal of Recent Technology and Engineering (IJRTE), ISSN:
2277-3878, Volume-8, Issue-3, September 2019.

Threat intelligence is the procurement of evidence-based knowledge about current
or potential threats. Its benefits include improved efficiency and greater
effectiveness in analytical and prevention capabilities. Cybersecurity is a serious
concern for numerous organizations because most of them use Internet-connected
data devices, which open doors for cyber attackers. Good threat intelligence in the
cyber sphere calls for a knowledge base of threat information and a thoughtful way
to represent this knowledge. This study presents a clear rationale for the significant
artificial intelligence (AI) techniques used for recognizing a cyber-attack. Data
analysis can be formulated to guide industries and Internet-connected systems, such
as smartphones or robotic factories, on what to do when an incident occurs.
AI techniques analyze past incidents, summarize knowledge from experts, and
continue to adapt as they learn from new incidents. In addition, various data mining
approaches used to improve the reliability of threat information in cybersecurity
data are studied. The study concludes that AI will automate the collation of
machine-readable external threats and will improve the efficiency and accuracy of
the data for each smart organization’s specific framework.

Md Sharon Abu, Siti Rahayu Selamat, Aswani Ariffin, and Robia Yusof (Malaysian
Computer Emergency Response Team, Cybersecurity Malaysia; Faculty of Information
Technology and Communication, University Technical Malaysia Melaka, Malaysia),
“Cyber Threat Intelligence – Issue and Challenges,” Indonesian Journal of
Electrical Engineering and Computer Science, Vol. 10, No. 1, April 2018.

Today the threat landscape is evolving at a rapid rate, with many organizations
continuously facing complex and malicious cyber threats. Cybercriminals are better
skilled, organized, and funded than before. Cyber Threat Intelligence (CTI) has become a
hot topic and is under consideration by many organizations to counter the rise of
cyber-attacks. The aim of this paper is to review the existing research related to CTI.
Through the literature review process, the most basic question of what CTI is gets examined
by comparing existing definitions to find common ground or disagreements. It is found
that both organizations and vendors lack a complete understanding of what information
is considered to be CTI, hence more research is needed in order to define CTI. This
paper also identifies current CTI products and services, which include threat intelligence
data feeds, threat intelligence standards, and tools that are used in CTI. There is an
effort by specific industries to share only relevant threat intelligence data feeds, such
as the Financial Services Information Sharing and Analysis Center (FS-ISAC), which
collaborates on critical security threats facing the global financial services sector only.
Meanwhile, research and development centers such as MITRE are working on developing
standard formats (e.g., STIX, TAXII, CybOX) for threat intelligence sharing to solve
interoperability issues between threat-sharing peers. Based on the review of CTI
definitions, standards, and tools, this paper identifies four research challenges in cyber
threat intelligence and analyses contemporary work carried out in each. With
organizations flooded with voluminous threat data, the requirement for qualified
threat data analysts to fully utilize CTI and turn the data into actionable intelligence
becomes more important than ever. Data quality is not a new issue, but with the
growing adoption of CTI, further research in this area is needed.

Syed Rameem Zahra, Mohammad Ahsan Chishti, Asif Iqbal Baba, and Fan Wu,
“Detecting Covid-19 chaos driven phishing/malicious URL attacks by a fuzzy logic
and data mining-based intelligence system,” Egyptian Informatics Journal, 23 (2),
197-214, 2022.
With confusion and uncertainty ruling the world, 2020 created near-perfect conditions
for cybercriminals. As businesses virtually eliminated in-person experiences, the
COVID-19 pandemic changed the way we live and caused a mass migration to digital
platforms. However, this shift also made people more vulnerable to cyber-crime.
Victims are being targeted by attackers for their credentials or financial rewards, or
both. This is because the Internet itself is inherently difficult to secure, and the
attackers can code in a way that exploits its flaws. Once the attackers gain root access
to the devices, they have complete control and can do whatever they want.
Consequently, taking advantage of highly unprecedented circumstances created by the
Covid-19 event, cybercriminals launched massive phishing, malware, identity theft,
and ransomware attacks. Therefore, if we wish to save people from these frauds in
times when millions have already been tipped into poverty and the rest are trying hard
to sustain, it is imperative to curb these attacks and attackers. This paper analyses the
impact of Covid-19 on various cyber-security related aspects and sketches out the
timeline of Covid-19 themed cyber-attacks launched globally to identify the modus
operandi of the attackers and the impact of attacks. It also offers a thoroughly
researched set of mitigation strategies which can be employed to prevent the attacks in
the first place. Moreover, this manuscript proposes a fuzzy logic and data mining-based
intelligence system for detecting Covid-19 themed malicious URL/phishing attacks.
The performance of the system has been evaluated against various malicious/phishing
URLs, and it was observed that the proposed system is a viable solution to this
problem.

M. R. Rahman, R. Mahdavi-Hezaveh, and L. Williams, “A literature review on
mining cyberthreat intelligence from unstructured texts,” in Proc. Int. Conf. Data
Min. Workshops (ICDMW), 2020, pp. 516–525.
Cyberthreat defense mechanisms have become more proactive these days, leading
to the increasing incorporation of cyberthreat intelligence (CTI). Cybersecurity
researchers and vendors are powering the CTI with large volumes of unstructured
textual data containing information on threat events, threat techniques, and tactics.
Hence, extracting cyberthreat-relevant information through text mining is an effective
way to obtain actionable CTI to thwart cyberattacks. The goal of this research is to aid
cybersecurity researchers understand the source, purpose, and approaches for mining
cyberthreat intelligence from unstructured text through a literature review of peer-
reviewed studies on this topic. We perform a literature review to identify and analyze
existing research on mining CTI. By using search queries in the bibliographic
databases, 28,484 articles are found. From those, 38 studies are identified through the
filtering criteria which include removing duplicates, non-English, non-peer-reviewed
articles, and articles not about mining CTI. We find that the most prominent sources of
unstructured threat data are the threat reports, Twitter feeds, and posts from hackers
and security experts. We also observe that security researchers mined CTI from
unstructured sources to extract Indicator of Compromise (IoC), threat-related topic, and
event detection. Finally, natural language processing (NLP) based approaches: topic
classification; keyword identification; and semantic relationship extraction among the
keywords are mostly availed in the selected studies to mine CTI information from
unstructured threat sources.

T. D. Wagner, K. Mahbub, E. Palomar, and A. E. Abdallah, “Cyber threat
intelligence sharing: Survey and research directions,” Comput. Security, vol. 87,
Nov. 2019, Art. no. 101589.
Cyber Threat Intelligence (CTI) sharing has become a novel weapon in the arsenal of
cyber defenders to proactively mitigate increasing cyber-attacks. Automating the
process of CTI sharing, and even the basic consumption, has raised new challenges for
researchers and practitioners. This extensive literature survey explores the current
state-of-the-art and approaches different problem areas of interest pertaining to the
larger field of sharing cyber threat intelligence. The motivation for this research stems
from the recent emergence of sharing cyber threat intelligence and the involved
challenges of automating its processes. This work comprises a considerable number of
articles from academic and gray literature, and focuses on technical and non-technical
challenges. Moreover, the findings reveal which topics were widely discussed, and
hence considered relevant by the authors and cyber threat intelligence sharing
communities.

CHAPTER 3

SYSTEM SPECIFICATION

3.1 HARDWARE REQUIREMENTS:


• Hard Disk
• RAM
• Processor

3.2 SOFTWARE REQUIREMENTS

• Windows 10 (64 bit)
• Python
• Anaconda

3.3 HARDWARE DESCRIPTION

3.3.1 Hard Disk

The hard disk needs to be compatible with the interface of the computer system
it will be installed in. Common interfaces include SATA (Serial ATA), SAS (Serial
Attached SCSI), and PCIe (Peripheral Component Interconnect Express).
The physical size of the hard disk must match the form factor supported by the
computer chassis or storage enclosure. Common form factors for desktop computers
include 3.5-inch and 2.5-inch drives, while laptops typically use 2.5-inch or smaller
form factors.
The storage capacity of the hard disk should meet the requirements of the intended use.
Hard disks are available in a wide range of capacities, from several hundred gigabytes
to multiple terabytes.

3.3.2 RAM

AI algorithms used in CTI mining often require substantial amounts of data to be
processed simultaneously. RAM provides the necessary memory space for storing and
manipulating this data efficiently. It allows AI models to access and analyze large
datasets quickly, enabling faster processing and extraction of insights from diverse
CTI data.

Fig 3.3.2: RAM

Machine learning models utilized in CTI mining, such as neural networks, often
require significant memory resources during both training and inference stages. During
model training, RAM is used to store training data batches, model parameters, and
intermediate computations. In the inference stage, RAM is utilized to load trained
models and process incoming data streams for real-time threat detection and analysis.
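
As a rough illustration of how these memory demands add up, the sketch below estimates the RAM needed for one training step. It is a back-of-the-envelope calculation only: the function name, the 4-byte (32-bit float) value size, and all the example figures are assumptions chosen for illustration, not measurements from this project.

```python
def estimate_training_ram_bytes(batch_size, features_per_sample, num_parameters,
                                activations_per_sample, bytes_per_value=4):
    """Rough RAM estimate for one training step (all inputs are assumptions)."""
    # One batch of training data held in memory.
    batch = batch_size * features_per_sample * bytes_per_value
    # Model parameters plus one gradient value per parameter.
    parameters = num_parameters * bytes_per_value
    gradients = num_parameters * bytes_per_value
    # Intermediate computations kept for every sample in the batch.
    activations = batch_size * activations_per_sample * bytes_per_value
    return batch + parameters + gradients + activations


# Hypothetical model: 1 million parameters, batches of 256 samples.
total = estimate_training_ram_bytes(
    batch_size=256, features_per_sample=100,
    num_parameters=1_000_000, activations_per_sample=512,
)
print(f"{total / 2**20:.2f} MiB")
```

Real frameworks add overhead (optimizer state, buffers, the Python runtime itself), so such a figure is best read as a lower bound.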

3.3.3 Processor
Processors handle the cleaning and normalization of large datasets containing
threat intelligence. This may involve handling inconsistencies, removing duplicates,
and converting data into a format suitable for AI algorithms.

Fig 3.3.3 Processor

Processors perform calculations to extract relevant features from raw data. These
features could be network traffic patterns, malware code characteristics, or attacker
behavior indicators. Processors power the training of machine learning and deep
learning models. This involves running complex algorithms that learn from vast
amounts of threat intelligence data to identify patterns and anomalies indicative of
cyber threats.

3.4 SOFTWARE REQUIREMENT

3.4.1 Windows 10 (64 bit)


Windows 10 supports popular AI frameworks like TensorFlow and PyTorch,
allowing developers to build and train AI models for CTI mining. Security analysts can
leverage pre-built tools and libraries within Windows 10 to analyze threat intelligence
data.

Fig 3.4.1 Windows 10
Many SIEM systems, which aggregate and analyze security data for threat detection,
are compatible with Windows 10. These SIEM systems can integrate with AI-powered
threat intelligence tools, providing a centralized platform for security professionals.

3.4.2 Python
Python boasts a vast collection of libraries like TensorFlow, PyTorch, scikit-learn,
and Keras. These libraries provide pre-built functions and tools for building, training,
and deploying machine learning and deep learning models crucial for CTI mining tasks
like anomaly detection and threat classification.

Fig 3.4.2 Python

Libraries and tools like Scapy, Nmap, and MISP (Malware Information Sharing Platform)
provide functionalities for network traffic analysis, vulnerability scanning, and threat
data integration. These tools integrate with Python's AI libraries, creating a
comprehensive environment for CTI mining.

3.4.3 Anaconda
Anaconda reduces the need for manual installation and configuration of the various
Python libraries required for AI and data science in CTI mining. It comes pre-loaded
with essential libraries like scikit-learn, Pandas, and NumPy, and further packages
such as TensorFlow can be added through its conda package manager, saving
security professionals valuable time and effort.

Fig 3.4.3 Anaconda


Anaconda allows the creation and management of isolated virtual environments for
different CTI mining projects. This ensures project-specific dependencies and avoids
conflicts between libraries used in different projects.

CHAPTER 4

SYSTEM ANALYSIS

4.1 EXISTING SYSTEM

In the current cybersecurity landscape, organizations rely heavily on traditional
security systems that employ heuristics, signatures, and basic threat data integration.
These systems, while providing some level of protection, struggle to keep up with the
dynamic and evolving nature of modern cyber threats. They often lack the depth of
analysis and proactive capabilities needed to effectively counter sophisticated attacks.
Basic threat data feeds are integrated into network and firewall systems, intrusion
prevention systems, and Security Information and Event Management systems
(SIEMs), but their insights are limited and reactive.

4.1.1 Disadvantage

• Limited Threat Detection
• Complex Cyber Threat
• Lack of Comprehensive
• Difficulty in Identifying Threat
• Delaying Mitigation Effort

4.2 PROPOSED SYSTEM

The proposed system aims to revolutionize cybersecurity through the
implementation of an advanced Cyber Threat Intelligence (CTI) mining framework.
This framework will leverage diverse data sources and cutting-edge analysis
techniques to provide organizations with a proactive and comprehensive approach to
threat detection, prevention, and response. By processing and analyzing CTI insights,
the system will empower organizations to identify emerging threats, profile hackers,
understand attack tactics, and make informed decisions to strengthen their security
posture. This proposed system will bridge the gaps in the existing cybersecurity
landscape, enhancing the ability to combat complex and evolving cyber threats
effectively. It also includes modules to detect potential dangers and provide an
efficient means of communication in case of emergency.

4.2.1 Advantage

• Customized Threat Intelligence Feeds
• Comprehensive Threat Awareness
• Timely Detection of Threat
• Continuous Monitoring and Analysis
• Integration with Security Operation

4.2.2 Block Diagram

Fig 4.2.2 Threat Intelligence Life Cycle

4.2.3 System Architecture

Fig 4.2.3 System Architecture

CHAPTER 5

MODULE DESCRIPTION

5.1 LIST OF MODULES

1. Data Collection
2. Pre-Processing
3. Feature Extraction
4. Module prediction
5. User-Driven Threat Prediction

5.2 DATA COLLECTION

The process of gathering diverse and relevant cyber threat data from multiple
sources is fundamental to effective threat detection and analysis. This module involves
systematically retrieving raw data, including logs, network traffic, threat feeds, and
social media content. Key steps include identifying data sources, employing retrieval
mechanisms such as APIs and scraping tools, normalizing and aggregating data,
ensuring quality assurance, and ensuring scalability and resilience. By executing this
module successfully, organizations can build a comprehensive dataset to drive
informed decision-making and strengthen their cybersecurity posture against evolving
threats.
Data collection thus serves as the cornerstone of effective cyber threat intelligence.
By executing this module meticulously, organizations can construct a robust dataset
vital for proactive threat detection and analysis. This data-driven approach enables
informed decision-making, empowers timely response to emerging threats, and lays
the foundation for a comprehensive and dynamic cyber defense strategy.
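
The normalization and aggregation steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual collector: the `ThreatRecord` schema, the JSON feed fields (`ioc`, `type`, `timestamp`), and the pipe-separated log format are all hypothetical stand-ins for whatever the real sources provide.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ThreatRecord:
    """Common schema every source is normalized into (hypothetical)."""
    source: str
    indicator: str
    category: str
    seen_at: datetime


def normalize_feed_entry(entry: dict) -> ThreatRecord:
    """Map one entry of a hypothetical JSON threat feed onto the schema."""
    return ThreatRecord(
        source="threat_feed",
        indicator=entry["ioc"].strip().lower(),
        category=entry.get("type", "unknown"),
        seen_at=datetime.fromtimestamp(entry["timestamp"], tz=timezone.utc),
    )


def normalize_log_line(line: str) -> ThreatRecord:
    """Parse a hypothetical 'ISO-time|indicator|category' firewall log line."""
    stamp, indicator, category = line.strip().split("|")
    return ThreatRecord(
        source="firewall_log",
        indicator=indicator.strip().lower(),
        category=category.strip(),
        seen_at=datetime.fromisoformat(stamp).astimezone(timezone.utc),
    )


def aggregate(records):
    """Merge records, keeping the earliest sighting of each indicator."""
    merged = {}
    for rec in records:
        key = (rec.indicator, rec.category)
        if key not in merged or rec.seen_at < merged[key].seen_at:
            merged[key] = rec
    return sorted(merged.values(), key=lambda r: r.seen_at)


# Illustrative inputs: the same indicator reported by two sources.
feed = [{"ioc": "Evil.example.com ", "type": "phishing", "timestamp": 1714557600}]
logs = ["2024-05-01T09:00:00+00:00|evil.example.com|phishing"]
records = aggregate(
    [normalize_feed_entry(e) for e in feed] + [normalize_log_line(l) for l in logs]
)
```

Normalizing every source into one schema before aggregation keeps the later modules independent of where a record came from. In a real deployment the retrieval step (API calls, scraping) would sit in front of these functions.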

5.3 PRE-PROCESSING

At the forefront of data preparation lies the preprocessing module, a pivotal stage
where raw data undergoes meticulous cleaning, transformation, and organization. This
essential process targets the elimination of noise, the resolution of missing values, the
standardization of formats, and the assurance of data consistency before advancing to
subsequent analysis. Through methods such as data cleaning, missing value handling,
format standardization, and consistency checks, organizations lay a robust foundation
for insightful analysis and informed decision-making in the realm of cybersecurity.
This module serves as the cornerstone for deriving accurate insights and proactively
addressing cyber threats, ensuring that organizations navigate the data landscape with
clarity and precision.

In short, the preprocessing module cleans, transforms, and organizes raw data to
ensure its quality and reliability before further analysis. It covers noise removal,
handling of missing values, format standardization, and consistency checks, and so
establishes a solid foundation for the analysis stages that follow.
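
The four preprocessing tasks named above (noise removal, missing-value handling, format standardization, and consistency checks) can be sketched as follows. This is a simplified illustration using only the Python standard library; the record fields (`indicator`, `category`, `seen_at`), the set of "missing" markers, and the rule that naive timestamps are treated as UTC are assumptions made for the example.

```python
from datetime import datetime, timezone

# Strings treated as "no value" (illustrative list).
MISSING = {"", "n/a", "null", "none", "-"}


def clean_record(raw: dict):
    """Return a cleaned copy of one raw record, or None if it is unusable."""
    # Noise removal: trim whitespace and lowercase the field names.
    rec = {k.strip().lower(): str(v).strip() for k, v in raw.items()}

    # Missing values: drop records without an indicator, default the category.
    indicator = rec.get("indicator", "")
    if indicator.lower() in MISSING:
        return None
    category = rec.get("category", "")
    if category.lower() in MISSING:
        category = "unknown"

    # Consistency check: reject records whose timestamp cannot be parsed.
    try:
        stamp = datetime.fromisoformat(rec.get("seen_at", ""))
    except ValueError:
        return None
    if stamp.tzinfo is None:
        stamp = stamp.replace(tzinfo=timezone.utc)  # assumption: naive means UTC

    # Format standardization: lowercase text, UTC ISO-8601 timestamps.
    return {
        "indicator": indicator.lower(),
        "category": category.lower(),
        "seen_at": stamp.astimezone(timezone.utc).isoformat(),
    }


def preprocess(raw_records):
    """Clean all records and drop exact duplicates, preserving order."""
    already_kept, cleaned = set(), []
    for raw in raw_records:
        rec = clean_record(raw)
        if rec is None:
            continue
        key = tuple(rec.values())
        if key not in already_kept:
            already_kept.add(key)
            cleaned.append(rec)
    return cleaned
```

Deduplication runs after standardization on purpose: two records that differ only in letter case or time zone notation collapse into one.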

5.4 FEATURE EXTRACTION

Feature extraction serves as a vital bridge between raw data and actionable insights
in cyber threat intelligence mining employing AI. It involves transforming diverse data
sources into meaningful features that AI algorithms can analyse effectively. Key
aspects include data representation, dimensionality reduction, feature engineering, and
adaptation to both supervised and unsupervised learning approaches. Feature extraction
empowers organizations to identify patterns, anomalies, and indicators of malicious
activity, enhancing the efficiency and accuracy of threat detection systems. It's an
essential preprocessing step that enables AI to effectively combat evolving cyber
threats by extracting relevant information and providing actionable intelligence.

Feature extraction transforms raw data into actionable insights by identifying and
extracting relevant features. Through techniques such as data representation,
dimensionality reduction, and feature engineering, organizations can effectively
analyse vast datasets to detect patterns and anomalies indicative of malicious activity.
Feature extraction enables AI algorithms to process and interpret information
efficiently, enhancing the accuracy and efficacy of threat detection systems. By
leveraging advanced AI techniques, organizations can stay ahead of evolving cyber
threats, bolstering their cybersecurity defences and safeguarding their digital assets
effectively.
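As one possible illustration of data representation, the sketch below turns short event descriptions into TF-IDF vectors. It uses a simplified weighting (term frequency multiplied by the natural logarithm of N/df) written in plain Python; this is not the smoothed formula of scikit-learn's TfidfVectorizer, and the sample events are invented for the example.

```python
import math
from collections import Counter


def tfidf(documents):
    """Return (vocabulary, vectors) using tf = count/len and idf = ln(N/df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vocabulary = sorted(doc_freq)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        vectors.append([
            (counts[term] / total) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, vectors


# Hypothetical event descriptions.
EVENTS = ["login failed admin", "login ok user", "select union admin"]
VOCAB, VECTORS = tfidf(EVENTS)
```

Terms that occur in only one event, such as "failed", receive a higher weight than terms shared by several events, such as "login", which is the property that helps a classifier separate event types.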

5.5 PRESCRIPTION MODULE


In the landscape of cyber threat intelligence mining utilizing AI, the prescription
module emerges as a pivotal element in translating insights into proactive defence
strategies. This module goes beyond mere analysis, offering actionable
recommendations and countermeasures to mitigate identified threats effectively.
Leveraging the power of AI algorithms, the prescription module identifies
vulnerabilities, assesses risks, and recommends tailored responses based on the
analysed threat intelligence. It integrates seamlessly with existing cybersecurity
frameworks, enabling organizations to prioritize threats, allocate resources efficiently,
and orchestrate timely responses. By automating the prescription process, AI-driven
systems empower organizations to adapt swiftly to evolving threats, fortify their
defences, and maintain a proactive stance against cyber adversaries.

The prescription module in AI-driven cyber threat intelligence mining offers proactive
defence strategies by identifying vulnerabilities, assessing risks, and recommending
tailored responses. Integrated seamlessly into existing cybersecurity frameworks, it
prioritizes threats, allocates resources efficiently, and orchestrates timely responses.
Through automation, it enables swift adaptation to evolving threats, strengthens
defences, and maintains a proactive stance against cyber adversaries.
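A minimal sketch of how prioritization and recommendation might fit together is given below. The playbook entries, asset names, and risk scores are hypothetical and serve only to show the idea of ranking findings by risk and attaching a response to each; the report does not specify the actual rules used.

```python
# Hypothetical playbook mapping a predicted threat type to a recommended response.
PLAYBOOK = {
    "phishing": "Quarantine the message and reset exposed credentials.",
    "sql_injection": "Switch the affected query to parameterized statements.",
    "dos": "Enable rate limiting on the targeted service.",
    "ransomware": "Isolate the host and verify offline backups.",
}

FALLBACK = "Escalate to an analyst for manual review."


def prescribe(findings):
    """Rank findings by risk (highest first) and attach a recommended response."""
    ranked = sorted(findings, key=lambda finding: finding["risk"], reverse=True)
    return [
        (finding["asset"], PLAYBOOK.get(finding["threat"], FALLBACK))
        for finding in ranked
    ]


# Hypothetical output of the prediction modules.
FINDINGS = [
    {"asset": "web-01", "threat": "sql_injection", "risk": 0.7},
    {"asset": "mail-01", "threat": "phishing", "risk": 0.9},
    {"asset": "db-02", "threat": "unknown", "risk": 0.2},
]

PLAN = prescribe(FINDINGS)
```

The highest-risk asset appears first in the plan, and any threat type without a playbook entry is routed to a human analyst rather than receiving an automated response.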

5.6 USER-DRIVEN THREAT PREDICTION

User-Driven Threat Prediction represents a significant shift in cybersecurity,
placing the power of threat anticipation directly in the hands of users. Unlike
traditional threat prediction methods that rely solely on automated algorithms and
historical data, user-driven prediction leverages the collective intelligence and insights
of individuals within an organization. By incorporating input from users who are on the
front lines of daily operations and have unique perspectives on potential vulnerabilities,
this approach enhances the accuracy and relevance of threat predictions.

In this paradigm, users become active participants in the threat prediction process,
contributing real-time observations, anecdotal evidence, and contextual information
that may not be captured by automated systems alone. This user-generated data
supplements machine learning algorithms, enriching the predictive models with
qualitative insights and enhancing their ability to identify emerging threats.

User-driven threat prediction also fosters a culture of cyber vigilance within
organizations, empowering employees at all levels to be proactive in identifying and
reporting suspicious activities or anomalous behaviours. By encouraging open
communication and collaboration, organizations can tap into the collective expertise of
their workforce to stay ahead of evolving threats.
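One possible way to let user observations supplement a model's output is sketched below. It uses a noisy-OR combination in which each independent user report raises the risk estimate; the function name and the report weight of 0.2 are assumptions chosen for illustration, not values from the project.

```python
def combined_risk(model_score, user_reports, report_weight=0.2):
    """Noisy-OR blend: each user report independently raises the risk estimate.

    model_score   -- probability of a threat produced by the automated model
    user_reports  -- number of independent user reports about the same activity
    report_weight -- assumed probability that a single report indicates a real threat
    """
    if not 0.0 <= model_score <= 1.0:
        raise ValueError("model_score must be a probability between 0 and 1")
    if user_reports < 0:
        raise ValueError("user_reports cannot be negative")
    return 1.0 - (1.0 - model_score) * (1.0 - report_weight) ** user_reports
```

With no reports the model score is returned unchanged, and each additional report moves the estimate closer to 1 without ever exceeding it, so user input refines the automated prediction rather than replacing it.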

CHAPTER 6
RESULT

Fig 6.1 Home page

The Home Page serves as the central hub for the Cyber Threat Intelligence (CTI)
mining system, providing users with an overview of the system's capabilities,
features, and latest updates. The design is user-friendly, incorporating intuitive
navigation menus, quick access buttons, and informative widgets to guide users
through the platform's functionalities.

Fig 6.2 Create User Account

The Create User Account page offers a streamlined and secure registration process
for new users to access the CTI mining system. Users are prompted to provide
essential information, such as their name, email address, and password, which is
encrypted and stored securely.
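The sketch below shows one common way to store and verify a password, assuming that "encrypted" here refers to salted one-way hashing, which the report does not confirm. It uses PBKDF2 from Python's standard library; the iteration count and function names are illustrative assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune for the deployment hardware


def hash_password(password, salt=None):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for every new account
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest at sign-in and compare it in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

At registration only the salt and digest would be stored, so the original password cannot be read back from the database; at sign-in the same function is applied to the submitted password and the two digests are compared.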

Fig 6.3 Sign in page

The Sign In Page provides existing users with a secure and seamless authentication
process to access their personalized dashboards and threat intelligence reports.
Users are required to enter their registered email address and password, which
undergo encrypted verification to authenticate their identity.

Fig 6.4 Phishing prediction page

The Phishing Prediction Page leverages advanced machine learning algorithms to
analyze and predict potential phishing attacks based on historical and real-time data.
The page presents users with interactive visualizations and insights into phishing trends,
patterns, and indicators of compromise (IoCs). Users can customize filters, explore
detailed threat profiles, and receive actionable recommendations to mitigate phishing
risks effectively.
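As an illustration of the kind of input a phishing classifier might consume, the sketch below derives a few lexical features from a URL. These are widely used heuristics and are an assumption on our part; the report does not list the features its model actually uses.

```python
import re
from urllib.parse import urlparse


def url_features(url):
    """Extract simple lexical indicators that are often associated with phishing."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "length": len(url),
        "num_dots": host.count("."),
        "has_at_symbol": "@" in url,
        # A raw IPv4 address in place of a domain name is a common warning sign.
        "host_is_ip": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "uses_https": parsed.scheme == "https",
    }


SUSPICIOUS = url_features("http://192.168.10.5/login@secure")
ORDINARY = url_features("https://example.com/account")
```

Each URL becomes a small dictionary of numeric and boolean values, which could then be fed to any of the classifiers imported in the source listing.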

Fig 6.5 SQL injection prediction page

The SQL Injection Prediction Page utilizes predictive analytics to identify and
forecast potential SQL injection attacks targeting organizational databases and web
applications. The page offers users a comprehensive view of SQL injection
vulnerabilities, attack vectors, and associated risk scores across various assets and
infrastructure components. Users can drill down into specific incidents, analyze
attack patterns, and access remediation guidelines to secure their systems proactively.
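A risk score for a single input string could, in the simplest case, be derived from known attack signatures. The sketch below is a naive pattern-based heuristic with hand-picked regular expressions; it is offered only to make the idea of a risk score concrete and is not the predictive model described in this report.

```python
import re

# Hand-picked signatures of common SQL injection attempts (illustrative, not exhaustive).
SQLI_PATTERNS = [
    r"\bunion\b.+\bselect\b",               # UNION-based data extraction
    r"\bor\b\s+\d+\s*=\s*\d+",              # tautologies such as OR 1=1
    r"--|#|/\*",                            # comment markers used to cut off a query
    r";\s*(drop|delete|insert|update)\b",   # stacked destructive statements
]


def sqli_score(text):
    """Return the fraction of signatures matched by the input, between 0 and 1."""
    lowered = text.lower()
    hits = sum(bool(re.search(pattern, lowered)) for pattern in SQLI_PATTERNS)
    return hits / len(SQLI_PATTERNS)
```

For example, `sqli_score("' OR 1=1 --")` matches the tautology and comment signatures, while an ordinary username matches none. A heuristic of this kind produces false positives (any text containing "#" matches), which is one reason a trained model is preferable in practice.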

Fig 6.6 DoS attack prediction page

The DoS (Denial of Service) Attack Prediction Page employs machine learning and
anomaly detection techniques to detect and predict potential DoS attacks aimed at
disrupting organizational networks and services.
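Anomaly detection for DoS traffic can be illustrated with a simple z-score test on request rates. The baseline values, threshold, and function name below are assumptions for the example; the report does not state which anomaly detection technique the page uses.

```python
from statistics import mean, stdev


def flag_spikes(baseline, observed, threshold=3.0):
    """Return observed request rates lying more than `threshold` standard
    deviations above the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation; needs at least two values
    if sigma == 0:
        return []  # a flat baseline gives no scale for comparison
    return [rate for rate in observed if (rate - mu) / sigma > threshold]


# Hypothetical requests-per-second measurements.
BASELINE = [100, 110, 90, 105, 95]
OBSERVED = [102, 400, 98]
SPIKES = flag_spikes(BASELINE, OBSERVED)
```

Only the burst of 400 requests per second is flagged here, because it lies far outside the variation seen in normal traffic, while the other two readings fall within it.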

Fig 6.7 Ransomware prediction page

The Ransomware Prediction Output page offers users a predictive analysis of
potential ransomware threats targeting organizational data and systems. Leveraging
advanced machine learning models and threat intelligence feeds, the page
highlights emerging ransomware campaigns, encryption techniques, and ransom
demands observed in the wild.

CHAPTER 7

CONCLUSION

In conclusion, the integration of AI in cyber threat intelligence mining represents a
significant advancement in cybersecurity. By harnessing the power of machine learning
and data analysis, organizations can enhance their ability to detect, analyse, and respond
to cyber threats in real-time. The automated nature of AI-driven threat intelligence
mining not only improves efficiency but also enables security teams to stay proactive in
the face of constantly evolving threats. Ultimately, this synergy between AI and cyber
threat intelligence mining is instrumental in strengthening defences, minimizing risks,
and safeguarding digital assets in an increasingly complex and dynamic threat
landscape.

CHAPTER 8

FUTURE ENHANCEMENT

AI algorithms will better grasp the context of cyber threats, enabling more accurate
threat prioritization and tailored mitigation strategies. AI-driven models will predict
emerging threats, empowering organizations to preemptively address vulnerabilities
and defend against evolving attacks. AI will automate incident response processes,
swiftly containing and mitigating cyber-attacks with minimal human intervention.
AI-powered platforms will facilitate greater collaboration and information sharing among
security teams, industry peers, and threat intelligence providers. There will be a focus
on developing AI models that provide transparent and interpretable results, fostering
trust in automated threat intelligence systems. Defense mechanisms will evolve to
detect and counter adversarial AI tactics employed by cyber attackers. AI algorithms
will continuously learn from new data and adapt to evolving threats in real time,
ensuring agility and effectiveness in cyber defense strategies.


APPENDIX – 1
SOURCE CODE

# GUI toolkit
from tkinter import messagebox
from tkinter import *
from tkinter import simpledialog
import tkinter
from tkinter import filedialog
from tkinter.filedialog import askopenfilename

# Numerical and plotting libraries
import os
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# scikit-learn: preprocessing, feature extraction, classifiers, and metrics
from sklearn import preprocessing
from sklearn.preprocessing import OneHotEncoder
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn import svm
from sklearn.naive_bayes import BernoulliNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.metrics import f1_score

# Keras: neural network models and layers
import keras.layers
from keras.models import Sequential
from keras.layers import Convolution2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense, Activation, BatchNormalization, Dropout

main = tkinter.Tk()
main.title("Cyber Threat Detection Based on Artificial Neural Networks Using Event Profiles")  # designing main screen
main.geometry("1300x1200")

# Encoder that maps the textual class labels to integers.
le = preprocessing.LabelEncoder()

# Names shared between the GUI callbacks.
global filename
global feature_extraction
global X, Y
global doc
global label_names
global X_train, X_test, y_train, y_test
global lstm_acc, cnn_acc, svm_acc, knn_acc, dt_acc, random_acc, nb_acc
global lstm_precision, cnn_precision, svm_precision, knn_precision, dt_precision, random_precision, nb_precision
global lstm_recall, cnn_recall, svm_recall, knn_recall, dt_recall, random_recall, nb_recall
global lstm_fm, cnn_fm, svm_fm, knn_fm, dt_fm, random_fm, nb_fm

def upload():
    # Load a CSV dataset chosen by the user, encode its labels,
    # and build one space-separated text document per row.
    global filename
    global X, Y
    global doc
    global label_names
    filename = filedialog.askopenfilename(initialdir="datasets")
    dataset = pd.read_csv(filename)
    label_names = dataset.labels.unique()
    dataset['labels'] = le.fit_transform(dataset['labels'])
    cols = dataset.shape[1]
    cols = cols - 1
    X = dataset.values[:, 0:cols]  # every column except the last
    Y = dataset.values[:, cols]    # last column
    Y = Y.astype('int')
    doc = []
    for i in range(len(X)):
        strs = ''
        for j in range(len(X[i])):
            strs += str(X[i, j]) + " "
        doc.append(strs.strip())
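The upload step needs a file dialog and a CSV file, so its two transformations are easier to follow on a small in-memory example. The sketch below reproduces them in plain Python: integer encoding of labels in sorted order (the ordering scikit-learn's LabelEncoder also uses) and joining each row into one text document. The helper names and sample rows are hypothetical.

```python
def encode_labels(labels):
    """Map each distinct label to an integer, assigning codes in sorted order."""
    classes = sorted(set(labels))
    index = {name: code for code, name in enumerate(classes)}
    return classes, [index[name] for name in labels]


def rows_to_documents(rows):
    """Join the values of each row into one space-separated string."""
    return [" ".join(str(value) for value in row) for row in rows]


# Hypothetical dataset rows and their labels.
ROWS = [["tcp", "http", 215], ["udp", "dns", 44]]
LABELS = ["normal", "attack"]

CLASSES, ENCODED = encode_labels(LABELS)
DOCS = rows_to_documents(ROWS)
```

The resulting documents are plain strings, which is the form expected by text vectorizers such as the CountVectorizer and TfidfVectorizer classes imported at the top of the listing.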
APPENDIX 2

SCREENSHOTS

