Fake URL Detection Using Machine Learning and Deep Learning
ISSN No:-2456-2165
A Mukesh
Department of Computer Science,
Dayananda Sagar College of Engineering
Bangalore, India

Karthik V
Department of Computer Science,
Dayananda Sagar College of Engineering
Bangalore, India

Soumya Patil
Department of Computer Science,
Dayananda Sagar College of Engineering
Bangalore, India
Abstract:- The risk of network information insecurity is growing rapidly, and the level of risk is very high. The method most commonly used by hackers today is to attack the whole system and exploit human vulnerabilities. These techniques include social engineering, phishing, pharming, etc. One of the steps in conducting these attacks is to deceive users with fake Uniform Resource Locators (URLs). As a result, fake URL detection is of great interest nowadays. Several scientific studies have presented methods to detect malicious URLs based on machine learning and deep learning techniques. In this paper, we propose a fake URL detection method using machine learning techniques based on our proposed URL behaviours and attributes. Moreover, big data technology is also exploited to improve the capability of detecting malicious URLs based on abnormal behaviours. In short, the proposed detection system consists of a new set of URL features and behaviours, a machine learning algorithm, and a big data technology. The experimental results show that the proposed URL attributes and behaviours can significantly improve the ability to detect malicious URLs. This suggests that the proposed system may be considered an optimized and user-friendly solution for malicious URL detection.

Keywords:- URL; Malicious URL Detection; Phishing; Machine Learning

I. INTRODUCTION

The risk of network information insecurity is growing rapidly, and the level of risk is very high. The primary method used by hackers today is to attack entire systems and exploit human vulnerabilities. These techniques include social engineering, phishing, pharming, and more. One of the steps in carrying out these attacks is to trick users with a fake URL (Uniform Resource Locator). For this reason, there is considerable interest in detecting fake URLs these days. Several scientific studies have presented malicious URL detection methods based on machine learning and deep learning techniques. This paper proposes a method to detect spoofed URLs using machine learning techniques based on proposed URL behaviours and attributes. In addition, big data technology is leveraged to enhance the ability to detect malicious URLs based on their anomalous behaviour. In short, the proposed detection system consists of novel features and behaviours of URLs, machine learning algorithms, and big data techniques. Experimental results show that the proposed URL attributes and behaviours help significantly improve detection of malicious URLs. This indicates that the proposed system can be viewed as a streamlined and easy-to-use malicious URL detection solution.

URLs (Uniform Resource Locators) are used to refer to resources on the Internet. [1] presents the properties and the two basic components of a URL: a protocol identifier, which indicates the protocol to use, and a resource name, which indicates the IP address or domain name where the resource is located. Each URL therefore has a specific structure and format, which makes it possible to notice and identify cases where an attacker attempts to change one or more details of a URL. Malicious URLs are known as links that harm users. These URLs point to resources or pages that allow attackers to execute code on the user's computer, redirect the user to unwanted, malicious, or phishing sites, or download malware. Malicious URLs can be found in file downloads, movie downloads, drive-by downloads, phishing, spamming, tampering, and more.

A. Clayton Johnson, Bishal Khadka, Ram B. Basnet
Organisations face significant threats from emails with Uniform Resource Locators (URLs), which may compromise network security and user credentials through spear-phishing and other common phishing campaigns directed at their staff. The identification and classification of harmful URLs is a scientific challenge with crucial practical applications. An organisation can safeguard itself by filtering incoming emails
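As a small illustration of the URL structure described in the Introduction, the following sketch splits a URL into its protocol identifier and resource name and reports whether the host is given as an IP address or a domain name. It is only an illustrative sketch that uses Python's standard urllib.parse and ipaddress modules, not the authors' implementation; the function name split_url and the keys of the returned dictionary are hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into its protocol identifier and resource name parts.

    Hypothetical helper for illustration only; it is not taken from the paper.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""

    # Decide whether the host part is a literal IP address or a domain name.
    try:
        ipaddress.ip_address(host)
        host_kind = "ip"
    except ValueError:
        host_kind = "domain" if host else "none"

    return {
        "protocol": parts.scheme,  # protocol identifier, e.g. "http"
        "host": host,              # IP address or domain name of the resource
        "host_kind": host_kind,    # "ip", "domain", or "none"
        "path": parts.path,        # remainder of the resource name
    }
```

For example, split_url("http://192.168.0.1/login.php") reports the protocol "http" with a host of kind "ip", while split_url("https://example.com/index.html") reports the protocol "https" with a host of kind "domain". Structural details of this kind are the sort of information a feature-based detector could take as input.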
Problem Statement
To develop a malicious URL detection system that accurately detects and classifies benign and malicious URLs using machine learning and deep learning techniques.
Input: The dataset contains a collection of malicious, benign, spam, malware, and defacement URLs in multiple formats such as CSV and JSON.
Output: Indicates whether each URL is fraudulent or legitimate based on its features.

Fig. 1 Proposed Methodology

III. MODULE DECOMPOSITION