Malware Classification Using Machine Learning Algorithms and Tools
https://doi.org/10.22214/ijraset.2021.34353
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.429
Volume 9 Issue VI Jun 2021- Available at www.ijraset.com
Abstract: The explosive growth of malware variants poses a major threat to information security. Malware grows in volume day by
day and has become a major threat to Internet security. Worm malware in particular is increasingly common on networks and is a
serious danger to our computers; network attackers carry out these attacks by designing worms. A system model is needed to
counter these threats and to prevent worms from multiplying, spreading through the network, and harming our computers. In this
paper, we design a classification system model for this problem. The system detects worm malware using information from a
dataset taken from a website. It receives an input package and analyzes it, and the Naïve Bayesian classification technique then
classifies the package. Using this data mining technique, the system worked fast and achieved strong results in detecting worms.
By applying the probability equations of the Naïve Bayesian technique to both threat data and benign data, the system detects
malware and classifies data as either threat or benign.
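The classification step described in the abstract can be illustrated with a minimal sketch. It assumes that each input package has already been reduced to a list of behavior tokens; the token names, the toy training data, and the function names `train_naive_bayes` and `classify` are hypothetical and are not taken from the system described in this paper. The sketch estimates a prior for each class and a smoothed likelihood for each token, then picks the class with the highest log posterior.

```python
import math
from collections import Counter

def train_naive_bayes(samples, labels, alpha=1.0):
    """Estimate class priors and per-class token likelihoods (Laplace smoothing)."""
    classes = sorted(set(labels))
    vocab = {tok for sample in samples for tok in sample}
    priors, likelihoods = {}, {}
    for c in classes:
        docs = [s for s, y in zip(samples, labels) if y == c]
        priors[c] = len(docs) / len(samples)
        counts = Counter(tok for d in docs for tok in d)
        total = sum(counts.values())
        # P(token | class), smoothed so unseen-in-class tokens keep nonzero mass
        likelihoods[c] = {
            tok: (counts[tok] + alpha) / (total + alpha * len(vocab))
            for tok in vocab
        }
    return priors, likelihoods

def classify(sample, priors, likelihoods):
    """Return the class with the highest log posterior; unknown tokens are skipped."""
    scores = {}
    for c, prior in priors.items():
        score = math.log(prior)
        for tok in sample:
            if tok in likelihoods[c]:
                score += math.log(likelihoods[c][tok])
        scores[c] = score
    return max(scores, key=scores.get)

# Hypothetical toy training data: behavior tokens observed per package.
train_x = [
    ["port_scan", "self_copy", "mass_mail"],
    ["self_copy", "port_scan", "registry_write"],
    ["http_get", "file_read"],
    ["file_read", "http_get", "dns_query"],
]
train_y = ["threat", "threat", "benign", "benign"]

priors, likelihoods = train_naive_bayes(train_x, train_y)
result_threat = classify(["port_scan", "self_copy"], priors, likelihoods)
result_benign = classify(["http_get", "dns_query"], priors, likelihoods)
print(result_threat, result_benign)
```

Log probabilities are summed rather than multiplying raw probabilities, which avoids numerical underflow when a package contains many tokens.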
I. INTRODUCTION
With the rapid development of the Internet, malware has become one of the major cyber threats. Any software that performs
malicious actions, including information stealing and espionage, can be referred to as malware. Kaspersky Labs defines malware as
“a type of computer program designed to infect a legitimate user's computer and inflict harm on it in multiple ways.” In this world of
digitization, one of the most immediate threats to one's professional and personal data is the malicious executable.
Intruders and hackers use various new approaches to introduce different types of malware into existing software; polymorphic and
metamorphic malware in particular are very difficult to recognize or categorize accurately. Analyzing a huge number of malware
samples manually requires a great deal of effort. The word malware derives from “malicious software”: code designed to meet the
harmful intent of an attacker and to deliberately harm the user's system or computer. Malware takes many forms, including viruses,
Trojans, backdoors, rootkits, ransomware, worms, botnets, spyware, adware, and keyloggers, and a wide range of malware families
exists and grows massively on the Internet every day. Malware has impacted a large number of computing devices. It can compromise
computers and smart devices, steal confidential information, penetrate networks, and cripple critical infrastructure.
Network security is an important branch of computer
science that protects the data stored on computers that are connected by a network. As networking knowledge has become more
developed and widespread, the number of network attackers increases every day, and their threats evolve as well. Network security
is a very important matter for institutions such as universities, special projects, and corporations, which can provide many
functions important to a country's safety. Online services are now very popular: users can communicate with each other and share
information and knowledge, and Information Technology (IT) associations and Internet Service Providers (ISPs) have made these
services less expensive and more cooperative. Malware may put a network in danger. Malware is a program that can install itself
on the network and on the electronic devices connected to it, such as computers, smartphones, and tablets. It damages these
devices by accessing them illegitimately and destroying their personal data and information; adware, for example, can do such
malicious work. Malware is among the most dangerous threats to networks. It can take many forms to carry out an attack; it always
arrives as a package and tries to gain access to the network. New types and forms of malware are found every day, and malware
programmers constantly work to protect their malware from anti-malware programs such as Kaspersky, McAfee, NOD, Norton, and the
many other antivirus programs used on PCs. Of the many malware threats, the most serious is the worm, which can replicate itself
and spread through the network very fast. Worms are now a major problem worldwide, especially for facilities and network users.
Despite their capabilities, current detection techniques still have difficulty detecting these worms. This is where detection
techniques play their role: they are the most effective defense against malware in the network.
Antivirus programs are today's defenders against malware: they detect a malware signature and prevent the malware from doing its
malicious work. In the Operating System (OS), maintaining safety, confidentiality, and availability is very important and a hard
task, because of the difficulties and impediments that the network faces in securing the data and information inside it and
keeping them safe from outside attacks. It is therefore very important that these networks have a line of defense against outside
attacks. Electronic devices such as computers and smartphones can be affected by the huge amount of malware that spreads among
them.
II. PROBLEM STATEMENT
We implemented a malware classification system based on deep learning; it has been opened online for user testing and can
automatically detect whether a visited site is malicious or not. Virus scanner technology today has two parts: a signature-based
detector and a heuristic classifier that detects new viruses. The classic signature-based detection algorithm uses signatures of
known malicious executables to detect further instances of them. Signature-based methods create a unique tag for each malicious
program so that future examples of it can be correctly classified with a small error rate. These methods do not generalize well
to new malicious binaries because they are designed to give a false positive rate as close to zero as possible. Whenever a
detection method generalizes to new instances, the tradeoff is a higher false positive rate. Heuristic classifiers are written by
a group of virus experts to detect new malicious programs. This process is time-consuming, and the resulting classifiers
sometimes fail to detect new malicious executables.
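The limitation of signature-based detection described above can be illustrated with a minimal sketch. It assumes that the "unique tag" is a SHA-256 hash of the whole file, which is a simplification chosen for illustration; real scanners may use byte patterns or other signature formats. The sample bytes and the function names `build_signature_db` and `is_known_malware` are hypothetical.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()

def build_signature_db(known_malicious):
    """Build a set of signatures (here: whole-file hashes) from known samples."""
    return {sha256_hex(sample) for sample in known_malicious}

def is_known_malware(data: bytes, signature_db) -> bool:
    """Flag the data only if its signature exactly matches a known one."""
    return sha256_hex(data) in signature_db

# Hypothetical byte strings standing in for executables.
known_sample = b"\x4d\x5a hypothetical malicious payload"
signature_db = build_signature_db([known_sample])

# A variant that differs from the known sample by a single appended byte.
variant = known_sample + b"\x00"

exact_hit = is_known_malware(known_sample, signature_db)
variant_hit = is_known_malware(variant, signature_db)
benign_hit = is_known_malware(b"ordinary document contents", signature_db)
print(exact_hit, variant_hit, benign_hit)
```

An exact-match lookup of this kind produces essentially no false positives on unrelated files, but any change to the sample yields a different signature, so the modified variant goes undetected. This is the generalization gap that motivates learned classifiers.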
V. OBJECTIVES
A. Detect malicious websites and provide users a safe online environment.
B. Protect users' systems from virus attacks.
VI. SCOPE
The future scope of this project includes deploying the application on a broader scale and making it compatible with other
software and other platforms. The project is initially deployed with a limited set of classifiers. Depending on the
specifications and the work area of the project, various other features may also be provided to users. In addition, the software
is intended to remove viruses from a website automatically, making websites safer for users' systems.
VIII. ADVANTAGES
1) Develop Business Processes: Enables the use of Internet technologies.
2) Availability: Users can check which websites are risky and which are not without even buying an antivirus.
3) Transparency: After entering a URL in the text box, users can see whether the website is risky or not.
4) User Friendly: The application uses a simple user interface and an easy process for detecting malicious websites, which makes
the program easy to use.
5) Flexibility: Depending on the type of website, the application can detect viruses and stop them from reaching your system.
IX. DISADVANTAGE
A computer or laptop and an Internet connection are required for this application to scan a website for viruses.
X. EXPECTED OUTCOME
By using machine learning algorithms and tools together with the online malware classification system, this application aims to
ensure that your system will not be attacked by viruses.
XI. CONCLUSION
In the last few years, malware has become a significant threat. The discussion covered which machine learning algorithms existing
works use, from what sources their datasets are collected, which parameters they consider to reach their goal, and the
corresponding experimental results. It clearly shows that machine learning algorithms are very useful for the classification and
clustering of malware samples, both for small datasets and for large volumes of data. Worm malware can use computer ports as
gateways to access the computer and invade the network. Our networks need protection from outside attackers, so strong systems
are needed to detect malware and prevent it from breaking into networks and doing its malicious work.
REFERENCES
[1] D. Gavrilut, M. Cimpoeşu, D. Anton and L. Ciortuz, "Malware detection using machine learning", Computer Science and Information Technology 2009.
IMCSIT'09. International Multiconference on, pp. 735-741, October 2009.
[2] K. Chumachenko, Machine Learning Methods for Malware Detection and Classification, 2017.
[3] L. Liu, B. S. Wang, B. Yu and Q. X. Zhong, "Automatic malware classification and new malware detection using machine learning", Frontiers of Information
Technology & Electronic Engineering, vol. 18, no. 9, pp. 1336-1347, 2017.
[4] O. E. David and N. S. Netanyahu, "Deepsign: Deep learning for automatic malware signature generation and classification", Neural Networks (IJCNN) 2015
International Joint Conference on, pp. 1-8, July 2015.
[5] P. Singhal and N. Raul, Malware detection module using machine learning algorithms to assist in centralized security in enterprise networks, 2012,
[6] E. Masabo, K. S. Kaawaase and J. Sansa-Otim, "Big data: deep learning for detecting malware", Proceedings of the 2018 International Conference on Software
Engineering in Africa, pp. 20-26, May
[7] T. N. Phyu, "Survey of classification techniques in data mining", Proceedings of the International MultiConference of Engineers and Computer Scientists, vol.
1, pp. 18-20, March 2009
[8] D. Chen, "Detecting Hiding Malicious Website Using Network Traffic Mining Approach", 2010 2nd Int. Conference Educ. Technol. Comput., 2010
[9] Y. Park, D. S. Reeves, and M. Stamp, "Deriving common malware behavior through graph clustering", Comput. Secur., vol. 39, pp. 419–430,