
ISSN 2278-3091
Volume 10, No.3, May - June 2021
International Journal of Advanced Trends in Computer Science and Engineering
Muhammad Ahsan Saeed et al., International Journal of Advanced Trends in Computer Science and Engineering, 10(3), May - June 2021, 2206 – 2211
Available Online at http://www.warse.org/IJATCSE/static/pdf/file/ijatcse1011032021.pdf
https://doi.org/10.30534/ijatcse/2021/1011032021

Fraud Detection in E-Commerce Using Machine Learning


Muhammad Ahsan Saeed¹, Farrukh Yousaf¹, Osama Bin Khalid¹, Mushhad Gilani¹, Qamar Nawaz², Isma Hamid³
¹ University Institute of Information Technology, PMAS Arid Agriculture University Rawalpindi, Pakistan
² Department of Computer Science, University of Agriculture Faisalabad, Pakistan
³ Department of Computer Science, National Textile University, Faisalabad, Pakistan

ABSTRACT
Customers rely heavily on reviews when deciding to purchase products, whether on e-commerce sites or in online retail outlets. Since these reviews can decide the success or failure of a product in the market, reviews are used to promote positive or negative opinions. Improper reviews are also referred to as false or fraudulent reviews, or spam comments. To demote or promote an item, malicious or fake reviews, which are deceptive, are posted on e-commerce sites. This can lead to monetary losses or to inflated business growth. So, the proposed system is designed and developed to detect fake, false and spam reviews for fraud detection using machine learning approaches such as Sentiment Analysis, Support Vector Machine (SVM), the Decision Tree algorithm, and the N-gram model.

Key words: Review deviation, Reputation systems, Sentiment Analysis, Fake review observation, Spam Detection

1. INTRODUCTION
In today's digital world, spam and fraud have become a threat to both customers and companies. Identifying fake reviews is a complex and difficult task. Fraudulent reviewers are often paid to write these reviews. Because of this, it is difficult for the average customer to distinguish fake reviews from real ones by looking at each review. In addition, to support the sale of goods, some businesses submit positive reviews of their own products to influence customer purchasing behavior. Because of the heightened degree of competition in business, every organization needs to maintain its reputation and standing in the market. As a result, automatically separating fake reviews from genuine ones becomes an important task for customers and businesses. Accordingly, this article proposes a machine learning-based framework for identifying spam reviews, fake reviews and non-reviews.
There has been a striking change in the way people share and express their opinions, and the credit goes entirely to the social web. Client-generated content is a term for content contributed by web users (rather than content provided by site owners). Client-generated content contains a large amount of data, which can cause serious harm in various other applications if misused.
For the proposed research, existing systems that support different attributes and domains were considered. A few are used to detect fraudulent credit cards, which is also important because credit card fraud can cause heavy financial loss. A few only detect fake and fraudulent reviews, whereas the proposed methodology intends not only to detect fraudulent reviews but also to remove them from the database. The proposed e-commerce web application takes reviews as input from users, stores them in the database, and then applies the proposed methods to detect spammers. To perform this task, it first checks whether the customer is logged in. Second, it checks whether the customer bought the product. Third, it checks the credibility of the reviewer. To do this, the reviewer's Medium Access Control (MAC) address, Internet Protocol (IP) address, and location are accessed. Location and IP addresses can be changed easily using VPNs, so there is a need to detect VPNs as well. One easy way to do that is to use the MAC address, because it cannot be changed easily. A MAC address is a unique number used to track a device in a network. It provides a secure way to locate senders or receivers in the network and helps avoid unwanted network access.
The purpose of the proposed approach is to identify all fake or spam comments associated with a product that may be causing serious damage to the seller's productivity and manipulating consumers' perception of the product. Spam reviews: numerous reviews given by a user for the same product are considered fake. This is not necessarily so, since a loyal customer may review the product every time it is bought. The main problem is that not all the reviews posted on the website may be real or genuine.

2. LITERATURE REVIEW

In the literature, spam reviews are divided into three groups [1].
(1) Untruthful Reviews - the main focus of this paper,
(2) Reviews on trademarks - where opinions are solely concerned with the brand or the seller of the product and fail to review the product itself.


(3) Non-Reviews - reviews that contain unrelated text or advertisements.

The first type, the untruthful review, is of great concern as it undermines the integrity of the online review system. Identifying this type of spam review is challenging, if not impossible, because it is hard to distinguish false reviews from genuine ones by reading them manually.

2.1. Robust Algorithm:
Amazon has developed a robust algorithm to detect fake reviews, which can be either positive or negative [2]. Fake reviews are positive when bought by the seller and negative when bought by competitors to drop the rating. So detecting fake reviews is a complex problem, which Amazon is trying to overcome with the help of technology and a team of professionals who monitor reviews manually. Overall, Amazon's approach is good, but manually detecting fake reviews is time consuming.

2.2. Fake Review Fraud Detection using Data Mining:
Another approach to fake review detection is data mining. While it is a good approach, it also has restrictions and drawbacks [3].
1. Violates user privacy: It is a well-known fact that data mining collects information about people using certain market-based strategies and information technology.
2. Additional irrelevant information.
3. Misuse of information.

2.3. Yelp filtering Algorithm:
Yelp is the largest and most popular online review site that filters untrue or suspicious reviews. It uses various filtering algorithms to find fake reviews. After studying Yelp in depth, it was decided that a machine learning approach is the better approach to fake and spam review detection.
"Fake Review Detection: Classification and Analysis of Real and Pseudo Reviews" analyzes real and pseudo fake reviews to understand the psyche of fake reviewers and to produce data sets that give high detection accuracy with supervised learning. The KL-divergence method is used to study the data sets, and behavioral features are used along with n-gram features to check a dataset generated through Amazon Mechanical Turk (AMT). The AMT-generated dataset was not found to be representative of fake reviews, and furthermore it was found that behavioral features alone give good accuracy [4-7].
"Exploiting Product Related Review Features for Fake Review Detection" exploits product-related review features to detect fake reviews. A convolutional neural network model is suggested to integrate the product-related review features. For maximum flexibility, a bagging strategy is used to combine the network model with two efficient classifiers. Various tests were performed to evaluate the performance of the suggested model. Further features that are currently missing may be identified in future work to improve accuracy [2].
"Fake review detection using data mining" is a study whose objective was to solve the fake review problem using different data mining techniques and to explore the strengths and weaknesses of those techniques. In this study, a supervised approach was used to detect fake reviews, which included Support Vector Machine (SVM), Multinomial Naive Bayes (MNB) and Multilayer Perceptron. The authors took different approaches to spam review detection: they started with a supervised method, then tried a semi-supervised method, and finally used a fully unsupervised method. The supervised approach requires large datasets, while the semi-supervised approach depends heavily on graphs. The authors suggested that, in future, other methods for validating Words Basket Analysis (WBA) could be proposed. One suggestion is to manually label fake and real reviews, which would help reduce the size of the datasets and improve the performance of the WBA approach; a behavioral approach could be employed for labeling truthful and deceptive reviews [3].
"Spam Review Detection Techniques: A Systematic Literature Review" is a study in which the researchers conduct a comprehensive review of existing studies on spam review detection using the Systematic Literature Review (SLR) method. In total, 76 existing studies are reviewed and analyzed. The researchers evaluated the studies based on how features are extracted from review data sets and on the different methods and techniques used to solve the spam review detection problem. The study showed that the success factors of any spam review detection method are interdependent: feature extraction depends on the review dataset, and the accuracy of spam detection methods depends on the choice of feature engineering method. Therefore, to apply a spam review detection model successfully and achieve better accuracy, these factors need to be considered together. To the knowledge of the researchers, this is the first complete review of existing studies in the field of spam review detection using the SLR process. The study highlighted the contributions of recent research in the form of various feature engineering methods, methods of detecting spam reviews, and various measures used in performance evaluation. To bring out empirical evidence, the work followed an organized review process: it focused on a search query, raised research questions, selected papers from reputable publishers, and applied formal inclusion criteria and a study assessment process. In addition, the outcome of the study may assist further research in the field of spam review detection [8].


"An Empirical Study on Detecting Fake Reviews Using Machine Learning Techniques" detects fake reviews using machine learning methods. It analyzes a dataset of movie reviews and applies sentiment classification algorithms with supervised learning, both with and without stop words [9]. The sentiment classification algorithms are applied with a software tool to separate the movie review dataset into fake and genuine reviews. In the future, this could be applied to datasets from other e-commerce websites such as Amazon or eBay, or to a different movie review dataset, using a variety of options. This method does not apply directly to e-commerce websites, because its results are produced manually [10].
After thoroughly reviewing different research papers on fraud detection, we concluded that there are many techniques for dealing with fraud detection and that many different algorithms have been designed for it. However, considering real-world needs, one factor was missing: machine learning. Machine learning is an approach that covers almost every aspect of fraud detection, which is why the idea we propose is more up to date and more efficient in dealing with fraud detection. Table 1 compares the different research papers.

Table 1: Summary of literature comparison
Research Article | IP Address | Mac Address | Location | Gmail ID | Spam word detection
Mukherjee et al | | | | |
Sun et al | | | | |
Hossain et al | | | | |
Mirza et al | | | | |
Elmurngi et al | | | | |
Bajaj et al | | | | |

3. PROBLEM STATEMENT
The proposed software solution will help the user identify fake, unauthentic and spam reviews of any product. This has always been a major problem for buyers purchasing from e-commerce web stores. Fake and spam comments often manipulate buyers, which severely harms the seller's productivity. Our research found that fake reviews are still a major issue for the whole e-commerce community. With the proposed software, store owners will be able to detect these fake reviews and remove them from the database.
As discussed above, this system is being developed mainly to target fake and spam reviews in order to make the e-commerce environment better. People like to read reviews before they make up their minds to buy anything from web stores. Detecting and eliminating fake reviews will lead to a point where buyers are no longer manipulated by unauthentic and spurious reviews. Buying and selling products will become more frequent and easier.

According to the literature review, systems like this exist but with different attributes and domains. Some of them detect credit card fraud and some are based on fake review detection, but we intend to create a product that not only finds those reviews but also removes them from the database. This will let us maintain web stores and keep a check and balance on reviews.

4. PROPOSED METHOD
Usually, a genuine buyer will review the quality of a service or product only once, unless he or she wants to respond to other customers or deliberately wants to mislead others with exaggerated or fake reviews.
The attributes of spam and non-spam reviews, discussed in detail in the next section, are used to build the database. A web application has been created that captures user reviews, stores the data in a database, and detects spam according to the proposed method. The customer's identity is tracked using the login email, the location and IP address of the device, and its MAC address. The process is shown in Figure 1.

Figure 1: Flow of proposed method
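
The tracked attributes and the "one review per genuine buyer" rule can be illustrated with a minimal sketch. The paper does not give a schema or an implementation, so every name below (`ReviewSubmission`, `ReviewStore`, `accept_review`, and the field names) is a hypothetical choice, and an in-memory list stands in for the database. How the MAC address reaches the application is not described in the text; here it is simply assumed to be supplied with the submission.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewSubmission:
    """One submitted review plus the identity attributes listed in the text.

    Field names are illustrative; the paper does not specify a schema.
    """
    product_id: str
    text: str
    email: str
    mac_address: str  # assumed to be supplied with the submission
    ip_address: str
    latitude: float
    longitude: float


class ReviewStore:
    """In-memory stand-in for the review database."""

    def __init__(self) -> None:
        self._reviews: list[ReviewSubmission] = []

    def add(self, review: ReviewSubmission) -> None:
        self._reviews.append(review)

    def has_reviewed(self, email: str, product_id: str) -> bool:
        """True if this email (case-insensitive) already reviewed this product."""
        return any(
            r.email.lower() == email.lower() and r.product_id == product_id
            for r in self._reviews
        )


def accept_review(store: ReviewStore, review: ReviewSubmission) -> bool:
    """Store the review only if it is the customer's first for the product."""
    if store.has_reviewed(review.email, review.product_id):
        return False
    store.add(review)
    return True
```

A real deployment would replace the list with database queries and would run the login and purchase checks before this step.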


i. MAC ADDRESS

A person intending to commit fraud can change his IP address by switching networks. The device location can also be changed using a VPN. To overcome these problems, the MAC address is used. When a user posts different kinds of fraudulent reviews, the proposed system records his MAC address, because it cannot be replaced. We can stop his fraudulent reviews by blocking his MAC address, so that he can no longer access the website.

ii. IP Address

Customers can use any device to send fake reviews. However, capturing the device's IP address can help identify the particular device. Our proposed system limits the number of times a user is allowed to send online reviews from his device for a single product. A genuine customer will not attempt to post multiple reviews for the same product over and over again. This stops spam that can mislead users, which is very annoying. But spammers can change their IP address by changing their network or other source.

iii. EMAIL ID
Every website allows customers to access its services and make transactions only after verifying their identity, i.e., the user must create an account on that website. One piece of information required by every online service website is the customer's email ID. In the proposed web application, a buyer may send only one review for a specific product from the email provided. If a customer logs in with one email address, for example [email protected], and the next time tries to log in using [email protected], existing e-commerce websites treat them as two separate accounts or customers. But in reality, these IDs are the same as far as the email provider is concerned. Therefore, it is easy for spammers to sign in with different accounts and send multiple reviews under different identities. To avoid this, we suggest normalizing the account names and then comparing whether the IDs are the same or different. A better idea would be to let the email provider handle authentication and, instead of having a separate registration, simply allow the customer to use the website with his or her email. The proposed web application implements both methods. But what if a spammer creates another email address? Although it is difficult to get a new ID every time a review is posted, it is possible. To combat this attack, we can track the IP address of the device.

iv. LOCATION
Another feature used is the customer's location. To determine the location, it is only necessary to find the longitude and latitude from which the person submits reviews of the product. When the location is found, we look at how many reviews of the same product are submitted from the same location. A fake reviewer trying to raise or lower a product's rating will post more than one review to affect it, which may also adversely affect another customer's purchase decision. But location can also be changed using a VPN.

v. SPAM WORD

The spam dictionary is used to find unrelated reviews. We have taken inspiration from previous studies to find spam words such as 'buy direct', 'money back', 'the offer'. These spam words play a major role in misleading consumers. A database is created that stores these types of spam words, using the N-gram model and the Jaccard coefficient. The flow of buying and reviewing products can be seen in Figure 2 and Figure 3.

Figure 2: Flow of review and buying of products

Figure 3: Reviews of products

5. CONCLUSION & RESULT
In this work, the aim was to propose a new way of finding fake reviews that affect the buying behavior of customers. Our proposed strategy covers several attributes of a reviewer and in this way maintains a unique identity for every customer. It is able to recognize spam activity under certain assumptions. The preliminary experiments have shown promising results.
After reviewing a variety of research on fraud detection, it was found that work has been done on spam words using different techniques, but detection was still lacking in some respects, such as IP addresses, MAC addresses and email accounts. In addition, machine learning approaches were missing and very little work has been done so far using these approaches. Some researchers suggested using IP addresses, but MAC addresses were still missing, so the proposed approach covers all of these aspects: IP addresses, MAC addresses, and email accounts.
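
As a rough illustration of how the identity attributes described above (email ID, MAC address, IP address, location) might be compared between a new reviewer and earlier reviewers of the same product, consider the sketch below. It is not the authors' implementation. The names `Reviewer`, `canonical_email` and `repeat_signals` are hypothetical, and the email normalization rule (lowercasing and dropping dots from the local part) is an assumption, since the text does not state exactly how account names are normalized.

```python
from typing import NamedTuple


class Reviewer(NamedTuple):
    email: str
    mac: str
    ip: str
    lat: float
    lon: float


def canonical_email(email: str) -> str:
    """Lowercase the address and drop dots from the local part.

    Dropping dots is an assumed normalization rule, not one stated in the text.
    """
    local, _, domain = email.strip().lower().partition("@")
    return local.replace(".", "") + "@" + domain


def repeat_signals(new: Reviewer, earlier: list[Reviewer]) -> list[str]:
    """List which attributes of `new` match any earlier reviewer of the product."""
    signals = set()
    for old in earlier:
        if canonical_email(new.email) == canonical_email(old.email):
            signals.add("email")
        if new.mac.upper() == old.mac.upper():
            signals.add("mac")
        if new.ip == old.ip:
            signals.add("ip")
        # Exact coordinate match; a real system would likely allow a tolerance.
        if (new.lat, new.lon) == (old.lat, old.lon):
            signals.add("location")
    return sorted(signals)
```

How many matching signals should mark a review as spam is a policy decision the sketch leaves open; the text itself notes that IP address and location alone are easy to change.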


Table 2: Summary and classification of Reviews on same product

Sr. No | Reviews | ID | Mac Address | IP Address | Latitude | Longitude | Spam word | Impact
1 | Great product | Ahsa164@gmail.com | E4:A7;C5:A4:CF:DD | 192.168.64.1 | 28.65289 | 34.89087 | -- | non-spam
2 | Nice product and quality | Ahsa164gmail.com | E4:A7;C5:A4:CF:DD | 192.168.65.2 | 28.65289 | 34.89087 | Money back | spam
3 | Nice features, happy to buy this and get cash back as bonus | ali164gmail.com | 98:40;C5:A4:CF:2D | 192.168.67.1 | 18.65289 | 23.89087 | Cash bonus | spam
4 | Low quality. bad product. Want my money back. | Zahid1122@gmail.com | 09:0A;C5:A9:7F:9B | 192.168.69.3 | 28.65289 | 37.89087 | Money back | spam
5 | Seller doesn’t send product as described. Want my money back | Zahid11223@gmail.com | 09:0A:C5:A3:4F:9B | 192.168.64.1 | 28.65289 | 34.89087 | Money back | spam
6 | Just received my product | [email protected] | A4:7F;C5:A4:CF:2F | 192.116.64.1 | 28.65289 | 34.89087 | -- | Non-spam
7 | Amazing product | [email protected] | AF:7C:C5:88:CF:9B | 192.168.79.1 | 37.65289 | 57.89087 | -- | Non-spam
8 | Not up to the mark | [email protected] | E4:A7;C5:A4:CF:DD | 192.168.75.1 | 67.65289 | 78.89087 | -- | Non-spam
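
The "Spam word" column of Table 2 could be filled by a check along the following lines. The paper names the N-gram model and the Jaccard coefficient but does not give the procedure, so this is only a sketch under stated assumptions: the tokenizer, the `threshold` parameter and the function names are hypothetical, and the phrase list contains only the three examples quoted in the text. Each spam phrase of n words is compared with every n-word window of the review using the Jaccard coefficient on word sets.

```python
import re

# Example spam phrases quoted in the text; a real dictionary would be larger.
SPAM_PHRASES = ["buy direct", "money back", "the offer"]


def tokens(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_ngrams(words: list[str], n: int) -> list[tuple[str, ...]]:
    """All contiguous n-word windows of the token list."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B| (0.0 for two empty sets)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def matched_spam_phrases(review: str, threshold: float = 1.0) -> list[str]:
    """Return spam phrases that match some n-gram of the review.

    threshold=1.0 (an assumed default) requires the same set of words;
    because sets are compared, word order inside the window is ignored.
    """
    words = tokens(review)
    hits = []
    for phrase in SPAM_PHRASES:
        phrase_words = tokens(phrase)
        target = set(phrase_words)
        for gram in word_ngrams(words, len(phrase_words)):
            if jaccard(set(gram), target) >= threshold:
                hits.append(phrase)
                break
    return hits
```

Lowering `threshold` would admit near matches, at the cost of more false positives; the text does not say which setting was used.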

Email accounts are combined with these attributes, utilizing a Machine Learning technique to make detection more efficient than before and easier for different, less common kinds of fraud.
A customer can post a review only once from a given email, with a specific MAC address, and from a particular location. The proposed method was tested on a set of 90 reviews, and it shows accuracy of up to 75%.
The results of our proposed method are presented in Figure 3 and Table 2. Review 2 has the same email address and the same MAC address as review 1; therefore, it has been considered spam because the user repeatedly commented in favor of the product. Reviews 4 and 5 contain spam words such as "I want my money back". Also, the device which the user is using has the same MAC address, so they are considered spam.

ACKNOWLEDGEMENT

We hereby state that this research is not copied from anywhere and does not contain any plagiarism. We are thankful to the UIIT campus of PMAS Arid Agriculture University for giving us the opportunity to write a research paper. We thank Dr. Yasir Hafeez for his assistance and guidance. This research would never have been possible without the supervision of Dr. Mushhad Gilani, who guided us in every detail and helped us in every way to get this research done.

REFERENCES

[1] H. Ahmed, I. Traore, and S. Saad, "Detecting opinion spams and fake news using text classification," Security and Privacy, vol. 1, no. 1, p. e9, 2018.
[2] C. Sun, Q. Du, and G. Tian, "Exploiting product related review features for fake review detection," Mathematical Problems in Engineering, vol. 2016, 2016.
[3] M. F. Hossain, "Fake review detection using data mining," 2019.
[4] A. Mukherjee, V. Venkataraman, B. Liu, and N. Glance, "Fake review detection: Classification and analysis of real and pseudo reviews," UIC-CS-03-2013. Technical Report, 2013.
[5] H. B. Abdalla, J. Lin, G. Li, S. M. M. J. I. J. o. I. Gilani, and E. Engineering, "NoSQL: Confidential on Data Security and Data Management by Using a Mobile Application," vol. 6, no. 2, p. 84, 2016.
[6] A. Mahmood, S. M. M. Gilani, M. J. Iqbal, Z. Haider, and S. J. T. J. Daud, "Analysis and Evaluation of Secure Solutions for Terrestrial Networks," vol. 24, no. 04, pp. 63-71, 2019.
[7] A. Thakur, B. Shaikh, V. Jain, and A. Magar, "INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY CREDIT CARD FRAUD DETECTION USING HIDDEN MARKOV MODEL AND ENHANCED SECURITY FEATURES."
[8] N. Hussain, H. Turab Mirza, G. Rasool, I. Hussain, and M. Kaleem, "Spam review detection techniques: A systematic literature review," Applied Sciences, vol. 9, no. 5, p. 987, 2019.
[9] L. Nawaz and Q. Nawaz, "An Improved Methodology for Data Hiding In Images Using Haar Transformed, LSB Replacement Method and Modified PVDMF 1," International Journal of Advanced Trends in Computer Science and Engineering, vol. 10, no. 3, pp. 1690-1699, 2021.
[10] E. Elmurngi and A. Gherbi, "An empirical study on detecting fake reviews using machine learning techniques," in 2017 seventh international conference on innovative computing technology (INTECH), 2017, pp. 107-114: IEEE.
