Detecting Fraud Apps Ashish
MARKETPLACES”
A Project Report
Submitted By
Ashish Kumar Singh
Registration Number: 322100098
Submitted to
LOVELY PROFESSIONAL UNIVERSITY PHAGWARA, PUNJAB
2022-2024
Student’s Declaration
I, Ashish Kumar Singh, Reg. No. 322100098, hereby declare that the work done by
original work for the partial fulfilment of the requirements for the award of the degree,
Dated:05/08/2024
ACKNOWLEDGEMENT
Every work constitutes great deal of assistance and guidance from the people concerned
Wish to place on record my sincere gratitude to Centre for Distance and Online
my project.
my family members. At last, I would like to thank all the faculty of business management to
I’m also thankful to my friends who provided me their constant support and assistance.
Place: Bihar
Date:05/08/2024
Ashish Kumar Singh (322100098)
ABSTRACT
Fraud detection in online marketplaces has become a critical area of research and
application due to the exponential growth of e-commerce and digital transactions. This
paper explores the development and implementation of robust fraud detection
mechanisms to ensure secure and trustworthy online transactions. Utilizing a
combination of machine learning algorithms, data mining techniques, and behavioral
analysis, this study aims to identify and mitigate fraudulent activities such as fake
reviews, payment fraud, and identity theft.
The research begins with a comprehensive review of existing fraud detection methods,
highlighting their strengths and limitations. Following this, a novel hybrid model is
proposed, integrating supervised and unsupervised learning techniques to enhance the
accuracy and efficiency of fraud detection. Supervised learning algorithms, such as
decision trees and support vector machines, are employed to classify known fraudulent
patterns, while unsupervised methods, including clustering and anomaly detection, are
used to identify previously unseen fraud schemes.
The proposed system leverages large datasets from various online marketplaces,
ensuring a diverse range of transaction types and user behaviors are considered. Feature
engineering plays a crucial role in the model, with emphasis on extracting relevant
attributes that signify fraudulent activity, such as transaction times, geographic
inconsistencies, and unusual spending patterns.
Table of Contents
1 Declaration by Student
3 Acknowledgement
4 Abstract
5 List of Figures
6 List of Tables
8 List of Abbreviations
9 Chapter-1 Introduction
17 Appendix
18 Screen Shots
19 Annexures
References
LIST OF FIGURES
3 Sequence Diagram
4 Class Diagram
5 Home Screen
6 User Registration
LIST OF SCHEMES
2. Rule-Based Systems
3. Behavioral Analysis
User Behavior Analytics (UBA): Monitors and analyzes user activities to detect
deviations from normal behavior.
4. Data Analytics
Big Data Analysis: Processes large volumes of data to identify fraud trends and
correlations.
Real-Time Analytics: Provides immediate insights and alerts for potentially
fraudulent activities.
5. Identity Verification
6. Blockchain Technology
7. Collaborative Filtering
Peer Reviews and Ratings: Uses feedback from other users to identify fraudulent
sellers or buyers.
Deep Learning: Utilizes complex neural networks to detect subtle patterns of fraud
that simpler models might miss.
Natural Language Processing (NLP): Analyzes text data, such as customer reviews
and messages, to detect fraudulent content.
LIST OF ABBREVIATIONS
Here's a list of abbreviations commonly used in the context of fraud detection in online
marketplaces:
1. AI - Artificial Intelligence
INTRODUCTION
Mobile Apps have become extremely popular owing to rapid advances in mobile
technology. With so many mobile Apps available, ranking fraud is a key challenge
for the mobile App market. Millions of Apps are available to mobile users, yet most
users prefer highly ranked Apps when deciding what to download. To download an
application, a smartphone user visits an App store such as the Google Play Store or
Apple's App Store, where various application lists are shown. These lists are built on the
basis of promotion or advertisement. The user does not know in advance which
applications are useful and which are useless, so the user looks at the list and downloads
from it. Sometimes the downloaded application does not work or is not useful, which
indicates fraud in the mobile application list. To counter this fraud, we build an
application that lists the applications. In this paper, we provide a brief
view of ranking fraud and propose a ranking fraud detection system for mobile Apps.
Specifically, we first propose to accurately locate ranking fraud by mining the active
periods of Apps, namely leading sessions, using a leading session mining algorithm.
Furthermore, we investigate three types of evidences, i.e., ranking based evidences,
rating based evidences and review based evidences, by studying historical records. We
use an optimization based aggregation method to integrate all the evidences for fraud
detection. Finally, we evaluate the proposed system with real-world App data collected
from the Google App Store over a long time period. In the experiments, we validate the
effectiveness of the proposed system, and show the scalability of the detection algorithm
as well as some regularity of ranking fraud activities.
The number of mobile Apps has grown at an amazing rate over recent
years. For instance, the number of Apps at Apple's App Store and Google Play has
grown by 1.6 million. To stimulate the development of mobile Apps, many App stores
have launched daily App leaderboards, which show the chart rankings of the most popular
Apps. Indeed, the App leaderboard is one of the most important ways of promoting mobile
Apps. A higher rank on the leaderboard usually leads to a huge number of downloads and
millions of dollars in revenue. Therefore, App developers tend to explore various ways,
such as advertising campaigns, to promote their Apps in order to have them ranked as high
as possible in such App leaderboards. However, as a recent trend, instead of relying on
traditional marketing solutions, shady App developers resort to fraudulent means to
deliberately boost their Apps and eventually manipulate the chart rankings on an App
store. This is usually implemented by using so-called bot farms or human water armies to
inflate the App downloads, ratings and reviews in a very short time.
There is some related work, for example, web ranking spam detection, online
review spam detection and mobile App recommendation, but the problem of
detecting ranking fraud for mobile Apps is still under-explored. To fill this gap,
in this paper we build a ranking fraud detection system for mobile Apps.
For this, we have to address several important challenges.
First, fraud can happen at any time during the whole life cycle of an App, so the
exact time of fraud needs to be identified. Second, due to the huge number of mobile
Apps, it is difficult to manually label ranking fraud for each App, so it is important to
detect fraud automatically without using any benchmark information. Finally, mobile Apps
are not always ranked high in the leaderboard, but only in some leading events; that is,
ranking fraud usually happens in leading sessions.
Therefore, the main target is to detect ranking fraud of mobile Apps within leading
sessions. We first propose an effective algorithm to identify the leading sessions of each
App based on its historical ranking records. Then, by analysing Apps’ ranking behaviors,
we find that fraudulent Apps often have different ranking patterns in each leading session
compared with normal Apps. Thus, some fraud evidences are characterized from Apps’
historical ranking records, and three functions are developed to extract such ranking
based fraud evidences. Two further types of fraud evidences are proposed based
on Apps’ rating and review history, which reflect some anomaly patterns from Apps’
historical rating and review records. In addition, to integrate these three types of
evidences, an unsupervised evidence-aggregation method is developed for
evaluating the credibility of leading sessions from mobile Apps.
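The idea of mining leading sessions from historical ranking records can be illustrated with a minimal sketch. It is written in Python for brevity (the project itself targets C#.NET) and is not the report's actual implementation. It assumes that a leading event is a maximal run of days on which the App's rank is within the top `k`, and that neighbouring leading events separated by at most `max_gap` days are merged into one leading session. The names `leading_events`, `leading_sessions`, `k` and `max_gap` are hypothetical.

```python
from typing import List, Tuple


def leading_events(ranks: List[int], k: int) -> List[Tuple[int, int]]:
    """Return (start, end) day indices of maximal runs where rank <= k."""
    events = []
    start = None
    for day, rank in enumerate(ranks):
        if rank <= k and start is None:
            start = day                      # a leading event begins
        elif rank > k and start is not None:
            events.append((start, day - 1))  # the event ended yesterday
            start = None
    if start is not None:                    # event still open at the end
        events.append((start, len(ranks) - 1))
    return events


def leading_sessions(ranks: List[int], k: int, max_gap: int) -> List[Tuple[int, int]]:
    """Merge leading events whose gap (days outside the top k) is <= max_gap."""
    sessions: List[Tuple[int, int]] = []
    for start, end in leading_events(ranks, k):
        if sessions and start - sessions[-1][1] - 1 <= max_gap:
            sessions[-1] = (sessions[-1][0], end)  # extend previous session
        else:
            sessions.append((start, end))
    return sessions


# Daily ranks of one hypothetical App (lower number = better rank).
daily_ranks = [500, 80, 40, 60, 320, 90, 70, 400, 450, 480, 30, 20, 600]
print(leading_sessions(daily_ranks, k=100, max_gap=1))
```

With these sample ranks the first two leading events are one day apart and are merged, while the third forms a session of its own. Both thresholds are tuning parameters that would have to be chosen for the data at hand.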
1.3 OVERVIEW OF THE PROJECT
With the number of Apps increasing, this project proposes a simple and effective
system to detect fraudulent Apps. Fig. 1 shows the framework of the ranking fraud
detection system for mobile Apps.
Indeed, careful observation reveals that mobile Apps are not always ranked high in the
leaderboard, but only in some leading events, which form different leading sessions. In
other words, ranking fraud usually happens in these leading sessions. Therefore,
detecting ranking fraud of mobile Apps is actually to detect ranking fraud within
leading sessions of mobile Apps. Specifically, this system first proposes a simple yet
effective algorithm to identify the leading sessions of each App based on its historical
ranking records. Then, with the analysis of Apps’ ranking behaviors, it finds that the
fraudulent Apps often have different ranking patterns in each leading session compared
with normal Apps. Thus, it characterizes some fraud evidences from Apps’ historical
ranking records, and develops three functions to extract such ranking based fraud
evidences. Nonetheless, the ranking based evidences can be affected by App
developers’ reputation and some legitimate marketing campaigns, such as “limited-time
discount”. As a result, it is not sufficient to only use ranking based evidences.
Therefore, it further proposes two types of fraud evidences based on Apps’ rating and
review history, which reflect some anomaly patterns from Apps’ historical rating and
review records. In addition, it develops an unsupervised evidence-aggregation method
to integrate these three types of evidences for evaluating the credibility of leading
sessions from mobile Apps.
It is worth noting that all the evidences are extracted by modeling Apps’ ranking, rating
and review behaviors through statistical hypotheses tests. The proposed framework is
scalable and can be extended with other domain generated evidences for ranking fraud
detection. Experimental results show the effectiveness of the proposed system, the
scalability of the detection algorithm as well as some regularity of ranking fraud
activities.
1.4 OBJECTIVES
1.5 MODULES
4. Evidence Aggregation
After downloading an App, users generally rate it. The rating given by users is one
of the most important factors for the popularity of an App. An App with a higher rating
attracts more users to download it and can naturally be ranked higher in the chart
rankings. Thus, in ranking fraud of Apps, rating based evidence is also an important
feature and needs to be considered.
Pre-processing of ratings
Generally, ratings range from one to five. In this module we compute the average rating
of a particular App and compare it with a threshold. Ratings less than or equal to three
are considered negative ratings and ratings above three are considered positive ratings.
Finally, the output is in the form of zeros and ones, i.e. a negative rating gives zero
as output while a positive rating gives one.
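The rating pre-processing described above can be sketched as follows. This is an illustrative Python sketch rather than the project's C#.NET code. The function names `binarize_rating` and `rating_evidence` are hypothetical, and the threshold of three is taken from the text.

```python
from typing import Sequence


def binarize_rating(rating: float, threshold: float = 3.0) -> int:
    """Map a rating to 1 (positive) if above the threshold, else 0 (negative)."""
    return 1 if rating > threshold else 0


def rating_evidence(ratings: Sequence[float], threshold: float = 3.0) -> int:
    """Average an App's ratings and compare the average with the threshold."""
    if not ratings:
        raise ValueError("at least one rating is required")
    average = sum(ratings) / len(ratings)
    return binarize_rating(average, threshold)


print(rating_evidence([5, 5, 4]))        # average above three
print(rating_evidence([5, 4, 1, 2, 3]))  # average exactly three
```

An average of exactly three falls on the negative side, matching the rule that ratings less than or equal to three are negative.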
Along with ratings, users are allowed to write reviews about the App. Such reviews
reflect users’ personal experiences of using particular mobile Apps. The reviews given
by users are one of the most important factors for the popularity of the App. As the
reviews are written in natural language, they are first pre-processed and then sentiment
analysis is performed on the pre-processed reviews. The system finds the sentiment of
each review, which can be positive or negative. A positive review adds one to the positive
score; a negative review adds one to the negative score. In this way the system scores
each of the reviews and determines whether the App is fraudulent or not on the basis of
review based evidences. This module contains two subparts given below:
Pre-processing Reviews
2. Stop word removal: Stop words are commonly used words such as: a, the, and,
for, from, is, in and many more.
Sentiment Analysis
After pre-processing the reviews, the system finds the sentiment of each review
and classifies it as positive or negative. A positive review adds one to the
positive score; a negative review adds one to the negative score. In this way the
system scores each of the reviews and determines whether the App is fraudulent or
not on the basis of review-based evidences.
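A minimal sketch of the two subparts, stop word removal followed by sentiment scoring, is given below in Python. It is only an illustration. The stop word set reuses the examples listed above, while the tiny `POSITIVE` and `NEGATIVE` word lists and the function names are hypothetical stand-ins for whatever sentiment lexicon or classifier the real system uses. Reviews with no clear polarity are counted on neither side, which is an assumption of this sketch.

```python
import re
from typing import Iterable, List, Tuple

STOP_WORDS = {"a", "the", "and", "for", "from", "is", "in"}
# Hypothetical miniature lexicons, for illustration only.
POSITIVE = {"good", "great", "useful", "love", "excellent"}
NEGATIVE = {"bad", "useless", "crash", "crashes", "fake", "poor"}


def preprocess(review: str) -> List[str]:
    """Lower-case the review, split it into words and drop stop words."""
    tokens = re.findall(r"[a-z']+", review.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def sentiment(review: str) -> str:
    """Classify a review by counting lexicon hits among its tokens."""
    tokens = preprocess(review)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # assumption: ties are not counted on either side


def score_reviews(reviews: Iterable[str]) -> Tuple[int, int]:
    """Return (positive score, negative score) over all reviews."""
    positive = negative = 0
    for review in reviews:
        label = sentiment(review)
        if label == "positive":
            positive += 1
        elif label == "negative":
            negative += 1
    return positive, negative


sample = ["The app is great and useful", "Useless, it crashes", "Fake app"]
print(score_reviews(sample))
```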
1.5.3 RANKING BASED EVIDENCES
In this phase, we detect Apps’ ranking behavior by finding three phases of ranking,
namely, the rising phase, the maintaining phase and the recession phase. The phase
in which the App’s ranking climbs to its peak position in the leaderboard is called the
rising phase, and holding that peak position for a specific time period is called the
maintaining phase. If the ranking of the App then decreases rapidly within the leading
event, that phase is called the recession phase.
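The three phases can be illustrated by splitting the daily ranks of one leading event around its peak. The Python sketch below is a simplified illustration, not the report's method. It assumes that a lower number means a better rank and that ranks within `tolerance` positions of the peak still count as holding the peak. Both `ranking_phases` and `tolerance` are hypothetical names.

```python
from typing import Dict, List


def ranking_phases(ranks: List[int], tolerance: int = 0) -> Dict[str, List[int]]:
    """Split a leading event's ranks into rising, maintaining and recession phases."""
    if not ranks:
        raise ValueError("a leading event needs at least one rank")
    peak = min(ranks)  # best (numerically smallest) rank reached
    near_peak = [i for i, r in enumerate(ranks) if r <= peak + tolerance]
    first, last = near_peak[0], near_peak[-1]
    return {
        "rising": ranks[:first],                 # climbing towards the peak
        "maintaining": ranks[first:last + 1],    # staying at or near the peak
        "recession": ranks[last + 1:],           # dropping away afterwards
    }


event = [90, 40, 10, 3, 1, 2, 1, 30, 85]
print(ranking_phases(event, tolerance=2))
```

The lengths of the three lists give simple features: for example, a very short rising and recession phase around a maintaining phase could be examined as a candidate ranking based evidence.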
After the three types of fraud evidences are extracted, the next step is to combine
them for ranking fraud detection. Every evidence is given a Boolean weight of 0 or 1,
where 0 indicates fraud and 1 indicates no fraud.
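One simple way to combine the three Boolean evidences is a majority vote, sketched below in Python. The text does not specify the combination rule, so the majority vote and the name `is_fraud` are assumptions made purely for illustration. The encoding follows the text: 0 means fraud and 1 means no fraud.

```python
def is_fraud(ranking: int, rating: int, review: int) -> bool:
    """Flag fraud when at least two of the three evidences are 0 (fraud)."""
    votes = (ranking, rating, review)
    if any(v not in (0, 1) for v in votes):
        raise ValueError("each evidence must be 0 (fraud) or 1 (no fraud)")
    return votes.count(0) >= 2  # assumption: simple majority decides


print(is_fraud(ranking=0, rating=0, review=1))
print(is_fraud(ranking=1, rating=1, review=0))
```

A weighted combination could replace the vote if some evidences prove more reliable than others.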
2. SYSTEM ANALYSIS
In the literature, while there is some related work, such as web
ranking spam detection, online review spam detection and mobile App
recommendation, the problem of detecting ranking fraud for mobile Apps is still
under-explored. Generally speaking, the related works of this study can be grouped
into three categories. The first category is about web ranking spam detection. The
second category is focused on detecting online review spam. Finally, the third
category includes the studies on mobile App recommendation.
2.1.1 DISADVANTAGES
Although some of the existing approaches can be used for anomaly detection
from historical rating and review records, they are not able to extract fraud evidences
for a given time period (i.e., leading session).
They cannot detect ranking fraud that happened in Apps’ historical leading sessions.
There is no existing benchmark to decide which leading sessions or Apps really
contain ranking fraud.
In today’s era, due to rapid development in mobile technology and mobile
devices, mobile Apps have become a very interesting and popular
concept. As there is a large number of mobile Apps, ranking fraud is a challenging
factor for the mobile App market. Ranking fraud is the term used for
fraudulent or suspicious activities intended to boost
Apps in the popularity list. In fact, App developers frequently use tricky means
to increase their Apps’ sales. The main aim is to develop a system that examines
ranking, rating and review behaviours to extract ranking based evidences,
rating based evidences and review based evidences, and then uses optimization based
aggregation to combine all the evidences for fraud detection.
2.2.1 Advantages
The proposed framework is scalable and can be extended with other domain
generated evidences for ranking fraud detection.
There is no existing benchmark to decide which leading sessions or Apps really
contain ranking fraud. Thus, we develop four intuitive baselines and invite five
human evaluators to validate.
The feasibility of the project is analyzed in this phase and a business proposal is
put forth with a very general plan for the project and some cost estimates. During
system analysis the feasibility study of the proposed system is to be carried out. This
is to ensure that the proposed system is not a burden to the company. For feasibility
analysis, some understanding of the major requirements for the system is essential.
Three key considerations involved in the feasibility analysis are
Economic Feasibility
Technical Feasibility
Social Feasibility
This study is carried out to check the economic impact that the system will
have on the organization. The amount of funds that the company can pour into the
research and development of the system is limited. The expenditures must be
justified. The developed system was well within the budget, and this was achieved
because most of the technologies used are freely available. Only the customized
products had to be purchased.
This study is carried out to check the technical feasibility, that is, the technical
requirements of the system. Any system developed must not place a high demand on
the available technical resources, as this would lead to high demands being placed
on the client. The developed system must have modest requirements, as only minimal
or no changes are required for implementing this system.
This aspect of the study checks the level of acceptance of the system by the
user. This includes the process of training the user to use the system efficiently. The
user must not feel threatened by the system, but must instead accept it as a necessity.
The level of acceptance by the users depends solely on the methods that are
employed to educate users about the system and to make them familiar with it.
Users’ level of confidence must be raised so that they are also able to offer
constructive criticism, which is welcomed, as they are the final users of the system.
3. SYSTEM SPECIFICATION
RAM : 1 GB
Language : C#.NET
The user interface is built using ASP.NET. SQL Server 2008 is used as the back end.
Additional technologies used are web services and ADO.NET. As for the operating
system, any software platform compatible with the .NET Framework can be used.
Introduction to Dot Net
Microsoft .NET is a set of Microsoft software technologies for rapidly building and
integrating XML Web services, Microsoft Windows-based applications, and Web
solutions. The .NET Framework is a language-neutral platform for writing programs
that can easily and securely interoperate. There is no language barrier with .NET:
numerous languages are available to the developer, including Managed C++, C#,
Visual Basic and JScript. The .NET Framework provides the foundation for
components to interact seamlessly, whether locally or remotely on different platforms.
It standardizes common data types and communications protocols so that components
created in different languages can easily interoperate.
“.NET” is also the collective name given to various software components built upon the
.NET platform. These include both products (Visual Studio .NET and Windows .NET
Server, for instance) and services (like Passport, .NET My Services, and so on).
The CLR is described as the “execution engine” of .NET. It provides the environment
within which programs run. The most important features are
Managed Code
Managed code is code that targets .NET and contains certain extra information,
“metadata”, to describe itself. While both managed and unmanaged code can run in
the runtime, only managed code contains the information that allows the CLR to
guarantee, for instance, safe execution and interoperability.
Managed Data
With Managed Code comes Managed Data. The CLR provides memory allocation and
deallocation facilities, and garbage collection. Some .NET languages use Managed Data by
default, such as C#, Visual Basic.NET and JScript.NET, whereas others, namely C++,
do not. Targeting the CLR can, depending on the language being used, impose
certain constraints on the features available. As with managed and unmanaged code,
one can have both managed and unmanaged data in .NET applications; unmanaged data
does not get garbage collected but instead is looked after by unmanaged code.
The CLR uses something called the Common Type System (CTS) to strictly enforce
type-safety. This ensures that all classes are compatible with each other, by describing
types in a common way.
The CTS defines how types work within the runtime, which enables types in one language
to interoperate with types in another language, including cross-language exception
handling. As well as ensuring that types are only used in appropriate ways, the runtime
also ensures that code doesn’t attempt to access memory that hasn’t been allocated to it.
The CLR provides built-in support for language interoperability. To ensure that the
user can develop managed code that can be fully used by developers using any
programming language, a set of language features and rules for using them called the
Common Language Specification (CLS) has been defined. Components that follow
these rules and expose only CLS features are considered CLS-compliant.
.NET provides a single-rooted hierarchy of classes, containing over 7000 types. The
root of the namespace is called System; this contains basic types like Byte, Double,
Boolean, and String, as well as Object. All objects derive from System.Object. As well
as objects, there are value types. Value types can be allocated on the stack, which can
provide useful flexibility. There are also efficient means of converting value types to
object types if and when necessary.
The set of classes is pretty comprehensive, providing collections, file, screen, and
network I/O, threading, and so on, as well as XML and database connectivity.
Constructors are used to initialize objects, whereas destructors are used to destroy them.
In other words, destructors are used to release the resources allocated to the object. In
C#, a finalizer is declared with destructor syntax (~ClassName) and is compiled into an
override of Object.Finalize. The finalizer is used to complete the tasks that must be
performed when an object is destroyed. It is called automatically by the garbage
collector when the object is reclaimed, and it cannot be called explicitly from user
code.
Garbage Collection
In .NET, the garbage collector checks for objects that are no longer in use by
applications. When the garbage collector comes across an object that is eligible for
garbage collection, it releases the memory occupied by the object.
Overloading
Overloading is another feature in C#. Overloading enables the user to define multiple
methods with the same name, where each method has a different set of parameters.
Besides using overloading for methods, the user can use it for constructors and
indexers in a class.
Structured Exception Handling
.NET supports structured exception handling, which enables the user to detect and
handle errors at runtime. In C#.NET, the user uses try…catch…finally
statements to create exception handlers. Using try…catch…finally statements, the
user can create robust and effective exception handlers to improve the reliability of
the application.
C#.NET is also compliant with CLS (Common Language Specification) and supports
structured exception handling. CLS is set of rules and constructs that are supported by
the CLR (Common Language Runtime). CLR is the runtime environment provided by
the .NET Framework; it manages the execution of the code and also makes the
development process easier by providing services.
Conclusion of .NET
BACK END
FEATURES OF SQL-SERVER
The OLAP Services feature of SQL Server has been renamed: the term OLAP Services has
been replaced with the term Analysis Services. Analysis Services also includes a new data
mining component. The term repository is used only in reference to the repository engine
within Meta Data Services.
TABLE
QUERY
FORM
REPORT
TABLE
VIEWS OF TABLE
Design View
Datasheet View
DESIGN VIEW
To build or modify the structure of a table, the user works in the table design view.
Here the user can specify what kind of data each field will hold.
DATASHEET VIEW
To add, edit or analyse the data itself, the user works in the table’s datasheet view.
QUERY
Queries are the real workhorses in a database, and can perform many different
functions. Their most common function is to retrieve specific data from the tables. The
data the user wants to see is usually spread across several tables, and queries allow the
user to view it in a single datasheet. Also, since the user usually does not want to see all
the records at once, queries let the user add criteria to "filter" the data down to just the
records the user wants. Queries often serve as the record source for forms and reports.
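The idea of a query that pulls data from several tables and filters it by a criterion can be sketched as follows. The report's back end is SQL Server 2008. This Python sketch uses the standard library's in-memory SQLite database only so that it is self-contained, and the `apps` and `reviews` tables, their columns and the function names are hypothetical.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a small in-memory database with two related tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE apps (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE reviews (app_id INTEGER REFERENCES apps(id), rating INTEGER);
        INSERT INTO apps VALUES (1, 'AppA'), (2, 'AppB');
        INSERT INTO reviews VALUES (1, 5), (1, 4), (2, 1), (2, 3);
    """)
    return conn


def low_rated_apps(conn: sqlite3.Connection, max_avg: float):
    """Join apps with their reviews and keep Apps whose average rating <= max_avg."""
    query = """
        SELECT a.name, AVG(r.rating) AS avg_rating
        FROM apps AS a
        JOIN reviews AS r ON r.app_id = a.id
        GROUP BY a.id, a.name
        HAVING AVG(r.rating) <= ?
        ORDER BY a.name
    """
    # The criterion is passed as a parameter rather than pasted into the SQL.
    return conn.execute(query, (max_avg,)).fetchall()


print(low_rated_apps(build_demo_db(), 3.0))
```

The join gathers data spread across two tables into one result, and the `HAVING` clause plays the role of the filter criterion described above.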
FORMS
Forms are sometimes referred to as "data entry screens." They are the interfaces the
user uses to work with the data, and they often contain command buttons that perform
various commands. The user can create a database without using forms by simply
editing the data in the table datasheets. However, most database users prefer to use
forms for viewing, entering, and editing data in the tables.
REPORTS
Reports are what the user uses to summarize and present data in the tables. Each report
can be formatted to present the information in the most readable way possible. A
report can be run at any time, and will always reflect the current data in the database.
Reports are generally formatted to be printed out, but they can also be viewed on the
screen, exported to another program, or sent as an e-mail message.
4. SYSTEM DESIGN
Input design is the process of converting user inputs into a computer-based format. The
project requires a set of information from the user to prepare a report, and organized
input data are needed in order to prepare it.
In the system design phase, the diagram identifies logical data flow, data stores and
destinations. Input data is collected and organised into groups of similar data. The goal
behind designing input data is to make data entry easy and free from logical
errors. The input required from all types of clients is the user name and password only.
If they are valid, the client is allowed to enter the software.
Objectives
Outputs are the most important and direct source of information to the user and to
management. Efficient and intelligible output design should improve the system’s
relationship with the user and help in decision making. Output Design generally deals
with the results generated from stored or calculated values.
Reports are displayed either as screen preview or printed form. Most end users will
not actually operate the information systems or enter data through workstations, but
they will use the output from the system.
Form Design
The cost of collecting raw data and cost of distributing processed information are
major costs of a system. So careful forms design can affect the cost effectiveness of the
system. Well-designed forms can increase efficiency; improve workflow and lower
system costs.
Code Design
When a large volume of data is being handled, it is important that items be identified,
sorted or selected easily. To accomplish this, each data item must have a unique
identification and must be related to other items of data of the same type. Thus, codes
are used to identify items uniquely.
The general objective is to make information access easy, quick, inexpensive and flexible
for the user. In database design several specific objectives are considered:
Control Redundancy:
Redundant data occupies space and therefore, is wasteful. If versions of the same data
are in different phase of updating, a system often gives conflicting information. A
unique aspect of database design is storing data only once, which controls redundancy
and improves system performance.
Data Independence:
The accuracy and integrity of the database ensure that data quality and content remain
constant. Integrity controls detect data inaccuracies where they occur.
Privacy and Security:
For the data to remain private, security measures must be taken to prevent unauthorized
access. Database security means that data are protected from various forms of
destruction. Users must be positively identified and their actions monitored. Managing
the database requires a Database Administrator (DBA), whose key functions are to
manage data activities, the database structure and the DBMS. In addition to a
managerial background, the DBA needs technical knowledge to deal with database
designers.
A data flow diagram (DFD) is a graphical system model that shows all of the
main requirements for an information system in one diagram: inputs and outputs,
processes, and data storage. A DFD describes what data flows rather than how it is
processed. Everyone working on a development project can see all aspects of the
system working together at once with a DFD. That is one reason for its popularity. The
DFD is also easy to read because it is a graphical model. The DFD is mainly used during
problem analysis. End Users, management, and all information systems workers
typically can read and interpret the DFD with minimal training.
[Figure: Data flow diagram, Level 0. Elements: User; User Login; User Authentication; Output.]
[Figure: Data flow diagram, Level 1. Elements: Database; Optimize records.]
[Figure: Data flow diagram, Level 2. Elements: Apps with historical records; Rating based evidence; Ranking based evidence; Review based evidence; Optimization based aggregation.]
[Figure: Data flow diagram, Level 3. Elements: Apps with historical records; Rating based evidence; Ranking based evidence; Review based evidence; Optimization based aggregation; Compare with previous evidence; Detect fraud.]
1. Provide users a ready-to-use, expressive visual modeling Language so that they can
develop and exchange meaningful models.
A use case diagram in the Unified Modeling Language (UML) is a type of behavioral
diagram defined by and created from a Use-case analysis. Its purpose is to present a
graphical overview of the functionality provided by a system in terms of actors, their
goals (represented as use cases), and any dependencies between those use cases. The
main purpose of a use case diagram is to show what system functions are performed for
which actor. Roles of the actors in the system can be depicted.
[Figure: Use case diagram. Actors: user, server. Use cases: User registration; Upload review; Apps with historical records; Rating based evidence; Review based evidence; Ranking based evidence; Optimization based aggregation; Compare with previous evidence to find defect range.]
4.6. SYSTEM FLOW DIAGRAM
4.8. CLASS DIAGRAM
[Figure: Class diagram.
Server: Rating based evidence; Ranking based evidence; Review based evidence; Optimization based; Compare with previous; Detect fraud range ().
User: User name; Company name; address; Contact details; Purchase apps (); Review apps ().
Mobile apps records: Apps name; Apps details; Apps reviews (); Maintains historical records ().]
5. SYSTEM TESTING
Testing is the process of running a system with the intention of finding errors. Testing
enhances the integrity of a system by detecting deviations in design and errors in the
system. Testing aims at detecting error-prone areas. This helps in the prevention of
errors in a system. Testing also adds value to the product by conforming to the user
requirements.
The main purpose of testing is to detect errors and error-prone areas in a system.
Testing must be thorough and well-planned. A partially tested system is as bad as an
untested system. And the price of an untested and under-tested system is high.
Implementation is the final and important phase. It involves user training and system
testing in order to ensure successful running of the proposed system. The user tests
the system and changes are made according to their needs. The testing involves
testing the developed system using various kinds of data. While testing, errors are
noted and corrections are made.
Unit testing focuses effort on the smallest unit of software design and is also known
as module testing. The modules are tested separately, during the programming stage
itself. In this step, each module was found to work satisfactorily with regard to its
expected output.
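A unit test of a single module might look like the following minimal sketch. It uses Python's standard unittest module for illustration only; the function average_rating is a hypothetical module and is not taken from the project's source code.

```python
import unittest


def average_rating(ratings):
    """Hypothetical module under test: mean of a list of 1-5 star ratings."""
    if not ratings:
        raise ValueError("no ratings supplied")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be between 1 and 5")
    return sum(ratings) / len(ratings)


class AverageRatingTest(unittest.TestCase):
    def test_typical_input(self):
        self.assertAlmostEqual(average_rating([5, 4, 3]), 4.0)

    def test_empty_input_is_rejected(self):
        with self.assertRaises(ValueError):
            average_rating([])

    def test_out_of_range_rating_is_rejected(self):
        with self.assertRaises(ValueError):
            average_rating([6])


def run_unit_tests():
    """Run the test case above and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageRatingTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

Each test exercises the module in isolation, so a failure points to that module rather than to its interaction with others.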
5.3.2 Integration Testing
Data can be lost across an interface, one module can have an adverse effect on
another, and sub-functions, when combined, may not produce the desired major
function. Integration testing is a systematic approach for constructing the program
structure while at the same time conducting tests to uncover errors associated with
the interfaces. The objective is to take unit-tested modules and build the program
structure from them. All the modules are combined and tested as a whole.
After validation testing, the next step is output testing of the proposed system,
since no system is useful if it does not produce the required output in the
specified format. The output format on the screen was found to be correct; it had
been designed during system design according to the user needs. The hard-copy
output also met the requirements specified by the user. Hence output testing did
not result in any correction to the system.
5.3.5 User Acceptance Testing
User acceptance of a system is the key factor for the success of any system.
The system under consideration was tested for user acceptance by keeping in
constant touch with the prospective system users during development and
making changes whenever required.
Error messages and warning messages are “bad news” delivered to users of
interactive systems when something has gone awry. The developed software
therefore displays messages when the user makes a mistake, to help the user work
with it correctly. When the user enters a wrong password, the message displayed
is “Invalid Password! Please try again.” While adding information in the various
modules, no field may be left empty when saving to the database; otherwise the
corresponding error message is shown immediately. While updating information,
if any field is left empty, messages such as “Select the task title” or “Please
enter numeric values” are displayed accordingly.
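The validation behaviour described above can be sketched as follows. This is an illustrative Python sketch, not the project's ASP.NET code: the function names and the wording of the empty-field message are assumptions, and a real system would compare password hashes rather than plain text.

```python
def password_message(entered, stored):
    """Return the error message for a wrong password, or None if it matches."""
    # Plain-text comparison is for illustration only.
    if entered == stored:
        return None
    return "Invalid Password! Please try again."


def validate_fields(form, numeric_fields=()):
    """Collect one error message per invalid field, in form order.

    form           -- mapping of field name to the raw text the user typed
    numeric_fields -- names of fields that must contain digits only
    """
    errors = []
    for name, value in form.items():
        text = value.strip()
        if not text:
            # Hypothetical wording; the report does not give this message.
            errors.append(f"{name}: this field must not be empty")
        elif name in numeric_fields and not text.isdigit():
            errors.append(f"{name}: Please enter numeric values")
    return errors
```

Returning the messages as a list lets the caller decide how to display them and block the save until the list is empty.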
6. SYSTEM IMPLEMENTATION
To arrive at a more accurate evaluation, all classification runs were executed and
throughput performance was measured. System implementation is the practice of
creating or modifying a system to create a new business process or replace an
existing one. Implementation of software refers to the final installation of the
package in its real environment, to the satisfaction of the intended users, and to
the operation of the system. Users are often not sure that the software is meant to
make their job easier.
The active user must be aware of the benefits of using the system.
The system has been tested with sample data, changed according to the
user requirements, and run in parallel with the existing system to find
discrepancies. When the number of threads rises with respect to the number of
cores. The user has also been apprised of how to run the system during the training
period.
Implementation Plan
Implementation is a crucial stage in the life cycle of the newly designed
system. Implementation means converting a new or revised system design
into an operational one. The mechanisms involved in dealing with the data structures
are equally important and have to be taken into consideration. This is the stage of the
project where the theoretical design is turned into a working system. In this project,
implementation includes all the activities that take place to convert from the old
system to the new one. The most important phase of the implementation plan is changeover.
Careful planning
Changeover
The implementation is to be done step by step, since testing with dummy data
will not always reveal the faults. The system will be given to the employees to
work with. If any error or failure is found, the system can be corrected before it is
implemented in full. The trial should continue until the system is sure to function
without any failure or error. Precautions should be taken so that any error that
does occur does not bring the process to a complete halt. The system can be fully
established if it does not produce any error during the testing periods.
System Maintenance
Corrective Maintenance
Preventative Maintenance
preserving the useful life of equipment and avoiding premature equipment failures,
minimising any impact on operational requirements. Preventative maintenance is
carried out only on those items where a failure would have expensive or
unacceptable consequences e.g. lifts, fire alarms, electricity supply and gas supply.
Many of these items are also subject to a statutory requirement for inspection and
preventive maintenance.
Perfective Maintenance
Adaptive Maintenance
RESULT AND DISCUSSION
False Positives and Negatives: The trade-off between false positives (legitimate
transactions flagged as fraud) and false negatives (fraudulent transactions not
detected) was a critical factor. Models like Neural Networks tended to have fewer
false negatives but a slightly higher rate of false positives.
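The trade-off can be made concrete by computing the error rates from a confusion matrix. The sketch below is illustrative Python; the function name and the counts in the usage example are hypothetical and are not results from this project.

```python
def _ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def confusion_rates(tp, fp, tn, fn):
    """Derive error rates from confusion-matrix counts.

    tp -- fraudulent transactions correctly flagged
    fp -- legitimate transactions wrongly flagged as fraud
    tn -- legitimate transactions correctly passed
    fn -- fraudulent transactions that were not detected
    """
    return {
        "false_positive_rate": _ratio(fp, fp + tn),
        "false_negative_rate": _ratio(fn, fn + tp),
        "precision": _ratio(tp, tp + fp),
        "recall": _ratio(tp, tp + fn),
    }


# Hypothetical counts, for illustration only.
rates = confusion_rates(tp=80, fp=30, tn=870, fn=20)
```

Lowering the decision threshold of a model typically moves errors from the false-negative count to the false-positive count, which is the trade-off described above.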
2. Behavioral Analysis
3. Feature Importance
Key Features: Features such as transaction amount, user account age, frequency of
transactions, geographical location discrepancies, and IP address history were
identified as crucial indicators for fraud detection.
Feature Engineering: The creation of derived features (e.g., average transaction
value per user, frequency of high-value purchases) significantly improved model
performance, highlighting the importance of domain-specific feature engineering.
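The two derived features named above can be computed per user as in the following Python sketch. The input layout (pairs of user id and amount), the function name, and the high-value threshold of 500 are assumptions made for illustration.

```python
from collections import defaultdict


def derive_user_features(transactions, high_value_threshold=500.0):
    """Compute per-user derived features from (user_id, amount) pairs.

    Returns a dict mapping each user id to:
      avg_transaction_value -- mean amount over the user's transactions
      high_value_share      -- fraction of transactions at or above the threshold
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    high_counts = defaultdict(int)

    for user_id, amount in transactions:
        totals[user_id] += amount
        counts[user_id] += 1
        if amount >= high_value_threshold:
            high_counts[user_id] += 1

    return {
        user_id: {
            "avg_transaction_value": totals[user_id] / counts[user_id],
            "high_value_share": high_counts[user_id] / counts[user_id],
        }
        for user_id in counts
    }
```

A classifier can then be trained on these per-user values alongside the raw transaction fields.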
Data Integrity: The quality and completeness of data were paramount. Missing or
inaccurate data led to a decrease in detection accuracy. Data pre-processing steps,
including imputation of missing values and normalization, were critical for
maintaining model performance.
Historical Data: Historical transaction data provided a rich source of information for
training models, allowing them to learn from past fraudulent activities and improve
prediction accuracy.
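The pre-processing steps mentioned under data integrity, imputation of missing values and normalization, can be sketched in Python as follows. Median imputation and min-max scaling are chosen here as common examples; the report does not state which variants were actually used.

```python
from statistics import median


def impute_missing(values):
    """Replace None entries with the median of the known values."""
    known = [v for v in values if v is not None]
    if not known:
        raise ValueError("cannot impute a column with no known values")
    fill = median(known)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Scale values linearly into the range [0, 1]."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# Usage: fill the gap first, then scale the completed column.
amounts = min_max_normalize(impute_missing([10.0, None, 30.0]))
```

Imputation has to run before normalization, because the minimum and maximum cannot be computed over missing entries.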
5. Real-Time Detection
Latency and Speed: Real-time fraud detection systems need to balance accuracy with
speed. Implementing optimized algorithms and efficient data processing pipelines
ensured that fraud detection systems could operate in real-time without significant
delays.
Scalability: The ability to scale and handle high transaction volumes was essential for
maintaining performance in online marketplaces. Distributed computing frameworks
and cloud-based solutions were effective in addressing scalability concerns.
CONCLUSION AND FUTURE ENHANCEMENTS
7.1 CONCLUSIONS
A ranking fraud detection system for mobile Apps has been developed in this project.
Specifically, it first showed that ranking fraud happens in leading sessions and
provided a method for mining the leading sessions of each App from its historical
ranking records. Then, it identified ranking-based, rating-based, and review-based
evidence for detecting ranking fraud. Moreover, it proposed an optimization-based
aggregation method that integrates all the evidence to evaluate the credibility of
leading sessions of mobile Apps. A distinctive aspect of this approach is that all the
evidence can be modeled by statistical hypothesis tests, so it is easy to extend with
other evidence from domain knowledge. Finally, the proposed system was validated
with extensive experiments on real-world App data collected from Apple’s App
Store. Experimental results showed the effectiveness of the proposed approach.
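The idea of mining leading sessions from historical ranking records can be illustrated with a simplified Python sketch. It assumes that a leading event is a run of consecutive days on which the App ranks within a top-K threshold, and that events separated by a short gap are merged into one session. The parameter names top_k and max_gap and their default values are assumptions; this is not the project's implementation.

```python
def mine_leading_sessions(ranks, top_k=300, max_gap=7):
    """Group an App's daily ranks into leading sessions.

    ranks   -- list of daily ranks; None means the App was not ranked that day
    top_k   -- assumed ranking threshold for being "leading"
    max_gap -- events closer than this many days are merged into one session
    Returns a list of (start_day, end_day) index pairs.
    """
    # Step 1: find leading events, i.e. maximal runs of days within top_k.
    events = []
    start = None
    for day, rank in enumerate(ranks):
        in_top = rank is not None and rank <= top_k
        if in_top and start is None:
            start = day
        elif not in_top and start is not None:
            events.append((start, day - 1))
            start = None
    if start is not None:
        events.append((start, len(ranks) - 1))

    # Step 2: merge neighbouring events whose gap is below max_gap.
    sessions = []
    for ev_start, ev_end in events:
        if sessions and ev_start - sessions[-1][1] < max_gap:
            sessions[-1] = (sessions[-1][0], ev_end)
        else:
            sessions.append((ev_start, ev_end))
    return sessions
```

Each resulting session is the unit on which the ranking-based, rating-based, and review-based evidence would then be evaluated.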
In the future, it is planned to study more effective fraud evidence and to analyze the
latent relationship among ratings, reviews, and rankings. Moreover, the ranking fraud
detection approach will be extended to other mobile App related services, such as
mobile App recommendation, to enhance user experience.
APPENDIX
SOURCE CODE
Create Account.aspx
using System;
using System.Collections;
using System.Configuration;
using System.Data;
using System.Linq;
using System.Web;
using System.Web.Security;
using System.Web.UI;
using System.Web.UI.HtmlControls;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Xml.Linq;
public partial class createaccount1 : System.Web.UI.Page
Label11.Text = Convert.ToString(objt.userid());
Label7.Text = (string)Session["username"];
lbl_eid.Visible = false;
lbl_eid.Text = (string)Session["id"];
lbl_date.Text = Convert.ToString(DateTime.Now);
lbl_date.Visible = false;
Session["uid"] = Label11.Text;
objt.createusers(Label11.Text,lbl_eid.Text, Label7.Text, TextBox4.Text,
TextBox5.Text, TextBox1.Text, TextBox2.Text, TextBox6.Text, TextBox7.Text);
objt.insert(Label11.Text, lbl_eid.Text, Label7.Text, lbl_date.Text);
Response.Redirect("rating.aspx");
}
else
{
MsgBox.Show("Please give the correct verification code");
Admin Recommend.aspx.cs
using System;
using System.Collections;
using System.Configuration;
using System.Data;
using System.Linq;
using System.Web;
using System.Web.Security;
using System.Web.UI;
using System.Web.UI.HtmlControls;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Xml.Linq;
public partial class adminrecomend : System.Web.UI.Page
for (int h = 0; h < data.Tables[0].Rows.Count; h++)
{
DropDownList1.Items.Add(data.Tables[0].Rows[h]["pname"].ToString());
}
protected void Button1_Click(object sender, EventArgs e)
GridView1.Visible = true;
DataSet ds = new DataSet();
ds = cs.admintabl(DropDownList1.SelectedItem.Value);
GridView1.DataSource = ds;
GridView1.DataBind();
}
public void bind()
{
DataSet ds = new DataSet();
ds = cs.admintabl(DropDownList1.SelectedItem.Value);
GridView1.DataSource = ds;
GridView1.DataBind();
}
protected void GridView1_PageIndexChanging(object sender,
GridViewPageEventArgs e)
GridView1.PageIndex = e.NewPageIndex;
bind();
Opinion.aspx.cs
using System;
using System.Collections;
using System.Configuration;
using System.Data;
using System.Linq;
using System.Web;
using System.Web.Security;
using System.Web.UI;
using System.Web.UI.HtmlControls;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Xml.Linq;
public partial class postingopinion : System.Web.UI.Page
Label3.Visible = false;
TextBox1.Visible = false;
lbl_dat.Visible = false;
Label4.Visible = false;
lbl_dat.Text = Convert.ToString(DateTime.Now);
Label4.Text = Convert.ToString(obj.createid());
protected void Button3_Click(object sender, EventArgs e)
Session["id"] = Label4.Text;
if (DropDownList1.SelectedItem.Text == "Others")
Session["comment"] = TextBox1.Text;
if (TextBox1.Text == "
MsgBox.Show("Enter the required data");
.UI.WebControls.WebParts;
using System.Xml.Linq;
using System.Data.SqlClient;
public partial class ranking : System.Web.UI.Page
class1 cs = new class1();
GridView1.Visible = false;
Label2.Visible = false;
Label3.Visible = false;
DataSet data = new DataSet();
data = cs.dropproduct();
DropDownList1.Items.Add("--Select--");
for (int h = 0; h < data.Tables[0].Rows.Count; h++)
Label3.Visible = true;
Label2.Text=" members Recommendation"+" ";
Label3.Text = "= " +
Convert.ToString(cs.countpro1(DropDownList1.SelectedItem.Text)) +
" in " + DropDownList1.SelectedItem.Text;
}
protected void LinkButton1_Click(object sender, EventArgs e)
DataSet ds1 = new DataSet();
ad.Fill(ds1);
for (int h = 0; h < ds1.Tables[0].Rows.Count; h++)
{
SqlDataAdapter adp = new SqlDataAdapter("select product,rate,count(rate) as Rank from opinions group by rate,product", connect);
DataSet ds = new DataSet();
adp.Fill(ds);
GridView1.DataSource = ds;
GridView1.DataBind();
connect.Open();
SqlDataAdapter ad = new SqlDataAdapter("select distinct product from opinions", connect);
DataSet ds1 = new DataSet();
ad.Fill(ds1);
for (int h = 0; h < ds1.Tables[0].Rows.Count; h++)
SqlDataAdapter adp = new SqlDataAdapter("select product,rate,count(rate) as Rank from opinions group by rate,product", connect);
DataSet ds = new DataSet();
adp.Fill(ds);
GridView1.DataSource = ds;
GridView1.DataBind();
Reviews.aspx.cs
using System;
using System.Collections;
using System.Configuration;
using System.Data;
using System.Linq;
using System.Web;
using System.Web.Security;
using System.Web.UI;
using System.Web.UI.HtmlControls;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Xml.Linq;
a = Request.Params["id"];
Label1.Visible = false;
Label1.Text = a;
Label2.Text = Label1.Text;
Session["pr"] = Label2.Text;
Label3.Text = Convert.ToString(cs.countpro(a)) + " Customer" + " Reviews";
Label4.Visible = false;
Label5.Visible = false;
Label6.Visible = false;
Label7.Visible = false;
Label8.Visible = false;
Label9.Visible = false;
Label10.Visible = false;
Label11.Visible = false;
Label12.Visible = false;
Label13.Visible = false;
Label14.Visible = false;
LinkButton1.Visible = false;
LinkButton2.Visible = false;
LinkButton3.Visible = false;
LinkButton4.Visible = false;
LinkButton5.Visible = false;
Panel1.Visible = false;
ImageButton2.Visible = false;
Label15.Visible = false;
}
protected void ImageButton1_Click(object sender, ImageClickEventArgs e)
{
Label14.Text = Label3.Text;
Label4.Visible = true;
Label5.Visible = true;
Label6.Visible = true;
Label7.Visible = true;
Label8.Visible = true;
Label9.Visible = true;
Label10.Visible = true;
Label11.Visible = true;
Label12.Visible = true;
Label13.Visible = true;
Label14.Visible = true;
LinkButton1.Visible = true;
LinkButton2.Visible = true;
LinkButton3.Visible = true;
LinkButton4.Visible = true;
LinkButton5.Visible = true;
protected void LinkButton3_Click(object sender, EventArgs e)
{
Response.Redirect("average.aspx");
}
protected void LinkButton4_Click(object sender, EventArgs e)
{
Response.Redirect("bad.aspx");
}
protected void LinkButton5_Click(object sender, EventArgs e)
{
Response.Redirect("worst.aspx");
Rating.aspx.cs
using System;
using System.Collections;
using System.Configuration;
using System.Data;
using System.Linq;
using System.Web;
using System.Web.Security;
using System.Web.UI;
using System.Web.UI.HtmlControls;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Xml.Linq;
Admin Ranking.aspx.cs
using System;
using System.Collections;
using System.Configuration;
using System.Data;
using System.Linq;
using System.Web;
using System.Web.Security;
using System.Web.UI;
using System.Web.UI.HtmlControls;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Xml.Linq;
using System.Data.SqlClient;
public partial class adminranking : System.Web.UI.Page
{
class1 cs = new class1();
string s, s2, s3;
int s1 = 2;
SqlConnection connect = new
SqlConnection(ConfigurationManager.AppSettings["recommentationconnection"]);
connect.Close();
SCREEN SHOTS
Home Screen
User Registration
Global Anomaly Home
Upload Apps
User Viewing App Details
Global and Local Ranking