Sat - 97.Pdf - Bank Fraud Detection Using Machine Learning Algorithm

The document discusses bank fraud detection using machine learning techniques. It proposes using supervised and unsupervised learning algorithms like association, clustering, forecasting, and classification to analyze customer transaction data and identify patterns that could indicate fraud. Identifying these patterns would allow banks to add additional verification steps to help prevent fraudulent banking activities.

Uploaded by

Vj Kumar

ABSTRACT

The banking sector is central to present-day life: almost everyone deals
with a bank, either in person or online. In these dealings, both customers
and banks risk being trapped by fraudsters. Examples of fraud include
insurance fraud, credit card fraud, and accounting fraud. Detecting
fraudulent activity is thus critical to controlling the costs of fraud. The
most common types of bank fraud include debit and credit card fraud, account
fraud, insurance fraud, and money laundering. Bankers are obliged to
safeguard their financial assets as well as their institutional integrity in
order to protect the global financial system. Anti-fraud systems are
regularly circumvented by fraudsters' evasion techniques. This paper
addresses bank fraud detection using machine learning techniques
(association, clustering, forecasting, and classification) to analyze
customer data and identify the patterns that can lead to fraud. Once these
patterns are identified, a higher level of verification/authentication can
be added to banking processes.

TABLE OF CONTENTS

Chapter No. TITLE Page No.


ABSTRACT v
LIST OF FIGURES viii
LIST OF ABBREVIATIONS ix
1 INTRODUCTION 1
1.1. OVERVIEW 2
1.2. MACHINE LEARNING 3
1.3. MACHINE LEARNING STRATEGIES 3
1.3.1. SUPERVISED LEARNING 4
1.3.2. UNSUPERVISED LEARNING 5
2 LITERATURE SURVEY 7
3 METHODOLOGY 11
3.1. EXISTING SYSTEM 11
3.2. PROPOSED SYSTEM 12
3.3. ADVANTAGES OF PROPOSED SYSTEM 12
3.4. SOFTWARE AND HARDWARE REQUIREMENTS 12
3.4.1. HARDWARE REQUIREMENTS 12
3.4.2. SOFTWARE REQUIREMENTS 13
3.4.3. LIBRARIES 13
3.5. PROGRAMMING LANGUAGES 14
3.5.1. JAVA 14
3.5.2. MYSQL 18
3.6. SYSTEM ARCHITECTURE 21
3.7. MODULES USED 21
3.7.1. CLASSIFICATION 11
3.7.2. CLUSTERING 22
3.7.3. ASSOCIATION RULE 23
3.7.4. REQUIREMENT ANALYSIS 24

3.8. SYSTEM STUDY 29
3.8.1. FEASIBILITY STUDY 29
3.8.2. TECHNICAL FEASIBILITY 30
3.8.3. SOCIAL FEASIBILITY 30
4 RESULTS AND DISCUSSION 40
4.1. WORKING 40
5 CONCLUSION 41
5.1. CONCLUSION 41
REFERENCES 42
APPENDICES 43
A. SOURCE CODE 43
B. SCREENSHOTS 46
C. PLAGIARISM REPORT 50

LIST OF FIGURES

Figure No. Figure Name Page No.


1.1. MACHINE LEARNING CLASSIFICATION 6
1.2. MACHINE LEARNING TASK 6
3.1 SYSTEM ARCHITECTURE 21
3.2 DECISION TREE 23

LIST OF ABBREVIATIONS

ABBREVIATIONS EXPANSION

ML Machine Learning

DT Decision Tree

TP True Positive

FP False Positive

TN True Negative

FN False Negative

CHAPTER 1
INTRODUCTION

According to The American Heritage Dictionary, Second College Edition,
fraud is defined as a deception deliberately practiced to secure unfair
or unlawful gain. Fraud detection is the recognition of symptoms of
fraud where no prior suspicion or tendency to fraud exists. Examples
include insurance fraud, credit card fraud and accounting fraud. Data
from the Nigeria Inter-Bank Settlement System (NIBSS) has revealed
that fraudulent transactions in the banking sector are at their peak.
Fraud has evolved from being committed by casual fraudsters to being
committed by organized crime and fraud rings that use sophisticated
methods to take over control of accounts and commit fraud. Some 6.8
million Americans were victimized by card fraud in 2007, according to
Javelin research. Such fraud on existing accounts accounted for more
than $3 billion in losses in 2007. The Nilson Report estimates the cost
to the industry to be $4.84 billion. Javelin estimates the losses at more
than six times that amount, some $30.6 billion in 2007. Of course, fraud
is not confined to one country; it occurs everywhere. For instance, card
fraud losses cost the UK economy GBP 423 million in 2006. Credit card
fraud accounts for the biggest share of the $600 million that airlines
lose each year globally.

OVERVIEW
Fraud detection is a set of activities undertaken to prevent money or
property from being obtained through false pretenses. Fraud detection is
applied to many industries such as banking or insurance. In banking,
fraud may include forging checks or using stolen credit cards.

With a large and growing number of ways to commit fraud, detection can
be difficult. Activities such as reorganization, downsizing, moving to
new information systems or suffering a cybersecurity breach can weaken
an organization's ability to detect fraud. Techniques such as real-time
monitoring for fraud are recommended. Organizations should look for
fraud in financial transactions, locations, devices used, initiated
sessions and authentication systems.

Fraud can be committed in different ways and in different settings, for
example in the banking, insurance, government and healthcare sectors. A
common type of banking fraud is customer account takeover, in which someone
illegally gains access to a victim's bank account using bots. Other examples
of fraud in banking include the use of malicious applications, the use of
false identities, money laundering, credit card fraud and mobile fraud.

Government fraud is fraud committed against federal agencies such as the
U.S. Department of Health and Human Services, Department of Transportation,
Department of Education or Department of Energy. Types of government fraud
include billing for unnecessary procedures, overcharging for items,
providing old equipment while billing for new equipment and reporting hours
worked for a worker who does not exist.

MACHINE LEARNING

Machine learning is a subfield of artificial intelligence (AI). The
goal of machine learning is generally to understand the structure of
data and fit that data into models that can be understood and used by
people. Although machine learning is a field within computer science,
it differs from traditional computational approaches.

In traditional computing, algorithms are sets of explicitly
programmed instructions used by computers to calculate or solve
problems. Machine learning algorithms instead allow computers to train
on data inputs and use statistical analysis to output values that fall
within a specific range. Because of this, machine learning helps
computers build models from sample data in order to automate
decision-making processes based on data inputs.

MACHINE LEARNING STRATEGIES

In machine learning, tasks are generally classified into broad
categories. These categories are based on how learning is received or
how feedback on the learning is given to the system developed. Two of
the most widely adopted machine learning methods are supervised
learning, which trains algorithms on example input and output data
that is labeled by humans, and unsupervised learning, which provides
the algorithm with no labeled data so that it can find structure
within its input data.

SUPERVISED LEARNING

In supervised learning, the computer is given example inputs that are
labeled with their desired outputs. The aim of this method is for the
algorithm to be able to "learn" by comparing its actual output with the
"taught" outputs to find errors, and modify the model accordingly.
Supervised learning thus uses patterns to predict label values on
additional unlabeled data. For example, with supervised learning, an
algorithm may be fed data with images of sharks labeled as fish and
images of oceans labeled as water. By being trained on this data, the
supervised learning algorithm should later be able to identify
unlabeled shark images as fish and unlabeled ocean images as water.

A common use case of supervised learning is to use historical data to
predict statistically likely future events. It may use historical stock
market information to anticipate upcoming fluctuations, or be used to
filter spam emails. In supervised learning, labeled photos of dogs can
be used as input data to classify unlabeled photos of dogs.
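
The idea of learning from labeled examples can be sketched in a few
lines of Java. The example below is only an illustration, not the method
used in this report: it uses a 1-nearest-neighbour rule, and the class
names, the two features (amount and hour of day) and the fixed scaling
constants are all assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// One labeled training example. The two features are illustrative assumptions.
final class LabeledTransaction {
    final double amount;     // transaction amount
    final double hourOfDay;  // 0-23
    final boolean fraud;     // label supplied by a human reviewer

    LabeledTransaction(double amount, double hourOfDay, boolean fraud) {
        this.amount = amount;
        this.hourOfDay = hourOfDay;
        this.fraud = fraud;
    }
}

// 1-nearest-neighbour: predict the label of the closest training example.
final class NearestNeighbourClassifier {
    private final List<LabeledTransaction> training;

    NearestNeighbourClassifier(List<LabeledTransaction> training) {
        if (training.isEmpty()) {
            throw new IllegalArgumentException("training set must not be empty");
        }
        this.training = new ArrayList<>(training);
    }

    boolean predictFraud(double amount, double hourOfDay) {
        LabeledTransaction nearest = null;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (LabeledTransaction t : training) {
            // Crude fixed scaling so neither feature dominates (an assumption).
            double da = (t.amount - amount) / 1000.0;
            double dh = (t.hourOfDay - hourOfDay) / 24.0;
            double distance = da * da + dh * dh; // squared Euclidean distance
            if (distance < bestDistance) {
                bestDistance = distance;
                nearest = t;
            }
        }
        return nearest.fraud;
    }
}
```

Given a handful of labeled transactions, predictFraud returns the label
of the most similar known transaction. A real system would need far more
data, properly normalized features and a held-out test set.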

UNSUPERVISED LEARNING

In unsupervised learning, data is unlabeled, so the learning
algorithm is left to find commonalities among its input data. The goal
of unsupervised learning may be as simple as discovering hidden
patterns within a dataset; however, it may also have a goal of feature
learning, which allows the computational machine to automatically
discover the representations that are needed to classify data.

Unsupervised learning is commonly used for transactional data. You may
have a large dataset of customers and their purchases, but as a human
you will likely not be able to make sense of what similar attributes
can be drawn from customer profiles and their types of purchases.

With this data fed into an unsupervised learning algorithm, it may be
determined that women of a certain age range who buy unscented soaps
are likely to be pregnant, and so a marketing campaign related to
pregnancy and baby products can be targeted at them.
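
One way to surface this kind of purchase pattern is association rule
analysis, one of the techniques named in the abstract. The sketch below
computes the support and confidence of a single rule over a list of
baskets. The class and method names are hypothetical, and a full
rule-mining algorithm such as Apriori would also have to search over
candidate rules, which is omitted here.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Support and confidence for a rule "antecedent -> consequent".
final class AssociationRule {

    // Fraction of baskets that contain every item in the given item set.
    static double support(List<Set<String>> baskets, Set<String> items) {
        if (baskets.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        for (Set<String> basket : baskets) {
            if (basket.containsAll(items)) {
                hits++;
            }
        }
        return (double) hits / baskets.size();
    }

    // confidence(A -> B) = support(A and B together) / support(A)
    static double confidence(List<Set<String>> baskets,
                             Set<String> antecedent,
                             Set<String> consequent) {
        Set<String> both = new HashSet<>(antecedent);
        both.addAll(consequent);
        double antecedentSupport = support(baskets, antecedent);
        return antecedentSupport == 0.0
                ? 0.0
                : support(baskets, both) / antecedentSupport;
    }
}
```

A rule with high support and high confidence points to items that are
frequently bought together. Whether such a rule is actionable still
needs human judgment.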

CHAPTER 2

LITERATURE SURVEY

Fraud detection has usually been seen as a data mining problem in which the
objective is to correctly classify transactions as legitimate or fraudulent.
For classification problems, many performance measures are defined, most of
which are related to the number of cases classified correctly.
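
Count-based measures of this kind are usually derived from the four
outcomes listed in the abbreviations: true positive (TP), false positive
(FP), true negative (TN) and false negative (FN). The following sketch
tallies them and computes accuracy and recall. The class name and the
array-based interface are assumptions made for illustration.

```java
// Tallies TP, FP, TN and FN for a fraud detector and derives two common measures.
final class ConfusionMatrix {
    int tp; // fraud, flagged as fraud
    int fp; // legitimate, flagged as fraud
    int tn; // legitimate, not flagged
    int fn; // fraud, not flagged

    static ConfusionMatrix of(boolean[] actualFraud, boolean[] predictedFraud) {
        if (actualFraud.length != predictedFraud.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        ConfusionMatrix m = new ConfusionMatrix();
        for (int i = 0; i < actualFraud.length; i++) {
            if (predictedFraud[i]) {
                if (actualFraud[i]) m.tp++; else m.fp++;
            } else {
                if (actualFraud[i]) m.fn++; else m.tn++;
            }
        }
        return m;
    }

    // Share of all transactions classified correctly.
    double accuracy() {
        int total = tp + fp + tn + fn;
        return total == 0 ? 0.0 : (double) (tp + tn) / total;
    }

    // Share of fraudulent transactions that were caught.
    double recall() {
        int frauds = tp + fn;
        return frauds == 0 ? 0.0 : (double) tp / frauds;
    }
}
```

Because fraud is rare, accuracy alone can look high even when most
frauds are missed, which is why recall is reported alongside it here.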

A more appropriate measure is needed because of the inherent structure of
credit card transactions. When a card is copied, stolen, or lost and
captured by fraudsters, it is usually used until its available limit is
depleted. Thus, rather than maximizing the number of correctly classified
transactions, a solution that minimizes the total available limit on cards
subject to fraud is preferable.
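
The difference between the two measures can be made concrete with a
small sketch. It assumes, as the paragraph above suggests, that the loss
on an undetected fraudulent card equals that card's available limit. The
class name, method name and sample figures are hypothetical.

```java
// Cost measure: total available limit on fraudulent cards that were not flagged.
final class LimitBasedCost {

    static double missedLimit(boolean[] actualFraud,
                              boolean[] flagged,
                              double[] availableLimit) {
        if (actualFraud.length != flagged.length
                || actualFraud.length != availableLimit.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        double total = 0.0;
        for (int i = 0; i < actualFraud.length; i++) {
            // Only undetected fraud contributes to the loss under this assumption.
            if (actualFraud[i] && !flagged[i]) {
                total += availableLimit[i];
            }
        }
        return total;
    }
}
```

Consider three cards with available limits of 5000, 200 and 1000, where
the first two are subject to fraud. A detector that flags only the first
card and a detector that flags only the second each classify two of the
three cards correctly, yet the first leaves 200 exposed and the second
leaves 5000 exposed. Counting correct classifications cannot tell them
apart, while the limit-based measure can.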

Since the fraud detection problem has mostly been defined as a
classification problem, many data mining algorithms, in addition to some
statistical approaches, have been proposed to solve it. Among these,
decision trees and artificial neural networks are the most popular. The
study of Bolton and Hand provides a good summary of the literature on fraud
detection problems.

However, when the problem is approached as a classification problem with
variable misclassification costs, as discussed above, the classical data
mining algorithms are not directly applicable; either they must be modified
or new algorithms must be developed specifically for this purpose. An
alternative is to use general-purpose metaheuristic approaches such as
genetic algorithms.