Credit Card Fraud Detection Using Random Forest Algorithm and CNN

This document proposes an intelligent approach for detecting credit card fraud using an optimized light gradient boosting machine. It summarizes that credit card fraud costs firms and consumers large financial losses annually. The proposed approach uses a Bayesian-based hyperparameter optimization algorithm to tune the parameters of a light gradient boosting machine for fraud detection. Experiments on two real-world credit card transaction datasets show the approach achieves high performance in terms of accuracy, AUC, precision, and F1-score compared to other methods.

AN INTELLIGENT APPROACH TO CREDIT CARD FRAUD DETECTION USING AN OPTIMIZED LIGHT GRADIENT BOOSTING MACHINE

ABSTRACT

Advances in electronic commerce systems and communication technologies have made
the credit card potentially the most popular method of payment for both regular and online
purchases; consequently, fraud associated with such transactions has increased significantly.
Fraudulent credit card transactions cost firms and consumers large financial losses every
year, and fraudsters continuously attempt to find new technologies and methods for
committing fraudulent transactions. The detection of fraudulent transactions has become a
significant factor affecting the greater utilization of electronic payment. Thus, there is a need
for efficient and effective approaches for detecting fraud in credit card transactions. This
paper proposes an intelligent approach for detecting fraud in credit card transactions using
an optimized light gradient boosting machine (OLightGBM). In the proposed approach, a
Bayesian-based hyperparameter optimization algorithm is intelligently integrated to tune
the parameters of a light gradient boosting machine (LightGBM). To demonstrate the
effectiveness of our proposed OLightGBM for detecting fraud in credit card transactions,
experiments were performed using two real-world public credit card transaction data sets
consisting of fraudulent transactions and legitimate ones. Based on a comparison with
other approaches using the two data sets, the proposed approach outperformed the other
approaches and achieved the highest performance in terms of accuracy (98.40%), area
under the receiver operating characteristic curve (AUC) (92.88%), precision (97.34%) and
F1-score (56.95%).

INTRODUCTION

The migration of business to the Internet and the electronic monetary transactions that
occur in the continuously growing cash-less economy have made the accurate detection of
fraud a significant factor in securing such transactions. Credit card fraud occurs when a
thief uses credit card information to complete purchase processes without permission from
the credit card owner. The large-scale use of credit cards and the lack of effective security
systems result in billion-dollar losses to credit card fraud. Because credit card firms are
typically unwilling to announce such facts, it is difficult to obtain a precise
estimate of the losses. However, certain data regarding the financial losses caused
by credit card fraud are publicly accessible. Global financial losses due to credit card fraud
amounted to 22.8 billion US dollars in 2017 and are expected to increase continuously;
by 2020, the number is expected to reach 31 billion US dollars. There are two
categories of credit card fraud: application fraud and behavior fraud.

Application fraud refers to fraudulent credit card applications. Such fraud occurs when
a fraudster initiates a new credit card process using false identity details and the issuer
accepts the request. Behavior fraud occurs after a credit card is correctly issued and
denotes credit card transactions that involve fraudulent behavior. Credit card fraud
detection has been a significant issue for credit card users and financial organizations.
Because detecting even a small number of fraudulent transactions would protect large
amounts of money, credit card fraud has also become a significant problem for
researchers. For various reasons, fraud detection is considered a challenge for machine
learning because, for example, the distribution of data continually evolves over time due
to new attack approaches and seasonality and because a very small percentage of all
credit card transactions are fraudulent.

This paper proposes an intelligent approach for detecting fraudulent credit card
transactions that uses an optimized light gradient boosting machine. In the proposed
approach, a Bayesian-based hyperparameter optimization algorithm is intelligently
integrated to tune the parameters of the light gradient boosting machine algorithm. The
proposed approach is primarily concerned with discriminating between legitimate and
fraudulent credit card transactions. The main contribution of our research is an
intelligent approach for detecting fraud in credit card transactions using an optimized
light gradient boosting machine in which a Bayesian-based hyperparameter optimization
algorithm is utilized to optimize the parameters of the light gradient boosting machine.
The performance of the proposed intelligent approach is evaluated based on two real-
world data sets and compared with other machine learning techniques using
performance evaluation metrics. The remainder of the paper is structured as follows.
Related research is reviewed in the second section. Section three describes our
proposed intelligent approach for credit card fraud detection, and in section four, the
results of experiments are discussed. Finally, the study’s conclusions are summarized
in section five.

The fast and wide reach of the Internet has made it one of the major selling channels for the
retail sector. In the last few years, there has been a rapid increase in the number of card
issuers, card users and online merchants, giving technology very little time to catch up
and prevent online fraud completely. Statistics show that online banking has been the
fastest growing Internet activity, with nearly 44% of the population in the US actively
participating in it. As overall e-commerce volumes continued to grow over the past few
years, the losses to Internet merchants were projected to be between $5 and $15
billion in the year 2005. Recent statistics by Garner group place the online fraud rate between
0.8% and 0.9%, with auction fraud accounting for nearly half of the total incidents of fraud on
the Internet. Considering the current trends of e-commerce volumes, the projected loss is
$8.2 billion in the year 2006, with $3.0 billion in the US alone.

To understand the severity of credit card fraud, let us briefly look at the
mechanisms adopted by fraudsters to commit fraud. Credit card fraud involves illegal use of
a card or card information without the knowledge of the owner and hence is an act of criminal
deception. Fraudsters usually get hold of card information in a variety of ways: intercepting
mail containing newly issued cards, copying and replicating card information through
skimmers, or gathering sensitive information through phishing (cloned websites) or from
unethical employees of credit card companies. Phishing involves acquiring sensitive
information such as card numbers and passwords by masquerading as a trustworthy person or
business in an electronic communication such as e-mail. Fraudsters may also resort to
generating credit card numbers using the BINs (Bank Identification Numbers) of banks. A
more recent scheme, triangulation, can take fraud fighters many days to detect and investigate. In
this method, the fraudster operates through an authentic-looking website, where he
advertises and sells goods at highly discounted prices. The unaware buyer submits his card
information and buys goods.

The fraudster then places an order with a genuine merchant using the stolen card
information. He then uses the stolen card to purchase other goods or route funds into
untraceable accounts. It is only after several days that the merchant and card owners realize
that fraud has occurred. This type of fraud causes initial confusion that provides camouflage for the
fraudster to carry out their operations.

1.2. Impact of fraud

It is interesting to note that credit
card fraud affects card owners the least because their liability is limited to the transactions
made. The existing legislation and cardholder protection policies as well as insurance
schemes in most countries protect the interests of the cardholders. However, the most
affected are the merchants, who, in most situations, do not have any evidence (e.g., a digital
signature) to dispute the cardholders’ claim of misused card information. Merchants end up
bearing all the losses due to chargebacks, shipping costs of goods, card issuer fees and charges
as well as their own administrative costs.

Excessive fraudulent cases involving the same merchant can drive away customers, cause
card issuer banks to withdraw service and also result in loss of reputation and goodwill. Card
issuer banks have to bear the administrative cost of investigations into fraud cases as well as
infrastructure costs of setting up the required software and hardware facilities to combat
fraud. They also incur indirect costs through transaction delays. Studies show that the
average time lag between the fraudulent transaction date and chargeback notification can be
as high as 72 days, thereby giving fraudsters sufficient time to cause severe damage.

1.3. Fraud detection and prevention

The negative impacts of fraud make it clear that an effective and economical fraud detection
system must be put in place. Recent technological advancements to combat fraud
have contributed a number of solutions in this area. Fraud detection techniques ranging from
sophisticated screening of transactions to tracking customer behaviour and spending patterns
are now being developed and employed by both merchants and card issuer banks.
Some of the recently employed techniques include transaction screening through Address
Verification Systems (AVS), Card Verification Method (CVM), Personal Identification
Number (PIN) and Biometrics. AVS involves verification of address with zip code of the
customer while CVM and PIN involve checking of numeric code that is keyed in by the
customer. Biometrics might involve signature or fingerprint verification. Rule-based
methods and maintaining of positive and negative lists of customers and geographical
regions are also used in practice.

Data mining and credit scoring methods focus on statistical analysis and deciphering of
customer behaviour and spending patterns to detect fraud. Neural networks are capable of
deriving patterns from databases containing historical customer transactions. These
neural networks can be ‘trained’ and are ‘adaptive’ to emerging new forms of fraud.
Deployment of sophisticated techniques and screening of every transaction alone will not
reduce losses. It is necessary to employ an effective and economical solution to combat
fraud. Such a solution should not only detect fraud cases efficiently but also be
cost-effective. The idea is to strike a balance between the cost involved in transaction
screening and review and the losses due to fraudulent cases. Analyses show that review of
only 2.0% of transactions can reduce fraud losses to 1.0% of the total
value of transactions.

A review of as many as 30% of transactions can reduce fraud losses drastically, to
0.06%, but increases review costs exorbitantly. The estimated cost of not using anti-fraud
software was about $60 billion in 2005. The key to minimizing total costs is to categorize
transactions and review only the potentially fraudulent cases. This should involve
deployment of a step-by-step screening, filtering and review mechanism. A typical
deployment can involve initial authentication of transactions through PIN, expiry date on
card, AVS and CVM. A second level of screening can involve comparing with positive and
negative lists as well as rules based on customers, geographical regions, IP addresses and
policies. Risk and credit scoring with pattern and behaviour analyses can come next,
followed by manual review.
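
The staged deployment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Transaction` fields, the example lists and the `REVIEW_THRESHOLD` value are hypothetical and do not come from any real screening system.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    passed_basic_checks: bool  # stage 1: PIN, expiry date, AVS and CVM all passed
    card_id: str
    country: str
    risk_score: float          # stage 3: score in [0, 1] from a scoring model


# Hypothetical lists and threshold, for illustration only.
NEGATIVE_CARDS = {"card-666"}
POSITIVE_CARDS = {"card-001"}
BLOCKED_COUNTRIES = {"XX"}
REVIEW_THRESHOLD = 0.8


def screen(tx: Transaction) -> str:
    """Return 'accept', 'reject' or 'manual_review' for one transaction."""
    # Stage 1: initial authentication.
    if not tx.passed_basic_checks:
        return "reject"
    # Stage 2: negative and positive lists plus simple rules.
    if tx.card_id in NEGATIVE_CARDS or tx.country in BLOCKED_COUNTRIES:
        return "reject"
    if tx.card_id in POSITIVE_CARDS:
        return "accept"
    # Stage 3: risk scoring; only high-risk cases reach manual review.
    if tx.risk_score >= REVIEW_THRESHOLD:
        return "manual_review"
    return "accept"


print(screen(Transaction(True, "card-123", "US", 0.95)))
```

Because each stage settles most transactions before the next one runs, only the small high-risk remainder is passed on to human reviewers.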

This classifies and filters out transactions as genuine or fraudulent in every step and as a
result only a few transactions would require further manual review. Such a solution reduces
the overall processing delay as well as total costs involved in manpower and administration.
The focus of this paper will now shift to risk scoring and behavioral pattern detection using
neural networks.

2. Neural networks in fraud detection – literature review

Neural networks have been extensively put to use in the areas of banking, finance and insurance. They have
been successfully applied to credit scoring of customers, bankruptcy or business failure
prediction, stock price forecasting, bond rating, currency prediction and many more areas.
In the area of fraud detection and prevention, neural networks such as feed-forward networks
with back-propagation have found extensive application.

Usually, such applications of neural network systems rely on knowledge of previous
cases of fraud so that the systems can learn the various trends. Fraud cases are statistically
analyzed to derive relationships among input data and values for certain key parameters
in order to understand the various patterns of fraud. This knowledge of fraud trends is then
iteratively taught to feed-forward neural networks, which can successfully identify similar
fraud cases occurring in the future.

EXISTING SYSTEM:

The potential social and economic importance of detecting fraudulent credit card
transactions has increased the number of relevant research efforts in the literature. This
section reviews several significant studies. More comprehensive reviews can be found elsewhere.
There are two main approaches for detecting fraudulent credit card transactions using
machine learning algorithms: supervised learning algorithms and unsupervised learning
algorithms. In supervised learning algorithms, historical credit card transactions are
labeled as legitimate or fraudulent. Then, supervised learning algorithms start learning
using these data to create a model that can be used to categorize new data samples. In
contrast, unsupervised learning algorithms are based on the direct classification of credit
card transactions using patterns that are considered normal. Then, the algorithm
classifies transactions that do not conform to such patterns as fraudulent credit card
transactions. Both supervised learning and unsupervised learning algorithms have
been utilized for credit card fraud detection. The most popular algorithms for the detection
of credit card fraud use supervised learning and employ labeled transactions for
classifier training. Fraudulent credit card transactions are detected by classifying features
extracted from credit card transactions. A number of classification algorithms have
been utilized to detect fraudulent credit card transactions. A probabilistic neural
network (PNN), logistic regression (LOR) and genetic programming (GP) have been
employed for classifying fraud in credit card transactions. A data set of 202 Chinese firms
was used, and t-statistics were applied to select the important features. The results
revealed that PNN outperformed the other approaches. Bayesian belief networks (BNNs)
and decision trees (DTs) were used to detect fraud in financial transactions. Here, a
data set of financial transactions collected from 76 Greek industrial companies was used.
The data set included 38 financial transactions confirmed to be fraudulent by assessors.
The BNNs obtained the highest accuracy (90.3%), whereas the DTs achieved an accuracy
of 73.6%.

PROPOSED SYSTEM:

This paper proposes an intelligent approach for detecting fraudulent credit card
transactions that uses an optimized light gradient boosting machine. In the proposed
approach, a Bayesian-based hyperparameter optimization algorithm is intelligently
integrated to tune the parameters of the light gradient boosting machine algorithm. The
proposed approach is primarily concerned with discriminating between legitimate and
fraudulent credit card transactions. The main contribution of our research is an
intelligent approach for detecting fraud in credit card transactions using an optimized light
gradient boosting machine in which a Bayesian-based hyperparameter optimization algorithm
is utilized to optimize the parameters of the light gradient boosting machine. The performance
of the proposed intelligent approach is evaluated based on two real-world data sets and
compared with other machine learning techniques using performance evaluation metrics.

MODULES:

DATA SET AND DATA PREPROCESSING

To develop different experiments for evaluating the proposed approach and demonstrating
its generality, we consider two different real-world data sets. The first data set consists of
284,807 credit card transactions made by the credit card owners in September 2013 in
Europe. Of the 284,807 transactions in the data set, 492 were fraudulent; the positive
class (i.e., the fraudulent transactions) represents 0.172% of all transactions. The data set
includes 31 features. The first 28 features (i.e., V1 to V28) are the principal components
obtained using principal components analysis (PCA). The transformation was applied to maintain data
privacy. “Time” and “Amount” are the only two features that are not transformed using
PCA. The second data set is the UCSD-FICO Data Mining Contest 2009 Dataset, which
is a real data set of e-commerce transactions. The objective was to detect anomalous
e-commerce transactions. The data set consists of 94,683 transactions, 2,094 of which
are fraudulent. The data set was collected from 73,729 credit cards during a period of 98
days. It contains 20 fields, including class, and the field labels are as follows: amount,
hour1, state1, zip1, custAttr1, field1, custAttr2, field2, hour2, flag1, total, field3, field4,
indicator1, indicator2, flag2, flag3, flag4, flag5, and Class.
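
The degree of class imbalance in the two data sets follows directly from the counts quoted above. The short calculation below is only a sanity check of those figures; the function name `fraud_rate` is introduced here for illustration.

```python
def fraud_rate(n_fraud: int, n_total: int) -> float:
    """Return the share of fraudulent transactions as a percentage."""
    return 100.0 * n_fraud / n_total


# Counts taken from the data set descriptions above.
european = fraud_rate(492, 284_807)
ucsd_fico = fraud_rate(2_094, 94_683)

print(f"European data set:  {european:.3f}% fraudulent")
print(f"UCSD-FICO data set: {ucsd_fico:.3f}% fraudulent")
```

Both rates are small (roughly 0.17% and 2.2%), which is why accuracy alone is a weak indicator of detection quality on these data sets.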

FEATURE SELECTION
Selecting significant and important features is critical for the effective detection of credit
card fraud when the number of features is large. LightGBM utilizes the information gain
(IG) method to select the most important features and thus decrease the dimensionality of
the training data. Information gain functions by extracting similarities between credit card
transactions and then awarding the greatest weight to the most significant features based
on the class of legitimate and fraudulent credit card transactions. Because of its
computational efficiency and leading performance in terms of precision, information gain is
employed as a feature selection method in the proposed approach.
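
As an illustration of the information gain criterion, the sketch below computes the entropy-based gain of a discrete feature with respect to the fraud label. It is a minimal textbook formulation written for this explanation, not the internal LightGBM implementation, and the toy feature values are invented.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data: 1 = fraudulent, 0 = legitimate (invented for illustration).
labels = [1, 1, 0, 0, 0, 0, 0, 0]
informative = [1, 1, 0, 0, 0, 0, 0, 0]    # separates the classes perfectly
uninformative = [0, 1, 0, 1, 0, 1, 0, 1]  # same class mix in both groups

print(information_gain(informative, labels))
print(information_gain(uninformative, labels))
```

A feature that separates the classes perfectly recovers the full label entropy, while a feature whose groups have the same class mix as the whole data set gains nothing; ranking features by this value is one way to keep only the most significant ones.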

THE OPTIMIZED LIGHT GRADIENT BOOST CLASSIFIER

This section explains the proposed intelligent approach for detecting fraudulent credit
card transactions using an optimized light gradient boosting framework based on tree
learning algorithms. In the proposed approach, a Bayesian-based hyperparameter
optimization algorithm is intelligently integrated to tune the parameters of the
LightGBM algorithm. The high-performance LightGBM algorithm can quickly handle
large amounts of data and the distributed processing of data. It was developed as an open
source project by Microsoft. The Light Gradient Boosting algorithm is explained in Figure
2. The LightGBM algorithm includes several parameters, termed hyperparameters. The
hyperparameters have a significant impact on the performance of the LightGBM algorithm.
They are typically set manually and then tuned in a continuous trial-and-error process.
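
The trial-and-error tuning loop can be sketched as follows. Note that this sketch uses plain random search, not the Bayesian optimization adopted in the proposed approach; a Bayesian optimizer would replace `sample_params` with proposals guided by the scores of earlier trials. The names `num_leaves`, `max_depth` and `learning_rate` are real LightGBM parameters, but the ranges and the `toy_score` function are assumptions made for illustration, standing in for a cross-validated score of a trained model.

```python
import random

# Real LightGBM parameter names; the ranges are illustrative assumptions.
SEARCH_SPACE = {
    "num_leaves": (8, 256),
    "max_depth": (3, 12),
    "learning_rate": (0.01, 0.3),
}


def sample_params(rng):
    """Draw one candidate configuration uniformly from the search space."""
    return {
        "num_leaves": rng.randint(*SEARCH_SPACE["num_leaves"]),
        "max_depth": rng.randint(*SEARCH_SPACE["max_depth"]),
        "learning_rate": rng.uniform(*SEARCH_SPACE["learning_rate"]),
    }


def toy_score(params):
    """Hypothetical stand-in for a cross-validated score of a trained model."""
    return (1.0
            - abs(params["learning_rate"] - 0.05)
            - abs(params["num_leaves"] - 64) / 1000)


def random_search(score_fn, n_trials=50, seed=0):
    """Try n_trials random configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = sample_params(rng)
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(toy_score)
print(best_params, round(best_score, 4))
```

Every trial here is independent of the previous ones, which is exactly the inefficiency that a model-guided (Bayesian) search is meant to reduce.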

MODEL EVALUATION USING PERFORMANCE METRICS

To evaluate the performance of the proposed approach for credit card fraud detection, a
cross validation test is applied. The k-fold cross-validation (CV) method is utilized to
systematically and carefully assess the performance of the proposed approach for credit
card fraud detection. K-Fold CV is a statistical analysis approach that has been widely
employed by researchers to assess the performance of the machine learning classifier. In
this research, we conduct a 5-fold CV test to assess the performance of the proposed
approach. Both data sets are class-imbalanced: there are far more normal than
fraudulent transactions. In this case, to achieve more accurate estimates, cross-validation
is used to train and test the model in each subset of the two data sets; then, the average of
all the noted metrics is calculated over the data set. Each data set is divided randomly
into five separate subsets of equal size. At each step of validation, a single subset (20% of
the data set) is reserved as the validation data set for testing the performance of the
proposed approach, while the remaining four subsets (80% of the data set) are employed as
the training data set. This process is then repeated five times until each subset has been
used. The average of the performances of the five test subsets is calculated, and the final
result is the total performance of the proposed approach on a 5-fold CV test. To assess
the performance of the proposed approach, several measures are considered, including the
Confusion Matrix, Precision, Recall, Accuracy (ACC), AUC and F1-score. The metrics
are defined based on the confusion matrix.
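
The evaluation procedure can be made concrete with a small sketch: a random 5-fold split as described above, and the precision, recall, accuracy and F1-score computed from the confusion matrix counts. The function names are introduced here for illustration, and AUC is omitted because it requires predicted scores rather than hard labels.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k subsets of near-equal size
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) with 1 = fraudulent and 0 = legitimate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Precision, recall, accuracy and F1-score from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(y_true)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Invented labels: three frauds, of which only one is caught.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(classification_metrics(y_true, y_pred))

for train_idx, test_idx in k_fold_indices(len(y_true), k=5):
    print(len(train_idx), len(test_idx))
```

In a full experiment, the model would be trained on `train_idx` and scored on `test_idx` in each of the five rounds, and the five sets of metrics would then be averaged.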

BACKGROUND:

A self-organizing map (SOM) was used to generate a model for unsupervised credit card
fraud detection. An advantage of this method is that the SOM model does not
require prior information and is updated continuously by adding new credit card
transactions; a disadvantage may be the difficulty of detecting fraudulent credit card
transactions with high accuracy. Recently, deep learning has become a powerful
component of machine learning and achieved promising results in several fields, such
as image processing. Jurgovsky et al. utilized a long short-term memory (LSTM) framework to
detect credit card fraud as a sequence classification problem in the supervised learning
category. Kraus and Feuerriegel utilized deep-learning approaches to support financial
decisions. Fiore et al. proposed a scheme to generate synthetic instances based on generative
adversarial networks to enhance credit card fraud detection performance by solving the
issue of the imbalanced data set.

Carcillo et al. implemented a hybrid approach that utilizes unsupervised outlier scores to
expand the set of features of the fraud detection classifier. Their main contribution was to
implement and assess various levels of granularity for outlier score definition. Their
experimental results indicate that their proposed approach is efficient and enhances
detection accuracy. Carcillo et al. also introduced the SCAlable Real-time Fraud
Finder (SCARFF), which incorporates big-data techniques (Cassandra, Kafka and Spark)
in a machine learning method to address nonstationarity, imbalance, and feedback
latency. The results of experiments based on a large data set of real credit card
transactions demonstrated that the framework is efficient, accurate and scalable.

Saia et al. proposed a new approach to credit card fraud detection based on a model
defined using a discrete Fourier transform converted to utilize frequency patterns. The
approach has the advantages of treating imbalanced class distribution and cold-start issues
by considering only past legitimate transactions, thus decreasing the data heterogeneity
problem. Yuan et al. introduced a novel framework that combines deep neural
networks and spectral graph analysis for fraud detection. They developed and assessed
two neural networks for fraud detection: a deep auto encoder and a convolutional
neural network. Experimental results indicated that their proposed approach is effective
for fraud detection.

Saia presented a novel credit card fraud detection method based on the discrete wavelet
transform, which was utilized to construct an evaluation model capable of overcoming
problems related to the imbalanced distribution of credit card fraud data sets. The
experimental results indicated that the performance of the proposed approach was
comparable to that of state-of-the-art approaches, such as random forests. West et al.
presented a comprehensive review of financial fraud detection approaches using
computational intelligence techniques. In addition, they identified research gaps that were
not addressed by other review articles. Ensemble classifiers associate what is currently
learned from new samples with previously attained knowledge. Dhankhad et al.
applied many supervised machine learning algorithms to identify fraudulent credit card
transactions using a real-world data set. Then, they used these algorithms to implement
a super classifier based on ensemble learning approaches. Their results indicated that the
ensemble approach achieved the best performance. Pozzolo et al. designed two fraud
detection systems based on an ensemble method and a sliding-window method,
respectively. The study revealed that the winning strategy involved
training two separate classifiers and then aggregating the outcomes. Based on
experiments on a large data set, the results indicated that the proposed approach improved
fraud alert precision.

Bio-inspired algorithms offer global solutions to optimization
problems. Combining bio-inspired optimization algorithms with machine learning
models may enhance the performance of the machine learning models because such algorithms
are able to find the best solutions to the optimization problem. Therefore, machine
learning models have been coupled with bio-inspired optimization techniques.
Kamaruddin and Vadlamani [49] developed a hybrid approach of Particle Swarm
Optimization and Auto-Associative Neural Network for credit card
fraud detection.

DEEP LEARNING

Deep learning uses artificial neural networks to perform sophisticated computations on
large amounts of data. It is a type of machine learning that works based on the structure and
function of the human brain. Deep learning algorithms train machines by learning from
examples. Industries such as health care, eCommerce, entertainment, and advertising
commonly use deep learning.

DEFINING NEURAL NETWORKS

A neural network is structured like the human brain and consists of artificial neurons,
also known as nodes. These nodes are arranged in three kinds of layers: the input
layer, the hidden layer(s), and the output layer.

Data provides each node with information in the form of inputs. The node multiplies the
inputs by weights (initialized randomly), sums the products, and adds a bias. Finally, a nonlinear
function, also known as an activation function, is applied to determine whether the neuron fires.
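
The computation performed by a single node can be written out directly. The sketch below is a minimal illustration: the input values, the seed and the bias are arbitrary, and ReLU and sigmoid are used as two common examples of activation functions.

```python
import math
import random


def relu(z):
    """Rectified linear unit: passes positive values, blocks negative ones."""
    return max(0.0, z)


def sigmoid(z):
    """Logistic function: squashes any value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)


rng = random.Random(0)
inputs = [0.5, -1.2, 3.0]
weights = [rng.uniform(-1.0, 1.0) for _ in inputs]  # random initial weights
bias = 0.1

print(neuron(inputs, weights, bias, relu))
print(neuron(inputs, weights, bias, sigmoid))
```

During training, the randomly initialized weights and the bias are adjusted so that the outputs of the network move closer to the desired targets.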

HOW DEEP LEARNING ALGORITHMS WORK

While deep learning algorithms feature self-learning representations, they depend upon
ANNs that mirror the way the brain computes information. During the training process,
algorithms use unknown elements in the input distribution to extract features, group objects,
and discover useful data patterns. Much like training machines for self-learning, this occurs at
multiple levels, using the algorithms to build the models. Deep learning models make use of
several algorithms. While no one network is considered perfect, some algorithms are better
suited to perform specific tasks. To choose the right ones, it’s good to gain a solid
understanding of all primary algorithms.

TYPES OF ALGORITHMS USED IN DEEP LEARNING

Here is a list of the top 10 most popular deep learning algorithms:

Convolutional Neural Networks (CNNs)
Long Short Term Memory Networks (LSTMs)
Recurrent Neural Networks (RNNs)
Generative Adversarial Networks (GANs)
Radial Basis Function Networks (RBFNs)
Multilayer Perceptrons (MLPs)
Self Organizing Maps (SOMs)
Deep Belief Networks (DBNs)
Restricted Boltzmann Machines (RBMs)
Autoencoders

Deep learning algorithms work with almost any kind of data and require large amounts of
computing power and information to solve complicated issues. Now, let us take a closer look at
the top 10 deep learning algorithms.

Convolutional Neural Networks (CNNs)

CNNs, also known as ConvNets, consist of multiple layers and are mainly used for
image processing and object detection. Yann LeCun developed the first CNN in 1988 when it
was called LeNet. It was used for recognizing characters like ZIP codes and digits. CNNs
are widely used to identify satellite images, process medical images, forecast time series, and
detect anomalies.

How Do CNNs Work?

CNNs have multiple layers that process and extract features from data:

Convolution Layer: A CNN has a convolution layer with several filters that perform
the convolution operation.

Rectified Linear Unit (ReLU): CNNs have a ReLU layer that performs an element-wise
operation, setting negative values to zero. The output is a rectified feature map.

Pooling Layer: The rectified feature map next feeds into a pooling layer. Pooling is a
down-sampling operation that reduces the dimensions of the feature map. The resulting
two-dimensional arrays from the pooled feature map are then flattened into a single, long,
continuous, linear vector.

Fully Connected Layer: The flattened vector from the pooling layer is fed as input to a
fully connected layer, which classifies and identifies the images.

[Figure: an example of an image processed via a CNN]
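
The layer sequence described above can be sketched in plain Python. This is only an illustration under stated assumptions: the 5x5 image, the 2x2 kernel, and all function names are hypothetical and are not taken from the project code. The convolution is written as cross-correlation (no kernel flip), which is how most deep learning libraries implement it.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (valid padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Replace negative values with zero to get the rectified feature map."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Down-sample by keeping the maximum of each size x size block."""
    return [
        [
            max(feature_map[i + a][j + b]
                for a in range(size) for b in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


def flatten(feature_map):
    """Turn the pooled 2-D map into one long vector."""
    return [v for row in feature_map for v in row]


# Hypothetical 5x5 grayscale image and 2x2 filter.
image = [
    [1, 0, 2, 1, 0],
    [0, 1, 3, 0, 1],
    [2, 1, 0, 1, 2],
    [1, 0, 1, 3, 0],
    [0, 2, 1, 0, 1],
]
kernel = [[1, 0], [0, -1]]

features = flatten(max_pool(relu(conv2d(image, kernel))))
print(features)
```

The resulting vector is what a fully connected layer would take as its input; a real CNN learns the kernel values during training instead of fixing them by hand.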
LONG SHORT TERM MEMORY NETWORKS (LSTMS)

LSTMs are a type of Recurrent Neural Network (RNN) that can learn and memorize
long-term dependencies. Recalling past information for long periods is their default behavior.
LSTMs retain information over time. They are useful in time-series prediction because they
remember previous inputs. LSTMs have a chain-like structure where four interacting layers
communicate in a unique way. Besides time-series predictions, LSTMs are typically used for
speech recognition, music composition, and pharmaceutical development.

How Do LSTMs Work?

First, they forget irrelevant parts of the previous state.
Next, they selectively update the cell-state values.
Finally, they output certain parts of the cell state.

[Figure: diagram of how LSTMs operate]
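
The three stages above (forget, update, output) can be sketched as a single step of a scalar LSTM cell. This is a minimal illustration that assumes the standard gate equations; the weight values and the names lstm_step and weights are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, w):
    """One time step of a scalar LSTM cell.

    w maps each gate name to a tuple (input weight, recurrent weight, bias).
    """
    def gate(name, activation):
        w_x, w_h, b = w[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)         # how much of the old cell state to keep
    i = gate("input", sigmoid)          # how much new content to write
    g = gate("candidate", math.tanh)    # the proposed new content
    o = gate("output", sigmoid)         # how much of the cell state to expose

    c = f * c_prev + i * g              # forget, then selectively update
    h = o * math.tanh(c)                # output part of the cell state
    return h, c


# Hypothetical weights shared by every time step.
weights = {name: (0.5, 0.5, 0.0)
           for name in ("forget", "input", "candidate", "output")}

h, c = 0.0, 0.0
for x in [1.0, 0.5, -0.3]:
    h, c = lstm_step(x, h, c, weights)
print(h, c)
```

The four gate computations correspond to the four interacting layers mentioned in the text.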
RECURRENT NEURAL NETWORKS (RNNS)

RNNs have connections that form directed cycles, which allow the output from the
previous step to be fed as input to the current step. Because of this internal memory, an
RNN can retain information about previous inputs. RNNs are commonly used for image
captioning, time-series analysis, natural-language processing, handwriting recognition, and
machine translation.

[Figure: an unfolded RNN]
How Do RNNs Work?

The output at time t-1 feeds into the input at time t. Similarly, the output at time t feeds
into the input at time t+1. RNNs can process inputs of any length. The computation
accounts for historical information, and the model size does not increase with the input size.

[Figure: an example of how Google's autocompleting feature works]
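
The recurrence described above can be sketched as follows. This is an illustrative scalar RNN under assumed parameter values (w_x, w_h, and b are hypothetical); the same three parameters are reused at every time step, which is why the model size does not grow with the input length.

```python
import math


def rnn_forward(inputs, w_x=0.8, w_h=0.5, b=0.0):
    """Run a scalar RNN over a sequence and return the state at each step."""
    h = 0.0                      # initial hidden state
    states = []
    for x in inputs:
        # The previous output h is combined with the current input x.
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states


short = rnn_forward([1.0, 0.0])
longer = rnn_forward([1.0, 0.0, 0.0, 0.0, 0.0])
print(short)
print(longer)
```

In the first call the second input is zero, yet the second state is non-zero because it still carries information from the first input.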

GENERATIVE ADVERSARIAL NETWORKS (GANS)

GANs are generative deep learning algorithms that create new data instances that
resemble the training data. A GAN has two components: a generator, which learns to generate
fake data, and a discriminator, which learns to tell that fake data apart from real samples.
The use of GANs has grown over time. They can be used to improve astronomical images and
simulate gravitational lensing for dark-matter research. Video game developers use GANs to
upscale low-resolution, 2D textures in old video games by recreating them in 4K or higher
resolutions via image training. GANs help generate realistic images and cartoon characters,
create photographs of human faces, and render 3D objects.

How Do GANs work?


The discriminator learns to distinguish between the generator's fake data and the real
sample data. During the initial training, the generator produces fake data, and the
discriminator quickly learns to tell that it is false. The discriminator's results are then fed
back to both the generator and the discriminator to update their weights.

[Figure: diagram of how GANs operate]

RADIAL BASIS FUNCTION NETWORKS (RBFNS)

RBFNs are special types of feedforward neural networks that use radial basis functions
as activation functions. They have an input layer, a hidden layer, and an output layer and are
mostly used for classification, regression, and time-series prediction.

How Do RBFNs Work?

RBFNs perform classification by measuring the input's similarity to examples from the
training set. RBFNs have an input vector that feeds to the input layer. They have a hidden
layer of RBF neurons. The output layer finds the weighted sum of the hidden-layer outputs and
has one node per category or class of data. The neurons in the hidden layer contain Gaussian
transfer functions, whose outputs decrease as the distance from the neuron's center grows.
The network's output is a linear combination of the input's radial-basis
functions and the neuron's parameters.

[Figure: an example of an RBFN]
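
A minimal sketch of this computation is given below. It assumes Gaussian basis functions of the form exp(-beta * distance squared); the two centers, the output weights, and the function names are hypothetical values chosen for illustration.

```python
import math


def rbf(x, center, beta=1.0):
    """Gaussian radial basis function: 1.0 at the center, smaller further away."""
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-beta * dist_sq)


def rbfn_scores(x, centers, weights, beta=1.0):
    """Return one score per class as a weighted sum of the RBF activations."""
    activations = [rbf(x, c, beta) for c in centers]
    return [sum(w * a for w, a in zip(row, activations)) for row in weights]


def classify(x, centers, weights):
    scores = rbfn_scores(x, centers, weights)
    return scores.index(max(scores))


# Hypothetical prototypes: one hidden neuron per class.
centers = [(0.0, 0.0), (1.0, 1.0)]
weights = [[1.0, 0.0],   # output node for class 0
           [0.0, 1.0]]   # output node for class 1

print(classify((0.1, 0.0), centers, weights))
print(classify((0.9, 1.2), centers, weights))
```

In practice the centers are usually chosen from or fitted to the training set, and the output weights are learned.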
MULTILAYER PERCEPTRONS (MLPS)

MLPs are an excellent place to start learning about deep learning technology. MLPs
belong to the class of feedforward neural networks with multiple layers of perceptrons that
have activation functions. An MLP has one input layer and one output layer, with one or more
hidden layers in between, and adjacent layers are fully connected. MLPs can be used to build
speech-recognition, image-recognition, and machine-translation software.

How Do MLPs Work?

MLPs feed the data to the input layer of the network. The layers of neurons connect in a
graph so that the signal passes in one direction. MLPs combine the input with the weights
that exist between the input layer and the hidden layers. MLPs use activation functions to
determine which nodes to fire. Activation functions include ReLUs, sigmoid functions, and
tanh. MLPs train the model to understand the correlation and learn the dependencies
between the independent and the target variables from a training data set.

[Figure: an example MLP that computes weights and biases and applies suitable activation
functions to classify images of cats and dogs]
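
The forward pass described above can be sketched in a few lines. The weights below are hand-picked, hypothetical values (a trained MLP would learn them), and the names dense and mlp_forward are illustrative only.

```python
import math


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """Fully connected layer: weighted sum plus bias, then the activation."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical parameters: 2 inputs -> 2 hidden units -> 1 output.
W1, B1 = [[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]
W2, B2 = [[1.0, 1.0]], [-0.5]


def mlp_forward(x):
    hidden = dense(x, W1, B1, relu)           # input layer -> hidden layer
    return dense(hidden, W2, B2, sigmoid)[0]  # hidden layer -> output


print(mlp_forward([1.0, 0.0]))  # inputs differ
print(mlp_forward([1.0, 1.0]))  # inputs are equal
```

With these weights the output is above 0.5 when the two inputs differ and below 0.5 when they are equal, which shows how a hidden layer lets the network represent a relationship that a single perceptron cannot.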
SELF ORGANIZING MAPS (SOMS) 

Professor Teuvo Kohonen invented SOMs, which enable data visualization by reducing the
dimensions of data through self-organizing artificial neural networks. Data visualization
attempts to solve the problem that humans cannot easily visualize high-dimensional data.
SOMs are created to help users understand this high-dimensional information.

How Do SOMs Work?

1. SOMs initialize weights for each node and choose a vector at random from the training data.
2. SOMs examine every node to find the one whose weights are most like the input vector. The
winning node is called the Best Matching Unit (BMU).
3. SOMs find the BMU's neighbourhood; the number of neighbors lessens over time.
4. SOMs move the weights of the BMU and its neighbors toward the sample vector. The closer a
node is to the BMU, the more its weights change; the further a neighbor is from the BMU, the
less it learns.
5. SOMs repeat these steps, starting from the choice of a sample vector, for N iterations.

[Figure: an input vector of different colors is fed to a SOM, which converts the data into
2D RGB values and then separates and categorizes the different colors]
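
The training loop above can be sketched as follows. This is a simplified illustration, assuming a one-dimensional row of nodes, a linearly shrinking neighbourhood radius, and a linearly decaying learning rate; every name and constant is hypothetical.

```python
import random


def best_matching_unit(nodes, sample):
    """Index of the node whose weights are closest to the sample."""
    def dist_sq(w):
        return sum((wi - si) ** 2 for wi, si in zip(w, sample))
    return min(range(len(nodes)), key=lambda k: dist_sq(nodes[k]))


def train_som(data, n_nodes=4, n_iter=200, lr0=0.5, radius0=2.0, seed=0):
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialize the weights of every node at random.
    nodes = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    for t in range(n_iter):
        frac = 1.0 - t / n_iter            # shrinks from 1 toward 0
        lr, radius = lr0 * frac, radius0 * frac
        sample = rng.choice(data)          # pick a training vector at random
        bmu = best_matching_unit(nodes, sample)
        for k, w in enumerate(nodes):
            grid_dist = abs(k - bmu)       # distance on the map, not in data space
            if grid_dist <= radius:
                influence = 1.0 / (1.0 + grid_dist)   # closer nodes learn more
                for d in range(dim):
                    w[d] += lr * influence * (sample[d] - w[d])
    return nodes


# Hypothetical training data: pure red, green, and blue as RGB vectors.
colors = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
nodes = train_som(colors)
for w in nodes:
    print([round(v, 2) for v in w])
```

Practical SOMs usually arrange the nodes on a two-dimensional grid and use a Gaussian neighbourhood function; the row of four nodes here only keeps the example short.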
DEEP BELIEF NETWORKS (DBNS)

DBNs are generative models that consist of multiple layers of stochastic, latent
variables. The latent variables have binary values and are often called hidden units. DBNs
are a stack of Restricted Boltzmann Machines (RBMs) with connections between the layers, and
each RBM layer communicates with both the previous and subsequent layers. DBNs are used for
image recognition, video recognition, and motion-capture data.

How Do DBNs Work?

Greedy learning algorithms train DBNs. The greedy learning algorithm uses a layer-by-layer
approach for learning the top-down, generative weights. DBNs run the steps of Gibbs
sampling on the top two hidden layers. This stage draws a sample from the RBM defined by
the top two hidden layers. DBNs draw a sample from the visible units using a single pass of
ancestral sampling through the rest of the model. DBNs learn that the values of the latent
variables in every layer can be inferred by a single, bottom-up pass.

[Figure: an example of DBN architecture]
RESTRICTED BOLTZMANN MACHINES (RBMS)

Popularized by Geoffrey Hinton, RBMs are stochastic neural networks that can learn from
a probability distribution over a set of inputs. This deep learning algorithm is used for
dimensionality reduction, classification, regression, collaborative filtering, feature learning,
and topic modeling. RBMs constitute the building blocks of DBNs. RBMs consist of two
layers: visible units and hidden units. Each visible unit is connected to all hidden units.
RBMs have a bias unit that is connected to all the visible units and the hidden units, and they
have no output nodes.

How Do RBMs Work?

RBMs have two phases: a forward pass and a backward pass. In the forward pass, RBMs accept
the inputs and translate them into a set of numbers that encodes the inputs. RBMs
combine every input with an individual weight and one overall bias. The algorithm passes the
output to the hidden layer. In the backward pass, RBMs take that set of numbers and
translate them to form the reconstructed inputs. RBMs combine each activation with an
individual weight and an overall bias and pass the output to the visible layer for reconstruction.
At the visible layer, the RBM compares the reconstruction with the original input to analyze
the quality of the result.

[Figure: diagram of how RBMs function]
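
The two passes can be sketched as below. This is a simplified illustration: it propagates activation probabilities instead of sampling binary states, and the weights, biases, and function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward_pass(visible, weights, hidden_bias):
    """Visible -> hidden. weights[i][j] links visible unit i to hidden unit j."""
    return [
        sigmoid(sum(visible[i] * weights[i][j] for i in range(len(visible)))
                + hidden_bias[j])
        for j in range(len(hidden_bias))
    ]


def backward_pass(hidden, weights, visible_bias):
    """Hidden -> visible, reusing the same weights to reconstruct the input."""
    return [
        sigmoid(sum(hidden[j] * weights[i][j] for j in range(len(hidden)))
                + visible_bias[i])
        for i in range(len(visible_bias))
    ]


def reconstruction_error(original, reconstructed):
    """Mean squared difference between the input and its reconstruction."""
    return sum((v - r) ** 2 for v, r in zip(original, reconstructed)) / len(original)


# Hypothetical RBM with 3 visible units and 2 hidden units.
weights = [[2.0, -1.0], [-1.0, 2.0], [2.0, -1.0]]
hidden_bias = [0.0, 0.0]
visible_bias = [0.0, 0.0, 0.0]

visible = [1.0, 0.0, 1.0]
hidden = forward_pass(visible, weights, hidden_bias)
reconstructed = backward_pass(hidden, weights, visible_bias)
print(hidden)
print(reconstructed)
print(reconstruction_error(visible, reconstructed))
```

Training, for example with contrastive divergence, would adjust the weights to reduce this reconstruction error; that step is left out of the sketch.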

Autoencoders

Autoencoders are a specific type of feedforward neural network in which the input and
output are identical. Geoffrey Hinton designed autoencoders in the 1980s to solve
unsupervised learning problems. They are trained neural networks that replicate the data from
the input layer to the output layer. Autoencoders are used for purposes such as
pharmaceutical discovery, popularity prediction, and image processing.

How Do Autoencoders Work?

An autoencoder consists of three main components: the encoder, the code, and the
decoder. Autoencoders are structured to receive an input and transform it into a different
representation. They then attempt to reconstruct the original input as accurately as possible.
For example, when an image of a digit is not clearly visible, it can be fed to an autoencoder
neural network. The autoencoder first encodes the image, reducing the input to a smaller
representation, and then decodes that representation to generate the reconstructed image.

[Figure: how autoencoders operate]
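
The encoder, code, and decoder can be illustrated with a tiny linear example. The weights below are set by hand purely for illustration (a real autoencoder learns them by minimizing the reconstruction error), and all names are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


# Hand-set weights: the encoder averages neighbouring pairs (4 values -> 2),
# and the decoder copies each code value back to both positions (2 -> 4).
ENCODER = [[0.5, 0.5, 0.0, 0.0],
           [0.0, 0.0, 0.5, 0.5]]
DECODER = [[1.0, 0.0],
           [1.0, 0.0],
           [0.0, 1.0],
           [0.0, 1.0]]


def autoencode(x):
    code = matvec(ENCODER, x)               # smaller representation
    reconstruction = matvec(DECODER, code)  # attempt to rebuild the input
    return code, reconstruction


def mse(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


for x in ([1.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]):
    code, reconstruction = autoencode(x)
    print(code, reconstruction, mse(x, reconstruction))
```

The first input fits the structure that the code can represent and is rebuilt exactly; the second loses detail in the two-value code, so its reconstruction error is non-zero.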

ADVANTAGES OF DEEP LEARNING:

Following are the benefits or advantages of Deep Learning:

 Features are automatically deduced and optimally tuned for the desired outcome. Features
are not required to be extracted ahead of time. This avoids time-consuming machine learning
techniques.
 Robustness to natural variations in the data is automatically learned.
 The same neural network based approach can be applied to many different applications and
data types.
 Massive parallel computations can be performed using GPUs and are scalable for large
volumes of data. Moreover, it delivers better performance when the amount of data is huge.
 The deep learning architecture is flexible enough to be adapted to new problems in
the future.

DISADVANTAGES OF DEEP LEARNING:

 It requires a very large amount of data in order to perform better than other techniques.
 It is extremely expensive to train due to complex data models. Moreover, deep learning
requires expensive GPUs and hundreds of machines. This increases the cost to users.
 There is no standard theory to guide the selection of the right deep learning tools, as it
requires knowledge of topology, training method and other parameters. As a result, it is
difficult for less skilled people to adopt.
 It is not easy to comprehend the output based on mere learning; classifiers are required to
do so. Convolutional neural network based algorithms perform such tasks.

LITERATURE SURVEY:

A SURVEY OF MACHINE-LEARNING AND NATURE-INSPIRED BASED
CREDIT CARD FRAUD DETECTION TECHNIQUES

Credit card is one of the popular modes of payment for electronic transactions in many
developed and developing countries. Invention of credit cards has made online transactions
seamless, easier, comfortable and convenient. However, it has also provided new fraud
opportunities for criminals and, in turn, increased the fraud rate. The global impact of credit
card fraud is alarming: millions of US dollars have been lost by many companies and individuals.
Furthermore, cybercriminals are innovating sophisticated techniques on a regular basis;
hence, there is an urgent need to develop improved and dynamic techniques capable of
adapting to rapidly evolving fraudulent patterns. Achieving this task is very challenging,
primarily due to the dynamic nature of fraud and also due to the lack of datasets for researchers.
This paper presents a review of improved credit card fraud detection techniques. Precisely,
this paper focused on recent Machine Learning based and Nature Inspired based credit card
fraud detection techniques proposed in the literature. This paper provides a picture of recent
trends in credit card fraud detection. Moreover, this review outlines some limitations and
contributions of existing credit card fraud detection techniques, and it also provides necessary
background information for researchers in this domain. Additionally, this review serves as a
guide and stepping stone for financial institutions and individuals seeking new and
effective credit card fraud detection techniques.

HOBA: A NOVEL FEATURE ENGINEERING METHODOLOGY FOR CREDIT
CARD FRAUD DETECTION WITH A DEEP LEARNING ARCHITECTURE

Credit card transaction fraud costs billions of dollars to card issuers every year. A well-
developed fraud detection system with a state-of-the-art fraud detection model is regarded as
essential to reducing fraud losses. The main contribution of our work is the development of a
fraud detection system that employs a deep learning architecture together with an advanced
feature engineering process based on homogeneity-oriented behavior analysis (HOBA).
Based on a real-life dataset from one of the largest commercial banks in China, we conduct a
comparative study to assess the effectiveness of the proposed framework. The experimental
results illustrate that our proposed methodology is an effective and feasible mechanism for
credit card fraud detection. From a practical perspective, our proposed method can identify
relatively more fraudulent transactions than the benchmark methods under an acceptable false
positive rate. The managerial implication of our work is that credit card issuers can apply the
proposed methodology to efficiently identify fraudulent transactions to protect customers'
interests and reduce fraud losses and regulatory costs.

A DATA MINING BASED SYSTEM FOR CREDIT-CARD FRAUD DETECTION IN
E-TAIL

Credit-card fraud leads to billions of dollars in losses for online merchants. With the
development of machine learning algorithms, researchers have been finding increasingly
sophisticated ways to detect fraud, but practical implementations are rarely reported. We
describe the development and deployment of a fraud detection system in a large e-tail
merchant. The paper explores the combination of manual and automatic classification, gives
insights into the complete development process and compares different machine learning
methods. The paper can thus help researchers and practitioners to design and implement data
mining based systems for fraud detection or similar problems. This project has contributed
not only with an automatic system, but also with insights to the fraud analysts for improving
their manual revision process, which resulted in an overall superior performance.

This paper describes a rapid technique: communal analysis suspicion scoring (CASS), for
generating numeric suspicion scores on streaming credit applications based on implicit links
to each other, over both time and space. CASS includes pair-wise communal scoring of
identifier attributes for applications, definition of categories of suspiciousness for application-
pairs, the incorporation of temporal and spatial weights, and smoothed k-wise scoring of
multiple linked application-pairs. Results on mining several hundred thousand real credit
applications demonstrate that CASS reduces false alarm rates while maintaining reasonable
hit rates. CASS is scalable for this large data sample, and can rapidly detect early symptoms
of identity crime. In addition, new insights have been observed from the relationships
between applications.

CREDIT CARD FRAUD DETECTION: A REALISTIC MODELING AND A NOVEL
LEARNING STRATEGY

Detecting frauds in credit card transactions is perhaps one of the best testbeds for
computational intelligence algorithms. In fact, this problem involves a number of relevant
challenges, namely: concept drift (customers' habits evolve and fraudsters change their
strategies over time), class imbalance (genuine transactions far outnumber frauds), and
verification latency (only a small set of transactions are timely checked by investigators).
However, the vast majority of learning algorithms that have been proposed for fraud
detection rely on assumptions that hardly hold in a real-world fraud-detection system (FDS).
This lack of realism concerns two main aspects: 1) the way and timing with which supervised
information is provided and 2) the measures used to assess fraud-detection performance. This
paper has three major contributions. First, we propose, with the help of our industrial partner,
a formalization of the fraud-detection problem that realistically describes the operating
conditions of FDSs that every day analyze massive streams of credit card transactions. We
also illustrate the most appropriate performance measures to be used for fraud-detection
purposes. Second, we design and assess a novel learning strategy that effectively addresses
class imbalance, concept drift, and verification latency. Third, in our experiments, we
demonstrate the impact of class unbalance and concept drift in a real-world data stream
containing more than 75 million transactions, authorized over a time window of three years.

DATA MINING FOR CREDIT CARD FRAUD: A COMPARATIVE STUDY

Credit card fraud is a serious and growing problem. While predictive models for credit card
fraud detection are in active use in practice, reported studies on the use of data mining
approaches for credit card fraud detection are relatively few, possibly due to the lack of
available data for research. This paper evaluates two advanced data mining approaches,
support vector machines and random forests, together with the well-known logistic
regression, as part of an attempt to better detect (and thus control and prosecute) credit card
fraud. The study is based on real-life data of transactions from an international credit card
operation.

REAL-TIME CREDIT CARD FRAUD DETECTION USING COMPUTATIONAL
INTELLIGENCE

Online banking and e-commerce have been experiencing rapid growth over the past few years and
show tremendous promise of growth even in the future. This has made it easier for fraudsters to
indulge in new and abstruse ways of committing credit card fraud over the Internet. This paper
focuses on real-time fraud detection and presents a new and innovative approach in understanding
spending patterns to decipher potential fraud cases. It makes use of a self-organizing map to decipher,
filter and analyze customer behavior for detection of fraud.

CREDIT CARD FRAUD DETECTION USING SELF-ORGANIZING MAPS

Nowadays, credit card fraud detection is of great importance to financial institutions. This article
presents an automated credit card fraud detection system based on the neural network technology.
The authors apply the Self-Organizing Map algorithm to create a model of typical cardholder's
behavior and to analyze the deviation of transactions, thus finding suspicious transactions.

VIDEO TRACKING USING LEARNED HIERARCHICAL FEATURES

In this paper, we propose an approach to learn hierarchical features for visual object tracking. First,
we offline learn features robust to diverse motion patterns from auxiliary video sequences. The
hierarchical features are learned via a two-layer convolutional neural network. Embedding the
temporal slowness constraint in the stacked architecture makes the learned features robust to
complicated motion transformations, which is important for visual object tracking. Then, given a
target video sequence, we propose a domain adaptation module to online adapt the pre-learned
features according to the specific target object. The adaptation is conducted in both layers of the deep
feature learning module so as to include appearance information of the specific target object. As a
result, the learned hierarchical features can be robust to both complicated motion transformations and
appearance changes of target objects. We integrate our feature learning algorithm into three tracking
methods. Experimental results demonstrate that significant improvement can be achieved using our
learned hierarchical features, especially on video sequences with complicated motion transformations.

SEQUENCE CLASSIFICATION FOR CREDIT-CARD FRAUD DETECTION

Due to the growing volume of electronic payments, the monetary strain of credit-card fraud is
turning into a substantial challenge for financial institutions and service providers, thus
forcing them to continuously improve their fraud detection systems. However, modern data-
driven and learning-based methods, despite their popularity in other domains, only slowly
find their way into business applications.

In this paper, we phrase the fraud detection problem as a sequence classification task and
employ Long Short-Term Memory (LSTM) networks to incorporate transaction sequences.
We also integrate state-of-the-art feature aggregation strategies and report our results by
means of traditional retrieval metrics.

A comparison to a baseline random forest (RF) classifier showed that the LSTM improves
detection accuracy on offline transactions where the card-holder is physically present at a
merchant. Both the sequential and non-sequential learning approaches benefit strongly from
manual feature aggregation strategies. A subsequent analysis of true positives revealed that
both approaches tend to detect different frauds, which suggests a combination of the two. We
conclude our study with a discussion on both practical and scientific challenges that remain
unsolved.

DECISION SUPPORT FROM FINANCIAL DISCLOSURES WITH DEEP NEURAL
NETWORKS AND TRANSFER LEARNING

Company disclosures greatly aid in the process of financial decision-making; therefore, they are
consulted by financial investors and automated traders before exercising ownership in stocks. While
humans are usually able to correctly interpret the content, the same is rarely true of computerized
decision support systems, which struggle with the complexity and ambiguity of natural language. A
possible remedy is represented by deep learning, which overcomes several shortcomings of traditional
methods of text mining. For instance, recurrent neural networks, such as long short-term memories,
employ hierarchical structures, together with a large number of hidden layers, to automatically extract
features from ordered sequences of words and capture highly non-linear relationships such as context-
dependent meanings. However, deep learning has only recently started to receive traction, possibly
because its performance is largely untested. Hence, this paper studies the use of deep neural networks
for financial decision support. We additionally experiment with transfer learning, in which we pre-
train the network on a different corpus with a length of 139.1 million words. Our results reveal a
higher directional accuracy as compared to traditional machine learning when predicting stock price
movements in response to financial disclosures. Our work thereby helps to highlight the business
value of deep learning and provides recommendations to practitioners and executives.

DATA MINING TECHNIQUES FOR THE DETECTION OF FRAUDULENT


FINANCIAL STATEMENTS

This paper explores the effectiveness of Data Mining (DM) classification techniques in
detecting firms that issue fraudulent financial statements (FFS) and deals with the
identification of factors associated to FFS. In accomplishing the task of management fraud
detection, auditors could be facilitated in their work by using Data Mining techniques. This
study investigates the usefulness of Decision Trees, Neural Networks and Bayesian Belief
Networks in the identification of fraudulent financial statements. The input vector is
composed of ratios derived from financial statements. The three models are compared in
terms of their performances.

SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:

 System : Pentium IV 2.4 GHz.
 Hard Disk : 40 GB.
 Floppy Drive : 1.44 MB.
 Monitor : 15 VGA Colour.
 Mouse : Logitech.
 RAM : 512 MB.

SOFTWARE REQUIREMENTS:

 Operating system : Windows XP/7.
 Coding Language : Python
 Tool : TensorFlow
 Database : SQL Server 2008

ARCHITECTURE:
The proposed intelligent approach for credit card fraud detection consists of four major
steps, which are explained in the following subsections. The experiment was performed
using an Intel Core i7 processor with 8GB RAM. The proposed approach and other
machine learning techniques were implemented and tested using Python.

SOFTWARE ENVIRONMENT:
Python is a general-purpose interpreted, interactive, object-oriented, and high-level
programming language. It was created by Guido van Rossum during 1985-1990. Python source
code is available under the open-source Python Software Foundation License, which is
compatible with the GNU General Public License (GPL).

Why Learn Python?

Python is a high-level, interpreted, interactive and object-oriented scripting language. Python
is designed to be highly readable. It uses English keywords frequently whereas other
languages use punctuation, and it has fewer syntactical constructions than other languages.

Python is a valuable language for students and working professionals who want to become
strong software engineers, especially when they work in the web development domain. Some of
the key advantages of learning Python are:

 Python is Interpreted − Python is processed at runtime by the interpreter. You do
not need to compile your program before executing it. This is similar to Perl and
PHP.
 Python is Interactive − You can actually sit at a Python prompt and interact with the
interpreter directly to write your programs.
 Python is Object-Oriented − Python supports Object-Oriented style or technique of
programming that encapsulates code within objects.
 Python is a Beginner's Language − Python is a great language for the beginner-level
programmers and supports the development of a wide range of applications from
simple text processing to WWW browsers to games.

Characteristics of Python

Following are important characteristics of Python programming −

 It supports functional and structured programming methods as well as OOP.
 It can be used as a scripting language or can be compiled to byte-code for building
large applications.
 It provides very high-level dynamic data types and supports dynamic type checking.
 It supports automatic garbage collection.
 It can be easily integrated with C, C++, COM, ActiveX, CORBA, and Java.

Hello World Using Python

The conventional first program in Python prints the text "Hello, World!" with a single call
to the built-in print function.
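
A minimal version of that program might look like this (the variable name message is only illustrative):

```python
# Store the greeting, then write it to standard output.
message = "Hello, World!"
print(message)
```

Saving these lines in a file and running it with the Python interpreter prints the greeting; no separate compilation step is needed.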

Applications of Python

Python is one of the most widely used languages on the web. The features below explain why
it suits such a wide range of applications:

 Easy-to-learn − Python has few keywords, simple structure, and a clearly defined
syntax. This allows the student to pick up the language quickly.
 Easy-to-read − Python code is clearly laid out and easy on the eyes.
 Easy-to-maintain − Python's source code is fairly easy to maintain.
 A broad standard library − The bulk of Python's library is very portable and cross-
platform compatible on UNIX, Windows, and Macintosh.
 Interactive Mode − Python has support for an interactive mode which allows
interactive testing and debugging of snippets of code.
 Portable − Python can run on a wide variety of hardware platforms and has the same
interface on all platforms.
 Extendable − You can add low-level modules to the Python interpreter. These
modules enable programmers to add to or customize their tools to be more efficient.
 Databases − Python provides interfaces to all major commercial databases.
 GUI Programming − Python supports GUI applications that can be created and
ported to many system calls, libraries and windows systems, such as Windows MFC,
Macintosh, and the X Window system of Unix.
 Scalable − Python provides a better structure and support for large programs than
shell scripting.

History of Python

Python was developed by Guido van Rossum in the late eighties and early nineties at the
National Research Institute for Mathematics and Computer Science in the Netherlands.

Python is derived from many other languages, including ABC, Modula-3, C, C++, Algol-68,
SmallTalk, and Unix shell and other scripting languages.

Python is copyrighted. Its source code is available under the open-source Python Software
Foundation License, which is compatible with the GNU General Public License (GPL).

Python is now maintained by a core development team, and Guido van Rossum long held a
vital role in directing its progress.

 Variables are nothing but reserved memory locations to store values. This means that
when you create a variable you reserve some space in memory.
 Based on the data type of a variable, the interpreter allocates memory and decides
what can be stored in the reserved memory. Therefore, by assigning different data
types to variables, you can store integers, decimals or characters in these variables.
 Assigning Values to Variables

 Python variables do not need explicit declaration to reserve memory space. The
declaration happens automatically when you assign a value to a variable. The equal
sign (=) is used to assign values to variables.
 The operand to the left of the = operator is the name of the variable and the operand to
the right of the = operator is the value stored in the variable. For example −
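
A minimal sketch of such assignments follows. The variable names and values are assumptions chosen for illustration and are not taken from the project code.

```python
# Assigning a value creates the variable; no separate declaration is needed.
counter = 100      # an integer
miles = 1000.0     # a floating-point (decimal) number
name = "John"      # a string of characters

# The interpreter infers each type from the value on the right of the = sign.
print(type(counter).__name__, type(miles).__name__, type(name).__name__)
```
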
 A module allows you to logically organize your Python code. Grouping related code
into a module makes the code easier to understand and use. A module is a Python
object with arbitrarily named attributes that you can bind and reference.

Simply put, a module is a file consisting of Python code. A module can define functions,
classes and variables. A module can also include runnable code.

The Python code for a module named aname normally resides in a file named
aname.py; a simple module named support, for example, resides in support.py.
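
As a sketch of what a support.py module might contain, consider the listing below. The function name print_func and its greeting text are assumptions made for illustration, since the original listing is not reproduced in this report.

```python
# support.py: a hypothetical module; the function name and message are assumptions.
def print_func(par):
    """Print a greeting for the given name and return the printed text."""
    message = "Hello : " + par
    print(message)
    return message
```

Another script in the same directory could then run `import support` and call `support.print_func("Zara")`.
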

Python has been an object-oriented language since its inception. Because of this, creating and
using classes and objects is straightforward. This chapter introduces Python's
object-oriented programming support.

If you do not have any previous experience with object-oriented (OO) programming, you
may want to consult an introductory course on it or at least a tutorial of some sort so that you
have a grasp of the basic concepts.

However, here is a brief introduction to Object-Oriented Programming (OOP) to bring you up to
speed −

Overview of OOP Terminology

 Class − A user-defined prototype for an object that defines a set of attributes that
characterize any object of the class. The attributes are data members (class variables
and instance variables) and methods, accessed via dot notation.
 Class variable − A variable that is shared by all instances of a class. Class variables
are defined within a class but outside any of the class's methods. Class variables are
not used as frequently as instance variables are.
 Data member − A class variable or instance variable that holds data associated with a
class and its objects.
 Function overloading − The assignment of more than one behavior to a particular
function. The operation performed varies by the types of objects or arguments
involved.
 Instance variable − A variable that is defined inside a method and belongs only to
the current instance of a class.
 Inheritance − The transfer of the characteristics of a class to other classes that are
derived from it.
 Instance − An individual object of a certain class. An object obj that belongs to a
class Circle, for example, is an instance of the class Circle.
 Instantiation − The creation of an instance of a class.
 Method − A special kind of function that is defined in a class definition.
 Object − A unique instance of a data structure that's defined by its class. An object
comprises both data members (class variables and instance variables) and methods.
 Operator overloading − The assignment of more than one function to a particular
operator.
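
Several of the terms above can be seen together in one short sketch. The Circle class and the object name obj follow the example in the list; the Shape base class, the attribute names and the numbers are assumptions made for illustration.

```python
import math


class Shape:
    """Base class; `count` is a class variable shared by every instance."""
    count = 0

    def __init__(self, name):
        self.name = name          # instance variable, unique to each object
        Shape.count += 1

    def area(self):               # method: a function defined in the class
        return 0.0

    def describe(self):
        return "%s with area %.2f" % (self.name, self.area())


class Circle(Shape):              # inheritance: Circle derives from Shape
    def __init__(self, radius):
        super().__init__("circle")
        self.radius = radius

    def area(self):               # overrides the method inherited from Shape
        return math.pi * self.radius ** 2

    def __add__(self, other):     # operator overloading: defines circle + circle
        return Circle(self.radius + other.radius)


obj = Circle(1.0)                 # instantiation: obj is an instance of Circle
bigger = obj + Circle(2.0)        # the + operator calls Circle.__add__
print(obj.describe())
print(bigger.radius, Shape.count)
```
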

Creating Classes

The class statement creates a new class definition. The name of the class immediately follows
the keyword class followed by a colon as follows −

class ClassName:
    'Optional class documentation string'
    class_suite

 The class has a documentation string, which can be accessed via ClassName.__doc__.
 The class_suite consists of all the component statements defining class members, data
attributes and functions.

Example

Following is the example of a simple Python class −

class Employee:
    'Common base class for all employees'
    empCount = 0  # class variable shared by all instances

    def __init__(self, name, salary):
        self.name = name
        self.salary = salary
        Employee.empCount += 1

    def displayCount(self):
        print("Total Employee %d" % Employee.empCount)

    def displayEmployee(self):
        print("Name : %s, Salary: %s" % (self.name, self.salary))

 The variable empCount is a class variable whose value is shared among all instances
of this class. It can be accessed as Employee.empCount from inside the class or
outside the class.
 The first method __init__() is a special method, which is called class constructor or
initialization method that Python calls when you create a new instance of this class.
 You declare other class methods like normal functions with the exception that the first
argument to each method is self. Python passes the instance as self automatically; you
do not need to include it when you call the methods.

Creating Instance Objects

To create instances of a class, you call the class by its name and pass in whatever
arguments its __init__ method accepts.

# This creates the first object of the Employee class
emp1 = Employee("Zara", 2000)
# This creates the second object of the Employee class
emp2 = Employee("Manni", 5000)

Accessing Attributes

You access an object's attributes using the dot operator on the object. A class variable is
accessed using the class name, as follows −

emp1.displayEmployee()
emp2.displayEmployee()
print("Total Employee %d" % Employee.empCount)

The Python standard for database interfaces is the Python DB-API. Most Python database
interfaces adhere to this standard.

You can choose the right database for your application. Python Database API supports a wide
range of database servers such as −

 GadFly
 mSQL
 MySQL
 PostgreSQL
 Microsoft SQL Server 2000
 Informix
 Interbase
 Oracle
 Sybase

A list of available Python database interfaces is published under the title "Python Database Interfaces and APIs".
You must download a separate DB API module for each database you need to access. For
example, if you need to access an Oracle database as well as a MySQL database, you must
download both the Oracle and the MySQL database modules.

The DB API provides a minimal standard for working with databases using Python structures
and syntax wherever possible. This API includes the following −

 Importing the API module.
 Acquiring a connection with the database.
 Issuing SQL statements and stored procedures.
 Closing the connection

The concepts below are illustrated using MySQL, so we first introduce the MySQLdb module.

What is MySQLdb?

MySQLdb is an interface for connecting to a MySQL database server from Python. It
implements the Python Database API v2.0 and is built on top of the MySQL C API.

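
Because every DB-API 2.0 module follows the same pattern (import the module, acquire a connection, issue SQL, close the connection), the steps can be sketched without a MySQL server. The sketch below swaps in the standard-library sqlite3 module as a stand-in for MySQLdb; the sample row is an assumption made for illustration, and the column names mirror the EMPLOYEE table used in this chapter.

```python
import sqlite3  # stand-in for MySQLdb; both modules follow DB-API 2.0

# Acquire a connection (an in-memory database, so nothing is written to disk).
conn = sqlite3.connect(":memory:")
try:
    cursor = conn.cursor()

    # Issue SQL statements through a cursor.
    cursor.execute(
        "CREATE TABLE EMPLOYEE "
        "(FIRST_NAME TEXT, LAST_NAME TEXT, AGE INT, SEX TEXT, INCOME REAL)"
    )
    cursor.execute(
        "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?, ?)",
        ("Zara", "Ali", 28, "F", 2000.0),  # illustrative row, not real data
    )
    conn.commit()

    cursor.execute("SELECT FIRST_NAME, INCOME FROM EMPLOYEE WHERE AGE > ?", (25,))
    rows = cursor.fetchall()
    print(rows)
finally:
    # Close the connection when done, even if a statement failed.
    conn.close()
```

One difference to keep in mind when porting this to MySQLdb: sqlite3 uses `?` as the parameter placeholder, whereas MySQLdb uses `%s`.
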
How do I Install MySQLdb?

Before proceeding, make sure you have MySQLdb installed on your machine. Type
the following in a Python script and execute it −

#!/usr/bin/python

import MySQLdb

If it produces the following result, then it means MySQLdb module is not installed −

Traceback (most recent call last):
  File "test.py", line 3, in <module>
    import MySQLdb
ImportError: No module named MySQLdb

To install the MySQLdb module, use the command for your platform −

For Ubuntu, use the following command −

$ sudo apt-get install python-pip python-dev libmysqlclient-dev

For Fedora, use the following command −

$ sudo dnf install python python-devel mysql-devel redhat-rpm-config gcc

From a command prompt with pip available, use the following command −

$ pip install MySQL-python

Note − Make sure you have root privileges to install the above module. MySQL-python
supports Python 2 only; on Python 3, the mysqlclient package provides the same MySQLdb
module.

Database Connection

Before connecting to a MySQL database, make sure of the following −

 You have created a database TESTDB.
 You have created a table EMPLOYEE in TESTDB.
 This table has fields FIRST_NAME, LAST_NAME, AGE, SEX and INCOME.
 User ID "testuser" and password "test123" are set to access TESTDB.
 Python module MySQLdb is installed properly on your machine.
 You have gone through MySQL tutorial to understand MySQL Basics

Python provides various options for developing graphical user interfaces (GUIs). The most
important are listed below.

 Tkinter − Tkinter is the Python interface to the Tk GUI toolkit shipped with Python.
This chapter covers this option.
 wxPython − This is an open-source Python interface for wxWindows
http://wxpython.org.
 JPython − JPython is a Python port for Java which gives Python scripts seamless
access to Java class libraries on the local machine http://www.jython.org.

Many other interfaces are available and can be found online.

Tkinter Programming

Tkinter is the standard GUI library for Python. Python when combined with Tkinter provides
a fast and easy way to create GUI applications. Tkinter provides a powerful object-oriented
interface to the Tk GUI toolkit.

Creating a GUI application using Tkinter is an easy task. All you need to do is perform the
following steps −

• Import the Tkinter module (named tkinter in Python 3).
• Create the GUI application main window.
• Add one or more of the widgets described below to the GUI application.
• Enter the main event loop to take action against each event triggered by the user.

Example

#!/usr/bin/python3

import tkinter  # the module is named Tkinter in Python 2

top = tkinter.Tk()
# Code to add widgets will go here...
top.mainloop()

This creates an empty main window.

Tkinter Widgets

Tkinter provides various controls, such as buttons, labels and text boxes used in a GUI
application. These controls are commonly called widgets.

The widgets and related modules available in Tkinter are presented, with a brief
description of each, in the following table −

Sr.No.  Widget & Description

1   Button
    The Button widget is used to display buttons in your application.

2   Canvas
    The Canvas widget is used to draw shapes, such as lines, ovals, polygons and
    rectangles, in your application.

3   Checkbutton
    The Checkbutton widget is used to display a number of options as checkboxes. The
    user can select multiple options at a time.

4   Entry
    The Entry widget is used to display a single-line text field for accepting values from a
    user.

5   Frame
    The Frame widget is used as a container widget to organize other widgets.

6   Label
    The Label widget is used to provide a single-line caption for other widgets. It can also
    contain images.

7   Listbox
    The Listbox widget is used to provide a list of options to a user.

8   Menubutton
    The Menubutton widget is used to display menus in your application.

9   Menu
    The Menu widget is used to provide various commands to a user. These commands are
    contained inside Menubutton.

10  Message
    The Message widget is used to display multiline, non-editable text.

11  Radiobutton
    The Radiobutton widget is used to display a number of options as radio buttons. The
    user can select only one option at a time.

12  Scale
    The Scale widget is used to provide a slider widget.

13  Scrollbar
    The Scrollbar widget is used to add scrolling capability to various widgets, such as list
    boxes.

14  Text
    The Text widget is used to display text in multiple lines.

15  Toplevel
    The Toplevel widget is used to provide a separate window container.

16  Spinbox
    The Spinbox widget is a variant of the standard Tkinter Entry widget, which can be
    used to select from a fixed number of values.

17  PanedWindow
    A PanedWindow is a container widget that may contain any number of panes,
    arranged horizontally or vertically.

18  LabelFrame
    A LabelFrame is a simple container widget. Its primary purpose is to act as a spacer or
    container for complex window layouts.

19  tkMessageBox
    This module is used to display message boxes in your applications.

Standard attributes

Widgets share a set of common attributes, such as sizes, colors and fonts. The standard
attributes are −

 Dimensions
 Colors
 Fonts
 Anchors
 Relief styles
 Bitmaps
 Cursors

Geometry Management

All Tkinter widgets have access to specific geometry management methods, which have the
purpose of organizing widgets throughout the parent widget area. Tkinter exposes the
following geometry manager classes: pack, grid, and place.
 The pack() Method − This geometry manager organizes widgets in blocks before
placing them in the parent widget.
 The grid() Method − This geometry manager organizes widgets in a table-like
structure in the parent widget.
 The place() Method − This geometry manager organizes widgets by placing them in a
specific position in the parent widget.
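
The three geometry managers can be combined in one window as long as grid() and pack() are not mixed inside the same parent. The sketch below is a minimal illustration under that constraint; the function name build_form, the widget labels and the layout are assumptions, and it uses the Python 3 module name tkinter.

```python
def build_form(parent):
    """Build a small form inside `parent` and return the frame that holds it."""
    import tkinter as tk  # deferred so that defining the function does not need Tk

    frame = tk.Frame(parent)

    # grid(): arrange labels and entries in rows and columns inside the frame.
    tk.Label(frame, text="Card number").grid(row=0, column=0, sticky="e")
    tk.Entry(frame).grid(row=0, column=1)
    tk.Label(frame, text="Amount").grid(row=1, column=0, sticky="e")
    tk.Entry(frame).grid(row=1, column=1)

    # pack(): stack the frame and a button vertically in the parent.
    frame.pack(padx=10, pady=10)
    tk.Button(parent, text="Check").pack(pady=5)

    # place(): pin a label to the bottom-right corner of the parent.
    tk.Label(parent, text="demo").place(relx=1.0, rely=1.0, anchor="se")
    return frame
```

On a machine with a display, calling build_form(tkinter.Tk()) and then the root window's mainloop() would show the form.
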

CONCLUSION

The detection of credit card fraud is significant to the improved utilization of credit cards.
With large and continuing financial losses being experienced by financial firms and given
the increasing difficulty of detecting credit card fraud, it is important to develop more
effective approaches for detecting fraudulent credit card transactions. This paper
proposes an intelligent approach for detecting fraud in credit card transactions using an
optimized light gradient boosting machine (OLightGBM). We conducted several
experiments using two real-world data sets. The performance of the proposed approach was
evaluated through comparison with other research outcomes and state-of-the-art
machine learning algorithms, including random forest, logistic regression, the radial support
vector machine, the linear support vector machine, k-nearest neighbors, decision tree, and
naive Bayes. The experimental results indicate that the proposed approach outperformed the
other machine learning algorithms and achieved the highest performance in terms of accuracy,
AUC, precision and F1-score. The results reveal that the proposed algorithm is superior to
other classifiers. The results also highlight the importance and value of adopting
an efficient parameter optimization strategy for enhancing the predictive performance of
the proposed approach.
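
For reference, three of the evaluation measures named above can be computed directly from predicted and true labels. The sketch below is a plain-Python illustration, not the evaluation code used in the experiments; the labels are made up and do not represent results from the paper. AUC is omitted because it requires predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = fraud)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate, not flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate but flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud that was missed

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration only; these are not results from the paper.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Because fraudulent transactions are rare, accuracy alone can look high even when most fraud is missed, which is why precision and F1-score are reported alongside it.
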
