IECH1103 Analytics Report
Data choice
Banknotes are essential to the smooth functioning of the monetary system, and preventing the
circulation of counterfeit banknotes is crucial for a well-functioning cash economy. Counterfeit
currency is a fake version of legal tender that has been produced illegally, and more counterfeit
bills are in circulation now than in previous years; the widespread production of fakes in every
denomination contributes in part to the poor state of a country's financial market. The dataset
chosen here addresses this problem directly. Within it, the quantity of each type is distributed
normally, and a note is authentic when the target class value is 0 and fake when it is 1.
A banknote printed without the proper authorization of the state or federal government is
counterfeit, and counterfeit (or fake) currency of this kind undermines one of a nation's most
valuable resources: its currency. Some dishonest participants in the financial market deliberately
flood it with counterfeit notes that are almost impossible to tell apart from the genuine article,
sowing confusion and discord among prospective investors. Because genuine and counterfeit banknotes
share so many similarities, it is difficult for people to distinguish between them, so a system is
needed that can accurately determine whether a particular bill is genuine. The long-term goal of
this study is to establish a framework for classifying the various approaches that can be used to
detect counterfeit banknotes and so forestall the further spread of fakes. This report examines
several machine learning algorithms that could be used in the near future to verify and analyse
banknotes: supervised methods such as Decision Trees and Linear Regression, together with
unsupervised methods such as Simple K-Means. With the assistance of these algorithms, it is
possible to learn the patterns that distinguish genuine notes from fakes.
Dataset Description
The dataset used to train the models was obtained from the UCI Machine Learning Repository. It was
compiled from images of both genuine and forged banknotes and contains 1,372 instances. There are
five attributes in total: four serve as features and one is the target. The proportion of items
belonging to each category in the dataset follows a normal distribution. The target class takes the
values 0 and 1, where a value of 0 indicates a genuine note and a value of 1 indicates a fake note.
The histogram of all five attributes in the dataset can be seen in the image below.
You can see the relationships between the various attributes in the scatter plot below.
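As a rough illustration, the sketch below shows how these plots could be produced in Python with pandas and matplotlib. The file name data_banknote_authentication.csv and the column labels are assumptions made for the example, not part of the original report.

# Minimal sketch: load the UCI banknote authentication data and reproduce
# the attribute histograms and pairwise scatter plots described above.
# File name and column names are illustrative assumptions.
import pandas as pd
from pandas.plotting import scatter_matrix
import matplotlib.pyplot as plt

columns = ["variance", "skewness", "curtosis", "entropy", "class"]
df = pd.read_csv("data_banknote_authentication.csv", header=None, names=columns)

df.hist(bins=30, figsize=(10, 8))                 # one histogram per attribute
plt.tight_layout()
plt.show()

scatter_matrix(df, figsize=(10, 10), alpha=0.4)   # pairwise relationships between attributes
plt.show()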
Data Preprocessing
Data cleansing is the process of removing redundant details and correcting errors in the data, and
it must be applied at several touchpoints while preparing the data warehouse. Data integration
refers to combining separate pieces of information into one coherent whole, and data transformation
is the conversion that must be carried out before information can be moved from one system to
another. Because this dataset has no missing values and has already been normalized, it is in
excellent condition for analysis, and no further preprocessing steps are required.
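The checks described above can be expressed in a few lines. The snippet below is a minimal sketch that reuses the DataFrame df from the previous example; the min-max scaling step is included only to show what normalization would look like if it were needed.

# Sketch of the preprocessing checks: confirm there are no missing values
# and illustrate how the four feature columns could be normalized.
# Reuses the DataFrame `df` loaded in the earlier snippet.
from sklearn.preprocessing import MinMaxScaler

print(df.isna().sum())        # expect zero missing values in every column

features = ["variance", "skewness", "curtosis", "entropy"]
scaler = MinMaxScaler()
df[features] = scaler.fit_transform(df[features])  # scale features to [0, 1]
print(df[features].describe())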
Data mining
Here, we have applied the Linear Regression model, the Random Forest algorithm (an ensemble of
decision trees), and the Simple K-Means clustering algorithm to mine this data.
Linear regression is a common technique for predictive analysis because it is so straightforward.
Regression is used mainly to answer two questions: 1) whether a particular set of independent
(predictor) variables can accurately predict the dependent (outcome) variable, and 2) which of the
predictor variables are the best indicators of the outcome variable and how they influence its
value, as shown by the magnitude and sign of the beta estimates. The regression estimates therefore
describe the relationship between the dependent variable and the independent variables. When the
analysis involves a single independent variable and a single dependent variable, the regression
equation takes the form

y = bx + c

In this equation, 'y' represents the predicted value of the dependent variable, 'x' the independent
variable score, 'b' the regression coefficient, and 'c' a constant (the intercept).
For the linear regression model, the mean absolute error is 12.95 percent, the root mean squared
error is 17.44 percent, the relative absolute error is 26.0624 percent, and the root relative
squared error is 34.8272 percent. All these figures originate from the same collection of data.
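As a hedged illustration of how such a model and its error measures could be computed, the sketch below fits a linear regression with scikit-learn on the same DataFrame df; the report's own figures came from a different tool, so the numbers produced here will not match exactly, and the 70/30 split is an assumption.

# Sketch: fit a linear regression on the four banknote features and report
# the same style of error measures quoted above. Figures will differ from
# the report, which used another tool chain. Reuses `df` from earlier.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error

X = df[["variance", "skewness", "curtosis", "entropy"]].to_numpy()
y = df["class"].to_numpy()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)

mae = mean_absolute_error(y_test, pred)
rmse = np.sqrt(mean_squared_error(y_test, pred))
baseline = np.full_like(y_test, y_test.mean(), dtype=float)   # naive "predict the mean" baseline
rae = np.abs(y_test - pred).sum() / np.abs(y_test - baseline).sum()
rrse = np.sqrt(((y_test - pred) ** 2).sum() / ((y_test - baseline) ** 2).sum())
print(f"MAE={mae:.4f} RMSE={rmse:.4f} RAE={rae:.2%} RRSE={rrse:.2%}")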
Random Forest is a supervised machine learning technique that has found widespread use in
classification and regression. It builds many decision trees from different samples of the data and
then combines their outputs, using majority voting for classification and averaging for regression.
According to the findings, the coefficient of correlation is 98.58 percent, the mean absolute error
is 2.37 percent, the root mean squared error is 8.45 percent, the relative absolute error is 4.767
percent, and the root relative squared error is 16.880 percent.
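A corresponding random forest sketch is shown below, reusing the train/test split from the previous snippet. The classifier variant is used here for illustration; the regression-style errors quoted above would come from the regressor variant instead.

# Sketch: random forest on the same train/test split. The classifier uses
# majority voting across trees; a RandomForestRegressor would average the
# tree outputs, which is how regression-style errors like those above arise.
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)
rf_pred = rf.predict(X_test)
print(f"Random forest accuracy: {accuracy_score(y_test, rf_pred):.4f}")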
K-means
K-means is an unsupervised machine learning method for grouping data: it partitions an unlabeled
dataset into a predetermined number of clusters, where "K" denotes the number of clusters to be
found. The algorithm locates the cluster centres and repeats the assignment and update steps until
the best solution is found. Each data point is assigned to the cluster whose centre it is closest
to, i.e. the one for which its squared distance to the centre is minimal, and the less variation
there is within a cluster, the more alike its data points are.
At the cluster centroid, the five attributes of our dataset take the values: variance 0.541,
skewness 0.5878, curtosis 0.2, entropy 0.6703, and class 0.4366. The clustered instances are split
48 percent and 52 percent between the two clusters, corresponding to class 0 and class 1
respectively.
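The following sketch shows one way such a clustering could be reproduced with scikit-learn's KMeans over the four feature columns of df from the earlier snippets; the exact centroid values and cluster shares depend on the tool, the attributes included, and the random seed, so they will not match the figures above exactly.

# Sketch: SimpleKMeans-style clustering with k = 2 over the four features.
# Prints the cluster centres and the share of instances assigned to each
# cluster, analogous to the figures quoted above. Reuses `df` from earlier.
import numpy as np
from sklearn.cluster import KMeans

X_feat = df[["variance", "skewness", "curtosis", "entropy"]].to_numpy()
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_feat)

print("Cluster centres:\n", kmeans.cluster_centers_)
counts = np.bincount(kmeans.labels_)
print("Instance share per cluster:", counts / counts.sum())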
Conclusion
This study investigated banknote authentication, the task of distinguishing genuine banknotes from
counterfeits, using two supervised learning techniques and one unsupervised learning technique.
After reviewing approaches that have previously been used to detect counterfeit banknotes, the
study examined additional approaches. Each model was evaluated on the banknote dataset to determine
which would be the most effective choice for classifying the banknotes, and the results were
compared in order to select the model that showed the most promise for further research.