The document discusses using a K-Means clustering algorithm and data science to detect fake banknotes by measuring the variance and skewness of wavelet transformed images and representing the correlation between these metrics in a graph. The algorithm identifies where data points cluster to predict whether a banknote is legitimate or fake based on its proximity to the identified cluster points.
Data Science Project
Report
The purpose of this project is to detect fake banknotes with a data science algorithm: given a banknote, we want to predict whether it is genuine or not. For this we use an algorithm known as “K-Means”, which gives a result based on the proximity of a data point to K cluster points. How this tells us whether a banknote is genuine or fake is explained later in this report.
We start with the data. There are two variables that we can measure for our purpose, “V1” and “V2”. “V1” is the variance of the wavelet-transformed image; the wavelet transform is a mathematical technique for analyzing images, and the variance measures how spread out the transformed values are. “V2” is the skewness of the wavelet-transformed image, which measures the asymmetry of the transformed values of each image (in this case, images of banknotes).
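To make the two measurements concrete, here is a minimal sketch of how variance and skewness can be computed from a list of numbers. The report does not say how the wavelet transform itself was computed, so the `coefficients` list below is made-up illustration data, and the function names are hypothetical.

```python
from math import sqrt


def variance(values):
    """Population variance: the mean squared deviation from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def skewness(values):
    """Population skewness: third central moment divided by std cubed."""
    mean = sum(values) / len(values)
    std = sqrt(variance(values))
    if std == 0:
        return 0.0  # all values identical, so there is no asymmetry
    third_moment = sum((v - mean) ** 3 for v in values) / len(values)
    return third_moment / std ** 3


# Made-up values standing in for the wavelet coefficients of one image.
coefficients = [1.0, 2.0, 2.0, 3.0, 7.0]
v1 = variance(coefficients)  # spread of the values
v2 = skewness(coefficients)  # positive here: long tail to the right
```

A symmetric list such as `[1.0, 2.0, 3.0]` has a skewness of zero, while a list with a long tail on one side has a skewness far from zero.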
Depending on V1 and V2, we can classify banknotes into two different classes: a banknote in Class 1 is considered genuine, and a banknote in Class 2 is considered fake.
We measure V1 and V2 in a normalized (standardized) form because the two variables have different scales. K-Means works with distances between points, so the variable with the larger scale would dominate the result and the data could not be analyzed properly. So, first of all, we normalize the two variables: we find the maximum and the minimum of V1, do the same for V2, and then apply a mathematical formula to standardize both. Once the data is properly scaled, we apply the K-Means algorithm. As we said earlier, K-Means is a technique that finds where the data is clustering. Each area where the data clusters is marked by a point on the graph.
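The report does not state which formula was applied. Since it uses the maximum and minimum of each variable, one plausible reading is min-max scaling, which maps every value into the range 0 to 1. The sketch below works under that assumption; the function name and the sample values are hypothetical.

```python
def min_max_normalize(values):
    """Scale values to [0, 1] using (v - min) / (max - min).

    Assumption: the report's unnamed formula is min-max scaling.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # avoid dividing by zero
    return [(v - lo) / (hi - lo) for v in values]


# Made-up raw measurements on two very different scales.
v1_raw = [-4.0, 0.0, 2.0, 6.0]
v2_raw = [-10.0, -5.0, 5.0, 10.0]

v1_scaled = min_max_normalize(v1_raw)
v2_scaled = min_max_normalize(v2_raw)
```

After scaling, both variables cover the same range, so neither one dominates the distance calculations that K-Means relies on.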
Explaining the graph:
This graph, even if it looks terrifying at first sight, is very simple.
On the first axis, X, we have the values of “V1”, and on the second axis, Y, we have the values of “V2”. Each blue point in the graph is the pair (V1, V2) measured for one banknote; we can call that a vector. As we can see, the blue points form a figure, and our job now is to find the K-Means clusters. If you pay attention, you can probably guess that the red points are the K-Means cluster points, and you are right. The red points are where our data science algorithm finds clusters among all these vectors, or (V1, V2) points, something that would be very difficult for us to find by hand.
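As an illustration of how the cluster points can be found, here is a minimal sketch of the standard K-Means procedure (often called Lloyd's algorithm) with K = 2. It is not the code used for the report: the points are made up, the starting centroids are picked by hand, and the function names are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two (V1, V2) points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def k_means(points, centroids, iterations=100):
    """Alternate between assigning points and moving centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: squared_distance(p, centroids[i]),
            )
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = []
        for old, cluster in zip(centroids, clusters):
            if cluster:
                mean_x = sum(p[0] for p in cluster) / len(cluster)
                mean_y = sum(p[1] for p in cluster) / len(cluster)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(old)  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # nothing moved, so the algorithm has converged
        centroids = new_centroids
    return centroids


# Made-up normalized (V1, V2) points forming two visible groups.
points = [
    (0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
    (0.8, 0.9), (0.9, 0.8), (0.85, 0.85),
]
centroids = k_means(points, centroids=[points[0], points[3]])
```

The two returned centroids play the role of the red points in the graph: one sits in the middle of each group of blue points.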
Now, in summary, we can say:
- Our purpose is to identify fake banknotes. For this we use data science algorithms, specifically K-Means clustering.
- We measure the variance of the wavelet-transformed image, called V1, and the skewness of the wavelet-transformed image, called V2. Both are mathematical measures used here to analyze images of banknotes.
- We represent the relationship between V1 and V2 in a graph and apply the K-Means algorithm, which helps us find where the points are clustering.
Now, maybe you are thinking: OK, that’s great, but can we now predict whether a banknote is legit or not? Well, the answer is yes. Knowing the K-Means clusters, if a vector is nearer to the bottom-left point, we can say the banknote is probably genuine, and if it is nearer to the top-right point, it is probably a fake banknote.
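The prediction rule described here can be sketched as a nearest-point check. The two centroid coordinates below are hypothetical placeholders for the bottom-left and top-right cluster points, not values taken from the report's graph, and the function name is an assumption.

```python
# Hypothetical cluster points on the normalized (V1, V2) plane.
GENUINE_CENTROID = (0.15, 0.15)  # assumed bottom-left point
FAKE_CENTROID = (0.85, 0.85)     # assumed top-right point


def squared_distance(p, q):
    """Squared Euclidean distance between two (V1, V2) points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def classify(point):
    """Label a banknote by whichever cluster point is nearer."""
    d_genuine = squared_distance(point, GENUINE_CENTROID)
    d_fake = squared_distance(point, FAKE_CENTROID)
    return "Genuine" if d_genuine <= d_fake else "Fake"


label_a = classify((0.2, 0.3))  # close to the bottom-left point
label_b = classify((0.7, 0.9))  # close to the top-right point
```

A new banknote has to be normalized with the same minimum and maximum as the original data before it is compared with the cluster points.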
Recommendations:
With this technique, you can make a computer analyze the banknotes and check which point each vector is nearer to; the result gives you an answer, “Genuine” or “Fake” banknote. It is important to consider that this algorithm is not 100% effective, although it is highly reliable. It is also important to keep developing other techniques and data science algorithms to detect and predict fake banknotes. We know that bad guys are improving their technologies to make more “genuine-looking” banknotes, so it is important to stay a step ahead of them.