ML-Unit3 Updated

Uploaded by

ssrindes


Machine Learning- Unit3

Aman Kumar
Unsupervised Learning
• In the case of unsupervised learning, not all
variables and data patterns are classified.
• Instead, the machine must uncover hidden
patterns and create labels through the use of
unsupervised learning algorithms.
• The k-means clustering algorithm is a popular
example of unsupervised learning. This simple
algorithm groups data points that are found to
possess similar features as shown in Figure ahead
Clustering
• If you group data points
based on the purchasing
behavior of SME (Small
and Medium-sized
Enterprises) and large
enterprise customers, for
example, you are likely to
see two clusters emerge.

• This is because SMEs and large enterprises tend to
have disparate buying habits.
Clustering
• When it comes to purchasing cloud infrastructure, for
instance, basic cloud hosting resources and a Content
Delivery Network (CDN) may prove sufficient for most
SME customers.
• Large enterprise customers, though, are more likely to
purchase a wider array of cloud products and entire
solutions that include advanced security and
networking products like WAF (Web Application
Firewall), a dedicated private connection, and VPC
(Virtual Private Cloud).
Clustering
• By analyzing customer purchasing habits,
unsupervised learning is capable of identifying
these two groups of customers without specific
labels that classify the company as small,
medium or large.
Clustering
• The advantage of unsupervised learning is it
enables you to discover patterns in the data that
you were unaware existed—such as the
presence of two major customer types.
• Clustering techniques such as k-means
clustering can also provide the springboard for
conducting further analysis after discrete groups
have been discovered.
Clustering
In industry, unsupervised learning is particularly
powerful in fraud detection
—where the most dangerous attacks are often
those yet to be classified. One real-world example
is DataVisor, who essentially built their business
model based on unsupervised learning.
Clustering
• Founded in 2013 in California, DataVisor
protects customers from fraudulent online
activities, including spam, fake reviews, fake app
installs, and fraudulent transactions.
• Whereas traditional fraud protection services
draw on supervised learning models and rule
engines, DataVisor uses unsupervised learning
which enables them to detect unclassified
categories of attacks in their early stages.
Clustering
• On their website, DataVisor explains that "to detect attacks,
existing solutions rely on human experience to create rules
or labeled training data to tune models. This means they
are unable to detect new attacks that haven’t already been
identified by humans or labeled in training data."
• This means that traditional solutions analyze the chain of
activity for a particular attack and then create rules to
predict a repeat attack. Under this scenario, the dependent
variable (y) is the event of an attack and the independent
variables (X) are the common predictor variables of an
attack
Clustering
Examples of independent variables could be:
• A sudden large order from an unknown user, e.g. established
customers generally spend less than $100 per order, but a new user
spends $8,000 in one order immediately upon registering their
account.
• A sudden surge of user ratings, e.g. as a typical author and bookseller
on Amazon.com, it’s uncommon for my first published work to receive
more than one book review within the space of one to two days. In
general, approximately 1 in 200 Amazon readers leave a book review
and most books go weeks or months without a review. However, I
commonly see competitors in this category (data science) attracting
20-50 reviews in one day! (Unsurprisingly, I also see Amazon removing
these suspicious reviews weeks or months later.)
Clustering
• Identical or similar user reviews from different
users. Following the same Amazon analogy, I
often see user reviews of my book appear on
other books several months later (sometimes
with a reference to my name as the author still
included in the review!). Again, Amazon
eventually removes these fake reviews and
suspends these accounts for breaking their
terms of service.
Clustering
• Suspicious shipping address, e.g. for small
businesses that routinely ship products to local
customers, an order from a distant location
(where they don't advertise their products) can
in rare cases be an indicator of fraudulent or
malicious activity.
Clustering
• DataVisor and other anti-fraud solution providers therefore leverage
unsupervised learning to address the limitations of supervised
learning by analyzing patterns across hundreds of millions of accounts
and identifying suspicious connections between users—without
knowing the actual category of future attacks.
• By grouping malicious actors and analyzing their connections to other
accounts, they are able to prevent new types of attacks whose
independent variables are still unlabeled and unclassified.
• Sleeper cells in their incubation stage (mimicking legitimate users) are
also identified through their association to malicious accounts.
Clustering algorithms such as k-means clustering can generate these
groupings without a full training dataset in the form of independent
variables that clearly label indications of an attack, such as the four
examples listed earlier
Clustering
• Knowledge of the dependent variable (known
attackers) is generally the key to identifying other
attackers before the next attack occurs. The other plus
side of unsupervised learning is companies like
DataVisor can uncover entire criminal rings by
identifying subtle correlations across users.
• We will cover unsupervised learning specific to
clustering analysis. Other examples of unsupervised
learning include association analysis, social network
analysis, and dimensionality reduction algorithms.
k-means Clustering (Centroid Based)
• As a popular unsupervised learning algorithm, k-means
clustering attempts to divide data into k discrete
groups and is effective at uncovering basic data
patterns.
• Examples of potential groupings include animal
species, customers with similar features, and housing
market segmentation.
• The k-means clustering algorithm works by first
splitting data into k number of clusters with k
representing the number of clusters you wish to create.
If you choose to split your dataset into three clusters
then k, for example, is set to 3
k-means Clustering
k-means Clustering
• We can see that the original (unclustered) data
has been transformed into three clusters (k is 3).
If we were to set k to 4, an additional cluster
would be derived from the dataset to produce
four clusters.
k-means Clustering
• How does k-means clustering separate the data points?
The first step is to examine the unclustered data on the
scatterplot and manually select a centroid for each k
cluster
• That centroid then forms the epicenter of an individual
cluster. Centroids can be chosen at random, which
means you can nominate any data point on the
scatterplot to act as a centroid. However, you can save
time by choosing centroids dispersed across the
scatterplot and not directly adjacent to each other.
k-means Clustering
• In other words, start by guessing where you
think the centroids for each cluster might be
located. The remaining data points on the
scatterplot are then assigned to the closest
centroid by measuring the Euclidean distance.
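The assignment step described above can be sketched in a few lines of NumPy. The data points and the two guessed centroids below are hypothetical values chosen only for illustration.

```python
import numpy as np

# Hypothetical 2-D data points and two guessed centroids (illustrative values).
points = np.array([[1.0, 1.0], [1.5, 2.0], [3.0, 3.5], [5.0, 7.0], [3.5, 5.0]])
centroids = np.array([[1.0, 1.0], [5.0, 7.0]])

# Pairwise Euclidean distances, shape (n_points, n_centroids).
distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)

# Each point is assigned to the index of its closest centroid.
assignments = distances.argmin(axis=1)
print(assignments)
```

Each entry of `assignments` is the index of the centroid that the corresponding point joins.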
k-means Clustering
• Each data point can be assigned to only one cluster and each
cluster is discrete. This means that there is no overlap between
clusters and no case of nesting a cluster inside another cluster.
• Also, all data points, including anomalies, are assigned to a
centroid irrespective of how they impact the final shape of the
cluster
• However, due to the statistical force that pulls all nearby data
points to a central point, your clusters will generally form an
elliptical or spherical shape.
k-means Clustering
• After all data points have been allocated to a centroid, the next
step is to aggregate the mean value of all data points for each
cluster, which can be found by calculating the average x and y
values of all data points in that cluster.
• Next, take the mean value of the data points in each cluster and
plug in those x and y values to update your centroid coordinates.
This will most likely result in a change to your centroids’
location. Your total number of clusters, however, will remain the
same. You are not creating new clusters, rather updating their
position on the scatterplot. Like musical chairs, the remaining
data points will then rush to the closest centroid to form k
number of clusters
k-means Clustering
• Should any data point on the scatterplot switch clusters with
the changing of centroids, the previous step is repeated. This
means, again, calculating the mean value of the
cluster and updating the x and y values of each centroid to
reflect the average coordinates of the data points in that
cluster.
• Once you reach a stage where the data points no longer
switch clusters after an update in centroid coordinates, the
algorithm is complete, and you have your final set of
clusters. The diagrams break down the full algorithmic
process.
k-means Clustering

Sample data points are plotted on a scatterplot

k-means Clustering

Two data points are nominated as centroids

k-means Clustering

Two clusters are formed after calculating the Euclidean distance of the remaining data points to the centroids.
k-means Clustering

The centroid coordinates for each cluster are updated to reflect the cluster’s
mean value. As one data point has switched from the right cluster to the left
cluster, the centroids of both clusters are recalculated.
k-means Clustering

Two final clusters are produced based on the updated

centroids for each cluster
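The full loop walked through in the diagrams (nominate centroids, assign points, update centroids, repeat until no point switches cluster) might look roughly like the following NumPy sketch. The function name `k_means`, the random initialisation, and the synthetic two-blob data are assumptions made for illustration, not part of the original slides.

```python
import numpy as np

def k_means(points, k, seed=0, max_iter=100):
    """Minimal k-means sketch: returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Nominate k distinct data points as the initial centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its closest centroid.
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = distances.argmin(axis=1)
        # Stop once no data point switches cluster.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its points
        # (an empty cluster keeps its previous centroid).
        centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
    return labels, centroids

# Synthetic data: two well-separated groups of 30 points each.
rng = np.random.default_rng(42)
data = np.vstack([rng.normal(0.0, 0.5, size=(30, 2)),
                  rng.normal(5.0, 0.5, size=(30, 2))])
labels, centroids = k_means(data, k=2)
print(centroids)
```

The number of clusters never changes inside the loop; only the centroid positions and the point assignments do.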
DBSCAN Clustering (Density-based spatial
clustering of applications with noise)

DBSCAN is a density-based clustering algorithm that
segregates data points into high-density regions
separated by regions of low density. Unlike k-means,
which requires the number of clusters to be specified
beforehand, DBSCAN automatically determines
clusters based on the density of data points.
Why DBSCAN Clustering
• Partitioning methods like K-means clustering work for finding
spherical-shaped clusters or convex clusters.
• In other words, they are suitable only for compact and well-
separated clusters. Moreover, they are also severely affected
by the presence of noise and outliers in the data.
• Real-life data may contain irregularities, like:
– Clusters can be of arbitrary shape such as shown in the figure
– Data may contain noise.
DBSCAN Clustering
Parameters Required For DBSCAN Algorithm
• eps: It defines the neighborhood around a data point, i.e. if the distance
between two points is less than or equal to ‘eps’ then they are considered
neighbors. If the eps value is chosen too small, a large part of the
data will be considered outliers. If it is chosen very large, the
clusters will merge and the majority of the data points will be in the
same cluster. One way to find the eps value is based on the k-distance
graph.
• MinPts: Minimum number of neighbors (data points) within the eps radius.
The larger the dataset, the larger the value of MinPts that should be chosen. As a
general rule, the minimum MinPts can be derived from the number of
dimensions D in the dataset as MinPts >= D+1. MinPts should be
at least 3.
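As a rough illustration of the k-distance graph mentioned for choosing eps, the sketch below computes each point's distance to its k-th nearest neighbor and sorts the results; plotting the sorted values and looking for the "elbow" is one common way to pick eps. The helper name `k_distances`, the random data, and the choice k = MinPts are assumptions.

```python
import numpy as np

def k_distances(points, k):
    """Sorted distance from every point to its k-th nearest neighbor."""
    # Full pairwise distance matrix; fine for small datasets.
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    # After sorting each row, column 0 is the point itself (distance 0),
    # so column k holds the distance to the k-th nearest neighbor.
    kth = np.sort(dists, axis=1)[:, k]
    return np.sort(kth)

rng = np.random.default_rng(0)
data = rng.random((50, 2))          # hypothetical 2-D dataset
min_pts = data.shape[1] + 1         # MinPts >= D + 1, here 3
curve = k_distances(data, k=min_pts)
print(curve[:5], curve[-5:])
```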
DBSCAN Clustering
The fundamental concepts driving DBSCAN are core points,
border points, and noise points:
• Core Points: A point is considered a core point if there are at
least MinPts number of points (including itself) within a
radius ε of it.
• Border Points: A point is a border point if
– Its neighborhood contains fewer than MinPts data points,
and
– It is reachable from some core point, i.e. it is within ε
distance of a core point.
• Noise Points: Points that are neither core nor border points.
These are often outliers or data points in low-density
regions.
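The three point types can be told apart with a short NumPy sketch. The function name `classify_points`, the six sample points, and the parameter values are hypothetical and chosen only so that every type appears.

```python
import numpy as np

def classify_points(points, eps, min_pts):
    """Label every point as 'core', 'border' or 'noise'."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    neighbors = dists <= eps                      # includes the point itself
    is_core = neighbors.sum(axis=1) >= min_pts
    # Border: not a core point, but within eps of at least one core point.
    near_core = (neighbors & is_core[None, :]).any(axis=1)
    is_border = ~is_core & near_core
    return np.where(is_core, "core", np.where(is_border, "border", "noise"))

points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [1.0, 1.0], [2.0, 1.0], [8.0, 8.0]])
kinds = classify_points(points, eps=1.1, min_pts=4)
print(kinds.tolist())
```

With these values only (1, 1) has four points within its radius, so it is the single core point; its direct neighbors are border points and the rest are noise.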
DBSCAN Clustering
DBSCAN Clustering
Steps Used In DBSCAN Algorithm
• For each point, find all the neighbor points within eps, and identify as
core points those with at least MinPts neighbors.
• For each core point if it is not already assigned to a cluster, create a new
cluster.
• Find recursively all its density-connected points and assign them to the
same cluster as the core point.
Two points a and b are said to be density-connected if there exists a
core point c (a point with a sufficient number of points in its neighborhood)
from which both a and b can be reached through a chain of core points, each
within eps distance of the next. This is a chaining process: if b is a neighbor
of c, c is a neighbor of d, and d is a neighbor of e, which in turn is a
neighbor of a, then b is density-connected to a.
• Iterate through the remaining unvisited points in the dataset. Those
points that do not belong to any cluster are noise.
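The steps above can be sketched as a small from-scratch implementation. This is a simplified illustration, not a reference implementation: the function name `dbscan`, the use of a full distance matrix, and the synthetic data are assumptions, and noise is marked with the label -1.

```python
from collections import deque
import numpy as np

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN sketch: returns one label per point, -1 for noise."""
    n = len(points)
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    neighbors = [np.flatnonzero(dists[i] <= eps) for i in range(n)]
    is_core = np.array([len(nb) >= min_pts for nb in neighbors])
    labels = np.full(n, -1)            # -1 = noise or not yet assigned
    cluster = 0
    for i in range(n):
        if not is_core[i] or labels[i] != -1:
            continue
        # Start a new cluster from an unassigned core point.
        labels[i] = cluster
        queue = deque([i])
        while queue:
            p = queue.popleft()
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    # Only core points keep expanding the cluster.
                    if is_core[q]:
                        queue.append(q)
        cluster += 1
    return labels

rng = np.random.default_rng(0)
blob_a = rng.normal(loc=(0.0, 0.0), scale=0.2, size=(40, 2))
blob_b = rng.normal(loc=(4.0, 4.0), scale=0.2, size=(40, 2))
outlier = np.array([[10.0, -10.0]])
data = np.vstack([blob_a, blob_b, outlier])

labels = dbscan(data, eps=0.5, min_pts=4)
n_clusters = len(set(labels.tolist()) - {-1})
print(n_clusters, labels[-1])
```

Points still labelled -1 after the loop are the noise points; the isolated outlier in the example ends up in that group.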
DBSCAN Clustering
Demonstration: Use reference pics…
DBSCAN Clustering
K-Means Vs DBScan
S.No. | K-means Clustering | DBSCAN Clustering
1. | Clusters formed are more or less spherical or convex in shape and must have same feature size. | Clusters formed are arbitrary in shape and may not have same feature size.
2. | K-means clustering is sensitive to the number of clusters specified. | Number of clusters need not be specified.
3. | K-means clustering is more efficient for large datasets. | DBSCAN clustering cannot efficiently handle high-dimensional datasets.
4. | K-means clustering does not work well with outliers and noisy datasets. | DBSCAN clustering efficiently handles outliers and noisy datasets.
5. | In the domain of anomaly detection, this algorithm causes problems as anomalous points will be assigned to the same cluster as “normal” data points. | The DBSCAN algorithm, on the other hand, locates regions of high density that are separated from one another by regions of low density.
6. | It requires one parameter: number of clusters (K). | It requires two parameters: radius (R) and minimum points (M). R determines a chosen radius such that if it includes enough points within it, it is a dense area. M determines the minimum number of data points required in a neighborhood to be defined as a cluster.
7. | Varying densities of the data points do not affect the K-means clustering algorithm. | DBSCAN clustering does not work very well for sparse datasets or for data points with varying density.
Distribution-Based Clustering
So far we learned about clustering based on similarity/distance
or density. This family of clustering algorithms takes a totally
different metric into consideration: probability.
Distribution-Based Clustering is a clustering model in which we
will fit the data on the probability of it belonging to the same
distribution.
This clustering approach assumes the data is composed of
probability distributions, such as Gaussian (normal) or binomial
distributions. Gaussian mixtures are prominent: a fixed number of
Gaussian distributions is chosen and fitted to the data so that the
likelihood of the data is maximized.
Distribution-Based Clustering
• As seen below, data is modeled into 3 Gaussian distributions and as the
distance from the distribution’s center increases, the probability that a
point belongs to the distribution decreases.
• The bands show a decrease in probability. Distribution models of
clustering are closely related to statistics, and in particular to the
way datasets are generated and arranged using random sampling
principles, i.e. by drawing data points from one form of distribution.
• Clusters can then easily be defined as objects that are most likely to belong
to the same distribution.
• The expectation-maximization algorithm is one of the popular examples of
distribution-based clustering.
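As a hedged illustration of distribution-based clustering, the sketch below fits a three-component Gaussian mixture with scikit-learn's `GaussianMixture`, which is trained with the expectation-maximization algorithm. The synthetic data and the choice of three components are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic data drawn from three Gaussian blobs (illustrative values).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.5, size=(50, 2)),
    rng.normal(loc=(4.0, 0.0), scale=0.5, size=(50, 2)),
    rng.normal(loc=(2.0, 4.0), scale=0.5, size=(50, 2)),
])

gmm = GaussianMixture(n_components=3, random_state=0).fit(X)

# Probability that each point belongs to each of the three distributions.
probs = gmm.predict_proba(X)
# A point's cluster is the distribution it most likely belongs to.
labels = probs.argmax(axis=1)
print(probs[:3].round(3))
```

Unlike k-means, every point receives a probability for each cluster rather than a single hard assignment.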
Hierarchical Clustering
It creates a tree of clusters. Hierarchical clustering, not
surprisingly, is well suited to hierarchical data, such as
taxonomies
In addition, another advantage is that any number of
clusters can be chosen by cutting the tree at the right
level.
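Cutting the tree at different levels can be sketched with SciPy's `linkage` and `fcluster`. The synthetic data and the Ward linkage method are assumptions chosen for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Three synthetic groups of 10 points centred at (0,0), (3,3) and (6,6).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.3, size=(10, 2)) for c in (0.0, 3.0, 6.0)])

# Build the tree of clusters once.
tree = linkage(X, method="ward")

# Cut the same tree at two different levels.
labels_2 = fcluster(tree, t=2, criterion="maxclust")
labels_3 = fcluster(tree, t=3, criterion="maxclust")
print(labels_2)
print(labels_3)
```

The tree is built a single time; choosing a different number of clusters only changes where it is cut.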

Refer Link:
file:///C:/Users/dell/OneDrive/CU/AI/ML-IVSem/HierarchicalClustering.html
Performance Metrics for Clustering
Silhouette Coefficient: The silhouette coefficient, or
silhouette score, is a metric used to measure the
goodness of a clustering technique. Its value ranges
from -1 to 1.
1: Means clusters are well apart from each other and
clearly distinguished.
0: Means clusters are indifferent, i.e. the distance
between clusters is not significant.
-1: Means data points have been assigned to the wrong clusters.
Performance Metrics for Clustering
Silhouette Coefficient:

Silhouette Score = (b-a)/max(a,b)

a = average intra-cluster distance, i.e. the average distance between a point and the other points in its own cluster.
b = average nearest-cluster distance, i.e. the average distance between a point and the points of the nearest cluster it does not belong to.
The score is computed for each point and then averaged over all points.
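To make the formula concrete, here is a small hand-rolled version, using the standard definition in which b is measured against the nearest other cluster. The function name `silhouette_score_manual`, the four sample points, and the convention of scoring singleton clusters as 0 are assumptions.

```python
import numpy as np

def silhouette_score_manual(points, labels):
    """Mean of (b - a) / max(a, b) over all points."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    cluster_ids = set(labels.tolist())
    scores = []
    for i, own in enumerate(labels.tolist()):
        same = labels == own
        same[i] = False                      # exclude the point itself
        if not same.any():
            scores.append(0.0)               # singleton cluster convention
            continue
        a = dists[i, same].mean()            # mean distance within own cluster
        b = min(dists[i, labels == other].mean()   # nearest other cluster
                for other in cluster_ids if other != own)
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

points = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
labels = np.array([0, 0, 1, 1])
score = silhouette_score_manual(points, labels)
print(round(score, 3))
```

Because the two pairs of points are far apart relative to their internal spacing, the score comes out close to 1.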
Performance Metrics for Clustering
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
# %matplotlib inline  # uncomment when running in a Jupyter notebook

# Two well-separated groups of 50 random points each
X = np.random.rand(50, 2)
Y = 2 + np.random.rand(50, 2)
Z = np.concatenate((X, Y))
Z = pd.DataFrame(Z)  # converting into a data frame for ease

# Plot the raw (unclustered) data
sns.scatterplot(x=Z[0], y=Z[1])

KMean = KMeans(n_clusters=2, n_init=10)
KMean.fit(Z)
label = KMean.predict(Z)

print(f'Silhouette Score(n=2): {silhouette_score(Z, label)}')

# Plot the data again, coloured by cluster label
sns.scatterplot(x=Z[0], y=Z[1], hue=label)
Performance Metrics for Clustering
Let’s try with 3 clusters:

KMean = KMeans(n_clusters=3, n_init=10)
KMean.fit(Z)
label = KMean.predict(Z)
print(f'Silhouette Score(n=3): {silhouette_score(Z, label)}')
sns.scatterplot(x=Z[0], y=Z[1], hue=label, palette='inferno_r')

As you can see in the figure, the clusters are not well separated. The inter-cluster distance
between cluster 1 and cluster 2 is almost negligible. That is why the silhouette score for
n=3 (0.596) is lower than that for n=2 (0.806) in this run.
Market Basket Analysis
• A technique that is used to uncover purchase
patterns in any retail setting is known as Market
Basket Analysis
• This is a technique that gives the careful study of
purchases done by a customer in a supermarket.
This concept identifies the pattern of frequent
purchase items by customers. This analysis can
help to promote deals, offers, sale by the
companies, and data mining techniques helps to
achieve this analysis task
Market Basket Analysis

• Market basket analysis is a process that looks for
relationships of objects that “go together” within
the business context. In reality, market basket
analysis goes beyond the supermarket scenario
from which its name is derived.
• Market basket analysis is the analysis of any
collection of items to identify affinities that can
be exploited in some manner. Some examples of
the use of market basket analysis include:
Market Basket Analysis

• Market basket analysis mainly works with the
ASSOCIATION RULE {IF} -> {THEN}.
– IF means Antecedent: An antecedent is an item
found within the data
– THEN means Consequent: A consequent is an
item found in combination with the antecedent.
Market Basket Analysis- Apriori
Algorithm
There are three components in APRIORI ALGORITHM:
– SUPPORT
• This measure gives an idea of how frequent an itemset is in all the
transactions.
• consider itemset1 = {bread, butter} and itemset2 = {bread, shampoo}.
Many transactions will have both bread and butter on the cart but
bread and shampoo? Not so much. So in this case, itemset1 will
generally have a higher support than itemset2

• Value of support helps us identify the rules worth considering for

further analysis
Market Basket Analysis- Apriori
Algorithm
There are three components in APRIORI ALGORITHM:
– CONFIDENCE
• This measure defines the likeliness of occurrence of consequent on the
cart given that the cart already has the antecedents.
• That is to answer the question — of all the transactions containing say,
{Bread}, how many also had {Butter} on them? We can say by common
knowledge that {Bread} → {Butter} should be a high confidence rule.
Technically, confidence is the conditional probability of occurrence of
consequent given the antecedent.

• {Butter} → {Bread}? That is, what fraction of transactions having butter

also had bread? Very high i.e. a value close to 1? That’s right. What about
{Yogurt} → {Milk}? High again. {Toothbrush} → {Milk}? Not so sure?
Confidence for this rule will also be high since {Milk} is such a frequent
itemset and would be present in every other transaction
Market Basket Analysis- Apriori
Algorithm
There are three components in APRIORI ALGORITHM:
– LIFT
• Lift controls for the support (frequency) of consequent while
calculating the conditional probability of occurrence of {Y} given
{X}. Lift is a very literal term given to this measure.
• Think of it as the *lift* that {X} provides to our confidence for having
{Y} on the cart.
• To rephrase, lift is the rise in probability of having {Y} on the cart with
the knowledge of {X} being present over the probability of having {Y}
on the cart without any knowledge about presence of {X}.
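• Mathematically, Lift({X} → {Y}) = Confidence({X} → {Y}) / Support({Y}) = Support({X, Y}) / (Support({X}) × Support({Y})).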
In Summary

•Support - Proportional frequency of an item in the database.

•Confidence - how confident we are of an event, given another event.

•Lift - a measure that tells us whether the probability of an event B increases or decreases
given event A.
Market Basket Analysis

• An example of Association Rules

– Assume there are 100 customers
– 10 of them bought milk, 8 bought butter and 6
bought both of them.
– Rule: bought milk => bought butter
– support = P(Milk & Butter) = 6/100 = 0.06
– confidence = support/P(Milk) = 0.06/0.10 = 0.60
– lift = confidence/P(Butter) = 0.60/0.08 = 7.5
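The arithmetic of this example can be checked with a short sketch that works directly from the counts (100 customers, 10 milk, 8 butter, 6 both). The helper name `rule_metrics` is hypothetical. Both rule directions are computed, since confidence depends on which item is the antecedent while support and lift do not.

```python
def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Support, confidence and lift of the rule antecedent => consequent."""
    support = n_both / n_total
    confidence = n_both / n_antecedent
    # Lift = confidence / P(consequent), written with counts to avoid rounding.
    lift = (n_both * n_total) / (n_antecedent * n_consequent)
    return support, confidence, lift

# Counts from the example: 100 customers, 10 bought milk, 8 butter, 6 both.
milk_to_butter = rule_metrics(100, n_antecedent=10, n_consequent=8, n_both=6)
butter_to_milk = rule_metrics(100, n_antecedent=8, n_consequent=10, n_both=6)

print("milk => butter:", milk_to_butter)
print("butter => milk:", butter_to_milk)
```

A lift well above 1 suggests the two items are bought together far more often than they would be if the purchases were independent.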
Market Basket Analysis- Example

• Market basket analysis example

– Let’s understand the market basket analysis with
an example of how the Apriori algorithm is
implemented
• STEP 1: Catch a glance at the Purchase table to
understand how the data is loaded into the
engine.
Market Basket Analysis- Example

STEP 2: Next, each transaction is aggregated across records into a single record as
an array that converts the data set to an R transaction. Check out the below image
that depicts the result of the aggregation process.
STEP 3: Lastly, the Apriori logic is implemented in the transactions. Check the
below-given image for the result set.
Market Basket Analysis- Apriori

Apriori Property –
All non-empty subsets of a frequent itemset must be frequent.

Conversely, if an itemset is infrequent, all its supersets will be infrequent.
DataSet

Step-1: K=1
(I) Create a table containing support count of each item present in
dataset – Called C1(candidate set)

(II) Compare each candidate item’s support count with the minimum support count (here
min_support = 2). Items whose support count is less than min_support are
removed. This gives us itemset L1.
Step-2: K=2
(I) Generate candidate set C2 using L1 (this is called the join step).
Check whether all subsets of each itemset are frequent and, if not, remove that
itemset. (For example, the subsets of {I1, I2} are {I1} and {I2}, which are both frequent. Check this for each itemset.)
Now find the support count of these itemsets by searching the dataset.

(II) Compare each candidate (C2) support count with the minimum
support count (here min_support = 2). Itemsets whose support count
is less than min_support are removed. This gives us itemset L2.
Step-3: K=3
(I) Generate candidate set C3 using L2 (join step).
The itemsets generated by joining L2 are {I1, I2, I3}, {I1, I2, I5}, {I1, I3, I5}, {I2, I3, I4}, {I2, I4,
I5} and {I2, I3, I5}.
Check whether all subsets of these itemsets are frequent and, if not, remove
that itemset. (Here the subsets of {I1, I2, I3} are {I1, I2}, {I2, I3} and {I1, I3}, which are frequent.
For {I2, I3, I4}, the subset {I3, I4} is not frequent, so remove it. Check every
itemset similarly.)
Find the support count of the remaining itemsets by searching the dataset.

(II) Compare each candidate (C3) support count with the minimum support count (here
min_support = 2). Itemsets whose support count is less than min_support are
removed. This gives us itemset L3.
Step-4: K=4
Generate candidate set C4 using L3 (join step).
Check whether all subsets of these itemsets are frequent. Here the only itemset formed by
joining L3 is {I1, I2, I3, I5}, and its subset {I1, I3, I5} is not frequent, so
C4 contains no itemset.
We stop here because no further frequent itemsets are found.
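The join, prune and count steps can be sketched in plain Python. The dataset table is not reproduced in the text, so the nine transactions below are an assumed illustrative dataset, picked so that the results agree with the itemsets and support counts quoted in the steps; the function names are hypothetical too.

```python
from itertools import combinations

# ASSUMED illustrative transactions (the slide's dataset table is not in the text).
transactions = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]

def support_count(itemset):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)

def apriori(min_support):
    items = sorted({i for t in transactions for i in t})
    # L1: frequent single items.
    level = [frozenset([i]) for i in items if support_count({i}) >= min_support]
    frequent = {}
    k = 1
    while level:
        frequent.update({s: support_count(s) for s in level})
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        prev = set(level)
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        # Keep candidates that meet the minimum support count.
        level = [c for c in candidates if support_count(c) >= min_support]
    return frequent

frequent = apriori(min_support=2)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With this assumed dataset the loop ends after the 3-itemsets, because the only 4-item candidate is pruned before its support is ever counted.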
Market Basket Analysis
Thus, we have discovered all the frequent item-sets. Now generation of strong association rule comes into picture. For that
we need to calculate confidence of each rule.

Confidence –

Confidence(A->B)=Support_count(A∪B)/Support_count(A)

So here, by taking an example of any frequent itemset, we will show the rule generation.

Itemset {I1, I2, I3} //from L3

SO rules can be
[I1∧I2]=>[I3] //confidence = sup(I1∧I2∧I3)/sup(I1∧I2) = 2/4*100=50%
[I1∧I3]=>[I2] //confidence = sup(I1∧I2∧I3)/sup(I1∧I3) = 2/4*100=50%
[I2∧I3]=>[I1] //confidence = sup(I1∧I2∧I3)/sup(I2∧I3) = 2/4*100=50%
[I1]=>[I2∧I3] //confidence = sup(I1∧I2∧I3)/sup(I1) = 2/6*100=33.3%
[I2]=>[I1∧I3] //confidence = sup(I1∧I2∧I3)/sup(I2) = 2/7*100=28.6%
[I3]=>[I1∧I2] //confidence = sup(I1∧I2∧I3)/sup(I3) = 2/6*100=33.3%

So if minimum confidence is 50%, then first 3 rules can be considered as strong association rules.

A confidence of 50% means that 50% of the customers, who purchased item1 and item2 also bought item3.
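Rule generation for one frequent itemset can be sketched as follows, using the support counts quoted in the rules above. The function name `strong_rules` is hypothetical.

```python
from itertools import combinations

# Support counts taken from the confidence calculations above.
support = {
    frozenset({"I1"}): 6, frozenset({"I2"}): 7, frozenset({"I3"}): 6,
    frozenset({"I1", "I2"}): 4, frozenset({"I1", "I3"}): 4,
    frozenset({"I2", "I3"}): 4, frozenset({"I1", "I2", "I3"}): 2,
}

def strong_rules(itemset, min_confidence):
    """All rules A => (itemset - A) whose confidence meets the threshold."""
    itemset = frozenset(itemset)
    rules = []
    for size in range(1, len(itemset)):
        for antecedent in combinations(sorted(itemset), size):
            a = frozenset(antecedent)
            # Confidence(A -> B) = Support_count(A ∪ B) / Support_count(A)
            confidence = support[itemset] / support[a]
            if confidence >= min_confidence:
                rules.append((sorted(a), sorted(itemset - a), confidence))
    return rules

strong = strong_rules({"I1", "I2", "I3"}, min_confidence=0.5)
for antecedent, consequent, confidence in strong:
    print(antecedent, "=>", consequent, f"{confidence:.0%}")
```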
Market Basket Analysis

The two primary drawbacks of the Apriori Algorithm are:

At each step, candidate sets have to be built.

To build the candidate sets, the algorithm has to repeatedly scan the database.

These two properties inevitably make the algorithm slower. To overcome these redundant
steps, a new association-rule mining algorithm was developed named Frequent Pattern
Growth Algorithm. It overcomes the disadvantages of the Apriori algorithm by storing all
the transactions in a Trie Data Structure
Market Basket Analysis- FP Growth
Algo
Transaction ID | Items
T1 | {E,K,M,N,O,Y}
T2 | {D,E,K,N,O,Y}
T3 | {A,E,K,M}
T4 | {C,K,M,U,Y}
T5 | {C,E,I,K,O,O}

This data is a hypothetical dataset of transactions with each letter representing an
item. The frequency of each individual item is computed as the number of transactions
that contain it (the repeated O in T5 is counted once):

Item | Frequency
A | 1
C | 2
D | 1
E | 4
I | 1
K | 5
M | 3
N | 2
O | 3
U | 1
Y | 3

Let the minimum support be 3. A Frequent Pattern set is built which will contain all the
elements whose frequency is greater than or equal to the minimum support. These
elements are stored in descending order of their respective frequencies. After insertion
of the relevant items, the set L looks like this:

L = {K : 5, E : 4, M : 3, O : 3, Y : 3}
Market Basket Analysis- FP Growth
Algo
Now, for each transaction, the respective Ordered-Item set is built. It is done by
iterating the Frequent Pattern set and checking if the current item is contained in the
transaction in question. If the current item is contained, the item is inserted in the
Ordered-Item set for the current transaction. The following table is built for all the
transactions:

Transaction ID | Items | Ordered-Item Set
T1 | {E,K,M,N,O,Y} | {K,E,M,O,Y}
T2 | {D,E,K,N,O,Y} | {K,E,O,Y}
T3 | {A,E,K,M} | {K,E,M}
T4 | {C,K,M,U,Y} | {K,M,Y}
T5 | {C,E,I,K,O,O} | {K,E,O}

Now, all the Ordered-Item sets are inserted into a trie Data Structure.
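Building the Frequent Pattern set and the Ordered-Item sets can be sketched as below. Each item is counted once per transaction, so the repeated O in T5 contributes a single count; breaking frequency ties alphabetically is an assumption that happens to reproduce the order K, E, M, O, Y.

```python
from collections import Counter

transactions = {
    "T1": ["E", "K", "M", "N", "O", "Y"],
    "T2": ["D", "E", "K", "N", "O", "Y"],
    "T3": ["A", "E", "K", "M"],
    "T4": ["C", "K", "M", "U", "Y"],
    "T5": ["C", "E", "I", "K", "O", "O"],
}
min_support = 3

# Count each item once per transaction (set() drops the duplicate O in T5).
counts = Counter(item for items in transactions.values() for item in set(items))

# Frequent Pattern set: items meeting the minimum support.
frequent = {item: c for item, c in counts.items() if c >= min_support}

# Descending frequency; ties broken alphabetically (an assumption).
order = sorted(frequent, key=lambda item: (-frequent[item], item))

# Ordered-Item set: frequent items of each transaction, in that order.
ordered_sets = {tid: [i for i in order if i in items]
                for tid, items in transactions.items()}

print(order)
for tid, items in ordered_sets.items():
    print(tid, items)
```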
Market Basket Analysis- FP Growth
Algo
a) Inserting the set {K, E, M, O, Y}:
Here, all the items are simply linked one after the other in the order of occurrence in
the set and initialize the support count for each item as 1.
Market Basket Analysis- FP Growth
Algo
b) Inserting the set {K, E, O, Y}: Till the insertion of the elements K and E, simply the
support count is increased by 1. On inserting O we can see that there is no direct link
between E and O, therefore a new node for the item O is initialized with the support
count as 1 and item E is linked to this new node. On inserting Y, we first initialize a new
node for the item Y with support count as 1 and link the new node of O with the new
node of Y.
Market Basket Analysis- FP Growth
Algo
c) Inserting the set {K, E, M}:

Here simply the support count of each element is increased by 1.

Market Basket Analysis- FP Growth
Algo
d) Inserting the set {K, M, Y}:

Similar to step b), first the support count of K is increased, then new nodes for M and Y
are initialized and linked accordingly.
Market Basket Analysis- FP Growth
Algo
e) Inserting the set {K, E, O}:

Here simply the support counts of the respective elements are increased. Note that the
support count of the new node of item O is increased.
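The five insertions a) to e) can be reproduced with a small trie sketch. The class name `Node` and the helper functions are hypothetical; each node stores an item, its support count and its children.

```python
class Node:
    """One node of the frequent-pattern tree (a trie with counts)."""
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(ordered_sets):
    root = Node(None)
    for items in ordered_sets:
        node = root
        for item in items:
            # Create a new node only when no link for this item exists yet.
            if item not in node.children:
                node.children[item] = Node(item, parent=node)
            node = node.children[item]
            node.count += 1          # one more transaction passes through
    return root

def show(node, depth=0):
    for child in node.children.values():
        print("  " * depth + f"{child.item}:{child.count}")
        show(child, depth + 1)

ordered_sets = [
    ["K", "E", "M", "O", "Y"],   # a)
    ["K", "E", "O", "Y"],        # b)
    ["K", "E", "M"],             # c)
    ["K", "M", "Y"],             # d)
    ["K", "E", "O"],             # e)
]
root = build_fp_tree(ordered_sets)
show(root)
```

Transactions that share a prefix share a path in the tree, which is why the tree is usually much smaller than the list of transactions.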
Market Basket Analysis- FP Growth
Algo
Now, for each item, the Conditional Pattern Base is computed which is path labels of all
the paths which lead to any node of the given item in the frequent-pattern tree. Note that
the items in the below table are arranged in the ascending order of their frequencies.
Market Basket Analysis- FP Growth
Algo
Now for each item, the Conditional Frequent Pattern Tree is built. It is done by taking the
set of elements that is common in all the paths in the Conditional Pattern Base of that item
and calculating its support count by summing the support counts of all the paths in the
Conditional Pattern Base.
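A compact way to sketch both steps is to work from the Ordered-Item sets directly: since transactions with the same prefix share one tree path, the prefixes that lead to an item, with their counts, are exactly its Conditional Pattern Base. The variable names are hypothetical, and the "items common to all paths" rule is the simplified construction described above.

```python
from collections import Counter, defaultdict

ordered_sets = [
    ["K", "E", "M", "O", "Y"],
    ["K", "E", "O", "Y"],
    ["K", "E", "M"],
    ["K", "M", "Y"],
    ["K", "E", "O"],
]

# Conditional Pattern Base: for every item, the prefix paths leading to it.
pattern_base = defaultdict(Counter)
for items in ordered_sets:
    for pos, item in enumerate(items):
        prefix = tuple(items[:pos])
        if prefix:                       # K has no prefix, so no pattern base
            pattern_base[item][prefix] += 1

# Conditional Frequent Pattern Tree: items common to all paths,
# with the support counts of all paths summed.
cond_tree = {}
for item, paths in pattern_base.items():
    common = set.intersection(*(set(p) for p in paths))
    cond_tree[item] = (sorted(common), sum(paths.values()))

for item in ["Y", "O", "M", "E"]:
    print(item, dict(pattern_base[item]), cond_tree[item])
```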
Market Basket Analysis- FP Growth
Algo
From the Conditional Frequent Pattern Tree, the Frequent Pattern rules are generated by
pairing the items of the Conditional Frequent Pattern Tree set with the corresponding
item, as given in the below table.

For each row, two types of association rules can be inferred. For example, for the first row
the rules K -> Y and Y -> K can be inferred. To determine
the valid rule, the confidence of both rules is calculated and the one with
confidence greater than or equal to the minimum confidence value is retained.
Market Basket Analysis
• Descriptive market basket analysis: This type only derives insights from past data and is the
most frequently used approach. The analysis here does not make any predictions but rates
the association between products using statistical techniques. For those familiar with the
basics of Data Analysis

• Predictive market basket analysis: This type uses supervised learning models like
classification and regression. It essentially aims to mimic the market to analyze what causes
what to happen. Essentially, it considers items purchased in a sequence to determine cross-
selling. For example, buying an extended warranty is more likely to follow the purchase of an
iPhone. While it isn't as widely used as a descriptive MBA, it is still a very valuable tool for
marketers

• Differential market basket analysis: This type of analysis is beneficial for competitor analysis.
It compares purchase history between stores, between seasons, between two time periods,
between different days of the week, etc., to find interesting patterns in consumer behavior.
For example, it can help determine why some users prefer to purchase the same product at
the same price on Amazon vs Flipkart. The answer can be that the Amazon reseller has more
warehouses and can deliver faster, or maybe something more profound like user experience.
Market Basket Analysis- Benefits
Reinforcement Learning
• Reinforcement learning (RL) is a machine learning (ML) technique that
trains software to make decisions to achieve the most optimal results.
• It mimics the trial-and-error learning process that humans use to achieve
their goals. Software actions that work towards your goal are reinforced,
while actions that detract from the goal are ignored.
• RL algorithms use a reward-and-punishment paradigm as they process
data. They learn from the feedback of each action and self-discover the
best processing paths to achieve final outcomes.
• The algorithms are also capable of delayed gratification. The best overall
strategy may require short-term sacrifices, so the best approach they
discover may include some punishments or backtracking along the way. RL
is a powerful method to help artificial intelligence (AI) systems achieve
optimal outcomes in unseen environments.
• Reinforcement learning continuously improves its model by leveraging
feedback from previous iterations. This is different to supervised and
unsupervised learning, which both reach a definite endpoint after a model
is formulated from the training and test data segments.
Reinforcement Learning
Key concepts
• In reinforcement learning, there are a few key concepts to
familiarize yourself with:
• The agent is the ML algorithm (or the autonomous system)
• The environment is the adaptive problem space with attributes
such as variables, boundary values, rules, and valid actions
• The action is a step that the RL agent takes to navigate the
environment
• The state is the environment at a given point in time
• The reward is the positive, negative, or zero value—in other words,
the reward or punishment—for taking an action
• The cumulative reward is the sum of all rewards or the end value
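To make these terms concrete, here is a minimal sketch of the agent-environment loop. The `LineWorld` environment, its reward values (-1 per step, +10 at the goal), and the purely random agent are hypothetical choices made for this illustration only.

```python
import random


class LineWorld:
    """Hypothetical environment: positions 0..4 on a line, goal at position 4."""

    def __init__(self):
        self.state = 0  # the state is the environment at a given point in time

    def step(self, action):
        # action is -1 (move left) or +1 (move right); positions are clamped to 0..4
        self.state = min(4, max(0, self.state + action))
        done = self.state == 4
        reward = 10 if done else -1  # reward at the goal, small penalty otherwise
        return self.state, reward, done


def random_agent(state, rng):
    """A trivial agent that ignores the state and picks a random valid action."""
    return rng.choice([-1, 1])


def run_episode(seed=0, max_steps=100):
    rng = random.Random(seed)
    env = LineWorld()
    state = env.state
    cumulative_reward = 0
    for _ in range(max_steps):
        action = random_agent(state, rng)       # the agent chooses an action
        state, reward, done = env.step(action)  # the environment returns state and reward
        cumulative_reward += reward             # cumulative reward is the running sum
        if done:
            break
    return state, cumulative_reward


final_state, total = run_episode()
print(final_state, total)
```

A learning agent would replace `random_agent` with a policy that uses the observed rewards to improve its choices.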
Reinforcement Learning
Algorithm basics
• Reinforcement learning is based on the Markov decision
process, a mathematical modeling of decision-making that
uses discrete time steps. At every step, the agent takes a
new action that results in a new environment state.
Similarly, the current state is attributed to the sequence of
previous actions.
• Through trial and error in moving through the environment,
the agent builds a set of if-then rules or policies. The
policies help it decide which action to take next for optimal
cumulative reward. The agent must also choose between
exploring the environment further to learn new state-action
rewards and selecting known high-reward actions from a given
state. This is called the exploration-exploitation trade-off.
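One common way to handle this trade-off is an epsilon-greedy rule, sketched below. Epsilon-greedy is not named in these notes; it is brought in here only as a simple illustration, and the function name, the `epsilon` parameter, and the sample reward estimates are assumptions.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """Pick an action for the current state.

    q_values maps each valid action to its estimated reward.
    With probability epsilon the agent explores (random action);
    otherwise it exploits (action with the highest estimate).
    """
    if rng.random() < epsilon:
        return rng.choice(list(q_values))   # exploration
    return max(q_values, key=q_values.get)  # exploitation


rng = random.Random(0)
estimates = {"left": 0.2, "right": 1.5, "up": -0.4}  # hypothetical estimates
print(epsilon_greedy(estimates, epsilon=0.1, rng=rng))
```

A larger `epsilon` means more exploration; setting it to 0 makes the agent purely greedy.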
Reinforcement Learning

Difference w.r.t Unsupervised Learning: RL has a predetermined end goal. While it takes an
exploratory approach, the explorations are continuously validated and improved to increase
the probability of reaching the end goal. It can teach itself to reach very specific outcomes.
Reinforcement Learning
• A specific algorithmic example of reinforcement learning is Q-learning. In Q-
learning, you start with a set environment of states, represented by the symbol ‘S’.
In the game Pac-Man, states could be the challenges, obstacles or pathways that
exist in the game. There may exist a wall to the left, a ghost to the right, and a
power pill above—each representing different states.
• The set of possible actions to respond to these states is referred to as “A.” In the
case of Pac-Man, actions are limited to left, right, up, and down movements, as
well as multiple combinations thereof.
• The third important symbol is “Q.” Q is the value assigned to a given state/action
pair and has an initial value of “0.”
• As Pac-Man explores the space inside the game, two main things will happen:
• Q drops as negative things occur after a given state/action
• Q increases as positive things occur after a given state/action
• In Q-learning, the machine will learn to match the action for a given state that
generates or maintains the highest level of Q. It will learn initially through the
process of random movements (actions) under different conditions (states). The
machine will record its results (rewards and penalties) and how they impact its Q
level and store those values to inform and optimize its future actions.
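The idea can be sketched with a small tabular Q-learning example. The notes above do not give an update formula, so this sketch assumes the standard Q-learning update, Q(s, a) += alpha * (reward + gamma * max Q(s', a') - Q(s, a)). The corridor layout, the reward values, and the parameters `alpha`, `gamma` and `epsilon` are all hypothetical choices for illustration.

```python
import random
from collections import defaultdict

# Hypothetical 1-D "Pac-Man" corridor: a ghost at position 0, a power pill at position 4
ACTIONS = ["left", "right"]
START, GHOST, PILL = 2, 0, 4


def step(state, action):
    """Return (next_state, reward, done) for one move in the corridor."""
    nxt = state - 1 if action == "left" else state + 1
    if nxt == PILL:
        return nxt, 10, True    # positive outcome: Q for this state/action increases
    if nxt == GHOST:
        return nxt, -10, True   # negative outcome: Q for this state/action drops
    return nxt, -1, False       # small cost for every other move


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)  # every Q(state, action) starts at 0
    for _ in range(episodes):
        state, done = START, False
        while not done:
            # Mostly pick the best-known action, sometimes move at random
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            # Assumed standard Q-learning update
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
    return q


q = train()
for s in (1, 2, 3):
    print(s, max(ACTIONS, key=lambda a: q[(s, a)]))
```

After training, the stored Q values should favor moving right, toward the power pill, from each interior position.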
Assignment 3
• Use the k-means clustering algorithm and Euclidean distance to cluster the following eight
examples into three clusters: A1=(2,10), A2=(2,5), A3=(8,4), A4=(5,8), A5=(7,5), A6=(6,4),
A7=(1,2), A8=(4,9). Find the new centroid at every new point entry into the cluster group.
Assume initial cluster centers as A1, A4, A7

• Explain Reinforcement Learning in detail, and discuss its types along with its various
elements. Outline the partially observable state.
