
Unsupervised Learning

By
Prof. (Dr.) Premanand P. Ghadekar

1
Unsupervised Learning

❖ K-Means Clustering
❖ C-Means Clustering
❖ Association Rule Mining

2
Unsupervised Learning
Unsupervised learning is a type of machine learning algorithm used to draw
inferences from data sets consisting of input data without labelled responses.

3
Unsupervised Learning-Process Flow
Training data is a collection of information without any labels.

4
What is Clustering
Clustering is the process of dividing a data set into groups consisting
of similar data points.
o It groups objects based on information found in the data that describes the
objects or their relationships.
o Points in the same group are as similar as possible.
o Points in different groups are as dissimilar as possible.

5
Why is Clustering used
The goal of clustering is to determine the intrinsic grouping in a set of unlabeled
data.
Sometimes the partitioning itself is the goal.

6
Where is Clustering used

7
Clustering Example
Image segmentation
Goal: Break up the image into meaningful or perceptually similar regions

8
Types of Clustering

o Exclusive Clustering-K-Means Clustering

Division of objects into clusters such that each object is in exactly one cluster,
not several clusters.

10
Types of Clustering

o Overlapping Clustering-C Means Clustering

C-Means Clustering: division of objects into clusters such that each object can belong to
multiple clusters.

11
Types of Clustering
o Exclusive Clustering
o Overlapping Clustering
o Hierarchical Clustering- Agglomerative and Divisive

Clusters have a tree-like structure or a parent-child relationship.


Agglomerative or bottom-up approach: begin with each element as a separate cluster
and merge them into successively larger clusters.
Divisive or top-down approach: begin with the whole set and proceed to
divide it into smaller clusters.
12
K-Means Clustering
The process by which objects are classified into a predefined number of groups
so that they are as dissimilar as possible from one group to another,
but as similar as possible within each group.

'K' in K-Means represents the number of clusters.

13
How the K-Means Algorithm Works

The distance measure determines the similarity between two elements, and it influences
the shape of the clusters.
Common measures: Euclidean distance, Manhattan distance, squared Euclidean
distance, and cosine distance (see the sketch below).

14
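To make these measures concrete, here is a small sketch computing each of the four distances between two illustrative points, assuming NumPy and SciPy are available:

import numpy as np
from scipy.spatial import distance

a, b = np.array([185.0, 72.0]), np.array([170.0, 56.0])
print(distance.euclidean(a, b))    # Euclidean distance
print(distance.cityblock(a, b))    # Manhattan distance
print(distance.sqeuclidean(a, b))  # squared Euclidean distance
print(distance.cosine(a, b))       # cosine distance (1 - cosine similarity)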
K-Means Clustering Steps
1. First, we need to decide the number of clusters to be made (an initial guess).
2. Then we provide the centroids of all the clusters (also guessed initially).
3. The algorithm calculates the Euclidean distance of the points from each centroid and assigns
each point to the closest cluster.

15


K-Means Clustering Steps
4. Next, the centroids are calculated again for the new clusters.
5. The distances of the points from the cluster centres are calculated again, and points are
assigned to the closest cluster.
6. And then again, the new centroid for each cluster is calculated.

16


K-Means Clustering Steps
7. These steps are repeated until the centroids repeat themselves or the new centroids are
very close to the previous ones.

17
K-Means Clustering Algorithm
❑ Divide the points into three clusters, where K = 3.

18
K-Means Clustering Algorithm
Step-1: Select the number of clusters to be identified, i.e. select a value of K;
K = 3 in this case.
Step-2: Randomly select three distinct data points.

19
K-Means Clustering Algorithm
Step-3: Measure the distance between the 1st point and the three selected cluster centroids.

Distance from the 1st point to clusters 1, 2, and 3.

20
K-Means Clustering Algorithm
Step-4: Assign the 1st point to the nearest cluster (red in this case).

21
K-Means Clustering Algorithm
Step-5: Calculate the mean value, including the new point, for the red cluster.

22
K-Means Clustering Algorithm
Step-6: Find which cluster point 2 belongs to. How?
Repeat the same procedure, but measure the distance to the red cluster's mean as well.

Then calculate the mean value, including the new point, for its cluster.

23
K-Means Clustering Algorithm
Step-7:
o Measure the distance
o Assign the point to the nearest Cluster
o Calculate the Cluster mean using the new point.

24
K-Means Clustering Algorithm
The K-Means algorithm iterates again and again until the data points
within each cluster stop changing.

25
K-Means Clustering Algorithm

26
K-Means Clustering Algorithm

A runnable version of the K-Means loop (Python/NumPy):

import numpy as np

def k_means(X, k, rng=np.random.default_rng(0)):
    # randomly choose k examples as the initial centroids
    centroids = X[rng.choice(len(X), k, replace=False)].astype(float)
    while True:
        # create k clusters by assigning each example to its closest centroid
        labels = np.argmin(((X[:, None] - centroids) ** 2).sum(axis=2), axis=1)
        new = np.array([X[labels == j].mean(axis=0) for j in range(k)])  # average each cluster
        if np.allclose(new, centroids):  # centroids don't change: stop
            break
        centroids = new
    return labels, centroids

27
K-Means Clustering Example
Data set:

Sr No   Height   Weight
1       185      72
2       170      56
3       168      60
4       179      68
5       182      72
6       188      77
7       180      71
8       180      70
9       183      84
10      180      88
11      180      67
12      177      76

Technique: centroids and Euclidean distance. Initial centroids:
K1 cluster centroid = (185, 72) (row 1)
K2 cluster centroid = (170, 56) (row 2)

Row 3 (168, 60): Euclidean distance to K1 = sqrt((185-168)^2 + (72-60)^2) ≈ 20.81;
to K2 = sqrt((170-168)^2 + (56-60)^2) ≈ 4.47.
Row 3 belongs to K2 (as its distance, 4.47, is smaller).
New centroid for K2: ((170+168)/2, (56+60)/2) = (169, 58).

Row 4 (179, 68): Euclidean distance to K1 ≈ 7.21; to K2 ≈ 14.14.
Row 4 belongs to K1 (as its distance, 7.21, is smaller).
New centroid for K1: ((185+179)/2, (72+68)/2) = (182, 70).

Continuing point by point gives the final clusters:
K1 = {1, 4, 5, 6, 7, 8, 9, 10, 11, 12}
K2 = {2, 3}

28
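As a cross-check of this worked example, the final grouping can be reproduced with scikit-learn's KMeans (a sketch; assumes scikit-learn is installed, and uses rows 1 and 2 as the initial centroids as above; scikit-learn updates all centroids in batch rather than point by point, but it should reach the same two groups here):

import numpy as np
from sklearn.cluster import KMeans

X = np.array([[185, 72], [170, 56], [168, 60], [179, 68], [182, 72], [188, 77],
              [180, 71], [180, 70], [183, 84], [180, 88], [180, 67], [177, 76]])
init = np.array([[185.0, 72.0], [170.0, 56.0]])  # rows 1 and 2 as initial centroids
km = KMeans(n_clusters=2, init=init, n_init=1).fit(X)
print(km.labels_)           # rows 2 and 3 end up together; all others form the second cluster
print(km.cluster_centers_)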
K-Means Clustering Example

(Slides 29-37: figures for the remaining steps of this example.)
How to decide the number of Clusters
The Elbow Method
First, compute the Sum of Squared Errors (SSE) for several values of k (for
example 2, 4, 6, 8, ...). The SSE is defined as the sum of the squared distances between
each member of a cluster and its centroid. Mathematically:

$SSE = \sum_{i=1}^{K} \sum_{x \in c_i} \mathrm{dist}(x, c_i)^2$

The idea of the elbow method is to choose the k after which the SSE decrease is
almost constant.
38
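A minimal sketch of the elbow computation (assumes scikit-learn; the data here is a stand-in):

import numpy as np
from sklearn.cluster import KMeans

X = np.random.default_rng(0).normal(size=(200, 2))          # stand-in data set
for k in range(2, 9):
    sse = KMeans(n_clusters=k, n_init=10).fit(X).inertia_   # inertia_ is exactly the SSE above
    print(k, sse)  # plot k vs SSE and pick the k where the curve "bends"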
Pros and Cons of K-means Clustering

Pros
o Simple and understandable
o Items automatically assigned to clusters

Cons
o Must define number of clusters
o All items forced into clusters
o Unable to handle noisy data and outliers

39
Applications of K-Means Clustering

o Academic Performance
o Diagnostic System
o Search Engine
o Wireless Sensor Network

41
Fuzzy C-means Clustering

Fuzzy C-Means is an extension of K-Means, the popular simple clustering
technique.
Fuzzy clustering (also referred to as soft clustering) is a form of clustering in
which each data point can belong to more than one cluster.


42
Fuzzy C-means Clustering

o Fuzzy logic was proposed by the scientist Lotfi Zadeh.
o It represents uncertainty.
o It represents membership with degrees.
o It represents the degree of belongingness of a member of a crisp set to a fuzzy set.
o It is a mathematical language.
o Relational logic + Boolean logic + predicate logic = fuzzy logic.
o Fuzzy logic deals with fuzzy sets / fuzzy algebra.

43
Pros and Cons of C-means Clustering

Pros
o Allows a data point to be in multiple clusters.
o A more natural representation of the behavior of genes.
o Genes usually are involved in multiple functions.

Cons
o Need to define c, the number of clusters.
o Need to determine membership cut-off value.
o Clusters are sensitive to initial assignment of centroids.
o Fuzzy C-Means is not a deterministic algorithm.

44
K-Means versus Fuzzy C-Means

Attribution to a cluster: In fuzzy clustering, each point has a probability of
belonging to each cluster, rather than completely belonging to just one cluster, as is
the case in traditional K-Means.

Speed: Fuzzy C-Means tends to run slower than K-Means, since it is actually
doing more work: each point is evaluated against each cluster, and more operations
are involved in each evaluation.

Personal opinion: Soft K-Means is "less stupid" than hard K-Means when
it comes to elongated clusters.

45
Fuzzy C-means Clustering

46
Fuzzy Sets

47
Steps in Fuzzy C-Means

48
The process flow of fuzzy C-Means

1. Assume a fixed number of clusters k.

2. Initialization: Randomly initialize the k means μk associated with the clusters
and compute the probability that each data point xi is a member of a given
cluster k, P(point xi has label k | xi, k).

3. Iteration: Recalculate the centroid of each cluster as the weighted centroid
given the probabilities of membership of all data points xi (see the update
formula and the sketch below).

4. Termination: Iterate until convergence or until a user-specified number of
iterations has been reached (the iteration may be trapped at a local
maximum or minimum).

49
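The slide's formula image for step 3 is missing; assuming the standard Fuzzy C-Means update with fuzzifier $m$ and membership $u_{ik}$ of point $x_i$ in cluster $k$, the weighted centroid is $c_k = \frac{\sum_i u_{ik}^m x_i}{\sum_i u_{ik}^m}$. A minimal NumPy sketch of the whole loop, assuming Euclidean distance (all names illustrative):

import numpy as np

def fuzzy_c_means(X, c, m=2.0, iters=100, rng=np.random.default_rng(0)):
    u = rng.random((len(X), c))
    u /= u.sum(axis=1, keepdims=True)                 # random memberships, each row sums to 1
    for _ in range(iters):
        w = u ** m
        centers = (w.T @ X) / w.sum(axis=0)[:, None]  # weighted centroids (formula above)
        d = np.linalg.norm(X[:, None] - centers, axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1))
        u = inv / inv.sum(axis=1, keepdims=True)      # standard membership update
    return u, centers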
The Fuzzy C-Means Example

To better understand this principle, a classic example of one-dimensional data
on an x-axis is given below.

50
The Fuzzy C-Means Example

o This data set can be traditionally grouped into two clusters.


o By selecting a threshold on the x-axis, the data is separated into two clusters.
o The resulting clusters are labelled 'A' and 'B', as seen in the following image.
Each point belonging to the data set would therefore have a membership
coefficient of 1 or 0.
o This membership coefficient of each corresponding data point is represented
by the inclusion of the y-axis.

51
The Fuzzy C-Means Example
o In fuzzy clustering, each data point can have membership in multiple clusters.
o By relaxing the definition of membership coefficients from strictly 1 or 0, these
values can range anywhere from 0 to 1.
o The following image shows the data set from the previous clustering, but now
fuzzy c-means clustering is applied.
o First, a new threshold value defining two clusters may be generated.
o Next, new membership coefficients for each data point are generated based on
the cluster centroids, as well as the distance from each cluster centroid.

As one can see, the middle data point belongs to both cluster A and cluster B.
The value of 0.3 is this data point's membership coefficient for cluster A.

52
Density Based Spatial Clustering of Application with Noise

53
Evaluation Metrics for Clusters
Some popular measures used to evaluate C-Means clusters:
1. Homogeneity analysis of the clusters formed.
2. The clusters formed using Fuzzy C-Means need to be homogeneous and well
separated from other clusters.
3. Coefficient-of-variance analysis for each cluster.
4. Pearson correlation can be used for validating the quality of clusters.
5. If we have ground-truth cluster values, precision, recall, and F-score can also
be considered.
6. The Elbow Method and Silhouette are also statistical measures for
evaluating clusters (they are better used when pre-defining the number of
clusters).
7. Entropy-based methods.

54
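For instance, the silhouette measure from point 6 is one call with scikit-learn (a sketch; assumes scikit-learn, stand-in data):

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

X = np.random.default_rng(0).normal(size=(200, 2))      # stand-in data
labels = KMeans(n_clusters=3, n_init=10).fit_predict(X)
print(silhouette_score(X, labels))  # near 1: homogeneous, well-separated clusters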
Hierarchical Clustering
Hierarchical Clustering is an alternative approach which builds a hierarchy from
the bottom up, and doesn’t require us to specify the number of clusters beforehand.

55
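A small sketch of bottom-up (agglomerative) clustering with SciPy; note that the number of clusters is chosen only when cutting the finished tree (stand-in data):

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.random.default_rng(0).normal(size=(20, 2))   # stand-in data
Z = linkage(X, method="ward")                       # the merge hierarchy (bottom-up)
labels = fcluster(Z, t=3, criterion="maxclust")     # cut the tree into 3 clusters afterwards
print(labels)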
Pros and Cons: Hierarchical Clustering

Pros
o No assumption of a particular number of clusters.
o May correspond to meaningful taxonomies.

Cons
o Once a decision is made to combine two clusters, it can't be undone.
o Too slow for large data sets.

56
Why Use Market Basket Analysis?
In order to understand why Market Basket Analysis is important, we need to understand
the objective of MBA.
The primary objectives of Market Basket Analysis are to:
o Improve the effectiveness of marketing, and
o Improve sales tactics using customer data collected during sales transactions.

Market Basket Analysis is a modelling technique based upon the theory that if you buy a
certain group of items, you are more (or less) likely to buy another group of items.

57
What Questions Does Market Basket Analysis Answer?
o What products are customers really interested in?
o Which products sell well, and which products can be combined with them?
o Which combinations of products work well?
o Any other random observations or hidden patterns?

58
What is Market Basket Analysis ?
o Market Basket Analysis (MBA) is a data mining technique or algorithm used to
find association rules in the given or available data.
o The mathematical concepts behind this algorithm are simple:
o Support
o Confidence

59
Example
Given below is the transaction table data of the shop/supermarket.

60
Example

A customer coming to the shop who buys Milk also buys Bread.


For the given scenario, let's calculate Support and Confidence to understand:
o Support = 60% (60% of the customers buying at least one product from the shop
have Milk in their list).
o Confidence = 66% (66% of the customers who bought Milk also bought Bread).

61
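The arithmetic behind these two figures, in plain Python. The transaction list itself is hypothetical (the slide's table is a figure), chosen so the counts match 60% and 66%:

transactions = [{"Milk", "Bread"}, {"Milk", "Bread"}, {"Milk", "Eggs"},
                {"Bread", "Butter"}, {"Eggs", "Butter"}]
n = len(transactions)
milk = sum("Milk" in t for t in transactions)                       # 3 of 5 transactions
milk_and_bread = sum({"Milk", "Bread"} <= t for t in transactions)  # 2 of 5
print(f"Support(Milk) = {milk / n:.0%}")                            # 3/5 = 60%
print(f"Confidence(Milk -> Bread) = {milk_and_bread / milk:.1%}")   # 2/3, about 66%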
Market Basket Analysis

Market Basket Analysis explains the combinations of products that
frequently co-occur in transactions.

Market Basket Analysis Algorithms
1. Association Rule Mining
2. Apriori

62
Association Rule Mining

❖ Mining frequent patterns and rules.
❖ Association rules: conditional dependencies.
❖ Two stages:
o Find frequent patterns.
o Derive associations (A → B) from frequent patterns.
❖ Find patterns in:
o Sequences (time-series data, fault analysis)
o Transactions (market basket data)
o Graphs (social network analysis)

63
Association Rule Mining
Association Rule Mining is a technique that shows how items are associated
with each other.

Examples
1. Customers who purchase bread have a 60% likelihood of also purchasing
jam.

2. Customers who purchase a laptop are more likely to purchase laptop bags.

64
Association Rule Mining
Example of an association rule:

A → B

It means that if a person buys item A, then he will also buy item B.

Three common ways to measure association:

o Support
o Confidence
o Lift

65
Association Rule Mining

66
Market Basket Analysis

Market Basket Analysis

❖ An immediate extension of association rules.
❖ Association rules with a "lot of business" outcome.
❖ Very heavily used in retail scenarios.
❖ Typical input:
o Lists of purchases by customers over different visits.
❖ Output:
o Which items are purchased together?
o Which items are purchased sequentially?
o Which items are purchased in which seasons?
❖ Association rules generate rules, e.g. (X → Y).
❖ Market Basket Analysis assigns a business outcome to those rules.
o Example: X and Y could be sold together.
67
Market Basket Analysis-Lift Measure

❖ Confidence: How confident are we that Y is present, given the presence of X?
❖ Expected confidence: How often is Y present overall, irrespective of X
(i.e., the support of Y)?
❖ Lift = Confidence / Expected Confidence = (Y in the presence of X) / (Y
irrespective of X).
❖ Lift explains how the presence of X changes the probability of Y.
❖ Lift = 1 implies that X makes no impact on Y.
❖ Lift > 1 implies that the relationship between X and Y is significant.
❖ The larger the lift ratio, the more significant the association.

69
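In code, lift is just this ratio; a tiny sketch with hypothetical counts:

n = 100                       # total transactions (hypothetical)
n_x, n_y, n_xy = 40, 25, 20   # transactions containing X, Y, and both
confidence = n_xy / n_x       # P(Y | X) = 0.50
expected = n_y / n            # P(Y), irrespective of X = 0.25
print("lift =", confidence / expected)  # 2.0 > 1: a significant association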
Association Rule Mining-Example

70
Association Rule Mining-Example

71
Apriori Algorithm-Example
Example: For the following given transaction data set, generate rules using the
Apriori algorithm. Consider Support = 50% and Confidence = 75%.

Data Set:
Trans. ID   Items Purchased
1           Bread, Cheese, Egg, Juice
2           Bread, Cheese, Juice
3           Bread, Milk, Yogurt
4           Bread, Juice, Milk
5           Cheese, Juice, Milk

Frequent item set, where Support(Bread) = n(Bread) / n:
Items    Frequency   Support
Bread    4           4/5 = 80%
Cheese   3           3/5 = 60%
Egg      1           1/5 = 20%
Juice    4           4/5 = 80%
Milk     3           3/5 = 60%
Yogurt   1           1/5 = 20%

Remove Egg and Yogurt from the list, as their support is less than 50%.
72
Apriori Algorithm-Example
Make the 2-item candidate set and write the frequencies (Support = 50%, Confidence = 75%):

Item Pairs       Frequency   Support
Bread, Cheese    2           2/5 = 40%
Bread, Juice     3           3/5 = 60%
Bread, Milk      2           2/5 = 40%
Cheese, Juice    3           3/5 = 60%
Cheese, Milk     1           1/5 = 20%
Juice, Milk      2           2/5 = 40%

Frequent pairs (support at least 50%): I. {Bread, Juice}  II. {Cheese, Juice}

Confidence(A → B) = Support(A ∪ B) / Support(A)

I. {Bread, Juice}: calculate the confidence level
1. Bread → Juice = S(B ∪ J) / S(B) = (3/5) / (4/5) = 3/4 = 75%
2. Juice → Bread = S(J ∪ B) / S(J) = (3/5) / (4/5) = 3/4 = 75%
II. {Cheese, Juice}: calculate the confidence level
1. Cheese → Juice = S(C ∪ J) / S(C) = (3/5) / (3/5) = 100%
2. Juice → Cheese = S(J ∪ C) / S(J) = (3/5) / (4/5) = 3/4 = 75%

As the confidence level of every rule is at least 75%, all the rules are good.
73
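The same frequent item sets and rules can be reproduced with the mlxtend library (a sketch; assumes pandas and mlxtend are installed):

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

txns = [["Bread", "Cheese", "Egg", "Juice"], ["Bread", "Cheese", "Juice"],
        ["Bread", "Milk", "Yogurt"], ["Bread", "Juice", "Milk"],
        ["Cheese", "Juice", "Milk"]]
te = TransactionEncoder()
df = pd.DataFrame(te.fit(txns).transform(txns), columns=te.columns_)
frequent = apriori(df, min_support=0.5, use_colnames=True)            # Support >= 50%
rules = association_rules(frequent, metric="confidence", min_threshold=0.75)
print(rules[["antecedents", "consequents", "support", "confidence"]])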
Apriori Algorithm-Example
Example: For the following given transaction data set, generate rules using the
Apriori algorithm. Consider Support = 50% and Confidence = 70%.

Data Set:
Trans. ID   Items Purchased
100         1, 3, 4
200         2, 3, 5
300         1, 2, 3, 5
400         2, 5

I. Find the frequency of the items:
Item Set   Support
1          2/4 = 50%
2          3/4 = 75%
3          3/4 = 75%
4          1/4 = 25%
5          3/4 = 75%

Frequent itemset: {1, 2, 3, 5}. Item 4 is removed, as its support is only 25%.
74
Apriori Algorithm-Example
Example: For the following given transaction data set, generate rules using the
Apriori algorithm. Consider Support = 50% and Confidence = 70%.

II. Find the support/frequency of pairs of items:
Item Sets   Support
{1,2}       1/4 = 25%
{1,3}       2/4 = 50%
{1,5}       1/4 = 25%
{2,3}       2/4 = 50%
{2,5}       3/4 = 75%
{3,5}       2/4 = 50%

III. Find the support/frequency of three-item sets:
Item Sets   Support
{1,3,5}     1/4 = 25%
{2,3,5}     2/4 = 50%
{1,2,3}     1/4 = 25%

{2,3,5} has support 50%, so we prepare rules from it.
75
Apriori Algorithm-Example
Consider Support = 50% and Confidence = 70%.
Define rules and calculate confidence values.

Confidence(A → B) = S(A ∪ B) / S(A); e.g., (2 ∧ 3) → 5 = 2/2 = 100%

Sr No   Rule           Support   Confidence
1       (2 ∧ 3) → 5    2         2/2 = 100%
2       (3 ∧ 5) → 2    2         2/2 = 100%
3       (2 ∧ 5) → 3    2         2/3 = 66%
4       2 → (3 ∧ 5)    2         2/3 = 66%
5       5 → (2 ∧ 3)    2         2/3 = 66%
6       3 → (2 ∧ 5)    2         2/3 = 66%

Rules 1 and 2 are valid, as their confidence level is greater than 70%.


76
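The confidences in this table are easy to verify by brute force (plain Python, no libraries):

txns = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
support = lambda items: sum(items <= t for t in txns) / len(txns)
for antecedent, consequent in [({2, 3}, {5}), ({3, 5}, {2}), ({2, 5}, {3}),
                               ({2}, {3, 5}), ({5}, {2, 3}), ({3}, {2, 5})]:
    conf = support(antecedent | consequent) / support(antecedent)
    print(sorted(antecedent), "->", sorted(consequent), f"confidence = {conf:.0%}")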
Apriori Algorithm

(Slides 77-87: figures covering an example and the algorithm's first, second, third,
and fourth iterations, pruning, subset creation, and applying the rules.)