Clustering Data Mining

Uploaded by

Maryam Syed
Clustering and Association Rules in Data Mining
Farhad Muhammad Riaz
K-Means Clustering

• K-means clustering is one of the most popular unsupervised learning algorithms.
• It is used with unlabeled data, i.e. data without defined categories or groups.
• The algorithm offers a simple way to partition a given data set into a certain number of clusters K, fixed a priori.
• The K-means algorithm works iteratively to assign each data point to one of the K groups based on the provided features. Data points are clustered based on feature similarity.
K-Means Clustering

• Applications
– Image segmentation
– Customer segmentation
– Species clustering
– Anomaly detection
– Clustering languages
K-Means Clustering Algorithm

• Step 1
– Choose the number of clusters K.
• Step 2
– Randomly select any K data points as the initial cluster centers.
– Preferably, choose cluster centers that are as far apart from each other as possible.
• Step 3
– Calculate the distance between each data point and each cluster center.
– The distance may be calculated using either a given distance function or the Euclidean distance formula.
• Step 4
– Assign each data point to a cluster.
– A data point is assigned to the cluster whose center is nearest to it.
• Step 5
– Re-compute the centers of the newly formed clusters.
– The center of a cluster is computed by taking the mean of all the data points contained in that cluster.
• Step 6
– Repeat Steps 3 to 5 until any of the following stopping criteria is met:
• The centers of the newly formed clusters do not change
• Data points remain in the same clusters
• The maximum number of iterations is reached
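
The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name k_means, its parameters (initial_centers, max_iter, seed) and the choice to leave an empty cluster's center unchanged are all assumptions made for this sketch.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(points, k, distance=euclidean, max_iter=100,
            initial_centers=None, seed=None):
    """Hypothetical helper following Steps 1-6; returns (centers, labels)."""
    rng = random.Random(seed)
    # Step 2: use the given centers, or pick K data points at random.
    if initial_centers is not None:
        centers = list(initial_centers)
    else:
        centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Steps 3-4: assign each point to the nearest center.
        new_labels = [
            min(range(k), key=lambda j: distance(p, centers[j]))
            for p in points
        ]
        # Step 5: recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centers.append(mean)
            else:
                # Assumption: an empty cluster keeps its previous center.
                new_centers.append(centers[j])
        # Step 6: stop when assignments or centers no longer change.
        converged = new_labels == labels or new_centers == centers
        labels, centers = new_labels, new_centers
        if converged:
            break
    return centers, labels
```

For instance, calling k_means([(0, 0), (0, 1), (10, 10), (10, 11)], k=2, initial_centers=[(0, 0), (10, 10)]) groups the first two points together and the last two together. Any other distance function with the same two-argument shape can be passed through the distance parameter.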
Example

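• Cluster the following eight points (x, y) into three clusters: A1(2, 10), A2(2, 5), A3(8, 4), A4(5, 8), A5(7, 5), A6(6, 4), A7(1, 2), A8(4, 9)
• Initial cluster centers are: C1(2, 10), C2(5, 8) and C3(1, 2).
• You can use
– The given distance function ρ(a, b) = |x2 – x1| + |y2 – y1|, for a = (x1, y1) and b = (x2, y2)
– The Euclidean distance formula d(a, b) = sqrt((x2 – x1)^2 + (y2 – y1)^2)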
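
As a rough check on the hand calculation, the first iteration of this example can be reproduced in a few lines of Python. The sketch assumes the given distance function ρ (the sum of absolute coordinate differences) is used; the variable names are illustrative only.

```python
# Data points and initial centers from the example.
points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4), "A4": (5, 8),
    "A5": (7, 5), "A6": (6, 4), "A7": (1, 2), "A8": (4, 9),
}
centers = {"C1": (2, 10), "C2": (5, 8), "C3": (1, 2)}


def rho(a, b):
    """Given distance function: |x2 - x1| + |y2 - y1|."""
    return abs(b[0] - a[0]) + abs(b[1] - a[1])


# Steps 3-4: assign every point to its nearest center.
assignment = {
    name: min(centers, key=lambda c: rho(p, centers[c]))
    for name, p in points.items()
}

# Step 5: recompute each center as the mean of its assigned points.
new_centers = {}
for c in centers:
    members = [points[n] for n, owner in assignment.items() if owner == c]
    new_centers[c] = (
        sum(x for x, _ in members) / len(members),
        sum(y for _, y in members) / len(members),
    )

print(assignment)
print(new_centers)
```

With this distance, the first pass leaves A1 alone in the first cluster, puts A3, A4, A5, A6 and A8 in the second, and A2 and A7 in the third, so the centers move to (2, 10), (6, 6) and (1.5, 3.5). Steps 3 to 5 are then repeated from these new centers.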
Association Rule Mining

• Association rule analysis is widely used in the retail industry.
• It is also called Market Basket Analysis.

Terms

Support: The probability that an event (an itemset) occurs, i.e. the fraction of transactions that contain it.
Confidence: A conditional probability: the probability that the consequent occurs given that the antecedent occurs.
Lift: The probability of all items occurring together, divided by the product of the probabilities of the antecedent and the consequent, i.e. the value expected if they were independent of each other.
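
The three measures can be illustrated on a tiny, made-up basket data set. The transactions and the function names below are hypothetical and serve only to show how the definitions translate into arithmetic.

```python
# Hypothetical market baskets (not from the slides).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Estimate of P(consequent | antecedent)."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)


def lift(antecedent, consequent, baskets):
    """Joint support divided by the product of the individual supports."""
    joint = support(antecedent | consequent, baskets)
    return joint / (support(antecedent, baskets) * support(consequent, baskets))


rule = ({"bread"}, {"milk"})
print("support:", support(rule[0] | rule[1], transactions))
print("confidence:", confidence(*rule, transactions))
print("lift:", lift(*rule, transactions))
```

A lift above 1 suggests the antecedent and consequent appear together more often than independence would predict; a lift below 1 suggests the opposite. In this toy data the rule bread → milk has support 0.5 and a lift slightly below 1.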