Exp 8

Uploaded by

Khan Shahid
Copyright: © All Rights Reserved

Experiment No. 08

Aim: Implementation of Hierarchical Clustering method.

Introduction
Hierarchical clustering is an unsupervised learning method for grouping data points. The
algorithm builds clusters by measuring the dissimilarity between data points. Unsupervised
learning means that the data carries no labels: there is no "target" variable to learn from,
and the groups are found from the data alone. The method can be applied to any data for which
a dissimilarity can be computed, to visualize and interpret the relationships between
individual data points.

Hierarchical clustering separates data into groups based on some measure of similarity: it
measures how alike or different the points are, then uses that measure to arrange the data
into nested groups.

Here we will use hierarchical clustering to group data points and visualize the clusters using
both a dendrogram and a scatter plot.

How does it work?


We will use agglomerative clustering, a type of hierarchical clustering that follows a bottom-up
approach. We begin by treating each data point as its own cluster. Then we merge the two clusters
that have the shortest distance between them, creating a larger cluster. This step is repeated
until one large cluster contains all of the data points.
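
The bottom-up procedure described above can be sketched in plain Python. This is a minimal illustration, not an efficient implementation: it reuses the sample points from the example in this experiment, and it assumes single linkage (the shortest point-to-point distance between two clusters) only because that is the simplest rule to write down. The names `single_link_distance` and `agglomerate` are hypothetical helpers, not library functions.

```python
from math import dist

# Same sample points used in this experiment's example.
x = [4, 5, 10, 4, 3, 11, 14, 6, 10, 12]
y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21]
points = list(zip(x, y))


def single_link_distance(a, b):
    """Smallest Euclidean distance between any point of a and any point of b."""
    return min(dist(points[i], points[j]) for i in a for j in b)


def agglomerate(n_points):
    """Merge the two closest clusters until only one remains."""
    clusters = [[i] for i in range(n_points)]  # every point starts alone
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the shortest distance between them.
        d, i, j = min(
            (single_link_distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        # Replace the two merged clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges, clusters[0]


merges, final_cluster = agglomerate(len(points))
for left, right, d in merges:
    print(f"merge {left} + {right} at distance {d:.2f}")
```

With n data points the loop always performs n - 1 merges, and the last merge produces the single cluster that contains every point.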

Hierarchical clustering requires us to choose both a distance metric and a linkage method. We will
use Euclidean distance and the Ward linkage method, which at each step merges the pair of clusters
that gives the smallest increase in within-cluster variance.
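
To make the Ward criterion concrete, the sketch below computes how much the within-cluster sum of squares grows when two clusters are merged. It is an illustration under stated assumptions: the three small clusters are hand-picked from this experiment's sample data, and `centroid`, `within_ss` and `ward_cost` are hypothetical helper names, not SciPy functions.

```python
from math import dist


def centroid(cluster):
    """Mean point of a cluster given as a list of (x, y) tuples."""
    n = len(cluster)
    return (sum(p[0] for p in cluster) / n, sum(p[1] for p in cluster) / n)


def within_ss(cluster):
    """Sum of squared Euclidean distances from each point to the centroid."""
    c = centroid(cluster)
    return sum(dist(p, c) ** 2 for p in cluster)


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if a and b are merged."""
    return within_ss(a + b) - within_ss(a) - within_ss(b)


# Three small clusters taken from the sample data of this experiment.
a = [(4, 17), (3, 16)]
b = [(5, 19)]
c = [(10, 24), (11, 25)]

print(ward_cost(a, b))  # cost of merging a with its near neighbour b
print(ward_cost(a, c))  # cost of merging a with the distant cluster c
```

Merging `a` with `b` costs far less than merging `a` with `c`, so a Ward-style procedure would prefer the first merge.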

What is a Dendrogram?

A dendrogram is a diagram that shows the hierarchical relationship between objects. It is
most commonly created as an output from hierarchical clustering. The main use of a
dendrogram is to work out the best way to allocate objects to clusters. The dendrogram below
shows the hierarchical clustering of six observations shown on the scatterplot to the
left. (Dendrogram is often misspelled as "dendogram".)
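
The information a dendrogram draws is stored in a linkage matrix. As a minimal sketch, assuming SciPy is installed and reusing the sample points from the example in this experiment, the matrix and the leaf order can be inspected without plotting anything:

```python
from scipy.cluster.hierarchy import dendrogram, linkage

x = [4, 5, 10, 4, 3, 11, 14, 6, 10, 12]
y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21]
data = list(zip(x, y))

Z = linkage(data, method='ward', metric='euclidean')

# Each row describes one merge:
# [index of cluster 1, index of cluster 2, merge distance, points in new cluster]
for left, right, height, size in Z:
    print(int(left), int(right), round(float(height), 2), int(size))

# no_plot=True returns the layout as a dictionary instead of drawing it.
tree = dendrogram(Z, no_plot=True)
print(tree['leaves'])  # order of the observations along the dendrogram axis
```

The merge distance in the third column is the height at which the two branches join in the plotted dendrogram.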
Example
Start by visualizing some data points:

import matplotlib.pyplot as plt

# Sample data: ten (x, y) observations
x = [4, 5, 10, 4, 3, 11, 14, 6, 10, 12]
y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21]

# Plot the raw points before clustering
plt.scatter(x, y)
plt.show()
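Next, compute the Ward linkage using Euclidean distance and draw the dendrogram:

import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage

x = [4, 5, 10, 4, 3, 11, 14, 6, 10, 12]
y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21]

# Pair the coordinates into (x, y) observations
data = list(zip(x, y))

# Build the merge hierarchy with Ward linkage on Euclidean distances
linkage_data = linkage(data, method='ward', metric='euclidean')

# Draw the hierarchy as a dendrogram
dendrogram(linkage_data)
plt.show()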
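
To go from the hierarchy to an actual cluster assignment, the tree can be cut into a fixed number of groups. The sketch below assumes SciPy's `fcluster` function and assumes two clusters, a number chosen only for illustration; the source does not state how many clusters the experiment expects.

```python
from scipy.cluster.hierarchy import fcluster, linkage

x = [4, 5, 10, 4, 3, 11, 14, 6, 10, 12]
y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21]
data = list(zip(x, y))

linkage_data = linkage(data, method='ward', metric='euclidean')

# Cut the tree so that at most two flat clusters remain.
# The choice of two clusters is an assumption made for illustration.
labels = fcluster(linkage_data, t=2, criterion='maxclust')
print(labels)  # one cluster label per observation
```

Passing these labels as the color argument, for example `plt.scatter(x, y, c=labels)`, draws the scatter plot with each point colored by its cluster.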

Conclusion:
