
Mahalanobis Distance

Mahalanobis distance is also called quadratic distance. It measures the separation of two groups of objects. Because it takes the correlations of the data set into account and does not depend on the scale of measurement, it can be a more informative measure of similarity than plain Euclidean distance. Suppose we have two groups with mean vectors \bar{x}_1 and \bar{x}_2 and pooled covariance matrix S. The Mahalanobis distance is given by the following formula:

d = \sqrt{(\bar{x}_1 - \bar{x}_2)^T S^{-1} (\bar{x}_1 - \bar{x}_2)}

The data of the two groups must have the same number of variables (the same number of columns), but they do not need to contain the same data (each group may have a different number of rows).
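As a quick illustration, here is a minimal sketch of this formula in Python with NumPy. The function name and the assumption that the pooled covariance matrix is already available are mine, not part of the original tutorial.

import numpy as np

def mahalanobis_distance(mean1, mean2, pooled_cov):
    # Difference of the two group mean vectors
    diff = np.asarray(mean1, dtype=float) - np.asarray(mean2, dtype=float)
    # Quadratic form: (x1 - x2)^T S^-1 (x1 - x2), then the square root
    d_squared = diff @ np.linalg.inv(pooled_cov) @ diff
    return np.sqrt(d_squared)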


For example: Suppose we have two groups of data, and each group consists of two variables (x, y). The scatter plot of the data is shown below.

First, we center the data on the arithmetic mean of each variable, replacing each value x by x - \bar{x}.

The covariance matrix of each group is computed from its centered data matrix X_c as

S = \frac{1}{n} X_c^T X_c

This produces one covariance matrix for group 1 and one for group 2.
The pooled covariance matrix of the two groups is computed as a weighted average of the two covariance matrices, with weights proportional to the group sizes:

S_p = \frac{n_1 S_1 + n_2 S_2}{n_1 + n_2}

Here group 1 has n_1 = 10 rows and group 2 has n_2 = 5 rows, so the pooled covariance is (10/15)*Covariance group 1 + (5/15)*Covariance group 2.

The Mahalanobis distance is then simply the quadratic multiplication of the mean difference and the inverse of the pooled covariance matrix. To perform the quadratic multiplication, check the formula of Mahalanobis distance above again: take the mean difference, transpose it, and multiply it by the inverse pooled covariance matrix. After that, multiply the result by the mean difference again and take the square root:

d = \sqrt{(\bar{x}_1 - \bar{x}_2)^T S_p^{-1} (\bar{x}_1 - \bar{x}_2)}

The final result of this computation is the Mahalanobis distance between the two groups.
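The whole computation above can be sketched end to end in Python with NumPy. Since the tutorial's actual data values are not reproduced in the text, the two arrays below are hypothetical stand-ins with the same shape as the example (10 rows in group 1, 5 rows in group 2, two variables each); only the procedure is the point here.

import numpy as np

# Hypothetical stand-ins for the tutorial's two groups of (x, y) data
group1 = np.array([[2, 2], [2, 5], [6, 5], [7, 3], [4, 7],
                   [6, 4], [5, 3], [4, 6], [2, 5], [1, 3]], dtype=float)
group2 = np.array([[6, 5], [7, 4], [8, 7], [5, 6], [5, 4]], dtype=float)

# Center each group on the arithmetic mean of each variable
mean1, mean2 = group1.mean(axis=0), group2.mean(axis=0)
c1, c2 = group1 - mean1, group2 - mean2

# Covariance matrix of each group from its centered data matrix
n1, n2 = len(group1), len(group2)
S1 = c1.T @ c1 / n1
S2 = c2.T @ c2 / n2

# Pooled covariance: weighted average (10/15)*S1 + (5/15)*S2
Sp = (n1 * S1 + n2 * S2) / (n1 + n2)

# Quadratic multiplication of the mean difference with the inverse
# pooled covariance, then the square root
diff = mean1 - mean2
d = float(np.sqrt(diff @ np.linalg.inv(Sp) @ diff))
print(d)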


What is K-Means Clustering?


The k-means clustering algorithm was developed by J. MacQueen (1967) and later refined by J. A. Hartigan and M. A. Wong around 1975. Simply speaking, k-means clustering is an algorithm to classify or group your objects based on attributes/features into K groups, where K is a positive integer. The grouping is done by minimizing the sum of squared distances between the data points and the corresponding cluster centroids. Thus the purpose of k-means clustering is to classify the data.
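Written as a formula, k-means minimizes the within-cluster sum of squares (the notation here is the standard one, not taken from the original tutorial):

J = \sum_{k=1}^{K} \sum_{x_i \in C_k} \| x_i - \mu_k \|^2

where C_k is the set of objects assigned to cluster k and \mu_k is the centroid of cluster k.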

Example: Suppose we have 4 objects as your training data points, and each object has 2 attributes. Each attribute represents a coordinate of the object.

Object       Attribute 1 (X): weight index   Attribute 2 (Y): pH
Medicine A   1                               1
Medicine B   2                               1
Medicine C   4                               3
Medicine D   5                               4

We also know beforehand that these objects belong to two groups of medicine (cluster 1 and cluster 2). The problem now is to determine which medicines belong to cluster 1 and which belong to cluster 2.

Numerical Example of K-Means Clustering


The basic steps of k-means clustering are simple. In the beginning, we determine the number of clusters K and we assume the centroids or centers of these clusters. We can take any random objects as the initial centroids, or the first K objects in sequence can also serve as the initial centroids.

Then the k-means algorithm performs the three steps below until convergence (a short code sketch of this loop follows the list).

Iterate until stable (= no object moves group):

1. Determine the centroid coordinates
2. Determine the distance of each object to the centroids
3. Group the objects based on minimum distance
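Here is a minimal sketch of that loop in Python with NumPy. The function and variable names are mine; the tutorial's own implementations are the Matlab and Visual Basic programs linked below.

import numpy as np

def kmeans(data, k, max_iter=100):
    # Use the first k objects in sequence as the initial centroids
    centroids = data[:k].astype(float).copy()
    labels = None
    for _ in range(max_iter):
        # Step 2: Euclidean distance of each object to each centroid
        distances = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        # Step 3: group each object by minimum distance
        new_labels = distances.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # stable: no object moved group
        labels = new_labels
        # Step 1: recompute each centroid as the mean coordinate of its members
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = data[labels == j].mean(axis=0)
    return centroids, labels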

The numerical example below is given to help you understand this simple iteration. You may download the Matlab implementation of this numerical example here. Another example of interactive k-means clustering using Visual Basic (VB) is also available here. The MS Excel file for this numerical example can be downloaded at the bottom of this page.

Suppose we have several objects (4 types of medicine) and each object has two attributes or features, as shown in the table below. Our goal is to group these objects into K = 2 groups of medicine based on the two features (pH and weight index).

Object       Attribute 1 (X): weight index   Attribute 2 (Y): pH
Medicine A   1                               1
Medicine B   2                               1
Medicine C   4                               3
Medicine D   5                               4

Each medicine represents one point with two attributes (X, Y), which we can represent as a coordinate in an attribute space, as shown in the figure below.
1. Initial value of centroids: Suppose we use medicine A and medicine B as the first centroids. Let c_1 and c_2 denote the coordinates of the centroids; then c_1 = (1, 1) and c_2 = (2, 1).

2. Objects-centroids distance: We calculate the distance from each cluster centroid to each object. Using Euclidean distance, the distance matrix at iteration 0 is

D^0 = \begin{bmatrix} 0 & 1 & 3.61 & 5 \\ 1 & 0 & 2.83 & 4.24 \end{bmatrix}

Each column in the distance matrix corresponds to one object (A, B, C, D in order). The first row of the distance matrix holds the distance of each object to the first centroid, and the second row holds the distance of each object to the second centroid. For example, the distance from medicine C = (4, 3) to the first centroid is \sqrt{(4-1)^2 + (3-1)^2} = \sqrt{13} \approx 3.61, and its distance to the second centroid is \sqrt{(4-2)^2 + (3-1)^2} = \sqrt{8} \approx 2.83, etc.
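These numbers can be checked with a few lines of NumPy (the array layout mirrors D^0 above; the variable names are mine):

import numpy as np

data = np.array([[1, 1], [2, 1], [4, 3], [5, 4]], dtype=float)  # A, B, C, D
centroids = np.array([[1, 1], [2, 1]], dtype=float)             # c1, c2
# Rows correspond to centroids, columns to objects, as in D^0
D0 = np.linalg.norm(centroids[:, None, :] - data[None, :, :], axis=2)
print(D0.round(2))  # [[0. 1. 3.61 5.], [1. 0. 2.83 4.24]]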

3. Objects clustering: We assign each object to the group whose centroid is at minimum distance. Thus medicine A is assigned to group 1, medicine B to group 2, medicine C to group 2, and medicine D to group 2. An element of the group matrix below is 1 if and only if the object is assigned to that group:

G^0 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \end{bmatrix}

4. Iteration 1, determine centroids: Knowing the members of each group, we now compute the new centroid of each group based on these new memberships. Group 1 has only one member, so its centroid remains at c_1 = (1, 1). Group 2 now has three members, so its centroid is the average coordinate of the three members: c_2 = ((2+4+5)/3, (1+3+4)/3) = (11/3, 8/3).

5. Iteration 1, objects-centroids distances: The next step is to compute the distance of all objects to the new centroids. As in step 2, the distance matrix at iteration 1 is

D^1 = \begin{bmatrix} 0 & 1 & 3.61 & 5 \\ 3.14 & 2.36 & 0.47 & 1.89 \end{bmatrix}

6. Iteration 1, objects clustering: As in step 3, we assign each object based on the minimum distance. Based on the new distance matrix, we move medicine B to group 1 while all the other objects remain where they are. The group matrix becomes

G^1 = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}

7. Iteration 2, determine centroids: We repeat step 4 to calculate the new centroid coordinates based on the clustering of the previous iteration. Group 1 and group 2 both have two members, so the new centroids are c_1 = ((1+2)/2, (1+1)/2) = (1.5, 1) and c_2 = ((4+5)/2, (3+4)/2) = (4.5, 3.5).

8. Iteration 2, objects-centroids distances: Repeating step 2, the new distance matrix at iteration 2 is

D^2 = \begin{bmatrix} 0.5 & 0.5 & 3.20 & 4.61 \\ 4.30 & 3.54 & 0.71 & 0.71 \end{bmatrix}

9. Iteration 2, objects clustering: Again, we assign each object based on the minimum distance, which gives

G^2 = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}

We obtain the result that G^2 = G^1. Comparing the grouping of the last iteration with this iteration reveals that the objects no longer move between groups. Thus the computation of the k-means clustering has reached its stable state and no more iterations are needed. We get the final grouping as the result:

Object       Feature 1 (X): weight index   Feature 2 (Y): pH   Group (result)
Medicine A   1                             1                   1
Medicine B   2                             1                   1
Medicine C   4                             3                   2
Medicine D   5                             4                   2
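As a check, running the kmeans sketch given earlier on these four medicines reproduces exactly this grouping (the code's labels 0 and 1 correspond to groups 1 and 2 here):

data = np.array([[1, 1], [2, 1], [4, 3], [5, 4]], dtype=float)  # A, B, C, D
centroids, labels = kmeans(data, k=2)
print(centroids)   # [[1.5 1. ] [4.5 3.5]]
print(labels + 1)  # [1 1 2 2]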
