Expectation-Maximization Clustering V2
1. Expectation-Maximization
1.1. Definition
1.2. Intuition behind EM
1.3. Mathematical formulation
1.4. EM for Clustering (Soft Assignment)
1.4.1 Mixture Models
1.4.2 Example
1.4.3 Complexity
Conclusion
2. Mean Shift Clustering
2.1 Definition
2.1.1 Advantages
2.1.2 How Does Mean-Shift Clustering Work?
2.2 Example
2.3 Complexity
1. Expectation-Maximization:
1.1. Definition:
The Expectation-Maximization (EM) algorithm is an iterative optimization method for finding maximum likelihood or maximum a posteriori (MAP) estimates of the parameters of statistical models that involve unobserved latent variables. Each iteration alternates two steps:
- E-step: the algorithm computes the expected values of the latent variables using the current parameter estimates.
- M-step: the algorithm determines the parameters that maximize the expected log-likelihood obtained in the E-step, and the corresponding model parameters are updated.
1.2. Intuition behind EM:
Case 1: known distribution parameters / missing values:
Suppose we have a variable X with values [1, 2, x] and X follows the Gaussian distribution N(1, 1). The best estimate for the missing value x is the mean, 1.
Conversely, if I know all the values [1, 2, 3] and want to estimate µ, the best estimate is the arithmetic mean, 2.
To guess the missing value I need µ, and to estimate µ I need all the values. It is a chicken-and-egg problem, and this is where EM (Expectation-Maximization) comes into play. How?
We guess µ₀ = 0, so x₀ = 0; then µ₁ = (1 + 2 + 0)/3 = 1, so x₁ = 1, and so on. At some point this iterative process reaches the fixed point
$$\mu = \frac{1 + 2 + x}{3} = x \quad\Rightarrow\quad x = 1.5 = \mu.$$
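A minimal sketch of this iteration in Python (the observed values [1, 2] and the update rule come from the example above; the initial guess and iteration count are illustrative choices):

```python
# Chicken-and-egg iteration: alternate guessing the missing value x
# from mu (E-step) and re-estimating mu from all values (M-step).
observed = [1, 2]

mu = 0.0  # initial guess mu_0
for _ in range(20):
    x = mu                        # E-step: fill in the missing value
    mu = (sum(observed) + x) / 3  # M-step: re-estimate the mean

print(x, mu)  # both converge to 1.5
```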
1.4. EM for Clustering (Soft Assignment):
1.4.1 Mixture Models:
For each data point x (in red), we can compute the probability that it belongs to each component (cluster/distribution).
1.4.2 Example:
In this example our dataset is a set of one-dimensional points drawn from a mixture of two Gaussian distributions. We try to find out whether a specific point belongs to the red or the blue distribution.
Next, we calculate the probability that a point belongs to each distribution; these probabilities are called the responsibilities:
$$P(x_i \mid \text{blue}) = \frac{1}{\sqrt{2\pi\sigma_b^2}} \exp\!\left(-\frac{(x_i - \mu_b)^2}{2\sigma_b^2}\right)$$
$$b_i = \frac{P(x_i \mid \text{blue}) \cdot P(\text{blue})}{P(x_i \mid \text{blue}) \cdot P(\text{blue}) + P(x_i \mid \text{red}) \cdot P(\text{red})}$$
$$a_i = 1 - b_i$$
Here $b_i$ and $a_i$ are the probabilities that the point $x_i$ belongs to the blue and the red distribution, respectively.
Like K-means, EM is an iterative approach: the E-step and the M-step alternate until the parameters no longer change (convergence), at which point we obtain our clusters.
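A minimal sketch of this loop for two one-dimensional Gaussian components ("blue" and "red") is below. The data, initial values, and number of iterations are illustrative assumptions; the E-step uses the responsibility formulas above, and the M-step re-estimates the parameters from the weighted points.

```python
import numpy as np

# Two-component 1-D Gaussian mixture fitted with EM.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 100), rng.normal(3, 1, 100)])

mu = np.array([-1.0, 1.0])   # initial means
var = np.array([1.0, 1.0])   # initial variances
pi = np.array([0.5, 0.5])    # P(blue), P(red)

for _ in range(50):
    # E-step: responsibilities b_i (column 0) and a_i (column 1)
    dens = np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    resp = pi * dens
    resp /= resp.sum(axis=1, keepdims=True)

    # M-step: re-estimate the parameters from the weighted points
    nk = resp.sum(axis=0)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    pi = nk / len(x)

print(mu, var, pi)  # parameters stop changing once EM has converged
```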
1.4.3 Complexity:
Its time complexity is $O(NKD^3)$ per iteration, where N is the number of data points, K is the number of Gaussian components, and D is the problem dimension (the $D^3$ factor comes from manipulating the $D \times D$ covariance matrices).
For example, for a problem with 3 components in 2 dimensions and 200 points per cluster, the running time is around 2 minutes.
Conclusion:
The EM algorithm is very sensitive to initialization. A common recommendation is to first run K-Means (because it has a lower computational cost) and use the resulting centers as the initial means of the mixture components. Doing so substantially accelerates the convergence of the EM algorithm. I would add that it is also easier to find an appropriate number of clusters by running K-Means.
Nevertheless, the EM algorithm is considered better than K-Means because it provides additional information about the data, namely the dispersion (variance) of each cluster, not only its center.
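As an illustrative sketch of this recommendation with scikit-learn (assuming scikit-learn is available; the data below is made up, and `init_params='kmeans'`, the default, is the option that seeds the component means with K-Means):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Illustrative data: three 2-D Gaussian blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc, 0.5, size=(200, 2))
               for loc in ([0, 0], [3, 3], [0, 4])])

# init_params='kmeans' initializes the component means with K-Means,
# which accelerates EM convergence as discussed above
gmm = GaussianMixture(n_components=3, init_params='kmeans', random_state=0)
gmm.fit(X)

print(gmm.means_)            # cluster centers
print(gmm.covariances_)      # per-cluster dispersion (what K-Means lacks)
resp = gmm.predict_proba(X)  # soft assignments (responsibilities)
```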
2. Mean Shift Clustering:
2.1 Definition:
Mean shift is a non-parametric, density-based clustering algorithm: it treats the data points as samples from an underlying density function and iteratively moves each point towards the nearest mode (peak) of that density; those modes become the cluster centers.
2.1.1 Advantages:
Unlike the popular K-Means clustering algorithm, mean shift does not require specifying the number of clusters in advance; the number of clusters is determined by the algorithm from the data.
It is particularly useful for datasets where the clusters have arbitrary shapes and are not well separated by linear boundaries.
2.1.2 How Does Mean-Shift Clustering Work?
Kernel Density Estimation: first, we estimate the density function of our data points using the KDE technique. We assign a kernel function to each data point; this function can be a Gaussian with zero mean and unit variance (Eq1). The assigned function (Eq2) is divided by a parameter h (the kernel bandwidth) so that it keeps a unit area.
The KDE is then the sum of these kernel functions over all points (Eq3), where n is the number of points.
$$K(u) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{u^2}{2}\right) \qquad \text{(Eq1)}$$
$$K_h(u) = \frac{1}{h}\, K\!\left(\frac{u}{h}\right) \qquad \text{(Eq2)}$$
$$\hat{f}(x) = \frac{1}{n} \sum_{i=1}^{n} K_h(x - x_i) \qquad \text{(Eq3)}$$
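A minimal sketch of Eq1 to Eq3 in Python (the data values and the bandwidth h below are illustrative assumptions):

```python
import numpy as np

points = np.array([1.0, 1.5, 2.0, 7.0, 7.5])
h = 0.8  # kernel bandwidth

def gaussian_kernel(u):
    # Eq1: Gaussian with zero mean and unit variance
    return np.exp(-u**2 / 2) / np.sqrt(2 * np.pi)

def kde(x, data, h):
    # Eq2 + Eq3: average of bandwidth-scaled kernels centred on each point
    return np.mean(gaussian_kernel((x - data) / h) / h)

for x in np.linspace(0, 9, 5):
    print(x, round(kde(x, points, h), 4))  # density is high near the data
```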
Shifting Data Points: in the second step, the algorithm iteratively shifts the data points towards regions of higher density. The shift is determined by computing the mean shift vector for each data point; this vector is computed inside a region of interest defined by a radius R (the only parameter of the algorithm).
Convergence and Cluster Identification: the algorithm keeps shifting the data points until convergence is reached, that is, until the points stop moving significantly. At that stage, the data points have reached the modes of the density distribution.
Once convergence is achieved, the final position of each data point represents a cluster center, so points belonging to the same cluster converge to the same point (cluster center / mode).
Once the centroids are identified, the algorithm assigns each data point to the closest cluster center.
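A minimal sketch of the whole procedure with a flat (radius-R) window (the data, R, and the convergence tolerance are illustrative assumptions):

```python
import numpy as np

# Flat-kernel mean shift: each point is repeatedly replaced by the mean
# of the original data points within radius R of it.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(4, 0.5, (50, 2))])
R = 1.0  # the only parameter of the algorithm

shifted = X.copy()
for _ in range(30):
    new = np.empty_like(shifted)
    for i, p in enumerate(shifted):
        # mean shift step: mean of the neighbours inside the radius-R window
        neighbours = X[np.linalg.norm(X - p, axis=1) <= R]
        new[i] = neighbours.mean(axis=0)
    if np.allclose(new, shifted, atol=1e-6):  # points stopped moving
        break
    shifted = new

# points that converged to (almost) the same location share a cluster
centers = np.unique(shifted.round(2), axis=0)
print(len(centers), centers)
```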
2.2 Example:
An example on the car.xls dataset (from lab 3, Tanagra), with PCA performed to visualize the results, gives 3 clusters, as shown below:
2.3 Complexity:
Scikit-learn's implementation of the algorithm has a lower runtime complexity than a naive implementation: it is usually around O(T·n·log(n)) in low dimensions, where n is the number of samples and T is the number of iterations. In higher dimensions, the complexity tends towards O(T·n²).
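A short usage sketch of that implementation (the data is made up; `estimate_bandwidth` and its `quantile` value are one way to pick the bandwidth from the data, so the number of clusters is never specified):

```python
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth

# Illustrative data: two 2-D blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (100, 2)), rng.normal(4, 0.5, (100, 2))])

# estimate the window size from the data itself
bandwidth = estimate_bandwidth(X, quantile=0.2)
ms = MeanShift(bandwidth=bandwidth).fit(X)

print(ms.cluster_centers_)    # modes found by the algorithm
print(np.unique(ms.labels_))  # cluster labels per sample
```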