Session No: CO2-1 Session Topic: Motion Analysis: Digital Video Processing
Fig. : Datapoints
• Clustering means finding groups of points that lie close together
compared with the other data points.
Fig. : Clusters
Why Gaussian Mixture Model cont…
• In the image above, each cluster also has a marked centroid; these
centroids act as the parameters that identify each of the clusters.
• There are several approaches used to do the clustering. They include:
K-means Clustering
Gaussian Mixture Model
Hierarchical Clustering
and others.
Why Gaussian Mixture Model cont…
• K-means is quite a popular clustering algorithm that updates the
parameters of each cluster iteratively: it computes the centroid
(mean) of each cluster, measures the distance from each centroid to
every data point, and reassigns each point to its nearest centroid.
This process repeats until a stopping criterion is met (a minimal
sketch appears below).
• K-means is a hard clustering method, which means it associates each
point with one and only one cluster. The limitation of this method is
that there is no probability telling us how strongly a data point is
associated with its cluster.
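As an illustration of the loop just described, here is a minimal NumPy sketch. The function name kmeans, the initialization scheme, and the stopping rule are choices made for this example, not taken from the slides:

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Minimal K-means sketch: alternate between assigning each point
    to its nearest centroid and recomputing centroids as cluster means."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct data points
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Hard assignment: each point belongs to exactly one cluster
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update: each centroid becomes the mean of its assigned points
        # (keep the old centroid if a cluster ends up empty)
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Stop when the centroids no longer move
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels
```

Note that the output is only a label per point; there is no measure of how confident each assignment is, which is exactly the limitation raised above.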
• This is where GMM (Gaussian Mixture Model) comes into the picture.
• Let’s recall types of clustering methods:
• Hard clustering: clusters do not overlap (an element either belongs
to a cluster or it does not), e.g. K-means, K-medoids.
• Soft clustering: clusters may overlap (each instance has a strength
of association with each cluster), e.g. a mixture model fitted with
the Expectation-Maximization algorithm.
GMM (Gaussian Mixture Model)
• The core idea of this model is to represent the dataset as a
mixture of multiple Gaussian distributions.
What is a Gaussian Mixture?
• A Gaussian mixture is a function composed of several Gaussians, one
for each of the clusters formed.
• Each Gaussian in the mixture carries some parameters:
A mean, which defines the center.
A covariance, which defines the width (spread).
A mixing probability, which defines the weight of the component.
Fig. : GMM clusters
• Here it can be seen that there are three clusters, which means
three Gaussian functions.
• Each Gaussian explains the data present in one of the clusters.
• Since there are three (k = 3) clusters, the overall probability
density is a weighted linear combination of the densities of these
k distributions, as written out below.
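In standard textbook notation (with K = 3 here), this linear combination is the mixture density

$$ p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k), \qquad \sum_{k=1}^{K} \pi_k = 1 $$

where each $\pi_k$ is the mixing probability of component k.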
• Each k-th cluster contains some number of sample points, but since
we do not know which point belongs to which cluster, the parameters
cannot be estimated in closed form.
• The question now is: how do we find these missing or hidden (latent)
assignments? The Expectation-Maximization (EM) algorithm addresses
this; its two updates are written out below.
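In standard GMM notation (a textbook statement, not specific to these slides), EM alternates two updates. The E-step computes the responsibility of component k for point $x_i$:

$$ \gamma_{ik} = \frac{\pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)} $$

The M-step then re-estimates the parameters from these responsibilities, with $N_k = \sum_{i=1}^{n} \gamma_{ik}$:

$$ \mu_k = \frac{1}{N_k} \sum_{i=1}^{n} \gamma_{ik}\, x_i, \qquad \Sigma_k = \frac{1}{N_k} \sum_{i=1}^{n} \gamma_{ik} (x_i - \mu_k)(x_i - \mu_k)^{\top}, \qquad \pi_k = \frac{N_k}{n} $$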
Advantages of EM algorithm
• The two basic steps of the EM algorithm, i.e. the E-step and the
M-step, are often straightforward to implement for many machine
learning problems.
• The solution to the M-step often exists in closed form.
• The value of the likelihood is guaranteed never to decrease after
an iteration.
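As a concrete illustration (not from the slides), here is a sketch of fitting a 3-component GMM with EM using scikit-learn's GaussianMixture; the toy data below is invented for the example:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy 2-D data: three Gaussian blobs (values invented for this example)
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(100, 2)),
    rng.normal(loc=[4.0, 0.0], scale=0.7, size=(100, 2)),
    rng.normal(loc=[2.0, 3.0], scale=0.6, size=(100, 2)),
])

# Fit a 3-component GMM; scikit-learn runs EM internally until the
# change in log-likelihood falls below its tolerance
gmm = GaussianMixture(n_components=3, covariance_type="full",
                      random_state=0).fit(X)

hard_labels = gmm.predict(X)        # hard assignment, like K-means output
soft_probs = gmm.predict_proba(X)   # soft assignment: P(component | point)
print(gmm.means_)                   # estimated component means
print(gmm.converged_, gmm.n_iter_)  # did EM converge, and in how many steps
```

The predict_proba output is what distinguishes this soft method from K-means: every point gets a probability of belonging to each cluster rather than a single label.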
Disadvantages of EM algorithm
• It has slow convergence.
• It is sensitive to the starting point and converges only to a local
optimum.
• It cannot discover K (the likelihood keeps growing with the number
of clusters).
• It takes both forward and backward probabilities into account, in
contrast to numerical optimization, which considers only forward
probabilities.