Clustering in Machine Learning Notes

Uploaded by

kunal b malviya

UNIT-II: Clustering in Machine Learning

Clustering in Machine Learning:

-------------------------------

1. Types of Clustering Methods:

- Partitioning Clustering: Involves dividing the data into distinct, non-overlapping clusters.

- Distribution Model-Based Clustering: Assumes the data is generated by a mixture of underlying probability distributions.

- Hierarchical Clustering: Builds a hierarchy of clusters either agglomeratively (bottom-up) or divisively (top-down).

- Fuzzy Clustering: Allows a data point to belong to multiple clusters with varying degrees of membership.
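The difference between a hard (partitioning) assignment and fuzzy memberships can be illustrated with a small sketch. It assumes 1-D points, two hypothetical cluster centers, and the standard fuzzy c-means membership formula with fuzzifier m = 2; none of these values come from the notes.

```python
def hard_assignment(point, centers):
    """Partitioning view: index of the single nearest center."""
    distances = [abs(point - c) for c in centers]
    return distances.index(min(distances))


def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy view: fuzzy c-means membership degree per center (degrees sum to 1)."""
    distances = [abs(point - c) for c in centers]
    zeros = [d == 0.0 for d in distances]
    # A point lying exactly on one or more centers belongs only to those centers.
    if any(zeros):
        share = 1.0 / sum(zeros)
        return [share if z else 0.0 for z in zeros]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


centers = [0.0, 4.0]  # hypothetical cluster centers (assumption)
point = 1.0
print(hard_assignment(point, centers))    # index of the nearest center
print(fuzzy_memberships(point, centers))  # one membership degree per center
```

Here the point is assigned wholly to the first center in the hard case, while the fuzzy memberships come out at roughly 0.9 and 0.1.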

2. BIRCH Algorithm:

- A clustering algorithm that constructs a CF (Clustering Feature) tree for efficient clustering of large datasets.

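- A threshold limits the size (radius or diameter) of each leaf subcluster in the tree. If the tree outgrows the available memory, the threshold is raised and the tree is rebuilt into a smaller one, trading some clustering quality for efficiency.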
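The summary stored in a CF tree node can be sketched as follows. This is a minimal illustration of a single Clustering Feature (point count, linear sum, sum of squared norms) and a threshold test on the radius, not a full CF tree; the class name, the helper try_absorb, and the choice of radius rather than diameter are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class ClusteringFeature:
    """CF summary of a subcluster: point count, linear sum, sum of squared norms."""
    n: int
    linear_sum: tuple
    squared_sum: float

    @classmethod
    def from_point(cls, point):
        return cls(1, tuple(point), sum(x * x for x in point))

    def merge(self, other):
        """CFs are additive, so merging never revisits the raw points."""
        return ClusteringFeature(
            self.n + other.n,
            tuple(a + b for a, b in zip(self.linear_sum, other.linear_sum)),
            self.squared_sum + other.squared_sum,
        )

    def centroid(self):
        return tuple(s / self.n for s in self.linear_sum)

    def radius(self):
        """Root-mean-square distance of the summarized points from the centroid."""
        centroid_norm_sq = sum(c * c for c in self.centroid())
        # max() guards against tiny negative values caused by rounding.
        return math.sqrt(max(0.0, self.squared_sum / self.n - centroid_norm_sq))


def try_absorb(cf, point, threshold):
    """Hypothetical helper: merged CF if the point fits the threshold, else None."""
    candidate = cf.merge(ClusteringFeature.from_point(point))
    return candidate if candidate.radius() <= threshold else None


cf = ClusteringFeature.from_point((0.0, 0.0))
print(try_absorb(cf, (2.0, 0.0), threshold=1.5))   # absorbed: merged radius is 1.0
print(try_absorb(cf, (10.0, 0.0), threshold=1.5))  # rejected: merged radius is 5.0
```

Because the three stored quantities simply add up, a subcluster can absorb points or merge with another subcluster without keeping the original data in memory.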

3. CURE Algorithm:

- A hierarchical clustering algorithm designed to handle large datasets.

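- CURE represents each cluster by a fixed number of well-scattered representative points that are shrunk toward the cluster centroid, a middle ground between centroid-based and all-points methods. This lets it find non-spherical clusters and makes it less sensitive to outliers.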
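The representative-point idea can be sketched as below. This covers only the selection and shrinking of representatives and the resulting cluster distance, not the complete algorithm; the farthest-point selection heuristic, the function names, and the values of c and alpha are assumptions chosen for illustration.

```python
import math


def centroid(points):
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))


def scattered_points(points, c):
    """Pick up to c well-scattered points with a farthest-point heuristic."""
    center = centroid(points)
    chosen = [max(points, key=lambda p: math.dist(p, center))]
    while len(chosen) < min(c, len(points)):
        # Next pick: the point farthest from everything already chosen.
        chosen.append(
            max(points, key=lambda p: min(math.dist(p, q) for q in chosen))
        )
    return chosen


def representatives(points, c=4, alpha=0.3):
    """Shrink each scattered point toward the centroid by the fraction alpha."""
    center = centroid(points)
    return [
        tuple(x + alpha * (m - x) for x, m in zip(p, center))
        for p in scattered_points(points, c)
    ]


def cluster_distance(reps_a, reps_b):
    """Distance between clusters: closest pair of representatives."""
    return min(math.dist(a, b) for a in reps_a for b in reps_b)


cluster_a = [(0.0, 0.0), (4.0, 0.0), (2.0, 0.0)]  # toy data (assumption)
cluster_b = [(10.0, 0.0)]
reps_a = representatives(cluster_a, c=2, alpha=0.5)
reps_b = representatives(cluster_b, c=2, alpha=0.5)
print(reps_a)
print(cluster_distance(reps_a, reps_b))
```

Shrinking pulls extreme points inward, so a single outlier has less influence on which clusters look close to each other.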

4. Gaussian Mixture Models (GMM) and Expectation Maximization (EM):


- GMM is a probabilistic model that assumes all data points are generated from a mixture of several Gaussian distributions.

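- The EM algorithm estimates the parameters of the GMM (mixing weights, means, and covariances) by alternating two steps: the E-step computes each component's responsibility for each data point under the current parameters, and the M-step re-estimates the parameters from those responsibilities. An iteration never decreases the likelihood of the observed data.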
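The E-step and M-step can be sketched for the simplest case. This is a minimal illustration assuming 1-D data, exactly two components, and a simple initialization at the smallest and largest data values; the function names, the toy data, and the variance floor are assumptions for the example, not part of the notes.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, n_iter=50, var_floor=1e-6):
    """EM for a two-component 1-D Gaussian mixture (illustrative sketch)."""
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = max(var_floor, sum((x - overall_mean) ** 2 for x in data) / n)
    weights = [0.5, 0.5]
    means = [min(data), max(data)]
    variances = [overall_var, overall_var]
    log_likelihoods = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        log_likelihood = 0.0
        for x in data:
            joint = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(joint)
            log_likelihood += math.log(total)
            resp.append([j / total for j in joint])
        log_likelihoods.append(log_likelihood)
        # M-step: re-estimate weights, means, and variances from responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            variances[k] = max(
                var_floor,
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk,
            )
    return weights, means, variances, log_likelihoods


data = [-0.5, 0.0, 0.5, 9.5, 10.0, 10.5]  # toy data with two obvious groups
weights, means, variances, log_likelihoods = em_gmm_1d(data)
print(weights, means, variances)
```

On this toy data the estimated means settle near the centers of the two groups, and the recorded log-likelihood values do not decrease from one iteration to the next.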

5. Parameter Estimation:

- Maximum Likelihood Estimation (MLE): A method for estimating the parameters of a statistical model by maximizing the likelihood function.

- Maximum A Posteriori (MAP): A method similar to MLE that also incorporates prior information (a prior distribution over the parameters) and maximizes the posterior, which is proportional to the likelihood times the prior.
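The difference between the two estimates can be seen on a small worked example. It assumes a coin-flip (Bernoulli) model with a Beta(a, b) prior on the probability of heads, a standard textbook setting that is not taken from the notes; the function names and the counts are hypothetical.

```python
def bernoulli_mle(heads, flips):
    """MLE of the heads probability: the observed fraction of heads."""
    return heads / flips


def bernoulli_map(heads, flips, a, b):
    """MAP estimate under a Beta(a, b) prior: mode of the Beta posterior.

    Valid when heads + a > 1 and (flips - heads) + b > 1.
    """
    return (heads + a - 1) / (flips + a + b - 2)


heads, flips = 7, 10  # hypothetical observations
print(bernoulli_mle(heads, flips))             # uses the data only
print(bernoulli_map(heads, flips, a=2, b=2))   # pulled toward the prior mean 0.5
print(bernoulli_map(heads, flips, a=1, b=1))   # uniform prior: same as the MLE
```

With 7 heads in 10 flips the MLE is 0.7, while the Beta(2, 2) prior pulls the MAP estimate to 8/12, about 0.667. With a uniform Beta(1, 1) prior the two estimates coincide.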

6. Applications of Clustering:

- Image segmentation, market segmentation, anomaly detection, social network analysis, and document categorization are some common applications of clustering.
