Unsupervised Learning
Machine Learning
Introduction
• A type of machine learning
• Input features only, with no labeled target attribute
• Draws inferences from datasets
• Explores the data to find intrinsic features
Examples
• Grouping people of similar sizes so that only 3 sizes need to
be made: “small”, “medium” and “large”.
interest.
Clustering
• The most common method of unsupervised learning.
• It analyzes data features to find hidden patterns by forming
groups, or clusters, in the data.
• The clusters are modeled using a measure of similarity, which
is defined by metrics such as Euclidean or probabilistic
distance.
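As a small illustration of a similarity measure, the sketch below computes the Euclidean distance between points. The function name euclidean_distance and the sample points are assumptions made for this example only; they are not part of the slides.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical sample points, chosen only for illustration.
p = (1.0, 2.0)
q = (4.0, 6.0)
r = (1.5, 2.5)

d_pq = euclidean_distance(p, q)
d_pr = euclidean_distance(p, r)

# A smaller distance means "more similar" under this metric,
# so p is closer to r than to q.
print(d_pq, d_pr)
```

Under this metric, points separated by a small distance are the ones a clustering method would tend to place in the same group.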
Clustering Algorithms
• Partitioning clustering: partitions data into k distinct clusters
based on distance to the centroid of a cluster.
• Hierarchical clustering: builds a multilevel hierarchy of clusters
by creating a cluster tree.
• Model-based: hypothesizes a model for each cluster and finds
the best fit of the models to the data.
• Density-based: guided by connectivity and density functions.
• Gaussian mixture models: models clusters as a mixture of
multivariate normal density components.
• Hidden Markov models: uses observed data to recover the
sequence of states.
Partitioning Clustering
• Uses the k-means algorithm.
• Partitions the data into k clusters.
• Each data point is assigned to the nearest mean (the cluster
center, or centroid), which serves as a prototype of the
cluster.
• Centroids are updated as data points are added to or removed
from clusters.
• Repeating the process iteratively yields the optimized clusters.
• Stops when the centroid values change minimally or not at
all.
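The steps above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the names k_means, mean_point, tol and max_iter, the choice of initial centroids, and the sample data are all assumptions made for this example.

```python
import math

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def k_means(points, initial_centroids, tol=1e-9, max_iter=100):
    """Assign points to the nearest centroid, then recompute centroids,
    until the centroids move by at most `tol` (or max_iter is reached)."""
    centroids = [tuple(c) for c in initial_centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: euclidean_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: a centroid becomes the mean of its cluster.
        # An empty cluster keeps its previous centroid (an assumption).
        new_centroids = [mean_point(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        shift = max(euclidean_distance(a, b)
                    for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:  # minimal or no change: stop
            break
    return centroids, clusters

# Hypothetical data: two visually separate groups, k = 2.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = k_means(points, initial_centroids=[points[0], points[4]])
print(centroids)
print(clusters)
```

The result depends on the initial centroids; here they are simply taken from the data, which is one common choice among several.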
Hierarchical Clustering
• This algorithm produces a nested sequence of clusters.
• The clusters are distinct from one another, and the data within
each cluster are broadly similar.
Types of Hierarchical Clustering
• Agglomerative (bottom-up) Clustering:
It builds the dendrogram from the bottom level, repeatedly merging
the most similar (nearest) pair of clusters.
It stops when all the clusters have been merged into a single root
node.
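The bottom-up merging described above can be sketched as follows. The slides do not say how the distance between two clusters is measured, so this example assumes single linkage (the distance between the closest pair of members); the names agglomerative and single_linkage and the sample data are likewise assumptions for illustration.

```python
import math

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage(c1, c2):
    """Cluster distance = distance between the closest pair of members
    (an assumed linkage; other linkages are possible)."""
    return min(euclidean_distance(a, b) for a in c1 for b in c2)

def agglomerative(points):
    """Start with one cluster per point and merge the nearest pair
    until a single root cluster remains. Returns the merge history
    (left cluster, right cluster, distance) and the root cluster."""
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        best = None  # (distance, i, j) of the nearest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges, clusters[0]

# Hypothetical one-dimensional data.
points = [(1.0,), (2.0,), (6.0,), (7.0,), (15.0,)]
merges, root = agglomerative(points)
for left, right, d in merges:
    print(left, "+", right, "at distance", d)
```

The merge history, read from first to last, corresponds to the levels of the dendrogram from the leaves up to the root.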