
Unsupervised Learning Notes

This document provides an overview of machine learning, specifically focusing on unsupervised learning, which involves learning from unlabeled data to identify patterns. It explains various types of machine learning, including supervised, unsupervised, and reinforcement learning, along with real-life applications and common techniques such as clustering, dimensionality reduction, and association rule learning. Additionally, it details popular clustering algorithms like K-Means and Hierarchical Clustering, outlining their processes and objectives.
Copyright
© All Rights Reserved

UNSUPERVISED LEARNING

 What is Machine Learning?


 Machine Learning (ML) is a branch of Artificial
Intelligence (AI) that focuses on creating systems that
can learn from data and improve their performance
over time without being explicitly programmed.

 Machine learning is a method by which computers learn
from experience (data) to make predictions or decisions
without being directly told how to do it.

 How does it work?
 Instead of writing code to tell the computer what to do
step-by-step, we give it data and a model, and it
figures out the patterns or rules on its own.
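
As a rough illustration of "give it data and a model", the sketch below fits a straight line to a few house sizes and prices instead of hard-coding a pricing rule. The numbers are invented for illustration, and `fit_line` is a hypothetical helper name, not something from these notes. Note that this toy case uses labeled data, so it is a supervised example.

```python
# Toy data, invented for illustration: house size (m^2) and price.
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

# The "rule" (slope and intercept) is learned from the data, not written by hand.
slope, intercept = fit_line(sizes, prices)
predicted = slope * 90.0 + intercept  # price estimate for an unseen 90 m^2 house
```

Here the model is "a straight line" and the learning step is choosing the slope and intercept that best match the examples.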
 Types of Machine Learning:
 Supervised Learning – Learn from labeled data.
Example: Predicting house prices from past data
(features + price).
 Unsupervised Learning – Find patterns in unlabeled
data.
Example: Customer segmentation in marketing.
 Reinforcement Learning – Learn by trial and error with
rewards.
Example: A robot learning to walk.
 Real-Life Applications:
 Spam email detection
 Movie or product recommendations
 Facial recognition
 Self-driving cars
 Fraud detection in banking
 What is Unsupervised Learning?
 Unsupervised Learning is a type of machine learning
where the model learns from data that is not labeled —
meaning there are no predefined categories or
outcomes.
 In supervised learning, the data comes with answers
(like "spam" or "not spam").
In unsupervised learning, there are no answers — the
algorithm tries to find hidden patterns or structures in
the data on its own.

Example:
Imagine you have a bunch of customer data (age,
income, purchase habits), but no labels telling you
which group each customer belongs to.
With unsupervised learning, the algorithm might:
 Group similar customers together (this is called
clustering).

 Find unusual behavior (called anomaly detection).

 Common Techniques in Unsupervised Learning:

 Clustering – Group similar data points.
Example: Customer segmentation, grouping users by
behavior.
Algorithms: K-Means, Hierarchical Clustering.

 Dimensionality Reduction – Reduce the number of
features while keeping as much of the important
information as possible.
Example: Visualizing high-dimensional data in 2D.
Algorithms: PCA (Principal Component Analysis), t-SNE.

 Association Rule Learning – Discover relationships
between features (items) that tend to occur together.
Example: Market basket analysis (people who buy
bread often buy butter).
Algorithm: Apriori
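
To make the bread-and-butter example concrete, here is a minimal sketch of the two measures that association rule methods such as Apriori typically rely on: support and confidence. This is not the Apriori algorithm itself, only the arithmetic behind a single rule. The baskets are invented for illustration, and the function names are assumptions.

```python
# Hypothetical shopping baskets, invented for illustration.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Share of transactions with the antecedent that also hold the consequent."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)

# 3 of 5 baskets contain both bread and butter.
bread_butter_support = support({"bread", "butter"}, baskets)
# 3 of the 4 baskets with bread also contain butter.
bread_to_butter_conf = confidence({"bread"}, {"butter"}, baskets)
```

A rule like "bread implies butter" is usually kept only if both numbers exceed thresholds chosen by the analyst.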

 What is Clustering in Machine Learning?


 Clustering is an unsupervised learning technique that
involves grouping similar data points together based on
their features — without using any labels.

Simple Explanation: Imagine you have a basket of
mixed fruits, but they’re not labeled.
Clustering is like automatically grouping the fruits into
"apples", "oranges", and "bananas" based on their
shape, size, and color, without knowing the names.

 Objective of Clustering:

To divide a dataset into clusters, where:

 Data points in the same cluster are as similar to each
other as possible.
 Data points in different clusters are as dissimilar as
possible.

 Popular Clustering Algorithms:

 K-Means Clustering

1. You choose K (number of clusters).
2. The algorithm tries to find K groups by minimizing
the distance between each point and the center of
its cluster.
3. Fast and widely used.

 Hierarchical Clustering

1. Creates a tree-like structure of clusters.
2. No need to choose K upfront.
3. Good for visualizing with dendrograms.

 DBSCAN (Density-Based Spatial Clustering of
Applications with Noise)

1. Groups points based on how densely they are packed.
2. Can find clusters of arbitrary shape and size.
3. Good at finding outliers, because points in sparse
regions are labeled as noise.
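
Since DBSCAN is the only algorithm in this list without a step-by-step section, a minimal sketch may help. It assumes the two usual DBSCAN parameters, which these notes do not mention: `eps` (the neighborhood radius) and `min_pts` (how many points, counting the point itself, must lie within `eps` for a region to count as dense). It checks every pair of points, so it is only suitable for small datasets.

```python
import math

NOISE = -1  # label used for outliers

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, NOISE for outliers."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a dense region
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # j is in a dense region, keep expanding
                queue.extend(j_neighbors)
    return labels

# Two small dense groups and one isolated point (toy data).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

With these toy values the isolated point at (50, 50) receives the `NOISE` label, while the two dense groups get separate cluster labels.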

 K-Means Clustering

K-Means Clustering is an unsupervised machine learning
algorithm used to group similar data points into K distinct
clusters.

How K-Means Works (Step-by-Step):

1. Choose the number of clusters (K): You must decide
how many groups you want to divide the data into.
2. Initialize cluster centroids: Randomly pick K points from
the dataset as the initial centroids (cluster centers).
3. Assign each data point to the nearest centroid: Use
Euclidean distance to measure closeness.
4. Recalculate centroids: For each cluster, compute the
new centroid as the mean of all points in that cluster.
5. Repeat steps 3 and 4 until the assignments no longer
change (convergence) or a maximum number of
iterations is reached.
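
The five steps above can be sketched in plain Python as follows. This is a minimal illustration rather than a production implementation: the function name `k_means`, the `seed` parameter, and the choice to keep an old centroid when a cluster ends up empty are all assumptions made for this sketch.

```python
import math
import random

def k_means(points, k, max_iters=100, seed=0):
    """Plain K-Means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Step 2: pick K data points at random as the initial centroids.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(max_iters):
        # Step 3: assign each point to its nearest centroid (Euclidean distance).
        new_assignments = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Step 5: stop once no assignment changes.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Step 4: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # assumption: keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignments

# Step 1: choose K. The toy data has two visibly separate groups, so K = 2.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = k_means(points, k=2)
```

Because the starting centroids are random, which group gets label 0 and which gets label 1 can differ between runs. On less clearly separated data, the final clusters themselves can also depend on the starting points.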

Mathematics Behind It:

 Distance formula:

For a point $x = (x_1, x_2)$ and centroid $c = (c_1, c_2)$:

$$\text{Distance} = \sqrt{(x_1 - c_1)^2 + (x_2 - c_2)^2}$$
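
As a quick worked example of the distance formula, the sketch below computes it for one point and one centroid. It also adds up the squared distances inside a cluster, which is the quantity K-Means is usually described as minimizing. The helper names `euclidean` and `within_cluster_ss` are assumptions, and the code accepts any number of coordinates, not only two.

```python
import math

def euclidean(x, c):
    """Euclidean distance between two points with the same number of coordinates."""
    return math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, c)))

def within_cluster_ss(points, centroid):
    """Sum of squared distances from each point in a cluster to its centroid."""
    return sum(euclidean(p, centroid) ** 2 for p in points)

# Point (1, 2) and centroid (4, 6): sqrt(3^2 + 4^2) = sqrt(25) = 5
d = euclidean((1.0, 2.0), (4.0, 6.0))

# Two points one unit on either side of the centroid: 1^2 + 1^2 = 2
wcss = within_cluster_ss([(0.0, 0.0), (2.0, 0.0)], (1.0, 0.0))
```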

 Hierarchical Clustering

Hierarchical Clustering is an unsupervised learning method
that builds a tree-like structure (called a dendrogram) to
group data points based on similarity.

There are two main types:

1. Agglomerative (Bottom-Up) – Most Common

 Start with each data point as its own cluster.
 Merge the two closest clusters.
 Repeat until all points belong to a single cluster.

2. Divisive (Top-Down)

 Start with one big cluster.
 Split it recursively into smaller clusters.
 Stop when each point is in its own cluster.
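
The agglomerative (bottom-up) procedure can be sketched as below. The notes do not say how the distance between two clusters is measured, so this sketch assumes single linkage (the distance between the closest pair of members). The `n_clusters` parameter is also an assumption: it lets the merging stop early instead of always running down to one cluster. The recorded merge history is the information a dendrogram would draw.

```python
import math

def agglomerative(points, n_clusters=1):
    """Single-linkage agglomerative clustering; returns (clusters, merge history)."""
    clusters = [[i] for i in range(len(points))]  # each point starts as its own cluster
    merges = []

    def linkage(a, b):
        # Single linkage (assumed): distance between the closest pair of members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find the two closest clusters.
        a, b = min(
            ((x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merges.append((clusters[a], clusters[b], linkage(clusters[a], clusters[b])))
        # Replace the two clusters with their union.
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)] + [merged]
    return clusters, merges

# Toy data: two tight pairs and one far-away point.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
clusters, merges = agglomerative(points, n_clusters=2)
```

Each entry in `merges` holds the two clusters that were joined (as lists of point indices) and the distance at which they were joined. Other linkage choices, such as complete or average linkage, can produce different trees on the same data.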
