Supervised vs unsupervised data overview

The document compares Supervised and Unsupervised Learning in machine learning, highlighting their definitions, key differences, types, algorithms, evaluation metrics, and real-world applications. Supervised Learning uses labeled data to predict outcomes, while Unsupervised Learning analyzes unlabeled data to find patterns. Examples include spam detection for Supervised Learning and customer segmentation for Unsupervised Learning.
Copyright © All Rights Reserved
Supervised vs. Unsupervised Learning: A Detailed Comparison

Machine learning is broadly categorized into Supervised and Unsupervised Learning based on the type of data used and the learning approach. Let’s dive into their differences with examples, algorithms, applications, and evaluation metrics.

1. Definition

Supervised Learning

Supervised learning is a machine learning approach where the model is trained on a dataset that includes input-output (X → Y) pairs. The goal is to learn a function that maps inputs to the correct outputs.
• Example:

  o A spam detection model is trained on emails labeled as spam or not spam, then classifies new emails.

  o A house price prediction model learns from past sales data (features such as size and location) to predict future prices.
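
As a minimal sketch of the house price example, the snippet below fits a straight line to a few labeled (size, price) pairs using ordinary least squares and then predicts the price for an unseen size. The data values and the names `fit_line` and `predict_price` are made up for illustration; a realistic model would use many more features and examples.

```python
# Hypothetical labeled data: X = size in square meters, Y = sale price.
sizes = [50.0, 70.0, 90.0, 110.0]
prices = [150_000.0, 210_000.0, 270_000.0, 330_000.0]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Training": learn the mapping X -> Y from the labeled pairs.
slope, intercept = fit_line(sizes, prices)


def predict_price(size):
    """Apply the learned mapping to a new, unlabeled input."""
    return slope * size + intercept


print(predict_price(100.0))
```

The key point is that the known prices (Y) steer the fit; without them there would be nothing to compare the predictions against.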
Unsupervised Learning

Unsupervised learning is a type of machine learning where the model learns patterns
from unlabeled data (only X) without explicit supervision. The goal is to discover
hidden structures or relationships in the data.
• Example:

  o Customer segmentation: Analyzing purchase behavior to group similar customers without predefined labels.

  o Anomaly detection: Identifying fraudulent transactions without prior fraud labels.
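
The customer segmentation example can be sketched with a small, dependency-free version of k-means. This is only an illustration under stated assumptions: the customer data is invented, the function name `kmeans` is hypothetical, and centroids are initialized from the first k points so that the run is deterministic (practical implementations usually use random or k-means++ initialization).

```python
import math


def kmeans(points, k, iterations=10):
    """Plain k-means (Lloyd's algorithm) on a list of equal-length tuples."""
    centroids = [points[i] for i in range(k)]  # deterministic start
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return labels, centroids


# Hypothetical customers: (orders per month, average basket value).
# Note that no labels are supplied, only the inputs X.
customers = [(1, 20), (9, 95), (2, 25), (10, 100), (1, 22), (8, 90)]
labels, centroids = kmeans(customers, k=2)
print(labels)
```

The algorithm is never told which customers belong together; the two groups emerge from distances between the points alone.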

2. Key Differences

| Feature | Supervised Learning | Unsupervised Learning |
|---|---|---|
| Data Type | Uses labeled data (X, Y) | Uses unlabeled data (only X) |
| Objective | Predict outcomes based on past data | Find patterns, clusters, or anomalies |
| Key Techniques | Classification, Regression | Clustering, Dimensionality Reduction, Anomaly Detection |
| Model Training | Learns a mapping between inputs (X) and outputs (Y) | Learns underlying structures in data |
| Application Areas | Spam detection, fraud detection, disease prediction, stock price forecasting | Customer segmentation, topic modeling, anomaly detection, feature extraction |
| Output Type | Discrete (classes) or Continuous (numeric values) | Groups, patterns, or reduced dimensions |
| Evaluation Metrics | Accuracy, Precision, Recall, F1-score, RMSE, R² score | Silhouette Score, Davies-Bouldin Index, Inertia |
| Examples | Email classification, sentiment analysis, self-driving car object detection | Market segmentation, recommendation systems, anomaly detection |

3. Types of Supervised and Unsupervised Learning

A) Supervised Learning Types & Algorithms

| Type | Description | Example Algorithms |
|---|---|---|
| Regression | Predicts a continuous numerical value | Linear Regression, Polynomial Regression, Decision Tree Regression, Random Forest Regression, SVR |
| Classification | Predicts a discrete category (class labels) | Logistic Regression, Decision Trees, Random Forest, SVM, KNN, Naïve Bayes, Neural Networks |

• Example:

  o Regression: Predicting house prices based on features like size, location, and number of rooms.

  o Classification: Predicting whether a customer will buy a product based on their demographics and past purchases.
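
To make the classification example concrete, here is a small sketch of K-Nearest Neighbors (KNN), one of the algorithms listed in the table above, written with only the standard library. The customer records, the feature choice (age, past purchases), and the function name `knn_predict` are assumptions for illustration. Features are left unscaled to keep the sketch short; in practice they would normally be standardized first.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labeled customers: (age, past purchases) -> bought the product?
train_X = [(22, 0), (25, 1), (30, 1), (41, 6), (45, 8), (52, 7)]
train_y = ["no", "no", "no", "yes", "yes", "yes"]

print(knn_predict(train_X, train_y, (43, 7)))  # close to the "yes" group
print(knn_predict(train_X, train_y, (24, 1)))  # close to the "no" group
```

The output is a discrete class label, in contrast to the continuous value a regression model returns.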
B) Unsupervised Learning Types & Algorithms

| Type | Description | Example Algorithms |
|---|---|---|
| Clustering | Groups similar data points together | K-Means, Hierarchical Clustering, DBSCAN, Gaussian Mixture Model (GMM) |
| Dimensionality Reduction | Reduces data complexity while preserving important features | PCA, t-SNE, Autoencoders |
| Anomaly Detection | Identifies rare or unusual data points | Isolation Forest, Local Outlier Factor (LOF), One-Class SVM |

• Example:

  o Clustering: Grouping customers based on shopping behavior for targeted marketing.

  o Dimensionality Reduction: Using PCA to reduce the number of features in a high-dimensional dataset for visualization.

  o Anomaly Detection: Identifying fraudulent credit card transactions.
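
The anomaly detection idea can be illustrated without any labels. The sketch below does not implement Isolation Forest, LOF, or One-Class SVM from the table; it swaps in a much simpler statistical rule, the modified z-score based on the median and the median absolute deviation (MAD). The transaction amounts, the function name `flag_outliers`, and the 3.5 threshold are assumptions chosen for illustration.

```python
import statistics


def flag_outliers(values, threshold=3.5):
    """Return values whose modified z-score exceeds `threshold`.

    The median and MAD are used because they are less distorted by
    extreme values than the mean and standard deviation.
    """
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread to measure against
    return [v for v in values if 0.6745 * abs(v - median) / mad > threshold]


# Hypothetical card transaction amounts; no fraud labels are provided.
amounts = [12.0, 15.0, 14.0, 13.0, 16.0, 15.0, 950.0]
print(flag_outliers(amounts))
```

A flagged value is only unusual relative to the rest of the data; whether it is actually fraudulent still has to be confirmed separately.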

4. Evaluation Metrics

Since Supervised Learning predicts known outcomes, we can directly compare predictions with actual values using:
Supervised Learning Metrics

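• For Classification:

  o Accuracy: Correct Predictions / Total Predictions

  o Precision: Of the items predicted positive, the fraction that are actually positive

  o Recall: Of the items that are actually positive, the fraction the model finds

  o F1-score: Harmonic mean of precision and recall

  o ROC-AUC: Measures how well the model distinguishes between classes across all decision thresholds

• For Regression:

  o Mean Squared Error (MSE) & Root Mean Squared Error (RMSE): Average squared prediction error, and its square root in the units of the target

  o R² Score (Coefficient of Determination): Proportion of the variance in the target that the model explains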
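
The metrics above follow directly from comparing predictions with true values. The following sketch computes most of them by hand (ROC-AUC is omitted because it needs predicted scores rather than hard labels). The function names and the small example arrays are hypothetical.

```python
import math


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def regression_metrics(y_true, y_pred):
    """MSE, RMSE and R² for numeric predictions."""
    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mse = ss_res / n
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": 1 - ss_res / ss_tot}


# Hypothetical labels (1 = spam) and predictions.
clf = classification_metrics([1, 0, 1, 1, 0, 0, 1, 0],
                             [1, 0, 0, 1, 0, 1, 1, 0])
# Hypothetical true values and predictions.
reg = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
print(clf)
print(reg)
```

In this example there are 3 true positives, 1 false positive, and 1 false negative, so accuracy, precision, recall, and F1 all come out to 0.75.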


Since Unsupervised Learning doesn’t have labeled data, its evaluation is trickier and
relies on intrinsic measures:
Unsupervised Learning Metrics
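• For Clustering:

  o Silhouette Score: Measures how similar a point is to its own cluster compared with the nearest other cluster. It ranges from -1 to 1; higher is better.

  o Davies-Bouldin Index: Measures compactness and separation of clusters. Lower is better.

  o Inertia (SSE in K-Means): Sum of squared distances from each point to its cluster centroid; measures how tightly clusters are formed. Lower means tighter clusters.

• For Dimensionality Reduction:

  o Explained Variance Ratio (PCA): Fraction of the total variance retained by each principal component.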
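
Two of the clustering metrics are simple enough to compute by hand. The sketch below, using only the standard library, evaluates inertia and the mean silhouette score for a given cluster assignment. The function names and the four example points are assumptions for illustration, and the silhouette calculation assumes at least two clusters.

```python
import math


def inertia(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(
        math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels)
    )


def silhouette_score(points, labels):
    """Mean silhouette coefficient; requires at least two clusters."""
    clusters = set(labels)
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        # a: mean distance to the other points in the same cluster.
        same = [math.dist(p, q)
                for j, (q, l) in enumerate(zip(points, labels))
                if l == lab and j != i]
        if not same:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(same) / len(same)
        # b: smallest mean distance to the points of any other cluster.
        b = min(
            sum(math.dist(p, q) for q, l in zip(points, labels) if l == other)
            / labels.count(other)
            for other in clusters if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical data: two well-separated pairs of points.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
labels = [0, 0, 1, 1]
centroids = [(0, 1), (10, 1)]
print(inertia(points, labels, centroids))
print(silhouette_score(points, labels))
```

For these points the inertia is 4.0 and the silhouette score is roughly 0.80, which is consistent with compact, well-separated clusters.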

5. Real-World Applications

| Field | Supervised Learning Example | Unsupervised Learning Example |
|---|---|---|
| Finance | Fraud detection, credit scoring | Market segmentation, anomaly detection in transactions |
| Healthcare | Disease prediction, patient risk assessment | Identifying unknown disease patterns, medical image segmentation |
| E-commerce | Product recommendation, customer churn prediction | Customer segmentation, shopping behavior analysis |
| Self-Driving Cars | Object detection (pedestrian, vehicle) | Identifying unknown road patterns |
| Marketing | Email spam classification | Customer segmentation for targeted advertising |
| Natural Language Processing (NLP) | Sentiment analysis, chatbot responses | Topic modeling, document clustering |
