Unsupervised learning
Announcements
▪ Last year exam posted on Moodle front page
▪ Quiz 1 grades posted.
▪ Next week: 22.11.2023, Quiz 2 - same format as last week
• Covers the topics from 23.10 to 06.11
Outline
▪ Unsupervised learning
• Dimensionality reduction: Principal Component Analysis (PCA)
• Clustering: k-Means
[Course overview: Introduction, Linear regression, Logistic regression, Feature engineering, Data statistics, Naive Bayes, KNN, Clustering, Dimensionality reduction, Neural networks, Convolutional neural networks, Decision trees]
Review of data statistics, ex: Quiz 1 grades
Mean: 7.63, Mode: 8, Median: 8, Standard deviation: 1.85
Brief review of last lecture
Introduction
Unsupervised learning
Unsupervised learning is a type of machine learning that looks for previously
undetected patterns in a data set with no pre-existing labels and with a minimum of
human supervision.
In the techniques that follow, we don't use labels anymore!
Dimensionality reduction
Motivation
Intuition
We have samples described by a series of features. We want to find a smaller set of new features that explain our samples, because:
• Fewer features are easier to visualize
• Some of the current features can be redundant
• Some of the current features are not very useful to describe our samples
[Figure: data points plotted against Feature 1 (x1) and Feature 2 (x2)]
Principal component analysis (PCA)
Approach to dimensionality reduction
How to find this smaller set of new features?
PCA: find the best linear combination of features to create new features that explain our samples better.
[Figure: data points plotted against Feature 1 (x1) and Feature 2 (x2)]
PCA
Projection of points onto a lower dimensional subspace
How to find this smaller set of new features?
The new feature w1 x1 + w2 x2 is a mix of length and weight that describes our samples better.
[Figure: data points plotted against Weight (x1) and Length (x2), with the direction of the new feature w1 x1 + w2 x2]
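As a sketch in standard notation (the unit vector w = (w1, w2) is my own symbol): projecting a point x onto the direction w gives the one-dimensional new feature

$$z = w^\top x = w_1 x_1 + w_2 x_2, \qquad \lVert w \rVert_2 = 1,$$

and the corresponding projected point in the original space is $z\,w$.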
Projection of a point onto a subspace
[Figure: a subspace, the distance of a point to the subspace, and the projection of the point onto the subspace]
Find a subspace to minimize the average distance of the data to it:
PCA chooses a subspace that minimizes the sum of distances of all data points to the subspace.
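Written out (in my own notation, with x_i the data points and P the orthogonal projection onto the candidate subspace, and using squared distances as is standard), this objective reads

$$\min_{P} \; \sum_{i=1}^{N} \operatorname{dist}(x_i, \text{subspace})^2 \;=\; \min_{P} \; \sum_{i=1}^{N} \lVert x_i - P x_i \rVert_2^2 .$$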
Formulation of PCA objective using the data matrix
Standardize data:
Frobenius norm of a matrix:
PCA objective:
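The formulas for this slide were lost in extraction; a standard reconstruction, with X the N × d data matrix and W a d × k matrix with orthonormal columns (notation mine), is:

Standardize the data: replace each entry by $x_{ij} \leftarrow (x_{ij} - \mu_j)/\sigma_j$, where $\mu_j$ and $\sigma_j$ are the mean and standard deviation of feature j.

Frobenius norm of a matrix: $\lVert A \rVert_F = \sqrt{\sum_{i,j} A_{ij}^2}$.

PCA objective: $\min_{W \,:\, W^\top W = I_k} \; \lVert X - X W W^\top \rVert_F^2$.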
Eigenvalue decomposition of a symmetric matrix
Solution to PCA using eigenvalue decomposition
Principal components
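Sketching the standard solution (notation mine): form the sample covariance $\Sigma = \tfrac{1}{N} X^\top X$ of the standardized data. Since $\Sigma$ is symmetric, it has an eigenvalue decomposition $\Sigma = Q \Lambda Q^\top$ with real eigenvalues $\lambda_1 \ge \dots \ge \lambda_d$ and orthonormal eigenvectors. The optimal W stacks the k eigenvectors with the largest eigenvalues; these columns are the principal components, and the reduced data is $Z = X W$ (an N × k matrix).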
Example: projection of data onto 2 principal components
PCA
Pseudo-code
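The pseudo-code itself did not survive extraction; below is a minimal NumPy sketch of the steps described above (standardize, eigendecompose the covariance, keep the top-k eigenvectors, project). Function and variable names are my own.

```python
import numpy as np

def pca(X, k):
    """Project the N x d data matrix X onto its top-k principal components."""
    # 1. Standardize: zero mean and unit variance per feature
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    # 2. Sample covariance matrix (d x d, symmetric)
    cov = X.T @ X / X.shape[0]
    # 3. Eigenvalue decomposition (eigh returns eigenvalues in ascending order)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # 4. Keep the k eigenvectors with the largest eigenvalues = principal components
    W = eigvecs[:, ::-1][:, :k]
    # 5. Project the standardized data onto the principal components
    return X @ W

Z = pca(np.random.rand(100, 5), k=2)   # toy usage: 100 samples, 5 features -> 2
```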
PCA example - distinguishing texts
Defining features
Each data sample is a document
There are d unique words in all the documents
A feature is positive for a document if the corresponding word appears in that document
TFIDF feature definition for documents
Term frequency of word … in document …
Document frequency of word ….
TFIDF for each word ….
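The slide's formulas are missing from the extracted text; the standard TF-IDF definitions (notation mine: word j, document i, N documents in total; several common variants exist) are roughly:

Term frequency: $\mathrm{tf}_{ij}$ = number of occurrences of word j in document i (often normalized by the document length).

Document frequency: $\mathrm{df}_j$ = number of documents that contain word j.

TF-IDF feature: $x_{ij} = \mathrm{tf}_{ij} \cdot \log\frac{N}{\mathrm{df}_j}$.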
PCA example
Based on “Principal Component Analysis” lecture of Stanford EE104
Distinguishing texts: The Critique of Pure Reason by Immanuel Kant and The Problems of Philosophy by Bertrand Russell
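A rough sketch of how such an experiment could be set up with scikit-learn (how the two books are split into short text chunks is an assumption, and the strings below are placeholders, not the lecture's actual pipeline):

```python
# Sketch: TF-IDF features for text chunks from two books, projected onto
# 2 principal components.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import PCA

chunks = ["text of one passage from Kant ...",      # placeholder passages,
          "text of one passage from Russell ..."]   # one string per chunk
X = TfidfVectorizer().fit_transform(chunks).toarray()  # documents x words
Z = PCA(n_components=2).fit_transform(X)               # documents x 2
print(Z)   # each chunk is now a point in 2D; chunks from the two books tend to separate
```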
Dimensionality reduction
Other techniques
There are several other techniques for dimensionality reduction:
Linear discriminant analysis (LDA)
Generalized discriminant analysis (GDA)
t-distributed Stochastic Neighbor Embedding (t-SNE)
Autoencoders
Autoencoder
Introduction
▪ An autoencoder is a type of neural network often used for dimensionality
reduction.
▪ Autoencoders are trained in an unsupervised manner, by minimizing the reconstruction error / loss $\sum_{i=1}^{N} L(x_i, \hat{x}_i)$
▪ Example: squared error, $L(x_i, \hat{x}_i) = \lVert x_i - \hat{x}_i \rVert_2^2$
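A minimal sketch of such an autoencoder in PyTorch (layer sizes, optimizer, and the random placeholder data are illustrative assumptions; the MNIST autoencoder shown on the next slide uses convolutional layers instead of the fully connected ones here):

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, d_in=784, d_latent=8):
        super().__init__()
        # Encoder: compress the input into a low-dimensional latent code
        self.encoder = nn.Sequential(nn.Linear(d_in, 128), nn.ReLU(),
                                     nn.Linear(128, d_latent))
        # Decoder: reconstruct the input from the latent code
        self.decoder = nn.Sequential(nn.Linear(d_latent, 128), nn.ReLU(),
                                     nn.Linear(128, d_in))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = AutoEncoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()                # squared reconstruction error

X = torch.rand(256, 784)              # placeholder data (e.g., flattened images)
for epoch in range(10):
    x_hat = model(X)                  # reconstruction x̂
    loss = loss_fn(x_hat, X)          # L(x, x̂) averaged over the batch
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```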
Autoencoder
Autoencoder vs. PCA
Top: some examples of the original MNIST test samples.
Middle: reconstructed output from an autoencoder with a latent space of 8 dimensions; this autoencoder uses convolutional layers and was trained on the MNIST training set.
Bottom: reconstructed output from PCA with 8 reduced dimensions.
Image credit: F. Fleuret, Deep Learning (EPFL)
Summary - dimensionality reduction
Used for
• Exploratory data analysis
• Visualizing data
• Reducing overfitting by reducing the feature dimension
PCA: an approach to dimensionality reduction
▪ Projects data onto a linear subspace
▪ Useful in case there is approximately linear dependence
between different features
▪ Easy to compute
▪ Connection to singular value decomposition (see Problem set 2)
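As a brief reminder of that connection (standard result, notation mine): if the standardized data matrix has singular value decomposition $X = U S V^\top$, the principal components are the columns of $V$ (the right singular vectors) ordered by singular value, and taking $W$ as the first k columns of $V$ makes $X W W^\top$ the best rank-k approximation of $X$ in Frobenius norm.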
Clustering
[Whiteboard sketch: machine learning splits into supervised learning (classification and regression; examples: linear regression, logistic regression, k-NN, Naive Bayes) and unsupervised learning (dimensionality reduction and clustering; examples: PCA, k-means).]
Clustering - definition and motivation
Goal: group data points as a first step to understand the data set
Examples (see 4.4 and 4.5 in the LinAlgebra book):
- Clustering music genres: grouping pieces of music based on the similarities in their audio features (beats per minute, duration, loudness, number of words, etc.)
Clustering introduction
▪ Toy example:
• Each data sample is a point in 2D
[Figure: 2D data points grouped into cluster 1, cluster 2, cluster 3, and cluster 4]
k-means approach to clustering
How to cluster data without labels?
Try to find similarity between groups of points
k-means: Group points based on their proximity
(in terms of distance in the feature space)
k-means
Introduction
▪ Given a set of unlabelled input samples {x_i}, group the samples into k clusters (k ∈ ℕ)
▪ k-Means idea:
• Identify k clusters of data points given N samples.
• Find prototype (representative) points μ1, μ2, . . . , μk representing the center of each cluster, and assign the other data points to the nearest cluster center.
k-Means
Preliminaries
▪ A single representative point for data:
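The formula for this representative is missing from the extracted slide; a standard choice (a basic fact about the mean, notation mine) is the point $z$ that minimizes the sum of squared distances to the data, which is the sample mean:

$$\arg\min_{z} \sum_{i=1}^{N} \lVert x_i - z \rVert_2^2 \;=\; \frac{1}{N} \sum_{i=1}^{N} x_i .$$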
k-Means
▪ Choose k clusters to represent data
▪ Determining the cluster a single point … belongs to
▪ Determining the cluster centres to minimize the distance of each point to its
assigned cluster
k-Means objective function
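The objective itself did not survive extraction; the standard k-means objective, in my own notation (with $c_i \in \{1, \dots, k\}$ the cluster assigned to point $x_i$), is

$$J(\mu_1, \dots, \mu_k, c_1, \dots, c_N) \;=\; \sum_{i=1}^{N} \lVert x_i - \mu_{c_i} \rVert_2^2 ,$$

minimized jointly over the cluster centers and the assignments.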
Algorithm - heuristic
1. Initialize {μ1, μ2, . . . , μk} (e.g., randomly)
2. While not converged
1. Assign each point …. to the nearest center
2. Update each center μj based on the points assigned to it
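A minimal NumPy sketch of this heuristic (function name, random initialization scheme, and stopping rule are my own choices):

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    """Cluster the rows of X into k groups with the heuristic above."""
    rng = np.random.default_rng(seed)
    # Step 1: initialize the centers with k randomly chosen data points
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    assign = None
    for _ in range(max_iter):
        # Step 2.1: assign each point to the nearest center (Euclidean distance)
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_assign = dists.argmin(axis=1)
        if assign is not None and np.array_equal(new_assign, assign):
            break                                   # assignments stopped changing
        assign = new_assign
        # Step 2.2: recompute each center as the mean of its assigned points
        for j in range(k):
            if np.any(assign == j):                 # keep old center if cluster is empty
                centers[j] = X[assign == j].mean(axis=0)
    return centers, assign

centers, labels = kmeans(np.random.rand(200, 2), k=4)   # toy usage
```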
k-Means
Algorithm - Details
▪ Step 2.1: Assign each point…. to the nearest center
• For each point …, compute the Euclidean distance to every center
{μ1, μ2, . . . , μk}
• Find the smallest distance
• The point is said to be assigned to the corresponding cluster (note that each
point is assigned to a single cluster)
▪ Step 2.2: Update each center μj based on the points assigned to it
• Recompute each center μj as the mean of the points that were assigned to it
k-Means
Algorithm - Convergence
▪ Step 2 is repeated while k-Means has not converged
• What criterion should we use to stop iterating?
• A fixed number of iterations? It is arbitrary, and a number that is too small can lead to bad results
• The change in assignments or in center locations between two iterations can be used as a stopping criterion for the algorithm
▪ k-Means does not always converge to the best solution
k-Means
Example
▪ Use the Palmer Penguins dataset
▪ With Flipper length against Bill length
[Figure: flipper length vs. bill length, shown with labels and without labels]
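A sketch of how this example could be set up (using seaborn's copy of the Palmer Penguins data and an assumed choice of k = 3; both are my assumptions, not taken from the slides):

```python
# Sketch: k-means on two penguin features, ignoring the species labels.
import seaborn as sns
from sklearn.cluster import KMeans

df = sns.load_dataset("penguins").dropna()                  # Palmer Penguins data
X = df[["flipper_length_mm", "bill_length_mm"]].to_numpy()
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print(labels[:10])   # cluster index for the first few penguins
```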
k-Means
Example
▪ Centroid initialisation
k-Means
Example
▪ First assignment
k-Means
Example
▪ Next assignment
k-Means
Example
▪ Next assignment
k-Means
Example
▪ Final assignment
Summary - Clustering
Used for understanding data
Examples:
Topic discovery in a large set of documents
Recommendation engines
Guessing missing entries
k-means: an approach to clustering
Easy to implement and to interpret
The k-means algorithm converges. However, the optimization is non-convex: the solution depends on the initial conditions
Further reading
▪ PCA: importance of standardisation; PCA on StatQuest
▪ k-means: Chapter 4 of the LinAlgebra book
▪ For coding PCA/k-means, see scikit-learn