DSA5102 Lecture 9
Li Qianxiao
Department of Mathematics
Last Time
Until now, we have focused on supervised learning
• Datasets come in input-label pairs
• Goal is to learn their relationship for prediction
[Diagram: an oracle provides labels (cat / dog) for input images; the predictive model is trained to reproduce these labels on new inputs]
This time, we turn to unsupervised learning, where the dataset comes without labels.
Example goal: learn some task-agnostic patterns from the input data
Examples of Unsupervised Learning
Tasks: Dimensionality Reduction
https://media.geeksforgeeks.org/wp-content/uploads/Dimensionality_Reduction_1.jpg
Examples of Unsupervised Learning
Tasks: Clustering
https://upload.wikimedia.org/wikipedia/commons/thumb/c/c8/Cluster-2.svg/1200px-Cluster-2.svg.png
Examples of Unsupervised Learning
Tasks: Density Estimation
http://www.lherranz.org/wp-content/uploads/2018/07/blog_generativesampling.png
Why unsupervised learning?
• Labelled data is expensive to collect
• In some domains, labelled data is impossible to obtain
• Different application scenarios call for different learning paradigms
Principal Component Analysis
Review: Eigenvalues and Eigenvectors
• For a square matrix $A$, an eigenvector $v \neq 0$ with associated eigenvalue $\lambda$ satisfies $A v = \lambda v$
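As a quick numerical check (an illustrative sketch, not part of the original slides; the matrix is made up), NumPy's np.linalg.eig computes eigenpairs:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# w holds the eigenvalues; the columns of V are the eigenvectors.
w, V = np.linalg.eig(A)

# Verify A v = lambda v for the first eigenpair.
v, lam = V[:, 0], w[0]
assert np.allclose(A @ v, lam * v)
print(w)  # eigenvalues of A: 3 and 1
```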
Two Formulations
• Find the direction that maximizes the variance of the projected data
• Find the direction that minimizes the projection error
Derivation of PCA (Maximize Variance)
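The body of this slide did not survive extraction; the following is a standard reconstruction of the maximum-variance argument, assuming the data $x_1, \dots, x_N \in \mathbb{R}^d$ are centered, with covariance $\Sigma = \frac{1}{N}\sum_i x_i x_i^T$:

```latex
% Variance of the data projected onto a unit direction u:
\max_{\|u\| = 1} \; \frac{1}{N} \sum_{i=1}^{N} (u^T x_i)^2
  \;=\; \max_{\|u\| = 1} \; u^T \Sigma u
% Introducing a Lagrange multiplier for the constraint u^T u = 1 and
% setting the gradient to zero gives \Sigma u = \lambda u, so u must be
% an eigenvector of \Sigma. The objective value is then u^T \Sigma u = \lambda,
% which is maximized by the eigenvector with the largest eigenvalue.
```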
Derivation of PCA (Minimize Error)
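Again the derivation is a reconstruction (same assumptions as above). Since the orthogonal projection of $x_i$ onto the unit vector $u$ is $(u^T x_i)\,u$:

```latex
\min_{\|u\| = 1} \; \frac{1}{N} \sum_{i=1}^{N} \big\| x_i - (u^T x_i)\, u \big\|^2
  \;=\; \min_{\|u\| = 1} \; \frac{1}{N} \sum_{i=1}^{N} \Big( \|x_i\|^2 - (u^T x_i)^2 \Big)
% The \|x_i\|^2 term does not depend on u, so minimizing the projection
% error is equivalent to maximizing u^T \Sigma u: the two formulations
% of PCA yield the same principal direction.
```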
The PCA Algorithm
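A minimal NumPy sketch of the algorithm (the function name pca is mine, not from the slides): center the data, eigendecompose the covariance, and project onto the top $m$ eigenvectors:

```python
import numpy as np

def pca(X, m):
    """PCA via eigendecomposition. X: (N, d) data matrix, m: target dimension."""
    mean = X.mean(axis=0)
    Xc = X - mean                      # center the data
    cov = Xc.T @ Xc / len(X)           # (d, d) covariance matrix
    w, V = np.linalg.eigh(cov)         # eigh: symmetric input, ascending eigenvalues
    U_m = V[:, ::-1][:, :m]            # top-m eigenvectors (largest eigenvalues first)
    Z_m = Xc @ U_m                     # encode: latent representation
    X_rec = Z_m @ U_m.T + mean         # decode: reconstruction
    return Z_m, X_rec, U_m
```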
Simple Example
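For instance, continuing the sketch above on synthetic 2-D data that is nearly one-dimensional (a made-up example, since the slide's figure did not survive extraction):

```python
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
X = np.hstack([t, 3 * t]) + 0.1 * rng.normal(size=(200, 2))  # points near the line y = 3x

Z, X_rec, U = pca(X, m=1)
print(U[:, 0])                    # principal direction, roughly (1, 3)/sqrt(10) up to sign
print(np.mean((X - X_rec) ** 2))  # small reconstruction error
```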
Choosing The Embedding Dimension
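One common heuristic (my illustration; the slides may use a different criterion) is to keep the smallest $m$ whose eigenvalues capture a fixed fraction of the total variance:

```python
import numpy as np

def choose_m(X, threshold=0.95):
    """Smallest m whose cumulative explained variance exceeds the threshold."""
    Xc = X - X.mean(axis=0)
    w = np.linalg.eigvalsh(Xc.T @ Xc / len(X))[::-1]  # eigenvalues, descending
    ratio = np.cumsum(w) / np.sum(w)                  # cumulative explained variance
    return int(np.searchsorted(ratio, threshold) + 1)
```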
PCA in Feature Space (Example)
PCA in Feature Space
We define a vector of feature maps $\phi(x) = (\phi_1(x), \dots, \phi_k(x))$.
Then, we apply PCA to the transformed data $\phi(x_1), \dots, \phi(x_N)$ instead of the raw inputs, so that PCA can capture nonlinear structure.
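As an illustrative sketch (the feature map here is my choice, not from the slides), quadratic features let PCA detect curved structure such as a circle, reusing the pca function from the algorithm slide:

```python
import numpy as np

rng = np.random.default_rng(1)
theta = rng.uniform(0, 2 * np.pi, size=200)
X = np.stack([np.cos(theta), np.sin(theta)], axis=1)  # points on the unit circle

def phi(X):
    """Quadratic feature map: (x1, x2) -> (x1, x2, x1^2, x1*x2, x2^2)."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.stack([x1, x2, x1**2, x1 * x2, x2**2], axis=1)

Z, _, _ = pca(phi(X), m=2)  # ordinary PCA, applied in feature space
```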
Encoder: $Z_m = X U_m$
Decoder: $X' = Z_m U_m^T$
Latent: $Z_m$
Autoencoders
In this sense, the autoencoder is a nonlinear counterpart of PCA-based compression!
PCA: $Z_m = X U_m$, $\quad X' = Z_m U_m^T$
Autoencoder: $Z = f_{\mathrm{enc}}(X)$, $\quad X' = f_{\mathrm{dec}}(Z)$, where $f_{\mathrm{enc}}$ and $f_{\mathrm{dec}}$ are nonlinear maps (e.g. neural networks)
Neural Network Autoencoders
Given a dataset $\{x_i\}_{i=1}^N$, we solve the empirical risk minimization problem to minimize the distance between $x_i$ and its reconstruction $f_{\mathrm{dec}}(f_{\mathrm{enc}}(x_i))$:
$$\min_{f_{\mathrm{enc}},\, f_{\mathrm{dec}}} \; \frac{1}{N} \sum_{i=1}^{N} \big\| x_i - f_{\mathrm{dec}}(f_{\mathrm{enc}}(x_i)) \big\|^2$$
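A minimal PyTorch sketch of this objective (the architecture, sizes, and training data below are my own stand-ins, not from the slides):

```python
import torch
import torch.nn as nn

d, m = 784, 32  # input and latent dimensions (assumed for illustration)

encoder = nn.Sequential(nn.Linear(d, 128), nn.ReLU(), nn.Linear(128, m))
decoder = nn.Sequential(nn.Linear(m, 128), nn.ReLU(), nn.Linear(128, d))

opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

X = torch.randn(256, d)  # stand-in batch; real data would go here

for _ in range(100):
    X_rec = decoder(encoder(X))       # reconstruction f_dec(f_enc(x))
    loss = ((X - X_rec) ** 2).mean()  # empirical risk: mean squared error
    opt.zero_grad()
    loss.backward()
    opt.step()
```

With purely linear activations, the optimal encoder/decoder span the same subspace as the top-$m$ principal components, which is the precise sense in which the autoencoder generalizes PCA.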