
Probabilistic Models with Latent Variables

Foundations of Algorithms and Machine Learning (CS60020), IIT KGP, 2017: Indrajit Bhattacharya
Density Estimation Problem
• Learning from unlabeled data
• Unsupervised learning, density estimation

• Empirical distribution typically has multiple modes

Density Estimation Problem

[Figures omitted: examples of multimodal empirical distributions.
Sources: http://yulearning.blogspot.co.uk and http://courses.ee.sun.ac.za/Pattern_Recognition_813]

Density Estimation Problem
• Convex combination of unimodal pdfs gives a multimodal pdf:
  p(x) = Σ_k π_k p_k(x),  where π_k ≥ 0 and Σ_k π_k = 1   (numerical sketch below)

• Physical interpretation
• Sub-populations
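A minimal numerical sketch of this idea (all parameter values are illustrative, not from the slides): a convex combination of two unimodal Gaussian pdfs yields a density with two modes.

import numpy as np
from scipy.stats import norm

# Illustrative mixing weights (non-negative, summing to 1) and component parameters
pis = np.array([0.3, 0.7])
mus = np.array([-2.0, 3.0])
sigmas = np.array([0.5, 1.0])

xs = np.linspace(-5.0, 7.0, 200)
# p(x) = sum_k pi_k * N(x | mu_k, sigma_k^2): with well-separated components,
# the convex combination has one mode per component
p = sum(pi * norm.pdf(xs, mu, s) for pi, mu, s in zip(pis, mus, sigmas))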

Latent Variables
• Introduce a new latent variable Z_i ∈ {1, …, K} for each observation X_i
• Latent / hidden: not observed in the data

• Probabilistic interpretation
• Mixing weights: π_k = p(Z_i = k)
• Mixture densities: p_k(x) = p(X_i = x | Z_i = k)

Generative Mixture Model
• For i = 1, …, N: draw Z_i ~ Categorical(π), then draw X_i | Z_i = k from p_k(x)
  (sampling sketch below)

• Marginalizing out Z_i recovers the mixture distribution: p(x) = Σ_k π_k p_k(x)

[Plate notation: node Z_i with an arrow to node X_i, inside a plate replicated N times]
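A short sketch of this sampling process with hypothetical 1-D Gaussian components (values chosen only for illustration): draw Z_i from the mixing weights, then X_i from the selected component.

import numpy as np

rng = np.random.default_rng(0)
pis = np.array([0.3, 0.7])          # mixing weights p(Z_i = k)
mus = np.array([-2.0, 3.0])         # illustrative component means
sigmas = np.array([0.5, 1.0])       # illustrative component standard deviations

N = 1000
z = rng.choice(len(pis), size=N, p=pis)   # Z_i ~ Categorical(pi)
x = rng.normal(mus[z], sigmas[z])         # X_i | Z_i = k ~ N(mu_k, sigma_k^2)
# A histogram of x approximates the mixture density sum_k pi_k N(x | mu_k, sigma_k^2)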

Tasks in a Mixture Model
• Inference: compute the posterior over the latent variables, p(Z_i = k | X_i = x_i)

• Parameter estimation
• Find parameters that, e.g., maximize the likelihood
• Does not decouple according to components (classes), since the Z_i are unobserved
• Non-convex; many local optima

Example: Gaussian Mixture Model
• Model: for i = 1, …, N,
  Z_i ~ Categorical(π),  X_i | Z_i = k ~ N(μ_k, Σ_k)

• Inference: p(Z_i = k | x_i) ∝ π_k N(x_i | μ_k, Σ_k)

• Soft-max function of log π_k + log N(x_i | μ_k, Σ_k)   (sketch below)
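A sketch of this inference step under the notation assumed above: the responsibility r_ik = p(Z_i = k | x_i) is a soft-max over log π_k + log N(x_i | μ_k, Σ_k), computed stably with log-sum-exp.

import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal

def responsibilities(X, pis, mus, covs):
    """Posterior p(Z_i = k | x_i) for each row x_i of X and each component k."""
    # log p(x_i, Z_i = k) = log pi_k + log N(x_i | mu_k, Sigma_k)
    log_joint = np.stack(
        [np.log(pis[k]) + multivariate_normal.logpdf(X, mus[k], covs[k])
         for k in range(len(pis))], axis=1)                 # shape (N, K)
    # soft-max across components: subtract log p(x_i) = logsumexp_k log p(x_i, Z_i = k)
    return np.exp(log_joint - logsumexp(log_joint, axis=1, keepdims=True))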

Example: Gaussian Mixture Model
• Log-likelihood: ℓ(θ) = Σ_i log Σ_k π_k N(x_i | μ_k, Σ_k)   (sketch below)
• Which training instance comes from which component?

• No closed-form solution for maximizing ℓ(θ)

• Possibility 1: gradient descent, etc.

• Possibility 2: Expectation Maximization
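A small sketch of this log-likelihood for a 1-D GMM (notation assumed as above). The sum over components sits inside the logarithm, which couples the parameters and rules out a closed-form maximizer.

import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm

def gmm_loglik(x, pis, mus, sigmas):
    # log_joint[i, k] = log pi_k + log N(x_i | mu_k, sigma_k^2)
    log_joint = np.log(pis) + norm.logpdf(x[:, None], mus, sigmas)
    # marginalize the component inside the log, then sum over data points
    return logsumexp(log_joint, axis=1).sum()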

Expectation Maximization Algorithm
• Observation: if the values of the Z_i were known, the likelihood would be easy to maximize

• Key idea: iterative updates
• Given parameter estimates, "infer" all latent variables
• Given inferred variables, maximize the likelihood with respect to the parameters

• Questions
• Does this converge?
• What does this maximize?

Expectation Maximization Algorithm
• Complete log-likelihood: ℓ_c(θ) = Σ_i log p(x_i, z_i | θ)

• Problem: the z_i are not known

• Possible solution: replace ℓ_c(θ) with its conditional expectation

• Expected complete log-likelihood: Q(θ, θ^old) = E[ℓ_c(θ)],
  with the expectation taken w.r.t. p(z | x, θ^old), where θ^old are the current parameters

Expectation Maximization Algorithm
• For a mixture model:
  Q(θ, θ^old) = Σ_i Σ_k r_ik [ log π_k + log p(x_i | θ_k) ]

  where r_ik = p(Z_i = k | x_i, θ^old) are the responsibilities   (sketch below)

• Compare with the likelihood of a generative classifier: the same form, with soft responsibilities r_ik in place of observed labels
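A sketch (assumed 1-D Gaussian components and the notation above) of the expected complete log-likelihood, the quantity the M step will maximize given fixed responsibilities.

import numpy as np
from scipy.stats import norm

def expected_complete_loglik(x, r, pis, mus, sigmas):
    """Q(theta, theta_old) = sum_i sum_k r_ik [log pi_k + log N(x_i | mu_k, sigma_k^2)],
    where r (shape N x K) holds responsibilities computed under theta_old."""
    # log_joint[i, k] = log pi_k + log N(x_i | mu_k, sigma_k^2)
    log_joint = np.log(pis) + norm.logpdf(x[:, None], mus, sigmas)
    return np.sum(r * log_joint)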

Expectation Maximization Algorithm
• Expectation step
• Update the responsibilities r_ik based on the current parameters

• Maximization step
• Maximize the expected complete log-likelihood Q with respect to the parameters

• Overall algorithm
• Initialize all latent variables
• Iterate until convergence:
  • M step
  • E step

Example: EM for GMM
• E step: remains the same for all mixture models: r_ik ∝ π_k N(x_i | μ_k, Σ_k)

• M step: π_k = N_k / N,  μ_k = (1/N_k) Σ_i r_ik x_i,  Σ_k = (1/N_k) Σ_i r_ik (x_i − μ_k)(x_i − μ_k)ᵀ,  with N_k = Σ_i r_ik
  (full EM sketch below)

• Compare with the generative classifier: the same estimates, but weighted by soft responsibilities instead of observed labels
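A compact sketch of the full EM loop for a GMM with full covariances (assumed notation, not the slide's exact equations; the small ridge added to the covariances is only for numerical stability).

import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal

def em_gmm(X, K, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    N, D = X.shape
    pis = np.full(K, 1.0 / K)
    mus = X[rng.choice(N, K, replace=False)]                 # random data points as initial means
    covs = np.array([np.cov(X.T) + 1e-6 * np.eye(D)] * K)    # shared initial covariance
    for _ in range(n_iter):
        # E step: r_ik proportional to pi_k N(x_i | mu_k, Sigma_k)
        log_r = np.stack([np.log(pis[k]) + multivariate_normal.logpdf(X, mus[k], covs[k])
                          for k in range(K)], axis=1)
        log_r -= logsumexp(log_r, axis=1, keepdims=True)
        r = np.exp(log_r)
        # M step: weighted MLE with soft counts N_k = sum_i r_ik
        Nk = r.sum(axis=0)
        pis = Nk / N
        mus = (r.T @ X) / Nk[:, None]
        for k in range(K):
            Xc = X - mus[k]
            covs[k] = (r[:, k, None] * Xc).T @ Xc / Nk[k] + 1e-6 * np.eye(D)
    return pis, mus, covs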

Analysis of EM Algorithm
• Expected complete LL is a lower bound on LL
• EM iteratively maximizes this lower bound

• Converges to a local maximum of the loglikelihood

Bayesian / MAP Estimation
• Maximum-likelihood EM can overfit (e.g., a Gaussian component collapsing onto a single data point)
• Possible to perform MAP estimation instead of MLE in the M step

• EM is partially Bayesian
• Posterior distribution over latent variables
• Point estimate over parameters

• Fully Bayesian approach is called Variational Bayes

(Lloyd’s) K Means Algorithm
• Hard EM for the Gaussian mixture model
• Point estimate of parameters (as usual)
• Point estimate of latent variables: z_i = argmin_k ||x_i − μ_k||²
• Spherical Gaussian mixture components (shared covariance σ²I)

• Most popular "hard" clustering algorithm   (sketch below)
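A sketch of Lloyd's algorithm viewed as hard EM: the E step assigns each point to its nearest mean (a point estimate of Z_i instead of a posterior), and the M step recomputes each mean from its assigned points. Initialization and empty-cluster handling here are assumptions, not from the slides.

import numpy as np

def kmeans(X, K, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    means = X[rng.choice(len(X), K, replace=False)]   # initialize means at random data points
    for _ in range(n_iter):
        # hard E step: assign each x_i to the closest mean
        d = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        z = d.argmin(axis=1)
        # hard M step: each mean becomes the average of its assigned points
        means = np.array([X[z == k].mean(axis=0) if np.any(z == k) else means[k]
                          for k in range(K)])
    return means, z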

K Means Problem
• Given data x_1, …, x_N, find k "means" μ_1, …, μ_k and data assignments z_1, …, z_N that minimize
  Σ_i Σ_j z_ij ||x_i − μ_j||²

• Note: each z_i is a k-dimensional binary (one-hot) indicator vector

Model selection: Choosing K for GMM
• Cross validation
• Plot the likelihood on the training set and the validation set for increasing values of k   (illustrative sketch below)
• Likelihood on the training set keeps improving
• Likelihood on the validation set drops after the "optimal" k

• Does not work for k-means! Why?
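An illustrative sketch of this procedure, not code from the slides: it uses scikit-learn's GaussianMixture as a stand-in EM estimator on synthetic two-component data, and compares average log-likelihood on training and held-out data across values of K.

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# synthetic 1-D data from two well-separated components
X = np.concatenate([rng.normal(-2.0, 0.5, (500, 1)), rng.normal(3.0, 1.0, (500, 1))])
rng.shuffle(X)
X_train, X_val = X[:800], X[800:]

for K in range(1, 6):
    gmm = GaussianMixture(n_components=K, random_state=0).fit(X_train)
    # average log-likelihood per point: training keeps rising, validation peaks near the true K
    print(K, gmm.score(X_train), gmm.score(X_val))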

Principal Component Analysis: Motivation
• Dimensionality reduction
• Reduces the number of parameters to estimate
• Data often resides in a much lower dimension, e.g., on a line in a 3D space
• Provides "understanding"

• Mixture models are very restricted
• Latent variables restricted to a small discrete set
• Can we "relax" the latent variable to be continuous?

Classical PCA: Motivation
• Revisit K-means as matrix factorization: minimize the reconstruction error ||X − Z Wᵀ||²_F

• W: matrix whose columns are the cluster means

• Z: matrix whose rows are the (one-hot) cluster-membership vectors

• How can we relax Z and W?

Classical PCA: Problem
• Relax both: minimize the reconstruction error ||X − Z Wᵀ||²_F with

• X: N × D data matrix (one row per observation)
• Arbitrary real-valued Z of size N × L
• Orthonormal W of size D × L

Classical PCA: Optimal Solution
• Empirical covariance matrix Σ̂ = (1/N) Σ_i x_i x_iᵀ of the scaled and centered data
• Optimal Ŵ contains the L eigenvectors corresponding to the L largest eigenvalues of Σ̂

• Alternative solution via the Singular Value Decomposition (SVD) of the centered data matrix

• W contains the "principal components": the directions that capture the largest variance in the data   (sketch below)
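A sketch of classical PCA via the SVD route (assumed convention: rows of X are data points); the columns of W are the principal components and Z holds the low-dimensional scores.

import numpy as np

def pca(X, L):
    Xc = X - X.mean(axis=0)                    # center the data
    # right singular vectors of Xc are eigenvectors of the empirical covariance Xc.T @ Xc / N
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:L].T                               # D x L orthonormal principal components
    Z = Xc @ W                                 # N x L projections (scores)
    return W, Z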

Probabilistic PCA
• Generative model:
  p(z_i) = N(z_i | 0, I),  p(x_i | z_i) = N(x_i | W z_i + μ, Ψ), with Ψ forced to be diagonal
  (sampling sketch below)

• Latent linear models

• Factor analysis: general diagonal Ψ
• Special case: probabilistic PCA, with Ψ = σ²I
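A sampling sketch of this generative model with illustrative dimensions and parameters (not from the slides), using the probabilistic-PCA choice Ψ = σ²I.

import numpy as np

rng = np.random.default_rng(0)
D, L, N = 3, 1, 500
W = rng.normal(size=(D, L))                # factor loadings
mu = np.zeros(D)
sigma2 = 0.1                               # PPCA: isotropic noise, Psi = sigma^2 I

Z = rng.normal(size=(N, L))                # latent z_i ~ N(0, I)
X = Z @ W.T + mu + rng.normal(scale=np.sqrt(sigma2), size=(N, D))
# Marginally, x_i ~ N(mu, W W^T + sigma^2 I): a low-rank-plus-noise Gaussian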

Visualization of Generative Process

[Figure omitted: generative view of probabilistic PCA. From Bishop, PRML]

Relationship with Gaussian Density
• Marginal density: p(x_i) = N(x_i | μ, W Wᵀ + Ψ)

• Why does Ψ need to be restricted? With an unrestricted (full) Ψ, the noise term alone could model any covariance and the latent variables would add nothing

• Intermediate low-rank parameterization of the Gaussian covariance matrix, between full rank and diagonal
• Compare the number of parameters   (sketch below)
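An illustrative parameter count for a D-dimensional Gaussian covariance (D and L chosen arbitrarily): full vs. diagonal vs. the low-rank-plus-diagonal form W Wᵀ + Ψ used by latent linear models with L factors.

D, L = 100, 5
full = D * (D + 1) // 2        # symmetric full covariance
diag = D                       # diagonal covariance
low_rank = D * L + D           # W (D x L) plus a diagonal Psi
print(full, diag, low_rank)    # 5050 100 600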

EM for PCA: Rod and Springs

[Figure omitted: "rod and springs" physical analogy for EM in PCA. From Bishop, PRML]


Advantages of EM
• Simpler than gradient methods, which need explicit constraints (e.g., mixing weights on the simplex, positive-definite covariances)

• Handles missing data

• Easy path for handling more complex models

• Not always the fastest method

Summary of Latent Variable Models
• Learning from unlabeled data

• Latent variables
• Discrete: Clustering / Mixture models ; GMM
• Continuous: Dimensionality reduction ; PCA

• Summary / “Understanding” of data

• Expectation Maximization Algorithm

