CS772 Lec1
▪ Proration: If you miss any quiz/mid-sem, we can prorate it using end-sem marks
▪ Proration only allowed on limited grounds (e.g., health related)
Textbooks and Readings
▪ Some books that you may use as reference (freely available online)
▪ Kevin P. Murphy, Probabilistic Machine Learning: An Introduction (PML-1), The MIT Press, 2022.
▪ Kevin P. Murphy, Probabilistic Machine Learning: Advanced Topics (PML-2), The MIT Press, 2022.
▪ Chris Bishop, Pattern Recognition and Machine Learning (PRML), Springer, 2007.
▪ Chris Bishop and Hugh Bishop, Deep Learning: Foundations and Concepts (DLFC), Springer, 2023.
▪ Follow the suggested readings for each lecture (which may also include some portions from these books), rather than trying to read these books in a linear fashion
Probabilistic Machine Learning
▪ Machine Learning primarily deals with
▪ Predicting the output $y_*$ for new (test) inputs $\boldsymbol{x}_*$, given training data $(\boldsymbol{X}, \boldsymbol{y}) = \{(\boldsymbol{x}_i, y_i)\}_{i=1}^{N}$
▪ Generating new (synthetic) data, given some training data $\boldsymbol{X} = \{\boldsymbol{x}_i\}_{i=1}^{N}$
▪ Probabilistic ML gives a natural way to solve both these tasks (with some advantages)
▪ Prediction: Learning the predictive distribution $p(y_* \mid x_*, \boldsymbol{X}, \boldsymbol{y})$
Using this, we can not only get the mean but also the variance (uncertainty) of the predicted output $y_*$
▪ Generation: Learning a generative model of data $p(\boldsymbol{x}_* \mid \boldsymbol{X})$
Can "sample" (simulate) from this distribution to generate new data
▪ Both are conditional distributions. PML is about estimating these distributions accurately and efficiently; estimating them exactly is hard in general, but we can use approximations
▪ At its core, both problems require estimating the underlying distribution of data
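Below is a minimal sketch (not from the course material) of these two tasks on toy 1-D data. It uses simple maximum-likelihood point estimates rather than full posteriors, just to show that both prediction and generation reduce to estimating a distribution and then querying or sampling it; all variable names and numbers are illustrative.

```python
# A minimal sketch (not from the course) of the two tasks, using maximum-likelihood
# point estimates in place of full posteriors, just to show that both tasks reduce to
# estimating a distribution and then querying/sampling it.
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D training data: inputs X and outputs y = 2x + noise
X = rng.normal(size=100)
y = 2.0 * X + rng.normal(scale=0.5, size=100)

# --- Prediction: estimate a conditional distribution p(y* | x*) ---
# Here: a linear-Gaussian model fit by least squares; sigma2 is the residual variance.
w = np.sum(X * y) / np.sum(X * X)
sigma2 = np.mean((y - w * X) ** 2)
x_star = 1.5
print("p(y*|x*) ~ Normal(mean=%.2f, var=%.2f)" % (w * x_star, sigma2))

# --- Generation: estimate a distribution p(x) over inputs and sample from it ---
mu, var = X.mean(), X.var()
new_samples = rng.normal(mu, np.sqrt(var), size=5)  # "simulate" new synthetic inputs
print("synthetic inputs:", np.round(new_samples, 2))
```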
Probabilistic Machine Learning
▪ With a probabilistic approach to ML, we can also easily incorporate “domain knowledge”
▪ Can specify our assumptions about data using suitable probability distributions over inputs/outputs, usually in one of these forms, where $\theta$ denotes the unknown parameters of the distribution (a small sketch follows this list):
▪ $p(y_n \mid x_n, \theta)$: probability distribution of the output as a function of the input
▪ $p(x_n \mid y_n, \theta)$: distribution of the input conditioned on its "label/output"
▪ $p(x_n \mid \theta)$: distribution of the inputs
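Here is a minimal sketch (my own example, with hypothetical parameter values standing in for $\theta$) of evaluating the three modeling forms above: a logistic model for $p(y_n \mid x_n, \theta)$, a class-conditional Gaussian for $p(x_n \mid y_n, \theta)$, and a Gaussian for $p(x_n \mid \theta)$.

```python
# A minimal sketch (my own example, not from the slides) of the three modeling forms,
# with hypothetical parameter values theta plugged in for concreteness.
import numpy as np
from scipy.stats import norm

x_n, y_n = 1.2, 1          # one toy input/output pair
w, b = 0.8, -0.1           # hypothetical parameters theta for the discriminative model

# p(y_n | x_n, theta): output distribution as a function of the input
# (here a Bernoulli whose probability is a logistic function of x_n)
p_y_given_x = 1.0 / (1.0 + np.exp(-(w * x_n + b)))
print("p(y_n=1 | x_n, theta) =", round(p_y_given_x, 3))

# p(x_n | y_n, theta): input distribution conditioned on the label
# (here a class-conditional Gaussian with class-specific means)
class_means = {0: -1.0, 1: +1.0}
print("p(x_n | y_n, theta) =", round(norm.pdf(x_n, loc=class_means[y_n], scale=1.0), 3))

# p(x_n | theta): distribution of the inputs alone (unsupervised modeling)
print("p(x_n | theta) =", round(norm.pdf(x_n, loc=0.0, scale=2.0), 3))
```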
▪ Can specify our assumptions about the unknowns 𝜃 using a “prior distribution”
$p(\theta)$: represents our belief about the unknown parameters before we see the data
▪ After seeing some data $\mathcal{D}$, can update the prior into a posterior distribution $p(\theta \mid \mathcal{D})$
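A minimal prior-to-posterior sketch (my own example, not from the slides), using the conjugate Beta-Bernoulli pair: a Beta prior on a coin's bias $\theta$ is updated to a Beta posterior simply by adding the observed counts of heads and tails.

```python
# A minimal prior-to-posterior sketch (illustrative example), using the conjugate
# Beta-Bernoulli pair: prior Beta(a, b) over a coin's bias theta, data D = coin flips.
# The posterior is again a Beta, obtained by adding the observed counts to (a, b).
import numpy as np

a, b = 2.0, 2.0                      # prior p(theta) = Beta(a, b): belief before data
D = np.array([1, 0, 1, 1, 1, 0, 1])  # observed flips (1 = heads)

heads, tails = D.sum(), len(D) - D.sum()
a_post, b_post = a + heads, b + tails  # posterior p(theta | D) = Beta(a_post, b_post)

print("prior mean of theta    :", a / (a + b))
print("posterior mean of theta:", a_post / (a_post + b_post))
```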
The Core of PML: Two Basic Rules of Probability
▪ Sum Rule (marginalization): Distribution of $a$ obtained by considering all possibilities of $b$
$p(a) = \sum_b p(a, b)$ if $b$ is a discrete r.v., or $p(a) = \int p(a, b)\, db$ if $b$ is a continuous r.v.
▪ Product Rule
$p(a, b) = p(a)\, p(b \mid a) = p(b)\, p(a \mid b)$
▪ These two rules are the core of most of probabilistic/Bayesian ML
▪ Bayes rule is easily derived from the sum and product rules
$p(b \mid a) = \dfrac{p(b)\, p(a \mid b)}{p(a)} = \dfrac{p(b)\, p(a \mid b)}{\int p(a, b)\, db}$   (assuming $b$ is a continuous r.v.)
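A quick worked example with hypothetical numbers (not from the slides), showing the sum, product, and Bayes rules in action for a binary $b$ and a binary observation $a$:

```latex
% Toy discrete example (hypothetical numbers) of the sum, product, and Bayes rules.
% Prior: p(b{=}1) = 0.3, p(b{=}0) = 0.7. Likelihood: p(a{=}1 \mid b{=}1) = 0.9, p(a{=}1 \mid b{=}0) = 0.2.
\begin{align*}
p(a{=}1) &= \sum_b p(a{=}1, b) = p(b{=}1)\,p(a{=}1 \mid b{=}1) + p(b{=}0)\,p(a{=}1 \mid b{=}0) \\
         &= 0.3 \times 0.9 + 0.7 \times 0.2 = 0.41, \\[4pt]
p(b{=}1 \mid a{=}1) &= \frac{p(b{=}1)\,p(a{=}1 \mid b{=}1)}{p(a{=}1)} = \frac{0.27}{0.41} \approx 0.66.
\end{align*}
```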
ML and Uncertainty
(and how PML handles uncertainty)
Uncertainty due to Limited Training Data
▪ Model/parameter uncertainty is due to not having enough training data
[Figure: one example showing the same model class (linear models) but uncertainty about the weights, and another showing uncertainty not just about the weights but also about the model class]
Image credit: Balaji L, Dustin T, Jasper N. (NeurIPS 2020 tutorial)
Uncertainty due to Inherent Noise in Training Data
▪ Data uncertainty can be due to various reasons, e.g.,
▪ Intrinsic hardness of labeling, class overlap
▪ Labeling errors/disagreements (for difficult training inputs)
▪ Noisy or missing features
Image credit: Eric Nalisnick
Image source: "Improving machine classification using human uncertainty measurements" (Battleday et al, 2021)
Image source: "Aleatoric and epistemic uncertainty in machine learning: an introduction to concepts and methods" (H&W 2021)
How to Estimate Uncertainty?
In this course, we will mostly focus on the Bayesian approach, but the other two approaches are also popular and will also be discussed.
▪ Uncertainty in predictions: Usually estimated by computing and reporting the mean and variance of predictions made using many possible values of $\theta$ (see the sketch below). Commonly reported as:
▪ Predictive distribution $p(y_* \mid x_*, \mathcal{D})$: can get both the mean and the variance/quantiles of the prediction
▪ Sets/intervals of possible predictions
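A minimal sketch (my own example, under simplified assumptions: a 1-D Bayesian linear regression with a Gaussian prior on the weight and known noise variance) of this recipe: sample many plausible values of $\theta$ from the posterior, predict with each, and report the mean and variance of those predictions.

```python
# A minimal sketch (illustrative example) of reporting predictive uncertainty by
# averaging over many plausible parameter values theta. Here theta is the weight of a
# 1-D Bayesian linear regression with a Gaussian prior and known noise variance, so
# the posterior over theta is Gaussian and easy to sample from.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=30)
y = 2.0 * X + rng.normal(scale=0.5, size=30)

sigma2, tau2 = 0.25, 10.0                      # assumed noise variance and prior variance
post_var = 1.0 / (np.sum(X * X) / sigma2 + 1.0 / tau2)
post_mean = post_var * np.sum(X * y) / sigma2  # posterior p(theta | D) = N(post_mean, post_var)

# Monte Carlo estimate of the predictive mean and variance at a test input x*
x_star = 1.5
theta_samples = rng.normal(post_mean, np.sqrt(post_var), size=5000)
pred_samples = theta_samples * x_star + rng.normal(scale=np.sqrt(sigma2), size=5000)

print("predictive mean    :", pred_samples.mean())
print("predictive variance:", pred_samples.var())
```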
Predictive Uncertainty
▪ Information about uncertainty gives an idea about how much to trust a prediction
▪ It can also “guide” us in sequential decision-making:
$p(y_* \mid x_*, \mathcal{D}) = \mathcal{N}(y_* \mid \mu_*, \sigma_*^2)$, where $y_*$ is the test output and $x_*$ is the test input
[Figure: training data with a blue curve showing the mean of the function learned so far using the available data; the shaded region denotes the current predictive uncertainty]
Given our current estimate of the regression function, which training input(s) should we add next to improve its estimate the most? Uncertainty can help here: acquire training inputs from regions where the function is most uncertain about its current predictions (a sketch follows below).
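A minimal sketch (my own example, reusing the same simplified Bayesian linear regression setup) of this uncertainty-guided acquisition idea: among a pool of candidate inputs, pick the one with the largest current predictive variance.

```python
# A minimal sketch (illustrative example) of uncertainty-guided data acquisition:
# among candidate inputs, pick the one with the largest current predictive variance.
# Uses a 1-D Bayesian linear regression (Gaussian prior on the weight, known noise).
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=10)                # small initial training set
y = 2.0 * X + rng.normal(scale=0.5, size=10)

sigma2, tau2 = 0.25, 10.0
post_var = 1.0 / (np.sum(X * X) / sigma2 + 1.0 / tau2)   # posterior variance of the weight

# For this model, the predictive variance at input x is x^2 * post_var + sigma2,
# so it grows for inputs far from the region already covered by the data.
candidates = np.linspace(-3, 3, 61)
pred_var = candidates ** 2 * post_var + sigma2

next_x = candidates[np.argmax(pred_var)]
print("acquire next training input at x =", next_x)
```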
▪ Assume that both training and test data come from the same distribution
▪ This assumption, although standard, may be violated in real-world applications of ML and
there are “adaptation” methods to handle that