
Probabilistic Theory of Deep Learning
This course explores probabilistic deep learning, a robust approach that
embraces uncertainty for more powerful and insightful models.

PRESENTED BY:

S.RANJANI-B.E.(CSE)

IIIrd year
Fundamentals of Probability Theory
Probability theory is the foundation of probabilistic deep learning. It helps us
understand uncertainty and make informed decisions using data. We'll explore
key ideas like probability distributions, random variables, and Bayes' theorem.

1. Probability Distributions
Probability distributions show how likely different outcomes are for a random event. We'll learn about common distributions, such as the Gaussian, Bernoulli, and Poisson.

2. Random Variables
Random variables represent quantities with uncertain values. We'll explore different types, including discrete variables that take only whole-number values and continuous variables that can take any value.

3. Conditional Probability
Conditional probability helps us understand the chance of one event happening given that another event has already occurred. This is important for making predictions and understanding data.

4. Bayes' Theorem
Bayes' theorem helps us update our beliefs about something based on new information. It is a key part of probabilistic modeling and is used heavily in Bayesian deep learning.
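To make these ideas concrete, here is a minimal Python sketch (NumPy only) that draws samples from the distributions named above and estimates a conditional probability from those samples; the parameter values are illustrative assumptions, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Common distributions (parameters are illustrative assumptions).
gaussian = rng.normal(loc=0.0, scale=1.0, size=100_000)   # continuous random variable
bernoulli = rng.binomial(n=1, p=0.3, size=100_000)        # discrete: 0 or 1
poisson = rng.poisson(lam=4.0, size=100_000)              # discrete counts

# Conditional probability estimated from samples:
# P(X > 1 | X > 0) = P(X > 1 and X > 0) / P(X > 0) for the Gaussian variable.
p_gt0 = (gaussian > 0).mean()
p_gt1 = (gaussian > 1).mean()
print("P(X > 1 | X > 0) is approximately", p_gt1 / p_gt0)
```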
Understanding Uncertainty in Deep Learning
Uncertainty in deep learning (DL) refers to the model’s confidence or lack of
certainty about its predictions.

Uncertainty helps us understand the reliability of deep learning models and how confident we can be in their results. We'll explore various techniques to quantify uncertainty, including Bayesian methods and model ensembles.

Epistemic Uncertainty
This type of uncertainty arises from limited data or model imperfections. It can be reduced by gathering more data or improving the model.

Aleatoric Uncertainty
Aleatoric uncertainty arises from the inherent randomness of a system or its data and cannot be explained away. It can be characterized as the "noise" in the data and is irreducible: it cannot be reduced with additional information.
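As a rough illustration, here is a minimal NumPy sketch of how these two kinds of uncertainty are often estimated in practice: the spread of predictions across a hypothetical bootstrap ensemble of simple models stands in for epistemic uncertainty, while each model's estimated output noise stands in for aleatoric uncertainty. The data-generating process and ensemble size are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy data: y = 2x + noise; the noise term is the aleatoric part.
x = rng.uniform(0, 1, size=30)
y = 2 * x + rng.normal(0, 0.3, size=30)

# Hypothetical ensemble: fit several linear models on bootstrap resamples.
coefs, noise_estimates = [], []
for _ in range(20):
    idx = rng.integers(0, len(x), size=len(x))
    coef = np.polyfit(x[idx], y[idx], deg=1)            # slope and intercept
    residuals = y[idx] - np.polyval(coef, x[idx])
    coefs.append(coef)
    noise_estimates.append(residuals.std())             # per-model noise estimate

x_new = 0.5
preds = np.array([np.polyval(c, x_new) for c in coefs])
print("epistemic uncertainty (spread across the ensemble):", preds.std())
print("aleatoric uncertainty (average estimated noise):", np.mean(noise_estimates))
```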
Maximum Likelihood Estimation
MLE is a key method for estimating model parameters in probabilistic deep
learning. Instead of single predictions, probabilistic models produce probability
distributions over outcomes. MLE aims to find the parameter values that
maximize the likelihood of the observed data within the model's distribution.

Steps of Maximum Likelihood in Deep Learning

1. Define the Model

2. Write the Likelihood Function

3. Take the Log of the Likelihood (Log-Likelihood)

4. Maximize the Log-Likelihood

5. Estimate the Parameter


Let's break it down with a small example:
You have a coin, and you're trying to figure out the best guess for the probability of heads (Y) based on the observed data. You flip the coin 3 times and record the results.

Let’s say:

( X ) = outcome of the coin flip (either heads or tails)


( Y ) = probability of heads (we're trying to estimate this)

Data (Observed ( X ))

You flip the coin 3 times and get the following outcomes:

Flip 1: Heads (H)

Flip 2: Tails (T)


Flip 3: Heads (H)

1. Define the Model

The probability distribution is:

P(Heads) = Y
P(Tails) = 1 - Y

2. Likelihood Function

The likelihood function is a fundamental concept in statistics, especially in


the context of maximum likelihood estimation (MLE). It measures how well a
particular set of parameters explains the observed data.

The likelihood of observing the outcomes (H, T, H) is the product of the


probabilities for each outcome. Since we got heads, tails, and heads, the
likelihood is:

Likelihood = P(Heads) x P(Tails) x P(Heads) = Y x (1 - Y) x Y = Y^2 x (1 - Y)


3. Log-Likelihood
Since multiplying probabilities can get messy, we take the log of the likelihood
(log-likelihood):

Log-Likelihood = log(Y^2 x (1 - Y))

= log(Y^2) + log(1 - Y) (product rule)

= 2 log(Y) + log(1 - Y) (power rule)

4. Maximizing the Log-Likelihood

Let’s try guessing different values of ( Y ) (the probability of heads) and see
which gives us the highest log-likelihood.

Guess 1: ( Y = 0.5 ) (Fair Coin)

Log-Likelihood= 2 log(0.5) + log(1 - 0.5) = 2(-0.693) + (-0.693) = -2.079

Guess 2: ( Y = 0.8 ) (Biased towards Heads)

Log-Likelihood = 2 log(0.8) + log(1 - 0.8) = 2(-0.223) + (-1.609) = -2.055

Guess 3: ( Y = 0.6 ) (Biased slightly towards Heads)

Log-Likelihood = 2 log(0.6) + log(1 - 0.6) = 2(-0.511) + (-0.916) = -1.938

5. Estimate the Parameter

From the calculations, we see that the log-likelihood is highest at ( Y = 0.6 ) among our three guesses, so the coin is estimated to have roughly a 60% chance of landing heads based on the observed data. (The exact maximizer is the observed fraction of heads, Y = 2/3 ≈ 0.67, which this guess approximates.)
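The whole calculation can be checked numerically. Below is a minimal Python sketch (NumPy only) that evaluates the log-likelihood over a grid of candidate values of Y for the observed flips H, T, H and reports the best one; the grid resolution is an arbitrary illustrative choice.

```python
import numpy as np

# Observed data: H, T, H encoded as 1 = heads, 0 = tails.
flips = np.array([1, 0, 1])
heads, tails = flips.sum(), len(flips) - flips.sum()

# Bernoulli log-likelihood: heads * log(Y) + tails * log(1 - Y).
candidates = np.linspace(0.01, 0.99, 99)
log_likelihood = heads * np.log(candidates) + tails * np.log(1 - candidates)

best = candidates[np.argmax(log_likelihood)]
print("MLE estimate of Y:", best)   # close to 2/3, the observed fraction of heads
```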
Bayesian Inference
Bayesian inference helps us learn from data by updating our beliefs about
things we don't know. It's a powerful tool in deep learning, allowing us to build
models that are more reliable and can handle uncertainty better. We'll explore
how Bayesian inference works and how it's used in deep learning, including
Bayesian neural networks and Bayesian optimization.

1 Prior Distribution
The prior distribution represents our initial guess about the
unknown things before we look at any data. It reflects what we
already know or believe about the problem.

2 Likelihood Function
The likelihood function tells us how likely the data is, given a
particular guess about the unknowns. It measures how well
our model fits the observed data.

3 Posterior Distribution
The posterior distribution combines our initial guess (prior)
and the information from the data (likelihood) to give us a
more accurate belief about the unknowns after looking at the
data.

Steps in Bayesian inference:

1. Define the Prior Probability

2. Define the Likelihood


3. Calculate the Total Probability
4. Apply Bayes' Theorem
5. Interpretation
Example:
Problem Setup:

Imagine you have a coin that might be biased. You want to determine the
probability that the coin is biased toward heads after flipping it a few times.

Step 1: Define the Prior Probability

Before flipping the coin, you have a belief about whether it’s fair or biased.
Let's say:

Prior belief: You think there’s a 50% chance the coin is fair (50% heads)
and a 50% chance it is biased (70% heads).

So we define:

P(Fair)=0.5
P(Biased)=0.5

Step 2: Define the Likelihood

Next, you flip the coin 3 times and get 2 heads and 1 tail. Now you need to
calculate how likely it is to get this result under both hypotheses.

The binomial probability formula is:

P(k | X) = n! / (k! (n − k)!) × p^k × (1 − p)^(n − k)

X: the hypothesis (Fair or Biased)
p: probability of getting heads under that hypothesis
1 − p: probability of getting tails
n: total number of coin flips (3)
k: number of heads whose probability we want to calculate (2)

Likelihood if the coin is fair:

p=0.5

P(2 heads | Fair) = 3! / (2! (3 − 2)!) × (0.5)^2 × (1 − 0.5)^1 = 0.375

Likelihood if the coin is biased:

p=0.7

P(2 heads | Biased) = 3! / (2! (3 − 2)!) × (0.7)^2 × (1 − 0.7)^1 = 0.441


Step 3: Calculate the Total Probability of Getting 2 Heads
Now we need the total probability of getting 2 heads, P(2 heads):

P(E)=P(E∣H1) P(H1)+P(E∣H2) P(H2)+…+P(E∣Hn) P(Hn)

E=Event H=Hypotheses

P(2heads)=P(2heads∣Fair)×P(Fair)+P(2heads∣Biased)×P(Biased)

P(2heads)=(0.375×0.5)+(0.441×0.5)

=0.1875+0.2205=0.408

Step 4: Apply Bayes' Theorem

Now we can use Bayes' Theorem to find the posterior probability that the coin
is biased given that we observed 2 heads:

The formula for Bayes' Theorem is:

P(A∣B)=P(B∣A) P(A)/P(B)

P(A∣B): The posterior probability


P(B∣A): The likelihood
P(A): The prior probability
P(B): The total probability

P(Biased | 2 heads) = P(2 heads | Biased) × P(Biased) / P(2 heads)

P(Biased | 2 heads) = (0.441 × 0.5) / 0.408

= 0.2205 / 0.408 ≈ 0.540

Step 5: Interpretation

After observing 2 heads in 3 flips, the probability that the coin is biased is
approximately 54%. This means your belief has shifted from an initial 50%
chance to a 54% chance that the coin is biased towards heads.
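The same posterior can be reproduced in a few lines. Here is a minimal Python sketch using scipy.stats for the binomial likelihood; the priors and head probabilities (0.5 for a fair coin, 0.7 for the biased one) are taken directly from the example above.

```python
from scipy.stats import binom

# Hypotheses and priors from the example.
priors = {"Fair": 0.5, "Biased": 0.5}
p_heads = {"Fair": 0.5, "Biased": 0.7}

# Likelihood of observing 2 heads in 3 flips under each hypothesis.
n, k = 3, 2
likelihoods = {h: binom.pmf(k, n, p_heads[h]) for h in priors}    # 0.375 and 0.441

# Total probability of the evidence, then Bayes' theorem.
evidence = sum(likelihoods[h] * priors[h] for h in priors)        # 0.408
posterior_biased = likelihoods["Biased"] * priors["Biased"] / evidence
print(round(posterior_biased, 3))                                 # approximately 0.540
```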
Optimisation and Generalization
Optimization

What it is: Optimization is the process of adjusting the model parameters (like
weights in a neural network) to minimize the difference between the predicted
outputs and the actual outputs (errors).

Why it matters: Effective optimization helps the model learn from the training
data, improving its performance.

In Probabilistic DL:

Optimization often involves minimizing a loss function, such as the negative log-likelihood, which quantifies how well the model predicts outcomes.

Techniques like gradient descent are used to find the parameter values that lower the loss, taking uncertainty into account, as in the sketch below.
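As a concrete illustration, here is a minimal NumPy sketch of gradient descent minimizing the negative log-likelihood of the coin model from the MLE section; the learning rate and number of steps are arbitrary illustrative choices.

```python
import numpy as np

flips = np.array([1, 0, 1])           # H, T, H from the earlier example
heads, tails = flips.sum(), len(flips) - flips.sum()

def neg_log_likelihood(y):
    """Loss: negative log-likelihood of the Bernoulli coin model."""
    return -(heads * np.log(y) + tails * np.log(1 - y))

def gradient(y):
    """Derivative of the loss with respect to the parameter Y."""
    return -(heads / y - tails / (1 - y))

y = 0.5                               # initial guess
learning_rate = 0.05
for _ in range(200):
    y -= learning_rate * gradient(y)  # one gradient descent step
    y = np.clip(y, 1e-6, 1 - 1e-6)    # keep Y inside (0, 1)

print(round(float(y), 3), neg_log_likelihood(y))   # Y converges to about 0.667
```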

Generalization

What it is: Generalization refers to a model's ability to perform well on new, unseen data, not just the data it was trained on.

Why it matters: If a model learns the training data too well (overfitting), it won't be effective when faced with new examples. Good generalization means the model can apply what it learned to different situations.

In Probabilistic DL:

Probabilistic models use uncertainty estimates to help them generalize better. For example, by considering the distribution of possible outputs rather than a single point prediction, a model can be more flexible and robust when making predictions on new data, as sketched below.
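To illustrate, here is a minimal Python sketch (NumPy and scipy.stats) in which a simple model keeps an estimate of its output noise and therefore reports a full predictive distribution on new data instead of a single number; the data-generating process is an illustrative assumption.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

# Training data with noise.
x_train = rng.uniform(0, 1, size=50)
y_train = 2 * x_train + rng.normal(0, 0.3, size=50)

# A point-estimate model keeps only the fitted line; a probabilistic model
# also keeps an estimate of the output noise, i.e. a distribution over outputs.
coef = np.polyfit(x_train, y_train, deg=1)
sigma = np.std(y_train - np.polyval(coef, x_train))

# On new data the probabilistic model reports a whole predictive distribution.
x_new = 0.8
mean = np.polyval(coef, x_new)
print(f"predictive distribution at x={x_new}: Normal(mean={mean:.2f}, std={sigma:.2f})")
print("probability that the new output exceeds 2.0:", 1 - norm.cdf(2.0, loc=mean, scale=sigma))
```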
Practical Applications and Future Directions
Probabilistic deep learning solves many real-world problems in fields like
computer vision, language understanding, robotics, and healthcare.

Healthcare
Probabilistic deep learning helps analyze medical images, predict diseases, and personalize treatments.

Autonomous Driving
Probabilistic models improve self-driving cars' perception, decision-making, and risk management.

Robotics
Probabilistic approaches enable robots to navigate, manipulate objects, and interact with humans.

Cognitive Science
Probabilistic models help us understand human cognition and decision-making.
