CPSC340: Entropy and Maximum Likelihood

This document outlines a lecture on maximum likelihood and entropy. It introduces maximum likelihood as a strategy for learning parameters from data by choosing parameters that make the observed data most probable. Maximum likelihood is applied to Bernoulli random variables by differentiating the log likelihood and setting it equal to zero. Entropy is also introduced as a measure of uncertainty in a random variable. The next lecture will cover Bayesian learning.


CPSC340

Entropy and maximum likelihood

Nando de Freitas
September, 2012
University of British Columbia
Outline of the lecture
This lecture introduces our first strategy for learning: maximum likelihood.
The goal is for you to learn:

• The definition of the maximum likelihood learning strategy.
• How to apply maximum likelihood to Bernoulli random variables.
• The concepts of information and entropy.
• The connection between maximum likelihood and differential entropy.
• Maximum likelihood as a contrasting principle (the world vs. the hallucinations of the mind).
Frequentist learning
Frequentist learning assumes that there exists a true model, say with
parameters θ0.

The estimate (learned value) will be denoted θ̂.

Given n data points, x1:n = {x1, x2, …, xn}, we choose the value of θ with
the highest probability of generating the data. That is,

θ̂ = arg max_θ p(x1:n | θ)
Frequentist learning
Example: Suppose we observe the data x1:n = {1, 1, 1, 1, 1, 1}, where each xi
comes from the same Bernoulli distribution and the observations are independent
and identically distributed (iid). What is a good guess of θ?
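As a quick sanity check (a sketch added here, not part of the original slides), the Python snippet below evaluates the Bernoulli likelihood of this all-ones dataset on a grid of candidate θ values; the likelihood is largest at θ = 1, which matches the intuitive guess.

```python
import numpy as np

x = np.array([1, 1, 1, 1, 1, 1])       # observed data: six ones
thetas = np.linspace(0.01, 1.0, 100)   # candidate parameter values

# Likelihood of the whole iid dataset for each candidate theta:
# p(x_1:n | theta) = prod_i theta^(x_i) * (1 - theta)^(1 - x_i)
likelihood = np.array([np.prod(t ** x * (1 - t) ** (1 - x)) for t in thetas])

best = thetas[np.argmax(likelihood)]
print(f"theta maximizing the likelihood: {best:.2f}")   # -> 1.00
```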
Maximum Likelihood procedure
Step 1: Given n data points, x1:n = {x1, x2, …, xn}, write down the expression
for the joint distribution of the data. Since the data are iid, it factorizes:

p(x1:n | θ) = ∏i=1..n p(xi | θ)

Step 2: Compute the log-likelihood, log p(x1:n | θ).

Step 3: Differentiate and equate to zero to find the estimate of θ.


Bernoulli MLE
Step 1: Write down the specific distribution for each datum (Bernoulli in
our case):
p(xi | θ) = θ^xi (1 − θ)^(1 − xi)

p(x1:n | θ) = ∏i θ^xi (1 − θ)^(1 − xi) = θ^(Σi xi) (1 − θ)^(n − Σi xi)

Step 2: Compute the log-likelihood:

log p(x1:n | θ) = (Σi xi) log θ + (n − Σi xi) log(1 − θ)

Bernoulli MLE
Step 3: Differentiate and equate to zero to find the estimate of θ:

d/dθ log p(x1:n | θ) = (Σi xi)/θ − (n − Σi xi)/(1 − θ) = 0

Solving for θ gives the maximum likelihood estimate

θ̂ = (1/n) Σi xi,

i.e. the fraction of ones observed in the data.
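This derivation can be checked symbolically. The sketch below (added here, not part of the original slides) uses sympy, writing s for the number of ones Σi xi and n for the number of observations:

```python
import sympy as sp

theta, n, s = sp.symbols('theta n s', positive=True)

# Bernoulli log-likelihood with s = sum_i x_i ones out of n observations
log_lik = s * sp.log(theta) + (n - s) * sp.log(1 - theta)

# Step 3: differentiate with respect to theta and equate to zero
stationary_points = sp.solve(sp.Eq(sp.diff(log_lik, theta), 0), theta)
print(stationary_points)   # [s/n], i.e. the fraction of ones in the data
```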
Entropy
In information theory, entropy H is a measure of the uncertainty
associated with a random variable. It is defined as:

H(X) = - Σx p(x) log p(x)


Example: For a Bernoulli variable X with parameter θ, the entropy is:

H(X) = −θ log θ − (1 − θ) log(1 − θ)

It is largest at θ = 0.5 (a fair coin is the most uncertain case) and zero at
θ = 0 or θ = 1.
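The short Python sketch below (added here, not part of the original slides) evaluates the Bernoulli entropy at a few parameter values, using scipy's xlogy to handle the 0 · log 0 = 0 convention:

```python
from scipy.special import xlogy  # xlogy(a, b) = a * log(b), with xlogy(0, 0) = 0

def bernoulli_entropy(theta):
    """Entropy of a Bernoulli(theta) random variable, in nats."""
    h = -(xlogy(theta, theta) + xlogy(1 - theta, 1 - theta))
    return h + 0.0  # turn -0.0 into 0.0 at the endpoints

for t in [0.0, 0.1, 0.5, 0.9, 1.0]:
    print(f"theta = {t:.1f}  ->  H = {bernoulli_entropy(t):.3f} nats")
# Entropy peaks at theta = 0.5 and is zero at theta = 0 or theta = 1.
```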
MLE - advanced

[Several slides with this title follow in the original deck; their worked content was not captured in this text version.]
Next lecture
In the next lecture, we introduce Bayesian learning.
