Dealing With Uncertainty P(X|E): Probability Theory, the Foundation of Statistics

This document discusses probability models and their use in making inferences from data. It begins by covering the history of probability theory and different approaches. It then discusses using probability models to represent uncertainties and combining evidence. The document outlines discrete and continuous probability distributions and how to compute probabilities from models using math or simulation. It covers joint, marginal, and conditional probabilities and how Bayes' rule allows updating probabilities with new evidence. Finally, it discusses learning probability models from data and different types of probability models like Naive Bayes and Markov models.

Dealing With Uncertainty

P(X|E)
Probability theory
The foundation of Statistics

History

Games of chance: 300 BC


1565: first formalizations
1654: Fermat & Pascal, conditional probability
Reverend Bayes: 1750s
1933: Kolmogorov: axiomatic approach
Objectivists vs subjectivists
(frequentists vs Bayesians)

Frequentists build one model


Bayesians use all possible models, with priors

Concerns
Future: what is the likelihood that a student
will get a CS job given his grades?
Current: what is the likelihood that a person
has cancer given his symptoms?
Past: what is the likelihood that Marilyn
Monroe committed suicide?
Combining evidence.
Always: Representation & Inference

Basic Idea
Attach degrees of belief to propositions.
Theorem: Probability theory is the best way
to do this.
If someone does it differently, you can play a
game (a Dutch book) against them and win their money.

Unlike logic, probability theory is nonmonotonic.


Additional evidence can lower or raise
belief in a proposition.

Probability Models:
Basic Questions
What are they?
Analogous to constraint models, with probabilities on
each table entry

How can we use them to make inferences?


Probability theory

How does new evidence change inferences?


Non-monotonic problem solved

How can we acquire them?


Experts for model structure, hill-climbing for
parameters

Discrete Probability Model

Set of random variables V1, V2, …, Vn


Each RV has a discrete set of values
Joint probability known or computable
For every assignment of values vi in domain(Vi),
Prob(V1=v1, V2=v2, …, Vn=vn) is known and
non-negative, and these probabilities sum to 1.

Random Variable
Intuition: A variable whose values belong to a
known set of values, the domain.
Math: non-negative function on a domain (called
the sample space) whose sum is 1.
Boolean RV: John has a cavity.
cavity domain ={true,false}

Discrete RV: Weather Condition


wc domain= {snowy, rainy, cloudy, sunny}.

Continuous RV: John's height


john's height domain = {positive real numbers}

Cross-Product RV
If X is an RV with values x1, …, xn and
Y is an RV with values y1, …, ym, then
Z = X x Y is an RV with n*m values <x1,y1>, …,
<xn,ym>

This will be very useful!


This does not mean P(X,Y) = P(X)*P(Y).

Discrete Probability Distribution


If a discrete RV X has values v1, …, vn, then a
prob distribution for X is a non-negative
real-valued function p such that sum p(vi) = 1.
This is just a (normalized) histogram.
Example: a coin is flipped 10 times and heads
occur 6 times.
What is the best probability model to predict this
result?
Biased coin model: prob head = .6, trials = 10

From Model to Prediction


Use Math or Simulation

Math: X = number of heads in 10 flips


P(X = 0) = .4^10
P(X = 1) = 10 * .6 * .4^9
P(X = 2) = Comb(10,2) * .6^2 * .4^8, etc.
where Comb(n,m) = n!/((n-m)! * m!).
Simulation: Do many times: flip coin (p = .6) 10
times, record heads.
Math is exact, but sometimes too hard.
Computation is inexact and expensive, but doable
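The math-vs-simulation comparison can be sketched in Python. This is a minimal illustration (the function names are mine, not from the slides): the exact binomial formula side by side with a Monte Carlo estimate of the same distribution.

```python
import random
from math import comb

def exact_binomial(n, p):
    # Exact P(X = k) for k = 0..n: Comb(n,k) * p^k * (1-p)^(n-k)
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def simulate_binomial(n, p, trials, seed=0):
    # Estimate the same distribution: flip the coin n times, many times,
    # and record the fraction of runs with each head count.
    rng = random.Random(seed)
    counts = [0] * (n + 1)
    for _ in range(trials):
        heads = sum(rng.random() < p for _ in range(n))
        counts[heads] += 1
    return [c / trials for c in counts]

exact = exact_binomial(10, 0.6)
approx = simulate_binomial(10, 0.6, 10_000)
```

With more trials the simulated estimate converges toward the exact column, as the tables below show.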

Head-count distribution, exact vs. simulated (rows labeled 10, 100, 1000
give the frequencies observed in that many simulated runs of 10 flips):

p = .6
k       0      1      2      3      4      5      6      7      8      9      10
Exact   .0001  .001   .010   .042   .111   .200   .250   .214   .120   .040   .005
10      .0     .0     .0     .0     .2     .1     .6     .1     .0     .0     .0
100     .0     .0     .01    .04    .05    .24    .22    .16    .18    .09    .01
1000    .0     .002   .011   .042   .117   .200   .246   .231   .108   .035   .008
p = .5
k       0      1      2      3      4      5      6      7      8      9      10
Exact   .0009  .009   .043   .117   .205   .246   .205   .117   .043   .009   .0009
10      .0     .0     .0     .1     .2     .0     .3     .3     .1     .0     .0
100     .0     .01    .07    .13    .24    .28    .15    .08    .04    .0     .0
1000    .002   .011   .044   .101   .231   .218   .224   .118   .046   .009   .001

Learning Model: Hill Climbing


Theoretically it can be shown that p = .6 is the
best model.
Without theory, pick a random p value and
simulate. Now try a larger and a smaller p
value.
Maximize P(Data|Model). Get model
which gives highest probability to the data.
This approach extends to more complicated
models (variables, parameters).
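The hill-climbing recipe above can be sketched as follows; a minimal illustration with assumed names (`likelihood`, `hill_climb`), halving the step size whenever neither neighbour improves P(Data|Model).

```python
from math import comb

def likelihood(p, heads, flips):
    # P(Data | Model): probability of seeing `heads` in `flips` under bias p
    return comb(flips, heads) * p**heads * (1 - p)**(flips - heads)

def hill_climb(heads, flips, p=0.5, step=0.1):
    # Try a slightly larger and a slightly smaller p; keep whichever
    # increases P(Data | Model); shrink the step when neither does.
    while step > 1e-6:
        candidates = [p, max(p - step, 0.0), min(p + step, 1.0)]
        best = max(candidates, key=lambda q: likelihood(q, heads, flips))
        if best == p:
            step /= 2   # no neighbour improves: refine the search
        else:
            p = best
    return p
```

For 6 heads in 10 flips this climbs to p = .6, matching the theoretical answer.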

Another Data Set


What's going on?

k      0     1     2     3     4     5     6     7     8     9     10
P(k)   .34   .38   .19   .05   .01   .02   .08   .20   .30   .26   .1

Mixture Model

Data generated from two simple models


coin1 prob = .8 of heads
coin2 prob = .1 of heads
With prob .5 pick coin 1 or coin 2 and flip.
Model has more parameters
Experts are supposed to supply the model.
Use data to estimate the parameters.
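A quick way to see where such bimodal data comes from is to simulate the mixture directly; a sketch under the stated parameters (coin 1: P(heads) = .8, coin 2: P(heads) = .1, each picked with prob .5; the function name is mine):

```python
import random

def flip_mixture(trials=10_000, seed=1):
    # With prob .5 pick coin 1 (P(heads)=.8) or coin 2 (P(heads)=.1),
    # then flip the chosen coin 10 times; tally the head counts.
    rng = random.Random(seed)
    counts = [0] * 11
    for _ in range(trials):
        p = 0.8 if rng.random() < 0.5 else 0.1
        heads = sum(rng.random() < p for _ in range(10))
        counts[heads] += 1
    return [c / trials for c in counts]

dist = flip_mixture()
```

The result has two humps, one near 1 head and one near 8, unlike any single-coin binomial.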

Continuous Probability
If RV X has values in R, then a prob
distribution for X is a non-negative real-valued
function p such that the integral of p
over R is 1 (called a prob density function).
Standard distributions are uniform, normal
or gaussian, poisson, etc.
May resort to an empirical distribution if you can't
compute analytically, i.e., use a histogram.

Joint Probability: full knowledge


If X and Y are discrete RVs, then the prob
distribution for X x Y is called the joint
prob distribution.
Let x be in domain of X, y in domain of Y.
If P(X=x,Y=y) = P(X=x)*P(Y=y) for every
x and y, then X and Y are independent.
Standard Shorthand: P(X,Y)=P(X)*P(Y),
which means exactly the statement above.

Marginalization
Given the joint probability for X and Y, you
can compute everything.
Joint probability to individual probabilities.
P(X =x) is sum P(X=x and Y=y) over all y
Conditioning is similar:
P(X=x) = sum P(X=x|Y=y)*P(Y=y)

Marginalization Example

Compute Prob(X is healthy) from


P(X healthy & X tests positive) = .1
P(X healthy & X tests neg) = .8
P(X healthy) = .1 + .8 = .9
P(flush) = P(heart flush)+P(spade flush)+
P(diamond flush)+ P(club flush)

Conditional Probability
P(X=x | Y=y) = P(X=x, Y=y)/P(Y=y).
Intuition: use simple examples
One-card hand: X = card value, Y = card suit
P(X = ace | Y = heart) = 1/13
also P(X = ace, Y = heart) = 1/52
P(Y = heart) = 1/4
P(X = ace, Y = heart)/P(Y = heart) = (1/52)/(1/4) = 1/13.
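The card example can be checked by brute-force enumeration over the 52 equally likely one-card hands; a small sketch (names are illustrative):

```python
from fractions import Fraction
from itertools import product

values = ['ace'] + [str(n) for n in range(2, 11)] + ['jack', 'queen', 'king']
suits = ['heart', 'spade', 'diamond', 'club']
deck = list(product(values, suits))   # all 52 equally likely one-card hands

def P(event):
    # Probability of an event, given as a predicate on (value, suit) pairs
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

p_joint = P(lambda c: c == ('ace', 'heart'))   # P(X=ace, Y=heart) = 1/52
p_suit = P(lambda c: c[1] == 'heart')          # P(Y=heart) = 1/4
p_cond = p_joint / p_suit                      # P(X=ace | Y=heart) = 1/13
```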

Formula
Shorthand: P(X|Y) = P(X,Y)/P(Y).
Product Rule: P(X,Y) = P(X |Y) * P(Y)
Bayes Rule:
P(X|Y) = P(Y|X) *P(X)/P(Y).

Remember the abbreviations.

Conditional Example
P(A = 0) = .7
P(A = 1) = .3

P(B|A):
        B=0   B=1
A = 0   .2    .8
A = 1   .9    .1

P(A,B) = P(B,A)
P(B,A) = P(B|A)*P(A)
P(A,B) = P(A|B)*P(B)
P(A|B) = P(B|A)*P(A)/P(B)

Exact and simulated:

          P(A,B)  10    100   1000
A=0,B=0   .14     .1    .18   .14
A=0,B=1   .56     .6    .55   .56
A=1,B=0   .27     .2    .24   .24
A=1,B=1   .03     .1    .03   .06
Note Joint yields everything


Via marginalization
P(A = 0) = P(A=0,B=0)+P(A=0,B=1)=
.14+.56 = .7

P(B=0) = P(B=0,A=0)+P(B=0,A=1) =
.14+.27 = .41

Simulation
Given prob for A and prob for B given A
First, choose value for A, according to prob
Now use conditional table to choose value
for B with correct probability.
That constructs one world.
Repeat lots of times and count the number of
times A=0 & B=0, A=0 & B=1, etc.
Turn counts into probabilities.
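The world-construction loop above can be sketched as follows (probability values are taken from the conditional example; the function names are mine):

```python
import random

# Model from the conditional example:
# P(A=0) = .7, and P(B=0|A=0) = .2, P(B=0|A=1) = .9
P_A0 = 0.7
P_B0_GIVEN_A = {0: 0.2, 1: 0.9}

def sample_world(rng):
    # Choose A by its prior, then B from the conditional table: one "world".
    a = 0 if rng.random() < P_A0 else 1
    b = 0 if rng.random() < P_B0_GIVEN_A[a] else 1
    return a, b

def estimate_joint(trials=10_000, seed=2):
    # Count worlds and turn the counts into joint probabilities.
    rng = random.Random(seed)
    counts = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for _ in range(trials):
        counts[sample_world(rng)] += 1
    return {w: c / trials for w, c in counts.items()}

joint = estimate_joint()
```

With 10,000 worlds the estimates land close to the exact joint (.14, .56, .27, .03).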

Consequences of Bayes Rules


P(X|Y,Z) = P(Y,Z |X)*P(X)/P(Y,Z).
proof: Treat Y&Z as new product RV U
P(X|U) =P(U|X)*P(X)/P(U) by bayes
P(X1,X2,X3) =P(X3|X1,X2)*P(X1,X2)
= P(X3|X1,X2)*P(X2|X1)*P(X1) or

P(X1,X2,X3) =P(X1)*P(X2|X1)*P(X3|X1,X2).
Note: These equations make no assumptions!
Last equation is called the Chain or Product Rule
Can pick any ordering of variables.

Extensions of P(A) +P(~A) = 1


P(X|Y) + P(~X|Y) = 1
Semantic Argument
conditional just restricts worlds

Syntactic Argument: lhs equals


P(X,Y)/P(Y) + P(~X,Y)/P(Y) =
(P(X,Y) + P(~X,Y))/P(Y) = (marginalization)
P(Y)/P(Y) = 1.

Bayes Rule Example


Meningitis causes stiff neck (.5).
P(s|m) = 0.5

Prior prob of meningitis = 1/50,000.


p(m)= 1/50,000 = .00002

Prior prob of stiff neck (1/20).


p(s) = 1/20.

Does patient have meningitis?


p(m|s) = p(s|m)*p(m)/p(s) = 0.0002.

Is this reasonable? The likelihood ratio p(s|m)/p(s) = 10, so the evidence multiplies the prior by 10.
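The arithmetic can be checked directly (variable names are illustrative):

```python
# Bayes' rule applied to the meningitis example
p_s_given_m = 0.5      # P(stiff neck | meningitis)
p_m = 1 / 50_000       # prior P(meningitis)
p_s = 1 / 20           # prior P(stiff neck)

# p(m|s) = p(s|m) * p(m) / p(s)
p_m_given_s = p_s_given_m * p_m / p_s

# The evidence multiplies the prior by the likelihood ratio p(s|m)/p(s)
likelihood_ratio = p_s_given_m / p_s
```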

Bayes Rule: multiple symptoms


Given symptoms s1, s2, …, sn, estimate the
probability of disease D.
P(D|s1,s2,…,sn) = P(D,s1,…,sn)/P(s1,s2,…,sn).
If each symptom is boolean, need tables of
size 2^n. ex. breast cancer data has 73
features per patient. 2^73 is too big.
Approximate!

Notation: max arg


Conceptual definition, not operational
Max arg f(x) (usually written arg max) is a value
of x that maximizes f(x).
Max arg over p of Prob(6 heads in 10 flips | prob(heads) = p)
yields prob(heads) = .6

Idiot or Naïve Bayes:


First learning Algorithm
Goal: max arg P(D| s1..sn) over all Diseases
= max arg P(s1,..sn|D)*P(D)/ P(s1,..sn)
= max arg P(s1,..sn|D)*P(D) (why?)
~ max arg P(s1|D)*P(s2|D)*…*P(sn|D)*P(D).
Assumes conditional independence.
enough data to estimate

Not necessary to get prob right: only order.


Pretty good but Bayes Nets do it better.
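A minimal sketch of the idea (the toy data and names are invented for illustration; add-one smoothing is added so a zero count can't wipe out a class, which the slides don't cover):

```python
from collections import defaultdict
from math import log

def train(examples):
    # examples: list of (disease, {symptom: True/False}) pairs.
    # Count classes and symptom occurrences to estimate P(D) and P(s|D).
    class_counts = defaultdict(int)
    symptom_counts = defaultdict(lambda: defaultdict(int))
    for disease, symptoms in examples:
        class_counts[disease] += 1
        for s, present in symptoms.items():
            if present:
                symptom_counts[disease][s] += 1
    return class_counts, symptom_counts, len(examples)

def classify(model, symptoms):
    # max arg over D of P(s1|D)*...*P(sn|D)*P(D), computed in log space;
    # add-one smoothing keeps unseen symptoms from zeroing out a class.
    class_counts, symptom_counts, n = model
    best, best_score = None, float('-inf')
    for d, cd in class_counts.items():
        score = log(cd / n)
        for s, present in symptoms.items():
            p = (symptom_counts[d][s] + 1) / (cd + 2)
            score += log(p if present else 1 - p)
        if score > best_score:
            best, best_score = d, score
    return best

model = train([('flu', {'fever': True, 'cough': True}),
               ('flu', {'fever': True, 'cough': False}),
               ('cold', {'fever': False, 'cough': True}),
               ('cold', {'fever': False, 'cough': True})])
```

Note that only the *ordering* of the scores matters for classification, which is why dropping the shared denominator P(s1,…,sn) is harmless.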

Chain Rule and Markov Models


Recall P(X1, X2, …, Xn) = P(X1)*P(X2|
X1)*…*P(Xn|X1,X2,…,Xn-1).
If X1, X2, etc. are values at time points 1, 2, …
and if Xn only depends on the k previous times,
then this is a Markov model of order k.
MM0: Independent of time
P(X1,…,Xn) = P(X1)*P(X2)*…*P(Xn)

Markov Models
MM1: depends only on the previous time
P(X1,…,Xn) = P(X1)*P(X2|X1)*…*P(Xn|Xn-1).

May also be used for approximating


probabilities. Much simpler to estimate.
MM2: depends on the previous 2 times
P(X1,X2,…,Xn) = P(X1,X2)*P(X3|X1,X2), etc.

Common DNA application

Looking for needles: surprising frequency?


Goal: Compute P(gataag) given lots of data
MM0 = P(g)*P(a)*P(t)*P(a)*P(a)*P(g).
MM1 = P(g)*P(a|g)*P(t|a)*P(a|t)*P(a|a)*P(g|a).
MM2 = P(ga)*P(t|ga)*P(a|at)*P(a|ta)*P(g|aa).
Note: each approximation requires less data
and less computation time.
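Estimating one of these approximations from data can be sketched as follows; an MM1 estimate with counts taken from a training string (the function name and toy string are illustrative):

```python
from collections import defaultdict

def mm1_probability(sequence, data):
    # Order-1 Markov estimate: P(x1) * product of P(x_i | x_{i-1}),
    # with all probabilities counted from the training string `data`.
    unigram = defaultdict(int)
    bigram = defaultdict(int)
    follow_total = defaultdict(int)   # times each symbol has a successor
    for c in data:
        unigram[c] += 1
    for a, b in zip(data, data[1:]):
        bigram[(a, b)] += 1
        follow_total[a] += 1
    p = unigram[sequence[0]] / len(data)
    for a, b in zip(sequence, sequence[1:]):
        p *= bigram[(a, b)] / follow_total[a]   # P(b | a) from counts
    return p
```

An MM2 estimate would be the same idea with trigram counts; each higher order needs more data and more computation, as the slide notes.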
