Lecture 5 Bayesian Model 1
(Weights of evidence model)
Study area
Area: 144 sq km
Unit cell size: 1 sq km
[Figure: grid of unit cells; the class of each cell is unknown (?)]
[Figure: the study-area grid overlaid with two evidence layers. Distance to fault layer: each cell is Px or Ds. Soil permeability layer: each cell is H or L. The class of each cell is still unknown (?).]
How do we estimate the class?
Training data
[Figure: training cells shown on three layers: Distance to fault layer (Px/Ds), Soil permeability layer (H/L) and Bore well layer (W/D)]

Px – Proximal
Ds – Distal
H – High
L – Low
W – Water-bearing
D – Dry

We estimate the class label based on conditional probabilities:
1. P(W|Px, H)
2. P(W|Px, L)
3. P(W|Ds, H)
4. P(W|Ds, L)
5. P(D|Px, H)
6. P(D|Px, L)
7. P(D|Ds, H)
8. P(D|Ds, L)

If P(W | evidence) > P(D | evidence) for a cell's combination of evidence, then class = W
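The classification rule above can be sketched in a few lines of Python. This is only an illustration: the training cells listed here are hypothetical (they are not read off the slide figure), and the helper names `conditional_table` and `classify` are made up for this sketch. Each conditional probability is estimated as a relative frequency among the training cells that share the same evidence.

```python
from collections import Counter

# Hypothetical training cells: (distance to fault, soil permeability, bore well result).
training = [
    ("Px", "H", "W"), ("Px", "H", "W"), ("Px", "H", "D"),
    ("Px", "L", "W"), ("Px", "L", "D"), ("Px", "L", "D"),
    ("Ds", "H", "W"), ("Ds", "H", "D"), ("Ds", "H", "D"),
    ("Ds", "L", "D"), ("Ds", "L", "D"), ("Ds", "L", "D"),
]

def conditional_table(cells):
    """Estimate P(label | fault, permeability) by relative frequency."""
    joint = Counter(cells)                            # counts per (fault, perm, label)
    evidence = Counter((f, p) for f, p, _ in cells)   # counts per (fault, perm)
    return {
        ev: {label: joint[(ev[0], ev[1], label)] / n for label in ("W", "D")}
        for ev, n in evidence.items()
    }

def classify(table, fault, perm):
    """Assign W when P(W | evidence) exceeds P(D | evidence), otherwise D."""
    probs = table[(fault, perm)]
    return "W" if probs["W"] > probs["D"] else "D"

table = conditional_table(training)
for (fault, perm), probs in sorted(table.items()):
    print(fault, perm, probs, "->", classify(table, fault, perm))
```

With these made-up cells, only the (Px, H) combination is classified as water-bearing; real training data would of course give different estimates.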
Terminology
Populations & Samples
• Population: the complete set of individuals, objects or
scores of interest.
– Often too large to sample in its entirety
– It may be real or hypothetical (e.g. the results from an experiment
repeated ad infinitum)
Probability & Statistics
[Diagram: Probability links the population space to the sample space. Sampling: Population → Sample data. Inferencing (statistics): sample Statistics → population Parameters.]
Two containers containing red balls and blue balls (Population comprising red and blue balls)
From the observations we compute statistics that we use to estimate population parameters,
which index the probability density, from which we can compute the probability of a future
observation from that population.
Variables
• Variables can be further classified as:
– Dependent/Response. Variable of primary interest (e.g. amount of
rainfall). Not controlled by the experimenter.
– Independent/Predictor
• Not controlled by the experimenter (e.g. temperature, humidity): called a
Covariate.
• Controlled by the experimenter: called a Factor.
• If the value of a variable cannot be predicted in advance, then the variable
is referred to as a random variable.
2. Frequency Distributions
Example – TDS in water
• Water samples taken from 36 locations in Powai as part of a study to
determine the natural variation of total dissolved solids in the area.
TDS in Powai water samples
Frequency Distribution
[Figure: histogram; y-axis: Frequency]
= Probability distributions
(when idealized and fitted to mathematical functions)
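As a rough illustration of how a frequency distribution turns into an empirical probability distribution, the sketch below bins a list of readings and divides each count by the total. The TDS values and the 50 mg/L class width are assumptions invented for this example; they are not the Powai measurements.

```python
from collections import Counter

# Hypothetical TDS readings in mg/L (NOT the Powai data).
tds = [310, 325, 342, 350, 355, 361, 368, 372, 380, 385, 391, 402, 415, 428, 450]
BIN_WIDTH = 50  # assumed class width in mg/L

def frequency_distribution(values, width):
    """Count how many values fall into each class [lower, lower + width)."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))

def relative_frequencies(freq):
    """Divide each class count by the total so that the values sum to 1."""
    total = sum(freq.values())
    return {lower: n / total for lower, n in freq.items()}

freq = frequency_distribution(tds, BIN_WIDTH)
rel = relative_frequencies(freq)
for lower in freq:
    print(f"{lower}-{lower + BIN_WIDTH} mg/L: n = {freq[lower]}, relative frequency = {rel[lower]:.3f}")
```

The relative frequencies sum to 1, which is the empirical counterpart of the rule that the probabilities of all possible outcomes must sum to 1.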
Probability: the “frequentist” approach
• 0.5 = even odds
• 0.1 = 1 chance out of 10
Probability
“something-has-to-happen rule”:
– The probability of the set of all possible outcomes of a
trial must be 1.
– P(S) = 1
(S represents set of all possible outcomes.)
CAUTION: are the outcomes equally likely?
• P(B|A)=Probability of B, given A
Conditional probability (cont.)
• P(B|A) = P(B&A)/P(A)
Independence
With notation for conditional probabilities, we can now
formalize the definition of independence
• events A and B are independent whenever
P(B|A) = P(B)
• P(B|A) = P(B&A)/P(A) = P(B) P(A|B)/P(A), which is Bayes' theorem
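The three relations above (conditional probability, independence and the Bayes form) can be checked numerically on a small example. The sketch below assumes one roll of a fair six-sided die with equally likely outcomes; the events A and B are chosen only for illustration.

```python
from fractions import Fraction

# Assumed example: one roll of a fair die, all outcomes equally likely.
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}       # event "even"
B = {1, 2, 3, 4}    # event "at most 4"

def prob(event):
    """P(event) as the share of equally likely outcomes in S."""
    return Fraction(len(event & S), len(S))

def cond_prob(b, a):
    """P(b | a) = P(b & a) / P(a)."""
    return prob(b & a) / prob(a)

p_b_given_a = cond_prob(B, A)                     # P(B|A)
independent = p_b_given_a == prob(B)              # independence: P(B|A) = P(B)
bayes_rhs = prob(B) * cond_prob(A, B) / prob(A)   # P(B) P(A|B) / P(A)

print("P(S) =", prob(S))
print("P(B|A) =", p_b_given_a, " P(B) =", prob(B), " independent:", independent)
print("Bayes right-hand side =", bayes_rhs)
```

Exact fractions are used so that the equality tests are not affected by floating-point rounding.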
Hypothesis
Null and alternative hypothesis
– Notation
• Null: H0
• Alternative: Ha
Null and Alternative Hypotheses
• The Null and Alternative Hypotheses are mutually exclusive. Only
one of them can be true.
• The Null and Alternative Hypotheses are collectively exhaustive.
They are stated to include all possibilities.
• The Null Hypothesis is assumed to be true.
• The burden of proof falls on the Alternative Hypothesis.
With E denoting the evidence:

$$P(H_a \mid E) = \frac{P(H_a)\,P(E \mid H_a)}{P(E)} = \frac{P(H_a)\,P(E \mid H_a)}{P(H_a)\,P(E \mid H_a) + P(H_0)\,P(E \mid H_0)}$$

$$P(H_0 \mid E) = \frac{P(H_0)\,P(E \mid H_0)}{P(H_a)\,P(E \mid H_a) + P(H_0)\,P(E \mid H_0)}$$
• Type II Error
– Committed by accepting an incorrect (false) null hypothesis
– The probability of committing a Type II error is called β (beta).
Decision Table for Hypothesis Testing

                     Truth
Decision       H0 True         H0 False
Accept H0      Correct         Type II Error
Reject H0      Type I Error    Correct
Does patient have cancer or not?
• A patient takes a lab test and the result comes back positive. It is
known that the test returns a correct positive result in 98% of the
cases and a correct negative result in 97% of the cases.
Furthermore, only 0.008 of the entire population has this disease.
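The question can be worked through with Bayes' theorem. The sketch below uses only the three numbers given in the example; the function and parameter names (`posterior`, `sensitivity`, `specificity`) are labels chosen for this illustration.

```python
def posterior(prior, sensitivity, specificity):
    """P(disease | positive test) by Bayes' theorem."""
    true_positive = prior * sensitivity                    # P(cancer) P(+ | cancer)
    false_positive = (1.0 - prior) * (1.0 - specificity)   # P(no cancer) P(+ | no cancer)
    return true_positive / (true_positive + false_positive)

p_cancer = posterior(prior=0.008, sensitivity=0.98, specificity=0.97)
print(round(p_cancer, 4))  # 0.2085
```

Even after a positive result, the probability of cancer is only about 0.21, because the disease is rare and false positives from the healthy majority outnumber the true positives.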
[Figure: 10 km x 10 km study area showing wells (D) and a spatial feature (B1)]
Objective: To estimate the probability of occurrence of D in each unit cell of the study area
Approach: Use BAYES’ THEOREM to update the prior probability of the occurrence of D
to a posterior probability, based on the conditional probabilities (or weights of
evidence) of the spatial features; that is, P(D | B1, B2).
Weights of Evidence
Four steps:
1. Convert numeric maps to binary maps (the method uses
present/absent type of features)
2. Calculation of prior probability
3. Calculate weights of evidence (likelihood ratios) for each predictor
map
4. Combine weights
Weights of Evidence
Step 1 Multiclass to Binary Maps
Use the distance at which there is maximum spatial association as the
threshold.
Observed vs expected number of wells in each area (expected = area x n(D)/n(S) = area x 10/100):

Area   n(Area)   Expected n(D|Area)   Observed n(D|Area)   (Observed - Expected)/Expected
A      25        2.5                  2                    (2 - 2.5)/2.5 = -0.20
B      21        2.1                  2                    (2 - 2.1)/2.1 = -0.05
C      7         0.7                  2                    (2 - 0.7)/0.7 = +1.9
D      47        4.7                  4                    (4 - 4.7)/4.7 = -0.15
S      100       10                   10
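The observed-versus-expected comparison can be reproduced with a short script. This sketch takes the areas and observed well counts listed above and assumes that the expected count in each area is proportional to its share of the study area; the function name `relative_deviation` is made up for this example.

```python
# Areas (in unit cells) and observed well counts, as listed above.
areas = {"A": 25, "B": 21, "C": 7, "D": 47}
observed = {"A": 2, "B": 2, "C": 2, "D": 4}

def relative_deviation(areas, observed):
    """(observed - expected) / expected for each area."""
    total_area = sum(areas.values())       # n(S) = 100
    total_wells = sum(observed.values())   # n(D) = 10
    result = {}
    for name, area in areas.items():
        expected = area * total_wells / total_area
        result[name] = (observed[name] - expected) / expected
    return result

deviation = relative_deviation(areas, observed)
for name, value in deviation.items():
    print(f"{name}: {value:+.2f}")
```

With the listed counts, area D evaluates to (4 - 4.7)/4.7, about -0.15, and area C is the only one with clearly more wells than expected.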
Observed vs expected distribution for line features
Calculate observed vs expected distribution of points for cumulative distances
[Figure annotation: No association!]
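Contrast
Roughly defined as:
probability of points within a feature - probability of points outside the feature
Mathematically, with $\bar{B}$ meaning the feature is absent and $\bar{D}$ meaning no point is present:

$$\text{Contrast} = \ln\frac{P(B \mid D)}{P(B \mid \bar{D})} - \ln\frac{P(\bar{B} \mid D)}{P(\bar{B} \mid \bar{D})}$$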
Weights of evidence model
Step 2: Calculation of prior probability of Wells
[Figure: 10 km x 10 km study area (S) divided into 1 km x 1 km unit cells, with wells (D)]
• The probability of the occurrence of D when no other information about the area is available or considered.
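$$P(D \mid B) = \frac{P(D \wedge B)}{P(B)} = P(D)\,\frac{P(B \mid D)}{P(B)} = P(D)\,\frac{P(B \mid D)}{P(D)\,P(B \mid D) + P(\bar{D})\,P(B \mid \bar{D})}$$

$$P(D) = \frac{n(D)}{n(S)}$$

$$P(B \mid D) = \frac{n(B \cap D)}{n(D)}$$

$$P(B \mid \bar{D}) = \frac{n(B \cap \bar{D})}{n(\bar{D})}$$

$$P(\bar{B} \mid D) = \frac{n(\bar{B} \cap D)}{n(D)} = \frac{n(D) - n(D \cap B)}{n(D)}$$

$$P(\bar{B} \mid \bar{D}) = \frac{n(\bar{B} \cap \bar{D})}{n(\bar{D})} = \frac{n(S) - n(B) - n(D) + n(B \cap D)}{n(\bar{D})}$$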
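The counting formulas above translate directly into code. The sketch below is a minimal illustration: the function name `cell_probabilities`, its dictionary keys (with `~` standing for "not") and the example counts are assumptions made for this example, and exact fractions are used so that the results can be compared without rounding.

```python
from fractions import Fraction

def cell_probabilities(n_s, n_d, n_b, n_bd):
    """Prior and conditional probabilities from unit-cell counts.

    n_s: cells in the study area S      n_d: cells containing a well D
    n_b: cells where feature B occurs   n_bd: cells with both B and D
    """
    n_not_d = n_s - n_d
    return {
        "P(D)": Fraction(n_d, n_s),
        "P(B|D)": Fraction(n_bd, n_d),
        "P(B|~D)": Fraction(n_b - n_bd, n_not_d),
        "P(~B|D)": Fraction(n_d - n_bd, n_d),
        "P(~B|~D)": Fraction(n_s - n_b - n_d + n_bd, n_not_d),
    }

# Assumed counts for illustration: 100 cells, 10 wells, feature in 16 cells, 4 wells on the feature.
probs = cell_probabilities(n_s=100, n_d=10, n_b=16, n_bd=4)

# Posterior P(D|B) from Bayes' theorem, using only the quantities above.
prior = probs["P(D)"]
posterior = prior * probs["P(B|D)"] / (
    prior * probs["P(B|D)"] + (1 - prior) * probs["P(B|~D)"]
)
print(probs)
print("P(D|B) =", posterior)
```

As a sanity check, the Bayes result equals the direct count ratio n(B ∩ D)/n(B), which is 4/16 for these assumed counts.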
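Exercise
[Figure: 10 km x 10 km study area S with wells and two features, Feature (B1) and Feature (B2)]
Unit cell size = 1 sq km & each well occupies 1 unit cell
Calculate the weights of evidence (W+ and W-) and Contrast values for B1 and B2

$$W^{+} = \ln\frac{P(B \mid D)}{P(B \mid \bar{D})}; \qquad W^{-} = \ln\frac{P(\bar{B} \mid D)}{P(\bar{B} \mid \bar{D})}$$

$$P(B \mid D) = \frac{n(B \cap D)}{n(D)}$$

$$P(B \mid \bar{D}) = \frac{n(B \cap \bar{D})}{n(\bar{D})}$$

$$P(\bar{B} \mid D) = \frac{n(\bar{B} \cap D)}{n(D)} = \frac{n(D) - n(D \cap B)}{n(D)}$$

$$P(\bar{B} \mid \bar{D}) = \frac{n(\bar{B} \cap \bar{D})}{n(\bar{D})} = \frac{n(S) - n(B) - n(D) + n(B \cap D)}{n(\bar{D})}$$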
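Exercise: results
For B1:

$$P(B_1 \mid D) = \frac{n(B_1 \cap D)}{n(D)} = \frac{4}{10}; \qquad P(B_1 \mid \bar{D}) = \frac{n(B_1 \cap \bar{D})}{n(\bar{D})} = \frac{12}{90}$$

$$P(\bar{B}_1 \mid D) = \frac{n(\bar{B}_1 \cap D)}{n(D)} = \frac{n(D) - n(D \cap B_1)}{n(D)} = \frac{6}{10}$$

$$P(\bar{B}_1 \mid \bar{D}) = \frac{n(\bar{B}_1 \cap \bar{D})}{n(\bar{D})} = \frac{n(S) - n(B_1) - n(D) + n(B_1 \cap D)}{n(\bar{D})} = \frac{78}{90}$$

$$W^{+}_{B_1} = 1.0988; \qquad W^{-}_{B_1} = -0.3678$$

For B2:

$$W^{+}_{B_2} = 0.2060; \qquad W^{-}_{B_2} = -0.0763$$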
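The B1 weights can be recomputed from unit-cell counts. In the sketch below the counts (100 cells, 10 wells, 16 cells with B1, 4 wells on B1) are the ones implied by the fractions 4/10, 12/90, 6/10 and 78/90; the function name `weights_of_evidence` is an assumption for this example, and natural logarithms are used.

```python
import math

def weights_of_evidence(n_s, n_d, n_b, n_bd):
    """Return (W+, W-, contrast) for a binary feature B and points D."""
    n_not_d = n_s - n_d
    p_b_d = n_bd / n_d                                 # P(B | D)
    p_b_notd = (n_b - n_bd) / n_not_d                  # P(B | not D)
    p_notb_d = (n_d - n_bd) / n_d                      # P(not B | D)
    p_notb_notd = (n_s - n_b - n_d + n_bd) / n_not_d   # P(not B | not D)
    w_plus = math.log(p_b_d / p_b_notd)
    w_minus = math.log(p_notb_d / p_notb_notd)
    return w_plus, w_minus, w_plus - w_minus

# Counts for B1 implied by the fractions in the exercise.
w_plus, w_minus, contrast = weights_of_evidence(n_s=100, n_d=10, n_b=16, n_bd=4)
print(f"W+ = {w_plus:.4f}, W- = {w_minus:.4f}, C = {contrast:.4f}")
```

The computed weights agree with the values quoted for B1 to about three decimal places, and the contrast comes out near 1.47. The function fails if any of the four counts is zero, since the logarithm is then undefined; handling that case is outside this sketch.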
Step 3 Calculation of weights of evidence
Contrast (C) measures the net strength of spatial association between the spatial
feature and points
Contrast = W+ – W-
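$$\ln O(D \mid B_1, B_2) = \ln O(D) + W^{\pm}_{B_1} + W^{\pm}_{B_2}$$

(use W+ where the feature is present and W- where it is absent)

Prior odds: O(D) = P(D)/(1 - P(D)) = 0.1/0.9 ≈ 0.11

$$\ln O(D) = \ln(0.11) = -2.2073$$

For the areas where both B1 and B2 are present:

$$\ln O(D \mid B_1, B_2) = -2.2073 + 1.0988 + 0.2050 = -0.9035$$

$$O(D \mid B_1, B_2) = e^{-0.9035} = 0.4051$$

$$P = \frac{O}{1 + O} = \frac{0.4051}{1.4051} = 0.2883$$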
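The same combination can be evaluated for every presence/absence pattern of the two features. The sketch below uses the prior odds and weights quoted in the text as plain inputs; the function name `posterior_probability` and the dictionary layout are assumptions for this example. Adding the three terms as listed for "both present" gives about -0.9035, which corresponds to a posterior probability near 0.288.

```python
import math

def posterior_probability(prior_odds, weights):
    """Add the applicable weights to the prior log-odds and convert back to a probability."""
    log_odds = math.log(prior_odds) + sum(weights)
    odds = math.exp(log_odds)
    return odds / (1.0 + odds)

PRIOR_ODDS = 0.11  # O{D}, as rounded in the text
W_B1 = {"present": 1.0988, "absent": -0.3678}
W_B2 = {"present": 0.2050, "absent": -0.0763}

for b1 in ("present", "absent"):
    for b2 in ("present", "absent"):
        p = posterior_probability(PRIOR_ODDS, [W_B1[b1], W_B2[b2]])
        print(f"B1 {b1:7s}  B2 {b2:7s}  ->  P(D | B1, B2) = {p:.4f}")
```

Working in log-odds keeps the update additive: each feature contributes one weight, W+ where it is present and W- where it is absent.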