Chapter 2 Principles of Statistics

The document is a course chapter on Geostatistics in Petroleum Engineering, focusing on the principles of statistics, including descriptive and inferential statistics. It covers topics such as frequency distribution, univariate and bivariate statistics, and linear regression, with examples illustrating the application of these concepts. Key statistical measures such as mean, median, mode, variance, and correlation coefficients are discussed to analyze data relevant to petroleum engineering.

Uploaded by

xuanhieu.14504


Vietnam National University - Ho Chi Minh City

University of Technology
Faculty of Geology & Petroleum Engineering
Department of Drilling - Production Engineering

Course
Geostatistics in Petroleum Engineering
Trần Nguyễn Thiện Tâm
[email protected]
Chapter 2

Principles of Statistics

9/10/2024 Geostatistics 2
References
Mohan Kelkar, Godofredo Perez. Applied Geostatistics for Reservoir Characterization.
Society of Petroleum Engineers, Texas, 2002.

9/10/2024 Geostatistics 3
Contents
❑ Introduction
❑ Descriptive statistics
❑ Inferential statistics

9/10/2024 Geostatistics 4
Introduction
Descriptive Statistics
• Organization, presentation, and summarization of data.
• Better understanding of the type of information currently available that allows us to
use it more productively.
Inferential statistics
• Deriving conclusions about a population on the basis of sample data
• Estimating values at unsampled locations

9/10/2024 Geostatistics 5
Descriptive Statistics
▪ Frequency Distribution
▪ Univariate Statistics
▪ Bivariate Statistics

9/10/2024 Geostatistics 6
Frequency Distribution
One of the simplest ways to analyze sample data.
Summarizes the data in a more compact form than the original sample observations.
The range of the data is divided into intervals called class intervals.
The number of measurements falling within a particular class, i, is called the class
frequency, $f_i$.

9/10/2024 Geostatistics 7
Frequency Distribution
Relative frequency, $f_{Ri}$
$$ f_{Ri} = \frac{f_i}{n} $$
n = total number of samples
$$ \sum_{i=1}^{N} f_{Ri} = 1 $$
N = total number of classes
Cumulative relative class frequency
$$ F_j = \sum_{i=1}^{j} f_{Ri} $$

9/10/2024 Geostatistics 8
Example 2.1
The following porosity samples are measured in a wellbore: 0.141, 0.124, 0.152, 0.156,
0.113, 0.167, 0.194, 0.142, 0.133, 0.149, 0.106, 0.137, 0.147, 0.159, 0.174, 0.129, 0.153,
0.173, 0.189, 0.16, 0.193, 0.156, 0.149, 0.135, 0.145, 0.171, 0.101, 0.151, 0.176, 0.191,
0.121, 0.148, 0.153, 0.171, 0.183, 0.108, 0.123, 0.169, 0.185, 0.153, 0.117, 0.127, 0.145,
0.141, 0.165, 0.14, 0.143, 0.178, 0.179, 0.157. Analyze these porosity, ϕ, values using a
frequency-distribution analysis.

9/10/2024 Geostatistics 9
Example 2.1 - Solution
For these 50 values, we divide the data into five classes
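The class table for this solution did not survive in the text, so the following is only a sketch of how such a table could be built. The class limits (five equal-width classes from 0.10 to 0.20, each closed on the left) are an assumption made for illustration, not necessarily the limits used in the original solution. Values are converted to integer thousandths so that samples lying exactly on a class limit are binned reliably.

```python
# Porosity samples from Example 2.1 (50 values).
porosity = [
    0.141, 0.124, 0.152, 0.156, 0.113, 0.167, 0.194, 0.142, 0.133, 0.149,
    0.106, 0.137, 0.147, 0.159, 0.174, 0.129, 0.153, 0.173, 0.189, 0.16,
    0.193, 0.156, 0.149, 0.135, 0.145, 0.171, 0.101, 0.151, 0.176, 0.191,
    0.121, 0.148, 0.153, 0.171, 0.183, 0.108, 0.123, 0.169, 0.185, 0.153,
    0.117, 0.127, 0.145, 0.141, 0.165, 0.14, 0.143, 0.178, 0.179, 0.157,
]

# Assumed class layout (hypothetical): [0.10, 0.12), [0.12, 0.14), ..., [0.18, 0.20),
# expressed in thousandths to avoid floating-point edge cases at class limits.
LOWER, WIDTH, N_CLASSES = 100, 20, 5


def frequency_table(values):
    """Return class frequencies f_i, relative frequencies f_Ri, and cumulative F_j."""
    counts = [0] * N_CLASSES
    for v in values:
        k = (round(v * 1000) - LOWER) // WIDTH  # index of the class containing v
        counts[k] += 1
    n = len(values)
    rel = [c / n for c in counts]               # f_Ri = f_i / n
    cum, running = [], 0.0
    for r in rel:                               # F_j = sum of f_Ri for i <= j
        running += r
        cum.append(running)
    return counts, rel, cum


counts, rel, cum = frequency_table(porosity)
for k in range(N_CLASSES):
    lo = (LOWER + k * WIDTH) / 1000
    hi = (LOWER + (k + 1) * WIDTH) / 1000
    print(f"{lo:.2f}-{hi:.2f}  f={counts[k]:2d}  fR={rel[k]:.2f}  F={cum[k]:.2f}")
```

With a different choice of class limits the individual frequencies change, but the relative frequencies still sum to one.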

9/10/2024 Geostatistics 10
Univariate Statistics
• Mean
• Median
• Mode
• Percentiles
• Variance
• Coefficient of Variation
• Range

9/10/2024 Geostatistics 11
Mean
Arithmetic mean: represents the central tendency of the sample
$$ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i $$
where
n = total number of samples
$x_i$ = the value of sample i

9/10/2024 Geostatistics 12
Median
Another measure of central tendency, which is the sample point that divides the sample
into equal halves.
Arrange all the samples in ascending order so that $x_1 \le x_2 \le \dots \le x_n$.
When n is odd, the sample median, $\tilde{x}$, is calculated by
$$ \tilde{x} = x_{(n+1)/2} $$
When n is even, the sample median, $\tilde{x}$, is calculated by
$$ \tilde{x} = \frac{x_{n/2} + x_{n/2+1}}{2} $$

9/10/2024 Geostatistics 13
Mode
The mode, another measure of central tendency, is the observation that occurs most
frequently in the sample.
The value of the mode obviously depends on the precision of the data, especially for
naturally occurring variables.
If the data are very precise, each value is unique and none is repeated.

9/10/2024 Geostatistics 14
Mode
Mean, median, and mode coincide with each other if the distribution is symmetric.
If the distribution is skewed, these three tendencies exhibit different values.
If the distribution is skewed positively (to the right), mode < median < mean.
If the distribution is skewed negatively (to the left), mode > median > mean.

9/10/2024 Geostatistics 15
Percentiles
Percentile values represent sample values that are greater than a certain percentage
of the sample values.
The median is an example of the 50th percentile value because 50% of the values are
smaller than the median.
If the values are arranged in ascending order, $x_p$ represents a value where p percent
of the values are smaller than $x_p$. For example, $x_{10}$ represents a sample value that is
greater than 10% of the total sample points.

9/10/2024 Geostatistics 16
Percentiles
Certain types of percentiles are commonly used in describing sample data. For example,
the first quartile represents x25, where 25% of the sample values are less than x25. x75
represents a value where 25% of the sample values are greater than x75.

9/10/2024 Geostatistics 17
Variance
The sample variance represents the spread of the data. It is a quantitative measure of
how widely the data are distributed.
$$ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} $$
where
$s^2$ = sample variance
$\bar{x}$ = sample mean
n = total number of samples

9/10/2024 Geostatistics 18
Variance
Variance can also be calculated as
$$ s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1} $$
The square root of variance, s, is called the standard deviation. It has the same units as
the variable being sampled.

9/10/2024 Geostatistics 19
Coefficient of Variation
The coefficient of variation, $C_v$, is defined as
$$ C_v = \frac{s}{\bar{x}} $$
where
s = standard deviation and
$\bar{x}$ = sample mean
Because s and 𝑥ҧ have the same units, Cv is a dimensionless quantity; therefore, it
provides a measure of the relative spread of a sample.

9/10/2024 Geostatistics 20
Range
Range is another quantitative measure of the spread. A simple definition of range, R, is
$$ R = x_{max} - x_{min} $$
where
$x_{max}$ = the maximum value and
$x_{min}$ = the minimum value

9/10/2024 Geostatistics 21
Range
Other definitions of range have also been used. For example, interquartile range
represents the difference between two successive quartile values. We can define the
first quartile range as
R1 = x25 - xmin
where
x25 = the 25th percentile value and
xmin = the minimum value.
Similar definitions can be used for other quartile ranges.

9/10/2024 Geostatistics 22
Example 2.2
The following data for pay-zone thickness (in feet) are collected from all available wells
in a reservoir: 6, 10, 20, 12, 20, 10, 15, 32, 27, 10, 18, 29, 8, 17, 23, 36, 19, 13, 33, 10, 26.
Calculate mean, median, mode, quartile values, variance, coefficient of variation, and
range.
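As a hedged sketch of how the requested quantities could be computed, the following uses Python's standard `statistics` module. The variance uses the (n - 1) denominator defined above, and the shortcut form is evaluated alongside it as a cross-check. Quartile values depend on the interpolation convention; the default of `statistics.quantiles` is used here as an assumption and may differ slightly from a hand calculation.

```python
import math
import statistics

# Pay-zone thickness data (ft) from Example 2.2.
thickness = [6, 10, 20, 12, 20, 10, 15, 32, 27, 10, 18,
             29, 8, 17, 23, 36, 19, 13, 33, 10, 26]

n = len(thickness)
mean = sum(thickness) / n
median = statistics.median(thickness)
mode = statistics.mode(thickness)

# Sample variance with the (n - 1) denominator, and the equivalent shortcut form.
variance = statistics.variance(thickness)
shortcut = (sum(x * x for x in thickness) - n * mean ** 2) / (n - 1)

std_dev = math.sqrt(variance)
coeff_variation = std_dev / mean            # dimensionless
data_range = max(thickness) - min(thickness)

# Quartiles x25, x50, x75 (interpolation convention is an assumption).
q1, q2, q3 = statistics.quantiles(thickness, n=4)

print(f"mean={mean:.3f} median={median} mode={mode}")
print(f"quartiles: {q1}, {q2}, {q3}")
print(f"variance={variance:.3f} std={std_dev:.3f} Cv={coeff_variation:.3f} range={data_range}")
```

Because the mean exceeds the median, which exceeds the mode, this sample is positively skewed in the sense described on the Mode slide.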

9/10/2024 Geostatistics 23
Bivariate Statistics
Covariance is a measure of the relationship between two variables
$$ c(x, y) = \frac{1}{n}\sum_{i=1}^{n} x_i y_i - \left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)\left(\frac{1}{n}\sum_{i=1}^{n} y_i\right) $$
where
$x_i$ and $y_i$ = samples of the variables x and y, respectively, and
n = total number of sample pairs
Note that covariance reduces to variance if x = y.

9/10/2024 Geostatistics 24
Bivariate Statistics
If x and y are positively related (i.e., as x increases, y increases), the covariance has a
positive value.
If x and y are negatively related (i.e., as x increases, y decreases), covariance has a
negative value.
If x and y are not related, the covariance has a value close to zero.

9/10/2024 Geostatistics 25
Bivariate Statistics
Correlation coefficient
$$ r(x, y) = \frac{c(x, y)}{s_x s_y} $$
where
r(x,y) = correlation coefficient,
c(x,y) = covariance between x and y,
$s_x$ = standard deviation of the x variable, and
$s_y$ = standard deviation of the y variable.

9/10/2024 Geostatistics 26
Bivariate Statistics
The value of the correlation coefficient always falls between the limits of +1 and -1.
If x and y are positively related, the correlation coefficient falls between 0 and +1. The
stronger the relationship, the closer the value will be to +1.
If x and y are negatively related, the correlation coefficient falls between 0 and -1. The
stronger the relationship, the closer the value will be to -1.
If x and y are not related, the correlation coefficient is close to zero.

9/10/2024 Geostatistics 27
Bivariate Statistics
In some instances, the square of the correlation coefficient, r²(x, y), is used instead of the
correlation coefficient to describe the relationship between the two variables.
One advantage of using this value (sometimes called the r² statistic) is that it always falls
between zero and one, whether x and y are positively or negatively related.
This is the term most commonly used in describing the “goodness of fit” in a linear-
regression analysis between two variables.

9/10/2024 Geostatistics 28
Bivariate Statistics
In addition to the correlation coefficient, the rank correlation coefficient is another
measure that indicates the relationship between two variables.
$$ r(R_x, R_y) = \frac{c(R_x, R_y)}{s_{R_x} s_{R_y}} $$
where
r(Rx, Ry) = rank correlation coefficient,
c(Rx, Ry) = covariance between the rank values of the two variables, and
$s_{R_x}$ and $s_{R_y}$ = standard deviations for the rank values for the two variables.
When each variable has the same number of data values, $s_{R_x} = s_{R_y}$.

9/10/2024 Geostatistics 29
Example 2.3
Table 2.5 provides core permeability vs. core porosity data from a well. Calculate the
covariance, the correlation coefficient and the rank correlation coefficient between log k
and ϕ data.

9/10/2024 Geostatistics 30
Example 2.3 - Solution

9/10/2024 Geostatistics 31
Example 2.3 - Solution
To calculate correlation coefficient, we must calculate standard deviation for both log k
and ϕ. Use Eq. 2.7 with n instead of (n - 1) as the denominator.
$$ s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n} $$

9/10/2024 Geostatistics 32
Example 2.3 - Solution
To estimate the rank correlation coefficient, all data values are first sorted in ascending
order. Then, each value is assigned a rank, depending on where it falls. The smallest
value receives the lowest rank, and the largest value has a rank of n, where n = total
number of samples.
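The calculation steps described for this example can be sketched as follows. Table 2.5 is not reproduced in this text, so the `phi` and `log_k` values below are hypothetical placeholders, not the data from the table. The sketch uses n as the denominator throughout and assumes no tied values when assigning ranks.

```python
import math


def covariance(x, y):
    """c(x, y) = mean(x*y) - mean(x)*mean(y), with n as the denominator."""
    n = len(x)
    return sum(a * b for a, b in zip(x, y)) / n - (sum(x) / n) * (sum(y) / n)


def std_dev(x):
    """Standard deviation with n as the denominator (covariance of x with itself)."""
    return math.sqrt(max(covariance(x, x), 0.0))


def correlation(x, y):
    """r(x, y) = c(x, y) / (s_x * s_y)."""
    return covariance(x, y) / (std_dev(x) * std_dev(y))


def ranks(x):
    """Rank 1 for the smallest value up to n for the largest (assumes no ties)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    result = [0] * len(x)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def rank_correlation(x, y):
    """Correlation coefficient computed on the rank values."""
    return correlation(ranks(x), ranks(y))


# Hypothetical porosity and log-permeability pairs (NOT the Table 2.5 data).
phi = [0.08, 0.10, 0.12, 0.15, 0.18, 0.21]
log_k = [0.2, 0.9, 1.1, 1.9, 2.2, 3.0]

print("c   =", covariance(phi, log_k))
print("r   =", correlation(phi, log_k))
print("r_R =", rank_correlation(phi, log_k))
```

Because the rank correlation depends only on ordering, it equals one for any monotonically increasing relationship, even when the ordinary correlation coefficient is below one.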

9/10/2024 Geostatistics 33
Linear Regression
A linear relationship is useful in predicting a value of one variable when the value of
the other variable is known.
The simplest type of this relationship is
y = mx + b
where
y = the variable to be estimated;
x = the known variable,
m = the slope of the straight line, and
b = an intercept on the y axis.

9/10/2024 Geostatistics 34
Linear Regression
To estimate the values of m and b, we first use the available sample pairs of x and y, and
obtain the “best” fit between the two variables.
We can show that the best fit can be obtained by defining the values of m and b as
$$ m = \frac{c(x, y)}{s_x^2} $$
$$ b = \bar{y} - m\bar{x} $$
where
c(x,y) = covariance between x and y,
$s_x^2$ = variance of x, and
$\bar{y}$ and $\bar{x}$ = arithmetic means of the y and x variables, respectively.

9/10/2024 Geostatistics 35
Example 2.4
Table 2.5 provides core permeability vs. core porosity data from a well. Find the best-fit
line between log k and ϕ values.
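A minimal sketch of the best-fit calculation follows, using m = c(x, y) / s_x² and b = ȳ - m x̄ with n as the denominator for both the covariance and the variance. Table 2.5 is not available in this text, so the data pairs are hypothetical and serve only to show the mechanics.

```python
def fit_line(x, y):
    """Return slope m and intercept b of the best-fit line y = m*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum(a * b for a, b in zip(x, y)) / n - mean_x * mean_y
    var_x = sum(a * a for a in x) / n - mean_x ** 2
    m = cov_xy / var_x            # m = c(x, y) / s_x^2
    b = mean_y - m * mean_x       # b = y_bar - m * x_bar
    return m, b


# Hypothetical porosity (x) and log-permeability (y) pairs, NOT the Table 2.5 data.
phi = [0.08, 0.10, 0.12, 0.15, 0.18, 0.21]
log_k = [0.2, 0.9, 1.1, 1.9, 2.2, 3.0]

m, b = fit_line(phi, log_k)
print(f"log k = {m:.3f} * phi + {b:.3f}")
```

The same denominator must be used for the covariance and the variance; otherwise the slope is scaled by n / (n - 1).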

9/10/2024 Geostatistics 36
Bivariate Relationships for Spatial Data
Covariance can be used as a statistical tool to quantify the relationship between two
variables.
An important distinction when establishing a bivariate relationship for spatial data is
that the same variable is examined but at different spatial locations. It also is possible
to develop a relationship between two different variables at different locations.

9/10/2024 Geostatistics 37
Example 2.5
Table 2.9 shows porosity data collected from a vertical well at uniform intervals of 1 ft.
Establish a relationship between porosity values at different locations as functions of
distance between those values.

9/10/2024 Geostatistics 38
Example 2.5 - Solution
Recall that the covariance relationship states that
$$ c(x, y) = \frac{1}{n}\sum_{i=1}^{n} x_i y_i - \left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)\left(\frac{1}{n}\sum_{i=1}^{n} y_i\right) $$
We use the same relationship, except that the x and y variables are the same variable at
different locations. For example, if we denote x(u) as a value of x at Location u
and x(u + L) as a value of x at Location u + L, we can write the covariance as
$$ c[x(u), x(u+L)] = \frac{1}{n}\sum_{i=1}^{n} x(u_i)\,x(u_i+L) - \left(\frac{1}{n}\sum_{i=1}^{n} x(u_i)\right)\left(\frac{1}{n}\sum_{i=1}^{n} x(u_i+L)\right) $$
where
L = distance between the two variables, also called the lag distance
n = the number of pairs located a distance L apart

9/10/2024 Geostatistics 39
Example 2.5 - Solution

Fig. shows that, for a lag distance of 1 ft, we can gather six pairs; for a lag distance of 2 ft,
we can gather five pairs; and so forth.

9/10/2024 Geostatistics 40
Example 2.5 - Solution
We can calculate the covariance for a lag distance of 1 ft with n = 6.

ϕ(u) ϕ(u + 1)
8.25 9.00
9.00 6.25
6.25 5.00
5.00 5.30
5.30 4.75
4.75 5.00

$$ c[\phi(u), \phi(u+1)] = \frac{1}{6}(8.25 \times 9.00 + 9.00 \times 6.25 + 6.25 \times 5.00 + 5.00 \times 5.30 + 5.30 \times 4.75 + 4.75 \times 5.00) - \frac{1}{6}(8.25 + 9.00 + 6.25 + 5.00 + 5.30 + 4.75) \times \frac{1}{6}(9.00 + 6.25 + 5.00 + 5.30 + 4.75 + 5.00) = 1.73 $$

For covariance, we can simply write the left side of the equation as
c(1) because it reflects the covariance for a lag distance of 1 ft.

9/10/2024 Geostatistics 41
Example 2.5 - Solution
ϕ(u) ϕ(u + 2)
8.25 6.25
9.00 5.00
6.25 5.30
5.00 4.75
5.30 5.00

Covariance for a lag distance of 2 ft is calculated in the same way. There are five pairs at
that lag distance.
$$ c(2) = \frac{1}{5}(8.25 \times 6.25 + 9.00 \times 5.00 + 6.25 \times 5.30 + 5.00 \times 4.75 + 5.30 \times 5.00) - \frac{1}{5}(8.25 + 9.00 + 6.25 + 5.00 + 5.30) \times \frac{1}{5}(6.25 + 5.00 + 5.30 + 4.75 + 5.00) = 0.43 $$

9/10/2024 Geostatistics 42
Example 2.5 - Solution
ϕ(u) ϕ(u + 3)
8.25 5.00
9.00 5.30
6.25 4.75
5.00 5.00

There are four pairs for a lag distance of 3 ft, and the same equation is used to calculate
c(3) = 0.195.

9/10/2024 Geostatistics 43
Example 2.5 - Solution
The values of correlation coefficient at various lag distances can be calculated similarly.
$$ r(x, y) = \frac{c(x, y)}{s_x s_y} $$
For spatial data sets,
$$ r[x(u), x(u+L)] = \frac{c[x(u), x(u+L)]}{s_{x(u)}\, s_{x(u+L)}} $$
As in the case of covariance, the correlation coefficient can be written simply as a
function of the lag distance.
$$ r(L) = \frac{c(L)}{s_{x(u)}\, s_{x(u+L)}} $$

9/10/2024 Geostatistics 44
Example 2.5 - Solution
For a lag distance of 1 ft, we can calculate $s_{x(u)}$ by calculating the variance of all data
points used as a first data point in a given pair.
The mean is
$$ \bar{x}(u) = \frac{1}{6}(8.25 + 9.00 + 6.25 + 5.00 + 5.30 + 4.75) = 6.425 $$
The variance is
$$ s_{x(u)}^2 = \frac{\sum_{i=1}^{6} x(u_i)^2 - 6\,\bar{x}(u)^2}{6} = \frac{8.25^2 + 9.00^2 + 6.25^2 + 5.00^2 + 5.30^2 + 4.75^2 - 6 \times 6.425^2}{6} = 2.6823 $$
Therefore, $s_{x(u)}$ = 1.638

9/10/2024 Geostatistics 45
Example 2.5 - Solution
Similarly, for the second data point in each pair.
The mean is
$$ \bar{x}(u+1) = \frac{1}{6}(9.00 + 6.25 + 5.00 + 5.30 + 4.75 + 5.00) = 5.883 $$
The variance is
$$ s_{x(u+1)}^2 = \frac{\sum_{i=1}^{6} x(u_i+1)^2 - 6\,\bar{x}(u+1)^2}{6} = \frac{9.00^2 + 6.25^2 + 5.00^2 + 5.30^2 + 4.75^2 + 5.00^2 - 6 \times 5.883^2}{6} = 2.1722 $$
Therefore, $s_{x(u+1)}$ = 1.474

9/10/2024 Geostatistics 46
Example 2.5 - Solution
$$ r(1) = \frac{c(1)}{s_{x(u)}\, s_{x(u+1)}} = \frac{1.73}{1.638 \times 1.474} = 0.7165 $$
Similarly, for lag distances of 2 and 3 ft, respectively,
$$ r(2) = \frac{c(2)}{s_{x(u)}\, s_{x(u+2)}} = \frac{0.43}{1.595 \times 0.525} = 0.5135 $$
and
$$ r(3) = \frac{c(3)}{s_{x(u)}\, s_{x(u+3)}} = \frac{0.195}{1.586 \times 0.195} = 0.631 $$

9/10/2024 Geostatistics 47
Example 2.5 - Solution
A special case exists when the covariance and correlation-coefficient values are
estimated at a lag distance of zero. At L = 0, the equation for covariance reduces to the
corresponding equation for variance.
$$ c(0) = \frac{1}{n}\sum_{i=1}^{n} x(u_i)\,x(u_i) - \left(\frac{1}{n}\sum_{i=1}^{n} x(u_i)\right)\left(\frac{1}{n}\sum_{i=1}^{n} x(u_i)\right) = s_{x(u)}^2 $$
In our example, n = 7 for L = 0, which gives $c(0) = s_{x(u)}^2 = 2.548$.
We can easily show that r(0) = 1 because a perfect relationship exists between x(u) and
x(u): they are identical. Also mathematically,
$$ r(0) = \frac{c(0)}{s_{x(u)}\, s_{x(u)}} = \frac{s_{x(u)}^2}{s_{x(u)}\, s_{x(u)}} = 1 $$
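The lag calculations in this example can be reproduced with a short script. This is a sketch under the same conventions as the worked solution: the seven porosity values are read from the pair tables above, n is the number of pairs at each lag, and n (not n - 1) is the denominator. The function names are illustrative.

```python
import math

# Porosity values at 1-ft spacing, as they appear in the pair tables of Example 2.5.
porosity = [8.25, 9.00, 6.25, 5.00, 5.30, 4.75, 5.00]


def covariance(x, y):
    """c(x, y) = mean(x*y) - mean(x)*mean(y), with n as the denominator."""
    n = len(x)
    return sum(a * b for a, b in zip(x, y)) / n - (sum(x) / n) * (sum(y) / n)


def lag_pairs(values, lag):
    """Split the series into x(u) and x(u + L) for a lag of `lag` samples."""
    head = values[:len(values) - lag]   # first data point of each pair
    tail = values[lag:]                 # second data point of each pair
    return head, tail


def lag_covariance(values, lag):
    head, tail = lag_pairs(values, lag)
    return covariance(head, tail)


def lag_correlation(values, lag):
    head, tail = lag_pairs(values, lag)
    s_head = math.sqrt(covariance(head, head))
    s_tail = math.sqrt(covariance(tail, tail))
    return covariance(head, tail) / (s_head * s_tail)


for lag in range(4):
    c = lag_covariance(porosity, lag)
    r = lag_correlation(porosity, lag)
    print(f"L = {lag} ft: pairs = {len(porosity) - lag}, c(L) = {c:.3f}, r(L) = {r:.3f}")
```

Small differences in the third decimal place relative to the slides come from the slides rounding c(L) and the standard deviations before dividing.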

9/10/2024 Geostatistics 48
Inferential statistics
Inferential statistics is a logical extension of descriptive statistics.
Descriptive statistics most often deals with analyzing sample data sets. However, from
the characteristics of the sample, conclusions (inferences) can be drawn about the
population from which the sample was taken.

9/10/2024 Geostatistics 49
Inferential statistics
▪ Random Experiment
▪ Sample Space and Events
▪ Probability
▪ Random Variables
▪ Mathematical Expectation
▪ Important Distribution Functions

9/10/2024 Geostatistics 50
Random Experiment
An experiment whose outcome cannot be predicted with certainty in advance.
Obviously, a random experiment has to result in more than one possible outcome.
Example: tossing of a coin
• More than one outcome: heads or tails
• We cannot predict the outcome with certainty
All characteristics of a random experiment are satisfied

9/10/2024 Geostatistics 51
Random Experiment
The concept of random experiment is extremely important. However, drilling a well as a
random experiment is different from tossing a coin as a random experiment. Tossing a
coin results in either of two outcomes; therefore, it is a random experiment. Drilling a
well results in only one outcome; however, our lack of knowledge does not allow us to
predict that outcome with certainty. Therefore, we treat it as a random experiment
with multiple possibilities of outcomes.

9/10/2024 Geostatistics 52
Sample Space and Events
A sample space, S, is a set of all possible outcomes. For a rolling-a-die experiment, we
can denote the sample space as
S = (1, 2, 3, 4, 5, 6)
because the experiment has six possible outcomes.
An event is defined as a set consisting of some of the possible outcomes. If Event A
consists of all the even-numbered outcomes of the rolling-a-die experiment, then Event
A is
A = (2, 4, 6)

9/10/2024 Geostatistics 53
Sample Space and Events
If Event B consists of all the outcomes less than five for the rolling-a-die experiment,
then Event B is
B = (1, 2, 3, 4)
Using only two events, A and B, of a sample space, we can define the union of these two
events, A ∪ B, as consisting of all the outcomes present in either A or B. Therefore,
A ∪ B = (1, 2, 3, 4, 6)
Similarly, using Events A and B, we can define the intersection of these two events, A ∩
B, as consisting of all outcomes that are present in both A and B. Therefore,
A ∩ B = (2, 4)

9/10/2024 Geostatistics 54
Sample Space and Events
If the intersection of two events results in a null set (containing no outcome), these two
events are mutually exclusive.
For example, if Event C contains all the odd-numbered outcomes from a rolling-a-die
experiment, then
C = (1, 3, 5)
With the definition of intersection of events,
A∩C=∅
where ∅ = a null set.
Because A ∩ C contains no outcomes, A and C are considered mutually exclusive.
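The event definitions above map directly onto Python's built-in `set` type. The following sketch rebuilds Events A, B, and C for the rolling-a-die experiment and checks the union, intersection, and mutual exclusivity stated in the slides.

```python
# Sample space of the rolling-a-die experiment.
S = {1, 2, 3, 4, 5, 6}

A = {o for o in S if o % 2 == 0}   # even-numbered outcomes
B = {o for o in S if o < 5}        # outcomes less than five
C = {o for o in S if o % 2 == 1}   # odd-numbered outcomes

union_AB = A | B                   # outcomes present in either A or B
intersection_AB = A & B            # outcomes present in both A and B
mutually_exclusive_AC = A.isdisjoint(C)   # True when A ∩ C is the null set

print("A ∪ B =", sorted(union_AB))
print("A ∩ B =", sorted(intersection_AB))
print("A and C mutually exclusive:", mutually_exclusive_AC)
```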

9/10/2024 Geostatistics 55
Sample Space and Events
The best way to illustrate some of these definitions is with Venn diagrams. Fig. 2.25
shows examples of Venn diagrams for union and intersection of events within a sample
space. The shaded region indicates the resulting event.

9/10/2024 Geostatistics 56
Probability
Probability normally is associated with a particular event of a random experiment. A
geologist’s statement that “there is 30% probability of finding oil at a location where a
new well is to be drilled” can have two meanings. Both meanings are correct and are a
result of the way we define the random experiment.
1. The first interpretation is that the geologist believes that, in reservoirs with a similar
depositional environment, 30% of the wells will produce oil. That is, if several wells are
drilled in very similar depositional environments, 30% will produce oil and, by
inference, 70% will be dry holes.
2. The second interpretation is that the 30% probability is a measure of the geologist’s
subjective belief that the well will produce oil.
These two interpretations can be directly related to the description of random
experiments.

9/10/2024 Geostatistics 57
Probability
The first interpretation is closely related to the random experiment of rolling a die.
That is, if we repeat the experiment a large number of times under controlled
conditions, a pattern emerges about the outcomes. For example, for a true die, if we roll
the die a large number of times, we observe that each of the six outcomes is equally
likely. Under this interpretation, probability can be defined as
$$ p(A) = \lim_{n_C \to \infty} \frac{n_A}{n_C} $$

where
p(A) = probability of Event A,
$n_A$ = number of times Event A has occurred, and
$n_C$ = number of times the random experiment is conducted under controlled conditions.
$n_C$ has to be large to ensure that a correct pattern is captured.

9/10/2024 Geostatistics 58
Probability
For the rolling-a-die experiment, the probability of each of the six outcomes is 1/6 or
0.1667. Going back to the geologist’s statement, if we drill a large number of wells in a
similar, depositional environment, we observe that p(P), where P = producer is 30%, or
30% of the wells should produce oil.
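The relative-frequency interpretation can be illustrated with a small simulation. This is only a sketch: it uses Python's pseudo-random generator with a fixed seed (an arbitrary choice for reproducibility) to stand in for rolling a true die, and it shows the ratio n_A / n_C approaching 1/6 as the number of trials grows.

```python
import random


def relative_frequency(event, n_trials, seed=0):
    """Estimate p(event) as n_A / n_C from n_trials simulated rolls of a fair die."""
    rng = random.Random(seed)          # fixed seed so the run is reproducible
    hits = sum(1 for _ in range(n_trials) if rng.randint(1, 6) in event)
    return hits / n_trials


for n_c in (100, 10_000, 100_000):
    estimate = relative_frequency({3}, n_c)
    print(f"n_C = {n_c:>7}: n_A / n_C = {estimate:.4f} (limit 1/6 = {1 / 6:.4f})")
```

With few trials the estimate can be far from 1/6, which is why n_C has to be large before a pattern is captured.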

9/10/2024 Geostatistics 59
Probability
Interpretation 2 is closely related to the random experiment of drilling a well. The
geologist is simply using a subjective belief about the chance of success. Uncertainty
exists because of a lack of complete knowledge about the reservoir. However, the
geologist is using his/her partial knowledge to assign a value to the probability of
success.

9/10/2024 Geostatistics 60
Probability
Both interpretations are correct, and the mathematics of probability does not change
with the interpretation applied. Deterministic events can be treated as random events if
we lack sufficient knowledge about those events; however, with partial knowledge about
the events, probabilities can be assigned to the likely outcomes (events) of that
experiment.

9/10/2024 Geostatistics 61
Probability
▪ Laws of Probability
▪ Conditional Probability

9/10/2024 Geostatistics 62
Laws of Probability
For Event A of a random experiment with Sample Space S,
$$ 0 \le p(A) \le 1 $$
That is, the probability value can never be less than zero or greater than one. Also,
$$ p(S) = 1 $$
That is, the probability that the outcome will be part of the sample space is equal to one.
Secondly,
$$ \sum_{i=1}^{n_e} p(A_i) = p\left(\bigcup_{i=1}^{n_e} A_i\right) $$
where
$A_i$ = sequence of mutually exclusive events and
$n_e$ = number of mutually exclusive events

9/10/2024 Geostatistics 63
Laws of Probability
For $n_e$ = 2
$$ p(A_1) + p(A_2) = p(A_1 \cup A_2) $$
and for $n_e$ = 3
$$ p(A_1) + p(A_2) + p(A_3) = p(A_1 \cup A_2 \cup A_3) $$
That is, the probability of the union of mutually exclusive events is equal to the sum
of the probabilities of the individual events.

9/10/2024 Geostatistics 64
Laws of Probability
For two events that are not mutually exclusive,
$$ p(A \cup B) = p(A) + p(B) - p(A \cap B) $$

9/10/2024 Geostatistics 65
Example 2.6
The following three events are defined for a rolling-a-die experiment.
1. Event A. All even-numbered outcomes.
2. Event B. All outcomes greater than 3.
3. Event C. All odd-numbered outcomes less than 4
Calculate p(A), p(B), p(C), p(A∪B), p(A∪C) and p(A∩B).

9/10/2024 Geostatistics 66
Example 2.6 - Solution
From the description of events, the events are A = (2, 4, 6), B = (4, 5, 6), and C = (1, 3).
Knowing that the probability of individual outcomes for rolling a die is 1/6, we can
calculate the probability of individual events
$$ p(A) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2} $$
$$ p(B) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2} $$
$$ p(C) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3} $$

9/10/2024 Geostatistics 67
Example 2.6 - Solution
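Because A ∩ C = ∅, p(A ∩ C) = 0 and
$$ p(A \cup C) = p(A) + p(C) = \frac{1}{2} + \frac{1}{3} = \frac{5}{6} $$
which can be confirmed because
A ∪ C = (1, 2, 3, 4, 6)
Therefore,
$$ p(A \cup C) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{5}{6} $$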

9/10/2024 Geostatistics 68
Example 2.6 - Solution
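and A ∩ B = (4, 6)
which results in
$$ p(A \cap B) = \frac{1}{3} $$
$$ p(A \cup B) = p(A) + p(B) - p(A \cap B) = \frac{1}{2} + \frac{1}{2} - \frac{1}{3} = \frac{2}{3} $$
which is confirmed because
A ∪ B = (2, 4, 5, 6)
Therefore,
$$ p(A \cup B) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{2}{3} $$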
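The results of Example 2.6 can be checked by counting outcomes. The sketch below uses exact fractions and assumes a true die, so each outcome has probability 1/6. The helper `p` is an illustrative name.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}           # sample space of the rolling-a-die experiment


def p(event):
    """Probability of an event when all outcomes in S are equally likely."""
    return Fraction(len(event & S), len(S))


A = {2, 4, 6}                    # even-numbered outcomes
B = {4, 5, 6}                    # outcomes greater than 3
C = {1, 3}                       # odd-numbered outcomes less than 4

print("p(A) =", p(A), " p(B) =", p(B), " p(C) =", p(C))
print("p(A ∪ C) =", p(A | C), "= p(A) + p(C) =", p(A) + p(C))
print("p(A ∩ B) =", p(A & B))
print("p(A ∪ B) =", p(A | B), "= p(A) + p(B) - p(A ∩ B) =", p(A) + p(B) - p(A & B))
```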

9/10/2024 Geostatistics 69
Conditional Probability
As the name indicates, conditional probability is the probability of an event that is
conditional on some information. This allows calculation of the probability of a given
event when partial information regarding the result of the random experiment is
available.

9/10/2024 Geostatistics 70
Conditional Probability
The most common notation used to describe conditional probability is p(A|B). This
indicates the conditional probability of Event A occurring given that Event B has
occurred. A general equation for calculating conditional probability is
$$ p(A \mid B) = \frac{p(A \cap B)}{p(B)} $$

9/10/2024 Geostatistics 71
Example 2.7
The probability of finding oil in an exploration well is estimated to be 0.2. One of the
uncertainties in finding oil in this well is the presence of source rock. The probability of
the presence of source rock is 0.7. After these preliminary calculations were made, a
well drilled in a nearby area confirmed the presence of source rock in the region. What
is the probability of finding oil in the first exploration well given that the presence of
source rock is confirmed?

9/10/2024 Geostatistics 72
Example 2.7 - Solution
Let A be the event that the oil is found in the exploration well, and B be the event that
the source rock is present.
$$ p(A \mid B) = \frac{p(A \cap B)}{p(B)} $$
We know that p(B) = 0.7 and p(A ∩ B) = 0.2 (because A is a subset of B)
$$ p(A \mid B) = \frac{0.2}{0.7} = 0.286 $$
The probability of finding oil improves to 0.286 because source rock is present.
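The arithmetic of Example 2.7 can be written as a small function. This is a sketch with illustrative names. The key modelling assumption, taken from the solution, is that oil can be found only if source rock is present, so A is a subset of B and p(A ∩ B) = p(A).

```python
def conditional_probability(p_a_and_b, p_b):
    """p(A|B) = p(A ∩ B) / p(B); undefined when p(B) is zero."""
    if p_b <= 0:
        raise ValueError("p(B) must be positive")
    return p_a_and_b / p_b


p_oil = 0.2            # p(A): oil is found in the exploration well
p_source = 0.7         # p(B): source rock is present
p_oil_and_source = p_oil   # A is a subset of B, so p(A ∩ B) = p(A)

p_oil_given_source = conditional_probability(p_oil_and_source, p_source)
print(f"p(A|B) = {p_oil_given_source:.3f}")
```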

9/10/2024 Geostatistics 73
Conditional Probability
We can rewrite the definition of conditional probability as
$$ p(A \cap B) = p(A \mid B)\, p(B) $$
Eq. allows us to derive a few more conclusions.
First, for mutually exclusive events, (A ∩ B) is a null set; therefore, p(A ∩ B) = 0.
Because the left side of Eq. is zero, p(A|B) = 0, which is consistent with the idea that if B
has occurred, A cannot occur. Therefore, the probability of A occurring given that B
has occurred is zero. The same definition can be used to define independent events,
which are events that are independent of each other. The occurrence of an independent
event is not affected by whether any of the events from which it is independent has
occurred.

9/10/2024 Geostatistics 74
Conditional Probability
For example, in rolling a pair of dice, the outcome of one die does not affect the
outcome of the other die. The outcomes of the two dice are completely independent of
each other. In other words,
p(A|B) = p(A) and p(B) = p(B|A)
where the probability that Event A will occur is not affected by the fact that B has
occurred. The same thing can be said about Event B. The probability of Event B
occurring is not affected by the fact that Event A has occurred.

9/10/2024 Geostatistics 75
Conditional Probability
Substituting gives
$$ p(A \cap B) = p(A)\, p(B) $$
for independent events. For more than two events, the equation can be extended as
$$ p(A_1 \cap A_2 \cap \dots \cap A_n) = p(A_1)\, p(A_2) \cdots p(A_n) $$
where A1, A2, ... , An = independent events.

9/10/2024 Geostatistics 76
Example 2.8
The probability of success is estimated to be 0.2 for an exploration well in Basin 1. For
another exploration well in Basin 2, the probability of success is estimated to be 0.3. If
both wells are drilled, what is the probability that both will be successful?

9/10/2024 Geostatistics 77
Example 2.8 - Solution
Because these wells are drilled in different basins, they can be considered as
independent events; the outcome of one well does not affect the outcome of the other. If
Event A is a successful well in Basin 1 and Event B is a successful well in Basin 2,
p(A) = 0.2 and p(B) = 0.3.
With Eq. 2.41,
p(A ∩ B) = p(A)p(B) = 0.2 × 0.3 = 0.06.
The probability that both events will occur is 0.06 or 6%.

9/10/2024 Geostatistics 78
Conditional Probability
Another useful extension of Eq. 2.38 can be written. Recall that Eq. 2.38 states that
$$ p(A \mid B) = \frac{p(A \cap B)}{p(B)} $$
With a sample space consisting of mutually exclusive events $A_i$, $i = 1, \dots, n_e$, so that
$$ \sum_{i=1}^{n_e} p(A_i) = 1 $$
we can easily write
$$ p(B) = p(A_1 \cap B) + p(A_2 \cap B) + \dots + p(A_{n_e} \cap B) $$

9/10/2024 Geostatistics 79
Conditional Probability
In Fig. 2.26, which illustrates this, the sample space is divided
into four mutually exclusive events, A1 through A4, and Event
B is located in the sample space. As the figure shows, Event B
can be written as
$$ B = (A_1 \cap B) + (A_2 \cap B) + (A_3 \cap B) + (A_4 \cap B) $$
This can be generalized for $n_e$ mutually exclusive events.

9/10/2024 Geostatistics 80
Conditional Probability
$$ p(A \mid B) = \frac{p(A \cap B)}{p(B)} $$
$$ p(B \mid A) = \frac{p(A \cap B)}{p(A)} $$
$$ p(A \cap B) = p(B \mid A)\, p(A) $$
$$ p(B) = p(A_1 \cap B) + p(A_2 \cap B) + \dots + p(A_{n_e} \cap B) $$
$$ p(A_j \mid B) = \frac{p(B \mid A_j)\, p(A_j)}{\sum_{i=1}^{n_e} p(A_i \cap B)} $$
$$ p(A_j \mid B) = \frac{p(B \mid A_j)\, p(A_j)}{\sum_{i=1}^{n_e} p(B \mid A_i)\, p(A_i)} $$
Eq. 2.46b is Bayes’ theorem, which represents a generalized equation for
conditional probability. An entire branch of statistics, Bayesian statistics, is based on
the concept of determining conditional probabilities.
9/10/2024 Geostatistics 81
Example 2.9
A 3D geophysical survey concludes that a new area has three potential regions where
oil can be found. It is equally likely that oil could be found in any of the three regions. On
the basis of the surrounding regions and the presence of source rock, it is possible for
only one of the regions to contain oil. Geophysicists are certain that one of these
regions should have commercial reserves. If only one exploration well is drilled in a
region, there is a 40% chance that oil will not be discovered in that region although
oil may indeed be present (overlook probability). If one well is drilled in Region 1 and
is unsuccessful, what is the probability that oil is present in Region 1? What is the
probability that the oil is present in the other two regions?

9/10/2024 Geostatistics 82
Example 2.9 - Solution
Let $E_i$, where i = 1, 2, 3, be an event that oil is in Region i. Let F be an event that the search
of Region 1 is unsuccessful. With Bayes’ theorem,
$$ p(E_1 \mid F) = \frac{p(E_1 \cap F)}{\sum_{i=1}^{3} p(F \mid E_i)\, p(E_i)} $$
and
$$ p(E_1 \cap F) = p(F \mid E_1)\, p(E_1) $$
The denominator can be written as
$$ \sum_{i=1}^{3} p(F \mid E_i)\, p(E_i) = p(F \mid E_1)\, p(E_1) + p(F \mid E_2)\, p(E_2) + p(F \mid E_3)\, p(E_3) $$

9/10/2024 Geostatistics 83
Example 2.9 - Solution
Each region has an equal likelihood of success; therefore,
p(E1) = p(E2) = p(E3) = 0.333
Also, p(F|E1) = 0.4, p(F|E2) = 1, and p(F|E3) = 1 because the exploration well in Region 1
results in failure if oil is in either Region 2 or 3. Substituting all these values gives
$$ p(E_1 \mid F) = \frac{p(F \mid E_1)\, p(E_1)}{p(F \mid E_1)\, p(E_1) + p(F \mid E_2)\, p(E_2) + p(F \mid E_3)\, p(E_3)} = \frac{0.4 \times 0.333}{0.4 \times 0.333 + 1 \times 0.333 + 1 \times 0.333} = 0.167 $$
Similarly,
$$ p(E_2 \mid F) = \frac{p(F \mid E_2)\, p(E_2)}{p(F \mid E_1)\, p(E_1) + p(F \mid E_2)\, p(E_2) + p(F \mid E_3)\, p(E_3)} = \frac{1 \times 0.333}{0.4 \times 0.333 + 1 \times 0.333 + 1 \times 0.333} = 0.417 $$
The probability of finding oil in each of Regions 2 and 3 improved from 0.333 to 0.417 because
the search in Region 1 was unsuccessful.
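The Bayes' theorem update in Example 2.9 can be sketched as a short function that takes the prior probabilities p(E_i) and the likelihoods p(F|E_i) and returns the posterior probabilities p(E_i|F). Exact fractions are used so that 1/3 is not rounded. The function name `posterior` is illustrative.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Bayes' theorem: p(E_j|F) = p(F|E_j) p(E_j) / sum_i p(F|E_i) p(E_i)."""
    joint = [like * prior for like, prior in zip(likelihoods, priors)]
    total = sum(joint)                       # p(F), the denominator
    return [j / total for j in joint]


# Equal prior probability that oil is in Region 1, 2, or 3.
priors = [Fraction(1, 3)] * 3

# p(F|E_i): probability that the Region 1 well fails, given oil is in Region i.
# 40% overlook probability if oil is in Region 1; certain failure otherwise.
likelihoods = [Fraction(2, 5), Fraction(1), Fraction(1)]

post = posterior(priors, likelihoods)
for i, value in enumerate(post, start=1):
    print(f"p(E{i}|F) = {value} = {float(value):.3f}")
```

The posterior probabilities sum to one, which is a quick check that the denominator was assembled correctly.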
9/10/2024 Geostatistics 84
Random Variables
A random variable is a variable whose values are generated by a random experiment
on the basis of some probabilistic function.
For example, the rolling-a-die experiment produces any one of the six possible outcomes
randomly. If the random variable for this random experiment is denoted by X, then, for a true die,
$$ p(X=1) = p(X=2) = p(X=3) = p(X=4) = p(X=5) = p(X=6) = \frac{1}{6} $$
That is, the probability that the random variable takes any one of the six values is 1/6.

9/10/2024 Geostatistics 85
Random Variables
It is important to maintain the distinction between a random variable and an actual
outcome of a random variable. To make this distinction, we use an uppercase letter to
denote a random variable (e.g., X) and a lowercase letter to denote the outcome (or
realization) of a random variable (e.g., x).
Conceptually, the difference between the random variable and its realizations can be
explained with the same example of the rolling-a-die experiment. The random variable
can take any of the six outcomes. A sequence of rolls such as 1, 2, 5, 6, 3, 4, 3, 3, 4, 6, 1, 4, 2, …
gives the realizations of the random variable, which we can denote as x1 = 1, x2 = 2, x3 = 5,
and x4 = 6, where the subscripts denote the number of a particular realization.

9/10/2024 Geostatistics 86
Random Variables
Random variables are of two types: discrete and continuous. A discrete random
variable can take only a finite (or countable) number of values. An example is the rolling-a-die
experiment, where the random variable can take only six possible values. A continuous
random variable can take any value within an interval, so the number of possible outcomes
is unlimited; porosity in a reservoir is an example.

Probability Function. The probability function describes the probability that a random
variable will take a certain value. The probability function is closely related to the
relative-frequency-distribution function, which describes the chance that a value will
fall within a certain class. The probability function describes an essentially similar
behavior.

For discrete random variables, the probability mass function, P(a), of a random
variable X is
P(a) = p(X = a)
A discrete random variable X can take a finite number of values $x_i$, i = 1, …, n. Therefore,
$$\sum_{i=1}^{n} P(x_i) = 1$$

Example 2.10
Define the probability mass function for the rolling-a-die experiment. Show that
$\sum_{i=1}^{n} P(x_i) = 1$ is satisfied for this function.

Example 2.10 - Solution
Knowing the random experiment, we can write, for example,
P(1) = p(X = 1) = 1/6
Similarly, we can write
P(2) = P(3) = P(4) = P(5) = P(6) = 1/6
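Summing the probability mass function over the six outcomes gives
$$\sum_{i=1}^{6} P(x_i) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} + \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = 1$$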
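The same check can be written as a short Python sketch. The dictionary name `pmf` is an illustrative choice, and a fair die is assumed as in the example.

```python
from fractions import Fraction

# Probability mass function of a fair six-sided die: P(x) = 1/6 for x = 1..6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# The probabilities of all possible outcomes must add up to one.
total = sum(pmf.values())
print(total)
```

Using `Fraction` keeps the arithmetic exact, so the sum is exactly one rather than a floating-point approximation.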

For a continuous random variable, the probability density function, f(x), describes
the behavior of a random variable. Fig. 2.27 shows a representative probability density
function. One requirement of the probability density function is that the area under the
curve be equal to one. Mathematically,
$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$

The probability that the value of a random variable will fall within a certain
interval can be calculated as
$$p(a < X \le b) = \int_{a}^{b} f(x)\,dx$$
Schematically, the probability that a value will fall within a certain interval is
represented by the area under the curve within that interval (Fig. 2.28).

Example 2.11
Pay-zone thickness in a reservoir is described by the following probability density
function.
$$f(x) = \begin{cases} 0 & \text{for } x \le 20 \text{ ft} \\ \dfrac{1}{50} & \text{for } 20 < x \le 70 \text{ ft} \\ 0 & \text{for } x > 70 \text{ ft} \end{cases}$$
Show that this density function satisfies $\int_{-\infty}^{\infty} f(x)\,dx = 1$. Further, calculate the
probability that the thickness at a particular location will fall between 30 and 50 ft.

Example 2.11 - Solution
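Applying the unit-area requirement to the probability density function gives
$$\int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{20} 0\,dx + \int_{20}^{70} \frac{1}{50}\,dx + \int_{70}^{\infty} 0\,dx = 0 + \left.\frac{x}{50}\right|_{20}^{70} + 0 = 1$$
The probability that the thickness falls between 30 and 50 ft is
$$p(30 \le X \le 50) = \int_{30}^{50} \frac{1}{50}\,dx = \left.\frac{x}{50}\right|_{30}^{50} = \frac{20}{50} = 0.4$$
Therefore, there is a 40% probability that the pay-zone thickness will fall within the
30- to 50-ft interval.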
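Both integrals can also be checked numerically. The sketch below uses a simple midpoint rule; the function names and the integration limits of 0 to 100 ft (wide enough to cover the nonzero part of the density) are assumptions made for illustration.

```python
def thickness_pdf(x):
    """Probability density of pay-zone thickness x (ft) from Example 2.11."""
    return 1.0 / 50.0 if 20.0 < x <= 70.0 else 0.0

def integrate(f, lower, upper, steps=100_000):
    """Approximate the integral of f over [lower, upper] with the midpoint rule."""
    width = (upper - lower) / steps
    return sum(f(lower + (i + 0.5) * width) for i in range(steps)) * width

total_area = integrate(thickness_pdf, 0.0, 100.0)  # density is zero outside 20-70 ft
p_30_50 = integrate(thickness_pdf, 30.0, 50.0)     # area under the curve from 30 to 50 ft
print(round(total_area, 4), round(p_30_50, 4))
```

For a density this simple the analytical integral is easier, but the same routine works for densities without a closed-form integral.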

Cumulative-Distribution Function
The cumulative-distribution function, F(x), is defined as
F(x) = p(X ≤ x)
It is the probability that a random variable X will be less than or equal to a particular value x.
Knowing the definition of the cumulative-distribution function, we can calculate the
probability that a random variable will fall within a certain interval.
p(X ≤ b) = p(X ≤ a) + p(a < X ≤ b)
Therefore,
p(a < X ≤ b) = p(X ≤ b) - p(X ≤ a) = F(b) - F(a)

For a discrete random variable, the cumulative-distribution function can be calculated as
$$F(a) = \sum_{x_i \le a} P(x_i)$$
where P(xi) = the probability mass function of a random variable.

For a continuous random variable, the cumulative-distribution function can be
calculated as
$$F(a) = \int_{-\infty}^{a} f(x)\,dx$$
where f(x) = the probability density function of a random variable.

Example 2.12
Calculate the cumulative-distribution function for the rolling-a-die experiment. What is
the probability that the outcome will fall between two and five, p(2 < X ≤ 5)?

Example 2.12 - Solution
$$F(1) = \sum_{i=1}^{1} P(x_i) = \frac{1}{6}$$
$$F(2) = \sum_{i=1}^{2} P(x_i) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}$$
Continuing in the same way gives F(3) = 1/2, F(4) = 2/3, F(5) = 5/6, and F(6) = 1.
Fig. 2.29 shows the plot of the cumulative-distribution function. It starts with a value
of zero and reaches a value of 1 when the variable equals six.
p(2 < X ≤ 5) = F(5) - F(2) = 5/6 - 1/3 = 1/2
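The cumulative sums can be tabulated with a short Python sketch. A fair die is assumed, and the helper name `cdf` is illustrative.

```python
from fractions import Fraction

pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def cdf(a):
    """F(a) = sum of P(x_i) over all outcomes x_i <= a."""
    return sum((p for x, p in pmf.items() if x <= a), Fraction(0))

# Tabulate F at 0, 1, ..., 6 and evaluate p(2 < X <= 5) = F(5) - F(2).
table = {a: cdf(a) for a in range(0, 7)}
p_2_to_5 = cdf(5) - cdf(2)
print(table, p_2_to_5)
```

The interval probability counts the outcomes 3, 4, and 5, which is why it equals 3/6.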

Example 2.13
Pay-zone thickness in a reservoir is described by the following probability density
function.
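$$f(x) = \begin{cases} 0 & \text{for } x \le 20 \text{ ft} \\ \dfrac{1}{50} & \text{for } 20 < x \le 70 \text{ ft} \\ 0 & \text{for } x > 70 \text{ ft} \end{cases}$$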
Define the cumulative-distribution function for this function. Confirm that the
probability that a thickness will fall between 30 and 50 ft is 0.4.

Example 2.13 - Solution
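For a value of a between 20 and 70,
$$F(a) = \int_{-\infty}^{a} f(x)\,dx = \int_{-\infty}^{20} 0\,dx + \int_{20}^{a} \frac{1}{50}\,dx = \frac{a - 20}{50}$$
In general,
$$F(x) = \begin{cases} 0 & \text{for } x \le 20 \text{ ft} \\ \dfrac{x - 20}{50} & \text{for } 20 < x \le 70 \text{ ft} \\ 1 & \text{for } x > 70 \text{ ft} \end{cases}$$
To calculate p(30 < X ≤ 50), we can write
$$p(30 < X \le 50) = F(50) - F(30) = \frac{50 - 20}{50} - \frac{30 - 20}{50} = 0.4$$
which confirms the previous answer.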
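A minimal Python sketch of this piecewise cumulative-distribution function follows; the function name `thickness_cdf` is an illustrative choice.

```python
def thickness_cdf(x):
    """Cumulative-distribution function of pay-zone thickness x (ft)."""
    if x <= 20.0:
        return 0.0
    if x <= 70.0:
        return (x - 20.0) / 50.0
    return 1.0

# p(30 < X <= 50) = F(50) - F(30)
p_30_50 = thickness_cdf(50.0) - thickness_cdf(30.0)
print(round(p_30_50, 3))
```

Once F is available, any interval probability reduces to a difference of two function values, with no integration needed.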

Mathematical Expectation
Mathematical expectation, or expected value, is the probability-weighted average
outcome of a random experiment, that is, the average outcome if the experiment is
conducted a large number of times.

For a discrete random variable X, the expected value of X is defined as
$$E[X] = \sum_{i=1}^{o} x_i\,P[X = x_i]$$
where
E[X] = expected value,
xi = ith outcome of the random variable,
P[X = xi] = probability mass function for the ith outcome, and
o = number of possible outcomes.

Example 2.14
What is the expected value of a rolling-a-die experiment?

Example 2.14 - Solution
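We know that the random variable can take any of the six values, each having a
probability of 1/6. Applying the definition of expected value gives
$$E[X] = \sum_{i=1}^{6} x_i\,P[X = x_i] = 1\left(\frac{1}{6}\right) + 2\left(\frac{1}{6}\right) + 3\left(\frac{1}{6}\right) + 4\left(\frac{1}{6}\right) + 5\left(\frac{1}{6}\right) + 6\left(\frac{1}{6}\right) = \frac{7}{2} = 3.5$$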
The expected value is 3.5 for the rolling-a-die experiment. As in the previous example,
an outcome of 3.5 cannot be realized after a single roll of a die; it represents the
average of all the outcomes if the experiment is repeated a large number of times. That
is, if we roll a die a large number of times and note the outcome each time, the
arithmetic average of all those realizations is close to 3.5.
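The long-run interpretation can be illustrated with a small simulation. This is a sketch only: the seed and the number of rolls are arbitrary choices, not values from the source.

```python
import random
from fractions import Fraction

pmf = {face: Fraction(1, 6) for face in range(1, 7)}
expected = sum(x * p for x, p in pmf.items())  # E[X] = sum of x_i * P[X = x_i]

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
rolls = [rng.randint(1, 6) for _ in range(100_000)]
sample_mean = sum(rolls) / len(rolls)  # arithmetic average of the realizations
print(expected, round(sample_mean, 2))
```

The exact expected value is 7/2, while the sample mean only approaches it as the number of rolls grows.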

The definition of expected value can be generalized for any real-valued function u(x) of
the variable X.
For a discrete variable
$$E[u(X)] = \sum_{i=1}^{o} u(x_i)\,P[X = x_i]$$
For a continuous variable
$$E[u(X)] = \int_{-\infty}^{\infty} u(x)\,f(x)\,dx$$

The expected value of a constant is a constant
E[K] = K
The expected value of a constant times a function is equal to the constant times the
expected value of the function
E[Ku(X)] = KE[u(X)]
The expected value of a sum of two functions is equal to the sum of the expected values
of the two functions
E[u1(X) + u2(X)] = E[u1(X)] + E[u2(X)]
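These three properties can be verified on the fair-die example. In the sketch below, the constant K = 3 and the functions u1(X) = X and u2(X) = X² are arbitrary illustrative choices.

```python
from fractions import Fraction

pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def expect(u):
    """E[u(X)] for the fair-die probability mass function."""
    return sum(u(x) * p for x, p in pmf.items())

K = 3                  # arbitrary constant chosen for illustration
u1 = lambda x: x       # u1(X) = X
u2 = lambda x: x * x   # u2(X) = X^2

const_rule = expect(lambda x: K) == K
scale_rule = expect(lambda x: K * u1(x)) == K * expect(u1)
sum_rule = expect(lambda x: u1(x) + u2(x)) == expect(u1) + expect(u2)
print(const_rule, scale_rule, sum_rule)
```

A numerical check on one distribution is not a proof, but it makes the meaning of each rule concrete.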

The two most important expected values for a single variable are the arithmetic mean
and the variance.
The arithmetic mean, μ, is defined as
μ = E[X]
μ is the expected value of the variable itself. The arithmetic mean is the population
mean, not the sample mean; therefore, we differentiate it from the sample mean
(denoted by $\bar{x}$) by using the symbol μ to represent it.

Variance, σ², is defined as
$$\sigma^2 = E\left[(X - \mu)^2\right] = V[X]$$
The notation σ² is different from s², which represents sample variance. The square root
of σ², σ, is the standard deviation.
With the three characteristics of expected values, we can show that
σ² = E[X²] - {E[X]}²
or
σ² = E[X²] - μ²
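The identity follows from expanding the square and applying the three properties of expected values (constant, constant multiple, and sum), with μ = E[X] treated as a constant. A short derivation:

```latex
\begin{align*}
\sigma^2 &= E\left[(X - \mu)^2\right] \\
         &= E\left[X^2 - 2\mu X + \mu^2\right] \\
         &= E[X^2] - 2\mu\,E[X] + \mu^2 \\
         &= E[X^2] - 2\mu^2 + \mu^2 \\
         &= E[X^2] - \mu^2
\end{align*}
```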

Some important characteristics of the variance are
V[K] = 0
where K is a constant;
V[KX] = K²V[X]
where K is a constant; and
V[KX + b] = K²V[X]
where K and b are constants.

Example 2.15
Calculate the variance for the rolling-a-die experiment.

Example 2.15 - Solution
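We calculated the arithmetic mean in Example 2.14:
μ = E[X] = 3.5
To calculate the variance, we need E[X²]:
$$E[X^2] = \sum_{i=1}^{o} x_i^2\,P[X = x_i]$$
For six outcomes, we can write
$$E[X^2] = \sum_{i=1}^{6} x_i^2\,P[X = x_i] = 1^2\left(\frac{1}{6}\right) + 2^2\left(\frac{1}{6}\right) + 3^2\left(\frac{1}{6}\right) + 4^2\left(\frac{1}{6}\right) + 5^2\left(\frac{1}{6}\right) + 6^2\left(\frac{1}{6}\right) = \frac{91}{6} = 15.17$$
σ² = E[X²] - μ² = 15.17 - 3.5² = 2.917
σ = 1.708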
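The calculation can be repeated exactly in Python as a quick check. The variable names are illustrative, and a fair die is assumed.

```python
from fractions import Fraction
from math import sqrt

pmf = {face: Fraction(1, 6) for face in range(1, 7)}

mean = sum(x * p for x, p in pmf.items())         # E[X]
mean_sq = sum(x * x * p for x, p in pmf.items())  # E[X^2]
variance = mean_sq - mean ** 2                    # sigma^2 = E[X^2] - mu^2
std_dev = sqrt(variance)                          # sigma
print(mean_sq, variance, round(std_dev, 3))
```

In exact form the variance is 35/12, which rounds to the 2.917 quoted in the solution.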

Important Distribution Functions
▪ Uniform Distribution
▪ Normal (Gaussian) Distribution
▪ Log-Normal Distribution

Uniform Distribution
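The probability density function for the uniform distribution is
$$f(x) = \begin{cases} \dfrac{1}{b - a} & \text{for } a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$
The cumulative-distribution function is
$$F(x) = \begin{cases} 0 & \text{for } x < a \\ \dfrac{x - a}{b - a} & \text{for } a \le x \le b \\ 1 & \text{for } x > b \end{cases}$$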

Fig. 2.30 shows both the probability density function and the cumulative-distribution
function.

The mean of the uniform distribution is calculated as
$$\mu = \frac{a + b}{2}$$
and the variance is calculated as
$$\sigma^2 = \frac{(b - a)^2}{12}$$
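As a sanity check, the standard closed forms for the uniform distribution, mean (a + b)/2 and variance (b − a)²/12, can be compared with numerical integration of E[X] and E[X²]. The sketch below borrows the limits a = 20 ft and b = 70 ft from the pay-zone examples; the function names are illustrative.

```python
def uniform_moments(a, b):
    """Closed-form mean and variance of a uniform distribution on [a, b]."""
    return (a + b) / 2.0, (b - a) ** 2 / 12.0

def numeric_moments(a, b, steps=200_000):
    """Midpoint-rule estimates of E[X] and V[X] for f(x) = 1/(b - a) on [a, b]."""
    width = (b - a) / steps
    density = 1.0 / (b - a)
    xs = [a + (i + 0.5) * width for i in range(steps)]
    mean = sum(x * density for x in xs) * width
    second = sum(x * x * density for x in xs) * width
    return mean, second - mean ** 2

mu, var = uniform_moments(20.0, 70.0)          # pay-zone thickness limits, ft
mu_num, var_num = numeric_moments(20.0, 70.0)
print(mu, round(var, 2), round(mu_num, 3), round(var_num, 2))
```

The numerical variance agrees with (b − a)²/12, not with a divisor of 2, which is an easy slip to make when copying the formula.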

Normal (Gaussian) Distribution
Normal distribution undoubtedly is the most famous distribution in the field of
statistics. Its probability density function has a bell-shaped curve, which is well known
to almost everyone, even those unfamiliar with statistical principles.

The density function is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right] \quad \text{for } -\infty < x < \infty$$
This distribution function has a mean of μ and a variance of σ². The maximum value of
the density function is 1/(σ√(2π)) ≈ 0.4/σ, which is reached at x = μ. The function is
symmetric about x = μ.

To use the normal-distribution function, it is much more convenient to define a
standardized normal distribution. If we define a new variable,
$$z = \frac{x - \mu}{\sigma}$$
the probability density function is
$$f(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right)$$
This distribution has a mean of zero and a variance of one.
The cumulative-distribution function is
$$F(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\,dt$$
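The integral for F(z) has no elementary closed form, which is why tables are used. In code it can be written with the error function through the standard identity F(z) = ½[1 + erf(z/√2)]. The sketch below relies on `math.erf` from the Python standard library; the function names are illustrative.

```python
from math import erf, exp, pi, sqrt

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal probability density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2.0 * pi))

def standard_normal_cdf(z):
    """F(z) for the standardized normal variable, written with the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

peak = normal_pdf(0.0)  # maximum of the standard density, reached at z = 0
print(round(peak, 4), round(standard_normal_cdf(0.0), 3), round(standard_normal_cdf(1.0), 5))
```

The peak value of about 0.399 for σ = 1 is the source of the "0.4/σ" rule of thumb for the maximum of the density.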

Fig. 2.31 shows the probability density function and the cumulative-distribution
function for the normal distribution.

Table 2.11 provides the values of F(z) for a range of z between -3
and +3. After the variable is standardized, the table can be used for a
normal distribution with any mean and variance.

Example 2.16
1. The porosity in the reservoir is estimated to have a mean of 0.2 and a variance of
0.0004. If the porosity is believed to be normally distributed, what is the probability that
the porosity value will be between 0.18 and 0.22?
2. If rock with a porosity of less than 15% is believed to be nonreservoir rock, what is
the probability that the rock at a given location will have a porosity of less than 15%?

Example 2.16 - Solution
1. For the distribution, μ = 0.2 and σ² = 0.0004, so σ = 0.02. To calculate
p[0.18 ≤ ϕ ≤ 0.22], we first need to standardize the two values
0.18 and 0.22. If ϕ1 = 0.18 and ϕ2 = 0.22,
$$z_1 = \frac{\phi_1 - \mu}{\sigma} = \frac{0.18 - 0.2}{0.02} = -1$$
$$z_2 = \frac{\phi_2 - \mu}{\sigma} = \frac{0.22 - 0.2}{0.02} = 1$$
Once the values are standardized, we can look up the values of F(z1)
and F(z2) in Table 2.11: F(z1) = 0.15866 and F(z2) = 0.84134.
p[0.18 ≤ ϕ ≤ 0.22] = F(z2) - F(z1) = 0.84134 - 0.15866 = 0.6827
That is, a 68% probability exists that a porosity value will fall
between 0.18 and 0.22.

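2. We need to calculate p[ϕ < 0.15]. We can standardize the
value with
$$z_3 = \frac{\phi_3 - \mu}{\sigma} = \frac{0.15 - 0.2}{0.02} = -2.5$$
From Table 2.11, F(z3) = 0.00621. Therefore
$$p\left[\frac{\phi - \mu}{\sigma} \le z_3\right] = 0.00621$$
That is, there is about a 0.6% probability that the rock at a given location is nonreservoir rock.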
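Both parts of this example can be reproduced without a table by evaluating the normal cumulative-distribution function through the error function. The sketch below uses `math.erf` from the Python standard library; the helper name `normal_cdf` is illustrative.

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """p(X <= x) for a normal variable with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

mu, variance = 0.2, 0.0004  # porosity mean and variance from the example
sigma = sqrt(variance)

p_between = normal_cdf(0.22, mu, sigma) - normal_cdf(0.18, mu, sigma)  # part 1
p_nonreservoir = normal_cdf(0.15, mu, sigma)                           # part 2
print(round(p_between, 4), round(p_nonreservoir, 5))
```

The function standardizes internally, so the z values of -1, +1, and -2.5 never need to be computed by hand.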

One reason for the popularity of the normal-distribution function is the central-limit
theorem, which states that the “sum of large number of independent random variables
tends to be normally distributed.”
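A small simulation illustrates the theorem. The sketch below sums 12 independent uniform(0, 1) values per trial; the number of terms, the number of trials, and the seed are arbitrary choices for illustration, not values from the source.

```python
import random

rng = random.Random(1)      # fixed seed (arbitrary) so the run is repeatable
terms, trials = 12, 20_000  # each trial sums 12 independent uniform(0, 1) values

# A sum of 12 such uniforms has mean 12 * 0.5 = 6 and variance 12 * (1/12) = 1.
sums = [sum(rng.random() for _ in range(terms)) for _ in range(trials)]

sample_mean = sum(sums) / trials
# Fraction of sums within one standard deviation of the mean; a normal
# distribution would give about 0.68.
within_one_sd = sum(1 for s in sums if abs(s - 6.0) <= 1.0) / trials
print(round(sample_mean, 2), round(within_one_sd, 2))
```

Although each term is uniformly distributed, the sums cluster around the mean in roughly the proportion expected for a normal distribution.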

Log-Normal Distribution
Log-normal distribution is closely related to normal distribution. If the logarithm of a
variable is normally distributed, then the variable itself is log-normally distributed.

Fig. 2.33 shows this transformation schematically. In this figure, the log-normal
distribution is skewed with a long tail on the right side. After transforming the data by
taking the log of the variable, however, the distribution becomes symmetric and
normal.

If we consider X to be a log-normally distributed variable, we can define Y = ln X, where
Y = the value of the natural logarithm of the random variable X. If the mean of the
variable Y is α and the variance is β², we can write the probability density function
for the variable X as
$$f(x) = \frac{1}{x\beta\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{\ln x - \alpha}{\beta}\right)^2\right] \quad \text{for } x > 0$$

We can show that the mean and the variance of the random variable X are related to the
mean and variance of the transformed variable Y:
$$\mu = \exp\left(\alpha + \frac{\beta^2}{2}\right)$$
$$\sigma^2 = \mu^2\left(e^{\beta^2} - 1\right)$$
where
μ = the mean of variable X and
σ² = the variance of variable X.
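The two relations translate directly into code. In the sketch below, the inputs α = 2.1 and β² = 1.792 are the rounded values that appear in Example 2.17, and the function name is an illustrative choice.

```python
from math import exp

def lognormal_moments(alpha, beta_sq):
    """Mean and variance of X when ln X is normal with mean alpha and variance beta_sq."""
    mu = exp(alpha + beta_sq / 2.0)
    variance = mu ** 2 * (exp(beta_sq) - 1.0)
    return mu, variance

mu, variance = lognormal_moments(alpha=2.1, beta_sq=1.792)
print(round(mu, 2), round(variance, 1))
```

Because the inputs are rounded, the results come out close to, but not exactly, a mean of 20 and a variance of 2,000.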

Example 2.17
Permeability values in a reservoir are expected to be log-normally distributed with a
mean of 20 md and a variance of 2,000 md2. What is the probability that the value of
permeability in the reservoir at a given location will exceed 200 md?

Example 2.17 - Solution
In this example, μ = 20 md and σ² = 2,000 md². Rearranging Eqs.
2.104 and 2.103 gives, respectively,
$$\beta^2 = \ln\left(1 + \frac{\sigma^2}{\mu^2}\right) = \ln\left(1 + \frac{2000}{20^2}\right) = 1.792$$
$$\alpha = \ln\mu - \frac{\beta^2}{2} = \ln 20 - \frac{1.792}{2} = 2.1$$
Standardizing gives
$$z = \frac{\ln x - \alpha}{\beta} = \frac{\ln 200 - 2.1}{\sqrt{1.792}} = 2.39$$
From Table 2.11, F(2.39) = 0.992. That is, the probability that
permeability will be less than 200 md is 99.2%. In other words,
the probability that the permeability will be greater than 200 md is
1 - 0.992 = 0.008, or 0.8%.
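The steps of this solution can be sketched in Python. The standardization divides by β, the square root of β², and `math.erf` from the standard library replaces the table lookup; the function names are illustrative.

```python
from math import erf, log, sqrt

def lognormal_parameters(mu, variance):
    """Mean alpha and variance beta_sq of ln X from the mean and variance of X."""
    beta_sq = log(1.0 + variance / mu ** 2)
    alpha = log(mu) - beta_sq / 2.0
    return alpha, beta_sq

def standard_normal_cdf(z):
    """F(z) for the standardized normal variable."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

alpha, beta_sq = lognormal_parameters(mu=20.0, variance=2000.0)  # md, md^2
z = (log(200.0) - alpha) / sqrt(beta_sq)  # divide by beta, not beta squared
p_exceed = 1.0 - standard_normal_cdf(z)   # probability that permeability > 200 md
print(round(alpha, 2), round(beta_sq, 3), round(z, 2), round(p_exceed, 3))
```

Working with the logarithm of permeability turns the skewed log-normal problem into an ordinary normal-table lookup.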
