Probability Density Function
• Where P(v)dv is the probability of the variable v taking a value between v and
v + dv.
• The PDF can be estimated by constructing a histogram of an ensemble of
measurements of v at a specified location, repeating the experiment time and
time again.
• The larger the ensemble (i.e. the more times the experiment is repeated), the
more closely the histogram will approximate the PDF.
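The histogram estimate described above can be sketched in code. This is a minimal Python illustration (not from the original notes): we draw two ensembles of a standard normal "measurement" v and compare the empirical probability of a bin with the exact one.

```python
# Sketch: estimate a PDF by histogramming repeated measurements.
# As the ensemble grows, the fraction of measurements falling in a
# bin approaches the true probability P(a < v < b).
import math
import random

random.seed(0)

def true_prob(a, b):
    # exact P(a < v < b) for a standard normal, via the error function
    phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return phi(b) - phi(a)

def hist_prob(samples, a, b):
    # empirical probability: fraction of the ensemble in (a, b)
    return sum(a < v < b for v in samples) / len(samples)

small = [random.gauss(0, 1) for _ in range(100)]
large = [random.gauss(0, 1) for _ in range(100_000)]

target = true_prob(-0.5, 0.5)  # exact value, about 0.3829
err_small = abs(hist_prob(small, -0.5, 0.5) - target)
err_large = abs(hist_prob(large, -0.5, 0.5) - target)
print(err_small, err_large)  # the larger ensemble is typically much closer
```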
Moment of a Variable
• Moments of the variable v are derived from the PDF. The n-th moment ⟨vⁿ⟩ is
defined as
⟨vⁿ⟩ = ∫ vⁿ P(v) dv
• A PDF which is symmetric about the mean ⟨v⟩ will have zero skewness.
• All higher odd moments of such a symmetric PDF will also be identically
zero.
• Skewness reveals information about the asymmetry of the PDF.
• Positive skewness indicates that the PDF has a longer tail for v − ⟨v⟩ > 0 than for
v − ⟨v⟩ < 0.
• Hence positive skewness means that the fluctuation v′ = v − ⟨v⟩ is more likely to
take on large positive values than large negative values.
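As a sketch of the moment definition, the integral ⟨vⁿ⟩ = ∫ vⁿ P(v) dv can be approximated numerically. Here we assume a standard normal PDF, which is symmetric about zero, so the odd moments should vanish.

```python
# Sketch: compute the n-th moment of a symmetric (standard normal) PDF
# by midpoint-rule numerical integration of v^n * P(v).
import math

def pdf(v):
    # standard normal density
    return math.exp(-v * v / 2) / math.sqrt(2 * math.pi)

def moment(n, lo=-10.0, hi=10.0, steps=200_000):
    dv = (hi - lo) / steps
    return sum((lo + (i + 0.5) * dv) ** n * pdf(lo + (i + 0.5) * dv) * dv
               for i in range(steps))

print(moment(1))  # first odd moment: ≈ 0 by symmetry
print(moment(2))  # second moment: ≈ 1 (the variance)
print(moment(3))  # third odd moment: ≈ 0, hence zero skewness
```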
Skewness
• A time series with long stretches of small negative values and a few instances
of large positive values, with zero time mean, has positive skewness.
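The description above can be illustrated with a toy series (the values are chosen purely for illustration): many small negative values balanced by one large positive spike, giving zero mean but positive skewness.

```python
# Sketch: sample skewness of a zero-mean series with long stretches of
# small negative values and one large positive spike.
def skewness(xs):
    n = len(xs)
    m = sum(xs) / n
    var = sum((x - m) ** 2 for x in xs) / n
    return sum((x - m) ** 3 for x in xs) / n / var ** 1.5

# 19 small negatives balanced by one large positive spike (mean is exactly 0)
series = [-0.5] * 19 + [9.5]
print(skewness(series))  # positive, as the text predicts
```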
[Figure: four histograms of ensembles of increasing size (y-axis: Frequency; x-axis: X from 0 to 20).]
For continuous data we don’t have equally spaced
discrete values so instead we use a curve or function
that describes the probability density over the range of
the distribution.
The curve is chosen so that the area under the curve is
equal to 1.
If we observe a sample of data from such a distribution
we should see that the values occur in regions where
the density is highest.
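The unit-area property can be checked numerically. The sketch below (not from the original notes) uses the Normal density with μ = 100, σ = 15 from the figure and a simple midpoint rule.

```python
# Sketch: verify that the area under a density curve equals 1,
# using a normal density with mu = 100, sigma = 15.
import math

def normal_pdf(x, mu=100.0, sigma=15.0):
    return math.exp(-((x - mu) / sigma) ** 2 / 2) / (sigma * math.sqrt(2 * math.pi))

# midpoint-rule integral over a wide range around the mean
lo, hi, steps = 0.0, 200.0, 100_000
dx = (hi - lo) / steps
area = sum(normal_pdf(lo + (i + 0.5) * dx) * dx for i in range(steps))
print(area)  # ≈ 1
```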
A continuous probability distribution
[Figure: Normal density curves (y-axis: density; x-axis: X from 50 to 150), including panels with μ = 130, σ = 10 and μ = 100, σ = 15.]
Symmetry ⇒ P (Z < 0) = 0.5
What about P (Z < 1.0)?
[Figure: standard Normal curve with the area P(Z < 1) shaded, boundary at z = 1.]
Calculating this area is not easy, so we use probability tables: tables of
probabilities that have been calculated on a computer. All we have to do is
identify the right probability in the table and copy it down!
Only one special Normal distribution, N(0, 1), has
been tabulated.
The N(0, 1) distribution is called
the standard Normal distribution.
The tables allow us to read off probabilities of the form
P (Z < z).
z 0.0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.5000 5040 5080 5120 5160 5199 5239 5279 5319 5359
0.1 0.5398 5438 5478 5517 5557 5596 5636 5675 5714 5753
0.2 0.5793 5832 5871 5910 5948 5987 6026 6064 6103 6141
0.3 0.6179 6217 6255 6293 6331 6368 6406 6443 6480 6517
0.4 0.6554 6591 6628 6664 6700 6736 6772 6808 6844 6879
0.5 0.6915 6950 6985 7019 7054 7088 7123 7157 7190 7224
0.6 0.7257 7291 7324 7357 7389 7422 7454 7486 7517 7549
0.7 0.7580 7611 7642 7673 7704 7734 7764 7794 7823 7852
0.8 0.7881 7910 7939 7967 7995 8023 8051 8078 8106 8133
0.9 0.8159 8186 8212 8238 8264 8289 8315 8340 8365 8389
1.0 0.8413 8438 8461 8485 8508 8531 8554 8577 8599 8621
1.1 0.8643 8665 8686 8708 8729 8749 8770 8790 8810 8830
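The tabulated values can be reproduced from the error function, since Φ(z) = (1 + erf(z/√2))/2. A quick Python check against a few entries above (a convenience, not part of the original notes):

```python
# Sketch: reproduce standard Normal table entries P(Z < z) from erf.
import math

def phi(z):
    # CDF of the standard Normal distribution
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(round(phi(0.00), 4))  # 0.5000, matching the z = 0.0 entry
print(round(phi(0.76), 4))  # 0.7764, matching row 0.7, column 0.06
print(round(phi(1.00), 4))  # 0.8413, matching the z = 1.0 entry
```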
The Normal distribution is symmetric so we know that
P (Z > −0.5) = P (Z < 0.5) = 0.6915
Example 3
By symmetry
P (Z < −0.76) = P (Z > 0.76) = 1 − P (Z < 0.76)
= 1 − 0.7764
= 0.2236
Example 4
We can calculate this probability as
P (−0.64 < Z < 0.43) = P (Z < 0.43) − P (Z < −0.64)
= 0.6664 − (1 − 0.7389)
= 0.4053
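As a quick check of Example 4 using the same erf-based CDF (again a convenience, not part of the original notes):

```python
# Sketch: verify Example 4's interval probability P(-0.64 < Z < 0.43).
import math

phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))

p = phi(0.43) - phi(-0.64)
print(round(p, 4))  # 0.4053, matching the worked example
```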
Example 5
Suppose X ∼ N(3, 4) and we want P(X < 6.2). Subtracting μ = 3 gives N(0, 4)
with boundary 6.2 − 3 = 3.2; dividing by σ = 2 gives N(0, 1) with boundary
3.2 / 2 = 1.6.
We convert this probability to one involving the
N(0, 1) distribution by
(i) Subtracting the mean μ
(ii) Dividing by the standard deviation σ
Subtracting the mean re-centers the distribution on
zero. Dividing by the standard deviation re-scales the
distribution so it has standard deviation 1. If we also
transform the boundary point of the area we wish to
calculate we obtain the equivalent boundary point for
the N(0, 1) distribution.
⇒ P (X < 6.2) = P (Z < 1.6) = 0.9452 where Z ∼ N(0, 1)
This process can be described by the following rule:
If X ∼ N(μ, σ²) and Z = (X − μ)/σ, then Z ∼ N(0, 1).
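The standardization rule can be sketched directly in code; the example below re-checks Example 5 (X ∼ N(3, 4), so μ = 3, σ = 2).

```python
# Sketch of the standardization rule: for X ~ N(mu, sigma^2),
# P(X < x) = P(Z < (x - mu) / sigma) with Z ~ N(0, 1).
import math

phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))

def normal_cdf(x, mu, sigma):
    # subtract the mean, divide by the standard deviation, then use Phi
    return phi((x - mu) / sigma)

# Example 5: X ~ N(3, 4), boundary x = 6.2 standardizes to z = 1.6
print(round(normal_cdf(6.2, 3.0, 2.0), 4))  # 0.9452, as in the notes
```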
Example 6
Suppose D ∼ N(2, 269). Standardizing with Z = (D − 2)/√269, where √269 ≈ 16.40,
and noting (0 − 2)/16.40 = −0.122:
P(D < 0) = P((D − 2)/√269 < (0 − 2)/√269)
= P(Z < −0.122), Z ∼ N(0, 1)
= 0.45142
Other rules that are often used are:
If X and Y are two independent Normal variables such that X ∼ N(μ₁, σ₁²)
and Y ∼ N(μ₂, σ₂²), then
X + Y ∼ N(μ₁ + μ₂, σ₁² + σ₂²)
aX ∼ N(aμ₁, a²σ₁²)
aX + bY ∼ N(aμ₁ + bμ₂, a²σ₁² + b²σ₂²)
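These rules can be spot-checked by simulation. The sketch below uses illustrative parameters X ∼ N(1, 2²) and Y ∼ N(3, 1²) (not from the notes) and verifies the mean and variance of X + Y.

```python
# Sketch: check X + Y ~ N(mu1 + mu2, sigma1^2 + sigma2^2) by simulation,
# with independent X ~ N(1, 2^2) and Y ~ N(3, 1^2).
import random
import statistics

random.seed(1)
n = 200_000
s = [random.gauss(1, 2) + random.gauss(3, 1) for _ in range(n)]

print(statistics.fmean(s))     # ≈ 1 + 3 = 4
print(statistics.variance(s))  # ≈ 2**2 + 1**2 = 5
```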
Using the Normal tables backwards
Suppose X ∼ N(45, 400) and we want the value x such that P(X < x) = 0.8.
Standardizing with Z = (X − 45)/20 we get
P((X − 45)/20 < (x − 45)/20) = 0.8
⇒ P(Z < (x − 45)/20) = 0.8
From the tables, P(Z < 0.84) ≈ 0.8, so
(x − 45)/20 = 0.84 ⇒ x = 45 + 20 × 0.84 = 61.8
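Working backwards can also be done numerically: bisect the erf-based CDF to find the z with Φ(z) = 0.8, then un-standardize. This is a sketch, not part of the original notes.

```python
# Sketch: invert the standard Normal CDF by bisection, then un-standardize
# to answer "find x with P(X < x) = 0.8" for X ~ N(45, 400).
import math

phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))

def phi_inv(p, lo=-8.0, hi=8.0):
    # bisection: Phi is increasing, so halve the bracket containing p
    for _ in range(80):
        mid = (lo + hi) / 2
        if phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z = phi_inv(0.8)
print(round(z, 2))   # 0.84, matching the value read from the tables
print(45 + 20 * z)   # x with P(X < x) = 0.8, close to 61.8
```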