
Continuous Distributions

Uploaded by

chello
Copyright
© © All Rights Reserved

24 Continuous distributions

In this chapter you will learn:
• how to describe probabilities of continuous variables
• how to calculate expected statistics of continuous variables
• about an extremely important distribution called the normal distribution
• how to work backwards from probabilities to estimate information about the data.

Introductory problem

The height of many trees in a forest is measured and they have a mean of 7 m and a standard deviation of 1.5 m. Estimate the proportion of trees above 10 m tall.

In chapter 23 we saw that being able to describe random variables allowed us to make predictions about their properties. However, a major limitation was that those methods only applied to discrete variables. In reality, many variables we are interested in, such as height, weight and time, are continuous variables. In this chapter we shall extend the methods of chapter 23 to work with continuous variables. We will also meet the incredibly important ‘normal distribution’, which is used to model a large number of continuous variables in the physical world.

24A Continuous random variables


Consider the following data for masses of 5 kg bags of rice.

Mass / kg Frequency
4.9 12
5.0 16
5.1 20
5.2 14

Not all of the data labelled 5.1 kg has a mass of exactly 5.1 kg. A bag with mass 5.1358 kg or 5.0879546 kg would be counted in this category. It would be impossible to list all the different possible actual masses, and it would be impossible to measure the mass absolutely accurately. When we collect continuous data we have to put it into groups. This means that we cannot talk about the probability of a single value of a continuous random variable (crv). We can only talk about the probability of the crv being in a specified range. A useful way of representing this is by the area under a graph. This also has the property that there is no area above a single value but we can find the area above any range.

Which is more useful – knowing that the mass of a bag of rice is 5.1 kg to 2 significant figures or knowing that the mass is 5.0879546 kg? Is knowing the exact answer always better?

© Cambridge University Press 2012

[Figure: graph of f against x for the rice data, with 4.9, 5.0, 5.1 and 5.2 marked on the x-axis]

The area under a graph can be found by integrating. The function which we have to integrate is called the probability density function (pdf), and it is often denoted f(x). The defining feature of f(x) is that the area between two x values is the probability of the crv falling between those two values.

KEY POINT 24.1

The probability of the crv falling between values a and b is:

$$P(a < x < b) = \int_a^b f(x)\,dx$$

[Figure: graph of y = f(x) with the area between x = a and x = b shaded and labelled P(a < x < b)]

Does representing probability by area – that is, providing a visual interpretation – give us any new knowledge? Is it helpful?

exam hint
For a continuous variable it does not matter whether you use strict inequalities (a < x < b) or inclusive inequalities (a ≤ x ≤ b).

As with discrete probabilities, the total probability over all cases must equal 1. Also, no probability can ever be negative. This provides two requirements for a function to be a probability density function.

KEY POINT 24.2

For a probability density function f(x):

$$\int_{-\infty}^{\infty} f(x)\,dx = 1, \quad f(x) \ge 0$$

exam hint
The limits −∞ and ∞ represent the fact that, in theory, a continuous random variable can take any real value. In practice, the limits of the integral are set to the lowest and the highest value the variable can take.

A probability density function will often have different rules over different domains. If a probability density function is only defined over a particular domain you may assume that it is zero everywhere else.



Worked example 24.1

A continuous random variable has a pdf:

$$f(x) = \begin{cases} kx^2 & 0 \le x < 1 \\ 0 & \text{otherwise} \end{cases}$$

(a) Find the value of k.
(b) Find the probability of x being between 0.2 and 0.6.

[Figure: graph of y = f(x) for x between 0 and 1]

(a) Total area is 1. Area is only found between 0 and 1.

$$1 = \int_0^1 kx^2\,dx = \left[\frac{kx^3}{3}\right]_0^1 = \frac{k}{3} \iff k = 3$$

(b) The probability that X lies in [a, b] is $\int_a^b f(x)\,dx$.

$$P(0.2 < X < 0.6) = \int_{0.2}^{0.6} 3x^2\,dx = \left[x^3\right]_{0.2}^{0.6} = 0.208$$
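The two steps of Worked example 24.1 can also be checked numerically. The sketch below is only an illustration, not part of the syllabus: it uses Python in place of a GDC, and the helper name `simpson` and the choice of Simpson's rule are our own assumptions.

```python
def simpson(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] using Simpson's rule (n must be even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd-numbered interior points get weight 4, even-numbered ones weight 2.
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# (a) The area under k*x^2 on [0, 1] must be 1, so k = 1 / (area under x^2).
k = 1 / simpson(lambda x: x**2, 0, 1)

def pdf(x):
    """The pdf from the worked example: k*x^2 on 0 <= x < 1, zero elsewhere."""
    return k * x**2 if 0 <= x < 1 else 0.0

# (b) P(0.2 < X < 0.6) is the area under the pdf between 0.2 and 0.6.
prob = simpson(pdf, 0.2, 0.6)
print(round(k, 6), round(prob, 6))
```

The printed values should agree with k = 3 and the probability 0.208 found by integration above.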

Exercise 24A
1. For each of these distributions, find the possible values of the
unknown parameter k:

(a) (i) f ( x ) = kx 3, 2 < x < 3 (ii) f ( x ) = k x , 1 < x < 4


(b) (i) f ( x ) = x k , − 1 < x ≤ 2 (ii) f ( x ) = 3x k, − 2 ≤ x < 3
(c) (i) f ( x ) = ekx, 0 < x < 2 (ii) f ( x ) = sin kxx , 0 < x < π
1 1
(d) (i) f ( x ) = , 0 ≤ x ≤ 1 (ii) f ( x ) = , 0 ≤ x ≤1
(x k) 2
x k
(e) (i) f ( x ) = x 3 , 0 < x < k (ii) f ( x ) = 2 x 1, 1 < x < k
1
(f) (i) f ( x ) = , k x < k + 1 (ii) f ( x ) = x , k < x k
1+ x
(g) (i) f ( x ) = kx 2 , 0 < x k (ii) f ( x ) = x k, 0 < x k



(h) (i) f ( x ) = k 2
, 3 x < 8 (ii) f ( x ) = ksin x , π < x < π 2
1 1
(i) (i) f ( x ) = , 1< x k (ii) f ( x ) = , k x <1
x2 2 x

2. (a) If $f(x) = \begin{cases} 2 - 2x & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$
(i) Find P( .3 < X < 0.. ). (ii) Find P(0 < X < 0.5).

(b) If $f(x) = \begin{cases} \cos x & 0 < x < \frac{\pi}{2} \\ 0 & \text{otherwise} \end{cases}$
(i) Find $P\left(\frac{\pi}{4} < X \le \frac{\pi}{3}\right)$. (ii) Find $P\left(0 \le X < \frac{\pi}{6}\right)$.

(c) If $f(x) = \begin{cases} \frac{1}{x \ln 10} & 1 < x < 10 \\ 0 & \text{otherwise} \end{cases}$
(i) Find P( ). (ii) Find P( ).

3. (a) If $f(x) = \begin{cases} 2x & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$
(i) Find a if P ( X a) = 0.4 . (ii) Find b if P ( X b) = 0.9 .

(b) If $f(x) = \begin{cases} \frac{x}{8} & 0 < x < 8 \\ 0 & \text{otherwise} \end{cases}$
(i) Find a if P ( X a) = 0.9. (ii) Find b if P ( X b) = 0.5.

(c) If $f(x) = \begin{cases} \frac{x}{16} & 2 < x < 6 \\ 0 & \text{otherwise} \end{cases}$
(i) Find a if P(2 + a < X < 6) = 0.8.
(ii) Find b if P( b X < b + 1) = 0.25.

4. A model predicts that the angle, G, by which an alpha particle is deflected by a nucleus is modelled by:

$$f(g) = \begin{cases} kg^2 & 0 < g < \pi \\ 0 & \text{otherwise} \end{cases}$$

(a) Find the value of the constant k.
(b) 10 000 alpha particles are fired at a nucleus. If the model is correct, estimate the number of alpha particles deflected by less than $\frac{\pi}{3}$. [6 marks]
5. A random variable Y has distribution:
⎧3 −3 x y > 0
f ( y) = ⎨
⎩0 otherwise

Find the exact value of P( ). [4 marks]

6. If the continuous random variable X has a probability density function

$$f(x) = \begin{cases} \sec^2 x & 0 < x < \frac{\pi}{4} \\ 0 & \text{otherwise} \end{cases}$$

find the interquartile range of X. [6 marks]

7. If $f(x) = \begin{cases} \frac{1}{x} & 1 < x < e \\ 0 & \text{otherwise} \end{cases}$

(a) Find b in terms of k if P( b X < b ) k.
(b) Find a in terms of k if P(2 − a < X ≤ 2 + a) = k. [7 marks]

⎧ e x k x < 2k
8. If f ( x ) = ⎨
⎩0 otherwise

find $P\left(X > \frac{3k}{2}\right)$. [7 marks]

24B Expectation and variance of continuous random variables

The expressions for expectation and variance of continuous random variables all involve integration.

KEY POINT 24.3

Expectation and variance of continuous random variables:

$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$

$$E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$$

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2$$

Note: The formulae in the Formula booklet are set out slightly differently.

You may have noticed that the expressions for E(X) and Var(X) look similar to those for discrete random variables, but with integration signs instead of summation signs. This is because there is a link between sums and integrals, which you met in chapter 17. You will explore this further if you study chapter 28 in the Calculus option (Topic option 9).



Worked example 24.2

If a continuous random variable has pdf

$$f(x) = \begin{cases} \frac{3}{4}x(2 - x) & 0 < x < 2 \\ 0 & \text{otherwise} \end{cases}$$

find E(X) and the standard deviation of X.

We can do the integration on the calculator. If you need reminding how, see Calculator skills sheet 10 on the CD-ROM.

$$E(X) = \int_0^2 x \times \frac{3}{4}x(2 - x)\,dx = \frac{3}{4}\int_0^2 x^2(2 - x)\,dx = 1 \quad \text{(from GDC)}$$

To find the standard deviation we must first find Var(X), which requires us to find E(X²).

$$E(X^2) = \int_0^2 x^2 \times \frac{3}{4}x(2 - x)\,dx = \frac{3}{4}\int_0^2 x^3(2 - x)\,dx = 1.2 \quad \text{(from GDC)}$$

$$\therefore \mathrm{Var}(X) = E(X^2) - [E(X)]^2 = 1.2 - 1^2 = 0.2$$

$$\text{standard deviation} = \sqrt{0.2} = 0.447$$
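If no GDC is to hand, the same integrals can be approximated in a few lines of code. The following is a minimal sketch under our own assumptions (Python, a hand-written Simpson's rule helper called `simpson`); it is not the calculator method described on the CD-ROM.

```python
import math

def simpson(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] using Simpson's rule (n must be even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def pdf(x):
    """The pdf from Worked example 24.2 on its domain 0 < x < 2."""
    return 0.75 * x * (2 - x)

mean = simpson(lambda x: x * pdf(x), 0, 2)        # E(X)
mean_sq = simpson(lambda x: x**2 * pdf(x), 0, 2)  # E(X^2)
variance = mean_sq - mean**2                      # Var(X) = E(X^2) - [E(X)]^2
sd = math.sqrt(variance)
print(round(mean, 3), round(mean_sq, 3), round(variance, 3), round(sd, 3))
```

Note that the variance is computed from the two expectations exactly as in Key point 24.3; only the integration itself is delegated to the numerical helper.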

It is also possible to find the median and mode for a continuous distribution.

The defining feature of the median is that half of the data should be below this value and half above. The mode is the most likely value. We can interpret this in terms of probability.

exam hint
The expected mean appears in examination questions more often than the median or mode.

KEY POINT 24.4

The median, m, satisfies

$$\int_{-\infty}^{m} f(x)\,dx = \frac{1}{2}$$

The mode is the value of x at the maximum value of f(x).

The maximum value of f(x) is not necessarily where $\frac{df}{dx} = 0$, see Section 16H.



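Worked example 24.3

If $f(x) = \begin{cases} \frac{3}{20}(4x^2 - x^3) & 0 < x < 2 \\ 0 & \text{otherwise} \end{cases}$ find the median and mode of X.

The probability of being below the median is $\frac{1}{2}$:

$$\int_0^m \frac{3}{20}(4x^2 - x^3)\,dx = \frac{1}{2} \iff \frac{m^3}{5} - \frac{3m^4}{80} = \frac{1}{2}$$

This is a quartic equation without any easy substitution. Time to use the calculator.

[Figure: graph of $y = \frac{m^3}{5} - \frac{3m^4}{80}$ against m, meeting the line $y = \frac{1}{2}$ at m = 1.52 and m = 5.24, with x = 2 marked]

From GDC: m = 1.52 or 5.24
However 0 < m < 2, therefore median = 1.52

For the mode, check for a local maximum:

$$\frac{df}{dx} = \frac{6x}{5} - \frac{9x^2}{20} = \frac{3x}{20}(8 - 3x) = 0 \iff x = 0 \text{ or } x = \frac{8}{3}$$

[Figure: graph of $y = \frac{df}{dx} = \frac{3x}{20}(8 - 3x)$ against x, with x = 2 and $x = \frac{8}{3}$ marked]

The stationary point $x = \frac{8}{3}$ lies outside the domain 0 < x < 2. From the graph, $\frac{df}{dx} > 0$ for all 0 < x < 2, so f(x) increases throughout the domain and its largest values occur at the upper end. The mode is therefore 2.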
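The median equation in this example has no neat algebraic solution, which is why a GDC is used. As an alternative illustration, the sketch below solves it by bisection in Python; the function names `cdf` and `bisect` are our own, and the cumulative probability is the expression $\frac{m^3}{5} - \frac{3m^4}{80}$ obtained by integrating the pdf above.

```python
def cdf(m):
    """P(X < m) for 0 < m < 2: the pdf (3/20)(4x^2 - x^3) integrated from 0 to m."""
    return m**3 / 5 - 3 * m**4 / 80

def bisect(g, lo, hi, tol=1e-10):
    """Find a root of g in [lo, hi], assuming g(lo) and g(hi) have opposite signs."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (g(lo) < 0) == (g(mid) < 0):
            lo = mid  # the sign change is in the upper half
        else:
            hi = mid  # the sign change is in the lower half
    return (lo + hi) / 2

# Search only inside the domain 0 < m < 2, so the second root near 5.24 never appears.
median = bisect(lambda m: cdf(m) - 0.5, 0, 2)
print(round(median, 2))
```

Restricting the search interval to the domain of the pdf plays the same role as rejecting the root 5.24 in the worked solution.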



Exercise 24B

1. Find E(X), the median of X, the mode of X and Var(X) if X has the given probability density function:
(a) (i) $f(x) = 2 - 2x$, 0 < x < 1 (ii) $f(x) = \frac{x}{8}$, 0 < x < 8
(b) (i) $f(x) = \frac{1}{x \ln 10}$, 1 < x < 10 (ii) $f(x) = \frac{2}{x^2}$, 1 < x < 2
(c) (i) $f(x) = \cos x$, $0 < x < \frac{\pi}{2}$ (ii) $f(x) = e^x$, 0 < x < ln 2
(d) (i) $f(x) = \frac{3}{x^4}$, x > 1 (ii) $f(x) = \frac{4}{x^5}$, x > 1
2. (a) Given that E(X) = 1.1, find k if:
⎧ 1 ⎧ k
⎪ x lnk 1 < x k ⎪ 1< x < ∞
(i) f ( x ) = ⎨ (ii) f ( x ) = ⎨ x k
⎪0 otherwise ⎪⎩0 otherwise

(b) Given that E(X) = 3, find k if:
⎧ 1
⎪k k < x k+
(i) f ( x ) = ⎨ k
⎪⎩ 0 otherwise
⎧ 1
⎪ 0 < x < ( − 1)
(ii) f ( x ) = ⎨ x k
⎪⎩ 0 otherwise

3. The continuous random variable X has pdf:

$$f(x) = \begin{cases} \frac{3}{20}(4x^2 - x^3) & 0 < x < 2 \\ 0 & \text{otherwise} \end{cases}$$

(a) Find the expected mean of X.
(b) Find the mode of X. [6 marks]

4. A continuous random variable B has pdf:

$$f(b) = \begin{cases} ab^2 & 3 < b < 10 \\ 0 & \text{otherwise} \end{cases}$$

(a) Find the value of the constant a.
(b) Find E(B). [7 marks]

5. A function f is given by:


⎧k k y>0
f ( y) = ⎨
⎩ 0 otherwise
(a) Show that f is a probability density function
(b) A random variable Y has distribution given by f (x).
Find E(Y) in terms of k. [10 marks]



6. Y is a continuous random variable with probability density function:

$$f(y) = \begin{cases} ay^2 & -k < y < k \\ 0 & \text{otherwise} \end{cases}$$

(a) Show that $a = \frac{3}{2k^3}$.
(b) Given that Var(Y) = 5 find the exact value of k. [11 marks]

7. Given that $f(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}$, x ∈ ℝ is a probability density function find E(X) and prove that Var(X) = 1. [9 marks]

24C The normal distribution

There are many situations where a variable is most likely to be close to its average value, and values further away from the average become increasingly unlikely. Many such situations can be modelled using the normal distribution.

[Figure: bell-shaped curve for X ∼ N(μ, σ²), with μ − 2σ, μ and μ + 2σ marked on the x-axis]

All that is needed to describe this distribution is its mean (μ) and variance (σ²). If a variable follows this distribution we use the notation X ∼ N(μ, σ²).

exam hint
Be careful with the notation: σ² is the variance, so X ~ N(10, 9) has standard deviation σ = 3.

The probability density function (pdf) for the normal distribution is quite complicated:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$

This function cannot be integrated in terms of other well-known functions, but your calculator can find approximate probabilities. See Calculator sheet 14 on the CD-ROM.

Historically, cumulative probabilities for the normal distribution were recorded in tables and these are still used if you don’t have a graphical calculator. As there cannot be separate tables for every possible different μ and σ, all values needed to be converted into a Z-score described later.

You may find it helpful to sketch a diagram to get a visual representation of the probability you are trying to find.

[Figure: X ∼ N(5, 1.2), with the area to the left of x = 6 shaded to show P(X < 6)]


The diagrams can also provide a useful check, to see whether you should expect the probability to be smaller or greater than 0.5.

[Figures: X ∼ N(100, 64), with the area to the right of x = 110 shaded to show P(X > 110); X ∼ N(20, 4), with the area between x = 18.5 and x = 21 shaded to show P(18.5 < X < 21)]

Worked example 24.4

The average height of people in a town is 170 cm with standard deviation of 10 cm. What is the probability that a randomly selected resident:
(a) is less than 165 cm tall?
(b) is between 180 cm and 190 cm tall?
(c) is over 176 cm tall?

State the distribution used.
X is the crv ‘height of a town resident’ so X ~ N(170, 10²)

In each part, state the probability to be found and use the calculator.

(a) [Figure: X ∼ N(170, 100), with the area to the left of x = 165 shaded]
P(X < 165) = 0.309 (3SF) (from GDC)

(b) [Figure: X ∼ N(170, 100), with the area between x = 180 and x = 190 shaded]
P(180 < X < 190) = 0.136 (3SF) (from GDC)

(c) [Figure: X ∼ N(170, 100), with the area to the right of x = 176 shaded to show P(X > 176)]
P(X > 176) = 0.274 (3SF) (from GDC)
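The three GDC values in this example can be reproduced in software. The sketch below is an illustration only and assumes Python's standard-library `statistics.NormalDist` as a stand-in for the calculator; note that, like many calculators, it takes the standard deviation (10) rather than the variance (100).

```python
from statistics import NormalDist

# NormalDist takes the standard deviation sigma, not the variance sigma^2.
heights = NormalDist(mu=170, sigma=10)

p_a = heights.cdf(165)                     # (a) P(X < 165)
p_b = heights.cdf(190) - heights.cdf(180)  # (b) P(180 < X < 190)
p_c = 1 - heights.cdf(176)                 # (c) P(X > 176)
print(round(p_a, 3), round(p_b, 3), round(p_c, 3))
```

Each probability is built from the cumulative probability P(X ≤ x), which is the only quantity `cdf` returns; parts (b) and (c) are obtained by subtraction.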

If a normally distributed random variable has mean 120, should a value of 150 be considered unusually large? The answer depends on how spread out the variable is, which is measured by its standard deviation. If the standard deviation is 30 then a value around 150 will be quite common; however, if the standard deviation were 5 then 150 would be very unusual.

In the real world, there is always a possibility that a result may have occurred by random chance. Supplementary sheet 12 ‘Significant discoveries’ explores how unlikely a result has to be before we accept it was not a fluke, which is often stated in terms of the z-score.

It turns out that the probability of a normally distributed random variable being less than a given value (P(X ≤ x), called the cumulative probability) depends only on the number of standard deviations x is from the mean. This is called the Z-score.

KEY POINT 24.5

For X ∼ N(μ, σ²), the Z-score measures the number of standard deviations of x above the mean.

$$z = \frac{x - \mu}{\sigma}$$

(a negative Z-score means x is below the mean)

Worked example 24.5

Given that X ∼ N(15, 6.25):
(a) How many standard deviations is x = 16.1 away from the mean?
(b) Find the value of X which is 1.2 standard deviations below the mean.

(a) The number of standard deviations away from the mean is measured by the Z-score:

$$z = \frac{x - \mu}{\sigma}$$

6.25 is the variance, so

$$\sigma = \sqrt{6.25} = 2.5$$

$$\therefore z = \frac{16.1 - 15}{2.5} = 0.44$$

16.1 is 0.44 standard deviations away from the mean.

(b) Values below the mean have a negative Z-score:

$$z = -1.2$$

$$-1.2 = \frac{x - 15}{2.5} \Rightarrow x - 15 = -3 \Rightarrow x = 12$$
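The Z-score formula and its rearrangement are simple enough to express as two small functions. This is a sketch with illustrative names of our own (`z_score`, `value_from_z`), taking the mean as 15 and the variance as 6.25 as in the working above.

```python
import math

def z_score(x, mu, variance):
    """Number of standard deviations that x lies above the mean (negative if below)."""
    return (x - mu) / math.sqrt(variance)

def value_from_z(z, mu, variance):
    """Rearranged formula: the x value that has the given Z-score."""
    return mu + z * math.sqrt(variance)

z = z_score(16.1, 15, 6.25)       # part (a)
x = value_from_z(-1.2, 15, 6.25)  # part (b)
print(round(z, 2), round(x, 2))
```

Both functions take the variance and convert it to a standard deviation internally, mirroring the first step of the worked solution.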

If we are given a random variable X ∼ N(μ, σ²) we can create a new random variable Z which takes the values equal to the Z-scores of the values of X. In other words, for each x there is a corresponding $z = \frac{x - \mu}{\sigma}$. This is called the standardised value.

It turns out that, whatever the original mean and standard deviation of X, this new random variable always has a normal distribution with mean 0 and variance 1, called the standard normal distribution: Z ∼ N(0, 1). This is an extremely important property of the normal distribution which needs to be used in situations when the mean and standard deviation of X are not known (see next section).

Before graphical calculators existed (which wasn’t so long ago!) people used tables showing cumulative probabilities of the standard normal distribution. Because of their importance they were given special notation: Φ(z) = P(Z ≤ z). Although you do not have to use this notation, you should understand what it means.

KEY POINT 24.6

The probabilities of X and Z are related by

$$P(X \le x) = P\left(Z \le \frac{x - \mu}{\sigma}\right)$$

Worked example 24.6

Let X ∼ N(6, 0.5²). Write the following in terms of probabilities of Z:
(a) P(X ≤ 6.1)
(b) P(5 < X < 7)
(c) P(X > 6.5)

(a) We are given that x = 6.1 so we can calculate z:

$$P(X \le 6.1) = P\left(Z \le \frac{6.1 - 6}{0.5}\right) = P(Z \le 0.2)$$

(b) The relationship between X and Z above is stated for probabilities of the form P(X ≤ k), so convert to that form first:

$$P(5 < X < 7) = P(X < 7) - P(X < 5) = P\left(Z < \frac{7 - 6}{0.5}\right) - P\left(Z < \frac{5 - 6}{0.5}\right) = P(Z < 2) - P(Z < -2) = P(-2 < Z < 2)$$

(c)

$$P(X > 6.5) = 1 - P(X \le 6.5) = 1 - P\left(Z \le \frac{6.5 - 6}{0.5}\right) = 1 - P(Z \le 1) = P(Z > 1)$$

You can see from the examples above that you don’t actually
have to convert probabilities into the form P(X ≤ k) every time;
simply replace the x values by the corresponding z scores.
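A quick numerical check can make this standardisation more convincing. The sketch below assumes Python's `statistics.NormalDist` and uses the mean 6 and standard deviation 0.5 that appear in the working of Worked example 24.6; the helper name `standardise` is our own.

```python
from statistics import NormalDist

X = NormalDist(mu=6, sigma=0.5)
Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def standardise(x, dist):
    """Convert an x value into its Z-score for the given normal distribution."""
    return (x - dist.mean) / dist.stdev

# Each pair holds the same probability computed from X directly and via Z.
pairs = [
    (X.cdf(6.1), Z.cdf(standardise(6.1, X))),                               # P(X <= 6.1)
    (X.cdf(7) - X.cdf(5), Z.cdf(standardise(7, X)) - Z.cdf(standardise(5, X))),  # P(5 < X < 7)
    (1 - X.cdf(6.5), 1 - Z.cdf(standardise(6.5, X))),                       # P(X > 6.5)
]
for via_x, via_z in pairs:
    print(round(via_x, 4), round(via_z, 4))
```

In every row the two numbers should coincide, which is exactly the statement of Key point 24.6.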

Exercise 24C
1. Find the following probabilities:
(a) If X ~ N (20, ),
(i) P ( X ≤ 32) (ii) P ( X < 12)
(b) If Y ~ N ( 4.8,1.44 ),
(i) P (Y > 5.1) (ii) P( .4)
(c) If R ~ N (17, 2)
(i) P (16 R 20) (ii) P( .4 R 18.. )
(d) If Q has a normal distribution with mean 12 and
standard deviation 3:
(i) P (Q > 9.4 ) (ii) P (Q < 14 )
(e) If F has a normal distribution with mean 100 and
standard deviation 25:
(i) P ( F − 100 < 15 ) (ii) P ( F − 100 > 10 )

2. Find the Z-score corresponding to the given value of X:
(a) (i) X ∼ N(12, 2²), x = 13 (ii) X ∼ N(38, 7²), x = 45
(b) (i) X ∼ N(20, 9), x = 15 (ii) X ∼ N(162, 25), x = 160

3. Given that X ∼ N(16, 2.5²), write the following in terms of probabilities of the standard normal variable:
(a) (i) P(X < 20) (ii) P(X < 19.2)
(b) (i) P(X ≥ 14.3) (ii) P(X ≥ 8.6)
(c) (i) P(12.5 < X < 16.5) (ii) P( .1 ≤ X ≤ 15.. )


4. It is found that the lifespan of a certain brand of laptop
batteries follows a normal distribution with mean 16 hours
and standard deviation 5 hours. A particular battery has a
lifespan of 10.2 hours.
(a) How many standard deviations below the mean is this?
(b) What is the probability that a randomly chosen laptop
battery has a lifespan shorter than this? [6 marks]

5. When Ali competes in long-jump competitions, the


lengths of his jumps are normally distributed with mean
5.2 m and standard deviation 0.7 m.
(a) What is the probability that Ali will record a jump
between 5 m and 5.5 m?
(b) Ali needs to jump 6 m to qualify for the school team.
(i) What is the probability that he will qualify with a
single jump?
(ii) If he is allowed three jumps, what is the probability
that he will qualify for the school team? [7 marks]

6. Weights of a species of cat have a normal distribution with mean 16 kg and variance 16 kg². In a sample of 2000 such cats, estimate the number which will have a weight above 13 kg. [6 marks]

You saw in chapter 22 that ∩ means intersection.

7. If D ~ (250, ), find:
(a) P ( D 265 D < 280 )
(b) P ( D D< )
(c) P( ) [6 marks]

8. If Q ~ (4, ), find:
(a) P( 5 < )
(b) P ( Q Q ) [6 marks]

9. The weights of apples are normally distributed with mean


weight 150 g and standard deviation 25 g. Supermarkets
classify apples as ‘medium’ if they are between 120 g and 170 g.
(a) What proportion of apples are medium?
(b) In a bag of 10 apples what is the probability that there
are at least 8 medium apples? [6 marks]

10. The wingspans of a species of pigeon are normally


distributed with mean length 60 cm and standard
deviation 6 cm. A pigeon is chosen at random.



(a) Find the probability that its wingspan is greater than 50 cm.
(b) Given that its length is greater than 50 cm, find the
probability that a wingspan is greater than 55 cm. [6 marks]

11. Grains of sand are believed to have a normal distribution
with mean 2 mm and variance 0.25 mm².
(a) Find the probability that a randomly chosen grain of
sand is larger than 1.5 mm.
(b) The sand is passed through a filter which blocks grains
wider than 2.5 mm. The sand that passes through
the filter is examined. What is the probability that a
randomly chosen grain of filtered sand is larger than
1.5 mm? [6 marks]

12. The amount of paracetamol per tablet is believed to be


normally distributed with mean 500 mg and standard
deviation 160 mg. A dose of less than 300 mg is ineffective
in dealing with toothache. In a trial of 20 people treated
for toothache with a single tablet, what is the probability
that 2 or more of them have less than the effective dose?
[6 marks]

13. A variable has a normal distribution with a mean that is


7 times its standard deviation. What is the probability of
the variable taking a value less than 5 times the standard
deviation? [6 marks]

14. If ~ (μ, σ ) and P ( X x ) = k find P( μ ) in


terms of k. [5 marks]

24D Inverse normal distribution

In Section C we saw how to find probabilities when we knew information about the variable. In real life it is often useful to work backwards from probabilities to estimate information about the data. This requires the inverse normal distribution.

KEY POINT 24.7

For a given value of probability p the inverse normal distribution gives the value of x such that P(X ≤ x) = p.

exam hint
Remember that p must be the cumulative probability.

exam hint
Some calculators can find the value of x such that P(X ≥ x) = p, as well as P(X ≤ x) = p.


You will need to use your GDC to work out the inverse normal distribution (see Calculator skills sheet 14, the section on ‘Finding the boundary’ on the CD-ROM). To work out P(X > x) you might need to do 1 − P(X ≤ x). Note that many textbooks use the Φ(z) notation mentioned in the previous section to write inverse normal distribution: if P(X ≤ x) = p, then

$$\Phi^{-1}(p) = z = \frac{x - \mu}{\sigma}$$

Worked example 24.7

The size of men’s feet is thought to be normally distributed with mean 22 cm and variance 25 cm². A shoe manufacturer wants only 5% of men to be unable to find shoes large enough for them. How big should their largest shoe be?

Convert question into mathematical terms.
If X is the crv ‘length of a man’s foot’ then X ~ N(22, 25).
We want to find the value of x such that P(X > x) = 0.05.

Use inverse normal distribution. We may have to convert into a probability of the form P(X ≤ x).
P(X ≤ x) = 1 − P(X > x) = 0.95
⇒ x = 30.2 cm (from GDC)
So their largest shoe must fit a foot 30.2 cm long.
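The inverse normal calculation in this example can be sketched in code as well. This assumes Python's `statistics.NormalDist`, whose `inv_cdf` method plays the role of the GDC's inverse normal function; as the exam hint warns, the probability passed in must be the cumulative one, P(X ≤ x).

```python
import math
from statistics import NormalDist

# The question gives the variance (25 cm^2); NormalDist needs the standard deviation.
feet = NormalDist(mu=22, sigma=math.sqrt(25))

# P(X > x) = 0.05 is converted to the cumulative form P(X <= x) = 0.95 first.
largest = feet.inv_cdf(1 - 0.05)
print(round(largest, 1))
```

Passing 0.05 directly to `inv_cdf` would instead give the foot length that only 5% of men fall below, which is the mistake the conversion step guards against.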

One of the main applications of statistics is to determine parameters of the population given information about the data. But how can we use the normal distribution calculations if the mean or the standard deviation is unknown? This is where the standard normal distribution comes in useful; we can replace all the X values by their Z-scores, as they follow a known distribution, N(0, 1).

exam hint
This will involve solving equations, and sometimes simultaneous equations. As the numbers are usually not ‘nice’ you may want to use your calculator.

Worked example 24.8

The masses of gerbils are thought to be normally distributed. If 30% of gerbils have a mass of more than 65 g and 20% have a mass of less than 40 g, estimate the mean and the variance of the mass of a gerbil.

Convert the information into mathematical terms.
If X is the crv ‘mass of a gerbil’ then X ~ N(μ, σ²)
P(X > 65) = 0.3
P(X < 40) = 0.2    (1)

If you need all the probabilities to be in the form P(X ≤ k), convert the first one.
P(X ≤ 65) = 0.7    (2)

Use inverse normal distribution for Z (Z ~ N(0, 1)) and relate it to the given X values.
from (1): $P(Z \le z) = 0.2 \Rightarrow z = \frac{40 - \mu}{\sigma} = -0.842$
from (2): $P(Z \le z) = 0.7 \Rightarrow z = \frac{65 - \mu}{\sigma} = 0.524$
(from GDC)

Solve simultaneous equations.
40 − μ = −0.842σ    (3)
65 − μ = 0.524σ    (4)
(4) − (3): 25 = 1.366σ
⇒ σ = 18.3 g
∴ μ = 55.4 g
so the variance is σ² ≈ 335 g².
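The same elimination can be carried out in code, which is convenient when the Z-scores are not ‘nice’. The following is a sketch under our own assumptions (Python's `statistics.NormalDist` for the inverse normal values, variable names chosen for illustration).

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal N(0, 1)

z_low = Z.inv_cdf(0.2)   # from (1): P(X < 40) = 0.2
z_high = Z.inv_cdf(0.7)  # from (2): P(X <= 65) = 0.7

# (3): 40 - mu = z_low * sigma and (4): 65 - mu = z_high * sigma.
# Subtracting (3) from (4) eliminates mu.
sigma = (65 - 40) / (z_high - z_low)
mu = 40 - z_low * sigma
variance = sigma**2
print(round(mu, 1), round(sigma, 1), round(variance))
```

Because the code keeps more decimal places of the Z-scores than the three shown in the working, the unrounded results may differ slightly in later digits from values built on −0.842 and 0.524.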

Exercise 24D
1. (a) If X ~ N(14, 49), find x if:
(i) P ( X x ) = 0 .8 (ii) P ( X x ) = 0.46
(b) If X ~ N (36.5,10) , find x if:
(i) P ( X x ) = 0 .9 (ii) P ( X x ) = 0 .4
(c) If X ~ N ( 0,12 ) , find x if:
(i) P ( X < ) (ii) P ( X < 0.8 )

2. (a) If X ~ (μ, ) , find μ if


(i) P ( X > 4 ) = 0.8 (ii) P ( X > 9) = 0.2
(b) If X ~ (8, ) find σ if
(i) P ( X ≤ 19) = 0.6 (ii) P ( X ≤ 0) = 0.3

3. If X ~ N(μ, σ²), find μ and σ if:


(a) (i) P ( X > 7 ) = 0.8 and P ( X < 6) = 0.1
(ii) P ( X > 150) = 0.3 and P ( X < 120) = 0.4
(b) (i) P ( X > 0.1) = 0.4 and P ( X ≥ 0.6) = 0.25
(ii) P ( X > 700) = 0.8 and P ( X ≥ 400) = 0.99



4. IQ tests are designed to have a mean of 100 and a standard
deviation of 20. What IQ score is needed to be in the top
2% of IQ scores? [5 marks]

5. Rabbits’ masses are normally distributed with an average
mass of 2.6 kg and a variance of 1.44 kg². A vet decides that
the top 20% of rabbits are obese. What is the minimum
mass for an obese rabbit? [5 marks]

6. A manufacturer knows that his machines produce bolts


whose diameters follow a normal distribution with
standard deviation 0.02 cm. He takes a random sample of
bolts and finds that 6% of them have diameter greater than
2 cm. Find the mean diameter of the bolts. [6 marks]

7. (a) 30% of sand from Playa Gauss falls through a sieve
with gaps of 1 mm, but 90% passes through a sieve
with gaps of 2 mm. Assuming that a grain of sand’s
diameter is normally distributed, estimate the mean
and standard deviation of the sand grains.
(b) 80% of sand from Playa Fermat falls through a sieve with
gaps of 2 mm. 40% of this filtered sand passes through a
sieve with gaps of 1 mm. Assuming that a grain of sand’s
diameter is normally distributed, estimate the mean and
standard deviation of the sand grains. [7 marks]

8. The actual voltage of a brand of 9 V battery is thought
to be normally distributed with standard deviation
0.8 V and mean (9.2 – t) V where t is the time in hours
that the battery has been used. When a battery’s voltage
drops below 7 V it can no longer power a lamp. A batch
of batteries is found and only 10% can power the lamp.
Assuming that the model is correct and that they were all
used for the same amount of time, estimate for how long
the batteries have been used. [7 marks]

9. The times taken for students to complete a test are
normally distributed with a mean of 32 minutes and
standard deviation of 6 minutes.
(a) Find the probability that a randomly chosen student
completes the test in less than 35 minutes.
(b) 90% of students complete the test in less than
t minutes. Find the value of t.
(c) A random sample of 8 students had their time for the
test recorded. Find the probability that exactly 2 of
these students completed the test in less than
30 minutes. [7 marks]

10. An old textbook says that the range of data can be estimated
as 6 times the standard deviation. If the data is normally
distributed what percentage of the data is within this range?
[6 marks]

11. A scientist noticed that 36% of temperature measurements
were at least 4 °C lower than the mean. Assuming that the
measurements follow a normal distribution, estimate the
standard deviation. [5 marks]

12. For a normal distribution find the ratios:
(a) $\frac{\text{median}}{\text{mean}}$
(b) $\frac{\text{standard deviation}}{\text{inter-quartile range}}$ [6 marks]

13. Evaluate $\Phi^{-1}(x) + \Phi^{-1}(1 - x)$. [3 marks]

14. A company makes a large number of steel links for chains.


They know that the force required to break any individual
link is modelled by a normal distribution with mean
20 kN. The company tests chains consisting of 4 links. If
any link breaks, the chain will break. A force of 18 kN is
applied to all of the chains and 30% break.
(a) Estimate the probability of a single link breaking.
(b) Hence estimate the standard deviation in the breaking
strength of the links. [6 marks]

15. Most calculators have a random number generator which


generates random numbers distributed uniformly from
0 to 1. How can you use these to form random numbers
that could be drawn from a normal distribution? [4 marks]

Summary

• Because we group continuous data, the probability of a continuous random variable (crv) is discussed in terms of the probability of it being in a given range. To do this we integrate a probability density function such that the area under the curve f(x) represents the probability. The probability of the crv falling between values a and b is:

$$P(a < x < b) = \int_a^b f(x)\,dx$$

• For a function to be a probability density function, it must satisfy the following requirements:

$$\int_{-\infty}^{\infty} f(x)\,dx = 1, \text{ where } f(x) \ge 0$$

• For continuous random variables, the formulae for expectation and variance require integration:

$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx; \quad E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx; \quad \mathrm{Var}(X) = E(X^2) - [E(X)]^2$$

• The median, m, of a continuous variable satisfies $\int_{-\infty}^{m} f(x)\,dx = \frac{1}{2}$, and the mode is the value of x at the maximum value of f(x).

• One very important continuous distribution is the normal distribution: X ~ N(μ, σ²), where μ = mean and σ² = variance. Calculators can provide approximate probabilities of being in any given range.

• For X ∼ N(μ, σ²), the Z-score (z) measures the number of standard deviations from the mean that a value (x) is: $z = \frac{x - \mu}{\sigma}$.

• Given a random variable X ∼ N(μ, σ²), Z is a new random variable that takes the values equal to the Z-scores of x, such that for every x there is a corresponding z. This is the standardised value, which always has a normal distribution Z ∼ N(0, 1), called the standard normal distribution.

• The probabilities of X and Z are related:

$$P(X \le x) = P\left(Z \le \frac{x - \mu}{\sigma}\right)$$

• If we know probabilities relating to a variable with a normal distribution we can deduce information about the data using the inverse normal distribution: for a given value of probability, p, the inverse normal distribution gives us the value of x such that P(X ≤ x) = p. To find x, use your GDC.

• If you need to calculate the μ and σ² of a normal distribution, you can use the standard normal distribution to replace values of X with their Z-scores as these follow the known distribution of Z ∼ N(0, 1).

Introductory problem revisited

The height of many trees in a forest is measured and they have a mean of 7 m and a
standard deviation of 1.5 m. Estimate the proportion of trees above 10 m tall.

If we make the reasonable assumption that heights of trees are normally distributed, this
problem is asking what is the probability of being more than 2 standard deviations above
the mean, since (10 − 7) ÷ 1.5 = 2. This is 1 − Φ(2) = 2.3%.
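For readers who want to confirm the figure without tables or a GDC, here is a minimal sketch, assuming Python's `statistics.NormalDist` and the same normality assumption made above.

```python
from statistics import NormalDist

# Tree heights modelled as normal with mean 7 m and standard deviation 1.5 m.
trees = NormalDist(mu=7, sigma=1.5)

z = (10 - 7) / 1.5               # Z-score of a 10 m tree
proportion = 1 - trees.cdf(10)   # P(X > 10) = 1 - P(X <= 10)
print(z, f"{proportion:.1%}")
```

The result depends entirely on the assumed model; if tree heights are not close to normally distributed, the estimate would not be reliable.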



Mixed examination practice 24

Short questions
⎧ k 2x 0 < x <1
1. If X is a continuous random variable with pdf f ( x ) = ⎨
⎩0 otherwise
(a) Find the value of k.
(b) Find the variance of X. [6 marks]
2. The test scores of a group of students are normally distributed with mean 62
and variance 144.
(a) Find the percentage of students with scores above 80.
(b) What is the lowest score achieved by the top 50% of the students? [6 marks]
3. 200 people are asked to estimate the size of an angle. 16 give an estimate
which was less than 25° and 42 give an estimate which was more than 35°.
Assuming that the data follows a normal distribution, estimate the mean
and the standard deviation of the results. [6 marks]
4. If X is a continuous random variable with pdf
⎧ax + b 1 x < 5
f (x) = ⎨
⎩0 otherwise
and E(X) = 3.5, find the exact values of the constants a and b. [5 marks]
5. The adult female of a breed of dog has average height 0.7 m with variance
0.05 m². If the height follows a normal distribution find the probability that in
six independently selected dogs of this breed exactly four are above 0.75 m tall.
[5 marks]
6. If Z ~ N(0, 1), prove that for positive k:
P(|Z| > k) = 2 − 2Φ(k) [5 marks]

Long questions
1. A continuous random variable X has the probability density function

$$f(x) = \begin{cases} ax^2(5 - x) & 0 < x < 5 \\ 0 & \text{otherwise} \end{cases}$$

(a) Find the value of the constant a.
(b) Evaluate the mean and the standard deviation of X.
(c) Find the probability that X > 4.
(d) Find the standard deviation of a normal distribution which has the
same mean as X and the same probability that X > 4. [12 marks]
2. The continuous random variable X has probability density function f(x) where:

$$f(x) = \begin{cases} e - ke^{kx} & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$

(a) Show that k = 1.
(b) What is the probability that the random variable X has a value that lies
between $\frac{1}{4}$ and $\frac{1}{2}$? Give your answer in terms of e.
(c) Find the mean and variance of the distribution. Give your answers
exactly, in terms of e.
The random variable X above represents the lifetime, in years, of a certain type
of battery.
(d) Find the probability that a battery lasts more than six months.
A calculator is fitted with three of these batteries. Each battery fails independently
of the other two. Find the probability that at the end of six months:
(e) none of the batteries has failed
(f) exactly one of the batteries has failed. [16 marks]
(© IB Organization 1999)
3. A businessman spends X hours on the telephone during the day. The
probability density function of X is given by:

$$f(x) = \begin{cases} \frac{1}{12}(8x - x^3) & 0 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$$

(a) (i) Write down an integral whose value is E(X).
(ii) Hence evaluate E(X).
(b) (i) Show that the median, m, of X satisfies the equation
$m^4 - 16m^2 + 24 = 0$.
(ii) Hence evaluate m.
(c) Evaluate the mode of X. [11 marks]
4. The monthly salary in Argentina is modelled by a normal distribution with an
average of 1500 Pesos. 30% of people earn more than 2000 Pesos per month.
(a) Use this information to estimate the standard deviation of the monthly
salary in Argentina.
(b) Find the probability that a randomly chosen individual earns more
than 3000 Pesos per month.
(c) Given that someone earns more than 2000 Pesos per month find the
probability that they earn more than 3000 Pesos per month.
(d) Find the probability that in three randomly chosen people at least two
earn less than 2000 Pesos per month.
(e) Suggest a reason why the normal distribution may not be an
appropriate model for monthly salaries. [15 marks]
