PROBABILITY DISTRIBUTIONS
OBJECTIVES OF THIS UNIT
At the end of this unit, you should be able to:
- Distinguish between frequency and probability distributions
- Explain the Normal, Poisson and Binomial distributions
- Compute the probabilities in Poisson and Binomial probability distributions.
Biostatistics Unit Three: Probability Distribution. For: L3 & M1 classes
Index of Unit Three
3.1. INTRODUCTION
3.2. MAIN CONTENTS
3.2.1. PROBABILITY DISTRIBUTION
3.2.2. THE NORMAL DISTRIBUTION
3.2.3. STANDARDIZING THE NORMAL CURVE
3.2.4. POISSON DISTRIBUTION
3.2.5. BINOMIAL DISTRIBUTION
3.3. TUTOR MARKED ASSIGNMENT
3.4. REFERENCES
Prepared by: Dr. Zoubida Belli, Created by: Dr. Iliya S. Ndams, Ahmadu Bello University, Zaria, Nigeria
3.1. INTRODUCTION
Probability is a branch of mathematics which as a general concept can be defined as the
chance of an event occurring. It is the basis of inferential statistics. This unit looks at three
particular distributions, the Normal, the Poisson and the Binomial, all of which are important
in sampling theory.
3.2. MAIN CONTENTS
3.2.1. PROBABILITY DISTRIBUTION
A distribution is a scatter of related values, such as the assortment of weights in a group of
cattle. A frequency distribution shows us how many times given values in a range of values occur.
A probability distribution is very similar because it shows us how probable given random
variable values in a range of such values are.
Example: if we toss two coins, we can obtain 0, 1 or 2 “heads”. If we prepare a table showing
the probabilities of all the random variable values, we will have the probability distribution as
shown below.
Number of heads | Sequential events | Probability
0               | TT                | 0.5 x 0.5 = 0.25
1               | HT or TH          | (0.5 x 0.5) + (0.5 x 0.5) = 0.50
2               | HH                | 0.5 x 0.5 = 0.25
Total           |                   | 1.00
Therefore, a probability distribution is simply a complete listing of all possible outcomes of
an experiment, together with their probabilities.
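The two-coin table can be reproduced by listing every outcome and counting heads. The following is a minimal Python sketch, assuming a fair coin (probability 0.5 per face) and independent tosses; the function name `heads_distribution` is illustrative and not part of the unit.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_distribution(n_tosses):
    """Return {number of heads: probability} for n_tosses fair, independent coins."""
    outcomes = list(product("HT", repeat=n_tosses))  # e.g. ('H', 'T')
    counts = Counter(outcome.count("H") for outcome in outcomes)
    total = len(outcomes)  # every sequence is equally likely for a fair coin
    return {heads: Fraction(count, total) for heads, count in sorted(counts.items())}


distribution = heads_distribution(2)
for heads, prob in distribution.items():
    print(heads, float(prob))

# The probabilities in a complete listing of outcomes add up to 1.
print("Total", float(sum(distribution.values())))
```

Exact fractions are used so that the total is exactly 1 and not subject to floating-point rounding.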
3.2.2. THE NORMAL DISTRIBUTION
This is the most important distribution in statistics. It is also known as the Gaussian
distribution, named after Gauss, a German mathematician and astronomer who showed its use in
statistics. The normal distribution is defined by just two statistics, the mean and the standard
deviation. Normal distribution is concerned with results obtained by taking measurements
on continuous random variables (i.e. the quantified value of a random event) like weight, yield
etc. The normal distribution is a particular pattern of variation of numbers around the mean.
It is symmetrical (hence we express the standard deviation as ±) and the frequency of individual
numbers falls off equally away from the mean in both directions. It so happens that the curve given
by this probability distribution approximates very closely to a mathematical curve. This
curve is called the Normal curve.
Figure 3.1: Normal curve with standard deviations (axes: Y-axis, X-axis)
Example: In terms of human height, progressively larger and smaller people than the average
occur symmetrically, with decreasing frequency towards giants and dwarfs respectively.
> Properties of a Normal Curve
- It is a unimodal symmetrical curve
- The mean, mode and median all coincide, thereby dividing the curve into two equal parts
- Most items on the curve are clustered around the mean
- No excess kurtosis or skewness in the curve.
Figure 3.2: Normal curve with Skewness
(Panels: Symmetrical Distribution; Negative Skew. Each panel marks the mean, median and mode.)
Example: Let us take a very common example of house prices. Suppose we have house values
ranging from $100,000 to $1,000,000 with the average being $500,000.
If the peak of the distribution were to the left of the average value, the distribution would
have a positive skew. It would mean that many houses were being sold for less than the average
value, i.e. $500,000.
If the peak of the distributed data were to the right of the average value, that would mean a
negative skew. This would mean that many houses were being sold for more than the average value.
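The link between the position of the peak and the sign of the skew can be checked numerically. The sketch below uses the moment coefficient of skewness (third central moment divided by the second central moment to the power 1.5), which is one common definition and is not given in the unit; the price list is made up purely for illustration.

```python
from statistics import mean, median


def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population form)."""
    centre = mean(values)
    n = len(values)
    m2 = sum((v - centre) ** 2 for v in values) / n
    m3 = sum((v - centre) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical house prices in thousands of dollars: many cheap, few expensive.
prices = [100, 150, 200, 200, 250, 300, 350, 400, 900, 1000]
# Mirror image of the same data: many expensive houses, few cheap ones.
mirrored = [1100 - p for p in prices]

# Positive skew: the long tail is on the right and the mean exceeds the median.
print(skewness(prices) > 0, mean(prices) > median(prices))
# Negative skew: the long tail is on the left and the mean is below the median.
print(skewness(mirrored) < 0, mean(mirrored) < median(mirrored))
```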
>What is Kurtosis?
Kurtosis is a statistical measure used to describe the degree to which scores cluster in the tails
or the peak of a frequency distribution. The peak is the tallest part of the distribution, and
the tails are the ends of the distribution. There are three types of kurtosis:
- Mesokurtic: Distributions that are moderate in breadth and curves with a medium peaked height.
- Leptokurtic: More values in the distribution tails and more values close to the mean
(i.e. sharply peaked with heavy tails).
- Platykurtic: Fewer values in the tails and fewer values close to the mean (i.e.
the curve has a flat peak and has more dispersed scores with lighter tails).
(Figure: General Forms of Kurtosis: Leptokurtic, Mesokurtic (Normal), Platykurtic)
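The three types can be told apart numerically with the excess kurtosis, which is roughly zero for a mesokurtic (normal) shape, positive for a leptokurtic one and negative for a platykurtic one. This is a minimal sketch, assuming the moment definition m4 / m2² − 3, which the unit does not state; the three data sets are invented for illustration.

```python
from statistics import NormalDist, fmean


def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3 (0 for a normal curve)."""
    centre = fmean(values)
    n = len(values)
    m2 = sum((v - centre) ** 2 for v in values) / n
    m4 = sum((v - centre) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3


# Illustrative data sets (assumptions, not taken from the unit):
# evenly spaced normal quantiles stand in for a mesokurtic sample,
normal_like = [NormalDist().inv_cdf((i + 0.5) / 10000) for i in range(10000)]
# a flat spread of values is platykurtic,
flat = list(range(101))
# and a sharp peak with a few extreme values is leptokurtic.
peaked = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]

for name, data in [("normal-like", normal_like), ("flat", flat), ("peaked", peaked)]:
    print(name, round(excess_kurtosis(data), 2))
```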
> Important Aspect or characteristics of the curve
The important aspect of the curve is the area in relationship with probability. If
perpendiculars are erected at a distance of 1σ from the mean and in both directions, the area
covered by these perpendiculars and the curve will be about 68.26% of the total area. (It means
that 68.26% of all the frequencies are found within one standard deviation of the mean). The total
probability encompassed by the area under the curve is 1 (100%).
Figure 3.2: Normal curve (vertical axis from 0.00 to 0.40; horizontal axis from μ-3σ to μ+3σ)
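The area within one standard deviation of the mean can be checked with the error function from Python's standard library. This is a minimal sketch; the helper name `area_within` is illustrative. It relies on the standard identity that, for any normal curve, the area between μ − kσ and μ + kσ equals erf(k/√2).

```python
import math


def area_within(k):
    """Area under a normal curve within k standard deviations of the mean."""
    # For any normal curve, P(mu - k*sigma < X < mu + k*sigma) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

For k = 1 this gives about 0.6827, in line with the roughly 68.26% quoted above; the same function gives about 0.9545 and 0.9973 for two and three standard deviations.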
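Normal distribution is defined by:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^{2}/(2\sigma^{2})}, \quad \text{where } -\infty < x < \infty$$
Usually, we do not know the values of μ (population mean) and σ (population standard
deviation) ...
3.2.3. STANDARDIZING THE NORMAL CURVE
Example: the height X of French men, in centimeters, standardized with mean 172 and variance 196.
The probability that a man is taller than 200 centimeters:
$$P[X > 200] = P\left[\frac{X - 172}{\sqrt{196}} > \frac{200 - 172}{\sqrt{196}}\right] = 1 - F(2) = 0.02275$$
The proportion of French men who are between 165 and 185 centimeters tall:
$$P[165 < X < 185] = P\left[\frac{165 - 172}{\sqrt{196}} < \frac{X - 172}{\sqrt{196}} < \frac{185 - 172}{\sqrt{196}}\right]$$
4. The height that would be the 9000-th:
... = 1 - F(-0.5423) = 0.7062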
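The standardization used in the French men example (mean 172, variance 196, so standard deviation √196 = 14) can be sketched in Python. The function `phi` below plays the role of F, the standard normal distribution function normally read from a table; its name and the use of `math.erf` are choices made here, not part of the unit.

```python
import math


def phi(z):
    """Standard normal distribution function F(z), computed from erf."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def standardize(x, mu, sigma):
    """Convert a raw value x into a z-score: z = (x - mu) / sigma."""
    return (x - mu) / sigma


mu = 172
sigma = math.sqrt(196)  # variance 196 gives a standard deviation of 14

# P[X > 200] = 1 - F(2)
p_taller_than_200 = 1 - phi(standardize(200, mu, sigma))
# P[165 < X < 185] = F((185 - 172)/14) - F((165 - 172)/14)
p_between_165_185 = phi(standardize(185, mu, sigma)) - phi(standardize(165, mu, sigma))

print(round(p_taller_than_200, 5))
print(round(p_between_165_185, 4))
```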
3.2.4. POISSON DISTRIBUTION
A Poisson distribution is a discrete probability distribution that is useful when n is large and
p is small and when the independent variables occur over a period of time. It can be used when
a density of items is distributed over a given area or volume, such as the number of plants
growing per acre. It can also be used to discover whether organisms are randomly distributed.
Example: in ecological studies, the Poisson distribution is used to describe the spread of
organisms like insects, trees, and snails etc. by the following:
1. Divide the large area into small squares of equal size.
2. Count the particular animal or plant species under study in each square.
3. You can also randomly select a number of squares, if the area is too large.
The probability of x occurrences in an interval of time, volume, area etc. for a variable where
λ (lambda) is the mean number of occurrences per unit (time, volume, area, etc.) is given by:
$$P(x; \lambda) = \frac{e^{-\lambda}\,\lambda^{x}}{x!}, \quad \text{where } x = 0, 1, 2, 3, \ldots$$
and e = constant, approximately equal to 2.7183.
Example: In a study on the distribution of tree-roosting birds, if there are 200 birds randomly
distributed on 500 trees, find the probability that a given tree contains exactly three birds.
Solution: First, find the mean number λ of birds on each tree:
$$\lambda = \frac{200}{500} = 0.4$$
That is 0.4 birds per tree. Therefore λ = 0.4, while x = 3.
We can then substitute in the formula above:
$$P(3; 0.4) = \frac{(2.7183)^{-0.4}(0.4)^{3}}{3!} = 0.0072$$
Thus, there is less than 1% probability that any given tree will contain exactly three birds.
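The tree-roosting birds calculation can be reproduced with a few lines of Python. This is a minimal sketch of the Poisson formula; the function name `poisson_pmf` is illustrative.

```python
import math


def poisson_pmf(x, lam):
    """P(x; lam) = e**(-lam) * lam**x / x! for x = 0, 1, 2, ..."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


lam = 200 / 500  # mean number of birds per tree
p_three = poisson_pmf(3, lam)
print(round(p_three, 4))  # 0.0072

# Probabilities for 0 to 3 birds on a tree, for comparison.
for x in range(4):
    print(x, round(poisson_pmf(x, lam), 4))
```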
3.2.5. BINOMIAL DISTRIBUTION
A binomial experiment is a probability experiment that satisfies the following four
requirements:
1. Each trial can have only two outcomes or outcomes that can be reduced to two
outcomes, i.e. these outcomes can either be success or failure. No two events can occur
simultaneously.
2. There must be a fixed number of trials.
3. The outcomes of each trial must be independent of each other.
4. The probability of a success must remain the same for each trial.
A binomial distribution is a special probability distribution that describes the distribution of
probabilities when there are only two possible outcomes for each trial of an experiment.
Examples:
1. The answer to a multiple-choice question (even though there are four or five answer
choices) can be classified as correct or incorrect.
2. When tossing a coin, you get either a head or a tail.
3. In selecting individuals from a human population, you select either a male or a female, a
boy or a girl etc.
The binomial probability formula is given by:
$$P(x) = \frac{n!}{(n-x)!\,x!}\; p^{x} q^{n-x}$$
Where:
p = Numerical probability of a success = P(S)
q = Numerical probability of a failure = P(F)
n = Number of trials
x = The number of successes in n trials
! = Mathematical symbol called 'factorial'. So, n! means multiply all the numbers in a
countdown from the total number in the sample.
Example: 7! = 7 x 6 x 5 x 4 x 3 x 2 x 1 and 4! = 4 x 3 x 2 x 1
Example: A survey on birds showed that one out of five fire finches was trapped, using a mist
net, in a given season. If 10 birds are selected at random, find the probability that 3 of the birds
were trapped in the previous season.
Solution: n = 10, x = 3, p = 1/5 and q = 1 - p = 4/5
Substituting these values in the formula above, we have
$$P(3) = \frac{10!}{(10-3)!\,3!} \left(\frac{1}{5}\right)^{3} \left(\frac{4}{5}\right)^{7} \approx 0.201$$
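The fire finch calculation can be checked with a short Python sketch of the binomial formula, written with factorials so that it mirrors the formula term by term. It takes n = 10, x = 3, p = 1/5 and q = 1 - p = 4/5; the function name `binomial_probability` is illustrative.

```python
from math import factorial


def binomial_probability(x, n, p):
    """P(x) = n! / ((n - x)! x!) * p**x * q**(n - x), with q = 1 - p."""
    q = 1 - p
    coefficient = factorial(n) // (factorial(n - x) * factorial(x))
    return coefficient * p ** x * q ** (n - x)


p_three = binomial_probability(3, 10, 1 / 5)
print(round(p_three, 4))  # 0.2013
```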
The mean (μ), variance (σ²) and standard deviation (σ) of a variable that has the binomial
distribution can be found by using the formulas:
$$\mu = n \cdot p; \quad \sigma^{2} = n \cdot p \cdot q; \quad \sigma = \sqrt{n \cdot p \cdot q}$$
From our example above:
The mean: μ = 10 x 1/5 = 2
The variance: σ² = 10 x 1/5 x 4/5 = 1.6
The standard deviation: σ = √1.6 = 1.265
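As a cross-check, the shortcut formulas can be compared with the mean and variance obtained by summing over the whole binomial distribution. This is a sketch for the same example (n = 10, p = 1/5); the variable names are illustrative.

```python
import math

n, p = 10, 1 / 5
q = 1 - p

# Shortcut formulas for a binomial variable.
mean = n * p
variance = n * p * q
std_dev = math.sqrt(variance)

# The same quantities from the definition, summing over every possible
# number of successes x = 0, 1, ..., n.
pmf = [math.comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]
mean_by_sum = sum(x * prob for x, prob in enumerate(pmf))
variance_by_sum = sum((x - mean_by_sum) ** 2 * prob for x, prob in enumerate(pmf))

print(round(mean, 3), round(variance, 3), round(std_dev, 3))
print(round(mean_by_sum, 3), round(variance_by_sum, 3))
```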
Example: 40% of the snails bred by a certain farm are of the type reticulata. Determine
the probability that, out of 12 snails chosen at random, (a) 2 (b) at most 3 will be of the
type reticulata.
Solution: 0.4 is the probability of choosing a reticulata breed, and 0.6 is the probability of
choosing a non-reticulata breed. Here $\binom{n}{x}$ stands for $\frac{n!}{(n-x)!\,x!}$.
(a) P(2 reticulata breed of 12 snails) = $\binom{12}{2}(0.4)^{2}(0.6)^{10} \approx 0.0639$
(b) P(at most 3 reticulata snails breed) = P(0 reticulata) + P(1 reticulata) + P(2
reticulata) + P(3 reticulata)
$= \binom{12}{0}(0.4)^{0}(0.6)^{12} + \binom{12}{1}(0.4)^{1}(0.6)^{11} + \binom{12}{2}(0.4)^{2}(0.6)^{10} + \binom{12}{3}(0.4)^{3}(0.6)^{9} \approx 0.2253$
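An "at most" probability is a sum of binomial terms, which is easy to get wrong by hand. The snail example can be sketched in Python as follows, assuming n = 12 and p = 0.4 as stated in the example; the function name `binomial_pmf` is illustrative.

```python
import math


def binomial_pmf(x, n, p):
    """P(x) = n! / ((n - x)! x!) * p**x * q**(n - x), with q = 1 - p."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 12, 0.4

# (a) exactly 2 reticulata snails out of 12
p_exactly_two = binomial_pmf(2, n, p)
# (b) at most 3: add the probabilities for x = 0, 1, 2 and 3
p_at_most_three = sum(binomial_pmf(x, n, p) for x in range(0, 4))

print(round(p_exactly_two, 4), round(p_at_most_three, 4))
```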
3.3. TUTOR MARKED ASSIGNMENT
1) What is a Normal Curve?
2) Outline any five characteristics of a normal curve.
3) Construct a probability distribution for three patients given a headache relief tablet.
The probabilities for 0, 1, 2 or 3 successes are 0.18, 0.52, 0.21 and 0.09, respectively.
4) In a certain rabbit farm in the southern part of Nigeria, 25% of the rabbit breed are of
the type bauscat. Determine the probability that, out of 3 rabbits chosen at random,
(a) 1 (b) at most 4 will be of the type bauscat.
3.4. REFERENCES
1) Bailey, N.T.J. (1994). Statistical Methods in Biology. Third Edition. Cambridge
University Press, United Kingdom.
2) Bluman, A.G. (2004). Elementary Statistics: A Step by Step Approach. Fifth Edition.
McGraw-Hill Companies Incorporated, London.
3) Daniel, W.W. (1995). Biostatistics: A Foundation for Analysis in the Health Sciences. Sixth
Edition. John Wiley and Sons Incorporated, USA.
4) Fowler, J.A. and Cohen, L. Statistics for Ornithologists. British Trust for Ornithology
Guide 22.
5) Harper, W.M. (1991). Statistics. Sixth Edition. Pitman Publishing, Longman Group,
United Kingdom.
6) Hoel, P.G. (1976). Elementary Statistics. Fourth Edition. John Wiley and Sons
Incorporated, New York. Pp 151-204.
7) Mukhtar, F.B. (2003). An Introduction to Biostatistics. Samarib Publishers, Kano,
Nigeria. Pp 1-112.
8) Sanders, D.H., Murph, A.F. and Eng, R.J. (1980). Statistics: A Fresh Approach.
McGraw-Hill Kogakusha, Limited. Kosaido Printing Company Limited, Tokyo,
Japan.
NOTES:
EXERCISE 01 (CONTINUOUS RANDOM VARIABLES): In a population of 40000 persons, 2276
have between 0.8 and 0.84 milligrams of bilirubin per deciliter of blood, and 11508
have more than 0.84. Assume that the level of bilirubin in blood follows a normal
distribution model.
1. Calculate the mean and the standard deviation.
2. How many persons have more than 1 mg of bilirubin per dl of blood?
EXERCISE 02 (CONTINUOUS RANDOM VARIABLES): It is known that the blood pressure of
people in a population of 20000 persons follows a normal distribution model with
mean 13 mm Hg and interquartile range 4 mm Hg.
1. How many persons have a blood pressure above 16 mm Hg?
2. By how much must the blood pressure of a person with 16 mm Hg decrease in order
to be among the 40% of people with the lowest blood pressure?
EXERCISE 03 (CONTINUOUS RANDOM VARIABLES): Let Z be a random variable following a
standard normal distribution model. Calculate the following probabilities using the table
of the distribution function:
1. P(Z < 1.24)
2. P(Z > -0.68)
3. P(—1.35 < Z < 0.44)
EXERCISE 04 (CONTINUOUS RANDOM VARIABLES): Let X be a random variable following a
normal distribution model N(10,2).
1. Calculate P(X < 10).
2. Calculate P(8 < X < 14).
3. Calculate the interquartile range.
4. Calculate the third decile.
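Questions of this kind can be checked against the table with Python's standard library. The sketch below assumes that N(10,2) means a mean of 10 and a standard deviation of 2 (not a variance of 2); `statistics.NormalDist` provides the distribution function (`cdf`) and its inverse (`inv_cdf`).

```python
from statistics import NormalDist

# Assumption: N(10, 2) denotes mean 10 and standard deviation 2.
X = NormalDist(mu=10, sigma=2)

p_below_10 = X.cdf(10)                    # P(X < 10)
p_8_to_14 = X.cdf(14) - X.cdf(8)          # P(8 < X < 14)
iqr = X.inv_cdf(0.75) - X.inv_cdf(0.25)   # third quartile minus first quartile
third_decile = X.inv_cdf(0.30)            # value with 30% of the area below it

print(round(p_below_10, 4), round(p_8_to_14, 4))
print(round(iqr, 3), round(third_decile, 4))
```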
EXERCISE 05 (DISCRETE RANDOM VARIABLES): Let X be a discrete random variable with
the following probability distribution:
x    | 4    | 5    | 6    | 7    | 8
f(x) | 0.18 | 0.35 | 0.10 | 0.25 | 0.15
1. Calculate and represent graphically the distribution function.
2. Calculate the following probabilities: a. P(X < 7.5). b. P(X > 8). c. P(4 ≤ X < 6.5). d. P(5 ...
ANSWERS:
EXERCISE 01:
2. P(X > 1) = 0.1149 (about 11.5% of persons).
EXERCISE 02:
1. P(X > 16) = 0.1587 = 3174 persons.
EXERCISE 03:
2. P(Z > -0.68) = 0.7517.
3. P(-1.35 < Z < 0.44) = 0.5815.
EXERCISE 04:
1. P(X < 10) = 0.5
2. P(8 < X < 14) = 0.8186
3. IQR = 2.698
4. D3 = 8.9512
EXERCISE 05:
2. P(X < 7.5) = 0.88, P(X > 8) = 0, ...
2. P(X > 2) = 0.2639.
Let X be the number of females in a group of students, P(X < 2) = 0.17%
Let Y be the number of groups of students without males in a sample of groups, P(Y > 0) = 0.2125.