Lec 01
Prepared by-
Lakshmi Rani Kundu
Assistant Professor
Department of Public Health and Informatics
Jahangirnagar University
Learning Outcomes:
At the end of the session, students will be able to:
Use the probability distribution for a discrete random
variable to find the probability of events of interest.
Explain how a density function is used to find probabilities
involving continuous random variables.
Explain the similarities and differences between distributions
of the discrete type and the continuous type and when the
use of each is appropriate.
Random variable: A random variable assigns a unique
numerical value to the outcome of a random experiment.
Example: Consider the random experiment of flipping a coin
twice.
The sample space of possible outcomes is S = { HH, HT, TH, TT }.
Now, let’s define the variable X to be the number of
tails that the random experiment will produce.
If the outcome is HH, we have no tails, so the value for X is 0.
If the outcome is HT, we got one tail, so the value for X is 1.
If the outcome is TH, we again got one tail, so the value for X is 1.
Lastly, if the outcome is TT, we got two tails, so the value for X is 2.
X is a quantitative variable that takes the possible values of
0, 1, or 2.
It is random because we do not know which of the three values
the variable will eventually take.
Types of random variable:
➢ Discrete random variable: A random variable whose
possible values form a list of distinct values is called
a discrete random variable. For example, number of
heads, number of accidents in a day, number of patients
admitted to a hospital, etc.
➢ Continuous random variable: A random variable
that can take any value in an interval is called
a continuous random variable. For example, weight,
blood pressure, time to recovery, cholesterol level, etc.
Probability distribution: The list of all possible values that the
random variable can assume and their corresponding probabilities is
called the probability distribution.
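As a rough illustration, the probability distribution of X (the number of tails in two flips) can be tabulated by enumerating the sample space. The sketch below assumes the coin is fair, so that all four outcomes are equally likely; the example above does not state this, and the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two flips: HH, HT, TH, TT
sample_space = ["".join(flips) for flips in product("HT", repeat=2)]

# X = number of tails in each outcome
tails_count = Counter(outcome.count("T") for outcome in sample_space)

# Assumption: fair coin, so each of the four outcomes has probability 1/4
distribution = {
    x: Fraction(count, len(sample_space))
    for x, count in sorted(tails_count.items())
}

for x, prob in distribution.items():
    print(f"P(X = {x}) = {prob}")
```

Under the fair-coin assumption the probabilities add up to 1, as they must for any probability distribution.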
Example: Random Experiments (Binomial or Not?)
➢ A fair coin is flipped 20 times; X represents the number of heads.
X is binomial with n = 20 and p = 0.5.
➢ You roll a fair die 50 times; X is the number of times you get a
six.
X is binomial with n = 50 and p = 1/6.
➢ Roll a fair die repeatedly; X is the number of rolls it takes to get a six.
X is not binomial, because the number of trials is not
fixed.
Mean and Variance of binomial distribution:
Mean = np
Variance = npq, where q = 1 - p
Mean > Variance, since 0 < q < 1
Example: Suppose we sample 120 people at random. 10% of
the population has blood type B. On average, how many
would you expect to have blood type B? What is the standard
deviation of the number X who have blood type B?
Mean = np = 120 × 0.1 = 12
Variance = npq = 120 × 0.1 × 0.9 = 10.8
Standard deviation = √10.8 ≈ 3.29
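The blood type example can be checked with a few lines of Python. This is a minimal sketch, assuming X is binomial with n = 120 and p = 0.1; it also cross-checks the shortcut formulas np and npq against the definitions of mean and variance by summing over the binomial probabilities. The variable names are illustrative only.

```python
from math import comb, sqrt

n, p = 120, 0.1   # sample size and proportion with blood type B
q = 1 - p

# Shortcut formulas for the binomial distribution
mean = n * p
variance = n * p * q
std_dev = sqrt(variance)

# Cross-check from the definition, summing over all values x = 0, ..., n
pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]
mean_from_pmf = sum(x * px for x, px in enumerate(pmf))
var_from_pmf = sum((x - mean_from_pmf) ** 2 * px for x, px in enumerate(pmf))

print(f"mean = {mean:.1f}, variance = {variance:.1f}, sd = {std_dev:.2f}")
```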
Example: In a community, the probability that a newly born
child will be a boy is 2/5. Among 4 newly born children in
that community, what is the probability of (i) all four boys,
(ii) at least two boys, (iii) no boys, (iv) exactly one boy and (v)
at most two boys?
Solution: Let us consider the event that a newly born child is a
boy as a success in a Bernoulli trial with probability of success 2/5.
Let the number of boys be a random variable X, which can take the values
0, 1, 2, 3, and 4.
The probability function of X is
$$f(x; 4, 2/5) = \binom{4}{x} \left(\frac{2}{5}\right)^{x} \left(\frac{3}{5}\right)^{4-x} \quad \text{for } x = 0, 1, 2, 3, 4$$
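(i) P[all boys] = P[X=4] = (2/5)^4 = 16/625 = 0.0256
(ii) P[at least two boys] = P[X ≥ 2] = 1 - P[X<2]
= 1 - (P[X=0] + P[X=1])
= 1 - [(3/5)^4 + 4(2/5)(3/5)^3]
= 1 - (0.1296 + 0.3456) = 0.5248
(iii) P[no boys] = P[X=0] = (3/5)^4 = 0.1296
(iv) P[exactly one boy] = P[X=1] = 4(2/5)(3/5)^3 = 0.3456
(v) P[at most two boys] = P[X ≤ 2] = P[X=0] + P[X=1] + P[X=2]
= 0.1296 + 0.3456 + 6(2/5)^2(3/5)^2 = 0.1296 + 0.3456 + 0.3456 = 0.8208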
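The five probabilities asked for in this example can be cross-checked with exact fractions. The following is a minimal sketch, assuming X is binomial with n = 4 and p = 2/5 as stated in the solution; the helper name `binom_pmf` is hypothetical.

```python
from fractions import Fraction
from math import comb

n = 4
p = Fraction(2, 5)  # probability that a newly born child is a boy


def binom_pmf(x):
    """P[X = x] for X binomial with parameters n and p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


answers = {
    "(i) all four boys": binom_pmf(4),
    "(ii) at least two boys": 1 - (binom_pmf(0) + binom_pmf(1)),
    "(iii) no boys": binom_pmf(0),
    "(iv) exactly one boy": binom_pmf(1),
    "(v) at most two boys": binom_pmf(0) + binom_pmf(1) + binom_pmf(2),
}

for label, prob in answers.items():
    print(f"{label}: {prob} = {float(prob):.4f}")
```

Note that in part (ii) the complement is taken of the whole sum P[X=0] + P[X=1], which is why the two terms are bracketed together.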
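Normal Distribution: A continuous random variable X is said to
have a normal distribution if its density function is given by
$$f(x; \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}; \quad -\infty < x < \infty$$
where the parameters μ and σ² satisfy -∞ < μ < ∞ and σ² > 0.
The parameters μ and σ² are the mean and variance of the
normal variate X.
X ~ N(μ, σ²)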
The graph of the normal curve is
[Figure: bell-shaped normal curve centered at the mean]
The curve is symmetric about the mean
The mean, the median, and the mode are all equal
The total area under the curve is 1 or 100%
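The symmetry and total-area properties listed above can be checked numerically. The sketch below evaluates the normal density directly and integrates it with the trapezoidal rule; the values μ = 10 and σ = 2 are chosen only for illustration, and the integration range of μ ± 8σ is an assumption that leaves out a negligible part of the tails.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    """Density of a normal variate with mean mu and standard deviation sigma."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))


mu, sigma = 10.0, 2.0  # illustrative values only

# Symmetry about the mean: equal heights at equal distances from mu
left = normal_pdf(mu - 1.5, mu, sigma)
right = normal_pdf(mu + 1.5, mu, sigma)

# Total area under the curve, approximated by the trapezoidal rule
a, b, steps = mu - 8 * sigma, mu + 8 * sigma, 20000
h = (b - a) / steps
inner = sum(normal_pdf(a + i * h, mu, sigma) for i in range(1, steps))
area = h * (0.5 * normal_pdf(a, mu, sigma) + inner + 0.5 * normal_pdf(b, mu, sigma))

print(f"f(mu - 1.5) = {left:.6f}, f(mu + 1.5) = {right:.6f}")
print(f"area under the curve = {area:.6f}")
```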
The black and the red normal curves have means or centers
at μ = 10. The red curve is more spread out and thus
has a larger standard deviation.
The black and the green normal curves have the same
standard deviation or spread but different mean.
Standard Normal Distribution: A continuous random
variable Z is said to be a standard normal variate if its
density function is given by
$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2} z^2}; \quad -\infty < z < \infty$$
Z ~ N(0, 1)
The Standard Deviation Rule for Normal Random
Variables (Empirical rule):
If X is a normal random variable, then approximately
➢ 68% of observations fall within 1 standard deviation of the mean
➢ 95% of observations fall within 2 standard deviations of the mean
➢ 99.7% of observations fall within 3 standard deviations of the mean
Using probability notation, we may write
P(μ - σ < X < μ + σ) ≈ 0.68
P(μ - 2σ < X < μ + 2σ) ≈ 0.95
P(μ - 3σ < X < μ + 3σ) ≈ 0.997
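The percentages in the Standard Deviation Rule can be checked numerically. This is a minimal sketch that uses the error function from Python's standard library: for any normal X, the probability of falling within k standard deviations of the mean equals erf(k/√2). The helper name `prob_within` is hypothetical.

```python
from math import erf, sqrt


def prob_within(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any normal random variable X."""
    return erf(k / sqrt(2))


within = {k: prob_within(k) for k in (1, 2, 3)}

for k, prob in within.items():
    print(f"within {k} standard deviation(s): {prob:.4f}")
```

The computed values are slightly more precise than the rounded 68%, 95% and 99.7% quoted in the rule.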
Suppose that foot length of a randomly chosen adult male is a
normal random variable with mean μ = 11 and standard
deviation σ = 1.5.
(i) What is the probability that a randomly chosen adult male
will have a foot length between 8 and 14 inches?
Answer: 8 and 14 are 11 ± 2(1.5), that is, within 2 standard deviations of the mean, so the probability is about .95 or 95%
(ii) An adult male is almost guaranteed (.997 probability) to
have a foot length between what two values?
Answer: 11 ± 3(1.5), that is, between 6.5 and 15.5 inches
What is the probability that a male’s foot length is more than 13
inches?
How many standard deviations below or above the mean male foot
length is 13 inches?
z = (x - μ)/σ
= (13 - 11)/1.5
= +1.33
We have just found the z-score for a male foot length of 13 inches
to be z = +1.33. In other words, we have standardized the value 13.
The standardized value z tells how many standard deviations
below or above the mean the original value is, and is calculated as
follows:
z-score = (value – mean)/standard deviation
What is the standardized value for a male foot length of 8.5
inches? How does this foot length relate to the mean?
Answer: z = (8.5 – 11) / 1.5 = -1.67.
This foot length is 1.67 standard deviations below the mean.
A man’s standardized foot length is +2.5. What is his actual
foot length in inches?
x = μ + zσ
x = 11 + 2.5(1.5) = 14.75 inches.
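Standardizing a value and converting a z-score back to the original scale are inverse operations. The sketch below reproduces the three foot-length calculations above, assuming μ = 11 and σ = 1.5 as in the example; the function names are hypothetical.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


def from_z_score(z, mu, sigma):
    """Original value corresponding to a standardized value z."""
    return mu + z * sigma


mu, sigma = 11, 1.5  # male foot length, in inches

z_13 = z_score(13, mu, sigma)
z_8_5 = z_score(8.5, mu, sigma)
length = from_z_score(2.5, mu, sigma)

print(f"z for 13 inches  = {z_13:+.2f}")
print(f"z for 8.5 inches = {z_8_5:+.2f}")
print(f"length for z = +2.5: {length} inches")
```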
Example: Male foot lengths have a normal distribution,
with mean 11 inches and standard deviation 1.5 inches.
What is the probability that a male’s foot length is
more than 13 inches?
Solution: The standardized value of 13 is
z = (x - μ)/σ = (13 - 11)/1.5 = 1.33
P(X > 13) = P(Z > 1.33)
= 1 - P(Z < 1.33)
= 1 - 0.9082
= 0.0918
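The table lookup in this solution can be reproduced in Python. This is a minimal sketch that builds the standard normal cumulative probability from the error function in the standard library; the helper name `phi` is hypothetical. It computes the answer twice: once with z rounded to two decimals, as a printed table requires, and once with the unrounded z, which gives a slightly smaller value.

```python
from math import erf, sqrt


def phi(z):
    """P(Z < z) for a standard normal variate Z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


mu, sigma = 11, 1.5
x = 13

z_rounded = round((x - mu) / sigma, 2)   # 1.33, as read from a z-table
p_table = 1 - phi(z_rounded)             # matches the table-based answer

p_unrounded = 1 - phi((x - mu) / sigma)  # uses z = 1.3333...

print(f"P(X > 13) with z = 1.33:    {p_table:.4f}")
print(f"P(X > 13) with unrounded z: {p_unrounded:.4f}")
```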
Suppose that the growth in inches during the tenth year of
life of a Bangladeshi boy is a normal random variate with mean
2 inches and standard deviation 1 inch. Find the probability
that a randomly selected boy will grow (i) between 1 and 2
inches in his tenth year, (ii) more than 3 inches in his tenth
year, (iii) at least 1 inch in his tenth year and (iv) less than 1
inch in his tenth year.
Solution:
(i) P[1<X<2] = P[(1-2)/1 < (X-2)/1 < (2-2)/1] = P[-1<Z<0]
= P[Z<0] - P[Z<-1]
= .5000 - .1587 = .3413
(ii) P[X>3] = P[Z>(3-2)/1] = P[Z>1] = 1 - P[Z<1]
= 1 - .8413 = 0.1587
(iii) P[X≥1] = P[Z≥-1] = 1 - P[Z<-1] = 1 - .1587 = 0.8413
(iv) P[X<1] = P[Z<-1] = 0.1587
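The four answers in this solution can be verified without a z-table. The sketch below assumes growth is normal with mean 2 inches and standard deviation 1 inch, as stated in the problem; the helper names `phi` and `standardize` are hypothetical.

```python
from math import erf, sqrt


def phi(z):
    """P(Z < z) for a standard normal variate Z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


mu, sigma = 2.0, 1.0  # growth in inches during the tenth year


def standardize(x):
    """Convert a growth value x to its z-score."""
    return (x - mu) / sigma


p_between_1_and_2 = phi(standardize(2)) - phi(standardize(1))  # (i)
p_more_than_3 = 1 - phi(standardize(3))                        # (ii)
p_at_least_1 = 1 - phi(standardize(1))                         # (iii)
p_less_than_1 = phi(standardize(1))                            # (iv)

print(f"(i)   {p_between_1_and_2:.4f}")
print(f"(ii)  {p_more_than_3:.4f}")
print(f"(iii) {p_at_least_1:.4f}")
print(f"(iv)  {p_less_than_1:.4f}")
```

Parts (iii) and (iv) are complementary events, so their probabilities add up to 1.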
Normal Approximation to Binomial and Poisson
The normal distribution is obtained from the binomial
distribution under the following conditions:
➢ Neither the probability of success nor the probability of failure is very small
➢ n, the number of trials, is very large (say n > 30), with np > 5
Then, mean, μ = np and
standard deviation, σ = √(npq)
The normal distribution is obtained from the Poisson
distribution under the following condition:
➢ The mean λ of the Poisson distribution is large (say λ > 1000).
Then, mean, μ = λ and standard deviation, σ = √λ
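To see how well the normal approximation to the binomial works, the sketch below reuses the die-rolling setting from earlier (n = 50, p = 1/6, so n > 30 and np > 5) and compares the exact binomial probability P(X ≤ 10) with its normal approximation. The cut-off of 10 is a hypothetical value chosen only for illustration. The approximation uses a continuity correction of 0.5, a common refinement that is not mentioned in the conditions above.

```python
from math import comb, erf, sqrt


def phi(z):
    """P(Z < z) for a standard normal variate Z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


n, p = 50, 1 / 6          # 50 rolls of a fair die, success = rolling a six
q = 1 - p
mu = n * p                # mean of the approximating normal
sigma = sqrt(n * p * q)   # standard deviation of the approximating normal

k = 10  # hypothetical cut-off, chosen for illustration

# Exact binomial probability P(X <= k)
exact = sum(comb(n, x) * p**x * q ** (n - x) for x in range(k + 1))

# Normal approximation with continuity correction (an added assumption)
approx = phi((k + 0.5 - mu) / sigma)

print(f"exact binomial P(X <= {k}) = {exact:.4f}")
print(f"normal approximation       = {approx:.4f}")
```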
Home Work
1. Suppose the average length of stay in a chronic disease hospital of a
certain type of patient is 60 days with a standard deviation of 15. If
it is reasonable to assume an approximately normal distribution of
lengths of stay, find the probability that a randomly selected
patient from this group will have a length of stay:
(a) Greater than 50 days (b) Less than 30 days
(c) Between 30 and 60 days (d) Greater than 90 days
2. The weights of a certain population of young adult females are
approximately normally distributed with a mean of 132 pounds
and a standard deviation of 15. Find the probability that a subject
selected at random from this population will weigh:
(a) More than 155 pounds (b) 100 pounds or less
(c) Between 105 and 145 pounds
3. Suppose it is known that the probability of recovery for a
certain disease is 0.4. If 35 people are stricken with the
disease, what is the probability that:
(a) 25 or more will recover?
(b) Fewer than five will recover?
(Use the normal approximation.)