Chapter 6 - Part II


Random Variable and Probability Distribution
Random Variable
• A random variable (r.v.) is a variable whose value is determined by chance.
Example:
1. In the experiment of tossing a coin three times, define the random variable X as the number of heads. What are the possible values of the r.v. X?
2. Let a pair of fair dice be tossed and let X denote the sum of the points obtained. What are the possible values of X?
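The possible values in both experiments can be found by listing every outcome and applying the random variable to it. The following is a minimal sketch of that idea; the variable names are illustrative and not part of the course material.

```python
from itertools import product

# Experiment 1: toss a coin three times; X = number of heads.
coin_outcomes = list(product("HT", repeat=3))  # 8 equally likely outcomes
heads_values = sorted({outcome.count("H") for outcome in coin_outcomes})

# Experiment 2: toss a pair of dice; X = sum of the points obtained.
dice_outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes
sum_values = sorted({first + second for first, second in dice_outcomes})

print("Number of heads can be:", heads_values)
print("Sum of two dice can be:", sum_values)
```

Both sets of values are finite and countable, so both random variables are discrete.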

11/13/2022 Introduction to Biostatistics 2


Types of Random Variable

Discrete random variables: variables that can assume only a specific number of values. Their values can be counted.
Continuous random variables: variables that can assume all values between any two given values.



Probability distribution



Probability distribution of discrete random variable



Probability distribution of discrete r.v.

Example: A shipment of 8 similar computers to a retail outlet contains 3 that are defective. If a school makes a random purchase of 2 of these computers, find the probability distribution for the number of defectives.
X = 0, 1, 2
P(X = 0) = (5C2 × 3C0) / 8C2 = 10/28
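The same counting argument gives the remaining probabilities: choose x of the 3 defective computers and 2 - x of the 5 good ones, then divide by the number of ways to choose any 2 of the 8. The sketch below works through that calculation with exact fractions; the function and constant names are illustrative.

```python
from fractions import Fraction
from math import comb

TOTAL = 8       # computers in the shipment
DEFECTIVE = 3   # defective computers in the shipment
PURCHASED = 2   # computers bought at random

def prob_defectives(x):
    """P(X = x): probability that exactly x purchased computers are defective."""
    good = TOTAL - DEFECTIVE
    ways = comb(DEFECTIVE, x) * comb(good, PURCHASED - x)
    return Fraction(ways, comb(TOTAL, PURCHASED))

distribution = {x: prob_defectives(x) for x in range(PURCHASED + 1)}

for x, probability in distribution.items():
    print(f"P(X = {x}) = {probability}")
print("Total probability:", sum(distribution.values()))
```

Fractions are printed in lowest terms, so 10/28 appears as 5/14.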
Expected value and variance of discrete r.v.



Expected value and variance of discrete r.v
Example:
1. The number of messages sent per hour over a computer network has the following distribution:

Number of messages (x):   10    11    12    13    14    15
Probability P(X = x):     0.08  0.15  0.30  0.20  0.20  0.07

a. Is it a proper PMF?
b. Find the expected value and standard deviation of the number of messages sent per hour.
2. Find the mean and variance of the r.v. X for the previous example 1.
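For the message example, a PMF is proper when every probability is non-negative and the probabilities sum to 1. The expected value is the probability-weighted average of the values, and the variance is the probability-weighted average of the squared deviations from that mean. A short sketch, with illustrative variable names:

```python
from math import isclose, sqrt

messages = [10, 11, 12, 13, 14, 15]
probabilities = [0.08, 0.15, 0.30, 0.20, 0.20, 0.07]

# a. Proper PMF: no negative probabilities and a total of 1.
is_proper_pmf = all(p >= 0 for p in probabilities) and isclose(sum(probabilities), 1.0)

# b. E(X) = sum of x * P(X = x); Var(X) = sum of (x - E(X))^2 * P(X = x).
mean = sum(x * p for x, p in zip(messages, probabilities))
variance = sum((x - mean) ** 2 * p for x, p in zip(messages, probabilities))
std_dev = sqrt(variance)

print("Proper PMF:", is_proper_pmf)
print(f"Mean = {mean:.2f}, variance = {variance:.2f}, SD = {std_dev:.2f}")
```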



Probability distribution of continuous r.v.



Expected value and variance of a continuous r.v.



Example



Common Discrete probability distributions
•Binomial Distribution
•Poisson Distribution

Common Continuous Probability Distributions

•Normal Distribution



Binomial Distribution
A binomial experiment is a probability experiment that satisfies the following requirements, called the assumptions of a binomial distribution.
• Each trial is a Bernoulli trial with only two possible outcomes (success or failure).
• The number of trials (n) is fixed, i.e. n is a positive whole number.
• The probability of success (p) remains the same at each trial.
• The n trials are independent.



Binomial Distribution……
• The binomial distribution is given by the probability mass function (pmf)
$$P(X = x) = \binom{n}{x} p^{x} q^{n-x}, \quad x = 0, 1, 2, \ldots, n$$
• In the formula, n = number of trials
x = number of successes in the n trials
n - x = number of failures in the n trials
p = probability of success on a single trial
q = 1 - p = probability of failure on a single trial
The parameters of the binomial distribution are n and p.

$$\mu = E(X) = np$$

$$\sigma^{2} = \operatorname{var}(X) = np(1 - p)$$
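The pmf and the mean and variance formulas can be checked numerically. The sketch below evaluates the pmf for every x and compares the resulting mean and variance with np and np(1 - p). The values n = 5 and p = 0.5 are illustrative assumptions chosen only for the check.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 5, 0.5  # illustrative values, not taken from a specific problem

pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
total = sum(pmf)
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

print("Sum of pmf:", total)
print("Mean from pmf:", mean, "vs np =", n * p)
print("Variance from pmf:", variance, "vs np(1 - p) =", n * p * (1 - p))
```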
Binomial Distribution…..
• Examples
• Tossing a coin 20 times to see how many tails
occur.
• Asking 200 people if they watch BBC news.
• Asking 100 people if they favor the ruling party.
• Rolling a die to see if a 5 appears.



Example (Binomial ..)
Example 1: Five fair coins are tossed. Find the probability of obtaining
a. No heads
b. At least four heads
c. At most 2 heads
d. Exactly 2 heads

Example 2: A mid-exam contains 10 multiple-choice questions, and each question has four alternatives with exactly one correct answer. Assuming the student guesses every answer at random, find the probability that the student correctly answers
a. Exactly 3 questions
b. Exactly 8 questions
c. At least 3 questions


Example (Binomial ..)
Suppose that in a certain malarious area past experience indicates that the probability that a person with a high fever will test positive for malaria is 0.7. Consider 3 randomly selected patients (with high fever) in that same area.
a) What is the probability that no patient will be positive for malaria?
b) What is the probability that exactly one patient will be positive for malaria?
c) What is the probability that exactly two of the patients will be positive for malaria?
d) What is the probability that all patients will be positive for malaria?
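Each part of the malaria example is one value of the binomial pmf with n = 3 and p = 0.7. The following sketch evaluates all four; the names are illustrative.

```python
from math import comb

N_PATIENTS = 3     # randomly selected patients with high fever
P_POSITIVE = 0.7   # probability that one such patient is positive for malaria

def prob_positive(x):
    """P(exactly x of the N_PATIENTS patients are positive)."""
    return comb(N_PATIENTS, x) * P_POSITIVE ** x * (1 - P_POSITIVE) ** (N_PATIENTS - x)

probabilities = [prob_positive(x) for x in range(N_PATIENTS + 1)]

for label, x in zip("abcd", range(N_PATIENTS + 1)):
    print(f"{label}) P(X = {x}) = {probabilities[x]:.3f}")
```

The four probabilities cover every possible outcome, so they sum to 1.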



Example (Binomial …)
Example 3. Suppose that an examination consists of six true
and false questions, and assume that a student has no
knowledge of the subject matter. The probability that the
student will guess the correct answer to the first question is
30%. Likewise, the probability of guessing each of the
remaining questions correctly is also 30%.
• What is the probability of getting more than three correct answers?
• What is the probability of getting at least two correct answers?
• What is the probability of getting at most three correct answers?
• What is the probability of getting less than five correct answers?



The Poisson Probability Distribution

• The Poisson distribution is also used to represent the probability distribution of a discrete random variable.
• It is employed in describing random events that occur rarely over a continuum of time or space.
• Example
- Number of misprints
- Number of natural disasters such as earthquakes
- Number of telephone calls per hour
- Number of car accidents per week


The Poisson Distribution of the random variable X.

$$P(X = x) = \frac{\lambda^{x} e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots$$
• λ = mean number of occurrences in the given unit of time, area, volume, etc.
• e = 2.71828….
• Mean µ = λ, variance σ² = λ
Example
1. If 1.6 accidents can be expected at an intersection on any given day, what is the probability that there will be 3 accidents on any given day?
2. In a hospital, the average number of newborn female babies in every 24 hours is 8. What is the probability that
a. No female babies are born in a day
b. Exactly three female babies are born in a day
c. 2 female babies are born in 12 hours
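The Poisson examples can be evaluated directly from the pmf. The sketch below is one way to do it; for part 2c it assumes the mean scales in proportion to the length of the interval, so 8 births per 24 hours becomes λ = 4 per 12 hours. The names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return lam ** x * exp(-lam) / factorial(x)

# Example 1: mean of 1.6 accidents per day, probability of 3 accidents.
p_three_accidents = poisson_pmf(3, 1.6)

# Example 2: mean of 8 female births per 24 hours.
p_no_births = poisson_pmf(0, 8)
p_three_births = poisson_pmf(3, 8)
# Assumption: the rate is proportional to time, so lam = 8 * 12 / 24 = 4.
p_two_births_in_12_hours = poisson_pmf(2, 8 * 12 / 24)

print(f"1.  P(3 accidents)            = {p_three_accidents:.4f}")
print(f"2a. P(no births in a day)     = {p_no_births:.6f}")
print(f"2b. P(3 births in a day)      = {p_three_births:.4f}")
print(f"2c. P(2 births in 12 hours)   = {p_two_births_in_12_hours:.4f}")
```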
Continuous Probability Distributions
The Normal distribution
♣ The Normal Distribution is by far the most important probability distribution in statistics.

♣ The normal distribution is a theoretical, continuous probability distribution whose equation is:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}, \quad -\infty < x < +\infty$$

where $-\infty < \mu < +\infty$ and $\sigma > 0$.


Characteristics of the Normal Distribution
♣ It is a probability distribution of a continuous variable. It extends from minus infinity (-∞) to plus infinity (+∞).
♣ It is unimodal, bell-shaped and symmetrical about x = μ.
♣ The mean, the median and mode are all equal.
♣ The total area under the curve above the x-axis is one square unit.
♣ The curve never touches the x-axis.
♣ It is determined by two quantities: its mean (μ) and SD (σ).
♣ An observation from a normal distribution can be related to a standard normal distribution (SND) which has a published table.


Properties of Normal Distribution



Standard normal distribution
♣ Since the values of μ and σ will depend on the particular problem in hand and tables of the normal distribution cannot be published for all values of μ and σ, calculations are made by referring to the standard normal distribution which has μ = 0 and σ = 1.
♣ Thus an observation x from a normal distribution with mean μ and standard deviation σ can be related to a Standard normal distribution by calculating:
SND = Z = (x - μ) / σ
Properties of the Standard Normal Distribution:
Same as a normal distribution, but also...
• Mean is zero
• Variance is one
• Standard Deviation is one
• Areas under the standard normal distribution
curve have been tabulated in various ways.
• The most common ones are the areas between Z=
0 and a positive value of Z
The Standard Normal Distribution
The standard normal random variable, Z, is the normal random variable with mean μ = 0 and standard deviation σ = 1: Z ~ N(0, 1²).

[Figure: Standard Normal Distribution. Density f(z) plotted against Z from -5 to 5, with μ = 0 and σ = 1 marked.]


• Given a normally distributed random variable X with mean μ and standard deviation σ:

$$P(a < X < b) = P\left(\frac{a-\mu}{\sigma} < \frac{X-\mu}{\sigma} < \frac{b-\mu}{\sigma}\right)$$

$$P(a < X < b) = P\left(\frac{a-\mu}{\sigma} < Z < \frac{b-\mu}{\sigma}\right)$$


Examples
1. Find the area under the standard normal distribution which lies
a. Between Z = 0 and Z = 0.96
b. Between Z = -1.45 and Z = 0
c. To the right of Z = -0.35
d. Between Z = -0.67 and Z = 0.75


• Solutions
a. Area = P(0 < Z < 0.96) = 0.3315
b. Area = P(-1.45 < Z < 0)
 = P(0 < Z < 1.45)
 = 0.4265
c. Area = P(Z > -0.35)
 = P(-0.35 < Z < 0) + P(Z > 0)
 = P(0 < Z < 0.35) + P(Z > 0)
 = 0.1368 + 0.50 = 0.6368
d. Area = P(-0.67 < Z < 0.75)
 = P(-0.67 < Z < 0) + P(0 < Z < 0.75)
 = P(0 < Z < 0.67) + P(0 < Z < 0.75)
 = 0.2486 + 0.2734 = 0.5220
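The table-based areas above can be cross-checked without a table, because the standard normal cumulative probability can be written with the error function: Φ(z) = ½[1 + erf(z/√2)]. This is a sketch of that check; the helper name `phi` is illustrative, and small differences from the table values come from rounding to four decimals.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative probability P(Z < z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

area_a = phi(0.96) - phi(0.0)    # between Z = 0 and Z = 0.96
area_b = phi(0.0) - phi(-1.45)   # between Z = -1.45 and Z = 0
area_c = 1.0 - phi(-0.35)        # to the right of Z = -0.35
area_d = phi(0.75) - phi(-0.67)  # between Z = -0.67 and Z = 0.75

for label, area in zip("abcd", (area_a, area_b, area_c, area_d)):
    print(f"{label}. Area = {area:.4f}")
```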
Finding Probabilities of the Standard Normal Distribution: P(0 < Z < 1.56)

[Figure: Standard Normal Distribution. Density f(z) plotted against Z from -5 to 5, with z = 1.56 marked.]

Standard Normal Probabilities (area between 0 and z)

 z     .00     .01     .02     .03     .04     .05     .06     .07     .08     .09
0.0  0.0000  0.0040  0.0080  0.0120  0.0160  0.0199  0.0239  0.0279  0.0319  0.0359
0.1  0.0398  0.0438  0.0478  0.0517  0.0557  0.0596  0.0636  0.0675  0.0714  0.0753
0.2  0.0793  0.0832  0.0871  0.0910  0.0948  0.0987  0.1026  0.1064  0.1103  0.1141
0.3  0.1179  0.1217  0.1255  0.1293  0.1331  0.1368  0.1406  0.1443  0.1480  0.1517
0.4  0.1554  0.1591  0.1628  0.1664  0.1700  0.1736  0.1772  0.1808  0.1844  0.1879
0.5  0.1915  0.1950  0.1985  0.2019  0.2054  0.2088  0.2123  0.2157  0.2190  0.2224
0.6  0.2257  0.2291  0.2324  0.2357  0.2389  0.2422  0.2454  0.2486  0.2517  0.2549
0.7  0.2580  0.2611  0.2642  0.2673  0.2704  0.2734  0.2764  0.2794  0.2823  0.2852
0.8  0.2881  0.2910  0.2939  0.2967  0.2995  0.3023  0.3051  0.3078  0.3106  0.3133
0.9  0.3159  0.3186  0.3212  0.3238  0.3264  0.3289  0.3315  0.3340  0.3365  0.3389
1.0  0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621
1.1  0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790  0.3810  0.3830
1.2  0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980  0.3997  0.4015
1.3  0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147  0.4162  0.4177
1.4  0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292  0.4306  0.4319
1.5  0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418  0.4429  0.4441
1.6  0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525  0.4535  0.4545
1.7  0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616  0.4625  0.4633
1.8  0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693  0.4699  0.4706
1.9  0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756  0.4761  0.4767
2.0  0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808  0.4812  0.4817
2.1  0.4821  0.4826  0.4830  0.4834  0.4838  0.4842  0.4846  0.4850  0.4854  0.4857
2.2  0.4861  0.4864  0.4868  0.4871  0.4875  0.4878  0.4881  0.4884  0.4887  0.4890
2.3  0.4893  0.4896  0.4898  0.4901  0.4904  0.4906  0.4909  0.4911  0.4913  0.4916
2.4  0.4918  0.4920  0.4922  0.4925  0.4927  0.4929  0.4931  0.4932  0.4934  0.4936
2.5  0.4938  0.4940  0.4941  0.4943  0.4945  0.4946  0.4948  0.4949  0.4951  0.4952
2.6  0.4953  0.4955  0.4956  0.4957  0.4959  0.4960  0.4961  0.4962  0.4963  0.4964
2.7  0.4965  0.4966  0.4967  0.4968  0.4969  0.4970  0.4971  0.4972  0.4973  0.4974
2.8  0.4974  0.4975  0.4976  0.4977  0.4977  0.4978  0.4979  0.4979  0.4980  0.4981
2.9  0.4981  0.4982  0.4982  0.4983  0.4984  0.4984  0.4985  0.4985  0.4986  0.4986
3.0  0.4987  0.4987  0.4987  0.4988  0.4988  0.4989  0.4989  0.4989  0.4990  0.4990

Look in the row labeled 1.5 and the column labeled .06 to find P(0 < Z < 1.56) = 0.4406.


Example: A random variable X has a normal distribution with mean 80 and standard deviation 4.8. What is the probability that it will take a value
• Less than 87.2
• Greater than 76.4
• Between 81.2 and 86.0
Solution: X is normal with mean μ = 80 and standard deviation σ = 4.8.
a. P(X < 87.2) = P((X - μ)/σ < (87.2 - μ)/σ)
 = P(Z < (87.2 - 80)/4.8)
 = P(Z < 1.5)
 = P(Z < 0) + P(0 < Z < 1.5)
 = 0.50 + 0.4332 = 0.9332
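The remaining two parts follow the same standardization, and all three can be cross-checked in software. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later), which standardizes internally; the variable names are illustrative.

```python
from statistics import NormalDist

x_dist = NormalDist(mu=80, sigma=4.8)

# a. P(X < 87.2) is the cumulative probability at 87.2.
p_less_than_87_2 = x_dist.cdf(87.2)

# b. P(X > 76.4) is the complement of the cumulative probability at 76.4.
p_greater_than_76_4 = 1 - x_dist.cdf(76.4)

# c. P(81.2 < X < 86.0) is the difference of two cumulative probabilities.
p_between_81_2_and_86 = x_dist.cdf(86.0) - x_dist.cdf(81.2)

print(f"a. P(X < 87.2)        = {p_less_than_87_2:.4f}")
print(f"b. P(X > 76.4)        = {p_greater_than_76_4:.4f}")
print(f"c. P(81.2 < X < 86.0) = {p_between_81_2_and_86:.4f}")
```

Part b corresponds to P(Z > -0.75) and part c to P(0.25 < Z < 1.25), so the same answers can be read from the standard normal table.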
Example: A random variable has a normal distribution with σ = 5. Find its mean if the probability that the random variable will assume a value less than 52.5 is 0.6915.
Solution:
P(Z < z) = P(Z < (52.5 - μ)/5) = 0.6915
⇒ P(0 < Z < z) = 0.6915 - 0.50 = 0.1915
But from the table
⇒ P(0 < Z < 0.5) = 0.1915
⇒ z = (52.5 - μ)/5 = 0.5
⇒ μ = 50
Example: A normal distribution has mean 62.4. Find its standard deviation if 20.05% of the area under the normal curve lies to the right of 72.9.
Solution
P(X > 72.9) = 0.2005 ⇒ P((X - μ)/σ > (72.9 - μ)/σ) = 0.2005
⇒ P(Z > (72.9 - 62.4)/σ) = 0.2005
⇒ P(Z > 10.5/σ) = 0.2005
⇒ P(0 < Z < 10.5/σ) = 0.50 - 0.2005 = 0.2995
And from the table, P(0 < Z < 0.84) = 0.2995
⇒ 10.5/σ = 0.84
⇒ σ = 12.5
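Problems like the two above run the table lookup in reverse: a probability is given and the z value is wanted. As a hedged cross-check, the sketch below uses `NormalDist.inv_cdf` from the Python standard library in place of the table; the variable names are illustrative, and the results differ slightly from the hand solutions because the table rounds z to two decimals.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Problem 1: sigma = 5 and P(X < 52.5) = 0.6915; solve z = (52.5 - mu) / 5 for mu.
z_for_mean = standard_normal.inv_cdf(0.6915)
mean_estimate = 52.5 - 5 * z_for_mean

# Problem 2: mu = 62.4 and P(X > 72.9) = 0.2005, so P(X < 72.9) = 1 - 0.2005;
# solve z = (72.9 - 62.4) / sigma for sigma.
z_for_sd = standard_normal.inv_cdf(1 - 0.2005)
sd_estimate = (72.9 - 62.4) / z_for_sd

print(f"z = {z_for_mean:.4f}, mean = {mean_estimate:.2f}")
print(f"z = {z_for_sd:.4f}, standard deviation = {sd_estimate:.2f}")
```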
Exercise 1: A study was done on breath metabolites such as ammonia, acetone, isoprene, ethanol and acetaldehyde in five subjects over a period of 30 days. Each day, breath samples were taken and analyzed in the early morning on arrival at the laboratory. For subject A, a 27-year-old female, the ammonia concentration in parts per billion (ppb) followed a normal distribution over 30 days with mean 491 and standard deviation 119. What is the probability that on a random day, the subject's ammonia concentration is between 292 and 649 ppb?
Exercise 2: Of a large group of men, 5% are less than 60 inches in height and 40% are between 60 and 65 inches. Assuming a normal distribution, find the mean and standard deviation of heights.
