Lecture On Random Variables Statistics

A random variable is a variable whose value depends on the outcome of a random experiment. Random variables can be either discrete or continuous. Discrete random variables assume countable or isolated values, like the number of heads in a coin toss. Continuous random variables can assume any value in an interval, like a person's height. The probability distribution of a random variable lists all possible values and their probabilities. For example, the probability distribution of tossing a coin twice shows the four possible outcomes and their 0.25 probabilities. Common probability distributions include the Bernoulli, binomial, and normal distributions. The binomial models experiments with a fixed number of trials with two outcomes, like coin tosses. It gives the probability of

Uploaded by

Naym Mia
Copyright
© © All Rights Reserved

Random variables

A random variable is a variable whose value is determined by the outcome of
a random experiment. In other words, if a variable has a probability distribution,
then it is called a random variable. Random variables can be either discrete or
continuous.

Example

Suppose one family is randomly selected from this population. The process of
randomly selecting a family is called a random experiment. Let x denote the
number of vehicles owned by the selected family. Then x can assume any of the
five possible values (0, 1, 2, 3, and 4) listed in the first column of Table 5.1. The
value assumed by x depends on which family is selected. Thus, this value
depends on the outcome of a random experiment. Consequently, x is called a
random variable.

Discrete random variable

A random variable that assumes countable (isolated) values is called a
discrete random variable.

Examples

1. The number of cars sold at a dealership during a given month

2. The number of customers who visit a bank during any given hour

3. The number of heads obtained in two tosses of a coin

4. The number of defects in a batch of 50 items


Continuous random variable

A random variable that can assume any value contained in one or more intervals
is called a continuous random variable.

Examples

1. The height of a person

2. Age of a person

3. The price of a house

Probability distribution

The probability distribution of a discrete random variable lists all the
possible values that the random variable can assume and their corresponding
probabilities.

Example

To begin our study of probability distributions, let us go back to the idea of a
fair coin. Suppose we toss a fair coin twice. The possible outcomes are:

Possible outcomes from two tosses of a fair coin

First toss   Second toss   Number of heads   Probability of the four
                           on two tosses     possible outcomes
T            T             0                 0.5 * 0.5 = 0.25
T            H             1                 0.5 * 0.5 = 0.25
H            T             1                 0.5 * 0.5 = 0.25
H            H             2                 0.5 * 0.5 = 0.25
Total                                        1.0
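
The table above can be reproduced by enumeration. The following is a minimal Python sketch, assuming a fair coin (probability 0.5 of a head on each toss); the names `outcomes` and `pmf_heads` are illustrative and do not come from the lecture.

```python
from collections import defaultdict
from itertools import product

# The four equally likely outcomes of two tosses of a fair coin.
outcomes = list(product("TH", repeat=2))
p_outcome = 0.5 * 0.5  # probability of each individual outcome

# Add up the outcome probabilities for each number of heads.
pmf_heads = defaultdict(float)
for first, second in outcomes:
    heads = (first == "H") + (second == "H")
    pmf_heads[heads] += p_outcome

for heads in sorted(pmf_heads):
    print(heads, pmf_heads[heads])
# prints:
# 0 0.25
# 1 0.5
# 2 0.25
```

The random variable "number of heads" takes only three values, and P(1) = 0.5 because two of the four outcomes contain exactly one head.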

Characteristics of probability distribution

The probability distribution of a discrete random variable possesses the
following two characteristics.

1. 0 ≤ P (x) ≤ 1 for each value of x

2. ΣP (x) = 1.
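
These two characteristics are easy to check mechanically. The sketch below is one possible check, assuming the distribution is stored as a Python dictionary mapping each value x to P(x); the function name `is_valid_pmf` and the tolerance are hypothetical choices, not part of the lecture.

```python
import math

def is_valid_pmf(pmf, tol=1e-9):
    """Return True if every P(x) lies in [0, 1] and the P(x) sum to 1."""
    probs = list(pmf.values())
    in_range = all(0.0 <= p <= 1.0 for p in probs)
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# Number of heads in two tosses of a fair coin.
heads_in_two_tosses = {0: 0.25, 1: 0.50, 2: 0.25}
print(is_valid_pmf(heads_in_two_tosses))   # True
print(is_valid_pmf({0: 0.25, 1: 0.50}))    # False: probabilities sum to 0.75
```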

Types of probability distribution

Probability distribution

Discrete probability distribution     Continuous probability distribution
Bernoulli distribution                Uniform distribution
Binomial distribution                 Exponential distribution
Poisson distribution                  Normal distribution


Discrete probability distribution

The probability distribution of a discrete random variable lists all the possible
values that the random variable can assume and their corresponding
probabilities.

Example

The distribution of the month in which a person was born is discrete, because
the month can take only 12 possible values.

Continuous probability distribution

In a continuous probability distribution, the variable under consideration can
take on any value within a given range. So, we cannot list all the possible values.

Example

Suppose we were examining the level of effluent in a variety of streams and we
measured the level of effluent in parts of effluent per million parts of water. We
would expect a continuous range of parts per million (ppm), all the way
from very low levels in clear mountain streams to extremely high levels in
polluted streams. We would call the distribution of this variable (ppm) a
continuous distribution.

Bernoulli distribution

Bernoulli trial

A random experiment whose outcomes have been classified into two categories,
namely “success” and “failure”, represented by the letters S and F respectively, is
called a Bernoulli trial.

Bernoulli distribution

A discrete random variable X is said to have a Bernoulli distribution if its
probability function is given by

\[
f(x; p) =
\begin{cases}
p^{x} q^{1-x}, & \text{for } x = 0, 1 \\
0, & \text{otherwise}
\end{cases}
\]

where p is the parameter of the distribution satisfying $0 \le p \le 1$ and $p + q = 1$.
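
The Bernoulli probability function can be written directly as a small function. This is a minimal sketch; the function name `bernoulli_pmf` and the value p = 0.3 are hypothetical and chosen only for illustration.

```python
def bernoulli_pmf(x, p):
    """f(x; p) = p^x q^(1 - x) for x in {0, 1}, and 0 otherwise."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must satisfy 0 <= p <= 1")
    q = 1.0 - p
    if x in (0, 1):
        return p**x * q ** (1 - x)
    return 0.0

p = 0.3  # hypothetical probability of success
print(bernoulli_pmf(1, p))  # probability of success, equal to p
print(bernoulli_pmf(0, p))  # probability of failure, equal to q = 1 - p
print(bernoulli_pmf(2, p))  # 0.0, because 2 is outside {0, 1}
```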

Example

A coin is tossed, where the outcome “head” is a success and the probability of a
head is p. Then q = 1 - p is the probability of failure (tail). If the number of
heads (successes) is a random variable X, then X takes the value 0 or 1 according
to whether the outcome is a tail (failure) or a head (success). The probability
function of X is

\[
f(x; p) =
\begin{cases}
p^{x} q^{1-x}, & \text{for } x = 0, 1 \\
0, & \text{otherwise}
\end{cases}
\]

Binomial distribution

Introduction

The binomial distribution was first derived by the Swiss mathematician James
Bernoulli (1654-1705) and was first published posthumously in 1713, eight years
after his death.

Definition

A discrete random variable X is said to have a binomial distribution if its
probability function is defined by

\[
f(x; n, p) =
\begin{cases}
\binom{n}{x} p^{x} q^{n-x}, & \text{for } x = 0, 1, 2, \ldots, n \\
0, & \text{otherwise}
\end{cases}
\]

where the two parameters n and p satisfy $0 \le p \le 1$ and $p + q = 1$, and n is a
positive integer. For a binomial experiment, the probability of exactly x
successes in n trials is given by the binomial formula

\[
P(x) = \binom{n}{x} p^{x} q^{n-x}
\]

where
n = total number of trials
p = probability of success
q = 1 - p = probability of failure
x = number of successes in n trials
n - x = number of failures in n trials
Conditions for Binomial distribution

- There are n identical trials, where n is finite.

- Each trial has only two possible outcomes.

- The probabilities of the two outcomes remain constant.

- The trials are independent.

Mean of Binomial distribution

Mean = np, where n = number of trials and p = probability of success.

Variance of Binomial distribution

Variance = npq, where n = number of trials, p = probability of success and
q = probability of failure = 1 - p.
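
The probability function, mean, and variance of the binomial distribution can be checked numerically. The following is a minimal sketch, assuming hypothetical values n = 6 and p = 0.5 that do not come from the lecture; it computes the mean and variance directly from the probabilities and compares them with np and npq.

```python
from math import comb, isclose

def binomial_pmf(x, n, p):
    """f(x; n, p) = C(n, x) p^x q^(n - x) for x = 0, 1, ..., n, else 0."""
    if not (isinstance(x, int) and 0 <= x <= n):
        return 0.0
    q = 1 - p
    return comb(n, x) * p**x * q ** (n - x)

n, p = 6, 0.5  # hypothetical values, chosen only for illustration
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance computed from the definition, not from np and npq.
mean = sum(x * fx for x, fx in enumerate(pmf))
variance = sum((x - mean) ** 2 * fx for x, fx in enumerate(pmf))

print(isclose(sum(pmf), 1.0))              # True: the probabilities sum to 1
print(isclose(mean, n * p))                # True: mean = np
print(isclose(variance, n * p * (1 - p)))  # True: variance = npq
```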

Example 1:

Five percent of all DVD players manufactured by a large electronics company
are defective. Three DVD players are randomly selected from the production
line of this company. The selected DVD players are inspected to determine
whether each of them is defective or good. Is this experiment a binomial
experiment?

1. This example consists of three identical trials.
2. Each trial has two outcomes: defective or good.
3. The probability p that a DVD player is defective is 0.05. The
probability q that a DVD player is good is 0.95.
4. Each trial (DVD player) is independent.

Because all four conditions of a binomial experiment are satisfied, this is
an example of a binomial experiment.

Example 2:

In a community, the probability that a newly born child will be a boy is 2/5.
Among the 4 newly born children in that community, what is the probability that
(a) All four are boys
(b) None is a boy
(c) Exactly one is a boy.

Solution

Let us consider the event that a newly born child is a boy as a success in a
Bernoulli trial with probability of success 2/5. Let the number of boys be a
random variable X. Then X can take values 0, 1, 2, 3, and 4.

According to the binomial law, the probability function of X is

\[
f\left(x; 4, \tfrac{2}{5}\right) = \binom{4}{x} \left(\tfrac{2}{5}\right)^{x} \left(\tfrac{3}{5}\right)^{4-x} \quad \text{for } x = 0, 1, 2, 3, 4.
\]

a) \( P(\text{all boys}) = P(X = 4) = \binom{4}{4} \left(\tfrac{2}{5}\right)^{4} \left(\tfrac{3}{5}\right)^{4-4} = 0.0256 \).

b) \( P(\text{no boys}) = P(X = 0) = \binom{4}{0} \left(\tfrac{2}{5}\right)^{0} \left(\tfrac{3}{5}\right)^{4-0} = 0.1296 \).

c) \( P(\text{exactly one boy}) = P(X = 1) = \binom{4}{1} \left(\tfrac{2}{5}\right)^{1} \left(\tfrac{3}{5}\right)^{4-1} = 0.3456 \).
Example 3:

A fair coin is tossed 5 times. Find the probability of

a) exactly two heads
b) no head

Solution

Let the number of heads be a random variable X which can take values 0, 1, 2,
3, 4 and 5. Then X is a binomial variate with p = 1/2 and n = 5.
The probability function of X is

\[
f\left(x; 5, \tfrac{1}{2}\right) = \binom{5}{x} \left(\tfrac{1}{2}\right)^{x} \left(\tfrac{1}{2}\right)^{5-x} \quad \text{for } x = 0, 1, 2, 3, 4, 5.
\]

a) \( P(\text{exactly two heads}) = P(X = 2) = \binom{5}{2} \left(\tfrac{1}{2}\right)^{2} \left(\tfrac{1}{2}\right)^{5-2} = 0.3125 \).

b) \( P(\text{no heads}) = P(X = 0) = \binom{5}{0} \left(\tfrac{1}{2}\right)^{0} \left(\tfrac{1}{2}\right)^{5-0} = 0.03125 \).
Example 4:

Determine the binomial distribution for which mean is 4 and variance is 3.

Solution

Let X be a binomial variate with parameters n and p. Here, we have np = 4
and npq = 3. Thus

\[
q = \frac{npq}{np} = \frac{3}{4}, \qquad p = 1 - q = 1 - \frac{3}{4} = \frac{1}{4}, \qquad n = \frac{4}{p} = \frac{4}{1/4} = 16.
\]

Hence, the binomial distribution is

\[
f\left(x; 16, \tfrac{1}{4}\right) =
\begin{cases}
\binom{16}{x} \left(\tfrac{1}{4}\right)^{x} \left(\tfrac{3}{4}\right)^{16-x}, & \text{for } x = 0, 1, 2, \ldots, 16 \\
0, & \text{otherwise}
\end{cases}
\]
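
The same steps (q = npq / np, then p = 1 - q, then n = mean / p) can be sketched in Python. The function name `binomial_parameters` and its validity checks are assumptions added for illustration, not part of the lecture.

```python
def binomial_parameters(mean, variance):
    """Solve np = mean and npq = variance for (n, p)."""
    if not 0 < variance < mean:
        raise ValueError("a binomial distribution needs 0 < variance < mean")
    q = variance / mean  # npq / np
    p = 1 - q
    n = mean / p
    if abs(n - round(n)) > 1e-9:
        raise ValueError("n must be a whole number of trials")
    return round(n), p

n, p = binomial_parameters(mean=4, variance=3)
print(n, p)  # prints: 16 0.25
```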
Example 5:
At the Express House Delivery Service, providing high-quality service to
customers is the top priority of the management. The company guarantees a
refund of all charges if a package it is delivering does not arrive at its
destination by the specified time. It is known from past data that despite all
efforts, 2% of the packages mailed through this company do not arrive at their
destinations within the specified time. Suppose a corporation mails 10 packages
through Express House Delivery Service on a certain day.
a) Find the probability that exactly one of these 10 packages will not arrive
at its destination within the specified time.
b) Find the probability that at most one of these 10 packages will not arrive
at its destination within the specified time.

Solution:
n = total number of packages mailed = 10
p = P(success) = 2% = 0.02, where a success is a package that does not arrive on time
q = P(failure) = 1 - 0.02 = 0.98

a) We know that
x = number of successes = 1
n - x = number of failures = 10 - 1 = 9

\[
P(x = 1) = \binom{10}{1} (0.02)^{1} (0.98)^{9} = \frac{10!}{1!\,(10 - 1)!} (0.02)^{1} (0.98)^{9} = (10)(0.02)(0.83374776) = 0.1667
\]

Thus, there is a 0.1667 probability that exactly one of the 10 packages mailed
will not arrive at its destination within the specified time.

b) "At most one" means x = 0 or x = 1.

\[
\begin{aligned}
P(x \le 1) &= P(x = 0) + P(x = 1) \\
&= \binom{10}{0} (0.02)^{0} (0.98)^{10} + \binom{10}{1} (0.02)^{1} (0.98)^{9} \\
&= (1)(1)(0.81707281) + (10)(0.02)(0.83374776) \\
&= 0.8171 + 0.1667 = 0.9838
\end{aligned}
\]

Thus, the probability that at most one of the 10 packages mailed will not arrive
at its destination within the specified time is 0.9838.
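
The arithmetic in this example can be reproduced with the standard library. This is a minimal sketch; the helper name `binomial_pmf` is illustrative.

```python
from math import comb

n, p = 10, 0.02  # 10 packages, each late with probability 0.02
q = 1 - p

def binomial_pmf(x):
    """Probability of exactly x late packages out of n."""
    return comb(n, x) * p**x * q ** (n - x)

p_exactly_one = binomial_pmf(1)
p_at_most_one = binomial_pmf(0) + binomial_pmf(1)

print(round(p_exactly_one, 4))  # 0.1667
print(round(p_at_most_one, 4))  # 0.9838
```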

Example 6: The phone lines to an airline reservation system are occupied 40%
of the time. Assume that the events that the lines are occupied on successive
calls are independent. Assume that 10 calls are placed to the airline.

(a) What is the probability that for exactly three calls the lines are occupied?
(b) What is the probability that for at least one call the lines are not occupied?
(c) What is the expected number of calls in which the lines are all occupied?

Solution:

Let X be the number of calls, out of the 10 placed, for which the lines are
occupied.

Then p = 40% = 0.40, q = 1 - p = 1 - 0.40 = 0.60 and n = 10.

According to the binomial law, the probability function of X is

\[
f(x) = \binom{n}{x} p^{x} q^{n-x}
\]

(a) \( P(X = 3) = f(3) = \binom{10}{3} (0.4)^{3} (0.6)^{10-3} = 120 \times 0.064 \times 0.028 = 0.215 \)

(b) The lines are not occupied for at least one call unless they are occupied
for all 10 calls, so

\( P(X \le 9) = 1 - P(X = 10) = 1 - \binom{10}{10} (0.4)^{10} (0.6)^{10-10} = 1 - 0.0001 = 0.9999 \)

(c) The expected number of calls for which the lines are occupied is

Mean = np = 10 * 0.40 = 4
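
The three parts can be checked numerically. Part (b) as worded asks about calls for which the lines are not occupied, which is the complement of all 10 calls finding the lines occupied; the sketch below prints that value and, for comparison, 1 - P(X = 0), the probability that the lines are occupied for at least one call. The helper and variable names are illustrative.

```python
from math import comb

n, p = 10, 0.40  # 10 calls, lines occupied with probability 0.40

def binomial_pmf(x):
    """Probability that the lines are occupied on exactly x of the n calls."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

p_three_occupied = binomial_pmf(3)
p_some_call_not_occupied = 1 - binomial_pmf(n)  # complement of "all 10 occupied"
p_some_call_occupied = 1 - binomial_pmf(0)      # complement of "none occupied"
expected_occupied = n * p

print(round(p_three_occupied, 3))          # 0.215
print(round(p_some_call_not_occupied, 4))  # 0.9999
print(round(p_some_call_occupied, 3))      # 0.994
print(expected_occupied)                   # 4.0
```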

Example 7 (self test)

Each sample of water has a 10% chance of containing a particular organic
pollutant. Assume that the samples are independent with regard to the presence
of the pollutant. Find the probability that, in the next 18 samples, (i) exactly 2
contain the pollutant; (ii) at least four samples contain the pollutant.

Example 8: The incidence of occupational disease in an industry is such that
the workers have a 20% chance of suffering from it. What is the probability that
out of six workers

i. 4 or more will contract the disease?
ii. Exactly 3 will contract the disease?
iii. At most 2 will contract the disease?
iv. Find the mean and variance of the number of workers suffering from
the disease.

Example 9: Warranty records show that the probability that a new car needs a
warranty repair in the first 90 days is 0.05. If a sample of three new cars is
selected, what is the probability that in the first 90 days
i. None needs a warranty repair?
ii. More than one needs a warranty repair?
iii. At least one needs a warranty repair?
iv. What are the mean and standard deviation of the number of warranty
repairs?
Poisson distribution

Introduction

The Poisson distribution was developed by the French mathematician and physicist
Simeon Denis Poisson (1781-1840), who published it in 1837.

Definition

A discrete random variable X is said to have a Poisson distribution if its
probability function is given by

\[
f(x; \lambda) =
\begin{cases}
\dfrac{e^{-\lambda} \lambda^{x}}{x!}, & \text{for } x = 0, 1, 2, \ldots \\
0, & \text{otherwise}
\end{cases}
\]

where $e \approx 2.71828$ and $\lambda$ is the parameter of the distribution, which is the mean
number of successes, and $\lambda = np$.

Note:

If X is a Poisson variate with parameter $\lambda$, then mean $= \lambda$ and variance $= \lambda$.
Hence, mean and variance of Poisson distribution are equal.
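
The equality of mean and variance can be checked numerically from the probability function. The following is a minimal sketch, assuming a hypothetical parameter value of 2.5 and truncating the infinite range of x at 100, beyond which the probabilities are negligible for this parameter; neither choice comes from the lecture.

```python
from math import exp, factorial, isclose

def poisson_pmf(x, lam):
    """f(x; lam) = e^(-lam) lam^x / x! for x = 0, 1, 2, ..., else 0."""
    if x < 0:
        return 0.0
    return exp(-lam) * lam**x / factorial(x)

lam = 2.5             # hypothetical parameter value
support = range(100)  # truncated range of x; the tail beyond is negligible here

total = sum(poisson_pmf(x, lam) for x in support)
mean = sum(x * poisson_pmf(x, lam) for x in support)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in support)

print(isclose(total, 1.0))     # True
print(isclose(mean, lam))      # True
print(isclose(variance, lam))  # True
```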

Examples

The number of cars passing a certain street in time t.
The number of suicides reported on a particular day.
The number of faulty blades in a packet of 100.
The number of printing mistakes on each page of a book.
The number of air accidents in some unit of time.
The number of deaths from a disease such as heart attack or cancer, or due to
snake bite.
The number of telephone calls received at a particular telephone exchange in
some unit of time.
The number of defective materials in a packing manufactured by a good
concern.
The number of letters lost in the mail on a given day in a certain big city.
The number of fish caught in a day in a certain city.
The number of robbers caught on a given day in a certain city.

Poisson approximation to the Binomial distribution:

When p → 0 (the success rate is very low) and n → ∞ (the number of trials is
very large), while λ = np stays finite, the binomial distribution can be
approximated by the Poisson distribution.
Mathematically, Binom(x; n, p) ≈ Pois(x; λ), where λ = np.

N.B.: As a rule of thumb, if n > 29 and np ≤ 7, the approximation is close enough
to use the Poisson distribution for binomial problems.
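
The quality of the approximation can be inspected by comparing the two probability functions side by side. The sketch below assumes hypothetical values n = 100 and p = 0.03, which satisfy the rule of thumb (n > 29 and np = 3 ≤ 7); these values do not come from the lecture.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

n, p = 100, 0.03  # hypothetical values: n > 29 and np = 3 <= 7
lam = n * p

# Largest absolute difference between the two probability functions.
max_gap = max(
    abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(n + 1)
)

for x in range(6):
    print(x, round(binomial_pmf(x, n, p), 4), round(poisson_pmf(x, lam), 4))
print(max_gap < 0.01)  # True: the two functions agree closely for these values
```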

Example 1:

Suppose that the number of emergency patients in a given day at a certain
hospital is a Poisson variable X with parameter $\lambda = 20$. What is the probability
that in a given day there will be

a) 15 emergency patients?
b) At least 3 emergency patients?
c) More than 20 but less than 25 patients?

Solution

We know that

\[
f(x; \lambda) = \frac{e^{-\lambda} \lambda^{x}}{x!} \quad \text{for } x = 0, 1, 2, \ldots
\]

Here $\lambda = 20$, so

\[
f(x; 20) = \frac{e^{-20} (20)^{x}}{x!} \quad \text{for } x = 0, 1, 2, \ldots
\]

a) \( P(\text{15 emergency patients}) = P(X = 15) = \dfrac{e^{-20} (20)^{15}}{15!} = 0.0516 \).

b) \( P(\text{at least 3 patients}) = P(X \ge 3) = 1 - P(X < 3) \)

\[
\begin{aligned}
&= 1 - \left[ P(X = 0) + P(X = 1) + P(X = 2) \right] \\
&= 1 - \left[ \frac{e^{-20} (20)^{0}}{0!} + \frac{e^{-20} (20)^{1}}{1!} + \frac{e^{-20} (20)^{2}}{2!} \right] \approx 1.
\end{aligned}
\]

c) \( P(20 < X < 25) = P(X = 21) + P(X = 22) + P(X = 23) + P(X = 24) \)

\[
= \frac{e^{-20} (20)^{21}}{21!} + \frac{e^{-20} (20)^{22}}{22!} + \frac{e^{-20} (20)^{23}}{23!} + \frac{e^{-20} (20)^{24}}{24!} = 0.2841.
\]
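
The three probabilities in this example can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the names are illustrative.

```python
from math import exp, factorial

LAM = 20.0  # mean number of emergency patients per day

def poisson_pmf(x):
    """Probability of exactly x emergency patients in a day."""
    return exp(-LAM) * LAM**x / factorial(x)

p_15 = poisson_pmf(15)
p_at_least_3 = 1 - sum(poisson_pmf(x) for x in range(3))  # 1 - P(0) - P(1) - P(2)
p_21_to_24 = sum(poisson_pmf(x) for x in range(21, 25))   # x = 21, 22, 23, 24

print(round(p_15, 4))           # 0.0516
print(p_at_least_3 > 0.999999)  # True: the probability is essentially 1
print(round(p_21_to_24, 4))     # 0.2841
```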

Example 2:

The probability that a car has an accident on a very busy road in one hour is
0.001. If 2000 cars pass along the road in one hour, what is the probability that

a) exactly 3
b) more than 2
car accidents happen on the road in that hour?

Solution

We know that

\[
f(x; \lambda) = \frac{e^{-\lambda} \lambda^{x}}{x!} \quad \text{for } x = 0, 1, 2, \ldots
\]

Here, p = 0.001 and n = 2000, so

\[
\lambda = np = 2000 \times 0.001 = 2, \qquad f(x; 2) = \frac{e^{-2} (2)^{x}}{x!} \quad \text{for } x = 0, 1, 2, \ldots
\]

a) \( P(\text{exactly 3 accidents}) = P(X = 3) = \dfrac{e^{-2} (2)^{3}}{3!} = 0.18 \).

b) \( P(\text{more than 2 accidents}) = P(X > 2) = 1 - P(X \le 2) \)

\[
\begin{aligned}
&= 1 - \left[ P(X = 0) + P(X = 1) + P(X = 2) \right] \\
&= 1 - \left[ \frac{e^{-2} (2)^{0}}{0!} + \frac{e^{-2} (2)^{1}}{1!} + \frac{e^{-2} (2)^{2}}{2!} \right] = 0.323.
\end{aligned}
\]

Example 3:

A factory produces blades in packets of 10. The probability of a blade being
defective is 0.2%. Find the number of packets having two defective blades in a
consignment of 10,000 packets.

Solution

We know that

\[
f(x; \lambda) = \frac{e^{-\lambda} \lambda^{x}}{x!} \quad \text{for } x = 0, 1, 2, \ldots
\]

Here, p = 0.2% = 0.002 and n = 10, so $\lambda = np = 10 \times 0.002 = 0.02$.

\[
P(\text{2 defective blades}) = P(X = 2) = \frac{e^{-0.02} (0.02)^{2}}{2!} = 0.000196.
\]

Therefore, the total number of packets having two defective blades in a
consignment of 10,000 packets is $10000 \times 0.000196 = 1.96 \approx 2$.

Example 4:

What probability model is appropriate to describe a situation where 100
misprints are distributed randomly throughout the 100 pages of a book? For this
model, what is the probability that a page observed at random will contain at
least three misprints?

Solution

The Poisson distribution is an appropriate model. We know that

\[
f(x; \lambda) = \frac{e^{-\lambda} \lambda^{x}}{x!} \quad \text{for } x = 0, 1, 2, \ldots
\]

We have p = 1/100 = 0.01 (because there is only one mistake on average per page)
and n = 100, so $\lambda = np = 100 \times 0.01 = 1$.

\( P(\text{at least 3 misprints}) = P(X \ge 3) = 1 - P(X < 3) \)

\[
\begin{aligned}
&= 1 - \left[ P(X = 0) + P(X = 1) + P(X = 2) \right] \\
&= 1 - \left[ \frac{e^{-1} (1)^{0}}{0!} + \frac{e^{-1} (1)^{1}}{1!} + \frac{e^{-1} (1)^{2}}{2!} \right] = 0.0803
\end{aligned}
\]

Example 5 (self-test)

On average, a household receives 9.5 telemarketing phone calls per week. Using
the Poisson distribution formula, find the probability that a randomly selected
household receives exactly 6 telemarketing phone calls during a given week.

Example 6:

A washing machine in a Laundromat breaks down an average of three times per
month. Using the Poisson probability distribution formula, find the probability
that during the next month this machine will have
a) exactly two breakdowns
b) at most one breakdown
Example 7:

An auto salesperson sells an average of 0.9 cars per day. Let x be the number of
cars sold by this salesperson on any given day. Find the mean, variance, and
standard deviation.
Normal Distribution

Normal distribution is the most important probability distribution in Statistics.

Definition

A continuous random variable X is said to have a normal distribution if its
density function is given by

\[
f(x; \mu, \sigma^{2}) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \mu)^{2}}{2\sigma^{2}}}; \quad -\infty < x < \infty \tag{1}
\]

where the parameters $\mu$ and $\sigma^{2}$ satisfy $-\infty < \mu < \infty$ and $\sigma^{2} > 0$.

The variable X whose density function is given in (1) is called a normal variate
with parameters $\mu$ and $\sigma^{2}$ and is denoted by $N(\mu, \sigma^{2})$. The parameters $\mu$ and $\sigma^{2}$
are actually the mean and variance of the normal variate X. The graph of the
normal curve is

[Figure: the normal curve]

Standard Normal variate:

If X is a normal variate with parameters $\mu$ and $\sigma^{2}$, then $Z = \dfrac{X - \mu}{\sigma}$ is a standard
normal variate with mean zero and variance unity. The density function of Z is

\[
f(z; 0, 1) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{z^{2}}{2}}; \quad -\infty < z < \infty
\]
Importance of Normal Distribution

1. Most of the distributions occurring in practice can be approximated by the
normal distribution. Moreover, many of the sampling distributions, e.g.,
Student’s t, Snedecor’s F, and Chi-square distributions, tend to normal for
large samples.
2. The normal distribution is widely applied in Statistical Quality Control
in industry for setting control limits.
Note: Let X be a continuous random variable with a cumulative distribution
function $F(x)$ and let a and b be two possible values of X, with $a < b$. The
probability that X lies between a and b is

\[
P(a < X < b) = F(b) - F(a)
\]
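
For a normal variate, F(x) has no elementary closed form, but it can be written with the error function as F(x) = 0.5 [1 + erf((x - μ) / (σ √2))]. The following is a minimal sketch of the rule P(a < X < b) = F(b) - F(a) built on that identity; the function names `normal_cdf` and `prob_between` are illustrative and not part of the lecture.

```python
from math import erf, sqrt

def normal_cdf(x, mu=0.0, sigma=1.0):
    """F(x) = P(X <= x) for a normal variate with mean mu and std. dev. sigma."""
    z = (x - mu) / sigma  # standardize: Z = (X - mu) / sigma
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def prob_between(a, b, mu=0.0, sigma=1.0):
    """P(a < X < b) = F(b) - F(a)."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

# Probability that a standard normal variate lies within one unit of zero.
print(round(prob_between(-1.0, 1.0), 4))  # 0.6827
```

In hand calculations the same values of F are read from a printed normal table instead of being computed.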

Example 1:

A company produces light bulbs whose lifetimes follow a normal distribution
with mean 1200 hours and standard deviation 250 hours. If a light bulb is
chosen randomly from the company’s output, what is the probability that its
lifetime will be between 900 and 1300 hours?

Solution

Let X represent lifetime in hours. Then

\[
\begin{aligned}
P(900 < X < 1300) &= P\left( \frac{900 - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{1300 - \mu}{\sigma} \right) \\
&= P\left( \frac{900 - 1200}{250} < Z < \frac{1300 - 1200}{250} \right) \\
&= P(-1.2 < Z < 0.4) \\
&= P(-\infty < Z < 0.4) - P(-\infty < Z < -1.2) \\
&= 0.65542 - 0.11507 \quad \text{(by using the normal table)} \\
&= 0.54035
\end{aligned}
\]

Hence, the probability is approximately 0.54 that a light bulb will last between
900 and 1300 hours.
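
The table lookup can be cross-checked with Python's standard library. This is a minimal sketch using `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

lifetime = NormalDist(mu=1200, sigma=250)  # bulb lifetime in hours

# P(900 < X < 1300) = F(1300) - F(900)
p_between = lifetime.cdf(1300) - lifetime.cdf(900)
print(round(p_between, 5))  # 0.54035
```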

Example 2:

A very large group of students obtains test scores that are normally distributed
with mean 60 and standard deviation 15. What proportion of students obtained
scores?

a) Less than 85.


b) More than 90.
c) Between 85 and 95.
Solution

Let X denote the test score. Then

a)
\[
\begin{aligned}
P(X < 85) &= P\left( \frac{X - \mu}{\sigma} < \frac{85 - \mu}{\sigma} \right) = P\left( Z < \frac{85 - 60}{15} \right) \\
&= P(Z < 1.67) = P(-\infty < Z < 1.67) \\
&= 0.9525 \quad \text{(by using the normal table)}
\end{aligned}
\]

That is, 95.25% of the students obtained scores less than 85.

b)
\[
\begin{aligned}
P(X > 90) &= P\left( \frac{X - \mu}{\sigma} > \frac{90 - \mu}{\sigma} \right) = P\left( Z > \frac{90 - 60}{15} \right) \\
&= P(Z > 2) = 1 - P(Z \le 2) = 1 - P(-\infty < Z < 2) \\
&= 1 - 0.9772 = 0.0228 \quad \text{(by using the normal table)}
\end{aligned}
\]

That is, 2.28% of the students obtained scores more than 90.

c)
\[
\begin{aligned}
P(85 < X < 95) &= P\left( \frac{85 - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{95 - \mu}{\sigma} \right) \\
&= P\left( \frac{85 - 60}{15} < Z < \frac{95 - 60}{15} \right) \\
&= P(1.67 < Z < 2.33) \\
&= P(-\infty < Z < 2.33) - P(-\infty < Z < 1.67) \\
&= 0.9901 - 0.9525 \quad \text{(by using the normal table)} \\
&= 0.0376
\end{aligned}
\]

That is 3.76% of the students obtained scores in the range 85 to 95.
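
The three proportions, for scores below 85, above 90, and between 85 and 95 as posed in the question, can be cross-checked in Python. The sketch below rounds z to two decimals before evaluating the standard normal distribution function, which is an assumption made so that the results mirror a printed normal table; without that rounding the figures differ slightly in the third decimal place. The helper name `table_cdf` is hypothetical.

```python
from statistics import NormalDist

MU, SIGMA = 60, 15              # mean and standard deviation of the scores
standard_normal = NormalDist()  # mean 0, standard deviation 1

def table_cdf(x):
    """P(X < x), rounding z to two decimals as a printed normal table does."""
    z = round((x - MU) / SIGMA, 2)
    return standard_normal.cdf(z)

p_less_85 = table_cdf(85)
p_more_90 = 1 - table_cdf(90)
p_85_to_95 = table_cdf(95) - table_cdf(85)

print(round(p_less_85, 4), round(p_more_90, 4), round(p_85_to_95, 4))
# prints: 0.9525 0.0228 0.0376
```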

Example 3:

The average daily sales of 500 branch offices were Tk. 150 thousand and the
standard deviation was Tk. 15 thousand. Assuming the distribution to be normal,
indicate how many branches have sales between

a) Tk. 120 thousand and Tk. 145 thousand.
b) Tk. 140 thousand and Tk. 165 thousand.
Solution

Let X be the average daily sales (in thousand Tk.) of a branch office.

a)
\[
\begin{aligned}
P(120 < X < 145) &= P\left( \frac{120 - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{145 - \mu}{\sigma} \right) \\
&= P\left( \frac{120 - 150}{15} < Z < \frac{145 - 150}{15} \right) \\
&= P(-2 < Z < -0.33) \\
&= P(-\infty < Z < -0.33) - P(-\infty < Z < -2) \\
&= 0.3707 - 0.02275 = 0.34795 \quad \text{(by using the normal table)}
\end{aligned}
\]

Hence, the expected number of branches having sales between Tk. 120
thousand and Tk. 145 thousand is $0.34795 \times 500 = 173.975 \approx 174$.

b)
\[
\begin{aligned}
P(140 < X < 165) &= P\left( \frac{140 - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{165 - \mu}{\sigma} \right) \\
&= P\left( \frac{140 - 150}{15} < Z < \frac{165 - 150}{15} \right) \\
&= P(-0.67 < Z < 1) \\
&= P(-\infty < Z < 1) - P(-\infty < Z < -0.67) \\
&= 0.84134 - 0.25143 = 0.58991 \quad \text{(by using the normal table)}
\end{aligned}
\]

Hence, the expected number of branches having sales between Tk. 140
thousand and Tk. 165 thousand is $0.58991 \times 500 = 294.955 \approx 295$.

Example 4:

The life span of a calculator manufactured by Texas Instruments has a normal


distribution with a mean of 54 months and a standard deviation of 8 months.
The company guarantees that any calculator that starts malfunctioning within 36
months of the purchase will be replaced by a new one. About what percentage
of calculators made by this company are expected to be replaced?

Solution: [Hints: For x = 36, P(x < 36) = P (z < -2.25) = .0122, Hence, 1.22%
of the calculators are expected to be replaced]

Example 5:

The line width for semiconductor manufacturing is assumed to be normally
distributed with a mean of 0.5 micrometer and a standard deviation of 0.05
micrometer.
(a) What is the probability that a line width is greater than 0.62 micrometer?
(b) What is the probability that a line width is between 0.47 and 0.63
micrometer?
Solution: Try yourself [Hints: (a) For x = 0.62, P(x > 0.62) = P(z > 2.4) =
1 - P(z ≤ 2.4) = 1 - 0.9918 = 0.0082]
Example 6:
The life of a semiconductor laser at a constant power is normally distributed
with a mean of 7000 hours and a standard deviation of 600 hours. What is the
probability that a laser fails before 5000 hours?
Solution: Try yourself [Hints: For x = 5000, P(x < 5000) ]
Example 7:
The time it takes a cell to divide (called mitosis) is normally distributed with an
average time of one hour and a standard deviation of 5 minutes.
(a) What is the probability that a cell divides in less than 45 minutes?
(b) What is the probability that it takes a cell more than 65 minutes to divide?
Solution: Try yourself [Hints: mean= 1 hour = 60 min]
