
Chapter 4-6

Chapter 4

Uploaded by

abiysemagn460

CHAPTER 4

4 Random Variables and Probability Distribution

4.1 Introduction
In probability and statistics, a random variable or stochastic variable is a variable whose possible
values are outcomes of a random phenomenon. A random variable has a probability distribution,
which specifies the probability that its value falls in any given interval. Probability distributions
form the core of many statistical calculations. They are used as mathematical models to represent
some random phenomenon and subsequently answer statistical questions about that phenomenon.
In real-world problems we are often faced with one or more quantities that do not have fixed
values. The values of such quantities depend on random actions, and they usually change from one
experiment to another.

After completing this chapter, you should be able to:

• Describe one-dimensional random variables and their probability distributions.
• Work with functions of a one-dimensional random variable and compute their probability distribution, expectation and variance.
• Explain the basics of random variables, common probability distributions and expectation.
• Approach more advanced courses in probability with interest and confidence.
4.2 Random Variables
A random variable is a variable whose values are determined by chance, each with some probability. Let E be an experiment and S the sample space associated with it. A function X that assigns a real number X(s) to every element s ∈ S is called a random variable.

Definition: A variable whose values are determined by chance with associated probabilities is
called a random variable. It is a quantity which can assume different values in different
observations. Random variables are usually denoted by capital letters X, Y, Z, etc., while the values
they take are denoted by lower-case letters x, y, z, etc. Thus, P(x1 ≤ X ≤ x2) is the probability that the
random variable X takes a value between x1 and x2, both inclusive. A random variable can be either
discrete or continuous.

Definition: Let Ω be the sample space associated with a given random experiment. A real-valued
function X: Ω→ (−∞, ∞) is called a one-dimensional random variable.

4.2.1 Discrete Random Variable


Discrete random variables are variables which can assume only a finite or countably infinite
number of values. They take certain clearly separated values, typically resulting from a count of
some items of interest.
Examples 4.1:
• Toss a coin n times and count the number of heads.
• Number of children in a family.
• Number of car accidents per week.
• Number of defective items produced per 1000.
• Number of bacteria per two cubic centimeters of water.

Definition: If X is a discrete random variable having distinct values x1, x2, …, xn, then the function
$p_X(x)$, or simply p(x), defined by

$$p(x) = \begin{cases} P(X = x_i) = p_i, & \text{if } x = x_i,\ i = 1, 2, \dots \\ 0, & \text{if } x \neq x_i \end{cases}$$

is called the probability mass function (pmf) of the random variable X. Properties of P(X = xi), where X is a discrete
random variable:
1. $0 \le P(X = x_i) \le 1$
2. $\sum_{i=1}^{n} P(X = x_i) = 1$
Example 4.2: Consider the experiment of tossing a fair coin three times, and let X be the number of heads.
Construct the probability distribution of X.

Solution: The sample space is {HHH, HHT, HTH, TTH, THT, THH, HTT, TTT}, with eight equally likely
outcomes, so X takes the values 0, 1, 2, 3. Counting the outcomes with each number of heads gives the
probability distribution of X:

X       0     1     2     3
P(X)    1/8   3/8   3/8   1/8
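
The table in Example 4.2 can be reproduced by listing the sample space and counting heads. The following is a minimal sketch, assuming a fair coin so that all outcomes are equally likely; the function name `heads_pmf` is a hypothetical helper introduced only for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_pmf(n_tosses):
    """Return {k: P(X = k)} where X is the number of heads in n_tosses fair tosses."""
    # Every sequence of H/T is one equally likely outcome of the sample space.
    outcomes = list(product("HT", repeat=n_tosses))
    # Count how many outcomes give each number of heads.
    counts = Counter(outcome.count("H") for outcome in outcomes)
    return {k: Fraction(counts[k], len(outcomes)) for k in sorted(counts)}


pmf = heads_pmf(3)
for k, p in pmf.items():
    print(k, p)
```

For three tosses the loop prints the pairs (0, 1/8), (1, 3/8), (2, 3/8), (3, 1/8), and the probabilities sum to 1, as the second pmf property requires.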

4.2.2 Continuous Random Variable


A random variable X is said to be continuous if it can take all possible values (integral as well as
fractional) between certain limits, that is, all values in an interval. Continuous random variables
occur when we deal with quantities that are measured on a continuous scale. For instance, the life
length of an electric bulb, the speed of a car, weights, heights, and the like are continuous. In such
cases, probabilities are associated with intervals or regions of a continuous random variable, and not
with individual points.
Examples 4.3:
• Height of students at certain college.

• Mark of a student.

• Life time of light bulbs.


Definition: The probability density function (pdf) of the continuous random variable X, denoted f(x),
satisfies the following properties:
1. $f(x) \ge 0$
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
3. $P(x_1 < X < x_2) = \int_{x_1}^{x_2} f(x)\,dx$
4. $P(X = a) = \int_{a}^{a} f(x)\,dx = 0$
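Example 4.4: Suppose we have a continuous random variable X whose probability density
function is given by

$$f(x) = \begin{cases} c x^2, & 0 < x < 3 \\ 0, & \text{otherwise} \end{cases}$$

A. Determine the value of c.
B. Verify that f is a pdf.
C. Calculate $P(1 < X < 2)$.
Solution:
a) By the property of a pdf, $\int_{-\infty}^{\infty} f(x)\,dx = 1$, so
$$\int_0^3 c x^2\,dx = c\left[\frac{x^3}{3}\right]_0^3 = 9c = 1 \;\Rightarrow\; c = \frac{1}{9}.$$
b) $f(x) = \frac{1}{9}x^2 \ge 0$ and $\int_0^3 \frac{1}{9}x^2\,dx = \left[\frac{1}{27}x^3\right]_0^3 = 1$, so f is a pdf.
c) $P(1 < X < 2) = \int_1^2 \frac{1}{9}x^2\,dx = \left[\frac{1}{27}x^3\right]_1^2 = \frac{8 - 1}{27} = \frac{7}{27}.$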
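
The three parts of Example 4.4 can be checked with exact arithmetic. This is a small sketch that applies the power rule to a single term of the form coeff·x^power; `integrate_power` is a hypothetical helper written for this example, not a library function.

```python
from fractions import Fraction


def integrate_power(coeff, power, a, b):
    """Exact integral of coeff * x**power over [a, b], using the power rule."""
    def antiderivative(x):
        return coeff * Fraction(x) ** (power + 1) / (power + 1)
    return antiderivative(b) - antiderivative(a)


# Part A: the integral of x^2 over (0, 3) is 9, so c must be 1/9.
c = 1 / integrate_power(Fraction(1), 2, 0, 3)

# Part B: with that c, the density integrates to 1 over (0, 3).
total = integrate_power(c, 2, 0, 3)

# Part C: P(1 < X < 2) is the integral of the density from 1 to 2.
prob = integrate_power(c, 2, 1, 2)

print(c, total, prob)
```

The exact values are c = 1/9, total probability 1, and P(1 < X < 2) = 7/27 (about 0.259).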

4.3 Cumulative Distribution Function
a) Cumulative distribution function of a discrete random variable
Let X be a discrete random variable with probability mass function (pmf) p(x). The
cumulative distribution function (CDF) is denoted by F(x) and is defined by
$$F(x) = P(X \le x) = \sum_{x_i \le x} P(X = x_i)$$
Example 4.5: A coin is tossed three times. Let X be the number of heads. Find the CDF
of X.
Solution: The variable X takes the values 0, 1, 2, 3.
Sample space = {HHH, HHT, HTH, TTH, THT, THH, HTT, TTT}

X     P(x)    F(x)
0     1/8     1/8
1     3/8     4/8
2     3/8     7/8
3     1/8     1
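
The F(x) column of Example 4.5 is a running total of the P(x) column. The sketch below assumes the pmf from the table; the function `F` is an illustrative helper that evaluates the step-function CDF at any real x.

```python
from fractions import Fraction
from itertools import accumulate

values = [0, 1, 2, 3]
pmf = [Fraction(1, 8), Fraction(3, 8), Fraction(3, 8), Fraction(1, 8)]

# Running totals of the pmf give F at each listed value of X.
cdf = list(accumulate(pmf))


def F(x):
    """Step-function CDF: total probability of the values that are <= x."""
    return sum((p for v, p in zip(values, pmf) if v <= x), Fraction(0))


for v, p, f in zip(values, pmf, cdf):
    print(v, p, f)
```

Note that `Fraction` reduces 4/8 to 1/2. Between the listed values the CDF stays flat, for example F(1.5) = F(1) = 1/2, and it is 0 below the smallest value and 1 above the largest.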

b) Cumulative distribution function for continuous random variable


If X is a continuous random variable with probability density function (pdf) f(x), then the
cumulative distribution function of X is F(x), defined as
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$$
F(x) gives the “accumulated” probability “up to” x.
Properties of the CDF
1. $0 \le F(x) \le 1$
2. $\lim_{x \to \infty} F(x) = \lim_{x \to \infty} \int_{-\infty}^{x} f(t)\,dt = \int_{-\infty}^{\infty} f(t)\,dt = 1$
3. $\lim_{x \to -\infty} F(x) = \lim_{x \to -\infty} \int_{-\infty}^{x} f(t)\,dt = 0$
4. $F'(x) = f(x)$, i.e. F(x) is an antiderivative of f(x).
5. F(x) is a non-decreasing function.

4.4 Expectation and Variance of Random variables
The objective of this section is to introduce the most common parameters of probability
distributions. These are summary measures in terms of which we can describe the behavior
of a probability distribution. The most common of these are the average, called the expected value, and
the dispersion about the average, called the variance.
Expectation: The averaging process, when applied to a random variable, is called expectation. It
is denoted by E(X) or µ and is read as the expected value of X or the mean value of X.
Case 1: For a discrete random variable
Suppose X is a discrete random variable which takes values in a finite set x1, x2, …, xn with
probabilities P(xi) = P[X = xi] for i = 1, 2, …, n. Then the expected value of X is given by
$$E(X) = \mu = \sum_{i=1}^{n} x_i P(x_i)$$
Case 2: For a continuous random variable
If X is a continuous random variable, then
$$E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx, \quad \text{provided } \int_{-\infty}^{\infty} |x| f_X(x)\,dx < \infty,$$
where $f_X(x)$ is the probability density function of the continuous random variable X.
Case 3: The mathematical expectation of a real function h(X) of a discrete random variable X is given
by
$$E[h(X)] = \sum_{i=1}^{n} h(x_i)\,p(x_i)$$
Similarly, if X is a continuous random variable, then
$$E[h(X)] = \int_{-\infty}^{\infty} h(x) f_X(x)\,dx$$
Properties of Expectation
If X and Y are random variables and k is any constant, then:
1. E(k) = k
2. E(kX) = k E(X)
3. E(X + k) = E(X) + k
4. E(X + Y) = E(X) + E(Y)
5. E(XY) = E(X) E(Y), if X and Y are independent random variables
6. E(X) ≥ 0, if X ≥ 0
7. |E(X)| ≤ E(|X|)
8. [E(XY)]² ≤ E(X²) E(Y²) (the Cauchy–Schwarz inequality)

Variance of a random variable
Mean of X = E(X) = μ
Variance of X: $\sigma_X^2 = E[(X - E(X))^2] = E(X^2) - [E(X)]^2$
Case 1: If X is a discrete random variable with expected value μ, then the variance of X, denoted
by Var(X), is defined by
$$\sigma_X^2 = \mathrm{Var}(X) = E[(X - \mu)^2] = E(X^2) - \mu^2 = \sum_{i=1}^{n} x_i^2\,p(x_i) - \mu^2$$
Case 2: If X is a continuous random variable with expected value μ, then
$$\mathrm{Var}(X) = \sigma_X^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f_X(x)\,dx$$
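
The shortcut form of the variance, E(X²) − μ², follows from the definition in a few steps. The derivation below is a sketch that uses only the linearity properties of expectation (properties 1 to 4 under Properties of Expectation) and writes μ for E(X):

```latex
\begin{align*}
\operatorname{Var}(X) &= E\big[(X-\mu)^2\big] \\
  &= E\big[X^2 - 2\mu X + \mu^2\big] \\
  &= E(X^2) - 2\mu\,E(X) + \mu^2 \\
  &= E(X^2) - 2\mu^2 + \mu^2 \\
  &= E(X^2) - \mu^2 .
\end{align*}
```

The same steps apply to both the discrete and the continuous case, since only the properties of E are used and not the form of the distribution.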
Properties of Variances
✓ For any random variable X and constant a, it can be shown that
Var(aX) = a² Var(X)
Var(X + a) = Var(X) + 0 = Var(X)
✓ If X and Y are independent random variables, then Var(X + Y) = Var(X) + Var(Y).
More generally, if X1, X2, …, Xk are independent random variables, then
Var(X1 + X2 + … + Xk) = Var(X1) + Var(X2) + … + Var(Xk),
i.e., $\mathrm{Var}\left(\sum_{i=1}^{k} X_i\right) = \sum_{i=1}^{k} \mathrm{Var}(X_i)$
✓ If X and Y are not independent, then
Var(X + Y) = Var(X) + 2Cov(X, Y) + Var(Y)
Var(X − Y) = Var(X) − 2Cov(X, Y) + Var(Y)

Example 4.6: Two fair coins are tossed. Determine Var(X), where X is the number of heads that
appear.
a) Use the definition of the variance.
b) Use the fact that the variance of a sum of independent random variables is equal to the sum of the
variances.
Solution:
a) Let X be the number of heads, with possible values 0, 1 and 2. The sample space consists of {HH,
TH, HT, TT}.
P(X = 0) = 1/4, P(X = 1) = 1/2, P(X = 2) = 1/4
E(X) = 0·P(X = 0) + 1·P(X = 1) + 2·P(X = 2) = 0(1/4) + 1(1/2) + 2(1/4) = 1
E(X²) = 0²·P(X = 0) + 1²·P(X = 1) + 2²·P(X = 2) = 0(1/4) + 1(1/2) + 4(1/4) = 3/2
This implies that Var(X) = E(X²) − μ² = 3/2 − 1 = 1/2.
b) In this part, let X be the number of heads on the first coin and Y the number of heads on the second
coin, each with possible values 0 and 1. The total number of heads is then X + Y.
P(X = 0) = 1/2, P(X = 1) = 1/2 and P(Y = 0) = 1/2, P(Y = 1) = 1/2
E(X) = 0·P(X = 0) + 1·P(X = 1) = 0(1/2) + 1(1/2) = 1/2
E(Y) = 0·P(Y = 0) + 1·P(Y = 1) = 0(1/2) + 1(1/2) = 1/2
E(X²) = 0²·P(X = 0) + 1²·P(X = 1) = 0(1/2) + 1(1/2) = 1/2
E(Y²) = 0²·P(Y = 0) + 1²·P(Y = 1) = 0(1/2) + 1(1/2) = 1/2
Var(X) = E(X²) − [E(X)]² = 1/2 − (1/2)² = 1/4
Var(Y) = E(Y²) − [E(Y)]² = 1/2 − (1/2)² = 1/4

X and Y are independent (i.e. the outcome of one coin does not influence the outcome of the
second), so

Var(X + Y) = Var(X) + Var(Y) = 1/4 + 1/4 = 1/2.
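
Both routes in Example 4.6 can be verified with a few lines of exact arithmetic. This is a minimal sketch; `mean_and_variance` is a hypothetical helper that applies E(X) = Σ x p(x) and Var(X) = E(X²) − [E(X)]² to a pmf given as a dictionary.

```python
from fractions import Fraction


def mean_and_variance(pmf):
    """Return (E(X), Var(X)) for a pmf given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    second_moment = sum(x * x * p for x, p in pmf.items())
    return mean, second_moment - mean ** 2


# Route (a): number of heads in two fair tosses.
two_coins = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
mean_total, var_total = mean_and_variance(two_coins)

# Route (b): one fair coin, then add the variances of two independent coins.
one_coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}
_, var_single = mean_and_variance(one_coin)
var_sum = var_single + var_single

print(mean_total, var_total, var_sum)
```

Both routes give a variance of 1/2, with a mean of 1 for the total number of heads.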


Example 4.7: A large domestic automobile manufacturer mails out quarterly customer satisfaction
surveys to owners who have purchased new automobiles within the last 3 years. The proportion of
surveys returned in any given quarter is the outcome of a random variable X having density
function
$$f(x) = \begin{cases} 3x^2, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$$
What is the expected proportion of surveys returned in any given quarter?
Solution: By definition,
$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx = \int_0^1 x(3x^2)\,dx = \int_0^1 3x^3\,dx = \left[\frac{3x^4}{4}\right]_0^1 = 0.75$$
The expected proportion of surveys returned in any given quarter is 0.75.
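
When an antiderivative is not easy to find, the same expectation can be approximated numerically. The sketch below uses a simple midpoint rule (a swapped-in numerical technique, not the method used in the solution); `midpoint_integral` and `density` are hypothetical names chosen for this illustration.

```python
def midpoint_integral(g, a, b, n=10_000):
    """Approximate the integral of g over [a, b] with the midpoint rule on n strips."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def density(x):
    """Density from Example 4.7: 3x^2 on [0, 1] and 0 elsewhere."""
    return 3 * x ** 2 if 0 <= x <= 1 else 0.0


# The density should integrate to 1, and E(X) is the integral of x * f(x).
total = midpoint_integral(density, 0.0, 1.0)
expected = midpoint_integral(lambda x: x * density(x), 0.0, 1.0)

print(round(total, 6), round(expected, 6))
```

With 10,000 strips the approximation agrees with the exact answer 0.75 to well within six decimal places.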

4.5 Common Discrete Probability Distributions


The binomial, multinomial, geometric, hypergeometric and Poisson distributions are part of a
family of distributions that we call discrete probability distributions. Counting techniques
can be useful in determining the probability of an event using the classical approach. In this
section, we'll explore discrete random variables and discrete probability distributions. The basic
idea is that when certain conditions are met, we can derive formulas for calculating the probability
of an event. A discrete probability distribution describes the possible values of a random variable and
their probabilities of occurring. It is called a probability mass function (pmf), p(x), and needs to
satisfy the following conditions:

✓ $0 \le p(x) = P(X = x) \le 1$ for all x, where X is a discrete r.v.
✓ $\sum_{\text{all } x} p(x) = 1$

4.5.1 Binomial Distribution


It is a discrete probability distribution which is obtained when the probability of an event
occurring is the same in every trial of the experiment. Many probability problems have only two
possible outcomes, or outcomes that can be reduced to two. For example, when a coin is
tossed, it can land heads or tails. When a baby is born, it will be either male or female. True-false
items can be answered in only two ways, true or false. Other situations can be reduced to two
outcomes. For example, a person can be classified as having normal or abnormal blood pressure,
depending on the reading of the sphygmomanometer. A multiple-choice question, even though
it has four or five answer choices, can be classified as answered correctly or incorrectly. A random
experiment that involves such situations is called a binomial experiment.
The origin of the binomial distribution is the Bernoulli trial. A Bernoulli trial is an experiment with
only two possible outcomes, "success" or "failure". Any experiment can be turned
into a Bernoulli trial by defining one or more possible results of interest as "success"
and all other possible results as "failure". For instance, when rolling a fair die, "success" may
be defined as "getting an even number on top" and an odd number as "failure". Generally, the sample
space in a Bernoulli trial is {S, F}, where S = success and F = failure.
Notation: Let the probabilities of success and failure be p and q respectively: P(success) = P(S) = p
and P(failure) = P(F) = q, where q = 1 − p.
Definition: Let X be the number of successes in n repeated Bernoulli trials with probability of
success p on each trial. Then the probability distribution of the discrete random variable X is called
the binomial distribution. Let p be the probability of success and q = 1 − p the probability of failure on any
given trial. A binomial random variable with parameters n and p represents the number of
successes x in n independent trials, when each trial has probability p of success. If X is a binomial random
variable, then for x = 0, 1, 2, …, n

$$P(X = x) = \binom{n}{x} p^{x} (1 - p)^{n - x}$$

This probability mass function (pmf) assigns a probability to each value of X.

A binomial experiment is a probability experiment that satisfies the following assumptions.

1. The experiment consists of n identical trials.
2. Each trial has only two possible mutually exclusive outcomes, a success or a
failure.
3. The probability of each outcome does not change from trial to trial.
4. The trials are independent.
If X is a binomial random variable with parameters n and p, then
1. The mean of the binomial distribution: E(X) = np
2. The variance of the binomial distribution: Var(X) = npq
Example 4.8: Assume an inspector is examining switches, 10% of which are faulty. If the
inspector examines 10 of the switches,
a) What is the probability that exactly 2 switches are faulty?
Solution: Using the formula with n = 10, x = 2, p = 0.10,
$$P(X = 2) = \binom{10}{2}(0.10)^2(0.90)^8 = 0.1937102445$$
b) What is the probability that at most 2 switches are faulty?

Solution:
$$P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2)$$
$$= \binom{10}{0}(0.10)^0(0.90)^{10} + \binom{10}{1}(0.10)^1(0.90)^9 + \binom{10}{2}(0.10)^2(0.90)^8$$
$$= 0.3487 + 0.3874 + 0.1937 = 0.9298$$
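
The numbers in Example 4.8 can be reproduced directly from the binomial formula. The following is a minimal sketch using only the Python standard library; `binomial_pmf` is a hypothetical helper name, and the last two lines check the stated mean np and variance npq by summing over the whole distribution.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 10, 0.10

# Part (a): exactly 2 faulty switches.
p_two = binomial_pmf(2, n, p)

# Part (b): at most 2 faulty switches.
p_at_most_two = sum(binomial_pmf(x, n, p) for x in range(3))

# Mean and variance computed from the pmf, to compare with np and npq.
mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
variance = sum(x * x * binomial_pmf(x, n, p) for x in range(n + 1)) - mean ** 2

print(round(p_two, 4), round(p_at_most_two, 4))  # 0.1937 0.9298
```

Up to floating-point rounding, `mean` and `variance` come out as np = 1 and npq = 0.9.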

4.5.2 Poisson Distribution


The Poisson distribution is another discrete probability distribution; it is related to the binomial
distribution, with some differences. It is used to model rare events. The Poisson distribution
describes the number of successes (occurrences) in a fixed interval of time or within a specified region.
Examples of random variables that usually obey the Poisson distribution are:
• The number of car accidents in a day.
• The number of telephone calls arriving over an interval of time.
• The number of misprints on a typed page (or a group of pages) of a book.
• The number of suicides reported in a particular city.
• The number of customers entering a post office on a given day, etc.
To apply the Poisson distribution, two conditions must be met:
1. The number of successes that occur in any interval is independent of the number that occur in other
non-overlapping intervals.
2. The probability of a success in an interval is proportional to the size of the interval.
In short, the two important traits of the Poisson distribution are independence and proportionality.
Let X be the number of occurrences in a Poisson process and λ the average number of
occurrences of the event in a unit length of interval. The probability function of the Poisson distribution
is
$$p(x) = \frac{e^{-\lambda}\lambda^{x}}{x!}, \quad x = 0, 1, 2, \dots$$
Here λ > 0 is known as the parameter of the distribution. The Poisson distribution arises as a limit of
the binomial distribution when the number of trials is very large and the probability of success p is
very small, that is, when the event is rare. Therefore, the Poisson distribution relates to rare events.
Remarks
The Poisson distribution possesses only one parameter, λ. If X has a Poisson distribution with parameter
λ, then E(X) = λ and Var(X) = λ, i.e. E(X) = Var(X) = λ.
Example 4.9: In a small city, 10 accidents took place in a period of 50 days. Find the probability that
there will be a) two accidents in a day and b) three or more accidents in a day.
Solution: There are 10/50 = 0.2 accidents per day on average. Let X be the number of accidents
per day; then X ~ Poisson(λ = 0.2), X = 0, 1, 2, ….
a) $P(X = 2) = \frac{e^{-0.2}(0.2)^2}{2!} = 0.0164$
b) P(X ≥ 3) = P(X = 3) + P(X = 4) + P(X = 5) + … = 1 − [P(X = 0) + P(X = 1) + P(X = 2)]
= 1 − [0.8187 + 0.1637 + 0.0164] = 0.0012 (about 0.0011 if the terms are not rounded first)
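
Example 4.9 can be recomputed from the Poisson formula without rounding the intermediate terms. This is a short sketch using the standard library only; `poisson_pmf` is a hypothetical helper name.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam ** x / factorial(x)


lam = 10 / 50  # 10 accidents in 50 days gives 0.2 accidents per day

# Part (a): exactly two accidents in a day.
p_two = poisson_pmf(2, lam)

# Part (b): three or more accidents, via the complement of 0, 1 or 2 accidents.
p_three_or_more = 1 - sum(poisson_pmf(x, lam) for x in range(3))

print(round(p_two, 4), round(p_three_or_more, 4))
```

Carrying full precision gives P(X ≥ 3) ≈ 0.00115, which rounds to 0.0011; the hand calculation reaches 0.0012 only because each term is rounded to four decimals before subtracting.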
Exercise
1. On average, a certain intersection results in 3 traffic accidents per day. Assuming a Poisson
distribution,
a) What is the probability that at this intersection:
   i) no accidents will occur in a given day?
   ii) more than 3 accidents will occur in a given day?
   iii) exactly 5 accidents will occur in a period of two days?
b) What is the average number of traffic accidents in a period of 4 days?

4.6 Common Continuous Probability Distributions


So far, we have discussed discrete probability distributions, namely the binomial, multinomial,
Poisson, etc. These distributions enable us to find probabilities of distinct events, like the
probability of defective items in a sample of given size or the probability of accidents in a factory.
In general, with these distributions we are able to compute the probability of successes or failures
occurring in a fixed number of independent trials. However, in practice we come across a number
of biological, social, economic, industrial and psychological measurements where the variables
are continuous in nature and as such can be adequately described only by a continuous
probability distribution. One of the most important continuous probability distributions in the entire
field of statistics is the normal probability distribution.
The uniform, normal, exponential and gamma distributions are part of a family of distributions that we
call continuous probability distributions.
1. Continuous Probability Distribution – the values that the data take on are continuous.
The probability of any single point is 0. We find the probability that the variable falls in a given
interval.

2. Probability Density Function – this is the actual distribution we deal with. It is specific to a
particular type of distribution. We denote it f(x); this is the same thing that is called a density
curve. The function just summarizes the curve.
Properties:
(a) The total area under the curve is 1 (100%).
(b) The total range of values is on the horizontal axis.
(c) The median still gives us the 50% point, and the mean defines where the balance point of the
curve would be (i.e. think of it as the center of mass of the probability density).

4.6.1 Normal Distribution
The normal distribution is a continuous probability distribution that plays a pivotal
role in statistical theory and practice, particularly in the areas of statistical inference and
statistical quality control. Its importance is due to the fact that, in practice, experimental results
very often seem to follow the normal distribution, or bell-shaped curve. The normal curve is
symmetrical and is defined by its mean μ and its standard deviation σ. The normal curve is not just
one curve but a family of curves. Just as the equation for a circle describes a family of circles,
some small and some large, the equation for the normal curve describes a family of curves
which differ only with regard to the values of the mean and standard deviation, but have the same
characteristics.

Characteristics of the Normal Curve


1. The normal curve is symmetrical about the mean. This means that the number of units in
the data below the mean is the same as the number of units above the mean, so the
mean and median have the same value.
2. The height of the normal curve is maximum at the mean value. Thus, the mean and mode
coincide. This means that the normal distribution has the same value of
mean, median and mode.
3. The curve declines as we go in either direction from the mean, but never touches the base
(x-axis), so that the tails of the curve on both sides extend indefinitely.
4. The corresponding deciles, quartiles and percentiles are equidistant from the mean.
The height of the normal curve at any value x of the random variable X is given by

$$f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{(x - \mu)^2}{2\sigma^2}\right\}, \quad -\infty < x < \infty,\ -\infty < \mu < \infty,\ \sigma > 0$$

Standard Normal Distribution
The symmetry of the normal distribution is helpful in calculating probabilities, and the calculation
is further facilitated by transforming any normal distribution, whatever its mean and
variance, to the standard normal distribution. By standardization we mean that the random variable
X is transformed to another random variable whose mean is 0 and variance is 1. The normal
distribution with mean zero and standard deviation one is known as the standard normal distribution.
If X has a normal distribution with mean μ and standard deviation σ, then the standard normal
random variable Z is given by

$$Z = \frac{X - \mu}{\sigma}, \quad \text{for a population.}$$

Using the properties of expectation, it is easy to show that E(Z) = 0 and V(Z) = 1. The pdf of
Z is thus given by

$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}, \quad -\infty < z < \infty.$$

The entries in Table A of the Appendix are the values of $P(0 \le Z \le z) = \int_0^z f(t)\,dt$. That is, the table
gives us the probabilities that a random variable Z having the standard normal distribution will
take on a value in the interval from 0 to z, for z = 0.00, 0.01, 0.02, …, 3.98, and 3.99; due to the
symmetry of the normal curve about its mean, it is unnecessary to extend the
table to negative values of Z.

Note that $P(Z \le 0) = P(Z \ge 0) = 0.5$.

That is, the arrowed region is $P(0 \le Z \le z) = \int_0^z \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}t^2}\,dt$.

Basic Properties of the standard normal Curve:


1. Total area under the standard normal curve is equal to 1.
2. The standard normal curve is asymptotic to x-axis.
3. The standard normal curve is symmetric about 0.
4. Most of the area under the standard normal curve lies between z= -3 and z=3.
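Given a normally distributed random variable X with mean μ and standard deviation σ,
$$P(a < X < b) = P\left(\frac{a - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{b - \mu}{\sigma}\right)$$
$$P(X < a) = P\left(\frac{X - \mu}{\sigma} < \frac{a - \mu}{\sigma}\right)$$
But $\frac{X - \mu}{\sigma} = Z$ is the standard normal random variable, so $P(X < a) = P\left(Z < \frac{a - \mu}{\sigma}\right)$.

Note: i) P(a < X < b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a ≤ X ≤ b)
ii) P(−∞ < Z < ∞) = 1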
Example 4.11:- Find the probabilities that a random variable having the standard normal
distribution will take on a value
a. Less than 1.72;
b. Less than -0.88;
c. Between 1.30 and 1.75;
d. Between -0.25 and 0.45.

Solution: By using the normal table,
a. P(Z < 1.72) = P(Z < 0) + P(0 < Z < 1.72) = 0.5 + 0.4573 = 0.9573.

b. P(Z < −0.88) = P(Z > 0.88) = 0.5 − P(0 < Z < 0.88) = 0.5 − 0.3106 = 0.1894.

c. P(1.30 < Z < 1.75) = P(0 < Z < 1.75) − P(0 < Z < 1.30) = 0.4599 − 0.4032 = 0.0567.

d. P(−0.25 < Z < 0.45) = P(−0.25 < Z < 0) + P(0 < Z < 0.45) = P(0 < Z < 0.25) + P(0 < Z < 0.45)
= 0.0987 + 0.1736 = 0.2723.
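
The table look-ups in Example 4.11 can be cross-checked without a printed table. The sketch below relies on the standard identity Φ(z) = ½[1 + erf(z/√2)] for the standard normal CDF; the names `phi` and `table_a` are hypothetical helpers, with `table_a(z)` mimicking a table entry P(0 ≤ Z ≤ z).

```python
from math import erf, sqrt


def phi(z):
    """Standard normal CDF, P(Z <= z), via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def table_a(z):
    """Area under the standard normal curve between 0 and z (z >= 0)."""
    return phi(z) - 0.5


part_a = phi(1.72)               # less than 1.72
part_b = phi(-0.88)              # less than -0.88
part_c = phi(1.75) - phi(1.30)   # between 1.30 and 1.75
part_d = phi(0.45) - phi(-0.25)  # between -0.25 and 0.45

for name, value in [("a", part_a), ("b", part_b), ("c", part_c), ("d", part_d)]:
    print(name, round(value, 4))
```

The computed values agree with the table-based answers to about four decimal places; small differences in the last digit can occur because the table entries are themselves rounded.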


Application of the Standard Normal Distribution
Let $X \sim N(\mu, \sigma^2)$. Suppose that we want to find the probability $P(a < X < b)$. Since a, b, μ and σ
are known (given), we standardize a, b and X as:
$$P(a < X < b) = P\left(\frac{a - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{b - \mu}{\sigma}\right) = P(z_1 < Z < z_2), \text{ say.}$$

Now, we need only read the Z-table values corresponding to z1 and z2 to get the
required probabilities, as we have done in the preceding example. Also, we can find the following
one-sided probabilities:
$$P(X < b) = P\left(Z < \frac{b - \mu}{\sigma}\right) = P(Z < z_2), \quad \text{and} \quad P(X > a) = P\left(Z > \frac{a - \mu}{\sigma}\right) = P(Z > z_1).$$

We have seen that a Z-value measures the distance between a particular value of X and the mean
in units of standard deviation.
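Example 4.12: If $X \sim N(\mu, \sigma^2)$, find the probabilities

a. $P(\mu - \sigma < X < \mu + \sigma)$;
b. $P(\mu - 2\sigma < X < \mu + 2\sigma)$;
c. $P(\mu - 3\sigma < X < \mu + 3\sigma)$.
Solution: As in the case of P(a < X < b), we simply replace a and b.

a) $P(\mu - \sigma < X < \mu + \sigma) = P\left(\frac{\mu - \sigma - \mu}{\sigma} < Z < \frac{\mu + \sigma - \mu}{\sigma}\right) = P(-1 < Z < 1) = 2P(0 < Z < 1)$

= 2(0.3413) (see Table A) = 0.6826, or 68.26%.

From which we can tell that about 68.3% of the area lies in the region between μ − σ and μ + σ
(1 standard deviation on either side of the mean). Parts b and c follow in the same way:
P(−2 < Z < 2) = 2(0.4772) = 0.9544 and P(−3 < Z < 3) = 2(0.4987) = 0.9974.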
Notation: $Z_\alpha$ denotes the value of Z for which the area to its right is equal to α.
This notation is useful in statistical inference. Note that finding $Z_\alpha$ is like reading
anti-logarithms: the table is read in reverse, from the area to the value of z.
Example 4.13: Suppose that $X \sim N(165, 9)$, where X is the breaking strength of cotton fabric. A
sample is defective if X < 162. Find the probability that a randomly chosen fabric will be defective.
Solution: Given that μ = 165 and σ² = 9 (so σ = 3),

$$P(X < 162) = P\left(\frac{X - \mu}{\sigma} < \frac{162 - \mu}{\sigma}\right) = P\left(Z < \frac{162 - 165}{3}\right)$$
= P(Z < −1) = 0.5 − P(−1 < Z < 0) (since P(Z < 0) = 0.5)
= 0.5 − P(0 < Z < 1) (by symmetry)
= 0.5 − 0.3413 = 0.1587 (table value for Z = 1)
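
The standardization step in Example 4.13 can be illustrated with Python's standard-library `statistics.NormalDist` class (available from Python 3.8). This is a sketch rather than part of the original solution: it computes the probability once through the Z-value and once directly from the N(165, 9) distribution, to show that the two routes agree.

```python
from statistics import NormalDist

mu = 165.0
sigma = 3.0          # the variance is 9, so the standard deviation is 3
threshold = 162.0    # a sample is defective if X < 162

# Route 1: standardize, then use the standard normal distribution.
z = (threshold - mu) / sigma
p_via_z = NormalDist().cdf(z)

# Route 2: evaluate the CDF of N(mu, sigma^2) directly.
p_direct = NormalDist(mu, sigma).cdf(threshold)

print(z, round(p_via_z, 4), round(p_direct, 4))
```

Here z = −1, and both routes give a probability of about 0.1587, matching the table-based answer.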


Exercise: 1. The test scores for the exam in a statistics class have a mean of 78 points and a standard
deviation of 9 points. A student is randomly selected. Assuming the scores are normally
distributed, find the probability that the score is:
a) more than 95; less than 30; between 80 and 90.
b) If there are 70 students in that class, about how many will receive between 90 and 95
points?
2. Suppose the number of accidents occurring weekly on a particular stretch of a highway
follows a Poisson distribution with mean 3. Calculate the probability that there is at least one
accident this week.
3. The number of monthly breakdowns of a computer is a r.v. having a Poisson distribution with
mean equal to 1.8. Find the probability that this computer will function for a month (i)
without a breakdown, (ii) with only one breakdown.

Miscellaneous Exercise

1. From a lot containing 20 items, of which 5 are defective, 4 are chosen at random. Let X be the
number of defectives found.

a) Write down the pmf of X. b) Find the probability distribution of X.

c) Find E(X) and V(X).

2. If X has a pdf of f(x) = 3x², for 0 < x < 1, and 0 elsewhere, find

a) P(X < 0.5); b) E(X) and V(X);

c) a if P( X  a) = 0.05 ; d) b if P( X  b) = P( X  b) .

3. The amount of bread X (in hundreds of kg) that a certain bakery is able to sell in a day is found
to be a continuous r.v. with a pdf given as below:

$$f(x) = \begin{cases} kx, & 0 \le x < 5 \\ k(10 - x), & 5 \le x < 10 \\ 0, & \text{otherwise} \end{cases}$$

a) Find k;
b) Find the probability that the amount of bread that will be sold tomorrow is
i) more than 500 kg, ii) between 250 and 750 kg;

c) Find the expected amount of bread to be sold in any day.

4. Find the value of Z if the area between -Z and Z is a) 0.4038; b) 0.8812; c) 0.3410.
5. The reduction of a person's oxygen consumption during periods of deep meditation may be
looked upon as a random variable having the normal distribution with μ = 38.6 cc per minute
and σ = 6.5 cc per minute. Find the probabilities that during such a period a person's oxygen
consumption will be reduced by

a) at least 33.4 cc per minute;

b) at most 34.7 cc per minute.

CHAPTER 5

5. Sampling Theory and Sampling Distributions


5.1 Introduction
The sampling distribution helps in determining the degree to which the statistics from
different samples differ from each other and from the population parameter, that is, how close a
particular sample statistic is to the population parameter. In other words, the
sampling distribution constitutes the theoretical basis of inferential statistics, which involves
determining the extent to which sample statistics vary from each other and from the population
parameter.
Upon successful completion of this chapter, you will be able to:

• Understand the meaning of sampling theory.


• Understand the meaning of sampling distribution.
• Describe the sampling distribution for sample mean and sample proportion.
• Apply the central limit theorem to calculate approximate probabilities for sample means.

5.2 Sampling Theory

The science of statistics deals with drawing conclusions from observed data. For instance, a
typical situation in a technological study arises when one is confronted with a large collection,
or population, of items that have measurable values associated with them. By suitably sampling
from this collection, and then analyzing the sampled items, one hopes to be able to draw some
conclusions about the collection as a whole.
Basic Terms
1. Parameter: A parameter is a numerical characteristic of a population (e.g., population
mean and population variance).
2. Statistic: A statistic is a characteristic or measure obtained from a sample (e.g., sample mean and
sample variance).
3. Sampling: The process or method of selecting a sample from the population.

4. Sampling Unit: The ultimate unit to be sampled, or an element of the population to be
sampled.

5. Sampling Frame: The list of all elements in a population.

6. Errors in sample surveys: There are two types of errors.

a. Sampling error: The discrepancy between the population value and the sample value. The
absolute value of the difference between an unbiased point estimate and the corresponding
population parameter is called the sampling error. It may arise due to inappropriate sampling
techniques.

b. Non-sampling errors: Errors due to procedural bias, such as incorrect responses and
measurement errors at different stages in processing the data.
Why is sampling important? Because:
➢ It reduces cost.
➢ We can achieve greater speed.
➢ We can achieve greater accuracy.
➢ We can achieve greater scope.
➢ More detailed information can be obtained.

There are two types of sampling methods: random sampling and non-random sampling.

5.3 Random Sampling or Probability Sampling

It is a method of sampling in which every element in the population has a preassigned non-zero
probability of being included in the sample. There are four types of random sampling
techniques:
➢ Simple random sampling
➢ Stratified sampling
➢ Cluster sampling
➢ Systematic sampling
1. Simple Random Sampling: A method of selecting items from a population such that
every possible sample of a specific size has an equal chance of being selected. In particular,
all elements in the population have the same pre-assigned non-zero probability of being included
in the sample. Sampling may be with or without replacement. Simple random sampling
can be done using either the lottery method or a table of random numbers.

2. Stratified Random Sampling: The population is divided into non-overlapping but
exhaustive groups called strata. Then a simple random sample is chosen from each
stratum. Elements within the same stratum should be more or less homogeneous, while elements in
different strata differ. It is applied when the population is heterogeneous. Some of the criteria for dividing
a population into strata are: sex (male, female); age (under 18, 18 to 28, 29 to 39); occupation
(blue-collar, professional, other).
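
The stratified procedure can be sketched in a few lines of Python. The example below assumes proportional allocation (the same sampling fraction in every stratum), which is one common choice and is not prescribed by the text; `stratified_sample`, `stratum_of` and the toy population are hypothetical names and data used only for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, rng):
    """Draw a simple random sample of the given fraction from each stratum."""
    # Step 1: split the population into non-overlapping, exhaustive strata.
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    # Step 2: simple random sampling (without replacement) inside each stratum.
    sample = []
    for label in sorted(strata):
        members = strata[label]
        size = min(len(members), max(1, round(fraction * len(members))))
        sample.extend(rng.sample(members, size))
    return sample


# Toy population of 60 "male" and 40 "female" units, stratified by sex.
population = [("male", i) for i in range(60)] + [("female", i) for i in range(40)]
chosen = stratified_sample(population, lambda unit: unit[0], 0.10, random.Random(1))
print(len(chosen))
```

With a 10% fraction the sample contains 6 units from the first stratum and 4 from the second, so each stratum is represented in proportion to its size.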

3. Cluster Sampling: The population is divided in to non-overlapping groups called clusters.


A simple random sample of groups or cluster of elements is chosen and all the sampling units
in the selected clusters will be surveyed. Clusters are formed in a way that elements within a
cluster are heterogeneous, i.e. observations in each cluster should be more or less dissimilar.

Cluster sampling is useful when it is difficult or costly to generate a simple random sample.
For example, to estimate the average annual household income in a large city we use cluster
sampling, because to use simple random sampling we need a complete list of households in
the city from which to sample. To use stratified random sampling, we would again need the
list of households. A less expensive way is to let each block within the city represent a cluster.
A sample of clusters could then be randomly selected, and every household within these
clusters could be interviewed to find the average annual household income.
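
A minimal sketch of the city-block example follows. The eight blocks of five households each, and the names `cluster_sample` and `city_blocks`, are made up for illustration; only a list of blocks is needed, not a list of all households.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and survey every unit inside them."""
    rng = random.Random(seed)
    chosen_ids = rng.sample(sorted(clusters), n_clusters)
    surveyed = [unit for cid in chosen_ids for unit in clusters[cid]]
    return chosen_ids, surveyed

# Hypothetical city: 8 blocks, each containing 5 households.
city_blocks = {
    "block_%d" % b: ["house_%d_%d" % (b, h) for h in range(5)] for b in range(8)
}
chosen_blocks, households = cluster_sample(city_blocks, 3, seed=0)
```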

4. Systematic Sampling: A complete list of all elements within the population (the sampling
frame) is required. The procedure starts by determining the first element to be included in the
sample. Thereafter, every kth item is taken from the sampling frame.
➢ Let N = population size and n = sample size; then k = N/n is the sampling interval.
➢ Choose a random number j between 1 and k, i.e. 1 ≤ j ≤ k.
➢ The jth unit is selected first, and then the (j+k)th, (j+2k)th, ... units, until the required
sample size is reached.
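
The steps above can be sketched as follows. The frame of 100 numbered units and the function name `systematic_sample` are assumptions for illustration, and the sketch takes k as the integer part of N/n.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every k-th unit from the sampling frame after a random start."""
    N = len(frame)
    k = N // n                 # sampling interval (integer part of N/n)
    rng = random.Random(seed)
    j = rng.randint(1, k)      # random start with 1 <= j <= k
    # Units j, j+k, j+2k, ... in 1-based numbering are indices j-1, j-1+k, ...
    return [frame[j - 1 + i * k] for i in range(n)]

# Hypothetical sampling frame: units numbered 1..100, sample size 10, so k = 10.
frame = list(range(1, 101))
picked = systematic_sample(frame, 10, seed=3)
```

Whatever start j is drawn, consecutive selected units in `picked` are exactly k = 10 positions apart.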
5.4 Non-Random Sampling or Non-probability Sampling
It is a sampling technique in which the choice of individuals for a sample is based on
convenience, personal choice or interest. Non-probability sampling methods are: judgment
sampling, convenience sampling and quota sampling.

1. Judgment Sampling: The person most knowledgeable on the subject of the study selects
elements of the population that he or she feels are most representative of the population. It is a
relatively easy way of selecting a sample. But it has the disadvantage that the quality of the
sample results depends on the judgment of the person selecting the sample.

2. Convenience Sampling: In this method, the decision maker selects a sample from the
population in a manner that is relatively easy and convenient. Sample selection and data
collection are relatively easy. The disadvantage is that it is impossible to determine how
representative of the population the sample is.
3. Quota Sampling: In this method, the decision maker requires the sample to contain a
certain number of items with a given characteristic. Many political polls are, in part, quota
sampling.

5.5 Sampling Distribution of Sample Mean

A sampling distribution is the probability distribution of a sample statistic when samples of size n
are taken randomly from the population repeatedly (typically, a simple random sample is used).
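The sampling distribution of the sample mean has mean and standard deviation denoted by
\(\mu_{\bar{x}}\) and \(\sigma_{\bar{x}}\) respectively. If \(\mu\) and \(\sigma\) represent the mean
and standard deviation of the population, then for samples of size n drawn with replacement (or
from a population that is large relative to n) we can calculate \(\mu_{\bar{x}}\) and
\(\sigma_{\bar{x}}\) by:

\[\mu_{\bar{x}} = \mu\]

\[\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\]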

The standard deviation of the sampling distribution for the mean is often called the standard error
of the mean. If the sample means are closer to the population mean, then the value of the standard
error of mean will be small and if there are considerable variations in the sample means then the
standard error of mean will be large.

We can use the following steps to obtain the sampling distribution of the sample mean:

1. Find all possible samples of size n from a population of size N (there are \(N^n\) possible
samples if sampling is with replacement and \(\binom{N}{n}\) possible samples if sampling is
without replacement).
2. Calculate the sample mean for each sample.
3. Construct a frequency table of the distinct values of the sample mean together with the
frequency of each value (the total of the frequencies = k, the number of possible samples).
4. Determine the relationship between the sample statistic (the mean and standard deviation of
the sample mean) and the population parameters (µ and 𝜎).
5. Determine the form or shape of the sampling distribution, i.e. the sampling distribution of
the sample statistic may or may not be normal.

Example 5.1

Consider a population of size 4 with values 2, 4, 6 and 8. We want to select samples of
size 2 without replacement from this population. These samples can be chosen in
\(\binom{4}{2} = 6\) different ways. The population mean and standard deviation are:

\[\mu = \frac{2+4+6+8}{4} = 5\]

\[\sigma = \sqrt{\frac{(2-5)^2+(4-5)^2+(6-5)^2+(8-5)^2}{4}} = \sqrt{5} = 2.236\]

Table 5.1: Sample means of the possible samples

S.no. Samples Sample Mean
1 (2,4) 3
2 (2,6) 4
3 (2,8) 5
4 (4,6) 5
5 (4,8) 6
6 (6,8) 7

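Similarly, the mean and standard deviation of the sampling distribution of the sample mean can be
computed as follows.

\[\mu_{\bar{x}} = \frac{3+4+5+5+6+7}{6} = 5\]

Because the samples are drawn without replacement from a small finite population, the standard
deviation of the sampling distribution of the sample mean includes the finite population
correction factor:

\[\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{2.236}{\sqrt{2}}\sqrt{\frac{4-2}{4-1}} = 1.29\]

This agrees with the standard deviation computed directly from the six sample means in Table 5.1.
The uncorrected value \(\sigma/\sqrt{n} = 1.58\) applies only when sampling is with replacement.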

Now the sampling distribution of the sample mean can be computed as follows.

Table 5.2: Sampling distribution of the sample mean

Possible Sample Means (\(\bar{x}\)) \(P(\bar{X} = \bar{x})\)
3 1/6
4 1/6
5 2/6
6 1/6
7 1/6
Total 6/6 = 1
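
As a check on Example 5.1, the sketch below enumerates every sample of size 2 drawn without replacement from the population 2, 4, 6, 8 and tabulates the sample means. The variable names are chosen for this illustration only. The standard deviation of the six sample means comes out at about 1.29, which matches \(\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}\); the value \(\sigma/\sqrt{n} = 1.58\) would apply to sampling with replacement.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from math import sqrt

population = [2, 4, 6, 8]
n = 2
N = len(population)

# All samples of size n without replacement, and their means.
samples = list(combinations(population, n))
means = [sum(s) / n for s in samples]

# Sampling distribution: each distinct mean with its probability.
counts = Counter(means)
distribution = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

# Mean and standard deviation of the sampling distribution, by direct enumeration.
mean_of_means = sum(means) / len(means)
se_enumerated = sqrt(sum((m - mean_of_means) ** 2 for m in means) / len(means))

# The same standard error from the formula with the finite population correction.
mu = sum(population) / N
sigma = sqrt(sum((x - mu) ** 2 for x in population) / N)
se_formula = sigma / sqrt(n) * sqrt((N - n) / (N - 1))
```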

Example 5.2
A measurement from a population has population mean 6 and standard deviation 2. What are the
mean and standard error of the sample mean when n = 4? When n = 100? When n = 400?

For all sample sizes, \(\mu_{\bar{x}} = 6\). For n = 4, \(\sigma_{\bar{x}} = \frac{2}{\sqrt{4}} = 1\);
for n = 100, \(\sigma_{\bar{x}} = \frac{2}{\sqrt{100}} = 0.2\); and for n = 400,
\(\sigma_{\bar{x}} = \frac{2}{\sqrt{400}} = 0.1\).
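
The same arithmetic can be written as a small helper, shown here only to make the effect of n visible; the function name `standard_error` is an assumption of this sketch.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma divided by the square root of n."""
    return sigma / sqrt(n)

# Population standard deviation 2, as in Example 5.2.
errors = {n: standard_error(2, n) for n in (4, 100, 400)}
```

Quadrupling the sample size from 100 to 400 only halves the standard error, because n enters through its square root.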

5.6 Central Limit Theorem

The Central Limit Theorem (CLT) describes the shape of the sampling distribution of the sample
mean. If the population is normally distributed, then the sampling distribution of the sample mean
is normally distributed for any sample size n.

The Central Limit Theorem (CLT) states that, regardless of the population distribution (in most
cases), if n ≥ 30, then the sampling distribution of the sample mean is approximately normal with
mean, variance, and standard error given by:

\[\mu_{\bar{x}} = \mu\]

\[\sigma^2_{\bar{x}} = \frac{\sigma^2}{n}\]

\[\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\]

The larger n is, the better the approximation will be. Also, the closer to normal the population is,
the better the approximation will be. If the original distribution is drastically skewed (for example,
people's winnings when they play the lottery), you may need a sample size that is larger than 30
to get a good approximation.

Fig 1: Illustration of the Central Limit Theorem
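
A small simulation can make the theorem concrete. The sketch below is only an illustration under stated assumptions: the population is taken to be exponential with mean 1 and standard deviation 1 (a strongly right-skewed distribution chosen for this example), and the seed and number of repetitions are arbitrary.

```python
import random
from math import sqrt
from statistics import pstdev

def simulate_sample_means(n, repetitions, seed=42):
    """Draw many samples of size n from a skewed population and return their means."""
    rng = random.Random(seed)
    means = []
    for _ in range(repetitions):
        sample = [rng.expovariate(1.0) for _ in range(n)]  # exponential, mean 1, sd 1
        means.append(sum(sample) / n)
    return means

means_n30 = simulate_sample_means(n=30, repetitions=5000)
center = sum(means_n30) / len(means_n30)   # should be close to mu = 1
spread = pstdev(means_n30)                 # should be close to sigma / sqrt(n)
theoretical_se = 1.0 / sqrt(30)
```

Even though individual observations are skewed, the simulated sample means cluster around the population mean with a spread close to \(\sigma/\sqrt{n}\).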

Example

1. For the population of farm workers in New Zealand, suppose that weekly income has a
distribution that is skewed right with a mean of µ = $500 (N.Z. dollars) and a standard deviation
of 𝜎 = $160. A survey of 100 farm workers is taken, including information on their weekly
income.
A. What are the mean and standard error of the sampling distribution of sample mean?

The mean is \(\mu_{\bar{x}} = 500\), and the standard error is
\(\sigma_{\bar{x}} = \frac{160}{\sqrt{100}} = 16\).

B. What is the probability that the mean weekly income of these 100 workers is less than
$448?
Using the CLT, we find the z-score for 448, then use the standard normal table to find the
probability:
\[P(\bar{X} < 448) = P\left(Z < \frac{448-500}{16}\right) = P(Z < -3.25) = 0.0006\]

C. What is the probability that the mean weekly income of these 100 workers is between $480
and $520?
\[P(480 \le \bar{X} \le 520) = P\left(\frac{480-500}{16} \le Z \le \frac{520-500}{16}\right) = P(-1.25 \le Z \le 1.25)\]

\[= 2P(0 \le Z \le 1.25) = 2(0.3944) = 0.7888\]
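
The table lookups in parts B and C can be cross-checked with `statistics.NormalDist` from the Python standard library. This is a sketch that assumes, as the CLT suggests for n = 100, that the sample mean is approximately normal with mean 500 and standard error 16; small differences from the table values are due to rounding.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 500, 160, 100

# Approximate sampling distribution of the sample mean under the CLT.
xbar_dist = NormalDist(mu, sigma / sqrt(n))

p_below_448 = xbar_dist.cdf(448)                        # part B
p_480_to_520 = xbar_dist.cdf(520) - xbar_dist.cdf(480)  # part C
```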


2. The heights of 18-year-old men are approximately normally distributed with mean of 68 inches
and standard deviation of 3 inches.
A. What is the probability that an 18-year-old man selected at random is between 67 and 69
inches tall?
With a sample of size 1, we only need the heights to be approximately normal; the CLT is
not required:
\[P(67 \le X \le 69) = P\left(\frac{67-68}{3} \le Z \le \frac{69-68}{3}\right) = P(-0.33 \le Z \le 0.33) = 2(0.1293) = 0.2586\]

B. For a sample of 36 18-year-old men, what is the probability that the average of their heights
is between 67 and 69 inches?
For this, we carry out the same calculation as in the last part, but replace the
population standard deviation by the standard error: \(\sigma_{\bar{x}} = \frac{3}{\sqrt{36}} = 0.5\).
\[P(67 \le \bar{X} \le 69) = P\left(\frac{67-68}{0.5} \le Z \le \frac{69-68}{0.5}\right) = P(-2 \le Z \le 2) = 2P(0 \le Z \le 2) = 0.9544\]

Note that since the distribution of the heights is approximately normal, we did not need n ≥ 30 to
get probabilities from the standard normal table.

3. For people under 50, the level of glucose in the blood (in milligrams per deciliter of blood) after a
12-hour fast has a standard deviation of 25 and a mean of µ. What is the probability that, for a sample
of 49 readings, the sample mean is within 7 of µ?

We can calculate the standard error \(\sigma_{\bar{x}} = 25/\sqrt{49} = 3.5714\). We are not given the
value of the population mean µ, but we know that we want to find

\[P(-7 \le \bar{X} - \mu \le 7) = P\left(\frac{-7}{3.5714} \le Z \le \frac{7}{3.5714}\right)\]

\[= P(-1.96 \le Z \le 1.96) = 2P(0 \le Z \le 1.96) = 2(0.4750) = 0.95\]

Miscellaneous Exercise

1. Consider a population of size 4 with possible values 2, 4, 6 and 8.


A. What are the mean and standard error of sample mean when the sampling is without
replacement with n = 2, n=3 and n= 4?
B. In how many ways can a sample of size 3 be chosen from a population of size 4?
C. Construct the sampling distribution of sample mean of size 3 from population size 4 with
possible values 2, 4, 6 and 8.
2. In a survey of a company, mean salary of employees is 29321 dollars with SD of 2120 dollars.
Consider the sample of 100 employees and find the probability that their mean salary will be
less than 29000 dollars.
3. An unknown distribution has a mean of 90 and a standard deviation of 15. Samples of size
n = 25 are drawn randomly from the population.
A. Find the probability that the sample mean is between 85 and 92.
B. Find the probability that the sample mean is greater than 85.
C. Find the probability that the sample mean is less than 92.
D. Find the probability that the sample mean is less than 85.

Chapter Six

6. Statistical Inference about a single population


6.1 Estimation of the population mean (μ)
The most fundamental step in estimating a population parameter is collecting a sample
of data from the population using simple random sampling. From these simple random samples, we
can calculate sample statistics such as the sample mean, sample variance and sample proportion to
estimate population parameters such as the population mean, population variance and population
proportion, respectively. A formula that uses sample data to calculate a single number (a sample
statistic) that can be used as an estimate of a population parameter is called an estimator. A
specific observed value of the statistic is called an estimate, e.g., \(\bar{X} = 10\), \(S^2 = 20\), etc.

There are two types of estimation. They are:

6.1.1 Point Estimation: deals with the problem of obtaining a single sample value to
estimate the population parameter. The sample mean (\(\bar{X}\)) and sample variance (\(s^2\)) are
point estimators of the population mean (\(\mu\)) and population variance (\(\sigma^2\)), respectively.

Properties of Point Estimators

I. Unbiasedness: If the sampling distribution of a statistic has a mean equal to the parameter
being estimated, the statistic is an unbiased estimator of that parameter; otherwise it is biased.
If \(\theta\) is a population parameter to be estimated, and \(\hat{\theta}\) is an estimator of
\(\theta\), then we call \(\hat{\theta}\) an unbiased estimator of \(\theta\) if \(E(\hat{\theta}) = \theta\).

II. Efficiency: If \(\hat{\theta}_1\) and \(\hat{\theta}_2\) are both unbiased estimators of the
population parameter \(\theta\), and if \(V(\hat{\theta}_1)\) is less than \(V(\hat{\theta}_2)\), then
\(\hat{\theta}_1\) is a more efficient estimator than \(\hat{\theta}_2\).

III. Consistency: If \(\hat{\theta}\) approaches \(\theta\) as the sample size approaches infinity
(or the population size), then \(\hat{\theta}\) is a consistent estimator of \(\theta\).

IV. Sufficiency: \(\hat{\theta}\) is a sufficient estimator of \(\theta\) if it uses all the
information in the sample in estimating the population parameter.

6.1.2 Confidence Interval Estimation

Regardless of the estimator used, it is necessary to allow for uncertainty due to sampling variation,
i.e. the numerical value obtained from the sample will not be exactly the same as the parameter
value, and an interval must be defined within which we can be reasonably confident that the
parameter lies. An interval estimate gives us a range of values which is likely to contain the
population parameter. The interval within which a population parameter is expected to occur is
called a confidence interval. Thus, two numbers are calculated to determine the ends of an interval
within which we can state that the population parameter lies. A probability is attached to the
calculated interval and signifies the confidence we have in stating that the parameter actually falls
within the interval. The two confidence levels that are used most extensively are 95% and
99%. For a 95% confidence interval, about 95% of similarly constructed intervals will contain
the parameter being estimated. Also, 95% of the sample means for a specified sample size will lie
within 1.96 standard errors of the population mean. For the 99% confidence
interval, 99% of the sample means for a specified sample size will lie within 2.575 standard
errors of the population mean.

The factors that determine the width of a confidence interval are:

i. The sample size, n.


ii. The proportion or variability in the population.
iii. The desired level of confidence: a confidence level is the probability that the
interval estimate will include the population parameter (such as the mean). A
parameter is a numerical description of a characteristic of the population.

To determine the confidence interval for the population mean, we can consider three cases.

• Case 1: when the population standard deviation (𝜎) is known and n is large (n ≥ 30),
then the population mean is estimated by:
\[\mu = \bar{x} \pm Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}\]
• Case 2: when the population standard deviation (𝜎) is unknown and n is large (n ≥ 30),
then the population mean is estimated by:
\[\mu = \bar{x} \pm Z_{\alpha/2} \cdot \frac{S}{\sqrt{n}}\]
• Case 3: when the population standard deviation (𝜎) is unknown and n is small (n < 30),
then the population mean is estimated by:
\[\mu = \bar{x} \pm t_{\alpha/2}(n-1) \cdot \frac{S}{\sqrt{n}}\]
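
All three cases share the form "estimate ± critical value × standard error", which the sketch below encodes. The function names are assumptions of this illustration. The z critical value is computed with the standard library; for Case 3 the t critical value would be read from a t table and passed in as `critical`. The numbers used (n = 49, mean 24, standard deviation 4) are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided standard normal critical value, e.g. about 1.96 for 95%."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def mean_confidence_interval(xbar, sd, n, critical):
    """Return (lower, upper) limits: xbar +/- critical * sd / sqrt(n)."""
    margin = critical * sd / sqrt(n)
    return xbar - margin, xbar + margin

z95 = z_critical(0.95)
low, high = mean_confidence_interval(24.0, 4.0, 49, z95)
```

Here the margin of error is about 1.12, so the interval runs from roughly 22.88 to 25.12.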

Example 6.1

1. The Dean of the Business School wants to estimate the mean number of hours worked per
week by students. A sample of 49 students showed a mean of 24 hours with a standard
deviation of 4 hours. What is the population mean?

Solution: The value of the population mean is not known. Our best estimate of this value is the
sample mean of 24.0 hours. This value is called a point estimate. The 95 percent confidence
interval for the population mean is computed as

\[\bar{X} \pm Z_{\alpha/2} \cdot \frac{S}{\sqrt{n}} = 24.00 \pm 1.96 \cdot \frac{4}{\sqrt{49}} = 24.00 \pm 1.12\]

The confidence limits range from 22.88 to 25.12. About 95% of similarly constructed intervals
will include the population parameter.

2. The mean life of a sample of 200 tyres taken from a lot is found to be 40,000 km. Past
experience shows that the standard deviation of the life of tyres in the lot is 3,200 km. Construct
a 95% confidence interval for the mean life of tyres in the lot.

Solution: The givens are: n = 200, \(\bar{x}\) = 40,000 km, 𝜎 = 3,200 km.

\[\mu = \bar{x} \pm Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}, \quad \text{where } Z_{0.05/2} = Z_{0.025} = 1.96\]

\[40{,}000 - 1.96 \cdot \frac{3200}{\sqrt{200}} \le \mu \le 40{,}000 + 1.96 \cdot \frac{3200}{\sqrt{200}}\]

\[39{,}556.5 \le \mu \le 40{,}443.5\]

Thus, the mean life of tyres in the lot is expected to lie between about 39,557 km and
40,443 km.
3. A soap manufacturing company was distributing a particular brand through a large
number of retail shops. Before a heavy advertising campaign, the mean sales per week per
shop were 140 dozens. After the campaign, a sample of 49 shops was taken and the mean
sales were found to be 147 dozens with SD 16. Construct a 95% confidence interval for the
mean sales of the soap manufacturing company.
Solution: The givens are n = 49, \(\bar{x}\) = 147, S = 16.

The CI is \(\mu = \bar{x} \pm Z_{\alpha/2} \cdot \frac{S}{\sqrt{n}}\), where \(Z_{0.05/2} = Z_{0.025} = 1.96\).

\[147 - 1.96 \cdot \frac{16}{\sqrt{49}} \le \mu \le 147 + 1.96 \cdot \frac{16}{\sqrt{49}}\]

\[142.52 \le \mu \le 151.48\]

Thus, the mean sales of the soap manufacturing company are expected to lie between 142.52 and
151.48 dozens.

4. An automobile tyre manufacturer claims that the average life of a particular grade of tyres is
more than 20,000 km when used under normal driving conditions. A random sample of 16
tyres was tested, and a mean and SD of 22,000 km and 5,000 km respectively were computed.
Construct a 95% confidence interval for the average life of this grade of tyres.
Solution: Left as an exercise.

6.2 Hypothesis Testing

A hypothesis is an idea about something around us. A statistical hypothesis is a conjecture about a
population parameter. This conjecture may or may not be true. We may think of the marks
scored by students of a given group, the yield of a given variety, the difference between
the means of two groups of values, and so on. The procedure we follow to decide whether or not to
reject a hypothesis is called a test of hypothesis. Hypothesis testing is a decision-making process
for evaluating claims about a population.

There are two types of hypotheses.

Null hypothesis (H0) - is the beginning hypothesis that the researcher or experimenter wishes to
disprove. It is a statement involving equality about a population parameter. We assume the null
hypothesis is true to do our analysis.
Alternative hypothesis (H1) - is a statement that contradicts the null hypothesis. The
alternative hypothesis is what we conclude is true if the experimental results
lead us to conclude that the null hypothesis (our assumption) is false.

Example

1. A medical researcher is interested in finding out whether a new medication will have any undesirable
side effects. The researcher is particularly concerned with the pulse rate of the patients who take the
medication. What are the hypotheses to test whether the pulse rate will be different from the mean pulse
rate of 82 beats per minute? H0: µ = 82 vs H1: µ≠ 82. This is a two-tailed test.
2. A chemist invents an additive to increase the life of an automobile battery. If the mean lifetime of the
battery is 36 months, then his hypotheses are
H0: µ= 36 vs H1: µ > 36. This is a right-tailed test.

3. A contractor wishes to lower heating bills by using a special type of insulation in houses. If the average
of the monthly heating bills is $78, her hypotheses about heating costs will be H0: µ= $78 vs H1: µ <
$78. This is a left-tailed test.
Common Phrases in Hypothesis Testing

>                              <
Is greater than                Is less than
Is higher than                 Is lower than
Is longer than                 Is shorter than
Is bigger than                 Is smaller than
Is increased                   Is decreased or reduced from

=                              ≠
Is equal to                    Is not equal to
Is the same as                 Is different from
Has not changed from           Has changed from
                               Is not the same as

Types of Errors

When we test a hypothesis, we may commit one of two types of errors:

1. A type I error occurs if one rejects the null hypothesis when it is true. The level of significance is the
maximum probability of committing a type I error. This probability is symbolized by α (Greek letter
alpha). That is, P (type I error) = α.
The level of significance is chosen before the test is carried out (the corresponding confidence level
is 1 − α). It establishes the criterion for rejection or non-rejection of the null hypothesis.
2. A type II error occurs if one does not reject the null hypothesis when it is false. P (type II error) = β
(Greek letter beta).

The hypothesis-testing situation can be likened to a jury trial. In a jury trial, there are four possible
outcomes. The defendant is either guilty or innocent, and he or she will be convicted or acquitted.
Now the hypotheses are
H0: The defendant is innocent
H1: The defendant is not innocent (i.e., guilty)
Next, the evidence is presented in court by the prosecutor, and based on this evidence, the jury
decides the verdict, innocent or guilty. If the defendant is convicted but he or she did not commit
the crime, then a type I error has been committed. On the other hand, if the defendant is convicted
and he or she has committed the crime, then a correct decision has been made.

If the defendant is acquitted and he or she did not commit the crime, a correct decision has been
made by the jury. However, if the defendant is acquitted and he or she did commit the crime, then
a type II error has been made.
The decision of the jury does not prove that the defendant did or did not commit the crime. The
decision is based on the evidence presented. If the evidence is strong enough, the defendant will
be convicted in most cases. If the evidence is weak, the defendant will be acquitted in most cases.

Nothing is proved absolutely. Likewise, the decision to reject or not reject the null hypothesis does
not prove anything. The only way to prove anything statistically is to use the entire population,
which, in most cases, is not possible. The decision, then, is made on the basis of probabilities. That
is, when there is a large difference between the mean obtained from the sample and the
hypothesized mean, the null hypothesis is probably not true. The question is, how large a difference
is necessary to reject the null hypothesis? In this case we can use the level of significance.

• Test statistic is the value that is computed from a sample result. It uses data obtained from a sample
to make a decision about whether or not the null hypothesis should be rejected. The numerical value
obtained from a statistical test is called the test value.

• Critical value is the demarcation point between the acceptance and the rejection region. It separates
the critical region from the noncritical region. The critical or rejection region is the range of values of
the test value that indicates that there is a significant difference and that the null hypothesis should
be rejected. The noncritical or nonrejection region is the range of values of the test value that
indicates that the difference was probably due to chance and that the null hypothesis should not be
rejected.

A one-tailed test (right or left) indicates that the null hypothesis should be rejected when the test
value is in the critical region on one side of the mean. For a right-tailed test with α = 1%, the
critical value is the z value that leaves an area of 0.01 in the right tail of the standard normal
curve, which is approximately 2.33.

In a two-tailed test, the null hypothesis should be rejected when the test value is in either of the
two critical regions. For a two-tailed test with α = 1%, the area 0.01 is split equally between the
two tails (0.005 in each), so the critical values are approximately ±2.58 (tables often give 2.575).
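
Critical values of the standard normal distribution can also be computed rather than read from a table. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `critical_z` and its `tail` argument are assumptions of this illustration.

```python
from statistics import NormalDist

def critical_z(alpha, tail):
    """Standard normal critical value for a right-, left- or two-tailed test."""
    z = NormalDist()
    if tail == "right":
        return z.inv_cdf(1 - alpha)       # area alpha lies to the right
    if tail == "left":
        return z.inv_cdf(alpha)           # area alpha lies to the left (negative value)
    if tail == "two":
        return z.inv_cdf(1 - alpha / 2)   # reject when |z| exceeds this value
    raise ValueError("tail must be 'right', 'left' or 'two'")

right_1pct = critical_z(0.01, "right")
two_1pct = critical_z(0.01, "two")
left_5pct = critical_z(0.05, "left")
```

For α = 1% this gives about 2.33 for the right-tailed test and about 2.58 for the two-tailed test.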

Steps in hypothesis testing

1. State the particular hypotheses (H0 and H1) that will be investigated.
2. Fix the level of significance (α).
3. Select an appropriate test statistic and calculate its value.
4. Determine the critical value or tabulated value.
5. Compare the test statistic with the tabulated value and draw the statistical conclusion.

6.1.3 Testing About a Single Population Mean

The tests concerning a single population mean (𝛍) may take one of the following forms.

1. H0: μ = μ0 vs H1: μ ≠ μ0 - this is called a two-tailed test.
2. H0: μ = μ0 vs H1: μ < μ0 - this is called a one-tailed test (left-tailed test).
3. H0: μ = μ0 vs H1: μ > μ0 - this is called a one-tailed test (right-tailed test).
❖ There are three cases to be considered while testing any one of the above hypotheses.
• Case 1: if the population standard deviation (σ) is known and n is large (n ≥ 30), then the
appropriate test statistic is:
\[Z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}\]
• Case 2: if the population standard deviation (σ) is unknown and n is large (n ≥ 30), then the
appropriate test statistic is:
\[Z = \frac{\bar{x} - \mu_0}{S/\sqrt{n}}\]
• Case 3: if the population standard deviation (σ) is unknown and n is small (n < 30), then the
appropriate test statistic, which has n − 1 degrees of freedom, is:
\[t = \frac{\bar{x} - \mu_0}{S/\sqrt{n}}\]

For the standard normal distribution (cases 1 and 2) the decision is based on the following table:

Decision               H1: μ ≠ μ0             H1: μ > μ0           H1: μ < μ0
Reject H0 if           |Zcal| > Zα/2          Zcal > Zα            Zcal < −Zα
Do not reject H0 if    |Zcal| ≤ Zα/2          Zcal ≤ Zα            Zcal ≥ −Zα

For Student's t-distribution (case 3) the decision is based on the following table:

Decision               H1: μ ≠ μ0             H1: μ > μ0           H1: μ < μ0
Reject H0 if           |tcal| > tα/2(n−1)     tcal > tα(n−1)       tcal < −tα(n−1)
Do not reject H0 if    |tcal| ≤ tα/2(n−1)     tcal ≤ tα(n−1)       tcal ≥ −tα(n−1)
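
The decision rule for cases 1 and 2 can be sketched as a short function. The name `z_test` and the labels used for the `alternative` argument are assumptions of this illustration; the numbers are those of the tyre example (n = 200, sample mean 40,000 km, σ = 3,200 km, hypothesized mean 41,000 km, two-tailed test at α = 0.05). For case 3 the same comparison is made against a t table value with n − 1 degrees of freedom.

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sd, n, alpha=0.05, alternative="two-sided"):
    """Return (z, reject) for a one-sample z test of H0: mu = mu0."""
    z = (xbar - mu0) / (sd / sqrt(n))
    std_normal = NormalDist()
    if alternative == "two-sided":        # H1: mu != mu0
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    elif alternative == "greater":        # H1: mu > mu0
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":           # H1: mu < mu0
        reject = z < -std_normal.inv_cdf(1 - alpha)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z, reject

z_tyres, reject_tyres = z_test(40_000, 41_000, 3_200, 200)
```

Here the computed z is about −4.42, whose absolute value exceeds 1.96, so H0 is rejected.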

Examples:

1. The mean life of a sample of 200 tyres taken from a lot is found to be 40,000 km. Past experience
shows that the standard deviation of the life of tyres in the lot is 3,200 km.
A. Construct a 95% confidence interval for the mean life of tyres in the lot.
B. Is it reasonable to suppose that the mean life of tyres in the lot is 41,000 km?
(Use a 5% level of significance.)

2. A soap manufacturing company was distributing a particular brand through a large number of
retail shops. Before a heavy advertising campaign, the mean sales per week per shop were 140 dozens.
After the campaign, a sample of 49 shops was taken and the mean sales were found to be 147 dozens
with SD 16.
A. Construct a 95% confidence interval for the mean sales of the soap manufacturing company.
B. Can you consider the advertisement effective?
3. An automobile tyre manufacturer claims that the average life of a particular grade of tyres is more
than 20,000 km when used under normal driving conditions. A random sample of 16 tyres was tested,
and a mean and SD of 22,000 km and 5,000 km respectively were computed.
A. Construct a 95% confidence interval for the average life of this grade of tyres.
B. At the 5% level of significance, decide whether the manufacturer's claim is true.

Solution

1. The givens are: n = 200, \(\bar{x}\) = 40,000 km, 𝜎 = 3,200 km.

A. \[\mu = \bar{x} \pm Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}, \quad \text{where } Z_{0.05/2} = Z_{0.025} = 1.96\]

\[40{,}000 - 1.96 \cdot \frac{3200}{\sqrt{200}} \le \mu \le 40{,}000 + 1.96 \cdot \frac{3200}{\sqrt{200}}\]

\[39{,}556.5 \le \mu \le 40{,}443.5\]

Thus, the mean life of tyres in the lot is expected to lie between about 39,557 km and
40,443 km.
B. The givens are n = 200, sample mean = 40,000, σ = 3,200, μ0 = 41,000. To test the hypothesis, we
follow these steps.

1. The hypotheses to be tested are H0: μ = 41,000 vs H1: μ ≠ 41,000.
2. α = 0.05, since it is stated in the problem.
3. Since n ≥ 30 and σ is known, the appropriate test statistic is:
\[Z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{40{,}000 - 41{,}000}{3200/\sqrt{200}} = -4.42\]
4. Since it is a two-tailed test, the critical value is \(Z_{0.05/2} = Z_{0.025} = 1.96\).
5. Since |Zcal| = 4.42 > Ztabulated = 1.96, we reject H0 and conclude that the mean
life of tyres in the lot is significantly different from 41,000 km.

2. The givens are n = 49, \(\bar{x}\) = 147, S = 16.
A. The CI is \(\mu = \bar{x} \pm Z_{\alpha/2} \cdot \frac{S}{\sqrt{n}}\), where \(Z_{0.05/2} = Z_{0.025} = 1.96\).

\[147 - 1.96 \cdot \frac{16}{\sqrt{49}} \le \mu \le 147 + 1.96 \cdot \frac{16}{\sqrt{49}}\]

\[142.52 \le \mu \le 151.48\]

Thus, the mean sales of the soap manufacturing company are expected to lie between 142.52 and
151.48 dozens.

B. To test the hypothesis based on the given data, we follow these steps:
1. The hypotheses to be tested are H0: μ = 140 vs H1: µ > 140.
2. Select a level of significance: it is stated in the problem as 5%, or α = 0.05.
3. Identify the statistical test to use. Since n ≥ 30 and σ is unknown, the
appropriate test statistic is:
\[Z = \frac{\bar{x} - \mu_0}{S/\sqrt{n}} = \frac{147 - 140}{16/\sqrt{49}} = \frac{7}{2.286} = 3.06\]
Recall that in the normal curve, Z = 0 corresponds to the mean. Z = 1, 2, 3 represent 1, 2,
and 3 standard deviations above the mean; the negatives are below the mean.
4. Since it is a one-tailed (right-tailed) test, the critical value is Zα = Z0.05 = 1.645.
5. Since Zcal = 3.06 > tabulated value = 1.645, we reject H0 and conclude
that the advertising is effective for increasing sales.
3. The given data are n = 16, sample mean = 22,000 km, S = 5,000 km.
A. CI left as an exercise.
B. Now the solution is as follows:

Step 1: H0: μ = 20,000 vs H1: μ > 20,000

Step 2: Select a level of significance. It is stated in the problem as 5%, or α = 0.05.

Step 3: Identify the statistical test to use.

Use the t-test because σ is unknown and the sample (n = 16) is a small sample (n < 30).
\[t = \frac{\bar{x} - \mu_0}{S/\sqrt{n}} = \frac{22{,}000 - 20{,}000}{5000/\sqrt{16}} = \frac{2000}{1250} = 1.60\]

Step 4: Since it is a one-tailed (right-tailed) test, the critical value is
\(t_{\alpha}(n-1) = t_{0.05}(16-1) = t_{0.05}(15) = 1.753\).

Step 5: Since tcal = 1.60 < ttabulated = 1.753, we do not reject H0. At the 5% level of significance,
the sample does not provide sufficient evidence to support the manufacturer's claim that the average
life of the tyres is more than 20,000 km.

Miscellaneous Exercise

1. A representative sample of 256 salaries for women in a particular job classification yields the
following results: sample mean salary (x̄) = $59,000; sample standard deviation of the
salaries (s) = $3,200. Testing at the 1% level of significance, do we have strong statistical
evidence that the population mean salary for all women in that job classification is lower than
$60,000?

2. The average score of all sixth graders in school District A on a math aptitude exam is 75 with a standard
deviation of 8.1. A random sample of 100 students in one school was taken. The mean score of these
100 students was 71. Does this indicate that the students of this school are significantly less skilled in
their mathematical abilities than the average student in the district? (Use a 5% level of significance.)
3. A sample of 250 married workers showed 22 missed more than 5 days last year for any reason. A
sample of 300 unmarried workers showed 35 missed more than 5 days. Use the 5% level of significance
to test and answer the question: Are unmarried workers more likely to be absent from work than married
workers?
4. A researcher reports that the average salary of assistant professors is more than $42,000. A sample of
30 assistant professors has a mean salary of $43,260. At α = 0.05, test the claim that assistant professors
earn more than $42,000 a year. The standard deviation of the population is $5230.
5. A national magazine claims that the average college student watches less television than the general
public. The national average is 29.4 hours per week, with a standard deviation of 2 hours. A sample of
30 college students has a mean of 27 hours. Is there enough evidence to support the claim at α= 0.01?
6. The Medical Rehabilitation Education Foundation reports that the average cost of rehabilitation for
stroke victims is $24,672. To see if the average cost of rehabilitation is different at a large hospital, a
researcher selected a random sample of 35 stroke victims and found that the average cost of their
rehabilitation is $25,226. The standard deviation of the population is $3,251. At α = 0.01, can it be
concluded that the average cost at a large hospital is different from $24,672?
7. A researcher wishes to test the claim that the average age of lifeguards in Ocean City is greater than 24
years. She selects a sample of 36 guards and finds the mean of the sample to be 24.7 years, with a
standard deviation of 2 years. Is there evidence to support the claim at α= 0.05?

8. A researcher claims that the average wind speed in a certain city is 8 miles per hour. A sample of 32
days has an average wind speed of 8.2 miles per hour. The standard deviation of the sample is 0.6 mile
per hour. At α= 0.05, is there enough evidence to reject the claim?
9. A job placement director claims that the average starting salary for nurses is $24,000. A sample of 10
nurses has a mean of $23,450 and a standard deviation of $400. Is there enough evidence to reject the
director’s claim at α= 0.05?
10. Sugar is packed in 5-pound bags. An inspector suspects the bags may not contain 5 pounds. A sample
of 50 bags produces a mean of 4.6 pounds and a standard deviation of 0.7 pound. Is there enough
evidence to conclude that the bags do not contain 5 pounds as stated, at α= 0.05? Also, find the 95%
confidence interval of the true mean.
