Statistics For Finance Notes

Uploaded by

Precious Pearl

PREFACE

This handbook introduces students to basic quantitative statistical techniques, and the statistical concepts behind them, that support business decision making. The explanations are presented so that the underlying statistical theory is fully exposed, enabling a thorough understanding of the relationship between theory and application. Many illustrations apply these techniques to business data, which keeps the handbook largely non-technical and non-mathematical in character. Study material of various types is given at the end of each chapter to help students apply the theories discussed in the text, develop the habit of quantitative thinking, and build the calculation skills needed for the various methods of analysis.

PREFACE
CHAPTER ONE
RANDOM VARIABLES AND PROBABILITY DISTRIBUTION
1.1 Introduction
1.2 Definition of key Terms
1.2.1 A variable
1.2.2 Random Variable (R.V)
1.2.2.1 Types Of Random Variable
1.3 Probability Distribution Of Random Variable
1.3.1 Discrete Probability Distributions
1.3.1.1 Probability Mass Function (PMF)
1.4 Cumulative Mass Function (CMF)
1.5 Characteristics Of Probability Distribution
1.5.1 Expected Value of a Discrete Probability Distribution
1.5.2 Variance of a Discrete Probability Distribution
1.8 Multivariate Or Joint Probability Distribution
1.8.1 Discrete Joint Probability Distribution
1.8.2 Marginal Discrete Probability Distribution
1.9 Covariance And Correlation
1.9.1 Covariance
1.9.2 Correlation
1.9.3 Properties Of Correlation Coefficient
1.10 Relationship Between Population And Sample
1.11 Linear Combination Of Random Variables
1.11.1 Properties Of Linear Combination Of Random Variables
1.12 Special Types Of Probability Distributions
1.12.1 Binomial Distribution
1.12.2 Mean, Variance, and Standard Deviation for a Binomial Random Variable
1.12.2 Poisson Probability Distribution
1.13 Continuous Probability Distributions
1.14 Normal Distribution
1.14.1 Properties of normal distribution
1.14.2 Standard Normal distribution
1.14.3 The Standard Normal Distribution Table
CHAPTER TWO
SAMPLING DISTRIBUTION
2.1 Introduction
2.2 Distribution of the sample mean
2.3 Distribution Of The Sample Proportion
2.4 Sampling Distribution of the Difference between Two Sample Means
2.5 Distribution for the differences between two sample proportions, p̂₁ − p̂₂
CHAPTER THREE
ESTIMATION OF PARAMETERS
3.1 Introduction
3.2 Definition of Key terms
3.3 Properties of good estimator
3.3.1 Unbiasedness
3.3.2 Efficiency
3.3.3 Consistency
3.3.4 Sufficiency
3.4 Types of Estimation
3.4.1 Point Estimation
3.4.2 Interval Estimation
3.4.3 Confidence Intervals
3.5 Formula for Confidence Interval for the population mean
3.5.1 When population variance (σ²) is known
3.5.2 When population variance (σ²) is unknown
3.5.2.1 Confidence interval for µ when (σ²) is unknown and the sample size is large
3.5.2.2 Small sample confidence Interval for the mean µ, when (σ²) is unknown
3.5.3 Summary on how to estimate the confidence interval for µ
3.6 Confidence Interval estimate for the population Proportions
3.7 Estimation of the difference between two population means (µ₁ − µ₂)
3.7.1 When population variances are known
3.7.2 When population variances are unknown
3.8 Estimation of the difference between two population proportions (p₁ − p₂)
3.9 Estimation of the Sample size
CHAPTER FOUR
HYPOTHESIS TESTING
4.1 Introduction
4.2 Definition of key Terms
4.2.1 Statistical Hypothesis
4.2.2 A Hypotheses Testing
4.3 Types of Error
4.4 Types of Statistical Tests
4.5 Formulation of Null and Alternative hypotheses
4.6 Approaches For Hypothesis Testing
4.6.1 The Test of Significance Approach
4.6.1.1 Critical Value Approach
4.6.1.2 P-value approach
4.6.2 Confidence Interval approach
4.7 Hypothesis testing for population mean (µ)
4.7.1 Hypothesis testing for µ when σ² is known
4.7.2 Hypotheses Testing for µ when (σ²) is unknown
4.8 Hypothesis testing for the difference between two population means
4.8.1 When population variances are known
4.8.2 When the population variances are unknown
4.10 Matched Pair t-test for Dependent Samples
4.11 Hypothesis Testing For Population Proportions
4.12 Testing for the Equality between two Population Proportions
CHAPTER FIVE
REGRESSION AND CORRELATION ANALYSIS
5.1 Introduction
5.2 Objective of Regression Analysis
5.3 Simple Linear Regression Model
5.4 The method of Ordinary Least Square (OLS)
5.5 Sampling distribution for Estimators of regression line
5.5.1 Distribution for and
5.5.2 Sampling Distribution for and
5.6 Estimation for regression coefficients
5.7 Hypothesis testing for Regression Coefficients ( )
5.8 Correlation Analysis
5.8.1 Sample correlation coefficient ( )
5.8.2 Properties of correlation coefficient
5.9 Coefficient of determination ( )
5.9.1 Properties of coefficient of determination
5.10 Hypothesis Testing For Correlation Coefficient ( )
5.11 Hypothesis Testing For Significance Of Regression Model ( )
5.12 Analysis of Variance (ANOVA)
5.13 Computer output and interpretation of the results
5.13 Multiple Linear Regression Analysis
5.13.1 Assumption of Multiple Linear Regression models
5.14 OLS estimators of Multiple Linear Regression model
5.15 Estimation for partial regression coefficients
5.16 Hypothesis testing in Multiple Linear Regression Analysis
5.16.1 Testing Individual Partial Regression Coefficients
5.16.2 Testing for two or more Partial Coefficients
5.16.3 Testing for the significance of the model

CHAPTER ONE
RANDOM VARIABLES AND PROBABILITY DISTRIBUTION

1.1 Introduction
In this chapter we discuss probability distributions of random variables, which are customarily used to model problems in various fields such as business, finance and economics, and in life in general. To understand this chapter clearly, you need some basic knowledge of probability fundamentals.

1.2 Definition of key Terms


1.2.1 A variable
It is an attribute that can take on different values or characteristics from one individual or object to another. For example, the number of items sold per day varies from one day to another, so “number of items sold” is a variable.
1.2.2 Random Variable (R.V)
This is a variable whose numerical value is determined by the outcome of a random or statistical experiment. In other words, it is a variable that is subject to randomness and can take on different values, which distinguishes it from an algebraic variable. Random variables are also commonly used in econometrics to study relationships among variables. A random experiment, in turn, is an experiment with at least two possible outcomes and no prior knowledge of which one will occur; that is, an experiment whose outcome cannot be predicted in advance. Most such experiments are described in words (e.g. tossing a fair coin, or asking a sample of 10 people whether they have ever been in the UK), but their outcomes are often described more meaningfully in terms of numbers. In real life, too, most sample data are expressed in numbers rather than in words. In short, a random variable assigns a numerical value to each outcome of a random experiment.

Traditionally random variables are denoted by capital letters, X, Y, Z or X1, X2, X3 etc.

1.2.2.1 Types Of Random Variable

There are two types of random variable (R.V): discrete and continuous. A discrete random variable takes on only a finite (or countably infinite) number of distinct values, usually integers. Examples of discrete random variables are: the number of cars passing through a roadblock, the number of heads obtained when tossing one or more fair coins, the number of defective items in a sample, the number of deaths from COVID-19 in the year 2020, etc. On the other hand, a continuous random variable is a random variable that can take on any real value in a given interval. Examples of continuous R.Vs are height, weight, rainfall, temperature, etc.

General examples of random variables in real life are: the unemployment rate, consumer price index, number of sales made in a week, yearly profit of a company, share prices, return on investments, money supply, GDP, wages, cash flows, interest rates, etc.

1.3 Probability Distribution Of Random Variable


Definition: A probability distribution is a graph, table or function that links each outcome of a random experiment with its probability of occurrence. In other words, a probability distribution is a function that can be used to derive the probability of each value of a random variable; that is, it gives the values taken by the random variable and their associated probabilities. There are two types of probability distribution of a random variable (R.V): discrete and continuous. The following chart may help you understand the branching of probability distributions.

Figure 1: Branches of probability distribution

1.3.1 Discrete Probability Distributions


If a random variable is discrete, its probability distribution is called a discrete probability distribution. With a discrete probability distribution, each possible value of the discrete random variable is associated with a non-zero probability. Thus, a discrete probability distribution can always be presented in tabular form. The discrete probability distribution is commonly known as the Probability Mass Function (PMF). See Figure 1.

1.3.1.1 Probability Mass Function (PMF)

If $X$ is a discrete random variable, the function denoted by $f(x) = P(X = x)$ for each $x$ within the range of $X$ is called the Probability Mass Function of $X$. To capture clearly the meaning of the probability distribution of a discrete random variable, consider Example 1.1.

Example 1.1
Consider an experiment of tossing two fair coins simultaneously. Find the probability distribution of the total number of heads obtained.

Solution:
The following are the procedures for building a probability distribution:

1. List all possible events
2. Calculate the probability of each event
3. Present these probabilities in a suitable table or diagram

The list of all possible events can be obtained easily by using a tree diagram, as shown below:

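Figure 2: Tree Diagram (from Start, the first toss branches into H and T, and each branch splits again into H and T on the second toss, giving the outcomes HH, HT, TH and TT)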


From the tree diagram in Figure 2, the list of all possible outcomes of the experiment (i.e. the sample space, $S$) is:

$S = \{HH, HT, TH, TT\}$

From this sample space, the random variable (i.e. the number of heads) takes on three different values, 0, 1 or 2, depending on whether no head, one head or two heads were obtained when tossing the two fair coins. That is,

HH → 2 heads
HT → 1 head
TH → 1 head
TT → 0 heads

The probability of an event can be obtained by employing the traditional definition of probability, that is:

$P(E) = \frac{n(E)}{n(S)}$

Let $X$ be the number of observed heads. The probabilities of the number of heads showing up are as indicated below, and the probability distribution is shown in Table 1.

$f(0) = P(X = 0) = \frac{n(0)}{n(S)} = \frac{1}{4}$
$f(1) = P(X = 1) = \frac{n(1)}{n(S)} = \frac{1}{2}$
$f(2) = P(X = 2) = \frac{n(2)}{n(S)} = \frac{1}{4}$

Table 1: Probability Distribution
Number of heads (x)    f(x) = P(X = x)
0                      1/4
1                      1/2
2                      1/4
Total                  1.00

Properties of PMF
1. $f(x) \ge 0$ for each $x$ in the range of $X$
2. $0 \le f(x) \le 1$
3. $\sum_x f(x) = 1$
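The three-step procedure of Example 1.1 can be sketched in Python as a rough check. The function name `heads_pmf` is an illustrative choice, not part of the handbook; the sketch enumerates the sample space, counts heads per outcome, and confirms the PMF properties listed above.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def heads_pmf(n_coins):
    """PMF of the number of heads when n_coins fair coins are tossed."""
    # Step 1: list all possible outcomes (the sample space S).
    outcomes = list(product("HT", repeat=n_coins))
    # Step 2: n(E) for each value x, then P(E) = n(E) / n(S).
    counts = Counter(outcome.count("H") for outcome in outcomes)
    return {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}


# Step 3: present the probabilities as a table.
two_coin_pmf = heads_pmf(2)
for x, p in two_coin_pmf.items():
    print(x, p)

# PMF properties: every probability lies in [0, 1] and they sum to 1.
assert all(0 <= p <= 1 for p in two_coin_pmf.values())
assert sum(two_coin_pmf.values()) == 1
```

Exact fractions are used so that the printed values match the 1/4, 1/2, 1/4 entries of the hand calculation.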

Example 1.2
The employment rate of a certain country A ($X$), in percentage, is assumed to be a discrete random variable whose probability distribution is as shown below:

x        -12     -10     -6      0       4       8       10      12
Prob.    0.10    0.15    0.10    0.15    0.10    0.15    0.10    0.15

Find
a) $P(X \ge 0)$
b) $P(X \ge -10)$

Solution:
a) $P(X \ge 0) = P(X = 0) + P(X = 4) + P(X = 8) + P(X = 10) + P(X = 12)$
= 0.15 + 0.1 + 0.15 + 0.1 + 0.15
= 0.65
b) $P(X \ge -10) = 1 - P(X < -10)$
= 1 − P(X = −12)
= 1 − 0.10
= 0.9
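As a quick cross-check of Example 1.2, the event probabilities can be computed by summing the PMF over the values that satisfy the event. The helper `prob` and the variable names are hypothetical, introduced only for this sketch.

```python
# PMF from Example 1.2: value of X (in %) -> probability.
rate_pmf = {-12: 0.10, -10: 0.15, -6: 0.10, 0: 0.15,
            4: 0.10, 8: 0.15, 10: 0.10, 12: 0.15}


def prob(pmf, event):
    """P(event) = sum of f(x) over all x for which event(x) is true."""
    return sum(p for x, p in pmf.items() if event(x))


# a) Direct summation over x >= 0.
p_a = prob(rate_pmf, lambda x: x >= 0)
# b) Complement rule: P(X >= -10) = 1 - P(X < -10).
p_b = 1 - prob(rate_pmf, lambda x: x < -10)

print(round(p_a, 2), round(p_b, 2))
```

The complement rule in part b) saves work whenever the event covers most of the table.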

1.4 Cumulative Mass Function (CMF)
Definition: The Cumulative Mass Function, $F(x)$, associated with the Probability Mass Function, $f(x)$, of a discrete random variable $X$ is defined as follows:
$F(x) = P(X \le x)$

where $P(X \le x)$ means the probability that the random variable $X$ takes a value less than or equal to a specific, given value $x$. For example, $P(X \le 2)$ means the probability that the random variable takes a value less than or equal to 2.

Example 1.3

Find Probability Mass Function (PMF) and Cumulative Mass Function (CMF) of a total number of
heads obtained by tossing a fair coin three times.

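Solution:
The following tree diagram is used to obtain the sample space.

[Tree diagram: from Start, each of the three tosses branches into H and T, giving eight outcomes]

Therefore $S = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\}$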
Letting $X$ = the number of heads shown, we find that $X$ can take the values 0, 1, 2 or 3, and hence the corresponding PMF is obtained as indicated below:
$f(0) = P(X = 0) = \frac{1}{8}$
$f(1) = P(X = 1) = \frac{3}{8}$
$f(2) = P(X = 2) = \frac{3}{8}$
$f(3) = P(X = 3) = \frac{1}{8}$

In tabular form:
x       0      1      2      3
f(x)    1/8    3/8    3/8    1/8

It follows that the Cumulative Mass Function (CMF) is obtained as indicated here.
From:
$F(x) = P(X \le x)$
$F(0) = P(X \le 0) = P(X = 0) = \frac{1}{8}$
$F(1) = P(X \le 1) = P(X = 0) + P(X = 1) = \frac{4}{8}$
$F(2) = P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2) = \frac{7}{8}$
$F(3) = P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = 1$
However, in plotting the CMF the following are the ranges of $x$:

$$F(x) = \begin{cases} 0 & \text{for } x < 0 \\ \frac{1}{8} & \text{for } 0 \le x < 1 \\ \frac{4}{8} & \text{for } 1 \le x < 2 \\ \frac{7}{8} & \text{for } 2 \le x < 3 \\ 1 & \text{for } 3 \le x \end{cases}$$

With reference to the previous example, it can be observed that a CMF is merely an accumulation of the PMF over the values less than or equal to a given $x$. That is,

$F(x) = \sum_{t \le x} f(t)$

Note: The results for both PMF and CMF can also be summarized in the following Table:

Number of heads    Values of x    PMF f(x)    Values of X    CMF F(x)
0                  0 ≤ x < 1      1/8         X = 0          1/8
1                  1 ≤ x < 2      3/8         X ≤ 1          4/8
2                  2 ≤ x < 3      3/8         X ≤ 2          7/8
3                  3 ≤ x          1/8         X ≤ 3          1
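The accumulation that turns a PMF into a CMF can be sketched as follows, using the three-coin PMF of Example 1.3. The names `three_coin_pmf`, `cmf_table` and `cmf_at` are illustrative assumptions rather than notation from the handbook.

```python
from fractions import Fraction
from itertools import accumulate

# PMF of the number of heads in three tosses of a fair coin (Example 1.3).
three_coin_pmf = {0: Fraction(1, 8), 1: Fraction(3, 8),
                  2: Fraction(3, 8), 3: Fraction(1, 8)}

# CMF at the support points: running total of the PMF in increasing x.
support = sorted(three_coin_pmf)
cmf_table = dict(zip(support, accumulate(three_coin_pmf[x] for x in support)))


def cmf_at(x):
    """Step-function CMF F(x) = P(X <= x), defined for any real x."""
    return sum((p for v, p in three_coin_pmf.items() if v <= x), Fraction(0))


for x in support:
    print(x, three_coin_pmf[x], cmf_table[x])
```

`cmf_at` reproduces the piecewise ranges used for plotting: it stays constant between support points and jumps by f(x) at each one.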

1.5 Characteristics Of Probability Distribution
Although a probability distribution shows the values taken by a random variable and their corresponding probabilities, in most cases a researcher is interested in deducing some summary characteristics from the distribution. These summary characteristics include, among others, the expected value (population mean), variance, covariance and correlation.

1.5.1 Expected Value of a Discrete Probability Distribution


The average value of a random variable is called the expected value of the random variable, and it is denoted by $E(X)$.

Definition:
Let $X$ be a random variable with probability $f(x) = P(X = x)$; then the expected value $E(X)$ is given by
$E(X) = \sum_x x\,P(X = x)$ or

$\mu = E(X) = \sum_{i=1}^{n} x_i f(x_i)$

In other words, if a discrete random variable $X$ has possible values $x_1, x_2, x_3, \cdots, x_n$ with corresponding probabilities $P(X = x_1), P(X = x_2), P(X = x_3), \cdots, P(X = x_n)$, then the expected value is obtained by multiplying each value the random variable takes by the corresponding probability of occurrence and adding the results, i.e.

$E(X) = x_1 P(X = x_1) + x_2 P(X = x_2) + x_3 P(X = x_3) + \cdots + x_n P(X = x_n)$
Here $\Sigma$ denotes summation, whose properties are as indicated below:
Properties of summation notation
1. If $c$ is constant, then

$\sum_{i=1}^{n} c = nc$

2. If $c$ is constant, then

$\sum_{i=1}^{n} c x_i = c \sum_{i=1}^{n} x_i$

3. If both $a$ and $b$ are constants, then

$\sum_{i=1}^{n} (a + b x_i) = na + b \sum_{i=1}^{n} x_i$

Properties of Expected Value


1. The expected value of a constant is equal to the same constant. Hence if $c$ is a constant,
$E(c) = c$
2. The expected value of the sum of two random variables is equal to the sum of the expected values of the two random variables. That is, for the random variables X and Y,
$E(X + Y) = E(X) + E(Y)$

3. Also

$E(XY) \ne E(X)E(Y)$

That is, generally, the expected value of the product of two random variables is not equal to the product of the expected values of those random variables. However, there is an exception to the rule: if X and Y are independent then
$E(XY) = E(X)E(Y)$
4. If $a$ is a constant, then
$E(aX) = aE(X)$
That is to say, the expected value of a constant times a random variable X is equal to the constant times the expected value of the R.V.
5. If $a$ and $b$ are constants, then
$E(aX + b) = E(aX) + E(b)$
$= aE(X) + b$
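The properties above can be checked numerically on a small distribution. The sketch below reuses the two-coin PMF from Example 1.1; the helper `expect` and the constants `a = 3`, `b = 5` are arbitrary illustrative choices.

```python
from fractions import Fraction


def expect(pmf, g=lambda x: x):
    """E[g(X)] = sum of g(x) * P(X = x) over the support of X."""
    return sum(g(x) * p for x, p in pmf.items())


# Number of heads in two tosses of a fair coin (Example 1.1).
coin_pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

a, b = 3, 5  # arbitrary constants chosen for the check
lhs = expect(coin_pmf, lambda x: a * x + b)   # E(aX + b)
rhs = a * expect(coin_pmf) + b                # aE(X) + b

# Property 3: X is not independent of itself, so E(X * X) != E(X) * E(X).
e_x_squared = expect(coin_pmf, lambda x: x * x)
square_of_mean = expect(coin_pmf) ** 2

print(lhs, rhs, e_x_squared, square_of_mean)
```

The gap between E(X²) and (E(X))² seen here is exactly the variance introduced in the next subsection.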

1.5.2 Variance of a Discrete Probability Distribution


1.5.2.1 Variance and standard deviation

Variance indicates how individual values are spread, dispersed or distributed around the mean value. The statistical concept of variance is also a useful measure of risk of any kind. Generally, if X is a discrete random variable, then its variance is given by:

$Var(X) = E[X - E(X)]^2$
$= E(X^2) - (E(X))^2$
$= E(X^2) - \mu^2$
$= \sigma^2$
where $E(X^2) = \sum x^2 f(x)$
The standard deviation of X is therefore given by
$SD(X) = \sqrt{Var(X)}$
$= \sigma$
Example 1.4
A company estimates the net profit for a new product to be launched with its corresponding
probabilities under different market conditions as follows;

Market Condition               Good    Fair    Poor
Net Profit (in million Tsh.)   30      10      -3
Probability, P(X = x)          0.15    0.25    0.60

Required:
a) Calculate the expected value of the net profit for the company.
b) What is the standard deviation of the net profit?

Solution:
a) We know that

$E(X) = \sum_{i=1}^{n} x_i P(X = x_i)$

Market Condition    Net Profit (in million Tsh), x    Probability, P(x)    x P(x)
Good                30                                0.15                 4.5
Fair                10                                0.25                 2.5
Poor                -3                                0.6                  -1.8
Total                                                                      5.2

$E(X) = \sum x P(X = x) = 30 \times 0.15 + 10 \times 0.25 - 3 \times 0.6 = 5.2$

Therefore the expected value of the net profit for the company under the three given market conditions is 5.2 million Tsh.

b) The variance is given by

$Var(X) = E(X^2) - (E(X))^2$

Market Condition    Net Profit (in million Tsh), x    Probability, P(x)    x P(x)    x² P(x)
Good                30                                0.15                 4.5       135
Fair                10                                0.25                 2.5       25
Poor                -3                                0.6                  -1.8      5.4
Total                                                                      5.2       165.4

$Var(X) = 165.4 - 5.2^2 = 138.36$

The standard deviation is given by:

$\sigma = \sqrt{Var(X)} = \sqrt{138.36} = 11.76$
Therefore the standard deviation of the net profit for the company under the three given market conditions is 11.76 million Tsh. This measures how far the net profit typically deviates from the expected value of 5.2 million Tsh. Thus, although the expected net profit is about 5.2 million Tsh, the realised profit may typically lie above or below this value by about 11.76 million Tsh. You may calculate a confidence interval to estimate the interval in which the expected net profit will fall.
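The tabular calculation in Example 1.4 can be reproduced with a few lines of Python. The function name `mean_var_sd` is an assumption made for this sketch; it applies the shortcut formula Var(X) = E(X²) − (E(X))² used in the solution.

```python
import math


def mean_var_sd(values, probs):
    """Return (E(X), Var(X), SD(X)) for a discrete distribution."""
    mean = sum(x * p for x, p in zip(values, probs))            # sum of x P(x)
    mean_of_squares = sum(x * x * p for x, p in zip(values, probs))  # sum of x^2 P(x)
    variance = mean_of_squares - mean ** 2                      # E(X^2) - (E(X))^2
    return mean, variance, math.sqrt(variance)


# Net profit (million Tsh) under Good, Fair and Poor market conditions.
profits = [30, 10, -3]
probabilities = [0.15, 0.25, 0.60]

mean, variance, sd = mean_var_sd(profits, probabilities)
print(round(mean, 2), round(variance, 2), round(sd, 2))
```

The same function can be applied to the data of the later examples by swapping in their value and probability lists.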

Example 1.5

The return of a certain investment B ($X$), in percentage, is a discrete random variable whose probability distribution is as shown below:

x        -10     -6      0       4       8       12
Prob.    0.15    0.2     0.15    0.1     0.15    0.25

Find the expected value and the standard deviation of $X$.

Solution
Let $X$ be the return of the investment. For the purpose of simplification, we can summarise the sums as indicated in the Table below:

x        P(x)    x P(x)    x² P(x)
-10      0.15    -1.5      15
-6       0.2     -1.2      7.2
0        0.15    0         0
4        0.1     0.4       1.6
8        0.15    1.2       9.6
12       0.25    3         36
TOTAL            1.9       69.4
Hence:

$E(X) = \sum x P(x)$

= 1.9

$Var(X) = E(X^2) - (E(X))^2$

$= \sum x^2 P(x) - (E(X))^2$

$= 69.4 - 1.9^2$
= 65.79
$SD(X) = \sqrt{Var(X)}$

$= \sqrt{65.79}$
= 8.11

Example 1.6

The monthly incomes of workers in a certain sector, in millions of TShs, with their associated probabilities are as indicated in the following probability distribution:

Income         1.4     3.5     2.0     0.9     3.0
Probability    0.25    0.2     0.15    0.1     0.3

Find the expected income and the standard deviation of income for all workers.
Solution:
Let $X$ be the monthly income of a worker. For the purpose of simplification, we can summarise the sums as indicated in the Table below:

x        P(x)    x P(x)    x² P(x)
1.4      0.25    0.35      0.49
3.5      0.2     0.7       2.45
2        0.15    0.3       0.6
0.9      0.1     0.09      0.081
3        0.3     0.9       2.7
TOTAL            2.34      6.321
Hence:

$E(X) = \sum x P(x)$

= 2.34

$Var(X) = E(X^2) - (E(X))^2$

$= \sum x^2 P(x) - (E(X))^2$

$= 6.321 - 2.34^2$
= 0.8454

$SD(X) = \sqrt{Var(X)}$

$= \sqrt{0.8454}$
≈ 0.9195

Properties of variance

1. The variance of a constant is zero. That is to say, if $c$ is a constant, then

$Var(c) = 0$

2. If X and Y are two independent random variables, then
$Var(X + Y) = Var(X) + Var(Y)$ and

$Var(X - Y) = Var(X) + Var(Y)$

3. If $b$ is a constant, then
$Var(X + b) = Var(X)$

4. If $a$ is a constant, then
$Var(aX) = a^2 Var(X)$

5. If X and Y are independent random variables and $a$ and $b$ are constants, then
$Var(aX + bY) = a^2 Var(X) + b^2 Var(Y)$
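Properties 2 and 5 can be verified by brute force: build the distribution of aX + bY for two independent variables and compare its variance with the formula. The two PMFs and the constants below are hypothetical values chosen only for illustration.

```python
from fractions import Fraction
from itertools import product


def variance(pmf):
    """Var(X) = E[(X - mu)^2] for a discrete PMF given as {value: probability}."""
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


def linear_combo_pmf(pmf_x, pmf_y, a, b):
    """PMF of aX + bY, assuming X and Y are independent."""
    combined = {}
    for (x, p), (y, q) in product(pmf_x.items(), pmf_y.items()):
        z = a * x + b * y
        # Independence: P(X = x, Y = y) = P(X = x) * P(Y = y).
        combined[z] = combined.get(z, 0) + p * q
    return combined


# Hypothetical independent variables.
pmf_x = {0: Fraction(1, 2), 1: Fraction(1, 2)}
pmf_y = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}
a, b = 2, -3

var_direct = variance(linear_combo_pmf(pmf_x, pmf_y, a, b))
var_formula = a ** 2 * variance(pmf_x) + b ** 2 * variance(pmf_y)
var_difference = variance(linear_combo_pmf(pmf_x, pmf_y, 1, -1))  # Var(X - Y)

print(var_direct, var_formula, var_difference)
```

Note that the variance of the difference X − Y comes out as the sum of the variances, not their difference, as property 2 states.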

1.8 Multivariate Or Joint Probability Distribution


So far we have discussed discrete probability distributions for the case of a single random variable (the univariate case). However, the outcomes of many experiments are described by more than one random variable (R.Vs), in which case we must find their joint probability distribution. Such probability distributions are called Multivariate or Joint Probability Distributions, and the simplest is the bivariate, or two-variable, probability distribution, which we discuss below.

1.8.1 Discrete Joint Probability Distribution


If X and Y are discrete random variables, the function given by $f(x, y) = P(X = x, Y = y)$ for each pair of values $(x, y)$ within the range of $X$ and $Y$ is called the joint probability distribution of $X$ and $Y$.

Properties of joint Probability Distribution

1. $f(x, y) \ge 0$
2. $\sum_x \sum_y f(x, y) = 1$

Example 1.6

Find the value of $a$ if the joint probability distribution is defined as

$$f(x, y) = \begin{cases} a(x^2 + y) & \text{for } x = 1, 2 \text{ and } y = 2, 3 \\ 0 & \text{otherwise} \end{cases}$$

Solution
Recall one of the conditions of a joint probability distribution function, that is $\sum_x \sum_y f(x, y) = 1$.

It implies that,
$f(1, 2) + f(1, 3) + f(2, 2) + f(2, 3) = 1$
$3a + 4a + 6a + 7a = 1$
$20a = 1$
Therefore:

$a = 0.05$
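The normalising constant in this example can be found mechanically by summing the unnormalised weights over the support. This sketch assumes the joint function has the form a(x² + y), i.e. that the operator lost in the source between x² and y is a plus sign; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Support of the joint distribution: x in {1, 2}, y in {2, 3}.
support = list(product([1, 2], [2, 3]))

# Unnormalised weights f(x, y) / a, ASSUMING f(x, y) = a * (x**2 + y).
weights = {(x, y): x ** 2 + y for x, y in support}

# The probabilities must sum to 1, so a = 1 / (sum of weights).
a = Fraction(1, sum(weights.values()))
joint = {xy: a * w for xy, w in weights.items()}

print(a, float(a))
for xy, p in joint.items():
    print(xy, p)
```

If the intended function were different, only the `weights` line would change; the normalisation step stays the same.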

1.8.2 Marginal Discrete Probability Distribution


This is the probability that one random variable, say X, assumes a given value regardless of the value taken by another variable, say Y. It is obtained by summing the joint probabilities over all values of the other variable. More generally, if X and Y are two discrete random variables and $f(x, y)$ is the joint probability distribution at $(x, y)$, then the function given by

$g(x) = \sum_y f(x, y)$

for each $x$ within the range of X is called the marginal distribution of X. Similarly, the function

$h(y) = \sum_x f(x, y)$ for each $y$ within the range of Y is called the marginal distribution of Y.
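A minimal sketch of how marginal distributions are obtained from a joint table follows. The joint probabilities used are those of the preceding worked example under the assumption f(x, y) = (x² + y)/20; the function name `marginals` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Joint PMF, assuming f(x, y) = (x**2 + y) / 20 for x in {1, 2}, y in {2, 3}.
joint_pmf = {(1, 2): Fraction(3, 20), (1, 3): Fraction(4, 20),
             (2, 2): Fraction(6, 20), (2, 3): Fraction(7, 20)}


def marginals(joint):
    """Return (g, h): g(x) sums f(x, y) over y, h(y) sums f(x, y) over x."""
    g = defaultdict(Fraction)  # marginal of X; Fraction() starts at 0
    h = defaultdict(Fraction)  # marginal of Y
    for (x, y), p in joint.items():
        g[x] += p
        h[y] += p
    return dict(g), dict(h)


g, h = marginals(joint_pmf)
print(g)
print(h)
```

Each marginal is itself a valid PMF, so its probabilities sum to 1.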

1.9 Covariance And Correlation


1.9.1 Covariance
Covariance is a measure of how two random variables change together. If the two variables increase or decrease together, we get a positive covariance. However, if they vary inversely, or move in opposite directions, they give a negative covariance. Let X and Y be two random variables; then the covariance between the two variables is defined as:

$Cov(X, Y) = E[(X - E(X))(Y - E(Y))]$

$= E(XY) - E(X)E(Y)$

$= E(XY) - \mu_x \mu_y$

where $\mu_x = E(X)$ and $\mu_y = E(Y)$.

To compute the covariance as defined in the above equation, we use the following formula:

$Cov(X, Y) = \sum_x \sum_y (x - \mu_x)(y - \mu_y) f(x, y)$

$= \sum_x \sum_y x y f(x, y) - \mu_x \mu_y$

where $f(x, y)$ is the joint probability distribution function of the two random variables X and Y. The double summation sign in this expression indicates that covariance requires summation over the range of values of both variables.

Properties Of Covariance

1. If X and Y are independent random variables, their covariance is zero. This can be verified
as follows. Recall that if two random variables are independent, then

$$E(XY) = E(X)E(Y) = \mu_x \mu_y$$

Substituting this expression into the covariance formula above, we see at once that the covariance of
two independent random variables is zero.

2. If $a$, $b$, $c$, and $d$ are constants, then

$$\mathrm{Cov}(a + bX, c + dY) = bd\,\mathrm{Cov}(X, Y)$$

3. $\mathrm{Cov}(X, X) = \mathrm{Var}(X)$; that is, the covariance of a random variable with itself is simply the variance
of that variable.

1.9.2 Correlation
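The correlation coefficient is a numerical value that measures the strength of the linear relationship between
two random variables. If X and Y are two random variables, their correlation coefficient, denoted
by $\rho(X, Y)$ or $\rho_{xy}$, is given by
$$\rho_{xy} = \frac{\mathrm{Cov}(X, Y)}{\sigma_x \sigma_y}$$
where $\sigma_x$ and $\sigma_y$ are the standard deviations of X and Y.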

1.9.3 Properties Of Correlation Coefficient


Like covariance, $\rho_{xy}$ can be either positive or negative.
The correlation coefficient always lies between −1 and +1. Symbolically,
$$-1 \le \rho_{xy} \le 1$$

If the correlation coefficient is +1, the two variables are perfectly positively correlated,
whereas if it is −1, they are perfectly negatively correlated. If it is 0,
there is no linear relationship. As a rule of thumb, $0.8 \le |\rho_{xy}| < 1$ indicates a very strong linear
relationship, $0.6 \le |\rho_{xy}| < 0.8$ indicates a strong one, and $|\rho_{xy}| \le 0.3$ indicates a weak
linear relationship.

Example 1.7

The following table gives the joint PDF of two random variables X and Y, where X represents the
first-year rate of return (%) expected from investment A, and Y stands for the first-year rate of
return (%) expected from investment B.

                    X (%)
Y (%)      -10       0      20      30
  20      0.27    0.08    0.16    0.00
  50      0.00    0.04    0.10    0.35

a) Find the marginal distribution of X and hence the expected rate of return and standard
deviation
b) Find the marginal distribution of Y and hence the expected rate of return and standard
deviation
c) Compute the covariance and the correlation coefficient between the two returns and comment on
the results
d) Are the rates of return from the two investments independent?

Solution:
The marginal distribution of X is given below:

x         -10       0      20      30
g(x)     0.27    0.12    0.26    0.35

The corresponding sums are summarised below:

x         g(x)    x g(x)    x^2 g(x)
-10       0.27     -2.7        27
0         0.12      0           0
20        0.26      5.2       104
30        0.35     10.5       315
TOTAL              13         446

Hence:

$$E(X) = \sum_x x\,g(x) = 13$$

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \sum_x x^2 g(x) - \Big[\sum_x x\,g(x)\Big]^2 = 446 - 13^2 = 277$$

$$SD(X) = \sqrt{\mathrm{Var}(X)} = \sqrt{277} = 16.64$$
The marginal distribution of Y is given below:

y         20      50
h(y)    0.51    0.49

From the above distribution:

$$E(Y) = \sum_y y\,h(y) = 20(0.51) + 50(0.49) = 10.2 + 24.5 = 34.7$$

$$\mathrm{Var}(Y) = E(Y^2) - [E(Y)]^2 = \sum_y y^2 h(y) - \Big[\sum_y y\,h(y)\Big]^2$$

$$= 20^2(0.51) + 50^2(0.49) - 34.7^2 = 204 + 1225 - 1204.09 = 224.91$$

$$SD(Y) = \sqrt{\mathrm{Var}(Y)} = \sqrt{224.91} = 15.00$$
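
Parts (c) and (d) of Example 1.7 can be checked numerically. The following is a minimal sketch, not part of the original solution: it stores the joint table as a dictionary, recomputes the marginal distributions, and then applies $\mathrm{Cov}(X, Y) = E(XY) - \mu_x \mu_y$ and the independence condition $f(x, y) = g(x)h(y)$. The helper names `marginal`, `mean`, and `variance` are illustrative choices.

```python
from math import sqrt

# Joint distribution from Example 1.7: keys are (x, y) pairs in percent.
joint = {
    (-10, 20): 0.27, (0, 20): 0.08, (20, 20): 0.16, (30, 20): 0.00,
    (-10, 50): 0.00, (0, 50): 0.04, (20, 50): 0.10, (30, 50): 0.35,
}

def marginal(joint, axis):
    """Sum the joint probabilities over the other variable."""
    out = {}
    for pair, prob in joint.items():
        out[pair[axis]] = out.get(pair[axis], 0.0) + prob
    return out

def mean(dist):
    return sum(v * p for v, p in dist.items())

def variance(dist):
    m = mean(dist)
    return sum(v * v * p for v, p in dist.items()) - m * m

g = marginal(joint, 0)  # marginal distribution of X
h = marginal(joint, 1)  # marginal distribution of Y

e_xy = sum(x * y * p for (x, y), p in joint.items())
cov_xy = e_xy - mean(g) * mean(h)
rho_xy = cov_xy / sqrt(variance(g) * variance(h))

# X and Y are independent only if f(x, y) = g(x) h(y) for every pair.
independent = all(abs(p - g[x] * h[y]) < 1e-12 for (x, y), p in joint.items())

print(round(cov_xy, 2), round(rho_xy, 4), independent)
```

Under these inputs the covariance is positive and the independence check fails, since for instance $f(-10, 20) = 0.27$ differs from $g(-10)h(20) = 0.27 \times 0.51$.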

1.10 Relationship Between Population And Sample

So far we have seen how to compute several characteristics of the PDF of discrete random variables,
such as the expected value, variance, standard deviation, covariance, and correlation coefficient. All
of these are population quantities. In practice, when conducting quantitative research it is usually
difficult to deal with the whole population unless the population is finite and small. In most
cases we therefore use a sample, which is a subset of the population, to draw conclusions about the
properties of that population. This is the basis of so-called inferential statistics, which will be
discussed in later sections. Meanwhile, the sample counterparts and their
corresponding population quantities are summarized in the following table.

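Population variable | Sample counterpart (raw data)

$E(X) = \sum_x x f(x) = \mu_x$ | $\bar{x} = \frac{\sum x_i}{n}$

$\mathrm{Var}(X) = E\big[(X - E(X))^2\big] = \sum_x (x - \mu_x)^2 f(x) = \sigma_x^2$ | $s_x^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$

$SD(X) = \sqrt{\mathrm{Var}(X)} = \sigma_x$ | $SD(x) = \sqrt{s_x^2} = s_x$

$\mathrm{Cov}(X, Y) = \sum_x \sum_y (x - \mu_x)(y - \mu_y) f(x, y) = E(XY) - \mu_x \mu_y$ | sample $\mathrm{Cov}(X, Y) = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$

$\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{SD(X)\,SD(Y)} = \rho_{xy}$ | $r(X, Y) = \frac{\text{sample } \mathrm{Cov}(X, Y)}{s_x s_y} = r_{xy}$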

Example 1.8
Returns (in millions of shillings) from samples of two investment projects, X and Y, were recorded
as follows:

X    3    4    6    8    7
Y    2    5    7    8   10

(a) Compute the mean return and standard deviation for each Project.
(b) Compute the correlation between the returns and comment on your results.

Hint: Let the students find the solution of the above problem
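
Once the exercise has been attempted by hand, the result can be checked with a short script. This is a sketch only, using the sample formulas with the $n - 1$ divisor; the function names `sample_mean` and `sample_cov` are illustrative.

```python
from math import sqrt

x = [3, 4, 6, 8, 7]
y = [2, 5, 7, 8, 10]

def sample_mean(data):
    return sum(data) / len(data)

def sample_cov(a, b):
    """Sample covariance with the n - 1 divisor; sample_cov(a, a) is the sample variance."""
    ma, mb = sample_mean(a), sample_mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)

sd_x = sqrt(sample_cov(x, x))
sd_y = sqrt(sample_cov(y, y))
r_xy = sample_cov(x, y) / (sd_x * sd_y)

print(sample_mean(x), sample_mean(y))
print(round(sd_x, 4), round(sd_y, 4), round(r_xy, 4))
```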

1.11 Linear Combination Of Random Variables


In this subsection we discuss the concept of a linear combination of two or more random
variables, together with the expected value and variance of such a
combination. Mathematically, if $X_1, X_2, \dots, X_n$ are random variables and $a_1, a_2, \dots, a_n$ are
constants, then

$$Y = a_1 X_1 + a_2 X_2 + \dots + a_n X_n = \sum_{i=1}^{n} a_i X_i = a'X$$

is a linear combination of the random variables, where $a'$ is a row vector (a row matrix) and $X$ is a
column vector (a column matrix). That is to say, a linear combination of random variables can also be
expressed in matrix notation.

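1.11.1 Properties of Linear Combination of Random Variables


If $X_1, X_2, \dots, X_n$ are random variables with a linear combination

$$Y = a_1 X_1 + a_2 X_2 + \dots + a_n X_n = \sum_{i=1}^{n} a_i X_i$$

then

$$E(Y) = a_1 E(X_1) + a_2 E(X_2) + \dots + a_n E(X_n) = \sum_{i=1}^{n} a_i E(X_i)$$

$$E(Y) = a_1 \mu_1 + a_2 \mu_2 + \dots + a_n \mu_n = \sum_{i=1}^{n} a_i \mu_i = a'\mu$$

where $a_1, a_2, \dots, a_n$ are constants.

The variance of the same linear combination is

$$\mathrm{Var}(Y) = \sum_{i=1}^{n} a_i^2 \mathrm{Var}(X_i) + 2 \sum_{i<j} a_i a_j \mathrm{Cov}(X_i, X_j)$$

and if $X_1, X_2, \dots, X_n$ are independent random variables, then

$$\mathrm{Var}(Y) = \sum_{i=1}^{n} a_i^2 \mathrm{Var}(X_i)$$

Consider two random variables:

$$Y = a_1 X_1 + a_2 X_2$$
Then
$$E(Y) = a_1 E(X_1) + a_2 E(X_2)$$
$$\mathrm{Var}(Y) = a_1^2 \mathrm{Var}(X_1) + a_2^2 \mathrm{Var}(X_2) + 2 a_1 a_2 \mathrm{Cov}(X_1, X_2)$$
For three random variables:
$$Y = a_1 X_1 + a_2 X_2 + a_3 X_3$$
$$E(Y) = a_1 E(X_1) + a_2 E(X_2) + a_3 E(X_3)$$
$$\mathrm{Var}(Y) = a_1^2 \mathrm{Var}(X_1) + a_2^2 \mathrm{Var}(X_2) + a_3^2 \mathrm{Var}(X_3) + 2 a_1 a_2 \mathrm{Cov}(X_1, X_2) + 2 a_1 a_3 \mathrm{Cov}(X_1, X_3) + 2 a_2 a_3 \mathrm{Cov}(X_2, X_3)$$
If we have three or more random variables, it is more convenient to use the matrix approach to
compute the mean and variance of the linear combination.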

Example 1.9: Application in Portfolio mathematics:
Eighty percent of a portfolio is invested in TBL stock and the remaining 20% is invested in UTT
stock. TBL stock has an expected return of 6% and an expected standard deviation of return of 9%.
UTT stock has an expected return of 20% and an expected standard deviation of 30%. The coefficient
of correlation between the returns of the two securities is expected to be 0.4. Determine the following:

a) the expected return of the portfolio
b) the expected variance of the portfolio
c) the expected standard deviation of the portfolio
d) the expected variance of the portfolio if the two securities are independent

Solution:
Let $X_1$ be the return on TBL, $X_2$ the return on UTT, and $w_1$, $w_2$ denote the
corresponding portfolio weights.

Data Given:
$w_1 = 80\%$, $w_2 = 20\%$, $E(X_1) = 6\%$, $E(X_2) = 20\%$, $SD(X_1) = 9\%$, $SD(X_2) = 30\%$, $\rho_{12} = 0.4$

a) $E(R_p) = w_1 E(X_1) + w_2 E(X_2)$

$= (0.8 \times 0.06) + (0.2 \times 0.2)$
$= 0.048 + 0.04$
$= 0.088$
$= 8.8\%$

b) $\mathrm{Var}(R_p) = w_1^2 \mathrm{Var}(X_1) + w_2^2 \mathrm{Var}(X_2) + 2 w_1 w_2 \mathrm{Cov}(X_1, X_2)$

Where:
$\mathrm{Cov}(X_1, X_2) = \rho_{12}\, SD(X_1)\, SD(X_2)$

Therefore:

$\mathrm{Var}(R_p) = w_1^2 \mathrm{Var}(X_1) + w_2^2 \mathrm{Var}(X_2) + 2 w_1 w_2 \rho_{12}\, SD(X_1)\, SD(X_2)$

$= 0.8^2 \times 0.09^2 + 0.2^2 \times 0.3^2 + 2 \times 0.8 \times 0.2 \times 0.4 \times 0.09 \times 0.3$
$= 0.005184 + 0.0036 + 0.003456$
$= 0.01224$

c) $SD(R_p) = \sqrt{\mathrm{Var}(R_p)}$

$= \sqrt{0.01224}$
$= 0.1106$
d) If the two securities are independent, then $\mathrm{Cov}(X_1, X_2) = 0$

Therefore:

$\mathrm{Var}(R_p) = w_1^2 \mathrm{Var}(X_1) + w_2^2 \mathrm{Var}(X_2)$

$= 0.8^2 \times 0.09^2 + 0.2^2 \times 0.3^2$

$= 0.008784$
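
The portfolio quantities in Example 1.9 can be recomputed with a few lines of code, which also illustrates the matrix form of the variance of a linear combination. This is a sketch under the example's stated inputs; the function name `portfolio_stats` and the nested-list correlation matrix are illustrative choices, not notation from the notes.

```python
from math import sqrt

def portfolio_stats(weights, means, sds, corr):
    """Return (expected return, variance) of a weighted sum of returns.

    corr[i][j] is the correlation between assets i and j.
    """
    n = len(weights)
    expected = sum(w * m for w, m in zip(weights, means))
    # Var(Y) = sum_i sum_j w_i w_j Cov(X_i, X_j), the expanded matrix form
    var = sum(
        weights[i] * weights[j] * corr[i][j] * sds[i] * sds[j]
        for i in range(n)
        for j in range(n)
    )
    return expected, var

weights, means, sds = [0.8, 0.2], [0.06, 0.20], [0.09, 0.30]
exp_ret, var = portfolio_stats(weights, means, sds, [[1.0, 0.4], [0.4, 1.0]])
_, var_indep = portfolio_stats(weights, means, sds, [[1.0, 0.0], [0.0, 1.0]])

print(round(exp_ret, 4), round(var, 6), round(sqrt(var), 4), round(var_indep, 6))
```

The double loop counts each off-diagonal pair twice, which reproduces the $2 w_1 w_2 \mathrm{Cov}(X_1, X_2)$ term of the two-asset formula and extends unchanged to three or more assets.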

1.12 Special Types of Probability Distributions


We have seen in the previous sections that a random variable can be described by specific
characteristics of its probability distribution, such as the expected value and variance. However, this
presumes that the PDF of the random variable is known. In practice, the PDFs of many random variables
have already been established by statisticians. In the following sections we discuss
a few examples of such probability distributions: the binomial distribution, the Poisson
distribution, the normal distribution, the t-distribution, and the sampling distributions of sample
statistics, which will be discussed in the next chapter.

1.12.1 Binomial Distribution

1.12.1.1 Definition of key terms

1.12.1.2 Binomial experiment

A binomial experiment is a statistical experiment that consists of repeated trials. Each trial can
result in just two possible outcomes. We call one of these outcomes a success and the other a
failure. The probability of success, denoted by p, is constant on every trial.

Examples of binomial experiment

 An experiment of tossing a fair coin


 An experiment of selecting a defective item from a given sample
 Asking 10 people if they have ever been to South Africa

1.12.1.3 Binomial Random Variable

This is the number of successes denoted by a letter X in n repeated trials of a binomial experiment.

Binomial Probability Distribution


To understand binomial distributions and binomial probability, it helps to understand binomial
experiments and some associated notation. A binomial experiment (also known as a
Bernoulli process), as explained in the previous section, is a random experiment that has
the following properties:

i) The experiment consists of a finite number n of repeated trials.
ii) Each trial can result in just two mutually exclusive outcomes. We call one of these
outcomes a success and the other a failure.
iii) The probability of success, denoted by p, is the same (constant) on every trial.
iv) The trials are independent; that is, the outcome of one trial does not affect the outcome of
other trials.

Consider the following random experiment. You flip a coin 10 times and count the number of times
the coin lands on tails. This is a binomial experiment because:

 The experiment consists of repeated trials. We flip a coin 10 times.


 Each trial can result in just two possible outcomes – it's either heads or tails.
 The probability of success is constant, i.e. 0.5 on every trial.
 The trials are independent; that is, getting tails on one trial does not affect whether we get
tails on other trials.

Notation: Now, let us suppose that we have:

• x = the number of successes that result from the binomial experiment.
• n = the number of trials in the binomial experiment.
• p = the probability of success on an individual trial.
• q = the probability of failure on an individual trial (this is equal to 1 − p).
• b(x; n, p) = the binomial probability, that is, the probability that an n-trial binomial experiment
results in exactly x successes when the probability of success on an individual trial is p.

Then the model for specifying the probability of obtaining exactly x successes in a given number of
trials n is given by

$$b(x; n, p) = P(X = x) = \binom{n}{x} p^x q^{n-x} \quad \text{for } x = 0, 1, 2, \dots, n$$

Note: $\binom{n}{x} = {}^{n}C_x = \frac{n!}{x!\,(n - x)!}$

1.12.1.4 Mean, Variance, and Standard Deviation for a Binomial Random Variable
The expected value (mean) of a binomial random variable is given by $E(X) = np$, and the standard
deviation is given by $SD(X) = \sqrt{np(1 - p)}$. That is,

Mean: $\mu = np$

Variance: $\sigma^2 = npq$

Standard Deviation: $\sigma = \sqrt{npq}$

Example 1.10
Given that the expected value of a binomial distribution is 40 and the standard deviation is 6.

Required:
Calculate n, p and q.

Solution:
From
$$\mu = np = 40 \qquad (1)$$
and
$$\sigma^2 = npq = 36 \qquad (2)$$
Substituting (1) into (2), we have
$$40q = 36$$
$$q = 0.9, \quad p + q = 1 \Rightarrow p = 0.1$$
Thus $p = 0.1$, $q = 0.9$, and $n = 40 / 0.1 = 400$.
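
The same algebra can be written as a small helper, shown here as an illustrative sketch. It uses exact fractions so that the division steps $q = \sigma^2 / \mu$ and $n = \mu / p$ do not pick up floating-point noise; the function name `binomial_parameters` is a hypothetical choice.

```python
from fractions import Fraction

def binomial_parameters(mean, sd):
    """Recover (n, p, q) from the mean np and the standard deviation sqrt(npq)."""
    mean, variance = Fraction(mean), Fraction(sd) ** 2
    q = variance / mean  # npq / np = q
    p = 1 - q
    n = mean / p         # np / p = n
    return n, p, q

n, p, q = binomial_parameters(40, 6)
print(n, p, q)  # 400 1/10 9/10
```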
Example 1.11
Assume that on an average one telephone number out of 15 is busy.

Required:
What is the probability that if six randomly selected telephone numbers are dialled
a) Not more than three will be busy?
b) At least three of them will be busy?

Solution:
$p = \frac{1}{15}$, $q = \frac{14}{15}$, $n = 6$, then
a) $P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = 0.9997$ (use the binomial
distribution formula)

b) $P(X \ge 3) = 1 - P(X < 3) = 1 - [P(X = 0) + P(X = 1) + P(X = 2)] = 0.0051$ (use the same
approach as in part a)

Example 1.12

The probability that a student is accepted to a prestigious college is 0.3. If 5 students from the same
school apply, what is the probability that at most 2 are accepted?

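Data Given:

$p = 0.3$, $n = 5$
Find $P(X \le 2)$
$$P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2)$$
From

$$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x} = \frac{n!}{(n - x)!\,x!}\, p^x (1 - p)^{n - x}$$

$$P(X = 0) = \frac{5!}{(5 - 0)!\,0!} \times 0.3^0 \times 0.7^5 = \frac{5!}{5!\,0!} \times 0.3^0 \times 0.7^5 = 0.16807$$

$$P(X = 1) = \frac{5!}{(5 - 1)!\,1!} \times 0.3^1 \times 0.7^4 = \frac{5!}{4!\,1!} \times 0.3^1 \times 0.7^4 = 0.36015$$

$$P(X = 2) = \frac{5!}{(5 - 2)!\,2!} \times 0.3^2 \times 0.7^3 = \frac{5!}{3!\,2!} \times 0.3^2 \times 0.7^3 = 0.3087$$

Therefore:

$$P(X \le 2) = 0.16807 + 0.36015 + 0.3087 = 0.83692$$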
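
The binomial calculations in Examples 1.11 and 1.12 can be reproduced with a short script. This is a minimal sketch built on `math.comb` from the Python standard library; the helper names `binomial_pmf` and `binomial_cdf` are illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """b(x; n, p) = C(n, x) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_cdf(x, n, p):
    """P(X <= x) as a sum of individual binomial probabilities."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

# Example 1.11: six calls, each busy with probability 1/15
print(round(binomial_cdf(3, 6, 1 / 15), 4))      # 0.9997
print(round(1 - binomial_cdf(2, 6, 1 / 15), 4))  # 0.0051

# Example 1.12: five applicants, each accepted with probability 0.3
print(round(binomial_cdf(2, 5, 0.3), 5))         # 0.83692
```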

1.12.2 Poisson Probability Distribution


A Poisson experiment is a random experiment that has the following properties:
• The occurrences of events are independent.
• The average number of successes (λ) that occur in a specified region is known.
• There is a possibility of an infinite number of occurrences in a specified region.
• The probability that a success will occur is proportional to the size of the region.

Note: the specified region could take many forms. For instance, it could be a length, an area, a
volume, a period of time, etc.
1.12.2.1 Application of Poisson Distribution
• The number of deaths from COVID-19 globally in 2020
• The number of birth defects and genetic mutations
• The number of car accidents in Dar es Salaam city
• The number of typing errors on a page
• The spread of an endangered animal in Sub-Saharan Africa
• The number of failures of a machine in one month

Notation
The following notation is helpful when we talk about the Poisson distribution.
• e: A constant equal to approximately 2.71828 (e is the base of the natural
logarithm system).
• λ: The mean number of successes that occur in a specified region.
• x: The actual number of successes that occur in a specified region.
• P(x, λ): The Poisson probability that exactly x successes occur in a Poisson
experiment when the mean number of successes is λ.

The probability distribution of the Poisson random variable X, representing the number of
outcomes occurring in a given time interval or a specified region of space, is:

$$P(x, \lambda) = P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!} \quad \text{for } x = 0, 1, 2, \dots$$

where λ represents the average number of outcomes occurring in the specified time or region.
Furthermore, if X has a Poisson distribution, then $E(X) = \lambda$ and $SD(X) = \sqrt{\lambda}$.

Example 1.13
The average number of days a school is closed due to snow during winter in a certain city in the USA is
4. Calculate the probability that the schools in this city will close for 6 days during a winter.

Solution:

Given $\lambda = 4$ and $x = 6$, using the Poisson distribution:

$$P(X = 6) = \frac{e^{-4}\, 4^6}{6!} = 0.1042$$
Note: The Poisson distribution may be used to approximate the binomial distribution when n is
very large and p is very small. See the following example.

Example 1.14
Suppose that on average, 1 person in every 1000 is an alcoholic. Find the probability that a random
sample of 8000 people will yield fewer than 7 alcoholics.
Solution:
Let X represent the number of alcoholic persons.

$$p = \frac{1}{1000} = 0.001, \quad n = 8000$$
Since p is very small and n is very large,
$$\lambda = np = 0.001 \times 8000 = 8$$
Now,

$$P(X < 7) = P(X = 0) + P(X = 1) + P(X = 2) + \dots + P(X = 6)$$
$$= \frac{e^{-8}\, 8^0}{0!} + \frac{e^{-8}\, 8^1}{1!} + \frac{e^{-8}\, 8^2}{2!} + \dots + \frac{e^{-8}\, 8^6}{6!} = 0.3134$$
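
To see how close the Poisson approximation is in Example 1.14, the exact binomial probability can be computed alongside it. The following is a sketch using only the standard library; the helper names are illustrative, and no particular agreement beyond a few decimal places is claimed.

```python
from math import comb, exp, factorial

def binomial_cdf(x, n, p):
    """Exact P(X <= x) for a binomial(n, p) random variable."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))

def poisson_cdf(x, lam):
    """P(X <= x) for a Poisson random variable with mean lam."""
    return sum(exp(-lam) * lam**k / factorial(k) for k in range(x + 1))

n, p = 8000, 0.001
exact = binomial_cdf(6, n, p)    # P(X < 7) under the binomial model
approx = poisson_cdf(6, n * p)   # Poisson approximation with lambda = np

print(round(exact, 4), round(approx, 4))
```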

Example 1.15

The number of customers attended at CRDB bank follows Poisson distribution with a mean of 10
customers per hour, find the probability that in any given hour

a) exactly 6 customers will be attended


b) No customer will be attended
c) At least 2 customers will be attended

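Solution:
Data Given
$\lambda = 10$
From:

$$P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$$

a) $P(X = 6) = \frac{e^{-10} \times 10^6}{6!} = 0.0631$

b) $P(X = 0) = \frac{e^{-10} \times 10^0}{0!} = 0.0000454$

c) $P(X \ge 2) = 1 - P(X < 2)$
$$= 1 - \left[\frac{e^{-10} \times 10^0}{0!} + \frac{e^{-10} \times 10^1}{1!}\right]$$
$$= 1 - (0.0000454 + 0.000454)$$
$$= 0.9995$$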
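
The Poisson probabilities in Examples 1.13 and 1.15 can be verified with a short function. This is a minimal sketch using `math.exp` and `math.factorial`; the name `poisson_pmf` is an illustrative choice.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) = e^(-lam) lam^x / x! for x = 0, 1, 2, ..."""
    return exp(-lam) * lam**x / factorial(x)

print(round(poisson_pmf(6, 4), 4))      # Example 1.13: 0.1042

lam = 10                                # Example 1.15: 10 customers per hour
print(round(poisson_pmf(6, lam), 4))    # a) exactly 6 customers: 0.0631
print(poisson_pmf(0, lam))              # b) no customer: about 4.54e-05
at_least_two = 1 - (poisson_pmf(0, lam) + poisson_pmf(1, lam))
print(round(at_least_two, 4))           # c) at least 2 customers: 0.9995
```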

1.13 Continuous Probability Distributions


If a random variable is continuous, its probability distribution is called a continuous
probability distribution. There are different types of continuous distribution; the normal
distribution, perhaps the most widely used, is the one discussed in these
lecture notes.
A continuous probability distribution differs from a discrete probability distribution in several
ways.
• The probability that a continuous random variable will assume a particular value is zero.
• As a result, a continuous probability distribution cannot be expressed in tabular form.
• Instead, an equation or formula $f(x)$ is used to describe a continuous probability
distribution.

Most often, the equation used to describe a continuous probability distribution is called a
probability density function (PDF). Sometimes it is referred to simply as a density function. For a
continuous probability distribution, the density function has the following properties:

1. $f(x) \ge 0$
Since the continuous random variable is defined over a continuous range of values (called the
domain of the variable), the graph of the density function is also continuous over that
range. Unlike a probability, a density value may exceed 1.

2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
The area bounded by the curve of the density function and the x-axis is equal to 1 when
computed over the domain of the variable.

3. $A = P(a \le X \le b) = \int_a^b f(x)\,dx$
The probability that a random variable assumes a value between a and b is equal to the area
under the density function bounded by a and b.

1.14 Normal Distribution
The normal distribution is perhaps the single most important probability distribution involving
a continuous random variable. By definition, a continuous random variable X has a normal
distribution if its probability density function (PDF) is given by:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty$$

If X has a normal distribution, then $E(X) = \mu$ and $\mathrm{Var}(X) = \sigma^2$.


The common notation for a normal random variable is $X \sim N(\mu, \sigma^2)$, where ~ means "distributed as",
N stands for the normal distribution, and the quantities enclosed in brackets are the parameters of the
distribution, namely the population mean (expected value) and the variance.

1.14.1 Properties of normal distribution


1) It is bell-shaped ranging from negative to positive infinity
2) It is symmetrical around its mean value (see figure 1)
3) The area under the curve is unity

Figure 1: Normal Distribution Curve

1.14.2 Standard Normal distribution


The PDF of a normal random variable is somewhat complicated to use directly. Therefore, in order to find the
probability of a normal random variable, say X, we normally use the so-called standard normal
variable Z, defined as indicated below.

$$Z = \frac{X - \mu}{\sigma}$$

where Z has zero mean and unit variance. The common notation for a standard normal
random variable is:
$$Z \sim N(0, 1)$$

Therefore, a normally distributed random variable with a given mean and variance can be
converted to a standard normal variable (also known as a normal deviate), which greatly simplifies the task of
computing probabilities.

1.14.3 The Standard Normal Distribution Table


A standard normal distribution table shows the probability associated with a
particular z-score. The table used in the examples below gives the area between 0 and z, that is,
$P(0 \le Z \le z)$. Table rows show the whole number and tenths place of the z-score. Table
columns show the hundredths place.

Example 1.15
Find the probability that a z-score will be greater than 3.00 from the standard normal table.
Solution:
Required to find $P(Z > 3.00)$
$$P(Z > 3.00) = 0.5 - P(0 \le Z \le 3.00)$$
$$= 0.5 - 0.4987$$
$$= 0.0013$$

Example 1.16
It is given that the daily sale of bread in a bakery follows the normal distribution with a mean of
70 loaves and a variance of 9, i.e. $X \sim N(70, 9)$. What is the probability that on any given day the sale of
bread is greater than 75 loaves?
Solution:
Data Given:
$\mu = 70$, $\sigma^2 = 9$, $\sigma = 3$
Let X represent the daily sale of bread in the bakery, then:

$$P(X > 75) = P\left(\frac{X - \mu}{\sigma} > \frac{75 - 70}{3}\right)$$
$$= P(Z > 1.67)$$
$$= 0.5 - P(0 \le Z \le 1.67)$$
$$= 0.5 - 0.4525$$
$$= 0.0475$$

Example 1.17
An investor is considering purchasing a stock whose monthly return is approximately normally
distributed with an expected return of 0.01 and a standard deviation of 0.02. Use the standard
normal distribution table to find the probability that the stock return is positive.
Data Given:
$\mu = 0.01$, $\sigma = 0.02$
Let X represent the stock return, then required to find:
$$P(X \ge 0) = P\left(\frac{X - \mu}{\sigma} \ge \frac{0 - 0.01}{0.02}\right)$$
$$= P(Z \ge -0.5)$$
$$= P(Z \le 0.5)$$
$$= 0.5 + P(0 \le Z \le 0.5)$$
$$= 0.5 + 0.1915$$
$$= 0.6915$$

Example 1.18
The incomes, in thousands of dollars, of a given company are normally distributed with a mean of 20
and a standard deviation of 5. Find the probability that a selected income will be
a) More than twenty-five thousand dollars
b) Anywhere between eighteen and twenty-four thousand dollars
Data Given:
$\mu = 20$, $\sigma = 5$
Let X represent the income of the company, then required to find:

a) $$P(X > 25) = P\left(\frac{X - \mu}{\sigma} > \frac{25 - 20}{5}\right)$$
$$= P(Z > 1)$$
$$= 0.5 - P(0 \le Z \le 1)$$
$$= 0.5 - 0.3413$$
$$= 0.1587$$
b) $$P(18 \le X \le 24) = P\left(\frac{18 - 20}{5} \le \frac{X - \mu}{\sigma} \le \frac{24 - 20}{5}\right)$$
$$= P(-0.4 \le Z \le 0.8)$$
$$= P(0 \le Z \le 0.4) + P(0 \le Z \le 0.8)$$
$$= 0.1554 + 0.2881$$
$$= 0.4435$$

Example 1.19
Applicants for a certain job are given an aptitude test. Past experience shows that scores from the
test are normally distributed with a mean of 60 points and a standard deviation of 12 points. What
percentage of candidates would be expected to pass the test if a minimum score of 75 is required?
Data Given:
$\mu = 60$, $\sigma = 12$
Let X represent the score of a candidate; required to find $P(X \ge 75)$
Solution:
$$P(X \ge 75) = P\left(\frac{X - \mu}{\sigma} \ge \frac{75 - 60}{12}\right)$$
$$= P(Z \ge 1.25)$$
$$= 0.5 - P(0 \le Z \le 1.25)$$
$$= 0.5 - 0.3944$$
$$= 0.1056$$
Conclusion: About 10.56% of candidates would be expected to pass the test.

Example 1.20

For a group of 1800 employees of a manufacturing company, IQ is approximately normally
distributed with mean 110 and standard deviation 12. It is known from experience that for a
particular job only persons with IQs of at least 95 are intelligent enough to do it, but those with IQs
greater than 120 soon become bored and unhappy with it. On the basis of IQ alone, how many of
the 1800 employees would you expect to be suitable for the work?
Data Given:
$n = 1800$, $\mu = 110$, $\sigma = 12$

Let X represent the IQ of an employee. Required to find $P(95 \le X \le 120)$

Solution:
$$P(95 \le X \le 120) = P\left(\frac{95 - 110}{12} \le \frac{X - \mu}{\sigma} \le \frac{120 - 110}{12}\right)$$
$$= P(-1.25 \le Z \le 0.833)$$
$$= P(0 \le Z \le 1.25) + P(0 \le Z \le 0.833)$$
$$= 0.3944 + 0.2967$$
$$= 0.6911$$
$$\approx 69.11\%$$
Then:
Find 69.11% of the total employees:

$$= 0.6911 \times 1800$$
$$= 1244$$
Conclusion: The results indicate that about 1244 employees would be suitable for the work based on
IQ alone.
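
The table lookups in Examples 1.16 to 1.20 can be cross-checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch only; the helper `prob_between` is an illustrative name. Because the code does not round z to two decimal places, its results can differ from the table-based answers in the third or fourth decimal place.

```python
from statistics import NormalDist

def prob_between(mu, sigma, low=None, high=None):
    """P(low <= X <= high) for X ~ N(mu, sigma^2); None means unbounded."""
    dist = NormalDist(mu, sigma)
    upper = 1.0 if high is None else dist.cdf(high)
    lower = 0.0 if low is None else dist.cdf(low)
    return upper - lower

p_bread = prob_between(70, 3, low=75)              # Example 1.16
p_stock = prob_between(0.01, 0.02, low=0)          # Example 1.17
p_income_a = prob_between(20, 5, low=25)           # Example 1.18 a)
p_income_b = prob_between(20, 5, low=18, high=24)  # Example 1.18 b)
p_pass = prob_between(60, 12, low=75)              # Example 1.19
p_iq = prob_between(110, 12, low=95, high=120)     # Example 1.20

for value in (p_bread, p_stock, p_income_a, p_income_b, p_pass, p_iq):
    print(round(value, 4))
print(round(p_iq * 1800))  # expected number of suitable employees
```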

Review Questions:
1. Suppose that in one year the number of industrial accidents follows a Poisson distribution with
mean 3. What is the probability that in a given year there will be at least 1 accident?

2. A company owns 400 laptops. Each laptop has an 8% probability of not working. You randomly
select 20 laptops for your salespeople.
(a) What is the likelihood that 5 will be broken?
(b) What is the likelihood that they will all work?
(c) What is the likelihood that they will all be broken?

3. The LMB Company manufactures tires. They claim that only 0.007 of LMB tires are defective.
What is the probability of finding 2 defective tires in a random sample of 50 LMB tires?

4. A study was conducted at Muhimbili National Hospital by the National Institute for Medical
Research (NIMR) to examine national attitudes about “SP” drugs. The study revealed that
about 70% believe “SP” doesn’t really cure malaria; it just covers up the real trouble.
According to this study, what is the probability that at least 3 of the next 5 people selected at
random will be of the opinion that SP doesn’t really cure the problem but just covers it up?

5. On an average a certain intersection results in 3 traffic accidents per month. What is the
probability that in any given month at this intersection
i. Exactly 5 accidents will occur?
ii. Less than 3 accidents will occur?
iii. At least 2 accidents will occur?

6. An average light bulb manufactured by the Acme Corporation lasts 300 days with a standard
deviation of 50 days. Assuming that bulb life is normally distributed, what is the probability
that an Acme light bulb will last at most 365 days?

7. The heights of 1000 students are normally distributed with a mean of 174.5 centimeters and a
standard deviation of 6.9 centimeters. Assuming that the heights are recorded to the nearest
half of a centimeter, how many of these students would you expect to have heights
i. Less than 160.0 centimeters?
ii. Between 171.5 and 182.0 centimeters inclusive?
iii. Greater than or equal to 188.0 centimeters?

8. A drug manufacturer claims that a certain drug cures a blood disease on the average 80% of the
time. To check the claim, government testers used the drug on a sample of 100 individuals and
decided to accept the claim if 75 or more are cured.
i. What is the probability that the claim will be rejected when the cure probability is
in fact 0.8?
ii. What is the probability that the claim will be accepted by the government when the
cure probability is as low as 0.7?
9. A sampling scheme involves taking a sample of ten items from each batch produced and
rejecting the batch if more than two defectives are found. If in fact five per cent of all items are
defective, what is the probability that a batch will be rejected?

10. A batch of items is believed to contain 20 percent defectives. A sample of six items is taken at
random from the batch. Use binomial distribution to find the probability that the sample
contains:
(a) one defective;
(b) two or more defectives.
11. Each year a company selects a number of employees for a management-training program given
by a nearby university. On the average, 70% of those sent complete the program. Out of seven
people sent by the company, what is the probability that:
(a) Exactly five complete the program?
(b) Five or more complete the program?

12. Printing errors in the work produced by a particular firm occur randomly at an average rate of
0.6 per page. What is the probability that a seven-page pamphlet prepared by the firm contains
more than three errors?

13. The proportion of articles produced by a company, which are defective, is 0.5 per cent. They are
sold in boxes of 100 and the company guarantees to replace any box containing more than two
defectives. The cost of this replacement is TZS 20,000. The company is considering introducing
a new inspection scheme, which will cost TZS 50 per box but eliminate all defectives. Use the
Poisson approximation to the binomial distribution to decide whether this inspection is
worthwhile.

14. In a certain large factory the mean number of stoppages per week is 1.5. What is the probability
that:
(a) In a given week there will be no stoppages
(b) In a given week there will be three or more stoppages?
(c) In a given two-week period there will be at most one stoppage?

15. The mean life of a certain type of electric light bulb is 1400 hours, with a standard deviation of
300 hours.
(a) If the manufacturer guarantees a life of 1000 hours, what percentage of bulbs can he
expect to have returned?
(b) At what length of life should the guarantee be set in order for 95 per cent of bulbs to be
found satisfactory?

16. Applicants for a certain job are given an aptitude test. Past experience shows that score from
the test are normally distributed with a mean of 60 points and a standard deviation of 12
points. What percentage of candidates would be expected to pass the test, if a minimum score
of 75 is required?

17. For a group of 1800 employees of a manufacturing company, IQ is approximately normally
distributed with mean 110 and standard deviation 12. It is known from experience that for a
particular job only persons with IQs of at least 95 are intelligent enough to do it, but those with
IQs greater than 120 soon become bored and unhappy with it. On the basis of IQ alone, how
many of the 1800 employees would you expect to be suitable for the work?

18. If 60 percent of the customers of a large department store charge all their purchases, what is
the probability that among 200 customers (randomly chosen), at least 125 charge all their
purchases?

19. An electrical appliance manufacturer claims that 20 percent of all appliance breakdowns are
caused by the failure of customers to follow operating instructions. If this claim is correct, what
is the probability that among 100 breakdowns, more than 25 are caused by the failure of
customers to follow operating instructions?

20. The annual salaries of employees in a large company are approximately normally
distributed with a mean of $5,000 and a standard deviation of $2,000.
(a) What percent of people earn less than $4,000?
(b) What percent of people earn between $4,500 and $6,500?
(c) What percent of people earn more than $7,000?

CHAPTER TWO
SAMPLING DISTRIBUTION
2.1 Introduction
In Chapter One we discussed probability distributions of discrete and continuous random
variables. In this chapter we extend the concept of the probability distribution of a random variable to
that of a sample statistic. We usually use inferential statistics methods to estimate population
parameter values of random variables from sample findings. In most investigations
we select some individuals to represent the whole
group in which the investigator is interested. This is due to factors such as the cost
and time needed to deal with the entire population. Sampling is the process of selecting a representative
group from the population under study. The target population is the collection of all individuals
from which the sample might be drawn. A sample is the group of people who take part in the
investigation; in other words, a sample is a subset of a given target population. There are
different techniques for selecting a sample, which can be grouped into two categories: random
(probability) sampling and non-random (non-probability) sampling techniques.

2.2 Distribution of the sample mean

When a sample is selected from a normally distributed population, the sample mean is also normally
distributed. In this section we study the sampling distribution of the sample
mean $\bar{X}$ and of the sample proportion $\hat{p}$. Since more than
one sample can be selected from a given population, the corresponding sample means
vary from one sample to another. The sample mean can therefore be treated as a
random variable with its own PDF.

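If $X_1, X_2, X_3, \dots, X_n$ constitute an independently and identically distributed random sample of
size $n$, then recall that the mean of this random sample is

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i = \frac{1}{n}(X_1 + X_2 + X_3 + \dots + X_n)$$

Then the expected value of $\bar{X}$ is given by

$$E(\bar{X}) = E\left(\frac{1}{n} \sum_{i=1}^{n} X_i\right) = \frac{1}{n} \sum_{i=1}^{n} E(X_i) \quad \text{by the properties of the } E \text{ and } \Sigma \text{ operators}$$

$$= \frac{1}{n}\big(E(X_1) + E(X_2) + E(X_3) + \dots + E(X_n)\big)$$

Let $E(X_i) = \mu$ for any $i$; then

$$E(\bar{X}) = \frac{1}{n}(\mu + \mu + \mu + \dots + \mu) = \frac{1}{n} \sum_{i=1}^{n} \mu = \frac{1}{n}\, n\mu = \mu$$

Therefore $E(\bar{X}) = \mu$.
Since $\mathrm{Var}(X_i) = \sigma^2$ for each $i$ and the $X_i$ are independent, the variance of $\bar{X}$ is given by

$$\mathrm{Var}(\bar{X}) = \frac{1}{n^2}\big(\mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \mathrm{Var}(X_3) + \dots + \mathrm{Var}(X_n)\big) = \frac{1}{n^2}(n \times \sigma^2) = \frac{\sigma^2}{n}$$

Therefore $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$

The square root of the variance of $\bar{X}$ is called the standard error of $\bar{X}$, which is given by

$$SE(\bar{X}) = \frac{\sigma}{\sqrt{n}}$$

Therefore the standardized normal variable in this case is given by

$$Z = \frac{\bar{X} - E(\bar{X})}{SE(\bar{X})} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \sim N(0, 1)$$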

Example 2.1
A random variable X is normally distributed with mean 8 and variance 25. From a random sample of 36
observations, find
a) The standard error of this sampling distribution
b) The probability that the sample mean is greater than 9

Data Given

$\mu = 8$, $\sigma^2 = 25$ (so $\sigma = 5$), $n = 36$

Let $\bar{X}$ represent the sample mean calculated from a random sample of 36 observations. Then
a) $$SE(\bar{X}) = \frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{36}} = 0.8333$$
b) $$P(\bar{X} > 9) = P\left(\frac{\bar{X} - \mu}{SE(\bar{X})} > \frac{9 - 8}{0.8333}\right)$$
$$= P(Z > 1.2)$$
$$= 0.5 - P(0 \le Z \le 1.2)$$
$$= 0.5 - 0.3849$$
$$= 0.1151$$

Example 2.2
Revenues collected from a certain firm in thousands of dollars are assumed to be normally
distributed with mean 20 and standard deviation 8. Suppose that a random sample of 100
revenues were selected, find
a) Standard error of the sampling distribution.
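b) Probability that the mean revenue is less than 21.6 thousand dollars.

Data Given

$\mu = 20$, $\sigma = 8$, $n = 100$

Let $\bar{X}$ represent the average revenue from a sample of 100 revenues. Then
a) $$SE(\bar{X}) = \frac{\sigma}{\sqrt{n}} = \frac{8}{\sqrt{100}} = 0.8$$
b) $$P(\bar{X} < 21.6) = P\left(\frac{\bar{X} - \mu}{SE(\bar{X})} < \frac{21.6 - 20}{0.8}\right)$$
$$= P\left(Z < \frac{1.6}{0.8}\right)$$
$$= P(Z < 2)$$
$$= 0.5 + P(0 \le Z \le 2)$$
$$= 0.5 + 0.4772$$
$$= 0.9772$$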
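
The standard error and probabilities in Examples 2.1 and 2.2 follow from $\bar{X} \sim N(\mu, \sigma^2 / n)$ and can be checked as follows. This is a sketch using `statistics.NormalDist`; the helper name `sample_mean_distribution` is illustrative. Note that Example 2.1 gives the variance (25), so the standard deviation passed to the helper is $\sqrt{25} = 5$.

```python
from math import sqrt
from statistics import NormalDist

def sample_mean_distribution(mu, sigma, n):
    """Distribution of the mean of n i.i.d. draws from N(mu, sigma^2)."""
    return NormalDist(mu, sigma / sqrt(n))

# Example 2.1: mu = 8, variance = 25 (so sigma = 5), n = 36
xbar_1 = sample_mean_distribution(8, sqrt(25), 36)
se_1, p_1 = xbar_1.stdev, 1 - xbar_1.cdf(9)
print(round(se_1, 4), round(p_1, 4))

# Example 2.2: mu = 20, sigma = 8, n = 100
xbar_2 = sample_mean_distribution(20, 8, 100)
se_2, p_2 = xbar_2.stdev, xbar_2.cdf(21.6)
print(round(se_2, 4), round(p_2, 4))
```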

2.3 Distribution Of The Sample Proportion


The population proportion, denoted by $p$, is obtained by taking the ratio of the number of
elements in a population with a specific characteristic to the total number of elements in the
population. Correspondingly, the sample proportion $\hat{p}$ gives a similar ratio for a sample. The rule for the
sample proportion says that if both $np \ge 10$ and $n(1 - p) \ge 10$, then the sample proportion $\hat{p}$ is
approximately normally distributed with mean $p$ (the population proportion) and variance $\frac{p(1 - p)}{n}$,
where $n$ is the sample size. We write this statement as

$$\hat{p} \sim N\left(p, \frac{p(1 - p)}{n}\right)$$

Therefore the standard variable Z becomes

$$Z = \frac{\hat{p} - p}{\sqrt{\frac{p(1 - p)}{n}}} = \frac{\hat{p} - p}{SE(\hat{p})} \sim N(0, 1)$$

where the sample proportion is given by $\hat{p} = \frac{x}{n}$ and $x$ is the number of successes.

Example 2.3
Suppose the proportion of all college students who have used marijuana in the past 6 months is
40%. If a random sample of 200 students is taken from this population, what is the probability
that the proportion of students in the sample who have used marijuana is less than 32%?

Solution
Given $p = 0.4$, $n = 200$. Required to find $P(\hat{p} < 0.32)$

$P(\hat{p} < 0.32) = P\left(Z < \frac{0.32 - 0.4}{\sqrt{\frac{0.4 \times 0.6}{200}}}\right)$
$= P(Z < -2.31)$
$= P(Z > 2.31)$
$= 0.5 - P(0 \le Z \le 2.31)$
$= 0.5 - 0.4896$
$= 0.0104$
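
As a rough check on Example 2.3, the sketch below applies the normal approximation to the sample proportion in Python. The function name `prob_sample_proportion_below` is hypothetical, and the code assumes the rule $np \ge 10$ and $n(1 - p) \ge 10$ is what justifies the approximation.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_proportion_below(p_hat, p, n):
    """P(sample proportion < p_hat) under the normal approximation."""
    se = sqrt(p * (1 - p) / n)   # standard error of the sample proportion
    z = (p_hat - p) / se         # standardized value
    return NormalDist().cdf(z)


# Values from Example 2.3: p = 0.4, n = 200
n, p = 200, 0.4
rule_ok = n * p >= 10 and n * (1 - p) >= 10   # check the approximation rule
prob = prob_sample_proportion_below(0.32, p, n)
print(rule_ok, round(prob, 4))
```

The result is close to the table-based value of 0.0104; any small difference comes from rounding Z to two decimal places before reading the table.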

2.4 Sampling Distribution of the Difference between Two Sample Means


In some situations we may want to study the difference, if any, between the means of two
populations. For instance, we may be interested in determining the difference between the mean
sales of two brands of a product. Since we want to know whether there is a difference
between two unknown population means, we need to know how the statistic (the difference between
the sample means, $\bar{X} - \bar{Y}$) relates to the true parameter (the difference between the population
means, $\mu_x - \mu_y$).
Let

$X = x_1, x_2, x_3, \ldots, x_{n_x}$ be a random sample of size $n_x$ of a population variable with mean
$\mu_x$ and variance $\sigma_x^2$, and

$Y = y_1, y_2, y_3, \ldots, y_{n_y}$ be a random sample of size $n_y$ of a population variable with mean
$\mu_y$ and variance $\sigma_y^2$, where X and Y are independent.

Thus, $\bar{X} - \bar{Y}$ has mean $\mu_x - \mu_y$ and variance $\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}$

And hence for large $n_x$ and $n_y$

$\bar{X} - \bar{Y} \sim N\left(\mu_x - \mu_y, \frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}\right)$

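Therefore, the standardized normal variable, Z, is given by

$Z = \frac{(\bar{X} - \bar{Y}) - (\mu_x - \mu_y)}{\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}}$

and the expression is used when the variances are known but different, i.e. $\sigma_x^2 \ne \sigma_y^2$

But for the situation of equal variances, $\sigma^2 = \sigma_x^2 = \sigma_y^2$

Then, $\bar{X} - \bar{Y} \sim N\left(\mu_x - \mu_y, \sigma^2\left(\frac{1}{n_x} + \frac{1}{n_y}\right)\right)$

And hence, $Z = \frac{(\bar{X} - \bar{Y}) - (\mu_x - \mu_y)}{\sqrt{\sigma^2\left(\frac{1}{n_x} + \frac{1}{n_y}\right)}} \sim N(0, 1)$

When the population variances are unknown but assumed to be equal, we have

$t = \frac{(\bar{X} - \bar{Y}) - (\mu_x - \mu_y)}{\sqrt{s_p^2\left(\frac{1}{n_x} + \frac{1}{n_y}\right)}} \sim t_{n_x + n_y - 2}$

Where $s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$ is called the pooled variance.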

Example 2.4
A random sample of size 60 was taken from normal population with mean 30 and variance 121.
Another independent random sample of size 100 was taken from a normal population with mean
20 and variance 81. Find the probability that the difference between the arithmetic means is at
least 13
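
One way to work Example 2.4 is sketched below in Python, assuming the known-variance result for $\bar{X} - \bar{Y}$ stated above. The function name `prob_mean_difference_at_least` is hypothetical and serves only as an illustration.

```python
from math import sqrt
from statistics import NormalDist


def prob_mean_difference_at_least(d, mu_x, var_x, n_x, mu_y, var_y, n_y):
    """P(X-bar - Y-bar >= d) for independent samples with known variances."""
    mean_diff = mu_x - mu_y                      # mean of X-bar - Y-bar
    se_diff = sqrt(var_x / n_x + var_y / n_y)    # its standard error
    z = (d - mean_diff) / se_diff
    return 1 - NormalDist().cdf(z)               # upper-tail area


# Values from Example 2.4
prob = prob_mean_difference_at_least(13, 30, 121, 60, 20, 81, 100)
print(round(prob, 4))
```

Here the difference has mean 10 and standard error of about 1.68, so the required probability is the area to the right of roughly Z = 1.78, which is about 0.037.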

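2.5 Distribution for the differences between two sample proportions, $\hat{p}_1 - \hat{p}_2$


Suppose that each of two sample proportions $\hat{p}_1$ and $\hat{p}_2$ is normally distributed. That is, if

$\hat{p}_1 \sim N\left(p_1, \frac{p_1(1 - p_1)}{n_1}\right)$ and $\hat{p}_2 \sim N\left(p_2, \frac{p_2(1 - p_2)}{n_2}\right)$

then it can be shown that the difference between the two sample proportions, $\hat{p}_1 - \hat{p}_2$, is also
normally distributed with mean $p_1 - p_2$ and variance

$\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}$

written as

$(\hat{p}_1 - \hat{p}_2) \sim N\left(p_1 - p_2, \frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}\right)$

Hence, the standardized variable Z is given by

$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}} \sim N(0,1)$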

Review Questions
1. A Lorry carries heavy Cartons of Goods. The weights of these cartons are distributed about a
mean of 100kg and with a standard deviation of 7kg. Find how many cartons the Lorry can
carry so that the probability of the total load exceeding 4500kg is less than 0.05.
2. Consider a random sample of size 16 from a N (100, 400) distribution. Find P( X  97) .
3. The electric light bulbs of manufacturer A have a mean lifetime of 1,400 hrs with a standard
deviation of 200 hrs, while those of manufacturer B have a mean lifetime of 1,200 hrs with a
standard deviation of 100 hrs. If random samples of 125 bulbs of each brand are tested, what is
the probability that the brand A bulbs will have a mean lifetime that is at least (a) 160 hrs, (b) 250 hrs
more than the brand B bulbs?
4. Ball bearings of a given brand weigh 0.5 oz with a standard deviation of 0.02 oz. What is the
probability that two lots of 1,000 ball bearings each, will differ in weight by more than 2 oz?
5. The weights of the packages received by a department store have a mean of 300 lb and a
standard deviation of 50 lb. What is the probability that 25 packages received at random and
loaded on an elevator will exceed the safety limit of the elevator, listed as 8,200 lb?
6. A random variable is normally distributed with mean 18 and variance 25. From a random
sample of 36 observations, find
a) Standard error of this sampling distribution
b) The probability that the sample mean is greater than 16
7. A certain research showed that 45 out of 60 workers from a certain company use smart phones.
If a sample of 100 workers is randomly selected, what is the probability that 80% of the workers
use smart phones?
8. A random sample of size 60 was taken from a normal population with mean 30 and variance
121. Another independent random sample of size 100 was taken from a normal population
with mean 20 and variance 81. Find the probability that the difference between the arithmetic
means is at least 13

CHAPTER THREE
ESTIMATION OF PARAMETERS
3.1 Introduction
In Chapter Two we saw that sampling distributions play a central role in inferential
statistics. In this chapter we begin the discussion of inferential statistics by applying the
concepts of sampling distributions discussed in Chapter Two. As we have said earlier, inferential
statistics is the part of statistics that uses information from a sample to make decisions and draw
conclusions about the population from which the sample was drawn. There are two important parts of
inferential statistics: estimation and hypothesis testing. Taken together, these two are referred to as
inference making. In this chapter we discuss the concepts of estimation (point and interval
estimation); hypothesis testing is discussed in the next chapter.

3.2 Definition of Key terms:

Estimation is the process of estimating unknown parameters of a given population based on
sample statistics. For example, sample means are used to estimate population means, and sample
proportions are used to estimate population proportions. In most cases it is not possible to work
with the whole population and determine the desired statistical measures directly, because of
constraints of time and cost, and because the population may be infinite. In such situations, the
findings from samples are used to represent the whole population.

Parameters are numerical measures that summarize data for an entire population.
Statistics are numerical measures computed from sample information that are used to estimate
certain population parameters.

Estimator is a statistic (a function of the observable sample data) that is used to estimate an
unknown population parameter.

Estimate is the numerical value that an estimator takes for a particular sample; it is used as the
estimate of the unknown population parameter.

Significance Level
Since we are using samples to estimate population values, we cannot be 100% sure that the
estimated value or interval is correct. The significance level is the probability (percent) that the true
population parameter falls outside the confidence interval constructed. For example, if the
confidence level is 95%, then the significance level is 5%. This concept is discussed in detail
in the next chapter, on hypothesis testing.

Degrees of Freedom
Degrees of freedom are the number of data values that are free to vary once a statistic has been determined.

Maximum Error of the Estimate
The maximum difference between the point estimate and the actual parameter is known as the
maximum error of the estimate. It is half the width of the confidence interval for means and
proportions.

As we have already said, one area of concern in inferential statistics is the estimation of the
population parameters from the sample statistics. It is important to realize the order here. The
sample statistic is calculated from the sample data and the population parameter is inferred
(estimated) from this sample statistic. In that case statistics are usually calculated while
parameters are estimated.

Another area of inferential statistics is sample size determination. That is, how large of a sample
should be taken to make an accurate estimation. In these cases, the statistics cannot be used since
the sample has not been taken yet.

3.3 Properties of a good estimator


3.3.1 Unbiasedness
An unbiased estimator is one whose expected value (mean) is exactly equal to the parameter
being estimated. That is, if $\hat{\theta}$ is an unbiased estimator of the population parameter $\theta$, then $E(\hat{\theta}) = \theta$.
The quantity $E(\hat{\theta}) - \theta$ is called the bias of $\hat{\theta}$.

3.3.2 Efficiency
An estimator is said to be efficient if it has a relatively small variance or standard deviation.
That is, if you have two or more unbiased estimators, select the one with the smallest standard
deviation as your estimator. Let $\hat{\theta}_1$ and $\hat{\theta}_2$ be two unbiased estimators of $\theta$; then $\hat{\theta}_1$ is said to be more
efficient than $\hat{\theta}_2$ if

$Var(\hat{\theta}_1) < Var(\hat{\theta}_2)$
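
Unbiasedness and efficiency can be illustrated with a small simulation. The sketch below compares the sample mean and the sample median as estimators of the mean of a normal population; the population values ($\mu = 10$, $\sigma = 3$), the sample size, the number of replications and the helper name `simulate` are all hypothetical choices made for illustration, not values from the text.

```python
import random
from statistics import mean, median, pvariance

random.seed(1)  # fixed seed so the sketch is reproducible


def simulate(estimator, n=25, replications=2000, mu=10.0, sigma=3.0):
    """Apply an estimator to many simulated normal samples."""
    return [
        estimator([random.gauss(mu, sigma) for _ in range(n)])
        for _ in range(replications)
    ]


mean_estimates = simulate(mean)
median_estimates = simulate(median)

# Both estimators are centred close to mu = 10 (little or no bias) ...
print(round(mean(mean_estimates), 2), round(mean(median_estimates), 2))

# ... but the sample mean varies less from sample to sample.
var_mean = pvariance(mean_estimates)
var_median = pvariance(median_estimates)
print(var_mean < var_median)
```

In this setting both estimators are centred on the true mean, but the sample mean has the smaller variance, which is why it is described as the more efficient of the two.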


3.3.3 Consistency
A consistent estimator is one whose variance and bias both tend to zero as the size of the
sample increases. That is, when we increase the size of the sample we should expect the
variance to fall; if it does not, the estimator is not consistent.

3.3.4 Sufficiency
An estimator is said to be sufficient if it utilizes all the information in the sample to arrive at the
estimate. For example, when estimating a population mean we would prefer the sample mean to
the sample median and the sample mode, because the last two estimators utilize only some of the
information in the sample to arrive at the estimate.

3.4 Types of Estimation
There are two types of estimation: Point estimation and interval estimation

3.4.1 Point Estimation


Under point estimation, we estimate a single value that we believe to be the value of the population
parameter we need. Any statistic can be a point estimate. For example:

• Sample mean $\bar{X}$ is a point estimate of population mean $\mu$
• Sample standard deviation $s$ is a point estimate of population standard deviation $\sigma$
• Sample variance $s^2$ is a point estimate of population variance $\sigma^2$
• Sample proportion $\hat{p}$ is a point estimate of population proportion $P$


3.4.2 Interval Estimation

This is the process of estimating the interval within which the unknown population parameter
lies. An interval estimate is defined by two numbers, between which a population parameter is said
to lie. For example, $a < \mu < b$ is an interval estimate of the population mean $\mu$. It indicates that the
population mean is greater than a but less than b.

3.4.3 Confidence Intervals


In statistics, a confidence interval (CI) is a particular kind of interval estimate of a population
parameter. It is used to indicate the reliability of an estimate: together with the range,
we also state the level of confidence we have that the range contains the value of the
parameter. Statisticians use a confidence interval to express the precision and uncertainty
associated with a particular sampling method. A confidence interval consists of three parts.

• A confidence level.
• A statistic.
• A margin of error.

Confidence Level
The confidence level is the probability value $(1 - \alpha)100\%$ associated with a confidence interval. For
example, if $\alpha = 0.05 = 5\%$, then the confidence level is equal to $(1 - 0.05) = 0.95$, i.e. 95%. The
confidence level describes how strongly we believe that a particular sampling method will produce
a confidence interval that includes the true population parameter.
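
The meaning of the confidence level can be illustrated by repeated sampling. The sketch below is a hedged illustration only: it assumes a hypothetical normal population ($\mu = 50$, $\sigma = 10$), samples of size 30, and the known-$\sigma$ interval $\bar{x} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$ for the mean, and counts how often the interval contains the true mean.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(7)  # fixed seed so the sketch is reproducible

MU, SIGMA, N = 50.0, 10.0, 30   # hypothetical population and sample size
ALPHA = 0.05                    # significance level, so confidence = 95%
z = NormalDist().inv_cdf(1 - ALPHA / 2)   # critical value, about 1.96

trials = 2000
covered = 0
for _ in range(trials):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = mean(sample)
    margin = z * SIGMA / sqrt(N)
    if x_bar - margin <= MU <= x_bar + margin:
        covered += 1   # this interval contains the true mean

coverage = covered / trials
print(coverage)
```

The proportion of intervals that contain the true mean should be close to 0.95, which is what a 95% confidence level describes: a property of the sampling method over many samples rather than of any single interval.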

Statistic

This is a quantity calculated from the sample information that is used to estimate a certain
population parameter.

Margin of Error
The range of values above and below the sample statistic in a confidence interval is
called the margin of error. The interval estimate of a confidence interval is therefore defined by the
sample statistic ± margin of error.

3.5 Formula for Confidence Interval for the population mean


The formula for the confidence interval depends on a number of criteria, as indicated in the
following subsections:
3.5.1 When population variance ($\sigma^2$) is known

When the population variance is known, the $(1 - \alpha)100\%$ confidence interval estimate for $\mu$ is given
by:

$\mu = \bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ (1)

Where $\alpha$ is the level of significance
$E = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ is the error term
$n$ = Sample size
$\sigma$ = Population standard deviation
Hint: If the population variance ($\sigma^2$) is known, the above formula is applied regardless of the
sample size

Example 3.1

Let X be a normally distributed random variable with variance equal to 81. A random sample of
size 25 is drawn and the arithmetic mean is calculated to be 10.2; use this information to compute a
99% confidence interval estimate for $\mu$.
Data Given:
$n = 25$, $\bar{x} = 10.2$, $\sigma^2 = 81$, $CL = 99\%$, $\alpha = 1\%$
Solution:
Since $n < 30$ and $\sigma^2$ is known, the 99% confidence interval estimate for $\mu$ will be:

$\mu = \bar{x} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
$= 10.2 \pm Z_{0.005} \times \frac{9}{\sqrt{25}}$
$= 10.2 \pm 2.575 \times \frac{9}{\sqrt{25}}$
$= 10.2 \pm 4.635$
$= 5.565;\ 14.835$

Conclusion: We are 99% confident that the true mean lies within 5.565 and 14.835

Example 3.2
A mining company in Zambia needs to estimate the average amount of copper ore per ton mined.
A random sample of 50 tons gives a sample mean of 146.75 kg. The population standard deviation
is assumed to be 35.2 kg.

a) Provide a 90% CI for the average amount of copper in the population of tons mined
b) Give a 95% C.I and 99% C.I for the average amount of copper per ton
Data Given:
$n = 50$, $\bar{x} = 146.75$, $\sigma = 35.2$, $CL = 90\%$, $\alpha = 10\%$
Solution:
Since $\sigma$ is known, the 90% confidence interval for $\mu$ is given by:

$\mu = \bar{x} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
$= 146.75 \pm Z_{0.05} \times \frac{35.2}{\sqrt{50}}$
$= 146.75 \pm 1.645 \times \frac{35.2}{\sqrt{50}}$
$= 146.75 \pm 8.189$
$= 138.561;\ 154.939$
Conclusion: We are 90% confident that the true mean lies within 138.561 and 154.939
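
Part (b) of Example 3.2 follows the same steps with a different critical value. The Python sketch below computes all three intervals using formula (1); the function name `z_confidence_interval` is hypothetical, and the critical values are taken from the standard library's normal distribution rather than from a table, so results may differ slightly in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence):
    """Return (lower, upper) for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value Z_{alpha/2}
    margin = z * sigma / sqrt(n)              # margin of error E
    return x_bar - margin, x_bar + margin


# Values from Example 3.2: n = 50, x-bar = 146.75, sigma = 35.2
for level in (0.90, 0.95, 0.99):
    lower, upper = z_confidence_interval(146.75, 35.2, 50, level)
    print(f"{level:.0%}: {lower:.2f} to {upper:.2f}")
```

The interval widens as the confidence level rises from 90% to 99%, because a larger critical value is needed to capture the true mean with greater confidence.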

Example 3.3
A team of efficiency experts intends to use the mean of a random sample of size $n = 150$ to estimate the
average mechanical aptitude of assembly-line workers in a large industry (as measured by a certain
standardized test). Based on experience, the efficiency experts can assume that $\sigma = 6.2$ points on
the test scale for such data. With 99% confidence, what is the maximum error of their estimate?
Data Given:
$n = 150$, $\sigma = 6.2$, $CL = 99\%$, $\alpha = 1\%$
Solution:
From

$E = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$

The maximum error is obtained by:

$E \le Z_{0.005} \times \frac{6.2}{\sqrt{150}}$
$\le 2.575 \times \frac{6.2}{\sqrt{150}}$
$\le 1.303$
Therefore the maximum error of their estimate is 1.303

3.5.2 When population variance ($\sigma^2$) is unknown


There are two situations under this category
a) When the sample size is large, $n \ge 30$
b) When the sample size is small, $n < 30$

3.5.2.1 Confidence interval for µ when ($\sigma^2$) is unknown and the sample size is large

If the sample size is large and the population standard deviation is unknown, the formula for the
$(1 - \alpha)100\%$ confidence interval estimate for the population mean is given by:

$\mu = \bar{X} \pm Z_{\alpha/2}\frac{s}{\sqrt{n}}$ (2)

Where $s$ = Sample standard deviation


Therefore, what has to be done is first to estimate the standard deviation from the sample and then
estimate the confidence interval for the population mean, $\mu$, using the formula above.
Example 3.4
Vodacom wants to estimate the average length of long-distance calls during weekends. A random
sample of 50 calls gives a mean, $\bar{x} = 14.5$ minutes and standard deviation, $s = 5.6$ minutes. Give a
95% C.I for the average length of long-distance phone calls during weekends.

Data Given:
$n = 50$, $\bar{x} = 14.5$, $s = 5.6$, $CL = 95\%$, $\alpha = 5\%$
Solution:
Since $\sigma$ is unknown and the sample size is more than 30, the 95% confidence interval for $\mu$ is
given by:

$\mu = \bar{x} \pm Z_{\alpha/2}\frac{s}{\sqrt{n}}$
$= 14.5 \pm Z_{0.025} \times \frac{5.6}{\sqrt{50}}$
$= 14.5 \pm 1.96 \times \frac{5.6}{\sqrt{50}}$
$= 14.5 \pm 1.55$
$= 12.95;\ 16.05$
Conclusion: We are 95% confident that the true mean lies within 12.95 and 16.05

3.5.2.2 Small sample confidence Interval for the mean µ, when ($\sigma^2$) is unknown

When σ is unknown, the sample size is small and the population is normally distributed, the
standardized sample mean follows a distribution called Student's t-distribution. Like the standard
normal distribution it is symmetric and bell-shaped, but it is not described by a mean and a standard
deviation. Instead, the t-distribution is described by a single parameter called the
degrees of freedom (df). For this case the formula for the $(1 - \alpha)100\%$ confidence interval for µ is
defined by:

$\mu = \bar{X} \pm t_{\alpha/2,\, n-1}\frac{s}{\sqrt{n}}$ (3)
Where n is the sample size and (n-1) is the degrees of freedom

Example 3.5
A management consulting firm needs to estimate the average number of years of experience of
executives in a given branch of management. A random sample of 25 executives gives $\bar{x} = 2.4$ years
and $s = 1.5$. Give a 99% C.I for the average number of years of experience for all executives in the
branch.
Data Given:
$n = 25$, $\bar{x} = 2.4$, $s = 1.5$, $CL = 99\%$, $\alpha = 1\%$
Solution:
Since $\sigma$ is unknown and the sample size is less than 30, the 99% confidence interval for $\mu$ is given
by:

$\mu = \bar{x} \pm t_{\alpha/2,\,(n-1)}\frac{s}{\sqrt{n}}$
$= 2.4 \pm t_{0.005,\,24} \times \frac{1.5}{\sqrt{25}}$
$= 2.4 \pm 2.797 \times \frac{1.5}{\sqrt{25}}$
$= 2.4 \pm 0.84$
$= 1.56;\ 3.24$
Conclusion: We are 99% confident that the true mean lies within 1.56 and 3.24

3.5.3 Summary on how to estimate the confidence interval for µ


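Population distribution | Sample size | σ known | σ not known
Normally distributed | Large ($n \ge 30$) | $\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ | $\bar{X} \pm Z_{\alpha/2}\frac{s}{\sqrt{n}}$
Normally distributed | Small ($n < 30$) | $\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ | $\bar{X} \pm t_{\alpha/2,\, n-1}\frac{s}{\sqrt{n}}$
Not normally distributed | Large ($n \ge 30$) | $\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ | $\bar{X} \pm Z_{\alpha/2}\frac{s}{\sqrt{n}}$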

3.6 Confidence Interval estimate for the population Proportion


The population proportion, denoted by $p$, is obtained by taking the ratio of the number of
elements in a population with a specific characteristic to the total number of elements in the
population. Correspondingly, the sample proportion $\hat{p}$ gives a similar ratio for a sample. For
this case the formula for the $(1 - \alpha)100\%$ confidence interval for the population
proportion, given a large sample, is:

$p = \hat{p} \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$ (5)

Where $\hat{p} = \frac{x}{n}$, $\hat{q} = 1 - \hat{p}$
x is the number of successes in n trials

Example 3.6

In a random sample of 400 cars stopped at a roadblock, 152 of the drivers were wearing their seat
belts. Construct a 95% confidence interval for the corresponding true proportion in the population
sampled.

Data given:
$n = 400$, $x = 152$

Solution:
The 95% CI estimate for P is given by:
$P = \hat{p} \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$

But $\hat{p} = x/n = 152/400 = 0.38$

$\hat{q} = 1 - \hat{p}$
$= 1 - 0.38$
$= 0.62$

$P = 0.38 \pm Z_{0.025}\sqrt{\frac{0.38 \times 0.62}{400}}$
$= 0.38 \pm 1.96\sqrt{0.000589}$
$= 0.38 \pm 1.96 \times 0.02427$
$= 0.38 \pm 0.048$
$= 0.332;\ 0.428$

Conclusion: We are 95% confident that the true proportion lies within 0.332 and 0.428
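
The steps of Example 3.6 can be sketched in Python as follows. The function name `proportion_confidence_interval` is hypothetical, and the code assumes the large-sample normal approximation used in this section.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(x, n, confidence):
    """Large-sample CI for a proportion from x successes in n trials."""
    p_hat = x / n                 # sample proportion
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin


# Values from Example 3.6: 152 of 400 drivers wore seat belts
lower, upper = proportion_confidence_interval(152, 400, 0.95)
print(round(lower, 3), round(upper, 3))
```

The limits should agree with the hand calculation, roughly 0.332 and 0.428.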

Example 3.7

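In a random sample of 200 vacationers interviewed at a resort, 142 said that they chose the resort
mainly because of its climate. With 99% confidence, find the maximum error we can make.

Data given:
$n = 200$, $x = 142$, $CL = 99\%$, $\alpha = 1\%$

Solution:
From

$P = \hat{p} \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$

The maximum error is obtained by:

$E \le Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$

But

$\hat{p} = \frac{x}{n} = \frac{142}{200} = 0.71$
$\hat{q} = 1 - \hat{p}$
$= 1 - 0.71$
$= 0.29$
Therefore:

$E \le Z_{0.005}\sqrt{\frac{0.71 \times 0.29}{200}}$
$\le 2.575 \times \sqrt{0.00103}$
$\le 0.083$
The maximum error we can make is 0.083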

3.7 Estimation of the difference between two population means ($\mu_1 - \mu_2$)


3.7.1 When population variances are known

When two populations are normally distributed, the means of independent samples drawn from those
populations are also normally distributed. The formula for the $(1 - \alpha)100\%$ confidence interval for the
difference between the two population means $\mu_1 - \mu_2$ is given by:

$\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$ (8)

Example 3.8

Mr Juma, a bank official in Dar es Salaam, wants to know the difference between the average
amounts of money customers have on deposit in two branch banks. He selects a random sample of
25 customers from each branch and the results are as indicated below:

              Branch A     Branch B
Sample mean   Tshs. 450    Tshs. 325
Variance      759          850

If the two populations A and B are normally distributed, construct a 95% confidence interval for
$\mu_1 - \mu_2$

Hint: The formula above is used when the population variances are known, regardless of the sample
sizes
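
A possible numerical treatment of Example 3.8 is sketched below in Python. It assumes that the two variances in the table are the known population variances, as the placement of the example under this subsection suggests; the function name `mean_difference_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_difference_ci(x1_bar, var1, n1, x2_bar, var2, n2, confidence):
    """CI for mu1 - mu2 from independent samples with known variances."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(var1 / n1 + var2 / n2)
    diff = x1_bar - x2_bar        # point estimate of mu1 - mu2
    return diff - margin, diff + margin


# Values from Example 3.8 (variances treated as population variances)
lower, upper = mean_difference_ci(450, 759, 25, 325, 850, 25, 0.95)
print(round(lower, 2), round(upper, 2))
```

The point estimate is Tshs. 125 and the margin of error is about 15.7, giving an interval of roughly Tshs. 109.3 to 140.7.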
3.7.2 When population variances are unknown

Two situations are considered under this category
a) When both sample sizes are large, $n_1 \ge 30$ and $n_2 \ge 30$
b) At least one of the samples is small

3.7.2.1 Large sample sizes


In this case the population variances are replaced by the corresponding sample variances and the
distribution employed is Z. Hence the formula for the $(1 - \alpha)100\%$ confidence interval estimate for
the difference between two population means is as indicated below:

$\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ (9)

Example 3.9
A utility company used to send out monthly statements to its customers without addressed return
envelopes. From a random sample of 120 customers it was determined that, on average, it took 9
days for a payment to be made, with a sample standard deviation of 2 days. Wishing to speed
up receipt of payments, the company subsequently included pre-addressed return envelopes with the
invoices. An independent sample of 130 customers indicated that the average payment time fell to 8
days, with a sample standard deviation of 2.2 days. Compute a 95% confidence interval estimate for
the difference between the population means.
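
A worked sketch of Example 3.9 follows, applying the large-sample formula (9) with $Z_{0.025} = 1.96$. Sample 1 is taken here to be the customers who received no return envelope and sample 2 those who did; this labelling is an assumption made for illustration.

```latex
\begin{aligned}
\mu_1 - \mu_2 &= (\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \\
&= (9 - 8) \pm 1.96\sqrt{\frac{2^2}{120} + \frac{2.2^2}{130}} \\
&= 1 \pm 1.96\sqrt{0.03333 + 0.03723} \\
&= 1 \pm 1.96 \times 0.2656 \\
&= 1 \pm 0.521 \\
&= 0.479;\ 1.521
\end{aligned}
```

Under these assumptions we would be 95% confident that the mean payment time fell by between about 0.48 and 1.52 days.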

3.7.2.2 Small sample sizes

In this situation the estimation of the difference between the two population means is done under
the assumption that the samples are drawn from two independent populations (say $X_1$ and $X_2$) and
that these populations have a common variance, i.e. $\sigma_1^2 = \sigma_2^2 = \sigma^2$. In this case the confidence
interval estimate for the difference between the two population means is given by:

$\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,(n_1 + n_2 - 2)}\sqrt{\frac{S_p^2}{n_1} + \frac{S_p^2}{n_2}}$

Where $S_p^2$ is known as the pooled variance (aka common variance), which is obtained by computing
the weighted average of the two sample variances. It is given by the following formula:

$S_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$, and $n_1 + n_2 - 2$ is the degrees of freedom

Example 3.10

A random sample of 14 firms belonging to a particular industry is selected and the current
percentage gross yield X is noted for each firm. The data show that $\sum X = 46.8$ and
$\sum (X - \bar{X})^2 = 50.934$. Similar data, denoted by Y, obtained for an independent random sample of
19 firms in a different industry show that $\sum Y = 59.4$ and $\sum (Y - \bar{Y})^2 = 38.677$. Compute a 95%
confidence interval estimate for $\mu_x - \mu_y$ on the assumption that the two population variances are
equal.
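
The pooled-variance calculation for Example 3.10 might be sketched as below. This is an illustration under stated assumptions: the data are read as sums ($\sum X = 46.8$, $\sum Y = 59.4$) and sums of squared deviations (50.934 and 38.677), and the critical value $t_{0.025,\,31} \approx 2.040$ is assumed to be read from a t-table because the Python standard library has no t-distribution. The function names are hypothetical.

```python
from math import sqrt


def pooled_variance(ss1, n1, ss2, n2):
    """Pooled variance from the two sums of squared deviations."""
    # (n - 1) * s^2 equals the sum of squared deviations, so the
    # numerator of the pooled-variance formula is simply ss1 + ss2.
    return (ss1 + ss2) / (n1 + n2 - 2)


def pooled_t_interval(x1_bar, x2_bar, sp2, n1, n2, t_crit):
    """CI for mu1 - mu2 under equal variances; t_crit is from a t-table."""
    margin = t_crit * sqrt(sp2 / n1 + sp2 / n2)
    diff = x1_bar - x2_bar
    return diff - margin, diff + margin


# Example 3.10, reading the data as sum(X) = 46.8 and sum(Y) = 59.4
n_x, n_y = 14, 19
x_bar, y_bar = 46.8 / n_x, 59.4 / n_y
sp2 = pooled_variance(50.934, n_x, 38.677, n_y)
T_CRIT = 2.040   # assumed table value of t_{0.025, 31}
lower, upper = pooled_t_interval(x_bar, y_bar, sp2, n_x, n_y, T_CRIT)
print(round(sp2, 3), round(lower, 2), round(upper, 2))
```

Under this reading the interval runs from about -1.01 to 1.44. Because it contains zero, the data would not indicate a clear difference between the two industry means.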
3.8 Estimation of the difference between two population proportions ($p_1 - p_2$)
The confidence interval for estimating the difference between two population proportions is given
by the following formula

$p_1 - p_2 = (\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$

Where $\hat{p}_1 = \frac{x_1}{n_1}$, $\hat{q}_1 = 1 - \hat{p}_1$ and $\hat{p}_2 = \frac{x_2}{n_2}$, $\hat{q}_2 = 1 - \hat{p}_2$
Example 3.11

A random sample of 350 salespersons and an independent random sample of 325 personnel
managers are interviewed regarding their reading habits. Out of the 350 salespersons, 105 say
that they subscribe to Financial Review magazine, and 130 of the personnel managers subscribe to
Financial Review magazine. Construct the 99% confidence interval for the difference between the true
proportions subscribing to this magazine.
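
Example 3.11 can be worked with a short Python sketch based on the formula in Section 3.8. The function name `proportion_difference_ci` is hypothetical, and the 130 subscribers in the second group are assumed to come from the sample of 325 personnel managers.

```python
from math import sqrt
from statistics import NormalDist


def proportion_difference_ci(x1, n1, x2, n2, confidence):
    """Large-sample CI for p1 - p2 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2     # the two sample proportions
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - margin, diff + margin


# Values from Example 3.11: 105 of 350 salespersons, 130 of 325 managers
lower, upper = proportion_difference_ci(105, 350, 130, 325, 0.99)
print(round(lower, 3), round(upper, 3))
```

With sample proportions of 0.30 and 0.40, the interval for $p_1 - p_2$ is roughly -0.194 to -0.006, which lies entirely below zero.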
3.9 Estimation of the Sample size

The formula for the confidence interval estimate for either the population mean $\mu$ or the population
proportion P can be used to determine a sample size that is suitable to provide a good approximation of
the population parameter. This can be done in advance if one can set the interval length and when
the population/sample variance is known beforehand. For the case of a proportion, we need an estimate
of the sample proportion $\hat{p}$.

Given:

$\mu = \bar{x} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, we obtain $\mu = \bar{x} \pm E$. For the case of the population proportion, the CI is
expressed as:

$P = \hat{p} \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$ and this shows that $P = \hat{p} \pm E$

Where $E = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ is known as the margin of error. Making n (sample size) the subject, we get

$n \ge \left(\frac{Z_{\alpha/2}\,\sigma}{E}\right)^2$ (5)
For the case of the population proportion the sample size is given by:

$n \ge \left(\frac{Z_{\alpha/2}}{E}\right)^2 \hat{p}\hat{q}$ (6)
Hint: The inequality is preferred because the larger the sample size, the more precise the
estimates
Example 3.12
What minimum sample size would be required to estimate the population mean for a large set of
company invoices to within 0.30 with 95% confidence, given that the estimated population
standard deviation is 5?
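
The sample-size formulas of this section can be sketched in Python as follows. The function names `sample_size_for_mean` and `sample_size_for_proportion` are hypothetical; the result is rounded up because the sample size must be a whole number that satisfies the inequality.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n with Z_{alpha/2} * sigma / sqrt(n) <= margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)


def sample_size_for_proportion(p_hat, margin, confidence):
    """Smallest n with Z_{alpha/2} * sqrt(p_hat * q_hat / n) <= margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z / margin) ** 2 * p_hat * (1 - p_hat))


# Values from Example 3.12: sigma = 5, E = 0.30, 95% confidence
n_required = sample_size_for_mean(5, 0.30, 0.95)
print(n_required)
```

For Example 3.12 the formula gives a value just above 1067, so the minimum sample size is 1068 invoices.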

Review Questions
1. Suppose 100 randomly selected used cars on a lot have an average of 40,000 miles on them,
with a standard deviation of 500 miles.
a) Find the standard error for the average miles of the cars in this lot.
b) Find the margin of error for the average miles of the cars in this lot.
c) Find a 95 percent confidence interval for the average miles on all the cars in this lot and
interpret your answer.

2. (a) What is the purpose of a confidence interval?


(b) If the sample mean = 74, the standard deviation = 9 and the sample size = 60, set up 90% and 95%
confidence interval estimates of the population mean, μ. How do the two intervals differ?

3. If the population standard deviation, σ, is not known, what standardized statistic is used to
construct a confidence interval?

4. If $\bar{x}$ = 49, s = 4.5 and n = 20, set up a 90% confidence interval estimate of the population
mean, μ.

5. The operations manager of a sugar mill in Morogoro wants to estimate the average size of an
order received. An order is measured in the number of pallets shipped. A random sample of 90
orders from customers had a sample mean value of 135.4 pallets. Assume that the population
standard deviation is 25 pallets and that order size is normally distributed.
(a) Estimate, with 95% confidence, the mean size of orders received from all the mill’s
customers.
(b) If the sugar mill receives 820 orders this year, calculate, with 95% confidence, the total
number of pallets of sugar that they will ship during the year.

6. A travel agency call centre wants to know the average number of calls received per day by its
call centre. A random sample of 25 days is selected and the sample mean number of calls
received was found to be 175.8 with a sample standard deviation of 23.5 calls. Assume that calls
received daily are normally distributed.
(a) Calculate a 95% confidence interval for the mean number of daily calls received by the call
centre. Interpret the findings.
(b) Find a 99% confidence interval for the mean number of daily calls received by the call
centre.
(c) Compare the findings of (a) and (b) and explain the reason for the difference.

(d) Estimate, with 95% confidence, the total number of calls received over a 30-day period.
Interpret the result.

7. The Department of Trade and Industry (DTI) wants to determine the percentage of
manufacturing firms that have met the employment equity charter. To assist the department,
National Bureau of Statistics (NBS) selected a random sample of 250 manufacturing firms and
established that 89 have met the employment equity charter. Determine, with 95% confidence,
the percentage of manufacturing firms that have met the employment equity charter. Prepare a
brief report to the DTI detailing your findings.

8. AQB bank analyzed a random sample of 465 business accounts at their city central branch and
found that 88 of them were overdrawn. Estimate, with 90% confidence, the percentage of all
bank accounts at the city central branch of the bank that were not overdrawn. Interpret the
findings.

9. A market research firm wants to estimate the share that foreign companies have in the TZ
market for certain products. A random sample of 100 consumers is obtained, and it is found
that 34 people in the sample are users of foreign made products; the rest are users of domestic
products. Give a 95% confidence interval for the share of foreign products in this market.

10. A random sample of 350 shoppers in a shopping mall is interviewed to identify their reasons
for coming to this particular mall. The factor of ‘I prefer the store mix’ was the most important
reason for 125 of those interviewed. Estimate the likely percentage of all shoppers who frequent
this shopping mall primarily because of the mix of stores in the mall, using 95% confidence
limits.

11. A 2008 survey of low- and middle-income households showed that consumers aged 65 years
and older had an average credit card debt of $10,235 and consumers in the 50- to 64-year age
group had an average credit card debt of $9342 at the time of the survey (USA TODAY, July 28,
2009). Suppose that these averages were based on random samples of 1200 and 1400 people
for the two groups, respectively. Further assume that the population standard deviations for the

two groups were $2800 and $2500, respectively. Let $\mu_1$ and $\mu_2$ be the respective population
means for the two groups, people aged 65 years and older and people in the 50- to 64-year age
group.

(a) What is the point estimate of $\mu_1 - \mu_2$?

(b) Construct a 97.5% confidence interval for $\mu_1 - \mu_2$.
12. A consumer agency wanted to estimate the difference in the mean amounts of caffeine in two
brands of coffee. The agency took a sample of 15 one-pound jars of Brand I coffee that showed
the mean amount of caffeine in these jars to be 80 milligrams per jar with a standard deviation
of 5 milligrams. Another sample of 12 one-pound jars of Brand II coffee gave a mean amount of
caffeine equal to 77 milligrams per jar with a standard deviation of 6 milligrams. Construct a
95% confidence interval for the difference between the mean amounts of caffeine in one-pound
jars of these two brands of coffee. Assume that the two populations are normally distributed
and that the standard deviations of the two populations are equal.

13. (a) When are the samples considered large enough for the sampling distribution of the
difference between two sample proportions to be (approximately) normal?

(b) Construct a 99% confidence interval for $P_1 - P_2$ for the following

$n_1 = 300$, $\hat{p}_1 = 0.55$, $n_2 = 200$, $\hat{p}_2 = 0.62$

(c) Construct a 95% confidence interval for $P_1 - P_2$ for the following

$n_1 = 100$, $\hat{p}_1 = 0.81$, $n_2 = 150$, $\hat{p}_2 = 0.77$

14. Suppose that you computed a 95% confidence interval for a population mean. The user of the
statistics claims your interval is too wide to have any meaning in the specific use for which it is
intended. Give two methods of solving this problem.

15. A research firm wants to conduct a survey to estimate the average amount spent on
entertainment by each person visiting a popular resort. The people who plan the survey would
like to be able to determine the average amount spent by all people visiting the resort to within
$120, with 95% confidence. From past operation of the resort, an estimate of the population
standard deviation is σ = $400. What is the minimum required sample size?

CHAPTER FOUR
HYPOTHESIS TESTING
4.1 Introduction
In Chapter Three we discussed parameter estimation which is the first part of inferential statistics;
the second part is the testing of hypothesis which is discussed in this chapter. When a sales
manager claims that his company’s product has the largest market share compared to the rival
company, we can use hypothesis testing techniques to test statistically the validity of the manager’s
claim. Here we need to go to the market and collect appropriate samples on the sales of the
products from the two companies and decide to either support or refute the manager’s claim based
on the sample evidence.

Hypothesis testing is very important in the decision making process as it involves collecting evidence
(data) on a claim and using statistical methods to determine whether there is any significant
difference between the claim and the information obtained. For example, if a person is suspected of
having committed a crime and is taken to court for trial, then based on the available evidence the jury can
make one of two possible decisions: the person is either guilty or not guilty. At the outset of
the trial the person is presumed innocent until proven guilty. The role of the prosecutor is to prove
that the person is guilty, and if he can prove this beyond doubt then the defendant is convicted. This is
a non-statistical example, but in statistical terms the statement that the defendant is innocent is
known as the null hypothesis and the statement of the prosecutor that the person is guilty is termed
the alternative hypothesis. The null and alternative hypotheses are always stated in terms of the
population parameters, and they are assessed using sample statistics. As we have seen in the
court trial example, the null hypothesis is presumed true until the evidence allows us to reject it.

In hypothesis testing, the researcher must do the following to reach a conclusion:

i. Define the population under study.
ii. State the particular hypotheses that will be investigated, both null and alternative.
iii. State the significance level.
iv. Draw an appropriate sample from the target population.
v. Collect the relevant data from the sampled units.
vi. Perform the calculations required for the statistical test, and
vii. Reach a conclusion.

4.2 Definition of key Terms


4.2.1 Statistical Hypothesis

This is a verbal statement or claim made about a population parameter. In other words, it is a
premise or claim that we want to test.

4.2.2 Hypothesis Testing


This is a technique involving a set of rules that are followed in order to choose between two
conflicting hypotheses. In other words, it is a process that uses sample statistics to test a claim
about the value of a population parameter. The two conflicting hypotheses are referred to as the
null and alternative hypotheses. For example, suppose the local government at Temeke claims that
the average monthly household income in its locality is at least Tshs. 100,000. A sample would be
taken to test this claim, and a pair of hypotheses would be stated: one representing the claim and
the other its complement. When one of these hypotheses is false, the other must be true.

4.2.2.1 A null hypothesis


This is the statement to be tested, also known as the currently accepted claim about a particular
population parameter. We denote the null hypothesis by $H_0$ (read as "H sub-zero" or "H naught"),
and it is customary to state this hypothesis using one of the signs =, ≥ or ≤. This means that the
statement containing the equality sign is always $H_0$.

4.2.2.2 Alternative hypothesis


The alternative hypothesis, also known as the researcher's own hypothesis, is the statement that is
accepted when the null hypothesis is rejected. In other words, the alternative hypothesis is the
complement of the null hypothesis. We denote the alternative hypothesis by $H_1$ or $H_a$, and it
contains a statement of strict inequality such as ≠, > or <. Note that a statistical hypothesis is
usually given in the form of a verbal statement; in order to write down the two complementary
hypotheses we must first translate the verbal statement into a mathematical statement.

4.3 Types of Error
In making a statistical decision it is possible to commit an error. This is because the procedures for
testing hypotheses rely on sample data, and because sample data are not completely reliable there
is a possibility of reaching a wrong conclusion about the population parameter being tested.
Hence, in carrying out hypothesis testing, there are two types of error that can be committed:
Type I and Type II errors.

4.3.1 Type I error

This is committed when we reject a true $H_0$, i.e. the null hypothesis was not supposed to be
rejected, but due to either insufficient or wrong data we wrongly decide to reject it. The
significance level ($\alpha$) is defined as the probability of a Type I error. Thus $\alpha$ gives us
the probability of wrongly rejecting a true null hypothesis.

4.3.2 Type II error

This is committed when we fail to reject $H_0$ when it is really false, i.e. we accept $H_0$ when it
is not true. The probability of a Type II error is denoted by the Greek letter $\beta$ (beta).

The following table summarizes the relationship between $H_0$ and the two types of error.

Decision         $H_0$ True                  $H_0$ False
Reject $H_0$     Type I error ($\alpha$)     Correct decision
Accept $H_0$     Correct decision            Type II error ($\beta$)
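
The interpretation of $\alpha$ as the probability of a Type I error can be checked by simulation. The sketch below is illustrative only: the function name, the standard normal population, the sample size and the number of trials are assumptions chosen for the demonstration and do not come from the text. It repeatedly samples from a population in which $H_0: \mu = 0$ is true, applies a two-tailed Z test with known $\sigma = 1$, and reports how often the true $H_0$ is wrongly rejected.

```python
import random
from statistics import NormalDist, mean


def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=1):
    """Share of trials in which a true H0: mu = 0 is rejected (sigma = 1 known)."""
    rng = random.Random(seed)                      # fixed seed for reproducibility
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: the population mean really is 0.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z0 = mean(sample) / (1.0 / n ** 0.5)       # (x_bar - 0) / (sigma / sqrt(n))
        if abs(z0) > z_crit:                       # falls in the rejection region
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    print(type_i_error_rate(alpha=0.05))
```

With `alpha=0.05` the returned share should be close to 0.05, since every rejection in this setting is a Type I error.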

4.4 Types of Statistical Tests


There are two types of hypothesis tests: one-sided (one-tailed) and two-sided (two-tailed) tests.
The type of test depends on the nature of the alternative hypothesis. The test is one-sided if the
alternative hypothesis takes the form $H_1: \mu > \mu_0$ or $H_1: \mu < \mu_0$, and two-sided if the
alternative hypothesis takes the form $H_1: \mu \neq \mu_0$. That is, in general:
1. If the alternative hypothesis contains the not-equal-to symbol (≠), the test is a two-tailed
test, i.e. $H_1: \mu \neq \mu_0$.
2. If the alternative hypothesis contains the greater-than symbol (>), the test is a right-tailed
test, i.e. $H_1: \mu > \mu_0$.
3. If the alternative hypothesis contains the less-than symbol (<), the test is a left-tailed
test, i.e. $H_1: \mu < \mu_0$.
Furthermore, a test is two-tailed when the hypothesis about the population mean is rejected for a
value falling into either tail of the sampling distribution, as indicated in Figure 2.

Figure 2: Two Tailed Test

When the hypothesis about the population mean is rejected only for a value falling into one of the
tails of the sampling distribution, the test is known as a one-tailed test (see Figure 3).
For a one-tailed hypothesis test we have

Figure 3: Right Tailed Test and Left Tailed Test

The area of the shaded region is equal to the significance level $\alpha$. For a two-tailed test we
divide the significance level equally between the upper and lower tails, so that each tail has area
$\alpha/2$.

The critical value is obtained from statistical tables depending on the sampling distribution of the
test statistic. For example if normal distribution is assumed then the standard normal table should
be used to obtain the critical value, if it is the student’s t-distribution then the t-distribution table is
appropriate.
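
When the standard normal distribution applies, the table lookup can be reproduced in software. The following is a minimal sketch using Python's standard library; the function name `z_critical` and the tail labels are hypothetical, introduced here only for illustration.

```python
from statistics import NormalDist


def z_critical(alpha, tail):
    """Critical value(s) of the standard normal distribution for a given test type."""
    std_normal = NormalDist()                      # mean 0, standard deviation 1
    if tail == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)      # alpha is split between both tails
        return (-z, z)
    z = std_normal.inv_cdf(1 - alpha)              # the whole of alpha sits in one tail
    if tail == "right":
        return (z,)
    if tail == "left":
        return (-z,)
    raise ValueError("tail must be 'left', 'right' or 'two'")


if __name__ == "__main__":
    for tail in ("left", "right", "two"):
        print(tail, [round(v, 3) for v in z_critical(0.05, tail)])
```

For $\alpha = 0.05$ this gives approximately ±1.645 for the one-tailed tests and ±1.96 for the two-tailed test, matching the usual table values.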
4.5 Formulation of Null and Alternative hypotheses
So far we have seen the main differences between the null and alternative hypotheses. We now
demonstrate how to formulate the two conflicting hypotheses using the following examples.

Example 4.1

State the null and alternative hypotheses and identify which represents the claim.
(i) A local government at Ubungo Municipality claims that the mean monthly
household income in its locality is at least Tshs. 50,000.
(ii) A Pepsi Cola manufacturer claims that less than a quarter of the Tanzanian population
drinks Pepsi Cola.
(iii) The mean weight of babies born in 2019 is at most 3.5 kg.

Example 4.2

A company has stated that its plastics machine makes plastics that are 8 mm in diameter. A
worker believes the machine no longer makes plastics of this size and samples 100 plastics to
perform a hypothesis test with 99% confidence. State both the null and alternative hypotheses.

Example 4.3

Doctors believe that the average child sleeps for at most 10 hours per day. A researcher
believes that children on average sleep longer. Write down $H_0$ and $H_1$.

Example 4.4

The school board claims that at least 80% of students bring a phone to school. A teacher believes
this number is too high and randomly samples 25 students to test at a level of significance of 0.02.
Write down $H_0$ and $H_1$.

4.6 Approaches For Hypothesis Testing


There are two complementary approaches to hypotheses testing:
a) The test of significance approach
b) The confidence interval approach

4.6.1 The Test of Significance Approach


The key idea behind this approach is the test statistic and its probability distribution under the
hypothesized value of the population parameter. This means we use a test statistic computed from
sample information to make the decision. There are two techniques under this approach: the
critical value and P-value techniques. The test procedures are as set out below.

4.6.1.1 Critical Value Approach

The testing procedure involves choosing a suitable test statistic and dividing its range of values
into two regions, known as the rejection and non-rejection regions. This partitioning is done at the
critical value(s). The size of the rejection region is the probability of committing a Type I error.
This probability is also known as the level of significance. A critical value is the value of the test
statistic that cuts off a tail area equal to the level of significance (or to half of it in each tail for a
two-tailed test).

Procedures for testing hypotheses (critical value approach):

1. Identify $H_0$, $H_1$ and $\alpha$.
2. Choose the appropriate test statistic and compute its value.
3. Using the given level of significance, find the critical value(s) and specify the rejection and
non-rejection regions of the distribution. Also locate the value of the test statistic in the
distribution.
4. Make a statistical decision based on where the value of the test statistic falls, and then give a
managerial decision, conclusion or comment.

4.6.1.2 P-value approach

Alternatively, after computing the appropriate test statistic in step two, we can make the decision
by calculating its corresponding probability value (commonly known as the P-value) and comparing
it with the given level of significance. If the P-value is greater than the given level of significance
(i.e. P-value > $\alpha$) we fail to reject the null hypothesis; if the P-value is less than or equal to
the level of significance (i.e. P-value ≤ $\alpha$) the null hypothesis is rejected. This option is often
employed in statistical computer packages. The test procedure under this technique is as
indicated below.

Test procedure under the P-value approach
1. Identify the two conflicting hypotheses (i.e. $H_0$ and $H_1$) and the level of significance ($\alpha$).
2. Compute the appropriate test statistic.
3. Calculate the P-value. If the appropriate test statistic is $Z_0$, the P-value is obtained as
indicated below:

Test Type       P-value
Right tailed    $P(Z > Z_0)$
Left tailed     $P(Z < Z_0)$
Two tailed      $2 \times P(Z > |Z_0|)$

4. Decision:
If P-value ≤ $\alpha$, reject $H_0$ and accept $H_1$.
If P-value > $\alpha$, do not reject (fail to reject) $H_0$.
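
The P-value rules above can be sketched in a few lines of Python for a Z statistic. This is an illustrative sketch only: the function names and the value $Z_0 = -2.1$ used in the demonstration are hypothetical and are not taken from the text.

```python
from statistics import NormalDist


def z_p_value(z0, tail):
    """P-value of an observed Z statistic for a left-, right- or two-tailed test."""
    phi = NormalDist().cdf                 # standard normal cumulative distribution
    if tail == "right":
        return 1 - phi(z0)                 # P(Z > z0)
    if tail == "left":
        return phi(z0)                     # P(Z < z0)
    if tail == "two":
        return 2 * (1 - phi(abs(z0)))      # 2 x P(Z > |z0|)
    raise ValueError("tail must be 'left', 'right' or 'two'")


def decide(p_value, alpha):
    """Reject H0 when the P-value does not exceed the level of significance."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


if __name__ == "__main__":
    p = z_p_value(-2.1, "left")            # hypothetical test statistic
    print(round(p, 4), decide(p, alpha=0.05))
```

The same observed statistic can lead to different decisions depending on the tail type, because the two-tailed P-value is twice the one-tailed value.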
4.6.2 Confidence Interval approach
The following procedure is used under the confidence interval approach:
1. Identify the null hypothesis ($H_0$), the alternative hypothesis ($H_1$) and the level of
significance ($\alpha$).
2. Construct the appropriate confidence interval using the given sample statistics.
3. Make both statistical and managerial decisions based on whether the hypothesized value
falls within or outside the confidence interval.
Key Concepts

Test statistic
This is calculated from the sample and depends on the nature of the test and the corresponding
sampling distribution for the population parameter. The following are the formulas for calculating
the test statistic:

1. $Z_0 = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ = standardized test statistic for a population mean

2. $Z_0 = \dfrac{\hat{p} - P_0}{\sqrt{P_0(1 - P_0)/n}}$ = standardized test statistic for a population proportion

3. $t_0 = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$ = standardized test statistic for a population mean, appropriate for a
small sample
Level of significance ($\alpha$)
This is the maximum allowable probability of committing a Type I error. It is denoted by the
lowercase Greek letter alpha ($\alpha$). The commonly used levels of significance in research are:
$\alpha$ = 1%, $\alpha$ = 5% and $\alpha$ = 10%
Critical value
A critical value is a value that separates the curve of the sampling distribution into the critical
region (where we reject the null hypothesis) and the non-rejection region. This value depends on the
nature of the alternative hypothesis, the sampling distribution that applies and the significance
level $\alpha$, and it is obtained from statistical tables. If the normal distribution is assumed then
the standard normal table is used to obtain the critical value; if it is Student's t-distribution then
the t-distribution table is appropriate.

Rejection Region
The critical region (or rejection region) is the set of all values of the test statistic that cause us to
reject the null hypothesis. For example in the case of right tailed test, the rejection region can be
shown as indicated in the following normal curve:

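[Figure: normal curve centred at 0 for a right-tailed test, with the non-rejection region (NRR) to
the left of the critical value and the rejection region (RR) in the right tail]

where $Z_\alpha$ denotes the critical value in this case.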

4.7 Hypothesis testing for the population mean ($\mu$)

4.7.1 Hypothesis testing for $\mu$ when $\sigma^2$ is known

As usual, the first step is to identify the two conflicting hypotheses ($H_0$ and $H_1$) and state the
level of significance ($\alpha$). Since the population variance ($\sigma^2$) is known in this case,
regardless of the sample size the appropriate test statistic (T.S) is

$Z_0 = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$

The critical value(s) depend on the nature of the alternative hypothesis ($H_1$) stated in the
question. That is, if:
$H_1: \mu < \mu_0$ then the critical value is $-Z_\alpha$

$H_1: \mu > \mu_0$ then the critical value is $Z_\alpha$

$H_1: \mu \neq \mu_0$ then the critical values are $\pm Z_{\alpha/2}$

Note: Under the P-value approach, after calculating the test statistic

$Z_0 = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$,

we compute the corresponding P-value using the expressions stated earlier in the testing
procedures.

Example 4.5:
A random sample of size 16 was taken from a normal population with variance 4. The information
from the sample showed that $\bar{x} = 9.8$. Test $H_0: \mu = 9.0$ against the alternative
$H_1: \mu \neq 9.0$ at the 5% level of significance. (Hint: Use both the critical value and P-value
approaches)
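
One way the computation for Example 4.5 could go is sketched below. It assumes a two-tailed alternative and takes $\sigma = \sqrt{4} = 2$; the values 1.96 and $P(Z > 1.6) \approx 0.0548$ are read from the standard normal table.

$$Z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{9.8 - 9.0}{2/\sqrt{16}} = \frac{0.8}{0.5} = 1.6$$

$$\text{P-value} = 2 \times P(Z > 1.6) = 2(0.0548) = 0.1096$$

Since $|Z_0| = 1.6 < Z_{0.025} = 1.96$ and the P-value $0.1096 > 0.05$, both approaches lead to the same decision: we fail to reject $H_0$ at the 5% level.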

Example 4.6

The average IQ of the adult population is 100 with a standard deviation of 15. A researcher believes
this value has changed. The researcher decides to test the IQ of 75 randomly selected adults. The
average IQ of the sample is 105. Is there enough evidence to suggest the average IQ has changed?
Use $\alpha$ = 0.05.

Example 4.7

Use the confidence interval approach to test the hypothesis stipulated in Example 4.5.
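
The confidence interval approach can be sketched as follows, using the sample information of Example 4.5 ($\bar{x} = 9.8$, $\sigma^2 = 4$, $n = 16$, hypothesized mean 9.0, $\alpha = 5\%$). The function names are hypothetical, and a two-sided interval is assumed.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, alpha):
    """Two-sided (1 - alpha) confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


def ci_decision(mu0, interval):
    """Fail to reject H0 when the hypothesized mean lies inside the interval."""
    low, high = interval
    return "fail to reject H0" if low <= mu0 <= high else "reject H0"


if __name__ == "__main__":
    interval = z_confidence_interval(x_bar=9.8, sigma=2.0, n=16, alpha=0.05)
    print([round(v, 2) for v in interval], ci_decision(9.0, interval))
```

The interval is roughly 8.82 to 10.78. Because the hypothesized value 9.0 lies inside it, the null hypothesis is not rejected at the 5% level.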

4.7.2 Hypothesis testing for $\mu$ when $\sigma^2$ is unknown

There are two cases in this situation.
4.7.2.1 When the sample size is large ($n \geq 30$)

When the population variance ($\sigma^2$) is unknown and the sample size is large, the appropriate
test statistic is again $Z_0$, and we test the null hypothesis $H_0$ against one of the alternative
hypotheses. In this case $\sigma$ is replaced by the sample standard deviation $s$, and the test
statistic is given by

$Z_0 = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$

The critical value(s) depend on the nature of the alternative hypothesis ($H_1$) stated in the
question. That is, if:
$H_1: \mu < \mu_0$ then the critical value is $-Z_\alpha$

$H_1: \mu > \mu_0$ then the critical value is $Z_\alpha$

$H_1: \mu \neq \mu_0$ then the critical values are $\pm Z_{\alpha/2}$

Note: Under the P-value approach, after calculating the test statistic
$Z_0 = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$, we compute the corresponding P-value using the expressions
stated earlier in the testing procedures.

Example 4.8
A study is conducted to look at the average time students spend exercising. Currently it is believed
that they spend at least 16 hours per month exercising. However, a researcher claims that on
average students exercise for less than 16 hours per month, contrary to what is currently believed.
In a random sample of size n = 120 he finds that the mean time students exercise is
$\bar{x}$ = 12.3 h/month with s = 7.43 h/month. Use the test of significance approach to test the
above claim using $\alpha$ = 10%.
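
A possible numerical check of Example 4.8 is sketched below. It treats the test as left-tailed ($H_0: \mu \geq 16$ against $H_1: \mu < 16$), which is one reading of the researcher's claim; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_z(x_bar, mu0, s, n):
    """Z statistic with the sample standard deviation s standing in for sigma."""
    return (x_bar - mu0) / (s / sqrt(n))


ALPHA = 0.10
z0 = large_sample_z(x_bar=12.3, mu0=16, s=7.43, n=120)
z_crit = -NormalDist().inv_cdf(1 - ALPHA)      # left-tailed critical value, -Z_alpha
p_value = NormalDist().cdf(z0)                 # P(Z < z0) for a left-tailed test

if __name__ == "__main__":
    print(f"Z0 = {z0:.3f}, critical value = {z_crit:.3f}, P-value = {p_value:.2e}")
    print("reject H0" if z0 < z_crit else "fail to reject H0")
```

The statistic is about -5.46, far below the critical value of about -1.28, so under these assumptions the sample supports the researcher's claim at the 10% level.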

Example 4.9

A manufacturing company claims that its rechargeable batteries are good for an average of more
than 1,000 charges. A random sample of 81 batteries has a mean life of 1,002 charges and a
standard deviation of 14. Is there enough evidence to support this claim at $\alpha$ = 0.01?

Example 4.10

A local telephone company claims that the average length of a phone call is 8 minutes. In a random
sample of 58 phone calls, the sample mean was 7.8 minutes and the standard deviation was 0.5
minutes. Is there enough evidence to support this claim at $\alpha$ = 0.05?

4.7.2.2 When The Sample Size Is Small ($n < 30$)

When the population variance ($\sigma^2$) is unknown and the sample size is small, the $Z$
statistic can no longer be used. The appropriate test statistic is $t_0$, which has a
t-distribution with $n - 1$ degrees of freedom and is given by

$t_0 = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$

The critical value(s) in this situation depend on the nature of the alternative hypothesis stated
in the question. That is, if:
$H_1: \mu < \mu_0$ then the critical value is $-t_{\alpha,(n-1)}$

$H_1: \mu > \mu_0$ then the critical value is $t_{\alpha,(n-1)}$

$H_1: \mu \neq \mu_0$ then the critical values are $\pm t_{\alpha/2,(n-1)}$

Note that if the appropriate test statistic is $t_0$, the P-value is obtained as indicated below:

Test Type     P-value
Right tail    $P(t > t_0)$
Left tail     $P(t < t_0)$
Two tail      $2 \times P(t > |t_0|)$

Note that the degrees of freedom ($n - 1$) depend on the sample size given in the question.
Example 4.11:
Let X be a normally distributed random variable with unknown mean and variance. Information
obtained from a sample of size 20 showed that $\bar{x} = 9.5$ and $s = 3$. Test the null hypothesis
that the true mean is 11 against the alternative that it is different from 11 at the 5% level of
significance. Hint: use the test of significance approach.
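
The critical value approach for Example 4.11 can be sketched as follows. Python's standard library has no t-distribution, so the critical value $t_{0.025,19} = 2.093$ is entered by hand from a t-table; the function and constant names are hypothetical.

```python
from math import sqrt


def one_sample_t(x_bar, mu0, s, n):
    """Return the t statistic and its degrees of freedom (n - 1)."""
    t0 = (x_bar - mu0) / (s / sqrt(n))
    return t0, n - 1


T_CRIT = 2.093   # t_{0.025, 19} from a t-table (two-tailed test, alpha = 5%)

t0, df = one_sample_t(x_bar=9.5, mu0=11, s=3, n=20)
decision = "reject H0" if abs(t0) > T_CRIT else "fail to reject H0"

if __name__ == "__main__":
    print(f"t0 = {t0:.3f} with {df} degrees of freedom: {decision}")
```

Here $t_0 \approx -2.236$ lies beyond $-2.093$, so the statistic falls in the rejection region and the hypothesis that the true mean is 11 is rejected at the 5% level.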

Example 4.12

The average IQ of the adult population is 100 with a standard deviation of 15. A researcher believes
the average IQ is lower. A random sample of 5 adults is tested and the scores are 69, 79, 89, 99 and
109. Is there enough evidence to suggest the average IQ is lower? Hint: Use the critical value and
confidence interval approaches.

Example 4.13

Airtel claims that the average length of a phone call is 8 minutes. In a random sample of 25 phone
calls, the sample mean was 9.8 minutes and the standard deviation was 0.5 minutes. Is there
enough evidence to support this claim at $\alpha$ = 0.05? (Hint: Use the test of significance approach)

Example 4.14

A manufacturer claims that its rechargeable batteries have an average life greater than 1,000
charges. A random sample of 10 batteries has a mean life of 1,002 charges and a standard deviation
of 14. Is there enough evidence to support this claim at $\alpha$ = 0.1? (Use the test of significance
approach)

4.8 Hypothesis testing for the difference between two population means
4.8.1 When population variances are known
As usual the following procedure is used. Step one is to formulate the two conflicting
hypotheses. That is, we test $H_0: \mu_1 - \mu_2 = c$ against one of the alternative hypotheses

$H_1: \mu_1 - \mu_2 < c$ or $H_1: \mu_1 - \mu_2 > c$ or $H_1: \mu_1 - \mu_2 \neq c$

The appropriate test statistic is

$Z_0 = \dfrac{(\bar{x}_1 - \bar{x}_2) - c}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$

Under the critical value approach, the above test statistic is compared with the critical value(s).
That is, if:
$H_1: \mu_1 - \mu_2 < c$ then the critical value is $-Z_\alpha$

$H_1: \mu_1 - \mu_2 > c$ then the critical value is $Z_\alpha$

$H_1: \mu_1 - \mu_2 \neq c$ then the critical values are $\pm Z_{\alpha/2}$

4.8.2 When the population variances are unknown

There are two cases to consider in this situation.
4.8.2.1 When both samples are large ($n_1 \geq 30$ and $n_2 \geq 30$)

In this case the test procedure is similar to the previous situation except that the population
variances ($\sigma_i^2$) are replaced by the sample variances ($s_i^2$) for $i = 1, 2$.
The test statistic is thus

$Z_0 = \dfrac{(\bar{x}_1 - \bar{x}_2) - c}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$

Under the critical value approach, the above test statistic is compared with the critical value(s).
That is, if:

$H_1: \mu_1 - \mu_2 < c$ then the critical value is $-Z_\alpha$

$H_1: \mu_1 - \mu_2 > c$ then the critical value is $Z_\alpha$

$H_1: \mu_1 - \mu_2 \neq c$ then the critical values are $\pm Z_{\alpha/2}$
Example 4.15:
Let X denote the annual income of workers in the financial sector and Y denote the annual income
of workers in the education sector. Data collected from these two sectors provide the following
information: $n_x = 210$, $\sum X = 11{,}340$, $\sum (X - \bar{X})^2 = 13{,}376$, $n_y = 190$,
$\sum Y = 8{,}930$, $\sum (Y - \bar{Y})^2 = 10{,}584$. Test the null hypothesis that the two sectors
have the same mean annual income against the alternative hypothesis that the financial sector pays
more. Use $\alpha = 5\%$.
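
A numerical sketch of Example 4.15 follows. It converts the given sums into sample means and variances and then applies the large-sample Z statistic with $c = 0$ as a right-tailed test. The function and variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, var1, n1, mean2, var2, n2, c=0.0):
    """Z statistic for H0: mu1 - mu2 = c using the two sample variances."""
    return ((mean1 - mean2) - c) / sqrt(var1 / n1 + var2 / n2)


# Summary data as given in the example.
n_x, sum_x, ss_x = 210, 11_340, 13_376     # financial sector
n_y, sum_y, ss_y = 190, 8_930, 10_584      # education sector

mean_x, var_x = sum_x / n_x, ss_x / (n_x - 1)   # sample mean and variance of X
mean_y, var_y = sum_y / n_y, ss_y / (n_y - 1)   # sample mean and variance of Y

z0 = two_sample_z(mean_x, var_x, n_x, mean_y, var_y, n_y)
z_crit = NormalDist().inv_cdf(1 - 0.05)         # right-tailed critical value

if __name__ == "__main__":
    print(f"means: {mean_x:.1f}, {mean_y:.1f}; variances: {var_x:.1f}, {var_y:.1f}")
    print(f"Z0 = {z0:.2f}, critical value = {z_crit:.3f}")
```

The sample means are 54 and 47 with variances 64 and 56, giving $Z_0 \approx 9.04$. This is far above 1.645, so the data support the alternative that the financial sector pays more.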

4.8.2.2 When either one or both of the samples are small

As in estimation, in this case we assume that the population variances are equal and can be
estimated by the pooled sample variance $s_p^2$, where

$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$

The test statistic is no longer $Z_0$; instead $t_0$ is used, which has a t-distribution with
$n_1 + n_2 - 2$ degrees of freedom, where

$t_0 = \dfrac{(\bar{x}_1 - \bar{x}_2) - c}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$

The critical value(s) in this situation depend on the nature of the alternative hypothesis stated
in the question. That is, if:
$H_1: \mu_1 - \mu_2 < c$ then the critical value is $-t_{\alpha,(n_1 + n_2 - 2)}$

$H_1: \mu_1 - \mu_2 > c$ then the critical value is $t_{\alpha,(n_1 + n_2 - 2)}$

$H_1: \mu_1 - \mu_2 \neq c$ then the critical values are $\pm t_{\alpha/2,(n_1 + n_2 - 2)}$

Example 4.16:

A random sample of size 16 showed an average of 480g with a standard deviation of 21g; on the
other hand, a sample of size 25 resulted in an average of 490g with a standard deviation of 24g.
Test the null hypothesis $H_0: \mu_1 - \mu_2 = 0$ against the alternative $H_1: \mu_1 - \mu_2 \neq 0$
at the 5% level of significance.
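
The pooled-variance calculation for Example 4.16 can be sketched as below. Only the test statistic and degrees of freedom are computed, since the critical value depends on the form of $H_1$ and must be read from a t-table. The function name is hypothetical.

```python
from math import sqrt


def pooled_t(mean1, s1, n1, mean2, s2, n2, c=0.0):
    """Pooled-variance t statistic for H0: mu1 - mu2 = c (equal variances assumed)."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df   # pooled sample variance
    t0 = ((mean1 - mean2) - c) / sqrt(sp2 / n1 + sp2 / n2)
    return t0, sp2, df


t0, sp2, df = pooled_t(mean1=480, s1=21, n1=16, mean2=490, s2=24, n2=25)

if __name__ == "__main__":
    print(f"pooled variance = {sp2:.2f}, t0 = {t0:.3f}, df = {df}")
```

This gives a pooled variance of about 524.08 and $t_0 \approx -1.364$ with 39 degrees of freedom. For a two-tailed test at 5% the table value $t_{0.025,39}$ is approximately 2.02, so under that reading $H_0$ would not be rejected.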

4.10 Matched Pair t-test for Dependent Samples


The matched pair t-test is used in situations where the same measurement is taken twice on the
same unit. The measurements may be taken at two different periods of time or under two different
circumstances. Examples are measuring sales before and after running an advert, measuring
workers' performance before and after training, etc. The objective is to study whether there is a
significant change between the two periods of time or circumstances. In this case the two samples
are not independent because both measurements are taken on the same units, which have either
undergone some changes over time or been affected by a certain treatment. This is the case in
before-and-after situations.

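The hypotheses are given by

$H_0: \mu_{after} = \mu_{before}$, i.e. $H_0: \mu_{after} - \mu_{before} = 0$

$H_1: \mu_{after} \neq \mu_{before}$, i.e. $H_1: \mu_{after} - \mu_{before} \neq 0$ (with > or < in place of ≠ for a one-tailed test)

The test statistic for the matched pair t-test is given by

$t_c = \dfrac{\bar{x}_d - \mu_d}{s_d/\sqrt{n}}$

where $x_d$ is the difference within each pair ($x_{after} - x_{before}$),

$\bar{x}_d = \dfrac{\sum x_d}{n}$ is the mean of the differences, $n$ is the number of pairs, and

$s_d = \sqrt{\dfrac{\sum (x_d - \bar{x}_d)^2}{n - 1}} = \sqrt{\dfrac{\sum x_d^2 - \frac{(\sum x_d)^2}{n}}{n - 1}}$

is the standard deviation of the differences.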

Example 4.17

A company sent seven of its employees to attend a course in building self-confidence. These
employees were evaluated for their self-confidence before and after attending this course. The
following table gives the scores (on a scale of 1 to 15, 1 being the lowest and 15 being the highest
score) of these employees before and after they attended the course.

Before 9 8 11 5 7 6 10
After 9 5 9 4 5 8 5

Test whether attending the course has positively affected the self-confidence of the employees.

Solution

$H_0: \mu_{after} = \mu_{before}$
$H_1: \mu_{after} > \mu_{before}$

These can also be stated as

H0: The training has no effect on the employees' scores

H1: The training has a positive effect on the employees' scores

Before ($x_b$)   After ($x_a$)   $x_d = x_a - x_b$   $x_d^2$
9                9               0                   0
8                5               -3                  9
11               9               -2                  4
5                4               -1                  1
7                5               -2                  4
6                8               2                   4
10               5               -5                  25
Total                            $\sum x_d = -11$    $\sum x_d^2 = 47$

$\bar{x}_d = \dfrac{\sum x_d}{n} = \dfrac{-11}{7} = -1.57$

$s_d = \sqrt{\dfrac{\sum x_d^2 - \frac{(\sum x_d)^2}{n}}{n - 1}} = \sqrt{\dfrac{47 - \frac{(-11)^2}{7}}{7 - 1}} = 2.225$

$t_c = \dfrac{\bar{x}_d - \mu_d}{s_d/\sqrt{n}} = \dfrac{-1.57 - 0}{2.225/\sqrt{7}} = -1.8669$

The critical value is given by

$t_{\alpha, n-1} = t_{0.05, 6} = 1.943$

The calculated $t_c$ value falls in the non-rejection region ($t_c = -1.8669 < 1.943$), so we do not
reject the null hypothesis at the 5% level, and we conclude that the training had no significant
positive effect on the employees' scores.
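
The hand calculation in this solution can be verified with a short script. This is a sketch using Python's standard library; the function name is hypothetical, and the critical value $t_{0.05,6} = 1.943$ is entered from a t-table because the standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

before = [9, 8, 11, 5, 7, 6, 10]
after = [9, 5, 9, 4, 5, 8, 5]


def paired_t(before, after, mu_d=0.0):
    """Matched pair t statistic computed on the differences after - before."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    d_bar = mean(diffs)        # mean of the differences
    s_d = stdev(diffs)         # sample standard deviation (divisor n - 1)
    t_c = (d_bar - mu_d) / (s_d / sqrt(n))
    return t_c, d_bar, s_d


T_CRIT = 1.943   # t_{0.05, 6} from a t-table (right-tailed test)

t_c, d_bar, s_d = paired_t(before, after)
decision = "reject H0" if t_c > T_CRIT else "fail to reject H0"

if __name__ == "__main__":
    print(f"mean diff = {d_bar:.3f}, s_d = {s_d:.3f}, t_c = {t_c:.3f}: {decision}")
```

Carrying full precision gives $t_c \approx -1.868$, which agrees with the hand calculation up to the rounding of $\bar{x}_d$ and $s_d$, and the decision is unchanged.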

4.11 Hypothesis Testing For Population Proportions

In testing a hypothesis about a population proportion P the same procedure is employed, as
indicated below.

1. Identify $H_0$, $H_1$ and $\alpha$.
2. Choose the appropriate test statistic and compute its value.
3. Using the given level of significance, find the critical value(s) and specify the rejection and
non-rejection regions of the distribution. Also locate the value of the test statistic in the
distribution.
4. Make a statistical decision based on where the value of the test statistic falls.
5. Give a managerial decision, conclusion or comment.

We test the null hypothesis $H_0: P = P_0$ against one of the alternative hypotheses

$H_1: P < P_0$ or $H_1: P > P_0$ or $H_1: P \neq P_0$

We assume that the sampling distribution of the sample proportion $\hat{p}$ is approximately
normal, and hence the test statistic is

$Z_0 = \dfrac{\hat{p} - P_0}{\sqrt{\dfrac{P_0(1 - P_0)}{n}}}$, where $\hat{p} = \dfrac{x}{n}$

Note:
The test can be carried out if the following condition holds:
$n\hat{p}(1 - \hat{p}) > 9$

Example 4.18:
A marketing company claims that it receives an 8% response rate from its mailings. To test this
claim, a random sample of 500 mailings was surveyed, yielding 25 responses. Test this claim at the
$\alpha$ = 0.05 significance level. Hint: use the test of significance approach.
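
A numerical sketch of Example 4.18 is given below, treating the claim as $H_0: P = 0.08$ against a two-tailed alternative, which is one reading of the question. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z(x, n, p0):
    """Z statistic for H0: P = p0, together with the sample proportion."""
    p_hat = x / n
    z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return z0, p_hat


N, X, P0, ALPHA = 500, 25, 0.08, 0.05

z0, p_hat = proportion_z(X, N, P0)
normal_ok = N * p_hat * (1 - p_hat) > 9            # condition for the normal approximation
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)       # two-tailed critical value
p_value = 2 * (1 - NormalDist().cdf(abs(z0)))      # two-tailed P-value

if __name__ == "__main__":
    print(f"p_hat = {p_hat:.2f}, Z0 = {z0:.3f}, P-value = {p_value:.4f}")
    print("reject H0" if abs(z0) > z_crit else "fail to reject H0")
```

The sample proportion is 0.05, the normality condition holds (23.75 > 9), and $Z_0 \approx -2.47$ lies outside ±1.96, so under these assumptions the 8% claim is rejected at the 5% level.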

4.12 Testing for the Equality between two Population Proportions

Let $\hat{p}_1$ and $\hat{p}_2$ be the sample proportions obtained from large samples of sizes $n_1$
and $n_2$ drawn from respective populations having proportions $P_1$ and $P_2$. We can test the null
hypothesis that there is no difference between the population proportions.
In this case we test the null hypothesis $H_0: P_1 = P_2$ against one of the alternative hypotheses

$H_1: P_1 < P_2$ or $H_1: P_1 > P_2$ or $H_1: P_1 \neq P_2$

Since the two populations are assumed to be independent, the appropriate test statistic is

$Z_0 = \dfrac{(\hat{p}_1 - \hat{p}_2) - (P_1 - P_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$

where $\hat{p}_1 = \dfrac{x_1}{n_1}$ and $\hat{p}_2 = \dfrac{x_2}{n_2}$ are the sample proportions, $P_1 - P_2 = 0$
under $H_0$, and $\bar{p}$ is the pooled sample proportion from the two samples, given by

$\bar{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$

where $x_1$ and $x_2$ are the numbers of successes in the two samples, and $n_1$ and $n_2$ are the
sample sizes.
Example 4.19:
In a random sample of 100 persons taken from village A, 60 are found to be consuming tea. In
another sample of 200 persons from village B, 100 are found to be consuming tea. Do the data
reveal a significant difference between the two villages so far as the habit of taking tea is
concerned? Use the 1% level of significance.
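
Example 4.19 can be checked numerically as sketched below, using the pooled sample proportion and a two-tailed test at the 1% level. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for H0: P1 = P2 based on the pooled sample proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se, p_pool


ALPHA = 0.01
z0, p_pool = two_proportion_z(x1=60, n1=100, x2=100, n2=200)
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)       # two-tailed critical value
decision = "reject H0" if abs(z0) > z_crit else "fail to reject H0"

if __name__ == "__main__":
    print(f"pooled p = {p_pool:.4f}, Z0 = {z0:.3f}, critical = ±{z_crit:.3f}: {decision}")
```

The pooled proportion is about 0.533 and $Z_0 \approx 1.64$, which is inside ±2.576, so the data do not reveal a significant difference between the two villages at the 1% level.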

Review Questions
1. The average salary of graduates entering the ICT field is reported to be Tsh 400,000 per
month. To test this, an ICT company manager surveys 20 graduates and finds their average
salary to be Tsh 432,280 with a standard deviation of Tsh 40,000. Using the 10% level of
significance, has he shown the reported salary to be incorrect? Hint: Use the test of
significance approach.

2. The mean lifetime of a sample of 100 light bulbs produced by a certain company was found
to be 1,580 hrs with a standard deviation of 90 hrs. Test the hypothesis that the mean
lifetime of bulbs produced by the company is 1,600 hrs at the 5% level of significance.
Hint: Use the confidence interval approach.

3. A company uses two machines to fill packets of crisps. A sample of 30 packets from
the first machine had a mean weight of 180g and a standard deviation of 14g. A
sample of 40 packets from the second machine had a mean weight of 170g and a
standard deviation of 10g. Does the evidence from these samples support the view
that the two machines produce packets of equal weights? Use $\alpha = 5\%$.

4. Let X denote the annual income of workers in the financial sector and Y denote the annual
income of workers in the education sector. Data collected from these two sectors provide the
following information: $n_x = 25$, $\sum X = 11{,}340$, $\sum (X - \bar{X})^2 = 13{,}376$, $n_y = 35$,
$\sum Y = 8{,}930$, $\sum (Y - \bar{Y})^2 = 10{,}584$. Test the null hypothesis that the two sectors
have the same mean annual income against the alternative hypothesis that the financial sector
pays more. Use $\alpha = 5\%$.

5. A lecturer wants to know if her introductory statistics class has a good grasp of basic math.
Eight students are chosen at random from the class and given a math proficiency test. The
lecturer wants the class to be able to score above 70 on the test. The eight students get
scores of 62, 92, 75, 68, 83, 75, 90 and 95. Can the lecturer have 90 percent confidence that
the mean score for the class on the test would be above 70?

6. Kiomboi Pharmacy, a prominent manufacturer of a patent medicine, claimed that it was
90% effective in relieving an allergy for a period of 8 hrs. A sample of 200 people who had
the allergy was taken and showed that the medicine provided relief for 160 people.
Determine whether the manufacturer’s claim is legitimate.

7. A cereals businessman took 70 observations of the price of rice per kg in Mbeya region.
The results produced a mean price of Tsh. 700 with a standard deviation of Tsh. 30.
Similarly, 65 observations were taken in Shinyanga region. The results revealed a mean price
of Tsh. 650 with a standard deviation of Tsh. 20. Determine whether there is enough
evidence that the mean price of rice in Mbeya region is significantly greater than that in
Shinyanga region at the 1% level of significance.

8. An automatic bottling machine fills cola into 2-litre bottles. A consumer advocate wants to
test the null hypothesis that the average amount filled by the machine into a bottle is at
least 2 liters. A random sample of 40 bottles coming out of the machine is selected and the
exact contents of the selected bottles are recorded. The sample mean was 1.9996 liters. The
population standard deviation is known from past experience to be 0.0013 liters.
a) Test the hypothesis at α = 5%.
b) Assume that the population is normally distributed with the same σ. Assume that
the sample size is only 20 but the sample mean is the same. Conduct the test once
again at α = 0.05.
c) If there is a difference between the two test results, explain the reason for the difference.

9. St. Michaels Swahili Medium School has 300 students. The head teacher claims that the
average IQ of students at this school is at least 110. To prove his point, he administered an
IQ test to 20 randomly selected students. The results revealed an average IQ of 108 with a
standard deviation of 10. Based on these results, at the 0.01 level, should the head teacher
stick to his original assumption?

10. The same test was given to a group of 100 scouts and to a group of 144 guides. The mean
score for the scouts was 27.53 and the mean score for the guides was 26.81. Assume a
population variance of 12.11 for both the scouts' and the guides' scores.

a) Test whether the scouts’ performance in the test is the same as that of the guides at the
5% level of significance, assuming that the scores are normally distributed. Use the
general procedure for testing hypotheses.
b) If you used interval estimation to test whether the scouts’ performance in the
test is the same as that of the guides, what decision would you make from the 95%
confidence interval? Is it the same decision as in (a) above? Explain.

11. A recent article describes how finance incentives by major automakers are reducing banks’
share of the market for automobile loans. The article reports that in 1990 banks wrote
about 53% of all car loans, and in 2000 the banks’ share was only 43%. Suppose that the 1990
data are based on a random sample of 100 car loans, of which 53 were found to be bank loans,
and that the 2000 data are also based on a random sample of 100 loans, 43 of which were
found to be bank loans. Carry out a test of the equality of the banks’ share of the car loan
market in 1990 and in 2000.

12. Within a District, students were randomly assigned to two Mathematics teachers, Mrs.
Smith and Mrs. Jones. After the assignment, Mrs. Smith had 30 students and Mrs. Jones
had 45 students. At the end of the year, each class took the same standardized test. Mrs.
Smith’s students had an average test score of 78 with a standard deviation of 10, and Mrs.
Jones’s students had an average test score of 85 with a standard deviation of 15. Assume that
students’ performance is approximately normal. Test the hypothesis that Mrs. Smith and
Mrs. Jones are equally effective teachers at the 10% level.

13. The CEO of a large electric utility claims that 80% of his 1,000,000 customers are very
satisfied with the services they receive. To test this claim, the local newspaper surveyed 100
customers using simple random sampling. Among the sampled customers, 73% say they are
very satisfied. Based on this finding, can we reject the CEO’s hypothesis that 80% of the
customers are very satisfied? Use the 5% level of significance.

14. Suppose that the Goodyear Tire Company has historically held 42% of the market for
automobile tires in Tanzania. Recent changes in company operations, especially its
diversification to other areas of business, as well as changes in competing firms’ operations,
prompt the firm to test the validity of the assumption that it still controls 42% of the
market. A random sample of 550 automobiles on the road shows that 219 of them have
Goodyear tires. Conduct the test at α = 0.01.

CHAPTER FIVE
REGRESSION AND CORRELATION ANALYSIS
5.1 Introduction
Regression analysis is concerned with the study of the relationship between one variable, called the
dependent variable, and one or more independent variables. In other words, it is the statistical
analysis that deals with the prediction or estimation of the dependent variable based on the values of
the independent variables. The dependent variable is also called the outcome variable, predicted
variable or explained variable, while the independent variables are also called explanatory
variables. Thus we may be interested in studying how the profitability of a commercial bank is
related to liquidity, capital adequacy ratio, non-performing loans, interest rate and GDP growth.
Or we may be interested in examining how individual income is related to the level of education
acquired and working experience, or how sales of a certain product are attributed to the advertising
expenditure incurred. All these situations are examples where regression analysis can be applied.
Furthermore, the relationship between economic or financial variables can be linear or non-linear.
However, linear regression analysis assumes a linear relationship among the variables of interest,
more specifically a relationship that is linear in the parameters. For consistency of notation we
normally use $Y$ to denote the dependent variable and $X$ to represent the independent
variables.

5.2 Objective of Regression Analysis


The objective of regression analysis may be either

1. To predict, or forecast, the mean value of the dependent variable, given the value of the
independent variable(s).
2. To estimate the mean or average value of the dependent variable, given the value of the
independent variable(s).
3. To test hypotheses about the nature of the dependence, that is, hypotheses suggested by
economic theory.

5.3 Simple Linear Regression Model


Simple linear regression analysis consists of one dependent variable and only one independent
variable, which distinguishes it from multiple linear regression analysis. A regression model with
only one independent variable is therefore termed a simple linear regression model. The general
simple linear regression model is:

$$E(Y \mid X_i) = \beta_1 + \beta_2 X_i$$

The above model is called the deterministic population regression function (PRF), where $\beta_1$
and $\beta_2$ are constant parameters, also known as regression coefficients. The corresponding
stochastic population regression function is:

$$Y_i = \beta_1 + \beta_2 X_i + u_i, \quad i = 1, \cdots, N$$

$$Y_i = E(Y \mid X_i) + u_i$$

where $u_i$ is called the stochastic or random error term. In general, the stochastic PRF indicates
that the actual observation of each individual is equal to the average of its group plus or minus some
quantity. In other words, the stochastic version of the PRF states that any individual Y value can be
expressed as the sum of two components: a deterministic/systematic component ($\beta_1 + \beta_2 X_i$)
and a nonsystematic/random component ($u_i$). Both $Y_i$ and $u_i$ are treated as random
(stochastic) variables, whereas $\beta_1$ and $\beta_2$ as well as $X_i$ are treated as non-random.
Since $Y_i$ is treated as a random variable, it is characterized by a probability distribution, as will
be shown later. The differences between the stochastic PRF and its deterministic counterpart can
be explained in more detail using the following example.

Example 5.1

Economic theory claims that there exists a positive linear relationship between individual
consumption and income, as indicated in the following scatter plot. From the graph, deduce both
the deterministic and the stochastic PRF. Note that individual consumption is the dependent
variable (Y), and income is the independent variable (X).

$$E(Y \mid X_i) = \beta_1 + \beta_2 X_i$$

$$Y_i = \beta_1 + \beta_2 X_i + u_i$$

However, as we have pointed out in the previous sections, we normally use sample information to
represent the whole population because of the cost, time, and complexity of obtaining
information from the whole population. Therefore, the Sample Regression Function (SRF),
which is the estimator of the PRF, is in its deterministic form:

$$\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i$$

and in its stochastic form:

$$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_i + e_i$$

$$Y_i = \hat{Y}_i + e_i$$

Where
$\hat{Y}_i$ = the estimator of $E(Y \mid X_i)$
$\hat{\beta}_1$ = the estimator of $\beta_1$
$\hat{\beta}_2$ = the estimator of $\beta_2$
$e_i$ = the estimator of $u_i$

5.4 The method of Ordinary Least Square (OLS)


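There are several methods of obtaining the SRF as an estimator of the true PRF, namely
inspection, semi-average and least squares. In regression analysis the most common method
is that of least squares, popularly known as the OLS method, and it is the only technique discussed
in these lecture notes. The OLS technique states that the sample regression
coefficients ($\hat{\beta}_i$) should be chosen in such a way that the residual sum of squares (RSS) is
as small as possible. In other words, the OLS method minimizes the sum of squares of the vertical
distances between the actual data points and the points on the regression line. Algebraically, it
states that

$$\text{Minimize: } \sum e_i^2 = \sum (Y_i - \hat{Y}_i)^2 = \sum (Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i)^2$$

If we let $S = \sum e_i^2$ and set $\partial S / \partial \hat{\beta}_1 = 0$ and
$\partial S / \partial \hat{\beta}_2 = 0$, we obtain the following algebraic expressions
for computing $\hat{\beta}_2$ and $\hat{\beta}_1$:

$$\hat{\beta}_2 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}
= \frac{\sum X_i Y_i - n\bar{X}\bar{Y}}{\sum X_i^2 - n\bar{X}^2}
= \frac{n\sum X_i Y_i - \sum X_i \sum Y_i}{n\sum X_i^2 - (\sum X_i)^2}$$

$$\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X} = \frac{1}{n}\left(\sum Y_i - \hat{\beta}_2 \sum X_i\right)$$

By defining the following

$$S_{xx} = \sum X_i^2 - \frac{1}{n}\left(\sum X_i\right)^2$$

$$S_{xy} = \sum X_i Y_i - \frac{1}{n}\sum X_i \sum Y_i$$

$$S_{yy} = \sum Y_i^2 - \frac{1}{n}\left(\sum Y_i\right)^2$$

one can simply express $\hat{\beta}_2 = S_{xy} / S_{xx}$.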
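The two first-order conditions referred to above are not written out in the text. The following is a sketch of the standard derivation, assuming the notation $S$ for the residual sum of squares and $\hat{\beta}_1$, $\hat{\beta}_2$ for the intercept and slope estimators (these symbols are an assumption made here for illustration).

```latex
% First-order conditions for minimising S = \sum (Y_i - \hat\beta_1 - \hat\beta_2 X_i)^2
\frac{\partial S}{\partial \hat\beta_1} = -2 \sum (Y_i - \hat\beta_1 - \hat\beta_2 X_i) = 0
\qquad
\frac{\partial S}{\partial \hat\beta_2} = -2 \sum X_i (Y_i - \hat\beta_1 - \hat\beta_2 X_i) = 0

% Rearranging gives the two normal equations
\sum Y_i = n \hat\beta_1 + \hat\beta_2 \sum X_i
\qquad
\sum X_i Y_i = \hat\beta_1 \sum X_i + \hat\beta_2 \sum X_i^2

% Solving them simultaneously
\hat\beta_2 = \frac{n \sum X_i Y_i - \sum X_i \sum Y_i}{n \sum X_i^2 - (\sum X_i)^2},
\qquad
\hat\beta_1 = \bar{Y} - \hat\beta_2 \bar{X}
```

Dividing the numerator and denominator of the slope formula by $n$ gives the ratio of the sum of cross products to the sum of squares of X about their means.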

Example 5.2
The following pairs of values (X, Y) represent the amount of sales (Y) of a certain product in
millions of Tshs, and the amount spent on advertising the same product (X):

X    5    10   15   20   25
Y    22   32   38   59   67

a) Plot a scatter diagram and estimate the line of best fit
b) Obtain the regression line by using the least squares method
c) Predict the value of Y when X = 110

Solution
a) A scatter diagram and the estimated line of best fit are shown in the figure below:

[Figure: scatter plot of Y (vertical axis, 0 to 80) against X (horizontal axis, 0 to 30) with the fitted line]

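b) Consider the following table of computed statistics:

        X     Y     XY     X²     Y²
        5     22    110    25     484
        10    32    320    100    1024
        15    38    570    225    1444
        20    59    1180   400    3481
        25    67    1675   625    4489
Total   75    218   3855   1375   10922

From the table above we see that

$$\sum X = 75, \quad \sum Y = 218, \quad \sum XY = 3855, \quad \sum X^2 = 1375, \quad \sum Y^2 = 10922$$

Therefore:

$$S_{xx} = \sum X^2 - \frac{1}{n}\left(\sum X\right)^2 = 1375 - \frac{1}{5}(75)^2 = 250$$

$$S_{xy} = \sum XY - \frac{1}{n}\sum X \sum Y = 3855 - \frac{1}{5}(75)(218) = 585$$

$$S_{yy} = \sum Y^2 - \frac{1}{n}\left(\sum Y\right)^2 = 10922 - \frac{1}{5}(218)^2 = 1417.2$$

Hence,

$$\hat{\beta}_2 = \frac{S_{xy}}{S_{xx}} = \frac{585}{250} = 2.34$$

$$\hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X} = \frac{1}{n}\left(\sum Y - \hat{\beta}_2 \sum X\right) = \frac{1}{5}\left(218 - 2.34(75)\right) = 8.5$$

Therefore the required regression line is:

$$\hat{Y} = 8.5 + 2.34X$$

c) If X = 110, then the predicted value of Y is $\hat{Y} = 8.5 + 2.34(110) = 265.9$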
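The hand calculation in Example 5.2 can be checked with a few lines of code. The following is a minimal sketch in plain Python; the function name `ols_simple` and the variable names are chosen here for illustration and do not come from the notes.

```python
# Data from Example 5.2: advertising spend (X) and sales (Y).
X = [5, 10, 15, 20, 25]
Y = [22, 32, 38, 59, 67]


def ols_simple(x, y):
    """Return (intercept, slope, s_xx, s_xy, s_yy) for a simple linear regression."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    # Corrected sums of squares and cross products.
    s_xx = sum(v * v for v in x) - sum_x ** 2 / n
    s_xy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n
    s_yy = sum(v * v for v in y) - sum_y ** 2 / n
    slope = s_xy / s_xx
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope, s_xx, s_xy, s_yy


intercept, slope, s_xx, s_xy, s_yy = ols_simple(X, Y)
prediction = intercept + slope * 110  # part (c) of the example

print(f"S_xx = {s_xx:.1f}, S_xy = {s_xy:.1f}, S_yy = {s_yy:.1f}")
print(f"fitted line: Y = {intercept:.2f} + {slope:.2f} X")
print(f"predicted Y at X = 110: {prediction:.1f}")
```

Note that X = 110 lies far outside the observed range of 5 to 25, so the prediction is an extrapolation and should be read with caution.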

5.5 Sampling distribution for Estimators of regression line


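As we have pointed out earlier, both the dependent variable ($Y_i$) and the stochastic error term
($u_i$) are treated as random variables and are therefore characterized by probability distributions.
In the following sections we discuss the probability distributions for the estimators of the
regression line.

5.5.1 Distribution for $u_i$ and $Y_i$

One of the classical linear regression assumptions states that the stochastic error terms are
independent and identically normally distributed with mean zero and constant variance. That
is to say

$$u_i \sim N(0, \sigma^2)$$

The dependent variable is also normally distributed, with variance $\sigma^2$ and mean
$\beta_1 + \beta_2 X_i$. That is

$$Y_i \sim N(\beta_1 + \beta_2 X_i, \sigma^2)$$

5.5.2 Sampling Distribution for $\hat{\beta}_1$ and $\hat{\beta}_2$

Because $\hat{\beta}_1$ and $\hat{\beta}_2$ are computed from a random sample, these estimators are
themselves random variables and are therefore characterised by a probability distribution. If
the underlying assumptions concerning the probability distribution of $u_i$ hold, then it can
be shown that $\hat{\beta}_1$ is normally distributed with mean $\beta_1$ and variance
$\sigma^2 \sum X_i^2 / (n S_{xx})$. Symbolically,

$$\hat{\beta}_1 \sim N\left(\beta_1, \frac{\sigma^2 \sum X_i^2}{n S_{xx}}\right)$$

Similarly, it can be shown that $\hat{\beta}_2$ is normally distributed with mean $\beta_2$ and
variance $\sigma^2 / S_{xx}$. Symbolically,

$$\hat{\beta}_2 \sim N\left(\beta_2, \frac{\sigma^2}{S_{xx}}\right)$$

Furthermore, the standard errors are $se(\hat{\beta}_1) = \sqrt{Var(\hat{\beta}_1)}$ and
$se(\hat{\beta}_2) = \sqrt{Var(\hat{\beta}_2)}$, so that the corresponding standard normal random
variables are defined as

$$Z = \frac{\hat{\beta}_1 - \beta_1}{se(\hat{\beta}_1)} \sim N(0, 1) \quad \text{or} \quad Z = \frac{\hat{\beta}_2 - \beta_2}{se(\hat{\beta}_2)} \sim N(0, 1)$$

However, in most cases the homoscedastic variance $\sigma^2$ is unknown, hence it is normally
estimated by the following formula:

$$\hat{\sigma}^2 = \frac{\sum e_i^2}{n - 2}$$

where $\hat{\sigma}^2$ is an estimator of $\sigma^2$ and $\sum e_i^2$ is the residual sum of squares (RSS),
defined as

$$\sum e_i^2 = \sum (Y_i - \hat{Y}_i)^2 = S_{yy} - \hat{\beta}_2 S_{xy}$$

Therefore:

$$\hat{\sigma}^2 = \frac{\sum e_i^2}{n - 2} = \frac{1}{n - 2}\left(S_{yy} - \hat{\beta}_2 S_{xy}\right)$$

However, if we replace $\sigma^2$ by its estimator $\hat{\sigma}^2$, the standardized estimators are no
longer normally distributed; instead they follow the t-distribution with $(n - 2)$ d.f., such that

$$T = \frac{\hat{\beta}_1 - \beta_1}{se(\hat{\beta}_1)} \sim t_{(n-2)} \quad \text{or} \quad T = \frac{\hat{\beta}_2 - \beta_2}{se(\hat{\beta}_2)} \sim t_{(n-2)}$$

where $se(\hat{\beta}_1) = \sqrt{\hat{\sigma}^2 \sum X_i^2 / (n S_{xx})}$ and
$se(\hat{\beta}_2) = \sqrt{\hat{\sigma}^2 / S_{xx}}$. The 2 in $(n - 2)$ represents the number of variables
used in the regression model (equivalently, the number of estimated coefficients); in the case of the
simple regression model, there are two.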

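5.6. Estimation for regression coefficients ($\beta_i$)

So far we have seen the sampling distribution of the sample regression coefficients in the previous
section. In the following section we learn how to construct a $(1 - \alpha)100\%$ confidence
interval for the corresponding population regression coefficients. Under estimation, as we have seen
before, we employ sample statistics to estimate unknown population parameters. Therefore sample
regression coefficients are used to estimate the corresponding population regression coefficients.
Hence the confidence interval for $\beta_1$ is given by:

$$\beta_1 = \hat{\beta}_1 \pm t_{\alpha/2,(n-2)} \, se(\hat{\beta}_1)$$

And for $\beta_2$ it is given by

$$\beta_2 = \hat{\beta}_2 \pm t_{\alpha/2,(n-2)} \, se(\hat{\beta}_2)$$

where $n$ is the total number of observations.

Example 5.3
Use the results from Example 5.2 to construct 95% confidence interval estimates for $\beta_1$ and $\beta_2$.
Solution:
Recall from Example 5.2: $S_{xx} = 250$, $S_{yy} = 1417.2$, $S_{xy} = 585$, $\hat{\beta}_1 = 8.5$, $\hat{\beta}_2 = 2.34$
From:

$$\beta_1 = \hat{\beta}_1 \pm t_{\alpha/2,(n-2)} \, se(\hat{\beta}_1)$$

where

$$se(\hat{\beta}_1) = \sqrt{\frac{\hat{\sigma}^2 \sum X_i^2}{n S_{xx}}}$$

We first need to find $\hat{\sigma}^2$:

$$\hat{\sigma}^2 = \frac{\sum e_i^2}{n - 2} = \frac{1}{n - 2}\left(S_{yy} - \hat{\beta}_2 S_{xy}\right) = \frac{1}{3}(1417.2 - 2.34 \times 585) = 16.1$$

Therefore:

$$se(\hat{\beta}_1) = \sqrt{\frac{16.1 \times 1375}{5 \times 250}} = 4.21$$

$$t_{\alpha/2,(n-2)} = t_{0.025,3} = 3.182$$

Therefore the 95% confidence interval for $\beta_1$ is:

$$\beta_1 = 8.5 \pm 3.182 \times 4.21 = 8.5 \pm 13.4$$

The confidence interval for $\beta_2$ is:

$$\beta_2 = \hat{\beta}_2 \pm t_{\alpha/2,(n-2)} \, se(\hat{\beta}_2)$$

where

$$se(\hat{\beta}_2) = \sqrt{\frac{\hat{\sigma}^2}{S_{xx}}} = \sqrt{\frac{16.1}{250}} = 0.254$$

$$\beta_2 = 2.34 \pm 3.182 \times 0.254 = 2.34 \pm 0.81$$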
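The standard errors and interval limits of Example 5.3 can be reproduced as follows. This is a minimal sketch using only the Python standard library; the critical value 3.182 is taken from a t table (as in the example) rather than computed, and the variable names are illustrative assumptions.

```python
import math

# Summary statistics from Example 5.2 (n = 5 observations).
n = 5
sum_x2 = 1375.0                      # sum of squared X values
s_xx, s_xy, s_yy = 250.0, 585.0, 1417.2
b1, b2 = 8.5, 2.34                   # intercept and slope estimates
t_crit = 3.182                       # t table value for 0.025 upper tail, 3 d.f.

# Estimated error variance: RSS / (n - 2).
sigma2_hat = (s_yy - b2 * s_xy) / (n - 2)

# Standard errors of the intercept and the slope.
se_b1 = math.sqrt(sigma2_hat * sum_x2 / (n * s_xx))
se_b2 = math.sqrt(sigma2_hat / s_xx)

# 95% confidence intervals as (lower, upper) pairs.
ci_b1 = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
ci_b2 = (b2 - t_crit * se_b2, b2 + t_crit * se_b2)

print(f"sigma^2 estimate = {sigma2_hat:.2f}")
print(f"se(intercept) = {se_b1:.3f}, 95% CI = ({ci_b1[0]:.2f}, {ci_b1[1]:.2f})")
print(f"se(slope)     = {se_b2:.3f}, 95% CI = ({ci_b2[0]:.2f}, {ci_b2[1]:.2f})")
```

The interval for the intercept contains zero while the interval for the slope does not, which anticipates the outcome of the significance tests on the two coefficients.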

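5.7. Hypothesis testing for Regression Coefficients ($\beta_i$)

In order to embark on hypothesis testing, the following procedure will be used:

1) The first step is to state the null hypothesis against one of the three alternative
hypotheses. That is,
$H_0: \beta_i = \beta_i^*$ against either $H_1: \beta_i \neq \beta_i^*$ or $H_1: \beta_i > \beta_i^*$ or $H_1: \beta_i < \beta_i^*$, for $i = 1, 2$
2) The appropriate test statistic is $T$ such that

$$T = \frac{\hat{\beta}_i - \beta_i^*}{se(\hat{\beta}_i)} \sim t_{(n-2)}$$

3) The critical value depends on the nature of the alternative hypothesis. That is, if
$H_1: \beta_i < \beta_i^*$ then the critical value is $-t_{\alpha,(n-2)}$
$H_1: \beta_i > \beta_i^*$ then the critical value is $t_{\alpha,(n-2)}$
$H_1: \beta_i \neq \beta_i^*$ then the critical values are $\pm t_{\alpha/2,(n-2)}$

4) The last step is to make both the statistical and the managerial decision

where $n$ is the total number of observations.

Example 5.4
Use the information given in Example 5.2 to test the significance of each regression coefficient.
Solution
Testing for significance of $\beta_1$
Procedures
Test $H_0: \beta_1 = 0$ against $H_1: \beta_1 \neq 0$ at $\alpha = 5\%$
The appropriate test statistic is:

$$T = \frac{\hat{\beta}_1 - \beta_1^*}{se(\hat{\beta}_1)} = \frac{8.5 - 0}{4.21} = 2.019$$

Critical value $= \pm t_{\alpha/2,(n-2)} = \pm t_{0.025,3} = \pm 3.182$

Since the value of the test statistic (TS) falls within the non-rejection region (NRR), the null
hypothesis should not be rejected at the 5% level. Therefore the intercept is not significant.

Testing for significance of $\beta_2$
Test $H_0: \beta_2 = 0$ against $H_1: \beta_2 \neq 0$ at $\alpha = 5\%$, using $se(\hat{\beta}_2) = 0.254$ from
Example 5.3:

$$T = \frac{2.34 - 0}{0.254} = 9.21$$

Since 9.21 exceeds the critical value 3.182, the test statistic falls within the rejection region, the
null hypothesis is rejected at the 5% level, and the slope is significant.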
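The same decision rule can be applied to both coefficients in a few lines. The following is a sketch under the assumption that the estimates and standard errors are those obtained from the sales and advertising data (intercept 8.5 with standard error about 4.208, slope 2.34 with standard error about 0.2538); the helper `t_test` is a hypothetical name introduced here.

```python
# Estimates and standard errors for the sales/advertising regression.
estimates = {"intercept": (8.5, 4.208), "slope": (2.34, 0.2538)}
T_CRIT = 3.182  # two-sided 5% critical value with n - 2 = 3 d.f. (from a t table)


def t_test(estimate, std_error, hypothesised=0.0, critical=T_CRIT):
    """Return the t statistic and whether H0 is rejected against a two-sided alternative."""
    t_stat = (estimate - hypothesised) / std_error
    return t_stat, abs(t_stat) > critical


results = {name: t_test(b, se) for name, (b, se) in estimates.items()}
for name, (t_stat, reject) in results.items():
    decision = "reject H0" if reject else "do not reject H0"
    print(f"{name}: t = {t_stat:.3f} -> {decision}")
```

Here the intercept is not significant at the 5% level while the slope is, so advertising expenditure has a statistically detectable linear effect on sales in this sample.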

5.8. Correlation Analysis


Correlation analysis is used to measure the strength and direction of the
linear relationship between two variables. The relationship between two variables can first be
examined with scatter plots. These are diagrams in which the data points (x, y)
are plotted on a Cartesian coordinate system, and they indicate how one variable varies
with another. They are also a useful means of getting a better understanding of the nature of
the data you have. Another way of measuring the strength of the relationship between two variables
is by using a numerical value called the correlation coefficient. There is a population correlation
coefficient and a sample correlation coefficient. The population correlation coefficient is denoted by
$\rho$ and its sample counterpart by $r$. The population correlation coefficient is
normally based on a probability distribution and in most cases its value is unknown, and hence
it is estimated or tested by using its sample counterpart $r$.

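5.8.1 Sample correlation coefficient ($r$)

The correlation coefficient, as we have pointed out above, is a numerical value which measures the
strength of the linear relationship between two random variables. If X and Y are two sample
variables, their sample correlation coefficient, denoted by $r(X, Y)$ or $r$, is given by:

$$r = \frac{\text{Sample Cov}(X, Y)}{\sqrt{Var(X)\,Var(Y)}} = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$$

As defined earlier:

$$S_{xx} = \sum X^2 - \frac{1}{n}\left(\sum X\right)^2$$

$$S_{xy} = \sum XY - \frac{1}{n}\sum X \sum Y$$

$$S_{yy} = \sum Y^2 - \frac{1}{n}\left(\sum Y\right)^2$$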

5.8.2 Properties of correlation coefficient

Like the covariance, $r$ can be either positive or negative.

The correlation coefficient always lies between −1 and +1. Symbolically

$$-1 \leq r \leq 1$$

If the correlation coefficient is +1, the two variables are perfectly positively correlated,
whereas if the correlation coefficient is −1, they are perfectly negatively correlated. If $r = 0$,
there is no linear relationship at all. If $0.8 \leq |r| < 1$ this indicates a very strong linear
relationship, and the relationship is strong when $0.6 \leq |r| < 0.8$. Furthermore, if $|r| \leq 0.3$ it
indicates a weak linear relationship.

5.9 Coefficient of determination ($r^2$)

In the regression context, the coefficient of determination is the square of the correlation coefficient
(i.e. $r^2$). It is usually expressed in % and it can be interpreted as the percentage of the variation in
the dependent variable explained by the independent variable(s). It therefore provides an overall
measure of the extent to which the variation in one variable determines the variation in the other.
This measure can also be used to indicate how good the fitted regression line is.

5.9.1. Properties of coefficient of determination
1) The value is never negative
2) Its value ranges between zero and one, i.e. $0 \leq r^2 \leq 1$. An $r^2$ of 1 means the entire variation
in Y is explained by the regression. An $r^2$ of zero means no linear relationship between Y and X
Example 5.4
Use the information given in Example 5.2 to compute both $r$ and $r^2$, and interpret the results:
Solution
From

$$r = \frac{\text{Sample Cov}(X, Y)}{\sqrt{Var(X)\,Var(Y)}} = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{585}{\sqrt{250 \times 1417.2}} = 0.9828$$

Interpretation: The value indicates that there is a very strong positive linear relationship between
the two variables.

$$r^2 = (0.9828)^2 \approx 96.6\%$$

Interpretation: The value indicates that 96.6% of the variation in the dependent variable (Y) is
explained by the independent variable (X); the remaining 3.4% is explained by other factors
not currently included in the regression model.
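As a quick check of these two numbers, the correlation coefficient and the coefficient of determination can be computed directly from the raw sales and advertising data. This is a minimal sketch; the function name `correlation` is an illustrative choice.

```python
import math

# Advertising spend (X) and sales (Y) from Example 5.2.
X = [5, 10, 15, 20, 25]
Y = [22, 32, 38, 59, 67]


def correlation(x, y):
    """Sample correlation coefficient: S_xy / sqrt(S_xx * S_yy)."""
    n = len(x)
    s_xx = sum(v * v for v in x) - sum(x) ** 2 / n
    s_yy = sum(v * v for v in y) - sum(y) ** 2 / n
    s_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    return s_xy / math.sqrt(s_xx * s_yy)


r = correlation(X, Y)
r_squared = r ** 2
print(f"r = {r:.4f}, r^2 = {r_squared:.4f} ({r_squared:.1%} of variation explained)")
```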

5.10 Hypothesis Testing For Correlation Coefficient ($\rho$)

The following steps will be used.
1) The first step is to state the null hypothesis ($H_0$) against one of the three alternative
hypotheses ($H_1$). That is,
$H_0: \rho = 0$ against either $H_1: \rho \neq 0$ or $H_1: \rho > 0$ or $H_1: \rho < 0$
2) The appropriate test statistic is T such that

$$T = \frac{r - \rho}{se(r)}, \quad se(r) = \sqrt{\frac{1 - r^2}{n - 2}}$$

and if $H_0: \rho = 0$ the above test statistic is identical to

$$T = r\sqrt{\frac{n - 2}{1 - r^2}}$$

3) The critical value depends on the nature of the alternative hypothesis. That is, if
$H_1: \rho < 0$ then the critical value is $-t_{\alpha,(n-2)}$
$H_1: \rho > 0$ then the critical value is $t_{\alpha,(n-2)}$
$H_1: \rho \neq 0$ then the critical values are $\pm t_{\alpha/2,(n-2)}$

4) The last step is to make both the statistical and the managerial decision

where $n$ is the total number of observations and $r$ is the sample correlation coefficient.
Example 5.5
Use the results obtained in Example 5.4 to test $H_0: \rho = 0$ against $H_1: \rho \neq 0$ at the 5% level of
significance
Solution
Procedures:

i. Test $H_0: \rho = 0$ against $H_1: \rho \neq 0$ at the 5% level

ii. The appropriate test statistic is:

$$T = r\sqrt{\frac{n - 2}{1 - r^2}} = 0.9828\sqrt{\frac{3}{1 - 0.9659}} = 9.22$$

iii. Critical value $= \pm t_{\alpha/2,(n-2)} = \pm t_{0.025,3} = \pm 3.182$

iv. Conclusion: Since the value of the test statistic falls within the rejection region (RR), the null
hypothesis should be rejected at the 5% level. Therefore there exists a linear relationship between
the two variables.
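The test statistic for the correlation coefficient is easy to evaluate numerically. The sketch below assumes the unrounded sample correlation from the sales and advertising data and a table value of 3.182 for the two-sided 5% critical point with 3 d.f.; the function name is illustrative.

```python
import math


def correlation_t_stat(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


r = 585 / math.sqrt(250 * 1417.2)   # sample correlation, about 0.9828
n = 5
t_stat = correlation_t_stat(r, n)
t_crit = 3.182                       # from a t table: 0.025 upper tail, 3 d.f.

print(f"t = {t_stat:.3f}, reject H0: {abs(t_stat) > t_crit}")
```

In simple regression this statistic coincides with the t statistic for the slope coefficient, so the two tests always lead to the same decision.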

5.11 Hypothesis Testing For Significance Of Regression Model ($F$ Test)

Testing for significance of the regression model is normally carried out by using the F-statistic. The
following steps are to be followed.

1) The first step is to state the null hypothesis ($H_0$) against the alternative
hypothesis ($H_1$). Because the F test is inherently two-sided, only the two-sided alternative applies.
That is,
$H_0: \beta_2 = 0$ against $H_1: \beta_2 \neq 0$
2) The appropriate test statistic is $F$ such that

$$F = \frac{r^2 (n - 2)}{1 - r^2}$$

3) The critical value to be compared with the above test statistic is always $F_{\alpha,(1, n-2)}$

4) The last step is to make the decision

Example 5.6
Use the results obtained in Example 5.4 to test for the significance of the model at the 5% level of
significance:
Solution
Procedures:
Test $H_0: \beta_2 = 0$ against $H_1: \beta_2 \neq 0$ at the 5% level
The appropriate test statistic is:

$$F = \frac{r^2 (n - 2)}{1 - r^2} = \frac{0.966 \times 3}{1 - 0.966} = 85.24$$

Critical value $= F_{\alpha,(1, n-2)} = F_{0.05,(1,3)} = 10.1$

Since the value of the test statistic falls within the RR, the null hypothesis should be rejected at the
5% level, hence the regression model is significant.
5.12 Analysis of Variance (ANOVA)
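Analysis of variance can also be employed for testing the significance of the linear regression
model. This is done by first computing several statistics and then creating an ANOVA table
whose general structure is as indicated below.

SV          DF      SS    MS    F-value
Estimated   k − 1   ESS   MSE   F
Residual    n − k   RSS   MSR
Total       n − 1   TSS

Where
SV = Source of variation
DF = Degrees of freedom
SS = Sum of squares
MS = Mean squares
k = number of variables (k = 2 for simple linear regression)
The above statistics are normally obtained from the following identities

$$TSS = ESS + RSS$$

Then,

$$1 = \frac{ESS}{TSS} + \frac{RSS}{TSS}$$

But,

$$r^2 = \frac{ESS}{TSS}$$

This implies that

$$1 = r^2 + \frac{RSS}{TSS}$$

where $TSS = S_{yy} = \sum Y^2 - \frac{1}{n}\left(\sum Y\right)^2$ and
$RSS = \sum e_i^2 = \sum (Y_i - \hat{Y}_i)^2 = S_{yy} - \hat{\beta}_2 S_{xy}$.
The other statistics are computed as indicated below:

$$MSE = \frac{ESS}{k - 1}, \quad MSR = \frac{RSS}{n - k}, \quad F = \frac{MSE}{MSR}$$

Example 5.7
Use the results from Example 5.2 to fill in the following ANOVA table, and use it to test for
the significance of the regression model.

SV          DF   SS   MS   F-value
Estimated   A    D    G    I
Residual    B    E    H
Total       C    F

Solution:
A = k − 1 = 2 − 1 = 1
B = n − k = 5 − 2 = 3
C = n − 1 = 5 − 1 = 4
F = TSS = $S_{yy}$ = 1417.2
E = RSS = $S_{yy} - \hat{\beta}_2 S_{xy}$ = 1417.2 − 2.34 × 585 = 48.3
D = ESS = TSS − RSS = 1417.2 − 48.3 = 1368.9
G = MSE = ESS / (k − 1) = 1368.9 / 1 = 1368.9
H = MSR = RSS / (n − k) = 48.3 / 3 = 16.1
I = F-value = MSE / MSR = 1368.9 / 16.1 = 85.02
Therefore the complete ANOVA table is as shown below:

SV          DF   SS       MS       F-value
Estimated   1    1368.9   1368.9   85.02
Residual    3    48.3     16.1
Total       4    1417.2

Since the F-value of 85.02 exceeds the critical value $F_{0.05,(1,3)} = 10.1$, the null hypothesis is
rejected at the 5% level and the regression model is significant.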
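The ANOVA quantities for the sales and advertising regression can be assembled programmatically. The following is a minimal sketch; the function `anova_simple` and the dictionary keys are names assumed here for illustration, and the labels follow the convention of these notes (MSE for the mean square of the estimated part, MSR for the residual mean square).

```python
def anova_simple(s_yy, s_xy, slope, n, k=2):
    """Build the ANOVA quantities for a simple linear regression."""
    tss = s_yy                       # total sum of squares
    rss = s_yy - slope * s_xy        # residual sum of squares
    ess = tss - rss                  # explained (estimated) sum of squares
    mse = ess / (k - 1)              # mean square, estimated part
    msr = rss / (n - k)              # mean square, residual part
    return {
        "df": (k - 1, n - k, n - 1),
        "ESS": ess, "RSS": rss, "TSS": tss,
        "MSE": mse, "MSR": msr,
        "F": mse / msr,
        "r_squared": ess / tss,
    }


table = anova_simple(s_yy=1417.2, s_xy=585.0, slope=2.34, n=5)
f_crit = 10.1  # 5% critical value of F with (1, 3) d.f., from an F table

print(f"ESS = {table['ESS']:.1f}, RSS = {table['RSS']:.1f}, TSS = {table['TSS']:.1f}")
print(f"F = {table['F']:.2f}, model significant: {table['F'] > f_crit}")
```

The F-value obtained this way (about 85.02) differs slightly from the value computed from the rounded coefficient of determination 0.966 (about 85.24); the difference is purely a rounding effect.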

5.13. Computer output and interpretation of the results.


Numerous statistical packages can be used to carry out
regression analysis. These include, among others, EXCEL, EVIEWS, SAS,
STATA, MATLAB and R.

Example 5.8
The following pairs of values (X, Y) represent the amount of sales (Y) of a certain product in
millions of Tshs, and the amount spent on advertising the same product (X):

X    5    10   15   20   25
Y    22   32   38   59   67

Use Excel to find the regression output and interpret all the statistics obtained.

Note: The computer output for the regression analysis of the relationship between sales (Y) of a
certain product and the amount spent on advertising the same product (X) for 5 days is
indicated below:

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.982811637
R Square            0.965918713
Adjusted R Square   0.954558284
Standard Error      4.01248053
Observations        5

ANOVA

SV           Df   SS       MS       F             Significance F
Regression   1    1368.9   1368.9   85.02484472   0.002698129
Residual     3    48.3     16.1
Total        4    1417.2

COEFFICIENTS

               Coefficients   Standard Error   t Stat        P-value       Lower 95%     Upper 95%
Intercept      8.5            4.208325083      2.01980594    0.136681808   -4.89276860   21.89276861
X Variable 1   2.34           0.253771551      9.220891753   0.002698129   1.532385666   3.147614334

Interpretation of results

Solution
The output shown above consists mainly of three tables, namely the summary output, ANOVA,
and coefficients. We will interpret only some of the information in these three tables.

$r = 0.9828$ indicates a very strong positive relationship between the dependent variable and the
independent variable.
$r^2 = 0.966$ shows that 96.6% of the variation in the amount of sales (Y) is explained by the
amount spent on advertising (X).

The ANOVA table contains several statistics: d.f, SS, MS, the F-statistic
and the significance F. The significance F is used to draw a conclusion about the significance
of the model. From an econometric point of view, the model is said to be significant if the
significance F value is less than or equal to 10% (Significance F ≤ 10%). Here the significance F is
0.0027, so the model is significant.

The coefficients table also contains several pieces of information. In summary, the coefficients
column gives the values of the regression coefficients ($\hat{\beta}_1$ and $\hat{\beta}_2$), and the standard
errors of the regression coefficients are shown next. The t-statistics for testing the significance of
the individual parameters are given in the column labelled t Stat.

$\hat{\beta}_1 = 8.5$ indicates the average value of sales (Y) when the amount spent on advertising (X) is
zero.
$\hat{\beta}_2 = 2.34$ indicates the change in the mean value of sales (Y) when the amount spent on
advertising (X) increases by 1 unit.

The P-value, like the significance F, is used to draw a conclusion about the significance of an
individual parameter. From an econometric point of view, a parameter is said to be significant if its
P-value is less than or equal to 10%. In our case $\beta_1$ is not significant (P-value 0.137) while
$\beta_2$ is significant (P-value 0.003).

The confidence interval estimates are given in the last columns. In this example the 95%
confidence interval estimate for $\beta_1$ is (−4.8928; 21.8928) while for $\beta_2$ it is (1.5324; 3.1476). The
fitted regression model is as shown below

$$\hat{Y} = 8.5 + 2.34X$$

5.13 Multiple Linear Regression Analysis


In the previous sections we dealt with a single independent variable and how it is
related to the dependent variable. However, in most cases a dependent variable can be influenced
by more than one independent variable. Such a regression model is termed a multiple regression
model, and it is the subject of the current section. Generally, the multiple population regression
function in its deterministic (non-stochastic) form is:

$$E(Y_i) = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \cdots + \beta_k X_{ki}, \quad i = 1, \cdots, N$$

and in the stochastic form it is:

$$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \cdots + \beta_k X_{ki} + u_i, \quad i = 1, \cdots, N$$

Where
$Y$ = dependent variable
$X_2, \cdots, X_k$ = explanatory/independent variables
$u$ = stochastic error term
$i$ = ith observation
As before, $\beta_1$ is the intercept term and it represents the average value of Y when all the
explanatory variables are set equal to zero. The coefficients $\beta_2, \beta_3, \cdots, \beta_k$ are called partial
regression coefficients, whose meaning is explained below.
$\beta_2$ measures the change in the mean value of $Y$, $E(Y)$, per unit change in $X_2$ when the rest of the
variables are held constant. Likewise $\beta_3$ measures the change in the mean value of $Y$ per unit
change in $X_3$ when the rest of the variables are held constant. This is the unique feature of multiple
regression, which distinguishes it from simple linear regression.

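As usual, the population regression coefficients are unknown, therefore we normally
use their sample counterparts to estimate their values. The sample regression function
in its deterministic form is:

$$\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_{2i} + \hat{\beta}_3 X_{3i} + \cdots + \hat{\beta}_k X_{ki}$$

and in its stochastic form:

$$Y_i = \hat{\beta}_1 + \hat{\beta}_2 X_{2i} + \hat{\beta}_3 X_{3i} + \cdots + \hat{\beta}_k X_{ki} + e_i$$

Where

$\hat{Y}_i$ = the estimator of $E(Y_i)$
$\hat{\beta}_1$ = the estimator of $\beta_1$
$\hat{\beta}_2$ = the estimator of $\beta_2$
$\hat{\beta}_3$ = the estimator of $\beta_3$
$\hat{\beta}_k$ = the estimator of $\beta_k$
$e_i$ = the estimator of $u_i$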
5.13.1 Assumptions of Multiple Linear Regression models
1. The dependent variable is linearly related to the coefficients of the regression model and the
model is correctly specified. That is, the regression coefficients should enter raised to the power
of 1 only (i.e. neither powers $\beta_j^m$ with $m = 2, 3, \cdots$ nor other nonlinear functions of the
coefficients are allowed)
2. The independent variable(s) is/are uncorrelated with the random error term. That is,
mathematically

$$Cov(X_i, u_i) = 0$$

3. The mean of the error term is zero. That is, mathematically

$$E(u_i) = 0$$

4. The error term has a constant variance (homoscedastic error). No heteroscedasticity. That
is

$$Var(u_i) = \sigma^2$$

5. The error terms are uncorrelated with each other. That is, no autocorrelation or serial
correlation. Algebraically this assumption can be written as

$$Cov(u_i, u_j) = 0, \quad i \neq j$$

6. No perfect multicollinearity. No independent variable has a perfect linear relationship with
any of the other independent variables. The independent variables may be correlated with each
other, but none of them may be an exact linear combination of the others.
7. The error term is normally distributed. That is, algebraically

$$u_i \sim N(0, \sigma^2)$$

8. Independent variables are non-stochastic, that is, their values are fixed in repeated sampling

Note that the distributional assumptions stated above for multiple regression analysis allow us
to make inferences about the model parameters.

5.14 OLS estimators of Multiple Linear Regression model


The partial regression coefficients of the multiple linear regression model can be obtained by the
OLS method, as in the case of the simple linear regression model. The method again relies on
differentiation. The mathematical details are not reproduced here; they can be found in
Gujarati et al. (1988). For example, in the case of the three-variable
regression model the coefficients are obtained by solving the following system of three linear
equations:

$$\sum Y_i = n\hat{\beta}_1 + \hat{\beta}_2 \sum X_{2i} + \hat{\beta}_3 \sum X_{3i}$$

$$\sum X_{2i} Y_i = \hat{\beta}_1 \sum X_{2i} + \hat{\beta}_2 \sum X_{2i}^2 + \hat{\beta}_3 \sum X_{2i} X_{3i}$$

$$\sum X_{3i} Y_i = \hat{\beta}_1 \sum X_{3i} + \hat{\beta}_2 \sum X_{2i} X_{3i} + \hat{\beta}_3 \sum X_{3i}^2$$

For the case of three variables and three equations, as indicated above, simple
algebraic manipulation is enough to solve the system. However, for more than three
variables the algebra becomes complex, so we normally use matrices or vectors
to solve such systems of linear equations. Because of this complexity, the OLS estimators
and other statistics are usually obtained using computer software such as EXCEL, STATA, SAS,
EVIEWS, R, SPSS and so on.
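To make the three-equation system concrete, the sketch below sets up the normal equations for a model with two explanatory variables and solves them with Cramer's rule in plain Python. The data are the sales, income and inflation figures of Example 5.10 later in this chapter; the function names `det3` and `solve_normal_equations` are hypothetical, and Cramer's rule is used here only because the system is small (statistical software relies on matrix methods instead).

```python
# Data from Example 5.10: sales (Y), customer income (X2), inflation rate (X3).
sales = [25, 22, 26, 25, 30, 33, 30, 37, 40, 42]
income = [4, 3, 5, 4.5, 5, 7, 6.5, 8, 10, 12]
inflation = [0.3, 0.4, 0.3, 0.25, 0.2, 0.2, 0.3, 0.06, 0.03, 0.02]


def dot(u, v):
    """Sum of element-wise products of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve_normal_equations(y, x2, x3):
    """Return [b1, b2, b3] solving the three OLS normal equations."""
    n = len(y)
    lhs = [
        [n, sum(x2), sum(x3)],
        [sum(x2), dot(x2, x2), dot(x2, x3)],
        [sum(x3), dot(x2, x3), dot(x3, x3)],
    ]
    rhs = [sum(y), dot(x2, y), dot(x3, y)]
    base = det3(lhs)
    solution = []
    for col in range(3):
        # Cramer's rule: replace one column with the right-hand side.
        replaced = [row[:] for row in lhs]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / base)
    return solution


b1, b2, b3 = solve_normal_equations(sales, income, inflation)
print(f"intercept = {b1:.4f}, income = {b2:.4f}, inflation = {b3:.4f}")
```

The coefficients obtained this way agree with the Excel output reported for the same data (about 25.73, 1.478 and −21.06).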

5.15 Estimation for partial regression coefficients


In a similar way as for simple regression analysis, the standardized estimator of each partial
regression coefficient follows the t-distribution with $(n - k)$ degrees of freedom. Therefore the
formula for the $(1 - \alpha)100\%$ confidence interval estimate for a parameter $\beta_i$ is:

$$\beta_i = \hat{\beta}_i \pm t_{\alpha/2,(n-k)} \, se(\hat{\beta}_i)$$

where $n$ denotes the total number of observations and $k$ represents the number of variables
employed in a specific multiple regression model.

5.16 Hypothesis testing in Multiple Linear Regression Analysis

The following section discusses hypothesis testing for regression coefficients in multiple linear
regression analysis. As in the case of simple linear regression, these tests can only be carried out if
it can be assumed that the random error terms, $u_i$, are normally and independently distributed
with a mean of zero and variance of $\sigma^2$. Three types of hypothesis tests can be carried out for
multiple linear regression models:

1. $t$ test: This test checks the significance of individual regression coefficients.
2. $F$ test: This test can be used to simultaneously check the significance of two or more partial
regression coefficients. It can also be used to test individual coefficients.
3. Test for significance of the entire regression model.

5.16.1 Testing Individual Partial Regression Coefficients


As usual, we follow the same steps for testing the hypotheses as in the case of simple
linear regression analysis. That is:

1) The first step is to state the null hypothesis against one of the three alternative
hypotheses. That is,
$H_0: \beta_i = \beta_i^*$ against either $H_1: \beta_i \neq \beta_i^*$ or $H_1: \beta_i > \beta_i^*$ or $H_1: \beta_i < \beta_i^*$, for $i = 1, 2, \cdots, k$
2) The appropriate test statistic is $T$ such that

$$T = \frac{\hat{\beta}_i - \beta_i^*}{se(\hat{\beta}_i)} \sim t_{(n-k)}$$

3) The critical value depends on the nature of the alternative hypothesis. That is, if
$H_1: \beta_i < \beta_i^*$ then the critical value is $-t_{\alpha,(n-k)}$
$H_1: \beta_i > \beta_i^*$ then the critical value is $t_{\alpha,(n-k)}$
$H_1: \beta_i \neq \beta_i^*$ then the critical values are $\pm t_{\alpha/2,(n-k)}$

4) The last step is to make both the statistical and the managerial decision

where $n$ is the total number of observations and $k$ represents the total number of variables used in
the regression model. Note that the decision can also be made using the P-value approach. P-values
can be computed from the corresponding test statistics or read directly from the
computer output. Generally, if the P-value is less than the given level of significance $\alpha$, the null
hypothesis is rejected; otherwise we fail to reject the null hypothesis.

5.16.2 Testing for two or more Partial Coefficients


There is no general structure for the null hypothesis, but one expects to see at least two restrictions
on the parameters under the null hypothesis. For instance, we may be required to test
$H_0: \beta_2 = 0, \beta_3 = 1$ against one of the alternative hypotheses. In this example we say there are
$m = 2$ restrictions under the null hypothesis.

In order to carry out this testing procedure, two different regressions should be run,
one before imposing the restrictions and the other after imposing them. From each of these analyses
the residual sum of squares (RSS) is recorded. We therefore get the residual sum of squares for the
unrestricted model, labelled $RSS_{UR}$, and the one for the restricted model, labelled $RSS_R$. The
appropriate test statistic for this procedure is always F such that

$$F = \frac{RSS_R - RSS_{UR}}{RSS_{UR}} \times \frac{n - k}{m}$$

This statistic is compared to the critical value $F_{\alpha,(m, n-k)}$

where $k$ denotes the number of variables or number of estimated parameters in the original
unrestricted regression model, and $m$ represents the number of restrictions under the null hypothesis.

Example 5.9

Consider the following multiple linear regression model

$$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \beta_4 X_{4i} + \beta_5 X_{5i} + u_i$$

Suppose one wishes to test the null hypothesis $H_0: \beta_4 = \beta_5 = 0$. If this hypothesis is true then
the restricted model is:

$$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$$

To test the null hypothesis at the 5% level of significance, two regressions were carried out using
time series data on 25 observations, and the resulting residual sums of squares were
$RSS_{UR} = 6722.04$ and $RSS_R = 7088.58$

Solution

Data Given

$RSS_R = 7088.58$ (Residual sum of squares for the restricted model)

$RSS_{UR} = 6722.04$ (Residual sum of squares for the unrestricted model)

$n = 25$ (Total number of observations)

$k = 5$ (The number of variables or parameters in the unrestricted model)

$m = 2$ (The number of restrictions under the null hypothesis)

Procedures

• Test $H_0: \beta_4 = \beta_5 = 0$ against $H_1$: at least one of $\beta_4$, $\beta_5$ is not zero
• The appropriate test statistic is F:

$$F = \frac{RSS_R - RSS_{UR}}{RSS_{UR}} \times \frac{n - k}{m} = \frac{7088.58 - 6722.04}{6722.04} \times \frac{25 - 5}{2} = \frac{366.54}{6722.04} \times \frac{20}{2} = 0.55$$

• Critical value $= F_{\alpha,(m, n-k)} = F_{0.05,(2,20)} = 3.49$

Conclusion: Since the value of the test statistic falls within the Non Rejection Region (NRR), the null
hypothesis should not be rejected and therefore the given restriction is valid (i.e. the variables
$X_4$ and $X_5$ make no contribution to the original regression model).
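The restricted versus unrestricted comparison reduces to a one-line formula, sketched below with the residual sums of squares given in Example 5.9 (7088.58 for the restricted model and 6722.04 for the unrestricted one, with 25 observations, 5 parameters and 2 restrictions). The function name `restricted_f` is an illustrative assumption, and the critical value is taken from an F table.

```python
def restricted_f(rss_restricted, rss_unrestricted, n, k, m):
    """F statistic for testing m linear restrictions.

    n: number of observations
    k: number of parameters in the unrestricted model
    m: number of restrictions under the null hypothesis
    """
    numerator = (rss_restricted - rss_unrestricted) / m
    denominator = rss_unrestricted / (n - k)
    return numerator / denominator


f_stat = restricted_f(7088.58, 6722.04, n=25, k=5, m=2)
f_crit = 3.49  # 5% critical value of F with (2, 20) d.f., from an F table
reject = f_stat > f_crit

print(f"F = {f_stat:.3f}, critical value = {f_crit}, reject H0: {reject}")
```

With n − k = 25 − 5 = 20 degrees of freedom in the denominator, the statistic is about 0.55, well below the critical value, so the restrictions are not rejected.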

5.16.3 Testing for the significance of the model


This is also known as the F test for the significance of the model. We are testing whether or not the
entire model is significant, sensible or reliable. The general structure of the hypothesis is as
indicated below:

$$H_0: \beta_2 = \beta_3 = \cdots = \beta_k = 0$$

The null hypothesis above is a joint hypothesis that $\beta_2$, $\beta_3$ up to $\beta_k$ are jointly or
simultaneously equal to zero. In other words, this hypothesis states that all the explanatory variables
together have no influence on the response variable Y. Put differently, it means that the explanatory
variables explain zero percent of the variation in the dependent variable. This is the same as saying
that

$$H_0: R^2 = 0$$

Therefore the two sets of hypotheses are equivalent; one implies the other.

The appropriate test statistic is F such that

$$F = \frac{R^2}{1 - R^2} \times \frac{n - k}{k - 1}$$

where $R^2$ is the coefficient of multiple determination. The F value is also obtained from the ANOVA
table.

The above test statistic is compared with the critical F value:

$$F_{\alpha,(k-1, n-k)}$$

The decision can also be made using the significance F obtained from the computer output. If the
significance F is less than the given level of significance $\alpha$, the null hypothesis is rejected,
otherwise it is not rejected.
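As an illustration of the overall F test, the sketch below computes the statistic from the coefficient of multiple determination. The inputs (R² = 0.975477, n = 10 observations, k = 3 parameters) are taken from the regression output of Example 5.10 in this chapter; the function name `overall_f` is a hypothetical choice and the critical value is read from an F table.

```python
def overall_f(r_squared, n, k):
    """F statistic for H0: all slope coefficients are zero.

    n: number of observations
    k: number of estimated parameters, including the intercept
    """
    return (r_squared / (k - 1)) / ((1 - r_squared) / (n - k))


f_stat = overall_f(0.975477, n=10, k=3)
f_crit = 4.74  # 5% critical value of F with (2, 7) d.f., from an F table

print(f"F = {f_stat:.2f}, model significant at 5%: {f_stat > f_crit}")
```

The result is about 139.2, matching the F value reported in the ANOVA table of the output up to rounding of R².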

Example 5.10

The data on sales (Y) in thousands of dollars and two independent variables, customers' annual
income ($X_2$) in thousands of dollars and the average annual local currency inflation rate ($X_3$), are
shown below:

Year   Sales   Income   Inflation
1      25      4        0.3
2      22      3        0.4
3      26      5        0.3
4      25      4.5      0.25
5      30      5        0.2
6      33      7        0.2
7      30      6.5      0.3
8      37      8        0.06
9      40      10       0.03
10     42      12       0.02

a) Use Excel to produce the regression output
b) Fit the multiple linear regression model and interpret the regression coefficients
c) Comment on the values of $R$ and $R^2$
d) Construct a 95% confidence interval for the partial coefficient of the inflation rate
e) Construct a 95% confidence interval for the partial coefficient of customer annual income
f) Test at the 5% level the contribution of the average inflation rate to the model
g) Test for the significance of the model

Solution

104
a) Hint: Computer OUTPUT is as indicated below:

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.987663
R Square             0.975477
Adjusted R Square    0.968471
Standard Error       1.215878
Observations         10

ANOVA
              DF    SS          MS          F           Significance F
Regression    2     411.6515    205.8257    139.2258    2.31E-06
Residual      7     10.34851    1.478359
Total         9     422

COEFFICIENTS
              Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%
Intercept     25.7302         3.578448          7.190324    0.000179    17.26851     34.19188
Income        1.478157        0.332305          4.448195    0.002978    0.692381     2.263934
Inflation     -21.0593        7.207764          -2.92176    0.022284    -38.103      -4.01567
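
For readers without Excel, the coefficients and sums of squares in this output can be reproduced from the data of Example 5.10. The following is a minimal sketch, not the procedure Excel itself uses: it solves the two normal equations in deviation form, which works only for a model with exactly two independent variables. The helper names are our own.

```python
# Data from Example 5.10: sales (Y), income (X1), inflation (X2).
sales = [25, 22, 26, 25, 30, 33, 30, 37, 40, 42]
income = [4, 3, 5, 4.5, 5, 7, 6.5, 8, 10, 12]
inflation = [0.3, 0.4, 0.3, 0.25, 0.2, 0.2, 0.3, 0.06, 0.03, 0.02]


def mean(values):
    return sum(values) / len(values)


def cross_dev(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


# Sums of squares and cross-products in deviation form.
s11 = cross_dev(income, income)
s22 = cross_dev(inflation, inflation)
s12 = cross_dev(income, inflation)
s1y = cross_dev(income, sales)
s2y = cross_dev(inflation, sales)

# Least-squares solution of the two normal equations (two predictors only).
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = mean(sales) - b1 * mean(income) - b2 * mean(inflation)

# Sums of squares for the ANOVA table.
sst = cross_dev(sales, sales)   # total
ssr = b1 * s1y + b2 * s2y       # regression
sse = sst - ssr                 # residual

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, b2 = {b2:.4f}")
print(f"SSR = {ssr:.4f}, SSE = {sse:.4f}, SST = {sst:.4f}")
```

With more than two independent variables, a matrix-based least-squares routine would be the usual choice.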

b) The multiple regression model is:

$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$

where, from the computer output,

$b_0 = 25.730$

$b_1 = 1.478$

$b_2 = -21.059$

Therefore:

$\hat{Y} = 25.730 + 1.478 X_1 - 21.059 X_2$

Interpretation of coefficients

$b_0 = 25.730$ is the average value of sales when income and inflation are both equal to zero.

$b_1 = 1.478$ indicates that the average value of sales increases by 1.478 for each one-unit increase
in income when inflation is held constant.

$b_2 = -21.059$ indicates that the average value of sales decreases by 21.059 for each one-unit
increase in inflation when income is held constant.

c) The multiple correlation coefficient $R = 0.9877$ shows a very strong linear relationship
between the dependent variable and the independent variables. The coefficient of multiple
determination $R^2 = 97.5\%$ indicates that 97.5% of the variation in sales (the dependent
variable) is explained by income and inflation (the independent variables). The remaining
2.5% is attributable to other variables not currently included in the regression model.
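
The regression statistics discussed in part c) can be recovered from the ANOVA table. The sketch below assumes the sums of squares reported in the output above (SSR = 411.6515, SST = 422) with n = 10 observations and k = 3 estimated coefficients; the variable names are our own.

```python
import math

# Values read from the ANOVA table of Example 5.10.
ssr, sst = 411.6515, 422.0
n, k = 10, 3  # observations, estimated coefficients (intercept included)

r_squared = ssr / sst                 # coefficient of multiple determination
multiple_r = math.sqrt(r_squared)     # multiple correlation coefficient
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k)

print(f"R = {multiple_r:.6f}")
print(f"R^2 = {r_squared:.6f}")
print(f"Adjusted R^2 = {adjusted_r_squared:.6f}")
```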
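d) The 95% confidence interval for the partial regression coefficient of inflation is given by:

$\beta_2 = b_2 \pm t_{\alpha/2,\,(n-k)} \times SE(b_2)$

$= -21.059 \pm t_{0.025,\,(10-3)} \times 7.20776$
$= -21.059 \pm t_{0.025,\,7} \times 7.20776$
$= -21.059 \pm 2.365 \times 7.20776$
$= -21.059 \pm 17.046$
$= (-38.105,\ -4.013)$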

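e) The 95% confidence interval for the partial regression coefficient of income is given by:

$\beta_1 = b_1 \pm t_{\alpha/2,\,(n-k)} \times SE(b_1)$

$= 1.478 \pm t_{0.025,\,(10-3)} \times 0.332$
$= 1.478 \pm t_{0.025,\,7} \times 0.332$
$= 1.478 \pm 2.365 \times 0.332$
$= 1.478 \pm 0.78518$
$= (0.69282,\ 2.26318)$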

Note: If the computer output already reports the $(1 - \alpha)100\%$ confidence interval, there is no
need to compute it manually; simply read the limits from the output.
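
The manual interval calculation in parts d) and e) can be sketched in a few lines. This is an illustrative helper, not part of the Excel output: the function name is our own, and the critical value 2.365 for 7 degrees of freedom is taken from a t table, as in the working above.

```python
def confidence_interval(coef, std_error, t_critical):
    """Return (lower, upper) limits: coef +/- t_critical * std_error."""
    margin = t_critical * std_error
    return coef - margin, coef + margin


# Critical value t(0.025, 7) read from a t table (assumption: 95% level).
t_crit = 2.365

# Coefficients and standard errors from the Example 5.10 output.
ci_inflation = confidence_interval(-21.0593, 7.207764, t_crit)
ci_income = confidence_interval(1.478157, 0.332305, t_crit)

print("Inflation: ({:.3f}, {:.3f})".format(*ci_inflation))
print("Income:    ({:.3f}, {:.3f})".format(*ci_income))
```

Small differences from the Lower 95% and Upper 95% columns arise because Excel uses an unrounded critical value.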

f) Test the significance of inflation in the regression model

Procedures

• Test $H_0: \beta_2 = 0$ against $H_1: \beta_2 \neq 0$
• Test statistic from the computer output: $t = -2.922$
• Using the P-value approach: since P-value = 0.022 < 0.05, reject the null hypothesis;
therefore inflation is statistically significant
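
The same conclusion can be reached with the critical value approach. The sketch below recomputes the t statistic from the coefficient and its standard error in the output, and compares it with the two-sided critical value 2.365 for 7 degrees of freedom, which is assumed to be read from a t table at the 5% level.

```python
# Inflation coefficient and its standard error from the Example 5.10 output.
coef, std_error = -21.0593, 7.207764

# t statistic for H0: beta_2 = 0.
t_stat = coef / std_error

# Assumption: two-sided test at alpha = 0.05, t(0.025, 7) from a t table.
t_critical = 2.365
reject_h0 = abs(t_stat) > t_critical

print(f"t = {t_stat:.3f}, critical t = {t_critical}, reject H0: {reject_h0}")
```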

g) Test the significance of the regression model

Procedures

• Test $H_0: R^2 = 0$ against $H_1: R^2 \neq 0$
• The appropriate test statistic, from the ANOVA table, is $F = 139.23$
• Using the P-value approach: since Significance F = 0.0000023 < 0.05, reject the null
hypothesis; therefore the regression model is significant
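
To see where the F value in the ANOVA table comes from, the mean squares can be rebuilt from the sums of squares and degrees of freedom. This is a sketch using the figures from the output of Example 5.10; the critical value 4.74 for (2, 7) degrees of freedom at the 5% level is assumed to be read from an F table.

```python
# Sums of squares and degrees of freedom from the ANOVA table.
ss_regression, df_regression = 411.6515, 2
ss_residual, df_residual = 10.34851, 7

# Mean squares and the F ratio.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_ratio = ms_regression / ms_residual

# Assumption: alpha = 0.05, critical value F(2, 7) from an F table.
f_crit = 4.74
model_significant = f_ratio > f_crit

print(f"MSR = {ms_regression:.4f}, MSE = {ms_residual:.6f}, F = {f_ratio:.4f}")
print(f"Model significant at 5%: {model_significant}")
```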

Review Questions

QUESTION ONE
To investigate the linear relationship between a certain variable Y and an explanatory variable X,
paired sample data on 100 observations were collected, and the following summary data were
obtained.

∑ = 302, ∑ = 1155, ∑ = 16245, ∑ = 4576, ∑ = 61299 and = 25


(a) Estimate the coefficients and hence obtain the simple linear regression line.
(b) What is the meaning of each coefficient of the regression line?
(c) Compute the residual sum of squares and hence estimate the regression variance ($\sigma^2$).
(d) Compute both ( ) and ( ).
(e) Construct 90% confidence interval estimate of .
(f) Test the significance of the variable X in the regression model.
(g) Test the significance of the model in general by using ANOVA method.

QUESTION TWO
The following is an incomplete computer output from the analysis of the multiple linear regression
model
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon$

SUMMARY OUTPUT

Regression Statistics
Multiple R      0.7810
R Square        0.6100
Observations    25

ANOVA
              df    SS       MS         F-value
Regression    A     D        7302.58    G
Residual      B     E        F
Total         C     35916

COEFFICIENTS
              Coefficients    Std Error    t Stat
Intercept     91.534          I            5.205
X1            H               1.585        2.661
X2            0.335           0.558        K
X3            -2.205          J            -0.654

(a) Complete both ANOVA and COEFFICIENTS tables.
(b) Fit the multiple linear regression model and interpret the coefficients.
(c) Comment on the strength of the multiple linear regression.
(d) Test the significance of the general multiple linear regression using ANOVA table.
(e) Construct 95% confidence interval estimate for
(f) Test the significance of the variable in the model.
(g) Test : = 5 against : < 5 at 5% level of significance.

QUESTION THREE
The profitability of commercial banks is claimed to be influenced by external factors such as
GDP growth rate (GDP), inflation (INFL) and interest rate (INTR). Data on these variables are
given below, where return on assets (ROA) is a proxy variable for profitability.

ROA GDP INFL INTR


4.57% 8.50% 7.00% 9.00%
4.87% 5.60% 10.30% 4.70%
4.01% 5.40% 12.10% 2.90%
2.46% 6.40% 7.40% 7.20%
4.67% 7.90% 12.60% 2.30%
4.13% 5.10% 16.10% -0.60%
4.86% 7.30% 7.90% 7.90%
5.72% 7.00% 6.10% 10.20%
5.03% 7.00% 5.60% 10.50%
2.64% 7.00% 5.20% 10.80%
5.01% 8.50% 7.00% 9.00%
3.41% 5.60% 10.30% 4.70%
3.26% 5.40% 12.10% 2.90%
-1.52% 6.40% 7.40% 7.20%
-11.79% 7.90% 12.60% 2.30%
1.01% 5.10% 16.10% -0.60%
0.69% 7.30% 7.90% 7.90%
1.72% 7.00% 6.10% 10.20%
-1.80% 7.00% 5.60% 10.50%
0.62% 7.00% 5.20% 10.80%
2.95% 8.50% 7.00% 9.00%
3.10% 5.60% 10.30% 4.70%
2.98% 5.40% 12.10% 2.90%
2.95% 6.40% 7.40% 7.20%
2.59% 7.90% 12.60% 2.30%
2.23% 5.10% 16.10% -0.60%
2.43% 7.30% 7.90% 7.90%
3.04% 7.00% 6.10% 10.20%
2.80% 7.00% 5.60% 10.50%
2.02% 7.00% 5.20% 10.80%
-4.14% 8.50% 7.00% 9.00%
-4.25% 5.60% 10.30% 4.70%
-5.69% 5.40% 12.10% 2.90%
1.47% 6.40% 7.40% 7.20%
1.81% 7.90% 12.60% 2.30%
0.05% 5.10% 16.10% -0.60%
-4.85% 7.30% 7.90% 7.90%
-1.24% 7.00% 6.10% 10.20%
-12.80% 7.00% 5.60% 10.50%

-17.57% 7.00% 5.20% 10.80%
-6.05% 8.50% 7.00% 9.00%
-6.05% 5.60% 10.30% 4.70%
-25.57% 5.40% 12.10% 2.90%
-39.07% 6.40% 7.40% 7.20%
-16.19% 7.90% 12.60% 2.30%
-17.25% 5.10% 16.10% -0.60%
-17.24% 7.30% 7.90% 7.90%
-27.35% 7.00% 6.10% 10.20%
-29.23% 7.00% 5.60% 10.50%
-14.98% 7.00% 5.20% 10.80%
2.29% 8.50% 7.00% 9.00%
-0.33% 5.60% 10.30% 4.70%
2.88% 5.40% 12.10% 2.90%
-1.54% 6.40% 7.40% 7.20%
2.51% 7.90% 12.60% 2.30%
-4.39% 5.10% 16.10% -0.60%
-0.82% 7.30% 7.90% 7.90%
-28.41% 7.00% 6.10% 10.20%
-18.01% 7.00% 5.60% 10.50%
-38.19% 7.00% 5.20% 10.80%
a) Use Excel to run regression analysis
b) Obtain the regression model of financial performance as explained by the given
independent variables.
c) Interpret the meaning of each regression coefficient
d) Comment on the value of multiple R and R square
e) Construct 95% confidence interval for each population regression coefficient
f) Test for the significance of each explanatory variable in the model by using both critical
value approach and P-value approach
g) Test for significance of the model

QUESTION FOUR

The table below provides some information on pairs of values (x, y):

x    8     12    15    20    25
y    25    35    40    59    67

a) Use Excel to plot a scatter diagram and estimate the line of best fit (label all necessary
information in the graph).
b) Obtain the regression line by using Least Squares Method.
c) Predict the value of $y$ when $x = 90$.
d) Construct 95% confidence interval estimate of for the data given.
e) Compute correlation coefficient and comment on the result.
f) Compute the coefficient of determination and comment on the results.
g) Test : = 0 against : ≠ 0 at 5% level of significance.
