DBA Maths

Uploaded by

victor chisenga

ZIBSIP - OPEN AND DISTANCE STUDY MATERIAL – ZICA TECHNICIAN

MINISTRY OF EDUCATION, SCIENCE & VOCATIONAL TRAINING

ZIBSIP
ZAMBIA INSTITUTE OF BUSINESS STUDIES AND INDUSTRIAL PRACTICE

OPEN
DISTANCE LEARNING
DIPLOMA
BUSINESS ADMINISTRATION

BUSINESS MATHEMATICS AND STATISTICS
DBA 140
0977779135

2016

COPYRIGHT
No part of this publication may be reproduced, stored in retrieval system, or transmitted
in any form, or by any means, electronic, electrostatic, mechanical, photocopied or
otherwise, without the express permission in writing from Zambia Institute of Business
Studies and Industrial Practice. (ZIBSIP)

TABLE OF CONTENTS

Introduction
SECTION A
( i ) Sources of data
( ii ) Primary and Secondary data
( iii ) Types of data
( iv ) Measurement scales
SECTION B
( i ) Presentation of data
( ii ) Measures of location
( iii ) Measures of spread or dispersion
( iv ) Weighted means
ASSIGNMENT 1
SECTION C
( i ) Probability
( ii ) Laws of probability
( iii ) Permutations and Combinations
ASSIGNMENT 2
( iv ) The Binomial Distribution
( v ) The Poisson Distribution
( vi ) The Normal Distribution
ASSIGNMENT 3
SECTION D
( i ) Hypothesis testing
( ii ) Testing based on a sample value
( iii ) Testing based on a sample mean
( iv ) Testing the difference between means
( v ) Testing a proportion
SECTION E
( i ) Correlation and Regression
( ii ) Pearson’s Coefficient of correlation
( iii ) Spearman’s rank correlation
SECTION F
( i ) Time Series
SECTION G
( i ) Index numbers
ASSIGNMENT 4
( ii ) Laspeyres Index
( iii ) Paasche Index
( iv ) Uses of Index numbers
REFERENCES

INTRODUCTION

The title of this course is Business Mathematics and Statistics. However, many other
terms are used in business and by Professional bodies to describe the same subject
matter. For example, Quantitative Techniques, Quantitative Methods for Business and
Management, Numerical Analysis or simply Quantitative Methods.

A particular problem for Management is that most decisions need to be taken in the light
of incomplete information. That is, not everything will be known about current business
processes and very little (if anything) will be known about future situations. The
techniques described in this course, “Quantitative Methods”, enable structures to be built
up which help management to alleviate this problem. The main areas included in this
course are (a) Statistical Methods, (b) Management Mathematics, (c) Probability and
(d) Decision-making techniques.

The use of calculators

The examining body, ZICA, permits the use of electronic calculators in examinations. It is
therefore essential that you equip yourself with a calculator from the beginning of the
course.

Essential facilities that the calculator should include are:

a) a square-root function,
b) an accumulating memory,
c) a power function (labelled ‘x^y’),
d) a logarithm function (labelled ‘log x’) and
e) an exponential function (labelled ‘e^x’).

The use of programmable calculators is prohibited. You are therefore urged to check on this
point before you purchase a calculator. Where relevant, this batch includes instructions
which describe techniques for using calculators to their best effect.

This batch covers the main theory required by the ZICA examining body. Special
attention has been given to topics which, in my experience, cause great difficulty, in
particular probability and significance testing. Throughout the text, I have aimed to
provide you with a mathematical structure and a logical framework within which to work.
Points of theory are precisely presented and illustrated by worked examples.

Assignments
You are required to submit the assignments set in this batch on the dates indicated.
Please note that failure to submit your assignments on time will result in loss of marks.


SECTION A

SOURCES OF DATA

LEARNING OBJECTIVES:

At the end of this section you should be able to:


1. Explain the main sources of data and types of data.
2. Distinguish between alternative measurement scales ( i.e. nominal,
ordinal, interval and ratio scales).
3. Compare and contrast alternative sampling methods and understand the
main features of surveys, questionnaire design and the concept of
sampling error and bias.

COLLECTING DATA

Before you can get information from statistical data you must first collect data.
Data can be defined as ‘a series of observations, measurements or facts’ and the
thing being observed is called the ‘variate’.

DEFINITIONS AND CONCEPTS

1. Population:
The data we use in statistics come from a population. A population is
defined as ‘the set or collection of objects we wish to study and for which
data is sought’. The population is generally large, and it is not always
possible to reach every member of it. Consequently, a population is often
sub-divided and only a subset of it is used.

2. Sample:
We can study a population by looking at all the members of the population
or by looking at a sub-set of the population to represent the population.
The sub-set is called a sample. The process of selecting a sample is
called sampling. There are a number of methods of collecting data:

TAKING A CENSUS

If every member of a population is observed or measured, it is called taking a
census. For example, if you wanted to find the mean height of First Year
students in a certain college, you could measure each student. You would then
be taking a census. The best known census is that conducted by the
government. The Zambian Government usually conducts a national census every
10 years, in which every citizen of the country is counted.

In statistical studies, a census is used if:

i) the size of the population is small, or
ii) extreme accuracy is required.

Advantage of taking a census

The advantage of using a census is that it should give a completely accurate
result. This may be necessary where the government is planning housing
requirements, school programmes, new hospitals and so on, or in industry where
a component is being tested for safety.

Disadvantages of a census
i) It is very time consuming and expensive.
ii) The information is difficult to process because there is so much of it.
iii) It cannot be used if the testing process destroys the item (for example,
when testing mangoes for sweetness, you cannot test all the mangoes).

TAKING A SAMPLE

A sample is a sub-set of a population. It is a collection of individual items or
members of the population.

Advantages of taking a sample

Briefly the advantages are:

i. it is generally cheaper
ii. it can be taken as a representative of the whole population, that is, it
will have the same distribution.

Disadvantage of a sample

The disadvantage of taking a sample is that there is some uncertainty as a
result of natural variation and bias.

PRIMARY AND SECONDARY SOURCES OF DATA

There are two main sources of data: primary and secondary data.


1. PRIMARY DATA

By primary data we mean data that is collected by the person who is going to
use the data. It is ‘first hand’ information. For example, if you collect your own
data for your project they will be primary data.
Primary data is also the name given to data that are used for the specific
purpose for which they were collected. They will contain no unknown
quantities in respect of method of collection, accuracy of measurements or
which members of the population were investigated. Sources of primary data
are either censuses or samples.

Primary data have the advantages that:

• the collection method is known
• the accuracy is known
• the exact data needed are collected

The disadvantage is that:

• it is very costly in time and effort

2. SECONDARY DATA

By secondary data we mean data that is not collected by the person who is to
use the data. They are second hand. For example, if you use data from
government census in your project, they are secondary data. Secondary data is
also the name given to data that are being used for some purpose other than that
for which they were originally collected.

Secondary data have the advantages that:

i. They are cheap to obtain: government publications are relatively
cheap.
ii. A large quantity of data is available.
iii. Much of the data have been collected for years and can be used to
plot trends.

The disadvantages are that:

i. The collection method may not be known.
ii. The accuracy may not be known.
iii. The data may not be ideal for the purpose to which they are to be put.

TYPES OF DATA

There are basically two types of data:

1. Quantitative data
2. Qualitative data.

QUANTITATIVE DATA

Quantitative data is data that involves numbers. In other words, quantitative data
have values that are intrinsically numerical. There are two types of quantitative
data:
a) Discrete data and
b) Continuous data.

Discrete data are data that can only take exact values. The values for discrete
data are usually integers. For example:

• the number of children in a family
• the shoe sizes of children in a class
• the number of cars passing a police check point in 30 minutes
• the number of registered voters in a certain constituency

Note that all the above examples can only take exact values. Discrete data are
data that can be measured precisely (e.g. by counting).

Continuous data are data that cannot take exact values. They have values that
are given in a certain range or measured to a certain degree of accuracy. Their
values can only be approximated. For example, a height h recorded as 144cm
(correct to the nearest cm) could have arisen from any value in the interval
143.5cm ≤ h < 144.5cm.

QUALITATIVE DATA

Qualitative data are data that do not involve numbers but categories. Thus
qualitative data are also known as categorical data. They have values that are
intrinsically non- numerical.

MEASUREMENT SCALES

Collecting data from a sample involves measuring or obtaining the values of one
or more variables from each study unit. A variable is a characteristic which
may vary from one study unit to another, whereas a constant takes the same
value for every study unit.

Variables are classified using different scales:

1. the nominal scale
2. the ordinal scale
3. the interval scale and
4. the ratio scale

A nominal scale classifies variables by name, e.g. tribe, gender, marital status,
nationality etc. An ordinal scale classifies variables by order, e.g. socio-economic
status, stage of disease, course grade etc. An interval scale measures variables
in equal steps, so that differences between values are meaningful, e.g.
temperature, IQ etc. A ratio scale also has a true zero, so that one value can be
expressed as a multiple of another, e.g. weight, distance, age, family size etc.

The above measuring scales are summarized in the table below:

SCALE         CHARACTERISTIC QUESTION               EXAMPLES
1. Nominal    Does A differ from B?                 Marital status, gender, tribe, nationality etc.
2. Ordinal    Is A bigger than B?                   Stage of disease, course grade, socio-economic status etc.
3. Interval   How much is A bigger than B?          Cost of items, temperature, IQ etc.
4. Ratio      How many times is A bigger than B?    Weight, distance, age, family size.

SECTION B

PRESENTATION OF DATA
LEARNING OBJECTIVES:

At the end of this section you should be able to:

1. Construct appropriate tables and charts, including frequency and
cumulative frequency distributions and their graphical representations.
2. Calculate and interpret measures of location, dispersion, relative
dispersion and skewness for ungrouped and grouped data.
3. Compute unweighted and weighted index numbers and understand their
applications.
4. Change the base period of an index number.

There are three main methods for summarizing data and these are:

1. Tabular
2. Graphical and
3. Numerical

Data need to be summarized:

a) when the number of data values is large, and
b) when it is impossible to see the overall picture by examining the individual
values.

This area of statistics, which deals with the summary of data, is often called
descriptive statistics.

FREQUENCY DISTRIBUTIONS

One way of summarizing data is in the form of a table called a frequency
distribution. Frequency distributions can be constructed for two types of data:
ungrouped data and grouped data.

FREQUENCY DISTRIBUTION FOR UNGROUPED DATA

This type of frequency distribution deals with discrete raw data. Discrete data are
data that can take exact values, usually obtained by counting. A
simple frequency distribution consists of a list of data values, each shown with the
number of times it occurs (called the frequency). Each value may
occur many times.

A frequency distribution for ungrouped data may appear like this:

AGE OF STUDENTS AT Z.I.B.S.I.P.

Age (years)    Number of students (frequency)
17             267
18             164
19             96
20             74
21 and over    23
TOTAL          624

We have constructed this table by listing the variable (age) vertically. But we can
also write it horizontally. “There is no golden rule”. In the horizontal form, the
table would look like this:

Age (years)   17    18    19   20   21 and over   TOTAL
Frequency     267   164   96   74   23            624

There is nothing wrong with our producing 15 or 20 tables like this one, each
concerned with one variable, but it is better for presentation purposes if we could
produce a small number of compound tables each showing several variables at
once. Thus, we could construct a double table showing the two variables, age
and sex of students at the same time. Such a table would look like this:

                Number of Students
Age (years)     Male   Female   TOTAL
17              151    116      267
18              98     66       164
19              70     26       96
20              52     22       74
21 and above    18     5        23
TOTAL           389    235      624

Notice that we have totalled both the rows and the columns and this
adds to our information. We not only have the age distribution of male and
female students but also the age distribution of the entire student population, and
the total number of male and female students.
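
As a rough illustration of how the row and column totals in the two-way table arise, the short Python sketch below recomputes them from the male and female counts. The variable names (`counts`, `row_totals` and so on) are illustrative only and are not part of the course material.

```python
# Male and female student counts by age, taken from the two-way table.
counts = {
    "17": (151, 116),
    "18": (98, 66),
    "19": (70, 26),
    "20": (52, 22),
    "21 and above": (18, 5),
}

# Row totals: male + female for each age.
row_totals = {age: male + female for age, (male, female) in counts.items()}

# Column totals: sum down each column.
male_total = sum(male for male, _ in counts.values())
female_total = sum(female for _, female in counts.values())
grand_total = male_total + female_total

for age, total in row_totals.items():
    print(age, total)
print("TOTAL", male_total, female_total, grand_total)
```

The grand total can be reached either by adding the row totals or by adding the column totals, which is a useful check on the arithmetic.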

Let us extend our table to consider three variables: age, sex and type of
accommodation. Obviously we must now further subdivide either the rows
or the columns. Let us suppose that we are aiming to show that the type
of accommodation a student occupies depends on his or her age. Our table may
now appear like this:

NUMBER OF STUDENTS

         Day Scholars      Boarders          Rented Flats
Age      Male   Female     Male   Female     Male   Female     TOTAL
17       112    92         16     20         23     4          267
18       64     42         24     16         10     8          164
19       31     12         28     7          11     7          96
20       8      4          16     10         28     8          74
21+      2      3          3      1          13     1          23
TOTAL    217    153        87     54         85     28
         370               141               113               624

You will really appreciate what a vast amount of information a table such as this
one can give us: the number of students who live at home, subdivided into male
and female and classified according to age, as well as the same information for
those who are living in college hostels or rent their own flat.
If you are in a position of having to construct tables to present the raw material
you have collected, there are several points you should bear in mind. We shall
call them “characteristics of good tables”.

CHARACTERISTICS OF GOOD TABLES

Good tables have the following characteristics:

1. Every table should have a short explanatory title at the head. At the end
you should put a note of the source of the information you have used, whether
it is based on your own survey or on secondary data. Only your own original
data may be used without citation.
2. The unit of measurement should be clearly stated and, if necessary,
defined in a footnote. Not many people, for example, would know offhand
what a metric tonne is. In addition, the heading of every column should be
clearly shown.
3. Use different rulings to break up a larger table: double lines or thicker lines
add a great deal to the ease with which a table is understood.
4. Whenever you feel it useful, insert both column and row totals.
5. If the volume of data is large, two or three single tables are better than
one cumbersome one.
6. Before you start to draft a table, be quite sure what you want it to show.
Remember that although most people read from left to right, most people
find it easier to absorb figures which are in columns rather than rows.

CONSTRUCTING A FREQUENCY TABLE (UNGROUPED DATA)

Example:
The following data record shows the numbers of children in 30 randomly chosen
families:

1 2 4 0 2 3 1 4 2 3
5 2 2 3 2 2 3 1 2 3
2 0 1 1 2 0 3 2 3 3

Construct a frequency distribution for this data.

Solution:

Number of children   Tally marks     Frequency
0                    |||             3
1                    ||||            5
2                    |||| |||| |     11
3                    |||| |||        8
4                    ||              2
5                    |               1
TOTAL                                30

(Each group |||| in the tally column stands for a bundle of five.)

The final answer can also be presented horizontally as follows:

Number of children   0   1   2    3   4   5   TOTAL
Number of families   3   5   11   8   2   1   30

The mode is the value that occurs most often. From the table it is easy to see
that the mode is 2 children per family.
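
The tallying in this example can also be checked by machine. The following is a minimal Python sketch, not part of the course material, that counts each value with the standard library's `collections.Counter` and picks out the mode; the names `children`, `frequency` and `mode` are illustrative.

```python
from collections import Counter

# Number of children in each of the 30 families, as listed in the example.
children = [
    1, 2, 4, 0, 2, 3, 1, 4, 2, 3,
    5, 2, 2, 3, 2, 2, 3, 1, 2, 3,
    2, 0, 1, 1, 2, 0, 3, 2, 3, 3,
]

# Counter tallies how many times each value occurs.
frequency = Counter(children)

# Print the frequency table in ascending order of the data value.
for value in sorted(frequency):
    print(value, frequency[value])

total = sum(frequency.values())

# The mode is the value with the largest frequency.
mode = max(frequency, key=frequency.get)
print("TOTAL", total)
print("mode", mode)
```

Checking that the frequencies add up to the number of observations (30 here) is a quick way to catch a missed or double-counted tally.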

GROUPED FREQUENCY DISTRIBUTION

The main purpose of a frequency distribution is to summarize numeric data in a
logical manner that enables an overall perspective of the data to be obtained
quickly and easily. When the number of distinct data values in a set of raw data
is large, there is too much information to be assimilated easily. In
this type of situation, a grouped frequency distribution is used. This type of
distribution is normally used for continuous raw data.
A grouped frequency distribution organizes data items into groups of values
called class intervals, each showing the number of items having values in the
group. There are various ways of presenting the groups. Some of them are
shown below:

A)
MARKS      FREQUENCY
1 – 5      7
6 – 10     4
11 – 15    13
16 – 20    17
21 – 25    3
26 – 30    6
TOTAL      50

B)
HEIGHT (cm)       FREQUENCY
120 ≤ h < 125     1
125 ≤ h < 130     5
130 ≤ h < 135     7
135 ≤ h < 140     4
140 ≤ h < 145     3
145 ≤ h < 150     5
TOTAL             25

C)
VALUE OF PROPERTY (K’000)    NUMBER OF PROPERTIES
10 and less than 15          2
15 and less than 20          6
20 and less than 25          14
25 and less than 30          21
30 and less than 35          33
35 and less than 40          19
40 and less than 45          5
TOTAL                        100

The values 10, 15, 20, 25, ... are called the class boundaries. For each
class interval there is a lower class boundary and an upper class boundary. The
upper class boundary (u.c.b.) of one interval is the lower class boundary (l.c.b.)
of the next interval.

CLASS WIDTH

The class width of an interval is equal to the difference between the upper
boundary and the lower boundary, i.e.
Class width = u.c.b. – l.c.b. Therefore, the width of the first interval in example
(B) is 125 – 120 = 5cm. Notice however that once items have been grouped in
class intervals, their individual values are lost. For example, we do not know the
value of the one item in the first interval of example (B), only that it lies between
120 and 125cm.

CUMULATIVE FREQUENCY

Another way of presenting frequencies is by forming cumulative frequencies. Any
frequency distribution can be adapted to form a cumulative frequency
distribution. The technique involves adding up the frequency values less than or
equal to a particular value. The simplest way to calculate cumulative frequency is
by adding the actual frequency in the class to the cumulative frequency
of the previous class.

EXAMPLE

Cumulative frequencies can be calculated as shown in the table below:

Marks      Frequency   Cumulative Frequency
25 – 29    2           2
30 – 34    1           2 + 1 = 3
35 – 39    0           3 + 0 = 3
40 – 44    1           3 + 1 = 4
45 – 49    4           4 + 4 = 8
50 – 54    11          8 + 11 = 19
55 – 59    9           19 + 9 = 28
60 – 64    4           28 + 4 = 32
65 – 69    2           32 + 2 = 34
70 – 74    3           34 + 3 = 37
75 – 79    1           37 + 1 = 38
80 – 84    1           38 + 1 = 39
85 – 89    1           39 + 1 = 40
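
The running-total rule used in the table (add each class frequency to the cumulative frequency of the previous class) can be sketched in Python as follows. This is only an illustration: `itertools.accumulate` is a standard-library function, while the names `classes`, `frequencies` and `cumulative` are chosen here for convenience.

```python
from itertools import accumulate

# Class intervals and frequencies from the marks table.
classes = [
    "25-29", "30-34", "35-39", "40-44", "45-49", "50-54", "55-59",
    "60-64", "65-69", "70-74", "75-79", "80-84", "85-89",
]
frequencies = [2, 1, 0, 1, 4, 11, 9, 4, 2, 3, 1, 1, 1]

# accumulate() yields the running total of the frequencies, i.e. the
# "less than" cumulative frequency for each class.
cumulative = list(accumulate(frequencies))

for marks, f, cf in zip(classes, frequencies, cumulative):
    print(marks, f, cf)
```

The last cumulative frequency should always equal the total number of observations, which gives a simple check on the table.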

There are two forms of cumulative frequency distributions:

a) the less than cumulative frequency distribution and
b) the more than cumulative frequency distribution

NOTE:

Distributions are not usually presented using the more than cumulative frequency
distribution. In all our examples, in this course, we shall deal solely with less than
cumulative frequency distributions.

EXAMPLE

Six weeks after planting, the heights of 30 broad bean plants were measured
correct to the nearest cm. The frequency distribution is given below:

Height (cm)   3 – 5   6 – 8   9 – 11   12 – 14   15 – 17   18 – 20
Frequency     1       2       11       10        5         1

Obtain the cumulative frequency distribution for this information.

SOLUTION:

Height (cm)   Frequency   Cumulative Frequency
≤ 2           0           0
≤ 5           1           1
≤ 8           2           3
≤ 11          11          14
≤ 14          10          24
≤ 17          5           29
≤ 20          1           30

GRAPHS

Another useful way of summarizing data is by using graphs. Important notes on
graphs are as follows:

• They often provide a very effective visual, descriptive summary of the
data.
• They offer a summary of the data at a glance. In the case of tables,
one requires a brief period of study before being able to ascertain the
meaning of their contents.
• They can be used to enhance or distort certain features of the data.
• Graphs are used mainly for efficient and convenient presentation of
statistical data and results. They are not generally used for the actual
analysis of the data. They may however be of use in indicating what
kind of analysis is feasible.
• There are a number of graphs. These include bar-graphs, histograms,
pie-charts, pictographs etc.

BAR- GRAPHS

A bar-graph (or bar chart) is another useful tool for visual summary of the data. It is a chart
consisting of a set of non-joining bars. A separate bar for each class is drawn to
a height proportional to the class frequency. The widths of the bars drawn for each
class are always the same and, if desired, each bar can be shaded or coloured
differently. (Note: do not confuse bar charts and histograms. Histograms
represent numeric data with joined bars. Bar charts represent non-numeric data
and have their bars separate from each other.) A bar chart has the following
characteristics:

• it is used to display categorical values
• it is most appropriate for qualitative variables, e.g. nominal values
• for each value (or group of values) of the variable, a rectangle is
constructed with base of constant width and height proportional to
frequency
• the rectangles are then displayed equally spaced from one another

ADVANTAGES OF BAR- CHARTS

a) Easy to construct
b) Easy to understand the values being represented by the bars
c) Easily adapted to show negative values (loss-or-gain charts) or for comparison
purposes.

There are no significant disadvantages.

Bar- graphs can be presented vertically or horizontally.

VERTICAL BAR-GRAPHS

Example:

The table below shows the annual copper production at KCM, Nchanga Division:

Year                         2001   2002   2003   2004   2005   2006   2007   2008
Production (metric tonnes)   100    250    150    150    200    250    200    350

Represent this information in a bar chart.

Solution:

[Figure: vertical bar chart, “ANNUAL COPPER PRODUCTION AT KCM”. Horizontal axis: YEAR (2001 to 2008); vertical axis scaled from 0 to 500 in steps of 50.]

HORIZONTAL BAR- CHARTS

Example:

The total number of goals scored by each team in the football Premier Division is
recorded as follows:

ZANACO                 35
GREEN BUFFALOES        20
KABWE WARRIORS         30
POWER DYNAMOS          25
NKANA F. C.            15
KITWE UNITED           25
MUFULIRA WANDERERS     10

Represent this information in the form of a horizontal bar-graph.


Solution:

[Figure: horizontal bar chart, “PREMIER DIVISION SCORE CHART”. Vertical axis: one bar per team; horizontal axis: NUMBER OF POINTS, scaled from 0 to 40 in steps of 5.]

CIRCULAR DIAGRAMS OR PIE CHARTS

Pie charts are sometimes referred to as circular diagrams or as divided circles. A
pie chart shows the totality of the data being presented using a single circle (or a
‘pie’). The circle is split into sectors, the size of each one being drawn in
proportion to the class frequency. Each sector can be shaded or coloured
differently if desired.

PROCEDURE FOR CONSTRUCTING A PIE CHART

a) Tabulate the data and calculate the proportion of the total that each
frequency represents.
b) Multiply each proportion by 360°, giving the sizes of the relevant sectors
that need to be drawn.
c) Construct the diagram by means of a pair of compasses and a protractor.
Do not overlook this point, because examiners dislike inaccurate and
roughly drawn diagrams.
d) Label the diagram clearly, using a separate ‘key’ if necessary.
e) It is best not to use a diagram of this kind with more than four or five
component parts.

Example:

The sales (in thousands of litres) of petrol from four filling stations A, B, C, and D,
are noted for the first week of June, and are shown in the table below:

Petrol Station                   A    B     C    D
Sales (in thousands of litres)   90   140   30   20

Construct a pie chart to illustrate this information.

Solution:

The total angle of 360° at the centre of a circle is divided according to the sales
at each of the stations.
The total sales (thousands of litres) = 90 + 140 + 30 + 20 = 280.
The angle representing the sales of Petrol station A is given by:

$$\frac{90}{280} \times 360^\circ = 115.7^\circ \text{ (to 1 d.p.)}$$

And so, for each of the petrol stations we have:

Petrol Station   Sales (thousands of litres)   Sector angle
A                90                            90/280 × 360° = 115.7°
B                140                           140/280 × 360° = 180°
C                30                            30/280 × 360° = 38.6°
D                20                            20/280 × 360° = 25.7°
TOTAL            280                           360°
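
The sector-angle calculation (proportion of the total multiplied by 360°) can be sketched in a few lines of Python. This is an illustrative check of the arithmetic only; the names `sales` and `angles` are assumptions made for the example.

```python
# Sales in thousands of litres for each petrol station, from the example.
sales = {"A": 90, "B": 140, "C": 30, "D": 20}

total_sales = sum(sales.values())

# Each sector angle is the station's share of total sales times 360 degrees,
# rounded to 1 decimal place as in the worked solution.
angles = {
    station: round(amount / total_sales * 360, 1)
    for station, amount in sales.items()
}

for station, angle in angles.items():
    print(station, sales[station], angle)
```

Because each angle is rounded, the rounded angles of a pie chart will not always add up to exactly 360°; a small discrepancy of this kind is normal.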

PIE CHART SHOWING THE SALES OF PETROL (thousands of litres)

ADVANTAGES OF PIE CHARTS

a) It is a dramatic and appealing way of presenting data.
b) It is good for comparing classes in relative terms.

DISADVANTAGES

a) Compilation is laborious. Circles should not be drawn free hand and
sectors should be drawn using a protractor. However, without using a
protractor (once the size of each sector has been determined), their
physical size within the circle can intelligently be guessed at.
b) It can be untidy if there are many classes (say 8 or more) and if different
shadings or colourings are being used.

HISTOGRAMS

A frequency distribution can also be presented pictorially by a histogram. A
histogram is a chart consisting of a set of vertical bars in which the bars are
drawn so that the area of each bar is proportional to the frequency,
i.e. AREA ∝ FREQUENCY

CHARACTERISTICS OF A HISTOGRAM

a) Each bar represents just one class. The bar width corresponds to the class
width and, when the classes are of equal width, the bar height corresponds to
the class frequency.
b) The bars are joined together. (Whereas there is space between each bar
in a bar graph, there is no space between the bars in a histogram).
c) The vertical axis represents the frequency (or the frequency density when the
class widths are unequal) and the horizontal axis
represents the data values. Both axes must be scaled and labelled clearly.
d) Each interval is represented by a bar with base defined by the end points
of the interval.
e) While bar graphs display frequencies by height, histograms represent
frequencies by their areas.
f) Much confusion can be avoided by using intervals of equal width. In this
case, the height of each bar will be proportional to the frequency.
g) The histogram as a whole must have a title.

Examples.

A) HISTOGRAMS WITH EQUAL CLASS WIDTHS.

The table below gives the heights of 34 students with the data grouped in
class intervals of 5. Measurements were taken correct to the nearest cm.

Height (cm)   140 – 144   145 – 149   150 – 154   155 – 159   160 – 164   165 – 169
Frequency     3           8           4           9           6           4

Draw a histogram to illustrate this information.

Solution:
Because the heights were measured to the nearest cm, the class boundaries are
139.5, 144.5, 149.5, 154.5, 159.5, 164.5 and 169.5.
The class widths are: 5, 5, 5, 5, 5, 5.
Since the class intervals are of equal width, the frequency can be used for the
height of each rectangle.

Example:
B) HISTOGRAM WITH UNEQUAL CLASS WIDTHS

Sixty lecturers were asked to record the duration, to the nearest minute, of
their next telephone call. The results were as follows:

Time (minutes)   0 – 9   10 – 14   15 – 19   20 – 24   25 – 34
Frequency        13      19        12        7         9

Draw a histogram to represent the data.

Solution:

Because the times were recorded to the nearest minute, the class boundaries are
−0.5, 9.5, 14.5, 19.5, 24.5 and 34.5.
The class widths are: 10, 5, 5, 5, 10.
In this case, the class widths are not equal, so the heights of the bars must be
adjusted so that the areas of the bars are proportional to the frequencies. The best
way to do this is to calculate the frequency density.

$$\text{FREQUENCY DENSITY} = \frac{\text{FREQUENCY}}{\text{CLASS WIDTH}}$$

We use “frequency density” as the heights of the rectangles.
The calculations are shown in the table below:

Time (minutes)   Class Width   Frequency   Frequency Density
0 – 9            10            13          13/10 = 1.3
10 – 14          5             19          19/5 = 3.8
15 – 19          5             12          12/5 = 2.4
20 – 24          5             7           7/5 = 1.4
25 – 34          10            9           9/10 = 0.9
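
The frequency density column can be reproduced with the short Python sketch below, which simply divides each frequency by its class width and then picks the class with the greatest density. It is offered only as a check on the table; the tuple layout and the names `rows`, `densities` and `modal_class` are illustrative choices.

```python
# (class label, class width, frequency) for the telephone call data.
rows = [
    ("0-9", 10, 13),
    ("10-14", 5, 19),
    ("15-19", 5, 12),
    ("20-24", 5, 7),
    ("25-34", 10, 9),
]

# Frequency density = frequency / class width; this is the bar height.
densities = {label: frequency / width for label, width, frequency in rows}

# The modal class is the interval with the greatest frequency density.
modal_class = max(densities, key=densities.get)

for label, width, frequency in rows:
    print(label, width, frequency, densities[label])
print("modal class:", modal_class)
```

Multiplying each density back by its class width recovers the original frequency, which is why the area of each bar, and not its height, represents the frequency.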

DURATION OF TELEPHONE CALLS

Note:

1. Frequency = frequency density × class width
2. Modal class = interval with greatest frequency density.

MEASURES OF LOCATION
When interpreting data, we often look for a “typical value”. Measures of location,
commonly known as averages, provide such a value. They are single
values intended as representatives, which can neatly characterize a whole
group. Averages are commonly used to compare samples of the same kind. We
speak of goal averages, or class averages for marks in an examination, the
average wage of the population or the average rainfall in a certain area.
The measures of location are the mean, the mode and the median.

THE MEAN, X̄
The mean is often denoted by x̄. The mean is defined as “the sum of the
values divided by the number of the values”.

$$\bar{x} = \frac{\text{the sum of the values}}{\text{the number of the values}}$$

NOTATION FOR THE SUM OF THE VALUES.

Adding the values of a set of items together occurs in so many
different formulae in statistics that it has its own notation, enabling sums
to be written very compactly.
x1 + x2 + x3 + ... + xn is written as ∑xi, i = 1, 2, 3, ..., n.

Note: The symbol ∑ means “the sum of” and is read ‘sigma’. ∑ is the
Greek capital letter ‘S’ (for sum), and ∑x can simply be
translated as ‘add up all the x-values under consideration’.

FORMULA FOR THE MEAN OF A SET OF VALUES


The mean of a set of values x1, x2, x3, ..., xn is calculated as
follows:

Mean, x̄ = ∑x / n
= (x1 + x2 + x3 + ... + xn) / n

Example:
Find the mean of the set of numbers 63, 65, 67, 68, 69.

Solution:
Here n = 5
∑x = 63 + 65 + 67 + 68 + 69
= 332
Therefore,

Mean, x̄ = ∑x / n
= 332 / 5
= 66.4

The mean of the set of numbers is 66.4.

FINDING THE MEAN OF A FREQUENCY DISTRIBUTION

a) Mean for ungrouped data


For a frequency distribution,

Mean, x̄ = ∑fx / ∑f

Example:

30 University Students were asked how many courses they were each taking in
the first semester of 2009. The results are set out in the frequency distribution
given below:

Number of courses ,x 1 2 3 4 5
Frequency , f 11 10 5 3 1

Calculate the mean number of courses taken by each student.


Solution:

x      f      fx
1      11     11
2      10     20
3      5      15
4      3      12
5      1      5
       ∑f = 30    ∑fx = 63

Mean, x̄ = ∑fx / ∑f
= 63 / 30
= 2.1

The mean number of courses per student is 2.1.
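
As a quick check of the ∑fx / ∑f calculation, here is a minimal Python sketch. The function name `frequency_mean` is an illustrative choice, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction

def frequency_mean(values, freqs):
    """Return sum(f * x) / sum(f) as an exact fraction."""
    total_fx = sum(f * x for x, f in zip(values, freqs))
    return Fraction(total_fx, sum(freqs))

courses = [1, 2, 3, 4, 5]        # x: number of courses
students = [11, 10, 5, 3, 1]     # f: number of students
mean_courses = frequency_mean(courses, students)
print(float(mean_courses))  # 2.1
```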

b) Mean for grouped data

When the data has been grouped into intervals, we do not know the actual
values, so we can only ‘estimate’ the mean. We take the mid-point of an interval
to represent that interval.

Note: Mid-point of an interval = ½ (upper class boundary + lower class boundary)
= ½ (u.c.b. + l.c.b.)

Example:

The lengths of 40 bean pods were measured to the nearest cm and grouped as
follows:

Length (cm)    4 - 8    9 - 13    14 - 18    19 - 23    24 - 28    29 - 33
Frequency      2        4         7          14         8          5

Estimate the mean length of the bean pods.

Solution:
The mid-point of the interval 4 – 8 is ½ (4 + 8) = ½ x 12 = 6, so we assume
that the values in that interval are represented by 6. We find the other mid-points
and form a table:


Length (cm)    Mid-point (x)    f     fx
4 – 8          6                2     12
9 – 13         11               4     44
14 – 18        16               7     112
19 – 23        21               14    294
24 – 28        26               8     208
29 – 33        31               5     155
                          ∑f = 40    ∑fx = 825

Mean, x̄ = ∑fx / ∑f
= 825 / 40
= 20.625.

Therefore, the mean length of the bean pods is 20.625cm.
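
The grouped calculation can be sketched in Python as follows. The function name `grouped_mean` and the use of the stated class limits to find each mid-point are assumptions for illustration (for these classes the limits and the boundaries give the same mid-points).

```python
def grouped_mean(class_limits, freqs):
    """Estimate the mean of grouped data using class mid-points."""
    midpoints = [(low + high) / 2 for low, high in class_limits]
    total_fx = sum(f * x for x, f in zip(midpoints, freqs))
    return total_fx / sum(freqs)

pod_classes = [(4, 8), (9, 13), (14, 18), (19, 23), (24, 28), (29, 33)]
pod_freqs = [2, 4, 7, 14, 8, 5]
pod_mean = grouped_mean(pod_classes, pod_freqs)
print(pod_mean)  # 20.625
```

The result is only an estimate, because every pod in a class is treated as if its length equals the class mid-point.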

THE MEDIAN:

For a set of observations arranged in “order of size”, the median is the value
50% of the way through the distribution, i.e. the middle value. For ungrouped
data we can find the median (middle value) according to the following rule:
If there are n observations, arranged in order of size, the median is the
½ (n + 1)th observation. We find that:
i) if n is odd, there is a middle value and this is the median.
ii) if n is even, there are two middle values. If these are c and d then the
median is ½ (c + d).

Examples:

Find the median of each of the following sets of numbers:


a) 7, 7, 2, 3, 4, 2, 7, 9, 31.
b) 36, 41, 27, 32, 29, 38, 39, 43.

Solutions:

For (a) we first arrange the numbers in order of size:


2, 2, 3, 4, 7, 7, 7, 9, 31.

The median is the ½ (9 + 1)th value, i.e. the 5th value.

So the median = 7.

For (b): arranging the numbers in order of size:

27, 29, 32, 36, 38, 39, 41, 43.


The median is the ½ ( 8 +1)th value. i.e. the 4.5th value. This does not exist, so
we consider the 4th and 5th values.

Median = ½ (36 + 38)
= ½ x 74
= 37.
Therefore, the median = 37.
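
The rule for odd and even n can be written as a short Python sketch. The function name `median` is an illustrative choice; Python's standard `statistics.median` gives the same results.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([7, 7, 2, 3, 4, 2, 7, 9, 31]))       # 7
print(median([36, 41, 27, 32, 29, 38, 39, 43]))   # 37.0
```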

MEASURES OF SPREAD OR DISPERSION

There are several types of measures of dispersion:

1. The range
2. The quartiles
3. Interquartile range
4. Mean deviation
5. Standard deviation

THE RANGE

The range is based entirely on the extreme values of a set of data. It is the
difference between the highest value and the lowest value.

Range = Highest value – Lowest value.

Examples:
Consider the following sets of numbers:
a) 7, 8, 11, 17, 19.
b) 5, 6, 7, 8, 9.
c) -193, -93, 7, 107, 207.
Solutions: In (a) range = 19 – 7 = 12.
(b) range = 9 - 5 = 4.
(c) range = 207 – (-193) = 400.


THE QUARTILES

The median divides a collection of ordered measures into two equal parts. The
quartiles divide the measures into four equal parts. If the measures are arranged
in order on a line as shown below, Q1 is the lower quartile, Q2 is the median and
Q3 is the upper quartile.

Lowest value ----- Q1 ----- Q2 ----- Q3 ----- Highest value

Note: For n observations, arranged in order of size, the lower quartile, Q1 is the
value 25% of the way through the distribution and the upper quartile Q3 is 75%
of the way through the distribution.

Examples:

Find the lower quartile Q1, the median Q2 and the upper quartile Q3 for each of
the following sets of data:

a) 3, 3, 5, 6, 8, 9, 12, 14, 19, 20, 24.


Solution: Fix the position of Q2 first, then Q1 and Q3.
Here, lower quartile Q1 = 5
median Q2 = 9
upper quartile Q3 = 19

b) 20, 23, 23, 26, 27, 28.

Solution: lower quartile Q1 = 23


median Q2 = ½ ( 23 + 26 ) = 24.5
upper quartile Q3 = 27

c) 10, 12, 13, 15, 19, 19, 24, 26, 27.

Solution: lower quartile Q1 = ½ ( 12+13) = 12.5


median Q2 = 19
upper quartile Q3 = ½ ( 24 + 26) = 25.

d) 147, 150, 154, 158, 159, 162, 164, 165.

Solution: lower quartile Q1 = ½(150 + 154) = 152


median Q2 = ½(158 + 159) = 158.5.
upper quartile Q3 = ½(162 + 164 ) = 163.


We prefer this method for finding the values of the quartiles for ungrouped
data. However, often the following rule is used:
Q1 = ¼ ( n +1)th value

Q3 = ¾ ( n+1)th value
This rule agrees with our method when n is odd, but there is a discrepancy
when n is even. However, it does not make a great deal of difference which
method is used.

INTERQUARTILE RANGE

The interquartile range is a useful measure of spread of a distribution. It is the
range of the middle 50% of the values and is found by subtracting the lower
quartile from the upper quartile.

Interquartile range = upper quartile – lower quartile
= Q3 – Q1

The advantage of this range is that it is not affected by extreme values.

SEMI- INTERQUARTILE RANGE

The semi- interquartile range = ½(Q3 – Q1)

Example: Find the semi- interquartile range of the following set of numbers:
2, 3, 3, 9, 6, 6, 12, 11, 8, 2, 3, 5, 7, 5, 4, 4, 5, 12, 9.

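Solution: First arrange the numbers in ascending order:

2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 6, 7, 8, 9, 9, 11, 12, 12.

There are 19 values, so Q2 is the 10th value (5). Q1 is the middle of the nine
values below Q2 (the 5th value) and Q3 is the middle of the nine values above
Q2 (the 15th value). Hence Q1 = 3 and Q3 = 9.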

Therefore, interquartile range = Q3 – Q1 = 9- 3 =6.


Semi- interquartile range = ½( Q3- Q1) = ½ x 6 =3.
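
The "fix Q2 first, then Q1 and Q3" method used in the quartile examples can be sketched in Python. The function names are illustrative, and this sketch implements the median-of-halves method, not the ¼(n + 1) rule, so it can differ from that rule when n is even.

```python
def middle(ordered):
    """Median of an already ordered list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3): the median, then the medians of the lower and upper halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # When n is odd the median itself is left out of both halves.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return middle(lower), middle(ordered), middle(upper)

print(quartiles([3, 3, 5, 6, 8, 9, 12, 14, 19, 20, 24]))     # (5, 9, 19)
print(quartiles([147, 150, 154, 158, 159, 162, 164, 165]))   # (152.0, 158.5, 163.0)

data = [2, 3, 3, 9, 6, 6, 12, 11, 8, 2, 3, 5, 7, 5, 4, 4, 5, 12, 9]
q1, q2, q3 = quartiles(data)
print(q3 - q1, (q3 - q1) / 2)  # 6 3.0
```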

MEAN DEVIATION
It is usual to consider how spread out the numbers are either side of the
mean. The mean deviation is a measure of dispersion that gives the average
absolute difference ( i.e. ignoring the ‘minus’ sign) between each item and the
mean. The mean deviation is much more representative than the range since
all item values are taken into account in its calculation.


FORMULA FOR THE CALCULATION OF THE MEAN DEVIATION


The mean deviation of a set of n numbers (x1, x2, x3, ..., xn) is given by:

a) For a set, Mean deviation = ∑|x − x̄| / n
b) For a frequency distribution, Mean deviation = ∑f|x − x̄| / ∑f
where x̄ = mean of the set of numbers.
Note: The modulus symbol |……|, means “the absolute value of” and simply
ignores the ‘minus’ sign of the expression inside it. We write | x | to mean the
magnitude or ‘modulus’ of x. We are interested only in size of x and we can
disregard the sign.
For example, | -7 | = 7; | 6 – 10 | = | - 4 | = 4; | 24 | = 24 and so on.

EXAMPLES:
a) Mean deviation for a set of numbers
Calculate the mean deviation of 43, 75, 48, 39, 51, 47, 50, and 47.
Solution: First we determine the mean x̄ as
x̄ = ∑x / n
= 400 / 8
= 50.

Therefore, mean deviation = ∑|x − x̄| / n

= ( |43−50| + |75−50| + |48−50| + |39−50| + |51−50| + |47−50| + |50−50| + |47−50| ) / 8

= (7 + 25 + 2 + 11 + 1 + 3 + 0 + 3) / 8
= 52 / 8
= 6.5
This means that each value in the set is, on average, 6.5 units away from the
common mean.

Example:
b) Mean deviation for a frequency distribution

The table below shows the number of successful sales made by the
salesmen employed by a large micro-computer firm in a particular
quarter:

Number of sales       0 - 4    5 - 9    10 - 14    15 - 19    20 - 24    25 - 29
Number of salesmen    1        14       23         21         15         6

Calculate the mean and the mean deviation of the number of sales.
Solution:
The standard layout and calculations are shown below. The mean is calculated
first, and then used to find the mean deviation.
Number of sales    Mid-value (x)    Frequency (f)    fx     |x − x̄|    f|x − x̄|
0 – 4              2                1                2      13.3       13.3
5 – 9              7                14               98     8.3        116.2
10 – 14            12               23               276    3.3        75.9
15 – 19            17               21               357    1.7        35.7
20 – 24            22               15               330    6.7        100.5
25 – 29            27               6                162    11.7       70.2
                          ∑f = 80    ∑fx = 1225    ∑f|x − x̄| = 411.8

Mean number of sales, x̄ = ∑fx / ∑f = 1225 / 80 = 15.3 (to 1 decimal place).

For the group 0 – 4, |x − x̄| = |2 − 15.3| = 13.3

For the group 5 – 9, |x − x̄| = |7 − 15.3| = 8.3

and so on.

Thus, mean deviation, md. = ∑f|x − x̄| / ∑f = 411.8 / 80 = 5.1
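
Both mean deviation examples can be reproduced with one Python sketch. The function name `mean_deviation` is illustrative. Note that the sketch keeps the unrounded mean (15.3125 for the sales data), so its intermediate totals differ slightly from the hand calculation, which rounds the mean to 15.3; the answer still rounds to 5.1.

```python
def mean_deviation(values, freqs):
    """Mean of the absolute deviations from the mean, weighted by frequency."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    return sum(f * abs(x - mean) for x, f in zip(values, freqs)) / n

# a) A plain set of numbers: every value has frequency 1.
numbers = [43, 75, 48, 39, 51, 47, 50, 47]
md_set = mean_deviation(numbers, [1] * len(numbers))
print(md_set)  # 6.5

# b) Grouped sales data, using the class mid-values.
mid_values = [2, 7, 12, 17, 22, 27]
salesmen = [1, 14, 23, 21, 15, 6]
md_sales = mean_deviation(mid_values, salesmen)
print(round(md_sales, 1))  # 5.1
```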

CHARACTERISTICS OF THE MEAN DEVIATION

a) The mean deviation can be regarded as a good representative measure of
dispersion that is not difficult to understand. It is useful for comparing the
variability between distributions of like nature.
b) Its practical disadvantage is that it can be complicated and awkward to
calculate if the mean is anything other than a whole number.
c) Because of the modulus sign, the mean deviation is virtually impossible to
handle theoretically and thus is not widely used in more advanced analysis.

A much more useful measure of dispersion is the standard deviation.

STANDARD DEVIATION, s.

The procedure for calculating the standard deviation is as follows:


a) For each value the deviation from the mean, x − x̄, is found.
b) This deviation is then squared, i.e. ( x - x̄ )2, so that all the values will
be positive.
c) The average of these values is then calculated.
d) Finally, the “positive” square- root is taken to give the standard deviation.

Standard deviation for a set of values


The standard deviation of a set of n numbers, with mean x̄ is given by s,
where
s = √( ∑(x − x̄)² / n )

In words, the standard deviation can be defined as “ the root of the mean of
the squares of deviations from the common mean”.
Note: If the mean is not a whole number, the calculations could involve some
awkward decimal- bound work.

Example: Find the standard deviation of 5, 6, 7, 8, 9.

Solution: The mean, x̄ = ∑x / n
= (5 + 6 + 7 + 8 + 9) / 5
= 35 / 5
= 7

Standard deviation, s = √( ∑(x − x̄)² / n )

x      x − x̄     (x − x̄)²
5      −2        4
6      −1        1
7      0         0
8      1         1
9      2         4
              ∑(x − x̄)² = 10

Therefore, s = √( ∑(x − x̄)² / n )
= √(10 / 5)
= √2
= 1.41.

ALTERNATIVE FORMULA FOR THE STANDARD DEVIATION


The formula given above is sometimes difficult to use, especially when the
mean x̄ is not an integer. So an alternative form is often used. This is:

S = √( ∑x² / n − x̄² )

This formula will always yield the same result for the standard deviation as the
other formula described earlier.

Note: It is useful to remember that ∑x² / n − x̄² can be thought of as “the mean
of the squares minus the square of the mean”.

Example: Find the mean and the standard deviation of the set of numbers 2, 3,
5, 6, 8.

Solution: Using S = √( ∑x² / n − x̄² )

x      x²
2      4
3      9
5      25
6      36
8      64
    ∑x² = 138

Mean, x̄ = ∑x / n = 24 / 5 = 4.8.

Standard deviation, S = √( ∑x² / n − x̄² )
= √( 138 / 5 − 4.8² )
= √4.56
= 2.14.
The alternative formula for the standard deviation is generally preferred since it
involves less awkward arithmetic.
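
That the two formulas agree can be checked numerically. In the Python sketch below the function names `sd_definition` and `sd_shortcut` are illustrative; both divide by n, as in the formulas above.

```python
import math

def sd_definition(values):
    """Root of the mean of the squared deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

def sd_shortcut(values):
    """Root of (mean of the squares minus the square of the mean)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum(x * x for x in values) / n - mean ** 2)

for data in ([5, 6, 7, 8, 9], [2, 3, 5, 6, 8]):
    print(round(sd_definition(data), 2), round(sd_shortcut(data), 2))
# 1.41 1.41
# 2.14 2.14
```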

STANDARD DEVIATION OF A FREQUENCY DISTRIBUTION

The following formulae for standard deviation have been adapted from the
formula for a set and can be used for both discrete and grouped distributions.
If the values x1, x2, x3, ..., xn occur with frequencies f1, f2, f3, ..., fn then
the standard deviation is given by:

S = √( ∑fx² / ∑f − x̄² )
Note: For grouped data, x is the mid-point of the class interval and is taken to
represent the interval.

Example: The table below shows the number of children per family for a group
of 20 families. The mean number of children per family is 2.9:

No. of children per family    1    2    3    4    5
Frequency (f)                 3    4    8    2    3

Calculate the standard deviation.


Solution: S = √( ∑fx² / ∑f − x̄² )

x      f      x²     fx²
1      3      1      3
2      4      4      16
3      8      9      72
4      2      16     32
5      3      25     75
   ∑f = 20        ∑fx² = 198

So, S = √( ∑fx² / ∑f − x̄² )
= √( 198 / 20 − 2.9² )
= √1.49
= 1.22.
The standard deviation is 1.22.

Example: The lengths of 32 leaves were measured correct to the nearest mm.
Find ( a ) the mean length
( b ) the standard deviation
Length (mm)    20 - 22    23 - 25    26 - 28    29 - 31    32 - 34
Frequency      3          6          12         9          2

Solution: The mid-points, x, of each class interval are considered.

Length (mm)    Mid-value (x)    x²      Frequency (f)    fx     fx²
20 - 22        21               441     3                63     1323
23 - 25        24               576     6                144    3456
26 - 28        27               729     12               324    8748
29 - 31        30               900     9                270    8100
32 - 34        33               1089    2                66     2178
                                  ∑f = 32    ∑fx = 867    ∑fx² = 23805

a) Mean, x̄ = ∑fx / ∑f
= 867 / 32
= 27.1 (to 1 decimal place).

b) Standard deviation, S = √( ∑fx² / ∑f − x̄² )
= √( 23805 / 32 − 27.09375² ), using the unrounded mean 867 / 32 = 27.09375
= √9.835
= 3.14

The mean length of the leaves is 27.1 mm and the standard deviation is 3.14 mm.
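
The frequency-distribution formula can be sketched in Python as below. The function name `mean_and_sd` is illustrative. The sketch keeps the mean unrounded, which is what produces the 9.835 under the square root in the leaf example.

```python
import math

def mean_and_sd(values, freqs):
    """Return (mean, standard deviation) using S = sqrt(sum(f*x^2)/sum(f) - mean^2)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    variance = sum(f * x * x for x, f in zip(values, freqs)) / n - mean ** 2
    return mean, math.sqrt(variance)

# Children per family (discrete data).
child_mean, child_sd = mean_and_sd([1, 2, 3, 4, 5], [3, 4, 8, 2, 3])
print(round(child_mean, 1), round(child_sd, 2))  # 2.9 1.22

# Leaf lengths (grouped data, class mid-values).
leaf_mean, leaf_sd = mean_and_sd([21, 24, 27, 30, 33], [3, 6, 12, 9, 2])
print(round(leaf_mean, 1), round(leaf_sd, 2))    # 27.1 3.14
```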

VARIANCE

Dealing with the square- root sign in calculating the standard deviation can be
very cumbersome, so we often consider the variance where

Variance = ( standard deviation)2


The variance of a set of numbers is given by S², where

Variance, S² = ∑x² / n − x̄²

Therefore, standard deviation = √variance


The variance is quite useful as a measure of dispersion and, indeed, is used for
numerous purposes in more advanced statistical analysis. For practical purposes
however it has one drawback, it is measured in square units. For example, if the
original units were say, in cm, then the mean will be in cm, but the variance
would be in cm2. This particular inconvenience can be overcome of course, by
taking the square-root of the above expression to give the standard deviation.

WEIGHTED MEANS
In some cases, it may not be suitable to calculate an ordinary mean. There
may be times when we wish to place greater emphasis on some of the
values, as illustrated in the following example:

Example: A candidate obtained the following results in her statistics
examination: Paper 1: 72%, Paper 2: 64%, Coursework: 73%. The
regulations state that the two written papers have equal weighting and
account for 80% of the final result, whereas the coursework accounts for
20%. What was the candidate’s final mark?

Solution: The results are in the following ratios, 40% : 40% : 20% = 2 : 2 : 1
For the final mark, we have to take this weighting into account.

Weighted mean = (2(72) + 2(64) + 1(73)) / (2 + 2 + 1)
= 345 / 5
= 69.
Therefore, the final mark is 69%.

In general, if x1, x2, x3, ..., xn are given weightings w1, w2, w3, ..., wn,
then,

Weighted mean = (w1x1 + w2x2 + w3x3 + ... + wnxn) / (w1 + w2 + w3 + ... + wn)
= ∑wixi / ∑wi, for i = 1, 2, 3, ..., n.
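
The general formula translates directly into a few lines of Python. The function name `weighted_mean` is an illustrative choice; the data are the candidate's marks and the 2 : 2 : 1 weights from the example.

```python
def weighted_mean(values, weights):
    """Sum of w * x divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

marks = [72, 64, 73]     # Paper 1, Paper 2, coursework
weights = [2, 2, 1]
final_mark = weighted_mean(marks, weights)
print(final_mark)  # 69.0
```

Using the percentage weights 40, 40 and 20 instead of 2, 2 and 1 gives the same answer, because only the ratio of the weights matters.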

ASSIGNMENT 1 DUE DATE 30TH JANUARY 2016


1. Find the weighted mean of the numbers 8 and 12 if they are given the
weights 2 and 3 respectively.
2. The prices of articles A, B and C are K30 000, K42 000 and K65 000. Find
the mean price if the three articles are given weights of 5, 3 and 2
respectively.
3. The weighted mean of the two numbers 30 and 15 is 20. If the weightings
are 2 and x respectively, find x.
4. The final mark allocated to a student is calculated from her mark in each
subject.
a) The class teacher worked out an ordinary mean.
b) The head teacher decided to weight the subjects in proportion to the
number of lessons per week, as shown in the table below:

Subject        Mark    Number of lessons per week
Mathematics    64%     5
English        52%     4
Science        71%     6
French         75%     3
History        82%     2

Which method gave the higher mark and by how much?

5. Two students, Mwansa and Bupe, take an examination in French, German
and English. The table below shows the marks for each student and the
weight to be applied to each subject.

Subject             French    German    English
Marks for Mwansa    80        72        46
Marks for Bupe      64        82        40
Weight              2         x         3

Calculate the value of x for which Mwansa and Bupe have the same weighted
mean mark and find the value of this mean.

SECTION C

PROBABILITY

LEARNING OBJECTIVES: At the end of this section you should be able to:
1. Demonstrate an understanding of the basic rules of
probability.
2. Explain the conditions under which the binomial and
Poisson distributions may be used and apply them to
compute probabilities.
3. Explain the characteristics of the normal distribution and
apply it to compute probabilities.

Probability is a concept that most people understand naturally, since words such
as “chance”, “likelihood”, “possibility”, “proportion” (and indeed probability itself)
are used as part of everyday speech. For example, most of the following which
might be heard in any business situation are in fact statements of probability:

a) There is a 30% chance that this job will not finish on time.
b) There is every likelihood that the business will make a profit next year.
c) “Nine times out of ten” he arrives late for his appointments.
d) There is no possibility of delivering the goods on Saturday.

e) The probability that he will pass his exams is 1/10.

TYPES OF PROBABILITY

There are basically two types of probability:

1. Theoretical ( or classical ) probability and


2. Experimental (or empirical) probability.

Theoretical probability is the name given to probability that is calculated without
the experiment being performed; that is, only using the information that is known
about the physical situation. Empirical probability is the name given to probability
that is calculated using the results of an experiment that has been performed a
number of times. Empirical probability is sometimes called relative frequency
probability.

The term experiment is used in probability theory to describe virtually any
process whose outcome is not known in advance with certainty.

SAMPLE SPACE

The collection of all possible outcomes of an experiment is called a sample
space.

i) We regard this as a set.
ii) Each element is an outcome.
iii) The sample space is the universal set in the language of sets.

EVENTS

An event is a subset of the sample space.

Probability is usually expressed as a fraction in its lowest terms or as a decimal.
Thus probability is defined as follows: “For a given event E of an experiment, the
probability of the event E occurring when the experiment is performed a number
of times is written as P(E) and given by:

P(E) = (No. of times that the event can occur) / (No. of different outcomes of the experiment)”

In set language, if the sample space, S, consists of a finite number of equally
likely outcomes, then the probability of an event A, written P(A), is defined as

P(A) = n(A) / n(S)

When an event is ‘absolutely certain’ to happen, we say that the probability is 1,
and when an event can ‘never’ happen, we say that the probability is 0. For
example, the probability that I will obtain a score of 1, 2, 3, 4, 5 or 6 when I throw
an ordinary die is 1, whereas the probability that I will obtain an 8 is 0.

Probability is measured on a scale from 0 to 1 and most events therefore have
probabilities between these extremes. We write:

0 ≤ P(A) ≤1

This means that the probability of an event A, is a number between 0 and 1 and
if P(A) = 0, then the event can not possibly occur and if P(A) = 1, then the event
is certain to occur.

TYPES OF EVENTS

1. MUTUALLY EXCLUSIVE EVENTS

i) Two events of the same experiment are said to be mutually exclusive if
their respective event sets do not overlap.
In this case, events A and B do not overlap so, P(A∩B) = 0.
ii) Mutually exclusive events are events that cannot happen at the same
time (i.e. cannot occur together).
iii) If an event A can occur ‘or’ an event B can occur, but A and B cannot
both occur, then the two events A and B are said to be mutually
exclusive.

2. INDEPENDENT EVENTS

i) Two events A and B are said to be independent if the occurrence of one of the
events will in no way affect the occurrence of the other.

ii) In other words, if either of the two events A and B can occur without being
affected by the other, then the two events are independent.

IMPORTANT NOTES

Note 1. Complementation: Let the number of outcomes in the sample space S
be n, so that n(S) = n. Let the number of outcomes in event A be r, so that
n(A) = r.

Then P(A) will denote the probability that the event A does occur. The probability
that the event A does not occur will be denoted by P(A’).

P(A) = n(A) / n(S) = r / n

Probability that A does not occur = P(A’) = 1 − P(A)

Therefore, P(A’) = 1 − P(A) or P(A’) + P(A) = 1.

Example:

A bag contains 3 red balls and 2 blue balls. A ball is picked at random from the
bag. What is the probability that it is:

a) Blue
b) Red
c) Not blue

Solutions:
Total number of balls = 5.
a) P(Blue) = 2/5
b) P(Red) = 3/5
c) P(not blue) = 1 − P(blue)
= 1 − 2/5
= 3/5

Note 2: If A and B are any two events of the same experiment such that P(A) ≠
0 and P(B) ≠ 0, then P(A or B) = P(A) + P(B) − P(A and B).

It is important to realize that P(A or B) means P(A occurs or B occurs or both A
and B occur).
“A or B” is represented in set notation by A U B, the union of the two sets A
and B.
“A and B” is represented by A ∩ B, the intersection of set A and set B. Therefore
Note 2 can be written in set notation as
P(A U B) = P(A) + P(B) − P(A ∩ B).
Examples:

1. Events A and B are such that P(A) = 19/30, P(B) = 2/5 and P(A U B) = 4/5.
Find P(A ∩ B).
Solution:
P(A U B) = P(A) + P(B) − P(A ∩ B)

⇒ 4/5 = 19/30 + 2/5 − P(A ∩ B)
⇒ P(A ∩ B) = 19/30 + 2/5 − 4/5
= 19/30 + 12/30 − 24/30
= 31/30 − 24/30
= 7/30
Therefore, P(A ∩ B) = 7/30

2. In a group of 20 adults, 4 out of the 7 women and 2 out of the 13 men wear
glasses. What is the probability that a person chosen at random from the group
is a woman or someone who wears glasses?

Solution:

Let W be the event “the person chosen is a woman” and G be the event “the
person chosen wears glasses”. Then,
P(W) = 7/20; P(G) = 6/20; P(W and G) = P(W ∩ G) = 4/20

P(W or G) = P(W U G)
= P(W) + P(G) − P(W ∩ G)
= 7/20 + 6/20 − 4/20
= 13/20 − 4/20
= 9/20.
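
The two examples above can be checked with exact fractions in Python. This is a sketch only; the list `group` is a hypothetical way of representing the 20 adults as (sex, wears glasses) pairs so that the result can be confirmed by direct counting.

```python
from fractions import Fraction

# Example 1: rearrange P(A U B) = P(A) + P(B) - P(A n B).
p_a, p_b, p_union = Fraction(19, 30), Fraction(2, 5), Fraction(4, 5)
p_both = p_a + p_b - p_union
print(p_both)  # 7/30

# Example 2: 7 women (4 with glasses) and 13 men (2 with glasses).
group = ([("W", True)] * 4 + [("W", False)] * 3
         + [("M", True)] * 2 + [("M", False)] * 11)

p_w = Fraction(sum(sex == "W" for sex, _ in group), len(group))
p_g = Fraction(sum(glasses for _, glasses in group), len(group))
p_w_and_g = Fraction(sum(sex == "W" and glasses for sex, glasses in group), len(group))

# Addition rule versus a direct count of "woman or wears glasses".
by_rule = p_w + p_g - p_w_and_g
by_count = Fraction(sum(sex == "W" or glasses for sex, glasses in group), len(group))
print(by_rule, by_count)  # 9/20 9/20
```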

LAWS OF PROBABILITY

1. THE ADDITION LAW FOR MUTUALLY EXCLUSIVE EVENTS

If an event A can occur or an event B can occur but not both A and B can occur
together, then the two events are said to be mutually exclusive. The two events
do not overlap and therefore, P (A∩B) = 0. The addition law is stated as follows:

When A and B are two mutually exclusive events, then


P(A or B) = P(AUB) = P(A) + P(B)

This is also referred to as the “ or rule”. We must take care to use it only when
the two events can not occur at the same time.

Example: In a race, the probability that John wins is 0.3, the probability that Paul
wins is 0.2 and the probability that Mark wins is 0.4. Find the probability that:

a) John or Mark wins (assume that there are no dead heats or a tie).
b) Neither John nor Paul wins.
Solutions: We assume that only one person can win, so the events are mutually
exclusive.
a) P( John or Mark wins ) = P (John wins ) + P( Mark wins)
= 0.3 + 0.4
= 0.7.
P(John or Mark wins ) = 0.7.

b) P( neither John nor Paul wins ) = 1 − P( John or Paul wins)


= 1− ( 0.3 + 0.2)
= 1− 0.5
= 0.5.
P ( neither John nor Paul wins) = 0.5.

Example: A lady has 8 brown socks, 6 red socks and 4 white socks in her
drawer. One night in the dark, she picked out a sock from the drawer. What
is the probability that the sock she picked out is:

a) white
b) not red
c) brown or red
d) red or white

Solutions: Total number of socks = 18.

a) P(white sock) = 4/18 = 2/9

b) P(sock not red) = 1 − P(red)
= 1 − 6/18
= 12/18
P(not red) = 12/18 = 2/3.

c) P(brown or red) = P(brown) + P(red)
= 8/18 + 6/18
= 14/18
= 7/9

d) P(red or white) = P(red) + P(white)
= 6/18 + 4/18
= 10/18 = 5/9.

2. THE MULTIPLICATION LAW FOR INDEPENDENT EVENTS.

Recall that independent events are events that can occur without affecting
the occurrence of each other.
If two events A and B are independent then,

P(A and B) = P(A ∩ B) = P(A) x P(B)

This is known as the multiplication law for independent events and is sometimes
known as the “and rule”.


Example:

A die is thrown twice. Find the probability of obtaining a 4 on the first throw and
an odd number on the second throw.
Solution: Let A be the event “a 4 is obtained on the first throw”, then P(A) = 1/6.
Let B be the event “an odd number is obtained on the second throw”.
Now the result on the second throw is not affected in any way by the result of
the first throw. Therefore, A and B are independent events. P(B) = 3/6 = 1/2.
P(A and B) = P(A) x P(B)
= 1/6 x 1/2
= 1/12
The probability that the first throw results in a 4 and the second throw results in
an odd number is 1/12.
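
The multiplication law can be confirmed for this example by listing all 36 equally likely outcomes of two throws. The Python sketch below is illustrative; the variable names are not from the text.

```python
from fractions import Fraction
from itertools import product

# All ordered pairs (first throw, second throw) for a fair six-sided die.
outcomes = list(product(range(1, 7), repeat=2))

# Event: a 4 on the first throw and an odd number on the second.
favourable = [(a, b) for a, b in outcomes if a == 4 and b % 2 == 1]

p_direct = Fraction(len(favourable), len(outcomes))
p_law = Fraction(1, 6) * Fraction(1, 2)   # P(A) x P(B)
print(p_direct, p_law)  # 1/12 1/12
```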
Example: A bag contains 5 red counters and 7 black counters. A counter is
drawn at random from the bag, the colour is noted and the counter is not replaced. A
second counter is then drawn from the same bag. Find the probability that:

a) the first counter is red and the second is black
b) both counters are black
c) the two counters are of the same colour.

Solutions: Total number of counters = 12. Because the first counter is not
replaced, only 11 counters remain for the second draw, so the probability for the
second draw depends on the colour of the first counter.

a) P(1st counter red) = 5/12
P(2nd counter black, given 1st red) = 7/11
Therefore, P(1st red and 2nd black) = 5/12 x 7/11 = 35/132

b) P(1st counter black) = 7/12
P(2nd counter black, given 1st black) = 6/11
Therefore, P(both counters black) = 7/12 x 6/11 = 42/132 = 7/22.

c) P(same colour) = P(1st red and 2nd red) + P(1st black and 2nd black)
= (5/12 x 4/11) + (7/12 x 6/11)
= 20/132 + 42/132
= 62/132
= 31/66
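
Drawing without replacement can be checked by listing every ordered pair of distinct counters. In the Python sketch below, `itertools.permutations` treats each of the 12 counters as a separate item, so it produces the 12 x 11 = 132 equally likely ordered draws; the helper name `prob` is illustrative.

```python
from fractions import Fraction
from itertools import permutations

counters = ["R"] * 5 + ["B"] * 7
draws = list(permutations(counters, 2))   # ordered draws, no replacement

def prob(event):
    """Fraction of the equally likely ordered draws that satisfy the event."""
    return Fraction(sum(1 for d in draws if event(d)), len(draws))

p_red_then_black = prob(lambda d: d == ("R", "B"))
p_both_black = prob(lambda d: d == ("B", "B"))
p_same_colour = prob(lambda d: d[0] == d[1])

print(p_red_then_black, p_both_black, p_same_colour)  # 35/132 7/22 31/66
```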

TREE DIAGRAMS

A useful way of tackling many probability problems is to draw a probability tree.
In a probability tree or tree diagram the probabilities are written on the branches.
These are then used to calculate the desired outcomes by applying the addition
law or the multiplication law.

PERMUTATIONS
A permutation is an arrangement of a group of items in a particular way. The
order in which the items are arranged is VERY IMPORTANT. For example,
consider the different ways of arranging the letters ABC. These letters can be
arranged as:
ABC , ACB, BCA, BAC, CAB, CBA. Notice that ABC is a different permutation
from ACB and so on..
Note: Remember that in a permutation, the order in which the items are
arranged matters.

THE PERMUTATION FORMULA

The number of different permutations, or ordered arrangements, of r objects
taken from n unlike objects is written as nPr and is calculated as follows:

nPr = n! / (n − r)!

Example: 7P3 = 7! / (7 − 3)! = 7! / 4! = (7 x 6 x 5 x 4 x 3 x 2 x 1) / (4 x 3 x 2 x 1) = 7 x 6 x 5 = 210.

That is, there are 210 different permutations of 3 objects from 7 unlike objects.
On a calculator this can be obtained directly as: 7 nPr 3 =

Example: How many ways are there of arranging 3 different jobs between 5 men
where any man can do only one job?

Answer: The number of arrangements possible = 5P3
= 5! / (5 − 3)!
= 5! / 2!
= (5 x 4 x 3 x 2 x 1) / (2 x 1)
= 5 x 4 x 3
= 60 ways
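
The permutation formula can be checked in Python. The helper `n_p_r` below is an illustrative implementation of n! / (n − r)!; `math.perm` (available from Python 3.8) computes the same quantity directly.

```python
from itertools import permutations
from math import factorial, perm

def n_p_r(n, r):
    """Number of ordered arrangements of r objects taken from n unlike objects."""
    return factorial(n) // factorial(n - r)

print(n_p_r(7, 3), perm(7, 3))   # 210 210
print(n_p_r(5, 3))               # 60

# Listing the arrangements of the letters ABC shows that order matters.
arrangements = ["".join(p) for p in permutations("ABC")]
print(arrangements)  # ['ABC', 'ACB', 'BAC', 'BCA', 'CAB', 'CBA']
```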

COMBINATIONS

When considering the number of combinations of r objects from n items, the


order in which the objects are placed is not important. Re- arranging the items
within a combination does not give a different combination.

In a combination, the arrangement of the letters ABC is the same as ACB and
the same as CBA or BAC. The order is not important.

Example: Out of the five people A, B, C, D, E in an office, just three are to be


selected to go to an exhibition. In how many ways can three be chosen?

Solution: Here the selection BCD will be the same as the selection CBD or
BDC. Thus, we are interested in combinations. The list of combinations will be
as: ABC, ABD, ABE, ACD, ACE, ADE, BCD, BCE, BDE, CDE.
Therefore, there are 10 different ways that the three people can be chosen.

THE COMBINATION FORMULA

The number of different combinations of r objects from n distinct objects is
written as nCr and is calculated as:

nCr = n! / (r!(n − r)!)

So, the number of ways of choosing 3 people from the 5 people (A, B, C, D, E) is:

5C3 = 5! / (3!(5 − 3)!) = 5! / (3!2!) = (5 x 4 x 3 x 2 x 1) / ((3 x 2 x 1) x (2 x 1)) = (5 x 4) / (2 x 1) = 10 ways.

Note: nCr is sometimes written in bracket form, with n placed above r inside a
single pair of brackets.


Example: A team of 4 is chosen at random from 5 girls and 6 boys. In how many
ways can the team be chosen if:

(a) there are no restrictions
(b) there must be more boys than girls?

Solutions: There are 11 people from whom 4 are chosen. The order in which
they are chosen is not important.

a) Number of ways of choosing the team = 11C4
= 11! / (4!7!)
= 330 ways.

If there are no restrictions, the team can be chosen in 330 ways.
On a calculator, this can be obtained directly as: 11 nCr 4 =

b) If there are to be more boys than girls, then there must be 3 boys and 1 girl or
4 boys and no girl.
Number of ways of choosing 3 boys and 1 girl = 6C3 x 5C1
= 6! / (3!3!) x 5! / (1!4!)
= 20 x 5
= 100 ways

Number of ways of choosing 4 boys and no girl = 6C4 x 5C0
= 6! / (4!2!) x 5! / (0!5!)
= 15 x 1
= 15 ways.
Therefore the number of ways of choosing the team if there are more boys than
girls is 100 + 15 = 115 ways.
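
Both answers can be confirmed by listing every possible team. In the Python sketch below the labels G1 to G5 and B1 to B6 are hypothetical names for the girls and boys; `itertools.combinations` generates each unordered selection exactly once.

```python
from itertools import combinations

girls = [f"G{i}" for i in range(1, 6)]   # 5 girls
boys = [f"B{i}" for i in range(1, 7)]    # 6 boys

teams = list(combinations(girls + boys, 4))

# In a team of 4, "more boys than girls" means at least 3 boys.
more_boys = [t for t in teams if sum(p.startswith("B") for p in t) >= 3]

print(len(teams))      # 330
print(len(more_boys))  # 115
```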

Example (Combinations and probability):


A committee of 4 must be chosen from 3 women and 4 men. Calculate:
a) In how many ways the committee can be chosen.
b) In how many ways 2 men and 2 women can be chosen
c) The probability that the committee consists of 2 men and 2 women
d) The probability that the committee consists of at least 2 women.

Solutions: It is only selections (not arrangements) that we are interested in here.
That is, we are concerned with combinations.

a) We need to choose 4 people from 7. This can be done in 7C4 ways
= $\frac{7!}{4!\,3!}$ = 35 ways.

b) From 4 men, 2 men can be chosen in 4C2 ways = $\frac{4!}{2!\,2!}$ = 6 ways.
From 3 women, 2 women can be chosen in 3C2 ways = $\frac{3!}{2!\,1!}$ = 3 ways.
Therefore, 2 men and 2 women can be selected in 6 x 3 = 18 ways.

c) P(2 men and 2 women) = $\frac{18}{35}$, from (a) and (b) above.

d) P(committee consists of at least 2 women) = P(2 women and 2 men) + P(3 women and 1 man)

$$= \frac{{}^{4}C_{2} \times {}^{3}C_{2}}{{}^{7}C_{4}} + \frac{{}^{4}C_{1} \times {}^{3}C_{3}}{{}^{7}C_{4}} = \frac{6 \times 3}{35} + \frac{4 \times 1}{35} = \frac{22}{35}$$
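
As a cross-check on parts (c) and (d), the sketch below computes the same probabilities as exact fractions. The helper names `p_exact` and `p_at_least` are assumptions made for illustration, not notation used in the course.

```python
from fractions import Fraction
from math import comb

# Committee of 4 chosen from 3 women and 4 men (figures from the example above).
WOMEN, MEN, SIZE = 3, 4, 4

def p_exact(women):
    """P(the committee contains exactly `women` women)."""
    ways = comb(WOMEN, women) * comb(MEN, SIZE - women)
    return Fraction(ways, comb(WOMEN + MEN, SIZE))

def p_at_least(women):
    """P(the committee contains at least `women` women)."""
    return sum(p_exact(w) for w in range(women, min(WOMEN, SIZE) + 1))

print(p_exact(2), p_at_least(2))
```

Working with `Fraction` keeps the answers in the same form as the hand calculation, so they can be compared directly without rounding.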

ASSIGNMENT 2 DUE DATE 28TH FEBRUARY 2016


1. A box contains 8 white balls and 3 green balls. Two balls
are taken out at random from the box, one after the other,
without replacement.

a) Draw a tree diagram to show all the possible outcomes.


b) Use your diagram to calculate the probability that:
i) Both balls are white.
ii) The first ball is green and the second ball is white.
iii) One ball is white and the other is green, in any order.

2. A tin contains 3 white beads, 8 green beads and 4 blue
beads. A bead is selected at random and is not
replaced. A second bead is then selected.
By drawing a probability tree, find the probability of
obtaining:
a) two blue beads
b) one white bead and one green bead
c) two beads of the same colour
d) two beads of different colours.

3. In a group of 30 students, all study at least one of the
subjects physics and biology. 20 attend the physics class
and 21 attend the biology class. Find the probability that a
student chosen at random studies both physics and
biology.
4. a) For events A and B it is known that P(A) = $\frac{2}{3}$, P(A ∪ B) = $\frac{3}{4}$
and P(A ∩ B) = $\frac{5}{12}$.
Find P(B).
5. For events A and B it is known that P(A) = P(B), P(A ∩ B) = 0.1
and P(A ∪ B) = 0.7. Find P(A).
6. In a street containing 20 houses, 3 households do not
own a television set; 12 households have a black and
white set and 7 households have a colour and a black and
white set. Find the probability that a household chosen at
random owns a colour television set.

THE BINOMIAL DISTRIBUTION

Consider an experiment which has two possible outcomes, one of which may be
termed “success” and the other “failure”, and which is repeated n times (n trials).
A binomial situation arises when we consider the number of
successes occurring. For example: i) toss a coin six times and consider the
number of heads that come up; ii) throw a die ten times and consider the number
of times you obtain a six.

CHARACTERISTICS OF A BINOMIAL SITUATION

A binomial situation can be recognized by the following characteristics:

i) The existence of a trial (of an experiment) which is defined in terms of two
states, “success” and “failure”.
ii) Identical, independent trials are repeated a number of times, yielding a number of
successes.

Note: a) Success/failure can be interpreted in any way that is convenient. For
example, a situation can be good/bad; an item can be defective/o.k.; a company
can make a profit/loss and so on.

b) The variable (x) in a binomial frequency distribution will always be a whole number in
the range 0 to n (where n is the number of repeated trials); the frequency
(f) is the number of times that each value of x occurs.

THE BINOMIAL PROBABILITY FORMULA

If the probability that an experiment results in a successful outcome is p, and the
probability that the outcome is a failure is q, where q = 1 − p, and n = the number
of trials, then the probability of obtaining x successes is given by:

$$\Pr(x) = {}^{n}C_{x}\, p^{x} q^{n-x}$$ where x can take one of the values 0, 1, 2, 3, …, n.

If the probabilities for all values of x are calculated and tabulated against their
respective values of x, the result is known as a “binomial probability distribution”.
If X follows a binomial probability distribution we write X ~ Bin (n, p)
where n is the number of independent trials and p is the probability of a
successful outcome in one trial. n and p are called the parameters of the
distribution. So we read the statement:

X ~ Bin (n, p) as: X follows a binomial distribution with parameters n and p.

Example: The probability that a person supports the ruling party, MMD, is 0.6.
Find the probability that in a randomly selected sample of 8 voters, there are

a) exactly 3 people who support the MMD;
b) more than 5 people who support the MMD.

Solution: We will consider supporting the MMD as success. Then p = 0.6 and q
= 1 − 0.6 = 0.4.
Let X be the number of MMD supporters. Then X ~ Bin (n, p) with n = 8 and p =
0.6, so X ~ Bin (8, 0.6) and

$$\Pr(x) = {}^{n}C_{x}\, p^{x} q^{n-x} = {}^{8}C_{x}\,(0.6)^{x}(0.4)^{8-x}$$ where x = 0, 1, 2, 3, …, 8.

a) We require P(X = 3).

$$P(X = 3) = {}^{8}C_{3}\,(0.6)^{3}(0.4)^{5} = 0.124$$

Therefore, the probability that there are exactly 3 people who support the
MMD is 0.124.

b) We require P(X > 5).

$$P(X > 5) = P(X = 6) + P(X = 7) + P(X = 8)$$
$$= {}^{8}C_{6}\,(0.6)^{6}(0.4)^{2} + {}^{8}C_{7}\,(0.6)^{7}(0.4)^{1} + {}^{8}C_{8}\,(0.6)^{8}(0.4)^{0}$$
$$= 28(0.6)^{6}(0.4)^{2} + 8(0.6)^{7}(0.4)^{1} + (0.6)^{8}$$
$$= 0.315$$

Therefore, the probability that there are more than 5 MMD supporters is
0.315.
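
The binomial formula is easy to evaluate by machine. The following is a minimal sketch, assuming Python 3.8 or later; `binom_pmf` and `binom_tail_gt` are illustrative names chosen here, not standard library functions.

```python
from math import comb

def binom_pmf(x, n, p):
    """Pr(X = x) for X ~ Bin(n, p): nCx * p^x * q^(n-x) with q = 1 - p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_tail_gt(k, n, p):
    """Pr(X > k), found by summing the probabilities for k+1, ..., n."""
    return sum(binom_pmf(x, n, p) for x in range(k + 1, n + 1))

# Sample of 8 voters with p = 0.6, as in the example above.
print(round(binom_pmf(3, 8, 0.6), 3))      # exactly 3 supporters
print(round(binom_tail_gt(5, 8, 0.6), 3))  # more than 5 supporters
```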

MEAN AND VARIANCE OF THE BINOMIAL DISTRIBUTION

Given a binomial distribution such that X ~ Bin (n, p) where n = number of
trials and p = probability of success at each trial, then:
Mean = np
Variance = npq, where q = 1 − p

Example: The probability that any given day will be rainy is 0.4. Find:

a) The mean number of rainy days in a week.
b) The standard deviation of the number of rainy days in a week.

Solution: Let “rainy day” be success. Then p = 0.4 and q = 0.6. n = number
of days in a week.
Then X ~ Bin (n, p) where n = 7 and p = 0.4, so that X ~ Bin (7, 0.4).
a) mean = np
= 7 x 0.4
= 2.8
b) variance = npq
= 7 x 0.4 x 0.6
= 1.68
Standard deviation = √variance
= √1.68
= 1.30.
Example: The random variable X is such that X ~ Bin (n, p), with mean = 2
and variance = $\frac{24}{13}$. Find the values of n and p.
Solution: Mean = np
np = 2 …………(i)

Variance = npq
npq = $\frac{24}{13}$ ………..(ii)
Substituting np = 2 into equation (ii) we have:
2q = $\frac{24}{13}$
26q = 24
q = $\frac{24}{26}$ = $\frac{12}{13}$

Now p = 1 − q
= 1 − $\frac{12}{13}$ = $\frac{1}{13}$
Substituting p into equation (i), we have:
np = 2
n ($\frac{1}{13}$) = 2
n = 26
Therefore, n = 26 and p = $\frac{1}{13}$, and so X ~ Bin (26, $\frac{1}{13}$).
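
The same two steps (divide the variance by the mean to get q, then divide the mean by p to get n) can be written as a small function. This is only a sketch; the function name `binomial_parameters` is an assumption made for illustration.

```python
from fractions import Fraction

def binomial_parameters(mean, variance):
    """Recover n and p for X ~ Bin(n, p) from mean = np and variance = npq."""
    q = Fraction(variance) / Fraction(mean)  # npq / np = q
    p = 1 - q
    n = Fraction(mean) / p                   # np / p = n
    return n, p

n, p = binomial_parameters(2, Fraction(24, 13))
print(n, p)
```

Exact fractions are used so that the result for p is returned as a fraction rather than a rounded decimal.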

THE POISSON DISTRIBUTION

A Poisson distribution describes the number of “events” that occur within some
given interval. The important characteristic of the Poisson distribution is that
the events in question must occur at random. That is, they must be
independent of each other.
They must also be what is described as ‘rare’. That is, at any particular point
in the interval, the probability of an event occurring must be very low.

POISSON PROBABILITY FORMULA

Given a Poisson situation, it is possible to calculate the probability of any number of
events occurring in the defined interval. This can only be done, however, if the
mean number of events per interval is known.

Given a Poisson situation, with m = mean number of events per interval, the
probability of x events occurring is given by:

$$\Pr(x) = e^{-m}\,\frac{m^{x}}{x!}$$ where x can take any one of the values 0, 1, 2, 3, …


Note: The mean, m, is the parameter of the distribution.
If X is distributed this way then we write X ~ Po(m).
The letter ‘e’ represents a special mathematical constant (having an
approximate value of 2.718).

Example: If X follows a Poisson distribution with mean 2, i.e. X ~ Po (2), find the
following probabilities:

a) Pr (0) (b) Pr (1) (c) Pr (2) (d) Pr (X ≤2).

Solutions: Here m = 2, so $\Pr(x) = e^{-m}\,\frac{m^{x}}{x!} = e^{-2}\,\frac{2^{x}}{x!}$.

a) $\Pr(0) = e^{-2}\,\frac{2^{0}}{0!} = e^{-2} = 0.1353$.

b) $\Pr(1) = e^{-2}\,\frac{2^{1}}{1!} = 2e^{-2} = 0.2707$.

c) $\Pr(2) = e^{-2}\,\frac{2^{2}}{2!} = 2e^{-2} = 0.2707$.

d) Pr(X ≤ 2) = Pr(0) + Pr(1) + Pr(2)

= 0.1353 + 0.2707 + 0.2707

= 0.6767.
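
For checking Poisson probabilities numerically, a minimal Python sketch is given below. The function names `poisson_pmf` and `poisson_cdf` are illustrative assumptions, not part of the course notation.

```python
from math import exp, factorial

def poisson_pmf(x, m):
    """Pr(X = x) for X ~ Po(m): e^(-m) * m^x / x!."""
    return exp(-m) * m**x / factorial(x)

def poisson_cdf(k, m):
    """Pr(X <= k), the sum of Pr(0), Pr(1), ..., Pr(k)."""
    return sum(poisson_pmf(x, m) for x in range(k + 1))

# X ~ Po(2), as in the example above.
for x in range(3):
    print(x, round(poisson_pmf(x, 2), 4))
print("Pr(X <= 2) =", round(poisson_cdf(2, 2), 4))
```

Small differences in the last decimal place between a calculator and hand working usually come from rounding the individual terms before adding them.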

MEAN AND VARIANCE OF THE POISSON DISTRIBUTION

If X ~ Po (m) then, mean = m and variance = m.

Example: If X ~ Po (2) find:

(a) The mean of X


(b) The standard deviation of X.

Solution: (a) Mean = m = 2.
(b) Variance = m = 2.
Standard deviation = √variance = √2 = 1.41.


NOTES ON THE POISSON DISTRIBUTION

a) A Poisson interval can be adjusted provided the mean is adjusted
accordingly. For example, if we are given that the average number of
deliveries to a large warehouse is 2.3 per day, then the Poisson
formula with mean = 2.3 can be used to calculate the probabilities of 0, 1, 2,
3, … deliveries on any particular day. However, if we wanted to
calculate the probabilities of 0, 1, 2, 3, … deliveries in any 2-day
period, the Poisson probability formula would need to be used with mean =
2 x 2.3 = 4.6, and so on.
b) The variance of the Poisson distribution is always equal to the mean.
c) Many electronic calculators have a special function key for either $e^{x}$ or
$e^{-x}$ or both. Learn the use of these keys through practice.

Example: The mean number of bacteria per milliliter of a liquid is known to be 4.
Assuming that the number of bacteria follows a Poisson distribution, find the
probability that:
a) in 3ml of liquid there will be fewer than 2 bacteria
b) in ½ ml of liquid there will be more than 2 bacteria.

Solution: a) In 1ml of liquid we expect to find 4 bacteria, so in 3ml of liquid we
expect to find 12 bacteria.
So X ~ Po(12) and $\Pr(x) = e^{-m}\,\frac{m^{x}}{x!}$ with m = 12. We require P(X < 2).

P(X < 2) = Pr(0) + Pr(1)

$$= e^{-12}\,\frac{12^{0}}{0!} + e^{-12}\,\frac{12^{1}}{1!}$$
$$= e^{-12} + 12e^{-12}$$
$$= 13e^{-12}$$
$$= 7.99 \times 10^{-5}$$
= 0.0000799.

b) In 1ml of liquid we expect 4 bacteria, so in ½ ml of liquid we
expect 2 bacteria. So X ~ Po(2).
We require Pr(X > 2) = 1 − Pr(X ≤ 2)
= 1 − [Pr(0) + Pr(1) + Pr(2)]
= 1 − ($e^{-2} + 2e^{-2} + 2e^{-2}$)
= 1 − $5e^{-2}$
= 0.323.

Therefore, the probability that there will be more than 2 bacteria in ½ ml of
liquid is 0.323.
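
The idea of rescaling the mean to match the interval can be sketched in Python as follows. The constant `RATE_PER_ML` and the function names are assumptions introduced for this illustration only.

```python
from math import exp, factorial

RATE_PER_ML = 4  # mean number of bacteria per ml, from the example above

def poisson_pmf(x, m):
    """Pr(X = x) for X ~ Po(m)."""
    return exp(-m) * m**x / factorial(x)

def p_fewer_than(k, volume_ml):
    """Pr(X < k) in the given volume; the mean is rescaled to the volume."""
    m = RATE_PER_ML * volume_ml
    return sum(poisson_pmf(x, m) for x in range(k))

def p_more_than(k, volume_ml):
    """Pr(X > k) in the given volume, found as 1 - Pr(X <= k)."""
    m = RATE_PER_ML * volume_ml
    return 1 - sum(poisson_pmf(x, m) for x in range(k + 1))

print(p_fewer_than(2, 3))    # part (a): 3 ml, fewer than 2 bacteria
print(p_more_than(2, 0.5))   # part (b): 1/2 ml, more than 2 bacteria
```

Only the mean changes when the interval changes; the formula itself stays the same.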

USES OF THE POISSON DISTRIBUTION


There are two main practical uses of the Poisson distribution:
1. When considering the distribution of random events.
2. The Poisson distribution as an approximation to the binomial distribution.

THE NORMAL DISTRIBUTION

The normal distribution is the most important continuous distribution in statistics.
Many measured quantities in the natural sciences follow the normal distribution,
for example heights, masses, ages, random errors, examination results etc.
Under certain circumstances the normal distribution is a useful approximation to
the binomial and Poisson distributions.

CHARACTERISTICS OF THE NORMAL DISTRIBUTION

The main characteristics of the normal distribution are:

a) It has a ‘symmetric’ (frequency) curve about the mean of the distribution.
In other words, one half of the curve is a mirror-image of the other.
b) The majority of the values tend to cluster about the mean, with the
greatest frequency at the mean itself.
c) The frequencies of the values taper away (symmetrically) either side of
the mean, giving the curve a characteristic “bell-shape”.
d) Approximately 95% of the distribution lies within 2 standard deviations of
the mean.
e) Approximately 99.7% (nearly all) of the distribution lies within 3 standard
deviations of the mean.

NORMAL DISTRIBUTION PROBABILITIES

The information necessary to evaluate probabilities for ranges of values for a
normal distribution is the value of

a) the mean, μ, and
b) the standard deviation, σ, of the distribution in question. Both of these
statistics need to be known.

The procedure used in calculating probabilities associated with the normal
distribution requires knowledge of

i) the Z-scores and
ii) the Standard Normal tables.

In order to use the same set of tables for all possible values of μ and σ, we
perform a process known as “standardizing X” to obtain the “standard normal
variable”, which is given the symbol Z.

STANDARDISING THE X – VALUE


The Z-score for any value X of a normal distribution having mean μ and
standard deviation σ is given by:

$$Z = \frac{X - \mu}{\sigma}$$

This is the process used for calculating Z-scores.

STANDARD NORMAL TABLES


Standard normal tables are used to calculate Z – score probabilities. The
standard normal variable Z is the normal variable with mean 0 and variance
1.

We can find the areas under the standard normal curve by referring to
standard normal tables which give cumulative probabilities. The diagram at
the head of the standard normal tables shows how any probability read from
the table can be represented as an area under the ‘standard normal curve’.

The Standard Normal distribution is in fact a Normal distribution having
mean = 0 and standard deviation = 1. The total area under the Standard
Normal Curve is always 1 (representing total probability).

The tables we shall use in our course are those that give the area in the right-hand
tail of the Normal distribution.

APPLICATIONS OF THE NORMAL DISTRIBUTION

Example: A normal distribution has a mean of 300 and standard deviation 5.
Find
a) P (X > 305)
b) P (X < 291)
c) P (X < 312)
d) P (X > 286)
Solutions: a) To find P (X > 305), we standardize X by subtracting the
mean 300 and dividing by the standard deviation 5, so that

$$Z = \frac{X - \mu}{\sigma} = \frac{305 - 300}{5} = 1$$

So, $P(X > 305) = P\left(\frac{X - 300}{5} > \frac{305 - 300}{5}\right)$
= P (Z > 1)
= 0.1587 (from the tables).

b) $P(X < 291) = P\left(\frac{X - 300}{5} < \frac{291 - 300}{5}\right)$
= $P\left(Z < \frac{-9}{5}\right)$
= P (Z < −1.8)
= P (Z > 1.8) (by symmetry)
= 0.0359.

Therefore, P (X < 291) = 0.0359.

c) $P(X < 312) = P\left(\frac{X - 300}{5} < \frac{312 - 300}{5}\right)$
= P (Z < 2.4)
= 1 − P (Z > 2.4)
= 1 − 0.0082
= 0.9918.

d) $P(X > 286) = P\left(\frac{X - 300}{5} > \frac{286 - 300}{5}\right)$
= $P\left(Z > \frac{-14}{5}\right)$
= P (Z > −2.8)
= 1 − P (Z > 2.8) (by symmetry)
= 1 − 0.00256
= 0.99744.
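
Where tables are not to hand, the right-hand tail area can be computed from the complementary error function in Python's standard `math` module. This is a sketch only; the function names are illustrative assumptions, and the course itself works from printed tables.

```python
from math import erfc, sqrt

def upper_tail(z):
    """P(Z > z) for the standard normal variable (the right-hand tail area)."""
    return 0.5 * erfc(z / sqrt(2))

def p_greater(x, mu, sigma):
    """P(X > x) for X ~ N(mu, sigma^2), found by standardising x."""
    return upper_tail((x - mu) / sigma)

def p_less(x, mu, sigma):
    """P(X < x), using the fact that the total area is 1."""
    return 1 - p_greater(x, mu, sigma)

# Mean 300 and standard deviation 5, as in the example above.
print(round(p_greater(305, 300, 5), 4))
print(round(p_less(291, 300, 5), 4))
print(round(p_greater(286, 300, 5), 5))
```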


ASSIGNMENT 3 DUE DATE 30th MARCH 2016

1. If X ~ Bin (6, $\frac{1}{3}$), find (a) P(X = 4), (b) P(X ≤ 2).
2. The probability that a pen drawn at random from a box of pens is defective
is 0.1. If a sample of 6 pens is taken, find the probability that it will contain (a)
no defective pens, (b) 5 or 6 defective pens, (c) less than 3 defective pens.


3. a) State the conditions under which the binomial distribution may be used
for the calculation of probabilities.
b) The probability that a girl chosen at random has a weekend birthday in
2009 is $\frac{2}{7}$. Calculate the probability that, among a group of ten girls
chosen at random,
i) none has a weekend birthday in 2009,
ii) exactly one has a weekend birthday in 2009.
Among 100 groups of ten girls, how many groups would you expect to contain
more than one girl with a weekend birthday in 2009?
4. An insurance company receives on average 2 claims per week from a
certain factory. Assuming that the number of claims follows a Poisson
distribution, find the probability that
a) it receives more than 3 claims in a particular week,
b) it receives more than 2 claims in a particular fortnight,
c) it receives no claims on a given day, given that the factory operates
on a 5-day week.
5. The mean number of flaws per 100m of material produced on a certain
machine at a factory is 2. If flaws occur randomly, find the probability that
a) in 2000m length of material there will be more than 3 flaws,
b) in 50m length of material there will be exactly 2 flaws.
6. If a random variable X follows a normal
distribution with mean μ = 60 and standard deviation σ = 5, find:
a) P(X > 65)
b) P(X < 80)
c) P(X > 50)
7. The masses of oranges sold at a supermarket are normally distributed with
mean 600g and standard deviation 20g.
a) If an orange is chosen at random, find the probability that its mass lies
between 570g and 610g.
b) Find the mass exceeded by 7% of the oranges.
c) In one day, 1000 oranges are sold. Estimate how many weigh less than
545g.

SECTION D
HYPOTHESIS TESTING

TEST 1: TESTING A MEAN BASED ON A SAMPLE VALUE

Example: The random variable X follows a
normal distribution such that X ~ N(μ, 100). A value is taken at random from the
population and found to be 172. Test, at the 5% level, whether the mean μ could
be 150.
Solution: We assume that μ is 150, and this is the null hypothesis (H0). The
alternative hypothesis (H1) is that μ is not 150. We proceed as follows.

State H0 and H1, and decide whether the test is one-tailed or two-tailed:
H0 : μ = 150
H1 : μ ≠ 150 (two-tailed test)

Consider the distribution given by H0:
If H0 is true, X ~ N(150, 100).

Decide on the level of the test and on the rejection criteria:
We will test at the 5% level, and consider the test statistic $Z = \frac{X - \mu}{\sigma}$.
We will reject H0 if | z | > 1.96.

Calculate the value of the test statistic:
$Z = \frac{X - \mu}{\sigma} = \frac{172 - 150}{10} = 2.2$ (Note: σ = √100 = 10.)

Make the conclusion:
Since | z | > 1.96, we reject H0 and conclude that
there is significant evidence, at the 5% level, to
suggest that the population mean is not 150.
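
The steps of Test 1 can be sketched in a few lines of Python. The function names and the default critical value argument are assumptions made for illustration; the critical value 1.96 is the two-tailed 5% value used in the example.

```python
from math import sqrt

def z_statistic(x, mu0, variance):
    """Test statistic Z = (x - mu0) / sigma for a single sample value x."""
    return (x - mu0) / sqrt(variance)

def reject_two_tailed(z, critical=1.96):
    """Two-tailed rejection rule: reject H0 when |z| exceeds the critical value."""
    return abs(z) > critical

# Sample value 172, H0: mu = 150, variance 100 (figures from the example above).
z = z_statistic(172, 150, 100)
print(z, reject_two_tailed(z))
```

For a one-tailed test only the rejection rule changes: the sign of z is kept and compared with the one-tailed critical value.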

Example: The random variable X is such that X ~ N(μ, 30). A sample value of 54
is obtained. Test, at the 1% level, whether the population mean is less than 65.

Solution: Although the question asks that we test whether the mean is less than
65, the null hypothesis must state a definite value. So, for the null hypothesis, we
assume that the mean is 65, and the alternative hypothesis is that the mean is
less than 65.


H0 : μ = 65
H1 : μ < 65 (one-tailed test)

Now, if H0 is true then X ~ N(65, 30) and the test statistic is

$$Z = \frac{X - \mu}{\sigma}, \text{ i.e. } Z = \frac{X - 65}{\sqrt{30}}$$

We perform a one-tailed test at the 1% level, and reject H0 if Z < −2.32, where

$$Z = \frac{X - \mu}{\sigma} = \frac{54 - 65}{\sqrt{30}} = -2.01$$

Conclusion: Since Z > −2.32, we do not reject H0 and we conclude that, at the
1% level, the sample value could have been drawn from a population with mean
65.

The following example is a very useful application of Test 1.

Example: If 100 seeds are planted, and 83 germinate, use the normal
approximation to the binomial distribution to test the manufacturer’s claim of a
90% germination rate. Use a 5% level of significance.

Solution: Let X be the random variable “the number of seeds that germinate”.
Then we have a binomial situation, and X ~ Bin (n,p) with n = 100.

H0: p = 0.9 (the germination rate is 90%)


H1: p < 0.9 (the germination rate is less than 90%)

(We have chosen a one-tailed test, since this seems more appropriate to the
given situation.)
Under H0, X ~ Bin (100, 0.9).
Now, since n is large, we use the normal approximation to the binomial
distribution, so X ~ N (np, npq) with
np = (100)(0.9) = 90 and npq = (100)(0.9)(0.1) = 9,

i.e. X ~ N(90, 9).
We perform a one-tailed test, at the 5% level, and reject H0 if Z < −1.64, where

$$Z = \frac{X - np}{\sqrt{npq}} = \frac{83 - 90}{\sqrt{9}} = \frac{-7}{3} = -2.33$$

Conclusion: Since Z < −1.64, we reject H0 and conclude that there is significant
evidence, at the 5% level, to suggest that the manufacturer’s claim is false.
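
A short Python sketch of the same calculation is given below. It follows the text in using the plain normal approximation X ~ N(np, npq) with no continuity correction; the function name `binomial_z` is an assumption made for illustration.

```python
from math import sqrt

def binomial_z(successes, n, p0):
    """Z = (X - np) / sqrt(npq) under H0: p = p0 (normal approximation)."""
    mean = n * p0
    sd = sqrt(n * p0 * (1 - p0))
    return (successes - mean) / sd

# 83 of 100 seeds germinate; claimed germination rate 90%.
z = binomial_z(83, 100, 0.9)
print(round(z, 2), z < -1.64)  # reject H0 when Z < -1.64 (one-tailed, 5%)
```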

TEST 2: TESTING A MEAN BASED ON A SAMPLE MEAN

Instead of taking one sample value, for a more reliable test of the mean we take
a random sample of n independent observations and then use the sample mean.
We proceed as follows:

Consider the random variable X with known variance σ² but unknown mean.

Make the null hypothesis (H0) that the population mean is μ.

Take a sample of size n and consider the distribution of the sample mean X̄.

If H0 is true, then:

(i) if X is normally distributed, $\bar{X} \sim N\left(\mu, \frac{\sigma^{2}}{n}\right)$;
(ii) if X is not normally distributed but n is large, then by the Central Limit
Theorem, $\bar{X} \sim N\left(\mu, \frac{\sigma^{2}}{n}\right)$ approximately.

Reminders: the distribution of X̄ is known as the sampling distribution of
means; the standard deviation of this distribution, $\frac{\sigma}{\sqrt{n}}$, is known as the
standard error of the mean.

Now we want to investigate whether there is a significant difference between the
sample mean and the population mean given by the null hypothesis. To do this
we need to consider the standard normal variable Z.

Standardising, we use the test statistic

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

which is distributed as N(0, 1) under the
null hypothesis H0 that the true population mean is μ.

Example: The lengths of metal bars produced by a particular machine are
normally distributed with mean length 420cm and standard deviation 12cm. The
machine is serviced, after which a sample of 100 bars gives a mean length of
423cm. Is there evidence, at the 5% level, of a change in the mean length of the
bars produced by the machine, assuming that the standard deviation remains the
same?
Solution: Let X be the random variable ‘the length of a metal bar in cm’. Let the
population mean be μ and the population variance be σ². We know that σ = 12,
so X ~ N(μ, 12²).

We are trying to establish whether there has been a change in the mean length
of the bars. However the null hypothesis must assume that the mean is still
420cm.

The alternative hypothesis is that the mean is not 420cm,

i.e. H0 : μ = 420cm (there is no change)
H1 : μ ≠ 420cm (there is a change)

Now consider the sampling distribution of means:

$$\bar{X} \sim N\left(\mu, \frac{\sigma^{2}}{n}\right) \text{ with } \sigma^{2} = 12^{2},\; n = 100$$

If H0 is true, μ = 420, so $\bar{X} \sim N\left(420, \frac{12^{2}}{100}\right)$.

The test statistic is $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$, i.e. $Z = \frac{\bar{X} - 420}{12/\sqrt{100}}$.

We perform a two-tailed test, at the 5% level, and reject H0 if | z | > 1.96.

We calculate Z, where

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{423 - 420}{1.2} = 2.5$$

Conclusion: Since | z | > 1.96, we reject H0 and conclude that there is significant
evidence, at the 5% level, of a change in the mean length of the bars produced
by the machine.
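
The test statistic for Test 2 differs from Test 1 only in dividing by the standard error σ/√n instead of σ. A minimal Python sketch follows; the function name is an illustrative assumption.

```python
from math import sqrt

def z_for_sample_mean(xbar, mu0, sigma, n):
    """Z = (xbar - mu0) / (sigma / sqrt(n)), with sigma known."""
    standard_error = sigma / sqrt(n)
    return (xbar - mu0) / standard_error

# Metal bars: sample mean 423 from n = 100, H0: mu = 420, sigma = 12.
z = z_for_sample_mean(423, 420, 12, 100)
print(z, abs(z) > 1.96)  # two-tailed test at the 5% level
```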

Example: Experience has shown that the scores obtained in a particular test are
normally distributed with mean score 70 and variance 36. When the test is taken
by a random sample of 49 students, the mean score is 68.5. Is there sufficient
evidence, at the 3% level, that these students have not performed as well as
expected?

Solution: Let X be the r.v. ‘the score of a student’. Let the population mean be μ
and the population variance be σ², where σ² = 36, so X ~ N(μ, 36).

We assume that μ is 70 and that the students have performed as expected. This
is the null hypothesis.

The alternative hypothesis is that the mean is less than 70. We write

H0 : μ = 70 (the population mean is 70)
H1 : μ < 70 (the population mean is less than 70 and the students
have not performed as expected).

Consider the sampling distribution of means:

$$\bar{X} \sim N\left(\mu, \frac{\sigma^{2}}{n}\right) \text{ with } \sigma^{2} = 36 \text{ and } n = 49$$

If H0 is true, μ = 70, so $\bar{X} \sim N\left(70, \frac{36}{49}\right)$.

The test statistic is $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$, i.e.

$$Z = \frac{\bar{X} - 70}{6/\sqrt{49}} = \frac{\bar{X} - 70}{6/7}$$

We perform a one-tailed test, at the 3% level. The critical value a is
such that P(Z < a) = 0.03.

From the tables, a = −1.88, and we reject H0 if Z < −1.88, where

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{68.5 - 70}{6/7} = -1.75$$

Conclusion: Since Z > −1.88, we do not reject H0 and we conclude that there is
not sufficient evidence, at the 3% level, that the students have under-achieved.

TEST 3: TESTING THE DIFFERENCE BETWEEN MEANS


Consider two unpaired, independent samples of sizes n1 and n2 such that

$$X_1 \sim N(\mu_1, \sigma_1^{2}) \text{ and } X_2 \sim N(\mu_2, \sigma_2^{2})$$

Then

$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2,\; \frac{\sigma_1^{2}}{n_1} + \frac{\sigma_2^{2}}{n_2}\right)$$

This distribution is known as the sampling distribution of the difference
between means.
The following may be used to test whether there is a significant difference
between means.
We will consider the case when $\sigma_1^{2}$ and $\sigma_2^{2}$ are known.

We use the test statistic

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^{2}}{n_1} + \dfrac{\sigma_2^{2}}{n_2}}}$$

which is distributed as N(0, 1).

Example: A random sample of size 100 is taken from a normal population with
variance $\sigma_1^{2}$ = 40. The sample mean $\bar{x}_1$ is 38.3. Another random sample, of
size 80, is taken from a normal population with variance $\sigma_2^{2}$ = 30. The sample
mean $\bar{x}_2$ is 40.1. Test, at the 5% level, whether there is a significant difference
in the population means μ1 and μ2.
Solution: Sample 1: n1 = 100, $\bar{x}_1$ = 38.3, $\sigma_1^{2}$ = 40,
population mean = μ1
Sample 2: n2 = 80, $\bar{x}_2$ = 40.1, $\sigma_2^{2}$ = 30,
population mean = μ2

H0: μ1 = μ2 (there is no difference between the means)

H1: μ1 ≠ μ2 (there is a difference)

We consider the sampling distribution of the difference between means:

$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2,\; \frac{\sigma_1^{2}}{n_1} + \frac{\sigma_2^{2}}{n_2}\right)$$

Under H0, μ1 − μ2 = 0, so

$$\bar{X}_1 - \bar{X}_2 \sim N\left(0,\; \frac{40}{100} + \frac{30}{80}\right), \text{ i.e. } \bar{X}_1 - \bar{X}_2 \sim N(0, 0.775)$$

The test statistic is

$$Z = \frac{\bar{X}_1 - \bar{X}_2 - 0}{\sqrt{\dfrac{\sigma_1^{2}}{n_1} + \dfrac{\sigma_2^{2}}{n_2}}} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{0.775}} = \frac{\bar{X}_1 - \bar{X}_2}{0.880\ldots}$$

We use a two-tailed test, at the 5% level, and reject H0 if | z | > 1.96, where

$$Z = \frac{38.3 - 40.1}{0.880\ldots} = -2.04$$

Conclusion: Since | z | > 1.96, we reject H0 and conclude that there is evidence,
at the 5% level, of a difference in population means.
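
The test statistic for the difference between two means, with both variances known, can be sketched as follows. The function name `z_two_means` and its argument order are assumptions made for illustration.

```python
from math import sqrt

def z_two_means(xbar1, xbar2, var1, var2, n1, n2):
    """Z for H0: mu1 = mu2, with known population variances var1 and var2."""
    standard_error = sqrt(var1 / n1 + var2 / n2)
    return (xbar1 - xbar2) / standard_error

# Sample 1: n = 100, mean 38.3, variance 40. Sample 2: n = 80, mean 40.1, variance 30.
z = z_two_means(38.3, 40.1, 40, 30, 100, 80)
print(round(z, 2), abs(z) > 1.96)  # two-tailed test at the 5% level
```

When the two populations share a common standard deviation σ, the same function can be used by passing σ² for both variances.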

Example: The same test was given to a group of 100 scouts and to a group of
144 girl guides. The mean score for the scouts was 27.53 and the mean score
for the girl guides was 26.81. Assuming a common population standard deviation
of 3.48, test, using a 5% level of significance, whether the scouts’ performance in
the test was better than that of the girl guides. Assume that the scores are
normally distributed.

Solution: Let X1 be the random variable ‘a scout’s score’.

Scouts: $\bar{x}_1$ = 27.53, n1 = 100, population mean = μ1

Let X2 be the random variable ‘a guide’s score’.

Guides: $\bar{x}_2$ = 26.81, n2 = 144, population mean = μ2

Common population standard deviation σ = 3.48. Thus $\sigma_1^{2} = \sigma_2^{2} = 3.48^{2}$.

H0: μ1 = μ2 (there is no difference in the performances)

H1: μ1 > μ2 (the performance of the scouts was better)

Consider the sampling distribution of the difference between means:

$$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2,\; \frac{\sigma_1^{2}}{n_1} + \frac{\sigma_2^{2}}{n_2}\right)$$

Now

$$\frac{\sigma_1^{2}}{n_1} + \frac{\sigma_2^{2}}{n_2} = \frac{3.48^{2}}{100} + \frac{3.48^{2}}{144} = 0.205204$$

Under H0, μ1 − μ2 = 0, so $\bar{X}_1 - \bar{X}_2 \sim N(0, 0.205204)$.

The test statistic is

$$Z = \frac{\bar{X}_1 - \bar{X}_2 - 0}{\sqrt{\dfrac{\sigma_1^{2}}{n_1} + \dfrac{\sigma_2^{2}}{n_2}}} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{0.205204}} = \frac{\bar{X}_1 - \bar{X}_2}{0.453}$$

We use a one-tailed test, at the 5% level, and reject H0 if Z > 1.64, where

$$Z = \frac{27.53 - 26.81 - 0}{0.453} = 1.589$$

Conclusion: Since Z < 1.64, we do not reject H0 and we conclude that there
is not sufficient evidence, at the 5% level, to show that the performance of the
scouts in the test was better than that of the guides.

TEST 4: TESTING A PROPORTION, SAMPLE SIZE LARGE

We may wish to test whether a random sample of size n, where n is
large, with proportion of ‘success’ ps, could have been drawn from a population
with proportion of ‘success’ p.

The sampling distribution of proportions gives

$$P_s \sim N\left(p, \frac{pq}{n}\right)$$ where q = 1 − p and n is large.

The test statistic used is

$$Z = \frac{P_s - p}{\sqrt{pq/n}}$$

which is distributed as N(0, 1) under the
null hypothesis H0 that the proportion of ‘success’ in the population is p.

Example: The manufacturer of Chummy Morsels claims that 8 out of 10 dogs
choose his product rather than that produced by a rival firm. In a random sample
of 200 dogs, 152 chose Chummy Morsels, and the rest chose the rival brand.
Comment on the manufacturer’s claim.

Solution: From the sample, ps = $\frac{152}{200}$ = 0.76, n = 200.

Let p be the population proportion of dogs who prefer Chummy
Morsels and let q = 1 − p.

H0: p = 0.8 (8 out of 10 dogs prefer Chummy Morsels and the
manufacturer’s claim is correct).
H1: p < 0.8 (fewer than 8 out of 10 dogs prefer Chummy Morsels
and the manufacturer’s claim is not correct).

Consider the sampling distribution of proportions:

$$P_s \sim N\left(p, \frac{pq}{n}\right) \text{ with } n = 200$$

Under H0, p = 0.8, q = 0.2, so $P_s \sim N\left(0.8, \frac{(0.8)(0.2)}{200}\right)$,
i.e. Ps ~ N(0.8, 0.0008).

The test statistic is

$$Z = \frac{P_s - p}{\sqrt{pq/n}} = \frac{P_s - 0.8}{\sqrt{0.0008}} = \frac{P_s - 0.8}{0.028\ldots}$$

We use a one-tailed test at the 5% level. We will reject H0 if Z < −1.64, where

$$Z = \frac{P_s - p}{\sqrt{pq/n}} = \frac{0.76 - 0.80}{0.028\ldots} = -1.414$$

Conclusion:
Since Z > −1.64, we do not reject H0 and we conclude that there is not sufficient
evidence, at the 5% level, to refute the manufacturer’s claim.
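
The proportion test can be sketched in Python as below. The function name `z_proportion` is an assumption made for illustration; the standard error is computed from the hypothesised proportion p0, as in the text.

```python
from math import sqrt

def z_proportion(successes, n, p0):
    """Z = (ps - p0) / sqrt(p0 * q0 / n), where ps is the sample proportion."""
    ps = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)
    return (ps - p0) / standard_error

# 152 of 200 dogs chose the product; claimed proportion 0.8.
z = z_proportion(152, 200, 0.8)
print(round(z, 3), z < -1.64)  # one-tailed test at the 5% level
```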

Example: A large college claims that it admits equal numbers of men and
women. In a random sample of 500 students at the college there were 267
males. Is there evidence, at the 5% level, that the college population is not
evenly divided into males and females?

Solution: From the sample, ps = $\frac{267}{500}$ = 0.534, n = 500.

Let p be the proportion of males in the population and let q = 1 − p.

H0: p = 0.5 (there are equal numbers of males and females)

H1: p ≠ 0.5 (the college population is not evenly divided into males
and females).

Consider the sampling distribution of proportions:

$$P_s \sim N\left(p, \frac{pq}{n}\right) \text{ with } n = 500$$

Under H0, p = 0.5, q = 0.5, so $\frac{pq}{n} = \frac{(0.5)(0.5)}{500} = 0.0005$.

So Ps ~ N(0.5, 0.0005).

The test statistic is

$$Z = \frac{P_s - p}{\sqrt{pq/n}} = \frac{P_s - 0.5}{\sqrt{0.0005}} = \frac{P_s - 0.5}{0.022\ldots}$$

We use a two-tailed test at the 5% level, and reject H0 if | z | > 1.96, where

$$Z = \frac{P_s - p}{\sqrt{pq/n}} = \frac{0.534 - 0.5}{0.022\ldots} = 1.52$$

Conclusion: Since | z | < 1.96, we do not reject H0 and we conclude that, at the
5% level, there is not sufficient evidence to refute the claim that the college
population is evenly divided into males and females.

SECTION E

CORRELATION AND REGRESSION


Sometimes we wish to investigate the results of a statistical enquiry or
experiment by comparing two sets of data, x and y, for example:

x                                      y
The weight at the end of a spring      The length of the spring
Pupil’s mark in French                 Pupil’s mark in German
The diameter of the stem of a plant    The average length of leaf of a plant
The age of a plant                     The quantity of fruit produced by a plant

Regression is concerned with obtaining a mathematical equation which
describes the relationship between two variables. The equation can be used for
comparison or estimation purposes.

LEAST SQUARES REGRESSION LINES

These are obtained by calculation according to a rule known as ‘least squares’.
The least squares regression line of y on x has the form
y = a + bx
The values of a and b can be obtained using the following formulae:

Least squares regression formulae

If the least squares regression line of y on x is given by y = a + bx, then

$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^{2} - \left(\sum x\right)^{2}}; \qquad a = \frac{\sum y}{n} - b\,\frac{\sum x}{n}$$

Note: a is also given as $a = \bar{y} - b\bar{x}$.

Example: The following table gives the test results for 10 children.

Child                A   B   C   D   E   F   G   H   I   J
Arithmetic mark, x   1   8  15  18  23  28  33  39  45  45
English mark, y      3  14   8  20  19  17  36  26  14  29

Find the least squares regression line of English mark (y) on Arithmetic mark (x)
for these children and use it to predict the English mark of a child whose
Arithmetic mark is 60.

Solution: The standard layout for the calculation of the regression line is as follows:

x      y      xy      x²
1      3      3       1
8      14     112     64
15     8      120     225
18     20     360     324
23     19     437     529
28     17     476     784
33     36     1188    1089
39     26     1014    1521
45     14     630     2025
45     29     1305    2025
Σx = 255   Σy = 186   Σxy = 5645   Σx² = 8587

Notice that the regression line of y on x is being asked for, and if we express it in
the form y = a + bx, the values of a and b can be found using the formulae given
above.
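
The substitution into the formulae can be sketched in Python as follows. This is an illustration only: `least_squares_line` is a hypothetical helper name, and the data are the ten pairs of marks from the table. Note that an Arithmetic mark of 60 lies outside the observed range (1 to 45), so the prediction is an extrapolation and should be treated with caution.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the y-on-x regression line y = a + b*x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b


arithmetic = [1, 8, 15, 18, 23, 28, 33, 39, 45, 45]
english = [3, 14, 8, 20, 19, 17, 36, 26, 14, 29]

a, b = least_squares_line(arithmetic, english)
predicted = a + b * 60  # estimated English mark for an Arithmetic mark of 60
print(round(a, 2), round(b, 3), round(predicted, 1))  # 7.57 0.433 33.5
```

On these figures the line is approximately y = 7.57 + 0.433x, giving an estimated English mark of about 34.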

Example: The data given in the table below, relates the weekly maintenance
cost (₤) to the age (in months) of ten machines of similar type in a manufacturing
company. Find the least squares regression line of maintenance cost on age and
use it to predict the maintenance cost for a machine of this type which is 40
months old.

Machine     1    2    3    4    5    6    7    8    9   10
Age (x)     5   10   15   20   30   30   30   50   50   60
Cost (y)  190  240  250  300  310  335  300  300  350  395

Solution: The calculations are laid out as follows:

x y xy x2
5 190 950 25
10 240 2400 100
15 250 3750 225
20 300 6000 400
30 310 9300 900
30 335 10050 900
30 300 9000 900
50 300 15000 2500
50 350 17500 2500
60 395 23700 3600
Σ x = 300 Σ y = 2970 Σ xy = 97650 Σ x 2 = 12050

Here: n = 10; Σ x = 300; Σ y = 2970; Σ xy = 97650; Σ x 2 = 12050

Thus, $b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$

$= \frac{10 \times 97650 - 300 \times 2970}{10 \times 12050 - 300^2}$

$= \frac{85500}{30500}$

= 2.803.

Also: $a = \frac{\sum y}{n} - b\,\frac{\sum x}{n} = \frac{2970}{10} - 2.803 \times \frac{300}{10}$

= 297 – 2.803 x 30

= 212.90.

Therefore, the least squares regression line of maintenance cost on age is:

y = a + bx = 212.90 + 2.803x

This line can now be used to estimate the maintenance cost associated with an
age of 40 months. Substituting x = 40 in the above regression line gives:

y= 212.90 + 2.803(40)

= ₤325.
The estimated maintenance cost of a machine with an age of 40 months is
₤325.
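
As a cross-check on this result, the same line can be obtained from deviations about the means, which is algebraically equivalent to the summation formula for b and uses a = ȳ − b x̄ for the intercept. The sketch below is illustrative only; the variable names are assumptions, and the data are those of the ten machines.

```python
from statistics import mean

ages = [5, 10, 15, 20, 30, 30, 30, 50, 50, 60]
costs = [190, 240, 250, 300, 310, 335, 300, 300, 350, 395]

x_bar, y_bar = mean(ages), mean(costs)

# Slope: summed cross-deviations divided by summed squared deviations of x.
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, costs)) / sum(
    (x - x_bar) ** 2 for x in ages
)
a = y_bar - b * x_bar  # intercept from a = y-bar - b * x-bar

estimate_40 = a + b * 40  # estimated weekly cost at 40 months
print(round(b, 3), round(a, 2), round(estimate_40))  # 2.803 212.9 325
```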

CORRELATION TECHNIQUES

Purpose of correlation:

The purpose of regression analysis is to identify a relationship for a given set of
bivariate data. What it does not do, however, is give any indication of how good
this relationship might be.
This is where correlation comes in. It provides a measure of how well a least squares
regression line ‘fits’ the given set of data. The better the correlation, the closer
the data points are to the regression line and hence the more confidence one
would have in using the regression line for estimation.
Correlation is concerned with describing the strength of the relationship
between two variables by measuring the degree of ‘scatter’ of the data values.
The less scattered (or variable) the data values are, the stronger the correlation
is said to be.

The correlation coefficient


We need a way of measuring the strength of the correlation between two
variables. This is achieved through a correlation coefficient, normally represented
by the symbol r. It is a number which lies between -1 and +1 (inclusive). That is
-1 ≤ r ≤ +1. A value of r = 0 signifies that there is no correlation present, while the
further away from 0 (towards -1 or +1) r is, the stronger the correlation.

The product-moment correlation coefficient (or Pearson’s coefficient of correlation)

The standard measure of correlation is called the product-moment correlation
coefficient, r. It is a numerical value which indicates the degree of scatter. The
product-moment correlation coefficient is a very useful measure because it is
independent of the units of scale of the variables. It is calculated using the
following formula:

Product-moment correlation coefficient (or Pearson’s coefficient of correlation) formula:

$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left(n\sum x^2 - (\sum x)^2\right)\left(n\sum y^2 - (\sum y)^2\right)}}$

Example: The following table shows the marks of 10 candidates in Physics and
Mathematics. Calculate Pearson’s coefficient of correlation and comment on your
value.

Mark in Physics (x)        18  20  30  40  46  54  60  80  88   92
Mark in Mathematics (y)    42  54  60  54  62  68  80  66  80  100

The normal layout of calculations is as follows

x        y        x²       y²       xy
18       42       324      1764     756
20       54       400      2916     1080
30       60       900      3600     1800
40       54       1600     2916     2160
46       62       2116     3844     2852
54       68       2916     4624     3672
60       80       3600     6400     4800
80       66       6400     4356     5280
88       80       7744     6400     7040
92       100      8464     10000    9200
∑x = 528   ∑y = 666   ∑x² = 34464   ∑y² = 46820   ∑xy = 38640

There are 10 pairs of values, therefore, n = 10.

From the table, ∑x = 528, ∑y = 666, ∑x² = 34464, ∑y² = 46820 and ∑xy = 38640.

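Thus, $r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left(n\sum x^2 - (\sum x)^2\right)\left(n\sum y^2 - (\sum y)^2\right)}}$

$= \frac{10 \times 38640 - 528 \times 666}{\sqrt{(10 \times 34464 - 528^2)(10 \times 46820 - 666^2)}}$

$= \frac{386400 - 351648}{\sqrt{(344640 - 278784)(468200 - 443556)}}$

$= \frac{34752}{\sqrt{65856 \times 24644}}$

$= \frac{34752}{40285.92}$

= 0.8626.

Comment: since r is close to +1, there is a strong positive correlation between
the marks in Physics and the marks in Mathematics.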
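
The same calculation can be sketched in Python, building the five sums directly from the data. The helper name `pearson_r` is an illustrative assumption; the marks are those of the ten candidates above.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation coefficient from the summation formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


physics = [18, 20, 30, 40, 46, 54, 60, 80, 88, 92]
maths = [42, 54, 60, 54, 62, 68, 80, 66, 80, 100]

r = pearson_r(physics, maths)
print(round(r, 4))  # 0.8626
```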

Positive correlation

Correlation can exist in such a way that increases in the value of one variable
tend to be associated with increases in the value of the other. This is known as
positive (or direct) correlation. In this case, the correlation coefficient, r, will take
a value between 0 and +1, with r = +1 signifying ‘perfect’ positive correlation.

Negative correlation


Correlation also exists when increases in the value of one variable tend to be
associated with decreases in the value of the other (and vice versa). In this type
of case the correlation is said to be negative (or inverse). In this case, the
correlation coefficient ,r, will take a value between 0 and -1, with r = -1 signifying
‘perfect’ negative correlation.

Spearman’s ‘rank’ correlation coefficient

Instead of using the values of the variables x and y, we can rank them in order of
size, using the numbers 1, 2, 3, ….n. A correlation coefficient can then be
determined on the basis of these ranks. The measure of rank correlation most
commonly used is known as Spearman’s rank correlation coefficient.

Spearman’s correlation coefficient is given by the following formula:


$r = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$ where n is the number of bivariate pairs.

The procedure for obtaining this rank correlation coefficient is as follows:


STEP 1: Rank the x values (to give rx values)

STEP 2: Rank the y values (to give ry values)

Note that the rankings of the x - values are performed quite independently of the
rankings of the y- values and ranking is normally performed in ascending order
(although this is not essential)

STEP 3: For each pair of ranks, calculate d² = (rx − ry)².

STEP 4: Calculate ∑d².

STEP 5: The value of the rank correlation coefficient can then be found using the
formula.

Method of ranking

Suppose we have the masses, x (in kg), of five men:
66, 68, 65, 69, 70.
Arranged in order of magnitude, these are 65, 66, 68, 69, 70. So we
assign the ranks as follows:

x         66   68   65   69   70
Rank x     2    3    1    4    5

If we have two or more equal values we proceed as follows:

x         66   68    65   68    70
Rank x     2   3.5    1   3.5    5

Here, the 3rd and the 4th places represent the same mass (68kg), so
we assign the average rank 3.5 to both these places.

Similarly, for the eight values:

x         66   65   66   67   66   64   68    68
Rank x     4    2    4    6    4    1   7.5   7.5

Here, the 3rd, 4th and 5th places represent the same mass (66kg) so we assign the
same average rank 4 to these places; also the 7th and the 8th places represent
the same mass (68kg) so we assign the average rank 7.5 to both these places.
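
The ranking rule, including the treatment of equal values, can be sketched in Python. The function name `average_ranks` is hypothetical; it ranks in ascending order and gives tied values the average of the places they occupy, as described above.

```python
def average_ranks(values):
    """Rank values in ascending order; tied values share the average of their places."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first_place = ordered.index(v) + 1               # first place occupied by v
        last_place = first_place + ordered.count(v) - 1  # last place occupied by v
        ranks.append((first_place + last_place) / 2)
    return ranks


print(average_ranks([66, 68, 65, 68, 70]))
# [2.0, 3.5, 1.0, 3.5, 5.0]
print(average_ranks([66, 65, 66, 67, 66, 64, 68, 68]))
# [4.0, 2.0, 4.0, 6.0, 4.0, 1.0, 7.5, 7.5]
```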

Example: These are the marks obtained by 8 pupils in Mathematics and
Physics. Calculate Spearman’s coefficient of rank correlation.

Mathematics (x)   67  42  85  51  39  97  81  70
Physics (y)       70  59  71  38  55  62  80  76

Solution:
x y Rank x Rank y d d2
67 70 4 5 -1 1
42 59 2 3 -1 1
85 71 7 6 1 1
51 38 3 1 2 4
39 55 1 2 -1 1
97 62 8 4 4 16
81 80 6 8 -2 4
70 76 5 7 -2 4
∑ d 2=32


From the table: n = 8, ∑d² = 32.

Therefore, $r = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$

$= 1 - \frac{6(32)}{8(63)}$

= 0.6190.

Spearman’s coefficient of rank correlation is 0.6190.

Example: The marks of 10 pupils in French and German tests are as follows:

French (x) 12 8 16 11 7 10 13 17 12 9
German (y) 6 5 7 7 4 9 8 13 10 11

Calculate Spearman’s coefficient of rank correlation.

French (x)   German (y)   Rank x   Rank y   d      d²
12           6            6.5      3        3.5    12.25
8            5            2        2        0      0
16           7            9        4.5      4.5    20.25
11           7            5        4.5      0.5    0.25
7            4            1        1        0      0
10           9            4        7        -3     9
13           8            8        6        2      4
17           13           10       10       0      0
12           10           6.5      8        -1.5   2.25
9            11           3        9        -6     36
∑d² = 84
From the table: n = 10, ∑d² = 84.

Therefore, $r = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$

$= 1 - \frac{6(84)}{10(99)}$

= 0.49.
Spearman’s coefficient of rank correlation is 0.49, indicating some positive correlation
between the marks in the two tests.
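
Steps 1 to 5 can be put together in a short Python sketch that reproduces both worked examples. The helper names `ranks` and `spearman` are illustrative assumptions, and the sketch applies the manual's formula as it stands, with average ranks for tied values.

```python
def ranks(values):
    """Ascending ranks; tied values receive the average of the places they occupy."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def spearman(xs, ys):
    """Spearman's rank correlation: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(xs)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


maths = [67, 42, 85, 51, 39, 97, 81, 70]
physics = [70, 59, 71, 38, 55, 62, 80, 76]
print(round(spearman(maths, physics), 3))  # 0.619

french = [12, 8, 16, 11, 7, 10, 13, 17, 12, 9]
german = [6, 5, 7, 7, 4, 9, 8, 13, 10, 11]
print(round(spearman(french, german), 2))  # 0.49
```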


SECTION F

TIME SERIES ANALYSIS

A time series is the name given to the values of some statistical variables
measured over a uniform set of time points. Any business, large or small, will
need to keep records of such things as sales, purchases, value of stock held and
VAT and these things could be recorded daily, weekly, monthly, quarterly or
yearly. These are examples of time series.
Time series occur naturally in all spheres of business activity as demonstrated in
the following example:
Example: (Situations in which time series occur naturally)
a) Annual turnover (in ₤m) of a firm for ten successive years.
b) Numbers employed (in thousands) for each quarter of four successive
years.
c) Total monthly sales (in ₤0000) for a small business for three successive
years.
d) Daily takings for a supermarket over a two month period.
e) Number of registered journeys for Managers in a large firm ( see table
below)

Qtr 1 Qtr 2 Qtr 3 Qtr 4


Year 1 73 90 121 98
Year 2 69 92 145 107
Year 3 86 111 157 122
Year 4 88 109 159 131

Time series cycle

Normally, time series data exhibits a general pattern which broadly repeats,
called a cycle. Monthly sales for a business will exhibit some natural
12-monthly cycle; daily takings for a supermarket will display a definite 6-daily
cycle. The cycle for the Managers’ journeys in example (e) above can be
seen to be 4-quarterly.

Time series models

Business records, and in particular certain time series of sales and
purchases, need to be kept by law. Of course they are also used to help
control current and future business activities. To use time series effectively for
such purposes, the data have to be organized and analysed. In order to
explain the movement of time series data, models can be constructed which
describe how various components combine to form individual data values.


Time series analysis

It is the evaluation and extraction of the components of a model that
‘break down’ a particular series into understandable and explainable portions
and enable:
a) trends to be identified,
b) extraneous factors to be eliminated and
c) forecasts to be made.
The understanding, description and use of these processes is known as time
series analysis.

Standard time series models

Depending on the nature, complexity and extent of the analysis required,
there are various types of models that can be used to describe time series
data. However, for the purpose of our course, two main models will be
referred to. They are known as the simple additive and multiplicative models.
The components that go to make up each value of the time series are
described in the following definitions:

The time series additive model

y = t + s + r
where: y is a given time series value
       t is the trend component
       s is the seasonal component
       r is the residual component

The time series multiplicative model

y = t × S × R
where: y is a given time series value
       t is the trend component
       S is the seasonal component
       R is the residual component

Put in another way, given a set of time series data, every single given (y) value
can be expressed as the sum or product of three components. It is the evaluation
and interpretation of these components that is the main aim of the overall
analysis.

Note that although the trend component will be constant no matter which of the
models are used, the values of the seasonal and the residual will depend on
which model is being used. In other words, given a set of data to which both
models are being applied, both trend values will be identical whereas the
respective seasonal and residual components would be quite different.


Description of time series components

a) Trend: Trend is the underlying, long-term tendency of the data. There are
various techniques for extracting a trend from a given time series.

b) Seasonal variation: These are short-term cyclic fluctuations in the data
about the trend which take their name from the standard business quarters
of the year. Note however that the word ‘season’ in this context can have
many different meanings. For example:

i. Daily ‘seasons’ over a weekly cycle for sales in a supermarket.
ii. Monthly ‘seasons’ over a yearly cycle for purchases of a company.
iii. Quarterly ‘seasons’ over a yearly cycle for sales of electricity in
the domestic sector.

Techniques for obtaining and analyzing seasonal variation are described
later.

c) Residual variation: This includes other factors not explained by a) and b)
above. This variation normally consists of two components:
i. Random factors. These are disturbances due to ‘everyday’
unpredictable influences, such as weather conditions, illness,
transport breakdown, and so on. The evaluation of this component
is not usually required in examinations, but its interpretation
should be known.
ii. Long-term cyclic factor. This can be thought of (if it exists) as due
to underlying economic causes outside the scope of the
immediate environment. Examples are standard trade cycles or
minor recessions. This particular type of variation is not discussed
further in this course since it requires techniques that are not outlined
in our syllabus.

The significance of trend values

It will be recalled that the objective of finding the time series trend is to enable
the underlying tendency of the data to be highlighted. Thus, a business sales
trend will normally show whether sales are moving up or down (or remaining static)
in the long term.
The trend can also be thought of as the core component of the additive time
series model about which the other two components, seasonal(s) and residual (r)
variations, fluctuate. This component is found by identifying separate trend (t)
values, each corresponding to a time point. In other words, at each time point of
the series, a value of t can be obtained which forms one of the components that


go to make up the observed value of y. There are different ways of obtaining
trend values for a given time series.

Techniques for extracting the trend

There are three techniques that can be used to extract a trend from a set of time
series.
a) Semi-averages: This is the simplest technique, involving the calculation of
two (x,y) averages which, when plotted on a chart as two separate points
and joined up, form a straight line.
b) Least squares regression: This method, similarly results in a straight line.
c) Moving averages: This is the most commonly used method for identifying
a trend and involves the calculation of a set of averages. The trend, when
obtained and charted, consists of straight line segments.

The method of moving averages

This method of obtaining a time series trend involves calculating a set of
averages, each one corresponding to a trend (t) value for a time point of the
series. These are known as moving averages, since each average is calculated
by moving from one overlapping set of values to the next. The number of values
in each set is always the same and is known as the period of the moving
average.
To demonstrate the technique, a set of moving averages of period 5 has been
calculated below for a set of values.

Original values:     12    10    11    11     9    11    10    10    11    10
Moving totals:                   53    52    52    51    51    52
Moving averages:               10.6  10.4  10.4  10.2  10.2  10.4
The first total, 53, is formed from adding the first 5 items,
i.e. 53 = 12 + 10 + 11 + 11 + 9.

Similarly, the second total 52 = 10 + 11 + 11 + 9 + 11, and so on. The averages
are then obtained by dividing each total by 5.

Notice that the totals and the averages are written down in line with the middle
value of the set being worked on. These averages are the trend (t) values
required.
It should also be noticed that there are no trend values corresponding to the first
two and last two original values. Losing values at both ends of the series is always
the case with moving averages and is a disadvantage of this particular method of
obtaining a trend.
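
The moving average calculation for an odd period can be sketched in Python as follows. The function name `moving_averages` is an illustrative assumption; the data are the ten original values used above.

```python
def moving_averages(values, period):
    """Average each overlapping run of `period` consecutive values."""
    return [
        sum(values[i:i + period]) / period
        for i in range(len(values) - period + 1)
    ]


data = [12, 10, 11, 11, 9, 11, 10, 10, 11, 10]
print(moving_averages(data, 5))
# [10.6, 10.4, 10.4, 10.2, 10.2, 10.4]
```

Ten values give only six averages of period 5, which is the loss of trend values at the two ends of the series noted in the text.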


Moving average centering

When calculating moving averages with an even period (i.e. 4, 6 or 8), the
resulting moving average should be placed in between two corresponding time
points. As an example, the following data has a 4-period moving average
calculated and shows its placing.

Time point            1     2     3     4     5     6     7     8     9    10
Data value            9    14    17    12    10    14    19    15    10    16
Totals (of 4)                  52    53    53    55    58    58    60
Averages (of 4)             13.00 13.25 13.25 13.75 14.50 14.50 15.00
The placing of these averages as described above would not be satisfactory
when the averages are being used to represent a trend, since the trend values
need to coincide with particular time points. A method known as centering is
used in this type of situation, where the calculated averages are themselves
averaged in successive overlapping pairs. This ensures that each calculated
(trend) value ‘lines up’ with a time point.

This technique is now shown for the previous data:

Time point            2       3       4       5       6       7       8       9
Average (of 4)          13.00   13.25   13.25   13.75   14.50   14.50   15.00
Average (of 2)           13.125  13.250  13.500  14.125  14.500  14.750

Example:

The following table shows the number of passengers (in millions) traveling by
plane in each quarter of three consecutive years:
                        Year 1                Year 2                Year 3
Quarter                 1    2    3    4      1    2    3    4      1    2    3    4
Number of passengers    2.2  5.0  7.9  3.2    2.9  5.2  8.2  3.8    3.2  5.8  9.1  4.1
Calculate the trend values for this data, using moving averages with an
appropriate period.

Solution: The cycle of the data is clearly 4-quarterly and we thus need a
(centered) 4-quarterly moving average trend. The standard layout of calculations
is demonstrated below:
         Quarter   Original   Moving   Moving    Centered moving
                   data       total    average   average (trend)
Year 1   1         2.2
         2         5.0
                              18.3     4.575
         3         7.9                           4.6625
                              19.0     4.750
         4         3.2                           4.7750
                              19.2     4.800
Year 2   1         2.9                           4.8375
                              19.5     4.875
         2         5.2                           4.9500
                              20.1     5.025
         3         8.2                           5.0625
                              20.4     5.100
         4         3.8                           5.1750
                              21.0     5.250
Year 3   1         3.2                           5.3625
                              21.9     5.475
         2         5.8                           5.5125
                              22.2     5.550
         3         9.1
         4         4.1
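
The centering procedure can be sketched in Python. The function name `centred_moving_averages` is hypothetical; it first forms the moving averages of the given (even) period and then averages them in successive overlapping pairs, as described in the section on centering.

```python
def centred_moving_averages(values, period):
    """Centred moving-average trend for an even period."""
    averages = [
        sum(values[i:i + period]) / period
        for i in range(len(values) - period + 1)
    ]
    # Average successive overlapping pairs so each value lines up with a time point.
    return [(first + second) / 2 for first, second in zip(averages, averages[1:])]


# The ten-value illustration from the section on centering (time points 3 to 8).
illustration = [9, 14, 17, 12, 10, 14, 19, 15, 10, 16]
print(centred_moving_averages(illustration, 4))
# [13.125, 13.25, 13.5, 14.125, 14.5, 14.75]

# Quarterly passenger numbers (millions) for Years 1 to 3.
passengers = [2.2, 5.0, 7.9, 3.2, 2.9, 5.2, 8.2, 3.8, 3.2, 5.8, 9.1, 4.1]
trend = centred_moving_averages(passengers, 4)
# The first trend value belongs to Year 1 Quarter 3, the last to Year 3 Quarter 2.
print(len(trend))  # 8
```

With a period of 4, the first two and last two quarters have no trend value, so twelve observations yield eight trend values.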

SECTION G

INDEX NUMBERS

Sometimes, we wish to compare quantities, for example, the price of a bag of
mealie meal in 2007 and the price in 2009. A very useful way of doing this is to
express one quantity as a percentage of the other quantity in the form of an
“index number”.
The simplest example of an index number is a price relative, or price index. First
of all, a “base year” is chosen. This is the year on which the price changes are
based. If we denote the price in the base year as P0 and the price in the next
year as P1, then

Price relative (Price index) = $\frac{P_1}{P_0}$.

It is usual to give the price relative as a percentage, but “the percentage
symbol is always omitted” in the final answer.

INDEX NUMBER NOTATION

It is convenient, particularly when giving formulae for certain types of index
numbers, to be able to refer to an economic commodity at some general time
point. Prices and quantities have their own special letters, p and q
respectively. In order to bring in the idea of time, the following standard
convention is used:
P0 = the price at ‘base’ time point
Pn = the price at ‘some other’ time point
q0 = the quantity at ‘base’ time point
qn= the quantity at ‘some other’ time point.

PRICE AND QUANTITY RELATIVES

Price relative: $I_p = \frac{P_n}{P_0} \times 100\%$

Quantity relative: $I_q = \frac{q_n}{q_0} \times 100\%$
Note: Always omit the percentage from the final answer.


Example:
In January 2006, the price of a 25kg bag of mealie meal was K42,000. In
January 2009, the price was K63,000. Taking 2006 as a base year, find the
price relative.

Solution:

Price relative = $\frac{\text{Price in 2009}}{\text{Price in 2006}} \times 100\%$

$= \frac{P_n}{P_0} \times 100\%$

$= \frac{63000}{42000} \times 100\%$

= 150%

But we omit the % from the answer. So, Price relative = 150.
This indicates that the price of mealie meal increased by 50% between 2006
and 2009.

Example 2:
The following table gives details of prices and quantities sold of two particular
items in a departmental store over two years:

                  2005                           2008
Item              Price (Po)  Number sold (qo)   Price (Pn)  Number sold (qn)
Sugar (2 Kg)      K5,000      370,000            K8,000      180,000
Bread (sliced)    K2,300      260,000            K4,200      450,000

Find the price and quantity relatives for 2008 for both sugar and bread.

Solutions:

For sugar, Price relative = $\frac{P_n}{P_0} \times 100 = \frac{8000}{5000} \times 100 = 160$.

Quantity relative = $\frac{q_n}{q_0} \times 100 = \frac{180000}{370000} \times 100 = 48.6$.

For bread, Price relative = $\frac{P_n}{P_0} \times 100 = \frac{4200}{2300} \times 100 = 182.6$.

Quantity relative = $\frac{q_n}{q_0} \times 100 = \frac{450000}{260000} \times 100 = 173.1$.
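
The four relatives can be checked with a small Python sketch. The function name `relative` and the dictionary layout are illustrative assumptions; the prices and quantities are those given for sugar and bread in 2005 and 2008.

```python
def relative(current, base):
    """Current value as a percentage of the base value (percentage sign omitted)."""
    return current / base * 100


items = {
    "Sugar (2 Kg)": {"p0": 5000, "q0": 370000, "pn": 8000, "qn": 180000},
    "Bread (sliced)": {"p0": 2300, "q0": 260000, "pn": 4200, "qn": 450000},
}

for name, d in items.items():
    price_relative = round(relative(d["pn"], d["p0"]), 1)
    quantity_relative = round(relative(d["qn"], d["q0"]), 1)
    print(name, price_relative, quantity_relative)
# Sugar (2 Kg) 160.0 48.6
# Bread (sliced) 182.6 173.1
```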

WEIGHTED PRICE INDEX


For a price index to be realistic, it should take into account the relative
importance of the commodities. The method of weighting allows this to be
done. Suppose for example, we know that in consumer expenditure, food is
twice as important as housing, which in turn is twice as important as
transport. Then, we give weights 4, 2, and 1 to the three commodities.

In general, weighted Price Index = $\frac{\sum \left(\frac{p_n}{p_0}\right) w}{\sum w} \times 100$


Note: The weighted price index is sometimes referred to as the “composite
index”.

Example: Calculate a weighted price index for the following figures for 2006
based on 2004.
Item 2004 price 2006 price Weight ( w )
( K’000) ( K’000)
Food 55 60 4
Housing 48 52 2
Transport 16 20 1

Solution: Weighted price index = $\frac{\sum \left(\frac{p_n}{p_0}\right) w}{\sum w} \times 100$

$= \frac{\frac{60}{55} \times 4 + \frac{52}{48} \times 2 + \frac{20}{16} \times 1}{4 + 2 + 1} \times 100$

$= \frac{7.780}{7} \times 100$

= 1.111 x 100

= 111

Therefore, the weighted price index = 111.

Example 2: Calculate to nearest integer, the weighted price index from the
following price relatives and weights:

                 Price relative   Weight
Food             118              40
Rent             102              8
Clothing         114              12
Fuel             120              10
Miscellaneous    110              30

Solution: Weighted Price Index = $\frac{118 \times 40 + 102 \times 8 + 114 \times 12 + 120 \times 10 + 110 \times 30}{40 + 8 + 12 + 10 + 30}$

$= \frac{11404}{100}$

= 114.04

Therefore, the weighted price index = 114 (to the nearest integer).
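
Both weighted index examples can be reproduced with a short Python sketch. The function name `weighted_index` is an illustrative assumption; it takes price relatives already expressed as percentages and returns their weighted average.

```python
def weighted_index(relatives, weights):
    """Weighted average of price relatives: sum(relative * weight) / sum(weights)."""
    return sum(r * w for r, w in zip(relatives, weights)) / sum(weights)


# Example 1: relatives are built from the 2004 and 2006 prices.
prices_2004 = [55, 48, 16]  # food, housing, transport
prices_2006 = [60, 52, 20]
relatives = [pn / p0 * 100 for pn, p0 in zip(prices_2006, prices_2004)]
print(round(weighted_index(relatives, [4, 2, 1])))  # 111

# Example 2: relatives are given directly.
print(round(weighted_index([118, 102, 114, 120, 110], [40, 8, 12, 10, 30])))  # 114
```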

ASSIGNMENT 4 DUE DATE 30TH MARCH 2016


1. In 2002 the price index of a commodity, using 1998 as a base year, was
112. In 2004, the index using 2002 as a base year was 85. What would
have been the index in 2004 using 1998 as a base year?


2. In 2001, the index number of the value of a commodity was 135 when
1999 was taken as base year. The value of the commodity in 2001 was
K54 000 and in 2000 was K46 000. Find:
a) the value of the commodity in 1999.
b) the index number of the value of the commodity in 2000 when 1999
was taken as a base year.

3. The cost of servicing a car depends on three items- cost of materials, cost
of labour and cost of overheads. The price relatives of these items in
1990, using 1988 as a base year, are shown in the table below, together
with the weights attached to them.

Materials Labour Overheads


Price relative 115 110 x
Weight 2 5 3

Given that the cost of servicing a car was K50 000 in 1988 and K57 000 in
1990, find the value of x.

4.The following table shows the price relatives for various commodities in
2002, with 2000 as base year, with their weights. Calculate an index of
retail prices, based on these figures, giving your answer to the nearest
integer.

Commodity Price relative Weight


A 115 20
B 123 25
C 154 10
D 108 15
E 100 30

5.There are four grades of workers in a certain factory. The table below
shows the average weekly wage, in K, of a worker in each grade in 1998 and
in 2008; the final column shows the index number for these wages in 2008,
taking 1998 as a base year.


1998 2008
Grade Weekly wage Weekly wage Index
( K’000) ( K’000) number
1 120 192 160
2 150 285 x
3 y 330 200
4 170 z 250

Find the values of x, y and z.

The number of workers in each grade in 2008 is shown in the table below.

Grade 1 2 3 4
Number of workers 180 165 100 55

Obtain a composite index number for the average weekly wage for the whole
factory in 2008, using 1998 as a base year.

LASPEYRES INDICES

A Laspeyres Index is a special case of a weighted aggregate index which
always uses base time period weights. It is most commonly associated with
price and quantity indices where:

i) a Laspeyres “price” index uses base time period “quantities” as weights.
ii) a Laspeyres “quantity” index uses base time period “prices” as weights.

FORMULAE FOR LASPEYRES INDICES


The two formulae most commonly used for calculating Laspeyres indices are
given below:

a) Price Index: $L_p = \frac{\sum q_0 P_n}{\sum q_0 P_0} \times 100$

b) Quantity Index: $L_q = \frac{\sum P_0 q_n}{\sum P_0 q_0} \times 100$

PAASCHE INDICES
A Paasche Index is a special case of a weighted aggregate index which
always uses current time period weights. It is most commonly associated with
price and quantity indices where:
a) a Paasche price index uses current time period “ quantities” as weights.
b) a Paasche quantity index uses current time period “ prices” as weights.

FORMULAE FOR PAASCHE INDEX


As with a Laspeyres index, the Paasche Index has two formulae. The two
formulae most commonly used for calculating the Paasche Index are:

a) Price Index: $P_p = \frac{\sum P_n q_n}{\sum P_0 q_n} \times 100$

b) Quantity Index: $P_q = \frac{\sum P_n q_n}{\sum P_n q_0} \times 100$

Example: The following data relate to a set of commodities used in a
particular process:

                                 Base Period            Period 1
Commodity   Units of purchase    Price     Quantity     Price     Quantity
                                 (K’000)   (Units)      (K’000)   (Units)
A           210 Litres drum      36        100          40        95
B           1 tonne              80        12           90        10
C           100 Kg               45        16           41        18
D           200 meters           5         1100         6         1200

Calculate the Laspeyres and the Paasche Price Indices for Period 1.
Solution: The layout for the calculations is shown below:

Commodity Po qo Pn qn
A 36 100 40 95
B 80 12 90 10
C 45 16 41 18
D 5 1100 6 1200

Calculations for Laspeyres Index:

Commodity   qoPn      Poqo
A           4000      3600
B           1080      960
C           656       720
D           6600      5500
Total       12,336    10,780

Laspeyres Price Index: $L_p = \frac{\sum q_0 P_n}{\sum q_0 P_0} \times 100$

$= \frac{12336}{10780} \times 100$

= 114.4

Calculations for Paasche Index:

Commodity   Pnqn      qnPo
A           3800      3420
B           900       800
C           738       810
D           7200      6000
Total       12,638    11,030

Paasche Price Index: $P_p = \frac{\sum P_n q_n}{\sum P_0 q_n} \times 100$

$= \frac{12638}{11030} \times 100$

= 114.6.

Both indices are showing an increase in prices of about 14.5%, although notice
that the two indices have slightly different values. This is to be expected, since
the weights used for the commodities are different.
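
The two price indices can be computed side by side in a short Python sketch. The function names `laspeyres_price` and `paasche_price` are illustrative assumptions; the lists hold the prices and quantities of commodities A to D from the table above.

```python
def laspeyres_price(p0, q0, pn):
    """Laspeyres price index: base period quantities as weights."""
    return sum(q * p for q, p in zip(q0, pn)) / sum(q * p for q, p in zip(q0, p0)) * 100


def paasche_price(p0, pn, qn):
    """Paasche price index: current period quantities as weights."""
    return sum(q * p for q, p in zip(qn, pn)) / sum(q * p for q, p in zip(qn, p0)) * 100


p0 = [36, 80, 45, 5]        # base period prices
q0 = [100, 12, 16, 1100]    # base period quantities
pn = [40, 90, 41, 6]        # period 1 prices
qn = [95, 10, 18, 1200]     # period 1 quantities

print(round(laspeyres_price(p0, q0, pn), 1))  # 114.4
print(round(paasche_price(p0, pn, qn), 1))    # 114.6
```

The only difference between the two functions is which set of quantities is used as weights, which is why the results differ slightly.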

COMPARISON OF LASPEYRES AND PAASCHE INDICES

a) What the indices are calculating:

i) Laspeyres: Laspeyres compares base period expenditure with a hypothetical
current period expenditure at base period quantities.

ii) Paasche: Paasche compares current period expenditure with a hypothetical
base period expenditure at current period quantities.

b) When prices are rising:

i) Laspeyres index tends to over-estimate price increases.


ii) Paasche index tends to under-estimate price increases.
This is thought of as a disadvantage for both types of indices.

c) Nature of the indices:

i) Laspeyres: Laspeyres index is thought of as a “pure” price index since (from
period to period) it is comparing like with like, which is an advantage.
ii) Paasche: Paasche index is not considered as “ pure” since ( from period to
period ) the weights change and thus, strictly, like is not being compared with
like. This is considered to be a disadvantage of the index.

d) Weights used in calculations:

Since the Laspeyres index uses “base” period quantities as weights, they can
easily become out of date (disadvantage), while the “ current” quantities that the
Paasche index uses as weights, are always up-to-date (advantage).


e) Ease of calculations:

The Laspeyres index needs only the base period quantities, no matter how
many periods the index is being calculated for, which is a considerable
advantage over the Paasche index which needs new quantities for each time
period. Quantities are normally more difficult to determine than prices.
Generally the Laspeyres index is more favored than the Paasche index.

USES OF INDEX NUMBERS

Index numbers are used to reflect general economic conditions over a period of
time. For example, the Retail Price Index, measures changes in the cost of living;
the index of industrial production reflects changes in industrial output; the
Financial Ordinary Shares Index reflects the general state of the stock market.
In particular, index numbers can be used by Government to decide on tax
changes, subsidies to industries or regions or national retirement pension
increases. Trade unions often use the national cost of living and production
indices in wage negotiations or to compare the cost of living across national
boundaries, regions or professions. Insurance companies use various cost
indices to index-link housing (building or contents) policies.

CHANGING THE BASE OF FIXED BASE RELATIVES

Given a time series of relatives, it is sometimes necessary to change the base.
One of the reasons for doing this might be that the original base time point is too
far in the past to be relevant to today’s conditions and a more recent one is
needed.
For example, the following set has a base of 1985 which would probably now be
considered out-of-date.

                     2002  2003  2004  2005  2006  2007  2008
Index (1985 = 100)   324   351   377   384   391   404   428

The procedure for changing the base of a time series of relatives is essentially
the same as that for calculating a set of relatives for a given time series of
values. However, the procedure is given below and demonstrated using the
above set of values:

Step 1: Choose the required new base time point and thus, identify the
corresponding relative. We will choose 2002 as the new base year with a
corresponding relative of 324.

Step 2: Divide each relative in the set by the value of the relative identified
above, and “multiply the result” by 100.


Thus, each index relative given needs to be divided by 324 and multiplied by
100.

                          2002  2003  2004  2005  2006  2007  2008
Old Index (1985 = 100)    324   351   377   384   391   404   428
New Index (2002 = 100)    100   108   116   119   121   125   132

The calculations are carried out as shown below (see the new index numbers
in the bottom row of the table).

$\frac{324}{324} \times 100 = 100$, $\frac{377}{324} \times 100 = 116$, $\frac{404}{324} \times 100 = 125$, and so on.
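
Steps 1 and 2 can be sketched in Python. The function name `rebase` is an illustrative assumption; the series is the 1985-based index given above, and the results are rounded to the nearest integer as in the table.

```python
def rebase(index_series, new_base_period):
    """Divide every relative by the relative of the new base period and multiply by 100."""
    base_value = index_series[new_base_period]
    return {
        period: round(value / base_value * 100)
        for period, value in index_series.items()
    }


old_index = {2002: 324, 2003: 351, 2004: 377, 2005: 384, 2006: 391, 2007: 404, 2008: 428}
print(rebase(old_index, 2002))
# {2002: 100, 2003: 108, 2004: 116, 2005: 119, 2006: 121, 2007: 125, 2008: 132}
```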

