
Quantitative Methods
Professor David Targett

QM-A3-engb 1/2017 (1022)


This course text is part of the learning content for this Edinburgh Business School course.
In addition to this printed course text, you should also have access to the course website for this subject,
which will provide you with more learning content, the Profiler software and past examination questions
and answers.
The content of this course text is updated from time to time, and all changes are reflected in the version
of the text that appears on the accompanying website at http://coursewebsites.ebsglobal.net/.
Most updates are minor, and examination questions will avoid any new or significantly altered material for
two years following publication of the relevant material on the website.
You can check the version of the course text via the version release number to be found on the front
page of the text, and compare this to the version number of the latest PDF version of the text on the
website.
If you are studying this course as part of a tutored programme, you should contact your Centre for
further information on any changes.
Full terms and conditions that apply to students on any of the Edinburgh Business School courses are
available on the website www.ebsglobal.net, and should have been notified to you either by Edinburgh
Business School or by the centre or regional partner through whom you purchased your course. If this is
not the case, please contact Edinburgh Business School at the address below:

Edinburgh Business School


Heriot-Watt University
Edinburgh
EH14 4AS
United Kingdom

Tel + 44 (0) 131 451 3090


Fax + 44 (0) 131 451 3002
Email [email protected]
Website www.ebsglobal.net

The courses are updated on a regular basis to take account of errors, omissions and recent
developments. If you'd like to suggest a change to this course, please contact
us: [email protected].
Quantitative Methods
The Quantitative Methods programme is written by David Targett, Professor of Information Systems at
the School of Management, University of Bath and formerly Senior Lecturer in Decision Sciences at the
London Business School. Professor Targett has many years’ experience teaching executives to add
numeracy to their list of management skills and become balanced decision makers. His style is based on
demystifying complex techniques and demonstrating clearly their practical relevance as well as their
shortcomings. His books, including Coping with Numbers and The Economist Pocket Guide to Business
Numeracy, have stressed communication rather than technical rigour and have sold throughout the world.
He has written over fifty case studies which confirm the increasing integration of Quantitative Methods
with other management topics. The cases cover a variety of industries, illustrating the changing nature of
Quantitative Methods and the growing impact it is having on decision makers in the Information Technol-
ogy age. They also demonstrate Professor Targett’s wide practical experience in international
organisations in both public and private sectors.
One of his many articles, a study on the provision of management information, won the Pergamon Prize
in 1986.
He was part of the team that designed London Business School’s highly successful part-time MBA
Programme of which he was the Director from 1985 to 1988. During this time he extended the interna-
tional focus of the teaching by leading pioneering study groups to Hong Kong, Singapore and the United
States of America. He has taught on all major programmes at the London Business School and has
developed and run management education courses involving scores of major companies including:
British Rail
Citicorp
Marks and Spencer
Shell
First Published in Great Britain in 1990.
© David Targett 1990, 2000, 2001
The rights of Professor David Targett to be identified as Author of this Work have been asserted in
accordance with the Copyright, Designs and Patents Act 1988.
All rights reserved; students may print and download these materials for their own private study only and
not for commercial use. Except for this permitted use, no materials may be used, copied, shared, lent,
hired or resold in any way, without the prior consent of Edinburgh Business School.
Contents

PART 1 INTRODUCTION AND BACKGROUND


Module 1 Introducing Statistics: Some Simple Uses and Misuses 1/1

1.1 Introduction 1/1


1.2 Probability 1/3
1.3 Discrete Statistical Distributions 1/5
1.4 Continuous Statistical Distributions 1/8
1.5 Standard Distributions 1/11
1.6 Wrong Use of Statistics 1/16
1.7 How to Spot Statistical Errors 1/20
Learning Summary 1/22
Review Questions 1/24
Case Study 1.1: Airline Ticketing 1/26
Case Study 1.2: JP Carruthers Co. 1/26
Case Study 1.3: Newspaper Letters 1/30

Module 2 Basic Mathematics: School Mathematics Applied to Management 2/1

2.1 Introduction 2/1


2.2 Graphical Representation 2/2
2.3 Manipulation of Equations 2/8
2.4 Linear Functions 2/11
2.5 Simultaneous Equations 2/14
2.6 Exponential Functions 2/18
Review Questions 2/25
Case Study 2.1: Algebraic Formulation 2/28
Case Study 2.2: CNX Armaments Co. 2/29
Case Study 2.3: Bonzo Corporation 2/29
Case Study 2.4: Woof Dog Food 2/29

PART 2 HANDLING NUMBERS


Module 3 Data Communication 3/1

3.1 Introduction 3/1


3.2 Rules for Data Presentation 3/3
3.3 The Special Case of Accounting Data 3/12
3.4 Communicating Data through Graphs 3/16
Learning Summary 3/21


Review Questions 3/22


Case Study 3.1: Local Government Performance Measures 3/24
Case Study 3.2: Multinational Company’s Income Statement 3/25
Case Study 3.3: Country GDPs 3/25
Case Study 3.4: Energy Efficiency 3/26

Module 4 Data Analysis 4/1

4.1 Introduction 4/1


4.2 Management Problems in Data Analysis 4/2
4.3 Guidelines for Data Analysis 4/6
Learning Summary 4/15
Review Questions 4/16
Case Study 4.1: Motoring Correspondent 4/17
Case Study 4.2: Geographical Accounts 4/18
Case Study 4.3: Wages Project 4/19

Module 5 Summary Measures 5/1

5.1 Introduction 5/1


5.2 Usefulness of the Measures 5/2
5.3 Measures of Location 5/5
5.4 Measures of Scatter 5/14
5.5 Other Summary Measures 5/20
5.6 Dealing with Outliers 5/21
5.7 Indices 5/22
Learning Summary 5/29
Review Questions 5/30
Case Study 5.1: Light Bulb Testing 5/33
Case Study 5.2: Smith’s Expense Account 5/34
Case Study 5.3: Monthly Employment Statistics 5/34
Case Study 5.4: Commuting Distances 5/34
Case Study 5.5: Petroleum Products 5/35

Module 6 Sampling Methods 6/1

6.1 Introduction 6/1


6.2 Applications of Sampling 6/3
6.3 The Ideas behind Sampling 6/3
6.4 Random Sampling Methods 6/4
6.5 Judgement Sampling 6/10
6.6 The Accuracy of Samples 6/12


6.7 Typical Difficulties in Sampling 6/13


6.8 What Sample Size? 6/15
Learning Summary 6/16
Review Questions 6/18
Case Study 6.1: Business School Alumni 6/20
Case Study 6.2: Clearing Bank 6/20

PART 3 STATISTICAL METHODS


Module 7 Distributions 7/1

7.1 Introduction 7/1


7.2 Observed Distributions 7/2
7.3 Probability Concepts 7/8
7.4 Standard Distributions 7/14
7.5 Binomial Distribution 7/15
7.6 The Normal Distribution 7/19
Learning Summary 7/27
Review Questions 7/29
Case Study 7.1: Examination Grades 7/31
Case Study 7.2: Car Components 7/31
Case Study 7.3: Credit Card Accounts 7/32
Case Study 7.4: Breakfast Cereals 7/32

Module 8 Statistical Inference 8/1

8.1 Introduction 8/1


8.2 Applications of Statistical Inference 8/2
8.3 Confidence Levels 8/2
8.4 Sampling Distribution of the Mean 8/3
8.5 Estimation 8/6
8.6 Basic Significance Tests 8/9
8.7 More Significance Tests 8/18
8.8 Reservations about the Use of Significance Tests 8/24
Learning Summary 8/26
Review Questions 8/28
Case Study 8.1: Food Store 8/30
Case Study 8.2: Management Association 8/31
Case Study 8.3: Textile Company 8/31
Case Study 8.4: Titan Insurance Company 8/31


Module 9 More Distributions 9/1

9.1 Introduction 9/1


9.2 The Poisson Distribution 9/2
9.3 Degrees of Freedom 9/7
9.4 t-Distribution 9/8
9.5 Chi-Squared Distribution 9/14
9.6 F-Distribution 9/19
9.7 Other Distributions 9/22
Learning Summary 9/23
Review Questions 9/25
Case Study 9.1: Aircraft Accidents 9/28
Case Study 9.2: Police Vehicles 9/28

Module 10 Analysis of Variance 10/1

10.1 Introduction 10/1


10.2 Applications 10/2
10.3 One-Way Analysis of Variance 10/5
10.4 Two-Way Analysis of Variance 10/10
10.5 Extensions of Analysis of Variance 10/13
Learning Summary 10/14
Review Questions 10/15
Case Study 10.1: Washing Powder 10/16
Case Study 10.2: Hypermarkets 10/17

PART 4 STATISTICAL RELATIONSHIPS


Module 11 Regression and Correlation 11/1

11.1 Introduction 11/1


11.2 Applications 11/3
11.3 Mathematical Preliminaries 11/4
11.4 Simple Linear Regression 11/7
11.5 Correlation 11/9
11.6 Checking the Residuals 11/13
11.7 Regression on a Computer 11/15
11.8 Some Reservations about Regression and Correlation 11/19
Learning Summary 11/22
Review Questions 11/23
Case Study 11.1: Railway Booking Offices 11/25
Case Study 11.2: Department Store Chain 11/26


Module 12 Advanced Regression Analysis 12/1

12.1 Introduction 12/1


12.2 Multiple Regression Analysis 12/2
12.3 Non-linear Regression Analysis 12/6
12.4 Statistical Basis of Regression and Correlation 12/12
12.5 Regression Analysis Summary 12/22
Learning Summary 12/23
Review Questions 12/26
Case Study 12.1: CD Marketing 12/28
Case Study 12.2: Scrap Metal Processing I 12/29
Case Study 12.3: Scrap Metal Processing II 12/30

PART 5 BUSINESS FORECASTING


Module 13 The Context of Forecasting 13/1

13.1 Introduction 13/1


13.2 A Review of Forecasting Techniques 13/2
13.3 Applications 13/3
13.4 Qualitative Forecasting Techniques 13/5
Learning Summary 13/16
Review Questions 13/18
Case Study 13.1: Automobile Design 13/19

Module 14 Time Series Techniques 14/1

14.1 Introduction 14/1


14.2 Where Time Series Methods Are Successful 14/2
14.3 Stationary Series 14/2
14.4 Series with a Trend 14/6
14.5 Series with Trend and Seasonality 14/8
14.6 Series with Trend, Seasonality and Cycles 14/8
14.7 Review of Time Series Techniques 14/15
Learning Summary 14/17
Review Questions 14/18
Case Study 14.1: Interior Furnishings 14/20
Case Study 14.2: Garden Machinery Manufacture 14/21
Case Study 14.3: McClune and Sons 14/21


Module 15 Managing Forecasts 15/1

15.1 Introduction 15/1


15.2 The Manager’s Role in Forecasting 15/2
15.3 Guidelines for an Organisation’s Forecasting System 15/4
15.4 Forecasting Errors 15/13
Learning Summary 15/15
Review Questions 15/17
Case Study 15.1: Interior Furnishings 15/19
Case Study 15.2: Theatre Company 15/19
Case Study 15.3: Brewery 15/19

Appendix 1 Statistical Tables A1/1


Appendix 2 Examination Formulae Sheet A2/1
Short-Cut Formula 2/1
Binomial Distribution 2/1
Estimation 2/1
Poisson Distribution 2/2
Normal Distribution 2/2
t-Distribution 2/2
Chi-Squared Distribution 2/2
F-Distribution 2/2
One-Way Analysis of Variance 2/3
Two-Way Analysis of Variance 2/3
Regression 2/3
Correlation Coefficient 2/3
Runs Test 2/3
Exponential Smoothing 2/3
Holt’s Method 2/4
Mean Square Error 2/4

Appendix 3 Practice Final Examinations A3/1


Practice Final Examination 1 3/2
Practice Final Examination 2 3/12

Appendix 4 Answers to Review Questions A4/1


Module 1 4/1
Module 2 4/6
Module 3 4/12
Module 4 4/18
Module 5 4/23


Module 6 4/31
Module 7 4/37
Module 8 4/45
Module 9 4/53
Module 10 4/59
Module 11 4/66
Module 12 4/72
Module 13 4/79
Module 14 4/85
Module 15 4/96

Index I/1



PART 1

Introduction and Background


Module 1 Introducing Statistics: Some Simple Uses and
Misuses
Module 2 Basic Mathematics: School Mathematics
Applied to Management



Module 1

Introducing Statistics: Some Simple Uses and Misuses
Contents
1.1 Introduction.............................................................................................1/1
1.2 Probability ...............................................................................................1/3
1.3 Discrete Statistical Distributions ..........................................................1/5
1.4 Continuous Statistical Distributions .....................................................1/8
1.5 Standard Distributions ........................................................................ 1/11
1.6 Wrong Use of Statistics ...................................................................... 1/16
1.7 How to Spot Statistical Errors ........................................................... 1/20
Learning Summary ......................................................................................... 1/22
Review Questions ........................................................................................... 1/24
Case Study 1.1: Airline Ticketing ................................................................. 1/26
Case Study 1.2: JP Carruthers Co. ............................................................... 1/26
Case Study 1.3: Newspaper Letters ............................................................. 1/30

Prerequisite reading: None

Learning Objectives
This module gives an overview of statistics, introducing basic ideas and concepts at
a general level, before dealing with them in greater detail in later modules. The
purpose is to provide a gentle way into the subject for those without a statistical
background, in response to the cynical view that it is not possible for anyone to read
a statistical text unless they have read it before. For those with a statistical back-
ground, the module will provide a broad framework for studying the subject.

1.1 Introduction
The word statistics can refer to a collection of numbers or it can refer to the
science of studying collections of numbers. Under either definition the subject has
received far more than its share of abuse (‘lies, damned lies…’). A large part of the
reason for this may well be the failure of people to understand that statistics is like a
language. Just as verbal languages can be misused (for example, by politicians and
journalists?) so the numerical language of statistics can be misused (by politicians
and journalists?). To blame statistics for this is as sensible as blaming the English
language when election promises are not kept.


One does not have to be skilled in statistics to misuse them deliberately (‘figures
can lie and liars can figure’), but misuses often remain undetected because fewer
people seem to have the knowledge and confidence to handle numbers than have
similar abilities with words. Fewer people are numerate than are literate. What is
needed to see through the misuse of statistics, however, is common sense with the
addition of only a small amount of technical knowledge.
The difficulties are compounded by the unrealistic attitudes of those who do
have statistical knowledge. For instance, when a company’s annual accounts report
that the physical stock level is £34 236 417 (or even £34 236 000), it conveys an aura
of truth because the figure is so precise. Yet anyone who had accompanied the
accountants who estimated the figure might have seen that the method by which the
data were collected did not warrant such precision.
dogs prefer Bonzo dog food is also misleading, but in a far more overt fashion. The
statement is utterly meaningless, as is seen by asking the questions: ‘Prefer it to
what?’, ‘Prefer it under what circumstances?’, ‘9 out of which 10 dogs?’
Such examples and many, many others of greater or lesser subtlety have generat-
ed a poor reputation for statistics which is frequently used as an excuse for
remaining in ignorance of it. Unfortunately, it is impossible to avoid statistics in
business. Decisions are based on information; information is often in numerical
form. To make good decisions it is necessary to organise and understand numbers.
This is what statistics is about and this is why it is important to have some
knowledge of the subject.
Statistics can be split into two parts. The first part can be called descriptive
statistics. Broadly, this element handles the problem of sorting a large amount of
collected data in ways which enable its main features to be seen immediately. It is
concerned with turning numbers into real and useful information. Included here are
simple ideas such as organising and arranging data so that their patterns can be seen,
summarising data so that they can be handled more easily and communicating data
to others. Also included is the now very important area of handling computerised
business statistics as provided by management information systems and decision
support systems.
The second part can be referred to broadly as inferential statistics. This element
tackles the problem of how the small amount of data that has been collected (called
the sample) may be analysed to infer general conclusions about the total amount of
similar data that exist uncollected in the world (called the population). For instance,
opinion polls use inferential statistics to make statements about the opinions of the
whole electorate of a country, given the results of perhaps just a few hundred
interviews.
Both types of statistics are open to misuse. However, with a little knowledge and
a great deal of common sense, the errors can be spotted and the correct procedures
seen. In this module the basic concepts of statistics will be introduced. Later, some
abuses of statistics and how to counter them will be discussed.
The first basic concept to look at is that of probability, which is fundamental to
statistical work. Statistics deals with approximations and ‘best guesses’ because of
the inaccuracy and incompleteness of most of the data used. It is rare to make
statements and draw conclusions with certainty. Probability is a way of quantifying
the strength of belief in the information derived and the conclusions drawn.

1.2 Probability
All future events are uncertain to some degree. That the present government will
still be in power in the UK in a year’s time (given that it is not an election year) is
likely, but far from certain; that a communist government will be in power in a
year’s time is highly unlikely, but not impossible. Probability theory enables the
difference in the uncertainty of events to be made more precise by measuring their
likelihood on a scale.

[A probability scale from 0 to 1: 'impossible' at 0 (e.g. lifting oneself by one's own bootlaces), 'evens' at 0.5 (e.g. a new-born baby being male) and 'certain' at 1 (e.g. there will be at least one car accident in London during the next year).]

Figure 1.1 Probability scale


The scale is shown in Figure 1.1. At one extreme, impossible events (e.g. that you
could swim the Atlantic) have probability zero. At the other extreme, completely
certain events (e.g. that you will one day die) have probability one. In between are
placed all the neither certain nor impossible events according to their likelihood. For
instance, the probability of obtaining a head on one spin of an unbiased coin is 0.5;
the probability of one particular ticket winning a raffle in which there are 100 tickets
is 0.01.
As a shorthand notation ‘the probability of an event A is 0.6’ is written in this
way:
P(A) = 0.6

1.2.1 Measurement of Probability


There are three methods of calculating a probability. The methods are not alterna-
tives since for certain events only one particular method of measurement may be
possible. However, they do provide different conceptual ways of viewing probabil-
ity. This should become clear as the methods are described.
(a) A priori approach. In this method the probability of an event is calculated by a
process of logic. No experiment or judgement is required. Probabilities involving
coins, dice and playing cards can fall into this category. For example, the proba-
bility of a coin landing ‘heads’ can be calculated by noting that the coin has two
sides, both of which are equally likely to fall upwards (pedants, please note: as-
sume it will not come to rest on its rim). Since the coin must fall with one side
upwards, the two events must share equally the total probability of 1.0. Therefore:
P(Heads) = 0.5
P(Tails) = 0.5
(b) ‘Relative frequency’ approach. When the event has been or can be repeated a
large number of times, its probability can be measured from the formula:
P(Event) = No. of times the event occurred / No. of trials

For example, to estimate the probability of rain on a given day in September in
London, look at the last 10 years' records to find that it rained on 57 days. Then:

P(Rain) = No. of September days with rain / No. of September days observed
        = 57 / (30 × 10)
        = 0.19
(c) Subjective approach. A certain group of statisticians (Bayesians) would argue
that the degree of belief that an individual has about a particular event may be
expressed as a probability. Bayesian statisticians argue that in certain circum-
stances a person’s subjective assessment of a probability can and should be used.
The traditional view, held by classical statisticians, is that only objective probabil-
ity assessments are permissible. Specific areas and techniques that use subjective
probabilities will be described later. At this stage it is important to know that
probabilities can be assessed subjectively but that there is discussion amongst
statisticians as to the validity of doing so. As an example of the subjective ap-
proach, let the event be the achievement of political unity in Europe by the year
2020 AD. There is no way that either of the first two approaches could be em-
ployed to calculate this probability. However, an individual can express his own
feelings on the likelihood of this event by comparing it with an event of known
probability: for example, is it more or less likely than obtaining a head on the
spin of a coin? After a long process of comparison and checking, the result
might be:
P(Political unity in Europe by 2020 AD) = 0.10
The process of accurately assessing a subjective probability is a field of study in
its own right and should not be regarded as pure guesswork.
The three methods of determining probabilities have been presented here as an
introduction and the approach has not been rigorous. Once probabilities have been
calculated by whatever method, they are treated in exactly the same way.
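
As an aside, the relative-frequency calculation from approach (b) is easy to express in code. Below is a minimal Python sketch (illustrative only, not part of the course text), using the September rainfall figures from the example:

```python
# Relative-frequency estimate: P(event) = occurrences / trials
rainy_days = 57        # September days with rain over the last 10 years
total_days = 30 * 10   # 30 September days per year, 10 years of records

p_rain = rainy_days / total_days
print(f"P(Rain) = {p_rain:.2f}")  # P(Rain) = 0.19
```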
Examples
1. What is the probability of throwing a six with one throw of a die?
With the a priori approach there are six possible outcomes: 1, 2, 3, 4, 5 or 6 show-
ing. All outcomes are equally likely. Therefore:
P(throwing a 6) = 1/6
2. What is the probability of a second English Channel tunnel for road vehicles being
completed by 2025 AD?
The subjective approach is the only one possible, since logical thought alone cannot
lead to an answer and there are no past observations. My assessment is a small one,
around 0.02.
3. How would you calculate the probability of obtaining a head on one spin of a biased
coin?


The a priori approach may be possible if one had information on the aerodynamical
behaviour of the coin. A more realistic method would be to conduct several trial
spins and count the number of times a head appeared:
P(obtaining a head) = No. of heads obtained / No. of spins
4. What is the probability of drawing an ace in one cut of a pack of playing cards?
Use the a priori method. There are 52 possible outcomes (one for each card in the
deck) and the probability of picking any one card, say the ace of diamonds, must
therefore be 1/52. There are four aces in the deck, hence:
P(drawing an ace) = 4/52 = 1/13
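
A priori probabilities such as these can also be computed exactly as fractions; a small sketch using Python's standard fractions module:

```python
from fractions import Fraction

# A priori probabilities: count equally likely outcomes.
p_six = Fraction(1, 6)    # one favourable face of a fair die
p_ace = Fraction(4, 52)   # four aces in a 52-card deck

print(p_six)  # 1/6
print(p_ace)  # 1/13 (4/52 is reduced automatically)
```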

1.3 Discrete Statistical Distributions


Probability makes it possible to study another essential element of statistical work: the
statistical distribution. It can be thought of either as one of the first steps in descrip-
tive statistics or, alternatively, as a cornerstone of inferential statistics. It will first be
developed as a descriptive technique. Suppose there is a collection of data, which
initially might appear as in Figure 1.2.

[A jumble of numbers: 53, 66, 41, 71, 40, 110, 83, 106, 72, 20, 99, 92, 75, …]

Figure 1.2 USA sales data


The numbers are all measurements of a variable. A variable is just what the word
implies. It is some entity which can be measured and for which the measurement
varies when several observations are made. The variable might be the number of
serious crimes in each French département or the heights of all 20-year-old males in
Sweden. Figure 1.2 shows the annual sales (in thousands) of a brand of tinned
sweetcorn in different sales regions of the USA. The numbers are referred to as
observations or data points.
It is little more than a mess. A mess can take on different forms, of course. The
first sight of a particular set of data may be a pile of dusty production dockets or it
may be a file of handwritten invoices, but it is always likely to be some sort of mess.
A first attempt to sort it out might be to arrange the numbers in order as in Ta-
ble 1.1.

Table 1.1 Columns of numbers

 …    52    59    66
 …    54    60    66
 41   55    60    …
 43   56    60    …
 45   57    61    …
 46   57    62    …
 48   58    62
 49   58    63
 49   58    65
 50   59    65

Table 1.1 is an ordered array. The numbers look neater now but it is still not
possible to get a feel for the data (the average, for example) as they stand. The next
step is to classify the data and then arrange the classes in order. Classifying means
grouping the numbers in bands (e.g. 50–54) to make them easier to handle. Each
class has a frequency, which is the number of data points that fall within that class.
This is called a frequency table and is shown in Table 1.2. This shows that seven
data points were greater than or equal to 40 but less than 50, 12 were greater than or
equal to 50 but less than 60 and so on. There were 100 data points in all.

Table 1.2 A frequency table


Class              Frequency
40 ≤ x < 50            7
50 ≤ x < 60           12
60 ≤ x < 70           22
70 ≤ x < 80           27
80 ≤ x < 90           19
90 ≤ x < 100          10
100 ≤ x < 110          3
Total frequency      100
Note: ≤ means ‘less than or equal to’; < means ‘less than’.
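
The classification step is easy to automate. A minimal Python sketch follows (illustrative only; the values are just the extract of the sales data shown in Figure 1.2, standing in for the full 100 observations):

```python
# Build a frequency table by counting observations per class band.
data = [53, 66, 41, 71, 40, 83, 106, 72, 99, 92, 75]  # extract of the sales data

classes = [(40, 50), (50, 60), (60, 70), (70, 80),
           (80, 90), (90, 100), (100, 110)]

for lower, upper in classes:
    freq = sum(1 for x in data if lower <= x < upper)
    print(f"{lower} <= x < {upper}: {freq}")
```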

It is now much easier to get an overall conception of what the data mean. For
example, most of the numbers are between 60 and 90 with extremes of 40 and 110.
Of course, it is likely that at some time there may be a need to perform detailed
calculations with the numbers to provide specific information, but at present the
objective is merely to get a feel for the data in the shortest possible time. Another
arrangement with greater visual impact, called a frequency histogram, will help
meet this objective.


[Histogram of the frequency table: columns of height 7, 12, 22, 27, 19, 10 and 3 over the classes 40–50 up to 100–110; vertical axis 'Frequency', horizontal scale 10 to 130.]

Figure 1.3 A frequency histogram


The transition from Table 1.2 to Figure 1.3 is simple and obvious, yet with the
frequency histogram one can see immediately what the data are like. The numbers
are spread symmetrically over a range from 40 to just over 100 with the majority
falling around the centre of the range.
As a descriptive device the frequency histogram works well and it is not necessary
to refine it further. If, on the other hand, there are analytical objectives, the histo-
gram of Figure 1.3 would be developed into a statistical distribution. To be strictly
accurate, all configurations dealt with are statistical distributions, but it is the most
manageable and generally accepted version that is sought.
To carry out this development, notice first the connection between frequencies
and probabilities via the ‘relative frequency’ approach to probability calculations.
The probability that any randomly selected measurement lies within a particular
class interval can be calculated as follows:
P(number lies within class) = Class frequency / Total frequency

e.g.
P(40 ≤ x < 50) = 7/100 = 0.07

The frequency histogram can then be turned into a probability histogram by
writing the units of the vertical axis as probabilities (as calculated above) instead of
frequencies. The shape of the histogram would remain unaltered. Once the histo-
gram is in the probability form it is usually referred to as a distribution, in this case
a discrete distribution. A variable is discrete if it is limited in the values it can take.
For example, when the data are restricted to classes (as above) the variable is
discrete. Also, when a variable is restricted to whole numbers only (an integer
variable), it is discrete.
The probability histogram makes it easier to work out the probabilities associated
with amalgams of classes. For instance, if the probabilities of two of the classes are:
P(50 ≤ x < 60) = 0.12
P(60 ≤ x < 70) = 0.22
then:
P(50 ≤ x < 70) = 0.12 + 0.22
               = 0.34
This is true whether working in probabilities or the frequencies from which they
were derived.
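
A short sketch of this frequency-to-probability bookkeeping in Python, using the Table 1.2 figures:

```python
# Turn class frequencies (Table 1.2) into probabilities, then
# amalgamate adjacent classes by adding their probabilities.
freqs = {(40, 50): 7, (50, 60): 12, (60, 70): 22, (70, 80): 27,
         (80, 90): 19, (90, 100): 10, (100, 110): 3}
total = sum(freqs.values())  # 100

probs = {band: f / total for band, f in freqs.items()}

# P(50 <= x < 70) = P(50 <= x < 60) + P(60 <= x < 70)
print(round(probs[(50, 60)] + probs[(60, 70)], 2))  # 0.34
```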
Examples
From the data in Figure 1.3, what are the following probabilities?
1. P(80 ≤ x < 100)
2. P(x < 70)
3. P(60 ≤ x < 100)
Answers
1.
P(80 ≤ x < 100) = P(80 ≤ x < 90) + P(90 ≤ x < 100)
                = 0.19 + 0.10
                = 0.29
2.
P(x < 70) = P(x < 50) + P(50 ≤ x < 60) + P(60 ≤ x < 70)
          = 0.07 + 0.12 + 0.22
          = 0.41
3.
P(60 ≤ x < 100) = 0.22 + 0.27 + 0.19 + 0.10
                = 0.78

1.4 Continuous Statistical Distributions


To summarise progress so far – there is a probability histogram of a variable, from
which can be determined the probability that any one measurement of the variable
will fall within one of the classes of the histogram. Such a distribution is a discrete
distribution. It is a distribution because the variable is distributed across a range of
values; it is discrete because the values the variables take are in steps rather than
smoothly following one another.
A continuous variable is not limited in the values it can take. It can take whole
numbers and all the values in between; it does not group data in classes, but
distinguishes between numbers such as 41.73241 and 41.73242.
by a continuous variable is a continuous distribution. It can be thought of as an
extension of a discrete distribution. The extension process is as follows. (The
process is to illustrate the link between discrete and continuous distributions: it is
not a procedure that would ever need to be carried out in practice.)
A discrete distribution like Figure 1.3 is reproduced in Figure 1.4(a). The column
widths are progressively reduced. In (b) the column widths have been halved; for
example, the class 50 ≤ x < 60 is divided into two classes, 50 ≤ x < 55 and
55 ≤ x < 60. In (c) the classes have been further subdivided. As the process
continues, the distribution becomes smoother until, ultimately, the continuous
distribution (d) will be achieved.

[Four panels: (a) a discrete distribution with the original classes; (b) class widths halved; (c) classes further subdivided; (d) the limiting smooth curve of a continuous variable, with the area between 50 and 60 shaded.]

Figure 1.4 Discrete to continuous


There is now a difficulty concerning the measurement of probabilities. In the
discrete distribution Figure 1.4(a), the probabilities associated with different values
of the variable were equal to the column height. If column heights continue to be
equal to probabilities, the process (a) → (b) → (c) → (d) would result in flatter and
flatter distributions. Figure 1.4(d) would be completely flat since the probability
associated with the now distinct values such as 41.73241 and 41.73242 must be
infinitesimally small. The problem is overcome by measuring probabilities in a
continuous distribution by areas. For example, P(50 ≤ x < 60) is the area under the
part of the curve between 50 and 60, and shaded in Figure 1.4(d).
The argument for using areas is this. In Figure 1.4(a) the column widths are all
the same; therefore probability could be measured just as well by area as by height.
Figure 1.5 gives an example of what happens in the move from (a) to (b), when the
classes are halved. It is supposed that the original data are such that the probabilities
for the new classes can be calculated from them.


[The class 50 ≤ x < 60, a single column of area 0.12, is split into 50 ≤ x < 55 and 55 ≤ x < 60, two columns of areas 0.05 and 0.07: P(50 ≤ x < 60) = 0.12, P(50 ≤ x < 55) = 0.05, P(55 ≤ x < 60) = 0.07.]

Figure 1.5 Reducing class sizes


Using areas to measure probabilities, the column heights of the new classes are
approximately the same as those of the original. The lower probabilities for the new
classes are reflected in the halving of the column widths, rather than changes in the
heights. As the subdivision process continues, there is no tendency for the distribu-
tion to become flatter. In this way a continuous distribution can have a definite
shape which can be interpreted in the same way as the shape of a discrete distribu-
tion, but its probabilities are measured from areas. Just as the column heights of a
discrete distribution sum to 1 (because each observation certainly has some value), so
the total area of a continuous distribution is 1.
The differences between discrete and continuous distributions are summarised in
Table 1.3.

Table 1.3 Differences between discrete and continuous distributions


Discrete                                     Continuous
Variable limited to certain values           Variable not limited
Shape is usually stepped                     Shape is usually smooth
Probabilities are equal to column heights    Probabilities are equal to areas under the curve
Sum of column heights = 1                    Total area = 1


Example

[A continuous distribution with boundaries at 60, 100, 110 and 135; the areas of the five sections are 0.01, 0.49, 0.27, 0.21 and 0.02.]

Figure 1.6 The area under each part of the curve is shown. The total
area is equal to 1.0
Using the continuous distribution in Figure 1.6, what are the probabilities that a
particular value of the variable falls within the following ranges?
1. x ≤ 60
2. x ≤ 100
3. 60 ≤ x ≤ 110
4. x ≥ 135
5. x ≥ 110
Answers
1. P(x ≤ 60) = 0.01
2. P(x ≤ 100) = 0.01 + 0.49 = 0.5
3. P(60 ≤ x ≤ 110) = 0.49 + 0.27 = 0.76
4. P(x ≥ 135) = 0.02
5. P(x ≥ 110) = 0.21 + 0.02 = 0.23
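The same additive logic can be sketched in code, carrying each section of Figure 1.6 as a boundary pair plus an area (a minimal Python sketch, illustrative only):

```python
# Probabilities in a continuous distribution are areas under the curve.
# The five sections of Figure 1.6: (lower bound, upper bound, area).
areas = [(None, 60, 0.01), (60, 100, 0.49), (100, 110, 0.27),
         (110, 135, 0.21), (135, None, 0.02)]

# P(x >= 110): add the areas of every section at or beyond 110
p = sum(a for lo, hi, a in areas if lo is not None and lo >= 110)
print(round(p, 2))  # 0.23
```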
In practice, the problems with the use of continuous distributions are, first, that
one can never collect sufficient data, sufficiently accurately measured, to
establish a continuous distribution. Second, were this possible, the accurate
measurement of areas under the curve would be difficult. Their greatest
practical use is where continuous distributions appear as standard distributions, a
topic discussed in the next section.

1.5 Standard Distributions


The distribution of US sales data shown in Figure 1.2, Table 1.1, Table 1.2 and
Figure 1.3 is an observed distribution. The data were collected, a histogram
formed and that was the distribution. A standard distribution has a theoretical,
rather than observational, base. It is a distribution that has been defined mathemati-
cally from a theoretical situation. The characteristics of the situation are expressed
mathematically and the resulting situation constructed theoretically. When an actual
situation resembling the theoretical one arises, the associated standard distribution is
applied.


For example, one standard distribution, the normal, is derived from the follow-
ing theoretical situation. A variable is generated by a process which should give the
variable a constant value, but does not do so because it is subject to many small
disturbances. As a result, the variable is distributed around the central value (see
Figure 1.7). This situation (central value, many small disturbances) can be expressed
mathematically and the resulting distribution can be anticipated mathematically (i.e.
a formula describing the shape of the distribution can be found).

[A bell-shaped curve over weights from 497 g to 503 g, centred on the average of 500 g.]

Figure 1.7 Normal distribution of weights of loaves of bread


If an actual situation appears to be like the theoretical, the normal distribution is
applied. Analysis, similar to the probability calculations with the USA sales data, can
then be carried out. Areas under parts of the curve can be found from the mathe-
matical formula or, more easily, from the normal curve tables. The normal
distribution would apply, for instance, to the lengths of machine-cut rods. The rods
should all be of the same length, but are not because of the variation introduced by
vibration, machine inaccuracies, the operator and other factors. A typical analysis
might be to calculate the percentage of production likely to be outside engineering
tolerances for the rods.
The normal distribution can be applied to many situations with similar character-
istics. Other standard distributions relate to situations with different characteristics.
Applying a standard distribution is an approximation. The actual situation is unlikely
to match exactly the theoretical one on which the mathematics were based. Howev-
er, this disadvantage is more than offset by the saving in data collection that the use
of a standard distribution brings about. Observed distributions often entail a great
deal of data collection. Not only must sufficient data be collected for the distribu-
tion to take shape, but also data must be collected individually for each and every
situation.
In summary, using an observed distribution implies that data have been collected
and histograms formed; using a standard distribution implies that the situation in
which data are being generated resembles closely a theoretical situation for which a
distribution has been constructed mathematically.


1.5.1 The Normal Distribution


The normal distribution, one of the most common, is now investigated in more
detail. Figure 1.7 gives a rough idea of what it looks like in the case of weights of
bread loaves. The principal features are that it is symmetrical and bell-shaped; it has
just one hump (i.e. it is unimodal); the hump is at the average of the variable.
However, not all normal distributions are completely the same. Otherwise, they
could not possibly represent both the weights of bread loaves (with an average value
of 500 g and a spread of less than 10 g) and heights of male adults (with an average
of 1.75 m and a spread of around 0.40 m). All normal curves share a number of
common properties such as those mentioned above but they differ in that the
populations they describe have different characteristics. Two factors, called param-
eters, capture these characteristics and are sufficient to distinguish one normal
curve from another (and conversely specify exactly a normal curve). A parameter is
defined as a measure describing some aspect of a population.
The first parameter is the average or mean of the distribution. Although the
term ‘average’ has not been formally defined yet, it is no more than the expression
in everyday use (e.g. the average of 2 and 4 is 3). Two normal distributions differing
only by this parameter have precisely the same shape, but are located at different
points along a horizontal scale.
The second parameter is the standard deviation. Its precise definition will be
given later. It measures the dispersion, or spread, of the variable. In other words,
some variables are clustered tightly about the average (such as the bread loaves).
These distributions have a low standard deviation and their shape is narrow and
high. Variables that are spread a long way from the average have a high standard
deviation and their distribution is low and flat. Figure 1.8 shows examples of
distributions with high and low standard deviations: salaries in a hospital have a
large spread ranging from those for cleaners to those for consultants; salaries for
teaching staff at a school have a much smaller spread.
A further characteristic of a normal distribution is related to the standard devia-
tion (see Figure 1.9). The data refer to the weights of bread loaves with average
weight 500 g and standard deviation 2 g.
The property of the normal distribution illustrated in Figure 1.9 is derived from
the underlying mathematics, which are beyond the scope of this introduction. In any
case, it is more important to be able to use the normal distribution than to prove its
properties mathematically. The property applies whether the distribution is flat and
wide or high and narrow, provided only that it is normal. Given such a property, it is
possible to calculate the probabilities of events. The example below demonstrates
how a standard distribution (in this case the normal) can be used in statistical
analysis.


[(a) a low, wide curve spanning salaries from £6000 to £60 000, centred on £33 000; (b) a tall, narrow curve spanning £10 000 to £28 000, centred on £19 000.]

Figure 1.8 Salaries: (a) hospital – high standard deviation; (b) school –
low standard deviation


[Three normal curves for loaf weights (average 500 g, standard deviation s = 2 g):
68% of the distribution lies within ±1s of the average, so 68% of bread loaves weigh between 498 g and 502 g;
95% of the distribution lies within ±2s of the average, so 95% of bread loaves weigh between 496 g and 504 g;
99% of the distribution lies within ±3s of the average, so 99% of bread loaves weigh between 494 g and 506 g.]

Figure 1.9 Characteristics of the standard deviation (s) in a normal distribution


Example
A machine is set to produce steel components of a given length. A sample of 1000
components is taken and their lengths measured. From the measurements the average
and standard deviation of all components produced are estimated to be 2.96 cm and
0.025 cm respectively. Within what limits would 95 per cent of all components pro-
duced by the machine be expected to lie?
Take the following steps:
1. Assume that the lengths of all components produced follow a normal distribution.
This is reasonable since this situation is typical of the circumstances in which normal
distributions arise.
2. The parameters of the distribution are the mean = 2.96 cm and the standard
deviation = 0.025 cm. The distribution of the lengths of the components will there-
fore be as in Figure 1.10.

[A normal curve centred on 2.96 cm, with 95% of the area between 2.91 cm and 3.01 cm.]

Figure 1.10 Distribution of lengths of steel components


There is a difference between the distribution of all components produced by the
machine (the distribution of the population) and the distribution of the lengths of
components in the sample. It is the former distribution which is of interest and
which is shown in Figure 1.10. The sample has been used to estimate the parame-
ters.
3. From the properties of the normal distribution stated above, 95 per cent of the
distribution of the population (and therefore 95 per cent of all components pro-
duced) will be within two standard deviations of the average. Limits are 2.96 − (2 ×
0.025) and 2.96 + (2 × 0.025), which give 2.91 cm and 3.01 cm.
According to this estimate, 95 per cent of all production will lie between 2.91 cm
and 3.01 cm.
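
For readers who want to check such calculations by machine, here is a minimal sketch; it uses the scipy library, which is an assumption of the sketch rather than part of the course:

```python
from scipy.stats import norm

mean, sd = 2.96, 0.025

# The text's rule of thumb: 95% of a normal distribution lies
# within two standard deviations of the mean.
print(mean - 2 * sd, mean + 2 * sd)  # 2.91 3.01

# The exact multiplier is 1.96 rather than 2:
low, high = norm.interval(0.95, loc=mean, scale=sd)
print(round(low, 3), round(high, 3))  # 2.911 3.009
```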

1.6 Wrong Use of Statistics


Statistics are misused whenever statistical evidence is presented in such a way that it
tends to lead to a false conclusion. The Advertising Standards Authority tries to
protect the public from misleading advertising, but the manager has no similar
protection against misleading management data. The presentation may mislead
accidentally or deliberately. In the latter case, the misuse of statistics can be a
creative art. Even so, it is possible to notice a few general types of misuse.


1.6.1 Definitions
Statistical expressions and the variables themselves may not have precise definitions.
The user may assume the producer of the data is working with a different definition
than is the case. By assuming a wrong definition, the user will draw a wrong
conclusion. The statistical expression ‘average’ is capable of many interpretations. A
firm of accountants advertises in its recruiting brochure that the average salary of
qualified accountants in the firm is £44 200. A prospective employee may conclude
that financially the firm is attractive to work for. A closer look shows that the
accountants in the firm and their salaries are as follows:

3 partners             £86 000
8 senior accountants   £40 000
9 junior accountants   £34 000

The average salary could be:


The mean = ((3 × 86 000) + (8 × 40 000) + (9 × 34 000)) / 20 = £44 200
The 'middle' value = £40 000
The most frequent value = £34 000

All the figures could legitimately be said to be the average salary. The firm has
doubtless chosen the one that best suited its purposes. Even if it were certain that
the correct statistical definition was being used, it would still be necessary to ask just
how the variable (salary) is defined. Is share of profits included in the partners’
salaries? Are bonuses included in the accountants’ salaries? Are allowances (a car,
for example) included in the accountants’ salaries? If these items are removed, the
situation might be:

3 partners             £50 000
8 senior accountants   £37 000
9 junior accountants   £32 400

The mean salary is now £36 880. Remuneration at this firm is suddenly not quite
so attractive.
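
Python's standard statistics module computes all three 'averages' directly; a quick sketch with the firm's original salary figures:

```python
from statistics import mean, median, mode

# The firm's 20 salaries: 3 partners, 8 senior, 9 junior accountants
salaries = [86_000] * 3 + [40_000] * 8 + [34_000] * 9

print(mean(salaries))    # 44200   -> the mean
print(median(salaries))  # 40000.0 -> the 'middle' value
print(mode(salaries))    # 34000   -> the most frequent value
```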

1.6.2 Graphics
Statistical pictures are intended to communicate data very rapidly. This speed means
that first impressions are important. If the first impression is wrong then it is
unlikely to be corrected.
There are many ways of representing data pictorially, but the most frequently
used is probably the graph. If the scale of a graph is concealed or not shown at all,
the wrong conclusion can be drawn. Figure 1.11 shows the sales figures for a
company over the last three years. The company would appear to have been
successful.

[A steeply rising line labelled 'Sales' plotted over 2013, 2014 and 2015, with no vertical scale shown.]

Figure 1.11 Sales record (no scale)


However, no scales are shown. In fact, the sales record has been:

2013   £11 250 000
2014   £11 400 000
2015   £11 650 000

A more informative graph showing the scale is given in Figure 1.12. Sales have
hardly increased at all. Allowing for inflation, they have probably decreased in real
terms.

[The same sales figures plotted with a vertical scale, 'Sales (£ million)' from 10 to 12, over 2013–2015; the line is almost flat.]

Figure 1.12 Sales record (with scale)

1.6.3 Sample Bias


Most statistical data are collected as a sample (i.e. they are just a small part of the
total data available (the population)). Conclusions drawn from the sample are
generalised to the population. The generalisation can be valid only to the extent that
the sample is representative. If the sample is not representative then the wrong
conclusions will be drawn. Sample bias can occur in three ways.


First, it arises in the collection of the data. The left-wing politician who states that
80 per cent of the letters he receives are against a policy of the right-wing govern-
ment and concludes that a majority of all the electorate oppose the government on
this issue is drawing a conclusion from a biased sample.
Second, sample bias arises through the questions that elicit the data. Questions
such as ‘Do you go to church regularly?’ will provide unreliable information. There
may be a tendency for people to exaggerate their attendance since, generally, it is
regarded as a worthy thing to do. The word ‘regularly’ also causes problems. Twice a
year, at Christmas and Easter, is regular. So is twice every Sunday. It would be
difficult to draw any meaningful conclusions from the question as posed. The
question should be more explicit in defining regularity.
Third, the sample information may be biased by the interviewer. For example,
supermarket interviews about buying habits may be conducted by a young male
interviewer who questions 50 shoppers. It would not be surprising if the resultant
sample comprised a large proportion of young attractive females.
The techniques of sampling which can overcome most of these problems will be
described later in the course.

1.6.4 Omissions
The statistics that are not given can be just as important as those that are. A
television advertiser boasts that nine out of ten dogs prefer Bonzo dog food. The
viewer may conclude that 90 per cent of all dogs prefer Bonzo to any other dog
food. The conclusion might be different if it were known that:
(a) The sample size was exactly ten.
(b) The dogs had a choice of Bonzo or the cheapest dog food on the market.
(c) The sample quoted was the twelfth sample used and the first in which as many
as nine dogs preferred Bonzo.

1.6.5 Logical Errors


Statistics allows conclusions about numbers to be drawn. Usually, however, it is the
entities that lie behind the numbers that are of interest. Two of the most common
ways for logical errors to be made are as follows.
First, the numbers may not be the same as the entities. For example, employee
dissatisfaction is sometimes measured through staff turnover. It is the first that is
being studied, but the numbers measure the second. The two may not always
correspond. Financial analysts study the profit figures of companies in order to
judge the profitability of the company. Profit figures are, however, just accounting
measures and are no more than (hopefully, good) approximations to the ‘true
profitability’ of the company, which is difficult both to define and to measure.
Second, conclusions about the numbers do not necessarily imply causal effects in
the entities. For instance, there is a well-established relationship between the average
salary of clergymen and the price of rum. The two variables move together and this
can be verified statistically. However, this does not mean that clergymen support the
price of rum or vice versa. The explanation is that the variables are related via a
third factor, inflation. The variables have increased together as the cost of living has
increased, but they are unlikely to be causally related. This consideration is im-
portant when decisions are based on statistical association. To take the example
further, holding down clergymen’s salaries in order to hold down the price of rum
would work if the relationship were causal, but not if it were mere association.

1.6.6 Technical Errors


Mistakes occur where there is an insufficient understanding of even basic technicali-
ties. An oft-quoted and simplistic case is that of a trade union leader stating his
concern for the lower paid by saying that he would not rest until all his members
earned more than the average salary for the union. (It may be that he was in fact
making a very subtle statement.)
Another simple mistake is in the use of percentages. It would be wrong to sup-
pose that, for example, a 20 per cent increase in productivity this year makes up for
a 20 per cent decrease last year. If the index of productivity two years ago was 100,
then a 20 per cent decrease makes it 80. The 20 per cent increase then makes it 96
(i.e. it has not been returned to its former level).

1.7 How to Spot Statistical Errors


Many types of statistical error can only be dealt with in the context of a particular
quantitative technique, but there are several general questions which can help to
uncover statistical errors and trickery. These questions should be posed whenever
statistical evidence is used.

1.7.1 Who Is Providing the Evidence?


The law provides a good analogy. In a legal case the standing of a witness is an
important consideration in evaluating evidence. One does not expect the defence
counsel to volunteer information damaging to his/her client. In statistics also it is
important to know who is providing the evidence. If the provider stands to gain
from your acceptance of their conclusion, greater care is needed.
It is inconceivable that the makers of Bonzo dog food should ever declare ‘We
have long believed that Bonzo is the finest dog food available. However, recent tests
with a random sample of 2000 dogs indicate that the product made by the Woof
Corporation…’. On the other hand, a report on dog food by an independent
consumer unit carries a greater likelihood of being reliable evidence.

1.7.2 Where Did the Data Come from?


In 2014 ‘on average British people took 2.38 baths per week, compared with 1.15
twenty years ago’ reports a survey of people’s washing habits carried out by a
government department. On the surface this appears to be straightforward evidence,
but how reliable is it?
Where did the data come from? One can assume not from personal observation.
Most probably people were asked. Since not to bath frequently would be a shameful
admission, answers may well be biased. The figure of 2.38 is likely to be higher than
the true figure. Even so, a comparison with 20 years ago can still be made, but only
provided the bias is the same now as then. It may not be. Where did the 20-year-old
data come from? Most likely from a differently structured survey of different sample
size, with different questions and in a different social environment. The comparison
with 20 years ago, therefore, is also open to suspicion.
One is also misled in this case by the accuracy of the data. The figure of 2.38
suggests a high level of accuracy, completely unwarranted by the method of data
collection. When numbers are presented to many decimal places, one should
question the relevance of the claimed degree of accuracy.

1.7.3 Does It Pass the Common-Sense Test?


Experts in any subject sometimes can become so involved with their work that they
see only the technicalities and not the main issues. Outsiders, inhibited by their lack
of technical expertise, may suppress common-sense questions to the detriment of a
project or piece of research. Anything that does not appear to make sense should be
questioned.
An academic researcher investigated the relationship between total lifetime earn-
ings and age at death, and found that the two variables were closely related. He
concluded that poverty induces early death.
One may question the fact that he is basing a causal conclusion on a statistical
association. Perhaps more importantly, an outsider may think that being alive longer
gives more time to amass earnings and therefore it is at least as valid to conclude
that the causality works in the opposite direction (i.e. an early death causes a person
to have low total lifetime earnings). The researcher was so involved in his work and
also probably had such a strong prior belief that poverty causes early death that he
did not apply the common-sense test.

1.7.4 Has One of the Six Common Errors Been Committed?


Six of the more common types of statistical errors were described in the last section.
Could one of them have been committed? Check through the six categories to see if
one of them could apply:
(a) Is there ambiguity of definition? A statistical term (especially the average)
capable of more than one interpretation may have been used.
(b) Are the pictorial representations misleading? Take a second look to see if
other conclusions could be drawn. Especially check that scales have been included.
(c) Is there sample bias? When two samples are compared, is like being compared
with like?
(d) What is missing? Is there any additional information which should have been
included and which could change the conclusion?
(e) Is there a logical error? The numbers may not fully represent the entities they
are intended to measure; a strong associative relationship may not be causal.

(f) Is there a technical error? Have statistical definitions/techniques/methods been properly applied? Answering this question will usually require a deeper theoretical knowledge of the subject.

Learning Summary
The purpose of this introduction has been twofold. The first aim has been to
present some statistical concepts as a basis for more detailed study of the subject.
All the concepts will be further explored. The second aim has been to encourage a
healthy scepticism and atmosphere of constructive criticism, which are necessary
when weighing statistical evidence.
The healthy scepticism can be brought to bear on applications of the concepts
introduced so far as much as elsewhere in statistics. Probability and distributions can
both be subject to misuse.
Logical errors are often made with probability. For example, suppose a questionnaire about marketing methods is sent to a selection of companies. From the
200 replies, it emerges that 48 of the respondents are not in the area of marketing. It
also emerges that 30 are at junior levels within their companies. What is the probability that any particular questionnaire was filled in by someone neither in marketing
nor at a senior level? It is tempting to suppose that:
Probability = (48 + 30)/200 = 78/200 = 39%

This is almost certainly wrong because of double counting. Some of the 48 non-
marketers are also likely to be at a junior level. If 10 respondents were non-
marketers and at a junior level, then:
Probability = (48 + 30 − 10)/200 = 68/200 = 34%

Only in the rare case where none of those at a junior level were outside the mar-
keting area would the first calculation have been correct.
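As a check on the arithmetic, the two calculations can be reproduced in a few lines of code (a minimal Python sketch added here for illustration; the figures are those of the questionnaire example above):

  # Probability that a respondent is a non-marketer or at a junior level.
  # Simply adding the two proportions double-counts anyone who is both.
  replies = 200
  non_marketers = 48
  junior = 30
  both = 10  # respondents who are non-marketers AND at a junior level

  naive = (non_marketers + junior) / replies           # 78/200 = 0.39
  correct = (non_marketers + junior - both) / replies  # 68/200 = 0.34
  print(f"{naive:.0%} {correct:.0%}")                  # 39% 34%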

[Figure: histogram with salary classes (£000s) 0–8, 8–16, 16–24, 24–40, 40–60 and 60+ on the x-axis and the number of civil servants (scale 0 to 2000) on the y-axis]

Figure 1.13 Civil servants’ salaries


Graphical errors can frequently be seen with distributions. Figure 1.13 shows an
observed distribution relating to the salaries of civil servants in a government
department. The figures give a wrong impression of the spread of salaries because
the class intervals are not all equal. One could be led to suppose that salaries are
higher than they are. The lower bands are of width £8000 (0–8, 8–16, 16–24). The
higher ones are of a much larger size. The distribution should be drawn with all the
intervals of equal size, as in Figure 1.14.
Statistical concepts are open to misuse and wrong interpretation just as verbal
reports are. The same vigilance should be exercised in the former as in the latter.

[Figure: histogram with equal salary classes (£000s) 0–8, 8–16, 16–24, 24–32, 32–40, 40–48, 48–56, 56–64 and 64+ on the x-axis and the number of civil servants on the y-axis]

Figure 1.14 Civil servants’ salaries (amended)
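The class-interval problem illustrated by Figure 1.13 can also be seen by plotting the data in software. The sketch below (assuming the matplotlib library, and with illustrative counts since the exact figures are not reproduced in the text) plots frequency density, frequency divided by class width, so that the area of each bar rather than its height represents the number of civil servants; this is an alternative remedy to redrawing with equal intervals:

  import matplotlib.pyplot as plt

  # Class edges in £000s (the open-ended 60+ class is capped at 80 to draw it);
  # the counts are assumed for illustration only.
  edges = [0, 8, 16, 24, 40, 60, 80]
  counts = [1800, 1500, 900, 700, 250, 60]

  widths = [edges[i + 1] - edges[i] for i in range(len(counts))]
  density = [c / w for c, w in zip(counts, widths)]  # civil servants per £000

  plt.bar(edges[:-1], density, width=widths, align='edge', edgecolor='black')
  plt.xlabel('Salary (£000s)')
  plt.ylabel('Frequency density')
  plt.show()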


Review Questions
1.1 One of the reasons probability is important in statistics is that, if data being dealt with
are in the form of a sample, any conclusions drawn cannot be 100 per cent certain. True
or false?

1.2 A randomly selected card drawn from a pack of cards was an ace. It was not returned
to the pack. What is the probability that a second card drawn will also be an ace?
A. 1/4
B. 1/13
C. 3/52
D. 1/17
E. 1/3

1.3 Which of the following statements are true?


A. The probability of an event is a number between 0 and 1.
B. Since nothing is ever certain, no event can have a probability equal to 1.
C. Classical statisticians take the view that subjective probability has no validity.
D. Bayesian statisticians take the view that only subjective probability has validity.

1.4 A coin is known to be unbiased (i.e. it is just as likely to come down ‘heads’ as ‘tails’). It
has just been tossed eight times and each time the result has been ‘heads’. On the ninth
throw, what is the probability that the result will be ‘tails’?
A. Less than 1/2
B. 1/2
C. More than 1/2
D. 1


Questions 1.5–1.7 are based on the following information:


A train station’s daily ticket sales (in £000) over the last quarter (= 13 weeks = 78
days) have been collected in histogram form as shown in Figure 1.15.

[Figure: histogram of daily ticket sales (£000) with classes ‘Less than 30’, 30–39.9, 40–49.9, 50–59.9 and ‘60 or more’; the numbers of days in the classes are 8, 25, 22, 17 and 6 respectively]

Figure 1.15 Train ticket sales

1.5 On how many days were sales not less than £50 000?
A. 17
B. 55
C. 23
D. 48

1.6 What is the probability that on any day sales are £60 000 or more?
A. 1/13
B. 23/78
C. 72/78
D. 0

1.7 What is the sales level that was exceeded on 90 per cent of all days?
A. £20 000
B. £30 000
C. £40 000
D. £50 000
E. £60 000

1.8 Which of the following statements about a normal distribution is true?


A. A normal distribution is another name for a standard distribution.
B. The normal distribution is an example of a standard distribution.
C. The normal distribution is a discrete distribution.
D. The normal distribution may or may not be symmetrical depending upon its
parameters.


1.9 A normal distribution has mean 60 and standard deviation 10. What percentage of
readings will be in the range 60–70?
A. 68%
B. 50%
C. 95%
D. 34%
E. 84%

1.10 A police checkpoint recorded the speeds of motorists over a one-week period. The
speeds had a normal distribution with a mean 82 km/h and standard deviation 11 km/h.
What speed was exceeded by 97.5 per cent of motorists?
A. 49
B. 60
C. 71
D. 104

Case Study 1.1: Airline Ticketing


As a first step towards planning new facilities at one of its city centre ticket offices, an
airline has collected data on the length of time customers spend at a ticket desk (the
service time). One hundred customers were investigated and the time in minutes each
one was at an enquiry desk was measured. The data are shown below.

0.9 3.5 0.8 1.0 1.3 2.3 1.0 2.4 0.7 1.0
2.3 0.2 1.6 1.7 5.2 1.1 3.9 5.4 8.2 1.5
1.1 2.8 1.6 3.9 3.8 6.1 0.3 1.1 2.4 2.6
4.0 4.3 2.7 0.2 0.3 3.1 2.7 4.1 1.4 1.1
3.4 0.9 2.2 4.2 21.7 3.1 1.0 3.3 3.3 5.5
0.9 4.5 3.5 1.2 0.7 4.6 4.8 2.6 0.5 3.6
6.3 1.6 5.0 2.1 5.8 7.4 1.7 3.8 4.1 6.9
3.5 2.1 0.8 7.8 1.9 3.2 1.3 1.4 3.7 0.6
1.0 7.5 1.2 2.0 2.0 11.0 2.9 6.5 2.0 8.6
1.5 1.2 2.9 2.9 2.0 4.6 6.6 0.7 5.8 2.0

1 Classify the data in intervals one minute wide. Form a frequency histogram. What
service time is likely to be exceeded by only 10 per cent of customers?
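One way to carry out the classification, sketched here in Python with numpy (the list below would hold all 100 observations from the table; only the first few are shown):

  import numpy as np

  times = [0.9, 3.5, 0.8, 1.0, 1.3]  # ... complete with all 100 service times
  bins = np.arange(0, 23, 1.0)       # one-minute intervals (longest time is 21.7)
  counts, edges = np.histogram(times, bins=bins)
  for lo, c in zip(edges[:-1], counts):
      print(f"{lo:.0f}-{lo + 1:.0f} min: {c}")

  # The service time exceeded by only 10 per cent of customers
  # is the 90th percentile of the data:
  print("90th percentile:", np.percentile(times, 90))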

Case Study 1.2: JP Carruthers Co.


The JP Carruthers Co. is a medium-sized manufacturing firm. Its sales figures are about
£220 million and its employment level has been around 1100 for the last 10 years. Most
of its sales are in the car industry. JPC’s profit last year was £14 480 000. It has always
enjoyed a reputation for reliability and has generally been regarded as being well managed.
With few exceptions, JPC’s direct labour force, numbering about 600, is represented
by the TWU, the Transport Workers’ Union. It is the practice in this industry to negotiate employee benefits on a company-wide basis, but to negotiate wages for each
class of work in a plant separately. For years, however, this antiquated practice has been
little more than a ritual. Supposedly, the system gives workers the opportunity to
express their views, but the fact is that the wages settlement in the first group invariably
sets the pattern for all other groups within a particular company. The Door Trim Line
at JPC was the key group in last year’s negotiations. Being first in line, the settlement in
Door Trim would set the pattern for JPC that year.
Annie Smith is forewoman for the Door Trim Line. There are many variations of
door trim and Annie’s biggest job is to see that they get produced in the right mix. The
work involved in making the trim is about the same regardless of the particular variety.
That is to say, it is a straight piecework operation and the standard price is 72p per unit
regardless of variety. The work itself, while mainly of an assembly nature, is quite
intricate and requires a degree of skill.
Last year’s negotiations started with the usual complaint from the union about piece
prices in general. There was then, however, an unexpected move. Here is the union’s
demand for the Door Trim Line according to the minutes of the meeting:

We’ll come straight to the point. A price of 72p a unit is diabolical… A fair
price is 80p.
The women average about 71 units/day. Therefore, the 8p more that we want
amounts to an average of £5.68 more per woman per day…
This is the smallest increase we’ve demanded recently and we will not accept
less than 80p.

(It was the long-standing practice in the plant to calculate output on an average daily
basis. Although each person’s output is in fact tallied daily, the bonus is paid on daily
output averaged over the week. The idea is that this gives a person a better chance to
recoup if she happens to have one or two bad days.)
The union’s strategy in this meeting was a surprise. In the past the first demand was
purposely out of line and neither side took it too seriously. This time their demand was
in the same area as the kind of offer that JPC’s management was contemplating.
At their first meeting following the session with the union, JPC’s management heard
the following points made by the accountant:
a. The union’s figure of 71 units per day per person is correct. I checked it against the
latest Production Report. It works out like this:
Average weekly output for the year to date is 7100 units; thus, average daily output is 7100/5 = 1420 units/day.
The number of women directly employed on the line is 20, so that average daily
output is 1420/20 = 71 units/day/woman.
b. The union’s request amounts to an 11.1 per cent increase: (80 − 72)/72 × 100 = 11.1.
c. Direct labour at current rates is estimated at £26 million. Assuming an 11.1 per cent
increase across the board, which, of course, is what we have to anticipate, total
annual direct labour would increase by about £2.9 million: £26 000 000 × 11.1% =
£2 886 000.
Prior to the negotiations management had thought that 7 per cent would be a rea-
sonable offer, being approximately the rate at which productivity and inflation had been
increasing in recent years. Privately they had set 10 per cent as the upper limit to their

Quantitative Methods Edinburgh Business School 1/27


Module 1 / Introducing Statistics: Some Simple Uses and Misuses

final offer. At this level they felt some scheme should be introduced as an incentive to
better productivity, although they had not thought through the details of any such
scheme.
As a result of the union’s strategy, however, JPC’s negotiating team decided not to
hesitate any longer. Working late, they put together their ‘best’ package using the 10
per cent criterion. The main points of the plan were as follows:
a. Maintain the 72p per unit standard price but provide a bonus of 50p for each unit above a daily average of 61 units/person.
b. Since the average output per day per person is 71, this implies that on average 10
bonus units per person per day would be paid.
c. The projected weekly cost then is £5612:
(71 × 0.72) + (10 × 0.50) = 56.12
56.12 × 5 × 20 = £5612
d. The current weekly cost then is £5112:
71 × 0.72 × 5 × 20 = 5112
e. This amounts to an average increase of £500 per week, slightly under the 10 per
cent upper limit:
500/5112 × 100 = 9.78%
f. The plan offers the additional advantage that the average worker gets 10 bonus units
immediately, making the plan seem attractive.
g. Since the output does not vary much from week to week, and since the greatest
improvement should come from those who are currently below average, the largest
portion of any increase should come from units at the lower cost of 72p each. Those
currently above average probably cannot improve very much. To the extent that this
occurs, of course, there is a tendency to reduce the average cost below the 79p per
unit that would result if no change at all occurs:
5612/(71 × 5 × 20) = 79.0p
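The arithmetic of the offer is easy to reproduce (a short Python sketch of the negotiating team's own figures, added for illustration):

  units = 71                 # average units per person per day
  people, days = 20, 5
  price, bonus = 0.72, 0.50  # standard price and bonus per unit, in £
  bonus_units = units - 61   # on average 10 bonus units per person per day

  current = units * price * days * people                            # 5112.0
  projected = (units * price + bonus_units * bonus) * days * people  # 5612.0
  print(round((projected - current) / current * 100, 2))       # 9.78 (per cent)
  print(round(100 * projected / (units * days * people), 1))   # 79.0 (pence/unit)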
At this point management had to decide whether they should play all their cards at
once or whether they should stick to the original plan of a 7 per cent offer. Two further
issues had to be considered:
a. How good were the rates?
b. Could a productivity increase as suggested by the 9.8 per cent offer plan really be
anticipated?
Annie Smith, the forewoman, was called into the meeting, and she gave the following
information:
a. A few workers could improve their own average a little, but the rates were too tight
for any significant movement in the daily outputs.
b. This didn’t mean that everyone worked at the same level, but that individually they
were all close to their own maximum capabilities.
c. A number did average fewer than 61 units per day. Of the few who could show a
sustained improvement, most would be in this fewer-than-61 category.
This settled it. JPC decided to go into the meeting with their ‘best’ offer of 9.8 per
cent. Next day the offer was made. The union asked for time to consider it and the next
meeting was set for the following afternoon.
In the morning of the following day Annie Smith reported that her Production Per-
formance Report (see Table 1.4) was missing. She did not know who had taken it but
was pretty sure it was the union steward.


Table 1.4 Production performance report


Fiscal Week: 10   Cost Centre: 172   Foreperson: Smith

Employee Pay No.   Av. Daily Output this Week   Av. Daily Output Y-T-D
11                 98                           98
13                 88                           89
17                 72                           76
23                 44                           43
24                 52                           50
26                 79                           78
30                 77                           79
32                 52                           52
34                 96                           96
35                 86                           87
40                 67                           69
42                 64                           66
43                 95                           98
45                 86                           88
47                 50                           53
48                 42                           41
52                 43                           44
54                 45                           46
55                 94                           97
59                 68                           70
Avg.                                            71

AV. DAILY THIS WEEK – 1398
AV. DAILY YEAR-TO-DATE – 1420

The next meeting with the union lasted only a few minutes. A union official stated his
understanding of the offer and, after being assured that he had stated the details
correctly, he announced that the union approved the plan and intended to recommend
its acceptance to its membership. He also added that he expected this to serve as the
basis for settlement in the other units as usual and that the whole wage negotiations
could probably be completed in record time.
And that was that. Or was it? Some doubts remained in the minds of JPC’s negotiat-
ing team. Why had the union been so quick to agree? Why had the Production
Performance Report been stolen? While they were still puzzling over these questions,
Annie Smith phoned to say that the Production Performance Report had been returned.

1 In the hope of satisfying their curiosity, the negotiating team asked Annie to bring the
Report down to the office. Had any mistakes been made? Was JPC’s offer really 9.8 per cent? If not, what was the true offer?

Case Study 1.3: Newspaper Letters


The two attached letters appeared recently in a newspaper. In the first letter, Dr X
concludes that dentists should not give anaesthetics. In the second, Mr Y concludes that
dentists are the safest anaesthetists there are.
Danger in the Dental Chair
Sir– As a medically qualified anaesthetist responsible for a large number of
dental anaesthetics I read (17 June) with great distress and despair of the death
under an anaesthetic of Miss A.
It is a source of great concern to me that dentists are permitted to give anaes-
thetics. Any fool can give an intravenous injection, but considerable skill and
experience is needed to handle an emergency occurring in anaesthetics.
For anyone, however qualified, however competent, to give an anaesthetic with
no help whatsoever is an act of criminal folly; the BDA, BMA and all the medical
defence societies would agree with this.
I call upon everyone to boycott anaesthetics given by a dentist under any
circumstances.
Yours faithfully,
Dr X, Colchester, Essex.
A Dental Safety Record That Can’t Be Matched
Sir– Dr X’s feelings (Letters, 25 June) about the tragic death of Miss A will be
shared by many, and they do him credit; but they have also led him astray.
Miss A was not anaesthetised; she was heavily sedated with a combination and
dosage of drugs which produced a severe respiratory depression which the
practitioner was unable to reverse.
In calling for a ban upon the giving of general anaesthetics by dentists, Dr X is
on very unsafe ground. The possession of a medical degree does not of itself
confer immunity from stupidity or negligence; many other people would still be
alive if it did.
If Dr X consults the records produced by the Office of Population Censuses and
Surveys, he will find that, overall, more deaths associated with dental anaesthe-
sia occur when the anaesthetist is medically qualified than when he is a dentist.
Excluding the hospital service (where all anaesthetists are medically qualified but
where nearly 50 per cent of deaths occur), medically qualified anaesthetists give
36 per cent of the dental anaesthetics; they have 45 per cent of the associated
deaths. Not only a balance in favour of the dentist anaesthetist, but one which
shows that mischance can occur to anyone, however skilled.
Not even Dr X, I think, would claim that all the deaths which occurred with
medically qualified anaesthetists were due to misadventure, and all those which
occurred with dentists were negligence.
However, these figures should be put in their proper perspective. In general
dental practice and in the Community Dental Service, about 1.5 million anaes-
thetics are given each year. Over the last 15 years, deaths have averaged 4 a year. It is a safety record which cannot be matched by any other form of general
anaesthesia.
Yours faithfully,
Mr Y, (President-Elect) Society for the Advancement of Anaesthesia in Dentis-
try.
1 Comment upon the evidence and reasoning (as given in the letters) that lead to these
two conclusions.



Module 2

Basic Mathematics: School Mathematics Applied to Management
Contents
2.1 Introduction.............................................................................................2/1
2.2 Graphical Representation ......................................................................2/2
2.3 Manipulation of Equations .....................................................................2/8
2.4 Linear Functions .................................................................................. 2/11
2.5 Simultaneous Equations ...................................................................... 2/14
2.6 Exponential Functions ......................................................................... 2/18
Review Questions ........................................................................................... 2/25
Case Study 2.1: Algebraic Formulation ....................................................... 2/28
Case Study 2.2: CNX Armaments Co. ........................................................ 2/29
Case Study 2.3: Bonzo Corporation ............................................................. 2/29
Case Study 2.4: Woof Dog Food................................................................... 2/29

Prerequisite reading: None

Learning Objectives
This module describes some basic mathematics and associated notation. Some
management applications are described but the main purpose of the module is to lay
the mathematical foundations for later modules. It will be preferable to encounter
the shock of the mathematics at this stage rather than later when it might detract
from the management concepts under consideration. For the mathematically literate
the module will serve as a review; for those in a rush it could be omitted altogether.

2.1 Introduction
The quantitative courses people take at school, although usually entitled ‘mathemat-
ics’, probably cover several quantitative subjects, including algebra, geometry and
trigonometry. Some of these areas are useful as a background to numerical methods
in management. These include graphs, functions, simultaneous equations and
exponents. Usually, the mathematics is a precise means of expressing concepts and
techniques. A technique may not be complex, but the presence of mathematics,
especially notation, can cause difficulties and arouse fears.
Most of the mathematics met in a management course will be reviewed here.
Although its usefulness in management will be indicated, this is not the prime
purpose. Relevance is not an issue at this stage; the main objective is to deal with basic mathematical ideas now so that they do not interfere with comprehension of
more directly applicable techniques at a later stage.

2.2 Graphical Representation


Graphs are the major pictorial method of representing numbers. Changes in, for
instance, sales figures or financial measures over time are immediately apparent; the
relationship between variables, say supply and demand, are quickly evident. Graphs
are used in most areas of management.
The essence of graphical representation is the location of a point on a piece of
paper by specifying its coordinates. Like many good ideas, this is very simple. As an
example, consider a town map. An entry in the index might read:

Hampton Lane: p. 52, F2

[Figure: a page of a town map divided into rectangles by numbered columns (1, 2, 3, …) across the top and lettered rows down the side, with rectangle F2 highlighted]

Figure 2.1 Town map


Turning to page 52, we find that the map is divided into rectangles (see Fig-
ure 2.1). To find Hampton Lane, look along row F, and down column 2. Where row
and column meet, we have rectangle F2, somewhere within which those with acute
eyesight will be able to find the lane in question.


[Figure: graph showing the point (3,2) located 3 units along the x-axis and 2 units up the y-axis from the origin]

Figure 2.2 Coordinates


More generally, any point on a graph is located by two coordinates, which are the
horizontal and vertical distances of the point from a fixed point called the origin.
Figure 2.2 illustrates this for the point (3,2). This point is three units from the origin
in a horizontal direction (the horizontal scale is usually called the x-axis); it is two
units from the origin in a vertical direction (the vertical scale is usually called the y-
axis). The first coordinate is referred to as the x-value and the second coordinate as
the y-value. By convention, this order is always the same.
Logically developing the above, we find that the following facts emerge:
(a) The origin has coordinates (0,0).
(b) Anywhere along the y-axis, the x-value is 0. Similarly, along the x-axis, the y-
value is 0.
(c) The axes can be extended on the other side of the origin. In this case, either or
both coordinates take on negative values. When the scales are extended in all
four directions, the paper is divided into four quadrants (see Figure 2.3).


[Figure: the four quadrants of a graph. Top right: x-values positive, y-values positive; top left: x-values negative, y-values positive; bottom left: x-values negative, y-values negative; bottom right: x-values positive, y-values negative]

Figure 2.3 Quadrants


(d) There is no restriction to whole units; therefore any point on the paper has a
representation.
Figure 2.4 gives some examples of point representation.

[Figure: graph plotting the points (−7,4), (3,2), (6½,1), (−4,0), (4,0), (0,−3) and (−4,−3) on axes running from −8 to 8 (x) and −5 to 5 (y)]

Figure 2.4 Representing points


Graphical representation is not limited to the location of points. Relationships
can be represented on a graph. Consider the simple example of the relationship
between direct profit and volume sold. For a given price and cost of a product, the
direct profit will vary according to the numbers sold:
Direct profit = (Price − Cost) × Volume


This can be written more concisely with letters in place of the words. Suppose x
is the variable number of products sold, y is the variable direct profit, p the set price
and q the set cost per unit. Then:
y = (p − q)x (2.1)
The equation is given a label (Equation 2.1) so that it can be referred to later.
Note that multiplication can be shown in several different ways. For example,
Price (p) times Volume (x) can be written as:
px
p·x
(p)(x)

The multiplication sign (×) used in arithmetic tends not to be used with letters
because of possible confusion with the use of x as a variable.
The use of symbols to represent numbers is the dictionary definition of algebra.
It is intended not to confuse but to simplify. The symbols (as opposed to verbal
labels, e.g. ‘y’ instead of ‘Direct profit’) shorten the description of a complex
relationship; the symbols (as opposed to numbers, e.g. y instead of 2.1, 3.7, etc.)
allow the general properties of the variables to be investigated instead of particular
properties when the symbols take particular numerical values.
The relationship (Equation 2.1) above is an equation. Since price and cost are
fixed in this example, p and q are constants. Depending upon the quantities sold, x
and y may take on any of a range of values; therefore, they are variables. Once x is
known, y is automatically determined, so y is a function of x. Whenever the value
of a variable can be calculated given the values of other variables, it is said to be a
function of the other variables.
If the values of the constants are known, say p = 5, q = 3, then the equation
becomes:
y = 2x (2.2)
A graph can now be made of this function. The graph is the set of all points
satisfying Equation 2.2, i.e. all the points for which Equation 2.2 is true. By looking
at some of the points, the shape of the graph can be seen:
when x = 0, y = 2 times 0 = 0
when x = 1, y = 2 times 1 = 2
when x = 2, y = 2 times 2 = 4, etc.
Therefore, points (0,0), (1,2), (2,4), etc. all lie on this function. Joining together a
sample of such points shows the shape of this graph. This has been done in
Figure 2.5, which shows the graph of the function y = 2x.


[Figure: graph of the straight line y = 2x passing through the origin]

Figure 2.5 Linear function


This function is called a linear function since it is a straight line (or, more math-
ematically, it is linear because y and x are not raised to powers; there are no squared,
cubed, logarithmic, etc. terms). More complicated functions can also be graphed.
Figure 2.6 and Figure 2.7 give examples of two such functions.

[Figure: graph of y = x^2 − 2, a U-shaped curve through the points (−3,7), (−2,2), (−1,−1), (0,−2), (1,−1), (2,2) and (3,7)]

Figure 2.6 Squared function


[Figure: graph of y = x^3 + 3x^2 − 2 through the points (−3,−2), (−2,2), (−1,0), (0,−2), (1,2), (2,18) and (3,52)]

Figure 2.7 Cubed function


Functions are put into graphical form by plotting a selection of points and then
drawing the curve that goes through them. In Figure 2.6, x = −3 might be the
starting point. The corresponding y value is 7 (x = −3 is put into y = x^2 − 2 to give
the y value). The point (−3,7) therefore lies on the graph of the function. Trying
x = −2 gives y = 2. The point (−2,2) therefore lies on the graph.
More points are plotted until there are sufficient points to indicate the shape of
the graph. In this case, taking all whole number values for x between −3 and +3
gives sufficient points for the shape to become obvious.

x = −3 −2 −1 0 1 2 3
y= 7 2 −1 −2 −1 2 7
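The tabulation can equally be produced by a short program (a minimal Python sketch, added for illustration):

  for x in range(-3, 4):
      y = x**2 - 2
      print(x, y)
  # Output: the pairs (-3,7), (-2,2), (-1,-1), (0,-2), (1,-1), (2,2), (3,7),
  # which are the points plotted in Figure 2.6.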

The equation y = x^2 − 2 (Figure 2.6) might be used to represent the relationship


between some firm’s profit, y, and output, x (but only for x greater than 0). Profit is
at first negative; a breakeven point is reached at x = 1.4 (approximately); thereafter
profit increases rapidly as economies of scale come into play. Verifying that the
equation (y = x^2 − 2) could, in certain circumstances, represent the relationship
between profit and output requires techniques not covered so far.
In Figure 2.7 the same seven x values, from −3 to +3, are sufficient to show the
shape of y = x^3 + 3x^2 − 2. The choice of these seven points is arbitrary. Any
selection of points that are able to demonstrate the shape can be chosen. It is
important to make sure that a sufficient range of points is included. Had the point
x = −3, y = −2 been excluded, then one could have been led to believe that the
graph was a U shape instead of a ‘Z on its side’ shape.


In Figure 2.5 only two points need be plotted since a straight line is defined
completely by any two points lying on it. The number of points which require to be
plotted varies with the complexity of the function.
When we are working with functions, they are usually restricted to their algebraic
form. It is neater and more economical to use them in this form. They are generally
put in graphical form only for illustrative purposes. The behaviour of complex
equations is often difficult to imagine from the mathematical form itself.

2.3 Manipulation of Equations


A simple example of an equation has already been introduced. With more compli-
cated examples it is sometimes necessary to rearrange them. The aim in doing this
might be to simplify an expression (i.e. to shorten it by collecting like terms togeth-
er). This would make it easier to handle in an analysis. Or the aim might be to solve
the equation for a particular variable, say x, when the equation is rearranged in the
form x = ⋯, where the dots represent constants or other variables. For instance, in
economics, the extent to which the sales volumes of a product vary as its price
changes is known as its elasticity. This can be expressed approximately by an
equation:
Elasticity (E) = ((Q2 − Q1)/Q1) / ((P2 − P1)/P1)

where
Q1 is the quantity sold at price P1
Q2 is the quantity sold at price P2
Suppose the product currently sells at the price P1 and the quantity sold per
month is Q1. A new price is mooted. What is likely to be the quantity sold at this
price? If the elasticity is known (or can be estimated), then the equation can be
rearranged and solved for Q2, i.e. put in the form:
Q2 = function of E,Q1,P1,P2
The likely quantity sold (Q2) at the new price (P2) can then be calculated. But first
the equation would have to be rearranged.
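Anticipating the rules set out below, the rearrangement runs as follows (a worked sketch added here; the derivation does not appear in the original example):

  E = ((Q2 − Q1)/Q1) / ((P2 − P1)/P1)
  Multiply both sides by (P2 − P1)/P1:  E(P2 − P1)/P1 = (Q2 − Q1)/Q1
  Multiply both sides by Q1:  Q1·E(P2 − P1)/P1 = Q2 − Q1
  Add Q1 to both sides:  Q2 = Q1(1 + E(P2 − P1)/P1)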
The four rules by which equations can be rearranged are:
(a) Addition. If the same quantity is added to both sides of an equation, the
resulting equation is equivalent to the original equation.
Examples

1 Solve x − 1 = 2 x−1 = 2
Add 1 to both sides of the equation x−1+1 = 2+1
x = 3

2 Solve for x: x − 4 = y + 1 x−4 = y+1


Add 4 to both sides of the equation x−4+4 = y+1+4
x = y+5


(b) Subtraction. If the same quantity is subtracted from both sides of an equation,
the resulting equation is equivalent to the original.
Examples

1 Solve x + 4 = 14 x + 4 = 14
Subtract 4 from both sides of the equation x + 4 − 4 = 14 − 4
x = 10

2 Solve for y: y + x − 5 = 2 y+x−5 = 2


Add 5 to and subtract x from y+x−5+5−x = 2+5−x
both sides of the equation y = 7−x
NB: the addition and subtraction rules combine to give the familiar rule that any part of an
equation can be transferred to the other side of the equals sign as long as its sign is changed
(+ becomes −, − becomes +).
(c) Division. If each side of an equation is divided by the same number (but not
zero), the resulting equation is equivalent to the original.
Examples

1 Solve 8x = 72 8x = 72
Divide both sides by 8 8x/8 = 72/8
x = 9

2 Solve for x: 2y − 4x + 5 = 6x − 3y − 5
Add 5 and 3y to both sides 5y − 4x + 10 = 6x
Add 4x to both sides 5y + 10 = 10x
Divide by 10 y/2 + 1 = x

This illustrates that the solved variable can appear on either side of the equation.
(d) Multiplication. If both sides of an equation are multiplied by the same number,
except zero, the resulting equation is equivalent to the original.
Examples

1 Solve x/3 = 6 x/3 = 6
Multiply both sides by 3: x = 18


2 Solve (2y + 3)/(4 − y) = 1
Multiply both sides by (4 − y) 2y + 3 = 4 − y
Add y to both sides 3y + 3 = 4
Subtract 3 from both sides 3y = 1
Divide both sides by 3 y = 1/3
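Such manipulations can also be checked with a symbolic algebra package (a sketch using the sympy library, which is not part of the course text):

  from sympy import Eq, solve, symbols

  x, y = symbols('x y')
  # Division example 2 above: solve 2y - 4x + 5 = 6x - 3y - 5 for x
  print(solve(Eq(2*y - 4*x + 5, 6*x - 3*y - 5), x))  # [y/2 + 1]
  # Multiplication example 2 above: solve (2y + 3)/(4 - y) = 1 for y
  print(solve(Eq((2*y + 3)/(4 - y), 1), y))          # [1/3]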

2.3.1 Use of Brackets in Algebraic Expressions


Brackets indicate that the expression within the brackets should be treated as if it
were a single symbol. For example, 2(y − 4) means that both the y and the 4 should
be multiplied by 2:
2(y – 4) = 2 times y – 2 times 4
= 2y − 8
When multiplying two brackets, everything in one bracket is multiplied by every-
thing in the other.

For example: (x + 3)(y + 4) = x times y


+ x times 4
+ 3 times y
+ 3 times 4
= xy + 4x + 3y + 12

In carrying out multiplications, remember that a positive multiplied by a negative


is negative, and a negative multiplied by a negative is positive.

For example: (x − 3)(y − 4) = x times y


− x times − 4
− 3 times y
− 3 times − 4
= xy − 4x − 3y + 12

Example

Simplify (y + 3)/(4 − 3y) = 4/(y − 1)

Multiply both sides by (4 − 3y) and (y − 1) (y + 3)(y − 1) = 4(4 − 3y)
Multiply out the brackets y^2 − y + 3y − 3 = 16 − 12y
Add 12y, subtract 16 from both sides y^2 + 14y − 19 = 0


2.4 Linear Functions


Linear functions are of great importance in management studies. Not only do they
describe many relationships in their own right, but also, because of their simplicity
and ease of use, more complex relationships can sometimes be approximated by the
linear equation. For instance, it might be possible for the total cost of processing
particular benefits payments to be expressed as a linear function of the number of
applications. The equation could then be used, along with others of a similar nature,
in compiling a budget for the department concerned. As another example, one of
the variables might be time. A linear function can then be used to represent growth
or decay. For instance, if sales volume were expressed as a linear function of time,
the equation would represent the increase or decrease in sales over time.
A linear function of x is one in which only constants and multiples of x appear. There are no terms such as x^2, x^5, 1/x, etc. If y is a linear function of x, then the equation must have the form:
y = mx + c (2.3)
where m and c are constants.
Equation 2.3 is the general form of a linear equation with two variables; m and c
are merely labels – any other letters would be just as satisfactory. Thus m is the
coefficient of x, the name given to the constant by which a variable is multiplied. In
the benefit payments example, c is the fixed overhead, m the variable cost per
application and x the number of applications; mx + c is then the total cost (= y).
An alternative definition is that the graphical representation of a linear function
must be a straight line, for at all points a change of 1 unit in x gives rise to a change
of m units in y. The following are all straight lines:
y = 2x + 1
y = 3x
y = 4 − 2x

2.4.1 Interpretation of m and c


Figure 2.8 is the graph of y = 2x + 1. It was obtained, as previously, by plotting two
of the points. In this example m = 2 and c = 1.
The value of c is the intercept of the line (i.e. the point on the y-axis where the
line crosses it). This can be seen either from the graph or by putting x = 0 in the
equation.
The value of m is the slope of the line. Alternatively, the slope is referred to as
the gradient. In either case, what is meant is the usual idea of gradient – the ratio
between the distance moved vertically and the distance moved horizontally.


[Figure: graph of y = 2x + 1 showing two points A (x1,y1) and B (x2,y2) on the line; the slope between them is (y1 − y2)/(x1 − x2); the line cuts the y-axis at (0,1) and the x-axis at (−1/2,0)]

Figure 2.8 Slope of a line


Figure 2.8 shows two points on the line, A and B, whose coordinates are (x1,y1)
and (x2,y2) respectively. In algebra, subscripts are used to denote unspecified but
fixed values of a variable (e.g. x1 and x2 are the values of x at the two unspecified
points A and B). The slope between A and B is given by:
Slope AB = (y1 − y2)/(x1 − x2)
Since y1 = mx1 + c and y2 = mx2 + c (A and B are on the line):
Slope AB = (mx1 + c − mx2 − c)/(x1 − x2) = m(x1 − x2)/(x1 − x2) = m

The same reasoning applies to any two points along the line, confirming the
obvious fact that the slope (and therefore m) is constant along a straight line.
If A and B are two particular points (2,5) and (1,3) (i.e. x1 = 2, y1 = 5, x2 = 1, y2 =
3) then:
Slope AB = (5 − 3)/(2 − 1) = 2
For example, if the sales volume of a company were expressed as a linear func-
tion of time, y would be sales and x would be time (x = 1 for the first time period,
x = 2 for the second time period and so on). Then m would be the constant change
in sales volume from one time period to the next. If m = 3, then sales volume would
be increasing by 3 each time period.
A few additional facts are worthy of note.
(a) It is possible for m to be negative. If this is the case, the line leans in a backward
direction, since as x increases, y decreases. This is illustrated in Figure 2.9.
(b) It is possible for m to take the value 0. The equation of the line is then
y = constant, and the line is parallel to the x axis.


(c) Similarly, the line x = constant is parallel to the y axis and the slope can be
regarded as being infinite. The last two lines are examples of constant functions
and are also shown in Figure 2.9.

[Figure: graph showing three lines: y = −x + 3 leaning backwards, y = 1 parallel to the x-axis, and x = −2 parallel to the y-axis]

Figure 2.9 Examples of linear functions


The question of how the equation of a particular straight line can be found is
now tackled. The equation is known as soon as the values for m and c are found.
This is possible if any of the following sets of information is available:
(a) The values of m and c.
(b) The value of m and the coordinates of any point on the line.
(c) The value of c and the coordinates of any point on the line.
(d) The coordinates of any two points on the line.
Intuitively, this makes sense, since given any one of the set (a)–(d) one could
draw a graph. In the case of information set (a), the equation is found trivially (e.g. if
m = 2 and c = 4, then the straight line is y = 2x + 4). In cases (b) and (c), one has an
equation with either m or c unknown. The coordinates of the known point are
substituted into the equation, which is solved for the unknown constant (see below
for examples). In case (d), the slope is found as follows:
Slope = (y1 − y2)/(x1 − x2) where the points are (x1,y1), (x2,y2)
Then the calculation is carried out as for case (b).
Examples
1. What is the equation of the line with slope 2 which passes through the point (3,4)?
Any straight line has the form y = mx + c. Since slope = m = 2, the equation must
be:
y = 2x + c
Since it passes through (3,4):
4=6+c
c = −2
The line is y = 2x − 2.


2. What is the equation of the line with intercept −3 and which passes through the
point (1,1)?
Since the intercept is −3, the line is:
y = mx − 3
Since it passes through (1,1):
1=m–3
m=4
The line is y = 4x – 3.
3. What is the equation of the line passing through the points (3,1) and (1,5)?
The slope between these two points is:
Slope = (1 − 5)/(3 − 1) = −4/2 = −2
Therefore, the line must be y = −2x + c.
Since it passes through (3,1) (NB (1,5) could just as easily be used):
1 = −6 + c
c=7
The line is y = −2x + 7.
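The procedure of case (d) can be wrapped in a small function (a minimal Python sketch, added for illustration):

  def line_through(p1, p2):
      # Return slope m and intercept c of the line y = mx + c through two points.
      (x1, y1), (x2, y2) = p1, p2
      m = (y1 - y2) / (x1 - x2)
      c = y1 - m * x1
      return m, c

  print(line_through((3, 1), (1, 5)))  # (-2.0, 7.0), i.e. y = -2x + 7 as in Example 3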

2.5 Simultaneous Equations


Relationships between variables can be described by functions. In particular, a linear
equation can represent the linear relationship between two variables. There are,
however, situations that are described by several equations.
For example, in microeconomics the price and production level of a product can
be written as two equations. First, a relationship between price and quantity will
show the amount consumers will demand at a given price level. Second, a further
relationship between price and quantity will show the quantity a supplier will be
willing to supply at a given price level. Economic theory says that there is an
equilibrium point at which a single price and quantity will satisfy both equations (the
demand curve and the supply curve).
This is the type of problem to be investigated. How can values for variables be
found that satisfy simultaneously more than one equation?
Suppose there are two equations in the two variables x and y as below:
3x + 2y = 18 (2.4)
x + 4y = 16 (2.5)
What are the values of x and y that satisfy both these equations? Do such values
exist at all? Or are there several of them? For example, can a single price and a single
quantity be found that is correct for both the supply and demand equations?


2.5.1 Types of Solution


If the equations are plotted on a graph, it is easier to understand the solution. The
lines are best plotted by determining the points at which they cross the two axes.

For line (2.4): when x = 0, y = 9


when y = 0, x = 6
For line (2.5): when x = 0, y = 4
when y = 0, x = 16

The values of x and y that satisfy both equations are found from the point of
intersection of the lines (see Figure 2.10). Since this point is on both lines, the x and
y values here must satisfy both equations. From the graph these values can be read:
x = 4, y = 3. That these values do fit both equations can be checked by substituting
x = 4, y = 3 into the equations of the lines.

[Figure: graph of Line (2.4) and Line (2.5) crossing at the point x = 4, y = 3]

Figure 2.10 Solving simultaneous equations


In this example there is one and only one answer to the problem. There is said to
be a unique solution to the problem.
On the other hand, suppose the two equations are as below:
2x + 3y = 12
2x + 3y = 24
In this case the two equations are inconsistent. The left-hand side of both equa-
tions is the same: 2x + 3y. It is not possible for this to equal simultaneously 12 and
24. They can be plotted as in Figure 2.11. The two lines are parallel and have
therefore no point of intersection and no common solution.


[Figure: graph of the parallel lines 2x + 3y = 12 and 2x + 3y = 24, which never intersect]

Figure 2.11 Inconsistent equations


There is one other possibility that can arise. Suppose the equations are of the
form:
x + 3y = 15
4x + 12y = 60
If these lines are plotted, it is found that they are exactly the same line. The sec-
ond line is four times the first. Dividing through by four gives the same equation.
Any point lying anywhere along the two coincident lines satisfies both equations.
There is an infinite number of solutions and the equations are said to be depend-
ent.
To recap, solving two linear equations in two variables means finding a point
whose coordinates satisfy both equations. One of three outcomes is possible:
(a) There is one common point (i.e. a unique solution).
(b) There is an infinite number of solutions (i.e. the equations are dependent).
(c) There is no solution (i.e. the equations are inconsistent).

2.5.2 Algebraic Solution


In the preceding examples, the solutions, if any, to the equations were found
graphically. The solutions can also be found by mathematical means. The method is
given in Table 2.1.


Table 2.1 Algebraic solution to simultaneous equations


3x + 2y = 18
x + 4y = 16

1 Multiply one or both of the equations by appropriate numbers so that the coefficients of x (or y) are the same in both equations. Here, multiply the second equation by 3 and leave the first equation; the coefficient of x is now 3 in both equations:
3x + 2y = 18
3x + 12y = 48
2 Add or subtract the equations so that x (or y) disappears, leaving one equation in y (or x). Here, subtract the first from the second:
10y = 30
3 The remaining equation gives the value of y (or x):
y = 3
4 Substitute this answer in either of the original equations to provide the other half of the solution. Put y = 3 in:
3x + 2y = 18
3x + 6 = 18
3x = 12
x = 4
The answer is x = 4, y = 3

Example
Solve the two simultaneous equations:

5x + 2y = 17 (2.6)
2x − 3y = 3 (2.7)
Multiply Equation 2.6 by 3 and Equation 2.7 by 2 so that the coefficients of y are the
same in both equations. (Equation 2.6 could just as well have been multiplied by 2 and
Equation 2.7 by 5 and then x eliminated.)
15x + 6y = 51
4x − 6y = 6
Add the two equations to eliminate y:
19x = 57
x=3
Substitute x = 3 in Equation 2.7 to find the y value:
6 − 3y = 3
3y = 3
y=1


The solution is x = 3, y = 1.
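For larger systems the same elimination is normally left to a computer (a sketch using the numpy library, added for illustration):

  import numpy as np

  # Coefficients of 5x + 2y = 17 and 2x - 3y = 3 (Equations 2.6 and 2.7)
  A = np.array([[5.0, 2.0],
                [2.0, -3.0]])
  b = np.array([17.0, 3.0])
  print(np.linalg.solve(A, b))  # [3. 1.], i.e. x = 3, y = 1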

2.6 Exponential Functions


Exponential functions are important because of the way they express growth or
decay (e.g. the increase or decrease in a variable over a period of time). Usually they
represent a ‘natural’ pattern of growth or decay.
Think of this in terms of the sales of a new product. Suppose y is the level of
sales and x is time. Under a linear growth function each month shows the same
constant increase in the number (of bottles, packets, boxes, etc.) sold. Under an
exponential growth function each month shows a different sales increase, which is a
constant proportion of the sales level at the start of the month. A linear equation
represents a constant increase each month; an exponential equation represents a
constant percentage increase each month. There are many occasions when exponen-
tial growth is the more realistic representation. Note that exponential functions can
model decreases as well as increases.
An example of exponential growth is that of the growth of money in a bank
deposit account. Suppose £1000 is put into the account, for which the interest rate
is 10 per cent.
At the end of the first year, the balance will have grown to £1000 plus 10 per
cent (i.e. £1100).
At the end of the second year, the balance will have grown to £1100 plus 10 per
cent (i.e. £1210).
At the end of the third year, the balance will have grown to £1210 plus 10 per
cent (i.e. £1331).
And so on …
It is the fact that the interest is given on the total balance at year end (£1100,
£1210, etc.) and not just on the original deposit (£1000) that brings about exponen-
tial growth. The former results in exponential growth, the latter in linear growth.
Figure 2.12 is a graph showing how the balance increases. It shows a shallow but
curved rise. Had the interest rate been greater then the curve would have been
steeper.


[Figure: graph of Balance (£) against Time (years): the balance climbs in a shallow curve from 1000 through 1100, 1210 and 1331 at the ends of years 1, 2 and 3]

Figure 2.12 Bank deposit account


An alternative, more mathematical way of representing the growth of the balance
is as follows:
At the end of year 1: £1000 × 1.10 = £1100
At the end of year 2: £1100 × 1.10 = £1000 × (1.10)^2 = £1210
At the end of year 3: £1210 × 1.10 = £1000 × (1.10)^3 = £1331
At the end of year n: £1000 × (1.10)^n
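The year-by-year calculation translates directly into a loop (a minimal Python sketch, added for illustration):

  balance = 1000.0
  rate = 0.10
  for year in range(1, 4):
      balance *= 1 + rate
      print(year, round(balance, 2))  # 1100.0, 1210.0, 1331.0
  # Equivalently, the balance after n years is 1000 * 1.10**n.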
Exponential or growth functions therefore hinge on expressions of the type (1.10)^5, 2^3, 10^4, etc. In business forecasting, equations representing exponential growth are used. To understand how they work it is also necessary to consider the algebraic form of such expressions, for example, a^x. The properties of these expressions are explained in the next section.

2.6.1 Exponents
Consider an expression of the form a^x. The base is a and the exponent is x. If x is a whole number, then the expression has an obvious meaning (e.g. a^2 = a × a, a^3 = a × a × a, 3^4 = 81, etc.). It also has meaning for values of x that are not whole numbers. To see what this meaning is, it is necessary to look at the rules for working with exponents.
(a) Multiplication. The rule is:
a^x × a^y = a^(x+y)
For example:
a^2 × a^3 = a^5
a^4 × a^2 = a^6
It can be seen that this makes good sense if one substitutes whole numbers for a,
x and y. For instance:
2^2 × 2^3 = 4 × 8
= 32
= 2^5


Note that the exponents can only be added if the bases are the same. For example, a^3 × b^2 cannot be simplified.
(b) Division. The rule is similar to that for multiplication:
a^x / a^y = a^(x−y)
For example:
a^5 / a^2 = a^(5−2) = a^3
Again, the reasonableness of the rule is confirmed by resorting to a specific numerical example.
(c) Raising to a power. The rule is:
(a^x)^y = a^(xy)
For example:
(a^2)^3 = a^6
Several points of detail follow from the rules:
(a) a^0 = 1 since 1 = a^x / a^x = a^(x−x) = a^0
(b) a^(−x) = 1/a^x since 1/a^x = a^0 / a^x = a^(0−x) = a^(−x)
(c) a^(1/2) = √a since a^(1/2) × a^(1/2) = a^(1/2 + 1/2) = a^1 = a; similarly a^(1/3) = ∛a
This last point demonstrates that fractional or decimal exponents do have meaning.
Examples
1. (a^3 × a^4)/a^2
= (a^(3+4))/a^2
= a^7/a^2
= a^(7−2)
= a^5
2. Evaluate: 27^(4/3)
= (27^(1/3))^4
= (∛27)^4
= 3^4
= 81
3. Evaluate: 4^(−3/2)
= 1/4^(3/2)
= 1/(√4)^3
= 1/2^3
= 1/8
4. Evaluate: (2^2)^3
= 2^6
= 64
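The rules and the examples above are easy to confirm numerically (a Python sketch, added for illustration; the fractional powers are evaluated in floating point, so those results are approximate):

  print(2**2 * 2**3 == 2**5)  # True: a^x × a^y = a^(x+y)
  print(2**5 / 2**2 == 2**3)  # True: a^x / a^y = a^(x−y)
  print((2**2)**3 == 2**6)    # True: (a^x)^y = a^(xy)
  print(27**(4/3))            # 81.0 (approximately)
  print(4**(-3/2))            # 0.125, i.e. 1/8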


2.6.2 Logarithms
In pursuing the objective of understanding exponential functions, it is also helpful
to look at logarithms. At school, logarithms are used for multiplying and dividing
large numbers, but this is not the purpose here. A logarithm is simply an exponent.
For example, if y = a^x then x is said to be the logarithm of y to the base a. This is written as logₐy = x.
Examples
1. 1000 = 10^3 and therefore the logarithm of 1000 to the base 10 is 3 (i.e. 3 = log₁₀1000). Logarithms to the base 10 are known as common logarithms.
2. 8 = 2^3 and therefore the logarithm of 8 to the base 2 is 3 (i.e. 3 = log₂8). Logarithms to the base 2 are binary logarithms.
3. e is a constant frequently found in mathematics (just as π is). e has the value 2.718
approximately. Logarithms to the base e are called natural logarithms and are writ-
ten ln (i.e. x = lny means x = logey).
e has other properties which make it of interest in mathematics.
The rules for manipulation of logarithms follow from the rules for exponents:
(a) Addition: log x + log y = log xy
(b) Subtraction: log x − log y = log(x/y)
(c) Multiplication by a constant: n log x = log x^n
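Again, the rules can be verified numerically (a sketch using Python's math module, added for illustration):

  import math

  x, y, n = 100.0, 0.75, 3
  print(math.isclose(math.log10(x) + math.log10(y), math.log10(x * y)))  # True
  print(math.isclose(math.log10(x) - math.log10(y), math.log10(x / y)))  # True
  print(math.isclose(n * math.log10(x), math.log10(x**n)))               # True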

2.6.3 Exponential Functions


The general form of the exponential function is:
y = k·a^(cx)
In the bank account example of Figure 2.12:
y is the balance at the end of x years
k is the initial deposit
a is (1 + interest rate), and c = 1
So, at the end of three years, the balance is given by the equation:
Balance = 1000 × (1.10)^3
More generally, putting x = 0 into the equation reveals that k is the intercept.
The exponent constant c is more interesting. If it is positive, then the relationship is
one of growth; that is, as x increases, y also increases. If it is negative, then the
relationship is one of decay; that is, as x increases, y decreases. There is no easy
interpretation of c as a rate of growth, unlike the linear case where m is the constant
increase in each time period. Figure 2.13 gives examples of exponential functions of
growth and decay.
Recall the property of the exponential function that makes it applicable in certain
circumstances. If the relationship is one of growth, then, if x increases by a constant
amount, y increases by a constant percentage of itself (e.g. in Figure 2.13 three
points are (1,4), (2,8), (3,16)); in each case x increases by 1 and y increases by 100
per cent. Similarly, for decay functions, if x increases by a constant amount, then y
decreases by a constant percentage of itself. Contrast this with a linear function where y increases (or decreases) by the same constant amount (= m, in y = mx + c) each time x increases by 1.

[Figure: (a) graph of the growth function y = 2 × 2^x, rising steeply with intercept k = 2; (b) graph of the decay function y = 5 × 2^(−x), falling towards zero as x increases]

Figure 2.13 (a) Growth; (b) Decay


Example
The occurrence of a particular illness has been at a constant level of 100 new cases/year
for some time. A new treatment has been developed and it is thought that the number
of new cases will now decrease at a constant 25 per cent per annum. What is the
exponential function that describes this pattern?
A graph of the function is given in Figure 2.14. The general form of the exponential
function is:
y = k·a^(cx)

[Figure: graph of new cases against time, an exponential decay curve starting at the initial level of 100 new cases per year]

Figure 2.14 Decreasing incidence of illness


The base to work with can be chosen. The base 10 is selected here, but any other base
would do as well.
y = k·10^(cx)
Initially (at the beginning of the period, when x = 0), the level of new cases is 100.
Therefore:
100 = k·10^0 = k
So y = 100·10^(cx)
After the first year, the new cases have dropped by 25 per cent to 75.
75 = 100·10^c
10^c = 0.75
At this point, tables or a calculator have to be consulted. Logarithmic tables show the
logarithm of a number, given the number. In this case, the number is 0.75 and we need
to know its logarithm c. The tables reveal that:
c = −0.125
The exponential function relating y and x is therefore:
y = 100·10^(−0.125x)
Figure 2.14 is a plot of this function. In this situation, estimates were being made for the
future based on judgements of a subjective nature. The judgements were (1) an
exponential function was the correct expression; (2) the rate of decrease was 25 per
cent. The equation gives anticipated numbers of new cases at future points in time.
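Nowadays the constant c would be found with a calculator or a line of code rather than tables (a Python sketch reproducing the example, added for illustration):

  import math

  c = math.log10(0.75)
  print(round(c, 4))  # -0.1249, i.e. approximately -0.125
  for x in range(5):
      print(x, round(100 * 10**(c * x)))  # 100, 75, 56, 42, 32 new cases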
In other circumstances, the problem might have been the other way round. Historical
records of the numbers of cases might have been available and the task would then have
been to find the function that best described what had happened. After the data were
plotted on a graph there would be two problems. First, it would have to be judged
whether the shape of the pattern of points was most like a straight line or an exponen-
tial curve or one of the many other types of functions that could be used. This
judgement would be based on knowledge of the general shapes of these functions. The
second problem would be that, even if a straight line, say, were judged to be the most
likely shape, not all the points would fall precisely on a straight line (see Figure 2.15).
The second problem is, then, to estimate which straight line best fits the data points. A statistical technique, called regression analysis, is the usual way of dealing with this
second problem. (Regression analysis is a topic covered in Module 11 and Module 12.)

Figure 2.15 Example of regression line

2.6.4 Relationship between Linear and Exponential Functions


The exponential function is:
y = k·a^(cx)
Take logarithms to the base a on either side:
logₐy = logₐ(k·a^(cx))
Apply two of the logarithmic rules (see Section 2.6.2) to the right-hand side:
logₐy = logₐk + logₐ(a^(cx)) (Addition)
= logₐk + cx·logₐa (Multiplication by a constant)
Since logₐa = 1:
logₐy = constant + cx (2.8)
Therefore, the relationship between logₐy and x is a linear one. (Compare Equation 2.8 with the equation of a straight line.) If y and x are related by an exponential function, then logₐy and x are related by a linear function. In other words, by means
of a transformation (taking logarithms of both sides) an exponential function can
be treated as a linear one. (Transformations will play an important role later in
regression analysis.)


Review Questions
2.1 Which point on the graph shown below is (−1,2)?

(Graph: four points labelled A, B, C and D plotted on x–y axes, with x running from −2 to 4.)

A. Point A
B. Point B
C. Point C
D. Point D

2.2 What is the equation of the line shown on the graph below?

(Graph: a straight line plotted on x–y axes, with x running from 0 to 3.)

A. y = x + 1
B. y = 1 − x
C. y = −x − 1
D. y = x − 1


2.3 Which of the ‘curves’ shown below is most likely to have the equation y = x^2 − 6x + 4?

(Graph: three curves labelled A, B and C plotted on axes, with x from 0 to 6 and y up to 6.)

A. A
B. B
C. C

2.4 Solve the following for y: 6x + 4 = 2y − 4


Which answer is correct?
A. y = 3x
B. y = (6x + 4)/2
C. y = 3x + 4
D. y = 3x + 8

2.5 Solve for y: =


Which answer is correct?
A. = 10 − 5
B. = 10
C. + 2 − 10 = 0
D. =2

2.6 What is the equation of the line with intercept 3 that goes through point (3,9)?
A. y = 3x + 2
B. y = 6x + 3
C. y = 4x + 3
D. y = 2x + 3

2.7 What is the equation of the line that goes through the points (−1,6) and (3,−2)?
A. y = 2x + 8
B. y = 4 − 2x
C. y = −x + 5
D. y = 2x − 8


2.8 Solve the two simultaneous equations

4x + y = 5
2x − y = 7

Which answer is correct?
A. x = −1, y = 9
B. x = 1, y = 1
C. x = −1, y = −9
D. x = 2, y = −3

2.9 Solve the two simultaneous equations

2x + 7y = 3
3x − 2y = 17

Which answer is correct?
A. x = …, y = …
B. x = 5, y = −1
C. x = −2, y = 1
D. x = −1, y = 5

2.10 What is (16)^(−3/2)?

A. −64
B. 1/64
C. 1/24
D. −1/64

2.11 What is log₂8?

A. 0.9031
B. 1/3
C. 3
D. 256


2.12 What is the equation of the curve shown in the graph below?

(Graph: a curve rising from y = 10 at x = 0, with y reaching about 30 by x = 2.)

A. y = x + 10
B. y = 10 · 10^(…x)
C. y = 10 · 10^(…x)
D. y = 100 · 10^(…x)

Case Study 2.1: Algebraic Formulation


1 This ‘case’ consists of a series of formulations (i.e. exercises in turning words into
algebra).
a. A newspaper boy wants to calculate his daily earnings E. Each paper costs him e
pence while he sells a paper for b pence.
Write down the equation relating his earnings to the number of papers he sells per
day, called x.
What are the constants of the equation?
b. In an accounts department, a clerk makes out a bill for a customer by first writing
the customer’s name and address and then, underneath, the details of each of the
customer’s orders. A time and motion study shows that it takes the clerk two
minutes to write the name and address and one minute to write out the details of
each of the customer’s orders.
Find an equation that gives the time it takes for the clerk to make out a bill in terms
of the number of orders that are to be included in the bill.
c. A car salesman has a basic salary of £10 000 a year. In addition, he earns a commis-
sion of £400 for every new car he sells, and £200 for every second-hand car.
What is his total annual salary, as a function of the numbers of new and used cars he
sells during the year?
d. A firm charges a flat rate of £50 per week for the hire of a car. In addition, there is a
surcharge of 9p per mile up to 1000 miles. The surcharge does not apply over 1000
miles.
For hires lasting no more than one week, express this in graphical and algebraic
form.

2/28 Edinburgh Business School Quantitative Methods


Module 2 / Basic Mathematics: School Mathematics Applied to Management

Case Study 2.2: CNX Armaments Co.


The CNX armaments company has developed two new weapons systems. The costs
and prices associated with each system are as below (in £000):

              Fixed cost    Variable cost    Price
System 1          100              4             5
System 2         1200              4             8

1 What is the breakeven point for each system (i.e. how many of each system need to be
sold so that revenue equals cost)?
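
As a hint on method (with invented figures, so as not to pre-empt the case): since revenue equals cost when pq = F + vq, the breakeven volume is q = F/(p − v). A minimal Python sketch:

    def breakeven(fixed_cost, variable_cost, price):
        # volume q at which price * q = fixed_cost + variable_cost * q
        return fixed_cost / (price - variable_cost)

    # invented example: fixed cost 500, variable cost 2, price 4
    print(breakeven(500, 2, 4))   # 250.0 units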

Case Study 2.3: Bonzo Corporation


The pet food made by the Bonzo Corporation contains 30 per cent meat and 70 per
cent wheatmeal, whereas the product of the Woof Corporation contains 40 per cent
meat and 60 per cent wheatmeal.
The manager of a large dogs’ home has been told that each dog in his care requires
exactly 6 oz of meat and 10 oz of wheatmeal each day (no more, no less). Express this
in two equations concerning the amounts of the two pet foods each dog should eat.

1 How much of each food should each dog eat?
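
Pairs of simultaneous equations such as these can also be solved numerically. A minimal sketch using the numpy library, with invented coefficients rather than those of the case:

    import numpy as np

    # Solve 2x + 3y = 12 and x - y = 1
    A = np.array([[2.0, 3.0],
                  [1.0, -1.0]])   # coefficient matrix
    b = np.array([12.0, 1.0])     # right-hand sides

    x, y = np.linalg.solve(A, b)
    print(x, y)                   # 3.0 2.0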

Case Study 2.4: Woof Dog Food


1 Sales of Woof dog food have increased exponentially (i.e. according to an equation of the form y = ke^(cx)) since six years ago, when the annual sales were 10 000 cases. Last year the sales were 40 000 cases. What is your estimate of the sales this year? (NB: in the absence of tables or a calculator, it is necessary to know: logₑ4 = 1.386 and e^(1.662) = 5.27.)
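
As a hint on method (with invented figures, so as not to give away the case): the two constants of y = ke^(cx) can be recovered from two observations. A minimal Python sketch:

    import math

    # invented observations: y = 5000 when x = 0, y = 20000 when x = 4
    k = 5000                           # value of y at x = 0
    c = math.log(20000 / 5000) / 4     # from y/k = e^(cx), c = ln(y/k) / x
    print(round(c, 4))                 # 0.3466

    print(round(k * math.exp(c * 5)))  # forecast for x = 5: about 28284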



PART 2

Handling Numbers
Module 3 Data Communication
Module 4 Data Analysis
Module 5 Summary Measures
Module 6 Sampling Methods



Module 3

Data Communication
Contents
3.1 Introduction.............................................................................................3/1
3.2 Rules for Data Presentation ..................................................................3/3
3.3 The Special Case of Accounting Data ............................................... 3/12
3.4 Communicating Data through Graphs.............................................. 3/16
Learning Summary ......................................................................................... 3/21
Review Questions ........................................................................................... 3/22
Case Study 3.1: Local Government Performance Measures ..................... 3/24
Case Study 3.2: Multinational Company’s Income Statement.................. 3/25
Case Study 3.3: Country GDPs ..................................................................... 3/25
Case Study 3.4: Energy Efficiency ................................................................. 3/26

Prerequisite reading: None

Learning Objectives
By the end of the module the reader should know how to improve data presenta-
tion. This is important both in communicating data to others and in analysing data.
The emphasis is on the visual aspects of data presentation. Special reference is made
to accounting data and graphs.

3.1 Introduction
Data communication means the transmission of information through the medium
of numbers. Its reputation is mixed. Sometimes it is thought to be done dishonestly
(‘There are lies, damned lies and statistics’); at other times it is thought to be done
confusingly so that the numbers appear incomprehensible and any real information
is obscured. Thus far, numbers and words are similar. Words can also mislead and
confuse. The difference seems to be that numbers are treated with less tolerance and
are quickly abandoned as a lost cause. More effort is made with words. One hears,
for instance, of campaigns for the plain and efficient use of words by bureaucrats,
lawmakers, etc., but not for the plain and efficient use of numbers by statisticians,
computer scientists and accountants. Furthermore, while experts spend much time
devising advanced numerical techniques, little effort is put into methods for better
data communication.
This module attempts to redress the balance by looking closely at the question of
data communication. In management, numbers are usually produced in the form of


tables or graphs. The role of both these modes of presentation will be discussed.
The case of accounting data will be given separate treatment.
It may seem facile to say that data should be presented in a form that is suitable
for the receiver rather than convenient for the producer. Yet computerised man-
agement data often relate more to the capabilities of the computer than to the needs
of the managers – many times accounting information seems to presuppose that all
the receivers are accountants; statistics frequently can only be understood by the
highly numerate. A producer of data should have the users at the forefront of his or
her mind, and should also not assume that the receiver has a similar technical
background to him- or herself.
In the context that the requirements of the users of data are paramount, the aim
now is to show how data might be presented better. How is ‘better’ to be defined?
A manager meets data in just a few general situations:
(a) Business reports. The data are usually the supporting evidence for conclusions
or suggestions made verbally in the text.
(b) Management information systems. Large amounts of data are available on
screen or delivered to the manager at regular intervals, usually in the form of
computer printouts.
(c) Accounting data. Primarily, for a manager, these will indicate the major
financial features of an organisation. The financial analyst will have more detailed
requirements.
(d) Self-generated data. The manager may wish to analyse his own data: sales
figures, delivery performance, invoice payments, etc.
In all these situations speed is essential. A manager is unlikely to have the time to
carry out a detailed analysis of every set of data that crosses his or her desk. The
data should be communicated in such a way that its features are immediately
obvious. Moreover, the features should be the main ones rather than the points of
detail. These requirements suggest that the criterion that distinguishes well-
presented data should be: ‘The main patterns and exceptions in the data should be
immediately evident.’
The achievement of this objective is made easier since, in all the situations above,
the manager will normally be able to anticipate the patterns in the data. In the first
case above, the pattern will have been described in the text; in the other three, it is
unlikely that the manager will be dealing with raw data in a totally new set of
circumstances and he or she will therefore have some idea of what to expect. The
methods of improving data presentation are put forward with this criterion in mind.
In looking at this subject it must be stressed that it is not just communicating
data to others that is important. Communicating data to oneself is a step in coming
to understand them. The role of data communication in analysis is perhaps the most
valuable function of the ideas proposed here. Important results in the sciences,
medicine and economics are not usually discovered through the application of a
sophisticated computerised technique. They are more likely to be discovered
because someone has noticed an interesting regularity or irregularity in a small
amount of data. Such features are more likely to be noticed when the data are well
presented. Sophistication may be introduced later when one is trying to verify the


result rigorously, but this should not be confused with the original analysis. In short,
simple, often visual, methods of understanding numbers are highly important. The
role of data communication as part of the analysis of data will be explored in the
next module.

3.2 Rules for Data Presentation


There are seven rules for presentation that should improve the communication of
any information contained in the data. Not all can be applied to any one set of data.
Judgement has to be used in deciding which may be suitable. The rules are based on
psychological research into the way the human mind handles information. If the
mind naturally deals with information in a particular format then this should be
taken into account when presenting information.

3.2.1 Rule 1: Round the Numbers to Two Effective Figures


It has been shown experimentally that, when trying to understand numbers, most
people round them mentally. This brings them to a size that the mind can manipu-
late more easily. In practice, the rounding is to two figures. For instance, someone
wanting to make the calculation 34.8/18.3 would probably round the division to
35/18 to obtain the answer of just less than 2. Of course, there are many situations
– an engineer’s structural estimates, accounting audits, statistical calculations, for
example – in which rounding would be wholly inappropriate. Rounding is being
suggested here for the communication of management information. In this context,
rounding will not usually affect decisions being taken, but it will improve the
effectiveness of the information. If simple numerical patterns exist in the data, they
will be more readily seen as a result of rounding since the numbers can now be
manipulated and assimilated mentally.
Rounding to two effective figures is variable rounding. Not all numbers will be
rounded in the same way. Here are some examples of rounding to two effective
figures:

Original Rounded
(2 effective figures)
1382 1400
721 720
79.311 79
17.1 17
4.2 4.2
2.32 2.3


These numbers have been rounded to the first two figures. Contrast this to fixed
rounding, such as rounding always to the first decimal place. For example, if the
above numbers were rounded to the first decimal place, the result would be:

Original Rounded
(1st decimal place)
1382 1382.0
721 721.0
79.311 79.3
17.1 17.1
4.2 4.2
2.32 2.3

Rounding to the same number of decimal places may appear to be more con-
sistent, but it does not make the numbers easier to manipulate and communicate.
Rounding to two effective figures puts numbers in the form in which mental
arithmetic is naturally done. The numbers are therefore assimilated more quickly.
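
Rounding to two significant figures of this sort is easy to automate. A minimal Python sketch (the variant described in the next paragraph, rounding to the first two figures that differ, would need a little extra logic):

    from math import floor, log10

    def round_two_figures(x):
        # round x to its first two significant figures
        if x == 0:
            return 0
        return round(x, 1 - int(floor(log10(abs(x)))))

    for value in [1382, 721, 79.311, 17.1, 4.2, 2.32]:
        print(value, round_two_figures(value))
    # 1382 -> 1400, 721 -> 720, 79.311 -> 79.0,
    # 17.1 -> 17.0, 4.2 -> 4.2, 2.32 -> 2.3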
The situation is slightly different when a series of similar numbers, all of which
have, say, the first two figures in common, are being compared. The rounding
would then be to the first two figures that are effective in making the comparison
(i.e. the first two that differ from number to number). This is the meaning of
‘effective’ in ‘two effective figures’. For example, the column of numbers below
would be rounded as shown:

Original Rounded
1142 1140
1327 1330
1489 1490
1231 1230
1588 1590

The numbers have been rounded to the second and third figures because all the
numbers have the first figure, 1, in common. Rounding to the first two figures
would be over-rounding, making comparisons too approximate.
Many managers may be concerned that rounding leads to inaccuracy. It is true, of
course, that rounding does lose accuracy. The important questions are: ‘Would the
presence of extra digits affect the decision being taken?’ and ‘Just how accurate are
the data anyway – is the accuracy being lost spurious accuracy?’
Often, one finds that eight-figure data are being insisted upon in a situation
where the decision being taken rests only on the first two figures, and where the
method of data collection was such that only the first two figures can be relied upon
as being accurate. Monitoring financial budgets is a case in point. During a financial
year actual costs are continuously compared with planned. There is room for


approximation in the comparison. If the budget were £11 500, it is enough to know
that the actual costs are £10 700 (rounded to two effective figures). No different
decision would be taken had the actual costs been specified as £10 715. The
conclusion would be the same: actual costs are about 7 per cent below budget. Even
if greater accuracy were required it may not be possible to give it. At such an early
stage actual costs will almost certainly rest in some part on estimates and be subject
to error. To quote actual costs to the nearest £1 is misleading, suggesting a level of
accuracy that has not been achieved. In this situation high levels of accuracy are
therefore neither necessary nor obtainable, yet the people involved may insist on
issuing actual cost data to the nearest £1. Where there is argument about the level of
accuracy required, A. S. C. Ehrenberg (1975) suggests in his book that the numbers
should be rounded but a note at the bottom of the table should be provided to
indicate a source from which data specified to a greater precision can be obtained.
Then wait for the rush.

3.2.2 Rule 2: Reorder the Numbers


The pattern in a set of numbers can be understood better when they are in size
order. A table of data will appear more orderly and relationships between the
numbers will often be pinpointed. For instance, in a financial table, listing the
divisions of a company in size order (by capital employed, perhaps) shows at a
glance whether turnover, profit, etc. are in the same order. Divisions that are out of
line in this respect will be highlighted.
The quantity used to do the ordering may be:
(a) one of the columns or rows of the table;
(b) the averages of the columns or rows;
(c) something external to the table, such as ordering geographical regions by
population.
For example, Table 3.1 shows financial data for the geographical divisions of a
company. In Table 3.2 the divisions have been put in size order (and the data
rounded).

Table 3.1 Divisional financial data (£000)


Division Capital employed Turnover Profit
North 1342.42 533.51 78.28
South 1873.45 728.64 96.30
East 1145.71 432.11 89.33
West 1482.56 561.94 82.18


Table 3.2 Data reordered and rounded (£000)


Division Capital employed Turnover Profit
South 1870 730 96
West 1480 560 82
North 1340 530 78
East 1150 430 89

It is much easier to see the main features of the data from Table 3.2. East now
stands out as a clear exception, its profit being out of line with the other divisions.
The rounding also facilitates the calculation of accounting ratios. The profit margin
is about 14 per cent for all divisions except East, where it is over 20 per cent. While
it is possible to see these features in Table 3.1, they are not so immediately apparent
as in the amended table. When managers have many such tables crossing their
desks, it is essential that attention should be attracted quickly to important infor-
mation that may require action.
Frequently, tables are ordered alphabetically. This is helpful in a long reference
table that is unfamiliar to the user, but not so helpful when management infor-
mation is involved. Indeed, it may be a hindrance. In management information it is
the overall pattern, not the individual entries, that is of interest. Alphabetical order is
more likely to obscure than highlight the pattern. In addition, managers are usually
not totally unfamiliar with the data they receive. For instance, anyone looking at
some product’s sales figures by state in the USA would probably be aware that, for a
table in population order, California would be close to the top and Alaska close to
the bottom. In other words, the loss from not using alphabetical order is small
whereas the gain in data communication is large.
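
Reordering of this kind is a one-line operation in most software. A minimal Python sketch using the figures of Table 3.2:

    # (division, capital employed, turnover, profit), as in Table 3.1 rounded
    rows = [
        ("North", 1340, 530, 78),
        ("South", 1870, 730, 96),
        ("East", 1150, 430, 89),
        ("West", 1480, 560, 82),
    ]

    # order by capital employed, largest first, as in Table 3.2
    for row in sorted(rows, key=lambda r: r[1], reverse=True):
        print(row)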

3.2.3 Rule 3: Interchange Rows and Columns


Making a comparison between numbers or seeing a pattern in them is easier if the
numbers lie underneath rather than alongside one another. The reason is that it is
easier to see the difference between, say, two- and three-figure numbers and to
calculate the size of the difference by subtraction if the numbers are in a column
rather than a row. This probably originates from the way children in school are
taught to manipulate numbers. People learn to do sums in this form:

57
−23
34

In a table, then, the more important comparison should be presented down col-
umns, not along rows. Taking the ‘Capital employed’ data from Table 3.2, it could
be presented horizontally, as in Table 3.3, or vertically, as in Table 3.4. In Table 3.3
the differences between adjacent rows are 390, 140, 190 respectively. When the data
are in a column, such calculations are made much more quickly.


Table 3.3 Capital employed (by row) (£000)


South West North East
Capital employed 1870 1480 1340 1150

Table 3.4 Capital employed (by column) (£000)


Division Capital employed
South 1870
West 1480
North 1340
East 1150

In many tables, however, comparisons across rows and down columns are equal-
ly important and no interchange is possible.

3.2.4 Rule 4: Use Summary Measures


Summary measures, nearly always averages, for rows and columns provide a focus
for the eye as it moves down or along the numbers. It is a basis for comparison,
making it easier to tell at a glance the degree of variability and to discern whether a
particular element falls above or below the general level of the rest of the row or
column. Table 3.5 shows the number of deliveries per day at a warehouse over a
one-month period (20 days). The average number of deliveries per day provides a
focus, indicating more readily that the numbers are scattered closely and symmetri-
cally about their average.

Table 3.5 Warehouse deliveries


Deliveries: 8 7 5 11 10 7 8 8 6 7 10 12 5 6 7 9 9 8 8 9    Average: 8

The summary measure is usually an average since it is important that the sum-
mary should be of the same size order as the rest of the numbers. A column total,
for example, is not a good summary, being of a different order of magnitude from
the rest of the column and therefore not a suitable basis for comparison. The
summary measure can also be the basis for ordering the rows and columns (see
Section 3.2.2).

3.2.5 Rule 5: Minimise the Use of Space and Lines


There should be as little white space and as few gridlines as possible in a table of
numbers. Lots of white space may look attractive, but the resultant gaps between
the numbers reduce the ability to see patterns, simply because there is a longer
interval as the eye travels from one number to the next. The numbers should be
close together, but not, of course, so close that they become confused.
Likewise, gridlines (horizontal or vertical) in a table interfere with the eye’s
movement from one number to the next. Gridlines should be used to separate one


type of number from another (e.g. to separate a summary row from the rest of the
numbers). Table 3.6 shows the data of Table 3.2 but with white space and gridlines
introduced. Table 3.7 is a repeat of Table 3.6 but with an acceptable use of space
and gridlines.

Table 3.6 Data of Table 3.2 with white space and gridlines
Division Capital Turnover Profit
employed
South 1870 730 96
West 1480 560 82
North 1340 530 78
East 1150 430 89

Table 3.7 White space and gridlines removed


Division Capital employed Turnover Profit
South 1870 730 96
West 1480 560 82
North 1340 530 78
East 1150 430 89

The patterns and exceptions in these data are much more clearly evident once the
white space and gridlines have been removed. The purpose of many tables is to
compare numbers. White space and gridlines have the opposite effect. They
separate numbers and make the comparison more difficult.
3.2.6 Rule 6: Labelling Should Be Clear but Unobtrusive
Care should be taken when labelling data; otherwise the labels may confuse and
detract from the numbers. This seems an obvious point, yet in practice two labelling
faults are regularly seen in tables. First, the constructor of a table may use abbreviat-
ed or obscure labels, having been working on the project for some time and falsely
assuming that the reader has the same familiarity with the numbers and their
definitions. Second, gaps may be introduced in a column of numbers merely to
accommodate extra-long labels. Labels should be clear and not interfere with the
understanding of the numbers.
Table 3.8 is an extract from a table of historical data relating to United Kingdom
utilities prior to their privatisation in the 1980s and 1990s. The extract shows ‘Gross
income as % of net assets’ for a selection of these organisations. First, the length of
the label relating to Electricity results in gaps in the column of numbers; second,
this same label includes abbreviations, the meaning of which may not be apparent to
the uninitiated. In Table 3.9 the labels have been shortened and unclear terms
eliminated. If necessary, a footnote or appendix could provide an exact definition of
the organisations concerned.


In Table 3.9 the numbers and labels are clearer. Watertight and lengthy defini-
tions of the organisations do not belong within the table. The purpose of this rule is
to assert the primary importance of tables in communicating numbers. As far as
possible, the labels should give unambiguous definitions of the numbers but should
not obscure the information contained in the numbers.

Table 3.8 Gross income as % of net assets


                                     1960    1965 …
National Coal Board                   8.3     …
Gas Council and Boards                7.8     …
Electricity (BEA, CEA, EC and
  Boards)                            10.7     …
South of Scotland Electricity
  Board                               9.5     …
North of Scotland Hydro-Electric
  Board                               5.4     …
…                                     …

Table 3.9 Gross income as % of net assets (relabelled)


1960 1965 …
National Coal Board 8.3 … …
Gas Council/Boards 7.8 … …
Electricity (ex Scotland) 10.7 … …
S. of Scotland Electricity 9.5 … …
N. of Scotland Hydro-Electric 5.4 … …
… … … …

3.2.7 Rule 7: Use a Verbal Summary


A verbal summary can help achieve the objective of communicating information
quickly by directing attention to the main features of the data. The summary should
be short and deal with the main pattern, not the points of detail or minor eccentrici-
ties in the data. It should not, of course, mislead.
In a management report, the verbal summary will probably be there already since
the data are likely to be present to provide evidence of a conclusion drawn and
described in the text. In other circumstances where a manager meets data, a verbal
summary should be produced, if possible, along with the data. A verbal summary
for Table 3.7 might be: ‘For all divisions turnover is just less than 40 per cent of
capital employed; profit is approximately 14 per cent of turnover except in East,
where the figure is over 20 per cent.’


Example: Facts and Figures

Table 3.10 GDP of nine EC countries plus Japan and the USA (€ thousand million)

                  1965     1975     1987     1990     1995     1997
United Kingdom    99.3    172.5    598.8    766.4    846.3   1133.3
Belgium           16.6     46.2    123.5    154.4    209.0    213.7
Denmark           10.1     26.9     88.8     99.6    129.4    140.2
France            96.8    253.3    770.2    940.8   1169.1   1224.4
Germany*         114.3    319.9    960.9   1182.2   1846.4   1853.9
Ireland            2.7      5.9     27.2     35.9     49.4     65.1
Italy             53.4    130.2    657.4    861.2    832.0   1011.1
Luxembourg         0.7      1.7      6.0      8.1     13.2     13.9
Netherlands       18.7     61.2    188.9    222.3    301.9    315.6
Japan             83.0    372.8   2099.4   2341.5   3917.9   3712.1
USA              690.0   1149.7   3922.3   4361.5   5374.3   6848.2
* Germany includes the former GDR in 1995 and 1997.

Table 3.10 is taken from a UK company’s management report. It is a table of historical economic data intended as background information for managers.
Knowledge of economic changes during earlier years is an important guide to
understanding current economic performance. The table shows the gross domestic
product of the nine countries of the European Community for various years
between 1965 and 1997, with the USA and Japan included for comparison. Since it
is intended as background information only, an economist doing a detailed study of
any or all of the countries would refer to reference sources. Typically, a reader may
wish to ask questions such as:
(a) How does the economic growth of Germany compare with that of the UK?
(b) How do the economic sizes of the major EC countries compare with one
another?
(c) Is Japan catching up with the USA? Just how small are the smaller EC countries?
The table is presented neatly, but typical questions such as these take a surpris-
ingly long time to answer.
A few changes to the layout along the lines of the seven rules make the infor-
mation much easier to grasp. After applying the seven rules, the amended data are
shown in Table 3.11.


Table 3.11 The GDP of 11 countries (€ thousand million) (amended)


1997 1995 1990 1987 1975 1965
USA 6800 5400 4400 3900 1100 690
Japan 3700 3900 2300 2100 370 83
Germany* 1900 1800 1200 960 320 110
France 1200 1200 940 770 250 97
United Kingdom 1100 850 770 600 170 100
Italy 1000 830 860 660 130 53
Netherlands 320 300 220 190 61 19
Belgium 210 210 150 120 46 17
Denmark 140 130 100 89 27 10
Ireland 65 49 36 27 6 3
Luxembourg 14 13 8 6 2 1
* Germany includes the former GDR in 1995 and 1997.

The rules were applied as follows:


(a) Rule 1. Rounding to two effective figures is straightforward except for the two
smallest countries, Ireland and Luxembourg. For these two countries, the data
have been over-rounded. Two effective figures should mean that the decimal
place is retained. To do so, however, would upset the balance of the table and
attract excessive attention to these countries which would be the only ones to
have decimal places. On the other hand, loss of the decimal place involves the
loss of real accuracy. A trade-off between communication and accuracy has to be
made. On balance, communication is placed first and some numbers for Ireland
and Luxembourg rounded to one figure. Comparisons involving them can still
be made. In any case, students of the economic progress of Luxembourg would
doubtless not be using this table as a source of data.
(b) Rule 2. The rows are ordered by size. The previous order was UK first (it is a
UK company’s report) followed by the other EC countries alphabetically, fol-
lowed by the two non-EC countries. Alphabetical order does not help in
answering the questions, nor does the separation of the USA and Japan (does
anyone doubt that they are not EC members?). Size order makes it easier to
compare countries of similar size as well as immediately indicating changes in
ranking. The ordering in Table 3.11 is based on the 1997 GDP. It could equally
have been based on the 1965 GDP or the populations. There is often no one
correct method of ordering. It is a matter of taste.
(c) Rule 3. Interchanging rows and columns is not necessary. One is just as likely to
wish to compare countries (down the columns) as one country across the years
(along the rows).
(d) Rule 4. The use of summary measures is not particularly helpful. Row and
column averages would have no intuitive meaning. For example, the average
GDP for a country over a selection of years is not a useful summary.
(e) Rule 5. The vertical lines in the original table hinder comparisons across the
years and have been omitted.


(f) Rule 6. The labelling is already clear. No changes have been made.
(g) Rule 7. It would be very difficult to make a simple verbal summary of these
data. Moreover, in the context, the publishers would probably not wish to be
appearing to lead the reader’s thinking by suggesting what the patterns were.
The typical questions that might be asked of these data can now be applied to Table 3.11. It is possible to see quickly that Germany’s GDP increased by 1900/110 = just over 17 times; Italy’s by 1000/53 = almost 19 times; Japan’s by just under 45 times; the UK’s by 11; Japan has overtaken Germany, France and the UK; Ireland is over four times the size of Luxembourg economically. The information is more readily apparent from Table 3.11 than from Table 3.10.

3.3 The Special Case of Accounting Data


The communication of accounting data requires special treatment. The main factors
that make accounting data different are:
(a) they have a logical sequence (e.g. a revenue and expenditure statement builds up
to a final profit);
(b) there is a tradition of how accounts should be presented (e.g. negative amounts
are shown in brackets);
(c) there are legal requirements as to what should be presented.
These factors do not mean that the presentation of accounts cannot be im-
proved, but it does mean that extra care must be taken and that proposed changes
may be met with resistance. The difficulty seems to be that what constitutes a well-
presented set of accounts to an accountant may differ from what constitutes a well-
presented set of accounts to a layperson. The accountant’s criteria are probably
based on what is good practice when drawing up accounts and on the technical
approval of other accountants. For example, most final accounts use brackets to
indicate a negative. This may make sense when preparing accounts and when
dealing with other accountants, but most laypeople are more used to a minus sign to
indicate a negative, and most financial data are for laypeople. Published accounts are
for shareholders; internal company financial information is for managers (not all of
whom are former accountants). It is more important that accounting data should be
in a form suitable for accounting laypeople than suitable for accountants.
The accounts in Table 3.12 are a case in point. They are a disguised version of
the accounts of a multinational shipping company, and an accounting association
awarded them a prize for their excellence. While they doubtless represent fine
accounting practice, it is almost impossible to tell what, financially, befell the
company between the two years. Communication of the data can be improved by
applying some of the presentation ‘rules’. (Only rules 1, 5, 6 and 7 are appropriate
for accounts; rules 2, 3 and 4 cannot be applied to tables that have a logical se-
quence.)
Some interesting facts emerge when the data are re-presented. Table 3.13 shows
the same accounts re-presented (in just one language). The new table reveals that it
was an unusual year for the company. Incomes and expenditures changed by large
and extremely variable amounts.


Table 3.12 Shipping company accounts

                                                   2015         2016
                                                      £         £000
Gross Freight Income                         98 898 684       73 884

Voyage Expenses:
  Own Vessels                                31 559 336      (24 493)
  Vessels on Timecharter                     42 142 838      (28 378)
                                            (73 702 174)

Operating Expenses:
  Crew Wages and Social Security              7 685 965       (9 010)
  Other Crew Expenses                           541 014         (633)
  Insurance Premiums                          1 161 943       (1 367)
  Provisions and Stores                       1 693 916       (2 268)
  Repairs and Maintenance                     1 685 711       (3 297)
  Other Operating Expenses                       60 835          (27)
                                            (12 829 384)

Result from Vessels                          12 367 126        4 411

Results from Parts in Limited Partnerships    1 793 314         (163)
Result from Operation Business Building         167 343            0

Management Expenses:
  Salaries, Fees, Employees’ Benefits, etc.   1 426 607       (1 208)
  Other Management Expenses                     502 815         (635)
                                             (1 929 422)

Depreciation Fixed Assets                    (7 106 305)      (5 365)

Transferred from Classification Survey Fund     112 401          652
Set aside for Classification Survey Fund       (612 401)         (27)
                                               (500 000)

                                              4 792 056       (2 335)

Capital Income and Expenses:
  Misc. Interest Income                         318 601          364
  Misc. Interest Expenses                    (8 450 307)      (6 482)
                                             (8 131 706)
Net Profit/(Loss) Currency Exch.               (190 836)        (680)
Dividends                                        35 732           47
                                             (8 286 810)

Other Income and Expenses:
  Net Profit from Sales Fixed Assets          5 553 047        5 593
  Balance Classification Survey and
    Self Insurance Fund upon Sale             1 414 132          858
                                              6 967 179
  Net Profit/(Loss) from Sale of Shares         (38 647)           0
  Misc. Income/(Expenses) relating to
    previous years                              629 630        1 408
  Adjustm. Cost Shares                          226 197         (206)
  Loss from Receivables                        (219 189)           0
  Reversal write up Fixed Assets
    previous years                           (2 026 067)           0
                                              5 539 103

Result before Taxes and Allocations           2 044 349       (1 433)

Allocations                                  (1 929 400)
                                                114 949       (1 433)

Reserves                                        (11 495)       1 071
Dispositions                                   (103 454)         362
                                               (114 949)       1 433

(a) Freight income up 33%
(b) Voyage expenses up 40%
(c) Operating expenses down 20%
(d) Operating profit up 180%
(e) Partnership shares: small −ve to large +ve
(f) Management expenses: equal
(g) Capital income/expense up 20%
(h) Other income/expense down 25%
(i) Profit before tax: −ve to +ve


Table 3.13 Shipping company accounts (amended)


2015 2016
Statement of revenue and expenses (£ million)
Gross freight income 98.9 73.9
Voyage expenses −73.7 −52.9
Operating expenses −12.8 −16.6
Profit from vessels 12.4 4.4
Shares in partnership 1.8 −0.16
Operation business building 0.2 0
Management expenses −1.9 −1.8
Depreciation −7.1 −5.4
Classification survey (net) −0.5 0.6
Capital income and expenses −8.3 −6.8
Other income and expenses 5.5 7.7
Profit before tax 2.0 −1.4
Allocations −1.9 0
To reserves 0.1 −1.4

These features of the company’s finances are remarkable. Even when one knows
what they are, it is very difficult to see them in the original table (Table 3.12). Yet it
is this volatility that is of major interest to shareholders and managers.
The question of rounding creates special difficulties with accounting data. The
reason is that rounding and exact adding up are not always consistent. It has to be
decided which is the more important – the better communication of the data or the
need to allow readers to check the arithmetic. The balance of argument must weigh
in favour of the rounding. Checking totals is a trivial matter in published accounts
(although not, of course, in the process of auditing). If a mistake were found in the
published accounts of such a large company, the fault would almost certainly lie
with a printer’s error. But ‘total checking’ is an obsessive pastime and few compa-
nies would risk the barrage of correspondence that would undoubtedly ensue even
though a note to the accounts explained that rounding was the cause. Because of
this factor the two effective figures rule may have to be broken so that adding and
subtracting are exact. This has been done in Table 3.13. The only remaining
question is to wonder why totals are exactly right in company accounts which have
in any case usually been rounded to some extent (the nearest £ million for many
companies). The answer is that figures have been ‘fudged’ to make it so. The same
considerations apply, but to a lesser degree, with internal company financial infor-
mation.
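
The inconsistency between rounding and exact totals is simple to demonstrate. A minimal Python sketch with invented figures:

    # an invented column of figures whose exact total is 100.0
    items = [16.6, 16.6, 66.8]
    print(round(sum(items), 1))          # 100.0
    rounded = [round(v) for v in items]  # 17, 17, 67
    print(sum(rounded))                  # 101 - the rounded entries no
                                         # longer add up to the rounded total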
Communicating financial data is an especially challenging area. The guiding prin-
ciple is that the main features should be evident to the users of the data. It should
not be necessary to be an expert in the field nor to have to carry out a complex
analysis in order to appreciate the prime events in a company’s financial year. Some
organisations are recognising these problems by publishing two sets of (entirely


consistent) final accounts. One is source material, covering legal requirements and
suitable for financial experts; the other is a communicating document, fulfilling the
purpose of accounts (i.e. providing essential financial information). Other organisa-
tions may, of course, have reasons for wanting to obscure the main features of their
financial year.

3.4 Communicating Data through Graphs


The belief that a picture speaks a thousand words (or numbers) is widely held. For most people a picture is more interesting and attractive than numbers. A graph is a form of picture that can, in certain circumstances, be very helpful in communicating data. Graphs are not, however, always helpful, and it is important to distinguish between the helpful and unhelpful uses of a graph.

(Graph: deposit rates, annual percentage average, for the USA, plotted by year from 1986 to 1996.)

Figure 3.1 USA interest rates


Source: International Monetary Fund


(Graph: deposit rates, annual percentage average, 1986–96, with crossing lines for Italy, USA, Canada, France, Netherlands, Denmark, Belgium, Germany and Japan.)

Figure 3.2 International interest rates


Source: International Monetary Fund
Figure 3.1 is a graph of interest rates in the USA in the years 1986–96. It works
well. The pattern can be seen quickly and easily. This is no longer so if the graph is
used for comparing interest rates in several countries. Figure 3.2 is Figure 3.1 with
several more countries added. The graph ceases to function well. There are too
many lines and they tend to cross over. As a result, no overall message emerges.
Furthermore, the graph cannot be used for reference purposes without difficulty
and inaccuracy. What, for instance, was the interest rate in Belgium in 1993?
This example illustrates the principles underlying the use of graphs for data
communication:
(a) Graphs are helpful when:
 attracting attention and adding variety in a series of tables;
 communicating very simple patterns.
(b) Graphs are not helpful when:
 communicating even slightly complex patterns;
 being used for reference purposes.
These principles are straightforward and it may appear at first sight that little
could go wrong in the use of graphs in organisations. However, things do go very
wrong with surprising frequency. Figure 3.3 is a disguised version of part of the
management information system of a large company. It was one of a series of about
40 similar graphs.
It is not difficult to see that it has been produced automatically from a computer
with little thought to communication (the vertical scale is an indication that it comes
hot from the computer). Nor is it difficult to say what is wrong.


(Graph: five crossing lines, keyed Denmark, Spain, Italy, Netherlands and Belgium, plotted monthly from January 2011 to January 2015; the vertical scale is marked 195.0, 456.0, 717.0, 978.0, 1239.0 and 1500.0.)

Figure 3.3 Annual coal imports as monthly averages


No pattern of any sort is remotely evident. To extract an individual number (say, the imports for Denmark in March 2013) requires first being able to find the line for the country and, second, to estimate the imports by a calculation along the lines of ‘about 2/5 of the way between 456 and 717’. In short, the graph is good neither for showing patterns nor for reference purposes. Since the graph is unusable, at the very least, expense could be saved by not issuing it.
To be fair, these data are difficult to communicate by any means because there
are so many. Distinct improvements can nevertheless be made, using the rules of
presentation. Table 3.14 gives the same data as are on the graph in Figure 3.3 but in
tabular form. The table can certainly be used for reference purposes. In addition, it
is possible, although not easy, to see the seasonal pattern in the data.

Table 3.14 Annual coal imports as monthly averages


Italy Belg. Den. Neth. Spain Italy Belg. Den. Neth. Spain
2011 Jan. 910 610 230 280 260 2013 Jan. 620 310 360 420 290
Feb. 1200 850 360 360 240 Feb. 1000 520 250 320 440
Mar. 1300 720 340 420 440 Mar. 1100 630 460 430 260
Apr. 1200 590 340 350 430 Apr. 990 660 430 540 410
May 1100 520 310 410 360 May 1200 480 590 290 420
June 1100 570 330 440 320 June 790 580 560 390 330
July 1300 380 380 240 290 July 1300 470 520 370 370
Aug. 1100 390 330 320 250 Aug. 640 540 570 400 330
Sept. 800 360 370 300 210 Sept. 1000 540 510 270 330
Oct. 920 360 380 390 500 Oct. 1000 490 320 500 490
Nov. 1100 390 400 370 240 Nov. 900 570 470 440 380
Dec. 970 500 370 340 290 Dec. 1100 680 530 420 370

2012 Jan. 570 450 230 420 380 2014 Jan. 930 470 470 300 360
Feb. 800 550 300 280 310 Feb. 780 510 530 390 420
Mar. 1100 770 330 400 430 Mar. 900 590 440 260 440
Apr. 910 690 540 270 380 Apr. 780 530 440 490 440
May 970 690 290 390 420 May 1100 510 420 400 440
June 910 660 350 520 240 June 1000 650 440 510 350
July 900 690 370 430 300 July 1100 550 390 350 400
Aug. 900 580 330 240 300 Aug. 870 580 570 350 360
Sept. 750 520 480 430 340 Sept. 1100 610 460 380 360
Oct. 1400 640 410 380 430 Oct. 1100 750 730 360 530
Nov. 1100 590 360 560 430 Nov. 950 660 750 530 410
Dec. 1000 450 340 570 430 Dec. 1200 600 650 500 400

2015 Jan. 570 790 490 500 390


Feb. 1100 520 360 470 450
Mar. 880 870 460 520 450

Average of all years 980 560 420 390 370


If the general pattern over the years or a comparison between countries is required, Table 3.15 is suitable. This shows the average monthly imports of coal for each year. It can now be seen that four of the countries have increased their imports by between 30 and 45 per cent. Italy is the exception, having decreased coal imports by 20 per cent. The level of imports in the countries can be compared.

Table 3.15 Monthly coal imports (average per month in thousand


tonnes)
Italy Belg. Den. Neth. Spain
2011 1100 520 340 350 320
2012 940 610 340 400 370
2013 970 490 460 400 370
2014 980 590 510 400 410
2015 870 750 450 480 430
Average 980 560 420 390 370

In these terms, Italy is the largest, followed by Belgium, followed by the other
three at approximately the same level.
Table 3.15 can be transferred to a graph, as shown in Figure 3.4. General patterns
are evident. Italy has decreased its imports, the others have increased theirs; the level
of imports is in the order Italy, Belgium … The difference between the table and the
graph becomes clear when magnitudes have to be estimated. The percentage change
(−20 per cent for Italy, etc.) is readily calculated from the table, but not from the
graph. In general, graphs show the sign of changes but a table is needed to make an
estimate of the size of the changes. The purpose of the data and personal prefer-
ence would dictate which of the two were used.

(Graph: annual average monthly coal imports, 2011–2015, one line per country; Italy highest, with Belgium next, then Netherlands, Denmark and Spain; vertical scale from 200 to 1200.)

Figure 3.4 Monthly coal imports as annual averages
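
A chart along the lines of Figure 3.4 can be produced directly from Table 3.15. A minimal sketch, assuming the matplotlib plotting library is available:

    import matplotlib.pyplot as plt

    years = [2011, 2012, 2013, 2014, 2015]
    imports = {                       # figures from Table 3.15
        "Italy": [1100, 940, 970, 980, 870],
        "Belgium": [520, 610, 490, 590, 750],
        "Denmark": [340, 340, 460, 510, 450],
        "Netherlands": [350, 400, 400, 400, 480],
        "Spain": [320, 370, 370, 410, 430],
    }

    for country, values in imports.items():
        plt.plot(years, values, label=country)

    plt.ylabel("Monthly coal imports (thousand tonnes)")
    plt.legend()
    plt.show()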


Graphs are the most important but not the only pictorial method of communi-
cating numbers. A wide range of possibilities exists, all of which have their
advantages and disadvantages. The underlying principles are the same as for graphs.


Pictures are useful for attracting attention and for showing very general patterns.
They are not useful for showing complex patterns or for extracting actual numbers.

Learning Summary
The communication of data is an area that has been neglected, presumably because
it is technically simple and there is a tendency in quantitative areas (and perhaps
elsewhere) to believe that only the complex can be useful. Yet in modern organisa-
tions there can be few things more in need of improvement than data
communication.
Although the area is technically simple, it does involve immense difficulties. What
exactly is the readership for a set of data? What is the purpose of the data? How can
the common insistence on data specified to a level of accuracy that is not needed by
the decision maker and is not merited by the collection methods be overcome? How
much accounting convention should be retained in communicating financial
information to the layperson? What should be done about the aspects of data
presentation that are a matter of taste? The guiding principle among the problems is
that the data should be communicated according to the needs of the receiver rather
than the producer. Furthermore, they should be communicated so that the main
features can be seen quickly. The seven rules of data presentation described in this
module seek to accomplish this.
Rule 1: round to two effective digits.
Rule 2: reorder the numbers.
Rule 3: interchange rows and columns.
Rule 4: use summary measures.
Rule 5: minimise use of space and lines.
Rule 6: clarify labelling.
Rule 7: use a verbal summary.
Producers of data are accustomed to presenting them in their own style. As al-
ways there will be resistance to changing an attitude and presenting data in a
different way. The idea of rounding especially is usually not accepted instantly.
Surprisingly, however, while objections are raised against rounding, graphs tend to
be universally acclaimed, even when not appropriate. Yet the graphing of data is the
grossest form of rounding. There is evidently a need for clear and consistent
thinking in regard to data communication.
This issue has been of increasing importance because of the growth in usage of
all types and sizes of computers and the development of large-scale management
information systems. The benefits of this technological revolution should be
enormous but the potential has yet to be realised. The quantities of data that
circulate in many organisations are vast. It is supposed that the data provide
information which in turn leads to better decision making. Sadly, this is frequently
not the case. The data circulate, not providing enlightenment, but causing at best
indifference and at worst tidal waves of confusion. Poor data communication is a
prime cause of this. It could be improved. Otherwise, one must question the
wisdom of the large expenditures many organisations make in providing untouched


and bewildering management data. One thing is clear: if information can be assimi-
lated quickly, it will be used; if not, it will be ignored.

Review Questions
3.1 In communicating management data, which of the following principles should be adhered
to?
A. The requirements of the user of the data are paramount.
B. Patterns in the data should be immediately evident.
C. The data should be specified to two decimal places.
D. The data should be analysed before being presented.

3.2 The specification of the data (the number of decimal places) indicates the accuracy. True
or false?

3.3 The accuracy required of data should be judged in the context of the decisions that are
to be based upon the data. True or false?

3.4 3732.578 when rounded to two effective figures becomes:


A. 3732.58
B. 3700
C. 3730
D. 3732


3.5 The column of numbers:

1732
1256.3
988.42
38.1

when rounded to two effective figures becomes:


A.
1730
1260
988
38

B.
1730
1260
990
38

C.
1700
1300
990
38

3.6 Which are correct reasons? It is easier to compare numbers in a column than in a row
because:
A. The difference between two- and three-figure numbers is quickly seen.
B. Subtractions of one number from another are made more quickly.
C. The numbers are likely to be closer together and thus easier to analyse quickly.

3.7 When the rows (each referring to the division of a large company) of a table of numbers
are ordered by size, the basis for the ordering should be:
A. The numbers in the left-hand column.
B. The averages of the rows.
C. The capital employed in the division.
D. The level of manpower employed in the division.


3.8 What criticisms can be made of the following table of numbers?

Sales region Sales (£000)


Eastern 1230
Northern 1960
South-West 1340
South of Scotland 1030
Wales 1220

A. Numbers not rounded to two effective figures.


B. Regions not in size order.
C. Gap in column of numbers because of labelling.
D. There should be a vertical line between Sales region and Sales.

3.9 Only some of the presentation rules can be applied to financial accounts. This is
because:
A. Rounding cannot be done because the reader may want to check that the
auditing has been correct.
B. Rounding cannot be done because it is illegal.
C. An income statement cannot be ordered by size since it has to build up to a
final profit.
D. Published annual accounts are for accountants; therefore their presentation is
dictated by accounting convention.

3.10 Graphs should be used in which of the following circumstances?


A. Where changes over time are involved.
B. To make data visually more attractive.
C. To distinguish fine differences in just a few variables.
D. To communicate very simple patterns.

Case Study 3.1: Local Government Performance Measures


Table 3.16 shows some performance measurement data relating to local area govern-
ments. The data are percentages that show the extent to which each of the six
authorities has attained predefined objectives in each of five months. For example,
Southam had achieved only 32.5 per cent of its objectives in December 2015 but had
increased this to 70.0 per cent by June 2016.

Table 3.16 Local government performance measures


December February April May June
Eastham 25.0 77.9 74.7 79.0 94.0
Northington 14.7 66.7 85.9 88.0 88.0
Plumby 56.0 67.4 75.0 55.0 71.0
Southam 32.5 49.2 64.6 66.0 70.0
Tyneham 40.0 100.0 84.3 100.0 100.0
Westerley 43.7 86.0 89.5 72.0 87.0


1 This table is one of many that elected representatives have to consider at their monthly
meetings. The representatives need, therefore, to be able to appreciate and understand
the main features very quickly. In these circumstances, how could the data be presented
better? Redraft the table to illustrate the changes.

Case Study 3.2: Multinational Company’s Income Statement


Table 3.17 is taken from the annual accounts of a large multinational company. A second
income statement later in the accounts is more detailed and presumably satisfies legal
requirements and the needs of financial specialists. Table 3.17 is therefore intended to
communicate the main features of the company’s financial year to non-specialists who
may be shareholders, trade unionists, etc.

Table 3.17 Income statement summary of combined figures


£ million 2015 2016
Results for the year ended 31 December
Sales to third parties 9147 9842

Operating profit 541 601


Concern share of associated companies’ profit 59 64
Non-recurring and financial items 50 56

Profit before taxation 550 609


Taxation 272 315

Profit after taxation 278 294


Outside interests and preference dividends 20 21

Profit attributable to ordinary capital 258 273


Ordinary dividends 95 106

Profit of the year retained 163 167

1 Compared to many accounting statements Table 3.17 is already well presented, but
what further improvements might be made?

Case Study 3.3: Country GDPs


1 Consider again Table 3.10 and Table 3.11 earlier in the module. Table 3.10 was the
original and Table 3.11 an amended version. Could the data have been presented better
by means of a graph?


Case Study 3.4: Energy Efficiency


An organisation of energy consultants advises companies on how they might save money
by making more efficient use of oil. Table 3.18 is part of a recommendation prepared for
the senior managers of a client company and shows the financial return from a capital
investment in oil conservation. The investment is evaluated by estimating the internal
rate of return (IRR) on the project; the IRR being a measure of the profitability of a
capital project; in general terms, the higher the IRR, the more profitable the project
(although it is not necessary to understand the meaning of IRR in order to answer this
case). Some of the inputs to the calculation (e.g. the life of the project and the level of
initial investment) are uncertain. Therefore, some sensitivity analysis has been done. In
other words, the IRR has been calculated several times. Each time one of the assump-
tions is varied, all others remain fixed at their original values.

Table 3.18 Sensitivity analysis for oil conservation investment

Base IRR: 24.93%               Variation in parameter    New IRR   Difference in IRR

Rebuilding investment          + £0.5m (£4.342m)          22.71%    − 2.22%
cost (£3.842m)                 − £0.5m (£3.342m)          30.49%    + 5.56%

Outstanding life (13 years)    − 1 yr (12 years)          23.10%    − 1.83%
                               − 2 yrs (11 years)         22.62%    − 2.31%
                               − 3 yrs (10 years)         22.06%    − 2.87%
                               − 4 yrs (9 years)          21.27%    − 3.66%

Fuel consumption               + 5T/day (48T/day)         27.73%    + 2.80%
differential (43 tonnes/day)   − 5T/day (38T/day)         21.96%    − 2.97%

Fuel price                     + 1%/yr (13%/year)         26.42%    + 1.49%
escalation rate (12%/year)     − 1%/yr (11%/year)         23.51%    − 1.42%

1 Table 3.18 gives the results of this sensitivity analysis. It shows the extent to which the
assumptions have been varied and the new IRR for each variation. The ‘Base IRR’ is the
IRR for the original calculation. How could the table be better presented? (Note that it is not
necessary to understand the situation fully in order to propose improvements to the
data communication.)

References
Ehrenberg, A. S. C. (1975). Data Reduction. New York: John Wiley and Sons.



Module 4

Data Analysis
Contents
4.1 Introduction.............................................................................................4/1
4.2 Management Problems in Data Analysis .............................................4/2
4.3 Guidelines for Data Analysis ..................................................................4/6
Learning Summary ......................................................................................... 4/15
Review Questions ........................................................................................... 4/16
Case Study 4.1: Motoring Correspondent ................................................... 4/17
Case Study 4.2: Geographical Accounts ...................................................... 4/18
Case Study 4.3: Wages Project ..................................................................... 4/19

Prerequisite reading: Module 3

Learning Objectives
By the end of this module the reader should know how to analyse data systematical-
ly. The methodology suggested is simple, relying very much on visual interpretation,
but it is suitable for most data analysis problems in management. It carries implica-
tions for the ways information is produced and used.

4.1 Introduction
What constitutes successful data analysis? There is apparently some uncertainty on
this point. If a group of managers are given a table of numbers and asked to analyse
it, most probably they will ‘number pick’. Individual numbers from somewhere in
the middle of the table which look interesting or which support a long-held view
will be selected for discussion. If the data are profit figures, remarks will be made
such as: ‘I see Western region made £220 000 last year. I always said that the new
cost control system would work.’ A quotation from Andrew Lang, a Scottish poet,
could be applied to quite a few managers: ‘He uses statistics as a drunken man uses
lamp posts – for support rather than illumination.’
Real data analysis is concerned with seeking illumination, not support, from a set
of numbers. Analysis is defined as ‘finding the essence’. A successful data analysis
must therefore involve deriving the fundamental patterns and eliciting the real
information contained in the entire table. This must happen before sensible remarks
can be made about individual numbers. To know whether the cost system in the
above example really did work requires the £220 000 to be put in the context of
profit and cost patterns in all regions.


The purpose of this module is to give some guidelines showing how illumination
might be derived from numbers. The guidelines give five steps to follow in order to
find what real information, if any, a set of numbers contains. They are intended to
provide a framework to help a manager understand the numbers he or she encoun-
ters.
One might have thought that understanding numbers is what the whole subject
of statistics is about, and so it is. But statistics was not developed for use in man-
agement. It was developed in other fields such as the natural sciences. When it is
transferred to management, there is a gap between what is needed and what
statistics can offer. Certainly, many managers, having attended courses or read
books on statistics, feel that something is missing and that the root of their problem
has not been tackled. This and other difficulties involved in the analysis of manage-
ment data will be pursued in the following section, before some examples of the
types of data managers face are examined. Next, the guidelines, which are intended
to help fill the statistics gap, will be described and illustrated. Finally, the implica-
tions of this gap for the producers of statistics will be discussed.

4.2 Management Problems in Data Analysis


Managers face a unique set of problems in their general task of understanding
numbers. This may partly explain the poor or non-existent data analysis seen (or not
seen) in a lot of companies. Rich sources of information, of great potential value in
decision making, are often ignored. The first step in making an improvement is
accepting that these problems exist. The set of problems includes the following
main features:
(a) The statistical gap. The subject of statistics does not provide all the techniques
and methods that a manager would like to have at his or her disposal. For in-
stance, much of statistics is concerned with the rigorous testing of hypotheses.
The manager faced with last month’s sales figures or some accounting data has
first to find a hypothesis (i.e. has first to find what might be the real information
in the data). Statistics gives no clear help with this task. Having found some real
information by whatever means, the manager is less concerned with rigorous
statistical testing than with taking action. He or she will no doubt test the validity
of the information, but by other informal, perhaps qualitative, means. Contrast
this with the natural sciences where the emphasis is on thoroughly testing under
different conditions hypotheses derived by non-statistical methods (such as be-
ing hit on the head by an apple). Statistics plays a different role in the natural
sciences than in management. What the manager primarily needs, and what sta-
tistics does not fully provide, are techniques for first detecting patterns and
regularities in data, without having to wait for an external impetus.
(b) A lack of confidence. This manifests itself in different forms: from the free
admission of a fear of numbers to the aggressive statement that management is
solely a matter of instinct and that methodical evaluations of information are
unnecessary. Whatever the manifestation, the effect is the same: little or no data
analysis is done. In fact the majority of numbers problems require only a mini-
mal knowledge of technical matters but a lot of common sense. Data analysis is


rather like reading. When looking at a business report, a manager will usually
read it carefully, work out exactly what the author is trying to say and then decide
whether it is correct. The process is similar with a table of numbers. The data
have to be sifted, thought about and weighed. To do this, good presentation (as
stressed in Module 3 in the rules for data presentation) may be more important
than sophisticated techniques. Most managers could do excellent data analyses
provided they had the confidence to treat numbers more like words. It is only
because most people are less familiar with numbers than words that the analysis
process needs to be made more explicit (via guidelines such as those in Section
4.3 below) in the case of numbers.
(c) Over-complication by the experts. The attitude of numbers experts (and other
sorts of experts as well) can confuse managers. The experts use jargon, which is
fine when talking to their peers but not when talking to a layperson; they try
sophisticated methods of analysis before simple ones; they communicate results
in a complicated form, paying little regard to the users of the data. For example,
vast and indigestible tables of numbers, all to five decimal places, are often the
output of a management information system. The result can be that the experts
distance themselves from management problems. In some companies specialist
numbers departments have adopted something akin to a research and develop-
ment role, undertaking solely long-term projects. Managers come to believe that
they do not have the skills to help themselves while at the same time believing that
no realistic help is available from experts.

4.2.1 Examples of the Problems Faced


The following three examples are all situations in which experts have over-
complicated the presentation or analysis of data. In each case a simpler treatment of
the data reveals important information.

Accounting Data
In Module 3 Table 3.12 showed the income statement of a multinational shipping
company. It is difficult to analyse (i.e. it is difficult to say what the significant
features of the company’s business were). Some important happenings are obscured
in Table 3.12, but they were revealed when the table was re-presented in Table 3.13.


Table 4.1 Budgeting data from an MIS


PORT - OCEAN PORT TERMINAL COSTS - SHIPSIDE OPERATIONS SCHEDULE 2
PERIOD - (CURRENCY US DOLLARS)

MONTH CUMULATIVE
TERMINAL COSTS
ESTIMATE STANDARD VARIANCE VAR % ESTIMATE STANDARD VARIANCE VAR % BUDGET
LO-LO
STEVEDORING
STRAIGHT TIME - FULL 131 223 143 611 12 388 8.6 1 237 132 1 361 266 124 134 9.1 1 564 896
STRAIGHT TIME - M.T. 13 387 14 651 1 264 8.6 256 991 281 399 24 408 8.7
(UN)LASHING 78 (78) 78 (78)
SHIFTING 801 (801) 11 594 (11 594)
OVERTIME, SHIFT TIME OF
WAITING & DEAD TIME 7 102 (7 102) 190 620 (190 620)

RO-RO
STEVEDORING
TRAILERS
STRAIGHT FULL 20 354 26 136 5 782 22.1 167 159 215 161 48 002 22.3 330 074
STRAIGHT M.T. 178 228 50 21.9 14 846 18 993 4 147 21.8
RO-RO COST PLUS
VOLVO CARGO
ROLLING VEHICLES 14 326 19 515 5 189 26.6 98 210 157 163 58 951 37.5
BLOCKSTONED 29 27 (2) (7.4) 613 674 61 9.1
(UN) LASHING RO-RO 355 (355) 355 (355)
SHIFTING 977 (977) 3 790 (3 790)
OVERTIME, SHIFT TIME OF
WAITING & DEAD TIME 1 417 (1 417) (28 713) (28 713)
HEAVY LIFTS (OFF STANDARD) 2 009 (2 009)

CARS
STEVEDORING
STRAIGHT TIME 6 127 6 403 276 4.3 38 530 35 328 (3 202) (9.1) 168 000
(UN) LASHING 2 (2)
SHIFTING 795 (795) 1 288 (1 288)
OVERTIME, SHIFT TIME OF
WAITING & DEAD TIME 7 573 (7 573)
OTHER SHIPSIDE OF COSTS 3 422 (3 422) 24 473 (24 473)

TOTAL TERMINAL COSTS 200 571 210 571 16 000 4.5 2 083 976 2 069 984 (13 992) (.7) 2 062 970


Management Information System Output


Table 4.1 is an example of perfectly accurate and meaningful data which are
presented in a form convenient for the producer of the data but not for the user.
A manager would take a considerable time to understand the table. Since this is
part of a Management Information System (MIS) which contains many such tables,
it is essential that a manager should be able to assimilate the information contained
in the table very rapidly. A little thought given to simplifying the table in accordance
with the needs of the user would bring an enormous improvement (see Table 4.5).

Market Research
Table 4.2 indicates what can happen when experts over-complicate an analysis. The
original data came from interviews of 700 television viewers who were asked which
British television programmes they really like to watch. The table is the result of the
analysis of this relatively straightforward data. It is impossible to see what the real
information is, even if one knows what correlation means. However, a later and
simpler analysis of the original data revealed a result of wide-ranging importance in
the field of television research. (See Ehrenberg, 1975, for further comment on this
example.)

Table 4.2 Result of the analysis of some market research data


Adults who ‘Really Like to Watch’: Correlations to 4 decimal places
(Programmes ordered alphabetically within channel)
PrB ThW Tod WoS GrS LnU MoD Pan RgS 24H
ITV PrB 1.0000 0.1064 0.0653 0.5054 0.4741 0.0915 0.4732 0.1681 0.3091 0.1242
ITV ThW 0.1064 1.0000 0.2701 0.1474 0.1321 0.1885 0.0815 0.3520 0.0637 0.3946
ITV Tod 0.0653 0.2701 1.0000 0.0926 0.0704 0.1546 0.0392 0.2004 0.0512 0.2437
ITV WoS 0.5054 0.1474 0.0926 1.0000 0.6217 0.0785 0.5806 0.1867 0.2963 0.1403
BBC GrS 0.4741 0.1321 0.0704 0.6217 1.0000 0.0849 0.5932 0.1813 0.3412 0.1420
BBC LnU 0.0915 0.1885 0.1546 0.0785 0.0849 1.0000 0.0487 0.1973 0.0969 0.2661
BBC MoD 0.4732 0.0815 0.0392 0.5806 0.5932 0.0487 1.0000 0.1314 0.3267 0.1221
BBC Pan 0.1681 0.3520 0.2004 0.1867 0.1813 0.1973 0.1314 1.0000 0.1469 0.5237
BBC RgS 0.3091 0.0637 0.0512 0.2963 0.3412 0.0969 0.3267 0.1469 1.0000 0.1212
BBC 24H 0.1242 0.3946 0.2437 0.1403 0.1420 0.2661 0.1221 0.5237 0.1212 1.0000

In all three examples any message in the data is obscured. They were produced by
accountants, computer scientists and statisticians respectively. What managers
would have the confidence to fly in the face of experts and produce their own
analysis? Even if they had the confidence, how could they attempt an analysis? The
guidelines described below indicate, at a general level, how data might be analysed.
They provide a starting point for data analysis.


4.3 Guidelines for Data Analysis


The guidelines comprise five stages. A manager who follows them should be able to
understand better the data he or she is faced with. They are, however, guidelines and
not a rigorous methodology which guarantees complete success every time.

4.3.1 Stage 1: Reduce the Data


Many tables contain too many numbers. Data are often included on the grounds
that ‘someone, somewhere, may need them’. For the sake of these probably
mythical people, all others wanting to understand the data have to distinguish what
is important from an enveloping confusion of other irrelevant data.
It is true that the producer of the data has to cater for the many requirements of
the people who receive them. Even so, producers tend to err on the side of over-
supply and include more than is necessary, even given a wide range of users. The
effect is multiplied if receivers of data, on the occasions when they are asked what
they require, overstate their needs ‘just in case’.
The first stage of an analysis is, then, to cut down on the numbers. This means
omitting those that are redundant or irrelevant. This is very much a matter of
judgement, but it is important to resist the assumption that every single piece of data
must be important because it has been included. The producer is unlikely to have
included them because he or she knows them to be important (how would they
know?). They will probably have no idea whether they are important or not. They
will have been included because they are available and the producer is being
cautious. Moreover, it is easier to include everything than to have to agonise over
what should be excluded. By reducing the data the analyst may well find that he or
she is dealing with just a fraction of the original set.

4.3.2 Stage 2: Re-present the Data


The visual approach is central to the understanding of numbers. A pattern can often
be seen quickly in a set of well-arranged data which otherwise (for example, when
computerised) appears to be no more than a random jumble. Recall the similarity
between handling words and numbers. In the same way that a verbal report is sifted,
meditated upon and mused over, so time also needs to be spent thinking about a
table of numbers. If the numbers are skilfully presented, the thinking process should
be shorter and there will be a greater chance that they will be understood. Good
presentation of numbers is analogous to clarity of style and the absence of jargon in
verbal reports. However, the visual aspect of data analysis and the need to present
data well is usually neglected, perhaps because it does not have a high technical
content, but the approach has been shown to work in practice.


The re-presentation being recommended does not refer just to data that are liter-
ally a random jumble. On the contrary, the assumption is that the data have already
been assembled in a neat table. Neatness is preferable to messiness but the patterns
may still be obscured. When confronted by apparent orderliness one should take
steps to re-present the table in a fashion which makes it easy to see any patterns
contained in it. The ways in which data can be rearranged were explored in detail in
the previous module. Recall that the seven steps were:
(a) Round the numbers to two effective figures.
(b) Put rows and columns in size order.
(c) Interchange rows and columns where necessary.
(d) Use summary measures.
(e) Minimise use of gridlines and white space.
(f) Make the labelling clear and do not allow it to hinder comprehension of the
numbers.
(g) Use a verbal summary.
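
Several of these steps can be scripted. The following Python sketch applies steps (a),
(b) and (d) to a few made-up regional figures; the data and names are illustrative only,
and rounding to two significant figures is taken here as a simple reading of ‘two
effective figures’.

from math import floor, log10

def two_effective_figures(x):
    # Step (a): round to two significant (effective) figures.
    return round(x, 1 - floor(log10(abs(x))))

# Illustrative regional figures only (not data from this module).
sales = [("East", 15214.9), ("North", 12763.4),
         ("South", 9871.2), ("West", 7348.6)]

rounded = [(region, two_effective_figures(value)) for region, value in sales]
rounded.sort(key=lambda row: row[1], reverse=True)    # step (b): size order

for region, value in rounded:
    print(f"{region:<6} {value:>8,.0f}")
average = sum(value for _, value in rounded) / len(rounded)
print(f"{'Avg':<6} {average:>8,.0f}")                 # step (d): summary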

4.3.3 Stage 3: Build a Model


Building a model is a mathematical-sounding expression for a straightforward task.
It is this stage that can lead the analyst to an unnecessarily technical and mathemati-
cal solution. The aim in building a model is essentially to find a pattern in the
numbers and some way of summarising them. The model may be a verbal one, an
arithmetic one or an algebraic expression, either simple or complex. Frequently the
simpler models prove sufficient. Even so, this stage is the most difficult and skilful,
requiring an element of creativity. The following are examples of very different
models, all of which proved adequate for the particular data and context concerned.
(a) Row averages all equal but with a ±20 per cent variation within each column.
(b) Real profits increased by 5 per cent p.a. between 2005 and 2010 but down by 2
per cent p.a. between 2011 and 2016.
(c) Sales per salesperson approximately equal for each northern region; sales per
salesperson approximately equal for each southern region; but the northern fig-
ure is 25 per cent higher than the southern.
(d) Column 1 (represented by y) is related to column 2 (represented by x) by the
expression: y = 0.75x + 3.
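
A model like (d) can be found directly from two columns of data. The sketch below fits a
straight line by least squares to made-up x and y values; the numbers are hypothetical
and only the method is the point.

from statistics import mean

# Hypothetical paired columns (not data from this module).
x = [4, 8, 12, 16, 20]
y = [6.1, 9.2, 11.9, 15.1, 17.8]

x_bar, y_bar = mean(x), mean(y)
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
print(f"Model: y = {slope:.2f}x + {intercept:.2f}")   # close to y = 0.75x + 3

# The residuals show how well the simple model summarises the data
# and point towards possible exceptions (Stage 4).
for xi, yi in zip(x, y):
    print(xi, round(yi - (slope * xi + intercept), 2))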
The principal benefit of a model is that just a few numbers or words have to be
dealt with instead of a profusion. If the model is a good one, it can be used as a
summary, for spotting exceptions, for making comparisons and for taking action.
The vast amount of information now available to managers (particularly in view
of recent growth in the use of all sizes of computers) means that when looking for a
model or pattern one should look at the simple ones first. There is usually neither
time nor sufficient specialist help to do otherwise. It has to be a simple model or
nothing. In any case the simple models are usually the more useful since they are
easier to handle and communicate with. There are sometimes objections to simple
models purely because they are simple and, it is supposed, could not represent the
harsh complexities of the world. Yet most of the best established scientific models
have been simple: Newton’s Laws and Boyle’s Law, for instance. They may not


encapsulate every nuance of reality, but for some time they have been able to
summarise and predict reality to a generally acceptable level of approximation. In
management the objective is usually no more than this.
Only if the simple approach fails are complex methods necessary, and then ex-
pert knowledge may be required. As a last resort, even if the numbers are random
(random means they have no particular pattern or order), this is a model of a sort
and can be useful. For example, the fact that the day-to-day movements in the
returns from quoted shares are random is an important part of modern financial
theory.

4.3.4 Stage 4: Exceptions


Once a pattern is established, it is feasible to look at the exceptions to it. In manage-
rial situations the exceptions can be more important than the pattern. For the
marketing divisions of a company, the latest monthly sales figures may be seen to
have a pattern in which the sales volume per salesperson is approximately the same
in every sales region except for region X, where the figure is lower. The exception
can now be investigated. Did it occur because it is a large region geographically and
thus more time is taken up travelling? Is it because someone is falling down on the
job? Is it on account of reasons within or beyond the control of the company? One
reason for an extreme exception is that someone made a mistake with the numbers
(e.g. a typist omitted a zero). This last possibility should always be considered.
When the exceptions have been noticed, they can be corrected or regretfully
disregarded (in the case of items beyond the company’s control) or management
action can be taken. However the exceptions are investigated and dealt with, it is the
recognition of a pattern or model in the numbers that makes it possible to see their
existence.
It is a common error to deal with exceptions before discerning the pattern. This
is the basis of the ‘number picking’ mentioned earlier. Not only is this illogical (since
an exception must by definition have something to be an exception from), but it
also leads to true exceptions being missed while spurious ones are discussed at
length and in depth.
A further possibility may arise. The exceptions may be so numerous and inexpli-
cable that the only conclusion has to be that the model chosen for the data is not
good enough. It does not explain the fluctuations in the numbers sufficiently well to
be regarded as a general model. The exceptions disprove rather than prove the rule.
In this situation the only option is to return to the data and attempt, in the light of
knowledge of the inadequacies of the failed model, to find a new model which
better represents the data. Several iterations of this process may take place before a
satisfactory model is found.

4.3.5 Stage 5: Comparisons


Having detected the pattern and explained or corrected the exceptions to it, one
may compare the results with other relevant information. Rarely is any analysis done
completely in isolation. Nearly always, there are other results with which to make a


comparison. The other results may be from another year, from another company,
from another country or from another analyst. In other words, reference can usually
be made to a wider set of information. In consequence, questions may be prompted:
Why is the sales mix different this year from the previous five? Why do other
companies have less brand switching for their products? Why is productivity higher
in the west of Germany? Making comparisons such as these provides a context in
which to evaluate results and also suggests the consistencies or anomalies which
may in turn lead to appropriate management action.
If the results coincide with others, then this further establishes the model and
may mean that in future fewer data may need to be collected – only enough to see
whether the already established model still holds. This is especially true of manage-
ment information systems where managers receive regular printouts of sets of
numbers and they are looking for changes from what has gone before. It is more
efficient for a manager to carry an established model from one time period to the
next rather than the raw data.
Example: Consumption of Distilled Spirits in the USA
As an example of an analysis of numbers that a manager might have to carry out,
consider Table 4.3 showing the consumption of distilled spirits in different states of the
USA. The objective of the analysis would be to measure the variation in consumption
across the states and to detect any areas where there were distinct differences. How
can the table be analysed and what information can be gleaned from it? The five stages
of the guidelines are followed.
Stage 1: Reduce the data. Many of the data are redundant. Are percentage fig-
ures really necessary when per capita figures are given? It is certainly possible, with
some imaginative effort, to conceive of uses of percentage data, but they are not
central to the purposes of the table. It can be reduced to a fraction of its original
size without any loss of real information.
Stage 2: Re-present. To understand the table more quickly, the numbers can be
rounded to two effective figures. The original table has numbers, in places, to eight
figures. No analyst could possibly make use of this level of specification. What con-
clusion would be affected if an eighth figure were, say, a 7 instead of a 4? In any
event, the data are not accurate to eight figures. If the table were a record docu-
ment (which it is not) then more than two figures may be required, but not eight.
Putting the states in order of decreasing population is more helpful than alphabetical
order. Alphabetical order is useful for finding names in a long list, but it adds nothing
to the analysis process. The new order means that states are just as easy to find.
Most people will know that California has a large population and Alaska a small one,
especially since no one using the table will be totally ignorant of the demographic
attributes of the USA. At the same time, the new order makes it easy to spot states
whose consumption is out of line with their population.
The end result of these changes, together with some of a more cosmetic nature, is
Table 4.4. Contrast this table with the original, Table 4.3.


Table 4.3 Consumption of distilled spirits in the USA


                   Rank in        Consumption in         Per cent     Per cent of US         Per capita
                   consumption    wine gallons           increase     total consumption
Licence states     2015   2014    2015        2014       (decrease)   2015    2014           2015   2014

Alaska 46 47 1 391 172 1 359 422 2.3 0.33 0.32 3.64 3.86
Arizona 29 30 4 401 883 4 144 521 6.2 1.03 0.98 1.94 1.86
Arkansas 38 38 2 534 826 2 366 429 7.1 0.60 0.56 1.20 1.12
California 1 1 52 529 142 52 054 429 0.9 12.33 12.32 2.44 2.46
Colorado 22 22 6 380 783 6 310 566 1.1 1.50 1.49 2.47 2.49

Connecticut 18 18 7 194 684 7 271 320 (−1.1) 1.69 1.72 2.31 2.35
Delaware 45 43 1 491 652 1 531 688 (−2.6) 0.35 0.36 2.56 2.65
Dist. of Columbia 27 27 4 591 448 4 828 422 (−4.9) 1.08 1.14 6.54 6.74
Florida 4 4 22 709 209 22 239 555 1.7 5.33 5.28 2.70 2.67
Georgia 13 13 10 717 681 9 944 846 7.8 2.52 2.35 2.16 2.02

Hawaii 41 40 2 023 730 1 970 089 2.7 0.48 0.47 2.28 2.28
Illinois 3 3 26 111 587 26 825 876 (−2.7) 6.13 6.35 2.33 2.41
Indiana 19 20 7 110 382 7 005 511 1.5 1.67 1.66 1.34 1.32
Kansas 35 35 2 913 422 2 935 121 (−0.7) 0.68 0.70 1.26 1.29
Kentucky 26 26 4 857 094 5 006 481 (−3.0) 1.14 1.19 1.42 1.47

Louisiana 21 21 7 073 283 6 699 853 5.6 1.66 1.59 1.84 1.77
Maryland 12 12 10 833 966 10 738 731 0.9 2.54 2.54 2.61 2.62
Massachusetts 10 10 13 950 268 14 272 695 (−2.3) 3.28 3.38 2.40 2.45
Minnesota 15 15 8 528 284 8 425 567 1.2 2.00 1.99 2.15 2.15
Missouri 20 17 7 074 614 7 697 871 (−7.9) 1.66 1.82 1.48 1.61

Nebraska 36 36 2 733 497 2 717 859 0.6 0.64 0.64 1.76 1.76
Nevada 30 31 4 360 172 4 095 910 6.5 1.02 0.97 7.15 6.92
New Jersey 8 8 15 901 587 16 154 975 (−1.6) 3.73 3.82 2.17 2.21
New Mexico 42 41 1 980 372 1 954 139 1.3 0.47 0.46 1.70 1.70
New York 2 2 41 070 005 41 740 341 (−1.6) 9.64 9.88 2.27 2.30

North Dakota 47 46 1 388 475 1 384 311 0.3 0.33 0.33 2.16 2.16
Oklahoma 33 29 3 904 574 4 187 527 (−6.8) 0.92 0.99 1.41 1.54
Rhode Island 39 39 2 073 075 2 131 329 (−2.7) 0.49 0.50 2.24 2.30
South Carolina 23 25 5 934 427 5 301 054 11.9 1.39 1.26 2.08 1.88
South Dakota 48 48 1 312 160 1 242 021 5.6 0.31 0.29 1.91 1.82

Tennessee 24 24 5 618 774 5 357 160 4.9 1.32 1.27 1.33 1.28
Texas 5 6 17 990 532 17 167 560 4.8 4.22 4.06 1.44 1.40
Wisconsin 11 11 10 896 455 10 739 261 1.5 2.56 2.54 2.36 2.33

Total licence 319 583 215 317 874 435 0.5 75.04 75.22 2.13 2.13


Table 4.4 Consumption of distilled spirits in the USA (amended)


Licence states                          Consumption
(in order of population)     Wine gallons (millions)    Per capita (gallons)
                                 2015        2014               2015
California        53.0    52.0    2.4
New York          41.0    42.0    2.3
Texas             18.0    17.0    1.4
Illinois          26.0    27.0    2.3
Florida           23.0    22.0    2.7

New Jersey        16.0    16.0    2.2
Massachusetts     14.0    14.0    2.4
Indiana            7.1     7.0    1.3
Georgia           11.0     9.9    2.2
Missouri           7.1     7.7    1.5

Wisconsin         11.0    11.0    2.4
Tennessee          5.6     5.4    1.3
Maryland          11.0    11.0    2.6
Minnesota          8.5     8.4    2.2
Louisiana          7.1     6.7    1.8

Kentucky           4.9     5.0    1.4
Connecticut        7.2     7.3    2.3
S. Carolina        5.9     5.3    2.1
Oklahoma           3.9     4.2    1.4
Colorado           6.4     6.3    2.5

Kansas             2.9     2.9    1.3
Arizona            4.4     4.1    1.9
Arkansas           2.5     2.4    1.2
Nebraska           2.7     2.7    1.8
New Mexico         2.0     2.0    1.7

Rhode Island       2.1     2.1    2.2
Hawaii             2.0     2.0    2.3
D. Columbia        4.6     4.8    6.5
S. Dakota          1.3     1.2    1.9
N. Dakota          1.4     1.4    2.2

Nevada             4.4     4.1    7.2
Delaware           1.5     1.5    2.6
Alaska             1.4     1.4    3.6

Average            9.7     9.6    2.3


Stage 3: Build a model. The pattern is evident from the transformed table. Con-
sumption varies with the population of the state. Per capita consumption in each
state is about equal to the figure for all licence states with some variation (±30 per
cent) about this level. The pattern a year earlier was the same except that overall
consumption increased slightly (1 per cent) between the two years. Refer back to
Table 4.3 and see if this pattern is evident even when it is known to be there. There
may of course be other patterns but this one is central to the objectives of the anal-
ysis.
Stage 4: Exceptions. The overall pattern of approximately equal per capita con-
sumption in each state allows the exceptions to be seen. From Table 4.4, three
states stand out as having a large deviation from the pattern. The states are District
of Columbia, Nevada and Alaska. These states were exceptions to the pattern in the
earlier year as well. Explanations in the cases of District of Columbia and Nevada are
readily found, probably being to do with the large non-resident populations. People
live, and drink, in these states who are not included in the population figures (diplo-
mats in DC, tourists in Nevada). An explanation for Alaska may be to do with the
lack of leisure opportunities. Whatever the explanations, the analytical method has
done its job. The patterns and exceptions in the data have been found. Explanations
are the responsibility of experts in the marketing of distilled spirits in the USA.
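
Once the model is fixed, spotting exceptions of this kind can be made mechanical. The
sketch below applies the Stage 3 model (an overall per capita level with a ±30 per cent
band) to a handful of the per capita figures keyed in from Table 4.4; only these six
states are checked here.

# Per capita consumption (gallons) for a few states, from Table 4.4.
per_capita = {"California": 2.4, "New York": 2.3, "Florida": 2.7,
              "D. Columbia": 6.5, "Nevada": 7.2, "Alaska": 3.6}

overall = 2.3    # all-licence-states average from Table 4.4

for state, value in per_capita.items():
    deviation = (value - overall) / overall
    if abs(deviation) > 0.30:    # outside the model's +/-30% band
        print(f"{state}: {value} ({deviation:+.0%} from the overall level)")

Run on these figures, only District of Columbia, Nevada and Alaska are printed, matching
the exceptions identified above.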
Stage 5: Comparison. A comparison between the two years is provided by the
table. Other comparisons will be relevant to the task of gaining an understanding of
the USA spirits market. The following data would be useful:
(i) earlier years, say, five and ten years before;
(ii) a breakdown of aggregate data into whisky, gin, vodka, etc.;
(iii) other alcoholic beverages: wine, beer, etc.
Once data from these other sources have been collected they would be analysed in
the manner described, but of course the process would be shorter because the
pattern can be anticipated. Care would need to be taken that like was being com-
pared with like. For example, it would have to be checked that an equivalent
definition of consumption was in force ten years earlier.

4.3.6 Implications for the Producers of Statistics


The methodology is intended for use by managers who are not statistical specialists.
However, if the problems that the methodology is intended to counter are real and
if it does indeed prove useful in overcoming them, there are implications for the
producers of data as well as for the users.
The obvious implication for producers of statistics is that more attention should
be paid to the uses to which the data are being put. This does not mean asking
managers what they want since this will just lead to an over-provision of data.
Likewise, the practice of issuing all the data that are held (albeit broken down into
finance, marketing, etc.) results in a surfeit of data which merely confuses the
aspiring analyst. Designers of information systems, writers of reports and the like
should consider exactly what contribution their data are supposed to make and how
they are expected to improve decision making. What is needed is a more systematic
approach whereby data are provided not in isolation but in the context of the
management processes they are intended to serve.


The second implication is more direct. Data should be presented in forms which
enable them to be analysed speedily and accurately. Much of the reduction and re-
presentation stages of the guidelines could, in most instances, be carried out just as
well by the producer of the data as by the user. It would then need to be done only
once rather than many times by the many users of the data. Unfortunately, when
time is spent thinking about the presentation of statistics, it is usually spent in
making the tables look neat or attractive rather than making them amenable to
analysis.
There is much that the producers of data can do by themselves. For example,
refer back to the extract from a management information system shown in Ta-
ble 4.1: if thought were given to the analysis of the data through the application of
the guidelines, a different presentation would result (see Table 4.5).
(a) Some data might be eliminated. For instance, full details on minor categories of
expenditure may not be necessary. This step has not been taken in Table 4.5
since full consultation with the receivers would be necessary.
(b) The table should be re-presented using the rules of data presentation. In
particular, some rounding is helpful. This is an information document, not an
auditing one, and thus rounding is appropriate. In any case, no different conclu-
sions would be drawn if any of the expenditures were changed by one unit. In
addition, improvement is brought about by use of summary measures and a
clearer distinction between such measures and the detail of the table.
(c) A model derived from previous time periods would indicate when changes were
taking place. There is a good case for including a model or summary of previous
time periods with all MIS data. This has not been done for Table 4.5 since previ-
ous data were not available.
(d) Exceptions can be clearly marked. It is, after all, a prime purpose of budget data
to indicate where there have been deviations from plan. This can be an automatic
process: for example, all variances greater than 10 per cent could be marked, as
sketched after this list. This might even obviate the need for variance figures to be
shown.
(e) The making of comparisons is probably not the role of the data producer in this
example, involving as it does the judgement of the receivers in knowing what the
relevant comparisons are. The task of the producer has been to facilitate these
comparisons.
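
The automatic marking suggested in point (d) amounts to a few lines of code. The sketch
below, in Python, applies the 10 per cent rule to the month figures of a few lines from
Table 4.5, including one with the zero-standard problem noted in the table’s footnote.

# Estimate/standard cost pairs taken from the month columns of Table 4.5.
lines = {"LO-LO stevedoring (full)": (131_000, 144_000),
         "LO-LO stevedoring (MT)": (13_400, 14_700),
         "RO-RO rolling vehicles": (14_300, 19_500),
         "LO-LO unlashing": (78, 0)}

for name, (estimate, standard) in lines.items():
    if standard == 0:
        print(f"{name}: * (zero standard cost, variance not calculable)")
        continue
    var_pct = (standard - estimate) / standard
    marker = "  <-- variance over 10%" if abs(var_pct) > 0.10 else ""
    print(f"{name}: variance {var_pct:.0%}{marker}")

Only the rolling vehicles line (27 per cent) is marked; the stevedoring variances of
about 9 per cent fall inside the band.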
Making the suggested changes does of course have a cost attached in terms of
management time. However, the cost is a small fraction of the cost of setting up and
operating the information system. The changes can transform the system and make
it fully operational. If an existing system is being largely ignored by managers, there
may be no alternative.


Table 4.5 Budgeting data from an MIS (amended from Table 4.1)
Port: Liverpool OCEAN PORT TERMINAL COSTS – SHIPSIDE OPERATIONS
Period: December (U.S. Dollars: Conversion rate 1.60)

MONTH CUMULATIVE
ESTIMATE STANDARD VARIANCE VAR % ESTIMATE STANDARD VARIANCE VAR % BUDGET

LO-LO: Stevedoring (STR-FULL) 131 000 144 000 12 000 9 1 240 000 1 360 000 120 000 9 1 560 000
Stevedoring (STR-MT) 13 400 14 700 1 300 9 257 000 281 000 24 000 9
Unlashing 78 0 −78 * 78 0 −78 *
Shifting 200 0 −200 * 12 000 0 −12 000 *
Overtime, etc. 7 100 0 −7 100 * 191 000 0 −191 000 *

RO-RO: Stevedoring TR (STR-FULL) 20 400 26 100 5 800 22 167 000 215 000 48 000 22 330 000
Stevedoring TR (STR-MT) 180 230 50 22 15 000 19 000 4 100 22
Stevedoring cost plus 0 0 0 0 0 0 0 0
Stevedoring Volvo 0 0 0 0 0 0 0
Stevedoring rolling 14 300 19 500 5 200 27 98 000 157 000 59 000 37
Stevedoring blockstow 29 27 −2 −7 610 670 60 9
Unlashing 350 0 −350 * 350 0 −350 *
Shifting 980 0 −980 * 3 800 0 −3 800 *
Overtime, etc. 1 400 0 −1 400 * 29 000 0 −29 000 *
Heavy lifts 0 0 0 0 2 000 0 −2 000 *

CARS: Stevedoring (STR) 6 100 6 400 280 4 38 000 35 000 −3 000 −9 170 000
Unlashing 0 0 0 0 2 0 −2 *
Shifting 800 0 −800 * 1 300 0 −1 300 *
Overtime, etc. 0 0 0 0 7 600 0 −7 600 *

OTHER: Shipside costs 3 400 0 −3 400 * 24 000 0 −24 000 *

TOTALS: LO-LO 152 000 158 000 5 900 4 1 700 000 1 640 000 −59 000 −4
RO-RO 38 000 46 000 8 300 18 315 000 392 000 77 000 20
CARS 6 900 6 400 −520 −8 47 000 35 000 −12 000 −34
OTHER 3 400 0 −3 400 * 24 000 0 −24 000 *

GRAND 201 000 211 000 10 000 5 2 080 000 2 070 000 −14 000 −1 2 060 000
TOTAL
Totals may not agree because of rounding.
*Zero standard cost, therefore variance not calculable.


Most managers will admit that there is currently a problem with the provision
and analysis of data, but they rarely say so to their IT systems manager or whoever
sends them data. Without feedback, inexpensive yet effective changes are never
made. It must be the responsibility of users to criticise constructively the form and
content of the data they receive. The idea that computer scientists/statisticians
always know best, and if they bother to provide data then they must be useful, is
false. The users must make clear their requirements, and even resort to a little
persistence if alterations are not forthcoming.

Learning Summary
Every manager sees the problem of handling numbers differently because each sees
it mainly in the (probably) narrow context with which he or she is familiar in his or
her own work. One manager sees numbers only in the financial area; another sees
them only in production management. The guidelines suggested here are intended
to be generally applicable to the analysis of business data in many different situa-
tions and with a range of different requirements. The key points are:
(a) In most situations managers without statistical backgrounds can carry out
satisfactory analyses themselves.
(b) Simple methods are preferable to complex ones.
(c) Visual inspection of well-arranged data can play a role in coming to understand
them.
(d) Data analysis is like verbal analysis.
(e) The guidelines merely make explicit what comes naturally when dealing with
words.
The need for better skills to turn data into real information in managerial situa-
tions is not new. What has made the need so urgent in recent times is the
exceedingly rapid development of computers and associated management infor-
mation systems. The ability to provide vast amounts of data very quickly has grown
enormously. It has far outstripped the ability of management to make use of the
data. The result has been that in many organisations managers have been swamped
with so-called information which in fact is no more than mere numbers. The
problem of general data analysis is no longer a small one that can be ignored. When
companies are spending large amounts of money on data provision, the question of
how to turn the data into information and use them in decision making is one that
has to be faced.


Review Questions
4.1 Traditional statistical techniques do not help managers in analysing data. True or false?

4.2 The need for new management skills in data analysis arises because so many data come
from computers, which means that they have to be presented in a more complicated
style. True or false?

4.3 Which of the following reasons is correct? The first step in data analysis is to reduce the
data. It is done because:
A. Most data sets contain some inaccuracies.
B. One can only analyse a small amount of data at a time.
C. Most data sets contain some items which are irrelevant.

4.4 Data recorded to eight decimal places can be rounded down since such a degree of
accuracy will not affect the decision being taken. True or false?
A. True
B. False

4.5 Which of the following reasons are correct? A model or pattern is used to summarise a
table because:
A. Exceptions can be seen more easily and accurately.
B. It is easier to make comparisons with other sets of data.
C. The model will be more accurate than the original data.

4.6 Which statement below best describes the following data?

Year Sales (£m)


2013 3.2
2014 4.0
2015 5.0
2016 6.2

A. Growth of about £1 million p.a.


B. Growth of about 25 per cent p.a.
C. 2016 sales were the highest.
D. Average sales = 4.6 million.


4.7 A company has four divisions. The profit and capital employed by each are given in the
table below. Which division is the exception?

Profit Capital employed


Division 1 4.8 80.3
Division 2 7.2 191.4
Division 3 3.6 59.4
Division 4 14.5 242.0

A. Division 1
B. Division 2
C. Division 3
D. Division 4

4.8 A confectionery manufacturer’s production level for a new chocolate bar is believed to
have increased by 5 per cent per month over the last 36 months. However, for 11 of
these months this model does not fit. The exceptions were as follows: for five months
strikes considerably reduced production; the three Decembers had lower figures, as did
the three Augusts, when the production plant is closed for two weeks. You would be
right in concluding that the 5 per cent model is not a good one because 11 exceptions
out of 36 is too many. True or false?

4.9 Towards the completion of an analysis of the consumption of distilled spirits across the
different states of the USA in a particular year, the results are compared with those of
similar studies. Which of the following other analyses would be useful?
A. Consumption of distilled spirits across the départements of France.
B. Consumption of wine across the départements of France.
C. Consumption of wine across the states of the USA.
D. Consumption of whisky across the states of the USA.

4.10 A simple model is used in preference to a sophisticated one in the analysis of data
because sophisticated models obscure patterns. True or false?

Case Study 4.1: Motoring Correspondent


1 This statement was made in a British national newspaper by its motoring correspondent
in February 2015:

Ten years ago, in 2004, the death rate on the roads of this country was running
at 0.1 death for every 1 million miles driven. By 2009 a death occurred every
12 million miles. Last year, according to figures just released, there were 6400
deaths, whilst a total of 92 000 million miles were driven.

a. Analyse these data.


b. Was driving safer in 2014 than in 2004?
c. Make a forecast of the death rate in 2018.
d. What reservations are there regarding the prediction?


Case Study 4.2: Geographical Accounts


1 Table 4.6, taken from the annual accounts of a major UK company, shows the
expenditure by geographical divisions, broken down into categories. For instance, the
Wessex region spent £48 545 000 in all, split into £573 000 on finance, £13 224 000 on
raw materials, etc. Analyse the table, showing particularly, for each expenditure
category, the regions that have unusual expenditure levels.

Table 4.6 Expenditure by geographical division (£000 (%))


                     Total       Finance    Raw         Manpower    Transport   Fuel
                                            materials
England and Wales    1 109 896   34 406     447 161     249 820     318 430     8 968
                     (100.0)     (3.1)      (40.3)      (22.5)      (28.7)      (0.8)
Divisions
North West           149 831     1 105      75 517      28 050      38 679      1 112
                     (100.0)     (0.7)      (50.4)      (18.7)      (25.8)      (0.7)
Northumbria          39 426      121        18 406      9 963       9 689       346
                     (100.0)     (0.3)      (46.7)      (25.3)      (24.6)      (0.9)
Severn-Trent         187 005     8 401      68 861      39 136      62 301      1 732
                     (100.0)     (4.5)      (36.8)      (20.9)      (33.3)      (0.9)
Yorkshire            109 607     1 270      51 462      18 537      33 916      969
                     (100.0)     (1.2)      (47.0)      (16.9)      (30.9)      (0.9)
Anglia               137 898     4 184      47 721      33 487      41 157      811
                     (100.0)     (3.0)      (34.6)      (24.3)      (29.8)      (0.6)
Thames               227 745     8 055      87 221      58 572      62 573      1 671
                     (100.0)     (3.5)      (38.3)      (25.7)      (27.5)      (0.7)
South                76 539      1 216      22 596      20 292      25 709      693
                     (100.0)     (1.6)      (29.5)      (26.5)      (33.6)      (0.9)
Wessex               48 545      573        13 224      16 379      15 481      476
                     (100.0)     (1.2)      (27.2)      (33.7)      (31.9)      (1.0)
South West           39 230      950        18 083      8 132       9 564       307
                     (100.0)     (2.4)      (46.1)      (20.7)      (24.4)      (0.8)
Wales                94 070      8 531      44 070      17 272      19 361      851
                     (100.0)     (9.1)      (46.8)      (18.4)      (20.6)      (0.9)

(Note: This table is only half the original. Several expenditure categories have been
left out to keep it to a manageable size. Although totals for each expenditure type are
correct, the sum of expenditure for each region is less than the total.)


Case Study 4.3: Wages Project


1 Table 4.7 was taken from the report on a research project conducted for a UK
government department. The project was intended to find out the extent to which
employers were paying wages that were lower than nationally agreed minimum wages. It
involved a sample survey of 11 trades in four geographical areas. For each of the trades,
the table shows:
a. the percentage of those employers sampled who were found to be underpaying;
b. the percentage of employees included in the survey who were being underpaid;
c. the total amount underpaid in the year of the survey in each trade by the employers
included in the sample.
Analyse the data and suggest what the main conclusions might be. If no general pattern
exists, suggest what extra data are required to further the analysis.

Table 4.7 Wages underpayment


Trades                            Proportion of           Proportion of              Amount
                                  employers inspected     employees’ wages           underpaid (£)
                                  who were found to       examined that were
                                  be underpaying (%)      entitled to arrears (%)
Retail bread 32.99 23.22 2 516.01
Bookselling 50.00 20.07 1 205.48
Drapery and outfitting 31.85 17.20 11 922.26
Retail food 21.74 11.89 16 850.33
Furnishing and allied trades 20.19 7.86 14 085.62
Newsagency and tobacco 32.93 21.63 9 564.66
Hairdressing 26.23 11.30 4 394.04
Licensed non-residential 22.01 10.24 6 218.02
hotels
Licensed restaurants 30.77 8.79 3 939.20
Unlicensed restaurants 47.58 32.78 5 315.83
Other trades 17.86 8.20 157.30
Total 27.08 13.91 76 168.75

References
Ehrenberg, A. S. C. (1975). Data Reduction. New York: John Wiley and Sons.



Module 5

Summary Measures
Contents
5.1 Introduction.............................................................................................5/1
5.2 Usefulness of the Measures ....................................................................5/2
5.3 Measures of Location..............................................................................5/5
5.4 Measures of Scatter ............................................................................. 5/14
5.5 Other Summary Measures ................................................................. 5/20
5.6 Dealing with Outliers .......................................................................... 5/21
5.7 Indices ................................................................................................... 5/22
Learning Summary ......................................................................................... 5/29
Review Questions ........................................................................................... 5/30
Case Study 5.1: Light Bulb Testing............................................................... 5/33
Case Study 5.2: Smith’s Expense Account .................................................. 5/34
Case Study 5.3: Monthly Employment Statistics ........................................ 5/34
Case Study 5.4: Commuting Distances ........................................................ 5/34
Case Study 5.5: Petroleum Products ........................................................... 5/35

Prerequisite reading: None

Learning Objectives
By the end of the module, the reader should know how large quantities of numbers
can be reduced to a few simple summary measures that are much easier to handle
than the raw data. The most common measures are those of location and scatter.
The special case of summarising time series data with indices is also described.

5.1 Introduction
When trying to understand and remember the important parts of a lengthy verbal
report, it is usual to summarise. This may be done by expressing the essence of the
report in perhaps a few sentences, by underlining key phrases or by listing the main
subsection headings. Each individual has his own method, which may be physical (a
written précis) or mental (some way of registering the main facts in the mind).
Whatever the method, the point is that it is easier to handle information in this way,
by summarising and storing these brief summaries in one’s memory. On the few
occasions that details are required, it is necessary to turn to the report itself.
The situation is no different when it is numerical rather than verbal information
that is being handled. It is still better to form a summary to capture the salient
characteristics. The summary may be a pattern, simple or complex, revealed when


analysing the data, or it may be based on one or more of the standard summary
measures described in this module.
Inevitably some accuracy is lost. In the extreme, if the summarising is badly done,
it can be wholly misleading. (How often do report writers claim to have been totally
misunderstood after hearing someone else’s summary of their work?) Take the case
of a recent labour strike in the UK about wages payment. In reporting the current
levels of payment, newspapers/union leaders/employers could not, of course, state
the payments for all 173 000 employees in the industry. They had to summarise.
Five statements as to the ‘average’ weekly wage were made:
The average weekly wage is …

Quote 1: £241 (union leader)
Quote 2: £248 (newspaper)
Quote 3: £271 (newspaper)
Quote 4: £298 (newspaper)
Quote 5: £323 (employers’ organisation)

All these quoted wages were said to be the same thing: the average weekly wage.
Are the employees in the industry grossly underpaid or overpaid? It is not difficult
to choose an amount that reinforces one’s prejudices. The discrepancies are not
because of miscalculations but because of definitions. Quote 1 is the basic weekly
wage without overtime, shift allowances and unsocial hours allowance, and it has
been reduced for tax and other deductions. Since the industry is one that requires
substantial night-time working for all employees, no one actually takes home the
amount quoted. Quote 2 is the same as the first but without the tax deduction.
Quote 3 is the average take-home pay of a sample of 30 employees in a particular
area. Quote 4 is basic pay plus unsocial hours allowance but without any overtime
or tax deductions. Quote 5 is basic pay plus allowances plus maximum overtime
pay, without tax and other deductions.
It is important when using summary measures (and in all of statistics) to apply
common sense and not be intimidated by complex calculations. Just because
something that sounds statistical is quoted (‘the average is £41.83’) does not mean
that its accuracy and validity should be accepted without question. When summary
measures fail, it is usually not because of poor arithmetic or poor statistical
knowledge but because common sense has been lacking.
In this context, the remainder of this module goes on to describe ways of sum-
marising numbers and to discuss their effectiveness and their limitations.

5.2 Usefulness of the Measures


There are several types of summary measure. Each type of measure summarises a
different aspect of the data. For management purposes, it is usually possible to
summarise adequately a set of data using just two or three types of measure. For
example, the managers of the assembly line for a certain make of car receive


monthly a computer report of the two previous months’ production. The report for
June (22 working days) and May (19 working days) is given in Table 5.1.

Table 5.1 Car assembly data


Monthly production report
This month: June Previous month: May
Date Production Date Production
Mon 1 235
Tues 2 240
Wed 3 210
Thurs 4 240
Fri 5 225 Fri 1 238
Weekly total 1150 238
Mon 8 238 Mon 4 0
Tues 9 247 Tues 5 248
Wed 10 242 Wed 6 253
Thurs 11 241 Thurs 7 256
Fri 12 228 Fri 8 242
Weekly total 1196 999
Mon 15 226 Mon 11 245
Tues 16 231 Tues 12 242
Wed 17 233 Wed 13 238
Thurs 18 220 Thurs 14 247
Fri 19 215 Fri 15 239
Weekly total 1125 1211
Mon 22 230 Mon 18 249
Tues 23 234 Tues 19 244
Wed 24 225 Wed 20 241
Thurs 25 225 Thurs 21 247
Fri 26 220 Fri 22 236
Weekly total 1134 1217
Mon 29 234 Mon 25 0
Tues 30 220 Tues 26 252
Wed 27 246
Thurs 28 241
Fri 29 235
Weekly total 454 974
MONTHLY TOTAL 5059 4639

The data as shown are useful for reference purposes (e.g. what was the produc-
tion on 15 May?) or for the background to a detailed analysis (e.g. is production
always lower on a Friday and, if so, by how much?). Both these types of use revolve
around the need for detail. For more general information purposes (e.g. was May a
good month for production? What is the trend of production this year?) the amount


of data contained in the table is too large and unwieldy for the manager to be able to
make the necessary comparisons. It would be rather difficult to gauge the trend of
production levels so far this year, from six reports, one for each month, such as that
in Table 5.1. If summary measures were provided, then most questions, apart from
the ones that require detail, could be answered readily. A summary of Table 5.1
might be as shown in Table 5.2.
The summary information provided in Table 5.2 enables a wide variety of man-
agement questions to be answered and, more importantly, answered quickly.
Comparisons are made much more easily if summary data for several months, or
years, are available on one report.
Three types of summary measure are used in Table 5.2. The first, average pro-
duction, measures the location of the numbers and tells at what general level the
data are. The second, the range of production, measures scatter and indicates how
widely spread the data are. The third indicates the shape of the data. In this case,
the answer ‘symmetrical’ says that the data fall equally on either side of the average.

Table 5.2 Summary of May and June production


Summary Production Report
June May
Average production/day* 230 244
Range of daily production* 210–247 235–256
Shape of distribution Symmetrical Symmetrical
* Excluding holidays, when there was no production.

The three measures reflect the important attributes of the data. No important
general features of the data are omitted. If, on the other hand, the measure of scatter
had been omitted, the two months could have appeared similar. In actual fact, their
very different ranges provide an important piece of information that reflects
production planning problems.
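
The measures in Table 5.2 take only a few lines to reproduce. The sketch below, in
Python, computes the June figures from the 22 daily production values of Table 5.1,
holidays excluded.

# Daily June production keyed in from Table 5.1 (22 working days).
june = [235, 240, 210, 240, 225, 238, 247, 242, 241, 228, 226,
        231, 233, 220, 215, 230, 234, 225, 225, 220, 234, 220]

average = sum(june) / len(june)           # measure of location
low, high = min(june), max(june)          # measure of scatter (a range)
print(f"Average production/day: {average:.0f}")    # 230
print(f"Range of daily production: {low}-{high}")  # 210-247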
For each type of measure (location, scatter, shape) there is a choice of measure to
use (for location, the choice is between arithmetic mean and other measures). The
different types of measures are described below. The measures have many uses
other than as summaries and these will be indicated. They will also be found in
other subject areas. For example, the variance, a measure of scatter, plays a central
role in modern financial theory.


5.3 Measures of Location


Measures of location are also referred to as measures of central tendency. Their
purpose is to show, in general terms, the size of the data in question.
5.3.1 Arithmetic Mean
The most common and useful measure is the well-known arithmetic mean. It is
defined as:

Arithmetic mean = Sum of the readings / Number of readings

This can be put in shorthand (or mathematical) notation:

x̄ = ∑x/n

where:
x refers to the data in the set
x̄ is standard notation for the arithmetic mean
∑ is the Greek capital sigma and, mathematically, means ‘sum of’
n is standard notation for the number of readings in the set.

For example, for the nine numbers 3, 3, 4, 5, 5, 6, 6, 6, 7:

x̄ = (3 + 3 + 4 + 5 + 5 + 6 + 6 + 6 + 7)/9
   = 45/9
   = 5

Arithmetic mean = 5
5.3.2 Median
The median is the middle value of a set of numbers. There is no mathematical
formula for calculating it. It is obtained by listing the numbers in ascending order,
and the median is that number at the halfway point.
For example, using the same nine numbers as above which are already in ascend-
ing order:

3, 3, 4, 5, 5, 6, 6, 6, 7

The fifth value is the middle number, so:

Median = 5
If there is an even number of readings, then there can be no one middle number.
In this case, it is usual to take the arithmetic mean of the middle two numbers as the
median.
For example, if the set of nine numbers above was increased to ten by the pres-
ence of ‘8’, the set would become:


3, 3, 4, 5, 5, 6, 6, 6, 7, 8 (the fifth and sixth values, 5 and 6, are the middle two numbers)

Median = (5 + 6)/2
       = 5.5

5.3.3 Mode
The third measure of location is the mode. This is the most frequently occurring
value. Again, there is no mathematical formula for the mode. The frequency with
which each value occurs is noted and the value with the highest frequency is the
mode.
Again, using the same nine numbers as an example: 3, 3, 4, 5, 5, 6, 6, 6, 7

Number Frequency
3 2
4 1
5 2
6 3
7 1
Mode = 6
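These three measures are also available as standard functions in spreadsheets (AVERAGE, MEDIAN and MODE in most packages) and in programming languages. As a minimal illustrative sketch (Python is assumed here; it is not prescribed by the text):

import statistics

readings = [3, 3, 4, 5, 5, 6, 6, 6, 7]

print(statistics.mean(readings))    # arithmetic mean: 45/9 = 5
print(statistics.median(readings))  # middle value of the ordered set: 5
print(statistics.mode(readings))    # most frequently occurring value: 6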

5.3.4 Calculating Measures of Location


A set of data might simply be recorded as a list of numbers in a spreadsheet.
Calculating measures of location then involves the easy task of highlighting the
numbers and applying one of the standard spreadsheet functions. On other occa-
sions a set of data may be assembled and displayed in the form of a frequency table.
For example, the data used above to illustrate measures of location were in the form
of a frequency table when the mode was calculated. For large sets of data the
frequency table may show the data in groups. For example, Table 5.3 refers to the
number of customer complaints received daily by a railway company over a one-year
(350 days) period. It shows the number of days on which there were fewer than ten
complaints, the number of days when there were between ten and 19 complaints
and so on.

Table 5.3 Railway complaints


No. of complaints No. of days
0 to 9 24
10 to 19 33
20 to 29 68
30 to 39 54

40 to 49 53
50 to 59 42
60 to 69 31
70 to 79 19
80 to 89 10
90+ 16
Total 350

The calculation of measures of location for grouped or classified data such as


those in Table 5.3 is based on the principle of representing each class of data by its
mid-point. So, for example, all 53 observations in the 40–49 class are treated as if
they were equal to the mid-point, 44.5. The calculation of the arithmetic mean then
proceeds as in Table 5.4.

Table 5.4 Railway complaints


No. of complaints Mid-point No. of days Frequency × Mid-point
0 to 9 4.5 24 108.0
10 to 19 14.5 33 478.5
20 to 29 24.5 68 1666.0
30 to 39 34.5 54 1863.0
40 to 49 44.5 53 2358.5
50 to 59 54.5 42 2289.0
60 to 69 64.5 31 1999.5
70 to 79 74.5 19 1415.5
80 to 89 84.5 10 845.0
90+ 94.5 16 1512.0
Total 350 14 535.0
Arithmetic mean = ∑(Frequency × Mid-point) for all classes/350
= 14 535/350
= 41.53

Treating all data in a class as if each observation were equal to the mid-point is of
course an approximation, but it is done to simplify the calculations. However, on
some occasions the data may only be available in this form anyway. For example, in
measuring the lengths of machined car components as part of a quality check, the
observations would probably be recorded in groups such as ‘100.5 to 101.0’ rather
than as individual measurements such as ‘100.634’. The most serious approximation
in Table 5.4 is in taking the mid-point of the 90+ class as 94.5, since this class could
include days when complaints had been much higher, say 150, because of some
special circumstances such as severe disruption on account of a derailment. For


open-ended groups such as this it may be necessary to examine the outliers to test
the validity of the mid-point approximation.
Calculating the mode and median from a frequency table is more straightforward.
The median class is the one in which the middle observation lies. In this case the
175th and 176th observations lie in the 30–39 class (i.e. the median is 34.5). The
mode is the mid-point of the class with the highest frequency. In this case the class
is 20–29 and the mode is therefore 24.5.
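The grouped-data calculations above are easily mechanised. The sketch below (Python assumed; the variable names are illustrative, not from the text) reproduces the mid-point approximation of Table 5.4 and the mid-point estimates of the median and mode:

# Mid-points and frequencies from Table 5.3
midpoints = [4.5, 14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5]
days = [24, 33, 68, 54, 53, 42, 31, 19, 10, 16]

total = sum(days)                                         # 350
mean = sum(m * f for m, f in zip(midpoints, days)) / total
print(round(mean, 2))                                     # 41.53

# Mode: mid-point of the class with the highest frequency
print(midpoints[days.index(max(days))])                   # 24.5

# Median: mid-point of the class containing the middle observation
cumulative = 0
for m, f in zip(midpoints, days):
    cumulative += f
    if cumulative >= total / 2:
        print(m)                                          # 34.5
        break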

5.3.5 Choosing the Measure of Location


Figure 5.1, Figure 5.2 and Figure 5.3 show three different sets of data generated
from three different situations. Each has a different shape of histogram. Check your
understanding of their definitions by calculating mean, median and mode for each
set and decide which measure of location is the most representative for each set.

Context: Marks out of 15 scored by each of 20 participants in a driving competition.
Readings: 5, 5, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 8, 9, 9, 9, 10, 10, 11, 12
Shape: A symmetrical distribution.

[Histogram omitted: number of readings plotted against marks 1 to 12.]

Figure 5.1 Driving competition results


Table 5.5 lists the results. Note that the arithmetic mean of all three distributions
is the same: 8.0. However, the data are very different in the three cases, indicating a
need for care in summarising.

Table 5.5 Symmetrical, U shape and Reverse J distributions


Readings Mean Median Mode
Fig. 5.1: Symmetrical 8.0 8 8
Fig. 5.2: U shape 8.0 2 0 and 19
Fig. 5.3: Reverse J 8.0 1 0

For the symmetrical distribution (Figure 5.1) all three measures are equal. This is
always approximately the case for symmetrical data. Whichever measure is chosen, a


similar answer results. Consequently, it is best to use the most well-known measure
(i.e. the arithmetic mean) to summarise location for this set of data.
Calculations for Figure 5.1:

Mean = (5 + 5 + 6 + 6 + 7 + 7 + 7 + 7 + 8 + 8 + 8 + 8 + 8 + 9 + 9 + 9 + 10 + 10 + 11 + 12)/20
     = 160/20
     = 8
Median = middle value of set
       = average of 10th and 11th values (both 8)
       = 8
Mode = most frequently occurring value
     = 8

Context: Number of episodes in a 19-part serial seen by a sample of 20 viewers.
Readings: 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 4, 17, 18, 18, 19, 19, 19, 19, 19
Shape: A U-shaped distribution.

[Histogram omitted: number of readings plotted against episodes seen, 0 to 19.]

Figure 5.2 Television viewing data


The U-shaped data (Figure 5.2) are typical for certain television series and also
magazine readership where many people see hardly any programmes/issues, many
people see all or most, and just a very few fall between the two extremes. Here the
arithmetic mean is not very helpful, since no one actually sees between five and 16
episodes. As a summary of the data, it would mislead. The median is not a particu-
larly good summariser either. Data in which virtually all readings are at one of two
extremes cannot easily be reduced to a single measure that is the middle value of the
set. It is also very sensitive to small changes. A serial that was only slightly more
popular could have resulted in a median of around 18. The best measure is the
mode. In this case, there are two, 0 and 19. Even when technically there is only one
mode for the data, it is usual, when the histogram has more than one definite peak,

to quote more than one mode, each mode corresponding to one peak. For example,
in Figure 5.2, had the frequencies for 0 and 19 episodes been 5 and 4 respectively,
technically there would have been one mode at 0, but because the histogram still
would have two peaks, the data should be reported as having two modes.
Calculations for Figure 5.2:

Mean = (0 + 0 + 0 + 0 + 0 + 1 + 1 + 1 + 1 + 2 + 2 + 4 + 17 + 18 + 18 + 19 + 19 + 19 + 19 + 19)/20
     = 160/20
     = 8
Median = middle value of set
       = average of 10th and 11th values (both 2)
       = 2
Mode = most frequently occurring values
     = 0 and 19

Context: Weeks away from work through sickness in a one-year period for a sample of 20 employees in a particular company.
Readings: 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 3, 5, 18, 28, 44, 52
Shape: A reverse J-shaped distribution.

[Histogram omitted: number of readings plotted against weeks of sickness.]

Figure 5.3 Sickness records


The reverse J-shaped distribution (Figure 5.3) arises with distributions that are
truncated at one end (no values less than 0) but which have a few outliers at the
other end of the scale. It is typical of sickness records, where most workers are
absent very little or not at all, but where one or two employees with a major illness
may be absent for most of the year. The best measure is the median. It tells how
many weeks off the ‘middle’ employee had. There are major defects for the other
two measures. The arithmetic mean is distorted by the outliers. A very different

value would be obtained if the outlier of 52 weeks had not been present. Then the
mean would have been reduced from 8.0 to 5.7.
In all situations, the arithmetic mean can be misleading if there are just one or
two extremes in the data. The mode is not misleading, just unhelpful. Most sickness
records have a mode of 0, therefore to quote ‘mode 0’ is not providing any more
information than merely saying that the data concern sickness records.
Calculations for Figure 5.3:
Mean = (0 + 0 + 0 + 0 + 0 + 0 + 0 + 1 + 1 + 1 + 1 + 2 + 2 + 2 + 3 + 5 + 18 + 28 + 44 + 52)/20
     = 160/20
     = 8
Median = middle value of set
       = average of 10th and 11th values (both 1)
       = 1
Mode = most frequently occurring value
     = 0
Mean, median and mode are the major measures of location and are obviously
useful as summaries of data. Equally obviously, they do not capture all aspects of a
set of numbers. Other types of summary measure are necessary. However, before
we leave measures of location, their uses, other than as summarisers, will be
described.

5.3.6 Other Uses of Measures of Location

As a Focus for the Eye


The importance of the visual approach in data analysis has already been stressed.
One aspect of this is the use of summary measures of location as a focus to guide
the eye and give a more speedy impression of the meaning of the data. Consider the
following two sets of data. It is difficult to see the shape or pattern in them.

Set 1: 8, 7, 5, 11, 10, 7, 8, 8, 6, 7, 10, 12, 5, 6, 7, 9, 9, 8, 8, 9


Set 2: 2, 1, 0, 18, 5, 0, 52, 2, 1, 0, 1, 44, 3, 0, 0, 28, 0, 0, 2, 1

Adding the mean of each set of data, as in the next two sets, allows the shape of
the distribution to become apparent more quickly to the eye. As you can see along
the rows:

Set 1: 8, 7, 5, 11, 10, 7, 8, 8, 6, 7, 10, 12, 5, 6, 7, 9, 9, 8, 8, 9 Mean = 8.0


Set 2: 2, 1, 0, 18, 5, 0, 52, 2, 1, 0, 1, 44, 3, 0, 0, 28, 0, 0, 2, 1 Mean = 8.0

In the first case, the focus enables one to see that the numbers are scattered
closely and about equally either side of the mean; in the second case, one sees that
most numbers are below the mean with just a few considerably above.
In fact, the two sets are the symmetrical data and the reverse J-shaped data intro-
duced earlier in Figure 5.1 and Figure 5.3. In the latter case the arithmetic mean was
judged not to be the most useful measure to act as a summary; nevertheless it has a
value when used as a focus for the eye. One meets this usage with row and column
averages in tables of numbers.

For Comparisons
Measures of location can be used to compare two (or more) sets of data regardless
of whether the measure is the best summary measure for that set.

Set 1: 5, 7, 8, 9, 9, 10 Mean = 8
Set 2: 5, 5, 5, 6, 6, 7, 8, 10 Mean = 6.5

The two sets of data above contain a different number of readings. The arithme-
tic mean may or may not be the correct summary measure for either set.
Nevertheless, a useful comparison between them can be effected through the mean.
Similarly, the sickness records of a group of people (reverse J shape) over several
years can be compared using the arithmetic mean, even though one would not use
this measure purely to summarise the data.

5.3.7 The Pre-eminence of the Arithmetic Mean


For each type of summary measure, several actual measures are available for use. In
the case of measures of location, the possibilities are arithmetic mean, median and
mode. Which measure should be used in which situation? What are the strengths
and weaknesses of each measure?
The choice between arithmetic mean, median and mode is often an easy one. The
reason is that the arithmetic mean is pre-eminent. It is easy to calculate and use and
it is widely understood and recognised. The arithmetic mean, therefore, is always
used… unless there are good reasons not to. Three examples of good reasons are
given below.

Distortion of the Mean by Outliers


The arithmetic mean is sensitive to large outliers. This was the case with the reverse
J-shaped distribution in Figure 5.3. Whenever large outliers are present, use of the
median should be considered.
Another example of this effect would be in calculating the average salary in a
small engineering firm where one finds:

Salary
(inc. bonuses)
1 Founder/managing director £60 000
4 Skilled workers £14 000
5 Unskilled workers £12 000
Arithmetic mean salary = £17 600

The average salary as calculated is totally unrepresentative of salaries earned in


the firm, purely because of the distorting presence of the high earnings of the
managing director. The median salary would be £13 000 and thus more representa-
tive.

Distortion of the Mean by Clusters


The arithmetic mean can be unrepresentative if the data splits into two or more
distinctly different clusters. The mean may well fall in the middle and be some
distance from any of the data.
Such is the case with the U-shaped distribution in Figure 5.2, which referred to
television viewing. People saw almost none or almost all the episodes. The mean
was calculated as coming between the two clusters at the extremes. In such cases the
mode may be more representative.

Error by Taking Averages of Averages


When an average of an average is taken, the arithmetic mean can be incorrect.
Consider the following case of two streams of 15-year-old pupils at a high school
and their average marks in a national examination.

Stream A: 50 pupils, average mark 74%


Stream B: 30 pupils, average mark 50%

What is the average mark for both streams together?


It is tempting to say:

Overall average = (74 + 50)/2
                = 62%

However, this figure is the average of two averages. To get the true average, one
has to go back to the first principles:

Overall average = (Total marks)/(Total number of pupils)
                = ((50 × 74) + (30 × 50))/(50 + 30)
                = (3700 + 1500)/80
                = 65%

The lesson is: when averaging averages where groups of different size are in-
volved, go back to the basic definition of the average.
Except where one of the difficulties described above applies, the arithmetic mean
is the first choice of measure of location.
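In code, the trap is avoided by weighting each group average by its group size. A minimal sketch (Python assumed), reusing the school example above:

sizes = [50, 30]   # pupils in Streams A and B
means = [74, 50]   # average mark of each stream (%)

naive = sum(means) / len(means)                               # 62, the misleading figure
true = sum(n * m for n, m in zip(sizes, means)) / sum(sizes)  # 5200/80 = 65
print(naive, true)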

5.4 Measures of Scatter


Measures of scatter do exactly as their name implies. They are a measure of the
extent to which the readings are grouped closely together or scattered over a wide
interval. They are also called measures of dispersion.

5.4.1 Range
The best-known and certainly the simplest measure of scatter is the range, which is
the total interval covered by the numbers.
Range = Largest reading − Smallest reading
For example, for the nine numbers 3, 3, 4, 5, 5, 6, 6, 6, 7:
Range = 7 − 3
=4

5.4.2 Interquartile Range


The range is defined entirely on the two extreme values, and this might be mislead-
ing if it were used as a general measure of scatter. To overcome this problem, the
interquartile range is the range of the numbers after having eliminated the highest
and lowest 25 per cent. This measure is no longer sensitive to extremes.
Interquartile range = Range of middle 50% of readings
For example, consider the following numbers: 3, 3, 4, 5, 5, 6, 6, 6, 7.
The top 25 per cent of readings (= approximately the top two numbers) and the
bottom 25 per cent (= approximately the bottom two numbers) are eliminated:

3, 3, 4, 5, 5, 6, 6, 6, 7 → remove the bottom two readings (3, 3) and the top two (6, 7), leaving 4, 5, 5, 6, 6

Interquartile range = 6 − 4
                    = 2

5.4.3 Mean Absolute Deviation


A measure that involves all the readings is the mean absolute deviation (MAD). It
is the average distance of the readings from their arithmetic mean:
MAD = Sum of the absolute deviations from the mean / Number of readings

Or, more mathematically:

MAD = ∑|x − x̄| / n

where:
x̄ is the arithmetic mean
n is the number of readings in the set

(The notation |x| (pronounced ‘the absolute value of x’) means the size of the
number disregarding its sign.)
For example, calculate the MAD of: 3, 3, 4, 5, 5, 6, 6, 6, 7.
From the previous work: x̄ = 5

x         3   3   4   5   5   6   6   6   7
x − x̄   −2  −2  −1   0   0   1   1   1   2
|x − x̄|  2   2   1   0   0   1   1   1   2
∑|x − x̄| = 2 + 2 + 1 + 0 + 0 + 1 + 1 + 1 + 2
          = 10

MAD = 10/9
    = 1.1
The concept of absolute value used in the MAD is to overcome the fact that
( − ) is sometimes positive, sometimes negative and sometimes zero. The
absolute value gets rid of the sign. Why is this necessary? Try the calculation without
taking absolute values and see what happens.

5.4.4 Variance
An alternative way of eliminating the sign of deviations from the mean is to square
them, since the square of any number is never negative. The variance is the average
squared distance of readings from the arithmetic mean:
Variance = Sum of the squared deviations from the mean / (Number of readings − 1)

Or, more mathematically:

Variance = ∑(x − x̄)² / (n − 1)
The n − 1 in the denominator seems surprising. One would expect it to be n so
that the variance is the ‘average squared deviation from the mean’. If the variance is
being calculated from the population (the set of all possible values of the variable),
then the denominator should indeed be n.
However, in virtually every practical situation, the calculation is being made from
a sample (a subset of the population). In general, samples are less diverse than the

population. We can see this intuitively because the one-in-a-million extreme outlier
that is present in the population will not usually be present in the sample. Extreme
outliers have a large impact on the variance since it is based on squared deviations.
Consequently, when the variance is calculated from a sample, it tends to underesti-
mate the true population variance.
Dividing by n − 1 instead of n increases the size of the calculated figure, and this
increase offsets the underestimate by just the right amount. ‘Just the right amount’
has the following meaning. Calculating the variance with n − 1 as the denominator
will give, on average, the best estimate of the population variance. That is, if we
were to repeat the calculation for many, many samples (in fact, an infinite number
of samples) and take the average, the result would be equal to the true population
variance. If we used n as the denominator this would not be the case. This can be
verified mathematically but goes beyond what a manager needs to know – consult a
specialist statistical text if you are interested.
Section 9.3 on ‘Degrees of Freedom’ in Module 9 gives an alternative and more
technical explanation.
Unless you are sure you are in the rare situation of dealing with the whole popu-
lation of a variable, you should use the n − 1 version of the formula. Calculators and
popular spreadsheet packages that have a function for calculating the variance
automatically nearly always use n − 1, although there may be exceptions.
Taking the usual example set of numbers, calculate the variance of 3, 3, 4, 5, 5, 6,
6, 6, 7 (Mean = 5).

x          3   3   4   5   5   6   6   6   7
x − x̄    −2  −2  −1   0   0   1   1   1   2
(x − x̄)²  4   4   1   0   0   1   1   1   4
∑(x − x̄)² = 4 + 4 + 1 + 0 + 0 + 1 + 1 + 1 + 4
           = 16

Variance = ∑(x − x̄)²/(n − 1)
         = 16/8
         = 2
The variance has many applications, particularly in financial theory. However, as
a pure description of scatter, it suffers from the disadvantage that it involves
squaring. The variance of the number of weeks of sickness of 20 employees is
measured in square weeks. However, it is customary to quote the variance in
ordinary units (e.g. in the above example the variance is said to be two weeks).

5.4.5 Standard Deviation


The standard deviation attempts to overcome this problem by taking the square
root of the variance. It also has important applications of its own, particularly in
connection with the normal distribution.
Standard deviation = √Variance

Or, in mathematical form:

Standard deviation = √(∑(x − x̄)² / (n − 1))

In the example given in Section 5.4.4 above, variance = 2; therefore:


Standard deviation = √2
= 1.4
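The five measures of scatter met so far can be computed together. A minimal sketch (Python assumed), applying the definitions above to the usual nine numbers, with the interquartile range implemented exactly as described (discard the top and bottom 25 per cent of the ordered readings):

readings = [3, 3, 4, 5, 5, 6, 6, 6, 7]
n = len(readings)
mean = sum(readings) / n                                     # 5

data_range = max(readings) - min(readings)                   # 7 - 3 = 4

k = round(n / 4)                                             # readings to drop at each end
middle = sorted(readings)[k:n - k]                           # [4, 5, 5, 6, 6]
iqr = max(middle) - min(middle)                              # 6 - 4 = 2

mad = sum(abs(x - mean) for x in readings) / n               # 10/9, i.e. 1.1

variance = sum((x - mean) ** 2 for x in readings) / (n - 1)  # 16/8 = 2
std_dev = variance ** 0.5                                    # 1.41, quoted as 1.4 above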

5.4.6 Calculating Measures of Dispersion


Just as for measures of location, calculating measures of dispersion from a spread-
sheet (or other statistical package) involves the easy task of highlighting the numbers
and applying one of the standard spreadsheet functions. In the absence of a
spreadsheet, the formula for the variance can be awkward to use because of the
need to calculate (x − x̄)² for each observation. The task is made easier by using a
short-cut formula that is exactly equivalent to the original – some simple algebra
transforms one into the other. The short-cut formula is:
Variance = (∑x² − n x̄²)/(n − 1)
The calculations for an earlier example would be as in Table 5.6.

Table 5.6 Short-cut formula

x       x²
3       9
3       9
4       16
5       25
5       25
6       36
6       36
6       36
7       49
Total   45      241

Mean = 45/9 = 5

Variance = (∑x² − n x̄²)/(n − 1)
         = (241 − 9 × 25)/8
         = 2
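The equivalence of the two formulae can be checked numerically. A short sketch (Python assumed):

xs = [3, 3, 4, 5, 5, 6, 6, 6, 7]
n = len(xs)
mean = sum(xs) / n

direct = sum((x - mean) ** 2 for x in xs) / (n - 1)            # original formula
shortcut = (sum(x * x for x in xs) - n * mean ** 2) / (n - 1)  # short-cut formula
print(direct, shortcut)                                        # both 2.0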

5.4.7 Comparing the Measures of Scatter


In the case of measures of location, the choice between measures is relatively easy.
The arithmetic mean is pre-eminent and it is chosen unless there is a good reason to
do otherwise. For measures of scatter, the relative merits of the measures are more
evenly balanced and the choice is more difficult. Table 5.7 shows the major
strengths and weaknesses of each measure.

Table 5.7 Comparison of measures of scatter


Measure Advantages Disadvantages
Range Easily understood Distorted by outliers
Familiar Descriptive measure only

Interquartile range Easily understood Not well known


Descriptive measure only

Mean absolute deviation Intuitively sensible Unfamiliar


Difficult to handle mathematically

Variance Easy to handle mathematically Wrong units


Used in other theories No intuitive appeal

Standard deviation Easy to handle mathematically Too involved for descriptive purposes
Used in other theories

All the measures have their particular uses. No single one stands out. When a
measure of scatter is required purely for descriptive purposes, the best measure is
probably the mean absolute deviation, although it is not as well known as it deserves
to be. When a measure of scatter is needed as part of some wider statistical or
mathematical theory, then the variance and standard deviation are frequently
encountered.
Further Example
A company’s 12 salespeople in a particular region last month drove the following
number of kilometres:

Salesperson Kilometres
(hundreds)
1 34
2 47
3 30
4 32
5 38
6 39
7 36
8 43

9 31
10 40
11 42
12 32

Calculate:
(a) range
(b) interquartile range
(c) MAD
(d) variance
(e) standard deviation.
Which measure is the most representative of the scatter in these data?
(a) Range = Highest – Lowest = 47 – 30 = 17
(b) Putting the numbers in ascending order:

30, 31, 32, 32, 34, 36, 38, 39, 40, 42, 43, 47
The lowest quartile (30, 31, 32) and the highest quartile (42, 43, 47) are removed, leaving 32, 34, 36, 38, 39, 40.
Interquartile range = 40 − 32 = 8
(c) To calculate MAD, it is first necessary to find the arithmetic mean:
Mean = 444/12
     = 37
Next calculate the deviations:

x        34  47  30  32  38  39  36  43  31  40  42  32
x − x̄   −3  10  −7  −5   1   2  −1   6  −6   3   5  −5
|x − x̄|  3  10   7   5   1   2   1   6   6   3   5   5
∑|x − x̄| = 54

MAD = ∑|x − x̄|/n
    = 54/12
    = 4.5
(d) The variance

x         34   47  30  32  38  39  36  43  31  40  42  32
x − x̄    −3   10  −7  −5   1   2  −1   6  −6   3   5  −5
(x − x̄)²  9  100  49  25   1   4   1  36  36   9  25  25
∑(x − x̄)² = 320

Variance = ∑(x − x̄)²/(n − 1)
         = 320/11
         = 29.1
(e) Standard deviation
= √Variance
= √29.1
= 5.4
The best descriptive measure of scatter in this situation is the mean absolute
deviation. The average difference between a salesperson’s travel and the average travel
is 4.5 (450 kilometres). This is a sensible measure that involves all data points. The range
is of great interest, but not as a measure of scatter. Its interest lies in indicating the
discrepancy between the most and least travelled. It says nothing about the ten in-
between salespeople. The interquartile range is probably the second choice. The
variance and standard deviation are probably too complex conceptually to be descrip-
tive measures in this situation, where further statistical analysis is not likely.

5.4.8 Coefficient of Variation


When there are differences in the means of two groups, a measure of scatter must
be ‘standardised’ before comparison of relative variation can be made. The coeffi-
cient of variation does this in the case of the standard deviation.
Coefficient of variation = Standard deviation / Mean

This is useful when sets of data with very different characteristics are being com-
pared. For example, suppose one is looking at the number of passengers per day
passing through two airports. Over a one-year period the average number of
passengers per day, the standard deviations and the coefficients of variation are
calculated.

Mean Standard deviation Coefficient of variation


Airport 1 4 200 1 050 0.25
Airport 2 15 600 2 250 0.14

Consideration of the standard deviations alone would suggest that there was more
scatter at Airport 2. In relation to the number of passengers using the two airports, the
scatter is smaller at Airport 2 as revealed by the coefficient of variation being 0.14 as
against 0.25 at Airport 1.
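The calculation itself is a one-liner once the mean and standard deviation are known. A sketch (Python assumed, using the airport figures above):

def coefficient_of_variation(std_dev, mean):
    return std_dev / mean

print(coefficient_of_variation(1050, 4200))    # Airport 1: 0.25
print(coefficient_of_variation(2250, 15600))   # Airport 2: 0.14 (to 2 decimal places)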

5.5 Other Summary Measures


Measures of location and scatter are the most useful and frequently seen summaries.
There are two other measures that are occasionally encountered.

5.5.1 Skew
Skew measures the extent to which a distribution is non-symmetrical. Figure 5.4(a)
is a distribution that is left-skewed; Figure 5.4(b) is a symmetrical distribution with
zero skew; Figure 5.4(c) is a distribution that is right-skewed.


Figure 5.4 Skew: (a) left skew; (b) zero skew; (c) right skew
The concept of skew is normally used purely descriptively and is assessed visually
(i.e. one looks at the distribution and assesses whether it is symmetrical or right- or
left-skewed). Skew can be measured quantitatively but the formula is complex and
the accuracy it gives (over and above a verbal description) is rarely necessary in
practice. However, the measurement of skew gives rise to the alternative labels
positively skewed (right-skewed) and negatively skewed (left-skewed).

5.5.2 Kurtosis
Kurtosis measures the extent to which a distribution is ‘pinched in’ or ‘filled out’.
Figure 5.5 shows three distributions displaying increasing levels of kurtosis. As with
skew, a qualitative approach is sufficient for most purposes (i.e. when one looks at
the distribution, one can describe it as having a low, medium or high level of
kurtosis). Kurtosis can also be measured quantitatively, but, again, the formula is
complex.


Figure 5.5 Kurtosis: (a) high; (b) medium; (c) low

5.6 Dealing with Outliers


The data from which summary measures are being calculated may well include one
or more outliers, which may have a disproportionate effect on the result. This is

particularly true of the variance and standard deviation, which use squared values.
How does one deal with the outliers? Are they to be included or excluded? Three
basic situations arise.
(a) Twyman’s Law. This only half-serious law states that any piece of data that
looks interesting or unusual is wrong. The first consideration when confronted
by an outlier is whether the number is incorrect, perhaps because of an error in
collection or a typing mistake. There is an outlier in these data, which are the
week’s overtime payments to a group of seven workers (in £s):
13.36, 17.20, 16.78, 15.98, 1432, 19.12, 15.37
Twyman’s Law suggests that the outlier showing a payment of 1432 occurs be-
cause of a dropped decimal point rather than a fraudulent claim. The error
should be corrected and the number retained in the set.
(b) Part of the pattern. An outlier may be a definite and regular part of the pattern
and should be neither changed nor excluded. Such was the case with the sickness
record data of Figure 5.3. The outliers were part of the pattern and similar ef-
fects were likely to be seen in other time periods and with other groups of
employees.
(c) Isolated events. Outliers occur that are not errors but that are unlikely to be
repeated (i.e. they are not part of the pattern). Usually they are excluded from
calculations of summary measures, but their exclusion is noted. For example, the
following data, recorded by trip wire, show the number of vehicles travelling
down a major London road during a ten-day period:
5271, 5960, 6322, 6011, 7132, 5907, 6829, 741, 7098, 6733
The outlier is the 741. Further checking shows that this day was the occasion of
a major royal event and that the road in question was closed to all except state
coaches, police vehicles, etc. This is an isolated event, perhaps not to be repeated
for several years. For traffic control purposes, the number should be excluded
from calculations since it is misleading, but a note should be made of the exclu-
sion. Hence, one would report:
Mean vehicles/day = 57 263/9
                  = 6363 (excluding day of Royal Event)
The procedure for outliers is first to look for mistakes and correct them; and
second, to decide whether the outlier is part of the pattern and should be includ-
ed in calculations or an isolated event that should be excluded.
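The traffic example translates directly into code: the isolated event is excluded from the calculation and the exclusion is noted. A sketch (Python assumed; the exclusion note is illustrative):

counts = [5271, 5960, 6322, 6011, 7132, 5907, 6829, 741, 7098, 6733]

kept = [c for c in counts if c != 741]    # isolated event excluded, not corrected
mean = round(sum(kept) / len(kept))       # 57263/9, i.e. 6363
print(mean, "(excluding day of Royal Event)")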

5.7 Indices
An index is a particular type of measure used for summarising the movement of a
variable over time. When a series of numbers is converted into indices, it makes the
numbers easier to understand and to compare with other series.
The best-known type of index is probably the cost of living index. The cost of
living comprises the cost of many different goods: foods, fuel, transport, etc.
Instead of using the miscellaneous and confusing prices of all these purchases, we

use an index number, which summarises them for us. If the index in 2014 is 182.1
compared with 165.3 in 2013, we can calculate that the cost of living has risen by:
(182.1 − 165.3)/165.3 × 100 = 10.2%

This is rather easier than having to cope with the range of individual price rises
involved.
Every index has a base year when the index was 100 (i.e. the starting point for
the series). If the base year for the cost of living index was 2008, then the cost of
living has risen 82.1 per cent between 2008 and 2014. This could be said in a
different way: the 2014 cost of living is 182.1 per cent of its 2008 value.
The index very quickly gives a feeling for what has happened to the cost of living.
Comparisons are also easier. For example, if the Wages and Salaries Index has 2008
as its base year and stood at 193.4 in 2014, then, over the six years, wages out-
stripped the cost of living: 93.4 per cent as against 82.1 per cent.
The cost of living index is based on a complicated calculation. However, there
are some more basic indices.

5.7.1 Simple Index


At the most primitive level, an index is just the result of the conversion of one series
of numbers into another, based on 100. Suppose the original numbers refer to the
average price of a new house in some region in each of 10 years and, with 2010
arbitrarily chosen as the base year, the original series and the index are as follows:

2005 2006 2007 2008 2009 2010 2011 2012 2013 2014
Price (£000s) 6.1 8.2 8.6 10.1 11.8 12.4 16.9 19.0 19.7 19.4
Index 49 66 69 81 95 100 136 153 159 156

The index for 2010, being the base year, is 100. The other data in the series are
scaled up accordingly. For instance, the index for 2007 is:
8.6 × 100/12.4 = 69

where 8.6 is the original datum and 12.4 is the original datum for the base year. And
for 2013:

19.7 × 100/12.4 = 159
The choice of the base year is important. It should be such that individual index
numbers during the time span being studied are never too far away from 100. As a
rule of thumb, the index numbers are not usually allowed to differ from 100 by
more than the factor of 3 (i.e. the numbers are in the range 30 to 300). If the base
year for the numbers in the series above had been chosen as 2005, then the index
series would have been from 100 to 318.
In long series, there might be more than one base year. For example, a series
covering more than 30 years from 1982 to 2014 might have 1982 as a base year with

the series then rising to 291 in 2001, which could then be taken as a second base
year:

Index (base 1982 = 100): 100 in 1982 → 291 in 2001
Index (rebased, 2001 = 100): 100 in 2001 → 213 in 2014

Care obviously needs to be taken in interpreting this series. The increase from
1982 to 2014 is not 113 per cent. If the original series had been allowed to continue,
the 2014 index would have been 213 × 2.91 = 620. The increase is thus 520 per
cent.
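Constructing a simple index is just a matter of scaling every observation by the base-year value. A sketch (Python assumed, using the house price series above):

prices = {2005: 6.1, 2006: 8.2, 2007: 8.6, 2008: 10.1, 2009: 11.8,
          2010: 12.4, 2011: 16.9, 2012: 19.0, 2013: 19.7, 2014: 19.4}

base = prices[2010]  # 2010 chosen as the base year
index = {year: round(p / base * 100) for year, p in prices.items()}
print(index[2007], index[2013])    # 69, 159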

5.7.2 Simple Aggregate Index


The usefulness of an index is emphasised when it is used to summarise several
factors. A monthly index for meat prices could not be based on monthly prices for,
say, beef alone. It would have to take into account prices of other types of meat. An
aggregate index does just this.
For example, the table below shows the price per kilogram at market for beef,
pork and lamb. A simple aggregate combines the different prices by adding them up
and forming an index from the total. For instance, the total price for January is
148 + 76 + 156 = 380. For February it is 150 + 80 + 167 = 397.

Price (p per kilogram)


Month Cattle Pigs Lambs
Jan. 148 76 156
Feb. 150 80 167
Mar. 155 75 180
Apr. 163 79 194
May 171 81 186
June 179 76 178
July 184 82 171
Aug. 176 76 168
Sept. 142 79 163
Oct. 146 84 160
Nov. 149 87 159
Dec. 154 94 162

If January is taken as the base, then the indices for the months are:

Month Total Index


Jan. 380 100
Feb. 397 104

Mar. 410 108


Apr. 436 115
May 438 115
June 433 114
July 437 115
Aug. 420 111
Sept. 384 101
Oct. 390 103
Nov. 395 104
Dec. 410 108

One possible disadvantage of this index is that livestock with a low price will
have much less influence than livestock with a high price. For instance, in February
a 20 per cent change in the price of cattle would change the index much more than
a 20 per cent change in the pig price:

February 150 + 80 + 167 = 397 Index = 104.5


With 20% cattle price change 180 + 80 + 167 = 427 Index = 112.4
With 20% pig price change 150 + 96 + 167 = 413 Index = 108.7

However, this may be a desirable feature. If the price level of each factor in the
index reflects its importance, then the higher-priced elements should have more
effect on the index. On the other hand, this feature may not be desirable. One way
to counter this is to construct a price relative index. This means that the individual
prices are first converted into an index and then these individual indices are aver-
aged to give the overall index.

5.7.3 Weighted Aggregate Index


Whether to use prices or price relatives in the index described above is really a
question of weighting. How much influence should the individual parts have,
relative to one another, on the overall index? This issue is of greatest importance in
a cost of living index. A simple aggregate index would not be a suitable method of
combining the prices of milk, bread, fruit, tobacco, electricity and so on. A
weighted aggregate index allows different weights to be given to the different
prices. What the weights should be is still a matter of judgement, but in the case of
price indices the quantities purchased are often used.
If we return to the livestock index, suppose the quantities purchased at the live-
stock markets from which the price data were obtained are as below (quantities in
thousand tonnes):

Cattle Pigs Lambs


Price Quantity Price Quantity Price Quantity
Jan. 148 1.6 76 0.25 156 0.12
Feb. 150 1.5 80 0.23 167 0.11
Mar. 155 1.4 75 0.25 180 0.10

Instead of adding them up, we first weight the prices by the quantities, and the
final index is formed from the resulting monthly total. The quantities used for the
weighting should be the same for each month, since this is a price index. Otherwise
the index would measure price and volume changes. If the quantities used for
weighting are the base month (January) quantities, then the index is known as a
Laspeyres Index and is calculated as follows:

Month Weighted total Index


Jan. (148 × 1.6) + (76 × 0.25) + (156 × 0.12) = 274.5 100
Feb. (150 × 1.6) + (80 × 0.25) + (167 × 0.12) = 280.0 280/274.5 × 100 = 102
Mar. (155 × 1.6) + (75 × 0.25) + (180 × 0.12) = 288.3 288.3/274.5 × 100 = 105
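The Laspeyres calculation keeps the base-period quantities fixed and reweights each period's prices with them; a Paasche index would instead pass each period's own most recent quantities as the weights. A sketch (Python assumed, reproducing the livestock figures above):

prices = {"Jan": (148, 76, 156), "Feb": (150, 80, 167), "Mar": (155, 75, 180)}
base_quantities = (1.6, 0.25, 0.12)    # January quantities (thousand tonnes)

def weighted_total(p, q):
    return sum(price * qty for price, qty in zip(p, q))

base_total = weighted_total(prices["Jan"], base_quantities)    # 274.5
for month, p in prices.items():
    index = weighted_total(p, base_quantities) / base_total * 100
    print(month, round(index))    # Jan 100, Feb 102, Mar 105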

A Laspeyres Index, like other indices, can be used for quantities as well as prices.
For a quantity index the role of price and quantity in the above example (of a price
index) would be reversed, with prices providing the weightings to measure changes
in quantities. The weights (prices) remain at the constant level of some base period
while the quantities change from time period to time period. For example, the UK
Index of Manufacturing Production shows how the level of production in the
country is changing as time goes by. The quantities refer to different types of
product – consumer goods, industrial equipment, etc. – and the prices are those of
the products in a base year. The use of prices as weights for quantities gives the
most expensive products a heavier weighting.
A major criticism of the Laspeyres Index is that the weights in the base year may
soon become out of date and no longer representative. An alternative is the
Paasche Index, which takes the weights from the most recent time period – the
weightings therefore change from each time period to the next. In the livestock
example a Paasche Index would weight the prices in each month with the quantities
relating to December, the most recent month. A Paasche Index always uses the
most up-to-date weightings, but it has the serious practical disadvantage that, if it is
to be purely a price index, every time new data arrive (and the weightings change)
the entire past series must also be revised.
A fixed weight index may also be used. Its weights are from neither the base
period nor the most recent period. They are from some intermediate period or from
the average of several periods. It is a matter of judgement to decide which weights
to use.
The cost of living index has already been introduced. It indicates how the cost of
a typical consumer’s lifestyle changes as time goes by, and it has many practical uses.
For example, it is usually the starting point in wage negotiations, since it shows how
big a wage increase is needed if an employee’s current standard of living is to be

maintained. A wage increase lower than the cost of living index would imply a
decrease in real wages.
What type of index should be used for the cost of living index? Table 5.8 and
Table 5.9 show the simplified example from Economics Module 12 (Tables 12.1 and
12.2).

Table 5.8 Consumer price index in base period t


Good/Service Quantity Price ($) Total expenditure ($)
Loaf of bread 10 loaves 1.00/loaf 10.00
Glass of wine 5 glasses 1.20/glass 6.00
Haircut 2 7.00 14.00
Total 30.00

Table 5.9 Consumer price index in period t + 1


Good/Service Quantity Price ($) Total expenditure ($)
Loaf of bread 10 loaves 1.10/loaf 11.00
Glass of wine 5 glasses 1.20/glass 6.00
Haircut 2 8.00 16.00
Total 33.00

Price index for time t + 1 = 33.00/30.00 × 100


= 110
Increase in price index = (110 – 100)/100
= 10%
This is a Laspeyres Index, but it is not enough for a cost of living index, which
must reflect, in addition to prices, the fact that people’s purchasing behaviour
changes in response to changes in prices, wage levels, technology, fashion and so
on. For example, the arrival into the market of good-quality wines from a wider
range of countries than in the past provides extra choice, possibly reduces prices
and therefore may lead to more wine being consumed. A change in fashion from
longer to shorter hair may result in greater spending on haircuts, possibly at the
expense of bread purchases. A cost of living index should reflect changes such as
these, since it is intended to indicate ‘typical’ living costs.
A Paasche Index should be used since it incorporates changes in both prices and
weights. In practice it is based on a lengthy and detailed list of expenditure catego-
ries and their relative weights. The list might look like Table 5.10.

Table 5.10 List of expenditures and their weights


Expenditure Weight Group weight
All items 100.00
Food and beverages
Food at home 12.87
Food not at home 6.34
Alcoholic beverages 1.77 20.98
Housing
Rent 3.66
Home ownership 23.54
Fuel 5.21
Furnishings 6.43 38.84
Clothing
Male 1.07
Female 1.87
Infant 0.34
Footwear 0.85 4.13
Transportation
Cars
Petrol
Public

… 18.72
Medical 2.11
Entertainment 3.22
Other 12.00

The weights are multiplied by the prices to give the index, as in the livestock
example. The weights are changed regularly as a result of government surveys of
expenditure patterns. It is important to note that, for the cost of living index,
previous values of the index are not changed as the weights change (i.e. the index
remains as it was when first calculated). Consequently, in comparing the cost of
living now with that of 20 years ago, a cost of living index reflects changes in
purchasing behaviour as well as the inevitably increasing prices.
A price index such as the cost of living index can be used to deflate economic
data. For example, the measure of a country’s economic activity is its GNP (gross
national product), the total value of the goods and services an economy produces in
a year. The GNP is measured from statistical returns made to the government and is
calculated in current prices (i.e. it is measured in terms of the prices for the goods
and services that apply in that year). Consequently, the GNP can rise from one year
to another because prices have risen through inflation, even though actual economic
activity has decreased. It would be helpful to neutralise the effect of prices in order

to have a more realistic measure of economic activity. This is done by deflating the
series so that the GNP is in real, not current, terms. The deflation is carried out by
using a price index. Table 5.11 shows the GNP of a fictional country for 2008–14,
together with a price index for those years. Current GNP is converted to real GNP
by dividing by the price index.

Table 5.11 Deflating GNP


Year GNP(current) Price index GNP(real)
2008 320 100 320
2009 335 104 322
2010 347 107 324
2011 358 110 325
2012 376 113 333
2013 389 116 335
2014 403 120 336

For example, for 2012:


GNP(real) = GNP(current)/Price index × 100
= (376/113) × 100
= 333

The changes in GNP(real) over the years 2008–14 show that economic activity
did increase in each of those years but not by as much as GNP(current) suggested.
It is important that an appropriate price index is used. It would not have been
appropriate to use the cost of living index for GNP since that index deals only with
consumer expenditure. A price index that incorporates the prices used in the GNP –
that is, investment and government goods as well as consumer goods – should be
used.
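Deflating a series is a simple division by the price index, year by year. A sketch (Python assumed, reproducing Table 5.11):

gnp_current = {2008: 320, 2009: 335, 2010: 347, 2011: 358,
               2012: 376, 2013: 389, 2014: 403}
price_index = {2008: 100, 2009: 104, 2010: 107, 2011: 110,
               2012: 113, 2013: 116, 2014: 120}

gnp_real = {y: round(gnp_current[y] / price_index[y] * 100) for y in gnp_current}
print(gnp_real[2012])    # 333, down from a current-price figure of 376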

Learning Summary
In the process of analysing data, at some stage the analyst tries to form a model of
the data, as suggested previously. ‘Pattern’ or ‘summary’ are close synonyms for
‘model’. The model may be simple (all rows are approximately equal) or complex
(the data are related via a multiple regression model). Often specifying the model
requires intuition and imagination. At the very least, summary measures can provide
a model based on specifying for the data set:
(a) number of readings;
(b) a measure of location;
(c) a measure of scatter;
(d) the shape of the distribution.
In the absence of other inspiration, these four attributes provide a useful model
of a set of numbers. If the data consist of two or more distinct sets (as, for example,
a table), then this basic model can be applied to each. This will give a means of

comparison between the rows or columns of the table or between one time period
and another.
The first attribute (number of readings) is easily supplied. Measures of location
and scatter have already been discussed. The shape of the distribution can be found
by drawing a histogram and literally describing its shape (as with the symmetrical, U
and reverse-J distributions seen earlier). A short verbal statement about the shape is
often an important factor in summarising or forming a model of a set of data.
Verbal statements have a more general role in summarising data. They should be
short, no more than one sentence, and used only when they can add to the sum-
mary. They are used in two ways: first, they are used when the quantitative measures
are inadequate; second, they are used to point out important features in the data.
For example, a table of a company’s profits over several years might indicate that
profits had doubled. Or a table of the last two months’ car production figures might
have a note stating that 1500 cars were lost because of a strike.
It is important, in using verbal summaries, to distinguish between helpful state-
ments pointing out major features and unhelpful statements dealing with trivial
exceptions and details. A verbal summary should always contribute to the objective
of adding to the ease and speed with which the data can be handled.

Review Questions
5.1 Which of the following statements about summary measures are true?
A. They give greater accuracy than the original data.
B. It is easier to handle information in summary form.
C. They are never misleading.
D. Measures of location and scatter together capture all the main features of data.
Questions 5.2 to 5.4 refer to the following data:
1, 5, 4, 2, 7, 1, 0, 8, 6, 6, 5, 2, 4, 5, 3, 5

5.2 The arithmetic mean is:


A. 4
B. 5.5
C. 4.25
D. 4.5
E. 5

5.3 The median is:


A. 4
B. 5.5
C. 4.25
D. 4.5
E. 5

5.4 The mode is:


A. 4
B. 5.5
C. 4.25
D. 4.5
E. 5

5.5 Which of the following applies? As a measure of location, the arithmetic mean:
A. is always better than the median and mode.
B. is usually a misleading measure.
C. is preferable to mode and median except when all three are approximately
equal.
D. should be used when the data distribution is U shaped.
E. None of the statements applies.

5.6 An aircraft’s route requires it to fly along the sides of a 200 km square (see figure
below). Because of prevailing conditions, the aircraft flies from A to B at 200 km/h, from
B to C at 300 km/h, from C to D at 400 km/h and from D to A at 600 km/h. What is
the average speed for the entire journey from A to A?

[Diagram: a square route with corners A (top left), B (top right), C (bottom right) and D (bottom left); each side is 200 km.]
A. 325 km/h
B. 320 km/h
C. 375 km/h
D. 350 km/h

5.7 Which of the following statements about measures of scatter are true?
A. Measures of scatter must always be used when measures of location are used.
B. A measure of scatter is an alternative to a measure of location as a summary.
C. One would expect a measure of scatter to be low when readings are close
together, and high when they are further apart.
D. A measure of scatter should be used in conjunction with a measure of disper-
sion.

Questions 5.8 to 5.12 refer to the following data:


23, 27, 21, 25, 26, 22, 29, 24, 27, 26

5.8 The range is:


A. 7
B. 8
C. 9
D. 10

5.9 The interquartile range is:


A. 2
B. 4
C. 8
D. 5

5.10 The mean absolute deviation is:


A. 3
B. 2
C. 1.8
D. 2.2

5.11 The variance is:


A. 5.6
B. 6.2
C. 5.2
D. 6.5

5.12 The standard deviation is:


A. 6.2
B. 5.6
C. 2.4
D. 2.5

5.13 Which of the following statements is true?


A. No measure of scatter is pre-eminent, and therefore it does not matter which
is used for a particular set of data.
B. Interquartile range is not distorted by extreme readings.
C. Variance and standard deviation measure scatter but in squared units.
D. Mean absolute deviation is preferable to interquartile range because it is easier
to handle mathematically.

5.14 Which of the following statements are true regarding the presence of an outlier in a
data set of which summary measures are to be calculated?
A. An outlier should either be retained in or excluded from the set.
B. An outlier that is part of the pattern of the data should always be used in
calculating the arithmetic mean.
C. An outlier that is not part of the pattern should usually be excluded from any
calculations.

5.15 Two governmental indices have the following values:

2011 2012 2013 2014


Cost of living 100 109.7 118.3 128.6
Wages, salaries 100 111.2 123.1 133.5

Which of the following is correct? Between 2013 and 2014, the growth of the cost of
living compared to wages and salaries was:
A. much greater.
B. slightly greater.
C. equal.
D. slightly less.
E. much less.

Case Study 5.1: Light Bulb Testing


1 Samples of light bulbs were obtained from two suppliers and tested to destruction in
the laboratory. The following results were obtained for the length of life:

Length of life (hours)


700–899 900–1099 1100–1299 1300–1499 Total
Supplier A 12 14 24 10 60
Supplier B 4 34 19 3 60

a. Which supplier’s bulbs have greater average length of life?


b. Which supplier’s bulbs are more uniform in quality?
c. Which supplier would you prefer to use?

Case Study 5.2: Smith’s Expense Account


1 Mr Smith’s expense account for six trips made in the past month is given in Table 5.12.

Table 5.12 Mr Smith’s expense account


Trip Days Expenses (£) Expenses per day (£)
1 2 64 32
2 8 128 16
3 0.5 40 80
4 9 108 12
5 4 80 20
6 0.5 25 50
Totals 24 445 210

Mr Smith’s boss felt that these expenses were excessive because, he said, the average
expense per day was £35. Other salespeople away on weekly trips submitted expenses
that averaged at around £20 a day. Where did the £35 come from? How can Smith
argue his case?

Case Study 5.3: Monthly Employment Statistics


1 The statistics for monthly employment in four different departments of Company X are
as follows:

Department
A B C D
Mean monthly employment level 10 560 4891 220 428
Standard deviation 606 302 18 32

Is the monthly employment level more stable in some departments than in others?

Case Study 5.4: Commuting Distances


1 The distribution of the distance between home and place of work for workers in the
London region was found in a survey to be:

Distance (to nearest mile)


0 1 2 3 4–5 6–7 8–9 10–11 12–14 15–19 20–29 30+
% workers 3 7 9 10 16 12 11 8 8 4 6 6

The mean distance from work was found to be 10.5 miles. Calculate the mode, the
median and two measures of scatter. How would you summarise these data succinctly?

Case Study 5.5: Petroleum Products


1 The following table shows the wholesale prices and production totals of three
petroleum products, prices being given in pence per gallon and production figures in
millions of barrels.

Prices Quantities
2012 2013 2014 2012 2013 2014
Car petrol 26.2 27.1 27.4 746 768 811
Kerosene 24.8 28.9 26.5 92 90 101
Paraffin 23.0 24.1 24.8 314 325 348
a. Use 2012 = 100 to construct a simple aggregate index for the years 2012 to 2014
for the prices of the three petroleum products.
b. Use 2012 quantities as weights and 2012 = 100 to construct a weighted aggregate
index for the years 2012 to 2014 for the prices of the three petroleum products.
c. Does it matter which index is used? If so, which one should be used?
d. How else could our index be constructed?



Module 6

Sampling Methods
Contents
6.1 Introduction.............................................................................................6/1
6.2 Applications of Sampling .......................................................................6/3
6.3 The Ideas behind Sampling ....................................................................6/3
6.4 Random Sampling Methods ...................................................................6/4
6.5 Judgement Sampling ........................................................................... 6/10
6.6 The Accuracy of Samples ................................................................... 6/12
6.7 Typical Difficulties in Sampling .......................................................... 6/13
6.8 What Sample Size? .............................................................................. 6/15
Learning Summary ......................................................................................... 6/16
Review Questions ........................................................................................... 6/18
Case Study 6.1: Business School Alumni ..................................................... 6/20
Case Study 6.2: Clearing Bank ...................................................................... 6/20

Prerequisite reading: None

Learning Objectives
By the end of this module the reader should know the main principles underlying
sampling methods. Most managers have to deal with sampling in some way. It may
be directly in commissioning a sampling survey, or it may be indirectly in making
use of information based on sampling. For both purposes it is necessary to know
something of the techniques and, more importantly, the factors critical to their
success.

6.1 Introduction
Statistical information in management is usually obtained from samples. The
complete set of all conceivable observations of a variable is a population; a subset
of a population is a sample. It is rarely possible to study a population. Nor is it
desirable, since sample information is much less expensive yet proves sufficient to
take decisions, solve problems and answer questions in most situations. For
example, to know what users of soap powder in the UK think about one brand of
soap powder, it is hardly possible to ask each one of the 15 million of them his or
her opinion, nor would the expense of such an exercise be warranted. A sample
would be taken. A few hundred would be interviewed and from their answers an
estimate of what the full 15 million are thinking would be made. The 15 million are
the population, the few hundred are the sample. A population does not have to refer

to people. In sampling agricultural crops for disease, the population might be 50 000
hectares of wheat.
In practice, information is nearly always collected from samples as opposed to
populations, for a wide variety of reasons.
(a) Economic advantages. Collecting information is expensive. Preparing ques-
tionnaires, paying postage, travelling to interviews and analysing the data are just
a few examples of the costs. Taking a sample is cheaper than observing the
whole population.
(b) Timeliness. Collecting information from a whole population can be slow,
especially waiting for the last few questionnaires to be filled in or for appoint-
ments with the last few interviewees to be granted. Sample information can be
obtained more quickly, and sometimes this is vital, for example, in electoral opin-
ion polls when voters’ intentions may swing significantly as voting day
approaches.
(c) Size and accessibility. Some populations are so large that information could
not be collected from the whole population. For example, a marketing study
might be directed at all teenagers in a country and it would be impossible to
approach them all. Even in smaller populations there may be parts that cannot
be reached. For example, surveys of small businesses are complicated by there
being no up-to-date lists because small businesses are coming into and going out
of existence all the time.
(d) Observation and destruction. Recording data can destroy the item being
observed. For example, in a quality test on electrical fuses, the test ruins the fuse.
It would not make much sense to destroy all fuses immediately after production
just to see whether they worked. Sampling is the only possible approach to some
situations.
There are two distinct areas of theory associated with sampling, illustrated in
Figure 6.1. In the example of market research into soap powder above, the first area
of theory would show how to choose the few hundred interviewees; the second
would show how to use the sample information to draw conclusions about the
population.

[Figure 6.1 (diagram): there is a need for statistical information about a population → a sample is collected → the sample results are used to make estimates about the population. First theory area: how to choose a sample. Second theory area: how to make inferences about the population from the sample.]
Figure 6.1 Areas of sampling theory

Only the first area, methods of choosing samples, is the topic here. The methods
will be described, some applications will be illustrated and technical aspects will be
introduced. The second area, making inferences, is the subject of Module 8.

6.2 Applications of Sampling


Examples of the use of sampling methods are common, so much so that the skill
required in choosing the samples may not be appreciated.
(a) Opinion polls. Newspapers regularly feature the results of opinion polls, which
typically are concerned with issues such as: ‘For which party would you vote if
there were a national election tomorrow?’ and ‘Are you satisfied with the gov-
ernment’s handling of the economy?’ The population to which such questions
are addressed is the entire electorate of a country. For the UK the population
would comprise approximately 47 million people. Surveying the population
would be expensive, time-consuming and practically impossible. Opinion poll
companies base their results on questioning a sample, usually of 1000–2000 peo-
ple in size, of the electorate. The poll is therefore much cheaper and can be
obtained sufficiently quickly to be up to date when published. As will be shown
later, the results are only slightly less accurate than a population survey.
(b) Quality control. A large food processing company receives many tonnes of raw
food materials every day. It would be very expensive, if not impossible, to in-
spect all the incoming supplies. Instead, samples are taken. When a consignment
arrives, a sample from it is inspected. On the basis of the results, the whole con-
signment may be rejected. The rules that govern the acceptance or rejection are
based on statistical theory.
(c) Checking invoices. Large organisations issue hundreds of thousands of
invoices in the course of a financial year. When checking their accuracy, either
for reasons of control or in auditing, they cannot all be investigated. A sample is
taken, and from the results estimates of the overall errors in invoicing are made.

6.3 The Ideas behind Sampling


In these three examples, the methods by which the samples are selected have to be
carefully thought out. In order to draw correct conclusions about the population, a
sample should be representative of it (i.e. the sample should be a microcosm of the
population). No method of collecting samples exists that will guarantee to produce
representative samples every time. All methods will occasionally produce an
unrepresentative sample, but all methods have representativeness as their aim.
However, if samples are chosen in such a way that they are consistently not repre-
sentative, there is said to be sampling bias.
The fundamental method of achieving representativeness is simple random
sampling, in which the sample is chosen from the population in such a way that
each element of the population has an equal chance of being chosen. Imagine that
each element is numbered, the numbers are written on labels that are placed in a
(possibly large) hat and a sample is picked out by a blindfolded person. Simple
random sampling is equivalent to this procedure. The selection of the sample is left
to chance and one would suppose that, on average (but not always), the sample
would be reasonably representative.
The major disadvantage of simple random sampling is that it can be expensive. If
the question to be answered is the likely result of a UK general election, then the
sampler must find some means of listing the whole electorate, choose a sample at
random and then visit the voters chosen (perhaps one in Inverness, one in Pen-
zance, two in Norwich, one in Blackpool, etc.). Both selection and questioning
would be costly.
Variations on simple random sampling can be used to overcome this problem.
For example, multi-stage sampling in the opinion poll example above would
permit the sample to be collected in just a few areas of the country, cutting down on
the travelling and interviewing expenses.
The variations on simple random sampling also make it possible to use other
information to make the sample more representative. In the soap powder example a
stratified sample would let the sampler make use of the fact that known percent-
ages of households use automatic machines, semi-automatic machines, manual
methods and launderettes.
Some situations do not allow or do not need any randomisation in sampling.
(How would a hospital patient feel about a ‘random’ sample of blood being taken
from his body?) Judgement sampling refers to all methods that are not essentially
random in character and in which personal judgement plays a large role. They can
be representative in many circumstances.
Figure 6.2 shows diagrammatically the main sampling methods and how they are
linked. In the next sections these methods will be described in more detail.

6.4 Random Sampling Methods


6.4.1 Simple Random Sampling
Simple random sampling is a procedure of sampling in which each member of the
population has an equal chance of being selected. Numbering the elements of the
population and drawing the numbers from a hat is simple random sampling. The
method does not rely on the ownership of a large hat. More usually, it works with
the help of a computer or a random number table.
Table 6.1 shows 28 two-digit random numbers. They were generated by a comput-
er program, which makes each selection in such a way that each number from 00 to
99 has an equal chance of being chosen. Alternatively, they could have been taken
from a random number table (which is generated in the same way and can be found in
most sets of statistical tables).

[Figure 6.2 (diagram): all sampling methods divide into random sampling and judgement sampling. Random sampling: simple random, plus variations on simple random (multi-stage, cluster, stratified, weighting, probability, variable, area). Judgement sampling: systematic, convenience, quota.]
Figure 6.2 Relationship between the main sampling methods

Table 6.1 Two-digit random numbers


31 03 62 98 05 88 69
07 99 26 31 17 74 47
85 53 22 14 60 44 93
58 42 75 72 61 55 02

Suppose a simple random sample of five is to be selected from the 90-strong
workforce of a factory. The five are to be asked to take part in a detailed survey of
attitudes to work.
From payroll records a list of the workforce is made and each person given a
number between 1 and 90, as shown in Table 6.2.

Table 6.2 Numbered list of workforce


1. Appleyard, R. 31. Lester, E. 61. Sharpe, P.
2. Bainbridge, B. 32. Leyland, M. 62. Sutcliffe, H.
3. Binks, J. … …
4. Bowes, W. … …
5. Close, B. … …
6. Cowan, R. … …
… … …
… … 88. Wardle, J.
29. Hutton, L. … 89. Wilson, D.
30. Illingworth, R. 60. Ryan, M. 90. Yardley, N.

To choose the sample, the random numbers are taken one at a time from Ta-
ble 6.1 and each associated with the corresponding number and name from
Table 6.2. The first number is 31, so the corresponding employee is Lester, E. The
second number is 03, so the employee is Binks, J. This is continued until the
necessary sample is collected. The third number is 62, and thus the name is Sutcliffe,
H. The fourth number is 98, which does not correspond to any name and is
ignored. Table 6.3 shows the sample of five.

Table 6.3 Sample of five from workforce


Number Name
31 Lester, E.
03 Binks, J.
62 Sutcliffe, H.
05 Close, B.
88 Wardle, J.
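
The same procedure is easy to reproduce in code. The short Python sketch below is illustrative only and is not part of the course text: the fixed seed and the shortened name list are assumptions made for the example.

    import random

    # Hypothetical extract of the numbered workforce list of Table 6.2;
    # the remaining names are omitted here for brevity.
    workforce = {3: 'Binks, J.', 5: 'Close, B.', 31: 'Lester, E.',
                 62: 'Sutcliffe, H.', 88: 'Wardle, J.'}

    random.seed(0)                               # fixed seed so the draw repeats
    sample_ids = random.sample(range(1, 91), 5)  # 5 distinct numbers from 1 to 90
    print(sample_ids)
    print([workforce.get(i, '(name not shown above)') for i in sample_ids])

Because random.sample draws without replacement from the full range 1 to 90, no number can fall outside the list or be chosen twice, so the manual steps of discarding out-of-range or repeated random numbers are handled automatically.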

The drawbacks of simple random sampling are that the listing of the population
can prove very expensive, or even impossible, and that the collection of data from
the sample can also be expensive, as with opinion polls. Variations on simple
random sampling have been developed to try to overcome these problems. These
methods still have a sizeable element of random selection in them and the random
part uses a procedure such as that described above.

6.4.2 Variations on Simple Random Sampling

Multi-Stage
In multi-stage sampling, the population is split into groups and each group is split
into subgroups, each subgroup into subsubgroups, etc. A random sample is taken at
each stage of the breakdown. First a simple random sample of the groups is taken;
of the groups chosen a simple random sample of their subgroups is taken and so on.
For example, suppose the sample required is of 2000 members of the workforce of
a large company, which has 250 000 employees situated in offices and factories over
a large geographical area. The objective is to investigate absenteeism. The company
has 15 regional divisions; each division has an average of ten locations at which an
office or factory is situated. Each location has its own computerised payroll. Since
the company operates on a decentralised basis, no company-wide list of employees
exists. Figure 6.3 illustrates how the population is split.

[Figure 6.3 (diagram): Company → 15 geographical divisions → about ten locations in each division → about 1700 employees at each location.]
Figure 6.3 Population split


If sampling were to be done by simple random sampling, two difficulties would
be encountered. First, a company-wide list of employees would need to be compiled
– no easy task with payrolls on different computer systems. Second, it is likely that
to collect the necessary information from personnel records would require visits to
virtually all the divisions and locations of the company, which would be time-
consuming and expensive.
Multi-stage sampling overcomes these problems. First, a single random sample is
taken of four regional divisions from the list of 15 using the method of simple
random sampling described above. For each of the four divisions chosen, a list of its
locations (on average ten per division) is made. A simple random sample of two of
the locations is taken. Four divisions and two locations per division mean a total of
eight locations in all. At each of these eight locations a simple random sample of
250 employees is taken from the already available computerised payroll listing. The
final sample is then 250 employees from each of two locations in each of four
divisions, making a total of 2000 as required. However, there have been no prob-
lems of listing elements of a population and the expense of collecting data has been
reduced by restricting work to eight locations instead of 150.
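
A minimal Python sketch of the three stages follows. It is illustrative only: the division, location and employee identifiers are invented, and it assumes exactly ten locations per division and payrolls of exactly 1700.

    import random

    random.seed(1)
    divisions = list(range(1, 16))                  # the 15 regional divisions
    chosen_divisions = random.sample(divisions, 4)  # stage 1: pick 4 divisions

    sample = []
    for d in chosen_divisions:
        locations = ['D%d-L%d' % (d, j) for j in range(1, 11)]
        for loc in random.sample(locations, 2):     # stage 2: 2 locations each
            payroll = ['%s-emp%d' % (loc, e) for e in range(1, 1701)]
            sample += random.sample(payroll, 250)   # stage 3: 250 employees

    print(len(sample))                              # 4 x 2 x 250 = 2000

Only eight payroll lists ever need to be read, which is precisely the economy the method offers.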
There are no statistical rules for deciding how big a sample should be chosen at
each stage. Common sense and knowledge of the circumstances are usually suffi-
cient. For instance, choosing just one division out of 15 could lead to an
unrepresentative sample because of a complete bias to just one geographical area;
choosing eight divisions out of 15 means that cost reduction advantages may be lost.
Choosing four is no more than a reasonable compromise.

Cluster Sampling
Cluster sampling is closely linked with multi-stage sampling in that the population
is divided into groups, the groups into subgroups and so on. The groups are
sampled, the subgroups sampled, etc. The difference is that, at the final stage, each
individual of the chosen groups is included in the final sample. Pictorially, the final
sample would look like a series of clusters drawn from the population.

For example, suppose the company described in the previous section on multi-
stage sampling is slightly different. The locations are not offices and factories but
small retail establishments at which an average of six staff are employed. The four
divisions and two locations per division could be sampled as before, but, because
there are only six staff, further samples at each location would not be taken. All six
would be included in the sample, carrying the further advantage of ensuring that all
grades of employee are in the sample. In this case the total sample size is 4 × 2 × 6
= 48. This is two-stage cluster sampling, because two stages of sampling, of
divisions and locations, are involved. If all locations, ignoring divisions, had been
listed and sampled, the process would involve only one sampling stage and would be
called ordinary cluster sampling. If a total sample larger than 48 were required
then more divisions or more than two locations per division would have to be
selected.

Stratified Sampling
In stratified sampling prior knowledge of a population is used to make the sample
more representative. If the population can be divided into subpopulations (or strata)
of known size and distinct characteristics then a simple random sample is taken of
each subpopulation such that there is the same proportion of subpopulation
members in the sample as in the whole population. If, for example, 20 per cent of a
population forms a particular stratum, then 20 per cent of the sample will be of that
stratum. The sample is therefore constrained to be representative at least as regards
the occurrence of the different strata. Note the different role played by the strata
compared to the role of the groups in multi-stage and cluster sampling. In the
former case, all strata are represented in the final sample; in the latter, only a few of
the groups are represented in the final sample.
For an example of stratified sampling, let’s return to the previous situation of
taking a sample of 2000 from the workforce of a large company. Suppose the
workforce comprises 9 per cent management staff, 34 per cent clerical, 21 per cent
skilled manual and 36 per cent unskilled manual. It is desirable that these should all
be represented in the final sample and in these proportions. Stratified sampling
involves first taking a random sample of 180 management staff (9 per cent of 2000),
then 680 clerical, then 420 skilled and finally 720 unskilled. This does not preclude
the use of multi-stage or cluster sampling. In multi-stage sampling the divisions and
locations would be selected first, then samples taken from the subpopulations at
each location (i.e. take 22 management staff – 9 per cent of 250 – at location one, 22
at location two and so on for the eight locations).
The final sample is then structured in the same way as the population as far as
staff grades are concerned. Note that stratification is only worth doing if the strata
are likely to differ in regard to the measurements being made in the sample. Other-
wise the sample is not being made more representative. For instance, absenteeism
results are likely to be different among managers, clerks, skilled workers and
unskilled workers. If the population had been stratified according to, say, colour of
eyes, it is unlikely that absenteeism would differ from stratum to stratum, and
stratified sampling would be inapplicable.
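
Allocating the strata sample sizes is simple arithmetic. A minimal Python sketch, using the percentages from the example above:

    strata = {'management': 0.09, 'clerical': 0.34,
              'skilled': 0.21, 'unskilled': 0.36}
    total = 2000
    sizes = {s: round(p * total) for s, p in strata.items()}
    print(sizes)   # management 180, clerical 680, skilled 420, unskilled 720

    # A simple random sample of the computed size is then drawn within
    # each stratum, using the procedure of Section 6.4.1.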

Weighting
Weighting is a method of recognising the existence of strata in the population after
the sampling has been carried out instead of before, as with stratification. Weighting
means that the measurements made on individual elements of the sample are
weighted so that the net effect is as if the proportions of each stratum in the sample
had been the same as those in the population. In the absenteeism investigation,
suppose that at each location the computerised payroll records did not indicate to
which staff grade those selected for the sample belonged. Only when the personnel
records were examined could this be known. Stratification before sampling is
therefore impossible. Or, at least, it would be extremely expensive to amend the
computer records. The sample of 2000 must be collected first. Suppose the strata
proportions are as in Table 6.4.

Table 6.4 Strata proportions


Stratum Population Sample Weighting
Management 9% 12% 9/12
Clerical 34% 39% 34/39
Skilled 21% 15% 21/15
Unskilled 36% 34% 36/34

The table shows the weighting that should be given to the elements of the sam-
ple. The weighting allocated to each stratum means that the influence each stratum
has on the results is the same as its proportion in the population. If the measure-
ments being made are of days absent for each member of the sample, then these
measurements are multiplied by the appropriate weighting before calculating average
days absent for the whole sample.
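
A sketch of the weighting calculation in Python; the average days absent observed within each stratum are invented figures, used only to show the mechanics:

    population = {'management': 0.09, 'clerical': 0.34,
                  'skilled': 0.21, 'unskilled': 0.36}   # Table 6.4, population
    in_sample = {'management': 0.12, 'clerical': 0.39,
                 'skilled': 0.15, 'unskilled': 0.34}    # Table 6.4, sample
    days_absent = {'management': 3.0, 'clerical': 5.0,
                   'skilled': 7.0, 'unskilled': 9.0}    # hypothetical results

    # Weighting each stratum by population share / sample share makes its
    # influence on the result equal to its share of the population.
    weights = {s: population[s] / in_sample[s] for s in population}
    mean = sum(in_sample[s] * weights[s] * days_absent[s] for s in population)
    print(round(mean, 2))   # weighted average days absent for the sample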

Probability Sampling
In simple random sampling each element of the population has an equal chance of
being selected for the sample. There are circumstances when it is desirable for
elements to have differing chances of being selected. Such sampling is called
probability sampling.
For example, when a survey is carried out into the quality and nutritional value of
school meals, a random sample of schools will have to be taken and their menus
inspected. If every school has an equal chance of being chosen, children at large
schools will be under-represented in the sample. The probability of choosing menus
from small schools is greater when a sample of schools is taken than when a sample
of schoolchildren is taken. This may be important if there is a variation in meals
between large and small establishments.
The issue hinges on what is being sampled. If it is menus, then school size is
irrelevant; if it is children subjected to different types of menu, then school size is
relevant. Probability sampling would give schools different chances of being chosen,
proportional to the size of the school (measured in terms of the number of children

Quantitative Methods Edinburgh Business School 6/9


Module 6 / Sampling Methods

attending). In the final sample, children from every size of school would have an
equal chance that their menu was selected.
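
A sketch of selection with probability proportional to size (Python; the school names and pupil counts are invented for illustration):

    import random

    random.seed(7)
    schools = {'A': 1200, 'B': 300, 'C': 800, 'D': 150, 'E': 550}

    # random.choices samples with replacement, weighting each school by its
    # number of pupils, so larger schools are chosen proportionately more often.
    chosen = random.choices(list(schools), weights=list(schools.values()), k=3)
    print(chosen)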

Variable Sampling
In variable sampling some special subpopulation is over-sampled (i.e. deliberately
over-represented). This is done when the subpopulation is of great importance and
a normal representation in the sample would be too small for accurate information
to be gleaned.
Suppose the project is to investigate the general health levels of children who
have suffered from measles. The population is all children, of specified ages, who
have had measles. A small subpopulation (perhaps 1 per cent) is formed of children
who have suffered permanent brain damage as part of their illness. Even a large
sample of 500 would contain only five such children. Just five children would be
insufficient to assess the very serious and greatly variable effects of brain damage.
Yet this is a most important part of the survey. More of such children would
purposely be included in the sample than was warranted by the subpopulation size.
If calculations of, for instance, IQ measurements were to be made from the sample,
weighting would be used to restore the sample to representativeness.

Area Sampling
Sometimes very little may be known about the population. Simple random sampling
may be difficult and expensive, but, at the same time, lack of knowledge may
prevent a variation on simple random sampling being employed. Area sampling is
an artificial breaking down of the population to make sampling easier. The popula-
tion is split into geographical areas. A sample of the areas is taken and then further
sampling is done in the few areas selected.
The survey of users’ attitudes to a particular soap powder is such a case. A listing
of the whole population of users is impossible, yet not enough is known to be able
to use multi-stage or cluster sampling. The country/district where the sampling is to
be conducted is split into geographical areas. A sample of the areas is selected at
random and then a further sample is taken in the areas chosen, perhaps by listing
households or using the electoral roll. The difficult operation of listing the popula-
tion is reduced to just a few areas.

6.5 Judgement Sampling


Judgement sampling methods involve a significant degree of personal judgement.
They may be used when random methods are impossible (e.g. when taking a blood
sample). In other situations random sampling may be extremely expensive and a
non-random method is a cheaper alternative.
The major judgement sampling methods are as follows.
(a) Systematic sampling. In systematic sampling the sample is taken at regular
intervals from the population. If the population consisted of 50 000 and a sample
of 1000 were required, then every fiftieth item on a list of the population would
be taken. This would constitute a systematic sample. To avoid always taking the
first name on the list, the starting point would be selected at random from the
first 50.
To return to the workforce absenteeism survey, the computerised payroll could
be sampled systematically. If the payroll had 1750 names and a sample of 250
was required, then every seventh name could be taken. Provided that the list is in
random order, the resulting sample is, in effect, random, but all the trouble of
numbering the list and using random numbers has been avoided. Systematic
sampling would usually provide a random sample if the payroll were in alphabet-
ical order.
There are, however, dangers in systematic sampling. If the payroll were listed in
workgroups, each group having six workers and one foreman, then taking one
name in seven might result in a sample consisting largely of foremen or consist-
ing of no foremen at all.
Systematic sampling can therefore produce what is, in effect, a random sample
while saving time and effort. At the other extreme, if care is not taken, the
sample can be hopelessly biased. (A short illustrative sketch of systematic
selection appears at the end of this list.)
(b) Convenience sampling. This means that the sample is selected in the easiest
way possible. This might be because a representative sample will result or it
might be that any other form of sampling is impossible.
In medicine, a sample of blood is taken from the arm. This is convenience sam-
pling. Because of knowledge of blood circulation, it is known that the sample is
as good as random.
As a further example, consider the case of a researcher into the psychological
effects on a family when a member of the family suffers from a major illness.
The researcher will probably conduct the research only on those families that are
in such a position at the time of the research and who are willing to participate.
The sample is obviously not random, but the researcher has been forced into
convenience sampling because there is no other choice. He or she must analyse
and interpret the findings in the knowledge that the sample is likely to be biased.
Mistaken conclusions are often drawn from such research by assuming the sam-
ple is random. For instance, it is likely that families agreeing to be interviewed
have a different attitude to the problem of major illness from families who are
unwilling.
(c) Quota sampling. Quota sampling is used to overcome interviewer bias. It is
frequently used in market research street interviews. If the interviewer is asked to
select people at random, it is difficult not to show bias. A young male interviewer
may, for instance, select a disproportionate number of attractive young females
for questioning.
Quota sampling gives the interviewer a list of types of people to interview. The
list may be of the form:
10 males age 35–50
10 females age 35–50
10 males age 50+
10 females age 50+
Total sample = 40
Discretion is left to the interviewer to choose the people in each category.

Note that quota sampling differs from stratified sampling. A stratum forms a
known proportion of the population; quota proportions are not known. Any
conclusions must be interpreted in the context of an artificially structured, non-
random sample.
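
The sketch promised under (a) above: systematic selection of every seventh name from a payroll of 1750, with a random start (Python; the employee names are placeholders):

    import random

    random.seed(3)
    payroll = ['employee_%d' % i for i in range(1, 1751)]
    k = len(payroll) // 250          # interval of 7 for a sample of 250
    start = random.randrange(k)      # random start within the first 7 names
    sample = payroll[start::k]       # then every seventh name
    print(len(sample), sample[:3])

If the payroll happened to be listed in workgroups of seven (six workers and a foreman), this same code could deliver the badly biased sample described above, which is why the ordering of the list must always be checked first.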

6.6 The Accuracy of Samples


If a method of sample collection existed that could guarantee to produce representa-
tive samples every time, then the problem of inference would be a simple one. For
example, suppose the average income of all the 20 000 students at London Universi-
ty is to be estimated. By some means a sample of 200 students that is fully
representative of the population (the 20 000 students) has been collected. If the
average income measured over the sample is £6000, then, since the sample is fully
representative, the average income for the population must also be £6000.
The fact that no sample can definitely be known to be representative means that
there is a need to estimate the errors that may arise. Estimating these errors is a
fundamental part of statistical inference. The statistical theory by which this can be
done will be covered later, but its result can be appreciated now. In the student
income example the sample mean was £6000. The true population mean was
unknown, but statistical theory allows statements to be made such as: ‘There is a 95
per cent probability that the true population average income lies in the range £6000
±£200, i.e. between £5800 and £6200.’ Such statements can only be made when
random sampling methods are used. In other words, random sampling not only
gives the single estimate of £6000 but, in addition, gives a measure of the accuracy
of the estimate (±£200). The statistical theory on which such accuracy calculations
are based rests on the random selection of the sample. Accuracy levels cannot be
estimated from judgement samples since they are not essentially random in nature.
Stratified sampling, besides its other advantages, increases the level of accuracy of
estimates as against that of a simple random sample. This can be demonstrated with
a simplified example. Suppose the population being sampled comprises four
managers. Their average salary is being estimated. The population is stratified by
virtue of two of the managers having accounting qualifications.

Manager Salary (£000s) Stratum


A 32 1 (Accounting qualification)
B 24 1 (Accounting qualification)
C 20 2 (No qualification)
D 16 2 (No qualification)

If samples of size 2 are taken and the average salary of each sample is computed,
then using simple random sampling there will be six possible samples:

Sample Average salary


A, B 28 (= (32 + 24)/2)
A, C 26
A, D 24 Range of sample averages: 18 → 28
B, C 22
B, D 20
C, D 18

Under stratified sampling, where each sample of two comprises one manager
from each stratum, there are four possible samples:

Sample Average salary


A, C 26
A, D 24 Range of sample averages: 20 → 26
B, C 22
B, D 20

If a sample of size 2 were being used to estimate the population average salary
(23), simple random sampling could give an answer in the range 18 → 28, but
stratified sampling is more accurate and could only give a possible answer in the
range 20 → 26. On a large scale (in terms of population and sample size), this is the
way in which stratified sampling improves the accuracy of estimates.
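
The two ranges can be verified by enumerating every possible sample of two, as in this small Python sketch using the salaries above:

    from itertools import combinations

    salaries = {'A': 32, 'B': 24, 'C': 20, 'D': 16}    # in £000s
    stratum = {'A': 1, 'B': 1, 'C': 2, 'D': 2}

    means = [(a, b, (salaries[a] + salaries[b]) / 2)
             for a, b in combinations(salaries, 2)]
    simple = [m for _, _, m in means]                  # all six possible pairs
    strat = [m for a, b, m in means if stratum[a] != stratum[b]]
    print(min(simple), max(simple))                    # 18.0 28.0
    print(min(strat), max(strat))                      # 20.0 26.0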
In summary, simple random sampling allows (in a way as yet unspecified) the
accuracy of a sample estimate to be calculated. Stratified sampling, besides its other
advantages, improves the accuracy of sample estimates.

6.7 Typical Difficulties in Sampling


6.7.1 Sampling Frame
The sampling frame is the complete list from which the sample is selected. This
sounds like the population, but in practice the two are likely to be different. A
sizeable difference between the two may mean that the results of the sampling are
unreliable.
In the usual example of the absenteeism survey, the population is the entire
workforce of the company. However, computer records will not be precisely up to
date as people join and leave the company. In addition, the payroll listings may
differ from location to location as regards such things as the inclusion or exclusion
of part-time staff. Factors such as these result in a sampling frame that is likely to
differ from the population. The difference may be important. Absenteeism among
part-time staff may not be the same as for full-time staff and care would need to be
taken in the interpretation of the results.

6.7.2 Non-response
Envisage a sample of households with which an interviewer has to make contact to
ask questions concerning the consumption of breakfast cereals. If there is no one at
home when a particular house is visited, then there is non-response. If the house is
revisited (perhaps several visits are necessary) then the sampling becomes very
expensive; if the house is not revisited then it will be omitted from the sample and
the sample is likely to be biased. The bias occurs because the households where no
one is in may not be typical of the population. For instance, if the interviewer made
his calls during the daytime then households where both marriage partners are at
work during the day would tend to be omitted, possibly creating bias in the results.
The information from the sample could not properly be applied to the whole
population but only parts of it.
Non-response does not occur solely when house visits are required, but whenev-
er the required measurements cannot be made for some elements of the sample. In
the absenteeism example, missing personnel records for some of the staff selected
for the sample would constitute a non-response problem. If efforts are made to find
or reconstruct the missing records it will be expensive; if they are left out of the
sample it will be biased since, for instance, the missing records may belong mainly to
new employees.

6.7.3 Bias
Bias is a systematic (i.e. consistent) tendency for a sampling method to produce
samples that over- or under-represent elements of the population. It can come from
many sources, and sampling frame error and non-response, as described above, are
two of them. Other sources include:
(a) Inaccurate measurement. This might be physical inaccuracy, such as a
thermometer that is calibrated 1 degree too high; it might be conceptual, such as
a salary survey that excludes perks and commission bonuses.
(b) Interviewer bias. This is when the interviewer induces biased answers. For
example, an interviewer may pose a question aggressively: ‘You don’t believe the
new job conditions are advantageous, do you?’, signalling that a particular answer
is required.
(c) Interviewee bias. Here the interviewee injects the bias. The interviewee may be
trying to impress with the extent of his knowledge and falsely claim to know
more than he does. Famously, a survey of young children’s video viewing habits
seemed to indicate that most were familiar with ‘adult’ videos until it was realised
that the children interviewed were anxious to appear more grown-up than they
were and made exaggerated claims. People of all ages tend to give biased answers
in areas relating to age, salary and sexual practice.
(d) Instrument bias. The instrument refers to the means of collecting data, such as
a questionnaire. Poorly constructed instruments can lead to biased results. For
example, questions in the questionnaire may be badly worded. ‘Why is Brand X
superior?’ raises the question of whether Brand X actually is thought to be supe-
rior. Bad questions may not just be a question of competence: they may indicate
a desire to ‘rig’ the questionnaire to provide the answers that are wanted. It is
always advisable to ask about the source of funding for any ‘independent’ survey.
Bias is most dangerous when it exists but is not recognised. When this is the case
the results may be interpreted as if the sample were truly representative of the
population when it is not, possibly leading to false conclusions.
When bias is recognised, it does not mean that the sample is useless, merely that
care should be taken when making inferences about the population. In the example
concerned with family reactions to major illness, although the sample was likely to
be biased, the results could be quoted with the qualification that a further 20 per
cent of the population was unwilling to be interviewed and the results for them
might be quite different.
In other circumstances a biased sample is useful as a pilot for a more representa-
tive sampling study.

6.8 What Sample Size?


One of the most important sampling questions concerns the size of the sample. In
our example, why choose a sample of 2000? Why not 1000 or 5000? There are two
approaches to this question.
The first is to ask what level of accuracy is needed in the results. For example, an
opinion poll to predict the level of support for a political party may give the answer
‘53% of the electorate intends to vote for the party’. What level of accuracy do we
require to be associated with this figure? If the accuracy of the poll was ‘53% ±
20%’ then the result would not be very useful in predicting the result of an election
or gauging changes in support as time went by. On the other hand, if the accuracy
was ‘53% ± 1%’ then the result would be useful for both purposes. Similarly, in a
survey of family expenditure a result that said ‘The average family spends £120 per
week on food’ would be useful if the accuracy was ‘± £1.50’ but useless if the
accuracy was ‘± £50’.
Once we know what accuracy we require to meet the purposes of the enquiry, we
can turn to statistical theory to calculate the sample size – which is related to
accuracy. The statistical theory is described in later modules but it works roughly as
follows.
With a random sample, the accuracy of the results can be calculated. It is possible
to say, with 95 per cent probability, how close to the calculated sample statistic the
true population value is likely to be. The theory (to be explained later) shows that
the larger the sample size, the closer is the estimate to the true population. In fact,
the accuracy increases (i.e. error decreases) in proportion to the square root of the
sample size. Consequently, increasing the sample from 100 to 400 (a factor of four)
doubles the accuracy (i.e. halves the error: a factor of √4 = 2); increasing it from
400 to 1600 doubles it again, and so on. There are, therefore, decreasing returns
from enlarging the sample. The answer to the question ‘What should the sample size
be?’ is this: decide what accuracy is required for the objectives of the study and then
use statistical theory to calculate the sample size that will provide this accuracy.
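
The square-root effect is easy to tabulate. A Python sketch (the baseline accuracy of ±4 units at a sample of 100 is an invented figure, used only to show the pattern):

    import math

    base_n, base_error = 100, 4.0
    for n in (100, 400, 1600, 6400):
        error = base_error * math.sqrt(base_n / n)   # error falls with sqrt(n)
        print(n, round(error, 2))
    # 100 -> 4.0, 400 -> 2.0, 1600 -> 1.0, 6400 -> 0.5:
    # each quadrupling of the sample merely halves the error.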

The second answer to the sample size question is not theoretical but is more
frequently met in practice. It is to collect the largest sample that the available budget
allows. Then the results can be interpreted in the light of the accuracy given by the
sample size.
It may seem surprising that the population size has no bearing on the sample
size. However, the population size has an effect in a different way. The definition of
simple random sampling (which is also involved in other types of sampling) is that
each member of the population should have an equal chance of being chosen.
Suppose the population is small, with only 50 members. The probability of the first
member being chosen is 1/50, the second is 1/49, the third is 1/48 and so on (i.e.
the probabilities are not equal). For a large population the difference in probabilities
is negligible and the problem is ignored. For a small population there are two
options. First, sampling could be done with replacement. This means that when an
element of the sample has been chosen, it is then returned to the population. The
probabilities of selection will then all be equal. This option means that an element
may be included in the sample twice. The second option is to use a different theory
for calculating accuracy and sample size. This second option is based on sampling
without replacement and is more complicated. A small population, therefore,
while not affecting sample size, does make a difference to the nature of sampling.
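
The unequal selection probabilities, and the effect of replacement, can be shown directly. A minimal Python sketch for a population of 50 and a sample of three:

    N = 50
    without_repl = [1 / (N - i) for i in range(3)]   # 1/50, 1/49, 1/48: unequal
    with_repl = [1 / N for _ in range(3)]            # replacement keeps 1/50 each
    print(without_repl)
    print(with_repl)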

Learning Summary
It is surprising how often information is collected in apparent
ignorance of the concept of sampling. Needing information about invoices, one
large company investigated every single invoice issued and received over a three-
month period: a monumental task. A simple sampling exercise would have reduced
the cost to around 1 per cent of the actual cost with little or no loss of accuracy.
Even after it is decided to use sampling, there is still, obviously, a need for careful
planning. This should include a precise timetable of what and how things are to be
done. The crucial questions are: ‘What are the exact objectives of the study?’ and
‘Can the information be provided from any other source?’ Without this careful
planning it is possible to collect a sample and then find the required measurements
cannot be made. For example, having obtained a sample of 2000 of the workforce,
it may be found that absence records do not exist, or it may be found that another
group in the company carried out a similar survey 18 months before and their
information merely needs updating.
The range of uses of sampling is extremely wide. Whenever information has to
be collected, sampling can prove valuable. The following list gives a guide to the
applications that are frequently encountered:
(a) opinion polls of political and organisational issues;
(b) market research of consumer attitudes and preferences;
(c) medical investigations;
(d) agriculture (crop studies);
(e) accounting;
(f) quality control (inspection of manufactured output);
(g) information systems.


In all applications, sampling is a trade-off between accuracy and expense. By
sampling at all, one is losing accuracy but saving money. The smaller the sample, the
greater the accuracy loss, but the greater the saving. The trade-off has to be made in
consideration of the accuracy required by the objectives of the study and the budget
available. Even when the full population is investigated, however, the results will not
be entirely accurate. The same problems that occur in sampling – non-response, the
sampling frame and bias – will occur with the full population. The larger the sample
size, the closer the accuracy to the maximum accuracy that is obtainable with the full
population. It may even be the case that measurement errors may overwhelm
sampling errors. For instance, even a slightly ambiguous set of questions in an
opinion poll can distort the results far more than any inaccuracy resulting from
taking a sample rather than considering the full population. The concern of many
managers that sample information must be vastly inferior to population information
is ill-founded. A modest sample can provide results that are of only slightly lower
accuracy than those provided by the whole population and at a fraction of the cost.

Review Questions
6.1 Which reason is correct?
The need to take samples arises because:
A. it is impossible to take measurements of whole populations.
B. sampling gives more accurate results.
C. sampling requires less time and money than measuring the whole population.

6.2 Which reason is correct?


Randomness of selection is incorporated into many sampling methods in order that:
A. the accuracy of inferences can be calculated.
B. the sample will be totally representative of the population.
C. the cost of sampling is reduced.
D. it is not necessary to list the entire population.

6.3 A British company divides the country into nine sales regions. Three of them are to be
selected at random and included in an information exercise. Using the table of random
numbers, what are the regions chosen? (Take the numbers starting at the top left, a row
at a time.)

Random Numbers
5 8 5 0 4
7 2 6 9 3
6 1 4 7 8

Sales regions
1. North West England
2. Eastern England
3. Midlands
4. London
5. South East England
6. South West England
7. Wales
8. Scotland
9. Northern Ireland
The sample will comprise:
A. SE England, Scotland, SE England.
B. SE England, Scotland, London.
C. Scotland, SE England, NW England.
D. SE England, Wales, SW England.
E. NW England, N Ireland, NW England.

6.4 The advantages of multi-stage sampling over simple random sampling are:
A. fewer observations need to be made.
B. the entire population does not have to be listed.
C. it can save time and effort by restricting the observations to a few areas of the
population only.
D. the accuracy of the results can be calculated more easily.

6.5 Which of the following statements about stratified sampling are true?
A. It is usually more representative than a simple random sample.
B. It cannot also be a cluster sample.
C. It may be more expensive than a simple random sample.

6.6 Which of the following statements about variable sampling fractions is true?
A. Measurements on one part of the sample are taken extra carefully.
B. A section of the population is deliberately over-represented in the sample.
C. The size of the sample is varied according to the type of items so far selected.
D. The results from the sample are weighted so that the sample is more repre-
sentative of the population.

6.7 Where are non-random (judgement) methods of sampling used? When:


A. random sampling is impossible.
B. non-random methods provide a representative sample at lower cost.
C. there is a likelihood of interviewer bias.

6.8 What is the essential difference between stratified and quota sampling?
A. Quota sampling refers only to interviewing, whereas stratification can apply to
any sampling situation.
B. Stratification is associated with random selection within strata, whereas quota
sampling is unconnected with random methods.
C. The strata sizes in the sample correspond to the population strata sizes,
whereas the quota sizes are fixed without necessarily considering the popula-
tion.

6.9 A sample of trees is to be taken (not literally) from a forest and their growth monitored
over several years. In the forest 20 per cent of the trees are on sloping ground and 80
per cent are on level ground. A sample is taken by first dividing the forest into geo-
graphical areas of approximately equal size. There are 180 such areas (36 sloping, 144
level). Three sloping areas and 12 level ones are selected at random. In each of these 15
areas, 20 trees are selected at random for inclusion in the sample.
The resulting sample of 300 (20 × 15) trees can be said to have been selected using
which of the following sample methods?
A. Multi-stage.
B. Cluster.
C. Stratification.
D. Weighting.
E. Probability.
F. Area.
G. Systematisation.
H. Convenience.

6.10 A random sample of 25 children aged 10 years is taken and used to measure the average
height of 10-year-old children with an accuracy of ±12 cm (with 95 per cent probability
of being correct). What would have been the accuracy had the sample size been 400?
A. ±3 cm
B. ±0.75 cm
C. ±48 cm
D. ±4 cm

Case Study 6.1: Business School Alumni


1 A French business school wishes to gain some information on the careers of its
graduates. It decides to take a sample of its graduates and send them a questionnaire.
The sample is selected as follows.
After a random start, every twentieth name on the list of the Alumni Association is
taken, together with every twentieth name on the list of graduates whose addresses are
known but who did not join the Association. Both lists are alphabetical. In all, 1200
usable replies were received. Does this constitute a random sample?

Case Study 6.2: Clearing Bank


1 A UK clearing bank with more than 2200 branches and 7 million customers wants to
take a representative sample of personal customer chequebook accounts. Measure-
ments taken from the sample will include average balances, numbers of transactions,
profitability and usage of services. This information will be used for setting bank charges,
assessing the effect of variations in interest rates and formulating marketing plans.
The bank is split into 14 geographical regions. Each region has between 100 and 500
branches of varying size. Over the whole country the profile of bank sizes is as follows:

Branch size No. of branches %


Large (average 5000 accounts) 697 32
Medium (average 2500 accounts) 728 34
Small (average 1000 accounts) 740 34
2165 100

It is thought that branch size affects the profitability of customer accounts, because of
differing staff ratios and differing ranges of services being available at branches of
different size. Each branch has between 100 and 15 000 chequebook accounts. All
accounts are computerised regionally. Each region has its own computer, which can
provide a chronological (date of opening account) list of chequebook accounts and from
which all the necessary information on average balances and so on can be retrieved.
a. Why should the bank adopt a sampling approach rather than taking the information
from all account holders?
b. In general terms, what factors would influence the choice of sample size?
c. If a total sample of 2000 is to be collected, what sampling method would you
recommend? Why?
d. In addition, the bank wants to use the information from the sample to compare the
profitability of chequebook accounts for customers of different socioeconomic
groups. How would you do this? What extra information would you require? NB:
An account holder’s socioeconomic group can be specified by reference to his/her
occupation. The bank will classify into five socioeconomic groups.
e. What practical difficulties do you foresee?



PART 3

Statistical Methods
Module 7 Distributions
Module 8 Statistical Inference
Module 9 More Distributions
Module 10 Analysis of Variance



Module 7

Distributions
Contents
7.1 Introduction.............................................................................................7/1
7.2 Observed Distributions ..........................................................................7/2
7.3 Probability Concepts ..............................................................................7/8
7.4 Standard Distributions ........................................................................ 7/14
7.5 Binomial Distribution .......................................................................... 7/15
7.6 The Normal Distribution .................................................................... 7/19
Learning Summary ......................................................................................... 7/27
Review Questions ........................................................................................... 7/29
Case Study 7.1: Examination Grades ........................................................... 7/31
Case Study 7.2: Car Components................................................................. 7/31
Case Study 7.3: Credit Card Accounts ........................................................ 7/32
Case Study 7.4: Breakfast Cereals ................................................................ 7/32

Prerequisite reading: Module 1, Module 5


Learning Objectives
By the end of the module the reader should be aware of how and why distributions,
especially standard distributions, can be useful. Having numbers in the form of a
distribution helps both in describing and in analysing. Distributions can be formed
from collected data, or they can be derived mathematically from knowledge of the
situation in which the data are generated. The latter are called standard distributions.
Two standard distributions, the binomial and the normal, are the main topics of the
module.
Proof of some of the formulae requires a high level of mathematics. Where pos-
sible, mathematical derivations are given but, since the purpose of this course is to
explore the practical applications of techniques, not their historical and mathemati-
cal development, there are situations where the mathematics is left in its ‘black box’.
Note on technical sections: Section 7.3 ‘Probability Concepts’, Section 7.5.3
‘Deriving the Binomial Distribution’ and Section 7.6.3 ‘Deriving the Normal
Distribution’ are technical and may be omitted on a first reading.

7.1 Introduction
Several examples of distributions have been encountered already. In Module 5, on
summary measures, driving competition scores formed a symmetrical distribution;
viewing of television serial episodes formed a U-shaped distribution; sickness
records formed a reverse J-shaped distribution. These distributions, particularly their
shapes, were used as part of the description of the data. A closer look will now be
taken at distributions and their use in the analysis of data.
There are two general types of distribution, observed and standard. The latter is
also known as a theoretical or probability distribution. An observed distribution is
derived from collecting data in some situation; the distribution is peculiar to that
one situation. A standard distribution is derived mathematically and is, theoretically,
applicable to all situations exhibiting the appropriate mathematical characteristics.
Standard distributions have the advantage that time and effort spent on data
collection are saved. The formation and application of both types of distribution will
be described in this module.

7.2 Observed Distributions


An observed distribution starts with the collection of numbers. The numbers are all
measurements of a variable, which is just what the word implies. It is some entity
that can be measured and for which the measurement varies when observations are
made of it. For instance, the variable might be the market share of a well-known
soft drink at 100 of the producer’s European marketing centres, as shown in
Figure 7.1.

[Figure 7.1: the 100 raw market-share percentages, one per marketing centre, displayed unsorted (values range from 4 to 36).]

Figure 7.1 Market share (%) of soft drink product throughout Europe
Figure 7.1 is a mess. At first, most data are. They may have been taken from large
market research reports, dog-eared production dockets or mildewed sales invoices;
they may be the output of a computer system where no effort has been made at data
communication. Some sorting out must be done. A first attempt might be to arrange
the numbers in an ordered array, as in Table 7.1.

Table 7.1 Ordered array


4 11 15 16
5 11 15 …
6 12 15 …
8 12 16 …
8 13 16 …
9 13 16 …
10 13 16 …
11 14 16 …

The numbers look neater now, but it is still difficult to get a feel for the data –
the average, the variability, etc. – as they stand. The next step is to classify the data.
Classifying means grouping the numbers in bands, such as 20–25, to make them
easier to appreciate. Each class has a frequency, which is the number of data points
(market shares) that fall within that class. A frequency table is shown in Table 7.2.

Table 7.2 Frequency table


Class Frequency
0<x≤5 2
5 < x ≤ 10 5
10 < x ≤ 15 12
15 < x ≤ 20 31
20 < x ≤ 25 27
25 < x ≤ 30 19
30 < x ≤ 35 3
35 < x ≤ 40 1
Total 100

[Figure 7.2: frequency histogram; vertical axis Frequency (0–30), horizontal axis Market shares (%) (0–40).]
Figure 7.2 Frequency histogram of soft drink market shares


Table 7.2 shows that 12 data points (market shares at 12 centres) were greater
than 10 but less than or equal to 15, 31 were greater than 15 but less than or equal
to 20 and so on. Data points are sometimes referred to as observations or read-
ings. It is now easier to get an overall impression of what the data mean. Most of
the numbers are between 15 and 25 with extremes of just above 0 and just below
40. Another arrangement, a frequency histogram as in Figure 7.2, has even greater
visual impact.
Figure 7.2 makes it easy to see that the data are spread symmetrically over their
range with the majority falling around the centre of it. As a descriptive device the
frequency histogram is satisfactory without further refinement. If, on the other
hand, the exercise had analytical objectives, then it would be developed into a
probability histogram. The transformation uses the relative frequency method of
probability measurement. The probability that any randomly selected measurement
lies within a particular class interval can be calculated:
P(number lies in class i) = frequency of class i / total frequency
The frequency histogram becomes a probability histogram by making the units of
the vertical axis probabilities instead of frequencies. Since the total frequency is 100,
the frequency 2 becomes the probability 0.02, 5 becomes 0.05, 12 becomes 0.12, etc.
The shape of the histogram remains the same. Once the histogram is in probability
form it is referred to as a distribution, although, to be strictly accurate, all configura-
tions in Figure 7.1 and Figure 7.2, and in Table 7.1 and Table 7.2, are distributions
of some sort. Analyses, based on the manipulation of probabilities, can now be
carried out. For example, what is the probability that any market share will be in the
range 10–20? From Figure 7.2:
P(10 < x ≤ 15) = 0.12
P(15 < x ≤ 20) = 0.31
These probabilities are calculated from frequencies of 12 and 31 respectively for
each class. The frequency for the combined class 10–20 is therefore 43 (= 12 + 31).
Consequently:
P(10 < x ≤ 20) = 43/100 = 0.43

Alternatively, this probability can be calculated by adding the probabilities of the
individual classes:
P(10 < x ≤ 20) = 0.12 + 0.31 = 0.43
As a further example, what is the probability that any market share is 15 or less?
P(x ≤ 15) = P(0 < x ≤ 5) + P(5 < x ≤ 10) + P(10 < x ≤ 15)
        = 0.02 + 0.05 + 0.12
        = 0.19
The calculations can be made in either frequency or probability form. Probability
calculations such as these form the basis of analyses with observed distributions.
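
The classification itself is mechanical. A Python sketch follows; the 20 values shown are an invented subset standing in for the 100 market shares of Figure 7.1.

    from collections import Counter

    shares = [4, 5, 6, 8, 8, 9, 11, 12, 15, 16,
              18, 19, 21, 23, 24, 25, 27, 28, 31, 36]

    # Class index: 0 < x <= 5 maps to 0, 5 < x <= 10 maps to 1, and so on,
    # with everything above 35 capped into the final class.
    counts = Counter(min((s - 1) // 5, 7) for s in shares)
    n = len(shares)
    for c in range(8):
        lo, hi, f = 5 * c, 5 * (c + 1), counts.get(c, 0)
        print('%d < x <= %d: frequency %d, probability %.2f' % (lo, hi, f, f / n))

Dividing each class frequency by the total number of observations turns the frequency table into the probability form used above.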
Example
Table 7.3 shows the number of deliveries of food made to a hypermarket (the number
of trucks arriving at the hypermarket) per day over a one-year period (300 working
days). The frequencies have been turned into percentages and then probabilities. For
instance, on 53 days there were between zero and nine deliveries and therefore the
probability of there being deliveries in this range on any day is 53/300 = 0.18 (approx.).
Figure 7.3 shows a probability histogram of the same data.

Table 7.3 Frequency table of deliveries


No. of No. of % Probability
deliveries days
0–9 53 18 0.18
10–19 82 27 0.27
20–29 75 25 0.25
30–39 42 14 0.14
40–49 29 10 0.10
50+ 19 6 0.06
Total 300 100 1.00

[Figure 7.3: probability histogram; vertical axis Probability (0–0.25), horizontal axis number of deliveries (0–9, 10–19, 20–29, 30–39, 40–49, 50+).]
Figure 7.3 Probability histogram of deliveries


(a) The facilities at the hypermarket are only sufficient to handle 29 deliveries per day.
Otherwise, overtime working is necessary. What is the probability that on any day
overtime will be required?
From Figure 7.3:
P(deliveries are 50 or more) = 0.06
P(deliveries 40–49) = 0.10
P(deliveries 30–39) = 0.14
Therefore:
P(deliveries exceed 29) = 0.06 + 0.10 + 0.14
= 0.30
There is a 30 per cent probability that overtime will be needed on any day.
(b) The capacity can be increased from the present 29 by taking on more staff and
buying more handling equipment. To what level should the capacity be raised if there
is to be no more than a 10 per cent chance of overtime being needed on any day?
A capacity level X is to be found such that:
P(deliveries exceed X) = 0.10
Since:
P(deliveries 50 or more) = 0.06
P(deliveries 40–49) = 0.10
X must lie between 40 and 49. To find X, assumptions have to be made. Making the
assumption that deliveries in the range of 40–49 are spread evenly over this range,
then:
P(deliveries 40–45) = 0.06
P(deliveries 46–49) = 0.04


Consequently:
P(deliveries exceed 45) = P(deliveries 46–49) + P(deliveries 50 or more)
= 0.04 + 0.06
= 0.10
Therefore, X = 45.
The capacity of 45 will result in overtime being needed on 10 per cent of days.
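The same reasoning can be expressed in a few lines of code. The Python sketch below is illustrative only; it assumes, as the text does, that the 40–49 class is spread evenly, so each delivery level in that class carries 1 per cent of probability:

    # Daily delivery probabilities from Table 7.3, in per cent
    p = {'0-9': 18, '10-19': 27, '20-29': 25, '30-39': 14, '40-49': 10, '50+': 6}

    # (a) probability that the 29-delivery capacity is exceeded
    print(p['30-39'] + p['40-49'] + p['50+'])   # 30 (per cent)

    # (b) raise capacity to X so that P(deliveries exceed X) <= 10 per cent,
    # spreading the 40-49 class evenly (1 per cent per delivery level)
    tail = p['50+']          # per cent of days with 50 or more deliveries
    x = 49
    while tail < 10:
        tail += 1            # add one delivery level from the top of 40-49
        x -= 1
    print(x)                 # 45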
An alternative representation of a frequency distribution is as a cumulative frequency
distribution. Instead of showing the frequency for each class, a cumulative frequency
distribution shows the frequency for that class and all smaller classes. For example,
instead of recording the number of days on which the deliveries of food to a hypermarket
were in the ranges 0–9, 10–19, 20–29, etc., a cumulative distribution records the
number of days when deliveries were 9 or fewer, 19 or fewer, 29 or fewer, etc.
Table 7.4 turns Table 7.3 into cumulative form.

Table 7.4 Cumulative frequencies of deliveries


No. of deliveries (f)   Cumulative no. of days   %   Probability P(no. deliveries ≤ f)
≤9 53 18 0.18
≤ 19 135 45 0.45
≤ 29 210 70 0.70
≤ 39 252 84 0.84
≤ 49 281 94 0.94
All 300 100 1.00

Table 7.4 shows cumulative frequencies in ‘less than or equal to’ format. They could just
as easily be in ‘more than or equal to’ format, as in Table 7.5.

Table 7.5 Cumulative frequencies of deliveries


No. of deliveries (f)   Cumulative no. of days   %   Probability P(no. deliveries ≥ f)
≥0 300 100 1.00
≥ 10 247 82 0.82
≥ 20 165 55 0.55
≥ 30 90 30 0.30
≥ 40 48 16 0.16
≥ 50 19 6 0.06

These cumulative frequency tables can be put into the form of graphs. They are then
known as ogives. Figure 7.4 shows Table 7.4 and Table 7.5 as ogives.


[Figure: 'less than' and 'more than' ogives; vertical axis: cumulative frequency (0 to 300); horizontal axis: deliveries (0 to 49)]

Figure 7.4 Ogives


Typical questions, such as ‘If the facilities can handle 29 deliveries per day, what is the
probability of this capacity being exceeded?’, can be answered more easily when the data
are in cumulative form or displayed as an ogive.
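As an illustration of how the cumulative tables are built, the following Python sketch (using the Table 7.3 frequencies) generates both the 'less than or equal to' and the 'more than or equal to' columns:

    from itertools import accumulate

    classes = ['0-9', '10-19', '20-29', '30-39', '40-49', '50+']
    days = [53, 82, 75, 42, 29, 19]            # frequencies from Table 7.3
    total = sum(days)                          # 300

    # 'Less than or equal to' cumulative frequencies (Table 7.4)
    cum_le = list(accumulate(days))            # 53, 135, 210, 252, 281, 300

    # 'More than or equal to' cumulative frequencies (Table 7.5)
    cum_ge = [total - le + f for le, f in zip(cum_le, days)]   # 300, 247, 165, 90, 48, 19

    for c, le, ge in zip(classes, cum_le, cum_ge):
        print(c, le, round(le / total, 2), ge, round(ge / total, 2))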

7.3 Probability Concepts


Before we move from observed distributions to standard distributions, some
preliminary work on probability concepts is needed. This will be of value in coming
to understand the derivation of standard distributions.
The fundamentals of statistical probability were discussed in the first module.
Probability is the likelihood of an event taking place; it takes values between 0 and 1;
it can be measured by three methods: a priori, relative frequency and subjective
assessment. These basic ideas will now be extended to some further properties of
probability. At first this will be done in abstract by thinking of hypothetical events
referred to as A, B, C, … Examples later will make the ideas more concrete.
Events A, B, C, … are mutually exclusive if the occurrence of one automatical-
ly rules out the occurrence of others. For example, suppose the events refer to a
company’s profit in one year’s time: A is a loss or breakeven, B is a profit greater
than 0 but less than £100 000, C is a profit of at least £100 000 but less than
£200 000, D is a profit of at least £200 000 but less than £300 000, and so on. The
events are mutually exclusive since the occurrence of one automatically rules out the
others. If an additional event (AA) were the company being made bankrupt, then
the events AA, A, B, C, … are not mutually exclusive, since, for instance, it is


possible for the company to make a loss (event A) and be made bankrupt (event
AA).
The addition law for mutually exclusive events is:
P(A or B or C or …) = P(A) + P(B) + P(C) + …
This law has already been used implicitly in the example on hypermarket deliver-
ies. The probabilities of different classes were added together to give the probability
of an amalgamated class. The law was justified by relating the probabilities to the
frequencies from which they had been derived.
Example
Referring to the example on hypermarket deliveries (Table 7.3), what is the probability
of fewer than 40 deliveries on any day?
The events (the classes) in Table 7.3 are mutually exclusive. For example, given that the
number of deliveries on a day was in the range 10–19, it is not possible for that same
number of deliveries to belong to any other class. The addition law, therefore, can be
applied:
P(fewer than 40 deliveries) = P(0–9 deliveries or 10–19 or 20–29 or 30–39)
= P(0–9) + P(10–19) + P(20–29) + P(30–39)
= 0.18 + 0.27 + 0.25 + 0.14
= 0.84

Equally, the result could have been obtained by working with the frequencies from which
the probabilities were obtained.
Note that if events are not mutually exclusive, this form of the addition law cannot be
used.
A conditional probability is the probability of an event under the condition that
another event has occurred or will occur. It is written in the form P(A/B), meaning the
probability of A given the occurrence of B. For example, if P(rain) is the probability of
rain later today, P(rain/dark clouds now) is the conditional probability of rain later
given that the sky is cloudy now.
The probabilities of independent events are unaffected by the occurrence or non-
occurrence of the other events. For example, the probability of rain later today is not
affected by dark clouds a month ago and so the events are independent. On the other
hand, the probability of rain later today is affected by the presence of dark clouds now.
These events are not independent. The definition of independence is based on condi-
tional probability. Event A is independent of B if:
P(A) = P(A/B)
This equation is merely the mathematical way of saying that the probability of A is not
affected by the occurrence of B. The idea of independence leads to the multiplication
law of probability for independent events:
P(A and B and C and …) = P(A) × P(B) × P(C) × …
The derivation of this law will be explained in the following example.


Example
Twenty per cent of microchips produced by a certain process are defective. What is the
probability of picking at random three chips that are defective, defective, OK, in that
order?
The three events are independent since the chips were chosen at random. The multipli-
cation law can therefore be applied.
P(1st chip defective and 2nd chip defective and 3rd chip OK)
= P(defective) × P(defective) × P(OK)
= 0.2 × 0.2 × 0.8
= 0.032
This result can be verified intuitively by thinking of choosing three chips 1000 times.
According to the probabilities, the 1000 selections are likely to break down as in
Figure 7.5. Note that in a practical experiment the probability of, for instance, 20 per
cent would not guarantee that exactly 200 of the first chips would be defective and 800
OK, but it is the most likely outcome. Figure 7.5 shows 32 occasions when two
defectives are followed by an OK; 32 in 1000 is a proportion of 0.032 as given by the
multiplication law.

[Figure: tree diagram of 1000 selections. 1st chip: 200 defective, 800 OK. 2nd chip: the 200 defectives split into 40 defective and 160 OK; the 800 OKs split into 160 defective and 640 OK. 3rd chip: each branch splits 20/80 again, e.g. the 40 (defective, defective) split into 8 defective and 32 OK. The path defective, defective, OK thus accounts for 32 of the 1000 selections.]
Figure 7.5 Breakdown of probabilities
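The multiplication law can also be checked by simulation. The Python sketch below is illustrative; the seed and the number of trials are arbitrary choices. It draws many triples of chips and counts how often the sequence defective, defective, OK occurs:

    import random

    random.seed(1)                      # arbitrary seed, for a repeatable run
    trials = 100_000
    hits = 0
    for _ in range(trials):
        # three chips chosen at random, each defective with probability 0.2
        chips = [random.random() < 0.2 for _ in range(3)]
        if chips == [True, True, False]:    # defective, defective, OK in order
            hits += 1
    print(hits / trials)                # close to 0.2 * 0.2 * 0.8 = 0.032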


Closely associated with probability calculations is the idea of a combination. It is
concerned with the sequencing of objects selected in a sample. A combination is
defined as the number of different ways in which r objects can be chosen from a
total of n objects; nCr is the mathematical notation for a combination. An example
will explain the idea further: how many ways are there of choosing a subcommittee
of two people from a full committee of four?
The problem is tackled by first looking into how many orderings or sequences
the four can be put. Suppose the four people are labelled A, B, C, D. There are four
possible candidates for the first place. Once the first place has been decided, there
remain three possible candidates for the second place, leaving two for the third and
finally there is just one possibility for the last place. In total, this is 4 × 3 × 2 × 1 =
24 ways of sequencing the four people. This is shown diagrammatically in Fig-
ure 7.6. The expression 4 × 3 × 2 × 1 is known as 4 factorial (written ‘4!’).


[Figure: tree of the 24 orderings of A, B, C and D: four possibilities for 1st place, then three for 2nd, two for 3rd and one for 4th]

Figure 7.6 Ways of ordering four people


More generally:
n factorial = n! = n × (n − 1) × (n − 2) × … × 3 × 2 × 1
From the bottom row of Figure 7.6, there are seen to be 24 ways of sequencing
four people. If the first two are separated from the others to form the subcommit-
tee, as in Table 7.6, it may appear at first glance that there are 4! ways of
forming the subcommittee. This is not so, because there is repetition. First, repeti-
tion occurs because the order of the two people left out does not affect the
subcommittee. In Table 7.6, the first two sequences (numbers 1 and 2) are the same
as regards the subcommittee. Both start with A, B. They appear as separate se-
quences only because the two people left out are in the order C, D and D, C
respectively. This situation occurs whatever two people are in the subcommittee
places. Each of the pairs of orderings 1,2; 3,4; 5,6; 7,8, etc. provides just one
subcommittee. The number of different subcommittees can be halved from 24 to 12
as a result of this effect.

Table 7.6 Selecting two from four


A A A A A A B B B B B B C C C C C C D D D D D D
B B C C D D A A C C D D A A B B D D A A B B C C
C D B D B C C D A D A C B D A D A B B C A C A B
D C D B C B D C D A C A D B D A B A C B C A B A
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

Second, repetition occurs because the order of the two selected for the subcom-
mittee does not affect its constitution. It is the same subcommittee of A and B
whether they were selected first A then B or vice versa. In Table 7.6 each pair of
orderings 1,7; 2,8; 3,14 etc. provides just one subcommittee. Consequently, the
number of subcommittees can be further halved from 12 to 6. Since the process
started with all possible sequences of the four candidates, and since allowance has
now been made for all repetitions, this is the final number of different subcommit-
tees. They are:
AB, AC, AD, BC, BD, CD


Put more concisely, the number of ways of choosing these six was calculated
from:

24/(2 × 2) = 6

Using factorial notation this can be written:

4C2 = 4!/(2! × 2!) = 24/(2 × 2) = 6

More generally, the number of ways of selecting r objects from a larger group of
n objects is:
nCr = n!/(r! × (n − r)!)

This can be verified intuitively by using numbers instead of n and r in a similar


manner to the subcommittee example. (Proof of the combination formula involves
some difficult mathematics and is not given.)
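In practice the arithmetic is rarely done by hand. A Python sketch of the subcommittee calculation follows; math.comb is the standard library's combination function:

    from math import comb, factorial

    # Number of subcommittees of 2 chosen from a committee of 4
    print(comb(4, 2))                                          # 6

    # The same value from the factorial formula n!/(r! * (n - r)!)
    n, r = 4, 2
    print(factorial(n) // (factorial(r) * factorial(n - r)))   # 6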
Example
What are the probabilities of obtaining 0, 1, 2, 3, 4 sixes when four dice are rolled
simultaneously?
The probability of four sixes is obtained from the multiplication law for independent
events. The events (the outcome for each die) are independent since the rolling of one
is not affected by the rolling of any other. The law is therefore applicable.
P(4 sixes) = P(6 on 1st die) × P(6 on 2nd) × P(6 on 3rd) × P(6 on 4th)
= 1/6 × 1/6 × 1/6 × 1/6
= 0.001 (approx.)
The probability of zero sixes is calculated similarly:
P(0 sixes) = P(1 … 5 on 1st die) × P(1 … 5 on 2nd)
× P(1 … 5 on 3rd) × P(1 … 5 on 4th)
= 5/6 × 5/6 × 5/6 × 5/6
= 0.482
An attempt to calculate the probability of one six could be made as follows:
P(1 six) = P(6 on 1st die) × P(1 … 5 on 2nd)
× P(1 … 5 on 3rd) × P(1 … 5 on 4th)
= 1/6 × 5/6 × 5/6 × 5/6
= 0.0965
However, this only allows for the one six being on the first die; the six could have been
on any die. Using the addition law:
P(the 6 is on 1st or 2nd or 3rd or 4th die) = P(1 six)
= P(the 6 on 1st die) + P(the 6 on 2nd) + P(the 6 on 3rd) + P(the 6 on 4th)
= 0.0965 × 4
= 0.386
An attempt to calculate the probability of two sixes could be as follows:


P(2 sixes) = P(6 on 1st die) × P(6 on 2nd) × P(1 … 5 on 3rd) × P(1 … 5 on 4th)
= 1/6 × 1/6 × 5/6 × 5/6
= 0.0193
Once again this refers to the sixes occurring in a specific order. The two sixes could
occur on any of the dice. In how many ways could this happen? The answer is 4C2, the
number of ways to select two objects (showing sixes) from a total of four. Consequent-
ly:
P(2 sixes) = 0.0193 × 4C2
= 0.0193 × 6 (from the subcommittee example)
= 0.116
Similarly, the probability of three sixes:
P(3 sixes) = No. of ways the 3 sixes could occur × P(3 sixes in a specific order)
= 4C3 × (1/6 × 1/6 × 1/6 × 5/6)
= (4!/(3! × 1!)) × 0.00386
= 4 × 0.00386
= 0.015
All the calculations, not just those for two and three sixes, can be generalised using the
combination notation:
P(r sixes) = 4Cr × (1/6)^r × (5/6)^(4−r)
This formula has been shown to work for r = 2 and 3. It also works for one six since
4C1 = 4. For r = 0 the formula is:
P(0 sixes) = 4C0 × (5/6)^4
4C0 requires knowledge of 0! Mathematicians have defined this as 1 since this is the
only value that makes consistent sense. Given this definition:
4C0 = 4!/(0! × 4!) = 24/24 = 1
Similarly, using 0! = 1 makes the formula work for four sixes. The complete answer to
the question is:

No. sixes   Probability
0           0.482
1           0.386
2           0.116
3           0.015
4           0.001

These calculations are not of immediate practical value, but they are the basis for
the derivation of the binomial distribution later in the module.
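The whole table can be generated from the general formula. A Python sketch (illustrative only) for four dice with p = 1/6:

    from math import comb

    n, p = 4, 1 / 6                     # four dice, P(six) = 1/6 on each
    for r in range(n + 1):
        prob = comb(n, r) * p**r * (1 - p)**(n - r)
        print(r, round(prob, 3))
    # 0 0.482, 1 0.386, 2 0.116, 3 0.015, 4 0.001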


7.4 Standard Distributions


Observed distributions often entail a great deal of data collection. Not only must
sufficient data be collected for the distribution to take shape and probabilities to be
measurable, but also the data must be collected individually for each and every
situation. Standard distributions can reduce the amount of data collection.
A standard distribution is one that has been defined mathematically from a theo-
retical situation. The characteristics of the situation were expressed mathematically
and the resulting distribution (that is, the probabilities of the variable taking differ-
ent values) calculated. In other words, the probabilities were measured not by the
relative frequency method (as in observed distributions) but by the a priori method.
When an actual situation resembling the theoretical one arises, the associated
standard distribution is applied without the need for a lot of data collection.
For example, what distribution results when an unbiased coin is tossed? In this
case the variable takes on just two values, heads or tails. To obtain the standard
distribution of this situation is easy. Both heads and tails are equally likely (the coin
is unbiased). Therefore the distribution is:
P(heads) = 0.5
P(tails) = 0.5
An observed distribution would be obtained by tossing the coin many times,
counting the number of heads and tails and calculating probabilities from these
frequencies. The standard and observed distributions should be approximately
(although they may not be exactly) the same.
One much-used standard distribution, the normal, was derived from the theoret-
ical situation of a variable being generated by a process that should give the variable
a constant value but does not do so because it is subject to many small disturbances.
As a result the variable is distributed around the constant value. The mathematical
expression of such a situation can be used to calculate the probabilities of the
variable taking different values. The normal distribution would then apply, for
instance, to the lengths of machine-cut steel rods. The rods should all be of the
same length but are not because of variations introduced by vibration, machine
setting, the operator, etc. Knowledge of the properties of the normal distribution
allows one to find, for instance, the percentage of rods whose lengths fall
within given tolerance ranges. The normal distribution also applies to
many other situations with similar characteristics. It will be discussed in detail later
in the module. There are other standard distributions that derive from situations
with other characteristics.
In summary, therefore, using an observed distribution implies that data have
been collected, probabilities calculated and histograms formed; using a standard
distribution implies that the situation in which the data are being generated resem-
bles closely a theoretical situation for which a distribution has been constructed
mathematically.


7.5 Binomial Distribution


One of the earliest standard distributions to be developed was the binomial
distribution.

7.5.1 Characteristics
The binomial distribution is discrete (the values taken by the variable are distinct),
giving rise to stepped shapes. Figure 7.7 illustrates this and also that the shape can
vary from right-skewed through symmetrical to left-skewed depending upon the
situation in which the data are collected.

Figure 7.7 Binomial distribution

7.5.2 Situations in which the Binomial Occurs


Like all standard distributions, the binomial is constructed mathematically from a
theoretical situation. For the binomial the situation is as follows:

The elements of a statistical population are of two types. Each element must be
of one but only one type. The proportion, p, of the population that is of the
first type is known (and the proportion of the second type is therefore 1 − p).
A random sample of size n is taken. Because the sample is random, the number
of elements of the first type it contains is not certain (it could be 0, 1, 2, 3, …
or n) depending upon chance.

From this theoretical situation the probabilities that the sample contains given
numbers (from 0 up to n) of elements of the first type can be calculated. If a great
many samples were actually collected, a histogram would be gradually built up, the
variable being the number of elements of the first type in the sample. Probabilities
measured from the histogram should match those theoretically calculated (approxi-
mately but not necessarily exactly, as with the coin tossing example). The
probabilities calculated theoretically are binomial probabilities; the distribution
formed from these probabilities is a binomial distribution.
For example, a machine produces microchip circuits for use in children’s toys.
The circuits can be tested and found to be defective or OK. The machine has been
designed to produce no more than 20 per cent defective chips. A sample of 30 chips
is collected. Assuming that overall exactly 20 per cent of chips are defective, the


binomial probabilities give the likelihood that the sample contains 0, 1, 2, … or 30


defective chips. The probabilities can be found in a binomial distribution table (such
as Appendix 1, Table A1.1, the use of which will be explained later). If several
samples of size 30 are taken, a distribution of the number of defectives per sample
will emerge. The frequencies of the distribution should match the theoretically
calculated probabilities. If they do not, this may be purely by chance. On the other
hand, it may be an indication that the assumption of 20 per cent defectives is not
correct. Standard distributions are often used in this way. Theoretical probabilities
are compared with observed frequencies and discrepancies used to indicate where
assumptions (in this example the defective rate of the machine) may be false.
To summarise, the binomial relates to situations in which a sample is taken from
a population that consists of two types of element (hence the name ‘binomial’).
Binomial probabilities are the probabilities of obtaining different numbers of each
type in the sample.
Actual situations in which the binomial is used include:
(a) inspection schemes (as above);
(b) opinion polls (the two-way split of the population results from agreement/
disagreement with statements made by the pollster);
(c) selling (the split is the outcome of a contact being a sale/no sale).

7.5.3 Deriving the Binomial Distribution


Suppose a population is split into type 1 elements (proportion p = 1/6) and type 2
elements (proportion 1 − p = 5/6) and that a sample of size 4 is taken. How can the
probabilities that the sample contains 0, 1, 2, 3 or 4 elements of type 1 be calculated?
These probabilities are binomial probabilities. The histogram formed from them is a
binomial distribution.
Fortunately these probabilities have already been calculated. The earlier example
involving dice was exactly the situation described above. The population was split
into two parts: sixes (proportion 1/6) and non-sixes (proportion 5/6). A sample of
4 (4 dice were rolled) was taken. The formula used to calculate the probabilities in
this situation was:
P(r sixes) = 4Cr × (1/6)^r × (5/6)^(4−r)
This formula for the specific case of sample size 4 and proportion of type 1, p =
1/6, can be extended to the general binomial case of sample size n and proportion p
by substituting p for 1/6, 1 − p for 5/6 and n for 4. The general formula by which
binomial probabilities are calculated is:
P(r of type 1 in sample) = nCr × p^r × (1 − p)^(n−r)
The symbols are:
n = sample size
p = proportion of type 1 in population
1 − p = proportion of type 2 in population
nCr = mathematical shorthand for a 'combination' = n!/(r! × (n − r)!)
n! (pronounced 'n factorial') = n × (n − 1) × (n − 2) × … × 2 × 1


The following example shows how this formula can be used. If 40 per cent of the
USA electorate are Republican voters, what is the probability that a randomly
assembled group of three will contain two Republicans?
Using the binomial formula:
P(2 Republicans in group of 3)
= P(r = 2)
= nCr × p^r × (1 − p)^(n−r)
= 3C2 × p^2 × (1 − p)^1, since sample size, n, equals 3
= 3C2 × 0.4^2 × 0.6, since the proportion of type 1 (Republicans) is 0.4
= (3!/(2! × 1!)) × 0.16 × 0.6
= 3 × 0.16 × 0.6
= 0.288
There is a 28.8 per cent chance that a group of three will contain two Republican
voters.
Making similar calculations for r = 0, 1 and 3, we obtain the binomial distribution
of Figure 7.8.

[Figure: histogram with bars at r = 0, 1, 2, 3 of heights 21.6%, 43.2%, 28.8% and 6.4%]

Figure 7.8 Binomial distribution for groups of three, p = 40%
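A Python sketch of the general binomial formula follows, shown here reproducing the probabilities of Figure 7.8; the function name binom_pmf is just an illustrative choice:

    from math import comb

    def binom_pmf(r, n, p):
        # P(r of type 1 in sample) = nCr * p^r * (1 - p)^(n - r)
        return comb(n, r) * p**r * (1 - p)**(n - r)

    for r in range(4):
        print(r, round(binom_pmf(r, 3, 0.4), 3))
    # 0 0.216, 1 0.432, 2 0.288, 3 0.064 (as in Figure 7.8)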

7.5.4 Using Binomial Tables


Most analyses involving the binomial distribution are based not on mathematics but
on binomial tables. The probabilities calculated in the Republican voter example can
be obtained from tables. The answers to this example may be checked using
binomial tables for sample size 3 (for this and other sizes from 1 to 8 see Appendix
1, Table A1.1). Look down the left-hand side of the table until the correct sample
size, n = 3, is found; look across the table to find p = 0.4; read down the column;
the 4 probabilities for r = 0, 1, 2, 3 are the same as in Figure 7.8.


Example
A manufacturer has a contract with a supplier that no more than 5 per cent of the
supply of a particular component will be defective. The component is delivered in lorry
loads of several hundred. From each delivery a random sample of 20 is taken and
inspected. If three or more of the 20 are defective, the load is rejected. What is the
probability of a load being rejected even though the supplier is sending an acceptable
proportion of 5 per cent defective?
The binomial distribution is applicable, since the population is split two ways and the
samples are random. Table A1.1 in Appendix 1 is a binomial table for samples of size 20.
The rows refer to the number of defective components in the
sample; the columns refer to the proportion of defectives in the population of all
components supplied (0.05 here). The following is based on the column headed 0.05 in
Table A1.1 in Appendix 1:

P(0 defectives) = 35.8%
P(1 defective) = 37.7%
P(2 defectives) = 18.9%
P(3 defectives) = 6.0%
P(more than 3) = 1.6%
Total = 100.0%

The manufacturer rejects loads when there are three or more defectives in the sample.
From the above, the probability of a load being rejected even though the supplier is
sending an acceptable proportion of defectives is (6.0% + 1.6% =) 7.6 per cent.
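Where tables are unavailable, the same figure can be computed directly. The Python sketch below sums the binomial probabilities for 0, 1 and 2 defectives and takes the complement; it gives about 0.0755, while the 7.6 per cent above comes from the table's rounded entries:

    from math import comb

    def binom_pmf(r, n, p):
        return comb(n, r) * p**r * (1 - p)**(n - r)

    # P(3 or more defectives in a sample of 20 when 5% of supply is defective)
    p_reject = 1 - sum(binom_pmf(r, 20, 0.05) for r in range(3))
    print(round(p_reject, 4))   # 0.0755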

7.5.5 Parameters
The distributions in Figure 7.7 differ because the parameters differ. Parameters fix
the context within which a variable varies. The binomial distribution has two
parameters: the sample size, n, and the population proportion of elements of the
first type, p. Two binomial distributions with different parameters, while still having
the same broad characteristics shown in Figure 7.7, will be different; two binomial
distributions with the same sample size and the same proportion, p, will be identical.
The variable is still free to take different values but the parameters place restrictions
on the extent to which different values can be taken.
The binomial probability formula in Section 7.5.3 demonstrates why this is so.
Once n and p are fixed, the binomial probability is fixed for each r value; if n and p
were changed to different values, the situation would still be binomial (the same
general formula is being used), but the probabilities and the histogram would differ.
Right-skewed shapes occur when p is small (close to 0), left-skewed shapes when p
is large (close to 1), and symmetrical shapes when p is near 0.5 or n is large (how
large depends upon the p value). This can be checked by drawing histograms for
different parameter values.


7.5.6 Deciding whether Data Fit a Binomial


The theoretical situation on which the binomial distribution is defined involves
several assumptions, particularly the assumption that the sample is selected at
random. In an actual situation, if the assumptions are not met, then the theoretical
probabilities will not reflect reality. In practice it is difficult to know the extent to
which the assumptions are met. It is prudent, therefore, to check whether the
theoretical situation matches the actual. This is done by making some observations
and comparing them with what would be expected theoretically if the binomial
applied.
This check is not part of the analysis itself but is to decide whether the situation
is binomial. For example, the purpose of the sampling scheme in the previous
example involving components supplied under contract was to check that one of
the parameters (p) was within acceptable limits. A preliminary check should be made
that the situation is truly binomial and that binomial probabilities can be applied
legitimately. This initial check could be done by taking samples from lorries contain-
ing a known proportion of defective items. If this proportion were 5 per cent, then
35.8 per cent of the samples should have no defectives, 37.7 per cent one defective,
18.9 per cent two defectives and so on in accordance with the extract taken from
Table A1.1 in Appendix 1 in the example above. If this were not approximately the
case then, since the parameters are known with certainty, the conclusion would have
to be drawn that the situation was not binomial. There is a further example of this
procedure in the case studies at the end of the module.

7.6 The Normal Distribution


The most used and best-known standard distribution is probably the normal.

7.6.1 Characteristics
The normal distribution is bell shaped and symmetrical, as in Figure 7.9. It is also
continuous. The distributions met up to now have been discrete. A distribution is
discrete if the variable takes distinct values such as 1, 2, 3, … but not those in
between. The delivery distribution (Table 7.3) is discrete since the data are in
groups; the binomial distribution is discrete since the variable (r) can take only
whole number values. Continuous means that all values are possible for the variable.
It could take the values 41.576, 41.577, 41.578, … rather than all these values being
grouped together in the class 40–49 or the variable being limited to whole numbers
only.


[Figure: bell-shaped curve, symmetrical about the mean, with bands marked at ±1s, ±2s and ±3s; the central ±1s band holds 68% of the area]

Figure 7.9 Normal distribution


A further implication of continuity is that probabilities are measured not by the
height of the distribution above the x-axis (as in the discrete case), but by areas
under the curve. For example, P(15 ≤ x ≤ 25) is the area under the part of the
curve between x = 15 and x = 25 as a proportion of the whole area of the distribu-
tion. The reason for this is mathematical, but intuitively it can be seen that since
P(x = 18.629732…), say, must be infinitesimally small, a continuous distribution in
which probabilities are measured by the height of the curve above points (as with
the discrete distributions) would be virtually flat. (Refer back to Module 1 for a
fuller discussion of continuity.)
The normal distribution also has important attributes with respect to the standard
deviation (see Figure 7.9):
68% area (68% of readings) within ±1 standard deviation of mean
95% area (95% of readings) within ±2 standard deviations of mean
99.7% area (99.7% of readings) within ±3 standard deviations of mean
For example, it is known that the IQ of children is normally distributed with
mean 100 and standard deviation 17. These attributes imply that:
68% children have an IQ in range 83–117 (mean ±1 standard deviation)
95% children have an IQ in range 66–134
99.7% children have an IQ in range 49–151
Tables exist that give the area under any part of a normal curve, not just within
ranges specified by whole numbers of standard deviations. Table A1.2 from
Appendix 1, which will be explained later, is an example.
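These areas can be verified with the cumulative normal function. A Python sketch using the standard library's statistics.NormalDist, with the IQ parameters quoted above:

    from statistics import NormalDist

    iq = NormalDist(mu=100, sigma=17)   # IQ parameters quoted above

    # proportion of children within 1, 2 and 3 standard deviations of the mean
    for k in (1, 2, 3):
        p = iq.cdf(100 + k * 17) - iq.cdf(100 - k * 17)
        print(k, round(p, 4))
    # 1 0.6827, 2 0.9545, 3 0.9973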

7.6.2 Situations in Which the Normal Occurs


The normal distribution is constructed mathematically from the following theoreti-
cal situation:

Repeated observations or measurements are taken of the same constant quan-


tity. Each time an observation is taken, the quantity is subject to many sources


of disturbance. Each source gives rise to a small variation in the value of the
quantity. The variations are equally likely to be positive or negative at random;
they are independent of one another (i.e. no variation is influenced by any oth-
er); they can be added together.

Because of positive and negative variations cancelling each other out, the tenden-
cy would be for most values to be close to the central value, but with a few at
greater distances. Intuitively, the resulting symmetrical bell shape can be visualised.
A great many practical situations approximate very well to the theoretical. It is
not necessary to know what the sources of variation are, merely that they are likely
to exist. Actual situations in which the normal distribution has been found to apply
include:
(a) IQs of children;
(b) heights of people of the same sex;
(c) dimensions of mechanically produced components;
(d) weights of machine-produced items;
(e) arithmetic means of large samples.
In the case of human beings, the variations giving rise to normal distributions of
IQs, heights and other variables are presumably the many genetic and environmen-
tal effects that make people different. For mechanically produced items, the sources
must include vibration (from a number of factors), tiny differences in machine
settings, different operators, etc.
The use of the distribution in sampling is one of the most important. The varia-
tions that make one sample different from another arise from the random choices as
to the elements included in the sample. This is an important part of statistical
inference, the subject of the next module.

7.6.3 Deriving the Normal Distribution


The situation from which the normal distribution is constructed can be expressed
mathematically as follows. Repeated observations are taken of the same quantity, L.
Each time an observation is taken, the quantity is subject to many sources of
variation. Each source gives rise to a change in the value of L by +u or −u (u is
some very small amount). It is a matter of chance and equally likely whether the
variation is positive or negative. The variations are independent and additive.
Suppose there were 100 such sources. If, on the first observation, 57, say, acted
positively and 43 negatively, the quantity measured (x) would be:
x = L + 57u − 43u = L + 14u
The second observation might have 38 acting positively, 62 negatively; x would
be L − 24u. Over many observations a distribution of values would be obtained.
Can this distribution of x be expressed mathematically?
Each observation is based on a sample of 100 variations. The population from
which the sample comes is split into two types: 50 per cent are variations of +u and
50 per cent are variations of −u. This is exactly the binomial distribution.


The probability of, say, 57 positive variations (and therefore 43 negative) is given
by the binomial probability formula with p = 0.5 and n = 100:
P(r = 57) = 100C57 × 0.5^57 × 0.5^43
Since r = 57 gives rise to an x value of L + 14u:
P(x = L + 14u) = 100C57 × 0.5^57 × 0.5^43
The probabilities of other x values could be calculated similarly. The distribution
of x will be symmetrical if displayed in the form of a histogram since a binomial
distribution with p = 0.5 is symmetrical. This is close to what is being sought except
that a normal distribution is based on many small variations, rather than 100
variations of size u. To make further progress some advanced mathematics is
needed. The number of variations is increased to be infinitely large while the size of
the variations is reduced to be infinitely small. The resulting formula for the normal
distribution is:
P(x ≤ X) = ∫ from −∞ to X of (1/(σ√(2π))) × e^(−(x−μ)²/(2σ²)) dx

The formula may not appear attractive or even comprehensible. Fortunately it is
not necessary to be able to use it. As with the binomial, there are normal curve
tables (see Table A1.2 in Appendix 1) from which probabilities can be read.

7.6.4 Using Normal Curve Tables


Table A1.2 in Appendix 1 is a normal curve table from which probabilities relating
to all variable values can be obtained. Because the normal is a continuous distribu-
tion, probabilities are measured by areas and relate to ranges, not specific values of
the variable. The table shows the area from the mean of the distribution as far as a
given number, z, of standard deviations to the right. This is illustrated in Fig-
ure 7.10. For example, to find the area from the mean to 1.18 standard deviations to
the right, first look down the left-hand column as far as z = 1.1. Then look along the
top row as far as 0.08. The intersection of row and column shows 0.3810. In other
words, the area (shaded in Figure 7.10) from the mean to the point 1.18 standard
deviations to the right is 0.3810.

Figure 7.10 Normal curve


Example
A factory produces a wide variety of tins for use with food products. A machine
produces the lids for tins of instant coffee. The diameters of the lids are normally
distributed with a mean of 10 cm and a standard deviation of 0.03 cm.
(a) What percentage of the lids have diameters in the range 9.97 cm to 10.03 cm?
Information about areas under normal curves refers to values of the variable being
measured in terms of ‘standard deviations away from the mean’. The range 9.97 to
10.03 when in this form is 10 – 0.03 to 10 + 0.03 (i.e. Mean – 1s to Mean + 1s).
From the basic properties of a normal curve (Figure 7.9), this range includes 68 per
cent of the area under the curve or, in other words, 68 per cent of the observations.
So, 68 per cent of the lids have diameters in the range 9.97 to 10.03 cm. (Check
with Table A1.2 in Appendix 1 that this is approximately correct by looking up z =
1.0.)
(b) Lids of diameter greater than 10.05 cm are too large to form an airtight seal and
must be discarded. If the machine produces 8000 lids per shift, how many are wast-
ed?
Lids wasted have diameters greater than 10.05, or greater than mean + 1.67 stand-
ard deviations (since (10.05 − 10.00)/0.03 = 1.67). The range is therefore no longer
expressed as the mean plus a whole number of standard deviations. For ranges such
as these the normal curve table has to be used. The table is used as follows: each
number in the body of the table is the area under the curve from the mean to a
point a given number (z) of standard deviations to the right of the mean (the shaded
area in Figure 7.10). In this example, the area under the curve from the mean to the
point 1.67 standard deviations to the right is wanted. In Table A1.2, Appendix 1,
look down the left-hand column to find 1.6, then across the top row to find 0.07.
The figure at the intersection is 0.4525, corresponding to 1.67 standard deviations
from the mean.
The question asks for the probability of a diameter greater than 10.05 cm (i.e. for
the area beyond 1.67 standard deviations). The area to the right of the mean as far
as 1.67 standard deviations has just been found to be 0.4525; the entire area to the
right of the mean is 0.5, since the distribution is symmetrical. Therefore (see Fig-
ure 7.11):
(diameter greater than 10.05) = 0.5 − 0.4525
= 0.0475
So, 4.75 per cent of lids will have a diameter greater than 10.05 cm. Therefore, in a
shift, 4.75 per cent of 8000 = 380 lids will be unusable.


[Figure: normal curve of lid diameters; axis marked at 10.0 and 10.05 cm; area 0.4525 between them, area 0.0475 beyond 10.05]

Figure 7.11 Large lids


(c) Lids that are too small also have to be discarded. Diameters less than 9.93 cm do
not fit and are unusable. How many lids per shift have to be discarded because they
are either too large or too small?
First, the percentage of diameters that are less than 9.93 is to be found. In terms of
standard deviations this is:
= (10 − 9.93)/0.03
= 2.33 standard deviations to the left of the mean
Since the distribution is symmetrical, the table can equally well be used for areas
either side of the mean. As before, find 2.3 down the left column and 0.03 along the
top row. At the intersection, the table gives a value of 0.4901.
(diameter 9.93 to 10.0) = 0.4901
(diameter ≤ 9.93) = 0.5 − 0.4901
= 0.0099
So 0.99 per cent of the lids are too small, which, in a shift, comes to 0.99 per cent of
8000 = 79. In total, 79 + 380 lids are unusable = 459 per shift.
(d) Between what limits is 90 per cent of the production?
We can restate this as: ‘How many standard deviations (z) will cover an area of 0.45
to the right of the mean and an area of 0.45 to the left (see Figure 7.12)?’

[Figure: normal curve with the central 90% of the area lying between z standard deviations below and above the mean]

Figure 7.12 90 per cent limits


Use Table A1.2, Appendix 1 in reverse. In the body of the table, find 0.45. The value
0.4495 is in the row 1.6 and the column 0.04; 0.4505 is in the row 1.6 and the col-
umn 0.05. Approximately, then, the area 0.45 corresponds to 1.645 standard


deviations. So 90 per cent of the production lies within ±1.645s of the mean, or
within ±1.645 × 0.03 cm of the mean = ±0.04935 cm. The range of diameters for 90
per cent of the lids is, therefore, 9.951 to 10.049 cm (approx.).
Note that the basic properties given in Figure 7.9 are approximations only. By look-
ing up z = 1.00, it can be seen that 68.26 per cent of the area lies within ±1 standard
deviation of the mean. Similarly, 95.44 per cent lies within ±2 standard deviations
and 99.74 per cent within ±3 standard deviations.
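All four parts of the example can be reproduced with the cumulative normal function and its inverse. A Python sketch follows (illustrative only; small differences from the answers above arise because the table values are rounded):

    from statistics import NormalDist

    lids = NormalDist(mu=10.0, sigma=0.03)
    per_shift = 8000

    # (a) proportion of diameters between 9.97 and 10.03 cm
    print(round(lids.cdf(10.03) - lids.cdf(9.97), 4))       # about 0.6827

    # (b) lids larger than 10.05 cm and the number wasted per shift
    p_large = 1 - lids.cdf(10.05)
    print(round(p_large * per_shift))                       # about 382 (text: 380)

    # (c) lids smaller than 9.93 cm
    p_small = lids.cdf(9.93)
    print(round(p_small * per_shift))                       # about 79

    # (d) symmetric range holding 90 per cent of production
    z = NormalDist().inv_cdf(0.95)                          # about 1.645
    print(round(10.0 - z * 0.03, 3), round(10.0 + z * 0.03, 3))   # 9.951 10.049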

7.6.5 Parameters
The normal distribution applies to both IQs and weights of bread loaves, yet the
shapes of the two distributions are different (see Figure 7.13). The distributions
differ because the parameters differ. The normal distribution has two parameters,
the arithmetic mean and the standard deviation. Contrast this with the binomial
distribution, whose two parameters are the sample size and the population propor-
tion of type 1. Two normal distributions with the same mean and the same standard
deviation will be identical; two normal distributions with different means and
standard deviations, while still having the characteristics shown in Figure 7.9, will be
centred differently and be of different widths.

[Figure: two normal curves: (a) bread weights, axis 495 to 505 grams; (b) IQs, axis 40 to 160]

Figure 7.13 Normal distributions of (a) bread; (b) IQ


Parameters fix the context within which the variable varies. In terms of the way
in which a normal distribution arises theoretically, the parameters can be thought of
as determining L and the number and size of variations (u) operating. Once these
are fixed, the distribution is fixed exactly, as can be seen from the normal distribu-
tion formula.

7.6.6 Deciding whether Data Fit a Normal Distribution


The normal distribution is constructed on the basis of a theoretical situation and a
set of assumptions that are not likely to match exactly the real situation to which
they are being applied. When it is suspected that a standard distribution can be
applied to a situation, it may be prudent to check that the variable does, approxi-
mately, fit the distribution. To do this the observations should be compared to what
is theoretically expected. For example, do the following data approximate to a
normal distribution? The data are the numbers of breakdowns per month at a
manufacturing plant.


5, 8, 2, 4, 4, 6, 5, 3, 7, 6, 9, 7, 4, 2, 6, 5, 3, 5, 5, 4
The data will be easier to handle if they are grouped together:

Value       2  3  4  5  6  7  8  9
Frequency   2  2  4  5  3  2  1  1

First, calculate the parameters: the arithmetic mean and standard deviation.
Mean = 5.0
Standard deviation = 1.9
What is theoretically expected from a normal distribution with these parameters
is that 68 per cent of the observations will be between 5 ±1.9, 95 per cent between 5
±3.8, etc. as follows:

Observed % Theoretical %
Mean ±1s : 3.1 to 6.9 60 68.0
Mean ±2s : 1.2 to 8.8 95 95.0
Mean ±3s : −0.7 to 10.7 100 99.7

Although it is not perfect, the match between observed and theoretical for just 20
observations is reasonably good, suggesting an approximate normal distribution.
Judgement alone has been used to say that the match is acceptable. Statistical
approaches to this question are available but are beyond the scope of this module.
The methods are, however, based on the same principle. The observed data are
compared with what is theoretically expected under an assumption of normality.
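The comparison of observed with theoretical percentages can be scripted. A Python sketch for the breakdown data, using the sample standard deviation (which gives the 1.9 quoted above):

    from statistics import mean, stdev

    data = [5, 8, 2, 4, 4, 6, 5, 3, 7, 6, 9, 7, 4, 2, 6, 5, 3, 5, 5, 4]
    m, s = mean(data), stdev(data)      # 5.0 and about 1.9

    for k, theory in ((1, 68.0), (2, 95.0), (3, 99.7)):
        within = sum(m - k * s <= x <= m + k * s for x in data)
        print(k, 100 * within / len(data), theory)
    # observed 60%, 95%, 100% against 68%, 95%, 99.7%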

7.6.7 Approximating the Binomial with the Normal


The binomial distribution is difficult to use because of its complicated formula and
lengthy probability tables (a separate table is needed for each sample size). The
normal is easier to use because of its simple, single table. For certain parameter
values the shape of the binomial is similar to the normal. In these circumstances,
purely for ease of use, the binomial is approximated by the normal distribution.
The binomial is roughly symmetrical when p is not close to 0 or 1, or when n is large.
As a rule of thumb, the binomial can be approximated by the normal provided np
and n(1 − p) both exceed 5. In order to use the normal, its parameters, the mean
and standard deviation need to be known. For a binomial distribution, using some
‘black box’ mathematics, it can be shown that:
Arithmetic mean = np
Standard deviation = √(np(1 − p))
These parameters are calculated from the available data and then used as if the
distribution were truly normal.
Instead of relating to the numbers of each type in the sample, the distribution
could also refer to the proportion; instead of the number of defective parts in a

sample, the distribution could be based on the proportion of defectives. The


parameters then become (just dividing by n):
Arithmetic mean = p
Standard deviation = √(p(1 − p)/n)

Example
If, on average, 20 per cent of firms respond to industrial questionnaires, what is the
probability of fewer than 50 responses when 300 questionnaires are sent out?
The situation is one to which the binomial applies. Firms either respond or do not
respond, giving the two-way split of the population. A sample of 300 is being taken from
this population. To answer the question using the binomial formula, the following would
have to be calculated or obtained from tables:
P(0 responses) = 300C0 × 0.2^0 × 0.8^300
P(1 response) = 300C1 × 0.2^1 × 0.8^299
⋮
P(49 responses) = 300C49 × 0.2^49 × 0.8^251
These 50 probabilities would then have to be added. This is a daunting task, even for
mathematical masochists. However, since np = 60 (= 300 × 0.2) and n(1 − p) = 240 are
both greater than 5, the normal approximation can be used, with parameters:
Arithmetic mean = np = 300 × 0.2 = 60
Standard deviation = √(np(1 − p))
= √(300 × 0.2 × 0.8)
= 7 (approx.)
A slight difficulty arises in that a discrete distribution is being approximated by a
continuous one. To make the approximation it is necessary to pretend that the variable
(the number of responses) is continuous (i.e. 60 responses represent the interval
between 59.5 and 60.5, 61 the interval between 60.5 and 61.5 and so on). Consequently,
to answer the question posed, P(r ≤ 49.5) must be found:
49.5 is 10.5 from the mean, or
z = 10.5/7
= 1.5 standard deviations from the mean
Using Table A1.2 in Appendix 1:
P(r ≤ 49.5) = 0.5 − 0.4332
= 0.0668
P(fewer than 50 replies) = 6.68%
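A Python sketch of the approximation, including the continuity correction, follows (illustrative only; the 6.68 per cent above uses the standard deviation rounded to 7):

    from math import sqrt
    from statistics import NormalDist

    n, p = 300, 0.2
    mu = n * p                          # 60
    sigma = sqrt(n * p * (1 - p))       # about 6.93 (rounded to 7 in the text)

    # 'fewer than 50' means r <= 49; with the continuity correction, r <= 49.5
    print(round(NormalDist(mu, sigma).cdf(49.5), 4))   # about 0.0648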

Learning Summary
The analysis of management problems often involves probabilities. For example,
postal services define their quality of service as the probability that a letter will reach

its destination the next day; electricity utilities set their capacity at a level such that
there is no more than some small probability that it will be exceeded and power cuts
necessitated; marketing managers in contracting companies may try to predict future
business by attaching probabilities to new contracts being sought. In such situations
and many others, including those introduced earlier, the analysis is frequently based
on the use of observed or standard distributions.
An observed distribution usually entails the collection of large amounts of data
from which to form histograms and estimate probabilities.
A standard distribution is mathematically derived from a theoretical situation. If
an actual situation matches (to a reasonable approximation) the theoretical, then the
standard distribution can be used both to describe and to analyse the situation. As a
result, fewer data need to be collected.
This module has been concerned with two standard distributions: the binomial
and the normal. For each, the following have been described:
(a) its characteristics;
(b) the situations in which it can be used;
(c) its derivation;
(d) the use of probability tables;
(e) its parameters;
(f) how to decide whether an actual situation matches the theoretical situation on
which the distribution is based.
The mathematics of the distributions have been indicated but not pursued rigor-
ously. The underlying formulae, particularly the normal probability formula, require
a relatively high level of mathematical and statistical knowledge. Fortunately such
detail is not necessary for the effective use of the distributions because tables are
available. Furthermore, the role of the manager will rarely be that of a practitioner
of statistics; rather, he or she will have to supervise the use of statistical methods in
an organisation. It is therefore the central concepts of the distributions, not the
mathematical detail, that are of concern. To look at them more deeply goes beyond
what a manager will find helpful and enters the domain of the statistical practitioner.
The distributions that have been the subject of this module are just two of the
many that are available. However, they are two of the most important and useful.
The principles behind the use of any standard distribution are the same, but each is
associated with a different situation. A later module will look at other standard
distributions and their applications.


Review Questions
7.1 What method of probability measurement is used to estimate probabilities in observed
distributions?
A. A priori
B. Relative frequency
C. Subjective

7.2 A machine produces metal rods of nominal length 100 cm for use in locomotives. A
sample of 1000 rods is selected, the rods measured, their lengths classified in groups of
size 0.1 cm, and a frequency histogram drawn. From the histogram, probabilities are
calculated. What type of distribution has been formed?
A. Observed
B. Binomial
C. Normal

7.3 The amounts owed by the customers of a mail-order company at the end of its financial
year would form a continuous distribution. True or False?
Questions 7.4 to 7.6 refer to the following situation:
A medical consultant books appointments to see 20 patients in the course of a morn-
ing. Some of them cancel their appointments at little or no notice. From past records
the following probabilities have been calculated.

P(0 cancellations) = 32%
P(1 cancellation) = 29%
P(2 cancellations) = 22%
P(3 cancellations) = 11%
P(4 cancellations) = 3%
P(5 or more) = 3%
Total 100%

7.4 What is the probability that on a particular morning there will be no more than one
cancellation?
A. 61%
B. 29%
C. 39%
D. 9%

7.5 What is the probability that on two successive mornings there will be no cancellations?
A. 64%
B. 16%
C. 10.2%
D. 9.1%


7.6 What assumption must be made to justify the use of the multiplication rule to answer
Question 7.5?
The events are:
A. mutually exclusive.
B. conditional.
C. independent.

7.7 A management team of three is to be chosen to handle an organisation’s relocation.


There are eight suitable candidates. How many ways of choosing the team are there?
A. 1344
B. 336
C. 40 320
D. 56

7.8 Which of the following are advantages in using standard distributions compared to
observed distributions?
A. No data need be collected.
B. Knowledge of the distribution is available in probability tables.
C. Greater accuracy.

7.9 What standard distribution would you expect to describe the variation in the number of
persons, in randomly selected samples of 100 residents of London, who are watching a
popular television programme at a given time?
A. Normal
B. Binomial

7.10 In Question 7.9, the normal distribution could be used as an approximation to the
binomial. True or False?

7.11 A company manufacturing chocolate bars planned that, three months after launching a
new product, 40 per cent of the nation should have heard of it. If this has been achieved,
what is the probability that a random sample of five people will contain only one who
has heard of it?
A. 0.08
B. 0.05
C. 0.13
D. 0.26

Questions 7.12 to 7.14 refer to the following situation:


An accounting office receives dockets from the factory floor. Clerks in the office
have to collect them, record the information on computer and file the dockets. Previous
research has established that the number of dockets cleared per clerk per day is
normally distributed with mean 190 and standard deviation 25. There are 12 clerks
working in the office.

7.12 Estimate the total number of dockets cleared by all clerks each day.
A. 2280
B. 300
C. 190
D. 2880

7.13 Estimate the approximate number of clerks who clear more than 215 per day.
A. 1
B. 1 clerk every other day
C. 2
D. 3

7.14 Within what range would the number of dockets cleared by any clerk on any day be
expected to lie, with 95 per cent probability?
A. 140 to 240
B. 165 to 215
C. 115 to 265
D. 140 to 215

Case Study 7.1: Examination Grades


1 At the end of an MBA course in business statistics, the final examination grades have a
mean of 69.8 and a standard deviation of 11.6. There were 180 students on the course.
Assuming that the distribution of these grades (all whole numbers) is normal, find:
a. the percentage of grades that should exceed 85;
b. the percentage of grades less than 40;
c. the number of failures (pass = 50 per cent);
d. the lowest distinction mark if at most the highest 8 per cent of grades are to be
awarded distinctions.

Case Study 7.2: Car Components


1 A company manufactures car components. The quality control scheme for a particular
type of component consists of taking random samples of six components at regular
intervals. The number of defective components is counted and charted. Out of 100 such
samples, in 52 cases no defectives were found, in 34 cases one defective, in ten cases
two defectives and in the remaining cases three defectives.

a. Are these results consistent with the view that the process is operating with an
average 10 per cent of defectives?
b. What reservations have you about your conclusion?

Case Study 7.3: Credit Card Accounts


1 A credit card company is monitoring the use of client accounts. A study shows that the
average monthly expenditure of its regular card users is normally distributed with mean
£280 and standard deviation £90. The customers are classified into four groups
according to expenditure:
a. Group 1 spends less than £200.
b. Group 2 spends at least £200 but less than £300.
c. Group 3 spends at least £300 but less than £400.
d. Group 4 spends £400 or more.
What percentage of customers would you expect to fall within each group?

Case Study 7.4: Breakfast Cereals


1 Random samples of 20 consumers in the United Kingdom are asked if they usually eat a
particular brand of breakfast cereal at least once per week. It is known that 45 per cent
of people are regular users (as defined above) of the cereal. The number of regular
users in the samples will vary, but within what range would you expect the number of
users to be for 95 per cent of such samples?



Module 8

Statistical Inference
Contents
8.1 Introduction  8/1
8.2 Applications of Statistical Inference  8/2
8.3 Confidence Levels  8/2
8.4 Sampling Distribution of the Mean  8/3
8.5 Estimation  8/6
8.6 Basic Significance Tests  8/9
8.7 More Significance Tests  8/18
8.8 Reservations about the Use of Significance Tests  8/24
Learning Summary  8/26
Review Questions  8/28
Case Study 8.1: Food Store  8/30
Case Study 8.2: Management Association  8/31
Case Study 8.3: Textile Company  8/31
Case Study 8.4: Titan Insurance Company  8/31

Prerequisite reading: Module 7

Learning Objectives
Statistical inference is the set of methods by which data from samples can be turned
into more general information about populations. By the end of the module, the
reader should understand the basic underlying concepts. Statistical inference has two
main parts. Estimation is concerned with making predictions and specifying their
accuracy; significance testing is concerned with distinguishing between a result
arising by chance and one arising from other factors. The module describes some of
the many different types of significance test. As in the last module, some of the
mathematics will have to be left in a ‘black box’.

8.1 Introduction
Statistical inference is the use of sample data to predict, or infer, further pieces of
information about the population from which the sample or samples came. It is a
collection of methods by which knowledge of samples can be turned into
knowledge about the populations. Recall that statistically a population is the set of
all possible values of a variable. One form of inference is estimation, where a
sample is the basis for predicting the values of population parameters; a second is
significance testing, which is a procedure for deciding whether sample evidence
supports or rejects a hypothesis (a statement about a population).
The theoretical background required concerns statistical distributions, especially
the normal distribution. However, to make fuller use of inferential methods, some
additional theory is required. This will be discussed before we move on to consider
the processes of inference, but first some applications of inference are given.

8.2 Applications of Statistical Inference


(a) Market research. Much of market research is based on investigating a sample
of consumers and then generalising the results to the whole of the potential
market. For example, a manufacturer of male toiletries was re-designing the
packaging of the product line. The company wanted to know the proportion of
purchases of male toiletries made by women. The market research department
looked at a random sample of 1200 purchases made at several different retail
outlets. Of the purchases, 728 (61 per cent) were made by women. Statistical
inference allowed the conclusion to be drawn (with near but not absolute cer-
tainty) that between 58 and 64 per cent of all purchases of male toiletries are
made by women.
(b) Medicine. When a new medical treatment is discovered it is often both impos-
sible and undesirable to give it to all those who suffer from the complaint.
Usually it is given to a sample of sufferers and their progress compared to that of
patients not given the treatment. A significance test helps to decide whether a
difference in the average recovery rates of the two groups is of sufficient size to
conclude that the new treatment is beneficial. For example, a new process for
speeding the healing of pulled muscles had been developed. Hospital records
showed that, in the recent past, healing had taken an average of 12 days with
some variation on either side of the average. The new process was applied to a
sample of 30 people with pulled muscles. On average, healing took 10.5 days.
The difference of 1.5 days could be because the new treatment was better or it
could have arisen purely by chance since healing time does of course vary from
person to person. A significance test is a method of distinguishing between the
two causes. In this case the conclusion was that the evidence was not strong
enough to support the view that the new treatment was better.

8.3 Confidence Levels


A sample is just a selection of data. More data from the rest of the population
remain uncollected. Although drawn at random, a sample could still be unrepre-
sentative of the population. Conclusions drawn from the sample may be wrong
because it is not representative. To allow for this possibility, inferences cannot be
stated with complete certainty. Fortunately, from the variability of the data within
the sample, it is possible to calculate what is in effect the probability that conclu-
sions are correct. Conclusions are drawn, such as ‘It is predicted, with 95 per cent
confidence, that the population mean percentage is in the range 58–64 per cent.’
This is what was meant by ‘near certainty’ in the toiletries example. It is not stated
that the population mean is definitely in the range 58–64 per cent. The sample of
1200 on which the prediction was based may have been unrepresentative. It was not
known whether this was the case, but, from the variability within the sample, it was
possible to say that it would be expected that 19 out of 20 such samples would
have a mean in the range given. This is the meaning of ‘95 per cent confidence’. A
confidence level attached to a statement is in effect the probability that the state-
ment is true.
All inferences are made at some level of confidence. In the medical example, the
conclusion that the new treatment was no better was stated ‘with 95 per cent
confidence’, meaning that, if the same sample results arose on 100 occasions, it
would be expected that the conclusion ‘no better’ would be correct on 95 of those
occasions. By convention, 95 per cent is the usual confidence level. If the confi-
dence level were set too high, it would be too tough a barrier and very little could
pass it; if set too low it would have little discriminatory ability. The 95 per cent level,
or one in 20, is generally regarded as an acceptable level of risk. But there is no
reason why other levels of confidence should not be used. The method of deriving
confidence levels from sample variability uses the next piece of theory.

8.4 Sampling Distribution of the Mean


Inference is accomplished more easily and more accurately with the help of a further
distribution, that of the mean of a sample. Imagine a series of samples being taken
at random from a population. The mean of each sample would differ from sample
to sample, purely by chance because the samples were chosen at random. The
sample mean has, therefore, a distribution. It is referred to as the sampling
distribution of the mean. The distribution has particular predictable characteris-
tics.
Figure 8.1(a) shows a variable that has a normal distribution. For example, the
variable may be the actual lengths of machine-produced rods of nominal length
100 cm (or any of the previous examples of normally distributed variables). A
random sample of rods is taken and the arithmetic mean of the rods is calculated
and recorded. A second sample of the same size is taken, and a third, and a fourth
and so on. Gradually, a distribution of the sample means is built up (see Fig-
ure 8.1(b)). It can be verified mathematically that the new distribution will be normal
and have the same mean as the original but be narrower. It is intuitively reasonable
that the sampling distribution should have these characteristics. The narrowness
occurs because of the tendency for extreme (very long or very short) rods to be
‘averaged out’ by the rods of more moderate length in the sample. The extent of the
narrowing is measured by the sampling distribution having a smaller standard
deviation equal to:

Standard deviation of the individual distribution/√Sample size, i.e. σ/√n


Figure 8.1 Sampling distribution of the mean from a normal distribution: (a) the individual distribution, centred on 100 cm; (b) the narrower distribution of sample means, also centred on 100 cm


The relationship between the sampling distribution of the mean and the individu-
al distribution from which the samples were drawn can be summarised:

                     Individual distribution    Sampling distribution
Shape                Normal                     Normal
Mean                 μ                          μ (unchanged)
Standard deviation   σ                          σ/√n (n = sample size)

To prove these properties are true requires some ‘black box’ mathematics.
If the individual distribution is not normal, the outcome of taking samples is
more surprising. Figure 8.2(a) shows a non-normal distribution. It could be any non-
normal distribution, for example, the number of copies of a local weekly newspaper
read in a year by the inhabitants of the small town that it serves. A random sample
of at least 30 inhabitants is taken and the arithmetic mean of the number of copies
read by the sample is calculated. More samples of the same size are taken, and
eventually the distribution will be as shown in Figure 8.2(b).


Figure 8.2 Sampling distribution of the mean of a non-normal distribution: (a) number of inhabitants against copies read (average = 8); (b) average copies read per person for samples of more than 30 people


The sampling distribution is normal. The similarities and differences between the
original individual distribution and the sampling distribution of the mean can be
summarised:

                     Individual distribution    Sampling distribution
Shape                Non-normal                 Normal
Mean                 μ                          μ (unchanged)
Standard deviation   σ                          σ/√n (n = sample size)

Provided the sample size is greater than 30, the sampling distribution of the mean
will be approximately normal whatever the shape of the distribution from which the
samples were taken. The ‘30’ is a rule of thumb. If the individual distribution is at all
similar to a normal, then a sample size of fewer than 30 will be enough to make the
sampling distribution of the mean normal. For a distribution only slightly skewed, a
sample size of just four or five may well be sufficient.
The normalisation property is a consequence of the central limit theorem. This
states that as the sample size is increased, the sampling distribution of the mean
becomes progressively more normal. In view of the mathematics involved, only the
results have been presented. The great benefit of this theorem is that, even though
the distribution of a variable is unknown, the taking of samples enables the distribu-
tion of the sample mean to be known. Analysis can then proceed using techniques
and tables associated with the normal distribution.
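
The central limit theorem is easy to check by simulation. The following short
Python sketch (our own illustration, not part of the course text; the names and
figures are invented, with an exponential distribution standing in for the skewed
newspaper readership example) draws repeated samples of 30 from a non-normal
distribution with mean 8 and shows the sample means clustering normally around 8
with standard deviation close to 8/√30:

    import random
    import statistics

    random.seed(1)

    # A skewed (non-normal) individual distribution: exponential with mean 8.
    population = [random.expovariate(1 / 8) for _ in range(100_000)]

    # The mean of many random samples of size 30.
    sample_means = [
        statistics.mean(random.sample(population, 30)) for _ in range(5_000)
    ]

    print(statistics.mean(population), statistics.stdev(population))      # both close to 8
    print(statistics.mean(sample_means), statistics.stdev(sample_means))  # close to 8 and 8/√30 ≈ 1.46
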
Example
A manufacturing organisation has a workforce of several thousand. Sickness records
show that the average number of days each employee is off sick is 14 with a standard
deviation of 6. If random samples of 100 employees were taken and the average number
of days of sickness per employee calculated for each sample, what would be the
distribution of these sample averages? What would be its parameters?
The starting distribution is the number of days of sickness for each employee. The shape
of this distribution is not known, but it is likely to be a reverse J (see Figure 8.3(a)). Its
mean is 14 and its standard deviation 6. Moving to the distribution of the average
number of days of sickness per employee of samples of 100 employees (see Fig-
ure 8.3(b)), the new distribution will be normal or nearly normal because the sample
size exceeds 30 (central limit theorem). The parameters will be:
Mean = 14 days
Standard deviation = 6/√100
= 0.6 days

Figure 8.3 Sampling sickness records: (a) days’ sickness per employee (average = 14, standard deviation = 6); (b) average days’ sickness per employee for samples of size 100 (average = 14, standard deviation = 0.6)

8.5 Estimation
Estimation is the prediction of the values of population parameters given knowledge
of a sample. The sickness record example of Figure 8.3 can demonstrate the use of
estimation. Since the sampling distribution for samples of size 100 is normal with
mean 14 and standard deviation 0.6, 95 per cent of all such samples will have their
mean in the range 12.8–15.2 days. This follows from the property of normal
distributions that 95 per cent of values lie within ±2 standard deviations of the
mean. In other words, on 95 per cent of occasions (or, at the 95 per cent confidence
level) the sample mean will be within 1.2 days of the population mean.
This shows how a range for sample means can be estimated from the population
mean. However, this is estimation in the wrong direction – from the population to
the sample instead of from the sample to the population. The process can be turned
round to estimate the population mean from a single sample mean. Suppose, as is
usually the case, the population mean is not known but that a sample of 100
employees’ sickness records has an average number of days of sickness per employ-
ee of 11.5. This must be less than 1.2 days away from the unknown population
mean (at 95 per cent confidence level), from the above calculations. This unknown
population mean must then be within 1.2 days of 11.5 (see Figure 8.4).

Figure 8.4 Sickness records: estimating the population mean (the sample mean of 11.5 days sits at the centre; the population mean lies within ±2 standard deviations, i.e. between 10.3 and 12.7 days)


It can therefore be said that, at the 95 per cent confidence level, the population
mean is in the range 10.3–12.7 days. The mean of the sample (11.5 days) is the
point estimate of the population mean; the range 10.3–12.7 days is the 95 per cent
confidence limits for the estimate. Note that, although the concept of the distribu-
tion of sample means is being used, in practice only one sample is taken.
When estimating the population mean, we used the population standard devia-
tion. This is unlikely to be known. More typically, it has to be calculated from the
sample (i.e. the standard deviation of the sample is calculated and used as if it were
the population standard deviation). Fortunately, statistical theory shows that this
approximation is valid provided the sample size is, as another rule of thumb, greater
than 30. There are now two reasons for choosing a sample size of 30+. First, the
central limit theorem requires a sample size of 30 or more; second, using the sample
standard deviation as the population standard deviation requires a sample size of 30
or more.
The standard deviation of the sampling distribution of the mean is sometimes
referred to as the standard error. This stems from its use in specifying confidence
limits (i.e. errors of estimation).
On a technical point, it should be noted that, by convention, statisticians use
different notations for population measures and measures derived from samples.
Population measures are written in Greek letters, sample statistics in Roman. The
known arithmetic mean of a population is denoted by μ; a mean calculated from a
sample taken from the population is denoted by x̄. Likewise, σ and s refer to
population and sample standard deviations. So, in the above example, the
confidence limits for μ are x̄ ± 1.2, i.e. 11.5 ± 1.2.
The general procedure for estimating the mean of a population (for convenience
and brevity let the variable in question be labelled x) is as follows.
(a) Take a random sample of size at least 30. Let the sample size be labelled n. The
minimum of 30 is so that the central limit theorem holds and the standard devia-
tion approximation is valid. A smaller sample can be used if the distribution is
normal and the population standard deviation is already known. Even if these
two conditions do not hold, a sample smaller than 30 can sometimes still be
used, but some more advanced theory is needed.
(b) Calculate the sample mean (x̄) and the sample standard deviation (s).
(c) The standard deviation of the sampling distribution of the mean (the standard
error) is calculated as s/√n.
(d) The point estimate of the population mean is x̄.
(e) The 95 per cent confidence limits for the population mean are x̄ ± 2s/√n.
Example
A sample of 49 of a brand of light bulb lasted on average 1100 hours before failing. The
standard deviation was 70 hours. Estimate the overall average life length for this brand
of bulb.
Follow the general procedure as set out above:
(a) The sample, which is assumed to be random, has been taken. The size is sufficient
for the central limit theorem to apply and the standard deviation approximation to
be valid.
(b) The mean is 1100 hours, the standard deviation 70 hours.
(c) The standard deviation of the sampling distribution is 70/√49 = 10.
(d) The point estimate of life length is 1100 hours.
(e) The 95 per cent confidence limits are 1080–1120 hours. From the normal distribu-
tion Table A1.2 in Appendix 1, the limits for other confidence levels can be found.
For example, since 90 per cent of a normal distribution lies within ±1.645 standard
deviations of the mean, the 90 per cent confidence limits in this case are:
1100 ± 1.645 × 10
i.e. 1083.55 to 1116.45 hours
Because the central limit theorem applied, these confidence limits could be found
without the need to know anything about the shape of the original distribution. The only
calculations were to find the mean and standard deviation from the sample.
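
As a quick illustration, the five-step procedure can be written in a few lines of
Python (a sketch of ours, not part of the course text; the function name is
invented). Applied to the light bulb figures it reproduces the limits above:

    import math

    def confidence_limits(sample_mean, sample_sd, n, z=2.0):
        # Standard error of the sampling distribution of the mean: s/sqrt(n).
        # z = 2.0 gives the approximate 95 per cent limits used in the text;
        # z = 1.645 gives the 90 per cent limits.
        standard_error = sample_sd / math.sqrt(n)
        return sample_mean - z * standard_error, sample_mean + z * standard_error

    print(confidence_limits(1100, 70, 49))         # (1080.0, 1120.0)
    print(confidence_limits(1100, 70, 49, 1.645))  # (1083.55, 1116.45)
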
The ideas of estimation can help in deciding upon sample sizes. In making an estimation,
a particular level of accuracy may be required. The sample size can be chosen in order
to provide that level of accuracy.
Example
In the case of the light bulbs above, what sample size is needed to estimate the average
length of life to within ±5 hours (with 95 per cent confidence)?
Instead of the sample size being known and the confidence limits unknown, the situation
is reversed. At the 95 per cent level:
Confidence limits required = x̄ ± 2s/√n
= 1100 ± 5 hours
1100 ± 5 = 1100 ± 140/√n
i.e. 5 = 140/√n
√n = 28
n = 784
The formula used in this calculation related the accuracy of the estimate to the square
root of the sample size. This is why accuracy becomes progressively more expensive.
For example, a doubling in accuracy requires a quadrupling of sample size (and, presum-

8/8 Edinburgh Business School Quantitative Methods


Module 8 / Statistical Inference

ably, a great increase in expense). The trade-off between accuracy and sample size is
therefore an important one.
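
The sample-size calculation can be sketched the same way (again our own
illustration with an invented function name): rearranging the limit 2s/√n = accuracy
gives n = (2s/accuracy)².

    import math

    def sample_size(sample_sd, half_width, z=2.0):
        # Smallest n with z * sample_sd / sqrt(n) <= half_width.
        return math.ceil((z * sample_sd / half_width) ** 2)

    print(sample_size(70, 5))  # 784, as in the light bulb example
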

8.6 Basic Significance Tests


Significance tests are a methodology by which to judge whether a particular piece of
sample evidence is consistent with a hypothesis (a supposition about a population).
The methodology involves five steps.
(a) Formulate the hypothesis. It may be any idea or hunch, the truth of which is
to be investigated. Using the example given in Section 8.2 on the treatment for
pulled muscles, the hypothesis was that the new treatment made no difference to
recovery time. In other words, the population of recovery times under the new
treatment is the same as under the old. This hypothesis is usually referred to as
the null hypothesis.
In science the null hypothesis has traditionally been something the experimenter
is trying to disprove. For example, in testing the effectiveness of a new medical
treatment the null hypothesis would generally be that the treatment made no
difference – and the test would then try to reject this statement. However, this
does not mean that the null hypothesis always refers to ‘no change’. It all de-
pends on the circumstances. For example, if a drugs company claimed that a new
drug lowered the temperature of a person with fever by one degree then, if the
government department responsible for monitoring drugs was doubtful about
the claim, it might well set up a test with the null hypothesis that the drug did
lower temperature by one degree – and then try to reject this statement. In short,
there are no hard-and-fast rules for setting the null hypothesis but it does depend
on the purpose and desired outcome of the test.
The alternative hypothesis is what we conclude if the null hypothesis is dis-
proved. For example, if the null hypothesis was that the treatment had no effect,
the alternative hypothesis would normally be that the treatment did have an
effect. If the null hypothesis was that the drug lowered temperature by one de-
gree, the alternative hypothesis would normally be that the drug did not lower
the temperature by one degree. Described this way the alternative hypothesis
may seem a trivial opposite of the null, but it does assume greater importance in
cases when it is possible to stipulate a specific alternative. This is discussed later
in the module.
(b) Collect a sample of evidence concerned with the validity of the hypothesis and
calculate the statistics needed to test the hypothesis. In the example, the recovery
times for the patients who were given the new treatment provided such a sample
and the statistic was the sample average recovery rate.
(c) Decide on a significance level. This is a probability that marks the borderline
between what the tester is prepared to believe has happened by chance and what
has not. It is supposed that if an event (the sample statistic) occurs that has a
probability greater than the significance level, then this is not especially unusual
and it is entirely believable that it happened purely by chance. If an event occurs
with a probability of less than the significance level, then this is deemed to be an
unusual event and it is not believable that it happened purely by chance. Factors
other than pure chance must be at work. The significance level is chosen by the
person conducting the test to reflect his or her view of the credible and the in-
credible, but usually and conventionally it is 5 per cent.
(d) Calculate the probability of the sample evidence occurring, under the
assumption that the hypothesis is true. In the example, the probability of the
sample average recovery rate being 10.5 days was calculated, assuming that it
came from a population with a mean recovery rate of 12 days (the average for
the old treatment).
(e) Compare the probability with the significance level. If it is higher, it is
judged consistent with the hypothesis (the sample result is thought to have hap-
pened purely by chance) and the hypothesis is accepted; if it is lower, it is judged
inconsistent with the hypothesis (the sample result is thought too unusual to
have happened purely by chance) and the hypothesis is rejected. When the hy-
pothesis is rejected, the result is said to be significant at the 5 per cent level.
It is rare for evidence to prove conclusively the truth of a proposition. The evi-
dence merely alters the balance of probabilities. Significance tests give a critical but
arbitrary point that divides evidence supporting the proposition from that which
does not. The significance level is this critical point. Its use is an abrupt, black-and-
white method of separation, but it does provide a convention and a framework for
weighing evidence. When a result is reported as being statistically significant at the 5
per cent level, the conclusion has a consistent meaning in whatever situation the test
took place. Occasionally a conclusion may be reported ‘at the 95 per cent confi-
dence level’ instead of ‘at the 5 per cent significance level’. The meaning is the same.
Example
One of the products of a dairy company is 500 g packs of butter. There is some concern
that the production machine may be manufacturing slightly overweight packs. A random
sample of 100 packs is weighed. The average weight is 500.4 g and the standard
deviation is 1.5 g. Is this consistent with the machine being correctly set and producing
packs with an overall average weight of 500 g?
Follow the steps of hypothesis testing.
(a) The hypothesis is that the true population average weight produced by the machine
is 500 g.
(b) The evidence is the sample of 100 packs with average weight 500.4 g and standard
deviation 1.5 g.
(c) Let the significance level be the conventional 5 per cent.
(d) Assuming the hypothesis is true, the sample mean has come from a sampling distribu-
tion of the mean as shown in Figure 8.5. The sampling distribution is normal. Even if
the distribution of the weights of individual packs is not normal, the central limit theo-
rem makes the sampling distribution normal. The mean is 500 (the hypothesis). The
standard deviation is equal to the standard deviation of the individual distribution di-
vided by the square root of the sample size. It is valid to calculate the standard
deviation from the sample because the sample size exceeds 30.

Standard deviation = 1.5/√100 = 0.15


The sample taken had a mean of 500.4. Recall the way in which z values are calculat-
ed for normal distributions. In this situation:

z = (500.4 − 500)/0.15 = 2.67
From the normal curve table given in Appendix 1 (Table A1.2) the associated area
under the normal curve is 0.4962 (see Figure 8.6). The probability of obtaining a
sample result as high as or higher than 500.4 is:
Probability = 0.5 − 0.4962
= 0.0038
= 0.38%

Figure 8.5 Sampling distribution for butter packs (normal, mean 500 g, standard deviation 0.15)

Figure 8.6 Butter packs: significance tests (the z value of 2.67 leaves an area of 0.0038 in the tail beyond the sample mean of 500.4)


(e) The probability of the observed evidence is therefore 0.38 per cent, much lower
than the significance level of 5 per cent. It is highly improbable that if the true aver-
age weight were 500 g, a sample with mean weight 500.4 g would be obtained. The
hypothesis is rejected. The machine does appear to be producing slightly overweight
packs. The result is said to be statistically significant at the 5 per cent level.
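
The whole test condenses into a short Python sketch (ours, not the course text’s;
the function name is invented) that reproduces the butter pack figures:

    import math
    from statistics import NormalDist

    def one_tailed_z_test(sample_mean, sample_sd, n, hypothesised_mean):
        # Probability of a sample mean this high or higher if the hypothesis is true.
        standard_error = sample_sd / math.sqrt(n)
        z = (sample_mean - hypothesised_mean) / standard_error
        return z, 1 - NormalDist().cdf(z)

    z, p = one_tailed_z_test(500.4, 1.5, 100, 500)
    print(z, p)  # z ≈ 2.67, p ≈ 0.0038: well below 5 per cent, so reject
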


8.6.1 Critical Values


The butter pack example was based on comparing two probabilities. The probability
of the sample evidence was compared with the significance level. There is an
alternative procedure. The significance level can be used to calculate a critical
value. This is a value of the sample mean lying exactly on the boundary separating
significant (probability less than 5 per cent) from non-significant (probability greater
than 5 per cent) results. The acceptance or rejection of the hypothesis can then be
based on comparing the observed sample mean with the critical value. The two
procedures are different ways of viewing the same thing. Here the critical value is
the sample mean that leaves 5 per cent (the significance level) in the ‘tail’ of the
distribution (see Figure 8.7). It therefore has a z value of 1.645 (the z value
corresponding to an area in the normal table equal to 0.45). Thus:
Critical value = 500 + (1.645 × 0.15)
= 500.247 g

Figure 8.7 Butter packs: critical value (z = 1.645 leaves 5 per cent in the tail and 45 per cent between the mean and the critical value of 500.247)


Since the sample result of 500.4 g was more extreme (further away from the
mean) than this, the hypothesis is rejected. This ‘critical value’ method of viewing
significance tests will be used in Module 11 on correlation and regression.
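
A sketch of the critical value calculation (our own illustration, with invented
names):

    import math
    from statistics import NormalDist

    def critical_value(hypothesised_mean, sample_sd, n, significance=0.05):
        # Sample-mean boundary for a one-tailed test: mean + z * standard error.
        standard_error = sample_sd / math.sqrt(n)
        z = NormalDist().inv_cdf(1 - significance)  # 1.645 for 5 per cent
        return hypothesised_mean + z * standard_error

    print(critical_value(500, 1.5, 100))  # ≈ 500.247; the observed 500.4 lies beyond it
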

8.6.2 One- and Two-Tailed Tests


The significance test in the butter example is a one-tailed test. Only the possibility
of the machine being ill-set in the sense of producing overweight packs was
considered. An inconsistent result could only be at one extreme (tail) of the distribu-
tion. However, the machine could also have been ill-set in the sense of producing
underweight as well as overweight packs. If this possibility were also considered, the
test would have to be two-tailed. An extreme result could arise in a sample mean
being too small or too large. There is a possibility of an extreme result in either tail
of the distribution.
The two-tailed significance test works as follows. The probability calculated in
step (d) should now be the probability of obtaining a sample result as far away from
the mean as 0.4 g in either direction, not just 500.4 g or higher. This new probability
is double the previous one, since the area in the tail (0.0038) occurs on both sides of
the mean. It is equal to 0.76 per cent. Figure 8.8 summarises the difference between
one- and two-tailed tests. In the two-tailed test this new probability, 0.76 per cent,
would be compared with the 5 per cent significance level.

Figure 8.8 (a) One-tailed test: P(sample result) = 0.38%; (b) two-tailed test: P(sample result) = 0.38% + 0.38% = 0.76%

Figure 8.9 Critical values for (a) a one-tailed test, with 5 per cent in one tail; (b) a two-tailed test, with 2.5 per cent in each tail
Using the critical value method of viewing significance tests, we see that there are
two critical values in the two-tailed test. Since the possibility of an extreme result on
either side of the mean is to be taken into account, there must be a critical value on
either side of the mean. However, as the significance level is still 5 per cent, each
critical value must leave half of 5 per cent (= 2.5 per cent) in its tail of the distribu-
tion. This is illustrated in Figure 8.9.
When the area in a tail is 2.5 per cent, the corresponding z value is approximately
2 (95 per cent of a distribution is within ±2 standard deviations of the mean). In the
butter pack example the critical values are therefore at:
500 ± 2 × 0.15
i.e. 499.7 and 500.3


The sample result is compared with the critical values. A result between the val-
ues leads to acceptance of the hypothesis; a result outside the critical values leads to
rejection of the hypothesis.
The decision between one- and two-tailed tests depends upon the hypothesis (i.e.
what you are setting out to test). In the butter example, the production manager of
the organisation would be concerned both if the packs were too heavy (he would be
giving butter away) and if they were too light (he would be infringing trading
standards laws). So he would use a two-tailed test. The null hypothesis would be
that the population mean was 500 g and the alternative hypothesis would be that the
population mean was not 500 g. On the other hand, an official from the trading
standards department of the municipal authority would only be interested if the
packs were too light and customers were being cheated. It would not be his concern
if the organisation was giving customers more than it needed to. So he would use a
one-tailed test. The null hypothesis would be that the population mean is 500 g (or
that it is 500 g or greater) and the alternative hypothesis would be that the
population mean is less than 500 g.
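
In code, the only change for a two-tailed test is to double the tail probability. A
minimal sketch (ours) using the butter pack figures:

    import math
    from statistics import NormalDist

    z = (500.4 - 500) / (1.5 / math.sqrt(100))  # 2.67, as before
    p_one_tailed = 1 - NormalDist().cdf(z)      # ≈ 0.0038
    p_two_tailed = 2 * p_one_tailed             # ≈ 0.0076
    print(p_one_tailed, p_two_tailed)
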

8.6.3 Errors in Significance Tests


An alternative way of interpreting a significance level is that it is the probability of
wrongly rejecting the hypothesis. Even when a hypothesis is true, a sample statistic
in a ‘reject’ tail of the distribution could still occur and the hypothesis be falsely
rejected. The probability of this is equal to the significance level. An error like this is
called a type 1 error.
The other type of mistake that could be made is to accept the hypothesis even
when it is false. This is a type 2 error. The probability of a type 2 error is harder to
find since it requires knowledge of the alternative hypothesis. If the original hypoth-
esis (the null hypothesis) is rejected, the alternative hypothesis is the one accepted in
its place. A type 2 error is the probability of wrongly accepting the null hypothesis
or (the same thing) wrongly rejecting the alternative hypothesis. This probability
cannot be calculated unless the alternative hypothesis is specified exactly. In the
butter packs example, the null hypothesis was that the true weight of the packs was
500 g. If the alternative hypothesis took an exact form such as ‘true weight = 490 g’,
then the probability of wrongly accepting it can be calculated. If it took an inexact
form, such as ‘true weight does not equal 500 g’, then the probability of wrongly
accepting it cannot be calculated.
In practice, there are sometimes circumstances when an exact alternative hypoth-
esis can be specified. For example, the motivation for conducting the test on the
packs of butter could have been that a large supermarket supplied by the organisa-
tion had returned a lorry load of packs whose weights were clustered around the
490 g level.
The probability of correctly accepting the alternative hypothesis is the power of
the significance test.


To summarise:
(a) If the null hypothesis is true (assuming significance level is 5 per cent):
P(correctly accepting null hypothesis) = 95%
P(erroneously rejecting null hypothesis) = 5% (type 1 error)
(b) If the alternative hypothesis is true:
P(correctly accepting alternative hypothesis) is the power of the test
P(erroneously rejecting alternative hypothesis) = 100% − power (type 2 error)
Example
The manufacturers of a new influenza drug for children claim it will reduce the child’s
temperature by 1 °C within 12 hours. The drug is tried on a random sample of 36
children. The average temperature reduction is 0.8 °C with a standard deviation of
1.4 °C. At the 5 per cent significance level, has the drug performed according to
specification? What are the probabilities of type 1 and type 2 errors? What is the power
of the test? It can be assumed that the distribution of individual temperatures is normal.
Follow the five stages of a significance test:
(a) The null hypothesis is that the drug makes no difference to the temperature; the
alternative hypothesis is that the temperature is reduced by 1 °C.
(b) The sample evidence is the reduction in temperature for 36 children, mean 0.8,
standard deviation 1.4.
(c) The significance level is the conventional 5 per cent.
(d) If we assume the truth of the hypothesis, the sample mean comes from a distribution
as shown in Figure 8.10, with mean 0 and standard deviation 0.23 (1.4/√36). The test
is one-tailed since only a reduction in temperature is considered. The critical value
for a 5 per cent test is, from previous work, at the point z = 1.645 from the mean.
The critical value is therefore at 0.38 (= 1.645 × 0.23).
(e) The sample result, mean = 0.8, is well outside the critical value. The null hypothesis
is rejected and the alternative accepted. The new drug does make a significant differ-
ence to temperature.
The probability of a type 1 error – the probability of wrongly rejecting the null hypothe-
sis – is 5 per cent, the significance level.
The probability of a type 2 error – the probability of wrongly rejecting the alternative
hypothesis (i.e. accepting the null hypothesis when it is false) – needs some calculation
(Figure 8.11). The critical value marks the accept/reject boundary for the null hypothe-
sis. It therefore marks the reject/accept boundary for the alternative. The probability of
wrongly rejecting the alternative hypothesis must then be the area in the tail of the
distribution based on the alternative, as marked by the critical value. This is the
highlighted area in Figure 8.11.


Figure 8.10 Sampling distribution of the mean temperature reduction (standard error = 1.4/√36 = 0.23; the critical value is at z = 1.645, i.e. 1.645 × 0.23 = 0.38)

Figure 8.11 Type 2 error (null distribution centred on 0, alternative on 1; the shaded area of the alternative distribution cut off by the critical value of 0.38 is the probability of a type 2 error)


For the alternative distribution the z value associated with the critical value is:
z = (0.38 − 1.0)/0.23 = −2.70
From the normal curve table given in Appendix 1 (Table A1.2) the area associated with
this z value is 0.4965. Thus:
P(type 2 error) = 0.35%
The power of the test is the probability of accepting the alternative hypothesis when it
is true. It is the reverse of the probability of a type 2 error.
Power = 100% − 0.35% = 99.65%
A well-designed significance test is generally one in which the two types of errors are
equally likely. Ceteris paribus (meaning ‘other things being equal’), there should be the
same chance of erring in favour of one hypothesis as the other. In the above example
the errors are not properly balanced. It is much easier to choose incorrectly the null (5
per cent) than the alternative (0.35 per cent). The balancing can be done either by
adjusting the significance level or by changing the sample size. Reducing the significance
level immediately reduces the probability of a type 1 error. At the same time, it moves
the critical value closer to the centre of the alternative distribution and increases the
probability of a type 2 error. Reducing the sample size does not affect the type 1 error
but again moves the critical value closer to the alternative distribution and increases the
probability of a type 2 error. In the above example, if the sample size were 25 instead of
36:
Standard error of sample means = 1.4/√25 = 0.28
Critical value = 0 + 1.645 × 0.28 = 0.46
z value for alternative distribution = (0.46 − 1.0)/0.28 = −1.93
P(type 2 error) = 2.68%
The probabilities of error are now closer to a balance with a smaller, and presumably
cheaper, sample size. Likewise, the sample size could have been maintained at 36 and
the significance level reduced from 5 per cent to obtain a balance.
Note that, in many cases, an alternative hypothesis is not known and it is not possible to
measure the probability of a type 2 error and the power of the test.
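
Where an exact alternative is available, the type 2 error and power can be computed
directly. A sketch (ours, with invented names) for the influenza drug figures:

    import math
    from statistics import NormalDist

    def type2_error_and_power(null_mean, alt_mean, sample_sd, n, significance=0.05):
        standard_error = sample_sd / math.sqrt(n)
        critical = null_mean + NormalDist().inv_cdf(1 - significance) * standard_error
        # Probability, under the alternative, of landing on the 'accept null' side.
        beta = NormalDist(alt_mean, standard_error).cdf(critical)
        return beta, 1 - beta

    beta, power = type2_error_and_power(0, 1, 1.4, 36)
    # roughly 0.004 and 0.996; the text's 0.35 per cent comes from rounding
    # the standard error and critical value to 0.23 and 0.38.
    print(beta, power)
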

8.6.4 Significance Tests and Confidence Levels


A significance test is directly comparable to the estimation of confidence levels. For
a two-tailed test, the rejection of a hypothesis at the 5 per cent level from sample data
is the same as the hypothesised mean (or other statistic on which the hypothesis is
based) not falling within the 95 per cent confidence interval of the population mean (or
other statistic) estimated from the same sample. And vice versa: the acceptance of a
hypothesis at the 5 per cent level from sample data is the same as the hypothesised
mean falling within the 95 per cent confidence interval of the population mean estimated
from the same sample.
For a one-tailed test, the rejection of a hypothesis at the 5 per cent level from sam-
ple data is the same as the hypothesised mean (or other statistic on which the
hypothesis is based) not falling within the 90 per cent confidence interval of the population
mean (or other statistic) estimated from the same sample. And vice versa: the
acceptance of a hypothesis at the 5 per cent level from sample data is the same as the
hypothesised mean falling within the 90 per cent confidence interval of the population mean
estimated from the same sample.
In the butter packs example (Section 8.6), a one-tailed significance test of the
hypothesis that the mean weight was 500 g concluded that the hypothesis should be
rejected. If a 90 per cent confidence interval for the population mean were con-
structed from the same sample, then 500 g would fall outside it, equivalent to the
rejection of the hypothesis. See Figure 8.12.


Figure 8.12 Comparing significance tests and confidence levels (top: the significance test, with hypothesised mean 500, critical value 500.247 at z = 1.645 and sample mean 500.4; bottom: the 90 per cent confidence interval estimated from the same sample starts at 500.153, so the hypothesised 500 falls outside it)
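
The equivalence is easy to verify numerically; a sketch (ours) for the butter packs:

    import math
    from statistics import NormalDist

    standard_error = 1.5 / math.sqrt(100)
    z90 = NormalDist().inv_cdf(0.95)  # 1.645
    ci = (500.4 - z90 * standard_error, 500.4 + z90 * standard_error)
    print(ci, 500 < ci[0])  # ≈ (500.153, 500.647), True: 500 g falls outside, so reject
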

8.7 More Significance Tests


The significance tests met so far have compared a single sample with a population
hypothesis. Other types of significance test compare one sample with another.

8.7.1 Difference in Means of Two Samples


In the medical example about the healing of pulled muscles, evidence from a single
sample was compared with a population, the set of all past records. This may be
neither possible nor desirable. In practice past records may not be available. Even if
they are available, this approach may not be valid, since the very fact that people in
the sample know that they are receiving special attention may affect their recovery.
It may be better to take evidence from two samples and then compare the differ-
ence in results. Both samples are told about the experiment and asked if they will
take part. One sample is given the new treatment and the other is not, but none of
the patients knows to which group they belong. The difference in the average
recovery rates of the two samples measures the improvement brought about by the
new treatment. The sample that is not given the new treatment is known as the
control group.
Previous tests have been based on the distribution of sample means. The signifi-
cance test for two samples is based on a new distribution: that of the difference in
sample means. Suppose two samples are taken from a population, their means and
the differences in means recorded. Gradually a distribution is built up. This is the
distribution of the difference in sample means.
The mean of this distribution is 0 since, overall, the difference in means of sam-
ples taken from the same population must average to 0. To calculate the standard
deviation requires the use of the variance sum theorem. It states that, if x and y
are two variables, then the variance of their sum or difference is:
Variance (x + y) = Variance (x) + Variance (y)
Variance (x − y) = Variance (x) + Variance (y)
The proof of the theorem is mathematical. In this case the two variables (x and y
in the formula) are the means of the two samples. Each sample mean has a variance
given by:
Variance (x̄) = Individual variance/Sample size = V/n
where V = variance of the population from which the samples were drawn.
This expression has previously been met in square root (standard deviation) form
in the context of the sampling distribution of the mean. From the variance sum
theorem:
Variance (x̄₁ − x̄₂) = V/n₁ + V/n₂ (8.1)
Usually the sample sizes are chosen to be equal and therefore:
Variance (x̄₁ − x̄₂) = 2V/n (8.2)
If s is the individual standard deviation (s = √V) then:
Standard deviation (x̄₁ − x̄₂) = √2 · s/√n
This distribution of the difference in sample means, which has mean 0 and standard
deviation √2 · s/√n, is the basis of the test of significance between two samples.
The same five stages of the basic significance test are applicable:
(a) Set a hypothesis. The hypothesis is that the observed difference in sample means
comes from a distribution with mean 0. If it is supposed that there is no differ-
ence between the two samples, the observed difference in the sample means will
have arisen purely by chance.
(b) Collect sample evidence. This will consist of the two samples and their means.
(c) Set the significance level. The conventional 5 per cent is usually chosen. It is
interpreted as before. An observed difference in sample means is supposed to
have arisen purely by chance if its probability exceeds the significance level. The
hypothesis will then be accepted. The difference is supposed to be too unlikely
to have arisen purely by chance if its probability is less than the significance level.
It will then be concluded that the hypothesis is wrong.
(d) Calculate the probability of the sample evidence assuming the truth of the
hypothesis. Normal curve theory is used, for instance, to calculate the probability
that the observed difference in sample means has come from a distribution with
mean 0 and standard deviation √2 · s/√n (as calculated above).
(e) The probability found in (d) is compared with the significance level. If it is
higher, the hypothesis is accepted; if lower, the hypothesis is rejected. The critical
value approach to significance tests could equally well be used.
Example
A supermarket chain wants to find which of two promotions is the more effective.
Promotion A is tried out at a sample of 36 stores over a one-week period; promotion B
is tried out at a sample of 36 similarly sized stores over the same period. For both
samples the increase in sales turnover over the previous week is measured. The average
increase for sample A is £12 000 and for sample B it is £53 000. The standard deviation
of weekly sales turnover at all the stores belonging to the chain is £120 000. Is there a
significant difference between the effectiveness of the two promotions?
Follow the five stages of the two sample significance tests:
(a) The hypothesis is that there is no difference between the two promotions.
(b) The evidence is sample A with mean £12 000 and sample B with mean £53 000.
(c) The significance level is selected to be the usual 5 per cent.
(d) If the hypothesis is true and, given the standard deviation of all weekly sales turno-
vers to be £120 000, the difference in the sample means comes from a distribution of
mean 0 and standard deviation £28 284 (= √2 · 120 000/√36), as shown in Fig-
ure 8.13. The distribution is normal because of the central limit theorem.
The observed difference in sample means is £41 000 (= 53 000 − 12 000). The value
z is (41 000 − 0)/28 284 = 1.45.
From the normal curve table in Appendix 1 (Table A1.2) the area from the mean to
this point is 0.4265. The area in the tail is therefore 0.0735. Since it was possible for
either promotion to prove better than the other, the test is a two-tailed test. The
probability of the sample evidence is thus 2 × 0.0735 = 0.1470 (i.e. 14.7 per cent).
(e) The probability of the sample evidence, 14.7 per cent, exceeds the significance level.
The hypothesis is accepted. There is no evidence that one promotion is significantly
better than the other.


Figure 8.13 Supermarket promotions (distribution of the difference in sample means: mean 0, standard error £28 284; the observed difference of £41 000 lies at z = 1.45)


Care must be taken in interpreting the conclusion. The key word is ‘significantly’. The
evidence is that promotion B is better than promotion A. However, the evidence is not
sufficiently strong to be significant. There is a clear possibility (14.7 per cent) that the
difference in sales turnover has come about purely by chance.
In one respect at least the above example is unrealistic. The standard deviation of the
underlying individual distribution of turnovers is assumed to be known. It is more likely
that it would not be known and would have to be estimated from the samples. When
this occurs, the standard deviations calculated from each sample are pooled to get an
overall estimate. The pooled estimate of the standard error is calculated as
follows. The standard deviations calculated from each sample are:
s₁ = √(∑(x − x̄₁)²/(n₁ − 1))
s₂ = √(∑(x − x̄₂)²/(n₂ − 1))
These estimates are pooled to give:

s = √(((n₁ − 1)s₁² + (n₂ − 1)s₂²)/(n₁ + n₂ − 2)) (8.3)
The reason for the ‘2’ in the denominator is technical and will be explained later when
the concept of degrees of freedom is introduced. This estimate is then used in
Equation 8.1 and Equation 8.2 as if it were the real standard deviation.
Example
In the supermarket promotion example above, suppose the standard deviation of the
weekly turnovers at all stores is not known. From sample A, s is calculated to be
£95 200 and from sample B, s is calculated to be £140 500. What is the pooled estimate
of the standard deviation?
Using the formula in Equation 8.3 and working in thousands:
s = √(((36 − 1) × 95.2² + (36 − 1) × 140.5²)/(36 + 36 − 2))
= √((35 × 9063 + 35 × 19 740)/70)
= √14 402
≈ 120
The estimated standard deviation is thus £120 000 (approximately). It is used in the
significance test just as in the above example, when £120 000 was assumed to be the
true population standard deviation.
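
Both the pooling and the two-sample test itself are short calculations. A Python
sketch (ours, with invented names) reproducing the supermarket figures, working
in £000s:

    import math
    from statistics import NormalDist

    def pooled_sd(s1, n1, s2, n2):
        # Pooled estimate of the common standard deviation (Equation 8.3).
        pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
        return math.sqrt(pooled_var)

    def two_sample_z_test(mean1, mean2, sd, n):
        # Two-tailed test of no difference between two equal-sized samples.
        se_diff = math.sqrt(2) * sd / math.sqrt(n)
        z = (mean1 - mean2) / se_diff
        return z, 2 * (1 - NormalDist().cdf(abs(z)))

    sd = pooled_sd(95.2, 36, 140.5, 36)       # ≈ 120
    z, p = two_sample_z_test(53, 12, sd, 36)  # z ≈ 1.45, p ≈ 0.147
    print(sd, z, p)
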


8.7.2 Difference between Paired Samples


The above was a test between two samples for which the observations were not
paired. An observation in one sample did not have a partner in the other. On
occasions, paired samples have to be compared. For example, another approach to
the supermarket promotion problem might have been to take a sample of, say, 36
stores and measure their sales turnover in a week without the promotion and then in
a corresponding week with the promotion. There are now two samples (one relating
to each week) but they are paired. Any observation in the first sample has a partner
in the second in that both observations relate to the same store. The significance
test in this case is to investigate the size of the difference between turnover with and
without the promotion.
This test is handled by forming a single new sample, which is the differences
between the observations taken in pairs. The new sample is treated just as for the
basic single sample significance test with the hypothesis that the promotion has
made no difference and that the real mean is 0.
Example
The supermarket chain is still investigating the effect of promotional schemes. Promo-
tion A is to be given a final chance. The sales turnover at a sample of 50 similarly sized
stores is measured for a one-week period. The promotion is introduced during another
week and the turnover is again measured. The weeks have been chosen so that there
are likely to be few different factors affecting turnover apart from the promotion. The
results are summarised in Table 8.1.

Table 8.1 Supermarket promotion results


Store   Turnover (week 1)   Turnover (week 2)   Difference (week 2 − week 1)
        (£000)              (£000)              (£000)
1       325                 376                 51
2       418                 408                 −10
3       369                 398                 29
4       311                 323                 12
…       …                   …                   …
49      286                 284                 −2
50      354                 391                 37

For the differences: Mean = 26, Standard deviation = 68

The significance test proceeds through the five stages:


(a) The hypothesis is that the promotion makes no difference to sales turnover. The
sample formed from the differences is hypothesised to come from a distribution of
mean 0.
(b) The evidence is the newly formed sample of differences.
(c) Set the significance level at the conventional 5 per cent.
(d) Calculate the probability of the sample evidence. Assuming the truth of the hypothe-
sis, the sample has come from a normal distribution (since the sample size exceeds
30) with mean 0 and standard deviation 9.6 (= 68/√50). The z value of the ob-
served sample mean is 2.71 (= 26/9.6). From the normal curve table in Appendix 1
(Table A1.2), the associated area is 0.4966. The test is one-tailed since, presumably,
only the possibility of the promotion improving sales is considered. The probability
of the sample evidence is therefore 0.34 per cent.
(e) A probability of 0.34 per cent is well below the significance level. The hypothesis is
rejected. The promotion makes a significant difference to sales turnover.
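
A paired test reduces to the basic single-sample test on the differences. A sketch
(ours, with an invented function name) using the Table 8.1 summary figures:

    import math
    from statistics import NormalDist

    def paired_z_test(mean_diff, sd_diff, n):
        # One-tailed test that the mean of the paired differences is zero.
        standard_error = sd_diff / math.sqrt(n)
        z = mean_diff / standard_error
        return z, 1 - NormalDist().cdf(z)

    z, p = paired_z_test(26, 68, 50)
    print(z, p)  # z ≈ 2.71, p ≈ 0.0034: significant at the 5 per cent level
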

8.7.3 Tests on Proportions


Section 7.6.7 showed how the binomial distribution can be approximated to the
normal provided the sample size is reasonably large. In particular, the normal
distribution can be applied to data in the form of proportions and used to conduct
significance tests. For example, we can carry out significance tests on the proportion
of defective parts in a sample. As explained in Section 7.6.7, the parameters of the
distribution (when the variable is the proportion, p) are:
Arithmetic mean = p
Standard deviation = √(p(1 − p)/n)

Example
A political party claims that it has the support of the majority of the electorate. An
opinion poll that questioned 1600 potential voters showed that 48 per cent of them
supported the party. Does this evidence refute the claim?
A significance test will answer the question. The usual five stages are followed.
(a) The null hypothesis is that 50 per cent of the electorate support the party (i.e. the
proportion is 0.5). A one-tailed test is required since, in respect of the task of trying
to refute the claim, we are only interested in whether support is significantly lower
than the 50 per cent mark, not significantly higher. The alternative hypothesis is that
support is less than 50 per cent.
(b) The sample data are the opinions of the 1600 people interviewed.
(c) The conventional 5 per cent significance level is chosen.
(d) Assuming the truth of the hypothesis, the sample mean (the 48 per cent, or a
proportion of 0.48, from the opinion poll) comes from a normal distribution (the
normal can be used since the sample size is large), as shown in Figure 8.14.


Figure 8.14 Political party significance test (normal distribution with mean 0.5 and standard deviation 0.0125; the critical value of 0.4795 lies at z = 1.645, leaving 5 per cent in the lower tail)


The parameters are:
Mean = 0.5
Standard deviation = √(p(1 − p)/n)
= √((0.5 × 0.5)/1600)
= √(0.25/1600)
= 0.5/40
= 0.0125
The critical value for a one-tailed test at the 5 per cent level is, from previous exam-
ples, 1.645 from the mean. The critical value is therefore 0.4795 (= 0.5 − (1.645 ×
0.0125)).
(e) The sample result, a proportion of 0.48, is just inside the critical value. The null
hypothesis is accepted. The result is close but it is not possible to refute the party’s
claim that it has the support of the majority of the people.
The normal distribution of data in the form of proportions can be used for all types of
significance tests, including those for two independent samples and two paired samples,
and for estimation, such as point estimates and confidence intervals.
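
A sketch (ours, with invented names) of the opinion poll test using the critical
value method:

    import math
    from statistics import NormalDist

    def proportion_test(sample_p, hypothesised_p, n, significance=0.05):
        # One-tailed test of whether a sample proportion is significantly
        # below a hypothesised population proportion.
        standard_error = math.sqrt(hypothesised_p * (1 - hypothesised_p) / n)
        critical = hypothesised_p - NormalDist().inv_cdf(1 - significance) * standard_error
        return critical, sample_p < critical

    # critical ≈ 0.4795; 0.48 is just inside it, so the claim cannot be refuted
    print(proportion_test(0.48, 0.5, 1600))
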

8.8 Reservations about the Use of Significance Tests


The ideas underlying significance tests make them important and powerful aids in
approaching management problems. Like all techniques they should be used with
full awareness of the reservations and limitations to their use. These qualifications
are to do with the tests themselves as well as with wider issues that may be missed if
the tests are used without thought.
(a) A test is only as good as the logic surrounding it. If the circumstances in which
the test is to work have not been carefully thought out, it is unlikely to help in
decision making or the provision of information. Indeed, it may mislead. For
example, in the case of the supermarket promotions, there are many other fac-
tors operating to affect sales turnover besides the promotions. The weeks chosen
(close to Christmas or holidays), advertising campaigns and weather would all
influence sales turnover. In addition, the size of stores would govern the magni-
tude of the increase that could be expected. The effect of all these factors must
be minimised when it is the effect of the promotion alone that is to be investi-
gated. Both tests of promotional effectiveness used (paired and unpaired
samples) have their own advantages in this respect. The unpaired sample is based
on just one week and thus eliminates the effect of data being collected in differ-
ent weeks; the paired sample is based on one set of stores and thus eliminates
some of the possibly systematic effects of data being collected at different stores
in different geographical areas.
(b) For many decisions it is simply not possible to collect sample evidence of the
sort that can be used in a significance test. For instance, it is difficult to carry out
trials with new medical drugs on humans. In other cases, it is not possible to
select random samples of sufficient size to satisfy assumptions and apply a test.
(c) Significance tests make no attempt to balance costs. The cost of sampling
compared with the value of the evidence provided has to be handled outside the
statistical method. It may be that the cost of sampling is greater than the benefit
accruing from this information.
(d) A significance test is a black-and-white, all-or-nothing process. If it is followed
rigidly, a probability of sample evidence of 4.9 per cent leads to the rejection of
the hypothesis; a probability of 5.1 per cent leads to acceptance. Clearly the real
difference between the two cases is marginal, yet in terms of the conclusion the
difference is total. When the probability is close to the accept/reject border, it is
preferable to quote the conclusion as ‘The sample evidence leads to the ac-
ceptance of the hypothesis at the 5 per cent level but the probability, at 7.3 per
cent, was close to the significance level.’
(e) The significance tests are based on assumptions, such as a normal distribution, a
sample size exceeding 30, a known population standard deviation and so on. The
assumptions may not be met in practice and the test may therefore be inapplica-
ble. However, some significance tests can still be used even when the
assumptions do not hold exactly. They are said to be robust.
(f) Generally, a good significance test balances type 1 and type 2 errors. The
incorrect rejection and the incorrect acceptance of the hypothesis should have
small and about equal probabilities. Often there is no alternative hypothesis and
the probability of a type 2 error cannot be calculated. The test may therefore be
unbalanced in the sense of imposing accept/reject standards that are too harsh
or too loose for the circumstances.
Situations do arise, however, when the equality of the probabilities of type 1 and
type 2 errors is not desirable. When the cost of a type 1 error is very different
from that of a type 2 error, it may be better to weight the probabilities according
to the costs.


Learning Summary
Statistical inference belongs to the realms of traditional statistical theory. Its
relevance lies in its applicability to specialised management tasks, such as quality
control and market research. Most managers would find that it can only occasionally
be applied directly to general management problems. Its major value is that it
encompasses ideas and concepts that enable problems to be viewed in broader and
more structured ways.
Two areas have been discussed, estimation and significance testing. New theory –
confidence levels, the sampling distribution of the mean, the central limit theorem
and the variance sum theorem – has been introduced.
The conceptual contribution that estimation makes is to concentrate attention on
the range of a business forecast rather than merely the point estimate. To take a
previous market research example, the estimate that 61 per cent of male toiletries
are purchased by females sounds fine. But what is the accuracy of the estimate? The
61 per cent is no more than the most likely value. How different could the true
value be from 61 per cent? If it can be said with near certainty (95 per cent confi-
dence) that the percentage is between 58 per cent and 64 per cent, then the estimate
is a good one on which decisions may be reliably based. If the range is 8 per cent to
88 per cent, then there must be doubts about its usefulness for decision making.
Surprisingly, the confidence limits of business forecasts are often reported with little
emphasis, or not reported at all.
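As a minimal numerical sketch of such limits, the short Python fragment below computes 95 per cent confidence limits for a proportion using the normal approximation. The sample size of 1000 is an assumption made purely for illustration (the module does not state the size of the market research sample); with that assumption the limits come out at roughly 58 to 64 per cent.

import math

# 95% confidence limits for a proportion via the normal approximation.
# The sample size n = 1000 is an assumed figure, not from the module.
p_hat = 0.61                                  # point estimate of the proportion
n = 1000                                      # assumed sample size
se = math.sqrt(p_hat * (1 - p_hat) / n)       # standard error of a proportion
lower = p_hat - 1.96 * se
upper = p_hat + 1.96 * se
print(f"95% limits: {lower:.2f} to {upper:.2f}")   # about 0.58 to 0.64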
The second area considered was significance testing. It is concerned with distin-
guishing real from apparent differences. The discrepancy between a sample mean
and what is thought to be the mean of the whole population is judged in the context
of inherent variation. An apparent difference is one that is likely to have arisen
purely by chance because of the inherent variation; a real difference is one that is
unlikely to have arisen purely by chance and some other explanation (i.e. that the
hypothesis is untrue) is supposed. A significance level draws a dividing line between
the two. The dividing line marks an abrupt border. In practice, extra care is exer-
cised over samples falling in the grey areas immediately on either side of the border.
A number of significance tests have been introduced and it can be difficult to
know which one to use. To illustrate the different circumstances in which each is
appropriate, a medical example will be used in which a new treatment for reducing
cholesterol levels is being tried out. Country-wide records are available showing that
the existing treatment on average reduces cholesterol levels by five units.
The three types of test described in the module are:
(a) Single sample. This is the basic significance test described in Section 8.6.
Evidence from one sample is used to test a hypothesis relating to the population
from which it has come. For example, to show that the new cholesterol treat-
ment was more effective than the existing treatment, the hypothesis would be
that the new treatment was no more effective than the old (i.e. it reduced choles-
terol levels by five units on average). A representative sample of patients would
be given the new treatment and the average reduction in cholesterol measured.
This would be compared with the hypothesised population figure of five units.

(b) Two independent samples (Section 8.7.1). Two independently drawn samples
are compared, usually with the hypothesis that there is no difference between
them. For example, in trying out the new cholesterol treatment there might be
some doubt about the accuracy of the country-wide data on which the hypothe-
sis was based. One way to get round the problem would be to use two samples.
The first would be a sample of patients to whom the new treatment had been
given and the second a ‘control’ sample of patients to whom the old treatment
was given. As before, the hypothesis would be that the new treatment was no
better than the old. The average reduction measured for the first sample would
be compared to that for the second to test whether the evidence supported this.
(c) Paired samples (Section 8.7.2). Two samples are compared but they are not
drawn independently. Each observation in one sample has a ‘partner’ in the oth-
er. For example, instead of testing the ultimate effect of the new treatment in
reducing cholesterol levels, it might be helpful to know whether it worked quick-
ly, taking effect within, say, three days. To do this the new treatment would be
given to a single sample of patients. Their cholesterol levels would be measured
at the outset and again three days later. There would then be two samples, the
first of cholesterol levels at the outset and the second of levels three days later.
However, each observation in one sample would be paired with one in the other
– paired because the two observations would relate to the same patient. The
hypothesis would be that the treatment had made no difference to cholesterol
levels after three days. As described in Section 8.7.2, the significance test would
be carried out by forming a new sample from the difference in cholesterol levels
for each patient and testing whether the average for the new sample could have
come from a population of mean zero.
If two independent samples had been used (i.e. the two samples contained differ-
ent patients, as for the independent samples above), and the cholesterol levels had
been measured for one sample at the outset and for the second three days later, any
difference in cholesterol levels might be accounted for by the characteristics of the
patients rather than the treatment.
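As a minimal sketch of how the paired and independent tests differ in practice, the Python fragment below applies both to the same data, using the scipy library's t-test routines (small-sample t-tests are covered in the next module; they stand in here for the z-based tests described above). The cholesterol figures are invented solely for illustration.

import numpy as np
from scipy import stats

# Invented cholesterol levels for ten patients, at the outset and three
# days later; each 'before' is paired with the same patient's 'after'.
before = np.array([212, 198, 230, 205, 190, 221, 199, 215, 208, 225])
after = np.array([205, 196, 219, 200, 188, 210, 197, 209, 203, 214])

# Paired test: equivalent to testing whether the mean of the per-patient
# differences could have come from a population of mean zero.
t_paired, p_paired = stats.ttest_rel(before, after)

# Independent test: appropriate only if the two sets of measurements came
# from different, independently drawn groups of patients.
t_indep, p_indep = stats.ttest_ind(before, after)

print(f"paired:      t = {t_paired:.2f}, p = {p_paired:.4f}")
print(f"independent: t = {t_indep:.2f}, p = {p_indep:.4f}")

With data like these, the paired test detects the difference because the patient-to-patient variation is removed, while the independent test may not; that is exactly the point made above.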
In deciding how to conduct a significance test, there are three other factors to
consider. First, the test can be conducted with probabilities or critical values. This is
purely a matter of preference for the tester – both would produce the same result
(see Section 8.6.1). Second, the test can be one-tailed or two-tailed. This decision is
not a matter of preference; it depends upon the purpose of the test and what
outcome is wanted (Section 8.6.2). Third, the test could use data in the form of
proportions. This depends on the nature of the data, whether proportional or not
(Section 8.6.3).
Both estimation and significance testing can improve the way a manager thinks
about particular types of numerical problems. Moreover, they help to show the
manager what to look for in a management report: does an estimate or forecast also
include a measure of accuracy? In making comparisons, are the differences real or
apparent? From the point of view of day-to-day management, this is where their
importance lies.

Review Questions
8.1 Which is correct? Statistical inference is a set of methods for:
A. collecting samples.
B. estimating sample characteristics from parameters.
C. using sample information to make statements about populations.
D. designing statistical experiments.

8.2 Which is correct? Statistical inferences have confidence levels associated with them
because:
A. statistical data cannot be measured accurately.
B. alternative hypotheses may not be known.
C. they are made from samples that may be unrepresentative.
D. circumstances that apply when the sample was selected may change.
Questions 8.3 to 8.5 refer to the following random sample of nine observa-
tions collected from a normally distributed population:
7, 4, 9, 2, 8, 6, 8, 1, 9

8.3 What is the standard deviation of the sampling distribution of the mean of samples of
size 9?
A. 1
B. 3
C. 0.94
D. 1/3

8.4 The point estimate of the population mean is:


A. 3
B. 6
C. 2
D. 6 ± 2

8.5 What are the 90 per cent confidence limits for the population mean (approximately)?
A. 4 to 8
B. 4.4 to 7.6
C. 1.1 to 10.9
D. 0 to 12

8.6 The mean of a normally distributed population is to be estimated to within ±20 at the
95 per cent confidence level. The standard deviation is 150. What should the sample
size be?
A. 225
B. 250
C. 15
D. 56

8.7 A significance test is a method for deciding whether sample evidence is sufficiently
strong to prove the truth of a hypothesis. True or false?

8.8 Which is correct? A significance level is usually set at 5 per cent because:
A. through time 5 per cent has come to be accepted as the norm.
B. 5 per cent is the definition of significance.
C. 5 per cent marks the boundary between an acceptable risk and an unacceptable
risk.
D. 5 per cent significance implies 95 per cent confidence.

8.9 Critical values can be used in both two-tailed significance tests and one-tailed
significance tests. True or false?
Questions 8.10 to 8.12 refer to the following situation:
In a significance test a random sample of size 64 with mean 9.87 and standard devia-
tion 48 is taken from a normal population. The hypothesis is that the population mean is
0.

8.10 What conclusion should be drawn?


A. Hypothesis accepted at 5 per cent level.
B. Hypothesis rejected at 5 per cent level.
C. Test inconclusive at 5 per cent level.
D. Hypothesis accepted at 10 per cent level.
E. Hypothesis rejected at 10 per cent level.

8.11 If the alternative hypothesis is that the population mean is 20, what is the probability of
a type 2 error under a significance test at the 5 per cent level? (Note that with this
alternative hypothesis the test must be one-tailed.)
A. 5%
B. 18.36%
C. 4.55%
D. 9.18%

8.12 If the alternative hypothesis is that the population mean is 20, what is the power of the
5 per cent significance test?
A. 95%
B. 95.55%
C. 90.82%
D. 95.45%

Questions 8.13 to 8.15 refer to the following situation:


Three months ago a take-away food chain with several hundred branches throughout
the country had told its outlet managers that sales turnover had to increase if the chain
was to survive in its present form. According to the managing director, sales turnover
per branch had to increase by no less than £2500 per month. A first attempt at testing
the effectiveness of his plea was made by selecting 100 branches at random and
measuring their turnover for the three months before and the three months after the
plea. The turnover per branch in the first three months was £169 500 with standard
deviation £41 000; in the second three months it was £175 000 with standard deviation
£45 500.

8.13 What should be the null hypothesis in a significance test to determine whether the plea
met its objective? The mean increase between the two three-month periods is:
A. 0
B. £2500
C. £7500
D. £250 000

8.14 What type of significance test should be applied?


A. Unpaired samples, one-tail.
B. Paired samples, one-tail.
C. Unpaired samples, two-tail.
D. Paired samples, two-tail.

8.15 A significance test at the 5 per cent level would proceed by calculating a pooled
estimate of the standard error, determining a critical value at 1.645 standard errors
below £7500 and comparing the observed difference in means (£5500 = £175 000 −
£169 500) with the critical value. True or false?

Case Study 8.1: Food Store


1 A high-class food store is investigating its overdue accounts. A sample of 250 overdue
accounts has an average amount outstanding of £186 and a standard deviation of £95.
The smallest amount overdue in the sample is £15 and the largest £638. What are the
95 per cent confidence limits for the average amount outstanding for all overdue
accounts?

Case Study 8.2: Management Association


1 A national management association advises organisations on how to increase the
numeracy of their employees. In particular, it runs short courses designed to make
participants more numerate. At the end of a course a standard test is given. From its
records, the association knows that the scores obtained are normally distributed with
mean 242 and standard deviation 52. A new computer-based course has just been
designed. A trial is conducted with 16 participants. Their average score on the test given
at the end of the course is 261. Is this consistent with the hypothesis that the computer
course is as efficient as the old course in improving numeracy?

Case Study 8.3: Textile Company


1 A textile company makes a variety of linen and other cloths. The yarns used to make
linen are purchased from suppliers. One particular yarn should, according to the
contract, have a tensile strength of 12 kg on average. However, the highly experienced
foreman believes that recent batches are of inferior quality. A sample of 50 specimens
was selected at random and found to have a mean tensile strength of 11.61 kg and a
standard deviation of 1.48 kg. Use a significance test to indicate whether the supplier is
sending yarn of an acceptable mean tensile strength. Have you any reservations about
your conclusions?

Case Study 8.4: Titan Insurance Company


1 The Titan Insurance Company has just installed a new incentive payment scheme for its
life policy sales force. It wants to have an early view of the success or failure of the new
scheme. Indications are that the sales force is selling more policies, but sales always vary
in an unpredictable pattern from month to month and it is not clear that the scheme has
made a significant difference.
Life insurance companies typically measure the monthly output of a salesperson as the
total sum assured for the policies sold by that person during the month. For example,
suppose salesperson X has, in the month, sold seven policies for which the sums
assured are £1000, £2500, £3000, £5000, £5000, £10 000, £35 000. X’s output for the
month is the total of these sums assured, £61 500. Titan’s new scheme stipulates that
the sales force receive low regular salaries but are paid large bonuses related to their
output (i.e. to the total sum assured of policies sold by them). The scheme is expensive
for the company, but they are looking for sales increases that more than compensate.
The agreement with the sales force is that if the scheme does not at least break even
for the company, it will be abandoned after six months.
The scheme has now been in operation for four months. It has settled down after
fluctuations in the first two months due to the changeover. To test the effectiveness of
the scheme, Titan have taken a random sample of 30 salespeople, measured their output
in the penultimate month prior to changeover and then measured it in the fourth month
after the changeover (they have deliberately chosen months not too close to the
changeover). The outputs of the salespeople are shown in Table 8.2.

Table 8.2 Titan Insurance output
                          Output (£000)
Salesperson    Old scheme    New scheme
1 57 62
2 103 122
3 59 54
4 75 82
5 84 84
6 73 86
7 35 32
8 110 104
9 44 38
10 82 107
11 67 84
12 64 85
13 78 99
14 53 39
15 41 34
16 39 58
17 80 73
18 87 53
19 73 66
20 65 78
21 28 41
22 62 71
23 49 38
24 84 95
25 63 81
26 77 58
27 67 75
28 101 94
29 91 100
30 50 68

a. Describe the 5 per cent significance test you would apply to these data to determine
whether the new scheme has significantly raised outputs.
b. What conclusion does the test lead to?
c. What reservations have you about this result?
d. Suppose it has been calculated that, in order for Titan to break even, the average
output must increase by £5000. If this figure is the alternative hypothesis, what is:
i. the probability of a type 1 error?
ii. the probability of a type 2 error?
iii. the power of the test?
e. What sample size would make the probabilities of type 1 and type 2 errors equal?



Module 9

More Distributions
Contents
9.1 Introduction.............................................................................................9/1
9.2 The Poisson Distribution .......................................................................9/2
9.3 Degrees of Freedom ...............................................................................9/7
9.4 t-Distribution ...........................................................................................9/8
9.5 Chi-squared Distribution .................................................................... 9/14
9.6 F-Distribution ....................................................................................... 9/19
9.7 Other Distributions ............................................................................. 9/22
Learning Summary ......................................................................................... 9/23
Review Questions ........................................................................................... 9/25
Case Study 9.1: Aircraft Accidents ............................................................... 9/28
Case Study 9.2: Police Vehicles..................................................................... 9/28

Prerequisite reading: Module 7, Module 8

Learning Objectives
By the end of this module the reader should be more aware of the very wide range
of standard distributions that are available as well as their applications in statistical
inference. Two standard distributions, the binomial and normal, and statistical
inference were the subjects of the previous two modules. Those fundamental
concepts are amplified and extended in this module. More standard distributions
relating to a variety of theoretical situations and their use in estimation and signifi-
cance tests are described.
The module covers some advanced material and may be omitted the first time
through the course.

9.1 Introduction
Statistical inference is the collection of methods by which sample data can be turned
into more general information about a population. There are two main types of
inference, estimation and significance testing. The former is concerned with
predicting confidence intervals for parameters, the latter with judging whether
sample evidence is consistent with a hypothesis.
The ideas of statistical inference are particularly useful when used in conjunction
with standard distributions (sometimes called theoretical or probability distribu-
tions). Recall how a standard distribution arises. Each standard distribution has been
constructed theoretically from a situation in which data are generated. The situation
has characteristics that can be expressed mathematically. This enables the probabili-
ties of the variable taking different values to be calculated by the a priori method of
measurement. These probabilities form the standard distribution (cf. observed
distributions, which are based on data collection and for which the probabilities are
calculated by the frequency method of probability measurement).
These concepts have been discussed previously, in the context of the binomial
and normal distributions. The purpose now is to extend the discussion to some
more standard distributions. Their application to statistical inference in some
practical situations will be described. The first of these standard distributions is the
Poisson.

9.2 The Poisson Distribution


The Poisson distribution is a standard distribution closely linked to the binomial.

9.2.1 Characteristics
The Poisson is a discrete distribution. Its shape varies from right-skewed to almost
symmetrical as in Figure 9.1.

Figure 9.1 The Poisson distribution

9.2.2 Situations in which the Poisson Occurs


The Poisson distribution describes the occurrence of ‘isolated events within a
continuum’. For example, the occurrence of road accidents over a given period of
time in a particular geographical area is likely to have a Poisson distribution. The
road accidents are the isolated events and time is the continuum. The distribution
would give the probabilities of different numbers of accidents in that area in the
given period of time.
The Poisson may be thought of as similar to the binomial but with an infinite
sample size. The binomial is based on taking samples from a population whose
elements are of two types. The Poisson is also based on taking a sample from a
population of elements of two types, the split being into the ‘occurrence of events’
(e.g. road accidents that happened) and the ‘non-occurrence of events’ (e.g. road
accidents that could have happened but did not). The difference is that for the
Poisson the total number of elements in the sample is not known. Whereas the
occurrence of events in the sample can be counted, the non-occurrence of events
cannot. Because the number of events that could have occurred but did not is
infinite, the sample size is in effect infinite. The Poisson distribution provides the
probabilities that given numbers of events occur within the sample (usually but not
always a period of time). Compare this with the binomial distribution, whereby the
probabilities that the sample contains given numbers of elements of one type are
calculated. The mathematics of the Poisson are based on those of the binomial but
changed to allow for an infinite sample size.
A typical application is the arrival of telephone calls at a switchboard. The sample
is a period of time; the elements of the sample are the arrival or non-arrival of calls.
Since, potentially, an infinite number of calls could arrive during the time period, the
sample size is, in effect, infinite. The number of calls that do arrive can be counted,
but there remains an uncountable number of calls that did not arrive. Given an
average arrival rate of calls (the average number of calls per time period), the
probability that any number of calls will arrive can be calculated. For instance,
knowing that 10 calls per minute arrive on average, the Poisson gives the probabili-
ties that in any minute 0, 1, 2, 3, 4, … calls will arrive, leading to distribution shapes
like those in Figure 9.1. The probabilities can be found in Poisson distribution tables
such as those in Appendix 1, Table A1.3, the use of which will be explained later.
The purpose of such an analysis might be to determine the most efficient capacity
of the switchboard.
Other applications of the Poisson are:
(a) Flaws in telegraph cable (there are only a finite number of flaws in a given length
of cable, but an infinite number could potentially have occurred though did not)
– here the continuum is not time but cable.
(b) Mechanical breakdowns of machinery, cars, etc.
(c) Clerical errors.
9.2.3 Deriving the Poisson
The Poisson distribution is derived from the binomial. The starting point is the
binomial formula for probabilities:
P(r of type 1) = nCr · p^r · (1 − p)^(n−r)
If the sample size, n, increases indefinitely while np (the average number of the
first type per sample) remains at some constant level, λ (the Greek letter lambda),
then the formula becomes (after some not inconsiderable ‘black box’ mathematics):
P(r events) = e^−λ · λ^r / r!
The parameter of the distribution is λ, the average number of events per sample;
e is a constant, equal to 2.718…, which just happens to occur in certain mathemati-
cal situations (cf. the way π, equal to 3.141… , just happens to occur as the ratio
between the circumference and diameter of a circle).

This formula is the method by which the probabilities in Table A1.3, Appendix 1,
are calculated. It can be applied to the switchboard example above to calculate the
probabilities of given numbers of telephone calls per time period. Suppose the
average number of calls per minute is two (i.e. λ = 2). From tables or a calculator,
e^−2 = 0.135. Thus:
P(0 calls) = 0.135 × 1/1 = 0.135
P(1 call) = 0.135 × 2/1 = 0.27
and so on.
As with the binomial, it is sometimes easier, with the aid of a calculator, to use
the formula than the lengthy tables, especially when it is noted from the Poisson
formula that:
P(r + 1) = P(r) × λ/(r + 1)
For example:
P(r = 1) = P(r = 0) × λ
P(r = 2) = P(r = 1) × λ/2
P(r = 3) = P(r = 2) × λ/3 etc.
The important assumption is that the sample is taken at random, just as for the
binomial. In the switchboard example, if the average arrival rate of calls refers to the
whole day, then a sample time period taken at a particularly busy time of day would
violate the assumption.

9.2.4 Using Poisson Tables


Table A1.3 in Appendix 1 is a Poisson distribution probability table. Along the top
are the values of λ; down the sides are the values of r, the variable number of events
per sample. For instance, if there are on average 2.5 events per time period (λ is 2.5)
then, using the column headed 2.5, the probability of zero events in such a time
period is 0.0821, of one event is 0.2052, of two events is 0.2565 and so on.
Example
A company receives, on average, two telephone calls per minute. What capacity
switchboard (in terms of calls per minute) should be installed so that it can cope with
incoming calls 95 per cent of the time?
Assuming the average number of calls per minute does not vary during the switch-
board’s working day, the Poisson can be used. The assumption is not restrictive since
the day can be divided into sections reflecting different levels of telephone activity. Each
section would be the subject of a separate analysis. The assumption is important since
the Poisson is based (like the binomial) on taking a random sample, not a sample from a
part of the day when the average rate may be different from the average. If two calls per
minute is the average for the whole day, then, in busy periods, the switchboard may not
cope as well as the calculations suggest.

Table 9.1 Probabilities of given numbers of calls
No. calls     Probability %
0             13.5
1             27.1
2             27.1
3             18.0
4              9.0    (cumulative: 94.7)
5              3.6
6              1.2
7 or more      0.5
Total        100.0

The required probabilities can be read off Table A1.3 in Appendix 1. For this example
the average number of calls per minute is two. The column associated with this value
gives the probabilities shown in Table 9.1. The table shows that there will be four or
fewer calls 94.7 per cent of the time. A switchboard of capacity four will therefore be
able to handle incoming calls (approximately) 95 per cent of the time.
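A minimal sketch in Python reproduces Table 9.1 from the recurrence P(r + 1) = P(r) × λ/(r + 1), starting from P(0) = e^−λ:

import math

lam = 2.0                        # average calls per minute
p = math.exp(-lam)               # P(0 calls) = e^-lambda
cumulative = 0.0
for r in range(7):
    cumulative += p
    print(f"P({r} calls) = {p:.3f}   P(at most {r}) = {cumulative:.3f}")
    p = p * lam / (r + 1)        # recurrence gives P(r + 1)

The running total reaches 0.947 at r = 4, confirming that a capacity of four calls per minute copes (approximately) 95 per cent of the time.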

9.2.5 Parameters
The Poisson has just one parameter, the average occurrence of events (the average
number of calls per minute in the above example). Once this parameter, λ, is known,
the shape of the distribution is fixed exactly. This can be verified from the Poisson
formula. Once λ is fixed, the probabilities are fixed and the distribution shape will
be fixed. If λ takes a different value, the probabilities will be different, as will the
shape, although it will nevertheless have a general profile in accordance with
Figure 9.1.

9.2.6 Deciding whether Data Fit a Poisson


As with the normal and binomial distributions, the theoretical basis of the Poisson is
unlikely to be matched exactly in practice. For instance, there may be some bias in
the way the sample is taken from the population. In practice it is prudent to check
that a Poisson distribution is a reasonable approximation to the situation in ques-
tion.
As with the other standard distributions, there are two tests to check whether the
Poisson is applicable. First, the actual situation is compared qualitatively (although
some more advanced techniques allow this to be done quantitatively) with that on
which the distribution is based. Second, some observed data are compared with
what is theoretically expected, to see whether they match. An example of this is
given as a case study at the end of the module.

9.2.7 Using the Poisson to Approximate the Binomial


The way in which the normal distribution can, in certain circumstances, be used as
an approximation to the binomial has already been described. This is done purely
for convenience. The binomial tables and probability formula are lengthy and
tedious to use. The normal curve table is much easier to use.
In other circumstances the binomial can be approximated by the Poisson. Be-
cause of the way the derivations of the distributions are linked, the binomial can be
approximated by the Poisson when the sample is large and the proportion, p, is
small. As a rule of thumb, the approximation will give good results if n > 20 and p <
0.05. The parameter of the approximating Poisson is easily found since λ is defined
as being equal to the mean of the binomial, np.
Example
At peak hours, 15 000 cars per hour use the traffic tunnel at a major international
airport. From historical records it has been calculated that the probability of any one car
breaking down in the tunnel is 0.00003. Should three cars break down, it is unlikely that
emergency services will be able to deal with the situation; traffic will come to a stand-
still, flights will be missed and ulcers will be activated. What is the probability that
emergency services will be able to cope during a peak hour (i.e. that there will be no
more than two car breakdowns in the tunnel)?
The population is split into car journeys involving a breakdown in the tunnel and car
journeys not resulting in a breakdown in the tunnel. A sample (15 000 cars in one hour)
is taken from the population. The binomial applies to situations like this. The parameters
are n = 15 000 and p = 0.00003. Using the binomial probability formula:
P(0 breakdowns) = P(r = 0) = 15000C0 · (0.00003)^0 · (0.99997)^15000
P(1 breakdown) = P(r = 1) = 15000C1 · (0.00003)^1 · (0.99997)^14999
P(2 breakdowns) = P(r = 2) = 15000C2 · (0.00003)^2 · (0.99997)^14998
The probability of there being no more than two breakdowns has to be found:
P(0, 1 or 2 breakdowns) = P(r = 0) + P(r = 1) + P(r = 2)
The calculations are possible but, to most people, daunting. Since n > 20 and p < 0.05,
the Poisson approximation can be used.
λ = np = 15 000 × 0.00003
  = 0.45
The Poisson formula is:
P(r) = e^−λ · λ^r / r!
and
P(r + 1) = P(r) × λ/(r + 1)
Therefore:
P(r = 0) = e^−0.45 = 0.64
P(r = 1) = 0.64 × 0.45 = 0.29
P(r = 2) = 0.29 × 0.45/2 = 0.06
P(at most 2 breakdowns) = 0.64 + 0.29 + 0.06
= 0.99
Therefore, the breakdown services are likely to find themselves unable to cope one
hour in a hundred.
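The approximation can be checked directly, as in the short Python sketch below, which compares the exact binomial probability with the Poisson value using the scipy library:

from scipy import stats

n, p = 15000, 0.00003
lam = n * p                             # 0.45

exact = stats.binom.cdf(2, n, p)        # P(at most 2 breakdowns), binomial
approx = stats.poisson.cdf(2, lam)      # the Poisson approximation
print(f"binomial: {exact:.4f}   Poisson: {approx:.4f}")

Both values are about 0.989, so the approximation sacrifices almost nothing here.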

9.3 Degrees of Freedom


The next standard distribution to be discussed is dependent upon a new concept,
that of degrees of freedom. They are defined as the number of observations that
are free to vary in estimating a parameter from a sample. For instance, the arithmetic
mean is estimated by the formula:
Mean = ∑x/n
where n is the sample size.
The estimate is made from n observations, all of which can take any value within
the population if the sample is chosen at random. None of the observations is
restricted to a particular value; all are free to vary. An estimate of the arithmetic
mean made from a sample of size n has therefore n degrees of freedom.
It might appear that the number of degrees of freedom for estimates of parame-
ters other than the arithmetic mean should also be equal to the sample size. This is
not so. The starting point for calculating degrees of freedom is indeed the sample
size, but usually adjustments have to be made. For instance, the standard deviation
is estimated via deviations from the arithmetic mean. If the sample size were two,
then, in the absence of prior knowledge, the arithmetic mean would have to be
calculated from the sample. The two deviations would necessarily be the same size
but have opposite signs. For example, if the sample were the two observations 7 and
3, then:
Mean = (7 + 3)/2 = 5
In calculating the standard deviation, the two deviations are 2 = 7 − 5 and
−2 = 3 − 5.
The second deviation is always dependent on the first, being its negative, and is
therefore not free to vary. Consequently, there is only one deviation that is free to
vary. An estimate of standard deviation made from a sample of two has thus only
one degree of freedom. More generally, deviations require a fixed point (usually the
mean) against which to measure the deviation. The fixed point, when it is calculated
from the sample, in effect uses up a degree of freedom. The estimation of standard
deviation from a sample of size n is thus based on:
n − 1 degrees of freedom
Another way of viewing this is through the standard deviation formula:
Standard deviation = √(∑(x − x̄)² / (n − 1))

where n is the sample size.


Essentially, there are n observations free to vary. However, the mean has to be
calculated from the sample before the deviations can be measured. Suppose the n
observations were all listed and their deviations from their mean measured. The
deviations for the first n − 1 observations are free to vary and can take on different
values dependent on their random selection, but the last cannot. The last observa-
tion must take a particular value if the mean is to equal the value used in calculating
the previous n − 1 deviations. Only n − 1 deviations are therefore free to vary in the
estimate of the standard deviation, and it has n − 1 degrees of freedom.
The fact that the standard deviation estimate has n − 1 degrees of freedom lies
behind the use of n − 1 as the denominator in the formula. The averaging can be
thought of as being based on the degrees of freedom, not the sample size.
To summarise, the degrees of freedom associated with the estimate of a parame-
ter is the sample size minus the number of observations ‘used up’ because of the
need to measure other statistics (such as the arithmetic mean) before the estimate
can be made. In some cases (e.g. arithmetic mean), the degrees of freedom are equal
to the sample size; in other cases (e.g. the standard deviation), the degrees of
freedom are equal to the sample size minus one; in yet other cases to be met later,
the adjustment to the sample size is more than one.
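Statistical software reflects this adjustment. In the Python sketch below, the two-observation sample of 7 and 3 gives a standard deviation of 2.83 when the divisor is the single degree of freedom (n − 1), as against 2.0 if the divisor were the sample size:

import numpy as np

sample = np.array([7.0, 3.0])
print(np.std(sample, ddof=0))   # divisor n:     2.0
print(np.std(sample, ddof=1))   # divisor n - 1: 2.83 (one degree of freedom)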

9.4 t-Distribution
When uses of the sampling distribution of the mean were discussed in the previous
module, there was a restriction on the sample size in certain circumstances. If the
underlying individual distribution was non-normal, the sampling distribution was
normal only when the sample size exceeded 30; the standard deviation could be
estimated from the sample only when the sample size again exceeded 30. The t-
distribution overcomes the second of these restrictions and allows small sample
work to be done.

9.4.1 Characteristics
The t-distribution is similar to a normal distribution except that it tends to have
longer tails. It is continuous. The shape is symmetrical and varies, as in Figure 9.2.
For small sample sizes the tails are considerably longer than those of the normal; for
sample sizes of 30 or more, to a very good approximation, the t-distribution and the
normal coincide.

Figure 9.2 The t-distribution (similar in shape to the normal but with longer tails)

9.4.2 Situations in which the t-Distribution Occurs


The t-distribution is used in place of the sampling distribution of the mean when all
the following conditions apply:
(a) The population standard deviation is unknown and has to be estimated from the
sample.
(b) The sample size is less than 30. When the sample size exceeds 30, the normal
and t-distributions almost coincide and thus the use of the t would make no
difference.
(c) The underlying distribution of the population from which the sample is taken is
normal. If this is not the case, then, since the sample size is less than 30, the
central limit theorem cannot be invoked to assume the sampling distribution is
normal. The t-distribution allows only one of the two reasons for the sample size
exceeding 30 to be set aside.
When these conditions all apply, the t-distribution can be used for estimation and
statistical inference, just as the sampling distribution of the mean has previously
been used. For example, the calculation of 95 per cent confidence limits for the
length of life of a brand of electric light bulb from a sample of 40 such bulbs would
be based on the sampling distribution of the mean in conjunction with a normal
curve table; were the sample size 20 and the distribution of individual bulb life
lengths known to be normal, the calculation would be based on a t-distribution
probability table (see Appendix 1, Table A1.4, to be explained below).

9.4.3 Derivation of the t-Distribution


The mathematical derivation of the distribution is credited to W.S. Gosset in the
early part of the twentieth century. The purpose was purely to extend the methods
of estimation and inference to small samples. Incidentally, the brewery that em-
ployed Gosset did not allow him to publish his work. He published it anyway but
under the pen name of ‘Student’, which is why the distribution is often referred to
as Student’s t-distribution.
The formula giving the probabilities of sample means taking different ranges of
values is based on that of the normal distribution, refined to cope with the extra
variation that may arise because the standard deviation is not known exactly but
estimated from the sample. The extra width (i.e. longer tails) of the t-distribution
compared with the normal is because of this extra variation. The final formula and
its derivation are mathematically complicated. It is rarely quoted and almost never
needed explicitly because, as with the normal distribution, tables give the required
probabilities.
There is, however, one additional complication. The t-distribution is wider than
the normal distribution, but, the larger the sample size, the more certain is the
standard deviation estimate made from the sample and the closer is the t-
distribution to the normal. As a result, the t-distribution is slightly different for each
different sample size. More precisely, the distribution differs according to the
degrees of freedom of the estimate. In other words, the degrees of freedom are a
parameter of the t-distribution. Before the t-distribution can be used, the number of
degrees of freedom must be specified. The t-distribution is based on estimating the
standard deviation from a sample and, as one might suspect, has the same number
of degrees of freedom as the standard deviation, the sample size minus one.
9.4.4 Using t-Distribution Tables
Use of the normal distribution table starts with the calculation of a z-value:
z = (Observed value − Mean)/Standard deviation
When applied to the sampling distribution of the mean, this formula becomes, for
some observed sample mean x̄:
z = (x̄ − μ)/(s/√n)
where s is the estimate of standard deviation and n is the sample size.
Analogously, the t-distribution starts with the calculation of this same quantity,
but labelled ‘t’ to denote that the distribution being used is not normal:
t = (x̄ − μ)/(s/√n)
The interpretation of t values is the same as that of z values, but in practice they are
used in different ways because of the layout of t tables. Since the t-distribution has a
different set of probabilities for each level of the degrees of freedom, only certain
probabilities are given in a t table, to prevent it becoming too long. Only those
probabilities, 0.10, 0.05, 0.025, etc., that are particularly useful in estimation and
inference are given in the t-distribution in Appendix 1 (Table A1.4). The rows refer to
degrees of freedom, the columns refer to probabilities and the body of the table
contains t values. Figure 9.3 illustrates how the table works. For example, for a sample
size 12 (= 11 degrees of freedom), a t value of 2.201 leaves 0.025 (2.5 per cent) in the
tail of the distribution.

Figure 9.3 Use of the t-distribution (the value t0.025 leaves 2.5 per cent in the tail)


Table A1.2 in Appendix 1 shows how normal and t-distributions coincide when
the sample size is large. As the degrees of freedom increase, the t value that leaves
2.5 per cent in the tail (t0.025) approaches 1.96, the same as the z value that leaves 2.5
per cent in the tail. (The 1.96 has usually been approximated to 2 in general discus-
sions: ‘95 per cent lies within ±2 standard deviations’).
In estimation, t values are used in the same way as z values. The 95 per cent
confidence limits of the estimate of a population mean were:
x̄ ± 2s/√n

The 2 in this expression is just the z value that leaves 2.5 per cent in one tail of a
normal distribution, because 95 per cent of normal distribution lies between ±2
standard deviations of the mean. When the distribution is the t, then, instead of z =
2, the formula uses t0.025. The 95 per cent confidence limits for a population mean
when the t-distribution is used are:
x̄ ± t0.025 · s/√n
This t value will vary according to the sample size, unlike the z value, which does
not change with the sample size. Different confidence levels can be substituted. The
90 per cent confidence limits would have t0.05 in place of t0.025.
Example
A sample of 20 brand X light bulbs is tested to destruction. Their average life length is
1080 hours and the standard deviation of the sample is 60 hours. What are the 95 per
cent confidence limits of the average life length of all brand X bulbs?
Follow the five stages of the estimation procedure introduced in the last module.
(a) A random sample of size 20 has been taken. The sample size is not large enough for
the central limit theorem to be used. The t-distribution will apply, but only if it is
known or can be assumed that the individual distribution of bulb life lengths is nor-
mal.
(b) The mean of the sample has been calculated to be 1080 hours, the standard
deviation to be 60 hours.
(c) The standard error of the sampling distribution is:
Individual standard deviation/√Sample size = 60/√20
= 13.42
(d) The point estimate of the population mean (all brand X bulbs) is 1080 hours.
(e) 95 per cent confidence limits are given by:
x̄ ± t0.025 · s/√n
For 19 (= 20 − 1) degrees of freedom, t0.025 is 2.093 (from Table A1.4 in Appendix
1). Therefore the limits are:
= 1080 ± 2.093 × 60/√20
= 1080 ± 125.58/4.47
= 1080 ± 28.09
= 1052 to 1108
The 90 per cent limits use 1.729 (see Table A1.4 in Appendix 1) instead of 2.093 and
so are:
= 1080 ± 1.729 × 60/√20
= 1080 ± 23.2
= 1057 to 1103
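A minimal Python sketch reproduces these limits from the summary statistics, using scipy's t-distribution quantiles in place of Table A1.4:

import math
from scipy import stats

n, mean, s = 20, 1080, 60
se = s / math.sqrt(n)                     # standard error = 13.42

t95 = stats.t.ppf(0.975, df=n - 1)        # 2.093 for 19 degrees of freedom
t90 = stats.t.ppf(0.95, df=n - 1)         # 1.729

print(f"95%: {mean - t95 * se:.0f} to {mean + t95 * se:.0f}")   # 1052 to 1108
print(f"90%: {mean - t90 * se:.0f} to {mean + t90 * se:.0f}")   # 1057 to 1103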
In significance testing as well as estimation, the layout of t tables requires t values to be
used in a slightly different way to z. Instead of probabilities or critical values being
compared, an observed t value is compared to a critical t value. The approach is very
similar to the critical value approach to significance tests. The five stages of significance
test become:
(a) Specify the hypothesis.
(b) Collect sample evidence.
(c) Select a significance level.
(d) Calculate the t value related to the sample evidence.
(e) Compare the observed t value with the t value associated with the significance level.
Accept or reject hypothesis accordingly.
Example
An engineering firm uses a heating treatment to harden the metal parts of certain
components it produces. It has recently purchased 10 new machines as part of its
automation policy. The machines are intended to operate at 280 °C. According to the
manufacturer some small, normally distributed variation about this level should be
expected. A check on the machines shortly after installation reveals that their average
operating temperature is 265 °C with a standard deviation of 28 °C. Are the machines
operating to specification?
Follow the stages of a significance test.
(a) The hypothesis is that the machines are operating according to specification and
therefore that their temperature comes from a normal (according to the manufac-
turer) population of arithmetic mean 280 °C.
(b) The evidence is a sample of size 10 with mean (x̄) 265 and standard deviation (s) 28.
(c) The significance level is usually 5 per cent.
(d) The observed t value is calculated from the sample:
t = (x̄ − μ)/(s/√n)
= (265 − 280)/(28/√10)
= −15/8.85
= −1.69
(e) The observed t value is −1.69. Note that, since the t-distribution is symmetrical, a
negative t value causes no difficulties. The theoretical t value obtained from Ta-
ble A1.4 in Appendix 1 and appropriate to the significance level depends on whether
the test is one- or two-tailed. Since the machines are for hardening metal, only undu-
ly low temperatures are of concern to the engineering company. Assuming this is the
case, the test is one-tailed. From the table, the t value is in the row corresponding to
9 degrees of freedom and the column corresponding to 5 per cent in the tail (i.e. the
column headed 0.05). This value is 1.833. The observed value is less than 1.833 and
thus the sample evidence has a probability greater than 5 per cent (see Figure 9.4).
The hypothesis is accepted. The machines are performing according to specification.

Figure 9.4 Engineering example (one-tailed test: t0.05 = 1.833 leaves 5 per cent in the tail; the sample t of 1.69 falls short of it)


The problems of small samples can be handled by increasing the sample size. But this
option does not always exist. In this example, if the components were of high value and
low volume production, increasing the sample beyond 10 may be prohibitively expensive
or even impossible.
Note that if the test had been two-tailed the t value for the 5 per cent significance level
would have been taken from the column headed 0.025. This value is 2.262. A two-tailed
test would still result in the acceptance of the hypothesis.
If, in error, the sampling distribution of the mean based on the normal distribution had
been used instead of the t-distribution, the hypothesis would have been rejected. In this
case the z value from the sample is still 1.69, but the z value from the normal curve
table that leaves 5 per cent in the tail is 1.645. This is lower than the observed value,
which must therefore be in the tail. On this (false) basis, the sample evidence has a
probability of less than 5 per cent and the hypothesis is rejected.
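The one-tailed test can be written out in a few lines of Python, again taking the critical value from scipy rather than Table A1.4:

import math
from scipy import stats

n, x_bar, s, mu = 10, 265, 28, 280

t_obs = (x_bar - mu) / (s / math.sqrt(n))   # -1.69
t_crit = stats.t.ppf(0.05, df=n - 1)        # -1.833: lower 5% tail boundary

print(f"observed t = {t_obs:.2f}, critical t = {t_crit:.2f}")
if t_obs < t_crit:
    print("Reject: temperatures significantly below specification")
else:
    print("Accept: machines operating to specification")

Because −1.69 does not reach −1.833, the sketch accepts the hypothesis, matching the working above.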
All the tests previously carried out for large samples with the sampling distribution of
the mean may be carried out for small samples using the t-distribution. This means that
the following tests are now available for small samples:
(a) single sample tests of a population hypothesis;
(b) tests between two unpaired samples;
(c) tests between paired samples.

9.4.5 Parameters
When the sample size exceeds 30, the sampling distribution of the mean is almost
normal and so has two parameters, the arithmetic mean and the standard deviation.
The t-distribution stems from this but has one extra parameter, the degrees of
freedom. The t-distribution, therefore, has three parameters in total. This can be
verified by noting that once mean, standard deviation and degrees of freedom are
specified, the distribution probabilities are fixed.

9.4.6 Deciding whether Data Have a t-Distribution


As with all standard distributions, an actual situation may not match exactly the
theoretical situation on which the distribution was defined. A mismatch is only likely
to occur with the t-distribution when the underlying individual distribution from
which the sampling distribution is formed is not normal. Since the sample sizes for
which the t-distribution is used are always less than 30, the central limit theorem can
never be invoked and therefore the normality assumption is important. This is
especially true, of course, when the individual distribution is far from normal in
shape.
The check that should be made is that the individual distribution is indeed nor-
mal. This may be done by following the usual procedure of taking a small sample
and comparing observed with expected frequencies.
Four standard distributions have been encountered so far: binomial, normal,
Poisson and t. They are probably the most widely used and for that reason have
been covered in some detail. Many more standard distributions are available but
their range of application tends to be limited and specialised. They can also be
mathematically complex. For these reasons the distributions that follow are de-
scribed in less detail, and we concentrate on how and where they can be applied.

9.5 Chi-Squared Distribution


One of the major applications of the normal and t-distributions is to compare the
mean of a sample with the hypothesised mean of the whole population. Such
significance tests are in effect dealing with location (of which arithmetic mean is the
major measure – see Module 5). They answer the question: is the observed arithmetic
mean of the sample in accord with what is thought to be the arithmetic mean of the
population? Less frequently, the scatter may be of interest rather than or as well as
the location. The chi-squared (χ2) distribution provides the method for comparing
an observed sample variance with a hypothesised population variance. It can answer
the question: is the observed scatter of the sample in accord with what is thought to
be the scatter of the population?

9.5.1 Characteristics
Chi-squared, the variable of the distribution, is defined as:
χ² = (n − 1) × Observed sample variance/Population variance
The (n − 1) is the degrees of freedom of the chi-squared distribution. The sam-
ple size is n, and one degree of freedom is lost because an estimate of the mean is
required for the calculation of the sample variance, just as with the t-distribution.
The distribution can take on a variety of shapes, as shown in Figure 9.5. As the
degrees of freedom approach 30, the shape becomes more and more like that of the
normal distribution.

Figure 9.5 Chi-squared distribution (shapes for k = 2, 5 and 20 degrees of freedom)

9.5.2 Situations in which the Chi-Squared Occurs


The distribution is developed from the following theoretical situation:

A random sample is taken from a normally distributed population whose population
variance, σ², is known. The variance of the sample is measured using
s² = ∑(x − x̄)²/(n − 1). The chi-squared ratio, (n − 1)s²/σ², is calculated.
More samples are collected and chi-squared ratios calculated for them all. The
distribution of these ratios is known as the chi-squared distribution.

As usual with sampling distributions, in practice only one sample is gathered and
the theoretically derived distribution used to calculate probabilities as part of
estimation procedures or significance tests.
The assumption, common to all sampling distributions, that the sample has been
taken at random also applies to the chi-squared distribution. A second important
assumption ceases to be important when, as a rule of thumb, the sample size
exceeds 30. Just as with the t-distribution, the chi-squared becomes approximately
normal as the sample size approaches 30. The same restrictions on normality and
sample size apply therefore to the chi-squared as to the t-distribution. To summa-
rise, the assumptions are either that the sample has been taken at random from a
normal population or that, if the population is non-normal, the sample size is 30 or
more.

9.5.3 Use of Chi-Squared Tables


The chi-squared is a non-symmetrical distribution. When one is estimating or
carrying out significance tests, the two tails of the distribution have to be treated
separately. The rows of Table A1.5 in Appendix 1 refer to degrees of freedom; the
columns refer to the area in the upper or right-hand side of the distribution. The
column headed 0.95 therefore relates to a left-hand tail of 5 per cent and the column
headed 0.05 relates to a right-hand tail of 5 per cent. The values in the body of the
table are the chi-squared values relating to the degrees of freedom and distribution
areas specified by the rows and columns. As with the t-distribution, since there is a
different distribution for each degree of freedom, only the more important probabil-
ities are given.

Estimation is based on finding confidence limits for chi-squared and then trans-
forming these into variances. For example, at the 95 per cent level, there is 95 per
cent probability that a chi-squared observed from a sample will fall between:
χ²0.975 and χ²0.025
Thus:
χ²0.975 < (n − 1) × Sample variance/Population variance < χ²0.025
Inverting this expression (and therefore also changing < into >), with s² as usual
referring to the sample variance, gives:
1/χ²0.975 > Population variance/(n − 1)s² > 1/χ²0.025
(n − 1)s²/χ²0.975 > Population variance > (n − 1)s²/χ²0.025    (9.1)

This expression gives the 95 per cent confidence limits for a population variance.
Example
A random sample of 10 is taken from a normal distribution. The variance of the sample
is six. What are the 90 per cent confidence limits for the population variance?
From Equation 9.1, adjusted from 95 to 90 per cent limits:
(n − 1)s²/χ²0.95 > Population variance > (n − 1)s²/χ²0.05

The limits are:
(n − 1)s²/χ²0.05 and (n − 1)s²/χ²0.95
From Table A1.5 in Appendix 1, since there are 10 − 1 = 9 degrees of freedom:
χ²0.95 = 3.325 and χ²0.05 = 16.919
The limits are 54/16.919 and 54/3.325 (3.2 and 16.2).
It was known that the population from which the sample was drawn was normal and
therefore the sample size did not have to be 30 or more.
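The same limits can be obtained from scipy's chi-squared quantiles, as in the sketch below. Note the change of convention: the table's χ²0.95 (area 0.95 to the right) is scipy's 0.05 quantile.

from scipy import stats

n, s2 = 10, 6                        # sample size and sample variance
df = n - 1

chi_low = stats.chi2.ppf(0.05, df)   # 3.325, the table's chi-squared 0.95
chi_high = stats.chi2.ppf(0.95, df)  # 16.919, the table's chi-squared 0.05

lower = df * s2 / chi_high           # 54 / 16.919 = 3.2
upper = df * s2 / chi_low            # 54 / 3.325 = 16.2
print(f"90% limits for the variance: {lower:.1f} to {upper:.1f}")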
As with the t-distribution, significance tests are based on comparing an observed chi-
squared value with a critical value obtained from the table.
Example
A new computer-based management aptitude test is being checked against the previous
test. The check involves comparing the test scores of a random sample of 25 executives
who have taken the new test against the past records of the old test collected over
several years. The mean of the sample has been found to be broadly in line with the old
test. The variance of the sample is now to be checked against the past records. The
sample variance is 4; the variance calculated from the past records is 3.2. Does the
evidence show that the new test has a different scatter than the old? What assumptions
are being made in the test?
Follow the five stages of a significance test:
(a) The hypothesis is that the new test has not affected the scatter. The sample variance
is supposed to have come from a population whose variance is 3.2.
(b) The evidence is the sample of 25 scores of variance 4.
(c) Choose the conventional 5 per cent significance level.
(d) The observed chi-squared is calculated:
χ² = (n − 1) × Sample variance/Population variance
= (24 × 4)/3.2
= 30
(e) The new test could have increased or decreased the scatter and thus the test is
two-tailed. There are two critical chi-squared values, one for either tail. The relevant
columns of Table A1.5 in Appendix 1 are those that leave 2.5 per cent in each tail:
χ²0.975 and χ²0.025
There are 25 − 1 = 24 degrees of freedom. From the table, the chi-squared values
are therefore 12.4 and 39.4. The observed χ² falls between the two critical values
(see Figure 9.6) and thus the probability of the sample evidence must be greater than
5 per cent. The hypothesis is accepted. The new test does not affect the scatter.
There are two major assumptions: first, the distribution of test scores from which
the sample was drawn is normal; second, the sample was selected at random.

Figure 9.6 Management aptitude: Chi-squared test (2.5 per cent in each tail; χ²0.975 = 12.4, observed χ² = 30, χ²0.025 = 39.4)
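A short Python sketch confirms the two-tailed comparison:

from scipy import stats

n, s2, sigma2 = 25, 4, 3.2
chi_obs = (n - 1) * s2 / sigma2          # 30

low = stats.chi2.ppf(0.025, df=n - 1)    # 12.4
high = stats.chi2.ppf(0.975, df=n - 1)   # 39.4

accept = low < chi_obs < high            # True: 30 lies between the limits
print(f"observed chi-squared = {chi_obs:.0f}; hypothesis accepted: {accept}")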

9.5.4 Using Chi-Squared to Test Differences in Proportions


One of the most common managerial uses of the chi-squared distribution is to test
for differences in proportions. For example, suppose an advertising agency wants to
know whether there are gender disparities in the senior levels of its management.
Personnel records provide the data in Table 9.2. The question the agency wants
answered is whether the levels of seniority its managers reach are linked to, or are
independent of, their genders. In other words, given the numbers of men and
women in the organisation, are there disproportionate numbers of men and women
at different levels of management?

Table 9.2 Management levels in an advertising agency
Gender    Directors   Senior management   Middle management   Total
Men       7           12                  9                   28
Women     3           8                   11                  22
Total     10          20                  20                  50

In any significance test of this question the null hypothesis would be that seniori-
ty is independent of gender (i.e. there are as many men and women at each
management level as would be expected given the numbers of men and women
employed by the agency). So, as a first step, it would be useful to know what should
be expected.


Given that 28 out of 50 employees are men, it would be expected (no gender
differences, remember!) that 28/50 of the directors would be men. Likewise 22/50 of the directors
should be women. But there are 10 directors and therefore 28/50 of 10 = 5.6 would
be men, setting aside concerns about what 0.6 of a director means. This would mean
4.4 women directors. Table 9.3 has been developed from Table 9.2 to include these
expected numbers.

Table 9.3 Management levels in an advertising agency: observed and expected
Gender    Directors          Senior management     Middle management    Total
Men       fo = 7; fe = 5.6   fo = 12; fe = 11.2    fo = 9; fe = 11.2    28
Women     fo = 3; fe = 4.4   fo = 8; fe = 8.8      fo = 11; fe = 8.8    22
Total     10                 20                    20                   50
fo = observed frequency; fe = expected frequency

It rather looks from Table 9.3 as if women are under-represented at the higher
levels and over-represented at the lower, but how can this impression be tested
statistically? The key is that the statistic:
∑(fo − fe)²/fe
summed over all cells of the table, follows, approximately, a chi-squared distribution. Some
statistical theory, which we will leave in the ‘black box’, is needed to demonstrate
this. The degrees of freedom are:
(Number of rows − 1) × (Number of columns − 1)
Example
A significance test for the advertising agency follows the usual five stages.
(a) The null hypothesis is that of independence: the proportions of male and female
managers at each management level are in proportion to the numbers of men and
women employed by the agency.
(b) The data are the evidence collected from personnel records and shown in Table 9.2.
(c) The conventional 5 per cent significance level is used.
(d) The observed chi-squared value is calculated:
χ² = (7 − 5.6)²/5.6 + (12 − 11.2)²/11.2 + (9 − 11.2)²/11.2 + (3 − 4.4)²/4.4
+ (8 − 8.8)²/8.8 + (11 − 8.8)²/8.8
= 1.96/5.6 + 0.64/11.2 + 4.84/11.2 + 1.96/4.4 + 0.64/8.8 + 4.84/8.8
= 1.91
(e) The test is one-tailed because the purpose of the test is to determine whether
observed and expected are significantly different, i.e. to determine whether chi-
squared is unusually high; it would be of no interest in addressing the hypothesis if
chi-squared were unusually low. The conventional 5 per cent significance level
means, for a one-tailed test, that the value of chi-squared that leaves 5 per cent in
the tail of the distribution must be found. The degrees of freedom are:
(Number of rows − 1) × (Number of columns − 1) = 1 × 2 = 2


From the chi-squared table (Table A1.5 in Appendix 1), the value of chi-squared
leaving 5 per cent in the tail and relating to two degrees of freedom is 5.99.
The observed chi-squared is much smaller than this and therefore the hypothesis must
be accepted: level of seniority and gender are independent. There are no significant
gender differences in the levels of seniority. The point is that, in this case, although there
are differences and women do appear to be under-represented, these differences are
not statistically significant.
There is a potential difficulty with this test. The above formula for chi-squared is an
approximation to a true chi-squared. It is akin to approximating the binomial distribu-
tion with the normal. In this case the approximation to chi-squared is only valid if the
sample size is large. As a rule of thumb, the expected frequency in each cell of the table
should be around five or more. If it is not, a correction should be made to the formula.
In the above example none of the cells had expected frequencies much below five and
so the correction was not made. The nature of the correction goes beyond the scope of
this module and is not described.
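In practice a test such as the agency's would usually be run by computer. A minimal Python sketch (assuming the SciPy library is available) is:

    import numpy as np
    from scipy.stats import chi2_contingency

    observed = np.array([[7, 12, 9],    # men: directors, senior, middle management
                         [3, 8, 11]])   # women
    chi_obs, p_value, dof, expected = chi2_contingency(observed)
    print(chi_obs, dof)    # 1.91 with 2 degrees of freedom; expected reproduces Table 9.3
    print(p_value)         # about 0.39, well above 0.05: independence is accepted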
This approach, of comparing expected and observed frequencies to test for pro-
portions in a table, is also the basis for testing the goodness of fit of distributions to
observed data. For example, it is possible to test whether a sample of data follows a
standard distribution such as the normal or binomial distribution. The expected
frequencies are what would be expected theoretically from statistical tables of the
distribution. For instance, about 34 per cent of a normal distribution lies between the mean
and +1 standard deviation. The observed frequencies are what is observed in the sample
to lie, for instance, between the mean and +1 standard deviation. The above
approximation is used to calculate an observed chi-squared and this is compared to
the critical value for the specified significance level found in chi-squared tables. This
application of chi-squared is important to statisticians but less so to managers and
for this reason is not covered in detail.
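Although not covered in detail here, such a goodness-of-fit comparison is straightforward by computer. A minimal Python sketch, using purely hypothetical observed and expected counts for illustration (and assuming the SciPy library), is:

    from scipy.stats import chisquare

    observed = [18, 30, 32, 20]   # hypothetical counts of sample values in four intervals
    expected = [20, 30, 30, 20]   # counts a fitted distribution predicts for the same intervals
    result = chisquare(observed, f_exp=expected)
    print(result.statistic, result.pvalue)
    # The ddof argument can be used to remove further degrees of freedom when the
    # distribution's parameters have been estimated from the sample itself.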

9.6 F-Distribution
When location (usually measured by the arithmetic mean) is the subject of analysis,
the normal and t-distributions can be used both to compare a sample mean with a
population mean and to compare the mean of one sample with that of another.
When scatter is the subject of analysis, the chi-squared distribution can be used to
compare a sample variance with a population variance. The F-distribution com-
pletes the picture as regards scatter. It is used to compare the variance of one
sample with that of a second. The variable of an F-distribution is the ratio between
two variance estimates. Just as the location of two samples could be compared
through the difference in their means (by applying a normal or t test), so the scatter
of two samples can be compared through the ratio of their variances (by applying an
F test).

9.6.1 Characteristics
The variable of the F-distribution is defined as:
F = Variance of sample 1/Variance of sample 2


There is a pair of degrees of freedom associated with the F-distribution, one for
the sample variance in the numerator (the upper part of the ratio) and one for the
sample variance in the denominator (the lower part of the ratio). Each of these
degrees of freedom is the sample size minus one, the one being lost for the usual
reason that in calculating a variance the arithmetic mean must also be calculated
from the sample. For example, if the two samples had 17 and 13 observations
respectively then the F-distribution would be said to have (16,12) degrees of
freedom. The distribution can take on a variety of shapes, as shown in Figure 9.7.
The shapes are right-skewed. The lower the degrees of freedom, the more skewed
the distribution is likely to be, since estimates of variances will be more uncertain.

Figure 9.7 The F-distribution (curves for (8,8), (20,8) and (30,30) degrees of freedom, plotted for F values from 0 to 4)

9.6.2 Situations in which the F-Distribution Occurs


The F-distribution is formed from the following situation:

Two samples are taken at random from a normally distributed population. The
variances of the two samples, s₁² and s₂², are calculated. The ratio between the
variances, F = s₁²/s₂², is also calculated. More pairs of samples are taken and F
calculated for each pair. The distribution of these ratios is the F-distribution.

As usual for sampling distributions, the applications of the F-distribution are
based on taking just one pair of samples and using them in conjunction with
theoretically derived F-distribution tables, such as Table A1.6 in Appendix 1, for
estimation and to carry out significance tests. For example, if the samples have come
from populations with different variances, then the F ratio will be unlikely to
conform to the theoretical distribution. A significance test can help to verify or
refute the hypothesis that the samples do come from the same population (or
populations with the same variance). Just as the t and normal distributions can
compare the difference in the arithmetic means of two samples, so the F-
distribution can compare the difference in variances of two samples.
A major application of the F-distribution is in analysis of variance, the subject of
the next module.
As with the chi-squared distribution, the F rests on two important assumptions:
first, the samples should be selected at random; second, the population from which
the samples come should be normally distributed.


9.6.3 Use of F Tables


In an F table (see Appendix 1, Table A1.6) the rows refer to the degrees of freedom
in the denominator of the F ratio and the columns to the degrees of freedom in the
numerator. In both cases, the degrees of freedom are the sample size minus one (n
− 1). The ‘minus one’ is for the usual reason that one degree of freedom is lost
because the mean has to be calculated before the variance can be calculated.
For a given pair of degrees of freedom, the corresponding cell in the table con-
tains two numbers. The upper is the 5 per cent critical value, labelled F0.05; the lower
is the 1 per cent critical value, labelled F0.01. Their meaning is as follows. Suppose
that two samples of sizes, say, n and m, are drawn at random from a normal
population. The ratio of their variances (the F ratio) is calculated with the larger of
the variances on top (the numerator). More samples of the same size are drawn and
more F ratios calculated. Eventually, a distribution is formed. The critical 5 per cent
value is that F ratio which leaves 5 per cent in the tail of the distribution (see Fig-
ure 9.8). F0.01 is interpreted as leaving 1 per cent in the tail. By convention, the larger
of the variances is on top in the ratio so that only one tail has to be considered. If
either variance could be the upper then the F ratio could be less than one as well as
greater, and the tables would have to be twice as large to include critical values for
the left-hand as well as right-hand tail.

Figure 9.8 Using an F-distribution table (F0.05 is the value leaving 5 per cent in the right-hand tail of the distribution; F0.01 leaves 1 per cent)


For example, two samples are selected at random from a normal distribution and
their F ratio is calculated as the quotient of their variances with the larger variance
on top. If the sample sizes are 7 (numerator) and 10 (denominator) then the 5 per
cent critical F ratio, from Table A1.6 in Appendix 1 is 3.37. The 1 per cent critical F
ratio is 5.80.
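The same critical values can be read by computer rather than from the printed table. A minimal Python sketch (assuming the SciPy library is available) is:

    from scipy.stats import f

    print(f.ppf(0.95, dfn=6, dfd=9))   # 3.37, the 5 per cent critical value
    print(f.ppf(0.99, dfn=6, dfd=9))   # 5.80, the 1 per cent critical value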
A typical use of the F table is in a significance test. For example, suppose it is
hypothesised that two samples come from populations with the same variance. If
their F ratio exceeds the 5 per cent critical value, an event has occurred that has a
probability less than 5 per cent and, as in other significance tests, the conclusion is
to reject the hypothesis at the 5 per cent level. Similarly, if the observed F ratio
exceeds the 1 per cent critical value, the hypothesis is rejected at the 1 per cent level.
Example
A manufacturer of electronic kitchen equipment buys in resistors for oven controls
from two sources, Supplier A and Supplier B. The resistors are of nominal resistance 3
ohms, but inevitably not all are of exactly 3 ohms. From past experience it is thought
that each supplier’s resistors are normally distributed about a mean of 3 ohms. A quality
control scheme based on sampling and significance testing continually checks that the
average resistance is 3 ohms. The evidence is that the average is 3 ohms for both
suppliers. However, recent customer complaints have begun to suggest that there may
be a difference in quality between the two suppliers’ products in that the variation in
resistance is greater for Supplier A than Supplier B. A sample of 12 resistors is taken at
random from A’s supply and the variance calculated to be 0.042; a sample of 18 is taken
from B’s supply and the variance calculated to be 0.017. Is there a significant difference
in quality?
Follow the five stages of a significance test:
(a) The hypothesis is that the two samples come from populations with the same
variance; in other words, there is no difference in quality.
(b) The evidence is the two samples and their variances, from which an F ratio can be
calculated. Since the distribution of resistances of each supplier’s resistors is thought
to be normal, a significance test based on the F-distribution will be valid.
(c) Set the significance level at the conventional 5 per cent.
(d) The critical F value for a 5 per cent significance level and (11,17) degrees of freedom
is 2.41. The observed F ratio is 0.042/0.017 = 2.47. The observed F ratio is slightly
larger than the critical value.
(e) Since the observed F exceeds the critical value, the sample evidence has a probability
of less than 5 per cent. The hypothesis is rejected. There is evidence that one supplier
(Supplier B) is supplying a product of better quality, as defined by variation in re-
sistance. The result is so close to the accept/reject borderline that this fact should be
mentioned in quoting the result. If the significance level had been 1 per cent then the
hypothesis of no difference in quality would be accepted, since the 1 per cent critical
value is 3.52 while the observed F ratio remains 2.47.
Two major assumptions were involved in this significance test. They are the same as for
the chi-squared distribution. First, the populations from which the samples were taken
were normal. Second, the samples were selected at random from the populations.
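A minimal Python sketch of the resistor test (assuming the SciPy library is available) confirms how close the result is to the borderline:

    from scipy.stats import f

    var_a, n_a = 0.042, 12    # Supplier A: the larger variance goes on top
    var_b, n_b = 0.017, 18    # Supplier B
    f_obs = var_a / var_b                       # 2.47
    print(f.ppf(0.95, n_a - 1, n_b - 1))        # 2.41: observed F just exceeds this
    print(f.ppf(0.99, n_a - 1, n_b - 1))        # 3.52: observed F is below this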

9.7 Other Distributions


There are many other standard distributions in addition to the six met up to now,
but not all are applicable to management problems. Some may relate to other fields
of interest such as the natural sciences or engineering. The ones that are applicable
to management are generally used only occasionally, but two are worthy of note.
First, the negative binomial distribution (it is nothing directly to do with the
binomial) is applicable to situations that can almost, but not quite, be modelled by the
Poisson. For example, it might be thought that the number of packets of a brand of
breakfast cereal bought by families in, say, a six-month period would be represented
by the Poisson distribution (a sample of time, a finite number of purchases, an infinite
number of non-purchases, etc.). However, tests of the goodness of fit of the Poisson
show that there is a mismatch between what is observed and what is theoretically
expected. The reason is that the process of buying breakfast cereal differs from family
to family. Some families are more prone to buying cereals than others. In other words,
the parameter (λ) varies from family to family. If this variation is taken into account
mathematically, the negative binomial distribution emerges. It is of increasing importance in marketing, particularly in the area of buyer behaviour.
A second distribution to note is the beta-binomial distribution. This is applica-
ble in situations where the binomial is not quite sufficient. If the binomial
parameter, p (the proportion of type one), varies in a certain way across different
parts of the population, the beta-binomial may be applicable. For example, it might
be supposed that the binomial distribution would be applicable to the number of
boys or girls per family (a family of size five is a sample from a population, that of
all five-children families, containing 51 per cent boys and 49 per cent girls). This
appears to be a binomial situation and it would be expected that the binomial would
be able to predict, for instance, the numbers of all-boy families for each family size.
It does not do so accurately. The reason for the inaccuracy is that p is not constant
across the population. Some families are more prone to having boys (or girls) than
others. In other words, not all families have a 51 per cent chance that a birth will
produce a boy. The beta-binomial distribution is a better model of this situation
than the binomial.

Learning Summary
In respect of their use and the rationale for their application, the standard distribu-
tions introduced in this module (Poisson, t, chi-squared, F, negative binomial and
beta-binomial) are in principle the same as the earlier ones (normal and binomial).
Their areas of application are to problems of inference, specifically estimation and
significance testing. The advantages their use brings are twofold. First, they reduce
the need for data collection compared with the alternative of collecting one-off
distributions for each and every problem. Second, each standard distribution brings
with it a body of established knowledge that can widen and speed the analysis.
The eight standard distributions encountered so far are just a few, but probably the
most important few, of the very many that are available. Each has been developed to
cope with a particular type of situation. Details of each distribution have then been
recorded and made generally available. When a new distribution has been developed
and added to the list, it has usually been because it is applicable to some particular
problem that can be generalised. For instance, W.S. Gosset developed the t-
distribution because of its value when applied to a sampling problem in the brewing
company for which he worked. Because this problem was a special case of a general
type of problem, the t-distribution has gained wide acceptance.
To summarise, when one is faced with a statistical problem involving the need to
look at a distribution, there is often a better alternative than having to collect large
amounts of data. A wide range of standard distributions are available and may be of
help. Table 9.4 summarises the standard distributions described so far and the
theoretical situations from which they have been derived.
One of the principal uses of standard distributions is in significance testing. Ta-
ble 9.5 lists four types of significance test and shows the standard distribution that is
the basis of each.


In addition to their direct application to problems, standard distributions are
fundamental to many other parts of formal statistical analysis. In later modules a
knowledge of distributions, particularly the normal, will be essential.

Table 9.4 Summary of standard distributions


Distribution        Situation
Normal              Observations taken (or measurements made) of some quantity that is essentially constant but is subject to many small, additive, independent disturbances.
Binomial            Samples taken from a population in which the elements are of two types. The variable is the number of elements of one of the types in the sample.
Poisson             Samples taken of a continuum (e.g. time, length). The variable is the number of ‘events’ in the sample.
t                   Similar to the normal but where the standard deviation is estimated from a sample of size < 30.
Chi-squared         Sample taken from a normal population. The variable is based on the ratio between the sample variance and the population variance.
F                   Two samples taken from a normal population. The variable is the ratio between the variances of the two samples.
Negative binomial   Like the Poisson, but with the parameter λ itself subject to variation across the population.
Beta-binomial       Like the binomial, but with the parameter p subject to variation across the population.

Table 9.5 Summary of significance tests


Significance test                                             Distribution
Comparing a sample mean with a population mean.               Normal (if sample size ≥ 30); t (if sample size < 30)
Comparing one sample mean with another sample mean.           Normal (if combined sample ≥ 30); t (if combined sample < 30)
Comparing a sample variance with a population variance.       Chi-squared
Comparing one sample variance with another sample variance.   F


Review Questions
9.1 What standard distribution would you expect to apply to the heights of the male
employees of an organisation?
A. Binomial
B. Normal
C. Poisson
D. t
E. F

9.2 If n > 20 and p < 0.05, the binomial distribution is usually approximated by the Poisson
when analysing data. The reasons for this are:
A. the Poisson is more accurate since it is based on the binomial but with p small
and n large.
B. it is easier to calculate with the Poisson formula than with the binomial.
C. Poisson tables are easier to use than binomial tables.

9.3 The stretch of motorway between two major cities has had 36 major accidents in the
last year. What is the probability that there will be more than five major accidents next
month?
A. 18%
B. 8%
C. 10%
D. 5%

9.4 When the sample size is 25, the number of degrees of freedom associated with
estimating the variance is:
A. 25
B. 24
C. 23
D. 22

9.5 The sampling distribution of the mean is a t-distribution, not a normal distribution, when
random samples of size 20 are taken from a population of unknown standard deviation
and unknown type. True or false?


Questions 9.6 to 9.9 refer to the following situation:


Eighteen executives are sent on a two-week physical fitness course by their company.
A test measures their fitness on a scale ranging from 0 to 500. An unfit executive taking
no regular exercise would score about 100 on the test, a person taking exercise three
times a week 250–300, an Olympic athlete over 400. The executives are tested at the
start and end of the course. The difference in scores for each executive is calculated.
The mean difference for the sample of 18 is found to be 60, the standard deviation 95. A
paired t test is used to determine whether the fitness of the executives has improved
significantly, at the 5 per cent level. The hypothesis is that the sample of differences has
come from a distribution of mean 0. It can be assumed that the differences are normally
distributed.

9.6 What is the standard error of the sampling distribution of mean differences?
A. 9.7
B. 23.0
C. 5.3
D. 22.4

9.7 What is the observed t value?


A. 1.74
B. 1.73
C. 0.63
D. 2.68

9.8 What is the t value corresponding to the 5 per cent significance level against which to
compare the observed t value?
A. 1.73
B. 1.74
C. 2.10
D. 2.11
E. 1.645

9.9 The conclusion should be:


A. Accept the hypothesis.
B. Reject the hypothesis.
C. Inconclusive test.

9.10 Provided conditions of randomness and normality are met, the chi-squared distribution
can be the basis of testing whether a sample variance is significantly different from a
hypothesised population variance. True or false?


9.11 Referring to the figure below, what is the 10 per cent critical chi-squared value for the
upper tail of a distribution with 18 degrees of freedom?

[Figure: a chi-squared distribution with 10 per cent in the upper tail; the critical value χ² marking off that tail is to be found]

A. 10.865
B. 10.085
C. 24.769
D. 28.869
E. 25.989

9.12 The F-distribution could be used to determine confidence limits in the estimation of
population variance from a sample variance provided the sample has been selected at
random and the distribution is normal. True or false?

9.13 The 5 per cent critical value for the F-distribution when the numerator has 8 degrees of
freedom and the denominator 11 is:
A. 3.20
B. 5.74
C. 4.74
D. 2.95
E. 3.31

9.14 An F ratio is calculated from the variances 24 and 96 of two samples of sizes 15 and 13
respectively. This ratio exceeds the 1 per cent critical F ratio. True or false?

9.15 One of CBA plc’s products is a range of bathroom tiles. The production process has an
average 0.35 per cent tiles defective. The tiles are sold in packages containing 100 tiles.
What standard distribution would be used in practice to monitor the number of
defective tiles per package?
A. Normal
B. Binomial
C. Poisson
D. t
E. Chi-squared
F. F


Case Study 9.1: Aircraft Accidents


1 Passenger aircraft are, unfortunately, subject to accidents. There is a wide variety of
reasons for them. Bad weather, pilot error, incorrect maintenance, poor air traffic
control and design faults all feature in official accident reports. The frequency of
accidents depends very much on the make and mark of aircraft concerned. To some
degree this is because some aircraft are inferior to others, but mainly it is because the
circumstances in which aircraft operate make them more or less vulnerable to accident.
For all aircraft, insurance companies have to decide on the level of premium to charge
to cover against these risks. Premiums are revised from time to time as circumstances
change and as more operations records become available. The premium for a particular
make of aircraft is shortly to be revised in the light of its first 18 months of major
commercial operation. To help with this decision, information on the incidents involving
100 aircraft of this type over a 400-day period has been collected. An incident is not the
same as an accident. An accident is when there is personal injury and a claim on the
insurance company; an incident is some situation that, although not necessarily leading
to an accident, could potentially have led to an accident, in the pilot’s view. As a result,
the situation has been made the subject of an official report by the pilot. The pattern
and frequency of incidents, especially in comparison to the record for other makes of
aircraft, is used to establish the insurance premium. The record of incidents for the
aircraft is as follows:

Number of incidents in 400 days (r)      0    1    2    3    4    5    6    7
Number of aircraft with r incidents      23   33   23   11   5    3    1    1
Total = 100
a. Does this pattern of incidents follow a Poisson distribution?
b. Eventually it is anticipated that there will be 800 aircraft in service throughout the
world. If the pattern of incidents is Poisson, how many of the 800 would be ex-
pected to have five or more incidents in a 400-day period?
c. What reservations are there concerning the analysis?
d. What further data or analyses are required to strengthen the conclusion?

Case Study 9.2: Police Vehicles


1 The vehicle manager of a police force is investigating the possibility of using a different
type of tyre on the force’s high performance patrol cars. All such cars are of the same
make and mark. Their annual mileage is exceptionally high and it is thought that the new
tyres will bring about large savings in petrol cost. The manufacturers of the tyres claim
that it should bring about a reduction in petrol consumption equal to 1.5 miles per
gallon (i.e. with the new tyres a car would, on average, be able to travel a further 1.5
miles for every gallon of petrol). Before embarking on a detailed cost study the manager
plans to carry out a pilot study, which, if it looks promising, should enable him to secure
the assistance and funds required for a detailed evaluation.
The pilot study took the following form. The force has well over 100 patrol cars in total.
Of these, the 14 patrol cars aged between six and nine months were selected to take
part in a trial. The new tyres were put on these cars for a three-month period. At the


end of this time the miles-per-gallon (mpg) figure was calculated for each of the 14 cars.
The result was as follows:

Car 1 2 3 4 5 6 7 8 9 10 11 12 13 14
mpg 21 24 22 24 29 18 21 26 25 19 22 20 28 23

The police computer holds not only criminal records but also records of the history of
all police vehicles. This includes summaries of petrol consumption per quarter. From
these data it was found that, in the past, patrol cars, when between six and nine months
old, had on average consumed petrol at the rate of 22.4 miles per gallon over quarterly
periods. It is not possible to calculate the standard deviation of this distribution because
the data are held in summary form only and there is insufficient detail for the standard
deviation to be determined. Checks with the records of other police vehicles indicate
that the distribution can be assumed to be normal.
a. Conduct a significance test to determine whether the new tyres have made a
significant difference to petrol consumption, the null hypothesis being that the tyres
make no difference to petrol consumption.
b. What should the sample size ideally have been if the test were to discriminate
equally between the null hypothesis and the alternative hypothesis (the manufactur-
er’s claim)?
c. Why is it necessary for the cars included in the trial to be of the same make and
mark and of roughly similar ages?
d. A senior officer of the force who took a statistics option at the police training
college suggests that the test is invalid because the past records refer to the three
previous years and, he says, maintenance methods must have improved over this
time. His proposal is to compare the consumption of a sample of cars using the new
tyres with another sample of cars of the same make and ages at a particular neigh-
bouring force whose maintenance routines just happen to be identical. What are the
arguments for and against his suggestion?



Module 10

Analysis of Variance
Contents
10.1 Introduction.......................................................................................... 10/1
10.2 Applications .......................................................................................... 10/2
10.3 One-Way Analysis of Variance ........................................................... 10/5
10.4 Two-Way Analysis of Variance ........................................................ 10/10
10.5 Extensions of Analysis of Variance................................................... 10/13
Learning Summary ....................................................................................... 10/14
Review Questions ......................................................................................... 10/15
Case Study 10.1: Washing Powder ............................................................. 10/16
Case Study 10.2: Hypermarkets ................................................................. 10/17

Prerequisite reading: Module 7, Module 8, Module 9


Learning Objectives
Up to now the statistical tests have concentrated on the differences between two
samples. In practice, a number of different samples are often available and there is
the need for a test that shows whether there are statistically significant differences
within a group of samples. Analysis of variance is such a test. By the end of the
module the reader should know how analysis of variance extends statistical infer-
ence from the one- and two-sample tests described in earlier modules to many-
sample tests. He or she should know the difference between one-way and two-way
analyses of variance and the type of problems to which they are applied, and should
also appreciate further extensions of the subject to more complicated tests. As with
all statistical inference, the crucial underlying assumptions and points of practical
interest should accompany the knowledge of how and where to apply them.
This module covers advanced material and may be omitted the first time through
the course.

10.1 Introduction
The significance tests encountered up to now have involved just one or two
samples. The mean or variance of a sample has been compared with that of a
population; the mean or variance of one sample has been compared with that of a
second sample. The purpose has been to test the effect of new conditions on some
management situation. The tests show whether or not the new conditions have led
to a significant change in the data at a given confidence level. For example, the
influence of using a different brand of tyres on police cars was investigated by
carrying out a significance test to compare miles-per-gallon data of a small sample of
police cars using the new brand of tyres with the historical record of miles per


gallon for all cars with the current brand of tyres. Although the miles per gallon
achieved by the sample was greater, it was not sufficiently so to suggest that the new
tyres had made a significant difference at the 95 per cent confidence level.
In other situations the need sometimes arises to make comparisons with several
samples rather than just one or two. For example, several different fertilisers may be
used to improve crop yield. The question that has to be answered is whether the
fertilisers all have approximately the same effect or whether there are significant
differences in their effects. In this case several samples need to be compared, each
referring to crop yield data for one of the fertilisers.
It would be possible to tackle problems such as this by taking the samples in
pairs and using ordinary two-sample significance tests. Suppose there were six
fertilisers. Comparing them all in pairs would need a large number of significance
tests, which would be very time-consuming. Worse, at the conventional 5 per cent
significance level there is a 5 per cent chance of wrongly rejecting the hypothesis (of
no difference between the two samples). Consequently, even if in reality there is no
difference in the two fertilisers, it could be expected that 5 per cent of the tests
would show, purely by chance, that there is. Paired significance tests are therefore
not a statistically valid option.
Analysis of variance is a type of significance test that allows, in a single test, sev-
eral samples to be compared under the hypothesis that they have all come from the
same population (or that the populations from which they come all have the same
mean). First, some applications will be described. Second, the underlying theory will
be explained before we move on to consider the basic technique, called one-way
analysis of variance. Third, some extensions, particularly two-way analysis of
variance, will be discussed.

10.2 Applications
10.2.1 Television Assembly Lines
A plant in Asia manufactures TVs. There are eight parallel assembly lines within the
plant. Production rates vary from day to day and from line to line because of a range
of factors, including breakdowns, lack of supplies and the skill levels of the workers.
Management wants to know whether the observed production rates differ signifi-
cantly from line to line so that it can decide whether attention should be given to all
lines or restricted to particular lines. Table 10.1 shows production data collected
from the lines over a 200-day period.
Analysis of variance permits one single significance test to indicate whether the
differences in average production rates, shown in the bottom row of Table 10.1, are
significant or not. It tests the hypothesis that all samples (lines) come from the same
population (or from populations with the same mean). In other words, analysis of
variance will show whether the production rate for each assembly line can be
thought of as being essentially the same.


Table 10.1 Television production rates


Line 1 Line 2 Line 3 Line 4 Line 5 Line 6 Line 7 Line 8
Day 1 23 32 28 29 25 31 30 26
Day 2 28 29 28 25 24 30 31 28
Day 3 31 28 29 27 29 30 27 30
…
Day 200 29 23 32 22 31 27 23 29
Average 27.1 29.6 28.8 25.8 28.4 29.7 27.7 28.0
Although analysis of variance shows whether a difference exists, it does not show
whether this is because of one particular sample or group of samples. However, the
sample means (such as the bottom row of Table 10.1) do reveal which samples are
the most deviant, in the sense of being furthest from the overall mean.
This test is ‘one-way’ analysis of variance because it considers one reason for
different production rates, i.e. that the rates refer to different assembly lines.
Much of the variation in production rates may stem from random causes, such as
malfunctions of electrical equipment. But variation may also stem from some
systematic cause such as, say, the day of the week. There may be variation because
production tends to be lower on Mondays and Fridays than mid-week. If this cause
of variation were to be removed, some of the inherent variation in production rates
would be removed. Would there then be a significant difference between the
production rates of the lines? Two-way analysis of variance answers this question.
Two-way analysis of variance allows a source of systematic variation to be isolated.
The technique is called ‘two-way’ because it looks at two sources of systematic
variation: different assembly lines and different days of the week.

10.2.2 Effectiveness of Fertilisers in Agriculture


The effectiveness of fertilisers can be tested through analysis of variance. Suppose
there are six fertilisers. They are allocated at random to six plots of land (see Fig-
ure 10.1). Later, for each plot, the crop yield for each of, say, five random
observations of 50 square metres is measured and recorded (see Table 10.2). A one-
way analysis of variance on the average crop yield for each fertiliser tests the
hypothesis that the observations for each fertiliser come from the same population.
It indicates whether the differences between the measured yields for each fertiliser
are or are not likely to be due purely to chance. If the differences are likely to have
arisen purely by chance, then there is no evidence that the fertilisers have different
effects on crop yield; if they are not likely to have arisen purely by chance, then the
evidence is that the fertilisers do have different effects on yield. As with the televi-
sion production, although the test indicates a difference, it does not pinpoint which
fertilisers or groups of fertilisers are different from the rest.


Figure 10.1 Fertiliser plots (six plots, one fertiliser allocated at random to each; five random observations of 50 square metres are taken from each plot)

Table 10.2 Crop yields from fertilisers


Fertiliser A B C D E F
Plot 1 2 3 4 5 6
Observation 1 4.5 4.1 3.6 3.7 3.9 3.4
Observation 2 4.3 4.0 3.7 3.9 4.1 3.9
Observation 3 – – – – – –
Observation 4 – – – – – –
Observation 5 – – – – – –
Average 4.1 4.0 3.6 3.8 3.8 3.4

However, there are many reasons that could account for a difference in crop
yields between the plots apart from the effect of the fertiliser. The slope, drainage
and acidity of the soils are some examples. If the effects of these factors could be
neutralised, a clearer idea of whether differences between fertilisers exist could be
obtained. Some of the factors can be neutralised by arranging the experiment so that
the location of the plot has no effect on the outcome. This can be done by dividing
each plot into six sections and allocating (at random) one fertiliser to each section
(see Figure 10.2).
One measurement of crop yield is made in each section within each plot. Now 36
measurements are made. They can be grouped in two different ways. They can be
thought of as six groups (one for each fertiliser), as previously, or six groups (one
for each plot). These two ways of grouping are the basis of the isolation of the
effect of plot location. Two-way analysis of variance is used to analyse the observa-
tions. Whatever the application, by convention, the first grouping (here the
fertilisers) is referred to as the treatments and the second grouping (here the plots)
is referred to as the blocks.


Figure 10.2 Fertiliser sections within plots (A, B, C, D, E and F refer to the fertilisers, one allocated at random to each of the six sections within every plot)

10.3 One-Way Analysis of Variance


A simple example will be used to illustrate one-way analysis of variance. Table 10.3
shows three treatments, each with five observations. To make the example concrete
it may be thought of as a test on three fertilisers where five measurements of crop
yield have been made for each fertiliser.

Table 10.3 Example for one-way analysis of variance


Treatment 1 Treatment 2 Treatment 3
Observation 1 4 3 7
Observation 2 7 2 5
Observation 3 5 4 6
Observation 4 6 8 4
Observation 5 3 3 8
Average 5 4 6 Grand mean = 5

The underlying idea of one-way analysis of variance is to compare the variation
between the treatments (the groups or samples) with the variation within the
treatments. If the former is significantly greater than the latter then the differences
between the means of the treatments must be significantly greater than would be
anticipated purely by chance. Thus, the treatment groups are unlikely to have come
from populations with the same mean (or a common population) and the treat-
ments can be thought of as having had a significant influence on the values taken by
the variable. The process starts with the total variation of all observations, regardless
of the treatments from which they come. This total variation, the total sum of
squares (referred to as ‘Total SS’) is calculated:
Total SS = ∑∑(xij − x̿)²
where the xij are the observations (xij being the ith observation in the jth treatment),
and x̿ is the grand mean of all observations. For example, x13 is 7 and x42 is 8.
In the example x̿ is 5 and the Total SS (starting at the first observation in the first
treatment and working down each treatment in turn) is:


Total SS = (4 − 5)² + (7 − 5)² + (5 − 5)² + (6 − 5)² + (3 − 5)² + (3 − 5)² + ⋯
= 52
Next turn to the variation between the treatments. This variation, the sum of
squares between treatments (referred to as ‘SST’), is calculated:
SST = No. observations × ∑(x̄j − x̿)²
where x̄j is the mean of the jth treatment.
SST measures the variation between group means via the deviations between
group means and the grand mean. In the example, SST is:

SST = 5 × [(5 − 5)² + (4 − 5)² + (6 − 5)²]   (Treatments 1, 2 and 3 respectively)
= 10

The initial multiplication by 5 reflects the number of observations within each
treatment group.
Finally, consider the variation within treatments. This is known as the error sum
of squares (SSE). The reason for the label ‘error’ is as follows: SST can be thought
of as explained variations since it may be accounted for by the effect of the different
‘treatments’. SSE, on the other hand, is unexplained since it is the variation of
observations that have all received the same ‘treatment’.
The calculation is made over all observations as for total SS, but this time each
deviation is measured with respect to its treatment mean rather than the grand
mean. In the example SSE is:

SSE = (4 − 5)² + (7 − 5)² + (5 − 5)² + (6 − 5)² + (3 − 5)²   (Treatment 1)
+ (3 − 4)² + (2 − 4)² + (4 − 4)² + (8 − 4)² + (3 − 4)²   (Treatment 2)
+ (7 − 6)² + (5 − 6)² + (6 − 6)² + (4 − 6)² + (8 − 6)²   (Treatment 3)
= 42

It can be shown mathematically (and it is demonstrated by the above example) that:
Total sum of squares = Treatment sum of squares + Error sum of squares
Total SS = SST + SSE
This equality is not essential to carrying out one-way analysis of variance but it
can be used as a check on the accuracy of calculations.
The next step is to calculate the mean squares (the treatment mean squares,
MST, and the error mean, MSE) from the sums of squares (SST and SSE). A mean
square is a sum of squares divided by the degrees of freedom. The degrees of
freedom of SST is c − 1 (where c = number of treatments), 1 being lost because of
the presence of the grand mean; the degrees of freedom of SSE is (r − 1)c (where r
is the number of observations for each treatment), 1 being lost from each treatment
because of the presence of the treatment mean. Degrees of freedom are defined and
calculated just as they would be outside the confines of analysis of variance.


MST = SST/(c − 1)
MSE = SSE/(r − 1)c
In the example:
MST = 10/2 = 5
MSE = 42/(4 × 3) = 3.5
Recall that the underlying idea of analysis of variance is to compare the variation
between treatments with the variation within treatments. If the former is significant-
ly greater than the latter (at some level of significance) then the differences in
treatment means must be larger than those expected purely by chance. If chance is
not causing the differences, something else must be. Therefore, it is supposed, the
treatments have a significant effect in determining the differences observed.
The relevance of the mean squares to this process is that they are the basis of a
significance test to determine whether explained variation (between treatments) is
significantly different from unexplained variation (within treatments). This is
because the ratio between MST and MSE follows an F-distribution. Proof of this
fact involves some ‘black box’ mathematics.
The significance test proceeds as follows. The hypothesis is that all treatments
come from populations with the same mean, or from a common population. If the
observed F value (MST/MSE) lies beyond the critical value, at a given level of
significance, then the hypothesis is rejected: the treatments do have a significant
effect. If the observed F value is less than the critical value then the hypothesis is
accepted: there is not a significant difference between the treatment means.
In the example:
Observed F = MST/MSE = 5/3.5 = 1.43
The critical F value for (2,12) degrees of freedom at the 5 per cent significance
level from the F-distribution table (see Appendix 1, Table A1.6) is 3.88. Observed F
is less than the critical value and therefore the hypothesis is accepted. The treat-
ments can be thought of as having no significant effect on the variable.
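The whole calculation can be reproduced in a few lines of Python. The sketch below (an illustration needing no special libraries) follows the steps above for the data of Table 10.3:

    treatments = {
        "Treatment 1": [4, 7, 5, 6, 3],
        "Treatment 2": [3, 2, 4, 8, 3],
        "Treatment 3": [7, 5, 6, 4, 8],
    }
    c = len(treatments)                                               # 3 treatments
    r = 5                                                             # observations per treatment
    grand_mean = sum(sum(v) for v in treatments.values()) / (r * c)   # 5.0
    means = {k: sum(v) / r for k, v in treatments.items()}            # 5, 4 and 6

    sst = r * sum((m - grand_mean) ** 2 for m in means.values())      # 10
    sse = sum((x - means[k]) ** 2
              for k, v in treatments.items() for x in v)              # 42
    f_obs = (sst / (c - 1)) / (sse / ((r - 1) * c))                   # MST/MSE = 5/3.5 = 1.43
    print(f_obs)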
Intuitively, the significance test can be thought to operate in the following way.
MST and MSE are the bases of two alternative ways of estimating the variance (σ²)
of the common population from which the observations are supposed to have been
taken.
MST = No. obs × ∑(x̄j − x̿)²/(c − 1)
= No. obs × Variance of sample means
is an estimate of:
No. obs × σ²/No. obs = σ²
(since the variance of sample means is an estimate of σ²/No. obs)
MSE = ∑(xij − x̄j)²/(r − 1)c, summed over all i and j,
is an estimate of σ²


But the test for differences in estimates of variances is the F test. The ratio
MST/MSE is the ratio between two estimates of a variance, and thus the F test can
be used to determine whether they are significantly different.
Suppose there were a significant difference between the two estimates of σ2, and
that the one based on MST were the larger. This would be an indication that much
more (indeed a significant proportion) of the variation between the observations is
accounted for by the effect of the different treatments rather than being a result of
random variation between observations. Treatments, then, appear to have a
significant effect on the means of observations relating to them. In other words,
there must be a difference between the effects of the treatments.
This test carries three underlying assumptions. The test is basically an F test and
two of the assumptions are exactly those described in Module 9 in relation to the F-
distribution. First, the observations are supposed to have come from a normal
distribution. Second, the observations should have been taken at random. The third
assumption is equally fundamental. The test is based on the treatment groups having
come from a common population or from populations with equal means (in
practice it does not matter which assumption is made). In the latter case, the
populations must also have equal variances. This stems from the statistical theory of
the test but, intuitively, it can be envisaged that a difference in variances would
distort the sums of the squares within groups. In either case there should not be a
significant difference in the variances of the treatment groups. This is the third
assumption, that the treatment groups have the same underlying variance.

10.3.1 ANOVA Tables


An analysis of variance is a significance test and could be set out in the five stages
described in earlier modules. Conventionally, however, analyses of variance are laid
out in a systematic form called an ANOVA table. The layout for the simple example
above is shown in Table 10.4.
This table neatly contains the earlier calculations. The test is completed by com-
paring the observed F value (from the table) with the critical F value for
(c − 1, (r − 1)c) degrees of freedom obtained from an F-distribution table.

Table 10.4 ANOVA table


Variation                                    Degrees of freedom   Sums of squares   Mean square         F
Explained by treatments (between columns)    c − 1 = 2            SST = 10          MST = 10/2 = 5      MST/MSE = 5/3.5 = 1.43
Error or unexplained (within columns)        (r − 1)c = 12        SSE = 42          MSE = 42/12 = 3.5
Total                                        rc − 1 = 14          SS = 52


Example: Evaluation of Teaching Materials


An education authority wishes to test the effectiveness of three different packages
(labelled A, B and C) of material in preparing pre-university students for an examination
in economics. Six schools are to take part in the evaluation. In each school an equal
number of students is randomly assigned to three equal groups. Each group uses a
different package in working for the examination. The eventual results (as average
percentages for each group at each school) in the examination are shown in Table 10.5.
A one-way analysis of variance will test the hypothesis that the average examination
scores for each package come from populations with the same mean. It will therefore
indicate whether the packages make a difference to eventual examination results.

Table 10.5 Economics examination results (%)


Package used
A B C
School 1 64 61 55
School 2 69 63 63
School 3 72 68 70
School 4 58 65 60
School 5 64 60 59
School 6 63 61 59
Average 65 63 61 Grand mean = 63

The systematic way of carrying out the test is to base it on an ANOVA table. The first
step is to calculate SSE, the within-group sum of squares. This is shown in Table 10.6.
The next step is the calculation of SST, the between-groups sum of squares:
SST = 6 × [(65 − 63)² + (63 − 63)² + (61 − 63)²]
= 6 × 8
= 48
Total SS can now be calculated as 344 (SST + SSE). Alternatively, it can be calculated
independently as a check on SST and SSE. The sums of squares can now be inserted in
an ANOVA table (Table 10.7).

Table 10.6 Calculation of SSE for economics examination results


Package used
            A                 B                 C
School 1    (64 − 65)² = 1    (61 − 63)² = 4    (55 − 61)² = 36
School 2    (69 − 65)² = 16   (63 − 63)² = 0    (63 − 61)² = 4
School 3    (72 − 65)² = 49   (68 − 63)² = 25   (70 − 61)² = 81
School 4    (58 − 65)² = 49   (65 − 63)² = 4    (60 − 61)² = 1
School 5    (64 − 65)² = 1    (60 − 63)² = 9    (59 − 61)² = 4
School 6    (63 − 65)² = 4    (61 − 63)² = 4    (59 − 61)² = 4
Totals      120               46                130
SSE = 120 + 46 + 130 = 296


Table 10.7 ANOVA table for economics material


Variation                                    Degrees of freedom   Sums of squares   Mean square           F
Explained by treatments (between columns)    c − 1 = 2            SST = 48          MST = 48/2 = 24       MST/MSE = 24/19.73 = 1.22
Error or unexplained (within columns)        (r − 1)c = 15        SSE = 296         MSE = 296/15 = 19.73
Total                                        rc − 1 = 17          SS = 344

The first column describes the sources of variation. The second relates to degrees of
freedom, always given by c − 1 and (r − 1)c. The sums of squares go in the third
column. The mean squares are the sums of squares divided by the degrees of freedom
and are shown in the fourth column. Finally, the observed F value is calculated to be
1.22, the ratio between the mean squares.
To finish the test, the critical F value at the 5 per cent level is found from the table of
the F-distribution. In this case the value is 3.68. The observed F value, 1.22, is much
smaller. The hypothesis is accepted. There is no significant difference in the examination
results. The packages do not appear to make a difference to examination results.
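The same test can be carried out directly by computer. A minimal Python sketch (assuming the SciPy library, whose f_oneway function performs one-way analysis of variance) is:

    from scipy.stats import f_oneway

    package_a = [64, 69, 72, 58, 64, 63]
    package_b = [61, 63, 68, 65, 60, 61]
    package_c = [55, 63, 70, 60, 59, 59]
    result = f_oneway(package_a, package_b, package_c)
    print(result.statistic)   # 1.22, well below the 5 per cent critical value of 3.68
    print(result.pvalue)      # about 0.32: the hypothesis is accepted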

10.4 Two-Way Analysis of Variance


One-way analysis tests for differences in the treatments. It does this by splitting the
variation of the data into two parts: that due to the treatments and that due to
unspecified effects (the error). However, there are likely to be sources of variation
other than the treatments. Some of them may be known. If these sources of
variation were isolated, would the conclusion be different? If another source of
variation is removed from the scene, perhaps the treatments would now be seen to
have a significant effect when previously they had not, or vice versa. This is the
question that two-way analysis of variance addresses. It allows a source of variation
to be isolated before testing the effect of the treatments. Just as the first source of
variation is conventionally referred to as the treatments, so the second source is
referred to as the blocks.
Table 10.8 shows the earlier example on crop yields taken from Table 10.3. The
conclusion from the one-way analysis of variance was that the treatments (fertilisers)
had no significant effect. But suppose that the five observations of each fertiliser
were each taken from a different plot of land and that observation i always referred
to plot i for all treatments. It is likely that the nature of the plot (slope, drainage,
etc.) would affect the crop yield. Possibly the effect is such that the effect of the
treatments is masked. Two-way analysis of variance allows the effect of the plot to
be neutralised. The first step is to calculate the mean for each block (plot or row).
This is shown in Table 10.8.


Table 10.8 Example on crop yield


Treatment 1 Treatment 2 Treatment 3 Average
Plot 1 4 3 7 4.7
Plot 2 7 2 5 4.7
Plot 3 5 4 6 5.0
Plot 4 6 8 4 6.0
Plot 5 3 3 8 4.7
Average 5 4 6 Grand mean = 5

For two-way analysis of variance an extra sum of squares, for the blocks, must be
calculated. This is labelled SSB and is calculated in the same way as SST except that
it is rows, not columns, to which the process is applied. SSB is calculated from the
deviations of the row (plot) means from the grand mean. In general:
SSB = No. of observations in block × ∑(x̄i − x̿)²
where x̄i is the mean of the ith block.
In the example:
SSB = 3 × [(4.7 − 5)² + (4.7 − 5)² + (5 − 5)² + (6 − 5)² + (4.7 − 5)²]
= 4.0
Statistically, it can be demonstrated that:
Total SS = SST + SSB + SSE
This is similar to the one-way analysis of variance. Here, the equation is of real
practical use. It is used to calculate SSE. In other words, the procedure is to
calculate Total SS, SST and SSB directly from the data, but then calculate SSE from
the summation equation.
In the example:
Total SS = 52, as before
SST = 10, as before
SSB = 4 by the above calculation
SSE = 52 − 10 − 4 = 38 by the summation equation
The mean squares are now calculated. The degrees of freedom for MSE are no
longer c(r − 1). The block (row) means have been used in the calculations and the
associated number of degrees of freedom is lost. The degrees of freedom for MSE
are now (c − 1)(r − 1), which, in the example, is 8. Therefore:
MST = 10/2 = 5 as before
MSE = 38/8 = 4.75
The F value is still MST/MSE: 5/4.75 = 1.05.
The loss of degrees of freedom because of the use of block means also affects
the F value. The critical F value against which to compare the observed value now
relates to (2,8) degrees of freedom, not (2,12) as previously. At the 5 per cent level
the critical F value for (2,8) degrees of freedom is 4.46, much greater than the
observed value. The hypothesis is still accepted. The treatments do not appear to
have a significant effect.


These workings may be seen more clearly from a two-way analysis of variance
table (Table 10.9). Column 1 describes the sources of variation. Column 2 refers to
the degrees of freedom. The degrees of freedom for the blocks is r − 1, analogously
to c − 1 being the degrees of freedom for the treatments. Degrees of freedom for
the error are (c − 1)(r − 1), as described above.

Table 10.9 Two-way ANOVA table for crop yields


Variation                                    Degrees of freedom      Sums of squares   Mean square        F
Explained by treatments (between columns)    c − 1 = 2               SST = 10          MST = 10/2 = 5     MST/MSE = 5/4.75 = 1.05
Explained by blocks (between rows)           r − 1 = 4               SSB = 4           MSB = 4/4 = 1      MSB/MSE = 1/4.75 = 0.21
Error or unexplained (within columns)        (r − 1)(c − 1) = 8      SSE = 38          MSE = 38/8 = 4.75
Total                                        rc − 1 = 14             SS = 52

The third column relates to the sums of squares. Recall that only Total SS, SST
and SSB are calculated directly. SSE is calculated from the equation:
Total SS = SST + SSB + SSE
The fourth column refers to the mean squares, calculated by dividing the sums of
squares by their degrees of freedom. The F ratio in the fifth column is formed from
MST/MSE: in this example, 1.05.
The critical F value, taken from tables, is compared to this observed F ratio. At
the 5 per cent level for the appropriate degrees of freedom (2,8), the critical F value
is 4.46. As indicated above, the observed value is less than the critical value and the
hypothesis must still be accepted. After allowing for the effect of different plots, the
fertilisers still do not appear to cause significantly different crop yields.
Example: Teaching Material for Economics (continued)
The conclusion drawn from the earlier example concerning the provision of teaching
material for economics courses was that the material had no significant effect on the
results. The hypothesis that the treatment (type of material) groups came from
populations with the same mean was accepted. However, the experiment was carried
out in several schools. It could well be the case that a large part of the variation
between the groups was accounted for by differences between the schools. If this
variation were neutralised by quantifying it and removing it from the calculations, would
the conclusion be the same? The possibility can be pursued by considering each school
as a ‘block’ and carrying out a two-way analysis of variance. The data including block
means are given in Table 10.10.


Table 10.10 Economics examination results (%) with block means


Package used
A B C Average
School 1 64 61 55 60
School 2 69 63 63 65
School 3 72 68 70 70
School 4 58 65 60 61
School 5 64 60 59 61
School 6 63 61 59 61
Average 65 63 61 Grand mean = 63

The sum of squares of the blocks (SSB) can now be calculated from the average
examination result of each school:
SSB = 3 × [(60 − 63)² + (65 − 63)² + (70 − 63)² + (61 − 63)² + (61 − 63)² + (61 − 63)²]
= 3 × (9 + 4 + 49 + 4 + 4 + 4)
= 222
A two-way ANOVA is laid out as in Table 10.11. From the table the observed F value is
3.24. From F-distribution tables the critical F value for (2, 10) degrees of freedom at the
5 per cent significance level is 4.10. The hypothesis is accepted. There seem to be no
significant differences between the effects the teaching material has on the results.

Table 10.11 Two-way ANOVA table for teaching material example


Variation                  Degrees of freedom     Sums of squares   Mean square          F
Explained by treatments    c − 1 = 2              SST = 48          MST = 48/2 = 24      MST/MSE = 24/7.4 = 3.24
(between columns)
Explained by blocks        r − 1 = 5              SSB = 222         MSB = 222/5 = 44.4   MSB/MSE = 44.4/7.4 = 6.0
(between rows)
Error or unexplained       (r − 1)(c − 1) = 10    SSE = 74          MSE = 74/10 = 7.4
(within columns)
Total                      rc − 1 = 17            SS = 344

10.5 Extensions of Analysis of Variance


The applications of the ideas of the subject go well beyond the basic cases of one-
and two-way analysis of variance. The theory can be developed in several directions.
These extensions will be outlined here but not covered in depth.
The first extension is the obvious one that two-way analysis of variance can be
used to test the effect of blocks as well as treatments. In the teaching material
example the hypothesis that the schools have no significant effect on the results can
also be tested. This is done by calculating an observed F value comparing the mean
square for blocks with the error mean square.


Using the data of Table 10.11:


MSB = 44.4
MSE = 7.4
Observed F = 44.4/7.4 = 6.0
The critical F value at the 5 per cent level for (5,10) degrees of freedom is 3.33.
The observed F exceeds the critical F. The hypothesis is rejected. The results for
various schools are significantly different.
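The comparison can be scripted as a quick check. The sketch below assumes scipy is available and uses its F distribution for the critical value:

    from scipy.stats import f

    # Blocks (schools) F test for the teaching-material example.
    msb, mse = 44.4, 7.4
    observed_f = msb / mse                    # 6.0
    critical_f = f.ppf(0.95, dfn=5, dfd=10)   # ≈ 3.33 at the 5 per cent level
    print(observed_f > critical_f)            # True: the hypothesis is rejected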
A second extension is the use of an interaction variable. An implicit assumption
in two-way analysis of variance has been that blocks and treatments have additive
effects. A block or treatment is supposed to raise the mean of a group by some
constant amount from some ‘background’ level. It may well be the case that blocks
and treatments interact. For instance, a block may only have an effect in the
presence of a particular treatment and not otherwise. In the teaching material
example it may be that certain sets of material have an effect at some schools but
not others. This problem is handled by the calculation and the inclusion of a sum of
squares due to interaction. The procedure involves no more than a minor extension
of the basic two-way ANOVA table.
In all the work so far a balanced design has been assumed. The groups or sam-
ples have all been of the same size. It is possible to do analysis of variance where
there is not a balanced design. This procedure, however, involves considerable
difficulties and complications compared to the basic case. It is usual to make
extensive efforts to ensure a balanced design. In other words, every effort is made at
the time a test is designed to ensure that the sample size for each treatment is the
same. The overall saving invariably compensates for the initial extra effort.
Treatments and blocks are referred to as factors. One- and two-way analyses of
variance deal with one and two factors respectively. They can be extended to multi-
factor situations. Several causes of variation can be investigated, with a correspond-
ing increase in the number crunching. The principles are the same as for basic
analysis of variance.
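In practice a statistics package does the number crunching. As a hedged illustration (assuming pandas and statsmodels are installed; the column names 'treatment', 'plot' and 'y' are invented for the example), a two-way ANOVA table for the crop-yield data could be produced as follows, and further factors would simply be added to the formula:

    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    # Rebuild the crop-yield data of Table 10.8 in 'long' form.
    records = [(t + 1, p + 1, y)
               for p, row in enumerate([[4, 3, 7], [7, 2, 5], [5, 4, 6],
                                        [6, 8, 4], [3, 3, 8]])
               for t, y in enumerate(row)]
    df = pd.DataFrame(records, columns=['treatment', 'plot', 'y'])

    # Treatments and blocks as categorical factors.
    model = ols('y ~ C(treatment) + C(plot)', data=df).fit()
    print(sm.stats.anova_lm(model))   # reproduces the figures of Table 10.9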

Learning Summary
Analysis of variance is one of the most advanced topics of modern statistics. It is far
more than an extension of two-sample significance tests, for it allows significance
tests to be approached in a much more practical way. The additional sophistication
allows significance tests to be used far more realistically in areas such as market
research, medicine and agriculture.
In practical situations there is a close association between analysis of variance and
research design. Although multi-factor analysis of variance is theoretically possible,
attempts to carry out such tests can involve large amounts of data and computing
power. Moreover, large and involved pieces of work can be more difficult to
comprehend conceptually than statistically. The results often present enormous
problems of interpretation. Consequently, before one embarks upon lengthy
analyses, time must be spent planning the research so that the eventual statistical


testing is as simple as possible. This process is known as experimental design. It
offers methods of isolating the main effects as simply as possible. If it is at all
possible, multi-factor analysis of variance should only be undertaken after very
careful planning of the research.

Review Questions
10.1 Analysis of variance is a significance test applied to several samples. Which of the
following hypotheses could it test?
A. The samples come from populations with equal means.
B. The samples come from populations with equal variances.
C. The samples come from a common population.
D. The samples are normally distributed.
Questions 10.2 to 10.5 refer to the following situation:
A one-way analysis of variance is carried out for five treatment groups, each with 12
observations. Calculations have shown Total SS = 316, SST = 96.

10.2 Calculate SSE.


A. 220
B. 412
C. 212
D. 292

10.3 Calculate MST.


A. 1.75
B. 19.2
C. 8
D. 24

10.4 Calculate the observed F value.


A. 1.2
B. 0.17
C. 4
D. 6

10.5 If the critical value at the 5 per cent significance level for (4,55) degrees of freedom is
2.54, then the conclusion to be drawn is that the treatments do have a significant effect.
True or false?

10.6 Analysis of variance is a significance test that is applied to several samples. Each sample
should have which of the following attributes?
A. Be randomly selected.
B. Have the same mean as the other samples.
C. Have the same variance as the other samples.
D. Come from a population with a normal distribution.


Questions 10.7 to 10.9 refer to the data in Table 10.12.

Table 10.12 Data table


Treatments Average
3 7 4 2 4
2 6 2 2 3
Observations 8 7 3 6 6
4 2 5 1 3
8 3 1 4 4
Average 5 5 3 3 Grand mean = 4

10.7 Calculate Total SS.


A. 20
B. 100
C. 16
D. 24
E. 96

10.8 Calculate SST.


A. 20
B. 4
C. 16
D. 24
E. 6

10.9 Calculate SSB.


A. 20
B. 4
C. 16
D. 24
E. 6

10.10 An analysis of variance with a balanced design is one in which the number of treatments
(columns) equals the number of blocks (rows). True or false?

Case Study 10.1: Washing Powder


1 Some controversy rages over the extent to which automatic washing powders can cause
irritation to the skin. There is some evidence that when clothes washed in certain
powders are worn, there is an increased incidence of mild skin disorders such as
eczema. People who have never suffered from skin problems do so; chronic sufferers
find their problems aggravated.


A consumer organisation is conducting a test on the six leading brands. It is difficult to
measure skin irritation accurately, but recent research work has provided an acceptable
means of doing so. Several factors, such as acidity, roughness, etc. are measured for a
batch of newly washed clothing and an index is formed by weighting and aggregating the
results. This index is the overall measure of skin irritation for the batch of washing.
The test applied to the six leading brands involves ten households washing ‘standard’
batches of clothing according to their usual practice. For each brand and for each
household the index is calculated. The outcome is 60 observations of the index, ten
(one per household) for each brand of powder. The results are shown in Table 10.13.
Carry out a one-way analysis of variance test to indicate whether there are differences
in the levels of irritation caused by the six leading brands of powder. What qualifications
would you wish to attach to your conclusions?

Table 10.13 Irritation index for six leading washing powders


Brand
Household 1 2 3 4 5 6
1 97 106 112 105 101 108
2 102 109 98 93 102 113
3 95 100 101 99 97 104
4 99 104 101 96 100 108
5 98 103 105 99 105 103
6 100 101 103 101 99 104
7 95 99 102 97 101 108
8 96 103 105 99 103 104
9 103 105 108 101 105 110
10 98 99 101 95 100 102
Average 98.3 102.9 103.6 98.5 101.3 106.4 Grand mean = 101.8

Case Study 10.2: Hypermarkets


1 A chain of hypermarket stores is concerned about its reputation for service and value as
well as other aspects of its relationship with customers. It has decided to conduct an
interview survey at all its eight superstores, which are located throughout the United
Kingdom. The interviews, 600 at each store, will all be conducted during the same week
and spread over the five days Monday to Friday (i.e. there will be 120 interviews per day
per store).
The interviewees will be given a list of attributes that might apply to the store (such as
‘gives value for money’, ‘provides top-quality products’, etc.) and asked to tick those
they believe describe the products and service given by the store. One of the attributes
is ‘courteous staff’. Table 10.14 shows the results with respect to this attribute. The
number in each cell of the table is the number of people (out of 120) at each store and
on each day who ticked the attribute ‘courteous staff’.


Table 10.14 Hypermarket survey: positive responses to the attribute


‘courteous staff’
Store
B C D G L M N S
Monday 71 73 66 69 58 60 70 61
Tuesday 71 78 81 89 78 85 90 84
Wednesday 73 78 76 86 74 80 81 76
Thursday 73 75 73 80 75 71 73 72
Friday 62 66 69 81 60 64 61 57
Note: Store legend: B Bristol; C Cardiff; D Derby; G Glasgow; L Luton; M Manchester; N Newcastle; S Southampton.

a. Carry out a two-way analysis of variance to determine whether the results indicate a
difference between stores in the responses to the attribute ‘courteous staff’.
b. Why, in testing for this difference, is it important to allow for the effect of the day of
the week on interviewees’ responses?
c. Would the outcome of the test be different if this influence were not neutralised?
d. Do the days of the week have a significant influence on responses, after allowing for
the effect of location?
e. Intuitively, is it likely to be worth pursuing the question of an interaction effect in
this situation?



PART 4

Statistical Relationships
Module 11 Regression and Correlation
Module 12 Advanced Regression Analysis



Module 11

Regression and Correlation


Contents
11.1 Introduction.......................................................................................... 11/1
11.2 Applications .......................................................................................... 11/3
11.3 Mathematical Preliminaries ............................................................... 11/4
11.4 Simple Linear Regression ................................................................... 11/7
11.5 Correlation ........................................................................................... 11/9
11.6 Checking the Residuals ..................................................................... 11/13
11.7 Regression on a Personal Computer (PC) ...................................... 11/15
11.8 Some Reservations about Regression and Correlation ................. 11/19
Learning Summary ....................................................................................... 11/22
Review Questions ......................................................................................... 11/23
Case Study 11.1: Railway Booking Offices ................................................. 11/25
Case Study 11.2: Department Store Chain ............................................... 11/26

Prerequisite reading: Module 2

Learning Objectives
Regression and correlation are concerned with relationships between variables. By
the end of this module the reader should understand the basic principles of these
techniques and where they are used. He or she should be able to carry out simple
analyses using a calculator or a personal computer. The many pitfalls in practical
applications should also be known.
The module deals with simple linear regression and correlation at a non-statistical
level. The aim is to explain conceptually the principles underlying these topics and
highlight the management issues involved in their application. The next module
extends the topics and describes the statistical background.

11.1 Introduction
Regression and correlation are concerned with relationships between variables. They
investigate whether a variable is related statistically to one or more other variables
that are thought to ‘cause’ changes in it. Regression is a method for determining
the mathematical formula relating the variables; correlation is a method for
measuring the strength of the relationship. Regression shows what the connection
is; correlation shows whether the connection is strong enough to merit using it.
For example, suppose a company wishes to investigate the relationship between
the sales volume for one of its products and the amount spent advertising it. The


purpose might be to forecast future sales volumes or it might be to test the effec-
tiveness of the advertising. Regression and correlation are based on past data and
therefore the first step is to unearth the historical records of both variables. Perhaps
the quarterly sales volume and the quarterly advertising expenditure over several
years are available. If both variables are plotted graphically, the scatter diagram of
Figure 11.1 results. Each point (or observation) refers to one quarter. For instance,
the point A refers to the quarter when advertising expenditure was 12 (in £000) and
sales volume was 36 (in 000).

[Scatter diagram: quarterly sales volume (y) (000) on the vertical axis against quarterly advertising expenditure (x) (£000) on the horizontal axis; point A lies at x = 12, y = 36.]

Figure 11.1 Quarterly sales and advertising


From the scatter diagram it appears that there is a rough straight-line relationship
between the variables. Regression will provide the mathematical formula linking
them together. It might be:
Sales volume = 21.3 + 1.1 × Advertising expenditure
This formula could then be used to predict sales in a future quarter for which
advertising expenditure has been decided. For instance, if advertising expenditure in
a future quarter was budgeted to be £12 000, then the sales volume would be
forecast at 34 500 units (21.3 + 1.1 × 12 = 34.5, in 000).
Note that, even though it is being suggested that there is a linear relationship
between sales and advertising, the points do not lie exactly on a straight line. When
‘advertising expenditure = 12’ is put into the equation, sales volume does not equal
36, as it did at point A, since A is not on the line. In this sense regression is an
approximation. It involves finding the ‘best’ of many possible formulae that could
link the variables. Incidentally, 34.5 is preferable to 36 as a forecast because it uses
information from several quarters; 36, on the other hand, relates to just one quarter
and one-off factors may have influenced sales volume. In a sense, regression
‘averages out’ such factors.
While regression finds the formula, correlation will indicate the strength of the
relationship. It will provide a measure of the extent to which the sales volume had
moved in parallel with advertising expenditure in the past (i.e. whether a high level
of sales in a quarter had tended to correspond with a high advertising expenditure in


the same quarter; and low with low). Correlation will reinforce the intuitive evidence
of the scatter diagram.
The expression regression analysis is often used to include both regression and
correlation.
The purpose of this module is to show where regression and correlation can be
used, to illustrate the underlying principles and to point out the pitfalls in their use.
Initially, only simple linear regression will be described. ‘Simple’ in this context
means involving only two variables; linear means that the relationship is some form
of straight line (as opposed to a curve). In the next module the description will be
extended to other types of regression. On the assumption that anyone doing any
serious regression or correlation will have the help of some form of calculator or
computer, the mathematics are kept to a minimum. Unfortunately, to gain any more
than a merely superficial understanding of the subject, some technical details are
necessary. The mathematical preliminaries will be tackled immediately after the next
section, which describes some practical applications of the techniques.

11.2 Applications
11.2.1 Forecasting
Forecasting is frequently based on regression analysis. The variable to be forecast is
regressed against another variable thought to cause changes in it. This example is of
a furniture manufacturer forecasting revenue by relating it to a measure of national
economic activity. It is reasonable to do this since it is likely that the business of
such an organisation will be influenced by the economic environment. The econom-
ic variable chosen is the UK gross domestic product (GDP). Annual data for
furniture sales and GDP over a ten-year period are shown in a scatter diagram in
Figure 11.2.

[Scatter diagram: sales (£m, scale 5.0 to 10.0) against GDP (£bn, scale 0 to 300); ten annual observations.]

Figure 11.2 Furniture sales and GDP


Figure 11.2 suggests that there is a strong relationship between the variables.
There is a clear tendency for high values of one variable to be associated with high
values of the other, and low with low. Correlation can measure this strength. Regression can determine the


formula linking the two sets of numbers (i.e. can provide the equation of the straight
line that is as near as possible to all ten points). Then, given an estimate of GDP for
future years, estimates of furniture sales volumes can be calculated from the
equation.

11.2.2 Explaining
On occasions it is helpful to understand the extent to which two variables are
related, even when there is no intention to use the relationship for forecasting. The
alleged link between smoking and lung cancer is an obvious example.
In organisational research into the possibility of a relationship between the sala-
ries and weights of top executives, data on salary and weight for a random sample of
top executives in a selection of US companies were collected. The scatter diagram is
given in Figure 11.3. The data are cross-sectional because the observations relate
to different people at one point in time; the data in Figure 11.2 are time series
because each observation relates to a different point in time.
Figure 11.3 shows a relationship but not a strong one. In general, high salaries are
associated with low weights and low salaries with high weights. But the association
is loose. The points are not very close to a straight line. There is a weak correlation
between the variables. Since high is associated with low and vice versa, the correla-
tion is said to be negative. In the furniture sales application the correlation was
positive since high was associated with high and low with low.

[Scatter diagram: salary (£000, scale 20 to 50) against weight (kg, scale 40 to 100); a loose downward-sloping pattern.]

Figure 11.3 Salaries and weights for top executives


The purpose of this analysis is to gain greater understanding. Clearly, it would not
be used for prediction. The idea is not to tell an executive how many kilograms he
or she should lose in order to secure a 20 per cent rise in salary. The statistical
results showing a correlation would be just the start of an investigation into why
heavy people appear to be paid less.

11.3 Mathematical Preliminaries


Simple linear regression refers to the case where two variables, when plotted on a
graph, have an (approximately) straight-line relationship. Mathematically this means
that the equation linking the variables is of the form:


y = a + bx (bx stands for b multiplied by x)


where y and x are the variables, and a and b are fixed numbers or constants. In the
example shown in Figure 11.1, sales volume was y, advertising expenditure was x, a
was 21.3 and b was 1.1.
Simple linear regression is the task of finding the values of a and b that provide
the best connection between the two variables. As yet, ‘best’ remains undefined.

11.3.1 The Equation of a Straight Line


The equation of a straight line is:
y = a + bx
(a) a is the intercept: the value of y at the point where the line crosses the y axis.
This can be verified by noting that y = a when x = 0.
(b) b is the slope: the change in y for a change in x of one unit. Some algebraic
manipulation can verify this fact.
The slope can be positive or negative. In Figure 11.4(a) an increase in x is associ-
ated with an increase in y, and u and v are therefore both positive. Consequently, b
(= u/v) must be positive. In Figure 11.4(b), an increase in x is associated with a
decrease in y. Therefore, while v is positive, u is negative. Consequently, b must be
negative. Figure 11.4(a) corresponds to positive correlation and Figure 11.4(b) to
negative correlation.
Determining the equation of a straight line amounts to finding the values of a
and b. Once a and b are known, the line is completely determined. Linear regression
is the task of finding the values of a and b that provide the best connection between
the two variables.


[Two sketches of the line y = a + bx: (a) a line with positive slope, showing the intercept a where the line crosses the y axis and the slope b = u/v, where u is the change in y produced by a change in x of v; (b) a line with negative slope, where u is negative while v is positive.]

Figure 11.4 Straight line (a) with positive slope; (b) with negative slope

11.3.2 Residuals
Figure 11.5 is an example of a scatter diagram. If we draw any straight line through
the set of points, the points will not in general fall exactly on the straight line.
Consider the first point, A, at x = 1. The y value at A is the actual y value. The
point directly below A that lies on the line is B. The y value at B is the fitted y
value. If the equation of the line is known, then the fitted y value is obtained by
putting the x value at A into the equation and calculating y. If at A the actual y value
is 12 and the line is y = 10 + 0.5x, the fitted y value is as follows:
Fitted value = 10 + 0.5 × 1 = 10.5
The difference between actual and fitted y values is the residual:
Residual = Actual value − Fitted value
At A, the residual is 12.0 − 10.5 = 1.5. If the point lies above the line, the residu-
al is positive; if it is below, the residual is negative; if it is on the line, the residual is
zero. Each point has a residual. The residuals would, of course, be different if a
different line were drawn.


[Scatter diagram with a fitted line: point A at x = 1 has actual y = 12; point B, directly below A on the line, has fitted y = 10.5; the vertical distance between A and B is the residual.]

Figure 11.5 Definition of residuals

11.4 Simple Linear Regression


In deciding which straight line is the best straight line through a set of points, a
criterion is required. There must be some way of defining in what way ‘best’ is best.
Since the line is to be close to the actual values (the points), the criterion should be
something to do with making the residuals as small as possible. One approach
would be to make the ‘best’ line the one for which the sum of the residuals is a
minimum compared with any other line drawn through the points. This does not
work since positive and negative residuals will cancel with one another and so a line
with large residuals can still have the sum of its residuals very small or even zero.
Indeed it can be proved that any line through the mean value of x (x̄) and the mean
value of y (ȳ) will have the sum of its residuals equal to zero (see Figure 11.6; a numerical check is sketched below).

[Sketch: a line passing through the point (x̄, ȳ), annotated ‘Sum of residuals = 0’.]

Figure 11.6 A line with a zero sum of residuals
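This zero-sum property is easily confirmed numerically. The sketch below (numpy assumed; the data are the illustrative points used in the example of Section 11.4) forces several different lines through (x̄, ȳ) and shows each residual sum vanishing:

    import numpy as np

    x = np.array([1, 2, 4, 5, 8], dtype=float)
    y = np.array([20, 19, 34, 30, 47], dtype=float)

    for b in (0.0, 2.0, 7.5):                # three arbitrary slopes
        a = y.mean() - b * x.mean()          # forces the line through (x̄, ȳ)
        residuals = y - (a + b * x)
        print(round(residuals.sum(), 10))    # 0.0 in every case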


A second approach would be to make the sum of the absolute values of the
residuals as small as possible. (Recall that the absolute value of a number is its size
when the sign is ignored; e.g. the absolute value of +6 is +6 and the absolute value


of −6 is +6.) This avoids the problem of positives and negatives cancelling. This
would work well except that the device of taking absolute numbers is not easy to
manipulate mathematically. In the past the absence of efficient computers made this
an important consideration. For this and other, more statistical, reasons the criterion
has been rarely used, though the growing availability of computers has led to an
increased usage in recent years.
The third approach, and the one traditionally employed, is called least squares.
The negative signs are eliminated by squaring the residuals (a negative number
squared is positive). The sum of the squared residuals is a minimum for the best
line. In other words, the best line is the one that has values for a and b that make
the sum of squared residuals as small as possible. The criterion in least squares
regression can be stated: sum (residuals squared) is a minimum. (The least
squares method also has some technical advantages over alternative criteria, which
will not be pursued here.)
Although the least squares criterion has been selected, it is not immediately obvi-
ous how it should be used to find the equation of the best line. Finding the equation
of the line means finding values for a and b in y = a + bx. The least squares criterion
has to be turned into formulae for calculating a and b. The means of the transfor-
mation is differential calculus, a level of mathematics beyond the scope of this
module. Differential calculus will be left in its ‘black box’ and a jump made straight
to the least squares regression formulae:
For two variables, labelled x and y, and n paired observations on those variables,
the least squares regression line of y on x is given by:
y = a + bx
where:
b = ∑(x − x̄)(y − ȳ)/∑(x − x̄)²
and:
a = ȳ − bx̄
Fortunately the availability of computers and calculators means that these formu-
lae rarely have to be used manually. However, using the formulae on a simple
example does provide a better intuitive understanding of regression.
Example
Find the regression line for the points:

x 1 2 4 5 8 Mean(x) = 4
y 20 19 34 30 47 Mean(y) = 30

∑( − )( − ) = (1 − 4)(20 − 30) + (2 − 4)(19 − 30) + (4 − 4)(34 − 30)


+(5 − 4)(30 − 30) + (8 − 4)(47 − 30)
= 30 + 22 + 0 + 0 + 68
= 120


∑(x − x̄)² = (1 − 4)² + (2 − 4)² + (4 − 4)² + (5 − 4)² + (8 − 4)²
= 9 + 4 + 0 + 1 + 16
= 30
From the ‘black box’ formulae above:
b = 120/30 = 4
a = 30 − 4 × 4 = 14
The regression line is therefore:
y = 14 + 4x
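The same arithmetic can be expressed as a short Python sketch (numpy assumed), applying the least squares formulae directly:

    import numpy as np

    x = np.array([1, 2, 4, 5, 8], dtype=float)
    y = np.array([20, 19, 34, 30, 47], dtype=float)

    # b = Sum(x − x̄)(y − ȳ) / Sum(x − x̄)²,  a = ȳ − b·x̄
    b = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
    a = y.mean() - b * x.mean()
    print(a, b)    # 14.0 and 4.0, i.e. y = 14 + 4x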

11.5 Correlation
The formulae for a and b can be applied to any set of paired data. A regression line
can therefore be found for any group of points. The scatter diagram may show that
the points lie approximately on a circle, but regression will still find the equation of
the line that is closest to them. In other words, regression finds the best line for a
set of points but does not reveal whether a line is a good representation of the
points. Correlation fills this gap. It helps to decide whether, in view of the closeness
(or otherwise) of the points to a line, the regression line is likely to be of any
practical use. It does this by quantifying the strength of the linear relationship (i.e. it
measures whether, overall, the points are close to a straight line).

11.5.1 Correlation Coefficient


The actual measure of the strength of the relationship is called the correlation
coefficient (denoted by r). The formula by which r is calculated is:
r = ∑(x − x̄)(y − ȳ)/√[∑(x − x̄)² · ∑(y − ȳ)²]

An intuitive explanation of why the coefficient does measure correlation and an
example of the use of the formula will be given later.


[Five sketch scatter diagrams illustrating values of r:
r = −1 occurs when there is perfect correlation (all the points are on the line) and the line has a negative slope.
r = about −0.8 occurs when the points lie reasonably close to a negatively sloped straight line.
When r is close to zero the points are not at all like a straight line.
r = about 0.8 occurs when the points lie reasonably close to a positively sloped straight line.
r = 1 occurs when there is perfect correlation and the points are on a positively sloped straight line.]

Figure 11.7 Examples of correlation coefficients


The correlation coefficient, r, can take on all values between −1 and +1. Correla-
tion coefficients close to −1 or +1 always indicate a strong linear relationship
between the variables. In such a case the scatter diagram would show the points
falling approximately along a line. Close to 0, correlation coefficients indicate a weak
or non-existent linear relationship. For example, the scatter diagram might show the
points following a circular pattern. Figure 11.7 illustrates the meaning of different
values of r.
An intuitive understanding of the way the correlation coefficient works can be
gained by considering the square of the correlation coefficient. By convention, but
for no apparent reason, it is written with a capital letter, R². The intuitive explana-
tion of R² runs like this: before carrying out a regression analysis one can measure
the total variation in the y variable by:
Total variation = ∑(y − ȳ)²
This expression measures the extent to which y varies from its average value. It
uses squares for the same reason that the residuals are squared in the least squares
criterion: to eliminate negatives. It measures scatter in a similar manner to variance
in summary measures.
Part of this total variation in y can be thought of as being caused by the x varia-
ble. This is the purpose of regression and correlation: to investigate the extent to
which changes in y are affected by changes in x. The variation in y that is caused by
x is called the explained variation. It can be measured from the difference between
the fitted y value and the average y value:
Explained variation = ∑(Fitted y − ȳ)²
Since the fitted y values are calculated from the regression equation, this variation
is understood, hence ‘explained’. Another part of the total variation in y is ‘unex-
plained’. This is the variation in the residuals, the ‘left-over’ variation:
Unexplained variation = ∑(Residuals)²
Note that it is this unexplained variation that least squares regression minimises.
Using some more ‘black box’ mathematics, it is possible to show that:
Total variation = Explained variation + Unexplained variation
This fact is used to define the correlation coefficient:
R² = Explained variation/Total variation
That this formula is fully compatible with the earlier formula for r will be demon-
strated in a later example. The formula for R² indicates why the correlation
coefficient squared measures the strength of a linear relationship. If:
R² = 1
then:
Explained variation = Total variation
and:
Unexplained variation = 0


If:
Unexplained variation = 0
then the sum of the squared residuals is 0, and therefore all the residuals must individually be
equal to 0. The points must all lie exactly on the line. R² = 1 thus signifies perfect
correlation. If:
R² = 0
then:
Explained variation = 0
and:
Unexplained variation = Total variation
In this case the regression line does not explain variations in y in any way. The
residuals vary just as much as the original values of y. The points are not at all like a
straight line.
Because variation is measured in squared terms it is labelled R-squared. Some-
times a computer regression program will print out r, sometimes R-squared and
sometimes both. r has the advantage of distinguishing between positive and negative
correlation (and some technical statistical advantages); R² has the advantage of a
better intuitive meaning.
To summarise, the essence of correlation is that when a regression analysis is
carried out, the variation in the y variable is split into two parts: (a) a part that is
explained by virtue of associating the y values with the x, and (b) a part that is
unexplained since the relationship is an approximate one and there are residuals.
The correlation coefficient squared tells what proportion of the original variation in
y has been explained by associating y with x (i.e. drawing a line through the points).
The higher the proportion, the stronger the correlation is said to be. It is often
sufficient to make a judgement as to whether R² is high enough to be useful in a
particular situation. In most cases, 0.75 or more would be regarded as highly
satisfactory; 0.50–0.75 would be adequate; below 0.50 would give rise to serious
doubts. However, there are statistical methods by which one can be more precise.
They will be covered in the next module.
Example
This is the same example that was used for calculating the regression coefficients a and
b:

x 1 2 4 5 8 Mean(x) = 4
y 20 19 34 30 47 Mean(y) = 30

Some calculations have been made already:


∑( − )( − ) = 120
∑( − ) = 30


All that is needed to be able to use the formula for r is:


∑(y − ȳ)² = (20 − 30)² + (19 − 30)² + (34 − 30)² + (30 − 30)² + (47 − 30)²
= 100 + 121 + 16 + 0 + 289
= 526
r = ∑(x − x̄)(y − ȳ)/√[∑(x − x̄)² · ∑(y − ȳ)²]
= 120/√(30 × 526)
= 120/126
= 0.95
The correlation coefficient is close to 1, indicating a strong positive correlation. Exactly
the same result can be achieved using the formula for R².
To use the formula for R², it is first necessary to work out the fitted y values as a step
in obtaining the explained variation. The fitted values result from putting the x values
into the regression equation. Recall that the regression equation is:
y = 14 + 4x
For x = 1, fitted y = 14 + 4 × 1 = 18
For x = 2, fitted y = 22
For x = 4, fitted y = 30
For x = 5, fitted y = 34
For x = 8, fitted y = 46
Explained variation = ∑(Fitted y − ȳ)²
= (18 − 30)² + (22 − 30)² + (30 − 30)²
+ (34 − 30)² + (46 − 30)²
= 144 + 64 + 0 + 16 + 256
= 480
Total variation = ∑(y − ȳ)²
= 526 (calculated previously)
R² = Explained variation/Total variation
= 480/526
= 0.91
The formula for r resulted in:
r = 120/√(30 × 526)
Squaring:
r² = 120²/(30 × 526)
Cancelling by 30:
r² = 480/526
This is precisely equivalent to the result obtained from the formula for R².
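Both routes to the correlation can be checked with a few lines of Python (numpy assumed):

    import numpy as np

    x = np.array([1, 2, 4, 5, 8], dtype=float)
    y = np.array([20, 19, 34, 30, 47], dtype=float)

    sxy = ((x - x.mean()) * (y - y.mean())).sum()   # 120
    sxx = ((x - x.mean()) ** 2).sum()               # 30
    syy = ((y - y.mean()) ** 2).sum()               # 526

    r = sxy / np.sqrt(sxx * syy)                    # ≈ 0.95

    fitted = 14 + 4 * x
    explained = ((fitted - y.mean()) ** 2).sum()    # 480
    print(r ** 2, explained / syy)                  # both ≈ 0.91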

11.6 Checking the Residuals


Once the regression equation has been found, the residuals can be calculated as the
difference between actual and fitted y values. In the simple example used to
calculate the regression and correlation, the residuals can be found as follows:


x    Actual y    Fitted y              Residual (Actual y − Fitted y)
1    20          18 (= 14 + 4 × 1)     2 (= 20 − 18)
2    19          22 (= 14 + 4 × 2)     −3 (= 19 − 22)
4    34          30 (= 14 + 4 × 4)     4 (= 34 − 30)
5    30          34 (= 14 + 4 × 5)     −4 (= 30 − 34)
8    47          46 (= 14 + 4 × 8)     1 (= 47 − 46)
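A brief sketch of the same residual calculation (numpy assumed):

    import numpy as np

    x = np.array([1, 2, 4, 5, 8], dtype=float)
    y = np.array([20, 19, 34, 30, 47], dtype=float)

    fitted = 14 + 4 * x
    residuals = y - fitted
    print(residuals)    # [ 2. -3.  4. -4.  1.]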

The reason why the residuals of a regression analysis are important, apart from
their role as the basis of the least squares criterion, is that they help to decide
whether the variables are linearly related. The correlation coefficient is one test for
deciding whether the two variables are linked by a linear equation. By itself, howev-
er, this test is not sufficient. A further test is that the residuals should be random,
meaning that they should have no pattern or order.
Figure 11.8 shows a scatter diagram for which the correlation coefficient is high
but for which a straight line does not fully represent the way in which the variables
are linked. There is another effect, probably a seasonal pattern or cycle, which
should be incorporated (in a way as yet unspecified) into the equation. In short, if
the unexplained variation in y (the residuals) is random, then, no matter how large
this variation may be, a linear equation is the best way to represent the relationship.
If the unexplained variation has a pattern, then a linear equation is not, by itself, the
best way to represent the relationship. More can be done to improve the model
expressing the relationship between the variables. Should the purpose of the
regression be to forecast, Figure 11.9 shows how different the predictions might be
if this pattern were not incorporated.

Figure 11.8 Non-random residuals


[Sketch: beyond ‘Now’ on the x axis, the forecast that incorporates the pattern diverges from the straight-line forecast made without it.]

Figure 11.9 Forecasting with non-random residuals


The residuals should therefore be checked for randomness as part of the process
for deciding whether a linear equation is an adequate way of expressing the connec-
tion between y and x. The first way of testing for randomness is visual. The scatter
diagram with the best-fit line drawn in is inspected for patterns (see Figure 11.8).
Alternatively, a scatter diagram can be drawn plotting the residuals against the fitted
y values. This shows more directly the size of the residuals and any pattern in them
(see Figure 11.10). In Figure 11.10 a visual test is all that is necessary to detect the
obvious pattern. This is frequently the case in practice, but inevitably situations exist
that are not so clear-cut. There may be a hint of a pattern but it is not definite. In
these circumstances statistical tests of randomness are a more precise approach to
the problem. Many statistical tests for randomness exist, and one is described in
Module 12.

1
Residuals

–1 Fitted y

–2

–3

Figure 11.10 Scatter diagram of residuals/fitted
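The visual check itself is a one-plot job. Below is a hedged sketch, assuming matplotlib is installed and reusing the residuals from the earlier worked example:

    import matplotlib.pyplot as plt

    fitted = [18, 22, 30, 34, 46]
    residuals = [2, -3, 4, -4, 1]

    plt.scatter(fitted, residuals)
    plt.axhline(0, linewidth=0.5)   # a pattern around this axis is the warning sign
    plt.xlabel('Fitted y')
    plt.ylabel('Residuals')
    plt.show()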

11.7 Regression on a Personal Computer (PC)


The situations investigated so far have involved simple linear regression with small
data sets (few observations). Even so, the calculations of coefficients were lengthy
and rather tedious. Fortunately, all personal computers and even sophisticated hand
calculators are capable of carrying out regression and correlation. The skill of


regression analysis is no longer accurate number crunching but rather the ability to
interpret the vast outputs of computer regression packages.
Four steps in regression and correlation have been dealt with so far. They are:
(a) inspecting the scatter diagram;
(b) calculating the regression coefficients;
(c) calculating the correlation coefficient;
(d) checking the residuals for randomness.
All four steps can be carried out on a PC. The exact commands that would have
to be issued would depend upon the package itself (and could be found from its
manual), but the principles and the general form of the output will be the same.
The following situation will be used to demonstrate how a package might work.
A retail clothing company is trying to forecast its sales of clothes for four-year-old
children as part of its corporate plan. As a first step, a regression model relating
nationwide sales to the birth rate four years previously is investigated. It makes
sense to consider this relationship since sales must be related in some way to the
number of children needing clothes. Annual data on sales (in constant price terms)
and the birth rate for the last 20 years are available on a PC. The variables were
labelled ‘sales’ and ‘births’ when they were entered into the computer.

11.7.1 Inspect the Scatter Diagram


To produce a scatter diagram the computer will ask for a variable to go on the
vertical axis and one to go on the horizontal axis. The answers should be ‘sales’ and
‘births’ respectively.
Figure 11.11 is a scatter diagram relating the data on clothing sales with the birth
rate. There are 20 points on the diagram, one for each of the 20 years. Each point
marks the sales and birth rate in a particular year. The diagram confirms that there is
an approximate straight-line relationship between the two variables and therefore it
makes sense to consider simple linear regression in this case. Had the data looked
like Figure 11.12, there would have been no evidence of a linear relationship and
little reason to proceed further with the analysis.
[Scatter diagram: sales of clothing for 4-year-olds (constant £) against birth rate; 20 points, one per year.]

Figure 11.11 Clothing sales/birth rate scatter diagram


A scatter diagram is a first check that the analysis makes sense. It also gives the
analyst better insight and more direct knowledge of the situation.

[Scatter diagram: points scattered with no linear pattern.]

Figure 11.12 Non-linear scatter diagram

11.7.2 Calculate the Regression Coefficients


The computer will ask for the dependent variable (the y variable, also referred to as
the left-hand-side variable: in this case sales) and the independent variable (the x
variable, also referred to as the right-hand-side variable: in this case births). The
exact output will vary from package to package but will be similar to the following:

Coefficient
Births 3.14
Constant 8.65
R-squared = 0.93
Standard error of residuals = 5.62

This output means that the equation relating sales and births is:
Sales = 8.65 + 3.14 × Births
If the births four years ago were 18.0, then the estimate for sales this year would
be:
Sales = 8.65 + 3.14 × 18.0
= 8.65 + 56.52
= 65.17
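What the package computes can be imitated with, for example, scipy's linregress function. In the sketch below the births and sales figures are invented stand-ins, not the actual 20-year data set, so the coefficients are only loosely analogous to the output above:

    from scipy.stats import linregress

    births = [16.2, 17.1, 17.8, 18.4, 19.0]          # hypothetical values
    sales = [59.5, 62.4, 64.3, 66.5, 68.3]           # hypothetical values

    result = linregress(births, sales)
    print(result.intercept, result.slope)            # the regression coefficients
    print(result.intercept + result.slope * 18.0)    # prediction for births = 18.0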

11.7.3 Calculate the Correlation Coefficient


The same computer output also gives information about the correlation coefficient.
R-squared is 0.93, indicating a high level of correlation, 93 per cent of the variation
in sales being accounted for. This verifies the intuitive conclusion obtained from the
scatter diagram.


The only other quantity on the simplified output is the residual standard error
(the standard deviation of the residuals), which is 5.62. If the scatter of the residuals
(the scatter of the actual values about the fitted line) is the same as their scatter in
the future, then this figure gives an idea of the accuracy of any forecasts made. The
65.17 forecast made above is a point on the line. The residual standard error
indicates how the actual value might differ from the forecast. Indeed, if the residuals
are normally distributed, then the 95 per cent confidence limits for the sales will be
at least as wide as plus or minus twice the residual standard error (i.e. ±11.24). (The
reason for the ‘at least’ will be explained in Module 12.)
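The interval logic is simple enough to sketch directly (remembering the ‘at least as wide’ caveat):

    # Approximate 95 per cent limits: point forecast ± 2 residual standard errors.
    forecast = 65.17
    residual_se = 5.62
    low, high = forecast - 2 * residual_se, forecast + 2 * residual_se
    print(low, high)    # ≈ 53.93 to 76.41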

11.7.4 Examine the Residuals


The scatter diagrams of Figure 11.11 and Figure 11.12 were of the y variable against
the x. To examine the residuals, the computer can be asked to produce a scatter
diagram of the residuals against the fitted values. This is shown in Figure 11.13.

[Scatter diagram of residuals (scale −3 to 3) against fitted y values for the clothing sales regression.]

Figure 11.13 Residual scatter diagram


The variable on the horizontal (or x-) axis is the fitted values. This is in effect the
regression line. The residuals should of course be random. If there is a pattern in
them, it should be evident from the movement of the residuals around the x-axis. A
pattern might be a seasonal variation (Figure 11.14(a)). In cases like this when each
residual is related to the previous residual, there is said to be serial correlation. This
problem occurs particularly in time series data where there may be some time-
related cycle (a cycle is a pattern that repeats at intervals longer than a year).


[Two residual scatter diagrams (residuals against fitted y): (a) residuals following a repeating wave around zero; (b) residuals growing steadily in size along the fitted line.]

Figure 11.14 (a) Serial correlation; (b) heteroscedasticity


Alternatively a pattern might be a tendency for residuals to vary in size at differ-
ent parts of the line (Figure 11.14(b)). This problem is called heteroscedasticity. It
is likely to occur in cross-sectional data when the size of residuals is related to the x
value. For instance, in a study relating profitability to company size, it is probable
that the residuals will not be of the same size for small companies as for companies
many times larger. Rather, the residuals are likely to be roughly proportional to
company size. In such a case a scatter diagram similar to Figure 11.14(b) would
result.
If there is a pattern, a visual test with a scatter diagram is usually sufficient to
detect it. However, statistical tests for randomness are available on most packages.
Many such tests exist. Any one package will offer some but not all. (The runs test is
one of the most common and will be described in Module 12.)
Most computer packages are sufficiently flexible to carry out any analysis required
by a wide range of users. Unfortunately this tends to result in the output of the
packages being extensive as well as sophisticated. The output shown above is the
bare minimum. In practice, a real difficulty arises in having to find the estimates and
statistics wanted from the profusion that are produced.

11.8 Some Reservations about Regression and Correlation


(a) Causality is probably the largest single source of confusion and error. Carrying
out regression analysis on a personal computer deals with the statistical aspects
of a relationship. Potentially more important are the interpretive aspects. Even


when a relationship passes all the statistical tests, what can be concluded and
how can the results be used?
It is essential to remember that, while the statistics can show whether the varia-
bles are associated, they do not say that changes in one variable cause changes in
another. Some examples will illustrate the point.
There is a close association (high correlation, random residuals, etc.) between the
price of rum and the remuneration of the clergy. This does not mean that the
two variables are causally related. A rise in salary for the clergy will not, presum-
ably, be spent on rum thereby depleting stocks and causing a rise in price. It is
more likely that there is a third factor, such as inflation or the general level of
society’s affluence, which affects both variables so that in the past the salaries
and the price of rum have moved together. It would be a mistake to suppose
that, if conditions were to change, the relationship must continue to hold. If, for
some philanthropic purpose, the clergy agreed to take a cut in salary one year, it
is unlikely that the price of rum would fall as well. The price of rum would con-
tinue to change in response to inflation or whatever factors influence it.
Different circumstances would now apply to clergy salaries and the association
would be broken.
The point is that a causal relationship will hold through changing circumstances
(provided the causal link is not broken); a purely associative relationship is likely
to be broken as circumstances change. In applications the implication is clear. If
the relationship is associative (and most are), beware of changing circumstances.
The difference between a causal and an associative relationship depends on its
structure, not on the statistics. Common sense and logic are the ways to distin-
guish between the two. In the case of rum and the clergy, no one would seriously
argue that the link is causal; in the case of sales of clothing for four-year-old
children and births four years earlier, there is some sort of causal link (although
of course there are other influences operating); in other cases the question of
causality can be far from clear.
(b) Spurious regressions should be guarded against. Spurious means that the
correlation coefficient is high but there is no underlying relationship. This may
arise purely by chance when the data available for the analysis just happen to
have a high correlation. If other observations had been taken no correlation
would have been apparent. Spurious correlations may also arise because of a
fault in the regression model. For example, a study sought to determine the rela-
tionship between companies’ profitability and their size. Profitability was
measured by return on net assets; size was measured by net profit. The equation
was:
Return on net assets = a + b × Net profit
The observations were taken from a large sample of companies. This regression
has an in-built likelihood of a high correlation since:
Return on net assets = Net profit/Net assets
The regression equation can therefore be re-expressed:
Net profit/Net assets = a + b × Net profit
Net profit appears in both the y and x variables. If there are only small variations
in net assets from company to company, then the tendency will be for high y


values to be associated with high x values (and low with low), simply because net
profit dominates both sides of the equation. A high correlation coefficient may
well result but doubt would surround the conclusion that profitability and size
are linked.

[Two scatter diagrams of the same points: (a) a single straight line fitted through all the points; (b) the points split into two groups, each with its own straight line.]

Figure 11.15 Effect of double data sets


(c) Extrapolation should be avoided if possible. Extrapolation means using the
equation outside the range of the data on which the regression was based. For
example, if a regression was based on x values in the range 100 to 200, then
forecasting the y value for x = 400 is extrapolation. This is dangerous because
the model is based on x between 100 and 200. Nothing is known about the be-
haviour of y and x when x = 400. Often, in using regression to forecast,
extrapolation cannot be avoided, especially with time series data. But it should be
done in a spirit of wariness.
(d) Regression applies only to single sets of data. The data in Figure 11.15(a) may
have a high correlation coefficient, hiding the fact that two separate straight lines
(Figure 11.15(b)) would be more appropriate. This highlights the need to be
familiar with the data, especially the need to study the scatter diagram.
(e) The least squares criterion can be misleading by being over-precise. The three
lines A, B and C in Figure 11.16 may be very close to one another in terms of
variation explained (perhaps less than one per cent different), although their
equations are very different. The least squares principle picks out one equation as
best, giving the impression that it is clearly the best. Other equations may, for all
practical purposes, be just as good. They may even be better when non-statistical
factors are considered. Prior knowledge or previous work may be more relevant
in deciding between A, B and C.


[Sketch of three alternative lines fitted through the same scatter of points: A: y = 10 + 0.8x; B: y = 20 + 0.5x; C: y = 30 + 0.1x.]

Figure 11.16 Alternative linear equations


(f) The least squares criterion has been applied to regressions of y on x. The sum of
squared y residuals has been minimised. Regressions of x on y can be executed
by minimising the sum of squared x residuals. A different regression line (differ-
ent intercept, different slope) emerges. Only by a fluke would the two lines
coincide. It is not always clear which way round is correct. A judgement may
have to be made as to which variable is the more important and should therefore
be the one whose residuals are minimised.
Some mathematical manipulation of the regression formula would be necessary
to demonstrate that the two lines differ. It is, however, easier to see that the
correlation coefficient is the same whichever way round the variables are. The
formula for r:
r = ∑(x − x̄)(y − ȳ)/√[∑(x − x̄)² · ∑(y − ȳ)²]
is symmetrical in x and y. If x and y were interchanged, it would make no differ-
ence to the formula and the value of r would be unaffected.
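This symmetry, and the asymmetry of the fitted lines, can be demonstrated by applying scipy's linregress both ways round (using the illustrative data of Section 11.4):

    from scipy.stats import linregress

    x = [1, 2, 4, 5, 8]
    y = [20, 19, 34, 30, 47]

    yx = linregress(x, y)           # y on x
    xy = linregress(y, x)           # x on y
    print(yx.slope, xy.slope)       # 4.0 and ≈ 0.23: two different lines
    print(yx.rvalue, xy.rvalue)     # identical r ≈ 0.95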

Learning Summary
Regression and correlation are important techniques for predicting and understand-
ing relationships in data. They have a wide range of applications: economics, sales
forecasting, budgeting, costing, human resource planning, corporate planning, etc.
The underlying statistical theory (outlined in the next module) is extensive. Unfor-
tunately, the depth of the subject can in itself lead to errors. Users of regression can
allow the statistics to dominate their thought processes. Many major errors have
been made because the wider non-statistical issues have been neglected. As well as
providing company knowledge and broad expertise, managers have a role to play in
drawing attention to these wider issues. They should be the ones asking the pene-
trating questions about the way regression and correlation are being applied. If not
the managers, who else?
Managers can only do this, however, if they have a reasonable grasp of the basic
principles (although they should not be expected to become experts or to be
involved in the technical details). Only when they have taken the trouble to equip
themselves in this way will they be taken seriously when they participate in discus-


sions. Only then will they take themselves seriously and have sufficient confidence
to participate in the discussions. Regression and correlation have a mixed track
record in organisations, varying from high success to abject failure. A key to success
seems to be for managers to become truly involved. Too often the managers pay lip
service to participation. Their contribution is potentially very large. To make it
count they need to be aware of two things. First, the broad principles and manageri-
al issues (the topics in this module) are at least as important as the technical,
statistical aspects. Second, knowledge of the statistical principles (the topic for the
next module) is necessary, not in order that they may do the regression analyses
themselves, but as a passport to a legitimate place in discussions.

Review Questions
11.1 Which of the following statements are true?
A. Correlation and regression are synonymous.
B. Correlation would establish whether there was a linear relationship between
the weights of one sample of business executives and the salaries of another
sample of business executives.
C. Annual sales figures for each of a company’s 17 regions are a set of cross-
sectional data containing 17 observations.
D. If high values of one variable are associated with low values of a second and
vice versa, then the two variables are negatively correlated.

11.2 Which of the following statements is true? The residuals of a regression line are:
A. the perpendicular distances between actual points and the line.
B. the difference between actual and fitted y values.
C. always positive.
D. all zero for a ‘best-fit’ line.

Questions 11.3 to 11.6 refer to the following data:

x 4 6 9 10 11
y 2 4 4 7 8

11.3 The slope of the regression line is:


A. 8
B. 0.765
C. 1.12
D. −0.71
E. 5


11.4 The intercept of the regression line of y on x is:


A. 5
B. 0.765
C. −1.12
D. 1.12
E. 0.68

11.5 What is the value of the correlation coefficient?


A. 0.91
B. 0.76
C. −0.76
D. 0.83

11.6 The residual for the point (4,2) is:


A. 0.06
B. −0.06
C. −1.12
D. 1.59
E. −1.59

11.7 The relationship between the ages of husbands and wives is likely to show:
A. strong positive correlation.
B. weak positive correlation.
C. zero correlation.
D. weak negative correlation.
E. strong negative correlation.

11.8 When the residuals resulting from a linear regression analysis are examined, which of
the following characteristics is desirable?
A. Randomness.
B. Serial correlation.
C. Heteroscedasticity.

11.9 Is the conclusion true or false?


Research has shown there is a strong positive correlation between age at death and the
number of times, as an adult, a patient has visited his doctor. The researcher concludes
that visiting a doctor prolongs life.


Questions 11.10 to 11.11 refer to the following output from a computer linear regression package relating sales volume and advertising expenditure:

Coefficient
Ad. exp. 6.3
Constant 14.7
R-squared = 0.70
Sum of squared residuals = 900

11.10 What is the prediction of sales volume when the advertising expenditure is 5?
A. 21.0
B. 31.5
C. 46.2
D. 7.74

11.11 What is the total variation in sales, $\sum(y-\bar{y})^2$?


A. 900
B. 3000
C. 2100
D. 1286

11.12 The relationship between a company’s sales (y) and its expenditure on advertising (x) is
investigated. The linear regression lines of sales on advertising (y on x) and advertising
on sales (x on y) are both calculated. Which of the following statements is true?
A. The correlation coefficients are the same for both.
B. The slopes of the two regression lines are the same.
C. The two intercepts are the same.

Case Study 11.1: Railway Booking Offices


1 A pilot study involving six railway booking offices provides the following information
about the number of transactions (y) and the number of clerks (x):

Booking office
1 2 3 4 5 6
Transactions (y) 11 7 12 17 19 18
Clerks (x) 3 1 3 4 6 7
a. Draw a scatter diagram and decide whether the relationship appears linear. Calcu-
late the correlation coefficient.
b. The following three straight lines could all be fitted to the data:
i. The line joining the extreme y values, i.e. linking the points (1,7) and (6,19).
ii. The line joining the extreme x values, i.e. linking the points (1,7) and (7,18).
iii. The regression line of y on x.


Find the equation of each line. Measure the residuals (in the y direction). Calculate the
mean absolute deviations and the variances of the residuals. Compare the MADs and
variances and suggest which of the three straight lines gives the closest fit.

Case Study 11.2: Department Store Chain


1 A chain of department stores is moving into a phase of expansion and opening several
new stores. As part of the expansion planning process a project team is carrying out an
investigation to find out how the sales levels at new stores might be predicted. One
approach has been to use regression analysis. The average level of sales per week (y)
for each existing store in the chain (14 department stores) together with a measure of
disposable income per family in each store’s catchment area are given in Table 11.1.

Table 11.1 Sales and disposable family income per store


Store Average sales per week Average disposable family
number (£000s) income (coded)
y x
1 90 301
2 87 267
3 86 297
4 84 227
5 82 273
6 80 253
7 78 203
8 75 263
9 70 190
10 68 212
11 64 157
12 61 141
13 58 119
14 52 133
Mean 74 217

The computer gives the following results after regressing sales against family income:

Coefficient
Income 0.17846
Constant 35.228
Correlation coefficient r = 0.92
Residual standard error = 4.72


a. What is the estimated relationship between sales and income? What sales level
would be predicted for a store whose catchment area has an average disposable
income per family of £221?
b. How good is the fit of the linear equation to the data?
c. Use the residual standard error to suggest what the maximum level of accuracy
achieved by the model is likely to be.
d. What are the non-statistical reservations connected with forecasting sales in this
way?



Module 12

Advanced Regression Analysis


Contents
12.1 Introduction.......................................................................................... 12/1
12.2 Multiple Regression Analysis .............................................................. 12/2
12.3 Non-Linear Regression Analysis......................................................... 12/6
12.4 Statistical Basis of Regression and Correlation .............................. 12/12
12.5 Regression Analysis Summary ......................................................... 12/22
Learning Summary ....................................................................................... 12/23
Review Questions ......................................................................................... 12/26
Case Study 12.1: CD Marketing .................................................................. 12/28
Case Study 12.2: Scrap Metal Processing I ................................................ 12/29
Case Study 12.3: Scrap Metal Processing II ............................................... 12/30

Prerequisite reading: Module 10, Module 11

Learning Objectives
Regression and correlation are complicated subjects. The previous module present-
ed the basic concepts and the managerial issues involved. In this module, the basic
concepts are extended in three directions. First, multiple regression deals with
equations involving more than one x variable. Second, non-linear regression allows
relationships to be based on equations that represent curves. Third, the statistical
theory underlying regression is described. This last topic permits rigorous statistical
tests to be used in the evaluation of the results. Finally, to bring together all aspects
of regression and correlation, a step-by-step approach to carrying out a regression
analysis is given.
This module contains advanced material and may be omitted first time through
the course.

12.1 Introduction
Regression analysis is a big subject. Most universities have at least one full professor
of regression analysis (except that he or she is not usually given the title of Professor
of Regression Analysis: some camouflage is often adopted, such as ‘Professor of
Econometrics’). The previous module covered only the tip of this iceberg. It
covered simple linear regression and correlation. The restrictions ‘simple’ and
‘linear’ will be removed in this module. Multiple (more than one right-hand-side
variable) and non-linear (equations other than y = a + bx) regression will be
introduced. Moreover, in Module 11 regression and correlation were dealt with at a


non-statistical level. For example, residuals were visually inspected for randomness.
Some of the underlying theory will now be considered. The purpose is the practical
one of allowing statistical tests to be employed, alongside visual ones, in evaluating
and using regression and correlation analyses.
The module deals with these topics in the order multiple regression, non-linear
regression, statistical theory. Finally, to bring all these ideas together, regression
analysis is summarised through a step-by-step guide to handling a regression
problem.

12.2 Multiple Regression Analysis


The regression analysis discussed so far has been simple regression. In the example
concerning forecasting the sales of clothing for four-year-old children, the depend-
ent variable ‘sales’ was related to the one independent variable ‘births’. It is fairly
obvious that more factors than just births must be influencing sales. For instance,
the amount of money in the national or local economy must have an effect on the
volume of children’s clothes bought. How can this and other influences be incorpo-
rated into the forecasts of sales? The answer is that it can be done through multiple
regression analysis.
In simple linear regression the equation is of the form:
y = a + bx
In multiple regression the basic idea is extended to two or more variables on the
right-hand side of the equation. For example, there may be three ‘x’ variables:
y = A + Bx + Cz + Dt
There are three independent (also called ‘x’ or ‘right-hand-side’) variables: x, z, t;
their coefficients are B, C and D; the constant is A. This is still a linear equation
because it includes only x, y, z and t but no squared, cubed, logarithmic, etc., terms.
If y is the sales of clothing for four-year-old children and x is the birth rate four
years earlier, then z could be an economic measure such as personal disposable
income and t could be advertising expenditure by the store. Note that the a and b in
the first equation will not be the same as A and B in the second equation (except by
a fluke). When additional variables are added, the coefficients of existing variables
will be re-estimated and there is nothing to constrain them to take their original
values.

12.2.1 Similarities and Differences between Simple and Multiple


Regression
The criterion for multiple regression (i.e. for estimating the values of A, B, C and D)
is the same as for simple regression: the sum of squared residuals is minimised.
Inevitably the formulae for calculating A, B, C and D are more complicated, but this
makes little difference if a computer is being used. The formulae will not be given
here. It will be assumed that some computing power is available for multiple
regressions. Just as in simple regression, it would be expected that, for a good
model, the residuals would be random.


Some important differences between simple and multiple regression will soon
begin to emerge, but thus far the two are almost the same:
(a) the least squares criterion is the same;
(b) the residuals should be random.
Beyond these basic principles, the differences become apparent.

Scatter Diagrams
It is not possible to draw, in two dimensions, a scatter diagram involving several
variables. Therefore, at the outset of the analysis more than one scatter diagram will
need to be drawn: there will have to be a scatter diagram for the y variable com-
bined with each of the x variables. The purpose will be the same as in simple
regression: to gain an approximate idea of whether and how the variables are
related.

Correlation Coefficient
In simple regression, the correlation coefficient squared (R2) measures the propor-
tion of variation explained by the regression. In this way it quantifies the closeness
of fit of the regression equation. In multiple regression, a more sensitive measure is
needed. The reason is as follows.
Start with a simple regression model:
Sales = a + b × Births
Suppose a second x variable, personal disposable income (PDI), is added:
Sales = A + B × Births + C × PDI
It is not possible for R2 to decrease when PDI is added. Even if PDI were totally
unconnected with Sales, R2 could not fall. This is because, at the worst, the multiple
regression could choose A and B to be equal to their original values, a and b, and C
to be equal to 0. Then the multiple regression would be the same as the simple
regression with the same R2. Since least squares acts to minimise the sum of squared
residuals and thereby maximises R2, the new R2 can do no worse than equal the old
one. Consequently, even if a new x variable is unconnected to the y variable, R2 will
almost certainly rise, making it appear that the closeness of fit has improved.
A more sensitive measure of closeness of fit is the adjusted correlation coefficient squared, $\bar{R}^2$ (pronounced ‘R-bar-squared’). This is based on the same ratio as R², but the formula is adjusted to make allowance for the number of x variables included. If a new x variable is unconnected with the y variable, then $\bar{R}^2$ will fall. It is not necessary to know the exact nature of the adjustment since most computer packages print out $\bar{R}^2$, usually in preference to R². R-bar-squared is used in just the same way as R-squared, as a measure of the proportion of variation explained and thereby as a quantification of the closeness of fit. To summarise, $\bar{R}^2$ is based on R² but adjusted to make allowance for the number of right-hand-side variables included in the regression.
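The exact adjustment need not be memorised, but one common form of it (an assumption here, since packages differ in detail) is $\bar{R}^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1)$, where n is the number of observations and k the number of x variables. A minimal Python sketch:

```python
def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """One common form of R-bar-squared: penalise R-squared for the
    number of x variables (k) relative to the number of observations (n)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# A near-useless extra variable nudges R-squared up but R-bar-squared down:
print(round(adjusted_r_squared(0.9300, 40, 1), 4))   # 0.9282 with one variable
print(round(adjusted_r_squared(0.9305, 40, 2), 4))   # 0.9267 with two
```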


Collinearity
What would happen if the same x variable were included twice in the regression
equation? In other words, suppose that in the equation below z and t were the same:
y = a + bx + cz + dt
The answer is that the computer package would be unable to complete the calcu-
lations. However, if z and t were almost but not quite the same, then the regression
could proceed and c and d would be estimated. Clearly, these estimates would have
little value. Nor would it be easy to determine which of the variables had the greater
influence on the y variable.
This, in simple form, is the problem of collinearity (sometimes referred to as
multi-collinearity). It occurs when two (or more) of the x variables are highly
correlated. In these circumstances the two variables are contributing essentially the
same information to the regression. Their coefficient estimates are unreliable in the
sense that small changes in the observations can produce large changes in the
estimates. Regression finds it difficult to discriminate between the effects of the two
variables. While the equation overall may still be used for predictions, it cannot be
used for assessing the individual effects of the two variables.
The basic test for collinearity is to inspect the correlation coefficients of all x
variables taken in pairs. If any of the coefficients are high, the corresponding two
variables are collinear.
There are three remedies for the problem:
(a) Use only one of the variables (which to exclude is largely a subjective decision).
(b) Amalgamate the variables (say, by adding them together if the aggregated
variable has meaning).
(c) Substitute one of the variables with a new variable that has similar meaning and
that has a low correlation with the remaining one of the pair.
The test and remedies for collinearity are not precise. It is more important to be
aware of the problem and the restrictions it places on interpretation than to delve
into the technicalities lying behind it.
Example
The weekly sales of a consumer product are to be predicted from three explanatory
variables. The first x variable is the gross domestic product (GDP), reflecting the
influence of the economic environment on sales; the second is the weekly advertising
expenditure on television; the third is the weekly advertising expenditure in newspa-
pers.
The regression analysis would proceed as follows.
(a) Inspect the scatter diagrams to see whether approximate linear relationships do
exist. This time there will be three: sales against GDP, sales against television adver-
tising and sales against newspaper advertising.
(b) Carry out the regression analysis by computer. The computer will ask for the
name of the dependent variable, then those of the independent variables. The
printout will look something like:


Coefficient
GDP 2.23
TV advertising 0.35
Newspaper advertising 1.09
Constant 4.28
R-bar-squared = 0.97
Residual standard error = 3.87

The equation predicting sales is:


Sales = 4.28 + 2.23 × GDP + 0.35 × TV adv. + 1.09 × News adv.
Contrast this with the regression output for simple regression involving only sales
and GDP:

Coefficient
GDP 3.14
Constant 8.65
R-bar-squared = 0.93
Standard error of residuals = 5.62

Notice that the coefficient for GDP is 3.14, whereas for multiple regression it was
2.23. The addition of an extra variable changes the previous coefficients, including
the constant.
R-bar-squared has risen to 0.97 from 0.93. The presence of the advertising expendi-
tures has therefore increased the proportion of variation explained. The increase is
not large but then R-bar-squared was already high. This is a real increase since the
adjusted correlation coefficient has been used and this makes allowance for the
presence of an extra variable.
Correspondingly, the residual standard error has decreased. In other words, the
residuals are generally smaller, as would be expected in view of the increased R-bar-
squared.
(c) Check the residuals. This process is exactly the same as for simple regression. A
scatter diagram of residuals plotted against fitted y values is inspected for random-
ness.
(d) Check for collinearity. Most computer packages will print out the correlation
matrix showing the correlations between the x variables taken in pairs:

Variable 1 2 3
1 1.0 0.1 0.3
2 1.0 0.7
3 1.0

There is high correlation (0.7) between the two advertising variables. They are collinear
to some extent. Their coefficients will not be reliable and could not be used with
confidence to compare the effectiveness of the two types of advertising. Some thought
should be given to the possibility of amalgamating the variables to form a combined


advertising expenditure variable. The other remedies for collinearity would probably not
be used. Since the correlation between them is far from perfect (and the variables do
contribute some separate information) it would be wrong to drop one of the variables
completely. It is unlikely that the third remedy for collinearity (finding a substitute) could
be applied because of the difficulty of finding another variable with similar meaning.
However, if the purpose of the regression is solely to predict sales, the equation could
still be used as it stands. Values for GDP and advertising would be inserted in the
equation to give a predicted value for sales.
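Steps (b) and (d) can be sketched in Python with numpy. The data below are invented purely to mimic the situation described, sales explained by GDP and two deliberately correlated advertising variables; only the method carries over, not the numbers.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40
gdp = rng.normal(100, 10, n)
tv = rng.normal(50, 5, n)
news = 0.8 * tv + rng.normal(0, 2, n)        # built to be collinear with TV
sales = 4 + 2.2 * gdp + 0.4 * tv + 1.1 * news + rng.normal(0, 4, n)

# Step (b): least squares estimates of the coefficients and the constant
X = np.column_stack([gdp, tv, news, np.ones(n)])
coefs, *_ = np.linalg.lstsq(X, sales, rcond=None)
print("GDP, TV, News, constant:", coefs.round(2))

# Step (d): correlation matrix of the x variables taken in pairs
print(np.corrcoef(np.vstack([gdp, tv, news])).round(2))
```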

12.2.2 Dummy Variables


Data that are in ‘two-category’ form (for example, male/female employees, yes/no
responses to a questionnaire) can be represented by a dummy variable. This takes
just two values, 0 and 1, and is used in multiple regression in exactly the same way as
any other variable.
For example, a study of capital expenditure by US government departments used
regression analysis to relate the expenditure to two other variables: last year’s
expenditure and the amount by which last year’s expenditure had been cut. There
were 20 observations in the regression, each referring to one year’s expenditure. The
purpose of the study was to investigate whether departments employed ‘budget
strategies’ to maximise the amount they would be allowed to spend.
It was thought that the political party in power at the time might be an important
factor. This was represented in the regression equation by a dummy variable that
was 0 for the years the Republican Party was in government and 1 for the Demo-
cratic Party.
The results gave a coefficient for the dummy variable. This was interpreted as the
amount by which capital expenditure changed when a different party was in power.
The sign of the coefficient showed whether the change was an increase or decrease.
Dummy variables can be used whenever data are in ‘two-category’ form. The
situation is more complicated when the data can take more than two categories. For
example, if three political parties were to be represented, Republican, Democratic
and Perot, it would not be possible to use 0 for the Republicans, 1 for the Demo-
crats and 2 for Perot. This would imply a ranking, for example, that the Perot Party
had twice the effect of the Democrats.
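A minimal sketch, with invented figures rather than the study's data, shows how a 0/1 dummy enters the least squares calculation exactly like any other variable.

```python
import numpy as np

# Invented illustration: expenditure related to last year's figure plus a
# 0/1 dummy for the party in power
last_year = np.array([100, 110, 105, 120, 125, 118, 130, 140], dtype=float)
party = np.array([0, 0, 1, 1, 0, 1, 1, 0], dtype=float)   # the dummy variable
noise = np.random.default_rng(1).normal(0, 2, 8)
spend = 10 + 0.9 * last_year + 6 * party + noise

X = np.column_stack([last_year, party, np.ones(8)])
coefs, *_ = np.linalg.lstsq(X, spend, rcond=None)
print("last year, party dummy, constant:", coefs.round(2))

# The dummy's coefficient estimates the shift in expenditure between parties.
# With three or more categories, use a separate 0/1 dummy for each category
# beyond the first rather than a single 0/1/2 variable.
```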

12.3 Non-linear Regression Analysis


The extension of simple regression to multiple regression was the subject of the
previous section. The next extension to consider is from linear to non-linear
regression. This means that the equations expressing the relationship between the
variables will no longer always be of a linear form such as:
y = a + bx + cz
In non-linear regression the right-hand-side variables may appear in squared,
cubed, logarithmic, etc. form; a scatter diagram between a y variable and an x
variable need no longer be an approximate straight line.


The major ways of carrying out a non-linear regression rely on converting or transforming the non-linear regression so that it can be treated as if it were linear.
Two examples of the ways in which this can be done will be described in some
detail.

12.3.1 Curvilinear Regression


In curvilinear regression squared, cubed, etc. terms in an x variable are treated as
separate variables in a multiple regression. For example, the non-linear equation:
$y = a + bx + cx^2$ (12.1)
is a curved relationship. Figure 12.1 shows the graph (for x = 0 to 6) of the equa-
tion:
$y = 10 + 3x + 2x^2$
Curvilinear regression treats an equation such as Equation 12.1 (a quadratic equa-
tion) as if it were of the form:
y = a + bx + cz
In other words, x² is treated as if it were an entirely separate variable, rather than the square of x. Multiple regression is then used to estimate the coefficients a, b and c. After estimation, x² is restored to the equation, which can then be used for prediction.

[Figure 12.1 plots the curve for x = 0 to 6; the plotted points are:
x  0   1   2   3   4   5   6
y 10  15  24  37  54  75 100]

Figure 12.1 Graph of y = 10 + 3x + 2x²


Example
The data below relate to the production of electronic games by a toy manufacturer: y
refers to the productivity (in thousands of units per person per annum) and x refers to
the capital employed (in millions) on the production line. Find the curvilinear regression
equation linking the data.


y 8 9 12 16 21 39 43 53 65 79
x 2 3 3 4 5 7 7 8 9 10

[Figure 12.2: scatter diagram of productivity (y, thousands of units per person per annum) against capital employed (x, millions); the points follow an upward curve rather than a straight line.]

Figure 12.2 Productivity and capital employed in manufacture of electronic games
Figure 12.2 shows a scatter diagram of the data. The relationship does not appear to be
linear. The shape looks similar to that in Figure 12.1. It seems sensible to try a curvilinear regression involving x and x².
To carry out the regression a ‘new’ variable equal to x² must be created. The data fed
into the computer regression package will be for three variables:

y 8 9 12 16 21 39 43 53 65 79
x 2 3 3 4 5 7 7 8 9 10
z 4 9 9 16 25 49 49 64 81 100

The output for this regression is of the form:

Variable Coefficient
x −0.50
z 0.80
Constant 5.10
R-bar-squared = 0.99
Residual standard error = 1.57

From the output, the regression equation is:


$y = 5.1 - 0.5x + 0.8z$


The correlation coefficient is high (R-bar-squared = 0.99). The residuals should also be inspected for randomness. Restoring x² to the equation in place of z:
$y = 5.1 - 0.5x + 0.8x^2$
The equation can now be used to forecast values of y given values of x.
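The whole procedure takes only a few lines; a Python sketch with the data above reproduces coefficients close to the printout.

```python
import numpy as np

y = np.array([8, 9, 12, 16, 21, 39, 43, 53, 65, 79], dtype=float)
x = np.array([2, 3, 3, 4, 5, 7, 7, 8, 9, 10], dtype=float)
z = x ** 2                              # the 'new' variable fed in alongside x

X = np.column_stack([x, z, np.ones_like(x)])
(b, c, a), *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"y = {a:.1f} + {b:.1f}x + {c:.1f}x^2")   # about y = 5.1 - 0.5x + 0.8x^2

# Forecast by restoring x^2, e.g. for capital employed of 6 (millions):
x_new = 6.0
print(round(a + b * x_new + c * x_new ** 2, 1))
```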

12.3.2 Transformations
A variable is transformed when some algebraic operation is applied to it. For
example, a variable x is transformed when it is turned into its square (x2), its
reciprocal (1/x) or its logarithm (log x). The list of possible transformations is long.
The principle behind the use of transformations in regression is that a non-linear
relationship between two variables may become linear when one (or both) of the
variables is transformed. Linear regression is then used on the transformed varia-
bles. The variable is de-transformed when the equation is used to make predictions.
In other words, although the relationship between y and x is curved, it may be
possible to find a transformation of y or x or both such that the relationship
between the transformed variables is linear.
For example, a relationship of the form:
$y = a e^{bx}$ (12.2)
is non-linear between y and x. This is the exponential function. Recall from Module
2 that it is characterised by the fact that, each time x increases by 1, y increases by a
constant proportion of itself. Contrast this with a linear function (Y = A + BX),
where each time X increases by 1, Y increases by a constant amount (= B). Linear
means a constant increase (or decrease); exponential means a constant percentage
increase (or decrease). There are clearly situations where an exponential function
might apply. For instance, if the sales increase of some product were thought likely
to be 10 per cent per year, the relationship between sales and time would be
exponential. Were the increase thought likely to be £1 million each year, the
relationship would be linear.
The issue in regression is this. Suppose that two variables are thought to be relat-
ed by an exponential function and that some historical data are available. How can
the equation relating the variables (i.e. the values of a and b in Equation 12.2) be
found when the regression formula applies only to linear relationships?
Figure 12.3 shows the graph of the exponential function. A transformation can
make it linear. If natural logarithms (to the base e) are taken of each side of the
equation and some algebra applied, using the rules for manipulating logarithms, the
result is:
$y = a e^{bx}$
$\log y = \log(a e^{bx})$
$\log y = \log a + bx$


[Figure 12.3: panel (a) plots $y = a e^{bx}$ against x, a curve starting at intercept a; panel (b) plots log y against x, a straight line, $\log y = \log a + bx$, with intercept log a.]

Figure 12.3 The exponential function


The new equation has the linear form:
Y = A + BX
with Y = log y, A = log a, B = b and X = x. If a graph relating log y and x is drawn
(Figure 12.3(b)), it will be linear. The regression formulae can now be applied to the
transformed variables, log y and x, since they are linearly related. The coefficient
estimates obtained from this equation are equal to log a and b. If the antilog of the
former is taken, the exponential function (Equation 12.2) will be completely
determined. Given a value for x, the corresponding value for y can be calculated.
Logarithms to bases other than to the base e can be used, but the slope coefficient
in the transformed equation is no longer simply equal to b in the untransformed
equation.
The situation can be summarised as follows. Two variables, x and y, are, it is
thought, related by an exponential function. The y variable is transformed by taking
its logarithmic values. Linear regression is applied to log y and x as if they were two
new variables. The coefficient estimates obtained from this linear regression can be
translated into the coefficients of the exponential function, which can then be used
to forecast.
One point should be clarified. The underlying relationship between y and x is not
itself being altered. There is no sense in which this curved relationship is being
forced to be linear. It is the way the relationship is expressed mathematically that is
changed. The ‘trick’ of expressing the same relationship in a different form makes it
possible to use regression analysis.
Example
The sales volume of a successful piece of microcomputer software is generally expected
to grow rapidly, according to an exponential pattern, for a few years after its launch.
The Business Planning (BP) package of one software producer has shown retail sales
volumes for the first 12 months as below:

Sales volume y 12 14 17 20 24 28 34 41 48 56 66 76
Month x 1 2 3 4 5 6 7 8 9 10 11 12


Use regression analysis to find the exponential equation linking sales and time. Predict
sales volume for the next three months.
An exponential equation is of the form:
$y = a e^{bx}$
Fitting an equation of this type to the data amounts to estimating the values of a and b.
In order to use linear regression formulae, the equation must first be transformed to a
linear one. Taking logarithms of both sides, the equation becomes:
$\log y = \log a + bx$
The regression is performed on log y and x. To do this, the logarithms of the sales must
be found:

log y 2.48 2.64 2.83 3.00 3.18 3.33 3.53 3.71 3.87 4.03 4.19 4.33
x 1 2 3 4 5 6 7 8 9 10 11 12

Putting these two variables into a regression package, the output is:

Variable Coefficient
x 0.17
Constant 2.32
R-bar-squared = 0.99
Residual standard error = 0.02

From the output the linear equation is:
log y = 2.32 + 0.17x
i.e. b = 0.17 and log a = 2.32
Taking the antilogarithm of 2.32:
a = 10.2
The exponential function relating y and x is therefore:
$y = 10.2 e^{0.17x}$
The usual checks on the correlation coefficient and the residuals should be carried out
at this point just as for ordinary linear regression analysis.
The next three time periods are for x = 13, 14, 15. Substituting these values in the
estimated exponential equation gives the following predictions for y:

x 13 14 15
y 93.0 110.2 130.6
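A short Python sketch of the transform, regress, de-transform sequence, using the sales data above, recovers essentially the same equation and predictions.

```python
import numpy as np

y = np.array([12, 14, 17, 20, 24, 28, 34, 41, 48, 56, 66, 76], dtype=float)
x = np.arange(1, 13, dtype=float)            # months 1 to 12

b, log_a = np.polyfit(x, np.log(y), 1)       # regress log y on x
a = np.exp(log_a)                            # de-transform the intercept
print(f"y = {a:.1f} * exp({b:.2f}x)")        # about y = 10.2 e^(0.17x)

for month in (13, 14, 15):                   # predictions for months 13-15
    print(month, round(a * np.exp(b * month), 1))
```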

The exponential function is one of the relatively few non-linear relationships that can be made linear by applying a transformation to both sides of the equation. Most non-linear relationships cannot be made linear in such a neat way. Experience allied to
mathematics is needed in order to know which relationships are amenable to this
transformation treatment.


The use of transformations so far has referred to situations in which there were sound
reasons for supposing the two variables were related by a particular non-linear relation-
ship. For example, the belief that sales of a product were increasing at a rate of 10 per
cent per year was a good reason for using an exponential function.
In other situations the type of non-linear relationship may not be known. Transfor-
mations can be helpful in a different way. The scatter diagram may just show a curve of
some sort but it may be difficult to go further and suggest possible non-linear equations
relating the variables. In such circumstances it may be necessary to try several transfor-
mations by using the squares, square roots, reciprocals, logarithms, etc. of one or both
of the variables to find the scatter diagram of transformed variables that looks most like
a straight line. Equally, regression analyses may be carried out with several types of
transformation to find the one which gives the best statistical results, highest R-squared,
random residuals, etc.
This approach, although sometimes necessary, can be dangerous. If enough
transformations are tried, it is usually possible eventually to come up with one that
appears satisfactory. However, there may be no sound reason for using the trans-
formation. The model may be purely associative. It is better to base the choice of
transformation on logic or prior knowledge rather than exhaustive trial and error.
For example, take the case of a company with several plants manufacturing the
same product. The unit cost of production (y) will vary with the capacity of the
plant (x). The relationship is unlikely to be linear. Finding a transformation that
expresses the relationship in linear form could be based on trial and error. Several
transformations of y and x could be tried until one is found that appears (from the
scatter diagram or regression results) to make the relationship linear. It would be
preferable to have a sound reason for trying particular transformations. The ‘law’ of
economies of scale suggests that unit costs might be inversely proportional to
capacity. This is a good reason for transforming capacity to 1/capacity and consider-
ing the relationship between y and 1/x. If this relationship appears to be linear then
statistics and logic are working together.
When this is the case, the statistics are in the position of confirming the logic.
Otherwise there will be no underlying basis for using the transformation and the
statistics will be in isolation from the real situation. Or attempts must be made to
rationalise the structure of the situation to fit in with the statistics. This might of
course lead to the discovery of new theories. More likely, there is a danger of using
associative rather than causal models. The trial and error approach to transfor-
mations should not be ruled out, but it should be used with caution.

12.4 Statistical Basis of Regression and Correlation


Calculating the ‘best-fit’ line or curve through a set of points and judging its strength
has been described in a non-rigorous fashion. The statistical basis of regression and
correlation gives more depth to the subject. It allows all aspects of the relationship
to be tested statistically rather than visually. For example, the randomness of
residuals, up to now, has been judged by inspection. A statistical perspective allows
their randomness to be tested more precisely, and this perspective is based on


sampling. For the moment the discussion will be restricted to simple linear regres-
sion.
If it is believed that two variables are linearly related, then the statistician hypoth-
esises that, in the whole population of observations on the two variables, a straight-
line relationship does exist. Any deviations from this (the residuals) are caused by
minor random disturbances. A sample is then taken from the population and used
to estimate the equation’s coefficients, the correlation coefficient and other statis-
tics. This sample is merely the set of points upon which calculations of a, b and r
have been based up to now.
However, the fact that these calculations are made from sample data means that
a, b and r are no more than estimates. Had a different sample been chosen, different
values would have been obtained. Were it possible to take many samples, distribu-
tions of the coefficients would be obtained. In practice only one sample is taken and
the distributions estimated from it. This is similar to the way in which the sampling
distribution of the mean (Module 8) could be estimated from just one sample. In
effect, the coefficient distributions are estimated from the variations in the residuals.
These distributions (of a, b, r and other statistics) are the basis of significance
tests of whether the hypothesis of a straight-line relationship in the population is
true. They are also the basis for determining the accuracy of regression predictions.
The statistical approach has several practical implications.

12.4.1 Measuring Closeness of Fit


Correlation is measuring the strength of a relationship. So far it has been carried out
through the correlation coefficient (r), its square (R2) or, in multiple regression, R-
bar-squared. Recall the way in which this serves as a measure of correlation. The
total variation in the y variable before regression is:
Total sum of squares = $\sum(y - \bar{y})^2$
After regression this can be split into two parts:
Total SS = Explained SS + Unexplained SS
where
Explained sum of squares = Variation which is explicable because of regression
= $\sum(\text{Fitted } y - \bar{y})^2$
Unexplained sum of squares = Variation left unexplained
= Variation in residuals
= $\sum(\text{Residuals})^2$
The correlation coefficient squared is:
$R^2 = \dfrac{\text{Explained SS}}{\text{Total SS}}$
Hence R2 measures the proportion of variation explained by regression. This can
be turned into a significance test of whether the regression explains a significant
amount of variation by applying analysis of variance, which works in a similar way.
Analysis of variance splits total variation into parts attributable to different sources:
treatments, blocks, errors, etc. In regression the total variation is split into that


attributable to regression and that forming the residuals. It can be proved, with
some ‘black box’ mathematics, that the ratio of the mean squares of the two sources
of variation is an F ratio under the hypothesis that there is no linear relationship
between the two variables. In other words, if it is hypothesised that the x and y
variables are not related, then the ratio:
$F = \dfrac{\text{MS(Regression)}}{\text{MS(Residual)}}$

has an F-distribution. If an F value is observed that is significantly different from what is expected, the hypothesis is rejected and it is supposed that there is a linear
relationship between the two variables. If it is not significantly different, the
hypothesis is accepted and it is supposed that there is no linear relationship between
the variables.
Example
This is the same example that was used for calculating the regression coefficients a and
b and the correlation coefficient in Module 11:

x 1 2 4 5 8 Mean(x) = 4
y 20 19 34 30 47 Mean(y) = 30

The regression line was found to be:


y = 14 + 4x
The correlation coefficient was found to be:
r = 0.95
The residual and fitted values are:

x y Fitted Residual
1 20 18 2
2 19 22 −3
4 34 30 4
5 30 34 −4
8 47 46 1

For example, the first fitted value is:


18 = 14 + 4 × 1
The first residual is:
Actual − Fitted = 20 − 18 = 2
The sums of squares can now be calculated:
Total SS = $\sum(y - \bar{y})^2$
$= (20-30)^2 + (19-30)^2 + (34-30)^2 + (30-30)^2 + (47-30)^2$
= 100 + 121 + 16 + 0 + 289
= 526


SS(regression) = $\sum(\text{Fitted } y - \bar{y})^2$
$= (18-30)^2 + (22-30)^2 + (30-30)^2 + (34-30)^2 + (46-30)^2$
= 144 + 64 + 0 + 16 + 256
= 480
SS(residual) = $\sum(\text{Residual})^2$
$= 2^2 + (-3)^2 + 4^2 + (-4)^2 + 1^2$
= 4 + 9 + 16 + 16 + 1
= 46
As expected:
Total SS = SS(regression) + SS(residual)
526 = 480 + 46
In practice only two of the three sums of squares would have been calculated directly.
The third would be derived from the sums of squares equality. The sums of squares can
be put into an analysis of variance table (an ANOVA table), as in Table 12.1.
The degrees of freedom for SS(regression) is 1 since there is one independent variable. The degrees of freedom for SS(error) is 3: there are five observations, but 2 degrees of freedom are lost because the coefficients a and b have to be calculated from the data before the residuals can be found. The degrees of freedom for the Total SS is one fewer than the number of observations because the y mean (ȳ) has to be calculated first.
The mean squares are calculated by dividing the sums of squares by the corresponding
degrees of freedom. The observed F value is the ratio of the mean squares.

Table 12.1 Example of an ANOVA table for regression


Variation                          Degrees of freedom   Sums of squares   Mean square         F
Explained by regression            k = 1                SSR = 480         MSR = 480/1 = 480   MSR/MSE = 480/15.3 = 31.4
Error or unexplained (residuals)   n − k − 1 = 3        SSE = 46          MSE = 46/3 = 15.3
Total                              n − 1 = 4            SS = 526

Finally, the observed F value has to be compared with the critical F value found in Appendix 1. For (1,3) degrees of freedom, the critical F value at the 5 per cent level is
10.13. The observed value greatly exceeds this. It is therefore concluded that the
hypothesis must be rejected. It cannot be said that there is no linear relationship
between the two variables. This is no more than should be anticipated given the high
correlation coefficient.
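The ANOVA arithmetic is easy to reproduce; the Python sketch below uses the same five points, with scipy supplying the critical F value.

```python
import numpy as np
from scipy import stats

x = np.array([1, 2, 4, 5, 8], dtype=float)
y = np.array([20, 19, 34, 30, 47], dtype=float)

b, a = np.polyfit(x, y, 1)                    # y = 14 + 4x
fitted = a + b * x

total_ss = ((y - y.mean()) ** 2).sum()        # 526
ssr = ((fitted - y.mean()) ** 2).sum()        # 480
sse = ((y - fitted) ** 2).sum()               # 46

f_obs = (ssr / 1) / (sse / 3)                 # MSR/MSE: 31.3 (31.4 in Table 12.1
                                              # after rounding MSE to 15.3)
f_crit = stats.f.ppf(0.95, 1, 3)              # 10.13 at the 5 per cent level
print(round(f_obs, 1), round(f_crit, 2), f_obs > f_crit)
```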

12.4.2 Testing that the Residuals Are Random


The statistical approach to regression hypothesises a true linear relationship. Any
deviations from it are minor random disturbances. Not only is the randomness part
of the hypothesis; it is also intuitively reasonable. If the residuals were not random,
they must have a pattern in them. If they have a pattern in them, the linear model is
not adequate and should be revised or altered in some way to incorporate it. If they


are random, the linear model must be the best pattern that can be obtained from the
data.
Many statistical tests for randomness exist. The runs test is a common example.
It works in the following way. A ‘run’ is a group of consecutive residuals with the
same sign. As an example, the ten residuals in Figure 12.4 have four runs.

Residuals: (+3.1) (−1.6 −0.2 −1.4) (+1.1 +2.9 +0.3) (−1.0 −3.1 −0.1)
Runs: the four bracketed groups of same-signed residuals are runs 1 to 4

Figure 12.4 Runs


If the residuals have a small number of runs, then they are unlikely to be random.
This could arise from a situation like that in Figure 12.5(a). If the residuals have a
large number of runs, then, again, they are unlikely to be random. This could arise
from the situation shown in Figure 12.5(b), where the pattern is that the residuals
alternate between positive and negative.

[Figure 12.5: panel (a) shows a fitted line with the points lying first all above it (1st run) and then all below it (2nd run), giving very few runs; panel (b) shows the points alternating above and below the line, so that each residual is a run.]

Figure 12.5 Runs and patterns


A runs test is based on an expected number of runs, which is the number of runs
that would be most likely to occur if the residuals were random. The expected
number of runs can be calculated using the basic ideas of probability. It is then
compared with the observed number of runs counted in the residuals. If the actual
differs from the expected by a large margin, the residuals will be assumed to be non-
random. In statistical terms, a significance test will indicate whether any difference
between the observed and expected numbers of runs is sufficient to reject the
hypothesis that the residuals are random.
Statistical tables (see Table A1.7 in Appendix 1) are available that show the critical
values for the observed number of runs. The tables are used as follows. If there are n1
positive residuals and n2 negative residuals, then Table A1.7 in Appendix 1 shows, for
one parameter, the lower critical value for the number of runs. If the number of runs
is less than this critical value, then the hypothesis of random residuals is rejected at the
5 per cent level (the tables relate only to the 5 per cent level). Table A1.7 in Appendix
1 shows the upper critical value. If the observed number of runs exceeds this critical
value (for the given numbers of positive and negative residuals) then the hypothesis is
rejected. For random residuals the number of runs should therefore be between the
upper and lower critical values.
Example
Are the residuals of Figure 12.4 random?
The runs test is a significance test. As with all significance tests, there are five stages to
follow.
(a) The hypothesis is that the residuals are random.
(b) The evidence is the sample of residuals in Figure 12.4.
(c) The significance level is the customary 5 per cent.
(d) The critical values are given by Table A1.7 in Appendix 1. There are 10 residuals,
four positive and six negative. Therefore:

n1 = 4 n2 = 6

From Table A1.7 the lower critical value is 2. From Table A1.7 the upper critical
value is 9.
(e) Since the observed number of runs is four (from Figure 12.4), the observed result
does not lie beyond the critical values. The hypothesis is accepted. The residuals
appear to be random.
If either n1 or n2 is larger than 20, tables such as those in Appendix 1 cannot be used. In
this range the number of runs behaves like a normal distribution with:
$\text{Mean} = \dfrac{2 n_1 n_2}{n_1 + n_2} + 1$

$\text{Standard deviation} = \sqrt{\dfrac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}}$

A significance test proceeds just as the one above except that the critical values are not
taken from tables. They are two standard deviations above and below the mean (for a 5
per cent significance level).


If there is a strong pattern in a set of residuals, a visual test with a scatter diagram is
usually sufficient to detect it. However, most computer regression packages offer a
statistical test for randomness. Many such tests exist but no one package will offer them
all. The runs test is just one possibility.
When a runs test is called for, the computer usually calculates the actual number of runs
counted in the residuals. It also prints out the range (between the critical values), which
indicates random residuals. In practice, therefore, the significance test is usually carried
out automatically by the computer.
A computer printout might be as follows:

Actual number of runs 6


Expected number of runs 5
95% range for expected number of runs 3.2 − 8.8

Since the number of runs falls within the range, the conclusion is that the residuals are
random.
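The runs test is simple enough to sketch directly in code. The function below counts the runs and applies the normal approximation given earlier; for small n1 and n2 the exact tables (Table A1.7) should still be used.

```python
import math

def runs_test(residuals):
    """Count runs of same-signed residuals and return the count together
    with the mean and standard deviation expected under randomness."""
    signs = [r > 0 for r in residuals]
    runs = 1 + sum(s != t for s, t in zip(signs, signs[1:]))
    n1 = sum(signs)
    n2 = len(signs) - n1
    mean = 2 * n1 * n2 / (n1 + n2) + 1
    sd = math.sqrt(2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                   / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return runs, mean, sd

# The ten residuals of Figure 12.4: 4 runs against 5.8 expected
resids = [3.1, -1.6, -0.2, -1.4, 1.1, 2.9, 0.3, -1.0, -3.1, -0.1]
print(runs_test(resids))
```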

12.4.3 Deciding which Variables to Retain (in Multiple Regression)


Multiple regression allows more than one right-hand-side variable to be included in
the regression equation. The possibility of including two or three x variables has
been discussed. But why stop at three? The regression formulae permit the calcula-
tion of the coefficients of any number of x variables and with the aid of computing
power the complicated nature of the formulae is no hindrance. How can the
decision on the number of variables to include be made?
The criterion is that any variable included should have a significant (in the statis-
tical sense) effect on the y variable. The statistical approach to regression analysis
enables a significance test to be carried out for each variable. The hypothesis is that
the particular variable has no effect on the y variable or, in other words, that the
coefficient of the variable is really zero. It is hypothesised that the sample-based
estimate of the coefficient obtained from the regression is non-zero purely by
chance. If the significance test shows that the estimated coefficient is not signifi-
cantly different from zero, then the hypothesis is accepted and it is concluded that
this x variable has no effect on the y variable. This x variable can be discarded from
the regression equation. If the test shows that the estimated coefficient is signifi-
cantly different from zero, then the hypothesis is rejected and it is concluded that
this x variable does have an effect on the y variable. This x variable has been
included in the equation correctly (see Figure 12.6).


[Figure 12.6: a t-distribution centred on zero under the hypothesis that the coefficient = 0, with 2.5% tails beyond the critical values on either side. If the coefficient estimate lies between the critical values, the hypothesis is accepted and the variable excluded from the equation; if it lies in either tail, the hypothesis is rejected and the variable retained.]

Figure 12.6 t test for variable coefficient


The basis of the test is that the coefficient has a t-distribution if the number of
observations is less than 30, and a normal distribution if the number of observations
exceeds 30. The proof is not given but in essence it hinges on the assumption that
the residuals are normally distributed (for fewer than 30 observations) and on the
central limit theorem (for more than 30 observations). For convenience, the test is
in practice based on the t-distribution. This will not materially affect the outcome
since the two distributions almost coincide for more than 30 degrees of freedom.
The significance test for any right-hand-side variable follows the usual five stages:
(a) The hypothesis is that the true (population) coefficient for the variable is 0.
(b) The evidence is just the set of observations from which the regression coeffi-
cients have been estimated. As well as the coefficient of the variable in question,
its standard error will also have been calculated.
(c) The significance level is usually 5 per cent.
(d) The degrees of freedom for the test are:
$n - k - 1$
where n = the number of observations
and k = the number of x variables in the regression.
The k + 1 degrees of freedom lost are lost because of the need to estimate the
coefficients of the k variables and the constant term. Since the variable could
have a significant effect in either a positive or a negative direction, the test is
two-tailed. The critical t value is therefore t.025 for n − k − 1 degrees of freedom
found from t-distribution tables.
The observed t value is:
$t_{\text{Obs}} = \dfrac{\text{coefficient estimate}}{\text{standard error of the coefficient}}$


(e) If tObs. exceeds t.025 then the hypothesis is rejected, the variable does have a
significant effect; if tObs. is less than t.025 then the hypothesis is accepted and the
variable may be eliminated from the regression equation.
Most computer packages print out the observed t value automatically as the ratio
between the coefficient and its standard error.
Example
An earlier example on multiple regression was concerned with predicting the sales of
clothing for four-year-old children by relating it to the number of births four years
earlier and the gross domestic product (GDP). A third variable, the store’s advertising
expenditure that year, is added to the equation. The computer output for this multiple
regression is:

Variable Coefficient Standard error t value


Births 2.07 0.35 5.9
GDP 0.28 0.11 2.5
Advertising 6.12 4.80 1.3
Constant 2.86 1.13 2.5
R-bar-squared = 0.98
Residual standard error = 2.34 Degrees of freedom = 38

Are all the three variables – births, GDP and advertising – rightfully included in the
regression equation?
The computer output shown above is more extensive than that shown up to now. This
output includes the coefficients, standard errors, t values and degrees of freedom. Most
packages would show at least as much information as this.
To answer the question, a t test must be carried out on the coefficients for each of the
variables. Because the observed t values have been printed out (= coefficient/standard
error), this can be done very easily. In each case the critical t value is t.025 for 38 degrees
of freedom. For this number of degrees of freedom the t and normal distributions
almost coincide. From the normal distribution table (see Appendix 1, Table A1.2), the
critical value is 1.96.
For births and GDP the observed t values exceed the critical (both 5.9 and 2.5 are
greater than 1.96). These variables therefore both have a significant effect on the sales
of clothing and are rightfully included in the equation. The observed t value for advertis-
ing is 1.3, less than the critical value. According to the statistical test, advertising does
not have a significant effect on clothing sales. The variable can be eliminated from the
equation.
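The three tests can be sketched in a few lines of Python. Note that the exact critical t value for 38 degrees of freedom is about 2.02 rather than the normal approximation of 1.96 used above; the conclusions are unchanged.

```python
from scipy import stats

# Coefficients and standard errors from the printout above
variables = {"Births": (2.07, 0.35), "GDP": (0.28, 0.11),
             "Advertising": (6.12, 4.80)}

t_crit = stats.t.ppf(0.975, 38)      # two-tailed test at the 5 per cent level
for name, (coef, se) in variables.items():
    t_obs = coef / se
    verdict = "keep" if abs(t_obs) > t_crit else "could be dropped"
    print(f"{name}: t = {t_obs:.1f} ({verdict})")
```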

12.4.4 The Accuracy of Predictions


A common use of a regression equation is to make predictions. The forecast
corresponding to an x value is found by putting the x value into the equation and
calculating y. This y value is called a point estimate. But to know how useful the
estimate might be, some assessment of its accuracy is needed. How accurate is a
point estimate likely to be?


The residuals are the basis for measuring the accuracy of a prediction. Intuitively
this makes sense. The residuals are the historical differences between the actual
values and the regression line. It might be expected that actual values in the future
will differ from the regression line by similar amounts. The scatter of the residuals,
as measured by their standard error, is therefore an indication of forecasting
accuracy.
The residuals are what is ‘left over’ outside a regression equation. However, there
is also error within the regression model. The source of this error is inaccuracy in
the estimation of the coefficients of the variables and the constant term. This
inaccuracy is measured through the standard errors of the coefficient estimates.
The overall uncertainty in a prediction comes therefore from two areas:
(a) What the regression does not deal with – the residuals.
(b) What the regression does deal with – error in the estimation of regression model
coefficients.
Both types of error are combined in the standard error of predicted values,
often shortened to SE(Pred). The formula for this standard error is a complicated
one, integrating as it does all the standard errors mentioned above. However, most
computer packages have the capability to print it out.
Once calculated, SE(Pred) is used to calculate confidence intervals for predic-
tions. Provided there are more than 30 data points (the usual rule of thumb), the
normal distribution will apply to the point estimate. Thus, 95 per cent of future
values of the variable are likely to lie within ±2 SE(Pred) of the point estimate. If
there are fewer than 30 observations then the t-distribution applies. Instead of 2, the
t value (found from t-distribution tables) for the appropriate number of degrees of
freedom is substituted.
Such a 95 per cent confidence interval is sometimes referred to as the forecast
interval. It is used to decide whether the level of accuracy is sufficient for the
decisions being taken. For example, suppose an organisation is forecasting its profit
in the next financial year in order to take pricing decisions. The point forecast is
£50 m, the SE(Pred) is £4 m and there are 40 degrees of freedom. The accuracy of
the prediction can be calculated from these data. It is 95 per cent probable that the
profit will turn out to be in the range £42 m to £58 m (±2 SE(Pred)).
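The arithmetic of that interval can be sketched with the exact t value in place of the rule-of-thumb 2:

```python
from scipy import stats

point, se_pred, df = 50.0, 4.0, 40     # GBP millions; figures from the example

t = stats.t.ppf(0.975, df)             # about 2.02 for 40 degrees of freedom
low, high = point - t * se_pred, point + t * se_pred
print(f"95% forecast interval: {low:.1f} to {high:.1f}")   # roughly 42 to 58
```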
It should be noted that SE(Pred) is different for each different set of x values
used to make the prediction. This can be visualised through an example of simple
regression (only one x variable). Since the x value is multiplied by the slope in
making the prediction, any inaccuracy in estimating the slope coefficient (the x
coefficient) must have a changing effect for changing x values. Figure 12.7 shows
how the forecast interval varies for different x values. The interval is wider for x
values further from the centre of the regression.


[Figure 12.7: a regression line with a band of ±2 SE(Pred) drawn around it; the band is narrowest near the centre of the data, at the point (x̄, ȳ), and widens for x values further from the centre.]

Figure 12.7 Variation in the forecast interval


In practice, SE(Pred) is a vitally important measure. Whatever the statistical nice-
ties of the regression model, the real question is whether a prediction is sufficiently
accurate to be useful in the decision to which it is being applied. In the pricing
example above, the question is whether radically different decisions would be taken
if the profit were at different parts of the forecast interval. Is the forecast interval so
wide that it is unable to determine which pricing decisions should be taken? If this is
the case, then the search must go on for more accurate predictions (i.e. better
regression models); if it is not the case, then what would be the benefit of further
refinement of the model? In other words, the forecast interval is the final determi-
nant of whether the regression model is satisfactory.

12.5 Regression Analysis Summary


The necessary components for carrying out a regression analysis are now available.
The purpose of this section is to assemble them all into a coherent and structured
approach to a regression problem. The steps suggested below assume that the
problem to be tackled is one of making predictions, whether time related or cross-
sectional, rather than just establishing the existence of a relationship.
(a) Propose a tentative model. Use scatter diagrams and prior knowledge to
decide, initially, what the regression equation might be. This stage involves tenta-
tive decisions as to which x variables to include and what transformations to use
to handle any curvature. It will also consider the extent and type of data availa-
ble.
(b) Run the regression and check the closeness of fit. A computer printout
would provide R-bar-squared and, possibly, an ANOVA table. These would
show intuitively (for R-bar-squared) and statistically (the ANOVA table) whether
a sufficiently high proportion of the original variation in the y variable had been
explained.


(c) Check the residuals. The residuals should be random. A scatter diagram
between residuals and fitted values will demonstrate this visually; a runs test will
permit the check to be made statistically.
(d) Decide whether any x variables could be discarded. A t test on each x
variable coefficient will indicate if the variable has a significant effect on the y
variable. If not, the variable could be discarded, but this decision should be taken
in conjunction with prior knowledge. For example, if there were a good non-
statistical reason for including a particular x variable, then a t value of, say, 1.4
should not automatically lead to its elimination.
(e) Check for collinearity. A correlation matrix for all the x variables will show
which, if any, of them are collinear and therefore have unreliable coefficient
estimates.
(f) Decide if the regression estimates are accurate enough for the decision.
For the point estimate corresponding to any particular x value(s), SE(Pred) will
be the basis of calculating confidence intervals. These can be contrasted with the
decision at hand.
(g) If necessary, formulate a new regression model. Should any of the checks
have given unsatisfactory results, it is necessary to return to stage (a) and try
again with a new model.
The seven steps above relate solely to the statistical application of the technique
of regression. Another set of problems (covered in Module 13 on Business Fore-
casting) arise when the use of the technique has to be integrated with the decision-
making process of an organisation.
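To make the steps concrete, here is a minimal sketch in Python of how steps (a) to (f) might look with the statsmodels package. The data frame and its column names (y, x1, x2) are hypothetical, and the runs test of step (c) is left as a visual inspection of the printed residuals:

    import pandas as pd
    import statsmodels.api as sm

    df = pd.DataFrame({                    # hypothetical observations
        'y':  [10, 12, 15, 19, 24, 28, 33, 40],
        'x1': [1, 2, 3, 4, 5, 6, 7, 8],
        'x2': [2, 1, 4, 3, 6, 5, 8, 7],
    })

    X = sm.add_constant(df[['x1', 'x2']])  # (a) tentative model: y = a + b1*x1 + b2*x2
    fit = sm.OLS(df['y'], X).fit()         # (b) run the regression

    print(fit.rsquared_adj)                # (b) R-bar-squared: closeness of fit
    print(fit.resid)                       # (c) residuals, to be checked for randomness
    print(fit.tvalues)                     # (d) t values: candidates for discarding
    print(df[['x1', 'x2']].corr())         # (e) correlation matrix: check for collinearity

    # (f) point estimate with its standard error for a new set of x values
    new_X = sm.add_constant(pd.DataFrame({'x1': [9], 'x2': [9]}), has_constant='add')
    print(fit.get_prediction(new_X).summary_frame(alpha=0.05))  # forecast and 95% interval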

Learning Summary
This module has extended the ideas of simple linear regression by removing the
limitations of ‘simple’ and ‘linear’. First, multiple regression analysis makes the
extension beyond simple regression. It allows changes in one variable (the y
variable) to be explained by changes in several other variables (the x variables).
Multiple regression analysis is based on the same principle – the least squares
criterion – as simple regression. However, the addition of the extra x variables does
bring about added complications. Table 12.2 summarises the similarities and
differences between the two cases as far as their practical application is concerned.


Table 12.2 Comparing single and multiple regression

Similarities
(a) Substitution of x values in the regression equation to make predictions.
(b) F test to measure closeness of fit.
(c) Checking of residuals for randomness.
(d) Use of SE(Pred) to measure accuracy.
Differences
(a) Adjustment of correlation coefficient to allow for degrees of freedom.
(b) t test to determine variables to leave out.
(c) Check for collinearity.

The second extension beyond linear regression is to ‘curved’ relationships between
variables. This is done by transforming one or more of the variables so that
the equation can be handled as if it were linear. The range of possible transfor-
mations is wide, allowing a variety of non-linear relationships to be modelled
through regression.
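For instance, an exponential relationship y = ae^bx becomes linear after taking logarithms (log y = log a + bx), so a and b can be estimated by an ordinary linear regression of log y on x. A minimal sketch in Python with hypothetical data, assuming NumPy:

    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.7, 7.4, 20.1, 54.6, 148.4])  # hypothetical data, roughly y = e^x

    b, log_a = np.polyfit(x, np.log(y), 1)       # linear fit of log y on x
    a = np.exp(log_a)                            # antilog of the constant recovers a
    print(round(a, 2), round(b, 2))              # approx. a = 1, b = 1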
The possibilities of many explanatory variables and many types of equation may
seem advantageous, but they bring a danger. This is that more and more
regression equations will be tried until one is found that just happens to fit the set of
observations that are available. Indeed there is a technique, called stepwise regres-
sion, which is a process for regressing all possible combinations of variables and
selecting the one that, statistically, is the best. The risk is that causality will be
forgotten. Ideally the role of regression should be to confirm some prior belief,
rather than to find a ‘belief’ from the data. This latter process can of course be
successful, but it is likely to lead to many purely associative relationships between
variables. In multiple and non-linear regression analysis it is more important than
ever to ask the question: is the regression sensible? This question should be asked
even when the statistical checks are satisfactory.
The theoretical background to regression has also been introduced. The whole
subject is large and complex; this module has done no more than scratch the
surface. A further extension to the topic would have been to look at criteria other
than that of least squares.
Even within the least squares criterion, the statistical tests presented are just a few
of the many available. Fortunately, computer packages, so essential to all but the
smallest of problems, can carry out these tests automatically. On the other hand,
when a package carries out many tests, some of which are alternatives, the problem
of interpreting the computer’s output is an important one. A major problem faced
by new users of regression analysis is that, while they may have a good understand-
ing of the topic, their first sight of a computer’s output causes them to doubt it. The
answer is not to be put off by the initial shock, but to persevere and select from the
output just those parts that are required. Computer packages are trying to satisfy a
wide range of users at all levels of sophistication. For this reason their output tends
to be confusingly large.


Perhaps the best advice in this statistical minefield is to make the correct balance
between statistical and non-statistical factors. For example, the t test for the
inclusion of variables in a multiple regression equation should be taken carefully into
account but not to the exclusion of other factors. In the earlier example on predict-
ing sales of children’s clothing, the t value for advertising was only 1.3. Statistically it
should be excluded. On the other hand, if it has been found from other sources
(such as market research interviews) that advertising does have an effect, then the
variable should be retained. The poor statistical result may have arisen because of
the limited sample chosen or because of data inaccuracy. The profusion of complex
data produced by regression analyses can promote a spurious sense of accuracy and
a spurious sense of the importance of the statistical aspects. It is not unknown for
experts in regression analysis to make mountains out of statistical molehills.


Review Questions
12.1 Another expression for a right-hand-side variable is:
A. a dependent variable.
B. an x variable.
C. an explanatory variable.
D. a residual variable.

12.2 Simple regression means that one y variable is related to one x variable; multiple
regression means that several y variables are related to several x variables. True or
false?
Questions 12.3 to 12.6 refer to the following part of the computer output
from a multiple regression analysis:

Variable   Coefficient   Standard error
1          5.0           1.0
2          0.3           0.2
3          22.0          4.0
Constant   5.8           1.2
Degrees of freedom = 32

12.3 The coefficients have standard errors because:


A. they are estimates calculated from a sample rather than true values.
B. there may have been errors in the data.
C. it is a multiple regression analysis.
D. it is necessary to calculate t values.

12.4 The t values for the variables 1, 2, 3 are respectively:


A. 1.0, 0.2, 4.0
B. 2.5, 0.15, 11.05
C. 5.0, 1.5, 5.5
D. 5.0, 0.3, 22.0

12.5 The regression analysis must have been based on how many observations?
A. 28
B. 35
C. 33
D. 36

12.6 Statistically, variable 2 should be eliminated from the regression equation because:
A. It has the lowest coefficient.
B. It has the lowest t value.
C. It has a t value less than 1.96.
D. Its standard error is less than 1.96.


12.7 R-bar-squared is a better measure of closeness of fit than the unadjusted R-squared
because it makes allowance for the number of x variables included in the regression
equation. True or false?
Questions 12.8 to 12.11 refer to the following data:
From a multiple regression involving three right-hand-side variables and 38 observa-
tions, this information is given:
Sum of squares (regression) = 120
Sum of squares (residual) = 170

12.8 How many degrees of freedom do the sums of squares (regression) have?
A. 3
B. 4
C. 34
D. 35

12.9 The mean square (regression) is equal to:


A. 56.7
B. 30.0
C. 40.0
D. 120.0

12.10 The mean square (residual) is equal to:


A. 4.9
B. 56.7
C. 3.5
D. 5.0

12.11 In an F test to determine whether there is a statistically significant linear relationship,
the observed F ratio is 8 (MS(reg.)/MS(res.) = 40/5). What is the critical F ratio
against which to compare it at the 5 per cent level? (See Table A1.6 in Appendix 1.)
A. 3.53
B. 14.06
C. 2.88
D. 8.61

12.12 In curvilinear regression on the equation y = a + bx + cx² the independent variable(s)
are:
A. b and c
B. y and x
C. y and x²
D. x and x²
E. y


12.13 In using a transformation in non-linear regression, an approximation is being introduced,
since a curved relationship is being approximated by a linear one. True or false?

12.14 To carry out a regression analysis based on the equation y = ae^bx a linear regression is
performed on which two variables?
A. y and x
B. log y and x
C. y and log x
D. log y and log x

12.15 A regression analysis is based on the equation y = ae^bx by carrying out a linear
regression on the variables log y and x. The computer printout shows the coefficient for
x to be 8 and the constant to be 4. What are the values for a and b in the original
exponential function?
A. a = 8, b = 4
B. a = 4, b = 8
C. a = antilog 8, b = 4
D. a = 8, b = antilog 4
E. a = antilog 4, b = 8

Case Study 12.1: CD Marketing


1 A recording company markets CDs by mail order and through selected retail outlets,
often hypermarkets. The company deals only with popular CDs that have a large sales
potential. Usually the recordings feature well-known, middle-of-the-road singers. The
company advertises in newspapers and magazines. The expenditure on advertising varies
from week to week depending upon the CDs it is launching and the magazines published
during the period. When sales are thought to be disappointing, the company boosts its
advertising by using television commercials.
Financial data for the last eight weeks are shown in Table 12.3.

Table 12.3 CD advertising expenditure

Week   Gross revenue   Newspaper advertising   Television advertising
       (£000)          (£000)                  (£000)
1      180             5                       1
2      165             3                       2
3      150             3                       3
4      150             3                       3
5      185             5                       2
6      170             4                       1
7      190             6                       1
8      200             6                       0


In an attempt to measure the effect of advertising and to compare newspaper and
television advertising, regression analysis was used relating gross revenue to newspaper
advertising and television advertising. The results are as follows:

Variable           Coefficient   Standard error   t value
News advertising   10            2.7              3.7
TV advertising     −5            3.3              −1.5
Constant           138
R-bar-squared = 0.92
Sum of squares (regression) = 2191
Sum of squares (residual) = 147

a. What is the equation linking the variables?


b. Calculate the values of the residuals. Does a visual inspection suggest that they are
random?
c. Is the regression fit a good one? In particular, does an F test indicate that there is a
significant linear relationship?
d. What other information might be useful in determining the usefulness of this
regression model?
e. Could the model be used to forecast revenue?
f. How could the model be improved?

Case Study 12.2: Scrap Metal Processing I


1 A company is in the business of processing scrap metal. The reclaimed metals are sold
to a variety of manufacturing industries with which the company has long-term contracts
for the supply of these products. Because of the volatile nature of the business the long-
term contracts are felt to be an essential basis for operations. The company owns 12
processing plants, all of which were constructed in recent years. The plants are at
different geographical locations. Their capacities were decided at the time of construc-
tion with regard to likely demand in the area.
Two further plants are now planned for construction in 2018. Tentatively their capaci-
ties have been set at 270 and 350 tonnes/week. The negotiation of long-term contracts
for the sale of the output is just about to commence and an important input to the
negotiations is a projection of likely costs at the new plants. A project to predict costs
using regression analysis is under way. Current cost data for the existing plants are
shown in Table 12.4.
The project team ran a regression relating average cost per tonne to capacity. The
results are shown below:


Variable   Coefficient   t value
Capacity   −0.15         −5.6
Constant   93.0          13.3
R-bar-squared = 0.73
F value = 31.0
Residuals: 13.8, −1.6, 4.9, −8.5, −8.0, −2.8, −6.3, −1.6, −1.3, −3.3, 5.8, 8.8

Table 12.4 Scrap metal reprocessing plants

Year of        Capacity      Average
construction   (tonnes/wk)   cost/tonne (£)
2011           100           92.0
2015           150           69.1
2010           200           68.2
2016           200           54.8
2017           240           49.4
2013           250           53.1
2016           250           49.6
2015           275           50.6
2014           280           50.1
2017           300           45.1
2013           350           46.8
2015           400           42.4

a. How good a regression model is this? To answer:


i. Draw a scatter diagram relating unit costs and capacity.
ii. Carry out a runs test to reinforce any judgements made from inspection of the
scatter diagram.
iii. Use the F value given in the output to check whether there is a significant rela-
tionship.
b. What improvements could be made? In particular:
i. Is a transformation necessary? If so, what transformation?
ii. Should another variable be added to the regression equation? If so, which varia-
ble?

Case Study 12.3: Scrap Metal Processing II


1 The scrap metal processing company (Case Study 12.2) continued with efforts to
predict unit costs at the two new plants. Two more regressions were run. The first
related unit costs to the reciprocal of capacity; the second related unit costs to the
reciprocal of capacity and the age of the plant. The results were as shown below:


Variable     Coefficient   t value
1/Capacity   6654          12.0
Constant     25.6          9.3
R-bar-squared = 0.93
F value = 143.9
Residuals: −0.1, −0.8, 9.3, −4.1, −3.9, 0.9, −2.6, 0.8, 0.8, −2.7, 2.2, 0.2

Variable     Coefficient   t value
1/Capacity   5867          23.8
Age          1.65          7.3
Constant     23.1          20.1
R-bar-squared = 0.99
F value = 476
Residuals: −1.3, 1.9, 2.5, −1.0, 0.2, −1.7, −0.3, 1.2, −0.6, 0.8, −1.3, −0.3
The project team went on to use these two regression models, as well as the original
model, to predict unit costs for the two new plants (of capacities 270 and 350 tonnes
per week). The accuracies of the point estimates were also measured through the
SE(Pred). The results are shown in Table 12.5.

Table 12.5 Regression results for scrap metal reprocessing plants

x variables          New values   Prediction   SE(Pred)
Capacity             270          52.5         7.6
                     350          40.5         8.0
1/Capacity           0.0037       50.2         3.9
                     0.0029       44.6         4.0
1/Capacity and Age   0.0037, 0    44.9         1.7
                     0.0029, 0    39.9         1.7

a. Which predictions should be used as the basis for the contract negotiations for the
new plants? What are the reasons for preferring this model?
b. Comment on certain aspects of the following:
i. In making predictions for a plant built in 2018 the age variable will have value
zero. Therefore, age will have a zero influence in the prediction. In these circum-
stances, how can it have been useful to incorporate age as a right-hand-side
variable?
ii. Using the third model (unit costs related to 1/Capacity and Age), what are the 95
per cent forecast intervals for the predictions of unit costs at the two plants?
iii. Why, for the same regression model, is SE(Pred) sometimes different for the
predictions for the two new plants?
iv. How can relatively small percentage increases in R-bar-squared (from 0.73 to
0.93 to 0.99) be consistent with such large decreases in SE(Pred) (from 8.0 to 4.0
to 1.7)?



PART 5

Business Forecasting
Module 13 The Context of Forecasting
Module 14 Time Series Techniques
Module 15 Managing Forecasts



Module 13

The Context of Forecasting


Contents
13.1 Introduction
13.2 A Review of Forecasting Techniques
13.3 Applications
13.4 Qualitative Forecasting Techniques
Learning Summary
Review Questions
Case Study 13.1: Automobile Design

Prerequisite reading: None

Learning Objectives
The intention of this module is to provide a background to business forecasting. By
the end, the reader should know what it can be applied to and the types of tech-
niques that are used. Special attention is paid to qualitative techniques at this stage
since they are the alternative to the quantitative techniques that are usually thought
to form the nucleus of the subject.

13.1 Introduction
The business world of the 1970s and earlier was more stable than it is at present.
This view is not merely the product of nostalgic reminiscence. Inspection of
business and economic data of the period reveals relatively smooth series with
steady variations through time. As a result, business forecasting was not the major
issue it is now. In fact, many managers claim to have done their forecasting then on
the back of the proverbial envelope. The situation is different today. Uncertainty is
evident everywhere in the business world. Forecasting has become more and more
difficult. Data, whether from companies, industries or nations, seem to be increas-
ingly volatile. The rewards for good forecasting are very high; the penalties for bad
forecasting or for doing no forecasting at all are greater than ever. Even the most
non-numerate managers tend to agree that even a second envelope is insufficient.
As a consequence, interest and investment in forecasting methods have been
growing. Organisations are spending more time and money on their planning. Much
of this increased effort has gone into techniques: established techniques are being
used more widely; new techniques have been developed. The specialist forecaster’s
role has grown. Unfortunately the outcome of all this effort has not always been
successful. Indeed, some of the most costly mistakes in business have been made
because of the poor use of forecasting methods, rather than any inadequacy in the


techniques themselves. Analysing these mistakes reveals that, in the main, they have
come about not through technical errors but because of the way the forecasting was
organised and managed.
While attention has rightly been given to the ‘kitbag of techniques’ of the practi-
tioner (statistician, operational researcher, etc.), the roles of non-specialists involved
in the process (accountants, financial analysts, marketing experts and the managers
who are to use the forecasts to take decisions) have been neglected. These roles are
usually concerned with managing the forecasts. However, because they have less
technical expertise, the non-specialists have tended to hold back and not participate
in planning the forecasting system. Their invaluable (although non-statistical)
expertise is thereby lost to the organisation in this regard. Accordingly, the effec-
tiveness of many organisations’ forecasting work has been seriously weakened. The
role of the non-specialist is at least as important as that of the practitioner.
The purpose of this module is to provide a context for the more detailed fore-
casting topics described in the two following modules. To give this background, the
full range of types of technique used and their applications will be reviewed. Since
many organisations have traditionally adopted a simple qualitative approach to
forecasting (typically, a group of managers decides what the forecast is to be),
qualitative techniques, including the not-so-simple ones, will then be described in
some detail.

13.2 A Review of Forecasting Techniques


This review describes in outline different approaches to forecasting and provides
some general awareness of the techniques available. This review is intended to give a
point of reference for the study of business forecasting. The details of the tech-
niques will be given later.
Forecasting techniques can be divided into three categories:
(a) Qualitative
(b) Causal modelling
(c) Time series methods
Qualitative methods are based on judgement rather than records of past data.
Popular opinion might suggest that qualitative methods are the best. Stories abound
of managers with ‘instinct’ who made predictions with astounding accuracy. On the
other hand, the few surveys that have been done show that qualitative methods are,
in general, inferior to quantitative ones. The reason for this anomaly may be
psychological. In management science it seems to be the successes of people and
the failures of systems (for example, the utility that sent a gas bill for £999 999
million), rather than vice versa, that are remembered. The man who said that sliced
bread would never catch on is forgotten, as are the staggering savings made by
computerised billing systems.
Even so, some qualitative techniques have a successful record. The essence of the
best qualitative techniques is that they convert judgements into forecasts in a
thoughtful and systematic manner. They are different from the instant ‘guesses’ that
are often thought of as qualitative forecasts. More importantly, there are many


situations where the qualitative approach is the only one possible. For new products,
industries or technologies (developing and retailing computer software, for instance)
no past records are available to predict future business; in some countries political
uncertainties may mean that past data records have no validity. In these situations
qualitative techniques provide soundly based ways of making forecasts.
Causal modelling means that the variable to be forecast is related statistically to
one or more other variables that are thought to ‘cause’ changes in it. Nearly always
this means that regression analysis, in conjunction with past data, is used to establish
the relationship. The equation is assumed to hold in the future and is used to make
the forecasts. For example, the econometric forecasts of national economies
regularly published in the media are based on causal modelling. The models relate
economic variables one with another. Policies, such as restricting the money supply,
and economic assumptions, such as the future price of oil, are fed into the model to
give forecasts of inflation, unemployment, etc.
A further example of causal modelling is that of a company trying to predict its
turnover from advertising expenditure, product prices and economic growth. The
value of causal modelling is that it introduces, statistically, external factors into the
forecasting. This type of method is therefore usually good at discerning turning
points in a data series.
Time series methods predict future values of a variable solely from historical
values of itself. They involve determining patterns in the historical record and then
projecting the patterns into the future. While these methods are not good when the
underlying conditions of the past are no longer valid, there are many circumstances
when time series methods are the best. Indeed, many surveys have shown time series
methods to be superior to the other two approaches. They are used when:
(a) conditions are stable and will continue to be so in the future;
(b) short-term forecasts are required and there is not enough time for conditions to
change more than a little;
(c) a base forecast is needed, onto which can be built changes in future conditions.
Time series methods are also, in general, the cheapest and easiest to apply and are
often automatic, needing no manual intervention once they have been set up. They
can therefore be used when there are many forecasts to be made, none of which
warrants a large expenditure. This might be the case in forecasting stock levels at a
warehouse dealing in large numbers of small-value items.

13.3 Applications
The range of applications of forecasting is wide, and to appreciate their full extent it
is helpful to categorise them. There is more than one way of doing this. The
categorisation may be according to type of technique or the functional area (market-
ing, finance, corporate planning, etc.) or the variable to be forecast (sales volume,
cash flow, manpower and so on). It is probably most useful, however, to think of
forecasting in terms of the time horizon of the forecast. Is the forecast to cover the
next two months, the next two decades or something in between? Usually the time


horizon is expressed as short term (up to a year), medium term (one to five years),
or long term (more than five years).

13.3.1 Short-Term Forecasting


A local education authority, responsible for all schools in its area, stocked school
equipment (from pencils and paper to desks and chairs to films and videos) in a
warehouse. Thousands of individual items were stocked. When a school needed a
particular item, it sent a requisition to the stock controller and hoped to be supplied
with the item within a matter of days. When stocks of an item ran low, the stock
controller replenished them by reordering from the manufacturer. The warehouse
had to be run on the basis of minimum cost, subject to being able to give a ‘reason-
able’ service to schools.
To balance the trade-off between cost and service, the stock controller operated a
forecasting system that allowed predictions of demand from schools and, as a result,
the efficient timing of reorders. Since replenishment time varied between one day
and three months, only short-term forecasts were needed. It was an important
requirement of the system that it should be automatic (i.e. once set up it had to
produce forecasts without manual intervention). The great number of small-value
items of equipment that were being forecast made this an essential characteristic.
The technique used was a time series method that gave automatic forecasts of
demand for each item based on recent trends and the season of the year. These
forecasts were cheap and sufficiently accurate for taking the reordering decisions. A
higher level of accuracy could have been obtained, perhaps through a causal model
relating demand to numbers and ages of school students and financial budgets.
However, it is very unlikely that the time and money spent achieving the greater
accuracy would have been justified.

13.3.2 Medium-Term Forecasting


A car manufacturer wished to predict future human resource requirements. Instant
dismissals were not possible because of union agreements; neither was instant hiring
in large numbers possible because of the skilled nature of many of the jobs. Accord-
ingly, the forecasts were to cover a five-year period.
Because of the crucial importance of the forecasts, it was worth spending much
time and effort in developing the system. Even so, in one sense, highly accurate
forecasts were not needed. It was not essential to know, for example, that exactly
2345 fewer grade B1 workers would be required in five years’ time; but it was
essential to be as certain as possible that fewer, not more, workers would be needed
and that the number was of the order of magnitude of 2300.
The technique chosen was a causal model relating the change in the workforce at
different grades to the economic environment, plans of new models both for the
organisation and for competitors, estimated changes in productivity and advances in
production methods. Short-term effects such as seasonal variations did not affect
these forecasts.


13.3.3 Long-Term Forecasting


An engineering company wished to plan its export strategy for the next 20 years.
Over such a period it is almost impossible to estimate volumes of business and
profits with any precision. It is possible and important to predict the directions of
changes. The company wanted to predict which countries seemed to offer the best
prospects for growth in business for both volumes and profits; it wanted to know
which products it should be selling and where; it wanted to know what new prod-
ucts it should be developing; it wanted to know in which countries it should be
installing production capacity.
Quantitative techniques have little to offer in such circumstances. There is very
little in the way of reliable data records to work on, and the forecasts required are
not easily obtained from quantitative techniques. Instead, the company used a
qualitative technique to bring together the judgements of experts from both inside
and outside the company. Their views on all factors relevant to the plans were
solicited. These included changes in politics, economics, technology and interna-
tional law. The technique brought these views together in a systematic way that
attempted to eliminate the effects of personal bias, rank, personality, etc. on the
forecasts. The process was a lengthy one involving many iterations before an export
marketing strategy could be formulated.
All the above may suggest that short-term forecasts use time series methods,
medium-term use causal modelling and long-term use qualitative techniques. While
there is certainly a tendency for this to be so, it is not always the case. Apart from
time horizons, there are other factors, such as data availability and the nature of the
forecasts required, that influence the type of technique to be used.

13.4 Qualitative Forecasting Techniques


Qualitative forecasting techniques are not based on numerical data. They are
methods of combining qualitative information such as experience, judgement and
intuition to make a forecast. Qualitative information can be slippery. Unless it is
handled carefully, the forecasts may be little more than wild guesses. Consequently,
the essence of the techniques is that they are systematic. The best techniques are
able to distil the real information in, say, a manager’s experience from the surround-
ing ‘noise’ of personality, group pressures, risk aversion, etc. and direct it to the
making of forecasts. For example, the Delphi technique is a way of obtaining a
forecast from a group of people without the process being influenced by the usual
group pressures of character strength, function in the organisation and status.
Qualitative forecasting takes a fundamentally different approach from that of
quantitative forecasting. The former is more concerned with defining the bounda-
ries or directions in which the future will lie; the latter is concerned with making
estimates of the future values of variables. As in the previous example of long-term
forecasting, qualitative techniques might predict the most profitable product areas
and countries of operation for an organisation whereas quantitative techniques
would try to forecast the actual levels of profit. This may make it seem as if the
quantitative techniques are clearly superior. An old (and possibly true) forecasting


story demonstrates that this is not necessarily so. In 1880 it was forecast that by
1920 London would be impassable to traffic because of a 3-foot layer of horse
manure. Qualitative techniques might have avoided this error by considering
changes in the technology of road transport corresponding to the fast-developing
railway network. Even if the prediction had been true, it would not have been
necessary to know that the layer would be exactly 3-foot thick. Two feet or 4 feet
would have served the decision makers equally well.
This last example also illustrates why qualitative forecasting is sometimes referred
to as technological forecasting. In this sense, qualitative techniques try to predict
turning points and new directions of business activity. The fact that the business
environment is rapidly changing and that there is a corresponding need to forecast
technological change is undoubtedly behind the recent increase in usage of these
techniques. And, of course, there are situations in which the lack of quantitative data
means that qualitative techniques are all that can be used.
People appear naturally to associate formal organisational forecasting with quan-
titative rather than qualitative techniques. Forecasting is seen as a numerical and
analytical process. What, then, causes organisations to act ‘unnaturally’ and use
qualitative techniques? And why is the use of these techniques increasing so rapidly?
Two motives lie behind the use of qualitative techniques.
The first motive is that the forecaster’s choice is restricted because there is a lack
of adequate data. Quantitative techniques work from a historical data record, and
preferably a long one. This lack of data may occur simply because no data exist. The
organisation may be marketing an entirely new product or exporting to a region in
which it has no experience. Alternatively, data may exist but be inadequate. This
may be because data have been recorded incompetently, but more likely it is because
the circumstances in which the data are generated are changing rapidly. For exam-
ple, the political situation in an importing country may be unstable, making past
records of business an unreliable base for future projections. Recently this problem
has been seen in situations affected by rapidly changing technology. In the electron-
ics industry events happen so quickly that historical data are soon out of date. For
example, forecasts of mobile application sales are difficult to make quantitatively
because product developments occur so frequently and the rate of growth of sales is
so steep. Quantitative techniques are generally poor in dealing with such volatility.
The second motive for using qualitative techniques is a more positive one. The
factors affecting the forecast may be better handled qualitatively. Module 15 on
Managing Forecasts shows that an important step in a systematic approach to
forecasting is a conceptual model in which the influences on the forecast are listed.
The most important influences may be ones that are not susceptible to quantifica-
tion. For example, forecasts of business activity in Scotland in 2015 would have
been influenced in a major way by the outcome of the referendum on Scottish
independence held in late 2014. The outcome and the impact of any negotiations
between Scottish and UK governments would have been difficult to deal with
quantitatively, yet would likely have a large bearing on business activity in the
interim period. As a forecaster in Scotland in mid-2014, it would probably have
been better to try to estimate the effect of this influence qualitatively. It is therefore


by no means always the case that qualitative techniques are a second best to
quantitative ones. In some situations they will better reflect actual circumstances.
The above are the two motives for using qualitative techniques. The other ques-
tion posed at the outset concerned the increasing frequency of their use. The answer
lies in the second of the above motives. The business environment seems recently
to have been changing more rapidly than previously, whether for technological or
political reasons. The situations in which qualitative techniques are seen to have
advantages are occurring more frequently. The microelectronic revolution is a clear
example of this, but the boundaries are wider. Since the late 1970s business data
have been more volatile than previously. Many data series show greater variability in
recent years. This has meant that the rewards for accurate forecasting have in-
creased, but it also means that previously established forecasting models have
performed less well. The need to plan successfully and the need to consider new
techniques have resulted in greater use of forecasting techniques, both in terms of
the number of organisations that have a forecasting system and of the range of
techniques employed.
Several of the most common qualitative forecasting methods hardly deserve the
title ‘technique’, although they may well have been accorded pseudo-scientific labels.
They are included here if only for the sake of completeness before we move on to
more serious contenders.
Visionary Forecasting
A purely subjective estimate (or guess, or hunch or stab in the dark) made by
one individual is a visionary forecast. Many managers believe themselves to be
good at this. Or they believe that someone else in the organisation is good at it.
Most organisations like to believe that they have a forecasting folk hero tucked
away on the third floor. Sadly, when these forecasts are properly monitored,
the records usually show visionary forecasting to be inaccurate. The reason for
this paradox seems to be that visionary forecasting is judged on its occasional
successes. Everyone remembers the person who, in 1978, predicted political
changes in China; the time the same person stated that the US dollar/pound
sterling exchange rate would never fall below 1.50 is forgotten. Nevertheless,
there are undoubtedly people who are good visionaries, but they are few in
number and their records should be carefully monitored.
Panel Consensus Forecasting
Panel consensus forecasting is probably the most common method used in
business. This refers to the meeting of a group of people who, as a result of
argument and discussion, produce a forecast. One would think that this should
provide good forecasts, bringing together as it does the expertise of several
people. Again, the records suggest otherwise. Panel consensus forecasts are
generally inaccurate. The reason is that the forecasts are dominated by group
pressures. The status, strength of character and vested interests of the partici-
pants all influence the forecast. The full potential of the gathered experience is
not brought to bear and the forecast may turn out to be little different from
that of the strongest personality working alone. Some improvement can be


gained by using structured group meetings in which one person is given the
responsibility for organising the meeting and providing background information.
Brainstorming
Brainstorming is a technique perhaps better known for producing ideas than
generating forecasts. It is based on a group meeting but with the rule that every
suggestion must be heard. No proposal is to be ridiculed or put to one side
without discussion. In forecasting, brainstorming is used first to define the full
range of factors that influence the forecast variable and then to decide upon a
forecast. When properly applied it is a useful technique, but the process can
degenerate into an ill-disciplined panel consensus forecast.
Market Research
Market research also falls within the area of qualitative forecasting. It is an
accurate but expensive technique. This extensive subject involves a large
number of distinct skills such as statistical sampling, questionnaire design and
interviewing. It is more usually described within the subject of marketing and is
mentioned here only for the sake of completeness.
The techniques described in more detail below are potentially effective and
accurate, and fall firmly within the area of qualitative forecasting.

13.4.1 The Delphi Method


Named after the oracle of Ancient Greece, this technique is based upon the panel
consensus method but tries to overcome the ill-effects of group pressures. It does
this by not allowing the members of the group to communicate with one another.
The group can therefore be physically in the same place or at the end of telephones.
The group has a chairman who conducts proceedings. The process is as follows:
(a) The chairman asks each member of the group to make a forecast of the variable
in question for the relevant time period. This forecast must be written down and
passed (or phoned) to the chairman, together with what the participant believes
are the major factors affecting the variable.
(b) The chairman collects the submissions of all participants and summarises them.
A typical summary may comprise the average and range of the forecasts plus a
list of the major factors. The chairman relays the summary back to the group. At
no time are participants told anything about the individual responses of other
participants.
(c) The chairman asks the group to reconsider their forecasts taking into account
the information presented to them in the summary. Again, the forecasts are
submitted to the chairman for summary and relay.
(d) Further iterations of the process take place until such time as the group has
reached (approximately) a consensus or until the participants are no longer pre-
pared to adjust their forecasts further.
(e) The average of the final iteration is the Delphi forecast.
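Purely as an illustration of the mechanics of steps (a) to (e), the short Python loop below simulates the chairman's summarise-and-relay cycle. The starting forecasts and the rule that participants move halfway towards the average are hypothetical assumptions, not part of the technique itself:

    forecasts = [100.0, 140.0, 90.0, 160.0]  # hypothetical round-1 forecasts

    for round_no in range(1, 6):
        average = sum(forecasts) / len(forecasts)  # chairman's summary: the average...
        spread = max(forecasts) - min(forecasts)   # ...and the range, relayed back
        if spread < 5:                             # (d) approximate consensus reached
            break
        # (c) participants reconsider; modelled here as moving halfway to the average
        forecasts = [f + 0.5 * (average - f) for f in forecasts]

    print(round(average, 1))                       # (e) the Delphi forecast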
By keeping the participants apart, the intention is that the effects of personality,
rank, etc. are minimised. The final forecast is then a distillation of the views of the


entire group. Even better, each participant will have had the opportunity to readjust
his or her views in response to worthy suggestions from others. The views of
‘cranks’, even persistent cranks, will be largely filtered out by the averaging process.
The onus is on deviationists to defend their views anonymously through the
chairman.
When tested, the Delphi technique has produced good results. But it has some
disadvantages. It can be expensive, especially when the group members are assem-
bled in the same physical place. Also, it is possible to cheat by indulging in some
game-playing. One participant knowing the likely views of other participants can
submit unrealistic forecasts in order that the averaging process works out the way he
or she wants. For example, in an attempt to forecast sales, a financial executive may
substantially understate his or her view so that the optimistic view of the sales
manager is counterbalanced and, as a result, he or she achieves the aim of holding
down stock levels.
The technique can be unreliable, meaning that different groups of people might
well produce different forecasts. The results can also be sensitive to the style of the
questions posed to the group.
Nevertheless, the technique is used in a variety of situations: Japan’s Office of
Science and Technology, for example, uses it to predict the future technological landscape.

13.4.2 Scenario Writing


Scenario writing is not concerned with single estimates of the future. It is the
construction of several sets of circumstances that could possibly arise. Each set of
circumstances is called a scenario and stems from a series of assumptions about the
future. In other words, scenario writing is the translation of several different sets of
assumptions into scenarios. The future is then represented by several alternative
scenarios rather than one single view. The essence of scenario writing is the expres-
sion of a wide range of situations that could apply in the future and that describe the
boundaries within which contingency planning can take place.
For example, suppose an exporting company is trying to forecast sales of its
products in a South East Asian country in ten years’ time. One set of assumptions
could be that at that time there will be a pro-free-trade government and a strong
world economy. These are accompanied by specific assumptions about inflation
rates, exchange rates, technological changes, etc. These assumptions are translated
into a scenario that shows the sales, prices, costs, manpower and competition
relating to the products. A second scenario is formed from a second and different
set of assumptions. The process continues and more scenarios are formed until all
sets of assumptions that could reasonably be expected to apply have been exhaust-
ed.
Scenario writing is not a detailed technique, nor does it pretend to be accurate in
terms of the numbers it produces. Rather, it involves a new approach to forecasting.
The difficulty of making a specific forecast is recognised. Instead, the emphasis is on
covering the range of possibilities and forming flexible plans that can cope with all
of them. Its advantage is that it leads to a realistic perspective on future uncertainty.


Further, it can be combined with more detailed techniques for translating assump-
tions into quantified scenarios. It is particularly useful in the most difficult types of
forecasting, for example, where the time horizon is long and there are many
uncertainties.
By itself, of course, it is an approach rather than what would normally be referred
to as a technique. Its disadvantage is, as recent research has shown, that the act of
describing and publicising a scenario within an organisation increases the chance
that it will happen. Nevertheless, the basic idea is so sound that, in combination
with other techniques, there is really no reason not to use it.

13.4.3 Cross-Impact Matrices


Cross-impact matrices do not in themselves produce forecasts. They are a means to
provide estimates of the likelihood or probabilities of future events that can then be
used as part of the planning process. Special emphasis is placed on cross influences
between different events by considering how the occurrence of one event might
affect the probability of another. Thus the technique gets its name: cross-impact.
The ‘matrix’ part of the name derives from the way the probabilities are written
down, in a matrix.
The technique comprises the following steps. To illustrate, the previous example
of a company forecasting its business activity in South East Asia will be used.
(a) Make an extensive list of all the factors that might affect the plans to be made.
In the example the factors would include the political situation, the economic
climate, technological breakthroughs, product innovation, competition and so
on.
(b) Next list the developments associated with each factor. For instance, three
developments might be used in respect of the political situation: a pro-free-trade,
a Marxist or an independent government.
(c) Estimate the probabilities of these developments. They would have to be
assessed subjectively with, perhaps, the Delphi method.
(d) Form a matrix with each row representing one of the developments and each
column also representing a development. Each element of the matrix is the new
probability for the development in that column given that the development of
that row has taken place. In the example, a section of the completed matrix
might appear as in Table 13.1.
The matrix is formed by considering one factor at a time. Taking the most likely
development first, the probabilities of all the other developments in all columns
are adjusted. Then the second most likely development is taken, then the third
and so on. The process then continues with the next factor.


Table 13.1 Cross-impact matrix

If this development happens...                ...then the probability (%) of:
                                        (A)   (B)   (C)   (D)   (E)   (F)   (G)   (H)
Political   Pro-Western Govt.    (A)     –     0     0    50    40    10     .     .
situation   Independent Govt.    (B)     0     –     0    30    50    20     .     .
factors     Marxist Govt.        (C)     0     0     –     5    35    60     .     .
Economic    Economy grows 5%     (D)    30    65     5     –     0     0     .     .
climate     Economy grows 1%     (E)    15    60    25     0     –     0     .     .
factors     Economy declines     (F)    10    25    65     0     0     –     .     .
            .                    (G)     .     .     .     .     .     .     –     .
            .                    (H)     .     .     .     .     .     .     .     –

(e) Using the original probabilities and the cross probabilities, the overall likelihoods
of different developments can be calculated. This may involve simulation. For
example, one factor might be the launch of a new product, and the develop-
ments associated with it would be the levels of the success of the launch. Given
the probabilities of all other developments, the relative chances of the successful
launch of the new product can be calculated.
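As an illustration of step (e), the minimal Python sketch below estimates the overall likelihoods of the three economic developments by simulation. The prior probabilities of the three governments are hypothetical step (c) estimates; the conditional rows are taken from Table 13.1:

    import random

    governments = ['pro-Western', 'independent', 'Marxist']
    prior = [0.5, 0.3, 0.2]                  # hypothetical step (c) estimates

    # P(economy grows 5% / grows 1% / declines | government), rows A-C of Table 13.1
    cross = {
        'pro-Western': [0.50, 0.40, 0.10],
        'independent': [0.30, 0.50, 0.20],
        'Marxist':     [0.05, 0.35, 0.60],
    }

    counts = [0, 0, 0]
    trials = 100_000
    for _ in range(trials):
        govt = random.choices(governments, weights=prior)[0]       # political development
        econ = random.choices([0, 1, 2], weights=cross[govt])[0]   # economy, given it
        counts[econ] += 1

    for label, c in zip(['grows 5%', 'grows 1%', 'declines'], counts):
        print(label, c / trials)             # overall likelihood of each development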
The essence of cross-impact matrices is that they are a means whereby the plan-
ner can juggle with a whole series of uncertain developments and, in particular, their
influence on the others. The cost of the technique may only be justified when the
list of developments is long. In these circumstances the whole process may be
computerised, with formulae being the basis of the adjustment of the probabilities.
How all the probabilities are used is not a part of the technique. They may be used
formally in further calculations or they may be used informally in making judge-
ments about the future. They may well be used as part of scenario writing, in
formulating the most realistic scenarios.
Although a sales example has been used to illustrate these steps, the technique is
at its best when dealing with technological uncertainties. In fact, one of the earliest
reports of its application was in the development of the US Minuteman missile
system.
The advantage of the technique is that it offers a relatively straightforward means of
tackling the difficult task of dealing with a wide range of complex events and
interactions. Its disadvantages are its expense and the need for the forecaster to
have the capability to handle the probabilities produced.

13.4.4 Analogies
When forecasting a variable for which there is no data record, a second variable
whose history is completely known and which is supposed to be similar to the first
is used as an analogy. Because of conceptual similarities between the two, it is


assumed that as time goes by the data pattern of the first will follow the pattern of
the second. The forecast for the first is the already-known history of the second.
For example, the company forecasting sales of a new product in South East Asia
might choose as an analogy the sales of a previous new product with similar
characteristics marketed in that country or a similar country in the recent past. The
growth record of the previous product is the basis of the forecast for the current
one. The forecast does not have to be exactly the same as the analogy. The record
may be adjusted for level and scatter. For instance, the sales volume of the current
product may be thought to be double that of the previous one and have greater
month-by-month variations. To forecast, the growth record of the analogy would be
doubled and the scatter (or confidence limits) of the monthly forecasts increased.
The essence of the technique is not that the analogy should be exactly like the
forecast variable but that similarities in the products and the marketing environment
should be sufficient to believe that the data patterns will be comparable.
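A minimal sketch of the level-and-scatter adjustment just described, with hypothetical figures (the doubling factor and the ±20 per cent monthly limits are illustrative assumptions):

    analogy_sales = [10, 14, 19, 26, 34, 41]         # monthly history of the analogous product

    forecast = [2 * s for s in analogy_sales]        # adjust level: new product sells double
    limits = [(0.8 * f, 1.2 * f) for f in forecast]  # adjust scatter: wider monthly limits

    for month, (f, (lo, hi)) in enumerate(zip(forecast, limits), start=1):
        print(month, f, round(lo, 1), round(hi, 1))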
The advantage of the technique is that it provides a cheap but comprehensive
forecast in a way that makes sense to the marketing managers. The analogy is not
restricted to business. Biological growth can provide the basis for analogies in
business and the social sciences. The underlying philosophy of the technique is that
there may be social laws just as there are natural laws. Although the laws themselves
in, say, marketing may not be fully or even partially understood, data records are the
evidence of the laws and can be used in forecasting.
The main problem with the technique is that there must be at least one but not
too many analogies to choose from. If the situation is totally new to the organisa-
tion, there may not be an analogy. On the other hand, there may be several plausible
analogies and great arguments may develop in deciding the right one to use. For
example, a wine and spirits company was planning the launch of a promising new
product, but it was extremely difficult to decide which of several previous successful
products should be the analogy. All had been successful, yet their growth patterns
differed considerably. The problem was resolved by making a subjective decision in
favour of one but agreeing to monitor the forecast variable’s record closely to see if,
at some point, the marketers should revert to a new analogy.

13.4.5 Catastrophe Theory


Most forecasting techniques, whether qualitative or quantitative, are based on the
assumption that changes in conditions or in the forecast variable will be more or
less continuous. In other words, although conditions may change, it is an intrinsic
assumption of most techniques that conditions will not suddenly be completely
different. Likewise, a variable may exhibit gradual trends or steep growth or decline,
but techniques do not take into account the possibility of ‘jumps’ from one kind of
behaviour to another. Catastrophe theory deals with the possibility of such radical
changes. It does not refer to catastrophe in the sense of disaster but in the sense of a
sudden alteration in behaviour.
There are plenty of examples of this sort of behaviour in non-business fields: in
psychology, the change of mood from, say, fear to anger; in chemistry, the changes
in a substance from a solid to a liquid and from a liquid to a gas; in atomic physics,


the ideas of quantum theory. In business the examples may not be so clear-cut, but
there are plenty of possibilities to think about: a sudden take-off in the sales of a
product, a turnaround in a company’s profitability, an instantaneous change in the
price of a commodity.
Catastrophe theory is not a quantitative technique. It does not calculate the ex-
pected size of the jumps. Rather, it is a systematic way of determining whether a
catastrophe is likely in a given situation. The technique comprises a series of
questions to answer and characteristics to look for that will indicate the nature of
the situation being investigated.
Catastrophe theory is relatively new and there is not much in the way of a track
record to judge its success. However, it certainly fills a gap in the range of forecast-
ing techniques and is growing in popularity. The reason for its importance and the
interest it has created is this: companies can usually take emergency action to deal
with continuous changes (whether rapid or not) in a situation. Sudden jumps or
reversions in behaviour, on the other hand, often leave no time for taking evasive
action. The potential of catastrophe theory is that it may be able to predict circum-
stances with which companies have no way of dealing unless they have advance
warning.

13.4.6 Relevance Trees


The techniques described so far have all started with the present situation and then
put out ‘feelers’ to see what the future might look like. These techniques can be
described as exploratory. Relevance trees are different. They form a technique that
starts in the future with a picture of what the future should ideally look like and
works back to determine what must occur to make this future happen. Such an
approach is described as normative.
The technique starts with a broad objective, breaks this down into sub-objectives
and then further breaks down the sub-objectives through perhaps several different
levels until specific technological developments are being considered. This structure
is a relevance tree. The elements of the tree are then given ‘relevance weights’ from
which it is possible to calculate the overall relevance of the technological develop-
ments that are at the lowest level of the tree. The outcome of the technique is a list
of those developments that are most important or relevant to the achievement of
the higher-level objective and sub-objectives.
In more detail, there are seven steps in the application of relevance trees. They
will be described using as an example a much simplified case of the design of a new
passenger airliner, based on Makridakis (1983).
(a) Draw the relevance tree. For the airliner it might look like Figure 13.1.

Figure 13.1 Relevance tree (objective: build commercially successful airliner; level 1: provide accommodation, provide environment, low costs, good operating performance; level 2: passengers, baggage, pressure, catering, capital, running, range, runway, all-weather; level 3: seating, protection)


(b) Establish criteria for determining priorities. In a purely financial case there might
be only one criterion: money. In the more usual technological applications there
are several criteria, which are the dimensions along which achievement can be
measured. In the airliner example the criteria might be:

A Passenger comfort
B Safety
C Cost
D Route capability

(c) Weight the importance of each criterion relative to the others. A group of
experts would presumably have to carry out this task by answering questions
such as: ‘What is the weight of each criterion in achieving the highest level objec-
tive?’ In the airliner example the weights might be assigned:

Weight
A Passenger comfort 0.10
B Safety 0.35
C Cost 0.40
D Route capability 0.15
Total 1.00

(d) Weight the sub-objectives at each level (referred to as the elements of the tree)
according to their importance in meeting each criterion. The question posed
might be: ‘In order to meet criterion C, what is the relative importance of each
element at level 3?’ At each level a set of weights for each criterion must be as-
sessed. For the airliner example, the process might work as in Table 13.2.
The first column in the table, for example, shows the assessed relevance of the
four elements at level 1 to the criterion of comfort. Accommodation is weighted
20 per cent, environment 65 per cent and so on. Since the table gives the relative
relevance of the elements to the criteria, each column must sum to 1. The pro-
cess of assessing relevance weights is carried out for each level of the relevance
tree.


Table 13.2 Relevance tree: element weights


Criteria
Comfort Safety Cost Route
Criterion weight 0.10 0.35 0.40 0.15
Elements at level 1 Element weights
Accommodation 0.20 0.35 0.05 0.05
Environment 0.65 0.25 0.05 0.05
Low costs 0.05 0.05 0.75 0.25
Performance 0.10 0.35 0.15 0.65
1.00 1.00 1.00 1.00

Table 13.3 Relevance tree: partial relevance numbers


Criteria
Comfort Safety Cost Route
Criterion weight 0.10 0.35 0.40 0.15
Elements at level 1 Element weights
Accommodation 0.20 0.35 0.05 0.05
Environment 0.65 0.25 0.05 0.05
Low costs 0.05 0.05 0.75 0.25
Performance 0.10 0.35 0.15 0.65
Partial relevance numbers
Accommodation 0.0200 0.1225 0.0200 0.0075
Environment 0.0650 0.0875 0.0200 0.0075
Low costs 0.0050 0.0175 0.3000 0.0375
Performance 0.0100 0.1225 0.0600 0.0975

(e) Calculate partial relevance numbers. Each element has a partial relevance
number (PRN) for each criterion. It is calculated:
PRN = Criterion weight × Element weight
It is a measure of the relevance of that element with respect only to that criterion
(hence ‘partial’). For the airliner example the partial relevance numbers are
shown in Table 13.3. For instance, from the table the PRN for accommodation
with respect to safety is 0.1225 (= 0.35 × 0.35).
(f) Calculate the local relevance number (LRN) for each element. The LRN for each element is
the sum of the PRNs for that element. It is a measure of the importance of that
element relative to others at the same level in achieving the highest-level objec-
tive. For the airliner example the LRNs are as shown in Table 13.4.


Table 13.4 Calculation of local relevance numbers


Criteria Comfort Safety Cost Route
Level 1 Partial relevance numbers LRN
Accommodation 0.0200 0.1225 0.0200 0.0075 0.17
Environment 0.0650 0.0875 0.0200 0.0075 0.18
Low costs 0.0050 0.0175 0.3000 0.0375 0.36
Performance 0.0100 0.1225 0.0600 0.0975 0.29

The LRN for accommodation is 0.17 (= 0.0200 + 0.1225 + 0.0200 + 0.0075).


There is one LRN for each element at each level.
(g) Calculate cumulative relevance numbers (CRNs). There is one for each
element. They are calculated by multiplying the LRN of an element by the LRNs
of each associated element at a higher level. This gives each element an absolute
measure of its relevance. For example, in the airliner example, at level 3 the CRN
for seating is calculated:
CRN (seating) =
LRN (seating) × LRN (passengers) × LRN (accommodation)
By this means the bottom row of elements (specific technological requirements)
will have overall measures of their relevance in achieving the objective that was
the starting point at the highest level of the tree. This should lead to decisions
about the importance, timing, resource allocation, etc. of the tasks ahead.
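The arithmetic of steps (e) to (g) is easy to mechanise. The following is a minimal Python sketch using the level 1 weights of the airliner example (Table 13.2); the level 2 and level 3 LRNs at the end are hypothetical values included purely to illustrate the CRN calculation.

criterion_weights = {'comfort': 0.10, 'safety': 0.35, 'cost': 0.40, 'route': 0.15}
element_weights = {
    'accommodation': {'comfort': 0.20, 'safety': 0.35, 'cost': 0.05, 'route': 0.05},
    'environment':   {'comfort': 0.65, 'safety': 0.25, 'cost': 0.05, 'route': 0.05},
    'low costs':     {'comfort': 0.05, 'safety': 0.05, 'cost': 0.75, 'route': 0.25},
    'performance':   {'comfort': 0.10, 'safety': 0.35, 'cost': 0.15, 'route': 0.65},
}

# Step (e): PRN = criterion weight x element weight, one per element per criterion.
prn = {e: {c: criterion_weights[c] * w for c, w in row.items()}
       for e, row in element_weights.items()}

# Step (f): LRN = sum of an element's PRNs across all the criteria.
lrn = {e: sum(row.values()) for e, row in prn.items()}
print({e: round(v, 2) for e, v in lrn.items()})
# {'accommodation': 0.17, 'environment': 0.18, 'low costs': 0.36, 'performance': 0.29}

# Step (g): CRN = the element's LRN x the LRNs of its ancestors at higher levels.
lrn_passengers, lrn_seating = 0.40, 0.50   # hypothetical level 2 and level 3 values
crn_seating = lrn_seating * lrn_passengers * lrn['accommodation']

The printed LRNs agree with Table 13.4.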
Recall that the technique of relevance trees is normative. Given an objective, it
indicates what must be done to achieve it. Not only that, it indicates the relative
importance or priorities of the tasks ahead. In doing so it suffers from two major
disadvantages. The first is the requirement to draw a relevance tree correctly,
comprehensively structuring the road ahead; the second is the subjective assessment
of element and criterion weights. If either of these tasks is not done well then the
result will be nonsense. It is perhaps as well to look at relevance trees as much for
the process of using the technique as for the final result. The activity of considering
the options and their relevance would probably carry substantial benefits in terms of
understanding future needs, even if the numerical relevance values were never to be
used.

Learning Summary
The obvious characteristic that distinguishes qualitative from quantitative forecast-
ing is that the underlying information on which it is based consists of judgements
rather than numbers, but the distinction goes beyond this. Qualitative forecasting is
usually concerned with determining the boundaries within which the long-term
future might lie; quantitative forecasting tends to provide specific point forecasts
and ranges for variables in the nearer future. Qualitative forecasting offers tech-
niques that are very different in type, from the straightforward, exploratory Delphi
method to the normative relevance trees. Also, qualitative forecasting is at an early
stage of development and many of its techniques are largely unproven.


Whatever the styles of qualitative techniques, their aims are the same: to use
judgements systematically in forecasting and planning. In using the techniques it
should be borne in mind that the skills and abilities that provide the judgements are
more important than the techniques. Just as it would be pointless to try a quantita-
tive technique with ‘made-up’ numerical data, so it would be folly to use a qualitative
technique in the absence of real knowledge of the situation in question. The
difference is that it is perhaps easier to discern the lack of accurate data than the lack
of genuine expertise.
On the other hand, where real expertise does exist, it would be equal folly not to
make use of it. For long-term forecasting, by far the greater proportion of available
information about a situation is probably in the form of judgement rather than
numerical data. To use these judgements without the help of a technique usually
results in a plan or forecast biased by personality, group effects, self-interest, etc.
Qualitative techniques offer chances to distil the real information from the sur-
rounding noise and refine it into something useful.
In spite of this enthusiasm there is a warning. In essence, most qualitative tech-
niques come down to asking questions of experts, albeit scientifically. Doubts about
the value of experts are well entrenched in management folklore. But doubts about
the questions can be much more serious, making all else pale into insignificance.
Armstrong (1985) quotes the following extract from a survey of opinion by Hauser
(1975).

Question % answering yes


1. Do you believe in the freedom of speech? 96
2. Do you believe in the freedom of speech to the extent of allowing radicals
to hold meetings and express their views to the community? 22

The lesson must be that the sophistication of the techniques will only be worth
while if the forecaster gets the basics right first.


Review Questions
13.1 There are three types of forecasting technique: short-, medium- and long-term. True or
false?

13.2 Time series methods have which of the following characteristics?


A. They involve only the forecast variable and other time-related variables.
B. They are good at short-term forecasting.
C. They can predict turning points in a series.
D. They are useful when forecasting demand for many low-value items.

13.3 Causal modelling has which of the following characteristics?


A. It relates the forecast variable to other variables that are thought to cause it.
B. It works only for variables that are not time related.
C. It can predict turning points.
D. It is not good at short-term forecasting.

13.4 Causal modelling is the same as least squares regression analysis. True or false?

13.5 Qualitative forecasting techniques can use numerical data. True or false?

13.6 Qualitative techniques are used in which of the following circumstances?


A. When data are scarce.
B. In long-term forecasting when the data record may not be relevant.
C. When data are unreliable.
D. When conditions are changing rapidly.

13.7 Which attributes apply to the Delphi technique?


A. The participants are experts.
B. The participants tell their forecasts only to the chairman but give their reasons
directly to other participants.
C. The chairman relates all the individual forecasts to the group but does not say
who made each forecast.
D. Iterations of the procedure continue until a consensus is achieved.

13.8 The intention of scenario writing is to provide several alternative views of the future. It
is not to provide a specific forecast. True or false?

13.9 Which of the following statements about a cross-impact matrix are true?
A. It produces sales forecasts.
B. It gives the probabilities that sales will reach a given level.
C. It can only be applied if an extensive list of possible future developments can be
provided.
D. It reviews the probabilities of these developments under the assumptions that
other developments have taken place.


13.10 A good analogy variable should have had exactly the same values at each point of its
history as that expected for the forecast variable. True or false?

13.11 Catastrophe theory deals with which of the following changes in variables?
A. Discontinuities in behaviour.
B. Steep growth.
C. Falls to disastrously low levels.

13.12 In relevance trees, the partial relevance number (PRN) of an element of the tree is
calculated as which of the following?
A. Sum of the criterion weights.
B. Sum of element weightings.
C. Criterion weight × Element weighting.
D. Sum of criterion weights × Element weighting.

Case Study 13.1: Automobile Design


1 Apply the technique of relevance trees to the design of a new automobile. The answer
should include:
a. The first two levels of a tree.
b. A list of criteria and criteria weights.
c. Element weights.
d. Partial relevance number (PRNs).
e. Local relevance numbers (LRNs).
f. Cumulative relevance numbers (CRNs).
There is scope for variation in approach to this case. For instance, there is no one
absolutely correct set of criteria. Answers with different criteria might be equally valid.
However, all answers should have the same general structure.
Note that this case differs substantially from the airliner example in the text, even
though both were concerned with the provision of transportation. For the airliner, the
emphasis was on the purchasers (the airlines) and their passengers; for the automobile,
the emphasis will be on the driver who is also probably the purchaser. Consequently,
the balance of relevance will change.

References
Armstrong, J. S. (1985). Long Range Forecasting. New York: John Wiley and Sons.
Hauser, P. M. (1975). Social Statistics in Use. New York: Russell Sage.
Makridakis, S. G. (1983). Forecasting: Methods and Applications (2nd ed.). New York: John Wiley
and Sons.



Module 14

Time Series Techniques


Contents
14.1 Introduction.......................................................................................... 14/1
14.2 Where Time Series Methods Are Successful ................................... 14/2
14.3 Stationary Series .................................................................................. 14/2
14.4 Series with a Trend.............................................................................. 14/6
14.5 Series with Trend and Seasonality..................................................... 14/8
14.6 Series with Trend, Seasonality and Cycles ....................................... 14/8
14.7 Review of Time Series Techniques .................................................. 14/15
Learning Summary ....................................................................................... 14/17
Review Questions ......................................................................................... 14/18
Case Study 14.1: Interior Furnishings ........................................................ 14/20
Case Study 14.2: Garden Machinery Manufacture.................................... 14/21
Case Study 14.3: McClune and Sons........................................................... 14/21

Prerequisite reading: None

Learning Objectives
By the end of the module the reader should know where to use time series methods.
Time series data are distinguished by being stationary or non-stationary. In the latter
case the series may contain one or more of a trend, seasonality or a cycle. The
module describes at least one technique to deal with each type of series.
Technical section: The section marked with * contains technical material and may
be omitted on a first reading of the module.

14.1 Introduction
Time series methods are forecasting techniques that predict future values of a
variable solely from its own historical record. In various ways they identify patterns
in past data and project them into the future. The methods are categorised accord-
ing to the types of series to which they can be applied. The different types of series
are:
(a) stationary (meaning, roughly, without a trend);
(b) with a trend (a consistent movement upwards or downwards);
(c) with a trend and seasonality (a regular pattern that repeats itself every year);
(d) with a trend, seasonality and cycle (a regular pattern that takes more than a year
to repeat itself).


This module will describe techniques that deal with each of the above types of
data. First, the particular situations in which time series techniques have shown
themselves to be successful will be discussed.

14.2 Where Time Series Methods Are Successful


At first glance it may seem that such inward-looking techniques are bound to be
inferior to both qualitative techniques and causal modelling. When conditions
change, time series methods, looking only at the past record, have no way of
predicting the change or even responding quickly to it. Nevertheless, there are
distinct areas in which they are very successful. When tested, they have compared
favourably with the competition. The areas in which they have shown themselves to
be successful are:
(a) In stable conditions. If there are no changing circumstances, it can reasonably
be assumed that the factors that caused movements of the variable in the past
will continue into the future. Consequently, the time series approach is likely to
provide good forecasts.
(b) For short-term forecasts. If there is not sufficient time for any substantial
changes in conditions, then time series methods are valid. In the short term most
data series continue as in the past.
(c) As a base forecast. The base forecast shows what would be expected if the
future were similar to the past. Even when conditions are changing, time series
methods provide starting forecasts on which judgements about changing condi-
tions can be built.
(d) For screening data. Time series techniques identify patterns in the historical
record. The patterns can be used to obtain a better understanding of past
movements. For instance, it might emerge that a particularly high level of sales
one month was merely the coincidence of high points in a cycle and in seasonali-
ty.
Time series methods have a good record in all the above situations. Surveys of
forecasting performance have frequently shown them to outperform other ap-
proaches. The techniques will now be described. They are categorised according to
the series to which they can be applied.

14.3 Stationary Series


A data series is stationary if it fluctuates about some constant level, and, while the
amount of fluctuation differs from one time period to the next, there is no general
tendency for there to be more fluctuation at one part of the series than at another.
More technically, a stationary series has no trend and constant variance.
In the long run, virtually no series are stationary, but they may be in the short
run. For example, the weekly stock volumes in a warehouse over two years are a
long series (104 observations) that may well be stationary. Over five years it would
probably not be.


14.3.1 Moving Averages


The original series is replaced by a ‘smoothed’ series, obtained by replacing each
actual observation with the average of it and other observations either side of it. If
each average is calculated from three actual observations, it is said to be a three-
point moving average; if each is from five actual observations, it is a five-point
moving average and so on. Table 14.1 gives an example of a three-point moving
average.
To help the explanation, some algebra has been introduced. Each number in a
data series is given a label, xt, where t refers to the time period. Table 14.2 shows
how the labelling works for a quarterly data series from Quarter 3 of 2016 to
Quarter 4 of 2017.

Table 14.1 Three point moving average


Time Actual Smoothed series
period (t) series (x)
1 17
2 23 17.3 = (17 + 23 + 12)/3
3 12 16.3 = (23 + 12 + 14)/3
4 14 17.3 = (12 + 14 + 26)/3
5 26 18.7 = (14 + 26 + 16)/3
6 16

Smoothed value at period t = (Value at t−1 + Value at t + Value at t+1)/3

Algebraically:

St = (xt−1 + xt + xt+1)/3

Table 14.2 Data labels for time series


2016 2017
Time period Q3 Q4 Q1 Q2 Q3 Q4
Data 17 23 12 14 26 16
t 1 2 3 4 5 6
Label x1 x2 x3 x4 x5 x6

The averaging process is intended to smooth away the random fluctuations in the
series. The forecast for any future time period is the most recent smoothed value. In
the above series the forecast is 18.7 for all periods in the future. A constant forecast
makes sense because the series is stationary.
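As an illustration, a minimal Python sketch of the calculation, reproducing the smoothed series of Table 14.1, might look like this:

# A three-point centred moving average (the function handles any odd
# number of points), reproducing the smoothed series of Table 14.1.
def moving_average(series, points=3):
    half = points // 2
    return [sum(series[i - half:i + half + 1]) / points
            for i in range(half, len(series) - half)]

data = [17, 23, 12, 14, 26, 16]
print([round(s, 1) for s in moving_average(data)])   # [17.3, 16.3, 17.3, 18.7]
# The forecast for all future periods is the last smoothed value, 18.7.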
Seasonal as well as random fluctuations in the data can be smoothed away by
including sufficient points in the average to cover the seasonality. Seasonal monthly
data would be smoothed using a 12-point moving average. Each month is included
once and only once in the average, and thus seasonal variation will be averaged out.


The use of an even number of points in a moving average creates a problem. A
smoothed value can no longer refer to a particular time period; it must refer to half-
way between the middle two time periods. For example, a three-point moving
average for the months January, February and March is the smoothed value for
February. A four-point moving average for January, February, March and April is
the smoothed value for a point halfway between February and March. This makes no difference
when forecasting a stationary series, but it does have an effect in other uses of
moving averages. We will return to this point later.
The number of points included in the moving average is usually equal to the
seasonality of the data. In the absence of seasonality the average should include
sufficient points to be able to smooth the fluctuations, but not so many that the last
smoothed value refers to a time period remote from the time periods for which the
forecasts are being made. In practice, three- or five-point moving averages are
probably the most common.
Even when the data are non-stationary, the method of moving averages can still
be used to smooth out random fluctuations, enabling trends and other patterns to
be seen more clearly.

14.3.2 Exponential Smoothing


For a moving average, each value in the average was given an equal weight. In a
three-point moving average, each value is given a weight of 1/3. Exponential
smoothing is a way of constructing an ‘average’ that gives more weight to recent
values of the variable.
The smoothed series is given by the equation:
New smoothed value = (1 − α) × Previous smoothed value + α × Most recent actual value
i.e. St = (1 − α)St−1 + αxt
where α is between 0 and 1.
The value of α (alpha) is chosen by the forecaster. The larger its value, the heavier
the weighting being given to the recent values. Its value may be selected after testing
out several values and measuring which is the best. In practice, α is usually in the
range 0.1 to 0.4.
Example
The same data used in Table 14.1 are exponentially smoothed in Table 14.3. Since the
smoothing equation requires a previous smoothed value to get it started, it is usual to
make the first smoothed value equal to the actual value. This assumption will have a
negligible effect unless the series is a very short one.


Table 14.3 Exponential smoothing example


Original series Smoothed series
(using α = 0.2)
17 17
23 18.2 (= 0.8 × 17 + 0.2 × 23)
12 16.96 (= 0.8 × 18.2 + 0.2 × 12)
14 16.37 (= 0.8 × 16.96 + 0.2 × 14)
26 18.30 (= 0.8 × 16.37 + 0.2 × 26)
16 17.84 (= 0.8 × 18.30 + 0.2 × 16)

Figure 14.1 shows how exponential smoothing works to average out random fluctua-
tions. As with moving averages, the forecast for future time periods of a stationary
series is the most recent smoothed value, in this case 17.84.

Figure 14.1 Exponential smoothing (the original and smoothed series plotted against time, periods 1 to 6)
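A minimal Python sketch of the smoothing recursion, reproducing Table 14.3, might look like this:

# Exponential smoothing with the first smoothed value set equal to the
# first actual value, as in Table 14.3.
def exponential_smoothing(series, alpha=0.2):
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append((1 - alpha) * smoothed[-1] + alpha * x)
    return smoothed

data = [17, 23, 12, 14, 26, 16]
print([round(s, 2) for s in exponential_smoothing(data)])
# [17, 18.2, 16.96, 16.37, 18.29, 17.84] - the last value is the forecast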


Technical Note*
A little algebraic manipulation is required to show why exponential smoothing
gives different weighting to different time periods. The equation for exponential
smoothing is:
St = (1 − α)St−1 + αxt
But, from the previous time period:
St−1 = (1 − α)St−2 + αxt−1
Putting this St−1 in the top equation gives:
St = (1 − α)²St−2 + (1 − α)αxt−1 + αxt
Just as St−1 in the original equation was substituted, so St−2 can be substituted.
Continuing this process eventually gives:
St = αxt + α(1 − α)xt−1 + α(1 − α)²xt−2 + α(1 − α)³xt−3 + ⋯
The weightings being given to past values are:
α, α(1 − α), α(1 − α)², α(1 − α)³, …


Since α, and thus 1 − α, lie between 0 and 1, these weightings are decreasing.
For instance, if α = 0.2, the weightings are 0.2, 0.16, 0.128, 0.1024, 0.0819 …
Recent actual values receive heavier weighting than earlier ones.
The smoothing equation derived above illustrates how the weighting works. It is
not intended to be used for calculations.
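The decreasing pattern of the weights is easy to verify numerically, for example with this one-line Python sketch for α = 0.2:

# The implied weights alpha * (1 - alpha)**k for alpha = 0.2.
alpha = 0.2
print([round(alpha * (1 - alpha) ** k, 4) for k in range(5)])
# [0.2, 0.16, 0.128, 0.1024, 0.0819]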

14.4 Series with a Trend


The use of moving averages or exponential smoothing may reveal the existence of a
trend, or the trend may have been immediately obvious without any smoothing. For
a non-stationary series these two techniques have to be adapted before they can be
used. There are several variants of moving averages and exponential smoothing that
can deal with a trend. The one described here is Holt’s Method, which is a form of
exponential smoothing.

14.4.1 Holt’s Method


The formula for exponential smoothing is:
St = (1 − α)St−1 + αxt
If the series has a trend, the smoothed value St (which is the forecast for future
time periods) will generally be too low since (a) it is formed in part from the
previous smoothed value St−1 and (b) the forecast does not allow for the effect of a
trend on future values. If there is a trend, it should be seen in the smoothed values.
Therefore a first way of calculating a trend might be:
Trend = Most recent smoothed value − Previous smoothed value
i.e. Trend at time t = St − St−1
Just as random fluctuations in the actual data can be smoothed, so it is with the
trend. A smoothed estimate of the trend is obtained by using a smoothing constant
(labelled γ) to combine the most recently observed trend (St−St−1) with the previous
smoothed trend. γ is between 0 and 1, is chosen by the forecaster and may or may
not be different from α.
Smoothed trend = (1 − γ) × Previous smoothed trend + γ × Most recently observed trend
i.e. Tt = (1 − γ)Tt−1 + γ(St − St−1)
How is this estimate of the trend used in conjunction with the exponential
smoothing formula? First, the basic formula is changed so that the previous
smoothed value, St−1, is increased to allow for the trend:
St = (1 − α)St−1 + αxt
becomes
St = (1 − α)(St−1 + Tt−1) + αxt
Secondly, future forecast values allow for the effect of the trend. A forecast for
three periods ahead is no longer St but:
Forecast 3 periods ahead = Most recent smoothed value + 3 × Trend


More generally, the forecast for m periods ahead, Ft, is given by:
Ft = St + m·Tt
To summarise, when a time series has a trend, forecasts with Holt’s Method are
based on three equations:
St = (1 − α)(St−1 + Tt−1) + αxt
Tt = (1 − γ)Tt−1 + γ(St − St−1)
Ft = St + m·Tt
where
xt = actual observation at time t
St = smoothed value at time t
α, γ = smoothing constants between 0 and 1
Tt = smoothed trend at time t
Ft = forecast for m periods ahead

Table 14.4 shows how Holt’s Method is applied to an annual series of sales fig-
ures. The series has been shortened in order to simplify the example. The
smoothing constants have values α = 0.2, γ = 0.3.

Table 14.4 Holt’s Method example


Year Sales volume Smoothed sales (α = 0.2) Smoothed trend (γ = 0.3)

2012 12 12.0 –
2013 15 15.0 3.00
2014 20 18.4 = 0.8(15 + 3) + 0.2 × 20 3.12 = 0.7(3.00) + 0.3(18.4 − 15)
2015 21 21.4 = 0.8(18.4 + 3.12) + 0.2 × 21 3.08 = 0.7(3.12) + 0.3(21.4 − 18.4)
2016 25 24.6 = 0.8(21.4 + 3.08) + 0.2 × 25 3.12 = 0.7(3.08) + 0.3(24.6 − 21.4)
2017 28 27.8 = 0.8(24.6 + 3.12) + 0.2 × 28 3.14 = 0.7(3.12) + 0.3(27.8 − 24.6)

Forecasts
2018 30.94 = 27.8 + 3.14
2019 34.08 = 27.8 + 2 × 3.14
2020 37.22 = 27.8 + 3 × 3.14

The choice of smoothing constants is based on the same principles as for ordi-
nary exponential smoothing.
The calculating process needs a starting point both for the trend and for the
smoothed values. The smoothed values for the first two time periods are taken to be
equal to the actual values. There can be no trend for the first time period. The
smoothed trend for the second time period is taken to be equal to the difference
between the first two actual values.
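A minimal Python sketch of Holt’s Method might look like this; it reproduces Table 14.4, with small differences in the forecasts because the table rounds its intermediate values:

# Holt's Method with the starting values described above: the smoothed value
# starts at the second observation and the trend at the first difference.
def holt(series, alpha, gamma, horizon):
    s = series[1]
    t = series[1] - series[0]
    for x in series[2:]:
        s_new = (1 - alpha) * (s + t) + alpha * x
        t = (1 - gamma) * t + gamma * (s_new - s)
        s = s_new
    return [s + m * t for m in range(1, horizon + 1)]

sales = [12, 15, 20, 21, 25, 28]                        # 2012 to 2017
print([round(f, 2) for f in holt(sales, 0.2, 0.3, 3)])  # forecasts for 2018 to 2020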


14.5 Series with Trend and Seasonality


Seasonality is defined as some regular pattern of upward and downward move-
ments that repeats itself every year or less. There are several techniques that can deal
with a series that has a trend and seasonality. Just one is described here. The Holt–
Winters Method, like other techniques that handle seasonality, is technically
complex and therefore only the underlying principles, rather than the details, are
given. The Holt–Winters Method is an extension of Holt’s Method.
Holt’s Method is based on two smoothing formulae. The first relates to the series
itself, with a smoothing constant denoted by α; the second relates to the trend, with
a smoothing constant denoted by γ. The Holt–Winters Method has a third smooth-
ing equation for seasonality, with a smoothing constant denoted by β.
The seasonality in a series is seen in that some time periods (months, quarters,
etc. depending upon the nature of the data) are always above or below the
smoothed value. For example, sales of cold drinks are likely to be higher in summer
months and lower in winter months, year in, year out.
Seasonality is measured as the ratio between the actual data and the smoothed
data:
Seasonality = actual data/smoothed data
Just as the trend was smoothed for Holt’s Method with smoothing constant γ, so
seasonality is smoothed with the smoothing constant β. A forecast using Holt–
Winters then combines the smoothed value of the series with the smoothed trend
and the smoothed seasonality. The three equations for producing a Holt–Winters
forecast are complex and are not reproduced here. In practice, Holt–Winters
forecasts would always be produced via an appropriate statistical or forecasting
software package. The smoothing constant for seasonality, β, is chosen using the
same principles as for α and γ.

14.6 Series with Trend, Seasonality and Cycles


A cycle is some regular repeating pattern of upward and downward movement of
length greater than one year. Contrast this with seasonality, in which the patterns
repeat in no more than a year. One of the most common methods of dealing with
the three elements – trend, seasonality and cycle – is the charmingly titled decom-
position method.

14.6.1 Decomposition Method


The method is based on the supposition that a time series can be separated or
decomposed into four distinct elements:
(a) Trend
(b) Cycle
(c) Seasonality
(d) Random


The first three of these elements are then reassembled to make a forecast. The
elements are isolated one by one.

Trend
The trend is isolated by regression analysis between the data and time (see Fig-
ure 14.2), i.e. the observations (xt) are regressed against time (t), where t takes on the
value 1 for the first time period, 2 for the second, 3 for the third and so on.
The regression equation will look like this:
xt = a + bt + ut
where:
xt = Actual data
a + bt = Trend element
ut = Residuals comprising seasonality, cycle and random parts
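As a sketch of this step, assuming the numpy library is available, the least-squares coefficients can be obtained with np.polyfit:

# Isolating the trend: regress the observations on time.
import numpy as np

x = np.array([4.8, 6.9, 5.9, 8.1, 6.6, 9.5])  # first shipments from the example below
t = np.arange(1, len(x) + 1)                   # time periods 1, 2, 3, ...
b, a = np.polyfit(t, x, 1)                     # fits x = a + b*t by least squares
trend = a + b * t                              # the trend element at each period
# (the worked example below fits all 36 observations, not just these six)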

Cycles
The next step is to isolate any cycle in the data. By choosing a suitable moving
average (12 points for monthly data, four for quarterly, etc.) the random and
seasonal elements can be smoothed away, leaving just the trend and cycle. If St is
such a moving average, then the ratio between St and the trend (a +bt) must be the
cycle. If the ratio St/(a + bt) is approximately one for all time periods then there is
no cycle. If it differs from one with any regular pattern then the ratio should be
inspected to determine the nature of the cycle. For instance, if the ratio is graphed
against time, it might appear as in Figure 14.3. This suggests a cycle of period 12
quarters, or three years. The ratio returns to its starting point after this interval of
time.

Figure 14.2 Isolation of trend by regression analysis (the observations xt plotted against time t, with the fitted line of intercept a and slope b, and residuals u1, u2, u3)


Figure 14.3 Isolation of cycle pattern (the ratio St/(a + bt) plotted against time in quarters; the pattern repeats every 12 quarters)


The size of the cyclical effect is measured by calculating the average of the ratio
for each point in the cycle. For example, for the fifth time period in the cycle (t = 5,
17, 29, 41, …):
Cyclical effect = Average of St/(a + bt) for t = 5, 17, 29, 41, …

Seasonality
Seasonality is isolated by an approach similar to that for cycles. The moving average,
St, comprises trend and cycles; the actual values comprise trend, cycle, seasonality
and random effect. The ratio:
xt/St, i.e. Actual/Moving average,
should therefore reflect seasonality and random effect. Suppose the data are
quarterly, then the seasonality for, say, the first quarter is calculated by averaging the
ratios:
Seasonality for first quarter = Average of x1/S1, x5/S5, x9/S9, …

The seasonality for the other three quarters can be calculated similarly. The aver-
aging helps to eliminate the random effect that is contained in the ratios.
In making a forecast, the three isolated elements are multiplied together. Suppose
the forecast for a future quarter, t = 50, is needed. It will be calculated:
Forecast = Trend × Cycle × Seasonality, for t = 50
If the data are quarterly with a cycle of length 12 quarters, t = 50 is the second
period of a cycle and the second period of the seasonality. Therefore:
Forecast = (a + 50b) × Cyclical effect for second period of the cycle
× Seasonal effect for second quarter
Example
The data in Table 14.5 refer to the quarterly shipments of an electrical product
from a company’s warehouse. Make a forecast of shipments for each quarter of
2019 using the decomposition method. (The necessary calculations and data are
set out in full in Table 14.6.)


Table 14.5 Decomposition method example: data


Quarter Year
2009 2010 2011 2012 2013 2014 2015 2016 2017
1 4.8 6.6 7.0 10.3 15.1 13.6 19.5 23.1 23.6
2 6.9 9.5 10.8 18.4 18.8 19.4 28.4 31.0 34.6
3 5.9 7.9 9.3 15.8 15.4 17.7 26.0 26.0 31.3
4 8.1 10.6 14.7 19.9 19.7 26.0 34.4 34.4 36.7

Table 14.6 Decomposition method example: elements


(1) (2) (3) (4) (5) (6) (7)
Year/ Time Shipments Moving Trend Cycle Seasonality
Quarter period average
2009 1 1 4.8 – 3.69 – –
2 2 6.9 – 4.53 – –
3 3 5.9 6.42 5.36 1.20 0.92
4 4 8.1 6.87 6.20 1.11 1.18
2010 1 5 6.6 7.52 7.04 1.07 0.88
2 6 9.5 8.02 7.88 1.02 1.18
3 7 7.9 8.65 8.72 0.99 0.91
4 8 10.6 8.75 9.56 0.92 1.21
2011 1 9 7.0 9.07 10.39 0.87 0.77
2 10 10.8 9.42 11.23 0.84 1.15
3 11 9.3 10.45 12.07 0.87 0.89
4 12 14.7 11.27 12.91 0.87 1.30
2012 1 13 10.3 13.17 13.74 0.96 0.78
2 14 18.4 14.80 14.58 1.01 1.24
3 15 15.8 16.10 15.42 1.04 0.98
4 16 19.9 17.30 16.26 1.06 1.15
2013 1 17 15.1 17.40 17.10 1.02 0.87
2 18 18.8 17.30 17.93 0.96 1.09
3 19 15.4 17.25 18.77 0.92 0.89
4 20 19.7 16.87 19.61 0.86 1.17
2014 1 21 13.6 17.02 20.45 0.83 0.80
2 22 19.4 17.60 21.29 0.83 1.10
3 23 17.7 19.17 22.12 0.87 0.92
4 24 26.0 20.65 22.96 0.90 1.26
2015 1 25 19.5 22.90 23.80 0.96 0.85
2 26 28.4 24.97 24.64 1.01 1.14
3 27 26.0 27.07 25.48 1.06 0.96

4 28 34.4 27.97 26.31 1.06 1.23
2016 1 29 23.1 28.62 27.15 1.05 0.81
2 30 31.0 28.62 27.99 1.02 1.08
3 31 26.0 28.62 28.83 0.99 0.91
4 32 34.4 28.75 29.67 0.97 1.20
2017 1 33 23.6 29.65 30.50 0.97 0.80
2 34 34.6 30.97 31.34 0.99 1.12
3 35 31.3 31.55 32.18 0.98 0.99
4 36 36.7 – 33.02 – –

(a) Calculate the trend. A regression analysis is carried out with the ship-
ments as the y variable and time as the x variable, i.e.:

y 4.8 6.9 5.9 8.1 6.6 9.5 …


x 1 2 3 4 5 6 …

The analysis gives:


Shipments = 2.851 + 0.838 × Time
(b) Calculate the cyclical effect. This effect is calculated as the ratio between
a moving average and the trend. The moving average has to be a four-point
average to include each of the quarters in every average and thus smooth
out seasonality (see column 4 in Table 14.6). The first moving average is 6.42,
calculated thus:
6.42 = (4.8 + 6.9 + 5.9 + 8.1)/4
Since it includes the first four actual observations, for time periods 1, 2, 3
and 4, it should really be centred between periods 2 and 3. Since the cyclical
effect is calculated for each time period, each moving average must refer to
one and only one time period. Arbitrarily, it is taken to refer to the later of
the two periods (i.e. time period 3). Period 2 would have been equally ap-
propriate.
The second moving average incorporates the actual data from periods 2–5
and is centred on period 4. The last moving average is for period 35 and
includes the last four items of actual data (periods 33–36).
Column 5 in Table 14.6 shows the trend, calculated from the regression
equation. For example, for 2012, quarter 4 (time period 16):
Trend = 2.851 + 0.838 × 16
= 16.26
The cyclical ratio (Moving average/Trend = Column 4/Column 5) is shown in
column 6. If this ratio exhibits any pattern, it should be revealed by drawing a
graph relating the ratio to time, as in Figure 14.4.


Figure 14.4 Electrical shipments: cycle pattern (the ratio Moving average/Trend plotted against time, fluctuating around 1 with a cycle length of 12 quarters)


There does appear to be a cycle of length 12 quarters, since troughs and
peaks both recur at this interval. For each of the 12 periods within the cycle,
this effect can be calculated by averaging over all such periods. The first pe-
riod of the cycle is taken, arbitrarily, to be the first period of the time series.
The cyclical effect for, as an example, the fifth period of the cycle is calculat-
ed (using data from column 6 of Table 14.6):
Cyclical effect for 5th period of cycle = Average of the ratios at periods 5, 17, 29
= (1.07 + 1.02 + 1.05)/3
= 1.05
The effects for all 12 periods of a cycle are shown in Table 14.7.

Table 14.7 Decomposition method example: cyclical effect


Time period of cycle Cyclical effect
1 0.96
2 1.01
3 1.10
4 1.08
5 1.05
6 1.00
7 0.97
8 0.92
9 0.89
10 0.89
11 0.91
12 0.88

(c) Calculate the seasonal effect. The seasonality is the ratio of actual to
moving average, averaged for each quarter. Table 14.6 shows these ratios in
column 7 calculated as column 3 divided by column 4. Averaging these ratios
for the first quarter:

Seasonality index quarter 1 = (0.88 + 0.77 + 0.78 + 0.87 + 0.80 + 0.85 + 0.81 + 0.80)/8
= 0.82

Table 14.8 shows all four seasonal indices.

Table 14.8 Decomposition method example: basic seasonal indices


Quarter Basic seasonal
index
1 0.82
2 1.14
3 0.93
4 1.21

Unfortunately, there is a difficulty with the basic seasonal indices shown in
Table 14.8. The overall effect is to change the level of the data. Their average is
different from 1.0:
Average seasonal effect = (0.82 + 1.14 + 0.93 + 1.21)/4
= 1.025

The seasonal index is meant to rearrange the pattern within a year, not to
increase the trend. In the above case the trend would be increased by 2.5
per cent each year. The seasonal indices have to be adjusted so that their
average is 1.0. This is done by dividing each index in Table 14.8 by 1.025 to
give the adjusted seasonal indices of Table 14.9.

Table 14.9 Decomposition method example: adjusted seasonal indices


Quarter Adjusted seasonal
index
1 0.80
2 1.11
3 0.91
4 1.18

The average seasonal effect is now neutral since:


(0.80 + 1.11 + 0.91 + 1.18)/4 = 1.00
(d) Make the forecasts. The original time series has now been decomposed
into trend, cycle and seasonality. To make forecasts for 2019, the three ele-
ments are reassembled. The forecasts are shown in Table 14.10.
i. The trend is 2.851 + 0.838 × Time. The four quarters of 2019 are the
time periods numbered 41–44. The trend for the first quarter is there-
fore 2.851 + 0.838 × 41 = 37.21.
ii. Each cycle lasts for 12 periods. Starting at the first quarter of 2009, the
cycles are 2009–11, 2012–14, 2015–17. Consequently, the four quarters
of 2019 are time periods 5–8 of a cycle. The cyclical effects for these
periods are taken from Table 14.7.
iii. The seasonal effect for each quarter is taken from Table 14.9. The fore-
cast is the product of the three elements. For example, for 2019, Q1:
Forecast = 37.21 × 1.05 × 0.80
= 31.26

Table 14.10 Decomposition method example: forecasts


Time Trend Cycle Seasonality Forecast
2019 Q1 37.21 1.05 0.80 31.26
Q2 38.05 1.00 1.11 42.24
Q3 38.88 0.97 0.91 34.32
Q4 39.72 0.92 1.18 43.12
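A minimal Python sketch of this reassembly, using the trend coefficients together with the cyclical effects of Table 14.7 and the adjusted seasonal indices of Table 14.9, might look like this:

# Reassembling the decomposition forecast for the four quarters of 2019.
a, b = 2.851, 0.838                              # trend coefficients
cycle = [0.96, 1.01, 1.10, 1.08, 1.05, 1.00,
         0.97, 0.92, 0.89, 0.89, 0.91, 0.88]     # Table 14.7
season = [0.80, 1.11, 0.91, 1.18]                # Table 14.9

for t in range(41, 45):                          # time periods 41 to 44
    forecast = (a + b * t) * cycle[(t - 1) % 12] * season[(t - 1) % 4]
    print(t, round(forecast, 2))                 # matches Table 14.10 up to rounding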

14.7 Review of Time Series Techniques


Table 14.11 summarises the time series techniques, breaking them down according
to the types of series to which they are applicable. In each category very many more
techniques are available. The only technique that has not been described yet is Box–
Jenkins, a method that is, in theory, applicable to all situations. According to some
surveys, Box–Jenkins is the most accurate technique.
Table 14.11 Smoothing methods
Series type Methods
Stationary Moving averages
Exponential smoothing
Trend Holt’s
Seasonal Holt–Winters
Cyclical Decomposition
Other Box–Jenkins

14.7.1 Box–Jenkins Method


The Box–Jenkins Method allows forecasts to compensate for previous errors, as
time goes by. It does this by incorporating past residuals (or forecasting errors) into
the forecasting equation, which is then adjusted in response to previous mistakes.
Forecasting equations that combine past values of the variable with past values of
the residual are known as ARMA (autoregressive, moving average) models. The past
values of the variable are the autoregression since the variable is in effect regressed
against itself; the residuals are the moving average part.
Box–Jenkins is better described as a process rather than a technique. Figure 14.5
shows how the process works.


(a) Pre-whiten. This is a term coined by Box and Jenkins that can be thought of as
removing the trend from the series. Later, when the forecasts are being calculat-
ed, the trend can be taken out of storage.
(b) Identify. Select those past values and residuals that seem most likely to affect
future values. This is done by looking carefully at the autocorrelation coeffi-
cients to interpret in what way the past affects the future. These coefficients are
the result of correlating the series of residuals with the same series of residuals
lagged by 1, 2, 3, … Interpreting them is a skilled operation that goes beyond the
scope of this module. For example, the result of this step might be:
Forecast for next month = Combination of values for this month and last month
together with this month’s residual
(c) Estimate. Determine each coefficient of those past values and residuals selected
at the identify step. A computer-based algorithm called the method of steepest
descent usually does this, providing a best-fit line. For example, this step might
produce:
Forecast for next month = 0.93 × This month’s value + 0.09 × Last month’s value
−0.24 × This month’s residual
(d) Diagnose. Random residuals suggest the best forecasting model has been found
as signalled by zero, or close to zero, autocorrelations. If the residuals are not
random, other past values or residuals need to be included. The estimation and
diagnosis steps are then repeated.
(e) Forecast. When the diagnostic step reveals random residuals, the equation can
be used to forecast.

Figure 14.5 The Box–Jenkins process (Pre-whiten → Identify → Estimate → Diagnose → Forecast if the diagnosis is satisfactory; otherwise return to the identify step)
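To make the estimate step concrete, the illustrative equation above can be applied directly; the three input values in this sketch are hypothetical, purely for illustration:

# Applying the illustrative ARMA-type forecasting equation from the estimate step.
this_month, last_month, this_residual = 104.0, 101.0, 1.5   # hypothetical values
forecast = 0.93 * this_month + 0.09 * last_month - 0.24 * this_residual
print(round(forecast, 2))   # 105.45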


The Box–Jenkins Method is a difficult and complicated forecasting method and is expen-
sive in terms of data collection, computer usage and human resources. It requires
considerable skill and experience; it is far from being an automatic technique. On
the other hand, surveys indicate that it is highly accurate. However, the good track
record relates largely to short-term forecasting. Like all time series methods, it
should be used to forecast only over time periods that are sufficiently short for the
conditions to remain unchanged. In practice this means that the forecast horizon
should be no more than three to six months.


Learning Summary
In spite of the fact that surveys have demonstrated how effective time series
methods can be, they are often undervalued. The reason is that, since a variable is
predicted solely from its own historical record, the methods have no power to
respond to changes in business or company conditions. They work on the assump-
tion that circumstances will be as in the past.
Nevertheless, their track record is good, especially for short-term forecasting. In
addition, they have one big advantage over other methods. Because they work solely
from the historical record and do not necessarily require any element of judgement or
forecasts of other causal variables, they can operate automatically. For example, a large
warehouse, holding thousands of items of stock, has to predict future demands and
stock levels. The large number of items, which may be of low unit value, means that it
is neither practicable nor economic to give each variable individual attention. Time
series methods will provide good short-term forecasts by computer without needing
managerial attention. Of course, initially some research would have to be carried out,
for instance, to find the best overall values of smoothing constants. But, once this
research was done, the forecasts could be made automatically. All that would be
needed would be the updating of the historical record as new data became available.
Especially with a computerised stock system, this should cause little difficulty.
The conclusion is therefore not to underestimate time series methods. They have
advantages in cost and, in the short term, in accuracy over other methods.


Review Questions
14.1 Which of the following statements about time series forecasting methods are true?
They:
A. relate to stationary series only.
B. are based on regression analysis.
C. make forecasts of a variable from its own historical record.
D. are able to forecast without manual intervention.

14.2 Time series methods are appropriate under what circumstances?


A. Where short-term forecasts are needed.
B. To predict turning points.
C. Where the historical data record is short.
D. Where conditions affecting the forecast variable are stable.

14.3 Which of the following defines a stationary series?


A. It has no trend.
B. It has constant variance.
C. It has no trend and is heteroscedastic.
D. It has no trend and is homoscedastic.

14.4 The difference between moving averages (MA) and exponential smoothing (ES) is that
MA computes a smoothed value by giving equal weights to past data values, whereas ES
allows the forecaster to choose the weights. True or false?
Questions 14.5 and 14.6 refer to the following data:

Period 1 2 3 4 5 6 7 8
Value 7 9 8 10 12 11 7 10

14.5 The three-point moving average forecast for time period 9 is:
A. 9.3
B. 9
C. 10
D. 8.5

14.6 The ES forecast (α = 0.4) for period 9 is:


A. 9.4
B. 10.0
C. 10.4
D. 9.0


14.7 In Holt’s Method the two smoothing constants should be chosen:


A. to be exactly the same.
B. to be nearly the same.
C. so that the trend constant exceeds the series constant.
D. each independently of the other.

14.8 In the Holt–Winters Method, the seasonality at any point is measured as:
A. Trend/Actual
B. Trend/Smoothed
C. Actual/Trend
D. Actual/Smoothed

14.9 As part of the decomposition method, a time series xt is regressed against time (t = 0,
1, 2, 3, …). The result is:

Coefficient Standard error


Time 3.1 1.01
Constant 12.2 2.41
R2 = 0.42

What is the trend at time t = 6?


A. 3.1
B. 8.47
C. 30.8
D. 12.2
E. 76.3

14.10 In the decomposition method, the cyclical effect is measured by:


A. Trend/Actual
B. Actual/Trend
C. Smoothed/Trend
D. Trend/Smoothed


14.11 A graph relating cyclical effect to time is as shown below. The length of the cycle is thus:

[Graph: the cyclical effect plotted against time, with the horizontal axis marked in quarters at 10, 20, 30 and 40.]

A. 10 years
B. 2.5 years
C. 20 years
D. 5 years

14.12 In an application of the decomposition method on quarterly data, the following have
been calculated:

Trend coefficients: a = 9, b = 1.1


Seasonality: Q1 0.92
Q2 1.06
Q3 1.12
Q4 0.90
Cycle: 1.0 at all periods

The forecast for time period 10 is:


A. 21.2
B. 22.4
C. 98.6
D. 11.9

Case Study 14.1: Interior Furnishings


1 The production manager of a firm that manufactures interior furnishings wishes to
prepare a monthly forecast of the demand for two-metre wooden curtain rails. The
following data are available:

Monthly demand
2016 2017
Oct. Nov. Dec. Jan. Feb. Mar. Apr. May June July Aug.
2000 1350 1950 1975 3100 1750 1550 1300 2200 2775 2350

a. Use a three-month moving average to compute a forecast of monthly demand for
the curtain rails.
b. Use exponential smoothing with α = 0.1 to compute forecasts.


c. State any assumptions made.

Case Study 14.2: Garden Machinery Manufacture


1 An engineering company makes a range of brackets that are used in the manufacture of
gardening machinery. Short-term forecasts for all the brackets in the range are required.
Since the brackets are low value and there are many specifications in the range, the
forecasts must be made by ‘automatic’ techniques (i.e. ones that need no manual
intervention). The following historical quarterly demand data are available for a particu-
lar bracket, the BT34A:

Period Demand
2016 1 140
2 155
3 155
4 170
2017 1 180
2 170
3 185
4 190

There is a clear trend in the data. There may be seasonality, but the record is too short
to be sure. The best method therefore appears to be Holt’s Method. Make a forecast
for total demand in 2018 using this technique with α = 0.2 and γ = 0.3.

Case Study 14.3: McClune and Sons


1 The Scotch whisky distilling and retailing company of McClune and Sons has its
headquarters in the Speyside region of Scotland. Towards the end of 2014 McClune was
preparing its budgets for 2015. Usually the budgets were set by making simple percentage
adjustments to the previous year. This was a minor matter involving a single meeting
between the managing director and the director of finance. But this year the situation was
different. The company’s long-standing banker, Scottish Mutual, had asked for a financial
plan for 2015 based on careful forecasts of financial needs.
Because whisky takes at least three years to mature and because its demand follows a
marked seasonal pattern, with the heaviest demand around the Christmas/New Year
period, Scotch whisky companies carry high levels of stock. Most, including McClune,
rely on banks or other forms of financing to carry this heavy burden on their working
capital. The industry in general had recently experienced some financial difficulties and this
was the reason Scottish Mutual requested fully justified financial statements for 2015
including cash flows, an income statement and a balance sheet. Only on this basis would
Scottish Mutual appraise McClune’s request for the continuation and extension of its
working capital loan.
The preparation of these statements had been entrusted to Keith Scott, an MBA
graduate and the deputy director of finance. After careful consideration, Keith came to
the conclusion that he needed (a) accurate monthly sales forecasts, (b) the delay in cash
receipts from sales and (c) operating costs. It would then be a matter of accounting
calculation to combine these figures into the required statements. Keith’s knowledge of
McClune and his familiarity with the industry made him confident of his ability to predict
the company’s monthly operating expenses and the time delay in cash receipts. He could
do this by taking average values over past years while adjusting for inflation and any one-
off occurrences.
The difficult part of Keith’s task was that of forecasting whisky sales for the next 12
months. He could, of course, take McClune’s sales figures and use some forecasting
technique to predict ahead, but he was concerned that operating difficulties had made
the McClune data unreliable. On the other hand, he could obtain a monthly forecast of
sales for the whole Scotch whisky industry (from the Scotch Whisky Association in
London, to which McClune belonged) and then apply projected market share figures for
McClune to obtain the forecasts he was seeking. Although this approach might have
spin-off benefits in providing input to McClune’s market research projects, he was
concerned that the estimate of market share could involve errors of a size that would
negate the work put into the industry forecasts. He decided on the first approach of
using McClune’s own historical sales records. Keith’s first step was to obtain the sales
data. McClune’s files held sales figures going back over a hundred years, but Keith
decided that only data from 2006 onwards was relevant. During 2004 two strikes
outside the company had affected McClune’s business and the sales records for 2004
and 2005 were particularly unreliable. From 2006 onwards there were no such major
effects and Keith was able to make satisfactory adjustments for the operating difficulties
during the period. These adjusted historical sales figures (in volume terms) are present-
ed in Exhibit 1 (see Table 14.12).
Keith plotted the historical data and obtained the graph shown in Exhibit 2 (see
Figure 14.6), showing a strong seasonal pattern, consistent from year to year, as well as
a clear trend. Given this pattern he took the view that short-term forecasts could best
be obtained from a time series technique. He set out to consider which technique might
be best.

Table 14.12 Exhibit 1


Year Month Sales Time Year Month Sales Time Year Month Sales Time
2006 1 8.12 1 2009 1 7.88 37 2012 1 6.68 73
2006 2 7.76 2 2009 2 7.81 38 2012 2 7.33 74
2006 3 7.97 3 2009 3 8.40 39 2012 3 8.53 75
2006 4 7.88 4 2009 4 9.43 40 2012 4 9.46 76
2006 5 8.45 5 2009 5 9.76 41 2012 5 7.41 77
2006 6 8.68 6 2009 6 10.28 42 2012 6 10.08 78
2006 7 6.77 7 2009 7 9.27 43 2012 7 10.67 79
2006 8 6.60 8 2009 8 4.16 44 2012 8 4.40 80
2006 9 8.39 9 2009 9 10.98 45 2012 9 13.21 81
2006 10 11.88 10 2009 10 12.73 46 2012 10 16.25 82
2006 11 15.58 11 2009 11 21.03 47 2012 11 24.90 83
2006 12 19.50 12 2009 12 24.08 48 2012 12 33.08 84
2007 1 7.43 13 2010 1 9.19 49 2013 1 9.95 85
2007 2 7.26 14 2010 2 9.21 50 2013 2 8.00 86

2007 3 8.67 15 2010 3 8.09 51 2013 3 10.84 87
2007 4 9.26 16 2010 4 9.45 52 2013 4 11.83 88
2007 5 10.55 17 2010 5 10.14 53 2013 5 12.68 89
2007 6 9.17 18 2010 6 11.17 54 2013 6 12.33 90
2007 7 8.66 19 2010 7 9.29 55 2013 7 11.72 91
2007 8 4.45 20 2010 8 4.36 56 2013 8 4.20 92
2007 9 9.10 21 2010 9 11.75 57 2013 9 15.06 93
2007 10 11.32 22 2010 10 15.31 58 2013 10 17.66 94
2007 11 15.23 23 2010 11 22.94 59 2013 11 24.92 95
2007 12 18.02 24 2010 12 28.67 60 2013 12 32.06 96
2008 1 5.87 25 2011 1 9.04 61 2014 1 11.00 97
2008 2 6.19 26 2011 2 10.01 62 2014 2 9.02 98
2008 3 8.34 27 2011 3 11.41 63 2014 3 11.58 99
2008 4 8.91 28 2011 4 10.82 64 2014 4 12.11 100
2008 5 9.05 29 2011 5 12.57 65 2014 5 11.68 101
2008 6 9.98 30 2011 6 11.83 66 2014 6 13.44 102
2008 7 6.26 31 2011 7 8.91 67 2014 7 10.87 103
2008 8 3.98 32 2011 8 4.61 68 2014 8 3.62 104
2008 9 7.24 33 2011 9 13.21 69 2014 9 14.87 105
2008 10 13.18 34 2011 10 17.39 70
2008 11 15.88 35 2011 11 27.33 71
2008 12 22.90 36 2011 12 35.21 72


Figure 14.6 Exhibit 2 (McClune’s monthly sales plotted against time for months 1 to 105, showing the strong seasonal pattern and upward trend)


a. One time series forecasting method that handles a series with a trend and seasonal
pattern, as well as being able to deal with a cycle should there be one, is classical
decomposition. Use this approach to obtain forecasts for 2015.
b. What is your overall evaluation of Keith Scott’s approach to the forecasting, and
what do you think his major difficulties will be?



Module 15

Managing Forecasts
Contents
15.1 Introduction.......................................................................................... 15/1
15.2 The Manager’s Role in Forecasting.................................................... 15/2
15.3 Guidelines for an Organisation’s Forecasting System ..................... 15/4
15.4 Forecasting Errors ............................................................................. 15/13
Learning Summary ....................................................................................... 15/15
Review Questions ......................................................................................... 15/17
Case Study 15.1: Interior Furnishings ........................................................ 15/19
Case Study 15.2: Theatre Company........................................................... 15/19
Case Study 15.3: Brewery ............................................................................ 15/19

Prerequisite reading: None, apart from Case Study 15.3, which draws on material
in Modules 11–14

Learning Objectives
The purpose of this module is to describe what managers need to know if they are
to use forecasts in their work. It is stressed that forecasting should be viewed as a
system, not a technique. The system needs to be managed and it is here that the
manager’s role is crucial. The parts of it that fall within a manager’s sphere rather
than that of the forecasting expert are discussed in some detail. Some actual and
costly mistakes in business forecasting will demonstrate the crucial nature of the
manager’s role. By the end, readers should know how they can use the forecast-
ing techniques described in previous modules effectively in their organisations.

15.1 Introduction
The techniques of business forecasting have been the topic of previous modules.
Causal modelling was the subject of Module 11 and Module 12, qualitative tech-
niques of Module 13 and time series methods of Module 14. How can this armoury
best be used in an organisation? If it is to be used effectively, the manager’s role is a
vital one. Forecasting should be viewed as a system of which techniques are just a
part. Experts can usually be called upon to look after the techniques, but the whole
system, if it is to function properly, needs the input of management skill and
knowledge. Indeed, a review of some of the expensive losses made by organisations
because of forecasting errors reveals that the errors usually occurred because of a
lack of management skills rather than statistical skills.

This module is about managing the experts and their techniques so that forecast-
ing makes a real contribution in business. It discusses the manager’s role, gives nine
guidelines for developing a forecasting system and describes some major forecasting
mistakes that have been made.

15.2 The Manager’s Role in Forecasting


Statistical specialists in forecasting tend to take a narrow, technique-oriented view of
their subject; non-specialists, often the prospective users of the forecasts, frequently
feel uneasy about such a technical area. This section should enable the specialists to
adopt a wider perspective and, by defining their role more clearly, the non-
specialists to gain confidence.
In small organisations forecasting may be done by one person. The individual
who needs the forecasts has to produce them. He or she has to cover all aspects of
the work alone. In larger organisations, someone needing forecasts to support
decision making can generally call for assistance. The question then arises as to
which of the departments involved should take primary responsibility for the
production of the forecasts. There are three general possibilities:
(a) user department;
(b) management services, or similar;
(c) data processing unit, management information group, or similar.
The third possibility, data processing (DP), is probably the worst candidate. This
choice often leads to the user department abrogating its responsibility to the
‘experts’, especially if there is poor communication between the two groups. As a
result, the users never become involved in the development of their own system.
While the DP unit will have lots of technical expertise, it may know little of the
wider issues and be unable to integrate the forecasting system with the decision
making it is intended to serve. The most likely outcome is an isolated and little-used
forecasting system.
The second possibility, management services (meaning a specialist department
comprising functional experts – statisticians, operational researchers, etc.), suffers
from some of the problems of the DP unit in being remote from the decision
making. Yet, when the forecasts are for strategic decisions at board level, this
solution can be successful. Management services, perhaps in the form of a corporate
planning unit, is then able to devote itself entirely to the few major decisions to be
taken. It can make the link between the technicalities of forecasting and the deci-
sions. Outside such special circumstances, the result may be the same as when the
DP unit takes the leading role: an isolated and little-used system.
The first possibility, the user department, should be the best solution for non-
board-level decisions. However, it frequently does not work. The users feel they
have insufficient technical expertise and, de facto, hand over responsibility to the
technical experts in another department. Once again the result is that they have little
involvement in the system, and this leads to underutilisation of the system. When
the forecasting non-specialists do take and maintain primary responsibility for the
process, their participation is a key factor in developing an effective system. This is
because the non-specialists are in the best position to forge the link between
techniques and decisions. More than this, since they are going to use the forecasts,
they should have confidence that they are reliable and valuable.
Wherever responsibility rests, the development of a forecasting system is, in
larger organisations, usually a team activity. Typically, the team members will include
a forecasting practitioner, a representative of the user department and a financial
expert, although the exact composition inevitably depends upon individual circum-
stances. In the smaller organisations the forecasting may be done by one person in
whom must be combined all the team’s expertise.
In a team, the role of the practitioner or specialist is reasonably clear. The roles
of the other team members include such things as facilitating access to the user
department and providing data, but much more importantly they must include
responsibility for ‘managing’ the forecasts. This means ensuring that resources (the
experts and techniques) are properly applied to objectives (the intended uses of the
forecasts). In carrying this out, it is essential to view forecasting as a system and not
just as a technique. While the specialist is considering the statistical niceties of the
numbers being generated, the ‘manager’ should be considering the links with the
rest of the organisation: what is the decision-making system that the forecasts are to
serve? Is the accuracy sufficient for the decisions being taken? Are the forecasts
being monitored and the methods adjusted? And so on. In short, the specialist takes
a narrow view of the technique but the manager takes a broad view of the whole
forecasting system. The role of managing the system frequently and properly falls to
a general manager in the user department. It is the most vital role in the whole
forecasting process.
This recommended broad view can be broken down into three distinct areas.
They indicate the knowledge with which a manager needs to be equipped in order to
play an effective part in the system.
1. Being aware of the range of techniques available. A specialist may have a
‘pet’ technique. The manager should know the full spectrum of techniques at a
general level of detail so that he or she can make at least an initial judgement on
their applicability to the situation. Such knowledge will also increase confidence
and credibility when the manager is taking part in discussions with specialists.
2. Incorporating forecasts into management systems. This is the essence of the
manager’s role. There is a checklist (described later) of things that should be
done to integrate a forecasting process with the rest of the organisation.
3. Knowing what is likely to go wrong. Many organisations have made forecast-
ing errors in the past. Most have one thing in common: they are sufficiently
simple that, with hindsight, it seems remarkable that the mistakes could have
been made. Yet the mistakes were made and they are a source of valuable infor-
mation for the present.
Of these three areas, the first, knowing the range of techniques, has already been
covered. The next two are the subjects of the following sections.

15.3 Guidelines for an Organisation’s Forecasting System


The key word is system. Forecasting should not be viewed as a number-generating
technique but as a system. The technique is just one part of the forecasting process,
which includes many other factors to do with the generation and use of forecasts
within an organisation. The process should specify how judgement is to be incorpo-
rated, how the effectiveness of the forecasts is to be measured, how the system
should be adjusted in response to feedback and many other aspects. In addition,
such a broad view of forecasting leads to consideration of the links between the
forecasting system and other management systems in the organisation. Lack of real
thought about the nature of these links is often the reason that forecasts are
sometimes found to be statistically accurate yet ineffective.
Gwilym Jenkins (of Box–Jenkins fame) suggests some guidelines for the devel-
opment of a forecasting system.

15.3.1 Step 1. Analyse the Decision-Making Systems to Be Served by the Forecasts

This analysis is a time-consuming and exhaustive process. It involves listing and
describing all decisions and actions influenced by the forecasts and the people
involved, together with the links between them.
For instance, forecasts of car sales may be required by the manager of an assem-
bly line at a car plant. Primarily the forecasts will help decide the speed and mix of
the line (the total volume produced and the split between different variants of the
model). But other decisions will be influenced by the forecasts; the ordering of steel,
the production of sub-assemblies, the buying of components and the setting of
stock levels are examples. Forecasts for the assembly line should not be made
without a thorough analysis of their impact on other areas. The analysis may reveal
fundamental flaws in decision systems or organisational structure, which must be
addressed before any forecasts can hope to be effective.

15.3.2 Step 2. Define What Forecasts Are Needed


Step 2 comprises determining forecast variables, frequencies, time horizons and
accuracy levels. The forecasts needed should relate directly to the decisions analysed
at step 1.
In the car assembly example, this might imply forecasting total demand and vari-
ant mix weekly for eight weeks ahead. No more than a medium level of accuracy
would probably be required because stocks provide a balancing factor. Defining the
forecasts like this prevents the generation of needless forecasts (over-accurate, too
frequent, too great a time horizon, etc.). It can only be done after the decision
process has been analysed. Only then is it known that, for instance, the ordering of
steel for sub-assemblies may imply a greater time horizon for the forecasts than
seems necessary when the assembly line is viewed in isolation.

15.3.3 Step 3. Develop a Conceptual Model of the Forecasting Method


The purpose of this stage is to suggest what the ideal forecasting method might be.
This means specifying all the factors that might possibly be suspected of affecting
the variable being forecast. It includes indicating the time patterns (seasonality, etc.)
that might influence the future, suggesting likely causal variables and considering
whether volatile conditions might point to the use of a judgement method.
In the car assembly example, the factors might be seasonal patterns, the econom-
ic environment, marketing activity levels, stock levels, price changes and many
others. The development of a conceptual model causes thought to be given to the
realities of a situation. It should prevent a blind rush into inappropriate statistical
techniques.

15.3.4 Step 4. Ascertain the Data Available (and Not Available)


This stage will indicate in what ways the actual forecasting method might fall short
of the ideal. The data on some variables may simply be unavailable because they
have never been collected; or data on a variable may be inaccurate; or data may not
be available in a sufficiently disaggregated form.
For example, in the car production case, data on advertising and promotional
expenditure may only be available as a combined marketing total, or data on stock
levels may be known to be inaccurate. Both these situations would limit the varia-
bles that could be included and adversely affect forecasts of car production.

15.3.5 Step 5. Develop the Method by Which Forecasts Are to Be Made


Step 5 is the ‘technique’ part of the system, involving the selection of a suitable
technique based on the earlier analysis of the forecasts needed, the accuracy required
and the data and resources available. All possible techniques should be considered
and one, or perhaps a shortlist of two or three, chosen on the basis of the match
between their attributes and the characteristics of the forecasting situation. In many
organisations this is the only part of the system given any real consideration.
The chosen method for car demand might be a causal model (relating demand to
an economic variable, perhaps personal disposable income, and marketing variables,
perhaps relative price and promotional expenditure) but with an allowance for
seasonal effects.

15.3.6 Step 6. Test the Method’s Accuracy


At this stage, several techniques might be on the shortlist. They should be compared
on the basis of past data, and the best chosen for use. Even if there is only one
candidate, something about its likely accuracy should be known. There are two types
of statistical test for doing this, the first for smoothing methods, the second for any
method, including smoothing methods.

Testing the Accuracy of Smoothing Methods


The test involves comparing each one-period-ahead forecast with the actual
observation for that time period. A measure of scatter for the differences (some-
times called deviations or errors) is calculated. The method that has the lowest
scatter of the differences is the best, statistically speaking.
For example, Table 15.1 shows some data to which a three-point moving average
has been applied.
In the table the first smoothed figure is 17.3, calculated on the actual data for
periods 1–3, and it is therefore the smoothed value for period 2, the middle of the
three periods. However, it is also the forecast for the first period in the future,
period 4, so 17.3 is referred to as the one-step-ahead forecast for period 4. The
difference between forecast and actual for period 4 is −3.3, as shown in the last
column of Table 15.1. Similar calculations are carried out for periods 5 and 6.
The scatter of the differences is usually measured by the MSE (mean square
error). This is similar to the variance as a measure of scatter except that it uses n,
not n − 1, in the denominator:

MSE = [(−3.3)² + (8.7)² + (−1.0)²]/3 = 29.2
Different smoothing methods can be compared using the MSE for each, calcu-
lated, of course, over the same time periods. The mean absolute deviation (MAD)
can just as well be used for making the comparison. Just as different methods can be
compared, so the same method but with different smoothing constants can be
compared to choose optimum values for the constants.
This method of testing accuracy cannot be used for all forecasting methods. For
example, regression-based methods do not adopt this approach since they do not
accumulate forecasts as they go along but use the whole of the available data at
once.

Table 15.1 Accuracy test for smoothing methods

Time period   Actual data   Smoothed   Forecast   Difference
     1             17
     2             23           17.3
     3             12           16.3
     4             14           17.0       17.3        −3.3
     5             25           18.3       16.3         8.7
     6             16                      17.0        −1.0

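To make the arithmetic concrete, the short sketch below reproduces the Table 15.1 figures in Python; the code is our illustration, not part of the module.

data = [17, 23, 12, 14, 25, 16]

# The one-step-ahead forecast for period p is the mean of the three most
# recent actuals, i.e. the smoothed value centred on period p - 2.
forecasts = {p: sum(data[p - 4:p - 1]) / 3 for p in range(4, len(data) + 1)}
errors = [data[p - 1] - f for p, f in forecasts.items()]  # actual minus forecast

mse = sum(e * e for e in errors) / len(errors)   # n, not n - 1, in the denominator
mad = sum(abs(e) for e in errors) / len(errors)  # mean absolute deviation

print([round(e, 1) for e in errors])  # [-3.3, 8.7, -1.0]
print(round(mse, 1))  # 29.1; the text's 29.2 comes from squaring the rounded errors
print(round(mad, 1))  # 4.3
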
Testing the Accuracy of Any Method


This method works by splitting the data into two parts. The most recent part of the
historical data is put to one side (B to A in Figure 15.1). The remaining historical
data (up to B) are then used in conjunction with whichever techniques are being
tested to make forecasts for the period B to A. For this period, forecasts and actual
can be compared. The technique giving the most accurate forecasts (as measured by
the MSE or MAD) is taken as being the best. When forecasting for the future (time
periods beyond A), all the data up to A are used.
This method is an independent way of testing accuracy. It is independent in the
sense that two separate sets of data, up to B and from B to A, were used to forecast
and to check the forecasts respectively. Contrast this with the use of the correlation
coefficient to measure accuracy for causal methods. In this case the closeness of fit
is a comparison between a set of data and a statistical model (the regression equa-
tion) constructed on exactly the same set of data.

[Figure: a time series plotted against time up to point A; the data up to point B are used to fit each candidate method, and forecast and actual values are then compared over the held-back stretch from B to A.]

Figure 15.1 A test of forecasting accuracy for all techniques
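
A minimal sketch of this hold-back comparison follows; the series and the two candidate methods are invented purely to show the mechanics.

def naive(history, horizon):
    return [history[-1]] * horizon                  # repeat the last actual

def overall_mean(history, horizon):
    return [sum(history) / len(history)] * horizon  # repeat the long-run mean

def mse(actuals, forecasts):
    return sum((a - f) ** 2 for a, f in zip(actuals, forecasts)) / len(actuals)

series = [17, 23, 12, 14, 25, 16, 19, 22, 18, 24, 21, 26]
train, held_back = series[:9], series[9:]  # data up to B, then the stretch B to A

for method in (naive, overall_mean):
    print(method.__name__, round(mse(held_back, method(train, len(held_back))), 1))

# Whichever method gives the lower hold-back MSE (or MAD) is then refitted on
# all the data up to A before genuine forecasts beyond A are issued.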

15.3.7 Step 7. Incorporate Judgements into Forecasts


Inevitably within an organisation there will be people who, for good or bad reasons,
disagree with the output of a forecasting system. If they are intimately connected
with the system as users of the forecasts, the matter is a serious one. They may be in
a position to cause the system to fail. It is essential that they be brought ‘inside’ the
system to see it as something to which they have contributed. They should see it as
their system. User involvement is important throughout the design of a forecasting
system. Stage one, the analysis of the decision-making process, should have ensured
user involvement from the outset. It is important that it be continued through all
stages of the design and when the forecasts are being used. Incorporating judge-
ments into the forecasts is an excellent way of doing this.
If people’s judgements are not permitted to influence the forecasts, they may feel
sufficiently disenchanted to hinder the system’s success, or at least not make a real
effort at making it work. All this is a rather negative reason for incorporating
judgements into forecasts. The positive reason for incorporating judgements is that
they probably contain valuable information, not obtainable elsewhere, that will
improve forecasting accuracy if it can be properly applied.
The importance of having some definite way of bringing judgements to bear
seems clear. How can it be done? There are two tasks to be accomplished. The first
is to obtain some sort of consensus from what might be a long list of differing
views. The second is to use this consensus to make adjustments to the forecasts that
have already been derived by other means.
The first task draws on qualitative forecasting techniques. Two of them, struc-
tured groups and the Delphi method, are generally helpful in this context. Obtaining
qualitative forecasts by the Delphi method is a similar exercise to that of bringing
together judgements with a view to adjusting quantitatively derived forecasts.
However, there is a significant difference. The Delphi method requires participants
to alter their opinions in coming to a forecast. When a statistical forecast is already
on the table, participants may be reluctant to change their views. The element of
competition that is inherent in the Delphi method may be more apparent when the
aim is to adjust an existing forecast rather than to form a forecast from scratch. The
participants may see themselves as part of a bargaining process rather than a
scientific technique. Experience shows that a consensus is hard to achieve in these
circumstances. This makes the next task, adjusting the forecasts, even more difficult.
There is no way to adjust the forecasts apart from a process of discussion and,
eventually, agreement. The control on this process is that participants must be
accountable for the decisions they make. The discussion must be minuted. At a later
stage, not only the forecasts themselves but also the adjustment of forecasts should
be monitored. If the minutes reveal that any participant in the adjustment process
was insisting on a view that turned out to be incorrect, he or she will have to explain
why. This may act as a deterrent to game-playing. More importantly, a
record will be built up over time so that the people whose views are consistently
correct will be evident. Moreover, people whose judgements have consistently
proved misguided, prejudiced or the product of vested interest will also be revealed.
As time goes by the adjustment process should reflect more and more the track
records of participants rather than the strength with which they hold their opinions.
Of course, the process is far from foolproof: monitoring may deter some people
from proffering their opinion; track records will mean little if there is a rapid
turnover in participants; time is needed to build up track records; special cases will
always be argued; most things can be explained away if one is clever enough.
Nevertheless, the balance must be in favour of allowing judgements to be incorpo-
rated. Participation is better than non-participation. The alternative to allowing
judgements to influence forecasts, with all the risks entailed, is to leave people who
are intimately affected by the forecasts feeling that they are outside the system. Their
positive input will not be available; a negative approach on their part may cause the
system to fail.

15.3.8 Step 8. Implement the System


Suppose that the design of a forecasting system has progressed so far that there is
available a technique capable of producing accurate forecasts that are in tune with
the decision-making process. What happens next? Too often the answer is very
little. Many organisations and individuals pay lip service to the problem of imple-
mentation while in reality giving it scant attention. The procedure presented here
suggests that the key to successful implementation is clear communication between
those affected by the forecasts. Without clear communication it can prove very
difficult for participants to agree on the problems, never mind the solutions.
The lack of communication can be seen in the polarised attitudes of the two
major groups involved: the users and the producers of the system. The former
wonder how such theoretically minded experts can help practically minded manag-
ers; the latter wonder why so many people have yet to come out of the Stone Age
and into the age of modern management. When such conflicting views exist, the
two sides are almost impossible to reconcile. The situation can be avoided by
ensuring that users are concerned with the project from the outset. ‘Us and them’
feelings will then be minimised. The first stage of the design process, analysing the
decision-making system, provides the ideal starting point for this cooperation.
User involvement is therefore a precondition for implementation. Without it, the
steps suggested below can have little impact. Several research studies support the
need for user involvement. In particular, it has been shown that involvement in
which the user feels that he or she has some influence on the design is an important
prerequisite of success. With this proviso, the implementation stage should set out
to answer four questions.

What Are the Problems?


Forecasting experts tend either not to understand the user’s problems or, if they do
understand, to dismiss them as the product of a Neanderthal mind. When these
positions are adopted, much hard work can be wasted since the system will almost
inevitably fail. A better approach is to find out what the problems really are by
talking to the key people. Usually ‘key people’ means everyone even remotely
affected by the forecasts. Managers who appear to be peripheral with respect to a
project often turn out to be remarkably imaginative in making their negative
influence felt if their feathers have been ruffled by being excluded from the consul-
tation. It is likely that the key people will come from a wide range of departments or
functions: financial, marketing, operating management, purchasing, inventory
control and many others depending upon the organisation and the nature of the
forecasting problem.
When talking to a potential user of the system it is important to adopt a neutral
stance. The interview is likely to be conducted by the project leader. If, at this
stage, he or she behaves like a salesperson for the system, the discussion may well
polarise. Sensitively conducted interviews of key users should reveal a
range of implementation problems. They may be rational, such as a lack of training
for relevant clerical staff or an inability to understand the computer printouts on
which the forecasts are to be issued. On the other hand, they may be irrational, such
as a fear of increasing sophistication or a personality clash with one of the project
leaders. The act of bringing these problems into the open will in itself aid their
solution.

Do All Participants Agree on the Problems?


Does it seem surprising that such agreement should be an issue? After all, a problem
is a problem. In part, this question is answered merely by communicating the full
range of problems to all participants. But there may be disagreements. For instance,
there may be some doubt as to whether operators of the old manual system are the
right people to operate the new system. Changes in budgets are another common
source of dispute.
Continuing the consultations, perhaps supplemented with structured group meet-
ings, the project leader hopes to obtain a consensus on all these issues. It is
becoming more and more apparent that a forecasting expert needs behavioural skills
just as much as quantitative ones.

What Are the Possible Solutions to the Problems?


The range of alternative solutions to the problems should come from the users.
They are far more likely to make the system work if they know that the solutions are
their own. The pressure will be on them to make their own solutions succeed.
The first step may be to persuade them that the problem is not insuperable and
that there is benefit in trying to find an answer. The need for a new forecasting
system should have been discussed when the project started, but it may be worth-
while going over the reasons once again to convince the users of the ill-effects of
maintaining the status quo. Once an agreement has been obtained that the problems
must be surmounted, the search for solutions can commence. Many will immediate-
ly have some ideas. Brainstorming may be considered for generating some more.

Can a Consensus on an Action Plan Be Obtained?


Throughout the implementation, the project leader has been walking a tightrope. At
each stage he or she depends on personal skill and the goodwill of others in order to
make progress. This last question is probably the most difficult to answer. Vested
interests, departmental pride and status, as well as genuinely held but conflicting
views, will all rise to the surface. There are techniques (beyond the scope of this
module) to help a group leader to reach a consensus.
Even so, gaining a consensus in these conditions may prove too difficult a task
for even the most skilful and well-equipped project leader. If a consensus is impos-
sible, the next best thing is an experiment. Will the participants agree to an
experiment? Of course, the experiment would have to be for a limited period and
include a provision for feedback. But it may provide the information and goodwill
necessary for agreement at a later date. Or, possibly, the experiment itself will be
sufficient to overcome the inertia that tends to afflict most new ventures.
An example of this exhaustive problem/solution approach to implementation
comes from the banking world. A computerised forecasting system was to replace
the previous group discussion approach to planning new business activities. The
implementation stage had been reached. Interviews and discussions between all the
key people revealed a surprising range of problems, including:
(a) lack of self-confidence in using new techniques on the part of some users;
(b) concern over loss of control of forecasts;
(c) lack of belief in the producer department because of past failures;
(d) the added management burden in dealing with computer output.

Some of these problems had already been provided for. The users were eventual-
ly convinced that meetings to incorporate judgements into forecasts allowed them
control over forecasts as before, but with the advantage of the meetings being more
soundly based on a statistical forecast. The other problems could not, however, be
discarded. To cope with the added management burden, the producers of the
forecasts had to agree to a substantial simplification of the printout and a tailoring
of the output to individual users’ needs. The users’ lack of confidence both in
themselves and in the producers proved yet more difficult to solve. Eventually a
series of experiments was agreed. The system would be introduced in one depart-
ment at a time. Meanwhile, the other departments would continue to operate the
manual system. A strict feedback procedure would allow the users to learn from the
experience of others and to make necessary alterations as the implementation
proceeded. As a result, implementation was a lengthy process (and emotionally
draining for the project leader), but it led to an end product that was worth waiting
for: a forecasting system that worked.
Of course, all situations differ and there is a need for flexibility on all sides. One
factor, however, is always a great bonus. This is the case when one of the users is
sufficiently convinced of the need for a new system or sufficiently enthusiastic about
the new technology and techniques that he or she will take on a leadership role. If
the users are being motivated by one of their own side rather than a seemingly
distanced expert, many of the difficulties simply do not arise. This factor is now
seen as being increasingly important in the implementation of new technology,
whether related to forecasting or not. Indeed, all the ideas suggested in this section
go far beyond the implementation of forecasting. They apply to the implementation
of any new methods, techniques or practices.

15.3.9 Step 9. Monitor Performance


Surveys show that few organisations monitor the performance of their forecasting
systems. This may seem surprising, but it is perfectly consistent with an approach
that fails to see forecasting as a system. A non-systematic view sees forecasting as a
technique that provides forecasts; a systematic view sees forecasting as a process
that takes in all the nine stages of the guidelines and that is continually being
adjusted. The ninth and final stage is monitoring: the regular evaluation of the health
of the system.
The best monitoring goes further than simple assessment and the allocation of
praise or blame. It provides the information on which improvements in the system
can be based. The evidence can show not just how accurate the forecasts have been
but when and why the forecasts have been good or bad. At what time periods were
the forecasting errors unacceptably high? Does the system fail to capture properly
the seasonal effect or business cycles or changes in the economic environment?
Have the forecasts been made acceptable only by virtue of the incorporation of
judgements? Or have they been made unacceptable only because of the incorpora-
tion of judgements? Were the statistics and the judgements equally disastrous?
Whose judgements have proved to be sound and whose unsound? Questions such
as these should be asked – and answered – regularly. The emphasis is on regularity.
Monitoring should not be the occasional lightning audit to catch people out but a
continual flow of information.
Monitoring will, therefore, stem from regular reports comparing the actual data,
as they become known, with the forecast. The reports will include purely statistical
measures of accuracy, similar to those used in assessing the accuracy of techniques
ex post. Thus, the mean absolute deviation (MAD) and mean square error (MSE) will
be used to indicate the average level of accuracy over the time period in question.
Beyond this, general summary information is required possibly on a management-
by-exception basis. Times of exceptional accuracy or inaccuracy should be reported
with, where possible, reasons for the deviation. For instance, the report might say
that the third-quarter forecasts were inaccurate, just as they were in the last two
years, or that the third-quarter forecasts were inaccurate because of the special
circumstance of, say, a strike.
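
As an illustration of what such a report might compute, the sketch below calculates the MAD and MSE for a run of periods and flags exceptions; the layout and the two-MADs threshold are our assumptions, made only to make the idea concrete.

def monitoring_report(actuals, forecasts):
    errors = [a - f for a, f in zip(actuals, forecasts)]
    mad = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    # Management by exception: flag any period whose error exceeds twice the MAD.
    exceptions = [(t + 1, e) for t, e in enumerate(errors) if abs(e) > 2 * mad]
    return mad, mse, exceptions

actuals   = [102, 98, 95, 130, 101, 99]    # invented figures
forecasts = [100, 100, 100, 100, 100, 100]
mad, mse, flagged = monitoring_report(actuals, forecasts)
print(round(mad, 1), round(mse, 1))  # 6.8 155.8
print(flagged)  # [(4, 30)]: the report should also record the likely reason
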
If the system allows for forecasts to be altered to reflect judgements, then, in
addition to these frequent monitoring reports, there must also be less regular reports
assessing the performance of the ‘judges’. Track records must be compiled showing
which judges have been accurate and which inaccurate. Even if a particular judge’s
views have not been included so far, they may be sometime in the future and it will
be useful to have his or her track record to hand.
Among all this statistical data it is worth remembering that it is not always the
forecasts that are closest to the actual data that are the best. This paradox comes
about as follows. In one important sense the best forecasts are those that have
credibility with the users. If they have credibility, then notice will be taken of them
and management action will follow. This action may cause the originally recorded
forecasts to appear inaccurate. Their true accuracy can, of course, never be known
because conditions have changed (by the taking of management action). According-
ly, a comprehensive monitoring system should go beyond numerical data and
consider perceptions of value and success. In other words, the users will be ques-
tioned from time to time and their opinions of the strengths and weaknesses of the
system solicited. A moderately accurate but functioning system is preferable to a
highly accurate but never used system.
For example, the senior management of a manufacturing company could not
understand why production planning decisions were so poor. A check on the
demand forecasting system that supported the decisions revealed that the forecasts
were highly accurate. A check on the way decisions were made uncovered the
surprising information that the forecasts were never used by the production
planners. They used their own ‘seat of the pants’ judgements as forecasts. The
reason was apparently that the system had not been properly implemented. Liaison
between producers and users had been very poor and the computer-based system
had never been explained to those receiving its output. As a result the planners felt
threatened and isolated by the system and ignored it. If less effort had been chan-
nelled into obtaining high levels of accuracy and more into making sure users knew
what to do with the forecasts, the overall benefits to the organisation would have
been considerable.

In this example the failure of the forecasting system naturally gives cause for
concern. Just as disturbing must be the fact that no one was aware of its failure until
the evidence of the bad decisions started to roll in. Such poor communication is a
surprisingly frequent occurrence. In more organisations than would wish to
acknowledge it, producers and users of forecasting systems (or other management
aids) hold diametrically opposed views about the success or failure of the project.
The producers think it a success; the users think it a failure. The producers do not
go to talk to the users because they think all is fine; the users think the non-
appearance of the producers is because they are too ashamed. If the two sides do
meet, the users are too polite to say what they say between themselves; the produc-
ers think the users’ faint praise is because they begrudge them their success. Such
situations reinforce the view that survey checks conducted by an independent body
a short while after a system has become operational are a necessity, not a luxury.
In summary, there is nothing very sophisticated about the monitoring of fore-
casting performance. Its essentials are the recording of comprehensive data, both
quantitative and qualitative, together with a willingness to face facts and act upon
them. Perhaps this is why so few organisations monitor their forecasts. The excite-
ment of forecasting, such as it is, lies in the techniques. There is no excitement in
tedious data evaluations, and therefore, some might think, monitoring must be an
unimportant part of the process.

15.4 Forecasting Errors


The history of business forecasting is crowded with expensive mistakes. Some cases
are presented here, with the positive purpose of learning from others. The mistakes
are a guide to the surprisingly simple and usually non-technical things that can and
do go wrong. These examples happened some years ago but the lessons are still
relevant.

15.4.1 Chartering Oil Tankers Example


Jenkins (1979) cites the case of an oil company that lost enormous sums of money
by taking too superficial an approach to forecasting. A time series approach was
adopted, which was unsuited to the circumstances.
Figure 15.2 shows the spot prices for chartering oil tankers for the years 1968–
71. The company analysed the series and, detecting an upward trend in 1969 and
early 1970, assumed that the trend would continue at least in the short term. It
could therefore save money by current rather than future chartering. Accordingly,
charter contracts were taken out. The spot price continued to rise; more contracts
were arranged. No doubt other oil companies, noticing what was happening,
became involved. The spot price rose to great heights. When the chartering activity
came to a halt in early 1971, the spot price fell to its pre-1970 level. Contracts taken
out at this time would have been at about one-third of the price of just a few
months earlier. The cost to the company of the over-priced contracts has been
estimated at $250 million.

[Figure: the spot price (vertical axis, 50 to 250) plotted over 1968 to 1971; the price climbs steeply through 1970 and falls back sharply in early 1971.]

Figure 15.2 Spot price for chartering oil tankers


Two mistakes had been made. First, the company’s intervention in the spot mar-
ket affected the market mechanism and thus the price. This occurred because the
company was very large and because the supply of oil tankers is, in the short term,
fixed. This is a non-technical error that, with the benefit of hindsight, can be seen
with some clarity. The second mistake was more technical. A deeper analysis of the
series would have revealed that it was a ‘random walk’. The step from one spot price
to the next was a random one. If random, then by definition there was no pattern in
the movement of the spot price. This means that time series analysis is inapplicable.
The basis of time series analyses is that they determine patterns in historical data and
project them into the future. If there are no patterns, then time series analysis will
fail. Worse still, any patterns determined in subsets of the data will be spurious and
may lead to false conclusions.
In the absence of patterns in the series, a different type of forecasting method
should have been employed. In this case a causal approach would have been better.
An investigation of factors likely to cause the spot price to vary, such as supply of
tankers, demand for oil, economic variables, etc., would have had a better chance of
bearing fruit.
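
The random-walk point is easy to verify by simulation. In the invented sketch below, projecting a 'trend' fitted to the last ten steps of a simulated walk does consistently worse, on average, than the naive forecast of simply repeating the latest value, because the steps contain no pattern to project.

import random

random.seed(1)

def random_walk(n, start=100.0):
    level, walk = start, []
    for _ in range(n):
        level += random.gauss(0, 1)  # each step is pure noise
        walk.append(level)
    return walk

trend_sse = naive_sse = 0.0
trials = 2000
for _ in range(trials):
    w = random_walk(50)
    slope = (w[-2] - w[-12]) / 10                # apparent trend over ten steps
    trend_sse += (w[-1] - (w[-2] + slope)) ** 2  # project the trend one step
    naive_sse += (w[-1] - w[-2]) ** 2            # repeat the latest value

print(round(trend_sse / trials, 2))  # about 1.1 ...
print(round(naive_sse / trials, 2))  # ... against about 1.0 for the naive forecast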

15.4.2 Airline Passenger Miles Example


One Sunday the planning director of an airline noticed a graph of the Index of UK
Manufacturing Production in the business supplement of his Sunday newspaper.
The thought struck him that the shape of the graph was very much the same as that
of ‘Passenger Miles Flown’ with his airline. On Monday he set his team to work, and
they developed a causal model linking the two variables. Analysis showed that,
statistically, the model was a good one, having a high correlation coefficient. It was
subsequently used to predict future business. But soon unsatisfactory forecasts
started to be produced and it had to be abandoned.
There were two mistakes in this piece of forecasting. First, the strong statistical
evidence had demonstrated only that the two variables were associated. It had not
shown that the link was causal. Over the period of analysis both variables had risen
steadily as the UK economy slowly grew. When the economic situation changed,
manufacturing production dropped. At the same time the $/£ exchange rate
increased and large numbers of tourists flew off to the USA for holidays. Conse-
quently, passenger miles flown increased at the time when the model was predicting
a decrease. There was no causal link between the variables, so, when circumstances
altered, the model no longer held good.
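
The first mistake is easily reproduced. In the sketch below, using wholly invented numbers, two series generated independently of each other, but both drifting upwards, show a correlation coefficient close to 1: a strong association with no causation behind it.

import random

random.seed(7)
n = 40
miles = [50 + 2.0 * t + random.gauss(0, 3) for t in range(n)]   # trends upwards
output = [90 + 1.5 * t + random.gauss(0, 3) for t in range(n)]  # also trends upwards

def correlation(x, y):
    m = len(x)
    mx, my = sum(x) / m, sum(y) / m
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

print(round(correlation(miles, output), 3))  # typically about 0.99
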
The second mistake was that in order to forecast passenger miles flown, the
airline had first to forecast the Index of Manufacturing Production. This in itself
was no trivial matter. A direct attack on passenger miles flown would have carried
an equal chance of success while saving time and effort. Where forecasts of eco-
nomic variables are needed, ones that are readily available should be chosen. Good
forecasts of some economic variables, such as gross domestic product, personal
disposable income and others, are published regularly by a variety of econometric
forecasting institutes.
These two examples show how easily major mistakes can be made. More espe-
cially, they show that the role of the forecasting non-specialist in supervising a
forecasting effort or as part of a forecasting team is of vital importance. Mistakes are
usually non-technical in nature. There is no guaranteed means of avoiding them, but
it is clearly the responsibility of the ‘managers’ in the team to guard against them.
The lessons other organisations have learned the hard way can help them in their
task.

Learning Summary
Managers have a clear role in ‘managing’ forecasts. But they also have a role as
practitioners of forecasting. The low cost of a very powerful computer means that it
is not a major acquisition; software and instruction manuals are readily available.
With a small investment in time and money, managers, frustrated by delays and
apparent barriers around specialist departments, take the initiative and are soon
generating forecasts themselves. They can use their own data to make forecasts for
their own decisions without having to work through management services or data
processing units.
This development has several benefits. The link between technique and decision
is made more easily; one person has overall understanding and control; time is
saved; re-forecasts are quickly obtained. But, of course, there are pitfalls. There may
be no common database, no common set of assumptions within an organisation.
For instance, an apparent difference between two capital expenditure proposals may
have more to do with data/assumption differences than with differences between
the profitabilities of the projects. Another pitfall is in the use of statistical techniques
that may not be as straightforward as the software manual suggests. The use of
techniques by someone with no knowledge of when they can or cannot be applied is
dangerous. A time series method applied to a random data series is an example. The
computer will always (nearly always) give an answer. Whether it is legitimate to base
a business decision on it is another matter.
However, it is with management aspects of forecasting that this module has
primarily been concerned. It has been suggested that this is an area of expertise too
often neglected and that it should be given more prominence. Statistical theory and
techniques are of course important as well, but the disproportionate amounts of
time spent studying and discussing them give a wrong impression of their im-
portance relative to management issues.
In particular, the topics covered as steps 7–9 in the guidelines – the incorporation
of judgements, implementation and monitoring – are given scandalously little
attention within the context of forecasting. This is generally true, whether books,
courses, research or the activities of organisations are being referred to. A moment’s
thought demonstrates that this is an error. If a forecasting technique is wrongly
applied, good monitoring will permit it to be adjusted speedily: the situation can be
retrieved. If judgements, implementation or monitoring are badly done or ignored,
communication between producers and users will probably disappear and the
situation will be virtually impossible to retrieve.
Why should these issues be held in such low regard? Perhaps the answer lies in
the widespread attitude that says that a manager needs to be taught statistical
methods but that the handling of judgements, implementation and monitoring are
matters of instinct that all good managers have. They are undoubtedly management
skills, but whether they are instinctive is another matter. Whatever the reason, the
effect of this inattention is almost certainly a stream of failed forecasting systems.
How can the situation be righted? A different attitude on the part of all con-
cerned would certainly help, but attitudes are notoriously hard to change. A long-
term yet realistic approach calls for more information. Comparatively little is known
about these management aspects. If published reports and research on the manage-
ment of forecasting were as plentiful as they are on technical aspects, a great
improvement could be anticipated.
Even so, the best advice of all is probably to avoid forecasting. Sensible people
should only use forecasts, not make them. The general public and the world of
management judge forecasts very harshly. Unless they are exactly right, they are
failures. And they are never exactly right. This rigid and unrealistic test of forecast-
ing is unfortunate. The real test is whether the forecasting is, on average, better than
the alternative, which is often a guess, frequently not even an educated one.
A more positive view is that the present time is a particularly rewarding one to
invest in forecasting. The volatility in data series seen since the mid-1970s puts a
premium on good forecasting. At the same time, facilities for making good forecasts
are now readily available in the form of a vast range of techniques and wide choice
of relatively cheap computers. With the latter, even sophisticated forecasting
methods can be applied to large data sets. It can all be done on a manager’s desktop
without the need to engage in lengthy discussions with experts in other departments
of the organisation.
Whether the manager is doing the forecasting in isolation or is part of a team, he
or she can make a substantial contribution to forward planning. To do so, a system-
atic approach to forecasting using the nine guidelines and an awareness of the
hidden traps will serve that manager well.

Review Questions
15.1 A manager using forecasts should view forecasting as a system, whereas a forecasting
expert should view forecasting as a series of techniques. True or false?

15.2 Which department should, preferably, take overall responsibility for the development of
a production planning forecasting system?
A. Production.
B. Data processing.
C. Financial planning.
D. Management services.

15.3 Why is it necessary to analyse the decision-making system prior to designing a


forecasting system?
A. To create user involvement.
B. To establish exactly what the purpose of the forecasts is.
C. To amend the remit for the project.
D. To correct anomalies in the decision-making system.

15.4 The purpose of developing a conceptual model of the forecasts is to consider what
variables might be incorporated in a causal model. True or false?
Questions 15.5 to 15.8 refer to the data in Table 15.2 showing a three-point
moving average forecast.

Table 15.2 Three-point moving average forecast


Time period   Actual data   Smoothed   Forecast   Forecast error
     1             17
     2             23           17.3
     3             12           16.3
     4             14           16.0       17.3        −3.3
     5             22           17.7       16.3         5.7
     6             17

15.5 What is the forecast for time period 6?


A. 17.0
B. 17.7
C. 16.3
D. 16.0

15.6 What is the one-step-ahead forecast error for period 6?


A. 1.0
B. −1.0
C. 1.7
D. −0.3
E. 0.3

15.7 What is the MAD (mean absolute deviation) of the forecast errors for periods 4–6?
A. 3.7
B. 1.0
C. 3.3
D. 14.8
E. 1.2

15.8 What is the MSE (mean square error) of the forecast errors for periods 4–6?
A. 22.2
B. 14.8
C. 6.7
D. 3.9

15.9 In comparing two forecasting methods, the MAD and the MSE are calculated for both
methods. In what circumstances could the following situation arise?
The MSE for method 1 is lower than the MSE for method 2; the MAD for method 1 is
higher than the MAD for method 2.
A. The situation could never arise.
B. Method 1 is superior to method 2.
C. Method 2 is superior to method 1.
D. On average method 1 is superior, but the method leaves some very large
errors.
E. On average method 2 is superior, but the method leaves some very large
errors.

15.10 Why was the method of testing forecasting accuracy that was based on keeping some of
the historical data to one side described as an independent test?
A. It can be used for any forecasting method.
B. It considers forecast errors of more than one step ahead.
C. The coefficients in the forecasting model are estimated with data separate from
that on which accuracy is measured.

15.11 Judgements should be incorporated into a forecasting system because:


A. they are a valuable source of information.
B. it is important that the user feels able to influence the system.
C. doing so helps to prevent negative reactions to the system.

15.12 Although most participants in a forecasting system can agree on the likely problems in
implementing it, it is very difficult to obtain a consensus on the solutions that should be
adopted. True or false?

Case Study 15.1: Interior Furnishings


1 The production manager of a company making interior furnishings has to prepare a
forecast of the demand for sets of 2-metre curtain rails next month (September 2017).
Data for recent months are shown in Table 15.3 (initially used in Case Study 14.1 in
Module 14).

Table 15.3 Demand for curtain rails


Time Demand
2016 Oct. 2000
Nov. 1350
Dec. 1950
2017 Jan. 1975
Feb. 3100
Mar. 1750
Apr. 1550
May 1300
June 2200
July 2775
Aug. 2350

a. Make a forecast using a three-point moving average (MA).


b. Make a forecast using exponential smoothing with α = 0.1 (a brief sketch of the recursion follows these questions).
c. Compare the accuracy of the two methods using the MSE.
d. Which method seems the better, and what is the forecast for September 2017?
e. What reservations are there about this forecast?
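
For question (b), the recursion below is a minimal sketch of simple exponential smoothing; initialising the smoothed value on the first observation is one common convention, and the function name is our own.

def exp_smooth_forecast(data, alpha):
    s = data[0]                        # initialise on the first actual
    for x in data[1:]:
        s = alpha * x + (1 - alpha) * s
    return s                           # one-step-ahead forecast

demand = [2000, 1350, 1950, 1975, 3100, 1750, 1550, 1300, 2200, 2775, 2350]
print(round(exp_smooth_forecast(demand, 0.1)))  # forecast for September 2017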

Case Study 15.2: Theatre Company


1 A theatre organisation stages a wide range of productions from Shakespeare to
contemporary and experimental. Some productions are booked up long before the first
night; others are played before a half-full auditorium. In planning future activities, the
organisation has asked a firm of consultants to advise how forecasts of the demand for
tickets for each production might be prepared. Describe briefly how the guidelines for
developing a forecasting system might be applied to this situation.

Case Study 15.3: Brewery


1
Note: This case study draws upon material from several modules. It also requires the use of a
computer that has software capable of producing forecasts based on (a) regression analysis
and (b) the Holt–Winters Method. However, the majority of the case deals with qualitative
issues, and, therefore, even if such software is not available, it is still possible to tackle many
of the questions posed.
A large company whose main business is the production of alcoholic beverages wants to
forecast the sales of a particular brand of beer. The organisation is diversified and has
many other products. The terms of reference are to forecast the UK sales volumes for
the brand. The forecasts are to be used as the basis for short- to medium-term
production and marketing decisions.
Beer, of course, takes a relatively short time (a matter of weeks) to produce and
distribute. Stock levels are kept low partly because the product can deteriorate, albeit
slowly, and partly because the costs of holding stocks are high compared to production
costs. The need for low stock levels, together with fierce competition from other
producers, has meant that it is imperative to respond quickly, in terms of both produc-
tion and marketing, to changes in demand. This is the primary reason for this forecasting
project.
Substantial amounts of data are available. The most important data are the brand’s sales
volumes (quarterly) going back several years. Furthermore, the data are believed to be
accurate and reliable. Data for the most recent 15 years are shown in Table 15.4. They
are in unspecified units and have been rounded for ease of manipulation.
Wider data concerning the product as well as other general data about the industry and
the economic climate are also available. Table 15.5 shows some of these data.

Table 15.4 Sales volumes (unspecified units)


Quarter
Year 1 2 3 4 Total
2003 9 10 12 11 42
2004 12 15 16 13 56
2005 16 17 20 16 69
2006 18 21 22 17 78
2007 21 23 27 22 93
2008 25 27 32 26 110
2009 26 29 35 27 117
2010 28 35 40 29 132
2011 33 41 45 33 152
2012 32 42 53 38 165
2013 33 42 54 40 169
2014 32 45 58 41 176
2015 34 46 57 44 181
2016 36 48 61 45 190
2017 37 49 63 48 197

Table 15.5 GDP (in index form) and advertising data 2009–17
               GDP (quarterly)                 Advertising expenditure (quarterly)
Year        1      2      3      4          1      2      3      4
2009 93.5 93.4 94.3 96.2 1.4 1.8 1.9 1.4
2010 95.7 96.0 96.6 97.6 1.3 1.9 2.0 1.7
2011 98.3 100.5 100.6 100.7 1.5 1.8 2.3 1.5
2012 100.2 104.2 102.7 103.1 1.6 2.3 2.4 1.7
2013 102.4 100.5 98.7 98.5 1.7 2.0 2.8 1.9
2014 98.3 97.4 98.5 99.6 1.7 2.2 3.0 1.9
2015 100.2 100.4 100.5 101.7 1.8 2.4 3.0 2.2
2016 103.3 103.1 103.7 103.3 1.6 2.5 3.1 2.2
2017 103.9 103.7 104.2 104.0 1.8 2.7 3.0 2.3

Approach this forecasting problem by using the nine-point checklist for developing a
forecasting system. Specifically:
a. The main decisions that the forecasts are to serve are:
i. production levels (monthly);
ii. distribution quantities (quarterly);
iii. marketing action (quarterly).
All have a time horizon of no more than one year. Suggest what other related deci-
sions are likely to be affected by the forecasts.
b. What forecasts are required, in terms of timing and accuracy?
c. Suggest a conceptual model for making forecasts.
d. What restrictions on data availability might there be?
e. Which techniques are appropriate in this situation?
f. Test the accuracy of:
i. a causal model relating sales volume to GDP and/or advertising expenditure;
ii. a Holt–Winters time series model (a brief software sketch follows this list).
g. How could judgement be incorporated into the forecasts?
h. What are the likely implementation problems and how might they be resolved?
i. How should the forecasts be monitored?
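
For part (f)(ii), the sketch below is one possible starting point, assuming the Python statsmodels package is available; the additive-trend, multiplicative-season settings and the two-year hold-back are assumptions to be tested, not the answer.

from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Quarterly sales from Table 15.4, 2003 Q1 to 2017 Q4.
sales = [9, 10, 12, 11,  12, 15, 16, 13,  16, 17, 20, 16,  18, 21, 22, 17,
         21, 23, 27, 22,  25, 27, 32, 26,  26, 29, 35, 27,  28, 35, 40, 29,
         33, 41, 45, 33,  32, 42, 53, 38,  33, 42, 54, 40,  32, 45, 58, 41,
         34, 46, 57, 44,  36, 48, 61, 45,  37, 49, 63, 48]

train, held_back = sales[:-8], sales[-8:]  # keep the last two years aside
model = ExponentialSmoothing(train, trend='add', seasonal='mul',
                             seasonal_periods=4).fit()
forecasts = model.forecast(8)
mse = sum((a - f) ** 2 for a, f in zip(held_back, forecasts)) / len(held_back)
print(round(mse, 2))  # compare with the hold-back MSE of the causal model

The regression model of part (f)(i) should be scored on the same held-back quarters, so that the two approaches are compared on identical ground.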

References
Jenkins, G. (1979). Practical Experiences with Modelling and Forecasting Time Series. Gwilym
Jenkins and Partners (Overseas) Ltd.

Appendix 1

Statistical Tables
Table A1.1 Binomial Distribution Tables
p
n r 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50
1 0 0.9500 0.9000 0.8500 0.8000 0.7500 0.7000 0.6500 0.6000 0.5500 0.5000
1 0.0500 0.1000 0.1500 0.2000 0.2500 0.3000 0.3500 0.4000 0.4500 0.5000

2 0 0.9025 0.8100 0.7225 0.6400 0.5625 0.4900 0.4225 0.3600 0.3025 0.2500
1 0.0950 0.1800 0.2550 0.3200 0.3750 0.4200 0.4550 0.4800 0.4950 0.5000
2 0.0025 0.0100 0.0225 0.0400 0.0625 0.0900 0.1225 0.1600 0.2025 0.2500

3 0 0.8574 0.7290 0.6141 0.5120 0.4219 0.3430 0.2746 0.2160 0.1664 0.1250
1 0.1354 0.2430 0.3251 0.3840 0.4219 0.4410 0.4436 0.4320 0.4084 0.3750
2 0.0071 0.0270 0.0574 0.0960 0.1406 0.1890 0.2389 0.2880 0.3341 0.3750
3 0.0001 0.0010 0.0034 0.0080 0.0156 0.0270 0.0429 0.0640 0.0911 0.1250

4 0 0.8145 0.6561 0.5220 0.4096 0.3164 0.2401 0.1785 0.1296 0.0915 0.0625
1 0.1715 0.2916 0.3685 0.4096 0.4219 0.4116 0.3845 0.3456 0.2995 0.2500
2 0.0135 0.0486 0.0975 0.1536 0.2109 0.2646 0.3105 0.3456 0.3675 0.3750
3 0.0005 0.0036 0.0115 0.0256 0.0469 0.0756 0.1115 0.1536 0.2005 0.2500
4 0.0000 0.0001 0.0005 0.0016 0.0039 0.0081 0.0150 0.0256 0.0410 0.0625

5 0 0.7738 0.5905 0.4437 0.3277 0.2373 0.1681 0.1160 0.0778 0.0503 0.0312
1 0.2036 0.3280 0.3915 0.4096 0.3955 0.3602 0.3124 0.2592 0.2059 0.1562
2 0.0214 0.0729 0.1382 0.2048 0.2637 0.3087 0.3364 0.3456 0.3369 0.3125
3 0.0011 0.0081 0.0244 0.0512 0.0879 0.1323 0.1811 0.2304 0.2757 0.3125
4 0.0000 0.0004 0.0022 0.0064 0.0146 0.0284 0.0488 0.0768 0.1128 0.1562
5 0.0000 0.0000 0.0001 0.0003 0.0010 0.0024 0.0053 0.0102 0.0185 0.0312

p
n r 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50
6 0 0.7351 0.5314 0.3771 0.2621 0.1780 0.1176 0.0754 0.0467 0.0277 0.0156
1 0.2321 0.3543 0.3993 0.3932 0.3560 0.3025 0.2437 0.1866 0.1359 0.0938
2 0.0305 0.0984 0.1762 0.2458 0.2966 0.3241 0.3280 0.3110 0.2780 0.2344
3 0.0021 0.0146 0.0415 0.0819 0.1318 0.1852 0.2355 0.2765 0.3032 0.3125
4 0.0001 0.0012 0.0055 0.0154 0.0330 0.0595 0.0951 0.1382 0.1861 0.2344
5 0.0000 0.0001 0.0004 0.0015 0.0044 0.0102 0.0205 0.0369 0.0609 0.0938
6 0.0000 0.0000 0.0000 0.0001 0.0002 0.0007 0.0018 0.0041 0.0083 0.0156

7 0 0.6983 0.4783 0.3206 0.2097 0.1335 0.0824 0.0490 0.0280 0.0152 0.0078
1 0.2573 0.3720 0.3960 0.3670 0.3115 0.2471 0.1848 0.1306 0.0872 0.0547
2 0.0406 0.1240 0.2097 0.2753 0.3115 0.3177 0.2985 0.2613 0.2140 0.1641
3 0.0036 0.0230 0.0617 0.1147 0.1730 0.2269 0.2679 0.2903 0.2918 0.2734
4 0.0002 0.0026 0.0109 0.0287 0.0577 0.0972 0.1442 0.1935 0.2388 0.2734
5 0.0000 0.0002 0.0012 0.0043 0.0115 0.0250 0.0466 0.0774 0.1172 0.1641
6 0.0000 0.0000 0.0001 0.0004 0.0013 0.0036 0.0084 0.0172 0.0320 0.0547
7 0.0000 0.0000 0.0000 0.0000 0.0001 0.0002 0.0006 0.0016 0.0037 0.0078

8 0 0.6634 0.4305 0.2725 0.1678 0.1001 0.0576 0.0319 0.0168 0.0084 0.0039
1 0.2793 0.3826 0.3847 0.3355 0.2670 0.1977 0.1373 0.0896 0.0548 0.0312
2 0.0515 0.1488 0.2376 0.2936 0.3115 0.2965 0.2587 0.2090 0.1569 0.1094
3 0.0054 0.0331 0.0839 0.1468 0.2076 0.2541 0.2786 0.2787 0.2568 0.2188
4 0.0004 0.0046 0.0185 0.0459 0.0865 0.1361 0.1875 0.2322 0.2627 0.2734
5 0.0000 0.0004 0.0026 0.0092 0.0231 0.0467 0.0808 0.1239 0.1719 0.2188
6 0.0000 0.0000 0.0002 0.0011 0.0038 0.0100 0.0217 0.0413 0.0703 0.1094
7 0.0000 0.0000 0.0000 0.0001 0.0004 0.0012 0.0033 0.0079 0.0164 0.0313
8 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0002 0.0007 0.0017 0.0039

20 0 0.3585 0.1216 0.0388 0.0115 0.0032 0.0008 0.0002 0.0000 0.0000 0.0000
1 0.3774 0.2702 0.1368 0.0576 0.0211 0.0068 0.0020 0.0005 0.0001 0.0000

p
n r 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50
2 0.1887 0.2852 0.2293 0.1369 0.0669 0.0278 0.0100 0.0031 0.0008 0.0002
3 0.0596 0.1901 0.2428 0.2054 0.1339 0.0716 0.0323 0.0123 0.0040 0.0011
4 0.0133 0.0898 0.1821 0.2182 0.1897 0.1304 0.0738 0.0350 0.0139 0.0046

5 0.0022 0.0319 0.1028 0.1746 0.2023 0.1789 0.1272 0.0746 0.0365 0.0148
6 0.0003 0.0089 0.0454 0.1091 0.1686 0.1916 0.1712 0.1244 0.0746 0.0370
7 0.0000 0.0020 0.0160 0.0545 0.1124 0.1643 0.1844 0.1659 0.1221 0.0739
8 0.0000 0.0004 0.0046 0.0222 0.0609 0.1144 0.1614 0.1797 0.1623 0.1201
9 0.0000 0.0001 0.0011 0.0074 0.0271 0.0654 0.1159 0.1597 0.1771 0.1602

10 0.0000 0.0000 0.0002 0.0020 0.0099 0.0308 0.0686 0.1171 0.1593 0.1762
11 0.0000 0.0000 0.0000 0.0005 0.0030 0.0120 0.0336 0.0710 0.1185 0.1602
12 0.0000 0.0000 0.0000 0.0001 0.0008 0.0039 0.0136 0.0355 0.0727 0.1201
13 0.0000 0.0000 0.0000 0.0000 0.0002 0.0010 0.0045 0.0146 0.0366 0.0739
14 0.0000 0.0000 0.0000 0.0000 0.0000 0.0002 0.0012 0.0049 0.0150 0.0370

15 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0003 0.0013 0.0049 0.0148
16 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0003 0.0013 0.0046
17 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0002 0.0011
18 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0002
19 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
20 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000


The tabulated value is the area under the standard normal curve between the mean (z = 0) and z, i.e. P(0 ≤ Z ≤ z).

Table A1.2 Normal distribution tables


z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224

0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621

1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441

1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817

2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952

2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990


Table A1.3 Poisson Distribution Tables


λ
r 0.005 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0 0.9950 0.9900 0.9802 0.9704 0.9608 0.9512 0.9418 0.9324 0.9231 0.9139
1 0.0050 0.0099 0.0196 0.0291 0.0384 0.0476 0.0565 0.0653 0.0738 0.0823
2 0.0000 0.0000 0.0002 0.0004 0.0008 0.0012 0.0017 0.0023 0.0030 0.0037
3 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0001 0.0001

λ
r 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
0 0.9048 0.8187 0.7408 0.6703 0.6065 0.5488 0.4966 0.4493 0.4066 0.3679
1 0.0905 0.1637 0.2222 0.2681 0.3033 0.3293 0.3476 0.3595 0.3659 0.3679
2 0.0045 0.0164 0.0333 0.0536 0.0758 0.0988 0.1217 0.1438 0.1647 0.1839
3 0.0002 0.0011 0.0033 0.0072 0.0126 0.0198 0.0284 0.0383 0.0494 0.0613
4 0.0000 0.0001 0.0003 0.0007 0.0016 0.0030 0.0050 0.0077 0.0111 0.0153
5 0.0000 0.0000 0.0000 0.0001 0.0002 0.0004 0.0007 0.0012 0.0020 0.0031
6 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0002 0.0003 0.0005
7 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001

λ
r 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0
0 0.3329 0.3012 0.2725 0.2466 0.2231 0.2019 0.1827 0.1653 0.1496 0.1353
1 0.3662 0.3614 0.3543 0.3452 0.3347 0.3230 0.3106 0.2975 0.2842 0.2707
2 0.2014 0.2169 0.2303 0.2417 0.2510 0.2584 0.2640 0.2678 0.2700 0.2707
3 0.0738 0.0867 0.0998 0.1128 0.1255 0.1378 0.1496 0.1607 0.1710 0.1804
4 0.0203 0.0260 0.0324 0.0395 0.0471 0.0551 0.0636 0.0723 0.0812 0.0902
5 0.0045 0.0062 0.0084 0.0111 0.0141 0.0176 0.0216 0.0260 0.0309 0.0361
6 0.0008 0.0012 0.0018 0.0026 0.0035 0.0047 0.0061 0.0078 0.0098 0.0120
7 0.0001 0.0002 0.0003 0.0005 0.0008 0.0011 0.0015 0.0020 0.0027 0.0034
8 0.0000 0.0000 0.0001 0.0001 0.0001 0.0002 0.0003 0.0005 0.0006 0.0009
9 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0001 0.0001 0.0002


λ
r 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3.0
0 0.1225 0.1108 0.1003 0.0907 0.0821 0.0743 0.0672 0.0608 0.0550 0.0498
1 0.2572 0.2438 0.2306 0.2177 0.2052 0.1931 0.1815 0.1703 0.1596 0.1494
2 0.2700 0.2681 0.2652 0.2613 0.2565 0.2510 0.2450 0.2384 0.2314 0.2240
3 0.1890 0.1966 0.2033 0.2090 0.2138 0.2176 0.2205 0.2225 0.2237 0.2240
4 0.0992 0.1082 0.1169 0.1254 0.1336 0.1414 0.1488 0.1557 0.1622 0.1680
5 0.0417 0.0476 0.0538 0.0602 0.0668 0.0735 0.0804 0.0872 0.0940 0.1008
6 0.0146 0.0174 0.0206 0.0241 0.0278 0.0319 0.0362 0.0407 0.0455 0.0504
7 0.0044 0.0055 0.0068 0.0083 0.0099 0.0118 0.0139 0.0163 0.0188 0.0216
8 0.0011 0.0015 0.0019 0.0025 0.0031 0.0038 0.0047 0.0057 0.0068 0.0081
9 0.0003 0.0004 0.0005 0.0007 0.0009 0.0011 0.0014 0.0018 0.0022 0.0027
10 0.0001 0.0001 0.0001 0.0002 0.0002 0.0003 0.0004 0.0005 0.0006 0.0008
11 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0001 0.0001 0.0002 0.0002
12 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001


For example, in a sample with 11 degrees of freedom, the value of t that gives a probability of 2.5% in the upper-tail area is t0.025 = 2.201.

Table A1.4 t-distribution tables


Degrees of Upper-tail area α
freedom 0.4 0.25 0.1 0.05 0.025 0.01 0.005 0.0025 0.001 0.0005
1 0.325 1.000 3.078 6.314 12.706 31.821 63.657 127.32 318.31 636.62
2 0.289 0.816 1.886 2.920 4.303 6.965 9.925 14.089 22.327 31.598
3 0.277 0.765 1.638 2.353 3.182 4.541 5.841 7.453 10.214 12.924
4 0.271 0.741 1.533 2.132 2.776 3.747 4.604 5.598 7.173 8.610
5 0.267 0.727 1.476 2.015 2.571 3.365 4.032 4.773 5.893 6.869
6 0.265 0.718 1.440 1.943 2.447 3.143 3.707 4.317 5.208 5.959
7 0.263 0.711 1.415 1.895 2.365 2.998 3.499 4.029 4.785 5.408
8 0.262 0.706 1.397 1.860 2.306 2.896 3.355 3.833 4.501 5.041
9 0.261 0.703 1.383 1.833 2.262 2.821 3.250 3.690 4.297 4.781
10 0.260 0.700 1.372 1.812 2.228 2.764 3.169 3.581 4.144 4.587
11 0.260 0.697 1.363 1.796 2.201 2.718 3.106 3.497 4.025 4.437
12 0.259 0.695 1.356 1.782 2.179 2.681 3.055 3.428 3.930 4.318
13 0.259 0.694 1.350 1.771 2.160 2.650 3.012 3.372 3.852 4.221
14 0.258 0.692 1.345 1.761 2.145 2.624 2.977 3.326 3.787 4.140
15 0.258 0.691 1.341 1.753 2.131 2.602 2.947 3.286 3.733 4.073
16 0.258 0.690 1.337 1.746 2.120 2.583 2.921 3.252 3.686 4.015
17 0.257 0.689 1.333 1.740 2.110 2.567 2.898 3.222 3.646 3.965
18 0.257 0.688 1.330 1.734 2.101 2.552 2.878 3.197 3.610 3.922
19 0.257 0.688 1.328 1.729 2.093 2.539 2.861 3.174 3.579 3.883
20 0.257 0.687 1.325 1.725 2.086 2.528 2.845 3.153 3.552 3.850

Degrees of Upper-tail area α
freedom 0.4 0.25 0.1 0.05 0.025 0.01 0.005 0.0025 0.001 0.0005
21 0.257 0.686 1.323 1.721 2.080 2.518 2.831 3.135 3.527 3.819
22 0.256 0.686 1.321 1.717 2.074 2.508 2.819 3.119 3.505 3.792
23 0.256 0.685 1.319 1.714 2.069 2.500 2.807 3.104 3.485 3.767
24 0.256 0.685 1.318 1.711 2.064 2.492 2.797 3.091 3.467 3.745
25 0.256 0.684 1.316 1.708 2.060 2.485 2.787 3.078 3.450 3.725
26 0.256 0.684 1.315 1.706 2.056 2.479 2.779 3.067 3.435 3.707
27 0.256 0.684 1.314 1.703 2.052 2.473 2.771 3.057 3.421 3.690
28 0.256 0.683 1.313 1.701 2.048 2.467 2.763 3.047 3.408 3.674
29 0.256 0.683 1.311 1.699 2.045 2.462 2.756 3.038 3.396 3.659
30 0.256 0.683 1.310 1.697 2.042 2.457 2.750 3.030 3.385 3.646
40 0.255 0.681 1.303 1.684 2.021 2.423 2.704 2.971 3.307 3.551
60 0.254 0.679 1.296 1.671 2.000 2.390 2.660 2.915 3.232 3.460
120 0.254 0.677 1.289 1.658 1.980 2.358 2.617 2.860 3.160 3.373
∞ 0.253 0.674 1.282 1.645 1.960 2.326 2.576 2.807 3.090 3.291


The table gives values of χ²α such that the area of probability in the upper tail of the distribution, to the right of χ²α, is α.

Table A1.5 Chi-squared distribution tables


Degrees of Area in upper tail
freedom 0.995 0.99 0.975 0.95 0.90
1 0.000039 0.000157 0.000962 0.0039321 0.0157908
2 0.0100251 0.0201007 0.0506356 0.102587 0.210720
3 0.0717212 0.114832 0.215795 0.351846 0.584375
4 0.206990 0.297110 0.484419 0.710721 1.063623
5 0.411740 0.554300 0.831211 1.145476 1.61031
6 0.675727 0.872085 1.237347 1.63539 2.20413
7 0.989265 1.239043 1.68987 2.16735 2.83311
8 1.344419 1.646482 2.17973 2.73264 3.48954
9 1.734926 2.087912 2.70039 3.32511 4.16816
10 2.15585 2.55821 3.24697 3.94030 4.86518
11 2.60321 3.05347 3.81575 4.57481 5.57779
12 3.07382 3.57056 4.40379 5.22603 6.30380
13 3.56503 4.10691 5.00874 5.89186 7.04150
14 4.07468 4.66043 5.62872 6.57063 7.78953
15 4.60094 5.22935 6.26214 7.26094 8.54675
16 5.14224 5.81221 6.90766 7.96164 9.31223
17 5.69724 6.40776 7.56418 8.67176 10.0852
18 6.26481 7.01491 8.23075 9.39046 10.8649
19 6.84398 7.63273 8.90655 10.1170 11.6509
20 7.43386 8.26040 9.59083 10.8508 12.4426

Degrees of Area in upper tail
freedom 0.995 0.99 0.975 0.95 0.90
21 8.03366 8.89720 10.2829 11.5913 13.2396
22 8.64272 9.54249 10.9823 12.3380 14.0415
23 9.26042 10.19567 11.6885 13.0905 14.8479
24 9.88623 10.8564 12.4011 13.8484 15.6587
25 10.5197 11.5240 13.1197 14.6114 16.4734
26 11.1603 12.1981 13.8439 15.3791 17.2919
27 11.8076 12.8786 14.5733 16.1513 18.1138
28 12.4613 13.5648 15.3079 16.9279 18.9392
29 13.1211 14.2565 16.0471 17.7083 19.7677
30 13.7867 14.9535 16.7908 18.4926 20.5992
40 20.7065 22.1643 24.4331 26.5093 29.0505
50 27.9907 29.7067 32.3574 34.7642 37.6886
60 35.5346 37.4848 40.4817 43.1879 46.4589
70 43.2752 45.4418 48.7576 51.7393 55.3290
80 51.1720 53.5400 57.1532 60.3915 64.2778
90 59.1963 61.7541 65.6466 69.1260 73.2912
100 67.3276 70.0648 74.2219 77.9295 82.3581

Entries in the table give χ²α values, where α is the area or probability in the upper tail of the chi-squared distribution. For example, with 10 degrees of freedom and an area of 0.01 in the upper tail, χ²0.01 = 23.2093.

Degrees of Area in upper tail


freedom 0.10 0.05 0.025 0.01 0.005
1 2.70554 3.84146 5.02389 6.63490 7.87944
2 4.60517 5.99147 7.37776 9.21034 10.5966
3 6.25139 7.81473 9.34840 11.3449 12.8381
4 7.77944 9.48773 11.1433 13.2767 14.8602
5 9.23635 11.0705 12.8325 15.0863 16.7496

Degrees of Area in upper tail
freedom 0.10 0.05 0.025 0.01 0.005
6 10.6446 12.5916 14.4494 16.8119 18.5476
7 12.0170 14.0671 16.0128 18.4753 20.2777
8 13.3616 15.5073 17.5346 20.0902 21.9550
9 14.6837 16.9190 19.0228 21.6660 23.5893
10 15.9871 18.3070 20.4831 23.2093 25.1882
11 17.2750 19.6751 21.9200 24.7250 26.7569
12 18.5494 21.0261 23.3367 26.2170 28.2995
13 19.8119 22.3621 24.7356 27.6883 29.8194
14 21.0642 23.6848 26.1190 29.1413 31.3193
15 22.3072 24.9958 27.4884 30.5779 32.8013
16 23.5418 26.2962 28.8454 31.9999 34.2672
17 24.7690 27.5871 30.1910 33.4087 35.7185
18 25.9894 28.8693 31.5264 34.8053 37.1564
19 27.2036 30.1435 32.8523 36.1908 38.5822
20 28.4120 31.4104 34.1696 37.5662 39.9968
21 29.6151 32.6705 35.4789 38.9321 41.4010
22 30.8133 33.9244 36.7807 40.2894 42.7958
23 32.0069 35.1725 38.0757 41.6384 44.1813
24 33.1963 36.4151 39.3641 42.9798 45.5585
25 34.3816 37.6525 40.6465 44.3141 46.9278
26 35.5631 38.8852 41.9232 45.6417 48.2899
27 36.7412 40.1133 43.1944 46.9630 49.6449
28 37.9159 41.3372 44.4607 48.2782 50.9933
29 39.0875 42.5569 45.7222 49.5879 52.3356
30 40.2560 43.7729 46.9792 50.8922 53.6720
40 51.8050 55.7585 59.3417 63.6907 66.7659
50 63.1671 67.5048 71.4202 76.1539 79.4900

Degrees of Area in upper tail
freedom 0.10 0.05 0.025 0.01 0.005
60 74.3970 79.0819 83.2976 88.3794 91.9517
70 85.5271 90.5312 95.0231 100.425 104.215
80 96.5782 101.879 106.629 112.329 116.321
90 107.565 113.145 118.136 124.116 128.299
100 118.498 124.342 129.561 135.807 140.169


Example: for Φ1 = 10 and Φ2 = 9 degrees of freedom, P(F > 3.13) = 0.05 (5% of area) and P(F > 5.26) = 0.01 (1% of area). In each cell of the table the upper entry is the 5% point and the lower entry the 1% point.

Table A1.6 F-distribution tables (Φ1 = degrees of freedom for the greater mean square, across; Φ2 = degrees of freedom for the lesser mean square, down)
Φ2 1 2 3 4 5 6 7 8 9 10 11 12 14 16 20 24 30 40 50 75 100 200 500 ∞ Φ2

1 161 200 216 225 230 234 237 239 241 242 243 244 245 246 248 249 250 251 252 253 253 254 254 254 1
4052 4999 5403 5625 5764 5859 5928 5981 6022 6056 6082 6106 6142 6169 6208 6234 6258 6286 6302 6323 6334 6352 6361 6366

2 18.51 19.00 19.16 19.25 19.30 19.33 19.36 19.37 19.38 19.39 19.40 19.41 19.42 19.43 19.44 19.45 19.46 19.47 19.47 19.48 19.49 19.49 19.50 19.50 2
98.49 99.00 99.17 99.25 99.30 99.33 99.34 99.36 99.38 99.40 99.41 99.42 99.43 99.44 99.45 99.46 99.47 99.48 99.48 99.49 99.49 99.49 99.50 99.50

3 10.13 9.55 9.28 9.12 9.01 8.94 8.88 8.84 8.81 8.78 8.76 8.74 8.71 8.69 8.66 8.64 8.62 8.60 8.58 8.57 8.56 8.54 8.54 8.53 3
34.12 30.82 29.46 28.71 28.24 27.91 27.67 27.49 27.34 27.23 27.13 27.05 26.92 26.83 26.69 26.60 26.50 26.41 26.35 26.27 26.23 26.18 26.14 26.12

4 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 5.96 5.93 5.91 5.87 5.84 5.80 5.77 5.74 5.71 5.70 5.68 5.66 5.65 5.64 5.63 4
21.20 18.00 16.69 15.98 15.52 15.21 14.98 14.80 14.66 14.54 14.45 14.37 14.24 14.15 14.02 13.93 13.83 13.74 13.69 13.61 13.57 13.52 13.48 13.46

5 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.78 4.74 4.70 4.68 4.64 4.60 4.56 4.53 4.50 4.46 4.44 4.42 4.40 4.38 4.37 4.36 5
16.26 13.27 12.06 11.39 10.97 10.67 10.45 10.27 10.15 10.05 9.96 9.89 9.77 9.68 9.55 9.47 9.38 9.29 9.24 9.17 9.13 9.07 9.04 9.02

6 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.06 4.03 4.00 3.96 3.92 3.87 3.84 3.81 3.77 3.75 3.72 3.71 3.69 3.68 3.67 6
13.74 10.92 9.78 9.15 8.75 8.47 8.26 8.10 7.98 7.87 7.79 7.72 7.60 7.52 7.39 7.31 7.23 7.14 7.09 7.02 6.99 6.94 6.90 6.88

7 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.63 3.60 3.57 3.52 3.49 3.44 3.41 3.38 3.34 3.32 3.29 3.28 3.25 3.24 3.23 7
12.25 9.55 8.45 7.85 7.46 7.19 7.00 6.84 6.71 6.62 6.54 6.47 6.35 6.27 6.15 6.07 5.98 5.90 5.85 5.78 5.75 5.70 5.67 5.65

8 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.34 3.31 3.28 3.23 3.20 3.15 3.12 3.08 3.05 3.03 3.00 2.98 2.96 2.94 2.93 8
11.26 8.65 7.59 7.01 6.63 6.37 6.19 6.03 5.91 5.82 5.74 5.67 5.56 5.48 5.36 5.28 5.20 5.11 5.06 5.00 4.96 4.91 4.88 4.86

9 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 3.13 3.10 3.07 3.02 2.98 2.93 2.90 2.86 2.82 2.80 2.77 2.76 2.73 2.72 2.71 9
10.56 8.02 6.99 6.42 6.06 5.80 5.62 5.47 5.35 5.26 5.18 5.11 5.00 4.92 4.80 4.73 4.64 4.56 4.51 4.45 4.41 4.36 4.33 4.31

10 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 2.97 2.94 2.91 2.86 2.82 2.77 2.74 2.70 2.67 2.64 2.61 2.59 2.56 2.55 2.54 10
10.04 7.56 6.55 5.99 5.64 5.39 5.21 5.06 4.95 4.85 4.78 4.71 4.60 4.52 4.41 4.33 4.25 4.17 4.12 4.05 4.01 3.96 3.93 3.91

Φ2 1 2 3 4 5 6 7 8 9 10 11 12 14 16 20 24 30 40 50 75 100 200 500 ∞ Φ2
11 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90 2.86 2.82 2.79 2.74 2.70 2.65 2.61 2.57 2.53 2.50 2.47 2.45 2.42 2.41 2.40 11
9.65 7.20 6.22 5.67 5.32 5.07 4.88 4.74 4.63 4.54 4.46 4.40 4.29 4.21 4.10 4.02 3.94 3.86 3.80 3.74 3.70 3.66 3.62 3.60

12 4.75 3.88 3.49 3.26 3.11 3.00 2.92 2.85 2.80 2.76 2.72 2.69 2.64 2.60 2.54 2.50 2.46 2.42 2.40 2.36 2.35 2.32 2.31 2.30 12
9.33 6.93 5.95 5.41 5.06 4.82 4.65 4.50 4.39 4.30 4.22 4.16 4.05 3.98 3.86 3.78 3.70 3.61 3.56 3.49 3.46 3.41 3.38 3.36

13 4.67 3.80 3.41 3.18 3.02 2.92 2.84 2.77 2.72 2.67 2.63 2.60 2.55 2.51 2.46 2.42 2.38 2.34 2.32 2.28 2.26 2.24 2.22 2.21 13
9.07 6.70 5.74 5.20 4.86 4.62 4.44 4.30 4.19 4.10 4.02 3.96 3.85 3.78 3.67 3.59 3.51 3.42 3.37 3.30 3.27 3.21 3.18 3.16

14 4.60 3.74 3.34 3.11 2.96 2.85 2.77 2.70 2.65 2.60 2.56 2.53 2.48 2.44 2.39 2.35 2.31 2.27 2.24 2.21 2.19 2.16 2.14 2.13 14
8.86 6.51 5.56 5.03 4.69 4.46 4.28 4.14 4.03 3.94 3.86 3.80 3.70 3.62 3.51 3.43 3.34 3.26 3.21 3.14 3.11 3.06 3.02 3.00

15 4.54 3.68 3.29 3.06 2.90 2.79 2.70 2.64 2.59 2.55 2.51 2.48 2.43 2.39 2.33 2.29 2.25 2.21 2.18 2.15 2.12 2.10 2.08 2.07 15
8.68 6.36 5.42 4.89 4.56 4.32 4.14 4.00 3.89 3.80 3.73 3.67 3.56 3.48 3.36 3.29 3.20 3.12 3.07 3.00 2.97 2.92 2.89 2.87

16 4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54 2.49 2.45 2.42 2.37 2.33 2.28 2.24 2.20 2.16 2.13 2.09 2.07 2.04 2.02 2.01 16
8.53 6.23 5.29 4.77 4.44 4.20 4.03 3.89 3.78 3.69 3.61 3.55 3.45 3.37 3.25 3.18 3.10 3.01 2.96 2.89 2.86 2.80 2.77 2.75

17 4.45 3.59 3.20 2.96 2.81 2.70 2.62 2.55 2.50 2.45 2.41 2.38 2.33 2.29 2.23 2.19 2.15 2.11 2.08 2.04 2.02 1.99 1.97 1.96 17
8.40 6.11 5.18 4.67 4.34 4.10 3.93 3.79 3.68 3.59 3.52 3.45 3.35 3.27 3.16 3.08 3.00 2.92 2.86 2.79 2.76 2.70 2.67 2.65

18 4.41 3.55 3.16 2.93 2.77 2.66 2.58 2.51 2.46 2.41 2.37 2.34 2.29 2.25 2.19 2.15 2.11 2.07 2.04 2.00 1.98 1.95 1.93 1.92 18
8.28 6.01 5.09 4.58 4.25 4.01 3.85 3.71 3.60 3.51 3.44 3.37 3.27 3.19 3.07 3.00 2.91 2.83 2.78 2.71 2.68 2.62 2.59 2.57

19 4.38 3.52 3.13 2.90 2.74 2.63 2.55 2.48 2.43 2.38 2.34 2.31 2.26 2.21 2.15 2.11 2.07 2.02 2.00 1.96 1.94 1.91 1.90 1.88 19
8.18 5.93 5.01 4.50 4.17 3.94 3.77 3.63 3.52 3.43 3.36 3.30 3.19 3.12 3.00 2.92 2.84 2.76 2.70 2.63 2.60 2.54 2.51 2.49

20 4.35 3.49 3.10 2.87 2.71 2.60 2.52 2.45 2.40 2.35 2.31 2.28 2.23 2.18 2.12 2.08 2.04 1.99 1.96 1.92 1.90 1.87 1.85 1.84 20
8.10 5.85 4.94 4.43 4.10 3.87 3.71 3.56 3.45 3.37 3.30 3.23 3.13 3.05 2.94 2.86 2.77 2.69 2.63 2.56 2.53 2.47 2.44 2.42

21 4.32 3.47 3.07 2.84 2.68 2.57 2.49 2.42 2.37 2.32 2.28 2.25 2.20 2.15 2.09 2.05 2.00 1.96 1.93 1.89 1.87 1.84 1.82 1.81 21
8.02 5.78 4.87 4.37 4.04 3.81 3.65 3.51 3.40 3.31 3.24 3.17 3.07 2.99 2.88 2.80 2.72 2.63 2.58 2.51 2.47 2.42 2.38 2.36

22 4.30 3.44 3.05 2.82 2.66 2.55 2.47 2.40 2.35 2.30 2.26 2.23 2.18 2.13 2.07 2.03 1.98 1.93 1.91 1.87 1.84 1.81 1.80 1.78 22
7.94 5.72 4.82 4.31 3.99 3.76 3.59 3.45 3.35 3.26 3.18 3.12 3.02 2.94 2.83 2.75 2.67 2.58 2.53 2.46 2.42 2.37 2.33 2.31

23 4.28 3.42 3.03 2.80 2.64 2.53 2.45 2.38 2.32 2.28 2.24 2.20 2.14 2.10 2.04 2.00 1.96 1.91 1.88 1.84 1.82 1.79 1.77 1.76 23
7.88 5.66 4.76 4.26 3.94 3.71 3.54 3.41 3.30 3.21 3.14 3.07 2.97 2.89 2.78 2.70 2.62 2.53 2.48 2.41 2.37 2.32 2.28 2.26

24 4.26 3.40 3.01 2.78 2.62 2.51 2.43 2.36 2.30 2.26 2.22 2.18 2.13 2.09 2.02 1.98 1.94 1.89 1.86 1.82 1.80 1.76 1.74 1.73 24
7.82 5.61 4.72 4.22 3.90 3.67 3.50 3.36 3.25 3.17 3.09 3.03 2.93 2.85 2.74 2.66 2.58 2.49 2.44 2.36 2.33 2.27 2.23 2.21

25 4.24 3.38 2.99 2.76 2.60 2.49 2.41 2.34 2.28 2.24 2.20 2.16 2.11 2.06 2.00 1.96 1.92 1.87 1.84 1.80 1.77 1.74 1.72 1.71 25
7.77 5.57 4.68 4.18 3.86 3.63 3.46 3.32 3.21 3.13 3.05 2.99 2.89 2.81 2.70 2.62 2.54 2.45 2.40 2.32 2.29 2.23 2.19 2.17

Φ2 1 2 3 4 5 6 7 8 9 10 11 12 14 16 20 24 30 40 50 75 100 200 500 ∞ Φ2
26 4.22 3.37 2.98 2.74 2.59 2.47 2.39 2.32 2.27 2.22 2.18 2.15 2.10 2.05 1.99 1.95 1.90 1.85 1.82 1.78 1.76 1.72 1.70 1.69 26
7.72 5.53 4.64 4.14 3.82 3.59 3.42 3.29 3.17 3.09 3.02 2.96 2.86 2.77 2.66 2.58 2.50 2.41 2.36 2.28 2.25 2.19 2.15 2.13

27 4.21 3.35 2.96 2.73 2.57 2.46 2.37 2.30 2.25 2.20 2.16 2.13 2.08 2.03 1.97 1.93 1.88 1.84 1.80 1.76 1.74 1.71 1.68 1.67 27
7.68 5.49 4.60 4.11 3.79 3.56 3.39 3.26 3.14 3.06 2.98 2.93 2.83 2.74 2.63 2.55 2.47 2.38 2.33 2.25 2.21 2.16 2.12 2.10

28 4.20 3.34 2.95 2.71 2.56 2.44 2.36 2.29 2.24 2.19 2.15 2.12 2.06 2.02 1.96 1.91 1.87 1.81 1.78 1.75 1.72 1.69 1.67 1.65 28
7.64 5.45 4.57 4.07 3.76 3.53 3.36 3.23 3.11 3.03 2.95 2.90 2.80 2.71 2.60 2.52 2.44 2.35 2.30 2.22 2.18 2.13 2.09 2.06

29 4.18 3.33 2.93 2.70 2.54 2.43 2.35 2.28 2.22 2.18 2.14 2.10 2.05 2.00 1.94 1.90 1.85 1.80 1.77 1.73 1.71 1.68 1.65 1.64 29
7.60 5.42 4.54 4.04 3.73 3.50 3.33 3.20 3.08 3.00 2.92 2.87 2.77 2.68 2.57 2.49 2.41 2.32 2.27 2.19 2.15 2.10 2.06 2.03

30 4.17 3.32 2.92 2.69 2.53 2.42 2.34 2.27 2.21 2.16 2.12 2.09 2.04 1.99 1.93 1.89 1.84 1.79 1.76 1.72 1.69 1.66 1.64 1.62 30
7.56 5.39 4.51 4.02 3.70 3.47 3.30 3.17 3.06 2.98 2.90 2.84 2.74 2.66 2.55 2.47 2.38 2.29 2.24 2.16 2.13 2.07 2.03 2.01

32 4.15 3.30 2.90 2.67 2.51 2.40 2.32 2.25 2.19 2.14 2.10 2.07 2.02 1.97 1.91 1.86 1.82 1.76 1.74 1.69 1.67 1.64 1.61 1.59 32
7.50 5.34 4.46 3.97 3.66 3.42 3.25 3.12 3.01 2.94 2.86 2.80 2.70 2.62 2.51 2.42 2.34 2.25 2.20 2.12 2.08 2.02 1.98 1.96

34 4.13 3.28 2.88 2.65 2.49 2.38 2.30 2.23 2.17 2.12 2.08 2.05 2.00 1.95 1.89 1.84 1.80 1.74 1.71 1.67 1.64 1.61 1.59 1.57 34
7.44 5.29 4.42 3.93 3.61 3.38 3.21 3.08 2.97 2.89 2.82 2.76 2.66 2.58 2.47 2.38 2.30 2.21 2.15 2.08 2.04 1.98 1.94 1.91

36 4.11 3.26 2.86 2.63 2.48 2.36 2.28 2.21 2.15 2.10 2.06 2.03 1.98 1.93 1.87 1.82 1.78 1.72 1.69 1.65 1.62 1.59 1.56 1.55 36
7.39 5.25 4.38 3.89 3.58 3.35 3.18 3.04 2.94 2.86 2.78 2.72 2.62 2.54 2.43 2.35 2.26 2.17 2.12 2.04 2.00 1.94 1.90 1.87

38 4.10 3.25 2.85 2.62 2.46 2.35 2.26 2.19 2.14 2.09 2.05 2.02 1.96 1.92 1.85 1.80 1.76 1.71 1.67 1.63 1.60 1.57 1.54 1.53 38
7.35 5.21 4.34 3.86 3.54 3.32 3.15 3.02 2.91 2.82 2.75 2.69 2.59 2.51 2.40 2.32 2.22 2.14 2.08 2.00 1.97 1.90 1.86 1.84

40 4.08 3.23 2.84 2.61 2.45 2.34 2.25 2.18 2.12 2.07 2.04 2.00 1.95 1.90 1.84 1.79 1.74 1.69 1.66 1.61 1.59 1.55 1.53 1.51 40
7.31 5.18 4.31 3.83 3.51 3.29 3.12 2.99 2.88 2.80 2.73 2.66 2.56 2.49 2.37 2.29 2.20 2.11 2.05 1.97 1.94 1.88 1.84 1.81

42 4.07 3.22 2.83 2.59 2.44 2.32 2.24 2.17 2.11 2.06 2.02 1.99 1.94 1.89 1.82 1.78 1.73 1.68 1.64 1.60 1.57 1.54 1.51 1.49 42
7.27 5.15 4.29 3.80 3.49 3.26 3.10 2.96 2.86 2.77 2.70 2.64 2.54 2.46 2.35 2.26 2.17 2.08 2.02 1.94 1.91 1.85 1.80 1.78

44 4.06 3.21 2.82 2.58 2.43 2.31 2.23 2.16 2.10 2.05 2.01 1.98 1.92 1.88 1.81 1.76 1.72 1.66 1.63 1.58 1.56 1.52 1.50 1.48 44
7.24 5.12 4.26 3.78 3.46 3.24 3.07 2.94 2.84 2.75 2.68 2.62 2.52 2.44 2.32 2.24 2.15 2.06 2.00 1.92 1.88 1.82 1.78 1.75

46 4.05 3.20 2.81 2.57 2.42 2.30 2.22 2.14 2.09 2.04 2.00 1.97 1.91 1.87 1.80 1.75 1.71 1.65 1.62 1.57 1.54 1.51 1.48 1.46 46
7.21 5.10 4.24 3.76 3.44 3.22 3.05 2.92 2.82 2.73 2.66 2.60 2.50 2.42 2.30 2.22 2.13 2.04 1.98 1.90 1.86 1.80 1.76 1.72

48 4.04 3.19 2.80 2.56 2.41 2.30 2.21 2.14 2.08 2.03 1.99 1.96 1.90 1.86 1.79 1.74 1.70 1.64 1.61 1.56 1.53 1.50 1.47 1.45 48
7.19 5.08 4.22 3.74 3.42 3.20 3.04 2.90 2.80 2.71 2.64 2.58 2.48 2.40 2.28 2.20 2.11 2.02 1.96 1.88 1.84 1.78 1.73 1.70

50 4.03 3.18 2.79 2.56 2.40 2.29 2.20 2.13 2.07 2.02 1.98 1.95 1.90 1.85 1.78 1.74 1.69 1.63 1.60 1.55 1.52 1.48 1.46 1.44 50
7.17 5.06 4.20 3.72 3.41 3.18 3.02 2.88 2.78 2.70 2.62 2.56 2.46 2.39 2.26 2.18 2.10 2.00 1.94 1.86 1.82 1.76 1.71 1.68

Φ2 1 2 3 4 5 6 7 8 9 10 11 12 14 16 20 24 30 40 50 75 100 200 500 ∞ Φ2
55 4.02 3.17 2.78 2.54 2.38 2.27 2.18 2.11 2.05 2.00 1.97 1.93 1.88 1.83 1.76 1.72 1.67 1.61 1.58 1.52 1.50 1.46 1.43 1.41 55
7.12 5.01 4.16 3.68 3.37 3.15 2.98 2.85 2.75 2.66 2.59 2.53 2.43 2.35 2.23 2.15 2.06 1.96 1.90 1.82 1.78 1.71 1.66 1.64

60 4.00 3.15 2.76 2.52 2.37 2.25 2.17 2.10 2.04 1.99 1.95 1.92 1.86 1.81 1.75 1.70 1.65 1.59 1.56 1.50 1.48 1.44 1.41 1.39 60
7.08 4.98 4.13 3.65 3.34 3.12 2.95 2.82 2.72 2.63 2.56 2.50 2.40 2.32 2.20 2.12 2.03 1.93 1.87 1.79 1.74 1.68 1.63 1.60

65 3.99 3.14 2.75 2.51 2.36 2.24 2.15 2.08 2.02 1.98 1.94 1.90 1.85 1.80 1.73 1.68 1.63 1.57 1.54 1.49 1.46 1.42 1.39 1.37 65
7.04 4.95 4.10 3.62 3.31 3.09 2.93 2.79 2.70 2.61 2.54 2.47 2.37 2.30 2.18 2.09 2.00 1.90 1.84 1.76 1.71 1.64 1.60 1.56

70 3.98 3.13 2.74 2.50 2.35 2.23 2.14 2.07 2.01 1.97 1.93 1.89 1.84 1.79 1.72 1.67 1.62 1.56 1.53 1.47 1.45 1.40 1.37 1.35 70
7.01 4.92 4.08 3.60 3.29 3.07 2.91 2.77 2.67 2.59 2.51 2.45 2.35 2.28 2.15 2.07 1.98 1.88 1.82 1.74 1.69 1.62 1.56 1.53

80 3.96 3.11 2.72 2.48 2.33 2.21 2.12 2.05 1.99 1.95 1.91 1.88 1.82 1.77 1.70 1.65 1.60 1.54 1.51 1.45 1.42 1.38 1.35 1.32 80
6.96 4.88 4.04 3.56 3.25 3.04 2.87 2.74 2.64 2.55 2.48 2.41 2.32 2.24 2.11 2.03 1.94 1.84 1.78 1.70 1.65 1.57 1.52 1.49

100 3.94 3.09 2.70 2.46 2.30 2.19 2.10 2.03 1.97 1.92 1.88 1.85 1.79 1.75 1.68 1.63 1.57 1.51 1.48 1.42 1.39 1.34 1.30 1.28 100
6.90 4.82 3.98 3.51 3.20 2.99 2.82 2.69 2.59 2.51 2.43 2.36 2.26 2.19 2.06 1.98 1.89 1.79 1.73 1.64 1.59 1.51 1.46 1.43

125 3.92 3.07 2.68 2.44 2.29 2.17 2.08 2.01 1.95 1.90 1.86 1.83 1.77 1.72 1.65 1.60 1.55 1.49 1.45 1.39 1.36 1.31 1.27 1.25 125
6.84 4.78 3.94 3.47 3.17 2.95 2.79 2.65 2.56 2.47 2.40 2.33 2.23 2.15 2.03 1.94 1.85 1.75 1.68 1.59 1.54 1.46 1.40 1.37

150 3.91 3.06 2.67 2.43 2.27 2.16 2.07 2.00 1.94 1.89 1.85 1.82 1.76 1.71 1.64 1.59 1.54 1.47 1.44 1.37 1.34 1.29 1.25 1.22 150
6.81 4.75 3.91 3.44 3.14 2.92 2.76 2.62 2.53 2.44 2.37 2.30 2.20 2.12 2.00 1.91 1.83 1.72 1.66 1.56 1.51 1.43 1.37 1.33

200 3.89 3.04 2.65 2.41 2.26 2.14 2.05 1.98 1.92 1.87 1.83 1.80 1.74 1.69 1.62 1.57 1.52 1.45 1.42 1.35 1.32 1.26 1.22 1.19 200
6.76 4.71 3.88 3.41 3.11 2.90 2.73 2.60 2.50 2.41 2.34 2.28 2.17 2.09 1.97 1.88 1.79 1.69 1.62 1.53 1.48 1.39 1.33 1.28

400 3.86 3.02 2.62 2.39 2.23 2.12 2.03 1.96 1.90 1.85 1.81 1.78 1.72 1.67 1.60 1.54 1.49 1.42 1.38 1.32 1.28 1.22 1.16 1.13 400
6.70 4.66 3.83 3.36 3.06 2.85 2.69 2.55 2.46 2.37 2.29 2.23 2.12 2.04 1.92 1.84 1.74 1.64 1.57 1.47 1.42 1.32 1.24 1.19

1000 3.85 3.00 2.61 2.38 2.22 2.10 2.02 1.95 1.89 1.84 1.80 1.76 1.70 1.65 1.58 1.53 1.47 1.41 1.36 1.30 1.26 1.19 1.13 1.08 1000
6.66 4.62 3.80 3.34 3.04 2.82 2.66 2.53 2.43 2.34 2.26 2.20 2.09 2.01 1.89 1.81 1.71 1.61 1.54 1.44 1.38 1.28 1.19 1.11

∞ 3.84 2.99 2.60 2.37 2.21 2.09 2.01 1.94 1.88 1.83 1.79 1.75 1.69 1.64 1.57 1.52 1.46 1.40 1.35 1.28 1.24 1.17 1.11 1.00 ∞
6.64 4.60 3.78 3.32 3.02 2.80 2.64 2.51 2.41 2.32 2.24 2.18 2.07 1.99 1.87 1.79 1.69 1.59 1.52 1.41 1.36 1.25 1.15 1.00


Table A1.7 Runs test critical value tables


(a) Lower critical values
n1 \ n2 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

2 2 2 2 2 2 2 2 2 2
3 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3
4 2 2 2 3 3 3 3 3 3 3 3 4 4 4 4 4
5 2 2 3 3 3 3 3 4 4 4 4 4 4 4 5 5 5

6 2 2 3 3 3 3 4 4 4 4 5 5 5 5 5 5 6 6
7 2 2 3 3 3 4 4 5 5 5 5 5 6 6 6 6 6 6
8 2 3 3 3 4 4 5 5 5 6 6 6 6 6 7 7 7 7
9 2 3 3 4 4 5 5 5 6 6 6 7 7 7 7 8 8 8

10 2 3 3 4 5 5 5 6 6 7 7 7 7 8 8 8 8 9
11 2 3 4 4 5 5 6 6 7 7 7 8 8 8 9 9 9 9
12 2 2 3 4 4 5 6 6 7 7 7 8 8 8 9 9 9 10 10
13 2 2 3 4 5 5 6 6 7 7 8 8 9 9 9 10 10 10 10
14 2 2 3 4 5 5 6 7 7 8 8 9 9 9 10 10 10 11 11

15 2 3 3 4 5 6 6 7 7 8 8 9 9 10 10 11 11 11 12
16 2 3 4 4 5 6 6 7 8 8 9 9 10 10 11 11 11 12 12
17 2 3 4 4 5 6 7 7 8 9 9 10 10 11 11 11 12 12 13
18 2 3 4 5 5 6 7 8 8 9 9 10 10 11 11 12 12 13 13
19 2 3 4 5 6 6 7 8 8 9 10 10 11 11 12 12 13 13 13
20 2 3 4 5 6 6 7 8 9 9 10 10 11 12 12 13 13 13 14

(b) Upper critical values


n1 \ n2 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

2
3
4 9 9
5 9 10 10 11 11

6 9 10 11 12 12 13 13 13 13
7 11 12 13 13 14 14 14 14 15 15 15
8 11 12 13 14 14 15 15 16 16 16 16 17 17 17 17 17
9 13 14 14 15 16 16 16 17 17 18 18 18 18 18 18

10 13 14 15 16 16 17 17 18 18 18 19 19 19 20 20
11 13 14 15 16 17 17 18 19 19 19 20 20 20 21 21


12 13 14 16 16 17 18 19 19 20 20 21 21 21 22 22
13 15 16 17 18 19 20 20 21 21 22 22 23 23
14 15 16 17 18 19 20 20 21 22 22 23 23 23 24

15 15 16 18 18 19 20 21 22 22 23 23 24 24 25
16 17 18 19 20 21 21 22 23 23 24 25 25 25
17 17 18 19 20 21 22 23 23 24 25 25 26 26
18 17 18 19 20 21 22 23 24 25 25 26 26 27
19 17 18 20 21 22 23 23 24 25 26 26 27 27
20 17 18 20 21 22 23 24 25 25 26 27 27 28



Appendix 2

Examination Formulae Sheet



Arithmetic mean = x̄ = ∑x / n

Mean absolute deviation = ∑|x − x̄| / n

Variance = ∑(x − x̄)² / (n − 1)

Short-Cut Formula
Variance = (∑x² − n·x̄²) / (n − 1)

Standard deviation = √[∑(x − x̄)² / (n − 1)]

Coefficient of variation = s / x̄

Combinations = nCr = n! / [r!(n − r)!]
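As a quick numerical check of the formulas above, the Python sketch below computes each measure for a small invented sample; the data values are illustrative only.

```python
# Illustrative check of the descriptive formulas (made-up data).
x = [11, 13, 16, 20, 25]
n = len(x)
mean = sum(x) / n                                             # arithmetic mean
mad = sum(abs(v - mean) for v in x) / n                       # mean absolute deviation
variance = sum((v - mean) ** 2 for v in x) / (n - 1)          # direct formula
shortcut = (sum(v * v for v in x) - n * mean ** 2) / (n - 1)  # short-cut formula
sd = variance ** 0.5                                          # standard deviation
print(mean, mad, variance, shortcut, sd)  # 17.0 4.4 31.5 31.5 5.61...
```

Both variance expressions give 31.5, confirming that the short-cut form is the same formula rearranged.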

Binomial Distribution
P(r successes) = nCr · p^r · (1 − p)^(n−r)
where p = probability of a success, n = sample size
Mean = np
Standard deviation = √[np(1 − p)]
For a proportion
Mean = p
Standard deviation = √[p(1 − p)/n]
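The binomial formula can be checked directly against Table A1.1; the short sketch below reproduces the entry for n = 5, p = 0.30, r = 2.

```python
from math import comb

def binom_prob(r, n, p):
    """P(r successes in n trials, each succeeding with probability p)."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

print(round(binom_prob(2, 5, 0.30), 4))  # 0.3087, matching Table A1.1
```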

Estimation
95 per cent confidence limits for population mean:
x̄ ± 1.96·σ/√n

Pooled standard deviation for two samples:

s = √[((n1 − 1)·s1² + (n2 − 1)·s2²) / (n1 + n2 − 2)]

where n1 and n2 are the sample sizes and s1 and s2 are the sample standard deviations.

Poisson Distribution
P(r events) = e^(−λ) · λ^r / r!

where λ is the average number of events per sample.
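Similarly, the Poisson formula reproduces the entries of Table A1.3; the sketch below checks the value for λ = 2.0, r = 1.

```python
from math import exp, factorial

def poisson_prob(r, lam):
    """P(r events) when events occur at an average rate lam per sample."""
    return exp(-lam) * lam ** r / factorial(r)

print(round(poisson_prob(1, 2.0), 4))  # 0.2707, matching Table A1.3
```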

Normal Distribution
Testing a sample mean
z = (x̄ − μ) / (σ/√n)

Testing a sample proportion

z = (p̂ − p) / √[p(1 − p)/n]

where p̂ is the sample proportion and p the hypothesised population proportion.

Testing the difference between two sample means

z = (x̄1 − x̄2) / √(s1²/n1 + s2²/n2)
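A minimal sketch of the first of these tests; the hypothesised mean 50, sample mean 52.4, σ = 8 and n = 64 are invented numbers for illustration.

```python
from math import sqrt

mu, x_bar, sigma, n = 50, 52.4, 8, 64    # illustrative values only
z = (x_bar - mu) / (sigma / sqrt(n))
print(round(z, 2))  # 2.4, beyond 1.96, so significant at the 5% level (two-tailed)
```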

t-Distribution
t = (x̄ − μ) / (s/√n)

95% confidence limits for population mean:

x̄ ± t0.025 · s/√n
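As an illustration, for an invented sample of n = 12 with mean 100 and s = 6, the 2.5% point with 11 degrees of freedom is 2.201 (Table A1.4):

```python
from math import sqrt

n, x_bar, s, t_crit = 12, 100, 6, 2.201  # t(0.025) with 11 degrees of freedom
half_width = t_crit * s / sqrt(n)
print(round(x_bar - half_width, 2), "to", round(x_bar + half_width, 2))
# 96.19 to 103.81
```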

Chi-Squared Distribution
χ² = (n − 1) × s²/σ²

95% confidence limits for population variance:

(n − 1)·s²/χ²0.975 > Population variance > (n − 1)·s²/χ²0.025

where s² is the sample variance and n is the sample size.


Testing differences in proportions
χ² = ∑[(O − E)²/E] with (no. of rows − 1)(no. of columns − 1) degrees of freedom, where O and E are the observed and expected frequencies.
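A sketch of the contingency-table calculation with an invented 2 × 2 table; each expected frequency E is row total × column total / grand total.

```python
observed = [[30, 20], [20, 30]]          # illustrative 2 x 2 counts
rows = [sum(r) for r in observed]
cols = [sum(c) for c in zip(*observed)]
total = sum(rows)
chi_sq = sum((observed[i][j] - rows[i] * cols[j] / total) ** 2
             / (rows[i] * cols[j] / total)
             for i in range(2) for j in range(2))
print(chi_sq)  # 4.0 with 1 degree of freedom; the 5% critical value is 3.84
```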

F-Distribution
F = s1²/s2² (the greater variance in the numerator)


One-Way Analysis of Variance


Total SS = ∑(x − x̿)² with (n − 1) degrees of freedom

Sum of squares between treatments
= No. of observations in a treatment × ∑(x̄j − x̿)²

where x̄j is the mean of the jth treatment, with (c − 1) degrees of freedom.

Two-Way Analysis of Variance


Total SS = ∑(x − x̿)² with (n − 1) degrees of freedom

Sum of squares between treatments
= No. of observations in a treatment × ∑(x̄j − x̿)²

where x̄j is the mean of the jth treatment, with (c − 1) degrees of freedom.

Sum of squares between blocks
= No. of observations in a block × ∑(x̄i − x̿)²

where x̄i is the mean of the ith block, with (r − 1) degrees of freedom.

Regression
Least squares regression line of y on x is
y = a + bx

where

b = ∑(x − x̄)(y − ȳ) / ∑(x − x̄)²  and  a = ȳ − b·x̄
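A short sketch of the least squares formulas applied to five invented (x, y) pairs:

```python
x = [1, 2, 3, 4, 5]                      # illustrative data
y = [2.1, 3.9, 6.2, 8.0, 9.8]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
a = y_bar - b * x_bar
print(f"y = {a:.2f} + {b:.2f}x")         # y = 0.15 + 1.95x
```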

Correlation Coefficient
r = ∑(x − x̄)(y − ȳ) / √[∑(x − x̄)² · ∑(y − ȳ)²]

Runs Test
If n1 or n2 > 20, number of runs is normally distributed with
Mean = 2n1n2/(n1 + n2) + 1

Standard deviation = √{2n1n2(2n1n2 − n1 − n2) / [(n1 + n2)²(n1 + n2 − 1)]}
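For example, with invented counts n1 = 25 and n2 = 30 of the two types of observation, the large-sample moments are:

```python
from math import sqrt

n1, n2 = 25, 30                          # illustrative counts
mean_runs = 2 * n1 * n2 / (n1 + n2) + 1
sd_runs = sqrt(2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
               / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
print(round(mean_runs, 2), round(sd_runs, 2))  # 28.27 and 3.64
```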

Exponential Smoothing
S_t = (1 − α)·S_(t−1) + α·x_t


Holt’s Method
S_t = (1 − α)·(S_(t−1) + T_(t−1)) + α·x_t
T_t = (1 − γ)·T_(t−1) + γ·(S_t − S_(t−1))
F_(t+k) = S_t + k·T_t
where
x_t = actual observation at time t
S_t = smoothed value at time t
α, γ = smoothing constants between 0 and 1
T_t = smoothed trend at time t
F_(t+k) = forecast for k periods ahead

Mean Square Error


MSE = ∑(x_t − F_t)² / No. of forecasts
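The sketch below ties the exponential smoothing and MSE formulas together on an invented demand series; α = 0.3 and the initialisation at the first observation are arbitrary illustrative choices.

```python
demand = [20, 22, 21, 25, 24, 27]        # illustrative series
alpha, s = 0.3, demand[0]                # initialise smoothed value at x1
forecasts = []
for x in demand[1:]:
    forecasts.append(s)                  # S at t-1 is the forecast of x at t
    s = (1 - alpha) * s + alpha * x
errors = [x - f for x, f in zip(demand[1:], forecasts)]
mse = sum(e * e for e in errors) / len(errors)
print(round(s, 2), round(mse, 2))        # next-period forecast and its MSE
```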



Appendix 3

Practice Final Examinations


Two practice examinations are provided so that students who have worked through the course and taken the first examination, but on the basis of their performance are not satisfied that they have attained mastery of the material, can study the course again and have a second opportunity to test themselves. Where performance in the first examination is satisfactory, the second may be used for additional practice.


Practice Final Examination 1


Time allowed: 3 hours
Section A: Multiple choice.
Answer all 25 questions. Each correct answer receives 2 marks.
An incorrect answer or unanswered question receives 0 marks.
Total marks for Section A = 50 = 33 per cent of total for paper.
Section B: Case studies.
Answer all 4 questions. Each question will be marked out of 25.
Marks for individual parts of a question are shown in the question.
Total marks for Section B = 100 = 67 per cent of total for paper.

Section A: Multiple Choice Questions

1 A toy manufacturer is putting together a financial plan for his business, but to do so he
needs to estimate the Christmas demand for his newly developed children’s board
game. He knows demand will vary according to the price charged, but he does not want
to set the price until nearer the Christmas season. However, he does have information
on the two similar products he marketed last year. One was priced at £5 and sold six
(thousands); the other was priced at £11 and sold three (thousands). Over this range he
thinks the relationship between price and demand is linear. What is the straight-line
equation linking price and demand?
A. Demand = 3.5 + 0.5 × Price
B. Demand = 6 − 2 × Price
C. Demand = 8.5 − 0.5 × Price
D. Demand = 3.5 − 0.5 × Price

2 A furniture manufacturer is a supplier for a large retail chain. Amongst her products are
a round coffee table and a square coffee table, both of which can be made on either of
two machines. Knowledge of the production times per table and the time per week that
each machine is available results in the following two equations showing possible
production mixes for each machine.
Machine A: 2R + 3S = 27
Machine B: 5R + 4S = 50
where R = number of round tables made per week
and S = number of square tables made per week.
The solution to these simultaneous equations is the numbers of each product that must
be made if all available machine time is used up. Which of the following production
mixes is the solution to the equations?
A. = 5, = 6
B. = 3, = 7
C. = 10, = 0
D. = 6, = 5


3 Which of the following is correct? When analysing any management data, the pattern
should be found before considering exceptions because:
A. exceptions can only be defined in the context of a pattern.
B. in management only the pattern needs to be considered.
C. finding the pattern may involve statistical analysis.
D. patterns are easier to find than exceptions.
Questions 4 and 5 are based on the following data, which refer to the
number of customer complaints about the quality of service on one of an
airline’s routes in each of ten weeks:

11 20 12 11 16 13 15 21 18 13

4 Which of the following is correct? The range is:


A. 6
B. 10
C. 11
D. 21

5 Which of the following is correct? The mean absolute deviation is:


A. 0
B. 3
C. 3.3
D. 15

6 Which of the following is correct? When randomness is incorporated into a sampling method:
A. the sample will be representative.
B. the sample will be as accurate as possible.
C. the accuracy of the sample can be calculated.
D. the cost of sampling is reduced.

7 An integrated factory production line fills, labels and seals 1-litre bottles of mineral
water. Production is monitored to check how much liquid the bottles actually contain
and that legal requirements on minimum contents are being met. A distribution is
formed by taking a random sample of 1000 bottles and measuring the amount of liquid
each holds. The measurements are grouped in intervals of 1 millilitre and a histogram
formed. To calculate how much of the production falls within different volume ranges, it
is necessary to know what type the resulting distribution is. Which of the following is
correct? The distribution is:
A. normal
B. binomial
C. Poisson
D. observed


8 In practice the binomial distribution is usually approximated by the normal if the rule of
thumb, np > 5, holds. Which of the following is correct? The reason for following the
procedure is that:
A. it is easier to calculate probabilities.
B. suitable binomial tables are not available.
C. basic theory shows that the distribution is in reality normal.
D. the Central Limit Theorem applies.

9 A medical consultant sees 100 patients per month in his office. In an attempt to optimise
his appointments system, he decides to analyse the time he spends with each patient.
What type of standard distribution should he expect for the average time he spends
with each patient over the course of a month?
A. normal
B.
C. binomial
D. Poisson
Questions 10 to 12 refer to the following circumstances.
A supermarket is checking for the presence of bacterial infection in precooked TV
dinners. To do this a sample of 16 is selected at random and given a laboratory test that
gives each meal a score between 0 (completely free of infection) and 100 (highly
infected). Tests on the sample of 16 result in an average score of 10. It is known from
investigations of other products that the infection score is likely to be normally
distributed with a standard deviation of two.

10 What is the point estimate of the population mean (the average score for all dinners)?
A. 2
B. 0.5
C. 8 to 12
D. 10

11 What are the 95 per cent confidence limits for the population mean?
A. 6 to 14
B. 8 to 12
C. 9 to 11
D. 9.5 to 10.5

12 Which one of the following concepts did you have to use in answering Question 11?
A. Central limit theorem
B. Variance sum theorem
C. Sampling distribution of the mean
D. Analysis of variance


13 A consultant has made a presentation concerning the likely impact of a new quality
control system on rejection rates. The production director asks why a significance level
of 5 per cent has been used. Which of the following is valid? It is:
A. the definition of significance.
B. convention.
C. the boundary between an acceptable risk and an unacceptable risk.
D. part of normal distribution theory.

14 A restaurant chain asks its customers to rate the service on a scale from 0 to 10 before
they leave the restaurant. It is a corporate objective that an overall average of 8 should
be achieved. To check how a particular restaurant compares with the corporate
average, a random sample of the ratings of 36 customers is selected. The sample has a
mean of 6.8 and standard deviation of 3. A two-tailed significance test is conducted with
the hypothesis that the overall mean rating is 8. Which of the following is correct?
A. The test is inconclusive at the 5 per cent level.
B. The test suggests the hypothesis should be accepted at the 5 per cent level.
C. The test suggests the hypothesis should be rejected at the 5 per cent level.
D. The test suggests the hypothesis should be rejected at the 1 per cent level.

15 A chi-squared test is used in testing customer attitudes to different brands of toothpaste. The value of chi-squared measured in the test is 23. What is the critical value in the upper 10 per cent tail of a chi-squared distribution with 18 degrees of freedom with which to compare the observed value?
A. 16.085
B. 24.769
C. 25.989
D. 28.869

16 Which of the following is correct? Analysis of variance is a technique that:


A. compares the variances of two samples.
B. compares the variances of several samples.
C. tests the hypothesis that there is no difference between the means of several
samples.
D. calculates the combined variance of several samples.

17 Which of the following is correct?


I. Correlation measures the strength of the linear relationship between two variables.
II. If high values of one variable are associated with low values of a second and vice
versa, then the two variables are negatively correlated.
III. Regression measures whether changes in one variable cause changes in another.
A. I only
B. I and II only
C. II and III only
D. I, II and III


Questions 18 to 20 refer to the following computer output from a regression analysis relating a company’s annual revenue to three independent variables. The results are to be used to predict future revenue as part of the corporate plans.

Variable Coefficient Standard error


Gross domestic product 6.4 0.8
Previous year’s profit 0.6 0.1
Marketing expenditure 15.6 12.0
Constant 8.4 1.2
R-bar-squared = 0.94
Residual standard error = 1.2

18 Which of the following is correct? The value of R-bar-squared means that:


A. predictions of future annual revenue will be 94 per cent accurate.
B. residual standard error should be reduced by 94 per cent to allow for degrees
of freedom.
C. the regression explains 94 per cent of the variation in the past record of
revenue.
D. 94 per cent of the residuals are within 2 residual standard errors of the
regression line.

19 Which of the following is correct? The t values for the three variables are respectively:
A. 8.0, 6.0, 1.3
B. 0.8, 0.1, 12.0
C. 6.4, 0.6, 15.6
D. 5.4, 0.5, 13.0

20 Which of the following is correct? Statistically, the variable ‘marketing expenditure’ should be eliminated from the regression equation because:
A. it has the highest standard error.
B. it has the lowest t value.
C. its standard error is less than 2.
D. its t value is less than 2.

21 A company is trying to forecast its next two years’ revenue for a new venture in the Far
East. Because of the lack of suitable data, it has been decided to use the Delphi tech-
nique. Six executives have been selected to take part. Which of the following is correct?
The executives should:
A. all be at the same level of seniority.
B. not talk to one another about their forecasts.
C. adjust their forecast towards the mean at each iteration.
D. continue making forecasts until a consensus is reached.


Questions 22 and 23 refer to the following circumstances.


A mail order company has just installed a new computerised inventory control sys-
tem. The stock levels for the first seven weeks under the new system were:

Week 1 2 3 4 5 6 7
Stock (£ million) 13 15 14 10 8 11 12

22 For budgeting purposes a three-point moving average forecast for period 8 is calculated.
Which of the following is correct? The forecast is:
A. 9.7
B. 10.3
C. 10.7
D. 12

23 In carrying out sensitivity analysis, an exponential smoothing (α = 0.2) forecast for period 8 is prepared. Which of the following is correct? The forecast is:
A. 10.3
B. 11.7
C. 12
D. 12.7

24 Which technique should a confectionery company use to forecast quarterly ice cream
sales for the year ahead based on a ten-year data record?
A. Delphi
B. Exponential smoothing
C. Holt’s Method
D. Holt–Winters Method

25 A manufacturer of white goods has used two forecasting methods, three-point moving
averages and exponential smoothing, to make short-term forecasts of the weekly stock
levels. The company now wants to choose the more accurate of the two and use it
alone to forecast. To do this two measures of accuracy, the MAD (mean absolute
deviation) and MSE (mean square error), have been calculated for each method over the
last ten weeks. What conclusion should be drawn if moving averages has the lower MSE
while exponential smoothing has the lower MAD?
A. Moving averages is superior.
B. Exponential smoothing is superior.
C. A calculating error must have been made because such a situation could never
arise.
D. Exponential smoothing is better on average but probably has more large
individual errors.


Section B: Case Studies

Case Study 1

1 The UK publisher of a book on employment opportunities for business graduates measured the exposure the book was receiving as part of an investigation into marketing effectiveness. A survey of bookshops was carried out to determine, on average, how
many copies of the book are displayed on shelves. To do this the publisher selected at
random a sample of ten universities or colleges from all universities and colleges offering
business degrees. Every bookshop near one of the selected campuses (within a two-mile
radius of the campus centre) was visited and the number of books on display counted.
a. Which sampling method has been used? (5 marks)
b. Why is this method preferable to simple random sampling? (10 marks)
c. How might the sample be biased? (5 marks)
d. How might the sampling method be improved? (5 marks)

Case Study 2

1 Since the introduction of a new line in scented envelopes, production schedules have
been continually changed because of substantial variations in demand from quarter to
quarter. The production scheduling department is trying to improve the situation by
adopting a scientific approach to forecasting demand. They have applied the decomposi-
tion method to quarterly data covering 2012–17. The separation of the elements of this
time series revealed the following:
i. The trend was: Demand = 2600 + 40t
where t =1, 2, 3, etc. and t = 1 represents Q1, 2012
ii. There was no cycle.
iii. The seasonal factors (actual/moving average) were:

Q1 Q2 Q3 Q4
2012 – – 76 100
2013 113 109 73 104
2014 109 111 76 106
2015 110 110 72 104
2016 114 107 77 102
2017 114 108 76 108

Now, in the first quarter of 2018, the production scheduling department needs a
forecast for the third quarter of 2018.
a. What should the forecast be? (15 marks)
b. Should the schedule be delayed until the results for the current quarter are available
(i.e. is the actual demand for the first quarter likely to affect the forecast)? (5 marks)
c. If the method proves unsatisfactory in practice, what other forecasting methods
could be used? (5 marks)


Case Study 3

1 Seven hundred motorists, chosen at random, took part in a market research survey to
find out about consumers’ attitudes to brands of petrol. The ultimate objective was to
determine the impact of advertising campaigns.
The motorists were presented with a list of petrol brands:
 BP
 Esso
 Fina
 Jet
 National
 Shell
They were also presented with a list of attributes:
 Good mileage
 Long engine life
 For with-it people
 Value for money
 Large company
The motorists were then asked to tick the petrol brands to which they thought the
attributes applied. A motorist was free to tick all or none of the brands for each
attribute. Table A3.1 contains the results, showing the percentage of the sample that
ticked each brand for each attribute.

Table A3.1 Consumers’ attitudes to brands of petrol


Brand
Attributes BP Esso Fina Jet National Shell
Good mileage 37.1 61.5 7.6 11.9 22.6 85.3
Long engine life 12.2 19.2 3.8 5.0 8.5 19.0
For with-it people 6.9 10.0 1.2 3.1 12.4 10.9
Value for money 24.0 37.8 5.1 39.7 15.3 40.3
Large company 30.4 67.9 5.4 2.3 4.8 59.8

a. Analyse this table. What is the main pattern in it? (10 marks)
b. What are the exceptions? (5 marks)
c. Can you explain the exceptions? (5 marks)
d. What further information would be useful to understand the table better? (5 marks)
NB: to analyse the table you will find it useful to re-present it first by rounding, reorder-
ing, etc.


Case Study 4

1 A national government analyst is investigating the fuel consumption of a popular make of car when it runs on a new ‘environment friendly’ unleaded petrol. The preliminary data
shown in Table A3.2 were collected from 12 trips of similar length and in the same car.
The evidence collected from each trip was the fuel consumption in miles per gallon, the
average speed in miles per hour, the average altitude in hundreds of metres and the
total weight of passengers and luggage in the car in hundreds of kilos.

Table A3.2 Preliminary data


Trip   Fuel consumption (mpg)   Average speed (mph)   Altitude (’00 m)   Passenger weight (’00 kg)
1 33 50 0 2
2 36 40 1 3
3 31 45 3 2
4 24 55 4 4
5 34 35 1 3
6 27 45 5 4
7 20 55 3 5
8 35 35 0 2
9 32 40 1 3
10 27 50 2 3
11 23 55 4 4
12 28 45 2 2

In an attempt to measure the effect of speed, altitude and load on fuel consumption, multiple regression analysis was used. The results were:

Variable Coefficient Standard error


mph −0.4 0.12
Altitude −0.8 0.58
Weight −1.5 0.92
Constant 53.9
R-bar-squared = 0.82
Standard error of residuals = 2.2

a. What is the regression equation linking fuel consumption to the other variables?
What consumption would you predict for a trip at an average speed of 45 mph, at an
altitude of 200 m and with total passenger weight of 300 kg? (5 marks)
b. Calculate the values of the residuals. Is there a pattern in the residuals?
(5 marks)


c. Should any of the variables be excluded from the regression equation?


(5 marks)
d. Approximately, what is the accuracy of the prediction made in part (a)? What is the
difference between the standard error of residuals and SE(Predicted) as a measure
of forecasting accuracy? (5 marks)
e. How could the analysis be improved? (5 marks)


Practice Final Examination 2


Time allowed: 3 hours
Section A: Multiple choice.
Answer all 25 questions. Each correct answer receives 2 marks.
An incorrect answer or unanswered question receives 0 marks.
Total marks for Section A = 50 = 33 per cent of total for paper.
Section B: Case studies.
Answer all 4 questions. Each question will be marked out of 25.
Marks for individual parts of a question are shown in the question.
Total marks for Section B = 100 = 67 per cent of total for paper.

Section A: Multiple Choice Questions

1 A box contains 12 wine glasses, three of which have flaws. A glass is selected at random
and is found to be flawed; it is not returned to the box.
What is the probability that the next glass selected will also be faulty?
A. 0.17
B. 0.18
C. 0.25
D. 0.27
Questions 2 and 3 are based on the following information:
A supermarket’s daily sales figures (in £000) over the last quarter (13 weeks = 78
days) have been summarised:

Daily sales (£000) Number of days


Less than 40 10
40–49 24
50–59 22
60–69 19
70 or more 3

2 What is the probability P(40 ≤ x ≤ 59)?


A. 0.13
B. 0.28
C. 0.31
D. 0.59

3 What is the sales level that was exceeded on 56 per cent of all days?
A. £43 680
B. £50 000
C. £56 320
D. £60 000


4 A baker is producing standard loaves. A sample of 50 loaves was taken and weighed. It
was found that the average loaf weighed 502 g with a variance of 30 g².
In what range would the weight of 95 per cent of the sample loaves be?
A. 491.0 to 513.0 g
B. 496.5 to 507.5 g
C. 500.4 to 503.5 g
D. 501.8 to 502.2 g

5 35 employees attended a training course in November and a further 45 attended a similar course in February. Of those on the November course, 54 per cent rated it as
satisfactory, while only 40 per cent of those on the February course gave it this rating.
What percentage of all those receiving this training gave it a rating of satisfactory?
A. 45.0 per cent
B. 46.2 per cent
C. 47.0 per cent
D. 51.4 per cent

6 What is the solution of (2y + 5)/(3 − 4y) = 7?


A. y = 0.53
B. y = 0.83
C. y = 0.87
D. y = 2.67

7 What is the solution of the following simultaneous equations?


2x + 5y = 18
3x − 4y = 4
A. x = 0.28; y = 3.71
B. x = 7.43; y = 3.14
C. x = 4.00; y = 2.00
D. x = −7.42; y = 6.57

8 A questionnaire on computer usage was sent to a sample of small businesses. Of the 250 replies received, 150 indicated that their business currently used a computer. Fifty
of the total replies received were completed by clerical staff, of whom 30 also indicated
that the firm used a computer.
What is the probability that any particular questionnaire returned was from a computer-
using business and was filled in by non-clerical staff?
A. 0.12
B. 0.40
C. 0.48
D. 0.60


Questions 9 and 10 are based on the following information


A sample of eight part-time workers was asked about their hourly rate of pay. The
rates were (in £ per hour):

3.10 3.45 2.95 3.17 3.40 3.90 3.50 3.80

9 What is the median?


A. 3.17
B. 3.28
C. 3.40
D. 3.425

10 What is the variance?


A. 0.11
B. 0.33
C. 11.73
D. 13.39

11 Data were collected on the number of episodes of a 10-part serial that were seen by a
sample of 14 viewers:

0 0 0 1 1 2 2 8 9 9 10 10 10 10

Which is the most appropriate measure of location?


A. Mean = 5.14
B. Median = 5.00
C. Mode = 10.00
D. None of these.

12 Which of the following statements about variable sampling fractions is true?


A. Measurements on one part of the sample are taken extra carefully.
B. A section of the population is deliberately over-represented in the sample.
C. The size of the sample is varied according to the type of items so far selected.
D. The results from the sample are weighted so that the sample is more repre-
sentative of the population.

13 A random sample of 36 employees is taken to estimate the average hourly rate of pay
with an accuracy of £0.25 per hour (with 95 per cent probability of being correct).
What would the accuracy have been if a sample size of 400 had been used?
A. £0.004 per hour
B. £0.045 per hour
C. £0.075 per hour
D. £0.150 per hour


14 A machine produces chair legs. The lengths of the legs are normally distributed with a
mean of 25 cm and a standard deviation of 0.6 cm. Legs that are longer than 25.8 cm or
shorter than 24.4 cm have to be discarded. The machine produces 450 legs per shift.
How many legs per shift have to be discarded?
A. 0
B. 83
C. 113
D. 337

15 A management team of four is to be chosen to oversee the specification and


implementation of a new management information system. There are nine suitable
candidates. How many ways are there of choosing the team?
A. 24
B. 120
C. 126
D. 362 880

16 A company is launching a new magazine. The initial advertising over the first three
months achieves its target that 45 per cent of the population will have heard of it. What
is the probability that in a group of six people only two people have heard of the new
magazine?
A. 0.018
B. 0.186
C. 0.278
D. 0.333

17 What is a type 2 error in a significance test?


A. Probability of correctly accepting the null hypothesis.
B. Probability of incorrectly accepting the null hypothesis.
C. Probability of correctly accepting the alternative hypothesis.
D. Probability of incorrectly accepting the alternative hypothesis.

18 A retailer is weighing strawberries to sell as 250 g punnets. A customer has complained


that strawberries he had bought previously weighed under 250 g. The retailer decides to
check the weight of 36 punnets. He finds that the average weight is 248.5 g with
standard deviation 4.8 g.
In using a significance test to judge whether he is selling under-weight punnets, which of
the following conclusions is correct?
A. At the 5 per cent level he is selling under weight.
B. At the 5 per cent level he is not selling under weight.
C. At the 5 per cent level the test is inconclusive.
D. A significance test is inappropriate in this case.


19 A hospital casualty department in California received 24 patients with broken hips


during the last year. What is the probability that there will be more than four of these
cases in the next month?
A. 0.053
B. 0.090
C. 0.143
D. 0.188

20 A random sample of 12 is taken from a normal distribution. The variance of the sample
is 7. What are the 95 per cent confidence limits for the population variance?
A. 3.30 to 17.48
B. 3.51 to 20.18
C. 3.83 to 22.01
D. 3.91 to 16.83
Questions 21 and 22 refer to the following data.
A two-way analysis of variance has been carried out for four blocks and six treat-
ments. Calculations have shown that:
Total sum of squares = 444
Treatment sum of squares = 160
Block sum of squares = 120

21 What is the observed F value for the treatments?


A. 0.8
B. 1.1
C. 2.3
D. 2.9

22 What conclusion can you reach on the differences between blocks?


A. They are significantly different at the 5 per cent level of significance.
B. They are significantly different at the 1 per cent level of significance.
C. There is no significant difference.
D. Inconclusive, further information required.

23 The following data have been used in a regression analysis:

Month 1 2 3 4 5 6
Sales 12 15 16 18 20 23

What is the value of the intercept on the Sales axis?


A. −4.8
B. 8.3
C. 10.1
D. 12.7


24 The following ten residuals have been obtained from a regression analysis:

+1.2 −1.4 −1.6 −0.3 +1.1 +1.7 −0.8 +0.3 −0.9 −1.2

A runs test is carried out. What is the upper critical value for this test?
A. 2
B. 9
C. 10
D. 11

25 What is the value of 9^(−3/2)?


A. 0.037
B. 7.500
C. 27.000
D. −27.000

Section B: Case Studies

Case Study 1

1 A large hotel in a capital city offers a wide range of accommodation ranging from single
rooms to four-room suites. Facilities are also available for functions such as conferences,
receptions, dinners and dances. At some times of the year all accommodation and
facilities are fully booked, while at others the hotel may be less than half full. A firm of
consultants has been asked to advise how reliable forecasts of usage might be prepared.
Describe the factors that would need to be taken into account by the consultants and
the steps they would need to take.
(25 marks)

Case Study 2

1 A company is investigating the amount spent on maintenance of its PCs. It believes that
the amount spent is likely to be dependent on the usage of the machines. The following
information has been collected:

Weekly usage Annual maintenance


(hours) cost (£00s)
13 1.25
10 1.46
20 2.63
28 2.81
37 3.69
32 3.54

a. Draw a scatter diagram. (4 marks)


b. Find the equation for the linear regression of the annual maintenance cost on weekly
usage. (8 marks)
c. Calculate the correlation coefficient. (3 marks)
d. Write a short report on your findings. Indicate what further analysis would be
needed to test the validity of the regression model. (7 marks)
e. The company is investigating the cost of a maintenance contract to cover the six
PCs. What is the maximum amount that you would recommend the firm to pay for
such a contract?
What assumptions have you made in reaching this decision? (3 marks)

Case Study 3

1 A firm has three production lines for making crystal wine glasses. Daily records are kept
of the number of faulty glasses produced by each line, as shown in the table below.

Day Production line


A B C
1 5 6 3
2 6 7 3
3 4 8 4
4 4 6 5
5 5 7 3
6 4 6 4

a. Carry out a one-way analysis of variance to indicate whether there are differences in
the number of faulty glasses produced by the different assembly lines. (18 marks)
b. Write a short report on your findings, and indicate any further details of the
production process you would wish to investigate. (7 marks)

Case Study 4

1 A supermarket chain wants to increase its turnover by attracting new customers. It is


proposing to do this by a voucher scheme. As a pilot study, 10 supermarkets are
selected for the introduction of the vouchers. Their turnover is measured in the week
before the promotion and again in the week after the promotion. The results are
summarised in the table below. NB: the data for Store 5 were incorrect and have been
omitted.

Store Turnover (£000)


Week 1 Week 2
1 340 412
2 367 375
3 423 435
4 417 426

6 395 428
7 324 341
8 310 360
9 305 347
10 426 429

a. What precautions should the supermarket chain take to ensure the validity of the
pilot study? (4 marks)
b. Does the promotion make a significant difference to the weekly turnover of the
stores in the pilot study? (7 marks)
c. Does the promotion significantly increase the weekly turnover of the stores in the
study? (7 marks)
d. Write a short report on your findings. Comment on the implications for the other
supermarkets in the group. (7 marks)


Examination Answers

Practice Final Examination 1

Section A: Multiple Choice Questions


1 The correct answer is C. The question is equivalent to finding the straight line
linking the points (5,6) and (11,3) (i.e. finding m and c in the equation):
Demand = c + m × Price
The slope of the line is:
m = (3 − 6)/(11 − 5) = −3/6 = −0.5
The intercept is given by:
Demand = c − 0.5 × Price
Putting demand = 6 and price = 5, c = 8.5.
2 The correct answer is D.
Machine A: 2R + 3S = 27
Machine B: 5R + 4S = 50
Eliminate R from the equations by multiplying that for A by 5 and then subtracting
2 times that for B.
Machine A: 10R + 15S = 135
Machine B: 10R + 8S = 100
7S = 35
S = 5
Substituting in the original equation for A gives:
2R + 15 = 27
2R = 12
R = 6
3 The correct answer is A. B, C and D may be true statements in some situations, but
A is always true.
4 The correct answer is B. Range = Highest value − Lowest value
= 21 − 11
=10
5 The correct answer is B.

Data                  11  20  12  11  16  13  15  21  18  13   mean = 15
Deviations from mean  −4   5  −3  −4   1  −2   0   6   3  −2
Absolute deviations    4   5   3   4   1   2   0   6   3   2   mean = 3

6 The correct answer is C. C is correct because randomness means that statistical


theory can be used to measure accuracy through measures such as the standard
deviation. A is not correct because, while randomness helps to make a sample
representative, it does not guarantee it. B is not correct since, for example, methods
such as stratified sampling increase accuracy by restricting randomness. As in
opinion polls, randomness by itself may increase sampling costs, so D is not correct.
7 The correct answer is D. The distribution is observed because the histogram is
formed directly by measuring volumes rather than indirectly by making assumptions
about the process which produces the data.
8 The correct answer is A. The approximation is done simply so that calculating is
faster. None of B, C or D is correct.
9 The correct answer is A. If the appointments are set at 30-minute intervals, the
norm is that each appointment will take 30 minutes. However, there will be varia-
tions around this from patient to patient for many reasons, such as the nature of the
problem, whether the patient is well known to the consultant, outside interruptions,
etc. This is exactly the type of situation to which the normal distribution applies.
10 The correct answer is D. The point estimate of a population mean is the mean of
the sample.
11 The correct answer is C. The population standard deviation is 2.
The standard deviation of the mean score of samples of 16:
σ/√n = 2/√16 = 0.5
The 95 per cent confidence limits are therefore:
10 ± 2 × 0.5 = 9 to 11
12 The correct answer is C. The concept of the sampling distribution of the mean is
used in calculating the standard deviation of the mean score of samples of 16. The
variance sum theorem and analysis of variance have no direct relevance to the
calculations. The calculations do depend on the sampling distribution of the mean
being normally distributed, but this follows from the fact that the underlying
population is normal and the central limit theorem need not be called on.
13 The correct answer is B. The decision maker decides the significance level but 5 per
cent is the usual level. Significance levels do define significance, and they are the
boundary between acceptable and unacceptable risks, but they do not have to be 5
per cent. That is why A and C are not valid. Significance tests can be based on the
normal distribution, but they do not have to be, so D is not valid.
14 The correct answer is C. The sampling distribution of the mean has a mean of 8
(hypothesis), a standard deviation of 0.5 (3/√36) and is normally distributed (central
limit theorem). For a two-tailed test at the 5 per cent level the critical values are 8 ±
2 × 0.5 = 7 and 9. The sample result was 6.8; therefore, the hypothesis is rejected.
For a test at the 1 per cent level the critical values are 8 ± 3 × 0.5 = 6.5 and 9.5.
Therefore, the hypothesis is accepted at the 1 per cent level.
15 The correct answer is C. The value is found directly from the chi-squared table.
16 The correct answer is C. Analysis of variance works by measuring the ratios between
variances but is a significance test on sample means.


17 The correct answer is B. I describes what correlation does; II describes negative


correlation. III is untrue since regression measures association and not necessarily
causation.
18 The correct answer is C. R-bar-squared is the ratio between explained and total
variance with allowance for degrees of freedom. Therefore, C is correct.
19 The correct answer is A. The t values are the ratios between coefficients and
standard errors.
20 The correct answer is D. The decision to reject a variable from a regression equation
is based, statistically, on a significance test that uses t values. The critical value, for t,
at the 5 per cent level is approximately 2. Therefore, a variable with a t value less
than 2 should, statistically, be omitted.
21 The correct answer is B. The Delphi process requires that the participants
communicate only through the chairman to avoid factors such as rank and personal-
ity playing a part. Therefore, B is correct. A is incorrect because one of the purposes
of the Delphi method is that people at different levels of seniority can take part.
Participants are under no requirement to adjust their predictions at each iteration in
any direction or to adjust them at all, so C is incorrect. This being the case, a
consensus may not be reached. Therefore, D is incorrect.
22 The correct answer is B.

Week       1    2    3    4     5     6     7     8
Stock     13   15   14   10     8    11    12
Mov. av.   –    –   14   13  10.7   9.7  10.3
Forecast   –    –    –   14    13  10.7   9.7  10.3

23 The correct answer is B. α = 0.2

Week       1     2     3     4     5     6     7     8
Stock     13    15    14    10     8    11    12
ES        13  13.4  13.5  12.8  11.9  11.7  11.7
Forecast   –  13.0  13.4  13.5  12.8  11.9  11.7  11.7
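These smoothing calculations are easy to reproduce. The following minimal Python sketch (an illustration only, not part of the examination answer) prints both week-8 forecasts:

    stock = [13, 15, 14, 10, 8, 11, 12]            # weeks 1 to 7

    # Three-week moving averages; the forecast for week 8 is the last one.
    moving_avg = [sum(stock[i - 2:i + 1]) / 3 for i in range(2, len(stock))]
    print(round(moving_avg[-1], 1))                # 10.3

    # Exponential smoothing with alpha = 0.2, initialised at the first value.
    alpha, s = 0.2, stock[0]
    for x in stock[1:]:
        s = alpha * x + (1 - alpha) * s
    print(round(s, 1))                             # 11.7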

24 The correct answer is D. Because ice cream sales are to be forecast, a technique that
can handle seasonality is needed. This suggests D rather than B or C. A quantitative
rather than qualitative technique is suggested by the lengthy data record, ruling out
A.
25 The correct answer is D. The MAD calculates the average of the absolute values of
the forecasting errors (i.e. the average error size). Therefore, exponential smoothing,
having the lower MAD, is better on average. The MSE calculates the average of the
squared forecasting errors. Squaring the errors has the effect of giving a heavy
weighting to any large errors. Consequently, if exponential smoothing has a higher
MSE but a lower MAD, it is likely to have at least one more really large error.
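The two measures are simple averages. A minimal Python sketch, using purely hypothetical errors rather than any figures from the question, shows how a single large error inflates the MSE but not the MAD:

    errors = [2.0, -1.0, 0.5, -6.0]                # hypothetical forecast errors
    mad = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e ** 2 for e in errors) / len(errors)
    print(mad, mse)                                # 2.375 10.3125

The squared term −6.0 contributes 36 of the 41.25 in the MSE numerator, which is exactly the weighting effect described above.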


Section B: Case Studies

Case Study 1
1
(a) Two-stage cluster sampling. It is two-stage sampling because there are two levels
in the sampling: universities/colleges and bookshops. It is cluster sampling be-
cause at the second stage all bookshops near the universities/colleges are
included in the sample.
(5 marks)
(b) This method is preferable because it will be:
(i) cheaper. Only a limited number of locations need be visited since the sample
is restricted to ten universities/colleges. Simple random sampling could
mean visiting just one bookshop at many more locations.
(ii) easier. A list of universities/colleges from which to select is more likely to be
available than a list of bookshops near campuses.
(iii) more accurate. The area near a university or college is likely to contain all
types of bookshop (in terms of size, ownership, site, proximity to students,
etc.). This diversity will be reflected in the sample, whereas it might not
have been with simple random sampling.
(10 marks)
(c) Potential sources of bias:
(i) Not all bookshops used by students will be less than two miles from a cam-
pus and they will be excluded.
(ii) The book will be of interest to groups other than students: graduates, po-
tential employers, etc. The bookshops used by these groups will mainly be
excluded from consideration.
(iii) Two miles is an arbitrary radius. Large universities will be more spread out
and specialist student bookshops could be at greater distances from the
centre of a campus.
(5 marks)
(d) Possible improvements:
(i) There are known differences among universities, which might affect the
number of copies on display and which are ignored. Stratified sampling
would make the sample more accurate by recognising the difference be-
tween undergraduate and graduate schools, large and small schools.
(ii) More of the potential market would be included by having other strata, for
example, city centre bookshops.
(iii) Instead of a two-mile radius, the catchment areas of the universities should
be defined.
(5 marks)


Case Study 2
1
(a) First calculate the average of each quarter’s seasonal factors to see if their level
needs to be adjusted. Quarterly seasonal averages:

Q1 Q2 Q3 Q4
2012 – – 76 100
2013 113 109 73 104
2014 109 111 76 106
2015 110 110 72 104
2016 114 107 77 102
2017 114 108 76 108
Average 112 109 75 104

The average of these averages is exactly 100, so no adjustment is necessary.


Classical decomposition splits a time series into three elements:
Stage 1. Trend. For Q3, 2018, t = 27, and the trend factor is 2600 + 40 × 27
= 3680
Stage 2. Cycle. No cycle is found.
Stage 3. Seasonality. For the third quarter the seasonal factor is 75. The fore-
cast for Q3, 2018 is 3680 × 0.75 = 2760.
(15 marks)
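As a check on the arithmetic, a minimal Python sketch of the forecast, using the trend and seasonal figures derived above:

    trend = 2600 + 40 * 27      # trend value at t = 27 (Q3, 2018)
    seasonal = 0.75             # Q3 seasonal factor of 75, as a multiplier
    print(trend * seasonal)     # 2760.0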
(b) No, because data from the first quarter of 2018 will affect the forecasts very
little. First, the trend will be adjusted, but since the record covers 24 quarters the
effect will not be great. Second, the seasonal element will be unchanged since
first-quarter data cannot affect the third-quarter seasonal factor.
(5 marks)
(c) Other possible forecasting methods: the series is a seasonal one, therefore, a
method to handle seasonality is needed. Holt–Winters is a suitable time series
method. If the main factors affecting demand can be found (perhaps a variable
measuring economic activity, one measuring marketing expenditure, etc.) and
measured, regression analysis can be used.
(5 marks)

Case Study 3
1
(a) The procedure is to follow the five stages for the non-technical analysis of data.
Stage 1. Eliminate irrelevant data: in this case there are no superfluous data.
Stage 2. Consider the seven guidelines for communicating data and re-present the
table.
Three changes can be made. They are to:
(i) round the percentages to eliminate the decimal points.
(ii) put rows and columns in size order. This can be done by calculating row
and column averages and using these figures as the basis for ordering.


(iii) interchange rows and columns.


Table A3.3 is the product of these changes.

Table A3.3 Revised table


Attributes
Brand     Good mileage   Large company   Value for money   Engine life   With-it people   Average
Shell 85 60 40 19 11 43
Esso 61 68 38 19 10 39
BP 37 30 24 12 7 22
National 23 5 15 8 12 13
Jet 12 2 40 5 3 12
Fina 8 5 5 4 1 5
Average 38 28 27 11 7

Stage 3. Describe the overall pattern. The companies have approximately the
same ranking whatever the attribute: Shell scores highly for all attributes, Fina
lowly.
(10 marks)
(b) Stage 4. Investigate the exceptions to the pattern. There are three main excep-
tions. They are that:
(i) Esso scores more highly than expected for ‘large company’.
(ii) Jet scores more highly than expected for ‘value for money’.
(iii) National scores more highly than expected for ‘with-it people’.
(5 marks)
(c) All exceptions can be explained in terms of the brands’ advertising styles. Esso is
known as a large company internationally, Jet is known as a retailer of cheaply
acquired petrol and National has advertised to younger age groups. The conclu-
sion is that the pattern is a good fit.
(5 marks)
(d) Stage 5. Finally, additional information should be introduced. Is the response to
the questions proportional to market share or to the size of the company? Does
the pattern hold for other samples in other regions and at different times? It may
even hold for other products besides petrol.
The information can be used to monitor the effectiveness of advertising cam-
paigns. The impact of advertising does not depend on the level of response but
on the extent to which the response changes from the norm.
(5 marks)

Case Study 4
1
(a) The regression equation is:
mpg = 53.9 − 0.4 × speed − 0.8 × altitude − 1.5 × weight
When speed = 45, altitude = 200 and weight = 300,


mpg = 53.9 − 0.4 × 45 − 0.8 × 2 − 1.5 × 3 (altitude and weight enter in hundreds)
    = 29.8
(5 marks)

Trip Residual
1 2.1 = 33 − (53.9 − 0.4 × 50 − 0.8 × 0 − 1.5 × 2)
2 3.4 = 36 − (53.9 − 0.4 × 40 − 0.8 × 1 − 1.5 × 3)
3 0.5 = 31 − (53.9 − 0.4 × 45 − 0.8 × 3 − 1.5 × 2)
4 1.3 = 24 − (53.9 − 0.4 × 55 − 0.8 × 4 − 1.5 × 4)
5 −0.6 = 34 − (53.9 − 0.4 × 35 − 0.8 × 1 − 1.5 × 3)
6 1.1 = 27 − (53.9 − 0.4 × 45 − 0.8 × 5 − 1.5 × 4)
7 −2.0 = 20 − (53.9 − 0.4 × 55 − 0.8 × 3 − 1.5 × 5)
8 −1.9 = 35 − (53.9 − 0.4 × 35 − 0.8 × 0 − 1.5 × 2)
9 −0.6 = 32 − (53.9 − 0.4 × 40 − 0.8 × 1 − 1.5 × 3)
10 −0.8 = 27 − (53.9 − 0.4 × 50 − 0.8 × 2 − 1.5 × 3)
11 0.3 = 23 − (53.9 − 0.4 × 55 − 0.8 × 4 − 1.5 × 4)
12 −3.3 = 28 − (53.9 − 0.4 × 45 − 0.8 × 2 − 1.5 × 2)

(b) There is a pattern in the residuals in that the negative residuals are in the latter
trips. However, this is not linked to particular values of the independent varia-
bles. If the trips are numbered in time order (i.e. trip 1 was the first and so on),
the pattern could mean that fuel consumption worsened as the car became older.
(5 marks)
(c) To see if variables should be excluded, the t values have to be calculated:

Variable Coefficient Std error t value


mph −0.4 0.12 −3.3
altitude −0.8 0.58 −1.3
weight −1.5 0.92 −1.6

Speed has a t value that is less than −2 and is rightfully included. There is a
doubt about altitude and weight since their t values are closer to zero. However,
other evidence (on the working of internal combustion engines) suggests that
they should be retained.
(5 marks)
(d) Approximately, the accuracy of the forecast (95 per cent confidence limits) in
part (a) is given by 2 × residual standard error (i.e. ±4.4).
The standard error of residuals reflects error arising from the residuals only.
SE(Predicted) reflects error from the estimate of the regression line as well as the
residuals.
(5 marks)
(e) Improvements to the regression:
(i) Include more data: 12 observations are too few when there are three inde-
pendent variables.


(ii) Ensure that the style of driving (or the driver) does not change from trip to
trip.
(iii) Ensure the terrain is approximately the same for each trip (i.e. the same
mixture of urban and rural).
(iv) Include a variable for wet or dry conditions.
(5 marks)

Practice Final Examination 2

Section A: Multiple Choice Questions


1 The correct answer is B. When one flawed glass is selected, there remain 11, of
which two are flawed. The probability that the next glass selected will be flawed is
therefore 2/11 (i.e. 0.18).
2 The correct answer is D.
No. of days with 40 ≤ x ≤ 59 = 24 + 22 = 46
P(40 ≤ x ≤ 59) = 46/78 = 0.59
3 The correct answer is B. See the table below.
Sales      No. days   Cum. freq.
< 40          10          10
40–49         24          34
50–59         22          56
60–69         19          75
70 +           3          78
Total         78

56% of all days = 0.56 × 78 = 43.7 days


Sales are less than £50 000 on 34 days; sales of £50 000 are exceeded on 78 − 34 =
44 days.
4 The correct answer is A. The mean is 502 g and the standard deviation is 5.477
(√30), so 95 per cent of loaves will fall within the range 502 ± (2 × 5.477) (i.e. 491
to 513).
5 The correct answer is B.

November course Attending 35 Satisfactory 54% = 18.9 (i.e. 19)


February course Attending 45 Satisfactory 40% = 18.0 (i.e. 18)

Over both courses, 37 out of 80 (46.2 per cent) found the course satisfactory.


6 The correct answer is A.


(2y + 5)/(3 − 4y) = 7
2y + 5 = 7(3 − 4y)
2y + 5 = 21 − 28y
30y = 16
y = 0.53
7 The correct answer is C.

2x + 5y = 18 (1)
3x − 4y = 4 (2)
(1) × 3:
6x + 15y = 54 (3)
(2) × 2:
6x − 8y = 8 (4)
(3) − (4):
23y = 46
y = 2

In (1):
2x + 10 = 18
2x = 8
x = 4
Check in (2):
LHS: 3x − 4y = 12 − 8 = 4
RHS: 4
Solution: x = 4; y = 2
8 The correct answer is C. 200 replies were received from non-clerical staff. Of these,
120 (150 − 30) indicated that their firm used a computer.
Therefore, 120/250 = 48 per cent of the replies were filled in by non-clerical staff
from computer-using firms.
9 The correct answer is D.
Rates in order: 2.95 3.10 3.17 3.40 3.45 3.50 3.80 3.90
Median = (3.40 + 3.45)/2 = 6.85/2 = 3.425
10 The correct answer is A.

x       (x − x̄)   (x − x̄)²
2.95    −0.459     0.211
3.10    −0.309     0.095
3.17    −0.239     0.057
3.40    −0.009     0.000
3.45     0.041     0.002
3.50     0.091     0.008
3.80     0.391     0.153
3.90     0.491     0.241

∑x = 27.27
∑(x − x̄)² = 0.767
x̄ = 3.409
Variance = ∑(x − x̄)²/(n − 1) = 0.767/7 = 0.11
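A minimal Python sketch of the same calculation; the standard library's statistics.variance applies the same n − 1 divisor:

    import statistics

    rates = [3.10, 3.45, 2.95, 3.17, 3.40, 3.90, 3.50, 3.80]
    mean = sum(rates) / len(rates)
    variance = sum((x - mean) ** 2 for x in rates) / (len(rates) - 1)
    print(round(variance, 2))                      # 0.11
    print(round(statistics.variance(rates), 2))    # 0.11 (same result)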

11 The correct answer is C. It is a U-shaped distribution: the mean and the median fall in the middle of the range, where there are few observations, so the mode is the least misleading measure of location here.


12 The correct answer is B. A section of the population is deliberately over-
represented.
13 The correct answer is C. Accuracy is proportional to the square root of the sample
size. The ratio of the accuracy of the two samples is therefore in the ratio 6(√36) to
20(√400) (i.e. 0.3).
Therefore, the accuracy of the second sample is £0.25 × 0.3 = £0.075.
14 The correct answer is C. See Figure A3.1.
The proportion of legs discarded is Area A + Area B

[Figure A3.1 Production of chair legs: a normal curve with mean 25; the discarded tails lie below 24.4 and above 25.8.]

z1 = (25.8 − 25)/0.6 = 0.8/0.6 = 1.33 and Area A = 0.5 − 0.4082 = 0.0918
z2 = (25 − 24.4)/0.6 = 0.6/0.6 = 1.00 and Area B = 0.5 − 0.3413 = 0.1587
Total proportion discarded = 0.2505
Total number discarded = 0.2505 × 450 = 113
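A minimal Python sketch, building the normal CDF from math.erf, confirms the working; the exact figure is about 112.4 rather than 113 because the tables round z to 1.33:

    from math import erf, sqrt

    def norm_cdf(x, mu, sigma):
        # Cumulative probability of a normal distribution via the error function.
        return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

    p_discard = norm_cdf(24.4, 25, 0.6) + (1 - norm_cdf(25.8, 25, 0.6))
    print(round(p_discard * 450, 1))    # about 112.4; rounded table values give 113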
15 The correct answer is C. There are nine candidates and a team of four is to be
selected. The answer to the question is 9C4:
9C4 = 9!/(4! × 5!) = (9 × 8 × 7 × 6 × 5!)/(4 × 3 × 2 × 1 × 5!) = 126


16 The correct answer is C. The binomial distribution is used with n = 6, r = 2 and p =


0.45.
From binomial table, P[only 2] = 0.278
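Answers 15 and 16 can both be checked with math.comb (a sketch assuming Python 3.8 or later, where math.comb was introduced):

    from math import comb

    print(comb(9, 4))                  # 126 ways of choosing the team (Answer 15)

    n, r, p = 6, 2, 0.45               # binomial parameters for Answer 16
    print(round(comb(n, r) * p ** r * (1 - p) ** (n - r), 3))    # 0.278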
17 The correct answer is B. A type 2 error in a significance test is the probability of
incorrectly accepting the null hypothesis.
18 The correct answer is A. The hypothesis is that the average weight is 250 g. The
significance test is one-tailed since we are only interested in the true average being
lower.
n = 36; s = 4.8; x̄ = 248.5
z = (x̄ − 250)/(s/√n)
  = (248.5 − 250)/(4.8/√36)
  = (−1.5/4.8) × 6
  = −1.875
The critical value for this one-tailed test at the 5 per cent level is z = −1.645.
The hypothesis is rejected at the 5 per cent level. It seems that the retailer is selling
under-weight punnets.
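A minimal Python sketch of the test statistic:

    from math import sqrt

    n, s, xbar, mu = 36, 4.8, 248.5, 250
    z = (xbar - mu) / (s / sqrt(n))
    print(z)    # -1.875, beyond the one-tailed critical value of -1.645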
19 The correct answer is A. Use the Poisson distribution.
n = 24/year
λ = No. cases/month = 24/12 = 2
From tables:

P(0 cases) = 0.1353
P(1 case) = 0.2707
P(2 cases) = 0.2707
P(3 cases) = 0.1804
P(4 cases) = 0.0902
P(4 or fewer cases) = 0.9473
P(> 4 cases/month) = 1 − 0.9473 = 0.053
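Assuming SciPy is available, the tail probability can be checked directly:

    from scipy.stats import poisson

    # P(more than 4 cases in a month) with a mean of 2 cases per month.
    print(round(1 - poisson.cdf(4, mu=2), 3))    # 0.053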


20 The correct answer is B. Use the chi-squared distribution with n = 12 and s² = 7.
With 11 degrees of freedom the table values are χ² = 21.92 (upper 2.5 per cent
point) and χ² = 3.816 (lower 2.5 per cent point):
(n − 1)s²/21.92 < Pop. variance < (n − 1)s²/3.816
(11 × 7)/21.92 < Pop. variance < (11 × 7)/3.816
3.51 < Pop. variance < 20.18
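Assuming SciPy is available, the chi-squared percentage points, and hence the limits, can be reproduced:

    from scipy.stats import chi2

    n, s2 = 12, 7
    lower = (n - 1) * s2 / chi2.ppf(0.975, df=n - 1)   # divide by the upper point
    upper = (n - 1) * s2 / chi2.ppf(0.025, df=n - 1)   # divide by the lower point
    print(round(lower, 2), round(upper, 2))            # 3.51 20.18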
21 The correct answer is D.

Source d.f. SS MS F
Treatments 5 160 32 2.94
Blocks 3 120 40 3.67
Error 15 164 10.9
Total 23 444

A3/30 Edinburgh Business School Quantitative Methods


Appendix 3 / Practice Final Examinations

Treatment F = 2.94
22 The correct answer is A. Critical values:
treatments = 2.90 at 5% with (5,15) d. f.
= 4.56 at 1%
blocks = 3.29 at 5% with (3,15) d. f.
= 5.42 at 1%
They are significantly different at the 5 per cent level of significance.
23 The correct answer is C.

Month (x)   Sales (y)     xy      x²
    1           12        12       1
    2           15        30       4
    3           16        48       9
    4           18        72      16
    5           20       100      25
    6           23       138      36
∑x = 21     ∑y = 104   ∑xy = 400   ∑x² = 91
x̄ = 3.5     ȳ = 17.33

b = (∑xy − n x̄ ȳ)/(∑x² − n x̄²)
  = (400 − 6 × 3.5 × 17.33)/(91 − 6 × 3.5²)
  = (400 − 363.93)/(91 − 73.5)
  = 36.07/17.5
  = 2.061
a = ȳ − b x̄
  = 17.33 − 2.061 × 3.5
  = 17.33 − 7.2135
  = 10.12
Equation: Sales = 10.12 + 2.06 × Month
Intercept ≈ 10.1
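A minimal Python sketch of the least-squares formulae used above:

    months = [1, 2, 3, 4, 5, 6]
    sales = [12, 15, 16, 18, 20, 23]
    n = len(months)
    xbar, ybar = sum(months) / n, sum(sales) / n
    sxy = sum(x * y for x, y in zip(months, sales)) - n * xbar * ybar
    sxx = sum(x * x for x in months) - n * xbar ** 2
    b = sxy / sxx
    a = ybar - b * xbar
    print(round(a, 1), round(b, 2))    # 10.1 2.06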
24 The correct answer is B.
n1 = number of positive residuals = 4
n2 = number of negative residuals = 6
Lower critical value = 2
Upper critical value = 9
25 The correct answer is A.

9^(−3/2) = 1/9^(3/2)
         = 1/(√9)³
         = 1/3³
         = 1/27
         = 0.037

Section B: Case Studies

Case Study 1
1 Use the nine guidelines from the text, integrating them with the practical details of
the case.
Step 1. The case does not make clear what the purpose of the forecasts is. It could
be to plan human resource requirements or to set a seasonal pricing schedule or to
plan special offers. There are several other possibilities. This must be clarified at the
outset. In this context the current decision-making process – timing, decision
makers, systems – should be analysed. It may be necessary to revise the decision-
making process. It would of course be important to know the hotel’s business
strategy.
Step 2. In the light of Step 1, a detailed list of the forecasts required should be made
– the variables (room occupancy, restaurant usage, conference usage, etc.), their
levels of accuracy, their time horizon (just the season ahead or more), their timing
(how long before the season they are needed), to whom they will be distributed.
Step 3. A conceptual model would probably relate hotel usage to factors such as
pricing, promotional spend, local rival capacity, marketing consortia, economic
factors (exchange rates, economic activity, etc.), seasonal factors and citations in
guidebooks. The model would segment the market.
Step 4. A wealth of historical data is probably available: occupancy rates for
different facilities at different levels of detail, promotional spend, prices, economic
data including regional activity. Data on demand, as opposed to occupancy, may not
be available – the hotel is unlikely to keep records of customers turned away
because it is full. Estimates may have to be made.
Step 5. More than one technique should be used, choosing from judgement
(Delphi, scenarios) and causal modelling.
Step 6. The independent test of accuracy would be used to show how well the
techniques would have worked if used last season. Measures of accuracy would be
the MSE and MAD.
Step 7. Judgement would come into play when unusual or unforeseeable events
occurred, for example, exceptional weather or especially good or bad reviews.
Judgement has to be used in such circumstances, and at the last minute. It would be
important to record any changes made.
Step 8. All decision makers and stakeholders would have to be involved in any new
decision-making system or any new provision of forecast information. A system

(problems/solutions/actions) for handling contingencies would need to be agreed in
advance.
Step 9. In this situation, where a) the use of a forecasting system is likely to be new
and b) judgement plays a significant role, forecasts will need to be tracked, success
and failure identified and lessons learned for the next and future seasons.
In this question, for which answers are subjective and for which there is no one
right answer, a strict marking scheme is not possible. However, high marks would
be awarded for using all guidelines and illustrating them with details and insights
relating to the case.

Case Study 2
1
(a) Scatter diagram
[Scatter diagram: annual maintenance cost (£00s), 0.5 to 4.0, on the vertical axis against weekly usage (hours), 0 to 40, on the horizontal axis; the six points lie close to a rising straight line.]

(b) Using the formula/data table in the text, the coefficients of the linear regression
can be calculated:
b = 0.092, a = 0.41
And so the linear regression equation is:
y = 0.41 + 0.092x (y = annual maintenance cost in £00s, x = weekly usage in hours)
(c) Using the formula/data table in the text, the correlation coefficient of the
regression can be calculated:
r = 0.96
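Parts (b) and (c) can be verified with the same least-squares formulae in Python (a sketch; note that full-precision arithmetic gives an intercept nearer 0.43, the 0.41 above arising from rounded intermediate values):

    usage = [13, 10, 20, 28, 37, 32]
    cost = [1.25, 1.46, 2.63, 2.81, 3.69, 3.54]
    n = len(usage)
    xbar, ybar = sum(usage) / n, sum(cost) / n
    sxy = sum(x * y for x, y in zip(usage, cost)) - n * xbar * ybar
    sxx = sum(x * x for x in usage) - n * xbar ** 2
    syy = sum(y * y for y in cost) - n * ybar ** 2
    b = sxy / sxx
    a = ybar - b * xbar
    r = sxy / (sxx * syy) ** 0.5
    print(round(b, 3), round(a, 2), round(r, 2))    # 0.092 0.43 0.96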
(d) The regression has a close fit, as suggested by the high correlation coefficient,
although this should be checked for significance. Accuracy is likely to be good
but there are a number of reservations:
– the findings are valid only within the range of the data;
– the sample of six machines is small, so a larger sample is needed.
As a further test, the residuals should be checked for randomness, either by plot-
ting residuals against fitted values or by conducting a runs test.
(e) An assumption has to be made on future usage. Assuming it is slightly higher at,
say, 25 hours then from the regression equation:
Cost = 0.41 + (0.092 × 25) = 2.71
Annual cost for 6 PCs = 2.71 × 6 = 16.26 (i.e. £1626)


This is the projected internal cost of maintenance and therefore the highest that
should be paid for an outside contract. This assumes that there is no change in
cost because of changes in the average ages of the PCs – through the passage of
time or through purchasing new PCs.

Case Study 3
1

       A   (x − 5)     B   (x − 5)     C   (x − 5)
       5      0        6      1        3     −2
       6      1        7      2        3     −2
       4     −1        8      3        4     −1
       4     −1        6      1        5      0
       5      0        7      2        3     −2
       4     −1        6      1        4     −1
Total 28              40              22
n      6               6               6
Mean  4.67            6.67            3.67

Overall mean = 90/18 = 5.0


Total sum of squares = TSS
=0+1+1+1+0+1+1+4+9+1+4+1+4+4+1+0+4+1
= 38
Sum of squares between treatments = SST
= 6 × [4.67 − 5.0]² + 6 × [6.67 − 5.0]² + 6 × [3.67 − 5.0]²
= 28
Error sum of squares = TSS − SST
= 38 − 28
= 10
Alternatively,
Error sum of squares = 0.33² + 1.33² + 0.67² + 0.67² + 0.33² + 0.67²
                     + 0.67² + 0.33² + 1.33² + 0.67² + 0.33² + 0.67²
                     + 0.67² + 0.67² + 0.33² + 1.33² + 0.67² + 0.33²
                     = 10.0

ANOVA table
Source DF SS MSE F ratio
Treatments 2 28.00 14.00 20.89
Error 15 10.00 0.67
Total 17 38.00


(a) From tables with 2,15 degrees of freedom:


P(F > 3.68) = 0.05
P(F > 6.36) = 0.01
Hypothesis: There is no difference between average number of faults for the
assembly lines.
There is a significant difference, at the 1 per cent level of significance, between
the number of faulty glasses produced by the different assembly lines.
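Assuming SciPy is available, the ANOVA can be checked in a few lines; the exact F of 21.0 differs slightly from the 20.89 above because the hand calculation rounds the treatment means:

    from scipy.stats import f_oneway

    line_a = [5, 6, 4, 4, 5, 4]
    line_b = [6, 7, 8, 6, 7, 6]
    line_c = [3, 3, 4, 5, 3, 4]
    f_stat, p_value = f_oneway(line_a, line_b, line_c)
    print(round(f_stat, 2), p_value < 0.01)    # 21.0 True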
(b) The findings show a significant difference, at the 1 per cent level of significance,
between the number of faulty glasses produced by assembly lines A, B and C.
Should investigate:
– Effect of day (use two-way ANOVA).
– Staffing and experience.
– Foreman.
– Breakdown of machinery, etc.
– Use of new trainees.
– Use of a longer time period.
– Criteria used to identify faulty glasses.

Case Study 4
1

Store Turnover (£000) (d)


Week 1 Week 2 Differences
1 340 412 72
2 367 375 8
3 423 435 12
4 417 426 9
6 395 428 33
7 324 341 17
8 310 360 50
9 305 347 42
10 426 429 3
Sum d = 246

Data for Store 5 were missing. Test based therefore on total of nine stores.
(a) Validity of pilot study:
– Random selection of sample.
– Choice of weeks − avoid festivals, etc.
– Size of stores selected.
– Geographical locations.
– Effect of local promotions.
– Number of stores chosen.


(b) Effect of promotion on data:


(i) The hypothesis is that the promotion has not changed turnover.
(ii) The data used are the differences for each store (i.e. paired data). The
hypothesis then becomes μd = 0.
(iii) The significance level is taken to be 1 per cent (it could have been 5 per
cent).
(iv) The test is a two-tailed paired t test (two-tailed because we are testing for a
change in turnover, not an increase; and a t test because the sample size is
less than 30).
n = 9
d̄ = ∑d/n = 246/9 = 27.33
S = std dev. of d = √(∑(d − d̄)²/(n − 1)) = 23.46
Observed t = d̄/(S/√n)
           = 27.33/(23.46/√9)
           = 3.49 with (n − 1 = 8) degrees of freedom
From t tables, critical value at the 1 per cent level = 3.355. This is the value that
leaves 0.005 in each tail.
As our calculated value of t is greater than 3.355, we reject the null hypothesis at
the 1 per cent level of significance.
There is a significant difference in the turnover of Week 2 as compared to Week 1, at
the 1 per cent level of significance.
(c) To test for a significant increase in turnover, use a one-tailed test with the
hypothesis that there is no increase.
Calculation of the observed t is as before (i.e. t = 3.49) with 8 degrees of free-
dom.
From tables (with 8 d.f.), the critical value of t for a one-tailed test is 2.896. This
is the value that leaves 0.01 in one tail.
As our calculated value of t is greater than 2.896, we reject the null hypothesis at
the 1 per cent level of significance.
There is a significant increase in the turnover of Week 2 as compared to Week 1, at
the 1 per cent level of significance.
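Assuming SciPy is available, the paired test can be reproduced directly from the store data:

    from scipy.stats import ttest_rel

    week1 = [340, 367, 423, 417, 395, 324, 310, 305, 426]
    week2 = [412, 375, 435, 426, 428, 341, 360, 347, 429]
    t_stat, p_two_tailed = ttest_rel(week2, week1)
    print(round(t_stat, 2))           # 3.5 (3.49 above, from rounded working)
    print(p_two_tailed < 0.01)        # True: two-tailed test significant at 1%
    print(p_two_tailed / 2 < 0.01)    # True: one-tailed test significant at 1%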
(d) The findings are, as in (b) and (c), a significant difference and a significant
increase in Week 2 as compared to Week 1, both at the 1 per cent level of
significance.
It seems likely that the introduction of the voucher scheme may produce a simi-
lar pattern for the turnover in other stores in the group. This will of course
depend on how representative the stores used in the pilot study were. Also, a
trial period of one week is not very long; the turnover should be monitored for a
longer period. Moreover, the cost of the promotion and the effect on turnover
should be evaluated.



Appendix 4

Answers to Review Questions


Contents
Module 1 .............................................................................................................4/1
Module 2 .............................................................................................................4/6
Module 3 .......................................................................................................... 4/12
Module 4 .......................................................................................................... 4/18
Module 5 .......................................................................................................... 4/23
Module 6 .......................................................................................................... 4/31
Module 7 .......................................................................................................... 4/37
Module 8 .......................................................................................................... 4/45
Module 9 .......................................................................................................... 4/53
Module 10 ........................................................................................................ 4/59
Module 11 ........................................................................................................ 4/66
Module 12 ........................................................................................................ 4/72
Module 13 ........................................................................................................ 4/79
Module 14 ........................................................................................................ 4/85
Module 15 ........................................................................................................ 4/96

Module 1

Review Questions
1.1 The correct answer is True. A large part of statistics is concerned with analysing
sample information and then generalising the results to the population. However
well chosen, a sample may be non-representative, in which case any conclusions
may be incorrect. Statistical theory enables one to calculate a degree of belief (i.e. a
probability of being right) for the conclusion.
1.2 The correct answer is D. After the first card has been drawn and not replaced, there
are 51 cards left in the pack, of which three are aces.
P(second ace) = 3/51 = 1/17
1.3 The correct answer is A, C.
1.4 The correct answer is B. If a coin is unbiased:
P(tails) = 1/2


Earlier tosses of the coin are unconnected with the ninth trial and therefore cannot
affect the probability. However, after eight consecutive ‘heads’ one might begin to
doubt that the coin was unbiased.
1.5 The correct answer is C. Not less than £50 000 means £50 000 or more (i.e. the two
highest categories). Total frequency for the two categories is 17 + 6 = 23.
1.6 The correct answer is A. Frequencies can be turned into probabilities. For sales of
£60 000 or more, the frequency is 6, which is equivalent to a probability of:
6/78 = 1/13
1.7 The correct answer is B. Ninety per cent of days is 9/10 × 78 = 70 days,
approximately, which is covered by exactly the top four categories (categories 2–5).
The level exceeded on 90 per cent of days must thus be the boundary between the
first and second categories (i.e. £30 000).
1.8 The correct answer is B. There are many distributions of which the normal is one
that is much used. It has a smooth shape, is continuous and is always symmetrical.
Its parameters do affect its shape in some ways, but not its symmetry.
1.9 The correct answer is D. Sixty-eight per cent of the readings are within ± 1 standard
deviation of the mean (i.e. between 50 and 70). Since the distribution is symmetrical
about the mean, half of this percentage must be between 60 and 70.
1.10 The correct answer is B. Ninety-five per cent of motorists’ speeds are in the range
mean ± 2 standard deviations, 82 ± 22. Outside this range are 5 per cent of motor-
ists. Because of the symmetry of the normal distribution, 2.5 per cent must be less
than the range, 2.5 per cent more than the range (see Figure A4.1). Therefore, 2.5 per
cent of motorists have a speed less than 60 km/h. Consequently, 97.5 per cent must
have a speed greater than 60 km/h.

[Figure A4.1 Motorists' speeds: a normal curve centred on 82 km/h, with 95 per cent of the area between 60 and 104 km/h and 2.5 per cent in each tail.]

Case Study 1.1: Airline Ticketing


1

Class (minutes) Frequency


0 but less than 1 14
1 but less than 2 24
2 but less than 3 20
3 but less than 4 15

4 but less than 5 9
5 but less than 6 6
6 but less than 7 5
7 but less than 8 3
8 but less than 9 2
9 or more 2
Total 100

Note that the classes are mutually exclusive. Each reading goes in one class, and
only one class. If the class intervals were 0–1, 1–2, etc., there would be confusion
when it came to classifying 1.0. Note also that the limits of each class are whole
numbers of minutes, which are easier to work with than classes of, say, 1.5–3.5, etc.
The frequency histogram derived from this data is given in Figure A4.2.

[Figure A4.2 Frequency histogram: frequency on the vertical axis against service time in minutes (0–10) on the horizontal axis, plotting the class frequencies above.]


The 10 per cent of customers with the highest service times come from the follow-
ing classes:

9 or more 2%
8–9 2%
7–8 3%
6–7 3%
Total 10%

The total frequency associated with the 6–7 class is 5. Approximately, it can be
estimated that 3 are in the range 6.4–7.0, and 2 in the range 6.0–6.4 (dividing the
range in the ratio 3:2). Therefore, the service time exceeded by only 10 per cent of
the customers is 6.4 minutes.
This last part of the question could have been answered more exactly by going back
to the original data. The five readings in this class are 6.1, 6.3, 6.5, 6.6, 6.9. This
would lead to the conclusion that the time exceeded by only 10 per cent of custom-
ers is 6.3 minutes.


Marking Scheme (out of 10) Marks


Classification of data 4
– Reduce by 1 mark if classes are not mutually exclusive
– Reduce by 1 mark if class limits are not whole numbers
Histogram 4
Time exceeded by 10 per cent of customers (by either method of 2
calculation)
Total 10

Case Study 1.2: JP Carruthers Co.


1 The true offer is higher than 9.8 per cent. The list of questions in Section 1.7.4
points to the mistake that has been made. The mistake is a technical one (Section
1.7.4(f)), but of a very simple nature. The calculation of average bonus units is
incorrect. JPC does not recognise that workers producing fewer than 61 units do
not have to pay back a bonus to the company.
For example, suppose there were just three workers who produced the following
quantities per day:
91, 71 and 51: Average = 71
If the level for bonuses is 61, then, by JPC’s calculations, they will have to pay an
average of 10 bonus units per person = 30 in total. In fact they would have to pay
for bonus units as follows:
The worker producing 91 is paid for 91 − 61 = 30 units.
The worker producing 71 is paid for 71 − 61 = 10 units.
The worker producing 51 is paid for 0 units (not −10).
JPC pays a total of 40 bonus units, not 30.

Employee no.   Average daily output (units)   Bonus units
11 98 37
13 89 28
17 76 15
23 43 0
24 50 0
26 78 17
30 79 18
32 52 0
34 96 35
35 87 26
40 69 8

42 66 5
43 98 37
45 88 27
47 53 0
48 41 0
52 44 0
54 46 0
55 97 36
59 70 9
Total bonus units = 298

JPC is likely to pay for 298 bonus units/day, not 200 (= 20 × 10) as it believes.
Under JPC’s calculations the employees receiving 0 bonus units would actually have
to pay negative bonuses to the company.
Current cost per day = 71 × 0.72 × 20 = £1022.40
New cost per day = 71 × 0.72 × 20 + 298 × 0.50 = £1171.40
True offer = (1171.40 − 1022.40)/1022.40
           = 14.6%
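The corrected bonus count is easily reproduced; a minimal Python sketch using the outputs in the table above:

    outputs = [98, 89, 76, 43, 50, 78, 79, 52, 96, 87,
               69, 66, 98, 88, 53, 41, 44, 46, 97, 70]
    # Workers below the 61-unit threshold earn zero bonus units, not negative ones.
    bonus_units = sum(max(0, q - 61) for q in outputs)
    print(bonus_units)                              # 298
    new_cost = 71 * 0.72 * 20 + bonus_units * 0.50
    print(round(new_cost, 2))                       # 1171.4
    print(round((new_cost - 1022.40) / 1022.40 * 100, 1))    # 14.6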

Marking Scheme (out of 10) Marks


Understanding JPC’s error 5
Calculating true offer 5
Total 10

Case Study 1.3: Newspaper Letters


1 Both writers make mistakes. Following the list of error-spotting questions, it can be
seen that:
(a) The posts of both writers might tend to bias their views (Section 1.7.1).
(b) There is a sample bias (Section 1.7.4(c)).
(c) There are omissions (Section 1.7.4(d)).
(d) There are logical errors (Section 1.7.4(e)).
Dr X draws his conclusion:
(a) from a sample of one, and
(b) without producing any comparative data.
To make a case he would need to estimate the frequency of deaths under anaesthet-
ics in dental surgeries (from a larger sample) and show that this frequency was
higher than the equivalent frequency of deaths under anaesthetics in other locations.


Mr Y’s figures are open to doubt for several reasons:


(a) He does not compare like with like. He compares two groups. The first was
given an anaesthetic by a medically qualified anaesthetist. This group is 36 per
cent of the population of people given anaesthetics and had a disproportionate
share of deaths, 45 per cent. The second group was given an anaesthetic by a
dentist. Presumably it constituted 64 per cent of the population but had only 55
per cent of the deaths. It appears that dentists are safer than anaesthetists. How-
ever, the two groups are not alike. It is probable that, at least in some cases, the
qualified anaesthetist would be present either because the operation was a diffi-
cult one or because the general health of the patient put him or her at risk. If the
first group is inherently more at risk than the second, it is wrong to attribute the
difference in statistics to the qualification or otherwise of the anaesthetist.
(b) At best, the mention of the hospital service is a red herring. One cannot judge
the fact that nearly 50 per cent of the deaths occur in hospitals without knowing
more about the number of anaesthetics given in hospital.
(c) The data appear to deal with the number of deaths without taking into account
the cause of death. Especially in the hospital service, many of the deaths are
likely to have been caused by factors totally unconnected with the anaesthetic.

Marking Scheme (out of 10) Marks


For commenting
– that Dr X uses a sample of one and has no comparative data 2
– that Mr Y does not compare like with like 4
– that the hospital service data are meaningless 2
– that the data do not take into account the cause of death 2
Total 10

Module 2

Review Questions
2.1 The correct answer is A.
2.2 The correct answer is B. The line has intercept = 1 (when x = 0, y = 1), therefore C
and D must be wrong. The slope of the line is negative, therefore A must be wrong.
Since the line passes through (0,1) and (1,0), the slope is −1 (i.e. m = −1). The line
is y = 1 − x.
2.3 The correct answer is A. B can be rejected immediately since it is a straight line and
the equation is not linear (it has an x2 in it).
Plot a selection of points:
x = −1, y = 11
x = 0, y = 4
x = 1, y = −1


x = 2, y = −4
x = 3, y = −5
x = 4, y = −4
x = 5, y = −1
The curve starts with a high value of y at x = −1. As x increases y, decreases and is a
minimum when x = 3. After this point, y increases again. It is A rather than C that
has this shape.
2.4 The correct answer is C.
6x + 4 = 2y − 4
Add 4 to both sides: 6x + 8 = 2y
Divide both sides by 2: 3x + 4 = y
2.5 The correct answer is D. Clearing the fractions in the equation given in the question:
Multiply both sides by 2: 2y + 3 = (2y² − 2y + 10)/y
Multiply both sides by y: 2y² + 3y = 2y² − 2y + 10
Subtract 2y² from both sides: 3y = −2y + 10
Add 2y to both sides: 5y = 10
y = 2

2.6 The correct answer is D. Any line has the equation y = mx + c


Since the intercept is 3, c = 3
Since the line goes through (3,9):
9 = 3m + 3
m=2
and the line is y = 2x + 3.
2.7 The correct answer is B. The line goes through the points (−1,6) and (3,−2).
The slope is:
(−2 − 6)/(3 − (−1)) = −8/4 = −2
The line must have the form y = −2x + c
Since it goes through (−1, 6):
6=2+c
c=4
The line must be y = 4 − 2x.


2.8 The correct answer is D.


4x + y = 5
2x − y = 7
Add the equations:
6x = 12
x = 2
Substitute in the first equation (or the second):
8 + y = 5
y = −3
2.9 The correct answer is B.
2y + 7x = 3
3y − 2x = 17
Multiply the first equation by 3, the second by 2 and subtract:
6y + 21x = 9
6y − 4x = 34
25x = −25
x = −1
Substitute x = −1 in 2y + 7x = 3:
2y − 7 = 3
2y = 10
y = 5
2.10 The correct answer is B.
(16)^(−3/2) = 1/16^(3/2)
            = 1/(√16)³
            = 1/4³
            = 1/64

2.11 The correct answer is C. The definition of logarithms means that a number is being
sought such that 2? = 8. The number is 3.
2.12 The correct answer is B. A is the equation of a straight line and cannot be correct.
The intercept of the curve shown is 10, thus D cannot be correct. The curve shown
is one of growth. The equation C is one of decay (the exponent is negative) and thus
cannot be correct. At first sight B satisfies all the requirements. A check shows that:
when x = 1, y = 10 × 10^0.5 = 10 × √10 = just over 30
This is in accordance with the curve in the graph and B is correct.


Case Study 2.1: Algebraic Formulation


1
(a)
(i) E = (b − e)x
(ii) b, e
(b) T = 2 + x, where T = time taken, x = number of orders.
(c) s = 400x + 200y + 10 000, where s = total salary, x = number of new cars sold
and y = number of second-hand cars sold.
(d) For hires of no more than one week in length, the algebraic and graphical
solutions are as follows:
C = 50 + 0.09x for x ≤ 1000 (≤ means less than or equal to)
C = 140 for x > 1000 (> means greater than)

[Figure A4.3 Cost of car hire: total cost of hire C (£) against miles travelled x; the line rises from 50 at x = 0 to 140 at x = 1000 miles and is flat at 140 thereafter.]

Marking Scheme (out of 25) Marks


(a) (i) Equation 3
(ii) Constants 2
(b) Variables 3
Equation 2
(c) Variables 3
Equation 2
(d) Variables 2
Equations 2
Split into x ≤ 1000 and x > 1000 1
Graph 5
Total 25


Case Study 2.2: CNX Armaments Co.


1 Let x = the quantity of a system sold. A value of x is to be found so that Costs =
Revenue.
For System 1
Costs = 100 + 4x
Revenue = 5x
Thus 5x = 100 + 4x
x = 100
The breakeven point for System 1 is 100.
For System 2
Costs = 1200 + 4x
Revenue = 8x
Thus 8x = 1200 + 4x
4x = 1200
x = 300
The breakeven point for System 2 is 300.

Marking Scheme (out of 10) Marks


System 1 specification of revenue 1
General specification of costs 3
Costs = Revenue equation 3
Solution 1
Repeat for System 2 2
Total 10

Case Study 2.3: Bonzo Corporation


1 The quantities to be found are the amounts of Bonzo and Woof that each dog must
eat. Therefore, let:
B = oz of Bonzo that should be eaten
W = oz of Woof that should be eaten
Each ounce of Bonzo contains 0.3 oz meat and 0.7 oz wheatmeal. If a dog eats B oz
of Bonzo, he eats 0.3B oz meat and 0.7B oz wheatmeal. If a dog eats W oz of Woof,
he eats 0.4W oz meat and 0.6W oz wheatmeal. If a dog eats B oz of Bonzo and W oz
of Woof, his total intake of meat and wheatmeal is:
0.3B + 0.4W oz of meat
0.7B + 0.6W oz of wheatmeal
These quantities have to be 6 oz and 10 oz respectively, so two equations can be
formed:
0.3B + 0.4W = 6 (A4.1)


0.7B + 0.6W = 10 (A4.2)


To find the quantities a dog should eat entails solving the two simultaneous equa-
tions to put actual numbers to B and W. Multiply Equation A4.1 by 15 and
Equation A4.2 by 10, then subtract:
4.5B + 6W = 90
7B + 6W = 100
−2.5B = −10
B = 4
Substitute B = 4 into Equation A4.1:
1.2 + 0.4W = 6
0.4W = 4.8
W = 12
Each dog should eat:
4 oz of Bonzo
12 oz of Woof
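Assuming NumPy is available, the pair of equations can also be solved in a single call as a check:

    import numpy as np

    coefficients = np.array([[0.3, 0.4],
                             [0.7, 0.6]])
    totals = np.array([6.0, 10.0])    # oz of meat and wheatmeal required
    print(np.linalg.solve(coefficients, totals))    # [ 4. 12.]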

Marking Scheme (out of 20) Marks


Variables 5
Equations 5
General method of solution 5
Answer 5
Total 20

Case Study 2.4: Woof Dog Food


1 Let y = sales in cases.
Let x = time (i.e. x = 0, 1, 2, … for each year).
General form of relationship: y = ke^(cx)
When x = 0 (six years ago):
y = 10 000 = ke^0 = k
k = 10 000
When x = 5 (last year):
y = 10 000e^(5c) = 40 000
e^(5c) = 4
Take logarithms of both sides:
5c = ln 4 = 1.386
c = 0.277
So y = 10 000e^(0.277x)
This year, x = 6:
y = 10 000e^(6 × 0.277) = 10 000e^(1.662)
From tables, e^(1.662) = 5.27
Sales = 52 700
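A minimal Python sketch of the same fit; exact arithmetic gives about 52 780, the 52 700 above arising from four-figure tables:

    from math import exp, log

    k = 10_000            # sales when x = 0
    c = log(4) / 5        # from e**(5c) = 4
    print(round(c, 3))                # 0.277
    print(round(k * exp(6 * c)))      # 52780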

Marking Scheme (out of 20) Marks


Variables 4
Calculating k 4
Calculating c 5
General form 2
Estimate for this year 5
Total 20

Module 3

Review Questions
3.1 The correct answer is A, B. These are the two major guidelines in communicating
data. There is no requirement for data to be specified to two decimal places (alt-
hough one of the ‘rules’ of data presentation is that they should be rounded to two
effective figures). The data do not have to be analysed before presentation (although
presentation should be in a form amenable to analysis). It is often the case that only
the receiver is in a position to analyse them.
3.2 The correct answer is False. Although the assumption is often made, there is
frequently no link between the specification of data and its accuracy.
3.3 The correct answer is True. Data should be sufficiently accurate to avoid wrong
decisions being made. If the decisions do not require a high level of accuracy, then it
is pointless to supply such accuracy.
3.4 The correct answer is B. The two effective figures are the 3 and the 7. The rounded
number is thus 3700.
3.5 The correct answer is C. All numbers are rounded to their first two figures: 1700 for
1732, 38 for 38.1, etc.
3.6 The correct answer is A, B, C. All are correct. Speedy comparisons, ease of
subtraction and closeness are all factors that aid the comparison of numbers.
3.7 The correct answer is A, B, C, D. All are correct. Any of the four bases could be
used. Circumstances and/or taste would dictate which.
3.8 The correct answer is B, C. The rows are apparently in alphabetical order, so B is
valid (although it is conceivable that they have been ordered by, say, capital em-
ployed). A gap has been introduced in the numbers purely to accommodate the
labels, thus C is valid. A is not a valid criticism since the numbers are rounded to
two effective figures. The first figure is ‘1’ for all numbers, therefore the two
effective figures are the second and third. D is not valid, since a vertical gridline
would not help the communication of the data.


3.9 The correct answer is C. Income statements cannot be ordered by size because they
have a logical build-up. Nor would it help with communication to use summary
measures or interchange rows and columns. Other rules can be applied. Rounding
can be done and is not illegal. A and B are not correct. Auditing can only be
checked in published accounts in the most superficial and trivial way. Most pub-
lished accounts are in fact rounded to some degree already. D is not correct since
published accounts are for shareholders.
3.10 The correct answer is B, D. A is not correct since the involvement of time has no
bearing on the choice between tables and graphs. C is not correct since graphs are
not usually good at distinguishing fine differences.

Case Study 3.1: Local Government Performance Measures


1 The presentation of the data is not specially bad. However, it still takes some time to
appreciate the salient features and some small changes can make a lot of difference.
Using the rules of data presentation, Table A4.1 shows the amended version.

Table A4.1 Local government performance measures (amended)


                2015                 2016
District        Dec.    Feb.   Apr.   May   June
Plumby           56      67     75     55     71
Westerley        44      86     89     72     87
Tyneham          40     100     84    100    100
Southam          32      49     65     66     70
Eastham          25      78     75     79     94
Northington      15      67     86     88     88

Average          35      74     79     77     85

Rule 1. Rounding to two effective figures means losing the decimal point. It
would be perfectly acceptable to do so given the purpose of the table (monitor-
ing performance). In any case, it is unlikely that data relating to achievement of
objectives are accurate to one decimal place. However, if the data were for dif-
ferent purposes – for example, paying bonuses for an outsourced service – the
loss of the decimal place might not be acceptable.
Rule 2. The data are in alphabetical order. This would be fine if the table was a
long reference table where readers were only interested in the performances of
particular authorities. To judge the relative performance of the six authorities,
size order is better. The order in the first month has been chosen so that relative
improvement can easily be seen.
Rule 3. Rows and columns have not been interchanged. Comparisons between
authorities are at least as important as comparisons over time.
Rule 4. A summary measure would be useful since it would allow comparisons
with ‘average performance’. The ‘average’ used here is the ‘total’ – the achieve-


ment of all authorities taken as a combined group. If this is not available then the
arithmetic mean of the six authorities would serve nearly as well.
Rule 5. Labelling is generally clear but some minor changes have been made,
mainly to improve the appearance of the table.
Rule 6. The excessive use of gridlines is obtrusive and obscures comparisons.
This is one of the worst features of the original table. The gridlines, except for
the horizontal one separating the headings from the data, have been removed.
The columns have been moved closer together for easier comparison.
Rule 7. A short verbal summary might be: Authorities have made significant
improvements, moving from an average achievement of 35 per cent to 85 per
cent. All authorities have improved to some extent but some have improved
more markedly than others.
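
Rule 1 is easy to mechanise. The helper below is a minimal sketch (my own, not from the course text) that rounds to two significant figures; this coincides with two effective figures except where a leading digit is common to the whole data set, in which case a further figure would be kept, as the text describes:

    import math

    def round_two_figures(x, figures=2):
        # Round x to the given number of significant figures.
        if x == 0:
            return 0
        magnitude = math.floor(math.log10(abs(x)))
        step = 10 ** (magnitude - figures + 1)
        return round(x / step) * step

    print(round_two_figures(3675))  # 3700, as in Review Question 3.4
    print(round_two_figures(1732))  # 1700, as in Review Question 3.5
    print(round_two_figures(38.1))  # 38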

Marking Scheme (out of 20) Marks


Major changes
– Rounding – losing decimal place 3
– Size order – rather than alphabetical 4
– Summary measure 4
– Removal of gridlines 4
– Verbal summary 3
Minor changes
– Improved labelling 1
– Moving data closer together 1
Total 20

Case Study 3.2: Multinational Company’s Income Statement


1 Several changes can be made and are shown in the amended table, Table A4.2.

Table A4.2 Summary of combined figures: results for year ending 31 December (£m)

                                   2015    2016
Sales to third parties             9150    9840
OPERATING PROFIT                    541     601
Shares ass. companies’ profit        59      64
Non-recurring items                 −50     −56
PROFIT BEFORE TAXATION              550     609
Taxation                           −272    −315
PROFIT AFTER TAX                    278     294
Outside interests/pref. divs        −20     −21
Ordinary dividends                  −95    −106
RETAINED PROFIT                     163     167


The changes are:


(a) A small amount of further rounding has been carried out. Since these are
published accounts, the two effective figures rule has not been adhered to in
every case for reasons stated in the text.
(b) Labels have been shortened and brought nearer the numbers.
(c) Important subtotals, such as ‘Profit before tax’ and ‘Profit after tax’ have been
separated and highlighted.
(d) The italics, used to denote negatives, have been substituted with minus signs.
Tests on the readability of these accounts showed that the italics caused confu-
sion. The accountant’s brackets for negatives would probably have been
preferable to the italics.

Marking Scheme (out of 10) Marks


Rounding 3
Labelling succinct and closer to data 2
Subtotals highlighted 3
Italics eliminated and minus signs used 2
Total 10

Case Study 3.3: Country GDPs


1 In considering which is the better method of communication, the salient points are
as follows:
(a) The sizes of the countries are so different that it is difficult to incorporate them
all on the same graph. Figure A4.4 shows the data for all countries except the
USA, Japan and some smaller countries. Figure A4.5 includes the USA and Japan
as well. It is always a problem to deal with situations like this without having to
resort to artificial devices such as a logarithmic scale or a break in the vertical
axis. Both devices can be confusing. Figure A4.5 does not work.
(b) Just as there is a difficulty with the largest countries because of the scale, so there
are problems with several smaller countries. It is impossible to make any real
comparisons between the five smallest countries, or even include all of them on
the graph.
(c) Differences can be seen from graphs but not the magnitudes of the differences.
For example, it can be seen that Germany has grown more than Belgium, but it
is not possible to see that the growth factor for Germany was about 17 and for
Belgium 13.
(d) The lines rarely cross over and a common problem with graphs is not present
here. The major pattern of growth is distinctly visible.
The decision between Table 3.11 and Figure A4.4 rests on personal choice and
circumstances. Table 3.11 is more precise, but Figure A4.4 would seem (to some)
less boring than a table of numbers.


[Figure: line graph of GDP (EUR thousand million), 0–2000, against years 1965–1995, with lines for Germany, France, UK, Italy, Belgium and Denmark]

Figure A4.4 Graph of Table 3.11 (omitting USA and Japan)

[Figure: line graph of GDP (EUR thousand million), 0–8000, against years 1965–1995, adding USA and Japan to the countries above]

Figure A4.5 Graph of Table 3.11 (including USA and Japan)

Marking Scheme (out of 10) Marks


Recognising the salient points:
Dealing with different sizes of data on one graph 2
Distinguishing between small countries 2
Magnitudes quantifiable with difficulty on graph 2
Greater attractiveness of graph 2
Few lines crossing over on this particular graph 2
Total 10


Case Study 3.4: Energy Efficiency


1 This is an example of data in the form of a table taken from a management report.
The rules of data presentation are applied as follows:
Rule 1. Round the numbers to two effective figures. Note that for IRR percent-
ages, this implies that the first three figures are retained. The ‘2’ is common to all
IRRs.
Rule 2. Ordering by size is not appropriate. It is better to keep the groups, each
referring to the changing of one parameter.
Rule 3. The important comparison is between the IRRs. Therefore rows and
columns should not be interchanged.
Rule 4. The ‘summary’ is the base IRR, which is put at the foot of the column
containing the other IRRs.
Rule 5. There are already no gridlines. Nor is there excessive white space.
Rule 6. Putting boxes around the labels is a commonly used device, but it detracts from
the numbers. The units of the numbers (%, m, etc.) are part of the labelling.
They can be moved out of the body of the table.
Rule 7. A verbal summary might be: ‘None of the changes in the parameters
reduced the IRR to below 20 per cent or increased it above 30 per cent.’
The result of these changes is the amended table, Table A4.3. Note that, with a
better presentation, it is no longer necessary to show the changes in parameters and
their new values. Only the new value is shown.

Table A4.3 Sensitivity analysis for oil conservation (amended)


Parameter                        Base case   New parameter value   New IRR (%)
Capital cost (£m)                   3.8             4.3               22.7
                                                    3.3               30.5
Life (years)                        13              12                23.1
                                                    11                22.6
                                                    10                22.1
                                                     9                21.3
Fuel differential (t/day)           43              48                27.7
                                                    38                22.0
Fuel price inflation (%/year)       12              13                26.4
                                                    11                23.5
Base case                                                             24.9


Marking Scheme (out of 10) Marks


Rounding correctly 2
Improved labelling 2
Moving units (e.g. ‘years’) away from the numbers 2
Verbal summary 2
Elimination of some data 2
Total 10

Module 4

Review Questions
4.1 The correct answer is False. Traditional statistical methods can be helpful, but
because they were not designed for management situations there is a gap in a
manager’s needs they do not fill.
4.2 The correct answer is False. The need for such skills is not new, but it is now
especially important because more data are now available than previously. In any
case, computers do not necessarily have to present data in a complicated style,
although they often do so.
4.3 The correct answer is C. Most data sets contain items which are unnecessary or
irrelevant to the analysis being undertaken. Inaccuracies should lead to corrections
being made rather than omissions, so A is not a correct reason. Large amounts of
data may well be analysed at one time, so B is not a correct reason.
4.4 The correct answer is A, B. Data to eight decimal places may or may not be
required, depending on the situation. The key point is that the number of decimal
places does not necessarily denote accuracy.
4.5 The correct answer is A, B. C is incorrect since the model will be to some extent an
approximation and thus more inaccurate than the original data.
4.6 The correct answer is B. Sales increase by almost 25 per cent each year. A and D are
also models, but not such good ones. C refers only to one year.
4.7 The correct answer is B. Division 2 is the exception, since its profit/capital
employed is under 4 per cent. The others all earn a return of six per cent (this is the
model for the data). It may be tempting to think of Division 4 as the exception since
it is the largest.
4.8 The correct answer is False. You would not be right. Although 11 exceptions out of
36 is a high ratio, all can be explained in terms of strikes or holidays. The model is
probably a good one with the condition that it does not apply when there are strikes
or holidays.
4.9 The correct answer is C, D. C and D would help to show whether the pattern found
for distilled spirits was repeated for other alcoholic beverage sectors of the market.
Since the essence of the analysis is to look at variations between states of the USA,
it is difficult to see that there would be a basis for comparison with the provinces of


a different country. The factors that make one province different from another are
not the same as make one state different from the next.
4.10 The correct answer is False. The reason for choosing a simple model first is that the
objective of understanding the data can be achieved more quickly and easily. Any
model that is a good one, whether simple or sophisticated, will ‘model’ the pattern,
not obscure it.

Case Study 4.1: Motoring Correspondent


1
(a) Analyse the data according to the guidelines. Reducing the data by omitting the
words and re-presenting what is left, gives:
2004: one death every 10 million miles
2009: one death every 12 million miles
2014: 6400 deaths in 92 000 million miles
i.e. one death in every 14.4 million miles
It can now be seen that on average the ratio of deaths to miles driven has im-
proved. The model of the data is that the average number of miles for each
death increases by 20 per cent every five years. There are no exceptions to this
model since the data are too few.
There is no other information with which to make comparisons, but the follow-
ing road death data would be useful, if obtainable:
(i) prior to 2004;
(ii) casualties (as opposed to deaths);
(iii) other countries’ road accidents.
(b) It can now be concluded that there was a reduction in the death rate on the
roads. Whether driving was safer is a slightly different matter. The deaths include
all road deaths. An improvement in the overall figure could mask a worsening
for pedestrians. In any case, safety is a personal thing, depending upon an indi-
vidual’s circumstances. The overall rate for drivers may have improved because
of improvements for long-distance drivers as a result of motorway building. This
would not make things safer for a predominantly town driver.
(c) If the model is correct, the number of miles for every death will increase by 20
per cent every five years, giving one death every 17.3 million miles (14.4 × 1.2) at the end of the next five-year period.
(d) The reservations are that the model is based on:
(i) too few data;
(ii) aggregate data which may obscure important trends in subsections of road
users;
(iii) purely time trends. It does not attempt to explain what underlying causes
initiate the trends. Over five years these causes may change and disrupt the
time trend.


Marking Scheme (out of 20) Marks


(a) Reduction and re-presentation 4
Correct model 4
Suggestion of comparative data 2
(b) Qualified conclusion that driving is safer 4
(c) Prediction 2
(d) Reservations 4
Total 20

Case Study 4.2: Geographical Accounts


1 Follow the five stages of the guidelines:
(a) Reduce the data by omitting the expenditures, but retaining the percentages.
Conversely, the percentages could be omitted and the expenditures retained. The
percentages are preferred in view of the final objective of finding the unusual
expenditure levels. The presence of both types of data in the same table merely
confuses.
(b) Re-present the data by rounding the numbers to two effective digits. Reorder the
regions by their total expenditures; reorder the expenditure categories by size.
The result of (a) and (b) is Table A4.4.

Table A4.4 Expenditure by geographical division (%)


                    Total   Raw materials   Transport   Manpower   Finance   Fuel
England and Wales    100         40             29          23        3.1     0.8
Divisions
Thames               100         38             28          26        3.5     0.7
Severn-Trent         100         37             33          21        4.5     0.9
North West           100         50             26          19        0.7     0.7
Anglia               100         35             30          24        3.0     0.6
Yorkshire            100         47             31          17        1.2     0.9
Wales                100         47             21          18        9.1     0.9
South                100         30             34          27        1.6     0.9
Wessex               100         27             32          34        1.2     1.0
Northumbria          100         47             25          25        0.3     0.9
South West           100         46             24          21        2.4     0.8

(c) The model for the data is that, under each category of expenditure, the percent-
age for each region is roughly the same as the percentage for the total (the whole
of England and Wales). The model is, to some extent, dictated by the question,
since we are looking for out-of-the-ordinary expenditures.


(d) The major exceptions can be spotted by looking down each column and noting
the regions whose percentage is very different from the percentage at the head of
the column.
(e) The only really valid comparison would have to be with previous years’ expendi-
tures.
Suppose a major exception is defined, arbitrarily, as being more than one-fifth away
from the England and Wales figure. For example, for raw materials an exception is
any region whose percentage for this category lies outside the range 40 − 8 to
40 + 8 (8 being one-fifth of 40), i.e. below 32 per cent or above 48 per cent.
Choosing a fraction other than one-fifth would result in a tightening or a loosening
of the definition of ‘major exception’.
The answer to the question posed is that the unusual exceptions are:
(a) Raw materials: North West, South, Wessex.
(b) Transport: Wales.
(c) Manpower: Wessex, Yorkshire, Wales.
(d) Finance: There is a large variation in this category. Only Thames and Anglia are
close to the average figure.
(e) Fuel: Anglia, Wessex.
Note that in this instance ordering by size has had little effect on the analysis. This
does not diminish its general usefulness. Once the exceptions are known, the
analysis has done its job. It is for management to explain the exceptions and, if
necessary, take action.

Marking Scheme (out of 20) Marks


Reduction: omission of either expenditures or percentages 3
Re-presentation: rounding 3
Finding a model for the data 4
Exceptions
– method of defining exceptions 4
– list of exceptions 4
Comparative data: previous years 2
Total 20

Case Study 4.3: Wages Project


1 No data are obviously irrelevant and so no reduction is undertaken. However,
considerable re-presentation is necessary. Following the seven rules for data
presentation:
(a) Round the numbers to two effective figures.
(b) Put the rows in size order. The original ordering method is not clear. The trades
are ordered neither by size nor alphabetically.
(c) The important comparisons are between trades. Therefore there is no reason to
interchange rows and columns.


(d) Summary measures are already provided. The total row presumably refers to the
overall data for all trades.
(e) There is no excessive use of gridlines or white space in the original table.
(f) The labelling is poor. The labels appear to be too long. Although it is sometimes
necessary to preserve legal or precise definitions, this is unlikely to be the case
here. The labels also interfere with the presentation of the numbers. Their length
results in there being gaps in the columns of numbers. The labelling should be
abbreviated.
(g) No verbal summary can be made at this stage.
The result of this re-presentation is Table A4.5.

Table A4.5 Wages inspectorate’s blitz campaign (Table 4.7 amended)


Trades                   % employers    % employees entitled   Amount
                         underpaying    to arrears             underpaid (£)
Bookselling                  50               20                   1 200
Unlicensed restaurants       48               33                   5 300
Retail bread                 33               23                   2 500
Newsagency                   33               22                   9 600
Drapery                      32               17                  12 000
Licensed restaurants         31                9                   3 900
Hairdressing                 26               11                   4 400
Hotels                       22               10                   6 200
Retail food                  22               12                  17 000
Furnishing                   20                8                  14 000
Other trades                 18                8                     200
Total                        27               14                  76 000

Trades are in descending order according to percentage of employers underpaying
(column 1).

No major pattern emerges, but there is some indication that trades for which a large
percentage of employers are underpaying also have a large percentage of employees
being underpaid. This could be the model of the data. However, the exceptions to
this model are so many that there must be doubts as to its validity.
When searching for another model involving the third column, it becomes apparent
that this column, referring to the amount underpaid, is meaningless unless the
sample size is known, i.e. the number of employers and employees concerned in the
amounts underpaid. This fact suggests that the next step in the analysis might be to
find information on the sample sizes and thereby calculate the average amounts
underpaid per employee in each trade. This is a case where, even though no neat
analysis emerges, it is clear what to do next.


Marking Scheme (out of 20) Marks


Re-presentation
– rounding 3
– ordering 3
– labelling 3
Attempting to find a model and to validate it by looking at 4
exceptions
Recognising that no adequate model can be found 2
Suggesting the next steps in data collection 5
Total 20

Module 5

Review Questions
5.1 The correct answer is B. Summary measures are used because a few summary
figures are easier to handle (i.e. to remember, to compare with other numbers, etc.)
than a set of data comprising perhaps several hundred numbers. If the summary
measures are well chosen, little accuracy will be lost for practical purposes, but
inevitably some information will be; if the measures are ill-chosen they could
misrepresent the data and be misleading. Measures of location and scatter frequently
capture the main features of a set of data, but not always. Just how a set of data can
be summarised depends upon the set itself.
5.2 The correct answer is A.
Arithmetic mean = Sum of readings/Number of readings
= 64/16 = 4
5.3 The correct answer is D. In ascending order:
0, 1, 1, 2, 2, 3, 4, 4, 5, 5, 5, 5, 6, 6, 7, 8
Median = (4 + 5)/2 (the mean of the eighth and ninth readings)
= 4.5
5.4 The correct answer is E. Mode = 5
5.5 The correct answer is E. None of the statements A–D applies.
The arithmetic mean is the pre-eminent measure of location because it is well
known, easy to use and easy to calculate. But in a few circumstances (the presence
of extreme readings, distinct clusters of readings and when taking the average of
averages) the arithmetic mean can be misleading. Except for these circumstances,
the arithmetic mean is preferable in the case of a symmetrical distribution, when all
three give approximately the same result. For a U-shaped distribution the arithmetic
mean is not helpful for descriptive purposes (there are two clusters of numbers).


5.6 The correct answer is B. It is tempting to calculate:


Average speed = (200 + 300 + 400 + 600)/4
= 1500/4
= 375 km/h
But, since a different amount of time is spent at each speed, we are in effect
averaging averages (cf. percentage marks for two school streams). Going back to the
basic definition of average speed:
Average speed = Total distance/Total time
Total distance = 200 + 200 + 200 + 200 = 800 km.
Total time = Time for AB + Time for BC + Time for CD + Time for DA.
The distance AB is 200 km travelled at 200 km/h.
The time is therefore one hour.
The distance BC is 200 km travelled at 300 km/h.
The time is therefore 200/300 = 40 min.
Similarly, time for CD is 30 min. and time for DA is 20 min.
Total time = 2.5 hours.
Average speed = 800/2.5 = 320 km/h.
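
The contrast between the misleading simple average and the correct time-weighted average is easy to verify by computer; a brief Python sketch (not part of the original solution), assuming the leg speeds of 200, 300, 400 and 600 km/h shown above:

    distances = [200, 200, 200, 200]   # km for legs AB, BC, CD, DA
    speeds = [200, 300, 400, 600]      # km/h on each leg

    naive = sum(speeds) / len(speeds)  # 375 km/h: averaging averages
    total_time = sum(d / s for d, s in zip(distances, speeds))  # 2.5 hours
    correct = sum(distances) / total_time
    print(naive, correct)              # 375.0 320.0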
5.7 The correct answer is C. Scatter (synonymous with dispersion) is one attribute of a
set of numbers, independent and distinct from other attributes. A measure of scatter
is complementary to other summary measures, not an alternative. The use of a
measure of scatter depends upon the nature of the data and the purpose to which it
is being put. It does not have to be used in every situation. A measure of scatter has
a low value when the readings are close, high when they are further apart.
5.8 The correct answer is B. Range = 29 − 21 = 8
5.9 The correct answer is A, B. Either A or B could be correct.
In ascending order:
21, 22, 23, 24, 25, 26, 26, 27, 27, 29
There are 10 readings. Eliminating the top and bottom 25 per cent means losing the
top two and bottom two, or the top three and bottom three. In the first case,
Interquartile range = 27 − 23 = 4
In the second case:
Interquartile range = 26 − 24 = 2
5.10 The correct answer is B. The arithmetic mean is
= (23 + 27 + 21 + 25 + 26 + 22 + 29 + 24 + 27 + 26)/10 = 250/10 = 25


x          23   27   21   25   26   22   29   24   27   26
x − x̄      −2    2   −4    0    1   −3    4   −1    2    1
|x − x̄|     2    2    4    0    1    3    4    1    2    1    Σ|x − x̄| = 20
(x − x̄)²    4    4   16    0    1    9   16    1    4    1    Σ(x − x̄)² = 56

Mean absolute deviation = Σ|x − x̄|/n
= 20/10 = 2
5.11 The correct answer is B.
Variance = Σ(x − x̄)²/(n − 1)
= 56/9 = 6.2
5.12 The correct answer is D.
Standard deviation = √Variance
= √6.2 = 2.5
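
Questions 5.10 to 5.12 can be checked with a short Python sketch (not part of the course text; the divisor n − 1 matches the variance definition used above):

    data = [23, 27, 21, 25, 26, 22, 29, 24, 27, 26]
    n = len(data)
    mean = sum(data) / n                                     # 25.0
    mad = sum(abs(x - mean) for x in data) / n               # 2.0
    variance = sum((x - mean) ** 2 for x in data) / (n - 1)  # 6.22
    sd = variance ** 0.5                                     # 2.49
    print(mean, mad, round(variance, 1), round(sd, 1))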
5.13 The correct answer is B. A, C and D are untrue. While no one measure of scatter is
pre-eminent in general, a particular measure is likely to be preferable for each
particular set of data. The range is a descriptive measure that depends on the two
extreme readings. The interquartile range overcomes this problem. Variance suffers
from the disadvantage that it measures the scatter in units that are the square of the
units of the original readings. Standard deviation, being the square root of variance,
measures scatter in original units. Variance and standard deviation are easy to handle
mathematically but are not intuitively easy to understand. The opposite is true of
interquartile range and mean absolute deviation. They are ‘sensible’ and simple
measures of scatter, but are not easy to handle mathematically.
5.14 The correct answer is B, C. An outlier should be corrected, retained or excluded. It
is corrected of course when an error in typing, collection, etc., has been made; it is
always retained when it is an integral part of the pattern; it is usually – but not
always – excluded when it is not a regular part of the pattern. The ambiguity in the
latter case arises because of the difficulty in deciding whether an event is truly
isolated (e.g. is a strike an isolated event?). Answer A is false because an outlier is
sometimes corrected.
5.15 The correct answer is B. Between 2013 and 2014, cost of living grew by:
(2014 index − 2013 index)/2013 index × 100 = 8.7%
Wages and salaries grew by:
(2014 index − 2013 index)/2013 index × 100 = 8.4%
The growth in the cost of living is therefore slightly greater.


Case Study 5.1: Light Bulb Testing


1
(a) Mean life for Supplier A’s bulbs:
= ((12 × 800) + (14 × 1000) + (24 × 1200) + (10 × 1400))/60
= 66 400/60
= 1107 hours
Note that all bulbs whose length of life was, say, 700–899 hours are approximat-
ed as having the mean length of life for that class (i.e. 800 hours).
Mean life for Supplier B’s bulbs:
= ((4 × 800) + (34 × 1000) + (19 × 1200) + (3 × 1400))/60
= 64 200/60
= 1070 hours
Supplier A’s bulbs last on average 37 hours longer.
(b) To measure uniformity of quality means measuring the scatter of length of life.
Range and interquartile range do not depend on all 60 observations. Since two
suppliers are being compared and the differences may be small, a more precise
measurement is required. Standard deviation (or variance) could be used, but
mean absolute deviation is easier to calculate.
For Supplier A:
MAD = (12 × |800 − 1107| + 14 × |1000 − 1107| + 24
× |1200 − 1107| + 10 × |1400 − 1107|)/60
= (3684 + 1498 + 2232 + 2930)/60
= 10 344/60
= 172 hours
For Supplier B:
MAD = (4 × |800 − 1070| + 34 × |1000 − 1070| + 19
× |1200 − 1070| + 3 × |1400 − 1070|)/60
= (1080 + 2380 + 2470 + 990)/60
= 6920/60
= 115 hours
Supplier B’s bulbs are more uniform in quality.
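A Python sketch of the grouped-data calculations follows (not part of the original solution; the small differences from the figures above arise only because the text rounds the means to whole hours before computing the MADs):

    midpoints = [800, 1000, 1200, 1400]   # class mid-points (hours)
    freq_a = [12, 14, 24, 10]             # Supplier A, 60 bulbs
    freq_b = [4, 34, 19, 3]               # Supplier B, 60 bulbs

    def grouped_mean(mids, freqs):
        return sum(m * f for m, f in zip(mids, freqs)) / sum(freqs)

    def grouped_mad(mids, freqs):
        mean = grouped_mean(mids, freqs)
        return sum(f * abs(m - mean) for m, f in zip(mids, freqs)) / sum(freqs)

    print(grouped_mean(midpoints, freq_a), grouped_mad(midpoints, freq_a))  # ~1107, ~172
    print(grouped_mean(midpoints, freq_b), grouped_mad(midpoints, freq_b))  # 1070, ~115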
(c) Having a longer average life is obviously a desirable characteristic. However,
uniform quality is also desirable. As an example, consider a planned maintenance
scheme in which bulbs (in factories or offices) are all replaced at regular intervals
whether they have failed or not. This can be cheaper since labour costs may be
higher than bulb costs and replacing all at once takes less labour time than indi-
vidual replacement. In such a scheme the interval between replacement is usually


set such that, at most, perhaps 10 per cent of bulbs are likely to have failed. The
interval between replacement can be larger for bulbs of more uniform quality
since they all tend to fail at about the same time. Consequently, the scatter of
length of life is more important than average length of life in these circumstanc-
es.
The choice between Supplier A and Supplier B depends upon the way in which
the bulbs are replaced and is not necessarily based solely on average length of
life.

Marking Scheme (out of 20) Marks


Calculating average life for Supplier A bulbs 2
Calculating average life for Supplier B bulbs 2
Preferring MAD to other measures in calculating scatter
(2 marks if standard deviation or variance) 4
Correct calculation of MAD/standard deviation for Supplier A 4
Correct calculation of MAD/standard deviation for Supplier B 4
Explanation of why average life and uniformity of quality are both 4
important attributes
Total 20

Case Study 5.2: Smith’s Expense Account


1 Average daily expense per trip = £210/6 = £35. But the daily expense (last column)
varies with the length of the trip. Short trips are more expensive per day than long
ones. Smith has more short trips than the other salespeople (who average £20 per
day on weekly trips). Since all trips have an equal weighting in the averaging process,
Smith comes off unfavourably. His boss is averaging averages.
It would be more reasonable to calculate the average expense per day (as opposed to
the average daily expense per trip):
Average expense per day = Total expenses over all trips/Total days away
= £18.5
An even better measure would be one that related expense to both trips and days.
Another technique, introduced later, can do this.

Marking Scheme (out of 10) Marks


Explaining that a per-trip average is not fair to Mr Smith because 5
trips are of unequal length
Preparing and calculating a new average based on days 5
Total 10


Case Study 5.3: Monthly Employment Statistics


1 The standard deviation (or any other measure of scatter) by itself does not, in this
case, reflect the stability of employment level.
It is necessary to relate the standard deviation to the mean, since a standard devia-
tion of 600 against a mean of 10 000 shows more stability than a smaller standard
deviation of 300 against a mean of 4000. The ratio of standard deviation/mean is
the coefficient of variation and it is useful in comparing scatter in data with different
means.

A B C D
Coefficient of variation 0.06 0.06 0.08 0.07

There is, therefore, little difference between departments with respect to stability.

Marking Scheme (out of 10) Marks


Understanding that standard deviation by itself does not reflect 5
stability
Relating the standard deviation to the mean by use of the 5
coefficient of variation
Total 10

Case Study 5.4: Commuting Distances


1
 Mode = 3 miles
 Median = 6.5 miles
 Mean = 10.5 miles
The arithmetic mean is given in the question. The 50 per cent and 51 per cent marks
are both in the 6–7 miles category and the median is taken as the mid-point, 6.5
miles.
It is tempting to suppose that the mode is the 4–5 miles class, since this has the
highest percentage. However, some classes cover just one mile (0, 1, 2, 3 classes).
Others cover two miles (4–5, 6–7, 8–9, 10–11), others cover three miles and so on.
Assumptions must be made. Assuming the 4–5 mile class is split evenly with the 4-
mile class having 8 per cent and the 5-mile class having 8 per cent (and making
similar assumptions for other classes), the mode is 3 miles.
The data are far from symmetrical. Because a few workers travel long distances, the
mean is distorted (although the mean is 10.5 miles, approximately three-quarters of
the workers travelled fewer than 10.5 miles). The mode could be misleading on
account of the approximations made in its calculation.
The median is probably the best measure of location. Two measures of scatter are
asked for. The interquartile range is calculated by removing the bottom 25 per cent


of readings (the classes for 0, 1, 2 miles and part of the 3-mile class) and the top 25
per cent (classes for 30+, 20–29, 15–19, 12–14 miles and part of the 10–11 miles
class).
Interquartile range = 11 − 3 = 8 miles
A second measure of scatter is more difficult. The range cannot be calculated
because the largest distance is not known. The MAD, variance and standard
deviation cannot be calculated because the highest class is imprecise. Whereas, for
instance, the 15–19 miles class can be substituted by its mid-point in making
calculations, this cannot be done for a class defined as ‘30 and over’ miles. However,
since it is known that the arithmetic mean distance is 10.5 miles, the mid-point of
the 30+ miles class can be approximated.
If x is the average distance for the 30+ miles class, then:
Arithmetic mean = [(3 × 0) + (7 × 1) + (9 × 2) + (10 × 3) + (16 × 4.5)
+(12 × 6.5) + (11 × 8.5) + (8 × 10.5) + (8 × 13) + (4 × 17)
+(6 × 24.5) + (6 × x)]/100
= (701.5 + 6x)/100
= 10.5 miles
So:
701.5 + 6x = 1050
6x = 348.5
x = 58
The MAD can be calculated using 58 miles as the mid-point of the highest class:
MAD = [3 × |0 − 10.5| + 7 × |1 − 10.5| + 9 × |2 − 10.5| + 10 × |3 − 10.5|
+16 × |4.5 − 10.5| + 12 × |6.5 − 10.5| + 11 × |8.5 − 10.5| + 8 × |10.5 − 10.5|
+8 × |13 − 10.5| + 4 × |17 − 10.5| + 6 × |24.5 − 10.5|
+6 × |58 − 10.5|]/100
= 830.5/100
= 8.3 miles
The data can be summarised (the median is 6.5 miles with a mean absolute deviation
of 8.3 miles). The interquartile range could be substituted for the MAD.
Since the interesting feature is the extremely long distance travelled by a few, an
alternative summary could be more ad hoc (the median is 6.5 miles, but a quarter of
the workers travelled more than 12 miles; i.e. a verbal summary is being used to
describe the skewed nature of the data).
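
The algebra for the open-ended class and the MAD can be confirmed by computer; a minimal Python sketch, not part of the course text:

    mids = [0, 1, 2, 3, 4.5, 6.5, 8.5, 10.5, 13, 17, 24.5]  # known class mid-points
    freqs = [3, 7, 9, 10, 16, 12, 11, 8, 8, 4, 6]           # percentages of workers

    # Solve for x, the mid-point of the open-ended 30+ class (frequency 6),
    # given that the overall mean is 10.5 miles.
    known_total = sum(m * f for m, f in zip(mids, freqs))   # 701.5
    x = (10.5 * 100 - known_total) / 6                      # about 58

    mids.append(x)
    freqs.append(6)
    mad = sum(f * abs(m - 10.5) for m, f in zip(mids, freqs)) / 100
    print(round(x), round(mad, 1))                          # 58, 8.3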


Marking Scheme (out of 20) Marks


Correct calculation of the mode 2
Correct calculation of the median 2
Understanding the defects of mode and mean and choosing the 4
median to summarise location
Using the mean to calculate the average distance applicable to the 2
30+ class
Correct calculation of MAD (or correct calculation of variance or 2
standard deviation if chosen)
Correct calculation of interquartile range 2
A final summary using (a) median, (b) interquartile range or MAD, or 6
an ad hoc statement reflecting the long ‘tail’ of the data
Total 20

Case Study 5.5: Petroleum Products


1 Index for 2012 = 100 for both types of index.
(a) Simple aggregate index:
For 2013: Index = (Sum of 2013 prices/Sum of 2012 prices) × 100
= 108.2
For 2014: Index = (Sum of 2014 prices/Sum of 2012 prices) × 100
= 106.4
(b) Weighted aggregate index (2012 quantities as weights):
For 2013: Index = Σ(2013 price × 2012 quantity)/Σ(2012 price × 2012 quantity) × 100
= 104.8
For 2014: Index = Σ(2014 price × 2012 quantity)/Σ(2012 price × 2012 quantity) × 100
= 105.6

Summary 2012 2013 2014


Simple index 100 108.2 106.4
Weighted index 100 104.8 105.6

(c) It does matter which index is used since between 2013 and 2014 the indices
move in different directions. The weighted index must be the better one since it
allows for the fact that very different quantities of each product are purchased.


The importance of the weighting is twofold. First, petroleum products are all
made from crude oil and are to some extent substitutable as far as the producer
is concerned. An index should reflect the average price of every gallon of petro-
leum product purchased. Only the weighted index does this. Secondly, products
such as car petrol, of which large quantities are purchased, will have a bigger
effect on the general public than products, such as kerosene, for which only
small amounts are purchased. Again, the index should reflect this.
(d) A differently constructed index would use different weightings. The price part of
the calculation could be changed by using price relatives but this would have
little effect since the prices are close together.
The different weightings that could be used are:
(i) Most recent year quantity weighting. But this would imply a change in his-
torical index values every year.
(ii) Average quantity for all years’ weighting. This would not necessarily mean
historical changes every year. The average quantities for 2012–14 could be
used in the future. This has the advantage that it guards against the chosen
base year being untypical in any way.
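
To make the two constructions concrete, here is a Python sketch with purely hypothetical prices and quantities (the real figures are in the case table and are not reproduced here):

    # Hypothetical data for three petroleum products, base year 2012
    prices_2012 = [1.00, 1.10, 0.90]
    prices_2013 = [1.08, 1.20, 0.97]
    quantities_2012 = [100, 20, 5]       # base-year quantities as weights

    def simple_index(prices, base_prices):
        return sum(prices) / sum(base_prices) * 100

    def weighted_index(prices, base_prices, weights):
        num = sum(p * q for p, q in zip(prices, weights))
        den = sum(p * q for p, q in zip(base_prices, weights))
        return num / den * 100

    print(simple_index(prices_2013, prices_2012))
    print(weighted_index(prices_2013, prices_2012, quantities_2012))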

Marking Scheme (out of 20) Marks


(a) Simple index (1 mark off for calculating error) 5
(b) Weighted index (1 mark off for calculating error) 5
(c) Which to use 5
(d) Other indices
– Not changing the price element 1
– Sensible variations in the weighting method 4
Total 20

Module 6

Review Questions
6.1 The correct answer is C. Sampling is necessary because it is quicker and easier than
measuring the whole population while little accuracy is lost. Statement A is not true
because it is not always impossible to take population measurements, although it is
usually difficult. Statement B is untrue because sampling is always less accurate since
fewer observations/measurements are made.
6.2 The correct answer is A. Many sampling methods are based on random selection for
two reasons. First, it helps to make the sample more representative (although it is
unlikely to make it totally representative). Second, it enables the use of statistical
procedures to calculate the range of accuracy of any estimates made from the
sample. A is therefore a correct reason, while B, C and D are incorrect.
6.3 The correct answer is B. Starting top left, the first number is 5; therefore, the first
region chosen is SE England. Moving across the row, the second number is 8 and


the corresponding region is Scotland. The third number is 5, which is ignored, since
SE England is already included. In this situation, having the same region repeated in
the sample would not make sense. Consequently, we sample without replacement.
The fourth number is 0 and is also ignored since it does not correspond to any
region. The fifth number is 4 and so London completes the sample.
6.4 The correct answer is B, C. Multi-stage sampling has two advantages over simple
random sampling. The population is divided into groups, then each group into
subgroups, then each subgroup into subsubgroups, etc. A random sample of groups
is taken, then for each group selected a random sample of its subgroups is selected
and so on. Therefore, it is not necessary to list the whole population and advantage
B is valid. Since the observations/measurements/interviews of sample elements are
restricted to a few sectors (often geographical) of the population, time and effort
can be saved, as, for example, in opinion polls. Advantage C is therefore also valid.
Multi-stage sampling is solely a way of collecting the sample. Once collected, the
sample is treated as if it were a simple random sample. Its accuracy and the observa-
tions required are therefore just the same as for simple random sampling. Reasons A
and D are false.
6.5 The correct answer is A, C. In stratified sampling the population is split into
sections on the basis of some characteristic (e.g. management status in the absentee-
ism survey). The sample has to have the same sections in the same proportions as
the population (e.g. if there are 23 per cent skilled workers in the population, the
sample has to have 23 per cent skilled workers). In respect of management status,
therefore, the sample is as representative as it can be. In respect of other characteris-
tics (e.g. length of service with the company) the stratified sample is in the same
position as the simple random sample (i.e. its representativeness is left to chance).
Thus a stratified sample is usually more representative than a simple random one
but not necessarily so. Statement A is true.
A cluster sample can also be stratified by making each cluster have the same
proportions of the stratifying characteristics as the population. Statement B is
untrue.
If a total sample of 100 is required, stratification will probably mean that more than
100 elements must be selected. Suppose 23 per cent skilled workers are required and
that, by the time the sample has grown to 70, 23 skilled workers have already been
selected. Any further skilled workers chosen cannot be used. To get a sample of
100, therefore, more than 100 elements will have had to be selected. Only if the cost
of selection is negligible will a stratified sample be as cheap as a simple random
sample. Statement C is true.
6.6 The correct answer is B. Use of a variable sampling fraction means that one section
of the population is deliberately over-represented in the sample. This is done when
the section in question is of great importance. It is over-sampled to minimise the
likelihood of there being error because only a few items from this section have been
measured. Incidentally, D describes weighted sampling.
6.7 The correct answer is A, B, C. Non-random sampling is used in all three situations.


Convenience sampling may be used because random sampling is impossible (e.g.
with a blood sample). Systematic sampling and convenience sampling both some-
times give a sample that is as good as random (e.g. every tenth name from an
alphabetical list). Quota sampling is used to overcome interviewer bias.
6.8 The correct answer is C. The essential difference between stratification and quota
sampling is that the strata in the sample are intended to mimic the population. The
quota sizes can be set according to any criteria deemed suitable.
While quota sampling is usually seen in connection with interviewing, this is not an
essential characteristic. Reason A, therefore, does not always apply. Similarly, B does
not name an essential difference. For example, an interviewer could have used a
random method to make selections within the quota.
6.9 The correct answer is A, C, F. The forest was split into geographical areas because
of the difficulty of listing each tree individually. A random sample of areas was
taken: area sampling was therefore used. Since the areas were classified as sloping or
level and since the final sample had 20 per cent sloping areas, 80 per cent level, just
like the population, stratification was therefore also used. Since sampling was at two
levels, areas and trees, multi-stage sampling was used.
6.10 The correct answer is A. Accuracy is proportional to the square root of the sample
size. In the first case, the sample size is 25 (√25 = 5); in the second case, the sample
size is 400 (√400 = 20). In the second case the accuracy is 20/5 = 4 times greater.
The average height is thus measured to ±12/4 = ±3 cm.

Case Study 6.1: Business School Alumni


1 A question about the randomness of a sample can only have meaning after the
population it is supposed to represent has been clearly defined. In this case, the
population is the set of all graduates of the school. Do the 1200 usable replies
constitute a random sample of this population?
The sample is not random. Two alphabetical lists are sampled systematically. Every
twentieth name is taken from each after a random start. Although this selection is
random, non-randomness comes in the following ways:
(a) The two lists do not constitute the entire population. There are other graduates
whose addresses are not known. The reasons for their addresses being unknown
are many. Perhaps they did not like the school and did not want to maintain
contact, or they have moved around a lot, or they feel their careers have not
been as successful as those of their peers. The omission of these will bias the
sample.
(b) Not everyone who was sent a questionnaire replied. Only 1200 replies were
received. (We do not know how many were sent but questionnaires often have
response rates as low as 10 or 20 per cent.) What are the reasons for not reply-
ing? Again, they could be many, including a reluctance to report failure, or
relative failure. The omission of these non-replying graduates from the sample is
likely to result in bias.


Marking Scheme (out of 10) Marks


Marks awarded for answering
– that the selection procedure is random 2
– that the whole population has not been sampled 2
– that the omission of graduates whose addresses are not known 2
could lead to bias
– that non-response results in non-randomness 2
– that the reasons for non-response mean the sample is likely to be 2
biased
Total 10

Case Study 6.2: Clearing Bank


1
(a) Even when all the records are computerised, it is still time-consuming and
expensive to analyse the accounts of several million customers. In addition, with
14 regions each with its own computer, a substantial amount of work is needed
to bring the results of the regions together in compatible form. Sampling cuts
down the accounts to be analysed and the regions to be dealt with.
(b) Two factors affect sample size: the accuracy required and the cost of collecting
the sample. One needs to look at the purposes to which the information is to be
put and try to establish minimum accuracy levels, e.g. ±£75 for average account
profitability, ±40 transactions for average transaction volume, etc. Each accu-
racy implies a sample size. The largest of these would be the minimum sample
size. The cost of collecting this sample is then estimated and tested against the
budget. By iterative procedures, one hopes to arrive at a sample size through a
consensus of those involved that balances accuracy and expense.
(c) Multi-stage sampling is recommended. Regions are sampled, then branches, then
customers. The regions are chosen at random. At the stage of choosing branch-
es, stratified sampling is used so that the sample contains the same percentage of
each branch size as the population. Systematic sampling can be used for the
account holders. Since they are listed in chronological order, a systematic sample
will in effect be random. The sampling scheme would be as follows:
(i) By a simple random method select three of the 14 regions. The choice of
three is arbitrary to some extent. However, one or two would probably
give an insufficient geographical coverage, whereas more than three
means that one is perhaps dealing with more regions than necessary.
(ii) For each of the three regions, list its branches in three groups according
to size. Select by a simple random or systematic method, two large
branches, two medium and two small. The sample now has similar pro-
portions to the population as shown below.


Size
Small Medium Large
Sample % 33 33 33
Population % 34 34 32

In total we have a sample of 18 branches. As with choosing three regions,
the six branches per region are chosen to give representation without be-
ing unwieldy.
(iii) For each of the 18 branches, obtain a computer listing of chequebook ac-
counts. If a sample of accounts was taken such that an equal number,
2000/18 = 111, was taken from each of the 18 lists, then large branch
customers would be under-represented (since there are more customers at
large branches than small). To overcome this problem, the sample taken
from each size of branch must be in proportion to the number of ac-
counts at that size of branch. Since large branches average 5000 accounts,
medium 2500 and small 1000, accounts from large branches must be
5000/(5000 + 2500 + 1000) of the total (i.e. 10/17). Similarly, medium
branches should be 5/17, and small 2/17. Since 2000/3 = 667 accounts
come from each region:
10/17 × 667 = 392 should be from large branches (i.e. 196 per branch);
5/17 × 667 = 196 should be from medium branches (i.e. 98 per branch);
2/17 × 667 = 78 should be from small branches (i.e. 39 per branch).
Because of rounding, the total sample size is now 1998. The choice of
2000 for the total sample size was not sufficiently precise for this discrep-
ancy to matter.
From the computer list for each branch, select by systematic sampling a
sample of the appropriate size. For example, if a large branch had 6000
accounts in all, then, after a random start, every thirtieth name would be
taken to give a sample of the right size.
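The allocation arithmetic in (iii) generalises easily; a brief Python sketch (my own illustration, using only the figures given above):

    avg_accounts = {'large': 5000, 'medium': 2500, 'small': 1000}
    per_region = 2000 // 3                # 667 accounts from each region
    total = sum(avg_accounts.values())    # 8500, i.e. the 17 units of 500

    for size, accounts in avg_accounts.items():
        sample = accounts / total * per_region
        # two branches of each size per region, so halve for the per-branch figure
        print(size, round(sample), round(sample / 2))
    # large 392 196, medium 196 98, small 78 39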
(d) To compare the profitability of the accounts of customers from different
socioeconomic groups requires the stratification of the customers at stage (iii)
(above) of the procedure. Suppose 10 per cent of the bank’s chequebook cus-
tomers are known to be of the AB socioeconomic group: 10 per cent of each
sample at branch level must be from this group (i.e. 20 of the large bank sample,
ten of the medium and four of the small). The sampling would continue until
each socioeconomic category had the correct number of accounts.
In order to do stratified sampling of this type, two sets of information are re-
quired:
(i) The percentages of customers in each socioeconomic group (e.g. ten
per cent of all the bank’s customers are from the AB group, etc.). A
further sampling exercise might provide this information.
(ii) Knowledge of the occupation of each customer. Socioeconomic classi-
fication can be done from other information, but in any case the ability
to carry out this last part of the exercise does depend upon the availa-
bility of some personal details for each customer.
(e) The major practical problems are likely to be:


(i) Some small percentage of accounts will have been closed between the
time of compilation of the computer listings and use of the infor-
mation, thus reducing the sample size and the accuracy. The original
sample may have to be increased to allow for this.
(ii) Some accounts will be dormant (i.e. open but rarely, or never, used). A
decision to include or exclude these accounts must be made. It is usual-
ly made after consideration of the purposes to which the information is
being put.
(iii) To know the occupation of customers requires visiting branches, since
this information may not be computerised. This is time-consuming and
requires the establishment of a working relationship with the bank
manager. It also requires permission to breach the confidentiality be-
tween customers and manager.
(iv) The personal details may well be out of date, since such information is
only occasionally updated.
(v) It may not be possible to classify some account holders into a socioec-
onomic group. For example, the customer may have been classified as a
schoolchild seven years ago. This is a problem of non-response. Omit-
ting such customers from the sample will lead to bias. Extra work must
be done to find the necessary details.

Marking Scheme (out of 20) Marks


(a) Reasons for taking sample
– Reducing effort/computer time 1
– Reducing complication of dealing with all regions 1
(b) Factors influencing sample size
– Accuracy required by user 1
– Cost of sampling 1
(c) For suggesting
– Multi-stage sampling (first regions, then branches, then customers) 2
– Stratification of branches 2
– Systematic sampling of accounts (1 mark if simple random sug- 2
gested)
For getting the numbers right
– 2–4 regions, 10–30 branches 1
– Different sample size from different size of branch 1
(d) For suggesting stratification by socioeconomic group 2
For realising extra information is required
– % customers in each group 1
– Occupation of account holder 1
(e) Practical difficulties
– Need to allow for closed or dormant accounts 1


– Additional order of difficulty in establishing socioeconomic 1
groups (i.e. branches have to be visited)
– Out-of-date personal details 1
– Non-response problem where account holders are classified into 1
socioeconomic groups
Total 20

Module 7

Review Questions
7.1 The correct answer is B. The probabilities are calculated from the formula:
P(data class X) = Frequency of observations in class X/Total number of observations
This is the relative frequency method.
7.2 The correct answer is A. Even though the situation that generates the data appears
to be normal, the distribution is still observed because the probabilities were
calculated from frequencies. Had the data been used to estimate parameters from
which probabilities were to be found in conjunction with statistical tables, the
resulting distribution would have been normal.
7.3 The correct answer is False. The amounts owed are not strictly a continuous
variable since they are measured in currency units. They are not measured, for
example, in tiny fractions of pennies.
7.4 The correct answer is A. The addition rule for mutually exclusive events gives:
(no more than 1 cancellation) = (0 or 1 cancellation)
= (0) + (1)
= 32% + 29%
= 61%
7.5 The correct answer is C. The multiplication rule gives:
(0 cancellations on day 1 and 0 on day 2) = 0.32 × 0.32
= 0.102 i. e. 10.2%
7.6 The correct answer is C. The basic multiplication rule as given relates to
independent events. They may not be independent. For example, spells of bad
weather are likely to prevent patients attending. The fact that there were no cancella-
tions on day 1 might indicate a spell of good weather and a higher probability of no
cancellations on the following day.
7.7 The correct answer is D. The number of ways of choosing three objects from eight
is 8C3:
8C3 = 8!/(3! × 5!) = (8 × 7 × 6)/(3 × 2 × 1) = 8 × 7 = 56
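In Python the same count is available directly (math.comb was added in Python 3.8):

    import math
    print(math.comb(8, 3))  # 56, i.e. 8!/(3! × 5!)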
7.8 The correct answer is B. Knowledge about standard distributions (e.g. for the
normal ±2 standard deviations covers 95 per cent of the distribution) is available
rather than it having to be calculated direct from data.


A is not correct since some small amount of data has to be collected to check that
the standard distribution is applicable and to calculate parameters.
C is not necessarily correct. Standard distributions are approximations to actual
situations and may not lead to greater accuracy.
7.9 The correct answer is B. The population is split into two types: watchers and non-
watchers. A random sample of 100 is taken from this population. The number of
watchers per sample therefore has a binomial distribution.
7.10 The correct answer is True. Since the programme is described as being popular, the
proportion of people viewing (p) is likely to be sufficiently high (perhaps about 0.3)
so that np and np(1 − p) are both greater than 5. The normal approximation to the
binomial can therefore be applied.
7.11 The correct answer is D. The population can be split into two types: those that have
heard and those that have not. A random sample of five is taken from this popula-
tion. The underlying distribution is therefore binomial with p = 0.4 and n = 5. The
binomial formula is:
P(r of type 1) = nCr × p^r × (1 − p)^(n−r)
Thus:
P(1 person has heard of the chocolate bar) = 5C1 × 0.4 × 0.6^4
= 5 × 0.4 × 0.6^4
= 2 × 0.1296
= 0.26 (approx.)
Or, Table A1.1 could have been used.
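A quick check of this binomial probability in Python (an illustrative sketch, not part of the course text):

    from math import comb

    n, r, p = 5, 1, 0.4
    prob = comb(n, r) * p**r * (1 - p)**(n - r)
    print(round(prob, 4))  # 0.2592, i.e. 0.26 to two decimal places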
7.12 The correct answer is A. The average per clerk per day is 190. There are 12 clerks.
The total per day is therefore 190 × 12 = 2280.
7.13 The correct answer is C. Since the distribution is normal, 68 per cent of clerks will
clear a number of dockets in the range:
Mean ±1 standard deviation
= 190 ±25
= 165 to 215
32 per cent of clerks will clear a number of dockets outside this range. Since a
normal distribution is symmetrical, half of these (16 per cent) will clear fewer than
165. Likewise, 16 per cent will clear more than 215.
16% of 12 = 1.92
Approximately two clerks will clear more than 215 dockets per day.
7.14 The correct answer is A. There is a 95 per cent probability that the number of
dockets cleared by any clerk on any day will lie within the range covered by 95 per
cent of the distribution. This range is, for a normal distribution, the mean ±2
standard deviations.
The range is 190 ± 2 × 25 = 140 to 240.


Case Study 7.1: Examination Grades


1 Since the normal distribution is continuous, a grade of 85 is represented by the
range 84.5–85.5. We are therefore looking for the probability of a mark exceeding
85.5.
(a) This is 15.7 away from the mean (85.5–69.8). In terms of standard deviations,
this is:
z = 15.7/11.6
= 1.35
From Table A1.2 and Figure A4.6, for z = 1.35 the shaded area A is 0.4115.
Area B = 0.5 − 0.4115
= 0.0885
Therefore 8.85 per cent of students should exceed a mark of 85 per cent.

[Figure: normal curve, with Area A shaded between the mean and z, and tail Area B beyond z]

Figure A4.6 Examination grades


(b) 40 (represented by 39.5–40.5) is 30.3 (69.8 − 39.5) away from the mean, so:
z = −30.3/11.6
= −2.61
Referring to the normal curve Table A1.2 in Appendix 1, the area between the
mean and z = −2.61 is 0.4955. Therefore 0.5 − 0.4955 = 0.0045, or 0.45 per cent
of students should get less than 40.
(c) 50 (represented by 49.5–50.5) is 20.3 (69.8 − 49.5) away from the mean, so:
z = −20.3/11.6
= −1.75
Referring to the normal curve Table A1.2 in Appendix 1, the area between the
mean and z = −1.75 is 0.4599. The proportion of students failing is 0.5 − 0.4599
= 0.0401. In a class of 180, this means 0.0401 × 180 = 7 students.
(d) If 8 per cent of students are awarded distinctions, then the lowest distinction
mark corresponds to the z value associated with the shaded area in Figure A4.7
being equal to 0.42. From Table A1.2 in Appendix 1, z = 1.40 corresponds to an
area = 0.4192 and z = 1.41 corresponds to an area = 0.4207. The z value being
sought can therefore be estimated at 1.405.
The lowest distinction mark is 69.8 + 1.405 × 11.6 = 86.1. Since a mark of 86
per cent is in the range 85.5–86.5, if no more than 8 per cent of the class are to
be awarded distinctions, the lowest distinction mark is 87 per cent.


Figure A4.7 Examination grades: z value for distinction students (upper tail area 8%, z = 1.405)
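The four parts of this case can be checked numerically with a short Python sketch (standard library only; statistics.NormalDist requires Python 3.8 or later). Exact answers differ slightly from the table-based working above because Table A1.2 rounds z to two decimal places:

```python
from statistics import NormalDist

grades = NormalDist(mu=69.8, sigma=11.6)

print(1 - grades.cdf(85.5))        # (a) ~0.088: proportion scoring above 85
print(grades.cdf(39.5))            # (b) ~0.0045: proportion scoring below 40
print(grades.cdf(49.5) * 180)      # (c) ~7 students failing out of 180
print(grades.inv_cdf(0.92))        # (d) ~86.1: mark cutting off the top 8%
```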

Marking Scheme (out of 20) Marks


(a) Method 4
Correct calculation 1
(b) Method 4
Correct calculation 1
(c) Method 4
Correct calculation 1
(d) Method 4
Correct calculation 1
Total 20

Case Study 7.2: Car Components


1
(a) The population from which samples of size 6 are being taken is the entire
production of these components. The population is split into two types, defec-
tive and non-defective. The binomial distribution is likely to apply to this
situation. The procedure is to assume that the process is operating with 10 per
cent defectives and to ascertain whether the evidence is consistent with this as-
sumption. Thus:
Binomial distribution applies with:
p = 0.10
n=6
From Table A1.1:
P(0 defective) = 0.5314
P(1 defective) = 0.3543
P(2 defective) = 0.0984
P(3 defective) = 0.0146
P(4 defective) = 0.0012


P(5 defective) = 0.0001


P(6 defective) = 0
To see whether the results are consistent with an overall 10 per cent defective
rate, the observed results from the 100 samples are compared with the theoreti-
cally expected results calculated above.

Number of defectives 0 1 2 3 4 5 6
Observed no. of samples 52 34 10 4 0 0 0
Theoretical no. of samples 53 35 10 1 0 0 0

There is a close correspondence. The results are consistent with a process defec-
tive rate of 10 per cent. Note that because of rounding the theoretical numbers
of samples do not add up to 100.
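A brief Python sketch (standard library only; the variable names are ours) reproduces the theoretical row of this comparison:

```python
from math import comb

n, p = 6, 0.10
observed = [52, 34, 10, 4, 0, 0, 0]   # samples containing 0..6 defectives

for r in range(n + 1):
    expected = 100 * comb(n, r) * p**r * (1 - p)**(n - r)
    print(r, observed[r], round(expected))   # e.g. 0: 52 vs 53, 1: 34 vs 35
```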
(b) A first reservation concerning this conclusion is whether the samples were taken
at random. If the samples were only taken at particular times, say at the start of a
shift, it might be that starting-up problems meant that the defective rate at this
time was high. The results would then suggest the overall rate was higher than it
actually is.
Second, if the samples that contain defectives were mostly towards the end of
the time period during which the samples were collected, this might indicate that
the process used to have a defective rate less than 10 per cent but had deteriorat-
ed.
Third, the fact that there are more samples with three defectives than expected,
and fewer with zero and one defective, suggests greater variability in the process
than expected. This might occur because p is not constant at 10 per cent but
varies throughout the shift. The antidote to this problem is either to split the
shift into distinct time periods and take samples from each or to use a more
sophisticated distribution called beta-binomial, which allows for variability in p
and which will be described in a later module.

Marking Scheme (out of 20) Marks


(a) Use of correct parameters 3
Correct use of table 4
Comparison of observed/theoretical 4
(b) Random samples 3
Time differences 3
Variability in p 3
Total 20


Case Study 7.3: Credit Card Accounts


1 Expenditure is normally distributed with mean £280 and standard deviation £90.
(a) For £200, z = (200 − 280)/90 = −0.89.
From Table A1.2 in Appendix 1, the area corresponding to z = 0.89 is 0.3133.
Therefore, P(expenditure ≤ £200)
= 0.5 − 0.3133
= 0.1867
Consequently, approximately 19 per cent of clients are likely to spend less than
£200 per month.
(b) For £300, z = (300 − 280)/90 = 0.22.
From Table A1.2 in Appendix 1, the area corresponding to z = 0.22 is 0.0871.
The range of expenditure £200 to £300 straddles the mean, with z = −0.89 on
one side and z = 0.22 on the other. Thus, there is an area on either side of the
mean and they must be added together.
Therefore, P(expenditure £200–£300)
= 0.3133 + 0.0871
= 0.4004
Approximately 40 per cent of clients are likely to spend between £200 and £300
per month.
(c) For £400, z = (400 − 280)/90 = 1.33.
From Table A1.2, Appendix 1, the area corresponding to z = 1.33 is 0.4082.
This area includes part of the previous expenditure class, that between £280 and
£300.
Therefore, P(expenditure £300–£400)
= 0.4082 − 0.0871
= 0.3211
Approximately 32 per cent of clients are likely to spend between £300 and £400
per month.
(d) The area from the mean up to £400 has already been found to be 0.4082.
Therefore, P(expenditure exceeds £400)
= 0.5 − 0.4082
= 0.0918
Approximately 9 per cent of clients are likely to spend more than £400 per
month.
As a check, the percentages associated with the four expenditure classes should sum
to 100.
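As a numerical cross-check, a short Python sketch using the standard library's NormalDist reproduces the four class probabilities and confirms that they sum to one:

```python
from statistics import NormalDist

spend = NormalDist(mu=280, sigma=90)

p_a = spend.cdf(200)                   # below £200,   ~0.19
p_b = spend.cdf(300) - spend.cdf(200)  # £200 to £300, ~0.40
p_c = spend.cdf(400) - spend.cdf(300)  # £300 to £400, ~0.32
p_d = 1 - spend.cdf(400)               # above £400,   ~0.09

print(p_a, p_b, p_c, p_d, p_a + p_b + p_c + p_d)  # the four sum to 1
```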


Marking Scheme (out of 20) Marks


(a) Method 3
Calculation 2
(b) Method 3
Calculation 2
(c) Method 3
Calculation 2
(d) Method 3
Calculation 2
Total 20

Case Study 7.4: Breakfast Cereals


1 The binomial distribution applies since random samples are being taken from a
population split into two types. The parameters are n = 20 and p = 0.45. From the
binomial probability Table A1.1 in Appendix 1, the probability of there being any
given number of regular users in a sample can be found.

P(0 users) = 0
P(1 user) = 0.0001
P(2 users) = 0.0008
P(3 users) = 0.0040
P(4 users) = 0.0139 ← Total so far 0.0188
P(5 users) = 0.0365
P(6 users) = 0.0746
P(7 users) = 0.1221
P(8 users) = 0.1623
P(9 users) = 0.1771
P(10 users) = 0.1593
P(11 users) = 0.1185
P(12 users) = 0.0727
P(13 users) = 0.0366
P(14 users) = 0.0150 ← Total from here to end 0.0214
P(15 users) = 0.0049
P(16 users) = 0.0013
P(17 users) = 0.0002
P(18+ users) = 0

From this table and using the subtotals:


P(5 to 13 users inclusive) = 1.0 − 0.0188 − 0.0214
= 0.96 approx.


Therefore, there is a 96 per cent probability that a sample will contain from five to
13 regular users of the breakfast cereal. If many samples are taken, it is likely that 96
per cent of them will contain five to 13 users. Because consumers are counted in
whole numbers, there is no range of users equivalent to the 95 per cent requested in
the question.
The question has been answered, but it has been a lengthy process. Since np (= 9)
and n(1 − p) (= 11) are both greater than five, the binomial can be approximated by
the normal. The parameters are:
Arithmetic mean = np = 9
Standard deviation = √(np(1 − p))
= √4.95
= 2.22
For a normal distribution, 95 per cent of it lies between ±2 standard deviations.
Therefore, 95 per cent of samples are likely to be between:
9 − (2 × 2.22) and 9 + (2 × 2.22)
4.56 and 13.44
Recall that, because a discrete distribution is being approximated by a continuous
one, a whole number with the binomial is equivalent to a range with the normal. For
example, five users with the binomial corresponds to the range 4.5 to 5.5 with the
normal. In the above calculations, the range 4.56 to 13.44 covers almost (but not
quite) the range five to 13 users.
Approximately, therefore, 95 per cent of the samples are likely to have between five
and 13 users inclusive. This is the same result as obtained by the lengthier binomial
procedure.
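Both routes to the answer can be verified with a few lines of Python (standard library only). The sketch computes the exact binomial sum and the normal approximation with the continuity correction described above:

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 20, 0.45

# exact binomial: P(5 to 13 users inclusive)
exact = sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(5, 14))

# normal approximation; 5 to 13 users corresponds to 4.5 to 13.5
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
print(exact, approx.cdf(13.5) - approx.cdf(4.5))   # both ~0.96
```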

Marking Scheme (out of 20)


If binomial used Marks
– Correct parameters 4
– Use of table 6
– 95% range 6
– Know that normal is an alternative 4
Total 20

If normal approximation used Marks


– Know basic situation is binomial 4
– Applying np > 5 rule 3
– Calculation of parameters 6
– Continuity correction 3
– 95% range 4
Total 20


Module 8

Review Questions
8.1 The correct answer is C. Statistical inference uses sample information to make
statements about populations. The statements are in the form of estimates or
hypotheses.
8.2 The correct answer is C. Inference is based on sample information. Even though a
sample is random, it may not be representative, and therefore there is some chance
that the inference may be incorrect. The other statements are true but they are not
the reasons for using confidence levels.
8.3 The correct answer is A. The mean of the sample = (7 + 4 + 9 + 2 + 8 + 6 + 8 + 1
+ 9)/9 = 6
The variance = [(7 − 6)² + (4 − 6)² + (9 − 6)² + (2 − 6)² + (8 − 6)² + (6 − 6)² +
(8 − 6)² + (1 − 6)² + (9 − 6)²]/(9 − 1) = 9
The standard deviation is:
√9 = 3
The standard deviation of the distribution of means of sample size 9 is:
3/√9 = 1
8.4 The correct answer is B. The point estimate of the population mean is simply the
sample mean.
8.5 The correct answer is B. The point estimate of the mean is 6. The 90 per cent
confidence limits are (from normal curve tables) 1.645 standard errors either side of
the point estimate. The limits are 6 ± 1.645 × 1 = 4.4 to 7.6 (approximately).
8.6 The correct answer is A. The 95 per cent confidence limits cover a range of 2
standard errors on either side of the mean. A standard error is 150/√n where n is
the sample size. Thus
20 = 2 × 150/√n
1 = 15/√n
√n = 15
n = 225
8.7 The correct answer is False. Sample evidence does not prove a hypothesis. Because
it is from a sample, it merely shows whether the evidence is statistically significant or
not.
8.8 The correct answer is A. The tester decides on the significance level. He or she may
choose whatever value is thought suitable but 5 per cent has come to be accepted as
the convention. The other statements are true but only after 5 per cent has been
chosen as the significance level.
8.9 The correct answer is True. Critical values are an alternative approach to
significance tests and can be used in both one- and two-tailed tests.
8.10 The correct answer is A. The standard error of the sampling distribution is 6 (=
48/√64). There is no suggestion that any deviation from the hypothesised mean of 0


could be in one direction only. Therefore the test is two-tailed. At the 5 per cent
level the critical values are 2 standard errors either side of the mean (i.e. at −12 and
12). Since the observed sample mean is 9.87, at the 5 per cent level the hypothesis is
accepted. At the 10 per cent level the critical values are 1.645 standard errors from
the mean (i.e. at −9.87 and 9.87). At the 10 per cent level the test is inconclusive.
8.11 The correct answer is C. Since the test is one-tailed at the 5 per cent level, the
critical value is 1.645 standard errors away from the null hypothesis mean. The
critical value is therefore 9.87. For the alternative hypothesis the z value of 9.87 is
−1.69 (= (9.87−20)/6). The corresponding area in normal curve tables is 0.4545.
Since the null hypothesis will be accepted (and the alternative rejected even if true)
when the observed sample mean is less than 9.87, the probability of a type 2 error is
0.0455 (= 0.5 − 0.4545; i.e. 4.55 per cent).
8.12 The correct answer is D. The power of the test is the probability of accepting the
alternative hypothesis when it is true. The power is therefore:
1.0 − P(type 2 error) = 95.45 per cent.
8.13 The correct answer is C. The test is to determine whether the plea has met its
objective by bringing about an increase of £2500 per month (i.e. whether the
average increase in turnover per branch is £2500 per month). This is equal to £7500
over a three-month period.
8.14 The correct answer is B. The samples being compared are the turnovers before and
after the plea. They are paired in that the same 100 branches are involved. Each
turnover in the first period is paired with the turnover of the same branch in the
second period.
The test is one-tailed. In the circumstances that the plea was well exceeded the
hypothesis would not be rejected. One would only say that the plea had not
succeeded if the observed increase was significantly less than £7500, but not if
significantly more. Therefore only one tail should be considered. The test should
thus be one-tailed, based on paired samples.
8.15 The correct answer is False. The procedure described relates to an unpaired sample
test, not a paired test. A paired test requires a new sample formed from the differ-
ences in each pair of observations.

Case Study 8.1: Food Store


1 The distribution of the means of samples of size 250 is a normal distribution. In
spite of the fact that the distribution of individual amounts due appears (from the
range) to be skewed, the central limit theorem states that the sampling distribution
will be normal. Therefore, at the 95 per cent confidence level, the mean of the one
sample collected must lie within two standard errors of the true population mean
(the average amount outstanding for all overdue accounts). The standard error of
this sampling distribution is equal to:
σ/√n = 95/√250
= 95/15.8
= 6 (approximately)


The mean of the one sample (£186) must lie within 12 (= 2 × 6) of the true
population mean at the 95 per cent confidence level. Consequently, the true
population mean must be in the range 186 ± 12 = £174 to £198.
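A short Python sketch (standard library only; variable names ours) reproduces the interval:

```python
from math import sqrt

sample_mean, sigma, n = 186, 95, 250
se = sigma / sqrt(n)                               # standard error ~6
print(sample_mean - 2 * se, sample_mean + 2 * se)  # ~174 to ~198
```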

Marking Scheme (out of 10) Marks


Central limit theorem 2
Standard error = σ/√n 2
Calculation of standard error 2
95 per cent confidence limits = ±2 standard errors 2
Calculation of final result 2
Total 10

Case Study 8.2: Management Association


1 This is a test of the hypothesis that the new course is no different from the old and
will produce an overall average test score of 242, just as before.
The sample size is 16, but the usually required 30 is not necessary in this case.
First, since the individual distribution of scores was (and is assumed still to be)
normal, the sampling distribution of the mean will be normal whatever the sample
size. Also, since the standard deviation is assumed still to be 52, unchanged from
before, it is not being estimated from the sample. Therefore, the second reason for
needing a sample size greater than 30 does not apply.
The test is two-tailed because the new test could differ from the old by being either
higher or lower.
The significance test follows the usual steps:
(a) The hypothesis is that the new course will still give an overall average test score
of 242.
(b) The sample evidence is the 16 people who have undergone the computer-based
course and achieved an average score of 261.
(c) Choose the conventional significance level of 5 per cent.
(d) The sampling distribution of the mean is normal, with the mean assumed to be
242 (the hypothesis) and the standard deviation equal to
52/√16 = 13
The z value for the sample result of an average score of 261 is thus:
z = (261 − 242)/13
= 19/13
= 1.46
From the normal curve table given in Appendix 1 (Table A1.2), the associated
area under the curve is 0.4279. The hypothesis is that the new course would not
change the test score. The possibility that the new course could have led to an
improvement or a deterioration was recognised. The probability of the sample


result must therefore be seen as the probability of a result as far from the mean
as z = 1.46 in either direction (a two-tailed test).
Probability of sample result = 2 × 7.21% (0.0721 = 0.5 − 0.4279)
= 14.42%
(e) This result is larger than the significance level of 5 per cent and the hypothesis
must be accepted. There is insufficient evidence to suggest that the new course
makes a significant difference to the test scores at the 5 per cent level.
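The test can be reproduced with a short Python sketch (standard library only). The two-tailed p-value of about 14.4 per cent matches the working above:

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, xbar = 242, 52, 16, 261
z = (xbar - mu0) / (sigma / sqrt(n))       # 19/13 = 1.46
p_value = 2 * (1 - NormalDist().cdf(z))    # two-tailed test
print(z, p_value)                          # ~1.46 and ~0.144: accept at 5%
```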

Marking Scheme (out of 10) Marks


Correct type of test, correct hypothesis 1
Two-tailed test 2
Knowing that a sample of 16 is adequate and the reasons 3
Standard error formula and calculation 1
Probability of sample evidence 2
Correct conclusion 1
Total 10

Case Study 8.3: Textile Company


1 The hypothesis is that the supplier is sending yarns of an acceptable mean tensile
strength and therefore that the sample of 50 has come from a population of mean
12 kg.
The test is one-tailed since only the possibility of under-strength yarns is of concern
to the management and needs to be considered.
The distribution of strengths is not known, but since the sample size exceeds 30 the
sampling distribution of the mean is normal, by the central limit theorem.
The significance test follows the usual steps:
(a) The hypothesis is that the sample comes from a population of mean 12 kg.
(b) The evidence is the sample of 50 yarns with mean 11.61 kg and standard
deviation 1.48 kg.
(c) Choose the conventional 5 per cent significance level.
(d) The standard error of the sampling distribution of the mean is:
1.48/√50 = 1.48/7.07 = 0.21
The z value of the observed sample mean is (11.61 − 12)/0.21 = −1.86.
From the normal curve table in Appendix 1 (Table A1.2), the corresponding area
is 0.4686. The probability of the sample evidence is therefore 0.0314 (= 0.5 −
0.4686) (i.e. 3.14 per cent).
(e) The probability of the sample evidence is less than the 5 per cent significance
level and the hypothesis must be rejected. It does appear that the supplier is
sending yarn of a significantly lower tensile strength.
The text suggested six possible reservations about significance tests. Only some of
them apply to this case. First, the test result is close to the accept/reject border and
is not fully convincing. It should serve as a warning to check the situation further


rather than to be the basis for contract action. Second, the assumptions underlying
the test must be met. In this case this means that the sample should have been
selected in a truly random fashion. If not, the whole basis of the test is undermined.
Third, a tensile strength of slightly less than 12 kg might be adequate for the cloth
concerned, and it might not be economic to go to the expense of ensuring the
contract is kept to the letter. On the other hand, a tensile strength of, say, 11 kg or
less might have been more serious because the quality of the cloth was noticeably
reduced. This might suggest the adoption of a test that had 11 kg as the alternative
hypothesis.

Marking Scheme (out of 15) Marks


Correct type of test and hypothesis 1
One-tailed test 2
Use of central limit theorem 2
Calculation of standard error 2
Probability of sample evidence 2
Correct conclusion 1
Reservations 5
Total 15

Case Study 8.4: Titan Insurance Company


1
(a) The data are in the form of two paired samples (each output is paired with an
output for that same salesperson in the other sample). The test should be a
paired significance test, based on a new sample formed from the difference in
output for each salesperson.
The hypothesis is that the new scheme has not increased output (i.e. the new
sample has come from a distribution of mean 0).
Because the sample is 30, the central limit theorem suggests that the sampling
distribution of the mean will be normal.
It is assumed that, since the new scheme increases incentives, only an increase in
output is possible. The test is thus one-tailed.
(b) Form the new sample by subtracting the old output from the new as in Ta-
ble A4.6.

Table A4.6 Titan Insurance: paired sample test


Salesperson   Difference d   (d − d̄)   (d − d̄)²
1 5 1 1
2 19 15 225
3 −5 −9 81
4 7 3 9
5 0 −4 16

6 13 9 81
7 −3 −7 49
8 −6 −10 100
9 −6 −10 100
10 25 21 441
11 17 13 169
12 21 17 289
13 21 17 289
14 −14 −18 324
15 −7 −11 121
16 19 15 225
17 −7 −11 121
18 −34 −38 1444
19 −7 −11 121
20 13 9 81
21 13 9 81
22 9 5 25
23 −11 −15 225
24 11 7 49
25 18 14 196
26 −19 −23 529
27 8 4 16
28 −7 −11 121
29 9 5 25
30 18 14 196
Sum   120   0   5750

Mean = 120/30 = 4
Standard deviation = √(5750/29) = 14.08
Conduct the significance test in five stages:
(i) The hypothesis is that the new sample comes from a population of mean 0.
(ii) The evidence is the new sample of mean 4 and standard deviation 14.08.
(iii) The significance level is 5 per cent.
(iv) The standard error of the sampling distribution of the mean is:
14.08/√30 = 2.57
The z value of the observed sample mean:
= (4 − 0)/2.57 = 1.56
(v) From the normal curve table in Appendix 1 (Table A1.2), the correspond-
ing area is 0.4406. The probability of such a z value is therefore 0.0594 or
5.94 per cent.


This percentage is slightly higher than the significance level. The hypothesis is
accepted (but only just). The new scheme does not give rise to a significant
increase in output.
(c) The six possible reservations listed in the text suggest:
(i) The sample should be collected at random. This means a true random-
based sampling method should have been used to select the salespeople
covering all grades, areas of the country, etc. It also means that the months
used should not be likely to show variations other than those stemming
from the new scheme. For instance, allowance should be made for seasonal
variations in sales.
(ii) Checks should be made that the structure of the test is right. For instance,
does a simple measure like the total sum assured reflect the profitability of
the company? Profitability may have more to do with the mix of types of
policy than total sum assured.
(iii) The potential cost/profit to the company of taking the right decision in
regard to the incentive scheme suggests that more effort could be put into
the significance test. In particular, a larger sample could be taken.
(iv) The balance between the two types of error should be right. It is more im-
portant to know whether the scheme is profitable than to know whether it
gives a significant increase. The test should have sufficient power to distin-
guish between null and alternative hypotheses. This is discussed below.
(d) If the alternative hypothesis is a mean increase of £5000:
(i) P(type 1 error) = significance level = 5%.
(ii) The critical value of the one-tailed test is 1.645 standard errors from the
mean. The critical value is 4.23 (= 1.645 × 2.57). For the alternative hy-
pothesis, the z value of 4.23 is 0.30 (= (4.23 − 5)/2.57) (see Figure A4.8).
From the normal curve table in Appendix 1 (Table A1.2), the correspond-
ing area is 0.1179. The null hypothesis is accepted (and the alternative
hypothesis is rejected) if the observed sample mean is less than the critical
value, 4.23. A type 2 error is the acceptance of the null hypothesis when the
alternative hypothesis truly applies. Therefore:
P(type 2 error) = 38.21%


Figure A4.8 Average output increase: type 2 error (null distribution centred on 0, alternative on 5000; critical value 4230; shaded area = P(type 2 error) = 0.3821)


(iii) The power of the test is the probability of accepting the alternative hypoth-
esis when it truly does apply. This is the complement of P(type 2 error).
Power = 100% − 38.21% = 61.79%
(e) Note that, although the null hypothesis was accepted, the alternative was more
likely to apply. Under the null hypothesis, P(sample evidence) = 5.94 per cent;
under the alternative hypothesis the z value of the observed sample mean is −0.39
(= (4 − 5)/2.57) and P(sample evidence) = 34.83 per cent. The problem is that the
power of the test is low. There is a much higher probability of a type 2 error
than a type 1 error.
To balance the situation, if P(type 2 error) = P(type 1 error) = 5 %, then the
critical value must be halfway between the means of the null hypothesis distribu-
tion and the alternative hypothesis distribution and also 1.645 standard errors
from both means. The critical value must therefore be at £2500 and (working in
thousands):
2.5 = 1.645 × Standard error
= 1.645 × 14.08/√n (where n is the sample size)
(Note that the original estimate of the standard deviation, 14.08, is still used.)
√n = (1.645 × 14.08)/2.5
n = 86 (approx.)
A sample size of 86 would be able to discriminate between hypotheses in a more
balanced way.
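Parts (d) and (e) can be cross-checked with a short Python sketch (standard library only; variable names ours):

```python
from math import sqrt
from statistics import NormalDist

s, n, alt_mean = 14.08, 30, 5.0          # working in thousands of pounds
se = s / sqrt(n)                          # ~2.57
critical = 1.645 * se                     # one-tailed 5% critical value ~4.23

# P(type 2 error): the sample mean falls below the critical value
# although the alternative hypothesis (mean 5) is true
beta = NormalDist(mu=alt_mean, sigma=se).cdf(critical)
print(beta, 1 - beta)                     # ~0.382 and power ~0.618

# balanced 5%/5% design: critical value midway between the means, at 2.5
print((1.645 * s / 2.5) ** 2)             # ~86, the required sample size
```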

Marking Scheme (out of 25) Marks


For each of parts (a) to (e) 5
Total 25


Module 9

Review Questions
9.1 The correct answer is B. A ‘natural’ measurement such as height is a typical example
of a normal distribution. Many small genetic factors presumably cause the varia-
tions. This is highly typical of the sort of situation on which the normal is defined.
9.2 The correct answer is B, C. The binomial formula, with its factorials and powers, is
more difficult to use than the Poisson. Binomial tables extend to many more pages
than the Poisson because the former has virtually one table for each sample size.
A is not a correct reason. If the situation is truly binomial but the Poisson is used,
some accuracy will be lost but the loss will be small if the rule of thumb applies.
9.3 The correct answer is B. The situation looks to be Poisson. Assume this to be the
case. The parameter is equal to the average number of accidents per month: 36/12
= 3. From the Poisson probability table (see Appendix 1, Table A1.3):
P(0 accidents) = 0.0498
P(1 accident) = 0.1494
P(2 accidents) = 0.2240
P(3 accidents) = 0.2240
P(4 accidents) = 0.1680
P(5 accidents) = 0.1008
Therefore:
P(up to and including 5 accidents)
= 0.0498 + 0.1494 + 0.2240 + 0.2240 + 0.1680 + 0.1008
= 0.9160
Therefore:
P(more than 5 accidents) = 1 − 0.9160
= 0.084
= 8% (approximately)
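A minimal Python sketch (standard library only; poisson_prob is our own helper, not a library function) reproduces this result without tables:

```python
from math import exp, factorial

def poisson_prob(r, mean):
    # P(r events) = e^(-mean) * mean^r / r!
    return exp(-mean) * mean**r / factorial(r)

mean = 36 / 12                                   # 3 accidents per month
p_up_to_5 = sum(poisson_prob(r, mean) for r in range(6))
print(1 - p_up_to_5)                             # ~0.084, i.e. about 8%
```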
9.4 The correct answer is B. The variance is estimated in essentially the same way as the
standard deviation, which is merely the square root of the variance. The same
reasoning that leads to the standard deviation having n − 1 degrees of freedom leads
to the variance having n − 1 = 24 degrees of freedom.
9.5 The correct answer is False. In addition to the conditions made in the statement, the
distribution from which the sample is taken must also be normal if the sampling
distribution of the mean is to be a t-distribution.
9.6 The correct answer is D.
Standard error = Individual standard deviation/√(Sample size)
= 95/√18
= 22.4


9.7 The correct answer is D.


Observed t = (60 − 0)/22.4 = 2.68
9.8 The correct answer is B. For this test there are 17 degrees of freedom. The test is
one-tailed since it is supposed that the fitness of the executives can only have
improved after the course. The table t value is therefore in the row for 17 degrees of
freedom and the column headed 0.05.
9.9 The correct answer is B. The observed t value is greater than the 5 per cent
significance t value. Hence it falls within the 5 per cent tail and the probability of the
sample evidence is less than 5 per cent. The fitness of the executives does show a
significant improvement.
9.10 The correct answer is True. Chi-squared is essentially the ratio between a sample
variance and the variance of the population from which it was drawn. It can
therefore be used to test hypotheses such as that described in the question, provided
the population is normal and the sample is selected at random.
9.11 The correct answer is E. The correct value is taken from the row referring to 18
degrees of freedom, and the column referring to an upper tail area 0.10. The critical
chi-squared value is 25.989.
9.12 The correct answer is False. The chi-squared, not F-distribution, is applicable in
such circumstances.
9.13 The correct answer is D. Find the entry for row 11 and column 8 in the F table. The
upper entry refers to the 5 per cent tail and the lower to the 1 per cent tail. The
correct answer is 2.95.
9.14 The correct answer is True. The observed F ratio is 96/24 = 4.0. The 1 per cent
critical F ratio is 3.80, taken from the row corresponding to 14 degrees of freedom
in the denominator and the column corresponding to 12 degrees of freedom in the
numerator. The observed F therefore does exceed the 1 per cent critical value.
9.15 The correct answer is C. Essentially the distribution is binomial with p = 0.0035 and
n = 100. For such a low value of p and large value of n, the distribution can and
would be approximated by the Poisson, which is easier to use in practice than the
binomial.

Case Study 9.1: Aircraft Accidents


1
(a) The situation looks to be Poisson. The sample is the 400-day period. The events
are the incidents. It is possible to count how many incidents have taken place,
but not how many incidents might have taken place but did not. There is thus a
good a priori case for supposing that the Poisson will apply.
To test this, the observed results from 100 aircraft must be compared with what
would be expected theoretically under the assumption of a Poisson distribution.
In order to use the Poisson Table (see Appendix 1, Table A1.3) the parameter of
the distribution (the average number of incidents per 400 days) has to be calcu-
lated.


Mean = [(0 × 23) + (1 × 33) + (2 × 23) + (3 × 11) + (4 × 5) + (5 × 3) + (6 × 1) + (7 × 1)]/100
= 160/100
= 1.6
On average, therefore, an aircraft was involved in 1.6 incidents over the 400
days. From the column headed 1.6 in the table, the theoretical frequencies of
incidents can be found. For example, the probability of 0 incidents is 0.2019.
Out of 100 aircraft, one would thus expect 20 to be involved in 0 incidents. The
comparison between theoretical and observed is:

No. incidents (r)      0   1   2   3   4   5   6   7
Observed              23  33  23  11   5   3   1   1
Theoretical Poisson   20  32  26  14   6   2   0   0

There is a good correspondence between theoretical and observed, the discrepancies being small. The two criteria for judging whether a Poisson distribution
fits the data are both satisfied. First, there is a good a priori case for supposing
the situation is Poisson. Second, the observed data are very close to what is an-
ticipated theoretically.
(b) Assuming the distribution of incidents is Poisson, the probabilities of five or
more incidents are, from the table:

P(5 incidents) = 0.0176
P(6 incidents) = 0.0047
P(7 incidents) = 0.0011
P(8 incidents) = 0.0002
P(9 incidents or more) = 0
giving:
P(5 incidents or more) = 0.0236

A proportion of 0.0236 (or 2.36 per cent) of the 800 would therefore be ex-
pected to be involved in five or more incidents over a 400-day period (i.e. 19
aircraft).
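A short Python sketch (standard library only) reproduces the fitted Poisson frequencies and the tail probability; small differences from the table-based figures are rounding:

```python
from math import exp, factorial

observed = [23, 33, 23, 11, 5, 3, 1, 1]          # aircraft with 0..7 incidents
mean = sum(r * f for r, f in enumerate(observed)) / sum(observed)   # 1.6

def poisson_prob(r):
    return exp(-mean) * mean**r / factorial(r)

for r, f in enumerate(observed):
    print(r, f, round(100 * poisson_prob(r)))    # observed vs theoretical

tail = 1 - sum(poisson_prob(r) for r in range(5))
print(tail, tail * 800)                          # ~0.0237, ~19 of 800 aircraft
```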
(c) Reservations about the conclusion are principally to do with whether the
incidents are random. It may be that for this aircraft certain routes/flights/ pi-
lots/times of the year are more prone to accident. If so, the average incident rate
differs from one part of the population to another and a uniform value (1.6)
covering the whole population should not be used. In this case it may be neces-
sary to treat each section of the population differently or to move to a more
sophisticated distribution, possibly the negative binomial.
Another problem may be that the sample is not representative of the population.
Not all routes may be included; not all airlines may be included; there may be a
learning effect, with perhaps fewer errors later in the 400-day period than earlier;
or perhaps the pilots are doubly careful when they first fly a new aircraft. The


data should be investigated to search for gaps and biases such as these. If the
insurance is to cover all aircraft/routes/airlines then the sample data should be
representative of this population.
Lastly, the data are all about incidents; the insurance companies become involved
financially only when an accident takes place. The former may not be a good
surrogate for the latter. If possible, past records should be used to establish a
relationship between the two and to test just how good a basis the analysis of
incidents is for deciding on accident insurance.
(d) If a check of the data reveals missing routes or airlines then the gaps should be
filled if possible. The data should be split into subsections and the analysis re-
peated to find if there has been a learning effect or if there are different patterns
in different parts of the data. There could be differences on account of seasonali-
ty, routes, airlines, type of flight (charter or scheduled). If differences are
observed then the insurance premium would be weighted accordingly.
Data from the introduction of other makes of aircraft could serve to indicate
learning effects and also the future pattern of incidents.
The large amounts of money at stake in a situation like this would make the extra
statistical work suggested here worthwhile.

Marking Scheme (out of 20) Marks


(a) General method: comparing observed with theoretical 2
A priori justification of Poisson 2
Calculation of parameter 2
Correct use of table 2
(b) Calculation 2
(c) Noticing larger tail than anticipated 2
Suggesting reasons for possible non-randomness 2
Difference between incidents and accidents 2
(d) Possible new data 2
Further analyses 2
Total 20

Case Study 9.2: Police Vehicles


1
(a) Before conducting any significance test, the mean and standard deviation of the
sample have to be calculated.

Car        1   2   3   4   5   6   7   8   9  10  11  12  13  14
x (mpg)   21  24  22  24  29  18  21  26  25  19  22  20  28  23     x̄ = 23
x − x̄     −2   1  −1   1   6  −5  −2   3   2  −4  −1  −3   5   0
(x − x̄)²   4   1   1   1  36  25   4   9   4  16   1   9  25   0


Variance = ∑(x − x̄)²/(14 − 1)
= 136/13
= 10.46
The five stages of a significance test can now be followed:
(i) The hypothesis is that the tyres have made no difference and that the petrol
consumptions come from a population with a mean of 22.4.
(ii) The evidence is the sample of 14 cars’ petrol consumptions with a mean of
23 and a standard deviation of:
√10.46 = 3.23
(iii) The significance level is 5 per cent.
(iv) The sample size is less than 30, the standard deviation has been estimated
from the sample and the underlying individual distribution is normal. All
the conditions attached to the t-distribution are present. The observed t
value is:
t = (23 − 22.4)/(3.23/√14)
= 0.60/0.86
= 0.70
(v) The test is one-tailed, assuming the tyres could bring about only an im-
provement, not a deterioration, in petrol consumption; the degrees of
freedom are 13. The t value corresponding to the 5 per cent level is thus
taken from the row for 13 degrees of freedom and the column for t0.05. The
value is 1.771. The observed t value is less than this and therefore the hy-
pothesis is accepted. The tyres do not make a significant difference.
(b) On the other hand, the alternative hypothesis is that the tyres result in an
improvement in petrol consumption of 1.5 mpg. Under this hypothesis the sam-
ple would have come from a distribution of mean 23.9 (= 22.4 + 1.5). The
observed t value is:
t = (23 − 23.9)/(3.23/√14)
= −0.9/0.86
= −1.05
Ignoring the negative sign, this observed t value (under the alternative hypothe-
sis) is lower than the critical t value of 1.771, just as was the previous observed t
value (under the null hypothesis). The sample evidence would therefore be insuf-
ficient to reject either the null or the alternative hypothesis. Clearly the sample
size is too small to discriminate properly between the two hypotheses.
If the probabilities of type 1 and type 2 errors are to be equal then the critical t
value, t0.05, should be equidistant from both hypotheses (i.e. halfway between
them at 23.15). A sample size larger than 14 is evidently required to do this. As-
suming that the sample size needed is greater than 30 (and therefore the t-
distribution can be approximated to the normal):
t0.05 = (23.15 − 22.4)/(3.23/√n)
1.645 = 0.75/(3.23/√n)
√n = (1.645 × 3.23)/0.75
n = 50
A sample size of 50 is needed if the test is to discriminate equally between the no
change hypothesis and the hypothesis based on the manufacturer’s claim. To


achieve such a sample size would presumably involve using cars of a wider age
span than six to nine months.
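Part (a) and the sample-size calculation in part (b) can be cross-checked with a short Python sketch (standard library only; statistics.stdev uses the n − 1 divisor, as required here):

```python
from math import sqrt
from statistics import mean, stdev

mpg = [21, 24, 22, 24, 29, 18, 21, 26, 25, 19, 22, 20, 28, 23]
xbar, s, n = mean(mpg), stdev(mpg), len(mpg)     # 23, ~3.23, 14

t = (xbar - 22.4) / (s / sqrt(n))
print(t)                                          # ~0.70, below the 1.771 cut-off

# sample size placing the critical value midway between 22.4 and 23.9
print((1.645 * s / 0.75) ** 2)                    # ~50
```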
(c) Many factors affect a car’s petrol consumption. A well-designed significance test
should exclude or minimise the effect of all factors except the one of interest, in
this case, the tyres. Major influences on consumption are the type of car and its
age. By comparing like with like in respect of these factors, their effect is elimi-
nated. Other factors cannot be controlled in this way. Very little can be done
about the type of usage, total mileage, the style of the drivers and the quality of
maintenance. It is hoped that these factors will balance out over the sample of
cars and the time period.
(d) The principal argument in favour of the officer’s suggestion is that it may
eliminate the effect of different maintenance methods from the test so that ob-
served differences are accounted for by the tyres, not the maintenance methods.
The arguments against his proposal are stronger. First, the maintenance methods
are unlikely to be identical in the two forces. The procedures laid down may be
the same but the interpretation of them by different sets of mechanics of differ-
ent levels of skill will almost certainly mean that there are still differences.
Second, his proposals create some new difficulties not present in the original
significance test. Some factors affecting petrol consumption that were eliminated
by the first test are now reintroduced. The geography of the territories served by
the forces will differ; the drivers of the cars will be different; the roles of the cars
may be different. All these factors will cause different fuel consumptions in the
two samples of cars, which may well disguise or overwhelm the influence of the
tyres.
While the officer’s test could certainly be carried out, the new variables his test
introduces would put a question mark over any conclusions drawn. On the
whole, the officer’s suggestion should be rejected but without blunting his en-
thusiasm for using analytical methods to help in decision taking.

Marking Scheme (out of 20) Marks


(a) Calculation of mean and standard deviation 2
Standard error 1
Observed t value 2
Table t value 2
Drawing the right conclusion 1
(b) Method of finding sample size 3
Correct calculations 3
(c) Reasons for eliminating other influences 3
(d) For and against officer’s proposal 3
Total 20


Module 10

Review Questions
10.1 The correct answer is A, C. Analysis of variance tests the hypothesis that the
samples come from populations with equal means or that they come from a
common population. In the former case, B is an assumption. In both cases, D is an
assumption. B and D are therefore not hypotheses but assumptions underlying the
testing of the hypotheses by analysis of variance.
10.2 The correct answer is A.
SSE = Total SS − SST = 316 − 96 = 220
10.3 The correct answer is D.
MST = SST/Degrees of freedom = 96/(5 − 1) = 24
10.4 The correct answer is D.
MSE = SSE/Degrees of freedom
= 220/[(12 − 1) × 5]
= 4
Observed F = MST/MSE = 24/4 = 6
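Questions 10.2 to 10.4 can be reproduced with a few lines of Python (variable names ours):

```python
total_ss, sst = 316, 96
treatments, per_group = 5, 12

sse = total_ss - sst                        # 220
mst = sst / (treatments - 1)                # 24
mse = sse / ((per_group - 1) * treatments)  # 4
print(sse, mst, mse, mst / mse)             # F = 6.0
```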
10.5 The correct answer is True. The hypothesis tested by the analysis of variance is that
the treatments come from populations with equal means (i.e. the treatments have no
effect). But since the observed F value exceeds the critical value, the hypothesis
must be rejected. The treatments do have a significant effect.
10.6 The correct answer is A, D. It is hypothesised that B is an attribute of the
populations from which the samples are taken; C is an assumed attribute of the
populations. Since the samples are selected at random, it would be virtually impossi-
ble for the samples to have these attributes.
10.7 The correct answer is B. The grand mean is 4. Total SS is calculated by finding the
deviation of each observation from the grand mean, then squaring and summing.
Taking each row in turn:
Total SS = (3 − 4)² + (7 − 4)² + (4 − 4)² + (2 − 4)² + … + (3 − 4)² + (1 − 4)² +
(4 − 4)²
= 1 + 9 + 0 + 4 + 4 + 4 + 4 + 4 + 16 + 9 + 1 + 4 + 0 + 4 + 1 + 9 + 16 + 1 + 9 + 0
= 100
10.8 The correct answer is A.
SST = r × [(5 − 4)² + (5 − 4)² + (3 − 4)² + (3 − 4)²]
where r = number of rows (observations or blocks).
SST = 5 × (1 + 1 + 1 + 1)
SST = 20


10.9 The correct answer is D.


SSB = c × [(4 − 4)² + (3 − 4)² + (6 − 4)² + (3 − 4)² + (4 − 4)²]
where c = number of columns (treatments).
SSB = 4 × (0 + 1 + 4 + 1 + 0)
SSB = 24
10.10 The correct answer is False. A balanced design is one in which all the treatment
groups are the same size (i.e. they all have an equal number of observations).

Case Study 10.1: Washing Powder


1 A one-way analysis of variance will test the hypothesis that the means of the
irritation index for each brand come from populations with the same mean. If the
hypothesis is accepted, the conclusion will be that the powders do not cause
different levels of skin irritation; if rejected, the conclusion will be that the powders
do cause different levels of irritation.
The systematic way of carrying out the test is to base it on an ANOVA table, as
shown in Table A4.7. The details of the calculations are shown below.

Table A4.7 ANOVA for washing powders

Variation                 Degrees of freedom   Sums of squares     Mean square            F
Explained by treatments   c − 1 = 5            SST = 490           MST = 490/5 = 98       MST/MSE = 98/10.9 = 8.99
(between columns)
Error or unexplained      (r − 1)c = 54        SSE = 588.4         MSE = 588.4/54 = 10.90
(within columns)
Total                     rc − 1 = 59          Total SS = 1078.4

The first column of Table A4.7 describes the sources of error. The second relates to
degrees of freedom, always given by c − 1 and (r − 1)c for a one-way analysis of
variance.
The third column requires the calculation of the sums of squares. SST deals with the
‘between’ sums of squares and is concerned with the group means and their
deviations from the grand mean.
SST = 10 × [(98.3 − 101.8)² + (102.9 − 101.8)² + (103.6 − 101.8)² +
(98.5 − 101.8)² + (101.3 − 101.8)² + (106.4 − 101.8)²]
= 10 × (12.25 + 1.21 + 3.24 + 10.89 + 0.25 + 21.16)
= 490.0
SSE deals with (within) sums of squares and is concerned with the individual
observations and the deviations between them and their group means.


SSE = (97 − 98.3)² + (102 − 98.3)² + (95 − 98.3)² + (99 − 98.3)² +           Brand 1
(98 − 98.3)² + (100 − 98.3)² + (95 − 98.3)² + (96 − 98.3)² +
(103 − 98.3)² + (98 − 98.3)²
+ (106 − 102.9)² + (109 − 102.9)² + (100 − 102.9)² + (104 − 102.9)² +        Brand 2
(103 − 102.9)² + (101 − 102.9)² + (99 − 102.9)² + (103 − 102.9)² +
(105 − 102.9)² + (99 − 102.9)²
+ (112 − 103.6)² + (98 − 103.6)² + (101 − 103.6)² + (101 − 103.6)² +         Brand 3
(105 − 103.6)² + (103 − 103.6)² + (102 − 103.6)² + (105 − 103.6)² +
(108 − 103.6)² + (101 − 103.6)²
+ (105 − 98.5)² + (93 − 98.5)² + (99 − 98.5)² + (96 − 98.5)² +               Brand 4
(99 − 98.5)² + (101 − 98.5)² + (97 − 98.5)² + (99 − 98.5)² +
(101 − 98.5)² + (95 − 98.5)²
+ (101 − 101.3)² + (102 − 101.3)² + (97 − 101.3)² + (100 − 101.3)² +         Brand 5
(105 − 101.3)² + (99 − 101.3)² + (101 − 101.3)² + (103 − 101.3)² +
(105 − 101.3)² + (100 − 101.3)²
+ (108 − 106.4)² + (113 − 106.4)² + (104 − 106.4)² + (108 − 106.4)² +        Brand 6
(103 − 106.4)² + (104 − 106.4)² + (108 − 106.4)² + (104 − 106.4)² +
(110 − 106.4)² + (102 − 106.4)²
= 68.1 + 94.9 + 148.4 + 106.5 + 58.1 + 112.4
= 588.4

Total SS is of course the total sums of squares. It is concerned with all observations
and their deviations from the grand mean. Going through the observations, each
row in turn:
Total SS = (97 − 101.8)² + (106 − 101.8)² + … + (100 − 101.8)² + (102 − 101.8)²
= 1078.4
It was not strictly necessary to calculate all three sums of squares since:
Total SS = SST + SSE
Calculating all three provided a check. Table A4.7 shows that the equality is satis-
fied: 1078.4 = 490.0 + 588.4.
Next, the mean squares are calculated by dividing the sums of squares by the
associated degrees of freedom (column 4 in Table A4.7). The ratio of the mean
squares is the observed value of the F variable and is calculated in the final column.
To finish the test, the critical F value for (5, 54) degrees of freedom at the 5 per cent
level is found from the table of the F-distribution. In this case, the value is 2.38. The
observed F value, 8.99, greatly exceeds 2.38. The hypothesis is rejected at the 5 per
cent significance level. There is a significant difference in the levels of irritation
caused.
At the 1 per cent level the hypothesis is also rejected. The critical F value for (5, 54)
degrees of freedom is 3.37 at the 1 per cent level. The observed F value exceeds this
also. The evidence that the powders do not cause the same level of skin irritation is
strong.
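The whole table can be rebuilt computationally. The sketch below (plain Python, no libraries; the observations are read off the SSE working above) uses the unrounded grand mean, so SST comes out at 489.9 rather than the rounded 490.0:

```python
brands = [  # irritation index readings for brands 1 to 6
    [97, 102, 95, 99, 98, 100, 95, 96, 103, 98],
    [106, 109, 100, 104, 103, 101, 99, 103, 105, 99],
    [112, 98, 101, 101, 105, 103, 102, 105, 108, 101],
    [105, 93, 99, 96, 99, 101, 97, 99, 101, 95],
    [101, 102, 97, 100, 105, 99, 101, 103, 105, 100],
    [108, 113, 104, 108, 103, 104, 108, 104, 110, 102],
]

grand = sum(sum(g) for g in brands) / sum(len(g) for g in brands)
sst = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in brands)
sse = sum((x - sum(g) / len(g)) ** 2 for g in brands for x in g)

c, r = len(brands), len(brands[0])
mst, mse = sst / (c - 1), sse / ((r - 1) * c)
print(sst, sse, mst / mse)   # ~490, 588.4, F ~ 8.99
```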


Qualifications
The reservations that should attach to the results of the test are to do with both
statistics and common sense.
(a) An F test assumes that the populations from which the samples are drawn are
normally distributed. In this case, it must be assumed that the distribution of
observations for each brand is normal. This may not be true, especially since the
sample size (10) is too small for the central limit theorem to have any real effect.
(b) An F test also assumes that the populations from which the samples are drawn
have equal variances. Again, this may not be true although statistical research has
indicated that variances would have to be very different before the results of the
test were distorted.
(c) Since skin irritation is very much a subjective problem and one that is hard to
quantify, there must also be doubts about the validity of the data (i.e. does the
index measure accurately what it is supposed to measure?). The tester should
look carefully at the ways in which the index has been validated by the research-
ers.
(d) The data must also come into question for more fundamental reasons. The
design of the experiment gives rise to the following doubts:
(i) How were the households chosen? Are they a representative group?
(ii) Do the households do their washing any differently because they are being
monitored by the tester?
(iii) How representative are the batches of washing?
(iv) How ‘standard’ are the batches of washing?
(v) Are there factors that make some people more prone to skin irritation and
that should therefore be built into the test?
(vi) Are the data independent (e.g. is there any cumulative effect in the testing)?
Does any brand suffer a higher index because of the effect of brands tested
earlier?

Marking Scheme (out of 20) Marks


Calculating degrees of freedom 2
Calculating total SS, SST, SSE (2 marks each) 6
Calculating mean squares 1
Calculating observed F 1
Finding critical F 2
Use of ANOVA table 2
Qualifications
– Statistical assumptions about normality and variance 2
– Quality of data 4
Total 20


Case Study 10.2: Hypermarkets


1 To test whether there is any difference between responses at different locations, a
two-way analysis of variance is needed. The treatments are the store locations and
the blocks are the days of the week.
Table A4.8 is Table 10.14 with row and column means and the grand mean added.

Table A4.8 Positive responses to ‘courteous service’ attribute


Store
B C D G L M N S Average
Monday 71 73 66 69 58 60 70 61 66
Tuesday 71 78 81 89 78 85 90 84 82
Wednesday 73 78 76 86 74 80 81 76 78
Thursday 73 75 73 80 75 71 73 72 74
Friday 62 66 69 81 60 64 61 57 65

Average 70 74 73 81 69 72 75 70 Grand mean = 73

After calculating the means, the next step is to construct a two-way analysis of
variance (ANOVA) table as shown in Table A4.9.

Table A4.9 Two-way ANOVA table for hypermarket survey

Variation                 Degrees of freedom    Sums of squares    Mean square             F
Explained by treatments   c − 1 = 7             SST = 520          MST = 520/7 = 74.3      MST/MSE = 74.3/19.7 = 3.77
(between columns)
Explained by blocks       r − 1 = 4             SSB = 1760         MSB = 1760/4 = 440      MSB/MSE = 440/19.7 = 22.3
(between rows)
Error or unexplained      (r − 1)(c − 1) = 28   SSE = 552          MSE = 552/28 = 19.7
(within columns)
Total                     rc − 1 = 39           Total SS = 2832

SST = 5 × [(70 − 73)² + (74 − 73)² + (73 − 73)² + (81 − 73)² + (69 − 73)² +
(72 − 73)² + (75 − 73)² + (70 − 73)²]
= 5 × (9 + 1 + 0 + 64 + 16 + 1 + 4 + 9)
= 520
The block sum of squares is calculated from the block (row) means:


SSB = 8 × [(66 − 73)² + (82 − 73)² + (78 − 73)² + (74 − 73)² + (65 − 73)²]
= 8 × (49 + 81 + 25 + 1 + 64)
= 1760
The error sum of squares (SSE) is calculated by first determining the total sum of
squares (Total SS).

Total SS = (71 − 73)² + (73 − 73)² + (66 − 73)² + (69 − 73)²                 Monday 616
+ (58 − 73)² + (60 − 73)² + (70 − 73)² + (61 − 73)²
+ (71 − 73)² + (78 − 73)² + …                                                Tuesday 928
+ (73 − 73)² + (78 − 73)² + …                                                Wednesday 326
+ (73 − 73)² + (75 − 73)² + …                                                Thursday 62
+ (62 − 73)² + (66 − 73)² + …                                                Friday 900
= 2832

Since Total SS = SST + SSB + SSE:


SSE = Total SS − SST − SSB
= 2832 − 1760 − 520
= 552
(a) Do the locations of the store have different effects on the responses? To test the
hypothesis that the location (column) means come from the same population,
the observed F value relating to treatments must be compared with a critical F
value. If the significance level is chosen to be 5 per cent then the appropriate
critical F value, relating to (7, 28) degrees of freedom, is found from the F table
to be 2.36. The observed F value is 3.77; therefore the hypothesis should be
rejected. At the 5 per cent significance level, location does appear to affect re-
sponses.
(b) It is important to neutralise the effect of the days of the week in a test such as
this. Intuitively there is a likelihood that people’s attitudes will vary between the
beginning of the week and the end, when the weekend approaches. This factor
may affect customers and staff alike.
(c) If the effect of days of the week had not been neutralised, the appropriate test
would have been a one-way analysis of variance as shown in Table A4.10. Total
SS and SST are calculated just as in the two-way case, but SSE is obtained from
the relationship:
Total SS = SST + SSE
The critical F value at the 5 per cent level and for (7, 32) degrees of freedom is
2.32. The observed F value, 1.03, is less than the critical. The hypothesis should
be accepted. Location does not appear to affect responses. When the effects of
days of the week are not allowed for, the result of the test is the opposite of
when they are allowed for.


Table A4.10 One-way ANOVA table for hypermarket survey

Variation                 Degrees of freedom   Sums of squares    Mean square             F
Explained by treatments   c − 1 = 7            SST = 520          MST = 520/7 = 74.3      MST/MSE = 74.3/72.2 = 1.03
(between columns)
Error or unexplained      (r − 1)c = 32        SSE = 2312         MSE = 2312/32 = 72.2
(within columns)
Total                     rc − 1 = 39          Total SS = 2832

(d) Referring back to Table A4.9, the effect of days of the week on responses can
also be tested. This time the observed F value is the ratio MSB/MSE, equal to
22.3. The critical F value at the 5 per cent level for (4, 28) degrees of freedom is
2.71. The observed F far exceeds this amount. Days of the week have a highly
significant effect on the responses.
(e) Should the analysis of variance be taken further by looking into the possibility of
an interaction effect? The usefulness of such an extension to the study depends
on how much it is thought days of the week and locations have independent
effects on responses. If it were thought that the ‘Monday’ and ‘Friday’ effects
were more marked in some parts of the country than others then an interaction
variable would permit the inclusion of this influence in the analysis of variance.
Intuitively it does not seem likely that people feel particularly worse about Mon-
days (better about Fridays) in some cities than in others. In any case, since the
effect of location has already been demonstrated to have a significant bearing on
responses, the inclusion of a significant interaction term could only make the
effect more marked (by decreasing the SSE while SST remains the same). Over-
all it does not seem worthwhile to extend the analysis to include an interaction
term.
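The two-way table can be verified with a short Python sketch (plain Python; the data are those of Table A4.8):

```python
rows = {  # days (blocks) by stores B C D G L M N S (treatments)
    'Mon': [71, 73, 66, 69, 58, 60, 70, 61],
    'Tue': [71, 78, 81, 89, 78, 85, 90, 84],
    'Wed': [73, 78, 76, 86, 74, 80, 81, 76],
    'Thu': [73, 75, 73, 80, 75, 71, 73, 72],
    'Fri': [62, 66, 69, 81, 60, 64, 61, 57],
}
data = list(rows.values())
r, c = len(data), len(data[0])
grand = sum(map(sum, data)) / (r * c)                       # 73

col_means = [sum(row[j] for row in data) / r for j in range(c)]
row_means = [sum(row) / c for row in data]

sst = r * sum((m - grand) ** 2 for m in col_means)          # 520
ssb = c * sum((m - grand) ** 2 for m in row_means)          # 1760
total = sum((x - grand) ** 2 for row in data for x in row)  # 2832
sse = total - sst - ssb                                     # 552

mst, msb, mse = sst / (c - 1), ssb / (r - 1), sse / ((r - 1) * (c - 1))
print(mst / mse, msb / mse)                                 # ~3.77 and ~22.3
```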

Marking Scheme (out of 30) Marks


Two-way analysis of variance
– Calculating degrees of freedom 2
– Calculating Total SS, SST, SSB, SSE (2 marks each) 8
– Calculating mean squares 2
– Calculating observed F 1
– Finding critical F 1
– Use of ANOVA table 4
– Reasons for including blocks 2
One-way analysis of variance
– Calculating sums of squares 2
– ANOVA table 4
Discussion of interaction
– Likelihood of need for interaction variable 2


– Predicted effect on outcome of test 2


Total 30

Module 11

Review Questions
11.1 The correct answer is C, D. A is untrue because regression is specifying the
relationship between variables; correlation is measuring the strength of the relation-
ship. B is untrue because regression and correlation cannot be applied to unpaired
sets of data. C is true, by definition, and D is true, because if the data were plotted in
a scatter diagram, they would lie approximately along a straight line with a negative
slope.
11.2 The correct answer is B. A is untrue because residuals are measured vertically, not at
right angles to the line. B is true, by definition. C is untrue because actual points
below the line have negative residuals, and D is untrue because residuals are all zero
only when the points all lie exactly on the line (i.e. when there is perfect correlation).
11.3 The correct answer is B.

x     y    (x − x̄)   (x − x̄)²   (y − ȳ)   (y − ȳ)²   (x − x̄)(y − ȳ)
 4     2      −4         16         −3          9            12
 6     4      −2          4         −1          1             2
 9     4       1          1         −1          1            −1
10     7       2          4          2          4             4
11     8       3          9          3          9             9
∑ 40  25                 34                    24            26
x̄ = 8   ȳ = 5

∑(x − x̄)(y − ȳ) = 26
∑(x − x̄)² = 34
∑(y − ȳ)² = 24
Slope coefficient = b = ∑(x − x̄)(y − ȳ)/∑(x − x̄)²
= 26/34
= 0.765
11.4 The correct answer is C.
Intercept = ȳ − Slope × x̄ = 5 − 0.765 × 8
= −1.12


11.5 The correct answer is A.


Correlation coefficient = r = ∑(x − x̄)(y − ȳ)/√(∑(x − x̄)² × ∑(y − ȳ)²)
= 26/√(34 × 24)
= 26/28.6
= 0.91
11.6 The correct answer is A.
Residual = Actual − Fitted
= 2 − (4 × 0.765 − 1.12)
= 0.06
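The calculations in Questions 11.3 to 11.6 can be reproduced with a short Python sketch (standard library only; variable names ours):

```python
from statistics import mean

x = [4, 6, 9, 10, 11]
y = [2, 4, 4, 7, 8]
xbar, ybar = mean(x), mean(y)                              # 8 and 5

sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))   # 26
sxx = sum((a - xbar) ** 2 for a in x)                      # 34
syy = sum((b - ybar) ** 2 for b in y)                      # 24

slope = sxy / sxx                                 # ~0.765
intercept = ybar - slope * xbar                   # ~-1.12
r = sxy / (sxx * syy) ** 0.5                      # ~0.91
residual = y[0] - (intercept + slope * x[0])      # ~0.06 for the point (4, 2)
print(slope, intercept, r, residual)
```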
11.7 The correct answer is A. The evidence of everyday life is that husbands and wives
tend to be of about the same age, with only a few exceptions. One would therefore
expect strong positive correlation between the variables.
11.8 The correct answer is A. If data are truly represented by a straight line, the residuals
should exhibit no pattern. They should be random. Randomness implies that each
residual should not be linked with the previous (i.e. there should be no serial
correlation). Randomness also implies that the residuals should have constant
variance across the range of x values (i.e. heteroscedasticity should not be present).
11.9 The correct answer is False. The strong correlation indicates association, not
causality. In any case, it is more likely that if causal effects are present, they work in
the opposite direction (i.e. a longer life means a patient has more time in which to
visit his doctor).
11.10 The correct answer is C. The prediction of sales volume for advertising expenditure
of 5 is:
Sales = 14.7 + 6.3 × Advertising expenditure
= 14.7 + 31.5
= 46.2
11.11 The correct answer is B. Unexplained variation = Sum of squared residuals = 900
R-squared = Explained variation/Total variation
= (Total − Unexplained)/Total
0.70 = (Total − 900)/Total
0.7 × Total = Total − 900
0.3 × Total = 900
Total = 3000
11.12 The correct answer is A. The difference between a regression of y on x and one of x
on y is that y and x are interchanged in the regression and correlation formulae.
Since the correlation coefficient formula is unchanged if x and y are swapped round,
the correlation coefficients are the same in both cases. Since the slope and intercept
formulae are changed if x and y are swapped round, then these two quantities are
different in the two cases (unless by a fluke).


Case Study 11.1: Railway Booking Offices


1

Figure A4.9 Railway booking transactions (scatter diagram of y against x; x runs from 0 to 8, y from 0 to 20)


(a) The scatter diagram is given in Figure A4.9. High y values tend to correspond to
high x values. The underlying relationship could be linear but there is a lot of
scatter.

x     y     x − x̄    (x − x̄)²    y − ȳ    (y − ȳ)²    (x − x̄)(y − ȳ)
3     11    −1        1            −3        9            3
1     7     −3        9            −7        49           21
3     12    −1        1            −2        4            2
4     17    0         0            3         9            0
6     19    2         4            5         25           10
7     18    3         9            4         16           12
∑     24    84        0            24        0            112          48
x̄ = 4   ȳ = 14

r = ∑(x − x̄)(y − ȳ)/√(∑(x − x̄)² × ∑(y − ȳ)²)
= 48/√(24 × 112)
= 48/51.8
= 0.93
The correlation coefficient is high, confirming the visual evidence of the scatter
diagram that the relationship is linear.
(b) Line (i)
The line goes through the points (1,7) and (6,19). Therefore, the line has slope =
(19 − 7)/(6 − 1) = 12/5 = 2.4 (i.e. the line is y = a + 2.4x). Since the line goes
through the point (1,7):
7 = a + 2.4


a = 4.6
The line is y = 4.6 + 2.4x
Line (ii)
The line goes through the points (1,7) and (7,18). Therefore, the line has slope =
(18 − 7)/(7 − 1) = 11/6 = 1.8 (i.e. the line is y = a + 1.8x). Since the line goes
through (1,7):
7 = a + 1.8
a = 5.2
The line is y = 5.2 + 1.8x
Line (iii)
The regression line is found from the regression formulae:
Slope = ∑(x − x̄)(y − ȳ)/∑(x − x̄)²
= 48/24
= 2
Intercept = ȳ − Slope × x̄
= 14 − 2 × 4
= 6
The line is y = 6 + 2x

Points        Line (i)               Line (ii)              Line (iii)
              y = 4.6 + 2.4x         y = 5.2 + 1.8x         y = 6 + 2x
x    y        Fitted y   Residual    Fitted y   Residual    Fitted y   Residual
3 11 11.8 −0.8 10.6 0.4 12.0 −1.0
1 7 7.0 0 7 0 8.0 −1.0
3 12 11.8 0.2 10.6 1.4 12.0 0
4 17 14.2 2.8 12.4 4.6 14.0 3.0
6 19 19.0 0 16.0 3.0 18.0 1.0
7 18 21.4 −3.4 17.8 0.2 20.0 −2.0
MAD = 1.2 MAD = 1.6 MAD = 1.3
Variance = 4.0 Variance = 6.5 Variance = 3.2

The residuals are calculated as actual minus fitted y values. For example, for line
(i) and the point (3,11), the residual is:
11 − (4.6 + 2.4 × 3) = −0.8
The MADs are calculated as the average of the absolute values of the residuals.
For example, for line (i):
MAD = (0.8 + 0 + 0.2 + 2.8 + 0 + 3.4)/6 = 1.2
The variances are calculated as the average of the squared residuals (but with a
divisor of 5, not 6, as in the formula for the variance). For example, for line (i):
Variance = (0.64 + 0 + 0.04 + 7.84 + 0 + 11.56)/5 = 20.08/5
= 4.016
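The MADs and variances of all three lines can be confirmed with a short Python sketch (an illustration only; the divisor of 5 follows the variance convention used above):

# Residual MAD and variance for the three lines fitted to the booking data.
x = [3, 1, 3, 4, 6, 7]
y = [11, 7, 12, 17, 19, 18]
lines = {"(i)": (4.6, 2.4), "(ii)": (5.2, 1.8), "(iii)": (6.0, 2.0)}
for name, (a, b) in lines.items():
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    mad = sum(abs(e) for e in residuals) / len(residuals)
    variance = sum(e * e for e in residuals) / (len(residuals) - 1)  # divisor 5
    print(name, round(mad, 1), round(variance, 1))
# Output: (i) 1.2 4.0, (ii) 1.6 6.5, (iii) 1.3 3.2, as in the table above.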
The mean absolute deviation shows that line (i), connecting the extreme y values,
has the smallest residual scatter. On the MAD criterion, line (i) is the best.


The variance shows that line (iii), the regression line, has the smallest residual
scatter. On the variance criterion (equivalent to least squares), line (iii) is the best.
This has to be the case since the regression line is the line that minimises the
sum of squared residuals.
Clearly different, but equally plausible criteria (minimising the MAD and mini-
mising the variance of the residuals) give different ‘best fit’ lines. Even when one
keeps to one criterion the margin between the ‘best’ line and the others is small
(in terms of the criterion). Yet the three lines (i), (ii) and (iii) differ markedly
from one another and would give distinctly different results if used to forecast.
The conclusion is that, while regression analysis is a very useful concept, it
should be used with caution. A regression line is best only in a particular way
and, even then, only by a small margin.

Marking Scheme (out of 20) Marks


(a) Scatter diagram 2
Correlation coefficient 2
(b) Equations of lines (i) and (ii) 3
Equation of line (iii) 3
(c) Residuals of lines 2
MADs of lines 2
Variances of lines 2
(d) Conclusions as to which lines are best 2
Conclusion on viewing regression with caution 2
Total 20

Case Study 11.2: Department Store Chain


1
(a) The regression equation is:
Sales = 35.228 + 0.17846 × Income
For average disposable family income = £221:
Sales = 35.228 + 0.17846 × 221
= 74.668
(b) The goodness of fit can be checked by considering the correlation coefficient
and the residuals. The correlation coefficient r is 0.92. This is high, suggesting a
good fit.
The next step is to check the residuals for randomness. They must first be calcu-
lated using the regression equation (see Table A4.11).


Table A4.11 Department stores: regression results


Store number   Average sales per week (£000s) y   Average disposable family income (coded) x   Fitted y   Residual
1 90 301 88.9 1.1
2 87 267 82.9 4.1
3 86 297 88.2 −2.2
4 84 227 75.7 8.3
5 82 273 83.9 −1.9
6 80 253 80.4 −0.4
7 78 203 71.5 6.5
8 75 263 82.2 −7.2
9 70 190 69.1 0.9
10 68 212 73.1 −5.1
11 64 157 63.2 0.8
12 61 141 60.4 0.6
13 58 119 56.5 1.5
14 52 133 59.0 −7.0
Mean 74 217

A visual inspection of the residuals does not suggest any particular pattern. First,
there is no tendency for the positives and negatives to be grouped together (e.g.
for the positive residuals to refer to the smaller stores and the negatives to the
larger, or vice versa). In other words, there is no obvious evidence of serial cor-
relation. Second, there is no tendency for the residuals to be of different sizes at
different parts of the range (e.g. for the residuals to be, in general, larger for
larger stores and smaller for smaller stores). In short, there is no evidence of
heteroscedasticity.
Visually, the residuals appear to be random. Taken with the high correlation
coefficient, this indicates that there is a linear relationship between sales and
family disposable income.
(c) The scatter of the residuals about the regression line is measured through the
residual standard error. If the residuals are normally distributed, 95 per cent of
them will lie within 2 standard errors of the line. For a point forecast (given by
the line) it may be anticipated, if the future is like the past, that the actual value
will also lie within 2 standard errors of the point forecast.
If residual error were the only source of error, 95 per cent confidence limits for
the forecast could be defined as, in the example given above (see the sketch after
part (d) below):
£74 668 ± 2 × 4720
i.e. £65 228 to £84 108


However, there are other sources of error (see Module 12) and therefore the
above confidence interval must be regarded as the best accuracy that could be
achieved.
(d) The linear relationship between sales and family disposable income appears to
pass the statistical tests. Further, since it must be a reasonable supposition that
sales are affected to some degree by the economic wealth of the catchment area,
the model has common sense on its side.
On the other hand, there are many influences on a store’s sales besides family
income. These are not included in the forecasting method. Ideally, a method that
can include other variables would be preferable.
A second reservation is concerned with the quality of the data. While store sales
are probably fairly easy to measure and readily available, this is unlikely to be the
case with the disposable family income. If these data are not available, an expen-
sive survey would be required to make estimates. Even then, the data are not
likely to carry a high degree of accuracy.
Last, the catchment area will be difficult to define in many if not all cases, adding
further to the inaccuracy of the data.
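
As a check on parts (b) and (c), the following Python sketch recomputes the fitted values, residuals and approximate 95 per cent limits. It is an illustration only; the residual standard error of 4.72 (in £000s) is the 4720 quoted in part (c):

# Regression from the case: Sales = 35.228 + 0.17846 x Income.
sales = [90, 87, 86, 84, 82, 80, 78, 75, 70, 68, 64, 61, 58, 52]
income = [301, 267, 297, 227, 273, 253, 203, 263, 190, 212, 157, 141, 119, 133]
fitted = [35.228 + 0.17846 * inc for inc in income]
residuals = [s - f for s, f in zip(sales, fitted)]   # matches Table A4.11
# Point forecast and approximate 95% limits for income = 221:
point = 35.228 + 0.17846 * 221                       # 74.668, i.e. £74 668
se = 4.72                                            # residual standard error (£000s)
low, high = point - 2 * se, point + 2 * se           # 65.228 to 84.108
print(point, low, high)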

Marking Scheme (out of 20) Marks


(a) Specifying equation 2
Making forecast 2
(b) Commenting on correlation coefficient 2
Calculating residuals 3
Discussing randomness 1
including
– serial correlation 1
– heteroscedasticity 1
(c) Knowing meaning of residual standard error 2
and its relationship to forecasting accuracy 2
(d) Non-statistical reservations 4
Total 20

Module 12

Review Questions
12.1 The correct answer is B, C. B and C give synonyms for a right-hand-side variable.
Another synonym is an independent variable, the opposite of A. D is incorrect,
there being no such thing as a residual variable.
12.2 The correct answer is False. The statement on simple regression is correct, but the
statement on multiple regression should be altered to ‘one y variable is related to
several x variables’.


12.3 The correct answer is A. The coefficients have standard errors because they are
calculated from a set of observations that is deemed to be a sample and therefore
the coefficients are estimates. Possible variations in the coefficients are calculated
via their standard errors, which are in turn estimated from variation in the residuals.
B is incorrect since, although there may be data errors, this is not what the standard
errors measure. The standard errors are used to calculate t values that are used in
multiple regression, but this is not why they arise. Therefore, C and D are incorrect.
12.4 The correct answer is C. The t values are found by dividing the coefficient estimate
by the standard error. Thus:
Variable 1: 5.0/1.0 = 5.0
Variable 2: 0.3/0.2 = 1.5
Variable 3: 22/4 = 5.5
12.5 The correct answer is D. The degrees of freedom = No. observations − No. x
variables − 1. Thus, number of observations = 32 + 3 + 1 = 36.
12.6 The correct answer is C. The elimination of variables is based on a significance test
for each variable. The t value for each variable is compared with the critical t value
for the relevant degrees of freedom. In this case, the number of observations
exceeds 30; therefore, the normal distribution applies and the critical t value is 1.96.
12.7 The correct answer is True. The formula for R-bar-squared has been adjusted to
take degrees of freedom into account. Since each x variable reduces the degrees of
freedom by 1, the number of x variables included is allowed for.
12.8 The correct answer is A. Sums of squares (regression) have as many degrees of
freedom as there are right-hand-side variables (i.e. 3).
12.9 The correct answer is C.
Mean square (regression) = Sums of squares (regression)/Degrees of freedom
= 120/3
= 40
12.10 The correct answer is D. The degrees of freedom for sums of squares (residuals) =
n − k − 1 = 34
Mean square (residual) = Sums of squares (residual)/Degrees of freedom
= 170/34
= 5.0
12.11 The correct answer is C. The critical F ratio is for (3,34) degrees of freedom and for
the 5 per cent level. From F tables this is found to be 2.88. Since observed exceeds
critical, there is a significant linear relationship.
12.12 The correct answer is D. The independent variables are the right-hand-side
variables. Only x (and x2) appear on the right-hand side, but in curvilinear regression
squared terms are treated as additional variables. Therefore, the independent
variables are x and x2.
12.13 The correct answer is False. A transformation is used not to approximate a curved
relationship to a linear one but to put the relationship in a different form so that the
technique of linear regression can be applied to it.


12.14 The correct answer is B. To carry out a regression analysis on the exponential
function y = aebx the equation is first transformed by taking logarithms (to the base
e) of either side to obtain eventually:
log y = log a + bx
This is a linear equation between log y and x. Hence, a linear regression can be
carried out on the variables log y and x.
12.15 The correct answer is E. In its linear form the equation is:
log y = log a + bx
The coefficient of x is thus b and the constant is log a. Therefore:
b = 8 and a = antilog 4
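
By way of illustration, the sketch below applies the transformation of Questions 12.14 and 12.15 to hypothetical data generated from y = ae^bx with a = 1 and b = 1, and recovers both parameters by least squares:

import math

# Hypothetical data, roughly following y = e^x (so a = 1, b = 1).
xs = [1, 2, 3, 4, 5]
ys = [2.72, 7.39, 20.09, 54.60, 148.41]
log_ys = [math.log(v) for v in ys]          # linearise: log y = log a + b*x
n = len(xs)
x_bar = sum(xs) / n
l_bar = sum(log_ys) / n
b = sum((xi - x_bar) * (li - l_bar) for xi, li in zip(xs, log_ys)) \
    / sum((xi - x_bar) ** 2 for xi in xs)   # slope estimates b
a = math.exp(l_bar - b * x_bar)             # antilog of intercept estimates a
print(a, b)                                 # both close to 1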

Case Study 12.1: CD Marketing


1
(a) The equation is:
Gross revenue = 138 + 10 × News adv. −5 × TV adv.
(b) The values of the residuals are found by first calculating the fitted y values from
the regression equation. The fitted values are then subtracted from the actual.
(See Table A4.12.) The residuals are not especially unusual. A visual inspection
suggests that they are random, although it is of course difficult to detect patterns
from so few observations.

Table A4.12 CD advertising: regression results


Week   Gross revenue (£000)   News advertising (£000)   TV advertising (£000)   Fitted   Residuals
1 180 5 1 183 (= 138 + (10 × 5) − (5 × 1)) −3 (180 − 183)
2 165 3 2 158 (= 138 + (10 × 3) − (5 × 2)) 7 (165 − 158)
3 150 3 3 153 (= 138 + (10 × 3) − (5 × 3)) −3 (150 − 153)
4 150 3 3 153 (= 138 + (10 × 3) − (5 × 3)) −3 (150 − 153)
5 185 5 2 178 (= 138 + (10 × 5) − (5 × 2)) 7 (185 − 178)
6 170 4 1 173 (= 138 + (10 × 4) − (5 × 1)) −3 (170 − 173)
7 190 6 1 193 (= 138 + (10 × 6) − (5 ×1)) −3 (190 − 193)
8 200 6 0 198 (= 138 + (10 × 6) − (5 × 0)) 2 (200 − 198)

(c) The correlation coefficient is high at 0.92. But it would be expected to be high
where there are so few observations and two right-hand-side variables. The F test
would be a more precise check. An ANOVA table should be drawn up as in
Table A4.13.
The observed F value is 37.3. The critical F value for (2,5) degrees of freedom at
the 5 per cent significance level is, from the F table, 5.79. Since observed F ex-


ceeds the critical F, it can be concluded that there is a significant linear
relationship. (A short sketch after part (f) below reproduces this test.)

Table A4.13 CD advertising: ANOVA


Variation                 Degrees of      Sums of      Mean square      F
                          freedom         squares
Explained by regression   k = 2           SSR = 2191   MSR = 2191/2     MSR/MSE = 1095/29.4
                                                       = 1095           = 37.3
Error or unexplained      n − k − 1 = 5   SSE = 147    MSE = 147/5
(residuals)                                            = 29.4
Total                     n − 1 = 7       SS = 2338

(d) Two additional pieces of information would be useful. The correlation matrix
would help to check on the possible collinearity of the two x variables. Calcula-
tions of SE(Pred) would help to determine whether the predictions produced by
the model were of sufficient accuracy to use in decision making.
(e) The model could be used to forecast revenue, provided that conditions do not
change. In particular, this means that the ways in which decisions are taken re-
main the same.
A seeming paradox of the model is the negative coefficient for television adver-
tising expenditure. Does this mean that television advertising causes sales to be
reduced? The answer is almost certainly no. The reason is this: television adver-
tising is only used when sales are disappointing. Consequently, high television
advertising expenditure is always associated with low revenue (but not as low as
it might have been). The causality works in an unexpected way: from sales to
advertising and not the other way around.
Provided decisions about when to use the two types of advertising conform to
the past, the model could be used for predictions. If, however, it was decided to
experiment in advertising policy and make expenditures in different circum-
stances to those that have applied in the past, the model could not be expected
to predict gross revenue.
(f) The prime improvement would be extra data. Eight observations is an unsatis-
factory base for a regression analysis. This is not a statistical point. It is simply
that common sense suggests that too little information will be contained in those
eight observations for the decision at hand, whatever it is.
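
The F test in part (c) can be reproduced with a few lines of Python (a sketch; scipy supplies the critical value that the text reads from the F table):

from scipy import stats

ssr, sse = 2191.0, 147.0         # explained and residual sums of squares
k, n = 2, 8                      # right-hand-side variables, observations
msr = ssr / k                    # 1095 (to the nearest whole number)
mse = sse / (n - k - 1)          # 29.4
f_observed = msr / mse           # about 37.3
f_critical = stats.f.ppf(0.95, k, n - k - 1)   # about 5.79
print(f_observed > f_critical)   # True: the relationship is significant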

Marking Scheme (out of 20) Marks


(a) Specifying regression equation 2
(b) Calculating residuals 3
Commenting on their randomness 1
(c) F test with ANOVA table 6


(d) Additional information


– correlation matrix 2
– SE(Pred) 2
(e) Use for prediction, noticing negative coefficient 2
(f) Improvements to model 2
Total 20

Case Study 12.2: Scrap Metal Processing I


1
(a)
(i) The scatter diagram is shown in Figure A4.10. The relationship between
unit costs and capacity is curved. A linear regression analysis will not be the
best model for these data. However, the R-bar-squared is high at 0.73 and,
furthermore, the F test below will show a significant relationship. This illus-
trates that a regression equation, although significant, may not necessarily
be adequate.

90
Unit costs (£/tonne)

80

70

60

50

40
100 200 300 400
Capacity (tonnes/week)

Figure A4.10 Scrap metal plants: scatter diagram


(ii) The runs test is as follows. The 12 residuals comprise four positives and
eight negatives, arranged in five runs. Table A1.7 in Appendix 1 shows that
the lower critical value in these circumstances is 3. Since five runs lies
above this value, the residuals pass the test for randomness. The runs test
therefore contradicts (but only just) the clear visual evidence of the scatter
diagram that the residuals are not random. (A short counting sketch follows
part (b) below.)
(iii) The computer printout has shown the observed F value to be 31.0. The
critical F value has (1,10) degrees of freedom, there being one right-hand-
side variable and 12 observations. From Table A1.6 in Appendix 1, the crit-
ical value at the 5 per cent significance level is 4.96. Observed exceeds
critical; thus the relationship is significant.
(b)
(i) The scatter diagram shows the clear curve in the data, which should be in-
corporated into the regression model. There are several possible ways of


doing this, including the taking of logarithms of one or both variables.


However, economic theory suggests a better alternative. The law of econo-
mies of scale would imply that unit costs are related to the reciprocal of
capacity. Since this transformation has sound reasoning behind it, this
should be tried first. Only if it fails will there be the need to resort to trial
and error searches for transformations.
(ii) It could well be the case that there has been a learning curve effect with
respect to the plants. It is plausible to suppose that the company has been
able to improve both design and operations at the later plants in the light of
experiences at the earlier ones. If this is the case, it can be tested by intro-
ducing a second right-hand-side variable: the age of the plant. This would
be done by using information about the dates of construction of the plants.
In 2018 a plant built in 2010 has age 8, one built in 2014 has age 4, and so
on.
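
Counting runs is easily automated. The sketch below uses a hypothetical sign pattern with four positives, eight negatives and five runs, matching the counts above (the actual residual signs are on the computer printout and are not reproduced here):

# A run ends wherever the sign changes, so runs = sign changes + 1.
def count_runs(signs):
    return 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)

signs = "---+++----+-"            # hypothetical: 4 '+', 8 '-', 5 runs
print(count_runs(signs))          # 5, above the lower critical value of 3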

Marking Scheme (out of 20) Marks


(a) (i) Scatter diagram – seeing curvature in data 2
(ii) Runs test 4
(iii) F test 4
(b) (i) Handling curvature through reciprocal – quoting 5
reasons for using this transformation (3 marks if
logarithms or trial and error suggested)
(ii) Suggesting addition of age variable and giving reasons 5
Total 20

Case Study 12.3: Scrap Metal Processing II


1
(a) All the diagnostic checks for a good regression model indicate that the third
model (relating unit costs with 1/Capacity and Age) is the best. The reasons can
be considered within the structure of the regression analysis steps given at the
end of the module. The steps concerned are as follows:
(i) The model satisfies the common-sense test. There are sound reasons of
prior knowledge for the inclusion of both variables and for the transfor-
mation used.
(ii) The closeness of fit is the best. It has the highest R-bar-squared and the
highest F value. Note, however, that the F values are significant in all three
cases.
(iii) The residuals appear to be random. By inspection, the residuals are:
−++−+−−+−+−−
There seems to be no pattern in them. In this case, the runs test confirms the
visual evidence. The series above has nine runs. For five positives and seven
negatives, the tables in Appendix 1 show that the lower and upper critical
values are 3 and 11. The observed number of runs is within the ‘random’
range. The residuals also appear to be random for the second model (relating


unit costs to 1/Capacity alone). They do not appear to be random in the first
model as described in Case Study 12.2.
(iv) The third model is the only one of the three that is a multiple regression
model. Both of its right-hand-side variables have been included correctly.
Their t values are 23.8 (for 1/Capacity) and 7.3 (for Age), well in excess of
the critical t value at the 5 per cent level.
(v) Collinearity is largely absent from the third model. Using the formula given
in Module 11, the correlation coefficient between 1/Capacity and Age is
0.44. Squaring this to obtain R-squared, the answer is 0.19. This is a low
value (and an F test shows the relationship is not significant).
(vi) The third model has the lowest SE(Pred), at 1.7, compared with 4.0 for the
second model. It is more than twice as accurate as the next best model.
(b)
(i) Although the value of age in making a prediction for a 2018 plant is zero,
age nevertheless has had an effect on the prediction. Age was allowed for in
constructing the regression model. All coefficients were affected by the
presence of age in the regression. One could say that the regression has
separated the effect of capacity from that of age. The (pure) effect of capac-
ity can now be used in predicting for a modern plant.
(ii) The 95 per cent forecast intervals must be based on the t-distribution since
the number of observations is less than 30. For 9 degrees of freedom
(12 − 2 − 1), t0.025 is 2.26. The intervals (reproduced in the sketch after
part (iv) below) are:
270 capacity plant: 44.9 ± 2.26 × 1.7 = 44.9 ± 3.8
350 capacity plant: 39.9 ± 2.26 × 1.7 = 39.9 ± 3.8
(iii) SE(Pred) takes into account a number of sources of error. One of these is
in the measurement of the variable coefficients. Any prediction involves
multiplying these coefficients by the values of the right-hand-side variables
on which the predictions are based. Therefore, the amount of the error will
vary as these prediction values vary. SE(Pred) will thus be different for dif-
ferent predictions.
(iv) R-squared measures variation explained; SE(Pred) deals with unexplained
variation plus other errors. Although, therefore, the two are linked, the rela-
tionship is not a simple or exact one. An increase in R-squared from 0.93 to
0.99 in one way appears a small increase. From another point of view it re-
flects a great reduction in the unexplained variation, which is reflected in
the substantial improvement in prediction accuracy.
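
The interval calculation in (ii) can be reproduced with scipy's t-distribution (a sketch, taking SE(Pred) = 1.7 for both plants as given above):

from scipy import stats

t_crit = stats.t.ppf(0.975, 9)            # about 2.26 for 9 degrees of freedom
se_pred = 1.7
for point in (44.9, 39.9):                # point forecasts for the two plants
    half_width = t_crit * se_pred         # about 3.8
    print(point - half_width, point + half_width)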

Marking Scheme (out of 20) Marks


(a) Dimensions along which to judge the best model
(i) Sensibleness 2
(ii) Closeness of fit 2
(iii) Random residuals 2
(iv) t-test on variable coefficients 2
(v) Collinearity 2


Marking Scheme (out of 20) Marks


(vi) Accuracy of prediction 2
(b) (i) Influence of age 2
(ii) Forecast intervals 2
(iii) Reason for variation in SE(Pred) 2
(iv) Relationship between R2 and SE(Pred) 2
Total 20

Module 13

Review Questions
13.1 The correct answer is False. The techniques are classified into qualitative, causal
modelling and time series; the applications are classified into short-, medium- and
long-term.
13.2 The correct answer is B, D. A is false because time series methods make predictions
from the historical record of the forecast variable only, and do not involve other
variables. B is true because a short time horizon does not give time for conditions to
change and disrupt the structure of the model. C is false since time series methods
work by projecting past patterns into the future and therefore are usually unable to
predict turning points. D is true because some time series methods are able to
provide cheap, automatic forecasts.
13.3 The correct answer is A, C. A is the definition of causal modelling. B is false since
there is no reason why causal modelling cannot be applied to time series as well as
cross-sectional data. C is true because causal modelling tries to identify all the
underlying causes of a variable’s movements and can therefore potentially predict
turning points. D is false since causal modelling can be used for short-term fore-
casts, but its expense often rules it out.
13.4 The correct answer is False. Causal modelling is the approach of relating one
variable to others; least squares regression analysis is a technique for defining the
relationship. There are other ways of establishing the relationship besides least
squares regression analysis.
13.5 The correct answer is True. Qualitative forecasting does not work statistically from a
long data series, as the quantitative techniques tend to. However, in forming and
collecting judgements, numerical data may be used. For example, a judgement may
be expressed in the form of, say, a future exchange rate between the US dollar and
the euro.
13.6 The correct answer is A, B, C, D. The situations A to D are the usual occasions
when qualitative forecasting is used.
13.7 The correct answer is A. A is correct, although ‘expert’ needs careful definition. It
would be better to say that the participants were people with some involvement in
or connection with the forecast. B is not true since the participants are not allowed


to communicate with one another at all. C is not true because the chairman passes
on a summary of the forecasts, not the individual forecasts. D is not true. The
chairman should bring the process to a stop as soon as there is no further move-
ment in the forecasts, even though a consensus has not been reached.
13.8 The correct answer is True. This is a definition of scenario writing. Each view of the
future is a scenario.
13.9 The correct answer is C, D. A and B are not true since the technique of the cross-
impact matrix does not apply to forecasts or probabilities of particular variables,
whether sales or not. C is true since the technique is based on the full range of
future events or developments and they therefore need to be fully listed. D is true,
being a description of what the technique does.
13.10 The correct answer is False. The essence of an analogy variable is that it should
represent the broad pattern of development expected for the forecast variable. It
does not have to be exactly the same at each point. They could, for example, differ
by a multiplicative factor of ten.
13.11 The correct answer is A. Catastrophe theory applies to ‘jumps’ in the behaviour of a
variable rather than smooth changes, however steep or unfortunate.
13.12 The correct answer is C. C gives the formula for a partial relevance number.

Case Study 13.1: Automobile Design


1 Recall the way in which relevance trees work. The technique starts with a broad
objective, breaks this down into sub-objectives and then further breaks down the
sub-objectives through perhaps several different levels finally to come to specific
technological developments. The elements of the increase are then given ‘relevance
weights’, from which it is possible to calculate the overall relevance of the techno-
logical developments that are at the lowest level of the tree. The outcome of the
technique is a list of those developments that are most important or relevant to the
achievement of the higher-level objective and sub-objectives.
Seven steps in the application of relevance trees were described in the text. For the
design of an automobile, the steps might be as shown below. Bear in mind that here
it is the structure of the answer that matters, not the detail of the answers. In a
practical application of the technique, the details would be important but substantial
assistance from several key people in the automobile industry would be needed to
get them right. Even then there would be considerable disagreement. In other
words, the technique is one for which there is no single right answer.
(a) The relevance tree is given in Figure A4.11.


Objective:
Level Design successful automobile

1 Accommodation Control Information


(on performance)

2 Passengers Baggage Direction Speed Communication Visibility

3 Comfort Protection Instruments


4
5

Figure A4.11 Relevance tree for automobile design


(b) In this case the criteria might be:

A Performance
B Passenger comfort
C Safety
D Running costs
E Capital costs
Weight the importance of each criterion relative to the others. This is done by
asking which criteria are most relevant to the basic objective of designing a suc-
cessful automobile. The weights might be assigned as follows:
Weight
A Performance 0.30
B Passenger comfort 0.20
C Safety 0.10
D Running costs 0.15
E Capital costs 0.25
Total 1.00

(c) Weight the sub-objectives at each level (the elements of the tree) according to
their importance in meeting each criterion. In this case, the result might be as in
Table A4.14.

Table A4.14 Element weights


Criteria          Performance   Comfort   Safety   Running costs   Capital costs
Criterion weight 0.30 0.20 0.10 0.15 0.25
Elements at level 1 Element weights
Accommodation 0.10 0.75 0.60 0.05 0.15
Control 0.65 0.15 0.30 0.80 0.75
Information 0.25 0.10 0.10 0.15 0.10

1.00 1.00 1.00 1.00 1.00
Elements at level 2 Element weights
Passengers 0.05 0.70 0.55 0.05 0.10
Baggage 0.05 0.05 0.05 0.00 0.05
Direction 0.05 0.00 0.10 0.00 0.05
Speed 0.45 0.10 0.10 0.75 0.60
Communication 0.10 0.00 0.05 0.00 0.05
Instruments 0.15 0.05 0.05 0.15 0.10
Visibility 0.15 0.10 0.10 0.05 0.05
1.00 1.00 1.00 1.00 1.00

The first column shows the assessed relevance of the three elements at level 1 to
the criterion of performance. Accommodation is weighted 10 per cent, control
65 per cent and information 25 per cent. Since the table gives the relative rele-
vance of the elements at each level to the criteria, this part of each column must
sum to 1. The process of assessing relevance weights is carried out in a similar
way for the second level of the tree.
(d) Each element has a partial relevance number (PRN) for each criterion. It is
calculated:
PRN = Criterion weight × Element weight
It is a measure of the relevance of that element with respect only to that criteri-
on. For this case the partial relevance numbers are shown in Table A4.15.
For instance, at level 2 the PRN for direction with respect to capital costs is 0.05
× 0.25 = 0.0125.
PRNs are calculated for each element at each level for each criterion.
(e) The LRN for each element is the sum of the PRNs for that element (see Ta-
ble A4.16). It is a measure of the importance of that element relative to others at
the same level in achieving the highest-level objective. For example, at level 2 the
LRN for direction is 0.0375 (= 0.0150 + 0 + 0.0100 + 0 + 0.0125). There is one
LRN for each element at each level.
(f) There is one CRN for each element. They are calculated by multiplying the LRN
of an element by the LRNs of each associated element at a higher level (see Ta-
ble A4.17). This gives each element an absolute measure of its relevance.
For example:
CRN (direction) = LRN (direction) × LRN (control)
= 0.0375 × 0.5625
= 0.021
The CRNs at the second level show the comparative importance with respect to
the overall objective of the elements at that level. Thus, speed is the most im-
portant (0.240) and baggage the least important (0.012).


Recall that by this process the bottom row of elements (specific technological
requirements) will have overall measures of their relevance in achieving the ob-
jective at the highest level of the tree. This should lead to decisions about the
importance, timing, resource allocation, etc. of the tasks ahead. (A short
computational sketch follows Table A4.17 below.)

Table A4.15 Partial relevance numbers


Criteria          Performance   Comfort   Safety   Running costs   Capital costs
Criterion weight 0.30 0.20 0.10 0.15 0.25
Elements at level 1 Element weights
Accommodation 0.10 0.75 0.60 0.05 0.15
Control 0.65 0.15 0.30 0.80 0.75
Information 0.25 0.10 0.10 0.15 0.10
1.00 1.00 1.00 1.00 1.00
Partial relevance numbers
Accommodation 0.0300 0.1500 0.0600 0.0075 0.0375
Control 0.1950 0.0300 0.0300 0.1200 0.1875
Information 0.0750 0.0200 0.0100 0.0225 0.0250
Elements at level 2 Element weights
Passengers 0.05 0.70 0.55 0.05 0.10
Baggage 0.05 0.05 0.05 0.00 0.05
Direction 0.05 0.00 0.10 0.00 0.05
Speed 0.45 0.10 0.10 0.75 0.60
Communication 0.10 0.00 0.05 0.00 0.05
Instruments 0.15 0.05 0.05 0.15 0.10
Visibility 0.15 0.10 0.10 0.05 0.05
1.00 1.00 1.00 1.00 1.00
Partial relevance numbers
Passengers 0.0150 0.1400 0.0550 0.0075 0.0250
Baggage 0.0150 0.0100 0.0050 0.0000 0.0125
Direction 0.0150 0.0000 0.0100 0.0000 0.0125
Speed 0.1350 0.0200 0.0100 0.1125 0.1500
Communication 0.0300 0.0000 0.0500 0.0000 0.0125
Instruments 0.0450 0.0100 0.0050 0.0225 0.0250
Visibility 0.0450 0.0200 0.0100 0.0075 0.0125


Table A4.16 Local relevance numbers


Criteria          Performance   Comfort   Safety   Running costs   Capital costs   LRN
Level 1 Partial relevance numbers
Accommodation 0.0300 0.1500 0.0600 0.0075 0.0375 0.2850
Control 0.1950 0.0300 0.0300 0.1200 0.1875 0.5625
Information 0.0750 0.0200 0.0100 0.0225 0.0250 0.1525
1.000
Level 2 Partial relevance numbers
Passengers 0.0150 0.1400 0.0550 0.0075 0.0250 0.2425
Baggage 0.0150 0.0100 0.0050 0.0000 0.0125 0.0425
Direction 0.0150 0.0000 0.0100 0.0000 0.0125 0.0375
Speed 0.1350 0.0200 0.0100 0.1125 0.1500 0.4275
Communication 0.0300 0.0000 0.0500 0.0000 0.0125 0.0475
Instruments 0.0450 0.0100 0.0050 0.0225 0.0250 0.1075
Visibility 0.0450 0.0200 0.0100 0.0075 0.0125 0.0950
1.0000

Table A4.17 Cumulative relevance numbers


Element          LRN       LRN of higher element   CRN
Passengers       0.2425    0.2850                  0.069
Baggage          0.0425    0.2850                  0.012
Direction        0.0375    0.5625                  0.021
Speed            0.4275    0.5625                  0.240
Communication    0.0475    0.5625                  0.027
Instruments      0.1075    0.5625 and 0.1525       0.077
Visibility       0.0950    0.1525                  0.015
(Instruments sits under both Control and Information in the tree, so its CRN is
the sum 0.1075 × 0.5625 + 0.1075 × 0.1525 = 0.077. The Visibility LRN is 0.0950,
as in Table A4.16.)
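
As mentioned above, the relevance-number arithmetic is entirely mechanical, so it lends itself to a short program. The sketch below uses the level 1 weights of Table A4.14, plus Speed's level 2 weights as a single example, and reproduces the corresponding LRNs and CRN:

# Criterion weights and level 1 element weights from Table A4.14.
criterion_w = [0.30, 0.20, 0.10, 0.15, 0.25]
level1_w = {
    "Accommodation": [0.10, 0.75, 0.60, 0.05, 0.15],
    "Control":       [0.65, 0.15, 0.30, 0.80, 0.75],
    "Information":   [0.25, 0.10, 0.10, 0.15, 0.10],
}
# PRN = criterion weight x element weight; LRN = sum of PRNs across criteria.
lrn = {e: sum(c * w for c, w in zip(criterion_w, ws))
       for e, ws in level1_w.items()}
print(lrn)   # Accommodation 0.2850, Control 0.5625, Information 0.1525

speed_w = [0.45, 0.10, 0.10, 0.75, 0.60]                       # Speed at level 2
speed_lrn = sum(c * w for c, w in zip(criterion_w, speed_w))   # 0.4275
speed_crn = speed_lrn * lrn["Control"]       # 0.240: CRN = LRN x parent LRN
print(speed_lrn, speed_crn)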


Marking Scheme (out of 20) Marks


Based on two levels of elements
(a) Relevance tree 4
(b) Criteria with weights 2
(c) Element weights 2
(d) Partial relevance numbers 4
(e) Local relevance numbers 4
(f) Cumulative relevance numbers 4
Total 20

Module 14

Review Questions
14.1 The correct answer is C. C is the definition of time series forecasting. A is untrue
because TS methods work for stationary and non-stationary series. Decomposition
is the only method that uses regression even to a small extent. Therefore, B is
untrue. D is partly true. Some, but not all, TS methods are automatic and need no
intervention once set up.
14.2 The correct answer is A, D. TS methods analyse the patterns of the past and project
them into the future. Where conditions are not changing, the historical record is a
reliable guide to the future. TS methods are therefore good in the short term when
conditions have insufficient time to change (situation A) and in stable situations (D).
For the same reason, they are not good at predicting turning points (situation B). In
order to analyse the past accurately, a long data series is needed, thus situation C is
unlikely to be one in which TS methods are used.
14.3 The correct answer is D. A stationary series has no trend and constant variance.
Homoscedastic means ‘with constant variance’. Thus, it is only D that fully defines a
stationary series.
14.4 The correct answer is False. The part of the statement referring to MA is true; the
part referring to ES is false. ES gives unequal weights to past values, but they are
not completely determined by the forecaster. They are partly chosen by the fore-
caster in that a smoothing constant, α, is selected.
14.5 The correct answer is A. The three-point moving average for 9 is the average of the
values for periods 6, 7 and 8:
Forecast for period 9 = (11 + 7 + 10)/3 = 9.3
14.6 The correct answer is A.


Value Old smoothed New smoothed


7 – 7.0
9 7.0 7.8 (= 0.6 × 7 + 0.4 × 9)
8 7.8 7.9 (= 0.6 × 7.8 + 0.4 × 8)
10 7.9 8.7 (= 0.6 × 7.9 + 0.4 × 10)
12 8.7 10.0 (= 0.6 × 8.7 + 0.4 × 12)
11 10.0 10.4 (= 0.6 × 10 + 0.4 × 11)
7 10.4 9.0 (= 0.6 × 10.4 + 0.4 × 7)
10 9.0 9.4 (= 0.6 × 9 + 0.4 × 10)

The forecast for period 9 is 9.4.
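
The whole recurrence takes only a few lines of Python. The sketch below carries full precision rather than rounding at each step as the table does, so it ends on 9.43 rather than exactly 9.4:

# Exponential smoothing with alpha = 0.4, started from the first actual value.
values = [7, 9, 8, 10, 12, 11, 7, 10]
alpha = 0.4
s = values[0]
for v in values[1:]:
    s = (1 - alpha) * s + alpha * v   # new smoothed = 0.6 x old + 0.4 x actual
print(round(s, 2))                    # about 9.43: the forecast for period 9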


14.7 The correct answer is D. The two smoothing constants are chosen independently
with a view only to ensuring the best possible accuracy.
14.8 The correct answer is D. The ratio between an actual and a smoothed value is a
measure of seasonality.
14.9 The correct answer is C. The trend at any point is a + bt, where a and b are the
regression coefficients. Therefore, trend at t = 6 is 12.2 + (6 × 3.1) = 30.8.
14.10 The correct answer is C. Smoothing should eliminate random and seasonal
fluctuations. Thus, the smoothed value should contain the cycle and the trend.
Dividing by the trend, the cycle remains:
Cycle = Smoothed value/Trend
14.11 The correct answer is D. The length of a cycle is the time it takes to repeat itself. In
this case the time is 20 quarters (i.e. five years).
14.12 The correct answer is A. A forecast at t = 10 is given by:
Forecast = Trend at t = 10 × Cycle × Seasonality for Q2
= [9 + (1.1 × 10)] × 1.0 × 1.06
= 21.2

Case Study 14.1: Interior Furnishings


1
(a) The forecast for future months is 2442 – see Table A4.18.

Table A4.18 Moving average forecast


Demand Three-month moving average
2016 Oct. 2000
Nov. 1350 1767 = (2000 + 1350 + 1950)/3 (rounded)
Dec. 1950 1758 = (1350 + 1950 + 1975)/3 (rounded)
2017 Jan. 1975 2342 etc.
Feb. 3100 2275
Mar. 1750 2133

Apr. 1550 1533
May 1300 1683
June 2200 2092
July 2775 2442
Aug. 2350 2442

(b) The forecast for future months is 2056 – see Table A4.19.

Table A4.19 Exponential smoothing forecast


            Demand   Smoothed     Old        Calculation of the
                     (α = 0.1)    smoothed   smoothed value
2016 Oct.   2000     2000         –
     Nov.   1350     1935         2000       1935 = 0.9 × 2000 + 0.1 × 1350
     Dec.   1950     1936         1935       1936 = 0.9 × 1935 + 0.1 × 1950
2017 Jan.   1975     1940         1936       1940 = 0.9 × 1936 + 0.1 × 1975
     Feb.   3100     2056         1940       2056 = 0.9 × 1940 + 0.1 × 3100
     Mar.   1750     2025         2056       2025 = 0.9 × 2056 + 0.1 × 1750
     Apr.   1550     1978         2025       1978 = 0.9 × 2025 + 0.1 × 1550
     May    1300     1910         1978       1910 = 0.9 × 1978 + 0.1 × 1300
     June   2200     1939         1910       1939 = 0.9 × 1910 + 0.1 × 2200
     July   2775     2023         1939       2023 = 0.9 × 1939 + 0.1 × 2775
     Aug.   2350     2056         2023       2056 = 0.9 × 2023 + 0.1 × 2350

(c) In both cases it is assumed that the series are stationary. In other words, there is
no trend in the data and they have constant variance through time.

Marking Scheme (out of 10) Marks


(a) Moving average
– Basic calculation 1
– Continuing calculation through series 1
– Forecast 1


(b) Exponential smoothing
– Starting off with actual value 1
– Basic calculation 2
– Continuing calculation through series 1
– Forecast 1
(c) Assumptions
– No trend 1
– Constant variance 1
Total 10

Case Study 14.2: Garden Machinery Manufacture


1 Holt’s Method is based on the following formulae:
To smooth the series: S(t) = (1 − α)·(S(t−1) + b(t−1)) + α·y(t)
To smooth the trend: b(t) = (1 − γ)·b(t−1) + γ·(S(t) − S(t−1))
To forecast: F(t+m) = S(t) + m·b(t)
where S(t) = smoothed value of series at time t
b(t) = smoothed value of trend at time t
y(t) = actual value of series at time t
α, γ = smoothing constants
F(t+m) = forecast m periods ahead
The forecasts for quarterly demand in 2018 are calculated in Table A4.20. The total
forecast is therefore 215.7 + 223.9 + 232.1 + 240.3 = 912.0.

Table A4.20 Holt’s Method forecast


α = 0.2 γ = 0.3
Period Actual Smoothed series (St) Smoothed trend (bt)
2016 1 140 140
2 155 155 15.0
3 155 167 = 0.8(155 + 15) + 0.2(155) 14.1 = 0.7(15) + 0.3(167 − 155)
4 170 178.9 = 0.8(167 + 14.1) + 0.2(170) 13.4 = 0.7(14.1) + 0.3(178.9 − 167)
2017 1 180 189.8 = 0.8(178.9 + 13.4) + 0.2(180) 12.7 = 0.7(13.4) + 0.3(189.8 − 178.9)
2 170 196.0 = 0.8(189.8 + 12.7) + 0.2(170) 10.8 = 0.7(12.7) + 0.3(196 − 189.8)
3 185 202.4 = 0.8(196 + 10.8) + 0.2(185) 9.5 = 0.7(10.8) + 0.3(202.4 − 196)
4 190 207.5 = 0.8(202.4 + 9.5) + 0.2(190) 8.2 = 0.7(9.5) + 0.3(207.5 − 202.4)
Forecasts
2018 1 207.5 + 8.2 = 215.7
2 207.5 + (2 × 8.2) = 223.9
3 207.5 + (3 × 8.2) = 232.1

4 207.5 + (4 × 8.2) = 240.3
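
A Python sketch of Holt's Method, initialised as in Table A4.20 with S = 155 and b = 15 at period 2, reproduces the 2018 forecasts; because it carries full precision instead of rounding at each step, its figures differ slightly from the table's:

# Holt's Method with alpha = 0.2, gamma = 0.3.
actual = [140, 155, 155, 170, 180, 170, 185, 190]   # 2016 Q1 to 2017 Q4
alpha, gamma = 0.2, 0.3
s, b = actual[1], actual[1] - actual[0]             # S = 155, b = 15 at period 2
for y in actual[2:]:
    s_prev = s
    s = (1 - alpha) * (s + b) + alpha * y           # smooth the series
    b = (1 - gamma) * b + gamma * (s - s_prev)      # smooth the trend
forecasts = [round(s + m * b, 1) for m in range(1, 5)]
print(forecasts, round(sum(forecasts), 1))
# Close to the table's 215.7, 223.9, 232.1, 240.3 (total 912); the small
# differences are due to the rounding at each step used in Table A4.20.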

Marking Scheme (out of 10) Marks


Series:
– Starting off 1
– Basic calculation 2
– Continuing calculation 1
Trend:
– Starting off 1
– Basic calculation 2
– Continuing calculations 1
Forecast:
– Individual quarters 1
– Total 1
Total 10

Case Study 14.3: McClune and Sons


1
(a) Keith Scott, assistant to the director of finance at McClune and Sons, is to
develop a forecast of Scotch whisky industry sales for each of the 12 months of
2015 as an input to some financial statements. The historical data available (Ex-
hibits 1 and 2) show an obvious seasonal pattern as well as a trend. This suggests
that forecasts could be made using the technique called classical decomposition.
The basis of this method is the assumption that the series has four component
parts: trend, cycle, seasonality and randomness. The technique involves isolating
the components, which can then be reassembled to make forecasts for future
time periods.
Forecast for time period = Trend for × Cycle for × Seasonality for
Since the random component is by definition unpredictable and has an average
of 0, it must be omitted from the forecast.
(i) Isolating the trend. If the trend is linear (a straight line), linear regression
can be used to identify it. This process involves regressing the historical
sales data against the time index (a variable taking the values 1, 2, 3, 4, 5, …
for the time periods). The fitted values of this regression constitute the
trend element of the decomposition analysis. Results are:
= 8.83 + 0.055
where T = trend or fitted value for time period t, and t = 1, 2, 3, … for suc-
ceeding time periods.


Exhibit 3 shows the historical data in column 1, the time index in column 2
and the trend in column 3.

Exhibit 3
Year Month   (1) Sales   (2) Time   (3) Trend   (4) Mov. av.   (5) Cycle   (6) Season
2006 1 8.12 1 8.89
2006 2 7.76 2 8.94
2006 3 7.97 3 9.00
2006 4 7.88 4 9.05
2006 5 8.45 5 9.11
2006 6 8.68 6 9.16 9.80 1.07 0.89
2006 7 6.77 7 9.22 9.74 1.06 0.70
2006 8 6.60 8 9.27 9.70 1.05 0.68
2006 9 8.39 9 9.33 9.76 1.05 0.86
2006 10 11.88 10 9.38 9.87 1.05 1.20
2006 11 15.58 11 9.44 10.05 1.06 1.55
2006 12 19.50 12 9.49 10.09 1.06 1.93
2007 1 7.43 13 9.55 10.25 1.07 0.72
2007 2 7.26 14 9.60 10.07 1.05 0.72
2007 3 8.67 15 9.66 10.13 1.05 0.86
2007 4 9.26 16 9.71 10.08 1.04 0.92
2007 5 10.55 17 9.77 10.05 1.03 1.05
2007 6 9.17 18 9.82 9.93 1.01 0.92
2007 7 8.66 19 9.88 9.80 0.99 0.88
2007 8 4.45 20 9.93 9.71 0.98 0.46
2007 9 9.10 21 9.99 9.68 0.97 0.94
2007 10 11.32 22 10.04 9.65 0.96 1.17
2007 11 15.23 23 10.10 9.53 0.94 1.60
2007 12 18.02 24 10.15 9.59 0.94 1.88
2008 1 5.87 25 10.21 9.39 0.92 0.62
2008 2 6.19 26 10.26 9.35 0.91 0.66
2008 3 8.34 27 10.32 9.20 0.89 0.91
2008 4 8.91 28 10.37 9.35 0.90 0.95
2008 5 9.05 29 10.43 9.41 0.90 0.96
2008 6 9.98 30 10.48 9.82 0.94 1.02
2008 7 6.26 31 10.54 9.98 0.95 0.63
2008 8 3.98 32 10.59 10.12 0.96 0.39
2008 9 7.24 33 10.65 10.12 0.95 0.72

2008 10 13.18 34 10.70 10.17 0.95 1.30
2008 11 15.88 35 10.76 10.23 0.95 1.55
2008 12 22.90 36 10.81 10.25 0.95 2.23
2009 1 7.88 37 10.87 10.50 0.97 0.75
2009 2 7.81 38 10.92 10.52 0.96 0.74
2009 3 8.40 39 10.98 10.83 0.99 0.78
2009 4 9.43 40 11.03 10.79 0.98 0.87
2009 5 9.76 41 11.09 11.22 1.01 0.87
2009 6 10.28 42 11.14 11.32 1.02 0.91
2009 7 9.27 43 11.20 11.43 1.02 0.81
2009 8 4.16 44 11.25 11.54 1.03 0.36
2009 9 10.98 45 11.31 11.52 1.02 0.95
2009 10 12.73 46 11.36 11.52 1.01 1.11
2009 11 21.03 47 11.42 11.55 1.01 1.82
2009 12 24.08 48 11.47 11.63 1.01 2.07
2010 1 9.19 49 11.53 11.63 1.01 0.79
2010 2 9.21 50 11.58 11.64 1.01 0.79
2010 3 8.09 51 11.64 11.71 1.01 0.69
2010 4 9.45 52 11.69 11.92 1.02 0.79
2010 5 10.14 53 11.75 12.08 1.03 0.84
2010 6 11.17 54 11.80 12.46 1.06 0.90
2010 7 9.29 55 11.86 12.45 1.05 0.75
2010 8 4.36 56 11.91 12.52 1.05 0.34
2010 9 11.75 57 11.97 12.79 1.07 0.92
2010 10 15.31 58 12.02 12.91 1.07 1.19
2010 11 22.94 59 12.08 13.11 1.09 1.75
2010 12 28.67 60 12.13 13.17 1.09 2.18
2011 1 9.04 61 12.19 13.14 1.08 0.69
2011 2 10.01 62 12.24 13.16 1.07 0.76
2011 3 11.41 63 12.30 13.28 1.08 0.86
2011 4 10.82 64 12.35 13.45 1.09 0.80
2011 5 12.57 65 12.41 13.82 1.11 0.91
2011 6 11.83 66 12.46 14.36 1.15 0.82
2011 7 8.91 67 12.52 14.16 1.13 0.63
2011 8 4.61 68 12.57 13.94 1.11 0.33
2011 9 13.21 69 12.63 13.70 1.08 0.96

2011 10 17.39 70 12.68 13.59 1.07 1.28
2011 11 27.33 71 12.74 13.16 1.03 2.08
2011 12 35.21 72 12.79 13.01 1.02 2.71
2012 1 6.68 73 12.85 13.16 1.02 0.51
2012 2 7.33 74 12.90 13.14 1.02 0.56
2012 3 8.53 75 12.96 13.14 1.01 0.65
2012 4 9.46 76 13.01 13.05 1.00 0.73
2012 5 7.41 77 13.07 12.84 0.98 0.58
2012 6 10.08 78 13.12 12.67 0.97 0.80
2012 7 10.67 79 13.18 12.94 0.98 0.82
2012 8 4.40 80 13.23 13.00 0.98 0.34
2012 9 13.21 81 13.29 13.19 0.99 1.00
2012 10 16.25 82 13.34 13.39 1.00 1.21
2012 11 24.90 83 13.40 13.82 1.03 1.80
2012 12 33.08 84 13.46 14.01 1.04 2.36
2013 1 9.95 85 13.51 14.10 1.04 0.71
2013 2 8.00 86 13.57 14.08 1.04 0.57
2013 3 10.84 87 13.62 14.24 1.05 0.76
2013 4 11.83 88 13.68 14.35 1.05 0.82
2013 5 12.68 89 13.73 14.36 1.05 0.88
2013 6 12.33 90 13.79 14.27 1.04 0.86
2013 7 11.72 91 13.84 14.36 1.04 0.82
2013 8 4.20 92 13.90 14.44 1.04 0.29
2013 9 15.06 93 13.95 14.50 1.04 1.04
2013 10 17.66 94 14.01 14.53 1.04 1.22
2013 11 24.92 95 14.06 14.45 1.03 1.73
2013 12 32.06 96 14.12 14.54 1.03 2.21
2014 1 11.00 97 14.17 14.47 1.02 0.76
2014 2 9.02 98 14.23 14.42 1.01 0.63
2014 3 11.58 99 14.28 14.40 1.01 0.80
2014 4 12.11 100 14.34
2014 5 11.68 101 14.39
2014 6 13.44 102 14.45
2014 7 10.87 103 14.50
2014 8 3.62 104 14.56

2014 9 14.87 105 14.61
(ii) Isolating the cycle. For monthly data, a 12-point moving average can be
expected to smooth away both random fluctuations and seasonal variations.
The moving average thus contains the remaining two elements, the trend
and the cycle:
Moving average = Trend × Cycle
Consequently, the ratio Moving average/Trend is the cycle.
The relevant calculations are also in Exhibit 3. Column 4 shows the 12-point
moving average; column 3 shows the trend; column 5 shows the cycle, calcu-
lated as column 4 divided by column 3.

[Exhibit 4 is a line graph of the cycle (vertical axis, roughly 0.88 to 1.16)
plotted against time (horizontal axis, periods 0 to 120).]

Exhibit 4: Cyclical effect


Exhibit 4 shows the cycle as a graph plotted against time. In the early part of
the series, up to period 70, there is a clear cyclical pattern. After this the pat-
tern is not so clear and seems to be disappearing.
(iii) Isolating the seasonality. The seasonality is calculated from the moving
average and the actual values:
Actual values = Trend × Cycle × Seasonality × Randomness
Moving average = Trend × Cycle
Consequently, the ratio:
Actual value/Moving average = Seasonality × Randomness
Since the seasonality is calculated several times for each month (e.g. for Janu-
ary, an estimate of seasonality is made for 2006, 2007, 2008, …), the


randomness can, it is hoped, be eliminated by averaging the estimates for a
particular month.
In Exhibit 3, the individual (unaveraged) seasonal factors are shown in col-
umn 6 (calculated as column 1 divided by column 4). In Exhibit 5 these
individual seasonal factors are averaged to give the true seasonality.

Exhibit 5
       2006   2007   2008   2009   2010   2011   2012   2013   2014   Average
Jan. 0.72 0.62 0.75 0.79 0.69 0.51 0.71 0.76 0.69
Feb. 0.72 0.66 0.74 0.79 0.76 0.56 0.57 0.63 0.68
Mar. 0.86 0.91 0.78 0.69 0.86 0.65 0.76 0.80 0.79
Apr. 0.92 0.95 0.87 0.79 0.80 0.73 0.82 0.84
May 1.05 0.96 0.87 0.84 0.91 0.58 0.88 0.87
June 0.89 0.92 1.02 0.91 0.90 0.82 0.80 0.86 0.89
July 0.70 0.88 0.63 0.81 0.75 0.63 0.82 0.82 0.75
Aug. 0.68 0.46 0.39 0.36 0.34 0.33 0.34 0.29 0.40
Sep. 0.86 0.94 0.72 0.95 0.92 0.96 1.00 1.04 0.92
Oct. 1.20 1.17 1.30 1.11 1.19 1.28 1.21 1.22 1.21
Nov.   1.55 1.60 1.55 1.82 1.75 2.08 1.80 1.73 1.73
Dec. 1.93 1.88 2.23 2.07 2.18 2.71 2.36 2.21 2.20
Total 11.98
Average 1.00

A further adjustment to seasonality has sometimes to be made. If the overall


effect of the calculated seasonality is to increase or decrease the series, as well
as to rearrange the distribution within the year, then this former effect must
be removed. The indication that this effect is present is that the seasonal fac-
tors for each month have an average value different from 1.0. In Exhibit 5 it
can be seen that the seasonal factors do have an average of 1.0 and therefore
the adjustment does not have to be made in this case.
(iv) To forecast the 2015 Scotch whisky sales, the components have to be esti-
mated for each month of the year and these estimates multiplied together.
The forecasts are shown in Exhibit 6. Note that totals may not work out ex-
actly because of rounding. The build-up of the forecast for January 2015 is
explained here as an example.
January 2015 is the 109th time period (t = 109). Thus, the trend for January
2015 is:
= 8.83 + 0.055 × 109
= 14.83
The cycle for January 2015 has to be estimated from column 5 of Exhibit 3.
Since the pattern is not a convincing one and it does seem to peter out, it is


assumed that there is no definite cycle. The cyclical factor for January 2015
and the other months is taken to be 1.0.
Seasonality for January 2015, taken from Exhibit 5, is 0.69 (0.694 before rounding).
The forecast for January 2015 is calculated by multiplying the three compo-
nents together:
Forecast = 14.83 × 1.0 × 0.694
= 10.29
The forecasts for the whole of 2015 are shown in Exhibit 6. (A computational
sketch of the procedure follows part (b) below.)

Exhibit 6: Forecasts for 2015


Year Month Time Trend Cycle Season Forecast
2015 1 109 14.83 1 0.69 10.29
2015 2 110 14.89 1 0.68 10.10
2015 3 111 14.94 1 0.79 11.77
2015 4 112 15.00 1 0.84 12.62
2015 5 113 15.05 1 0.87 13.09
2015 6 114 15.11 1 0.89 13.44
2015 7 115 15.16 1 0.75 11.43
2015 8 116 15.22 1 0.40 6.09
2015 9 117 15.27 1 0.92 14.11
2015 10 118 15.33 1 1.21 18.53
2015 11 119 15.38 1 1.73 26.68
2015 12 120 15.44 1 2.20 33.89

(b) There are a number of assumptions and limitations in the use of these forecasts.
These reservations do not mean of course that the forecasts cannot be used, but
they do mean that they should only be used in full awareness of the problems
involved. The reservations are:
(i) The decomposition method does not allow the accuracy of the forecasts to
be measured.
(ii) Other forecasting methods could be used in such a situation, for example, the
Holt–Winters Method. Keith Scott should try other methods and compare
their accuracy.
(iii) Keith Scott should ensure he discusses the forecasts thoroughly with man-
agement before formalising the method by incorporating the forecasts in pro
formas and the like.
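
As noted under (a), the decomposition procedure can be sketched compactly in Python. This is an illustration, not the original analysis: it assumes the series starts in January, positions a simple 12-term moving average as in Exhibit 3, and takes the cycle to be 1.0, as the answer does. The variable sales stands for the 105 monthly values in column 1 of Exhibit 3.

def linear_trend(ys):
    # Least squares fit of ys against t = 1, 2, 3, ...; returns (a, b) in T = a + b*t.
    n = len(ys)
    ts = range(1, n + 1)
    t_bar = sum(ts) / n
    y_bar = sum(ys) / n
    b = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys)) \
        / sum((t - t_bar) ** 2 for t in ts)
    return y_bar - b * t_bar, b

def decomposition_forecast(sales, future_periods):
    n = len(sales)
    a, b = linear_trend(sales)                       # trend, as in step (i)
    # 12-point moving average (step (ii)): smooths away seasonality and randomness.
    ma = {i: sum(sales[i - 5:i + 7]) / 12 for i in range(5, n - 6)}
    # Seasonal factors (step (iii)): actual/moving average, averaged per month.
    per_month = {}
    for i, m in ma.items():
        per_month.setdefault(i % 12, []).append(sales[i] / m)
    season = {mth: sum(v) / len(v) for mth, v in per_month.items()}
    # Forecast (step (iv)) = trend x cycle (taken as 1.0) x seasonality.
    return [(a + b * t) * season[(t - 1) % 12] for t in future_periods]

# Usage: with the 105 values of Exhibit 3 (January 2006 to September 2014),
# January to December 2015 are periods 109 to 120:
# forecasts_2015 = decomposition_forecast(sales, range(109, 121))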

Marking Scheme (out of 30) Marks


(a) The forecasts
(i) Measuring the trend
– method 3
– calculations 2
(ii) Measuring the cycle
– method 3



– calculations 2
– conclusion 2
(iii) Measuring seasonality
– method for basic seasonality 3
– calculations 2
– adjustment 2
(iv) Combining into a forecast
– method 3
– calculations 2
(b) Comments on the approach
(i) Lack of a measure of the forecasting accuracy 2
(ii) Possibility of using other techniques 2
(iii) Need to communicate with management colleagues 2
Total 30

Module 15

Review Questions
15.1 The correct answer is False. Both manager and expert should view forecasting as a
system. They differ in the focus of their skills. The expert knows more about
techniques; the manager knows more about the wider issues.
15.2 The correct answer is A. The people who are to use the system should have primary
responsibility for its development. They will then have confidence in it and see it as
‘their’ system.
15.3 The correct answer is D. Analysing the decision-making process may reveal
fundamental flaws in the system or organisational structure which must be ad-
dressed before any forecasts can hope to be effective.
15.4 The correct answer is False. The conceptual modelling stage includes consideration
of possible causal variables but has wider objectives. The stage should be concerned
with all influences on the forecast variable. Time patterns and qualitative variables
also come into the reckoning.
15.5 The correct answer is D. The smoothed value calculated in period 5 from the actual
values for periods 3–5 is 16.0. This is the forecast for the next period ahead, period
6.
15.6 The correct answer is A. The one-step-ahead forecast error for period 6 is the
difference between the actual value and the one-step-ahead forecast for that period:
Error = Actual − Forecast
= 17.0 − 16.0
= 1.0


15.7 The correct answer is C. The MAD is the average of the errors, ignoring the sign:
MAD = (3.3 + 5.7 + 1.0)/3
= 3.3
15.8 The correct answer is B. The MSE is the average of the squared errors:
MSE = (3.3² + 5.7² + 1.0²)/3
= 44.4/3
= 14.8
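
Both measures take one line each in Python (a sketch using the three errors above):

errors = [3.3, 5.7, 1.0]                               # one-step-ahead forecast errors
mad = sum(abs(e) for e in errors) / len(errors)        # about 3.3
mse = sum(e ** 2 for e in errors) / len(errors)        # 44.4/3, about 14.8
print(mad, mse)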
15.9 The correct answer is E. MAD and MSE are different measures of scatter. In
comparing forecasting methods they may occasionally give different answers,
suggesting different methods as being superior.
MAD measures the average error. The method for which it is lower is therefore
more accurate on average. The MSE squares the errors. This can penalise heavily a
method that leaves large residuals. Forecasting method 2 may be better on average
than method 1 (i.e. have a lower MAD), but contained within the MAD for method
2 there may be some large errors that cause the MSE of method 2 to be higher than
that of method 1.
15.10 The correct answer is C. Other measures of closeness of fit (e.g. the correlation
coefficient) are based on the same data as were used to calculate the model parame-
ters. This method keeps the two sets of data apart. A and B are true but not the
reasons why the test is described as an independent one.
15.11 The correct answer is A, B, C. A, B and C summarise the reasons why step 7,
incorporating judgements, is an important part of the system.
15.12 The correct answer is False. A consensus on what the problems are can be just as
difficult to obtain as a consensus on the solutions.

Case Study 15.1: Interior Furnishings


1
(a) Using a three-point moving average (MA), the forecast is as in Table A4.21.

Table A4.21 Moving average forecast error


Time Demand Three- Forecast Error
point MA
2016 Oct. 2000
Nov. 1350 1767
Dec. 1950 1758
2017 Jan. 1975 2342 1767 208
Feb. 3100 2275 1758 1342
Mar. 1750 2133 2342 −592
Apr. 1550 1533 2275 −725
May 1300 1683 2133 −833
June 2200 2092 1533 667
July 2775 2442 1683 1092

Aug. 2350 2442 2092 258

(b) Using exponential smoothing (ES) with α = 0.1, the forecast is as in Table A4.22.

Table A4.22 Exponential smoothing forecast error


Time Demand ES Forecast Error
2016 Oct. 2000 2000
Nov. 1350 1935
Dec. 1950 1936
2017 Jan. 1975 1940 1936 39
Feb. 3100 2056 1940 1160
Mar. 1750 2025 2056 −306
Apr. 1550 1978 2025 −475
May 1300 1910 1978 −678
June 2200 1939 1910 290
July 2775 2023 1939 836
Aug. 2350 2056 2023 327

(c) Calculating the MSE for the exponential smoothing (a short computational
sketch follows part (e) below):
MSE = Sum of squared one-step-ahead errors/No. of errors
= (39² + 1160² + 306² + 475² + 678² + 290² + 836² + 327²)/8
= 377 000 (approx.)
Calculating the MSE for the moving average:
MSE = (208² + 1342² + 592² + 725² + 833² + 667² + 1092² + 258²)/8
= 639 800 (approx.)
(d) Exponential smoothing has the lower MSE and therefore performs better over
the historical time series. The forecast for September 2017 is the exponential
smoothing forecast. The most recent smoothed value is the forecast for the next
period ahead. Thus, the forecast for September 2017 = 2056.
(e) The reservations about the forecast are:
(i) Exponential smoothing assumes the series is stationary. This has not been
checked (there are insufficient data to do so).
(ii) The possibility of a seasonal effect has been ignored (it would be impossible
to measure seasonality from less than one year’s data).
(iii) There is much volatility in the series, as seen by considering the data and
the MSE. The exponential smoothing forecast, although better than the
moving average, is not particularly good. It may be that different types of
forecasting methods should be used. Perhaps a causal model should be
tried.

Marking Scheme (out of 20) Marks


(a) Calculating the MA smoothed values 2
Translating smoothed values into forecasts 2
(b) Calculating the ES smoothed values 2
Translating smoothed values into forecasts 2
(c) Correct method for calculating
– one-step-ahead errors 1
– MSE 2
Calculating the MSE for the MA forecasts 2
Calculating the MSE for the ES forecasts 2
(d) Correct choice of forecast for September 2017 2
(e) Reservations
– Stationarity 1
– Seasonality 1
– Other methods 1
Total 20

Case Study 15.2: Theatre Company


1 There are nine steps in the guidelines.

Step 1: Analyse the Decision-Making System


In planning future productions, many interlinked decisions must be taken. All
depend in some way on the forecasts:
(a) length of run
(b) seat prices
(c) discounts
(d) promotions
(e) advertising
(f) costs, etc.
The timing of these decisions, the people who take them and how decisions are
altered or reviewed should be mapped out. In the light of this information it may be
necessary to revise the decision-making process.

Step 2: Define Forecasts Needed


Having completed step 1, it should be possible to make a detailed list of the full
range of forecasts required, the levels of accuracy, their timings, their time horizons
and by whom they will be used.

Step 3: Prepare a Conceptual Model


The factors affecting the demand for tickets should be considered and listed. The
main factors are:

(a) Internal
(i) reputation of play
(ii) reputation of actors
(iii) presence of star names
(iv) ticket prices
(v) advertising/promotional expenditures
(vi) demand for previous, similar productions
(b) External
(i) personal disposable income
(ii) weather
(iii) time of year
(iv) day of week
(v) competition
These are the factors that it is hoped a forecasting model could take into account.

Step 4: Data Available


It is likely that data on all the factors listed in step 3 above will not be available. In
particular, while attendance figures will be available, demand data will not. This is
especially important because of ‘house full’ productions, where demand may well
have been three or four times the attendance. Since it is demand that is to be
forecast, some subjective assessments may have to be made or some surveys may
have to be carried out. Similarly, advertising expenditure may not be available for
individual productions. Expenditures may have to be broken down subjectively.

Step 5: Decide on the Technique


In this case, the technique would probably be a causal model relating demand to the
factors listed under step 3.

Step 6: Test the Accuracy


The accuracy would be tested through the independent test, the second of the two
presented in this module. Data from perhaps two productions would be put to one
side. The coefficients of the forecasting model would be estimated from data on
other productions and this model used to forecast demand for tickets at the two
productions. The performance of the model would be quantified by a measure of
scatter (MSE and MAD). This would be used to choose between different models
and to decide whether the accuracy was sufficient for the decisions to be taken.
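In outline, the test might be organised as in the following Python sketch. fit_model, model.forecast and the data structures are placeholders for whatever causal model emerges from step 5, not features of any particular library.

def independent_test(fit_model, training_productions, holdout_productions):
    # Fit on the training productions, forecast the held-out ones, and
    # summarise the forecast errors with MAD and MSE.
    model = fit_model(training_productions)            # estimate coefficients
    errors = [actual - model.forecast(production)
              for production, actual in holdout_productions]
    n = len(errors)
    mad = sum(abs(e) for e in errors) / n
    mse = sum(e ** 2 for e in errors) / n
    return mad, mse

The two productions put to one side form holdout_productions; the rest form training_productions. The model with the lower MAD and MSE on the holdout would be preferred.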

Step 7: Decide How to Incorporate Judgement


Situations will arise where special circumstances warrant the making of adjustments
to a statistical forecast. Unusual publicity prior to the opening of a play or the
withdrawal of a famous actor would be examples of such circumstances in this case.
Judgement would be needed to modify the forecasts. A system, perhaps regular
monthly meetings, should be instituted to discuss and make alterations. It would be
essential that changes be carefully considered by all involved rather than hasty or

unilateral decisions be taken. Those attending the meetings would be accountable for any alterations made.

Step 8: Implement
The manager of the forecasting system should establish and gain agreement on what
the implementation problems are and how they can be solved. The problems would
depend on the individuals involved, but it is likely that they would centre on the
replacement of purely judgement-based methods by a more scientific one. The
manager should also follow the first set of forecasts through the decision process.

Step 9: Monitor
Implementation refers to just before and after the first use of the forecasts. In
monitoring, the performance of the system is watched, but not in such detail, every
time it is used. The accuracy of the forecasted demand for each production should
be recorded and reasons for deviations explored. In the light of this information, the
usefulness of the system can be assessed and changes to it recommended. The track
records of those making judgemental adjustments to the forecasts should also be
monitored. In this situation, where judgement must play an important part, this
aspect of monitoring will take on particular significance.

Marking Scheme (out of 25) Marks


There are nine steps but some are more difficult than others:
Step 1 4
Step 2 2
Step 3 5
Step 4 3
Step 5 3
Step 6 2
Step 7 2
Step 8 2
Step 9 2
Total 25

Case Study 15.3: Brewery


1 The questions posed at the end of the case correspond to the nine stages of the
guidelines for developing a forecasting system.

Step 1: Analyse the Decision-Making System


The purpose of this analysis is to ensure that the forecasts really do serve the
decision process as intended rather than being unconnected and peripheral. Im-
portant side benefits often accrue from such analyses. Inconsistencies in the way
decisions are taken and in the organisational structure may come to light. A whole

forecasting project may have to be postponed pending the resolution of these wider
issues.
A thorough analysis of a decision-making system involves systems theory. A lower-
level approach of listing the decisions directly and indirectly affected by the fore-
casts is described here. The list would be determined from an exhaustive study of
the organisational structure and the job descriptions associated with relevant parts
of it. Here is a brief description of the main decisions.

Main decisions                          Related decisions
Production levels for one year ahead    Materials requirements
(monthly)                               Manpower requirements
                                        Pre-production operating activities
                                        Machinery maintenance
                                        Warehousing requirements
                                        Short-term financial planning
Distribution quantities (quarterly)     Warehousing requirements
                                        Transport needs
                                        Short-term financial planning
Marketing action (quarterly)            Advertising
                                        Promotions

The list forms the input information for step 2, determining the forecasts required.

Step 2: Determine the Forecasts Required


The decisions all have a time horizon of, at most, one year. The forecasts have
therefore to be for one year ahead.
The production decisions require monthly forecasts, the distribution and marketing
decisions quarterly forecasts. This suggests that the system should produce monthly
forecasts, aggregating in the case of the latter. The timing of the decisions within the
month or quarter dictates the time at which the forecasts should reach the decision
makers.

Step 3: Conceptualise
At this step consideration is given to the factors that influence the forecast variable.
No thought is given to data availability at this step. An ideal situation is envisaged.
An alcoholic beverage is not strictly essential to the maintenance of life. It is a
luxury product. Therefore, its consumption will be affected by economic circum-
stances. It would be strange if advertising and promotions did not result in changes
in demand. In addition, the variability of the production, advertising and promo-
tions of the different competitors must have an effect. In particular, the launch of a
new product changes the state of the market. It is not just competing beer products
that are important. Competition in the form of other products, such as lager and
possibly wines and spirits, must also have an influence.

The data record in Table 15.4 makes it clear that there is a seasonal effect. In other
words, the time of year and, perhaps, the weather are relevant factors. The occur-
rence of special events may have had a temporary influence. A change in excise duty
as a result of a national budget is an obvious example. More tentatively, national
success in sporting tournaments is rumoured to have an effect on the consumption
of alcoholic beverages.
There are also reasons for taking a purely time series approach to the forecast. First,
the seasonal effect will be handled easily by such methods. Second, the time horizon
for the forecasts is short: less than one year. Within such a period there is little time
for other variables to bring their influence to bear. To some extent, therefore, sales
volume could have a momentum of its own. Third, a time series approach will give a
forecast on the basis ‘if the future is like the past’. Such a forecast would be the
starting point for judging the effect of changing circumstances.

Step 4: Data Availability


The ideals which were the subject of step 3 will be restricted by the availability (or
lack of availability) of data. The search for data records relating to the factors of step
3 will reveal the restrictions. They might be:
(a) The organisation’s data on advertising and promotions might be aggregated and
difficult to separate.
(b) Competitors’ data are likely to be available for some factors (production levels
and new products, for example) but unavailable for others (advertising, promo-
tions). Where competitive data are available, they may not be available
sufficiently early to be used. They may not become available within the one-year
time horizon of the forecasts.
Data that are both available and usable are the organisation’s aggregated data on
advertising and promotions and nationally available data on the economic climate.
In the case of the former, only quarterly data are obtainable. Unfortunately, monthly
forecasts are wanted, so it must reluctantly be decided to produce only quarterly
forecasts.
Consideration of Table 15.4 suggests that data earlier than 2009 should not be
included. The growth in sales volume in the early 2000s is distinctly higher than
subsequently. Whatever the reasons for this, there is no point in requiring the
forecasting techniques to account for it, especially since the post-2008 data record is
sufficiently long for most purposes.

Step 5: Decide Which Techniques to Use


There are good reasons for employing both a time series and a causal modelling
approach to the problem. Both should be used and then judgement made between
them on the basis of their apparent accuracies.
A time series technique will have to have a number of characteristics. It will have to
handle the trend that is evident from Table 15.4. It will also have to deal with the
seasonal effect. There may or may not be a cycle in the data. However, the decision
to restrict the data record to the period 2009–17 means that it will be difficult to

determine any cyclical effect (these effects are often five or seven years in length).
To summarise, what is required is a technique that can handle trend and seasonality
but not cycles. The obvious choice is the Holt–Winters technique.
The causal modelling technique should be multiple regression analysis with two
independent variables. The first independent variable is the gross domestic product
(GDP) of the UK, as a measure of the economic climate. Other economic variables
can also be used in this role, but GDP is the most usual. The second independent
variable is the sum of advertising and promotional expenditures (ADV/PRO) of the
organisation. Scatter diagrams relating the dependent variable with each independ-
ent variable in turn can verify that it is reasonable to consider GDP and ADV/PRO
as independent variables.
Other potential independent variables will have to be ignored for reasons of the
non-availability of data.

Step 6: Testing the Accuracy


Because two approaches (time series and causal modelling) are being employed,
there are two stages in testing the accuracy of the forecasts. First, it has to be
determined which Holt–Winters model (i.e. what values for the smoothing con-
stants) is the best and whether one or both of the independent variables should be
included in the regression model. Second, the best Holt–Winters model has to be
compared with the best regression model.

Which Holt–Winters Model?


The Holt–Winters model works with three smoothing constants, one each for the
main series, the trend and the seasonality. To decide on the values for the constants,
some experimentation is needed. Several different sets of values should be tried and
the forecasting accuracy of each compared. The accuracy is measured via the mean
square error or the mean absolute deviation of the one-step-ahead forecast errors.
A brief reminder of this method is given. For any set of parameters, the data record
should be followed time period by time period, and at each a forecast for one period
ahead calculated. The forecast error is found by subtracting the forecast from the
actual. The scatter of these errors for the whole series is measured by calculating the
mean square error (the average squared error) or the mean absolute deviation (the
average of the absolute values of the errors). This is repeated with other parameter
sets. The set for which one or both of these statistics is the lowest is chosen.
It is best to try a wide range of parameter sets and ‘home in’ on the one that seems
the best. Table A4.23 shows the results of this process for the data of Table 15.4.

Table A4.23 Which Holt–Winters forecast?


Smoothing constants             MAD    MSE
Series    Trend    Season
0.5       0.2      0.2          5      33
0.4       0.2      0.2          4      25
0.3       0.2      0.2          4      21
0.2       0.2      0.2          4      20
0.1       0.2      0.2          4      30
0.2       0.4      0.2          3      17
0.2       0.3      0.2          3      18
0.2       0.1      0.2          4      26
0.2       0.5      0.2          3      16
0.2       0.6      0.2          3      16
0.2       0.4      0.5          2      12*
0.2       0.4      0.4          3      13
0.2       0.4      0.3          3      15
0.2       0.4      0.1          4      21
* Best parameter set.

The table shows the parameter sets in three groups. In the first group the smoothing
constant for the main series is varied; in the second, the constant for the trend is
varied while the series constant is held at its 'best' value; finally, the constant
for seasonality is varied while the other two are held at their 'best' values. The
parameter set with the lowest MAD and MSE is (0.2, 0.4, 0.5). The Holt–Winters
model with these parameters would appear to be the best. Note that the procedure
for finding these parameter values is an approximate one. There is no guarantee that
the truly optimum set has been found. Ensuring that would have required an
exhaustive comparison of all possible parameter sets.
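The search can be automated. The Python sketch below runs an additive Holt–Winters filter (season length 4 for quarterly data) over the series for each candidate set of constants and keeps the set with the lowest one-step-ahead MAD. The initialisation scheme is one common convention, not necessarily the one used to produce Table A4.23, and the synthetic series is there only to make the sketch runnable; in practice the Table 15.4 data would be substituted.

from itertools import product

def one_step_mad(series, alpha, beta, gamma, m=4):
    # Initialise level, trend and seasonal terms from the first two years.
    level = sum(series[:m]) / m
    trend = (sum(series[m:2 * m]) - sum(series[:m])) / m ** 2
    season = [y - level for y in series[:m]]
    errors = []
    for t in range(m, len(series)):
        errors.append(series[t] - (level + trend + season[t % m]))
        prev = level
        level = alpha * (series[t] - season[t % m]) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev) + (1 - beta) * trend
        season[t % m] = gamma * (series[t] - level) + (1 - gamma) * season[t % m]
    return sum(abs(e) for e in errors) / len(errors)

# Synthetic quarterly series with trend and seasonality (illustrative only).
sales = [20 + 0.8 * t + [-3, 2, 6, -1][t % 4] for t in range(36)]

grid = [0.1, 0.2, 0.3, 0.4, 0.5]
best = min(product(grid, repeat=3), key=lambda c: one_step_mad(sales, *c))
print(best)   # (series, trend, season) constants with the lowest MAD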

Which Regression Model?


There are three possible regression models:
(a) GDP as independent variable.
(b) ADV/PRO as independent variable.
(c) GDP and ADV/PRO as independent variables.
They should be compared using the following basic criteria:
(a) Sensibleness of models.
(b) Closeness of fit – using R-bar-squared.
(c) Randomness of residuals – statistically or by inspection.
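A Python sketch of this comparison follows. The arrays gdp, adv_pro and sales are simulated stand-ins for the quarterly data (generated from the coefficients of Table A4.25 plus noise), and R-bar-squared is computed directly from the least-squares residuals.

import numpy as np

def r_bar_squared(y, predictors):
    X = np.column_stack([np.ones(len(y))] + predictors)   # intercept + variables
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    n, k = X.shape                                         # k includes the intercept
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1 - (ss_res / (n - k)) / (ss_tot / (n - 1))

rng = np.random.default_rng(1)                             # simulated data only
gdp = rng.normal(100.0, 5.0, 36)
adv_pro = rng.normal(10.0, 2.0, 36)
sales = -44.3 + 0.49 * gdp + 17.4 * adv_pro + rng.normal(0.0, 3.0, 36)

for name, predictors in [('GDP', [gdp]), ('ADV/PRO', [adv_pro]),
                         ('GDP, ADV/PRO', [gdp, adv_pro])]:
    print(name, round(r_bar_squared(sales, predictors), 2))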
Table A4.24 shows how the three models compare according to these criteria.
The first model, using GDP as the only independent variable, is inadequate. The fit
is not a close one (R-bar-squared = 0.27). Nor are the residuals random. They
exhibit a seasonal pattern in that the residuals for quarter 1 are all negative, for
quarter 3 all positive.
The second model, using ADV/PRO as the independent variable, is good. The fit is
a close one and the residuals are random.

The third model, with GDP and ADV/PRO as independent variables, is slightly
better than the second, having a marginally higher R-bar-squared.

Table A4.24 Output for the regression models


Criterion            Model (independent variable(s))
                     GDP                ADV/PRO    GDP, ADV/PRO
Sensibleness         Yes                Yes        Yes
R-bar-squared        0.27               0.91       0.93
Random residuals     No (seasonality)   Yes        Yes

Finally, since these are regression models, they should be checked for the existence
of any of the usual reservations: lack of causality, spuriousness, etc. There may
indeed be a problem with causality. The second and third models are superior
because the ADV/PRO variable captures the seasonality, which was a problem in
the first. It is not clear whether it is the seasonality or the expenditure on advertising
and promotion that explains the changes in sales volumes. There will be no difficul-
ty if advertising and promotion expenditures continue to be determined with
seasonal variations as in the past, but if the method of allocation changes then both
models will be inadequate. A new model, dealing with advertising/promotion and
seasonality separately, will have to be tested.
Meanwhile, the model with two independent variables seems to be the best. The
results of this regression analysis are shown in more detail in Table A4.25.

Table A4.25 Output of the regression model linking sales to GDP and
ADV/PRO
Sales volume = –44.3 + 0.49 × GDP + 17.4 × ADV/PRO
R-bar-squared = 0.93
Residuals                     Quarter
Year          1        2        3        4
2009 −0.3 −4.2 −0.4 −0.6
2010 2.4 −1.3 1.7 −4.6
2011 2.6 4.3 −0.5 1.4
2012 −1.1 −5.3 4.7 1.7
2013 −2.9 1.8 0.7 2.5
2014 −1.9 2.8 1.3 3.0
2015 −2.6 −1.2 −0.7 −0.3
2016 1.4 −2.2 0 −0.1
2017 −1.4 −5.0 3.5 0.8

Best Overall? Time Series or Regression?

The best time series model is the Holt–Winters with smoothing constants 0.2 (for
the series), 0.4 (for the trend) and 0.5 (for the seasonality); the best regression model
relates sales volume to GDP and total expenditure on advertising and promotion.
To choose between these two, an independent test of accuracy should be used. This
means that the latter part of the data (2017) is kept apart and the data up to then
(2009–16) used as the basis for forecasting 2017. The better model is the one that
provides forecasts for 2017 that are closer to the actual sales volumes. There are two
reasons for comparing the models in this way.
First, the test is independent in the sense that the data being forecast (2017) are not
used in establishing the forecasting model. Contrast this with the use of R-bar-
squared. All of the data, 2009–17, are used to calculate the coefficients of the model;
the residuals are then calculated and R-bar-squared measures how close this model is
to the same 2009–17 data. This is not an independent measure of accuracy.
Second, the accuracy of smoothing techniques is usually measured through the
mean square error or mean absolute deviation; the accuracy of regression models is
measured by R-bar-squared. These two types of measures are not directly compara-
ble. On the other hand, the independent test of accuracy does provide a directly
comparable measure: closeness to the 2017 data.
The details of the test are as follows. The 2009–16 data are used for each of the two
models as the basis of a forecast for each of the four quarters of 2017. The close-
ness of the two sets of forecasts to the actual 2017 data is measured using the mean
square error and the mean absolute deviation. The model for which both these
measures are smaller is chosen as the better to use in practice. Should the two
measures have contradicted one another, then this would have meant that the model
with the lower MSE tended to be closer to extreme values whereas the model with
lower MAD tended to be closer on average to all values.
Table A4.26 shows the results of this test. The Holt–Winters time series is clearly
superior to the regression model. Both measures, MAD and MSE, demonstrate that
it gives the better forecasts for the already known 2017 data. The Holt–Winters
technique, with smoothing constants 0.2, 0.4 and 0.5, should be chosen to make
forecasts. The whole of the data series, including of course 2017, should be used in
doing this. Forecasts for 2018 are shown in Table A4.27.

Table A4.26 Comparing time series and regression models


Quarter      Actual         Time series                   Regression
(of 2017)               Forecast   Error   Error²    Forecast   Error   Error²
1              37          38        −1       1         39        −2       4
2              49          50        −1       1         54        −5      25
3              63          63         0       0         60         3       9
4              48          47         1       1         47         1       1
                        MAD = 3/4 = 0.75                MAD = 11/4 = 2.75
                        MSE = 3/4 = 0.75                MSE = 39/4 = 9.75
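The MAD and MSE figures in the table can be checked directly from the actual values and forecasts shown:

# Actual 2017 sales volumes and the two sets of forecasts (Table A4.26).
actual = [37, 49, 63, 48]
forecasts = {'time series': [38, 50, 63, 47],
             'regression': [39, 54, 60, 47]}

for name, f in forecasts.items():
    errors = [a - p for a, p in zip(actual, f)]
    mad = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e ** 2 for e in errors) / len(errors)
    print(name, mad, mse)   # time series: 0.75 0.75; regression: 2.75 9.75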

Table A4.27 2018 forecasts


Quarter (of 2018)    Forecast
1 39
2 52
3 66
4 49

Step 7: Incorporating Judgement into the Forecast


To incorporate judgement into a forecast, there are two basic tasks. The first is to
draw all the judgements together and try to form a consensus. This can be done
through one of the qualitative forecasting techniques. The second task is to make
the adjustment to the forecast, if one is to be made. This can be accomplished by
getting those people affected by the forecast to agree to the change and then, most
important of all, to make them accountable for it. It is vital that the accuracy of the
alterations be monitored.
It should be noted that the body of ‘experts’ whose views should be considered
could go outside the organisation and include generally recognised industry experts.
Assembling the judgements of these experts, obtaining a consensus and making
them accountable for their views is a difficult, if not impossible, task. Nevertheless,
the approach should be as outlined above. First, the judgements are brought
together; second, decisions to adjust the statistically derived forecasts are made.
Table A4.28 summarises some of the judgements that might be made about the
future of the product. Sometimes the judgements might be expressed in words
rather than in convenient percentage growth terms. When this happens, a quantita-
tive forecast has to be derived from the words. Table A4.28 shows the source of
each judgement, the verbal forecast where one exists and a forecast in terms of
percentage growth, whether actual or derived.

Table A4.28 Experts’ opinions of brand prospects


Source                    Verbal forecasts                                   % growth p.a.
Business pages of         1. Slight decline in volume next year              −1 to −2
newspapers                2. Stagnation for two to three years               0
                          3. Continuing as in recent years                   4
                          4. No significant changes in the situation         4
Business journals         1. No cause for optimism                           0 (?)
                          2. No improvement in prospects                     3
Industry experts          1. No significant change                           4
                          2. Better than previous years                      5
                          3. Better than previous years                      5
Stockbrokers              1. Unlikely to sustain even recent modest growth   1
                          2. Unchanged situation                             4

To obtain a consensus from these data, a modified version of the Delphi method
might be used. All the experts represented in the table should be approached,
presented with the views of the others and asked if they wish to adjust their opin-
ions. As a result, some of the more extreme views might be altered.
The second stage, the adjustment of the statistical forecasts, is made by people
within the organisation. They should be accountable for any changes made; it is not
of course possible to make the external experts accountable for their views within
the context of an organisation’s forecasts.

Step 8: Implement the Forecasts


There are four questions to be asked and answered in implementing a forecast:
(a) What are the problems?
(b) Do all the participants agree on the problems?
(c) What are the possible solutions to the problems?
(d) Can a consensus on an action plan be obtained?
Implementation in this case commences with the drawing up of a list of everyone
affected by the forecasts. Nearly all the main functional departments – finance,
marketing, production, corporate planning – should be represented on the list.
Those on the list are then interviewed. The purpose is solely to find out what
difficulties they think might block the successful working of the system. The style of
interviewing should be neutral. No attempt should be made to lead the interviewees,
nor to sell the forecasting system to them, nor to encourage them to say nice things
about it. Such approaches might result in the erection of defensive barriers and
prevent the eliciting of real views.
There are likely to be many problems. Only two examples will be described. First,
there might be disagreement about the effect of the launches of other new brands if
any are about to be introduced either by the organisation itself or by competitors.
Second, there may be internal political problems, especially if there have been recent
changes in the organisational structure or in personnel.
The most usual problem is a lack of belief in new techniques and technology on the
part of the users. When present, this is usually because of previous failures in similar
projects.
How might such problems be solved? The problem of the new brands might be
solved by creating two sets of forecasts, each relating to a different set of assump-
tions about the effect of the new brands. In effect, this solution is a form of the
technique of scenario writing. Contingency plans should be prepared so that the
organisation can respond to either scenario.
The problem of political difficulties, its source and its effect, will often go far
beyond the forecasting project. However, the early employment of the user in-

volvement approach is likely to minimise the effect, since the participants will feel
more like a team than they would otherwise.
Likewise, a lack of belief in a forecasting system will probably have effects that go
far beyond any one particular project, but early user involvement will mitigate the
effects.

Step 9: Monitor Performance


Monitoring is probably the most tedious stage of the nine, being based on soliciting
and recording a considerable amount of information. Thoroughness and persistence
are the virtues required. The performance of this forecasting system should be
monitored through three separate reports.
(a) An annual report should show the accuracy of the forecasts as measured by the
mean absolute deviation (MAD) or the MSE. The MAD achieved should be
compared with the MAD that is expected given the ex post measurement of accu-
racy described in step 6. As well as this indication of the average accuracy,
exceptional inaccuracies should be reported, with suggested reasons for their
occurrence. If some particular reason has appeared consistently, the forecasting
system should be adjusted accordingly. For example, had there been changes in
the trend to which the system had responded too slowly, then the smoothing
constant for the trend equation should be increased. The point is that, although
the smoothing constant as set may have been fine for the historical data, it might
no longer be right if the trend starts to behave in a slightly different way. An
adjustment to it would then have to be considered. If the trend starts to behave
in a radically different way, the whole basis of the system (the use of Holt–
Winters) will have to be re-evaluated.
(b) An annual report should show the performance of the ‘judges’, those who have
proposed qualitative adjustments to the forecasts. This means that their judge-
mental forecasts should be recorded from minutes of meetings and compared
with what has happened. Over time the identity of the good judgemental fore-
casters will begin to emerge. Their views should then carry greater weight in later
forecasts. Many of them may outperform the statistical forecasts. However, be-
fore scrapping the statistical system it will have to be borne in mind that these
judgements are being made as an amendment to the Holt–Winters output. It is
one thing to be able to make marginal improvements to an existing forecast; it is
quite another thing to beat the existing forecast when working in isolation from
it.
(c) After the system has been in operation for a year, a survey of users should be
carried out by an independent party. The users should be asked for their percep-
tions of the system. Exactly how do they use the forecasts? What are the
strengths and weaknesses of the system in their eyes? How can it be improved?
Do they think it has been successful in achieving its objectives? And so on. This
will demonstrate whether the system has credibility among the users and does
support their decisions. Such a survey should be repeated in the future when it is
felt that the whole system is due for review and reappraisal.

Conclusions
It should be emphasised that it is probably only for short-term forecasts that a time
series method will seem to be the best. For medium-term forecasts beyond a year
ahead, a causal model is likely to be better. Even for a short-term forecast, however,
uncertainty and volatility in the UK economic environment will eventually cause
problems, and adjustments will have to be made to the Holt–Winters model. For
important medium-term forecasts on which the expenditure of a lot of money is
justified, it may be worthwhile to use all three approaches to forecasting: causal,
time series and qualitative. If all give similar output, there is mutual confirmation of
the correct forecast; if they give different output, then the process of reconciling
them will be a valuable way for the managers involved to gain a better understand-
ing of the future.
This case solution has covered the important aspects of the case, but not all the
aspects. Among the omissions, for example, are statistical tests of randomness.
Furthermore, techniques other than Holt–Winters and some limited causal model-
ling have not been described, but they should have been considered. The emphasis
has, however, been on the role of a manager, not a statistician. The items included
are, in general, the things a manager would need to be aware of in order to be able
to engage in sensible discussions with forecasting experts.

Marking Scheme (out of 30) Marks


Steps
1. Analyse decision 2
2. Forecasts required 2
3. Conceptualise 4
4. Data availability 2
5. Techniques 2
6. Accuracy
– Best Holt–Winters 4
– Best regression 4
– Holt–Winters vs. regression 4
7. Judgements 2
8. Implementation 2
9. Monitoring 2
Total 30

Index
absolute value 5/15, 11/8 beta-binomial distribution 9/25
accounting data 3/2 binomial distribution 7/15–7/20, 7/22,
special case of 3/13–3/17 7/26–7/29, 9/2–9/4, 9/25
accounting ratios 3/6 characteristics 7/15
accounts 3/13–3/17 deriving 7/16–7/18
accuracy 1/22, 3/4, 3/5, 3/11, 5/2 occurrences 7/15–7/16
in sampling 6/12–6/18, 8/8 parameters 7/19
of business forecasting 12/22, 15/4– tables 7/18
15/8, 15/12–15/13 blocks 10/4, 10/11
addition 2/8, 2/21, 7/9, 7/13 Box–Jenkins Method 14/16–14/17, 15/4
advertising 1/18, 11/2, 12/4–12/7, brackets in algebraic expressions 2/10
12/21 brainstorming 13/8, 15/11
Advertising Standards Authority 1/18 breakeven point 2/33
aggregate index 5/26–5/31 business forecasting 13/1–13/24
algebra 2/5–2/25 accuracy 15/4–15/8, 15/12–15/13
alphabetical order 3/6, 3/12, 4/10 applications 13/4–13/5
alternative hypothesis 8/15–8/19, 8/26 case studies 13/24, 15/24–15/26
ambiguity 1/23 conceptual model 15/5
analogies 13/12 data available 15/5
analysis of variance 10/1–10/22 errors 15/2, 15/4, 15/14–15/16
ANOVA tables 10/8–10/10, 10/13– exploratory approach 13/14
10/15, 12/24 forecasts needed 15/5
applications 10/2–10/5 guidelines for organisation forecasting
case studies 10/21–10/22 system 15/2, 15/4–15/14
extensions 10/14–10/15 implementing the system 15/9–15/12
one-way 10/2–10/10, 10/14 incorporating judgements 15/8–15/9,
two-way 10/2–10/5, 10/10–10/14 15/12
ANOVA tables 10/8–10/10, 10/13– long-term forecasting 13/5
10/15, 12/24 management aspects 15/1–15/26
area sampling 6/10 manager’s role 15/2–15/4
area under the curve 7/20, 7/23 medium-term forecasting 13/4–13/5
arithmetic mean 5/3–5/15, 7/25, 7/26, monitoring performance 15/12–
8/3, 9/8 15/14
distortion by clusters and outliers normative approach 13/14
5/13 problems 15/10–15/11
error by averaging averages 5/14 qualitative techniques 13/2–13/17,
notation 8/7 15/8
pre-eminence 5/13–5/14, 5/19 quantitative techniques 13/6–13/7,
ARMA (autoregressive, moving average) 13/17–13/18
14/16 selection of technique 15/6
Armstrong, J. S. 13/18 short-term forecasting 13/4, 13/5,
autocorrelation coefficients 14/16 14/2, 14/17
averaging of averages 5/14 techniques 13/2–13/3, 15/1, 15/3
base forecast 14/2 time horizons 13/4, 13/5, 15/5
base year 5/24–5/25 time series methods 13/3
bases 2/19–2/24, 5/24–5/26 user involvement 15/2, 15/3, 15/8–
Bayesians 1/4 15/10, 15/14

business reports 3/2 measuring closeness of fit 12/14–12/16
case studies
analysis of variance 10/21–10/22 statistical basis 12/13–12/23
business forecasting 13/24, 15/24– correlation coefficient 11/10–11/14,
15/26 11/19, 11/24, 15/7
correlation 11/31–11/32 multiple regression analysis 12/3–
data analysis 4/22–4/24 12/4
data communications 3/28 residuals 11/15–11/17
distributions 7/37, 9/34–9/35 cost of living index 5/24, 5/27, 5/28
equations 2/32–2/34 critical values 8/12–8/19, 8/21, 9/23–
formulations 2/32 9/24, 12/16
insurance premium 9/34 crop yield 10/3–10/5, 10/11
oil conservation 3/31 cross-impact matrices 13/11–13/12
regression 11/32, 12/33–12/36 cumulative relevance number (CRN)
relevance trees 13/24 13/17
sampling methods 6/24 curvilinear regression 12/7–12/9
statistical inference 8/38–8/40 cycles 14/9–14/16
summary measures 5/39–5/41 data analysis 4/1–4/24
time series forecasting 14/25–14/26 accounting data 4/3
uses and misuses of statistics 1/30– case studies 4/22–4/24
1/35 comparisons 4/9, 4/12, 4/14
wage negotiations 1/31 definition 4/1–4/2
catastrophe theory 13/13–13/14 exceptions 4/8, 4/12
causal modelling 13/3, 15/6 guidelines 4/6–4/12
central limit theorem 8/5, 8/7, 8/8, implications for the producers of
8/11, 9/10, 9/12, 9/15 statistics 4/13–4/17
certain events 1/3, 8/2–8/3 lack of confidence in 4/3
chi-squared distribution 9/15–9/21, 9/26 management information system
characteristics of 9/16 output 4/5, 4/9, 4/17
occurrences of 9/16 market research 4/5
tables 9/17 model building 4/7, 4/12, 4/13
class sizes 1/11 over-complication by experts 4/3, 4/5
cluster sampling 6/8, 6/11 reduction of data 3/5, 4/5, 4/6
clusters 5/13 re-presentation of data 4/6–4/7, 4/9,
coefficient of variation 5/21–5/22 4/13
collinearity 12/4–12/6, 12/24 data communication 3/1–3/31
combinations 7/10–7/14 accounting data 3/13–3/17
common-sense test 1/23, 5/2 case studies 3/31
comparisons 4/9, 4/12, 4/14, 5/12 presentation rules 3/3–3/13
computers 3/18, 3/23, 4/17, 11/8, 12/5, use of graphs 3/17–3/23
12/25 data points 1/6
confidence limits 8/3, 8/6–8/8, 8/10, data presentation rules 3/3–3/17, 3/20–
8/27, 9/10, 10/1–10/2 3/23, 4/3, 4/6–4/7
constants 2/5, 2/8, 2/13, 2/18 clarity of labelling 3/9–3/10, 3/12,
control group 8/20 3/22, 4/7
convenience sampling 6/12 interchange of rows and columns
coordinates 2/2–2/8, 2/12, 2/13 3/6–3/7, 3/12, 3/22, 4/7
correlation 11/1–11/3 minimal use of space and lines 3/8,
case studies 11/31–11/32 3/12, 3/22, 4/7
limitations 11/21–11/24 reordering of numbers 3/5–3/6, 3/12,
3/22

rounding to two effective figures 3/3–3/5, 3/11, 3/22, 4/7 case study 2/32–2/34
dependent 2/16
use of summary measures 3/12, 3/22, inconsistent 2/15, 2/16
4/7 linear 2/11–2/18, 11/15, 11/22
use of verbal summary 3/10–3/12, manipulation 2/8–2/11
3/22, 4/7 simultaneous 2/14–2/18
data reduction 3/5, 4/5, 4/6 error sum of squares (SSE) 10/6, 10/9,
decay functions 2/21 10/11, 10/12
decision making 4/13, 4/17, 8/26, 8/27, errors
15/2, 15/4, 15/9 in business forecasting 15/2, 15/4,
decision support systems 1/2 15/13–15/16
decomposition method 14/9 in sampling 6/13, 6/18
degrees of freedom 9/8–9/9, 9/12, in significance tests 8/15–8/19
9/14–9/22, 10/7, 12/15 logical 1/21
Delphi forecasting 13/6, 13/9–13/10, standard 8/7, 12/5
13/18, 15/8 statistical 1/2, 1/22–1/25, 5/23
dice 7/12–7/14, 7/16–7/18 estimation 8/2, 8/6–8/9, 8/22, 8/27,
distributions, statistical 1/5–1/17, 7/2– 9/1, 9/17
7/38 evidence 1/22
average 1/14 experimental design 10/15
binomial 7/15–7/20, 7/22, 7/26– exponential functions 2/18–2/25, 12/10
7/29, 9/2–9/5 and linear functions 2/24
case studies 7/37 exponential smoothing 14/4–14/6
chi-squared 9/15–9/21, 9/26 exponents 2/19–2/20
continuous 1/9–1/12 extrapolation 11/23
discrete 1/5–1/12, 7/20–7/22 factorial notation 7/11, 7/12
F-distribution 9/21–9/25, 10/7, factors 10/15
12/14 F-distribution 9/21–9/25, 10/7, 12/14
non-symmetrical 9/17 characteristics 9/21
normal 1/12–1/16, 7/14–7/15, 7/20– occurrences 9/22–9/23
7/28 tables 9/23–9/24, 10/13
observed 1/12–1/14, 1/24, 7/2–7/8, feedback 4/17, 15/11
7/14, 7/28–7/29, 9/2 financial analysts 1/21, 3/2
Poisson 9/2–9/8, 9/26 financial budgets 3/4
probability concepts 7/8–7/14 financial data 3/13–3/17
reverse J-shaped 5/11–5/12, 5/32, fitted values 11/6, 11/14–11/17, 12/5
7/2 fixed weight index 5/28
standard 1/12–1/18, 7/2, 7/8, 7/14– forecast interval 12/22, 12/23
7/15, 7/26, 9/2, 9/15 forecasting
standard deviation 1/14, 1/17, 5/18– methods and applications 13/14
5/19, 5/23, 7/21–7/28, 8/21, formulations, case study 2/32
8/22 frequency histogram 1/7–1/9, 1/31, 7/4
symmetrical 5/9, 5/32, 7/19 frequency table 1/6, 7/2–7/8
U-shaped 5/10, 5/32, 7/2 functions 2/5–2/8
division 2/9, 2/20 Gossett, W. S. 9/10, 9/25
dummy variables 12/6 gradient 2/11
econometric forecasts 13/3, 15/16 graphics 1/19–1/20
effective figures 3/3–3/5 graphs 1/19–1/22, 1/23–1/25, 2/2–2/8,
Ehrenberg, A. S. C. 3/5, 4/5 2/13, 2/23
elasticity 2/8 helpful uses 3/17–3/20
equations 2/5, 2/8–2/25

in data communication 3/2, 3/17–3/23 logarithms 2/21, 2/24, 12/9–12/13
logical errors 1/21
unhelpful uses 3/17–3/20 long-range forecasting 13/18
gridlines 3/8, 4/7 long-term forecasting 13/5
gross domestic product (GDP) 11/3– management data 1/18, 3/2, 3/10, 4/2
11/4, 12/4–12/6, 12/21, 15/16 management information systems 1/2,
growth function 2/18, 2/21 3/2, 3/6, 3/18, 3/23
heteroscedasticity 11/20 output 4/5, 4/9, 4/17
histograms 5/8–5/11, 5/32, 7/16, 7/19 market research 1/2, 4/5, 6/2, 6/12,
Holt’s Method 14/6–14/8 6/18, 7/3, 8/2, 13/9
Holt–Winters Method 14/8 mathematics 2/1–2/25, 11/5–11/7
hypotheses 4/2, 8/2, 8/9–8/21, 8/24, equations 2/8–2/12, 11/5–11/6
9/13 exponential functions 2/18–2/25
testing 8/10 graphical representation 2/2–2/8
impossible events 1/3 linear functions 2/11–2/14
income statement 3/29 simultaneous equations 2/14–2/18
independent events 7/10, 7/12 mean absolute deviation (MAD) 5/15–
indices 5/24–5/31 5/21, 15/6, 15/13
fixed weight index 5/28 mean square error (MSE) 15/6, 15/13
Laspeyres Index 5/28 mean squares 10/7, 10/10
Paasche Index 5/28 measures of central tendency 5/5
price relative index 5/27 measures of dispersion 5/15
simple aggregate index 5/26–5/27 measures of location 5/3–5/14, 5/22,
simple index 5/24–5/26 5/31
weighted aggregate index 5/27–5/31 arithmetic mean 5/5–5/15
inference, statistical 6/12, 6/16, 8/1– choice of measure 5/6–5/8
8/40, 9/1 comparisons 5/12
applications 8/2 median 5/5–5/6
case studies 8/38–8/40 mode 5/6, 5/11–5/14
confidence levels 8/2–8/3 visual approach 5/12
estimation 8/6–8/9 measures of scatter 5/14–5/22, 5/31
sampling distribution of the mean coefficient of variation 5/21–5/22
8/3–8/6 comparison of measures 5/19–5/21
significance tests 8/9–8/29 interquartile range 5/15, 5/21
information collection 6/17 mean absolute deviation (MAD)
interaction variable 10/14 5/15–5/21, 15/6, 15/13
intercept 2/11, 2/21, 11/5, 11/24 range 5/15
internal rate of return (IRR) 3/31 standard deviation 5/18
interquartile range 5/15, 5/21, 6/12 variance 5/16–5/21
interviews 1/20 median 5/5, 5/10, 5/12
invoice checking 6/3, 6/17 medium-term forecasting 13/4
Jenkins, G. 15/4, 15/14 method of steepest descent 14/17
judgement sampling 6/4, 6/11–6/12 microcomputers 11/17–11/21
labelling in tables 3/9, 3/12, 3/22, 4/7 microeconomics 2/14
Laspeyres Index 5/28 mode 5/6, 5/11–5/14
least-squares method 11/7–11/9, 11/23, model building 4/7, 4/12, 4/13, 5/31
12/3, 12/24 modelling, practical experiences with
linear functions 2/6, 2/11–2/18, 2/22, forecasting time series and 15/14
2/24 moving averages 14/3–14/6, 14/13
and exponential functions 2/24 multi-collinearity 12/4
local relevance number (LRN) 13/17 multi-factor situations 10/15

multiple regression analysis 12/2–12/7, 12/24–12/26 percentages 1/21
pictorial representations 1/23
and simple regression 12/2–12/3 point estimate 8/7, 8/8, 8/27, 9/13,
collinearity 12/4–12/6 12/22
correlation coefficient 12/3–12/4 point representation 2/4
dummy variables 12/6 Poisson distribution 9/2–9/8
retention of variables 12/19–12/21 approximation to binomial 9/6–9/7
scatter diagrams 12/3 characteristics 9/2, 9/25
multiplication law 2/5, 2/10–2/11, 2/19, degrees of freedom 9/8–9/9
2/21 derivation 9/4
multiplication law of probability for occurrences 9/3
independent events 7/10, 7/12 parameters 9/6
multi-stage sampling 6/4–6/8, 6/11 tables 9/3–9/6
mutually exclusive events 7/9 t-distribution 9/9–9/15
negative binomial distribution 9/24, 9/25 pooled estimate of the standard error
non-linear regression analysis 12/2, 8/22
12/7–12/13 population mean 8/6–8/8, 9/21
curvilinear 12/7–12/9 population measures 8/7
transformations 12/9–12/13 population split 6/6–6/8
normal curve tables 1/13, 7/21, 7/23– populations 1/2, 1/17, 6/1–6/18, 8/1
7/25, 8/11, 8/17, 8/21, 9/6, 9/10 standard deviation of 8/7
normal distribution 7/20–7/28, 8/6, price relative index 5/27
9/25 probability 1/3–1/5, 7/4–7/14
approximating binomial with normal a priori assessment 1/4, 1/5, 7/8, 9/2
7/27–7/28 binomial 7/16, 7/17, 7/19
characteristics 7/20–7/21 concepts 7/8–7/14
derivation 7/22–7/23 conditional 7/9
normal curve tables 1/13, 7/21, errors 1/24
7/23–7/25, 8/11, 8/17, 8/21, 9/6, measurement 1/3–1/4, 1/10, 7/5
9/10 objective assessment 1/4
occurrences 7/21–7/22 probability distributions 7/2, 9/2
parameters 7/25–7/27 relative frequency assessment 1/4,
null hypothesis 8/9, 8/15–8/17 7/8, 9/2
number picking 4/8 subjective assessment 1/4, 1/5, 7/8
observations 1/6, 7/4, 8/23, 9/8, 10/8, probability histograms 1/8, 7/5, 7/6
11/2, 11/4 probability sampling 6/10
observed distributions 1/12–1/14, 1/24, profit figures 1/21
7/2–7/8, 7/14, 7/28–7/29, 9/2 profit margin 3/6
omissions 1/20 profitability 1/21
opinion polls 1/2, 6/3–6/6, 6/18, 7/16 promotion schemes 8/21–8/26
ordered arrays 1/6, 7/3 purchasing behaviour 5/29
orderings 7/10–7/12 quadrants 2/3
outliers 5/11, 5/13, 5/23–5/24 qualitative forecasting techniques 13/2–
Paasche Index 5/28 13/17, 15/8
panel consensus forecasting 13/8 analogies 13/12
parameters 1/14–1/18 brainstorming 13/8
binomial distribution 7/19 catastrophe theory 13/13–13/14
normal distribution 7/25–7/27 cross-impact matrices 13/11–13/12
Poisson distribution 9/6 Delphi forecasting 13/6, 13/9–13/10,
t-distribution 9/15 13/18, 15/8
partial relevance numbers (PRN) 13/16 market research 13/9

panel consensus forecasting 13/8 rounding 3/6, 3/16–3/17, 3/22–3/23, 4/13
relevance trees 13/14–13/17
scenario writing 13/10–13/11 fixed 3/4
visionary forecasting 13/8 to two effective figures 3/3–3/5,
qualitative methods of business 3/11, 3/16, 3/22, 4/7, 4/9
forecasting 13/2–13/3, 13/6–13/8, variable 3/3
13/17 runs test 11/21, 12/16–12/19
quality control 6/3, 9/24 sample bias 1/20, 1/23, 6/12, 6/15
quota sampling 6/12 samples 1/2, 1/20, 6/1–6/3
raising to a power 2/20 paired 8/23–8/24
random numbers 4/8 sampling distribution of the mean 8/3–
tables 6/5–6/6 8/6, 8/11, 8/20
random sampling 6/5–6/11, 6/13, 6/16, sampling frame 6/14
8/26 sampling methods 6/1–6/24
randomness 11/16, 11/21 accuracy of samples 6/12–6/14
statistical tests 12/13, 12/16–12/19 applications of sampling 6/3–6/4
range 5/15 area sampling 6/10
readings 7/4 bias 6/12, 6/15
regression 2/24, 12/36 case studies 6/24
accuracy of predictions 12/22–12/23 cluster sampling 6/8, 6/11
applications 11/3–11/5 convenience sampling 6/12
calculations on microcomputer difficulties in sampling 6/14–6/16
11/17–11/21 judgement sampling 6/4, 6/11–6/12
case studies 11/31–11/32, 12/33– multi-stage sampling 6/4–6/8, 6/11
12/36 non-response 6/14–6/15, 6/18
equation of straight line 11/5–11/6 opinion polls 6/3–6/6, 6/18, 7/16
forecasting 11/3 probability sampling 6/10, 8/10
limitations 11/21–11/24 quota sampling 6/12
mathematics 11/5–11/7 random sampling methods 6/5–6/11,
measuring closeness of fit 12/14– 6/13, 6/16
12/16, 12/24 sample size 6/16–6/17, 9/4, 9/8
multiple regression analysis 12/2– sampling frame 6/14
12/7, 12/24–12/26 simple random sampling 6/4–6/11,
non-linear regression analysis 12/2, 6/13
12/7–12/13 stratified sampling 6/4, 6/8–6/9,
regression coefficients 11/19, 12/14 6/13, 6/14
residuals 11/6, 12/24 systematic sampling 6/11
scatter diagrams 11/1–11/10, 11/17– variable sampling 6/10
11/21, 12/3–12/5 weighting 6/9
simple linear 11/7–11/9, 12/13– scatter 5/4, 5/14–5/22, 9/16, 9/18, 9/21
12/23 scatter diagrams 11/1–11/9, 11/17–
spurious 11/22 11/21, 12/3–12/4
regression coefficients 11/19, 12/14 scenario writing 13/10–13/11
relevance trees 13/14–13/17 scepticism 1/24
case study 13/24 seasonality 14/8–14/16
representativeness 6/4, 6/8, 6/10, 6/12 sensitivity analysis 3/31
research and development 4/3 sequences 7/10
residuals 11/6, 11/7, 11/15–11/17, serial correlation 11/20
11/19–11/21 short-term forecasting 13/4, 13/5, 14/2,
tests for randomness 12/5, 12/9, 14/17
12/16–12/19

significance level 8/9–8/13, 8/9–8/22, 8/27 logical errors 1/21
misuse 1/2–1/3, 1/18–1/22, 1/23–1/25
significance tests 8/2, 8/9–8/29
assumptions 9/1 omissions 1/20
basic 8/15–8/19 technical errors 1/21
critical values 8/12–8/19 stepwise regression 12/25
difference between paired samples stratified sampling 6/4, 6/8–6/9, 6/12,
8/23–8/24 6/14
difference in means of two samples Student's distribution 9/11
8/19–8/23 subpopulations 6/10
errors 8/15–8/19 subscripts 2/12
limitations 8/25–8/27 subtraction 2/9, 2/21
one-tailed 8/13, 8/14, 9/14 sum of squares between treatments (SST)
power 8/16–8/17 10/6, 10/9, 10/11, 10/12
stages 8/20, 8/21, 12/18, 12/20 summary measures 5/1–5/41
summary 9/26 case studies 5/39–5/41
two-tailed 8/13, 8/14, 9/14, 9/18 in data presentation 3/7, 3/12, 3/22,
type 1 errors 8/15, 8/16, 8/17, 8/26 4/7, 4/13
type 2 errors 8/15, 8/16, 8/17, 8/26 indices 5/24–5/31
simple index 5/24–5/26 kurtosis 5/22
simple random sampling 6/4–6/11, measures of location 5/5–5/14
6/14, 6/17 measures of scatter 5/14–5/22
simultaneous equations 2/14–2/18 outliers 5/23–5/24
algebraic solution 2/16–2/18 skew 5/22
skew 5/22, 7/19, 9/22 usefulness 5/3–5/4, 5/12
slope coefficient 12/11, 12/23 sums of squares 12/14–12/16
slope of the line 2/11, 2/13, 11/5–11/7, symbols 2/5
11/24, 4/69 systematic sampling 6/11
smoothing constants 14/6–14/8, 14/18, tables 3/5–3/13, 3/21–3/22, 4/13, 5/4,
15/6 5/12, 5/32
squaring 5/16 t-distribution 9/9–9/15, 9/25, 12/20,
standard deviation 1/14, 1/17, 5/18– 12/26
5/19, 5/23, 7/21–7/28, 8/7–8/9, characteristics 9/9
8/21, 8/22 derivation 9/10
estimate 9/8–9/13 normality 9/15, 9/17
standard distributions 1/12–1/18, 7/2, occurrences 9/10
7/8, 7/14–7/15, 7/26, 9/2, 9/15 parameters 9/15
summary 9/26 tables 9/11–9/15
standard error of predicted values (SE technical errors 1/21, 1/23
(Pred)) 12/22–12/23 technological forecasting 13/6
stationary series 14/2–14/6 theoretical distributions 7/2, 9/2
exponential smoothing 14/4–14/6 time series forecasting 14/1–14/26
moving averages 14/3–14/4 advantages 14/2, 14/18, 15/4
statistical gap 4/2 Box–Jenkins method 14/16–14/18
statistical tests 4/2, 12/13, 12/16–12/19 case studies 14/25–14/26
statistical theory 6/16, 8/7, 8/27, 10/8, cycles 14/9–14/16
15/17 decomposition method 14/9
statistics Holt’s Method 14/6–14/9
definition 1/1, 1/18 Holt–Winters Method 14/8–14/9
descriptive 1/2 practical experiences with modelling
inferential 1/2 and 15/14

review of techniques 14/16–14/18 discrete 1/9


seasonality 14/9, 14/10–14/15 related 1/21, 2/2, 2/14, 11/1, 11/4,
series with a trend 14/6–14/9 11/21–11/22
series with trend and seasonality variance 4/14, 5/16–5/21, 5/23, 8/20
14/8–14/9 variance sum theorem 8/20
series with trend, seasonality and variations 7/22
cycles 14/9–14/16 explained 11/12, 11/13
stationary series 14/2–14/6 R-squared 11/12
total sum of squares (Total SS) 10/5 unexplained 11/12, 11/13
transformation 2/24, 12/7, 12/9–12/13, verbal information 5/2, 5/32
12/25 verbal summary 3/10–3/13, 3/22, 4/7,
treatments 10/4–10/10 5/32
trends 14/6–14/13 visionary forecasting 13/8
Twyman’s Law 5/23 wage negotiations, case study 1/31–1/34
unique solution 2/15, 2/16 wage, average weekly 5/2
variables 1/6, 1/12, 1/14, 2/5, 2/18, weighted aggregate index 5/27–5/31
7/2, 7/14–7/15, 7/19 weighting 5/27–5/31, 6/9
associated 11/21–11/22, 15/16 x-axis 2/2–2/13, 7/20
continuous 1/9 y-axis 2/2–2/8, 2/13
