Quantitative Methods Course Text
Professor David Targett
The courses are updated on a regular basis to take account of errors, omissions and recent
developments. If you'd like to suggest a change to this course, please contact
us: [email protected].
Quantitative Methods
The Quantitative Methods programme is written by David Targett, Professor of Information Systems at
the School of Management, University of Bath and formerly Senior Lecturer in Decision Sciences at the
London Business School. Professor Targett has many years’ experience teaching executives to add
numeracy to their list of management skills and become balanced decision makers. His style is based on
demystifying complex techniques and demonstrating clearly their practical relevance as well as their
shortcomings. His books, including Coping with Numbers and The Economist Pocket Guide to Business
Numeracy, have stressed communication rather than technical rigour and have sold throughout the world.
He has written over fifty case studies which confirm the increasing integration of Quantitative Methods
with other management topics. The cases cover a variety of industries, illustrating the changing nature of
Quantitative Methods and the growing impact it is having on decision makers in the Information Technol-
ogy age. They also demonstrate Professor Targett’s wide practical experience in international
organisations in both public and private sectors.
One of his many articles, a study on the provision of management information, won the Pergamon Prize
in 1986.
He was part of the team that designed London Business School’s highly successful part-time MBA
Programme of which he was the Director from 1985 to 1988. During this time he extended the interna-
tional focus of the teaching by leading pioneering study groups to Hong Kong, Singapore and the United
States of America. He has taught on all major programmes at the London Business School and has
developed and run management education courses involving scores of major companies including:
British Rail
Citicorp
Marks and Spencer
Shell
First Published in Great Britain in 1990.
© David Targett 1990, 2000, 2001
The rights of Professor David Targett to be identified as Author of this Work have been asserted in
accordance with the Copyright, Designs and Patents Act 1988.
All rights reserved; students may print and download these materials for their own private study only and
not for commercial use. Except for this permitted use, no materials may be used, copied, shared, lent,
hired or resold in any way, without the prior consent of Edinburgh Business School.
Contents
Module 6 4/31
Module 7 4/37
Module 8 4/45
Module 9 4/53
Module 10 4/59
Module 11 4/66
Module 12 4/72
Module 13 4/79
Module 14 4/85
Module 15 4/96
Index I/1
Learning Objectives
This module gives an overview of statistics, introducing basic ideas and concepts at
a general level, before dealing with them in greater detail in later modules. The
purpose is to provide a gentle way into the subject for those without a statistical
background, in response to the cynical view that it is not possible for anyone to read
a statistical text unless they have read it before. For those with a statistical back-
ground, the module will provide a broad framework for studying the subject.
1.1 Introduction
The word statistics can refer to a collection of numbers or it can refer to the
science of studying collections of numbers. Under either definition the subject has
received far more than its share of abuse (‘lies, damned lies…’). A large part of the
reason for this may well be the failure of people to understand that statistics is like a
language. Just as verbal languages can be misused (for example, by politicians and
journalists?) so the numerical language of statistics can be misused (by politicians
and journalists?). To blame statistics for this is as sensible as blaming the English
language when election promises are not kept.
One does not have to be skilled in statistics to misuse them deliberately (‘figures
can lie and liars can figure’), but misuses often remain undetected because fewer
people seem to have the knowledge and confidence to handle numbers than have
similar abilities with words. Fewer people are numerate than are literate. What is
needed to see through the misuse of statistics, however, is common sense with the
addition of only a small amount of technical knowledge.
The difficulties are compounded by the unrealistic attitudes of those who do
have statistical knowledge. For instance, when a company’s annual accounts report
that the physical stock level is £34 236 417 (or even £34 236 000), it conveys an aura
of truth because the figure is so precise. Had one accompanied the accountants who
estimated the figure, however, one might have seen that the method by which the data were
collected did not warrant such precision. For market research to say that 9 out of 10
dogs prefer Bonzo dog food is also misleading, but in a far more overt fashion. The
statement is utterly meaningless, as is seen by asking the questions: ‘Prefer it to
what?’, ‘Prefer it under what circumstances?’, ‘9 out of which 10 dogs?’
Such examples and many, many others of greater or lesser subtlety have generat-
ed a poor reputation for statistics which is frequently used as an excuse for
remaining in ignorance of it. Unfortunately, it is impossible to avoid statistics in
business. Decisions are based on information; information is often in numerical
form. To make good decisions it is necessary to organise and understand numbers.
This is what statistics is about and this is why it is important to have some
knowledge of the subject.
Statistics can be split into two parts. The first part can be called descriptive
statistics. Broadly, this element handles the problem of sorting a large amount of
collected data in ways which enable its main features to be seen immediately. It is
concerned with turning numbers into real and useful information. Included here are
simple ideas such as organising and arranging data so that their patterns can be seen,
summarising data so that they can be handled more easily and communicating data
to others. Also included is the now very important area of handling computerised
business statistics as provided by management information systems and decision
support systems.
The second part can be referred to broadly as inferential statistics. This element
tackles the problem of how the small amount of data that has been collected (called
the sample) may be analysed to infer general conclusions about the total amount of
similar data that exist uncollected in the world (called the population). For instance,
opinion polls use inferential statistics to make statements about the opinions of the
whole electorate of a country, given the results of perhaps just a few hundred
interviews.
Both types of statistics are open to misuse. However, with a little knowledge and
a great deal of common sense, the errors can be spotted and the correct procedures
seen. In this module the basic concepts of statistics will be introduced. Later, some
abuses of statistics and how to counter them will be discussed.
The first basic concept to look at is that of probability, which is fundamental to
statistical work. Statistics deals with approximations and ‘best guesses’ because of
the inaccuracy and incompleteness of most of the data used. It is rare to be able to make
statements about data that are 100 per cent certain.
1.2 Probability
All future events are uncertain to some degree. That the present government will
still be in power in the UK in a year’s time (given that it is not an election year) is
likely, but far from certain; that a communist government will be in power in a
year’s time is highly unlikely, but not impossible. Probability theory enables the
difference in the uncertainty of events to be made more precise by measuring their
likelihood on a scale.
The scale runs from 0 (the event is impossible) to 1 (the event is certain). Probabilities
can be measured in several ways.
(a) ‘A priori’ approach. When all the possible outcomes are known and each is equally
likely, the probability of an event can be calculated by logical thought alone, before any
trial is made. For a single spin of an unbiased coin, for example:
P(Heads) = 0.5
P(Tails) = 0.5
(b) ‘Relative frequency’ approach. When the event has been or can be repeated a
large number of times, its probability can be measured from the formula:
P(Event) = (Number of times the event occurred) / (Number of times the event could have occurred)
For example, to estimate the probability of rain on a given day in September in
London, look at the last 10 years’ records to find that it rained on 57 days. Then:
P(Rain) = 57 / (30 × 10)
        = 57/300
        = 0.19
(c) Subjective approach. A certain group of statisticians (Bayesians) would argue
that the degree of belief that an individual has about a particular event may be
expressed as a probability. Bayesian statisticians argue that in certain circum-
stances a person’s subjective assessment of a probability can and should be used.
The traditional view, held by classical statisticians, is that only objective probabil-
ity assessments are permissible. Specific areas and techniques that use subjective
probabilities will be described later. At this stage it is important to know that
probabilities can be assessed subjectively but that there is discussion amongst
statisticians as to the validity of doing so. As an example of the subjective ap-
proach, let the event be the achievement of political unity in Europe by the year
2020 AD. There is no way that either of the first two approaches could be em-
ployed to calculate this probability. However, an individual can express his own
feelings on the likelihood of this event by comparing it with an event of known
probability: for example, is it more or less likely than obtaining a head on the
spin of a coin? After a long process of comparison and checking, the result
might be:
P(Political unity in Europe by 2020 AD) = 0.10
The process of accurately assessing a subjective probability is a field of study in
its own right and should not be regarded as pure guesswork.
The three methods of determining probabilities have been presented here as an
introduction and the approach has not been rigorous. Once probabilities have been
calculated by whatever method, they are treated in exactly the same way.
Examples
1. What is the probability of throwing a six with one throw of a die?
With the a priori approach there are six possible outcomes: 1, 2, 3, 4, 5 or 6 show-
ing. All outcomes are equally likely. Therefore:
P(throwing a 6) = 1/6
2. What is the probability of a second English Channel tunnel for road vehicles being
completed by 2025 AD?
The subjective approach is the only one possible, since logical thought alone cannot
lead to an answer and there are no past observations. My assessment is a small one,
around 0.02.
3. How would you calculate the probability of obtaining a head on one spin of a biased
coin?
The a priori approach may be possible if one had information on the aerodynamical
behaviour of the coin. A more realistic method would be to conduct several trial
spins and count the number of times a head appeared:
P(obtaining a head) = (Number of heads obtained) / (Number of spins)
4. What is the probability of drawing an ace in one cut of a pack of playing cards?
Use the a priori method. There are 52 possible outcomes (one for each card in the
deck) and the probability of picking any one card, say the ace of diamonds, must
therefore be 1/52. There are four aces in the deck, hence:
P(drawing an ace) = 4/52 = 1/13
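The a priori value of 1/13 can also be checked by the relative frequency approach. A minimal Python sketch of such an experiment, simulating a large number of random cuts of a full pack, is:

```python
import random

# A pack of 52 cards: each of the 13 ranks appears 4 times; rank 0 represents an ace.
pack = list(range(13)) * 4
trials = 100_000
aces = sum(1 for _ in range(trials) if random.choice(pack) == 0)

print("Relative frequency:", aces / trials)   # close to 0.0769
print("A priori value    :", 4 / 52)          # 1/13, approximately 0.0769
```

With enough trials the observed proportion settles close to the a priori figure.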
[The raw data: 100 observations, including values such as 53, 66, 41, 71, 40, 110, 83, 106, 72, 20, 99, 92 and 75]

Table 1.1 [The same 100 observations arranged as an ordered array, running from the smallest value (40) to the largest (110)]
Table 1.1 is an ordered array. The numbers look neater now but it is still not
possible to get a feel for the data (the average, for example) as they stand. The next
step is to classify the data and then arrange the classes in order. Classifying means
grouping the numbers in bands (e.g. 50–54) to make them easier to handle. Each
class has a frequency, which is the number of data points that fall within that class.
This is called a frequency table and is shown in Table 1.2. This shows that seven
data points were greater than or equal to 40 but less than 50, 12 were greater than or
equal to 50 but less than 60 and so on. There were 100 data points in all.
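A frequency table of this kind is easily constructed by hand or with software. The Python sketch below illustrates the idea on a handful of placeholder values (the full set of 100 observations is not reproduced here), grouping them into classes of width 10 and counting the frequency of each class:

```python
data = [53, 66, 41, 71, 40, 110, 83, 106, 72, 99, 92, 75, 57, 61]  # placeholder values

# Classes of width 10: 40 <= x < 50, 50 <= x < 60, and so on up to 110 <= x < 120
for lower in range(40, 120, 10):
    frequency = sum(1 for x in data if lower <= x < lower + 10)
    print(f"{lower:>3} <= x < {lower + 10:<3}: {frequency}")
```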
It is now much easier to get an overall conception of what the data mean. For
example, most of the numbers are between 60 and 90 with extremes of 40 and 110.
Of course, it is likely that at some time there may be a need to perform detailed
calculations with the numbers to provide specific information, but at present the
objective is merely to get a feel for the data in the shortest possible time. Another
arrangement with greater visual impact, called a frequency histogram, will help
meet this objective.
[Figure: frequency histogram of the data – the class frequencies are 7, 12, 19, 27, 22, 10 and 3]

Dividing each class frequency by the total number of observations (100) gives the proportion of observations falling in that class, which can be read as a probability, e.g.

P(40 ≤ x < 50) = 7/100 = 0.07
As more data are collected and the class intervals are made narrower, the distribution
becomes smoother until, ultimately, the continuous distribution (d) will be achieved.
[Figure: (c) the distribution drawn with narrow variable classes; (d) the continuous variable. The probability that the variable falls within a given range is the area under the curve over that range, e.g. an area of 0.12 for 50 < x < 60, made up of 0.05 for 50 < x < 55 and 0.07 for 55 < x < 60]
Figure 1.6 [A continuous distribution with the area under each part of the curve shown; the total area is equal to 1.0]

Example
Using the continuous distribution in Figure 1.6, what are the probabilities that a
particular value of the variable falls within the following ranges?
1. x ≤ 60
2. x ≤ 100
3. 60 ≤ x ≤ 110
4. x ≥ 135
5. x ≥ 110
Answers
1. P(x ≤ 60) = 0.01
2. P(x ≤ 100) = 0.01 + 0.49 = 0.50
3. P(60 ≤ x ≤ 110) = 0.49 + 0.27 = 0.76
4. P(x ≥ 135) = 0.02
5. P(x ≥ 110) = 0.21 + 0.02 = 0.23
In practice, the problems with the use of continuous distributions are, first, that
one can never collect sufficient data, sufficiently accurately measured, to
establish a continuous distribution. Second, were this possible, the accurate
measurement of areas under the curve would be difficult. Their greatest
practical use is where continuous distributions appear as standard distributions, a
topic discussed in the next section.
For example, one standard distribution, the normal, is derived from the follow-
ing theoretical situation. A variable is generated by a process which should give the
variable a constant value, but does not do so because it is subject to many small
disturbances. As a result, the variable is distributed around the central value (see
Figure 1.7). This situation (central value, many small disturbances) can be expressed
mathematically and the resulting distribution can be anticipated mathematically (i.e.
a formula describing the shape of the distribution can be found).
Figure 1.8 Salaries: (a) hospital – high standard deviation; (b) school – low standard deviation
[Figure: areas under the normal curve – about 68 per cent of values lie within 1 standard deviation (s) of the mean, about 95 per cent within 2 standard deviations and about 99 per cent within 3 standard deviations]
Example
A machine is set to produce steel components of a given length. A sample of 1000
components is taken and their lengths measured. From the measurements the average
and standard deviation of all components produced are estimated to be 2.96 cm and
0.025 cm respectively. Within what limits would 95 per cent of all components pro-
duced by the machine be expected to lie?
Take the following steps:
1. Assume that the lengths of all components produced follow a normal distribution.
This is reasonable since this situation is typical of the circumstances in which normal
distributions arise.
2. The parameters of the distribution are the mean = 2.96 cm and the standard
deviation = 0.025 cm. The distribution of the lengths of the components will there-
fore be as in Figure 1.10.
Figure 1.10 [The normal distribution of component lengths, with 95 per cent of the area lying within 2 standard deviations of the mean]
3. About 95 per cent of the values of a normal distribution lie within 2 standard deviations of the mean, i.e. within 2.96 ± (2 × 0.025) cm. Therefore 95 per cent of all components would be expected to lie between 2.91 cm and 3.01 cm.
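For readers who prefer a numerical check, the sketch below evaluates the normal cumulative distribution function directly (written in terms of the error function) and confirms that roughly 95 per cent of component lengths lie between 2.91 cm and 3.01 cm. The helper normal_cdf is defined here purely for illustration.

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal distribution with the given mean and standard deviation."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

mean, sd = 2.96, 0.025
lower, upper = mean - 2 * sd, mean + 2 * sd   # 2.91 and 3.01

proportion = normal_cdf(upper, mean, sd) - normal_cdf(lower, mean, sd)
print(f"{lower:.2f} cm to {upper:.2f} cm covers {proportion:.1%} of components")  # about 95.4%
```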
1.6.1 Definitions
Statistical expressions and the variables themselves may not have precise definitions.
The user may assume the producer of the data is working with a different definition
than is the case. By assuming a wrong definition, the user will draw a wrong
conclusion. The statistical expression ‘average’ is capable of many interpretations. A
firm of accountants advertises in its recruiting brochure that the average salary of
qualified accountants in the firm is £44 200. A prospective employee may conclude
that financially the firm is attractive to work for. A closer look shows that the
accountants in the firm and their salaries are as follows:
All the figures could legitimately be said to be the average salary. The firm has
doubtless chosen the one that best suited its purposes. Even if it were certain that
the correct statistical definition was being used, it would still be necessary to ask just
how the variable (salary) is defined. Is share of profits included in the partners’
salaries? Are bonuses included in the accountants’ salaries? Are allowances (a car,
for example) included in the accountants’ salaries? If these items are removed, the
situation might be:
The mean salary is now £36 880. Remuneration at this firm is suddenly not quite
so attractive.
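The point can be made concrete with a small calculation. The Python sketch below uses a purely hypothetical set of salaries (the firm’s actual figures are not shown above) to illustrate how the mean, median and mode of the same data can differ markedly:

```python
from statistics import mean, median, mode

# Hypothetical salaries (£), for illustration only
salaries = [28_000, 30_000, 30_000, 32_000, 36_000, 44_000, 95_000]

print("Mean  :", mean(salaries))    # pulled upwards by the single high salary
print("Median:", median(salaries))  # the middle value: 32 000
print("Mode  :", mode(salaries))    # the most common value: 30 000
```

Any of the three could be quoted as ‘the average salary’, which is precisely why the definition matters.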
1.6.2 Graphics
Statistical pictures are intended to communicate data very rapidly. This speed means
that first impressions are important. If the first impression is wrong then it is
unlikely to be corrected.
There are many ways of representing data pictorially, but the most frequently
used is probably the graph. If the scale of a graph is concealed or not shown at all,
the wrong conclusion can be drawn. Figure 1.11 shows the sales figures for a
company over the last three years. The company would appear to have been
successful.
Figure 1.11 [Sales over the last three years, plotted with no vertical scale shown]
A more informative graph showing the scale is given in Figure 1.12. Sales have
hardly increased at all. Allowing for inflation, they have probably decreased in real
terms.
[Figure 1.12: the same sales data plotted with a vertical scale in £ million; sales have risen only slightly, from about 10 to about 12]
1.6.3 Sample Bias
Bias can affect sample data in several ways. First, it arises in the collection of the data. The left-wing politician who states that
80 per cent of the letters he receives are against a policy of the right-wing govern-
ment and concludes that a majority of all the electorate oppose the government on
this issue is drawing a conclusion from a biased sample.
Second, sample bias arises through the questions that elicit the data. Questions
such as ‘Do you go to church regularly?’ will provide unreliable information. There
may be a tendency for people to exaggerate their attendance since, generally, it is
regarded as a worthy thing to do. The word ‘regularly’ also causes problems. Twice a
year, at Christmas and Easter, is regular. So is twice every Sunday. It would be
difficult to draw any meaningful conclusions from the question as posed. The
question should be more explicit in defining regularity.
Third, the sample information may be biased by the interviewer. For example,
supermarket interviews about buying habits may be conducted by a young male
interviewer who questions 50 shoppers. It would not be surprising if the resultant
sample comprised a large proportion of young attractive females.
The techniques of sampling which can overcome most of these problems will be
described later in the course.
1.6.4 Omissions
The statistics that are not given can be just as important as those that are. A
television advertiser boasts that nine out of ten dogs prefer Bonzo dog food. The
viewer may conclude that 90 per cent of all dogs prefer Bonzo to any other dog
food. The conclusion might be different if it were known that:
(a) The sample size was exactly ten.
(b) The dogs had a choice of Bonzo or the cheapest dog food on the market.
(c) The sample quoted was the twelfth sample used and the first in which as many
as nine dogs preferred Bonzo.
A well-known example is the close association over the years between clergymen’s salaries and the price of rum. Neither causes the other; both have risen because of a
third factor, inflation. The variables have increased together as the cost of living has
increased, but they are unlikely to be causally related. This consideration is im-
portant when decisions are based on statistical association. To take the example
further, holding down clergymen’s salaries in order to hold down the price of rum
would work if the relationship were causal, but not if it were mere association.
Where a truthful answer would involve an awkward admission, answers may well be biased. The figure of 2.38 is likely to be higher than
the true figure. Even so, a comparison with 20 years ago can still be made, but only
provided the bias is the same now as then. It may not be. Where did the 20-year-old
data come from? Most likely from a differently structured survey of different sample
size, with different questions and in a different social environment. The comparison
with 20 years ago, therefore, is also open to suspicion.
One is also misled in this case by the accuracy of the data. The figure of 2.38
suggests a high level of accuracy, completely unwarranted by the method of data
collection. When numbers are presented to many decimal places, one should
question the relevance of the claimed degree of accuracy.
Learning Summary
The purpose of this introduction has been twofold. The first aim has been to
present some statistical concepts as a basis for more detailed study of the subject.
All the concepts will be further explored. The second aim has been to encourage a
healthy scepticism and atmosphere of constructive criticism, which are necessary
when weighing statistical evidence.
The healthy scepticism can be brought to bear on applications of the concepts
introduced so far as much as elsewhere in statistics. Probability and distributions can
both be subject to misuse.
Logical errors are often made with probability. For example, suppose a ques-
tionnaire about marketing methods is sent to a selection of companies. From the
200 replies, it emerges that 48 of the respondents are not in the area of marketing. It
also emerges that 30 are at junior levels within their companies. What is the probability
that any particular questionnaire was filled in by someone who is either not in marketing
or not at a senior level? It is tempting to suppose that:
Probability = (48 + 30)/200 = 78/200 = 39%
This is almost certainly wrong because of double counting. Some of the 48 non-
marketers are also likely to be at a junior level. If 10 respondents were non-
marketers and at a junior level, then:
Probability = (48 + 30 − 10)/200 = 68/200 = 34%
Only in the rare case where none of those at a junior level were outside the mar-
keting area would the first calculation have been correct.
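The correction for double counting is an instance of the general addition rule for two events. Writing A for ‘not in marketing’ and B for ‘at a junior level’, the second calculation is:

```latex
\begin{aligned}
P(A \cup B) &= P(A) + P(B) - P(A \cap B) \\
            &= \tfrac{48}{200} + \tfrac{30}{200} - \tfrac{10}{200}
             = \tfrac{68}{200} = 0.34
\end{aligned}
```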
Figure 1.13 [Distribution of civil servants’ salaries drawn with unequal class intervals; vertical axis: number of civil servants, 0 to 2000]
Graphical errors can frequently be seen with distributions. Figure 1.13 shows an
observed distribution relating to the salaries of civil servants in a government
department. The figures give a wrong impression of the spread of salaries because
the class intervals are not all equal. One could be led to suppose that salaries are
higher than they are. The lower bands are of width £8000 (0–8, 8–16, 16–24). The
higher ones are of a much larger size. The distribution should be drawn with all the
intervals of equal size, as in Figure 1.14.
Statistical concepts are open to misuse and wrong interpretation just as verbal
reports are. The same vigilance should be exercised in the former as in the latter.
Figure 1.14 [The same distribution drawn with equal class intervals; horizontal axis: salary (£000s); vertical axis: number of civil servants, 0 to 2000]
Review Questions
1.1 One of the reasons probability is important in statistics is that, if data being dealt with
are in the form of a sample, any conclusions drawn cannot be 100 per cent certain. True
or false?
1.2 A randomly selected card drawn from a pack of cards was an ace. It was not returned
to the pack. What is the probability that a second card drawn will also be an ace?
A. 1/4
B. 1/13
C. 3/52
D. 1/17
E. 1/3
1.4 A coin is known to be unbiased (i.e. it is just as likely to come down ‘heads’ as ‘tails’). It
has just been tossed eight times and each time the result has been ‘heads’. On the ninth
throw, what is the probability that the result will be ‘tails’?
A. Less than 1/2
B. 1/2
C. More than 1/2
D. 1
[Table for Questions 1.5 to 1.7: the number of days on which daily sales fell into each sales band – 25, 22, 17, 8 and 6 days, 78 days in all; the sales bands themselves are not shown here]
1.5 On how many days were sales not less than £50 000?
A. 17
B. 55
C. 23
D. 48
1.6 What is the probability that on any day sales are £60 000 or more?
A. 1/13
B. 23/78
C. 72/78
D. 0
1.7 What is the sales level that was exceeded on 90 per cent of all days?
A. £20 000
B. £30 000
C. £40 000
D. £50 000
E. £60 000
1.9 A normal distribution has mean 60 and standard deviation 10. What percentage of
readings will be in the range 60–70?
A. 68%
B. 50%
C. 95%
D. 34%
E. 84%
1.10 A police checkpoint recorded the speeds of motorists over a one-week period. The
speeds had a normal distribution with a mean 82 km/h and standard deviation 11 km/h.
What speed was exceeded by 97.5 per cent of motorists?
A. 49
B. 60
C. 71
D. 104
0.9 3.5 0.8 1.0 1.3 2.3 1.0 2.4 0.7 1.0
2.3 0.2 1.6 1.7 5.2 1.1 3.9 5.4 8.2 1.5
1.1 2.8 1.6 3.9 3.8 6.1 0.3 1.1 2.4 2.6
4.0 4.3 2.7 0.2 0.3 3.1 2.7 4.1 1.4 1.1
3.4 0.9 2.2 4.2 21.7 3.1 1.0 3.3 3.3 5.5
0.9 4.5 3.5 1.2 0.7 4.6 4.8 2.6 0.5 3.6
6.3 1.6 5.0 2.1 5.8 7.4 1.7 3.8 4.1 6.9
3.5 2.1 0.8 7.8 1.9 3.2 1.3 1.4 3.7 0.6
1.0 7.5 1.2 2.0 2.0 11.0 2.9 6.5 2.0 8.6
1.5 1.2 2.9 2.9 2.0 4.6 6.6 0.7 5.8 2.0
1 Classify the data in intervals one minute wide. Form a frequency histogram. What
service time is likely to be exceeded by only 10 per cent of customers?
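Should a computer be available, one way of carrying out the classification is sketched below in Python. It tallies the service times into one-minute classes and then estimates the service time exceeded by only 10 per cent of customers by reading off the value nine-tenths of the way up the ordered data:

```python
from collections import Counter

times = [
    0.9, 3.5, 0.8, 1.0, 1.3, 2.3, 1.0, 2.4, 0.7, 1.0,
    2.3, 0.2, 1.6, 1.7, 5.2, 1.1, 3.9, 5.4, 8.2, 1.5,
    1.1, 2.8, 1.6, 3.9, 3.8, 6.1, 0.3, 1.1, 2.4, 2.6,
    4.0, 4.3, 2.7, 0.2, 0.3, 3.1, 2.7, 4.1, 1.4, 1.1,
    3.4, 0.9, 2.2, 4.2, 21.7, 3.1, 1.0, 3.3, 3.3, 5.5,
    0.9, 4.5, 3.5, 1.2, 0.7, 4.6, 4.8, 2.6, 0.5, 3.6,
    6.3, 1.6, 5.0, 2.1, 5.8, 7.4, 1.7, 3.8, 4.1, 6.9,
    3.5, 2.1, 0.8, 7.8, 1.9, 3.2, 1.3, 1.4, 3.7, 0.6,
    1.0, 7.5, 1.2, 2.0, 2.0, 11.0, 2.9, 6.5, 2.0, 8.6,
    1.5, 1.2, 2.9, 2.9, 2.0, 4.6, 6.6, 0.7, 5.8, 2.0,
]

# Frequency table with classes one minute wide; int(t) gives the lower class boundary
classes = Counter(int(t) for t in times)
for lower in range(0, max(classes) + 1):
    print(f"{lower}-{lower + 1} min: {classes.get(lower, 0)}")

# Service time exceeded by only 10 per cent of customers (roughly the 90th percentile)
ordered = sorted(times)
print("Approximately", ordered[int(0.9 * len(times))], "minutes")
```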
negotiate employee benefits on a company-wide basis, but to negotiate wages for each
class of work in a plant separately. For years, however, this antiquated practice has been
little more than a ritual. Supposedly, the system gives workers the opportunity to
express their views, but the fact is that the wages settlement in the first group invariably
sets the pattern for all other groups within a particular company. The Door Trim Line
at JPC was the key group in last year’s negotiations. Being first in line, the settlement in
Door Trim would set the pattern for JPC that year.
Annie Smith is forewoman for the Door Trim Line. There are many variations of
door trim and Annie’s biggest job is to see that they get produced in the right mix. The
work involved in making the trim is about the same regardless of the particular variety.
That is to say, it is a straight piecework operation and the standard price is 72p per unit
regardless of variety. The work itself, while mainly of an assembly nature, is quite
intricate and requires a degree of skill.
Last year’s negotiations started with the usual complaint from the union about piece
prices in general. There was then, however, an unexpected move. Here is the union’s
demand for the Door Trim Line according to the minutes of the meeting:
We’ll come straight to the point. A price of 72p a unit is diabolical… A fair
price is 80p.
The women average about 71 units/day. Therefore, the 8p more that we want
amounts to an average of £5.68 more per woman per day…
This is the smallest increase we’ve demanded recently and we will not accept
less than 80p.
(It was the long-standing practice in the plant to calculate output on an average daily
basis. Although each person’s output is in fact tallied daily, the bonus is paid on daily
output averaged over the week. The idea is that this gives a person a better chance to
recoup if she happens to have one or two bad days.)
The union’s strategy in this meeting was a surprise. In the past the first demand was
purposely out of line and neither side took it too seriously. This time their demand was
in the same area as the kind of offer that JPC’s management was contemplating.
At their first meeting following the session with the union, JPC’s management heard
the following points made by the accountant:
a. The union’s figure of 71 units per day per person is correct. I checked it against the
latest Production Report. It works out like this:
Average weekly output for the year to date is 7100 units; thus, average daily output
is 7100/5 =1420 units/day.
The number of women directly employed on the line is 20, so that average daily
output is 1420/20 = 71 units/day/woman.
b. The union’s request amounts to an 11.1 per cent increase: (80 − 72)/72 × 100 =
11.1.
c. Direct labour at current rates is estimated at £26 million. Assuming an 11.1 per cent
increase across the board, which, of course, is what we have to anticipate, total
annual direct labour would increase by about £2.9 million: £26 000 000 × 11.1% =
£2 886 000.
Prior to the negotiations management had thought that 7 per cent would be a rea-
sonable offer, being approximately the rate at which productivity and inflation had been
increasing in recent years. Privately they had set 10 per cent as the upper limit to their
final offer. At this level they felt some scheme should be introduced as an incentive to
better productivity, although they had not thought through the details of any such
scheme.
As a result of the union’s strategy, however, JPC’s negotiating team decided not to
hesitate any longer. Working late, they put together their ‘best’ package using the 10
per cent criterion. The main points of the plan were as follows:
a. Maintain the 72p per unit standard price but provide a bonus of 50p for each unit
above a daily average of 61 units/person.
b. Since the average output per day per person is 71, this implies that on average 10
bonus units per person per day would be paid.
c. The projected weekly cost then is £5612:
(71 × 0.72) + (10 × 0.50) = 56.12
56.12 × 5 × 20 = £5612
d. The current weekly cost then is £5112:
71 × 0.72 × 5 × 20 = 5112
e. This amounts to an average increase of £500 per week, slightly under the 10 per
cent upper limit:
500/5112 × 100 = 9.78%
f. The plan offers the additional advantage that the average worker gets 10 bonus units
immediately, making the plan seem attractive.
g. Since the output does not vary much from week to week, and since the greatest
improvement should come from those who are currently below average, the largest
portion of any increase should come from units at the lower cost of 72p each. Those
currently above average probably cannot improve very much. To the extent that this
occurs, of course, there is a tendency to reduce the average cost below the 79p per
unit that would result if no change at all occurs:
5612/(71 × 5 × 20) = 79.0p
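The arithmetic behind points (c) to (e) above can be reproduced directly; a minimal Python sketch of the calculation (for checking only) is:

```python
units_per_day = 71        # average output per woman per day
women = 20
days = 5
piece_rate = 0.72         # £ per unit
bonus_rate = 0.50         # £ per bonus unit
bonus_threshold = 61      # daily average above which the bonus is paid

current_weekly_cost = units_per_day * piece_rate * days * women           # £5112
bonus_units = units_per_day - bonus_threshold                             # 10 per woman per day
projected_weekly_cost = (units_per_day * piece_rate
                         + bonus_units * bonus_rate) * days * women       # £5612

increase = projected_weekly_cost - current_weekly_cost                    # £500
print(f"Offer: {increase / current_weekly_cost:.2%}")                     # about 9.78%
```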
At this point management had to decide whether they should play all their cards at
once or whether they should stick to the original plan of a 7 per cent offer. Two further
issues had to be considered:
a. How good were the rates?
b. Could a productivity increase as suggested by the 9.8 per cent offer plan really be
anticipated?
Annie Smith, the forewoman, was called into the meeting, and she gave the following
information:
a. A few workers could improve their own average a little, but the rates were too tight
for any significant movement in the daily outputs.
b. This didn’t mean that everyone worked at the same level, but that individually they
were all close to their own maximum capabilities.
c. A number did average fewer than 61 units per day. Of the few who could show a
sustained improvement, most would be in this fewer-than-61 category.
This settled it. JPC decided to go into the meeting with their ‘best’ offer of 9.8 per
cent. Next day the offer was made. The union asked for time to consider it and the next
meeting was set for the following afternoon.
In the morning of the following day Annie Smith reported that her Production Per-
formance Report (see Table 1.4) was missing. She did not know who had taken it but
was pretty sure it was the union steward.
The next meeting with the union lasted only a few minutes. A union official stated his
understanding of the offer and, after being assured that he had stated the details
correctly, he announced that the union approved the plan and intended to recommend
its acceptance to its membership. He also added that he expected this to serve as the
basis for settlement in the other units as usual and that the whole wage negotiations
could probably be completed in record time.
And that was that. Or was it? Some doubts remained in the minds of JPC’s negotiat-
ing team. Why had the union been so quick to agree? Why had the Production
Performance Report been stolen? While they were still puzzling over these questions,
Annie Smith phoned to say that the Production Performance Report had been returned.
1 In the hope of satisfying their curiosity, the negotiating team asked Annie to bring the
Report down to the office. Had any mistakes been made?
Was JPC’s offer really 9.8 per cent? If not, what was the true offer?
year. It is a safety record which cannot be matched by any other form of general
anaesthesia.
Yours faithfully,
Mr Y, (President-Elect) Society for the Advancement of Anaesthesia in Dentis-
try.
1 Comment upon the evidence and reasoning (as given in the letters) that lead to these
two conclusions.
Learning Objectives
This module describes some basic mathematics and associated notation. Some
management applications are described but the main purpose of the module is to lay
the mathematical foundations for later modules. It will be preferable to encounter
the shock of the mathematics at this stage rather than later when it might detract
from the management concepts under consideration. For the mathematically literate
the module will serve as a review; for those in a rush it could be omitted altogether.
2.1 Introduction
The quantitative courses people take at school, although usually entitled ‘mathemat-
ics’, probably cover several quantitative subjects, including algebra, geometry and
trigonometry. Some of these areas are useful as a background to numerical methods
in management. These include graphs, functions, simultaneous equations and
exponents. Usually, the mathematics is a precise means of expressing concepts and
techniques. A technique may not be complex, but the presence of mathematics,
especially notation, can cause difficulties and arouse fears.
Most of the mathematics met in a management course will be reviewed here.
Although their usefulness in management will be indicated, this is not the prime
purpose. Relevance is not an issue at this stage; the main objective is to deal with
basic mathematical ideas now so that they do not interfere with comprehension of
more directly applicable techniques at a later stage.
[Figure: plotting the point (3,2) – 3 units along the x-axis and 2 units up the y-axis]

[Figure: the four quadrants of a graph – x values are negative to the left of the y-axis and positive to the right of it; y values are negative below the x-axis and positive above it]

[Figure: several points plotted on a graph, including (−7,4), (3,2), (6½,1), (−4,0), (4,0), (0,−3) and (−4,−3)]
This can be written more concisely with letters in place of the words. Suppose x
is the variable number of products sold, y is the variable direct profit, p the set price
and q the set cost per unit. Then:
y = (p − q)x   (2.1)
The equation is given a label (Equation 2.1) so that it can be referred to later.
Note that multiplication can be shown in several different ways. For example,
Price (p) times Volume (x) can be written as:
p · x    px    (p)(x)
The multiplication sign (×) used in arithmetic tends not to be used with letters
because of possible confusion with the use of x as a variable.
The use of symbols to represent numbers is the dictionary definition of algebra.
It is intended not to confuse but to simplify. The symbols (as opposed to verbal
labels, e.g. ‘y’ instead of ‘Direct profit’) shorten the description of a complex
relationship; the symbols (as opposed to numbers, e.g. y instead of 2.1, 3.7, etc.)
allow the general properties of the variables to be investigated instead of particular
properties when the symbols take particular numerical values.
The relationship (Equation 2.1) above is an equation. Since price and cost are
fixed in this example, p and q are constants. Depending upon the quantities sold, x
and y may take on any of a range of values; therefore, they are variables. Once x is
known, y is automatically determined, so y is a function of x. Whenever the value
of a variable can be calculated given the values of other variables, it is said to be a
function of the other variables.
If the values of the constants are known, say p = 5, q = 3, then the equation
becomes:
y = 2x   (2.2)
A graph can now be made of this function. The graph is the set of all points
satisfying Equation 2.2, i.e. all the points for which Equation 2.2 is true. By looking
at some of the points, the shape of the graph can be seen:
when x = 0, y = 2 times 0 = 0
when x = 1, y = 2 times 1 = 2
when x = 2, y = 2 times 2 = 4, etc.
Therefore, points (0,0), (1,2), (2,4), etc. all lie on this function. Joining together a
sample of such points shows the shape of this graph. This has been done in
Figure 2.5, which shows the graph of the function y = 2x.
Figure 2.5 [Graph of the straight line y = 2x, passing through the points (0,0), (1,2) and (2,4)]

[Graph of the curve y = x² − 2, plotted from the table of values:
x = −3  −2  −1   0   1   2   3
y =  7   2  −1  −2  −1   2   7]

[Graph of the curve y = x³ + 3x² − 2, plotted from the table of values:
x = −3  −2  −1   0   1   2   3
y = −2   2   0  −2   2  18  52]
In Figure 2.5 only two points need be plotted since a straight line is defined
completely by any two points lying on it. The number of points which require to be
plotted varies with the complexity of the function.
When we are working with functions, they are usually restricted to their algebraic
form. It is neater and more economical to use them in this form. They are generally
put in graphical form only for illustrative purposes. The behaviour of complex
equations is often difficult to imagine from the mathematical form itself.
where
Q1 is the quantity sold at price P1
Q2 is the quantity sold at price P2
Suppose the product currently sells at the price P1 and the quantity sold per
month is Q1. A new price is mooted. What is likely to be the quantity sold at this
price? If the elasticity is known (or can be estimated), then the equation can be
rearranged and solved for Q2, i.e. put in the form:
Q2 = function of E,Q1,P1,P2
The likely quantity sold (Q2) at the new price (P2) can then be calculated. But first
the equation would have to be rearranged.
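As an illustration of the rearrangement that is needed, suppose the elasticity is defined in the usual way as the proportional change in quantity divided by the proportional change in price (the original formula is not reproduced above, so this definition is an assumption). Applying the rearrangement rules described below would then give:

```latex
E = \frac{(Q_2 - Q_1)/Q_1}{(P_2 - P_1)/P_1}
\;\;\Longrightarrow\;\;
Q_2 = Q_1\left(1 + E \cdot \frac{P_2 - P_1}{P_1}\right)
```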
The four rules by which equations can be rearranged are:
(a) Addition. If the same quantity is added to both sides of an equation, the
resulting equation is equivalent to the original equation.
Examples
1 Solve x − 1 = 2
  Add 1 to both sides of the equation: x − 1 + 1 = 2 + 1
  x = 3
(b) Subtraction. If the same quantity is subtracted from both sides of an equation,
the resulting equation is equivalent to the original.
Examples
1 Solve x + 4 = 14
  Subtract 4 from both sides of the equation: x + 4 − 4 = 14 − 4
  x = 10
(c) Division. If both sides of an equation are divided by the same quantity, other than
zero, the resulting equation is equivalent to the original.
Examples
1 Solve 8x = 72
  Divide both sides by 8: 8x/8 = 72/8
  x = 9
2 Solve for x: 2y − 4x + 5 = 6x − 3y − 5
  Add 5 and 3y to both sides: 5y − 4x + 10 = 6x
  Add 4x to both sides: 5y + 10 = 10x
  Divide by 10: y/2 + 1 = x
This illustrates that the solved variable can appear on either side of the equation.
(d) Multiplication. If both sides of an equation are multiplied by the same number,
except zero, the resulting equation is equivalent to the original.
Examples
1 Solve x/3 = 6
  Multiply both sides by 3: x = 18
2 Solve (2y + 3)/(4 − y) = 1
  Multiply both sides by (4 − y): 2y + 3 = 4 − y
  Add y to both sides: 3y + 3 = 4
  Subtract 3 from both sides: 3y = 1
  Divide both sides by 3: y = 1/3
Example
[A worked simplification of an algebraic fraction]
Figure 2.8 [The straight line y = 2x + 1, crossing the axes at (−½,0) and (0,1). For any two points A (x₁,y₁) and B (x₂,y₂) on the line, Slope = (y₁ − y₂)/(x₁ − x₂), i.e. the y distance between the points divided by the x distance between them]
The same reasoning applies to any two points along the line, confirming the
obvious fact that the slope (and therefore m) is constant along a straight line.
If A and B are two particular points (2,5) and (1,3) (i.e. x1 = 2, y1 = 5, x2 = 1, y2 =
3) then:
Slope AB = (5 − 3)/(2 − 1) = 2
For example, if the sales volume of a company were expressed as a linear func-
tion of time, y would be sales and x would be time (x = 1 for the first time period,
x = 2 for the second time period and so on). Then m would be the constant change
in sales volume from one time period to the next. If m = 3, then sales volume would
be increasing by 3 each time period.
A few additional facts are worthy of note.
(a) It is possible for m to be negative. If this is the case, the line leans in a backward
direction, since as x increases, y decreases. This is illustrated in Figure 2.9.
(b) It is possible for m to take the value 0. The equation of the line is then
y = constant, and the line is parallel to the x axis.
(c) Similarly, the line x = constant is parallel to the y axis and the slope can be
regarded as being infinite. The last two lines are examples of constant functions
and are also shown in Figure 2.9.
Figure 2.9 [Examples: the line y = −x + 3, which has negative slope; the horizontal line y = 1 (zero slope); and the vertical line x = −2 (infinite slope)]
2. What is the equation of the line with intercept −3 and which passes through the
point (1,1)?
Since the intercept is −3, the line is:
y = mx − 3
Since it passes through (1,1):
1=m–3
m=4
The line is y = 4x – 3.
3. What is the equation of the line passing through the points (3,1) and (1,5)?
The slope between these two points is:
Slope = (1 − 5)/(3 − 1) = −4/2 = −2
Therefore, the line must be y = −2x + c.
Since it passes through (3,1) (NB (1,5) could just as easily be used):
1 = −6 + c
c=7
The line is y = −2x + 7.
The values of x and y that satisfy both equations are found from the point of
intersection of the lines (see Figure 2.10). Since this point is on both lines, the x and
y values here must satisfy both equations. From the graph these values can be read:
x = 4, y = 3. That these values do fit both equations can be checked by substituting
x = 4, y = 3 into the equations of the lines.
Figure 2.10 [Lines (2.4) and (2.5) plotted on the same graph; they intersect at the point x = 4, y = 3]
Figure 2.11 [The lines 2x + 3y = 24 and 2x + 3y = 12 are parallel; they have no point of intersection and therefore no simultaneous solution]
Example
Solve the two simultaneous equations:
5x + 2y = 17   (2.6)
2x − 3y = 3    (2.7)
Multiply Equation 2.6 by 3 and Equation 2.7 by 2 so that the coefficients of y are the
same in both equations. (Equation 2.6 could just as well have been multiplied by 2 and
Equation 2.7 by 5 and then x eliminated.)
15x + 6y = 51
4x − 6y = 6
Add the two equations to eliminate y:
19x = 57
x=3
Substitute x = 3 in Equation 2.7 to find the y value:
6 − 3y = 3
3y = 3
y=1
The solution is x = 3, y = 1.
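For larger systems of equations the same elimination procedure is usually handed over to software. A minimal Python sketch using the numpy library (assumed to be available) reproduces the solution:

```python
import numpy as np

# Coefficients of 5x + 2y = 17 and 2x - 3y = 3
A = np.array([[5.0, 2.0],
              [2.0, -3.0]])
b = np.array([17.0, 3.0])

x, y = np.linalg.solve(A, b)
print(x, y)   # approximately 3.0 and 1.0
```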
[Figure: the balance of £1000 invested at 10 per cent per annum compound interest – £1000, £1100, £1210 and £1331 at the end of years 0, 1, 2 and 3]
2.6.1 Exponents
Consider an expression of the form a^x. The base is a and the exponent is x. If x is a
whole number, then the expression has an obvious meaning (e.g. a² = a × a,
a³ = a × a × a, 3⁴ = 81, etc.). It also has meaning for values of x that are not whole
numbers. To see what this meaning is, it is necessary to look at the rules for working
with exponents.
(a) Multiplication. The rule is:
a^x × a^y = a^(x+y)
For example:
a² × a³ = a⁵
a⁴ × a⁶ = a¹⁰
It can be seen that this makes good sense if one substitutes whole numbers for a,
x and y. For instance:
2² × 2³ = 4 × 8
        = 32
        = 2⁵
Note that the exponents can only be added if the bases are the same. For example, a³ × b² cannot be simplified.
(b) Division. The rule is similar to that for multiplication:
a^x / a^y = a^(x−y)
For example:
a⁵ / a² = a⁵⁻² = a³
Again, the reasonableness of the rule is confirmed by resorting to a specific nu-
merical example.
(c) Raising to a power. The rule is:
(a^x)^y = a^(xy)
For example:
(a²)³ = a⁶
Several points of detail follow from the rules:
(a) a⁰ = 1, since 1 = a^x / a^x = a^(x−x) = a⁰
(b) a^(−x) = 1/a^x, since 1/a^x = a⁰/a^x = a^(0−x) = a^(−x)
(c) a^(1/2) = √a, since a^(1/2) × a^(1/2) = a^(1/2 + 1/2) = a¹ = a
This last point demonstrates that fractional or decimal exponents do have mean-
ing.
Examples
1. Simplify (a² × a⁵)/a³
= a⁷/a³
= a⁷⁻³
= a⁴
2. Evaluate: 27^(4/3)
= (27^(1/3))⁴
= (∛27)⁴
= 3⁴
= 81
3. Evaluate: 4^(−3/2)
= 1/4^(3/2)
= 1/(√4)³
= 1/2³
= 1/8
4. Evaluate: (2²)³
= 2⁶
= 64
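These evaluations are easy to confirm numerically; a quick Python check, for illustration only:

```python
print(27 ** (4 / 3))    # approximately 81
print(4 ** (-3 / 2))    # 0.125, i.e. 1/8
print((2 ** 2) ** 3)    # 64
```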
2.6.2 Logarithms
In pursuing the objective of understanding exponential functions, it is also helpful
to look at logarithms. At school, logarithms are used for multiplying and dividing
large numbers, but this is not the purpose here. A logarithm is simply an exponent.
For example, if y = a^x, then x is said to be the logarithm of y to the base a. This is
written as logₐy = x.
Examples
1. 1000 = 10³ and therefore the logarithm of 1000 to the base 10 is 3 (i.e.
3 = log₁₀1000). Logarithms to the base 10 are known as common logarithms.
2. 8 = 2³ and therefore the logarithm of 8 to the base 2 is 3 (i.e. 3 = log₂8). Logarithms
to the base 2 are binary logarithms.
3. e is a constant frequently found in mathematics (just as π is). e has the value 2.718
approximately. Logarithms to the base e are called natural logarithms and are writ-
ten ln (i.e. x = ln y means x = logₑy).
e has other properties which make it of interest in mathematics.
The rules for manipulation of logarithms follow from the rules for exponents:
(a) Addition: log x + log y = log(xy)
(b) Subtraction: log x − log y = log(x/y)
(c) Multiplication by a constant: k log x = log(x^k)
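The rules can be confirmed numerically for any base; the short Python sketch below uses common logarithms from the standard math module:

```python
from math import log10

x, y = 100.0, 1000.0

print(log10(x) + log10(y), log10(x * y))    # addition rule: both give the same value
print(log10(y) - log10(x), log10(y / x))    # subtraction rule: both give the same value
print(3 * log10(x), log10(x ** 3))          # multiplication by a constant: both give the same value
```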
[Figure: (a) the exponential growth curve y = 2 × 2^x (k = 2); (b) the exponential decay curve y = 5 × 2^(−x)]

[Figure: the number of new cases plotted against time, starting from an initial level]
A statistical technique, called regression analysis, is the usual way of dealing with this
second problem. (Regression analysis is a topic covered in Module 11 and Module 12.)
Review Questions
2.1 Which point on the graph shown below is (−1,2)?
[Graph showing four plotted points labelled A, B, C and D]
A. Point A
B. Point B
C. Point C
D. Point D
2.2 What is the equation of the line shown on the graph below?
[Graph showing a straight line]
A. y = x + 1
B. y = 1 − x
C. y = −x − 1
D. y = x − 1
2.3 Which of the ‘curves’ shown below is most likely to have the equation y = x² − 6x + 4?
[Graph showing three curves labelled A, B and C]
A. A
B. B
C. C
2.6 What is the equation of the line with intercept 3 and that goes through point (3,9)?
A. y = 3x + 2
B. y = 6x + 3
C. y = 4x + 3
D. y = 2x + 3
2.7 What is the equation of the line that goes through the points (−1,6) and (3,−2)?
A. y = 2x + 8
B. y = 4 − 2x
C. y = −x + 5
D. y = 2x − 8
2.12 What is the equation of the curve shown in the graph below?
[Graph of a curve rising steeply; the y-axis is marked 10, 20 and 30 and the x-axis is marked 1 and 2]
A. = + 10
B. = 10 · 10 .
C. = 10 · 10 .
D. = 100 · 10 .
1 What is the breakeven point for each system (i.e. how many of each system need to be
sold so that revenue equals cost)?
Handling Numbers
Module 3 Data Communication
Module 4 Data Analysis
Module 5 Summary Measures
Module 6 Sampling Methods
Data Communication
Contents
3.1 Introduction.............................................................................................3/1
3.2 Rules for Data Presentation ..................................................................3/3
3.3 The Special Case of Accounting Data ............................................... 3/12
3.4 Communicating Data through Graphs.............................................. 3/16
Learning Summary ......................................................................................... 3/21
Review Questions ........................................................................................... 3/22
Case Study 3.1: Local Government Performance Measures ..................... 3/24
Case Study 3.2: Multinational Company’s Income Statement.................. 3/25
Case Study 3.3: Country GDPs ..................................................................... 3/25
Case Study 3.4: Energy Efficiency ................................................................. 3/26
Learning Objectives
By the end of the module the reader should know how to improve data presenta-
tion. This is important both in communicating data to others and in analysing data.
The emphasis is on the visual aspects of data presentation. Special reference is made
to accounting data and graphs.
3.1 Introduction
Data communication means the transmission of information through the medium
of numbers. Its reputation is mixed. Sometimes it is thought to be done dishonestly
(‘There are lies, damned lies and statistics’); at other times it is thought to be done
confusingly so that the numbers appear incomprehensible and any real information
is obscured. Thus far, numbers and words are similar. Words can also mislead and
confuse. The difference seems to be that numbers are treated with less tolerance and
are quickly abandoned as a lost cause. More effort is made with words. One hears,
for instance, of campaigns for the plain and efficient use of words by bureaucrats,
lawmakers, etc., but not for the plain and efficient use of numbers by statisticians,
computer scientists and accountants. Furthermore, while experts spend much time
devising advanced numerical techniques, little effort is put into methods for better
data communication.
This module attempts to redress the balance by looking closely at the question of
data communication. In management, numbers are usually produced in the form of
tables or graphs. The role of both these modes of presentation will be discussed.
The case of accounting data will be given separate treatment.
It may seem facile to say that data should be presented in a form that is suitable
for the receiver rather than convenient for the producer. Yet computerised man-
agement data often relate more to the capabilities of the computer than to the needs
of the managers – many times accounting information seems to presuppose that all
the receivers are accountants; statistics frequently can only be understood by the
highly numerate. A producer of data should have the users at the forefront of his or
her mind, and should also not assume that the receiver has a similar technical
background to him- or herself.
In the context that the requirements of the users of data are paramount, the aim
now is to show how data might be presented better. How is ‘better’ to be defined?
A manager meets data in just a few general situations:
(a) Business reports. The data are usually the supporting evidence for conclusions
or suggestions made verbally in the text.
(b) Management information systems. Large amounts of data are available on
screen or delivered to the manager at regular intervals, usually in the form of
computer printouts.
(c) Accounting data. Primarily, for a manager, these will indicate the major
financial features of an organisation. The financial analyst will have more detailed
requirements.
(d) Self-generated data. The manager may wish to analyse his own data: sales
figures, delivery performance, invoice payments, etc.
In all these situations speed is essential. A manager is unlikely to have the time to
carry out a detailed analysis of every set of data that crosses his or her desk. The
data should be communicated in such a way that its features are immediately
obvious. Moreover, the features should be the main ones rather than the points of
detail. These requirements suggest that the criterion that distinguishes well-
presented data should be: ‘The main patterns and exceptions in the data should be
immediately evident.’
The achievement of this objective is made easier since, in all the situations above,
the manager will normally be able to anticipate the patterns in the data. In the first
case above, the pattern will have been described in the text; in the other three, it is
unlikely that the manager will be dealing with raw data in a totally new set of
circumstances and he or she will therefore have some idea of what to expect. The
methods of improving data presentation are put forward with this criterion in mind.
In looking at this subject it must be stressed that it is not just communicating
data to others that is important. Communicating data to oneself is a step in coming
to understand them. The role of data communication in analysis is perhaps the most
valuable function of the ideas proposed here. Important results in the sciences,
medicine and economics are not usually discovered through the application of a
sophisticated computerised technique. They are more likely to be discovered
because someone has noticed an interesting regularity or irregularity in a small
amount of data. Such features are more likely to be noticed when the data are well
presented. Sophistication may be introduced later when one is trying to verify the
result rigorously, but this should not be confused with the original analysis. In short,
simple, often visual, methods of understanding numbers are highly important. The
role of data communication as part of the analysis of data will be explored in the
next module.
Original Rounded
(2 effective figures)
1382 1400
721 720
79.311 79
17.1 17
4.2 4.2
2.32 2.3
These numbers have been rounded to the first two figures. Contrast this to fixed
rounding, such as rounding always to the first decimal place. For example, if the
above numbers were rounded to the first decimal place, the result would be:
Original Rounded
(1st decimal place)
1382 1382.0
721 721.0
79.311 79.3
17.1 17.1
4.2 4.2
2.32 2.3
Rounding to the same number of decimal places may appear to be more con-
sistent, but it does not make the numbers easier to manipulate and communicate.
Rounding to two effective figures puts numbers in the form in which mental
arithmetic is naturally done. The numbers are therefore assimilated more quickly.
The situation is slightly different when a series of similar numbers, all of which
have, say, the first two figures in common, are being compared. The rounding
would then be to the first two figures that are effective in making the comparison
(i.e. the first two that differ from number to number). This is the meaning of
‘effective’ in ‘two effective figures’. For example, the column of numbers below
would be rounded as shown:
Original Rounded
1142 1140
1327 1330
1489 1490
1231 1230
1588 1590
The numbers have been rounded to the second and third figures because all the
numbers have the first figure, 1, in common. Rounding to the first two figures
would be over-rounding, making comparisons too approximate.
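Rounding to two effective (significant) figures is also easy to automate. The Python sketch below is a minimal illustration for the simple case where the numbers being compared do not share leading digits; the helper function is defined here for illustration and is not a standard library routine:

```python
import math

def round_to_effective_figures(value, figures=2):
    """Round a number to its first `figures` significant digits."""
    if value == 0:
        return 0
    magnitude = math.floor(math.log10(abs(value)))   # position of the leading digit
    factor = 10 ** (figures - 1 - magnitude)
    return round(value * factor) / factor

for original in [1382, 721, 79.311, 17.1, 4.2, 2.32]:
    print(original, "->", round_to_effective_figures(original))
```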
Many managers may be concerned that rounding leads to inaccuracy. It is true, of
course, that rounding does lose accuracy. The important questions are: ‘Would the
presence of extra digits affect the decision being taken?’ and ‘Just how accurate are
the data anyway – is the accuracy being lost spurious accuracy?’
Often, one finds that eight-figure data are being insisted upon in a situation
where the decision being taken rests only on the first two figures, and where the
method of data collection was such that only the first two figures can be relied upon
as being accurate. Monitoring financial budgets is a case in point. During a financial
year actual costs are continuously compared with planned. There is room for
approximation in the comparison. If the budget were £11 500, it is enough to know
that the actual costs are £10 700 (rounded to two effective figures). No different
decision would be taken had the actual costs been specified as £10 715. The
conclusion would be the same: actual costs are about 7 per cent below budget. Even
if greater accuracy were required it may not be possible to give it. At such an early
stage actual costs will almost certainly rest in some part on estimates and be subject
to error. To quote actual costs to the nearest £1 is misleading, suggesting a level of
accuracy that has not been achieved. In this situation high levels of accuracy are
therefore neither necessary nor obtainable, yet the people involved may insist on
issuing actual cost data to the nearest £1. Where there is argument about the level of
accuracy required, A. S. C. Ehrenberg (1975) suggests in his book that the numbers
should be rounded but a note at the bottom of the table should be provided to
indicate a source from which data specified to a greater precision can be obtained.
Then wait for the rush.
It is much easier to see the main features of the data from Table 3.2. East now
stands out as a clear exception, its profit being out of line with the other divisions.
The rounding also facilitates the calculation of accounting ratios. The profit margin
is about 14 per cent for all divisions except East, where it is over 20 per cent. While
it is possible to see these features in Table 3.1, they are not so immediately apparent
as in the amended table. When managers have many such tables crossing their
desks, it is essential that attention should be attracted quickly to important infor-
mation that may require action.
Frequently, tables are ordered alphabetically. This is helpful in a long reference
table that is unfamiliar to the user, but not so helpful when management infor-
mation is involved. Indeed, it may be a hindrance. In management information it is
the overall pattern, not the individual entries, that is of interest. Alphabetical order is
more likely to obscure than highlight the pattern. In addition, managers are usually
not totally unfamiliar with the data they receive. For instance, anyone looking at
some product’s sales figures by state in the USA would probably be aware that, for a
table in population order, California would be close to the top and Alaska close to
the bottom. In other words, the loss from not using alphabetical order is small
whereas the gain in data communication is large.
[Illustration: the subtraction 57 − 23 = 34 is easier to carry out when the numbers are arranged one above the other in a column]
In a table, then, the more important comparison should be presented down col-
umns, not along rows. Taking the ‘Capital employed’ data from Table 3.2, it could
be presented horizontally, as in Table 3.3, or vertically, as in Table 3.4. In Table 3.3
the differences between adjacent rows are 390, 140, 190 respectively. When the data
are in a column, such calculations are made much more quickly.
In many tables, however, comparisons across rows and down columns are equal-
ly important and no interchange is possible.
The summary measure is usually an average since it is important that the sum-
mary should be of the same size order as the rest of the numbers. A column total,
for example, is not a good summary, being of a different order of magnitude from
the rest of the column and therefore not a suitable basis for comparison. The
summary measure can also be the basis for ordering the rows and columns (see
Section 3.2.2).
type of number from another (e.g. to separate a summary row from the rest of the
numbers). Table 3.6 shows the data of Table 3.2 but with white space and gridlines
introduced. Table 3.7 is a repeat of Table 3.6 but with an acceptable use of space
and gridlines.
Table 3.6 Data of Table 3.2 with white space and gridlines
Division       Capital employed       Turnover       Profit
South 1870 730 96
West 1480 560 82
North 1340 530 78
East 1150 430 89
The patterns and exceptions in these data are much more clearly evident once the
white space and gridlines have been removed. The purpose of many tables is to
compare numbers. White space and gridlines have the opposite effect. They
separate numbers and make the comparison more difficult.
3.2.6 Rule 6: Labelling Should Be Clear but Unobtrusive
Care should be taken when labelling data; otherwise the labels may confuse and
detract from the numbers. This seems an obvious point, yet in practice two labelling
faults are regularly seen in tables. First, the constructor of a table may use abbreviat-
ed or obscure labels, having been working on the project for some time and falsely
assuming that the reader has the same familiarity with the numbers and their
definitions. Second, gaps may be introduced in a column of numbers merely to
accommodate extra-long labels. Labels should be clear and not interfere with the
understanding of the numbers.
Table 3.8 is an extract from a table of historical data relating to United Kingdom
utilities prior to their privatisation in the 1980s and 1990s. The extract shows ‘Gross
income as % of net assets’ for a selection of these organisations. First, the length of
the label relating to Electricity results in gaps in the column of numbers; second,
this same label includes abbreviations, the meaning of which may not be apparent to
the uninitiated. In Table 3.9 the labels have been shortened and unclear terms
eliminated. If necessary, a footnote or appendix could provide an exact definition of
the organisations concerned.
In Table 3.9 the numbers and labels are clearer. Watertight and lengthy defini-
tions of the organisations do not belong within the table. The purpose of this rule is
to assert the primary importance of tables in communicating numbers. As far as
possible, the labels should give unambiguous definitions of the numbers but should
not obscure the information contained in the numbers.
Table 3.10 GDP of nine EC countries plus Japan and the USA (€ thousand million)
                  1965     1975     1987     1990     1995     1997
United Kingdom    99.3    172.5    598.8    766.4    846.3   1133.3
Belgium           16.6     46.2    123.5    154.4    209.0    213.7
Denmark           10.1     26.9     88.8     99.6    129.4    140.2
France            96.8    253.3    770.2    940.8   1169.1   1224.4
Germany*         114.3    319.9    960.9   1182.2   1846.4   1853.9
Ireland            2.7      5.9     27.2     35.9     49.4     65.1
Italy             53.4    130.2    657.4    861.2    832.0   1011.1
Luxembourg         0.7      1.7      6.0      8.1     13.2     13.9
Netherlands       18.7     61.2    188.9    222.3    301.9    315.6
Japan             83.0    372.8   2099.4   2341.5   3917.9   3712.1
USA              690.0   1149.7   3922.3   4361.5   5374.3   6848.2
* Germany includes the former GDR in 1995 and 1997.
(f) Rule 6. The labelling is already clear. No changes have been made.
(g) Rule 7. It would be very difficult to make a simple verbal summary of these
data. Moreover, in this context the publishers would probably not wish to appear
to lead the reader's thinking by suggesting what the patterns were.
The typical questions that might be asked of these data can now be applied to
Table 3.11. It is possible to see quickly that Germany's GDP increased by 1900/110
= just over 17 times; Italy's by 1000/53 = just under 19 times; Japan's by just under
45 times; the UK’s by 11; Japan has overtaken Germany, France and the UK;
Ireland is over four times the size of Luxembourg economically. The information is
more readily apparent from Table 3.11 than from Table 3.10.
2015 2016
£ £000
Operating Expenses:
Crew Wages and Social Security 7 685 965 (9 010)
Other Crew Expenses 541 014 (633)
Insurance Premiums 1 161 943 (1 367)
Provisions and Stores 1 693 916 (2 268)
Repairs and Maintenance 1 685 711 (3 297)
Other Operating Expenses 60 835 (27)
(12 829 384)
(8 131 706)
Net Profit/(Loss) Currency Exch. (190 836) (680)
Dividends 35 732 47
(8 286 810)
6 967 179
These features of the company’s finances are remarkable. Even when one knows
what they are, it is very difficult to see them in the original table (Table 3.12). Yet it
is this volatility that is of major interest to shareholders and managers.
The question of rounding creates special difficulties with accounting data. The
reason is that rounding and exact adding up are not always consistent. It has to be
decided which is the more important – the better communication of the data or the
need to allow readers to check the arithmetic. The balance of argument must weigh
in favour of the rounding. Checking totals is a trivial matter in published accounts
(although not, of course, in the process of auditing). If a mistake were found in the
published accounts of such a large company, the fault would almost certainly lie
with a printer's error. But 'total checking' is an obsessive pastime and few companies
would risk the barrage of correspondence that would undoubtedly ensue, even if a
note to the accounts explained that rounding was the cause. Because of this factor
the two-effective-figures rule may have to be broken so that adding and subtracting
are exact. This has been done in Table 3.13. One might wonder why totals are
exactly right in company accounts that have in any case usually been rounded to
some extent (to the nearest £ million for many companies). The answer is that the
figures have been 'fudged' to make it so. The same
considerations apply, but to a lesser degree, with internal company financial infor-
mation.
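For readers curious how rounded figures can still be made to add exactly, the Python sketch below rounds a column to the nearest £1000 and then nudges the item that was rounded furthest so that the rounded figures sum to the independently rounded total. It is only an illustration of the 'fudging' described above; the function and the choice of a £1000 step are assumptions made here, and the input figures are the operating expenses from the extract.

def round_preserving_total(values, step=1000):
    # Round each value to the nearest `step`, then force the rounded figures
    # to add up to the independently rounded total (a largest-remainder style
    # adjustment - a sketch of the 'fudging' the text describes).
    target = round(sum(values) / step)
    rounded = [round(v / step) for v in values]
    shortfall = target - sum(rounded)
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / step - rounded[i],
                   reverse=(shortfall > 0))
    for i in order[:abs(shortfall)]:
        rounded[i] += 1 if shortfall > 0 else -1
    return [r * step for r in rounded], target * step

# Operating-expense figures (2015 column) from the extract above
items = [7685965, 541014, 1161943, 1693916, 1685711, 60835]
rounded_items, rounded_total = round_preserving_total(items)
print(rounded_items)     # [7686000, 541000, 1162000, 1694000, 1685000, 61000]
print(rounded_total, sum(rounded_items) == rounded_total)   # 12829000 True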
Communicating financial data is an especially challenging area. The guiding prin-
ciple is that the main features should be evident to the users of the data. It should
not be necessary to be an expert in the field nor to have to carry out a complex
analysis in order to appreciate the prime events in a company’s financial year. Some
organisations are recognising these problems by publishing two sets of (entirely
consistent) final accounts. One is source material, covering legal requirements and
suitable for financial experts; the other is a communicating document, fulfilling the
purpose of accounts (i.e. providing essential financial information). Other organisa-
tions may, of course, have reasons for wanting to obscure the main features of their
financial year.
(Figure: Rate % plotted against Year, 1986–96, with lines for Italy, USA, Canada, France, Netherlands, Denmark, Belgium, Germany and Japan.)
(Figure: monthly coal imports, January 2011 to January 2015, for Denmark, Spain, Italy, Netherlands and Belgium.)
2012 Jan. 570 450 230 420 380 2014 Jan. 930 470 470 300 360
Feb. 800 550 300 280 310 Feb. 780 510 530 390 420
Mar. 1100 770 330 400 430 Mar. 900 590 440 260 440
Apr. 910 690 540 270 380 Apr. 780 530 440 490 440
May 970 690 290 390 420 May 1100 510 420 400 440
June 910 660 350 520 240 June 1000 650 440 510 350
July 900 690 370 430 300 July 1100 550 390 350 400
Aug. 900 580 330 240 300 Aug. 870 580 570 350 360
Sept. 750 520 480 430 340 Sept. 1100 610 460 380 360
Oct. 1400 640 410 380 430 Oct. 1100 750 730 360 530
Nov. 1100 590 360 560 430 Nov. 950 660 750 530 410
Dec. 1000 450 340 570 430 Dec. 1200 600 650 500 400
If the general pattern over the years or a comparison between countries is re-
quired, Table 3.15 is suitable. This shows the average monthly imports of coal for
each year. It can now be seen that four of the countries have increased their imports
by 30–45 per cent. Italy is the exception, having decreased coal imports
by 20 per cent. The level of imports in the countries can be compared.
In these terms, Italy is the largest, followed by Belgium, followed by the other
three at approximately the same level.
Table 3.15 can be transferred to a graph, as shown in Figure 3.4. General patterns
are evident. Italy has decreased its imports, the others have increased theirs; the level
of imports is in the order Italy, Belgium … The difference between the table and the
graph becomes clear when magnitudes have to be estimated. The percentage change
(−20 per cent for Italy, etc.) is readily calculated from the table, but not from the
graph. In general, graphs show the sign of changes but a table is needed to make an
estimate of the size of the changes. The purpose of the data and personal prefer-
ence would dictate which of the two were used.
(Figure 3.4: average monthly coal imports per year, plotted for Italy, Belgium, Netherlands, Denmark and Spain.)
Pictures are useful for attracting attention and for showing very general patterns.
They are not useful for showing complex patterns or for extracting actual numbers.
Learning Summary
The communication of data is an area that has been neglected, presumably because
it is technically simple and there is a tendency in quantitative areas (and perhaps
elsewhere) to believe that only the complex can be useful. Yet in modern organisa-
tions there can be few things more in need of improvement than data
communication.
Although the area is technically simple, it does involve immense difficulties. What
exactly is the readership for a set of data? What is the purpose of the data? How can
the common insistence on data specified to a level of accuracy that is not needed by
the decision maker and is not merited by the collection methods be overcome? How
much accounting convention should be retained in communicating financial
information to the layperson? What should be done about the aspects of data
presentation that are a matter of taste? The guiding principle among the problems is
that the data should be communicated according to the needs of the receiver rather
than the producer. Furthermore, they should be communicated so that the main
features can be seen quickly. The seven rules of data presentation described in this
module seek to accomplish this.
Rule 1: round to two effective figures.
Rule 2: reorder the numbers.
Rule 3: interchange rows and columns.
Rule 4: use summary measures.
Rule 5: minimise use of space and lines.
Rule 6: clarify labelling.
Rule 7: use a verbal summary.
Producers of data are accustomed to presenting them in their own style. As al-
ways there will be resistance to changing an attitude and presenting data in a
different way. The idea of rounding especially is usually not accepted instantly.
Surprisingly, however, while objections are raised against rounding, graphs tend to
be universally acclaimed, even when not appropriate. Yet the graphing of data is the
grossest form of rounding. There is evidently a need for clear and consistent
thinking in regard to data communication.
This issue has been of increasing importance because of the growth in usage of
all types and sizes of computers and the development of large-scale management
information systems. The benefits of this technological revolution should be
enormous but the potential has yet to be realised. The quantities of data that
circulate in many organisations are vast. It is supposed that the data provide
information which in turn leads to better decision making. Sadly, this is frequently
not the case. The data circulate, not providing enlightenment, but causing at best
indifference and at worst tidal waves of confusion. Poor data communication is a
prime cause of this. It could be improved. Otherwise, one must question the
wisdom of the large expenditures many organisations make in providing untouched
and bewildering management data. One thing is clear: if information can be assimi-
lated quickly, it will be used; if not, it will be ignored.
Review Questions
3.1 In communicating management data, which of the following principles should be adhered
to?
A. The requirements of the user of the data are paramount.
B. Patterns in the data should be immediately evident.
C. The data should be specified to two decimal places.
D. The data should be analysed before being presented.
3.2 The specification of the data (the number of decimal places) indicates the accuracy. True
or false?
3.3 The accuracy required of data should be judged in the context of the decisions that are
to be based upon the data. True or false?
1732, 1256.3, 988.42, 38.1
B. 1730, 1260, 990, 38
C. 1700, 1300, 990, 38
3.6 Which are correct reasons? It is easier to compare numbers in a column than in a row
because:
A. The difference between two- and three-figure numbers is quickly seen.
B. Subtractions of one number from another are made more quickly.
C. The numbers are likely to be closer together and thus easier to analyse quickly.
3.7 When the rows (each referring to the division of a large company) of a table of numbers
are ordered by size, the basis for the ordering should be:
A. The numbers in the left-hand column.
B. The averages of the rows.
C. The capital employed in the division.
D. The level of manpower employed in the division.
3.9 Only some of the presentation rules can be applied to financial accounts. This is
because:
A. Rounding cannot be done because the reader may want to check that the
auditing has been correct.
B. Rounding cannot be done because it is illegal.
C. An income statement cannot be ordered by size since it has to build up to a
final profit.
D. Published annual accounts are for accountants; therefore their presentation is
dictated by accounting convention.
1 This table is one of many that elected representatives have to consider at their monthly
meetings. The representatives need, therefore, to be able to appreciate and understand
the main features very quickly. In these circumstances, how could the data be presented
better? Redraft the table to illustrate the changes.
1 Compared to many accounting statements Table 3.17 is already well presented, but
what further improvements might be made?
1 Table 3.18 gives the results of this sensitivity analysis. It shows the extent to which the
assumptions have been varied and the new IRR for each variation. The ‘base rate’ is the
IRR for the original calculation. How could it be better presented? (Note that it is not
necessary to understand the situation fully in order to propose improvements to the
data communication.)
References
Ehrenberg, A. S. C. (1975). Data Reduction. New York: John Wiley and Sons.
Data Analysis
Contents
4.1 Introduction.............................................................................................4/1
4.2 Management Problems in Data Analysis .............................................4/2
4.3 Guidelines for Data Analysis ..................................................................4/6
Learning Summary ......................................................................................... 4/15
Review Questions ........................................................................................... 4/16
Case Study 4.1: Motoring Correspondent ................................................... 4/17
Case Study 4.2: Geographical Accounts ...................................................... 4/18
Case Study 4.3: Wages Project ..................................................................... 4/19
Learning Objectives
By the end of this module the reader should know how to analyse data systematical-
ly. The methodology suggested is simple, relying very much on visual interpretation,
but it is suitable for most data analysis problems in management. It carries implica-
tions for the ways information is produced and used.
4.1 Introduction
What constitutes successful data analysis? There is apparently some uncertainty on
this point. If a group of managers are given a table of numbers and asked to analyse
it, most probably they will ‘number pick’. Individual numbers from somewhere in
the middle of the table which look interesting or which support a long-held view
will be selected for discussion. If the data are profit figures, remarks will be made
such as: ‘I see Western region made £220 000 last year. I always said that the new
cost control system would work.’ A quotation from Andrew Lang, a Scottish poet,
could be applied to quite a few managers: ‘He uses statistics as a drunken man uses
lamp posts – for support rather than illumination.’
Real data analysis is concerned with seeking illumination, not support, from a set
of numbers. Analysis is defined as ‘finding the essence’. A successful data analysis
must therefore involve deriving the fundamental patterns and eliciting the real
information contained in the entire table. This must happen before sensible remarks
can be made about individual numbers. To know whether the cost system in the
above example really did work requires the £220 000 to be put in the context of
profit and cost patterns in all regions.
The purpose of this module is to give some guidelines showing how illumination
might be derived from numbers. The guidelines give five steps to follow in order to
find what real information, if any, a set of numbers contains. They are intended to
provide a framework to help a manager understand the numbers he or she encoun-
ters.
One might have thought that understanding numbers is what the whole subject
of statistics is about, and so it is. But statistics was not developed for use in man-
agement. It was developed in other fields such as the natural sciences. When it is
transferred to management, there is a gap between what is needed and what
statistics can offer. Certainly, many managers, having attended courses or read
books on statistics, feel that something is missing and that the root of their problem
has not been tackled. This and other difficulties involved in the analysis of manage-
ment data will be pursued in the following section, before some examples of the
types of data managers face are examined. Next, the guidelines, which are intended
to help fill the statistics gap, will be described and illustrated. Finally, the implica-
tions of this gap for the producers of statistics will be discussed.
(b) Data analysis is rather like reading. When looking at a business report, a manager will usually
read it carefully, work out exactly what the author is trying to say and then decide
whether it is correct. The process is similar with a table of numbers. The data
have to be sifted, thought about and weighed. To do this, good presentation (as
stressed in Module 3 in the rules for data presentation) may be more important
than sophisticated techniques. Most managers could do excellent data analyses
provided they had the confidence to treat numbers more like words. It is only
because most people are less familiar with numbers than words that the analysis
process needs to be made more explicit (via guidelines such as those in Section
4.3 below) in the case of numbers.
(c) Over-complication by the experts. The attitude of numbers experts (and other
sorts of experts as well) can confuse managers. The experts use jargon, which is
fine when talking to their peers but not when talking to a layperson; they try
sophisticated methods of analysis before simple ones; they communicate results
in a complicated form, paying little regard to the users of the data. For example,
vast and indigestible tables of numbers, all to five decimal places, are often the
output of a management information system. The result can be that the experts
distance themselves from management problems. In some companies specialist
numbers departments have adopted something akin to a research and develop-
ment role, undertaking solely long-term projects. Managers come to believe that
they have not the skills to help themselves while at the same time believing that
no realistic help is available from experts.
Accounting Data
In Module 3 Table 3.12 showed the income statement of a multinational shipping
company. It is difficult to analyse (i.e. it is difficult to say what the significant
features of the company’s business were). Some important happenings are obscured
in Table 3.12, but they were revealed when the table was re-presented in Table 3.13.
MONTH CUMULATIVE
TERMINAL COSTS
ESTIMATE STANDARD VARIANCE VAR % ESTIMATE STANDARD VARIANCE VAR % BUDGET
LO-LO
STEVEDORING
STRAIGHT TIME - FULL 131 223 143 611 12 388 8.6 1 237 132 1 361 266 124 134 9.1 1 564 896
STRAIGHT TIME - M.T. 13 387 14 651 1 264 8.6 256 991 281 399 24 408 8.7
(UN)LASHING 78 (78) 78 (78)
SHIFTING 801 (801) 11 594 (11 594)
OVERTIME, SHIFT TIME OF
WAITING & DEAD TIME 7 102 (7 102) 190 620 (190 620)
RO-RO
STEVEDORING
TRAILERS
STRAIGHT FULL 20 354 26 136 5 782 22.1 167 159 215 161 48 002 22.3 330 074
STRAIGHT M.T. 178 228 50 21.9 14 846 18 993 4 147 21.8
RO-RO COST PLUS
VOLVO CARGO
ROLLING VEHICLES 14 326 19 515 5 189 26.6 98 210 157 163 58 951 37.5
BLOCKSTONED 29 27 (2) (7.4) 613 674 61 9.1
(UN) LASHING RO-RO 355 (355) 355 (355)
SHIFTING 977 (977) 3 790 (3 790)
OVERTIME, SHIFT TIME OF
WAITING & DEAD TIME 1 417 (1 417) (28 713) (28 713)
HEAVY LIFTS (OFF STANDARD) 2 009 (2 009)
CARS
STEVEDORING
STRAIGHT TIME 6 127 6 403 276 4.3 38 530 35 328 (3 202) (9.1) 168 000
(UN) LASHING 2 (2)
SHIFTING 795 (795) 1 288 (1 288)
OVERTIME, SHIFT TIME OF
WAITING & DEAD TIME 7 573 (7 573)
OTHER SHIPSIDE OF COSTS 3 422 (3 422) 24 473 (24 473)
TOTAL TERMINAL COSTS 200 571 210 571 10 000 4.5 2 083 976 2 069 984 (13 992) (.7) 2 062 970
Market Research
Table 4.2 indicates what can happen when experts over-complicate an analysis. The
original data came from interviews of 700 television viewers who were asked which
British television programmes they really like to watch. The table is the result of the
analysis of this relatively straightforward data. It is impossible to see what the real
information is, even if one knows what correlation means. However, a later and
simpler analysis of the original data revealed a result of wide-ranging importance in
the field of television research. (See Ehrenberg, 1975, for further comment on this
example.)
In all three examples any message in the data is obscured. They were produced by
accountants, computer scientists and statisticians respectively. What managers
would have the confidence to fly in the face of experts and produce their own
analysis? Even if they had the confidence, how could they attempt an analysis? The
guidelines described below indicate, at a general level, how data might be analysed.
They provide a starting point for data analysis.
The re-presentation being recommended does not refer just to data that are liter-
ally a random jumble. On the contrary, the assumption is that the data have already
been assembled in a neat table. Neatness is preferable to messiness but the patterns
may still be obscured. When confronted by apparent orderliness one should take
steps to re-present the table in a fashion which makes it easy to see any patterns
contained in it. The ways in which data can be rearranged were explored in detail in
the previous module. Recall that the seven steps were:
(a) Round the numbers to two effective figures.
(b) Put rows and columns in size order.
(c) Interchange rows and columns where necessary.
(d) Use summary measures.
(e) Minimise use of gridlines and white space.
(f) Make the labelling clear and do not allow it to hinder comprehension of the
numbers.
(g) Use a verbal summary.
Models cannot encapsulate every nuance of reality, but for some time they have been able to
summarise and predict reality to a generally acceptable level of approximation. In
management the objective is usually no more than this.
Only if the simple approach fails are complex methods necessary, and then ex-
pert knowledge may be required. As a last resort, even if the numbers are random
(random means they have no particular pattern or order), this is a model of a sort
and can be useful. For example, the fact that the day-to-day movements in the
returns from quoted shares are random is an important part of modern financial
theory.
The final stage is to set the results against other results for comparison. The other results may be from another year, from another company,
from another country or from another analyst. In other words, reference can usually
be made to a wider set of information. In consequence, questions may be prompted:
Why is the sales mix different this year from the previous five? Why do other
companies have less brand switching for their products? Why is productivity higher
in the west of Germany? Making comparisons such as these provides a context in
which to evaluate results and also suggests the consistencies or anomalies which
may in turn lead to appropriate management action.
If the results coincide with others, then this further establishes the model and
may mean that in future fewer data may need to be collected – only enough to see
whether the already established model still holds. This is especially true of manage-
ment information systems where managers receive regular printouts of sets of
numbers and they are looking for changes from what has gone before. It is more
efficient for a manager to carry an established model from one time period to the
next rather than the raw data.
Example: Consumption of Distilled Spirits in the USA
As an example of an analysis of numbers that a manager might have to carry out,
consider Table 4.3 showing the consumption of distilled spirits in different states of the
USA. The objective of the analysis would be to measure the variation in consumption
across the states and to detect any areas where there were distinct differences. How
can the table be analysed and what information can be gleaned from it? The five stages
of the guidelines are followed.
Stage 1: Reduce the data. Many of the data are redundant. Are percentage fig-
ures really necessary when per capita figures are given? It is certainly possible, with
some imaginative effort, to conceive of uses of percentage data, but they are not
central to the purposes of the table. It can be reduced to a fraction of its original
size without any loss of real information.
Stage 2: Re-present. To understand the table more quickly, the numbers can be
rounded to two effective figures. The original table has numbers, in places, to eight
figures. No analyst could possibly make use of this level of specification. What con-
clusion would be affected if an eighth figure were, say, a 7 instead of a 4? In any
event, the data are not accurate to eight figures. If the table were a record docu-
ment (which it is not) then more than two figures may be required, but not eight.
Putting the states in order of decreasing population is more helpful than alphabetical
order. Alphabetical order is useful for finding names in a long list, but it adds nothing
to the analysis process. The new order means that states are just as easy to find.
Most people will know that California has a large population and Alaska a small one,
especially since no one using the table will be totally ignorant of the demographic
attributes of the USA. At the same time, the new order makes it easy to spot states
whose consumption is out of line with their population.
The end result of these changes, together with some of a more cosmetic nature, is
Table 4.4. Contrast this table with the original, Table 4.3.
Alaska 46 47 1 391 172 1 359 422 2.3 0.33 0.32 3.64 3.86
Arizona 29 30 4 401 883 4 144 521 6.2 1.03 0.98 1.94 1.86
Arkansas 38 38 2 534 826 2 366 429 7.1 0.60 0.56 1.20 1.12
California 1 1 52 529 142 52 054 429 0.9 12.33 12.32 2.44 2.46
Colorado 22 22 6 380 783 6 310 566 1.1 1.50 1.49 2.47 2.49
Connecticut 18 18 7 194 684 7 271 320 (−1.1) 1.69 1.72 2.31 2.35
Delaware 45 43 1 491 652 1 531 688 (−2.6) 0.35 0.36 2.56 2.65
Dist. of Columbia 27 27 4 591 448 4 828 422 (−4.9) 1.08 1.14 6.54 6.74
Florida 4 4 22 709 209 22 239 555 1.7 5.33 5.28 2.70 2.67
Georgia 13 13 10 717 681 9 944 846 7.8 2.52 2.35 2.16 2.02
Hawaii 41 40 2 023 730 1 970 089 2.7 0.48 0.47 2.28 2.28
Illinois 3 3 26 111 587 26 825 876 (−2.7) 6.13 6.35 2.33 2.41
Indiana 19 20 7 110 382 7 005 511 1.5 1.67 1.66 1.34 1.32
Kansas 35 35 2 913 422 2 935 121 (−0.7) 0.68 0.70 1.26 1.29
Kentucky 26 26 4 857 094 5 006 481 (−3.0) 1.14 1.19 1.42 1.47
Louisiana 21 21 7 073 283 6 699 853 5.6 1.66 1.59 1.84 1.77
Maryland 12 12 10 833 966 10 738 731 0.9 2.54 2.54 2.61 2.62
Massachusetts 10 10 13 950 268 14 272 695 (−2.3) 3.28 3.38 2.40 2.45
Minnesota 15 15 8 528 284 8 425 567 1.2 2.00 1.99 2.15 2.15
Missouri 20 17 7 074 614 7 697 871 (−7.9) 1.66 1.82 1.48 1.61
Nebraska 36 36 2 733 497 2 717 859 0.6 0.64 0.64 1.76 1.76
Nevada 30 31 4 360 172 4 095 910 6.5 1.02 0.97 7.15 6.92
New Jersey 8 8 15 901 587 16 154 975 (−1.6) 3.73 3.82 2.17 2.21
New Mexico 42 41 1 980 372 1 954 139 1.3 0.47 0.46 1.70 1.70
New York 2 2 41 070 005 41 740 341 (−1.6) 9.64 9.88 2.27 2.30
North Dakota 47 46 1 388 475 1 384 311 0.3 0.33 0.33 2.16 2.16
Oklahoma 33 29 3 904 574 4 187 527 (−6.8) 0.92 0.99 1.41 1.54
Rhode Island 39 39 2 073 075 2 131 329 (−2.7) 0.49 0.50 2.24 2.30
South Carolina 23 25 5 934 427 5 301 054 11.9 1.39 1.26 2.08 1.88
South Dakota 48 48 1 312 160 1 242 021 5.6 0.31 0.29 1.91 1.82
Tennessee 24 24 5 618 774 5 357 160 4.9 1.32 1.27 1.33 1.28
Texas 5 6 17 990 532 17 167 560 4.8 4.22 4.06 1.44 1.40
Wisconsin 11 11 10 896 455 10 739 261 1.5 2.56 2.54 2.36 2.33
Total licence 319 583 215 317 874 435 0.5 75.04 75.22 2.13 2.13
Stage 3: Build a model. The pattern is evident from the transformed table. Con-
sumption varies with the population of the state. Per capita consumption in each
state is about equal to the figure for all licence states with some variation (±30 per
cent) about this level. The pattern a year earlier was the same except that overall
consumption increased slightly (1 per cent) between the two years. Refer back to
Table 4.3 and see if this pattern is evident even when it is known to be there. There
may of course be other patterns but this one is central to the objectives of the anal-
ysis.
Stage 4: Exceptions. The overall pattern of approximately equal per capita con-
sumption in each state allows the exceptions to be seen. From Table 4.4, three
states stand out as having a large deviation from the pattern. The states are District
of Columbia, Nevada and Alaska. These states were exceptions to the pattern in the
earlier year as well. Explanations in the cases of District of Columbia and Nevada are
readily found, probably being to do with the large non-resident populations. People
live, and drink, in these states who are not included in the population figures (diplo-
mats in DC, tourists in Nevada). An explanation for Alaska may be to do with the
lack of leisure opportunities. Whatever the explanations, the analytical method has
done its job. The patterns and exceptions in the data have been found. Explanations
are the responsibility of experts in the marketing of distilled spirits in the USA.
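Spotting such exceptions can also be mechanised. The Python sketch below applies the model just described (per capita consumption roughly equal to the overall licence-state figure of 2.13, give or take about 30 per cent) to a handful of per capita figures read from Table 4.4; the particular states included and the 30 per cent threshold are choices made here for illustration.

# Per capita consumption (this year) for a selection of states, from Table 4.4
per_capita = {
    "California": 2.44, "New York": 2.27, "Illinois": 2.33, "Florida": 2.70,
    "Georgia": 2.16, "Maryland": 2.61, "Minnesota": 2.15,
    "Dist. of Columbia": 6.54, "Nevada": 7.15, "Alaska": 3.64,
}
overall = 2.13   # figure for all licence states

# The model: each state is roughly at the overall level, within about +/-30 per cent
for state, value in per_capita.items():
    deviation = (value - overall) / overall
    if abs(deviation) > 0.30:
        print(f"{state:18s} {value:4.2f}  ({deviation:+.0%} from overall) - exception")

Run on these figures, the only states flagged are District of Columbia, Nevada and Alaska, matching the exceptions identified above.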
Stage 5: Comparison. A comparison between the two years is provided by the
table. Other comparisons will be relevant to the task of gaining an understanding of
the USA spirits market. The following data would be useful:
(i) earlier years, say, five and ten years before;
(ii) a breakdown of aggregate data into whisky, gin, vodka, etc.;
(iii) other alcoholic beverages: wine, beer, etc.
Once data from these other sources have been collected they would be analysed in
the manner described, but of course the process would be shorter because the
pattern can be anticipated. Care would need to be taken that like was being com-
pared with like. For example, it would have to be checked that an equivalent
definition of consumption was in force ten years earlier.
The second implication is more direct. Data should be presented in forms which
enable them to be analysed speedily and accurately. Much of the reduction and re-
presentation stages of the guidelines could, in most instances, be carried out just as
well by the producer of the data as by the user. It would then need to be done only
once rather than many times by the many users of the data. Unfortunately, when
time is spent thinking about the presentation of statistics, it is usually spent in
making the tables look neat or attractive rather than making them amenable to
analysis.
There is much that the producers of data can do by themselves. For example,
refer back to the extract from a management information system shown in Ta-
ble 4.1: if thought were given to the analysis of the data through the application of
the guidelines, a different presentation would result (see Table 4.5).
(a) Some data might be eliminated. For instance, full details on minor categories of
expenditure may not be necessary. This step has not been taken in Table 4.5
since full consultation with the receivers would be necessary.
(b) The table should be re-presented using the rules of data presentation. In
particular, some rounding is helpful. This is an information document, not an
auditing one, and thus rounding is appropriate. In any case, no different conclu-
sions would be drawn if any of the expenditures were changed by one unit. In
addition, improvement is brought about by use of summary measures and a
clearer distinction between such measures and the detail of the table.
(c) A model derived from previous time periods would indicate when changes were
taking place. There is a good case for including a model or summary of previous
time periods with all MIS data. This has not been done for Table 4.5 since previ-
ous data were not available.
(d) Exceptions can be clearly marked. It is, after all, a prime purpose of budget data
to indicate where there have been deviations from plan. This can be an automatic
process: for example, all variances greater than 10 per cent could be marked (a
small sketch of such automatic flagging follows this list). This might even obviate
the need for variance figures to be shown.
(e) The making of comparisons is probably not the role of the data producer in this
example, involving as it does the judgement of the receivers in knowing what the
relevant comparisons are. The task of the producer has been to facilitate these
comparisons.
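As promised in point (d), the following is a minimal Python sketch of automatic flagging. The three budget lines and the 10 per cent threshold are illustrative assumptions; variance and variance percentage are calculated against the standard, as in the tables above.

# Hypothetical budget lines: (label, estimate, standard)
lines = [
    ("LO-LO stevedoring (full)", 131000, 144000),
    ("RO-RO trailers (full)", 20400, 26100),
    ("Cars stevedoring", 6100, 6400),
]

THRESHOLD = 0.10   # flag variances greater than 10 per cent of standard

for label, estimate, standard in lines:
    variance = standard - estimate
    pct = variance / standard
    flag = "*" if abs(pct) > THRESHOLD else " "
    print(f"{flag} {label:26s} {estimate:>8,d} {standard:>8,d} {variance:>7,d} {pct:6.1%}")

Only the RO-RO line is starred, which is the kind of immediate signal the text has in mind.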
Making the suggested changes does of course have a cost attached in terms of
management time. However, the cost is a small fraction of the cost of setting up and
operating the information system. The changes can transform the system and make
it fully operational. If an existing system is being largely ignored by managers, there
may be no alternative.
Table 4.5 Budgeting data from an MIS (amended from Table 4.1)
Port: Liverpool OCEAN PORT TERMINAL COSTS – SHIPSIDE OPERATIONS
Period: December (U.S Dollars: Conversion rate 1.60)
MONTH CUMULATIVE
ESTIMATE STANDARD VARIANCE VAR % ESTIMATE STANDARD VARIANCE VAR % BUDGET
LO-LO: Stevedoring (STR-FULL) 131 000 144 000 12 000 9 1 240 000 1 360 000 120 000 9 1 560 000
Stevedoring (STR-MT) 13 400 14 700 1 300 9 257 000 281 000 24 000 9
Unlashing 78 0 −78 * 78 0 −78 *
Shifting 200 0 −200 * 12 000 0 −12 000 *
Overtime, etc. 7 100 0 −7 100 * 191 000 0 −191 000 *
RO-RO: Stevedoring TR (STR-FULL) 20 400 26 100 5 800 22 167 000 215 000 48 000 22 330 000
Stevedoring TR (STR-MT) 180 230 50 22 15 000 19 000 4 100 22
Stevedoring cost plus 0 0 0 0 0 0 0 0
Stevedoring Volvo 0 0 0 0 0 0 0
Stevedoring rolling 14 300 19 500 5 200 27 98 000 157 000 59 000 37
Stevedoring blockstow 29 27 −2 −7 610 670 60 9
Unlashing 350 0 −350 * 350 0 −350 *
Shifting 980 0 −980 * 3 800 0 −3 800 *
Overtime, etc. 1 400 0 −1 400 * 29 000 0 −29 000 *
Heavy lifts 0 0 0 0 2 000 0 −2 000 *
CARS: Stevedoring (STR) 6 100 6 400 280 4 38 000 35 000 −3 000 −9 170 000
Unlashing 0 0 0 0 2 0 −2 *
Shifting 800 0 −800 * 1 300 0 −1 300 *
Overtime, etc. 0 0 0 0 7 600 0 −7 600 *
TOTALS: LO-LO 152 000 158 000 5 900 4 1 700 000 1 640 000 −59 000 4
RO-RO 38 000 46 000 8 300 18 315 000 392 000 77 000 20
CARS 6 900 6 400 −520 −8 47 000 35 000 −12 000 −34
OTHER 3 400 0 −3 400 * 24 000 0 −24 000
GRAND 201 000 211 000 10 000 5 2 080 000 2 070 000 −14 000 −1 2 060 000
TOTAL
Totals may not agree because of rounding.
*Zero standard cost, therefore variance not calculable.
Few managers would deny that there is currently a problem with the provision
and analysis of data, yet they rarely say so to their IT systems manager or whoever
sends them data. Without feedback, inexpensive yet effective changes are never
made. It must be the responsibility of users to criticise constructively the form and
content of the data they receive. The idea that computer scientists/statisticians
always know best, and if they bother to provide data then they must be useful, is
false. The users must make clear their requirements, and even resort to a little
persistence if alterations are not forthcoming.
Learning Summary
Every manager sees the problem of handling numbers differently because each sees
it mainly in the (probably) narrow context with which he or she is familiar in his or
her own work. One manager sees numbers only in the financial area; another sees
them only in production management. The guidelines suggested here are intended
to be generally applicable to the analysis of business data in many different situa-
tions and with a range of different requirements. The key points are:
(a) In most situations managers without statistical backgrounds can carry out
satisfactory analyses themselves.
(b) Simple methods are preferable to complex ones.
(c) Visual inspection of well-arranged data can play a role in coming to understand
them.
(d) Data analysis is like verbal analysis.
(e) The guidelines merely make explicit what comes naturally when dealing with
words.
The need for better skills to turn data into real information in managerial situa-
tions is not new. What has made the need so urgent in recent times is the
exceedingly rapid development of computers and associated management infor-
mation systems. The ability to provide vast amounts of data very quickly has grown
enormously. It has far outstripped the ability of management to make use of the
data. The result has been that in many organisations managers have been swamped
with so-called information which in fact is no more than mere numbers. The
problem of general data analysis is no longer a small one that can be ignored. When
companies are spending large amounts of money on data provision, the question of
how to turn the data into information and use them in decision making is one that
has to be faced.
Review Questions
4.1 Traditional statistical techniques do not help managers in analysing data. True or false?
4.2 The need for new management skills in data analysis arises because so many data come
from computers, which means that they have to be presented in a more complicated
style. True or false?
4.3 Which of the following reasons is correct? The first step in data analysis is to reduce the
data. It is done because:
A. Most data sets contain some inaccuracies.
B. One can only analyse a small amount of data at a time.
C. Most data sets contain some items which are irrelevant.
4.4 Data recorded to eight decimal places can be rounded down since such a degree of
accuracy will not affect the decision being taken. True or false?
A. True
B. False
4.5 Which of the following reasons are correct? A model or pattern is used to summarise a
table because:
A. Exceptions can be seen more easily and accurately.
B. It is easier to make comparisons with other sets of data.
C. The model will be more accurate than the original data.
4.7 A company has four divisions. The profit and capital employed by each are given in the
table below. Which division is the exception?
A. Division 1
B. Division 2
C. Division 3
D. Division 4
4.8 A confectionery manufacturer’s production level for a new chocolate bar is believed to
have increased by 5 per cent per month over the last 36 months. However, for 11 of
these months this model does not fit. The exceptions were as follows: for five months
strikes considerably reduced production; the three Decembers had lower figures, as did
the three Augusts, when the production plant is closed for two weeks. You would be
right in concluding that the 5 per cent model is not a good one because 11 exceptions
out of 36 is too many. True or false?
4.9 Towards the completion of an analysis of the consumption of distilled spirits across the
different states of the USA in a particular year, the results are compared with those of
similar studies. Which of the following other analyses would be useful?
A. Consumption of distilled spirits across the départements of France.
B. Consumption of wine across the départements of France.
C. Consumption of wine across the states of the USA.
D. Consumption of whisky across the states of the USA.
4.10 A simple model is used in preference to a sophisticated one in the analysis of data
because sophisticated models obscure patterns. True or false?
Ten years ago, in 2004, the death rate on the roads of this country was running
at 0.1 death for every 1 million miles driven. By 2009 a death occurred every
12 million miles. Last year, according to figures just released, there were 6400
deaths, whilst a total of 92 000 million miles were driven.
References
Ehrenberg, A. S. C. (1975). Data Reduction. New York: John Wiley and Sons.
Summary Measures
Contents
5.1 Introduction.............................................................................................5/1
5.2 Usefulness of the Measures ....................................................................5/2
5.3 Measures of Location..............................................................................5/5
5.4 Measures of Scatter ............................................................................. 5/14
5.5 Other Summary Measures ................................................................. 5/20
5.6 Dealing with Outliers .......................................................................... 5/21
5.7 Indices ................................................................................................... 5/22
Learning Summary ......................................................................................... 5/29
Review Questions ........................................................................................... 5/30
Case Study 5.1: Light Bulb Testing............................................................... 5/33
Case Study 5.2: Smith’s Expense Account .................................................. 5/34
Case Study 5.3: Monthly Employment Statistics ........................................ 5/34
Case Study 5.4: Commuting Distances ........................................................ 5/34
Case Study 5.5: Petroleum Products ........................................................... 5/35
Learning Objectives
By the end of the module, the reader should know how large quantities of numbers
can be reduced to a few simple summary measures that are much easier to handle
than the raw data. The most common measures are those of location and scatter.
The special case of summarising time series data with indices is also described.
5.1 Introduction
When trying to understand and remember the important parts of a lengthy verbal
report, it is usual to summarise. This may be done by expressing the essence of the
report in perhaps a few sentences, by underlining key phrases or by listing the main
subsection headings. Each individual has his or her own method, which may be physical (a
written précis) or mental (some way of registering the main facts in the mind).
Whatever the method, the point is that it is easier to handle information in this way,
by summarising and storing these brief summaries in one’s memory. On the few
occasions that details are required, it is necessary to turn to the report itself.
The situation is no different when it is numerical rather than verbal information
that is being handled. It is still better to form a summary to capture the salient
characteristics. The summary may be a pattern, simple or complex, revealed when
analysing the data, or it may be based on one or more of the standard summary
measures described in this module.
Inevitably some accuracy is lost. In the extreme, if the summarising is badly done,
it can be wholly misleading. (How often do report writers claim to have been totally
misunderstood after hearing someone else’s summary of their work?) Take the case
of a recent labour strike in the UK about wages payment. In reporting the current
levels of payment, newspapers/union leaders/employers could not, of course, state
the payments for all 173 000 employees in the industry. They had to summarise.
Five statements as to the ‘average’ weekly wage were made:
The average weekly wage is …
All these quoted wages were said to be the same thing: the average weekly wage.
Are the employees in the industry grossly underpaid or overpaid? It is not difficult
to choose an amount that reinforces one’s prejudices. The discrepancies are not
because of miscalculations but because of definitions. Quote 1 is the basic weekly
wage without overtime, shift allowances and unsocial hours allowance, and it has
been reduced for tax and other deductions. Since the industry is one that requires
substantial night-time working for all employees, no one actually takes home the
amount quoted. Quote 2 is the same as the first but without the tax deduction.
Quote 3 is the average take-home pay of a sample of 30 employees in a particular
area. Quote 4 is basic pay plus unsocial hours allowance but without any overtime
or tax deductions. Quote 5 is basic pay plus allowances plus maximum overtime
pay, without tax and other deductions.
It is important when using summary measures (and in all of statistics) to apply
common sense and not be intimidated by complex calculations. Just because
something that sounds statistical is quoted (‘the average is £41.83’) does not mean
that its accuracy and validity should be accepted without question. When summary
measures fail, it is usually not because of poor arithmetic or poor statistical
knowledge but because common sense has been lacking.
In this context, the remainder of this module goes on to describe ways of sum-
marising numbers and to discuss their effectiveness and their limitations.
Suppose a production manager receives monthly a computer report of the two previous months' production. The report for
June (22 working days) and May (19 working days) is given in Table 5.1.
The data as shown are useful for reference purposes (e.g. what was the produc-
tion on 15 May?) or for the background to a detailed analysis (e.g. is production
always lower on a Friday and, if so, by how much?). Both these types of use revolve
around the need for detail. For more general information purposes (e.g. was May a
good month for production? What is the trend of production this year?) the amount
of data contained in the table is too large and unwieldy for the manager to be able to
make the necessary comparisons. It would be rather difficult to gauge the trend of
production levels so far this year, from six reports, one for each month, such as that
in Table 5.1. If summary measures were provided, then most questions, apart from
the ones that require detail, could be answered readily. A summary of Table 5.1
might be as shown in Table 5.2.
The summary information provided in Table 5.2 enables a wide variety of man-
agement questions to be answered and, more importantly, answered quickly.
Comparisons are made much more easily if summary data for several months, or
years, are available on one report.
Three types of summary measure are used in Table 5.2. The first, average pro-
duction, measures the location of the numbers and tells at what general level the
data are. The second, the range of production, measures scatter and indicates how
widely spread the data are. The third indicates the shape of the data. In this case,
the answer ‘symmetrical’ says that the data fall equally on either side of the average.
The three measures reflect the important attributes of the data. No important
general features of the data are omitted. If, on the other hand, the measure of scatter
had been omitted, the two months could have appeared similar. In actual fact, their
very different ranges provide an important piece of information that reflects
production planning problems.
For each type of measure (location, scatter, shape) there is a choice of measure to
use (for location, for example, the choice is between the arithmetic mean, the median and the mode). The
different types of measures are described below. The measures have many uses
other than as summaries and these will be indicated. They will also be found in
other subject areas. For example, the variance, a measure of scatter, plays a central
role in modern financial theory.
x̄ = ∑x / n
where:
x refers to the data in the set
x̄ is standard notation for the arithmetic mean
∑ is the Greek capital sigma and, mathematically, means 'sum of'
n is standard notation for the number of readings in the set.
3, 3, 4, 5, 5, 6, 6, 6, 7
↑
Middle number
Median = 5
If there is an even number of readings, then there can be no one middle number.
In this case, it is usual to take the arithmetic mean of the middle two numbers as the
median.
For example, if the set of nine numbers above was increased to ten by the pres-
ence of ‘8’, the set would become:
3, 3, 4, 5, 5, 6, 6, 6, 7, 8
︸
Middle two numbers
Median = (5+6)/2
Median = 5.5
5.3.3 Mode
The third measure of location is the mode. This is the most frequently occurring
value. Again, there is no mathematical formula for the mode. The frequency with
which each value occurs is noted and the value with the highest frequency is the
mode.
Again, using the same nine numbers as an example: 3, 3, 4, 5, 5, 6, 6, 6, 7
Number Frequency
3 2
4 1
5 2
6 3
7 1
Mode = 6
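All three measures can, of course, be obtained directly in software. A short Python check using the standard library's statistics module on the same nine numbers is given below purely as a cross-check of the hand calculations.

import statistics

data = [3, 3, 4, 5, 5, 6, 6, 6, 7]   # the example set used throughout this module

print(statistics.mean(data))    # 5
print(statistics.median(data))  # 5
print(statistics.mode(data))    # 6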
Treating all data in a class as if each observation were equal to the mid-point is of
course an approximation, but it is done to simplify the calculations. However, on
some occasions the data may only be available in this form anyway. For example, in
measuring the lengths of machined car components as part of a quality check, the
observations would probably be recorded in groups such as ‘100.5 to 101.0’ rather
than as individual measurements such as ‘100.634’. The most serious approximation
in Table 5.4 is in taking the mid-point of the 90+ class as 94.5, since this class could
include days when complaints had been much higher, say 150, because of some
special circumstances such as severe disruption on account of a derailment. For
open-ended groups such as this it may be necessary to examine the outliers to test
the validity of the mid-point approximation.
Calculating the mode and median from a frequency table is more straightforward.
The median class is the one in which the middle observation lies. In this case the
175th and 176th observations lie in the 30–39 class (i.e. the median is 34.5). The
mode is the mid-point of the class with the highest frequency. In this case the class
is 20–29 and the mode is therefore 24.5.
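The same mid-point reasoning is easy to mechanise. The Python sketch below uses a hypothetical frequency table laid out in the style described here: the class limits follow the text, but the frequencies are invented for illustration. It reproduces a median class of 30–39 and a mode class of 20–29.

# Hypothetical frequency table: ((lower limit, upper limit), frequency)
classes = [((0, 9), 30), ((10, 19), 50), ((20, 29), 85), ((30, 39), 80),
           ((40, 49), 45), ((50, 59), 30), ((60, 89), 20), ((90, 99), 10)]

n = sum(freq for _, freq in classes)                 # 350 observations

# Grouped mean: treat every observation as the mid-point of its class
mean = sum((lo + hi) / 2 * freq for (lo, hi), freq in classes) / n

# Median: mid-point of the class containing the middle observation
cumulative = 0
for (lo, hi), freq in classes:
    cumulative += freq
    if cumulative >= (n + 1) / 2:
        median = (lo + hi) / 2
        break

# Mode: mid-point of the class with the highest frequency
mode_class, _ = max(classes, key=lambda item: item[1])
mode = (mode_class[0] + mode_class[1]) / 2

print(round(mean, 1), median, mode)                  # 33.6 34.5 24.5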
(Figure 5.1: a symmetrical distribution; number of readings plotted against marks from 1 to 12.)
For the symmetrical distribution (Figure 5.1) all three measures are equal. This is
always approximately the case for symmetrical data. Whichever measure is chosen, a
similar answer results. Consequently, it is best to use the most well-known measure
(i.e. the arithmetic mean) to summarise location for this set of data.
Calculations for Figure 5.1:
Mean = 8
(Figure 5.2: a two-peaked distribution; number of readings plotted against episodes seen, from 0 to 19.)
Where a distribution has more than one peak it is usual to quote more than one
mode, each mode corresponding to one peak. For example, in Figure 5.2, had the
frequencies for 0 and 19 episodes been 5 and 4 respectively, technically there would
have been a single mode at 0; but because the histogram would still have two peaks,
the data should be reported as having two modes.
Calculations for Figure 5.2:
Mean = (0 + 0 + 0 + … + 19 + 19)/20
     = 160/20
     = 8
Median = middle value of set
       = average of 10th and 11th values of 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 4, 17, 18, 18, 19, 19, 19, 19, 19
       = (2 + 2)/2
       = 2
Mode = most frequently occurring values
= 0 and 19
Context: Weeks away from work through sickness in a one-year period for a sample of 20 employees in
a particular company.
Readings: 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 3, 5, 18, 28, 44, 52
Shape: reverse J-shaped (Figure 5.3; number of readings plotted against weeks of sickness).
A quite different value would be obtained if the outlier of 52 weeks had not been present: the mean would then have been reduced from 8.0 to 5.7.
In all situations, the arithmetic mean can be misleading if there are just one or
two extremes in the data. The mode is not misleading, just unhelpful. Most sickness
records have a mode of 0, therefore to quote ‘mode 0’ is not providing any more
information than merely saying that the data concern sickness records.
Calculations for Figure 5.3:
Mean = (0 + 0 + 0 + … + 44 + 52)/20
     = 160/20
     = 8
Median = middle value of set
       = average of 10th and 11th values of 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 3, 5, 18, 28, 44, 52
       = (1 + 1)/2
       = 1
Mode = most frequently occurring value
=0
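The sensitivity of the mean to the single outlier is easy to verify. The Python lines below, a simple check rather than anything from the original text, recompute the measures for the sickness data and then drop the 52-week outlier.

import statistics

weeks = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 3, 5, 18, 28, 44, 52]

print(statistics.mean(weeks))     # 8
print(statistics.median(weeks))   # 1.0
print(statistics.mode(weeks))     # 0

without_outlier = [w for w in weeks if w != 52]
print(round(statistics.mean(without_outlier), 1))   # 5.7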
Mean, median and mode are the major measures of location and are obviously
useful as summaries of data. Equally obviously, they do not capture all aspects of a
set of numbers. Other types of summary measure are necessary. However, before
we leave measures of location, their uses, other than as summarisers, will be
described.
Adding the mean to each set of data, as in the next two sets, allows the shape of
the distribution to become apparent more quickly to the eye as one reads along
the rows.
In the first case, the focus enables one to see that the numbers are scattered
closely and about equally either side of the mean; in the second case, one sees that
most numbers are below the mean with just a few considerably above.
In fact, the two sets are the symmetrical data and the reverse J-shaped data intro-
duced earlier in Figure 5.1 and Figure 5.3. In the latter case the arithmetic mean was
judged not to be the most useful measure to act as a summary; nevertheless it has a
value when used as a focus for the eye. One meets this usage with row and column
averages in tables of numbers.
For Comparisons
Measures of location can be used to compare two (or more) sets of data regardless
of whether the measure is the best summary measure for that set.
Set 1: 5, 7, 8, 9, 9, 10 Mean = 8
Set 2: 5, 5, 5, 6, 6, 7, 8, 10 Mean = 6.5
The two sets of data above contain a different number of readings. The arithme-
tic mean may or may not be the correct summary measure for either set.
Nevertheless, a useful comparison between them can be effected through the mean.
Similarly, the sickness records of a group of people (reverse J shape) over several
years can be compared using the arithmetic mean, even though one would not use
this measure purely to summarise the data.
Salary
(inc. bonuses)
1 Founder/managing director £60 000
4 Skilled workers £14 000
5 Unskilled workers £12 000
Arithmetic mean salary = £17 600
The lesson is: when averaging averages where groups of different size are in-
volved, go back to the basic definition of the average.
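The point is easily demonstrated in a few lines of Python. The payroll figures are those of the example above; the comparison of the 'average of averages' with the correct weighted calculation is added here purely for illustration.

# (number of staff, salary) for each group in the example
groups = [(1, 60000), (4, 14000), (5, 12000)]

total_pay = sum(n * salary for n, salary in groups)
total_staff = sum(n for n, _ in groups)

naive_mean = sum(salary for _, salary in groups) / len(groups)   # averaging the averages
true_mean = total_pay / total_staff                               # back to the basic definition

print(naive_mean)   # about 28666.67, and misleading
print(true_mean)    # 17600.0, matching the figure in the text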
Except where one of the difficulties described above applies, the arithmetic mean
is the first choice of measure of location.
5.4.1 Range
The best-known and certainly the simplest measure of scatter is the range, which is
the total interval covered by the numbers.
Range = Largest reading − Smallest reading
For example, for the nine numbers 3, 3, 4, 5, 5, 6, 6, 6, 7:
Range = 7 − 3
=4
3, 3, 4, 5, 5, 6, 6, 6, 7
Removing the lowest quarter (3, 3) and the highest quarter (6, 7) leaves 4, 5, 5, 6, 6.
Interquartile range = 6 − 4
                    = 2
MAD = ∑|x − x̄| / n
where:
x̄ is the arithmetic mean
n is the number of readings in the set.
(The notation | | (pronounced 'the absolute value of') means the size of the number disregarding its sign.)
For example, calculate the MAD of: 3, 3, 4, 5, 5, 6, 6, 6, 7.
From the previous work: x̄ = 5
x:        3   3   4   5   5   6   6   6   7
x − x̄:   −2  −2  −1   0   0   1   1   1   2
|x − x̄|:  2   2   1   0   0   1   1   1   2
∑|x − x̄| = 2 + 2 + 1 + 0 + 0 + 1 + 1 + 1 + 2
         = 10
MAD = 10/9
    = 1.1
The concept of absolute value used in the MAD is to overcome the fact that
(x − x̄) is sometimes positive, sometimes negative and sometimes zero. The
absolute value gets rid of the sign. Why is this necessary? Try the calculation without
taking absolute values and see what happens.
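The answer, as a couple of lines of Python confirm, is that the signed deviations always cancel out to zero, so without the absolute value the 'average deviation' of any data set would be zero. The snippet is only a check of the point made above.

data = [3, 3, 4, 5, 5, 6, 6, 6, 7]
mean = sum(data) / len(data)                              # 5

signed = sum(x - mean for x in data)                      # signed deviations cancel
absolute = sum(abs(x - mean) for x in data) / len(data)   # the MAD

print(signed)     # 0.0
print(absolute)   # about 1.11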
5.4.4 Variance
An alternative way of eliminating the sign of deviations from the mean is to square
them, since the square of any number is never negative. The variance is the average
squared distance of readings from the arithmetic mean:
Variance = ∑(x − x̄)² / (n − 1)
population. We can see this intuitively because the one-in-a-million extreme outlier
that is present in the population will not usually be present in the sample. Extreme
outliers have a large impact on the variance since it is based on squared deviations.
Consequently, when the variance is calculated from a sample, it tends to underesti-
mate the true population variance.
Dividing by n − 1 instead of n increases the size of the calculated figure, and this
increase offsets the underestimate by just the right amount. ‘Just the right amount’
has the following meaning. Calculating the variance with n − 1 as the denominator
will give, on average, the best estimate of the population variance. That is, if we
were to repeat the calculation for many, many samples (in fact, an infinite number
of samples) and take the average, the result would be equal to the true population
variance. If we used n as the denominator this would not be the case. This can be
verified mathematically but goes beyond what a manager needs to know – consult a
specialist statistical text if you are interested.
Section 9.3 on ‘Degrees of Freedom’ in Module 9 gives an alternative and more
technical explanation.
Unless you are sure you are in the rare situation of dealing with the whole popu-
lation of a variable, you should use the n − 1 version of the formula. Calculators and
popular spreadsheet packages that have a function for calculating the variance
automatically nearly always use n − 1, although there may be exceptions.
Taking the usual example set of numbers, calculate the variance of 3, 3, 4, 5, 5, 6,
6, 6, 7 (Mean = 5).
x          3   3   4   5   5   6   6   6   7
x − x̄     −2  −2  −1   0   0   1   1   1   2
(x − x̄)²   4   4   1   0   0   1   1   1   4

Σ(x − x̄)² = 4 + 4 + 1 + 0 + 0 + 1 + 1 + 1 + 4 = 16

Variance = Σ(x − x̄)²/(n − 1) = 16/8 = 2
The variance has many applications, particularly in financial theory. However, as
a pure description of scatter, it suffers from the disadvantage that it involves
squaring. The variance of the number of weeks of sickness of 20 employees is
measured in square weeks. However, it is customary to quote the variance in
ordinary units (e.g. in the above example the variance is said to be two weeks).
There is a quicker, equivalent way of calculating the variance, working from the totals of x and x²:

x        x²
3         9
3         9
4        16
5        25
5        25
6        36
6        36
6        36
7        49
Total    45      241

Mean = 45/9 = 5
Variance = (Σx² − n·x̄²)/(n − 1)
         = (241 − 9 × 25)/8
         = 2
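The two routes to the variance (the definitional formula and the shortcut using Σx²) can be checked with a brief Python sketch; both use the n − 1 denominator. Python's standard statistics.variance function also divides by n − 1, while statistics.pvariance divides by n.

readings = [3, 3, 4, 5, 5, 6, 6, 6, 7]
n = len(readings)
mean = sum(readings) / n                                              # 5

var_definitional = sum((x - mean) ** 2 for x in readings) / (n - 1)   # 16/8
var_shortcut = (sum(x * x for x in readings) - n * mean ** 2) / (n - 1)

print(var_definitional, var_shortcut)                                 # 2.0 2.0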
For the standard deviation the position is that it is easy to handle mathematically and is used in other statistical theories, but it is too involved for purely descriptive purposes.
All the measures have their particular uses. No single one stands out. When a
measure of scatter is required purely for descriptive purposes, the best measure is
probably the mean absolute deviation, although it is not as well known as it deserves
to be. When a measure of scatter is needed as part of some wider statistical or
mathematical theory, then the variance and standard deviation are frequently
encountered.
Further Example
A company’s 12 salespeople in a particular region last month drove the following
number of kilometres:
Salesperson Kilometres
(hundreds)
1 34
2 47
3 30
4 32
5 38
6 39
7 36
8 43
9 31
10 40
11 42
12 32
Calculate:
(a) range
(b) interquartile range
(c) MAD
(d) variance
(e) standard deviation.
Which measure is the most representative of the scatter in these data?
(a) Range = Highest – Lowest = 47 – 30 = 17
(b) Putting the numbers in ascending order:
30, 31, 32, 32, 34, 36, 38, 39, 40, 42, 43, 47
Removing the lowest quartile (30, 31, 32) and the highest quartile (42, 43, 47) leaves 32, 34, 36, 38, 39, 40.
Interquartile range = 40 – 32 = 8
(c) To calculate MAD, it is first necessary to find the arithmetic mean:
Mean = 444/12 = 37
Next calculate the deviations:
x         34  47  30  32  38  39  36  43  31  40  42  32
x − x̄     −3  10  −7  −5   1   2  −1   6  −6   3   5  −5
|x − x̄|    3  10   7   5   1   2   1   6   6   3   5   5

Σ|x − x̄| = 54
MAD = 54/12 = 4.5
(d) The variance
x          34   47  30  32  38  39  36  43  31  40  42  32
x − x̄      −3   10  −7  −5   1   2  −1   6  −6   3   5  −5
(x − x̄)²    9  100  49  25   1   4   1  36  36   9  25  25

Σ(x − x̄)² = 320
Variance = Σ(x − x̄)²/(n − 1) = 320/11 = 29.1
(e) Standard deviation = √Variance = √29.1 = 5.4
The best descriptive measure of scatter in this situation is the mean absolute
deviation. The average difference between a salesperson’s travel and the average travel
is 4.5 (450 kilometres). This is a sensible measure that involves all data points. The range
is of great interest, but not as a measure of scatter. Its interest lies in indicating the
discrepancy between the most and least travelled. It says nothing about the ten in-
between salespeople. The interquartile range is probably the second choice. The
variance and standard deviation are probably too complex conceptually to be descrip-
tive measures in this situation, where further statistical analysis is not likely.
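All five measures for the salespeople data can be reproduced in a few lines of Python. The quartile trimming follows the rule used above: drop the lowest quarter and the highest quarter of the ordered readings and take the range of what is left.

km = [34, 47, 30, 32, 38, 39, 36, 43, 31, 40, 42, 32]    # hundreds of kilometres
n = len(km)
mean = sum(km) / n                                        # 37

data_range = max(km) - min(km)                            # 47 - 30 = 17
middle_half = sorted(km)[n // 4 : n - n // 4]             # 32, 34, 36, 38, 39, 40
iqr = max(middle_half) - min(middle_half)                 # 40 - 32 = 8
mad = sum(abs(x - mean) for x in km) / n                  # 54/12 = 4.5
variance = sum((x - mean) ** 2 for x in km) / (n - 1)     # 320/11 = 29.09...
std_dev = variance ** 0.5                                 # 5.39...

print(data_range, iqr, mad, round(variance, 1), round(std_dev, 1))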
The coefficient of variation measures scatter relative to the general size of the numbers: it is the standard deviation divided by the arithmetic mean. This is useful when sets of data with very different characteristics are being compared. For example, suppose one is looking at the number of passengers per day
passing through two airports. Over a one-year period the average number of
passengers per day, the standard deviations and the coefficients of variation are
calculated.
Consideration of the standard deviations alone would suggest that there was more
scatter at Airport 2. In relation to the number of passengers using the two airports, the
scatter is smaller at Airport 2 as revealed by the coefficient of variation being 0.14 as
against 0.25 at Airport 1.
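The airport figures themselves are not reproduced here, but the principle is easy to illustrate with hypothetical means and standard deviations chosen to give the two coefficients quoted above (coefficient of variation = standard deviation divided by mean):

airports = {"Airport 1": (20_000, 5_000),     # hypothetical (mean passengers/day, standard deviation)
            "Airport 2": (60_000, 8_400)}     # hypothetical figures

for name, (mean, sd) in airports.items():
    print(name, round(sd / mean, 2))          # Airport 1 -> 0.25, Airport 2 -> 0.14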
5.5.1 Skew
Skew measures the extent to which a distribution is non-symmetrical. Figure 5.4(a)
is a distribution that is left-skewed; Figure 5.4(b) is a symmetrical distribution with
zero skew; Figure 5.4(c) is a distribution that is right-skewed.
Figure 5.4 Skew: (a) left skew; (b) zero skew; (c) right skew
The concept of skew is normally used purely descriptively and is assessed visually
(i.e. one looks at the distribution and assesses whether it is symmetrical or right- or
left-skewed). Skew can be measured quantitatively but the formula is complex and
the accuracy it gives (over and above a verbal description) is rarely necessary in
practice. However, the measurement of skew gives rise to the alternative labels
positively skewed (right-skewed) and negatively skewed (left-skewed).
5.5.2 Kurtosis
Kurtosis measures the extent to which a distribution is ‘pinched in’ or ‘filled out’.
Figure 5.5 shows three distributions displaying increasing levels of kurtosis. As with
skew, a qualitative approach is sufficient for most purposes (i.e. when one looks at
the distribution, one can describe it as having a low, medium or high level of
kurtosis). Kurtosis can also be measured quantitatively, but, again, the formula is
complex.
Outliers (readings far removed from the rest of the data) can distort summary measures. This is particularly true of the variance and standard deviation, which use squared values.
How does one deal with the outliers? Are they to be included or excluded? Three
basic situations arise.
(a) Twyman’s Law. This only half-serious law states that any piece of data that
looks interesting or unusual is wrong. The first consideration when confronted
by an outlier is whether the number is incorrect, perhaps because of an error in
collection or a typing mistake. There is an outlier in these data, which are the
week’s overtime payments to a group of seven workers (in £s):
13.36, 17.20, 16.78, 15.98, 1432, 19.12, 15.37
Twyman’s Law suggests that the outlier showing a payment of 1432 occurs be-
cause of a dropped decimal point rather than a fraudulent claim. The error
should be corrected and the number retained in the set.
(b) Part of the pattern. An outlier may be a definite and regular part of the pattern
and should be neither changed nor excluded. Such was the case with the sickness
record data of Figure 5.3. The outliers were part of the pattern and similar ef-
fects were likely to be seen in other time periods and with other groups of
employees.
(c) Isolated events. Outliers occur that are not errors but that are unlikely to be
repeated (i.e. they are not part of the pattern). Usually they are excluded from
calculations of summary measures, but their exclusion is noted. For example, the
following data, recorded by trip wire, show the number of vehicles travelling
down a major London road during a ten-day period:
5271, 5960, 6322, 6011, 7132, 5907, 6829, 741, 7098, 6733
The outlier is the 741. Further checking shows that this day was the occasion of
a major royal event and that the road in question was closed to all except state
coaches, police vehicles, etc. This is an isolated event, perhaps not to be repeated
for several years. For traffic control purposes, the number should be excluded
from calculations since it is misleading, but a note should be made of the exclu-
sion. Hence, one would report:
Mean vehicles/day = 57 263/9
                  = 6363 (excluding day of Royal Event)
The procedure for outliers is first to look for mistakes and correct them; and
second, to decide whether the outlier is part of the pattern and should be includ-
ed in calculations or an isolated event that should be excluded.
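For the traffic data the effect of excluding the isolated event is easy to verify with a short sketch:

vehicles = [5271, 5960, 6322, 6011, 7132, 5907, 6829, 741, 7098, 6733]

mean_all = sum(vehicles) / len(vehicles)            # about 5800, dragged down by the outlier
normal_days = [v for v in vehicles if v != 741]     # exclude the day of the royal event
mean_normal = sum(normal_days) / len(normal_days)   # 57 263 / 9 = 6363

print(round(mean_all), round(mean_normal))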
5.7 Indices
An index is a particular type of measure used for summarising the movement of a
variable over time. When a series of numbers is converted into indices, it makes the
numbers easier to understand and to compare with other series.
The best-known type of index is probably the cost of living index. The cost of
living comprises the cost of many different goods: foods, fuel, transport, etc.
Instead of using the miscellaneous and confusing prices of all these purchases, we
use an index number, which summarises them for us. If the index in 2014 is 182.1
compared with 165.3 in 2013, we can calculate that the cost of living has risen by:
(182.1 − 165.3)/165.3 × 100 = 10.2%
This is rather easier than having to cope with the range of individual price rises
involved.
Every index has a base year when the index was 100 (i.e. the starting point for
the series). If the base year for the cost of living index was 2008, then the cost of
living has risen 82.1 per cent between 2008 and 2014. This could be said in a
different way: the 2014 cost of living is 182.1 per cent of its 2008 value.
The index very quickly gives a feeling for what has happened to the cost of living.
Comparisons are also easier. For example, if the Wages and Salaries Index has 2008
as its base year and stood at 193.4 in 2014, then, over the six years, wages out-
stripped the cost of living: 93.4 per cent as against 82.1 per cent.
The cost of living index is based on a complicated calculation. However, there
are some more basic indices.
For example, the following price series is converted into an index with 2010 as the base year:

Year            2005  2006  2007  2008  2009  2010  2011  2012  2013  2014
Price (£000s)    6.1   8.2   8.6  10.1  11.8  12.4  16.9  19.0  19.7  19.4
Index             49    66    69    81    95   100   136   153   159   156
The index for 2010, being the base year, is 100. The other data in the series are
scaled up accordingly. For instance, the index for 2007 is:
8.6 × 100/12.4 = 69
where 8.6 is the original datum and 12.4 is the original datum for the base year. And
for 2013:
19.7 × 100/12.4 = 159
The choice of the base year is important. It should be such that individual index
numbers during the time span being studied are never too far away from 100. As a
rule of thumb, the index numbers are not usually allowed to differ from 100 by
more than a factor of 3 (i.e. the numbers are in the range 30 to 300). If the base
year for the numbers in the series above had been chosen as 2005, then the index
series would have been from 100 to 318.
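The index construction for the price series above, and the effect of re-basing it on 2005, can be sketched as follows:

prices = [6.1, 8.2, 8.6, 10.1, 11.8, 12.4, 16.9, 19.0, 19.7, 19.4]   # 2005 to 2014

def index_series(series, base_value):
    # Scale every value so that the base value becomes 100.
    return [round(value * 100 / base_value) for value in series]

for year, idx in zip(range(2005, 2015), index_series(prices, 12.4)):
    print(year, idx)                          # 2010 -> 100, 2014 -> 156, as in the table

print(index_series(prices, 6.1)[-1])          # with 2005 as the base, the 2014 index is 318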
In long series, there might be more than one base year. For example, a series
covering more than 30 years from 1982 to 2014 might have 1982 as a base year with
the series then rising to 291 in 2001, which could then be taken as a second base
year:

1982 = 100 (first base) … 2001 = 291 on the first base, re-set as a new base of 100 … 2014 = 213 on the new base
Care obviously needs to be taken in interpreting this series. The increase from
1982 to 2014 is not 113 per cent. If the original series had been allowed to continue,
the 2014 index would have been 213 × 2.91 = 620. The increase is thus 520 per
cent.
If January is taken as the base, then the indices for the months are:
One possible disadvantage of this index is that livestock with a low price will
have much less influence than livestock with a high price. For instance, in February
a 20 per cent change in the price of cattle would change the index much more than
a 20 per cent change in the pig price:
However, this may be a desirable feature. If the price level of each factor in the
index reflects its importance, then the higher-priced elements should have more
effect on the index. On the other hand, this feature may not be desirable. One way
to counter this is to construct a price relative index. This means that the individual
prices are first converted into an index and then these individual indices are aver-
aged to give the overall index.
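With two made-up livestock prices (the text's own table is not reproduced here), the difference between a simple aggregate index and a price relative index looks like this:

jan = {"cattle": 400.0, "pigs": 80.0}    # hypothetical January prices
feb = {"cattle": 420.0, "pigs": 96.0}    # hypothetical February prices

# Simple aggregate index: index the total of the prices.
aggregate = sum(feb.values()) / sum(jan.values()) * 100        # 516/480 x 100 = 107.5

# Price relative index: index each price separately, then average the indices.
relatives = [feb[item] / jan[item] * 100 for item in jan]      # 105.0 for cattle, 120.0 for pigs
price_relative = sum(relatives) / len(relatives)               # 112.5

print(round(aggregate, 1), round(price_relative, 1))

The aggregate index is dominated by the expensive cattle, whereas the price relative index treats the 5 per cent and 20 per cent rises equally.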
Instead of adding them up, we first weight the prices by the quantities, and the
final index is formed from the resulting monthly total. The quantities used for the
weighting should be the same for each month, since this is a price index. Otherwise
the index would measure price and volume changes. If the quantities used for
weighting are the base month (January) quantities, then the index is known as a
Laspeyres Index. For any month it is calculated as:

Laspeyres index for the month = [Σ(that month’s price × January quantity) / Σ(January price × January quantity)] × 100
A Laspeyres Index, like other indices, can be used for quantities as well as prices.
For a quantity index the role of price and quantity in the above example (of a price
index) would be reversed, with prices providing the weightings to measure changes
in quantities. The weights (prices) remain at the constant level of some base period
while the quantities change from time period to time period. For example, the UK
Index of Manufacturing Production shows how the level of production in the
country is changing as time goes by. The quantities refer to different types of
product – consumer goods, industrial equipment, etc. – and the prices are those of
the products in a base year. The use of prices as weights for quantities gives the
most expensive products a heavier weighting.
A major criticism of the Laspeyres Index is that the weights in the base year may
soon become out of date and no longer representative. An alternative is the
Paasche Index, which takes the weights from the most recent time period – the
weightings therefore change from each time period to the next. In the livestock
example a Paasche Index would weight the prices in each month with the quantities
relating to December, the most recent month. A Paasche Index always uses the
most up-to-date weightings, but it has the serious practical disadvantage that, if it is
to be purely a price index, every time new data arrive (and the weightings change)
the entire past series must also be revised.
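A sketch with made-up prices and quantities for two items (January as the base period) shows how the two weighting schemes differ:

p_jan = {"cattle": 400.0, "pigs": 80.0}    # hypothetical base-period prices
q_jan = {"cattle": 100, "pigs": 500}       # hypothetical base-period quantities
p_dec = {"cattle": 440.0, "pigs": 100.0}   # hypothetical current prices
q_dec = {"cattle": 90, "pigs": 650}        # hypothetical current quantities

def weighted_total(prices, quantities):
    return sum(prices[item] * quantities[item] for item in prices)

# Laspeyres: prices weighted by base-period (January) quantities.
laspeyres = weighted_total(p_dec, q_jan) / weighted_total(p_jan, q_jan) * 100

# Paasche: prices weighted by current-period (December) quantities.
paasche = weighted_total(p_dec, q_dec) / weighted_total(p_jan, q_dec) * 100

print(round(laspeyres, 1), round(paasche, 1))    # 117.5 and 118.9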
A fixed weight index may also be used. Its weights are from neither the base
period nor the most recent period. They are from some intermediate period or from
the average of several periods. It is a matter of judgement to decide which weights
to use.
The cost of living index has already been introduced. It indicates how the cost of
a typical consumer’s lifestyle changes as time goes by, and it has many practical uses.
For example, it is usually the starting point in wage negotiations, since it shows how
big a wage increase is needed if an employee’s current standard of living is to be
maintained. A wage increase lower than the cost of living index would imply a
decrease in real wages.
What type of index should be used for the cost of living index? Table 5.8 and
Table 5.9 show the simplified example from Economics Module 12 (Tables 12.1 and
12.2).
The weights are multiplied by the prices to give the index, as in the livestock
example. The weights are changed regularly as a result of government surveys of
expenditure patterns. It is important to note that, for the cost of living index,
previous values of the index are not changed as the weights change (i.e. the index
remains as it was when first calculated). Consequently, in comparing the cost of
living now with that of 20 years ago, a cost of living index reflects changes in
purchasing behaviour as well as the inevitably increasing prices.
A price index such as the cost of living index can be used to deflate economic
data. For example, the measure of a country’s economic activity is its GNP (gross
national product), the total value of the goods and services an economy produces in
a year. The GNP is measured from statistical returns made to the government and is
calculated in current prices (i.e. it is measured in terms of the prices for the goods
and services that apply in that year). Consequently, the GNP can rise from one year
to another because prices have risen through inflation, even though actual economic
activity has decreased. It would be helpful to neutralise the effect of prices in order
to have a more realistic measure of economic activity. This is done by deflating the
series so that the GNP is in real, not current, terms. The deflation is carried out by
using a price index. Table 5.11 shows the GNP of a fictional country for 2008–14,
together with a price index for those years. Current GNP is converted to real GNP
by dividing by the price index.
The changes in GNP(real) over the years 2008–14 show that economic activity
did increase in each of those years but not by as much as GNP(current) suggested.
It is important that an appropriate price index is used. It would not have been
appropriate to use the cost of living index for GNP since that index deals only with
consumer expenditure. A price index that incorporates the prices used in the GNP –
that is, investment and government goods as well as consumer goods – should be
used.
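The mechanics of deflation are straightforward. With fictional figures (Table 5.11 is not reproduced here), real GNP is current GNP divided by the price index and multiplied by 100:

gnp_current = [950.0, 1010.0, 1080.0]     # fictional GNP in current prices
price_index = [100.0, 104.0, 109.0]       # fictional price index, base year = 100

gnp_real = [round(g / p * 100, 1) for g, p in zip(gnp_current, price_index)]
print(gnp_real)                           # [950.0, 971.2, 990.8] - smaller growth in real terms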
Learning Summary
In the process of analysing data, at some stage the analyst tries to form a model of
the data, as suggested previously. ‘Pattern’ or ‘summary’ are close synonyms for
‘model’. The model may be simple (all rows are approximately equal) or complex
(the data are related via a multiple regression model). Often specifying the model
requires intuition and imagination. At the very least, summary measures can provide
a model based on specifying for the data set:
(a) number of readings;
(b) a measure of location;
(c) a measure of scatter;
(d) the shape of the distribution.
In the absence of other inspiration, these four attributes provide a useful model
of a set of numbers. If the data consist of two or more distinct sets (as, for example,
a table), then this basic model can be applied to each. This will give a means of
comparison between the rows or columns of the table or between one time period
and another.
The first attribute (number of readings) is easily supplied. Measures of location
and scatter have already been discussed. The shape of the distribution can be found
by drawing a histogram and literally describing its shape (as with the symmetrical, U
and reverse-J distributions seen earlier). A short verbal statement about the shape is
often an important factor in summarising or forming a model of a set of data.
Verbal statements have a more general role in summarising data. They should be
short, no more than one sentence, and used only when they can add to the sum-
mary. They are used in two ways: first, they are used when the quantitative measures
are inadequate; second, they are used to point out important features in the data.
For example, a table of a company’s profits over several years might indicate that
profits had doubled. Or a table of the last two months’ car production figures might
have a note stating that 1500 cars were lost because of a strike.
It is important, in using verbal summaries, to distinguish between helpful state-
ments pointing out major features and unhelpful statements dealing with trivial
exceptions and details. A verbal summary should always contribute to the objective
of adding to the ease and speed with which the data can be handled.
Review Questions
5.1 Which of the following statements about summary measures are true?
A. They give greater accuracy than the original data.
B. It is easier to handle information in summary form.
C. They are never misleading.
D. Measures of location and scatter together capture all the main features of data.
Questions 5.2 to 5.4 refer to the following data:
1, 5, 4, 2, 7, 1, 0, 8, 6, 6, 5, 2, 4, 5, 3, 5
5.5 Which of the following applies? As a measure of location, the arithmetic mean:
A. is always better than the median and mode.
B. is usually a misleading measure.
C. is preferable to mode and median except when all three are approximately
equal.
D. should be used when the data distribution is U shaped.
E. None of the statements applies.
5.6 An aircraft’s route requires it to fly along the sides of a 200 km square (see figure
below). Because of prevailing conditions, the aircraft flies from A to B at 200 km/h, from
B to C at 300 km/h, from C to D at 400 km/h and from D to A at 600 km/h. What is
the average speed for the entire journey from A to A?
[Figure: a square flight path with corners A (top left), B (top right), C (bottom right) and D (bottom left); each side is 200 km.]
A. 325 km/h
B. 320 km/h
C. 375 km/h
D. 350 km/h
5.7 Which of the following statements about measures of scatter are true?
A. Measures of scatter must always be used when measures of location are used.
B. A measure of scatter is an alternative to a measure of location as a summary.
C. One would expect a measure of scatter to be low when readings are close
together, and high when they are further apart.
D. A measure of scatter should be used in conjunction with a measure of disper-
sion.
5.14 Which of the following statements are true regarding the presence of an outlier in a
data set of which summary measures are to be calculated?
A. An outlier should either be retained in or excluded from the set.
B. An outlier that is part of the pattern of the data should always be used in
calculating the arithmetic mean.
C. An outlier that is not part of the pattern should usually be excluded from any
calculations.
Which of the following is correct? Between 2013 and 2014, the growth of the cost of
living compared to wages and salaries was:
A. much greater.
B. slightly greater.
C. equal.
D. slightly less.
E. much less.
Mr Smith’s boss felt that these expenses were excessive because, he said, the average
expense per day was £35. Other salespeople away on weekly trips submitted expenses
that averaged at around £20 a day. Where did the £35 come from? How can Smith
argue his case?
                                        Department
                                  A        B       C      D
Mean monthly employment level   10 560    4891    220    428
Standard deviation                 606     302     18     32
Is the monthly employment level more stable in some departments than in others?
The mean distance from work was found to be 10.5 miles. Calculate the mode, the
median and two measures of scatter. How would you summarise these data succinctly?
                   Prices                   Quantities
              2012   2013   2014        2012   2013   2014
Car petrol    26.2   27.1   27.4         746    768    811
Kerosene      24.8   28.9   26.5          92     90    101
Paraffin      23.0   24.1   24.8         314    325    348
a. Use 2012 = 100 to construct a simple aggregate index for the years 2012 to 2014
for the prices of the three petroleum products.
b. Use 2012 quantities as weights and 2012 = 100 to construct a weighted aggregate
index for the years 2012 to 2014 for the prices of the three petroleum products.
c. Does it matter which index is used? If so, which one should be used?
d. How else could our index be constructed?
Sampling Methods
Contents
6.1 Introduction.............................................................................................6/1
6.2 Applications of Sampling .......................................................................6/3
6.3 The Ideas behind Sampling ....................................................................6/3
6.4 Random Sampling Methods ...................................................................6/4
6.5 Judgement Sampling ........................................................................... 6/10
6.6 The Accuracy of Samples ................................................................... 6/12
6.7 Typical Difficulties in Sampling .......................................................... 6/13
6.8 What Sample Size? .............................................................................. 6/15
Learning Summary ......................................................................................... 6/16
Review Questions ........................................................................................... 6/18
Case Study 6.1: Business School Alumni ..................................................... 6/20
Case Study 6.2: Clearing Bank ...................................................................... 6/20
Learning Objectives
By the end of this module the reader should know the main principles underlying
sampling methods. Most managers have to deal with sampling in some way. It may
be directly in commissioning a sampling survey, or it may be indirectly in making
use of information based on sampling. For both purposes it is necessary to know
something of the techniques and, more importantly, the factors critical to their
success.
6.1 Introduction
Statistical information in management is usually obtained from samples. The
complete set of all conceivable observations of a variable is a population; a subset
of a population is a sample. It is rarely possible to study a population. Nor is it
desirable, since sample information is much less expensive yet proves sufficient to
take decisions, solve problems and answer questions in most situations. For
example, to know what users of soap powder in the UK think about one brand of
soap powder, it is hardly possible to ask each one of the 15 million of them his or
her opinion, nor would the expense of such an exercise be warranted. A sample
would be taken. A few hundred would be interviewed and from their answers an
estimate of what the full 15 million are thinking would be made. The 15 million are
the population, the few hundred are the sample. A population does not have to refer
to people. In sampling agricultural crops for disease, the population might be 50 000
hectares of wheat.
In practice, information is nearly always collected from samples as opposed to
populations, for a wide variety of reasons.
(a) Economic advantages. Collecting information is expensive. Preparing ques-
tionnaires, paying postage, travelling to interviews and analysing the data are just
a few examples of the costs. Taking a sample is cheaper than observing the
whole population.
(b) Timeliness. Collecting information from a whole population can be slow,
especially waiting for the last few questionnaires to be filled in or for appoint-
ments with the last few interviewees to be granted. Sample information can be
obtained more quickly, and sometimes this is vital, for example, in electoral opin-
ion polls when voters’ intentions may swing significantly as voting day
approaches.
(c) Size and accessibility. Some populations are so large that information could
not be collected from the whole population. For example, a marketing study
might be directed at all teenagers in a country and it would be impossible to
approach them all. Even in smaller populations there may be parts that cannot
be reached. For example, surveys of small businesses are complicated by there
being no up-to-date lists because small businesses are coming into and going out
of existence all the time.
(d) Observation and destruction. Recording data can destroy the item being
observed. For example, in a quality test on electrical fuses, the test ruins the fuse.
It would not make much sense to destroy all fuses immediately after production
just to see whether they worked. Sampling is the only possible approach to some
situations.
There are two distinct areas of theory associated with sampling, illustrated in
Figure 6.1. In the example of market research into soap powder above, the first area
of theory would show how to choose the few hundred interviewees; the second
would show how to use the sample information to draw conclusions about the
population.
Figure 6.1  The two areas of sampling theory: how to choose a sample, and how to make inferences about the population from the sample
Only the first area, methods of choosing samples, is the topic here. The methods
will be described, some applications will be illustrated and technical aspects will be
introduced. The second area, making inferences, is the subject of Module 8.
Simple random sampling is equivalent to this procedure. The selection of the sample is left
to chance and one would suppose that, on average (but not always), the sample
would be reasonably representative.
The major disadvantage of simple random sampling is that it can be expensive. If
the question to be answered is the likely result of a UK general election, then the
sampler must find some means of listing the whole electorate, choose a sample at
random and then visit the voters chosen (perhaps one in Inverness, one in Pen-
zance, two in Norwich, one in Blackpool, etc.). Both selection and questioning
would be costly.
Variations on simple random sampling can be used to overcome this problem.
For example, multi-stage sampling in the opinion poll example above would
permit the sample to be collected in just a few areas of the country, cutting down on
the travelling and interviewing expenses.
The variations on simple random sampling also make it possible to use other
information to make the sample more representative. In the soap powder example a
stratified sample would let the sampler make use of the fact that known percent-
ages of households use automatic machines, semi-automatic machines, manual
methods and launderettes.
Some situations do not allow or do not need any randomisation in sampling.
(How would a hospital patient feel about a ‘random’ sample of blood being taken
from his body?) Judgement sampling refers to all methods that are not essentially
random in character and in which personal judgement plays a large role. They can
be representative in many circumstances.
Figure 6.2 shows diagrammatically the main sampling methods and how they are
linked. In the next sections these methods will be described in more detail.
To choose the sample, the random numbers are taken one at a time from Ta-
ble 6.1 and each associated with the corresponding number and name from
Table 6.2. The first number is 31, so the corresponding employee is Lester, E. The
second number is 03, so the employee is Binks, J. This is continued until the
necessary sample is collected. The third number is 62, and thus the name is Sutcliffe,
H. The fourth number is 98, which does not correspond to any name and is
ignored. Table 6.3 shows the sample of five.
The drawbacks of simple random sampling are that the listing of the population
can prove very expensive, or even impossible, and that the collection of data from
the sample can also be expensive, as with opinion polls. Variations on simple
random sampling have been developed to try to overcome these problems. These
methods still have a sizeable element of random selection in them and the random
part uses a procedure such as that described above.
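In practice the printed random-number table can be replaced by a computer's random number generator. A minimal sketch, using a hypothetical list of employees in place of Table 6.2:

import random

employees = ["Employee %02d" % i for i in range(1, 91)]   # hypothetical numbered list of 90 names

sample = random.sample(employees, 5)    # every employee has an equal chance of selection
print(sample)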
Multi-Stage
In multi-stage sampling, the population is split into groups and each group is split
into subgroups, each subgroup into subsubgroups, etc. A random sample is taken at
each stage of the breakdown. First a simple random sample of the groups is taken;
of the groups chosen a simple random sample of their subgroups is taken and so on.
For example, suppose the sample required is of 2000 members of the workforce of
a large company, which has 250 000 employees situated in offices and factories over
a large geographical area. The objective is to investigate absenteeism. The company
has 15 regional divisions; each division has an average of ten locations at which an
office or factory is situated. Each location has its own computerised payroll. Since
the company operates on a decentralised basis, no company-wide list of employees
exists. Figure 6.3 illustrates how the population is split.
Figure 6.3  Splitting the population: the company, its 15 geographical divisions and the locations within each division
Cluster Sampling
Cluster sampling is closely linked with multi-stage sampling in that the population
is divided into groups, the groups into subgroups and so on. The groups are
sampled, the subgroups sampled, etc. The difference is that, at the final stage, each
individual of the chosen groups is included in the final sample. Pictorially, the final
sample would look like a series of clusters drawn from the population.
For example, suppose the company described in the previous section on multi-
stage sampling is slightly different. The locations are not offices and factories but
small retail establishments at which an average of six staff are employed. The four
divisions and two locations per division could be sampled as before, but, because
there are only six staff, further samples at each location would not be taken. All six
would be included in the sample, carrying the further advantage of ensuring that all
grades of employee are in the sample. In this case the total sample size is 4 × 2 × 6
= 48. This is two-stage cluster sampling, because two stages of sampling, of
divisions and locations, are involved. If all locations, ignoring divisions, had been
listed and sampled, the process would involve only one sampling stage and would be
called ordinary cluster sampling. If a total sample larger than 48 were required
then more divisions or more than two locations per division would have to be
selected.
Stratified Sampling
In stratified sampling prior knowledge of a population is used to make the sample
more representative. If the population can be divided into subpopulations (or strata)
of known size and distinct characteristics then a simple random sample is taken of
each subpopulation such that there is the same proportion of subpopulation
members in the sample as in the whole population. If, for example, 20 per cent of a
population forms a particular stratum, then 20 per cent of the sample will be of that
stratum. The sample is therefore constrained to be representative at least as regards
the occurrence of the different strata. Note the different role played by the strata
compared to the role of the groups in multi-stage and cluster sampling. In the
former case, all strata are represented in the final sample; in the latter, only a few of
the groups are represented in the final sample.
For an example of stratified sampling, let’s return to the previous situation of
taking a sample of 2000 from the workforce of a large company. Suppose the
workforce comprises 9 per cent management staff, 34 per cent clerical, 21 per cent
skilled manual and 36 per cent unskilled manual. It is desirable that these should all
be represented in the final sample and in these proportions. Stratified sampling
involves first taking a random sample of 180 management staff (9 per cent of 2000),
then 680 clerical, then 420 skilled and finally 720 unskilled. This does not preclude
the use of multi-stage or cluster sampling. In multi-stage sampling the divisions and
locations would be selected first, then samples taken from the subpopulations at
each location (i.e. take 22 management staff – 9 per cent of 250 – at location one, 22
at location two and so on for the eight locations).
The final sample is then structured in the same way as the population as far as
staff grades are concerned. Note that stratification is only worth doing if the strata
are likely to differ in regard to the measurements being made in the sample. Other-
wise the sample is not being made more representative. For instance, absenteeism
results are likely to be different among managers, clerks, skilled workers and
unskilled workers. If the population had been stratified according to, say, colour of
eyes, it is unlikely that absenteeism would differ from stratum to stratum, and
stratified sampling would be inapplicable.
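Allocating the sample of 2000 across the four staff grades in the stated proportions is a one-line calculation:

total_sample = 2000
strata = {"management": 0.09, "clerical": 0.34, "skilled": 0.21, "unskilled": 0.36}

allocation = {grade: round(total_sample * share) for grade, share in strata.items()}
print(allocation)    # management 180, clerical 680, skilled 420, unskilled 720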
Weighting
Weighting is a method of recognising the existence of strata in the population after
the sampling has been carried out instead of before, as with stratification. Weighting
means that the measurements made on individual elements of the sample are
weighted so that the net effect is as if the proportions of each stratum in the sample
had been the same as those in the population. In the absenteeism investigation,
suppose that at each location the computerised payroll records did not indicate to
which staff grade those selected for the sample belonged. Only when the personnel
records were examined could this be known. Stratification before sampling is
therefore impossible. Or, at least, it would be extremely expensive to amend the
computer records. The sample of 2000 must be collected first. Suppose the strata
proportions are as in Table 6.4.
The table shows the weighting that should be given to the elements of the sam-
ple. The weighting allocated to each stratum means that the influence each stratum
has on the results is the same as its proportion in the population. If the measure-
ments being made are of days absent for each member of the sample, then these
measurements are multiplied by the appropriate weighting before calculating average
days absent for the whole sample.
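A sketch of the idea, with hypothetical sample proportions and stratum averages (Table 6.4 is not reproduced here). Each stratum's weight is its population proportion divided by its sample proportion, so that the weighted result is the same as if the sample had matched the population:

population_share = {"management": 0.09, "clerical": 0.34, "skilled": 0.21, "unskilled": 0.36}
sample_share     = {"management": 0.15, "clerical": 0.30, "skilled": 0.20, "unskilled": 0.35}   # hypothetical
days_absent_mean = {"management": 2.0,  "clerical": 4.0,  "skilled": 5.0,  "unskilled": 7.0}    # hypothetical

weights = {grade: population_share[grade] / sample_share[grade] for grade in population_share}

# Weighted overall average: equivalent to combining the stratum averages
# in the population proportions.
weighted_average = sum(days_absent_mean[g] * sample_share[g] * weights[g] for g in population_share)
print(round(weighted_average, 2))    # 5.11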
Probability Sampling
In simple random sampling each element of the population has an equal chance of
being selected for the sample. There are circumstances when it is desirable for
elements to have differing chances of being selected. Such sampling is called
probability sampling.
For example, when a survey is carried out into the quality and nutritional value of
school meals, a random sample of schools will have to be taken and their menus
inspected. If every school has an equal chance of being chosen, children at large
schools will be under-represented in the sample. The probability of choosing menus
from small schools is greater when a sample of schools is taken than when a sample
of schoolchildren is taken. This may be important if there is a variation in meals
between large and small establishments.
The issue hinges on what is being sampled. If it is menus, then school size is
irrelevant; if it is children subjected to different types of menu, then school size is
relevant. Probability sampling would give schools different chances of being chosen,
proportional to the size of the school (measured in terms of the number of children
attending). In the final sample, children from every size of school would have an
equal chance that their menu was selected.
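Selection with probability proportional to size can be sketched as follows (the school sizes are hypothetical):

import random

schools = {"School A": 1200, "School B": 300, "School C": 650, "School D": 90}   # hypothetical sizes

# Larger schools get a proportionally larger chance of being chosen.
chosen = random.choices(list(schools), weights=list(schools.values()), k=1)[0]
print(chosen)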
Variable Sampling
In variable sampling some special subpopulation is over-sampled (i.e. deliberately
over-represented). This is done when the subpopulation is of great importance and
a normal representation in the sample would be too small for accurate information
to be gleaned.
Suppose the project is to investigate the general health levels of children who
have suffered from measles. The population is all children, of specified ages, who
have had measles. A small subpopulation (perhaps 1 per cent) is formed of children
who have suffered permanent brain damage as part of their illness. Even a large
sample of 500 would contain only five such children. Just five children would be
insufficient to assess the very serious and greatly variable effects of brain damage.
Yet this is a most important part of the survey. More of such children would
purposely be included in the sample than was warranted by the subpopulation size.
If calculations of, for instance, IQ measurements were to be made from the sample,
weighting would be used to restore the sample to representativeness.
Area Sampling
Sometimes very little may be known about the population. Simple random sampling
may be difficult and expensive, but, at the same time, lack of knowledge may
prevent a variation on simple random sampling being employed. Area sampling is
an artificial breaking down of the population to make sampling easier. The popula-
tion is split into geographical areas. A sample of the areas is taken and then further
sampling is done in the few areas selected.
The survey of users’ attitudes to a particular soap powder is such a case. A listing
of the whole population of users is impossible, yet not enough is known to be able
to use multi-stage or cluster sampling. The country/district where the sampling is to
be conducted is split into geographical areas. A sample of the areas is selected at
random and then a further sample is taken in the areas chosen, perhaps by listing
households or using the electoral roll. The difficult operation of listing the popula-
tion is reduced to just a few areas.
first name on the list, the starting point would be selected at random from the
first 50.
To return to the workforce absenteeism survey, the computerised payroll could
be sampled systematically. If the payroll had 1750 names and a sample of 250
was required, then every seventh name could be taken. Provided that the list is in
random order, the resulting sample is, in effect, random, but all the trouble of
numbering the list and using random numbers has been avoided. Systematic
sampling would usually provide a random sample if the payroll were in alphabet-
ical order.
There are, however, dangers in systematic sampling. If the payroll were listed in
workgroups, each group having six workers and one foreman, then taking one
name in seven might result in a sample consisting largely of foremen or consist-
ing of no foremen at all.
A systematic sample can therefore result in a sample that is, in effect, a random
sample, but with time and effort saved. At the other extreme, if care is not
taken, the sample can be hopelessly biased. (A short sketch of this payroll example is given after the quota sampling discussion below.)
(b) Convenience sampling. This means that the sample is selected in the easiest
way possible. This might be because a representative sample will result or it
might be that any other form of sampling is impossible.
In medicine, a sample of blood is taken from the arm. This is convenience sam-
pling. Because of knowledge of blood circulation, it is known that the sample is
as good as random.
As a further example, consider the case of a researcher into the psychological
effects on a family when a member of the family suffers from a major illness.
The researcher will probably conduct the research only on those families that are
in such a position at the time of the research and who are willing to participate.
The sample is obviously not random, but the researcher has been forced into
convenience sampling because there is no other choice. He or she must analyse
and interpret the findings in the knowledge that the sample is likely to be biased.
Mistaken conclusions are often drawn from such research by assuming the sam-
ple is random. For instance, it is likely that families agreeing to be interviewed
have a different attitude to the problem of major illness from families who are
unwilling.
(c) Quota sampling. Quota sampling is used to overcome interviewer bias. It is
frequently used in market research street interviews. If the interviewer is asked to
select people at random, it is difficult not to show bias. A young male interviewer
may, for instance, select a disproportionate number of attractive young females
for questioning.
Quota sampling gives the interviewer a list of types of people to interview. The
list may be of the form:
10 males age 35–50
10 females age 35–50
10 males age 50+
10 females age 50+
Total sample = 40
Discretion is left to the interviewer to choose the people in each category.
Note that quota sampling differs from stratified sampling. A stratum forms a
known proportion of the population; quota proportions are not known. Any
conclusions must be interpreted in the context of an artificially structured, non-
random sample.
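Returning to the payroll example mentioned under systematic sampling, a minimal sketch (the payroll itself is hypothetical) of taking every seventh name after a random start:

import random

payroll = ["Name %d" % i for i in range(1, 1751)]   # hypothetical payroll of 1750 names

start = random.randrange(7)          # random starting point among the first seven names
systematic_sample = payroll[start::7]
print(len(systematic_sample))        # 250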
If samples of size 2 are taken and the average salary of each sample is computed,
then using simple random sampling there will be six possible samples:
Under stratified sampling, where each sample of two comprises one manager
from each stratum, there are four possible samples:
If a sample of size 2 were being used to estimate the population average salary
(23), simple random sampling could give an answer in the range 18 → 28, but
stratified sampling is more accurate and could only give a possible answer in the
range 20 → 26. On a large scale (in terms of population and sample size), this is the
way in which stratified sampling improves the accuracy of estimates.
In summary, simple random sampling allows (in a way as yet unspecified) the
accuracy of a sample estimate to be calculated. Stratified sampling, besides its other
advantages, improves the accuracy of sample estimates.
6.7.2 Non-response
Envisage a sample of households with which an interviewer has to make contact to
ask questions concerning the consumption of breakfast cereals. If there is no one at
home when a particular house is visited, then there is non-response. If the house is
revisited (perhaps several visits are necessary) then the sampling becomes very
expensive; if the house is not revisited then it will be omitted from the sample and
the sample is likely to be biased. The bias occurs because the households where no
one is in may not be typical of the population. For instance, if the interviewer made
his calls during the daytime then households where both marriage partners are at
work during the day would tend to be omitted, possibly creating bias in the results.
The information from the sample could not properly be applied to the whole
population but only parts of it.
Non-response does not occur solely when house visits are required, but whenev-
er the required measurements cannot be made for some elements of the sample. In
the absenteeism example, missing personnel records for some of the staff selected
for the sample would constitute a non-response problem. If efforts are made to find
or reconstruct the missing records it will be expensive; if they are left out of the
sample it will be biased since, for instance, the missing records may belong mainly to
new employees.
6.7.3 Bias
Bias is a systematic (i.e. consistent) tendency for a sampling method to produce
samples that over- or under-represent elements of the population. It can come from
many sources, and sampling frame error and non-response, as described above, are
two of them. Other sources include:
(a) Inaccurate measurement. This might be physical inaccuracy, such as a
thermometer that is calibrated 1 degree too high; it might be conceptual, such as
a salary survey that excludes perks and commission bonuses.
(b) Interviewer bias. This is when the interviewer induces biased answers. For
example, an interviewer may pose a question aggressively: ‘You don’t believe the
new job conditions are advantageous, do you?’, signalling that a particular answer
is required.
(c) Interviewee bias. Here the interviewee injects the bias. The interviewee may be
trying to impress with the extent of his knowledge and falsely claim to know
more than he does. Famously, a survey of young children’s video viewing habits
seemed to indicate that most were familiar with ‘adult’ videos until it was realised
that the children interviewed were anxious to appear more grown-up than they
were and made exaggerated claims. People of all ages tend to give biased answers
in areas relating to age, salary and sexual practice.
(d) Instrument bias. The instrument refers to the means of collecting data, such as
a questionnaire. Poorly constructed instruments can lead to biased results. For
example, questions in the questionnaire may be badly worded. ‘Why is Brand X
superior?’ raises the question of whether Brand X actually is thought to be supe-
rior. Bad questions may not just be a question of competence: they may indicate
a desire to ‘rig’ the questionnaire to provide the answers that are wanted. It is
always advisable to ask about the source of funding for any ‘independent’ survey.
Bias is most dangerous when it exists but is not recognised. When this is the case
the results may be interpreted as if the sample were truly representative of the
population when it is not, possibly leading to false conclusions.
When bias is recognised, it does not mean that the sample is useless, merely that
care should be taken when making inferences about the population. In the example
concerned with family reactions to major illness, although the sample was likely to
be biased, the results could be quoted with the qualification that a further 20 per
cent of the population was unwilling to be interviewed and the results for them
might be quite different.
In other circumstances a biased sample is useful as a pilot for a more representa-
tive sampling study.
The second answer to the sample size question is not theoretical but is more
frequently met in practice. It is to collect the largest sample that the available budget
allows. Then the results can be interpreted in the light of the accuracy given by the
sample size.
It may seem surprising that the population size has no bearing on the sample
size. However, the population size has an effect in a different way. The definition of
simple random sampling (which is also involved in other types of sampling) is that
each member of the population should have an equal chance of being chosen.
Suppose the population is small, with only 50 members. The probability of the first
member being chosen is 1/50, the second is 1/49, the third is 1/48 and so on (i.e.
the probabilities are not equal). For a large population the difference in probabilities
is negligible and the problem is ignored. For a small population there are two
options. First, sampling could be done with replacement. This means that when an
element of the sample has been chosen, it is then returned to the population. The
probabilities of selection will then all be equal. This option means that an element
may be included in the sample twice. The second option is to use a different theory
for calculating accuracy and sample size. This second option is based on sampling
without replacement and is more complicated. A small population, therefore,
while not affecting sample size, does make a difference to the nature of sampling.
Learning Summary
It is most surprising that information collection should be so often done in apparent
ignorance of the concept of sampling. Needing information about invoices, one
large company investigated every single invoice issued and received over a three-
month period: a monumental task. A simple sampling exercise would have reduced
the cost to around 1 per cent of the actual cost with little or no loss of accuracy.
Even after it is decided to use sampling, there is still, obviously, a need for careful
planning. This should include a precise timetable of what and how things are to be
done. The crucial questions are: ‘What are the exact objectives of the study?’ and
‘Can the information be provided from any other source?’ Without this careful
planning it is possible to collect a sample and then find the required measurements
cannot be made. For example, having obtained a sample of 2000 of the workforce,
it may be found that absence records do not exist, or it may be found that another
group in the company carried out a similar survey 18 months before and their
information merely needs updating.
The range of uses of sampling is extremely wide. Whenever information has to
be collected, sampling can prove valuable. The following list gives a guide to the
applications that are frequently encountered:
(a) opinion polls of political and organisational issues;
(b) market research of consumer attitudes and preferences;
(c) medical investigations;
(d) agriculture (crop studies);
(e) accounting;
(f) quality control (inspection of manufactured output);
Review Questions
6.1 Which reason is correct?
The need to take samples arises because:
A. it is impossible to take measurements of whole populations.
B. sampling gives more accurate results.
C. sampling requires less time and money than measuring the whole population.
6.3 A British company divides the country into nine sales regions. Three of them are to be
selected at random and included in an information exercise. Using the table of random
numbers, what are the regions chosen? (Take the numbers starting at the top left, a row
at a time.)
Random Numbers
5 8 5 0 4
7 2 6 9 3
6 1 4 7 8
Sales regions
1. North West England
2. Eastern England
3. Midlands
4. London
5. South East England
6. South West England
7. Wales
8. Scotland
9. Northern Ireland
The sample will comprise:
A. SE England, Scotland, SE England.
B. SE England, Scotland, London.
C. Scotland, SE England, NW England.
D. SE England, Wales, SW England.
E. NW England, N Ireland, NW England.
6.4 The advantages of multi-stage sampling over simple random sampling are:
A. fewer observations need to be made.
B. the entire population does not have to be listed.
C. it can save time and effort by restricting the observations to a few areas of the
population only.
D. the accuracy of the results can be calculated more easily.
6.5 Which of the following statements about stratified sampling are true?
A. It is usually more representative than a simple random sample.
B. It cannot also be a cluster sample.
C. It may be more expensive than a simple random sample.
6.6 Which of the following statements about variable sampling fractions is true?
A. Measurements on one part of the sample are taken extra carefully.
B. A section of the population is deliberately over-represented in the sample.
C. The size of the sample is varied according to the type of items so far selected.
D. The results from the sample are weighted so that the sample is more repre-
sentative of the population.
6.8 What is the essential difference between stratified and quota sampling?
A. Quota sampling refers only to interviewing, whereas stratification can apply to
any sampling situation.
B. Stratification is associated with random selection within strata, whereas quota
sampling is unconnected with random methods.
C. The strata sizes in the sample correspond to the population strata sizes,
whereas the quota sizes are fixed without necessarily considering the popula-
tion.
6.9 A sample of trees is to be taken (not literally) from a forest and their growth monitored
over several years. In the forest 20 per cent of the trees are on sloping ground and 80
per cent are on level ground. A sample is taken by first dividing the forest into geo-
graphical areas of approximately equal size. There are 180 such areas (36 sloping, 144
level). Three sloping areas and 12 level ones are selected at random. In each of these 15
areas, 20 trees are selected at random for inclusion in the sample.
The resulting sample of 300 (20 × 15) trees can be said to have been selected using
which of the following sample methods?
A. Multi-stage.
B. Cluster.
C. Stratification.
D. Weighting.
E. Probability.
F. Area.
G. Systematisation.
H. Convenience.
6.10 A random sample of 25 children aged 10 years is taken and used to measure the average
height of 10-year-old children with an accuracy of ±12 cm (with 95 per cent probability
of being correct). What would have been the accuracy had the sample size been 400?
A. ±3 cm
B. ±0.75 cm
C. ±48 cm
D. ±4 cm
It is thought that branch size affects the profitability of customer accounts, because of
differing staff ratios and differing ranges of services being available at branches of
different size. Each branch has between 100 and 15 000 chequebook accounts. All
accounts are computerised regionally. Each region has its own computer, which can
provide a chronological (date of opening account) list of chequebook accounts and from
which all the necessary information on average balances and so on can be retrieved.
a. Why should the bank adopt a sampling approach rather than taking the information
from all account holders?
b. In general terms, what factors would influence the choice of sample size?
c. If a total sample of 2000 is to be collected, what sampling method would you
recommend? Why?
d. In addition, the bank wants to use the information from the sample to compare the
profitability of chequebook accounts for customers of different socioeconomic
groups. How would you do this? What extra information would you require? NB:
An account holder’s socioeconomic group can be specified by reference to his/her
occupation. The bank will classify into five socioeconomic groups.
e. What practical difficulties do you foresee?
Statistical Methods
Module 7 Distributions
Module 8 Statistical Inference
Module 9 More Distributions
Module 10 Analysis of Variance
Distributions
Contents
7.1 Introduction.............................................................................................7/1
7.2 Observed Distributions ..........................................................................7/2
7.3 Probability Concepts ..............................................................................7/8
7.4 Standard Distributions ........................................................................ 7/14
7.5 Binomial Distribution .......................................................................... 7/15
7.6 The Normal Distribution .................................................................... 7/19
Learning Summary ......................................................................................... 7/27
Review Questions ........................................................................................... 7/29
Case Study 7.1: Examination Grades ........................................................... 7/31
Case Study 7.2: Car Components................................................................. 7/31
Case Study 7.3: Credit Card Accounts ........................................................ 7/32
Case Study 7.4: Breakfast Cereals ................................................................ 7/32
7.1 Introduction
Several examples of distributions have been encountered already. In Module 5, on
summary measures, driving competition scores formed a symmetrical distribution;
viewing of television serial episodes formed a U-shaped distribution; sickness
[Figure 7.1 consists of an unsorted jumble of individual market-share readings, with values ranging from about 4 per cent up to the mid-30s.]
Figure 7.1 Market share (%) of soft drink product throughout Europe
Figure 7.1 is a mess. At first, most data are. They may have been taken from large
market research reports, dog-eared production dockets or mildewed sales invoices;
they may be the output of a computer system where no effort has been made at data
communication. Some sorting out must be done. A first attempt might be to arrange
the numbers in an ordered array, as in Table 7.1.
The numbers look neater now, but it is still difficult to get a feel for the data –
the average, the variability, etc. – as they stand. The next step is to classify the data.
Classifying means grouping the numbers in bands, such as 20–25, to make them
easier to appreciate. Each class has a frequency, which is the number of data points
(market shares) that fall within that class. A frequency table is shown in Table 7.2.
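For readers who like to check such steps by computer, the classification can be sketched in a few lines of Python; the data values and the class width of 5 used here are illustrative assumptions rather than figures from the module.

# A minimal sketch of the classification step; data and class width are illustrative.
data = [6, 12, 17, 23, 8, 31, 26, 19, 22, 15, 28, 34, 11, 24, 20]
width = 5

counts = {}
for x in data:
    lower = (x // width) * width              # e.g. 23 falls in the class 20-24
    counts[lower] = counts.get(lower, 0) + 1

for lower in sorted(counts):
    print(f"{lower}-{lower + width - 1}: frequency {counts[lower]}")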
[Figure: frequency histogram of the market shares, with frequency (up to about 30) plotted against market share (%) in classes from 0 to 40.]
Table 7.3  Daily deliveries of food to a hypermarket (300 days)
Deliveries per day    Number of days    %      Probability
0–9                   54                18     0.18
10–19                 81                27     0.27
20–29                 75                25     0.25
30–39                 42                14     0.14
40–49                 29                10     0.10
50+                   19                 6     0.06
Total                 300              100     1.00
[Figure: the delivery data of Table 7.3 displayed as a probability histogram, probability scale 0.05 to 0.25.]
Consequently:
P(deliveries exceed 45) = P(deliveries 46–49) + P(deliveries 50 or more)
= 0.04 + 0.06
= 0.10
Therefore, X = 45.
The capacity of 45 will result in overtime being needed on 10 per cent of days.
An alternative representation of a frequency distribution is as a cumulative frequency
distribution. Instead of showing the frequency for each class, a cumulative frequency
distribution shows the frequency for that class and all smaller classes. For example,
instead of recording the number of days on which the deliveries of food to a hypermar-
ket were in the ranges 0–9, 10–19, 20–29, etc., a cumulative distribution records the
number of days when deliveries were less than 9, less than 19, less than 29, etc.
Table 7.4 turns Table 7.3 into cumulative form.
Table 7.4 shows cumulative frequencies in ‘less than or equal to’ format. They could just
as easily be in ‘more than or equal to’ format, as in Table 7.5.
These cumulative frequency tables can be put into the form of graphs. They are then
known as ogives. Figure 7.4 shows Table 7.4 and Table 7.5 as ogives.
[Figure 7.4: the 'less than' and 'more than' ogives; cumulative frequency (0 to 300) plotted against deliveries (0, 9, 19, 29, 39, 49).]
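A minimal Python sketch of the cumulative calculation is given below; the class frequencies are those of the delivery example (300 days in total), with the first two frequencies inferred from the probabilities of 0.18 and 0.27 quoted in the worked example that follows.

# Sketch of 'less than or equal to' and 'more than or equal to' cumulative
# frequencies (Table 7.4 and Table 7.5 style).
classes = ["0-9", "10-19", "20-29", "30-39", "40-49", "50+"]
freq = [54, 81, 75, 42, 29, 19]               # first two values inferred from the probabilities

running = 0
cum_less = []                                  # 'less than or equal to' form
for f in freq:
    running += f
    cum_less.append(running)

cum_more = [sum(freq) - c + f for c, f in zip(cum_less, freq)]   # 'more than or equal to' form

for name, le, me in zip(classes, cum_less, cum_more):
    print(name, le, me)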
possible for the company to make a loss (event A) and be made bankrupt (event B).
The addition law for mutually exclusive events is:
P(A or B or C or …) = P(A) + P(B) + P(C) + …
This law has already been used implicitly in the example on hypermarket deliver-
ies. The probabilities of different classes were added together to give the probability
of an amalgamated class. The law was justified by relating the probabilities to the
frequencies from which they had been derived.
Example
Referring to the example on hypermarket deliveries (Table 7.3), what is the probability
of fewer than 40 deliveries on any day?
The events (the classes) in Table 7.3 are mutually exclusive. For example, given that the
number of deliveries on a day was in the range 10–19, it is not possible for that same
number of deliveries to belong to any other class. The addition law, therefore, can be
applied:
P(fewer than 40 deliveries) = P(0–9 deliveries or 10–19 or 20–29 or 30–39)
= P(0–9) + P(10–19) + P(20–29) + P(30–39)
= 0.18 + 0.27 + 0.25 + 0.14
= 0.84
Equally, the result could have been obtained by working in the frequencies from which
the probabilities were obtained.
Note that if events are not mutually exclusive, this form of the addition law cannot be
used.
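The addition-law arithmetic of the example can be verified with a short sketch, using the class probabilities quoted above.

# Sketch of the addition law for the mutually exclusive delivery classes.
p = {"0-9": 0.18, "10-19": 0.27, "20-29": 0.25, "30-39": 0.14,
     "40-49": 0.10, "50+": 0.06}

p_fewer_than_40 = p["0-9"] + p["10-19"] + p["20-29"] + p["30-39"]
print(p_fewer_than_40)                         # 0.84, as in the worked example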
A conditional probability is the probability of an event under the condition that
another event has occurred or will occur. It is written in the form P(A/B), meaning the
probability of A given the occurrence of B. For example, if P(rain) is the probability of
rain later today, P(rain/dark clouds now) is the conditional probability of rain later
given that the sky is cloudy now.
The probabilities of independent events are unaffected by the occurrence or non-
occurrence of the other events. For example, the probability of rain later today is not
affected by dark clouds a month ago and so the events are independent. On the other
hand, the probability of rain later today is affected by the presence of dark clouds now.
These events are not independent. The definition of independence is based on condi-
tional probability. Event A is independent of B if:
P(A) = P(A/B)
This equation is merely the mathematical way of saying that the probability of A is not
affected by the occurrence of B. The idea of independence leads to the multiplication
law of probability for independent events:
P(A and B and C and …) = P(A) × P(B) × P(C) × …
The derivation of this law will be explained in the following example.
Example
Twenty per cent of microchips produced by a certain process are defective. What is the
probability of picking at random three chips that are defective, defective, OK, in that
order?
The three events are independent since the chips were chosen at random. The multipli-
cation law can therefore be applied.
P(1st chip defective and 2nd chip defective and 3rd chip OK)
= P(defective) × P(defective) × P(OK)
= 0.2 × 0.2 × 0.8
= 0.032
This result can be verified intuitively by thinking of choosing three chips 1000 times.
According to the probabilities, the 1000 selections are likely to break down as in
Figure 7.5. Note that in a practical experiment the probability of, for instance, 20 per
cent would not guarantee that exactly 200 of the first chips would be defective and 800
OK, but it is the most likely outcome. Figure 7.5 shows 32 occasions when two
defectives are followed by an OK; 32 in 1000 is a proportion of 0.032 as given by the
multiplication law.
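A small sketch, using an arbitrary number of simulated selections, illustrates how the multiplication law agrees with this frequency argument.

import random

# Sketch of the multiplication law for the microchip example, checked by simulation.
p_defective = 0.2
p_exact = p_defective * p_defective * (1 - p_defective)      # 0.032

trials = 100_000
hits = 0
for _ in range(trials):
    chips = [random.random() < p_defective for _ in range(3)]
    if chips[0] and chips[1] and not chips[2]:               # defective, defective, OK in that order
        hits += 1

print(p_exact, hits / trials)                                 # the two figures should be close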
[Diagram: the possible orderings of candidates A, B, C and D; 4 possibilities for first place, 3 for second place, 2 for third place and 1 for fourth place, giving 24 sequences in all.]
Second, repetition occurs because the order of the two selected for the subcom-
mittee does not affect its constitution. It is the same subcommittee of A and B
whether they were selected first A then B or vice versa. In Table 7.6 each pair of
orderings 1,7; 2,8; 3,14 etc. provides just one subcommittee. Consequently, the
number of subcommittees can be further halved from 12 to 6. Since the process
started with all possible sequences of the four candidates, and since allowance has
now been made for all repetitions, this is the final number of different subcommit-
tees. They are:
AB, AC, AD, BC, BD, CD
Put more concisely, the number of ways of choosing these six was calculated
from:
4!/(2! × (4 − 2)!) = 24/(2 × 2) = 6
More generally, the number of ways of selecting r objects from a larger group of
n objects is:
n!/(r! × (n − r)!)
These calculations are not of immediate practical value, but they are the basis for
the derivation of the binomial distribution later in the module.
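As an aside, the selection formula is easy to evaluate by computer; the sketch below checks the subcommittee example against the standard library function math.comb.

import math

# Sketch of the selection formula n! / (r! * (n - r)!).
def n_choose_r(n, r):
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

print(n_choose_r(4, 2))                        # 6 subcommittees of two from A, B, C, D
print(n_choose_r(4, 2) == math.comb(4, 2))     # True: agrees with the library function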
7.5.1 Characteristics
The binomial distribution is discrete (the values taken by the variable are distinct),
giving rise to stepped shapes. Figure 7.7 illustrates this and also that the shape can
vary from right-skewed through symmetrical to left-skewed depending upon the
situation in which the data are collected.
The elements of a statistical population are of two types. Each element must be
of one but only one type. The proportion, p, of the population that is of the
first type is known (and the proportion of the second type is therefore 1 − p).
A random sample of size n is taken. Because the sample is random, the number
of elements of the first type it contains is not certain (it could be 0, 1, 2, 3, …
or n) depending upon chance.
From this theoretical situation the probabilities that the sample contains given
numbers (from 0 up to n) of elements of the first type can be calculated. If a great
many samples were actually collected, a histogram would be gradually built up, the
variable being the number of elements of the first type in the sample. Probabilities
measured from the histogram should match those theoretically calculated (approxi-
mately but not necessarily exactly, as with the coin tossing example). The
probabilities calculated theoretically are binomial probabilities; the distribution
formed from these probabilities is a binomial distribution.
For example, a machine produces microchip circuits for use in children’s toys.
The circuits can be tested and found to be defective or OK. The machine has been
designed to produce no more than 20 per cent defective chips. A sample of 30 chips
is collected. Assuming that overall exactly 20 per cent of chips are defective, the probabilities of the sample containing 0, 1, 2, … up to 30 defective chips can be calculated from the binomial formula.
The following example shows how this formula can be used. If 40 per cent of the
USA electorate are Republican voters, what is the probability that a randomly
assembled group of three will contain two Republicans?
Using the binomial formula:
P(2 Republicans in group of 3)
= P(r = 2)
= n!/(r! × (n − r)!) × p^r × (1 − p)^(n−r)
= 3!/(2! × 1!) × p^2 × (1 − p)    since sample size, n, equals 3
= 3!/(2! × 1!) × 0.4^2 × 0.6    since the proportion of type 1 (Republicans) is 0.4
= 3 × 0.16 × 0.6
= 0.288
There is a 28.8 per cent chance that a group of three will contain two Republican
voters.
Making similar calculations for r = 0, 1 and 3, we obtain the binomial distribution
of Figure 7.8.
[Figure 7.8: the binomial distribution for n = 3, p = 0.4; probabilities of 21.6%, 43.2%, 28.8% and 6.4% for r = 0, 1, 2 and 3 respectively.]
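The same calculation, repeated for every value of r, reproduces the whole distribution; a minimal sketch follows.

import math

# Sketch of the full binomial distribution for n = 3, p = 0.4 (Figure 7.8).
n, p = 3, 0.4
for r in range(n + 1):
    probability = math.comb(n, r) * p ** r * (1 - p) ** (n - r)
    print(r, round(probability, 3))            # 0.216, 0.432, 0.288, 0.064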
Example
A manufacturer has a contract with a supplier that no more than 5 per cent of the
supply of a particular component will be defective. The component is delivered in lorry
loads of several hundred. From each delivery a random sample of 20 is taken and
inspected. If three or more of the 20 are defective, the load is rejected. What is the
probability of a load being rejected even though the supplier is sending an acceptable
proportion of 5 per cent defective?
The binomial distribution is applicable, since the population is split two ways and the
samples are random. Table A1.1 in Appendix 1 is a binomial table for samples of size 20.
In Table A1.1, the rows refer to the number of defective components in the
sample; the columns refer to the proportion of defectives in the population of all
components supplied (0.05 here). From the column headed 0.05 in Table A1.1 in
Appendix 1, P(3 defectives in the sample) = 6.0 per cent and P(4 or more defectives) = 1.6 per cent.
The manufacturer rejects loads when there are three or more defectives in the sample.
From the above, the probability of a load being rejected even though the supplier is
sending an acceptable proportion of defectives is (6.0% + 1.6% =) 7.6 per cent.
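A short sketch reproduces the rejection probability directly from the binomial formula rather than from the table; small differences from the 7.6 per cent above are due to rounding in Table A1.1.

import math

# Sketch: probability of three or more defectives in a random sample of 20
# when the population proportion defective is 0.05.
n, p = 20, 0.05
p_reject = sum(math.comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(3, n + 1))
print(round(p_reject, 4))                      # about 0.0755, i.e. roughly 7.6 per cent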
7.5.5 Parameters
The distributions in Figure 7.7 differ because the parameters differ. Parameters fix
the context within which a variable varies. The binomial distribution has two
parameters: the sample size, n, and the population proportion of elements of the
first type, p. Two binomial distributions with different parameters, while still having
the same broad characteristics shown in Figure 7.7, will be different; two binomial
distributions with the same sample size and the same proportion, p, will be identical.
The variable is still free to take different values but the parameters place restrictions
on the extent to which different values can be taken.
The binomial probability formula in Section 7.5.3 demonstrates why this is so.
Once n and p are fixed, the binomial probability is fixed for each r value; if n and p
were changed to different values, the situation would still be binomial (the same
general formula is being used), but the probabilities and the histogram would differ.
Right-skewed shapes occur when p is small (close to 0), left-skewed shapes when p
is large (close to 1), and symmetrical shapes when p is near 0.5 or n is large (how
large depends upon the p value). This can be checked by drawing histograms for
different parameter values.
7.6.1 Characteristics
The normal distribution is bell shaped and symmetrical, as in Figure 7.9. It is also
continuous. The distributions met up to now have been discrete. A distribution is
discrete if the variable takes distinct values such as 1, 2, 3, … but not those in
between. The delivery distribution (Table 7.3) is discrete since the data are in
groups; the binomial distribution is discrete since the variable (r) can take only
whole number values. Continuous means that all values are possible for the variable.
It could take the values 41.576, 41.577, 41.578, … rather than all these values being
grouped together in the class 40–49 or the variable being limited to whole numbers
only.
[Figure 7.9: the normal curve. Approximately 68 per cent of the area lies within ±1 standard deviation of the mean, 95 per cent within ±2 and 99.7 per cent within ±3.]
The theoretical situation from which the normal distribution is derived is that of a
quantity subject to a large number of independent sources of disturbance. Each
source gives rise to a small variation in the value of the quantity. The variations are
equally likely to be positive or negative at random; they are independent of one
another (i.e. no variation is influenced by any other); and they can be added together.
Because of positive and negative variations cancelling each other out, the tenden-
cy would be for most values to be close to the central value, but with a few at
greater distances. Intuitively, the resulting symmetrical bell shape can be visualised.
A great many practical situations approximate very well to the theoretical. It is
not necessary to know what the sources of variation are, merely that they are likely
to exist. Actual situations in which the normal distribution has been found to apply
include:
(a) IQs of children;
(b) heights of people of the same sex;
(c) dimensions of mechanically produced components;
(d) weights of machine-produced items;
(e) arithmetic means of large samples.
In the case of human beings, the variations giving rise to normal distributions of
IQs, heights and other variables are presumably the many genetic and environmen-
tal effects that make people different. For mechanically produced items, the sources
must include vibration (from a number of factors), tiny differences in machine
settings, different operators, etc.
The use of the distribution in sampling is one of the most important. The varia-
tions that make one sample different from another arise from the random choices as
to the elements included in the sample. This is an important part of statistical
inference, the subject of the next module.
The probability of, say, 57 positive variations (and therefore 43 negative) is given
by the binomial probability formula with p = 0.5 and n = 100:
P(r = 57) = 100!/(57! × 43!) × 0.5^57 × 0.5^43
Since r = 57 gives rise to an x value of L + 14u:
P(x = L + 14u) = 100!/(57! × 43!) × 0.5^57 × 0.5^43
The probabilities of other x values could be calculated similarly. The distribution
of x will be symmetrical if displayed in the form of a histogram since a binomial
distribution with p = 0.5 is symmetrical. This is close to what is being sought except
that a normal distribution is based on many small variations, rather than 100
variations of size u. To make further progress some advanced mathematics is
needed. The number of variations is increased to be infinitely large while the size of
the variations is reduced to be infinitely small. The resulting formula for the normal
distribution is:
P(x ≤ X) = ∫ from −∞ to X of (1/(σ√(2π))) × e^(−(x − μ)²/(2σ²)) dx
The formula may not appear attractive or even comprehensible. Fortunately it is
not necessary to be able to use it. As with the binomial, there are normal curve
tables (see Table A1.2 in Appendix 1) from which probabilities can be read.
Example
A factory produces a wide variety of tins for use with food products. A machine
produces the lids for tins of instant coffee. The diameters of the lids are normally
distributed with a mean of 10 cm and a standard deviation of 0.03 cm.
(a) What percentage of the lids have diameters in the range 9.97 cm to 10.03 cm?
Information about areas under normal curves refers to values of the variable being
measured in terms of ‘standard deviations away from the mean’. The range 9.97 to
10.03 when in this form is 10 – 0.03 to 10 + 0.03 (i.e. Mean – 1s to Mean + 1s).
From the basic properties of a normal curve (Figure 7.9), this range includes 68 per
cent of the area under the curve or, in other words, 68 per cent of the observations.
So, 68 per cent of the lids have diameters in the range 9.97 to 10.03 cm. (Check
with Table A1.2 in Appendix 1 that this is approximately correct by looking up z =
1.0.)
(b) Lids of diameter greater than 10.05 cm are too large to form an airtight seal and
must be discarded. If the machine produces 8000 lids per shift, how many are wast-
ed?
Lids wasted have diameters greater than 10.05, or greater than mean + 1.67 stand-
ard deviations (since (10.05 − 10.00)/0.03 = 1.67). The range is therefore no longer
expressed as the mean plus a whole number of standard deviations. For ranges such
as these the normal curve table has to be used. The table is used as follows: each
number in the body of the table is the area under the curve from the mean to a
point a given number (z) of standard deviations to the right of the mean (the shaded
area in Figure 7.10). In this example, the area under the curve from the mean to the
point 1.67 standard deviations to the right is wanted. In Table A1.2, Appendix 1,
look down the left-hand column to find 1.6, then across the top row to find 0.07.
The figure at the intersection is 0.4525, corresponding to 1.67 standard deviations
from the mean.
The question asks for the probability of a diameter greater than 10.05 cm (i.e. for
the area beyond 1.67 standard deviations). The area to the right of the mean as far
as 1.67 standard deviations has just been found to be 0.4525; the entire area to the
right of the mean is 0.5, since the distribution is symmetrical. Therefore (see Fig-
ure 7.11):
(diameter greater than 10.05) = 0.5 − 0.4525
= 0.0475
So, 4.75 per cent of lids will have a diameter greater than 10.05 cm. Therefore, in a
shift, 4.75 per cent of 8000 = 380 lids will be unusable.
[Figure 7.11: the normal curve of lid diameters; an area of 0.4525 lies between the mean (10.0 cm) and 10.05 cm, and 0.0475 lies beyond 10.05 cm.]
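Where a computer is available, the normal curve probability can be evaluated directly instead of being read from Table A1.2; the sketch below uses the standard error function for this purpose and reproduces the lid calculation.

import math

# Sketch of the lid calculation; phi() is the cumulative standard normal probability.
def phi(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mean, sd = 10.0, 0.03
z = (10.05 - mean) / sd                        # about 1.67 standard deviations
p_waste = 1 - phi(z)
print(round(p_waste, 4))                       # about 0.048 (the table gives 4.75 per cent)
print(round(p_waste * 8000))                   # roughly 380 lids per shift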
[Figure 7.12: the central 90 per cent of a normal distribution lies between −z and +z standard deviations of the mean.]
To find the range within which the diameters of 90 per cent of the lids lie, note that
45 per cent of the area must lie on each side of the mean; from Table A1.2 this
corresponds to z = 1.645 standard deviations. So 90 per cent of the production lies
within ±1.645s of the mean, or within ±1.645 × 0.03 cm of the mean = ±0.04935 cm.
The range of diameters for 90 per cent of the lids is, therefore, 9.951 to 10.049 cm (approx.).
Note that the basic properties given in Figure 7.9 are approximations only. By look-
ing up z = 1.00, it can be seen that 68.26 per cent of the area lies within ±1 standard
deviation of the mean. Similarly, 95.44 per cent lies within ±2 standard deviations
and 99.74 per cent within ±3 standard deviations.
7.6.5 Parameters
The normal distribution applies to both IQs and weights of bread loaves, yet the
shapes of the two distributions are different (see Figure 7.13). The distributions
differ because the parameters differ. The normal distribution has two parameters,
the arithmetic mean and the standard deviation. Contrast this with the binomial
distribution, whose two parameters are the sample size and the population propor-
tion of type 1. Two normal distributions with the same mean and the same standard
deviation will be identical; two normal distributions with different means and
standard deviations, while still having the characteristics shown in Figure 7.9, will be
centred differently and be of different widths.
[Figure 7.13: (a) and (b), two normal distributions such as IQs and weights of loaves, with different means and standard deviations.]
As an illustration of checking whether data follow a normal distribution, consider
the following 20 observations:
5, 8, 2, 4, 4, 6, 5, 3, 7, 6, 9, 7, 4, 2, 6, 5, 3, 5, 5, 4
The data will be easier to handle if they are grouped together:
Value      0  1  2  3  4  5  6  7  8  9
Frequency  0  0  2  2  4  5  3  2  1  1
First, calculate the parameters: the arithmetic mean and standard deviation.
Mean = 5.0
Standard deviation = 1.9
What is theoretically expected from a normal distribution with these parameters
is that 68 per cent of the observations will be between 5 ±1.9, 95 per cent between 5
±3.8, etc. as follows:
                              Observed %    Theoretical %
Mean ± 1s: 3.1 to 6.9         60            68.0
Mean ± 2s: 1.2 to 8.8         95            95.0
Mean ± 3s: −0.7 to 10.7       100           99.7
Although it is not perfect, the match between observed and theoretical for just 20
observations is reasonably good, suggesting an approximate normal distribution.
Judgement alone has been used to say that the match is acceptable. Statistical
approaches to this question are available but are beyond the scope of this module.
The methods are, however, based on the same principle. The observed data are
compared with what is theoretically expected under an assumption of normality.
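The counting involved in such a check is easily automated; the sketch below repeats the comparison for the 20 observations above (the standard deviation it computes, about 1.8, differs slightly from the 1.9 quoted, without affecting the counts).

import statistics

# Sketch of the normality check: count values within 1, 2 and 3 standard deviations.
data = [5, 8, 2, 4, 4, 6, 5, 3, 7, 6, 9, 7, 4, 2, 6, 5, 3, 5, 5, 4]
mean = statistics.mean(data)                   # 5.0
sd = statistics.pstdev(data)                   # about 1.8

for k in (1, 2, 3):
    inside = sum(1 for x in data if abs(x - mean) <= k * sd)
    print(k, 100 * inside / len(data))         # 60%, 95%, 100%; compare with 68%, 95%, 99.7%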
Example
If, on average, 20 per cent of firms respond to industrial questionnaires, what is the
probability of fewer than 50 responses when 300 questionnaires are sent out?
The situation is one to which the binomial applies. Firms either respond or do not
respond, giving the two-way split of the population. A sample of 300 is being taken from
this population. To answer the question using the binomial formula, the following would
have to be calculated or obtained from tables:
P(0 responses) = [300!/(0! × 300!)] × 0.2^0 × 0.8^300
P(1 response) = [300!/(1! × 299!)] × 0.2^1 × 0.8^299
⋮
P(49 responses) = [300!/(49! × 251!)] × 0.2^49 × 0.8^251
These 50 probabilities would then have to be added. This is a daunting task, even for
mathematical masochists. However, since np = 60 (= 300 × 0.2) and n(1 − p) = 240 are
both greater than 5, the normal approximation can be used, with parameters:
Arithmetic mean = np = 60
Standard deviation = √(np(1 − p))
= √(300 × 0.2 × 0.8)
= 7 (approx.)
A slight difficulty arises in that a discrete distribution is being approximated by a
continuous one. To make the approximation it is necessary to pretend that the variable
(the number of responses) is continuous (i.e. 60 responses represent the interval
between 59.5 and 60.5, 61 the interval between 60.5 and 61.5 and so on). Consequently,
to answer the question posed, P(r ≤ 49.5) must be found:
49.5 is 10.5 from the mean, or
z = 10.5/7
= 1.5 standard deviations from the mean
Using Table A1.2 in Appendix 1:
P(z ≤ −1.5) = 0.5 − 0.4332
= 0.0668
P(fewer than 50 replies) = 6.68%
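A sketch comparing the approximation with the exact (and laborious) binomial sum is given below; the phi function stands in for Table A1.2.

import math

def phi(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Sketch: normal approximation (with continuity correction) versus the exact
# binomial sum for P(fewer than 50 responses), n = 300, p = 0.2.
n, p = 300, 0.2
mean = n * p                                   # 60
sd = math.sqrt(n * p * (1 - p))                # about 6.93 (the text rounds to 7)

approximate = phi((49.5 - mean) / sd)
exact = sum(math.comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(50))
print(round(approximate, 4), round(exact, 4))  # compare with the 6.68% obtained above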
Learning Summary
The analysis of management problems often involves probabilities. For example,
postal services define their quality of service as the probability that a letter will reach
its destination the next day; electricity utilities set their capacity at a level such that
there is no more than some small probability that it will be exceeded and power cuts
necessitated; marketing managers in contracting companies may try to predict future
business by attaching probabilities to new contracts being sought. In such situations
and many others, including those introduced earlier, the analysis is frequently based
on the use of observed or standard distributions.
An observed distribution usually entails the collection of large amounts of data
from which to form histograms and estimate probabilities.
A standard distribution is mathematically derived from a theoretical situation. If
an actual situation matches (to a reasonable approximation) the theoretical, then the
standard distribution can be used both to describe and to analyse the situation. As a
result, fewer data need to be collected.
This module has been concerned with two standard distributions: the binomial
and the normal. For each, the following have been described:
(a) its characteristics;
(b) the situations in which it can be used;
(c) its derivation;
(d) the use of probability tables;
(e) its parameters;
(f) how to decide whether an actual situation matches the theoretical situation on
which the distribution is based.
The mathematics of the distributions have been indicated but not pursued rigor-
ously. The underlying formulae, particularly the normal probability formula, require
a relatively high level of mathematical and statistical knowledge. Fortunately such
detail is not necessary for the effective use of the distributions because tables are
available. Furthermore, the role of the manager will rarely be that of a practitioner
of statistics; rather, he or she will have to supervise the use of statistical methods in
an organisation. It is therefore the central concepts of the distributions, not the
mathematical detail, that are of concern. To look at them more deeply goes beyond
what a manager will find helpful and enters the domain of the statistical practitioner.
The distributions that have been the subject of this module are just two of the
many that are available. However, they are two of the most important and useful.
The principles behind the use of any standard distribution are the same, but each is
associated with a different situation. A later module will look at other standard
distributions and their applications.
Review Questions
7.1 What method of probability measurement is used to estimate probabilities in observed
distributions?
A. A priori
B. Relative frequency
C. Subjective
7.2 A machine produces metal rods of nominal length 100 cm for use in locomotives. A
sample of 1000 rods is selected, the rods measured, their lengths classified in groups of
size 0.1 cm, and a frequency histogram drawn. From the histogram, probabilities are
calculated. What type of distribution has been formed?
A. Observed
B. Binomial
C. Normal
7.3 The amounts owed by the customers of a mail-order company at the end of its financial
year would form a continuous distribution. True or False?
Questions 7.4 to 7.6 refer to the following situation:
A medical consultant books appointments to see 20 patients in the course of a morn-
ing. Some of them cancel their appointments at little or no notice. From past records
the following probabilities have been calculated.
7.4 What is the probability that on a particular morning there will be no more than one
cancellation?
A. 61%
B. 29%
C. 39%
D. 9%
7.5 What is the probability that on two successive mornings there will be no cancellations?
A. 64%
B. 16%
C. 10.2%
D. 9.1%
7.6 What assumption must be made to justify the use of the multiplication rule to answer
Question 7.5?
The events are:
A. mutually exclusive.
B. conditional.
C. independent.
7.8 Which of the following are advantages in using standard distributions compared to
observed distributions?
A. No data need be collected.
B. Knowledge of the distribution is available in probability tables.
C. Greater accuracy.
7.9 What standard distribution would you expect to describe the variation in the number of
persons, in randomly selected samples of 100 residents of London, who are watching a
popular television programme at a given time?
A. Normal
B. Binomial
7.10 In Question 7.9, the normal distribution could be used as an approximation to the
binomial. True or False?
7.11 A company manufacturing chocolate bars planned that, three months after launching a
new product, 40 per cent of the nation should have heard of it. If this has been achieved,
what is the probability that a random sample of five people will contain only one who
has heard of it?
A. 0.08
B. 0.05
C. 0.13
D. 0.26
7.12 Estimate the total number of dockets cleared by all clerks each day.
A. 2280
B. 300
C. 190
D. 2880
7.13 Estimate the approximate number of clerks who clear more than 215 per day.
A. 1
B. 1 clerk every other day
C. 2
D. 3
7.14 Within what range would the number of dockets cleared by any clerk on any day be
expected to lie, with 95 per cent probability?
A. 140 to 240
B. 165 to 215
C. 115 to 265
D. 140 to 215
a. Are these results consistent with the view that the process is operating with an
average 10 per cent of defectives?
b. What reservations have you about your conclusion?
Statistical Inference
Contents
8.1 Introduction.............................................................................................8/1
8.2 Applications of Statistical Inference .....................................................8/2
8.3 Confidence Levels ...................................................................................8/2
8.4 Sampling Distribution of the Mean .......................................................8/3
8.5 Estimation................................................................................................8/6
8.6 Basic Significance Tests..........................................................................8/9
8.7 More Significance Tests ...................................................................... 8/18
8.8 Reservations about the Use of Significance Tests............................ 8/24
Learning Summary ......................................................................................... 8/26
Review Questions ........................................................................................... 8/28
Case Study 8.1: Food Store ........................................................................... 8/30
Case Study 8.2: Management Association ................................................... 8/31
Case Study 8.3: Textile Company ................................................................ 8/31
Case Study 8.4: Titan Insurance Company.................................................. 8/31
Learning Objectives
Statistical inference is the set of methods by which data from samples can be turned
into more general information about populations. By the end of the module, the
reader should understand the basic underlying concepts. Statistical inference has two
main parts. Estimation is concerned with making predictions and specifying their
accuracy; significance testing is concerned with distinguishing between a result
arising by chance and one arising from other factors. The module describes some of
the many different types of significance test. As in the last module, some of the
mathematics will have to be left in a ‘black box’.
8.1 Introduction
Statistical inference is the use of sample data to predict, or infer, further pieces of
information about the population from which the sample or samples came. It is a
collection of methods by which knowledge of samples can be turned into
knowledge about the populations. Recall that statistically a population is the set of
all possible values of a variable. One form of inference is estimation, where a
sample is the basis for predicting the values of population parameters; a second is
significance testing, where sample evidence is used to judge a hypothesis about the
population. Recall the market research example in which it was estimated, at the 95
per cent confidence level, that between 58 per cent and 64 per cent of male toiletries
are purchased by females. The confidence level does not mean
that the population mean is definitely in the range 58–64 per cent.
1200 on which the prediction was based may have been unrepresentative. It was not
known whether this was the case, but, from the variability within the sample, it was
possible to say that it would be expected that 19 out of 20 such examples would
have a mean in the range given. This is the meaning of ‘95 per cent confidence’. A
confidence level attached to a statement is in effect the probability that the state-
ment is true.
All inferences are made at some level of confidence. In the medical example, the
conclusion that the new treatment was no better was stated ‘with 95 per cent
confidence’, meaning that, if the same sample results arose on 100 occasions, it
would be expected that the conclusion ‘no better’ would be correct on 95 of those
occasions. By convention, 95 per cent is the usual confidence level. If the confi-
dence level were set too high, it would be too tough a barrier and very little could
pass it; if set too low it would have little discriminatory ability. The 95 per cent level,
or one in 20, is generally regarded as an acceptable level of risk. But there is no
reason why other levels of confidence should not be used. The method of deriving
confidence levels from sample variability uses the next piece of theory.
[Figure 8.1: (a) a normal distribution of individual values centred on 100 cm; (b) the sampling distribution of the mean, also centred on 100 cm but with a smaller spread.]
To prove these properties are true requires some ‘black box’ mathematics.
If the individual distribution is not normal, the outcome of taking samples is
more surprising. Figure 8.2(a) shows a non-normal distribution. It could be any non-
normal distribution, for example, the number of copies of a local weekly newspaper
read in a year by the inhabitants of the small town that it serves. A random sample
of at least 30 inhabitants is taken and the arithmetic mean of the number of copies
read by the sample is calculated. More samples of the same size are taken, and
eventually the distribution will be as shown in Figure 8.2(b).
[Figure 8.2: (a) the non-normal distribution of copies read per inhabitant (average = 8); (b) the distribution of average copies read per person for samples of more than 30 people, which is approximately normal.]
Provided the sample size is greater than 30, the sampling distribution of the mean
will be approximately normal whatever the shape of the distribution from which the
samples were taken. The ‘30’ is a rule of thumb. If the individual distribution is at all
similar to a normal, then a sample size of fewer than 30 will be enough to make the
sampling distribution of the mean normal. For a distribution only slightly skewed, a
sample size of just four or five may well be sufficient.
The normalisation property is a consequence of the central limit theorem. This
states that as the sample size is increased, the sampling distribution of the mean
becomes progressively more normal. In view of the mathematics involved, only the
results have been presented. The great benefit of this theorem is that, even though
the distribution of a variable is unknown, the taking of samples enables the distribu-
tion of the sample mean to be known. Analysis can then proceed using techniques
and tables associated with the normal distribution.
Example
A manufacturing organisation has a workforce of several thousand. Sickness records
show that the average number of days each employee is off sick is 14 with a standard
deviation of 6. If random samples of 100 employees were taken and the average number
of days of sickness per employee calculated for each sample, what would be the
distribution of these sample averages? What would be its parameters?
The starting distribution is the number of days of sickness for each employee. The shape
of this distribution is not known, but it is likely to be a reverse J (see Figure 8.3(a)). Its
mean is 14 and its standard deviation 6. Moving to the distribution of the average
number of days of sickness per employee of samples of 100 employees (see Fig-
ure 8.3(b)), the new distribution will be normal or nearly normal because the sample
size exceeds 30 (central limit theorem). The parameters will be:
Mean = 14 days
Standard deviation = 6/√100
= 0.6 days
[Figure 8.3: (a) the reverse-J distribution of days' sickness per employee (average = 14, standard deviation = 6); (b) the sampling distribution of average days' sickness per employee for samples of size 100 (average = 14, standard deviation = 0.6).]
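The central limit theorem can also be demonstrated by simulation; in the sketch below a skewed gamma distribution, an illustrative stand-in for the reverse-J sickness distribution, is given mean 14 and standard deviation 6, and the means of repeated samples of 100 are examined.

import random
import statistics

# Sketch of the central limit theorem by simulation. The gamma parameters are
# chosen so that the mean is 14 and the standard deviation 6.
alpha = (14 / 6) ** 2                          # shape
beta = 36 / 14                                 # scale

sample_means = []
for _ in range(2000):                          # 2000 repetitions is an arbitrary choice
    sample = [random.gammavariate(alpha, beta) for _ in range(100)]
    sample_means.append(statistics.mean(sample))

print(round(statistics.mean(sample_means), 2))   # close to 14
print(round(statistics.stdev(sample_means), 2))  # close to 6/sqrt(100) = 0.6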
8.5 Estimation
Estimation is the prediction of the values of population parameters given knowledge
of a sample. The sickness record example of Figure 8.3 can demonstrate the use of
estimation. Since the sampling distribution for samples of size 100 is normal with
mean 14 and standard deviation 0.6, 95 per cent of all such samples will have their
mean in the range 12.8–15.2 days. This follows from the property of normal
distributions that 95 per cent of values lie within ±2 standard deviations of the
mean. In other words, on 95 per cent of occasions (or, at the 95 per cent confidence
level) the sample mean will be within 1.2 days of the population mean.
This shows how a range for sample means can be estimated from the population
mean. However, this is estimation in the wrong direction – from the population to
the sample instead of from the sample to the population. The process can be turned
round to estimate the population mean from a single sample mean. Suppose, as is
usually the case, the population mean is not known but that a sample of 100
employees’ sickness records has an average number of days of sickness per employ-
ee of 11.5. This must be less than 1.2 days away from the unknown population
mean (at 95 per cent confidence level), from the above calculations. This unknown
population mean must then be within 1.2 days of 11.5 (see Figure 8.4).
[Figure 8.4: the unknown population mean must lie within ±1.2 days of the sample mean of 11.5 days.]
(a) Take a random sample of at least 30 observations, so that the central limit
theorem applies and the sample standard deviation is a reasonable approximation
to that of the population. If these two conditions do not hold, a sample smaller
than 30 can sometimes still be used, but some more advanced theory is needed.
(b) Calculate the sample mean (x̄) and the sample standard deviation (s).
(c) The standard deviation of the sampling distribution of the mean (the standard
error) is calculated as s/√n.
(d) The point estimate of the population mean is x̄.
(e) The 95 per cent confidence limits for the population mean are x̄ ± 2s/√n.
Example
A sample of 49 of a brand of light bulb lasted on average 1100 hours before failing. The
standard deviation was 70 hours. Estimate the overall average life length for this brand
of bulb.
Follow the general procedure as set out above:
(a) The sample, which is assumed to be random, has been taken. The size is sufficient
for the central limit theorem to apply and the standard deviation approximation to
be valid.
(b) The mean is 1100 hours, the standard deviation 70 hours.
(c) The standard deviation of the sampling distribution is 70/√49 = 10.
(d) The point estimate of life length is 1100 hours.
(e) The 95 per cent confidence limits are 1080–1120 hours. From the normal distribu-
tion Table A1.2 in Appendix 1, the limits for other confidence levels can be found.
For example, since 90 per cent of a normal distribution lies within ±1.645 standard
deviations of the mean, the 90 per cent confidence limits in this case are:
1100 ± 1.645 × 10
i.e. 1083.55 to 1116.45 hours
Because the central limit theorem applied, these confidence limits could be found
without the need to know anything about the shape of the original distribution. The only
calculations were to find the mean and standard deviation from the sample.
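The whole estimation procedure amounts to a few lines of arithmetic, as the sketch below shows for the light-bulb figures (z = 1.645 for 90 per cent limits and the module's rounded z = 2 for 95 per cent limits).

import math

# Sketch of the light-bulb estimation: point estimate plus confidence limits.
n, sample_mean, sd = 49, 1100, 70
standard_error = sd / math.sqrt(n)             # 10 hours

for level, z in (("90%", 1.645), ("95%", 2.0)):
    lower = sample_mean - z * standard_error
    upper = sample_mean + z * standard_error
    print(level, lower, upper)                 # 1083.55 to 1116.45, and 1080 to 1120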
The ideas of estimation can help in deciding upon sample sizes. In making an estimation,
a particular level of accuracy may be required. The sample size can be chosen in order
to provide that level of accuracy.
Example
In the case of the light bulbs above, what sample size is needed to estimate the average
length of life to within ±5 hours (with 95 per cent confidence)?
Instead of the sample size being known and the confidence limits unknown, the situation
is reversed. At the 95 per cent level:
Confidence limits required = x̄ ± 2s/√n
= 1100 ± 5 hours
1100 ± 5 = 1100 ± 140/√n
i.e. 5 = 140/√n
√n = 28
n = 784
The formula used in this calculation related the accuracy of the estimate to the square
root of the sample size. This is why accuracy becomes progressively more expensive.
For example, a doubling in accuracy requires a quadrupling of sample size (and, presum-
ably, a great increase in expense). The trade-off between accuracy and sample size is
therefore an important one.
other than pure chance must be at work. The significance level is chosen by the
person conducting the test to reflect his or her view of the credible and the in-
credible, but usually and conventionally it is 5 per cent.
(d) Calculate the probability of the sample evidence occurring, under the
assumption that the hypothesis is true. In the example, the probability of the
sample average recovery rate being 10.5 days was calculated, assuming that it
came from a population with a mean recovery rate of 12 days (the average for
the old treatment).
(e) Compare the probability with the significance level. If it is higher, it is
judged consistent with the hypothesis (the sample result is thought to have hap-
pened purely by chance) and the hypothesis is accepted; if it is lower, it is judged
inconsistent with the hypothesis (the sample result is thought too unusual to
have happened purely by chance) and the hypothesis is rejected. When the hy-
pothesis is rejected, the result is said to be significant at the 5 per cent level.
It is rare for evidence to prove conclusively the truth of a proposition. The evi-
dence merely alters the balance of probabilities. Significance tests give a critical but
arbitrary point that divides evidence supporting the proposition from that which
does not. The significance level is this critical point. Its use is an abrupt, black-and-
white method of separation, but it does provide a convention and a framework for
weighing evidence. When a result is reported as being statistically significant at the 5
per cent level, the conclusion has a consistent meaning in whatever situation the test
took place. Occasionally a conclusion may be reported ‘at the 95 per cent confi-
dence level’ instead of ‘at the 5 per cent significance level’. The meaning is the same.
Example
One of the products of a dairy company is 500 g packs of butter. There is some concern
that the production machine may be manufacturing slightly overweight packs. A random
sample of 100 packs is weighed. The average weight is 500.4 g and the standard
deviation is 1.5 g. Is this consistent with the machine being correctly set and producing
packs with an overall average weight of 500 g?
Follow the steps of hypothesis testing.
(a) The hypothesis is that the true population average weight produced by the machine
is 500 g.
(b) The evidence is the sample of 100 packs with average weight 500.4 g and standard
deviation 1.5 g.
(c) Let the significance level be the conventional 5 per cent.
(d) Assuming the hypothesis is true, the sample mean has come from a sampling distribu-
tion of the mean as shown in Figure 8.5. The sampling distribution is normal. Even if
the distribution of the weights of individual packs is not normal, the central limit theo-
rem makes the sampling distribution normal. The mean is 500 (the hypothesis). The
standard deviation is equal to the standard deviation of the individual distribution di-
vided by the square root of the sample size. It is valid to calculate the standard
deviation from the sample because the sample size exceeds 30.
The sample taken had a mean of 500.4. Recall the way in which z values are calculat-
ed for normal distributions. In this situation:
z = (500.4 − 500)/0.15
= 2.67
From the normal curve table given in Appendix 1 (Table A1.2) the associated area
under the normal curve is 0.4962 (see Figure 8.6). The probability of obtaining a
sample result as high as or higher than 500.4 is:
= 0.5 − 0.4962
= 0.0038
= 0.38%
[Figure 8.5: the sampling distribution of the mean under the hypothesis; normal, mean 500 g, standard deviation 0.15 g.]
[Figure 8.6: an area of 0.4962 lies between 500 and 500.4 (z = 2.67); the area in the tail beyond 500.4 is 0.0038.]
[Figure: the critical value approach for the butter-pack test; the critical value is 500.247 g (z = 1.645), leaving 5 per cent of the distribution in the tail beyond it and 45 per cent between it and the mean.]
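The test can be reproduced in a few lines; in the sketch below the phi function replaces Table A1.2, and the figures are those of the butter-pack example.

import math

def phi(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Sketch of the butter-pack significance test: one-tailed probability of a
# sample mean of 500.4 g or more under the hypothesis of a 500 g mean.
n, sample_mean, sd, hypothesised_mean = 100, 500.4, 1.5, 500
standard_error = sd / math.sqrt(n)             # 0.15

z = (sample_mean - hypothesised_mean) / standard_error    # 2.67
p_one_tailed = 1 - phi(z)                                 # about 0.0038
print(round(z, 2), round(p_one_tailed, 4))
print("reject" if p_one_tailed < 0.05 else "accept")      # rejected at the 5 per cent level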
A two-tailed test considers the probability of a sample result as far from the mean as 0.4 g in either direction, not just 500.4 g or higher. This new probability
is double the previous one, since the area in the tail (0.0038) occurs on both sides of
the mean. It is equal to 0.76 per cent. Figure 8.8 summarises the difference between
one- and two-tailed tests. In the two-tailed test this new probability, 0.76 per cent,
would be compared with the 5 per cent significance level.
[Figure 8.8: (a) a one-tailed test, P(sample result) = 0.38%; (b) a two-tailed test, P(sample result) = 0.38% + 0.38% = 0.76%.]
[Figure 8.9: critical values for (a) a one-tailed test, with 5 per cent in one tail, and (b) a two-tailed test, with 2.5 per cent in each tail.]
Using the critical value method of viewing significance tests, we see that there are
two critical values in the two-tailed test. Since the possibility of an extreme result on
either side of the mean is to be taken into account, there must be a critical value on
either side of the mean. However, as the significance level is still 5 per cent, each
critical value must leave half of 5 per cent (= 2.5 per cent) in its tail of the distribu-
tion. This is illustrated in Figure 8.9.
When the area in a tail is 2.5 per cent, the corresponding z value is approximately
2 (95 per cent of a distribution is within ±2 standard deviations of the mean). In the
butter pack example the critical values are therefore at:
500 ± 2 × 0.15
i.e. 499.7 and 500.3
The sample result is compared with the critical values. A result between the val-
ues leads to acceptance of the hypothesis; a result outside the critical values leads to
rejection of the hypothesis.
The decision between one- and two-tailed tests depends upon the hypothesis (i.e.
what you are setting out to test). In the butter example, the production manager of
the organisation would be concerned both if the packs were too heavy (he would be
giving butter away) and if they were too light (he would be infringing trading
standards laws). So he would use a two-tailed test. The null hypothesis would be
that the population mean was 500 g and the alternative hypothesis would be that the
population mean was not 500 g. On the other hand, an official from the trading
standards department of the municipal authority would only be interested if the
packs were too light and customers were being cheated. It would not be his concern
if the organisation was giving customers more than it needed to. So he would use a
one-tailed test. The null hypothesis would be that the population mean is 500 g (or it
could be is 500 g or greater) and the alternative hypothesis would be that the popula-
tion mean is less than 500 g.
To summarise:
(a) If the null hypothesis is true (assuming significance level is 5 per cent):
P(correctly accepting null hypothesis) = 95%
P(erroneously rejecting null hypothesis) = 5% (type 1 error)
(b) If the alternative hypothesis is true:
P(correctly accepting alternative hypothesis) is the power of the test
P(erroneously rejecting alternative hypothesis) = 100% − power (type
2 error)
Example
The manufacturers of a new influenza drug for children claim it will reduce the child’s
temperature by 1 °C within 12 hours. The drug is tried on a random sample of 36
children. The average temperature reduction is 0.8 °C with a standard deviation of
1.4 °C. At the 5 per cent significance level, has the drug performed according to
specification? What are the probabilities of type 1 and type 2 errors? What is the power
of the test? It can be assumed that the distribution of individual temperatures is normal.
Follow the five stages of a significance test:
(a) The null hypothesis is that the drug makes no difference to the temperature; the
alternative hypothesis is that the temperature is reduced by 1 °C.
(b) The sample evidence is the reduction in temperature for 36 children, mean 0.8,
standard deviation 1.4.
(c) The significance level is the conventional 5 per cent.
(d) If we assume the truth of the hypothesis, the sample mean comes from a distribution
as shown in Figure 8.10, with mean 0 and standard deviation 0.23 (1.4/√36). The test
is one-tailed since only a reduction in temperature is considered. The critical value
for a 5 per cent test is, from previous work, at the point z = 1.645 from the mean.
The critical value is therefore at 0.38 (= 1.645 × 0.23).
(e) The sample result, mean = 0.8, is well outside the critical value. The null hypothesis
is rejected and the alternative accepted. The new drug does make a significant differ-
ence to temperature.
The probability of a type 1 error – the probability of wrongly rejecting the null hypothe-
sis – is 5 per cent, the significance level.
The probability of a type 2 error – the probability of wrongly rejecting the alternative
hypothesis (i.e. accepting the null hypothesis when it is false) – needs some calculation
(Figure 8.11). The critical value marks the accept/reject boundary for the null hypothe-
sis. It therefore marks the reject/accept boundary for the alternative. The probability of
wrongly rejecting the alternative hypothesis must then be the area in the tail of the
distribution based on the alternative, as marked by the critical value. This is the
highlighted area in Figure 8.11.
[Figure 8.10: the sampling distribution under the null hypothesis; mean 0, standard error = 1.4/√36 = 0.23.]
[Figure 8.11: the null distribution (mean 0) and the alternative distribution (mean 1), with the critical value at 0.38; the shaded area in the tail of the alternative distribution is the probability of a type 2 error.]
Reducing the sample size increases the probability of a type 2 error. In the above example, if the sample size were 25 instead of
36:
Standard error of sample means = 1.4/√25 = 0.28
Critical value = 0 + 1.645 × 0.28 = 0.46
z value for alternative distribution = (0.46 − 1.0)/0.28 = 1.93
P(type 2 error) = 2.68%
The probabilities of error are now closer to a balance with a smaller, and presumably
cheaper, sample size. Likewise, the sample size could have been maintained at 36 and
the significance level reduced from 5 per cent to obtain a balance.
Note that, in many cases, an alternative hypothesis is not known and it is not possible to
measure the probability of a type 2 error and the power of the test.
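Where an alternative hypothesis is available, the type 2 error calculation can be packaged as a small function; the sketch below repeats it for sample sizes of 36 and 25 using the figures of the influenza-drug example.

import math

def phi(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Sketch of the type 2 error and power for a one-tailed test at the 5% level.
def type2_error(n, sd=1.4, null_mean=0.0, alt_mean=1.0, z_critical=1.645):
    standard_error = sd / math.sqrt(n)
    critical_value = null_mean + z_critical * standard_error
    # probability of a sample mean below the critical value when the
    # alternative hypothesis (a reduction of 1 degree C) is true
    return phi((critical_value - alt_mean) / standard_error)

for n in (36, 25):
    beta = type2_error(n)
    print(n, round(beta, 4), round(1 - beta, 4))   # type 2 error and power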
[Figure 8.12: a one-tailed significance test, with 5 per cent of the distribution in the tail beyond z = 1.645, compared with a confidence interval having 5 per cent in each tail beyond z = ±1.645.]
new treatment. The sample that is not given the new treatment is known as the
control group.
Previous tests have been based on the distribution of sample means. The signifi-
cance test for two samples is based on a new distribution: that of the difference in
sample means. Suppose two samples are taken from a population, their means and
the differences in means recorded. Gradually a distribution is built up. This is the
distribution of the difference in sample means.
The mean of this distribution is 0 since, overall, the difference in means of sam-
ples taken from the same population must average to 0. To calculate the standard
deviation requires the use of the variance sum theorem. It states that, if x and y
are two variables, then the variance of their sum or difference is:
Variance (x + y) = Variance (x) + Variance (y)
Variance (x −y) = Variance (x) + Variance (y)
The proof of the theorem is mathematical. In this case the two variables (x and y
in the formula) are the means of the two samples. Each sample mean has a variance
given by:
Variance (x̄) = Individual variance/Sample size
= V/n
where V = variance of the population from which the samples were drawn.
This expression has previously been met in square root (standard deviation) form
in the context of the sampling distribution of the mean. From the variance sum
theorem:
Variance (x̄1 − x̄2) = V/n1 + V/n2     (8.1)
Usually the sample sizes are chosen to be equal (n1 = n2 = n) and therefore:
Variance (x̄1 − x̄2) = 2V/n     (8.2)
If s is the individual standard deviation (s = √V) then:
Standard deviation (x̄1 − x̄2) = √2 · s/√n
The same five stages of the basic significance test are applicable:
(a) Set a hypothesis. The hypothesis is that the observed difference in sample means
comes from a distribution with mean 0. If it is supposed that there is no differ-
ence between the two samples, the observed difference in the sample means will
have arisen purely by chance.
(b) Collect sample evidence. This will consist of the two samples and their means.
(c) Set the significance level. The conventional 5 per cent is usually chosen. It is
interpreted as before. An observed difference in sample means is supposed to
have arisen purely by chance if its probability exceeds the significance level. The
hypothesis will then be accepted. The difference is supposed to be too unlikely
to have arisen purely by chance if its probability is less than the significance level.
It will then be concluded that the hypothesis is wrong.
(d) Calculate the probability of the sample evidence assuming the truth of the
hypothesis. Normal curve theory is used, for instance, to calculate the probability
that the observed difference in sample means has come from a distribution with
mean 0 and standard deviation √2 · s/√n (as calculated above).
(e) The probability found in (d) is compared with the significance level. If it is
higher, the hypothesis is accepted; if lower, the hypothesis is rejected. The critical
value approach to significance tests could equally well be used.
Example
A supermarket chain wants to find which of two promotions is the more effective.
Promotion A is tried out at a sample of 36 stores over a one-week period; promotion B
is tried out at a sample of 36 similarly sized stores over the same period. For both
samples the increase in sales turnover over the previous week is measured. The average
increase for sample A is £12 000 and for sample B it is £53 000. The standard deviation
of weekly sales turnover at all the stores belonging to the chain is £120 000. Is there a
significant difference between the effectiveness of the two promotions?
Follow the five stages of the two sample significance tests:
(a) The hypothesis is that there is no difference between the two promotions.
(b) The evidence is sample A with mean £12 000 and sample B with mean £53 000.
(c) The significance level is selected to be the usual 5 per cent.
(d) If the hypothesis is true and, given the standard deviation of all weekly sales turno-
vers to be £120 000, the difference in the sample means comes from a distribution of
mean 0 and standard deviation £28 284 (= √2 · 120 000/√36), as shown in Fig-
ure 8.13. The distribution is normal because of the central limit theorem.
The observed difference in sample means is £41 000 (= 53 000 − 12 000). The value
z is (41 000 − 0)/28 284 = 1.45.
From the normal curve table in Appendix 1 (Table A1.2) the area from the mean to
this point is 0.4265. The area in the tail is therefore 0.0735. Since it was possible for
either promotion to prove better than the other, the test is a two-tailed test. The
probability of the sample evidence is thus 2 × 0.0735 = 0.1470 (i.e. 14.7 per cent).
(e) The probability of the sample evidence, 14.7 per cent, exceeds the significance level.
The hypothesis is accepted. There is no evidence that one promotion is significantly
better than the other.
[Figure 8.13: the distribution of the difference in sample means; mean 0, standard error £28 284, with the observed difference of £41 000 at z = 1.45.]
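The two-sample calculation can be sketched as follows, using the promotion figures and the known population standard deviation; the phi function again stands in for Table A1.2.

import math

def phi(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Sketch of the two-sample test for the supermarket promotions, population
# standard deviation known and equal sample sizes of 36.
sd, n = 120_000, 36
mean_a, mean_b = 12_000, 53_000

standard_error = math.sqrt(2) * sd / math.sqrt(n)         # about 28 284
z = (mean_b - mean_a) / standard_error                     # about 1.45
p_two_tailed = 2 * (1 - phi(z))                            # about 0.147
print(round(standard_error), round(z, 2), round(p_two_tailed, 3))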
Pooled estimate of the standard deviation, s = √[((n1 − 1)s1² + (n2 − 1)s2²)/(n1 + n2 − 2)]     (8.3)
The reason for the ‘2’ in the denominator is technical and will be explained later when
the concept of degrees of freedom is introduced. This estimate is then used in formulae
Equation 8.1 and Equation 8.2 as if it were the real standard deviation.
Example
In the supermarket promotion example above, suppose the standard deviation of the
weekly turnovers at all stores is not known. From sample A, s is calculated to be
£95 200 and from sample B, s is calculated to be £140 500. What is the pooled estimate
of the standard deviation?
Using the formula in Equation 8.3 and working in thousands:
s = √[((36 − 1) × 95.2² + (36 − 1) × 140.5²)/(36 + 36 − 2)]
= √[(35 × 9063 + 35 × 19 740)/70]
= √14 402 (approx.)
= 120
The estimated standard deviation is thus £120 000 (approximately). It is used in the
significance test just as in the above example, when £120 000 was assumed to be the
true population standard deviation.
30) with mean 0 and standard deviation 9.6 (= 68/√50). The z value of the ob-
served sample mean is 2.71 (= 26/9.6). From the normal curve table in Appendix 1
(Table A1.2), the associated area is 0.4966. The test is one-tailed since, presumably,
only the possibility of the promotion improving sales is considered. The probability
of the sample evidence is therefore 0.34 per cent.
(e) A probability of 0.34 per cent is well below the significance level. The hypothesis is
rejected. The promotion makes a significant difference to sales turnover.
Example
A political party claims that it has the support of the majority of the electorate. An
opinion poll that questioned 1600 potential voters showed that 48 per cent of them
supported the party. Does this evidence refute the claim?
A significance test will answer the question. The usual five stages are followed.
(a) The null hypothesis is that 50 per cent of the electorate support the party (i.e. the
proportion is 0.5). A one-tailed test is required since, in respect of the task of trying
to refute the claim, we are only interested in whether support is significantly lower
than the 50 per cent mark, not significantly higher. The alternative hypothesis is that
support is less than 50 per cent.
(b) The sample data are the opinions of the 1600 people interviewed.
(c) The conventional 5 per cent significance level is chosen.
(d) Assuming the truth of the hypothesis, the sample mean (the 48 per cent, or a
proportion of 0.48, from the opinion poll) comes from a normal distribution (the
normal can be used since the sample size is large), as shown in Figure 8.14.
[Figure 8.14: the sampling distribution of the sample proportion under the hypothesis; mean 0.5, standard error s = 0.0125, with the critical value at 0.4795 (z = 1.645) leaving 5 per cent in the lower tail.]
(a) In the supermarket promotions example there were many other factors operating to affect sales turnover besides the promotions. The weeks chosen
(close to Christmas or holidays), advertising campaigns and weather would all
influence sales turnover. In addition, the size of stores would govern the magni-
tude of the increase that could be expected. The effect of all these factors must
be minimised when it is the effect of the promotion alone that is to be investi-
gated. Both tests of promotional effectiveness used (paired and unpaired
samples) have their own advantages in this respect. The unpaired sample is based
on just one week and thus eliminates the effect of data being collected in differ-
ent weeks; the paired sample is based on one set of stores and thus eliminates
some of the possibly systematic effects of data being collected at different stores
in different geographical areas.
(b) For many decisions it is simply not possible to collect sample evidence of the
sort that can be used in a significance test. For instance, it is difficult to carry out
trials with new medical drugs on humans. In other cases, it is not possible to
select random samples of sufficient size to satisfy assumptions and apply a test.
(c) Significance tests make no attempt to balance costs. The cost of sampling
compared with the value of the evidence provided has to be handled outside the
statistical method. It may be that the cost of sampling is greater than the benefit
accruing from this information.
(d) A significance test is a black-and-white, all-or-nothing process. If it is followed
rigidly, a probability of sample evidence of 4.9 per cent leads to the rejection of
the hypothesis; a probability of 5.1 per cent leads to acceptance. Clearly the real
difference between the two cases is marginal, yet in terms of the conclusion the
difference is total. When the probability is close to the accept/reject border, it is
preferable to quote the conclusion as ‘The sample evidence leads to the ac-
ceptance of the hypothesis at the 5 per cent level but the probability, at 7.3 per
cent, was close to the significance level.’
(e) The significance tests are based on assumptions, such as a normal distribution, a
sample size exceeding 30, a known population standard deviation and so on. The
assumptions may not be met in practice and the test may therefore be inapplica-
ble. However, some significance tests can still be used even when the
assumptions do not hold exactly. They are said to be robust.
(f) Generally, a good significance test balances type 1 and type 2 errors. The
incorrect rejection and the incorrect acceptance of the hypothesis should have
small and about equal probabilities. Often there is no alternative hypothesis and
the probability of a type 2 error cannot be calculated. The test may therefore be
unbalanced in the sense of imposing accept/reject standards that are too harsh
or too loose for the circumstances.
Situations do arise, however, when the equality of the probabilities of type 1 and
type 2 errors is not desirable. When the cost of a type 1 error is very different
from that of a type 2 error, it may be better to weight the probabilities according
to the costs.
Learning Summary
Statistical inference belongs to the realms of traditional statistical theory. Its
relevance lies in its applicability to specialised management tasks, such as quality
control and market research. Most managers would find that it can only occasionally
be applied directly to general management problems. Its major value is that it
encompasses ideas and concepts that enable problems to be viewed in broader and
more structured ways.
Two areas have been discussed, estimation and significance testing. New theory –
confidence levels, the sampling distribution of the mean, the central limit theorem
and the variance sum theorem – has been introduced.
The conceptual contribution that estimation makes is to concentrate attention on
the range of a business forecast rather than merely the point estimate. To take a
previous market research example, the estimate that 61 per cent of male toiletries
are purchased by females sounds fine. But what is the accuracy of the estimate? The
61 per cent is no more than the most likely value. How different could the true
value be from 61 per cent? If it can be said with near certainty (95 per cent confi-
dence) that the percentage is between 58 per cent and 64 per cent, then the estimate
is a good one on which decisions may be reliably based. If the range is 8 per cent to
88 per cent, then there must be doubts about its usefulness for decision making.
Surprisingly, the confidence limits of business forecasts are often reported with little
emphasis, or not reported at all.
The second area considered was significance testing. It is concerned with distin-
guishing real from apparent differences. The discrepancy between a sample mean
and what is thought to be the mean of the whole population is judged in the context
of inherent variation. An apparent difference is one that is likely to have arisen
purely by chance because of the inherent variation; a real difference is one that is
unlikely to have arisen purely by chance and some other explanation (i.e. that the
hypothesis is untrue) is supposed. A significance level draws a dividing line between
the two. The dividing line marks an abrupt border. In practice, extra care is exer-
cised over samples falling in the grey areas immediately on either side of the border.
A number of significance tests have been introduced and it can be difficult to
know which one to use. To illustrate the different circumstances in which each is
appropriate, a medical example will be used in which a new treatment for reducing
cholesterol levels is being tried out. Country-wide records are available showing that
the existing treatment on average reduces cholesterol levels by five units.
The three types of test described in the module are:
(a) Single sample. This is the basic significance test described in Section 8.6.
Evidence from one sample is used to test a hypothesis relating to the population
from which it has come. For example, to show that the new cholesterol treat-
ment was more effective than the existing treatment, the hypothesis would be
that the new treatment was no more effective than the old (i.e. it reduced choles-
terol levels by five units on average). A representative sample of patients would
be given the new treatment and the average reduction in cholesterol measured.
This would be compared with the hypothesised population figure of five units.
(b) Two independent samples (Section 8.7.1). Two independently drawn samples
are compared, usually with the hypothesis that there is no difference between
them. For example, in trying out the new cholesterol treatment there might be
some doubt about the accuracy of the country-wide data on which the hypothe-
sis was based. One way to get round the problem would be to use two samples.
The first would be a sample of patients to whom the new treatment had been
given and the second a ‘control’ sample of patients to whom the old treatment
was given. As before, the hypothesis would be that the new treatment was no
better than the old. The average reduction measured for the first sample would
be compared to that for the second to test whether the evidence supported this.
(c) Paired samples (Section 8.7.2). Two samples are compared but they are not
drawn independently. Each observation in one sample has a ‘partner’ in the oth-
er. For example, instead of testing the ultimate effect of the new treatment in
reducing cholesterol levels, it might be helpful to know whether it worked quick-
ly, taking effect within, say, three days. To do this the new treatment would be
given to a single sample of patients. Their cholesterol levels would be measured
at the outset and again three days later. There would then be two samples, the
first of cholesterol levels at the outset and the second of levels three days later.
However, each observation in one sample would be paired with one in the other
– paired because the two observations would relate to the same patient. The
hypothesis would be that the treatment had made no difference to cholesterol
levels after three days. As described in Section 8.7.2, the significance test would
be carried out by forming a new sample from the difference in cholesterol levels
for each patient and testing whether the average for the new sample could have
come from a population of mean zero.
If two independent samples had been used (i.e. the two samples contained differ-
ent patients, as for the independent samples above), and the cholesterol levels had
been measured for one sample at the outset and for the second three days later, any
difference in cholesterol levels might be accounted for by the characteristics of the
patients rather than the treatment.
In deciding how to conduct a significance test, there are three other factors to
consider. First, the test can be conducted with probabilities or critical values. This is
purely a matter of preference for the tester – both would produce the same result
(see Section 8.6.1). Second, the test can be one-tailed or two-tailed. This decision is
not a matter of preference; it depends upon the purpose of the test and what
outcome is wanted (Section 8.6.2). Third, the test could use data in the form of
proportions. This depends on the nature of the data, whether proportional or not
(Section 8.6.3).
Both estimation and significance testing can improve the way a manager thinks
about particular types of numerical problems. Moreover, they help to show the
manager what to look for in a management report: does an estimate or forecast also
include a measure of accuracy? In making comparisons, are the differences real or
apparent? From the point of view of day-to-day management, this is where their
importance lies.
Review Questions
8.1 Which is correct? Statistical inference is a set of methods for:
A. collecting samples.
B. estimating sample characteristics from parameters.
C. using sample information to make statements about populations.
D. designing statistical experiments.
8.2 Which is correct? Statistical inferences have confidence levels associated with them
because:
A. statistical data cannot be measured accurately.
B. alternative hypotheses may not be known.
C. they are made from samples that may be unrepresentative.
D. circumstances that apply when the sample was selected may change.
Questions 8.3 to 8.5 refer to the following random sample of nine observa-
tions collected from a normally distributed population:
7, 4, 9, 2, 8, 6, 8, 1, 9
8.3 What is the standard deviation of the sampling distribution of the mean of samples of
size 9?
A. 1
B. 3
C. 0.94
D. 1/3
8.5 What are the 90 per cent confidence limits for the population mean (approximately)?
A. 4 to 8
B. 4.4 to 7.6
C. 1.1 to 10.9
D. 0 to 12
8.6 The mean of a normally distributed population is to be estimated to within ±20 at the
95 per cent confidence level. The standard deviation is 150. What should the sample
size be?
A. 225
B. 250
C. 15
D. 56
8.7 A significance test is a method for deciding whether sample evidence is sufficiently
strong to prove the truth of a hypothesis. True or false?
8.8 Which is correct? A significance level is usually set at 5 per cent because:
A. through time 5 per cent has come to be accepted as the norm.
B. 5 per cent is the definition of significance.
C. 5 per cent marks the boundary between an acceptable risk and an unacceptable
risk.
D. 5 per cent significance implies 95 per cent confidence.
8.9 Critical values can be used in both two-tailed significance tests and one-tailed
significance tests. True or false?
Questions 8.10 to 8.12 refer to the following situation:
In a significance test a random sample of size 64 with mean 9.87 and standard devia-
tion 48 is taken from a normal population. The hypothesis is that the population mean is
0.
8.11 If the alternative hypothesis is that the population mean is 20, what is the probability of
a type 2 error under a significance test at the 5 per cent level? (Note that with this
alternative hypothesis the test must be one-tailed.)
A. 5%
B. 18.36%
C. 4.55%
D. 9.18%
8.12 If the alternative hypothesis is that the population mean is 20, what is the power of the
5 per cent significance test?
A. 95%
B. 95.55%
C. 90.82%
D. 95.45%
8.13 What should be the null hypothesis in a significance test to determine whether the plea
met its objective? The mean increase between the two three-month periods is:
A. 0
B. £2500
C. £7500
D. £250 000
8.15 A significance test at the 5 per cent level would proceed by calculating a pooled
estimate of the standard error, determining a critical value at 1.645 standard errors
below £7500 and comparing the observed difference in means (£5500 = £175 000 −
£169 500) with the critical value. True or false?
a. Describe the 5 per cent significance test you would apply to these data to determine
whether the new scheme has significantly raised outputs.
b. What conclusion does the test lead to?
c. What reservations have you about this result?
d. Suppose it has been calculated that, in order for Titan to break even, the average
output must increase by £5000. If this figure is the alternative hypothesis, what is:
i. the probability of a type 1 error?
More Distributions
Contents
9.1 Introduction.............................................................................................9/1
9.2 The Poisson Distribution .......................................................................9/2
9.3 Degrees of Freedom ...............................................................................9/7
9.4 t-Distribution ...........................................................................................9/8
9.5 Chi-squared Distribution .................................................................... 9/14
9.6 F-Distribution ....................................................................................... 9/19
9.7 Other Distributions ............................................................................. 9/22
Learning Summary ......................................................................................... 9/23
Review Questions ........................................................................................... 9/25
Case Study 9.1: Aircraft Accidents ............................................................... 9/28
Case Study 9.2: Police Vehicles..................................................................... 9/28
Learning Objectives
By the end of this module the reader should be more aware of the very wide range
of standard distributions that are available as well as their applications in statistical
inference. Two standard distributions, the binomial and normal, and statistical
inference were the subjects of the previous two modules. Those fundamental
concepts are amplified and extended in this module. More standard distributions
relating to a variety of theoretical situations and their use in estimation and signifi-
cance tests are described.
The module covers some advanced material and may be omitted the first time
through the course.
9.1 Introduction
Statistical inference is the collection of methods by which sample data can be turned
into more general information about a population. There are two main types of
inference, estimation and significance testing. The former is concerned with
predicting confidence intervals for parameters, the latter with judging whether
sample evidence is consistent with a hypothesis.
The ideas of statistical inference are particularly useful when used in conjunction
with standard distributions (sometimes called theoretical or probability distribu-
tions). Recall how a standard distribution arises. Each standard distribution has been
constructed theoretically from a situation in which data are generated. The situation
has characteristics that can be expressed mathematically. This enables the probabili-
ties of the variable taking different values to be calculated by the a priori method of
measurement. These probabilities form the standard distribution (cf. observed
distributions, which are based on data collection and for which the probabilities are
calculated by the frequency method of probability measurement).
These concepts have been discussed previously, in the context of the binomial
and normal distributions. The purpose now is to extend the discussion to some
more standard distributions. Their application to statistical inference in some
practical situations will be described. The first of these standard distributions is the
Poisson.
9.2 The Poisson Distribution
9.2.1 Characteristics
The Poisson is a discrete distribution. Its shape varies from right-skewed to almost
symmetrical as in Figure 9.1.
Like the binomial, the Poisson relates to samples whose elements are of two types: the ‘occurrence of events’ (e.g. road accidents that happened) and the ‘non-occurrence of events’ (e.g. road
accidents that could have happened but did not). The difference is that for the
Poisson the total number of elements in the sample is not known. Whereas the
occurrence of events in the sample can be counted, the non-occurrence of events
cannot. Because the number of events that could have occurred but did not is
infinite, the sample size is in effect infinite. The Poisson distribution provides the
probabilities that given numbers of events occur within the sample (usually but not
always a period of time). Compare this with the binomial distribution, whereby the
probabilities that the sample contains given numbers of elements of one type are
calculated. The mathematics of the Poisson are based on those of the binomial but
changed to allow for an infinite sample size.
A typical application is the arrival of telephone calls at a switchboard. The sample
is a period of time; the elements of the sample are the arrival or non-arrival of calls.
Since, potentially, an infinite number of calls could arrive during the time period, the
sample size is, in effect, infinite. The number of calls that do arrive can be counted,
but there remains an uncountable number of calls that did not arrive. Given an
average arrival rate of calls (the average number of calls per time period), the
probability that any number of calls will arrive can be calculated. For instance,
knowing that 10 calls per minute arrive on average, the Poisson gives the probabili-
ties that in any minute 0, 1, 2, 3, 4, … calls will arrive, leading to distribution shapes
like those in Figure 9.1. The probabilities can be found in Poisson distribution tables
such as those in Appendix 1, Table A1.3, the use of which will be explained later.
The purpose of such an analysis might be to determine the most efficient capacity
of the switchboard.
Other applications of the Poisson are:
(a) Flaws in telegraph cable (there are only a finite number of flaws in a given length
of cable, but an infinite number could potentially have occurred though did not)
– here the continuum is not time but cable.
(b) Mechanical breakdowns of machinery, cars, etc.
(c) Clerical errors.
9.2.3 Deriving the Poisson
The Poisson distribution is derived from the binomial. The starting point is the
binomial formula for probabilities:
P(r of type 1) = nCr · p^r · (1 − p)^(n−r)
If the sample size, n, increases indefinitely while np (the average number of the
first type per sample) remains at some constant level, λ (the Greek letter lambda),
then the formula becomes (after some not inconsiderable ‘black box’ mathematics):
P(x events) = e^(−λ) · λ^x / x!
The parameter of the distribution is λ, the average number of events per sample;
e is a constant, equal to 2.718…, which just happens to occur in certain mathemati-
cal situations (cf. the way π, equal to 3.141… , just happens to occur as the ratio
between the circumference and diameter of a circle).
This formula is the method by which the probabilities in Table A1.3, Appendix 1,
are calculated. It can be applied to the switchboard example above to calculate the
probabilities of given numbers of telephone calls per time period. Suppose the
average number of calls per minute is two (i.e. λ = 2). From tables or a calculator, e^(−2)
= 0.135. Thus:
P(0 calls) = 0.135 × 1/1 = 0.135
P(1 call) = 0.135 × 2/1 = 0.27
and so on.
As with the binomial, it is sometimes easier, with the aid of a calculator, to use
the formula than the lengthy tables, especially when it is noted from the Poisson
formula that:
P(x + 1) = P(x) × λ/(x + 1)
For example:
P(x = 1) = P(x = 0) × λ
P(x = 2) = P(x = 1) × λ/2
P(x = 3) = P(x = 2) × λ/3 etc.
The important assumption is that the sample is taken at random, just as for the
binomial. In the switchboard example, if the average arrival rate of calls refers to the
whole day, then a sample time period taken at a particularly busy time of day would
violate the assumption.
The required probabilities can be read off Table A1.3 in Appendix 1. For this example
the average number of calls per minute is two. The column associated with this value
gives the probabilities shown in Table 9.1. The table shows that there will be four or
fewer calls 94.7 per cent of the time. A switchboard of capacity four will therefore be
able to handle incoming calls (approximately) 95 per cent of the time.
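A few lines of Python (illustrative only, using the recurrence relation given above) confirm these figures:

from math import exp

lam = 2.0                       # average of two calls per minute
p = exp(-lam)                   # P(0 calls) = 0.135
cumulative = p
for x in range(4):              # accumulate P(0) + P(1) + ... + P(4)
    p = p * lam / (x + 1)       # P(x + 1) = P(x) * lambda / (x + 1)
    cumulative += p
print(round(cumulative, 3))     # about 0.947: capacity four copes roughly 95% of the time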
9.2.5 Parameters
The Poisson has just one parameter, the average occurrence of events (the average
number of calls per minute in the above example). Once this parameter, λ, is known,
the shape of the distribution is fixed exactly. This can be verified from the Poisson
formula. Once λ is fixed, the probabilities are fixed and the distribution shape will
be fixed. If λ takes a different value, the probabilities will be different, as will the
shape, although it will nevertheless have a general profile in accordance with
Figure 9.1.
for convenience. The binomial tables and probability formula are lengthy and
tedious to use. The normal curve table is much easier to use.
In other circumstances the binomial can be approximated by the Poisson. Be-
cause of the way the derivations of the distributions are linked, the binomial can be
approximated by the Poisson when the sample is large and the proportion, p, is
small. As a rule of thumb, the approximation will give good results if n > 20 and p <
0.05. The parameter of the approximating Poisson is easily found since λ is defined
as being equal to the mean of the binomial, np.
Example
At peak hours, 15 000 cars per hour use the traffic tunnel at a major international
airport. From historical records it has been calculated that the probability of any one car
breaking down in the tunnel is 0.00003. Should three cars break down, it is unlikely that
emergency services will be able to deal with the situation; traffic will come to a stand-
still, flights will be missed and ulcers will be activated. What is the probability that
emergency services will be able to cope during a peak hour (i.e. that there will be no
more than two car breakdowns in the tunnel)?
The population is split into car journeys involving a breakdown in the tunnel and car
journeys not resulting in a breakdown in the tunnel. A sample (15 000 cars in one hour)
is taken from the population. The binomial applies to situations like this. The parameters
are n =15 000 and p = 0.00003. Using the binomial probability formula:
P(0 breakdowns) = P(x = 0)
= 15000C0 · (0.00003)^0 · (0.99997)^15000
P(1 breakdown) = P(x = 1)
= 15000C1 · (0.00003)^1 · (0.99997)^14999
P(2 breakdowns) = P(x = 2)
= 15000C2 · (0.00003)^2 · (0.99997)^14998
The probability of there being no more than two breakdowns has to be found.
P(0, 1 or 2 breakdowns) = P(x = 0) + P(x = 1) + P(x = 2)
The calculations are possible but, to most people, daunting. Since n > 20 and p < 0.05,
the Poisson approximation can be used.
λ = np = 15 000 × 0.00003
= 0.45
The Poisson formula is:
P(x) = e^(−λ) · λ^x / x!
and
P(x + 1) = P(x) × λ/(x + 1)
Therefore:
P(x = 0) = e^(−0.45) = 0.64
P(x = 1) = 0.64 × 0.45 = 0.29
P(x = 2) = 0.29 × 0.45/2 = 0.06
(at most 2 breakdowns) = 0.64 + 0.29 + 0.06
= 0.99
Therefore, the breakdown services are likely to find themselves unable to cope one
hour in a hundred.
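As a rough check, both the exact binomial calculation and the Poisson approximation can be evaluated directly (a Python sketch with illustrative variable names):

from math import comb, exp, factorial

n, p = 15_000, 0.00003
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(3))   # binomial P(0, 1 or 2)

lam = n * p                                                           # 0.45
approx = sum(exp(-lam) * lam**k / factorial(k) for k in range(3))     # Poisson P(0, 1 or 2)
print(round(exact, 4), round(approx, 4))                              # both roughly 0.989

The two answers agree to two decimal places, which is why the far simpler Poisson calculation is preferred.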
The first n − 1 observations can take values dependent on their random selection, but the last cannot. The last observa-
tion must take a particular value if the mean is to equal the value used in calculating
the previous n − 1 deviations. Only n − 1 deviations are therefore free to vary in the
estimate of the standard deviation, and it has n − 1 degrees of freedom.
The fact that the standard deviation estimate has n − 1 degrees of freedom lies
behind the use of n − 1 as the denominator in the formula. The averaging can be
thought of as being based on the degrees of freedom, not the sample size.
To summarise, the degrees of freedom associated with the estimate of a parame-
ter is the sample size minus the number of observations ‘used up’ because of the
need to measure other statistics (such as the arithmetic mean) before the estimate
can be made. In some cases (e.g. arithmetic mean), the degrees of freedom are equal
to the sample size; in other cases (e.g. the standard deviation), the degrees of
freedom are equal to the sample size minus one; in yet other cases to be met later,
the adjustment to the sample size is more than one.
9.4 t-Distribution
When uses of the sampling distribution of the mean were discussed in the previous
module, there was a restriction on the sample size in certain circumstances. If the
underlying individual distribution was non-normal, the sampling distribution was
normal only when the sample size exceeded 30; the standard deviation could be
estimated from the sample only when the sample size again exceeded 30. The t-
distribution overcomes the second of these restrictions and allows small sample
work to be done.
9.4.1 Characteristics
The t-distribution is similar to a normal distribution except that it tends to have
longer tails. It is continuous. The shape is symmetrical and varies, as in Figure 9.2.
For small sample sizes the tails are considerably longer than those of the normal; for
sample sizes of 30 or more, to a very good approximation, the t-distribution and the
normal coincide.
[Figure: the t-distribution compared with the normal; the t has longer tails, and t0.025 marks the value leaving 2.5 per cent in the upper tail.]
The 2 in the expression x̄ ± 2 · s/√n is just the z value that leaves 2.5 per cent in one tail of a normal distribution, because 95 per cent of a normal distribution lies between ±2
standard deviations of the mean. When the distribution is the t, then, instead of z =
2, the formula uses t0.025. The 95 per cent confidence limits for a population mean
when the t-distribution is used are:
x̄ ± t0.025 · s/√n
This t value will vary according to the sample size, unlike the z value, which does
not change with the sample size. Different confidence levels can be substituted. The
90 per cent confidence limits would have t0.05 in place of t0.025.
Example
A sample of 20 brand X light bulbs is tested to destruction. Their average life length is
1080 hours and the standard deviation of the sample is 60 hours. What are the 95 per
cent confidence limits of the average life length of all brand X bulbs?
Follow the five stages of the estimation procedure introduced in the last module.
(a) A random sample of size 20 has been taken. The sample size is not large enough for
the central limit theorem to be used. The t-distribution will apply, but only if it is
known or can be assumed that the individual distribution of bulb life lengths is nor-
mal.
(b) The mean of the sample has been calculated to be 1080 hours, the standard
deviation to be 60 hours.
(c) The standard error of the sampling distribution is:
Individual standard deviation/√Sample size = 60/√20
= 13.42
(d) The point estimate of the population mean (all brand X bulbs) is 1080 hours.
(e) 95 per cent confidence limits are given by:
x̄ ± t0.025 · s/√n
For 19 (= 20 − 1) degrees of freedom, t0.025 is 2.093 (from Table A1.4 in Appendix
1). Therefore the limits are:
= 1080 ± 2.093 × 60/√20
= 1080 ± 125.58/4.47
= 1080 ± 28.09
= 1052 to 1108
The 90 per cent limits use 1.729 (see Table A1.4 in Appendix 1) instead of 2.093 and
so are:
= 1080 ± 1.729 × 60/√20
= 1080 ± 23.2
= 1057 to 1103
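The same limits can be reproduced with a short sketch, assuming the scipy library is available to supply the t values:

from math import sqrt
from scipy.stats import t

n, mean, s = 20, 1080, 60
se = s / sqrt(n)                              # 13.42
t95 = t.ppf(0.975, df=n - 1)                  # 2.093 for 19 degrees of freedom
t90 = t.ppf(0.95, df=n - 1)                   # 1.729
print(mean - t95 * se, mean + t95 * se)       # roughly 1052 to 1108
print(mean - t90 * se, mean + t90 * se)       # roughly 1057 to 1103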
In significance testing as well as estimation, the layout of t tables requires t values to be
used in a slightly different way to z. Instead of probabilities or critical values being
compared, an observed t value is compared to a critical t value. The approach is very
similar to the critical value approach to significance tests. The five stages of significance
test become:
(a) Specify the hypothesis.
Observed t = (x̄ − μ)/(s/√n)
= −15/8.85
= −1.69
(e) The observed t value is −1.69. Note that, since the t-distribution is symmetrical, a
negative t value causes no difficulties. The theoretical t value obtained from Ta-
ble A1.4 in Appendix 1 and appropriate to the significance level depends on whether
the test is one- or two-tailed. Since the machines are for hardening metal, only undu-
ly low temperatures are of concern to the engineering company. Assuming this is the
case, the test is one-tailed. From the table, the t value is in the row corresponding to
9 degrees of freedom and the column corresponding to 5 per cent in the tail (i.e. the
column headed 0.05). This value is 1.833. The observed value is less than 1.833 and
thus the sample evidence has a probability greater than 5 per cent (see Figure 9.4).
The hypothesis is accepted. The machines are performing according to specification.
[Figure 9.4: the one-tailed t test; the critical value t0.05 = 1.833 leaves 5 per cent in the tail, and the observed sample t of 1.69 lies inside the acceptance region.]
9.4.5 Parameters
When the sample size exceeds 30, the sampling distribution of the mean is almost
normal and so has two parameters, the arithmetic mean and the standard deviation.
The t-distribution stems from this but has one extra parameter, the degrees of
freedom. The t-distribution, therefore, has three parameters in total. This can be
verified by noting that once mean, standard deviation and degrees of freedom are
specified, the distribution probabilities are fixed.
Since the sample sizes for which the t-distribution is used are always less than 30, the central limit theorem can
never be invoked and therefore the normality assumption is important. This is
especially true, of course, when the individual distribution is far from normal in
shape.
The check that should be made is that the individual distribution is indeed nor-
mal. This may be done by following the usual procedure of taking a small sample
and comparing observed with expected frequencies.
Four standard distributions have been encountered so far: binomial, normal,
Poisson and t. They are probably the most widely used and for that reason have
been covered in some detail. Many more standard distributions are available but
their range of application tends to be limited and specialised. They can also be
mathematically complex. For these reasons the distributions that follow are de-
scribed in less detail, and we concentrate on how and where they can be applied.
9.5 Chi-squared Distribution
9.5.1 Characteristics
Chi-squared, the variable of the distribution, is defined as:
χ² = (n − 1) × Observed sample variance/Population variance
The (n − 1) is the degrees of freedom of the chi-squared distribution. The sam-
ple size is n, and one degree of freedom is lost because an estimate of the mean is
required for the calculation of the sample variance, just as with the t-distribution.
The distribution can take on a variety of shapes, as shown in Figure 9.5. As the
degrees of freedom approach 30, the shape becomes more and more like that of the
normal distribution.
[Figure 9.5: chi-squared distributions for k = 2, 5 and 20 degrees of freedom.]
As usual with sampling distributions, in practice only one sample is gathered and
the theoretically derived distribution used to calculate probabilities as part of
estimation procedures or significance tests.
The assumption, common to all sampling distributions, that the sample has been
taken at random also applies to the chi-squared distribution. A second important
assumption ceases to be important when, as a rule of thumb, the sample size
exceeds 30. Just as with the t-distribution, the chi-squared becomes approximately
normal as the sample size approaches 30. The same restrictions on normality and
sample size apply therefore to the chi-squared as to the t-distribution. To summa-
rise, the assumptions are either that the sample has been taken at random from a
normal population or that, if the population is non-normal, the sample size is 30 or
more.
Estimation is based on finding confidence limits for chi-squared and then trans-
forming these into variances. For example, at the 95 per cent level, there is 95 per
cent probability that a chi-squared observed from a sample will fall between:
χ²0.975 and χ²0.025 (the values leaving 97.5 per cent and 2.5 per cent respectively in the upper tail)
Thus:
χ²0.975 < (n − 1) × Sample variance/Population variance < χ²0.025
Inverting this expression (and therefore also changing < into >), with s² as usual referring to the sample variance, gives:
1/χ²0.975 > Population variance/((n − 1) × s²) > 1/χ²0.025
(n − 1)s²/χ²0.975 > Population variance > (n − 1)s²/χ²0.025    (9.1)
This expression gives the 95 per cent confidence limits for a population variance.
Example
A random sample of 10 is taken from a normal distribution. The variance of the sample
is six. What are the 90 per cent confidence limits for the population variance?
From Equation 9.1:
(n − 1)s²/χ²0.95 > Population variance > (n − 1)s²/χ²0.05
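Assuming scipy is available for the chi-squared percentage points, the limits can be evaluated as follows (note that scipy's ppf function takes the area below the point, whereas the subscripts above refer to the area in the upper tail):

from scipy.stats import chi2

n, sample_var = 10, 6
upper = (n - 1) * sample_var / chi2.ppf(0.05, df=n - 1)    # 54 / 3.33, about 16.2
lower = (n - 1) * sample_var / chi2.ppf(0.95, df=n - 1)    # 54 / 16.92, about 3.2
print(round(lower, 1), round(upper, 1))                    # 90% limits: roughly 3.2 to 16.2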
In any significance test of this question the null hypothesis would be that seniori-
ty is independent of gender (i.e. there are as many men and women at each
management level as would be expected given the numbers of men and women
employed by the agency). So, as a first step, it would be useful to know what should
be expected.
Given that 28 out of 50 employees are men, it would be expected (no gender
differences, remember!) that 28/50 directors are men. Likewise 22/50 directors
should be women. But there are 10 directors and therefore 28/50 of 10 = 5.6 would
be men, setting aside concerns about what 0.6 of a director means. This would mean
4.4 women directors. Table 9.3 has been developed from Table 9.2 to include these
expected numbers.
It rather looks from Table 9.3 as if women are under-represented at the higher
levels and over-represented at the lower, but how can this impression be tested
statistically? The key is that the statistic:
∑ (Observed − Expected)²/Expected
over all cells of the table, follows, approximately, a chi-squared distribution. Some
statistical theory, which we will leave in the ‘black box’, is needed to demonstrate
this. The degrees of freedom are:
(Number of rows − 1) × (Number of columns − 1)
Example
A significance test for the advertising agency follows the usual five stages.
(a) The null hypothesis is that of independence: the proportions of male and female
managers at each management level are in proportion to the numbers of men and
women employed by the agency.
(b) The data are the evidence collected from personnel records and shown in Table 9.2.
(c) The conventional 5 per cent significance level is used.
(d) The observed chi-squared value is calculated:
χ² = (7 − 5.6)²/5.6 + (12 − 11.2)²/11.2 + (9 − 11.2)²/11.2 + (3 − 4.4)²/4.4
+ (8 − 8.8)²/8.8 + (11 − 8.8)²/8.8
= 1.96/5.6 + 0.64/11.2 + 4.84/11.2 + 1.96/4.4 + 0.64/8.8 + 4.84/8.8
= 1.91
(e) The test is one-tailed because the purpose of the test is to determine whether
observed and expected are significantly different, i.e. to determine whether chi-squared is unusually high; it would be of no interest in addressing the hypothesis if chi-squared were unusually low. The conventional 5 per cent significance level means, for a one-tailed test, that the value of chi-squared that leaves 5 per cent in
the tail of the distribution must be found. The degrees of freedom are:
(Number of rows − 1) × (Number of columns − 1) = 1 × 2 = 2
From the chi-squared table (Table A1.5 in the Appendix 1), the value of chi-squared
leaving 5 per cent in the tail and relating to two degrees of freedom is 5.99.
The observed chi-squared is much smaller than this and therefore the hypothesis must
be accepted: level of seniority and gender are independent. There are no significant
gender differences in the levels of seniority. The point is that, in this case, although there
are differences and women do appear to be under-represented, these differences are
not statistically significant.
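For comparison, scipy's contingency-table routine reproduces both the expected frequencies and the observed chi-squared. The layout of the observed counts (men and women across the three management levels) is taken from the calculation above:

from scipy.stats import chi2_contingency

observed = [[7, 12, 9],      # men across the three levels (directors first)
            [3, 8, 11]]      # women, in the same order
chi_sq, p_value, dof, expected = chi2_contingency(observed)
print(round(chi_sq, 2), dof)     # 1.91 with 2 degrees of freedom
print(round(p_value, 2))         # about 0.39, well above 0.05: the hypothesis is accepted
print(expected)                  # 5.6, 11.2, 11.2 and 4.4, 8.8, 8.8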
There is a potential difficulty with this test. The above formula for chi-squared is an
approximation to a true chi-squared. It is akin to approximating the binomial distribu-
tion with the normal. In this case the approximation to chi-squared is only valid if the
sample size is large. As a rule of thumb, the expected frequency in each cell of the table
should be around five or more. If it is not, a correction should be made to the formula.
In the above example none of the cells had expected frequencies much below five and
so the correction was not made. The nature of the correction goes beyond the scope of
this module and is not described.
This approach, of comparing expected and observed frequencies to test for pro-
portions in a table, is also the basis for testing the goodness of fit of distributions to
observed data. For example, it is possible to test whether a sample of data follows a
standard distribution such as the normal or binomial distribution. The expected
frequencies are what would be expected theoretically from statistical tables of the
distribution. For instance, 34 per cent of the distribution lies between the mean and
+1 standard deviation. The observed frequencies are what is observed in the sample
to lie, for instance, between the mean and +1 standard deviation. The above
approximation is used to calculate an observed chi-squared and this is compared to
the critical value for the specified significance level found in chi-squared tables. This
application of chi-squared is important to statisticians but less so to managers and
for this reason is not covered in detail.
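A sketch of the idea, using purely illustrative frequencies rather than data from the text, might look like this:

from scipy.stats import chi2

observed = [14, 38, 30, 18]          # illustrative counts only, grouped into four bands
expected = [16, 34, 34, 16]          # what a normal curve predicts for those four bands
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
critical = chi2.ppf(0.95, df=len(observed) - 1)   # fewer df if mean/sd were estimated from the data
print(round(chi_sq, 2), round(critical, 2))       # 1.44 against 7.81: the normal fit is not rejected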
9.6 F-Distribution
When location (usually measured by the arithmetic mean) is the subject of analysis,
the normal and t-distributions can be used both to compare a sample mean with a
population mean and to compare the mean of one sample with that of another.
When scatter is the subject of analysis, the chi-squared distribution can be used to
compare a sample variance with a population variance. The F-distribution com-
pletes the picture as regards scatter. It is used to compare the variance of one
sample with that of a second. The variable of an F-distribution is the ratio between
two variance estimates. Just as the location of two samples could be compared
through the difference in their means (by applying a normal or t test), so the scatter
of two samples can be compared through the ratio of their variances (by applying an
F test).
9.6.1 Characteristics
The variable of the F-distribution is defined as:
F = Variance of sample 1/Variance of sample 2
There is a pair of degrees of freedom associated with the F-distribution, one for
the sample variance in the numerator (the upper part of the ratio) and one for the
sample variance in the denominator (the lower part of the ratio). Each of these
degrees of freedom is the sample size minus one, the one being lost for the usual
reason that in calculating a variance the arithmetic mean must also be calculated
from the sample. For example, if the two samples had 17 and 13 observations
respectively then the F-distribution would be said to have (16,12) degrees of
freedom. The distribution can take on a variety of shapes, as shown in Figure 9.7.
The shapes are right-skewed. The lower the degrees of freedom, the more skewed
the distribution is likely to be, since estimates of variances will be more uncertain.
[Figure 9.7: F-distributions for (8,8), (20,8) and (30,30) degrees of freedom.]
Two samples are taken at random from a normally distributed population. The variances of the two samples, s1² and s2², are calculated. The ratio between the variances, F = s1²/s2², is also calculated. More pairs of samples are taken and F
calculated for each pair. The distribution of these ratios is the F-distribution.
[Figure: the critical values F0.05 and F0.01 leave 5 per cent and 1 per cent respectively in the upper tail of the F-distribution.]
Example
A company buys resistors from two suppliers, A and B. The resistors are rated at 3 ohms, but inevitably not all are of exactly 3 ohms. From past experience it is thought
that each supplier’s resistors are normally distributed about a mean of 3 ohms. A quality
control scheme based on sampling and significance testing continually checks that the
average resistance is 3 ohms. The evidence is that the average is 3 ohms for both
suppliers. However, recent customer complaints have begun to suggest that there may
be a difference in quality between the two suppliers’ products in that the variation in
resistance is greater for Supplier A than Supplier B. A sample of 12 resistors is taken at
random from A’s supply and the variance calculated to be 0.042; a sample of 18 is taken
from B’s supply and the variance calculated to be 0.017. Is there a significant difference
in quality?
Follow the five stages of a significance test:
(a) The hypothesis is that the two samples come from populations with the same
variance; in other words, there is no difference in quality.
(b) The evidence is the two samples and their variances, from which an F ratio can be
calculated. Since the distribution of resistances of each supplier’s resistors is thought
to be normal, a significance test based on the F-distribution will be valid.
(c) Set the significance level at the conventional 5 per cent.
(d) The critical F value for a 5 per cent significance level and (11,17) degrees of freedom
is 2.41. The observed F ratio is 0.042/0.017 = 2.47. The observed F ratio is slightly
larger than the critical value.
(e) Since the observed F exceeds the critical value, the sample evidence has a probability
of less than 5 per cent. The hypothesis is rejected. There is evidence that one supplier
(Supplier B) is supplying a product of better quality, as defined by variation in re-
sistance. The result is so close to the accept/reject borderline that this fact should be
mentioned in quoting the result. If the significance level had been 1 per cent then the
hypothesis of no difference in quality would be accepted, since the 1 per cent critical
value is 3.52 while the observed F ratio remains 2.47.
Two major assumptions were involved in this significance test. They are the same as for
the chi-squared distribution. First, the populations from which the samples were taken
were normal. Second, the samples were selected at random from the populations.
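Assuming scipy is available for the critical values, the test can be reproduced in a few lines (the variable names are illustrative):

from scipy.stats import f

var_a, n_a = 0.042, 12       # Supplier A
var_b, n_b = 0.017, 18       # Supplier B
observed_f = var_a / var_b                           # 2.47, larger variance on top
crit_5 = f.ppf(0.95, dfn=n_a - 1, dfd=n_b - 1)       # about 2.41 for (11, 17) df
crit_1 = f.ppf(0.99, dfn=n_a - 1, dfd=n_b - 1)       # about 3.52
print(round(observed_f, 2), round(crit_5, 2), round(crit_1, 2))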
Learning Summary
In respect of their use and the rationale for their application, the standard distribu-
tions introduced in this module (Poisson, t, chi-squared, F, negative binomial and
beta-binomial) are in principle the same as the earlier ones (normal and binomial).
Their areas of application are to problems of inference, specifically estimation and
significance testing. The advantages their use brings are twofold. First, they reduce
the need for data collection compared with the alternative of collecting one-off
distributions for each and every problem. Second, each standard distribution brings
with it a body of established knowledge that can widen and speed the analysis.
The eight standard distributions encountered so far are just a few, but probably the
most important few, of the very many that are available. Each has been developed to
cope with a particular type of situation. Details of each distribution have then been
recorded and made generally available. When a new distribution has been developed
and added to the list, it has usually been because it is applicable to some particular
problem that can be generalised. For instance, W.S. Gosset developed the t-
distribution because of its value when applied to a sampling problem in the brewing
company for which he worked. Because this problem was a special case of a general
type of problem, the t-distribution has gained wide acceptance.
To summarise, when one is faced with a statistical problem involving the need to
look at a distribution, there is often a better alternative than having to collect large
amounts of data. A wide range of standard distributions are available and may be of
help. Table 9.4 summarises the standard distributions described so far and the
theoretical situations from which they have been derived.
One of the principal uses of standard distributions is in significance testing. Ta-
ble 9.5 lists four types of significance test and shows the standard distribution that is
the basis of each.
Review Questions
9.1 What standard distribution would you expect to apply to the heights of the male
employees of an organisation?
A. Binomial
B. Normal
C. Poisson
D. t
E. F
9.2 If n > 20 and p < 0.05, the binomial distribution is usually approximated by the Poisson
when analysing data. The reasons for this are:
A. the Poisson is more accurate since it is based on the binomial but with p small
and n large.
B. it is easier to calculate with the Poisson formula than with the binomial.
C. Poisson tables are easier to use than binomial tables.
9.3 The stretch of motorway between two major cities has had 36 major accidents in the
last year. What is the probability that there will be more than five major accidents next
month?
A. 18%
B. 8%
C. 10%
D. 5%
9.4 When the sample size is 25, the number of degrees of freedom associated with
estimating the variance is:
A. 25
B. 24
C. 23
D. 22
9.5 The sampling distribution of the mean is a t-distribution, not a normal distribution, when
random samples of size 20 are taken from a population of unknown standard deviation
and unknown type. True or false?
9.6 What is the standard error of the sampling distribution of mean differences?
A. 9.7
B. 23.0
C. 5.3
D. 22.4
9.8 What is the t value corresponding to the 5 per cent significance level against which to
compare the observed t value?
A. 1.73
B. 1.74
C. 2.10
D. 2.11
E. 1.645
9.10 Provided conditions of randomness and normality are met, the chi-squared distribution
can be the basis of testing whether a sample variance is significantly different from a
hypothesised population variance. True or false?
9.11 Referring to the figure below, what is the 10 per cent critical chi-squared value for the
upper tail of a distribution with 18 degrees of freedom?
[Figure: a chi-squared distribution with 10 per cent of the area in the upper tail beyond the critical value χ² = ?]
A. 10.865
B. 10.085
C. 24.769
D. 28.869
E. 25.989
9.12 The F-distribution could be used to determine confidence limits in the estimation of
population variance from a sample variance provided the sample has been selected at
random and the distribution is normal. True or false?
9.13 The 5 per cent critical value for the F-distribution when the numerator has 8 degrees of
freedom and the denominator 11 is:
A. 3.20
B. 5.74
C. 4.74
D. 2.95
E. 3.31
9.14 An F ratio is calculated from the variances 24 and 96 of two samples of sizes 15 and 13
respectively. This ratio exceeds the 1 per cent critical F ratio. True or false?
9.15 One of CBA plc’s products is a range of bathroom tiles. The production process has an
average 0.35 per cent tiles defective. The tiles are sold in packages containing 100 tiles.
What standard distribution would be used in practice to monitor the number of
defective tiles per package?
A. Normal
B. Binomial
C. Poisson
D. t
E. Chi-squared
F. F
end of this time the miles-per-gallon (mpg) figure was calculated for each of the 14 cars.
The result was as follows:
Car 1 2 3 4 5 6 7 8 9 10 11 12 13 14
mpg 21 24 22 24 29 18 21 26 25 19 22 20 28 23
The police computer holds not only criminal records but also records of the history of
all police vehicles. This includes summaries of petrol consumption per quarter. From
these data it was found that, in the past, patrol cars, when between six and nine months
old, had on average consumed petrol at the rate of 22.4 miles per gallon over quarterly
periods. It is not possible to calculate the standard deviation of this distribution because
the data are held in summary form only and there is insufficient detail for the standard
deviation to be determined. Checks with the records of other police vehicles indicate
that the distribution can be assumed to be normal.
a. Conduct a significance test to determine whether the new tyres have made a
significant difference to petrol consumption, the null hypothesis being that the tyres
make no difference to petrol consumption.
b. What should the sample size ideally have been if the test were to discriminate
equally between the null hypothesis and the alternative hypothesis (the manufactur-
er’s claim)?
c. Why is it necessary for the cars included in the trial to be of the same make and
mark and of roughly similar ages?
d. A senior officer of the force who took a statistics option at the police training
college suggests that the test is invalid because the past records refer to the three
previous years and, he says, maintenance methods must have improved over this
time. His proposal is to compare the consumption of a sample of cars using the new
tyres with another sample of cars of the same make and ages at a particular neigh-
bouring force whose maintenance routines just happen to be identical. What are the
arguments for and against his suggestion?
Analysis of Variance
Contents
10.1 Introduction.......................................................................................... 10/1
10.2 Applications .......................................................................................... 10/2
10.3 One-Way Analysis of Variance ........................................................... 10/5
10.4 Two-Way Analysis of Variance ........................................................ 10/10
10.5 Extensions of Analysis of Variance................................................... 10/13
Learning Summary ....................................................................................... 10/14
Review Questions ......................................................................................... 10/15
Case Study 10.1: Washing Powder ............................................................. 10/16
Case Study 10.2: Hypermarkets ................................................................. 10/17
10.1 Introduction
The significance tests encountered up to now have involved just one or two
samples. The mean or variance of a sample has been compared with that of a
population; the mean or variance of one sample has been compared with that of a
second sample. The purpose has been to test the effect of new conditions on some
management situation. The tests show whether or not the new conditions have led
to a significant change in the data at a given confidence level. For example, the
influence of using a different brand of tyres on police cars was investigated by
carrying out a significance test to compare miles-per-gallon data of a small sample of
police cars using the new brand of tyres with the historical record of miles per
gallon for all cars with the current brand of tyres. Although the miles per gallon
achieved by the sample was greater, it was not sufficiently so to suggest that the new
tyres had made a significant difference at the 95 per cent confidence level.
In other situations the need sometimes arises to make comparisons with several
samples rather than just one or two. For example, several different fertilisers may be
used to improve crop yield. The question that has to be answered is whether the
fertilisers all have approximately the same effect or whether there are significant
differences in their effects. In this case several samples need to be compared, each
referring to crop yield data for one of the fertilisers.
It would be possible to tackle problems such as this by taking the samples in
pairs and using ordinary two-sample significance tests. Suppose there were six
fertilisers. Comparing them all in pairs would need a large number of significance
tests, which would be very time-consuming. Worse, at the conventional 5 per cent significance level there is a 5 per cent chance of wrongly rejecting the hypothesis (of
no difference between the two samples). Consequently, even if in reality there is no
difference in the two fertilisers, it could be expected that 5 per cent of the tests
would show, purely by chance, that there is. Paired significance tests are therefore
not a statistically valid option.
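The scale of the problem can be sketched roughly: with six fertilisers there are 15 pairwise comparisons and, even if the tests were independent (they are not, since samples are shared), the chance of at least one spurious ‘significant’ result would be more than a half.

from math import comb

pairs = comb(6, 2)                    # 15 pairwise comparisons among six fertilisers
p_spurious = 1 - 0.95 ** pairs        # chance of at least one false 'significant' result
print(pairs, round(p_spurious, 2))    # 15 and about 0.54, if the tests were independent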
Analysis of variance is a type of significance test that allows, in a single test, sev-
eral samples to be compared under the hypothesis that they have all come from the
same population (or that the populations from which they come all have the same
mean). First, some applications will be described. Second, the underlying theory will
be explained before we move on to consider the basic technique, called one-way
analysis of variance. Third, some extensions, particularly two-way analysis of
variance, will be discussed.
10.2 Applications
10.2.1 Television Assembly Lines
A plant in Asia manufactures TVs. There are eight parallel assembly lines within the
plant. Production rates vary from day to day and from line to line because of a range
of factors, including breakdowns, lack of supplies and the skill levels of the workers.
Management wants to know whether the observed production rates differ signifi-
cantly from line to line so that it can decide whether attention should be given to all
lines or restricted to particular lines. Table 10.1 shows production data collected
from the lines over a 200-day period.
Analysis of variance permits one single significance test to indicate whether the
differences in average production rates, shown in the bottom row of Table 10.1, are
significant or not. It tests the hypothesis that all samples (lines) come from the same
population (or from populations with the same mean). In other words, analysis of
variance will show whether the production rate for each assembly line can be
thought of as being essentially the same.
[Figure: a field divided into six plots (Plot 1 to Plot 6), with one fertiliser applied to each plot.]
However, there are many reasons that could account for a difference in crop
yields between the plots apart from the effect of the fertiliser. The slope, drainage
and acidity of the soils are some examples. If the effects of these factors could be
neutralised, a clearer idea of whether differences between fertilisers exist could be
obtained. Some of the factors can be neutralised by arranging the experiment so that
the location of the plot has no effect on the outcome. This can be done by dividing
each plot into six sections and allocating (at random) one fertiliser to each section
(see Figure 10.2).
One measurement of crop yield is made in each section within each plot. Now 36
measurements are made. They can be grouped in two different ways. They can be
thought of as six groups (one for each fertiliser), as previously, or six groups (one
for each plot). These two ways of grouping are the basis of the isolation of the
effect of plot location. Two-way analysis of variance is used to analyse the observa-
tions. Whatever the application, by convention, the first grouping (here the
fertilisers) is referred to as the treatments and the second grouping (here the plots)
is referred to as the blocks.
[Figure 10.2: each of the six plots divided into six sections, with the six fertilisers (A to F) allocated at random to the sections within each plot.]
Total SS = (4 − 5)² + (7 − 5)² + (5 − 5)² + (6 − 5)² + (3 − 5)² + (3 − 5)² + ⋯
= 52
Next turn to the variation between the treatments. This variation, the sum of
squares between treatments (referred to as ‘SST’), is calculated:
SST = No. observations per treatment × ∑(x̄j − x̿)²
where x̄j is the mean of the jth treatment and x̿ is the grand mean.
SST measures the variation between group means via the deviations between
group means and the grand mean. In the example, SST is:
MST = SST/(c − 1)
MSE = SSE/(c(r − 1))
where c is the number of treatments and r is the number of observations per treatment.
In the example:
MST = 10/2 = 5
MSE = 42/(4 × 3) = 3.5
Recall that the underlying idea of analysis of variance is to compare the variation
between treatments with the variation within treatments. If the former is significant-
ly greater than the latter (at some level of significance) then the differences in
treatment means must be larger than those expected purely by chance. If chance is
not causing the differences, something else must be. Therefore, it is supposed, the
treatments have a significant effect in determining the differences observed.
The relevance of the mean squares to this process is that they are the basis of a
significance test to determine whether explained variation (between treatments) is
significantly different from unexplained variation (within treatments). This is
because the ratio between MST and MSE follows an F-distribution. Proof of this
fact involves some ‘black box’ mathematics.
The significance test proceeds as follows. The hypothesis is that all treatments
come from populations with the same mean, or from a common population. If the
observed F value (MST/MSE) lies beyond the critical value, at a given level of
significance, then the hypothesis is rejected: the treatments do have a significant
effect. If the observed F value is less than the critical value then the hypothesis is
accepted: there is not a significant difference between the treatment means.
In the example:
Observed = MST/MSE = 5/3.5 = 1.43
The critical F value for (2,12) degrees of freedom at the 5 per cent significance
level from the F-distribution table (see Appendix 1, Table A1.6) is 3.88. Observed F
is less than the critical value and therefore the hypothesis is accepted. The treat-
ments can be thought of as having no significant effect on the variable.
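Assuming scipy is available for the critical value, the whole calculation can be reproduced from the sums of squares, taking c = 3 treatments and r = 5 observations per treatment as implied by the degrees of freedom used above:

from scipy.stats import f

sst, sse, c, r = 10, 42, 3, 5
mst = sst / (c - 1)                                   # 5
mse = sse / (c * (r - 1))                             # 3.5
observed_f = mst / mse                                # about 1.43
critical = f.ppf(0.95, dfn=c - 1, dfd=c * (r - 1))    # about 3.9 for (2, 12) degrees of freedom
print(round(observed_f, 2), round(critical, 2))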
Intuitively, the significance test can be thought to operate in the following way.
MST and MSE are the bases of two alternative ways of estimating the variance (σ²)
of the common population from which the observations are supposed to have been
taken.
MST = No. obs per treatment × ∑(x̄j − x̿)²/(c − 1)
= No. obs per treatment × Variance of sample means
is an estimate of:
No. obs per treatment × σ²/No. obs per treatment = σ²
since the variance of sample means is σ² divided by the number of observations per treatment.
MSE = ∑(xij − x̄j)²/(c(r − 1)), summed over all observations i in all treatments j,
is an estimate of σ²
But the test for differences in estimates of variances is the F test. The ratio
MST/MSE is the ratio between two estimates of a variance, and thus the F test can
be used to determine whether they are significantly different.
Suppose there were a significant difference between the two estimates of σ2, and
that the one based on MST were the larger. This would be an indication that much
more (indeed a significant proportion) of the variation between the observations is
accounted for by the effect of the different treatments rather than being a result of
random variation between observations. Treatments, then, appear to have a
significant effect on the means of observations relating to them. In other words,
there must be a difference between the effects of the treatments.
This test carries three underlying assumptions. The test is basically an F test and
two of the assumptions are exactly those described in Module 9 in relation to the F-
distribution. First, the observations are supposed to have come from a normal
distribution. Second, the observations should have been taken at random. The third
assumption is equally fundamental. The test is based on the treatment groups having
come from a common population or from populations with equal means (in
practice it does not matter which assumption is made). In the latter case, the
populations must also have equal variances. This stems from the statistical theory of
the test but, intuitively, it can be envisaged that a difference in variances would
distort the sums of the squares within groups. In either case there should not be a
significant difference in the variances of the treatment groups. This is the third
assumption, that the treatment groups have the same underlying variance.
The systematic way of carrying out the test is to base it on an ANOVA table. The first
step is to calculate SSE, the within-group sum of squares. This is shown in Table 10.6.
The next step is the calculation of SST, the between-groups sum of squares:
SST = 6 × [(65 − 63)² + (63 − 63)² + (61 − 63)²]
= 6×8
= 48
Total SS can now be calculated as 344 (SST + SSE). Alternatively, it can be calculated
independently as a check on SST and SSE. The sums of squares can now be inserted in
an ANOVA table (Table 10.7).
The first column describes the sources of variation. The second relates to degrees of
freedom, given by c − 1 for the treatments and c(r − 1) for the error. The sums of squares go in the third
column. The mean squares are the sums of squares divided by the degrees of freedom
and are shown in the fourth column. Finally, the observed F value is calculated to be
1.22, the ratio between the mean squares.
To finish the test, the critical F value at the 5 per cent level is found from the table of
the F-distribution. In this case the value is 3.68. The observed F value, 1.22, is much
smaller. The hypothesis is accepted. There is no significant difference in the examination
results. The packages do not appear to make a difference to examination results.
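For readers who want to reproduce the arithmetic on a computer, the following Python sketch carries out a one-way analysis of variance. It assumes the numpy library is available, and the data are invented purely for illustration (they are not the figures of Table 10.6); the layout of treatment groups mirrors the structure of the examples above.

import numpy as np

# Illustrative one-way ANOVA: c treatment groups, r observations in each.
# The data below are placeholders, not the course example's table.
groups = [
    np.array([66, 62, 67, 64, 65, 66]),   # treatment 1
    np.array([61, 65, 63, 62, 64, 63]),   # treatment 2
    np.array([60, 63, 59, 62, 61, 61]),   # treatment 3
]

c = len(groups)                 # number of treatments (columns)
r = len(groups[0])              # observations per treatment (rows)
grand_mean = np.mean(np.concatenate(groups))

# Between-treatments sum of squares: r times the squared deviations of group means
sst = r * sum((g.mean() - grand_mean) ** 2 for g in groups)
# Within-treatments (error) sum of squares
sse = sum(((g - g.mean()) ** 2).sum() for g in groups)

mst = sst / (c - 1)             # mean square, treatments
mse = sse / (c * (r - 1))       # mean square, error
f_observed = mst / mse
print(f"MST = {mst:.2f}, MSE = {mse:.2f}, F = {f_observed:.2f}")
# Compare the observed F with the critical value for (c-1, c(r-1)) degrees of
# freedom taken from the F-distribution table, e.g. 3.68 for (2, 15) at 5%.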
For two-way analysis of variance an extra sum of squares, for the blocks, must be
calculated. This is labelled SSB and is calculated in the same way as SST except that
it is rows, not columns, to which the process is applied. SSB is calculated from the
deviations of the row (plot) means from the grand mean. In general:
SSB = No. of observations in block × ∑(Block mean − Grand mean)²
In the example:
SSB = 3 × [(4.7 − 5)² + (4.7 − 5)² + (5 − 5)² + (6 − 5)² + (4.7 − 5)²]
= 4.0
Statistically, it can be demonstrated that:
Total SS = SST + SSB + SSE
This is similar to the one-way analysis of variance. Here, the equation is of real
practical use. It is used to calculate SSE. In other words, the procedure is to
calculate Total SS, SST and SSB directly from the data, but then calculate SSE from
the summation equation.
In the example:
Total SS = 52, as before
SST = 10, as before
SSB = 4 by the above calculation
SSE = 52 − 10 − 4 = 38 by the summation equation
The mean squares are now calculated. The degrees of freedom for MSE are no
longer c(r − 1). The block (row) means have been used in the calculations and the
associated number of degrees of freedom is lost. The degrees of freedom for MSE
are now (c − 1)(r − 1), which, in the example, is 8. Therefore:
MST = 10/2 = 5 as before
MSE = 38/8 = 4.75
The F value is still MST/MSE: 5/4.75 = 1.05.
The loss of degrees of freedom because of the use of block means also affects
the F value. The critical F value against which to compare the observed value now
relates to (2,8) degrees of freedom, not (2,12) as previously. At the 5 per cent level
the critical F value for (2,8) degrees of freedom is 4.46, much greater than the
observed value. The hypothesis is still accepted. The treatments do not appear to
have a significant effect.
These workings may be seen more clearly from a two-way analysis of variance
table (Table 10.9). Column 1 describes the sources of variation. Column 2 refers to
the degrees of freedom. The degrees of freedom for the blocks is r − 1, analogously
to c − 1 being the degrees of freedom for the treatments. Degrees of freedom for
the error are (c − 1)(r − 1), as described above.
The third column relates to the sums of squares. Recall that only Total SS, SST
and SSB are calculated directly. SSE is calculated from the equation:
Total SS = SST + SSB + SSE
The fourth column refers to the mean squares, calculated by dividing the sums of
squares by their degrees of freedom. The F ratio in the fifth column is formed from
MST/MSE: in this example, 1.05.
The critical F value, taken from tables, is compared to this observed F ratio. At
the 5 per cent level for the appropriate degrees of freedom (2,8), the critical F value
is 4.46. As indicated above, the observed value is less than the critical value and the
hypothesis must still be accepted. After allowing for the effect of different lots, the
fertilisers still do not appear to cause significantly different crop yields.
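The two-way calculation can be sketched in Python in the same way. The matrix below is again illustrative only, not the fertiliser example's data; the essential point is that SSE is obtained from the summation equation rather than directly.

import numpy as np

# Two-way ANOVA sketch: rows are blocks (plots), columns are treatments.
# Illustrative data only.
data = np.array([
    [4.0, 5.0, 5.0],
    [4.0, 4.0, 6.0],
    [5.0, 5.0, 5.0],
    [6.0, 6.0, 6.0],
    [4.0, 5.0, 5.0],
])
r, c = data.shape
grand_mean = data.mean()

sst = r * ((data.mean(axis=0) - grand_mean) ** 2).sum()   # treatments (columns)
ssb = c * ((data.mean(axis=1) - grand_mean) ** 2).sum()   # blocks (rows)
total_ss = ((data - grand_mean) ** 2).sum()
sse = total_ss - sst - ssb            # from Total SS = SST + SSB + SSE

mst = sst / (c - 1)
mse = sse / ((c - 1) * (r - 1))
print(f"F = MST/MSE = {mst / mse:.2f}")
# Compare with the critical F value for (c-1, (c-1)(r-1)) degrees of freedom,
# e.g. 4.46 for (2, 8) at the 5 per cent level in the course tables.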
Example: Teaching Material for Economics (continued)
The conclusion drawn from the earlier example concerning the provision of teaching
material for economics courses was that the material had no significant effect on the
results. The hypothesis that the treatment (type of material) groups came from
populations with the same mean was accepted. However, the experiment was carried
out in several schools. It could well be the case that a large part of the variation
between the groups was accounted for by differences between the schools. If this
variation were neutralised by quantifying it and removing it from the calculations, would
the conclusion be the same? The possibility can be pursued by considering each school
as a ‘block’ and carrying out a two-way analysis of variance. The data including block
means are given in Table 10.10.
The sum of squares of the blocks (SSB) can now be calculated from the average
examination result of each school:
SSB = 3 × [(60 − 63)² + (65 − 63)² + (70 − 63)² + (61 − 63)² + (61 − 63)² + (61 − 63)²]
= 3 × (9 + 4 + 49 + 4 + 4 + 4)
= 222
A two-way ANOVA is laid out as in Table 10.11. From the table the observed F value is
3.24. From F-distribution tables the critical F value for (2, 10) degrees of freedom at the
5 per cent significance level is 4.10. The hypothesis is accepted. There seem to be no
significant differences between the effects the teaching material has on the results.
Learning Summary
Analysis of variance is one of the most advanced topics of modern statistics. It is far
more than an extension of two-sample significance tests, for it allows significance
tests to be approached in a much more practical way. The additional sophistication
allows significance tests to be used far more realistically in areas such as market
research, medicine and agriculture.
In practical situations there is a close association between analysis of variance and
research design. Although multi-factor analysis of variance is theoretically possible,
attempts to carry out such tests can involve large amounts of data and computing
power. Moreover, large and involved pieces of work can be more difficult to
comprehend conceptually than statistically. The results often present enormous
problems of interpretation. Consequently, before one embarks upon lengthy
analyses, time must be spent planning the research so that the eventual statistical
analysis remains manageable and its results can be interpreted.
Review Questions
10.1 Analysis of variance is a significance test applied to several samples. Which of the
following hypotheses could it test?
A. The samples come from populations with equal means.
B. The samples come from populations with equal variances.
C. The samples come from a common population.
D. The samples are normally distributed.
Questions 10.2 to 10.5 refer to the following situation:
A one-way analysis of variance is carried out for five treatment groups, each with 12
observations. Calculations have shown Total SS = 316, SST = 96.
10.5 If the critical value at the 5 per cent significance level for (4,55) degrees of freedom is
2.54, then the conclusion to be drawn is that the treatments do have a significant effect.
True or false?
10.6 Analysis of variance is a significance test that is applied to several samples. Each sample
should have which of the following attributes?
A. Be randomly selected.
B. Have the same mean as the other samples.
C. Have the same variance as the other samples.
D. Come from a population with a normal distribution.
10.10 An analysis of variance with a balanced design is one in which the number of treatments
(columns) equals the number of blocks (rows). True or false?
a. Carry out a two-way analysis of variance to determine whether the results indicate a
difference between stores in the responses to the attribute ‘courteous staff’.
b. Why, in testing for this difference, is it important to allow for the effect of the day of
the week on interviewees’ responses?
c. Would the outcome of the test be different if this influence were not neutralised?
d. Do the days of the week have a significant influence on responses, after allowing for
the effect of location?
e. Intuitively, is it likely to be worth pursuing the question of an interaction effect in
this situation?
Statistical Relationships
Module 11 Regression and Correlation
Module 12 Advanced Regression Analysis
Learning Objectives
Regression and correlation are concerned with relationships between variables. By
the end of this module the reader should understand the basic principles of these
techniques and where they are used. He or she should be able to carry out simple
analyses using a calculator or a personal computer. The many pitfalls in practical
applications should also be known.
The module deals with simple linear regression and correlation at a non-statistical
level. The aim is to explain conceptually the principles underlying these topics and
highlight the management issues involved in their application. The next module
extends the topics and describes the statistical background.
11.1 Introduction
Regression and correlation are concerned with relationships between variables. They
investigate whether a variable is related statistically to one or more other variables
that are thought to ‘cause’ changes in it. Regression is a method for determining
the mathematical formula relating the variables; correlation is a method for
measuring the strength of the relationship. Regression shows what the connection
is; correlation shows whether the connection is strong enough to merit using it.
For example, suppose a company wishes to investigate the relationship between
the sales volume for one of its products and the amount spent advertising it. The
purpose might be to forecast future sales volumes or it might be to test the effec-
tiveness of the advertising. Regression and correlation are based on past data and
therefore the first step is to unearth the historical records of both variables. Perhaps
the quarterly sales volume and the quarterly advertising expenditure over several
years are available. If both variables are plotted graphically, the scatter diagram of
Figure 11.1 results. Each point (or observation) refers to one quarter. For instance,
the point A refers to the quarter when advertising expenditure was 12 (in £000) and
sales volume was 36 (in 000).
[Figure 11.1 Scatter diagram of quarterly sales volume (000) against quarterly advertising expenditure (£000); point A marks advertising of 12 and sales of 36]
the same quarter; and low with low). Correlation will reinforce the intuitive evidence
of the scatter diagram.
The expression regression analysis is often used to include both regression and
correlation.
The purpose of this module is to show where regression and correlation can be
used, to illustrate the underlying principles and to point out the pitfalls in their use.
Initially, only simple linear regression will be described. ‘Simple’ in this context
means involving only two variables; linear means that the relationship is some form
of straight line (as opposed to a curve). In the next module the description will be
extended to other types of regression. On the assumption that anyone doing any
serious regression or correlation will have the help of some form of calculator or
computer, the mathematics are kept to a minimum. Unfortunately, to gain any more
than a merely superficial understanding of the subject, some technical details are
necessary. The mathematical preliminaries will be tackled immediately after the next
section, which describes some practical applications of the techniques.
11.2 Applications
11.2.1 Forecasting
Forecasting is frequently based on regression analysis. The variable to be forecast is
regressed against another variable thought to cause changes in it. This example is of
a furniture manufacturer forecasting revenue by relating it to a measure of national
economic activity. It is reasonable to do this since it is likely that the business of
such an organisation will be influenced by the economic environment. The econom-
ic variable chosen is the UK gross domestic product (GDP). Annual data for
furniture sales and GDP over a ten-year period are shown in a scatter diagram in
Figure 11.2.
[Figure 11.2 Scatter diagram of annual furniture sales (£m) against GDP (£bn)]
formula linking the two sets of numbers (i.e. can provide the equation of the straight
line that is as near as possible to all ten points). Then, given an estimate of GDP for
future years, estimates of furniture sales volumes can be calculated from the
equation.
11.2.2 Explaining
On occasions it is helpful to understand the extent to which two variables are
related, even when there is no intention to use the relationship for forecasting. The
alleged link between smoking and lung cancer is an obvious example.
In organisational research into the possibility of a relationship between the sala-
ries and weights of top executives, data on salary and weight for a random sample of
top executives in a selection of US companies were collected. The scatter diagram is
given in Figure 11.3. The data are cross-sectional because the observations relate
to different people at one point in time; the data in Figure 11.2 are time series
because each observation relates to a different point in time.
Figure 11.3 shows a relationship but not a strong one. In general, high salaries are
associated with low weights and low salaries with high weights. But the association
is loose. The points are not very close to a straight line. There is a weak correlation
between the variables. Since high is associated with low and vice versa, the correla-
tion is said to be negative. In the furniture sales application the correlation was
positive since high was associated with high and low with low.
[Figure 11.3 Scatter diagram of top executives' salaries (£000) against their weights (kg)]
Figure 11.4 Straight line (a) with positive slope; (b) with negative slope
11.3.2 Residuals
Figure 11.5 is an example of a scatter diagram. If we draw any straight line through
the set of points, the points will not in general fall exactly on the straight line.
Consider the first point, A, at x = 1. The y value at A is the actual y value. The
point directly below A that lies on the line is B. The y value at B is the fitted y
value. If the equation of the line is known, then the fitted y value is obtained by
putting the x value at A into the equation and calculating y. If at A the actual y value
is 12 and the line is y = 10 + 0.5x, the fitted y value is as follows:
Fitted value = 10 + 0.5 × 1 = 10.5
The difference between actual and fitted y values is the residual:
Residual = Actual value − Fitted value
At A, the residual is 12.0 − 10.5 = 1.5. If the point lies above the line, the residu-
al is positive; if it is below, the residual is negative; if it is on the line, the residual is
zero. Each point has a residual. The residuals would, of course, be different if a
different line were drawn.
[Figure 11.5 Scatter diagram showing the residual at point A: actual y = 12, fitted y = 10.5 at B on the line]
Sum of residuals = 0
of −6 is +6.) This avoids the problem of positives and negatives cancelling. This
would work well except that the device of taking absolute numbers is not easy to
manipulate mathematically. In the past the absence of efficient computers made this
an important consideration. For this and other, more statistical, reasons the criterion
has been rarely used, though the growing availability of computers has led to an
increased usage in recent years.
The third approach, and the one traditionally employed, is called least squares.
The negative signs are eliminated by squaring the residuals (a negative number
squared is positive). The sum of the squared residuals is a minimum for the best
line. In other words, the best line is the one that has values for a and b that make
the sum of squared residuals as small as possible. The criterion in least squares
regression can be stated: sum (residuals squared) is a minimum. (The least
squares method also has some technical advantages over alternative criteria, which
will not be pursued here.)
Although the least squares criterion has been selected, it is not immediately obvi-
ous how it should be used to find the equation of the best line. Finding the equation
of the line means finding values for a and b in y = a + bx. The least squares criterion
has to be turned into formulae for calculating a and b. The means of the transfor-
mation is differential calculus, a level of mathematics beyond the scope of this
module. Differential calculus will be left in its ‘black box’ and a jump made straight
to the least squares regression formulae:
For two variables, labelled x and y, and n paired observations on those variables,
the least squares regression line of y on x is given by:
y = a + bx
where:
b = ∑(x − x̄)(y − ȳ) / ∑(x − x̄)²
and:
a = ȳ − bx̄
Fortunately the availability of computers and calculators means that these formu-
lae rarely have to be used manually. However, using the formulae on a simple
example does provide a better intuitive understanding of regression.
Example
Find the regression line for the points:
x 1 2 4 5 8 Mean(x) = 4
y 20 19 34 30 47 Mean(y) = 30

∑(x − x̄)² = (1 − 4)² + (2 − 4)² + (4 − 4)² + (5 − 4)² + (8 − 4)²
= 9 + 4 + 0 + 1 + 16
= 30

∑(x − x̄)(y − ȳ) = (1 − 4)(20 − 30) + (2 − 4)(19 − 30) + (4 − 4)(34 − 30) + (5 − 4)(30 − 30) + (8 − 4)(47 − 30)
= 30 + 22 + 0 + 0 + 68
= 120

From the ‘black box’ formulae above:
b = 120/30 = 4
a = 30 − 4 × 4 = 14
The regression line is therefore:
y = 14 + 4x
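The same arithmetic can be checked with a few lines of Python (assuming the numpy library is available); the data are those of the example above.

import numpy as np

# Least squares coefficients for the worked example.
x = np.array([1, 2, 4, 5, 8])
y = np.array([20, 19, 34, 30, 47])

b = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()  # 120/30 = 4
a = y.mean() - b * x.mean()                                                # 30 - 4*4 = 14
print(f"y = {a:.0f} + {b:.0f}x")   # y = 14 + 4x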
11.5 Correlation
The formulae for a and b can be applied to any set of paired data. A regression line
can therefore be found for any group of points. The scatter diagram may show that
the points lie approximately on a circle, but regression will still find the equation of
the line that is closest to them. In other words, regression finds the best line for a
set of points but does not reveal whether a line is a good representation of the
points. Correlation fills this gap. It helps to decide whether, in view of the closeness
(or otherwise) of the points to a line, the regression line is likely to be of any
practical use. It does this by quantifying the strength of the linear relationship (i.e. it
measures whether, overall, the points are close to a straight line).
[Figure 11.7 Scatter diagrams illustrating different values of the correlation coefficient r]
r = −1 occurs when there is perfect correlation (all the points are on the line) and the line has a negative slope.
r = about −0.8 occurs when the points lie reasonably close to a negatively sloped straight line.
When r is close to zero the points are not at all like a straight line.
r = about 0.8 occurs when the points lie reasonably close to a positively sloped straight line.
r = 1 occurs when there is perfect correlation and the points are on a positively sloped straight line.
The correlation coefficient, r, can take on all values between −1 and +1. Correla-
tion coefficients close to −1 or +1 always indicate a strong linear relationship
between the variables. In such a case the scatter diagram would show the points
falling approximately along a line. Close to 0, correlation coefficients indicate a weak
or non-existent linear relationship. For example, the scatter diagram might show the
points following a circular pattern. Figure 11.7 illustrates the meaning of different
values of r.
An intuitive understanding of the way the correlation coefficient works can be
gained by considering the square of the correlation coefficient. By convention, but
for no apparent reason, it is written with a capital letter, R2. The intuitive explana-
tion of R2 runs like this: before carrying out a regression analysis one can measure
the total variation in the y variable by:
Total variation = ∑(y − ȳ)²
This expression measures the extent to which y varies from its average value. It
uses squares for the same reason that the residuals are squared in the least squares
criterion: to eliminate negatives. It measures scatter in a similar manner to variance
in summary measures.
Part of this total variation in y can be thought of as being caused by the x varia-
ble. This is the purpose of regression and correlation: to investigate the extent to
which changes in y are affected by changes in x. The variation in y that is caused by
x is called the explained variation. It can be measured from the difference between
the fitted y value and the average y value:
Explained variation = ∑(Fitted y − ȳ)²
Since the fitted y values are calculated from the regression equation, this variation
is understood, hence ‘explained’. Another part of the total variation in y is ‘unex-
plained’. This is the variation in the residuals, the ‘left-over’ variation:
Unexplained variation = ∑(Residual)²
Note that it is this unexplained variation that least squares regression minimises.
Using some more ‘black box’ mathematics, it is possible to show that:
Total variation = Explained variation + Unexplained variation
This fact is used to define the correlation coefficient:
R² = Explained variation/Total variation
That this formula is fully compatible with the earlier formula for r will be demon-
strated in a later example. The formula for R2 indicates why the correlation
coefficient squared measures the strength of a linear relationship. If:
R² = 1
then:
Explained variation = Total variation
and:
Unexplained variation = 0
If:
Unexplained variation = 0
then the sum of the squared residuals is 0, and therefore all the residuals must individually be
equal to 0. The points must all lie exactly on the line. R2 = 1 thus signifies perfect
correlation. If:
R² = 0
then:
Explained variation = 0
and:
Unexplained variation = Total variation
In this case the regression line does not explain variations in y in any way. The
residuals vary just as much as the original values of y. The points are not at all like a
straight line.
Because variation is measured in squared terms it is labelled R-squared. Some-
times a computer regression program will print out r, sometimes R-squared and
sometimes both. r has the advantage of distinguishing between positive and negative
correlation (and some technical statistical advantages); R2 has the advantage of a
better intuitive meaning.
To summarise, the essence of correlation is that when a regression analysis is
carried out, the variation in the y variable is split into two parts: (a) a part that is
explained by virtue of associating the y values with the x, and (b) a part that is
unexplained since the relationship is an approximate one and there are residuals.
The correlation coefficient squared tells what proportion of the original variation in
y has been explained by associating y with x (i.e. drawing a line through the points).
The higher the proportion, the stronger the correlation is said to be. It is often
sufficient to make a judgement as to whether R2 is high enough to be useful in a
particular situation. In most cases, 0.75 or more would be regarded as highly
satisfactory; 0.50–0.75 would be adequate; below 0.50 would give rise to serious
doubts. However, there are statistical methods by which one can be more precise.
They will be covered in the next module.
Example
This is the same example that was used for calculating the regression coefficients a and
b:
x 1 2 4 5 8 Mean(x) = 4
y 20 19 34 30 47 Mean(y) = 30
r = 120/√(30 × 526)
= 120/126
= 0.95
The correlation coefficient is close to 1, indicating a strong positive correlation. Exactly
the same result can be achieved using the formula for R2.
To use the formula for R2, it is first necessary to work out the fitted y values as a step
in obtaining the explained variation. The fitted values result from putting the x values
into the regression equation. Recall that the regression equation is:
= 14 + 4
For x = 1, fitted y = 14 + 4 × 1 = 18
For x = 2, fitted y = 22
For x = 4, fitted y = 30
For x = 5, fitted y = 34
For x = 8, fitted y = 46
Explained variation = ∑(Fitted y − ȳ)²
= (18 − 30)² + (22 − 30)² + (30 − 30)² + (34 − 30)² + (46 − 30)²
= 144 + 64 + 0 + 16 + 256
= 480
Total variation = ∑(y − ȳ)²
= 526 (calculated previously)
R² = Explained variation/Total variation
= 480/526
= 0.91
The formula for r resulted in:
r = 120/√(30 × 526)
Squaring:
r² = 120²/(30 × 526)
Cancelling by 30:
r² = 480/526
This is precisely equivalent to the result obtained from the formula for R2.
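As a check, the split of the total variation and the equivalence of r² and R² for this example can be reproduced in Python (numpy assumed):

import numpy as np

x = np.array([1, 2, 4, 5, 8])
y = np.array([20, 19, 34, 30, 47])

# Regression line from the earlier example: y = 14 + 4x
fitted = 14 + 4 * x
residuals = y - fitted

total_variation = ((y - y.mean()) ** 2).sum()          # 526
explained_variation = ((fitted - y.mean()) ** 2).sum() # 480
unexplained_variation = (residuals ** 2).sum()         # 46

r_squared = explained_variation / total_variation      # 480/526 = 0.91
r = np.corrcoef(x, y)[0, 1]                            # 0.95, the square root of R-squared here
print(round(r_squared, 2), round(r, 2))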
The reason why the residuals of a regression analysis are important, apart from
their role as the basis of the least squares criterion, is that they help to decide
whether the variables are linearly related. The correlation coefficient is one test for
deciding whether the two variables are linked by a linear equation. By itself, howev-
er, this test is not sufficient. A further test is that the residuals should be random,
meaning that they should have no pattern or order.
Figure 11.8 shows a scatter diagram for which the correlation coefficient is high
but for which a straight line does not fully represent the way in which the variables
are linked. There is another effect, probably a seasonal pattern or cycle, which
should be incorporated (in a way as yet unspecified) into the equation. In short, if
the unexplained variation in y (the residuals) is random, then, no matter how large
this variation may be, a linear equation is the best way to represent the relationship.
If the unexplained variation has a pattern, then a linear equation is not, by itself, the
best way to represent the relationship. More can be done to improve the model
expressing the relationship between the variables. Should the purpose of the
regression be to forecast, Figure 11.9 shows how different the predictions might be
if this pattern were not incorporated.
[Figure 11.9 Forecasts made with and without the pattern in the residuals incorporated]
[Plot of residuals against fitted y values]
In practice, the skill in regression analysis is no longer accurate number crunching but rather the ability to
interpret the vast outputs of computer regression packages.
Four steps in regression and correlation have been dealt with so far. They are:
(a) inspecting the scatter diagram;
(b) calculating the regression coefficients;
(c) calculating the correlation coefficient;
(d) checking the residuals for randomness.
All four steps can be carried out on a PC. The exact commands that would have
to be issued would depend upon the package itself (and could be found from its
manual), but the principles and the general form of the output will be the same.
The following situation will be used to demonstrate how a package might work.
A retail clothing company is trying to forecast its sales of clothes for four-year-old
children as part of its corporate plan. As a first step, a regression model relating
nationwide sales to the birth rate four years previously is investigated. It makes
sense to consider this relationship since sales must be related in some way to the
number of children needing clothes. Annual data on sales (in constant price terms)
and the birth rate for the last 20 years are available on a PC. The variables were
labelled ‘sales’ and ‘births’ when they were entered into the computer.
A scatter diagram is a first check that the analysis makes sense. It also gives the
analyst better insight and more direct knowledge of the situation.
[Scatter diagram of sales against birth rate]
A simplified version of the regression output is:
Coefficient
Births 3.14
Constant 8.65
R-squared = 0.93
Standard error of residuals = 5.62
This output means that the equation relating sales and births is:
Sales = 8.65 + 3.14 × Births
If the births four years ago were 18.0, then the estimate for sales this year would
be:
Sales = 8.65 + 3.14 × 18.0
= 8.65 + 56.52
= 65.17
The only other quantity on the simplified output is the residual standard error
(the standard deviation of the residuals), which is 5.62. If the scatter of the residuals
(the scatter of the actual values about the fitted line) is the same as their scatter in
the future, then this figure gives an idea of the accuracy of any forecasts made. The
65.17 forecast made above is a point on the line. The residual standard error
indicates how the actual value might differ from the forecast. Indeed, if the residuals
are normally distributed, then the 95 per cent confidence limits for the sales will be
at least as wide as plus or minus twice the residual standard error (i.e. ±11.24). (The
reason for the ‘at least’ will be explained in Module 12.)
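A rough sketch of this prediction in Python, using the coefficient, constant and residual standard error from the simplified output above and assuming the residuals are roughly normally distributed:

# Prediction from the simplified output above. If the residuals are normally
# distributed, the 95 per cent limits are at least as wide as plus or minus
# two residual standard errors.
a, b = 8.65, 3.14            # constant and births coefficient from the output
se_resid = 5.62              # residual standard error from the output

births_four_years_ago = 18.0
forecast = a + b * births_four_years_ago                     # 65.17
print(forecast, forecast - 2 * se_resid, forecast + 2 * se_resid)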
[Figure: three plots of residuals against fitted y values, panels (a), (b) and (c)]
(a) Even when a relationship passes all the statistical tests, what can be concluded and
how can the results be used?
It is essential to remember that, while the statistics can show whether the varia-
bles are associated, they do not say that changes in one variable cause changes in
another. Some examples will illustrate the point.
There is a close association (high correlation, random residuals, etc.) between the
price of rum and the remuneration of the clergy. This does not mean that the
two variables are causally related. A rise in salary for the clergy will not, presum-
ably, be spent on rum thereby depleting stocks and causing a rise in price. It is
more likely that there is a third factor, such as inflation or the general level of
society’s affluence, which affects both variables so that in the past the salaries
and the price of rum have moved together. It would be a mistake to suppose
that, if conditions were to change, the relationship must continue to hold. If, for
some philanthropic purpose, the clergy agreed to take a cut in salary one year, it
is unlikely that the price of rum would fall as well. The price of rum would con-
tinue to change in response to inflation or whatever factors influence it.
Different circumstances would now apply to clergy salaries and the association
would be broken.
The point is that a causal relationship will hold through changing circumstances
(provided the causal link is not broken); a purely associative relationship is likely
to be broken as circumstances change. In applications the implication is clear. If
the relationship is associative (and most are), beware of changing circumstances.
The difference between a causal and an associative relationship depends on its
structure, not on the statistics. Common sense and logic are the ways to distin-
guish between the two. In the case of rum and the clergy, no one would seriously
argue that the link is causal; in the case of sales of clothing for four-year-old
children and births four years earlier, there is some sort of causal link (although
of course there are other influences operating); in other cases the question of
causality can be far from clear.
(b) Spurious regressions should be guarded against. Spurious means that the
correlation coefficient is high but there is no underlying relationship. This may
arise purely by chance when the data available for the analysis just happen to
have a high correlation. If other observations had been taken no correlation
would have been apparent. Spurious correlations may also arise because of a
fault in the regression model. For example, a study sought to determine the rela-
tionship between companies’ profitability and their size. Profitability was
measured by return on net assets; size was measured by net profit. The equation
was:
Return on net assets = a + b × Net profit
The observations were taken from a large sample of companies. This regression
has an in-built likelihood of a high correlation since:
Return on net assets = Net profit/Net assets
The regression equation can therefore be re-expressed:
Net profit/Net assets = a + b × Net profit
Net profit appears in both the y and x variables. If there are only small variations
in net assets from company to company, then the tendency will be for high y
values to be associated with high x values (and low with low), simply because net
profit dominates both sides of the equation. A high correlation coefficient may
well result but doubt would surround the conclusion that profitability and size
are linked.
[Figure: scatter diagrams (a) and (b), with three candidate lines A: y = 10 + 0.8x, B: y = 20 + 0.5x, C: y = 30 + 0.1x]
Learning Summary
Regression and correlation are important techniques for predicting and understand-
ing relationships in data. They have a wide range of applications: economics, sales
forecasting, budgeting, costing, human resource planning, corporate planning, etc.
The underlying statistical theory (outlined in the next module) is extensive. Unfor-
tunately, the depth of the subject can in itself lead to errors. Users of regression can
allow the statistics to dominate their thought processes. Many major errors have
been made because the wider non-statistical issues have been neglected. As well as
providing company knowledge and broad expertise, managers have a role to play in
drawing attention to these wider issues. They should be the ones asking the pene-
trating questions about the way regression and correlation are being applied. If not
the managers, who else?
Managers can only do this, however, if they have a reasonable grasp of the basic
principles (although they should not be expected to become experts or to be
involved in the technical details). Only when they have taken the trouble to equip
themselves in this way will they be taken seriously when they participate in discus-
sions. Only then will they take themselves seriously and have sufficient confidence
to participate in the discussions. Regression and correlation have a mixed track
record in organisations, varying from high success to abject failure. A key to success
seems to be for managers to become truly involved. Too often the managers pay lip
service to participation. Their contribution is potentially very large. To make it
count they need to be aware of two things. First, the broad principles and manageri-
al issues (the topics in this module) are at least as important as the technical,
statistical aspects. Second, knowledge of the statistical principles (the topic for the
next module) is necessary, not in order that they may do the regression analyses
themselves, but as a passport to a legitimate place in discussions.
Review Questions
11.1 Which of the following statements are true?
A. Correlation and regression are synonymous.
B. Correlation would establish whether there was a linear relationship between
the weights of one sample of business executives and the salaries of another
sample of business executives.
C. Annual sales figures for each of a company’s 17 regions are a set of cross-
sectional data containing 17 observations.
D. If high values of one variable are associated with low values of a second and
vice versa, then the two variables are negatively correlated.
11.2 Which of the following statements is true? The residuals of a regression line are:
A. the perpendicular distances between actual points and the line.
B. the difference between actual and fitted y values.
C. always positive.
D. all zero for a ‘best-fit’ line.
x 4 6 9 10 11
y 2 4 4 7 8
11.7 The relationship between the ages of husbands and wives is likely to show:
A. strong positive correlation.
B. weak positive correlation.
C. zero correlation.
D. weak negative correlation.
E. strong negative correlation.
11.8 When the residuals resulting from a linear regression analysis are examined, which of
the following characteristics is desirable?
A. Randomness.
B. Serial correlation.
C. Heteroscedasticity.
Coefficient
Ad. exp. 6.3
Constant 14.7
R-squared = 0.70
Sum of squared residuals = 900
11.10 What is the prediction of sales volume when the advertising expenditure is 5?
A. 21.0
B. 31.5
C. 46.2
D. 7.74
11.12 The relationship between a company’s sales (y) and its expenditure on advertising (x) is
investigated. The linear regression lines of sales on advertising (y on x) and advertising
on sales (x on y) are both calculated. Which of the following statements is true?
A. The correlation coefficients are the same for both.
B. The slopes of the two regression lines are the same.
C. The two intercepts are the same.
Booking office
1 2 3 4 5 6
Transactions (y) 11 7 12 17 19 18
Clerks (x) 3 1 3 4 6 7
a. Draw a scatter diagram and decide whether the relationship appears linear. Calcu-
late the correlation coefficient.
b. The following three straight lines could all be fitted to the data:
i. The line joining the extreme y values, i.e. linking the points (1,7) and (6,19).
ii. The line joining the extreme x values, i.e. linking the points (1,7) and (7,18).
iii. The regression line of y on x.
Find the equation of each line. Measure the residuals (in the y direction). Calculate the
mean absolute deviations and the variances of the residuals. Compare the MADs and
variances and suggest which of the three straight lines gives the closest fit.
The computer gives the following results after regressing sales against family income:
Coefficient
Income 0.17846
Constant 35.228
Correlation coefficient r = 0.92
Residual standard error = 4.72
a. What is the estimated relationship between sales and income? What sales level
would be predicted for a store whose catchment area has an average disposable
income per family of £221?
b. How good is the fit of the linear equation to the data?
c. Use the residual standard error to suggest what the maximum level of accuracy
achieved by the model is likely to be.
d. What are the non-statistical reservations connected with forecasting sales in this
way?
Learning Objectives
Regression and correlation are complicated subjects. The previous module present-
ed the basic concepts and the managerial issues involved. In this module, the basic
concepts are extended in three directions. First, multiple regression deals with
equations involving more than one x variable. Second, non-linear regression allows
relationships to be based on equations that represent curves. Third, the statistical
theory underlying regression is described. This last topic permits rigorous statistical
tests to be used in the evaluation of the results. Finally, to bring together all aspects
of regression and correlation, a step-by-step approach to carrying out a regression
analysis is given.
This module contains advanced material and may be omitted first time through
the course.
12.1 Introduction
Regression analysis is a big subject. Most universities have at least one full professor
of regression analysis (except that he or she is not usually given the title of Professor
of Regression Analysis: some camouflage is often adopted, such as ‘Professor of
Econometrics’). The previous module covered only the tip of this iceberg. It
covered simple linear regression and correlation. The restrictions ‘simple’ and
‘linear’ will be removed in this module. Multiple (more than one right-hand-side
variable) and non-linear (equations other than y = a + bx) regression will be
introduced. Moreover, in Module 11 regression and correlation were dealt with at a
non-statistical level. For example, residuals were visually inspected for randomness.
Some of the underlying theory will now be considered. The purpose is the practical
one of allowing statistical tests to be employed, alongside visual ones, in evaluating
and using regression and correlation analyses.
The module deals with these topics in the order multiple regression, non-linear
regression, statistical theory. Finally, to bring all these ideas together, regression
analysis is summarised through a step-by-step guide to handling a regression
problem.
Some important differences between simple and multiple regression will soon
begin to emerge, but thus far the two are almost the same:
(a) the least squares criterion is the same;
(b) the residuals should be random.
Beyond these basic principles, the differences become apparent.
Scatter Diagrams
It is not possible to draw, in two dimensions, a scatter diagram involving several
variables. Therefore, at the outset of the analysis more than one scatter diagram will
need to be drawn: there will have to be a scatter diagram for the y variable com-
bined with each of the x variables. The purpose will be the same as in simple
regression: to gain an approximate idea of whether and how the variables are
related.
Correlation Coefficient
In simple regression, the correlation coefficient squared (R2) measures the propor-
tion of variation explained by the regression. In this way it quantifies the closeness
of fit of the regression equation. In multiple regression, a more sensitive measure is
needed. The reason is as follows.
Start with a simple regression model:
Sales = + × Births
Suppose a second x variable, personal disposable income (PDI), is added:
Sales = + × Births + × PDI
It is not possible for R2 to decrease when PDI is added. Even if PDI were totally
unconnected with Sales, R2 could not fall. This is because, at the worst, the multiple
regression could choose A and B to be equal to their original values, a and b, and C
to be equal to 0. Then the multiple regression would be the same as the simple
regression with the same R2. Since least squares acts to minimise the sum of squared
residuals and thereby maximises R2, the new R2 can do no worse than equal the old
one. Consequently, even if a new x variable is unconnected to the y variable, R2 will
almost certainly rise, making it appear that the closeness of fit has improved.
A more sensitive measure of closeness of fit is the adjusted correlation coefficient
squared, R̄² (pronounced ‘R-bar-squared’). This is based on the same ratio as R², but
the formula is adjusted to make allowance for the number of x variables included. If
a new x variable is unconnected with the y variable, then R̄² will fall. It is not
necessary to know the exact nature of the adjustment since most computer packages
print out R̄², usually in preference to R². R-bar-squared is used in just the same way
as R-squared, as a measure of the proportion of variation explained and thereby as a
quantification of the closeness of fit. To summarise, R̄² is based on R² but adjusted
to make allowance for the number of right-hand-side variables included in the
regression.
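The exact adjustment varies between packages, but a commonly used form is shown in the Python sketch below; the figures passed to it are invented purely to illustrate how the adjusted value can fall when R² barely rises after a variable is added.

def r_bar_squared(r_squared: float, n_obs: int, n_x_vars: int) -> float:
    """One common form of the adjustment: penalise R-squared for the number
    of right-hand-side variables. Individual packages may differ in detail."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_x_vars - 1)

# Adding an unrelated x variable usually nudges R-squared up slightly,
# but the adjusted value can fall:
print(r_bar_squared(0.930, 20, 1))   # simple regression
print(r_bar_squared(0.931, 20, 2))   # tiny rise in R-squared after adding a variable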
Collinearity
What would happen if the same x variable were included twice in the regression
equation? In other words, suppose that in the equation below z and t were the same:
y = a + bx + cz + dt
The answer is that the computer package would be unable to complete the calcu-
lations. However, if z and t were almost but not quite the same, then the regression
could proceed and c and d would be estimated. Clearly, these estimates would have
little value. Nor would it be easy to determine which of the variables had the greater
influence on the y variable.
This, in simple form, is the problem of collinearity (sometimes referred to as
multi-collinearity). It occurs when two (or more) of the x variables are highly
correlated. In these circumstances the two variables are contributing essentially the
same information to the regression. Their coefficient estimates are unreliable in the
sense that small changes in the observations can produce large changes in the
estimates. Regression finds it difficult to discriminate between the effects of the two
variables. While the equation overall may still be used for predictions, it cannot be
used for assessing the individual effects of the two variables.
The basic test for collinearity is to inspect the correlation coefficients of all x
variables taken in pairs. If any of the coefficients are high, the corresponding two
variables are collinear.
There are three remedies for the problem:
(a) Use only one of the variables (which to exclude is largely a subjective decision).
(b) Amalgamate the variables (say, by adding them together if the aggregated
variable has meaning).
(c) Substitute one of the variables with a new variable that has similar meaning and
that has a low correlation with the remaining one of the pair.
The test and remedies for collinearity are not precise. It is more important to be
aware of the problem and the restrictions it places on interpretation than to delve
into the technicalities lying behind it.
Example
The weekly sales of a consumer product are to be predicted from three explanatory
variables. The first x variable is the gross domestic product (GDP), reflecting the
influence of the economic environment on sales; the second is the weekly advertising
expenditure on television; the third is the weekly advertising expenditure in newspa-
pers.
The regression analysis would proceed as follows.
(a) Inspect the scatter diagrams to see whether approximate linear relationships do
exist. This time there will be three: sales against GDP, sales against television adver-
tising and sales against newspaper advertising.
(b) Carry out the regression analysis by computer. The computer will ask for the
name of the dependent variable, then those of the independent variables. The
printout will look something like:
Coefficient
GDP 2.23
TV advertising 0.35
Newspaper advertising 1.09
Constant 4.28
R-bar-squared = 0.97
Residual standard error = 3.87
Compare this with the output from a simple regression of sales on GDP alone:
Coefficient
GDP 3.14
Constant 8.65
R-bar-squared = 0.93
Standard error of residuals = 5.62
Notice that the coefficient for GDP is 3.14, whereas for multiple regression it was
2.23. The addition of an extra variable changes the previous coefficients, including
the constant.
R-bar-squared has risen to 0.97 from 0.93. The presence of the advertising expendi-
tures has therefore increased the proportion of variation explained. The increase is
not large but then R-bar-squared was already high. This is a real increase since the
adjusted correlation coefficient has been used and this makes allowance for the
presence of an extra variable.
Correspondingly, the residual standard error has decreased. In other words, the
residuals are generally smaller, as would be expected in view of the increased R-bar-
squared.
(c) Check the residuals. This process is exactly the same as for simple regression. A
scatter diagram of residuals plotted against fitted y values is inspected for random-
ness.
(d) Check for collinearity. Most computer packages will print out the correlation
matrix showing the correlations between the x variables taken in pairs:
Variable 1 2 3
1 1.0 0.1 0.3
2 1.0 0.7
3 1.0
There is high correlation (0.7) between the two advertising variables. They are collinear
to some extent. Their coefficients will not be reliable and could not be used with
confidence to compare the effectiveness of the two types of advertising. Some thought
should be given to the possibility of amalgamating the variables to form a combined
advertising expenditure variable. The other remedies for collinearity would probably not
be used. Since the correlation between them is far from perfect (and the variables do
contribute some separate information) it would be wrong to drop one of the variables
completely. It is unlikely that the third remedy for collinearity (finding a substitute) could
be applied because of the difficulty of finding another variable with similar meaning.
However, if the purpose of the regression is solely to predict sales, the equation could
still be used as it stands. Values for GDP and advertising would be inserted in the
equation to give a predicted value for sales.
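A collinearity check of this kind takes only a few lines in Python. The three series below are invented stand-ins for GDP and the two advertising variables (the example's own data are not reproduced in the text); the point is simply how the pairwise correlation matrix is obtained and read.

import numpy as np

# Collinearity check: correlations between the x variables taken in pairs.
gdp   = np.array([100, 105, 108, 112, 118, 121, 125, 130])
tv    = np.array([10, 12, 11, 15, 14, 18, 17, 20])
paper = np.array([8, 11, 10, 13, 13, 16, 16, 18])

x_vars = np.vstack([gdp, tv, paper])
corr_matrix = np.corrcoef(x_vars)     # 3 x 3 matrix of pairwise correlations
print(np.round(corr_matrix, 2))
# A high off-diagonal value (say above about 0.7) flags a collinear pair whose
# individual coefficients should not be interpreted separately.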
[Scatter diagram of y against x showing a curved relationship, based on the data:]
x 0 1 2 3 4 5 6
y 10 15 24 37 54 75 100

[Scatter diagram of y against x for the example data:]
y 8 9 12 16 21 39 43 53 65 79
x 2 3 3 4 5 7 7 8 9 10
z 4 9 9 16 25 49 49 64 81 100
Variable Coefficient
x −0.50
z 0.80
Constant 5.10
R-bar-squared = 0.99
Residual standard error = 1.57
The correlation coefficient is high (R-bar-squared = 0.99). The residuals should also be
inspected for randomness. Restoring x² to the equation in place of z:
y = 5.1 − 0.5x + 0.8x²
The equation can now be used to forecast values of y given values of x.
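The whole procedure, creating z = x² as a new variable and running a multiple regression, can be sketched in Python using the example data. The least squares routine used here (numpy's lstsq) is a stand-in for whatever regression package is to hand; it should give coefficients close to those in the output above.

import numpy as np

# Fitting y = a + b*x + c*x^2 by transforming x^2 into a new variable z
# and running an ordinary (multiple) linear regression.
x = np.array([2, 3, 3, 4, 5, 7, 7, 8, 9, 10], dtype=float)
y = np.array([8, 9, 12, 16, 21, 39, 43, 53, 65, 79], dtype=float)
z = x ** 2

design = np.column_stack([np.ones_like(x), x, z])      # columns: constant, x, z
coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
a, b, c = coeffs
print(f"y = {a:.1f} + {b:.1f}x + {c:.1f}x^2")          # close to y = 5.1 - 0.5x + 0.8x^2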
12.3.2 Transformations
A variable is transformed when some algebraic operation is applied to it. For
example, a variable x is transformed when it is turned into its square (x2), its
reciprocal (1/x) or its logarithm (log x). The list of possible transformations is long.
The principle behind the use of transformations in regression is that a non-linear
relationship between two variables may become linear when one (or both) of the
variables is transformed. Linear regression is then used on the transformed varia-
bles. The variable is de-transformed when the equation is used to make predictions.
In other words, although the relationship between y and x is curved, it may be
possible to find a transformation of y or x or both such that the relationship
between the transformed variables is linear.
For example, a relationship of the form:
= · ﴾12.2﴿
is non-linear between y and x. This is the exponential function. Recall from Module
2 that it is characterised by the fact that, each time x increases by 1, y increases by a
constant proportion of itself. Contrast this with a linear function (Y = A + BX),
where each time X increases by 1, Y increases by a constant amount (= B). Linear
means a constant increase (or decrease); exponential means a constant percentage
increase (or decrease). There are clearly situations where an exponential function
might apply. For instance, if the sales increase of some product were thought likely
to be 10 per cent per year, the relationship between sales and time would be
exponential. Were the increase thought likely to be £1 million each year, the
relationship would be linear.
The issue in regression is this. Suppose that two variables are thought to be relat-
ed by an exponential function and that some historical data are available. How can
the equation relating the variables (i.e. the values of a and b in Equation 12.2) be
found when the regression formula applies only to linear relationships?
Figure 12.3 shows the graph of the exponential function. A transformation can
make it linear. If natural logarithms (to the base e) are taken of each side of the
equation and some algebra applied, using the rules for manipulating logarithms, the
result is:
y = a·e^(bx)
log y = log(a·e^(bx))
log y = log a + b·x
[Figure 12.3 (a) The exponential function y = a·e^(bx); (b) the transformed relationship: log y against x is a straight line with intercept log a]
Sales volume y 12 14 17 20 24 28 34 41 48 56 66 76
Month x 1 2 3 4 5 6 7 8 9 10 11 12
Use regression analysis to find the exponential equation linking sales and time. Predict
sales volume for the next three months.
An exponential equation is of the form:
y = a·e^(bx)
Fitting an equation of this type to the data amounts to estimating the values of a and b.
In order to use linear regression formulae, the equation must first be transformed to a
linear one. Taking logarithms of both sides, the equation becomes:
log y = log a + b·x
The regression is performed on log y and x. To do this, the logarithms of the sales must
be found:
log y 2.48 2.64 2.83 3.00 3.18 3.33 3.53 3.71 3.87 4.03 4.19 4.33
x 1 2 3 4 5 6 7 8 9 10 11 12
Putting these two variables into a regression package, the output is:
Variable Coefficient
x 0.17
Constant 2.32
R-bar-squared = 0.99
Residual standard error = 0.02
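To complete the example, the fitted equation log y = 2.32 + 0.17x is de-transformed by taking the exponential. The Python sketch below fits the line and produces forecasts for the next three months; numpy is assumed, and the forecasts depend, as always, on the growth pattern continuing.

import numpy as np

# Regressing log(sales) on month, then de-transforming to predict sales.
months = np.arange(1, 13)
sales = np.array([12, 14, 17, 20, 24, 28, 34, 41, 48, 56, 66, 76], dtype=float)

slope, intercept = np.polyfit(months, np.log(sales), 1)   # close to 0.17 and 2.32
print(f"log y = {intercept:.2f} + {slope:.2f}x")

# Predictions for months 13 to 15: calculate log y, then take the exponential.
for m in (13, 14, 15):
    print(m, round(np.exp(intercept + slope * m), 1))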
The exponential function is one of the relatively few non-linear relationships that can be
made linear by applying a transformation to both sides of the equation. Mostly non-
linear relationships cannot be made linear in such a neat way. Experience allied to
mathematics is needed in order to know which relationships are amenable to this
transformation treatment.
The use of transformations so far has referred to situations in which there were sound
reasons for supposing the two variables were related by a particular non-linear relation-
ship. For example, the belief that sales of a product were increasing at a rate of 10 per
cent per year was a good reason for using an exponential function.
In other situations the type of non-linear relationship may not be known. Transfor-
mations can be helpful in a different way. The scatter diagram may just show a curve of
some sort but it may be difficult to go further and suggest possible non-linear equations
relating the variables. In such circumstances it may be necessary to try several transfor-
mations by using the squares, square roots, reciprocals, logarithms, etc. of one or both
of the variables to find the scatter diagram of transformed variables that looks most like
a straight line. Equally, regression analyses may be carried out with several types of
transformation to find the one which gives the best statistical results, highest R-squared,
random residuals, etc.
This approach, although sometimes necessary, can be dangerous. If enough
transformations are tried, it is usually possible eventually to come up with one that
appears satisfactory. However, there may be no sound reason for using the trans-
formation. The model may be purely associative. It is better to base the choice of
transformation on logic or prior knowledge rather than exhaustive trial and error.
For example, take the case of a company with several plants manufacturing the
same product. The unit cost of production (y) will vary with the capacity of the
plant (x). The relationship is unlikely to be linear. Finding a transformation that
expresses the relationship in linear form could be based on trial and error. Several
transformations of y and x could be tried until one is found that appears (from the
scatter diagram or regression results) to make the relationship linear. It would be
preferable to have a sound reason for trying particular transformations. The ‘law’ of
economies of scale suggests that unit costs might be inversely proportional to
capacity. This is a good reason for transforming capacity to 1/capacity and consider-
ing the relationship between y and 1/x. If this relationship appears to be linear then
statistics and logic are working together.
When this is the case, the statistics are in the position of confirming the logic.
Otherwise there will be no underlying basis for using the transformation and the
statistics will be in isolation from the real situation. Or attempts must be made to
rationalise the structure of the situation to fit in with the statistics. This might of
course lead to the discovery of new theories. More likely, there is a danger of using
associative rather than causal models. The trial and error approach to transfor-
mations should not be ruled out, but it should be used with caution.
sampling. For the moment the discussion will be restricted to simple linear regres-
sion.
If it is believed that two variables are linearly related, then the statistician hypoth-
esises that, in the whole population of observations on the two variables, a straight-
line relationship does exist. Any deviations from this (the residuals) are caused by
minor random disturbances. A sample is then taken from the population and used
to estimate the equation’s coefficients, the correlation coefficient and other statis-
tics. This sample is merely the set of points upon which calculations of a, b and r
have been based up to now.
However, the fact that these calculations are made from sample data means that
a, b and r are no more than estimates. Had a different sample been chosen, different
values would have been obtained. Were it possible to take many samples, distribu-
tions of the coefficients would be obtained. In practice only one sample is taken and
the distributions estimated from it. This is similar to the way in which the sampling
distribution of the mean (Module 8) could be estimated from just one sample. In
effect, the coefficient distributions are estimated from the variations in the residuals.
These distributions (of a, b, r and other statistics) are the basis of significance
tests of whether the hypothesis of a straight-line relationship in the population is
true. They are also the basis for determining the accuracy of regression predictions.
The statistical approach has several practical implications.
attributable to regression and that forming the residuals. It can be proved, with
some ‘black box’ mathematics, that the ratio of the mean squares of the two sources
of variation is an F ratio under the hypothesis that there is no linear relationship
between the two variables. In other words, if it is hypothesised that the x and y
variables are not related, then the ratio:
Mean square (regression) / Mean square (residual)

follows an F-distribution.
x 1 2 4 5 8 Mean(x) = 4
y 20 19 34 30 47 Mean(y) = 30
x y Fitted Residual
1 20 18 2
2 19 22 −3
4 34 30 4
5 30 34 −4
8 47 46 1
SS (regression) = ∑(Fitted y − ȳ)²
= (18 − 30)² + (22 − 30)² + (30 − 30)² + (34 − 30)² + (46 − 30)²
= 144 + 64 + 0 + 16 + 256
= 480
SS (residual) = ∑(Residual)²
= 2² + (−3)² + 4² + (−4)² + 1²
= 46
As expected:
Total SS = SS(regression) + SS(residual)
526 = 480 + 46
In practice only two of the three sums of squares would have been calculated directly.
The third would be derived from the sums of squares equality. The sums of squares can
be put into an analysis of variance table (an ANOVA table), as in Table 12.1.
The degrees of freedom for SS(regression) is 1 since there is one independent variable.
The degrees of freedom for SS(residual) are 3. There are five observations, but 2 degrees
of freedom are lost because the coefficients a and b have to be calculated from the data
before the residuals can be found. The degrees of freedom for the Total SS is one fewer
than the number of observations because the y mean ( ) has to be calculated first.
The mean squares are calculated by dividing the sums of squares by the corresponding
degrees of freedom. The observed F value is the ratio of the mean squares.
Finally, the observed F value has to be compared with the critical F value found in
Appendix 1. For (1,3) degrees of freedom, the critical F value at the 5 per cent level is
10.13. The observed value greatly exceeds this. It is therefore concluded that the
hypothesis must be rejected: it cannot be maintained that there is no linear relationship
between the two variables. This is no more than should be anticipated given the high
correlation coefficient.
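The whole calculation can be reproduced in a few lines of Python. This is a sketch only: it uses the five observations of the example and the critical value 10.13 taken from the F tables as above.

import numpy as np

x = np.array([1, 2, 4, 5, 8], dtype=float)
y = np.array([20, 19, 34, 30, 47], dtype=float)

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
fitted = a + b * x

ss_regression = np.sum((fitted - y.mean()) ** 2)          # 480
ss_residual = np.sum((y - fitted) ** 2)                   # 46
df_regression, df_residual = 1, len(x) - 2                # 1 and 3 degrees of freedom

f_observed = (ss_regression / df_regression) / (ss_residual / df_residual)
print(f_observed)      # about 31.3, well above the critical value of 10.13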
are random, the linear model must be the best pattern that can be obtained from the
data.
Many statistical tests for randomness exist. The runs test is a common example.
It works in the following way. A ‘run’ is a group of consecutive residuals with the
same sign. As an example, the ten residuals in Figure 12.4 have four runs.
Residuals:  +3.1 | −1.6 −0.2 −1.4 | +1.1 +2.9 +0.3 | −1.0 −3.1 −0.1
Runs:          1            2               3               4
[Figure: scatter diagrams of residuals against x, panels (a) and (b), with consecutive same-signed residuals marked as the 1st run, 2nd run, and so on.]
A runs test is based on an expected number of runs, which is the number of runs
that would be most likely to occur if the residuals were random. The expected
number of runs can be calculated using the basic ideas of probability. It is then
compared with the observed number of runs counted in the residuals. If the actual
differs from the expected by a large margin, the residuals will be assumed to be non-
random. In statistical terms, a significance test will indicate whether any difference
between the observed and expected numbers of runs is sufficient to reject the
hypothesis that the residuals are random.
Statistical tables (see Table A1.7 in Appendix 1) are available that show the critical
values for the observed number of runs. The tables are used as follows. If there are n1
positive residuals and n2 negative residuals, then Table A1.7 in Appendix 1 shows, for
each combination of n1 and n2, the lower critical value for the number of runs. If the number of runs
is less than this critical value, then the hypothesis of random residuals is rejected at the
5 per cent level (the tables relate only to the 5 per cent level). Table A1.7 also shows
the upper critical value. If the observed number of runs exceeds this critical
value (for the given numbers of positive and negative residuals) then the hypothesis is
rejected. For random residuals the number of runs should therefore be between the
upper and lower critical values.
Example
Are the residuals of Figure 12.4 random?
The runs test is a significance test. As with all significance tests, there are five stages to
follow.
(a) The hypothesis is that the residuals are random.
(b) The evidence is the sample of residuals in Figure 12.4.
(c) The significance level is the customary 5 per cent.
(d) The critical values are given by Table A1.7 in Appendix 1. There are 10 residuals,
four positive and six negative. Therefore:
n1 = 4 n2 = 6
From Table A1.7 the lower critical value is 2. From Table A1.7 the upper critical
value is 9.
(e) Since the observed number of runs is four (from Figure 12.4), the observed result
does not lie beyond the critical values. The hypothesis is accepted. The residuals
appear to be random.
If either n1 or n2 is larger than 20, tables such as those in Appendix 1 cannot be used. In
this range the number of runs behaves like a normal distribution with:
Mean = 2·n1·n2/(n1 + n2) + 1
Standard deviation = √[2·n1·n2·(2·n1·n2 − n1 − n2) / ((n1 + n2)²·(n1 + n2 − 1))]
A significance test proceeds just as the one above except that the critical values are not
taken from tables. They are two standard deviations above and below the mean (for a 5
per cent significance level).
If there is a strong pattern in a set of residuals, a visual test with a scatter diagram is
usually sufficient to detect it. However, most computer regression packages offer a
statistical test for randomness. Many such tests exist but no one package will offer them
all. The runs test is just one possibility.
When a runs test is called for, the computer usually calculates the actual number of runs
counted in the residuals. It also prints out the range (between the critical values), which
indicates random residuals. In practice, therefore, the significance test is usually carried
out automatically by the computer.
A computer printout might be as follows:
Since the number of runs falls within the range, the conclusion is that the residuals are
random.
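As an indication of what such a package does internally, the sketch below (an illustration, not any particular package's output) counts the runs in the residuals of Figure 12.4 and also evaluates the normal approximation described above.

import math

residuals = [3.1, -1.6, -0.2, -1.4, 1.1, 2.9, 0.3, -1.0, -3.1, -0.1]

signs = [r > 0 for r in residuals]
runs = 1 + sum(1 for i in range(1, len(signs)) if signs[i] != signs[i - 1])
n1 = sum(signs)                    # positive residuals: 4
n2 = len(signs) - n1               # negative residuals: 6
print(runs)                        # 4 runs, between the critical values 2 and 9

# Normal approximation (used when n1 or n2 exceeds 20)
mean_runs = 2 * n1 * n2 / (n1 + n2) + 1
sd_runs = math.sqrt(2 * n1 * n2 * (2 * n1 * n2 - n1 - n2) /
                    ((n1 + n2) ** 2 * (n1 + n2 - 1)))
print(mean_runs - 2 * sd_runs, mean_runs + 2 * sd_runs)   # approximate 5 per cent limits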
[Figure: a two-tailed t test on a regression coefficient. The hypothesis is that the coefficient = 0; 2.5 per cent of the distribution lies beyond each critical value. If the coefficient estimate falls between the two critical values, the hypothesis is accepted and the variable is excluded from the equation.]
(e) If tObs. exceeds t.025 then the hypothesis is rejected and the variable does have a
significant effect; if tObs. is less than t.025 then the hypothesis is accepted and the
variable may be eliminated from the regression equation.
Most computer packages print out the observed t value automatically as the ratio
between the coefficient and its standard error.
Example
An earlier example on multiple regression was concerned with predicting the sales of
clothing for four-year-old children by relating it to the number of births four years
earlier and the gross domestic product (GDP). A third variable, the store’s advertising
expenditure that year, is added to the equation. The computer output for this multiple
regression is:
Are all the three variables – births, GDP and advertising – rightfully included in the
regression equation?
The computer output shown above is more extensive than that shown up to now. This
output includes the coefficients, standard errors, t values and degrees of freedom. Most
packages would show at least as much information as this.
To answer the question, a t test must be carried out on the coefficients for each of the
variables. Because the observed t values have been printed out (= coefficient/standard
error), this can be done very easily. In each case the critical t value is t.025 for 38 degrees
of freedom. For this number of degrees of freedom the t and normal distributions
almost coincide. From the normal distribution table (see Appendix 1, Table A1.2), the
critical value is 1.96.
For births and GDP the observed t values exceed the critical (both 5.9 and 2.5 are
greater than 1.96). These variables therefore both have a significant effect on the sales
of clothing and are rightfully included in the equation. The observed t value for advertis-
ing is 1.3, less than the critical value. According to the statistical test, advertising does
not have a significant effect on clothing sales. The variable can be eliminated from the
equation.
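Since the printout itself is not reproduced here, the following sketch simply applies the test to the observed t values quoted in the example; the variable names come from the text and the critical value is the 1.96 identified above.

# Observed t values quoted in the example (each is coefficient / standard error)
observed_t = {"births": 5.9, "GDP": 2.5, "advertising": 1.3}
critical_t = 1.96      # t(.025) for 38 degrees of freedom, close to the normal value

for variable, t in observed_t.items():
    significant = abs(t) > critical_t
    print(variable, "significant" if significant else "could be eliminated")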
The residuals are the basis for measuring the accuracy of a prediction. Intuitively
this makes sense. The residuals are the historical differences between the actual
values and the regression line. It might be expected that actual values in the future
will differ from the regression line by similar amounts. The scatter of the residuals,
as measured by their standard error, is therefore an indication of forecasting
accuracy.
The residuals are what is ‘left over’ outside a regression equation. However, there
is also error within the regression model. The source of this error is inaccuracy in
the estimation of the coefficients of the variables and the constant term. This
inaccuracy is measured through the standard errors of the coefficient estimates.
The overall uncertainty in a prediction comes therefore from two areas:
(a) What the regression does not deal with – the residuals.
(b) What the regression does deal with – error in the estimation of regression model
coefficients.
Both types of error are combined in the standard error of predicted values,
often shortened to SE(Pred). The formula for this standard error is a complicated
one, integrating as it does all the standard errors mentioned above. However, most
computer packages have the capability to print it out.
Once calculated, SE(Pred) is used to calculate confidence intervals for predic-
tions. Provided there are more than 30 data points (the usual rule of thumb), the
normal distribution will apply to the point estimate. Thus, 95 per cent of future
values of the variable are likely to lie within ±2 SE(Pred) of the point estimate. If
there are fewer than 30 observations then the t-distribution applies. Instead of 2, the
t value (found from t-distribution tables) for the appropriate number of degrees of
freedom is substituted.
Such a 95 per cent confidence interval is sometimes referred to as the forecast
interval. It is used to decide whether the level of accuracy is sufficient for the
decisions being taken. For example, suppose an organisation is forecasting its profit
in the next financial year in order to take pricing decisions. The point forecast is
£50 m, the SE(Pred) is £4 m and there are 40 degrees of freedom. The accuracy of
the prediction can be calculated from these data. It is 95 per cent probable that the
profit will turn out to be in the range £42 m to £58 m (±2 SE(Pred)).
It should be noted that SE(Pred) is different for each different set of x values
used to make the prediction. This can be visualised through an example of simple
regression (only one x variable). Since the x value is multiplied by the slope in
making the prediction, any inaccuracy in estimating the slope coefficient (the x
coefficient) must have a changing effect for changing x values. Figure 12.7 shows
how the forecast interval varies for different x values. The interval is wider for x
values further from the centre of the regression.
[Figure 12.7: the forecast interval of ±2 SE(Pred) around the regression line, centred on the mean point (x̄, ȳ) and widening for x values further from it.]
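A minimal sketch of the interval calculation, using the profit-forecast figures quoted above, is:

point_forecast = 50.0      # £m, from the profit example
se_pred = 4.0              # £m, SE(Pred) reported by the package
multiplier = 2             # roughly the 95 per cent value for 40 degrees of freedom

lower = point_forecast - multiplier * se_pred
upper = point_forecast + multiplier * se_pred
print(lower, upper)        # 42.0 to 58.0: the 95 per cent forecast interval

For intervals that vary with the x values, as in Figure 12.7, SE(Pred) itself has to be recomputed for each prediction point, which in practice is done by the regression package.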
(c) Check the residuals. The residuals should be random. A scatter diagram
between residuals and fitted values will demonstrate this visually; a runs test will
permit the check to be made statistically.
(d) Decide whether any x variables could be discarded. A t test on each x
variable coefficient will indicate if the variable has a significant effect on the y
variable. If not, the variable could be discarded, but this decision should be taken
in conjunction with prior knowledge. For example, if there were a good non-
statistical reason for including a particular x variable, then a t value of, say, 1.4
should not automatically lead to its elimination.
(e) Check for collinearity. A correlation matrix for all the x variables will show
which, if any, of them are collinear and therefore have unreliable coefficient
estimates.
(f) Decide if the regression estimates are accurate enough for the decision.
For the point estimate corresponding to any particular x value(s), SE(Pred) will
be the basis of calculating confidence intervals. These can be contrasted with the
decision at hand.
(g) If necessary, formulate a new regression model. Should any of the checks
have given unsatisfactory results, it is necessary to return to stage (a) and try
again with a new model.
The seven steps above relate solely to the statistical application of the technique
of regression. Another set of problems (covered in Module 13 on Business Fore-
casting) arise when the use of the technique has to be integrated with the decision-
making process of an organisation.
Learning Summary
This module has extended the ideas of simple linear regression by removing the
limitations of ‘simple’ and ‘linear’. First, multiple regression analysis makes the
extension beyond simple regression. It allows changes in one variable (the y
variable) to be explained by changes in several other variables (the x variables).
Multiple regression analysis is based on the same principle – the least squares
criterion – as simple regression. However, the addition of the extra x variables does
bring about added complications. Table 12.2 summarises the similarities and
differences between the two cases as far as their practical application is concerned.
Perhaps the best advice in this statistical minefield is to make the correct balance
between statistical and non-statistical factors. For example, the t test for the
inclusion of variables in a multiple regression equation should be taken carefully into
account but not to the exclusion of other factors. In the earlier example on predict-
ing sales of children’s clothing, the t value for advertising was only 1.3. Statistically it
should be excluded. On the other hand, if it has been found from other sources
(such as market research interviews) that advertising does have an effect, then the
variable should be retained. The poor statistical result may have arisen because of
the limited sample chosen or because of data inaccuracy. The profusion of complex
data produced by regression analyses can promote a spurious sense of accuracy and
a spurious sense of the importance of the statistical aspects. It is not unknown for
experts in regression analysis to make mountains out of statistical molehills.
Review Questions
12.1 Another expression for a right-hand-side variable is:
A. a dependent variable.
B. an x variable.
C. an explanatory variable.
D. a residual variable.
12.2 Simple regression means that one y variable is related to one x variable; multiple
regression means that several y variables are related to several x variables. True or
false?
Questions 12.3 to 12.6 refer to the following part of the computer output
from a multiple regression analysis:
12.5 The regression analysis must have been based on how many observations?
A. 28
B. 35
C. 33
D. 36
12.6 Statistically, variable 2 should be eliminated from the regression equation because:
A. It has the lowest coefficient.
B. It has the lowest t value.
C. It has a t value less than 1.96.
D. Its standard error is less than 1.96.
12.7 R-bar-squared is a better measure of closeness of fit than the unadjusted R-squared
because it makes allowance for the number of x variables included in the regression
equation. True or false?
Questions 12.8 to 12.11 refer to the following data:
From a multiple regression involving three right-hand-side variables and 38 observa-
tions, this information is given:
Sum of squares (regression) = 120
Sum of squares (residual) = 170
12.8 How many degrees of freedom do the sums of squares (regression) have?
A. 3
B. 4
C. 34
D. 35
12.14 To carry out a regression analysis based on the equation y = aebx a linear regression is
performed on which two variables?
A. y and x
B. log y and x
C. y and log x
D. log y and log x
12.15 A regression analysis is based on the equation y = aebx by carrying out a linear
regression on the variables log y and x. The computer printout shows the coefficient for
x to be 8 and the constant to be 4. What are the values for a and b in the original
exponential function?
A. a = 8, b = 4
B. a = 4, b = 8
C. a = antilog 8, b = 4
D. a = 8, b = antilog 4
E. a = antilog 4, b = 8
a. Which predictions should be used as the basis for the contract negotiations for the
new plants? What are the reasons for preferring this model?
b. Comment on certain aspects of the following:
i. In making predictions for a plant built in 2018 the age variable will have value
zero. Therefore, age will have a zero influence in the prediction. In these circum-
stances, how can it have been useful to incorporate age as a right-hand-side
variable?
ii. Using the third model (unit costs related to 1/Capacity and Age), what are the 95
per cent forecast intervals for the predictions of unit costs at the two plants?
iii. Why, for the same regression model, is SE(Pred) sometimes different for the
predictions for the two new plants?
iv. How can relatively small percentage increases in R-bar-squared (from 0.73 to
0.93 to 0.99) be consistent with such large decreases in SE(Pred) (from 8.0 to 4.0
to 1.7)?
Business Forecasting
Module 13 The Context of Forecasting
Module 14 Time Series Techniques
Module 15 Managing Forecasts
Learning Objectives
The intention of this module is to provide a background to business forecasting. By
the end, the reader should know what it can be applied to and the types of tech-
niques that are used. Special attention is paid to qualitative techniques at this stage
since they are the alternative to the quantitative techniques that are usually thought
to form the nucleus of the subject.
13.1 Introduction
The business world of the 1970s and earlier was more stable than it is at present.
This view is not merely the product of nostalgic reminiscence. Inspection of
business and economic data of the period reveals relatively smooth series with
steady variations through time. As a result, business forecasting was not the major
issue it is now. In fact, many managers claim to have done their forecasting then on
the back of the proverbial envelope. The situation is different today. Uncertainty is
evident everywhere in the business world. Forecasting has become more and more
difficult. Data, whether from companies, industries or nations, seem to be increas-
ingly volatile. The rewards for good forecasting are very high; the penalties for bad
forecasting or for doing no forecasting at all are greater than ever. Even the most
non-numerate managers tend to agree that a second envelope would be insufficient.
As a consequence, interest and investment in forecasting methods have been
growing. Organisations are spending more time and money on their planning. Much
of this increased effort has gone into techniques: established techniques are being
used more widely; new techniques have been developed. The specialist forecaster’s
role has grown. Unfortunately the outcome of all this effort has not always been
successful. Indeed, some of the most costly mistakes in business have been made
because of the poor use of forecasting methods, rather than any inadequacy in the
techniques themselves. Analysing these mistakes reveals that, in the main, they have
come about not through technical errors but because of the way the forecasting was
organised and managed.
While attention has rightly been given to the ‘kitbag of techniques’ of the practi-
tioner (statistician, operational researcher, etc.), the roles of non-specialists involved
in the process (accountants, financial analysts, marketing experts and the managers
who are to use the forecasts to take decisions) have been neglected. These roles are
usually concerned with managing the forecasts. However, because they have less
technical expertise, the non-specialists have tended to hold back and not participate
in planning the forecasting system. Their invaluable (although non-statistical)
expertise is thereby lost to the organisation in this regard. Accordingly, the effec-
tiveness of many organisations’ forecasting work has been seriously weakened. The
role of the non-specialist is at least as important as that of the practitioner.
The purpose of this module is to provide a context for the more detailed fore-
casting topics described in the two following modules. To give this background, the
full range of types of technique used and their applications will be reviewed. Since
many organisations have traditionally adopted a simple qualitative approach to
forecasting (typically, a group of managers decides what the forecast is to be),
qualitative techniques, including the not-so-simple ones, will then be described in
some detail.
situations where the qualitative approach is the only one possible. For new products,
industries or technologies (developing and retailing computer software, for instance)
no past records are available to predict future business; in some countries political
uncertainties may mean that past data records have no validity. In these situations
qualitative techniques provide soundly based ways of making forecasts.
Causal modelling means that the variable to be forecast is related statistically to
one or more other variables that are thought to ‘cause’ changes in it. Nearly always
this means that regression analysis, in conjunction with past data, is used to establish
the relationship. The equation is assumed to hold in the future and is used to make
the forecasts. For example, the econometric forecasts of national economies
regularly published in the media are based on causal modelling. The models relate
economic variables one with another. Policies, such as restricting the money supply,
and economic assumptions, such as the future price of oil, are fed into the model to
give forecasts of inflation, unemployment, etc.
A further example of causal modelling is that of a company trying to predict its
turnover from advertising expenditure, product prices and economic growth. The
value of causal modelling is that it introduces, statistically, external factors into the
forecasting. This type of method is therefore usually good at discerning turning
points in a data series.
Time series methods predict future values of a variable solely from historical
values of itself. They involve determining patterns in the historical record and then
projecting the patterns into the future. While these methods are not good when the
underlying conditions of the past are no longer valid, there are many circumstances
when time series methods are the best. Indeed, many surveys have shown time series
methods to be superior to the other two approaches. They are used when:
(a) conditions are stable and will continue to be so in the future;
(b) short-term forecasts are required and there is not enough time for conditions to
change more than a little;
(c) a base forecast is needed, onto which can be built changes in future conditions.
Time series methods are also, in general, the cheapest and easiest to apply and are
often automatic, needing no manual intervention once they have been set up. They
can therefore be used when there are many forecasts to be made, none of which
warrants a large expenditure. This might be the case in forecasting stock levels at a
warehouse dealing in large numbers of small-value items.
13.3 Applications
The range of applications of forecasting is wide, and to appreciate their full extent it
is helpful to categorise them. There is more than one way of doing this. The
categorisation may be according to type of technique or the functional area (market-
ing, finance, corporate planning, etc.) or the variable to be forecast (sales volume,
cash flow, manpower and so on). It is probably most useful, however, to think of
forecasting in terms of the time horizon of the forecast. Is the forecast to cover the
next two months, the next two decades or something in between? Usually the time
horizon is expressed as short term (up to a year), medium term (one to five years),
or long term (more than five years).
story demonstrates that this is not necessarily so. In 1880 it was forecast that by
1920 London would be impassable to traffic because of a 3-foot layer of horse
manure. Qualitative techniques might have avoided this error by considering
changes in the technology of road transport corresponding to the fast-developing
railway network. Even if the prediction had been true, it would not have been
necessary to know that the layer would be exactly 3 feet thick. Two feet or 4 feet
would have served the decision makers equally well.
This last example also illustrates why qualitative forecasting is sometimes referred
to as technological forecasting. In this sense, qualitative techniques try to predict
turning points and new directions of business activity. The fact that the business
environment is rapidly changing and that there is a corresponding need to forecast
technological change is undoubtedly behind the recent increase in usage of these
techniques. And, of course, there are situations in which the lack of quantitative data
means that qualitative techniques are all that can be used.
People appear naturally to associate formal organisational forecasting with quan-
titative rather than qualitative techniques. Forecasting is seen as a numerical and
analytical process. What, then, causes organisations to act ‘unnaturally’ and use
qualitative techniques? And why is the use of these techniques increasing so rapidly?
Two motives lie behind the use of qualitative techniques.
The first motive is that the forecaster’s choice is restricted because there is a lack
of adequate data. Quantitative techniques work from a historical data record, and
preferably a long one. This lack of data may occur simply because no data exist. The
organisation may be marketing an entirely new product or exporting to a region in
which it has no experience. Alternatively, data may exist but be inadequate. This
may be because data have been recorded incompetently, but more likely it is because
the circumstances in which the data are generated are changing rapidly. For exam-
ple, the political situation in an importing country may be unstable, making past
records of business an unreliable base for future projections. Recently this problem
has been seen in situations affected by rapidly changing technology. In the electron-
ics industry events happen so quickly that historical data are soon out of date. For
example, forecasts of mobile application sales are difficult to make quantitatively
because product developments occur so frequently and the rate of growth of sales is
so steep. Quantitative techniques are generally poor in dealing with such volatility.
The second motive for using qualitative techniques is a more positive one. The
factors affecting the forecast may be better handled qualitatively. Module 15 on
Managing Forecasts shows that an important step in a systematic approach to
forecasting is a conceptual model in which the influences on the forecast are listed.
The most important influences may be ones that are not susceptible to quantifica-
tion. For example, forecasts of business activity in Scotland in 2015 would have
been influenced in a major way by the outcome of the referendum on Scottish
independence held in late 2014. The outcome and the impact of any negotiations
between Scottish and UK governments would have been difficult to deal with
quantitatively, yet would likely have a large bearing on business activity in the
interim period. As a forecaster in Scotland in mid-2014, it would probably have
been better to try to estimate the effect of this influence qualitatively. It is therefore
by no means always the case that qualitative techniques are a second best to
quantitative ones. In some situations they will better reflect actual circumstances.
The above are the two motives for using qualitative techniques. The other ques-
tion posed at the outset concerned the increasing frequency of their use. The answer
lies in the second of the above motives. The business environment seems recently
to have been changing more rapidly than previously, whether for technological or
political reasons. The situations in which qualitative techniques are seen to have
advantages are occurring more frequently. The microelectronic revolution is a clear
example of this, but the boundaries are wider. Since the late 1970s business data
have been more volatile than previously. Many data series show greater variability in
recent years. This has meant that the rewards for accurate forecasting have in-
creased, but it also means that previously established forecasting models have
performed less well. The need to plan successfully and the need to consider new
techniques have resulted in greater use of forecasting techniques, both in terms of
the number of organisations that have a forecasting system and of the range of
techniques employed.
Several of the most common qualitative forecasting methods hardly deserve the
title ‘technique’, although they may well have been accorded pseudo-scientific labels.
They are included here if only for the sake of completeness before we move on to
more serious contenders.
Visionary Forecasting
A purely subjective estimate (or guess, or hunch or stab in the dark) made by
one individual is a visionary forecast. Many managers believe themselves to be
good at this. Or they believe that someone else in the organisation is good at it.
Most organisations like to believe that they have a forecasting folk hero tucked
away on the third floor. Sadly, when these forecasts are properly monitored,
the records usually show visionary forecasting to be inaccurate. The reason for
this paradox seems to be that visionary forecasting is judged on its occasional
successes. Everyone remembers the person who, in 1978, predicted political
changes in China; the time the same person stated that the US dollar/pound
sterling exchange rate would never fall below 1.50 is forgotten. Nevertheless,
there are undoubtedly people who are good visionaries, but they are few in
number and their records should be carefully monitored.
Panel Consensus Forecasting
Panel consensus forecasting is probably the most common method used in
business. This refers to the meeting of a group of people who, as a result of
argument and discussion, produce a forecast. One would think that this should
provide good forecasts, bringing together as it does the expertise of several
people. Again, the records suggest otherwise. Panel consensus forecasts are
generally inaccurate. The reason is that the forecasts are dominated by group
pressures. The status, strength of character and vested interests of the partici-
pants all influence the forecast. The full potential of the gathered experience is
not brought to bear and the forecast may turn out to be little different from
that of the strongest personality working alone. Some improvement can be
gained by using structured group meetings in which one person is given the
responsibility for organising the meeting and providing background information.
Brainstorming
Brainstorming is a technique perhaps better known for producing ideas than
generating forecasts. It is based on a group meeting but with the rule that every
suggestion must be heard. No proposal is to be ridiculed or put to one side
without discussion. In forecasting, brainstorming is used first to define the full
range of factors that influence the forecast variable and then to decide upon a
forecast. When properly applied it is a useful technique, but the process can
degenerate into an ill-disciplined panel consensus forecast.
Market Research
Market research also falls within the area of qualitative forecasting. It is an
accurate but expensive technique. This extensive subject involves a large
number of distinct skills such as statistical sampling, questionnaire design and
interviewing. It is more usually described within the subject of marketing and is
mentioned here only for the sake of completeness.
The techniques described in more detail below are potentially effective and
accurate, and fall firmly within the area of qualitative forecasting.
entire group. Even better, each participant will have had the opportunity to readjust
his or her views in response to worthy suggestions from others. The views of
‘cranks’, even persistent cranks, will be largely filtered out by the averaging process.
The onus is on deviationists to defend their views anonymously through the
chairman.
When tested, the Delphi technique has produced good results. But it has some
disadvantages. It can be expensive, especially when the group members are assem-
bled in the same physical place. Also, it is possible to cheat by indulging in some
game-playing. One participant knowing the likely views of other participants can
submit unrealistic forecasts in order that the averaging process works out the way he
or she wants. For example, in an attempt to forecast sales, a financial executive may
substantially understate his or her view so that the optimistic view of the sales
manager is counterbalanced and, as a result, he or she achieves the aim of holding
down stock levels.
The technique can be unreliable, meaning that different groups of people might
well produce different forecasts. The results can also be sensitive to the style of the
questions posed to the group.
Nevertheless, the technique is used in a variety of situations including by Japan’s
Office of Science and Technology to predict the future technological landscape.
Further, it can be combined with more detailed techniques for translating assump-
tions into quantified scenarios. It is particularly useful in the most difficult types of
forecasting, for example, where the time horizon is long and there are many
uncertainties.
By itself, of course, it is an approach rather than what would normally be referred
to as a technique. Its disadvantage is, as recent research has shown, that the act of
describing and publicising a scenario within an organisation increases the chance
that it will happen. Nevertheless, the basic idea is so sound that, in combination
with other techniques, there is really no reason not to use it.
(e) Using the original probabilities and the cross probabilities, the overall likelihoods
of different developments can be calculated. This may involve simulation. For
example, one factor might be the launch of a new product, and the develop-
ments associated with it would be the levels of the success of the launch. Given
the probabilities of all other developments, the relative chances of the successful
launch of the new product can be calculated.
The essence of cross-impact matrices is that they are a means whereby the plan-
ner can juggle with a whole series of uncertain developments and, in particular, their
influence on the others. The cost of the technique may only be justified when the
list of developments is long. In these circumstances the whole process may be
computerised, with formulae being the basis of the adjustment of the probabilities.
How all the probabilities are used is not a part of the technique. They may be used
formally in further calculations or they may be used informally in making judge-
ments about the future. They may well be used as part of scenario writing, in
formulating the most realistic scenarios.
Although a sales example has been used to illustrate these steps, the technique is
at its best when dealing with technological uncertainties. In fact, one of the earliest
reports of its application was in the development of the US Minuteman missile
system.
The advantage of the system is that it provides potential for the difficult task of
dealing with a wide range of complex events and interactions in a relatively straight-
forward manner. Its disadvantages are its expense and the need for the forecaster to
have the capability to handle the probabilities produced.
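The module describes the technique only in outline, so the following sketch should be read as one possible illustration of step (e) rather than a standard implementation; every probability in it, and the way one development adjusts another, is invented for the purpose of the example.

import numpy as np

rng = np.random.default_rng(1)

# Hypothetical developments and their initial probabilities (invented for illustration)
p_competitor_launch = 0.4      # a competitor launches a rival product
p_price_fall = 0.3             # baseline probability that raw-material prices fall

trials = 10000
successful_launches = 0
for _ in range(trials):
    competitor = rng.random() < p_competitor_launch
    # Cross impact: a competitor launch makes a price fall more likely (illustrative figure)
    price_fall = rng.random() < (0.5 if competitor else p_price_fall)
    # The chance that our own launch succeeds depends on both developments (illustrative)
    p_success = 0.7 - 0.2 * competitor + 0.1 * price_fall
    successful_launches += rng.random() < p_success

print(successful_launches / trials)   # overall likelihood of a successful launch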
13.4.4 Analogies
When forecasting a variable for which there is no data record, a second variable
whose history is completely known and which is supposed to be similar to the first
is used as an analogy. Because of conceptual similarities between the two, it is
assumed that as time goes by the data pattern of the first will follow the pattern of
the second. The forecast for the first is the already-known history of the second.
For example, the company forecasting sales of a new product in South East Asia
might choose as an analogy the sales of a previous new product with similar
characteristics marketed in that country or a similar country in the recent past. The
growth record of the previous product is the basis of the forecast for the current
one. The forecast does not have to be exactly the same as the analogy. The record
may be adjusted for level and scatter. For instance, the sales volume of the current
product may be thought to be double that of the previous one and have greater
month-by-month variations. To forecast, the growth record of the analogy would be
doubled and the scatter (or confidence limits) of the monthly forecasts increased.
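A minimal sketch of this kind of adjustment is shown below; the analogy's sales record, the doubling of level and the widening of the limits are all invented figures used only to illustrate the idea.

# Monthly sales record of the analogy product (illustrative figures only)
analogy_sales = [10, 14, 19, 26, 31, 35]
analogy_limits = [2, 2, 3, 3, 4, 4]     # scatter (± limits) observed around the analogy

level_factor = 2.0      # the new product is expected to sell at about twice the level
scatter_factor = 1.5    # and to show greater month-by-month variation

for sales, limit in zip(analogy_sales, analogy_limits):
    forecast = level_factor * sales
    spread = scatter_factor * limit
    print(forecast, "plus or minus", spread)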
The essence of the technique is not that the analogy should be exactly like the
forecast variable but that similarities in the products and the marketing environment
should be sufficient to believe that the data patterns will be comparable.
The advantage of the technique is that it provides a cheap but comprehensive
forecast in a way that makes sense to the marketing managers. The analogy is not
restricted to business. Biological growth can provide the basis for analogies in
business and the social sciences. The underlying philosophy of the technique is that
there may be social laws just as there are natural laws. Although the laws themselves
in, say, marketing may not be fully or even partially understood, data records are the
evidence of the laws and can be used in forecasting.
The main problem with the technique is that there must be at least one but not
too many analogies to choose from. If the situation is totally new to the organisa-
tion, there may not be an analogy. On the other hand, there may be several plausible
analogies and great arguments may develop in deciding the right one to use. For
example, a wine and spirits company was planning the launch of a promising new
product, but it was extremely difficult to decide which of several previous successful
products should be the analogy. All had been successful, yet their growth patterns
differed considerably. The problem was resolved by making a subjective decision in
favour of one but agreeing to monitor the forecast variable’s record closely to see if,
at some point, the marketers should revert to a new analogy.
the ideas of quantum theory. In business the examples may not be so clear-cut, but
there are plenty of possibilities to think about: a sudden take-off in the sales of a
product, a turnaround in a company’s profitability, an instantaneous change in the
price of a commodity.
Catastrophe theory is not a quantitative technique. It does not calculate the ex-
pected size of the jumps. Rather, it is a systematic way of determining whether a
catastrophe is likely in a given situation. The technique comprises a series of
questions to answer and characteristics to look for that will indicate the nature of
the situation being investigated.
Catastrophe theory is relatively new and there is not much in the way of a track
record to judge its success. However, it certainly fills a gap in the range of forecast-
ing techniques and is growing in popularity. The reason for its importance and the
interest it has created is this: companies can usually take emergency action to deal
with continuous changes (whether rapid or not) in a situation. Sudden jumps or
reversions in behaviour, on the other hand, often leave no time for taking evasive
action. The potential of catastrophe theory is that it may be able to predict circum-
stances with which companies have no way of dealing unless they have advance
warning.
[Relevance tree (partial): the highest-level objective is to build a commercially successful airliner; the criteria for achieving it are A Passenger comfort, B Safety, C Cost and D Route capability, with lower-level elements (seating, protection, etc.) at level 3 beneath them.]
(c) Weight the importance of each criterion relative to the others. A group of
experts would presumably have to carry out this task by answering questions
such as: ‘What is the weight of each criterion in achieving the highest level objec-
tive?’ In the airliner example the weights might be assigned:
Weight
A Passenger comfort 0.10
B Safety 0.35
C Cost 0.40
D Route capability 0.15
Total 1.00
(d) Weight the sub-objectives at each level (referred to as the elements of the tree)
according to their importance in meeting each criterion. The question posed
might be: ‘In order to meet criterion C, what is the relative importance of each
element at level 3?’ At each level a set of weights for each criterion must be as-
sessed. For the airliner example, the process might work as in Table 13.2.
The first column in the table, for example, shows the assessed relevance of the
four elements at level 1 to the criterion of comfort. Accommodation is weighted
20 per cent, environment 65 per cent and so on. Since the table gives the relative
relevance of the elements to the criteria, each column must sum to 1. The pro-
cess of assessing relevance weights is carried out for each level of the relevance
tree.
(e) Calculate partial relevance numbers. Each element has a partial relevance
number (PRN) for each criterion. It is calculated:
PRN = Criterion weight × Element weight
It is a measure of the relevance of that element with respect only to that criterion
(hence ‘partial’). For the airliner example the partial relevance numbers are
shown in Table 13.3. For instance, from the table the PRN for accommodation
with respect to
(f) Calculate the local relevance number (LRN) for each element. The LRN for each element is
the sum of the PRNs for that element. It is a measure of the importance of that
element relative to others at the same level in achieving the highest-level objec-
tive. For the airliner example the LRNs are as shown in Table 13.4.
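A small numerical sketch of the PRN and LRN calculations is given below. The criterion weights are those of the airliner example; the element names and most of the element weights are invented for illustration (only the 0.20 and 0.65 figures for accommodation and environment against comfort come from the text), so the resulting LRNs are not those of Table 13.4.

# Criterion weights from the airliner example
criterion_weights = {"comfort": 0.10, "safety": 0.35, "cost": 0.40, "route": 0.15}

# Relevance of each level-1 element to each criterion (each column sums to 1).
# Only the first two comfort figures come from the text; the rest are invented.
element_weights = {
    "accommodation": {"comfort": 0.20, "safety": 0.10, "cost": 0.30, "route": 0.10},
    "environment":   {"comfort": 0.65, "safety": 0.20, "cost": 0.20, "route": 0.20},
    "structure":     {"comfort": 0.10, "safety": 0.50, "cost": 0.30, "route": 0.30},
    "engines":       {"comfort": 0.05, "safety": 0.20, "cost": 0.20, "route": 0.40},
}

for element, weights in element_weights.items():
    # Partial relevance numbers: criterion weight x element weight, one per criterion
    prns = {c: criterion_weights[c] * w for c, w in weights.items()}
    lrn = sum(prns.values())          # local relevance number: the sum of the PRNs
    print(element, round(lrn, 3))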
Learning Summary
The obvious characteristic that distinguishes qualitative from quantitative forecast-
ing is that the underlying information on which it is based consists of judgements
rather than numbers, but the distinction goes beyond this. Qualitative forecasting is
usually concerned with determining the boundaries within which the long-term
future might lie; quantitative forecasting tends to provide specific point forecasts
and ranges for variables in the nearer future. Qualitative forecasting offers tech-
niques that are very different in type, from the straightforward, exploratory Delphi
method to the normative relevance trees. Also, qualitative forecasting is at an early
stage of development and many of its techniques are largely unproven.
Whatever the styles of qualitative techniques, their aims are the same: to use
judgements systematically in forecasting and planning. In using the techniques it
should be borne in mind that the skills and abilities that provide the judgements are
more important than the techniques. Just as it would be pointless to try a quantita-
tive technique with ‘made-up’ numerical data, so it would be folly to use a qualitative
technique in the absence of real knowledge of the situation in question. The
difference is that it is perhaps easier to discern the lack of accurate data than the lack
of genuine expertise.
On the other hand, where real expertise does exist, it would be equal folly not to
make use of it. For long-term forecasting, by far the greater proportion of available
information about a situation is probably in the form of judgement rather than
numerical data. To use these judgements without the help of a technique usually
results in a plan or forecast biased by personality, group effects, self-interest, etc.
Qualitative techniques offer chances to distill the real information from the sur-
rounding noise and refine it into something useful.
In spite of this enthusiasm there is a warning. In essence, most qualitative tech-
niques come down to asking questions of experts, albeit scientifically. Doubts about
the value of experts are well entrenched in management folklore. But doubts about
the questions can be much more serious, making all else pale into insignificance.
Armstrong (1985) quotes the following extract from a survey of opinion by Hauser
(1975).
The lesson must be that the sophistication of the techniques will only be worth
while if the forecaster gets the basics right first.
Review Questions
13.1 There are three types of forecasting technique: short-, medium- and long-term. True or
false?
13.4 Causal modelling is the same as least squares regression analysis. True or false?
13.5 Qualitative forecasting techniques can use numerical data. True or false?
13.8 The intention of scenario writing is to provide several alternative views of the future. It
is not to provide a specific forecast. True or false?
13.9 Which of the following statements about a cross-impact matrix are true?
A. It produces sales forecasts.
B. It gives the probabilities that sales will reach a given level.
C. It can only be applied if an extensive list of possible future developments can be
provided.
D. It reviews the probabilities of these developments under the assumptions that
other developments have taken place.
13.10 A good analogy variable should have had exactly the same values at each point of its
history as that expected for the forecast variable. True or false?
13.11 Catastrophe theory deals with which of the following changes in variables?
A. Discontinuities in behaviour.
B. Steep growth.
C. Falls to disastrously low levels.
13.12 In relevance trees, the partial relevance numbers (PRNs) of an element of the tree is
calculated as which of the following?
A. Sum of the criterion weights.
B. Sum of element weightings.
C. Criterion weight × Element weighting.
D. Sum of criterion weights × Element weighting.
References
Armstrong, J. S. (1985). Long Range Forecasting. New York: John Wiley and Sons.
Hauser, P. M. (1975). Social Statistics in Use. New York: Russell Sage Foundation.
Makridakis, S. G. (1983). Forecasting: Methods and Applications (2nd ed.). New York: John Wiley
and Sons.
Learning Objectives
By the end of the module the reader should know where to use time series methods.
Time series data are distinguished by being stationary or non-stationary. In the latter
case the series may contain one or more of a trend, seasonality or a cycle. The
module describes at least one technique to deal with each type of series.
Technical section: Sections marked with * contain technical material and may
be omitted on a first reading of the module.
14.1 Introduction
Time series methods are forecasting techniques that predict future values of a
variable solely from its own historical record. In various ways they identify patterns
in past data and project them into the future. The methods are categorised accord-
ing to the types of series to which they can be applied. The different types of series
are:
(a) stationary (meaning, roughly, without a trend);
(b) with a trend (a consistent movement upwards or downwards);
(c) with a trend and seasonality (a regular pattern that repeats itself every year);
(d) with a trend, seasonality and cycle (a regular pattern that takes more than a year
to repeat itself).
This module will describe techniques that deal with each of the above types of
data. First, the particular situations in which time series techniques have shown
themselves to be successful will be discussed.
The averaging process is intended to smooth away the random fluctuations in the
series. The forecast for any future time period is the most recent smoothed value. In
the above series the forecast is 18.7 for all periods in the future. A constant forecast
makes sense because the series is stationary.
Seasonal as well as random fluctuations in the data can be smoothed away by
including sufficient points in the average to cover the seasonality. Seasonal monthly
data would be smoothed using a 12-point moving average. Each month is included
once and only once in the average, and thus seasonal variation will be averaged out.
Figure 14.1 shows how exponential smoothing works to average out random fluctua-
tions. As with moving averages, the forecast for future time periods of a stationary
series is the most recent smoothed value, in this case 17.84.
[Figure 14.1: the original series and the exponentially smoothed series plotted against time, showing the random fluctuations smoothed away.]
Since α, and thus 1 − α, lie between 0 and 1, these weightings are decreasing.
For instance, if α = 0.2, the weightings are 0.2, 0.16, 0.128, 0.1024, 0.0819 …
Recent actual values receive heavier weighting than earlier ones.
The smoothing equation derived above illustrates how the weighting works. It is
not intended to be used for calculations.
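The recursive form of the calculation, which is what a package would actually compute, can be sketched in a few lines; the data series here is invented and α is set to 0.2 as in the illustration above.

alpha = 0.2
series = [18, 21, 16, 19, 22, 17]      # illustrative stationary series

smoothed = series[0]                    # start from the first observation
for x in series[1:]:
    smoothed = alpha * x + (1 - alpha) * smoothed   # new smoothed value

print(round(smoothed, 2))   # the forecast for every future period of a stationary series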
More generally, the forecast for m periods ahead, F(t+m), is given by:
F(t+m) = St + m·Tt
To summarise, when a time series has a trend, forecasts with Holt’s Method are
based on three equations:
St = (1 − α)(S(t−1) + T(t−1)) + α·xt
Tt = (1 − γ)·T(t−1) + γ·(St − S(t−1))
F(t+m) = St + m·Tt
where
xt = actual observation at time t
St = smoothed value at time t
α, γ = smoothing constants between 0 and 1
Tt = smoothed trend at time t
F(t+m) = forecast for m periods ahead
Table 14.4 shows how Holt’s Method is applied to an annual series of sales fig-
ures. The series has been shortened in order to simplify the example. The
smoothing constants have values α = 0.2, γ = 0.3.
Year   Actual xt   Smoothed St   Trend Tt
2012 12 12.0 –
2013 15 15.0 3.00
2014 20 18.4 = 0.8(15 + 3) + 0.2 × 20 3.12 = 0.7(3.00) + 0.3(18.4 − 15)
2015 21 21.4 = 0.8(18.4 + 3.12) + 0.2 × 21 3.08 = 0.7(3.12) + 0.3(21.4 − 18.4)
2016 25 24.6 = 0.8(21.4 + 3.08) + 0.2 × 25 3.12 = 0.7(3.08) + 0.3(24.6 − 21.4)
2017 28 27.8 = 0.8(24.6 + 3.12) + 0.2 × 28 3.14 = 0.7(3.12) + 0.3(27.8 − 24.6)
Forecasts
2018 30.94 = 27.8 + 3.14
2019 34.08 = 27.8 + 2 × 3.14
2020 37.22 = 27.8 + 3 × 3.14
The choice of smoothing constants is based on the same principles as for ordi-
nary exponential smoothing.
The calculating process needs a starting point both for the trend and for the
smoothed values. The smoothed values for the first two time periods are taken to be
equal to the actual values. There can be no trend for the first time period. The
smoothed trend for the second time period is taken to be equal to the difference
between the first two actual values.
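Table 14.4 can be reproduced with the sketch below. Small differences from the table arise because the table rounds the smoothed values at each step, whereas the code carries full precision.

alpha, gamma = 0.2, 0.3
sales = {2012: 12, 2013: 15, 2014: 20, 2015: 21, 2016: 25, 2017: 28}
years = sorted(sales)

# Starting values: smoothed values equal the actuals, the trend is the first difference
smoothed = {years[0]: float(sales[years[0]]), years[1]: float(sales[years[1]])}
trend = {years[1]: float(sales[years[1]] - sales[years[0]])}

for year in years[2:]:
    prev = year - 1
    smoothed[year] = (1 - alpha) * (smoothed[prev] + trend[prev]) + alpha * sales[year]
    trend[year] = (1 - gamma) * trend[prev] + gamma * (smoothed[year] - smoothed[prev])

last = years[-1]
for m in (1, 2, 3):                     # forecasts for 2018, 2019, 2020
    print(last + m, round(smoothed[last] + m * trend[last], 2))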
The first three of these elements are then reassembled to make a forecast. The
elements are isolated one by one.
Trend
The trend is isolated by regression analysis between the data and time (see Fig-
ure 14.2), i.e. the observations (xt) are regressed against time (t), where t takes on the
value 1 for the first time period, 2 for the second, 3 for the third and so on.
The regression equation will look like this:
xt = a + bt + ut
where:
xt = Actual data
a + bt = Trend element
ut = Residuals comprising seasonality, cycle and random elements
Cycles
The next step is to isolate any cycle in the data. By choosing a suitable moving
average (12 points for monthly data, four for quarterly, etc.) the random and
seasonal elements can be smoothed away, leaving just the trend and cycle. If St is
such a moving average, then the ratio between St and the trend (a +bt) must be the
cycle. If the ratio St/(a + bt) is approximately one for all time periods then there is
no cycle. If it differs from one with any regular pattern then the ratio should be
inspected to determine the nature of the cycle. For instance, if the ratio is graphed
against time, it might appear as in Figure 14.3. This suggests a cycle of period 12
quarters, or three years. The ratio returns to its starting point after this interval of
time.
[Figure 14.2: the observations xt plotted against time t, with the fitted trend line of intercept a and slope b and the residuals u1, u2, u3, … Figure 14.3: the ratio St/(a + bt) plotted against time in quarters, oscillating about 1 and repeating every 12 quarters.]
Seasonality
Seasonality is isolated by an approach similar to that for cycles. The moving average,
St, comprises trend and cycles; the actual values comprise trend, cycle, seasonality
and random effect. The ratio
xt/St (i.e. Actual/Moving average)
should therefore reflect seasonality and random effect. Suppose the data are
quarterly, then the seasonality for, say, the first quarter is calculated by averaging the
ratios:
Seasonality for first quarter = Average of the ratios xt/St for the first quarter of each year (t = 1, 5, 9, …)
The seasonality for the other three quarters can be calculated similarly. The aver-
aging helps to eliminate the random effect that is contained in the ratios.
In making a forecast, the three isolated elements are multiplied together. Suppose
the forecast for a future quarter, t = 50, is needed. It will be calculated:
Forecast = Trend × Cycle × Seasonality
for t = 50
If the data are quarterly with a cycle of length 12 quarters, t = 50 is the second
period of a cycle and the second period of the seasonality. Therefore:
Forecast = (a + 50b) × Cyclic effect for second period of the cycle
× Seasonal effect for second quarter
Example
The data in Table 14.5 refer to the quarterly shipments of an electrical product
from a company’s warehouse. Make a forecast of shipments in each quarter of
2019 using the decomposition method. (The necessary calculations and data are
set out in full in Table 14.6.)
(a) Calculate the trend. A regression analysis is carried out with the ship-
ments as the y variable and time as the x variable, i.e:
[Figure: the ratio of the moving average to the trend plotted against time (quarters 1 to 33), oscillating between about 0.9 and 1.1 with a cycle of length 12 quarters.]
(c) Calculate the seasonal effect. The seasonality is the ratio of actual to
moving average, averaged for each quarter. Table 14.6 shows these ratios in
column 7 calculated as column 3 divided by column 4. Averaging these ratios
for the first quarter:
Seasonality index quarter 1 = Average of the first-quarter ratios in column 7
= 0.82
The seasonal index is meant to rearrange the pattern within a year, not to
increase the trend. In the above case the trend would be increased by 2.5
per cent each year. The seasonal indices have to be adjusted so that their
average is 1.0. This is done by dividing each index in Table 14.8 by 1.025 to
give the adjusted seasonal indices of Table 14.9.
of 2019 are time periods 5–8 of a cycle. The cyclical effects for these pe-
riods are taken from Table 14.7.
iii. The seasonal effect for each quarter is taken from Table 14.9. The fore-
cast is the product of the three elements. For example, for 2019, Q1:
Forecast = 37.21 × 1.05 × 0.80
= 31.26
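A brief sketch of this final recombination step is given below. The trend, cyclical and unadjusted quarter-1 seasonal figures are those quoted in the example; the other three raw seasonal indices are invented so that the adjustment factor of 1.025 mentioned above can be demonstrated.

# Combining the three elements for 2019 Q1, using the figures from the example
trend, cycle, seasonal = 37.21, 1.05, 0.80
print(round(trend * cycle * seasonal, 2))            # 31.26

# Adjusting seasonal indices so that they average 1.0
# (only the 0.82 quarter-1 index comes from the text; the others are illustrative)
raw_indices = [0.82, 1.06, 1.18, 1.04]
adjustment = sum(raw_indices) / len(raw_indices)     # 1.025, as in the example above
adjusted = [i / adjustment for i in raw_indices]
print([round(i, 3) for i in adjusted])               # quarter 1 becomes 0.80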
(a) Pre-whiten. This is a term coined by Box and Jenkins that can be thought of as
removing the trend from the series. Later, when the forecasts are being calculat-
ed, the trend can be taken out of storage.
(b) Identify. Select those past values and residuals that seem most likely to affect
future values. This is done by looking carefully at the autocorrelation coeffi-
cients to interpret in what way the past affects the future. These coefficients are
the result of correlating the series of residuals with the same series of residuals
lagged by 1, 2, 3, … Interpreting them is a skilled operation that goes beyond the
scope of this module. For example, the result of this step might be:
Forecast for next month = Combination of values for this month and last month
together with this month’s residual
(c) Estimate. Determine each coefficient of those past values and residuals selected
at the identify step. A computer-based algorithm called the method of steepest
descent usually does this, providing a best-fit line. For example, this step might
produce:
Forecast for next month = 0.93 × This month’s value + 0.09 × Last month’s value
−0.24 × This month’s residual
(d) Diagnose. Random residuals suggest the best forecasting model has been found
as signalled by zero, or close to zero, autocorrelations. If the residuals are not
random, other past values or residuals need to be included. The estimation and
diagnosis steps are then repeated.
(e) Forecast. When the diagnostic step reveals random residuals, the equation can
be used to forecast.
[Figure: the Box–Jenkins procedure as a flowchart – pre-whiten, identify, estimate, diagnose; if the diagnosis is not satisfactory, return to the identification of past values and residuals; otherwise, forecast.]
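In practice the whole cycle is handled by statistical software. The fragment below is a rough sketch using the statsmodels library; the data series is invented and the order (1, 1, 1) is simply assumed rather than arrived at by a proper identification step, so it illustrates the mechanics rather than the full Box–Jenkins procedure.

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Illustrative monthly series with a gentle trend plus noise
rng = np.random.default_rng(2)
series = 100 + 0.5 * np.arange(60) + rng.normal(0, 2, 60)

# Differencing (the d term) plays the pre-whitening role; the identify and estimate
# steps are wrapped up in choosing the order and fitting the model
fit = ARIMA(series, order=(1, 1, 1)).fit()

print(fit.forecast(steps=3))   # forecasts for the next three periods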
Learning Summary
In spite of the fact that surveys have demonstrated how effective time series
methods can be, they are often undervalued. The reason is that, since a variable is
predicted solely from its own historical record, the methods have no power to
respond to changes in business or company conditions. They work on the assump-
tion that circumstances will be as in the past.
Nevertheless, their track record is good, especially for short-term forecasting. In
addition, they have one big advantage over other methods. Because they work solely
from the historical record and do not necessarily require any element of judgement or
forecasts of other causal variables, they can operate automatically. For example, a large
warehouse, holding thousands of items of stock, has to predict future demands and
stock levels. The large number of items, which may be of low unit value, means that it
is neither practicable nor economic to give each variable individual attention. Time
series methods will provide good short-term forecasts by computer without needing
managerial attention. Of course, initially some research would have to be carried out,
for instance, to find the best overall values of smoothing constants. But, once this
research was done, the forecasts could be made automatically. All that would be
needed would be the updating of the historical record as new data became available.
Especially with a computerised stock system, this should cause little difficulty.
The conclusion is therefore not to underestimate time series methods. They have
advantages in cost and, in the short term, in accuracy over other methods.
Review Questions
14.1 Which of the following statements about time series forecasting methods are true?
They:
A. relate to stationary series only.
B. are based on regression analysis.
C. make forecasts of a variable from its own historical record.
D. are able to forecast without manual intervention.
14.4 The difference between moving averages (MA) and exponential smoothing (ES) is that
MA computes a smoothed value by giving equal weights to past data values, whereas ES
allows the forecaster to choose the weights. True or false?
Questions 14.5 and 14.6 refer to the following data:
Period 1 2 3 4 5 6 7 8
Value 7 9 8 10 12 11 7 10
14.5 The three-point moving average forecast for time period 9 is:
A. 9.3
B. 9
C. 10
D. 8.5
14.8 In the Holt–Winters Method, the seasonality at any point is measured as:
A. Trend/Actual
B. Trend/Smoothed
C. Actual/Trend
D. Actual/Smoothed
14.9 As part of the decomposition method, a time series xt is regressed against time (t = 0,
1, 2, 3, …). The result is:
14.11 A graph relating cyclical effect to time is as shown below. The length of the cycle is thus:
Cycle
10 20 30 40
Quarter
A. 10 years
B. 2.5 years
C. 20 years
D. 5 years
14.12 In an application of the decomposition method on quarterly data, the following have
been calculated:
Monthly demand
2016 2017
Oct. Nov. Dec. Jan. Feb. Mar. Apr. May June July Aug.
2000 1350 1950 1975 3100 1750 1550 1300 2200 2775 2350
Period Demand
2016 1 140
2 155
3 155
4 170
2017 1 180
2 170
3 185
4 190
There is a clear trend in the data. There may be seasonality, but the record is too short
to be sure. The best method therefore appears to be Holt’s Method. Make a forecast
for total demand in 2018 using this technique with α = 0.2 and γ = 0.3.
calculation to combine these figures into the required statements. Keith’s knowledge of
McClune and his familiarity with the industry made him confident of his ability to predict
the company’s monthly operating expenses and the time delay in cash receipts. He could
do this by taking average values over past years while adjusting for inflation and any one-
off occurrences.
The difficult part of Keith’s task was that of forecasting whisky sales for the next 12
months. He could, of course, take McClune’s sales figures and use some forecasting
technique to predict ahead, but he was concerned that operating difficulties had made
the McClune data unreliable. On the other hand, he could obtain a monthly forecast of
sales for the whole Scotch whisky industry (from the Scotch Whisky Association in
London, to which McClune belonged) and then apply projected market share figures for
McClune to obtain the forecasts he was seeking. Although this approach might have
spin-off benefits in providing input to McClune’s market research projects, he was
concerned that the estimate of market share could involve errors of a size that would
negate the work put into the industry forecasts. He decided on the first approach of
using McClune’s own historical sales records. Keith’s first step was to obtain the sales
data. McClune’s files held sales figures going back over a hundred years, but Keith
decided that only data from 2006 onwards was relevant. During 2004 two strikes
outside the company had affected McClune’s business and the sales records for 2004
and 2005 were particularly unreliable. From 2006 onwards there were no such major
effects and Keith was able to make satisfactory adjustments for the operating difficulties
during the period. These adjusted historical sales figures (in volume terms) are present-
ed in Exhibit 1 (see Table 14.12).
Keith plotted the historical data and obtained the graph shown in Exhibit 2 (see
Figure 14.6), showing a strong seasonal pattern, consistent from year to year, as well as
a clear trend. Given this pattern he took the view that short-term forecasts could best
be obtained from a time series technique. He set out to consider which technique might
be best.
Year Month Sales Time Year Month Sales Time Year Month Sales Time
2007 3 8.67 15 2010 3 8.09 51 2013 3 10.84 87
2007 4 9.26 16 2010 4 9.45 52 2013 4 11.83 88
2007 5 10.55 17 2010 5 10.14 53 2013 5 12.68 89
2007 6 9.17 18 2010 6 11.17 54 2013 6 12.33 90
2007 7 8.66 19 2010 7 9.29 55 2013 7 11.72 91
2007 8 4.45 20 2010 8 4.36 56 2013 8 4.20 92
2007 9 9.10 21 2010 9 11.75 57 2013 9 15.06 93
2007 10 11.32 22 2010 10 15.31 58 2013 10 17.66 94
2007 11 15.23 23 2010 11 22.94 59 2013 11 24.92 95
2007 12 18.02 24 2010 12 28.67 60 2013 12 32.06 96
2008 1 5.87 25 2011 1 9.04 61 2014 1 11.00 97
2008 2 6.19 26 2011 2 10.01 62 2014 2 9.02 98
2008 3 8.34 27 2011 3 11.41 63 2014 3 11.58 99
2008 4 8.91 28 2011 4 10.82 64 2014 4 12.11 100
2008 5 9.05 29 2011 5 12.57 65 2014 5 11.68 101
2008 6 9.98 30 2011 6 11.83 66 2014 6 13.44 102
2008 7 6.26 31 2011 7 8.91 67 2014 7 10.87 103
2008 8 3.98 32 2011 8 4.61 68 2014 8 3.62 104
2008 9 7.24 33 2011 9 13.21 69 2014 9 14.87 105
2008 10 13.18 34 2011 10 17.39 70
2008 11 15.88 35 2011 11 27.33 71
2008 12 22.90 36 2011 12 35.21 72
(Figure 14.6 (Exhibit 2): McClune monthly sales plotted against time in months, showing the strong seasonal pattern and rising trend described above.)
Managing Forecasts
Contents
15.1 Introduction.......................................................................................... 15/1
15.2 The Manager’s Role in Forecasting.................................................... 15/2
15.3 Guidelines for an Organisation’s Forecasting System ..................... 15/4
15.4 Forecasting Errors ............................................................................. 15/13
Learning Summary ....................................................................................... 15/15
Review Questions ......................................................................................... 15/17
Case Study 15.1: Interior Furnishings ........................................................ 15/19
Case Study 15.2: Theatre Company........................................................... 15/19
Case Study 15.3: Brewery ............................................................................ 15/19
Prerequisite reading: None, apart from for Case Study 15.3, which draws on material
in Module 11–Module 14
Learning Objectives
The purpose of this module is to describe what managers need to know if they are
to use forecasts in their work. It is stressed that forecasting should be viewed as a
system, not a technique. The system needs to be managed and it is here that the
manager’s role is crucial. The parts of it that fall within a manager’s sphere rather
than that of the forecasting expert are discussed in some detail. Some actual and
costly mistakes in business forecasting will demonstrate the crucial nature of the
manager’s role. By the end, readers should know how to use the forecasting
techniques described in previous modules effectively in their organisations.
15.1 Introduction
The techniques of business forecasting have been the topic of previous modules.
Causal modelling was the subject of Module 11 and Module 12, qualitative tech-
niques of Module 13 and time series methods of Module 14. How can this armoury
best be used in an organisation? If it is to be used effectively, the manager’s role is a
vital one. Forecasting should be viewed as a system of which techniques are just a
part. Experts can usually be called upon to look after the techniques, but the whole
system, if it is to function properly, needs the input of management skill and
knowledge. Indeed, a review of some of the expensive losses made by organisations
because of forecasting errors reveals that the errors usually occurred because of a
lack of management skills rather than statistical skills.
This module is about managing the experts and their techniques so that forecast-
ing makes a real contribution in business. It discusses the manager’s role, gives nine
guidelines for developing a forecasting system and describes some major forecasting
mistakes that have been made.
because the non-specialists are in the best position to forge the link between
techniques and decisions. More than this, since they are going to use the forecasts,
they should have confidence that they are reliable and valuable.
Wherever responsibility rests, the development of a forecasting system is, in
larger organisations, usually a team activity. Typically, the team members will include
a forecasting practitioner, a representative of the user department and a financial
expert, although the exact composition inevitably depends upon individual circum-
stances. In the smaller organisations the forecasting may be done by one person in
whom must be combined all the team’s expertise.
In a team, the role of the practitioner or specialist is reasonably clear. The roles
of the other team members include such things as facilitating access to the user
department and providing data, but much more importantly they must include
responsibility for ‘managing’ the forecasts. This means ensuring that resources (the
experts and techniques) are properly applied to objectives (the intended uses of the
forecasts). In carrying this out, it is essential to view forecasting as a system and not
just as a technique. While the specialist is considering the statistical niceties of the
numbers being generated, the ‘manager’ should be considering the links with the
rest of the organisation: what is the decision-making system that the forecasts are to
serve? Is the accuracy sufficient for the decisions being taken? Are the forecasts
being monitored and the methods adjusted? And so on. In short, the specialist takes
a narrow view of the technique but the manager takes a broad view of the whole
forecasting system. The role of managing the system frequently, and properly, falls to
a general manager in the user department. It is the most vital role in the whole
forecasting process.
This recommended broad view can be broken down into three distinct areas.
They indicate the knowledge with which a manager needs to be equipped in order to
play an effective part in the system.
1. Being aware of the range of techniques available. A specialist may have a
‘pet’ technique. The manager should know the full spectrum of techniques at a
general level of detail so that he or she can make at least an initial judgement on
their applicability to the situation. Such knowledge will also increase confidence
and credibility when the manager is taking part in discussions with specialists.
2. Incorporating forecasts into management systems. This is the essence of the
manager’s role. There is a checklist (described later) of things that should be
done to integrate a forecasting process with the rest of the organisation.
3. Knowing what is likely to go wrong. Many organisations have made forecast-
ing errors in the past. Most of these errors have one thing in common: they are
sufficiently simple that, with hindsight, it seems remarkable that they could have
been made. Yet the mistakes were made and they are a source of valuable infor-
mation for the present.
Of these three areas, the first, knowing the range of techniques, has already been
covered. The next two are the subjects of the following sections.
tested to make forecasts for the period B to A. For this period, forecasts and actual
can be compared. The technique giving the most accurate forecasts (as measured by
the MSE or MAD) is taken as being the best. When forecasting for the future (time
periods beyond A), all the data up to A are used.
This method is an independent way of testing accuracy. It is independent in the
sense that two separate sets of data, up to B and from B to A, were used to forecast
and to check the forecasts respectively. Contrast this with the use of the correlation
coefficient to measure accuracy for causal methods. In this case the closeness of fit
is a comparison between a set of data and a statistical model (the regression equa-
tion) constructed on exactly the same set of data.
(Figure: actual and forecast values plotted against time, with the historical data split at B and the hold-out test period running from B to A.)
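A minimal sketch (assuming Python; the actual and forecast figures are invented) of this hold-out comparison, scoring the forecasts made for the period B to A with the MAD and the MSE:

```python
# Minimal sketch: compare hold-out forecasts with actual values using
# MAD and MSE. The actual and forecast figures are illustrative only.
actuals   = [104, 99, 108, 112, 107, 115]   # observed values from B to A
forecasts = [101, 103, 105, 110, 111, 113]  # forecasts fitted on data up to B

errors = [a - f for a, f in zip(actuals, forecasts)]
mad = sum(abs(e) for e in errors) / len(errors)   # mean absolute deviation
mse = sum(e ** 2 for e in errors) / len(errors)   # mean square error
print(f"MAD = {mad:.2f}, MSE = {mse:.2f}")
```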
views. The second is to use this consensus to make adjustments to the forecasts that
have already been derived by other means.
The first task draws on qualitative forecasting techniques. Two of them, struc-
tured groups and the Delphi method, are generally helpful in this context. Obtaining
qualitative forecasts by the Delphi method is a similar exercise to that of bringing
together judgements with a view to adjusting quantitatively derived forecasts.
However, there is a significant difference. The Delphi method requires participants
to alter their opinions in coming to a forecast. When a statistical forecast is already
on the table, participants may be reluctant to change their views. The element of
competition that is inherent in the Delphi method may be more apparent when the
aim is to adjust an existing forecast rather than to form a forecast from scratch. The
participants may see themselves as part of a bargaining process rather than a
scientific technique. Experience shows that a consensus is hard to achieve in these
circumstances. This makes the next task, adjusting the forecasts, even more difficult.
There is no way to adjust the forecasts apart from a process of discussion and,
eventually, agreement. The control on this process is that participants must be
accountable for the decisions they make. The discussion must be minuted. At a later
stage, not only the forecasts themselves but also the adjustment of forecasts should
be monitored. If the minutes reveal that any participant in the adjustment process
was insisting on a view that turned out to be incorrect, he or she will have to explain
why. This may act as a deterrent to game-playing. More importantly, a
record will be built up over time so that the people whose views are consistently
correct will be evident. Moreover, people whose judgements have consistently
proved misguided, prejudiced or the product of vested interest will also be revealed.
As time goes by the adjustment process should reflect more and more the track
records of participants rather than the strength with which they hold their opinions.
Of course, the process is far from foolproof: monitoring may deter some people
from proffering their opinion; track records will mean little if there is a rapid
turnover in participants; time is needed to build up track records; special cases will
always be argued; most things can be explained away if one is clever enough.
Nevertheless, the balance must be in favour of allowing judgements to be incorpo-
rated. Participation is better than non-participation. The alternative to allowing
judgements to influence forecasts, with all the risks entailed, is to leave people who
are intimately affected by the forecasts feeling that they are outside the system. Their
positive input will not be available; a negative approach on their part may cause the
system to fail.
those affected by the forecasts. Without clear communication it can prove very
difficult for participants to agree on the problems, never mind the solutions.
The lack of communication can be seen in the polarised attitudes of the two
major groups involved: the users and the producers of the system. The former
wonder how such theoretically minded experts can help practically minded manag-
ers; the latter wonder why so many people have yet to come out of the Stone Age
and into the age of modern management. When such conflicting views exist, the
two sides are almost impossible to reconcile. The situation can be avoided by
ensuring that users are concerned with the project from the outset. ‘Us and them’
feelings will then be minimised. The first stage of the design process, analysing the
decision-making system, provides the ideal starting point for this cooperation.
User involvement is therefore a precondition for implementation. Without it, the
steps suggested below can have little impact. Several research studies support the
need for user involvement. In particular, it has been shown that involvement in
which the user feels that he or she has some influence on the design is an important
prerequisite of success. With this proviso, the implementation stage should set out
to answer four questions.
range of problems to all participants. But there may be disagreements. For instance,
there may be some doubt as to whether operators of the old manual system are the
right people to operate the new system. Changes in budgets are another common
source of dispute.
Continuing the consultations, perhaps supplemented with structured group meet-
ings, the project leader hopes to obtain a consensus on all these issues. It is
becoming more and more apparent that a forecasting expert needs behavioural skills
just as much as quantitative ones.
Some of these problems had already been provided for. The users were eventual-
ly convinced that meetings to incorporate judgements into forecasts allowed them
control over forecasts as before, but with the advantage of the meetings being more
soundly based on a statistical forecast. The other problems could not, however, be
discarded. To cope with the added management burden, the producers of the
forecasts had to agree to a substantial simplification of the printout and a tailoring
of the output to individual users’ needs. The users’ lack of confidence both in
themselves and in the producers proved yet more difficult to solve. Eventually a
series of experiments was agreed. The system would be introduced in one depart-
ment at a time. Meanwhile, the other departments would continue to operate the
manual system. A strict feedback procedure would allow the users to learn from the
experience of others and to make necessary alterations as the implementation
proceeded. As a result, implementation was a lengthy process (and emotionally
draining for the project leader), but it led to an end product that was worth waiting
for: a forecasting system that worked.
Of course, all situations differ and there is a need for flexibility on all sides. One
factor, however, is always a great bonus. This is the case when one of the users is
sufficiently convinced of the need for a new system or sufficiently enthusiastic about
the new technology and techniques that he or she will take on a leadership role. If
the users are being motivated by one of their own side rather than a seemingly
distanced expert, many of the difficulties simply do not arise. This factor is now
seen as being increasingly important in the implementation of new technology,
whether related to forecasting or not. Indeed, all the ideas suggested in this section
go far beyond the implementation of forecasting. They apply to the implementation
of any new methods, techniques or practices.
Monitoring should not be the occasional lightning audit to catch people out but a
continual flow of information.
Monitoring will, therefore, stem from regular reports comparing the actual data,
as they become known, with the forecast. The reports will include purely statistical
measures of accuracy, similar to those used in assessing the accuracy of techniques
ex post. Thus, the mean absolute deviation (MAD) and mean square error (MSE) will
be used to indicate the average level of accuracy over the time period in question.
Beyond this, general summary information is required possibly on a management-
by-exception basis. Times of exceptional accuracy or inaccuracy should be reported
with, where possible, reasons for the deviation. For instance, the report might say
that the third-quarter forecasts were inaccurate, just as they were in the last two
years, or that the third-quarter forecasts were inaccurate because of the special
circumstance of, say, a strike.
If the system allows for forecasts to be altered to reflect judgements, then, in
addition to these frequent monitoring reports, there must also be less regular reports
assessing the performance of the ‘judges’. Track records must be compiled showing
which judges have been accurate and which inaccurate. Even if a particular judge’s
views have not been included so far, they may be sometime in the future and it will
be useful to have his or her track record to hand.
Among all this statistical data it is worth remembering that it is not always the
forecasts that are closest to the actual data that are the best. This paradox comes
about as follows. In one important sense the best forecasts are those that have
credibility with the users. If they have credibility, then notice will be taken of them
and management action will follow. This action may cause the originally recorded
forecasts to appear inaccurate. Their true accuracy can, of course, never be known
because conditions have changed (by the taking of management action). According-
ly, a comprehensive monitoring system should go beyond numerical data and
consider perceptions of value and success. In other words, the users will be ques-
tioned from time to time and their opinions of the strengths and weaknesses of the
system solicited. A moderately accurate but functioning system is preferable to a
highly accurate but never used system.
For example, the senior management of a manufacturing company could not
understand why production planning decisions were so poor. A check on the
demand forecasting system that supported the decisions revealed that the forecasts
were highly accurate. A check on the way decisions were made uncovered the
surprising information that the forecasts were never used by the production
planners. They used their own ‘seat of the pants’ judgements as forecasts. The
reason was apparently that the system had not been properly implemented. Liaison
between producers and users had been very poor and the computer-based system
had never been explained to those receiving its output. As a result the planners felt
threatened and isolated by the system and ignored it. If less effort had been chan-
nelled into obtaining high levels of accuracy and more into making sure users knew
what to do with the forecasts, the overall benefits to the organisation would have
been considerable.
In this example the failure of the forecasting system naturally gives cause for
concern. Just as disturbing must be the fact that no one was aware of its failure until
the evidence of the bad decisions started to roll in. Such poor communication is a
surprisingly frequent occurrence. In more organisations than would care to
acknowledge it, producers and users of forecasting systems (or other management
aids) hold diametrically opposed views about the success or failure of the project.
The producers think it a success; the users think it a failure. The producers do not
go to talk to the users because they think all is fine; the users think the non-
appearance of the producers is because they are too ashamed. If the two sides do
meet, the users are too polite to say what they say between themselves; the produc-
ers think the users’ faint praise is because they begrudge them their success. Such
situations reinforce the view that survey checks conducted by an independent body
a short while after a system has become operational are a necessity, not a luxury.
In summary, there is nothing very sophisticated about the monitoring of fore-
casting performance. Its essentials are the recording of comprehensive data, both
quantitative and qualitative, together with a willingness to face facts and act upon
them. Perhaps this is why so few organisations monitor their forecasts. The excite-
ment of forecasting, such as it is, lies in the techniques. There is no excitement in
tedious data evaluations, and therefore, some might think, monitoring must be an
unimportant part of the process.
(Figure: the series plotted over the period 1968–1971.)
Consequently, passenger miles flown increased at the time when the model was predicting
a decrease. There was no causal link between the variables, so, when circumstances
altered, the model no longer held good.
The second mistake was that in order to forecast passenger miles flown, the
airline had first to forecast the Index of Manufacturing Production. This in itself
was no trivial matter. A direct attack on passenger miles flown would have carried
an equal chance of success while saving time and effort. Where forecasts of eco-
nomic variables are needed, ones that are readily available should be chosen. Good
forecasts of some economic variables, such as gross domestic product, personal
disposable income and others, are published regularly by a variety of econometric
forecasting institutes.
These two examples show how easily major mistakes can be made. More espe-
cially, they show that the role of the forecasting non-specialist in supervising a
forecasting effort or as part of a forecasting team is of vital importance. Mistakes are
usually non-technical in nature. There is no guaranteed means of avoiding them, but
it is clearly the responsibility of the ‘managers’ in the team to guard against them.
The lessons other organisations have learned the hard way can help them in their
task.
Learning Summary
Managers have a clear role in ‘managing’ forecasts. But they also have a role as
practitioners of forecasting. The low cost of a very powerful computer means that it
is not a major acquisition; software and instruction manuals are readily available.
With a small investment in time and money, managers, frustrated by delays and
apparent barriers around specialist departments, take the initiative and are soon
generating forecasts themselves. They can use their own data to make forecasts for
their own decisions without having to work through management services or data
processing units.
This development has several benefits. The link between technique and decision
is made more easily; one person has overall understanding and control; time is
saved; re-forecasts are quickly obtained. But, of course, there are pitfalls. There may
be no common database, no common set of assumptions within an organisation.
For instance, an apparent difference between two capital expenditure proposals may
have more to do with data/assumption differences than with differences between
the profitabilities of the projects. Another pitfall is in the use of statistical techniques
that may not be as straightforward as the software manual suggests. The use of
techniques by someone with no knowledge of when they can or cannot be applied is
dangerous. A time series method applied to a random data series is an example. The
computer will always (nearly always) give an answer. Whether it is legitimate to base
a business decision on it is another matter.
However, it is with management aspects of forecasting that this module has
primarily been concerned. It has been suggested that this is an area of expertise too
often neglected and that it should be given more prominence. Statistical theory and
techniques are of course important as well, but the disproportionate amounts of
time spent studying and discussing them give a wrong impression of their im-
portance relative to management issues.
In particular, the topics covered as steps 7–9 in the guidelines – the incorporation
of judgements, implementation and monitoring – are given scandalously little
attention within the context of forecasting. This is generally true, whether books,
courses, research or the activities of organisations are being referred to. A moment’s
thought demonstrates that this is an error. If a forecasting technique is wrongly
applied, good monitoring will permit it to be adjusted speedily: the situation can be
retrieved. If judgements, implementation or monitoring are badly done or ignored,
communication between producers and users will probably disappear and the
situation will be virtually impossible to retrieve.
Why should these issues be held in such low regard? Perhaps the answer lies in
the widespread attitude that says that a manager needs to be taught statistical
methods but that the handling of judgements, implementation and monitoring are
matters of instinct that all good managers have. They are undoubtedly management
skills, but whether they are instinctive is another matter. Whatever the reason, the
effect of this inattention is almost certainly a stream of failed forecasting systems.
How can the situation be righted? A different attitude on the part of all con-
cerned would certainly help, but attitudes are notoriously hard to change. A long-
term yet realistic approach calls for more information. Comparatively little is known
about these management aspects. If published reports and research on the manage-
ment of forecasting were as plentiful as they are on technical aspects, a great
improvement could be anticipated.
Even so, the best advice of all is probably to avoid forecasting. Sensible people
should only use forecasts, not make them. The general public and the world of
management judge forecasts very harshly. Unless they are exactly right, they are
failures. And they are never exactly right. This rigid and unrealistic test of forecast-
ing is unfortunate. The real test is whether the forecasting is, on average, better than
the alternative, which is often a guess, frequently not even an educated one.
A more positive view is that the present time is a particularly rewarding one to
invest in forecasting. The volatility in data series seen since the mid-1970s puts a
premium on good forecasting. At the same time, facilities for making good forecasts
are now readily available in the form of a vast range of techniques and wide choice
of relatively cheap computers. With the latter, even sophisticated forecasting
methods can be applied to large data sets. It can all be done on a manager’s desktop
without the need to engage in lengthy discussions with experts in other departments
of the organisation.
Whether the manager is doing the forecasting in isolation or is part of a team, he
or she can make a substantial contribution to forward planning. To do so, a system-
atic approach to forecasting using the nine guidelines and an awareness of the
hidden traps will serve that manager well.
Review Questions
15.1 A manager using forecasts should view forecasting as a system, whereas a forecasting
expert should view forecasting as a series of techniques. True or false?
15.2 Which department should, preferably, take overall responsibility for the development of
a production planning forecasting system?
A. Production.
B. Data processing.
C. Financial planning.
D. Management services.
15.4 The purpose of developing a conceptual model of the forecasts is to consider what
variables might be incorporated in a causal model. True or false?
Questions 15.5 to 15.8 refer to the data in Table 15.2 showing a three-point
moving average forecast.
15.7 What is the MAD (mean absolute deviation) of the forecast errors for periods 4–6?
A. 3.7
B. 1.0
C. 3.3
D. 14.8
E. 1.2
15.8 What is the MSE (mean square error) of the forecast errors for periods 4–6?
A. 22.2
B. 14.8
C. 6.7
D. 3.9
15.9 In comparing two forecasting methods, the MAD and the MSE are calculated for both
methods. In what circumstances could the following situation arise?
The MSE for method 1 is lower than the MSE for method 2; the MAD for method 1 is
higher than the MAD for method 2.
A. The situation could never arise.
B. Method 1 is superior to method 2.
C. Method 2 is superior to method 1.
D. On average method 1 is superior, but the method leaves some very large
errors.
E. On average method 2 is superior, but the method leaves some very large
errors.
15.10 Why was the method of testing forecasting accuracy that was based on keeping some of
the historical data to one side described as an independent test?
A. It can be used for any forecasting method.
B. It considers forecast errors of more than one step ahead.
C. The coefficients in the forecasting model are estimated with data separate from
that on which accuracy is measured.
15.12 Although most participants in a forecasting system can agree on the likely problems in
implementing it, it is very difficult to obtain a consensus on the solutions that should be
adopted. True or false?
issues, and, therefore, even if such software is not available, it is still possible to tackle many
of the questions posed.
A large company whose main business is the production of alcoholic beverages wants to
forecast the sales of a particular brand of beer. The organisation is diversified and has
many other products. The terms of reference are to forecast the UK sales volumes for
the brand. The forecasts are to be used as the basis for short- to medium-term
production and marketing decisions.
Beer, of course, takes a relatively short time (a matter of weeks) to produce and
distribute. Stock levels are kept low partly because the product can deteriorate, albeit
slowly, and partly because the costs of holding stocks are high compared to production
costs. The need for low stock levels, together with fierce competition from other
producers, has meant that it is imperative to respond quickly, in terms of both produc-
tion and marketing, to changes in demand. This is the primary reason for this forecasting
project.
Substantial amounts of data are available. The most important data are the brand’s sales
volumes (quarterly) going back several years. Furthermore, the data are believed to be
accurate and reliable. Data for the most recent 15 years are shown in Table 15.4. They
are in unspecified units and have been rounded for ease of manipulation.
Wider data concerning the product as well as other general data about the industry and
the economic climate are also available. Table 15.5 shows some of these data.
Table 15.5 GDP (in index form) and advertising data 2009–17
GDP Advertising expenditure
Quarter Quarter
Year 1 2 3 4 1 2 3 4
2009 93.5 93.4 94.3 96.2 1.4 1.8 1.9 1.4
2010 95.7 96.0 96.6 97.6 1.3 1.9 2.0 1.7
2011 98.3 100.5 100.6 100.7 1.5 1.8 2.3 1.5
2012 100.2 104.2 102.7 103.1 1.6 2.3 2.4 1.7
2013 102.4 100.5 98.7 98.5 1.7 2.0 2.8 1.9
2014 98.3 97.4 98.5 99.6 1.7 2.2 3.0 1.9
2015 100.2 100.4 100.5 101.7 1.8 2.4 3.0 2.2
2016 103.3 103.1 103.7 103.3 1.6 2.5 3.1 2.2
2017 103.9 103.7 104.2 104.0 1.8 2.7 3.0 2.3
Approach this forecasting problem by using the nine-point checklist for developing a
forecasting system. Specifically:
a. The main decisions that the forecasts are to serve are:
i. production levels (monthly);
ii. distribution quantities (quarterly);
iii. marketing action (quarterly).
All have a time horizon of no more than one year. Suggest what other related deci-
sions are likely to be affected by the forecasts.
b. What forecasts are required, in terms of timing and accuracy?
c. Suggest a conceptual model for making forecasts.
d. What restrictions on data availability might there be?
e. Which techniques are appropriate in this situation?
f. Test the accuracy of:
i. a causal model relating sales volume to GDP and/or advertising expenditure;
ii. a Holt–Winters time series model.
g. How could judgement be incorporated into the forecasts?
h. What are the likely implementation problems and how might they be resolved?
i. How should the forecasts be monitored?
References
Jenkins, G. (1979). Practical Experiences with Modelling and Forecasting Time Series. Gwilym
Jenkins and Partners (Overseas) Ltd.
Statistical Tables
Table A1.1 Binomial Distribution Tables
p
n r 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50
1 0 0.9500 0.9000 0.8500 0.8000 0.7500 0.7000 0.6500 0.6000 0.5500 0.5000
1 0.0500 0.1000 0.1500 0.2000 0.2500 0.3000 0.3500 0.4000 0.4500 0.5000
2 0 0.9025 0.8100 0.7225 0.6400 0.5625 0.4900 0.4225 0.3600 0.3025 0.2500
1 0.0950 0.1800 0.2550 0.3200 0.3750 0.4200 0.4550 0.4800 0.4950 0.5000
2 0.0025 0.0100 0.0225 0.0400 0.0625 0.0900 0.1225 0.1600 0.2025 0.2500
3 0 0.8574 0.7290 0.6141 0.5120 0.4219 0.3430 0.2746 0.2160 0.1664 0.1250
1 0.1354 0.2430 0.3251 0.3840 0.4219 0.4410 0.4436 0.4320 0.4084 0.3750
2 0.0071 0.0270 0.0574 0.0960 0.1406 0.1890 0.2389 0.2880 0.3341 0.3750
3 0.0001 0.0010 0.0034 0.0080 0.0156 0.0270 0.0429 0.0640 0.0911 0.1250
4 0 0.8145 0.6561 0.5220 0.4096 0.3164 0.2401 0.1785 0.1296 0.0915 0.0625
1 0.1715 0.2916 0.3685 0.4096 0.4219 0.4116 0.3845 0.3456 0.2995 0.2500
2 0.0135 0.0486 0.0975 0.1536 0.2109 0.2646 0.3105 0.3456 0.3675 0.3750
3 0.0005 0.0036 0.0115 0.0256 0.0469 0.0756 0.1115 0.1536 0.2005 0.2500
4 0.0000 0.0001 0.0005 0.0016 0.0039 0.0081 0.0150 0.0256 0.0410 0.0625
5 0 0.7738 0.5905 0.4437 0.3277 0.2373 0.1681 0.1160 0.0778 0.0503 0.0312
1 0.2036 0.3280 0.3915 0.4096 0.3955 0.3602 0.3124 0.2592 0.2059 0.1562
2 0.0214 0.0729 0.1382 0.2048 0.2637 0.3087 0.3364 0.3456 0.3369 0.3125
3 0.0011 0.0081 0.0244 0.0512 0.0879 0.1323 0.1811 0.2304 0.2757 0.3125
4 0.0000 0.0004 0.0022 0.0064 0.0146 0.0284 0.0488 0.0768 0.1128 0.1562
5 0.0000 0.0000 0.0001 0.0003 0.0010 0.0024 0.0053 0.0102 0.0185 0.0312
7 0 0.6983 0.4783 0.3206 0.2097 0.1335 0.0824 0.0490 0.0280 0.0152 0.0078
1 0.2573 0.3720 0.3960 0.3670 0.3115 0.2471 0.1848 0.1306 0.0872 0.0547
2 0.0406 0.1240 0.2097 0.2753 0.3115 0.3177 0.2985 0.2613 0.2140 0.1641
3 0.0036 0.0230 0.0617 0.1147 0.1730 0.2269 0.2679 0.2903 0.2918 0.2734
4 0.0002 0.0026 0.0109 0.0287 0.0577 0.0972 0.1442 0.1935 0.2388 0.2734
5 0.0000 0.0002 0.0012 0.0043 0.0115 0.0250 0.0466 0.0774 0.1172 0.1641
6 0.0000 0.0000 0.0001 0.0004 0.0013 0.0036 0.0084 0.0172 0.0320 0.0547
7 0.0000 0.0000 0.0000 0.0000 0.0001 0.0002 0.0006 0.0016 0.0037 0.0078
8 0 0.6634 0.4305 0.2725 0.1678 0.1001 0.0576 0.0319 0.0168 0.0084 0.0039
1 0.2793 0.3826 0.3847 0.3355 0.2670 0.1977 0.1373 0.0896 0.0548 0.0312
2 0.0515 0.1488 0.2376 0.2936 0.3115 0.2965 0.2587 0.2090 0.1569 0.1094
3 0.0054 0.0331 0.0839 0.1468 0.2076 0.2541 0.2786 0.2787 0.2568 0.2188
4 0.0004 0.0046 0.0185 0.0459 0.0865 0.1361 0.1875 0.2322 0.2627 0.2734
5 0.0000 0.0004 0.0026 0.0092 0.0231 0.0467 0.0808 0.1239 0.1719 0.2188
6 0.0000 0.0000 0.0002 0.0011 0.0038 0.0100 0.0217 0.0413 0.0703 0.1094
7 0.0000 0.0000 0.0000 0.0001 0.0004 0.0012 0.0033 0.0079 0.0164 0.0313
8 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0002 0.0007 0.0017 0.0039
20 0 0.3585 0.1216 0.0388 0.0115 0.0032 0.0008 0.0002 0.0000 0.0000 0.0000
1 0.3774 0.2702 0.1368 0.0576 0.0211 0.0068 0.0020 0.0005 0.0001 0.0000
5 0.0022 0.0319 0.1028 0.1746 0.2023 0.1789 0.1272 0.0746 0.0365 0.0148
6 0.0003 0.0089 0.0454 0.1091 0.1686 0.1916 0.1712 0.1244 0.0746 0.0370
7 0.0000 0.0020 0.0160 0.0545 0.1124 0.1643 0.1844 0.1659 0.1221 0.0739
8 0.0000 0.0004 0.0046 0.0222 0.0609 0.1144 0.1614 0.1797 0.1623 0.1201
9 0.0000 0.0001 0.0011 0.0074 0.0271 0.0654 0.1159 0.1597 0.1771 0.1602
10 0.0000 0.0000 0.0002 0.0020 0.0099 0.0308 0.0686 0.1171 0.1593 0.1762
11 0.0000 0.0000 0.0000 0.0005 0.0030 0.0120 0.0336 0.0710 0.1185 0.1602
12 0.0000 0.0000 0.0000 0.0001 0.0008 0.0039 0.0136 0.0355 0.0727 0.1201
13 0.0000 0.0000 0.0000 0.0000 0.0002 0.0010 0.0045 0.0146 0.0366 0.0739
14 0.0000 0.0000 0.0000 0.0000 0.0000 0.0002 0.0012 0.0049 0.0150 0.0370
15 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0003 0.0013 0.0049 0.0148
16 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0003 0.0013 0.0046
17 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0002 0.0011
18 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0002
19 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
20 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000
Normal Distribution Table (the area tabulated is the area under the standard normal curve between the mean and z)
z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
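If checking a table entry is ever necessary, the tabulated areas can be reproduced from the standard normal distribution; the following is a minimal sketch (assuming Python and its standard math library):

```python
import math

def area_mean_to_z(z):
    """Area under the standard normal curve between the mean and z."""
    # The cumulative probability up to z is 0.5 * (1 + erf(z / sqrt(2)));
    # subtracting 0.5 leaves the area between the mean and z.
    return 0.5 * math.erf(z / math.sqrt(2.0))

print(round(area_mean_to_z(1.00), 4))   # 0.3413, as in the table
print(round(area_mean_to_z(1.96), 4))   # 0.4750
```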
Poisson Distribution Tables
λ
r 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
0 0.9048 0.8187 0.7408 0.6703 0.6065 0.5488 0.4966 0.4493 0.4066 0.3679
1 0.0905 0.1637 0.2222 0.2681 0.3033 0.3293 0.3476 0.3595 0.3659 0.3679
2 0.0045 0.0164 0.0333 0.0536 0.0758 0.0988 0.1217 0.1438 0.1647 0.1839
3 0.0002 0.0011 0.0033 0.0072 0.0126 0.0198 0.0284 0.0383 0.0494 0.0613
4 0.0000 0.0001 0.0003 0.0007 0.0016 0.0030 0.0050 0.0077 0.0111 0.0153
5 0.0000 0.0000 0.0000 0.0001 0.0002 0.0004 0.0007 0.0012 0.0020 0.0031
6 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0002 0.0003 0.0005
7 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001
λ
r 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2.0
0 0.3329 0.3012 0.2725 0.2466 0.2231 0.2019 0.1827 0.1653 0.1496 0.1353
1 0.3662 0.3614 0.3543 0.3452 0.3347 0.3230 0.3106 0.2975 0.2842 0.2707
2 0.2014 0.2169 0.2303 0.2417 0.2510 0.2584 0.2640 0.2678 0.2700 0.2707
3 0.0738 0.0867 0.0998 0.1128 0.1255 0.1378 0.1496 0.1607 0.1710 0.1804
4 0.0203 0.0260 0.0324 0.0395 0.0471 0.0551 0.0636 0.0723 0.0812 0.0902
5 0.0045 0.0062 0.0084 0.0111 0.0141 0.0176 0.0216 0.0260 0.0309 0.0361
6 0.0008 0.0012 0.0018 0.0026 0.0035 0.0047 0.0061 0.0078 0.0098 0.0120
7 0.0001 0.0002 0.0003 0.0005 0.0008 0.0011 0.0015 0.0020 0.0027 0.0034
8 0.0000 0.0000 0.0001 0.0001 0.0001 0.0002 0.0003 0.0005 0.0006 0.0009
9 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0001 0.0001 0.0002
λ
r 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3.0
0 0.1225 0.1108 0.1003 0.0907 0.0821 0.0743 0.0672 0.0608 0.0550 0.0498
1 0.2572 0.2438 0.2306 0.2177 0.2052 0.1931 0.1815 0.1703 0.1596 0.1494
2 0.2700 0.2681 0.2652 0.2613 0.2565 0.2510 0.2450 0.2384 0.2314 0.2240
3 0.1890 0.1966 0.2033 0.2090 0.2138 0.2176 0.2205 0.2225 0.2237 0.2240
4 0.0992 0.1082 0.1169 0.1254 0.1336 0.1414 0.1488 0.1557 0.1622 0.1680
5 0.0417 0.0476 0.0538 0.0602 0.0668 0.0735 0.0804 0.0872 0.0940 0.1008
6 0.0146 0.0174 0.0206 0.0241 0.0278 0.0319 0.0362 0.0407 0.0455 0.0504
7 0.0044 0.0055 0.0068 0.0083 0.0099 0.0118 0.0139 0.0163 0.0188 0.0216
8 0.0011 0.0015 0.0019 0.0025 0.0031 0.0038 0.0047 0.0057 0.0068 0.0081
9 0.0003 0.0004 0.0005 0.0007 0.0009 0.0011 0.0014 0.0018 0.0022 0.0027
10 0.0001 0.0001 0.0001 0.0002 0.0002 0.0003 0.0004 0.0005 0.0006 0.0008
11 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001 0.0001 0.0001 0.0002 0.0002
12 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000 0.0001
Chi-Squared Distribution Table
(Figure: the chi-squared density with the upper-tail area α shaded.)
Entries in the table give the χ² values for which α is the area or probability in the upper tail of the chi-squared distribution. For example,
with 10 degrees of freedom and an area of 0.01 in the upper tail, the tabulated value is χ² = 23.2093.
Example: for Φ1 = 10 and Φ2 = 9 degrees of freedom, P(F > 3.13) = 0.05 and P(F > 5.26) = 0.01. In each cell of the F-distribution table below, the upper figure is the 5 per cent point (5% of the area in the upper tail) and the lower figure is the 1 per cent point (1% of the area).
Table A1.6 F-distribution tables Φ1 degrees of freedom (for greater mean square)
Φ2 1 2 3 4 5 6 7 8 9 10 11 12 14 16 20 24 30 40 50 75 100 200 500 ∞ Φ2
1 161 200 216 225 230 234 237 239 241 242 243 244 245 246 248 249 250 251 252 253 253 254 254 254 1
4052 4999 5403 5625 5764 5859 5928 5981 6022 6056 6082 6106 6142 6169 6208 6234 6258 6286 6302 6323 6334 6352 6361 6366
2 18.51 19.00 19.16 19.25 19.30 19.33 19.36 19.37 19.38 19.39 19.40 19.41 19.42 19.43 19.44 19.45 19.46 19.47 19.47 19.48 19.49 19.49 19.50 19.50 2
98.49 99.00 99.17 99.25 99.30 99.33 99.34 99.36 99.38 99.40 99.41 99.42 99.43 99.44 99.45 99.46 99.47 99.48 99.48 99.49 99.49 99.49 99.50 99.50
3 10.13 9.55 9.28 9.12 9.01 8.94 8.88 8.84 8.81 8.78 8.76 8.74 8.71 8.69 8.66 8.64 8.62 8.60 8.58 8.57 8.56 8.54 8.54 8.53 3
34.12 30.82 29.46 28.71 28.24 27.91 27.67 27.49 27.34 27.23 27.13 27.05 26.92 26.83 26.69 26.60 26.50 26.41 26.35 26.27 26.23 26.18 26.14 26.12
4 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 5.96 5.93 5.91 5.87 5.84 5.80 5.77 5.74 5.71 5.70 5.68 5.66 5.65 5.64 5.63 4
21.20 18.00 16.69 15.98 15.52 15.21 14.98 14.80 14.66 14.54 14.45 14.37 14.24 14.15 14.02 13.93 13.83 13.74 13.69 13.61 13.57 13.52 13.48 13.46
5 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.78 4.74 4.70 4.68 4.64 4.60 4.56 4.53 4.50 4.46 4.44 4.42 4.40 4.38 4.37 4.36 5
16.26 13.27 12.06 11.39 10.97 10.67 10.45 10.27 10.15 10.05 9.96 9.89 9.77 9.68 9.55 9.47 9.38 9.29 9.24 9.17 9.13 9.07 9.04 9.02
6 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.06 4.03 4.00 3.96 3.92 3.87 3.84 3.81 3.77 3.75 3.72 3.71 3.69 3.68 3.67 6
13.74 10.92 9.78 9.15 8.75 8.47 8.26 8.10 7.98 7.87 7.79 7.72 7.60 7.52 7.39 7.31 7.23 7.14 7.09 7.02 6.99 6.94 6.90 6.88
7 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.63 3.60 3.57 3.52 3.49 3.44 3.41 3.38 3.34 3.32 3.29 3.28 3.25 3.24 3.23 7
12.25 9.55 8.45 7.85 7.46 7.19 7.00 6.84 6.71 6.62 6.54 6.47 6.35 6.27 6.15 6.07 5.98 5.90 5.85 5.78 5.75 5.70 5.67 5.65
8 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.34 3.31 3.28 3.23 3.20 3.15 3.12 3.08 3.05 3.03 3.00 2.98 2.96 2.94 2.93 8
11.26 8.65 7.59 7.01 6.63 6.37 6.19 6.03 5.91 5.82 5.74 5.67 5.56 5.48 5.36 5.28 5.20 5.11 5.06 5.00 4.96 4.91 4.88 4.86
9 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 3.13 3.10 3.07 3.02 2.98 2.93 2.90 2.86 2.82 2.80 2.77 2.76 2.73 2.72 2.71 9
10.56 8.02 6.99 6.42 6.06 5.80 5.62 5.47 5.35 5.26 5.18 5.11 5.00 4.92 4.80 4.73 4.64 4.56 4.51 4.45 4.41 4.36 4.33 4.31
10 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 2.97 2.94 2.91 2.86 2.82 2.77 2.74 2.70 2.67 2.64 2.61 2.59 2.56 2.55 2.54 10
10.04 7.56 6.55 5.99 5.64 5.39 5.21 5.06 4.95 4.85 4.78 4.71 4.60 4.52 4.41 4.33 4.25 4.17 4.12 4.05 4.01 3.96 3.93 3.91
12 4.75 3.88 3.49 3.26 3.11 3.00 2.92 2.85 2.80 2.76 2.72 2.69 2.64 2.60 2.54 2.50 2.46 2.42 2.40 2.36 2.35 2.32 2.31 2.30 12
9.33 6.93 5.95 5.41 5.06 4.82 4.65 4.50 4.39 4.30 4.22 4.16 4.05 3.98 3.86 3.78 3.70 3.61 3.56 3.49 3.46 3.41 3.38 3.36
13 4.67 3.80 3.41 3.18 3.02 2.92 2.84 2.77 2.72 2.67 2.63 2.60 2.55 2.51 2.46 2.42 2.38 2.34 2.32 2.28 2.26 2.24 2.22 2.21 13
9.07 6.70 5.74 5.20 4.86 4.62 4.44 4.30 4.19 4.10 4.02 3.96 3.85 3.78 3.67 3.59 3.51 3.42 3.37 3.30 3.27 3.21 3.18 3.16
14 4.60 3.74 3.34 3.11 2.96 2.85 2.77 2.70 2.65 2.60 2.56 2.53 2.48 2.44 2.39 2.35 2.31 2.27 2.24 2.21 2.19 2.16 2.14 2.13 14
8.86 6.51 5.56 5.03 4.69 4.46 4.28 4.14 4.03 3.94 3.86 3.80 3.70 3.62 3.51 3.43 3.34 3.26 3.21 3.14 3.11 3.06 3.02 3.00
15 4.54 3.68 3.29 3.06 2.90 2.79 2.70 2.64 2.59 2.55 2.51 2.48 2.43 2.39 2.33 2.29 2.25 2.21 2.18 2.15 2.12 2.10 2.08 2.07 15
8.68 6.36 5.42 4.89 4.56 4.32 4.14 4.00 3.89 3.80 3.73 3.67 3.56 3.48 3.36 3.29 3.20 3.12 3.07 3.00 2.97 2.92 2.89 2.87
16 4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54 2.49 2.45 2.42 2.37 2.33 2.28 2.24 2.20 2.16 2.13 2.09 2.07 2.04 2.02 2.01 16
8.53 6.23 5.29 4.77 4.44 4.20 4.03 3.89 3.78 3.69 3.61 3.55 3.45 3.37 3.25 3.18 3.10 3.01 2.96 2.89 2.86 2.80 2.77 2.75
17 4.45 3.59 3.20 2.96 2.81 2.70 2.62 2.55 2.50 2.45 2.41 2.38 2.33 2.29 2.23 2.19 2.15 2.11 2.08 2.04 2.02 1.99 1.97 1.96 17
8.40 6.11 5.18 4.67 4.34 4.10 3.93 3.79 3.68 3.59 3.52 3.45 3.35 3.27 3.16 3.08 3.00 2.92 2.86 2.79 2.76 2.70 2.67 2.65
18 4.41 3.55 3.16 2.93 2.77 2.66 2.58 2.51 2.46 2.41 2.37 2.34 2.29 2.25 2.19 2.15 2.11 2.07 2.04 2.00 1.98 1.95 1.93 1.92 18
8.28 6.01 5.09 4.58 4.25 4.01 3.85 3.71 3.60 3.51 3.44 3.37 3.27 3.19 3.07 3.00 2.91 2.83 2.78 2.71 2.68 2.62 2.59 2.57
19 4.38 3.52 3.13 2.90 2.74 2.63 2.55 2.48 2.43 2.38 2.34 2.31 2.26 2.21 2.15 2.11 2.07 2.02 2.00 1.96 1.94 1.91 1.90 1.88 19
8.18 5.93 5.01 4.50 4.17 3.94 3.77 3.63 3.52 3.43 3.36 3.30 3.19 3.12 3.00 2.92 2.84 2.76 2.70 2.63 2.60 2.54 2.51 2.49
20 4.35 3.49 3.10 2.87 2.71 2.60 2.52 2.45 2.40 2.35 2.31 2.28 2.23 2.18 2.12 2.08 2.04 1.99 1.96 1.92 1.90 1.87 1.85 1.84 20
8.10 5.85 4.94 4.43 4.10 3.87 3.71 3.56 3.45 3.37 3.30 3.23 3.13 3.05 2.94 2.86 2.77 2.69 2.63 2.56 2.53 2.47 2.44 2.42
21 4.32 3.47 3.07 2.84 2.68 2.57 2.49 2.42 2.37 2.32 2.28 2.25 2.20 2.15 2.09 2.05 2.00 1.96 1.93 1.89 1.87 1.84 1.82 1.81 21
8.02 5.78 4.87 4.37 4.04 3.81 3.65 3.51 3.40 3.31 3.24 3.17 3.07 2.99 2.88 2.80 2.72 2.63 2.58 2.51 2.47 2.42 2.38 2.36
22 4.30 3.44 3.05 2.82 2.66 2.55 2.47 2.40 2.35 2.30 2.26 2.23 2.18 2.13 2.07 2.03 1.98 1.93 1.91 1.87 1.84 1.81 1.80 1.78 22
7.94 5.72 4.82 4.31 3.99 3.76 3.59 3.45 3.35 3.26 3.18 3.12 3.02 2.94 2.83 2.75 2.67 2.58 2.53 2.46 2.42 2.37 2.33 2.31
23 4.28 3.42 3.03 2.80 2.64 2.53 2.45 2.38 2.32 2.28 2.24 2.20 2.14 2.10 2.04 2.00 1.96 1.91 1.88 1.84 1.82 1.79 1.77 1.76 23
7.88 5.66 4.76 4.26 3.94 3.71 3.54 3.41 3.30 3.21 3.14 3.07 2.97 2.89 2.78 2.70 2.62 2.53 2.48 2.41 2.37 2.32 2.28 2.26
24 4.26 3.40 3.01 2.78 2.62 2.51 2.43 2.36 2.30 2.26 2.22 2.18 2.13 2.09 2.02 1.98 1.94 1.89 1.86 1.82 1.80 1.76 1.74 1.73 24
7.82 5.61 4.72 4.22 3.90 3.67 3.50 3.36 3.25 3.17 3.09 3.03 2.93 2.85 2.74 2.66 2.58 2.49 2.44 2.36 2.33 2.27 2.23 2.21
25 4.24 3.38 2.99 2.76 2.60 2.49 2.41 2.34 2.28 2.24 2.20 2.16 2.11 2.06 2.00 1.96 1.92 1.87 1.84 1.80 1.77 1.74 1.72 1.71 25
7.77 5.57 4.68 4.18 3.86 3.63 3.46 3.32 3.21 3.13 3.05 2.99 2.89 2.81 2.70 2.62 2.54 2.45 2.40 2.32 2.29 2.23 2.19 2.17
27 4.21 3.35 2.96 2.73 2.57 2.46 2.37 2.30 2.25 2.20 2.16 2.13 2.08 2.03 1.97 1.93 1.88 1.84 1.80 1.76 1.74 1.71 1.68 1.67 27
7.68 5.49 4.60 4.11 3.79 3.56 3.39 3.26 3.14 3.06 2.98 2.93 2.83 2.74 2.63 2.55 2.47 2.38 2.33 2.25 2.21 2.16 2.12 2.10
28 4.20 3.34 2.95 2.71 2.56 2.44 2.36 2.29 2.24 2.19 2.15 2.12 2.06 2.02 1.96 1.91 1.87 1.81 1.78 1.75 1.72 1.69 1.67 1.65 28
7.64 5.45 4.57 4.07 3.76 3.53 3.36 3.23 3.11 3.03 2.95 2.90 2.80 2.71 2.60 2.52 2.44 2.35 2.30 2.22 2.18 2.13 2.09 2.06
29 4.18 3.33 2.93 2.70 2.54 2.43 2.35 2.28 2.22 2.18 2.14 2.10 2.05 2.00 1.94 1.90 1.85 1.80 1.77 1.73 1.71 1.68 1.65 1.64 29
7.60 5.42 4.54 4.04 3.73 3.50 3.33 3.20 3.08 3.00 2.92 2.87 2.77 2.68 2.57 2.49 2.41 2.32 2.27 2.19 2.15 2.10 2.06 2.03
30 4.17 3.32 2.92 2.69 2.53 2.42 2.34 2.27 2.21 2.16 2.12 2.09 2.04 1.99 1.93 1.89 1.84 1.79 1.76 1.72 1.69 1.66 1.64 1.62 30
7.56 5.39 4.51 4.02 3.70 3.47 3.30 3.17 3.06 2.98 2.90 2.84 2.74 2.66 2.55 2.47 2.38 2.29 2.24 2.16 2.13 2.07 2.03 2.01
32 4.15 3.30 2.90 2.67 2.51 2.40 2.32 2.25 2.19 2.14 2.10 2.07 2.02 1.97 1.91 1.86 1.82 1.76 1.74 1.69 1.67 1.64 1.61 1.59 32
7.50 5.34 4.46 3.97 3.66 3.42 3.25 3.12 3.01 2.94 2.86 2.80 2.70 2.62 2.51 2.42 2.34 2.25 2.20 2.12 2.08 2.02 1.98 1.96
34 4.13 3.28 2.88 2.65 2.49 2.38 2.30 2.23 2.17 2.12 2.08 2.05 2.00 1.95 1.89 1.84 1.80 1.74 1.71 1.67 1.64 1.61 1.59 1.57 34
7.44 5.29 4.42 3.93 3.61 3.38 3.21 3.08 2.97 2.89 2.82 2.76 2.66 2.58 2.47 2.38 2.30 2.21 2.15 2.08 2.04 1.98 1.94 1.91
36 4.11 3.26 2.86 2.63 2.48 2.36 2.28 2.21 2.15 2.10 2.06 2.03 1.98 1.93 1.87 1.82 1.78 1.72 1.69 1.65 1.62 1.59 1.56 1.55 36
7.39 5.25 4.38 3.89 3.58 3.35 3.18 3.04 2.94 2.86 2.78 2.72 2.62 2.54 2.43 2.35 2.26 2.17 2.12 2.04 2.00 1.94 1.90 1.87
38 4.10 3.25 2.85 2.62 2.46 2.35 2.26 2.19 2.14 2.09 2.05 2.02 1.96 1.92 1.85 1.80 1.76 1.71 1.67 1.63 1.60 1.57 1.54 1.53 38
7.35 5.21 4.34 3.86 3.54 3.32 3.15 3.02 2.91 2.82 2.75 2.69 2.59 2.51 2.40 2.32 2.22 2.14 2.08 2.00 1.97 1.90 1.86 1.84
40 4.08 3.23 2.84 2.61 2.45 2.34 2.25 2.18 2.12 2.07 2.04 2.00 1.95 1.90 1.84 1.79 1.74 1.69 1.66 1.61 1.59 1.55 1.53 1.51 40
7.31 5.18 4.31 3.83 3.51 3.29 3.12 2.99 2.88 2.80 2.73 2.66 2.56 2.49 2.37 2.29 2.20 2.11 2.05 1.97 1.94 1.88 1.84 1.81
42 4.07 3.22 2.83 2.59 2.44 2.32 2.24 2.17 2.11 2.06 2.02 1.99 1.94 1.89 1.82 1.78 1.73 1.68 1.64 1.60 1.57 1.54 1.51 1.49 42
7.27 5.15 4.29 3.80 3.49 3.26 3.10 2.96 2.86 2.77 2.70 2.64 2.54 2.46 2.35 2.26 2.17 2.08 2.02 1.94 1.91 1.85 1.80 1.78
44 4.06 3.21 2.82 2.58 2.43 2.31 2.23 2.16 2.10 2.05 2.01 1.98 1.92 1.88 1.81 1.76 1.72 1.66 1.63 1.58 1.56 1.52 1.50 1.48 44
7.24 5.12 4.26 3.78 3.46 3.24 3.07 2.94 2.84 2.75 2.68 2.62 2.52 2.44 2.32 2.24 2.15 2.06 2.00 1.92 1.88 1.82 1.78 1.75
46 4.05 3.20 2.81 2.57 2.42 2.30 2.22 2.14 2.09 2.04 2.00 1.97 1.91 1.87 1.80 1.75 1.71 1.65 1.62 1.57 1.54 1.51 1.48 1.46 46
7.21 5.10 4.24 3.76 3.44 3.22 3.05 2.92 2.82 2.73 2.66 2.60 2.50 2.42 2.30 2.22 2.13 2.04 1.98 1.90 1.86 1.80 1.76 1.72
48 4.04 3.19 2.80 2.56 2.41 2.30 2.21 2.14 2.08 2.03 1.99 1.96 1.90 1.86 1.79 1.74 1.70 1.64 1.61 1.56 1.53 1.50 1.47 1.45 48
7.19 5.08 4.22 3.74 3.42 3.20 3.04 2.90 2.80 2.71 2.64 2.58 2.48 2.40 2.28 2.20 2.11 2.02 1.96 1.88 1.84 1.78 1.73 1.70
50 4.03 3.18 2.79 2.56 2.40 2.29 2.20 2.13 2.07 2.02 1.98 1.95 1.90 1.85 1.78 1.74 1.69 1.63 1.60 1.55 1.52 1.48 1.46 1.44 50
7.17 5.06 4.20 3.72 3.41 3.18 3.02 2.88 2.78 2.70 2.62 2.56 2.46 2.39 2.26 2.18 2.10 2.00 1.94 1.86 1.82 1.76 1.71 1.68
60 4.00 3.15 2.76 2.52 2.37 2.25 2.17 2.10 2.04 1.99 1.95 1.92 1.86 1.81 1.75 1.70 1.65 1.59 1.56 1.50 1.48 1.44 1.41 1.39 60
7.08 4.98 4.13 3.65 3.34 3.12 2.95 2.82 2.72 2.63 2.56 2.50 2.40 2.32 2.20 2.12 2.03 1.93 1.87 1.79 1.74 1.68 1.63 1.60
65 3.99 3.14 2.75 2.51 2.36 2.24 2.15 2.08 2.02 1.98 1.94 1.90 1.85 1.80 1.73 1.68 1.63 1.57 1.54 1.49 1.46 1.42 1.39 1.37 65
7.04 4.95 4.10 3.62 3.31 3.09 2.93 2.79 2.70 2.61 2.54 2.47 2.37 2.30 2.18 2.09 2.00 1.90 1.84 1.76 1.71 1.64 1.60 1.56
70 3.98 3.13 2.74 2.50 2.35 2.23 2.14 2.07 2.01 1.97 1.93 1.89 1.84 1.79 1.72 1.67 1.62 1.56 1.53 1.47 1.45 1.40 1.37 1.35 70
7.01 4.92 4.08 3.60 3.29 3.07 2.91 2.77 2.67 2.59 2.51 2.45 2.35 2.28 2.15 2.07 1.98 1.88 1.82 1.74 1.69 1.62 1.56 1.53
80 3.96 3.11 2.72 2.48 2.33 2.21 2.12 2.05 1.99 1.95 1.91 1.88 1.82 1.77 1.70 1.65 1.60 1.54 1.51 1.45 1.42 1.38 1.35 1.32 80
6.96 4.88 4.04 3.56 3.25 3.04 2.87 2.74 2.64 2.55 2.48 2.41 2.32 2.24 2.11 2.03 1.94 1.84 1.78 1.70 1.65 1.57 1.52 1.49
100 3.94 3.09 2.70 2.46 2.30 2.19 2.10 2.03 1.97 1.92 1.88 1.85 1.79 1.75 1.68 1.63 1.57 1.51 1.48 1.42 1.39 1.34 1.30 1.28 100
6.90 4.82 3.98 3.51 3.20 2.99 2.82 2.69 2.59 2.51 2.43 2.36 2.26 2.19 2.06 1.98 1.89 1.79 1.73 1.64 1.59 1.51 1.46 1.43
125 3.92 3.07 2.68 2.44 2.29 2.17 2.08 2.01 1.95 1.90 1.86 1.83 1.77 1.72 1.65 1.60 1.55 1.49 1.45 1.39 1.36 1.31 1.27 1.25 125
6.84 4.78 3.94 3.47 3.17 2.95 2.79 2.65 2.56 2.47 2.40 2.33 2.23 2.15 2.03 1.94 1.85 1.75 1.68 1.59 1.54 1.46 1.40 1.37
150 3.91 3.06 2.67 2.43 2.27 2.16 2.07 2.00 1.94 1.89 1.85 1.82 1.76 1.71 1.64 1.59 1.54 1.47 1.44 1.37 1.34 1.29 1.25 1.22 150
6.81 4.75 3.91 3.44 3.14 2.92 2.76 2.62 2.53 2.44 2.37 2.30 2.20 2.12 2.00 1.91 1.83 1.72 1.66 1.56 1.51 1.43 1.37 1.33
200 3.89 3.04 2.65 2.41 2.26 2.14 2.05 1.98 1.92 1.87 1.83 1.80 1.74 1.69 1.62 1.57 1.52 1.45 1.42 1.35 1.32 1.26 1.22 1.19 200
6.76 4.71 3.88 3.41 3.11 2.90 2.73 2.60 2.50 2.41 2.34 2.28 2.17 2.09 1.97 1.88 1.79 1.69 1.62 1.53 1.48 1.39 1.33 1.28
400 3.86 3.02 2.62 2.39 2.23 2.12 2.03 1.96 1.90 1.85 1.81 1.78 1.72 1.67 1.60 1.54 1.49 1.42 1.38 1.32 1.28 1.22 1.16 1.13 400
6.70 4.66 3.83 3.36 3.06 2.85 2.69 2.55 2.46 2.37 2.29 2.23 2.12 2.04 1.92 1.84 1.74 1.64 1.57 1.47 1.42 1.32 1.24 1.19
1000 3.85 3.00 2.61 2.38 2.22 2.10 2.02 1.95 1.89 1.84 1.80 1.76 1.70 1.65 1.58 1.53 1.47 1.41 1.36 1.30 1.26 1.19 1.13 1.08 1000
6.66 4.62 3.80 3.34 3.04 2.82 2.66 2.53 2.43 2.34 2.26 2.20 2.09 2.01 1.89 1.81 1.71 1.61 1.54 1.44 1.38 1.28 1.19 1.11
∞ 3.84 2.99 2.60 2.37 2.21 2.09 2.01 1.94 1.88 1.83 1.79 1.75 1.69 1.64 1.57 1.52 1.46 1.40 1.35 1.28 1.24 1.17 1.11 1.00 ∞
6.64 4.60 3.78 3.32 3.02 2.80 2.64 2.51 2.41 2.32 2.24 2.18 2.07 1.99 1.87 1.79 1.69 1.59 1.52 1.41 1.36 1.25 1.15 1.00
2 2 2 2 2 2 2 2 2 2
3 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3
4 2 2 2 3 3 3 3 3 3 3 3 4 4 4 4 4
5 2 2 3 3 3 3 3 4 4 4 4 4 4 4 5 5 5
6 2 2 3 3 3 3 4 4 4 4 5 5 5 5 5 5 6 6
7 2 2 3 3 3 4 4 5 5 5 5 5 6 6 6 6 6 6
8 2 3 3 3 4 4 5 5 5 6 6 6 6 6 7 7 7 7
9 2 3 3 4 4 5 5 5 6 6 6 7 7 7 7 8 8 8
10 2 3 3 4 5 5 5 6 6 7 7 7 7 8 8 8 8 9
11 2 3 4 4 5 5 6 6 7 7 7 8 8 8 9 9 9 9
12 2 2 3 4 4 5 6 6 7 7 7 8 8 8 9 9 9 10 10
13 2 2 3 4 5 5 6 6 7 7 8 8 9 9 9 10 10 10 10
14 2 2 3 4 5 5 6 7 7 8 8 9 9 9 10 10 10 11 11
15 2 3 3 4 5 6 6 7 7 8 8 9 9 10 10 11 11 11 12
16 2 3 4 4 5 6 6 7 8 8 9 9 10 10 11 11 11 12 12
17 2 3 4 4 5 6 7 7 8 9 9 10 10 11 11 11 12 12 13
18 2 3 4 5 5 6 7 8 8 9 9 10 10 11 11 12 12 13 13
19 2 3 4 5 6 6 7 8 8 9 10 10 11 11 12 12 13 13 13
20 2 3 4 5 6 6 7 8 9 9 10 10 11 12 12 13 13 13 14
2
3
4 9 9
5 9 10 10 11 11
6 9 10 11 12 12 13 13 13 13
7 11 12 13 13 14 14 14 14 15 15 15
8 11 12 13 14 14 15 15 16 16 16 16 17 17 17 17 17
9 13 14 14 15 16 16 16 17 17 18 18 18 18 18 18
10 13 14 15 16 16 17 17 18 18 18 19 19 19 20 20
11 13 14 15 16 17 17 18 19 19 19 20 20 20 21 21
12 13 14 16 16 17 18 19 19 20 20 21 21 21 22 22
13 15 16 17 18 19 20 20 21 21 22 22 23 23
14 15 16 17 18 19 20 20 21 22 22 23 23 23 24
15 15 16 18 18 19 20 21 22 22 23 23 24 24 25
16 17 18 19 20 21 21 22 23 23 24 25 25 25
17 17 18 19 20 21 22 23 23 24 25 25 26 26
18 17 18 19 20 21 22 23 24 25 25 26 26 27
19 17 18 20 21 22 23 23 24 25 26 26 27 27
20 17 18 20 21 22 23 24 25 25 26 27 27 28
Short-Cut Formula
Variance = ∑( )− · /( − 1)
∑( )
Standard deviation =
Coefficient of variation =
!
Combinations =
!( )!
Binomial Distribution
P(r successes) = nCr × p^r × (1 − p)^(n−r)
where p = probability of a success, n = sample size
Mean = np
Standard deviation = √[np(1 − p)]
For a proportion:
Mean = p
Standard deviation = √[p(1 − p)/n]
Estimation
95 per cent confidence limits for population mean:
x̄ ± 1.96σ/√n
Pooled standard deviation for two samples:
s = √{[(n1 − 1)s1² + (n2 − 1)s2²]/(n1 + n2 − 2)}
Poisson Distribution
P(r events) = e^(−m) × m^r / r!
where m = mean number of events
Normal Distribution
Testing a sample mean:
z = (x̄ − μ)/(σ/√n)
t-Distribution
t = (x̄ − μ)/(s/√n)
Chi-Squared Distribution
χ² = (n − 1) × s²/σ²
F-Distribution
F = s1²/s2²
Regression
Least squares regression line of y on x is
y = a + bx
where
b = ∑(x − x̄)(y − ȳ)/∑(x − x̄)²  and
a = ȳ − bx̄
Correlation Coefficient
r = ∑(x − x̄)(y − ȳ)/√[∑(x − x̄)² × ∑(y − ȳ)²]
Runs Test
If n1 or n2 > 20, number of runs is normally distributed with
Mean = 2n1n2/(n1 + n2) + 1
Standard deviation = √{2n1n2(2n1n2 − n1 − n2)/[(n1 + n2)²(n1 + n2 − 1)]}
Exponential Smoothing
Sₜ = (1 − α)Sₜ₋₁ + αxₜ
Holt’s Method
Sₜ = (1 − α)(Sₜ₋₁ + Tₜ₋₁) + αxₜ
Tₜ = (1 − β)Tₜ₋₁ + β(Sₜ − Sₜ₋₁)
Fₜ₊ₘ = Sₜ + m·Tₜ
where
xₜ = actual observation at time t
Sₜ = smoothed value at time t
α, β = smoothing constants between 0 and 1
Tₜ = smoothed trend at time t
Fₜ₊ₘ = forecast for m periods ahead
1 A toy manufacturer is putting together a financial plan for his business, but to do so he
needs to estimate the Christmas demand for his newly developed children’s board
game. He knows demand will vary according to the price charged, but he does not want
to set the price until nearer the Christmas season. However, he does have information
on the two similar products he marketed last year. One was priced at £5 and sold six
(thousands); the other was priced at £11 and sold three (thousands). Over this range he
thinks the relationship between price and demand is linear. What is the straight-line
equation linking price and demand?
A. Demand = 3.5 + 0.5 × Price
B. Demand = 6 − 2 × Price
C. Demand = 8.5 − 0.5 × Price
D. Demand = 3.5 − 0.5 × Price
2 A furniture manufacturer is a supplier for a large retail chain. Amongst her products are
a round coffee table and a square coffee table, both of which can be made on either of
two machines. Knowledge of the production times per table and the time per week that
each machine is available results in the following two equations showing possible
production mixes for each machine.
Machine A: 2R + 3S = 27
Machine B: 5R + 4S = 50
where R = number of round tables made per week
where S = number of square tables made per week.
The solution to these simultaneous equations is the numbers of each product that must
be made if all available machine time is used up. Which of the following production
mixes is the solution to the equations?
A. R = 5, S = 6
B. R = 3, S = 7
C. R = 10, S = 0
D. R = 6, S = 5
3 Which of the following is correct? When analysing any management data, the pattern
should be found before considering exceptions because:
A. exceptions can only be defined in the context of a pattern.
B. in management only the pattern needs to be considered.
C. finding the pattern may involve statistical analysis.
D. patterns are easier to find than exceptions.
Questions 4 and 5 are based on the following data, which refer to the
number of customer complaints about the quality of service on one of an
airline’s routes in each of ten weeks:
11 20 12 11 16 13 15 21 18 13
7 An integrated factory production line fills, labels and seals 1-litre bottles of mineral
water. Production is monitored to check how much liquid the bottles actually contain
and that legal requirements on minimum contents are being met. A distribution is
formed by taking a random sample of 1000 bottles and measuring the amount of liquid
each holds. The measurements are grouped in intervals of 1 millilitre and a histogram
formed. To calculate how much of the production falls within different volume ranges, it
is necessary to know what type the resulting distribution is. Which of the following is
correct? The distribution is:
A. normal
B. binomial
C. Poisson
D. observed
8 In practice the binomial distribution is usually approximated by the normal if the rule of
thumb, np > 5, holds. Which of the following is correct? The reason for following the
procedure is that:
A. it is easier to calculate probabilities.
B. suitable binomial tables are not available.
C. basic theory shows that the distribution is in reality normal.
D. the Central Limit Theorem applies.
9 A medical consultant sees 100 patients per month in his office. In an attempt to optimise
his appointments system, he decides to analyse the time he spends with each patient.
What type of standard distribution should he expect for the average time he spends
with each patient over the course of a month?
A. normal
B.
C. binomial
D. Poisson
Questions 10 to 12 refer to the following circumstances.
A supermarket is checking for the presence of bacterial infection in precooked TV
dinners. To do this a sample of 16 is selected at random and given a laboratory test that
gives each meal a score between 0 (completely free of infection) and 100 (highly
infected). Tests on the sample of 16 result in an average score of 10. It is known from
investigations of other products that the infection score is likely to be normally
distributed with a standard deviation of two.
10 What is the point estimate of the population mean (the average score for all dinners)?
A. 2
B. 0.5
C. 8 to 12
D. 10
11 What are the 95 per cent confidence limits for the population mean?
A. 6 to 14
B. 8 to 12
C. 9 to 11
D. 9.5 to 10.5
12 Which one of the following concepts did you have to use in answering Question 11?
A. Central limit theorem
B. Variance sum theorem
C. Sampling distribution of the mean
D. Analysis of variance
13 A consultant has made a presentation concerning the likely impact of a new quality
control system on rejection rates. The production director asks why a significance level
of 5 per cent has been used. Which of the following is valid? It is:
A. the definition of significance.
B. convention.
C. the boundary between an acceptable risk and an unacceptable risk.
D. part of normal distribution theory.
14 A restaurant chain asks its customers to rate the service on a scale from 0 to 10 before
they leave the restaurant. It is a corporate objective that an overall average of 8 should
be achieved. To check how a particular restaurant compares with the corporate
average, a random sample of the ratings of 36 customers is selected. The sample has a
mean of 6.8 and standard deviation of 3. A two-tailed significance test is conducted with
the hypothesis that the overall mean rating is 8. Which of the following is correct?
A. The test is inconclusive at the 5 per cent level.
B. The test suggests the hypothesis should be accepted at the 5 per cent level.
C. The test suggests the hypothesis should be rejected at the 5 per cent level.
D. The test suggests the hypothesis should be rejected at the 1 per cent level.
19 Which of the following is correct? The t values for the three variables are respectively:
A. 8.0, 6.0, 1.3
B. 0.8, 0.1, 12.0
C. 6.4, 0.6, 15.6
D. 5.4, 0.5, 13.0
21 A company is trying to forecast its next two years’ revenue for a new venture in the Far
East. Because of the lack of suitable data, it has been decided to use the Delphi
technique. Six executives have been selected to take part. Which of the following is correct?
The executives should:
A. all be at the same level of seniority.
B. not talk to one another about their forecasts.
C. adjust their forecast towards the mean at each iteration.
D. continue making forecasts until a consensus is reached.
Week 1 2 3 4 5 6 7
Stock (£ million) 13 15 14 10 8 11 12
22 For budgeting purposes a three-point moving average forecast for period 8 is calculated.
Which of the following is correct? The forecast is:
A. 9.7
B. 10.3
C. 10.7
D. 12
24 Which technique should a confectionery company use to forecast quarterly ice cream
sales for the year ahead based on a ten-year data record?
A. Delphi
B. Exponential smoothing
C. Holt’s Method
D. Holt–Winters Method
25 A manufacturer of white goods has used two forecasting methods, three-point moving
averages and exponential smoothing, to make short-term forecasts of the weekly stock
levels. The company now wants to choose the more accurate of the two and use it
alone to forecast. To do this two measures of accuracy, the MAD (mean absolute
deviation) and MSE (mean square error), have been calculated for each method over the
last ten weeks. What conclusion should be drawn if moving averages has the lower MSE
while exponential smoothing has the lower MAD?
A. Moving averages is superior.
B. Exponential smoothing is superior.
C. A calculating error must have been made because such a situation could never
arise.
D. Exponential smoothing is better on average but probably has more large
individual errors.
Case Study 1
Case Study 2
1 Since the introduction of a new line in scented envelopes, production schedules have
been continually changed because of substantial variations in demand from quarter to
quarter. The production scheduling department is trying to improve the situation by
adopting a scientific approach to forecasting demand. They have applied the
decomposition method to quarterly data covering 2012–17. The separation of the elements of this
time series revealed the following:
i. The trend was: Demand = 2600 + 40t
where t =1, 2, 3, etc. and t = 1 represents Q1, 2012
ii. There was no cycle.
iii. The seasonal factors (actual/moving average) were:
Q1 Q2 Q3 Q4
2012 – – 76 100
2013 113 109 73 104
2014 109 111 76 106
2015 110 110 72 104
2016 114 107 77 102
2017 114 108 76 108
Now, in the first quarter of 2018, the production scheduling department needs a
forecast for the third quarter of 2018.
a. What should the forecast be? (15 marks)
b. Should the schedule be delayed until the results for the current quarter are available
(i.e. is the actual demand for the first quarter likely to affect the forecast)? (5 marks)
c. If the method proves unsatisfactory in practice, what other forecasting methods
could be used? (5 marks)
Case Study 3
1 Seven hundred motorists, chosen at random, took part in a market research survey to
find out about consumers’ attitudes to brands of petrol. The ultimate objective was to
determine the impact of advertising campaigns.
The motorists were presented with a list of petrol brands:
BP
Esso
Fina
Jet
National
Shell
They were also presented with a list of attributes:
Good mileage
Long engine life
For with-it people
Value for money
Large company
The motorists were then asked to tick the petrol brands to which they thought the
attributes applied. A motorist was free to tick all or none of the brands for each
attribute. Table A3.1 contains the results, showing the percentage of the sample that
ticked each brand for each attribute.
a. Analyse this table. What is the main pattern in it? (10 marks)
b. What are the exceptions? (5 marks)
c. Can you explain the exceptions? (5 marks)
d. What further information would be useful to understand the table better? (5 marks)
NB: to analyse the table you will find it useful to re-present it first by rounding, reordering, etc.
Case Study 4
In an attempt to measure the effect on fuel consumption of speed, altitude and load,
multiple regression analysis was used, relating fuel consumption to speed, altitude and
load. The results were:
a. What is the regression equation linking fuel consumption to the other variables?
What consumption would you predict for a trip at an average speed of 45 mph, at an
altitude of 200 m and with total passenger weight of 300 kg? (5 marks)
b. Calculate the values of the residuals. Is there a pattern in the residuals?
(5 marks)
1 A box contains 12 wine glasses, three of which have flaws. A glass is selected at random
and is found to be flawed; it is not returned to the box.
What is the probability that the next glass selected will also be faulty?
A. 0.17
B. 0.18
C. 0.25
D. 0.27
Questions 2 and 3 are based on the following information:
A supermarket’s daily sales figures (in £000) over the last quarter (13 weeks = 78
days) have been summarised:
3 What is the sales level that was exceeded on 56 per cent of all days?
A. £43 680
B. £50 000
C. £56 320
D. £60 000
4 A baker is producing standard loaves. A sample of 50 loaves was taken and weighed. It
was found that the average loaf weighed 502 g with a variance of 30 g².
In what range would the weight of 95 per cent of the sample loaves be?
A. 491.0 to 513.0 g
B. 496.5 to 507.5 g
C. 500.4 to 503.5 g
D. 501.8 to 502.2 g
11 Data were collected on the number of episodes of a 10-part serial that were seen by a
sample of 14 viewers:
0 0 0 1 1 2 2 8 9 9 10 10 10 10
13 A random sample of 36 employees is taken to estimate the average hourly rate of pay
with an accuracy of £0.25 per hour (with 95 per cent probability of being correct).
What would the accuracy have been if a sample size of 400 had been used?
A. £0.004 per hour
B. £0.045 per hour
C. £0.075 per hour
D. £0.150 per hour
14 A machine produces chair legs. The lengths of the legs are normally distributed with a
mean of 25 cm and a standard deviation of 0.6 cm. Legs that are longer than 25.8 cm or
shorter than 24.4 cm have to be discarded. The machine produces 450 legs per shift.
How many legs per shift have to be discarded?
A. 0
B. 83
C. 113
D. 337
16 A company is launching a new magazine. The initial advertising over the first three
months achieves its target that 45 per cent of the population will have heard of it. What
is the probability that in a group of six people only two people have heard of the new
magazine?
A. 0.018
B. 0.186
C. 0.278
D. 0.333
20 A random sample of 12 is taken from a normal distribution. The variance of the sample
is 7. What are the 95 per cent confidence limits for the population variance?
A. 3.30 to 17.48
B. 3.51 to 20.18
C. 3.83 to 22.01
D. 3.91 to 16.83
Questions 21 and 22 refer to the following data.
A two-way analysis of variance has been carried out for four blocks and six treat-
ments. Calculations have shown that:
Total sum of squares = 444
Treatment sum of squares = 160
Block sum of squares = 120
Month 1 2 3 4 5 6
Sales 12 15 16 18 20 23
24 The following ten residuals have been obtained from a regression analysis:
+1.2 −1.4 −1.6 −0.3 +1.1 +1.7 −0.8 +0.3 −0.9 −1.2
A runs test is carried out. What is the upper critical value for this test?
A. 2
B. 9
C. 10
D. 11
Case Study 1
1 A large hotel in a capital city offers a wide range of accommodation ranging from single
rooms to four-room suites. Facilities are also available for functions such as conferences,
receptions, dinners and dances. At some times of the year all accommodation and
facilities are fully booked, while at others the hotel may be less than half full. A firm of
consultants has been asked to advise how reliable forecasts of usage might be prepared.
Describe the factors that would need to be taken into account by the consultants and
the steps they would need to take.
(25 marks)
Case Study 2
1 A company is investigating the amount spent on maintenance of its PCs. It believes that
the amount spent is likely to be dependent on the usage of the machines. The following
information has been collected:
b. Find the equation for the linear regression of the annual maintenance cost on weekly
usage. (8 marks)
c. Calculate the correlation coefficient. (3 marks)
d. Write a short report on your findings. Indicate what further analysis would be
needed to test the validity of the regression model. (7 marks)
e. The company is investigating the cost of a maintenance contract to cover the six
PCs. What is the maximum amount that you would recommend the firm to pay for
such a contract?
What assumptions have you made in reaching this decision? (3 marks)
Case Study 3
1 A firm has three production lines for making crystal wine glasses. Daily records are kept
of the number of faulty glasses produced by each line, as shown in the table below.
a. Carry out a one-way analysis of variance to indicate whether there are differences in
the number of faulty glasses produced by the different assembly lines. (18 marks)
b. Write a short report on your findings, and indicate any further details of the
production process you would wish to investigate. (7 marks)
Case Study 4
a. What precautions should the supermarket chain take to ensure the validity of the
pilot study? (4 marks)
b. Does the promotion make a significant difference to the weekly turnover of the
stores in the pilot study? (7 marks)
c. Does the promotion significantly increase the weekly turnover of the stores in the
study? (7 marks)
d. Write a short report on your findings. Comment on the implications for the other
supermarkets in the group. (7 marks)
Examination Answers
11 20 12 11 16 13 15 21 18 13 mean = 15
Deviations from mean −4 5 −3 −4 1 −2 0 6 3 −2
Absolute deviations 4 5 3 4 1 2 0 6 3 2 mean = 3
Week: 1 2 3 4 5 6 7 8
Stock: 13 15 14 10 8 11 12
Three-point moving averages (of weeks 1–3, 2–4, …, 5–7): 14, 13, 10.7, 9.7, 10.3
Forecasts (for weeks 4–8): 14, 13, 10.7, 9.7, 10.3
Week: 1 2 3 4 5 6 7 8
Stock: 13 15 14 10 8 11 12
Exponentially smoothed values: 13, 13.4, 13.5, 12.8, 11.9, 11.7, 11.7
Forecasts (for weeks 1–8; the forecast for each week is the smoothed value of the previous week): – 13.0 13.4 13.5 12.8 11.9 11.7 11.7
24 The correct answer is D. Because ice cream sales are to be forecast, a technique that
can handle seasonality is needed. This suggests D rather than B or C. A quantitative
rather than qualitative technique is suggested by the lengthy data record, ruling out
A.
25 The correct answer is D. The MAD calculates the average of the absolute values of
the forecasting errors (i.e. the average error size). Therefore, exponential smoothing,
having the lower MAD, is better on average. The MSE calculates the average of the
squared forecasting errors. Squaring the errors has the effect of giving a heavy
weighting to any large errors. Consequently, if exponential smoothing has a higher
MSE but a lower MAD, it is likely to have at least one more really large error.
Case Study 1
1
(a) Two-stage cluster sampling. It is two-stage sampling because there are two levels
in the sampling: universities/colleges and bookshops. It is cluster sampling be-
cause at the second stage all bookshops near the universities/colleges are
included in the sample.
(5 marks)
(b) This method is preferable because it will be:
(i) cheaper. Only a limited number of locations need be visited since the sample
is restricted to ten universities/colleges. Simple random sampling could
mean visiting just one bookshop at many more locations.
(ii) easier. A list of universities/colleges from which to select is more likely to be
available than a list of bookshops near campuses.
(iii) more accurate. The area near a university or college is likely to contain all
types of bookshop (in terms of size, ownership, site, proximity to students,
etc.). This diversity will be reflected in the sample, whereas it might not
have been with simple random sampling.
(10 marks)
(c) Potential sources of bias:
(i) Not all bookshops used by students will be less than two miles from a cam-
pus and they will be excluded.
(ii) The book will be of interest to groups other than students: graduates, po-
tential employers, etc. The bookshops used by these groups will mainly be
excluded from consideration.
(iii) Two miles is an arbitrary radius. Large universities will be more spread out
and specialist student bookshops could be at greater distances from the
centre of a campus.
(5 marks)
(d) Possible improvements:
(i) There are known differences among universities, which might affect the
number of copies on display and which are ignored. Stratified sampling
would make the sample more accurate by recognising the difference be-
tween undergraduate and graduate schools, large and small schools.
(ii) More of the potential market would be included by having other strata, for
example, city centre bookshops.
(iii) Instead of a two-mile radius, the catchment areas of the universities should
be defined.
(5 marks)
Case Study 2
1
(a) First calculate the average of each quarter’s seasonal factors to see if their level
needs to be adjusted. Quarterly seasonal averages:
Q1 Q2 Q3 Q4
2012 – – 76 100
2013 113 109 73 104
2014 109 111 76 106
2015 110 110 72 104
2016 114 107 77 102
2017 114 108 76 108
Average 112 109 75 104
Case Study 3
1
(a) The procedure is to follow the five stages for the non-technical analysis of data.
Stage 1. Eliminate irrelevant data: in this case there are no superfluous data.
Stage 2. Consider the seven guidelines for communicating data and re-present the
table.
Three changes can be made. They are to:
(i) round the percentages to eliminate the decimal points.
(ii) put rows and columns in size order. This can be done by calculating row
and column averages and using these figures as the basis for ordering.
Stage 3. Describe the overall pattern. The companies have approximately the
same ranking whatever the attribute: Shell scores highly for all attributes, Fina
lowly.
(10 marks)
(b) Stage 4. Investigate the exceptions to the pattern. There are three main excep-
tions. They are that:
(i) Esso scores more highly than expected for ‘large company’.
(ii) Jet scores more highly than expected for ‘value for money’.
(iii) National scores more highly than expected for ‘with-it people’.
(5 marks)
(c) All exceptions can be explained in terms of the brands’ advertising styles. Esso is
known as a large company internationally, Jet is known as a retailer of cheaply
acquired petrol and National has advertised to younger age groups. The conclu-
sion is that the pattern is a good fit.
(5 marks)
(d) Stage 5. Finally, additional information should be introduced. Is the response to
the questions proportional to market share or to the size of the company? Does
the pattern hold for other samples in other regions and at different times? It may
even hold for other products besides petrol.
The information can be used to monitor the effectiveness of advertising cam-
paigns. The impact of advertising does not depend on the level of response but
on the extent to which the response changes from the norm.
(5 marks)
Case Study 4
1
(a) The regression equation is:
mpg = 53.9 − 0.4 × speed − 0.8 × altitude − 1.5 × weight
When speed = 45, altitude = 200 m (i.e. 2 in the units of the equation) and weight = 300 kg (i.e. 3), the predicted consumption is 53.9 − (0.4 × 45) − (0.8 × 2) − (1.5 × 3) = 29.8 mpg.
Trip Residual
1 2.1 = 33 − (53.9 − 0.4 × 50 − 0.8 × 0 − 1.5 × 2)
2 3.4 = 36 − (53.9 − 0.4 × 40 − 0.8 × 1 − 1.5 × 3)
3 0.5 = 31 − (53.9 − 0.4 × 45 − 0.8 × 3 − 1.5 × 2)
4 1.3 = 24 − (53.9 − 0.4 × 55 − 0.8 × 4 − 1.5 × 4)
5 −0.6 = 34 − (53.9 − 0.4 × 35 − 0.8 × 1 − 1.5 × 3)
6 1.1 = 27 − (53.9 − 0.4 × 45 − 0.8 × 5 − 1.5 × 4)
7 −2.0 = 20 − (53.9 − 0.4 × 55 − 0.8 × 3 − 1.5 × 5)
8 −1.9 = 35 − (53.9 − 0.4 × 35 − 0.8 × 0 − 1.5 × 2)
9 −0.6 = 32 − (53.9 − 0.4 × 40 − 0.8 × 1 − 1.5 × 3)
10 −0.8 = 27 − (53.9 − 0.4 × 50 − 0.8 × 2 − 1.5 × 3)
11 0.3 = 23 − (53.9 − 0.4 × 55 − 0.8 × 4 − 1.5 × 4)
12 −3.3 = 28 − (53.9 − 0.4 × 45 − 0.8 × 2 − 1.5 × 2)
(b) There is a pattern in the residuals in that the negative residuals are in the latter
trips. However, this is not linked to particular values of the independent varia-
bles. If the trips are numbered in time order (i.e. trip 1 was the first and so on),
the pattern could mean that fuel consumption worsened as the car became older.
(5 marks)
(c) To see if variables should be excluded, the t values have to be calculated:
Speed has a t value that is less than −2 and is rightfully included. There is a
doubt about altitude and weight since their t values are closer to zero. However,
other evidence (on the working of internal combustion engines) suggests that
they should be retained.
(5 marks)
(d) Approximately, the accuracy of the forecast (95 per cent confidence limits) in
part (a) is given by 2 × residual standard error (i.e. ±4.4).
The standard error of residuals reflects error arising from the residuals only.
SE(Predicted) reflects error from the estimate of the regression line as well as the
residuals.
(5 marks)
(e) Improvements to the regression:
(i) Include more data: 12 observations are too few when there are three inde-
pendent variables.
(ii) Ensure that the style of driving (or the driver) does not change from trip to
trip.
(iii) Ensure the terrain is approximately the same for each trip (i.e. the same
mixture of urban and rural).
(iv) Include a variable for wet or dry conditions.
(5 marks)
Over both courses, 37 out of 80 (46.2 per cent) found the course satisfactory.
2x + 5y = 18 (1)
3x − 4y = 4 (2)
(1) × 3:
6x + 15y = 54 (3)
(2) × 2:
6x − 8y = 8 (4)
(3) − (4):
23y = 46
y = 2
In (1):
2x + 10 = 18
2x = 8
x = 4
Check in (2):
LHS: 3x − 4y = 12 − 8 = 4
RHS: 4
Solution: x = 4; y = 2
8 The correct answer is C. 200 replies were received from non-clerical staff. Of these,
120 (150 − 30) indicated that their firm used a computer.
Therefore, 120/250 = 48 per cent of the replies were filled in by non-clerical staff
from computer-using firms.
9 The correct answer is D.
Median = (3.40 + 3.45)/2
= 3.425
10 The correct answer is A.
x (x − x̄) (x − x̄)²
2.95 −0.459 0.211
3.10 −0.309 0.095
3.17 −0.239 0.057
3.40 −0.009 0.000
3.45 0.041 0.002
3.50 0.091 0.008
3.80 0.391 0.153
3.90 0.491 0.241
∑x = 27.27   ∑(x − x̄)² = 0.767
x̄ = 3.409
Variance = ∑(x − x̄)²/(n − 1) = 0.767/7 = 0.11
Source d.f. SS MS F
Treatments 5 160 32 2.94
Blocks 3 120 40 3.67
Error 15 164 10.9
Total 23 444
Treatment F = 2.94
22 The correct answer is A. Critical values:
treatments = 2.90 at 5% with (5,15) d. f.
= 4.56 at 1%
blocks = 3.29 at 5% with (3,15) d. f.
= 5.42 at 1%
They are significantly different at the 5 per cent level of significance.
23 The correct answer is C.
Month (x)   Sales (y)   xy    x²
1           12          12     1
2           15          30     4
3           16          48     9
4           18          72    16
5           20         100    25
6           23         138    36
∑x = 21   ∑y = 104   ∑xy = 400   ∑x² = 91
x̄ = 3.5   ȳ = 17.33
b = (∑xy − n·x̄·ȳ)/(∑x² − n·x̄²)
= (400 − 6 × 3.5 × 17.33)/(91 − 6 × 3.5²)
= 36.07/17.5
= 2.061
a = ȳ − b·x̄ = 17.33 − 2.061 × 3.5
= 17.33 − 7.2135
= 10.12
Equation: y = 10.12 + 2.06x (i.e. approximately y = 10.1 + 2.06x)
24 The correct answer is B.
n1 = number of positive residuals = 4
n2 = number of negative residuals = 6
Lower critical value = 2
Upper critical value = 9
25 The correct answer is A. The calculation gives 0.037.
Case Study 1
1 Use the nine guidelines from the text, integrating them with the practical details of
the case.
Step 1. The case does not make clear what the purpose of the forecasts is. It could
be to plan human resource requirements or to set a seasonal pricing schedule or to
plan special offers. There are several other possibilities. This must be clarified at the
outset. In this context the current decision-making process – timing, decision
makers, systems – should be analysed. It may be necessary to revise the decision-
making process. It would of course be important to know the hotel’s business
strategy.
Step 2. In the light of Step 1, a detailed list of the forecasts required should be made
– the variables (room occupancy, restaurant usage, conference usage, etc.), their
levels of accuracy, their time horizon (just the season ahead or more), their timing
(how long before the season they are needed), to whom they will be distributed.
Step 3. A conceptual model would probably relate hotel usage to factors such as
pricing, promotional spend, local rival capacity, marketing consortia, economic
factors (exchange rates, economic activity, etc.), seasonal factors and citations in
guidebooks. The model would segment the market.
Step 4. A wealth of historical data is probably available: occupancy rates for
different facilities at different levels of detail, promotional spend, prices, economic
data including regional activity. Data on demand, as opposed to occupancy, may not
be available – the hotel is unlikely to keep records of customers turned away
because it is full. Estimates may have to be made.
Step 5. More than one technique should be used, choosing from judgement
(Delphi, scenarios) and causal modelling.
Step 6. The independent test of accuracy would be used to show how well the
techniques would have worked if used last season. Measures of accuracy would be
the MSE and MAD.
Step 7. Judgement would come into play when unusual or unforeseeable events
occurred, for example, exceptional weather or especially good or bad reviews.
Judgement has to be used in such circumstances, and at the last minute. It would be
important to record any changes made.
Step 8. All decision makers and stakeholders would have to be involved in any new
decision-making system or any new provision of forecast information.
Case Study 2
1
(a) [Scatter diagram: annual maintenance cost (£00), scale 0.5 to 4.0, plotted against weekly usage (hours), scale 0 to 40.]
(b) Using the formula/data table in the text, the coefficients of the linear regression
can be calculated:
b = 0.092, a = 0.41
And so the linear regression equation is:
y = 0.41 + 0.092x
where y = annual maintenance cost (£00) and x = weekly usage (hours).
(c) Using the formula/data table in the text, the correlation coefficient of the
regression can be calculated:
r = 0.96
(d) The regression has a close fit, as suggested by the high correlation coefficient,
although this should be checked for significance. Accuracy is likely to be good
but there are a number of reservations:
the findings are valid only within the range of the data;
a large sample is needed.
As a further test, the residuals should be checked for randomness, either by plot-
ting residuals against fitted values or by conducting a runs test.
(e) An assumption has to be made on future usage. Assuming it is slightly higher at,
say, 25 hours then from the regression equation:
Cost = 0.41 + (0.092 × 25) = 2.71
Annual cost for 6 PCs = 2.71 × 6 = 16.26 (i. e. £1626)
This is the projected internal cost of maintenance and therefore the highest that
should be paid for an outside contract. This assumes that there is no change in
cost because of changes in the average ages of the PCs – through the passage of
time or through purchasing new PCs.
Case Study 3
1
Deviations are taken from the overall mean of 5 (total of 90 over 18 glasses).
Line A (x − 5)   Line B (x − 5)   Line C (x − 5)
5 0   6 1   3 −2
6 1   7 2   3 −2
4 −1   8 3   4 −1
4 −1   6 1   5 0
5 0   7 2   3 −2
4 −1   6 1   4 −1
Total 28   40   22
n 6   6   6
Mean 4.67   6.67   3.67
ANOVA table
Source DF SS MSE F ratio
Treatments 2 28.00 14.00 20.89
Error 15 10.00 0.67
Total 17 38.00
Case Study 4
1
Data for Store 5 were missing. Test based therefore on total of nine stores.
(a) Validity of pilot study:
Random selection of sample.
Choice of weeks − avoid festivals, etc.
Size of stores selected.
Geographical locations.
Effect of local promotions.
Number of stores chosen.
Observed t = d̄/(s/√n)
where d̄ = mean difference in turnover between Week 2 and Week 1, s = standard deviation of the differences, n = 9
= 3.49 with (n − 1 = 8) degrees of freedom
From t tables, critical value at the 1 per cent level = 3.355. This is the value that
leaves 0.005 in each tail.
As our calculated value of t is greater than 3.355, we reject the null hypothesis at
the 1 per cent level of significance.
There is a significant difference in the turnover of Week 2 as compared to Week 1, at
the 1 per cent level of significance.
(c) To test for a significant increase in turnover, use a one-tailed test with the
hypothesis that there is no increase.
Calculation of the observed t is as before (i.e. t = 3.49) with 8 degrees of free-
dom.
From tables (with 8 d.f.), the critical value of t for a one-tailed test is 2.896. This
is the value that leaves 0.01 in one tail.
As our calculated value of t is greater than 2.896, we reject the null hypothesis at
the 1 per cent level of significance.
There is a significant increase in the turnover of Week 2 as compared to Week 1, at
the 1 per cent level of significance.
(d) The findings are, as in (a) and (b), a significant difference and a significant
increase in Week 2 as compared to Week 1, both at the 1 per cent level of signif-
icance.
It seems likely that the introduction of the voucher scheme may produce a simi-
lar pattern for the turnover in other stores in the group. This will of course
depend on how representative the stores used in the pilot study were. Also, a
trial period of one week is not very long; the turnover should be monitored for a
longer period. Moreover, the cost of the promotion and the effect on turnover
should be evaluated.
Module 1
Review Questions
1.1 The correct answer is True. A large part of statistics is concerned with analysing
sample information and then generalising the results to the population. However
well chosen, a sample may be non-representative, in which case any conclusions
may be incorrect. Statistical theory enables one to calculate a degree of belief (i.e. a
probability of being right) for the conclusion.
1.2 The correct answer is D. After the first card has been drawn and not replaced, there
are 51 cards left in the pack, of which three are aces.
P(second ace) = 3/51 = 1/17
1.3 The correct answer is A, C.
1.4 The correct answer is B. If a coin is unbiased:
P(tails) = 1/2
Earlier tosses of the coin are unconnected with the ninth trial and therefore cannot
affect the probability. However, after eight consecutive ‘heads’ one might begin to
doubt that the coin was unbiased.
1.5 The correct answer is C. Not less than £50 000 means £50 000 or more (i.e. the two
highest categories). Total frequency for the two categories is 17 + 6 = 23.
1.6 The correct answer is A. Frequencies can be turned into probabilities. For sales of
£60 000 or more, the frequency is 6, which is equivalent to a probability of:
6 1
=
78 13
1.7 The correct answer is B. Ninety per cent of days is 9/10 × 78 = 70 days,
approximately, which is covered by exactly the top four categories (categories 2–5).
The level exceeded on 90 per cent of days must thus be the boundary between the
first and second categories (i.e. £30 000).
1.8 The correct answer is B. There are many distributions of which the normal is one
that is much used. It has a smooth shape, is continuous and is always symmetrical.
Its parameters do affect its shape in some ways, but not its symmetry.
1.9 The correct answer is D. Sixty-eight per cent of the readings are within ± 1 standard
deviation of the mean (i.e. between 50 and 70). Since the distribution is symmetrical
about the mean, half of this percentage must be between 60 and 70.
1.10 The correct answer is B. Ninety-five per cent of motorists’ speeds are in the range
mean ± 2 standard deviations, 82 ± 22. Outside this range are 5 per cent of motor-
ists. Because of the symmetry of the normal distribution, 2.5 per cent must be less
than the range, 2.5 per cent more than the range (see Figure A4.1). Therefore, 2.5 per
cent of motorists have a speed less than 60 km/h. Consequently, 97.5 per cent must
have a speed greater than 60 km/h.
[Figure A4.1: normal distribution of speeds with 95 per cent of the area between 60 and 104 km/h and 2.5 per cent in each tail.]
Note that the classes are mutually exclusive. Each reading goes in one class, and
only one class. If the class intervals were 0–1, 1–2, etc., there would be confusion
when it came to classifying 1.0. Note also that the limits of each class are whole
numbers of minutes, which are easier to work with than classes of, say, 1.5–3.5, etc.
The frequency histogram derived from this data is given in Figure A4.2.
[Figure A4.2: frequency histogram of the service times, with frequencies (up to about 25) plotted against classes from 1 to 10 minutes.]
9 or more 2%
8–9 2%
7–8 3%
6–7 3%
Total 10%
The total frequency associated with the 6–7 class is 5. Approximately, it can be
estimated that 3 are in the range 6.4–7.0, and 2 in the range 6.0–6.4 (dividing the
range in the ratio 3:2). Therefore, the service time exceeded by only 10 per cent of
the customers is 6.4 minutes.
This last part of the question could have been answered more exactly by going back
to the original data. The five readings in this class are 6.1, 6.3, 6.5, 6.6, 6.9. This
would lead to the conclusion that the time exceeded by only 10 per cent of custom-
ers is 6.3 minutes.
JPC is likely to pay for 298 bonus units/day, not 200 (= 20 × 10) as it believes.
Under JPC’s calculations the employees receiving 0 bonus units would actually have
to pay negative bonuses to the company.
Current cost per day = 71 × 0.72 × 20 = £1022.40
New cost per day = 71 × 0.72 × 20 + 298 × 0.50 = £1171.40
True offer = (1171.40 − 1022.40)/1022.40
= 14.6%
Module 2
Review Questions
2.1 The correct answer is A.
2.2 The correct answer is B. The line has intercept = 1 (when x = 0, y = 1), therefore C
and D must be wrong. The slope of the line is negative, therefore A must be wrong.
Since the line passes through (0,1) and (1,0), the slope is −1 (i.e. m = −1). The line
is y = 1 − x.
2.3 The correct answer is A. B can be rejected immediately since it is a straight line and
the equation is not linear (it has an x2 in it).
Plot a selection of points:
x = −1, y = 11
x = 0, y = 4
x = 1, y = −1
x = 2, y = −4
x = 3, y = −5
x = 4, y = −4
x = 5, y = −1
The curve starts with a high value of y at x = −1. As x increases y, decreases and is a
minimum when x = 3. After this point, y increases again. It is A rather than C that
has this shape.
2.4 The correct answer is C.
6x + 4 = 2y − 4
Add 4 to both sides: 6x + 8 = 2y
Divide both sides by 2: 3x + 4 = y
2.5 The correct answer is D.
2.11 The correct answer is C. The definition of logarithms means that a number is being
sought such that 2? = 8. The number is 3.
2.12 The correct answer is B. A is the equation of a straight line and cannot be correct.
The intercept of the curve shown is 10, thus D cannot be correct. The curve shown
is one of growth. The equation C is one of decay (the exponent is negative) and thus
cannot be correct. At first sight B satisfies all the requirements. A check shows that:
At x = 1, y = 10 · 10^0.5 = 10 · √10 = just over 30
This is in accordance with the curve in the graph and B is correct.
[Graph: total cost of hire C (£) plotted against miles travelled x, with values marked at C = 50, C = 140 and x = 1000.]
Revenue = 5x
Thus 5x = 100 + 4x
x = 100
The breakeven point for System 1 is 100.
For System 2
Costs = 1200 + 4x
Revenue = 8x
Thus 8x = 1200 + 4x
4x = 1200
x = 300
The breakeven point for System 2 is 300.
Sales = 10 000 × e^1.662
From tables, e^1.662 = 5.27
Sales = 52 700
Module 3
Review Questions
3.1 The correct answer is A, B. These are the two major guidelines in communicating
data. There is no requirement for data to be specified to two decimal places (alt-
hough one of the ‘rules’ of data presentation is that they should be rounded to two
effective figures). The data do not have to be analysed before presentation (although
presentation should be in a form amenable to analysis). It is often the case that only
the receiver is in a position to analyse them.
3.2 The correct answer is False. Although the assumption is often made, there is
frequently no link between the specification of data and its accuracy.
3.3 The correct answer is True. Data should be sufficiently accurate to avoid wrong
decisions being made. If the decisions do not require a high level of accuracy, then it
is pointless to supply such accuracy.
3.4 The correct answer is B. The two effective figures are the 3 and the 7. The rounded
number is thus 3700.
3.5 The correct answer is C. All numbers are rounded to their first two figures: 1700 for
1732, 38 for 38.1, etc.
3.6 The correct answer is A, B, C. All are correct. Speedy comparisons, ease of
subtraction and closeness are all factors that aid the comparison of numbers.
3.7 The correct answer is A, B, C, D. All are correct. Any of the four bases could be
used. Circumstances and/or taste would dictate which.
3.8 The correct answer is B, C. The rows are apparently in alphabetical order, so B is
valid (although it is conceivable that they have been ordered by, say, capital em-
ployed). A gap has been introduced in the numbers purely to accommodate the
labels, thus C is valid. A is not a valid criticism since the numbers are rounded to
two effective figures. The first figure is ‘1’ for all numbers, therefore the two
effective figures are the second and third. D is not valid, since a vertical gridline
would not help the communication of the data.
3.9 The correct answer is C. Income statements cannot be ordered by size because they
have a logical build-up. Nor would it help with communication to use summary
measures or interchange rows and columns. Other rules can be applied. Rounding
can be done and is not illegal. A and B are not correct. Auditing can only be
checked in published accounts in the most superficial and trivial way. Most pub-
lished accounts are in fact rounded to some degree already. D is not correct since
published accounts are for shareholders.
3.10 The correct answer is B, D. A is not correct since the involvement of time has no
bearing on the choice between tables and graphs. C is not correct since graphs are
not usually good at distinguishing fine differences.
Average 35 74 79 77 85
Rule 1. Rounding to two effective figures means losing the decimal point. It
would be perfectly acceptable to do so given the purpose of the table (monitor-
ing performance). In any case, it is unlikely that data relating to achievement of
objectives are accurate to one decimal place. However, if the data were for dif-
ferent purposes – for example, paying bonuses for an outsourced service – the
loss of the decimal place might not be acceptable.
Rule 2. The data are in alphabetical order. This would be fine if the table was a
long reference table where readers were only interested in the performances of
particular authorities. To judge the relative performance of the six authorities,
size order is better. The order in the first month has been chosen so that relative
improvement can easily be seen.
Rule 3. Rows and columns have not been interchanged. Comparisons between
authorities are at least as important as comparisons over time.
Rule 4. A summary measure would be useful since it would allow comparisons
with ‘average performance’. The ‘average’ used here is the ‘total’ – the achieve-
ment of all authorities taken as a combined group. If this is not available then the
arithmetic mean of the six authorities would serve nearly as well.
Rule 5. Labelling is generally clear but some minor changes have been made,
mainly to improve the appearance of the table.
Rule 6. The excessive use of gridlines is obtrusive and obscures comparisons.
This is one of the worst features of the original table. The gridlines, except for
the horizontal one separating the headings from the data, have been removed.
The columns have been moved closer together for easier comparison.
Rule 7. A short verbal summary might be: Authorities have made significant
improvements, moving from an average achievement of 34 per cent to 85 per
cent. All authorities have improved to some extent but some have improved
more markedly than others.
[Graphs: GDP (Eur thousand million), 1965 to 1995. The first chart (scale to 2000) shows Germany, Italy, Belgium and Denmark; the second (scale to 8000) shows the USA, Japan, Germany, France and Belgium.]
Module 4
Review Questions
4.1 The correct answer is False. Traditional statistical methods can be helpful, but
because they were not designed for management situations there is a gap in a
manager’s needs they do not fill.
4.2 The correct answer is False. The need for such skills is not new, but it is now
especially important because more data are now available than previously. In any
case, computers do not necessarily have to present data in a complicated style,
although they often do so.
4.3 The correct answer is C. Most data sets contain items which are unnecessary or
irrelevant to the analysis being undertaken. Inaccuracies should lead to corrections
being made rather than omissions, so A is not a correct reason. Large amounts of
data may well be analysed at one time, so B is not a correct reason.
4.4 The correct answer is A, B. Data to eight decimal places may or may not be
required, depending on the situation. The key point is that the number of decimal
places does not necessarily denote accuracy.
4.5 The correct answer is A, B. C is incorrect since the model will be to some extent an
approximation and thus more inaccurate than the original data.
4.6 The correct answer is B. Sales increase by almost 25 per cent each year. A and D are
also models, but not such good ones. C refers only to one year.
4.7 The correct answer is B. Division 2 is the exception, since its profit/capital
employed is under 4 per cent. The others all earn a return of six per cent (this is the
model for the data). It may be tempting to think of Division 4 as the exception since
it is the largest.
4.8 The correct answer is False. You would not be right. Although 11 exceptions out of
36 is a high ratio, all can be explained in terms of strikes or holidays. The model is
probably a good one with the condition that it does not apply when there are strikes
or holidays.
4.9 The correct answer is C, D. C and D would help to show whether the pattern found
for distilled spirits was repeated for other alcoholic beverage sectors of the market.
Since the essence of the analysis is to look at variations between states of the USA,
it is difficult to see that there would be a basis for comparison with the provinces of
a different country. The factors that make one province different from another are
not the same as make one state different from the next.
4.10 The correct answer is False. The reason for choosing a simple model first is that the
objective of understanding the data can be achieved more quickly and easily. Any
model that is a good one, whether simple or sophisticated, will ‘model’ the pattern,
not obscure it.
(c) The model for the data is that, under each category of expenditure, the percent-
age for each region is roughly the same as the percentage for the total (the whole
of England and Wales). The model is, to some extent, dictated by the question,
since we are looking for out-of-the-ordinary expenditures.
(d) The major exceptions can be spotted by looking down each column and noting
the regions whose percentage is very different from the percentage at the head of
the column.
(e) The only really valid comparison would have to be with previous years’ expendi-
tures.
Suppose a major exception is defined, arbitrarily, as being more than one-fifth away
from the England and Wales figure. For example, for raw materials an exception is
any region whose percentage for this category is between 40 − 8 and 40 + 8 (8
being one-fifth of 40) (i.e. between 32 per cent and 48 per cent). Choosing a fraction
other than one-fifth would result in a tightening or a loosening of the definition of
‘major exception’.
The answer to the question posed is that the unusual exceptions are:
(a) Raw materials: North West, South, Wessex.
(b) Transport: Wales.
(c) Manpower: Wessex, Yorkshire, Wales.
(d) Finance: There is a large variation in this category. Only Thames and Anglia are
close to the average figure.
(e) Fuel: Anglia, Wessex.
Note that in this instance ordering by size has had little effect on the analysis. This
does not diminish its general usefulness. Once the exceptions are known, the
analysis has done its job. It is for management to explain the exceptions and, if
necessary, take action.
(d) Summary measures are already provided. The total row presumably refers to the
overall data for all trades.
(e) There is no excessive use of gridlines or white space in the original table.
(f) The labelling is poor. The labels appear to be too long. Although it is sometimes
necessary to preserve legal or precise definitions, this is unlikely to be the case
here. The labels also interfere with the presentation of the numbers. Their length
results in there being gaps in the columns of numbers. The labelling should be
abbreviated.
(g) No verbal summary can be made at this stage.
The result of this re-presentation is Table A4.5.
No major pattern emerges, but there is some indication that trades for which a large
percentage of employers are underpaying also have a large percentage of employees
being underpaid. This could be the model of the data. However, the exceptions to
this model are so many that there must be doubts as to its validity.
When searching for another model involving the third column, it becomes apparent
that this column, referring to the amount underpaid, is meaningless unless the
sample size is known, i.e. the number of employers and employees concerned in the
amounts underpaid. This fact suggests that the next step in the analysis might be to
find information on the sample sizes and thereby calculate the average amounts
underpaid per employee in each trade. This is a case where, even though no neat
analysis emerges, it is clear what to do next.
Module 5
Review Questions
5.1 The correct answer is B. Summary measures are used because a few summary
figures are easier to handle (i.e. to remember, to compare with other numbers, etc.)
than a set of data comprising perhaps several hundred numbers. If the summary
measures are well chosen, little accuracy will be lost for practical purposes, but
inevitably some information will be; if the measures are ill-chosen they could
misrepresent the data and be misleading. Measures of location and scatter frequently
capture the main features of a set of data, but not always. Just how a set of data can
be summarised depends upon the set itself.
5.2 The correct answer is A.
Arithmetic mean = 64/16
= 4
5.3 The correct answer is D. In ascending order:
0, 1, 1, 2, 2, 3, 4, 4, 5, 5, 5, 5, 6, 6, 7, 8
Median = (4 + 5)/2
= 4.5
5.4 The correct answer is E. Mode = 5
5.5 The correct answer is E. None of the statements A–D applies.
The arithmetic mean is the pre-eminent measure of location because it is well
known, easy to use and easy to calculate. But in a few circumstances (the presence
of extreme readings, distinct clusters of readings and when taking the average of
averages) the arithmetic mean can be misleading. Except for these circumstances,
the arithmetic mean is preferable in the case of a symmetrical distribution, when all
three give approximately the same result. For a U-shaped distribution the arithmetic
mean is not helpful for descriptive purposes (there are two clusters of numbers).
23 27 21 25 26 22 29 24 27 26
x − x̄   −2 2 −4 0 1 −3 4 −1 2 1   (x̄ = 25)
|x − x̄|   2 2 4 0 1 3 4 1 2 1   ∑|x − x̄| = 20
(x − x̄)²   4 4 16 0 1 9 16 1 4 1   ∑(x − x̄)² = 56
Mean absolute deviation = ∑|x − x̄|/n
= 20/10 = 2
5.11 The correct answer is B.
Variance = ∑(x − x̄)²/(n − 1)
= 56/9 = 6.2
5.12 The correct answer is D.
Standard deviation = √Variance
= √6.2 = 2.5
5.13 The correct answer is B. A, C and D are untrue. While no one measure of scatter is
pre-eminent in general, a particular measure is likely to be preferable for each
particular set of data. The range is a descriptive measure that depends on the two
extreme readings. The interquartile range overcomes this problem. Variance suffers
from the disadvantage that it measures the scatter in units that are the square of the
units of the original readings. Standard deviation, being the square root of variance,
measures scatter in original units. Variance and standard deviation are easy to handle
mathematically but are not intuitively easy to understand. The opposite is true of
interquartile range and mean absolute deviation. They are ‘sensible’ and simple
measures of scatter, but are not easy to handle mathematically.
5.14 The correct answer is B, C. An outlier should be corrected, retained or excluded. It
is corrected of course when an error in typing, collection, etc., has been made; it is
always retained when it is an integral part of the pattern; it is usually – but not
always – excluded when it is not a regular part of the pattern. The ambiguity in the
latter case arises because of the difficulty in deciding whether an event is truly
isolated (e.g. is a strike an isolated event?). Answer A is false because an outlier is
sometimes corrected.
5.15 The correct answer is B. Between 2013 and 2014, the cost of living grew by 8.7 per
cent, while wages and salaries grew by 8.4 per cent.
The growth in the cost of living is therefore slightly greater.
Mean life for Supplier A's bulbs:
= [(12 × 800) + (14 × 1000) + (24 × 1200) + (10 × 1400)]/60
= 66 400/60
= 1107 hours
Note that all bulbs whose length of life was, say, 700–899 hours are approximat-
ed as having the mean length of life for that class (i.e. 800 hours).
Mean life for Supplier B’s bulbs:
= [(4 × 800) + (34 × 1000) + (19 × 1200) + (3 × 1400)]/60
= 64 200/60
= 1070 hours
Supplier A’s bulbs last on average 37 hours longer.
(b) To measure uniformity of quality means measuring the scatter of length of life.
Range and interquartile range do not depend on all 60 observations. Since two
suppliers are being compared and the differences may be small, a more precise
measurement is required. Standard deviation (or variance) could be used, but
mean absolute deviation is easier to calculate.
For Supplier A:
MAD = (12 × |800 − 1107| + 14 × |1000 − 1107| + 24
× |1200 − 1107| + 10 × |1400 − 1107|)/60
= (3684 + 1498 + 2232 + 2930)/60
= 10 344/60
= 172 hours
For Supplier B:
MAD = (4 × |800 − 1070| + 34 × |1000 − 1070| + 19
× |1200 − 1070| + 3 × |1400 − 1070|)/60
= (1080 + 2380 + 2470 + 990)/60
= 6920/60
= 115 hours
Supplier B’s bulbs are more uniform in quality.
(c) Having a longer average life is obviously a desirable characteristic. However,
uniform quality is also desirable. As an example, consider a planned maintenance
scheme in which bulbs (in factories or offices) are all replaced at regular intervals
whether they have failed or not. This can be cheaper since labour costs may be
higher than bulb costs and replacing all at once takes less labour time than indi-
vidual replacement. In such a scheme the interval between replacement is usually
set such that, at most, perhaps 10 per cent of bulbs are likely to have failed. The
interval between replacement can be larger for bulbs of more uniform quality
since they all tend to fail at about the same time. Consequently, the scatter of
length of life is more important than average length of life in these circumstanc-
es.
The choice between Supplier A and Supplier B depends upon the way in which
the bulbs are replaced and is not necessarily based solely on average length of
life.
A B C D
Coefficient of variation 0.06 0.06 0.08 0.07
There is, therefore, little difference between departments with respect to stability.
of readings (the classes for 0, 1, 2 miles and part of the 3-mile class) and the top 25
per cent (classes for 30+, 20–29, 15–19, 12–14 miles and part of the 10–11 miles
class).
Interquartile range = 11 − 3 = 8 miles
A second measure of scatter is more difficult. The range cannot be calculated
because the largest distance is not known. The MAD, variance and standard
deviation cannot be calculated because the highest class is imprecise. Whereas, for
instance, the 15–19 miles class can be substituted by its mid-point in making
calculations, this cannot be done for a class defined as ‘30 and over’ miles. However,
since it is known that the arithmetic mean distance is 10.5 miles, the mid-point of
the 30+ miles class can be approximated.
If x is the average distance for the 30+ miles class, then:
Arithmetic mean = [(3 × 0) + (7 × 1) + (9 × 2) + (10 × 3) + (16 × 4.5)
+(12 × 6.5) + (11 × 8.5) + (8 × 10.5) + (8 × 13) + (4 × 17)
+(6 × 24.5) + (6 × x)]/100
= (701.5 + 6x)/100
Setting this equal to 10.5 miles:
701.5 + 6x = 1050
6x = 348.5
x = 58 (approximately)
The MAD can be calculated using 58 miles as the mid-point of the highest class:
MAD = [3 × |0 − 10.5| + 7 × |1 − 10.5| + 9 × |2 − 10.5| + 10 × |3 − 10.5|
+ 16 × |4.5 − 10.5| + 12 × |6.5 − 10.5| + 11 × |8.5 − 10.5| + 8 × |10.5 − 10.5|
+ 8 × |13 − 10.5| + 4 × |17 − 10.5| + 6 × |24.5 − 10.5| + 6 × |58 − 10.5|]/100
= 830.5/100
= 8.3 miles
The data can be summarised (the median is 6.5 miles with a mean absolute deviation
of 8.3 miles). The interquartile range could be substituted for the MAD.
Since the interesting feature is the extremely long distance travelled by a few, an
alternative summary could be more ad hoc (the median is 6.5 miles, but a quarter of
the workers travelled more than 12 miles; i.e. a verbal summary is being used to
describe the skewed nature of the data).
(c) It does matter which index is used since between 2013 and 2014 the indices
move in different directions. The weighted index must be the better one since it
allows for the fact that very different quantities of each product are purchased.
The importance of the weighting is twofold. First, petroleum products are all
made from crude oil and are to some extent substitutable as far as the producer
is concerned. An index should reflect the average price of every gallon of petro-
leum product purchased. Only the weighted index does this. Secondly, products
such as car petrol, of which large quantities are purchased, will have a bigger
effect on the general public than products, such as kerosene, for which only
small amounts are purchased. Again, the index should reflect this.
(d) A differently constructed index would use different weightings. The price part of
the calculation could be changed by using price relatives but this would have
little effect since the prices are close together.
The different weightings that could be used are:
(i) Most recent year quantity weighting. But this would imply a change in his-
torical index values every year.
(ii) Average quantity for all years’ weighting. This would not necessarily mean
historical changes every year. The average quantities for 2012–14 could be
used in the future. This has the advantage that it guards against the chosen
base year being untypical in any way.
Module 6
Review Questions
6.1 The correct answer is C. Sampling is necessary because it is quicker and easier than
measuring the whole population while little accuracy is lost. Statement A is not true
because it is not always impossible to take population measurements, although it is
usually difficult. Statement B is untrue because sampling is always less accurate since
fewer observations/measurements are made.
6.2 The correct answer is A. Many sampling methods are based on random selection for
two reasons. First, it helps to make the sample more representative (although it is
unlikely to make it totally representative). Second, it enables the use of statistical
procedures to calculate the range of accuracy of any estimates made from the
sample. A is therefore a correct reason, while B, C and D are incorrect.
6.3 The correct answer is B. Starting top left, the first number is 5; therefore, the first
region chosen is SE England. Moving across the row, the second number is 8 and
the corresponding region is Scotland. The third number is 5, which is ignored, since
SE England is already included. In this situation, having the same region repeated in
the sample would not make sense. Consequently, we sample without replacement.
The fourth number is 0 and is also ignored since it does not correspond to any
region. The fifth number is 4 and so London completes the sample.
6.4 The correct answer is B, C. Multi-stage sampling has two advantages over simple
random sampling. The population is divided into groups, then each group into
subgroups, then each subgroup into subsubgroups, etc. A random sample of groups
is taken, then for each group selected a random sample of its subgroups is selected
and so on. Therefore, it is not necessary to list the whole population and advantage
B is valid. Since the observations/measurements/interviews of sample elements are
restricted to a few sectors (often geographical) of the population, time and effort
can be saved, as, for example, in opinion polls. Advantage C is therefore also valid.
Multi-stage sampling is solely a way of collecting the sample. Once collected, the
sample is treated as if it were a simple random sample. Its accuracy and the observa-
tions required are therefore just the same as for simple random sampling. Reasons A
and D are false.
6.5 The correct answer is A, C. In stratified sampling the population is split into
sections on the basis of some characteristic (e.g. management status in the absentee-
ism survey). The sample has to have the same sections in the same proportions as
the population (e.g. if there are 23 per cent skilled workers in the population, the
sample has to have 23 per cent skilled workers). In respect of management status,
therefore, the sample is as representative as it can be. In respect of other characteris-
tics (e.g. length of service with the company) the stratified sample is in the same
position as the simple random sample (i.e. its representativeness is left to chance).
Thus a stratified sample is usually more representative than a simple random one
but not necessarily so. Statement A is true.
A cluster sample can also be stratified by making each cluster have the same
proportions of the stratifying characteristics as the population. Statement B is
untrue.
If a total sample of 100 is required, stratification will probably mean that more than
100 elements must be selected. Suppose 23 per cent skilled workers are required and
that, by the time the sample has grown to 70, 23 skilled workers have already been
selected. Any further skilled workers chosen cannot be used. To get a sample of
100, therefore, more than 100 elements will have had to be selected. Only if the cost
of selection is negligible will a stratified sample be as cheap as a simple random
sample. Statement C is true.
6.6 The correct answer is B. Use of a variable sampling fraction means that one section
of the population is deliberately over-represented in the sample. This is done when
the section in question is of great importance. It is over-sampled to minimise the
likelihood of there being error because only a few items from this section have been
measured. Incidentally, D describes weighted sampling.
6.7 The correct answer is A, B, C. Non-random sampling is used in all three situations.
Size:           Small   Medium   Large
Sample %          33      33       33
Population %      34      34       32
(i) Some small percentage of accounts will have been closed between the
time of compilation of the computer listings and use of the infor-
mation, thus reducing the sample size and the accuracy. The original
sample may have to be increased to allow for this.
(ii) Some accounts will be dormant (i.e. open but rarely, or never, used). A
decision to include or exclude these accounts must be made. It is usual-
ly made after consideration of the purposes to which the information is
being put.
(iii) To know the occupation of customers requires visiting branches, since
this information may not be computerised. This is time-consuming and
requires the establishment of a working relationship with the bank
manager. It also requires permission to breach the confidentiality be-
tween customers and manager.
(iv) The personal details may well be out of date, since such information is
only occasionally updated.
(v) It may not be possible to classify some account holders into a socioec-
onomic group. For example, the customer may have been classified as a
schoolchild seven years ago. This is a problem of non-response. Omit-
ting such customers from the sample will lead to bias. Extra work must
be done to find the necessary details.
Module 7
Review Questions
7.1 The correct answer is B. The probabilities are calculated from the formula:
P(data class X) = Frequency of observations in class X/Total number of observations
This is the relative frequency method.
7.2 The correct answer is A. Even though the situation that generates the data appears
to be normal, the distribution is still observed because the probabilities were
calculated from frequencies. Had the data been used to estimate parameters from
which probabilities were to be found in conjunction with statistical tables, the
resulting distribution would have been normal.
7.3 The correct answer is False. The amounts owed are not strictly a continuous
variable since they are measured in currency units. They are not measured, for
example, in tiny fractions of pennies.
7.4 The correct answer is A. The addition rule for mutually exclusive events gives:
P(no more than 1 cancellation) = P(0 or 1 cancellation)
= P(0) + P(1)
= 32% + 29%
= 61%
7.5 The correct answer is C. The multiplication rule gives:
P(0 cancellations on day 1 and 0 on day 2) = 0.32 × 0.32
= 0.1024, i.e. 10.2% (approximately)
7.6 The correct answer is C. The basic multiplication rule as given relates to
independent events. They may not be independent. For example, spells of bad
weather are likely to prevent patients attending. The fact that there were no cancella-
tions on day 1 might indicate a spell of good weather and a higher probability of no
cancellations on the following day.
7.7 The correct answer is D. The number of ways of choosing three objects from eight
is 8C3:
8C3 = 8!/(3! × 5!) = (8 × 7 × 6)/(3 × 2 × 1) = 8 × 7 = 56
7.8 The correct answer is B. Knowledge about standard distributions (e.g. for the
normal ±2 standard deviations covers 95 per cent of the distribution) is available
rather than it having to be calculated direct from data.
A is not correct since some small amount of data has to be collected to check that
the standard distribution is applicable and to calculate parameters.
C is not necessarily correct. Standard distributions are approximations to actual
situations and may not lead to greater accuracy.
7.9 The correct answer is B. The population is split into two types: watchers and non-
watchers. A random sample of 100 is taken from this population. The number of
watchers per sample therefore has a binomial distribution.
7.10 The correct answer is True. Since the programme is described as being popular, the
proportion of people viewing (p) is likely to be sufficiently high (perhaps about 0.3)
so that np and n(1 − p) are both greater than 5. The normal approximation to the
binomial can therefore be applied.
7.11 The correct answer is D. The population can be split into two types: those that have
heard and those that have not. A random sample of five is taken from this popula-
tion. The underlying distribution is therefore binomial with p = 0.4 and n = 5. The
binomial formula is:
P(r of type 1) = nCr × p^r × (1 − p)^(n−r)
Thus:
P(1 person has heard of the chocolate bar) = 5C1 × 0.4 × 0.6^4
= 2 × 0.1296
= 0.26 (approx.)
Or, Table A1.1 could have been used.
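Alternatively, the probability can be computed directly rather than taken from tables. The following minimal Python sketch (an illustration, not part of the course text) uses n = 5 and p = 0.4 from the question.

from math import comb

def binomial_pmf(r, n, p):
    # P(r of type 1) = nCr * p^r * (1 - p)^(n - r)
    return comb(n, r) * p**r * (1 - p)**(n - r)

print(round(binomial_pmf(1, 5, 0.4), 4))   # 0.2592, i.e. 0.26 approximately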
7.12 The correct answer is A. The average per clerk per day is 190. There are 12 clerks.
The total per day is therefore 190 × 12 = 2280.
7.13 The correct answer is C. Since the distribution is normal, 68 per cent of clerks will
clear a number of dockets in the range:
Mean ±1 standard deviation
= 190 ±25
= 165 to 215
32 per cent of clerks will clear a number of dockets outside this range. Since a
normal distribution is symmetrical, half of these (16 per cent) will clear fewer than
165. Likewise, 16 per cent will clear more than 215.
16% of 12 = 1.92
Approximately two clerks will clear more than 215 dockets per day.
7.14 The correct answer is A. There is a 95 per cent probability that the number of
dockets cleared by any clerk on any day will lie within the range covered by 95 per
cent of the distribution. This range is, for a normal distribution, the mean ±2
standard deviations.
The range is 190 ± 2 × 25 = 140 to 240.
[Figure: normal curve with Area A and Area B marked; the 8% tail area corresponds to z = 1.405]
Number of defectives 0 1 2 3 4 5 6
Observed no. of samples 52 34 10 4 0 0 0
Theoretical no. of samples 53 35 10 1 0 0 0
There is a close correspondence. The results are consistent with a process defec-
tive rate of 10 per cent. Note that because of rounding the theoretical numbers
of samples do not add up to 100.
(b) A first reservation concerning this conclusion is whether the samples were taken
at random. If the samples were only taken at particular times, say at the start of a
shift, it might be that starting-up problems meant that the defective rate at this
time was high. The results would then suggest the overall rate was higher than it
actually is.
Second, if the samples that contain defectives were mostly towards the end of
the time period during which the samples were collected, this might indicate that
the process used to have a defective rate less than 10 per cent but had deteriorat-
ed.
Third, the fact that there are more samples with three defectives than expected,
and fewer with zero and one defective, suggests greater variability in the process
than expected. This might occur because p is not constant at 10 per cent but
varies throughout the shift. The antidote to this problem is either to split the
shift into distinct time periods and take samples from each or to use a more
sophisticated distribution called beta-binomial, which allows for variability in p
and which will be described in a later module.
P(0 users) = 0
P(1 user) = 0.0001
P(2 users) = 0.0008
P(3 users) = 0.0040
P(4 users) = 0.0139 ← Total so far 0.0188
P(5 users) = 0.0365
P(6 users) = 0.0746
P(7 users) = 0.1221
P(8 users) = 0.1623
P(9 users) = 0.1771
P(10 users) = 0.1593
P(11 users) = 0.1185
P(12 users) = 0.0727
P(13 users) = 0.0366
P(14 users) = 0.0150 ← Total from here to end 0.0214
P(15 users) = 0.0049
P(16 users) = 0.0013
P(17 users) = 0.0002
P(18+ users) = 0
Therefore, there is a 96 per cent probability that a sample will contain from five to
13 regular users of the breakfast cereal. If many samples are taken, it is likely that 96
per cent of them will contain five to 13 users. Because consumers are counted in
whole numbers, there is no range of users equivalent to the 95 per cent requested in
the question.
The question has been answered, but it has been a lengthy process. Since np (= 9)
and n(1 − p) (= 11) are both greater than five, the binomial can be approximated by
the normal. The parameters are:
Arithmetic mean = np = 9
Standard deviation = √(np(1 − p))
= √4.95
= 2.22
For a normal distribution, 95 per cent of it lies between ±2 standard deviations.
Therefore, 95 per cent of samples are likely to be between:
9 − (2 × 2.22) and 9 + (2 × 2.22)
4.56 and 13.44
Recall that, because a discrete distribution is being approximated by a continuous
one, a whole number with the binomial is equivalent to a range with the normal. For
example, five users with the binomial corresponds to the range 4.5 to 5.5 with the
normal. In the above calculations, the range 4.56 to 13.44 covers almost (but not
quite) the range five to 13 users.
Approximately, therefore, 95 per cent of the samples are likely to have between five
and 13 users inclusive. This is the same result as obtained by the lengthier binomial
procedure.
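For those who wish to check the approximation by computer, a minimal Python sketch is given below. It assumes n = 20 and p = 0.45, the values implied by np = 9 and n(1 − p) = 11; Python itself is an illustrative addition, not part of the course text.

from math import comb, sqrt

n, p = 20, 0.45                       # implied by np = 9 and n(1 - p) = 11
mean = n * p                          # 9
sd = sqrt(n * p * (1 - p))            # sqrt(4.95) = 2.22

print(round(mean - 2 * sd, 2), round(mean + 2 * sd, 2))   # 4.55 to 13.45

# Exact binomial probability of 5 to 13 users inclusive, for comparison.
exact = sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(5, 14))
print(round(exact, 3))                # about 0.96, matching the table above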
Module 8
Review Questions
8.1 The correct answer is C. Statistical inference uses sample information to make
statements about populations. The statements are in the form of estimates or
hypotheses.
8.2 The correct answer is C. Inference is based on sample information. Even though a
sample is random, it may not be representative, and therefore there is some chance
that the inference may be incorrect. The other statements are true but they are not
the reasons for using confidence levels.
8.3 The correct answer is A. The mean of the sample = (7 + 4 + 9 + 2 + 8 + 6 + 8 + 1
+ 9)/9 = 6
The variance = [(7 − 6)² + (4 − 6)² + (9 − 6)² + (2 − 6)² + (8 − 6)² + (6 − 6)²
+ (8 − 6)² + (1 − 6)² + (9 − 6)²]/(9 − 1) = 9
The standard deviation is:
√9 = 3
The standard deviation of the distribution of means of sample size 9 is:
3/√9 = 1
8.4 The correct answer is B. The point estimate of the population mean is simply the
sample mean.
8.5 The correct answer is B. The point estimate of the mean is 6. The 90 per cent
confidence limits are (from normal curve tables) 1.645 standard errors either side of
the point estimate. The limits are 6 ± 1.645 × 1 = 4.4 to 7.6 (approximately).
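The arithmetic in Questions 8.3 to 8.5 can be checked with a few lines of Python (offered here only as an illustrative sketch, not part of the course text):

from statistics import mean, stdev

sample = [7, 4, 9, 2, 8, 6, 8, 1, 9]

m = mean(sample)                      # 6
s = stdev(sample)                     # 3 (divisor n - 1)
se = s / len(sample) ** 0.5           # 1, the standard error of the mean

# 90 per cent confidence limits: 1.645 standard errors either side of the mean.
print(m - 1.645 * se, m + 1.645 * se)   # approximately 4.355 and 7.645, i.e. 4.4 to 7.6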
8.6 The correct answer is A. The 95 per cent confidence limits cover a range of 2
standard errors on either side of the mean. A standard error is 150/√n where n is
the sample size. Thus
20 = 2 × 150/√n
1 = 15/√n
√n = 15
n = 225
8.7 The correct answer is False. Sample evidence does not prove a hypothesis. Because
it is from a sample, it merely shows whether the evidence is statistically significant or
not.
8.8 The correct answer is A. The tester decides on the significance level. He or she may
choose whatever value is thought suitable but 5 per cent has come to be accepted as
the convention. The other statements are true but only after 5 per cent has been
chosen as the significance level.
8.9 The correct answer is True. Critical values are an alternative approach to
significance tests and can be used in both one- and two-tailed tests.
8.10 The correct answer is A. The standard error of the sampling distribution is 6 (=
48/√64). There is no suggestion that any deviation from the hypothesised mean of 0
could be in one direction only. Therefore the test is two-tailed. At the 5 per cent
level the critical values are 2 standard errors either side of the mean (i.e. at −12 and
12). Since the observed sample mean is 9.87, at the 5 per cent level the hypothesis is
accepted. At the 10 per cent level the critical values are 1.645 standard errors from
the mean (i.e. at −9.87 and 9.87). At the 10 per cent level the test is inconclusive.
8.11 The correct answer is C. Since the test is one-tailed at the 5 per cent level, the
critical value is 1.645 standard errors away from the null hypothesis mean. The
critical value is therefore 9.87. For the alternative hypothesis the z value of 9.87 is
−1.69 (= (9.87−20)/6). The corresponding area in normal curve tables is 0.4545.
Since the null hypothesis will be accepted (and the alternative rejected even if true)
when the observed sample mean is less than 9.87, the probability of a type 2 error is
0.0455 (= 0.5 − 0.4545; i.e. 4.55 per cent).
8.12 The correct answer is D. The power of the test is the probability of accepting the
alternative hypothesis when it is true. The power is therefore:
1.0 − P(type 2 error) = 95.45 per cent.
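The type 2 error and power figures in Questions 8.11 and 8.12 can be verified with the normal distribution functions in Python's standard library. The sketch below (an illustration only) assumes the same set-up as the questions: standard error 6, null hypothesis mean 0, alternative hypothesis mean 20 and a one-tailed 5 per cent test.

from statistics import NormalDist

se = 6                                # 48 / sqrt(64)
critical = 1.645 * se                 # one-tailed 5% critical value = 9.87

# Type 2 error: the null hypothesis is accepted (sample mean below the
# critical value) when the alternative hypothesis (mean 20) is in fact true.
p_type2 = NormalDist(mu=20, sigma=se).cdf(critical)

print(round(critical, 2))             # 9.87
print(round(p_type2 * 100, 2))        # about 4.6 per cent (4.55 using tables)
print(round((1 - p_type2) * 100, 2))  # power, about 95.4 per cent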
8.13 The correct answer is C. The test is to determine whether the plea has met its
objective by bringing about an increase of £2500 per month (i.e. whether the
average increase in turnover per branch is £2500 per month). This is equal to £7500
over a three-month period.
8.14 The correct answer is B. The samples being compared are the turnovers before and
after the plea. They are paired in that the same 100 branches are involved. Each
turnover in the first period is paired with the turnover of the same branch in the
second period.
The test is one-tailed. In the circumstances that the plea was well exceeded the
hypothesis would not be rejected. One would only say that the plea had not
succeeded if the observed increase was significantly less than £7500, but not if
significantly more. Therefore only one tail should be considered. The test should
thus be one-tailed, based on paired samples.
8.15 The correct answer is False. The procedure described relates to an unpaired sample
test, not a paired test. A paired test requires a new sample formed from the differ-
ences in each pair of observations.
The mean of the one sample (£186) must lie within 12 (= 2 × 6) of the true
population mean at the 95 per cent confidence level. Consequently, the true
population mean must be in the range 186 ± 12 = £174 to £198.
result must therefore be seen as the probability of a result as far from the mean
as z = 1.46 in either direction (a two-tailed test).
Probability of sample result = 2 × 7.21% (0.0721 = 0.5 − 0.4279)
= 14.42%
(e) This result is larger than the significance level of 5 per cent and the hypothesis
must be accepted. There is insufficient evidence to suggest that the new course
makes a significant difference to the test scores at the 5 per cent level.
rather than to be the basis for contract action. Second, the assumptions underlying
the test must be met. In this case this means that the sample should have been
selected in a truly random fashion. If not, the whole basis of the test is undermined.
Third, a tensile strength of slightly less than 12 kg might be adequate for the cloth
concerned, and it might not be economic to go to the expense of ensuring the
contract is kept to the letter. On the other hand, a tensile strength of, say, 11 kg or
less might have been more serious because the quality of the cloth was noticeably
reduced. This might suggest the adoption of a test that had 11 kg as the alternative
hypothesis.
Salesperson   Difference (x)   (x − x̄)   (x − x̄)²
6 13 9 81
7 −3 −7 49
8 −6 −10 100
9 −6 −10 100
10 25 21 441
11 17 13 169
12 21 17 289
13 21 17 289
14 −14 −18 324
15 −7 −11 121
16 19 15 225
17 −7 −11 121
18 −34 −38 1444
19 −7 −11 121
20 13 9 81
21 13 9 81
22 9 5 25
23 −11 −15 225
24 11 7 49
25 18 14 196
26 −19 −23 529
27 8 4 16
28 −7 −11 121
29 9 5 25
30 18 14 196
Total   120   0   5750
Mean = 120/30 = 4
Standard deviation = √(5750/29) = 14.08
Conduct the significance test in five stages:
(i) The hypothesis is that the new sample comes from a population of mean 0.
(ii) The evidence is the new sample of mean 4 and standard deviation 14.08.
(iii) The significance level is 5 per cent.
(iv) The standard error of the sampling distribution of the mean is:
14.08/√30 = 2.57
The z value of the observed sample mean:
= (4 − 0)/2.57 = 1.56
(v) From the normal curve table in Appendix 1 (Table A1.2), the correspond-
ing area is 0.4406. The probability of such a z value is therefore 0.0594 or
5.94 per cent.
This percentage is slightly higher than the significance level. The hypothesis is
accepted (but only just). The new scheme does not give rise to a significant
increase in output.
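Working only from the summary figures above (mean difference 4, standard deviation 14.08, n = 30), the five-stage test can be reproduced in a short Python sketch (an illustration, not part of the course text):

from statistics import NormalDist
from math import sqrt

n, mean_diff, sd = 30, 4, 14.08       # summary of the 30 differences
se = sd / sqrt(n)                     # 2.57
z = (mean_diff - 0) / se              # 1.56

# One-tailed probability of a sample mean at least this large under the
# hypothesis of no change.
p_value = 1 - NormalDist().cdf(z)
print(round(se, 2), round(z, 2), round(p_value, 3))
# 2.57  1.56  about 0.06 - just above the 5 per cent significance level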
(c) The six possible reservations listed in the text suggest:
(i) The sample should be collected at random. This means a true random-
based sampling method should have been used to select the salespeople
covering all grades, areas of the country, etc. It also means that the months
used should not be likely to show variations other than those stemming
from the new scheme. For instance, allowance should be made for seasonal
variations in sales.
(ii) Checks should be made that the structure of the test is right. For instance,
does a simple measure like the total sum assured reflect the profitability of
the company? Profitability may have more to do with the mix of types of
policy than total sum assured.
(iii) The potential cost/profit to the company of taking the right decision in
regard to the incentive scheme suggests that more effort could be put into
the significance test. In particular, a larger sample could be taken.
(iv) The balance between the two types of error should be right. It is more im-
portant to know whether the scheme is profitable than to know whether it
gives a significant increase. The test should have sufficient power to distin-
guish between null and alternative hypotheses. This is discussed below.
(d) If the alternative hypothesis is a mean increase of £5000:
(i) P(type 1 error) = significance level = 5%.
(ii) The critical value of the one-tailed test is 1.645 standard errors from the
mean. The critical value is 4.23 (= 1.645 × 2.57). For the alternative hy-
pothesis, the z value of 4.23 is 0.30 (= (4.23 − 5)/2.57) (see Figure A4.8).
From the normal curve table in Appendix 1 (Table A1.2), the correspond-
ing area is 0.1179. The null hypothesis is accepted (and the alternative
hypothesis is rejected) if the observed sample mean is less than the critical
value, 4.23. A type 2 error is the acceptance of the null hypothesis when the
alternative hypothesis truly applies. Therefore:
P(type 2 error) = 38.21%
[Figure A4.8: sampling distributions under the null hypothesis (mean 0) and the alternative hypothesis (mean 5000), showing the critical value at 4230 and the 5% significance area]
Module 9
Review Questions
9.1 The correct answer is B. A ‘natural’ measurement such as height is a typical example
of a normal distribution. Many small genetic factors presumably cause the varia-
tions. This is highly typical of the sort of situation on which the normal is defined.
9.2 The correct answer is B, C. The binomial formula, with its factorials and powers, is
more difficult to use than the Poisson. Binomial tables extend to many more pages
than the Poisson because the former has virtually one table for each sample size.
A is not a correct reason. If the situation is truly binomial but the Poisson is used,
some accuracy will be lost but the loss will be small if the rule of thumb applies.
9.3 The correct answer is B. The situation looks to be Poisson. Assume this to be the
case. The parameter is equal to the average number of accidents per month: 36/12
= 3. From the Poisson probability table (see Appendix 1, Table A1.3):
(0 accidents) = 0.0498
(1 accident) = 0.1494
(2 accidents) = 0.2240
(3 accidents) = 0.2240
(4 accidents) = 0.1680
(5 accidents) = 0.1008
Therefore:
(up to and including 5 accidents)
= 0.0498 + 0.1494 + 0.2240 + 0.2240 + 0.1680 + 0.1008
= 0.9160
Therefore:
(more than 5 accidents) = 1 − 0.916
= 0.084
= 8% (approximately)
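The Poisson probabilities above are taken from Table A1.3, but they can equally be computed directly. A minimal Python sketch (illustrative only) with mean 3:

from math import exp, factorial

mean = 3.0   # average number of accidents per month

def poisson_pmf(r, mean):
    # P(r events) = e^(-mean) * mean^r / r!
    return exp(-mean) * mean**r / factorial(r)

p_up_to_5 = sum(poisson_pmf(r, mean) for r in range(6))
print(round(p_up_to_5, 4))        # about 0.916
print(round(1 - p_up_to_5, 3))    # about 0.084, i.e. 8 per cent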
9.4 The correct answer is B. The variance is estimated in essentially the same way as the
standard deviation, which is merely the square root of the variance. The same
reasoning that leads to the standard deviation having n − 1 degrees of freedom leads
to the variance having n − 1 = 24 degrees of freedom.
9.5 The correct answer is False. In addition to the conditions made in the statement, the
distribution from which the sample is taken must also be normal if the sampling
distribution of the mean is to be a t-distribution.
9.6 The correct answer is D.
Standard error = Individual standard deviation/√(Sample size)
= 95/√18
= 22.4
Mean number of incidents per aircraft
= Σ(Number of aircraft × Number of incidents)/100
= 160/100
= 1.6
On average, therefore, an aircraft was involved in 1.6 incidents over the 400
days. From the column headed 1.6 in the table, the theoretical frequencies of
incidents can be found. For example, the probability of 0 incidents is 0.2019.
Out of 100 aircraft, one would thus expect 20 to be involved in 0 incidents. The
comparison between theoretical and observed frequencies can then be made for each
number of incidents, giving:
A proportion of 0.0236 (or 2.36 per cent) of the 800 would therefore be ex-
pected to be involved in five or more incidents over a 400-day period (i.e. 19
aircraft).
(c) Reservations about the conclusion are principally to do with whether the
incidents are random. It may be that for this aircraft certain routes/flights/ pi-
lots/times of the year are more prone to accident. If so, the average incident rate
differs from one part of the population to another and a uniform value (1.6)
covering the whole population should not be used. In this case it may be neces-
sary to treat each section of the population differently or to move to a more
sophisticated distribution, possibly the negative binomial.
Another problem may be that the sample is not representative of the population.
Not all routes may be included; not all airlines may be included; there may be a
learning effect, with perhaps fewer errors later in the 400-day period than earlier;
or perhaps the pilots are doubly careful when they first fly a new aircraft. The
data should be investigated to search for gaps and biases such as these. If the
insurance is to cover all aircraft/routes/airlines then the sample data should be
representative of this population.
Lastly, the data are all about incidents; the insurance companies become involved
financially only when an accident takes place. The former may not be a good
surrogate for the latter. If possible, past records should be used to establish a
relationship between the two and to test just how good a basis the analysis of
incidents is for deciding on accident insurance.
(d) If a check of the data reveals missing routes or airlines then the gaps should be
filled if possible. The data should be split into subsections and the analysis re-
peated to find if there has been a learning effect or if there are different patterns
in different parts of the data. There could be differences on account of seasonali-
ty, routes, airlines, type of flight (charter or scheduled). If differences are
observed then the insurance premium would be weighted accordingly.
Data from the introduction of other makes of aircraft could serve to indicate
learning effects and also the future pattern of incidents.
The large amounts of money at stake in a situation like this would make the extra
statistical work suggested here worthwhile.
Car        1   2   3   4   5   6   7   8   9  10  11  12  13  14
x (mpg)   21  24  22  24  29  18  21  26  25  19  22  20  28  23    x̄ = 23
x − x̄     −2   1  −1   1   6  −5  −2   3   2  −4  −1  −3   5   0
(x − x̄)²   4   1   1   1  36  25   4   9   4  16   1   9  25   0

Variance = ∑(x − x̄)²/(14 − 1)
= 136/13
= 10.46
The five stages of a significance test can now be followed:
(i) The hypothesis is that the tyres have made no difference and that the petrol
consumptions come from a population with a mean of 22.4.
(ii) The evidence is the sample of 14 cars’ petrol consumptions with a mean of
23 and a standard deviation of:
√10.46 = 3.23
(iii) The significance level is 5 per cent.
(iv) The sample size is less than 30, the standard deviation has been estimated
from the sample and the underlying individual distribution is normal. All
the conditions attached to the t-distribution are present. The observed t
value is:
= (23 − 22.4)/(3.23/√14)
= 0.60/0.86
= 0.70
(v) The test is one-tailed, assuming the tyres could bring about only an im-
provement, not a deterioration, in petrol consumption; the degrees of
freedom are 13. The t value corresponding to the 5 per cent level is thus
taken from the row for 13 degrees of freedom and the column for t0.05. The
value is 1.771. The observed t value is less than this and therefore the hy-
pothesis is accepted. The tyres do not make a significant difference.
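For those who prefer to check the calculation by computer, the same one-sample t statistic can be reproduced with a short Python sketch (an illustration only; the 14 consumption figures are those tabulated above):

from statistics import mean, stdev
from math import sqrt

mpg = [21, 24, 22, 24, 29, 18, 21, 26, 25, 19, 22, 20, 28, 23]
mu0 = 22.4                            # hypothesised population mean

m, s, n = mean(mpg), stdev(mpg), len(mpg)
t = (m - mu0) / (s / sqrt(n))

print(round(m, 1), round(s, 2), round(t, 2))
# 23.0  3.23  0.69 (0.70 with the rounding above), below the critical
# t value of 1.771 for 13 degrees of freedom, so the hypothesis stands.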
(b) On the other hand, the alternative hypothesis is that the tyres result in an
improvement in petrol consumption of 1.5 mpg. Under this hypothesis the sam-
ple would have come from a distribution of mean 23.9 (= 22.4 + 1.5). The
observed t value is:
= (23 − 23.9)/(3.23/√14)
= −0.9/0.86
= −1.05
Ignoring the negative sign, this observed t value (under the alternative hypothe-
sis) is lower than the critical t value of 1.771, just as was the previous observed t
value (under the null hypothesis). The sample evidence would therefore be insuf-
ficient to reject either the null or the alternative hypothesis. Clearly the sample
size is too small to discriminate properly between the two hypotheses.
If the probabilities of type 1 and type 2 errors are to be equal then the critical t
value, t0.05, should be equidistant from both hypotheses (i.e. halfway between
them at 23.15). A sample size larger than 14 is evidently required to do this. As-
suming that the sample size needed is greater than 30 (and therefore the t-
distribution can be approximated to the normal):
t0.05 = (23.15 − 22.4)/(3.23/√n)
1.645 = 0.75/(3.23/√n)
√n = (1.645 × 3.23)/0.75
n = 50 (approximately)
A sample size of 50 is needed if the test is to discriminate equally between the no
change hypothesis and the hypothesis based on the manufacturer’s claim. To
achieve such a sample size would presumably involve using cars of a wider age
span than six to nine months.
(c) Many factors affect a car’s petrol consumption. A well-designed significance test
should exclude or minimise the effect of all factors except the one of interest, in
this case, the tyres. Major influences on consumption are the type of car and its
age. By comparing like with like in respect of these factors, their effect is elimi-
nated. Other factors cannot be controlled in this way. Very little can be done
about the type of usage, total mileage, the style of the drivers and the quality of
maintenance. It is hoped that these factors will balance out over the sample of
cars and the time period.
(d) The principal argument in favour of the officer’s suggestion is that it may
eliminate the effect of different maintenance methods from the test so that ob-
served differences are accounted for by the tyres, not the maintenance methods.
The arguments against his proposal are stronger. First, the maintenance methods
are unlikely to be identical in the two forces. The procedures laid down may be
the same but the interpretation of them by different sets of mechanics of differ-
ent levels of skill will almost certainly mean that there are still differences.
Second, his proposals create some new difficulties not present in the original
significance test. Some factors affecting petrol consumption that were eliminated
by the first test are now reintroduced. The geography of the territories served by
the forces will differ; the drivers of the cars will be different; the roles of the cars
may be different. All these factors will cause different fuel consumptions in the
two samples of cars, which may well disguise or overwhelm the influence of the
tyres.
While the officer’s test could certainly be carried out, the new variables his test
introduces would put a question mark over any conclusions drawn. On the
whole, the officer’s suggestion should be rejected but without blunting his en-
thusiasm for using analytical methods to help in decision taking.
Module 10
Review Questions
10.1 The correct answer is A, C. Analysis of variance tests the hypothesis that the
samples come from populations with equal means or that they come from a
common population. In the former case, B is an assumption. In both cases, D is an
assumption. B and D are therefore not hypotheses but assumptions underlying the
testing of the hypotheses by analysis of variance.
10.2 The correct answer is A.
SSE = Total SS − SST = 316 − 96 = 220
10.3 The correct answer is D.
MST = SST/Degrees of freedom = 96/(5 − 1) = 24
10.4 The correct answer is D.
MSE = SSE/Degrees of freedom
= 220/[(12 − 1) × 5]
= 4
Observed F = MST/MSE = 24/4 = 6
10.5 The correct answer is True. The hypothesis tested by the analysis of variance is that
the treatments come from populations with equal means (i.e. the treatments have no
effect). But since the observed F value exceeds the critical value, the hypothesis
must be rejected. The treatments do have a significant effect.
10.6 The correct answer is A, D. It is hypothesised that B is an attribute of the
populations from which the samples are taken; C is an assumed attribute of the
populations. Since the samples are selected at random, it would be virtually impossi-
ble for the samples to have these attributes.
10.7 The correct answer is B. The grand mean is 4. Total SS is calculated by finding the
deviation of each observation from the grand mean, then squaring and summing.
Taking each row in turn:
Total SS = (3 − 4)² + (7 − 4)² + (4 − 4)² + (2 − 4)² +…+ (3 − 4)² + (1 − 4)² + (4 − 4)²
= 1 + 9 + 0 + 4 + 4 + 4 + 4 + 4 + 16 + 9 + 1 + 4 + 0 + 4 + 1 + 9 + 16 + 1 + 9 + 0
= 100
10.8 The correct answer is A.
SST = r × [(5 − 4)² + (5 − 4)² + (3 − 4)² + (3 − 4)²]
where r = number of rows (observations or blocks).
SST = 5 × (1 + 1 + 1 + 1)
SST = 20
The first column of Table A4.7 describes the sources of error. The second relates to
degrees of freedom, always given by c − 1 and (r − 1)c for a one-way analysis of
variance.
The third column requires the calculation of the sums of squares. SST deals with the
‘between’ sums of squares and is concerned with the group means and their
deviations from the grand mean.
SST = 10 × [(98.3 − 101.8)² + (102.9 − 101.8)² + (103.6 − 101.8)²
+ (98.5 − 101.8)² + (101.3 − 101.8)² + (106.4 − 101.8)²]
= 10 × (12.25 + 1.21 + 3.24 + 10.89 + 0.25 + 21.16)
= 490.0
SSE deals with the 'within' sums of squares and is concerned with the individual
observations and their deviations from their group means.
SSE = (97 − 98.3)² + (102 − 98.3)² + (95 − 98.3)² + (99 − 98.3)²               Brand 1
+ (98 − 98.3)² + (100 − 98.3)² + (95 − 98.3)² + (96 − 98.3)²
+ (103 − 98.3)² + (98 − 98.3)²
+ (106 − 102.9)² + (109 − 102.9)² + (100 − 102.9)² + (104 − 102.9)²            Brand 2
+ (103 − 102.9)² + (101 − 102.9)² + (99 − 102.9)² + (103 − 102.9)²
+ (105 − 102.9)² + (99 − 102.9)²
+ (112 − 103.6)² + (98 − 103.6)² + (101 − 103.6)² + (101 − 103.6)²             Brand 3
+ (105 − 103.6)² + (103 − 103.6)² + (102 − 103.6)² + (105 − 103.6)²
+ (108 − 103.6)² + (101 − 103.6)²
+ (105 − 98.5)² + (93 − 98.5)² + (99 − 98.5)² + (96 − 98.5)²                   Brand 4
+ (99 − 98.5)² + (101 − 98.5)² + (97 − 98.5)² + (99 − 98.5)²
+ (101 − 98.5)² + (95 − 98.5)²
+ (101 − 101.3)² + (102 − 101.3)² + (97 − 101.3)² + (100 − 101.3)²             Brand 5
+ (105 − 101.3)² + (99 − 101.3)² + (101 − 101.3)² + (103 − 101.3)²
+ (105 − 101.3)² + (100 − 101.3)²
+ (108 − 106.4)² + (113 − 106.4)² + (104 − 106.4)² + (108 − 106.4)²            Brand 6
+ (103 − 106.4)² + (104 − 106.4)² + (108 − 106.4)² + (104 − 106.4)²
+ (110 − 106.4)² + (102 − 106.4)²
= 68.1 + 94.9 + 148.4 + 106.5 + 58.1 + 112.4
= 588.4
Total SS is of course the total sums of squares. It is concerned with all observations
and their deviations from the grand mean. Going through the observations, each
row in turn:
Total SS = (97 − 101.8)² + (106 − 101.8)² +…+ (100 − 101.8)² + (102 − 101.8)²
= 1078.4
It was not strictly necessary to calculate all three sums of squares since:
Total SS = SST + SSE
Calculating all three provided a check. Table A4.7 shows that the equality is satis-
fied: 1078.4 = 490.0 + 588.4.
Next, the mean squares are calculated by dividing the sums of squares by the
associated degrees of freedom (column 4 in Table A4.7). The ratio of the mean
squares is the observed value of the F variable and is calculated in the final column.
To finish the test, the critical F value for (5, 54) degrees of freedom at the 5 per cent
level is found from the table of the F-distribution. In this case, the value is 2.38. The
observed F value, 8.99, greatly exceeds 2.38. The hypothesis is rejected at the 5 per
cent significance level. There is a significant difference in the levels of irritability
caused.
At the 1 per cent level the hypothesis is also rejected. The critical F value for (5, 54)
degrees of freedom is 3.37 at the 1 per cent level. The observed F value exceeds this
also. The evidence that the powders do not cause the same level of skin irritation is
strong.
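If a statistics package is available, the whole calculation can be reproduced from the raw irritation indices read off from the SSE working above. The sketch below uses Python with the scipy library; both are illustrative additions, not part of the course text.

from scipy.stats import f_oneway

brand1 = [97, 102, 95, 99, 98, 100, 95, 96, 103, 98]
brand2 = [106, 109, 100, 104, 103, 101, 99, 103, 105, 99]
brand3 = [112, 98, 101, 101, 105, 103, 102, 105, 108, 101]
brand4 = [105, 93, 99, 96, 99, 101, 97, 99, 101, 95]
brand5 = [101, 102, 97, 100, 105, 99, 101, 103, 105, 100]
brand6 = [108, 113, 104, 108, 103, 104, 108, 104, 110, 102]

f_value, p_value = f_oneway(brand1, brand2, brand3, brand4, brand5, brand6)
print(round(f_value, 2), p_value)
# F of about 8.99, matching the hand calculation, and a p-value far below
# both 0.05 and 0.01, so the hypothesis of equal means is rejected.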
Qualifications
The reservations that should attach to the results of the test are to do with both
statistics and common sense.
(a) An F test assumes that the populations from which the samples are drawn are
normally distributed. In this case, it must be assumed that the distribution of
observations for each brand is normal. This may not be true, especially since the
sample size (10) is too small for the central limit theorem to have any real effect.
(b) An F test also assumes that the populations from which the samples are drawn
have equal variances. Again, this may not be true although statistical research has
indicated that variances would have to be very different before the results of the
test were distorted.
(c) Since skin irritation is very much a subjective problem and one that is hard to
quantify, there must also be doubts about the validity of the data (i.e. does the
index measure accurately what it is supposed to measure?). The tester should
look carefully at the ways in which the index has been validated by the research-
ers.
(d) The data must also come into question for more fundamental reasons. The
design of the experiment gives rise to the following doubts:
(i) How were the households chosen? Are they a representative group?
(ii) Do the households do their washing any differently because they are being
monitored by the tester?
(iii) How representative are the batches of washing?
(iv) How ‘standard’ are the batches of washing?
(v) Are there factors that make some people more prone to skin irritation and
that should therefore be built into the test?
(vi) Are the data independent (e.g. is there any cumulative effect in the testing)?
Does any brand suffer a higher index because of the effect of brands tested
earlier?
After calculating the means, the next step is to construct a two-way analysis of
variance (ANOVA) table as shown in Table A4.9.
SSB = 8 × [(66 − 73)² + (82 − 73)² + (78 − 73)² + (74 − 73)² + (65 − 73)²]
= 8 × (49 + 81 + 25 + 1 + 64)
= 1760
The error sum of squares (SSE) is calculated by first determining the total sum of
squares (Total SS).
Total SS = (71 − 73)² + (73 − 73)² + (66 − 73)² + (69 − 73)²
+ (58 − 73)² + (60 − 73)² + (70 − 73)² + (61 − 73)²          Monday 616
+ (71 − 73)² + (78 − 73)² + …                                Tuesday 928
+ (73 − 73)² + (78 − 73)² + …                                Wednesday 326
+ (73 − 73)² + (75 − 73)² + …                                Thursday 62
+ (62 − 73)² + (66 − 73)² + …                                Friday 900
= 2832
(d) Referring back to Table A4.9, the effect of days of the week on responses can
also be tested. This time the observed F value is the ratio MSB/MSE, equal to
22.3. The critical F value at the 5 per cent level for (4, 28) degrees of freedom is
2.71. The observed F far exceeds this amount. Days of the week have a highly
significant effect on the responses.
(e) Should the analysis of variance be taken further by looking into the possibility of
an interaction effect? The usefulness of such an extension to the study depends
on how much it is thought days of the week and locations have independent
effects on responses. If it were thought that the ‘Monday’ and ‘Friday’ effects
were more marked in some parts of the country than others then an interaction
variable would permit the inclusion of this influence in the analysis of variance.
Intuitively it does not seem likely that people feel particularly worse about Mon-
days (better about Fridays) in some cities than in others. In any case, since the
effect of location has already been demonstrated to have a significant bearing on
responses, the inclusion of a significant interaction term could only make the
effect more marked (by decreasing the SSE while SST remains the same). Over-
all it does not seem worthwhile to extend the analysis to include an interaction
term.
Module 11
Review Questions
11.1 The correct answer is C, D. A is untrue because regression is specifying the
relationship between variables; correlation is measuring the strength of the relation-
ship. B is untrue because regression and correlation cannot be applied to unpaired
sets of data. C is true, by definition, and D is true, because if the data were plotted in
a scatter diagram, they would lie approximately along a straight line with a negative
slope.
11.2 The correct answer is B. A is untrue because residuals are measured vertically, not at
right angles to the line. B is true, by definition. C is untrue because actual points
below the line have negative residuals, and D is untrue because residuals are all zero
only when the points all lie exactly on the line (i.e. when there is perfect correlation).
11.3 The correct answer is B.
 x    y    x − x̄   (x − x̄)²   y − ȳ   (y − ȳ)²   (x − x̄)(y − ȳ)
 4    2     −4        16        −3        9             12
 6    4     −2         4        −1        1              2
 9    4      1         1        −1        1             −1
10    7      2         4         2        4              4
11    8      3         9         3        9              9
∑   40   25            34                 24             26
x̄ = 8   ȳ = 5
∑(x − x̄)(y − ȳ) = 26
∑(x − x̄)² = 34
∑(y − ȳ)² = 24
Slope coefficient = ∑(x − x̄)(y − ȳ)/∑(x − x̄)²
= 26/34
= 0.765
11.4 The correct answer is C.
Intercept = ȳ − Slope × x̄ = 5 − 0.765 × 8
= −1.12
11.5 The correlation coefficient is:
r = ∑(x − x̄)(y − ȳ)/√(∑(x − x̄)² × ∑(y − ȳ)²)
= 26/√(34 × 24)
= 26/28.6
= 0.91
11.6 The correct answer is A.
Residual = Actual − Fitted
= 2 − (4 × 0.765 − 1.12)
= 0.06
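The slope, intercept, correlation coefficient and residual in Questions 11.3 to 11.6 can be confirmed with a brief Python sketch (illustrative only, not part of the course text):

x = [4, 6, 9, 10, 11]
y = [2, 4, 4, 7, 8]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n                              # 8 and 5

sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))     # 26
sxx = sum((xi - x_bar) ** 2 for xi in x)                           # 34
syy = sum((yi - y_bar) ** 2 for yi in y)                           # 24

slope = sxy / sxx                     # 0.765
intercept = y_bar - slope * x_bar     # -1.12
r = sxy / (sxx * syy) ** 0.5          # 0.91
residual = y[0] - (intercept + slope * x[0])   # point (4, 2): about 0.06

print(round(slope, 3), round(intercept, 2), round(r, 2), round(residual, 2))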
11.7 The correct answer is A. The evidence of everyday life is that husbands and wives
tend to be of about the same age, with only a few exceptions. One would therefore
expect strong positive correlation between the variables.
11.8 The correct answer is A. If data are truly represented by a straight line, the residuals
should exhibit no pattern. They should be random. Randomness implies that each
residual should not be linked with the previous (i.e. there should be no serial
correlation). Randomness also implies that the residuals should have constant
variance across the range of x values (i.e. heteroscedasticity should not be present).
11.9 The correct answer is False. The strong correlation indicates association, not
causality. In any case, it is more likely that if causal effects are present, they work in
the opposite direction (i.e. a longer life means a patient has more time in which to
visit his doctor).
11.10 The correct answer is C. The prediction of sales volume for advertising expenditure
of 5 is:
Sales = 14.7 + 6.3 × Advertising expenditure
= 14.7 + 31.5
= 46.2
11.11 The correct answer is B. Unexplained variation = Sum of squared residuals = 900
R-squared = Explained variation/Total variation
= (Total − Unexplained)/Total
0.70 = (Total − 900)/Total
0.7 × Total = Total − 900
0.3 × Total = 900
Total = 3000
11.12 The correct answer is A. The difference between a regression of y on x and one of x
on y is that y and x are interchanged in the regression and correlation formulae.
Since the correlation coefficient formula is unchanged if x and y are swapped round,
the correlation coefficients are the same in both cases. Since the slope and intercept
formulae are changed if x and y are swapped round, then these two quantities are
different in the two cases (unless by a fluke).
[Scatter diagram: y (0 to 20) plotted against x (0 to 8); the points lie roughly along an upward-sloping straight line]
x    y    x − x̄   (x − x̄)²   y − ȳ   (y − ȳ)²   (x − x̄)(y − ȳ)
3   11     −1         1        −3         9              3
1    7     −3         9        −7        49             21
3   12     −1         1        −2         4              2
4   17      0         0         3         9              0
6   19      2         4         5        25             10
7   18      3         9         4        16             12
∑  24   84             24                112             48
x̄ = 4   ȳ = 14

r = ∑(x − x̄)(y − ȳ)/√(∑(x − x̄)² × ∑(y − ȳ)²)
= 48/√(24 × 112)
= 48/51.8
= 0.93
The correlation coefficient is high, confirming the visual evidence of the scatter
diagram that the relationship is linear.
(b) Line (i)
The line goes through the points (1,7) and (6,19). Therefore, the line has slope =
(19 − 7)/(6 − 1) = 12/5 = 2.4 (i.e. the line is y = a + 2.4x). Since the line goes
through the point (1,7):
7 = a + 2.4
a = 4.6
The line is y = 4.6 + 2.4x
Line (ii)
The line goes through the points (1,7) and (7,18). Therefore, the line has slope =
(18 − 7)/(7 − 1) = 11/6 = 1.8 (i.e. the line is y = a + 1.8x). Since the line goes
through (1,7):
7 = a + 1.8
a = 5.2
The line is y = 5.2 + 1.8x
Line (iii)
The regression line is found from the regression formulae:
Slope = ∑(x − x̄)(y − ȳ)/∑(x − x̄)²
= 48/24
= 2
Intercept = ȳ − Slope × x̄
= 14 − 2 × 4
= 6
The line is y = 6 + 2x
The residuals are calculated as actual minus fitted y values. For example, for line
(i) and the point (3,11), the residual is:
11 − (4.6 + 2.4 × 3) = −0.8
The MADs are calculated as the average of the absolute values of the residuals.
For example, for line (i):
MAD = (0.8 + 0 + 0.2 + 2.8 + 0 + 3.4)/6 = 1.2
The variances are calculated as the average of the squared residuals (but with a
divisor of 5, not 6, as in the formula for the variance). For example, for line (i):
Variance = (0.64 + 0 + 0.04 + 7.84 + 0 + 11.56)/5 = 20.08/5
= 4.016
The mean absolute deviation shows that line (i), connecting the extreme y values,
has the smallest residual scatter. On the MAD criterion, line (i) is the best.
The variance shows that line (iii), the regression line, has the smallest residual
scatter. On the variance criterion (equivalent to least squares), line (iii) is the best.
This has to be the case since the regression line is the line that minimises the
sum of squared residuals.
Clearly different, but equally plausible criteria (minimising the MAD and mini-
mising the variance of the residuals) give different ‘best fit’ lines. Even when one
keeps to one criterion the margin between the ‘best’ line and the others is small
(in terms of the criterion). Yet the three lines (i), (ii) and (iii) differ markedly
from one another and would give distinctly different results if used to forecast.
The conclusion is that, while regression analysis is a very useful concept, it
should be used with caution. A regression line is best only in a particular way
and, even then, only by a small margin.
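The comparison of the three lines under the two criteria can also be reproduced directly from the six data points. A brief Python sketch (illustrative only, not part of the course text):

x = [3, 1, 3, 4, 6, 7]
y = [11, 7, 12, 17, 19, 18]

lines = {"(i)": (4.6, 2.4),     # through the extreme y values
         "(ii)": (5.2, 1.8),    # through the extreme x values
         "(iii)": (6.0, 2.0)}   # the least squares regression line

for name, (a, b) in lines.items():
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    mad = sum(abs(e) for e in residuals) / len(residuals)
    variance = sum(e ** 2 for e in residuals) / (len(residuals) - 1)
    print(name, round(mad, 2), round(variance, 2))

# Line (i) has the smallest MAD (1.2); line (iii) has the smallest variance (3.2).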
A visual inspection of the residuals does not suggest any particular pattern. First,
there is no tendency for the positives and negatives to be grouped together (e.g.
for the positive residuals to refer to the smaller stores and the negatives to the
larger, or vice versa). In other words, there is no obvious evidence of serial cor-
relation. Second, there is no tendency for the residuals to be of different sizes at
different parts of the range (e.g. for the residuals to be, in general, larger for
larger stores and smaller for smaller stores). In short, there is no evidence of
heteroscedasticity.
Visually, the residuals appear to be random. Taken with the high correlation
coefficient, this indicates that there is a linear relationship between sales and
family disposable income.
(c) The scatter of the residuals about the regression line is measured through the
residual standard error. If the residuals are normally distributed, 95 per cent of
them will lie within 2 standard errors of the line. For a point forecast (given by
the line) it may be anticipated, if the future is like the past, that the actual value
will also lie within 2 standard errors of the point forecast.
If residual error were the only source of error, 95 per cent confidence limits for
the forecast could be defined as, in the example given above:
£74 668 ± 2 × 4720
i.e. £65 228 to £84 108
However, there are other sources of error (see Module 12) and therefore the
above confidence interval must be regarded as the best accuracy that could be
achieved.
(d) The linear relationship between sales and family disposable income appears to
pass the statistical tests. Further, since it must be a reasonable supposition that
sales are affected to some degree by the economic wealth of the catchment area,
the model has common sense on its side.
On the other hand, there are many influences on a store’s sales besides family
income. These are not included in the forecasting method. Ideally, a method that
can include other variables would be preferable.
A second reservation is concerned with the quality of the data. While store sales
are probably fairly easy to measure and readily available, this is unlikely to be the
case with the disposable family income. If these data are not available, an expen-
sive survey would be required to make estimates. Even then, the data are not
likely to carry a high degree of accuracy.
Last, the catchment area will be difficult to define in many if not all cases, adding
further to the inaccuracy of the data.
Module 12
Review Questions
12.1 The correct answer is B, C. B and C give synonyms for a right-hand-side variable.
Another synonym is an independent variable, the opposite of A. D is incorrect,
there being no such thing as a residual variable.
12.2 The correct answer is False. The statement on simple regression is correct, but the
statement on multiple regression should be altered to ‘one y variable is related to
several x variables’.
12.3 The correct answer is A. The coefficients have standard errors because they are
calculated from a set of observations that is deemed to be a sample and therefore
the coefficients are estimates. Possible variations in the coefficients are calculated
via their standard errors, which are in turn estimated from variation in the residuals.
B is incorrect since, although there may be data errors, this is not what the standard
errors measure. The standard errors are used to calculate t values that are used in
multiple regression, but this is not why they arise. Therefore, C and D are incorrect.
12.4 The correct answer is C. The t values are found by dividing the coefficient estimate
by the standard error. Thus:
Variable 1: 5.0/1.0 = 5.0
Variable 2: 0.3/0.2 = 1.5
Variable 3: 22/4 = 5.5
12.5 The correct answer is D. The degrees of freedom = No. observations − No. x
variables − 1. Thus, number of observations = 32 + 3 + 1 = 36.
12.6 The correct answer is C. The elimination of variables is based on a significance test
for each variable. The t value for each variable is compared with the critical t value
for the relevant degrees of freedom. In this case, the number of observations
exceeds 30; therefore, the normal distribution applies and the critical t value is 1.96.
12.7 The correct answer is True. The formula for R-bar-squared has been adjusted to
take degrees of freedom into account. Since each x variable reduces the degrees of
freedom by 1, the number of x variables included is allowed for.
12.8 The correct answer is A. Sums of squares (regression) have as many degrees of
freedom as there are right-hand-side variables (i.e. 3).
12.9 The correct answer is C.
Mean square (regression) = Sums of squares (regression)/Degrees of freedom
= 120/3
= 40
12.10 The correct answer is D. The degrees of freedom for sums of squares (residuals) =
n − k − 1 = 34
Mean square (residual) = Sums of squares (residual)/Degrees of freedom
= 170/34
= 5.0
12.11 The correct answer is C. The critical F ratio is for (3,34) degrees of freedom and for
the 5 per cent level. From F tables this is found to be 2.88. Since observed exceeds
critical, there is a significant linear relationship.
12.12 The correct answer is D. The independent variables are the right-hand-side
variables. Only x (and x²) appear on the right-hand side, but in curvilinear regression
squared terms are treated as additional variables. Therefore, the independent
variables are x and x².
12.13 The correct answer is False. A transformation is used not to approximate a curved
relationship to a linear one but to put the relationship in a different form so that the
technique of linear regression can be applied to it.
12.14 The correct answer is B. To carry out a regression analysis on the exponential
function y = aebx the equation is first transformed by taking logarithms (to the base
e) of either side to obtain eventually:
log y = log a + bx
This is a linear equation between log y and x. Hence, a linear regression can be
carried out on the variables log y and x.
12.15 The correct answer is E. In its linear form the equation is:
log y = log a + bx
The coefficient of x is thus b and the constant is log a. Therefore:
b = 8 and a = antilog 4
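To show the transformation mechanically, the Python sketch below fits an exponential function by regressing log y on x. The data are invented (generated from y = 2e^(0.5x)) purely to illustrate the procedure; they are not data from the course.

from math import log, exp

x = [1, 2, 3, 4, 5]
y = [2 * exp(0.5 * xi) for xi in x]   # invented data with a = 2, b = 0.5

# Transform: log y = log a + bx, then fit a straight line by least squares.
ly = [log(yi) for yi in y]
n = len(x)
x_bar, ly_bar = sum(x) / n, sum(ly) / n
b = (sum((xi - x_bar) * (li - ly_bar) for xi, li in zip(x, ly))
     / sum((xi - x_bar) ** 2 for xi in x))
a = exp(ly_bar - b * x_bar)           # antilog of the intercept

print(round(b, 3), round(a, 3))       # recovers b = 0.5 and a = 2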
(c) The correlation coefficient is high at 0.92. But it would be expected to be high
where there are so few observations and two right-hand-side variables. The F test
would be a more precise check. An ANOVA table should be drawn up as in
Table A4.13.
The observed F value is 37.3. The critical F value for (2,5) degrees of freedom at
the 5 per cent significance level is, from the F table, 5.79. Since observed F ex-
ceeds the critical F, it can be concluded that there is a significant linear relation-
ship.
(d) Two additional pieces of information would be useful. The correlation matrix
would help to check on the possible collinearity of the two x variables. Calcula-
tions of SE(Pred) would help to determine whether the predictions produced by
the model were of sufficient accuracy to use in decision making.
(e) The model could be used to forecast revenue, provided that conditions do not
change. In particular, this means that the ways in which decisions are taken re-
main the same.
A seeming paradox of the model is the negative coefficient for television adver-
tising expenditure. Does this mean that television advertising causes sales to be
reduced? The answer is almost certainly no. The reason is this: television adver-
tising is only used when sales are disappointing. Consequently, high television
advertising expenditure is always associated with low revenue (but not as low as
it might have been). The causality works in an unexpected way: from sales to
advertising and not the other way around.
Provided decisions about when to use the two types of advertising conform to
the past, the model could be used for predictions. If, however, it was decided to
experiment in advertising policy and make expenditures in different circum-
stances to those that have applied in the past, the model could not be expected
to predict gross revenue.
(f) The prime improvement would be extra data. Eight observations is an unsatis-
factory base for a regression analysis. This is not a statistical point. It is simply
that common sense suggests that too little information will be contained in those
eight observations for the decision at hand, whatever it is.
[Scatter diagram: Unit costs (£/tonne), from about 40 to 90, plotted against Capacity (tonnes/week), from 100 to 400]
unit costs to 1/Capacity alone). They do not appear to be random in the first
model as described in Case Study 12.2.
(iv) The third model is the only one of the three that is a multiple regression
model. Both of its right-hand-side variables have been included correctly.
Their t values are 23.8 (for 1/Capacity) and 7.3 (for Age), well in excess of
the critical t value at the 5 per cent level.
(v) Collinearity is largely absent from the third model. Using the formula given
in Module 11, the correlation coefficient between 1/Capacity and Age is
0.44. Squaring this to obtain R-squared, the answer is 0.19. This is a low
value (and an F test shows the relationship is not significant).
(vi) The third model has the lowest SE(Pred), at 1.7, compared with 4.0 for the
second model. It is more than twice as accurate as the next best model.
(b)
(i) Although the value of age in making a prediction for a 2018 plant is zero,
age nevertheless has had an effect on the prediction. Age was allowed for in
constructing the regression model. All coefficients were affected by the
presence of age in the regression. One could say that the regression has
separated the effect of capacity from that of age. The (pure) effect of capac-
ity can now be used in predicting for a modern plant.
(ii) The 95 per cent forecast intervals must be based on the t-distribution since
the number of observations is less than 30. For 9 degrees of freedom
(12−2−1) t0.025 is 2.26. The intervals are:
270 capacity plant: 44.9 ± 2.26 × 1.7 = 44.9 ± 3.8
350 capacity plant: 39.9 ± 2.26 × 1.7 = 39.9 ± 3.8
(iii) SE(Pred) takes into account a number of sources of error. One of these is
in the measurement of the variable coefficients. Any prediction involves
multiplying these coefficients by the values of the right-hand-side variables
on which the predictions are based. Therefore, the amount of the error will
vary as these prediction values vary. SE(Pred) will thus be different for dif-
ferent predictions.
(iv) R-squared measures variation explained; SE(Pred) deals with unexplained
variation plus other errors. Although, therefore, the two are linked, the rela-
tionship is not a simple or exact one. An increase in R-squared from 0.93 to
0.99 in one way appears a small increase. From another point of view it re-
flects a great reduction in the unexplained variation, which is reflected in
the substantial improvement in prediction accuracy.
Module 13
Review Questions
13.1 The correct answer is False. The techniques are classified into qualitative, causal
modelling and time series; the applications are classified into short-, medium- and
long-term.
13.2 The correct answer is B, D. A is false because time series methods make predictions
from the historical record of the forecast variable only, and do not involve other
variables. B is true because a short time horizon does not give time for conditions to
change and disrupt the structure of the model. C is false since time series methods
work by projecting past patterns into the future and therefore are usually unable to
predict turning points. D is true because some time series methods are able to
provide cheap, automatic forecasts.
13.3 The correct answer is A, C. A is the definition of causal modelling. B is false since
there is no reason why causal modelling cannot be applied to time series as well as
cross-sectional data. C is true because causal modelling tries to identify all the
underlying causes of a variable’s movements and can therefore potentially predict
turning points. D is false since causal modelling can be used for short-term fore-
casts, but its expense often rules it out.
13.4 The correct answer is False. Causal modelling is the approach of relating one
variable to others; least squares regression analysis is a technique for defining the
relationship. There are other ways of establishing the relationship besides least
squares regression analysis.
13.5 The correct answer is True. Qualitative forecasting does not work statistically from a
long data series, as the quantitative techniques tend to. However, in forming and
collecting judgements, numerical data may be used. For example, a judgement may
be expressed in the form of, say, a future exchange rate between the US dollar and
the euro.
13.6 The correct answer is A, B, C, D. The situations A to D are the usual occasions
when qualitative forecasting is used.
13.7 The correct answer is A. A is correct, although ‘expert’ needs careful definition. It
would be better to say that the participants were people with some involvement in
or connection with the forecast. B is not true since the participants are not allowed
to communicate with one another at all. C is not true because the chairman passes
on a summary of the forecasts, not the individual forecasts. D is not true. The
chairman should bring the process to a stop as soon as there is no further move-
ment in the forecasts, even though a consensus has not been reached.
13.8 The correct answer is True. This is a definition of scenario writing. Each view of the
future is a scenario.
13.9 The correct answer is C, D. A and B are not true since the technique of the cross-
impact matrix does not apply to forecasts or probabilities of particular variables,
whether sales or not. C is true since the technique is based on the full range of
future events or developments and they therefore need to be fully listed. D is true,
being a description of what the technique does.
13.10 The correct answer is False. The essence of an analogy variable is that it should
represent the broad pattern of development expected for the forecast variable. It
does not have to be exactly the same at each point. They could, for example, differ
by a multiplicative factor of ten.
13.11 The correct answer is A. Catastrophe theory applies to ‘jumps’ in the behaviour of a
variable rather than smooth changes, however steep or unfortunate.
13.12 The correct answer is C. C gives the formula for a partial relevance number.
Objective: design a successful automobile
Criteria:
A Performance
B Passenger comfort
C Safety
D Running costs
E Capital costs
Weight the importance of each criterion relative to the others. This is done by
asking which criteria are most relevant to the basic objective of designing a suc-
cessful automobile. The weights might be assigned as follows:
Weight
A Performance 0.30
B Passenger comfort 0.20
C Safety 0.10
D Running costs 0.15
E Capital costs 0.25
Total 1.00
(c) Weight the sub-objectives at each level (the elements of the tree) according to
their importance in meeting each criterion. In this case, the result might be as in
Table A4.14.
The first column shows the assessed relevance of the three elements at level 1 to
the criterion of performance. Accommodation is weighted 10 per cent, control
65 per cent and information 25 per cent. Since the table gives the relative rele-
vance of the elements at each level to the criteria, this part of each column must
sum to 1. The process of assessing relevance weights is carried out in a similar
way for the second level of the tree.
(d) Each element has a partial relevance number (PRN) for each criterion. It is
calculated:
PRN = Criterion weight × Element weight
It is a measure of the relevance of that element with respect only to that criteri-
on. For this case the partial relevance numbers are shown in Table A4.15.
For instance, at level 2 the PRN for direction with respect to capital costs is 0.05
× 0.25 = 0.0125.
PRNs are calculated for each element at each level for each criterion.
(e) The LRN for each element is the sum of the PRNs for that element (see Ta-
ble A4.16). It is a measure of the importance of that element relative to others at
the same level in achieving the highest-level objective. For example, at level 2 the
LRN for direction is 0.0375 (= 0.0150 + 0 + 0.0100 + 0 + 0.0125). There is one
LRN for each element at each level.
(f) There is one CRN for each element. They are calculated by multiplying the LRN
of an element by the LRNs of each associated element at a higher level (see Ta-
ble A4.17). This gives each element an absolute measure of its relevance.
For example:
CRN (direction) = LRN (direction) × LRN (control)
= 0.0375 × 0.5625
= 0.021
The CRNs at the second level show the comparative importance with respect to
the overall objective of the elements at that level. Thus, speed is the most im-
portant (0.240) and baggage the least important (0.012).
Recall that by this process the bottom row of elements (specific technological
requirements) will have overall measures of their relevance in achieving the ob-
jective at the highest level of the tree. This should lead to decisions about the
importance, timing, resource allocation, etc. of the tasks ahead.
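A minimal Python sketch of the PRN/LRN/CRN arithmetic follows. The criterion weights are those listed above; the element weights for 'direction' are back-calculated from the PRNs quoted in the text, and LRN(control) = 0.5625 is taken as given, so the figures are illustrative rather than a reproduction of Table A4.14 to Table A4.17.

criterion_weights = {"performance": 0.30, "comfort": 0.20, "safety": 0.10,
                     "running costs": 0.15, "capital costs": 0.25}

# Element weights of 'direction' (level 2), back-calculated from the PRNs in the text
direction_weights = {"performance": 0.05, "comfort": 0.0, "safety": 0.10,
                     "running costs": 0.0, "capital costs": 0.05}

# PRN = criterion weight x element weight, one PRN per criterion
prns = {c: criterion_weights[c] * direction_weights[c] for c in criterion_weights}

# LRN = sum of the element's PRNs across all criteria
lrn_direction = sum(prns.values())            # 0.0375

# CRN = LRN of the element x LRN of the associated higher-level element
lrn_control = 0.5625                          # taken from Table A4.16
crn_direction = lrn_direction * lrn_control   # approx. 0.021

print(prns)
print(lrn_direction, round(crn_direction, 3))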
Module 14
Review Questions
14.1 The correct answer is C. C is the definition of time series forecasting. A is untrue
because TS methods work for stationary and non-stationary series. Decomposition
is the only method that uses regression even to a small extent. Therefore, B is
untrue. D is partly true. Some, but not all, TS methods are automatic and need no
intervention once set up.
14.2 The correct answer is A, D. TS methods analyse the patterns of the past and project
them into the future. Where conditions are not changing, the historical record is a
reliable guide to the future. TS methods are therefore good in the short term when
conditions have insufficient time to change (situation A) and in stable situations (D).
For the same reason, they are not good at predicting turning points (situation B). In
order to analyse the past accurately, a long data series is needed, thus situation C is
unlikely to be one in which TS methods are used.
14.3 The correct answer is D. A stationary series has no trend and constant variance.
Homoscedastic means ‘with constant variance’. Thus, it is only D that fully defines a
stationary series.
14.4 The correct answer is False. The part of the statement referring to MA is true; the
part referring to ES is false. ES gives unequal weights to past values, but they are
not completely determined by the forecaster. They are partly chosen by the fore-
caster in that a smoothing constant, α, is selected.
14.5 The correct answer is A. The three-point moving average for 9 is the average of the
values for periods 6, 7 and 8:
Forecast for period 9 = (value for period 6 + value for period 7 + value for period 8)/3 = 9.3
14.6 The correct answer is A.
(b) The forecast for future months is 2056 – see Table A4.19.
(c) In both cases it is assumed that the series are stationary. In other words, there is
no trend in the data and they have constant variance through time.
α = 0.2, γ = 0.3
[Table: period-by-period values of the actual series, the smoothed series (St) and the smoothed trend (bt)]
Forecast = 207.5 + (4 × 8.2) = 240.3
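As an illustration of the two-parameter (Holt) smoothing behind a forecast of this kind, a minimal Python sketch follows. The series used is hypothetical, since the body of the table is not reproduced above; only the updating equations and the m-periods-ahead forecast St + m × bt follow the method itself.

def holt_forecast(series, alpha=0.2, gamma=0.3, horizon=4):
    # Two-parameter (Holt) smoothing: level St and trend bt are updated
    # each period; the forecast for m periods ahead is St + m * bt.
    level, trend = series[0], series[1] - series[0]   # simple initialisation
    for actual in series[1:]:
        previous_level = level
        level = alpha * actual + (1 - alpha) * (level + trend)
        trend = gamma * (level - previous_level) + (1 - gamma) * trend
    return level, trend, level + horizon * trend

# Hypothetical series standing in for the data behind the table above
print(holt_forecast([180, 186, 195, 201, 208, 214]))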
Exhibit 3 shows the historical data in column 1, the time index in column 2
and the trend in column 3.
Exhibit 3
Year  Month  (1) Sales  (2) Time  (3) Trend  (4) Mov. av.  (5) Cycle  (6) Season
2006 1 8.12 1 8.89
2006 2 7.76 2 8.94
2006 3 7.97 3 9.00
2006 4 7.88 4 9.05
2006 5 8.45 5 9.11
2006 6 8.68 6 9.16 9.80 1.07 0.89
2006 7 6.77 7 9.22 9.74 1.06 0.70
2006 8 6.60 8 9.27 9.70 1.05 0.68
2006 9 8.39 9 9.33 9.76 1.05 0.86
2006 10 11.88 10 9.38 9.87 1.05 1.20
2006 11 15.58 11 9.44 10.05 1.06 1.55
2006 12 19.50 12 9.49 10.09 1.06 1.93
2007 1 7.43 13 9.55 10.25 1.07 0.72
2007 2 7.26 14 9.60 10.07 1.05 0.72
2007 3 8.67 15 9.66 10.13 1.05 0.86
2007 4 9.26 16 9.71 10.08 1.04 0.92
2007 5 10.55 17 9.77 10.05 1.03 1.05
2007 6 9.17 18 9.82 9.93 1.01 0.92
2007 7 8.66 19 9.88 9.80 0.99 0.88
2007 8 4.45 20 9.93 9.71 0.98 0.46
2007 9 9.10 21 9.99 9.68 0.97 0.94
2007 10 11.32 22 10.04 9.65 0.96 1.17
2007 11 15.23 23 10.10 9.53 0.94 1.60
2007 12 18.02 24 10.15 9.59 0.94 1.88
2008 1 5.87 25 10.21 9.39 0.92 0.62
2008 2 6.19 26 10.26 9.35 0.91 0.66
2008 3 8.34 27 10.32 9.20 0.89 0.91
2008 4 8.91 28 10.37 9.35 0.90 0.95
2008 5 9.05 29 10.43 9.41 0.90 0.96
2008 6 9.98 30 10.48 9.82 0.94 1.02
2008 7 6.26 31 10.54 9.98 0.95 0.63
2008 8 3.98 32 10.59 10.12 0.96 0.39
2008 9 7.24 33 10.65 10.12 0.95 0.72
[Chart: Cycle (vertical axis, approximately 0.88 to 1.16) plotted against Time (horizontal axis, 0 to 120)]
Exhibit 5
Month  2006  2007  2008  2009  2010  2011  2012  2013  2014  Average
Jan. 0.72 0.62 0.75 0.79 0.69 0.51 0.71 0.76 0.69
Feb. 0.72 0.66 0.74 0.79 0.76 0.56 0.57 0.63 0.68
Mar. 0.86 0.91 0.78 0.69 0.86 0.65 0.76 0.80 0.79
Apr. 0.92 0.95 0.87 0.79 0.80 0.73 0.82 0.84
May 1.05 0.96 0.87 0.84 0.91 0.58 0.88 0.87
June 0.89 0.92 1.02 0.91 0.90 0.82 0.80 0.86 0.89
July 0.70 0.88 0.63 0.81 0.75 0.63 0.82 0.82 0.75
Aug. 0.68 0.46 0.39 0.36 0.34 0.33 0.34 0.29 0.40
Sep. 0.86 0.94 0.72 0.95 0.92 0.96 1.00 1.04 0.92
Oct. 1.20 1.17 1.30 1.11 1.19 1.28 1.21 1.22 1.21
Nov. 1.55 1.60 1.55 1.82 1.75 2.08 1.80 1.73 1.73
Dec. 1.93 1.88 2.23 2.07 2.18 2.71 2.36 2.21 2.20
Total 11.98
Average 1.00
It is assumed that there is no definite cycle. The cyclical factor for January 2015
and the other months is taken to be 1.0.
Seasonality for January 2015, taken from Exhibit 5, is 0.69.
The forecast for January 2015 is calculated by multiplying the three compo-
nents together:
Forecast = 14.83 × 1.0 × 0.694
= 10.29
The forecasts for the whole of 2015 are shown in Exhibit 6.
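A one-line Python sketch of the multiplicative decomposition forecast, using the January 2015 figures quoted above (the trend value 14.83 is assumed to come from the fitted trend line evaluated at the January 2015 time index):

def decomposition_forecast(trend, cycle, season):
    # Multiplicative decomposition: forecast = trend x cycle x season
    return trend * cycle * season

# January 2015: trend 14.83, cycle 1.0, seasonal factor 0.694
print(decomposition_forecast(14.83, 1.0, 0.694))   # about 10.29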
(b) There are a number of assumptions and limitations in the use of these forecasts.
These reservations do not mean of course that the forecasts cannot be used, but
they do mean that they should only be used in full awareness of the problems
involved. The reservations are:
(i) The decomposition method does not allow the accuracy of the forecasts to
be measured.
(ii) Other forecasting methods could be used in such a situation, for example, the
Holt–Winters Method. Keith Scott should try other methods and compare
their accuracy.
(iii) Keith Scott should ensure he discusses the forecasts thoroughly with man-
agement before formalising the method by incorporating the forecasts in pro
formas and the like.
Module 15
Review Questions
15.1 The correct answer is False. Both manager and expert should view forecasting as a
system. They differ in the focus of their skills. The expert knows more about
techniques; the manager knows more about the wider issues.
15.2 The correct answer is A. The people who are to use the system should have primary
responsibility for its development. They will then have confidence in it and see it as
‘their’ system.
15.3 The correct answer is D. Analysing the decision-making process may reveal
fundamental flaws in the system or organisational structure which must be ad-
dressed before any forecasts can hope to be effective.
15.4 The correct answer is False. The conceptual modelling stage includes consideration
of possible causal variables but has wider objectives. The stage should be concerned
with all influences on the forecast variable. Time patterns and qualitative variables
also come into the reckoning.
15.5 The correct answer is D. The smoothed value calculated in period 5 from the actual
values for periods 3–5 is 16.0. This is the forecast for the next period ahead, period
6.
15.6 The correct answer is A. The one-step-ahead forecast error for period 6 is the
difference between the actual value and the one-step-ahead forecast for that period:
Error = Actual − Forecast
= 17.0 − 16.0
= 1.0
15.7 The correct answer is C. The MAD is the average of the errors, ignoring the sign:
MAD = (3.3 + 5.7 + 1.0)/3
= 3.3
15.8 The correct answer is B. The MSE is the average of the squared errors:
MSE = (3.3² + 5.7² + 1.0²)/3
= 44.4/3
= 14.8
15.9 The correct answer is E. MAD and MSE are different measures of scatter. In
comparing forecasting methods they may occasionally give different answers,
suggesting different methods as being superior.
MAD measures the average absolute error. The method for which it is lower is therefore
more accurate on average. The MSE squares the errors. This can penalise heavily a
method that leaves large residuals. Forecasting method 2 may be better on average
than method 1 (i.e. have a lower MAD), but contained within the MAD for method
2 there may be some large errors that cause the MSE of method 2 to be higher than
that of method 1.
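The point can be illustrated with a short Python sketch. The first set of errors is that of Questions 15.7 and 15.8; the second set is hypothetical, chosen so that it has the lower MAD but the higher MSE.

def mad(errors):
    # Mean absolute deviation: average of the errors, ignoring sign
    return sum(abs(e) for e in errors) / len(errors)

def mse(errors):
    # Mean square error: average of the squared errors
    return sum(e ** 2 for e in errors) / len(errors)

method_1 = [3.3, 5.7, 1.0]   # errors from Questions 15.7 and 15.8
method_2 = [0.2, 0.3, 8.5]   # hypothetical: smaller errors on average, one large error

print(mad(method_1), mse(method_1))   # 3.33..., 14.79...
print(mad(method_2), mse(method_2))   # 3.0, 24.1... -> lower MAD, higher MSE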
15.10 The correct answer is C. Other measures of closeness of fit (e.g. the correlation
coefficient) are based on the same data as were used to calculate the model parame-
ters. This method keeps the two sets of data apart. A and B are true but not the
reasons why the test is described as an independent one.
15.11 The correct answer is A, B, C. A, B and C summarise the reasons why step 7,
incorporating judgements, is an important part of the system.
15.12 The correct answer is False. A consensus on what the problems are can be just as
difficult to obtain as a consensus on the solutions.
(b) Using exponential smoothing (ES) with α = 0.1, the forecast is as in Table A4.22.
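A minimal Python sketch of simple exponential smoothing with α = 0.1 follows; the data series shown is hypothetical, standing in for the series behind Table A4.22.

def exponential_smoothing(series, alpha=0.1):
    # Simple exponential smoothing: the final smoothed value is the
    # one-step-ahead forecast for the next period.
    smoothed = series[0]                 # initialise with the first observation
    for actual in series[1:]:
        smoothed = alpha * actual + (1 - alpha) * smoothed
    return smoothed

# Hypothetical series standing in for the data behind Table A4.22
print(exponential_smoothing([2010, 2045, 2032, 2060, 2051], alpha=0.1))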
(a) Internal
(i) reputation of play
(ii) reputation of actors
(iii) presence of star names
(iv) ticket prices
(v) advertising/promotional expenditures
(vi) demand for previous, similar productions
(b) External
(i) personal disposable income
(ii) weather
(iii) time of year
(iv) day of week
(v) competition
These are the factors that it is hoped a forecasting model could take into account.
Step 8: Implement
The manager of the forecasting system should establish and gain agreement on what
the implementation problems are and how they can be solved. The problems would
depend on the individuals involved, but it is likely that they would centre on the
replacement of purely judgement-based methods by more scientific ones. The
manager should also follow the first set of forecasts through the decision process.
Step 9: Monitor
Implementation covers the period just before and after the first use of the forecasts. In
monitoring, the performance of the system is watched, though not in such detail, every
time it is used. The accuracy of the forecasted demand for each production should
be recorded and reasons for deviations explored. In the light of this information, the
usefulness of the system can be assessed and changes to it recommended. The track
records of those making judgemental adjustments to the forecasts should also be
monitored. In this situation, where judgement must play an important part, this
aspect of monitoring will take on particular significance.
The forecasting project may have to be postponed pending the resolution of these wider
issues.
A thorough analysis of a decision-making system involves systems theory. A lower-
level approach of listing the decisions directly and indirectly affected by the fore-
casts is described here. The list would be determined from an exhaustive study of
the organisational structure and the job descriptions associated with relevant parts
of it. Here is a brief description of the main decisions.
The list forms the input information for step 2, determining the forecasts required.
Step 3: Conceptualise
At this step consideration is given to the factors that influence the forecast variable.
No thought is given to data availability at this step. An ideal situation is envisaged.
An alcoholic beverage is not strictly essential to the maintenance of life. It is a
luxury product. Therefore, its consumption will be affected by economic circum-
stances. It would be strange if advertising and promotions did not result in changes
in demand. In addition, the variability of the production, advertising and promo-
tions of the different competitors must have an effect. In particular, the launch of a
new product changes the state of the market. It is not just competing beer products
that are important. Competition in the form of other products, such as lager and
possibly wines and spirits, must also have an influence.
The data record in Table 15.4 makes it clear that there is a seasonal effect. In other
words, the time of year and, perhaps, the weather are relevant factors. The occur-
rence of special events may have had a temporary influence. A change in excise duty
as a result of a national budget is an obvious example. More tentatively, national
success in sporting tournaments is rumoured to have an effect on the consumption
of alcoholic beverages.
There are also reasons for taking a purely time series approach to the forecast. First,
the seasonal effect will be handled easily by such methods. Second, the time horizon
for the forecasts is short: less than one year. Within such a period there is little time
for other variables to bring their influence to bear. To some extent, therefore, sales
volume could have a momentum of its own. Third, a time series approach will give a
forecast on the basis ‘if the future is like the past’. Such a forecast would be the
starting point for judging the effect of changing circumstances.
The data record is not long enough to determine any cyclical effect (these effects are often five or seven years in length).
To summarise, what is required is a technique that can handle trend and seasonality
but not cycles. The obvious choice is the Holt–Winters technique.
The causal modelling technique should be multiple regression analysis with two
independent variables. The first independent variable is the gross domestic product
(GDP) of the UK, as a measure of the economic climate. Other economic variables
can also be used in this role, but GDP is the most usual. The second independent
variable is the sum of advertising and promotional expenditures (ADV/PRO) of the
organisation. Scatter diagrams relating the dependent variable with each independ-
ent variable in turn can verify that it is reasonable to consider GDP and ADV/PRO
as independent variables.
Other potential independent variables will have to be ignored for reasons of the
non-availability of data.
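As a sketch of how such a two-variable regression might be fitted, the following Python fragment uses ordinary least squares via numpy. The variable names follow the answer, but all the numbers are hypothetical placeholders for the case data.

import numpy as np

# Hypothetical quarterly observations standing in for the case data
sales   = np.array([60.2, 63.5, 71.8, 80.1, 61.0, 64.9, 73.2, 82.4])
gdp     = np.array([190.0, 192.0, 194.0, 196.0, 198.0, 200.0, 202.0, 204.0])
adv_pro = np.array([0.8, 1.0, 1.4, 1.9, 0.8, 1.1, 1.5, 2.0])

# Ordinary least squares: sales = b0 + b1 * GDP + b2 * ADV/PRO
X = np.column_stack([np.ones_like(gdp), gdp, adv_pro])
coefficients, *_ = np.linalg.lstsq(X, sales, rcond=None)
print(coefficients)   # intercept, GDP coefficient, ADV/PRO coefficient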
The table shows the parameter sets in three groups. For the first group the smoothing
constant for the main series has been varied; for the second, the trend constant has
been varied while the series constant is held at its ‘best’ value; finally, the seasonality
constant has been varied while the other two are held at their ‘best’ values. The
parameter set with the lowest MAD and MSE is (0.2, 0.4, 0.5). The Holt–Winters
model with these parameters would appear to be the best. Note that the procedure
for finding these parameter values is an approximate one. There is no guarantee that
the truly optimum set has been found. To ensure that this had been done would
have required an exhaustive comparison of all possible parameter sets.
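The search procedure can be sketched as a grid search in Python. For brevity the sketch varies only the two smoothing constants of a trend-only (Holt) model on a hypothetical series, but the idea of scoring each parameter set by the MAD and MSE of its one-step-ahead errors carries over directly to the three-constant Holt–Winters case.

from itertools import product

def one_step_errors(series, alpha, gamma):
    # One-step-ahead errors for two-parameter (Holt) smoothing
    level, trend = series[0], series[1] - series[0]
    errors = []
    for actual in series[2:]:
        errors.append(actual - (level + trend))   # forecast made one period earlier
        previous_level = level
        level = alpha * actual + (1 - alpha) * (level + trend)
        trend = gamma * (level - previous_level) + (1 - gamma) * trend
    return errors

series = [52, 55, 59, 58, 63, 66, 70, 69, 74, 78]   # hypothetical data

best = None
for alpha, gamma in product([0.1, 0.2, 0.3, 0.4], repeat=2):
    errs = one_step_errors(series, alpha, gamma)
    mad = sum(abs(e) for e in errs) / len(errs)
    mse = sum(e * e for e in errs) / len(errs)
    if best is None or mse < best[2]:
        best = (alpha, gamma, mse, mad)

print("best (alpha, gamma):", best[:2], "MSE:", round(best[2], 2), "MAD:", round(best[3], 2))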
The third model, with GDP and ADV/PRO as independent variables, is slightly
better than the second, having a marginally higher R-bar-squared.
Finally, since these are regression models, they should be checked for the existence
of any of the usual reservations: lack of causality, spuriousness, etc. There may
indeed be a problem with causality. The second and third models are superior
because the ADV/PRO variable captures the seasonality, which was a problem in
the first. It is not clear whether it is the seasonality or the expenditure on advertising
and promotion that explains the changes in sales volumes. There will be no difficul-
ty if advertising and promotion expenditures continue to be determined with
seasonal variations as in the past, but if the method of allocation changes then both
models will be inadequate. A new model, dealing with advertising/promotion and
seasonality separately, will have to be tested.
Meanwhile, the model with two independent variables seems to be the best. The
results of this regression analysis are shown in more detail in Table A4.25.
Table A4.25 Output of the regression model linking sales to GDP and
ADV/PRO
Sales volume = –44.3 + 0.49 × GDP + 17.4 × ADV/PRO
R-bar-squared = 0.93
Residuals
Year   Quarter 1   Quarter 2   Quarter 3   Quarter 4
2009 −0.3 −4.2 −0.4 −0.6
2010 2.4 −1.3 1.7 −4.6
2011 2.6 4.3 −0.5 1.4
2012 −1.1 −5.3 4.7 1.7
2013 −2.9 1.8 0.7 2.5
2014 −1.9 2.8 1.3 3.0
2015 −2.6 −1.2 −0.7 −0.3
2016 1.4 −2.2 0 −0.1
2017 −1.4 −5.0 3.5 0.8
The best time series model is the Holt–Winters with smoothing constants 0.2 (for
the series), 0.4 (for the trend) and 0.5 (for the seasonality); the best regression model
relates sales volume to GDP and total expenditure on advertising and promotion.
To choose between these two, an independent test of accuracy should be used. This
means that the latter part of the data (2017) is kept apart and the data up to then
(2009–16) used as the basis for forecasting 2017. The better model is the one that
provides forecasts for 2017 that are closer to the actual sales volumes. There are two
reasons for comparing the models in this way.
First, the test is independent in the sense that the data being forecast (2017) are not
used in establishing the forecasting model. Contrast this with the use of R-bar-
squared. All of the data, 2009–17, are used to calculate the coefficients of the model;
the residuals are then calculated and R-bar-squared measures how close this model is
to the same 2009–17 data. This is not an independent measure of accuracy.
Second, the accuracy of smoothing techniques is usually measured through the
mean square error or mean absolute deviation; the accuracy of regression models is
measured by R-bar-squared. These two types of measures are not directly compara-
ble. On the other hand, the independent test of accuracy does provide a directly
comparable measure: closeness to the 2017 data.
The details of the test are as follows. The 2009–16 data are used for each of the two
models as the basis of a forecast for each of the four quarters of 2017. The close-
ness of the two sets of forecasts to the actual 2017 data is measured using the mean
square error and the mean absolute deviation. The model for which both these
measures are smaller is chosen as the better to use in practice. Should the two
measures have contradicted one another, then this would have meant that the model
with the lower MSE tended to be closer to extreme values whereas the model with
lower MAD tended to be closer on average to all values.
Table A4.26 shows the results of this test. The Holt–Winters time series is clearly
superior to the regression model. Both measures, MAD and MSE, demonstrate that
it gives the better forecasts for the already known 2017 data. The Holt–Winters
technique, with smoothing constants 0.2, 0.4 and 0.5, should be chosen to make
forecasts. The whole of the data series, including of course 2017, should be used in
doing this. Forecasts for 2018 are shown in Table A4.27.
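A short Python sketch of the independent hold-out comparison described above follows; the 2017 actuals and the two sets of forecasts are hypothetical stand-ins for the figures of Table A4.26.

def mad(errors):
    return sum(abs(e) for e in errors) / len(errors)

def mse(errors):
    return sum(e * e for e in errors) / len(errors)

# Hypothetical actual sales for the four quarters of 2017 and the
# forecasts each model produced from the 2009-16 data alone
actual_2017        = [68.0, 72.5, 79.0, 88.5]
holt_winters_fcast = [67.1, 73.0, 78.2, 87.6]
regression_fcast   = [64.5, 70.0, 82.5, 91.0]

for name, fcast in [("Holt-Winters", holt_winters_fcast),
                    ("Regression", regression_fcast)]:
    errors = [a - f for a, f in zip(actual_2017, fcast)]
    print(name, "MAD:", round(mad(errors), 2), "MSE:", round(mse(errors), 2))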
To obtain a consensus from these data, a modified version of the Delphi method
might be used. All the experts represented in the table should be approached,
presented with the views of the others and asked if they wish to adjust their opin-
ions. As a result, some of the more extreme views might be altered.
The second stage, the adjustment of the statistical forecasts, is made by people
within the organisation. They should be accountable for any changes made; it is not
of course possible to make the external experts accountable for their views within
the context of an organisation’s forecasts.
The early user involvement approach is likely to minimise the effect, since the participants will feel
more like a team than they would otherwise.
Likewise, a lack of belief in a forecasting system will probably have effects that go
far beyond any one particular project, but early user involvement will mitigate the
effects.
Conclusions
It should be emphasised that it is probably only for short-term forecasts that a time
series method will seem to be the best. For medium-term forecasts beyond a year
ahead, a causal model is likely to be better. Even for a short-term forecast, however,
uncertainty and volatility in the UK economic environment will eventually cause
problems, and adjustments will have to be made to the Holt–Winters model. For
important medium-term forecasts on which the expenditure of a lot of money is
justified, it may be worthwhile to use all three approaches to forecasting: causal,
time series and qualitative. If all give similar output, there is mutual confirmation of
the correct forecast; if they give different output, then the process of reconciling
them will be a valuable way for the managers involved to gain a better understand-
ing of the future.
This case solution has covered the important aspects of the case, but not all the
aspects. Among the omissions, for example, are statistical tests of randomness.
Furthermore, techniques other than Holt–Winters and the limited causal modelling
used here have not been described, although they too should have been considered. The emphasis
has, however, been on the role of a manager, not a statistician. The items included
are, in general, the things a manager would need to be aware of in order to be able
to engage in sensible discussions with forecasting experts.