Statistics-Problems and Solutions

The document is a comprehensive textbook on statistics, authored by J. Murdoch and J. A. Barnes, aimed at providing practical applications of statistical methods through worked examples and problems. It covers fundamental concepts such as probability theory, distributions, estimation, significance testing, and linear regression, with an emphasis on relating theory to real-world applications. The book is designed to assist students and industry professionals in understanding and applying basic statistical concepts effectively.


STATISTICS:
PROBLEMS AND SOLUTIONS
A Complete Course in Statistics

by

J. Murdoch BSc, ARTC, AMIProdE

and

J. A. Barnes BSc, ARCS

STATISTICS: PROBLEMS AND SOLUTIONS

BASIC STATISTICS: LABORATORY INSTRUCTION MANUAL
STATISTICAL TABLES FOR SCIENCE, ENGINEERING,
BUSINESS STUDIES AND MANAGEMENT
STATISTICS:
PROBLEMS AND SOLUTIONS

J. Murdoch, BSc, ARTC, AMIProdE


Head of Statistics and Operational Research Section,
Cranfield Institute of Technology

and

J. A. Barnes, BSc, ARCS


Lecturer in Statistics and Operational Research,
Cranfield Institute of Technology

MACMILLAN
© J. Murdoch and J. A. Barnes 1973

All rights reserved. No part of this publication may be reproduced or transmitted, in any form or by any means, without permission.

First published 1973

Published by
THE MACMILLAN PRESS LTD
London and Basingstoke
Associated companies in New York Toronto
Melbourne Dublin Johannesburg and Madras

SBN 333 12017 5

Printed in Great Britain by


The Whitefriars Press Ltd., London and Tonbridge
Preface

Statistics is often regarded as a boring, and therefore difficult, subject, particularly by those whose previous experience has not produced any real need to understand variation and to make appropriate allowances for it. The subject can certainly be presented in a boring way and in much advanced work can be conceptually and mathematically very difficult indeed.
However, for most people a simple but informed approach to the collection, analysis and interpretation of numerical information is of tremendous benefit in reducing some of the uncertainties involved in decision making. It is a pity that many formal courses in statistics appear to frighten people away from achieving this basic attitude, usually through failing to relate the theory to practical applications.
This book, whose chapters each contain a brief summary of the main
concepts and methods, is intended to show, through worked examples, some of
the practical applications of simple statistical methods and so to stimulate
interest. In order to establish firmly the basic concepts, a more detailed
treatment of the theory is given in chapters 1 and 2. Some examples of a more
academic nature are also given to illustrate the way of thinking about problems.
Each chapter contains problems for the reader to attempt, the solutions to
these being discussed in some detail, particularly in relation to the inferences
that can validly be drawn even in those cases where the numbers have been put
into the correct ‘textbook formula’ for the situation.
This book will not only greatly assist students to gain a better appreciation
of the basic concepts and use of the theory, but will also be of interest to
personnel in industry and commerce, enabling them to see the range of
application of basic statistical concepts.
For the application of basic statistics, it is essential that statistical tables
are used to reduce the computation to a minimum. The tables used are those
by the authors, Statistical Tables, a companion volume in this series of
publications on statistics. The third book, Basic Statistics: Laboratory Instruction Manual, designed to be used with the Cranfield Statistical Teaching Aids, is referred to here and, in addition, some experiments are suggested for
the reader to perform to help him understand the concepts involved. In the
chapters of this book references to Statistical Tables for Science, Engineering
and Management are followed by an asterisk to distinguish them from
references to tables in this book.
The problems and examples given represent work by the authors over many
years and every attempt has been made to select a representative range to
illustrate the basic concepts and application of the techniques. The authors
would like to apologise if inadvertently examples which they have used have
previously been published. It is extremely difficult in collating a problem book
such as this to avoid some cases of duplication.
It is hoped that this new book, together with its two companion books, will
form the basis of an effective approach to the teaching of statistics, and
certainly the results from its trials at Cranfield have proved very stimulating.
Cranfield
J. Murdoch
J. A. Barnes
Contents

List of symbols

1 Probability theory
1.2.1 Introduction
1.2.2 Measurement of probability
1.2.3 Experimental measurement of probability
1.2.4 Basic laws of probability
1.2.5 Conditional probability
1.2.6 Theory of groups
1.2.7 Mathematical expectation
1.2.8 Geometric probability
1.2.9 Introduction to the hypergeometric law
1.2.10 Introduction to the binomial law
1.2.11 Management decision theory
1.3 Problems
1.4 Worked solutions
1.5 Practical experiments
Appendix 1—specimen experimental results

2 Theory of distributions 32
2.2.1 Introduction 32
2.2.2 Frequency distributions 33
2.2.3 Probability distributions 35
2.2.4 Populations 35
2.2.5 Moments of distribution 37
2.2.6 Summary of terms 38
2.2.7 Types of distribution 40
2.2.8 Computation of moments 42
2.2.9 Sheppard's correction 45
2.3 Problems 45
2.4 Worked solutions 48

2.5 Practical experiments 60
2.5.1 The drinking straw experiment 60
2.5.2 The shove halfpenny experiment 61
2.5.3 The Quincunx 61

3 Hypergeometric, binomial and Poisson distributions 63
3.2.1 Hypergeometric law 63
3.2.2 Binomial law 63
3.2.3 Poisson law 63
3.2.4 Examples of the use of the distributions 65
3.2.5 Examples of the Poisson distribution 68
3.3 Problems 72
3.4 Worked solutions 73
3.5 Practical experiments 76
Appendix 1—binomial experiment with specimen results 77

4 Normal distribution 80
4.2.1 Introduction 80
4.2.2 Equation of normal curve 80
4.2.3 Standardised variate 81
4.2.4 Area under normal curve 81
4.2.5 Percentage points of the normal distribution 82
4.2.6 Ordinates of the normal curve 82
4.2.7 Fitting a normal distribution to data 82
4.2.8 Arithmetic probability paper 82
4.2.9 Worked examples 83
4.3 Problems 89
4.4 Worked solutions 92
4.5 Practical experiments 102
Appendix 1—Experiment 10 of Laboratory Manual* 102
Appendix 2—Experiment 11 of Laboratory Manual* 105

5 Relationship between the basic distributions 109
5.2 Résumé of theory 109
5.2.1 Hypergeometric, binomial and Poisson approximations 111
5.2.2 Normal approximation to Poisson 112
5.2.3 Examples of use of approximations 113
5.2.4 Examples of special interest 115
5.3 Problems 118
5.4 Worked solutions 119
Appendix 1—Experiment 8 of Laboratory Manual* 122

6 Distribution of linear functions of variables 124
6.2.1 Linear combination of variates 124
6.2.2 Sum of n variates 127
6.2.3 Distribution of sample mean 127
6.2.4 Central limit theorem 128
6.2.5 Sum of two means 129
6.3 Problems 130
6.4 Worked solutions 133
Appendix 1—Experiment 12 of Laboratory Manual* 141

7 Estimation and significance testing (I)—‘large sample’ methods 145
7.2.1 Point estimators 145
7.2.2 Confidence intervals 145
7.2.3 Hypothesis testing 146
7.2.4 Errors involved in hypothesis testing 146
7.2.5 Hypothesis (significance) testing 147
7.2.6 Sample size 147
7.2.7 Tests for means and proportions 147
7.2.8 Practical significance 149
7.2.9 Exact and approximate tests 149
7.2.10 Interpretation of significant results 150
7.2.11 Worked examples 150
7.3 Problems 162
7.4 Worked solutions 163

8 Sampling theory and significance testing (II)—‘t’, ‘F’ and χ² tests 170
8.2.1 Unbiased estimate of population variance 170
8.2.2 Degrees of freedom 171
8.2.3 The ‘u’-test with small samples 171
8.2.4 The ‘t’-test of significance 173
8.2.5 The ‘F’-test of significance 174
8.2.6 The ‘χ²’-test of significance 175
8.2.7 One- and two-tailed tests 177
8.2.8 Worked examples 177
8.3 Problems 182
8.4 Worked solutions 184
8.5 Practical experiments 192
Appendix 1—Experiment 14 of Laboratory Manual* 193

9 Linear regression theory 197
9.2.1 Basic concepts 197
9.2.2 Assumptions 198
9.2.3 Basic theory 198
9.2.4 Significance testing 199
9.2.5 Confidence limits for the regression line 200
9.2.6 Prediction limits 200
9.2.7 Correlation coefficient 201
9.2.8 Transformations 203
9.2.9 Worked example 203
9.3 Problems 208
9.4 Worked solutions 211
List of symbols

a  constant term in linear regression
b  regression coefficient
c  scaling factor used in calculating mean and variance
E  expected frequency for χ² goodness-of-fit test
e_ij  expected frequency in cell ij of contingency table
f_i  frequency of ith class
F  variance ratio
H0  null hypothesis
H1  alternative hypothesis
m  mean of Poisson distribution
m'_r  rth moment of a distribution about the origin
m_r  rth moment of a distribution about its mean
n  number of observations and/or number of trials in probability theory
O  observed frequency for χ² goodness-of-fit test
o_ij  observed frequency in ijth cell of contingency table
p  probability
P(A)  probability of an event A
nPx  number of permutations of n objects taken x at a time
P(A/B)  conditional probability of A on assumption that B has occurred
E[x]  expected value of variate, x
r  sample correlation coefficient
s  standard deviation of a sample
σ̂²  unbiased sample estimator of population variance
t  Student’s ‘t’
u†  coded variable used in calculating mean and variance of sample
u†  standardised normal variate
x  value of variate
y_i  value of dependent variable corresponding to x_i in regression
ŷ  estimated value of dependent variable using the regression line

† Little confusion should arise here on the use of the same symbol in two different ways. Their use in both these areas is too standardised for the authors to suggest a change.

Greek Symbols

μ  population mean
σ²  population variance
χ²  sum of the squares of standardised normal deviates
ν  number of degrees of freedom
α  magnitude of risk of 1st kind or significance level
β  magnitude of risk of 2nd kind; (1 − β) is the power of the test
π  proportion of a population having a given attribute
σ_x̄  standard error

Note: α and β are also used as parameters of the population regression line η = α + β(x_i − x̄) but again no confusion should arise.

Mathematical Symbols

Σ (i = 1 to n)  summation from i = 1 to n
e  exponential ‘e’, the base of natural logarithms
≈  approximately equal to
b ≃ θ  the sample statistic b is an estimate of population parameter θ
x > y  x is greater than y
x ≥ y  x is greater than or equal to y
x < y  x is less than y
x ≤ y  x is less than or equal to y
nCx  number of different combinations of size x from group of size n
x!  factorial x = x(x − 1)(x − 2) … 3 × 2 × 1
nPx  number of permutations of n objects taken x at a time

Note: The authors use the binomial-coefficient form (n over x) for nCx but, in order to avoid any confusion, both are given in the definitions.
1 Probability theory

1.1 Syllabus Covered


Definition and measurement of probability; addition and multiplication laws;
conditional probability; permutations and combinations; mathematical
expectation; geometric probability; introduction to hypergeometric and
binomial laws.

1.2 Résumé of Theory and Basic Concepts

1.2.1 Introduction
Probability or chance is a concept which enters all activities. We speak of the
chance of it raining today, the chance of winning the football pools, the chance
of getting on the bus in the mornings when the queues are of varying size, the
chance of a stock item going out of stock, etc. However, in most of these uses of
probability, it is very seldom that we attempt to measure or quantify the
statements. Most of our ideas about probability are intuitive and in fact
probability is a quantity rather like length or time and therefore not amenable
to simple definition. However, probability (like length or time) can be measured
and various laws set up to govern its use.
The following sections outline the measurement of probability and the rules
used for combining probabilities.

1.2.2 Measurement of Probability


Probability is measured on a scale ranging from 0 to 1 and can take any value
inside this range. This is illustrated in figure 1.1.
The probability p that an event (A) will occur is written

P(A) = p   where 0 ≤ p ≤ 1



1  Probability that you will die one day (absolute certainty)
1/2 or 0.5  Probability that an unbiased coin shows ‘heads’ after one toss
1/6 or 0.167  Probability that a die shows ‘six’ on one roll
0  Probability that you will live forever (absolute impossibility)

Figure 1.1. Probability scale.

It will be seen that on this continuous scale, only the two end points are
concerned with deductive logic (although even here, there are certain logical
difficulties with the particular example quoted).
On this scale absolute certainty is represented by p = 1 and an impossible
event has probability of zero. However, it is between these two extremes that
the majority of practical problems lie. For instance, what is the chance that a
machine will produce defective items? What is the probability that a machine
will find the overhead service crane available when required? What is the
probability of running out of stock of any item? Or again, in insurance, what
is the chance that a person of a given age will survive for a further year?

1.2.3 Experimental Measurement of Probability


In practice there are many problems where the only method of estimating the
probability is the following

probability of event A, P(A) = (total occurrences of the event A)/(total number of trials)

For example, what is the probability of an item’s going out of stock in a given
period?
Measurement showed that 189 items ran out in the period out of a total number of stock items of 2000, therefore the estimate of probability of a stock running out is

P(A) = 189/2000 = 0.0945
Again, if out of a random sample of 1000 men, 85 were found to be over 1.80 m tall, then

estimate of probability of a man being over 1.80 m tall = 85/1000 = 0.085.
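Readers with access to a computer can reproduce relative-frequency estimates like these directly. The sketch below (illustrative Python, not part of the original text) computes the two estimates above and also simulates die rolls to show the relative frequency settling near the theoretical 1/6.

```python
import random

def estimate_probability(occurrences, trials):
    """Relative-frequency estimate of P(A)."""
    return occurrences / trials

# The two worked estimates above.
print(estimate_probability(189, 2000))  # 0.0945
print(estimate_probability(85, 1000))   # 0.085

# Simulated die rolls: the estimate approaches 1/6 as trials grow.
random.seed(1)
rolls = [random.randint(1, 6) for _ in range(60000)]
sixes = sum(1 for r in rolls if r == 6)
print(round(estimate_probability(sixes, 60000), 3))
```

With 60 000 simulated rolls the estimate is typically within about 0.005 of 1/6, echoing the experimental measurement of probability described in this section.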

1.2.4 Basic Laws of Probability

1. Addition Law of Probability


This law states that if A and B are mutually exclusive events, then the probability
that either A or B occurs in a given trial is equal to the sum of the separate
probabilities of A and B occurring.

In symbolic terms this law can be shown as

P(A or B) = P(A) + P(B)


This law can be extended by repeated application to cover the case of more
than two mutually exclusive events.
Thus P(A or B or C or …) = P(A) + P(B) + P(C) + …
The events of this law are mutually exclusive events, which simply means
that the occurrence of one of the events excludes the possibility of the occurrence
of any of the others on the same trial.
For example, if in a football match, the probability that a team will score
0 goals is 0.50, 1 goal is 0.30, 2 goals is 0.15 and 3 or more goals is 0.05, then
the probability of the team scoring either 0 or 1 goals in the match is

P(0 or 1) = P(0) + P(1) = 0.50 + 0.30 = 0.80


Also, the probability that the team will score at least one goal is

P(at least one goal) = P(1) + P(2) + P(3 or more) = 0.30 + 0.15 + 0.05 = 0.50
Any event either occurs or does not occur on a given occasion. From the
definition of probability and the addition law, the probabilities of these two
alternatives must sum to unity. Thus the probability that an event does not
occur is equal to
1 —(probability that the event does occur)

In many examples, this relationship is very useful since it is often easier to


find the probability of the complementary event first.
For example, the probability of a team’s scoring at least one goal in a
football match can be obtained as
P(at least 1 goal) = 1—P(0 goals) = 1—0.50 = 0.50
as before.
As a further example, suppose that the probabilities of a man dying from
heart disease, cancer or tuberculosis are 0.51, 0.16 and 0.20 respectively. The
probability that a man will die from heart disease or cancer is 0.51 + 0.16 = 0.67.
The probability that he will die from some cause other than the three mentioned
is 1 − (0.51 + 0.16 + 0.20) = 0.13; i.e., 13% of men can be expected to die from
some other cause.
However, consider the following example. Suppose that of all new cars sold,
40% are blue and 30% have two doors. Then it cannot be said that the probability
of a person’s owning either a blue car or a two-door car is 0.70 (= 0.40 + 0.30)
since the events (blue cars and two-door cars) are not mutually exclusive, i.e., a
car can be both blue and have two doors. To deal with this case, a more general
version of the addition law is necessary. This may be stated as
P(A or B or both) = P(A) +P(B) — P(A and B)

The additional term, the probability of events A and B both occurring together on a trial, is obtained using the multiplication law of probabilities.
2. Multiplication Law of Probability
This law states that the probability of the combined occurrence of two events
A and B is the product of the probability of A and the conditional probability
of B on the assumption that A has occurred.
Thus P(A and B) = P(AB) = P(A) x P(B/A)
where P(B/A) is the conditional probability of event B on the assumption that
A occurs at the same time (see list of symbols, page xi).
P(AB) is also given by P(B) x P(A/B)
While this law is usually defined as above for two events, it can be extended
to any number of events.

3. Independent Events
Events are defined as independent if the probability of the occurrence of either
is not affected by the occurrence or not of the other. Thus if A and B are
independent events, then the law states that the probability of the combined
occurrence of the events A and B is the product of their individual
probabilities. That is

P(AB) = P(A) x P(B)


Many people meeting the ideas of probability for the first time find difficulty
in deciding whether to add or multiply probabilities. If the problem (or part of
it) is concerned with either event A or event B occurring, then add probabilities;
if A and B must both occur (at the same time or one after the other), then
multiply probabilities. Consider the following example with the throwing of
two dice illustrating the use of the two basic laws of probability.

Examples
1. In the throw of two dice, what is the probability of obtaining two sixes?
One of the dice must show a six and the other must also show a six. Thus the
required probability (independent events) is

1/6 × 1/6 = 1/36
2. In the throw of two dice, what is the probability of a score of 9 points?
Here we must consider the number of mutually exclusive ways in which the
score 9 can occur. These ways are listed below

Dice A:  3  4  5  6
Dice B:  6  5  4  3

The probability of any of these four possible arrangements occurring is equal to, as before, (1/6 × 1/6) = 1/36.
Thus the probability that two dice show a total score of 9 is equal to

1/36 + 1/36 + 1/36 + 1/36 = 4/36 = 1/9
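Counts of equally likely outcomes like these can be checked by brute-force enumeration of the 36 outcomes for two dice; a minimal sketch (illustrative, not from the original text):

```python
from itertools import product

# Enumerate all 36 equally likely outcomes for two dice.
outcomes = list(product(range(1, 7), repeat=2))

p_two_sixes = sum(1 for a, b in outcomes if a == b == 6) / len(outcomes)
p_score_9 = sum(1 for a, b in outcomes if a + b == 9) / len(outcomes)

print(p_two_sixes)  # 1/36 ≈ 0.028
print(p_score_9)    # 4/36 = 1/9 ≈ 0.111
```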

3. In marketing a product, records show that on average one call in 10 results


in making a sale to a potential customer. What is the probability that a salesman
will make two sales from any two given calls?
Assuming the events (sales to different customers) to be independent, use of
the multiplication law gives the probability of making two sales in two calls as
0.1 x 0.1 = 0.01.
As an extension of this example, what is the probability of making at least
one sale in five calls? The easiest way to calculate this probability is to note that
the event complementary to making one or more sales is not to make any sales.
Using the multiplication law gives the probability of making no sales in five calls
as
0.9 × 0.9 × 0.9 × 0.9 × 0.9 = 0.9⁵ = 0.5905
P(at least one sale in five calls) = 1 —0.5905 = 0.4095
These basic laws for combining probability may be used to answer such
questions as how many calls must be planned so that there is a high probability,
say 95% or 99%, of making at least one sale or of making at least two sales,
etc. Or, again, what is the probability that it will need more than, say, eight
calls to make two sales?
As an example, suppose that the probability of making at least one sale in n calls is to be at least 0.95. What is the smallest value of n which will achieve this?
Turning the problem round gives that the probability of making no sales in
n calls is to be at most 0.05 and thus
0.9ⁿ ≤ 0.05
The smallest value of n which satisfies this requirement is 29. This means that
if the salesman schedules 29 customer calls every day, he will make at least one
sale on just over 95% of days in the long run. Conversely on just under one day
in 20 he will receive no orders as a result of any of his 29 visits. The average daily
number of sales made will be 2.9.
The example of the addition law where the events (a car being blue and a
car having two doors) were not mutually exclusive can now be completed.
The probability that a randomly chosen car is either blue or has two doors
or is a two-door blue car is given by
P(blue) + P(2 doors) — P(blue and 2 doors)

= 0.4 + 0.3 − (0.4 × 0.3)
= 0.70 − 0.12 = 0.58

Figure 1.2. (Unit-square diagram; horizontal bands ‘Blue 0.4’ and ‘Not blue 0.6’.)

This result is valid on the assumption that the number of doors that a car has
is not dependent on its colour.
Figure 1.2 illustrates the situation. The areas of the four rectangles within the
square represent the proportion of all cars having the given combination of
colour and number of doors. The total area of the three shaded rectangles is
equal to 0.58, the proportion of cars that are either blue or have two doors or
are two-door blue cars.
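The general addition law and the area argument of figure 1.2 can be cross-checked against each other; a minimal sketch (illustrative Python, assuming independence of colour and number of doors as the text does):

```python
p_blue, p_two_door = 0.4, 0.3  # assumed independent, as in the text

# General addition law.
p_either = p_blue + p_two_door - p_blue * p_two_door
print(round(p_either, 2))  # 0.58

# Cross-check: sum the three shaded rectangles of figure 1.2.
cells = (p_blue * p_two_door
         + p_blue * (1 - p_two_door)
         + (1 - p_blue) * p_two_door)
print(round(cells, 2))  # 0.58
```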

1.2.5 Conditional Probability


In many problems and situations, however, the events are neither independent
nor mutually exclusive, and the general theory for conditional probability will
be outlined here.
Before considering conditional probability in algebraic terms, some simple
numerical examples will be given.

If one card is drawn at random from a full pack of 52 playing cards, the
probability that it is red is 26/52. Random selection of a card means that each
of the 52 cards is as likely as any of the others to be the sampled one.
If a second card is selected at random from the pack (without replacing the
first), the probability that it is red depends on the colour of the first card drawn.
There are only 51 cards that could be selected as the second card, all of them
having an equal chance. If the first had been black, there are 26 red cards
available and the probability that the second card is red is therefore 26/51 (i.e.,
conditional upon the first card being black).
Similarly, if the first card is red, the probability that the second is also
red is 25/51.
The process can be continued; the probability that a third card is red being
26/50, 25/50 or 24/50 depending on whether the previous two cards drawn were
both black, one of each colour (drawn in either order) or both red.

Most practical sampling problems are of this ‘sampling without replacement’


type and conditional probabilities need to be taken into account. There are,
however, suitable approximations which can often be used in practice instead
of working with the exact conditional values. (These methods are referred to
in chapter 5.)
Using the multiplication law and the conditional probabilities discussed
above, the probability that two cards, taken randomly from a full pack (52 cards),
will both be red is given by

P(first card red) × P(second card red given that first was red) = 26/52 × 25/51 = 25/102

This result applies whether the two cards are taken one after the other or
both at the same time.
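Exact fractions such as 25/102 are conveniently checked with rational arithmetic; a minimal sketch (illustrative, not from the original text):

```python
from fractions import Fraction

# P(first red) x P(second red given first red), kept exact.
p_both_red = Fraction(26, 52) * Fraction(25, 51)
print(p_both_red)  # 25/102
```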

As another example, suppose that two of the bulbs in a set of 12 coloured


lights are burnt out. What is the probability of finding

(a) both burnt-out bulbs in the first two tested?


(b) one of the burnt-out bulbs in the first two tested?
(c) at least one of the burnt-out bulbs in the first two tested?

The solutions are

(a) P(first bulb tested is burnt out) = 2/12
P(second bulb tested is also burnt out) = 1/11
P(finding both burnt-out bulbs in first two tests) = 2/12 × 1/11 = 1/66

(b) P(first bulb tested is burnt out) = 2/12
and P(second bulb tested is not burnt out) = 10/11
The product of these probabilities = 2/12 × 10/11 = 10/66

The same result can be obtained if a burnt-out bulb is found on the second
test, the first bulb being good. The two situations are mutually exclusive.
Then P(first bulb good) = 10/12
P(second bulb burnt out) = 2/11
Thus P(first bulb good and second burnt out) = 10/12 × 2/11 = 10/66

Either of these two situations satisfies (b) and the probability of exactly one of the burnt-out bulbs being in the first two tested is given by their sum

10/66 + 10/66 = 20/66 = 10/33
(c) The probability of at least one burnt-out bulb being found in two tests is equal to the sum of the answers to parts (a) and (b), namely

1/66 + 20/66 = 21/66 = 7/22

As a check on this result, the only other possibility is that neither of the faulty bulbs will be picked out for the first two tests. The probability of this, using the multiplication law with the appropriate conditional probability, is

10/12 × 9/11 = 45/66

The situation in part (c) therefore has probability of 1 − 45/66 = 21/66 = 7/22 as given by direct calculation.
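All three parts of the bulb example, and the complementary-event check, can be verified with exact fractions; a short sketch (illustrative, not from the original text):

```python
from fractions import Fraction

# Two of twelve bulbs are burnt out; bulbs are tested one at a time.
p_both = Fraction(2, 12) * Fraction(1, 11)                 # part (a)
p_exactly_one = (Fraction(2, 12) * Fraction(10, 11)
                 + Fraction(10, 12) * Fraction(2, 11))     # part (b)
p_at_least_one = p_both + p_exactly_one                    # part (c)

print(p_both, p_exactly_one, p_at_least_one)  # 1/66 10/33 7/22

# Check via the complementary event: neither bulb found in two tests.
print(1 - Fraction(10, 12) * Fraction(9, 11))  # 7/22
```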

Consider a box containing r red balls and w white balls. A random sample of
two balls is drawn. What is the probability of the sample containing two red
balls?

r = red balls
w = white balls

If the first ball is red (event A), probability of this event occurring

P(A) = r/(r + w)

The probability of the second ball being red (event B) given the first was red is thus

P(B/A) = (r − 1)/(r + w − 1)

since there are now only (r − 1) red balls in the box containing (r + w − 1) balls.
∴ Probability of the sample containing two red balls

= [r/(r + w)] × [(r − 1)/(r + w − 1)]

In similar manner, probability of the sample containing two white balls

= [w/(r + w)] × [(w − 1)/(r + w − 1)]

Also consider the probability of the sample containing one red and one white ball. This event can happen in two mutually exclusive ways: first ball red, second ball white, or first ball white, second ball red.
Thus, the probability of the sample containing one white and one red ball is

[r/(r + w)] × [w/(r + w − 1)] + [w/(r + w)] × [r/(r + w − 1)] = 2wr/((r + w)(r + w − 1))
Note: Readers might like to verify that the sum of these three probabilities = 1.
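The suggested verification can be carried out for any box contents with exact arithmetic; a minimal sketch (illustrative, not from the original text):

```python
from fractions import Fraction

def sample_probabilities(r, w):
    """Probabilities for a random sample of two balls drawn without replacement."""
    n = r + w
    two_red = Fraction(r, n) * Fraction(r - 1, n - 1)
    two_white = Fraction(w, n) * Fraction(w - 1, n - 1)
    one_each = Fraction(2 * r * w, n * (n - 1))
    return two_red, two_white, one_each

# The three probabilities sum to 1 whatever the box contents.
for r, w in [(3, 5), (10, 2), (7, 7)]:
    print(sum(sample_probabilities(r, w)))  # 1
```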

Examples

1. In a group of ten people where six are male and four are female, what is the
chance that a committee of four, formed from the group with random selection,
comprises (a) four females, or (b) three females and one male?

(a) Probability of committee with four females

= 4/10 × 3/9 × 2/8 × 1/7 = 0.0048
(b) Committee comprising three females and one male. This committee can
be formed in the following mutually exclusive ways

1st member  M    F    F    F
2nd member  F or M or F or F
3rd member  F    F    M    F
4th member  F    F    F    M

The probability of the first arrangement is

6/10 × 4/9 × 3/8 × 2/7 = 0.0286

The probability for the second arrangement is

4/10 × 6/9 × 3/8 × 2/7 = 0.0286
and similarly for the third and fourth columns, the position of the numbers in
the numerator being different in each of the four cases. The required probability
is thus 4 x 0.0286 = 0.114.
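The same answers come more directly from counting committees with combinations (anticipating section 1.2.6); a short sketch (illustrative, not from the original text):

```python
from math import comb

# All committees of 4 chosen from 10 people (6 male, 4 female).
total = comb(10, 4)

p_four_f = comb(4, 4) * comb(6, 0) / total   # (a) four females
p_three_f = comb(4, 3) * comb(6, 1) / total  # (b) three females, one male

print(round(p_four_f, 4))   # 0.0048
print(round(p_three_f, 3))  # 0.114
```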

2. From a consignment containing 100 items of which 10% are defective, a


random sample of 10 is drawn. What is the probability of (a) the sample
containing no defective items, or (b) the sample containing exactly one
defective item?

(a) Probability of no defective items in the sample only arises in one way.
∴ Probability of no defective items

P(0) = 90/100 × 89/99 × 88/98 × … × 81/91 = 0.33

(b) Exactly one defective item in the sample can arise in 10 mutually exclusive ways as shown below

D = defective item
G = good item

DGGGGGGGGG, GDGGGGGGGG, GGDGGGGGGG, …, GGGGGGGGGD

Thus the probability of one defective in 10 items sampled is

P(1) = 10 × 10/100 × 90/99 × 89/98 × … × 82/91 = 0.41
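These products of conditional probabilities are exactly the hypergeometric counts introduced in section 1.2.9; a minimal sketch checking both answers (illustrative, not from the original text):

```python
from math import comb

# Sample of 10 from 100 items, 10 of which are defective.
total = comb(100, 10)
p0 = comb(10, 0) * comb(90, 10) / total  # no defectives in the sample
p1 = comb(10, 1) * comb(90, 9) / total   # exactly one defective

print(round(p0, 2))  # 0.33
print(round(p1, 2))  # 0.41
```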

1.2.6 Theory of Groups


There are two group theories which can assist in the solution and/or computation
involved in probability theory (rather than the long methods used in examples
in section 1.2.5).

Permutations
Groups form different permutations if they differ in any or all of the following
aspects.

(1) Total number of items in the group.


(2) Number of items of any one type in the group.
(3) Sequence.
Thus

ABB, BAB are different permutations (3); AA, BAA are different permutations
(1) and (2); CAB, CAAB are different permutations (1) and (2); BAABA, BABBA
are different permutations because of (2).
Thus distinct arrangements differing in (1) and/or (2) and/or (3) form different
permutations.

Group Theory No. 1


If there are 1 objects, each distinct, then the number of permutations of
objects taken x at a time is
n!
Pipes
~ (n-x)!
An example is the number of ways of arranging two different letters out of
the word girl.

Here n = 4, x = 2

4P2 = 4!/2! = 12

The arrangements are


gi, gr, gl, ir, il, rl, lr, li, ri, lg, rg, ig
Combinations
Groups form different combinations when they differ in

(1) Total number of objects in the group.


(2) Number of objects of any one type in the group.
(Note: Sequence does not matter.)
Thus ABB, BAB, BBA, are not different combinations.

Group Theory No.2


If there are n objects, each distinct, then the number of different combinations of size x is given by

nCx = n!/(x!(n − x)!)
As an example, a committee of three is to be formed from five department
heads. How many different committees can be formed?

5C3 = 5!/(3! 2!) = (5 × 4)/(2 × 1) = 10
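Both group-theory formulae can be checked against direct enumeration; a short sketch (illustrative, not from the original text):

```python
from math import comb, perm
from itertools import combinations, permutations

# Ordered arrangements of two distinct letters from 'girl'.
print(perm(4, 2), len(list(permutations("girl", 2))))   # 12 12

# Committees of three chosen from five department heads.
print(comb(5, 3), len(list(combinations("ABCDE", 3))))  # 10 10
```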

1.2.7 Mathematical Expectation


In statistics, the term expected value refers to the average value that a variable
takes. It is often used in the context of gambling but its use is appropriate
whenever we are concerned with average values.
The expected value is thus the mean of a distribution (see chapter 2),
i.e., the average sample value which will be obtained when the sample size tends
to infinity.
Suppose player A receives an amount of money M1 if event E1 happens, an amount M2 if E2 happens, … and an amount Mn if En happens, where E1, E2, … En are mutually exclusive and exhaustive events; P1, P2, … Pn are the respective probabilities of these events. Then A’s mathematical expectation of gain is defined as

E(M) = M1P1 + M2P2 + … + MnPn

In gambling, for the game to be fair the expectation should equal the charge for
playing the game. This concept is also used in insurance plans, etc. Use of this
concept of expected value is illustrated in the following example.

Example
The probability that a man aged 55 will live for another year is 0.99. How large a premium should he pay for a £2000 life insurance policy for one year?
(Ignore insurance company charges for administration, profit, etc.)

Let s = premium to be paid

Expected return = 0 x 0.99 + £2000 x 0.01 = £20

Premium s = £20 (should equal expected return)
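The premium calculation is just a weighted sum of payouts. As a sketch (the function name expected_value is ours, not the book's):

```python
# Mathematical expectation: E(M) = M1*P1 + M2*P2 + ... + Mn*Pn
def expected_value(amounts, probs):
    # the events must be mutually exclusive and exhaustive
    assert abs(sum(probs) - 1.0) < 1e-9
    return sum(m * p for m, p in zip(amounts, probs))

# £2000 policy: the insurer pays out 2000 with probability 0.01, else nothing.
premium = expected_value([0, 2000], [0.99, 0.01])
print(premium)  # 20.0
```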



1.2.8 Geometric Probability


Many problems lend themselves to solutions only by using the concept of
geometric probability and this will be illustrated in this section.

Example—A Fairground Problem


At a fair, customers roll a coin onto a board made up to the pattern shown in
figure 1.3. If the coin finishes in a square (not touching any lines), the number
of coins the customer will win is shown in that square, but otherwise the coin
is lost. If at least half of the coin is outside the board, it is returned to the
player.

Figure 1.3

Given that the lines are 1 mm thick, the sides of the squares are 60 mm and
the diameter of the coin is 20 mm what is

(a) the chance of getting the coin in the 4 square?


(b) the chance of getting the coin in a 2 square?
(c) the expected return per trial, if returns are made in accordance with the
numbers in the squares?

Figure 1.4 (one cell of the board: a 60 mm square plus the 1 mm line, 61 mm in all)

Considering one square (figure 1.4), the total possible area (ignoring small
edge-effects of line thickness) = 61^2 = 3721 mm^2.
For the coin not to touch a line, its centre must lie at least 10 mm (one coin
radius) inside the square, i.e. within a 40 mm x 40 mm region of area 1600 mm^2.
For one square, the probability that the coin does not touch a line is therefore

1600/3721 = 0.43

Thus if the coin falls at random on the board

(a) the chance that it falls completely within the 4 square = 1/9 x 0.43 = 0.048
(b) the chance that it falls completely within a 2 square = 4/9 x 0.43 = 0.191
(c) the expected payout per trial is (4 x 0.048) + (2 x 0.191) + (1 x 0.191)
= 0.76

Since it costs one coin to play, the player will lose 0.24 of a coin per turn in the
long run.
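The single-square probability can be checked by simulation. This is a sketch under the stated assumptions (a repeating 61 mm cell made of one 60 mm square plus a 1 mm line, edge effects ignored); the constants are taken from the example:

```python
import random

random.seed(1)

CELL = 61.0   # one 60 mm square plus one 1 mm line
R = 10.0      # coin radius (20 mm diameter)
TRIALS = 200_000

inside = 0
for _ in range(TRIALS):
    # drop the coin centre uniformly over one cell of the pattern
    x = random.uniform(0.0, CELL)
    y = random.uniform(0.0, CELL)
    # the square occupies [1, 61); the coin misses the lines when its
    # centre is at least R clear of both edges of the square
    if 1.0 + R <= x <= CELL - R and 1.0 + R <= y <= CELL - R:
        inside += 1

print(round(inside / TRIALS, 2))  # close to 1600/3721 = 0.43
```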

1.2.9 Introduction to the Hypergeometric Law


The hypergeometric law gives an efficient way of solving problems where the
probabilities involved are conditional.
In general form, it can be defined as follows.

Definition of Hypergeometric Law


If a group contains N items of which M are of one type and the remainder,
N - M, are of another type, then the probability of getting exactly x of the
first type in a random sample of size n is

P(x) = (MCx x (N-M)C(n-x)) / NCn
To illustrate the use of the hypergeometric law, consider example (2),
page 9 again.
Here N = 100
M= 10 or number of defective items in the batch
N—M = 90 or number of good items in the batch

Sample size n = 10
For (a), x = 0:

P(0) = (10C0 x 90C10) / 100C10 = (90!/(10! 80!)) / (100!/(10! 90!))
     = (90! x 90!)/(100! x 80!) = 0.33

For (b), x = 1:

P(1) = (10C1 x 90C9) / 100C10 = 0.41

Both results are the same as before but are obtained more easily.
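With math.comb the hypergeometric calculation becomes a one-liner; the helper name below is illustrative:

```python
from math import comb

def hypergeometric(N, M, n, x):
    """P(exactly x items of the first type in a sample of n drawn
    without replacement from N items, M of which are of that type)."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Batch of 100 containing 10 defectives, sample of 10:
p0 = hypergeometric(100, 10, 10, 0)
p1 = hypergeometric(100, 10, 10, 1)
print(round(p0, 2), round(p1, 2))  # 0.33 0.41
```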

1.2.10 Introduction to the Binomial Law


Although this law will be dealt with more fully in chapter 3, it is useful to
introduce it here in the chapter on probability since knowledge of the law
helps in the understanding of probability.

Definition of Binomial Law


If the probability of success in an individual trial is p, and p is constant over all
trials, then the probability of x successes in n independent trials is

P(x) = nCx x p^x x (1 - p)^(n-x)
To illustrate the use of the binomial law consider the following example. A
firm has 10 lorries in service distributing its goods. Given that each lorry spends
10% of its time in the repair depot, what is the probability of (a) no lorry in the
depot for repair, and (b) more than one in for repair?

(a) Probability of success (i.e., lorry under repair), p = 0.10


Number of trials n = 10 (lorries)

Probability of no lorries being in for repair

P(0) = 10C0 x 0.10^0 x 0.90^10 = 0.3487

(cf. result obtained from first principles)

(b) The probability of more than one lorry being in for repair, P(>1), can best
be obtained by:
P(>1) = 1 - P(0) - P(1)
Probability of exactly one lorry being in for repair

P(1) = 10C1 x 0.10^1 x 0.90^9 = 0.3874

Probability of more than one lorry being in for repair

P(>1) = 1 - 0.3487 - 0.3874 = 0.2639


Thus this binomial law gives a neat and quick way of computing the
probabilities in simple cases like this.
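The lorry calculation can be sketched with math.comb (binomial tables do the same job faster by hand):

```python
from math import comb

def binomial(n, x, p):
    """P(exactly x successes in n independent trials, success probability p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

p0 = binomial(10, 0, 0.10)   # no lorry in for repair
p1 = binomial(10, 1, 0.10)   # exactly one in for repair
print(round(p0, 4))          # 0.3487
print(round(p1, 4))          # 0.3874
print(round(1 - p0 - p1, 4)) # 0.2639  (more than one in for repair)
```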

1.2.11 Management Decision Theory


What has become known as decision theory is in simple terms just the application
of probability theory and in the authors’ opinion should be considered primarily
as just this. This point of view will be illustrated in some examples below and
in the problems given later in the chapter. It must be appreciated that in decision
theory the probabilities assigned to the decisions are themselves subject to
errors and, whilst better than nothing, the analysis should not be used unless a
sensitivity analysis is also carried out. Also, when using decision theory (or
probability theory) for decisions involving capital investment, discounted cash
flow (D.C.F.) techniques are required. However, in order not to confuse readers,
since this is a text on probability, D.C.F. has not been used in the examples or
problems.
Note: Although it is the criterion used here by way of general introduction,
the use of expected values is just one measure in a decision process. In too many
books it appears to be the sole basis on which a decision is made.

Examples
1. Consider, as a simplification of the practical case, that a person wishing to
sell his car has the following alternatives: (a) to go to a dealer with complete
certainty of selling for £780, (b) to advertise in the press at a cost of £50, in
order to sell the car for £850.
Under alternative (b), he estimates that the probability of selling the car for
£850 is 0.60. If he does not sell through the advertisement for £850, he will take
it to the dealer and sell for £780. (Note that a more realistic solution would
allow for different selling prices each with their associated probability of
occurrence.) Should he try for a private sale?
If he advertises the car there is a chance of 0.6 of obtaining £850 and
therefore a chance of 0.4 of having to go to the dealer and accept £780.
The expected return on the sale
= £850 x 0.6 + £780 x 0.4 = £822

For an advertising expenditure of £50, he has only increased his expected


return by £(822—780) or £42.
On the basis of expected return therefore, he should not advertise but go
direct to the dealer and accept £780.
This method of reaching his decision is based on what would happen on
average if he had a large number of cars to sell each under the same conditions
as above. By advertising each of them, he would in the long run receive £8 per
car less than if he sold direct to the dealer without advertising. Such a long

run criterion may not be relevant to his once only decision. Compared with the
guaranteed price, by advertising, he will either lose £50 or be £20 in pocket with
probabilities of 0.4 and 0.6 respectively. He would probably make his decision
by assessment of the risk of 40% of losing money. In practice, he could probably
increase the chances of a private sale by bargaining and allowing the price to drop
as low as £830 before being out of pocket.
As a further note, the validity of the estimate (usually subjective) of a 60%
chance of selling privately at the price asked should be carefully examined as
well as the sensitivity of any solution to errors in the magnitude of the
probability estimate.

2. A firm is facing the chance of a strike occurring at one of its main plants.
Considering only two points (normally more would be used), management
assesses the following:

(a) An offer of 5% pay increase has only a 10% chance of being accepted
outright. If a strike occurs:

chance of a strike lasting 1 month, = 0.20


chance of a strike lasting 2 months = 0.50
chance of a strike lasting 3 months = 0.30
chance of a strike lasting longer than 3 months = 0.0
(b) An offer of 10% pay increase has a 90% chance of being accepted
outright. If a strike occurs:

chance of strike lasting 1 month = 0.98


chance of strike lasting 2 months = 0.02
chance of strike lasting longer than 2 months = 0.0
Given that the increase in wage bill per 1% pay increase is £10 000 per month
and that any agreement will last only 5 years and also that the estimated cost
of a strike is £1 000 000 per month, made up of lost production, lost orders,
goodwill, etc., which offer should management make?

(a) Considering expected costs for the offer of 5%. Expected loss due to strike
= 0.90[(0.20 x 1) + (0.50 x 2) + (0.30 x 3)] x £1 000 000 = £1 890 000

Increase in wage bill over 5 years

= £10 000 x 12x 5 x 5 = £3 000 000


Total (expected) cost of decision = £4 890 000

(b) For the offer of 10%, expected loss due to strike

= 0.10[(0.98 x 1) + (0.02 x 2)] x £1 000 000 = £102 000



Increase in wage bill over 5 years

= £10 000 x 12 x 5 x 10
= £6 000 000
Total (expected) cost of decision = £6 102 000
Thus, management should clearly go for the lower offer and the possible
strike with its consequences, although many other factors would be considered
in practice before a final decision was made.
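The two expected costs above can be reproduced in a few lines. This is a sketch only; the function and variable names are ours, not the book's:

```python
# Expected total cost of a pay offer = expected strike loss + increased wage bill.
def expected_cost(p_accept, strike_months, pct_rise,
                  per_pct_per_month=10_000, years=5, strike_cost=1_000_000):
    # mean strike length, weighted by the assessed probabilities
    mean_months = sum(m * p for m, p in strike_months.items())
    strike_loss = (1 - p_accept) * mean_months * strike_cost
    wage_rise = per_pct_per_month * 12 * years * pct_rise
    return strike_loss + wage_rise

offer_5 = expected_cost(0.10, {1: 0.20, 2: 0.50, 3: 0.30}, 5)
offer_10 = expected_cost(0.90, {1: 0.98, 2: 0.02}, 10)
print(round(offer_5), round(offer_10))  # 4890000 6102000
```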

1.3. Problems for Solution


1. Four playing cards are drawn from a well-shuffled pack of 52 cards.
(a) What is the probability that the cards drawn will be the four aces?
(b) What is the probability that the cards will be the four aces drawn in
order Spade, Heart, Diamond, Club?

2. Four machines—a drill, a lathe, a miller, and a grinder—operate independently


of each other. Their utilisations are: drill 50%, lathe 40%, miller 70%, grinder
80%.

(a) What is the chance of both drill and lathe not being used at any instant
of time?
(b) What is the chance of all machines being in use?
(c) What is the chance of all machines being idle?

3. A man fires shots at a target, the probability of each shot scoring a hit being
1/4 independently of the results of previous shots. What is the probability that
in three successive shots

(a) he will fail to hit the target?


(b) he will hit the target at least twice?

4. Five per cent of the components in a large batch are defective. If five are
taken at random and tested
(a) What is the probability that no defective components will appear?
(b) What is the probability that the test sample will contain one defective
component?
(c) What is the probability that the test sample will contain two defective
components?

5. A piece of equipment will only function if three components, A, B and C,


are all working. The probability of A’s failure during one year is 5%, that of
B’s failure is 15%, and that of C’s failure is 10%. What is the probability that
the equipment will fail before the end of one year?

6. A certain type of seed has a 90% germination rate. If six seeds are planted,
what is the chance that
(a) exactly five seeds will germinate?
(b) at least five seeds will germinate?

7. A bag contains 7 white, 3 red, and 5 black balls. Three are drawn at random
without replacement. Find the probabilities that (a) no ball is red, (b) exactly
one is red, (c) at least one is red, (d) all are of the same colour, (e) no two are
of the same colour.

8. If the chance of an aircraft failing to return from any single operational


flight is 5%
(a) what is the chance that it will survive 10 operational flights?
(b) if such an aircraft does survive 10 flights, what is the chance that it will
survive a further 10 flights?
(c) if five similar aircraft fly on a mission, what is the chance that exactly
two will return?

9. If the probability that any person 30 years old will be dead within a year is
0.01, find the probability that out of a group of eight such persons, (a) none,
(b) exactly one, (c) not more than one, (d) at least one will be dead within a
year.

10. A and B arrange to meet between 3 p.m. and 4 p.m., agreeing that each will
wait no longer than 5 min for the other. Assuming all arrival times between
3 o'clock and 4 o’clock to be equally likely, find the probability that they meet.

11. A manufacturer has to decide whether or not to produce and market a new
Christmas novelty toy. If he decides to manufacture he will have to purchase a
special plant and scrap it at the end of the year. If a machine costing £10 000
is bought, the fixed cost of manufacture will be £1 per unit; if he buys a
machine costing £20 000 the fixed cost will be 50p per unit. The selling
price will be £4.50 per unit.
Given the following probabilities of sales as:

Sales (units)      2000     5000     10 000

Probability        0.40     0.30     0.30

What is the decision with the best pay-off?

12. Three men arrange to meet one evening at the “Swan Inn’ in a certain town.

There are, however, three inns called ‘The Swan’ in the town. Assuming that each
man is equally likely to go to any one of these inns

(a) what is the chance that none of the men meet?


(b) what is the chance that all the men meet?

13. An assembly operator is supplied continuously with components x, y, and z


which are stored in three bins on the assembly bench. The quality levels of the
components are: (1) x, 10% defective; (2) y, 2% defective; (3) z, 5% defective.

Figure 1.5 (three bins feeding the assembly unit: bin 1 holds component x, 10%
defective; bin 2 holds component y, 2% defective; bin 3 holds component z, 5%
defective)

An assembly consists of two components of x, one component of y and two


components of z. If components are selected randomly, what proportion of
assemblies will contain

(a) no defective components?


(b) only one defective component?

14. A marketing director has just launched four new products onto the market.
A market research survey showed that the chance of any given retailer adopting
the products was
Product A 0.95 Product C 0.80
Product B 0.50 Product D 0.30

What proportion of retailers will (a) take all four new products, (b) take
A, B and C but not D?

1.4 Solutions to Problems


1. (a) Probability of the 1st card being an ace = 4/52
If the first card is an ace,
the probability of the 2nd card being an ace = 3/51
If the first two cards are aces,
the probability of the 3rd card being an ace = 2/50

If the first three cards are aces,
the probability of the 4th card being an ace = 1/49
By the multiplication law,
the probability of all four being aces = 4/52 x 3/51 x 2/50 x 1/49
= 0.000 003 7

(b) Probability of the 1st card being the Ace of Spades = 1/52
If the first card is the Ace of Spades,
the probability of the 2nd card being the Ace of Hearts = 1/51
If the first two cards are the Aces of Spades and
Hearts, the probability of the 3rd card being the Ace of Diamonds = 1/50
If the first three cards are the Aces of Spades, Hearts
and Diamonds, the probability of the 4th card being the
Ace of Clubs = 1/49
By the multiplication law, the probability of drawing
four aces in the order Spades, Hearts, Diamonds, Clubs
= 1/52 x 1/51 x 1/50 x 1/49 = 0.000 000 15
2. The utilisations can be expressed in probabilities as follows:

Probability of being used Probability of being idle


Drill 0.50 0.50
Lathe 0.40 0.60
Miller 0.70 0.30
Grinder 0.80 0.20

(a) By the multiplication law, the probability of drill and lathe being
idle = 0.50 x 0.60 = 0.30
(b) By the multiplication law, the probability of all machines
being busy = 0.50 x 0.40 x 0.70 x 0.80 = 0.112
(c) Probability of all machines being idle at any
instant = 0.5 x 0.6 x 0.3 x 0.2 = 0.018

3. (a) P(all three shots miss target) = 3/4 x 3/4 x 3/4 = 27/64 = 0.42

(b) P(hits target exactly once) = (1/4 x 3/4 x 3/4) + (3/4 x 1/4 x 3/4) + (3/4 x 3/4 x 1/4)
= 27/64 = 0.42
P(hits target at least twice) = 1 - (0.42 + 0.42) = 0.16

(This result can be checked by direct evaluation of the probabilities of


two hits and three hits.)
4. This problem is solved from first principles, although the binomial law can be
applied.
(a) Probability of selecting a good item from the large batch = 0.95

By the multiplication law, probability of selecting five good items from the
large batch = 0.95 x 0.95 x 0.95 x 0.95 x 0.95 = 0.77
(b) In a sample of five, one defective item can arise in the following five
ways:

D A A A A
A D A A A        D = defective part
A A D A A        A = acceptable part
A A A D A
A A A A D
The probability of each one of these mutually exclusive ways occurring

= 0.05 x 0.95 x 0.95 x 0.95 x 0.95 = 0.0407


The probability that a sample of five will contain one defective item

=5 x 0.0407 = 0.2035

(c) In a sample of five, two defective items can occur in the following ways:

D D A A A    A D D A A    A A D D A    A A A D D
D A D A A    A D A D A    A A D A D
D A A D A    A D A A D
D A A A D
or in 10 ways.
Probability of each separate way = 0.05^2 x 0.95^3 = 0.00214
Probability that the sample will contain two defectives
= 10 x 0.00214 = 0.0214
It will be seen that permutations increase rapidly and the use of basic laws
is limited. The binomial law is of course the quicker method of solving this
problem, particularly if binomial tables are used.

5. The equipment would fail either if A, or B, or C were to fail, or if any


combination of these three were to fail.
Thus the probability of the equipment failing for any reason= 1 — probability
that the equipment operates for the whole year.
Probability that A does not fail = 0.95
Probability that B does not fail = 0.85
Probability that C does not fail = 0.90
Probability that the equipment does not fail = 0.95 x 0.85 x 0.90 = 0.7268
Probability that the equipment will fail = 1 - 0.7268 = 0.2732


6. (a) P(5 seeds germinating) = 6 x 0.9^5 x 0.1 = 0.3543

(b) P(at least 5 seeds germinating) = P(5 or more) = P(5 or 6)
= P(5) + P(6) = 0.3543 + 0.9^6 = 0.3543 + 0.5314 = 0.8857

7. Conditional probability:
(a) Probability that no ball is red = 12/15 x 11/14 x 10/13 = 0.4835
(b) Probability that exactly 1 ball is red = 3 x (3/15 x 12/14 x 11/13) = 0.4352
(c) Probability that at least 1 is red = 1 - 0.4835 = 0.5165
(d) Probability that all are of the same colour
= P(all white) + P(all red) + P(all black)
= (7/15 x 6/14 x 5/13) + (3/15 x 2/14 x 1/13) + (5/15 x 4/14 x 3/13) = 0.1011
(e) Probability that no two are of the same colour = 6 x 7/15 x 3/14 x 5/13 = 0.231

8. (a) P(aircraft survives 1 flight) = 0.95

P(aircraft survives 10 flights) = 0.95^10 = 0.5987
(b) P(aircraft survives a further 10 flights, having already survived ten)
= 0.95^10 = 0.5987
(c) P(exactly 2 of the 5 return) = 10 x 0.95^2 x 0.05^3 = 0.0011

9. (a) Probability that any 1 will be alive = 0.99

By the multiplication law, probability that all 8 will be alive = 0.99^8 = 0.92

Probability that none will be dead = 0.92

(b) By multiplication law, probability that 7 will be alive and 1 dead


= 0.997 x 0.01. The number of ways this can happen is the number of
permutations of 8, of which 7 are of one kind and 1 another.

Number of ways = 8!/(7! x 1!) = 8

By the addition law, probability that 7 will be alive and 1 dead

= 8 x 0.99^7 x 0.01 = 0.075


(c) By the addition law, probability of none or one being dead

= 0.92 + 0.075 = 0.995


Probability of not more than one being dead = 0.995

(d) Probability of none being dead = 0.92


Probability of 1 or more being dead = 1—0.92
Probability of at least 1 being dead = 0.08

10. At the present stage this is best done geometrically, as in figure 1.6.
A and B will meet if the point representing their two arrival times is in the
shaded area.

P(meet) = 1 - P(point in unshaded area) = 1 - (11/12)^2 = 23/144 = 0.16

Figure 1.6 (the square of possible arrival times, A's arrival time along one
axis and B's along the other; the shaded area is the band within 5 min of the
diagonal)
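The geometric answer can be checked by a quick Monte Carlo run (the simulation code is ours, sampling 100 000 pairs of arrival times):

```python
import random

random.seed(42)

TRIALS = 100_000
meet = 0
for _ in range(TRIALS):
    a = random.uniform(0, 60)   # A's arrival, minutes past 3 p.m.
    b = random.uniform(0, 60)   # B's arrival
    if abs(a - b) <= 5:         # they meet if arrivals are within 5 min
        meet += 1

print(round(meet / TRIALS, 2))  # close to 23/144 = 0.16
```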

11. There are three possibilities: (a) to produce the toys on machine costing
£10 000; (b) to produce the toys on machine costing £20 000; (c) not to
produce the toys at all.
The solution is obtained by calculating the expected profits for each
possibility.

(a) Profit on sales of 2000 = £[4.50 - (1 + 10 000/2000)] per unit
= -£1.50, or a loss of £1.50 per unit

Profit on sales of 5000 = £[4.50 - (1 + 10 000/5000)] per unit
= £(4.50 - 3) = +£1.50
or a profit of £1.50 per unit

Profit on sales of 10 000 = £[4.50 - (1 + 10 000/10 000)] = +£2.50
or a profit of £2.50 per unit

Expected profit = -£1.50 x 0.40 + £1.50 x 0.30 + £2.50 x 0.30
= £(-0.60 + 0.45 + 0.75) = +£0.60 per unit

(b) As before: profit on sales of 2000 = £[4.50 - (0.50 + 20 000/2000)]
= -£6.00 (i.e., a loss of £6.00 per unit)

Similarly, profit on sales of 5000 = 0, or break even

Profit on sales of 10 000 = +£2.00 per unit

Expected profit = -£6.00 x 0.4 + £0 x 0.3 + £2.00 x 0.3
= £(-2.40 + 0.60) = -£1.80 per unit

(c) Expected profit = 0


Solution is to install machine (a)
Note: If machine (a) had given a loss, then solution would have been not to
produce at all.
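The three pay-offs can be compared in a few lines of code. This is a sketch only (names are illustrative, and D.C.F. is ignored, as in the text):

```python
# Expected profit per unit for each machine, over the given sales distribution.
SALES = {2000: 0.40, 5000: 0.30, 10_000: 0.30}
PRICE = 4.50

def expected_profit(machine_cost, unit_cost):
    # profit per unit at volume s = price - (unit cost + machine cost spread over s)
    return sum(p * (PRICE - (unit_cost + machine_cost / s))
               for s, p in SALES.items())

a = expected_profit(10_000, 1.00)   # machine costing £10 000
b = expected_profit(20_000, 0.50)   # machine costing £20 000
print(round(a, 2), round(b, 2))     # 0.6 -1.8
```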

12. (a) P(the men do not meet) = P(all go to different inns)


= P(1st goes to any) x P(2nd goes to one of the other two)

x P(3rd goes to last inn)

= 1 x 2/3 x 1/3 = 2/9
(b) P(all three men meet) = P(1st goes to any inn)

x P(2nd goes to same inn)

x P(3rd goes to same inn)

= 1 x 1/3 x 1/3 = 1/9
13. (a) There will be no defective components in the assembly if all five
components selected are acceptable ones. The chance of such an occurrence is
given by the product of the individual probabilities and is

0.90 x 0.90 x 0.98 x 0.95 x 0.95 = 0.7164

(b) If the assembly contains one defective component, any one (but only
one) of the five components could be the defective. There are thus five mutually
exclusive ways of getting the required result, each of these ways having its
probability determined by multiplying the appropriate individual probabilities
together.

                    (1)  (2)  (3)  (4)  (5)
1st x component      D    A    A    A    A
2nd x component      A    D    A    A    A     A = acceptable part
    y component      A    A    D    A    A     D = defective part
1st z component      A    A    A    D    A
2nd z component      A    A    A    A    D
The probability of there being just one defective component in the assembly
is given by

2 x (0.10 x 0.90 x 0.98 x 0.95 x 0.95) + (0.90 x 0.90 x 0.02 x 0.95 x 0.95)
+ 2 x (0.90 x 0.90 x 0.98 x 0.05 x 0.95) = 0.1592 + 0.0146 + 0.0754 = 0.2492
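The same answer can be obtained by enumerating all 2^5 defect patterns, which is a useful brute-force check on hand calculations like the one above (code and names are ours):

```python
from itertools import product

# defective rates of the five selected components: x, x, y, z, z
RATES = [0.10, 0.10, 0.02, 0.05, 0.05]

def prob_exactly(k):
    """Probability that exactly k of the five components are defective."""
    total = 0.0
    for pattern in product((0, 1), repeat=len(RATES)):  # 1 marks a defective
        if sum(pattern) == k:
            pr = 1.0
            for defective, rate in zip(pattern, RATES):
                pr *= rate if defective else 1.0 - rate
            total += pr
    return total

print(round(prob_exactly(0), 4))  # 0.7164
print(round(prob_exactly(1), 4))  # 0.2492
```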
14. Assume the products to be independent of each other. Then

(a) Probability of taking all four new products


= 0.95 x 0.50 x 0.80 x 0.30 = 0.1140
(b) Probability of taking only

A, B, and C = 0.95 x 0.50 x 0.80 x (1 —0.30) = 0.2660

1.5 Practical Laboratory Experiments and Demonstrations


Experience has shown that when students are being introduced to statistics, the
effectiveness of the course is greatly improved by augmenting it with a practical
laboratory course of experiments and demonstrations, irrespective of the
mathematical background of the students.
The three experiments described here are experiments 1, 2, and 3 from the
authors’ Laboratory Manual in Basic Statistics, which contains full details,
analysis and summary sheets.
Appendix 1 gives full details of experiment 1 together with the analysis and
summary sheets.
The following notes are for guidance on experiments.

1.5.1 Experiment 1
This experiment, in being the most comprehensive of the experiments in the
book, is unfortunately also the longest as far as data collection goes. However,
as will be seen from the points made, the results more than justify the time.
Should time be critical it is possible to miss experiment 1 and carry out
experiments 2 and 3 which are much speedier. In experiment 1 the data
collection time is relatively long since the three dice have to be thrown 100
times (this cannot be reduced without drastically affecting the results).

Appendix 1 contains full details of the analysis of eight groups’ results for
the first experiment, and the following points should be observed in summarising
the experiment:

(1) The variation between the frequency distributions of number of ones


(or number of sixes) obtained by all groups, and that the distributions
based on total data (sum of all groups) are closer to the theoretical situation.
(2) The comparison of the distributions of score of the coloured dice and
the total score of three dice show clearly that the total score distribution now
tends to a bell-shaped curve.

1.5.2 Experiment 2
This gives a speedy demonstration of Bernoulli's law. As n, the number of
trials, increases, the estimate of p, the probability, gets closer to the true
population value. For n = 1 the estimate is either p = 1 or p = 0 and, as n
increases, the estimates tend to get closer to p = 0.5. Figure 1.7 shows a
typical result.

Figure 1.7 (estimates of p plotted against n: fluctuating widely for small n and
settling towards p = 0.5 as n increases)

1.5.3 Experiment 3
Again this is a simple demonstration of probability laws and sampling errors.
Four coins are tossed 50 times and in each toss the number of heads is
recorded. See table 6 of the laboratory manual.
Note
It is advisable to use the specially designed shakers or something similar.
Otherwise the coins will roll or bias in the tossing will occur. The results
of this experiment are summarised in table 8 of the laboratory manual and the
variation in groups’ results are stressed as is the fact that the results based on all
groups’ readings are closer to the theoretical than those for one group only.

1.5.4 Summary of Experiments 1, 2, and 3


The carrying out of these experiments will have given students a feel for the
basic concepts of statistics. While in all other sciences they expect their results

to obey the theoretical law exactly, they will have been shown that in statistics
all samples vary, but an underlying pattern emerges. The larger the samples
used the closer this pattern tends to be to results predicted by theory. The
basic laws—those of addition and multiplication—and other concepts of
probability theory, have been illustrated.
Other experiments with decimal dice can be designed.†

Appendix 1—Experiment 1 and Sample Results

Probability Theory
Number of persons: 2 or 3.

Object
The experiment is designed to illustrate

(a) the basic laws of probability


(b) that the relative frequency measure of probability becomes more
reliable the greater the number of observations on which it is based, that is,
Bernoulli’s theorem.

Method
Throw three dice (2 white, 1 coloured) a hundred times. For each throw,
record in table 1
(a) the number of ones
(b) the number of sixes
(c) the score of the coloured die
(d) the total score of the three dice.
Draw up these results, together with those of other groups, into tables
(2, 3, and 4).

Analysis
1. For each set of 100 results and for the combined figures of all groups,
calculate the probabilities that, in a throw of three dice:

(a) no face shows a one


(b) two or more sixes occur
(c) a total score of more than 13 is obtained

† Details from: Technical Prototypes (Sales) Limited, 1A West Holme Street, Leicester.

Table 1 (specimen record sheet: for each of the 100 throws, the number of ones,
the number of sixes, the score of the coloured die and the total score of the
three dice are entered)

Table 2 (one group's summary: experimental and theoretical probabilities for
the distribution of the total score of the three dice, together with the number
of throws and the probability of a total score of more than 13)

2. Compare these results with those expected from theory and comment on
the agreement both for individual groups and for the combined observations.

3. Draw probability histograms both for the score of the coloured die and for
the total score of the three dice, on page 27. Do this for your own group's
readings and for the combined results of all groups.

Comment on the agreement with the theoretical distributions.


Note: The theoretical probability distribution for the total score of three
dice is shown in table 2.
Table 3 (all groups' results: for each group, the number of throws, the number
of throws in which no face shows a one and in which two or more sixes occur,
and the corresponding experimental probabilities)

Table 4 (all groups' results: frequency distribution of the score of the
coloured die, with experimental and theoretical probabilities)
2 Theory of distributions

2.1 Syllabus Covered


Summary of data; frequency and probability distributions; histograms; samples
and populations; distribution types; moments and their calculation, suggested
experiments and demonstrations.

2.2 Résumé of Basic Theory and Concepts


2.2.1 Introduction
The understanding of the concepts of distributions and their laws is fundamental
to the science of statistics. Variation occurs almost without exception in all
our activities and processes. For example nature cannot produce two of her
products alike—two ‘identical’ twins are never exactly alike; a description of
similarity is the saying ‘as alike as two peas’, yet study two peas from the same
pod and differences in size or colour or shape will become apparent. Consider
for example the heights of men. Heights between 1.70 m and 1.83 m are quite
common and heights outside this range are by no means rare.
Although it is not so obvious, man-made articles are also subject to the same
kind of variability. The manufacturer of washers realises that some washers
will have a greater thickness than others. The resistance of electrical filaments
made at the same time will not be exactly alike. The running cost of a department
in a company will not be exactly the same each week, although, off hand, there
is no reason for the difference. The tensile strength of a steel bar is not the same
at two different parts of the same bar. The ash content of coal in a truck is
different when a number of samples from the truck are tested. Differences in
the diameter of components being produced on the same lathe will be found.
The time taken to do a given job will vary from occasion to occasion.
In present-day manufacture, the aim is usually to make things as alike as
possible. Or, alternatively, the amount of variability is controlled by specification
so that variation between certain limits is permitted.
It is interesting to note that even with the greatest precision of manufacture,

variability will still exist, providing the measuring equipment is sensitive


enough to pick it up.
Thus, variation will be seen to be present in all processes, to a greater or
lesser extent, and the use of distributions and their related theorems is
necessary to analyse such situations.

2.2.2 Basic Theory of Distributions


The basic concepts of distributions can be illustrated by considering any
collection of data such as the 95 values of the output time for an open hearth
furnace given in table 2.1. The output time is the overall time from starting to
charge to finishing tapping the furnace.

[95 readings, of which the first two are 7.8 and 8.4; the smallest reading is
7.1 h and the largest 8.7 h]

Table 2.1 Furnace output time (h)

Referring to these data, it will be seen that the figures vary one from the
other; the first is 7.8 h, the next 8.4 h and so on; there is one as low as 7.1 h and
one as high as 8.7 h.
In statistics the basic logic is inductive, and the data must be looked at as a
whole and not as a collection of individual readings.
It is often surprising to the non-statistician or deterministic scientist how
often regularities appear in these statistical counts.
The process of grouping data consists of two steps usually carried out
together.

Step 1. The data are placed in order of magnitude.


Step 2. The data are then summarised into groups or class intervals.

This process is carried out as follows:

(1) The range of the data is found, i.e.


    largest reading - smallest reading = 8.7 - 7.1 = 1.6 h

(2) The range is then sub-divided into a series of steps called class intervals.

These class intervals are usually of equal size, although in certain cases unequal
class intervals are used. For usual sample sizes, the number of class intervals is
chosen to be between 8 and 20, although this should be regarded as a general
rule only. For table 2.1, class intervals of size 0.2 h were chosen, i.e.,
7.1-7.3, 7.3-7.5, . . ., 8.7-8.9
(3) More precise definition of the boundaries of the class intervals is however
required, otherwise readings which fall say at 7.3 can be placed in either of two
class intervals.
Since in practice the reading recorded as 7.3 h could have any value between
7.25 h and 7.35 h (normal technique of rounding off), the class boundaries will
now be taken as:
7.05-7.25, 7.25-7.45, .. ., 8.45-8.65, 8.65-8.85

Note: Since an extra digit is used there is no possibility of any reading’s falling
on the boundary of a class.
The summarising of the data of table 2.1 into a distribution is shown in
table 2.2. For each observation in table 2.1 a stroke is put opposite the sub-range
into which the reading falls. The strokes are made in groups of five for easy
summation.

Value of variable    Tally                             Frequency    Probability
                                                       distribution distribution

7.05-7.25   |                                    1     0.01
7.25-7.45   ||||                                 5     0.05
7.45-7.65   |||| ||||                           10     0.11
7.65-7.85   |||| |||| |||| ||||                 19     0.20
7.85-8.05   |||| |||| |||| |||| |||| ||         27     0.28
8.05-8.25   |||| |||| |||| |||| ||              22     0.23
8.25-8.45   |||| |                               6     0.06
8.45-8.65   |||                                  3     0.03
8.65-8.85   ||                                   2     0.02
                                        Total = 95     Total = 1.00

Table 2.2
The last operation is to total the strokes and enter the totals in the next to
last column in table 2.2, obtaining what is called a frequency distribution. There
is, for example, one reading in class interval 7.05-7.25, five readings in the
next, ten in the next, and so on. Such a table is called a frequency distribution
since it shows how the individuals are distributed between the groups or class
intervals. Diagrams are more easily assimilated so it is normal to plot the

Figure 2.1. Frequency histogram (output data).

frequency distribution and this frequency histogram is shown in figure 2.1.


In plotting a histogram, a rectangle is erected on each class interval, the area
of the rectangle being proportional to the class frequency.
Note: Where class intervals are all of equal length, the height of the rectangle
is also proportional to the class frequency. Other examples of frequency
distributions and histograms are given in section 2.3.
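The grouping procedure described above can be sketched in a few lines of Python (our illustration, not part of the original text); the function name and the small set of readings are invented for the example:

```python
# Sketch of the grouping procedure: readings recorded to 0.1 h are
# binned into 0.2 h classes whose boundaries sit at the half-units
# (7.05, 7.25, ...), so no reading can fall on a boundary.
def frequency_distribution(readings, lower=7.05, width=0.2, classes=9):
    freq = [0] * classes
    for x in readings:
        freq[int((x - lower) / width)] += 1   # index of the class holding x
    return freq

sample = [7.8, 8.4, 8.1, 7.1, 8.7, 7.8, 8.0]  # a small invented sample
freq = frequency_distribution(sample)
total = sum(freq)
```

Because the boundaries are offset by half the recording unit, the integer division never has to break a tie.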

2.2.3 Probability Distributions


The frequency distribution is often transformed into a probability distribution
by calculating the relative frequency or probability of a reading falling in each
class interval.
For example, the probability of a reading falling in the interval 7.45-7.65

    = number of readings in class / total number of readings = 10/95 = 0.11

(See chapter 1 on measurement of probability.)


Probability distributions have a distinct advantage when comparing two or
more sets of data since the area under the curve has been standardised in all
cases to unity.
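The conversion is a one-liner per class; here is a sketch (ours) using the frequencies of table 2.2, where rounding to two decimal places reproduces the printed probability column:

```python
# Frequencies from table 2.2; dividing each by the total (95) gives the
# probability (relative frequency) of a reading falling in that class.
freqs = [1, 5, 10, 19, 27, 22, 6, 3, 2]
n = sum(freqs)
probs = [f / n for f in freqs]
rounded = [round(p, 2) for p in probs]   # the printed two-decimal column
```

Note that while the exact probabilities sum to one, the rounded column need not.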

2.2.4 Concept of aPopulation


All the data of table 2.1 are summarised by means of the frequency distribution
shown in figure 2.1. The distribution was obtained from a sample of 95
observations. However, in statistics the analyst likes to think in terms of

thousands of observations; in fact he thinks in terms of millions or more and


thus he conceives an infinite population. Normally millions of observations
cannot be obtained, only hundreds at the most being available, and so the
statistician is forced to work with a finite number of readings. These readings
are thought of as a sample taken from an infinite population and in some way
representative of this population. Statisticians take this infinite population as a
smooth curve. This is substantiated by studying what happens to the shape of
the distribution as the sample size increases. Figure 2.2 illustrates this, the data

Figure 2.2. The effect of the sample size on the histogram shape.

here being taken from an experiment in a laboratory. A sample size of 100 gives
an irregular shape similar to those obtained from the data of output times.
However, with increasing sample size, narrower class intervals can be used and
the frequency distribution becomes more uniform in shape until with a sample of
10 000 it is almost smooth. The limit as the sample size becomes infinite is also
shown. Thus with small samples, irregularities are to be expected in the frequency
distributions, even when the population gives a smooth curve.
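The effect can be imitated with a small simulation (ours, not from the book): sampling from a known population, a fair six-sided die here, the relative frequencies from a large sample track the population probabilities far more closely than those from a small one.

```python
import random

# Relative frequencies from samples of a fair six-sided die: the larger
# the sample, the closer the frequencies lie to the probability 1/6.
random.seed(1)

def relative_freqs(n):
    counts = [0] * 6
    for _ in range(n):
        counts[random.randrange(6)] += 1
    return [c / n for c in counts]

def max_error(freqs):
    # largest deviation of a relative frequency from its probability
    return max(abs(f - 1 / 6) for f in freqs)

small, large = relative_freqs(100), relative_freqs(10000)
```

With the seed fixed the run is repeatable; the irregularity of the small-sample histogram corresponds to its larger deviations.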
It is the assumption that the population from which the data were obtained
has a smooth curve (although not all samples show one) that enables the
statistician to use the mathematics of statistics.

2.2.5 Moments of Distribution


The summarising of data into a distribution always forms the first stage in
statistical analysis. However, this summarising process must usually be taken
further since a shape is not easy to deal with.
The statistician, in the final summary stage, calculates measures from the
distribution, these measures being used to represent the distribution and thus
the original data.
Each measure is called a statistic. In calculating these measures or statistics,
the concept of moments is borrowed from mechanics. The distribution in
probability form is considered as lying on the x axis and the readings in each
interval as having the value of the mid-point of each interval, i.e., x_1, x_2, . . ., x_N.
If the probabilities associated with these variable values are p_1, p_2, . . ., p_N
(figure 2.3 shows this diagrammatically), then p_1 + p_2 + . . . + p_N = 1.

Figure 2.3

Consider now the 1st moment of the distribution about the origin

    Σ p_i x_i = x̄   (the arithmetical average)

the sum being taken over i = 1 to N. Thus the 1st statistic or measure is the
arithmetical average x̄. Higher moments are now taken about this arithmetical
average rather than the origin.
Thus, the 2nd moment about the arithmetical average

    = Σ p_i (x_i - x̄)²

This 2nd moment is called the variance in statistics, and its square root is called
the standard deviation.
Thus the standard deviation of the distribution

    s = √[Σ p_i (x_i - x̄)²]

The higher moments are as follows:

    3rd moment about the average = Σ p_i (x_i - x̄)³

    4th moment about the average = Σ p_i (x_i - x̄)⁴

or in general the kth moment about the average = Σ p_i (x_i - x̄)^k
The first two moments, the mean and the variance, are by far the most
important.
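These moment calculations can be written out directly; the following sketch (ours) uses the mid-points of table 2.2 with the exact probabilities f/95 rather than the rounded ones:

```python
# Moments computed from the probability distribution of table 2.2
# (class mid-points weighted by exact probabilities f/95).
mids  = [7.15, 7.35, 7.55, 7.75, 7.95, 8.15, 8.35, 8.55, 8.75]
freqs = [1, 5, 10, 19, 27, 22, 6, 3, 2]
probs = [f / sum(freqs) for f in freqs]

mean = sum(p * x for p, x in zip(probs, mids))    # 1st moment about origin

def moment_about_mean(k):
    # kth moment taken about the arithmetical average
    return sum(p * (x - mean) ** k for p, x in zip(probs, mids))

variance = moment_about_mean(2)
std_dev = variance ** 0.5
```

As a sanity check, the 1st moment about the mean is zero by construction.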

2.2.6 Résumé of Statistical Terms used in Distribution Theory


Sample
A sample is any selection of data under study, e.g., readings of heights of men,
readings from repeated time studies.

Random Sample
A random sample is a sample selected without bias, i.e., one for which every
member of the population has an equal chance of being included in the sample.

Population or Universe
This is the total number of possible observations. This concept of a population
is fundamental to statistics. All data studied are in sample form and the
statistician’s sample is regarded as having been drawn from the population of all
possible events. A population may be finite or infinite. In practice, many finite
populations are so large they can be conveniently considered as infinite in size.

Grouping or Classification of Numerical Data


The results are sub-divided into groups so that no regard is paid to variations
within the groups. The following example illustrates this.

Groupings Number of results

3.95-4.95 8
4.95-5.95 7
5.95-6.95 5

The class boundaries shown in this example are suitable for measurements
recorded to the nearest 0.1 of a unit. The boundaries chosen are convenient for
easy summary of the raw data since the first class shown contains all
measurements whose integer part is 4, the next class all measurements starting
with 5 and so on.
It would have been valid but less convenient to choose the classes as, say,
3.25-4.25, 4.25-5.25, . . .
In grouping, any group is called a class and the number of values falling in
the class is the class frequency. The magnitude of the range of the group is
called the class interval, i.e., 3.95-4.95 or 1.

Number of Groups
For simplicity of calculation, the number of intervals chosen should not be too
large, preferably not more than twenty. Again, in order that the results obtained
may be sufficiently accurate, the number must not be too small, preferably
not less than eight.
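One rough way of honouring this eight-to-twenty guideline (our sketch, not the book's procedure; the range 1.6 is that of the furnace output times) is to test a few candidate class widths and keep those giving an acceptable number of classes:

```python
import math

# Keep the candidate class widths that yield between 8 and 20 classes
# for a data range of 1.6 (the furnace output times).
def class_count(data_range, width):
    return math.ceil(data_range / width)

candidates = [w for w in (0.1, 0.2, 0.5, 1.0)
              if 8 <= class_count(1.6, w) <= 20]
```

Here widths of 0.1 h (16 classes) and 0.2 h (8 classes) both qualify; the book chose 0.2 h.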

Types of Variable
Continuous. A continuous variable is one in which the variable can take every
value between certain limits a and b, say.
Discrete. A discrete variable is one which takes certain values only—frequently
part or all of the set of positive integers. For example, each member of a
sample may or may not possess a certain attribute and the observation recorded
(the value of the variable) might be the number of sample members which possess
the given attribute.

Frequency Histogram
A frequency distribution shows the number of sample values falling into each class
interval when a sample is grouped according to the magnitude of the values. If the
class frequency is plotted as a rectangular block on the class interval, the
diagram is called a frequency histogram. Note: Area is proportional to frequency.

Probability Histograms
A probability histogram is the graphical picture obtained when the grouped

sample data are plotted, the class probability being erected as a rectangular
block on the class interval. The area above any class interval is equal to the
probability of an observation being in that class since the total area under the
histogram is equal to one.

Limiting form of Histogram


The larger the sample, the closer the properties of histograms and probability
curves become to those of the populations from which they were drawn, i.e.,
the limiting form.

Variate
A variate is a variable which possesses a probability distribution.

2.2.7 Types of Distribution


While there is much discussion as to the value of classifying distributions into
types, there is no doubt in the authors’ minds that classification does help the
student to get a better appreciation of the patterns of variation met in practice.
Figure 2.4 gives the usually accepted classifications.

Type 1: Unimodal
Examples of this variation pattern are: intelligence quotients of children,
heights (and/or weights) of people, nearly all man-made objects when produced
under controlled conditions (length of bolts mass-produced on capstans, etc.).
A simple example of this type of distribution can be illustrated if one
assumes that the aim is to make each item or product alike but that there
exists a very large number of small independent forces deflecting the aim, and
under such conditions, a unimodal distribution arises. For example, consider
a machine tool mass-producing screws. The setter sets the machine up as
correctly as he can and then passes it over to the operator and the screws
produced form a pattern of variation of type 1. The machine is set to produce
each screw exactly the same, but, because of a large number of deflecting
forces present, such as small particles of grit in the cooling oil, vibrations in
the machine, slight variation in the metal—manufacturing conditions are not
constant, hence there is variation in the final product. (See simple quincunx
unit on page 61.)

Type 2: Positive Skew


Examples of this type of distribution are the human reaction time and other
types of variable where there is a lower limit to the values, i.e., distribution of
number of packages bought at a supermarket, etc.
If this type of distribution is met when a symmetrical type should be
expected it is indicative of the process being out of control.

Figure 2.4. Types of distribution: symmetrical, positive skew, negative skew, bimodal, J-shaped, U-shaped.

Type 3: Negative Skew


True examples of this type are difficult to find in practice but can arise when
there is some physical or other upper constraint on the process.

Type 4: Bimodal
This type cannot be classified as a separate form unless more evidence of
measures conforming to this pattern of variation is discovered. In most cases

this type arises from the combination of two distributions of type 1 (see
figure 2.5).

Figure 2.5. Bimodal distribution arising from two type-1 distributions with
different means m_1 and m_2.

Type 5: J-Shaped or Negative Exponential


Examples of this type include the flow of water down a river, most service time
distributions and the time intervals between accidents or breakdowns.

Type 6: U-Shaped
This type is fascinating in that its pattern is the opposite to type 1. A variable
where the least probable values are those around the average would not be
expected intuitively, and it occurs only rarely in practice. One example,
however, is the degree of cloudiness of the sky—at certain times of the year
the sky is more likely to be completely clear or completely cloudy than anything
in between.

2.2.8 Computation of Moments of Distribution


Dependent on the type of data and their range, the data may or may not be
grouped into class intervals. The calculation of moments is the same for either
grouped or ungrouped data, but in the case of grouped data, all the readings in
the class interval are regarded as lying at the centre of the interval. The method
used in this text and throughout all the examples makes use of a simple
transformation of the variate and is usually carried out on the frequency
distributions rather than on the probability distribution. This use of frequency
distributions is common to most text books and will be used here although
there is often advantage in using the probability distribution.

Let x_i = value of the variate in the ith class interval
    f_i = frequency of readings in the ith class interval
    p_i = probability of a value in the ith class interval
    Σ f_i = n, the total number of readings

The 1st moment (arithmetic average) = Σ f_i x_i / Σ f_i = x̄

or

    Σ p_i x_i = x̄

The 2nd moment (variance) (s')² = Σ f_i (x_i - x̄)² / Σ f_i

or

    (s')² = Σ p_i (x_i - x̄)²

For computing purposes the formula for variance is usually modified to
reduce the effect of rounding errors. These errors can arise through use of the
calculated average x̄, which is generally a rounded number. If insufficient
significant figures are retained in x̄, each of the deviations (x_i - x̄) will be in
error and the sum of their squares [Σ f_i (x_i - x̄)²] will tend to be inaccurate.

Computation of Moments using Frequency Distributions

The variate x_i is transformed to

    u_i = (x_i - x₀)/c   or   x_i = cu_i + x₀

where x₀ = any value of x taken as an arbitrary average, c = class interval width.

It can easily be shown that

    1st moment  x̄ = x₀ + c (Σ f_i u_i / Σ f_i)

    2nd moment  (s')² = c² [Σ f_i u_i² / Σ f_i - (Σ f_i u_i / Σ f_i)²]

Example

The values given in table 2.3 have been calculated using the data from table 2.2.

Class interval   Mid point (x_i)   Frequency (f_i)    u_i    f_i u_i    f_i u_i²

7.05-7.25   7.15    1   -4    -4    16
7.25-7.45   7.35    5   -3   -15    45
7.45-7.65   7.55   10   -2   -20    40
7.65-7.85   7.75   19   -1   -19    19
7.85-8.05   7.95   27    0     0     0
8.05-8.25   8.15   22   +1   +22    22
8.25-8.45   8.35    6   +2   +12    24
8.45-8.65   8.55    3   +3    +9    27
8.65-8.85   8.75    2   +4    +8    32
            Σf_i = 95    Σf_i u_i = -7    Σf_i u_i² = 225

Table 2.3

Let x₀ = 7.95, c = 0.20 h

    arithmetic average = 7.95 + 0.20 (-7/95) = 7.94 h

    variance (s')² = (0.2)² [225/95 - (-7/95)²] = 0.094
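The coded computation of table 2.3 can be checked in a few lines (our illustration; the frequencies and u values are those of the table):

```python
# Check of the worked example: the coded (u) computation of table 2.3.
freqs = [1, 5, 10, 19, 27, 22, 6, 3, 2]
u     = [-4, -3, -2, -1, 0, 1, 2, 3, 4]
x0, c = 7.95, 0.20
n = sum(freqs)

sum_fu  = sum(f * ui for f, ui in zip(freqs, u))        # Σfu
sum_fu2 = sum(f * ui * ui for f, ui in zip(freqs, u))   # Σfu²
mean = x0 + c * sum_fu / n
variance = c * c * (sum_fu2 / n - (sum_fu / n) ** 2)
```

The coded sums reproduce the table's totals, and the mean and variance agree with the values quoted above.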

Computation using Probability Distributions

Class interval   Mid point (x_i)   Probability (p_i)    u_i    p_i u_i    p_i u_i²

7.05-7.25   7.15   0.01   -4   -0.04   0.16
7.25-7.45   7.35   0.05   -3   -0.15   0.45
7.45-7.65   7.55   0.11   -2   -0.22   0.44
7.65-7.85   7.75   0.20   -1   -0.20   0.20
7.85-8.05   7.95   0.28    0    0      0
8.05-8.25   8.15   0.23   +1   +0.23   0.23
8.25-8.45   8.35   0.06   +2   +0.12   0.24
8.45-8.65   8.55   0.03   +3   +0.09   0.27
8.65-8.85   8.75   0.02   +4   +0.08   0.32

Table 2.4

Let x₀ = 7.95 and c = 0.2; then Σ p_i u_i = -0.09 and Σ p_i u_i² = 2.31.

The formulae for the moments are

    arithmetic average x̄ = x₀ + c Σ p_i u_i = 7.95 + (-0.018) = 7.93 h

    variance (s')² = c² [Σ p_i u_i² - (Σ p_i u_i)²] = 0.2² (2.31 - 0.0081) = 0.092

which compares favourably with the result achieved using the frequency
distribution, in view of the rounding off of the probabilities to the second
decimal place.

2.2.9 Sheppard’s Correction


When calculating the moments of grouped distributions, the assumption that
the readings are all located at the centre of the class interval leads to minor
errors in these moments. It must be stressed that the authors do not consider
that these corrections, known as Sheppard’s corrections, are of sufficient
magnitude in most problems to be used.
However, it is only correct that they should be given:

Correction to 1st moment = 0

Correction to 2nd moment = -c²/12

Thus the 1st moment calculation is unbiased, while the answer given for the
2nd moment should be reduced by c²/12.
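Applied to the furnace-time example (our sketch; the grouped variance of about 0.0945 comes from the frequency computation above), the correction is a one-liner:

```python
# Sheppard's correction: the variance of a grouped distribution is
# reduced by c^2/12; the mean needs no correction.
c = 0.20                      # class interval width (h)
grouped_variance = 0.0945     # from the frequency computation above
corrected = grouped_variance - c * c / 12
```

Here the correction removes roughly 0.0033, small compared with the variance itself, which is the authors' point.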

2.3 Problems for Solution


In the following problems, students are required to
(1) summarise data into distributions
(2) draw the frequency histogram
(3) calculate the mean and standard deviation of data.
While there is a large collection of problems given, tutors should select those
examples most relevant to their students’ courses. Worked solutions are
given for all questions in section 2.4, but in the authors’ opinion the answering
of two or three problems should be adequate.
Note: Students’ answers may differ slightly from the given answers, depending
on the class intervals selected.
The distributions illustrate that with limited samples of 30 to 100 observations
the shapes of the distributions can tend in some cases to be relatively irregular.

1. In a work study investigation of an operator glueing labels onto square


biscuit tins, the following readings, in basic minutes, were obtained for the time
of each operation:

0.09"-0°09 0, 8b. 7 0,096 90.09 2cOr1d 0:09 + 0:07 40,005 40:06


0.09. 0.09. 0.09 0.11.-—-0.09) 0:07. 0,09": 0: 06— Senite 0:07
0.09 0.10 0.06 0.10 0.08 0.06 0.09 0.08 0.08 0.08
0.08) *-:0.10-, 0.08 5:50.07, :; :0.09.)-0.082< 0:09 0:1 ),0:09-,,,,0,09
0.08 0.10 0.09 0.08 0.10 0.08 0.08 0.09 0.09 0.09
0.08 0.06 0.08 0.08 0.10 0.09 0.09 0.10 0.10 0.11

2. In the assembly of a Hoover agitator, time element number 2 consists of:


pick up two spring washers, one in each hand and place on spindle, pick up two
bearings and place on spindle, pick up two felt washers and place on spindle,
pick up two end caps and screw onto spindle.
The following data, in basic minutes, were obtained from 93 studies for the
time element number 2:

0.26..-.0.28. 0.31 ~ 0:22: 2O254i0.28.5 0.285026 0.29 0.25


024 20.29... 0.26- O28. ~ 0.2457.0.26,5 0.29: 0:23. “026-026
0.232 10130*:80252% 029% Oi Fire O26n 20/3310i24270882i O34
0.26...0.31=, 0.23 »_ 0.29) . 0.22 0/2642 029070125) nO24ssn0i28
0.27" 0:32=— 023 0:26 0252 0.28- 0:36 042 “O24 7071
O23: OZ -046° -0235 "023 2061. 0.29 O38 7025
024. O28. 033. 20 24a 029 056 0:32. 0.27 024
0:25 0,29 0:33 0.25 0:35, 024" 033.4028 5026
0:°26""0.20°"" 0.24 "0:26" 0:34 *0:30" 0:30-0:29
0.18" 0.22 "0.25" 0.2? =OBar-? 01307 9°0:30 1003

3. The time interval, in minutes, between the arrival of successive customers


at a cash desk of a self-service store was measured over 56 customers and the
results are given below:

L035! °51.68 0,782 <1210089032' °41.61.~ §0/10% 40:43 S70l8 10:09


0.21.6 °2.71 “2312 2812S SOUP OaAS8 2054221940 GOROl 16
LTA .0.16°°20:319 20.999 O18) 4004) A626 et oh4Si Ft -0:63
0.57-°0.65 °4.60 1.:72>°0)52 2:32) OOSineOederrsisdeelani
1.16°5 0/58 ©)OS F< *Q04He Ay IOS) OMT lO OSs HS SameDQiOSichiO01
01452" 10425 0:255 QOS SiS ai 3190

4. The number of defects per shift from a large indexing machine are given
below for the last 52 shifts:

Z 6 4 5
3 4 3 2
7 3 5 4
5 3 2 1 WW
eS
he NW
ON WorFNnd
eRe
WOW FSF
NAWbBROW
re
WO
= D
fh
NO
WNNAA

5. The crane handling times, in minutes, for a sample of 100 jobs lifted and
moved by an outside yard mobile crane are given below:
5 6 2A 8 7 8 Bl 5 10 Pail
13 15 17 7 DT 6 6 11 9 4
7 4 2 1 92. 10 tts 4 15 11 38
16 52 87 20-3 18 22 11 i 9 8
6 10 10 Wi 37 32 10 26 14 15
28 18 2 17 pA} 4 9 19 10 44 20
We > 20 8 25 14 23 13 12 vi,
9 92 33 p79) 19 151 171 od 4 6
al 1 5) 7 45 6 7 17 f) 19 42
9 6 55 61 52 4 5 102 8 oe

6. The lifetime, in hours, of a sample of 100 electric light bulbs is given


below:

1067 919 1196 785 1126 936 918 1156 920 1192
855 1092 1162 1170 929 950 905 972 1035 922
1022 978 832 1009 1157 1151 1009 765 958 1039
923 1333 811 1217 1085 896 958 1311 1037 1083
999 932 1035 944 1049 940 lig2 1145 1026 1040
901 1324 818 1250 1203 1078 890 1303 1147 1289
1187 1067 1118 1037 958 760 1101 949 883 699
824 643 980 935 878 934 910 1058 867 1083
844 814 1103 1000 788 1143 935 1069 990 880
1037 Mey 863 990 1035 1 112 93 970 1258 1029

7. The number of goals scored in 57 English and Scottish league matches for
Saturday 23rd September, 1969, was:

0 Zz 3

WwW
WN
he = f
COrFRNnN
NNO ns
WN

MAWAaAN
WOR
We BW
WB WwW
KK
Nov fw
es Wh
W
WN (on OWN
WwWNWD
mAWrnNB

8. The intelligence quotients of 100 children are given below:†


YS3 STi" "tee ero pe aes 85 82 108 85
94 oh WS 110% 2. $133 98 106 52 4 102
115 . 10%. f00 57 —-108 77 94. 6:12 100 4 102
104 6% 9H11 88 87 9} 102 95." tO! 88
90 93 85 ‘4-107 BO ‘6 0G "120 ah £0! 103
ROS 100 ART 2 c107. VAEI2 98 83 98 89 =:106
Ao ahad 85 24 MEAS 93, 7 ee 90. 202 87
OS Uala7 VAI? 94 93 Rs 038.4 105.422. 104
104 19 D402 GATO4 VOLO? o7 Toe: TOS 2t03 > 207
106 96 $3 © 107 _**4102" 1iGw ae: 76 98 88

9. The sales value for the last 30 periods of a non-seasonal product are given
below in units of £100:
43 41 74 61 79 60 71 69 63 c
70 66 64 71 71 74 56 74 41 71
63 57 57 68 64 62 a oe 40 76

10. The records of the total score of three dice in 100 throws are given below:
16 + 9 12 11 8 15 13 12 13
8 7 6 13 10 1] 16 14 iy i?
14 14 ~ 13 9 13 8 10 12 14
8 = 10 6 9 10 13 12 13 13
16 ij 13 12 9 8 10 11 12 10
15 12 4 16 10 9 13 10 9 12
9 - 14 13 7 6 11 9 15 8
5 £2 7 6 7 13 13 iia| US. 14
12 i 10 12 12 12 13 3 16 4

2.4 Solutions to Problems

1. Range = 0.11 — 0.06 = 0.05 min.


Since only two significant figures are given in the data, there is no choice
regarding the class interval width.
Size of class interval = 0.01 min, giving only six class intervals (below the
preferred minimum of eight).

† These data were taken from Facts from Figures by M. J. Moroney, Pelican.

Figure 2.6. Glueing labels onto biscuit tins.

Class interval   Mid point (x)   Frequency (f)    u     uf    u²f

0.055-0.065   0.06    5   -3   -15    45
0.065-0.075   0.07    4   -2    -8    16
0.075-0.085   0.08   14   -1   -14    14
0.085-0.095   0.09   23    0     0     0
0.095-0.105   0.10    9   +1    +9     9
0.105-0.115   0.11    5   +2   +10    20
              Σf = 60   Σuf = -18   Σu²f = 104

Table 2.5

Transforming x = x₀ + cu

Let x₀ = 0.09, c = 0.01

1st moment about the origin = arithmetical mean,

    x̄ = x₀ + c (Σuf/Σf) = 0.09 + 0.01 × (-18/60) = 0.087 min

Variance

    (s')² = c² [Σu²f - (Σuf)²/Σf] / Σf = 0.01² (104 - 5.4)/60 = 1.64 × 10⁻⁴

Standard deviation

    s' = √(1.64 × 10⁻⁴) = 0.013 min

The histogram is shown in figure 2.6.
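As a cross-check of this solution (ours, not the book's), the same mean and variance follow from Python's statistics module applied to the class mid-points of table 2.5 repeated according to their frequencies; pvariance divides by n, matching the grouped (s')² formula above:

```python
import statistics

# Solution 1 cross-checked: expand each mid-point by its frequency and
# let the statistics module compute the population mean and variance.
mids  = [0.06, 0.07, 0.08, 0.09, 0.10, 0.11]
freqs = [5, 4, 14, 23, 9, 5]
expanded = [x for x, f in zip(mids, freqs) for _ in range(f)]

mean = statistics.mean(expanded)
variance = statistics.pvariance(expanded)   # population form, divisor n
```

The divisor-n choice matters: statistics.variance (divisor n − 1) would give a slightly larger value than the book's grouped formula.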

2. Range = 0.46 - 0.17 = 0.29 min; size of class interval = 0.03 min, giving
9-10 class intervals.

Class interval   Mid point (x)   Frequency (f)    u     uf    u²f

0.165-0.195   0.18    3   -3    -9    27
0.195-0.225   0.21    5   -2   -10    20
0.225-0.255   0.24   25   -1   -25    25
0.255-0.285   0.27   26    0     0     0
0.285-0.315   0.30   19   +1   +19    19
0.315-0.345   0.33   10   +2   +20    40
0.345-0.375   0.36    3   +3    +9    27
0.375-0.405   0.39    0   +4     0     0
0.405-0.435   0.42    1   +5    +5    25
0.435-0.465   0.45    1   +6    +6    36
              Σf = 93   Σuf = +15   Σu²f = 219

Table 2.6
(For histogram see figure 2.7.)

Calculation of the Mean and Standard Deviation

Transform x = x₀ + cu

Let x₀ = 0.27, c = 0.03

Average time

    x̄ = x₀ + c (Σuf/Σf) = 0.27 + 0.03 × (15/93) = 0.275 min

Variance of sample

    (s')² = c² [Σu²f - (Σuf)²/Σf] / Σf = 0.03² [219 - (15)²/93] / 93
          = (0.03²/93) × 216.58 = 0.0021

Standard deviation s' = 0.046 min



Figure 2.7. Time taken to assemble Hoover agitator.

3. Range = 4.60 —0.01 = 4.59 min; width of class interval = 0.5 min.

Class interval   Frequency (f)    u     uf    u²f

0-0.499      19   -2   -38    76
0.50-0.999   11   -1   -11    11
1.00-1.499    7    0     0     0
1.50-1.999    6   +1    +6     6
2.00-2.499    4   +2    +8    16
2.50-2.999    3   +3    +9    27
3.00-3.499    2   +4    +8    32
3.50-3.999    3   +5   +15    75
4.00-4.499    0   +6    +0     0
4.50-4.999    1   +7    +7    49
             Σf = 56   Σuf = +4   Σu²f = 292

Table 2.7
(For histogram, see figure 2.8.)
Transform x = x₀ + cu

Let x₀ = 1.25, c = 0.50

1st moment about the origin = arithmetic average,

    x̄ = x₀ + c (Σuf/Σf) = 1.25 + 0.50 × (4/56) = 1.29 min

Figure 2.8. Interval between arrival of customers.

Variance of the sample

    (s')² = c² [Σu²f - (Σuf)²/Σf] / Σf = 0.5² [292 - (4)²/56] / 56
          = 0.25 × (292 - 0.29)/56 = 1.30

Standard deviation of sample s' = 1.14 min

4. Range = 9 - 0 = 9 defectives; width of class interval = 1 defective.

Number of defectives   Number of shifts (f)    u     uf    u²f

0    3   -3    -9    27
1    7   -2   -14    28
2    9   -1    -9     9
3   12    0     0     0
4    9   +1    +9     9
5    6   +2   +12    24
6    3   +3    +9    27
7    2   +4    +8    32
8    0   +5    +0     0
9    1   +6    +6    36
    Σf = 52   Σuf = +12   Σu²f = 192

Table 2.8
(For histogram see figure 2.9.)

Figure 2.9. Number of defects in indexing machine.
Transform x = x₀ + cu; let x₀ = 3, c = 1 defective

Average number of defectives per shift = 3 + 1 × (12/52) = 3.2 per shift

Variance of the sample

    (s')² = 1² × [192 - (12)²/52] / 52 = 3.64

Standard deviation = 1.9
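For a discrete variate like this one the recorded values themselves serve as x, so no class mid-points are needed; a short check (ours) using the frequencies of table 2.8:

```python
# Solution 4 re-done directly: a discrete variate needs no mid-points,
# the values 0..9 are used as they stand.
values = list(range(10))                    # 0..9 defects per shift
shifts = [3, 7, 9, 12, 9, 6, 3, 2, 0, 1]    # frequencies from table 2.8
n = sum(shifts)

mean = sum(v * f for v, f in zip(values, shifts)) / n
variance = sum(f * (v - mean) ** 2 for v, f in zip(values, shifts)) / n
```

The direct computation reproduces the coded result, as it must when c = 1.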

5. Range = 192 —4= 188 min.


In this case, if equal class interval widths were chosen, then a width of
perhaps 20 min would be suitable. However, as can be checked, in the case of
the J-shaped distribution unequal class intervals give a better summary.

Class interval   Mid point (x)   Frequency (f)     u        uf       u²f

0- 9.99        5   35    -3      -105    315
10- 19.99     15   30    -2       -60    120
20- 29.99     25   15    -1       -15     15
30- 39.99     35    6     0         0      0
40- 49.99     45    3    +1        +3      3
50- 69.99     60    4    +2.5     +10     25
70- 99.99     85    2    +5       +10     50
100-139.99   120    3    +8.5    +25.5   216.75
140-199.99   170    2   +13.5     +27    364.5
      Σf = 100    Σuf = -104.5    Σu²f = 1109.25

Table 2.9
(For histogram, see figure 2.10.)

Figure 2.10. Crane handling times.

Transform x = x₀ + cu

Let x₀ = 35, c = 10 min

Arithmetic average = 35 - 10 × (104.5/100) = 24.6 min

Variance of the sample

    (s')² = 10² [1109.25/100 - (104.5/100)²] = 10² (11.09 - 1.09) = 1000

Standard deviation s' = 31.6 min
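Because the class intervals here are unequal, the coded u values become fractional, and it is arguably simpler to verify the solution from the mid-points directly (our sketch; the mid-point of the last class, 140-199.99, is 170):

```python
# Solution 5 checked from the mid-points directly; with unequal class
# intervals this avoids the fractional u values of the coded method.
mids  = [5, 15, 25, 35, 45, 60, 85, 120, 170]
freqs = [35, 30, 15, 6, 3, 4, 2, 3, 2]
n = sum(freqs)

mean = sum(f * x for f, x in zip(freqs, mids)) / n
variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, mids)) / n
std_dev = variance ** 0.5
```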



6. Range = 1333 - 643 = 690 h; class interval chosen as 100.

Variate (x)   Frequency (f)    u     uf    u²f

549.5- 649.5     1   -4    -4    16
649.5- 749.5     1   -3    -3     9
749.5- 849.5    10   -2   -20    40
849.5- 949.5    26   -1   -26    26
949.5-1049.5    26    0     0     0
1049.5-1149.5   18   +1   +18    18
1149.5-1249.5   11   +2   +22    44
1249.5-1349.5    7   +3   +21    63
      Σf = 100   Σuf = +8   Σu²f = 216

Table 2.10
(For histogram, see figure 2.11.)

Transforming x = x₀ + uc

where c = 100 h, x₀ = 1000 h

Average life of bulbs,

    x̄ = 1000 + 100 × (8/100) = 1008 h

Variance of the sample

    (s')² = 100² [216 - (8)²/100] / 100 = 21 536

Standard deviation s' = 146.8 h

Figure 2.11. Lifetime of electric light bulbs.

7. Range = 0-8 goals.

Discrete distribution; width of class interval = 1 goal.

Number of goals   Frequency (f)    u     uf    u²f

0    2   -4    -8    32
1    9   -3   -27    81
2   11   -2   -22    44
3   15   -1   -15    15
4    8    0     0     0
5    5   +1    +5     5
6    5   +2   +10    20
7    1   +3    +3     9
8    1   +4    +4    16
   Σf = 57   Σuf = -50   Σu²f = 222

Table 2.11
(For histogram, see figure 2.12.)
x₀ = 4, c = 1

Average goals/match

    x̄ = 4 + 1 × (-50/57) = 3.12

Variance of sample

    (s')² = 1² [222 - (-50)²/57] / 57 = 3.13

Standard deviation of sample = 1.8

Figure 2.12. Number of goals scored in soccer matches.

8. Range = 143 - 57 = 86.

Suitable class intervals could be either 10 or 15. In this case, as with the author
of Facts from Figures, the class interval is chosen as 10.

Class interval   Frequency (f)    u     uf    u²f

54.5- 64.5    1   -4    -4    16
64.5- 74.5    2   -3    -6    18
74.5- 84.5    9   -2   -18    36
84.5- 94.5   22   -1   -22    22
94.5-104.5   33    0     0     0
104.5-114.5  22   +1   +22    22
114.5-124.5   8   +2   +16    32
124.5-134.5   2   +3    +6    18
134.5-144.5   1   +4    +4    16
   Σf = 100   Σuf = -2   Σu²f = 180

Table 2.12
(For histogram, see figure 2.13.)

Transforming x = x₀ + cu, where x₀ = 99.5, c = 10

Average intelligence quotient

    x̄ = 99.5 + 10 × (-2/100) = 99.3

Variance of sample

    (s')² = 10² [180 - (-2)²/100] / 100 = 180

Standard deviation of sample s' = 13.4


Figure 2.13. Intelligence quotients of children.



9. Range = 79 - 40 = 39; class interval width = 4.

Variate (x)   Frequency (f)    u     uf    u²f

39.5-43.5    4   -5   -20   100
43.5-47.5    0   -4     0     0
47.5-51.5    0   -3     0     0
51.5-55.5    1   -2    -2     4
55.5-59.5    4   -1    -4     4
59.5-63.5    5    0     0     0
63.5-67.5    3   +1    +3     3
67.5-71.5    7   +2   +14    28
71.5-75.5    4   +3   +12    36
75.5-79.5    2   +4    +8    32
   Σf = 30   Σuf = +11   Σu²f = 207

Table 2.13
(For histogram, see figure 2.14.)
where x₀ = 61.5, c = 4

Average sales/period

    x̄ = 61.5 + 4 × (11/30) = 63

Variance of sample

    (s')² = 4² [207 - (11)²/30] / 30 = 108.3

Standard deviation of sample s' = 10.4
Figure 2.14. Sales value of a product over 30 time periods.

10. Range = 16 - 4 = 12; use class interval of 2 units.

Variate (x)   Frequency (f)    u     uf    u²f

3.5- 5.5     7   -3   -21    63
5.5- 7.5    13   -2   -26    52
7.5- 9.5    17   -1   -17    17
9.5-11.5    18    0     0     0
11.5-13.5   29   +1   +29    29
13.5-15.5   11   +2   +22    44
15.5-17.5    5   +3   +15    45
   Σf = 100   Σuf = +2   Σu²f = 250

Table 2.14

(For histogram, see figure 2.15.)

where x₀ = 10.5, c = 2

Average score

    x̄ = 10.5 + 2 × (2/100) = 10.54

Variance of scores

    (s')² = 2² [250 - (2)²/100] / 100 = 10

Standard deviation s' = 3.16

Figure 2.15. Total score of three dice.



2.5 Practical Laboratory Experiments and Demonstrations


The three experiments described are experiments 4, 5 and 6 from the authors’
Laboratory Manual in Basic Statistics, pages 20-32.
As explained on page 20 of the manual, the authors leave the selection of
the populations to be used to the individual instructor—use whatever is
most suitable.
The objects of these experiments are firstly to show basic concepts involved
and secondly to give experience in computing means and standard deviation.
Thus data collection should be as quick as possible and the following points
noted:

(1) How accurately should students measure? Obviously the unit of


measurement must be small enough to give approximately eight to twenty class
intervals and since the sample size of 50 is relatively small, the best number of
class intervals is at the bottom end of the range.
(2) Selection of class intervals: the tables for computing means and
standard deviations are set out fully in the manual. With the kit, one
obvious and quick experiment is to measure 50 rods from either the red or
the yellow population using the measuring rules. One of the experiments
designed by the authors is described below; this 'straw' experiment is perhaps
one of the best and most famous distribution experiments because of the various
points which can be made from it. Also described is the shove-halfpenny experiment.

2.5.1 The Drinking Straw Experiment


The simplicity and speed of this experiment illustrate the main requirements of
good design.
Here students (in groups of two or three) are given 50 or more ordinary
drinking straws (usual size 180-250 mm) and one of the standard measuring
rules from the kit.
One student acts as cutter for the whole experiment and cuts, with the use
of the rule, one straw to exactly 130 mm. With this straw laid on the bench as
the guide and holding the other straws 0.5-1 m away he then attempts to cut
50 straws to the 130 mm standard. As the straws are cut they are passed to
others in the group for measuring and the results are entered in the table in the
manual.
Students have to decide (or be guided) on the unit of measurement, i.e. at
least 6 to 8 class intervals. (Note: no feedback must take place in this experiment
and measurers should not let the cutter see results.) This experiment usually
ends in distributions whose shapes are either (a), (b) or (c) as illustrated in
figure 2.16.

(a) (b) (c)

Figure 2.16
For the case of

(a) The cutter has held the standard and produced a bell-shaped curve.
(b) Here, either consciously or not, the standard has been changing.
(c) Here the negatively skewed distribution has arisen by the cutter, again
either consciously or not, placing control on the short end of the straw.

2.5.2 The Shove Halfpenny Experiment


Number of persons: groups of 2 or 3.

Laboratory Equipment
Shove-halfpenny board or specially designed board (available from Technical
Prototypes (Sales) Ltd).

Method
After one trial, carry out 50 further trials, measuring the distance travelled each
time, the object being to send the disc the same distance at each trial.

Analysis
Summarise the data into a distribution, draw a histogram and calculate the mean
and standard deviation.

2.5.3 The Quincunx


This use of a quincunx developed by the authors is outlined below and gives an
effective simple demonstration of the basic concepts of variation.
This simple model, the principle of which was originally devised by Galton
to give a mechanical representation of the binomial distribution, is an
effective means of demonstrating to students the basic concepts of distributions.
The quincunx supplied with the statistical kit has ten rows of pins and seed
is fed in a stream through the pattern of pins. The setting is such that each pin
has a 50% chance of throwing any one seed to the right or to the left and thus,
a symmetrical distribution is obtained. With this large array of pins and the
speed of the stream, a simple analogy of the basic concept of distributions can
be demonstrated speedily and effectively.

Distributions can be regarded as arising under conditions where the aim is to


produce items as alike as possible (the stream in the model), but, due to a large
number of deflecting forces (the pins), each one independent of the others, the final
product varies and this variation pattern forms a distribution. For example, if one
considers an automatic lathe, mass producing small components, then material
and machine settings are controlled to produce products alike. However, due to
a very large number of small deflecting forces (no one of which has an appreciable
effect, otherwise it would be possible to correct for it), such as vibration,
particles of dirt in the cooling oil, and small random variations in the material,
the final components give rise to a distribution similar to that generated by the
quincunx.
3 Hypergeometric, Binomial
and Poisson distributions

3.1 Syllabus Covered


Use of hypergeometric distribution; binomial distribution and its application;
Poisson distribution and its application; fitting of distributions to given data.

3.2 Résumé of Theory and Concepts


Here only a brief résumé of theory is given since this is already easily available
in a wide range of textbooks. However, this résumé gives not only the basic
laws but also the basic concepts which are necessary for a fuller understanding
of the laws.

3.2.1 The Hypergeometric Law


If a group contains N items of which M are of one type and the remainder,
N − M, are of another type, then the probability of getting exactly x of the
first type in a random sample of size n is

P(x) = C(M, x) C(N − M, n − x) / C(N, n)

where C(a, b) = a!/[b!(a − b)!] denotes the number of ways of choosing b items from a.
3.2.2 The Binomial Law

If the probability of success of an event in a single trial is p and p is constant for
all trials, then the probability of x successes in n independent trials is

P(x) = C(n, x) p^x (1 − p)^(n−x)
3.2.3 The Poisson Law
If the chance of an event occurring at any instant is constant in a continuum of
63

time, and if the average number of successes in time t is m, then the probability
of x successes in time t is

P(x) = m^x e^(−m) / x!

where m = expected (or average) number of successes

e = exponential base, e ≈ 2.718
Here, the event’s happening has a definite meaning but no meaning can be
attached to its not happening. For example, the number of times lightning strikes
can be counted and have a meaning but this is not true of the number of times
lightning does not strike.
Again, the Poisson law can be derived as the limit of the binomial under
conditions where p → 0 and n → ∞ but such that np remains finite and equal to m.
Then the probability of x successes is

P(x) = m^x e^(−m) / x!

as before.
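All three laws can be evaluated directly rather than from tables; the short Python sketch below (our own illustration, with our own function names) uses only the standard library:

```python
from math import comb, exp, factorial

def hypergeometric_pmf(x, N, M, n):
    # P(exactly x of the M "type 1" items in a random sample of n from N)
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def binomial_pmf(x, n, p):
    # P(exactly x successes in n independent trials, success probability p)
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, m):
    # P(exactly x successes when the expected number of successes is m)
    return m ** x * exp(-m) / factorial(x)

# Checks against worked examples 1, 2(a) and 5(a) of section 3.2.5:
print(round(hypergeometric_pmf(5, 52, 13, 13), 4))  # 0.1247 (five diamonds)
print(round(binomial_pmf(3, 50, 0.10), 4))          # 0.1386 (three lorries down)
print(round(poisson_pmf(3, 1.0), 4))                # 0.0613 (three spares used)
```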

3.2.4 The essential requirement is for students to be able to decide which


distribution is applicable to the problem, and for this reason the problems are
all given under general headings, because, for example, the statement that a
problem relates to the Poisson law almost defeats the purpose of setting it.

Tutors must stress the relationship between these distributions so that students
can understand the type to use for any given situation.
Tutors can introduce students to the use of binomial distribution in place of
hypergeometric distribution in sampling theory when n/N < 0.10.
Students should be introduced to the use of statistical tables at this stage. For
all examples and problems, the complementary set of tables, namely Statistical
Tables by Murdoch and Barnes, published by Macmillan, has been used. As
mentioned in the preface, references to these tables will be followed by an
asterisk.

Note: The first and second moments of the binomial and Poisson distributions
are given below.
                                   Binomial      Poisson
1st moment (mean) μ                np            m
2nd moment about the
mean (variance) σ²                 np(1 − p)     m
Hypergeometric, Binomial and Poisson Distributions 65

3.2.5 Examples on Use of Distributions


1. Assuming randomness in shuffling, what is the distribution of the number of
diamonds in a 13-card hand? What is the probability of getting exactly five
diamonds in the hand?
This is the hypergeometric distribution.

Type 1 (diamonds): 13 cards
Type 2 (not diamonds): 39 cards
Probability of x diamonds in a 13-card hand is

P(x) = C(13, x) C(39, 13 − x) / C(52, 13)

Probability of exactly five diamonds in the hand is

P(5) = C(13, 5) C(39, 8) / C(52, 13) = 0.1247

2. A distribution firm has 50 lorries in service delivering its goods; given that
lorries break down randomly and that each lorry utilisation is 90%, what
proportion of the time will
(a) exactly three lorries be broken down?
(b) more than five lorries be broken down?
(c) less than three lorries be broken down?

This is the binomial distribution since the probability of success, i.e. the
probability of a lorry breaking down, is p = 0.10 and this probability is constant.
The number of trials n = 50.
(a) Probability of exactly three lorries being broken down

P(3) = C(50, 3) 0.10^3 (1 − 0.10)^47

from table 1* of statistical tables

P(3) = 0.8883 − 0.7497 = 0.1386
(b) Probability of more than five lorries being broken down

P(x > 5) = Σ (x = 6 to 50) C(50, x) 0.10^x (1 − 0.10)^(50−x)

from table 1*

P(x > 5) = 0.3839

(c) Probability of less than three lorries being broken down

P(x < 3) = 1 − Σ (x = 3 to 50) C(50, x) 0.10^x (1 − 0.10)^(50−x) = 1 − 0.8883 = 0.1117

3. How many times must a die be rolled in order that the probability of 5
occurring is at least 0.75?
This can be solved using the binomial distribution. Probability of
success, i.e. a 5 occurring, is p = 1/6.
Let k be the unknown number of rolls required; then the probability of x
5's in k rolls is

P(x) = C(k, x) (1/6)^x (5/6)^(k−x)

The probability required is

P(at least one 5) = 1 − P(0) = 1 − (5/6)^k ≥ 0.75

since (5/6)^k is the probability of not getting a 5 in k throws. Thus

(5/6)^k ≤ 0.25

k ≥ log 0.25 / log 0.8333 = 0.60206/0.07918 = 7.6

Number of throws required = 8
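The same answer follows in one line by computer; a minimal Python sketch (ours):

```python
import math

# Smallest k with 1 - (5/6)**k >= 0.75, i.e. (5/6)**k <= 0.25
k = math.ceil(math.log(0.25) / math.log(5 / 6))
print(k)  # 8
```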

4. A firm receives very large consignments of nuts from its supplier. A random
sample of 20 is taken from each consignment. If the consignment is in fact 30%
defective, what is
(a) probability of finding no defective nuts in the sample?
(b) probability of finding five or more defective nuts in the sample?
This is strictly a hypergeometric problem but it can be solved by using the
binomial distribution since probability of success, i.e. of obtaining a defect, is
p = 0.30 which can be assumed constant. The consignment is large enough to

ignore the very slight change in p as the sample is taken.


(a) From table 1*, p = 0.30, n = 20

P(0) = C(20, 0) 0.30^0 (1 − 0.30)^20 = 1 − 0.9992 = 0.0008

(b) The probability of finding five or more defectives

P(x ≥ 5) = Σ (x = 5 to 20) C(20, x) 0.30^x (1 − 0.30)^(20−x)

from table 1* = 0.7625

5. The average usage of a spare part is one per month. Assuming that all machines
using the part are independent and that breakdowns occur at random, what is
(a) the probability of using three spares in any month?
(b) the level of spares which must be carried at the beginning of each month
so that the probability of running out of stock in any month is at most 1 in
100?
This is the Poisson distribution.
The expected usage m = 1.0
(a) Probability of using three spares in any month

P(3) = 1.0^3 e^(−1.0) / 3!

From table 2*, P(3) = 0.0803 − 0.0190 = 0.0613
(b) This question is equivalent to: what demand in a month has a probability
of at most 0.01 of being equalled or exceeded?

Note: Runout if stock falls to zero.

From table 2*

Stocking four spares, probability of four or more = 0.0190


Stocking five spares, probability of five or more = 0.0037
Stock five, the probability of runout being 0.0037

(Note: It is usual to go to a probability of 1/100 or less.)
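The search over stock levels can be sketched in Python (our illustration of the logic; the book simply reads table 2*). Runout occurs when demand reaches the stock level s, per the note above:

```python
from math import exp, factorial

def poisson_tail(s, m):
    # P(demand >= s) for a Poisson variate with mean m
    return 1 - sum(m ** x * exp(-m) / factorial(x) for x in range(s))

# Smallest stock s whose runout probability P(demand >= s) is at most 0.01
s = 1
while poisson_tail(s, 1.0) > 0.01:
    s += 1
print(s, round(poisson_tail(s, 1.0), 4))  # 5 0.0037
```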

6. The average number of breakdowns due to failure of a bearing on a large


automatic indexing machine is two per six months. Assume the failures are

random, and calculate and draw the probability distribution of the number of
failures per six months per machine over 100 machines.
Calculate the average and the standard deviation of the distribution.
This is the Poisson distribution.
Expected number of failures per machine per six months, m = 2.

Number of    Probability    Expected number
failures         Pi         of failures fi       u       ufi       u²fi

0              0.1353            13.5           −2      −27.0      54.0
1              0.2707            27.1           −1      −27.1      27.1
2              0.2707            27.1            0         0         0
3              0.1804            18.0           +1      +18.0      18.0
4              0.0902             9.0           +2      +18.0      36.0
5              0.0361             3.6           +3      +10.8      32.4
6              0.0121             1.2           +4       +4.8      19.2
7 or over      0.0045             0.5           +5       +2.5      12.5

               1.0000       Σfi = 100                Σufi = 0   Σu²fi = 199.2

Table 3.1. The values have been calculated from table 2* of statistical tables.

Transform x = uc + x₀ with
x₀ = 2
c = 1

The arithmetical average

x̄ = x₀ + c(Σufi/Σfi) = 2 + 0/100 = 2

Variance

(s')² = c²[Σu²fi/Σfi − (Σufi/Σfi)²] = 199.2/100 − 0 = 1.992

Standard deviation = 1.41

3.2.6 Special Examples of the Poisson Distribution of General Interest
The following examples have been chosen to show the use of the Poisson
distribution and to illustrate clearly the tremendous potential of statistics,
that is, of the logic of inference.

Students will be introduced here to some of the logic used later so that they
can see, even at this introductory stage, something of the overall analysis using
statistical methods.

1. Goals Scored in Soccer


Problem 7 in chapter 2 (page 47) gives the data to illustrate this example. The
distribution of actual goals scored in the 57 matches is given in table 3.2. The
mean of this distribution is easily calculated, as in chapter 2, as

average number of goals/match m = 3.1

Number of goals/match    0   1   2   3   4   5   6   7   8   Total

Frequency                2   9  11  15   8   5   5   1   1    57

Table 3.2

Setting up the hypothesis that goals occur at random at a constant average


rate, i.e. it does not matter which team is playing, then the Poisson distribution
should fit these data. Using table 2* of statistical tables the probabilities are
given in table 3.3, together with the Poisson frequencies.

Number of goals/match     0      1      2      3      4      5      6      7    8 or more  Total

Poisson probability     0.045  0.140  0.217  0.223  0.173  0.107  0.056  0.024    0.014    1.000
Poisson frequency        2.6    8.0   12.4   12.7    9.9    6.1    3.2    1.4      0.8       57
Actual frequency          2      9     11     15      8      5      5      1        1        57

Table 3.3

The agreement will be seen to be fairly close and when tested (see chapter 8), is
a good fit. It is interesting to see that the greater part of the variation is due to this
basic law of variation. However, larger samples tend to show that the Poisson
does not give a correct fit in this particular context.

2. Deaths due to Horsekicks


The following example, due to von Bortkiewicz, gives the records for 10 army
corps over 20 years, or 200 readings of the number of deaths of cavalrymen due
to horsekicks. The frequency distribution of number of deaths per corps per
year is shown in table 3.4.

Number of deaths/corps/year     0     1     2    3    4
Frequency                      109    65    22   3    1

Table 3.4
From this table the average number of deaths/corps/year, m = 0.61
Setting up the null hypothesis, namely, that the probability of a death has been
constant over the years and is the same for each corps, is equivalent to
postulating that this pattern of variation follows the Poisson law. Fitting a
Poisson distribution to these data and comparing the fit, gives a method of
testing this hypothesis. Using table 2* of statistical tables, and without
interpolating, i.e. use m = 0.60, gives the results shown in table 3.5

Number of deaths         0        1        2        3     4 or more   Total

Poisson probability    0.5488   0.3293   0.0988   0.0197    0.0034    1.0000
Poisson frequency      109.8     65.9     19.8      3.9      0.68       200
Actual frequency        109      65       22        3         1         200

Table 3.5

Comparison of the actual pattern of variation shows how closely it follows


the basic Poisson law, indicating that the observed differences between the
corps are entirely due to chance or a basic law of nature.
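The expected frequencies of table 3.5 are easily regenerated without tables; a Python sketch (ours), which reproduces the book's figures to within rounding (the last two entries come out as 4.0 and 0.7 against the tabulated 3.9 and 0.68):

```python
from math import exp, factorial

m, total = 0.60, 200  # mean deaths per corps per year; corps-years observed
expected = [total * m ** x * exp(-m) / factorial(x) for x in range(4)]
expected.append(total - sum(expected))  # the "4 or more" class

print([round(e, 1) for e in expected])  # [109.8, 65.9, 19.8, 4.0, 0.7]
```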

3. Outbreaks of War
The data in table 3.6 (from Mathematical Statistics by J. F. Ractliffe, O.U.P.) give
the number of outbreaks of war each year between the years 1500 and 1931
inclusive.

Number of outbreaks of war     0     1    2    3    4   5   Total

Frequency                     223   142   48   15   4   0    432

Table 3.6

Setting up a hypothesis that war was equally likely to break out at any instant
of time during this 432-year period would give rise to a Poisson distribution. The
fitting of this Poisson distribution to the data gives a method of testing this
hypothesis.
The average number of outbreaks/year = 0.69 ≈ 0.70
Using table 2* of statistical tables, table 3.7 gives a comparison of the actual
variation with that of the Poisson. Again comparison shows the staggering fact
that life has closely followed this basic law of variation.

Number of
outbreaks of war         0        1        2        3        4     5 or more   Total

Poisson probability    0.4966   0.3476   0.1217   0.0283   0.0050    0.0008    1.00
Poisson frequency      214.5    150.2     52.6     12.2      2.2       0.3      432
Actual frequency        223      142      48       15        4         0        432

Table 3.7

4. Demand for Spare Parts for B-47 Aircraft Airframe


Item                                    Units demanded     Number of weeks
                                          per week       Observed   Poisson (a)

Seal: $3 each (1AFE 15-24548-501)            0              48          46
                                             1              12          16
                                             2 (b)           2           3
                                             3               2           —
                                            50 (c)           1           —
Mean demand per week 0.3 (c)

Dome assembly: $610 each                     0              33          26
(1AFE 4-2608-826)                            1              17          24
                                             2               7          11
                                             3               5           3
                                             4 (d)           2           1
                                             6               1           —
Mean demand per week 0.9

Boost assembly, elevator control:            0              20          17
$800 each (1AFE 15-24377-27, and             1              22          23
substitute -20, -504)                        2              13          15
                                             3               5           7
                                             4               3           2
                                             5 (e)           2           1
Mean demand per week 1.3

Table 3.8. Observed frequencies of demand compared with derived Poisson distributions
(a) Computed by assuming that the observed mean demand per week is the mean of the Poisson distribution.
(b) Two units or more.
(c) A demand of 50 units by a single aircraft was recorded on 23 December 1953. The mean used to fit the Poisson distribution (0.3) was obtained omitting this demand.
(d) Four units or more.
(e) Five units or more.

The actual demands per week at the MacDill Air Force Base for three spares
for the B-47 airframe over a period of 65 weeks are given in table 3.8.
The Poisson frequencies are obtained from the statistical tables, using the
observed mean demand for each item, and table 3.8 compares the actual usage
distribution with the fitted Poisson distribution. It will be seen that these
distributions agree fairly well with the actual demands.

5. Spontaneous Ignitions in an Explosives Factory


The distribution of the number of spontaneous ignitions per day in an explosives
factory is shown in table 3.9 and covers a period of 250 days. The Poisson
frequencies, using the same mean number of explosions per day, have been
calculated and the fit found to be good. This implies that the explosions occur
at random, thus making it very unlikely that there is any systematic cause of
ignition.

Number of ignitions    Observed number of days    Poisson number of days

0                               75                        74.2
1                               90                        90.1
2                               54                        54.8
3                               22                        22.2
4                                6                         6.8
5                                2                         1.6
6 or more                        1                         0.4

Table 3.9. Mean number of ignitions per day = 1.216.

Authors’ Special Note


In all the foregoing examples, the (actual) observed distribution is compared
with the (theoretical) expected distribution assuming the null hypothesis to be
true. It should be stressed here that the degree of agreement between the
observed and theoretical distributions can only be assessed by special tests,
called significance tests. These tests will be carried out in chapter 8 later in the
book.

3.3 Problems for Solution


1. A book of 600 pages contains 600 misprints distributed at random. What is
the chance that a page contains at least two misprints?

2. If the chance that any one of ten telephone lines is busy at any instant is 0.2,
what is the chance that five of the lines are busy?

3. A sampling inspection scheme is set up so that a sample of ten components is


taken from each batch supplied and, if one or more defectives are found, the
batch is rejected. If the supplier's batches are (a) 10% and (b) 20% defective,
what percentage of the batches will be rejected?

4. In a group of five machines, which run independently of each other, the


chance of a breakdown on each machine is 0.20. What is the probability of
breakdown of 0, 1, 2, 3, 4, 5 machines? What is the expected number of
breakdowns?

5. In a quality control scheme, samples of five are taken from the production at
regular intervals of time.
What number of defectives in the samples will be exceeded 1 in 20 times if the
process average defective rate is (a) 10%, (b) 20%, (c) 30%?

6. In a process running at 20% defective, how often would you expect, in a
sample of 20, that the rejects would exceed four?

7. From a group of eight male operators and five female operators a committee
of five is to be formed. What is the chance of
(a) all five being male?
(b) all five being female?
(c) how many ways can the committee be formed if there is exactly one
female on it?
8. In 1000 readings of the results of trials for an event of small probability,
the frequencies fi and the numbers xi of successes were:

xi:    0     1     2    3    4   5   6   7
fi:   305   365   210   80   28   9   2   1

Show that the expected number of successes is 1.2 and calculate the expected
frequencies assuming a Poisson distribution.
Calculate the variance of the distribution.

3.4 Solutions to the Problems


1. Assuming an average of one misprint per page, use of Poisson table 2* gives
P(2 or more misprints) = 0.2642

2. P(5 lines busy) = C(10, 5) 0.2^5 0.8^5 = 0.0264 from table 1* in statistical tables.

3. (a) Sample size n = 10


Probability of defective = p = 0.10
Reject on one or more defectives in sample of 10
From table 1*
Probability of finding one or more defectives in 10 = 0.6513
Percentage of batches rejected = 65.13
(b) Sample size n = 10
Probability of a defective p = 0.20
Reject on one or more defectives in sample of 10
From table 1*
Probability of finding one or more defective in 10 = 0.8926
Percentage of batches rejected = 89.26

4. n = 5, p = 0.20. From the statistical tables, the probabilities of 0, 1, 2, 3, 4, 5
machines breaking down have been calculated and are given in table 3.10.

Number of machines    Probability of
broken down           this number

0                        0.33
1                        0.41
2                        0.20
3                        0.05
4                        0.01
5                        0.00 (approximately)

Table 3.10

Expected number of breakdowns = np = 5 x 0.20 = 1

5. (a) n=5
p=0.10
From table 1*
Probability of exceeding 1 = 0.0815
Probability of exceeding 2 = 0.0086
1 in 20 times is a probability of 0.05
Number of defectives exceeded 1 in 20 times is greater than 1 but less
than 2.

(b) n=5
p= 0.20
From table 1*
Probability of more than 2 = 0.0579
Number of defectives exceeded 1 in 20 times (approximately) is 2

(c) n=5
p = 0.30

From table 1*
Probability of more than 3 = 0.0308
Number of defectives exceeded 1 in 20 times is nearly 3

6. n= 20
p= 0.20

From table 1*
Probability of more than four rejects = 0.3704
Four will be exceeded 37 times in 100

7. (a) M = 8, N − M = 5, N = 13, n = 5
x = 5

Probability

P(5) = C(8, 5) C(5, 0) / C(13, 5) = 56/1287 = 0.0435

(b) M = 8, N − M = 5, N = 13, n = 5
x = 0 (no males, i.e. all five female)

Probability

P(0) = C(8, 0) C(5, 5) / C(13, 5) = 1/1287 = 0.0008
(c) Number of ways one female can be chosen from five

= C(5, 1) = 5

Number of ways four males can be chosen from eight

= C(8, 4) = (8 × 7 × 6 × 5)/(4 × 3 × 2 × 1) = 70

Total number of ways = 5 × 70 = 350
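A quick computational check of the three answers (our sketch, not part of the original solutions):

```python
from math import comb

p_all_male = comb(8, 5) * comb(5, 0) / comb(13, 5)    # (a) 56/1287
p_all_female = comb(8, 0) * comb(5, 5) / comb(13, 5)  # (b) 1/1287
ways_one_female = comb(5, 1) * comb(8, 4)             # (c) 5 x 70

print(round(p_all_male, 4), round(p_all_female, 4), ways_one_female)
# 0.0435 0.0008 350
```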
8. With the assumed mean taken at x = 1 (so u = x − 1):

   x       f       u       uf      u²f
   0      305     −1     −305      305
   1      365      0        0        0
   2      210     +1     +210      210
   3       80     +2     +160      320
   4       28     +3      +84      252
   5        9     +4      +36      144
   6        2     +5      +10       50
   7        1     +6       +6       36

       Σf = 1000       Σuf = 201   Σu²f = 1317

Table 3.11

Expected number = x̄ = 1 + 201/1000 = 1.201 ≈ 1.2

Variance = Σu²f/Σf − (Σuf/Σf)² = 1317/1000 − (201/1000)² = 1.277
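The mean, the variance, and the expected Poisson frequencies asked for can all be computed directly; a Python sketch (ours):

```python
from math import exp, factorial

freqs = [305, 365, 210, 80, 28, 9, 2, 1]  # frequencies of x = 0, 1, ..., 7
n = sum(freqs)
mean = sum(x * f for x, f in enumerate(freqs)) / n
variance = sum(f * (x - mean) ** 2 for x, f in enumerate(freqs)) / n

m = 1.2  # expected number of successes, as the problem asks us to show
expected = [round(n * m ** x * exp(-m) / factorial(x)) for x in range(8)]

print(round(mean, 3), round(variance, 3), expected)
# 1.201 1.277 [301, 361, 217, 87, 26, 6, 1, 0]
```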

3.5 Practical Laboratory Experiments and Demonstrations


The authors feel that of the three distributions the binomial lends itself best to
demonstration by laboratory experiments. Attempts to demonstrate a true
hypergeometric or Poisson distribution tend to be either very tedious and/or
relatively expensive.
However, with the use of the binomial sampling boxes† the basic concepts
and mechanics of the binomial can be speedily and effectively demonstrated.
The use of both 6-sided and/or the special decimal dice also gives a simple
method for carrying out binomial distribution experiments.
Appendix 1 contains the full details of the experiment, together with a
sample set of results.
† Available in two sizes from Technical Prototypes Ltd, 1A, Westholme Street,
Leicester.

Appendix 1—Experiment 7 and Sample Results

Binomial Distribution
Number of persons: 2 or 3.

Object
The experiment is designed to demonstrate the basic properties of the binomial
law.

Method
Using the binomial sampling box, take 50 samples of size 10 from the population,
recording in table 18 the number of coloured balls found in each sample.
(Note: the proportion of coloured (i.e. other than white) balls is 0.15.)

Analysis
1. Group the data of table 18 into the frequency distribution, using the top
part of table 19.
2. Obtain the experimental probability distribution of the number of coloured
balls found per sample and compare it with the theoretical probability
distribution.
3. Combine the frequencies for all groups, using the lower part of table 19,
and obtain the experimental probability distribution for these combined results.
Again, compare the observed and theoretical probability distributions.
4. Enter, in table 20, the total frequencies obtained by combining individual
groups’ results. Calculate the mean and standard deviation of this distribution
and compare them with the theoretical values given by np and √[np(1 − p)]
respectively where, in the present case, n = 10 and p = 0.15.

Sample Results

[Table 18 recording grid for the 50 samples, in rows 1–10, 11–20, 21–30, 31–40,
41–50; the individual entries are not reproduced here.]

Table 3.12 (Table 18 of the laboratory manual)

Summarise these data in table 19.


[Table 19 grid: tally marks and frequencies of the number of coloured balls per
sample (0 to 10) for each group of 50 samples, the experimental probability
distribution for each group, the theoretical probability distribution, and the
combined frequencies and probabilities over all groups; the individual entries
are not reproduced here.]

Table 3.13 (Table 19 of the laboratory manual)

[Table 20 grid: the combined frequency distribution of the number of coloured
balls per sample, from which the mean and standard deviation are calculated;
the individual entries are not reproduced here.]

Table 3.14 (Table 20 of the laboratory manual)

For the distribution of number of coloured balls per sample of 10

Observed mean = Σxf/Σf = 1.618

Observed standard deviation = √[Σx²f/Σf − (Σxf/Σf)²] = √[1619/400 − (1.618)²] = 1.20

Theoretical mean = np = 10 × 0.15 = 1.5

Theoretical standard deviation = √[np(1 − p)] = √(10 × 0.15 × 0.85)

= √1.275 = 1.13
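The comparison asked for in step 4 can also be rehearsed by computer; a Python sketch (ours) giving the theoretical values together with one simulated run of 50 samples (being random, the simulated mean will vary from run to run just as the groups' results do):

```python
import math
import random

n, p = 10, 0.15
theoretical_mean = n * p                     # 1.5
theoretical_sd = math.sqrt(n * p * (1 - p))  # sqrt(1.275), approximately 1.13

random.seed(1)  # fixed seed so the run is repeatable
counts = [sum(random.random() < p for _ in range(n)) for _ in range(50)]
sample_mean = sum(counts) / len(counts)

print(theoretical_mean, round(theoretical_sd, 2), round(sample_mean, 2))
```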
4 Normal distribution

4.1 Syllabus Covered


Equation of the normal curve; area under the normal curve; ordinates of the
normal curve; standardised normal variate; use of tables of area; fitting of
normal distribution to data; normal probability paper.

4.2 Résumé of Theory


4.2.1 Introduction
The normal, or gaussian, distribution occupies a central place in the theory of
statistics. It is an adequate, and often very good, approximation to other
distributions which occur; examples of this are given in chapters 5 and 6. Many
of the advanced methods of statistics require the assumption that the basic
variables being used are normally distributed; the purpose of this is usually to
allow standard tests of significance to be applied to the results.
It often happens, however, that data summarised into a frequency
distribution (see chapter 2) are more or less normally distributed; that is, some
central value of the variable has the highest frequency of occurrence and the
class frequencies diminish near enough symmetrically on either side of the
central value. In such cases, it is very convenient to use the properties of the
normal distribution to describe the population. This chapter deals with the
main properties of the normal distribution.

4.2.2 Equation of the Normal Curve


Chapter 2 mentioned that, the greater the number of readings that are taken,
the more the outline of the plotted histogram tends to a smooth curve. If the
population is actually normal then this limiting shape of the histogram will be
similar to that in figure 4.1.
The curve can be described in terms of an equation so that the height of the
curve, y, can be expressed in terms of the value of the measured variable, x.
80
Normal Distribution 81

This equation is

y = [1/(σ√(2π))] e^(−(x − μ)²/2σ²)

where μ is the mean of the variable x
σ is the standard deviation of x
e is the well-known mathematical constant (≈ 2.718)
π is another well-known mathematical constant (≈ 3.142)

This equation can be used to derive various properties of the normal distribution.
A useful one is the relation between area under the curve and deviation from the
mean, but before looking at this we need to refer to a standardised variable.

Figure 4.1

4.2.3 Standardised Variate


Any random variable, x, having mean, μ, and standard deviation, σ, can be
expressed in standardised form, i.e. x is measured from μ in multiples of σ. The
standardised variable is therefore given by (x − μ)/σ and is dimensionless.
In particular, if x is a normal variate then

u = (x − μ)/σ

is a standardised normal variate.


Tables 3, 4 and 5* in statistical tables are tabulated in terms of this
standardised normal variate, u, and therefore they apply to any normal variate.

4.2.4 Area under Normal Curve


The total area under the normal curve is unity (as is the case for any probability
density function) and the area under the curve between two values of x, say a
and b (shown shaded in figure 4.2) gives the proportion of the population having

Prob(a < x < b)

Figure 4.2

values between a and b. This is equal to the probability that a single random
value of x will be bigger than a but less than b.
By standardising the variable and using the symmetry of the distribution,
table 3* can be used to find this probability as well as the unshaded areas in each
tail.

4.2.5 Percentage Points of the Normal Distribution


Table 4* gives percentage points (this is the common name although it is
actually 'proportion points' which are tabulated) of the normal distribution;
the α-proportion point or 100α percentage point is the value of u, denoted
by u_α, which is exceeded with probability α. Negative values of u_α,
corresponding to α greater than 0.50, can be found by symmetry.

4.2.6 Ordinates of the Normal Curve


Table 5* gives the height of the normal curve for values of u and by plotting
a selection of points, the outline of a normal distribution with any required
mean and standard deviation can be drawn.

4.2.7 Fitting a Normal Distribution to a Set of Data


Observed data will often be presented in the form of a frequency distribution
together with a histogram. A normal distribution can be fitted to such a
summary. The continuous curve outlining the shape of the normal distribution
with the same (or any other) mean and standard deviation can be superimposed
on the histogram using the ordinates in table 5*.
However, a more usual approach is to find the expected frequencies in each
class interval of the observed data assuming that the population is normal with
some given mean and standard deviation. This is best done using table 3* of
areas and gives a basis for testing whether the assumption of normality is
reasonable for the observed data (see chapter 8 for an example).

4.2.8 Arithmetic Probability Paper


This is graph paper with a special scale which makes the normal distribution,
when plotted cumulatively, appear as a straight line. One axis has a linear scale
and on this one convenient values of the variate are plotted. The other scale is
usually marked in percentages which represent the probability that the variate
takes on a value less than or equal to each of the plotted values.
Any observed data can be plotted on this paper, the straighter the line the
more nearly normal is the distribution. Unfortunately the straightness of the
line is rather a subjective judgement.
If a variate is obviously not normal, a suitable transformation can sometimes
be found which is distributed approximately normally; that is the logarithm, say,
(or the square root or the reciprocal, etc.), of each observation is used as the

variable. By plotting these new variables on probability paper it can be seen


whether any of the transformations gives a straight line.

4.2.9 Worked Examples


1. What is the chance that a random standardised normal variate

(a) will exceed 1.0?


(b) will be less than 2.0?
(c) will be less than —2.0?
(d) will be between —1.5 and +0.5?

Table 3* can be used to find these probabilities and it is useful to draw a


diagram to ensure that the appropriate areas are found. In figure 4.3 the
shaded areas represent the required answer. Remember that table 3* gives the
probability of exceeding the specified value of u for positive values of u only.

Figure 4.3 (sketches of the shaded areas for (a), (b), (c) and (d))

(a) u=1.0
Area = 0.1587
(b) u=2.0
Area in right tail = 0.02275
Thus shaded area = 1 — 0.02275 = 0.97725
(c) By symmetry, the area to the left of u = −2 is the same as the area to the right
of u = +2.
Thus the shaded area = 0.02275
(d) Area above u = +0.5 is 0.3085
Area below u = —1.5 is 0.0668
Total unshaded area= 0.3753
shaded area = 0.6247
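The same four areas can be checked numerically. As an illustrative sketch (not the book's method, which uses table 3*), the standard normal integral can be written in Python with the error function:

```python
import math

def phi(u):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

p_a = 1 - phi(1.0)            # (a) P(U > 1.0)
p_b = phi(2.0)                # (b) P(U < 2.0)
p_c = phi(-2.0)               # (c) P(U < -2.0)
p_d = phi(0.5) - phi(-1.5)    # (d) P(-1.5 < U < 0.5)
```

The results agree with the tabled answers 0.1587, 0.97725, 0.02275 and 0.6247 to four decimal places.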
84 Statistics: Problems and Solutions

2. Jam is packed in tins of nominal net weight 1 kg. The actual weight of jam
delivered to a tin by the filling machine is normally distributed about the set
weight with standard deviation of 12 g.
(a) If the set, or average, filling of jam is 1 kg what proportion of tins
contain
(i) less than 985 g?
(ii) more than 1030 g?
(iii) between 985 and 1030 g?

(b) If not more than one tin in 100 is to contain less than the advertised
net weight, what must be the minimum setting of the filling machine in order to
achieve this requirement?

(a) In solving such problems as these, it is always useful to draw a sketch


(figure 4.4) to ensure that the appropriate area under the curve is found from
tables.* In each case the shaded area is the required solution.

985 1000    1000 1030

(i) (ii)

Figure 4.4

(i) u = (985 − 1000)/12 = −1.25
Using table 3* and the symmetry of the curve, the required proportion is 0.1056

(ii) u = (1030 − 1000)/12 = 2.5

This corresponds to a right-hand tail area of 0.00621

(iii) To find a shaded area as in this case, the tail areas are found directly
from tables* and then subtracted from the total curve area (unity).

The lower and upper tail areas have already been found in (i) and (ii) and
thus the solution is

1 —(0.1056 + 0.00621) = 1—0.1118 = 0.8882


(b) In this case, the area in the tail is fixed and in order to find the value of
the mean corresponding to this area, the cut-off point (1000 g) must be
expressed in terms of the number of standard deviations that it lies from the
mean.

0.01

1000
Figure 4.5

From table 4* (or table 3* working from the body of the table outwards),
1% of a normal distribution is cut off beyond 2.33 standard deviations from the
mean.
The required minimum value for the mean is thus
1000 + 2.33 x 12 = 1028 g= 1.028 kg
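Both parts of the jam-tin example can be sketched in code. This is an illustration only (the book works from tables 3* and 4*); the 1% point 2.3263 is the standard tabled value:

```python
import math

def phi(u):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

mu, sigma = 1000.0, 12.0          # set weight (g) and standard deviation (g)
p_under = phi((985 - mu) / sigma)          # (i) P(weight < 985 g)
p_over = 1 - phi((1030 - mu) / sigma)      # (ii) P(weight > 1030 g)
p_between = 1 - p_under - p_over           # (iii)

# (b) minimum setting so that at most 1 tin in 100 is below 1000 g:
u_001 = 2.3263                    # u value cutting off 1% in one tail
min_setting = 1000 + u_001 * sigma
```

`min_setting` comes out at about 1028 g, in agreement with the worked answer.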
3. The data from problem 1, chapter 2 (page 46), can be used to show the
fitting of a normal distribution. The observed and fitted distributions are also
shown plotted on arithmetic probability paper.
The mean of the distribution was 0.087 min and the standard deviation
0.013 min. The method of finding the proportion falling in each class of a
normal distribution with these parameters is shown in table 4.1. The expected
class frequencies are found by multiplying each class proportion by the total
observed frequency. Notice that the total of the expected normal frequencies is
not 60. The reason is that about ¼% of the fitted distribution lies outside the
range (0.045 to 0.125) that has been considered.
Table 4.2 shows the observed and expected normal class frequencies in
cumulative form as a percentage of the total frequency. Figure 4.6 shows these
two sets of data superimposed on the same piece of normal (or arithmetic)
probability paper.
The dots in figure 4.6 represent the observed points and the crosses represent
the fitted normal frequencies. Note that the plot of the cumulative normal
percentage frequencies does not quite give a straight line. The reason for this
is that the ¼% of the normal distribution having values less than 0.045 has not
been included. If this ¼% were added to each of the cumulative percentages in
the right-hand column of table 4.2 then a straight-line plot would be obtained.
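The expected-frequency calculation behind table 4.1 can be sketched numerically. This is an illustration, not the book's working (which reads the areas from table 3*); the class boundaries, mean and standard deviation are those quoted above:

```python
import math

def phi(u):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

mean, sd, n = 0.087, 0.013, 60
boundaries = [0.045 + 0.01 * i for i in range(9)]   # 0.045, 0.055, ..., 0.125

expected = []
for lo, hi in zip(boundaries, boundaries[1:]):
    # area of the fitted normal falling in the class (lo, hi)
    area = phi((hi - mean) / sd) - phi((lo - mean) / sd)
    expected.append(area * n)

total = sum(expected)   # a little under 60: some area lies outside the range
```

The total of the expected frequencies reproduces the 59.8 of table 4.1.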

Table 4.1

Class          Upper class   Standardised   Area above u           Area in   Expected normal   Observed
               boundary      boundary, u                           class     class frequency   frequency

0.035-0.045    0.045         −3.23          1 − 0.0006 = 0.9994                                0
0.045-0.055    0.055         −2.46          1 − 0.0069 = 0.9931    0.0063    0.4               0
0.055-0.065    0.065         −1.69          1 − 0.0455 = 0.9545    0.0386    2.3               5
0.065-0.075    0.075         −0.92          1 − 0.1788 = 0.8212    0.1333    8.0               4
0.075-0.085    0.085         −0.15          1 − 0.4404 = 0.5596    0.2616    15.7              14
0.085-0.095    0.095         0.62           0.2676                 0.2920    17.5              23
0.095-0.105    0.105         1.38           0.0838                 0.1838    11.0              9
0.105-0.115    0.115         2.15           0.0158                 0.0680    4.1               5
0.115-0.125    0.125         2.92           0.0018                 0.0140    0.8               0

Totals                                                             0.9976    59.8              60
Class          Observed                              Fitted normal
               Frequency   Cumulative   Cumulative   Frequency   Cumulative   Cumulative
                           frequency    %                        frequency    %

0.045-0.055    0           0            0            0.4         0.4          0.7
0.055-0.065    5           5            8.3          2.3         2.7          4.5
0.065-0.075    4           9            15.0         8.0         10.7         17.8
0.075-0.085    14          23           38.3         15.7        26.4         44.0
0.085-0.095    23          46           76.7         17.5        43.9         73.2
0.095-0.105    9           55           91.7         11.0        54.9         91.5
0.105-0.115    5           60           100.0        4.1         59.0         98.3
0.115-0.125    0           60           100.0        0.8         59.8         99.7

Table 4.2

Cumulative percentage frequency

0.055  0.065  0.075  0.085  0.095  0.105  0.115  0.125


Upper class boundaries
Figure 4.6

A further point to note is that the cumulative frequencies are plotted against
the upper class boundaries (not the mid point of the class) since those are the
values below which lie the appropriate cumulative frequencies.
In addition, if the plotted points fall near enough on a straight line, which
implies approximate normality of the distribution, the mean and standard
deviation can be estimated graphically from the plot. To do this the best
straight line is drawn through the points (by eye is good enough). This straight
line will intersect the 16%, 50% and 84% lines on the frequency scale at three
points on the scale of the variable.
The value of the variable corresponding to the 50% point gives an estimation
of the median, which is the same as the mean if the distribution being plotted
is approximately symmetrical.
The horizontal separation between the 84% and 16% intercepts is equal to

2o for a straight line (normal) plot and so half of this distance gives an estimate
of the standard deviation.
Applying this to the fitted normal points, the mean is estimated as 0.087
and the standard deviation comes out as 0.5 × (0.100 − 0.074) = 0.013, the
figures used to derive the fitted frequencies in the first place. The small bias
referred to earlier caused by omitting the bottom ¼% of the distribution in the
plot has had very little influence on the estimate in this case.
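The 16%/50%/84% rule can be verified numerically. As a sketch (inverting the normal CDF by bisection, which is not how one would do it graphically on probability paper):

```python
import math

def phi(u):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def phi_inv(p, lo=-8.0, hi=8.0):
    """Invert the standard normal CDF by bisection."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) < p else (lo, mid)
    return 0.5 * (lo + hi)

mean, sd = 0.087, 0.013
# variate values at the 16%, 50% and 84% points of the fitted normal
x16 = mean + sd * phi_inv(0.16)
x50 = mean + sd * phi_inv(0.50)
x84 = mean + sd * phi_inv(0.84)

est_mean = x50                 # the median equals the mean for a symmetric curve
est_sd = 0.5 * (x84 - x16)     # half the 16%-84% horizontal separation is one sigma
```

The estimates recover the mean and standard deviation used to draw the line (the 16% and 84% points actually sit 0.9945 standard deviations from the mean, so the rule is very slightly biased).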

4.3 Problems for Solution

(** denotes more difficult problems)


1. For any normal distribution, what proportion of it is
(a) more than twice the standard deviation above the mean?
(b) further than half the standard deviation below the mean?
(c) within one and a half standard deviations of the mean?

2. A normal distribution has a mean of 56 and a standard deviation of 10. What


proportion of it

(a) exceeds 68?


(b) is less than 40?
(c) is contained between 56 and 65?
(d) is contained between 60 and 65?
(e) is contained between 52 and 65?

3. Problem 8 of chapter 2 (page 48) gives the intelligence quotients of a sample


of 100 children. The mean and standard deviation of these numbers are 99.3
and 13.4, respectively, and the histogram indicates that normality is a good
assumption for the distribution of intelligence quotient (I.Q.).
(a) What proportion of all children can be expected to have 1.Q’s
(i) greater than 120?
(ii) less than 90?
(iii) between 70 and 130?
.(b) What I.Q. will be exceeded by
(i) 1% of children?
(ii) 0.1% of children?
(iii) 90% of children?
(c) Between what limits will 95% of children’s I.Q. values lie?

What assumptions have been made in obtaining these answers?



4. A process of knitting stockings should give a mean part-finished stocking


length of 1.45 m with a standard deviation of 0.013 m. Assuming that the
distribution of length is normal,
(a) if a tolerance of 1.45 m + 0.020 m is fixed, what total percentage of
oversize and undersize stockings can be expected?
(b) What tolerance can be worked to if not more than a total of 5% of
stockings undersized or oversized can be accepted?
(c) if the mean part-finished length is actually 1.46 m, what proportion of
the output are undersized or oversized stockings, allowing a tolerance of
1.45 m + 0.025 m.

5. The door frames used in an industrialised building system are of one


standard size. If the heights of adults are normally distributed, men with a mean
of 1.73 m and standard deviation of 0.064 m and women with a mean of 1.67 m
and standard deviation of 0.050 m,
(a) what proportion of men will be taller than the door frames if the standard
frame height is 1.83 m?
(b) what proportion of women will be taller than the standard frame height of
1.83 m?
(c) what proportion of men will have a clearance of at least 13 cm ona
frame height of 1.83 m?
(d) what should the minimum frame height be such that at most one man in
a thousand will be taller than the frame height?
(e) if women outnumber men (e.g. in a large department store) in the ratio
19: 1, for what proportion of people would a frame height of 1.83 m be too
low?

6. The data summarised in table 4.3 come from the analysis of 53 samples of
rock taken every few feet during a tin-mining operation. The original data for
each sample were obtained in terms of pounds of tin per ton of host rock but
since the distribution of such a measurement from point to point is quite skew,
the data were transformed by taking the ordinary logarithms of each sample
value and summarising the 53 numbers so obtained into the given frequency
distribution.
Fit a normal distribution to the data.

**7. The individual links used in making chains have a normal distribution of
strength with mean of 1000 kg and standard deviation of 50 kg.
If chains are made up of 20 randomly chosen links

(a) what is the probability that such a chain will fail to support a load of
900 kg?

Logarithm of ore    Frequency of given
grade               ore grade

0.6-0.799            1
0.8-0.999            3
1.0-1.199            6
1.2-1.399            8
1.4-1.599           12
1.6-1.799           11
1.8-1.999            6
2.0-2.199            4
2.2-2.399            2

                    53

Table 4.3

(b) what should the minimum mean link strength be for 99.9% of all chains
to support a load of 900 kg?
(c) what is the median strength of a chain?

**8. The standardised normal variate, u, having mean of 0 and variance of 1, has
probability density function

φ(u) = [1/√(2π)] e^(−u²/2),   −∞ < u < ∞

If this distribution is truncated at the point u_α (i.e. the shaded portion, α,
of the distribution above u_α is removed; see figure 4.7), obtain an expression in
terms of α and u_α showing the amount by which the mean of the truncated
distribution is displaced from u = 0.

φ(u)

Figure 4.7    0    u_α

9. In a bottle-filling process, the volume of liquid delivered to a bottle is


normally distributed with mean and standard deviation of 1 litre and 5 ml
respectively. If all bottles containing less than 991 ml are removed and emptied,
and the contents used again in the filling process, what will be the average volume
of liquid in bottles offered for sale?

4.4 Solutions to Problems

1. Use table 3* of statistical tables.

(a) The proportion more than two standard deviations above the mean is 0.02275 (from the table).
(b) From the symmetry of the normal distribution, 0.3085 of the area is
further than 0.5 standard deviations below the mean.
(c) 0.0668 of the distribution is beyond one and a half standard deviations
from the mean in each tail. Thus the proportion within 1.5 standard deviations is
1 —(0.0668 + 0.0668) = 0.8664

2. (a) u = (68 − 56)/10 = 1.2
Thus, 0.1151 of the area exceeds 68

Figure 4.8 56 68

(b) u = (40 − 56)/10 = −1.6
Thus, 0.0548 of the distribution takes values less than 40.

Figure 4.9 40 56

(c) For 65, u = (65 − 56)/10 = 0.9
Area in upper tail above 65 = 0.1841
For 56,u=0
Required shaded area = 0.5000 — 0.1841 = 0.3159

Figure 4.10 56 65

(d) For 60, u = (60 − 56)/10 = 0.4

Thus, area above 60 is 0.3446. Area above 65 is found in (c) to be 0.1841.


Thus, proportion between 60 and 65 is 0.3446 —0.1841 = 0.1605.

Figure 4.11 5660 65

(e) For 52, u = (52 − 56)/10 = −0.4
From symmetry, area below 52 = 0.3446.
From (c) area above 65 = 0.1841. Thus, proportion between 52 and
65 = 1 —(0.3446 + 0.1841) = 1—0.5287 = 0.4713

Figure 4.12 52 56 65
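All five parts of this solution reduce to differences of the normal CDF. A sketch (for checking only; the book's method reads the tail areas from table 3*):

```python
import math

def phi(u):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

mu, sigma = 56, 10
p_a = 1 - phi((68 - mu) / sigma)                        # (a) exceeds 68
p_b = phi((40 - mu) / sigma)                            # (b) less than 40
p_c = phi((65 - mu) / sigma) - phi((56 - mu) / sigma)   # (c) between 56 and 65
p_d = phi((65 - mu) / sigma) - phi((60 - mu) / sigma)   # (d) between 60 and 65
p_e = phi((65 - mu) / sigma) - phi((52 - mu) / sigma)   # (e) between 52 and 65
```

The values agree with 0.1151, 0.0548, 0.3159, 0.1605 and 0.4713 above.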

3. (a) (i) For I.Q. = 120, u = (120 − 99.3)/13.4 = 1.54


Proportion greater than 120 is 0.0618, say, 0.06

Figure 4.13 99.3 120

(ii) I.Q. = 90, u = (90 − 99.3)/13.4 = −0.69


By symmetry, proportion less than 90 = 0.2451, say 0.24, since u is nearer to
—0.694

Figure 4.14 90 99.3



(iii) I.Q. = 130, u = (130 − 99.3)/13.4 = 2.29

Area above u = 2.29 is 0.0110.

I.Q. = 70, u = (70 − 99.3)/13.4 = −2.19

Area below u = —2.19 is 0.0143


Proportion of children with 1.Q. values between 70 and 130 is

1 —(0.0110 + 0.0143) = 1 — 0.0253 = 0.975

Figure 4.15

(b) (i) For all normal distributions, 1% in the tail occurs at a point 2.33 standard
deviations from the mean. (See table 4* or use table 3* in reverse.)
Thus, 1% of all children will have an I.Q. value greater than

99.3 + 2.33 × 13.4 = 99.3 + 31.2 = 130.5

α = 0.01

Figure 4.16    99.3    ?

(ii) For a= 0.001 (0.1%), the corresponding u-value is 3.09.


Thus one child in 1000 will have an I.Q. value greater than

99.3 + 3.09 x 13.4 = 99.3 + 41.5 = 140.8

α = 0.001

Figure 4.17 99.3 ?

(iii) Ten per cent of children will have I.Q. values less than the value which
90% exceed.
The u-value corresponding to this point is —1.28 and converting this into the
scale of I.Q. gives

99.3 − 1.28 × 13.4 = 99.3 − 17.2 = 82.1



Figure 4.18    ?    99.3

(c) We need to find the lower and upper limits such that the shaded area is 95%
of the total. There are a number of ways of doing this, depending on how the
remaining 5% is split between the two tails of the distribution. It is usual to
divide them equally. On this basis, each tail will contain 0.025 of the total area
and here the required limits will be 1.96 standard deviations below and above
the mean respectively.
Thus, 95% of children will have I.Q. values between
99.3 − 1.96 × 13.4 and 99.3 + 1.96 × 13.4, i.e.
99.3 − 26.2 and 99.3 + 26.2

Figure 4.19    ?    99.3    ?

We have assumed that the original sample of 100 children was taken randomly
and representatively from the whole population of children about whom the
above probability statements have been made. This kind of assumption should
always be carefully checked for validity in practice.
In addition, the mean and standard deviation of the sample were used as
though they were the corresponding values for the population. In general, they
will not be numerically equal, even for samples as large as 100, and this will
introduce errors into the statements made. However, the answers will be of the
right order of magnitude, which is usually all that is required in practice.
The assumption of normality of the population has already been mentioned.
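The inverse-reading of the tables used in parts (b) and (c) can be sketched as follows; the bisection inversion of the CDF stands in for reading table 3* "from the body of the table outwards":

```python
import math

def phi(u):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def phi_inv(p, lo=-8.0, hi=8.0):
    """Invert the standard normal CDF by bisection."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) < p else (lo, mid)
    return 0.5 * (lo + hi)

mean, sd = 99.3, 13.4
iq_top1 = mean + sd * phi_inv(0.99)     # exceeded by 1% of children
iq_top01 = mean + sd * phi_inv(0.999)   # exceeded by 0.1% of children
iq_90 = mean + sd * phi_inv(0.10)       # exceeded by 90% of children
lower95 = mean + sd * phi_inv(0.025)    # central 95% limits
upper95 = mean + sd * phi_inv(0.975)
```

These reproduce the worked answers 130.5, 140.8 and 82.1, and 95% limits of width 2 × 26.3.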

4. (a) If the mean length is 1.45 m then the maximum deviation allowed for a
stocking to be acceptable is
±0.020/0.013 standard deviations, i.e. u = ±1.54.
The percentage of unacceptable output is represented by the two shaded
areas in figure 4.20 and is 2 x 0.0618 x 100 = 12.36%.

σ = 0.013

Figure 4.20 1.43 1.45 1.47

(b) This time the two shaded areas are each specified to be 0.025 (2½%).
Therefore the tolerance that can be worked to corresponds to u = + 1.96,
ie. to + 1.96 x 0.013 = + 0.025 m, or +25 mm.

0.025 0.025

Figure 4.21 @ 1.45 ?

(c) The lower and upper lengths allowed are 1.425 m and 1.475 m respectively.
The shaded area gives the proportion of stockings that do not meet the standard
when the process mean length is 1.46 m.

For 1.475 m, u = (1.475 − 1.460)/0.013 = 1.15; area = 0.1251

For 1.425 m, u = (1.425 − 1.460)/0.013 = −2.69; area = 0.0036

Total shaded area = 0.1287


Thus nearly 13% of output will not meet the standard.

Figure 4.22 1.425 1.46 1475

5. (a) For 1.83 m, u = (1.83 − 1.73)/0.064 = 1.56

Required proportion = 0.0594


σ = 0.064

Figure 4.23 1.73 1.83



(b) For 1.83 m, u = (1.83 − 1.67)/0.050 = 3.2

Required proportion = 0.00069

σ = 0.050

Figure 4.24 1.67 1.83

(c) Men shorter than 1.83 − 0.13 = 1.70 m will have a clearance of at least
0.13 m.

Corresponding u = (1.70 − 1.73)/0.064 = −0.47

From symmetry, proportion of men with at least 13 cm to spare is 0.3192.

Figure 4.25    1.70    1.73

(d) The frame height which is exceeded by one man in a thousand will be
3.09 standard deviations above the mean height of men, i.e. at

1.73 + 3.09 × 0.064 = 1.93 m

Figure 4.26

(e) For women, 1.83 m corresponds to u = (1.83 − 1.67)/0.050 = 3.2
Proportion of women taller than 1.83 m = 0.00069
For men, 1.83 m corresponds to u = (1.83 − 1.73)/0.064 = 1.56

Proportion of men taller than 1.83 m= 0.0594


Expected proportion of people for whom 1.83 m is too low is
0.00069 × 0.95 + 0.0594 × 0.05 = 0.004, i.e. about
4 people in a 1000.
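The mixture calculation in (e) combines the two tail areas weighted by the 19 : 1 population split; a sketch for checking:

```python
import math

def phi(u):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

p_men = 1 - phi((1.83 - 1.73) / 0.064)     # men taller than the frame
p_women = 1 - phi((1.83 - 1.67) / 0.050)   # women taller than the frame

# women outnumber men 19 : 1, so weight the two proportions 0.95 and 0.05
p_too_low = 0.95 * p_women + 0.05 * p_men
```

The weighted proportion is about 0.004, i.e. roughly 4 people in 1000, as found above.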

The problem can be extended by allowing some people to wear hats as well
as shoes with different heights of heel.
This problem was intended to give practice in using normal tables of area. Any
practical consideration of the setting of standard frame heights would need to
take account of the physiological and psychological needs of human door users,
of economics and of the requirements of the rest of the building system.

6. It is quite possible to use a normal distribution having an arbitrary mean and


standard deviation, but it would make more sense in this case to use the mean
and standard deviation of the observed data. The reason for this is that we are
mainly concerned with testing the assumption of normality without wishing to
specify the parameters.
First the mean and standard deviation are found.

x            f     Coded variable (u)    fu     fu²

0.6-0.799    1     −4                    −4     16
0.8-0.999    3     −3                    −9     27
1.0-1.199    6     −2                    −12    24
1.2-1.399    8     −1                    −8     8
1.4-1.599    12    0                     0      0
1.6-1.799    11    1                     11     11
1.8-1.999    6     2                     12     24
2.0-2.199    4     3                     12     36
2.2-2.399    2     4                     8      32

             53          −33 + 43 = 10          178

Table 4.4

Mean = 1.5 + 0.2 × 10/53 = 1.538

Standard deviation = 0.2 √[(178 − 10²/53)/53]

= 0.2 √3.32 = 0.364
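The coded-variable arithmetic above can be sketched directly; this is an illustration of the same computation, not a different method:

```python
# class frequencies and coded values u, coded about the class centred on 1.5
f = [1, 3, 6, 8, 12, 11, 6, 4, 2]
u = [-4, -3, -2, -1, 0, 1, 2, 3, 4]

n = sum(f)                                             # 53
sum_fu = sum(fi * ui for fi, ui in zip(f, u))          # 10
sum_fu2 = sum(fi * ui * ui for fi, ui in zip(f, u))    # 178

c, x0 = 0.2, 1.5                                       # class width and coding origin
mean = x0 + c * sum_fu / n
sd = c * ((sum_fu2 - sum_fu ** 2 / n) / n) ** 0.5
```

This reproduces the mean 1.538 and standard deviation 0.364 (note the divisor n = 53 rather than n − 1, following the text).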

Using these two values, the areas under the fitted normal curve falling in each
class are found using table 3* of the statistical tables. This operation is carried
out in table 4.5. Note that the symbol u in the table refers to the standardised
normal variate corresponding to the class boundary, whereas in table 4.4 it
represents the coded variable (formed for ease of computation) obtained by

subtracting 1.5 from each class midpoint and dividing the result by 0.2, the
class width.

Class      Class       u        Area above u           Area in      Expected normal
           boundary                                    each class   frequency

0.6-0.8    0.6         −2.58    1 − 0.0049 = 0.9951    0.0163       0.86
0.8-1.0    0.8         −2.03    1 − 0.0212 = 0.9788    0.0482       2.55
1.0-1.2    1.0         −1.48    1 − 0.0694 = 0.9306    0.1068       5.66
1.2-1.4    1.2         −0.93    1 − 0.1762 = 0.8238    0.1758       9.32
1.4-1.6    1.4         −0.38    1 − 0.3520 = 0.6480    0.2155       11.42
1.6-1.8    1.6         0.17     0.4325                 0.1967       10.43
1.8-2.0    1.8         0.72     0.2358                 0.1338       7.09
2.0-2.2    2.0         1.27     0.1020                 0.0676       3.58
2.2-2.4    2.2         1.82     0.0344                 0.0255       1.35

Table 4.5
7. (a) Since a chain is as strong as its weakest link, the chain will fail to support
a load of 900 kg if one or more of its links is weaker than 900 kg.

The probability that a single link is weaker than 900 kg is given by the area
in the tail of the normal curve below

u = (900 − 1000)/50 = −2.0, i.e. a tail area of 0.02275

Thus the probability that a single link does not fail at 900 kg = 0.97725 and the
probability that none of the links fails = 0.97725²⁰. Thus the probability that
a chain of 20 links will not support a load of 900 kg is

1 − (0.97725)²⁰ = 1 − 0.631 = 0.37

σ = 50

900 1000
Figure 4.27 Single link strength

(b) In this case, the probability of a chain supporting a load of 900 kg is


required to be 0.999.

Let p be the probability that an individual link is stronger than 900 kg.
Then we have that
p²⁰ = 0.999
p = 0.99995 (using 5-figure logarithms)

0.000 05    0.999 95


900    ?
Figure 4.28 Single link strength

It follows that the probability of an individual link's being weaker than
900 kg must be at most 0.00005.
Thus 900 kg corresponds to u = −3.9 approximately and the mean link
strength must be at least

900 + 3.9 × 50 = 1095 kg

(c) In the long run, one chain out of every two will be stronger than the
median chain strength.
Let p be the probability that an individual link exceeds the median chain
strength.
Then from p²⁰ = 0.5
p = 0.96594 (using 5-figure logarithms)
and the probability that an individual link is less than the median chain strength
is (1 − p) = 0.0341.

σ = 50

0.0341

u = −1.82    1000
Figure 4.29 Single link strength

Such a tail area corresponds approximately to u =—1.82 and the median


strength of a chain is therefore given by

1000 − (1.82 × 50) = 909 kg
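The weakest-link argument of parts (a), (b) and (c) can be sketched numerically; the bisection inversion of the CDF replaces the table look-ups:

```python
import math

def phi(u):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def phi_inv(p, lo=-8.0, hi=8.0):
    """Invert the standard normal CDF by bisection."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) < p else (lo, mid)
    return 0.5 * (lo + hi)

mu, sigma, n_links = 1000.0, 50.0, 20

# (a) a chain fails at 900 kg if any one of its 20 links is weaker than 900 kg
p_link_fails = phi((900 - mu) / sigma)
p_chain_fails = 1 - (1 - p_link_fails) ** n_links

# (b) each link must exceed 900 kg with probability 0.999**(1/20)
p_b = 0.999 ** (1 / n_links)
mean_b = 900 - sigma * phi_inv(1 - p_b)

# (c) median chain strength: P(all 20 links exceed s) = 0.5
p_link = 0.5 ** (1 / n_links)
median = mu + sigma * phi_inv(1 - p_link)
```

This gives about 0.37 for (a), a minimum mean of roughly 1095 kg for (b), and a median chain strength of about 909 kg for (c).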

8. The density function is

φ(u) = [1/√(2π)] e^(−u²/2)


φ(u)

Figure 4.30    0    u_α

The mean of the truncated distribution is given by

∫ from −∞ to u_α of u φ(u) du    divided by    ∫ from −∞ to u_α of φ(u) du

= [1/(1 − α)√(2π)] ∫ from −∞ to u_α of u e^(−u²/2) du

= [1/(1 − α)√(2π)] [−e^(−u²/2)] evaluated between −∞ and u_α

= −[1/(1 − α)√(2π)] e^(−u_α²/2) = −φ(u_α)/(1 − α)
Since the mean was previously at u = 0 (i.e. when α = 0), the above
expression also represents the shift in mean.
φ(u_α) is the ordinate (from table 5* of statistical tables) of the normal
distribution corresponding to u = u_α.
The result just obtained can be used to solve the numerical part of the
problem.
The bottle contents are distributed normally but if the segregation process
operates perfectly (which it will not do in practice), the distribution of bottle
contents offered for sale will correspond to the unshaded part of figure 4.31.

991 1000
Figure 4.31 Bottle contents (ml)

The cut-off volume of 991 ml corresponds to

u = (991 − 1000)/5 = −1.8

The amount of truncation is therefore α = 0.0359. The increase in mean


volume of despatched bottles is therefore
[1/(1 − 0.0359)] × 0.0790 × 5 = 0.41 ml

Note: The change in mean is positive since the truncation occurs in the lower
tail instead of the upper tail.
The mean volume of bottle contents is therefore 1000 + 0.41 = 1000.4 ml.
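The truncation result can be verified numerically; a sketch that evaluates φ(u_α)/(1 − α) directly rather than from tables 3* and 5*:

```python
import math

def phi(u):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def pdf(u):
    """Standard normal ordinate (density)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

mu, sigma = 1000.0, 5.0
u_cut = (991 - mu) / sigma        # -1.8
alpha = phi(u_cut)                # 0.0359 removed from the lower tail

# shift of the mean (in u units) when the tail below u_cut is removed;
# the shift is positive because the truncation is in the lower tail
shift = pdf(u_cut) / (1 - alpha)

mean_sold = mu + shift * sigma    # mean contents of bottles offered for sale
```

This reproduces the 0.41 ml increase and the mean of about 1000.4 ml.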

4.5 Practical Laboratory Experiments and Demonstrations


The following experiments are reproduced from Basic Statistics, Laboratory
Instruction Manual

Appendix 1—Experiment 10

Normal Distribution
Number of persons: 2 or 3.

Object
To give practice in fitting a normal distribution to an observed frequency
distribution.

Method
The frequency distribution of total score of three dice obtained by combining
all groups’ results in table 2, experiment 1, should be re-listed in table 26
(Table 4.6).
Analysis
1. In table 26, calculate the mean and standard deviation of the observed
frequency distribution.
2. Using table 27, fit a normal distribution, having the same mean and standard
deviation as the data, to the observed distribution.
3. Draw the observed and normal frequency histograms on page 46 and comment
on the agreement.

Notes ;
1. It is not implied in this experiment, that the distribution of the total score
of three dice should be normal in form.
2. The total score of three dice is a discrete variable, but the method of fitting
a normal distribution is exactly the same for this case as for a frequency
distribution of grouped values of a continuous variable.
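As an aside to note 1, the exact distribution of the total score of three dice is easily enumerated. A sketch (in Python; not part of the laboratory manual) that also gives the population mean and variance the samples should scatter about:

```python
from itertools import product

# enumerate all 6^3 equally likely outcomes of three fair dice
freq = {}
for dice in product(range(1, 7), repeat=3):
    t = sum(dice)
    freq[t] = freq.get(t, 0) + 1

n = 6 ** 3
mean = sum(t * f for t, f in freq.items()) / n                 # 10.5
var = sum(t * t * f for t, f in freq.items()) / n - mean ** 2  # 8.75
```

The totals run from 3 to 18 with mean 10.5 and variance 8.75 (standard deviation about 2.96), and the distribution is symmetric but not exactly normal.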

Class width = unity.


If c = width of class interval, choose x₀ to be the midpoint of a class which,
by inspection, is somewhere near the mean of the distribution.
Obtain the class values u from the relation

u = (class midpoint − x₀)/c
The values of u will be positive or negative integers.

Class intervals: 2.5-3.5, 3.5-4.5, 4.5-5.5, 5.5-6.5, 6.5-7.5, 7.5-8.5, 8.5-9.5,
9.5-10.5, 10.5-11.5, 11.5-12.5, 12.5-13.5, 13.5-14.5, 14.5-15.5, 15.5-16.5,
16.5-17.5, 17.5-18.5, with columns for u, f, fu and fu², and rows for the
totals of the +ve terms, the totals of the −ve terms and the net totals.

The mean, x̄, of the sample is given by

x̄ = x₀ + c(Σfu/Σf)

The variance, (s′)², of the sample is

(s′)² = c²[Σfu²/Σf − (Σfu/Σf)²]

The standard deviation, s′, of the sample is given by

s′ = √(variance)

Table 4.6 (Table 26 of the laboratory manual)


Total score of 3 dice | Class boundaries | u for class boundaries | Area under
normal curve from each u to ∞ | Area for each class | Expected normal
frequency | Observed frequency

(one row for each total score from 3 to 18, with class boundaries 2.5, 3.5, ..., 18.5)

Table 4.7 (Table 27 of the laboratory manual)


Notes
1. u is the deviation from the mean, of the class boundary expressed as a
multiple of the standard deviation (with appropriate sign),

i.e. u = (class boundary − x̄)/s′
2. The area under the normal curve above each class boundary may be found
from the table of area under the normal curve at the end of the book.
The normal curve area or probability for each class is obtained by differencing
the cumulative probabilities in the previous column.
3. Other tables which cumulate the area under the normal curve in a different
way may be used, but some of the column headings will require modification
and the probabilities subtracted or summed as appropriate.
4. In order to obtain equality of expected and observed total frequencies, the
two extreme classes should be treated as open-ended, i.e. with class boundaries
of −∞ and +∞ instead of 2.5 and 18.5 respectively.
Normal Distribution 105

Appendix 2—Experiment 11

Normal Distribution

Number of persons: 2 or 3.

Object
To calculate the mean and standard deviation of a sample from a normal
population and to demonstrate the effect of random sampling fluctuations.

Method
From the red rod population M6/1 (Normally distributed with a mean of 6.0
and standard deviation of 0.2) take a random sample of 50 rods and measure
their lengths to the nearest tenth of a unit using the scale provided. The rods
should be selected one at a time and replaced after measurement, before the
next one is drawn.
Record the measurements in table 28.
Care should be taken to ensure good mixing in order that the sample is
random. The rod population should be placed in a box and stirred-up well
during sampling.

Analysis
1. Summarise the observations into a frequency distribution using table 29.
2. Calculate the mean and standard deviation of the sample data using table 30.
3. Compare, in table 31, the sample estimates of mean and standard deviation
obtained by each group. Observe how the estimates vary about the actual
population parameters.
4. Summarise the observed frequencies of all groups in table 32. On page 51,
draw, to the same scale, the probability histograms for your own results and
for the combined results of all groups. Observe the shapes of the histograms
and comment.
(A blank grid for recording the 50 measured lengths, with rows labelled
1-10, 11-20, 21-30, 31-40 and 41-50.)

Table 4.8 (Table 28 of the laboratory manual)

Summarise these observations into class intervals of width 0.1 unit with the
measured lengths at the mid points using the ‘tally-mark’ method and table 29.
106 Statistics: Problems and Solutions

Class interval    Class        'Tally-marks'    Frequency
(units)           mid point

5.35-5.45         5.4
5.45-5.55         5.5
5.55-5.65         5.6
5.65-5.75         5.7
5.75-5.85         5.8
5.85-5.95         5.9
5.95-6.05         6.0
6.05-6.15         6.1
6.15-6.25         6.2
6.25-6.35         6.3
6.35-6.45         6.4
6.45-6.55         6.5
6.55-6.65         6.6

Total frequency

Table 4.9 (Table 29 of the laboratory manual)

Width of class interval: 0.1 unit.

If c is the width of the class interval, choose x₀ to be the mid point of a class
which, by inspection, is somewhere near the mean of the distribution.
Obtain the class values u from the relation

u = (class mid point − x₀)/c

The values of u will be positive or negative integers.

The mean x̄ of the sample is

x̄ = x₀ + c(Σfu/Σf)

Class interval    Mid point    u    f    fu    fu²
(units)

(one row for each class from 5.35-5.45 to 6.55-6.65, with rows for the totals
of the +ve terms, the totals of the −ve terms and the net totals)

Table 4.10 (Table 30 of the laboratory manual)

The variance (s′)² of the sample is

(s′)² = c²[Σfu²/Σf − (Σfu/Σf)²]

The standard deviation s′ of the sample is given by

s′ = √(variance)

Group    Sample size    Mean x̄    Standard deviation s′

1
2
3
4
5
6
7
8

Population parameters        6.00       0.2

Table 4.11 (Table 31 of the laboratory manual—summary of data)

Frequency of rod lengths

5.4 | 5.5 | 5.6 | 5.7 | 5.8 | 5.9 | 6.0 | 6.1 | 6.2 | 6.3 | ...

(one row for each group and a final row for the total frequencies of all groups)

Table 4.12 (Table 32 of the laboratory manual)


5 Relationship between the basic
distributions

5.1 Syllabus Covered


The relationships between hypergeometric, binomial, Poisson and normal
distributions; use of binomial as approximation to hypergeometric in sampling;
Poisson as approximation to binomial; normal as an approximation to binomial;
normal as an approximation to Poisson.

5.2 Résumé of Theory


The following points should be revised and stressed.

(1) The basic laws of the distributions in chapters 3 and 4.


(2) Their interrelationships and conditions for using approximate distributions.
Tables 5.1 and 5.2 summarise the interrelationships together with rules for use of
the approximations.
Note: In practice use the Poisson and normal distributions as approximations
to hypergeometric and binomial whenever possible:
(3) (a) Binomial approximation to hypergeometric.
(b) Poisson approximation to binomial.
(c) Normal approximation to binomial.
(d) Normal approximation to Poisson.
(4) Whenever the normal distribution is used to approximate either the
hypergeometric, binomial or Poisson distributions, care should be taken to
remember that a continuous distribution is being approximated to a discrete
one, and to include an allowance when calculating probabilities. For example,
take the case of using the normal approximation to the following problem.
What is the chance in a group of 100, of more than 20 persons dying before
65 years of age, given that the chance of any one person’s dying is 0.20?
Here, since p = 0.20 and np = 20 > 5, the normal approximation can be used.
However, in calculating the probability figure 5.1 illustrates that the value 20.5
and not 20 must be used.
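The effect of the continuity correction in this example can be checked numerically. A sketch comparing the normal approximation (cut at 20.5) with the exact binomial tail; neither calculation is in the book, which works from tables:

```python
import math

def phi(u):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

n, p = 100, 0.20
mu = n * p                           # 20
sigma = math.sqrt(n * p * (1 - p))   # 4

# P(more than 20 deaths) with the continuity correction at 20.5
p_normal = 1 - phi((20.5 - mu) / sigma)

# exact binomial tail P(X >= 21) for comparison
p_exact = sum(math.comb(n, x) * p**x * (1 - p)**(n - x)
              for x in range(21, n + 1))
```

The corrected approximation (about 0.450) lies close to the exact tail (about 0.44); using 20 instead of 20.5 would give 0.500, a much larger error.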

(1) Hypergeometric distribution: p(x) = C(M, x) C(N − M, n − x)/C(N, n);
mean nM/N; variance n(M/N)(1 − M/N)(N − n)/(N − 1).
(2) Binomial distribution: p(x) = C(n, x) p^x (1 − p)^(n − x);
mean np; variance np(1 − p).
(3) Poisson distribution: p(x) = e^(−m) m^x/x!;
mean m; variance m.
(4) Normal distribution: f(x) = [1/(σ√(2π))] e^(−(x − μ)²/2σ²);
mean μ; variance σ².

Notes: Direct computation of the probabilities of (1) and (2) is usually
tedious, even when tables of factorials are available; tables of Poisson
probabilities are readily available. Probabilities for the normal distribution
are obtained from tables of areas under the normal curve; it is only necessary
to express the variable in standardised form.

Table 5.1 Relationship between the distributions
Relationship Between the Basic Distributions 111

Use (2) as an approximation for (1)

putting p = M/N, if n/N < 0.10

Use (3) as an approximation for (2)

putting m = np, if p < 0.10

Use (4) as an approximation for (2)

putting μ = np and σ² = np(1 − p), if 0.10 ≤ p ≤ 0.90 and np > 5

Use (4) as an approximation for (3)

putting μ = m and σ² = m, if m ≥ 15 (and preferably larger)

Table 5.2 Approximations and a guide to their use in practice

The suggested approximations will usually be satisfactory for practical


purposes. However, for values of the parameters near to the limiting conditions
give above, care should be taken when determining probabilities in the tails of
a distribution, as the errors of approximation may be considerably greater than
allowable.

[Figure 5.1 Number of deaths: histogram bars at 19 to 22 with the normal curve superimposed; the probability of exceeding 20 is the area beyond 20.5, so the value 20.5 must be used]

5.2.1 Hypergeometric, Binomial and Poisson Approximations


Tables 5.3 and 5.4 give details of the accuracy of the approximations at the
limiting conditions; obviously the further the parameters are from these
conditions the more accurate the approximation.
Batch size N = 100, of which 10 are defective.

Sample size n = 10, p = 0.10

thus n/N = 0.10, np = 1


112 Statistics: Problems and Solutions

Table 5.3 gives a full comparison of the probabilities of finding x defects in the
sample.

No. of defects     Hypergeometric   Binomial       Poisson
in sample (x)      distribution     distribution   distribution

0                  0.3305           0.3487         0.3679
1                  0.4080           0.3874         0.3679
2                  0.2015           0.1937         0.1839
3                  0.0518           0.0574         0.0613
4                  0.0076           0.0112         0.0153
5                  0.0006           0.0015         0.0031
6 or over          0.0000           0.0001         0.0006

Table 5.3
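The figures of Table 5.3 can be reproduced directly; the following is an illustrative sketch (not from the book) using only Python's standard library, with function names of our own choosing:

```python
from math import comb, exp, factorial

# Batch of N = 100 containing M = 10 defectives; sample of n = 10.
N, M, n = 100, 10, 10
p = M / N            # 0.10
m = n * p            # np = 1, the Poisson mean

def hypergeometric(x):   # exact: sampling without replacement
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def binomial(x):         # approximation; here n/N = 0.10, the limiting case
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson(x):          # further approximation, valid since p <= 0.10
    return m**x * exp(-m) / factorial(x)

for x in range(6):
    print(x, round(hypergeometric(x), 4), round(binomial(x), 4), round(poisson(x), 4))
```

The first row printed is 0 0.3305 0.3487 0.3679, matching Table 5.3.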

5.2.2 Normal Distribution as an Approximation to Poisson


Table 5.4 gives a full comparison of the probability of x successes where m = 15
using the Poisson and its normal approximation.

No. of      Poisson        Normal          No. of      Poisson        Normal
successes   distribution   approximation   successes   distribution   approximation
(x)                                        (x)

0           0.0000         0.0001          16          0.0960         0.0993
1           0.0000         0.0001          17          0.0848         0.0899
2           0.0000         0.0004          18          0.0706         0.0762
3           0.0002         0.0009          19          0.0557         0.0603
4           0.0007         0.0019          20          0.0418         0.0452
5           0.0019         0.0037          21          0.0299         0.0313
6           0.0048         0.0072          22          0.0204         0.0200
7           0.0104         0.0122          23          0.0132         0.0122
8           0.0194         0.0200          24          0.0083         0.0072
9           0.0325         0.0313          25          0.0050         0.0037
10          0.0486         0.0452          26          0.0029         0.0019
11          0.0663         0.0603          27          0.0016         0.0009
12          0.0828         0.0762          28          0.0008         0.0004
13          0.0956         0.0899          29          0.0005         0.0001
14          0.1025         0.0993          30          0.0002         0.0001
15          0.1024         0.1026          31          0.0001         0.0000

Table 5.4
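The normal column of Table 5.4 can be checked by taking the probability of x successes as the area between x − 0.5 and x + 0.5 under a normal curve with μ = σ² = m (our assumption about how the column was computed; the helper names below are ours). A few rows, as an illustrative sketch:

```python
from math import erf, exp, factorial, sqrt

m = 15.0                # Poisson mean at the book's limiting condition
sd = sqrt(m)            # for a Poisson variate, variance = mean

def phi(z):             # standard normal cumulative distribution function
    return 0.5 * (1 + erf(z / sqrt(2)))

def poisson(x):
    return m**x * exp(-m) / factorial(x)

def normal_approx(x):   # area from x - 0.5 to x + 0.5 (continuity correction)
    return phi((x + 0.5 - m) / sd) - phi((x - 0.5 - m) / sd)

for x in (15, 20, 25):
    print(x, round(poisson(x), 4), round(normal_approx(x), 4))
```

The computed values agree with the tabulated figures to within about 0.0005, small table-interpolation differences aside.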

The two distributions converge rapidly as m increases.


While most statisticians accept the use of the normal approximation for
m > 15, it will be seen that there is quite an appreciable divergence in the tails
of the distributions. The authors recommend that whenever possible normal
approximation is used when m > 30.
The statistical tables* have been amended to give Poisson probabilities up to
m= 40.

5.2.3 Examples on the Use of Theory


1. In sampling from batches of 5000 components, a sample of 50 is taken and if
one or more defects is found the batch is rejected. What is the probability of
accepting batches containing 2% defects?
The theoretically correct distribution is the hypergeometric, but since
n/N = 50/5000 = 0.01
is less than 10%, the binomial can be used.
However, computation is still difficult and since p < 0.10 the Poisson
distribution can be used.
Solution by binomial approximation from table 1*
Probability of accepting batches with 2% defectives = 0.3642
Solution by Poisson approximation
Expected number of defects in sample = np = 50 × 0.02 = 1

From table 2*
Probability of accepting batches with 2% defectives = 0.3679
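Both acceptance probabilities reduce to the chance of zero defects in the sample; this illustrative check (not from the book) reproduces the two figures:

```python
from math import exp

n, p = 50, 0.02          # sample of 50 from a batch running at 2% defective
m = n * p                # Poisson mean np = 1

accept_binomial = (1 - p) ** n   # P(0 defects in the sample), binomial
accept_poisson = exp(-m)         # P(0 defects), Poisson approximation
print(round(accept_binomial, 4), round(accept_poisson, 4))  # 0.3642 0.3679
```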
2. In 50 tosses of an unbiased coin, what is the probability of more than 30
heads occurring?
This requires the binomial distribution, which gives

P(>30 heads) = Σ (x = 31 to 50) C(50, x) (½)^x (½)^(50−x)

from table 1* = 0.0595

Using normal approximation since p > 0.10 and np > 5


mean of distribution = np = 25
Variance of distribution = np(1 − p) = 50 × ½ × ½ = 12.5

Standard deviation = √12.5 = 3.54

[Figure 5.2 Normal curve, mean 25, σ = 3.54; area beyond 30.5 shaded]

u = (30.5 − 25)/3.54 = 1.55
which from table 3* leads to a probability of 0.0606.
Note: Since a continuous distribution is being used to approximate to a
discrete distribution, the value 30.5 and not 30 must be used in calculating the
u value.
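The exact binomial tail and the continuity-corrected normal tail can be put side by side in a short sketch (illustrative, not from the book):

```python
from math import comb, erf, sqrt

n = 50
exact = sum(comb(n, x) for x in range(31, 51)) / 2**n  # P(more than 30 heads)

mu, sd = 25.0, sqrt(12.5)            # np and sqrt(np(1 - p)) for p = 1/2
u = (30.5 - mu) / sd                 # 30.5, not 30: continuity correction
approx = 1 - 0.5 * (1 + erf(u / sqrt(2)))
print(round(exact, 4), round(approx, 4))
```

Both values round to about 0.06, in line with the figures above.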
3. A machine produces screws 10% of which have defects. What is the probability
that, in a sample of 500
(a) more than 35 defects are found?
(b) between 30 and 35 (inclusive) defects are found?
The binomial law applies, assuming a sample of 500 from a batch of at least 5000.
The normal approximation can be used since p = 0.10 and np = 50 > 5.
μ = np = 500 × 0.10 = 50
σ = √(500 × 0.10 × 0.90) = √45 = 6.7

(a) u = (35.5 − 50)/6.7 = −2.16

Probability of more than 35 defects, from tables* = 1 − 0.0154 = 0.9846

(b) Probability of between 30 and 35 defects, use limits 29.5 and 35.5.

[Figure 5.3 Normal curve, mean 50, σ = 6.7; limits 29.5 and 35.5 marked]

u = (29.5 − 50)/6.7 = −3.06

Probability of more than 29 = 1 − 0.0011 = 0.9989


∴ Probability of between 30 and 35 inclusive = 0.9989 − 0.9846 = 0.0143
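The same arithmetic can be sketched in code; the exact binomial sum is added for comparison (it is not given in the book, and the helper names are ours):

```python
from math import comb, erf, sqrt

n, p = 500, 0.10
mu, sd = n * p, sqrt(n * p * (1 - p))   # 50 and 6.7

def phi(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

more_than_35 = 1 - phi((35.5 - mu) / sd)                           # part (a)
between_30_and_35 = phi((35.5 - mu) / sd) - phi((29.5 - mu) / sd)  # part (b)
exact = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(30, 36))
print(round(more_than_35, 4), round(between_30_and_35, 4), round(exact, 4))
```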

4. The average number of breakdowns per period of an assembly moulding


line is 30. If the breakdowns occur at random what is the probability of more
than 40 breakdowns occurring per period?

Here the theoretically correct distribution is the Poisson. However, since


m > 15, the normal approximation can be used.
Solution by Poisson, table 2*

P(>40) = Σ (x = 41 to ∞) 30^x e^(−30)/x! = 0.0323

Using normal approximation

σ = √30 = 5.48

[Figure 5.4 Normal curve, mean 30, σ = 5.48; area beyond 40.5 shaded]

Again, the limit 40.5 is used to include 41 breakdowns but exclude 40:

u = (40.5 − 30)/5.48 = 1.92

Probability of exceeding 40, P(>40) = 0.0274 from statistical table 3*.
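The Poisson tail and its normal approximation can be verified with a short illustrative sketch (variable names are ours):

```python
from math import erf, exp, factorial, sqrt

m = 30.0
# Exact Poisson tail: P(more than 40 breakdowns per period).
exact = 1 - sum(m**x * exp(-m) / factorial(x) for x in range(41))

u = (40.5 - m) / sqrt(m)                  # continuity-corrected normal variate
approx = 1 - 0.5 * (1 + erf(u / sqrt(2)))
print(round(exact, 4), round(approx, 4))
```

The two answers differ noticeably in the tail, which is why the text recommends m > 30 before relying on the normal form.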

5.2.4 Examples of Special Interest

1. Bomb Attack on London


During the last war, it was asked whether the bombs dropped on London were
aimed or whether they fell at random. The term ‘aimed’ is of course very loose,
since obviously the Germans could point the bombs towards Britain, but aim in
this problem is defined as pinpointing targets inside a given area.

To determine the solution, part of London was divided into 576 equal areas
(¼ km² each) and the number of areas with 0, 1, 2, . . ., hits was tabulated from
the results of 537 bombs which fell on the area. These data in distribution form
are shown in table 5.5.

Number of hits j               0    1    2    3   4   5
Number of areas with j hits  229  211   93   35   7   1

Table 5.5

In statistical logic, as will be seen later, an essential step in testing in the logic
is the setting up of what is called the null hypothesis.
Here the null hypothesis is that the bombs are falling randomly or that there
is no ability to aim at targets of the order of ¼ km² in area.
Then if the hypothesis is true, the probability of any given bomb falling in
any one given area = 1/576.
Probability of x hits in any area

P(x) = C(537, x) (1/576)^x (575/576)^(537−x)
from the binomial law.
However, since the probability of success is very small and the number of
attempts is relatively large, the Poisson law can be used as an approximation to
the binomial thus greatly reducing the computation involved.
Thus, for the Poisson calculation
average number of successes m = np = 537 × 1/576 = 0.93

The results obtained by reference to statistical tables by interpolation for the


chance of various numbers of hits are given in table 5.6.

Number of hits j         0      1      2      3      4      5
Probability of j hits  0.395  0.367  0.171  0.053  0.012  0.002

Table 5.6

Table 5.7 shows the results obtained by comparing the actual frequency
distribution of number of hits per area with the Poisson expected frequencies if
the hypothesis is true.

Number of hits j                                  0    1    2    3   4   5

Actual number of areas with j hits              229  211   93   35   7   1
Expected number of areas with j hits (Poisson)  227  211   98   31   7   1

Table 5.7

The agreement is certainly good enough (without significance testing) to state


that the null hypothesis is true; namely, that the bombs fell at random, so that
the area into which the bomb could be aimed must have been much larger than
the area of London.
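The expected frequencies follow from multiplying the 576 areas by the Poisson probabilities; an illustrative computation (not from the book):

```python
from math import exp, factorial

areas, bombs = 576, 537
m = bombs / areas                       # mean hits per area = 0.93
observed = [229, 211, 93, 35, 7, 1]
expected = [areas * m**j * exp(-m) / factorial(j) for j in range(6)]

for j, (obs, exp_freq) in enumerate(zip(observed, expected)):
    print(j, obs, round(exp_freq, 1))
```

The expected values are in line with the 227, 211, 98, 31, 7, 1 of Table 5.7 after rounding.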

2. Defective Rate of a Production Process


The number of defects per shift produced by a certain process over the last 52
shifts is given in table 5.8. Is the process in control, i.e. has the defective rate
remained constant over the period? The total production per shift is 600 units.

Number of defects/shift   0  1  2   3  4  5  6  7  8  9  10   Total

Frequency                 2  6  9  11  8  6  4  3  2  1   0      52

Average defects/shift = 3.6


Table 5.8

This problem gives an excellent introduction to the basic principles of


quality control.
The process is assumed to be in control. If this hypothesis is true then

the probability of any one component being defective = 3.6/600 = 0.006

Thus, by the Binomial law

probability of x defects in a shift P(x) = C(600, x) 0.006^x 0.994^(600−x)

Number of        Number of   Poisson   Calculated number
defectives (s)   shifts      P(s)      of shifts: 52 P(s)

0                 2          0.0273     1
1                 6          0.0984     5
2                 9          0.1771     9
3                11          0.2125    11
4                 8          0.1912    10
5                 6          0.1377     7
6                 4          0.0826     4.5
7                 3          0.0425     2
8                 2          0.0191     1
9                 1          0.0076     0.5
10                0          0.0040     0.2

                 52          1.0000    51.2

Table 5.9

However, here again the Poisson law gives an excellent approximation to the
binomial, reducing the computation considerably.
It should be noted that in most attribute quality control tables this Poisson
approximation is used.
Using m = 3.6, table 5.9 gives the comparison of the actual pattern of variation
with the Poisson.
Reference to the table indicates that the defects in the period of 52 shifts did
not show any ‘abnormal’ deviations from the expected number.
Thus, this comparison gives the basis for determining whether or not a
process is in control, the basic first step in any quality control investigation.
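The comparison of Table 5.9 is again a one-line Poisson computation per row; an illustrative sketch:

```python
from math import exp, factorial

m, shifts = 3.6, 52                     # average defects/shift over 52 shifts
observed = [2, 6, 9, 11, 8, 6, 4, 3, 2, 1, 0]
expected = [shifts * m**x * exp(-m) / factorial(x) for x in range(11)]

for x, (obs, exp_shifts) in enumerate(zip(observed, expected)):
    print(x, obs, round(exp_shifts, 1))
```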

5.3 Problems for Solution


1. In a machine shop with 250 machines, the utilisation of each machine is
80%, i.e. 20% of the time the machine is not working. What is the probability
of having

(a) more than 60 machines idle at any one time?


(b) between 60 and 65 machines idle?
(c) less than 32 machines idle?

2. In a sampling scheme, a random sample of 500 is taken from each batch of


components received. If one or more defects are found the batch is rejected.
What is the probability of rejecting batches containing
(a) 1% defectives?
(b) 0.1% defectives?

3. Assuming equal chance of birth of a boy or girl, what is the probability that
in a class of 50 students, less than 30% will be boys?

4. The average number of customers entering a supermarket in 1 h is 30.


Assuming that all customers arrive independently of each other, what is the
probability of more than 40 customers arriving in 1 h?

5. In a hotel, the five public telephones in the lobby are utilised 48% of the time
between 6 p.m. and 7 p.m. in the evening. What is the probability of
(a) all telephones being in use?
(b) four telephones being in use?
6. A city corporation has 24 dustcarts for collection of rubbish in the city. Given
that the dustcarts are 80% utilised or 20% of time broken down, what proportion
of the time will there be more than three dustcarts broken down?
7. A batch of 20 special resistors are delivered to a factory. Four resistors are

defective. Four resistors are selected at random and installed in a control panel.
What is the probability that no defective resistor is installed?

5.4 Worked Solutions to the Problems


1. This is the binomial distribution.
Since p = 0.20 and np = 250 × 0.20 = 50 > 5, the normal approximation can be
used.

μ = 50

σ² = 250 × 0.20 × 0.80 = 40

σ = 6.3

[Figure 5.5 Normal curve, mean 50, σ = 6.3; points 31.5, 59.5, 60.5 and 65.5 marked]

(a) u = (60.5 − 50)/6.3 = 1.67

Probability of more than 60 machines idle P(>60) = 0.0475

(b) u = (65.5 − 50)/6.3 = 2.46
Probability of more than 65 machines idle P(>65) = 0.0069

Also

u = (59.5 − 50)/6.3 = 1.51
Probability of more than 59 machines idle P(>59) = 0.0655
Probability of between 60 and 65 machines idle (inclusive) = 0.0655 − 0.0069
= 0.0586

(c) u = (31.5 − 50)/6.3 = −2.94
Probability of less than 32 machines idle P(<32) = 0.0016

2. This is the hypergeometric distribution in theory but if it is assumed that n,


the sample size, is less than 10% of the batch, the binomial can be used.

For 1% defectives p = 0.01 < 0.10, and for 0.1% defectives p = 0.001 < 0.10,
so in both cases the Poisson approximation can be used.

(a) m = np = 500 × 0.01 = 5
Probability of rejecting batches with 1% defectives

P(>0) = 0.9933

(b) m = np = 500 × 0.001 = 0.5

Probability of rejecting batches with 0.1% defectives

P(>0) = 0.3935

3. Probability of birth of a boy p = ½, sample size n = 50


This is the binomial distribution but since p > 0.10 and np > 5 the normal
approximation can be used. Thus, μ = np = 50 × 0.5 = 25
σ = √[np(1 − p)] = √(50 × 0.5 × 0.5) = √12.5 = 3.54
To calculate the probability of there being less than 15 boys

u = (14.5 − 25)/3.54 = −2.97

[Figure 5.6 Normal curve, mean 25, σ = 3.54; area below 14.5 shaded]

From tables*,
Probability of class of 50 having less than 15 boys = 0.0015

Compare this with the correct answer from binomial tables of 0.0013.

4. This by definition is the Poisson law. However, since m > 15, the normal
approximation can be used. Here μ = 30, σ = √30 = 5.48

u = (40.5 − 30)/5.48 = 1.92
Probability of more than 40 customers arriving in 1 h = 0.0274
(Compare this with the theoretically correct result from Poisson of 0.0323.)

[Figure 5.7 Normal curve, mean 30, σ = 5.48; area beyond 40.5 shaded]

5. Probability of a telephone booth being busy p = 0.48.


number of booths, n = 5
This is the binomial distribution.
This example is included to demonstrate clearly that cases will arise in
practice where the approximations given will not apply. Here we cannot use
Poisson approximation since p > 0.10. Also we cannot use normal approximation
since np is not greater than 5. Thus the problem must be solved by computing
the binomial distribution or referring to comprehensive binomial tables.
Thus

probability of all telephones being in use = 0.48⁵ = 0.0255

probability of four telephones being in use = C(5, 4) × 0.48⁴ × 0.52¹ = 0.1380

6. Here, n = 24
Probability of a dustcart's being broken down p = 0.20. This is the binomial
distribution. Here the normal distribution can be used as an approximation.
Mean μ = np = 24 × 0.20 = 4.8

Variance σ² = np(1 − p) = 24 × 0.20 × 0.80 = 3.84


Standard deviation = 1.96

u = (3.5 − 4.8)/1.96 = −0.66

[Figure 5.8 Normal curve, mean 4.8, σ = 1.96; area below 3.5 shaded]



Table 3* gives the probability of three or less dustcarts being out of service
as 0.2546
Probability of more than three dustcarts being out of service
P(>3) = 1— 0.2546 = 0.7454 or 74.5%

7. Here this is the hypergeometric distribution and since the sample size 4 is
greater than 10% of population (20) no approximation can be made. Thus the
hypergeometric distribution must be used.
Probability of 0 defectives:

P(0) = C(16, 4)/C(20, 4) = (16 × 15 × 14 × 13)/(20 × 19 × 18 × 17) = 0.376
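As a check, the hypergeometric probability is a ratio of combinations; an illustrative one-liner:

```python
from math import comb

# 20 resistors, 4 defective; 4 drawn at random without replacement.
p0 = comb(16, 4) / comb(20, 4)   # P(no defective among the four installed)
print(round(p0, 4))              # 0.3756
```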

5.5 Practical Laboratory Experiment and Demonstrations


Using the binomial sampling box, experiment 8 in the laboratory manual
demonstrates how the Poisson distribution can be used to approximate the
binomial.
The laboratory instructions are given together with recording, analysis and
summary sheets in pages 37-39 of the manual.
The laboratory instruction sheet for experiment 8 is reproduced in
Appendix 1.

Appendix 1—Instruction Sheet for Experiment 8

Number of persons: 2 or 3.

Object
To demonstrate that the Poisson law may be used as an approximation to the
binomial law for suitable values of n (sample size) and p (proportion of the
population having a given attribute), and that, for a given sample size n, the
approximation improves as p becomes smaller. (Vote: for a given value of p, the
approximation also improves as n increases.)

Method
Using the binomial sampling box, take 100 samples of size 10, recording, in
table 21, the number of red balls in each sample. (Proportion of red balls in the
population = 0.02.)

Analysis
1. Summarise the data into a frequency distribution of number of red balls per
sample in table 22 and compare the experimental probability distribution with
the theoretical binomial (given) and Poisson probability distributions.
Draw both the theoretical Poisson (mean = 0.2) and the experimental
probability histograms on figure 1 below table 22.

2. Using the data of experiment 7 and table 23, compare the observed
probability distribution with the binomial and Poisson (mean = 1.5) probability
distributions.
Also, draw both the theoretical Poisson (mean = 1.5) and the experimental
probability histograms on figure 2 below table 23.

Note: Use different colours for drawing the histograms in order that comparison
may be made more easily.
6 Distribution of linear functions of variables

6.1 Syllabus Covered


Variance of linear combinations of variates; distribution of sample mean;
central limit theorem.

6.2 Résumé of Theory and Basic Concepts


6.2.1 Linear Combination of Variates
Consider the following independent variates x, y, z, . . . with means x̄, ȳ, z̄, . . .
and variances σx², σy², σz², . . .
Let w_r = ax_r + by_r + cz_r + . . . where a, b, c, . . . are constants.
Then w is distributed with mean w̄ = ax̄ + bȳ + cz̄ + . . . and variance
σw² = a²σx² + b²σy² + c²σz² + . . .

Special Case 1—Variance of Sum of Two Variates


Here a = +1, b = +1 and c = 0, as are all other constants; then

w_r = x_r + y_r
w̄ = x̄ + ȳ
σw² = σx² + σy²
or the variance of the sum of two independent variates is equal to the sum of
their variances.

Special Case 2— Variance of Difference of Two Variates


Here a = +1, b = —1 and all other constants = 0.

w_r = x_r − y_r
w̄ = x̄ − ȳ
σw² = σx² + σy²
Distribution of Linear Functions of Variables 125

or the variance of the difference of two variates is the sum of their variances.
Note: It should be noted that while this theorem places no restraint on the
form of distribution of variates the following conditions are of prime importance:

(1) If variates x, y, z,... are normally distributed then w is also normally


distributed.
(2) If variates x, y, z are Poisson distributed then w is also distributed as
Poisson.

Examples
1. In fitting a shaft into a bore of a housing, the shafts have a mean diameter of
50 mm and standard deviation of 0.12 mm. The bores have a mean diameter of
51 mm and standard deviation of 0.25 mm. What is the clearance of the fit?
The mean clearance = 51 —50=1mm
Variance of clearance = 0.12² + 0.25² = 0.0769
Standard deviation of clearance = √0.0769 = 0.277 mm
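The variance-addition rule for this fit can be sketched in a few lines (illustrative, not from the book):

```python
from math import sqrt

# Clearance = bore diameter - shaft diameter; for a difference of
# independent variates the variances still add.
mean_clearance = 51 - 50
sd_clearance = sqrt(0.12**2 + 0.25**2)
print(mean_clearance, round(sd_clearance, 3))   # 1 0.277
```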

2. A machine producing spacers works to a nominal dimension of 5 mm and


standard deviation of 0.25 mm. Five of these spacers are fitted on to a bolt
manufactured to a nominal shaft dimension of 38 mm and standard deviation
0.50 mm.
What is the mean and variance of the clearance on the end of the shaft of
the bolt?

mm

38 mm :

Figure 6.1

Here average clearance = 38 − 5 × 5 = 13 mm

Variance of clearance = 1 × 0.5² + 5 × 0.25² = 0.5625 mm²
Standard deviation = 0.75 mm

3. (a) The time taken to prepare a certain type of component before assembly
is normally distributed with mean 4 min and standard deviation of 0.5 min. The
time taken for its subsequent assembly to another component is independent of
preparation time and again normally distributed with mean 9 min and standard
deviation of 1.0 min.

What is the distribution of total preparation and assembly time and what
proportion of assemblies will take longer than 15 min to prepare and assemble?
Let w = total preparation and assembly time for rth unit.

w̄ = 4 + 9 = 13 min

σw² = 1² × 0.5² + 1² × 1.0² = 1.25

or standard deviation of w, σw = √1.25 = 1.12 min

[Figure 6.2 Distribution of preparation and assembly time w: normal, mean 13,
σw = 1.12; area beyond 15 shaded]

u = (15 − 13)/1.12 = 1.78

Reference to table 3* gives that probability of total assembly and preparation


time exceeding 15 min is 0.0375 or 3.75% of units.

(b) In order to show clearly the use of constants a, b, c,.. ., consider the
previous example, but suppose now that each unit must be left to stand for
twice as long as its actual preparation time before assembly is commenced.
What is the distribution of total operation time now?
Here

w_r = 3x_r + y_r
where

x_r = preparation time
y_r = assembly time
w̄ = (3 × 4) + 9 = 21 min

σw² = 3² × 0.5² + 1² × 1.0² = 3.25
Standard deviation of w = √3.25 = 1.8 min.

(c) To further clarify the use of constants, consider now example 3(a). Here
the unit has to be sent back through the preparation phase twice before passing
on to assembly.

Assuming that the individual preparation times are independent, what is the
distribution of the total operation time now?
Here

w_r = (x_r1 + x_r2 + x_r3) + y_r


or

w̄ = (4 + 4 + 4) + 9 = 21 min as before


however, variance

σw² = (1² × 0.5² + 1² × 0.5² + 1² × 0.5²) + 1² × 1² = 1.75


Standard deviation

σw = √1.75 = 1.32 min

6.2.2 Distribution of Sum of n Variates


The sum of n equally distributed variates has a distribution whose average and
variance are equal to n times the average and variance of the individual variates.
This follows direct from the general theorem in section 6.2.1.
Let

x̄ = ȳ = z̄ = . . . and a = b = c = . . . = 1

then

w̄ = x̄ + x̄ + . . . + x̄ = nx̄

σw² = (1² × σx²) + (1² × σx²) + . . . + (1² × σx²) = nσx²

Example
Five resistors from a population whose mean resistance is 2.6 kΩ and standard
deviation is 0.1 kΩ are connected in series. What is the mean and standard
deviation of such random assemblies?
Average resistance = 5 × 2.6 = 13 kΩ
Variance of assembly = 5 × 0.1² = 0.05
Standard deviation = √0.05 = 0.224 kΩ

6.2.3 Distribution of Sample Mean


The distribution of means of samples of size n from a distribution with mean μ
and variance σ² has a mean of μ and variance σ²/n.

Population: mean μ, variance σ²

Consider the ith sample of size n from this population.

Let x_ir = the rth member of this sample;
then the mean of the ith sample is

x̄_i = (1/n)(x_i1 + x_i2 + . . . + x_in)
    = (1/n)x_i1 + (1/n)x_i2 + . . . + (1/n)x_in

Since mean of x_1 = mean of x_2 = . . . = mean of x_n = μ,

average of distribution of samples of size n = (1/n)(μ + μ + . . . + μ) = μ

Variance of distribution of samples of size n = (1/n)²σ² + (1/n)²σ² + . . . + (1/n)²σ²
                                              = σ²/n

Standard deviation of samples of size n = σ/√n
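The σ/√n result is easy to verify empirically; a simulation sketch (the sample size, number of trials and seed are our own choices, not the book's):

```python
import random
from math import sqrt

random.seed(42)
n, trials = 25, 20000
# Means of samples of size n drawn from the uniform (0, 1) population,
# whose standard deviation is 1/sqrt(12).
means = [sum(random.random() for _ in range(n)) / n for _ in range(trials)]
grand = sum(means) / trials
sd_of_means = sqrt(sum((x - grand) ** 2 for x in means) / trials)
print(round(sd_of_means, 4), round((1 / sqrt(12)) / sqrt(n), 4))
```

The empirical standard deviation of the sample means agrees closely with the predicted σ/√n = 0.0577.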

6.2.4 Central Limit Theorem


(Associated with theorem of distribution of sample mean in section 6.2.3 or
with distribution of sum of variates in section 6.2.2.)
The distribution of sample mean (or the sum of n variates) has a distribution
that is more normal than the distribution of individual variables.
This theorem explains the prominence of the normal distribution in the
theory of statistics and the approximation to normality obviously depends upon
the shape of the distribution of the variate and the size of n. As n increases the

[Figure 6.3 Probability distribution of (a) the score of 1 die (b) the score of 3 dice]

sampling distribution of means gets closer to normality and similarly the closer
the original distribution to normal the quicker the approach to true normal form.
However the rapidity of the approach is shown in figure 6.3 which shows the
distribution of the total score of three 6-sided dice thrown 50 times. This is
equivalent to sampling three times from a rectangular population and it will
be seen that the distribution of the sum of the variates has already gone a long
way towards normality.
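The three-dice distribution can be enumerated exactly; an illustrative sketch:

```python
from itertools import product

# Exact probability distribution of the total score of three dice.
counts = {}
for faces in product(range(1, 7), repeat=3):
    total = sum(faces)
    counts[total] = counts.get(total, 0) + 1

for total in sorted(counts):
    print(total, round(counts[total] / 216, 4))
```

The histogram already has the symmetric bell shape: the extreme totals 3 and 18 each occur with probability 1/216, while 10 and 11 each occur with probability 27/216.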

6.2.5 Distribution of the Sum (or Difference) of Two Means


If μx and μy are the means of the distributions of x and y, and σx², σy² their
respective variances, then if a sample of n_x is taken from the x population and
a sample of n_y from the y population, the distribution of the sum (or difference)
of the averages of the samples has mean

μx + μy (or μx − μy)

and variance

σx²/n_x + σy²/n_y
In the special case where two samples of size n₁ and n₂ are taken from the
same population with mean μ and variance σ², the distribution of the sum (or
difference) of the sample averages has mean 2μ for the sum (and 0 for the
difference) and variance

σ²/n₁ + σ²/n₂ = σ²(1/n₁ + 1/n₂)
This theorem is most used for testing the difference between two populations,
but this is left until chapter 7.

Example
A firm calculates each period the total value of sales orders received in £.p. The
average value of an order received is approximately £400, and the average number
of orders per period is 100.
What likely maximum error in estimation will be made if in totalling the
orders, they are rounded off to the nearest pound?
Assuming that each fraction of £1 is equally likely (the problem can,
however, be solved without this restriction) the probability distribution of the
error on each order is rectangular as in figure 6.4, showing that each rounding
off error is equally likely.
Consider the total error involved in adding up 100 orders each rounded off.
Statistically this is equivalent to finding the sum of a sample of 100 taken
from the distribution in figure 6.4.

[Figure 6.4 Probability distribution of error/order: rectangular from −50p to +50p]

From theorem 3 the distribution of this sum will be normal and its mean and
variance are given below.
Average error = 0
Variance of sum = 100σ², where σ² is the variance of the distribution of
individual errors

For a rectangular distribution, σ² = h²/12 where h = range of base.


h in this problem = 100p = £1.00

Variance of sum = 100 × 1²/12 = 8.33

Standard deviation = √8.33 = £2.89


Here the likely maximum error will be interpreted as the error which will be
exceeded only once in 1000 times.
Since the error can be both positive or negative
Maximum likely error = ±3.29 × £2.89 ≈ ±£9.50
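The rounding-error argument condenses into a few lines of code (illustrative, variable names are ours):

```python
from math import sqrt

n, h = 100, 1.0              # 100 orders; each rounding error uniform over £1
var_sum = n * h**2 / 12      # variances of independent errors add
sd_sum = sqrt(var_sum)
max_likely = 3.29 * sd_sum   # two-sided 1-in-1000 normal limit
print(round(sd_sum, 2), round(max_likely, 2))   # 2.89 9.5
```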

6.3 Problems for Solution


1. A manufacturer markets butter in ½-kg packages. His packing process has a
standard deviation of 10 g. What must his process average be set at to ensure that
the chance of any individual package's being 5% under the nominal weight of
½ kg is only 5% (or 1 in 20)?
If the manufacturer now decides to market in super packages containing
four ½-kg packages, what proportion of his product can be saved by this
marketing method if he still has to satisfy the condition that super packages
must only have a 5% chance of being 5% under nominal weight of 2 kg?

2. The maximum payload of a light aircraft is 350 kg. If the weight of an adult
is normally distributed (N.D.) with mean and standard deviation of 75 and 15 kg
respectively, and the weight of a child is normally distributed with mean and
standard deviation of 23 and 7 kg respectively, what is the probability that the
plane can take off safely with

(a) four adult passengers?


(b) four adult passengers and one child?

In each case, what is the probability that the plane can take off if 40 kg of
baggage is carried?
3. Two spacer pieces are placed on a bolt to take up some of the slack before
a spring washer and nut are added. The bolt (b) is pushed through a plate (p)
and then two spacers (s) added, as in figure 6.5.

(b)

Clearance

Figure 6.5

Given the following data on the production of the components


plate: mean thickness 12 mm, standard deviation of thickness 0.05 mm,
normal distribution
bolt: mean length 25 mm, standard deviation of length 0.025 mm, normal
distribution
spacer: mean thickness 3 mm, standard deviation of thickness 0.05 mm,
normal distribution

what is the probability of the clearance being less than 7.2 mm?
4. In a machine fitting caps to bottles, the force (torque) applied is distributed
normally with mean 8 units and standard deviation 1.2 units. The breaking
strength of the caps has a normal distribution with mean 12 units and standard
deviation 1.6 units. What percentage of caps are likely to break on being fitted?
5. Four rods of nominal length 25 mm are placed end to end. If the standard
deviation of each rod is 0.05 mm and they are normally distributed, find the
99% tolerance of the assembled rods.
6. The heights of the men in a certain country have a mean of 1.65 m and
standard deviation of 76 mm.
(a) What proportion will be 1.80 m or over?

(b) How likely is it that a sample of 100 men will have a mean height as
great as 1.68 m. If the sample does have a mean of 1.68 m, to what extent does
it confirm or discredit the initial statement?
7. A bar is assembled in two parts, one 66 mm + 0.3 mm and the other
44 mm + 0.3 mm. These are the 99% tolerances. Assuming normal distributions,
find the 99% tolerance of the assembled bar.

8. Plugs are to be machined to go into circular holes of mean diameter 35 mm


and standard deviation of 0.100 mm. The standard deviation of plug diameter is
0.075 mm.
The clearance (difference between diameters) of the fit is required to be at
least 0.05 mm. If plugs and holes are assembled randomly:

(a) Show that, for 95% of assemblies to satisfy the minimum clearance
condition, the mean plug diameter must be 34.74 mm.
(b) Find the mean plug diameter such that 60% of assemblies will have the
required clearance.
In each case find the percentage of plugs that would fit too loosely (clearance
greater than 0.375 mm).

9. Tests show that the individual maximum temperature that a certain type of
capacitor can stand is distributed normally with mean of 130°C and standard
deviation of 3°C. These capacitors are incorporated into units (one capacitor per
unit), each unit being subjected to a maximum temperature which is distributed
normally with a mean of 118°C and standard deviation of 5°C.
What percentage of units will fail due to capacitor failure?

10. It is known that the area covered by 5 litres of a certain type of paint is
normally distributed with a mean of 88 m? and a standard deviation of 3 m?. An
area of 3500 m? is to be painted and the painters are supplied with 40 5-litre tins
of paint. Assuming that they do not adjust their application of paint according
to the area still to be painted, find the probability that they will not have
sufficient paint to complete the job.

11. A salesman has to make 15 calls a day. Including journey time, his time
spent per customer is 30 min on average with a standard deviation of 6 min.
(a) If his working day is of 8 h, what is the chance that he will have to work
overtime on any given day?
(b) In any 5-day week, between what limits is his ‘free’ time likely to be?
12. A van driver is allowed to work for a maximum of 10h per day. His
journey time per delivery is 30 min on average with a standard deviation of 8
min.
In order to ensure that he has only a small chance (1 in 1000) of exceeding
the 10 h maximum, how many deliveries should he be scheduled for each day?

6.4 Solutions to Problems

1. At least 95% of individual packets must weigh more than 0.475 kg. Thus the
process average weight must be set above 0.475 kg by 1.645 times the standard
deviation (see figure 6.6; 5% of the tail of a normal distribution is cut off
beyond 1.645 standard deviations), i.e. at

0.475 + 1.645 x 0.010 = 0.475 + 0.0164 = 0.491 kg

[Figure 6.6 Individual packets: normal curve with 5% of the area below 0.475;
mean at 0.475 + 1.645 × 0.01]
[Figure 6.7 Weight of 4 packs: normal curve with 5% of the area below 1.9; mean
at 1.9 + 1.645 × 0.01√4]

If individual packages are packed four at a time, the distribution of total net
weight and the probability requirements are shown in figure 6.7.
The mean weight of 4 packages must be

1.9 + 1.645 × 0.01√4 = 1.9 + 0.033 = 1.933 kg

Thus the process setting must be 1.933/4 = 0.483 kg


The long run proportional saving of butter per nominal ½-kg package is

(0.491 − 0.483)/0.491 = 0.008/0.491 = 0.0163 or 1.63%
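The whole of this solution can be condensed into a few lines (an illustrative sketch; note that with unrounded intermediate values the saving comes out near 1.7% rather than the 1.63% obtained from the rounded settings):

```python
from math import sqrt

z = 1.645                                   # cuts off a 5% normal tail
sd = 0.010                                  # kg, per individual package
single = 0.475 + z * sd                     # setting for single 0.5-kg packs
four_pack = (1.9 + z * sd * sqrt(4)) / 4    # per-pack setting for 4-packs
saving = (single - four_pack) / single
print(round(single, 3), round(four_pack, 3), round(100 * saving, 2))
```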

2. (a) The weight of four adult passengers will be normally distributed with
mean of 4 x 75(= 300) kg and standard deviation of \/4 x 15(=30) kg. The
shaded area in figure 6.9 gives the probability that the plane is within its
maximum payload.
The standardised normal variate

u = (350 − 300)/30 = 1.67

[Figure 6.8 Child weight: normal, mean 23 kg] [Figure 6.9 Adult weight: normal,
mean 75 kg]



[Figure 6.10 Total weight of four adults: normal, mean 300, σ = 30; 350 marked]

Table 3* gives the unshaded area as 0.0475. The required answer is


1 — 0.0475 = 0.9525, say 0.95
With 40 kg of baggage, the mean weight is increased to 300+ 40 = 340 kg.

u = (350 − 340)/30 = 10/30 = 0.33
The probability of safe take-off now becomes 1 — 0.3707 = 0.63
(b) For four adults and one child the weight distribution is shown in
figure 6.11.

As before,

u = (350 − 323)/30.8 = 27/30.8 = 0.88

Thus probability of safe take-off = 1 − 0.1894 = 0.81

Figure 6.11

With 40 kg of baggage in addition, the mean becomes 323 + 40 = 363 kg and

u = (350 − 363)/30.8 = −13/30.8 = −0.42

Probability of safe take-off = 0.3372, say 0.34

Figure 6.12 (weight of 4 adults, 1 child and baggage; mean 363, σ = 30.8)
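All four take-off probabilities can be checked in the same way (this code is not in the original text; the child's standard deviation is taken as 7 kg, an assumption consistent with the combined σ = 30.8 used above):

```python
from math import sqrt
from statistics import NormalDist

PAYLOAD = 350.0  # kg

def p_safe(mean, sd):
    """Probability that the total passenger weight stays within the payload."""
    return NormalDist(mean, sd).cdf(PAYLOAD)

sd4 = sqrt(4) * 15                                 # four adults: 30 kg
p_4_adults = p_safe(4 * 75, sd4)                   # ≈ 0.95
p_4_adults_baggage = p_safe(4 * 75 + 40, sd4)      # ≈ 0.63

# child assumed: mean 23 kg, sd 7 kg, so sd of the total ≈ 30.8 kg
sd4c = sqrt(4 * 15**2 + 7**2)
p_4a_1c = p_safe(4 * 75 + 23, sd4c)                # ≈ 0.81
p_4a_1c_baggage = p_safe(4 * 75 + 23 + 40, sd4c)   # ≈ 0.34
```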



3. The mean clearance will be 25 − 12 − 3 − 3 = 7 mm.

The variance of the clearance will be 0.025² + 0.050² + 0.050² + 0.050² = 0.008125

and the standard deviation is 0.090 mm.

The distribution of clearance is shown in figure 6.13, the shaded area being
the required answer.

u = (7.2 − 7.0)/0.090 = 2.22

thus the probability that the clearance is less than 7.2 mm is 1 − 0.0132 = 0.987

Figure 6.13 (clearance; mean 7, σ = 0.090, shaded above 7.2)
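The tolerance stack-up can be verified directly (an illustrative sketch, not part of the original text):

```python
from math import sqrt
from statistics import NormalDist

mean_c = 25 - 12 - 3 - 3                        # 7 mm
sd_c = sqrt(0.025**2 + 3 * 0.050**2)            # ≈ 0.090 mm (variances add)
p_under = NormalDist(mean_c, sd_c).cdf(7.2)     # ≈ 0.987
```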

4. A cap will break if the applied force is greater than its breaking strength.
The mean excess of breaking strength is 12 − 8 = 4 units while the standard
deviation of the excess of breaking strength is √(1.6² + 1.2²) = √4.00 = 2.0.
When the excess of cap strength is less than zero the cap will break and the
proportion of caps doing so will be equal to the shaded area of figure 6.14, i.e.
the area below

u = (0 − 4)/2.0 = −2.0, giving an area of 0.0228,

about 2¼% of caps.

Figure 6.14 (excess of breaking strength; mean 4, shaded below 0)

5. The distribution of the total length of four rods will be normal with a mean
of 4 × 25 = 100 mm and standard deviation of √4 × 0.05 = 0.10 mm.
Ninety-nine per cent of all assemblies of four rods will have their overall
length within the range

100 ± 2.58 × 0.10 mm, i.e. 100 ± 0.26 mm

6. (a) Assuming that heights can be measured to very small fractions of a
metre, the required answer is equal to the shaded area in figure 6.15.

u = (1.80 − 1.65)/0.076 = 1.97 and area = 0.0244

Figure 6.15 (heights; mean 1.65, σ = 0.076, shaded above 1.80)

If heights, say, can only be measured to the nearest 5 mm, it would be
reasonable to say that any height actually greater than 1.795 m would be
recorded as 1.80 m or more. In this case u becomes

u = (1.795 − 1.65)/0.076 = 0.145/0.076 = 1.91

Proportion over 1.80 m tall = 0.0281.

(b) Average heights of 100 men at a time (selected randomly) will be
distributed normally with a mean of 1.65 m and standard deviation of
76/√100 = 7.6 mm.
Probability of a mean height of 1.68 m or more equals the shaded area in
figure 6.16.

u = (1.68 − 1.65)/0.0076 = 3.95

The shaded area is about 0.00004.

Figure 6.16 (mean of 100 heights; shaded above 1.68)

Possible alternative conclusions are that this particular sample is a very unusual
one or that the assumed mean height of 1.65 m is wrong (being an underestimate)
or that the standard deviation is actually higher than the assumed value of
76 mm.
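Both parts of this solution can be checked with a few lines of Python (not in the original text), which also shows how sharply the standard error shrinks for means of 100:

```python
from statistics import NormalDist

heights = NormalDist(1.65, 0.076)
p_tall = 1 - heights.cdf(1.80)            # ≈ 0.024
p_tall_rounded = 1 - heights.cdf(1.795)   # ≈ 0.028 (nearest-5-mm recording)

# means of 100: standard error = 76 / sqrt(100) = 7.6 mm
means = NormalDist(1.65, 0.076 / 100 ** 0.5)
p_mean_tall = 1 - means.cdf(1.68)         # ≈ 0.00004
```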

7. The standard deviation of each component part is 0.3/2.58. The standard
deviation of an assembly of the two parts will be (0.3/2.58)√2 about a mean of
66 + 44 = 110 mm.

Ninety-nine per cent of assemblies will lie within 110 ± 0.3√2 mm, i.e. within
110 ± 0.42 mm.

8. The distribution of clearance will have standard deviation of

√(0.100² + 0.075²) = 0.125 mm


(a) For 95% of assemblies to have clearance greater than 0.05 mm and
assuming normality of the distribution, the mean must be

0.05 + 1.645 x 0.125 = 0.256 mm


and thus the mean plug diameter must be less than 35 mm by this amount, i.e.
34.74 mm.

Figure 6.17 (clearance; mean 0.256, shaded above 0.375)

A clearance of 0.375 mm is equivalent to

u = (0.375 − 0.256)/0.125 = 0.95

Table 3* shows that the probability of exceeding a standardised normal variate
of 0.95 is 0.1711, i.e. approximately 17% of plugs would be too loose a fit.

(b) For 60% of assemblies to have clearance greater than 0.05 mm, the mean
clearance must be
0.05 + 0.253 x 0.125 = 0.082 mm

and the mean plug diameter must be less than 35 mm by this amount, i.e.
34.92 mm.

Figure 6.18 (clearance; mean 0.082, shaded above 0.375)

For a clearance of 0.375 mm,

u = (0.375 − 0.082)/0.125 = 2.34

corresponding to an upper tail area of 0.0096, i.e. less than 1% of clearances


would be too great.
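Both settings can be computed in one short function (an illustrative sketch, not part of the original text); `inv_cdf` supplies the 1.645 and 0.253 factors:

```python
from math import sqrt
from statistics import NormalDist

sd_clear = sqrt(0.100**2 + 0.075**2)   # 0.125 mm

def mean_clearance(p_above_min, min_clear=0.05):
    """Mean clearance giving P(clearance > min_clear) = p_above_min."""
    return min_clear + NormalDist().inv_cdf(p_above_min) * sd_clear

m95 = mean_clearance(0.95)   # ≈ 0.256 mm (plug diameter 34.74 mm)
m60 = mean_clearance(0.60)   # ≈ 0.082 mm (plug diameter 34.92 mm)

p_loose_95 = 1 - NormalDist(m95, sd_clear).cdf(0.375)   # ≈ 0.17
p_loose_60 = 1 - NormalDist(m60, sd_clear).cdf(0.375)   # ≈ 0.01
```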

9. A capacitor will fail if the maximum temperature to which it is subjected is
greater than its own threshold temperature.
The proportion of units for which the maximum applied temperature is
greater than the temperature the capacitor can resist is given by the shaded area
in figure 6.21.

Figure 6.19 (max. applied temperature y, mean 118)    Figure 6.20 (capacitor max. temperature x, mean 130)    Figure 6.21 (distribution of x − y, mean 12, shaded below 0)

For the distribution of excess of capacitor threshold temperature over
applied temperature,

the mean = 130 − 118 = 12 and variance = 3² + 5² = 34.


Since this distribution will be normal and any negative excess corresponds to
failure of the capacitor, we require the area below 0. The u-value equivalent to
this is

u = (0 − 12)/√34 = −2.06

giving a proportion of 0.0197.
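The same difference-of-normals argument takes two lines in Python (not in the original text):

```python
from math import sqrt
from statistics import NormalDist

# excess = capacitor threshold - applied temperature: mean 12, sd = sqrt(34)
excess = NormalDist(130 - 118, sqrt(3**2 + 5**2))
p_fail = excess.cdf(0)   # failure = negative excess, ≈ 0.0198
```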

10. The area covered by 5 litres of paint is normally distributed with mean and
standard deviation of 88 m² and 3 m², respectively. Thus the area covered by
40 × 5 litres of paint will also be normally distributed with mean of
40 × 88 (= 3520) m² and standard deviation of √40 × 3 (= 19.0) m².
The probability of covering less than 3500 m² is the probability of having
insufficient paint for the job (shaded area of figure 6.22).

Figure 6.22 (area covered by 40 × 5 litres of paint; mean 3520, σ = 3√40, shaded below 3500)

To find the shaded area

u = (3500 − 3520)/19.0 = −1.05

giving an answer of about 14.7%.
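A direct check (an illustrative sketch, not in the original text) gives essentially the same figure without rounding σ to 19.0:

```python
from math import sqrt
from statistics import NormalDist

total_area = NormalDist(40 * 88, sqrt(40) * 3)   # mean 3520, sd ≈ 18.97 m²
p_short = total_area.cdf(3500)                   # ≈ 0.146, i.e. about 15%
```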


11. (a) The distribution of time spent on a total of 15 calls will be approximately
normal (by the central limit theorem) with a mean of 15 × 30 (= 450) min and a
standard deviation of √15 × 6 (= 23.3) min.

Figure 6.23 (time for 15 visits; mean 450, σ = 6√15, shaded above 480)

The probability that 15 calls take longer than 8 h is represented by the shaded
area in figure 6.23.
480 min (8 h) corresponds to

u = (480 − 450)/23.3 = 1.29

The required probability is 0.0985.


(b) There may be differing interpretations about what is meant by ‘free’ time
in a week. ‘Free’ time for the salesman occurs on days when he works less than
8 h. The total of such time is found for five consecutive days, no account being
taken of any ‘overtime’ that has to be worked. The solution of such a problem is
quite difficult.
In this case, we shall consider ‘free’ time as the net amount by which his
actual working time is less than his scheduled working time.

Figure 6.24 (working time)

In 5 days, the number of calls to be made is 75. The distribution of total
time to make these calls is approximately normal (by the central limit theorem)
with a mean of 75 × 0.5 (= 37.5) h and standard deviation of √75 × 0.1
(= 0.866) h.
The salesman’s total working time will lie within 2.58 standard deviations of
the expected time for 75 calls with a probability of 99%, i.e. within

37.5 ± 2.58 × 0.1√75 = 37.5 ± 2.24 = 35.26 to 39.74 h

Since the scheduled working time for the week is 5 × 8 = 40 h, there is thus
only a small chance (1%) that his ‘free’ time in one week lies outside the range

0.26 to 4.74 h
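Both parts can be sketched numerically (this code is not in the original text; it uses the exact 2.576 factor rather than 2.58, so the limits differ from the text in the third figure):

```python
from math import sqrt
from statistics import NormalDist

# (a) 15 calls: mean 15 * 30 min, sd sqrt(15) * 6 min
p_over_8h = 1 - NormalDist(15 * 30, sqrt(15) * 6).cdf(480)   # ≈ 0.10

# (b) 75 calls in a 5-day (40 h) week, working in hours
week = NormalDist(75 * 0.5, sqrt(75) * 0.1)   # mean 37.5 h, sd ≈ 0.866 h
u = NormalDist().inv_cdf(0.995)               # ≈ 2.576
lo, hi = 37.5 - u * week.stdev, 37.5 + u * week.stdev
free_time = (40 - hi, 40 - lo)                # ≈ (0.27, 4.73) h
```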
12. Let the required number of deliveries be n. The time required for n
deliveries will be approximately normally distributed (central limit theorem)
with mean time of 30n min and standard deviation of 8/n min.

Figure 6.25 (time per delivery, mean 30 min)    Figure 6.26 (time for n deliveries, mean 30n, limit 600 min)

In order that there is only 1 chance in 1000 that n journeys take longer than
10 h (600 min), n must be such that

30n + 3.09 × 8√n ≤ 600
The largest value of n that satisfies the inequality can be found by systematic
trial and error. However, a more general approach is to solve the equality as a
quadratic in √n, taking the integral part of the admissible solution as the number
of deliveries to be scheduled.
Thus

30n + 24.72√n − 600 = 0


√n = [−24.72 ± √(24.72² + 72 000)]/60 = 4.0788 or −4.9028

n = 16.64 or 24.04
n = 24.04, corresponding to the negative root of the quadratic, is clearly
inadmissible since the average total journey time would be 12 h, violating the
probability condition.
The number of deliveries to be scheduled is therefore 16.
If 16 deliveries were scheduled, the probability of exceeding 10 h would
actually be less than 0.001—in fact about 1 in 10 000.
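The quadratic solution and the closing remark can both be checked in code (an illustrative sketch, not part of the original text):

```python
from math import sqrt
from statistics import NormalDist

# solve 30n + 3.09 * 8 * sqrt(n) = 600 as a quadratic in sqrt(n)
a, b, c = 30, 3.09 * 8, -600
root = (-b + sqrt(b**2 - 4 * a * c)) / (2 * a)   # admissible root for sqrt(n)
n_exact = root**2                                # ≈ 16.64
deliveries = int(n_exact)                        # 16

# check: chance that 16 deliveries take longer than 600 min
p_exceed = 1 - NormalDist(30 * 16, 8 * sqrt(16)).cdf(600)   # about 1 in 10 000
```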

6.5 Practical Laboratory Experiments and Demonstrations


The following experiment from Basic Statistics Laboratory Instruction Manual
demonstrates the basic concepts of the distribution of sample means and the
central limit theorem.

Appendix 1—Experiment 12

Sampling Distribution of Means and Central Limit Theorem


Number of persons: 2 or 3.

Object
To demonstrate that the distribution of the means of samples of size n, taken
from a rectangular population with standard deviation σ, tends towards the
normal with standard deviation σ/√n.

Method
From the green rod population M6/3 (rectangularly distributed with mean of
6.0 and standard deviation of 0.258), take 50 random samples each of size 4,
replacing the rods after each sample and mixing them before drawing the next
sample of 4 rods.
Measure the lengths of the rods in the sample and record them in table 33.

Analysis
1. Calculate, to 3 places of decimals, the means of the 50 samples and summarise
them into a grouped frequency distribution using table 34.
2. Also in table 34, calculate the mean and standard deviation of the sample
means and record these estimates along with those of other groups in table 35.
Observe how they vary amongst themselves around the theoretically
expected values.
3. In table 36, summarise the frequencies obtained by all groups and draw, on
page 57, the frequency histogram for the combined results. Observe the shape
of the histogram.

Table 6.1 (Table 33 of the laboratory manual): blank recording form for samples 1–50, with a column per sample number and rows for the four rod lengths x, their total and their average.

Table 6.2 (Table 34 of the laboratory manual): grouped frequency distribution form for the sample means, with columns for class interval, tally marks, frequency f, u, fu and fu², and rows for the totals of positive and negative terms and the net totals.

Calculation of Distribution Mean, x̄, and Standard Deviation, s

6.000 is the mid-point of the class denoted by u = 0
Class width = 0.075
The mean, x̄, of the distribution is given by

x̄ = 6.000 + 0.075 (Σfu/Σf)

The standard deviation, s, of the distribution is given by

s = 0.075 √{[Σfu² − (Σfu)²/Σf]/(Σf − 1)}

* Strictly the class intervals should read 5.5875–5.6625 and the next 5.6625–5.7375 etc.
but the present tabulation makes summarising simpler.
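The experiment can also be simulated rather than carried out with physical rods. The sketch below (not part of the original manual) uses a discrete rectangular population of nine lengths 5.6, 5.7, …, 6.4, which has mean 6.0 and standard deviation 0.258 like the rod population, and checks that means of samples of 4 have standard deviation near 0.258/√4 = 0.129:

```python
import random
from statistics import mean, pstdev

random.seed(1)

# discrete rectangular population: nine lengths 5.6 ... 6.4 (mean 6.0, sd 0.258)
population = [round(5.6 + 0.1 * i, 1) for i in range(9)]

# draw 5000 samples of size 4 (with replacement, as the rods are replaced)
sample_means = [mean(random.choices(population, k=4)) for _ in range(5000)]

sd_of_means = pstdev(sample_means)   # should be close to 0.129
```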

Figure 6.27. ‘Page 57’ of the laboratory manual.


7 Estimation and significance
testing (I)—‘large sample’ methods

7.1 Syllabus
Point and interval estimates; hypothesis testing; risks in sampling; tests for
means and proportions; sample sizes; practical significance; exact and approximate
tests.

7.2 Résumé of Theory

7.2.1 Point Estimators


A point estimator is a number obtained from a sample and used to estimate a
population parameter. For example, the average of a random sample is an
estimator of the mean of the population from which it came. The sample
median can also be used to estimate the mean of a symmetrical population as
can other sample statistics. There are certain statistically desirable properties
that point estimators should possess (unbiasedness, consistency, efficiency,
sufficiency) and which make one estimator better than another for a particular
purpose.
However, regardless of the estimator used, it is necessary to allow for
uncertainty due to sampling variation, i.e. the numerical value obtained from the
sample will not be exactly the same as the parameter value and an interval must
be defined within which we can be reasonably confident that the parameter lies.

7.2.2 Confidence Intervals


Two numbers are calculated to determine the ends of an interval within which
we can state that the population parameter lies. A probability is attached to the
calculated interval and signifies the confidence we have in stating that the
parameter actually falls within the interval. What this means is that if, say, a
95% confidence interval were calculated for each of a large number of individual
sample estimates, then in the long run 95 out of every 100 such intervals would
contain the parameter and five would not.

Confidence intervals are calculated from the sampling distribution of the
particular sample estimator being used.

7.2.3 Hypothesis Testing


In science, a theory is developed to ‘explain’ the occurrence of an observed
phenomenon. Further observations, usually coupled with deliberate experiments,
are made to test the theory. The theory will be accepted as an adequate model
until observations are made which it cannot satisfactorily ‘explain’. In this case
modification, or abandonment of the theory in favour of another one, is
necessary. This is the approach used in statistical hypothesis testing.
An hypothesis is set up concerning a population; for example, it may be a
statement about the value of one or more parameters of the population or
perhaps about its form, i.e. that it is normal or exponential, etc. Statistical
techniques are necessary to decide whether observations agree with such
hypotheses because variation, and hence uncertainty, is usually present.
A statistical hypothesis is usually of the null type. As examples, consider
the following.

(1) In testing whether a coin is biased, the hypothesis would be set up that it
was fair, i.e. the probability of a ‘head’ on one toss is 0.5.
(2) In testing the efficiency of a new drug, it would be assumed as a hypothesis
that it was no different in cure potential from the standard drug in current use.
(3) A new teaching method has been introduced; to assess whether it gives an
improvement in its end product compared with the previous method, the
hypothesis set up would be that it made no difference, i.e. that it was of the
same effectiveness.
(4) To determine whether an overall 100 k.p.h. speed limit on previously
unrestricted roads reduces accidents, the hypothesis would be set up that it
makes no difference. The same method would be used to assess the effect of
breathalyser tests.

7.2.4 Errors Involved in Hypothesis Testing


In deciding, on the basis of observed data subject to variation, whether or not
to accept a statistical hypothesis, two types of error may be made.

Type I error (the α risk)

This first kind of error is the risk of rejecting the original hypothesis when it
is, in fact, true. The risk is expressed in probability terms and is of magnitude α.

Type II error (the β risk)

This error is the risk of accepting (or better, failing to reject) the original
hypothesis when, in fact, it is false. As for α, the risk is expressed as a probability
of magnitude β.

7.2.5 Hypothesis (Significance) Testing


A test of a statistical hypothesis is a procedure for deciding whether or not to
accept the hypothesis. This decision is made by assessing the significance of the
observed results, i.e. are they so unlikely on the basis of the test hypothesis that
the latter must be rejected in favour of some alternative hypothesis?
For example, in (1) in section 7.2.3 on testing the bias of a coin, the test
would consist of counting the number of ‘heads’ in some convenient number of
tosses and calculating the probability that such a result could have been
obtained if the hypothesis was true.
When this probability has been calculated and it turns out to be very small,
two explanations are possible. Either the hypothesis is false or else a rare event
has occurred by chance.
It is customary to choose the first of these two alternatives when the
probability is below a given level. In fact it will be seen that this probability is
the risk of rejecting the hypothesis when in fact it is true. The levels of α are
arbitrary but conventional values used are

α = 0.05, α = 0.01 and α = 0.001

7.2.6 Sample Size


The magnitude of α can be fixed for a given test but β depends on the variability
of the basic variable, the extent to which the test hypothesis is false (if it is
false) and n, the sample size used. Generally the only one of these that can be
altered at will is the sample size, although there may be practical limitations of
time, cost, or feasibility restricting even this.
Nevertheless, it is useful to know the sample size required to achieve given
levels of α and β under given conditions of variability and parameter value of the
population.

7.2.7 Tests for Means and Proportions


This section deals with some of the standard tests for population means and
proportions. In both cases the following problems are considered.
I. Is the sample mean (or proportion) consistent with the value of the
population mean assumed under the test hypothesis?
II. Do two different sample means (or proportions) indicate a significant
difference in population means?
III. What is a confidence interval for the mean of a population?
IV. What is a confidence interval for the difference between the means of
two populations?

Table 7.1 summarises the requirements for tackling these four problems.
The notation used is:
Sample size n (n₁ and n₂ for problems II and IV)
Table 7.1

Problem   Variables                                       Attributes (for n large)

I         u = (x̄ − μ)/(σ/√n)                              u = (p − π)/√[π(1 − π)/n]

II        u = [(x̄₁ − x̄₂) − (μ₁ − μ₂)]/√(σ₁²/n₁ + σ₂²/n₂)  u = (p₁ − p₂)/√[p̄(1 − p̄)(1/n₁ + 1/n₂)]
          where the null hypothesis assumes μ₁ = μ₂;       where the null hypothesis is π₁ = π₂ and
          if σ₁² = σ₂² = σ² this reduces to                p̄ = (n₁p₁ + n₂p₂)/(n₁ + n₂) is the best
          u = (x̄₁ − x̄₂)/[σ√(1/n₁ + 1/n₂)]                  estimate of π₁ and π₂

III       100(1 − α)% confidence interval is              100(1 − α)% confidence interval is approximately
          x̄ ± u(α/2) σ/√n                                  p ± u(α/2) √[p(1 − p)/n]

IV        100(1 − α)% confidence interval is              100(1 − α)% confidence interval is approximately
          (x̄₁ − x̄₂) ± u(α/2) √(σ₁²/n₁ + σ₂²/n₂)            (p₁ − p₂) ± u(α/2) √[p₁(1 − p₁)/n₁ + p₂(1 − p₂)/n₂]

Variables    Population mean μ (μ₁ and μ₂ for problems II and IV)
             Population standard deviation σ (σ₁ and σ₂ for problems II and IV), assumed known
             Sample mean x̄ (x̄₁ and x̄₂ for II and IV)
Proportions  Population proportion π (π₁ and π₂ for problems II and IV)
             Sample proportion p (p₁ and p₂ for problems II and IV)

u is the standardised normal variate. In the case of variables, σ (and σ₁ and σ₂
as appropriate) is assumed to be known or calculated from a sample of size n
larger than about 30.

7.2.8 Practical Significance


Even if a significant result is obtained, i.e. an observed sample is so extreme that
the test hypothesis is no longer tenable, such statistical significance need not
mean that the result is of any practical significance. For example, if a particular
kind of light bulb has a mean life of 1220 h and a test of significance detects
that, after a fairly expensive modification of the lamp-making process the mean
life is increased, the modification is unlikely to be worth permanent incorporation
if the new mean life is only 1225 h, say.
In summary then, the decision made as a result of a significance test depends
on the possible consequences of that decision together with any other relevant
information that may be available.

7.2.9 Exact and Approximate Tests


This section was given the sub-heading of ‘Large Sample’ Methods which is a
common classification in the literature. Parts of chapter 8 refer to ‘small sample’
methods. The authors believe that a better classification would be into ‘exact’
and ‘approximate’ tests.
For example, in table 7.1, both of the tests using variables as well as the
confidence interval estimation are exact for any n, provided that the populations
involved are normal and that the variance σ² (and σ₁² and σ₂² as appropriate) is
known. The tests and intervals become approximate when the populations
involved are not normal but, because of the central limit theorem, the error
involved is very small for n larger than about 4.
Approximations are also introduced when σ is unknown but is estimated
from the sample data. In this case, provided n is greater than about 30, very
little error is involved. It is in cases where a test statistic is approximately
normally distributed (which is often the case for large n) that the description of
large sample methods is applied. Note, however, that the u-test can be exact for
any n under appropriate conditions.
For attributes, as mentioned in table 7.1, all the procedures are approximate
since they depend on the tendency of the binomial distribution towards

normality for large n (and preferably with π neither very small nor very large)—see chapter 5.

7.2.10 Interpretation of Significant Results


Great care must be taken in the interpretation of a significant result. If a sample
result is extreme (i.e. significantly different from expectation) the rules given
lead to rejection of the test hypothesis. However, it is possible to get such a
result when the hypothesis is true because the sample is not random but is
heavily biased. This bias may arise either in the initial selection of the sample or
in the subsequent extraction of numerical data from the members of the sample
or in other subtle ways.
For example, in a coin-tossing experiment, the null hypothesis of a 50%
probability of heads may be rejected by the sample evidence for a particular
coin, the conclusion being that the coin is biased. However, the true situation
could be that the coin is not biased but the method of tossing it (i.e. of
sampling) is biased. This possibility should be considered in the initial
design of the experiment so that such a mistake is not made in the final conclusion.
To repeat, there is more to statistics than knowing which formula to substitute
the numbers into—the relevance and validity of the numbers must be considered
and any interpretation of results closely matched to the circumstances of the
case.

7.2.11 Worked Examples


1. From long experience, a variable is known to be normally distributed with
standard deviation 6.0 about any given value of the mean, i.e. whatever the
current mean is, the variability about that mean is constant. A random sample
of 16 items from the population has a mean of 53.0. Is the current population
mean 50.0?
Set up the test hypothesis H₀: mean (i.e. E[x]) = 50 and the alternative
hypothesis H₁: E[x] ≠ 50.
The means of samples of size 16 will have standard error of 6/√16 = 1.5 and
thus the sample mean of 53 deviates from the overall assumed mean by

u = (53 − 50)/1.5 = 2 standard errors
Since this is a two-sided test (i.e. if the mean is not 50, there is no knowledge
that it must be larger), the result is significant at the 5% level as the observed
value of |u| is greater than 1.96. In fact, the actual significance level corresponds
to about 4.5%. If the consequences of wrong rejection of the hypothesis were
not very serious—as measured in terms of money, safety, inconvenience—then it
would be reasonable to reject the assumption that the mean of the population is
currently 50 units.
In general, having shown that there is some evidence (but not complete proof)
that the mean is not 50 units, the next question is—what is it?
The sample mean provides the best estimate of the population mean but an

allowance must be made for sampling fluctuations; this is done by using the
standard error of the sample mean to determine a confidence interval for the
population mean.
For 95% confidence, the interval (conventionally symmetric in tail
probability) is

x̄ − 1.96σ/√n and x̄ + 1.96σ/√n, i.e.
53 − 1.96 × 1.5 and 53 + 1.96 × 1.5
53 − 2.94 and 53 + 2.94
50.06 and 55.94, say 50.1 and 55.9
Notice that the interval does not include the previously assumed mean of 50.0.
In this respect, the two procedures (hypothesis testing and interval estimation)
are equivalent since the test hypothesis will be rejected at the 5% level of
significance if the observed sample mean is more than 1.96 standard errors on
either side of the assumed mean, and if this is the case the 95% confidence
interval cannot include the assumed mean. This argument applies in the two-
sided case for any significance level a and associated confidence probability
(1—a).
Also note, that in this example, the standard deviation was known and the
test and confidence interval estimation was perfectly valid for any size of
sample.
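The u-test and the matching confidence interval take only a few lines in Python (an illustrative sketch, not part of the original text):

```python
from statistics import NormalDist

sigma, n = 6.0, 16
se = sigma / n ** 0.5                        # standard error = 1.5
u = (53.0 - 50.0) / se                       # 2.0 standard errors
p_two_sided = 2 * (1 - NormalDist().cdf(u))  # ≈ 0.045, i.e. about 4.5%

ci_95 = (53 - 1.96 * se, 53 + 1.96 * se)     # ≈ (50.06, 55.94)
```

Note that the interval excludes 50.0 exactly when |u| > 1.96, which is the equivalence between the test and the interval described above.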

2. A synthesis of pre-determined motion times gives a nominal time of 0.10


min for the operation of piecing-up on a ring frame, compiled after analysis of
the standard method. 160 observed readings of the piecing-up element had an
average of 0.103 min and standard deviation of 0.009 min. Is the observed
time really different from the nominal time?
Here, the population σ is not known but an estimate based on 160 (random)
readings will be satisfactory.
Set up H₀: real mean element time μ = 0.100 min; H₁: real mean element
time μ ≠ 0.100 min.

u = (0.103 − 0.100)/(0.009/√160) = 0.003/0.000712 = 4.2

This is significant at the 1 in 1000 level (|u| > 3.29), the actual type I error being
less than 6 parts in 100 000 (table 3*).
Ninety-nine per cent confidence limits for the real mean piecing-up time
under the conditions applying during the sampling of the 160 readings are

0.103 ± 2.58 × 0.009/√160 = 0.103 ± 0.00184, i.e.


0.1012 to 0.1048 min

Thus, the evidence suggests that the synthesis of the mean operation time
tends to underestimate the actual time by something between 1% and 5%.
Whether this is of any practical importance depends on what use is going to be
made of the synthetic time. Perhaps the method of synthesising the time may be
worth review in order to bring it into line with reality.
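A quick check of this example (not in the original text):

```python
from math import sqrt
from statistics import NormalDist

n, xbar, s = 160, 0.103, 0.009
se = s / sqrt(n)                               # ≈ 0.000712
u = (xbar - 0.100) / se                        # ≈ 4.2
ci_99 = (xbar - 2.58 * se, xbar + 2.58 * se)   # ≈ (0.1012, 0.1048) min
```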
3. In special trials carried out on two furnaces each given a different basic mix,
furnace A in 200 trials gave an average output time of 7.10 h while 100 trials
with furnace B gave an average output time of 7.15 h.
Given that, from previous records, the variance of furnace A is 0.09 h? and
of B is 0.07 h? and an assurance that these variances did not change during the
trials, is furnace A more efficient than B?
First of all, set up the test hypothesis that there is no difference in furnace
efficiencies (i.e. average output times). The test is two-sided since there is no
reason to suppose that if one is more efficient then it is known which one it will
be.
Set up

H₀: μ_A = μ_B, i.e. μ_A − μ_B = 0

H₁: μ_A − μ_B ≠ 0

The appropriate test statistic is

u = [(x̄_A − x̄_B) − (μ_A − μ_B)]/√(σ_A²/n_A + σ_B²/n_B)

which becomes, on substituting the observed data and the test assumption
regarding (μ_A − μ_B),

u = [(7.10 − 7.15) − 0]/√(0.09/200 + 0.07/100) = −0.05/0.034 = −1.47

Since this is numerically less than 1.96, or any higher value of u corresponding
to a smaller α, the difference in mean output times has not been shown to be
statistically significant at any reasonable type I error level.

Note: Even if a very highly significant value of u had been obtained (say
|u| > 4.0) then the question could still not have been answered because of the
way the trials had been set up. The two furnaces may have been different in
mean output times (efficiencies) but because different basic mixes had always
been used in the furnaces, it is not apparent how much of the efficiency
difference was due to the different mixes and how much was due to inherent
properties of the furnaces (including, perhaps, the crews who operate them). To
determine whether the mix differences, furnace differences or a combination of

both are responsible for differing mean output times would require a properly
designed experiment (this experiment is not designed to answer the question
posed).
In addition, it was assumed that the variances of the output times would be
unchanged during the special trials. This may often be a questionable assumption
and is unnecessary in this example since the sample variances of the 200 and
100 trials respectively could be substituted for 04 and of with very little effect
on the significance test.
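The two-sample u-statistic for the furnace trials (an illustrative sketch, not part of the original text):

```python
from math import sqrt

# furnace A: 200 trials, mean 7.10 h, variance 0.09 h^2
# furnace B: 100 trials, mean 7.15 h, variance 0.07 h^2
u = (7.10 - 7.15) / sqrt(0.09 / 200 + 0.07 / 100)   # ≈ -1.47
significant_at_5pc = abs(u) > 1.96                  # False: not significant
```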

4. In a given year for a random sample of 1000 farms with approximately


the same area given to wheat, the average yield of wheat per hectare (10 000 m?)
was 2000 kg, with standard deviation of 192 kg/ha. The following year, for a
random sample of 2000 farms, the average was 2020 kg/ha, with standard
deviation of 224 kg/ha. Does the second year show an increased yield?
In this case, because of the large samples, each of them greater than about 30,
the sample variances can be used instead of the unknown population variances.
Set up H₀: no difference in mean yield per hectare, i.e. μ₁ − μ₂ = 0
H₁: mean yields per hectare differ between the years, i.e. μ₁ − μ₂ ≠ 0

u = [(x̄₁ − x̄₂) − (μ₁ − μ₂)]/√(s₁²/n₁ + s₂²/n₂) = [(2000 − 2020) − 0]/√(192²/1000 + 224²/2000)
  = −20/√(36.9 + 25.1) = −20/7.87 = −2.54

This is almost significant at the 1% level and suggests that the mean yield for the
whole population of farms is greater in the second year.
As a word of warning, such a conclusion may not really be valid since the two
samples may not cover in the same way the whole range of climatic conditions,
soil fertility, farming methods, etc. The significant result may be due as much
to the samples’ being non-representative as to a real change in mean yield for the
whole population. The extent of each would be impossible to determine without
proper design of the survey. There are many methods of overcoming this, one
of which would be to choose a representative sample of farms and use the same
farms in both years.
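The same calculation in code, with the two-sided probability added (not in the original text):

```python
from math import sqrt
from statistics import NormalDist

# year 1: 1000 farms, mean 2000 kg/ha, sd 192; year 2: 2000 farms, mean 2020, sd 224
u = (2000 - 2020) / sqrt(192**2 / 1000 + 224**2 / 2000)   # ≈ -2.54
p_two_sided = 2 * NormalDist().cdf(u)                     # ≈ 0.011
```

The two-sided probability of about 1.1% is what "almost significant at the 1% level" refers to.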
5. A further test of the types illustrated in examples 3 and 4 can be made
when the population variances are unknown but there is a strong a priori
suggestion that they are equal. In this case, the two sample variances can be
pooled to make the test more efficient, i.e. to reduce β for given α and total
sample size (n₁ + n₂).
A group of boys and girls were given an intelligence test by a personnel
officer. The mean scores, standard deviations and the numbers of each sex

are given in table 7.2. Were the boys significantly more intelligent than the
girls?

                     Boys    Girls

Mean score            124      121
Standard deviation     11       10
Number                 72       50

Table 7.2

The question as stated is trivial. If the test really does measure that which is
termed ‘intelligence’, then on average that group of boys was more intelligent
than that group of girls, although as a group they were more variable than the
girls.
However, if the boys are a random sample from some defined population of
boys and similarly for the girls, then any difference in average intelligence
between the populations may be tested for.
Assuming that there is a valid reason for saying that the two populations have
the same variances, the two sample variances can be pooled by taking a weighted
average using the sample sizes as weights (strictly the degrees of freedom—see
chapter 8—are used as weights but this depends on whether the degrees of
freedom were used in calculating the quoted standard deviations; in any case,
since the sample sizes here are large the error introduced will be negligible).
Pooled estimate of variance of individual scores

s̄² = [(72 × 11²) + (50 × 10²)]/(72 + 50) = 112

(Note: The variances are pooled, not the standard deviations.)

u = [(x̄_B − x̄_G) − (μ_B − μ_G)]/√[s̄²(1/n_B + 1/n_G)]

where s̄² is the pooled variance

u = [(124 − 121) − 0]/√[112(1/72 + 1/50)] = 3/√(112 × 122/3600) = (3 × 60)/√(112 × 122) = 180/117 = 1.54

Thus there is no evidence that the populations of boys and girls differ in average
intelligence. This conclusion does not mean that there is not a difference,
merely that if there is one we have not sufficient evidence to demonstrate it,
and even if we did, it may be so small as to be of no practical importance at all.
Confidence limits for the difference between two population means can be
set in the same way as in examples (1) and (2) above.
Estimation and Significance Testing (I) 155

Thus 95% confidence limits for the difference in mean intelligence are given
by

(x̄_B − x̄_G) ± 1.96 √(s_B²/n_B + s_G²/n_G)

or, using the pooled variance, by

(x̄_B − x̄_G) ± 1.96 √[s²(1/n_B + 1/n_G)] = (124 − 121) ± 1.96 √[112(1/72 + 1/50)]
                                          = 3 ± 3.82, i.e. −0.82 to 6.82
including the null value, 0, as it must from the significance test.
Note: The use of 1.96 instead of 2.0 is somewhat pedantic in practical terms;
it is retained in this chapter to serve as a reminder that the appropriate u-factor
is found from the tables* of the normal distribution in conjunction with the
choice of a.
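The pooled-variance calculation of example 5 is mechanical enough to script. The following Python sketch (the function name is our own, not from the text) reproduces the u statistic for the data of table 7.2:

```python
import math

def pooled_two_sample_u(mean1, mean2, sd1, sd2, n1, n2):
    # Pool the variances with the sample sizes as weights, as in the text
    # (with samples this large, the degrees-of-freedom refinement is negligible).
    s2 = (n1 * sd1 ** 2 + n2 * sd2 ** 2) / (n1 + n2)
    se = math.sqrt(s2 * (1 / n1 + 1 / n2))  # standard error of the difference
    return (mean1 - mean2) / se

u = pooled_two_sample_u(124, 121, 11, 10, 72, 50)  # boys vs girls, table 7.2
print(round(u, 2))  # about 1.54, well short of the 1.96 needed at the 5% level
```

The pooled estimate (about 112) is what makes the standard error slightly smaller than using the two sample variances separately.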

Examples Concerning Proportions


6. A programmed learning course has been introduced to train operators for a
precision job in a company. Ten per cent of the operators trained by the previous
method were found to be unsuitable for the job. Of 100 operators trained by the
new method, eight were not suitable. Is the new method better than the old?
Set up the null hypothesis that both methods are the same in their effect and
hence have the same ‘failure’ rate, i.e.

H₀: π = 0.10    H₁: π ≠ 0.10


π is the probability that any one operator will not benefit from the course and
is assumed constant for all operators.
The sample proportion of operators, p, who do not benefit from the course
will be binomially distributed with mean of π and standard error of
√[π(1 − π)/n].
For large n, this binomial distribution can be approximated by the normal
distribution with the same parameters.
Thus,

u = (p − π)/√[π(1 − π)/n] = (0.08 − 0.10)/√(0.1 × 0.9/100) = −0.02/0.03 = −0.67
Since this is not at all a low probability result, there is no evidence that the
new method is any more or less effective than the previous one.
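The same arithmetic, sketched in Python (assuming, as the text does, the normal approximation with the hypothesised rate used in the standard error):

```python
import math

def proportion_u(p_obs, pi0, n):
    # u statistic for a sample proportion against a hypothesised rate pi0;
    # the standard error uses pi0, since it is computed under H0.
    se = math.sqrt(pi0 * (1 - pi0) / n)
    return (p_obs - pi0) / se

u = proportion_u(8 / 100, 0.10, 100)  # 8 unsuitable operators out of 100
print(round(u, 2))  # -0.67
```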

7. A certain type of seed is supposed to have a germination rate of 80%. If 50
seeds are tested and 14 fail to germinate, does this mean that the batch from
which they came is below specification?
Set up
H₀: π = 0.80    H₁: π ≠ 0.80

Use the normal approximation (p = 36/50 = 0.72) to give

u = (p − π)/√[π(1 − π)/n] = (0.72 − 0.80)/√(0.8 × 0.2/50) = −0.08/0.0566 = −1.41

This is not numerically large enough to reject the test hypothesis—the type I
error would correspond to just under 16%.
A slight improvement can be made in the adequacy of the normal approximation
by making the so-called correction for continuity. However, with the large
sample sizes generally required for use of the normality condition, this refinement
will not usually be worth incorporating. It is given here as an example.

36 or fewer germinating seeds can be considered as 36.5 or fewer on a continuous
scale. 36.5 corresponds to 0.73 as a proportion of 50 and the corrected value for
u becomes

u = (0.73 − 0.80)/√(0.8 × 0.2/50) = −0.07/0.0566 = −1.24
The type I error corresponding to such a value of u (two tails) is about 21.5%,
far too high a risk for rejection of the hypothesis to be contemplated.
Both this example and example (6) could have been done using the number
of occurrences rather than the proportion of occurrences in a sample. The
approaches are identical but for setting confidence limits, the proportion method
is better.
Standardising the number of ‘successes’ x in n trials gives

u = (x − nπ)/√[nπ(1 − π)]

which on dividing top and bottom by n gives

u = (p − π)/√[π(1 − π)/n]
The exact test can be carried out for this example since the appropriate

parameters are tabulated in table 1* (cumulative binomial probabilities).


For an assumed germination rate of 80%, the expected (mean) number of
seeds germinating out of 50 tested is 50 x 0.8 = 40. Because of the method of
tabulating (i.e. for proportions < 0.50), the problem is best discussed in terms
of seeds failing to germinate, the expected number being 50 x 0.2 = 10.
The probability of 14 or more failing to germinate is 0.1106 and the
probability of six or fewer failing to germinate is 1 − 0.8966 = 0.1034, i.e. a total
probability (magnitude of type I error) of 21.40%, which compares favourably
with the refined normal approximation.
As a further point, if a 5% significance level is specified for this problem
(two-sided test since the true germination rate could be above or below 80%),
using table 1* with n = 50 and π (tabulated as p) = 0.20, the acceptance region
for failed seeds is from 5 up to 15 inclusive, with the critical region split as
nearly equally as possible between the two tails (1.85% in the lower tail and
3.08% in the upper tail).
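For readers without table 1*, the exact tail probabilities quoted above can be recomputed directly from the binomial formula; a minimal sketch in Python:

```python
from math import comb

def binom_pmf(k, n, p):
    # P(exactly k 'successes' in n independent trials)
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 50, 0.20                      # 50 seeds, hypothesised 20% failure rate
p_upper = sum(binom_pmf(k, n, p) for k in range(14, n + 1))  # 14 or more fail
p_lower = sum(binom_pmf(k, n, p) for k in range(0, 7))       # 6 or fewer fail
print(round(p_upper, 3), round(p_lower, 3))  # close to the tabulated 0.1106 and 0.1034
```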
Approximate confidence limits for the seed population germination rate
are found as

95%:  p ± 1.96 √[p(1 − p)/n] = 0.72 ± 1.96 √(0.72 × 0.28/50) = 59.6% to 84.4%

99%:  p ± 2.58 √[p(1 − p)/n] = 0.72 ± 2.58 √(0.72 × 0.28/50) = 55.6% to 88.4%


As mentioned earlier, these confidence limits are approximate because of the
use of the normal distribution and because of the substitution of the sample
proportion, p, in place of the population proportion π in the expression for
the standard error of p.
Note: The standard error, and hence the size of the confidence interval,
depends mainly on the actual size of the sample and not, for practical purposes,
on the proportion which the sample is of the population. The latter usually
only becomes important when it is about 20% or more. In such a case, the
formula used in the example overestimates the standard error a little bit, i.e. the
probability associated with the calculated interval is a little higher than stated.
Thus in this example, the 50 seeds provide the same information about the
overall germination rate whether they were taken randomly from a batch of
1000 seeds or a batch of 1 000 000 seeds (or any other large number).

8. As an extension of the previous example, suppose two seedsmen, A and B,


each produce large quantities of nominally the same type of seed. Under
standard test conditions, out of 200 seeds from A, 180 germinate, whilst
255 germinate out of 300 from B. Has A a better germination rate than B?

Set up the null hypothesis that both germination rates are the same, i.e.

H₀: π_A = π_B    H₁: π_A ≠ π_B

An approximately normal test statistic can be set up (see summary table 7.1) as

u = [(p_A − p_B) − (π_A − π_B)] / √[π_A(1 − π_A)/n_A + π_B(1 − π_B)/n_B]

Under the null hypothesis, π_A = π_B = some value π, say, and the test statistic
becomes

u = (p_A − p_B) / √[π(1 − π)(1/n_A + 1/n_B)]

The actual value of π, however, is unknown and it is usual to replace it by its
pooled sample estimate, p, obtained as a weighted average of the two sample
proportions p_A and p_B, the sample sizes being the weights.
Thus

p = (n_A p_A + n_B p_B)/(n_A + n_B) = (180 + 255)/500 = 435/500 = 0.87

u = (0.90 − 0.85) / √[0.87 × 0.13 (1/200 + 1/300)] = 0.05/0.0307 ≈ 1.63
Since this value does not exceed 1.96, numerically, there is no evidence at the
5% level of a difference between the seeds of A and B as far as germination rate
is concerned.
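A sketch of this pooled two-proportion test in Python (the function name is ours):

```python
import math

def two_proportion_u(x1, n1, x2, n2):
    # Pooled estimate of the common rate under H0, then the u statistic
    # for the difference of the two sample proportions.
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

u = two_proportion_u(180, 200, 255, 300)  # seedsmen A and B
print(round(u, 2))  # about 1.63, below the 1.96 critical value
```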

9. Example (1) of this section was concerned with a normal variable with
standard deviation of 6.0 units, this being assumed constant whatever the mean
of the population. The null hypothesis was set up that the mean was 50.0 units.
(a) If a two-sided test of this hypothesis is carried out at the 1% level of
significance, what will be the type II error, if a sample of size 16 is taken and the
population mean is actually equal to

(i) 51.0 units?


(ii) 53.0 units?

(b) What size of sample would be necessary to reject the test hypothesis
with probability 90% when the population mean is actually 48.0 units? The
significance level (type I error) remains at 1%.

(a) (i) Figure 7.1 shows the essence of the solution. The solid distribution
is how x̄ is assumed to be distributed under the null hypothesis, the critical

Figure 7.1

region being given by the shaded area in its two tails. The boundaries of the
acceptance region for a 1% significance level are at

50 ± 2.58 × 6/√16 = 46.13 and 53.87

The dotted distribution shows how x̄ is actually distributed. If the observed


sample mean falls in the acceptance region, the null hypothesis would not be
rejected and a type II error would be committed. The shaded area shown dotted
is the probability of making such an error and to find it, we need
u = (53.87 − 51.0)/(6/√16) = 2.87/1.5 = +1.91

and

u = (46.13 − 51.0)/(6/√16) = −4.87/1.5 = −3.25

The tail areas corresponding to these values are 0.0281 and 0.0006
approximately.
The type II error is therefore equal to

1 − (0.0281 + 0.0006) = 0.9713

(ii) The solution to this part is the same as that for part (i) except that the
actual distribution of the sample mean will be centred around 53.0.
The values of u corresponding to the limits of the acceptance region are

u = (53.87 − 53.0)/1.5 = 0.58

and

u = (46.13 − 53.0)/1.5 = −4.58

The type II error is therefore given by


1 —(0.2810+ 0.0000) = 0.7190
(b) Here the risks are fixed for specific values of the population mean; the
sample size, n, is to be found. The requirements are shown in figure 7.2, in which
x̄₁* and x̄₂* are the lower and upper boundaries of the acceptance region. Half per
cent of the sample mean distribution assumed under the null hypothesis will
lie outside each of these boundaries (1% type I error with a two-sided
alternative).

Figure 7.2

The dotted distribution shows how the means of samples of size n will be
distributed when the population average is actually 48.0 units. The extreme
part of the right-hand tail of this distribution will lie above x̄₂* but it will be
such a minute proportion in this case as to be negligible.
The following equations can be set up.

x̄₁* = 48.0 + 1.28 σ/√n                (7.1)

x̄₁* = 50.0 − 2.58 σ/√n                (7.2)

Subtracting one equation from the other leads to

(1.28 + 2.58) σ/√n = 50.0 − 48.0

or

n = [(1.28 + 2.58) × 6/2]² = 11.58² ≈ 134

The critical values of x̄ are thus

50.0 ± 2.58 × 6/√134, i.e. 48.66 and 51.34

Part (b) of this example postulated the requirement that if the mean is 48.0
units (or more generally, if it differs from the test value by more than 2.0 units),
the chance of detecting such a difference should be 90%. This requirement

would have been determined by the practical aspects of the problem. However,
if the actual population mean were less than 48.0 (or bigger than 52.0), the
probability of committing a type II error with a sample size of 134 would be less
than 10%; and if the population mean were actually between 48.0 and 50.0, this
probability would be greater than 10%.
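Both parts of example 9 can be checked numerically; the sketch below uses only the standard normal c.d.f. (via math.erf) and the u-factors already quoted in the text:

```python
import math

def phi(x):
    # standard normal cumulative distribution function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

sigma, n, mu0 = 6.0, 16, 50.0
se = sigma / math.sqrt(n)                    # 1.5
lo, hi = mu0 - 2.58 * se, mu0 + 2.58 * se    # acceptance region 46.13 to 53.87

def type2_error(mu_true):
    # probability that the sample mean falls inside the acceptance region
    return phi((hi - mu_true) / se) - phi((lo - mu_true) / se)

print(round(type2_error(51.0), 2))  # about 0.97
print(round(type2_error(53.0), 3))  # about 0.719
# part (b): sample size for 90% power at mu = 48.0 (1% two-sided test)
n_req = ((1.28 + 2.58) * sigma / 2.0) ** 2
print(round(n_req))  # 134
```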

10. What is the smallest random sample of seeds necessary for it to be asserted,
with a probability of at least 0.95, that the observed sample germination
proportion deviates from the population germination rate by less than 0.03?
The standard error of a sample proportion is √[π(1 − π)/n] where π is the
population proportion and n the sample size. Assuming that n will be large
enough for the approximation of normality to apply reasonably well to the
distribution of p, the problem requires that

1.96 √[π(1 − π)/n] ≤ 0.03

giving

n ≥ (1.96/0.03)² π(1 − π)

π, the quantity to be estimated, is unknown (if it were known, there would be no
need to estimate it) and this creates a slight difficulty in determining n. However,
π(1 − π) takes its maximum value of ¼ when π = ½ and

n = (1.96/0.03)² × ¼ ≈ 1067
would certainly satisfy the conditions of the problem (whatever the value of π).
Alternatively if an estimate is available of the likely value of 7, this can be
used instead of 7 as an approximation. Such an estimate may come from previous
experience of the population or perhaps from a pilot random sample; the pilot
sample estimate can be used to determine the total size necessary. If the pilot
sample is at least as big as this, no further sampling is needed. If it was not, the
extra number of observations required can be found approximately. If such
extra sampling is not possible for some reason (too costly, not enough time),
the confidence probabilities of types I and II errors will be modified (adversely).
For this example, if the seed population germination rate is usually about
80%, then the required value of sample size for at most a deviation of 0.03
(i.e. 3%) with probability of 0.95 is
n = (1.96/0.03)² × 0.8 × 0.2 ≈ 683

(c.f. 1067 before).
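The two sample-size calculations can be sketched as follows, rounding up to the next whole observation so that the requirement is guaranteed:

```python
import math

def sample_size(u_factor, d, pi):
    # smallest n with u_factor * sqrt(pi(1-pi)/n) <= d, rounded up
    return math.ceil((u_factor / d) ** 2 * pi * (1 - pi))

print(sample_size(1.96, 0.03, 0.5))  # 1068: worst case, pi completely unknown
print(sample_size(1.96, 0.03, 0.8))  # 683: using the prior estimate pi = 0.8
```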



7.3 Problems for Solution


1. In production of a tinned product, records show that the standard deviation
of filled weights is 0.025 kg. A sample of six tins gave the following weights:
1.04, 0.97, 0.99, 1.00, 1.02, 1.01 kg.
(a) If the process is required to give an average weight of 1.00 kg does the
filling machine require re-setting?
(b) Determine confidence limits for the actual process average.
2. In a dice game, if an odd number appears you pay your opponent 1p and if
an even number turns up, you receive 1p from him. If, after 200 throws, you
are losing 50p and the dice are your opponent’s, would you be justified in
feeling cheated?
3. A company, to determine the utilisation of one of its machines, makes
random spot checks to find out for what proportion of time the machine is in
use. It is found to be in use during (a) 49 out of 100 checks, and (b) 280 out of
600 checks.
Find out in each case, the percentage time the machine is in use, stating the
confidence limits.
How many random spot checks would have to be made to be able to estimate
the machine utilisation to within + 2%?
4. In a straight election contest between two candidates, a survey poll of 2000
gave 1100 supporting candidate A. Assuming sample opinion to represent
performance at the election, will candidate A be elected?
5. In connection with its marketing policy, a firm plans a market research
survey in a country area and another survey in a town. A random sample of the
people living in the areas is interviewed and one question they are asked is
whether or not they use a product of the firm concerned. The results of this
question are:

Town: Sample size = 2000, no. of users = 180


Country: Sample size = 2000, no. of users = 200

Does this result show that the firm’s product is used more in the country than in
town?

6. In a factory, sub-assemblies are supplied by two sub-contractors. Over a


period, a random sample of 200 from supplier A was 5% defective, while a
sample of 300 from supplier B was 3% defective.
Does this signify that supplier B is better than supplier A?
A further sample of 400 items from B contained eight defective sub-assemblies.
What is the position now?

7. If men’s heights are normally distributed with mean of 1.73 m and standard

deviation of 0.076 m and women’s heights are normally distributed with mean
of 1.65 m and standard deviation of 0.064 m, and if, in a random sample of 100
married couples, 0.05 m was the average value of the difference between
husband’s height and wife’s height, is the choice of partner in marriage influenced
by consideration of height?
8. For the data of problem (3) (page 46), chapter 2, estimate 99% confidence
limits for the mean time interval between customer arrivals. Also find the
number of observations necessary to estimate the mean time to within 0.2 min.

9. An investigation of the relative merits of two kinds of electric battery showed


that a random sample of 100 batteries of brand A had an average lifetime of
24.2 h, with a standard deviation of 1.8 h, while a random sample of 80 batteries
of brand B had an average lifetime of 24.5 h with a standard deviation of 1.5 h.
Use a significance level of 0.01 to test whether the observed difference between
the two average lifetimes is significant.
10. Two chemists, A and B, each perform independent repeat analyses on a
homogeneous mixture to estimate the percentage of a given constituent.
The repeatability of measurement has a standard deviation of 0.1% and is
the same for each analyst. Four determinations by A have a mean of 28.4% and
five readings by B have a mean of 28.2%.
(a) Is there a systematic difference between the analysts?
(b) If each analyst carries out the same number of observations as the other,
what should this number be in order to detect a systematic difference between
the analysts of 0.3% with probability of at least 99%, the level of significance
being 1%?

7.4 Solutions to Problems


1. The observed sample mean is x̄ = (1.04 + 0.97 + 0.99 + 1.00 + 1.02 + 1.01)/6
= 1.005 kg
(a) Assuming the mean net weight of individual cans is 1.00 kg, i.e.

H₀: E[x̄] = μ₀ = 1.00 kg    H₁: E[x̄] ≠ 1.00 kg

then

u = (x̄ − μ₀)/(σ/√n) = (1.005 − 1.000)/(0.025/√6) = 0.49
The probability of such a deviation is about 62% and so there is no real
evidence that the process average is not 1.00 kg, i.e. the sample data are quite

consistent with a setting of 1.00 kg, although a type II error could be committed
in deciding not to re-set the process.
(b) Confidence limits for the actual current process average are, for two levels
of confidence

95%: 1.005 ± 1.96 × 0.025/√6 = 1.005 ± 0.020 = 0.985 and 1.025 kg

99%: 1.005 ± 2.58 × 0.025/√6 = 1.005 ± 0.026 = 0.979 and 1.031 kg

2. Losing 50p in 200 throws means that there must have been 125 odd numbers
(losing results) and 75 even numbers (winners) in 200 throws. Set up the null
hypothesis that the dice is unbiased.

H₀: π = 0.5 (π = the probability of an odd number)    H₁: π ≠ 0.5

The total number of odd numbers will be binomially distributed and since
n = 200 and π = ½ we know that, making the correction for continuity,

u = (x − ½ − nπ)/√[nπ(1 − π)] = (125 − 0.5 − 100)/√(200 × ½ × ½) = 24.5/7.07 = 3.46

The probability of such a deviation is certainly less than 0.0007 and it therefore
seems likely that the dice is biased towards odd numbers.
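The continuity-corrected calculation, sketched in Python:

```python
import math

n, pi0 = 200, 0.5
x = 125                                  # odd numbers thrown (losing results)
sd = math.sqrt(n * pi0 * (1 - pi0))      # binomial standard deviation, about 7.07
u = (x - 0.5 - n * pi0) / sd             # continuity correction: 125 -> 124.5
print(round(u, 2))  # 3.46
```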

3. 95% confidence limits for the proportional utilisation of the machine are
approximately

p ± 1.96 √[p(1 − p)/n]

which gives

(a) 0.49 ± 1.96 √(0.49 × 0.51/100) = 0.49 ± 0.098 = 39.2 to 58.8%

and

(b) 280/600 ± 1.96 √[(280/600)(320/600)/600] = 0.467 ± 0.04 = 42.7 to 50.7%
Note: The standard error has been reduced by a factor of approximately
√6, the square root of the ratio of the two sample sizes.

Also, since π is near to 0.5, for 95% confidence estimation, the required number
of spot checks is given by

1.96 √(0.5 × 0.5/n) = 0.02, i.e. n = 98² × ¼ = 2401

For a 99% confidence interval of width (2 × 0.02), the required n is found
from

2.58 √(0.5 × 0.5/n) = 0.02, i.e. n = 129² × ¼ = 4160

4. 99% confidence limits for the population proportional support for candidate
A are

1100/2000 ± 2.58 √[(1100/2000)(900/2000)/2000] = 0.55 ± 0.029, i.e. 52.1% to 57.9%
Thus it is virtually certain that candidate A will be elected.

5. Assume that there is no difference in the proportion of people using the
product either in the country (π_C) or in the town (π_T).
H₀: π_C = π_T    H₁: π_C ≠ π_T
The best estimate under H₀ of the common usage rate is

p = (200 + 180)/(2000 + 2000) = 0.095

Then

u = [(0.10 − 0.09) − 0] / √[0.095 × 0.905 (1/2000 + 1/2000)] = 1.08
There is no evidence that the proportion of people in the country area using the
product is any different from that in the town.
6. Assume that the percentage of sub-assemblies which are defective is the
same in the long run for both suppliers.
Thus
H₀: π_A − π_B = 0    H₁: π_A − π_B ≠ 0

Assuming H₀ to be true, the best estimate of each supplier's defective proportion
is

p = (200 × 0.05 + 300 × 0.03)/(200 + 300) = 19/500 = 0.038

Thus

u = [(0.05 − 0.03) − 0] / √[0.038 × 0.962 (1/200 + 1/300)] = 0.02/0.0175 ≈ 1.15
There is no evidence of a difference between the suppliers.
With the additional evidence, assuming that the underlying conditions remain
unchanged, the test may be carried out again
H₀: π_A − π_B = 0    H₁: π_A − π_B ≠ 0

Pooling all the information,

p = (10 + 9 + 8)/(200 + 300 + 400) = 27/900 = 0.03

u = (10/200 − 17/700) / √[0.03 × 0.97 (1/200 + 1/700)] = 1.88
This value of u nearly reaches its critical value for a 5% (two-sided) level of
significance; the actual level is about 6%. There is thus some suspicion that B is
better than A but what action is taken depends on the consequences of the
possible alternative decisions.
7. Set up the test hypothesis that the choice of marriage partner is not influenced
by the height of either. In this case, in a married couple, the height of a man and
of a woman is each a random selection from the distributions of men’s and
women’s heights respectively.

Figure 7.3 Average height excess for 100 couples

The excess of the man's height over the woman's height will be a normal
variable with mean of (1.73 − 1.65) m and variance of (0.076² + 0.064²) m².
The average difference (excess) of height taken over a random sample of 100
such married couples will be normally distributed (i.e. from one sample of 100
to another) with mean of 1.73 − 1.65 = 0.08 m and variance of
(0.076² + 0.064²)/100, giving a standard error of √0.00987/√100 = 0.0099 m.
The observed average difference was 0.05 m, corresponding to a u value of

u = (0.05 − 0.08)/0.0099 = −3.03

The (two-sided) significance level corresponding to this is approximately 0.0024


and thus it seems reasonable to conclude that the choice of marriage partner is
not independent of height.

8. The observed data of problem 3, chapter 2, are distributed in a skew pattern


with a calculated mean of 1.29 min and standard deviation of 1.14 min. The
figure of 1.29 min is the average of 56 individual readings, and by the central
limit theorem, such an average can be expected to be normally distributed. The
appropriate confidence limits can be found using the sample standard deviation
as an estimate of the population standard deviation since it is based on more than
30 readings; such an approximation will be good enough for most practical
purposes.
σ_x̄ ≈ s/√56 = 1.14/√56 = 0.152 min

Figure 7.4 Time between successive customers
Figure 7.5 Average of 56 time intervals

99% confidence limits for the mean time between arrivals are

1.29 ± 2.58 × 1.14/√56 = 1.29 ± 0.39 = 0.90 and 1.68 min

The number of observations, n, necessary to estimate the population mean to
within 0.2 min (99% confidence) is given by equating the sampling error to the
required error, i.e.

2.58 × 1.14/√n = 0.2, giving n = (2.58 × 1.14/0.2)² ≈ 216

9. Set up the test hypothesis that there is no difference in mean lifetimes


between the two brands.

H₀: E[x̄_A − x̄_B] = μ_A − μ_B = 0
H₁: E[x̄_A − x̄_B] = μ_A − μ_B ≠ 0

An appropriate statistic is

u = [(x̄_A − x̄_B) − 0] / √(σ_A²/n_A + σ_B²/n_B)

the denominator being the standard error of the difference of two sample
means based on samples of sizes n_A and n_B respectively.

Thus, substituting the sample variances for the population variances,

u = (24.2 − 24.5) / √(1.8²/100 + 1.5²/80) = −0.3/√0.0605 = −0.3/0.246 = −1.22
Since this value is not numerically larger than 2.58, there is no evidence of a
difference in mean lifetimes between A and B.
10. (a) Assume there is no systematic difference between the analysts, i.e. the
means of an infinitely large number of analyses of the same material would be
equal for A and B.
Under such a null hypothesis we may use the test statistic

u = [(x̄_A − x̄_B) − 0] / √[σ²(1/n_A + 1/n_B)]
  = (28.4 − 28.2) / √[0.1²(¼ + ⅕)] = 0.2/0.067 = 2.98

This is significant at the 1% level (i.e. |u| > 2.58) and we can conclude (with
only a small type I error) that there is a systematic difference between the
analysts, A giving a higher result than B on average. Thus at least one of them,
and possibly both, gives a biased estimate of the actual percentage composition.
99% confidence limits for the extent of this systematic difference are given
by

(28.4 − 28.2) ± 2.58 √[0.1²(¼ + ⅕)] = 0.2 ± 0.173 = 0.027 and 0.373%
(b) Figure 7.6 shows the requirements of this problem. With d* denoting the
upper critical value for the observed difference x̄_A − x̄_B,

d* = 0 + 2.58 × 0.1 √(1/n_A + 1/n_B)            (7.3)

d* = 0.3 − 2.33 × 0.1 √(1/n_A + 1/n_B)          (7.4)

Figure 7.6

An equivalent pair of equations would be obtained for a systematic difference


of —0.3%.

Note: In writing down equations (7.3) and (7.4), the minute part of the left-
hand tail of the dotted distribution falling in the lower part of the critical
region has been ignored.

Putting n_A = n_B = n gives the required number of readings by each analyst as

n = [(2.58 + 2.33)²/0.3²] × 2 × 0.1² = 4.91² × 2 × 0.01/0.09 ≈ 5.36
Thus each analyst should do six tests, the probability of detecting a systematic
difference of 0.3% between them (if it exists) being greater than the required
minimum of 99%.
In fact the required minimum power would still be achieved if one analyst
took six tests and the other five in order to reduce the total cost or effort
involved.

7.5 Practical Laboratory Experiments and Demonstrations


Since this chapter is concerned with ‘large sample’ methods, all the experiments
and demonstrations illustrating the basic concepts of significance have been
left over to chapter 8.
In view of the sample sizes required, experimenting is not very effective for
the methods outlined in this chapter.
8 Sampling theory and significance
testing (II)—t, F and χ² tests

8.1 Syllabus Covered


Unbiased estimate of population variance; degrees of freedom; small sampling
theory; ‘t’ test of significance; confidence limits using ‘t’; paired comparisons;
‘F’ test of significance for two variances; χ² test of significance; goodness of fit
tests; contingency tables.

8.2 Resume of Theory and Basic Concepts

8.2.1 Unbiased Estimate of Population Variance


In chapter 7 the use of significance testing for large samples or for samples where
an independent estimate of population variance was available was discussed. The
‘u’ test was described for comparing a sample mean with a given hypothesis and
also for testing significant differences between two population means.
In this chapter the problems outlined are different in that the sample sizes are
small and no independent estimate of population variance is available—an
estimate from the sample having to be used for the population variance.
In obtaining an unbiased estimate of the population variance from sample
data the following formula must be used
s² = Σ (xᵢ − x̄)²/(n − 1)                 (8.1)

where x̄ = sample average.

Note: If an independent estimate of the population mean μ is available, the sample
estimator of variance is

s² = Σ (xᵢ − μ)²/n                        (8.2)

The denominators in both equations (8.1) and (8.2) are called the degrees of
freedom of the variance estimate.
Sampling Theory and Significance Testing (II) 171

8.2.2 Degrees of Freedom


This concept of degrees of freedom is very difficult to define exactly but it can
be considered as the number of independent variates. This number of independent
variates or degrees of freedom is equal to the total number of variates less the
number of independent linear constraints on the variates.
For example in equation (8.1) in estimating the population variance, the
sample mean x is used in the equation thus reducing the number of comparisons
or degrees of freedom to n — 1. No such reduction is necessary in equation (8.2).
When dealing with χ² goodness of fit testing a further explanation of this concept
of degrees of freedom will be given.

8.2.3 The ‘u’ Test with Small Samples


The arbitrary division of significance testing between large sampling theory (or
approximate methods) in chapter 7 and the small sampling theory (or exact
methods) in this chapter, necessitates the repeating of one test in order to
maintain consistency.

Testing the Hypothesis that the Mean of a Normal Population has a Specific
Value μ₀—Population Variance Known
Here, providing the population variance is known (and therefore the sample
estimate of variance is not used), then the ‘u’ test is appropriate whatever the
sample size.
Thus

u = (x̄ − μ₀)/(σ/√n)

is calculated and the significance level is determined.

Example
In an intelligence test on ten pupils the following scores were obtained: 105,
120, 90, 85, 130, 110, 120, 115, 125, 100.
Given that the average score for the class before the special tuition for the
test was 105 with standard deviation 8.0, has the special tuition improved the
performance?
Here, since the standard deviation is given, and if the assumption is made that
the tuition method does not change this variation, then the u test is applicable.

Null hypothesis—tuition has made no improvement

Average score in test

x̄ = (105 + 120 + ... + 125 + 100)/10 = 110

Here a one-tailed test can be used if it is again assumed that tuition could not
have worsened the performance.
Thus

u = (110 − 105)/(8/√10) = 1.98

From table 3*

for 5%, u = 1.64
for 1%, u = 2.33

This result is significant at the 5% level (though not at the 1%); there is
evidence of an improvement.
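As a computational check on this example (a sketch using the data given above):

```python
import math

scores = [105, 120, 90, 85, 130, 110, 120, 115, 125, 100]
mu0, sigma = 105, 8.0                     # pre-tuition class mean and known s.d.
xbar = sum(scores) / len(scores)          # 110.0
u = (xbar - mu0) / (sigma / math.sqrt(len(scores)))
print(round(u, 2))  # 1.98: beyond 1.64 (5%, one tail) but short of 2.33 (1%)
```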

8.2.4 The ‘t’ Test of Significance


Testing the Hypothesis that the Mean of a Normal Population has a Specific
Value μ₀—Population Variance Unknown
Here the sample of size n is used to give the estimate of population variance.

s² = [Σxᵢ² − (Σxᵢ)²/n] / (n − 1)          (8.3)
The null hypothesis is set up that the sample has come from a normal population
with mean μ₀.
W. S. Gosset under the nom de plume of ‘Student’ examined the following
statistic

t = (x̄ − μ₀)/(s/√n)                        (8.4)
and showed that it is not distributed normally but in a form which depends on
the degrees of freedom (ν) if the null hypothesis is true. Table 7* sets out the
various percentage points of the ‘t’ distribution for a range of degrees of freedom.
Obviously t tends to the statistic u in the limit as ν → ∞, i.e. t is
approximately normally distributed for large degrees of freedom ν. Reference
to the table* shows that, as most textbooks assert, where the degrees of freedom
exceed 30, the normal approximation can be used, i.e. the ‘t’ test can be replaced
by the simpler ‘u’ test.

Note: For a two-tailed test, a 5% significance level requires α = 0.025 in
table 7* and a 1% significance level requires α = 0.005 (α is a proportion,
not a percentage). See section 8.2.6.

Again, one-tailed tests are only used when a priori logic clearly shows that
the alternative population mean must be on one side of the hypothesised value μ₀.
See section 8.2.7.

Testing the Hypothesis that the Means of Two Normal Populations are μ_x and μ_y
Respectively—Variances Equal but Unknown
Note: The assumption must hold that the variances of the two populations
are the same (i.e. σ_x² = σ_y²) since we are going to pool two sample variances and
this only makes sense if they are both estimates of the same thing, a common
population variance. If σ_x² does not equal σ_y², then the statistic given below is not
distributed like t.
The two sample variances s_x² and s_y² are pooled to give a best estimate of the
common population variance.

s² = [(n_x − 1)s_x² + (n_y − 1)s_y²] / (n_x + n_y − 2)        (8.5)

where n_x and n_y are the sizes of the two samples, and

t = [(x̄ − ȳ) − (μ_x − μ_y)] / √[s²(1/n_x + 1/n_y)]           (8.6)

with (n_x + n_y − 2) degrees of freedom. The usual test hypothesis is that the
populations have equal means and under this assumption (μ_x − μ_y) = 0 and the
test statistic reduces to

t = (x̄ − ȳ) / √[s²(1/n_x + 1/n_y)]                            (8.7)

t-Test Using Paired Comparisons


In many problems, the power of the significance test can be increased by pairing
the results and testing the hypothesis that the mean difference between paired
readings is equal to μ₀.

Note: This approach is only legitimate provided that there is a valid reason
for pairing the observations. This validity is determined by the way in which
the experimental observations are obtained.
Let the number of paired readings = n
Let the difference of the ith pair = dᵢ

Then

s² = Σ (dᵢ − d̄)²/(n − 1)    where d̄ = Σ dᵢ/n

and

t = (d̄ − μ₀)/(s/√n)                        (8.8)

The test hypothesis is usually of the null type where there is assumed to be no
difference on average in the paired readings, i.e. μ₀ = 0. In this case the test
statistic t is given by

t = d̄/(s/√n)                               (8.9)

Confidence Limits for Population Mean


Where the degrees of freedom are less than about 30, the confidence limits for
the population mean are:

for 95% confidence limits:  x̄ ± t₀.₀₂₅,ᵥ s/√n

for 99% confidence limits:  x̄ ± t₀.₀₀₅,ᵥ s/√n

This is similar to the large-sample case except that t is used instead of u.
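These limits can be reproduced with any t-distribution routine. A sketch using scipy (assumed available; the function name is illustrative):

```python
from scipy import stats

def mean_confidence_limits(sample, conf=0.95):
    """Small-sample confidence limits for the population mean,
    x-bar +/- t(alpha/2, v) * s / sqrt(n), with v = n - 1 degrees of freedom."""
    n = len(sample)
    xbar = sum(sample) / n
    s = stats.tstd(sample)                       # standard deviation, n - 1 divisor
    t = stats.t.ppf(1 - (1 - conf) / 2, n - 1)   # e.g. t(0.025, v) for 95%
    half_width = t * s / n ** 0.5
    return xbar - half_width, xbar + half_width
```

For a sample of five readings 1, 2, 3, 4, 5 this gives 3 ± 1.96, using t(0.025, 4) = 2.776.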

8.2.5 The 'F' Test of Significance


For testing the hypothesis that the variances of two normal populations are
equal, a null hypothesis is set up that the variances are the same.
Let

sx² = Σ(xᵢ − x̄)²/(nx − 1)   sy² = Σ(yᵢ − ȳ)²/(ny − 1)

Then

F = sx²/sy²   where sx² > sy²    (8.10)


If F is greater than F₀.₀₂₅ (see table 9*) for (nx − 1) degrees of freedom of the
numerator and (ny − 1) degrees of freedom of the denominator, then the
difference is significant at the 5% level (α = 0.05). For F to be significant at the 1% level,
use F₀.₀₀₅ (in practice F₀.₀₁ will have to be used, giving a 2% significance level
of F).
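The two-tailed procedure above maps onto code as follows (a hedged sketch; scipy is assumed, and the function name is mine). The larger variance goes on top, so the one-tail probability is doubled:

```python
from scipy import stats

def f_test_equal_variances(x, y):
    """Two-tailed F test of equal variances for two normal samples.
    Larger sample variance in the numerator, as in equation 8.10."""
    s2x, s2y = stats.tvar(x), stats.tvar(y)   # variances with n - 1 divisor
    if s2x >= s2y:
        f, df1, df2 = s2x / s2y, len(x) - 1, len(y) - 1
    else:
        f, df1, df2 = s2y / s2x, len(y) - 1, len(x) - 1
    p = 2 * stats.f.sf(f, df1, df2)           # doubled upper-tail probability
    return f, p
```

On the marks data of example 5 below this gives F ≈ 3.45, not significant at the 5% level.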

8.2.6 The χ² Test of Significance


Definition of χ²
Let x₁, x₂, ... xₙ be n normal variates from a population with mean μ and
standard deviation σ.
Then

χₙ² = Σᵢ₌₁ⁿ (xᵢ − μ)²/σ²

the suffix n denoting the number of degrees of freedom. Obviously the larger n
is, the larger χ²; the percentage points of the sampling distribution of χ² are
given in table 8*.
For example, where n = 1 the numerical value of the standardised normal
deviate u exceeds 1.96 with 5% probability and 2.58 with 1% probability (i.e.
with half the probability in each tail). Consequently χ² with one degree of
freedom has 5% and 1% points at 1.96² and 2.58², or 3.841 and 6.635.
For higher degrees of freedom the distribution of χ² is much more
difficult to calculate, but it is fully tabulated in table 8*.

Goodness of Fit Test using χ²


A most important use of the χ² distribution is in a significance test for the
'goodness of fit' between observed data and an hypothesis.
Let k = number of cells or comparisons
Oᵢ = observed frequency in ith cell
Eᵢ = expected frequency in ith cell from the hypothesis
r = number of restrictions, derived from the observed readings, which
have to be used when fitting the hypothesis.
Then

χ² = Σᵢ₌₁ᵏ (Oᵢ − Eᵢ)²/Eᵢ

is distributed like χ² with (k − r) degrees of freedom, where r counts the
restrictions (including any parameters estimated from the data to fit the
distribution).
For the use of this test all the Eᵢ values must be greater than 5. If any are less,
the data must be grouped.
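In code the statistic is one line; scipy's chisquare routine also returns the tail probability. The die-throw counts below are made up for illustration, not taken from the text:

```python
from scipy.stats import chisquare

# 120 throws of a die tested against the uniform ('fair die') hypothesis.
observed = [25, 17, 15, 23, 24, 16]
expected = [20] * 6
stat, p = chisquare(observed, f_exp=expected)   # df = k - 1 = 5 here: only the
                                                # total is fixed, no parameters fitted
```

Here χ² = 5.0 on 5 degrees of freedom, well below the 5% point of 11.070, so the fair-die hypothesis stands. When r parameters are estimated from the data, pass ddof=r to reduce the degrees of freedom accordingly.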

Application of χ² to Contingency Tables


When the total frequency can be divided between two factors, and each factor
subdivided into various levels, then the table formed is called a contingency
table. Data in the form of a contingency table give one of the simplest methods
of testing the relationship between two factors.
Consider the following contingency table (table 8.1) with the first factor
(F₁) at a levels and the second factor (F₂) at b levels. The individual cell totals
Oᵢⱼ give the observed frequency of readings at the ith level of factor F₁ and the
jth level of factor F₂.

                          Factor 1
              1      2      3    ...    i    ...    a       Row totals
Factor 2
    1        O11    O21    O31         Oi1         Oa1      Σi Oi1
    2        O12    O22                                     Σi Oi2
    ...
    j                                  Oij                  Σi Oij
    ...
    b        O1b                                   Oab      Σi Oib

Column
totals     Σj O1j  Σj O2j             Σj Oij      Σj Oaj    ΣΣ Oij

Table 8.1

ΣΣ Oij = total frequency
Σj Oij = total frequency at the ith level of factor 1 (column total)
Σi Oij = total frequency at the jth level of factor 2 (row total)

These tables are generally used to test the hypothesis that the factors are
independent.
If this hypothesis is true, then the expected cell frequency is

Eij = (Σj Oij × Σi Oij)/ΣΣ Oij   (column total × row total / grand total)

and

χ² = ΣΣ (Oij − Eij)²/Eij

is distributed as χ² with (a − 1)(b − 1) degrees of freedom.


It can be shown that only (a − 1)(b − 1) of the comparisons are independent,
since the row and column totals of expected frequencies must be the same as
the row and column totals of observed frequencies.

8.2.7 One- and Two-tailed Tests


This whole question of one- and two-tailed tests is a subject of considerable
controversy among statisticians.
However, the following points of guidance are useful in deciding which to
apply.
(1) In general, if ever in doubt, use the two-tailed test, since this is the safer course.
(2) Only if, from a priori knowledge, it can be definitely stated that the
change must be in one direction only, can the one-tailed test be used.
These observations apply, of course, to all significance tests, and it is hoped
that the examples given will clarify this confusing problem.

8.2.8 Examples on the Use of the Tests


1. A canning machine is required to turn out cans weighing 251 g on average.
A random sample of five is drawn from the output; the cans are found to weigh
251, 252, 254, 254, 252 g respectively. Can it be said that the machine produces cans of average
weight 251 g?
Coding the variate by subtracting 250 gives x = 1, 2, 4, 4, 2: Σx = 13,
Σx² = 41.
A null hypothesis is set up that the process is running at an average of 251 g.
Mean

x̄ = 13/5 = 2.6

Estimated variance of population

s² = [Σx² − (Σx)²/n]/(n − 1) = (41 − 33.8)/4 = 1.8

Estimated standard deviation of population s = 1.34

Estimated standard error of sample mean = s/√n = 1.34/√5 = 0.6

On the null hypothesis that the (coded) population average is 1,

t = (2.6 − 1)/0.6 = 2.67

From tables* (4 degrees of freedom)

t₀.₀₂₅ = 2.78   t₀.₀₀₅ = 4.60

so the results are not significant. However, since the t value is close to the 5%
level (two-tailed), it is possible that if a larger sample were taken a difference might
be shown.
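This calculation can be checked with scipy's one-sample t routine on the coded weights (a verification sketch, not part of the original text):

```python
from scipy import stats

coded = [1, 2, 4, 4, 2]                      # can weights in grams above 250
t, p = stats.ttest_1samp(coded, popmean=1)   # null hypothesis: mean 251 g
```

This reproduces t = 2.67, and the two-tailed p of about 0.06 confirms that the result just fails to reach the 5% level.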

2. A weaving firm has been employing two methods of training weavers. The
first is the method of ‘accelerated training’, the second, ‘the traditional’ method.
Although it is accepted that the former method enables weavers to be trained
more quickly, it is desired to test the long-term effects on weaver efficiency. For
this purpose the varying efficiency of the weavers who have undergone training
during a period of years has been calculated, and is given in table 8.2.
Is there any significant difference between training methods?

                          Specialised    Traditional
                          training       method         Total
                          A              B

Above shed average   1    32             12             44
Below shed average   2    14             22             36
Insufficient data    3     6              9             15

Total                     52             43             95

Table 8.2. Training schemes and weaver efficiency

A null hypothesis is set up that there is no difference between the methods.

E_A1 = (52/95) × 44 = 24.1    E_B1 = 44 − 24.1 = 19.9
E_A2 = (52/95) × 36 = 19.7    E_B2 = 36 − 19.7 = 16.3
E_A3 = (52/95) × 15 = 8.2     E_B3 = 15 − 8.2 = 6.8

χ² = (32 − 24.1)²/24.1 + (12 − 19.9)²/19.9 + (14 − 19.7)²/19.7
     + (22 − 16.3)²/16.3 + (6 − 8.2)²/8.2 + (9 − 6.8)²/6.8
   = 2.59 + 3.14 + 1.65 + 2.00 + 0.59 + 0.79
   = 10.76

Degrees of freedom = (3 − 1)(2 − 1) = 2

χ²₀.₀₅ = 5.991   χ²₀.₀₁ = 9.210



Table 8.3 (expected frequencies for table 8.2: rows for above average, below
average and insufficient data, as calculated above)

Result is significant at 1% level.


There is evidence that the training methods differ in their long-term
efficiency.
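The whole contingency-table procedure is packaged in scipy. A verification sketch on the table 8.2 counts: note that keeping full precision in the expected frequencies gives χ² ≈ 10.71 rather than the hand-rounded 10.76, but the conclusion is unchanged.

```python
from scipy.stats import chi2_contingency

table = [[32, 12],    # above shed average
         [14, 22],    # below shed average
         [6, 9]]      # insufficient data
chi2, p, dof, expected = chi2_contingency(table)  # no Yates correction: dof > 1
```

The returned expected array matches the E values calculated by hand, and p < 0.01 agrees with the verdict against χ²₀.₀₁ = 9.210.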

3. In a study of two processes in industry the following data were obtained.

                       Process 1    Process 2
Sample size            50           60
Mean                   10.2         —
Standard deviation     2.7          2.1

Table 8.4

Is there any evidence of a difference in variability between the processes?

A further sample taken on process 1 gave: sample size = 100, mean = 10.6,
standard deviation = 3.1.
What is the significance of the difference in variability now?
A null hypothesis is set up that there is no difference in the variation of the
two processes.

F = greater estimate of population variance / lesser estimate of population variance
  = 2.7²/2.1² = 1.65

Referring to table 9*:

Degrees of freedom of greater estimate ν₁ = 49, read as 24 (safer than ∞)

Degrees of freedom of lesser estimate ν₂ = 59, read as 60



To be significant, F must reach 1.88 at the 5% level or 2.69 at the 0.2% level.

The difference is not significant.


Variance of the combined sample for process 1:

s² = [s₁²(n₁ − 1) + s₂²(n₂ − 1)]/(n₁ + n₂ − 2) = (2.7² × 49 + 3.1² × 99)/(50 + 100 − 2) = 8.83

F = 8.83/2.1² = 2.00

From table 9*:
Degrees of freedom of greater estimate = 149, read as ∞
Degrees of freedom of lesser estimate = 59, read as 60
5% significance level = 1.48

0.2% significance level = 1.89

Difference is highly significant or there is strong evidence that process variations


are different.

4. For example (1), page 69, in chapter 3, for goals scored per soccer match,
test whether this distribution agrees with the Poisson law.
Null hypothesis: the distribution agrees with the Poisson law.

Table 8.5 sets out the observed frequencies (O) of 0, 1, 2, ... goals per match
for the 57 matches, together with the Poisson frequencies (E) fitted with the
same mean; after grouping, the cells used below have expected values
10.6, 12.4, ..., 6.1 and 5.4.

In table 8.5 the last three class intervals must be grouped to give each class
interval an expected value greater than 5; so must the first two.

χ² = (11 − 10.6)²/10.6 + (11 − 12.4)²/12.4 + … + (5 − 6.1)²/6.1 + (7 − 5.4)²/5.4

Degrees of freedom = 6 − 1 − 1 = 4, since the totals are made the same and the
Poisson distribution is fitted with the same mean as the actual distribution.
Referring to table 8*:

χ²₀.₀₅ = 9.488   χ²₀.₀₁ = 13.277


Thus, there is no evidence from the data for rejecting the hypothesis, or the
pattern of variation shows no evidence of not having arisen randomly.

5. In a mixed sixth form the marks of eight boys and eight girls in a subject
were:

Boys: 25, 30, 42, 44, 59, 73, 82, 85; boys' average = 55

Girls: 32, 36, 40, 41, 46, 47, 54, 72; girls' average = 46
Do these figures support the theory that boys are better than girls in this
subject?
Null hypothesis—that boys and girls are equally good at the subject.
From the sample of boys x̄₁ = 55, s₁² = 540.57
From the sample of girls x̄₂ = 46, s₂² = 156.86

Applying the F test to check that the population variances are not different gives

F = 540.57/156.86 = 3.45 (not significant)   ν₁ = 7, ν₂ = 7

Best estimate of population variance by pooling:

s² = [(n₁ − 1)s₁² + (n₂ − 1)s₂²]/(n₁ + n₂ − 2) = (7 × 540.57 + 7 × 156.86)/(8 + 8 − 2) = 348.7

Standard error of the difference between the means:

e(x̄₁ − x̄₂) = √[348.7(1/8 + 1/8)] = 9.34

t = (55 − 46)/9.34 = 0.96

with 14 degrees of freedom.


From table 7*:

t₀.₀₂₅ = 2.14 (two-sided 5%)

There is no evidence from these data that boys are better than girls
(see the discussion of example 5, chapter 7, p. 153).

6. In designing a trial to test whether or not the conversion of a machine has
reduced its variability, a sample of 20 on the new process is taken.
Previous machine standard deviation before conversion = 2.8 mm. For the new
process, calculated from the sample of 20, standard deviation = 1.7 mm.
What is the significance of this test?
It can be assumed that the process change could not have increased the
variation of the process.
Null hypothesis—no change has occurred in process variation. Thus, a
one-tailed test can be used:

F = 2.8²/1.7² = 2.71   ν₁ = ∞, ν₂ = 19 (use 18)

from table 9*. Thus

F₀.₀₅ = 1.92   F₀.₀₁ = 2.57

Therefore, the result is highly significant and the change can be assumed to have
reduced the process variation.

8.3 Problems for Solution

1. Three women take an advanced typing course in order to increase their


speed. Before the course their rates are 40, 42, 40 words per minute. After the
course their speeds are 45, 50 and 42 respectively. Is the course effective?

2. Table 8.6 gives the data obtained in an analysis of the labour turnover
records of the departments of a factory. Is there any evidence that departmental
factors affect labour turnover and if so, which departments?

Dept.    Average labour force    Number of leavers/year

A        60                      15
B        184                     16
C        162                     15
D        56                      12
E        30                      4
F        166                     25
G        182                     25
H        204                     18

Table 8.6

3. Table 8.7 gives the data obtained on process times of two types of machine.
Is machine A more variable than machine B?

                       Machine A    Machine B

Average time           2S           733
Standard deviation     0.5          0.2
Sample size            100          80

Table 8.7

4. A change made to a process was tested by timing two sets of different


workers. Those using the new process completed the job in

B25 32; Sky Dg Oty SA aS gs OS

Using the old process, another group completed it in

31, 32, 32, 33, 33, 34, 37, 43, 47, 48 s

Is the new process quicker?

5. In designing a trial to test whether or not the conversion of a machine has


reduced its variability, a sample of 13 items is taken from the new process.
Previous machine standard deviation before conversion = 2.8 mm; standard
deviation from new process = 1.7 mm.
Is a reduced variability demonstrated?

6. The number of cars per hour passing an intersection, counted from 11 p.m.
to midnight over nine days, was 7, 10, 5, 1, 0, 6, 11, 4, 9.
Does this represent an increase over the previous average per hour of three
cars?

7. In a time study, only 18 readings of an element could be taken as the order


was nearly finished. They were as follows, in minutes:
0.12, 0.14, 0.16, 0.12, 0.12, 0.17, 0.15, 0.14, 0.12,
0.11, 0.12, 0.12, 0.12, 0.15, 0.17, 0.13, 0.14, 0.14
Within what limits (95% confidence) would you expect the actual average
time for this element to lie?

8. A coin is tossed 200 times and heads appear only 83 times. Is the coin
biased?

9. A new advertising campaign is tried out in addition to normal advertising


in six selected areas and the sales for a three-month period compared with

those of the six areas before the special campaign. The data are given in table 8.8.
Has the new campaign had any effect on the sales?

          Sales before campaign    Sales after campaign

Area 1    £2000                    £2500
     2    £3600                    £3000
     3    £2500                    £3100
     4    £3000                    £2800
     5    £2800                    £3400
     6    £2900                    £3200

Table 8.8

8.4 Solutions to Problems

1. The null hypothesis is set up—that the advanced typing course will not
affect the speed of the typists.
This is a paired 't' test, since by considering the differences only, the variation
due to the varying basic efficiency of the individuals is eliminated.

Typist    Difference xᵢ    xᵢ²
1          5               25
2          8               64
3          2                4

          Σxᵢ = 15        Σxᵢ² = 93

x̄ = 5

Table 8.9

Estimated population variance

s² = [Σxᵢ² − (Σxᵢ)²/n]/(n − 1) = (93 − 75)/2 = 9,   s = 3

Thus

t = (x̄ − 0)/(s/√n) = (5 − 0)/(3/√3) = 2.89

with 2 degrees of freedom.
Reference to table 7* (using a two-tailed test):

t₀.₀₂₅ = 4.303   t₀.₀₀₅ = 9.925

so the result is not significant.


Thus there is no evidence from this sample that the new course improves
speed.
It is not surprising that no evidence was found from this trial because of the
small sample taken. In practice, when more girls were tested the new course was
shown to be more effective, illustrating that the result not significant does not
mean no difference but that no evidence of a difference has been found.
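The same paired test can be run with scipy as a check on the hand calculation (a verification sketch, not part of the original solution):

```python
from scipy import stats

before = [40, 42, 40]   # words per minute before the course
after = [45, 50, 42]    # words per minute after the course
t, p = stats.ttest_rel(after, before)   # works on the differences 5, 8, 2
```

This reproduces t = 2.89; with only 2 degrees of freedom the two-tailed p is about 0.10, well short of the 5% level.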

2. Null hypothesis: there is no difference in turnover rate between departments.


The expected number of leavers and the χ² contributions are given in table 8.10.

Dept.    Average labour    Number of        Expected number    Contribution
         force/year        leavers/year     of leavers/year    to χ²
                           (Oᵢ)             (Eᵢ)

A        60                15               7.5                7.5
B        184               16               23.0               2.1
C        162               15               20.0               1.2
D }      86                16               10.8               2.5
E }
F        166               25               20.8               0.9
G        182               25               22.8               0.2
H        204               18               25.5               2.2

Totals   1044              130              130.4              χ² = 16.6

Table 8.10

Since in department E the expected number of leavers is less than five, it has
to be grouped with another department. It is logical to group it with a similar
department, or one whose effect on the number of leavers would be expected
to be similar. Here, having no other a priori logic, and since there is little difference
between the observed and expected frequency for department E, it has little effect;
it is combined with department D, the next smallest.

Average turnover rate = (130/1044) × 100% = 12.5% per year



Expected number of leavers per year in department A

= (60/1044) × 130 = 7.5

and so on.
Thus χ² = 16.6 with (7 − 1) = 6 degrees of freedom, since only the total was
used to set up the hypothesis.
Reference to table 8*:

χ²₀.₀₅ = 12.592   χ²₀.₀₁ = 16.812   χ²₀.₀₀₁ = 22.457

Thus the result is significant at the 5% level, or there is evidence of differences between
departments.
When such a result is obtained it is usually possible to isolate the
heterogeneous departments by locating the department with the largest
contribution to χ². Provided the χ² is significant at 1 degree of freedom, i.e.
exceeds 3.841 at the 5% significance level, this department should be excluded
from the data and the analysis repeated until χ² is not significant. If the result
is significant with no single contribution greater than 3.841, then the conclusion
can only be drawn that the heterogeneity is not due to one or two specific
departments but to general variation between all of them.
The results of repeating the analysis excluding department A are given in
table 8.11.

Dept.    Average labour    Number of        Expected number    Contribution
         force/year        leavers/year     of leavers/year    to χ²

B        184               16               21.5               1.40
C        162               15               19.0               0.84
D }      86                16               10.0               3.60
E }
F        166               25               19.4               1.61
G        182               25               21.3               0.64
H        204               18               23.8               1.41

Totals   984               115              115.0              9.50

Table 8.11

Average turnover rate = (115/984) × 100% = 11.7%


Thus χ² = 9.50 with (6 − 1) = 5 degrees of freedom

χ²₀.₀₅ = 11.070   χ²₀.₀₁ = 15.086   χ²₀.₀₀₁ = 20.517



or the result is not significant; there is no evidence of differences between the remaining
departments. It would, however, be worth checking further on department D,
since its contribution to the total χ² has been 'watered down' by having the data
from E combined with it.
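The same analysis can be run through scipy's goodness-of-fit routine. Keeping the expected frequencies unrounded gives χ² ≈ 16.9 rather than the 16.6 obtained with one-decimal rounding above, but the verdict at the 5% level is the same (a verification sketch):

```python
from scipy.stats import chisquare

labour = [60, 184, 162, 86, 166, 182, 204]    # D and E already combined
leavers = [15, 16, 15, 16, 25, 25, 18]
rate = sum(leavers) / sum(labour)             # overall rate 130/1044
expected = [rate * n for n in labour]         # expected leavers per department
stat, p = chisquare(leavers, f_exp=expected)  # df = 7 - 1 = 6
```

The largest single contribution still comes from department A, as in the hand analysis.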

3. This is an 'F' test.


Assume no a priori knowledge—use a two-tailed test.

F = 0.5²/0.2² = 6.25   ν₁ = 99, ν₂ = 79

Referring to table 9*, use ν₁ = 24, ν₂ = 60 (on the safe side):

5% level: 1.88   0.2% level: 2.69

Clearly the present value is highly significant; the product from machine A is more
variable than the product from machine B.

4. Null hypothesis—the change to the process has not affected the time.
Let x₁ = time of new process
and x₂ = time of old process.

Then mean of new process x̄₁ = 35 s.


Variance of new process estimated from the sample:

s₁² = Σ(x₁ᵢ − x̄₁)²/(n₁ − 1) = 16.44

Mean of old process x̄₂ = 37 s.
Variance of old process estimated from the sample:

s₂² = 42.67

In order to apply the 't' test, the variances of the two populations must be
the same.
Using the 'F' test to check that the population variances are the same gives

F = 42.67/16.44 = 2.59   with ν₁ = 9 and ν₂ = 9 degrees of freedom

Table 9* shows that this is not significant—there is no evidence of a difference
in the population variances.

Thus, pooling the two estimates s₁² and s₂² gives the best estimate

s² = [(n₁ − 1)s₁² + (n₂ − 1)s₂²]/(n₁ + n₂ − 2) = 29.6

Standard error of the difference between the means:

e(x̄₁ − x̄₂) = √[29.6(1/10 + 1/10)] = 2.43

t = (37 − 35)/2.43 = 0.82

with 18 degrees of freedom.
From table 7*:

t₀.₀₂₅ = 2.101   t₀.₀₀₅ = 2.878

so the result is not significant at the 0.05 level—there is no evidence that the change has
reduced the time.
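Since only summary statistics are needed, scipy's ttest_ind_from_stats reproduces this result directly, with equal variances assumed as justified by the F check above (a verification sketch):

```python
from scipy.stats import ttest_ind_from_stats

t, p = ttest_ind_from_stats(mean1=35.0, std1=16.44 ** 0.5, nobs1=10,
                            mean2=37.0, std2=42.67 ** 0.5, nobs2=10,
                            equal_var=True)   # pooled-variance form of the test
```

This gives |t| = 0.82 on 18 degrees of freedom, as found by hand.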

5. Null hypothesis—conversion has not reduced the variability.


Thus

F = 2.8²/1.7² = 2.71   ν₁ = ∞ and ν₂ = 12 degrees of freedom

Referring to table 9* for ν₁ = ∞, ν₂ = 12 gives


F₀.₀₅ = 2.30   F₀.₀₁ = 3.36

The result is significant at the 5% level but not at the 1% level. Some further
sampling would probably be in order so as to reduce the errors involved in
reaching a decision.
Strictly, the one-sided F test (used because there is, say, prior knowledge
that the conversion cannot possibly increase product variation but may reduce
it) should be applied as follows.
Observed

F = 1.7²/2.8² = 0.37

The lower 5% point of F with ν₁ = 12 and ν₂ = ∞ is obtained from table 9* as

F₀.₉₅,₁₂,∞ = 1/F₀.₀₅,∞,₁₂ = 1/2.30 = 0.435

Since the observed value of F is lower than this, the reduction in variation is
significant (statistically) at the 5% level.
The lower 1% point of F is

1/3.36 = 0.298

and the observed F is not significantly low at this level.

6. Null hypothesis: there has been no increase in the number of cars.


From the sample:

Σxᵢ = 53   Σxᵢ² = 429

Sample mean

x̄ = 53/9 = 5.9

Variance

s² = (429 − 53²/9)/8 = 14.6

Standard deviation

s = 3.82

t = (5.9 − 3)/(3.82/√9) = 2.28

with 8 degrees of freedom.


From table 7*:

t₀.₀₂₅ = 2.306

Thus the result is not quite significant at the 5% level. On the present data no
real increase in mean traffic flow is shown.

7. Sample mean

x̄ = Σxᵢ/n = 0.136 min

Estimate of population variance

s² = 0.000212,   s = 0.0146 min

Let μ₀ = unknown true population average. Then for 95% confidence:

x̄ − t₀.₀₂₅ s/√n < μ₀ < x̄ + t₀.₀₂₅ s/√n

0.136 − 2.11 × 0.0146/√18 < μ₀ < 0.136 + 2.11 × 0.0146/√18

or inside limits 0.136 ± 0.0073.

8. This problem will be solved using two alternative methods—the 'u' test and
the χ² test.

1st Method—the 'u' Test


Hypothesis—the coin is unbiased.

Probability of a head = 0.50


The sampling distribution of the number of heads in 200 trials has

μ = np = 200 × 0.50 = 100
σ = √[np(1 − p)] = √(200 × 0.5 × 0.5) = 7.07

u = (83.5 − 100)/7.07 = −2.33

From table 3*, the probability of 83 or fewer heads = 0.01; by symmetry the
probability of 117 or more heads is 0.01.

Figure 8.1 (normal curve for the number of heads, mean 100, showing the
tail area below 83.5)

2nd Method—the χ² Test

             Heads    Tails

Observed O   83       117
Expected E   100      100

Table 8.12

             Heads    Tails

O            83.5     116.5
E            100      100

Table 8.13. Using Yates's correction

χ² = (83.5 − 100)²/100 + (116.5 − 100)²/100 = 5.445
with 1 degree of freedom.

From table 8* the probability of a χ² this high, or higher, is approximately 0.02.


However, in calculating χ², the tabulation includes both tails of the normal
distribution of which it is the sum of the squares.
Hence the probability of getting χ² ≥ 5.445 is 0.02 if both the probability
of 83 or fewer and also of 117 or more are included.
Thus the probability of 83 or fewer = 0.01, which agrees with the result by
the 'u' test.
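The correction is easy to see in code: each observed count is moved half a unit towards its expected value before squaring (a sketch of the hand calculation above):

```python
# Coin data with Yates's continuity correction applied by hand.
expected = [100, 100]
corrected = [83.5, 116.5]   # observed 83 and 117, each moved 0.5 towards 100
chi2 = sum((o - e) ** 2 / e for o, e in zip(corrected, expected))
```

This gives χ² = 5.445; without the correction the raw counts would give (17² + 17²)/100 = 5.78, slightly overstating the evidence against the coin.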

9. Here, from a priori knowledge, it can be stated that the new campaign
can only increase the sales rate, so a one-tailed test can be used for extra
'power' in the test.
Again the paired 't' test is applicable. Null hypothesis—the new campaign has
not increased the sales.

Area      Difference in sales

1         +500
2         −600
3         +600
4         −200
5         +600
6         +300

Average   +200

Table 8.14

Coding the data as x = difference/100 gives x̄ = 2 and

s² = 24.4,   s = 4.94

t = (2 − 0)/(4.94/√6) = 0.99

with 5 degrees of freedom

t₀.₀₅ = 2.015 (for a one-tailed test)

or the result is not significant; there is no evidence of an increase in sales rate.
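The same one-tailed paired test can be run on the raw differences with scipy; the alternative='greater' option (available from scipy 1.6) puts the whole 5% in the upper tail (a verification sketch, not part of the original solution):

```python
from scipy import stats

diffs = [500, -600, 600, -200, 600, 300]   # after-minus-before sales, table 8.14
t, p = stats.ttest_1samp(diffs, popmean=0, alternative='greater')
```

This reproduces t = 0.99, and the one-tailed p of roughly 0.18 is nowhere near 0.05.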



8.5 Practical Laboratory Experiments and Demonstrations


The concept of significance is perhaps one of the most difficult to grasp in
statistics, i.e. that one cannot prove a hypothesis, only offer evidence on a
probability basis for its rejection.
Here again practical participative laboratory experimentation gives the most
effective vehicle for putting across this concept.

8.5.1 Experiment 14—the ‘t’ Test of Significance


(This experiment is from the Laboratory Manual pages 62-65.)
Given to students in groups of two or three after lectures on significance
testing and ‘t’ test of means.
In this experiment, use is made of the two normal populations of rods
supplied in the Kit.† While it is appreciated that realism can be introduced to
experiments by using components from industry, experience has shown the
necessity of having standard populations available, especially as they are used
extensively throughout the experiments.
In Appendix 1 the instruction sheet, recording forms and analysis and
summary sheets for the experiment are given together with a set of results
obtained.

Red Yellow
rod population rod population

Mean pu 6.0 6.2


Standard deviation o 0.2 0.2

Table 8.15

The population parameters are given in table 8.15. These parameters are
chosen so that for the first part of the experiment with sample sizes n = 10,
approximately half the groups will establish a significant difference between the
populations while the other half will show no significant difference at the 5%
probability level. Since each group summarises the results of all the groups, this
experiment brings out much more clearly than any lecture could do, the
concept of significance.
In the second part of the experiment where each sample size is increased to
30, the probability is such that all groups generally establish (95% probability)
a significant difference. The experiment demonstrates that there is a connection
between the two types of error inherent in hypothesis testing by sampling and
the amount of sampling carried out. To complete this experiment, including the
full analysis, takes approximately 40 min.

† Available from Technical Prototypes, 1A Westholme Street, Leicester.



8.5.2 Experiment 15—the ‘F’ Test


(This experiment is described in pages 66-68 of the Laboratory Manual.)
The same rod populations as for experiment 14 again demonstrate the
basic concepts of the test.

8.5.3 Experiment 16—Estimation of Population Mean


(Pages 69-71 of Laboratory Manual.)

8.5.4 Experiment 17—Estimation of Population Mean (Small Sample)


(Pages 72-74 of Laboratory Manual.)

8.5.5 Experiment 18—Estimation of Population Variance


(Pages 75-76 of Laboratory Manual.)
Note: All these experiments use the standard rod populations supplied with
the Statistical Kit No. 1.

8.5.6 Experiment 19—The x? Test


Using data from experiment | this experiment is described on pages 77-79 of the
Laboratory Manual.

Appendix 1
Object
To test whether the means of two normal populations are significantly different
and to demonstrate the effect of sample size on the result of the test.
Method
Take a random sample of size 10 from each of the two populations (red and
yellow rods) and record the lengths in table 1. Return the rods to the
appropriate population.
Also take a random sample of size 30 (a few rods at a time) from each of the
two populations (red and yellow rods) and record the lengths in table 2.
Analysis
(1) Code the data, as indicated, in tables 1 and 2.
(2) Calculate the observed value of 't' for the two samples of size 10 and again
for the samples of size 30.
(3) Summarise your results with those of other groups in tables 3 and 4.
Observe whether a significant difference is obtained more often with the
samples of size 30 than with the smaller samples.

Notes
The ‘t’ test used is only valid provided the variances of the two populations
are equal. This requirement is, in fact, satisfied in the present experiment.
Table 1 (recording form: sample of 10 yellow rod lengths with coded data x′,
and 10 red rod lengths with coded data y′)

In order to reduce the subsequent arithmetic, and to keep all numbers positive,
the coded values x′ are used in the calculation. The coded data can be obtained
by subtracting from all readings the smallest observed rod length in the sample.
The coded values y′ may be obtained in a similar way for the sample of red rods.
If a is the length of the shortest yellow rod in the sample, the mean x̄ of the
sample is

x̄ = a + Σx′/10 = 6.17

The variance sx² of the yellow sample is

sx² = [Σx′² − (Σx′)²/10]/9 = 0.0355

If b is the length of the shortest red rod in the sample, the mean ȳ of the
sample is

ȳ = b + Σy′/10 = 5.97

The variance sy² of the red sample is

sy² = [Σy′² − (Σy′)²/10]/9 = 0.0667

The pooled estimate of variance s² is

s² = {[Σx′² − (Σx′)²/10] + [Σy′² − (Σy′)²/10]}/(10 + 10 − 2) = 0.0512

t = (x̄ − ȳ)/[s √(1/n₁ + 1/n₂)] = (6.17 − 5.97)/[√0.0512 √(1/10 + 1/10)] = 1.96
Table 2 (recording form: samples of size 30 from the yellow and red rod
populations, with coded data)

The analysis is exactly similar to that for the samples of size 10. If a and b
denote the lengths of the shortest yellow and red rods (in the samples of 30)
respectively,

x̄ = a + Σx′/30 = 6.10   ȳ = b + Σy′/30

The pooled estimate of variance s² is

s² = {[Σx′² − (Σx′)²/30] + [Σy′² − (Σy′)²/30]}/(30 + 30 − 2)

and

t = (x̄ − ȳ)/[s √(1/30 + 1/30)]

with 58 degrees of freedom.
Table 3
Summary Table — samples of size 10

(For each of the eight groups: sample means x̄ and ȳ, the difference (x̄ − ȳ),
the value of t, and whether the difference is significant at the 5% level,
two-tailed test.)

The value of |t| which must be exceeded for the observed difference to be significant
at the 5% level = 2.101

Table 4

Summary Table — samples of size 30

(For each group: sample means, the difference (x̄ − ȳ), the value of t, and
whether the difference is significant at the 5% level, two-tailed test.)

The value of |t| which must be exceeded for the observed difference to be significant
at the 5% level = 2.002
9 Linear Regression Theory

9.1 Syllabus
Assumption for use of regression theory; least squares; standard errors;
confidence limits; prediction limits; correlation coefficient and its meaning in
regression analysis; transformations to give linear regression.

9.2 Résumé of Theory Covered

9.2.1 Basic Concepts


Regression analysis is concerned with the relationship between variables. In this
chapter, only the linear relationship between a dependent variable, y, and an
independent variable, x, will be discussed.
Regression analysis can be extended to cover curvilinear relationships
between two variables and the relationship between a variable y and m other
variables X1,X2,... Xm. This is called multi-regression analysis and details
can be found in textbooks on mathematical statistics.
The data for regression analysis may take two forms:
(1) The natural pairing of variables such as: height and weight, height of
son and height of father, the output of a department per week and the average
cost, or the sales of a product in an area and the advertising expenditure in that
area.
(2) The independent variable x is given assigned values and for each value of
x, a range of values of y is obtained. This type of data normally arises when the
experimental design is under the control of the analyst and data in this form
are from many points of view preferable to data of class (1). For example, in
establishing relationships between cutting tool life and speed, the experimenter
may vary speed (x) over a finite number of values and then take a number of
observations of tool life (y) at each of these levels.
Note: It is important to appreciate clearly that the calculated regression
relationship only holds over the range of variation of x used in the calculation.
Any extrapolation of the relationship can only be carried out on the basis of
a priori knowledge or assumptions that the relationship will hold for other
values of x.

9.2.2 Assumptions Required for Linear Regression Analysis


The following assumptions are required for the use of regression theory and for
the use of significance testing in the theory.
(1) The dependent variable (y) is normally distributed for each value of the
independent variable (x).
(2) The independent variable x is either free from error or subject to negligible
error only.
(3) The variance of y for all values of x is constant.
Note: It is also possible in advanced theory to apply regression analysis in cases
where the variance of y is a function of x.

9.2.3 Basic Theory


The regression line is fitted by the method of least squares. Given the population
(theoretical) regression line as

η = α + β(x − x̄)

then the best estimate of this line is given by

Y = a + b(x − x̄)
where
n

dLfyi
a =

Ly;
I

and

b= 2fiVi = D)Q; med)


5 eR Ls
a and b are unbiased estimates of a and B respectively.
These estimates minimise the residual variance of y about the regression
line and for this reason the approach is known as the ‘method of least squares’.
Note: In the first form of data where there is a natural pairing of the points,
f, = 1 for alli, and the regression coefficients are given by the following
formulae
n

dy ~a b=
» Oi-HGei-¥)
n a ae
U ~

where n = number of pairs of observations.


Linear Regression Theory 199

Since this book is concerned with giving an introduction to the theory, the
examples given will be for this case of paired variables. It should be stressed,
however, that for cases where f; > 1 a more rigorous theory can be developed
and, in fact, a test for linearity can be incorporated into the analysis. Details
of this more advanced analysis can be found in most mathematical statistics
textbooks.
Without such an independent test of linearity, a priori knowledge of linearity
is usually required, and in all cases the assumption should be examined by
drawing a scatter diagram.
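For the paired-variable case (fᵢ = 1) the least-squares estimates can be transcribed directly from the formulae above. The following sketch is ours, not the book's; the function name fit_line is arbitrary.

```python
def fit_line(xs, ys):
    """Least-squares estimates a, b for Y = a + b(x - xbar), paired data (f_i = 1)."""
    n = len(xs)
    xbar = sum(xs) / n
    a = sum(ys) / n                                    # a = ybar, estimate of alpha
    sxy = sum((x - xbar) * (y - a) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    b = sxy / sxx                                      # estimate of beta
    return a, b, xbar

# An exact line y = 2 + 3x is recovered without error:
a, b, xbar = fit_line([0, 1, 2, 3], [2, 5, 8, 11])
```

Note that a is simply the mean of the y values, so the fitted line always passes through the point (x̄, ȳ).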

9.2.4 Significance Testing


It is, of course, necessary not only to calculate the statistics ‘a’ and ‘b’ but also
to be able to test their significance. This point cannot be stressed strongly
enough. In addition it should be noted that even if a regression coefficient is
found to be significant, it does not necessarily imply a causal relationship between
the variables.
The standard errors of the coefficients are: standard error of a

    ε_a = √(s²/n)

standard error of b

    ε_b = √[s² / Σ(xᵢ − x̄)²]

where s² = residual variance about the regression line

    s² = Σ(yᵢ − Yᵢ)² / (n − 2)

where Yᵢ = estimate from regression line.
The significance of a and b can, therefore, be tested by the ‘t’ test (see
chapter 8) or, alternatively, as shown in some textbooks, by an ‘F’ test, (see for
example Weatherburn, A First Course in Mathematical Statistics, C.U.P., pages
193 and 224, example 8).

The t-Test of the Significance of an Observed Regression Coefficient b


Set up the null hypothesis that β = 0, i.e. that there is no linear relationship
between y and x and thus the values of y are independent of the values of x.
Remember that in this simple theory, it is necessary to assume that the
only possible relation between y and x is a linear one.
Under the assumptions given in section 9.2.2, the statistic

    (b − β) / ε_b

will be distributed like Student’s t with (n − 2) degrees of freedom. The degrees
of freedom of ε_b are 2 less than the number of points (pairs of values of x and y)
since the residual sum of squares about the fitted regression line is subject to
two independent constraints corresponding to the two constants calculated from
the data and used to fit the regression line.
The value of

    t = (b − 0) / ε_b
given by the data can be referred to table 7* of the Statistical Tables and if it is
significantly large, judged usually on a two-sided basis, there is thus evidence of a
linear relationship between y and x.
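As a sketch of this test (ours, with arbitrary toy data), the statistic t = b/ε_b can be computed from raw paired data and then referred to t tables with n − 2 degrees of freedom:

```python
import math

def slope_t_statistic(xs, ys):
    """Return (t, degrees of freedom) for testing H0: beta = 0."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    # residual variance s^2 about the fitted line, with divisor (n - 2)
    s2 = sum((y - (ybar + b * (x - xbar))) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    eps_b = math.sqrt(s2 / sxx)      # standard error of b
    return b / eps_b, n - 2

t, df = slope_t_statistic([1, 2, 3, 4, 5], [2, 4, 5, 4, 6])
```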

9.2.5 Confidence Limits for the Regression Line


The standard error, ε_Yᵢ, of the regression estimate, Yᵢ, is given by

    ε_Yᵢ = √[ε_a² + ε_b²(xᵢ − x̄)²]

100(1 − α)% confidence limits for the precision of estimation of the regression
line are then given by Yᵢ ± t(α/2, ν) ε_Yᵢ for given xᵢ, where ν = n − 2.
Note: The confidence limits are closest together at the average value, x̄, of
the independent variable.

9.2.6 Prediction Limits


The confidence limits defined in section 9.2.5 relate to the position of the
assumed ‘true’ regression line. If the relation is to be used to predict the value
of y that would be observed corresponding to a given value of x, then, in
addition to the uncertainty about the ‘true’ regression line, the scatter of
individual values of y about this ‘true’ line must also be allowed for.
The standard error of a single value of y corresponding to a given value,
xᵢ, is

    ε_yᵢ = √[s² + ε_a² + ε_b²(xᵢ − x̄)²]

obtained by adding the variance of a single value of y to the variance of the
regression estimate, Yᵢ.
Thus, for a particular xᵢ, there is a probability of 100(1 − α)% that the
corresponding value of y that would be observed will lie in the interval

    Yᵢ ± t(α/2, ν) ε_yᵢ
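Both sets of limits can be sketched together (our illustration; the function name, toy data and t value are arbitrary). t_crit is the tabled t value for n − 2 degrees of freedom:

```python
import math

def interval_half_widths(xs, ys, x0, t_crit):
    """Half-widths at x = x0 of the confidence interval for the regression
    line and of the prediction interval for a single new value of y."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
    s2 = sum((y - (ybar + b * (x - xbar))) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    eps_conf = math.sqrt(s2 / n + (s2 / sxx) * (x0 - xbar) ** 2)   # regression line
    eps_pred = math.sqrt(s2 + eps_conf ** 2)                       # single new value
    return t_crit * eps_conf, t_crit * eps_pred

xs, ys = [1, 2, 3, 4, 5], [2, 4, 5, 4, 6]
hc_mid, hp_mid = interval_half_widths(xs, ys, 3, 2.0)   # at xbar
hc_end, hp_end = interval_half_widths(xs, ys, 5, 2.0)   # at the end of the range
```

The prediction interval is always the wider of the two, and both are narrowest at x̄.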



9.2.7 Correlation Coefficient (r)


A measure closely related to the regression coefficient (b) is the correlation
coefficient (r).
The correlation coefficient (r) is a measure of the degree of (linear) association
between the two variables and is defined as

    r = Σ(x − x̄)(y − ȳ) / √[Σ(x − x̄)² Σ(y − ȳ)²]

The observed correlation coefficient can be tested for significant departure from
zero but, as in the case of the regression coefficient, b, a significant value does
not necessarily imply any causal relationship between x and y.
The residual variance about the regression line was defined in section 9.2.4 as the
sum of the squared deviations of each observed value of y from its estimated value
using the fitted regression equation, this sum being divided by (n − 2), its degrees
of freedom. It is related to the correlation coefficient and to the total variance
of y as follows

    s² = Σ(yᵢ − Yᵢ)² / (n − 2) = s_y²(1 − r²)(n − 1)/(n − 2)

which for large n is approximately equal to s_y²(1 − r²).


For large n, it follows that a useful interpretation of this result is that r²
measures the proportion of the total variance of y that is ‘explained’ by the
linear relation between y and x.
r² can take values between 0 and 1 inclusive and hence for any set of data,
r will be in the range

    −1 ≤ r ≤ +1

When r = ±1, then the total variance of y is completely explained by the
variation in x; in other words the relationship is deterministic.
Figure 9.1 shows three sets of data with different values of the correlation
coefficient (r). In the first two cases, the regression coefficient b is the same.
A further useful relationship is that between the regression coefficient (b)
and the correlation coefficient (r), which is

    b = r × s_y / s_x
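A direct transcription of the definition (our sketch, with arbitrary toy data) also verifies this identity numerically; since the (n − 1) divisors cancel in s_y/s_x, raw sums of squares can be used:

```python
import math

def correlation(xs, ys):
    """r = sum((x - xbar)(y - ybar)) / sqrt(sum((x - xbar)^2) * sum((y - ybar)^2))."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs, ys = [1, 2, 3, 4, 5], [2, 4, 5, 4, 6]    # xbar = 3, ybar = 4.2
r = correlation(xs, ys)
sx = math.sqrt(sum((x - 3) ** 2 for x in xs))
sy = math.sqrt(sum((y - 4.2) ** 2 for y in ys))
b = r * sy / sx                              # equals the regression coefficient
```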

where s_x² is the variance of the values of x, i.e.

    s_x² = Σ(xᵢ − x̄)² / (n − 1)

with s_y² defined similarly for y.

    (a) r = +1.0        (b) r = +0.5        (c) r = 0

    Figure 9.1

In practice, it is usual for all these calculations to be carried out on some


form of calculating machine or computer, though there is no reason, apart
from the tedious arithmetic involved, why they should not be done ‘by hand’
preferably with suitable coding of the data.
The coefficients are computed as follows

    a = Σyᵢ / n

    b = [Σxᵢyᵢ − (Σxᵢ)(Σyᵢ)/n] / [Σxᵢ² − (Σxᵢ)²/n]
The correlation coefficient

    r = [Σxᵢyᵢ − (Σxᵢ)(Σyᵢ)/n] / √{[Σxᵢ² − (Σxᵢ)²/n][Σyᵢ² − (Σyᵢ)²/n]}

Also the regression coefficient

    b = r × s_y / s_x

Thus the following totals are required for the computation

    n,  Σxᵢ,  Σxᵢ²,  Σxᵢyᵢ,  Σyᵢ  and  Σyᵢ².
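These six totals are all that is needed; a sketch of the computation (ours, with a made-up exact line as the check case):

```python
import math

def regression_from_totals(n, sx, sxx, sy, syy, sxy):
    """a, b and r from n, sum(x), sum(x^2), sum(y), sum(y^2), sum(xy)."""
    cxy = sxy - sx * sy / n        # corrected sum of products
    cxx = sxx - sx ** 2 / n        # corrected sum of squares of x
    cyy = syy - sy ** 2 / n        # corrected sum of squares of y
    a = sy / n
    b = cxy / cxx
    r = cxy / math.sqrt(cxx * cyy)
    return a, b, r

# For the exact line y = 2 + 3x through x = 0, 1, 2, 3:
a, b, r = regression_from_totals(4, 6, 14, 26, 214, 54)
```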



9.2.8 Transformations
In some problems the relationship between the variables, when plotted or from
a priori knowledge, is found not to be linear. In many of these cases it is possible
to transform the variables to make use of linear regression theory.
For example, in his book Statistical Theory with Engineering Applications
(Wiley), Hald discusses the problem of the relationship between tensile strength
of cement (y) and its curing time (x).
From a priori knowledge a relationship of the form y = A e^(−β/x) is to be
expected.
The simple logarithmic transformation therefore gives

    log₁₀ y = log₁₀ A − (β/x) log₁₀ e

so the logarithm of the tensile strength is a linear function of the reciprocal
of the curing time, and the theory of linear regression can then be applied.
Note: The requirement that the variance of y is constant for all x must, of
course, hold in the transformation and this must be checked. Usually a visual
check is adequate.
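As an illustration of such a transformation (our sketch, with made-up constants A = 50 and β = 2, not Hald's data), noise-free values generated from y = A e^(−β/x) become exactly linear in u = 1/x after taking logarithms, and the constants are recovered by ordinary linear regression:

```python
import math

A_true, beta_true = 50.0, 2.0          # assumed constants for the illustration
xs = [1.0, 2.0, 4.0, 8.0, 16.0]
ys = [A_true * math.exp(-beta_true / x) for x in xs]

us = [1.0 / x for x in xs]             # transformed independent variable
vs = [math.log10(y) for y in ys]       # log10(y) = log10(A) - (beta/x) log10(e)

n = len(us)
ubar, vbar = sum(us) / n, sum(vs) / n
slope = (sum((u - ubar) * (v - vbar) for u, v in zip(us, vs))
         / sum((u - ubar) ** 2 for u in us))
A_est = 10 ** (vbar - slope * ubar)       # intercept, back-transformed
beta_est = -slope / math.log10(math.e)    # slope = -beta * log10(e)
```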

9.2.9 Example on the Use of Regression Theory


The following example has been selected to illustrate the various concepts,
computational methods and analysis.
In order to keep the computation inside reasonable limits, the number of
observations has been kept small; in practice, however, in many actual problems
hundreds of readings are involved, but with the use of computers the
computation is no problem.
The data given in table 9.1 show the relationship between the scoring of post-
graduate students in a numeracy test on interview and their performance in the
final quantitative examination.

    Student                      1    2    3    4    5    6    7    8    9   10
    Numeracy test score (pts)  200  175  385  300  350  125  440  315  275  230
    Final exam performance (%) [individual marks illegible in this copy; the
                                totals used below are Σy = 602 and Σy² = 37 050]

    Table 9.1

What is the best relationship between test score and final performance?
Before any analysis is started the scatter diagram must be plotted to test the
assumption that the relationship is linear. This diagram shows no evidence of
non-linearity (figure 9.2).

[Figure: scatter diagram of score in final exam (%) against test score, with the
fitted regression line, 95% confidence limits and 95% prediction limits.]

Figure 9.2. Regression line with 95% confidence limits and prediction limits.

Here y = final exam performance
     x = numeracy test score.
No attempt has been made to code the data and the various summations
required for analysis are given below

    n = 10           Σxᵢ = 2795          Σxᵢ² = 868 325
    Σyᵢ = 602        Σyᵢ² = 37 050       Σxᵢyᵢ = 175 960

Total variance of x

    s_x² = [Σxᵢ² − (Σxᵢ)²/n] / (n − 1) = (868 325 − 2795²/10) / 9
         = 87 122.5 / 9 = 9680.3

Total variance of y

    s_y² = (37 050 − 602²/10) / 9 = 810 / 9 = 90.0

Correlation Coefficient

    r = (175 960 − (2795 × 602)/10) / √(87 122.5 × 810) = 7701/8401 = +0.92

Regression Coefficients

    a = Σyᵢ/n = 602/10 = 60.2        b = 7701 / 87 122.5 = +0.088

The regression line is thus given by

    Y = 60.2 + 0.088(x − 279.5) = 35.6 + 0.088x

Residual Variance about the Regression Line

The approximate residual variance using the relation given in 9.2.7 is

    s² ≈ s_y²(1 − r²) = 90(1 − 0.92²) = 13.8

Note: This will be slightly in error through squaring a rounded value of r.
A better approach, to ensure arithmetical accuracy, would be to calculate
s² as

    90.0 (1 − 7701²/8401²) = 14.4

However, in addition, in this example, n (=10) is not very large and the
approximation used will lead to an underestimate of the actual residual variance.
Using the exact expression gives

    s² = s_y²(1 − r²)(n − 1)/(n − 2) = 14.4 × 9/8 = 16.2

and this will be used in the remaining calculations since, without this correction,
the bias of the estimator is about −11% of the true value.
The residual standard deviation, s, is √16.2 = 4.02

Standard Errors of the Regression Coefficients

    ε_a = √(s²/n) = √(16.2/10) = 1.27

    ε_b = √[s² / Σ(xᵢ − x̄)²] = √(16.2 / 87 122.5) = 0.0136

Significance of b
Assuming E[b] = β = 0, the observed value of t is

    t = (0.088 − 0) / 0.0136 = 6.5

with 8 degrees of freedom.
Reference to table 7* shows that this value exceeds the 0.1% level of t
(5.041 for the two-sided test) and hence the observed value of b = + 0.088 is
very significantly different from zero. This implies that there is a strong linear
relation between y and x, i.e. between final quantitative examination
performance and initial numeracy test score.
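As a numerical check (ours, not part of the original text), the whole calculation can be reproduced from the six totals given above; carrying full precision gives s² = 16.1 and t = 6.5, in line with the rounded 16.2 used in the text:

```python
import math

# Totals for the numeracy-test example of section 9.2.9:
n, sx, sxx = 10, 2795, 868325
sy, syy, sxy = 602, 37050, 175960

cxx = sxx - sx ** 2 / n             # 87 122.5
cyy = syy - sy ** 2 / n             # 809.6
cxy = sxy - sx * sy / n             # 7701.0

a = sy / n                          # 60.2
b = cxy / cxx                       # 0.0884, i.e. 0.088
r = cxy / math.sqrt(cxx * cyy)      # 0.917, i.e. 0.92

s2 = (cyy - cxy ** 2 / cxx) / (n - 2)   # exact residual variance, about 16.1
eps_b = math.sqrt(s2 / cxx)             # 0.0136
t = b / eps_b                           # about 6.5, very highly significant
```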

Confidence Limits for the Regression Line


The standard error, ε_Yᵢ, of the regression estimate is

    ε_Yᵢ = √[ε_a² + ε_b²(xᵢ − x̄)²] = √[1.27² + 0.0136²(xᵢ − 279.5)²]

Thus for

    xᵢ = x̄ = 279.5,       ε_Yᵢ = √1.27² = 1.27
    xᵢ = 380 (or 179),    ε_Yᵢ = √(1.27² + 0.0136² × 100.5²) = 1.87
    xᵢ = 440 (or 119),    ε_Yᵢ = √(1.27² + 0.0136² × 160.5²) = 2.53
For any given value of xᵢ, the confidence limits for the regression estimate
(i.e. of the mean value of y for that value of x) are found as

    Yᵢ ± t(α/2, n−2) ε_Yᵢ
For 95% limits, the appropriate value of t (table 7*) is 2.306; table 9.2 shows
the derivation of the actual limits for a range of values x.
The scatter diagram (drawn before any computations were carried out, in
order to check that the basic regression assumptions were not obviously violated),
the fitted regression line and 95% confidence limits are shown in figure 9.2.
From figure 9.2 or table 9.2, there is 95% confidence that the average final
examination percentage for all candidates who score 330 points in their initial
numeracy test will lie between 61.3% and 67.9%.
    xᵢ      Yᵢ     ε_Yᵢ    2.31 ε_Yᵢ    Lower 95% limit     Upper 95% limit
                                        (Yᵢ − 2.31 ε_Yᵢ)    (Yᵢ + 2.31 ε_Yᵢ)

    119     46.1   2.53      5.8             40.3                51.9
    179     51.4   1.87      4.3             47.1                55.7
    229     55.8   1.44      3.3             52.5                59.1
    279.5   60.2   1.27      2.9             57.3                63.1
    330     64.6   1.44      3.3             61.3                67.9
    380     69.0   1.87      4.3             64.7                73.3
    440     74.3   2.53      5.8             68.5                80.1

                            Table 9.2

Prediction Limits for a Single Value of y for Given x


The standard error, ε_yᵢ, of a single value of y corresponding to a given value of
xᵢ is

    ε_yᵢ = √[s² + ε_a² + ε_b²(xᵢ − x̄)²]

Limits within which 95% of all possible values of y for a given xᵢ will lie are
found as

    Yᵢ ± 2.31 ε_yᵢ
These limits are calculated in table 9.3 and are also drawn in figure 9.2.

                               Lower 95%             Upper 95%
                            prediction limit      prediction limit
    xᵢ      Yᵢ     ε_yᵢ     (Yᵢ − 2.31 ε_yᵢ)      (Yᵢ + 2.31 ε_yᵢ)

    119     46.1   4.8           35.0                  57.2
    179     51.4   4.4           41.2                  61.6
    229     55.8   4.3           45.9                  65.7
    279.5   60.2   4.2           50.5                  69.9
    330     64.6   4.3           54.7                  74.5
    380     69.0   4.4           58.8                  79.2
    440     74.3   4.8           63.2                  85.4

                            Table 9.3

From the figures in table 9.3, it can be expected, for example, that 95% of
candidates scoring 330 points in their numeracy test will achieve a final
examination mark between 55% and 74% inclusive, 5% of candidates gaining
marks outside this range.

Note: Such predictions are only likely to be at all valid if the sampled data
used to calculate the regression relation are representative of the same population
of students (and examination standards) for which the prediction is being made.
In other words, care must be taken to see that inferences really do apply to the
population or conditions for which they are made.
The danger of extrapolation has been mentioned. The regression equation
indicates that students scoring zero in the test, on average, gain a final mark of
35.6%. This may be so but it is very likely that the relation between the two
examination performances is not linear over all values of x. Conclusions on the
given data should only be made for x in the range 125 to 440.

9.3 Problems for Solution

1. The shear strength of electric welds in metal sheets of various thickness is


given in table 9.4.

    Thickness of      Shear strength
    sheets (mm)       of sheets (kg)

       0.2                 102
       0.3                 129
       0.4                 201
       0.5                 342
       0.6                 420
       0.7                 591
       0.8                 694
       0.9                 825
       1.0                1014
       1.1                1143
       1.2                1219

          Table 9.4

Calculate the linear relationship between strength and thickness and give the
limits of accuracy of the regression line.

2. The following problem is based on an example in Ezekiel’s Methods of


Correlation Analysis and shows for 20 farms, the annual income in dollars
together with the size of the farm in hectares (i.e. units of 10 000 m?). The
data are given in table 9.5.
Find the best linear relationship between the size of farm and income and

    Size of farm (ha)    Income ($)
          (x)               (y)

           60               960
          220               830
          180              1260
           80               610
          120               590
          100               900
          170               820
          110               880
          160               860
          230               760
           70              1020
          120              1080
          240               960
          160               700
           90               800
          110              1130
          220               760
          110               740
          160               980
           80               800

          Table 9.5

state the limits of error in using this relationship to predict farm income from
farm size.

3. The data obtained from a controlled experiment to determine the


relationship between y and x are given below
    x:   …    10    15    20    30    40    56    65    80
    y:   (values illegible in this copy)

Calculate the linear regression line.

4. A manufacturer of optical equipment has the following data on the unit


cost of certain custom-made lenses and the number of units in each order.
    Number of units      1    3    5   10   12   (x)
    Cost per unit (£)   58   55   40   37   22   (y)

(a) Calculate the regression coefficients and thus the regression equation

which will enable the manufacturer to predict the unit cost of these lenses in
terms of the number of lenses contained in each order.
(b) Estimate the unit cost of an order for eight lenses.

5. The work of wrapping parcels of similar boxes was broken down into eight
elements. The sum of the basic seconds per parcel (i.e. of these eight elements)
together with the number of boxes in each parcel is given in table 9.5.

    Number of boxes   Sum of basic         Number of boxes   Sum of basic
    in parcel         seconds per parcel   in parcel         seconds per parcel
       (x)                 (y)                (x)                 (y)

        1                  130                22                  260
        6                  200                27                  190
       13                  150                34                  290
       19                  200                42                  270

          Table 9.5

(a) Calculate the constant basic seconds per parcel and the basic seconds for
each additional box in the parcel.
Calculate the linear regression and test its significance.
(b) What would be the best estimate of the basic seconds for wrapping a
parcel of 18 boxes?

6. A manufacturer of farm tools wishes to study the relationship between his


sales and the income of farmers in a certain area. A sample of 11 regions showing
the income level of farmers in that area, together with the total sales to the
area, gave the data in table 9.6. Of what use is this information to the
manufacturer?
    Income level of     Total sales to      Income level of     Total sales to
    farms in area ($)   farms in area ($)   farms in area ($)   farms in area ($)

        1300                2800                1300                3000
         900                1900                1200                2600
        1400                3200                 800                3300
        1000                2400                1400                1500
         800                1700                 700                1600
         900                2000

          Table 9.6

7. The following example illustrates the application of regression analysis to


time series.
The annual sales of a product over eight years are given below

    1960   1961   1962   1963   1964   1965   1966   1967
     300    215    450    325    375    300    375    400

Estimate the best linear time trend and calculate confidence limits for
forecasting.

9.4 Solutions to Problems

1. Let x = thickness of sheet (mm)


y = shear strength of sheet (kg)
    n = 11           Σxᵢ = 7.7           Σxᵢ² = 6.49
    Σyᵢ = 6680.0     Σyᵢ² = 5 692 958    Σxᵢyᵢ = 6008.0
    x̄ = 0.7          ȳ = 607.3

Variance of x

    s_x² = (6.49 − 7.7²/11) / 10 = 1.10/10 = 0.110

Total Variance of y

    s_y² = (5 692 958 − 6680²/11) / 10 = 1 636 376.2 / 10 = 163 637.6

Correlation Coefficient

    r = (6008 − 7.7 × 6680/11) / √(1.10 × 1 636 376.2) = 1332/1341.7 = +0.9928

The proportion of the total variance of y ‘explained’ by the linear regression
relation between y and x is approximately 0.9928² or 98.6%.

Regression Line

    a = ȳ = 607.3

    b = r × s_y/s_x = 0.9928 × √(163 637.6 / 0.110) = 1210.9

The linear regression line is given by

    Y − 607.3 = 1210.9(x − 0.7)    or    Y = −240.3 + 1210.9x

Standard Errors
The estimated residual variance about the regression line is

    s² = s_y²(1 − r²)(n − 1)/(n − 2) = 163 637.6 (1 − 0.9928²) × 10/9 = 2608.7

thus

    ε_a = √(2608.7/11) = 15.40        ε_b = √(2608.7/1.10) = 48.7

Test of Significance of b
From the evidence of the scatter diagram and the high value of r, the observed
value of b is expected to be significant. In confirmation, the test gives

    t = 1210.9/48.7 = 24.9

a very highly significant value of t for 9 degrees of freedom.
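A quick machine check of this hand computation (ours), straight from the raw data of table 9.4:

```python
# Data of table 9.4: sheet thickness x (mm), weld shear strength y (kg).
xs = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2]
ys = [102, 129, 201, 342, 420, 591, 694, 825, 1014, 1143, 1219]

n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n                        # 0.7 and 607.3 (rounded)
cxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))   # 1332.0
cxx = sum((x - xbar) ** 2 for x in xs)                       # 1.10
b = cxy / cxx                                                # 1210.9
a0 = ybar - b * xbar                                         # intercept, about -240.3
```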

Confidence Limits and Prediction Limits

The estimated standard error of the regression line is

    ε_Yᵢ = √[ε_a² + ε_b²(x − x̄)²]

and the estimated standard error of a single predicted value of y for given x is

    ε_yᵢ = √[s² + ε_a² + ε_b²(x − x̄)²]

Table 9.7 shows some values of these two standard errors for particular
values of x, together with the 95% confidence and prediction limits using the
appropriate t-value of 2.26 (9 degrees of freedom).
The information in this table, as well as the observed data are plotted in
figure 9.3.
Notice that the fitted ‘best’ line does not go through the origin. In fact the
origin is not contained within the 95% confidence interval for the ‘true’
regression line—which is equivalent to saying that the intercept of the fitted
line is significantly (5% level) different from zero. From inspection of the
observed data, there is a suggestion that the true relation curves towards the
origin for low values of sheet thickness. In short, do not extrapolate for
thickness values below 0.2 mm and bear in mind that the calculated relationship
for sheet thicknesses of 0.2 mm and just above may underestimate the average
shear strength of welds.
            95% confidence limits             95% prediction limits
            for regression line               for single values

    xᵢ      Yᵢ      ε_Yᵢ    Yᵢ ± 2.26 ε_Yᵢ    ε_yᵢ     Yᵢ ± 2.26 ε_yᵢ

    0.2       1.9   28.81     −63,   67       58.64      −131,  134
    0.3     123.0   24.83      67,  179       56.79        −5,  251
    0.4     244.1   21.23     196,  292       55.31       119,  369
    0.5     365.2   18.22     324,  406       54.23       243,  488
    0.6     486.2   16.15     450,  523       53.57       365,  607
    0.7     607.3   15.40     572,  642       53.35       487,  728
    0.8     728.4   16.15     692,  765       53.57       607,  849
    0.9     849.5   18.22     808,  891       54.23       727,  972
    1.0     970.6   20.23     923, 1019       55.31       846, 1096
    1.1    1091.7   24.83    1036, 1148       56.79       963, 1220
    1.2    1212.8   28.81    1148, 1278       58.64      1080, 1345

                            Table 9.7

[Figure: scatter of shear strength of sheets (kg) against thickness of sheets (mm),
with the fitted regression line, 95% confidence limits and prediction limits.]

Figure 9.3. Regression line with 95% confidence limits and prediction limits.

In the following solutions, since the calculations are all similar to that of
problem 1, the detailed computations are not given.

2. Here the scatter diagram (figure 9.4) shows little evidence of a relationship
but, on the other hand, it does not offer any evidence against the linearity
assumption so the computation is as follows.

    n = 20                x̄ = 139.5
    Σx = 2790             ȳ = 872.0
    Σy = 17 440           s_x² = 3194.47
    Σx² = 449 900         s_y² = 28 711.58
    Σy² = 15 753 200      r = +0.0078
    Σxy = 2 434 300       s_x = 56.5
                          s_y = 169.4

    b = 0.0078 × 169.4/56.5 = +0.0234
[Figure: scatter of income ($) against size of farm (ha), showing no apparent
relationship.]

    Figure 9.4

Regression Line

    Y − 872 = 0.0234(x − 139.5)        Y = 868.7 + 0.0234x

Significance of b
From inspection of the scatter diagram (figure 9.4) and the low value of r (the
significance of which can be tested using table 10*), the observed value of b is
not expected to differ significantly from zero.
Residual variance

    s² = 28 711.58 (1 − 0.0078²) × 19/18 = 30 305

Standard error of b

    ε_b = √(30 305 / 60 695) = 0.707

Thus the observed value of

    t = (0.0234 − 0) / 0.707 = 0.033

which is clearly not significant. (For the slope of the fitted regression line to be
significantly different from zero, at the 5% level, the observed value of t would
have to be numerically larger than 2.101.)
Thus, until further evidence to the contrary is obtained, farm income can be
assumed to be independent of farm size, at least for the population of farms
covered by the sample of 20 farms.
Since the data show no evidence of a relation between farm size and income,
there is little point in retaining the fitted regression equation. The best estimate
of the mean income of farms in the given population is therefore $872.
Ninety-five per cent confidence limits for this mean income are given by

    872 ± 2.101 × 169.4/√20 = 872 ± 79.6 = $792.4 to $951.6

Ninety-five per cent prediction limits for the income of an individual farm
are given as

    872 ± 2.101 × 169.4 √(1 + 1/20) = 872 ± 364.7 = $507.3 to $1236.7

3. This problem is of interest since the assumption of linearity can be quite


safely rejected after drawing the scatter diagram (figure 9.5). There is therefore
no point in trying to fit a single linear relationship to the data.

[Figure: scatter of y against x, rising steeply and then flattening; the
relationship is not linear, so the analysis cannot be continued.]

    Figure 9.5

In practice either a polynomial or other mathematical function would be


fitted to the observed data or else a suitable transformation of the values of
either x or y or both would be used to give an approximately linear relation. In
this latter case, the standard methods could be used to find the linear regression
relation between the transformed y-values and the transformed x-values.
However, since both methods are beyond the scope of this chapter, the
answer here is that linear regression analysis cannot validly be used directly with
these data. Although there appears to be a relationship, it is not linear.

4. The scatter diagram (figure 9.6) indicates quite a strong relationship between
unit cost and order size, and a simple linear relation would probably be adequate,
at least in the range of order size considered. Such a simple model would be
inadequate for extrapolation purposes since the cost per unit would be expected
to tend towards a fixed minimum value as order size was increased indefinitely
and therefore some sort of exponential relation would be a better fit for such
purposes.
[Figure: scatter of cost per unit (£) against number of units in order, with the
fitted regression line and 95% confidence limits; unit cost falls as order size
increases.]

    Figure 9.6

(a) The required totals of the basic data are

    x = number of units in an order
    y = cost per unit (£)

    n = 5
    Σx = 31            x̄ = 6.2
    Σx² = 279          s_x² = 21.7
    Σy = 212           ȳ = 42.4
    Σy² = 9842         s_y² = 213.3
    Σxy = 1057         r = −0.9459

Regression Line

    a = ȳ = 42.4

    b = r × s_y/s_x = −0.9459 × √(213.3/21.7) = −2.97

    (Y − 42.4) = −2.97(x − 6.2)    or    Y = 60.8 − 2.97x

Significance of b
Residual variance

    s² = 213.3 [1 − (−0.9459)²] × 4/3 = 29.94

Standard error of b

    ε_b = √(29.94/86.8) = 0.587

Observed value of

    t = (−2.97 − 0) / 0.587 = −5.06

Reference to table 7* for 3 degrees of freedom shows that the value of |t|
for significance at the 1% level is 5.841 and at the 2% level is 4.541. The
observed value of t falls between the two and it may reasonably be inferred that
the slope of the ‘true’ regression line is different from zero and is negative, the
best estimate of its value being −2.97.

Confidence Limits for the Regression Line


The standard error of the regression estimate is

    ε_Yᵢ = √[ε_a² + ε_b²(xᵢ − x̄)²] = √{29.94 [1/5 + (xᵢ − 6.2)²/86.8]}

95% confidence limits for the regression estimate at several values of xᵢ are
derived in table 9.8, figure 9.6 showing these limits plotted on the scatter diagram.

    xᵢ      Yᵢ     ε_Yᵢ    Yᵢ − 3.18 ε_Yᵢ    Yᵢ + 3.18 ε_Yᵢ

    1      57.8    3.91        45.4               70.2
    3      51.9    3.09        42.1               61.7
    6.2    42.4    2.45        34.6               50.2
    10     31.1    3.31        20.6               41.6
    12     25.2    4.19        11.9               38.5

                    Table 9.8

(b) To estimate the unit cost of an order for eight lenses, substitution of
x = 8 can be made in the regression equation giving

    Y = 60.8 − 2.97 × 8 = 37.0

This figure is the ‘best’ estimate, over all possible orders of eight lenses,
of the cost per lens in an order of eight lenses.
The uncertainty of this figure (£37.0) is given by the interval (at 95%
confidence) £28.50 to £45.50.
If required, the cost per lens for a randomly selected order for eight lenses
is likely to be (95% probability) in the interval £17.64 to £56.36, a very wide
range indeed.

5. The scatter diagram (figure 9.7) does not show any evidence against the
assumption of linearity and in this example, a priori logic suggests that it would
be a reasonable model of the situation.
Let x = the number of boxes in a parcel and y = the number of basic seconds
per parcel.

[Figure: scatter of basic seconds per parcel against number of boxes in parcel,
with the fitted regression line and 95% confidence limits.]

    Figure 9.7

The following totals are obtained from the data (without coding)

    n = 8
    Σx = 164           x̄ = 20.5
    Σx² = 4700         s_x² = 191.14
    Σy = 1690          ȳ = 211.25
    Σy² = 380 100      s_y² = 3298.21
    Σxy = 39 130       r = +0.8069

Regression Line

    a = ȳ = 211.25

    b = r × s_y/s_x = 0.8069 × √(3298.21/191.14) = 3.35

    (Y − 211.25) = 3.35(x − 20.5)        Y = 142.6 + 3.35x

Significance of b
Residual variance

    s² = 3298.21 (1 − 0.8069²) × 7/6 = 1342.5

Standard error of b

    ε_b = √(1342.5/1338) = 1.002

Observed value of

    t = (3.35 − 0) / 1.002 = 3.34

Reference to table 7* shows that this value, having 6 degrees of freedom, falls
between the 2% and 1% levels of significance (3.143 and 3.707 respectively). The
slope of the regression line can therefore be assumed to be different from zero
with b = 3.35 as its best estimate.
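As a check of this arithmetic (ours): with the sixth x-value read as 27, the value consistent with the printed totals Σx = 164, Σx² = 4700 and Σxy = 39 130, the slope and the wrapping-time estimate of part (b) follow directly:

```python
# Parcel data of table 9.5: boxes per parcel x, basic seconds per parcel y.
xs = [1, 6, 13, 19, 22, 27, 34, 42]   # 27 inferred from the printed totals
ys = [130, 200, 150, 200, 260, 190, 290, 270]

n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n                        # 20.5, 211.25
cxx = sum((x - xbar) ** 2 for x in xs)                       # 1338.0
cxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))   # 4485.0
b = cxy / cxx                                                # 3.35 s per extra box
a0 = ybar - b * xbar                                         # 142.5 s constant element
est_18 = a0 + b * 18                                         # about 203 s for 18 boxes
```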

Confidence Limits for the Regression Line

The standard error of the regression estimate for given xᵢ is

    ε_Yᵢ = √{1342.5 [1/8 + (xᵢ − 20.5)²/1338]}

Table 9.9 shows values of ε_Yᵢ for certain xᵢ together with 95% confidence
limits for the regression estimate at that point. The scatter diagram (figure 9.7)
also has 95% confidence limits drawn on it.

    xᵢ      Yᵢ       ε_Yᵢ    Yᵢ − 2.45 ε_Yᵢ    Yᵢ + 2.45 ε_Yᵢ

    1     145.95    23.44        88.5              203.4
    5     159.35    20.22       109.8              208.9
    10    176.10    16.69       135.2              217.0
    20    209.60    12.96       177.8              241.4
    30    243.10    16.07       203.7              282.5
    40    276.60    23.44       219.2              334.0

                    Table 9.9

The analysis therefore gives the following estimates


(a) The constant basic seconds per parcel (i.e. the value of Y at x = 0) = 142.6 s
and the basic seconds per additional box = 3.35 s.
(b) The average time to wrap a parcel of 18 boxes is 202.9, or 203 s,
although the 95% prediction interval for the time taken to wrap a single parcel
of 18 boxes is from 107 s to 298 s.

6. Here, in order to reduce the computation slightly, all the basic data have
been coded into units of $100; i.e. $1300 becomes 13 etc.
The scatter diagram (figure 9.8) illustrates the case of ‘fliers’ or ‘outliers’, i.e.
readings which do not appear to belong to the bivariate distribution. These
suspect readings are marked as A and B in figure 9.8. Whenever such observations
occur in practice, a decision has to be made as to whether or not to exclude
them. Special tests to assist in this are available but are beyond the level of this
book and all that can be said here is that the source of the readings should be
carefully examined and if any reason is found for their not being homogeneous
with the others, they should then be rejected. In many cases, a commonsense
approach will indicate what should be done.
In this example, the two points, A and B, clearly do not conform and a closer
examination of the situation would probably isolate a reason so that the points
could validly be excluded. However, to demonstrate their strong effect on the
analysis, the points A and B have been retained in fitting the regression line.

[Figure: scatter of total sales ($) against income level of farms ($), with the
fitted (not significant) regression line; two outlying points are marked A and B.]

    Figure 9.8

    x = income level (in $100)
    y = total sales (in $100)

    n = 11             x̄ = 10.64
    Σx = 117           s_x² = 6.85
    Σx² = 1313         ȳ = 23.64
    Σy = 260           s_y² = 43.45
    Σy² = 6580         r = +0.3566
    Σxy = 2827

Regression Line

    a = ȳ = 23.64

    b = r × s_y/s_x = 0.3566 × √(43.45/6.85) = 0.898

    (Y − 23.64) = 0.898(x − 10.64)        Y = 14.09 + 0.898x (in $100)

or converting back to the original units

    Y = 1409 + 0.898x (in $)

Significance of b
The residual variance about the line,

    s² = 43.45 (1 − 0.3566²) × 10/9 = 42.14

and the standard error of b is ε_b = √(42.14/68.55) = 0.784. The observed value of

    t = (0.898 − 0) / 0.784 = 1.15

a value which is not significantly high.


The regression line calculated above could therefore be misleading since the
observed data as a whole show no evidence of a linear relation between y and x.
However, as mentioned above, the analysis can be carried out omitting
readings A and B if a valid reason to do so is found. If this is done, the
calculations give
    n = 9
    Σx = 95            Σy = 212
    Σx² = 1053         Σy² = 5266
    Σxy = 2353

leading to

    r = 0.9854    and    Y = −66.15 + 2.29x (in $)
The fact that just two points have obscured the relationship should be noted,
as should the assistance given by the scatter diagram towards interpretation of the
situation.

7. This example illustrates the simple application of regression analysis to time


series data.

Note: No attempt will be made to justify forecasting from such analysis


(beware of extrapolation) but the method of fitting the linear regression line
is given.
As usual, the scatter diagram is plotted and is shown in figure 9.9.
To reduce the size of numbers involved in the computation, the years are
coded, 1960 being taken as Year 1 and so on up to 1967 as Year 8.

[Figure: annual total sales plotted against year, 1960 to 1967 (coded 1 to 8),
with the fitted regression line.]

    Figure 9.9

The scatter diagram shows no strong relationship between the variables (sales and time), nor is there any apparent evidence of non-linearity, so the results for the straight-line regression are as shown below.

n = 8             ȳ = 342.5
Σx = 36           sx² = 6.0
Σx² = 204         sy² = 5307.1
Σy = 2740         x̄ = 4.5
Σy² = 975 600     r = 0.4403
Σxy = 12 880

Regression Line

a = ȳ = 342.5

b = r√(sy²/sx²) = 0.4403 √(5307.1/6.0) = 13.095

(Y − 342.5) = 13.095(x − 4.5), i.e. Y = 283.57 + 13.09x

where x is in coded units.

Significance of b
The residual variance about the line is

s² = 5307.1[1 − (0.4403)²] × 7/6 = 4991.3

The standard error of b is

s.e.(b) = √[s²/Σ(x − x̄)²] = √(4991.3/42) = 10.90

and

t = (b − 0)/s.e.(b) = 13.09/10.90 = 1.20

with 6 degrees of freedom.
Reference to table 7* shows that this is not significantly different from
zero, that is, there is no evidence of a relationship between sales and time. In
this case there is no point in using the regression equation above to estimate sales
for 1968 (Year 9) or beyond. The average yearly sales figure of 342 is probably
as good a figure as any to use for making a short-term forecast on the basis of the
information given.
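The same numerical check applies to the time-series example. This sketch (again with our own variable names) reproduces the slope, its standard error, and the t value:

```python
import math

# Summary statistics for sales in Years 1-8 (1960 coded as Year 1)
n, sx, sxx, sy, syy, sxy = 8, 36, 204, 2740, 975_600, 12_880

Sxx = sxx - sx**2 / n              # = 42
Syy = syy - sy**2 / n              # = 37 150
Sxy = sxy - sx * sy / n            # = 550

r = Sxy / math.sqrt(Sxx * Syy)     # correlation, about 0.44
b = Sxy / Sxx                      # slope, 13.095

s2 = Syy * (1 - r**2) / (n - 2)    # residual variance, about 4991
se_b = math.sqrt(s2 / Sxx)         # standard error of b, about 10.90
t = b / se_b                       # t with 6 degrees of freedom
```

With t at about 1.20 on 6 degrees of freedom, the slope is not significantly different from zero, as the text concludes.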
The first two chapters of this book provide a detailed treatment of two of the basic concepts of statistics—Probability and Distribution, and thereafter, apart from brief summarised introductions to other topics, the work rests mainly on worked and unworked examples. Every attempt has been made to use examples which will stimulate interest and to demonstrate practical applications throughout the whole range of an introductory course in statistics. The book will be especially useful to students of engineering and to all those seeking an elementary introduction to the subject. Although the book can be used as an independent text, the attention of students is drawn to two other books by the same authors which may in certain cases be used to advantage alongside the present volume. These are Basic Statistics, Laboratory Instruction Manual, and Statistical Tables. Further details are available from the publishers.

SBN 333 12017 5

224 430.74797 .
226 485.45426 326
228 440)16822 328

1525. 423334 1809, 42753


1531. 04044 744 1815. 17009
1536, 66023 748 1820, 91499
48 1542. 28271 748 1826, 66221
B50 «1547.90787 750 1832, 42175
852 1553. 53570 782 1838, 18361
654 1559. 16619 754 1843.91778 854
658 1564. 79034 786 1848, 67425
1570, 43513 758 1855, 43301
660 1576, 07356 760 1861, 19407
1581,71462 762 41866, 95744
1587,35830 Yd 41873,72508
1593.00459 768 1878, 49002
1588,65350 768 1884. 26107
\ 1890, 0334°

286; oe 93986
T BBB 584, BB7LAy

You might also like