TEC 106-Complete Converted Notes-1

This document provides an introduction to statistics, including definitions of key terms: 1. Statistics is the collection, organization, analysis, interpretation and presentation of data. It originated from governments collecting information and has been developed through contributions from mathematicians. 2. The statistical unit refers to the basic unit of data being analyzed, such as a person, dollar, or kilogram. It indicates whether data is continuous or discrete. A population is the total set of units, while a sample is a subset selected from the population. 3. Other important terms introduced are parameter, statistic, random variation, and discrete vs. continuous data. Statistics is applied in engineering to analyze characteristics of populations from samples.

Uploaded by

King Edwin
TEC 106
PROBABILITY AND STATISTICS

Objectives
At the end of this course the student should be able to:
• calculate statistical functions, analyze samples and present them in tabular or graphical forms
• describe the concept of a random event and a random experiment
• describe and calculate discrete and continuous random variables

Course Content
Tabular and graphical representation of samples: frequency, relative frequency, absolute frequency, and distribution function; sample mean, sample variance and standard deviation.
Random experiments and events: Venn diagram, union, intersection, mutually exclusive events, multiplication rule, and complementation rule.
Discrete random variables: probability function, probability distribution function, mean and variance of a distribution.
Continuous random variables: continuous distributions, Binomial distributions, normal probability distribution.

Table of Contents
1. Introduction to Statistics
   1.1 Definition of Statistics
   1.2 Purpose of Engineering statistics
   1.3 Introduction to Basic terms
       1.3.1 Statistical unit
       1.3.2 Population and sample
       1.3.3 Variation and Random variation
       1.3.4 Random selection
       1.3.5 Experiment
   1.4 The statistical experiment
   1.5 Probability versus Statistics
2. Tabular and Graphical representation of Samples
   2.1 Grouping of data
   2.2 Tabular representations of data
       2.2.1 Tally chart
       2.2.2 Absolute frequency
       2.2.3 Relative frequency
       2.2.4 Cumulative absolute frequency
       2.2.5 Cumulative relative frequency
   2.3 Graphical representations of data
       2.3.1 Graphs of the Absolute frequency
             2.3.1.1 Bar chart
             2.3.1.2 Dot frequency diagram
       2.3.2 Graphs of the relative frequency
             2.3.2.1 Frequency histogram
             2.3.2.2 Frequency polygon
       2.3.3 Cumulative frequency function / curve
3. Measures of Central tendency
   3.1 Measures of position
       3.1.1 Quartiles
       3.1.2 Deciles
       3.1.3 Percentiles
   3.2 Measures of average
       3.2.1 Median
       3.2.2 Mode
       3.2.3 Arithmetic mean
       3.2.4 Geometric mean
       3.2.5 Harmonic mean
       3.2.6 Quadratic mean
4. Measures of dispersion or spread
   4.1 Range
   4.2 Mean deviation
   4.3 Quartile deviation
   4.4 Variance
   4.5 Standard deviation
   4.6 Relative measures of deviation
   4.7 Moments
   4.8 Measures of skewness
5. Random experiments and events
   5.1 Basic terminology
       5.1.1 Trial and outcome
       5.1.2 Sample space
   5.2 Venn diagram representations
       5.2.1 Complement
       5.2.2 Intersection
       5.2.3 Union
   5.3 Sequential experiments
   5.4 Sampling
   5.5 Tree diagram representations
6. Introduction to Probability
   6.1 Introduction
       6.1.1 Theoretical probability
       6.1.2 Empirical probability
   6.2 Concept of Statistical regularity
   6.3 Axioms of mathematical probability
   6.4 Conditional probability
   6.5 Permutations and combinations
7. Random variables
   7.1 Definition of the random variable
   7.2 Discrete random variables
       7.2.1 Discrete probability function
       7.2.2 Discrete distribution function
   7.3 Continuous random variables
       7.3.1 Probability density function
       7.3.2 Continuous distribution function
   7.4 Mean and variance of a probability distribution
       7.4.1 Mean or expectation
       7.4.2 Variance
8. Probability Distributions
   8.1 Discrete probability distributions
       8.1.1 Binomial distribution
       8.1.2 Hypergeometric distribution
       8.1.3 Poisson distribution
   8.2 Continuous probability distributions
       8.2.1 Normal distribution
       8.2.2 Standard normal distribution
9. Sample Examination Paper

Reading List
1. Elementary Statistics, 4th Ed., Robert Johnson (1984), PWS Publishers.
2. Mathematical Methods for Physics and Engineering, Riley K. F., Hobson M. P., Bence S. J. (2000), Cambridge University Press.
3. Engineering Mathematics, 4th Ed., Stroud K. A. (1995), Macmillan Press Ltd.
4. Advanced Engineering Mathematics, Jaggi V. P., Mathur A. B. (1992), Khanna Publishers, Delhi.
5. Advanced Engineering Mathematics, 5th Ed., Erwin Kreyszig (1993), Wiley Eastern Limited.
TOPIC 1.0
INTRODUCTION TO STATISTICS

The origin of statistics is as old as human society itself. Governments have long collected information regarding populations, armed forces, weapons, wealth, agriculture, economic conditions and various other activities of the state. This information, when properly collected, classified and presented, helped in the administration and future planning of the state.
Theoretical development of modern statistics was possible due to the contribution of the French mathematicians Blaise Pascal (1623 - 1662) and P. Fermat (1601 - 1665), who solved the famous 'Problem of Points', laying the foundation of the theory of probability, which is the backbone of modern statistics.

Section objectives
At the end of this section you should be able to:
1. Define statistics and highlight its applications in the field of Engineering.
2. Describe the meanings of the basic terms used in statistics such as the statistical unit, population, sample, data, random variation, random samples.
3. Distinguish between discrete and continuous data types.
4. Define the steps followed in statistical experimentation.
5. Describe the relationship between probability and statistics.

1.1 Definition of Statistics
Statistics is defined as the "science that involves the manipulation of the mass of numerical data emanating from activities of interest into forms from which useful conclusions can be drawn".
This manipulation of numerical data involves three stages, namely:
• Collection
• Summary, and
• Analysis
The process of drawing useful and necessary conclusions from summarised data is called statistical inference.

1.2 Purpose of Engineering Statistics
Statistics provides solutions to practical problems in all branches of Physics and Engineering. Typical engineering activities where statistical analysis is applied include:
• Inspection for quality control of raw materials and finished products in mass production
• Reliability testing for complex space age technology products
• Comparative studies of various kinds of machines used in production and output of workers
• Study of consumer reactions to new products and other allied problems, etc.
In this regard statistics has contributed substantially to development in all fields of engineering, including industrial management, operational research, managerial science and decision making.

1.3 Introduction to Basic Terms
The following terms are frequently used in statistical language.

1.3.1 Statistical unit
This is the unit of reference used in a compiled set of data. For example, data on the incomes of professionals in Kenya may have the Kenya Shilling (KES) as the statistical unit of reference. Other examples include kilograms, degrees Celsius or Fahrenheit, metres, etc.
The statistical unit gives an indication as to whether the sampled data is continuous or discrete.

1.3.2 Population and sample
Population refers to a collection, or set, of individuals, objects or measurements whose properties are to be analysed. In most cases the population is finite, i.e. the total number of objects in the population is known, although it may also be countably infinite. A numerical characteristic of an entire population is called a parameter.
A sample is a subset of a population. The sample consists of a number of individuals, objects or measurements that are randomly selected to facilitate the analysis of one or more characteristics of the population. A numerical characteristic of a sample is called a statistic.
With respect to set theory the population can be thought of as being the universal set (U), Fig. 1.1, from which samples A, B, and C are randomly selected for study.

[Figure 1.1 Venn diagram representation of the population and sample: U = population (universal set); A, B, C = samples drawn inside U.]

1.3.3 Data
This is the numerical value of the statistical unit associated with one element of a population or a sample (i.e. associated with a sample point). For example 12 kg, 300 metres, 40 years, etc.
Data is of two kinds: (1) data obtained from qualitative information and (2) data obtained from quantitative information.
Qualitative or attribute data focuses on a quality type of description of the subject, for example colour. The set of data on the colour of cars consists of red, blue, yellow, cream, etc.
Quantitative or variable data results from counts (of how many) or measurements (length, weight, etc.). This data can either be discrete or continuous. Data that is counted in the form of whole numbers is called discrete, for example the number of cars, people, buildings and so on. Continuous data is measurable on a continuous scale. This kind of data may take the form of both whole numbers and fractions, e.g. temperature, mass, length, etc.

1.3.4 Variation and Random Variation
Usually, careful examination of individual objects or items in a population will reveal that none of them are identical. Variation refers to the differences that occur between individual items of the population. In mass-produced components, for example, variation may be due to changes in raw materials, workmanship, machine operation, etc.
A random variation describes the unpredictable occurrence of such differences in objects or items of the population.
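As a concrete illustration of the terms in Sections 1.3.2 and 1.3.4, the following Python sketch treats a made-up list of component masses as the population, draws a random sample from it, and compares the population parameter (the mean of all units) with the corresponding sample statistic. The data, the seed and the helper name `sample_mean` are illustrative assumptions and are not taken from the course notes.

```python
import random
import statistics


def sample_mean(population, n, rng):
    """Mean of a simple random sample of size n drawn without replacement."""
    return statistics.mean(rng.sample(population, n))


# Hypothetical population: masses (kg) of 200 mass-produced components.
rng = random.Random(106)  # fixed seed so that the run is repeatable
population = [round(rng.gauss(12.0, 0.4), 2) for _ in range(200)]

parameter = statistics.mean(population)        # numerical characteristic of the population
statistic = sample_mean(population, 20, rng)   # numerical characteristic of a sample

print(f"population mean (parameter):   {parameter:.3f} kg")
print(f"sample mean (statistic, n=20): {statistic:.3f} kg")
```

Because of random variation between the items, the statistic will generally differ a little from the parameter, and a different sample would give a slightly different value.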
1.3.5 Random selection
The process of generating a sample in which all objects or individuals in the population are given an equal chance of being selected into the sample is called random selection or random sampling. This process involves assigning a number to each member of the population and then using a random number generator to select a specified number of members into the sample.
If we are to obtain meaningful information about the population from a sample then it is mandatory that the sample be a random selection.

1.3.6 Experiment
An experiment is a planned activity whose results yield a set of data.

Tutorial Exercise
Find the average age and weight of students in your class by randomly sampling
(a) 5 students
(b) 10 students
(c) 20 students
Repeat each of the above experiments twice and comment on the consistency of the three sets of results obtained.

1.4 The Statistical Experiment
The essential problem in statistics is to find quantitative procedures for describing and interpreting sets of data. There are two aspects to this problem:
• The description of a set of data (sample/population) in terms of a small set of descriptive quantities (called statistics/parameters respectively). This aspect is called descriptive statistics.
• Drawing inferences about population parameters by examining sample statistics. This aspect is called inferential statistics.

The statistical experiment has the following distinct phases:
1. Formulation of the problem - involves the creation of a mathematical model that is based on a clear understanding of statistical concepts.
2. Design of the experiment - choice of statistical methods, sample size, etc.
3. Collection of data - actual experimentation to develop a data set.
4. Mathematical description / organisation of data - arranging data into tabular and graphical forms.
5. Analysis of data - computations of numbers that characterise the average size and spread of the sample values.
6. Interpretation of data - drawing meaningful conclusions about the qualitative and quantitative aspects of the sample that provide answers for the problem described in (1) above.

The phases 1 - 4 above can be categorised as constituting descriptive statistics while phases 5 and 6 constitute inferential statistics.

1.5 Probability versus Statistics
Probability and statistics are two separate but related fields in mathematics. Probability investigates the chance or likelihood that something (a sample) will happen when you know the possibilities (i.e. the population).
Statistics, on the other hand, requires a randomly selected sample that is described (descriptive statistics) and then inferences are made about the population based on the information found in the sample (inferential statistics).

Home Study Exercise 1.1
An investigation focuses on the cost of Engineering education at Moi University. One of the expenses that a student must contend with is the cost of textbooks. Let x be the cost of all textbooks purchased by each student this semester. Carefully describe:
(a) the population
(b) the statistical unit
If the Dean's office wishes to estimate the average cost of textbooks per student per semester,
(c) describe the population parameter
(d) describe the sample statistic that is of interest to the Dean's office if 50 students maintained a record of their textbook expenses per semester
(e) describe how you would use the 50 data values in the sample to calculate the value of the sample statistic discussed in (d).

TOPIC 2.0
TABULAR AND GRAPHICAL REPRESENTATION OF SAMPLES

The raw ungrouped data that is obtained from experiments, surveys, etc. is usually numerous and disorderly and hence cannot readily furnish useful information. Grouping the data into classes and manipulating it into tabular and graphical forms condenses the information therein, making it easier to observe certain trends and consequently draw meaningful conclusions.

Section objectives
At the end of this section you should be able to:
1. Group ungrouped data into classes of equal width.
2. Determine the class interval, class boundaries, class values and class limits of a grouped data distribution.
3. Construct the frequency distribution for ungrouped and grouped data including the tally chart, absolute frequency, relative frequency, cumulative absolute frequency, and cumulative relative frequency.
4. Construct bar charts, dot diagrams, frequency histograms and polygons and the cumulative frequency graph (ogive) from a tabulated frequency distribution.
5. Interpret the tabulated and graphical statistical data to draw meaningful conclusions.

2.1 GROUPING OF DATA
To reduce the volume of numerical analysis, bulky raw statistical data is usually condensed into classes of equal width. Figure 2.1 gives a summary of the class boundary definitions that are used when referring to grouped data.

[Figure 2.1 Class boundary definitions: for the class 7.1 - 7.3, the lower and upper class limits are 7.1 and 7.3 (spanning the class width); the lower class boundary is 7.05, the class value is 7.20 and the upper class boundary is 7.35 (spanning the class interval).]

• Class interval = upper class boundary - lower class boundary
  = 7.35 - 7.05 = 0.30 units
• Class width = upper class limit - lower class limit
  = 7.3 - 7.1 = 0.2 units
• Class value = (upper class limit + lower class limit) / 2
  $= \frac{7.3 + 7.1}{2} = 7.2$ units
  Or
  Class value = (upper class boundary + lower class boundary) / 2
  $= \frac{7.35 + 7.05}{2} = 7.2$ units
If $n_{dp}$ is the number of decimal places of the class limit then,
• Upper class boundary = upper class limit + $\frac{1}{2} \times 10^{-n_{dp}}$
  = 7.3 + 0.05 = 7.35
• Lower class boundary = lower class limit - $\frac{1}{2} \times 10^{-n_{dp}}$
  = 7.1 - 0.05 = 7.05
The class 7.1 - 7.3 includes all values between 7.05 and 7.3499. The class 7.4 - 7.6 includes all values between 7.35 and 7.6499, and so on.

Procedure for Grouping Data
Consider, for example, the thickness of 20 samples of a steel plate listed as follows:
7.3, 7.1, 6.6, 7.0, 7.8, 7.3, 7.5, 6.2, 6.9, 6.7, 6.5, 6.8, 7.2, 7.4, 6.5, 6.9, 7.2, 7.6, 7.0, 6.8.
This data can be grouped into six classes of equal width as follows:
• Determine the range, i.e. the difference between the largest and the smallest data values.
  Range $= 7.8 - 6.2 = 1.6$
• Determine the class interval by dividing the range by the desired number of classes and rounding off to the number of decimal places (dp) of the sample data.
  Class interval (C.I) $= \frac{1.6}{6} = 0.2667 \approx 0.3$ to 1 dp.
• Determine the class width as follows:
  Class width $= \text{C.I} - 1 \times 10^{-n_{dp}}$,
  where C.I = class interval = 0.3 and $n_{dp}$ = number of decimal places of the sample data = 1, hence
  Class width $= 0.3 - 1 \times 10^{-1} = 0.3 - 0.1 = 0.2$
• Beginning with the least data value, construct the classes by adding the class width to it to give the upper class limit of the first class, and adding the class interval to the same lower limit to give the lower class limit of the next class (Figure 2.2). Repeat the procedure until all data is covered within the range.

[Figure 2.2 Class construction using the class width and class interval: start at 6.2; add the class width (0.2) to obtain 6.4; add the class interval (0.3) to 6.2 to obtain 6.5, then the class width to obtain 6.7; continue in the same way to 6.8 - 7.0 and beyond.]

Therefore the six classes are:
6.2 - 6.4, 6.5 - 6.7, 6.8 - 7.0, 7.1 - 7.3, 7.4 - 7.6, 7.7 - 7.9
• If the number of classes constructed is more or less than desired, marginally increase or decrease the class width respectively, until the desired number is attained.

Class boundaries for class intervals
There are four ways in which the limits of class intervals are shown. The first three cases in Table 2.1 are different ways of expressing the same limits, the assumption being that the sample measurements, for example those in the "under 5 metres" class (column B), are taken to a sufficiently precise degree to warrant including items which are for all practical purposes equivalent to 5 metres, e.g. 4.9897 metres. The limits shown in C are known as class boundaries, the highest value of one class being the lowest value of the next class.
In Case D, the assumption is made that measurements are recorded to the nearest metre. The class boundaries are therefore 0 - 4½, 4½ - 9½, 9½ - 14½, etc.

Table 2.1 Different conventions for representing class intervals
A (metres)   B (metres)         C (metres)   D (metres)
0 -          0 and under 5      0 - 5        0 - 4
5 -          5 and under 10     5 - 10       5 - 9
10 -         10 and under 15    10 - 15      10 - 14
15 -         15 and under 20    15 - 20      15 - 19
20 -         20 and under 25    20 - 25      20 - 24

The convention used in C is more appropriate for representing data such as age because ages are usually recorded as at the last birthday.
NOTE: In all cases the class boundaries must be used for calculations and in making graphs of the distribution.

2.2 Tabular Representations of Frequency Distributions
The frequency distribution of the illustration sample given in the preceding section is tabulated as follows:

Table 2.2 Frequency distribution table
Variable X [mm]   Tally     (1) f    (2) r.f   (3) c.a.f   (4) c.r.f
6.2 - 6.4         I         1        0.05      1           0.05
6.5 - 6.7         IIII      4        0.20      5           0.25
6.8 - 7.0         IIII I    6        0.30      11          0.55
7.1 - 7.3         IIII      5        0.25      16          0.80
7.4 - 7.6         III       3        0.15      19          0.95
7.7 - 7.9         I         1        0.05      20          1.00
                            n = 20   Σ r.f = 1.0

2.2.1 Tally Chart
The tally chart is constructed by running through the sample list and denoting each single occurrence of data by a tally mark. This column is usually omitted from the frequency distribution table.

2.2.2 Absolute frequency (or frequency, column 1)
The absolute frequency indicates how often each variable occurs in the sample. It may be obtained by counting all the tally marks that correspond to each variable in the sample. For example, 6 samples of the steel plate have a thickness of between 6.8 and 7.0 mm. The total sum of the frequency terms should always be equal to the sample size (n).

2.2.3 Relative frequency (Column 2)
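The grouping procedure of Section 2.1 and the four numbered columns of Table 2.2 can be reproduced with a short script. The following is a minimal Python sketch under stated assumptions: the function names `group_data` and `frequency_table` are hypothetical, the `decimal` module is used only to avoid binary rounding of values such as 0.1, and the rounding of the class interval follows Python's default decimal rounding mode rather than any rule given in the notes.

```python
from decimal import Decimal


def group_data(data, n_classes, ndp):
    """Group data into classes of equal width.

    Returns a list of (lower_limit, upper_limit, frequency) tuples.
    """
    unit = Decimal(1).scaleb(-ndp)             # 10**(-ndp), e.g. 0.1 for 1 dp
    values = [Decimal(str(x)) for x in data]   # exact decimal arithmetic
    lowest, highest = min(values), max(values)
    # Class interval = range / number of classes, rounded to ndp decimal places.
    interval = ((highest - lowest) / n_classes).quantize(unit)
    if interval <= 0:
        raise ValueError("class interval rounds to zero; use more decimal places")
    width = interval - unit                    # class width = C.I - 10**(-ndp)
    classes = []
    lower = lowest
    while lower <= highest:
        upper = lower + width
        freq = sum(1 for v in values if lower <= v <= upper)
        classes.append((lower, upper, freq))
        lower += interval                      # lower limit of the next class
    return classes


def frequency_table(classes):
    """Return rows (lower, upper, f, r.f, c.a.f, c.r.f) for grouped classes."""
    n = sum(f for _, _, f in classes)
    rows, caf = [], 0
    for lower, upper, f in classes:
        caf += f                               # running total of frequencies
        rows.append((lower, upper, f, Decimal(f) / n, caf, Decimal(caf) / n))
    return rows


# Thickness [mm] of the 20 steel plate samples from Section 2.1.
thickness = [7.3, 7.1, 6.6, 7.0, 7.8, 7.3, 7.5, 6.2, 6.9, 6.7,
             6.5, 6.8, 7.2, 7.4, 6.5, 6.9, 7.2, 7.6, 7.0, 6.8]

for lower, upper, f, rf, caf, crf in frequency_table(group_data(thickness, 6, 1)):
    print(f"{lower} - {upper}  f={f}  r.f={rf:.2f}  c.a.f={caf}  c.r.f={crf:.2f}")
```

The sketch does not adjust the class width when the number of classes obtained differs from the number requested; that adjustment is left to the reader, as in the last step of the procedure.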
This column is obtained by dividing the frequency corresponding to each variable by the sample size (n). For example, 6 out of 20, i.e. 30%, of all the samples of the steel plate have a thickness of between 6.8 and 7.0 mm.
NOTE: The relative frequency is at least zero and at most 1.0 ($0 \le r.f \le 1$).

2.2.4 Cumulative absolute frequency (Column 3)
The cumulative absolute frequency (c.a.f) corresponding to a given sample variable, X, is the numerical sum of all the frequencies of variables in the sample that are less than or equal to X. For example, the c.a.f of the samples in the range 7.4 - 7.6 is 1 + 4 + 6 + 5 + 3 = 19. This implies that 19 samples have a thickness less than or equal to 7.6499 mm.

2.2.5 Cumulative relative frequency (Column 4)
The cumulative relative frequency (c.r.f) corresponding to a given sample variable, X, is the numerical sum of all the relative frequencies of variables in the sample that are less than or equal to X. For example, the c.r.f of the samples in the range 7.4 - 7.6 is 0.05 + 0.20 + 0.30 + 0.25 + 0.15 = 0.95. This implies that 95% of all the samples have a thickness less than or equal to 7.6499 mm.

In general, when preparing the frequency distribution table for grouped data the classes should be clearly defined. They should be of equal width and the number of classes should preferably lie between 5 and 15.

2.3 Graphical Representations of Frequency Distributions
Graphs and diagrams leave a lasting impression on the mind and make the salient features of the data intelligible even to the layman. When representing grouped frequency distributions by graphs, the class values of the variable (x) are plotted along the x-axis while the frequencies are plotted along the y-axis.
Graphical plots of frequency distributions can be categorised into two groups, namely:
1. Graphical plots of the absolute frequency
2. Graphical plots of the relative frequency

2.3.1 Graphical plots of the absolute frequency
These graphs plot the absolute frequency of the sampled data against each corresponding sample variable. Examples include the bar charts, dot frequency diagrams and pictograms.

2.3.1.1 Bar Chart
The bar chart for the illustration sample that was used in the preceding section is constructed as shown in Fig. 2.2 (a) below. The vertical length of each bar is made proportional to the absolute frequency corresponding to the class in consideration. All the bars in the chart are constructed with equal width. In ungrouped samples this width may be chosen arbitrarily whereas in grouped data the width is constructed proportional to the class interval.

Table 2.2 Grouped frequency distribution table with computed class values
Variable X [mm]   Class value   (1) f    (2) r.f   (3) c.a.f   (4) c.r.f
6.2 - 6.4         6.3           1        0.05      1           0.05
6.5 - 6.7         6.6           4        0.20      5           0.25
6.8 - 7.0         6.9           6        0.30      11          0.55
7.1 - 7.3         7.2           5        0.25      16          0.80
7.4 - 7.6         7.5           3        0.15      19          0.95
7.7 - 7.9         7.8           1        0.05      20          1.00
                                n = 20   Σ r.f = 1.0

[Figure 2.2 Bar chart and dot frequency diagram: (a) bar chart, with the class boundaries (e.g. 6.45 and 7.35) and the class interval marked; (b) dot frequency diagram. Both plot the frequency f on the y-axis against the class values x = 6.3, 6.6, 6.9, 7.2, 7.5, 7.8 on the x-axis.]

2.3.1.2 Dot frequency diagrams and pictograms
A number of dots, equal to the absolute frequency, are plotted against each corresponding variable class, Fig 2.2 (b). Pictograms are similar to dot frequency diagrams. In this case suitable pictorial icons replace the dots, with the number of icons per variable giving the frequency.

2.3.2 Graphical plots of the relative frequency
Here the relative frequency is plotted against the sample variables. Two common graphs used are the frequency histogram and the frequency polygon.

2.3.2.1 Frequency Histogram
This graph is similar in construction to the bar chart. In this case the vertical length of each bar is made proportional to the relative frequency corresponding to the class in consideration, Figure 2.3 (a). As before, all the bars in the graph are constructed with an equal width that, in the case of grouped data, is proportional to the class interval.

2.3.2.2 Frequency Polygon / frequency curve
The frequency polygon is derived from the frequency histogram by drawing continuous straight lines joining the corresponding relative frequencies that are marked off at the middle points of adjacent bars, corresponding to the respective sample points or class values, Figure 2.3 (b). If the corners of the frequency polygon are smoothed out using a freehand sketch then the resulting graph is called a frequency curve.

2.4 Cumulative Frequency Function and Cumulative Frequency Curve (Ogive)
These graphs plot the tabulated cumulative frequency against the sample variables. The cumulative frequency function is a step function that plots the tabulated cumulative relative frequency (c.r.f) as the y-coordinates and the samples/class values as the x-coordinates, Fig
2.4(a). In normal frequency distributions the steps in this graph tend to increase in height as you move towards the centre of the distribution.

[Figure 2.3 Histogram and Frequency Polygon: (a) histogram, (b) frequency polygon. Both plot the relative frequency (0.1, 0.2, 0.3 marked on the y-axis) against the class values x = 6.3, 6.6, 6.9, 7.2, 7.5, 7.8.]

The cumulative frequency curve (also called the ogive) is the free-hand smooth curve that is obtained by joining the points generated when the tabulated cumulative absolute frequency (c.a.f) is plotted on the y-axis against the upper class boundaries on the x-axis, Figure 2.4(b). The ogive is useful in determining the quartiles, deciles or percentiles of grouped data.

[Figure 2.4 Cumulative frequency function and cumulative frequency curve: (a) cumulative frequency function, c.r.f (0.2 to 1.0) against the class values x = 6.3, 6.6, 6.9, 7.2, 7.5, 7.8; (b) cumulative frequency curve (ogive), c.a.f (0 to 20) against the upper class boundaries 6.45, 6.75, 7.05, 7.35, 7.65, 7.95.]

Home Study Exercise 2
1. The marks obtained by 70 students in an Applied Mathematics exam are as follows:
32, 45, 38, 7, 40, 15, 5, 26, 0, 11, 40, 2, 18, 8, 31, 4, 27, 7, 0, 15, 12, 35, 28, 46, 9, 29, 10, 34, 2, 7, 5, 17, 2, 8, 35, 30, 11, 36, 47, 19, 16, 0, 18, 16, 14, 2, 38, 41, 42, 17, 45, 28, 48, 20, 7, 21, 8, 5, 28, 13, 22, 27, 41, 40, 36, 29, 29, 31, 34, 48.
Group this data into seven classes of equal width and hence tabulate the frequency distribution. On separate axes graph the:
(a) Bar chart
(b) Frequency histogram
(c) Frequency polygon
(d) Cumulative frequency function
(e) Ogive.

2. An automatic machine was tested on 40 occasions and found to stitch the following number of bags in one minute:
18, 17, 21, 18, 19, 17, 18, 20, 16, 17, 19, 19, 17, 16, 15, 19, 17, 17, 20, 18, 17, 18, 19, 19, 18, 19, 18, 18, 19, 20, 18, 15, 18, 17, 20, 18, 16, 17, 18, 17.
Starting with the tally chart, tabulate the frequency distribution. On separate axes graph the:
(a) Bar chart
(b) Frequency histogram
(c) Frequency polygon
(d) Cumulative frequency function
(e) Ogive.

After collecting, classifying, tabulating and representing the data graphically, mathematical analysis of the data (descriptive statistics) is performed to determine specific characteristics such as:
(i) Clustering of data around some central value (e.g. mean, median, mode, etc.), which will be studied under the measures of central tendency.
(ii) Dispersion of data around the central value (range, mean deviation, standard deviation, etc.), which will be studied under the measures of dispersion or spread.

TOPIC 3.0
MEASURES OF CENTRAL TENDENCY
Measures of central tendency are numerical values that tend to locate, in some sense, the middle of a set of data. Three categories are examined, namely the:
• Measures of position / quantiles (quartiles, deciles and percentiles)
• Mode
• Measures of average (arithmetic, geometric and harmonic means)

Section objectives
At the end of this section you should be able to:
1. Describe the measures of central tendency.
2. Compute the quantiles, the median, mode and mean given a set of sampled data.
3. Describe the characteristics of given data samples using the computed measures of central tendency.

3.1 Measures of Position (the Quantiles)
Measures of position are used to describe the location of a specific piece of data in relation to the rest of the sample. Quartiles, deciles and percentiles are common measures of position.
In general the following formula may be used to evaluate each of the described quantiles from a grouped data distribution.

$X_i = l_i + \frac{q - C_i}{f_i}\, h$

where
$X_i$ = the i-th quantile (i.e. quartile / decile / percentile)
$l_i$ = lower class boundary of the class of the i-th quantile, which is given by q
$q$ = the position number of the i-th quantile in the ranked sample
$C_i$ = the cumulative frequency of the class just preceding the class of the i-th quantile
$f_i$ = frequency of the class of the i-th quantile
$h$ = class interval.

3.1.1 Quartiles
Quartiles are number values that divide the ranked data into quarters (see Fig. 3.1). Each set of data has three quartiles, namely the:
• first quartile (or lower quartile)
• second quartile (or median)
• third quartile (also called the upper quartile)

[Figure 3.1 Quartiles: ranked data in increasing order from the lowest value L to the highest value H, divided into four parts of 25% each by Q1, Q2 and Q3.]

The first or lower quartile (Q1) is a number such that at most one-fourth of the data are smaller in value than Q1, and at most three-fourths are larger.
The second quartile (Q2) is the median, i.e. the numerical value of the variable that divides the distribution into two equal parts.
The third or upper quartile (Q3) is a number such that at most three-fourths of the data are smaller in value than Q3, and at most one-fourth are larger.

Procedure for determining the Quartiles
Evaluation of the quartiles of ungrouped data proceeds as follows:
Step 1 - Rank the data into an array starting with the lowest-valued piece and proceeding to the largest valued.
Step 2 - Evaluate the position number, q, for the i-th quartile as follows:
  $q = \frac{iN}{4}$, where $i = 1, 2$ or $3$ and N is the number of pieces of data.
  Two cases:
  • If iN/4 is not an integer (i.e. it contains a fraction) then q is equal to the next larger integer. For example if iN/4 = 15.5, then q = 16.
  • If iN/4 is an integer (i.e. it contains no fraction) then q is equal to (iN/4) + 0.5, for example if iN/4 = 10, then q = 10.5.
Step 3 - Locate the value of Qi. To do this, count from the lowest-valued piece of data, L, until the q-th value is found, if q is an integer. If q is not an integer, then it contains the fraction one-half, which means that the value of Qi is the average of the pieces of data in positions q - 0.5 and q + 0.5 (i.e. positions iN/4 and iN/4 + 1).

For grouped data the quartiles are evaluated using the formula:

$Q_i = l_i + \frac{q - C_i}{f_i}\, h$

where all variables are as previously defined or determined.

3.1.2 Deciles
Deciles are number values of the variable that divide a set of ranked data into ten equal parts; each set of data has 9 deciles (Fig. 3.2).

[Figure 3.2 Deciles: ranked data in increasing order from the lowest value L to the highest value H, divided into ten parts of 10% each by D1, D2, D3, ..., D8, D9.]

Procedure for determining the Deciles
Evaluation of the deciles of ungrouped data proceeds as follows:
Step 1 - Rank the data into an array starting with the lowest-valued piece and proceeding to the largest valued.
Step 2 - Evaluate the position number, q, for the i-th decile as follows:
  $q = \frac{iN}{10}$, where $i = 1, 2, 3, \ldots, 9$.
  Again:
  • If iN/10 is not an integer then q is equal to the next larger integer.
  • If iN/10 is an integer then q is equal to (iN/10) + 0.5.
Step 3 - Locate the value of Di by counting as before.

For grouped data the deciles are evaluated using the formula:

$D_i = l_i + \frac{q - C_i}{f_i}\, h$

with all variables taken as previously defined or determined.

3.1.3 Percentiles
Percentiles are number values of the variable that divide a set of ranked data into 100 equal parts; each set of data has 99 percentiles (Fig. 3.3). The $k$-th percentile $P_k$ is a number such that at most $k\%$ of the data are smaller in value than $P_k$ and at most $(100 - k)\%$ of the data are larger.

[Figure 3.3 Percentiles: ranked data in increasing order from L to H, divided into one hundred 1% portions by P1, P2, P3, ..., P98, P99.]

Procedure for determining the Percentiles
Evaluation of the percentiles of ungrouped data proceeds as follows: -
Step 1- Rank the data into an array starting with the lowest-valued piece and proceeding to the largest-valued.
Step 2- Evaluate the position number, $q$, for the $i$-th percentile as follows:

$$q = \frac{iN}{100}, \quad \text{where } i = 1, 2, 3, \ldots, 99.$$

Again: -
• If $iN/100$ is not an integer then $q$ is equal to the next larger integer.
• If $iN/100$ is an integer then $q$ is equal to $(iN/100) + 0.5$.
Step 3- Locate the value of $P_i$ by counting as before.

For grouped data the percentiles are evaluated using the formula: -

$$P_i = l_i + \frac{q - C_i}{f_i}\,h$$

with all variables taken as previously defined or determined.

NOTE: The following statements are true: -
• $Q_1 \equiv P_{25}$ because the position number, $q$, corresponding to both points is the same, i.e.
$$q = \frac{1 \times N}{4} = \frac{25 \times N}{100} = \frac{1}{4}N.$$
Similarly
• $Q_3 \equiv P_{75}$
• $Q_2 \equiv D_5 \equiv P_{50}$ = the median.

Example 3.1
The contents of each of 30 packets of washers are recorded as follows:
28, 31, 29, 27, 30, 29, 29, 26, 30, 28, 28, 29, 27, 26, 32, 28, 32, 31, 25, 30, 27, 30, 29, 30, 28, 29, 31, 27, 28, 28.
Find the median, the 1st quartile, and the 91st percentile.

SOLUTION
Step #1 The given data is ranked in ascending order as follows:

25 26 26 27 27 27 27 28 28 28
28 28 28 28 29 29 29 29 29 29
30 30 30 30 30 31 31 31 32 32

The frequency distribution is therefore:

Variable   Frequency   Cumulative frequency
25         1           1
26         2           3
27         4           7
28         7           14
29         6           20
30         5           25
31         3           28
32         2           30

Step #2 Evaluate the position numbers for each of the required quantities and find them as follows:

Median = $Q_2 = D_5 = P_{50}$

$$q = \frac{2 \times N}{4} = \frac{5 \times N}{10} = \frac{50 \times N}{100} = \frac{N}{2} = \frac{30}{2} = 15 \ \text{(an integer)}$$

Hence $q = 15 + 0.5 = 15.5$.
The median is therefore the arithmetic average of the 15th and 16th members, which are both equal to 29.

$$Q_2 = \frac{29 + 29}{2} = 29 \ \text{washers}.$$

For $Q_1$, $i = 1$:

$$q = \frac{i \times N}{4} = \frac{1 \times 30}{4} = 7.5 \ \text{(not an integer)}$$

Therefore $q$ is taken as the next larger integer, which is 8. Counting off gives the 8th member of the ranked data as 28, and hence $Q_1 = 28$ washers.

For $P_{91}$, $i = 91$:

$$q = \frac{i \times N}{100} = \frac{91}{100} N = (0.91) \times 30 = 27.3$$

Again $q$ is taken as the next larger integer, which is 28. Counting off gives the 28th member of the ranked data as 31, and hence $P_{91} = 31$ washers.

Example 3.2
For the grouped frequency distribution below

Variable     Frequency
6.2 - 6.4    1
6.5 - 6.7    4
6.8 - 7.0    6
7.1 - 7.3    5
7.4 - 7.6    3
7.7 - 7.9    1

evaluate the median, the 83rd percentile, and the 8th decile.
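The grouped-data formula $X_i = l_i + \frac{q - C_i}{f_i}h$ can be checked numerically against the distribution of Example 3.2. The following Python sketch is illustrative only: the function name `grouped_quantile` and the list of lower class boundaries (each lower class limit minus 0.05) are assumptions introduced here, and the position numbers q = 10, 11, 17 and 16 follow from the position rules of Section 3.1 with N = 20.

```python
from bisect import bisect_left
from itertools import accumulate


def grouped_quantile(q, lower_boundaries, frequencies, h):
    """Value at position q of a grouped distribution: l + (q - C) / f * h."""
    cumulative = list(accumulate(frequencies))
    # Index of the first class whose cumulative frequency reaches q.
    k = bisect_left(cumulative, q)
    # C: cumulative frequency of the class just preceding the quantile class.
    c_before = cumulative[k - 1] if k > 0 else 0
    return lower_boundaries[k] + (q - c_before) / frequencies[k] * h


# Distribution of Example 3.2 (class limits 6.2 - 6.4, ..., 7.7 - 7.9).
boundaries = [6.15, 6.45, 6.75, 7.05, 7.35, 7.65]  # assumed: lower limit - 0.05
freqs = [1, 4, 6, 5, 3, 1]
h = 0.3

# Median: q = 10.5, taken as the average of the 10th and 11th members.
median = (grouped_quantile(10, boundaries, freqs, h)
          + grouped_quantile(11, boundaries, freqs, h)) / 2
p83 = grouped_quantile(17, boundaries, freqs, h)  # q = 16.6 -> 17
d8 = grouped_quantile(16, boundaries, freqs, h)   # q = 16.5, approximated by the 16th member
print(round(median, 3), round(p83, 2), round(d8, 2))
```

The class is located with a binary search on the cumulative frequencies, which mirrors the manual step of reading down the cumulative frequency column until the position number is reached.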
SOLUTION
Step #1 Draw the cumulative frequency distribution:

Variable x [mm]   Class value   f    c.a.f
6.2 - 6.4         6.3           1    1
6.5 - 6.7         6.6           4    5
6.8 - 7.0         6.9           6    11
7.1 - 7.3         7.2           5    16
7.4 - 7.6         7.5           3    19
7.7 - 7.9         7.8           1    20
                                n = 20

Step #2 Evaluate the position numbers for each of the required quantities and apply the formula to find them as follows:

(i) Median = $Q_2$, $i = 2$

$$q = \frac{i \times N}{4} = \frac{2N}{4} = \frac{20}{2} = 10 \ \text{(an integer)}$$

Hence $q = 10 + 0.5 = 10.5$.
The median is therefore the arithmetic average of the 10th and 11th members. Both members are in the class 6.8 - 7.0.

$$Q_2 = l_2 + \frac{q - C_2}{f_2}\,h$$

where $l_2 = 6.75$; $f_2 = 6$; $C_2 = 5$ and $h = 0.3$. Hence

$$Q_{2(10)} = 6.75 + \frac{10 - 5}{6} \times 0.3 = 7.0$$

$$Q_{2(11)} = 6.75 + \frac{11 - 5}{6} \times 0.3 = 7.05$$

Therefore
$$Q_2 = \frac{Q_{2(10)} + Q_{2(11)}}{2} = \frac{7.0 + 7.05}{2} = 7.025 \approx 7.0 \ \text{mm to 1 dp.}$$

NOTE: The median can be approximated as being equal to $Q_{2(10)}$ alone, and hence most evaluations disregard taking the average.

(ii) For $P_{83}$, $i = 83$

$$q = \frac{i \times N}{100} = \frac{83}{100} N = (0.83) \times 20 = 16.6$$

Hence $q$ is taken as the next larger integer, which is 17. The 83rd percentile is the 17th member, which falls in the class 7.4 - 7.6.

$$P_{83} = l_{83} + \frac{q - C_{83}}{f_{83}}\,h$$

where $l_{83} = 7.35$; $f_{83} = 3$; $C_{83} = 16$ and $h = 0.3$.

Hence $P_{83} = 7.35 + \dfrac{17 - 16}{3} \times 0.3 = 7.45$.
Therefore the 83rd percentile is $7.45 \approx 7.5$ mm to 1 dp.

(iii) For $D_8$, $i = 8$

$$q = \frac{i \times N}{10} = \frac{8}{10} N = (0.8) \times 20 = 16$$

Here $q$ is taken as 16.5. Hence the 8th decile is the arithmetic average of the 16th and 17th members, but it can be approximated to be equal to the 16th member as seen in part (i). The 16th member falls in the class 7.1 - 7.3.

$$D_8 = l_8 + \frac{q - C_8}{f_8}\,h$$

where $l_8 = 7.05$; $f_8 = 5$; $C_8 = 11$ and $h = 0.3$.

Hence $D_8 = 7.05 + \dfrac{16 - 11}{5} \times 0.3 = 7.35$.
Therefore the 8th decile is $7.35 \approx 7.4$ mm to 1 dp.

Graphical Determination of the Measures of Position
The quantiles of a grouped frequency distribution can be determined graphically as follows.
• Draw the cumulative frequency curve (ogive) for the given data.
• Evaluate the position number ($q$) of the required quantile.
• Read off the variable ($X_j$) on the x-axis that corresponds to $q$, as shown in Fig. 3.4.

[Figure 3.4 Graphical Determination of Quantiles: ogive with cumulative absolute frequency (up to N) on the y-axis and upper class boundaries (from L to H) on the x-axis; the position number q on the y-axis is projected onto the curve and down to the quantile X_j on the x-axis.]

Home Study Exercise 3.1
On a sheet of graph paper draw the ogive for the set of data given in Example 3.2 and use it to determine the median, the 83rd percentile, and the 8th decile. Compare your answers to those that were obtained by using the formula.

Advantages and disadvantages of the Quantiles
1. Advantages
(a) If found directly, they represent the actual item.
(b) Extreme items do not affect their values.
(c) They can be evaluated even when not all the sample values are known.
(d) They can be used for measuring qualities and factors to which mathematical measurement cannot be given.
2. Disadvantages
(a) If the sample size is small, they may not be representative.
(b) The arranging of data into the necessary array is often tedious.
(c) If the distribution is irregular then the location of the quantiles may be indefinite.

3.2 Mode
The mode is the value of the variable x that occurs most frequently in the sample. For example, in the set {1, 2, 3, 3, 4, 5} the mode is 3. Samples with only one mode are called uni-modal. In the set {1, 2, 2, 3, 4, 4, 5} there are two modes, i.e. 2 and 4 (the set is bi-modal).

In uni-modal grouped frequency distributions, the mode can be given by the formula:

$$\text{Mode} = l + \frac{f - f_1}{2f - f_1 - f_2}\,h$$

where
$l$ = lower class boundary of the modal class, i.e. the class with the highest frequency count.
$f$ = frequency of the modal class
$f_1$ = frequency of the class preceding the modal class
$f_2$ = frequency of the class succeeding the modal class
$h$ = class interval.

The mode can also be determined by construction as shown in Fig. 3.5.

[Figure 3.5 Graphical Determination of the Mode: histogram of absolute frequency (y-axis) against class values (x-axis); the modal class has boundaries L and U and width h, the construction gives the offset a, and Mode = L + (a/h)(U - L).]

Advantages and disadvantages of the mode
1. Advantages
(a) It is easy to understand.
(b) Extreme items do not affect its value.
(c) Like the quantiles, only the middle items need to be known.
2. Disadvantages
(a) It is often not clearly defined.
(b) Exact location is often uncertain.
(c) Arrangement of data is tedious.

3.3 Measures of Average
Three common measures are used, namely:
• Arithmetic mean
• Geometric mean
• Harmonic mean
These three are discussed in the following sections.

3.3.1 Arithmetic mean (or just Mean)
The mean or average of a sample is the ratio of the sum of all the sample values to the total sample size.

$$\text{Mean} = \frac{\text{sum of all the sample values}}{\text{total number of samples}}$$

Several mathematical formulations for determining the arithmetic mean are possible depending on the nature of the data.

(i) Mean of ungrouped data
Consider n values of a variable x in a sample that are listed as $x_1, x_2, x_3, \ldots, x_n$. The sample mean $\bar{x}$ is defined as:

$$\bar{x} = \frac{1}{n}\left(x_1 + x_2 + x_3 + \cdots + x_{n-1} + x_n\right)$$

or using the sigma notation as:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad (1)$$

If the sample has variables $x_1, x_2, x_3, \ldots, x_n$ with frequencies $f_1, f_2, f_3, \ldots, f_n$ respectively, then the frequency-dependent sample mean $\bar{x}$ is stated as:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} f_i x_i \qquad (2)$$

where $n$ = total sample size = $\sum f_i$.

(ii) Mean of grouped data
Consider a grouped frequency distribution with m classes such that the class values are listed as $x_1, x_2, x_3, \ldots, x_m$ with corresponding frequencies $f_1, f_2, f_3, \ldots, f_m$. The sample mean for the grouped data is then:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{m} f_i x_i$$

Coded mean technique for evaluating the mean
Let $x_i - A = d_i$ or $x_i = A + d_i$ for $i = 1, 2, 3, \ldots, n$, where A is any assumed number, also called the assumed mean, and $d_i$ is the coded variable. Then by definition, the mean is

$$\bar{x} = \frac{1}{n}\sum f_i x_i = \frac{1}{n}\sum f_i (A + d_i) = \frac{1}{n}\sum f_i A + \frac{1}{n}\sum f_i d_i = \frac{A}{n}\sum f_i + \frac{1}{n}\sum f_i d_i = A + \bar{x}_c, \quad \text{since } \sum f_i = n,$$

where $\bar{x}_c$ = coded mean = $\dfrac{1}{n}\sum f_i d_i$.

For a tabulated frequency distribution in which the classes have equal class intervals (h), a new coded variable ($x_c$) can be evaluated as:

$$x_c = \frac{x_i - A}{h} = \frac{d_i}{h}, \quad \text{or } d_i = h\,x_c$$

Therefore the coded mean now becomes:

$$\bar{x}_c = \frac{1}{n}\sum f_i d_i = \frac{1}{n}\sum f_i (h\,x_c) = \frac{h}{n}\sum f_i x_c$$

and $\bar{x} = A + \bar{x}_c$.

Example 3.3
Find the mean and mode of the following data

Marks        Number of students
Under 10     5
Under 20     9
Under 30     17
Under 40     29
Under 50     45
Under 60     60
Under 70     70
Under 80     78
Under 90     83
Under 100    85
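The data of Example 3.3 can also be checked numerically. The sketch below is an illustration only, with variable names chosen here: it recovers the class frequencies from the "under" cumulative counts, evaluates the mean both directly and by the coded mean technique (assumed mean A = 55, class interval h = 10), and applies the grouped-data mode formula. It assumes the modal class is neither the first nor the last class.

```python
# "Under" cumulative frequencies from Example 3.3.
upper_bounds = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
cum_freq = [5, 9, 17, 29, 45, 60, 70, 78, 83, 85]
h = 10

# Recover the class frequencies by differencing the cumulative counts.
freqs = [c - p for c, p in zip(cum_freq, [0] + cum_freq[:-1])]
mids = [u - h / 2 for u in upper_bounds]  # class values 5, 15, ..., 95
n = sum(freqs)

# Direct mean: (1/n) * sum(f * x).
mean_direct = sum(f * x for f, x in zip(freqs, mids)) / n

# Coded mean: x_c = (x - A) / h, then mean = A + (h/n) * sum(f * x_c).
A = 55
coded = [(x - A) / h for x in mids]
mean_coded = A + h / n * sum(f * c for f, c in zip(freqs, coded))

# Mode of a uni-modal grouped distribution: l + (f - f1) / (2f - f1 - f2) * h.
k = freqs.index(max(freqs))  # index of the modal class
l = upper_bounds[k] - h      # lower boundary of the modal class
mode = l + (freqs[k] - freqs[k - 1]) / (2 * freqs[k] - freqs[k - 1] - freqs[k + 1]) * h

print(round(mean_direct, 2), round(mean_coded, 2), mode)
```

Both routes to the mean should agree, since the coded mean is only a change of origin and scale.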
SOLUTION
First convert the given cumulative frequency distribution into a continuous distribution and then choose an appropriate assumed mean, A. Determine $x_c = \dfrac{x_i - A}{h} = \dfrac{d_i}{h}$.
A = 55 and h = 10

Marks      Class value x   cf    f     fx     d_i    x_c    f x_c
0 - 10     5               5     5     25     -50    -5     -25
10 - 20    15              9     4     60     -40    -4     -16
20 - 30    25              17    8     200    -30    -3     -24
30 - 40    35              29    12    420    -20    -2     -24
40 - 50    45              45    16    720    -10    -1     -16
50 - 60    55              60    15    825    0      0      0
60 - 70    65              70    10    650    10     1      10
70 - 80    75              78    8     600    20     2      16
80 - 90    85              83    5     425    30     3      15
90 - 100   95              85    2     190    40     4      8
Total: $n = \sum f = 85$, $\sum fx = 4115$, $\sum f x_c = -56$

(i) Apply the formula to determine the mean

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{m} f_i x_i = \frac{1}{85}(4115) = 48.41 \approx 48 \ \text{marks}$$

OR

$$\bar{x} = A + \bar{x}_c = A + \frac{h}{n}\sum f x_c = 55 + \frac{10}{85}(-56) = 48.41 \approx 48 \ \text{marks}$$

(ii) To evaluate the mode, first determine the modal class, then apply the formula.
Modal class = 40 - 50 marks

$$\text{Mode} = l + \frac{f - f_1}{2f - f_1 - f_2}\,h$$

where $l = 40$; $f = 16$; $f_1 = 12$; $f_2 = 15$ and $h = 10$.

$$\therefore \ \text{Mode} = 40 + \frac{16 - 12}{2(16) - 12 - 15} \times 10 = 48 \ \text{marks}$$

(iii) Weighted Arithmetic Mean
When the variables $x_1, x_2, x_3, \ldots, x_n$ are of unequal importance, weights $w_1, w_2, w_3, \ldots, w_n$ are assigned to each of the variables respectively. The weighted arithmetic mean is given by: -

$$\bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + w_3 x_3 + \cdots + w_n x_n}{w_1 + w_2 + w_3 + \cdots + w_n} = \frac{\sum wx}{\sum w}$$

Properties of the arithmetic mean
• The algebraic sum of the deviations of all the variates from their mean is equal to zero, i.e. $\sum fd = 0$ when $d = x - \bar{x}$.
Proof
Let $d = x - \bar{x}$.

$$\sum fd = \sum f(x - \bar{x}) = \sum fx - \bar{x}\sum f, \quad \text{but } \bar{x} = \frac{\sum fx}{\sum f}$$

$$\sum fd = \sum fx - \frac{\sum fx}{\sum f}\sum f = 0$$

• The sum of the squares of the deviations of a set of values is minimum when taken about the mean.
Proof
Consider deviations about an assumed mean A that is chosen arbitrarily.

$$\sum f(x - A)^2 = \sum f\left[(x - \bar{x}) + (\bar{x} - A)\right]^2$$
$$= \sum f\left[(x - \bar{x})^2 + 2(x - \bar{x})(\bar{x} - A) + (\bar{x} - A)^2\right]$$
$$= \sum f(x - \bar{x})^2 + 2(\bar{x} - A)\sum f(x - \bar{x}) + \sum f(\bar{x} - A)^2$$
$$= \sum f(x - \bar{x})^2 + \sum f(\bar{x} - A)^2, \quad \text{since } \sum f(x - \bar{x}) = 0$$

The square in the second term of this equation means that $(\bar{x} - A)^2$ is never negative, and hence the expression is minimum when $A = \bar{x}$, i.e. when $(\bar{x} - A)^2 = 0$. This implies that the sum of the squares of the deviations of a set of values is minimum when taken about the mean.

• If $\bar{x}_1$ and $\bar{x}_2$ are the means of two samples of sizes $n_1$ and $n_2$, then the mean M of the combined sample of size $(n_1 + n_2)$ is

$$M = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}$$

In general, if $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k$ are the means of k samples of sizes $n_1, n_2, \ldots, n_k$, the mean M of the composite sample is

$$M = \frac{n_1\bar{x}_1 + n_2\bar{x}_2 + \cdots + n_k\bar{x}_k}{n_1 + n_2 + \cdots + n_k}$$

Example 3.4
A large class of 250 students was divided into 3 groups A, B and C of 100, 90 and 60 students respectively. The three groups were taught one course separately and given the same exam. If the average score was 55%, 50% and 48% in groups A, B and C respectively, find the mean score for the combined class.

SOLUTION
Given data:
Group A: $\bar{x}_1 = 55$, $n_1 = 100$
Group B: $\bar{x}_2 = 50$, $n_2 = 90$
Group C: $\bar{x}_3 = 48$, $n_3 = 60$
The mean for three combined samples is:

$$M = \frac{n_1\bar{x}_1 + n_2\bar{x}_2 + n_3\bar{x}_3}{n_1 + n_2 + n_3} = \frac{100(55) + 90(50) + 60(48)}{100 + 90 + 60} = \frac{12880}{250} = 51.52\% \approx 51.5\%$$

Example 3.5
The mean of 200 items was calculated as 50. If two items were misread as 92 and 8 instead of 192 and 88, find the correct mean.

SOLUTION
By definition,

$$\bar{x} = \frac{\sum fx}{n} \quad \text{or} \quad \sum fx = n \times \bar{x}$$

Given that $n = 200$ and $\bar{x} = 50$, $\sum fx = 50 \times 200 = 10{,}000$, which is incorrect.
The correct $\sum fx = 10{,}000 - (92 + 8) + (192 + 88) = 10{,}180$.
Hence the corrected mean is

$$\bar{x} = \frac{\text{corrected } \sum fx}{n} = \frac{10{,}180}{200} = 50.9$$

Treatment of an open-end grouped sample
An open-end class is one in which the lower or upper boundary, for the first or last class respectively, is not given. For example, the grouped data whose classes are listed as below 15, 15 - 19, 20 - 24, 25 - 29, 30 and above, is open-ended (Fig. 3.6).

Variable        Frequency
Below 15        2
15 - 19         18
20 - 24         300
25 - 29         40
30 and above    1

Figure 3.6 Open-ended grouped sample.

A reasonable assumption about the size of the open-end classes is necessary to facilitate computation. The best guide to its probable size is the pattern of the distribution. In the above example, since every closed class spans five units (15 - 19, 20 - 24, 25 - 29), the first class can be taken as 10 - 14 and the last class as 30 - 34. The error introduced by making this assumption is minimal.

Advantages and disadvantages of the arithmetic mean
1. Advantages
(a) It is rigidly defined and easy to understand.
(b) It is based on all the sample variables.
(c) It is amenable to algebraic treatment.
(d) It is easy to calculate.
(e) It is unique and always exists.
(f) It is least affected by fluctuations of sampling, i.e. it is a stable average.
2. Disadvantages
(a) It is unduly affected by extreme values.
(b) It may not be present in actual data.
(c) It cannot be located by inspection.
(d) It may lead to wrong conclusions in the absence of the raw data.

3.3.2 Geometric Mean
Consider n values of a variable x, stated as $x_1, x_2, x_3, \ldots, x_n$. The geometric mean G is defined as: -

$$G = \left(x_1 \times x_2 \times x_3 \times \cdots \times x_n\right)^{1/n}$$

For example, the geometric mean G of 4 and 16 is

$$G = (4 \times 16)^{1/2} = \sqrt{64} = 8$$

The geometric mean is used to determine parameters like the rate of interest or rate of population growth. It is also used in the construction of index numbers.

Geometric Mean of Grouped Data
If the frequencies of variables $x_1, x_2, x_3, \ldots, x_n$ are $f_1, f_2, f_3, \ldots, f_n$ respectively, then the geometric mean G is given by:

$$G = \left(x_1^{f_1} \times x_2^{f_2} \times x_3^{f_3} \times \cdots \times x_n^{f_n}\right)^{1/n}$$

where the exponent uses the total frequency $n = \sum f_i$. Taking logarithms on both sides:

$$\log G = \frac{1}{n}\left(f_1 \log x_1 + f_2 \log x_2 + f_3 \log x_3 + \cdots + f_n \log x_n\right) = \frac{1}{n}\sum_{i=1}^{n} f_i \log x_i = \frac{\sum f \log x}{n}$$

3.3.3 Harmonic Mean
The harmonic mean H is the reciprocal of the average of the reciprocals of the values to be averaged.
Consider n values of a variable x, stated as $x_1, x_2, x_3, \ldots, x_n$. The harmonic mean of these variables is defined mathematically as:

$$H = \frac{1}{\dfrac{1}{n}\left(\dfrac{1}{x_1} + \dfrac{1}{x_2} + \dfrac{1}{x_3} + \cdots + \dfrac{1}{x_n}\right)}$$

For example, the harmonic mean of 4, 8 and 16 is

$$H = \frac{1}{\dfrac{1}{3}\left(\dfrac{1}{4} + \dfrac{1}{8} + \dfrac{1}{16}\right)} = \frac{48}{7} = 6.857$$

The harmonic mean is useful when averaging rates such as the speed (distance / time taken) of a moving object that covers different parts of the distance with different speeds, machine output per minute, etc.

Harmonic Mean of Grouped Data
If the frequencies of variables $x_1, x_2, x_3, \ldots, x_n$ are $f_1, f_2, f_3, \ldots, f_n$ respectively, then the harmonic mean H is given by:

$$H = \frac{1}{\dfrac{1}{n}\left(\dfrac{f_1}{x_1} + \dfrac{f_2}{x_2} + \dfrac{f_3}{x_3} + \cdots + \dfrac{f_n}{x_n}\right)} = \frac{1}{\dfrac{1}{n}\sum_{i=1}^{n}\dfrac{f_i}{x_i}} = \frac{1}{\dfrac{1}{n}\sum (f/x)}$$

Example 3.5
A car covers the first 200 km of a 400 km journey with an average speed of 120 km/h and the remaining distance with a speed of 90 km/h. Find the time taken for the entire journey.

SOLUTION
The car travels 200 km at 120 km/h and 200 km at 90 km/h. Since the distance is the same in both cases, the average speed is the harmonic average of 120 km/h and 90 km/h.

$$\text{Average speed} = \frac{1}{\dfrac{1}{2}\left(\dfrac{1}{120} + \dfrac{1}{90}\right)} = \frac{720}{7} = 102.86 \ \text{km/h}$$

$$\text{Time taken} = \frac{\text{Total distance}}{\text{average speed}} = \frac{400}{102.86} = 3.89 \ \text{hours}$$

Example 3.6
Find the geometric and harmonic means of the following distribution:

Income [K£ / week]   No. of workers
0 - 9                5
10 - 19              8
20 - 29              3
30 - 39              4
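The grouped-data formulas for the geometric and harmonic means can be tried on the distribution of Example 3.6. This is a minimal sketch under stated assumptions: the class values are taken as the mid-points 4.5, 14.5, 24.5 and 34.5 (first class read as 0 - 9), base-10 logarithms are used, and the variable names are chosen here for illustration.

```python
import math

# Class values (mid-points) and frequencies for Example 3.6.
mids = [4.5, 14.5, 24.5, 34.5]  # assumed class values
freqs = [5, 8, 3, 4]
n = sum(freqs)

# log G = (1/n) * sum(f * log x), so G = 10 ** (sum(f * log10(x)) / n).
log_g = sum(f * math.log10(x) for f, x in zip(freqs, mids)) / n
geometric_mean = 10 ** log_g

# H = n / sum(f / x).
harmonic_mean = n / sum(f / x for f, x in zip(freqs, mids))

print(round(log_g, 4), round(geometric_mean, 2), round(harmonic_mean, 2))
```

Working through logarithms mirrors the tabular method and avoids forming the very large product of the powers directly.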
SOLUTION
First prepare the following table for computation

Class      x      f    log x     f log x    f/x
0 - 9      4.5    5    0.6532    3.2661     1.1111
10 - 19    14.5   8    1.1613    9.2909     0.5517
20 - 29    24.5   3    1.3891    4.1675     0.1224
30 - 39    34.5   4    1.5378    6.1513     0.1159
Totals            20             22.8758    1.9011

The geometric mean is evaluated as:

$$\log G = \frac{\sum f \log x}{n} = \frac{22.8758}{20} = 1.1438$$

$G = 13.93 \approx 14$ K£ / week

The harmonic mean H for this data is:

$$H = \frac{1}{\dfrac{1}{n}\sum (f/x)} = \frac{n}{\sum (f/x)} = \frac{20}{1.9011} = 10.52 \approx 11 \ \text{K£ / week}$$

Tutorial Exercise 3.1
1. For the following distribution evaluate the: -
(i) Mode
(j) Median
(k) P45
(l) D2
(m) P85 + D4
(n) Q3
(o) Arithmetic mean
(p) Geometric mean, and
(q) Harmonic mean

x    5 - 9   10 - 14   15 - 19   20 - 24   25 - 29   30 - 34
f    4       9         16        12        6         3

2. A company which manufactures bulbs has four machines on which the bulbs can be made. Owing to differences in age and design these machines run at different speeds, as follows:

Machine   Number of bulbs per minute
A         2
B         3
C         5
D         6

(a) When all the machines are running, what is the total number of bulbs produced per hour? [Ans. 288 bulbs]
(b) Over a period of three hours, machines B, C and D were run for the first 2 hours, and machines A, B and D only were run for the last hour. What was the average number of articles produced per hour over the 3-hour period? [231.4 ≈ 231 bulbs per hour]

TOPIC 4.0
MEASURES OF DISPERSION
Measures of dispersion are numerical values that describe the amount of spread or variability that is found among a set of data. Five common measures of dispersion are: -
• Range
• Mean deviation
• Quartile deviation
• Variance
• Standard deviation

Section objectives
At the end of this section you should be able to:
1. Describe the measures of dispersion.
2. Compute the range, the mean (absolute) deviation, quartile deviation, variance and the standard deviation of an ungrouped or grouped set of sampled data.
3. Describe the characteristics of given data samples using the computed measures of dispersion.

Consider the results of two students A and B in seven examinations:

A: 79 20 39 100 96 22 32
B: 48 61 56 53 58 57 55

It can be shown by computation that the average score for both students is about 55 marks (each set of scores totals 388, so the mean is 388/7 ≈ 55.4). It might therefore be concluded that both students are equally good performers, but this is not so. The scores of Student A vary more than those of B, implying that Student B is more consistent and hence dependable.
In general, closely grouped data yields smaller values of spread, while widely spread-out data will have larger values for the various measures of dispersion.

4.1 Range
The range is the difference between the largest and the least values of the variate.

Range = largest variable - least variable

The bigger the range, the larger the dispersion. For example, the range for Student A is (100 - 20) = 80, while the range for B is (61 - 48) = 13. Hence A has much more dispersion than B even though they have the same average score.
The range is not a good measure of dispersion because it depends on the extreme values of the variable while ignoring all others, and it gives no information about the spread of data within it.

4.2 Mean Deviation
The mean deviation (sometimes called the mean absolute deviation) is the arithmetic mean of the absolute deviations of all the variables about a central value.
Consider a sample of variables $x_1, x_2, x_3, \ldots, x_n$ with an arithmetic mean $\bar{x}$. The mean deviation MD is:

$$MD = \frac{\sum |x - \bar{x}|}{n}$$

where $\sum |x - \bar{x}|$ is the sum of the absolute deviations taken about the mean.
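For the two students' scores listed at the start of this topic, the range and the mean deviation can be checked with a few lines of Python. This is a sketch only: the function names `value_range` and `mean_deviation` are introduced here for illustration, and the deviations are taken about 55, the rounded mean used in these notes, rather than about the unrounded mean 388/7.

```python
def value_range(values):
    """Range = largest value - least value."""
    return max(values) - min(values)


def mean_deviation(values, centre):
    """Arithmetic mean of the absolute deviations about a central value."""
    return sum(abs(x - centre) for x in values) / len(values)


scores_a = [79, 20, 39, 100, 96, 22, 32]
scores_b = [48, 61, 56, 53, 58, 57, 55]
centre = 55  # rounded mean of both students, as used in the notes

for name, scores in (("A", scores_a), ("B", scores_b)):
    print(name, value_range(scores), mean_deviation(scores, centre))
```

Passing the central value as an argument keeps the function usable for deviations about the median as well as about the mean.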
 
For grouped frequency distributions the mean deviation is defined as:

$$MD = \frac{1}{n}\sum_{i=1}^{n} f_i\,|x_i - \bar{x}| = \frac{\sum f\,|x - \bar{x}|}{n}$$

A comparison of the results of students A and B by using the mean deviation about the mean (taken as 55 marks) proceeds as follows:

Student A          Student B
x      |x - 55|    x      |x - 55|
79     24          48     7
20     35          61     6
39     16          56     1
100    45          53     2
96     41          58     3
22     33          57     2
32     23          55     0
Total  217         Total  21

The MD for A = $\dfrac{\sum |x - \bar{x}|}{n} = \dfrac{217}{7} = 31$ marks, and for B = $\dfrac{\sum |x - \bar{x}|}{n} = \dfrac{21}{7} = 3$ marks.

Smaller values of the mean deviation indicate more consistency and reliability. As such, student B is more consistent and reliable than student A. The mean deviation is based on all the observations and is a better measure of dispersion than the range or the quartile deviation. It is least when taken from the median. However, taking absolute values, i.e. ignoring the signs, makes it artificial and renders it unsuitable for further mathematical analysis.

4.3 Quartile Deviation
As seen earlier, quartiles divide the sample distribution into four equal parts. The quartile deviation or semi-interquartile range Q is given by:

$$Q = \frac{1}{2}\left(Q_3 - Q_1\right)$$

where $Q_1$ = first (or lower) quartile; $Q_3$ = third (or upper) quartile, and $Q_3 - Q_1$ = inter-quartile range.
The quartile deviation takes into account 50% of the data.

4.4 Variance
The variance of a sample, denoted by Var(x) or $\sigma^2$, is the average of the squares of the deviations from the arithmetic mean ($\bar{x}$). Expressed mathematically:

$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2$$

For grouped frequency distributions:

$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} f_i\left(x_i - \bar{x}\right)^2 = \frac{\sum f(x - \bar{x})^2}{n} \qquad (a)$$

The equation (a) can be modified to give an alternative formula as follows:

$$\sigma^2 = \frac{\sum f(x - \bar{x})^2}{n} = \frac{1}{n}\sum f\left(x^2 - 2x\bar{x} + \bar{x}^2\right)$$
$$= \frac{1}{n}\sum fx^2 - 2\bar{x}\,\frac{\sum fx}{n} + \bar{x}^2\,\frac{\sum f}{n}$$
$$= \frac{1}{n}\sum fx^2 - 2\bar{x}^2 + \bar{x}^2$$
$$= \frac{1}{n}\sum fx^2 - \bar{x}^2$$

Hence
$$\sigma^2 = \frac{1}{n}\sum f(x - \bar{x})^2 = \frac{1}{n}\sum fx^2 - \left(\frac{\sum fx}{n}\right)^2 \qquad (b)$$

For ungrouped data we have:

$$\sigma^2 = \frac{1}{n}\sum x^2 - \bar{x}^2 \qquad (c)$$

The coded mean technique for determining the variance of a grouped distribution
Let $x_c = \dfrac{x_i - A}{h} = \dfrac{d_i}{h}$, where A is the assumed mean, h is the class interval and $x_c$ is the coded variable. Writing $\bar{x}_c = \sum f x_c / n$ for the mean of the coded variable, the variance is:

$$\sigma^2 = h^2\left[\frac{1}{n}\sum f x_c^2 - \bar{x}_c^2\right] = h^2\left[\frac{1}{n}\sum f (x_c)^2 - \left(\frac{\sum f x_c}{n}\right)^2\right] \qquad (d)$$

4.5 Standard Deviation
The standard deviation of a sample (denoted by $\sigma$) is the positive square root of the variance:

$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2} \qquad (e)$$

For grouped frequency distributions:

$$\sigma = \sqrt{\frac{\sum f(x - \bar{x})^2}{n}} \quad \text{or} \quad \sigma = \sqrt{\frac{1}{n}\sum fx^2 - \bar{x}^2} \qquad (f)$$

Using the coded mean (also called the step deviation method):

$$\sigma = h\sqrt{\frac{1}{n}\sum f x_c^2 - \bar{x}_c^2} = h\sqrt{\frac{1}{n}\sum f (x_c)^2 - \left(\frac{\sum f x_c}{n}\right)^2} \qquad (g)$$

Example 4.1
For the frequency distribution given below, evaluate the standard deviation using
(a) the square sum of the deviations
(b) the coded mean technique

Temperature (°C)   Frequency
30.0 - 30.2        6
30.3 - 30.5        12
30.6 - 30.8        15
30.9 - 31.1        20
31.2 - 31.4        13
31.5 - 31.7        9
31.8 - 32.0        5
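The two routes to the standard deviation asked for in Example 4.1 can be compared numerically. The sketch below is illustrative, with variable names chosen here; it takes the class values 30.1, 30.4, ..., 31.9 as the mid-points of the classes, uses the unrounded mean for the direct method, and uses the assumed mean A = 31.0 with class interval h = 0.3 for the coded method.

```python
import math

# Class values (mid-points) and frequencies for Example 4.1.
x = [30.1, 30.4, 30.7, 31.0, 31.3, 31.6, 31.9]
f = [6, 12, 15, 20, 13, 9, 5]
n = sum(f)

# (a) Square sum of the deviations about the (unrounded) mean.
mean = sum(fi * xi for fi, xi in zip(f, x)) / n
sigma_direct = math.sqrt(sum(fi * (xi - mean) ** 2 for fi, xi in zip(f, x)) / n)

# (b) Coded mean technique: x_c = (x - A) / h, then
#     sigma = h * sqrt(sum(f * x_c^2) / n - (sum(f * x_c) / n)^2).
A, h = 31.0, 0.3
xc = [(xi - A) / h for xi in x]
mean_c = sum(fi * ci for fi, ci in zip(f, xc)) / n
sigma_coded = h * math.sqrt(sum(fi * ci ** 2 for fi, ci in zip(f, xc)) / n - mean_c ** 2)

print(round(mean, 2), round(sigma_direct, 3), round(sigma_coded, 3))
```

The two values should agree to within floating-point error, because coding the variable only shifts and rescales it; a hand calculation that rounds the mean to 31 before squaring the deviations gives a slightly larger figure.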
SOLUTION
(a) First prepare the following table for computation.

$$\bar{x} = \frac{\sum fx}{n} = \frac{2476.7}{80} = 30.96 \approx 31; \quad d = x - \bar{x}$$

Temp.          x      f     fx       d      d²     fd²     fx²
30.0 - 30.2    30.1   6     180.6    -0.9   0.81   4.86    5436
30.3 - 30.5    30.4   12    364.8    -0.6   0.36   4.32    11090
30.6 - 30.8    30.7   15    460.5    -0.3   0.09   1.35    14137
30.9 - 31.1    31.0   20    620.0    0      0      0       19220
31.2 - 31.4    31.3   13    406.9    0.3    0.09   1.17    12736
31.5 - 31.7    31.6   9     284.4    0.6    0.36   3.24    8987
31.8 - 32.0    31.9   5     159.5    0.9    0.81   4.05    5088
Totals                80    2476.7                 18.99   76,694

$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} f d^2} = \sqrt{\frac{18.99}{80}} = 0.487\ ^{\circ}\text{C}$$

OR

$$\sigma = \sqrt{\frac{1}{n}\sum fx^2 - \left(\frac{\sum fx}{n}\right)^2} = \sqrt{\frac{76{,}694}{80} - \left(\frac{2476.7}{80}\right)^2} = 0.48\ ^{\circ}\text{C}$$

(b) Using the coded mean

$$\sigma = h\sqrt{\frac{1}{n}\sum f (x_c)^2 - \left(\frac{\sum f x_c}{n}\right)^2}$$

Prepare the following table for computation with $x_c = \dfrac{x_i - A}{h} = \dfrac{d_i}{h}$, A = 31 and h = 0.3

Temp.          x      f     d      x_c    x_c²   f x_c   f x_c²
30.0 - 30.2    30.1   6     -0.9   -3     9      -18     54
30.3 - 30.5    30.4   12    -0.6   -2     4      -24     48
30.6 - 30.8    30.7   15    -0.3   -1     1      -15     15
30.9 - 31.1    31.0   20    0      0      0      0       0
31.2 - 31.4    31.3   13    0.3    1      1      13      13
31.5 - 31.7    31.6   9     0.6    2      4      18      36
31.8 - 32.0    31.9   5     0.9    3      9      15      45
Totals                80                         -11     211

$$\sigma = 0.3\sqrt{\frac{211}{80} - \left(\frac{-11}{80}\right)^2} = 0.485\ ^{\circ}\text{C}$$

In statistical theory the standard deviation is regarded as the most ideal measure of dispersion (or spread) because:
1. It takes into account all observations in the sample.
2. The step of squaring the deviations $(x - \bar{x})^2$ overcomes the drawbacks associated with ignoring the signs as in the mean deviation.
3. It is suitable for further mathematical treatment.

Root Mean Square (RMS) Deviation
RMS is similar in definition to the standard deviation, but evaluates deviations from an arbitrarily chosen number A within the sample. It is denoted by 's'.

$$s = \sqrt{\frac{1}{n}\sum_{i=1}^{n} f\left(x_i - A\right)^2}$$

When $A = \bar{x}$, $s = \sigma$. $s^2$ is called the mean square deviation.

Example 4.2
Show that $s^2 = \sigma^2 + d^2$, where $d = \bar{x} - A$ and A is an arbitrary number.

SOLUTION
By definition, the LHS

$$s^2 = \frac{1}{n}\sum f(x - A)^2 = \frac{1}{n}\sum f\left[(x - \bar{x}) + (\bar{x} - A)\right]^2$$
$$= \frac{1}{n}\sum f\left[(x - \bar{x})^2 + 2(x - \bar{x})(\bar{x} - A) + (\bar{x} - A)^2\right]$$
$$= \frac{1}{n}\sum f(x - \bar{x})^2 + 2(\bar{x} - A)\,\frac{\sum f(x - \bar{x})}{n} + (\bar{x} - A)^2\,\frac{\sum f}{n}$$
$$= \frac{1}{n}\sum f(x - \bar{x})^2 + (\bar{x} - A)^2, \quad \text{since } \sum f(x - \bar{x}) = 0$$

But $\sigma^2 = \dfrac{\sum f(x - \bar{x})^2}{n}$ and $\bar{x} - A = d$.
Therefore $s^2 = \sigma^2 + d^2$.

4.6 Relative measures of dispersion
The measures so far discussed indicate the amount of dispersion with regard to one set of data. When comparing the variation of more than one sample, the following coefficients are used.
• Coefficient of mean deviation = $\dfrac{\text{mean deviation}}{\text{mean or median}}$
• Coefficient of quartile deviation = $\dfrac{Q_3 - Q_1}{Q_3 + Q_1}$
• Coefficient of standard deviation = $\dfrac{\text{S.D}}{\text{mean}} = \dfrac{\sigma}{\bar{x}}$
• Coefficient of variation = $\dfrac{\sigma}{\bar{x}} \times 100\%$

4.7 Moments
The $r$-th moment about any point A is denoted by $\mu'_r$ and is defined as:

$$\mu'_r = \frac{1}{n}\sum f(x - A)^r, \quad \text{where } n = \sum f.$$

First moment: $\mu'_1 = \dfrac{1}{n}\sum f(x - A)$.
Set A = 0: $\mu'_1 = \dfrac{1}{n}\sum fx = \bar{x}$, i.e. the arithmetic mean is the first moment about the origin.
Second moment: $\mu'_2 = \dfrac{1}{n}\sum f(x - A)^2$.
Set $A = \bar{x}$: the $r$-th moment about the mean is denoted by $\mu_r$ and is defined as:

$$\mu_r = \frac{1}{n}\sum f(x - \bar{x})^r$$

The first four moments about the mean are commonly used: $\mu_1 = 0$, $\mu_2 = \sigma^2$, $\mu_3$ gives a measure of skewness and $\mu_4$ gives a measure of the sharpness in the rise of central frequencies.

4.8 Measures of Skewness
A frequency distribution is said to be symmetric when the frequencies are symmetrically distributed about their mean. Skewness is the deviation of a distribution from a symmetrical profile. Three common measures (referred to as coefficients of skewness) are: -
• Bowley's coefficient = $\dfrac{Q_3 + Q_1 - 2Q_2}{Q_3 - Q_1}$
• Karl Pearson's coefficient = $\dfrac{\text{mean} - \text{mode}}{\sigma}$
• Coefficient of skewness = $\dfrac{\mu_3}{\sigma^3}$

The coefficients $\beta_1$ and $\gamma_1$, given by $\beta_1 = \dfrac{\mu_3^2}{\mu_2^3}$ and $\gamma_1 = \sqrt{\beta_1}$, are also used to measure the skewness of a frequency distribution.
NOTE: The coefficient of skewness is a number that equals zero for a symmetric distribution. In general Bowley's coefficient lies between ±1, while Pearson's coefficient lies between ±3.
The relative flatness of the top of a frequency distribution is called kurtosis. It is given by the coefficient $\beta_2 = \dfrac{\mu_4}{\mu_2^2}$, and $\gamma_2 = \beta_2 - 3$ gives the deviation from the kurtosis of the normal curve. For normal curves (i.e. curves which are neither flat nor sharply peaked), $\gamma_2 = 0$. If $\gamma_2 < 0$ then the curve is platykurtic, i.e. flatter than the normal curve, and if $\gamma_2 > 0$ the curve is leptokurtic, i.e. more sharply peaked than normal.

Home Study Exercise 4
1. (a) Find the mean and median age from the following cumulative frequency table:

Age in years   Cumulative frequency
20 - 25        21
25 - 30        40
30 - 35        90
35 - 40        130
40 - 45        146
45 - 50        166
50 - 55        176
55 - 60        186
60 - 65        195
65 - 70        199
[Answers: 38.6, 36.2]

(b) Using the same axes, graph the histogram and the frequency polygon for this data.
(c) Calculate the mean deviation, quartile deviation, and the standard deviation.

2. The lengths, in millimetres, of 40 bearings were determined with the following results:
16.6, 15.3, 16.3, 14.2, 16.7, 17.3, 18.2, 15.6, 14.9, 17.2, 18.7, 16.4, 19.0, 15.8, 18.4, 15.1, 17.0, 18.9, 18.3, 15.9, 13.6, 18.3, 17.2, 18.0, 15.8, 19.3, 16.8, 17.7, 16.8, 17.9, 17.3, 16.6, 15.3, 16.4, 17.3, 16.9, 14.7, 16.2, 17.4, 15.6
(a) Starting with 13.5, group the data into six classes of equal width and tabulate the frequency distribution.
(b) Calculate the standard deviation using the
(i) square sum of deviations
(ii) the coded mean technique [1.299 mm]

3. (a) In two samples of sizes $n_1$ and $n_2$, the means are $m_1$ and $m_2$, and the standard deviations are $\sigma_1$ and $\sigma_2$. If the two samples are pooled together to form a sample of size $N = n_1 + n_2$, show that the standard deviation of the composite sample is:

$$\sigma = \sqrt{\frac{n_1\sigma_1^2 + n_2\sigma_2^2}{N} + \frac{n_1 n_2}{N^2}\left(m_1 - m_2\right)^2}$$

(b) For a set of 25 measurements the mean and the standard deviation were found to be 56 cm and 2 cm respectively. Later on it was found that a mistake had been made in one of the measurements, which was recorded as 64 cm instead of 61 cm. Find the correct value of the mean and the standard deviation. [55.88, 1.56 cm]

4. Compute the different measures of skewness for the following distribution: 3 to 7 (2 counts), 8 to 12 (108 counts), 13 to 17 (80 counts), 18 to 22 (175 counts), 23 to 27 (80 counts), 28 to 32 (32 counts), 33 to 37 (18 counts), 38 to 42 (5 counts). [Bowley's 0.22, Pearson's 0.69, $\mu_3/\sigma^3$ = 1.157]

5. Calculate the first four moments about the mean and hence find $\beta_1$ and $\beta_2$ and comment on the results.

x   0   1   2    3    4    5    6    7   8
f   1   8   28   56   70   56   28   8   1

[$\mu_1 = \mu_3 = 0$; $\mu_2 = 2$; $\mu_4 = 11$; $\beta_1 = 0$, $\beta_2 = 2.75$. The curve is symmetric and slightly platykurtic]

TOPIC 5.0
RANDOM EXPERIMENTS AND EVENTS
Statistics is concerned with the analysis of real experimental data that is random in nature. To some extent, the results of some experiments depend on chance. For example, in an experiment to measure the diameters of steel rods that are randomly selected from a batch, it is normal to expect differences in the results observed.
Probability is the study of random experiments.
A random experiment is one that:
• is performed according to a fully described set of rules,
• can be repeated arbitrarily often, and
• has result(s) that depend entirely on chance.
This section introduces the terminology used in discussing probability. Particular reference is made to the convenient graphical representation of experimental results in Venn diagrams.

Section objectives:
At the end of this section you should be able to:
1. Describe the basic terminology used with respect to probability such as random experiments, outcomes, and sample space.
2. Distinguish between the different kinds of events such as equally likely, mutually exclusive/non-exclusive, exhaustive, favourable and dependent and independent events.
3. Represent samples and events using set notations.
4. Use Venn diagrams to represent sample spaces and their events.

5.1. BASIC TERMINOLOGY
The following are definitions of some of the terms that are used in the study of probability.

5.1.1 Trial and Outcome (or Event)
A trial is the single performance of a random experiment. The result of each trial is called an outcome or event. For example, tossing a coin is a trial that has two possible outcomes, either head (H) or tail (T). Rolling a die is a trial that has six possible outcomes, either 1, 2, 3, 4, 5 or 6.
Events can be simple or compound depending on the nature of the trial.
A simple event is a single possible outcome of an experiment, for example getting an H or a T when one coin is tossed.
A compound event combines the outcomes of two or more trials into one event. For example, if two coins are tossed simultaneously, the outcome H on the first and T on the second is a compound event. All the compound events in this experiment are HH, HT, TH, and TT.

5.1.2 Sample Space
The set of all the possible outcomes of a trial (i.e. the single performance of a random experiment) is called the sample space. It is denoted by S. For example:
• When a coin is tossed, $S = \{H, T\}$. The cardinal number of S is 2, or simply written as $n(S) = 2$.
• When a die is rolled, $S = \{1, 2, 3, 4, 5, 6\}$. Here $n(S) = 6$.
• When two coins are tossed simultaneously, which is the same as tossing one coin twice in succession, $S = \{HH, HT, TH, TT\}$ and $n(S) = 4$.

Determining the Sample Size
In simple events the sample size $n(S)$ is equal to the number of all possible outcomes of a trial. When a trial with a total of x possible outcomes is performed k times in succession, the sample size is evaluated as $n(S) = x^k$. For example, tossing a coin has two possible outcomes, H or T (i.e. x = 2). If three coins are tossed simultaneously, which is equivalent to tossing one coin three times in succession (i.e. k = 3), $n(S) = x^k = 2^3 = 8$. These members are listed as: $S = \{HHH, HHT, HTH, HTT, THH, THT, TTH, TTT\}$. Similarly, if two dice are tossed simultaneously, x = 6 and k = 2, giving $n(S) = 6^2 = 36$. This sample space can be generated as shown in Fig. 5.1.

Columns: outcomes of trial one. Rows: outcomes of trial two.
      1       2       3       4       5       6
1   (1, 1)  (1, 2)  (1, 3)  (1, 4)  (1, 5)  (1, 6)
2   (2, 1)  (2, 2)  (2, 3)  (2, 4)  (2, 5)  (2, 6)
3   (3, 1)  (3, 2)  (3, 3)  (3, 4)  (3, 5)  (3, 6)
4   (4, 1)  (4, 2)  (4, 3)  (4, 4)  (4, 5)  (4, 6)
5   (5, 1)  (5, 2)  (5, 3)  (5, 4)  (5, 5)  (5, 6)
6   (6, 1)  (6, 2)  (6, 3)  (6, 4)  (6, 5)  (6, 6)
S = {(1, 1), (1, 2), ..., (2, 1), (2, 2), ..., (6, 5), (6, 6)}
n(S) = 36
Figure 5.1 Sample space evaluation for two dice rolled simultaneously

A particular outcome occurring in the sample space is called a sample point. Two outcomes are said to have an equal likelihood of occurring if they appear the same number of times within the sample space. For example, in the set $S = \{H, T\}$, H and T have an equal likelihood of occurring because they both appear once in the sample space.

Example 5.1
Two dice are rolled simultaneously and a score is obtained by adding the numbers that appear on the faces that turn up. The events A, B and C are defined as follows: -
A: the set of outcomes that give a score of 3
B: the set of outcomes that give a score of 5
C: the set of outcomes that give a score of at least 4 and at most 5
Find A, B and C and state the most likely score for the experiment. How many outcomes give a score of at most 11?

SOLUTION
First generate the set of all possible outcomes for the experiment of rolling two dice simultaneously.
Sample space
      1       2       3       4       5       6
1   (1, 1)  (1, 2)  (1, 3)  (1, 4)  (1, 5)  (1, 6)
2   (2, 1)  (2, 2)  (2, 3)  (2, 4)  (2, 5)  (2, 6)
3   (3, 1)  (3, 2)  (3, 3)  (3, 4)  (3, 5)  (3, 6)
4   (4, 1)  (4, 2)  (4, 3)  (4, 4)  (4, 5)  (4, 6)
5   (5, 1)  (5, 2)  (5, 3)  (5, 4)  (5, 5)  (5, 6)
6   (6, 1)  (6, 2)  (6, 3)  (6, 4)  (6, 5)  (6, 6)
• The set of outcomes that give a score of 3 is read from the table. Hence A = {(2, 1), (1, 2)}.
• The set of outcomes that give a score of 5 is read in the same way. Therefore B = {(1, 4), (2, 3), (3, 2), (4, 1)}.
• The set of outcomes that give a score of at least 4 and at most 5 is C = {(1, 3), (2, 2), (3, 1), (1, 4), (2, 3), (3, 2), (4, 1)}. Note that B is a subset of C, i.e. $B \subset C$.
• The frequency distribution for the scores is:
Score   f   c.f
  2     1    1
  3     2    3
  4     3    6
  5     4   10
  6     5   15
  7     6   21
  8     5   26
  9     4   30
 10     3   33
 11     2   35
 12     1   36
The most likely score = score with the highest count = 7.
• From the cumulative frequency column, the number of outcomes that give a score of at most 11 is 35.

In general: -
• An event A in the sample space S is a set of outcomes such that $A \subset S$, read as "A is contained in S".
• An empty set (denoted by $\emptyset$) is an event such that $\emptyset \subset S$ and $\emptyset$ has no element, $n(\emptyset) = 0$.
• The sample size, i.e. the total number of outcomes in any trial, gives the total number of exhaustive events or cases.
• The number of occurrences of each outcome/event in the sample space gives the number of favourable cases to that event.
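The enumeration used in Example 5.1 can be checked with a short script. This is only a sketch: the helper name `event` and the variable names are illustrative choices, not part of the notes, and the score is assumed to be the sum of the two faces as stated in the example.

```python
from collections import Counter
from itertools import product

# Sample space for two dice: every ordered pair (first die, second die).
sample_space = list(product(range(1, 7), repeat=2))

def event(predicate):
    """Return the set of outcomes whose score (sum of faces) satisfies predicate."""
    return {outcome for outcome in sample_space if predicate(sum(outcome))}

A = event(lambda score: score == 3)       # score of 3
B = event(lambda score: score == 5)       # score of 5
C = event(lambda score: 4 <= score <= 5)  # score of at least 4 and at most 5

# Frequency of each score, the most likely score, and the count of
# outcomes whose score is at most 11.
frequency = Counter(sum(outcome) for outcome in sample_space)
most_likely = max(frequency, key=frequency.get)
at_most_11 = sum(count for score, count in frequency.items() if score <= 11)

print(len(sample_space), sorted(A), sorted(B), most_likely, at_most_11)
```

Using sets for the events means that relationships such as "B is a subset of C" can be tested directly with `B <= C`.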
5.2 VENN DIAGRAM REPRESENTATIONS OF SAMPLE SPACES AND THEIR EVENTS
A common graphical representation of the outcomes of an experiment is the Venn diagram. A Venn diagram usually consists of a rectangle, the interior of which represents the sample space, together with one or more closed curves inside it. The interior of each closed curve represents an event. Fig. 5.2 shows a typical Venn diagram representing a sample space S and two events A and B.

[Figure: sample space S containing two overlapping events A and B, with the regions marked i to iv]
Figure 5.2. A Venn diagram

Each possible outcome in the sample space is assigned to one of the four regions (marked i to iv in Fig. 5.2):
(i) outcomes that belong to event A but not event B
(ii) outcomes that belong to event B but not event A
(iii) outcomes that belong to both event A and event B
(iv) outcomes that belong to neither event A nor event B

Example 5.2
A six-sided fair die is thrown. Let event A be 'the number obtained is divisible by 2' and event B 'the number is divisible by 3'. Draw a Venn diagram to represent these events.
SOLUTION
The sample space S for the experiment is
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}
B = {3, 6}
The Venn diagram is therefore:
[Venn diagram: 2 and 4 lie in A only, 3 lies in B only, 6 lies in the overlap of A and B, and 1 and 5 lie in S outside both A and B]

5.2.1 Complement
The complement of a set A (denoted by $A'$ or $\bar{A}$ and read as "not A") is defined as the set of all elements in the sample space S that are not in the set A, Fig. 5.3(a). In set builder notation:
$A'$ (or $\bar{A}$) $= \{x \mid x \in S \text{ and } x \notin A\}$
where the symbol $\in$ is used to mean "is a member of" and the symbol $\notin$ means "is not a member of".
For example, in the trial of tossing a coin, if event A is "a head turns up" then $A'$ or $\bar{A}$ is the event "a tail turns up". The sum of the cardinal number of a set A and that of its complement should always equal the cardinal number of the sample space, i.e.
$n(A) + n(A') = n(S)$

[Figure: four Venn diagrams (a) to (d), each drawn in a sample space S]
Fig. 5.3 The shaded areas are (a) $A'$ or $\bar{A}$ (b) $A \cup B$ (disjoint events) (c) $A \cup B$ (non-disjoint events) (d) $A \cap B$

5.2.2 Intersection
The intersection of two sets A and B is the set of all elements that are common to BOTH A AND B, Fig. 5.3(d). The intersection of A and B is denoted by $A \cap B$, which is read as "A intersection B". Thus
$A \cap B = A \text{ AND } B = \{x \mid x \in A \text{ and } x \in B\}$
It follows that if $A \subset B$ then $n(A \cap B) = n(A)$. In general $n(A \cap B) \le n(A)$ and $n(A \cap B) \le n(B)$. Two events A and B are mutually exclusive or disjoint events if $A \cap B = \emptyset$. Mutually exclusive events cannot occur together. For example, in tossing a coin, the event that a head turns up and the event that a tail turns up cannot occur at the same time in a single trial. Mutually non-exclusive events, on the other hand, can occur simultaneously. For example, in rolling a die, the event that a multiple of three turns up and the event that a multiple of two turns up occur together when a 6 is thrown.

5.2.3 Union
The union of two sets A and B is the set consisting of all elements that are members of EITHER A OR B or of both A AND B, Fig. 5.3(c). Thus
$A \cup B = A \text{ OR } B = \{x \mid x \in A \text{ or } x \in B, \text{ or } x \in A \text{ and } x \in B\}$
In this case $n(A \cup B) = n(A) + n(B) - n(A \cap B)$. Again, if $A \subset B$ then $n(A \cup B) = n(B)$. For mutually exclusive or disjoint events, $n(A \cup B) = n(A) + n(B)$, since $n(A \cap B) = n(\emptyset) = 0$, Fig. 5.3(b).

5.2.4 Set relationships for more than two events
Consider n events $A_1, A_2, A_3, \ldots, A_n$ in a sample space S. The union of $A_1, A_2, A_3, \ldots, A_n$ is denoted by $A_1 \cup A_2 \cup A_3 \cup \cdots \cup A_n$. Similarly, the event consisting of all the outcomes that belong to every one of the sets is $A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_n$. If for any pair of values i, j with $i \ne j$, $A_i \cap A_j = \emptyset$, then $A_i$ and $A_j$ are said to be mutually exclusive or disjoint.
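The set relationships of sections 5.2.1 to 5.2.3 can be tried out on the die events of Example 5.2. The sketch below uses Python's built-in `set` type; the variable names are illustrative assumptions rather than notation from the notes.

```python
# Sample space and events from Example 5.2 (one roll of a six-sided die).
S = {1, 2, 3, 4, 5, 6}
A = {x for x in S if x % 2 == 0}   # divisible by 2
B = {x for x in S if x % 3 == 0}   # divisible by 3

complement_A = S - A     # A': in S but not in A
intersection = A & B     # outcomes common to both A and B
union = A | B            # outcomes in A or B (or both)

# The four Venn diagram regions i to iv of Fig. 5.2.
regions = {
    "i": A - B,          # A but not B
    "ii": B - A,         # B but not A
    "iii": A & B,        # both A and B
    "iv": S - (A | B),   # neither A nor B
}

# Cardinality identities stated in the text.
assert len(A) + len(complement_A) == len(S)
assert len(union) == len(A) + len(B) - len(intersection)

# A and B share the outcome 6, so they are not mutually exclusive.
mutually_exclusive = A.isdisjoint(B)
print(regions, mutually_exclusive)
```

The four regions are pairwise disjoint and together make up S, which is why every outcome can be assigned to exactly one of them.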

5.3 Sequential Experiments
Often the main experiment or trial consists of a sequence of sub-experiments, each of which may or may not influence the outcome of the next one. If the outcome of the first event does not affect the outcome of the next event, then the sub-experiments and their outcomes are said to be independent. If the outcome of a preceding experiment affects the outcome of the next one, then the event that comes second is said to be dependent on the first. Sequential experiments form the basis of conditional probability.
NOTE: By definition independent events are different from mutually exclusive events, and therefore the two should not be taken to have the same meaning.

5.4 Sampling
In the quality control of mass produced components it would be both time-consuming and uneconomical to subject each component to full inspection. In practice a sample is randomly selected as being representative of the whole population (or total output of components). This sampling can be done with or without replacement from a larger finite batch of such components.
• Sampling with replacement is a sequential experiment in which each unique component that is withdrawn from an isolated batch is replaced before the next random selection is performed. The sample space remains unchanged throughout and the experiment yields independent outcomes.
• Sampling without replacement is a sequential experiment in which each unique component that is withdrawn from an isolated batch is not replaced before the next random selection is performed. The sample size varies and the experiment yields dependent outcomes.

5.5 Tree diagram representations of sample spaces
A tree diagram is a branch-structured pictorial construction that is used to represent the sample space of experiments that are performed sequentially. Each 'branch' within the diagram shows a possible outcome. For example, in an experiment where one coin is tossed twice in succession, the sample space has been determined as $S = \{HH, HT, TH, TT\}$. Fig. 5.4 illustrates the tree diagram representation of this sample space.

TRIAL 1   TRIAL 2   OUTCOMES
H         H         H, H
H         T         H, T
T         H         T, H
T         T         T, T
Four branches, each branch shows a possible outcome.
S = {(H, H), (H, T), (T, H), (T, T)}
Figure 5.4 Tree diagram

Home Study Exercise 6
1. An experiment consists of tossing four coins and counting the number of heads (H) that turn up. The events R, S and T are defined as follows: -
R: the set of outcomes that include 3 heads
S: the set of outcomes that include 2 tails
T: the set of outcomes that include at least 1 head and at most 4 heads
Find R, S and T. How many outcomes include at least one tail?
2. (a) A fair coin is tossed twice. List a sample space showing the possible outcomes.
(b) A biased coin (it favours heads in the ratio 5 to 1) is tossed twice. List a sample space showing the possible outcomes.
[(a) $S = \{HH, HT, TH, TT\}$ and (b) $S = \{HH, HT, TH, TT\}$. The sample space listings are identical because sample space listings do not indicate relative probabilities.]
3. Two dice are rolled simultaneously to observe the numbers that appear on the faces that turn up. The events A, B and C are defined as follows: -
A: both dice turn up the same number
B: the sum of the outcomes is at most 9
C: both dice turn up an odd number.
Find A, B and C and draw the Venn diagram for the sample space, S. Show the following set relationships using Venn diagrams: (i) A  B (ii) B  A  C
4. A coin is tossed and a head or tail is observed. If a head results, the coin is tossed a second time. If a tail results on the 1st toss, a die is rolled. Construct the sample space and find how many outcomes have at most one head.

TOPIC 6.0
INTRODUCTION TO PROBABILITY
Probability is a measure of the likelihood that a particular outcome occurs in a random experiment. The ability to predict likely occurrences has applications in industrial quality control and efficient use of resources.

Section objectives
At the end of this section you should be able to: -
1. Differentiate between empirical and classical probability
2. Understand the concept of statistical regularity
3. State and apply the laws of probability.
4. Solve problems that involve conditional probability.

6.1 INTRODUCTION
There are two approaches to determining probabilities: -
• Empirical (or experimental) probability
• Classical (or theoretical) probability

6.1.1 Empirical Probability
Empirical probability is the observed relative frequency with which an event occurs. It is based on previous known results of an experiment.
The value assigned to the probability of an event A as a result of experimentation can be found by means of the formula
$P'(A) = \dfrac{n(A)}{n}$   (6.1)
where P'(A) = relative frequency with which event A occurred
n(A) = number of times that event A is actually observed
n = number of times the experiment is attempted.
The number of likely outcomes of an event A in n performances of an experiment is called the expectation of A and is denoted by E.
$E = n \times P(A)$
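Formula (6.1) and the expectation $E = n \times P(A)$ translate directly into code. The sketch below is illustrative only: the function names and the counts used (30 heads in 50 tosses) are hypothetical and do not come from the notes.

```python
def empirical_probability(occurrences, trials):
    """Relative frequency P'(A) = n(A) / n, as in formula (6.1)."""
    if trials <= 0:
        raise ValueError("the experiment must be attempted at least once")
    if not 0 <= occurrences <= trials:
        raise ValueError("n(A) must lie between 0 and n")
    return occurrences / trials

def expectation(probability, trials):
    """Expected number of occurrences E = n * P(A) in n trials."""
    return trials * probability

# Hypothetical data: 30 heads observed in 50 tosses of a coin.
p_heads = empirical_probability(30, 50)
expected_heads = expectation(p_heads, 200)
print(p_heads, expected_heads)
```

The input checks mirror the requirement that a relative frequency always lies between 0 and 1.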
Example 6.1
In a coin-tossing experiment 105 heads are observed when the coin is tossed 200 times in succession. Find the probability that a head occurs in this experiment and the likely number of heads observed when the coin is tossed 600 times.
SOLUTION
Let A = event that a head turns up
Hence n(A) = 105
n = 200
$P'(A) = \dfrac{n(A)}{n} = \dfrac{105}{200} = 0.525$
If n = 600, n(A) = n × P'(A)
= 600 × 0.525 = 315 heads
Therefore 315 heads are likely to turn up when the coin is tossed 600 times.

6.1.2 Classical (or Theoretical) Probability
Classical probability considers the theoretical number of ways in which it is possible for event A to occur. Here the probability is defined as: -
$P(A) = \dfrac{n(A)}{n(S)}$   (6.2)
where n(A) = theoretical number of ways in which event A can occur
n(S) = total number of all equally possible outcomes
The prime symbol of equation 6.1 is not used with theoretical probabilities. The probability of A is also a relative frequency, this time based on all the possible outcomes regardless of whether they occur or not.

NOTE: The use of formula 6.2 requires the existence of a sample space where each outcome has an equal likelihood of occurring.

Example 6.2
If two coins are tossed simultaneously, find the probability of obtaining two heads, one head and no head.

SOLUTION
Sample space, S = {HH, HT, TH, TT}
n(S) = 4
By definition, $P(A) = \dfrac{n(A)}{n(S)}$
Probability of obtaining two heads
$P(2H) = \dfrac{n(2H)}{n(S)} = \dfrac{1}{4}$
Probability of obtaining one head
$P(1H) = \dfrac{n(1H)}{n(S)} = \dfrac{2}{4}$ or $\dfrac{1}{2}$
Probability of obtaining no head
$P(0H) = P(2T) = \dfrac{n(2T)}{n(S)} = \dfrac{1}{4}$

6.2 Concept of Statistical Regularity (law of large numbers)
If an experiment is repeated a large number of times, the ratio of successful occurrences to the total number of trials will approach the theoretical probability of the outcome of an individual trial.
For example, the experimental results of tossing a coin were tabulated as follows: -
Table 6.1 Statistical Regularity
No. of throws   Relative frequency of Heads
 4,000          0.5069
12,000          0.5016
24,000          0.5005
As the number of throws is increased, the relative frequency of the heads that turn up (Table 6.1) approaches the theoretical probability of obtaining a head in the single experiment of tossing a coin, which is 0.5. The larger the number of experimental trials n, the closer the experimental probability P'(A) is expected to be to the true probability P(A). This phenomenon is called statistical regularity or the stability of relative frequencies. It explains why the percentage of defective components fluctuates only a little when items are produced in bulk.

Tutorial Exercise 6.1
1. Toss a single coin 20 times and record H (head) or T (tail) after each toss. Using your results find the observed probabilities P'(H) and P'(T). Repeat the experiment 30 times and evaluate P'(H) and P'(T). Compare your findings in both cases.
2. Place three coins in a cup, shake and dump them out, and observe the number of heads showing. Record 0H, 1H, 2H, and 3H after each trial. Repeat the process 25 times. Using your results find: -
(a) P'(0H) (b) P'(1H) (c) P'(2H) (d) P'(3H)
Compare your results with the theoretical / true probabilities P(A).

In summary,
• Probability represents a relative frequency
• P(A) is the ratio of the number of times that an event can be expected to occur to the number of trials
• The numerator of the probability ratio must be a positive number or zero
• The denominator of the probability ratio must be a positive number
• The number of times that an event is expected to occur in n trials is always less than or equal to the total number of trials.

6.3 AXIOMS OF MATHEMATICAL PROBABILITY
The following properties of the probability P(A) can be deduced: -
(i) For any event A in a sample space S,
$0 \le P(A) \le 1$.
If P(A) = 1, then A is a certainty; if P(A) = 0, then A is an impossibility
(ii) For the entire sample space S we have,
P(S) = 1,
i.e. we are certain to obtain one of the possible outcomes.
(iii) If $A'$ is the complement of A, then from (ii) we have
$P(A') = 1 - P(A)$,
which is referred to as the complementation rule. This is particularly useful in problems where evaluating the probability of the complement is easier than evaluating the probability of the event itself.
(iv) If A and B are two events in S then
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
This is called the addition rule. The intersection of both events is the overlap in the Venn diagram of the union and is therefore subtracted to avoid double counting.
However if A and B are mutually exclusive (i.e. $A \cap B = \emptyset$), then $P(A \cap B) = 0$ and hence
$P(A \cup B) = P(A) + P(B)$
In general, if $p_1, p_2, \ldots, p_n$ are the probabilities of n mutually exclusive events $E_1, E_2, \ldots, E_n$, then the probability that any one of these events will happen is: -
$P(E_1 \cup E_2 \cup \cdots \cup E_n) = P(E_1) + P(E_2) + \cdots + P(E_n) = p_1 + p_2 + \cdots + p_n$
Proof
Let $m_1, m_2, \ldots, m_n$ be the theoretical number of ways in which the mutually exclusive events $E_1, E_2, \ldots, E_n$ can occur. The total number of trials, each with outcomes $m_1, m_2, \ldots, m_n$, is written N here (to keep it distinct from the number of events n):
$N = m_1 + m_2 + \cdots + m_n$
and the probabilities of occurrence of the events are:
$P(E_1) = p_1 = \dfrac{m_1}{N};\; p_2 = \dfrac{m_2}{N},\; \ldots,\; p_n = \dfrac{m_n}{N}$
Therefore the probability that any one of these events occurs is
$P(E_1 \cup E_2 \cup \cdots \cup E_n) = \dfrac{m_1 + m_2 + \cdots + m_n}{N} = \dfrac{m_1}{N} + \dfrac{m_2}{N} + \cdots + \dfrac{m_n}{N} = p_1 + p_2 + \cdots + p_n$

Example 6.3
In a horse race the odds in favour of four horses A, B, C and D are 1:3, 1:4, 1:5 and 1:7 respectively. What is the probability that one of the horses will win? If there is one more horse in the race, what is its probability of success?
SOLUTION
Let A: event that horse A wins
B: event that horse B wins
C: event that horse C wins
D: event that horse D wins
In the race only one horse can win and therefore the events are mutually exclusive. Given that $p(A) = \dfrac{1}{1+3} = \dfrac{1}{4}$; p(B) = 1/5; p(C) = 1/6 and p(D) = 1/8
The probability that one of the horses wins is:
$P(A \cup B \cup C \cup D) = P(A) + P(B) + P(C) + P(D) = \dfrac{1}{4} + \dfrac{1}{5} + \dfrac{1}{6} + \dfrac{1}{8} = \dfrac{89}{120} \approx 74.2\%$
If there is one more horse in the race, the probability of its success is equivalent to the probability that none of the other four horses wins, i.e.
$p_5 = P\big((A \cup B \cup C \cup D)'\big) = 1 - P(A \cup B \cup C \cup D) = 1 - \dfrac{89}{120} = \dfrac{31}{120} \approx 25.8\%$

Example 6.4
Two dice are rolled simultaneously and a score is obtained by adding the numbers that appear on the faces that turn up. The events A and B are defined as follows: -
A: a score less than or equal to 3 is obtained
B: a score that is divisible by 3 is obtained
Find P(A), P(B) and $P(A \cup B)$.

SOLUTION
First generate the set of all possible outcomes for the experiment of rolling two dice simultaneously.
Sample space
      1       2       3       4       5       6
1   (1, 1)  (1, 2)  (1, 3)  (1, 4)  (1, 5)  (1, 6)
2   (2, 1)  (2, 2)  (2, 3)  (2, 4)  (2, 5)  (2, 6)
3   (3, 1)  (3, 2)  (3, 3)  (3, 4)  (3, 5)  (3, 6)
4   (4, 1)  (4, 2)  (4, 3)  (4, 4)  (4, 5)  (4, 6)
5   (5, 1)  (5, 2)  (5, 3)  (5, 4)  (5, 5)  (5, 6)
6   (6, 1)  (6, 2)  (6, 3)  (6, 4)  (6, 5)  (6, 6)
n(S) = 36
The set of outcomes that give a score less than or equal to 3 is A = {(1, 1), (1, 2), (2, 1)}.
Hence n(A) = 3, P(A) = 3/36 = 1/12
The set of outcomes that give a score that is evenly divisible by 3 consists of the outcomes with a score of 3, 6, 9 or 12.
Therefore n(B) = 12, P(B) = 12/36 = 1/3
$A \cap B$ = {(1, 2), (2, 1)}, $n(A \cap B) = 2$ and $P(A \cap B) = 1/18$
$P(A \cup B) = P(A) + P(B) - P(A \cap B) = 1/12 + 1/3 - 1/18 = 13/36$

Example 6.5
A biased six-sided die has probabilities p/2, p, p, p, p, 2p of showing 1, 2, 3, 4, 5, 6 respectively. Calculate p.

SOLUTION
$P(S) = P(1 \cup 2 \cup 3 \cup 4 \cup 5 \cup 6) = \dfrac{p}{2} + p + p + p + p + 2p = \dfrac{13p}{2}$
But P(S) = 1
Therefore 13p/2 = 1 or p = 2/13.

Addition rule for more than two events
Consider three events with a Venn diagram as shown below.
[Figure: three overlapping events A, B and C drawn inside the sample space U, with the regions numbered 1 to 8]
Figure 6.1 Venn diagram representation of the three events

The diagram is divided into eight regions that are of four different types:
• Regions 1, 2 and 3 each correspond to a single event
• Regions 4, 5 and 6 are each the intersection of exactly two events
• Region 7 is the three-fold intersection of all three events
• Region 8 corresponds to none of the events.
For one-event Venn diagrams there are two regions, for two-event Venn diagrams there are 4 regions, for three-event diagrams there are 8 regions and in general for an n-event diagram there are $2^n$ regions. Any particular region R
lies either outside or inside the closed curve of any particular event.
With two choices (inside or outside) for each of the n closed curves, there are $2^n$ different possible combinations to characterise R.
The $2^n$ regions will break down into n + 1 types with the numbers of each type as follows: -
• ${}^nC_0 = 1$ no events;
• ${}^nC_1 = n$ one event but no intersections;
• ${}^nC_2 = \frac{1}{2} n(n-1)$ two-fold intersections;
• ${}^nC_3 = \frac{1}{3!} n(n-1)(n-2)$ three-fold intersections;
• ...
• ${}^nC_n = 1$ n-fold intersection.
When the probabilities are combined to calculate the probability of the union of the n events, account must be taken of the double, triple, etc. counting. This result can be shown to be:
$P(A_1 \cup A_2 \cup \cdots \cup A_n) = \sum_i P(A_i) - \sum_{i,j} P(A_i \cap A_j) + \sum_{i,j,k} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n+1} P(A_1 \cap A_2 \cap \cdots \cap A_n)$
Each summation runs over all possible sets of subscripts, but omitting those in which any two subscripts in a set are the same.
For example,
If n = 2
$P(A_1 \cup A_2) = P(A_1) + P(A_2) - P(A_1 \cap A_2)$
If n = 3
$P(A_1 \cup A_2 \cup A_3) = P(A_1) + P(A_2) + P(A_3) - P(A_1 \cap A_2) - P(A_1 \cap A_3) - P(A_2 \cap A_3) + P(A_1 \cap A_2 \cap A_3)$
and so on.

Example 6.6
Find the probability of drawing from a pack a card that has at least one of the following properties:
A: it is an ace
B: it is a spade
C: it is a black honour card (ace, king, queen, jack or 10)
D: it is a black ace

SOLUTION
There are four events and hence $2^4 = 16$ regions, of which ${}^4C_1 = 4$ are regions with one event. Measuring all probabilities in units of 1/52, the individual ones are
P(A) = 4, P(B) = 13, P(C) = 10, P(D) = 2
There are ${}^4C_2 = 6$ two-fold intersection probabilities,
$P(A \cap B) = 1$, $P(A \cap C) = 2$, $P(A \cap D) = 2$,
$P(B \cap C) = 5$, $P(B \cap D) = 1$, $P(C \cap D) = 2$
There are ${}^4C_3 = 4$ three-fold intersection probabilities,
$P(A \cap B \cap C) = 1$, $P(A \cap B \cap D) = 1$,
$P(A \cap C \cap D) = 2$, $P(B \cap C \cap D) = 1$
There is ${}^4C_4 = 1$ four-fold intersection probability
$P(A \cap B \cap C \cap D) = 1$ (in units of 1/52)
Hence
$P(A \cup B \cup C \cup D) = \dfrac{1}{52}\,[(4 + 13 + 10 + 2) - (1 + 2 + 2 + 5 + 1 + 2) + (1 + 1 + 2 + 1) - 1] = \dfrac{20}{52}$.

6.4 CONDITIONAL PROBABILITY
Conditional probability is the probability that a particular event occurs given the occurrence of another, possibly related, event.
Let A be an event in the sample space S (i.e. $A \subset S$). The probability that an event B occurs given that A has already occurred is
$P(B/A) = \dfrac{P(A \cap B)}{P(A)}$
$\therefore P(A \cap B) = P(A) \cdot P(B/A)$   (a)
where P(B/A) is the conditional probability of B given that event A has already occurred.
Similarly
$P(A/B) = \dfrac{P(B \cap A)}{P(B)} = \dfrac{P(A \cap B)}{P(B)}$
$\therefore P(A \cap B) = P(B) \cdot P(A/B)$   (b)
From (a) and (b) it can be seen that
$P(A \cap B) = P(A) \cdot P(B/A) = P(B) \cdot P(A/B)$   (c)
The last equation (c) is called the multiplication rule for arbitrary events in sequential experiments.
In terms of Venn diagrams P(B/A) can be interpreted as the probability of B in the reduced sample space of A.

Example 6.7
Two dice are rolled simultaneously. Find the probability that one of the dice turns up a 2 given that the total score is 4.

SOLUTION
Sample space
      1       2       3       4       5       6
1   (1, 1)  (1, 2)  (1, 3)  (1, 4)  (1, 5)  (1, 6)
2   (2, 1)  (2, 2)  (2, 3)  (2, 4)  (2, 5)  (2, 6)
3   (3, 1)  (3, 2)  (3, 3)  (3, 4)  (3, 5)  (3, 6)
4   (4, 1)  (4, 2)  (4, 3)  (4, 4)  (4, 5)  (4, 6)
5   (5, 1)  (5, 2)  (5, 3)  (5, 4)  (5, 5)  (5, 6)
6   (6, 1)  (6, 2)  (6, 3)  (6, 4)  (6, 5)  (6, 6)
n(S) = 36
Let A: the total score is 4
B: a die turns up a 2
A = {(3, 1), (2, 2), (1, 3)}, $\therefore$ P(A) = 3/36 or 1/12
B = {(2, 1), (1, 2), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6), (3, 2), (4, 2), (5, 2), (6, 2)}
$A \cap B$ = {(2, 2)}, $\therefore P(A \cap B) = 1/36$
Therefore, by definition
$P(B/A) = \dfrac{P(A \cap B)}{P(A)} = \dfrac{1/36}{1/12} = \dfrac{1}{3}$

Multiplication rule for Independent events
Two events A and B are statistically independent if the probability of the simultaneous occurrence of both events is
$P(A \cap B) = P(A) \cdot P(B/A) = P(B) \cdot P(A/B) = P(A) \cdot P(B)$
It can be seen that for independent events $P(B/A) = P(B)$ and $P(A/B) = P(A)$: the occurrence of one event does not affect the probability of occurrence of the next event.
Also if A and B are independent events, then
• $P(A \cap B') = P(A) \cdot P(B')$
• $P(A' \cap B) = P(A') \cdot P(B)$
• $P(A' \cap B') = P(A') \cdot P(B')$
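The definition $P(B/A) = P(A \cap B)/P(A)$ and the independence test $P(A \cap B) = P(A) \cdot P(B)$ can both be checked by enumerating the two-dice sample space used in Example 6.7. The sketch below uses exact fractions; the helper names `probability`, `conditional` and `independent`, and the two extra events at the end, are illustrative assumptions and not part of the notes.

```python
from fractions import Fraction
from itertools import product

# Equally likely outcomes for two dice rolled simultaneously.
sample_space = set(product(range(1, 7), repeat=2))

def probability(event):
    """Classical probability n(event) / n(S)."""
    return Fraction(len(event), len(sample_space))

def conditional(b, a):
    """P(B/A) = P(A and B) / P(A); assumes P(A) > 0."""
    return probability(a & b) / probability(a)

def independent(x, y):
    """True when P(X and Y) equals P(X) * P(Y)."""
    return probability(x & y) == probability(x) * probability(y)

# Events of Example 6.7.
A = {o for o in sample_space if sum(o) == 4}   # total score is 4
B = {o for o in sample_space if 2 in o}        # a die turns up a 2
p_B_given_A = conditional(B, A)

# Two further (hypothetical) events that concern different dice.
first_is_even = {o for o in sample_space if o[0] % 2 == 0}
second_is_six = {o for o in sample_space if o[1] == 6}

print(p_B_given_A, independent(A, B), independent(first_is_even, second_is_six))
```

Events that refer to different dice come out independent, while A and B above do not, because knowing the total score changes the chance of seeing a 2.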
Example 6.8
Let A and B be two independent events in S such that P(A) = 1/2 and $P(A \cup B) = 2/3$. Find (i) P(B) and (ii) $P(B'/A)$.

SOLUTION
(i) By definition
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$= P(A) + P(B) - P(A) \cdot P(B)$
$= P(A) + [1 - P(A)]\,P(B)$
$\therefore P(B) = \dfrac{P(A \cup B) - P(A)}{1 - P(A)} = \dfrac{\frac{2}{3} - \frac{1}{2}}{1 - \frac{1}{2}} = \dfrac{1}{3}$
(ii) For independent events A and B
$P(B'/A) = \dfrac{P(A \cap B')}{P(A)} = \dfrac{P(A) \cdot P(B')}{P(A)} = P(B') = 1 - \dfrac{1}{3} = \dfrac{2}{3}$

The multiplication rule for probabilities can straightforwardly be extended to several events. For example, consider three events A, B and C.
$P(A \cap B \cap C) = P(C) \cdot P(A \cap B / C)$
$= P(B \cap C) \cdot P(A / B \cap C)$
$= P(C) \cdot P(B/C) \cdot P(A / B \cap C)$
OR
$= P(A) \cdot P(B/A) \cdot P(C / A \cap B)$
OR
$= P(B) \cdot P(C/B) \cdot P(A / C \cap B)$, etc.
The following are important relationships involving conditional probabilities:
• For a set of mutually exclusive events $A_i$ whose union is A, and for some other event B,
$P(A/B) = \sum_i P(A_i/B)$
This is the addition law for conditional probabilities.
• If the set of mutually exclusive events $A_i$ exhausts the sample space S, then the probability P(B) of an event B in S is given by the total probability law, which is stated as:
$P(B) = \sum_i P(B/A_i)\,P(A_i)$, or more generally as
$P(B/C) = \sum_i P(B / A_i \cap C)\,P(A_i/C)$
where C is any other event in S.

In a sampling experiment there are two ways in which objects may be drawn into the sample, namely:
(i) Sampling with replacement, and
(ii) Sampling without replacement.

In sampling with replacement each object drawn is returned to the set before the next object is randomly drawn. Each subsequent selection is independent of the previous one.
In sampling without replacement each object drawn is put aside from the set before the next object is randomly drawn. This kind of sampling continuously reduces the sample size, therefore making sequential experiments dependent on one another.

Example 6.9
A class has ten boys and five girls. Three students are selected at random from the class without replacement. What is the probability that
(i) All three are girls?
(ii) The first two students chosen are boys and the third is a girl?
(iii) The first and the third are of the same sex while the second is of the opposite sex?

SOLUTION
Let the events
$G_i$: ith girl is chosen
$B_i$: ith boy is chosen
n(G) = 5; n(B) = 10; n(S) = 10 + 5 = 15 students
(i) P(All are girls) =
$P(G_1 \cap G_2 \cap G_3) = P(G_1)\,P(G_2/G_1)\,P(G_3/G_1 \cap G_2) = \dfrac{5}{15} \times \dfrac{4}{14} \times \dfrac{3}{13} = \dfrac{2}{91}$
(ii) P(First two are boys and the third is a girl) =
$P(B_1 \cap B_2 \cap G_1) = P(B_1)\,P(B_2/B_1)\,P(G_1/B_1 \cap B_2) = \dfrac{10}{15} \times \dfrac{9}{14} \times \dfrac{5}{13} = \dfrac{15}{91}$
(iii) P(first and the third are of the same sex while the second is of the opposite sex) =
$P\{(B_1 \cap G_1 \cap B_2) \cup (G_1 \cap B_1 \cap G_2)\} = P(B_1 \cap G_1 \cap B_2) + P(G_1 \cap B_1 \cap G_2)$
$= \dfrac{10}{15} \times \dfrac{5}{14} \times \dfrac{9}{13} + \dfrac{5}{15} \times \dfrac{10}{14} \times \dfrac{4}{13} = \dfrac{5}{21}$

Example 6.9 could alternatively have been solved using the tree diagram representation, Figure 6.2.

Start (10B, 5G)
  B (10/15)
    B (9/14)
      B (8/13)
      G (5/13)
    G (5/14)
      B (9/13)
      G (4/13)
  G (5/15)
    B (10/14)
      B (9/13)
      G (4/13)
    G (4/14)
      B (10/13)
      G (3/13)
B = Boy, G = Girl
Figure 6.2 Tree diagram representation of

Tutorial Exercise 6.1
1. A box contains 10 screws, 3 of which are defective. Two screws are drawn at random. Find the probability that none of the screws is defective if they are drawn with and without replacement [49%, 47%]
2. A batch of 100 iron rods consists of 25 oversized, 25 undersized and 50 rods of the desired length. If two rods are drawn at random without replacement, what is the probability of obtaining:
(a) Two rods of desired length [24.75%]
(b) One of desired length [50.5%]
(c) None of desired length [24.75%]
(d) Two undersized rods [6.06%]
3. A box of 100 gaskets contains 10 gaskets with type A defects, 5 gaskets with type B defects and 2 gaskets
with both types of defects. Find the probability that a gasket with a type B defect is drawn given that it has a type A defect. [20%]

Bayes' Theorem
From the multiplication rule it has been shown that
$P(A \cap B) = P(A) \cdot P(B/A) = P(B) \cdot P(A/B)$, from which we obtain Bayes' theorem
$P(A/B) = \dfrac{P(A)}{P(B)}\,P(B/A)$
This theorem shows that $P(B/A) \ne P(A/B)$ unless P(A) = P(B).

Odds
Probabilities can also be expressed using 'odds'. If the odds in favour of an event A are a to b (or a:b), then
• The odds against event A are b to a (or b:a)
• $P(A) = \dfrac{a}{a+b}$
• $P(A') = \dfrac{b}{a+b}$

Tutorial Exercise 6.2
Find the probability that an event A will happen if the odds are:
(a) 7 to 3 in favour of A [7/10]
(b) 1 to 3 in favour of A [1/4]
(c) 3 to 2 against event A [2/5]
(d) 3 to 5 against event A [5/8]

6.5 PERMUTATIONS AND COMBINATIONS
By definition the probability of an event A in a sample space S is given as the number of outcomes that belong to event A divided by the total number of equally possible outcomes, i.e.:
$P(A) = \dfrac{n(A)}{n(S)}$
It is therefore necessary to be able to count the number of possible outcomes in various common situations.

6.5.1 Permutation
Consider n elements or objects. The arrangement of these objects in a row is called a permutation.
1. The number of possible arrangements (i.e. permutations) of n objects taken all at a time is:
$n(n-1)(n-2) \cdots 1 = n!$ (read as "n factorial")
2. The number of possible permutations of k objects selected from n (for k < n), without repetition, is given by:
$n(n-1)(n-2) \cdots (n-k+1) = \dfrac{n!}{(n-k)!} = {}^nP_k$ (read as "n permutation k").
With repetition the number of permutations is $n^k$.
3. The number of distinguishable permutations of n objects (taken all together) where there are $n_1$ of type 1, $n_2$ of type 2, ..., and $n_m$ of type m (such that $n = n_1 + n_2 + \cdots + n_m$), is given by:
$\dfrac{n!}{n_1!\, n_2! \cdots n_m!}$
This is so because the ith group of identical objects can only be arranged in $n_i!$ ways without changing the distinguishable permutation.

Example 6.10
There are three empty spaces on a bookshelf and there are seven different books available. Determine the number of ways in which these books can be arranged in the available spaces if the arrangement is done (a) without repetition and (b) with repetition.

SOLUTION
Take n = 7 and k = 3
(a) without repetition,
Number of permutations $= {}^nP_k = \dfrac{n!}{(n-k)!} = \dfrac{7!}{(7-3)!} = \dfrac{7 \times 6 \times 5 \times 4!}{4!} = 210$
(b) with repetition
Number of permutations $= n^k = 7^3 = 343$

Example 6.11
A set of snooker balls consists of a white, a yellow, a green, a brown, a blue, a pink, a black and 15 reds. How many distinguishable permutations of the balls are there?
SOLUTION
In total there are 22 balls, with the 15 red balls being indistinguishable. Therefore the number of distinguishable permutations is
$\dfrac{22!}{(1!)(1!)(1!)(1!)(1!)(1!)(1!)(15!)} = \dfrac{22!}{15!} = 859{,}541{,}760$

Example 6.12
Find the probability that in a group of k people, at least two have the same birthday (ignoring 29th February).

SOLUTION
Let A: all birthdays are different
$A'$: at least two birthdays are the same
There are 365 possible birthdays for each of the k people and hence the total possible number of outcomes $= n(S) = n^k = (365)^k$.
The number of outcomes for which all the birthdays are different, i.e. n(A), is
$n(A) = {}^{365}P_k = \dfrac{365!}{(365-k)!}$
$P(A) = \dfrac{n(A)}{n(S)} = \dfrac{365!}{(365-k)!\;365^k}$
$\therefore$ By the complementation rule the probability that at least two birthdays are the same is
$P(A') = 1 - P(A) = 1 - \dfrac{365!}{(365-k)!\;365^k}$

6.5.2 Combinations
A combination of any given objects is any selection of one or more of the objects without regard to order.
1. The number of combinations of k objects from n, taken without repetition/replacement is
${}^nC_k = \dfrac{n!}{(n-k)!\,k!} = \dbinom{n}{k}$ for $0 \le k \le n$
where ${}^nC_k$ is called the binomial coefficient since it also appears in the binomial expansion.
2. The number of combinations of k objects from n, taken with repetition/replacement is
24
$${}^{n+k-1}C_k = \binom{n+k-1}{k}$$

Example 6.13
How many samples of 5 screws can be selected from a box containing 500 screws if the selection is done (a) without replacement and (b) with replacement?

SOLUTION
n = 500, k = 5, therefore the number of combinations is
(a) Without repetition
$${}^{n}C_k = \frac{n!}{(n-k)!\,k!} = \frac{500!}{(500-5)!\,5!} \approx 2.55 \times 10^{11}$$
(b) With repetition
$${}^{n+k-1}C_k = {}^{504}C_5 \approx 2.66 \times 10^{11}$$

Home Study Exercise 6
1. A fair coin is tossed 100 million times. Let B be the total number of heads observed. Identify each of the following statements as true or false. Explain your answer (no computation is required).
(a) It is very probable that B is very close (within a few thousand) to 50 million.
(b) It is very probable that B/100 million is very close to 0.5; maybe between 0.49 and 0.51.
(c) Since Hs and Ts fall with complete irregularity, one cannot say anything about what happens in 100 million tosses.
(d) Hs and Ts fall about equally often. Therefore, if on the first 10 tosses we get all heads, it is more probable that the eleventh toss will yield a T than an H.
[(a) and (b) are true. Both are examples of statistical regularity]
2. If P(A) = 0.3 and P(B) = 0.4, and if A and B are mutually exclusive events, find the following:
(a) $P(\bar{A})$ (b) $P(\bar{B})$ (c) $P(A \cup B)$ (d) $P(A \cap B)$
[Answers 0.7, 0.6, 0.7, 0]
3. If P(A) = 0.4, P(B) = 0.5, and $P(A \cap B) = 0.1$, find $P(A \cup B)$. [0.8]
4. A certain ophthalmic trait is associated with eye colour. Three hundred randomly selected individuals are studied with results as follows:

                 Eye Colour
Trait     Blue   Brown   Other   Totals
Yes        70      30      20      120
No         20     110      50      180
Totals     90     140      70      300

What is the probability that
(a) a person selected at random has blue eyes?
(b) a person selected at random has the trait?
If the following events are defined
A: person has blue eyes
B: person has the trait
C: person has brown eyes
(c) Are A and B independent? Justify your answer.
(d) How are A and C related (independent, mutually exclusive, complementary, or all-inclusive)? Explain why each term does or does not apply.
[(a) 0.30 (b) 0.40 (c) not independent (d) mutually exclusive]
5. The probability that a blue-eyed person is left-handed is 1/7. The probability that a left-handed person is blue-eyed is 1/3. The probability that a person has neither of these attributes is 4/5. What is the probability that a person has both attributes? [1/45]
6. A job seeker attends an interview at an engineering firm. The probability that she will want the job (A) after the interview is 68%. The probability that the firm will want her service (B) is 36%. The probability that she will want the job given that the firm will want her service is 88%.
(a) Find the probability that after the interview she will want the job and the firm will want her service.
(b) Find the probability that after the interview the firm will want her service given that she will want the job. [0.3168, 0.4659]
7. Three flower seeds are randomly selected, without replacement, from a package that contains eight seeds for yellow flowers, eight seeds for red flowers, and four seeds for white flowers. Find the probability that
(a) all the three seeds will produce yellow flowers
(b) all three seeds selected are for the same flower colour [0.049, 0.10175]
8. In a certain class 15% of the students scored an A in Pure Mathematics and 10% scored an A in Statistics. 5% scored an A in both subjects. A student is selected at random.
(a) What is the probability that he received an A in either subject? [0.20]
(b) What is the probability that he received an A in Statistics given that he scored an A in Pure Maths?
9. In a lot of 10 items 2 are defective. Find the number of different samples of 4. Find the number of samples of 4 containing no defectives, 1 defective and 2 defective items. [210, 70, 112, 28]
10. How many different license plates showing 5 symbols, namely two letters followed by 3 digits can be made? [676,000]

TOPIC 7.0
RANDOM VARIABLES

Random variables are functions whose values are real numbers and depend on "chance". In experiments such as rolling dice or tossing coins, the random variable is used to count the number of "successes" or "failures". The occurrence of each success or failure totally depends on chance.

Section Objectives
At the end of this section you should be able to:
1. Define the random variable
2. Distinguish between discrete and continuous random variables.
3. Differentiate between discrete and continuous probability distributions
4. Find the mean and the variance of a probability distribution

7.1. DEFINITION OF THE RANDOM VARIABLE
A random variable (also called a stochastic variable or variate), X, is a function for which:
• X is defined on the sample space S of the experiment and its values are real numbers (i.e. X assigns a real number to each outcome in S)
• The set of all possible outcomes in S for which X = a (where a is a real number that X can take) has a well-defined probability.
Examples
(i) When two dice are rolled simultaneously, the sum (X) of the two numbers that turn up is an integer between
2 and 12 (both inclusive). The specific values that X takes depend on chance.
(ii) When two fair coins are tossed, the number of heads that turn up (X) is an integer equal to 0, 1 or 2.
(iii) If three screws are randomly selected from a box containing left-hand and right-hand screws, the number of left-hand screws drawn (X) is 0, 1, 2 or 3.

In a given experiment,
• the event corresponding to a number "a" is denoted by "X = a" and the corresponding probability that this event occurs by P(X = a).
• the probability that X assumes any value in the interval a < X < b is denoted by P(a < X < b)
• $P(X \le c)$ is the probability that X assumes a value smaller than c or equal to c.
• $P(X > c)$ is the probability that X assumes a value greater than c.

The events $X \le c$ and $X > c$ are mutually exclusive and their probabilities are summed as:
$$P(X \le c) + P(X > c) = P(-\infty < X < \infty) = P(S),$$
but P(S) = 1, and hence the complementation rule
$$P(X > c) = 1 - P(X \le c)$$

Example 7.1
Let X be the number of heads that turn up when two coins are tossed simultaneously. List the sample space and hence find:
(a) $P(X = 0)$
(b) $P(X = 1)$
(c) $P(X \le 1)$
(d) $P(X > 1)$
(e) $P(X = 1.5)$

SOLUTION
S = {(H, H), (H, T), (T, H), (T, T)}, n(S) = 4
X: number of heads that turn up
(a) $P(X = 0) = P(TT) = \frac{1}{4}$
(b) $P(X = 1) = P(TH) + P(HT) = \frac{1}{4} + \frac{1}{4} = \frac{1}{2}$
(c) $P(X \le 1) = P(TH) + P(HT) + P(TT) = \frac{3}{4}$
(d) $P(X > 1) = P(X = 2) = P(HH) = \frac{1}{4}$
OR
$P(X > 1) = 1 - P(X \le 1) = 1 - \frac{3}{4} = \frac{1}{4}$
(e) $P(X = 1.5) = 0$

Random variables can be classified as being either discrete or continuous depending on their representation.

7.2 Discrete Random Variables
A random variable X and its distribution is said to be discrete if:
• the number of values for which X has a probability different from 0 is finite or at most countably infinite.
• if an interval $a < X \le b$ does not contain such a value then $P(a < X \le b) = 0$

7.2.1 Discrete Probability Distribution
Let $x_1, x_2, x_3, \ldots$ be the values for which X has the probabilities $p_1, p_2, p_3, \ldots$ respectively, such that $P(X = x_1) = p_1$ and so on. The probability function of X is stated as:
$$f(x) = \begin{cases} p_j & \text{when } x = x_j \ (j = 1, 2, 3, \ldots) \\ 0 & \text{otherwise} \end{cases}$$
Since P(S) = 1,
$$\sum_{j=1}^{\infty} f(x_j) = 1 \quad \text{or} \quad \sum_{j=1}^{\infty} p_j = 1$$
Similarly
$$P(a < X \le b) = \sum_{a < x_j \le b} f(x_j) = \sum_{a < x_j \le b} p_j$$

7.2.2 Discrete distribution function (or Cumulative distribution function)
If X is any random variable then for any real number x there exists the probability $P(X \le x)$. The distribution function is thus defined as:
$$F(x) = P(X \le x)$$
If $a < X \le b$, then
$$P(a < X \le b) = P(X \le b) - P(X \le a) = F(b) - F(a)$$
Therefore the distribution function determines the distribution of X uniquely and can be used to compute probabilities.

In terms of the probability function, the distribution function is stated as:
$$F(x) = \sum_{x_j \le x} f(x_j) = \sum_{x_j \le x} p_j$$

Example 7.2
Show that the number that turns up when a fair die is rolled (X) is a discrete random variable. Sketch the probability and distribution functions.
SOLUTION
S = {1, 2, 3, 4, 5, 6}, n(S) = 6

x                       1     2     3     4     5     6
f(x) = P(X = x)        1/6   1/6   1/6   1/6   1/6   1/6
F(x) = P(X ≤ x)        1/6   2/6   3/6   4/6   5/6   6/6

For a discrete random variable,
$$\sum_{j} f(x_j) = 1$$
For the probability distribution
$$\sum_{j=1}^{6} f(x_j) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} + \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{6}{6} = 1$$
Therefore X is a discrete random variable.

[Figure: probability function f(x), equal to 1/6 at x = 1, 2, …, 6, and distribution function F(x), a step function rising from 0 to 1]

Example 7.3
A bag contains seven red balls and three white balls. Three balls are drawn at random without replacement. Find the probability distribution for the number of red balls (X) drawn.

SOLUTION
Let R = red ball and W = white ball
On the first draw P(R) = 7/10 and P(W) = 3/10
S = {RRR, RRW, RWR, RWW, WRR, WRW, WWR, WWW}, n(S) = 8
X: number of red balls drawn
(a) $P(X = 0) = P(WWW) = \frac{3}{10} \times \frac{2}{9} \times \frac{1}{8} = \frac{1}{120}$
(b) $P(X = 1) = P(RWW) + P(WRW) + P(WWR) = \left(\frac{3}{10} \times \frac{2}{9} \times \frac{7}{8}\right) \times 3 = \frac{7}{40}$
(c) $P(X = 2) = P(RRW) + P(RWR) + P(WRR) = \left(\frac{3}{10} \times \frac{7}{9} \times \frac{6}{8}\right) \times 3 = \frac{21}{40}$
(d) $P(X = 3) = P(RRR) = \frac{7}{10} \times \frac{6}{9} \times \frac{5}{8} = \frac{7}{24}$

7.3 Continuous Random Variables and Continuous Distributions
A random variable X is said to have a continuous distribution if X is defined for a continuous range of values within given limits (often $-\infty$ to $\infty$). Examples include height, volume, weight, temperature, etc.

7.3.1 Probability Density Function (pdf)
The probability density function (pdf), f(x), of a continuous random variable X is defined such that:
$$P(x < X \le x + dx) = f(x)\,dx$$
For a function f(x) of a continuous random variable to be a pdf,
$$\int_{-\infty}^{\infty} f(x)\,dx = \int_{a_1}^{a_2} f(x)\,dx = 1$$
where $a_1$ and $a_2$ are the limits that define the domain of the random variable.
Similarly,
$$P(x_1 < X \le x_2) = \int_{x_1}^{x_2} f(x)\,dx$$
This probability can be represented graphically as the area under the curve of f(x) between the limits $x_1$ and $x_2$ (Figure 7.1).

Figure 7.1 Probability density function for a continuous random variable X

7.3.2 The Cumulative Probability Function F(x)
The cumulative probability function F(x) for a continuous random variable is given by:
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du = \int_{a_1}^{x} f(u)\,du$$
where u is a (dummy) integration variable. It can be shown that
$$P(x_1 < X \le x_2) = F(x_2) - F(x_1)$$

Example 7.4
A continuous random variable X has a probability function
$$f(x) = \begin{cases} \frac{1}{4}(2x + 3) & 0 \le x \le 1 \\ 0 & \text{otherwise.} \end{cases}$$
(a) Show that X is a continuous random variable
(b) Find $P(0 \le X \le 1/2)$
SOLUTION
(a) For a continuous random variable X with a probability function f(x) to be a pdf,
$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$
For the given function
$$\int_{-\infty}^{\infty} f(x)\,dx = \frac{1}{4}\int_{0}^{1} (2x + 3)\,dx = \frac{1}{4}\left[x^2 + 3x\right]_0^1 = \frac{1}{4}(4) = 1$$
Therefore X is a continuous random variable.
(b)
$$P(0 \le X \le 1/2) = \frac{1}{4}\int_{0}^{1/2} (2x + 3)\,dx = \frac{1}{4}\left[x^2 + 3x\right]_0^{1/2} = \frac{7}{16}$$
OR ALTERNATIVELY
$$F(x) = \int_{-\infty}^{x} f(u)\,du = \frac{1}{4}\int_{0}^{x} (2u + 3)\,du = \frac{1}{4}\left[u^2 + 3u\right]_0^x = \frac{x}{4}(x + 3)$$
$$P(0 \le X \le 1/2) = F(1/2) - F(0) = \frac{1}{8}\left(\frac{1}{2} + 3\right) - 0 = \frac{7}{16}$$

Example 7.5
Given the continuous random variable
$$f(x) = \begin{cases} \dfrac{2}{x^3} & 1 \le x < \infty \\ 0 & \text{otherwise} \end{cases}$$
Find F(x).

SOLUTION
$$F(x) = \int_{-\infty}^{x} f(u)\,du = \int_{1}^{x} \frac{2}{u^3}\,du = \left[-\frac{2}{2u^2}\right]_1^x = 1 - \frac{1}{x^2}$$
Therefore,
$$F(x) = \begin{cases} 1 - \dfrac{1}{x^2} & \text{for } 1 \le x < \infty \\ 0 & \text{otherwise} \end{cases}$$

For any fixed a and b (> a), in the case of a continuous random variable X the probabilities corresponding to the intervals $a < x \le b$, $a < x < b$ and $a \le x \le b$ are the same. However they differ in the case of discrete random variables.

Home Study Exercise 7.1
1. A random variable X has a pdf $f(x) = \frac{1}{80}(3x^2 + 4)$ in the interval 0 < x < 4 (and zero elsewhere). Verify that it is a pdf and find F(x).
2. A random variable X has a pdf $f(x) = e^{-x}$ in the interval $0 < x < \infty$ (and zero elsewhere). Verify that it is a pdf and find $P(1 \le X \le 2)$. [0.23]
3. Find and graph the probability function f(x) of the random variable X = the sum of the three numbers obtained in rolling three fair dice.
4. Let X have the density f(x) = kx in the interval 0 < x < 2. Find k. Find x such that (a) $P(X \le x) = 10\%$, (b) $P(X \le x) = 95\%$. [k = 0.5, x = 0.632, 1.95]
5. Suppose that certain bolts have lengths L = 200 + X mm, where X is a random variable with density $f(x) = \frac{3}{4}(1 - x^2)$ when $-1 \le x \le 1$ and 0 otherwise. Determine c so that with a probability of 95% a bolt will have any length between 200 - c and 200 + c.
6. Find the probability that none of 3 bulbs in a signal-light will have to be replaced during the first 1200 hours of operation if the lifetime X of a bulb is a random variable with the density $f(x) = 6[0.25 - (x - 1.5)^2]$ where 1 < x < 2 and f(x) = 0 otherwise, where x is measured in multiples of 1000 hours. [72%]

7.4 Mean and Variance of a probability distribution
It is conventional to characterise the probability function f(x) in terms of the mean and variance.

7.4.1 Mean or Expectation
Let X be a random variable with a probability function f(x). The mean (also called the expectation of X), denoted by E[X] or $\mu$, is defined as:
$$E[X] = \mu = \begin{cases} \displaystyle\sum_i x_i f(x_i) & \text{for a discrete distribution} \\[2ex] \displaystyle\int_{-\infty}^{\infty} x f(x)\,dx & \text{for a continuous distribution} \end{cases}$$

Example 7.6
Two fair dice are rolled and the sum of the numbers on the upturned faces (X) is recorded. Find the expectation of X.

SOLUTION
By definition for discrete distributions
$$E[X] = \sum_i x_i f(x_i)$$

x_i       f(x_i) = P(X = x_i)     x_i f(x_i)
2              1/36                  2/36
3              2/36                  6/36
4              3/36                 12/36
5              4/36                 20/36
6              5/36                 30/36
7              6/36                 42/36
8              5/36                 40/36
9              4/36                 36/36
10             3/36                 30/36
11             2/36                 22/36
12             1/36                 12/36
TOTALS         1.00              252/36 = 7

Therefore the expectation is equal to 7.

Properties of the Mean (or Expectation)
The following properties are evident from the definition of the mean:
• If c is an arbitrary constant then $E[c] = c$
• $E[c\,g(X)] = c\,E[g(X)]$
• $E[c_1 g(X) + c_2 h(X)] = c_1 E[g(X)] + c_2 E[h(X)]$, where $c_1$ and $c_2$ are constants.
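The expectation in Example 7.6 and the linearity property listed above can be checked by enumerating the 36 equally likely outcomes. The sketch below is illustrative only; the helper `expectation` and the constants `c1` and `c2` are assumptions chosen for this example and are not part of the notes.

```python
from fractions import Fraction
from itertools import product

# Probability function of X = sum of two fair dice (Example 7.6)
f = {}
for a, b in product(range(1, 7), repeat=2):
    f[a + b] = f.get(a + b, 0) + Fraction(1, 36)


def expectation(g, f):
    """E[g(X)] = sum of g(x) * f(x) over all values x with f(x) > 0."""
    return sum(g(x) * p for x, p in f.items())


mean = expectation(lambda x: x, f)

# Linearity: E[c1*g(X) + c2*h(X)] = c1*E[g(X)] + c2*E[h(X)]
c1, c2 = 3, 5  # arbitrary constants, used only for illustration
lhs = expectation(lambda x: c1 * x + c2 * x**2, f)
rhs = c1 * expectation(lambda x: x, f) + c2 * expectation(lambda x: x**2, f)
```

Exact fractions are used so that the total probability and the mean come out as exact values rather than rounded decimals.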
7.4.2 Variance
Let X be a random variable with a mean $\mu$. The variance of the probability distribution, denoted by V[X] or $\sigma^2$, is defined by:
$$V[X] = E[(X - \mu)^2] = \begin{cases} \displaystyle\sum_j (x_j - \mu)^2 f(x_j) & \text{for a discrete distribution} \\[2ex] \displaystyle\int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx & \text{for a continuous distribution} \end{cases}$$
The variance is never negative and its square root is known as the standard deviation ($\sigma$).

Properties of the Variance
From the definition of the variance the following properties can be derived. If a and b are constants then
• $V[a] = 0$
• $V[aX + b] = a^2 V[X]$
• $V[aX + bY] = a^2 V[X] + b^2 V[Y]$, where X and Y are independent random variables.
• $V[X + Y] = V[X - Y] = V[X] + V[Y]$ for independent X and Y.

A useful result that relates the variance, the mean and the second moment of X about zero is derived below:
$$V[X] = E[(X - \mu)^2] = E[X^2 - 2\mu X + \mu^2] = E[X^2] - 2\mu E[X] + \mu^2 = E[X^2] - \mu^2$$
where $E[X^2]$ is the second moment of the probability distribution. In general the k-th moment of a distribution is defined by
$$E[X^k] = \begin{cases} \displaystyle\sum_j x_j^k f(x_j) & \text{for a discrete distribution} \\[2ex] \displaystyle\int_{-\infty}^{\infty} x^k f(x)\,dx & \text{for a continuous distribution} \end{cases}$$

Example 7.7
A biased six-sided die has probabilities p/2, p, p, p, p, 2p of showing 1, 2, 3, 4, 5, 6 respectively. Find the mean, the variance and the second moment of this probability distribution.

SOLUTION
$$P(S) = P(1 \cup 2 \cup 3 \cup 4 \cup 5 \cup 6) = \frac{p}{2} + p + p + p + p + 2p = \frac{13p}{2}$$
But P(S) = 1, hence 13p/2 = 1 or p = 2/13. With $d = x_i - \mu$:

x_i    p_i     x_i p_i     d         d^2         d^2 p_i      x_i^2    x_i^2 p_i
1      1/13     1/13     -40/13    1600/169    1600/2197       1        1/13
2      2/13     4/13     -27/13     729/169    1458/2197       4        8/13
3      2/13     6/13     -14/13     196/169     392/2197       9       18/13
4      2/13     8/13      -1/13       1/169       2/2197      16       32/13
5      2/13    10/13      12/13     144/169     288/2197      25       50/13
6      4/13    24/13      25/13     625/169    2500/2197      36      144/13
Total   1      53/13                            480/169               253/13

$$E[X] = \sum x_i p_i = \frac{53}{13}$$
$$V[X] = \sum (x_i - \mu)^2 p_i = \sum d^2 p_i = \frac{480}{169}$$
$$E[X^2] = \sum x_i^2 p_i = \frac{253}{13}$$
Note that $V[X] = E[X^2] - (E[X])^2$.

Example 7.8
A small filling station is supplied with gasoline weekly. Assume that the volume of sales in thousands of litres has the density function $f(x) = 6x(1 - x)$ when 0 < x < 1 and 0 otherwise. Find the mean and the variance of X.

SOLUTION
By definition
$$E[X] = \int_{-\infty}^{\infty} x f(x)\,dx = 6\int_0^1 x^2(1 - x)\,dx = 6\int_0^1 (x^2 - x^3)\,dx = 6\left[\frac{x^3}{3} - \frac{x^4}{4}\right]_0^1 = \frac{1}{2}$$
Hence E[X] = 500 litres.
$$V[X] = E[X^2] - (E[X])^2$$
$$E[X^2] = \int_{-\infty}^{\infty} x^2 f(x)\,dx = 6\int_0^1 x^3(1 - x)\,dx = 6\int_0^1 (x^3 - x^4)\,dx = 6\left[\frac{x^4}{4} - \frac{x^5}{5}\right]_0^1 = \frac{3}{10}$$
Hence
$$V[X] = \frac{3}{10} - \left(\frac{1}{2}\right)^2 = \frac{1}{20}$$
in units of (thousand litres)$^2$. The standard deviation is therefore $\sqrt{1/20} \approx 0.224$ thousand litres, i.e. about 224 litres.

Home Study Exercise 7.2
1. Find the mean and the variance of a discrete random variable having the probability distribution f(0) = 1/4, f(1) = 1/2, f(2) = 1/4. [1, 1/2]
2. Let X be the diameter of bolts in a production. Assume that X has the density $f(x) = k(x - 0.9)(1.1 - x)$ if 0.9 < x < 1.1. Determine k, graph f(x) and find the mean and the variance. [k = 750, $\mu$ = 1, $\sigma^2$ = 0.002]
3. Assume that the mileage (in thousands of kilometres) which car owners get with a certain kind of tires is a random variable X having the density $f(x) = \theta e^{-\theta x}$ if x > 0 and zero otherwise. Here $\theta > 0$ is a parameter.
(a) What mileage can the car owner expect to get with one of these tires?
(b) Find the probability that the tire will last at least 30,000 kilometres
4. A random variable X has a pdf $f(x) = \frac{1}{80}(3x^2 + 4)$ in the interval 0 < x < 4 (and zero elsewhere). Find the mean, the variance and the second moment of X.
5. Show that $V[aX + bY] = a^2 V[X] + b^2 V[Y]$, where X and Y are independent random variables.

TOPIC 8.0
PROBABILITY DISTRIBUTIONS

Common probability distributions that are encountered in physical applications can be classified as being either discrete or continuous.
Discrete probability distributions include the Binomial distribution, the Hypergeometric distribution and the Poisson distribution. The most frequently referred to continuous probability distribution is the Normal distribution (also called Gaussian distribution).

Section Objectives
At the end of this section you should be able to:
1. Describe the Binomial, Hypergeometric and Poisson distributions and use them to calculate probabilities.
2. Describe the Normal distribution as applied in the determination of probabilities.
3. Find the mean and variance of common discrete and continuous probability distributions
4. Justify the use of specific probability distributions in solving problems.

8.1 DISCRETE PROBABILITY DISTRIBUTIONS
Discrete random variables have probability distributions that are discontinuous, i.e. cannot be represented on a continuous scale. Examples of important discrete distributions include the Binomial and Poisson distributions.

8.1.1 The Binomial Distribution
Let A be an event which occurs in n independent performances of an experiment, such that P(A) = p, the probability that A occurs in a single trial. The probability that A does not occur is denoted by $P(\bar{A}) = q = 1 - p$. If X is the number of times that A occurs, then the Binomial distribution is defined as:
$$P(X = x) = f(x) = \binom{n}{x} p^x q^{n-x} = {}^{n}C_x\, p^x (1 - p)^{n-x}, \quad x = 0, 1, \ldots, n$$
This distribution is also called the Bernoulli distribution. The occurrence of the event A is called a success and its complement, $\bar{A}$, denotes a failure.

For the binomial distribution
• The mean $E[X] = np$
• The variance $V[X] = npq$
• $P(X = x + 1) = \dfrac{p}{q}\left(\dfrac{n - x}{x + 1}\right) P(X = x)$, which is called the Binomial recurrence formula. It enables successive probabilities $P(X = x + k)$, k = 0, 1, 2, …, to be calculated once P(X = x) is known.

NOTE:
$$\sum_{x=0}^{n} f(x) = \sum_{x=0}^{n} {}^{n}C_x\, p^x (1 - p)^{n-x} = [p + (1 - p)]^n = 1,$$
as required.

Example 8.1
20% of the components that a machine produces are defective. If 6 components are selected at random, find the probability distribution and hence determine the probability that
(a) there will be 4 defective items
(b) there will be not more than three defective items
(c) all the items will be non-defective
(d) there will be at least one defective item.

SOLUTION
Let A: Component is defective
X: number of defective components
P(A) = p = 20% = 0.2, q = 1 - p = 0.8
The Binomial distribution is defined as
$$P(X = x) = {}^{n}C_x\, p^x (1 - p)^{n-x} = \binom{6}{x} (0.2)^x (0.8)^{6-x}$$
and the recurrence formula is
$$P(X = x + 1) = \frac{p}{q}\left(\frac{n - x}{x + 1}\right) P(X = x) = \frac{1}{4}\left(\frac{6 - x}{x + 1}\right) P(X = x)$$

x    P(X = x) = f(x)                                        F(x)
0    C(6,0)(0.2)^0(0.8)^6 = 0.2621                          0.2621
1    C(6,1)(0.2)^1(0.8)^5 = 0.3932
     OR (1/4)((6-0)/(0+1))(0.2621) = 0.3932                 0.6553
2    (1/4)((6-1)/(1+1))(0.3932) = 0.2458                    0.9011
3    (1/4)((6-2)/(2+1))(0.2458) = 0.0819                    0.9830
4    (1/4)((6-3)/(3+1))(0.0819) = 0.0154                    0.9984
5    (1/4)((6-4)/(4+1))(0.0154) = 0.0015                    0.9999
6    (1/4)((6-5)/(5+1))(0.0015) = 0.0001                    1.0000

(a) P(X = 4) = f(4) = 0.0154
(b) $P(X \le 3)$ = F(3) = 0.9830
(c) P(X = 0) = f(0) = 0.2621
(d) P(at least one defective item) = $P(X \ge 1)$ = 1 - P(X < 1) = 1 - P(X = 0) = 1 - 0.2621 = 0.7379

8.1.2 The Hypergeometric Distribution
The Binomial distribution is important in sampling with replacement. Consider for example a box that contains N screws of which M are defective. If a screw is drawn at random, the probability of obtaining a defective screw is
$$p = \frac{M}{N}$$
Hence in drawing n screws with replacement, the probability that precisely x screws are defective is given by the Binomial distribution i.e.
$$P(X = x) = f(x) = \binom{n}{x} p^x q^{n-x}$$
In the case of sampling without replacement, the probability is
$$P(X = x) = f(x) = \frac{\dbinom{M}{x}\dbinom{N - M}{n - x}}{\dbinom{N}{n}}, \quad x = 0, 1, \ldots, n$$
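The two sampling formulas above can be compared side by side in a short Python sketch. The function names `binomial_pmf` and `hypergeom_pmf` and the values N = 10, M = 3, n = 2 are assumptions chosen here for illustration; `math.comb` assumes Python 3.8 or later. The last line also recomputes the Example 8.1 distribution (n = 6, p = 0.2) as a check on the table.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) when n items are drawn WITH replacement, P(defective) = p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def hypergeom_pmf(x, n, N, M):
    """P(X = x) when n items are drawn WITHOUT replacement from N items, M defective."""
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))


N, M, n = 10, 3, 2          # illustrative box: 10 items, 3 defective, sample of 2
p = Fraction(M, N)

with_replacement = [binomial_pmf(x, n, p) for x in range(n + 1)]
without_replacement = [hypergeom_pmf(x, n, N, M) for x in range(n + 1)]

# Example 8.1: n = 6 components, p = 0.2 (kept exact as 1/5, then converted to float)
example_8_1 = [float(binomial_pmf(x, 6, Fraction(1, 5))) for x in range(7)]
```

Both lists of probabilities sum to 1, and the difference between them shrinks as N grows relative to n, which is why the binomial formula is often used for large populations.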
The distribution for sampling without replacement is called the hypergeometric distribution, in which there are
(a) $\dbinom{N}{n}$ different ways of picking n objects from N.
(b) $\dbinom{M}{x}$ different ways of picking x defectives from M
(c) $\dbinom{N - M}{n - x}$ different ways of picking n - x non-defectives from N - M
Each way in (b) combined with each way in (c) gives the total number of mutually exclusive ways of obtaining x defectives in n drawings without replacement. (a) gives the total number of equally possible outcomes.

For the Hypergeometric distribution
• the mean $E[X] = np = n\dfrac{M}{N}$
• the variance $V[X] = \dfrac{nM(N - M)(N - n)}{N^2(N - 1)}$

Example 8.2
Random samples of 2 gaskets are drawn from a box containing 10 gaskets, 3 of which are defective. Find the probability function of the number of defective items in the sample (X) and the probability distributions when sampling is done with and without replacement. Find also the mean and variance of X.

SOLUTION
N = 10, M = 3, N - M = 7, n = 2. Therefore p = M/N = 0.3
For sampling with replacement
$$f(x) = \binom{n}{x} p^x (1 - p)^{n-x} = \binom{2}{x} (0.3)^x (0.7)^{2-x}, \quad x = 0, 1 \text{ and } 2$$
f(0) = 0.7^2 = 0.49; f(1) = 2 x 0.3 x 0.7 = 0.42; f(2) = 0.3^2 = 0.09
For sampling without replacement
$$f(x) = \binom{M}{x}\binom{N - M}{n - x} \Big/ \binom{N}{n} = \binom{3}{x}\binom{7}{2 - x} \Big/ \binom{10}{2}, \quad x = 0, 1 \text{ and } 2$$
f(0) = f(1) = 21/45 ≈ 0.47; f(2) = 3/45 ≈ 0.07.
E[X] = np = 2(0.3) = 0.6
For sampling without replacement,
$$V[X] = \frac{(2)(3)(10 - 3)(10 - 2)}{10^2(10 - 1)} = \frac{28}{75} \approx 0.3733$$

NOTE: If N, M and N - M are large compared with n then the Hypergeometric distribution may be approximated by the binomial distribution with p = M/N. Thus in sampling an "infinite population" the binomial distribution is frequently used.

8.1.3 The Poisson Distribution
This is the discrete distribution with the probability function
$$P(X = x) = f(x) = \frac{\mu^x e^{-\mu}}{x!} \quad \text{for } x = 0, 1, 2, \ldots$$
The use of the Poisson distribution is justified when
• $p \to 0$ and $n \to \infty$ such that np < 10 or preferably np < 5.
• $\mu = np = \sigma^2$, i.e. the mean is equal to the variance.
This distribution provides probabilities of numbers of cars passing at a given point per unit interval of time, numbers of defects per unit length of wire, or per unit area of textile, etc.

Example 8.3
If 2% of the electric light bulbs produced by a company are defective, find the probability that in a sample of 60 bulbs
(a) 3 bulbs,
(b) not more than three bulbs,
(c) at least 2 bulbs will be defective.

SOLUTION
Here p = 2% = 0.02, n = 60. $\mu = np = 1.2 < 5$ and hence we use the Poisson distribution
$$P(X = x) = f(x) = \frac{\mu^x e^{-\mu}}{x!} = \frac{1.2^x e^{-1.2}}{x!}$$
(a) $P(X = 3) = \dfrac{1.2^3 e^{-1.2}}{3!} = 0.0867$
(b) P(no more than three defective bulbs) =
$$P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = \frac{e^{-1.2}}{0!} + \frac{1.2\,e^{-1.2}}{1!} + \frac{1.2^2 e^{-1.2}}{2!} + 0.0867 = 0.9662$$
(c) P(at least two defective bulbs) =
$$P(X \ge 2) = 1 - P(X < 2) = 1 - \{P(X = 0) + P(X = 1)\} = 1 - \{e^{-1.2} + 1.2\,e^{-1.2}\} = 0.3374$$

Home Study Exercise 8.1
1. Four fair coins are tossed simultaneously. Find the probability function of the random variable X = number of heads and compute the probabilities of obtaining no heads, precisely 1 head, at least 1 head, not more than 3 heads. [0.0625, 0.25, 0.9375, 0.9375]
2. Let p = 1% be the probability that a certain kind of light bulb will fail in a 24 hour test. Find the probability that a sign consisting of 10 such bulbs will burn 24 hours with no bulb failures. [90.4%]
3. Show that the Poisson distribution function satisfies $F(\infty) = 1$.
4. Suppose that in the production of 50-ohm radio resistors, nondefective items are those that have a resistance between 45 and 55 ohms and the probability of a resistor being defective is 0.2%. The resistors are sold in lots of 100, with the guarantee that all resistors are nondefective. What is the probability that a given lot will violate this guarantee? (Use the Poisson Distribution.) [0.1813]
5. A carton contains 20 fuses, five of which are defective. Find the probability that if a sample of 3 fuses is chosen from the carton by random drawing without replacement, x (x = 0, 1, 2, 3) fuses in the sample will be defective. [91/228, 35/76, 5/38, 1/114]
6. A distributor sells rubber bands in packages of 100 and guarantees that at most 10% are defective. A consumer controls each package by drawing at least 10 bands without replacement. If the sample contains no defective rubber bands she accepts the package. Otherwise she rejects it. Find the probability that any given package is accepted although it contains 20 defective rubber bands. [9.5%]
7. A process of manufacturing screws is checked every hour by inspecting n screws selected at random from that hour's production. If one or more screws are defective the process is halted and carefully examined. How large should n be if the manufacturer wants the probability to be about 95% that the process will be halted when 10% of the screws being produced are
defective? (Assume independence of the quality of any item of that of the other items.)
8. If a magnetic tape contains on average, 2 defects per 100 metres, what is the probability that a roll of 300 metres long will contain (a) x defects and (b) no defects?

8.2 CONTINUOUS PROBABILITY DISTRIBUTIONS
Data representing measured quantities such as length, mass, current, temperature, luminous intensity, etc. are continuous and their probability distributions approximate to the Gauss or Normal distribution.

8.2.1 The Normal Probability Distribution
The normal probability distribution has the density
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$
where $\mu$ is the mean and $\sigma$ is the standard deviation. A random variable having this distribution is said to be normally distributed.
The normal probability curve is a graphical representation of the normal distribution. This curve exhibits symmetry about the mean ($\mu$).

Figure 8.1 The Normal probability curve

From the normal probability curve it can be shown that:
• About $66\frac{2}{3}\%$ of the data, i.e. two-thirds of the values, lie between $\mu - \sigma$ and $\mu + \sigma$.
• About 95% of the data lie between $\mu - 2\sigma$ and $\mu + 2\sigma$.
• About 99¾% of the data lie between $\mu - 3\sigma$ and $\mu + 3\sigma$.

Figure 8.2 Normal probability estimates

Example 8.4
The mean mass of 200 people is 67 kg and the standard deviation is 7 kg. Assuming the masses to be normally distributed, determine how many people
(a) have a mass between 60 and 74 kg.
(b) have a mass of more than 81 kg.
(c) have a mass of between 53 and 88 kg.

SOLUTION
Given that $\mu$ = 67 kg and $\sigma$ = 7 kg, n = 200 people
Compute
$\mu - \sigma = 60$, $\mu + \sigma = 74$
$\mu - 2\sigma = 53$, $\mu + 2\sigma = 81$
$\mu - 3\sigma = 46$, $\mu + 3\sigma = 88$
(a) P(mass between 60 and 74) = $P(\mu - \sigma \le X \le \mu + \sigma)$ = 2/3 ≈ 0.6667
E[X] = np = 200(0.6667) ≈ 134 people.
(b) P(mass of more than 81 kg) = $P(X > 81) = P(X > \mu + 2\sigma) = \frac{1}{2}(100 - 95)\% = 0.025$
Hence E[X] = np = 200 x 0.025 = 5 people.
(c) P(mass between 53 and 88 kg) = $P(\mu - 2\sigma \le X \le \mu + 3\sigma) = \frac{1}{2}(95 + 99.75)\% \approx 0.9738$
E[X] = 200(0.9738) ≈ 195 people
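The answers in Example 8.4 rest on the rounded percentages quoted above. As a rough cross-check, the same counts can be computed from the exact normal distribution function. The sketch below assumes Python 3.8 or later for `statistics.NormalDist`; the helper `expected_count` is a name introduced here for illustration only.

```python
from statistics import NormalDist

# Masses in Example 8.4: mean 67 kg, standard deviation 7 kg, 200 people
masses = NormalDist(mu=67, sigma=7)
n_people = 200


def expected_count(lo=None, hi=None):
    """Expected number of people with lo < mass <= hi (None means unbounded)."""
    lower = masses.cdf(lo) if lo is not None else 0.0
    upper = masses.cdf(hi) if hi is not None else 1.0
    return n_people * (upper - lower)


between_60_74 = expected_count(60, 74)   # within one standard deviation
above_81 = expected_count(lo=81)         # more than two standard deviations above
between_53_88 = expected_count(53, 88)   # from mu - 2*sigma to mu + 3*sigma
```

The exact values come out at roughly 136.5, 4.6 and 195.2 people, close to the estimates obtained from the two-thirds, 95% and 99¾% rules.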
Figure 8.2 (panels): $P(\mu - \sigma \le X \le \mu + \sigma) = 0.6667$; $P(\mu - 2\sigma \le X \le \mu + 2\sigma) = 0.95$; $P(\mu - 3\sigma \le X \le \mu + 3\sigma) = 0.9975$. In all cases the probability = area under the curve taken within the specified limits.

8.2.2 The Standard Normal Probability Distribution
The effects of changing $\mu$ and $\sigma$ are only to shift the probability curve along the x-axis or to broaden and narrow it, respectively. Therefore all normal distribution curves are equivalent in that a change of origin and scale can reduce them to a standard form.
Consider the random variable $Z = (X - \mu)/\sigma$. The standard normal distribution is defined by the density
$$\varphi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}$$
and has mean $\mu = 0$ and variance $\sigma^2 = 1$. The random variable Z is called the standard variate.

The area under the standard normal probability curve represents the probability. Given the limits $(z_1, z_2)$ the area under the curve is evaluated from
$$P(x_1 \le X \le x_2) = P(z_1 \le Z \le z_2) = \frac{1}{\sqrt{2\pi}} \int_{z_1}^{z_2} e^{-\frac{z^2}{2}}\,dz$$

Tables are available that give the tabulated solutions to this integral for Z-values between 0.00 and 3.99. Two common conventions are given in Appendix A and Appendix B.

Standard Normal Distribution Tables
The area under the standard normal probability curve is the probability. The following conventions are used in tabulating the standard probability distribution:

1. In Appendix A, the shaded area under the curve gives $P(Z < a) = \Phi(a)$ for values of z in the interval $0 \le z \le 4$. Here $\Phi(-a) = 1 - \Phi(a)$.

The following considerations should be made when using this table:
• $P(Z \le z) = \begin{cases} \Phi(z) & \text{if } z \ge 0 \\ 1 - \Phi(|z|) & \text{if } z < 0 \end{cases}$
• $P(Z > a) = \Phi(-a) = 1 - \Phi(a)$
• $P(a \le Z \le b) = \Phi(b) - \Phi(a)$
Recall that $Z = (X - \mu)/\sigma$.

2. In Appendix B, the shaded area gives $P(0 < Z < a) = \Phi(a)$ for values of z in the interval $0 \le z \le 4$. By symmetry $\Phi(-a) = \Phi(a)$.

The following considerations should be made when using this table:
• $\Phi(z) = P(0 < Z < z)$
• $P(Z \le a) = \begin{cases} 0.5 + \Phi(a) & \text{if } a \ge 0 \\ 0.5 - \Phi(a) & \text{if } a < 0 \end{cases}$
• $P(Z \ge a) = \begin{cases} 0.5 - \Phi(a) & \text{if } a \ge 0 \\ 0.5 + \Phi(a) & \text{if } a < 0 \end{cases}$
• $P(a \le Z \le b) = \begin{cases} \Phi(b) - \Phi(a) & \text{if } a \ge 0 \text{ and } b \ge 0 \\ \Phi(b) + \Phi(a) & \text{if } a \le 0 \text{ and } b \ge 0 \end{cases}$
Again $Z = (X - \mu)/\sigma$.

Example 8.5
A certain machine produces components that have a mean length of 15 cm and a standard deviation of 0.2 cm. In a batch of 1000 components, determine the number of components that are likely to:
(a) have a length of less than 14.95 cm.
(b) have a length of between 14.95 and 15.15 cm.
(c) have a length larger than 15.43 cm.

SOLUTION
Given that n = 1000, $\mu$ = 15 cm and $\sigma$ = 0.2 cm
(a) P(length less than 14.95 cm) = $P(X < 14.95) = P(Z < z_1)$
where $z_1 = \dfrac{x_1 - \mu}{\sigma} = \dfrac{14.95 - 15}{0.2} = -0.25$
Using Appendix A,
$P(Z < -0.25) = \Phi(-0.25) = 1 - \Phi(0.25) = 1 - 0.5987 = 0.4013$
OR
Using Appendix B,
$P(Z \le -0.25) = 0.5 - \Phi(0.25) = 0.5 - 0.0987 = 0.4013$
E[X] = np = 1000(0.4013) ≈ 402 components

(b) P(length between 14.95 cm and 15.15 cm) = $P(14.95 \le X \le 15.15) = P(-0.25 \le Z \le z_2)$
where $z_2 = \dfrac{15.15 - 15}{0.2} = 0.75$
Therefore
Using Appendix A,
$P(-0.25 \le Z \le 0.75) = \Phi(0.75) - \Phi(-0.25) = \Phi(0.75) - [1 - \Phi(0.25)] = 0.7734 - 0.4013 = 0.3721$
OR
Using Appendix B,
$P(-0.25 \le Z \le 0.75) = \Phi(0.75) + \Phi(0.25) = 0.2734 + 0.0987 = 0.3721$
E[X] = np = 1000(0.3721) ≈ 373 components

(c) P(length larger than 15.43) = $P(X > 15.43) = P(Z > z_3)$
where $z_3 = \dfrac{15.43 - 15}{0.2} = 2.15$
Using Appendix A,
$P(Z > 2.15) = 1 - \Phi(2.15) = 1 - 0.9842 = 0.0158$
OR
Using Appendix B,
$P(Z > 2.15) = 0.5 - \Phi(2.15) = 0.5 - 0.4842 = 0.0158$
E[X] = np = 1000(0.0158) ≈ 16 components

Example 8.6
If X is normally distributed with mean $\mu$ and variance $\sigma^2$, calculate the probabilities that X lies within $1\sigma$, $2\sigma$ and $3\sigma$ of the mean.

SOLUTION
$P(\mu - \sigma \le X \le \mu + \sigma) = P(-1 \le Z \le 1)$, since $Z = (X - \mu)/\sigma$.
Using Appendix B
$P(-1 \le Z \le 1) = \Phi(1) + \Phi(-1) = 2\Phi(1) = 2(0.3413) = 68.3\%$
Similarly for $2\sigma$
$P(\mu - 2\sigma \le X \le \mu + 2\sigma) = P(-2 \le Z \le 2) = 2\Phi(2) = 2(0.4772) = 95.4\%$
For $3\sigma$
$P(\mu - 3\sigma \le X \le \mu + 3\sigma) = P(-3 \le Z \le 3) = 2\Phi(3) = 2(0.4987) = 99.7\%$

Home Study Exercises 8.2
(c) be larger than 15.43 cm. 1. Let X be normal with a mean 0 and variance 1.
Determine the constant c such that P( X  c )  45%
33
[1.645]
2. If the thickness of iron plates X (in mm) is normally distributed with a mean of 10 mm and a variance of $4 \times 10^{-4}$ mm², find the percentage of defective plates expected, assuming that the defective plates (a) are thinner than 9.97 mm, (b) are thicker than 10.05 mm, (c) deviate more than 0.03 mm from 10 mm.
[6.7%, 0.6%, 13%]

TOPIC 9.0
SAMPLE EXAMINATION PAPER

MOI UNIVERSITY
OFFICE OF THE CHIEF ACADEMIC OFFICER

UNIVERSITY EXAMINATIONS
2002/2003 ACADEMIC YEAR
FIRST YEAR SECOND SEMESTER EXAMINATION

FOR THE DEGREE OF

BACHELOR OF TECHNOLOGY
IN
ELECTRICAL & COMMUNICATIONS ENGINEERING
PRODUCTION ENGINEERING
CHEMICAL & PROCESS ENGINEERING
TEXTILE ENGINEERING
CIVIL & STRUCTURAL ENGINEERING
COMPUTER ENGINEERING

COURSE CODE: STA 104/106

COURSE TITLE: BASIC STATISTICS

DATE: XX/XX/XXXX TIME: 3 HOURS

INSTRUCTIONS TO CANDIDATES

THIS PAPER CONTAINS SEVEN (7) QUESTIONS.
ANSWER ANY FIVE (5) QUESTIONS.
ALL QUESTIONS CARRY EQUAL MARKS.

1. The contents of thirty packets of washers, randomly sampled from a batch destined for the market, were determined as follows:

43 44 41 41 44 43
42 45 41 44 46 41
44 42 45 43 43 44
43 45 45 42 44 44
45 46 44 46 43 44

(a) Draw a frequency table and use it to determine the mode; the relative frequency of the mode; the percentage of packets with 45 washers or fewer; and the percentage of packets with between 43 and 45 washers (both inclusive). [14 marks]

(b) On a combined axes graph draw the bar chart, the histogram and the frequency polygon for the frequency distribution in 1(a) above. Use the graph paper provided. [6 marks]

2. The diameters of 34 samples of steel rods were measured and the following results obtained (in millimeters):

19.63 19.82 19.96 19.75 19.86
19.89 20.16 19.56 20.05 19.72
19.73 19.93 20.03 19.86 19.81
19.66 19.77 19.99 20.00 20.11
19.82 19.61 19.97 20.07 20.01
19.96 19.68 19.87 19.90 19.84
19.77 19.78 19.75 19.87

(a) Arrange the values into seven equal classes and apply the coding technique to determine the mean and the standard deviation. [10 marks]

(b) Draw a cumulative frequency table for the grouped data in 2(a) above and use it to obtain the median, the lower quartile, the upper quartile and the quartile deviation. [10 marks]

3. (a) Let A and B be two events in the sample space S such that $P(A) = 1/2$, $P(A \cup B) = 3/4$ and $P(B) = 3/8$. Draw Venn diagrams for the following events and hence determine their probabilities.

(i) $A \cap B$ (ii) $\bar{A} \cap \bar{B}$ (iii) $\bar{A} \cup \bar{B}$ (iv) $\bar{A} \cap B$ [12 marks]

(b) An engineer working at a diesel power plant identified three major causes of engine failure to be:
• Overheating
• Ignition problems
• Fuel blockage
From experience she estimated the probabilities of their occurrence at 1/7, 2/9 and 3/11 respectively for each of the stated causes of failure. If an engine is reported to have failed, find the probability that the failure is due to:
(i) both ignition problems and overheating and not fuel blockage
(ii) either ignition problems or fuel blockage
(iii) both overheating and either ignition problems or fuel blockage

4. (a) Three screws are drawn at random without replacement from a box containing 4 left-hand and 6 right-hand screws. If X is a random variable that denotes the total number of left-hand screws drawn, construct the probability distribution and show that X is a discrete random variable. [10 marks]

(b) The volume of sales of a liquid detergent (in thousands of litres) has the probability density
$$f(x) = \begin{cases} 6x(1 - x) & \text{if } 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$
Show that f(x) is a probability density function and then determine the mean and variance of the distribution. [10 marks]

5. (a) 10% of the components that a machine produces are defective. In a sample of 5 components drawn at random, find the probabilities of having three defective, two defective and at least one defective component. [8 marks]
(b) A chemical firm produces aspirin tablets having a
mean mass of 4g and a standard deviation of 0.2g.
Assuming that the masses are normally distributed and that
each tablet is chosen at random, what is the probability that
a tablet
(i) has a mass between 3.55g and 3.85g?
(ii) differs from the mean by less than 0.35g?

If the tablets are placed in cartons of 400, how many in each carton may be expected to have a mass less than 3.7 g? [12 marks]
6. The deposition of grit particles from the atmosphere
was measured by counting the numbers of particles
deposited on 200 prepared cards, in a specified time, and
the following table was compiled.

No. of particles    0   1   2   3   4   5   6
No. of cards       45  65  52  24  11   3   0

(i) Calculate the mean and variance of this distribution and hence show that it is reasonable to assume that the deposition of grit particles is according to a Poisson distribution. [12 marks]
(ii) Determine the probabilities of obtaining 0,1,2,3,4,5,6
or more particles on any one card and tabulate the
expected frequencies of grit particles on 200 cards
placed at random. [8 marks]

7. A box contains 131 similar transistors of which 70 are
satisfactory, 43 give too high a gain under normal operating
conditions and 18 give too low a gain. Two transistors are
drawn in turn. Find the probability of having (a) two
satisfactory; (b) none with low gain; (c) one satisfactory and
one with high gain; (d) one with low gain and none
satisfactory, and (e) none satisfactory; if the sampling is
done (i) with replacement (ii) without replacement.
[20 marks]
THE END
Answers to Selected Sample Examination Questions

1. (a) 44, 0.30, 90%, 67%
2. (a) 19.86 mm, 0.16 mm
(b) Q1 = 19.76 mm, Q2 = 19.86 mm, Q3 = 19.92 mm, QD = 0.08 mm
3. (a) (i) 1/8 (ii) 1/4 (iii) 7/8 (iv) 1/4
(b) (i) 16/693 (ii) 49/99 (iii) 7/99
4. (b) E[X] = 0.5, V[X] = 0.05
5. (a) P(X = 3) = 0.0081, P(X = 2) = 0.0729, P(X ≥ 1) = 0.41
(b) (i) 21.44% (ii) 92% (iii) 27 tablets
6. (i) Mean = 1.5, variance = 1.45 ≈ 1.5
7. With replacement:
[4900/17161, 12769/17161, 6020/17161, 2196/17161, 3721/17161]
Without replacement:
[483/1703, 6328/8515, 602/1703, 2178/17030 or 1089/8515, 366/1703]
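The listed answers to Question 5 can be cross-checked without the tables. The following is a minimal Python sketch using only the standard library; the helper names are illustrative. The carton part is evaluated at a threshold of 3.7 g, which is an assumption chosen because it reproduces the listed 27 tablets.

```python
import math


def std_normal_cdf(z):
    """P(Z < z) for a standard normal variate (the Appendix A quantity)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_prob(lo, hi, mu, sigma):
    """P(lo < X < hi) for X normal with mean mu and standard deviation sigma."""
    return std_normal_cdf((hi - mu) / sigma) - std_normal_cdf((lo - mu) / sigma)


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# Question 5(a): n = 5 components, 10% defective
p3 = binomial_pmf(3, 5, 0.1)
p2 = binomial_pmf(2, 5, 0.1)
p_at_least_one = 1 - binomial_pmf(0, 5, 0.1)

# Question 5(b): tablet mass with mean 4 g and standard deviation 0.2 g
q5b_i = normal_prob(3.55, 3.85, 4.0, 0.2)
q5b_ii = normal_prob(4.0 - 0.35, 4.0 + 0.35, 4.0, 0.2)

# Carton of 400 tablets; the 3.7 g threshold is an assumption (see lead-in)
tablets_below = 400 * std_normal_cdf((3.7 - 4.0) / 0.2)

print(p3, p2, p_at_least_one)
print(q5b_i, q5b_ii, tablets_below)
```

Rounded to the precision of the answer list, the sketch agrees with 0.0081, 0.0729, 0.41, 21.44%, 92% and 27 tablets.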

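The Poisson check asked for in Question 6 can be sketched the same way. This is an illustrative Python outline, not a model answer: it uses the population form of the variance (dividing by the number of cards) and takes the sample mean as the Poisson parameter, both of which are assumptions about the intended method.

```python
import math

# Observed number of cards for each particle count (Question 6 table)
counts = {0: 45, 1: 65, 2: 52, 3: 24, 4: 11, 5: 3, 6: 0}

n_cards = sum(counts.values())
mean = sum(k * f for k, f in counts.items()) / n_cards
variance = sum(f * (k - mean) ** 2 for k, f in counts.items()) / n_cards


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variate with parameter lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


# Probabilities for 0..5 particles, then "6 or more" as the remainder
probs = [poisson_pmf(k, mean) for k in range(6)]
probs.append(1 - sum(probs))

# Expected number of cards in each category out of n_cards
expected = [n_cards * p for p in probs]

print(mean, variance)
for k, e in enumerate(expected):
    label = "6 or more" if k == 6 else str(k)
    print(label, round(e, 1))
```

A mean of 1.5 against a variance of 1.45 is the near-equality that supports the Poisson assumption, and the expected frequencies can be compared directly with the observed row of the table.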