Introduction To Statistics Note
OCTOBER, 2013
General Introduction
This module is primarily for economics students who need to understand the principles of data
collection, presentation, analysis and interpretation. It is valuable to first-degree students of
economics. The material may also be useful to anyone who is
interested in business research.
The first three chapters cover the basic concepts of statistics, focusing on the collection, presentation
and summarization of data. Chapter four discusses measures of variation comprehensively, using practical
examples and data. Chapters five and six present elementary probability and
sampling methods, respectively, with practical examples. The last chapter of this
module is about linear correlation and regression.
General learning objectives, followed by an introductory section specific to each chapter,
are placed at the beginning of each chapter. The module also includes many problems for the
student, most of them based on real data and the majority with detailed solutions. A few reference
materials are also given at the end of each chapter for further reading.
This course module has seven chapters: an introduction covering the definition of statistics, its uses,
applications and limitations, and the types of variables and their measurement scales; methods of
data collection and presentation, including data collection techniques and presentation
tools; measures of central tendency, dealing with the mean, median and mode; measures of
dispersion, covering absolute and relative measures of dispersion; elementary probability,
including random experiments, sample spaces, events and approaches to measuring the probability of
occurrence of events; random sampling and sampling distributions, covering the types and methods of
sampling and the distribution of the sample mean and proportion; and simple linear regression
and correlation. Each chapter contains exercises and assignments.
Objectives of the Module
General objectives
The general objective of this course module is to introduce, explain and demonstrate the basic concepts
and applications of statistical models in business-related problems using real data examples.
Specifically:
Upon completion of this course module, students will be able to:
Develop basic statistical knowledge.
Justify basic statistical thinking and reasoning.
Apply the methods of statistics in scientific research.
Demonstrate the usefulness of statistics in real life.
Module prerequisites: None
Credit hours: 3
Table of Contents
General Introduction
Organization of the Module
Objectives of the Module
Assessment Methods
Chapter 1: Introduction
1.1 Introduction
1.2 Definitions & classification of statistics
1.3 Stages in statistical investigation
1.4 Definition of some basic terms
1.5 Application, uses and limitation of statistics
Summary
Exercise 1
Assignment 1: (4%)
Chapter 2: Sampling Theory
2.1 Introduction
2.2 Basic Concepts
2.3 Reasons for Sampling
2.4 A Review of Methods of Sampling
Summary
Exercises 2
Assignment 2: (5%)
Reference
Chapter 3: Methods of Data Collection and Presentations
3.1 Introduction
3.2 Classification of Data
3.3 Data Collection
3.3 Methods of Data Presentation
3.3.1 Frequency Distribution
3.3.2 Diagrammatic and Graphic presentation of data
Summary
Exercise 3
Assignment 3: (6%)
Chapter 4: Measures of Central Tendency
4.1 Introduction and Objectives of Measuring Central Tendency
4.2 Desirable properties of measure of central tendency
4.3 The Summation Notation (∑)
4.4 Types of Measures of Central Tendency
4.4.1 The Mean
4.4.2 The Median
4.4.3 The Mode
4.3.4 Quantiles
Summary
Exercise 4
Assignment 4: (5%)
Chapter 5: Measures of Variation (Dispersion), Skewness and Kurtosis
5.1 Introduction and objectives of measuring Variation
5.2 Types of Measures of Dispersion
5.2.1 Range (R) and Relative Range (RR)
5.2.2 The Quartile Deviation
5.2.3 The Mean Deviation and Coefficient of Mean Deviation
5.2.4 The Variance, the Standard Deviation and Coefficient of Variation
5.2.5 Standard Scores (Z-scores)
5.3 Moments
5.4 Skewness
5.5 Kurtosis
Summary
Exercises 5
Assignment 5: (5%)
Chapter 6: Simple Linear Regression and Correlation
6.1 Introduction
6.2 Scatter Plot
6.3 Simple Correlation Analysis
6.4 Simple Linear Regression Analysis
6.4.1 Assumption of simple linear regression
6.4.2 Parameter estimation
Summary
Exercise 6
Assignment 6: (10%)
Chapter 7: Elementary Probability
7.1 Definitions of Some Probability Terms
7.2 Counting rule (Addition, Multiplication, Permutation, Combination)
7.3 Approaches of measuring probability (Classical, Frequentist, Subjective)
7.4 Conditional Probability and Independency
7.5 Basic Concepts of Probability Distributions
Summary
Exercises 7
Assignment 7: (5%)
Answer Key
Bibliography (References)
Appendix: Standard Tables of Statistics
Chapter 1: Introduction
Objectives
At the end of this unit the learners will be able to:
1 Define statistics and biostatistics and identify the types of statistics
1.5 Categorize the types of variables and identify their measurement scales.
Contents
1.1 Introduction
1.1 Introduction
In the modern world of computers and information technology, the importance of statistics is
well recognized by all disciplines. Statistics originated as a science of statehood and
gradually found applications in agriculture, economics, commerce, biology,
medicine, industry, planning, education and so on. Nowadays there is hardly any walk of human
life where statistics cannot be applied.
Statistics has been defined differently by different authors over time. The word 'statistics'
is derived from the Latin word 'status', which means a political state. The subject originated from two quite
dissimilar fields, namely games of chance and the affairs of the political state. In the olden days statistics was
confined to state affairs only, but in modern days it embraces almost every sphere of human activity.
The word statistics has several meanings. In the first place, it is a plural noun which describes a
collection of numerical data such as employment statistics, accident statistics, population statistics,
economic statistics and agricultural statistics. It is in this sense that the word 'statistics' is usually
understood by a layman.
Secondly, the word statistics as a singular noun is used to describe a branch of applied mathematics
whose purpose is to provide methods of dealing with collections of data and extracting information
from them in compact form by tabulating, summarizing and analyzing the numerical data or a set of
observations.
Generally, statistics can be defined in two ways, as statistical data and/or as statistical
methods:
1. Plural sense (layman's definition).
It is an aggregate or collection of numerical facts, such as the number of people living in a
particular area or the distribution of family incomes in Debre Tabor town (statistics in the sense of
statistical data).
2. Singular sense (formal definition)
When it is used in the sense of statistical methods, statistics is defined as the science of
collecting, organizing, presenting, analyzing and interpreting numerical data for the purpose of
assisting in making a more effective decision.
NB: Even though statistical data always denote figures (numerical descriptions), it must be
remembered that not all numerical descriptions are statistical data.
In order that numerical descriptions may be called statistics (statistical data) they must possess
the following characteristics:
i) They must be aggregates. This means that statistics are a “number of facts.” A single fact,
even though numerically stated, cannot be called statistics.
ii) They must be affected to a marked extent by a multiplicity of causes. This means that the
numerical value of any quantity at any particular moment is the result of the action and
interaction of a number of forces, differing amongst themselves, and it is not possible to
say how much of it is due to any one particular cause.
iii) Statistics must be enumerated or estimated according to a reasonable standard of accuracy.
This means that if aggregates of numerical facts are to be called ‘Statistics’ they must be
reasonably accurate.
iv) Statistics are collected in a systematic manner for a predetermined purpose. Numerical
data can be called statistics only if they have been compiled in a properly planned manner
and for a purpose about which the enumerator had a definite idea.
v) Statistics should be placed in relation to each other. Numerical facts may be placed in
relation to each other in terms of time, space or condition. This means that facts
should be comparable.
Classifications of statistics
Depending on how data are used, statistics is sometimes divided into two main areas or
branches.
1. Descriptive Statistics: is concerned with summary calculations, graphs, charts and tables for
data that have been collected. Examples are the average number of female births in a hospital per week,
and the statement that the attrition rate of Debre Tabor University in 2012/13 was 6.0%.
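As a minimal sketch of descriptive statistics, the Python snippet below summarizes a small data set with an average and a percentage rate. The weekly birth counts and the enrolment figures are hypothetical values chosen only for illustration; they are not data from this module or from any institution.

```python
from statistics import mean

# Hypothetical number of female births recorded in a hospital over six weeks.
weekly_female_births = [12, 15, 9, 14, 11, 17]

# Descriptive summary: the average number of female births per week.
average_births = mean(weekly_female_births)

# Hypothetical enrolment figures used to illustrate an attrition rate.
students_enrolled = 5000
students_who_left = 300

# Attrition rate expressed as a percentage of the enrolled students.
attrition_rate = 100 * students_who_left / students_enrolled

print(f"Average female births per week: {average_births}")
print(f"Attrition rate: {attrition_rate:.1f}%")
```

Both results only describe the data at hand; no conclusion about other hospitals or other years is drawn.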
1. Collection of data: the process of measuring, gathering and assembling the raw data upon which
the statistical investigation is to be based.
Data can be collected in a variety of ways; the most common methods are
surveys and experiments. Surveys can in turn be conducted in different
ways, three of the most common being:
Telephone survey
Mailed questionnaire
Personal interview.
2. Organization of data: Summarization of data in some meaningful way, e.g. in table form.
3. Presentation of the data: The process of re-organization, classification, compilation and
summarization of data to present it in a meaningful form.
4. Analysis of data: The process of extracting relevant information from the summarized data,
mainly through the use of elementary mathematical operations.
5. Inference of data: The interpretation of the various statistical
measures obtained through the analysis of the data, using methods by which
conclusions are formed and inferences made.
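The five stages above can be sketched on a toy data set. The snippet is only an illustration: the survey answers are hypothetical, and the final step is a plain-language summary of the surveyed households rather than a formal inferential procedure.

```python
from collections import Counter
from statistics import mean

# 1. Collection: hypothetical survey answers (number of children per household).
raw_data = [2, 3, 1, 2, 4, 2, 3, 1, 2, 3]

# 2. Organization: arrange the raw data into a frequency table.
frequency_table = Counter(raw_data)

# 3. Presentation: display the table in a readable form.
print("Children  Households")
for value in sorted(frequency_table):
    print(f"{value:>8}  {frequency_table[value]:>10}")

# 4. Analysis: extract a summary measure from the organized data.
average_children = mean(raw_data)

# 5. Inference: state a conclusion based on the analysis.
print(f"On average, a surveyed household has {average_children} children.")
```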
Applications of statistics:
Statistics is not a mere device for collecting numerical data, but a means of developing sound
techniques for handling and analyzing them and for drawing valid inferences from them. Statistics is
applied in every sphere of human activity, social as well as physical, like Biology, Commerce,
Education, Planning, Business Management, Information Technology, etc. It is almost
impossible to find a single department of human activity where statistics cannot be applied. For
instance;
In daily life, almost everyone has to obtain and use numerical facts, e.g.
about prices.
In processes such as the development of certain drugs and the assessment of the extent of
environmental pollution.
In industry, especially in the area of quality control.
To test the effectiveness of a new drug or medicine.
Uses of statistics:
The main function of statistics is to enlarge our knowledge of complex phenomena. The
following are some uses of statistics:
2. Data reduction: Statistical measures help to reduce the complexity of the data and
consequently to understand any huge mass of data. Data reduction is the process of summarizing large
amounts of data by forming frequency distributions, histograms, scatter diagrams, etc.,
and calculating statistics such as means, variances and correlation coefficients.
available from the sample observations. In the formulation and testing of hypothesis,
statistical methods are extremely useful. Whether crop yield has increased because of the
use of new fertilizer or whether the new medicine is effective in eliminating a particular
disease are some examples of statements of hypothesis.
7. Studying the relationship between two or more variables. The relationship between
variables can be identified by using the chi-square test or regression models.
8. Forecasting future events. For instance, a biologist can forecast the rainfall for the near
future based on the rainfall data of the last ten years for a particular
ecological zone.
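Three of the uses listed above (data reduction, studying a relationship, and forecasting) can be sketched together in Python. The fertilizer and yield figures are hypothetical, the correlation and least-squares line are computed from their textbook formulas, and the forecast is a simple extrapolation of that line, so it should be read as an illustration rather than a recommended forecasting method.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical paired observations: fertilizer applied (kg) and crop yield (quintals).
fertilizer = [10, 20, 30, 40, 50]
crop_yield = [12, 15, 21, 24, 28]

# Data reduction: replace the raw lists with a few summary measures.
mean_x, mean_y = mean(fertilizer), mean(crop_yield)
var_x, var_y = variance(fertilizer), variance(crop_yield)  # sample variances

# Relationship between two variables: Pearson correlation coefficient.
sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(fertilizer, crop_yield))
sum_xx = sum((x - mean_x) ** 2 for x in fertilizer)
sum_yy = sum((y - mean_y) ** 2 for y in crop_yield)
r = sum_xy / sqrt(sum_xx * sum_yy)

# Forecasting: fit a least-squares line and predict the yield for a new input.
slope = sum_xy / sum_xx
intercept = mean_y - slope * mean_x
predicted_yield = intercept + slope * 60

print(f"Means: {mean_x}, {mean_y}; variances: {var_x}, {var_y}")
print(f"Correlation coefficient: {r:.3f}")
print(f"Predicted yield at 60 kg: {predicted_yield:.1f}")
```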
Scope of Statistics
Statistical methods are useful in measuring numerical changes in complex groups and
interpreting collective phenomena. Nowadays statistics is used abundantly in any
economic study. Both in economic theory and practice, statistical methods play an important
role. Alfred Marshall said, “Statistics are the straw out of which I, like every other economist,
have to make the bricks”. Statistical data and statistical techniques
are also immensely useful in solving many economic problems concerning wages, prices,
production, the distribution of income and wealth, and so on. Statistical tools such as index numbers,
time series analysis, estimation theory and the testing of statistical hypotheses are extensively used in
economics.
Limitations of statistics
As a science statistics has its own limitations. The following are some of the limitations:
It deals with only those subjects of inquiry that are capable of being quantitatively
measured and numerically expressed.
It deals with aggregates of facts and attaches no importance to individual items; it is suited
only when group characteristics are to be studied.
Statistical data are only approximately, not mathematically, correct.
i. Types of variables
A variable is any aspect of an individual or case that is measured or recorded and that can take
different values for different individuals or cases, such as income, demand, age or sex.
It is helpful to divide variables into different types, as different statistical methods are applicable
to each. The main division is into qualitative (categorical) or quantitative (numerical) variables.
2. Quantitative Variables are variables or characteristics which can be measured and expressed
numerically. For instance, the balance in a checking account, the number of children in a family, expense,
income and salary.
a. Discrete variables are variables which can assume only certain values, and there are
usually "gaps" between the values, such as the number of bedrooms in your house, the
number of households in a kebele, or the number of VAT-registered cafés in each sub-city of
Addis Ababa.
b. Continuous variables are variables which can assume any value within a specific range,
such as the air pressure in a tire, income, weight, or the age of a household head.
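The difference between discrete and continuous variables can be illustrated with a short Python sketch. The values are hypothetical, and a computer's floating-point numbers only approximate a truly continuous quantity, so the snippet illustrates the idea rather than proving it.

```python
# Hypothetical observations for one household.
number_of_bedrooms = 3     # discrete: obtained by counting
tire_pressure_psi = 32.4   # continuous: obtained by measuring

# A discrete count has gaps: no valid count lies strictly between 3 and 4.
values_between_counts = [n for n in range(3, 5) if 3 < n < 4]

# A continuous variable has no gaps: between any two values there is another.
lower, upper = 32.4, 32.5
midpoint = (lower + upper) / 2

print(values_between_counts)     # empty list: no count between 3 and 4
print(lower < midpoint < upper)  # True
```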
Proper knowledge about the nature and type of data to be dealt with is essential in order to
specify and apply the proper statistical method for their analysis and inferences. Measurement
scale refers to the property of value assigned to the data based on the properties of order,
distance and fixed zero.
In mathematical terms, measurement is a functional mapping from the set of objects {Oi} to the
set of real numbers {M(Oi)}.
The goal of measurement systems is to structure the rule for assigning numbers to objects in such
a way that the relationship between the objects is preserved in the numbers assigned to the
objects. The different kinds of relationships preserved are called properties of the measurement
system. The three properties of measurement scales are listed below:
a. Order
The property of order exists when an object that has more of the attribute than another object is
given a bigger number by the rule system. This relationship must hold for all objects in the "real
world".
b. Distance
The property of distance is concerned with the relationship of differences between objects. If a
measurement system possesses the property of distance it means that the unit of measurement
means the same thing throughout the scale of numbers. That is, an inch is an inch, no matter
where it falls, immediately ahead or a mile down the road.
More precisely, an equal difference between two numbers reflects an equal difference in the "real
world" between the objects that were assigned the numbers. In order to define the property of
distance in the mathematical notation, four objects are required: Oi, Oj, Ok, and Ol. The
difference between objects is represented by the "-" sign; Oi - Oj refers to the actual "real world"
difference between object i and object j, while M(Oi) - M(Oj) refers to differences between
numbers.
The property of DISTANCE exists if, for all i, j, k, l:
if Oi - Oj ≥ Ok - Ol, then M(Oi) - M(Oj) ≥ M(Ok) - M(Ol).
c. Fixed Zero
A measurement system possesses a rational zero (fixed zero) if an object that has none of the
attribute in question is assigned the number zero by the system of rules. The object does not need
to really exist in the "real world", as it is somewhat difficult to visualize a "man with no height".
The requirement for a rational zero is this: if objects with none of the attribute did exist, would
they be given the value zero? Defining O0 as the object with none of the attribute in question, the
definition of a rational zero becomes:
The property of FIXED ZERO exists if M(O0) = 0.
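The three properties (order, distance and fixed zero) can be checked mechanically for a small set of objects. In the sketch below, each hypothetical rule is a dictionary whose keys are the "real world" amounts of the attribute and whose values are the numbers M assigns; the key 0 stands for O0, the object with none of the attribute. The rule names and numbers are illustrative assumptions, not part of the module.

```python
from itertools import product

# Hypothetical measurement rules: real-world amount -> assigned number.
ratio_rule = {0: 0, 10: 10, 20: 20, 40: 40}
shifted_rule = {0: 32, 10: 42, 20: 52, 40: 72}
rank_rule = {0: 1, 10: 2, 20: 3, 40: 4}
label_rule = {0: 7, 10: 3, 20: 9, 40: 1}


def has_order(rule):
    """More of the attribute always receives a bigger number."""
    return all(rule[a] > rule[b] for a in rule for b in rule if a > b)


def has_distance(rule):
    """If Oi - Oj >= Ok - Ol then M(Oi) - M(Oj) >= M(Ok) - M(Ol)."""
    return all(
        rule[a] - rule[b] >= rule[c] - rule[d]
        for a, b, c, d in product(rule, repeat=4)
        if a - b >= c - d
    )


def has_fixed_zero(rule):
    """The object with none of the attribute is assigned the number zero."""
    return rule[0] == 0


rules = [("ratio_rule", ratio_rule), ("shifted_rule", shifted_rule),
         ("rank_rule", rank_rule), ("label_rule", label_rule)]
for name, rule in rules:
    print(name, has_order(rule), has_distance(rule), has_fixed_zero(rule))
```

Only ratio_rule has all three properties; shifted_rule keeps order and distance but loses the fixed zero, rank_rule keeps order only, and label_rule has none of them.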
Scale type
i. Nominal Scales
Nominal scales are measurement systems that possess none of the three properties stated above.
Level of measurement which classifies data into mutually exclusive, all-inclusive
categories in which no order or ranking can be imposed on the data.
No arithmetic or relational operations can be applied.
Examples:
o Political party preference (Republican, Democrat, or Other)
o Sex (Male or Female)
o Marital status (married, single, widowed, divorced)
o Country code
o Regional differentiation of Ethiopia.
Ordinal Scales are measurement systems that possess the property of order, but not the property
of distance. The property of fixed zero is not important if the property of distance is not satisfied.
Level of measurement which classifies data into categories that can be ranked.
Differences between the ranks are not meaningful.
Arithmetic operations are not applicable but relational operations are applicable.
Ordering is the sole property of ordinal scale.
Examples:
o Letter grades (A, B, C, D, F).
o Rating scales (Excellent, Very good, Good, Fair, Poor)
o Socio-economic class (low, middle, high)
o Country status (Undeveloped, developing, developed)
o Rate of satisfaction (very satisfied, satisfied, less than satisfied, very unsatisfied)
Interval scales are measurement systems that possess the properties of order and distance, but
not the property of fixed zero.
Level of measurement which classifies data that can be ranked and differences are
meaningful. However, there is no meaningful zero, so ratios are meaningless.
Addition and subtraction are applicable, but division is not, because ratios are meaningless.
Relational operations are also possible.
Examples:
o IQ
o Temperature in °F.
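Why ratios are meaningless on an interval scale can be seen by changing the unit of measurement. The sketch below uses hypothetical temperature readings and the standard Fahrenheit-to-Celsius conversion; the variable names are illustrative.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9


# Hypothetical temperature readings in degrees Fahrenheit.
morning_f, noon_f, afternoon_f = 40.0, 60.0, 80.0
morning_c, noon_c, afternoon_c = (
    fahrenheit_to_celsius(t) for t in (morning_f, noon_f, afternoon_f)
)

# Ratios of readings depend on the unit, so "twice as hot" has no meaning.
ratio_f = afternoon_f / morning_f   # 2.0 in Fahrenheit
ratio_c = afternoon_c / morning_c   # about 6.0 in Celsius

# Differences remain comparable: both units agree that the two gaps are equal.
gaps_equal_f = (afternoon_f - noon_f) == (noon_f - morning_f)
gaps_equal_c = abs((afternoon_c - noon_c) - (noon_c - morning_c)) < 1e-9

print(f"Ratio in Fahrenheit: {ratio_f:.2f}, ratio in Celsius: {ratio_c:.2f}")
print(f"Equal gaps in both units: {gaps_equal_f and gaps_equal_c}")
```

The same pair of readings gives a ratio of 2 in one unit and about 6 in the other, while the statement that the two gaps are equal holds in both.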
Ratio scales are measurement systems that possess all three properties: order, distance, and fixed
zero. The added power of a fixed zero allows ratios of numbers to be meaningfully interpreted;
for example, the statement that the ratio of Bekele's height to Martha's height is 1.32 is meaningful,
whereas such a statement is not possible with interval scales.
Level of measurement which classifies data that can be ranked, differences are
meaningful, and there is a true zero. True ratios exist between the different units of
measure.
All arithmetic and relational operations are applicable.
Examples:
o Weight, Height, income, expense
o Number of students
o Age
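The four scale types can be summarized by the properties they possess. The sketch below encodes that summary as a small Python function; the function name and the heights are hypothetical, and the property values for each example follow the descriptions given in this section.

```python
def scale_type(order, distance, fixed_zero):
    """Name the scale implied by the properties a measurement system has."""
    if order and distance and fixed_zero:
        return "ratio"
    if order and distance:
        return "interval"
    if order:
        # Fixed zero is not important when distance is not satisfied.
        return "ordinal"
    return "nominal"


# Properties (order, distance, fixed zero) for examples taken from the text.
examples = {
    "Marital status": (False, False, False),
    "Letter grades": (True, False, False),
    "Temperature in °F": (True, True, False),
    "Height": (True, True, True),
}

for variable, properties in examples.items():
    print(f"{variable}: {scale_type(*properties)}")

# On a ratio scale a ratio survives a change of unit (hypothetical heights).
bekele_cm, martha_cm = 165.0, 125.0
ratio_cm = bekele_cm / martha_cm
ratio_inches = (bekele_cm / 2.54) / (martha_cm / 2.54)
print(f"Height ratio in cm: {ratio_cm:.2f}, in inches: {ratio_inches:.2f}")
```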
Summary
Exercise 1
1. The average score for an entire population would be an example of a __________.
a. Parameter
b. Statistic
c. Variable
d. Constant
2. After measuring the heights of two trees, a researcher finds that one tree is three times
as tall as the other. These measurements must come from a _______ scale.
a. Nominal
b. Ordinal
c. Interval
d. Ratio
3. A variable that has an infinite number of possible values between any two specific
measurements is called a __________ variable.
a. Independent
b. Dependent
c. Discrete
d. Continuous
Assignment 1: (4%)
1.1 The following is a list of different attributes and rules for assigning numbers to objects.
Try to classify each measurement system into one of the four types of scales.
A. Your score on the first statistics test as a measure of your knowledge of statistics.
B. Your score on an individual intelligence test as a measure of your intelligence.
C. A response to the statement "Abortion is a woman's right" where "Strongly Disagree" =
1, "Disagree" = 2, "No Opinion" = 3, "Agree" = 4, and "Strongly Agree" = 5, as a
measure of attitude toward abortion.
D. Times for swimmers to complete a 50-meter race
E. Months of the year September, October, …
F. Blood type of individuals, A, B, AB and O.
G. Region numbers of Ethiopia (1, 2, 3, etc.)
H. The number of cattle belonging to a household head
I. Annual income
References
Bluman, A.G. (1995). Elementary Statistics (2nd edition).
Gupta, C.P. (). Introduction to Statistical Methods (9th edition).
Chapter 2: Sampling Theory
Objectives
At the end of this unit the learners will be able to:
Contents
2.1 Introduction
2.2 Basic Concepts
2.3 Reasons for Sampling
2.4 A Review of Methods of Sampling
2.1 Introduction
Sampling is very often used in our daily life. For example, while purchasing food grains from a
shop we usually examine a handful from the bag to assess the quality of the commodity. A
doctor examines a few drops of blood as a sample and draws conclusions about the blood
constitution of the whole body. Thus most of our investigations are based on samples. In this
chapter, we will see the basic concepts of sampling theory, the reasons for sampling and the various
methods of selecting a sample from the population.
We are going to analyze and interpret data to draw conclusions not about the data but about the
source of the data (population consisting of all elements being studied). We collect a sample of
data from the population and use it to make inferences about the population. Very often we will
be interested in estimating a population parameter. In order to estimate this we need to define
our terms carefully:
Population: In a statistical enquiry, all the items which fall within the purview of the enquiry are
known as the Population or Universe. In other words, the population is the complete set of all
possible observations of the type which is to be investigated.
Examples:
The total number of students studying in a school or college,
the total number of books in a library,
the total number of houses in a village or town.
Sometimes it is possible and practical to examine every person or item in the population we wish
to describe. We call this a complete enumeration, or census. We use sampling when it is not
possible to measure every item in the population. Statisticians use the word population to refer
not only to people but to all items that have been chosen for study.
Census Method:
Information on population can be collected in two ways – census method and sample method. In
census method every element of the population is included in the investigation.
Example
If we study the average annual income of the families of a particular village or area, and if there
are 1000 families in that area, we must study the income of all 1000 families. In this method no
family is left out, as each family is a unit.
Merits:
1. The data are collected from each and every item of the population.
2. The results are more accurate and reliable, because every item of the universe is examined.
3. Intensive study is possible.
4. The data collected may be used for various surveys, analyses etc.
Limitations:
1. It requires a large number of enumerators and it is a costly method
2. It requires more money, labour, time, energy, etc.
3. It is not possible in some circumstances where the universe is infinite.
Sampling:
The theory of sampling has been developed only recently, but the practice of sampling is not
new. In our everyday life we have been using sampling, as discussed in the introduction. In all
those cases we believe that the samples give a correct idea about the population. Most of our
decisions are based on the examination of a few items, that is, on sample studies.
Sample:
Statisticians use the word sample to describe a portion chosen from the population. A finite
subset of statistical individuals defined in a population is called a sample. The number of units in
a sample is called the sample size.
Sampling unit:
The constituents of a population which are the individuals to be sampled from the population,
and which cannot be further subdivided for the purpose of sampling at a given time, are called
sampling units.
For example,
1. To know the average income per family, the head of the family is a sampling unit.
2. To know the average yield of rice, each farm owner’s yield of rice is a sampling unit.
3. If somebody studies the socio-economic status of households, households are the sampling
units.
4. If one studies the performance of freshman students in some college, the student is the
sampling unit.
Sampling frame:
For adopting any sampling procedure it is essential to have a list identifying each sampling unit
by a number. Such a list or map is called a sampling frame. A list of voters, a list of
householders, a list of villages in a district, a list of farmers, etc. are a few examples of
sampling frames.
Target population: the population about which one wishes to make an inference.
Sample size (n): the total number of individuals or sampling units selected as a sample.
Sampling population: the population from which one actually draws a sample; it covers the
elements from which the sample was actually selected.
Parameter and statistic:
We can describe samples and populations by using measures such as the mean, median, mode
and standard deviation. When these terms describe the characteristics of a population, they are
called parameters. When they describe the characteristics of a sample, they are called statistics.
A parameter is a characteristic of a population and a statistic is a characteristic of a sample.
Since samples are subsets of the population, statistics provide estimates of the parameters. That
is, when the parameters are unknown, they are estimated from the values of the statistics.
In general, we use Greek or capital letters for population parameters and lower case Roman
letters to denote sample statistics. [N, μ, σ are the standard symbols for the size, mean and
standard deviation of the population; n, x̄, s are the standard symbols for the size, mean and
standard deviation of the sample, respectively.]
(ii) Statistical inference procedures allow one to make inferences about the sampling population.
Only when the sampling population and the target population coincide can one infer about the
target population.
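The difference between a parameter and a statistic can be illustrated numerically. The sketch below is only an illustration: the ten income values are invented, and it uses Python's standard `statistics` module. It computes the population quantities (N, μ, σ) from every element and the sample quantities (n, x̄, s) from a simple random sample.

```python
import random
import statistics

# Hypothetical population of N = 10 household incomes (illustrative values only).
population = [12, 15, 9, 20, 18, 11, 14, 16, 13, 22]

# Parameters: computed from every element of the population.
N = len(population)
mu = statistics.mean(population)        # population mean
sigma = statistics.pstdev(population)   # population standard deviation (divides by N)

# Statistics: computed from a sample drawn from the population.
random.seed(1)                          # fixed seed so the sketch is repeatable
sample = random.sample(population, 4)   # simple random sample without replacement
n = len(sample)
x_bar = statistics.mean(sample)         # sample mean, used to estimate mu
s = statistics.stdev(sample)            # sample standard deviation (divides by n - 1)

print(f"Population: N={N}, mu={mu}, sigma={sigma:.2f}")
print(f"Sample:     n={n}, x_bar={x_bar}, s={s:.2f}")
```

Drawing a different sample (for example by changing the seed) changes x̄ and s, while μ and σ stay fixed; this is why statistics are treated as estimates of parameters.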
2.3 Reasons for Sampling
(c). Cluster sampling
(d). Systematic sampling
a. Simple random sampling: a sampling technique in which each member of the population is
equally likely to be included in the sample. Suppose we have a population of N objects and
we wish to choose n of them to form a sample. We have seen that there are $\binom{N}{n}$ ways
of choosing the sample without replacement and $N^n$ ways with replacement.
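As a quick numerical check of these two counts, the following sketch uses a hypothetical population of N = 10 units and a sample of n = 3 (both values are assumptions chosen for illustration). It also draws one simple random sample with Python's standard `random` module.

```python
import math
import random

N, n = 10, 3  # hypothetical population size and sample size

# Number of possible samples without replacement (order ignored): N choose n.
without_replacement = math.comb(N, n)

# Number of possible ordered samples with replacement: N to the power n.
with_replacement = N ** n

population = list(range(1, N + 1))   # units labelled 1..N
random.seed(0)                       # fixed seed so the sketch is repeatable
srs = random.sample(population, n)   # one simple random sample without replacement

print(without_replacement)  # 120
print(with_replacement)     # 1000
print(srs)
```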
Examples
Lottery method – the units to be included in the sample are chosen by a lottery. Assign a number
to each element in the population. Write each number on a slip of paper, mix the slips
thoroughly, then draw one slip at a time. This method can only be used if the population is not
very large; otherwise it is cumbersome.
Table of random numbers: used to select a representative sample from a large population. To
select the sample, use the random digit technique.
Units of the population from which a sample is required are assigned numbers with an equal
number of digits. When the size of the population is less than a thousand, the three-digit
numbers 000, 001, 002, ….. 999 are assigned. We may start at any place in a random number table
and may go in any direction, such as column-wise or row-wise, but consecutive numbers are to be
used. On the basis of the size of the population and the random number table available to us, we
proceed according to our convenience. If any random number is greater than the population size
N, then N can be subtracted from the random number drawn. This can be repeated until the number
is less than or equal to N. We proceed with the following steps:
Step 1: Number each element; for example, for a population of size 500 we assign 001 to 500.
Step 2: Select a random starting point.
Step 3: Read only the required number of digits at a time. Proceed in this fashion until the
required sample size has been selected.
Example 1:
In an area there are 500 families. Using the following extract from a table of random numbers
select a sample of 15 families to find out the standard of living of those families in that area.
4652 3819 8431 2150 2352 2472 0043 3488
9031 7617 1220 4129 7148 1943 4890 1749
2030 2327 7353 6007 9410 9179 2722 8445
0641 1489 0828 0385 8488 0422 7209 4950
Solution:
In the above random number table we can start from any row or column and read three-digit
numbers continuously, row-wise or column-wise.
Starting from the third row, the numbers are:
203 023 277 353 600 794 109 179
272 284 450 641 148 908 280
Since some numbers are greater than 500, we subtract 500 from those numbers and we rewrite
the selected numbers as follows:
203 023 277 353 100 294 109 179
272 284 450 141 148 408 280
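The reading of the table in Example 1 can be reproduced with a short script. This is a sketch of the procedure described above, not a general-purpose tool: the function name `draw_from_table` is hypothetical, and the sketch does not deal with repeated numbers or with the value 000, which a full procedure would have to skip or handle separately.

```python
# Extract from the table of random numbers used in Example 1.
table = [
    "4652 3819 8431 2150 2352 2472 0043 3488",
    "9031 7617 1220 4129 7148 1943 4890 1749",
    "2030 2327 7353 6007 9410 9179 2722 8445",
    "0641 1489 0828 0385 8488 0422 7209 4950",
]

def draw_from_table(rows, start_row, population_size, sample_size, width=3):
    """Read consecutive `width`-digit numbers row-wise, starting at `start_row`."""
    digits = "".join("".join(row.split()) for row in rows[start_row:])
    chunks = [int(digits[i:i + width])
              for i in range(0, len(digits) - width + 1, width)]
    raw = chunks[:sample_size]
    reduced = []
    for value in raw:
        # Subtract N repeatedly until the number is at most N.
        while value > population_size:
            value -= population_size
        reduced.append(value)
    return raw, reduced

# Start from the third row (index 2), population of 500 families, sample of 15.
raw, selected = draw_from_table(table, start_row=2, population_size=500, sample_size=15)
print(raw)
print(selected)
```

The first list matches the fifteen three-digit numbers read from the third row, and the second list matches the family numbers after 500 has been subtracted from the values above 500.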
b. Stratified random sampling: is often used when the population is split into subgroups or
“strata”. The different subgroups are believed to be very different from each other, but it is
thought that the individuals who make up each subgroup are similar. The number of units to be
chosen from each sub-group is fixed in advance and the units are chosen by simple random
sampling within the sub group. Some of the criteria for dividing a population into strata are: Sex
(male, female); Age (under 18, 18 to 28, 29 to 39); Occupation (blue-collar, professional, other).
It is applied if the population is heterogeneous.
Merits:
1. It is more representative.
2. It ensures greater accuracy
3. It is easy to administer as the universe is sub - divided.
4. Greater geographical concentration reduces time and expenses.
5. When the original population is badly skewed, this method is appropriate.
6. For a non-homogeneous population, it may yield good results.
Limitations:
1. Dividing the population into homogeneous strata requires more money, time and
statistical expertise, which makes it difficult.
2. Improper stratification leads to bias; if the different strata overlap, the sample will not
be representative.
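Stratified random sampling as described above (sample sizes fixed in advance, simple random sampling within each stratum) can be sketched as follows. The population, the stratum names and the allocation of 6 and 4 units are all hypothetical; the allocation shown happens to be proportional to stratum size, which is only one possible choice.

```python
import random

def stratified_sample(strata, allocation, seed=None):
    """Draw a simple random sample of a fixed size from every stratum.

    strata     -- dict mapping stratum name to the list of its units
    allocation -- dict mapping stratum name to the number of units to draw
    """
    rng = random.Random(seed)
    return {name: rng.sample(units, allocation[name])
            for name, units in strata.items()}

# Hypothetical population of 100 people stratified by sex.
strata = {
    "male": [f"M{i}" for i in range(1, 61)],     # 60 units
    "female": [f"F{i}" for i in range(1, 41)],   # 40 units
}
# Sample sizes fixed in advance: 6 males and 4 females (total sample of 10).
allocation = {"male": 6, "female": 4}

sample = stratified_sample(strata, allocation, seed=42)
for name, units in sample.items():
    print(name, units)
```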
c. Cluster sampling: in some cases the identification and location of an ultimate unit for
sampling may require considerable time and cost; in such cases cluster sampling is used. The
population is subdivided into groups or clusters, a simple random sample of these clusters
is chosen, and all the sampling units in the selected clusters are surveyed. Clusters may
be Regions, Zones, Weredas, Kebeles, etc. This method of sampling is cheaper, faster and
more convenient, but it may not be very efficient or representative because units within
the same cluster usually tend to be similar.
Example: suppose we want to study the travel habits of families in Ethiopia, which is divided
into Regions and Zones. We first draw a random sample of Zones, and then from these selected
Zones or clusters we draw a random sample of households for investigation. Similarly, to
estimate the average annual household income in a large city we may use cluster sampling,
because simple random sampling would require a complete list of households in the city
from which to sample.
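A minimal sketch of one-stage cluster sampling, in which whole clusters are drawn at random and every unit in a chosen cluster is surveyed. The eight Kebeles with five households each are invented for illustration, and the function name `cluster_sample` is hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Select `n_clusters` clusters by simple random sampling and
    return the chosen cluster names with all of their units."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical city: 8 Kebeles (clusters), each with 5 households.
clusters = {
    f"Kebele {k}": [f"K{k}-H{h}" for h in range(1, 6)]
    for k in range(1, 9)
}

chosen, surveyed = cluster_sample(clusters, n_clusters=2, seed=7)
print(chosen)
print(len(surveyed))  # 2 clusters x 5 households = 10
```

Only a list of clusters is needed to draw the sample; a list of households is required only inside the selected clusters.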
d. Systematic sampling: the items or individuals of the population are arranged in some order,
for example alphabetically, in a file drawer by date received, or by some other method, so a
complete list of all elements within the population (sampling frame) is required. A random
starting point is selected and then every Kth member of the population is selected for the
sample. For example, if we want to select n items from a population of size N using
systematic sampling, we divide N by n (N/n = K), choose one number i at random between 1
and K, and then take every Kth member. So the sample will consist of items i, i + K,
i + 2K, i + 3K, etc., where 1 ≤ i ≤ K.
Example: Suppose we want to choose a sample of about 20 students out of a class of 100
students. First we put the class in order (maybe alphabetical order, or by ID number) and give
each student a number between 1 and 100. Next we divide 100 by 20 and get 100/20 = 5. We now
choose a number at random between 1 and 5. The student corresponding to that number is the
first student in the sample, and we then take every 5th student. So if, for example, we choose
the number 2, the sample will consist of the 2nd, 7th, 12th, 17th, ..., 92nd and 97th students
on the list.
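The selection rule i, i + K, i + 2K, ... can be sketched in a few lines. The function name `systematic_sample` is hypothetical; when no starting point is supplied, one is drawn at random between 1 and K.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return the positions selected by systematic sampling from a list of N units."""
    K = N // n                                       # sampling interval
    if start is None:
        start = random.Random(seed).randint(1, K)    # random start between 1 and K
    return list(range(start, N + 1, K))[:n]

# The class example: N = 100 students, n = 20, random start assumed to be 2.
sample = systematic_sample(100, 20, start=2)
print(sample)
```

With a start of 2 the positions are 2, 7, 12, 17, and so on up to 92 and 97, as in the example.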
Merits:
1. This method is simple and convenient.
2. Time and work are much reduced.
3. If proper care is taken, the result will be accurate.
4. It can be used in infinite population.
Limitations:
1. Systematic sampling may not represent the whole population.
2. There is a chance of personal bias of the investigators.
Systematic sampling is preferably used when information is to be collected from trees in a
forest, houses in blocks, entries in a register which are in serial order, etc.
As an example of sampling carried out in several stages, in a study of the utilization of pit
latrines in a district, 150 homesteads are to be visited for interviews with family members as
well as for observations on the types and cleanliness of latrines. The district is composed of
6 Weredas and each Wereda has between 6 and 9 villages. First, select 3 Weredas out of the 6 by
simple random sampling. Second, for each selected Wereda select 5 villages by simple random
sampling (15 villages in total). Third, from each of the selected villages select 10 households.
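The three stages of the pit latrine example can be sketched as nested random selections. The district below is invented: the village counts are generated at random between 6 and 9, each village is assumed to have 40 households, and the third stage is assumed to use simple random sampling, which the example does not state.

```python
import random

rng = random.Random(2013)   # fixed seed so the sketch is repeatable

# Hypothetical district: 6 Weredas, each with 6 to 9 villages of 40 households.
district = {
    f"Wereda {w}": {
        f"Village {w}.{v}": [f"HH {w}.{v}.{h}" for h in range(1, 41)]
        for v in range(1, rng.randint(6, 9) + 1)
    }
    for w in range(1, 7)
}

selected_households = []
for wereda in rng.sample(sorted(district), 3):             # stage 1: 3 of the 6 Weredas
    villages = district[wereda]
    for village in rng.sample(sorted(villages), 5):        # stage 2: 5 villages per Wereda
        selected_households += rng.sample(villages[village], 10)  # stage 3: 10 households

print(len(selected_households))  # 3 * 5 * 10 = 150
```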
(a). Judgment Sampling
In this case, the person taking the sample has direct or indirect control over which items are
selected for the sample. The subjective judgment of the researcher is the basis for selecting items
to be included in a sample. Judgment sampling is often used to pre-test a questionnaire.
(b). Convenience Sampling
In this method, the decision maker selects a sample from the population in a manner that is
relatively easy and convenient for the investigator or the data collectors. This technique is
simply convenient to the researcher in terms of time, money and administration.
(c). Quota Sampling
In this sampling technique major population characteristics play an important role in selection of
the sample. It has some aspects in common with stratified sampling, but has no randomization. In
this method, the decision maker requires the sample to contain a certain number of items with a
given characteristic. Many political polls are, in part, quota sampling.
Example: suppose a scientist recognizes that the variability in daily milk production may be due
to age differences, so cows are selected from different age groups. For instance, if 30% of the
cows are between 4 and 6 years old and the remaining 70% are between 6 and 8 years old, a quota
sample must reflect those same percentages.
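The quota for each group follows directly from the population percentages. The sketch below assumes a total sample of 20 cows, a figure chosen only for illustration, and the function name `quota_sizes` is hypothetical. It fixes how many units each group must contribute; which units are taken within a group is left to the investigator, since quota sampling involves no randomization.

```python
def quota_sizes(shares, n):
    """Number of units required from each group so that the sample
    mirrors the population shares (rounded to whole units)."""
    return {group: round(share * n) for group, share in shares.items()}

# Population shares from the example; sample size n = 20 is an assumption.
shares = {"4-6 years": 0.30, "6-8 years": 0.70}
quotas = quota_sizes(shares, n=20)
print(quotas)  # {'4-6 years': 6, '6-8 years': 14}
```

For other shares or sample sizes the rounded quotas may not add up exactly to n, in which case one quota has to be adjusted by hand.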
Note: we can’t make inference about the population by using non-probability sampling.
Summary
Sampling is used very often in daily life. When doing real research or solving a specific
problem in any discipline, it is better to use probability sampling, because with probability
sampling we can infer the characteristics of the population from information taken from the
sample. To apply sampling theory we should know the definitions of basic terms such as
population, sampling, sample, statistic, parameter, sampling unit and sampling frame.
Exercises 2
1. The difference between sample estimate and population parameter is termed as
a. Human error c. Non-sampling error
b. Sampling error d. None of the above
2. If each and every unit of population has equal chance of being included in the sample, it is
known as:
a. Restricted sampling c. Simple random sampling
b. Purposive sampling d. None of the above
3. Simple random sample can be drawn with the help of
a. Slip method c. Calculator
b. Random number table d. All the above
4. Explain the difference between the following pairs of terms.
a) Sample and sample size
b) Sampling frame and sampling unit
c) Target population and sampling population
d) Stratified sampling and cluster sampling
Assignment 2: (5%)
1. Describe the basic difference between probability and non-probability sampling.
Chapter 3: Methods of Data Collection and Presentations
Objectives
At the end of this unit the learners will be able to:
Mention the types of data and their sources
Identify methods of data collection
Construct a frequency distribution for raw data
Identify the different methods of data organization and presentation
Contents
3.1 Introduction
3.2 Classification of data
3.3 Data collection
3.1 Introduction
Everybody collects, interprets and uses information, much of it in numerical or statistical
form, in day-to-day life. People commonly receive large quantities of information every day
through conversations, television, computers, radio, newspapers, posters, notices and
instructions. It is precisely because so much information is available that people need to be
able to absorb, select and reject it. In everyday life, in business and in industry, certain
statistical information is necessary, and it is important to know where to find it and how to
collect it. As a consequence, everybody has to compare prices and quality before making any
decision about what goods to buy. As employees of a firm, people want to compare their salaries
and working conditions, promotion opportunities and so on. In turn, firms for their part want
to control costs and expand their profits.
One of the main functions of statistics is to provide information which will help in making
decisions. Statistics provides this type of information by providing a description of the
present, a profile of the past and an estimate of the future.
It may be noted that different types of data can be collected for different purposes. The data
can be collected in connection with time or geographical location, or in connection with both
time and location. The following are the main types of data:
1. Time series data: a set of numerical values collected over a period of time. The data
might have been collected either at regular or at irregular intervals of time. For example,
the data on the three types of expenditure (food, education, other) of a family in
D/Tabor for the four years 2001, 2002, 2003 and 2004.
2. Spatial data: if the data collected are connected with a place, they are termed spatial
data. For example, the number of runs scored by a batsman in different test matches in a
test series at different places, district-wise rainfall in Ethiopia, or prices of silver in
four metropolitan cities.
3. Spatio-temporal data: if the data collected are connected to time as well as place, they
are known as spatio-temporal data.
4. Cross-sectional data: data on many individuals collected over a specified period of time.
5. Longitudinal data: data on multiple entities collected at two or more times within an
interval of time.
Data is a general term for observations and measurements collected during any type of scientific
investigation. Based on their source, data can be classified as:
1. Primary Data
Primary data are data collected by the investigator himself for the purpose of a specific
inquiry or study. Such data are original in character and are generated by surveys conducted
by individuals, research institutions or other organizations.
Example
If a researcher is interested in knowing the impact of a noon meal scheme for school children,
he has to undertake a survey and collect data on the opinions of parents and children by asking
relevant questions. Data collected for such a purpose are called primary data.
Two activities involved: planning and measuring.
a) Planning:
Identify source and elements of the data.
Decide whether to consider sample or census.
If sampling is preferred, decide on sample size, selection method,… etc
Decide measurement procedure.
Set up the necessary organizational structure.
The primary data can be collected by the following methods:
Focus Group discussion
Telephone Interview
Mail Questionnaires
Interviews
Self-administered questionnaire
Experiments
Diary
Observations
Merits and Demerits of primary data:
1. The collection of data by the method of personal survey is possible only if the area
covered by the investigator is small. Collection of data by sending enumerators is
bound to be expensive. Great care should be taken that the enumerators record correctly
the information provided by the informants.
2. Collection of primary data by framing schedules or distributing and collecting
questionnaires by post is less expensive and can be completed in a shorter time.
3. If the questions are embarrassing, complicated or probe into the personal affairs of
individuals, the schedules may not be filled in with accurate and correct information,
and hence this method is unsuitable.
4. The information collected as primary data is more reliable than that collected from
secondary data.
2. Secondary Data
Secondary data are those data which have been already collected and analyzed by some earlier
agency for its own use; and later the same data are used by a different agency. According to
W.A.Neiswanger, ‘A primary source is a publication in which the data are published by the same
authority which gathered and analyzed them. A secondary source is a publication, reporting the
data which have been gathered by other authorities and for which others are responsible’.
Sources of Secondary data:
In most of the studies the investigator finds it impracticable to collect first-hand information on
all related issues and as such he makes use of the data collected by others. There is a vast amount
of published information from which statistical studies may be made and fresh statistics are
constantly in a state of production. The sources of secondary data can broadly be classified under
two heads:
a. Published sources, and
b. Unpublished sources.
a) Published Sources:
The various sources of published data are:
1. Reports and official publications of
i. International bodies such as the International Monetary Fund, International Finance
Corporation and United Nations Organization.
ii. Central and State Governments such as the Report of the Tandon Committee and Pay
Commission.
2. Semi-official publication of various local bodies such as Municipal Corporations and District
Boards.
3. Private publications-such as the publications of –
i. Trade and professional bodies such as the Federation of Indian Chambers of Commerce
and Institute of Chartered Accountants.
ii. Financial and economic journals such as ‘Commerce’, ‘Capital’ and ‘Indian Finance’.
iii. Annual reports of joint stock companies.
iv. Publications brought out by research agencies, research scholars, etc.
It should be noted that the publications mentioned above vary with regard to the periodicity of
publication. Some are published at regular intervals (yearly, monthly, weekly, etc.), whereas
others are ad hoc publications, i.e., with no regularity in the periodicity of publication.
b) Unpublished Sources
Not all statistical material is published. There are various sources of unpublished data, such
as records maintained by various Government and private offices and studies made by research
institutions, scholars, etc. Such sources can also be used where necessary.
When our source is secondary data, check that:
The type and objective of the situation are clear.
The purpose for which the data were collected is compatible with the present
problem.
The nature and classification of the data are appropriate to our problem.
There are no biases or misreporting in the published data.
Precautions in the use of Secondary data
The following are some of the points that are to be considered in the use of secondary data
1. How the data has been collected and processed
2. The accuracy of the data
3. How far the data has been summarized
4. How comparable the data is with other tabulations
5. How to interpret the data, especially when figures collected for one purpose are used for
another
Generally speaking, with secondary data, people have to compromise between what they want
and what they are able to find.
Merits and Demerits of Secondary Data:
1. Secondary data is cheap to obtain. Many government publications are relatively cheap
and libraries stock quantities of secondary data produced by the government, by
companies and other organizations.
2. Large quantities of secondary data can be obtained through the internet.
3. Much of the secondary data available has been collected for many years and therefore it
can be used to plot trends.
4. Secondary data is of value to:
- The government – help in making decisions and planning future policy.
- Business and industry – in areas such as marketing, and sales in order to appreciate
the general economic and social conditions and to provide information on competitors.
- Research organizations – by providing social, economic and industrial information.
Note: Data which are primary for one may be secondary for the other.
Having collected and edited the data, the next important step is to organize it, that is, to
present it in a readily comprehensible, condensed form that helps in drawing inferences from it.
It is also necessary that like items be separated from unlike ones.
The process of arranging data into classes or categories according to similarities is
technically called classification.
Classification is a preliminary step that prepares the ground for proper presentation of data.
Definitions:
Raw data: recorded information in its original collected form, whether it be counts or
measurements, is referred to as raw data.
Frequency: is the number of values in a specific class of the distribution.
Frequency distribution: is the organization of raw data in table form using classes and
frequencies.
A categorical frequency distribution is used for data that can be placed in specific categories,
such as nominal or ordinal data, e.g. marital status.
Example: a social worker collected the following data on marital status for 25
persons.(M=married, S=single, W=widowed, D=divorced)
M S D W D
S S M M M
W D S M M
W D D S S
S W W D D
Solution:
Since the data are categorical, discrete classes can be used. There are four types of marital
status: M, S, D and W. These types will be used as the classes of the distribution. We follow
the procedure below to construct the frequency distribution.
Step 1: Make a table as shown.
Class    Tally    Frequency    Percent
(1)      (2)      (3)          (4)
M
S
D
W
Step 2: Tally the data and place the result in column (2).
Step 3: Count the tally and place the result in column (3).
Step 4: Find the percentage of values in each class by using
$\% = \frac{f}{n} \times 100$
where f = frequency of the class and n = total number of values.
Percentages are not normally a part of a frequency distribution, but they can be added since
they are used in certain types of diagrammatic presentation such as pie charts.
Step 5: Find the totals for columns (3) and (4).
Combining all the steps, one can construct the following frequency distribution.
Class    Tally      Frequency    Percent
M        //////     6            24
S        ///////    7            28
D        ///////    7            28
W        /////      5            20
Total               25           100
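The tallying and percentage steps above can be checked with a short script. This is a sketch using `collections.Counter` from the Python standard library; the variable names are illustrative.

```python
from collections import Counter

# Marital status of the 25 persons (M=married, S=single, W=widowed, D=divorced).
data = """
M S D W D
S S M M M
W D S M M
W D D S S
S W W D D
""".split()

n = len(data)            # total number of values
freq = Counter(data)     # Steps 2 and 3: tally and count each class

# Step 4: percentage of values in each class, (f / n) * 100.
table = {status: (freq[status], 100 * freq[status] / n) for status in "MSDW"}

for status, (f, pct) in table.items():
    print(f"{status}  {f:2d}  {pct:.0f}%")
print(f"Total {n}")
```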
An ungrouped frequency distribution is a table of all the potential raw score values that could
possibly occur in the data, along with the number of times each actually occurred. It is often
constructed for small sets of data on a discrete variable.
Constructing an ungrouped frequency distribution:
First find the smallest and largest raw score in the collected data.
When the range of the data is large, the data must be grouped in to classes that are more than one
unit in width.
Definitions:
Cumulative Frequency Distribution (CFD): it is the tabular arrangement of class intervals
together with their corresponding cumulative frequencies. It can be of the more than or less
than type, depending on the type of cumulative frequency used.
Relative frequency (rf): it is the frequency divided by the total frequency.
Relative cumulative frequency (rcf): it is the cumulative frequency divided by the total
frequency.
7. Find the boundaries by subtracting U/2 units from the lower limits and adding U/2 units to
the upper limits. The boundaries are also half-way between the upper limit of one class and
the lower limit of the next class. Depending on what you're trying to accomplish, it may not
be necessary to find the boundaries.
8. Tally the data.
9. Find the frequencies.
10. Find the cumulative frequencies. Depending on what you're trying to accomplish, it may not
be necessary to find the cumulative frequencies.
11. If necessary, find the relative frequencies and/or relative cumulative frequencies
Example 1*:
Construct a frequency distribution for the following data.
11 29 6 33 14 31 22 27 19 20
18 17 22 38 23 21 26 34 39 27
Solutions:
Step 1: Find the highest and the lowest value H=39, L=6
Step 2: Find the range; R=H-L=39-6=33
Step 3: Select the number of classes desired using Sturges' formula:
$k = 1 + 3.32 \log_{10} n = 1 + 3.32 \log_{10}(20) \approx 5.32$, which is rounded up to k = 6.
Step 4: Find the class width: $w = R/k = 33/6 = 5.5$, which is rounded up to w = 6.
Step 5: Select the starting point, let it be the minimum observation.
6, 12, 18, 24, 30, 36 are the lower class limits.
Step 6: Find the upper class limits; e.g. the first upper class limit = 12 - U = 12 - 1 = 11
11, 17, 23, 29, 35, 41 are the upper class limits.
So combining step 5 and step 6, one can construct the following classes.
Class limits
6 – 11
12 – 17
18 – 23
24 – 29
30 – 35
36 – 41
Step 7: Find the class boundaries;
E.g. for class 1, Lower class boundary=6-U/2=5.5
Upper class boundary =11+U/2=11.5
Then continue adding w to both boundaries to obtain the remaining boundaries. By doing so one
can obtain the following classes.
Class boundary
5.5 – 11.5
11.5 – 17.5
17.5 – 23.5
23.5 – 29.5
29.5 – 35.5
35.5 – 41.5
Step 8: tally the data.
Step 9: Write the numeric values for the tallies in the frequency column.
Step 10: Find cumulative frequency.
Step 11: Find relative frequency or/and relative cumulative frequency.
The complete frequency distribution follows:
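Class limit   Class boundary   Class mark   Tally     Freq.   Cf (less than type)   Cf (more than type)   rf     rcf (less than type)
6 – 11        5.5 – 11.5       8.5          //        2       2                     20                    0.10   0.10
12 – 17       11.5 – 17.5      14.5         //        2       4                     18                    0.10   0.20
18 – 23       17.5 – 23.5      20.5         ///////   7       11                    16                    0.35   0.55
24 – 29       23.5 – 29.5      26.5         ////      4       15                    9                     0.20   0.75
30 – 35       29.5 – 35.5      32.5         ///       3       18                    5                     0.15   0.90
36 – 41       35.5 – 41.5      38.5         //        2       20                    2                     0.10   1.00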
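The steps of Example 1* can be retraced in code. The sketch below follows the same choices as the worked solution (Sturges' formula with the constant 3.32, class width rounded up, starting point at the minimum observation, unit of measurement U = 1); it is meant for checking the arithmetic rather than as a general routine, and the variable names are illustrative.

```python
import math
from itertools import accumulate

data = [11, 29, 6, 33, 14, 31, 22, 27, 19, 20,
        18, 17, 22, 38, 23, 21, 26, 34, 39, 27]
U = 1                                          # unit of measurement (whole numbers)

n = len(data)
R = max(data) - min(data)                      # Step 2: range, 39 - 6 = 33
k = math.ceil(1 + 3.32 * math.log10(n))        # Step 3: Sturges' formula, rounded up
w = math.ceil(R / k)                           # Step 4: class width, rounded up

lower = [min(data) + i * w for i in range(k)]  # Step 5: lower class limits
upper = [lo + w - U for lo in lower]           # Step 6: upper class limits
freq = [sum(lo <= x <= up for x in data) for lo, up in zip(lower, upper)]

cf_less = list(accumulate(freq))               # "less than" cumulative frequency
cf_more = [n - c + f for c, f in zip(cf_less, freq)]  # "more than" cumulative frequency
rf = [f / n for f in freq]                     # relative frequency
rcf = [c / n for c in cf_less]                 # relative cumulative frequency

for lo, up, f, cl, cm, r, rc in zip(lower, upper, freq, cf_less, cf_more, rf, rcf):
    print(f"{lo:2d} - {up:2d}  [{lo - U/2} - {up + U/2}]  mark={(lo + up)/2}  "
          f"f={f}  cf_less={cl}  cf_more={cm}  rf={r:.2f}  rcf={rc:.2f}")
```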
3.3.2 Diagrammatic and Graphic presentation of data
These are techniques for presenting data in visual displays using geometric figures and pictures.
- The three most commonly used diagrammatic presentations for discrete as well as qualitative
data are:
Pie charts
Pictogram
Bar charts
Pie chart
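A pie chart is a circle that is divided into sections or wedges according to the percentage of
frequencies in each category of the distribution. The angle of the sector for a category with
frequency f out of a total of n observations is obtained using:
$\text{angle of sector} = \frac{f}{n} \times 360^\circ$
[Figure: pie charts with sectors labelled Boys, Girls, Men and Women]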
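As an illustration of the sector angle, (f / n) × 360 degrees, the sketch below applies it to the marital status frequencies counted from the 25 persons in the earlier example (M = 6, S = 7, D = 7, W = 5). Reusing that data set here is a choice made for illustration.

```python
# Frequencies counted from the marital status data of the earlier example.
freq = {"M": 6, "S": 7, "D": 7, "W": 5}
n = sum(freq.values())   # 25 persons in total

# Angle of each sector in degrees: (f / n) * 360.
angles = {category: f / n * 360 for category, f in freq.items()}

for category, angle in angles.items():
    print(f"{category}: {angle:.1f} degrees")
```

The angles add up to 360 degrees, so the sectors fill the whole circle.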
Pictogram
In these diagrams, we represent data by means of some picture symbols. We decide about a
suitable picture to represent a definite number of units in which the variable is measured.
[Figure: pictogram in which each symbol stands for 1000 people]
Bar Charts:
- A set of bars (thick lines or narrow rectangles) representing some magnitude over time or space.
- They are useful for comparing aggregates over time or space.
- Bars can be drawn either vertically or horizontally.
- There are different types of bar charts, the most common being:
Simple bar chart
Deviation or two way bar chart
Broken bar chart
Component or sub divided bar chart.
Multiple bar charts.
-They are thick lines (narrow rectangles) having the same breadth. The magnitude of a quantity
is represented by the height /length of the bar.
Example: The following data represent the sales by product, 1957-1959, of a given company for
three products A, B and C.
Solutions:
[Figure: simple bar chart titled "Sales by product in 1957"; x-axis: product (A, B, C); y-axis: sales in $]
When there is a desire to show how a total (or aggregate) is divided into its component parts,
we use a component bar chart.
The bars represent the total value of a variable, with each total broken into its component
parts; different colors or designs are used for identification.
Example:
Draw a component bar chart to represent the sales by product from 1957 to 1959.
Solutions:
[Figure: component bar chart; x-axis: year of production (1957, 1958, 1959); y-axis: sales in $; components: Product A, Product B, Product C]
Multiple Bar charts
These are used to display data on more than one variable. They are used for comparing different
variables at the same time.
Example:
Draw a multiple bar chart to represent the sales by product from 1957 to 1959.
Solutions:
[Figure: multiple bar chart; x-axis: year of production (1957, 1958, 1959); y-axis: sales in $; bars: Product A, Product B, Product C]
- The histogram, frequency polygon and cumulative frequency graph (ogive) are the most
commonly applied graphical representations for continuous data.
Procedures for constructing statistical graphs:
Represent the class boundaries for the histogram or ogive, or the midpoints for the frequency
polygon, on the x-axis.
Plot the points.
Draw the bars or lines to connect the points.
Histogram: a graph which displays the data by using vertical bars of various heights to
represent frequencies. Class boundaries are placed along the horizontal axis. Class marks and
class limits are sometimes used as the quantity on the x-axis.
Frequency polygon: a line graph. The frequency is placed along the vertical axis and the class
midpoints (class marks) are placed along the horizontal axis. It is customary to add the next
higher and next lower class intervals, each with a frequency of zero, to make it a complete
polygon.
Example: Draw a frequency polygon for the above data (example *).
Solutions:
[Figure: frequency polygon; x-axis: value (class marks 2.5, 8.5, 14.5, 20.5, 26.5, 32.5, 38.5, 44.5); y-axis: frequency]
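The points of the frequency polygon can be listed before plotting. The sketch below takes the class marks and frequencies of Example 1* (the frequencies 2, 2, 7, 4, 3, 2 are counted from the raw data) and adds one class of zero frequency at each end, one class width w = 6 away, to close the polygon.

```python
class_marks = [8.5, 14.5, 20.5, 26.5, 32.5, 38.5]   # midpoints of the six classes
freq = [2, 2, 7, 4, 3, 2]                           # class frequencies
w = 6                                               # class width

# Add the next lower and next higher class, each with frequency zero.
xs = [class_marks[0] - w] + class_marks + [class_marks[-1] + w]
ys = [0] + freq + [0]

points = list(zip(xs, ys))
print(points)
```

Joining these points with straight lines in order gives the polygon; the x-values run from 2.5 to 44.5.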
Ogive (cumulative frequency polygon): is a graph showing the cumulative frequency (less
than or more than type) plotted against upper or lower class boundaries respectively. That is class
boundaries are plotted along the horizontal axis and the corresponding cumulative frequencies
are plotted along the vertical axis. The points are joined by a free hand curve.
Summary
Data is a general term for observations and measurements collected during any type of scientific
investigation. Based on their source, data are classified as primary or secondary. Primary data can
be collected through observation, experimentation, interviews or questionnaires. A frequency
distribution is an arrangement of data in table form using classes and frequencies. A cumulative
frequency distribution is the tabulation of a sample of observations in terms of the numbers falling
below particular values. Tabulated data are presented using either diagrammatic or graphic
presentation methods. Diagrammatic presentations such as bar charts, pie charts and pictograms
are more appropriate for discrete data, while graphic presentations such as histograms, ogives
(less-than or more-than cumulative frequency curves) and frequency polygons are appropriate for
continuous frequency distributions.
Exercise 3
1. A researcher observes aggressive behavior for a sample of n = 15 boys and classifies each boy
as high, medium, or low in terms of aggression. If the frequency distribution for these scores is
presented in a graph, what kind of graph would be appropriate?
Assignment 3: (6%)
1. What are the points that are to be considered in the use of secondary data?
2. What are the sources of secondary data?
3. Give the merits and demerits of primary data and secondary data.
4. In a survey, it was found that 64 families bought milk in the following quantities in a
particular month. Quantity of milk (in litres) bought by 64 families in a month:
19 16 22  9 22 12 39 19 14 23  6 24 16 18  7 17
20 25 28 18 10 24 20 21 10  7 18 28 24 20 14 23
25 34 22  5 33 23 26 29 13 36 11 26 11 37 30 13
 8 15 22 21 32 21 31 17 16 23 12  9 15 27 17 21
Construct a continuous frequency distribution.
Reference
Chapter 4: Measures of Central Tendency
Objectives
Contents
A single value that describes the characteristics of the entire mass of data is called a measure of
central tendency or an average.
To enable further statistical analysis
We say a measure of central tendency is best if it possesses most of the following properties. It should:
- be simple to understand and easy to calculate and interpret,
- exist and be unique,
- be rigidly defined by a mathematical formula,
- be based on all observations,
- not be seriously affected by extreme observations,
- be capable of further statistical analysis and/or algebraic manipulation.
Let a data set consist of a number of observations, represented by $x_1, x_2, \ldots, x_n$, where $n$ (the last
subscript) denotes the number of observations in the data and $x_i$ is the $i$-th observation. Then their
sum is written
$$x_1 + x_2 + \cdots + x_n = \sum_{i=1}^{n} x_i$$
For instance, a data set consisting of the six measurements 21, 13, 54, 46, 32 and 37 is represented by
$x_1, x_2, x_3, x_4, x_5$ and $x_6$, where $x_1 = 21$, $x_2 = 13$, $x_3 = 54$, $x_4 = 46$, $x_5 = 32$ and $x_6 = 37$.
Their sum becomes
$$\sum_{i=1}^{6} x_i = 21 + 13 + 54 + 46 + 32 + 37 = 203.$$
Similarly,
$$x_1^2 + x_2^2 + \cdots + x_n^2 = \sum_{i=1}^{n} x_i^2$$
Example:
12 12 12 12
12 12
Find I ) (4x 3 y ),
i 1
i i
II ) 2x ( x 7)
i 1
i i
12 12 12
Solution: I ) (4x
i 1
i
3 y ) 4 xi Y i 4(26) 3(17) 105
i
i 1 i 1
12 12 12
Several types of averages or measures of central tendency can be defined; the most common are
- the mean
- the mode
- the median
There are four types of mean: the arithmetic mean, the weighted arithmetic mean, the harmonic mean
and the geometric mean.
The arithmetic mean is defined as the sum of the measurements of the items divided by the total
number of items.
When the data are arranged or given in the form of an ungrouped frequency distribution, the
formula for the mean is
$$\bar{X} = \frac{f_1 x_1 + f_2 x_2 + \cdots + f_k x_k}{f_1 + f_2 + \cdots + f_k} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$$
Note that $\sum_{i=1}^{k} f_i = n$.
Example 1: You measure the body lengths (in inches) of 10 full-term infants at birth and record
the following:
17.5, 19.5, 17.5, 19, 20, 21, 18, 19.5, 18, 10.75
Example 2: Monthly incomes of fourth year regular students are given in the following
frequency distribution.
Monthly income (birr) 54.5 64.5 74.5 84.5 94.5 104.5 114.5
Number of students 6 9 15 25 13 7 5
Compute the mean for these data.
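The calculation asked for in Example 2 can be sketched in Python as follows. This is only an illustrative sketch: the function name `frequency_mean` and the variable names are assumptions introduced here, not part of the module.

```python
# Mean of an ungrouped frequency distribution: sum(f_i * x_i) / sum(f_i).
incomes = [54.5, 64.5, 74.5, 84.5, 94.5, 104.5, 114.5]  # monthly income (birr)
students = [6, 9, 15, 25, 13, 7, 5]                      # number of students


def frequency_mean(values, freqs):
    """Return the arithmetic mean of values weighted by their frequencies."""
    n = sum(freqs)  # total number of observations
    return sum(f * x for x, f in zip(values, freqs)) / n


mean_income = frequency_mean(incomes, students)
print(f"n = {sum(students)}, mean income = {mean_income} birr")
```

Here the frequencies add up to n = 80 and the weighted total is 6670, so the mean is 6670/80 = 83.375 birr.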
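If data are given in the form of a continuous frequency distribution, the sample mean can be
computed as
$$\bar{x} = \frac{\sum_{i=1}^{k} f_i m_i}{\sum_{i=1}^{k} f_i} = \frac{f_1 m_1 + f_2 m_2 + \cdots + f_k m_k}{f_1 + f_2 + \cdots + f_k}$$
where $m_i$ is the class mark of the $i$-th class and $f_i$ is the frequency of the $i$-th class.
Note that $\sum_{i=1}^{k} f_i = n$ = the total number of observations.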
Example: The following table gives the daily wages of laborers. Calculate the average daily
wages paid to a laborer.
The sum of the deviations of the items from their arithmetic mean is zero. This means that the
algebraic sum of the deviations of a set of numbers $x_1, x_2, \ldots, x_n$ from their mean $\bar{x}$ is zero.
That is,
$$\sum_{i=1}^{n} (x_i - \bar{x}) = 0$$
The sum of the squares of the deviations of a set of observations from any number, say A, is
minimum when $A = \bar{X}$. That is,
$$\sum (x_i - \bar{x})^2 \le \sum (x_i - A)^2$$
When a set of observations is divided into k groups and $\bar{x}_1$ is the mean of the $n_1$ observations of
group 1, $\bar{x}_2$ is the mean of the $n_2$ observations of group 2, ..., and $\bar{x}_k$ is the mean of the $n_k$
observations of group k, then the combined mean, denoted by $\bar{X}_c$, of all observations taken together is
given by
$$\bar{X}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \cdots + n_k \bar{x}_k}{n_1 + n_2 + \cdots + n_k} = \frac{\sum_{i=1}^{k} n_i \bar{x}_i}{\sum_{i=1}^{k} n_i}$$
If a wrong figure has been used in calculating the mean, we can correct the mean if we know the
correct figure that should have been used. Let
$X_{wr}$ denote the wrong figure used in calculating the mean,
$X_c$ be the correct figure that should have been used, and
$\bar{X}_{wr}$ be the wrong mean calculated using $X_{wr}$. Then the correct mean, $\bar{X}_{correct}$, is given
by
$$\bar{X}_{correct} = \frac{n\bar{X}_{wr} + X_c - X_{wr}}{n}$$
Solution:
$$\bar{x}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + n_3 \bar{x}_3}{n_1 + n_2 + n_3} = \frac{28(80) + 32(83) + 35(76)}{28 + 32 + 35} = \frac{7556}{95} = 79.54$$
Example 2: The average weight of 10 students was calculated to be 65 kg, but later it was
discovered that one measurement was misread as 40 kg instead of 80 kg. Calculate the corrected
average weight.
Solution:
$$\bar{X}_{correct} = \frac{n\bar{X}_{wr} + X_c - X_{wr}}{n} = \frac{10(65) + 80 - 40}{10} = 69$$
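The combined-mean and corrected-mean formulas can be checked with a short Python sketch. The helper names `combined_mean` and `corrected_mean` are hypothetical, chosen here only for illustration; the numbers are those of the two solutions above.

```python
def combined_mean(sizes, means):
    """Combined mean of k groups: sum(n_i * mean_i) / sum(n_i)."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


def corrected_mean(n, wrong_mean, wrong_value, correct_value):
    """Replace one wrongly recorded value in a mean of n observations."""
    return (n * wrong_mean + correct_value - wrong_value) / n


# Three groups of sizes 28, 32 and 35 with means 80, 83 and 76.
group_mean = combined_mean([28, 32, 35], [80, 83, 76])

# Ten weights with a recorded mean of 65 kg; 40 kg should have been 80 kg.
fixed_mean = corrected_mean(10, 65, wrong_value=40, correct_value=80)

print(round(group_mean, 2), fixed_mean)
```

The same `corrected_mean` idea extends to a dropped observation: remove the value from the total and divide by the reduced count.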
Exercise: The average score on the mid-term examination of 25 students was 75.8 out of 100.
After the mid-term exam, however, a student whose score was 41 out of 100 dropped the course.
What is the average/mean score among the 24 students?
In finding the arithmetic mean, all items were assumed to be equally important (each value in
the data set has equal weight). When the observations have different weights, we use a weighted
average. Weights are assigned to each item in proportion to its relative importance.
If $x_1, x_2, \ldots, x_k$ represent the values of the items and $w_1, w_2, \ldots, w_k$ are the corresponding weights, then
$$\bar{X}_w = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_k x_k}{w_1 + w_2 + \cdots + w_k} = \frac{\sum_{i=1}^{k} w_i x_i}{\sum_{i=1}^{k} w_i}$$
Example: A student’s final marks in Mathematics, Physics, Chemistry and Biology are
respectively 82, 80, 90 and 70. If the respective credits received for these courses are 3, 5, 3 and
1, determine the approximate average mark the student has got for one course.
Solution: We use a weighted arithmetic mean, the weight associated with each course being taken as
the number of credits received for the corresponding course.
$x_i$   82  80  90  70
$w_i$    3   5   3   1
Therefore
$$\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i} = \frac{(3 \times 82) + (5 \times 80) + (3 \times 90) + (1 \times 70)}{3 + 5 + 3 + 1} = \frac{986}{12} = 82.17$$
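As a quick check of the weighted mean above, a minimal Python sketch (the function name `weighted_mean` is an assumption made for this illustration):

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w_i * x_i) / sum(w_i)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


marks = [82, 80, 90, 70]   # Mathematics, Physics, Chemistry, Biology
credits = [3, 5, 3, 1]     # credit hours used as weights

average_mark = weighted_mean(marks, credits)
print(round(average_mark, 2))
```

The same function reproduces a GPA calculation when grade points are used as the values and credit hours as the weights.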
Exercise 1: If a student gets A in 4 cr. hrs, B in 3 cr. hrs and D in 2 cr. hrs courses, what is his
GPA in this semester?
Values 4 3 1
Weight 4 3 2
Answer: GPA=3
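The geometric mean (GM) is used when the observed values are measured as ratios, percentages,
proportions, indices or growth rates.
$$GM = \sqrt[n]{x_1 \cdot x_2 \cdots x_n}$$
If the observed values $x_1, x_2, \ldots, x_k$ have frequencies $f_1, f_2, \ldots, f_k$, then
$$GM = \sqrt[n]{x_1^{f_1} \cdot x_2^{f_2} \cdots x_k^{f_k}}, \qquad n = \sum_{i=1}^{k} f_i$$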
Values 2 4 6 8 10 Total
Frequencies 1 2 2 2 1 8
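Assuming the table above is meant as input for the frequency form of the geometric mean, the calculation can be sketched in Python. Working with logarithms avoids forming a very large product; the function name `geometric_mean_freq` is hypothetical.

```python
import math


def geometric_mean_freq(values, freqs):
    """GM = (x_1^f_1 * ... * x_k^f_k)^(1/n), computed through logarithms."""
    n = sum(freqs)
    log_sum = sum(f * math.log(x) for x, f in zip(values, freqs))
    return math.exp(log_sum / n)


values = [2, 4, 6, 8, 10]
freqs = [1, 2, 2, 2, 1]    # total n = 8

gm = geometric_mean_freq(values, freqs)
print(round(gm, 3))
```

The result is the 8th root of 2 x 4^2 x 6^2 x 8^2 x 10 = 737280, which is a little below the arithmetic mean of 6, as expected for positive values that are not all equal.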
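For a frequency distribution, the harmonic mean (HM) is
$$HM = \frac{\sum_{i=1}^{k} f_i}{\sum_{i=1}^{k} \frac{f_i}{x_i}} = \frac{f_1 + \cdots + f_k}{\frac{f_1}{x_1} + \cdots + \frac{f_k}{x_k}}$$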
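Example: A motorist travels 480 km on each of 3 days. She travels for 10 hours at a rate of 48 km/hr
on the 1st day, for 12 hours at a rate of 40 km/hr on the 2nd day and for 15 hours at a rate of
32 km/hr on the 3rd day. What is her average speed?
Solution: Since the distance covered is the same each day, the average speed is the harmonic mean of
the three speeds:
$$HM = \frac{3}{\frac{1}{48} + \frac{1}{40} + \frac{1}{32}} = 38.92 \text{ km/hr}$$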
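The motorist example can be verified in two ways: with the harmonic mean of the three speeds, and directly as total distance over total time. The sketch below uses `statistics.harmonic_mean` from the Python standard library; the variable names are illustrative assumptions.

```python
from statistics import harmonic_mean

speeds = [48, 40, 32]      # km/hr on days 1, 2 and 3
hours = [10, 12, 15]       # hours driven on each day

# Harmonic mean of the speeds (valid because each day covers the same distance).
hm_speed = harmonic_mean(speeds)

# Direct check: total distance divided by total time.
total_distance = sum(s * h for s, h in zip(speeds, hours))
direct_speed = total_distance / sum(hours)

print(round(hm_speed, 2), round(direct_speed, 2))
```

Both routes give the same figure because 10 x 48 = 12 x 40 = 15 x 32 = 480 km, so each speed applies to an equal distance.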
- Arithmetic mean has a rigidly defined mathematical formula so that its value is always
definite.
- It is calculated based on all observations.
- Arithmetic mean is simple to calculate and easy to understand.
- It doesn’t need arrangement of data in increasing or decreasing order.
- Arithmetic mean is also capable of further algebraic treatment.
- It affords a good standard of comparison.
The median of a set of items (numbers) arranged in order of magnitude (i.e. in an array form) is the
middle value or the arithmetic mean of the two middle values. We shall denote the median of
$x_1, x_2, \ldots, x_n$ by $\tilde{x}$. For ungrouped data the median is obtained by
F Sum of frequencies of all class lower than the median class (in other words it is the
The median class is the class with the smallest cumulative frequency greater than or equal to $n/2$.
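Example 1: The birth weights in pounds of five babies born in a hospital on a certain day are 9.2,
6.4, 10.5, 8.1 and 7.8. Find the median weight of these five babies.
Solution: Arranged in ascending order, the weights are 6.4, 7.8, 8.1, 9.2, 10.5. The middle (3rd)
value is the median, so $\tilde{X} = 8.1$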
Exercise: The following table gives the distribution of the weekly wages of employees of a small
firm.
4.4.3 The Mode
The mode refers to the value in a distribution which occurs most frequently. It is an actual value,
which has the highest concentration of items in and around it. According to Croxton and Cowden,
“The mode of a distribution is the value at the point around which the items tend to be most heavily
concentrated. It may be regarded as the most typical of a series of values”.
The mode or the modal value is denoted by $\hat{x}$. Note that the mode may not exist in the series or, even
if it does exist, it may not be unique.
$\Delta_2$ = the difference between the frequency of the modal class and the frequency of the class
that immediately follows the modal class
Example 1: The marks obtained by ten students in a semester exam in statistics are: 70, 65, 68, 70,
75, 73, 80, 70, 83 and 86. Find the mode of the students’ marks.
Solution: Mode = 70, since it occurs three times, more often than any other mark.
Example 2: Find the mode for the frequency distribution of the birth weight (in kilogram) of 30
children given below.
No. of children 5 5 9 4 4 3
Solution: 2.7-3.1 is the modal class since it has the highest frequency.
$\Delta_1 = 9 - 5 = 4$ and $\Delta_2 = 9 - 4 = 5$, $L_{mod} = 2.7$
$$\hat{x} = 2.7 + \frac{4}{4 + 5} \times 0.4 = 2.878$$
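The arithmetic in Example 2 can be reproduced with a small Python sketch. The function `grouped_mode` and its parameter names are assumptions introduced here; it takes the lower boundary of the modal class, the frequencies of the modal class and its two neighbours, and the class width.

```python
def grouped_mode(lower_boundary, f_prev, f_modal, f_next, width):
    """Mode of grouped data: L + delta1 / (delta1 + delta2) * W."""
    delta1 = f_modal - f_prev   # modal class minus the preceding class
    delta2 = f_modal - f_next   # modal class minus the following class
    return lower_boundary + delta1 / (delta1 + delta2) * width


# Modal class 2.7-3.1 with frequency 9; neighbours have frequencies 5 and 4.
mode_weight = grouped_mode(2.7, f_prev=5, f_modal=9, f_next=4, width=0.4)
print(round(mode_weight, 3))
```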
Merits of mode
Demerits of mode
- Mode may not exist in the series and if it exists it may not be a unique value.
- It does not fulfill most of the requirements of a good measure of central tendency
4.3.4 Quantiles
Quantiles are values which divide the data set, arranged in order of magnitude, into a certain number
of equal parts. They are averages of position (non-central tendency). Some of these are quartiles,
deciles and percentiles.
I. Quartiles: are values which divide the data set into four equal parts, denoted by $Q_1$, $Q_2$ and $Q_3$.
The first quartile is also called the lower quartile and the third quartile is the upper quartile. The
second quartile is the median.
For ungrouped data:
Let $Q_j$ be the $j$-th quartile value for $j = 1, 2, 3$. Then
$$Q_j = \left(\frac{j}{4}(n + 1)\right)^{th} \text{ item}; \quad j = 1, 2, 3.$$
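Example:
Compute the quartiles for the data given below: 25, 18, 30, 8, 15, 5, 10, 35, 40 and 45
Solution: arrange the data in ascending order.
5, 8, 10, 15, 18, 25, 30, 35, 40, 45
$$Q_1 = \left(\frac{1}{4}(n + 1)\right)^{th} \text{item} = \left(\frac{1}{4}(10 + 1)\right)^{th} \text{item} = (2.75)^{th} \text{ item}$$
$$= 2^{nd} \text{ item} + 0.75(3^{rd} - 2^{nd}) \text{ item} = 8 + 0.75(10 - 8) = 9.5$$
$$Q_2 = \left(\frac{2}{4}(n + 1)\right)^{th} \text{item} = \left(\frac{2}{4}(10 + 1)\right)^{th} \text{item} = (5.5)^{th} \text{ item}$$
$$= 5^{th} \text{ item} + 0.5(6^{th} - 5^{th}) \text{ item} = 18 + 0.5(25 - 18) = 21.5$$
$$Q_3 = \left(\frac{3}{4}(n + 1)\right)^{th} \text{item} = \left(\frac{3}{4}(10 + 1)\right)^{th} \text{item} = (8.25)^{th} \text{ item}$$
$$= 8^{th} \text{ item} + 0.25(9^{th} - 8^{th}) \text{ item} = 35 + 0.25(40 - 35) = 36.25$$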
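The positional rule used above can be sketched in Python. The function `positional_quantile` is a hypothetical helper: it locates the (j/parts)(n + 1)-th item and interpolates linearly between neighbouring items, so it covers quartiles (parts = 4), deciles (parts = 10) and percentiles (parts = 100). Note that library routines often use other conventions and may give slightly different values.

```python
def positional_quantile(data, j, parts):
    """Value of the (j/parts)*(n+1)-th item, interpolating between neighbours."""
    ordered = sorted(data)
    n = len(ordered)
    position = j * (n + 1) / parts      # 1-based position, may be fractional
    whole = int(position)               # item just below the position
    fraction = position - whole
    if whole < 1:                       # position falls before the first item
        return ordered[0]
    if whole >= n:                      # position falls at or after the last item
        return ordered[-1]
    lower, upper = ordered[whole - 1], ordered[whole]
    return lower + fraction * (upper - lower)


scores = [25, 18, 30, 8, 15, 5, 10, 35, 40, 45]
quartiles = [positional_quantile(scores, j, 4) for j in (1, 2, 3)]
print(quartiles)
```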
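For grouped data
We can apply the following formula:
$$Q_j = L_{Q_j} + \frac{\frac{jn}{4} - F_{Q_j}}{f_{Q_j}} \, W; \quad j = 1, 2, 3.$$
where $L_{Q_j}$ is the lower class boundary of the $j$-th quartile class, $F_{Q_j}$ is the sum of the
frequencies of all classes below that class, $f_{Q_j}$ is the frequency of that class and $W$ is the class width.
The $j$-th quartile class is the class with the smallest cumulative frequency greater than or equal to
$jn/4$.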
II. Deciles: are values dividing the data into ten equal parts, denoted by $D_1, D_2, \ldots, D_9$. The fifth
decile is the median.
For ungrouped data
Let $D_j$ be the $j$-th decile value for $j = 1, 2, \ldots, 9$. Then
$$D_j = \left(\frac{j}{10}(n + 1)\right)^{th} \text{ item}; \quad j = 1, 2, \ldots, 9$$
Example: Compute $D_3$ and $D_5$ for the data given below: 5, 24, 36, 12, 20, 8
Solution: first arrange the data in ascending order: 5, 8, 12, 20, 24, 36
$$D_3 = \left(\frac{3}{10}(n + 1)\right)^{th} \text{item} = \left(\frac{3}{10}(6 + 1)\right)^{th} \text{item} = (2.1)^{th} \text{ item}$$
$$= 2^{nd} \text{ item} + 0.1(3^{rd} - 2^{nd}) \text{ item} = 8 + 0.1(12 - 8) = 8.4$$
$$D_5 = \left(\frac{5}{10}(n + 1)\right)^{th} \text{item} = \left(\frac{5}{10}(6 + 1)\right)^{th} \text{item} = (3.5)^{th} \text{ item}$$
$$= 3^{rd} \text{ item} + 0.5(4^{th} - 3^{rd}) \text{ item} = 12 + 0.5(20 - 12) = 16$$
For grouped data
We can apply the following formula:
$$D_j = L_{D_j} + \frac{\frac{jn}{10} - F_{D_j}}{f_{D_j}} \, W; \quad j = 1, 2, \ldots, 9$$
The symbols are defined in the same way as in the case of quartiles.
The $j$-th decile class is the class with the smallest cumulative frequency greater than or equal to $jn/10$.
III. Percentiles: are values which divide the data into one hundred equal parts, denoted by $P_1, P_2, \ldots, P_{99}$.
The fiftieth percentile is the median.
$$P_j = L_{P_j} + \frac{\frac{jn}{100} - F_{P_j}}{f_{P_j}} \, W; \quad j = 1, 2, 3, \ldots, 99$$
Interpretations
1. $Q_j$ is the value below which $(j \times 25)$ percent of the observations in the series are found (where
$j = 1, 2, 3$). For instance, $Q_3$ is the value below which 75 percent of the observations in the given
series are found.
2. $D_j$ is the value below which $(j \times 10)$ percent of the observations in the series are found (where
$j = 1, 2, \ldots, 9$). For instance, $D_4$ is the value below which 40 percent of the values are found in the
series.
3. $P_j$ is the value below which $j$ percent of the total observations are found (where $j = 1, 2, 3, \ldots, 99$).
For example, 73 percent of the observations in a given series are below $P_{73}$.
Summary
A measure of central tendency is a typical value around which other figures congregate. An
average stands for the whole group of which it forms a part, yet represents the whole. One of the
most widely used sets of summary figures is known as measures of location. There are five
averages. Among them the mean, median and mode are called simple averages, and the other two
averages, the geometric mean and the harmonic mean, are called special averages. It is possible to
find the modal value of any data set, either qualitative or quantitative. We cannot compute the
arithmetic mean of a frequency distribution with open-ended classes, while it is still possible to
compute the median and the mode.
Exercise 4
1. The mean will provide the best measure of central tendency for any possible set of data.
2. Changing the value of a score in a distribution will always change the value of the mean.
3. For any normal distribution, the mean and the median will have the same value.
Identify the choice that best completes the statement or answers the question.
4. A population of scores has ∑ 𝑋 = 60 and a mean of 𝑥̅ = 12. How many scores are in this
population?
a. 5 c. 60
b. 12 d. None
5. One sample of n = 4 scores has a mean of 𝑥̅ = 10, and a second sample of n = 8 scores has a
mean of 𝑥̅ = 20. If the two samples are combined, the mean for the combined sample will be
a. equal to 15 c. less than 15 but more than 10
b. greater than 15 but less than 20 d. None of the other choices is correct.
Assignment 4: (5%)
1. Arithmetic mean of 50 observations was 100. At the time of calculations two observations
180 and 90 were wrongly taken as 100 and 10. Find the corrected mean?
2. Complete the following frequency distribution?
Class limit frequency
11-20 12
21-30 30
31-40 𝑓3
41-50 65
51-60 𝑓5
61-70 25
71-80 18
Total 229
Reference
Chapter 5: Measures of Variation (Dispersion), Skewness and Kurtosis
Objectives
At the end of this unit the learners will be able to:
explain objectives of measures of central tendency
Compute range, variance, standard deviation, CV and Z-score for raw/summarized data.
Identify the shape and peakedness of frequency curves
Contents
These values may be used to compare the variation in two distributions provided that the
variables are in the same units and of the same average size.
The range is the largest score minus the smallest score. It is a quick and dirty measure of
variability, although when a test is given back to students they very often wish to know the range
of scores.
If data are given in the shape of continuous frequency distribution, the range is computed as:
Relative Range (RR): it is also sometimes called the Coefficient of Range (CR) and is given by:
$$RR = \frac{L - S}{L + S} = \frac{R}{L + S}$$
This is sometimes expressed as:
$$RR = \frac{x_{max} - x_{min}}{x_{max} + x_{min}} = \frac{R}{x_{max} + x_{min}} \quad \text{for ungrouped data}$$
$$RR = \frac{M_{last} - M_{first}}{M_{last} + M_{first}} = \frac{R}{M_{last} + M_{first}} \quad \text{for grouped data}$$
Example: Find the range and coefficient of range for the data set: 7, 9, 8, 6, 11, 10, 4
Solution: L = 11, S = 4, R = L - S = 11 - 4 = 7
$$CR = \frac{L - S}{L + S} = \frac{11 - 4}{11 + 4} = \frac{7}{15} = 0.4667$$
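The range and coefficient of range in this example can be computed with a few lines of Python. This is a minimal sketch; the variable names are chosen here for illustration.

```python
values = [7, 9, 8, 6, 11, 10, 4]

largest, smallest = max(values), min(values)              # L and S
data_range = largest - smallest                           # R = L - S
coefficient_of_range = data_range / (largest + smallest)  # CR = (L - S) / (L + S)

print(data_range, round(coefficient_of_range, 4))
```

Because the coefficient of range is a ratio, it has no unit, which is what makes it usable for comparing series measured in different units.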
Example: If the range and relative range of a series are 4 and 0.25 respectively, what are the
values of the smallest and largest observations?
Solution:
$$R = 4 \Rightarrow L - S = 4 \qquad (1)$$
$$RR = 0.25 \Rightarrow \frac{L - S}{L + S} = 0.25 \Rightarrow L + S = 16 \qquad (2)$$
Solving (1) and (2) at the same time, one obtains the following values:
L = 10 and S = 6
Example: Find the values of the range and relative range for the following frequency
distribution: which shows the distribution of the maximum loads supported by a certain number
of cables.
Maximum load(in kilo-Newton) Number of cables
93 – 97 2
118 – 122 6
123 – 127 3
128 – 132 1
Solution:
Merits and Demerits of Range:
Merits:
1. It is simple to understand.
2. It is easy to calculate.
3. In certain types of problems like quality control, weather forecasts, share price analysis, etc.,
range is most widely used.
Demerits:
1. It is very much affected by the extreme items.
2. It is based on only two extreme observations.
3. It cannot be calculated from open-end class intervals.
4. It is not suitable for mathematical treatment.
5. It is a very rarely used measure.
5.2.2 The Quartile Deviation (Semi-interquartile range, Q.D) and Coefficient of Quartile
Deviation (C.Q.D)
The interquartile range is the difference between the third and the first quartiles of a set of items,
and the semi-interquartile range is half of the interquartile range.
$$Q.D = \frac{Q_3 - Q_1}{2}$$
Coefficient of Quartile Deviation (C.Q.D)
It gives the average amount by which the two quartiles differ from the median.
$$C.Q.D = \frac{(Q_3 - Q_1)/2}{(Q_3 + Q_1)/2} = \frac{2 \times Q.D}{Q_3 + Q_1} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$
Example 1: Find the Quartile Deviation and its coefficient for the following data: 391, 384,
591, 407, 672, 522, 777, 733, 1490, 2488
Solution: Arrange the given values in ascending order: 384, 391, 407, 522, 591, 672, 733, 777,
1490, 2488.
$Q_1 = 403$, $Q_3 = 955.25$
$$Q.D = \frac{Q_3 - Q_1}{2} = \frac{955.25 - 403}{2} = 276.125$$
$$C.Q.D = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{955.25 - 403}{955.25 + 403} = 0.4066$$
Example 2: Compute Q.D and its coefficient for the following distribution.
Values Frequency
140- 150 17
150- 160 29
160- 170 42
170- 180 72
180- 190 84
190- 200 107
200- 210 49
210- 220 34
220- 230 31
230- 240 16
240- 250 12
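Solution: Here $Q_1 = 174.90$ and $Q_3 = 203.83$.
$$Q.D = \frac{Q_3 - Q_1}{2} = \frac{203.83 - 174.90}{2} = 14.47$$
$$C.Q.D = \frac{2 \times Q.D}{Q_3 + Q_1} = \frac{2 \times 14.47}{203.83 + 174.90} = 0.076$$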
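The quartiles used in this solution come from the grouped-data formula, and the whole calculation can be sketched in Python. The helper `grouped_quantile` is hypothetical; it assumes equal class widths and takes the lower class boundaries and the frequencies of the table in Example 2.

```python
def grouped_quantile(lower_bounds, freqs, width, j, parts=4):
    """Grouped-data quantile: L + (j*n/parts - F) / f * W."""
    n = sum(freqs)
    target = j * n / parts
    cumulative = 0                       # F: frequency below the current class
    for lower, f in zip(lower_bounds, freqs):
        if cumulative + f >= target:     # first class whose cum. freq. reaches target
            return lower + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("target position lies beyond the last class")


lower_bounds = [140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240]
freqs = [17, 29, 42, 72, 84, 107, 49, 34, 31, 16, 12]

q1 = grouped_quantile(lower_bounds, freqs, width=10, j=1)
q3 = grouped_quantile(lower_bounds, freqs, width=10, j=3)
qd = (q3 - q1) / 2
cqd = (q3 - q1) / (q3 + q1)
print(round(q1, 2), round(q3, 2), round(qd, 2), round(cqd, 3))
```

With n = 493, the first quartile falls in the class 170-180 and the third in the class 200-210, which is where the values 174.90 and 203.83 come from.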
Remark: Q.D or C.Q.D includes only the middle 50% of the observation.
The mean deviation of a set of items is defined as the arithmetic mean of the values of the
absolute deviations from a given average. Depending upon the type of average used, we have
different mean deviations.
The mean deviation (MD) measures the average deviation of a set of observations about their
central value, generally the mean or the median, ignoring the plus/minus sign of the deviations.
$$MD = \frac{\sum |x_i - A|}{n}$$
where A is a central measure (the mean, the median or the mode).
In the case of grouped data, the formula for MD becomes
$$MD = \frac{\sum f_i |m_i - A|}{n}$$
where $m_i$ is the class mark of the $i$-th class, $f_i$ is the frequency of the $i$-th
class and $n = \sum f_i$.
a) Mean Deviation about the Mean
$$MD = \frac{\sum |x_i - \bar{x}|}{n} \quad \text{for ungrouped data}$$
$$MD = \frac{\sum f_i |m_i - \bar{x}|}{n} \quad \text{for a grouped frequency distribution}$$
where $m_i$ is the class mark of the $i$-th class, $f_i$ is the frequency of the $i$-th class and $n = \sum f_i$.
b) Mean Deviation about the Median
$$MD = \frac{\sum |x_i - \tilde{x}|}{n} \quad \text{for ungrouped data}$$
$$MD = \frac{\sum f_i |m_i - \tilde{x}|}{n} \quad \text{for a grouped frequency distribution}$$
where $m_i$ is the class mark of the $i$-th class, $f_i$ is the frequency of the $i$-th class and $n = \sum f_i$.
Steps to calculate M.D($\tilde{X}$):
1. Find the median, $\tilde{X}$.
2. Find the deviations of each reading from $\tilde{X}$.
3. Find the arithmetic mean of the deviations, ignoring sign.
c) Mean Deviation about the Mode
Denoted by M.D($\hat{X}$) and given by:
$$M.D(\hat{X}) = \frac{\sum_{i=1}^{n} |X_i - \hat{X}|}{n}$$
For the case of a frequency (grouped) distribution it is given as:
$$M.D(\hat{X}) = \frac{\sum_{i=1}^{k} f_i |m_i - \hat{X}|}{n}$$
1. The following are the numbers of visits made by ten mothers to the local doctor’s surgery: 8, 6,
5, 5, 7, 4, 5, 9, 7, 4
Find the mean deviation about the mean, median and mode.
Solutions:
First calculate the three averages:
$\bar{X} = 6$, $\tilde{X} = 5.5$, $\hat{X} = 5$
Then take the absolute deviations of each observation from these averages.
$X_i$            4    4    5    5    5    6    7    7    8    9    Total
$|X_i - 6|$      2    2    1    1    1    0    1    1    2    3    14
$|X_i - 5.5|$    1.5  1.5  0.5  0.5  0.5  0.5  1.5  1.5  2.5  3.5  14
$|X_i - 5|$      1    1    0    0    0    1    2    2    3    4    14
Each total is 14, so the three mean deviations are equal. For example,
$$M.D(\tilde{X}) = \frac{\sum_{i=1}^{10} |X_i - 5.5|}{10} = \frac{14}{10} = 1.4$$
and likewise $M.D(\bar{X}) = 1.4$ and $M.D(\hat{X}) = 1.4$.
In general, the coefficient of mean deviation is $CMD = \frac{MD}{A}$, where A is a measure of central
tendency: the arithmetic mean or the median.
That is, CMD about the arithmetic mean is given by $CMD = \frac{MD}{\bar{x}}$, where MD is the mean
deviation calculated about the arithmetic mean. On the other hand, CMD about the median is
given by $CMD = \frac{MD}{\tilde{x}}$, in which case MD is calculated about the median of the observations.
Example: Calculate the C.M.D about the mean, median and mode for the data in the example above.
Solutions:
$$C.M.D(\bar{X}) = \frac{M.D(\bar{X})}{\bar{X}} = \frac{1.4}{6} = 0.233$$
$$C.M.D(\tilde{X}) = \frac{M.D(\tilde{X})}{\tilde{X}} = \frac{1.4}{5.5} = 0.255$$
$$C.M.D(\hat{X}) = \frac{M.D(\hat{X})}{\hat{X}} = \frac{1.4}{5} = 0.28$$
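The mean deviations and their coefficients for the ten mothers' visits can be reproduced with the Python standard library. This is an illustrative sketch; `mean_deviation` and the dictionary names are assumptions made here.

```python
from statistics import mean, median, mode

visits = [8, 6, 5, 5, 7, 4, 5, 9, 7, 4]


def mean_deviation(data, center):
    """Average absolute deviation of the data from a chosen central value."""
    return sum(abs(x - center) for x in data) / len(data)


centers = {"mean": mean(visits), "median": median(visits), "mode": mode(visits)}

results = {}
for name, center in centers.items():
    md = mean_deviation(visits, center)
    results[name] = (md, md / center)          # (MD, coefficient of MD)
    print(f"{name}: centre = {center}, MD = {md}, CMD = {md / center:.3f}")
```

It is a coincidence of this data set that all three mean deviations are equal; the coefficients differ only because the divisors 6, 5.5 and 5 differ.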
1. It is not based on all the items. It is based on two positional values, $Q_1$ and $Q_3$, and
ignores the extreme 50% of the items.
2. It is not amenable to further mathematical treatment.
3. It is affected by sampling fluctuations.
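The Variance
Variance is the arithmetic mean of the squares of the deviations of observations from their
arithmetic mean.
Population Variance ($\sigma^2$)
If we divide the variation (the sum of the squared deviations from the mean) by the number of values
in the population, we get something called the population variance. This variance is the "average
squared deviation from the mean".
For ungrouped data:
$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{1}{N}\left[\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{N}\right]$$
where $\mu$ is the population arithmetic mean and $N$ is the total number of observations in the population.
For grouped data:
$$\sigma^2 = \frac{\sum f_i (m_i - \mu)^2}{N} = \frac{1}{N}\left[\sum f_i m_i^2 - \frac{\left(\sum f_i m_i\right)^2}{N}\right]$$
where $\mu$ is the population arithmetic mean, $m_i$ is the class mark of the $i$-th class, $f_i$ is the
frequency of the $i$-th class and $N = \sum f_i$.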
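Sample Variance ($S^2$)
One would expect the sample variance to simply be the population variance with the population
mean replaced by the sample mean. However, that formula tends to underestimate the population
variance. To counteract this, the sum of the squares of the deviations is divided by one less than
the sample size.
For ungrouped data:
$$S^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{1}{n - 1}\left[\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}\right]$$
where $\bar{x}$ is the sample arithmetic mean and $n$ is the total number of observations in the sample.
For grouped data:
$$S^2 = \frac{\sum f_i (m_i - \bar{x})^2}{n - 1} = \frac{1}{n - 1}\left[\sum f_i m_i^2 - \frac{\left(\sum f_i m_i\right)^2}{n}\right]$$
where $\bar{x}$ is the sample arithmetic mean, $m_i$ is the class mark of the $i$-th class, $f_i$ is the
frequency of the $i$-th class and $n = \sum f_i$.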
Standard Deviation
There is a problem with variances. Recall that the deviations were squared. That means that the
units were also squared. To get the units back the same as the original data values, the square
root must be taken.
Standard deviation is the positive square root of the variance.
Examples: Find the variance and standard deviation of the following sample data
1. 5, 17, 12, 10.
2. The data are given in the form of frequency distribution.
Class Frequency
40-44 7
45-49 10
50-54 22
55-59 15
60-64 12
65-69 6
70-74 3
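Solutions:
1. $\bar{X} = 11$
$X_i$                   5   10   12   17   Total
$(X_i - \bar{X})^2$    36    1    1   36   74
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = \frac{74}{3} = 24.67$$
$$S = \sqrt{S^2} = \sqrt{24.67} = 4.97$$
2. $\bar{X} = 55$
$m_i$ (M.P.)   42   47   52   57   62   67   72
$$S^2 = \frac{\sum_{i=1}^{k} f_i (m_i - \bar{X})^2}{n - 1} = \frac{4400}{74} = 59.46$$
$$S = \sqrt{S^2} = \sqrt{59.46} = 7.71$$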
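Both solutions can be checked with a short Python sketch. The first part uses `statistics.variance` and `statistics.stdev`, which divide by n - 1; the grouped part is written out by hand with the class marks 42, 47, ..., 72. The variable names are assumptions made for illustration.

```python
from math import sqrt
from statistics import stdev, variance

# 1. Raw sample data.
sample = [5, 17, 12, 10]
s2_raw = variance(sample)        # sample variance, divisor n - 1
s_raw = stdev(sample)

# 2. Grouped data: class marks and frequencies of the classes 40-44, ..., 70-74.
class_marks = [42, 47, 52, 57, 62, 67, 72]
freqs = [7, 10, 22, 15, 12, 6, 3]

n = sum(freqs)
grouped_mean = sum(f * m for m, f in zip(class_marks, freqs)) / n
sum_sq_dev = sum(f * (m - grouped_mean) ** 2 for m, f in zip(class_marks, freqs))
s2_grouped = sum_sq_dev / (n - 1)
s_grouped = sqrt(s2_grouped)

print(round(s2_raw, 2), round(s_raw, 2))
print(grouped_mean, sum_sq_dev, round(s2_grouped, 2), round(s_grouped, 2))
```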
Properties of the Variance and the Standard Deviation
Variance
It removes most of the demerits or drawbacks of the measures of dispersion discussed so far.
Its unit is the square of the unit of measurement of values. For example, if the variable is
measured in kg, the unit of variance is kg2.
It is calculated based on all the observations/data in the series.
It gives more weight to extreme values and less to those which are near to the mean.
Standard Deviation
Special Properties of Standard Deviations
1. $$\frac{\sum (X_i - \bar{X})^2}{n - 1} < \frac{\sum (X_i - A)^2}{n - 1}, \quad A \ne \bar{X}$$
2. For a normal (symmetric) distribution the following holds true.
Approximately 68.27% of the data values fall within one standard deviation of the mean, i.e.
within $(\bar{X} - S, \bar{X} + S)$
Approximately 95.45% of the data values fall within two standard deviations of the mean, i.e.
within $(\bar{X} - 2S, \bar{X} + 2S)$
Approximately 99.73% of the data values fall within three standard deviations of the mean,
i.e. within $(\bar{X} - 3S, \bar{X} + 3S)$
3. Chebyshev's Theorem
For any data set, no matter what the pattern of variation, the proportion of the values that fall
within k standard deviations of the mean, or $(\bar{X} - kS, \bar{X} + kS)$, will be at least $1 - \frac{1}{k^2}$, where k
is any number greater than 1; i.e. the proportion of items falling beyond k standard deviations of
the mean is at most $\frac{1}{k^2}$.
Example: Suppose a distribution has mean 50 and standard deviation 6. What percent of the
numbers are:
a) Between 38 and 62
b) Between 32 and 68
c) Less than 38 or more than 62.
d) Less than 32 or more than 68.
Solutions:
a) 38 and 62 are at equal distance from the mean, 50, and this distance is 12:
50 - 38 = 62 - 50 = 12
kS = 12, so k = 12/S = 12/6 = 2
Applying the above theorem, at least $\left(1 - \frac{1}{k^2}\right) \times 100\% = 75\%$ of the numbers lie between 38
and 62.
b) Similarly done.
c) It is just the complement of a), i.e. at most $\frac{1}{k^2} \times 100\% = 25\%$ of the numbers are less than 38
or more than 62.
d) Similarly done.
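Parts a) to d) all follow the same pattern, which can be sketched in Python. The function `chebyshev_bounds` is a hypothetical helper; it assumes the interval is symmetric about the mean, as in this example.

```python
def chebyshev_bounds(mean, sd, low, high):
    """Return k, the minimum share inside (low, high) and the maximum share outside."""
    if abs((mean - low) - (high - mean)) > 1e-12:
        raise ValueError("interval must be symmetric about the mean")
    k = (high - mean) / sd
    if k <= 1:
        raise ValueError("Chebyshev's theorem needs k > 1")
    inside = 1 - 1 / k ** 2
    return k, inside, 1 - inside


k_a, inside_a, outside_a = chebyshev_bounds(50, 6, 38, 62)   # parts a) and c)
k_b, inside_b, outside_b = chebyshev_bounds(50, 6, 32, 68)   # parts b) and d)

print(f"k = {k_a}: at least {inside_a:.2%} inside, at most {outside_a:.2%} outside")
print(f"k = {k_b}: at least {inside_b:.2%} inside, at most {outside_b:.2%} outside")
```

These are guaranteed bounds for any distribution; for a normal distribution the actual proportions inside are considerably higher.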
Compare the relative dispersions of the two departments’ scores using the appropriate way.
Solution:
Solutions:
Calculate the coefficient of variation for both firms.
$$C.V_A = \frac{S_A}{\bar{X}_A} \times 100 = \frac{10}{52.5} \times 100 = 19.05\%$$
$$C.V_B = \frac{S_B}{\bar{X}_B} \times 100 = \frac{11}{47.5} \times 100 = 23.16\%$$
Since $C.V_A < C.V_B$, there is greater variability in individual wages in firm B.
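The comparison of the two firms can be sketched in Python. The function name `coefficient_of_variation` is an assumption made here; the means and standard deviations are those used in the solution above.

```python
def coefficient_of_variation(sd, mean):
    """C.V = (standard deviation / mean) * 100, expressed in percent."""
    return sd / mean * 100


cv_a = coefficient_of_variation(sd=10, mean=52.5)   # firm A
cv_b = coefficient_of_variation(sd=11, mean=47.5)   # firm B

more_variable = "A" if cv_a > cv_b else "B"
print(round(cv_a, 2), round(cv_b, 2), more_variable)
```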
A standard score is a measure that describes the relative position of a single score in the entire
distribution of scores in terms of the mean and standard deviation. It also gives us the number of
standard deviations a particular observation lies above or below the mean.
Population standard score: $Z = \frac{x - \mu}{\sigma}$, where $x$ is the value of the observation, and $\mu$ and $\sigma$ are the
mean and standard deviation of the population respectively.
Sample standard score: $Z = \frac{x - \bar{x}}{S}$, where $x$ is the value of the observation, and $\bar{x}$ and $S$ are the mean
and standard deviation of the sample respectively.
Interpretation:
Example: Two sections were given an exam in a course. The average score was 72 with standard
deviation of 6 for section 1 and 85 with standard deviation of 5 for section 2. Student A from
section 1 scored 84 and student B from section 2 scored 90. Who performed better relative to
his/her group?
Z-score of student A: $Z = \frac{x_A - \bar{x}_1}{S_1} = \frac{84 - 72}{6} = 2.00$
Z-score of student B: $Z = \frac{x_B - \bar{x}_2}{S_2} = \frac{90 - 85}{5} = 1.00$
From these two standard scores, we can conclude that student A has performed better relative to
his/her section because his/her score is two standard deviations above the mean score of
section 1, while the score of student B is only one standard deviation above the mean score of
section 2.
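The comparison of the two students can be written as a short Python sketch (the function name `z_score` is chosen here for illustration):

```python
def z_score(x, mean, sd):
    """Number of standard deviations by which x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


z_a = z_score(84, mean=72, sd=6)   # student A, section 1
z_b = z_score(90, mean=85, sd=5)   # student B, section 2

better = "A" if z_a > z_b else "B"
print(z_a, z_b, better)
```

Although student B has the higher raw score, student A has the higher standard score, which is why the comparison is made on Z rather than on the marks themselves.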
Example Two sections were given introduction to statistics examinations. The following
information was given.
Solutions:
$$Z_{MARU} = \frac{X_A - \bar{X}_1}{S_1} = \frac{90 - 78}{6} = 2$$
$$Z_{HANA} = \frac{X_B - \bar{X}_2}{S_2} = \frac{95 - 90}{5} = 1$$
Student MARU performed better relative to his section because the score of student MARU
is two standard deviations above the mean score of his section, while the score of student HANA
is only one standard deviation above the mean score of HANA's section.
5.3 Moments
1. The rth moment
$$\overline{X^r} = \frac{X_1^r + X_2^r + \cdots + X_n^r}{n} = \frac{\sum_{i=1}^{n} X_i^r}{n}$$
- For the case of a frequency distribution this is expressed as:
$$\overline{X^r} = \frac{\sum_{i=1}^{k} f_i X_i^r}{n}$$
- If $r = 1$, it is the simple arithmetic mean; this is called the first moment.
- This is sometimes called the moment about the origin.
2. The rth moment about the mean (the rth central moment)
Denoted by $M_r$ and defined as:
$$M_r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^r}{n} = \frac{(n - 1)}{n} \cdot \frac{\sum_{i=1}^{n} (X_i - \bar{X})^r}{n - 1}$$
For the case of a frequency distribution this is expressed as:
$$M_r = \frac{\sum_{i=1}^{k} f_i (X_i - \bar{X})^r}{n}$$
If $r = 2$, it is the population variance; this is called the second central moment. If we assume
$n - 1 \approx n$, it is also the sample variance.
3. The rth moment about any number A
- Denoted by $M'_r$ and defined as:
$$M'_r = \frac{\sum_{i=1}^{n} (X_i - A)^r}{n}$$
- For the case of a frequency distribution this is expressed as:
$$M'_r = \frac{\sum_{i=1}^{k} f_i (X_i - A)^r}{n}$$
Example:
1. Find the first two(about the origin) moments for the following set of numbers: 2, 3, 7
2. Find the first three central moments of the numbers in problem 1
3. Find the third moment about the number 3 of the numbers in problem 1.
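Solutions:
1. Use the rth moment formula.
$$\overline{X^r} = \frac{\sum_{i=1}^{n} X_i^r}{n}$$
$$\overline{X^1} = \frac{2 + 3 + 7}{3} = 4 = \bar{X}$$
$$\overline{X^2} = \frac{2^2 + 3^2 + 7^2}{3} = 20.67$$
2. Use the rth central moment formula.
$$M_r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^r}{n}$$
$$M_1 = \frac{(2 - 4) + (3 - 4) + (7 - 4)}{3} = 0$$
$$M_2 = \frac{(2 - 4)^2 + (3 - 4)^2 + (7 - 4)^2}{3} = 4.67$$
$$M_3 = \frac{(2 - 4)^3 + (3 - 4)^3 + (7 - 4)^3}{3} = 6$$
3. Use the rth moment about A.
$$M'_r = \frac{\sum_{i=1}^{n} (X_i - A)^r}{n}$$
$$M'_3 = \frac{(2 - 3)^3 + (3 - 3)^3 + (7 - 3)^3}{3} = 21$$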
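The three kinds of moments in this example differ only in the point about which deviations are taken, so one small function covers them all. The sketch below is illustrative; `moment_about` is a hypothetical helper name.

```python
def moment_about(data, r, a=0.0):
    """r-th moment about the number a: sum((x - a)^r) / n (a = 0 is the origin)."""
    return sum((x - a) ** r for x in data) / len(data)


numbers = [2, 3, 7]

raw_moments = [moment_about(numbers, r) for r in (1, 2)]                # about the origin
mean_value = raw_moments[0]
central_moments = [moment_about(numbers, r, mean_value) for r in (1, 2, 3)]
third_about_3 = moment_about(numbers, 3, a=3)

print([round(m, 2) for m in raw_moments])
print([round(m, 2) for m in central_moments])
print(third_about_3)
```

Note that the first central moment is always zero, which restates the property that deviations from the mean sum to zero.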
5.4 Skewness
[Figure: frequency curves of symmetrical, positively skewed and negatively skewed distributions.]
Measures of Skewness
$$\alpha_3 = \frac{\text{Mean} - \text{Mode}}{\text{Standard deviation}} = \frac{\bar{X} - \hat{X}}{S}$$
5.5 Kurtosis
Mesokurtic (normal curve): If the frequency distribution is unimodal and if the curve is bell
shaped and symmetrical.
Leptokurtic: If the frequency distribution is more peaked and narrow topped than normal, i.e.
large numbers of observations have high frequency.
Platykurtic: If the frequency distribution is less peaked and flat topped than normal, i.e. large
numbers of observations have low frequency.
Measures of Kurtosis
The moment coefficient of kurtosis:
$$\alpha_4 = \frac{M_4}{M_2^2} = \frac{M_4}{\sigma^4}$$
$\alpha_4 > 3$: Leptokurtic
$\alpha_4 = 3$: Mesokurtic
$\alpha_4 < 3$: Platykurtic
Examples
Solutions:
a) $$\alpha_3 = \frac{M_3}{M_2^{3/2}} = \frac{-60}{16^{3/2}} = -0.94 < 0$$
The distribution is negatively skewed.
b) $$\alpha_4 = \frac{M_4}{M_2^2} = \frac{162}{16^2} = 0.63 < 3$$
The curve is platykurtic.
2. Suppose the mean, the mode, and the standard deviation of a certain distribution are 32, 30.5
and 10 respectively. What is the shape of the curve representing the distribution?
Solutions:
Use the Pearsonian coefficient of skewness:
$$\alpha_3 = \frac{\text{Mean} - \text{Mode}}{\text{Standard deviation}} = \frac{32 - 30.5}{10} = 0.15$$
Since $\alpha_3 > 0$, the distribution is positively skewed.
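The decision rule used in this solution can be sketched in Python. The function names `pearson_mode_skewness` and `describe_skewness` are hypothetical, introduced only for this illustration.

```python
def pearson_mode_skewness(mean, mode, sd):
    """Pearsonian coefficient of skewness: (mean - mode) / standard deviation."""
    return (mean - mode) / sd


def describe_skewness(coefficient):
    """Classify the shape of the distribution from the sign of the coefficient."""
    if coefficient > 0:
        return "positively skewed"
    if coefficient < 0:
        return "negatively skewed"
    return "symmetrical"


coefficient = pearson_mode_skewness(mean=32, mode=30.5, sd=10)
shape = describe_skewness(coefficient)
print(round(coefficient, 2), shape)
```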
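3. In a frequency distribution, the coefficient of skewness based on the quartiles is given to be
0.5. If the sum of the upper and lower quartiles is 28 and the median is 11, find the values of
the upper and lower quartiles.
Solutions:
$$Q_3 + Q_1 = 28 \qquad (*)$$
$$\frac{Q_3 + Q_1 - 2\tilde{X}}{Q_3 - Q_1} = \frac{28 - 2(11)}{Q_3 - Q_1} = 0.5 \Rightarrow Q_3 - Q_1 = 12 \qquad (**)$$
Solving (*) and (**) at the same time we obtain the following values:
$Q_1 = 8$ and $Q_3 = 20$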
Summary
A measure of dispersion, or simply dispersion, may be defined as a statistic signifying the extent of the
scatteredness of items around a measure of central tendency. A measure of dispersion may be expressed in
an “absolute form” or in a “relative form”. It is said to be in an absolute form when it states the actual
amount by which the values of the items on average deviate from a measure of central tendency.
Absolute measures are expressed in concrete units, i.e. the units in which the data have been
expressed. Range is the crudest measure of dispersion. By far the most universally used and the most
useful measure of dispersion is the standard deviation or root mean square deviation about the mean.
Skewness is concerned with the shape of the curve, while the measure of kurtosis exhibits the extent to
which the curve is more peaked or more flat topped than the normal curve.
Exercises 5
3. The following data is for the monthly salary of eighteen workers in a certain paint factory
given below.
Frequency 2 6 2 1 1 1 2 1 1 1
Salary 462 480 534 624 498 552 606 588 516 570
a. Find the range and relative range
b.Find the variance and standard deviation
c. Find the mean deviation and coefficient of MD about the mode
4. The scores of a special test of knowledge of wood refinishing have a mean of 53 and a
standard deviation of 6. Find the range of values in which at least 75% of the scores will lie.
5. The mean and the standard deviation of a set of numbers are respectively 500 and 10.
a. If 10 is added to each of the numbers in the set, then what will be the variance and standard
deviation of the new set?
b. If each of the numbers in the set are multiplied by -5, then what will be the variance and
standard deviation of the new set?
Assignment 5: (5%)
1. Two groups of people were trained to perform a certain task and tested to find out which
group is faster to learn the task. For the two groups the following information was given:
a) Which group is more consistent in its performance?
b) Suppose person A from group one takes 9.2 minutes while person B from group two takes
9.3 minutes. Who was faster in performing the task? Why?
3. For a moderately skewed frequency distribution, the mean is 10 and the median is 8.5. If the
coefficient of variation is 20%, find the Pearsonian coefficient of skewness and the probable
mode of the distribution.
4. The sum of fifteen observations, whose mode is 8, was found to be 150, with a coefficient of
variation of 20%.
(a) Calculate the Pearsonian coefficient of skewness and give an appropriate conclusion.
(b) What is the shape of this distribution?
(c) If a constant k is added to each observation, what will be the new Pearsonian coefficient
of skewness? Show your steps. What do you conclude from this?
Reference
Chapter 6: Simple Linear Regression and Correlation
6.1 Introduction
In this chapter we shall see the relationships between different variables and the closely related
techniques of correlation and linear regression for investigating the linear association between
two continuous variables. Correlation measures the closeness of the association, while linear
regression gives the equation of the straight line that best describes it and enables the prediction
of one variable from the other. For example, how does consumption change as family income
changes? Is there a relation between income and expenditure? In the community, what is the relation
between the socio-economic status of residents and the extent to which health care is available?
All these questions concern the relationship
between two variables, each measured on the same units of observation, be they animals,
patients, or communities. Correlation and regression constitute the statistical techniques for
investigating such relationships.
Scatter diagram (plot): It is the simplest method of studying the relationship between two
variables diagrammatically. It is a two-dimensional plot of a sample of bivariate observations.
The diagram is an important aid in assessing what type of relationship links the two variables. It
is a plot of all ordered pairs (x, y) on the coordinate plane, which is needed to discover whether
the relationship between the two variables is indeed best explained by a straight line.
Correlation analysis: is a statistical technique that can be used to describe the degree to which
one variable is linearly related to another variable.
Correlation is the method of analysis to use when studying the possible association between two
continuous variables. If we want to measure the degree of association, this can be done by
calculating the correlation coefficient. The standard method (Pearson correlation) leads to a
quantity called r which can take any value from -1 to +1. This correlation coefficient r measures
the degree of 'straight-line' association between the values of two variables. Thus a value of +1.0
or -1.0 is obtained if all the points in a scatter plot lie on a perfect straight line (see Figure 9.1).
The correlation between two variables is positive if higher values of one variable are associated
with higher values of the other, and negative if one variable tends to be lower as the other gets
higher. A value of r = 0 implies there is no linear relationship between the two variables, but there could
be a non-linear relationship between them. In other words, when two variables are independent, r
= 0, but when r = 0, it is not necessarily true that the variables are independent.
What are we measuring with r? In essence r is a measure of the scatter of the points around an
underlying linear trend: the greater the spread of the points the lower the correlation. The
correlation coefficient usually calculated is called Pearson's r or the 'product-moment' correlation
coefficient (other coefficients are used for ranked data, etc.).
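If we have two variables x and y, the correlation between them, denoted by r(x, y) or simply r, is given
by
\[
r = \frac{\operatorname{Cov}(x, y)}{sd(x)\, sd(y)}
  = \frac{\dfrac{\sum (x - \bar{x})(y - \bar{y})}{n - 1}}{\sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1} \cdot \dfrac{\sum (y - \bar{y})^2}{n - 1}}}
  = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}}
\]
\[
r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right)\left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}}
\]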
The numerator is termed as the sum of products of x and y, SPxy. In the denominator, the first
term is called the sum of squares of x, SSx, and the second term is called the sum of squares of y,
SSy. Thus,
\[ r = \frac{SP_{xy}}{\sqrt{SS_x \, SS_y}} \]
Interpretation of r
1. Perfect positive linear relationship (if r = 1)
2. Perfect negative linear relationship (if r = -1)
3. Some positive linear relationship (if r is between 0 and 1)
4. No linear relationship (if r = 0)
5. Some negative linear relationship (if r is between -1 and 0)
Uses of correlation:
2. It is useful for economists to study the relationship between variables like price, quantity etc.
Businessmen estimate costs, sales, price etc.
3. It is helpful in measuring the degree of relationship between the variables like income and
expenditure, price and supply, supply and demand etc.
Example 9.1: Compute the value of Pearson’s correlation coefficient based on the study of Age
(X) and Blood pressure (Y) of a person.
Age = X   Blood pressure = Y   XY      X²     Y²
70        152                  10640   4900   23104
n = 6 and
\[ r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2]\,[n\sum Y^2 - (\sum Y)^2]}} \]
Interpretation: There is a strong positive linear relationship between age and blood pressure.
The above formula and procedure is applicable for quantitative data. When we have qualitative
data (efficiency, honesty, intelligence and others), we go for Spearman’s Rank Correlation
Coefficient ( rs ).
It is a measure of correlation based on the ranks of the observations and not on their actual magnitudes
(values). It is useful for studying qualitative attributes like honesty, colour, beauty,
intelligence, character, morality etc. The individuals in the group can be arranged in order,
thereby obtaining for each individual a number showing his/her rank in the group.
\[ r_s = 1 - \frac{6 \sum D_i^2}{n(n^2 - 1)} \]
Example: The following are rankings of seven football players by two Coaches.
Player Coach-1=X Coach-2=Y Di=Xi-Yi Di²
A 4 4 0 0
B 1 2 -1 1
C 6 5 1 1
D 5 6 -1 1
E 3 1 2 4
F 2 3 -1 1
G 7 7 0 0
$\sum D_i^2 = 8$
n = 7 and
\[ r_s = 1 - \frac{6\sum D_i^2}{n(n^2 - 1)} = 1 - \frac{6(8)}{7(7^2 - 1)} = 0.857 \]
which is close to 1.
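The rank correlation above can be checked with a short Python sketch. The function name `spearman_rs` is only an illustrative choice, and the formula used is the one given above, which assumes there are no tied ranks.

```python
def spearman_rs(rank_x, rank_y):
    """Spearman's rank correlation for untied ranks: 1 - 6*sum(D^2) / (n*(n^2 - 1))."""
    if len(rank_x) != len(rank_y):
        raise ValueError("both rankings must have the same length")
    n = len(rank_x)
    # D_i is the difference between the two ranks given to the same individual
    sum_d2 = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Rankings of players A..G by the two coaches, taken from the table above
coach_1 = [4, 1, 6, 5, 3, 2, 7]
coach_2 = [4, 2, 5, 6, 1, 3, 7]

r_s = spearman_rs(coach_1, coach_2)
print(round(r_s, 3))
```

Identical rankings give a value of 1, and completely reversed rankings give -1.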
Regression analysis: is concerned with bringing out the nature of the relationship and using it to
find the best approximate value of one variable corresponding to a known value of the other
variable. Simple linear regression deals with the method of fitting a straight line (regression line) to
a sample of data on two variables, expressed as an equation, so that if the value of one variable is given
we can predict the value of the other variable.
If we have two variables under study, one may represent the cause and the other may represent
the effect. The variable representing the cause is known as the independent (predictor or regressor)
variable and is usually denoted by X. The variable representing the effect is known as the
dependent (predicted) variable and is usually denoted by Y. If the relationship between the
two variables is a straight line, it is known as simple linear regression.
When there are more than two variables and one of them is assumed to depend on the
others, the functional relationship between the variables is known as multiple linear regression.
To see the type of relationship, it is advisable to prepare a scatter plot before fitting the
model.
There are two principal purposes for building a regression model. The first, and most common,
purpose is to build a predictive model, for example in situations in which age and gender are
used to predict normal values of lung size or body mass index (BMI). Normal values are the
range of values that occur naturally in the general population. In developing a model to predict
normal values, the emphasis is on building an accurate predictive model.
The second purpose of using a regression model is to examine the effect of an explanatory
variable on an outcome variable after adjusting for other important explanatory factors. These
types of models are used for hypothesis testing.
Simple Regression: a regression with one dependent variable (Y) and one
independent variable (X).
Multiple Regression: a regression with one dependent variable (Y) and two or
more independent variables (X).
Simple Linear Regression: a regression in which there is a linear relationship between the dependent
variable (Y) and the independent variable (X).
Two variables X and Y are said to be linearly related if their relationship can be expressed by the
simple linear model (referred to as the regression of Y on X):
Y = α + βX + ε          (9.1)
where: Y = dependent variable
X = independent variable
α = regression constant
β = regression slope
ε = random disturbance term
Y ~ N(α + βX, σ²)
ε ~ N(0, σ²)
There exists a linear relationship between the dependent and the independent variable(s).
Error terms are assumed to be normally distributed with zero mean and constant variance σ²
(homoscedasticity, or equal variance of the εi), i.e. εi ~ N(0, σ²).
The values of the predictor variables are fixed in repeated sampling.
There is no autocorrelation between the disturbances.
There is variability in the X values (the X values in a given sample must not all be the same).
There is no perfect multicollinearity. That is, there are no perfect linear relationships among
the explanatory variables (Gujarati, 2004).
To estimate the parameters (α and β) we have several methods; some of them are:
The least squares estimation procedure uses the criterion that the solution must
give the smallest possible sum of squared deviations of the observed Yi from the
estimates of their true means provided by the solution. Let a and b be the numerical estimates of
the parameters α and β respectively, and let
\[ \hat{Y}_i = a + bX_i \]
be the estimated mean of Y for each Xi, i = 1, . . . , n. Note that Ŷi is obtained by substituting the
estimates for the parameters in the functional form of the model relating E(Yi) to Xi.
Here a is a constant which gives the value of Y when X = 0; it is called the Y-intercept. b is a
constant indicating the slope of the regression line, and it gives a measure of the change in Y for
a unit change in X. It is also called the regression coefficient of Y on X.
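Let Yi be the observed value and Ŷi = a + bXi the estimated value. Minimizing
\[ SSE = \sum (Y_i - \hat{Y}_i)^2 \]
gives
\[ b = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2} \]
\[ a = \bar{Y} - b\bar{X} \]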
Example 6.3: The following data show the scores of 12 students in the Accounting and Fundamentals
of Statistics examinations.
Accounting X Statistics Y
1 74.00 81.00
2 93.00 86.00
3 55.00 67.00
4 41.00 35.00
5 23.00 30.00
6 92.00 100.00
7 64.00 55.00
8 40.00 52.00
9 71.00 76.00
10 33.00 24.00
11 30.00 48.00
12 71.00 87.00
a)
\[ r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{n\sum X^2 - (\sum X)^2}\,\sqrt{n\sum Y^2 - (\sum Y)^2}} = \frac{12 \times 48407 - 687 \times 741}{\sqrt{12 \times 45591 - 687^2}\,\sqrt{12 \times 52525 - 741^2}} = 0.9194 \]
The coefficient of correlation (r) has a value of 0.92. This indicates that the two variables are
strongly positively correlated (Y tends to increase as X increases).
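b. Using OLS:
\[ b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - (\sum X)^2} = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2} = 0.9560 \]
\[ a = \bar{Y} - b\bar{X} = 7.0194 \]
\[ \hat{Y} = 7.0194 + 0.9560X \]
For X = 85, the predicted value is Ŷ = 7.0194 + 0.9560(85) = 88.28.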
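The calculations of Example 6.3 can be reproduced with the following minimal Python sketch. It uses only the computational formulas for r, b and a given in this chapter. The function and variable names are illustrative assumptions, not part of the module.

```python
from math import sqrt

# Scores of the 12 students from the table in Example 6.3
accounting_scores = [74, 93, 55, 41, 23, 92, 64, 40, 71, 33, 30, 71]   # X
statistics_scores = [81, 86, 67, 35, 30, 100, 55, 52, 76, 24, 48, 87]  # Y


def fit_line_and_r(x, y):
    """Return (a, b, r) for the least squares line Y = a + bX and Pearson's r."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    sum_y2 = sum(yi ** 2 for yi in y)

    sp_xy = n * sum_xy - sum_x * sum_y   # n*sum(XY) - sum(X)*sum(Y)
    ss_x = n * sum_x2 - sum_x ** 2       # n*sum(X^2) - (sum X)^2
    ss_y = n * sum_y2 - sum_y ** 2       # n*sum(Y^2) - (sum Y)^2

    b = sp_xy / ss_x                     # slope
    a = sum_y / n - b * sum_x / n        # intercept: Ybar - b*Xbar
    r = sp_xy / sqrt(ss_x * ss_y)        # correlation coefficient
    return a, b, r


a, b, r = fit_line_and_r(accounting_scores, statistics_scores)
predicted = a + b * 85                   # predicted Statistics score for X = 85
print(round(r, 4), round(b, 4), round(a, 4), round(predicted, 2))
```

Small differences in the last decimal place relative to hand calculation can arise from rounding b before computing a.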
Example 6.4: The following hypothetical data set shows the income (X) and monthly food expenditure (Y)
of households in hundreds of birr.
X     Y     XY      X²      Y²
4.8   3.7   17.76   23.04   13.69
5.5   4     22      30.25   16
a. Ŷ = a + bX
\[ b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - (\sum X)^2} = \frac{11(294.97) - (65.45)(44.43)}{11(472.19) - (65.45)^2} = 0.38 \]
and a = Ȳ − bX̄ = 1.79.
Interpretation: whenever income (X) is zero, the expenditure on food will be 1.79 hundred birr (Birr 179),
and for every one-birr increase in income, about 38% of it will be spent on food.
b.
\[ r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2]\,[n\sum Y^2 - (\sum Y)^2]}} = 0.9829 \]
Summary
Correlation measures the closeness of the association, while linear regression gives the equation
of the straight line that best describes it and enables the prediction of one variable from the other.
Spearman's rank correlation is useful for studying qualitative attributes like honesty, colour, beauty, intelligence,
character, morality etc. The individuals in the group can be arranged in order, thereby
obtaining for each individual a number showing his/her rank in the group. Simple linear
regression deals with the method of fitting a straight line (regression line) to a sample of data on two
variables, expressed as an equation, so that if the value of one variable is given we can predict the
value of the other variable.
Exercise 6
1. If cov(x, y) = 0 then
A. x and y are correlated C. x and y are linearly related
B. x and y are uncorrelated D. none
2. Limits for correlation coefficient.
A. –1 ≤r ≤1
B. 0 ≤r ≤1
C. –1 ≤r ≤0
D. 1≤ r ≤2
1. Given the bivariate data:
X: 1 5 3 2 1 1 7 3
Y: 6 1 0 0 1 2 1 5
a) Fit a regression line of Y on x and hence predict Y if x=10.
b) Fit a regression line of x on y and hence predict X if y=2.5
Assignment 6: (10%)
1. For 10 observations on price (x) and supply (y) the following data were obtained (in
appropriate units): ∑x = 130, ∑y = 220, ∑x² = 2288, ∑y² = 5506, ∑xy = 3467.
a. Find the correlation coefficient between supply and price.
b. Fit the linear regression model of y on x and estimate the supply when the price is 16 units.
2. The grades of a class of 9 students on a midterm report (x) and on the final examination (y)
are as follows:
x 77 50 71 72 81 94 96 99 67
Y 82 66 78 34 47 85 99 99 68
References
Chapter 7: Elementary Probability
Objectives
At the end of this unit the students will be able to:
Define probability and identify probability experiment
Explain some terms of probability
Explain and identify the counting rule for a given problem
List the axioms of probability
Compute the probability/conditional probability of an event
Contents:
7.1 Definitions of terms of probability
7.2 Counting rule (addition, multiplication, permutation, combination)
7.3 Approaches of measuring probability (Classical, Frequentist, Subjective)
7.4 Conditional probability and independence
7.5 Basic Concepts of Probability Distributions
Outcome: The result of a single trial of a random experiment. Each outcome in a sample space is
called an element or a member of the sample space, or simply a sample point.
3. Sample Space: The set of all possible outcomes of a statistical experiment is called the
sample space and is represented by the symbol S.
4. Event: It is a subset of the sample space. It is a statement about one or more outcomes of a
random experiment. Events are denoted by capital letters.
Example: Considering the above experiment let A be the event of odd numbers, B be the event
of even numbers, and C be the event of number 8.
Solutions:
3+3+3=9
If a choice consists of k steps, of which the first can be made in n1 ways, the second
in n2 ways, …, and the kth in nk ways, then the whole choice can be made in
n1 × n2 × … × nk ways.
Example 1: The digits 0, 1, 2, 3, and 4 are to be used in a 4-digit identification card. How many
different cards are possible if
a) Repetitions are permitted.
b) Repetitions are not permitted.
Solutions:
a) Each of the four positions can be filled in 5 ways: 5 × 5 × 5 × 5 = 625 cards.
b) The positions can be filled in 5, 4, 3 and 2 ways respectively: 5 × 4 × 3 × 2 = 120 cards.
Example 2: How many sample points are there in the sample space when a pair of dice is thrown
once?
Solution: The first die can land face-up in any one of n1 = 6 ways. For each of these 6 ways, the
second die can also land face-up in n2 = 6 ways. Therefore, the pair of dice can land in n1n2 =
(6)×(6) = 36 possible ways.
Example 3: The personnel department of a large corporation wishes to issue each employee an ID
card with two letters followed by a two-digit number. How many possible ID cards can be
issued?
Solution
K1 K2 K3 K4
26 26 10 10
The number of possible ID cards is 26 × 26 × 10 × 10 = 67,600.
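The multiplication rule in Examples 1 to 3 can be checked by brute-force enumeration. The sketch below uses Python's standard `itertools` module to list the arrangements explicitly. The variable names are illustrative, and a card starting with 0 is assumed to count as a valid 4-digit card.

```python
from itertools import permutations, product
from math import prod

digits = "01234"

# Example 1a: 4-digit cards with repetition allowed (5 choices at each of 4 steps)
with_repetition = len(list(product(digits, repeat=4)))

# Example 1b: 4-digit cards without repetition (5 * 4 * 3 * 2 choices)
without_repetition = len(list(permutations(digits, 4)))

# Example 2: sample points when a pair of dice is thrown once (6 * 6)
dice_pairs = len(list(product(range(1, 7), repeat=2)))

# Example 3: two letters followed by two digits (26 * 26 * 10 * 10)
id_cards = prod([26, 26, 10, 10])

print(with_repetition, without_repetition, dice_pairs, id_cards)
```

Enumeration is only practical for small problems, while the product n1 × n2 × … × nk gives the count directly.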
c) Permutation
\[ {}_nP_r = \frac{n!}{(n - r)!} \]
3. The number of permutations of n objects in which k1 are alike, k2 are alike, …, kn are alike
is given by:
\[ \frac{n!}{k_1! \, k_2! \cdots k_n!} \]
4. The number of permutations of n objects arranged in a circle is (n − 1)!.
Examples:
1. Suppose we have the letters A, B, C, D.
a) How many permutations are there taking all four?
b) How many permutations are there taking two letters at a time?
2. How many different permutations can be made from the letters in the word
“CORRECTION”?
Solutions:
1.
a) Here n = 4 and there are four distinct objects, so there are 4! = 24 permutations.
b) Here n = 4, r = 2; then
\[ {}_4P_2 = \frac{4!}{(4 - 2)!} = \frac{24}{2} = 12 \text{ permutations.} \]
2. Here n = 10, of which 2 are C, 2 are O, 2 are R, 1 is E, 1 is T, 1 is I and 1 is N, so the number of permutations is
\[ \frac{10!}{2!\,2!\,2!\,1!\,1!\,1!\,1!} = 453{,}600 \]
Example: In one year, three awards (research, teaching, and service) will be given to a class of
25 graduate students in a statistics department. If each student can receive at most one
award, how many possible selections are there?
Solution: Since the awards are distinguishable, it is a permutation problem. The total number of
sample points is:
\[ {}_{25}P_3 = \frac{25!}{(25 - 3)!} = \frac{25!}{22!} = (25)(24)(23) = 13{,}800 \]
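The permutation rules above can be sketched in Python with the standard `math` module (`math.perm` requires Python 3.8 or later). The helper names `arrangements_with_repeats` and `circular_permutations` are illustrative assumptions, not standard functions.

```python
from collections import Counter
from math import factorial, perm


def arrangements_with_repeats(word):
    """n! / (k1! * k2! * ... ) where the k's count how often each letter repeats."""
    total = factorial(len(word))
    for count in Counter(word).values():
        total //= factorial(count)
    return total


def circular_permutations(n):
    """Number of ways to arrange n distinct objects in a circle: (n - 1)!"""
    return factorial(n - 1)


all_four = perm(4, 4)          # A, B, C, D taken all at a time
two_at_a_time = perm(4, 2)     # A, B, C, D taken two at a time
correction = arrangements_with_repeats("CORRECTION")
awards = perm(25, 3)           # three distinct awards among 25 students
four_in_circle = circular_permutations(4)

print(all_four, two_at_a_time, correction, awards, four_in_circle)
```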
d) Combination
Combination Rule
The number of combinations of r objects selected from n objects is denoted by nCr or $\binom{n}{r}$ and
is given by the formula:
\[ \binom{n}{r} = \frac{n!}{(n - r)!\, r!} \]
Examples:
n = 9, r = 5
\[ \binom{9}{5} = \frac{9!}{4!\,5!} = 126 \text{ ways} \]
2. Among 15 clocks there are two defectives. In how many ways can an inspector choose three of
the clocks for inspection so that:
a) There is no restriction.
b) None of the defective clocks is included.
c) Only one of the defective clocks is included.
d) Both of the defective clocks are included.
Solutions:
a) If there is no restriction, select three clocks from the 15 clocks. This can be done in:
n = 15, r = 3
\[ \binom{15}{3} = \frac{15!}{12!\,3!} = 455 \text{ ways} \]
b) \[ \binom{2}{0}\binom{13}{3} = 286 \text{ ways} \]
c) \[ \binom{2}{1}\binom{13}{2} = 156 \text{ ways} \]
d) \[ \binom{2}{2}\binom{13}{1} = 13 \text{ ways} \]
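The clock example can be verified with `math.comb` (Python 3.8 or later). This is a small sketch with illustrative variable names. It also shows that the three restricted cases together account for every unrestricted selection.

```python
from math import comb

defective, good, chosen = 2, 13, 3

no_restriction = comb(defective + good, chosen)          # any 3 of the 15 clocks
none_defective = comb(defective, 0) * comb(good, 3)      # 0 defective, 3 good
one_defective = comb(defective, 1) * comb(good, 2)       # 1 defective, 2 good
both_defective = comb(defective, 2) * comb(good, 1)      # 2 defective, 1 good

print(no_restriction, none_defective, one_defective, both_defective)
# The cases are mutually exclusive and exhaustive, so they add up to the total
print(none_defective + one_defective + both_defective == no_restriction)
```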
There are four different conceptual approaches to the study of probability theory. These are:
Definition: If a random experiment with N equally likely outcomes is conducted and, out of these,
N_A outcomes are favourable to the event A, then the probability that event A occurs, denoted P(A),
is defined as:
\[ P(A) = \frac{N_A}{N} = \frac{\text{No. of outcomes favourable to } A}{\text{Total number of outcomes}} = \frac{n(A)}{n(S)} \]
Examples:
Solutions:
S = {1, 2, 3, 4, 5, 6}
N = n(S) = 6
Total ways in which A occurs = $\binom{30}{4}\binom{50}{6}$ = N_A = n(A)
\[ P(A) = \frac{n(A)}{n(S)} = \frac{\binom{30}{4}\binom{50}{6}}{\binom{80}{10}} = 0.265 \]
c) Let A be the event that all will be non-defective.
Total ways in which A occurs = $\binom{30}{0}\binom{50}{10}$ = N_A = n(A)
\[ P(A) = \frac{n(A)}{n(S)} = \frac{\binom{30}{0}\binom{50}{10}}{\binom{80}{10}} = 0.00624 \]
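The two probabilities above can be recomputed with the classical definition P(A) = n(A)/n(S). The sketch below assumes, based on the binomial coefficients shown, that 10 items are drawn from a lot of 80 containing 30 defective and 50 non-defective items. The function name and parameters are illustrative.

```python
from math import comb


def prob_defectives(defective_drawn, n_defective=30, n_good=50, n_drawn=10):
    """P(exactly `defective_drawn` defectives in the sample) = n(A) / n(S)."""
    good_drawn = n_drawn - defective_drawn
    favourable = comb(n_defective, defective_drawn) * comb(n_good, good_drawn)  # n(A)
    total = comb(n_defective + n_good, n_drawn)                                 # n(S)
    return favourable / total


p_four_defective = prob_defectives(4)   # 4 defective and 6 non-defective
p_none_defective = prob_defectives(0)   # all 10 non-defective

print(round(p_four_defective, 4), round(p_none_defective, 5))
```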
3. Some customers prefer to see the merchandise but then make their purchase later using Lee’s
Lights’ new Internet site. Tracking customer behavior, Lee determines that there’s a 9%
chance of a customer making a purchase in this way. We know that about 30% of customers
make purchases when they enter the store.
Question:
a) What is the probability that a customer who enters the store makes no purchase at all?
Answer: We can use the Addition Rule because the alternatives “no purchase,” “purchase in the
store” and “purchase online” are disjoint events, so
P(no purchase) = 1 − (0.30 + 0.09) = 0.61
4. Lee notices that when two customers enter the store together, their behavior isn’t independent.
In fact, there’s a 20% chance they’ll both make a purchase, while
P(A purchases) = P(B purchases) = 0.30.
Question: When two customers enter the store together, what is the probability that at least one
of them will make a purchase?
Answer: The two events are not disjoint (both customers may purchase), so we must use the General
Addition Rule:
P(at least one purchases) = P(A purchases or B purchases)
= P(A purchases) + P(B purchases) − P(A and B both purchase)
= 0.30 + 0.30 − 0.20 = 0.40
b) The Frequentist Approach (Empirical Approach)
Example:
1. Records show that 60 out of 100,000 bulbs produced are defective. What is the probability
that a newly produced bulb is defective?
Solution:
Let A be the event that the newly produced bulb is defective.
\[ P(A) = \lim_{N \to \infty} \frac{N_A}{N} \approx \frac{60}{100{,}000} = 0.0006 \]
2. The National Center for Health Statistics reported that of every 539 deaths in recent years, 24
resulted from automobile accidents, 182 from cancer, and 333 from other diseases. What is
the probability that a particular death is due to an automobile accident?
Solution: P(automobile) = deaths due to automobile accidents / total deaths = 24/539 ≈ 0.045
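The idea behind the frequentist approach, that the relative frequency N_A/N settles down as N grows, can be illustrated with a small simulation. This is only a sketch: the event probability 0.5 (a fair coin), the seed and the function name are assumptions chosen for illustration.

```python
import random


def relative_frequency(n_trials, p_event, seed=0):
    """Simulate n_trials independent trials and return N_A / N for an event of probability p_event."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n_trials) if rng.random() < p_event)
    return hits / n_trials


# The relative frequency of heads approaches 0.5 as the number of tosses grows
for n in (100, 10_000, 200_000):
    print(n, relative_frequency(n, 0.5))

# Empirical probability from the bulb records in Example 1
print(60 / 100_000)
```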
Let E be a random experiment and S be a sample space associated with E. With each event A we associate a
real number, called the probability of A, that satisfies the following properties, called the axioms of
probability or postulates of probability.
3. If A and B are mutually exclusive events, the probability that one or the other occurs equals
the sum of the two probabilities, i.e. P(A ∪ B) = P(A) + P(B), since P(A ∩ B) = 0.
4. P(A′) = 1 − P(A)
5. 0 ≤ P(A) ≤ 1
6. P(ø) = 0, where ø is the impossible event.
Remark: Venn diagrams can be used to solve probability problems.
[Venn diagrams: A ∪ B, A ∩ B, A]
In general, P(A ∪ B) = P(A) + P(B) − P(A ∩ B) for any events A and B.
Examples
1. John is going to graduate from the economics department of a university by the end of the
semester. After being interviewed at two companies he likes, he assesses that his probability
of getting an offer from company A is 0.8, and his probability of getting an offer from
company B is 0.6. If he believes that the probability that he will get offers from both
companies is 0.5, what is the probability that he will get at least one offer from these two
companies?
Solution: using the additive rule, we have
𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵) − 𝑃(𝐴 ∩ 𝐵) = 0.8 + 0.6 − 0.5 = 0.9
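The general addition rule used in this solution is easy to wrap in a small helper. The function name `prob_union` and its input checks are illustrative assumptions. The numbers are those of the examples in this section.

```python
def prob_union(p_a, p_b, p_both):
    """General addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    for p in (p_a, p_b, p_both):
        if not 0 <= p <= 1:
            raise ValueError("probabilities must lie between 0 and 1")
    if p_both > min(p_a, p_b):
        raise ValueError("P(A and B) cannot exceed P(A) or P(B)")
    return p_a + p_b - p_both


# John receives at least one offer
p_at_least_one_offer = prob_union(0.8, 0.6, 0.5)

# At least one of two customers entering together makes a purchase
p_at_least_one_purchase = prob_union(0.30, 0.30, 0.20)

print(round(p_at_least_one_offer, 2), round(p_at_least_one_purchase, 2))
```

For mutually exclusive events the last argument is 0 and the rule reduces to P(A) + P(B).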
2. Find the errors in each of the following statements.
a. The probability that it will rain tomorrow is 0.40, and the probability that it will not rain
tomorrow is 0.52.
b. The probabilities that a printer will make 0, 1, 2, 3, or 4 or more mistakes in setting a
document are, respectively, 0.19, 0.34, −0.25, 0.43, and 0.29.
Solutions:
a. Let A = the event that it will rain tomorrow and B = the event that it will not rain tomorrow. Then if
P(A) = 0.40, P(B) must equal 1 − P(A), i.e. P(B) = 1 − 0.40 = 0.60 ≠ 0.52.
b. Here there are two errors. First, the probability of an event cannot be negative
(P(2) = −0.25). Second, even if the sign is ignored and P(2) is taken as 0.25, the
probabilities sum to 1.50, and the sum of the probabilities cannot be greater than 1.
3. The probability that an American industry will locate in Shanghai, China, is 0.7, the
probability that it will locate in Beijing, China, is 0.4, and the probability that it will
locate in either Shanghai or Beijing or both is 0.8. What is the probability that the
industry will locate
a. in both cities?
b. in neither city?
Solution: Let A = the event that the industry will locate in Shanghai and
B = the event that the industry will locate in Beijing.
P(A) = 0.7, P(B) = 0.4 and P(A ∪ B) = 0.8
a. P(A ∩ B) = P(A) + P(B) − P(A ∪ B) = 0.7 + 0.4 − 0.8 = 0.3
b. P(A′ ∩ B′) = 1 − P(A ∪ B) = 1 − 0.8 = 0.2
Conditional Events: If the occurrence of one event has an effect on the occurrence of the
other event, then the two events are conditional or dependent events.
Conditional probability:
The conditional probability of an event A in relation to B is defined as the probability that event
A occurs given that event B has already occurred. It is denoted P(A | B) and given by
\[ P(A \mid B) = \frac{P(A \cap B)}{P(B)},\ P(B) \neq 0 \quad \text{or} \quad P(B \mid A) = \frac{P(A \cap B)}{P(A)},\ P(A) \neq 0 \]
(3). \[ P(B \mid A) = \frac{P(A \cap B)}{P(A)} \]
(4). \[ P(B) = P(B \mid A)\,P(A) + P(B \mid A')\,P(A') \]
Example:
1. Suppose we have two red and three white balls in a bag.
a. Draw a ball with replacement.
Let A = the event that the first draw is red and B = the event that the second draw is red. Then P(B) = 2/5; A and B are independent.
b. Draw a ball without replacement.
Let B = the event that the second draw is red given that the first draw is red. Then P(B) = 1/4.
2. The probability that a regularly scheduled flight departs on time is P (D) = 0.83; the
probability that it arrives on time is P (A) = 0.82; and the probability that it departs and arrives
on time is P (D ∩A) = 0.78. Find the probability that a plane
a. Arrives on time, given that it departed on time, and
b. Departed on time, given that it has arrived on time.
Solutions: Using the definitions we have the following.
(a) The probability that a plane arrives on time, given that it departed on time, is
P(A | D) = P(D ∩ A) / P(D) = 0.78 / 0.83 = 0.94.
(b) The probability that a plane departed on time, given that it has arrived on time, is
P(D | A) = P(D ∩ A) / P(A) = 0.78 / 0.82 = 0.95.
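The flight example can be reproduced with a one-line helper for conditional probability. The function name `conditional` is an illustrative assumption. The last lines also check whether departing and arriving on time are independent by comparing P(D ∩ A) with P(D)·P(A).

```python
def conditional(p_joint, p_given):
    """P(A | B) = P(A and B) / P(B), defined only when P(B) > 0."""
    if p_given <= 0:
        raise ValueError("the conditioning event must have positive probability")
    return p_joint / p_given


p_depart, p_arrive, p_both = 0.83, 0.82, 0.78

p_arrive_given_depart = conditional(p_both, p_depart)   # P(A | D)
p_depart_given_arrive = conditional(p_both, p_arrive)   # P(D | A)

print(round(p_arrive_given_depart, 2), round(p_depart_given_arrive, 2))

# Independence would require P(D and A) = P(D) * P(A)
independent = abs(p_both - p_depart * p_arrive) < 1e-9
print(independent)
```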
ii. The selected individual is unemployed given that she is a female.
iii. The selected individual is employed.
iv. The selected individual is a female given that he/she is unemployed.
Solutions: Let M = Male, F = Female, E = Employed and U = Unemployed.
i. P(M | E) = P(M ∩ E) / P(E) = (23/45) / (2/3) = 460/600 = 23/30
ii. P(U | F) = P(F ∩ U) / P(F) = 260/400 = 13/20
iii. P(E) = 600/900 = 2/3
iv. P(F | U) = P(F ∩ U) / P(U) = 260/300 = 13/15
4. For a student enrolling as a freshman at a certain university, the probability is 0.25 that he/she will
get a scholarship and 0.75 that he/she will graduate. If the probability is 0.2 that he/she will
get a scholarship and will also graduate, what is the probability that a student who gets a
scholarship graduates?
Solution: Let A = the event that a student will get a scholarship and B = the event that a student
will graduate. Then P(B | A) = P(A ∩ B) / P(A) = 0.2 / 0.25 = 0.8.
Examples:
1. A box contains four black and six white balls. What is the probability of getting two
black balls in drawing one after the other under the following conditions?
a. The first ball drawn is not replaced
b. The first ball drawn is replaced
2. A small town has one fire engine and one ambulance available for emergencies. The
probability that the fire engine is available when needed is 0.98, and the probability that
the ambulance is available when called is 0.92. In the event of an injury resulting from a
burning building, find the probability that both the ambulance and the fire engine will be
available, assuming they operate independently.
3. The probability that a married man watches a certain television show is 0.4, and the
probability that a married woman watches the show is 0.5. The probability that a man
watches the show, given that his wife does, is 0.7. Find the probability that
a. a married couple watches the show;
b. a wife watches the show, given that her husband does;
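Solutions:
2. Let A and B represent the respective events that the fire engine and the ambulance are
available. Since they operate independently, P(A ∩ B) = P(A)·P(B) = (0.98)(0.92) = 0.9016.
3. Let P(M) = 0.4, P(W) = 0.5 and P(M | W) = 0.7.
a. P(W ∩ M) = P(M | W)·P(W) = 0.7 × 0.5 = 0.35
b. P(W | M) = P(W ∩ M) / P(M) = 0.35 / 0.4 = 0.875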
A random variable is a variable whose values are determined by chance.
Formally, if X is a random variable, then it is a function from the elements of the sample space
to the set of real numbers, i.e.
X is a function X: S → R
Example: Flip a coin three times, and let X be the number of heads in three tosses.
Discrete random variable: a variable which can assume only a specific number of values
which are clearly separated and can be counted.
Example:
b. Toss coin n times and count the number of heads.
c. Number of Children in a family.
d. Number of car accidents per week.
e. Number of defective items in a given company.
f. Number of bacteria per two cubic centimeter of water.
Every value of a discrete random variable X has a probability associated with it. These probabilities
collectively are known as a probability mass function, which can be used to obtain probabilities associated with
the random variable.
Let X be a discrete random variable; then the probability mass function is given by
f(x) = P(X = x), for real number x.
The set of ordered pairs (x, P(x)) is a probability function, probability mass function, or
probability distribution of the discrete random variable X if, for each possible outcome x,
1. $P(x) \ge 0$
2. $\sum_x P(X = x) = 1$
3. $P(X = u) = f(u)$ for any u
Example: Consider the experiment of tossing a coin three times. Let X be the number of heads.
Construct the probability distribution of X.
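One way to construct this distribution is to list the 8 equally likely outcomes and count the heads in each. The following Python sketch does this with exact fractions. It assumes a fair coin, and the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space of three tosses: ('H','H','H'), ('H','H','T'), ...
outcomes = list(product("HT", repeat=3))

# X = number of heads in each outcome; count how many outcomes give each value
counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability mass function P(X = x) = (outcomes with x heads) / (all outcomes)
pmf = {x: Fraction(count, len(outcomes)) for x, count in sorted(counts.items())}

for x, p in pmf.items():
    print(x, p)

# The two defining conditions of a probability mass function
print(all(p >= 0 for p in pmf.values()), sum(pmf.values()) == 1)
```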
b) $\int_{-\infty}^{\infty} f(x)\,dx = 1$
The probability of an interval is the area bounded by the curve of the probability density function and the
interval on the x-axis. Let a and b be any two values with a < b. The probability that X assumes a value
that lies between a and b is equal to the area under the curve between a and b, i.e.
\[ P(a \le X \le b) = \int_a^b f(x)\,dx \]
Since X must assume some value, it follows that the total area under the
density curve must equal 1.
Examples: 1. Suppose that the error in the reaction temperature, in °C, for a controlled
laboratory experiment is a continuous random variable X having the probability density function
\[ f(x) = \begin{cases} \dfrac{x^2}{3}, & -1 < x < 2, \\ 0, & \text{elsewhere.} \end{cases} \]
a) Obviously, f(x) ≥ 0. To verify condition 2 above, we have
\[ \int_{-\infty}^{\infty} f(x)\,dx = \int_{-1}^{2} \frac{x^2}{3}\,dx = \left. \frac{x^3}{9} \right|_{-1}^{2} = \frac{8}{9} + \frac{1}{9} = 1 \]
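The integral above can also be checked numerically. The sketch below uses a simple midpoint rule written from scratch, so the `integrate` helper and its step count are illustrative assumptions rather than a library routine. The interval (0, 1) is chosen only as an example of an interval probability.

```python
def f(x):
    """Density from the example: x^2 / 3 on (-1, 2), zero elsewhere."""
    return x ** 2 / 3 if -1 < x < 2 else 0.0


def integrate(func, lower, upper, steps=100_000):
    """Approximate the area under func between lower and upper with the midpoint rule."""
    width = (upper - lower) / steps
    return sum(func(lower + (i + 0.5) * width) for i in range(steps)) * width


total_area = integrate(f, -1, 2)   # should be very close to 1
p_0_to_1 = integrate(f, 0, 1)      # P(0 < X < 1), exactly 1/9 by the antiderivative x^3/9

print(round(total_area, 6), round(p_0_to_1, 6))
```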
Summary
Any realistic model of a real-world phenomenon must take into account the possibility of
randomness. That is, more often than not, the quantities we are interested in will not be
predictable in advance but, rather, will exhibit an inherent variation that should be taken into
account by the model.
Probability is a measure associated with an event A, denoted by P(A), which takes a value
such that 0 ≤ P(A) ≤ 1. It is essentially the quantitative expression of the chance that an event
will occur. In general, the higher the value of P(A), the more likely it is that the event will occur.
If an event cannot happen, P(A) = 0; if an event is certain to happen, P(A) = 1. Numerical
values can be assigned in simple cases by one of the following two methods:
1) If the sample space can be divided into n (n ≥ 2) equally likely outcomes and the
event A is associated with r (0 ≤ r ≤ n) of these, then P(A) = r/n.
2) If an experiment can be repeated a large number of times, n, and in r cases the event A occurs,
then r/n is called the relative frequency of A. If this leads to a limit as n → ∞, this limit is
P(A).
Exercises 7
4. How many different letter arrangements can be made from the letters in the word
“STATISTICS”?
5. In how many ways can 5 different trees be planted in a circle?
6. If the probability that a research project will be well planned is 0.60 and the probability that it
will be well planned and well executed is 0.54, what is the probability that it will be well
executed given that it is well planned?
Assignment 7: (5%)
1. In a college football training session, the defensive coordinator needs to have 10 players
standing in a row. Among these 10 players, there are 1 freshman, 2 sophomores, 4 juniors, and
3 seniors. In how many different ways can they be arranged in a row if only their class level is
distinguished?
2. How many ways are there to select 3 candidates from 8 equally qualified recent graduates for
openings in an accounting firm?
3. If 3 books are picked at random from a shelf containing 5 novels, 3 books of poems, and a
dictionary, what is the probability that
i. The dictionary is selected?
ii. 2 novels and 1 book of poems are selected?
References
Answer Key
Exercise 1
1.1
A. Interval D. Ratio G. Nominal J. Nominal
B. Interval E. Nominal H. Ratio
C. Ordinal F. Nominal I. Ratio
Bibliography (References)
Hogg, R. V. and Tanis, E. (2009). Probability and Statistical Inference (8th Edition).
Prentice Hall.
Cochran, W. G. (1977). Sampling Techniques (3rd Edition). John Wiley & Sons, Inc., New
York.
Appendix: Standard Tables of Statistics