INTRODUCTION TO STATISTICAL THEORY
Part 1
(A text book for Degree and Post-Graduate Students)
By
Prof. Sher Muhammad Chaudhry
B.Sc. (Hons.), M.A. (Gold Medalist)
F.S.S. (London)
Ex-Head, Department of Statistics
Government College (now GC University), Lahore
Dr. Shahid Kamal
M.Sc. (Gold Medalist), Ph.D. (U.K.)
Principal and Professor
College of Statistical & Actuarial Sciences
University of the Punjab, Lahore
ILMI KITAB KHANA®
Kabir Street, Urdu Bazar, Lahore 54000 (Pakistan)
CONTENTS
Preface
1. INTRODUCTION
1.1 Meaning of Statistics
1.1.1 Use of Statistical Information
1.1.2 Characteristics of Statistics
1.1.3 Descriptive and Inferential Statistics
1.1.4 Populations and Samples
1.1.5 Importance of Statistics
1.2 Observations and Variables
1.2.1 Variables
1.2.2 Discrete and Continuous Variables
1.2.3 Measurement Scales
1.2.4 Errors of Measurement
1.2.5 Significant Digits
1.2.6 Rounding off a Number
1.3 Collection of Data
1.3.1 Collection of Primary Data
1.3.2 Collection of Secondary Data
1.3.3 Editing of Data
1.3.4 Uses and Misuses of Statistics
Exercises
2. PRESENTATION OF DATA
2.1 Introduction
2.2 Classification
2.2.1 Aims of Classification
2.2.2 Basic Principles of Classification
2.3 Tabulation
2.3.1 Types of Tables
2.3.2 Main Parts of a Table and its Construction
2.4 Frequency Distribution
2.4.1 Class-limits
2.4.2 Class-boundaries
2.4.3 Class-Mark
2.4.4 Class Width or Interval
2.4.5 Constructing a Grouped Frequency Distribution
2.4.6 Cumulative Frequency Distribution
2.5 Stem-and-Leaf Display
2.6 Graphical Representation
2.7 Diagrams
2.7.1 Simple Bar Chart
2.7.2 Multiple Bar Chart
2.7.3 Component Bar Chart
Rectangles and Sub-divided Rectangles
Pictograms
Pie Diagrams
Profit and Loss Chart
Graph of Time Series - Historigram
Histogram
Frequency Polygon
Frequency Curve
Cumulative Frequency Polygon or Ogive
Ogive for a Discrete Variable
Types of Frequency Curves
Ratio Charts or Semi-logarithmic Graphs
Exercises
3. MEASURES OF CENTRAL TENDENCY OR AVERAGES
3.1 Introduction
3.2 Criteria of a Satisfactory Average
3.3 Types of Averages
3.4 The Arithmetic Mean
3.4.1 The Weighted Arithmetic Mean
3.4.2 Properties of the Arithmetic Mean
3.4.3 Mean from Grouped Data
3.4.4 Change of Origin and Scale
3.5 The Geometric Mean
3.6 The Harmonic Mean
3.7 The Median
3.7.1 Quantiles
3.8 The Mode
3.9 Empirical Relation between Mean, Median and Mode
3.10 The Box Plots
3.11 Relative Merits and Demerits of Various Averages
3.11.1 The Arithmetic Mean
3.11.2 The Geometric Mean
3.11.3 The Harmonic Mean
3.11.4 The Median
3.11.5 The Mode
Exercises
4. MEASURES OF DISPERSION, MOMENTS AND SKEWNESS
4.1 Introduction
4.2 Range
4.3 The Semi-Interquartile Range or the Quartile Deviation
4.4 The Mean (or Average) Deviation
4.5 The Standard Deviation
4.5.1 Change of Origin and Scale
4.5.2 Interpretation of the Standard Deviation
4.5.3 Co-efficient of Variation
4.5.4 Properties of Variance and Standard Deviation
4.5.5 Standardized Variables
4.6 Trimmed and Winsorized Measures
4.7 Moments
4.7.1 Moments about the Mean in terms of Moments about an arbitrary origin, say a, and conversely
4.7.2 Sheppard’s Corrections
4.7.3 Moment-Ratios
4.7.4 Change of Origin and Scale
4.7.5 Charlier Check
4.8 Skewness
4.9 Kurtosis
4.10 Describing a Frequency Distribution
Exercises
5. INDEX NUMBERS
5.1 Introduction
5.1.1 Simple and Composite Index Numbers
5.1.2 Problems Involved in Index Number Construction
5.2 Main Steps in the Construction of Index Numbers of Wholesale Prices
5.2.1 Selection of Commodities for Inclusion
5.2.2 Selection of the Base Period
5.2.3 Selection of Average
5.2.4 Selection of Appropriate Weights
5.3 Unweighted Index Numbers
5.3.1 Simple Aggregative Index
5.3.2 Simple Average of Relatives
5.4 Weighted Index Numbers
5.4.1 Weighted Aggregative Price Index Numbers
5.4.2 Weighted Average of Relatives Price Index Number
5.5 Quantity Index Numbers
5.6 Tests for Index Number Formulae
5.6.1 Time Reversal Test
5.6.2 Factor Reversal Test
5.6.3 Circular Test
5.7 Consumer Price Index Number
5.7.1 Meaning
5.7.2 Construction of Consumer Price Index Numbers
5.7.3 Shortcomings or Drawbacks of Consumer Price Index Numbers
5.8 Uses of Index Numbers
5.9 Limitations of Index Numbers
Exercises
6. PROBABILITY
6.1 Introduction
6.2 An Aside - Sets
6.2.1 Subsets
6.2.2 Venn Diagram
6.2.3 Operations on Sets
6.2.4 The Algebra of Sets
6.2.5 Partition of Sets
6.2.6 Class of Sets
6.2.7 Cartesian Product Sets
6.2.8 Relation and Function
6.3 Random Experiment
6.3.1 Sample Space
6.3.2 Events
6.3.3 Events and Symbolic Representations
6.3.4 Counting Sample Points
6.4 Definitions of Probability
6.4.1 Subjective or Personalistic Probability
Laws of Probability
Conditional Probability
Independent and Dependent Events
Exercises
7. RANDOM VARIABLES
7.1 Introduction
7.2 Distribution Function
7.3 Discrete Random Variable and its Probability Distribution
7.4 Continuous Random Variable and its Probability Density Function
7.5 Joint Distributions
7.5.1 Bivariate Distribution Function
7.5.2 Bivariate Probability Functions
7.5.3 Marginal Probability Functions
7.5.4 Conditional Probability Functions
7.5.5 Independence
7.5.6 Continuous Bivariate Distributions
7.6 Mathematical Expectation of a Random Variable
7.6.1 Expectation of a Function of a Random Variable
1. INTRODUCTION
1.1 MEANING OF STATISTICS
People view Statistics in many different ways. Generally it is considered to be a subject that deals
with percentages, charts, graphs, averages and tables. Some people think that Statistics is a subject
consisting of rules, methods and techniques of collecting and presenting large amounts of numerical
information, while other people think that it is a subject of making inferences about the population on the
basis of sample information.
The word “Statistics”, which comes from the Latin word status, meaning a political state, originally
meant information useful to the state, for example, information about the sizes of populations and armed
forces. But this word has now acquired different meanings.
In the first place, the word statistics refers to “numerical facts systematically arranged”. In this
sense, the word statistics is always used in the plural. We have, for instance, statistics of prices, statistics
of road accidents, statistics of crimes, statistics of births, statistics of educational institutions, etc. In all
these examples, the word statistics denotes a set of numerical data in the respective fields. This is the
meaning the man in the street gives to the word Statistics and most people usually use the word data
instead.
Example 1.1 In the following examples, the facts and figures usually called Statistics presented in
the media almost every day are given:
i) Children who brush their teeth with brand XYZ toothpaste have 60% fewer cavities.
ii) The Bureau of Census projects the population of Pakistan to be 170.1 million in the year 2010.
iii) Eight out of ten Pakistanis do not have skills.
iv) The prevalence of diabetes is nearly 3 times as high in overweight people as compared to
normal weight people.
v) In 1980 it was estimated that 0.1% of people had tried any sort of drug, whereas in 2008 it
was estimated that 10% had done so.
In the second place, the word statistics is defined as a discipline that includes procedures and
techniques used to collect, process and analyse numerical data to make inferences and to reach decisions
in the face of uncertainty. It should of course be borne in mind that uncertainty does not imply ignorance
but it refers to the incompleteness and the instability of data available. In this sense, the word statistics is
used in the singular. As it embodies more or less all stages of the general process of learning, sometimes
called scientific method, statistics is characterized as a science. Thus the word statistics used in the plural
refers to a set of numerical information and in the singular, denotes the science of basing decisions on
numerical data. It should be noted that statistics as a subject is mathematical in character.
Thirdly, the word statistics refers to numerical quantities calculated from sample observations; a single
quantity that has been so calculated is called a statistic. The mean of a sample, for instance, is a statistic.
The word statistics is plural when used in this sense.
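To make the third meaning concrete, the mean of a sample can be written as a formula. The symbols are assumptions made for illustration, since this passage does not fix a notation: X_1, ..., X_n stand for the n sample observations and \bar{X} for their mean.

```latex
\bar{X} \;=\; \frac{1}{n}\sum_{i=1}^{n} X_i \;=\; \frac{X_1 + X_2 + \cdots + X_n}{n}
```

Because \bar{X} is computed from the sample observations alone, it is a statistic in the sense just described.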
1.1.1 Use of Statistical Information. Statistical information is and can be used for a variety
of reasons. Some of them are:
i) to inform general public;
ii) to explain things that have happened;
iii) to justify a claim;
iv) to provide general comparisons;
v) ... decision regarding future outcomes;
vi) to estimate the unknown quantities;
vii) to establish association / relationship between factors.
Hence Statistics is a subject which is much more than just numbers. It tells us what is done to ...
with numbers. The following three examples further explain how Statistics may be used.
Example 1.2 Suppose we want to determine the best teacher at Govt. College University, Lahore.
How should we decide this? This could be done by asking Govt. College University students who the best
teacher is. To do so, we collect the data, analyze the results and make the decision. Various
questions now arise:
i) should we survey every student?
ii) how will the survey be conducted?
iii) how will the data be analyzed?
iv) how will the best teacher be determined?
In order to answer these and other questions, Statistical techniques are used.
Example 1.3 A TV station claims that an advertisement of a product on their channel attracts more
customers compared to all other TV channels. Now if this claim is based on data, then it can be used to
market the TV channel. Suppose we have some doubts about the claim. In order to remove the doubts, we
might gather relevant information, analyze the results using appropriate statistical technique and make a
decision regarding the claim.
Example 1.4 Suppose University of the Punjab is planning an expansion program of its physical
facilities. To draw up an effective course of action, the University authorities decide that it needs to
answer this question: how many college students will we need to accommodate over the next ten years?
The question can be further broken down into many smaller questions. How many college students will
then be in the Punjab? How many will want to attend the University of the Punjab? etc. Once again,
Statistical methods can assist in evaluating and planning of expansion program.
1.1.2 Characteristics of Statistics. The definition stated above indicates that statistics is a discipline
in its own right. It may therefore be desirable to know the characteristic features of statistics in order to
appreciate and understand its general nature. Some of its important characteristics are given below:
i) Statistics deals with the behaviour of aggregates or large groups of data. It has nothing to do
with what is happening to a particular individual or object of the aggregate.
ii) Statistics deals with aggregates of observations of the same kind rather than isolated figures.
iii) Statistics deals with variability that obscures underlying patterns. No two objects in the
universe are exactly alike. If they were, there would have been no statistical problem.
iv) Statistics deals with uncertainties as every process of getting observations, whether controlled
or uncontrolled, involves deficiencies or chance variation. That is why we have to talk in
terms of probability.
v) Statistics deals with those characteristics or aspects of things which can be described
numerically either by counts or by measurements.
vi) Statistics deals with those aggregates which are subject to a number of random causes, e.g. the
heights of persons are subject to a number of causes such as race, ancestry, age, diet, habits,
climate and so forth.
vii) Statistical laws are valid on the average or in the long run. There is no guarantee that a certain
law will hold in all cases. Statistical inference is therefore made in the face of uncertainty.
viii) Statistical results might be misleading and incorrect if sufficient care in collecting, processing
and interpreting the data is not exercised or if the statistical data are handled by a person who
is not well versed in the subject matter of statistics.
1.1.3 Descriptive and Inferential Statistics. Statistics as a subject may be divided into
descriptive statistics and inferential statistics.
Descriptive statistics is that branch of statistics which deals with concepts and methods concerned
with summarization and description of the important aspects of numerical data. This area of study
consists of the condensation of data, their graphical displays and the computation of a few numerical
quantities that provide information about the centre of the data and indicate the spread of the
observations.
Inferential statistics deals with procedures for making inferences about the characteristics that
describe the large group of data or the whole, called the population, from the knowledge derived from
only a part of the data, known as the sample. This area includes the estimation of population parameters and
testing of statistical hypotheses. This phase of statistics is based on probability theory as the inferences
which are made on the basis of sample evidence cannot be absolutely certain.
Comparison
Descriptive Statistics
i) A cricket player wants to find his score average for the last 20 games.
ii) Aamir wants to describe the variation in his four test scores in Statistics.
iii) Mrs. Rashid wants to determine the average weekly amount she spent on groceries in the past 6
months.
Inferential Statistics
i) A cricket player wants to estimate his chance of scoring based on his current season average.
ii) Based on the first four test scores, Aamir would like to predict the variation in his final Statistics
test scores.
iii) Based on the last six months' grocery bills, Mrs. Rashid would like to predict the average amount
she will spend on groceries for the upcoming year.
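The contrast between the two branches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four test scores are hypothetical (the text gives no figures), and the standard error used for the inferential step is a standard quantity that this section does not define.

```python
import math
import statistics

# Hypothetical data: four test scores (not taken from the text).
scores = [62, 70, 75, 81]

# Descriptive statistics: summarize the observations actually in hand.
mean_score = statistics.mean(scores)   # centre of the data
spread = statistics.stdev(scores)      # sample standard deviation (spread)

# Inferential statistics: use the sample to reason about scores not yet
# observed. The standard error below is a rough measure of how uncertain
# the sample mean is as an estimate of the long-run average.
standard_error = spread / math.sqrt(len(scores))

print(f"mean = {mean_score}, standard deviation = {spread:.2f}")
print(f"standard error of the mean = {standard_error:.2f}")
```

The first two quantities only describe the four scores; the third is a statement about how far the sample mean might lie from an average that has not been observed, which is why it cannot be absolutely certain.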
1.1.4 Populations and Samples. A population or a statistical population is a collection or set of all
possible observations, whether finite or infinite, relevant to some characteristic of interest. A statistical
population may be real such as the heights of all college students or hypothetical such as all the possible
outcomes from the toss of a coin. The number of observations in a finite population is called the size of
the population and is denoted by the letter N. Numerical quantities describing a population are called
parameters, customarily represented by Greek letters. It is important to note that in statistics the word
population is a technical term not necessarily referring to all the people in a specified area, rather
denoting the aggregate of measurements or counts of some characteristic for the entire group of objects or
individuals.
A sample is a part or a subset of a population. Generally it consists of some of the observations but
in certain situations, it may include the whole of the population. The number of observations included in a
sample is called the size of the sample and is denoted by the letter n. A numerical quantity computed from
a sample is called a statistic, which is usually represented by ordinary Latin letters. The information
derived from sample data is used to draw conclusions about the population.
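The relationship between a population, a sample, a parameter and a statistic can be sketched as follows. Everything concrete here is an assumption for illustration: the population of 1000 simulated heights, the sample size of 50, and the variable names `mu` and `x_bar`.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch gives the same result on every run

# Hypothetical finite population: heights (cm) of N = 1000 students.
population = [random.gauss(165, 8) for _ in range(1000)]
N = len(population)

# Parameter: a numerical quantity describing the whole population.
mu = statistics.mean(population)

# Sample: a subset of n observations, drawn without replacement.
sample = random.sample(population, 50)
n = len(sample)

# Statistic: a numerical quantity computed from the sample only.
x_bar = statistics.mean(sample)

print(f"N = {N}, parameter mu = {mu:.2f}")
print(f"n = {n}, statistic x_bar = {x_bar:.2f}")
```

In practice the parameter is usually unknown because the whole population cannot be observed, and the statistic is used in its place to draw conclusions about the population.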
Example 1.5 State whether each of the following is a population or a sample.
i) Total number of absences by all students in a college during the last month.
ii) Number of colour TV sets owned by all families in Lahore.
iii) Monthly salaries of all employees of a company.
iv) Wheat yield per acre for 5 pieces of land.
v) Number of computers sold during the last month at all the computer stores in Lahore.
Solution
i) Population
ii) Population
iii) Population
iv) Sample
v) Population
1.1.5 Importance of Statistics. Statistics is perhaps a subject that is used by everybody. The
following functions and uses of statistics in most diverse fields serve to indicate its importance.
i) Statistics assists in summarizing the larger sets of data in a form that is easily understandable.
ii) Statistics assists in the efficient design of laboratory and field experiments as well as surveys.
iii) Statistics assists in a sound and effective planning in any field of inquiry.
iv) Statistics assists in drawing general conclusions and in making predictions of how much of
a thing will happen under given conditions.
v) Statistical techniques, being powerful tools for analysing numerical data, are used in almost
every branch of learning. In the biological and physical sciences, Genetics, Agronomy,
Anthropometry, Astronomy, Physics, Geology, etc. are the main areas where statistical
techniques have been developed and are increasingly used.
vi) A businessman, an industrialist and a research worker all employ statistical methods in their
work. Banks, Insurance companies and Governments all have their statistics departments.
vii) A modern administrator, whether in public or private sector, leans on statistical data to provide
a factual basis for decision.
viii) A politician uses statistics advantageously to lend support and credence to his arguments
while elucidating the problems he handles.
ix) A social scientist uses statistical methods in various areas of socio-economic life of a nation. It
is sometimes said that “a social scientist without an adequate understanding of statistics, is
often like the blind man groping in a dark room for a black cat that is not there”.
1.2 OBSERVATIONS AND VARIABLES
In statistics, an observation often means any sort of numerical recording of information, whether
it is a physical measurement such as height or weight, a classification such as heads or tails, or an answer
to a question such as yes or no.
1.2.1 Variables. A characteristic that varies with an individual or an object is called a variable.
For example, age is a variable as it varies from person to person. A variable can assume a number of
values. The given set of all possible values from which the variable takes on a value is called its domain.
If for a given problem, the domain of a variable contains only one value, then the variable is referred to as
a constant.
Variables may be classified into quantitative and qualitative according to the form of the
characteristic of interest. A variable is called a quantitative variable when a characteristic can be
expressed numerically such as age, weight, income or number of children. On the other hand, if the
characteristic is non-numerical such as education, gender, eye-colour, quality, intelligence, poverty,
satisfaction, etc. the variable is referred to as a qualitative variable. A qualitative characteristic is also
called an attribute. An individual or an object with such a characteristic can be counted or enumerated
after having been assigned to one of the several mutually exclusive classes or categories.
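Counting individuals by category, as described above for an attribute, can be sketched in Python. The eye-colour observations are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter

# Hypothetical observations of a qualitative variable (eye-colour).
eye_colour = ["brown", "black", "brown", "blue", "brown", "black"]

# Each individual falls into exactly one category, so tallying the
# categories enumerates the individuals possessing each attribute.
counts = Counter(eye_colour)

for category, frequency in counts.most_common():
    print(category, frequency)
```

Because the categories are mutually exclusive, the category counts add up to the total number of individuals observed.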
1.2.2 Discrete and Continuous Variables. A quantitative variable may be classified as discrete or
continuous. A discrete variable is one that can take only a discrete set of integers or whole numbers,
that is, the values are taken by jumps or breaks. A discrete variable represents count data such as the
number of persons in a family, the number of rooms in a house, the number of deaths in an accident, the
income of an individual, etc.
A variable is called a continuous variable if it can take on any value, fractional or integer, within a
given interval, i.e. its domain is an interval with all possible values without gaps. A continuous variable
represents measurement data such as the age of a person, the height of a plant, the weight of a
commodity, the temperature at a place, etc.
A variable, whether countable or measurable, is generally denoted by some symbol such as X or
Y, and X_i or Y_j represents the ith or jth value of the variable. The subscript i or j is replaced by a
number such as 1, 2, 3, ... when referred to a particular value.