Chapter 1 Introduction To Statistics What Is Statistics?: Population
WHAT IS STATISTICS?
Introduction
The word statistics appears to have been derived from the Latin word status. Originally, statistics
was simply the collection of numerical data, by kings, on different aspects useful to the state.
Today statistics is the scientific study of handling quantitative information. It embodies a
methodology for the collection, classification, description and interpretation of data obtained
through the conduct of surveys and experiments.
Population
The total group under discussion, or the group to which results will be generalized, is called the
population. For example, the collection of height measurements of all college students forms a
population.
Sample
A part of the population, selected in the belief that it will represent all the characteristics of the
population, is called a sample. For example, a sample of 10 students may be selected from a
population of 100 students in order to estimate the average height of the students.
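The idea can be sketched in a few lines of Python (the population values below are hypothetical, generated only for illustration):

```python
import random

random.seed(1)  # fixed seed so the illustration is reproducible

# Hypothetical population: heights (in cm) of 100 college students
population = [random.gauss(170, 8) for _ in range(100)]

# A sample of 10 students selected without replacement
sample = random.sample(population, 10)

population_mean = sum(population) / len(population)
sample_mean = sum(sample) / len(sample)

# The sample mean serves as an estimate of the population mean
print(round(population_mean, 1), round(sample_mean, 1))
```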
Meaning of Statistics
Nowadays the word statistics is used in two senses.
Singular Sense
In its singular sense, the word statistics means the science of statistics, which deals with statistical
methods.
Plural Sense
The word statistics, when used in its plural sense, means numerical facts collected in any field of
study by using statistical methods.
Definition Of Statistics
Statistics are the numerical statements of facts capable of analysis and interpretation, and the
science of statistics is the study of the principles and methods applied in collecting, presenting,
analysing and interpreting numerical data in any field of inquiry. OR
The science of facts and figures is called statistics. OR
(Croxton and Cowden)
Statistics is the collection, presentation, analysis and interpretation of numerical data. OR
(Connor)
Statistics are measurements, enumerations or estimates of natural or social phenomena,
systematically arranged so as to exhibit their interrelations. OR
(Boddington)
Statistics is the science of estimates and probabilities. OR
(Achenwall)
Statistics are a collection of noteworthy facts concerning the state, both historical and descriptive. OR
Statistics is defined as the science of collecting, organizing, presenting, analysing and
interpreting numerical data for making better decisions.
Scope of Statistics
Statistics is the branch of mathematics that deals with data. Statistics uses data collected
through systematic methods of data collection, and statistical theories are employed to arrive at
conclusions.
We apply general rules to quantitative data, which is useful for forecasting; for this purpose we
use applied statistics.
Limitations of Statistics
1. Statistics has a handicap in dealing with qualitative observations or values.
2. Statistical results are true only on the average.
3. Statistics does not study qualitative phenomena directly.
4. Statistics deals with facts which can be numerically expressed; e.g. love, hate, beauty,
poverty and health cannot be measured directly.
5. Sufficient care must be exercised in the collection, analysis and interpretation of data,
otherwise statistical results may be misleading.
Nowadays statistics, statistical data and statistical methods are being applied increasingly in
Agriculture, Economics, Biology, Business, Physics, Chemistry, Astronomy, Medicine,
Administration, Education, Mathematics, Meteorology and the physical sciences.
1. Statistics and Administration
Statistics plays an important role in the fields of administration and management by providing
measures of the performance of employees. Statistical data are widely used in taking
administrative decisions. For example, the authorities may want to raise the pay scales of
employees in view of an increase in the cost of living; statistical methods will be used to
calculate the rise in the cost of living.
2. Statistics and Agriculture
Agricultural statistics cover a wide field, including statistics of land utilization, crop
production, and prices and wages in agriculture. Agriculture benefits greatly from statistical
methods.
3. Statistics and Medicine
Statistics plays an important role in the field of medicine, for example in testing the
effectiveness of different types of medicines. Vital statistics may be defined as the science that
deals with the application of numerical methods to vital facts. It is a part of the broader field of
demography. Demography is the statistical study of all phases of human life relating to vital
facts such as births, deaths, ages, marriages, religions, social affairs, education and sanitation.
Vital statistics is a part of demography and comprises vital data.
4. Statistics and Mathematics
All statistical methods have their foundations in mathematics; no calculation can be done
without its help. Therefore mathematics is applied widely in statistics, and this branch of
statistics is called mathematical statistics. The two subjects are closely interrelated.
5. Statistics and Physical Sciences
The physical sciences depend greatly on the science of statistics for analysing data and testing
their significance in order to draw conclusions. Statistical methods are used in physical sciences
such as Physics, Chemistry and Geology.
6. Statistics and Economics
Important phenomena in all branches of economics can be described and compared with the
help of statistics. Statistics of production describe the wealth of a nation and compare it year by
year, thereby showing the effect of changing economic policies and other factors on the level of
production.
7. Statistics Helps in Forecasting
By estimating how variables have behaved in the past, forecasts about times to come can
easily be made. Statistics helps in forecasting future events: statistical techniques such as
extrapolation and time-series analysis help in saying something about the future course of
events. Statistics also plays an important role in the fields of astronomy, transportation,
communication, public health, teaching methods, engineering, psychology, meteorology and
weather forecasting.
8. Statistics and Business
Statistics plays an important role in business. It helps businessmen plan production
according to the tastes of customers; the quality of products can also be checked by using
statistical methods.
Characteristics of Statistics
Statistics has the following characteristics.
1. Statistics are aggregates of facts
Statistics are a number of related facts; a single isolated figure is not statistics.
Statistical Inquiry
An inquiry into any problem which is carried out with the help of statistical principles and
methods is called a statistical inquiry.
Steps in Statistical Inquiry
When an inquiry requires the collection of data, the following steps are involved.
1. Planning the inquiry.
2. Collection of data.
3. Editing the collected data.
4. Tabulating the data.
5. Analysing the data by calculating statistical measures.
Variable
A measurable quantity which can vary (differ) from one individual or object to another is called
a variable, e.g. the heights of students or the weights of children. It is denoted by letters of the
alphabet, e.g. X, Y, Z.
Types of Variable
There are several types of variable.
1. Continuous Variable
A variable which can take any value, including fractional values, between two limits is called a
continuous variable. Or
A variable which can assume any value within a given range is called a continuous variable. For
e.g. the age of a person, the speed of a car, the temperature at a place, the income of a person,
the height of a plant, the lifetime of a T.V tube etc.
2. Discrete Variable
A variable which can assume only certain specific values within a given range is called a discrete
variable. For e.g. the number of students in a class, the number of houses in a street, the number
of children in a family etc. It cannot take fractional values.
3. Quantitative Variable
A characteristic which varies only in magnitude from one individual to another is called a
quantitative variable; it can be measured. Or, a characteristic expressed by means of quantitative
terms is known as a quantitative variable. For e.g. the number of deaths in a country per year,
prices, temperature readings, heights, weights etc.
4. Qualitative Variable
When a characteristic is expressed by means of qualitative terms, it is known as a qualitative
variable or an attribute. For e.g. smoking, beauty, educational status, or colours such as green
and blue. It should be noted that these characteristics cannot be measured numerically.
Domain
The set of values from which a variable takes its values is called its domain.
Constant
A characteristic is called a constant if it assumes a fixed value, e.g. $\pi$ is a constant with the
approximate numerical value 3.14159, and $e$ is also a constant with the approximate numerical
value 2.71828.
Errors
The difference between the actual value and the expected value is called an error. There are two
types of errors.
1. Compensating errors
2. Biased errors
Data
A set of values or number of values is called data.
Quantitative Data
The data described by a quantitative variable such as number of deaths in a country per year,
prices temperature readings, heights, weights, wheat production from different acres, the number
of persons living in different houses etc, are called quantitative data.
Qualitative Data
Data described by a qualitative variable e.g. smoking, beauty, educational status, green, blue The
marital status of persons such as single, married, divorced, widowed, separated, The sex of
persons such as male and female, etc are called qualitative data.
Discrete Data
Data which can be described by a discrete variable is called discrete data. Number of students in
a class, Number of houses in a street, number of children in a family etc
Continuous Data
Data which can be described by a continuous variable is called continuous data. For e.g. age of
persons, speed of car, temperature at a place, income of a person, height of a plant, a life time of
a T.V tube etc
Chronological Data
A sequence of observations, made on the same phenomenon, recorded in relation to their time of
occurrence, is called chronological data. A chronological data is also called a time series.
Geographical Data
A sequence of observations, made on the same phenomenon, recorded in relation to their
geographical region, is called a geographical data.
Statistical Data
When data are classified on the basis of a numerical characteristic (classification according to
class intervals), they are known as statistical data. Statistical data may be classified into two
types.
1. Primary Data
Primary data are the most original data, not compiled by anyone else; they are first-hand
collected data which have not undergone any sort of statistical treatment.
2. Secondary Data
Secondary data are data which have already been compiled and analysed by someone else; they
may have been sorted, tabulated and have undergone statistical treatment.
Collection of Data
Following methods are used for collection of data.
1. Methods for Collection of Primary Data
Following are the main methods by which primary data are obtained.
i. Direct Personal Investigation
ii. Indirect Investigation
iii. Local Source
iv. Questionnaire Method
v. Registration
Measures of Dispersion
Range = $X_m - X_0$
Quartile Deviation = Q.D = $\frac{Q_3 - Q_1}{2}$
Mean Deviation (M.D)
For ungrouped data:
M.D from Mean = $\frac{\sum |X - \bar{X}|}{n}$  Or  M.D from Median = $\frac{\sum |X - \tilde{X}|}{n}$
For grouped data:
M.D from Mean = $\frac{\sum f|X - \bar{X}|}{\sum f}$  Or  M.D from Median = $\frac{\sum f|X - \tilde{X}|}{\sum f}$
Standard Deviation (S.D)
For ungrouped data:
Direct Method: S.D = S = $\sqrt{\frac{\sum (X - \bar{X})^2}{n}} = \sqrt{\frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2}$
Short-Cut Method: S.D = S = $\sqrt{\frac{\sum D^2}{n} - \left(\frac{\sum D}{n}\right)^2}$, where $D = X - A$
Coding Method: S.D = S = $h\sqrt{\frac{\sum u^2}{n} - \left(\frac{\sum u}{n}\right)^2}$, where $u = \frac{D}{h} = \frac{X - A}{h}$
For grouped data:
Direct Method: S.D = S = $\sqrt{\frac{\sum f(X - \bar{X})^2}{\sum f}} = \sqrt{\frac{\sum fX^2}{\sum f} - \left(\frac{\sum fX}{\sum f}\right)^2}$
Short-Cut Method: S.D = S = $\sqrt{\frac{\sum fD^2}{\sum f} - \left(\frac{\sum fD}{\sum f}\right)^2}$, where $D = X - A$
Coding Method: S.D = S = $h\sqrt{\frac{\sum fu^2}{\sum f} - \left(\frac{\sum fu}{\sum f}\right)^2}$, where $u = \frac{X - A}{h}$
Combined Standard Deviation ($S_c$)
For two sets of values:
$S_c^2 = \frac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2} + \frac{n_1 n_2 (\bar{X}_1 - \bar{X}_2)^2}{(n_1 + n_2)^2}$
In general:
$S_c^2 = \frac{\sum n_i \left[S_i^2 + (\bar{X}_i - \bar{X}_c)^2\right]}{\sum n_i}$
Variance ($S^2$)
The variance is defined as the mean of the squared deviations from the mean. It is denoted by
$S^2$.
Or
The square of the standard deviation is called the variance. It is denoted by $S^2$.
Methods of Standard Deviation
1. Direct Method
2. Short-Cut Method
3. Coding Method or Step-Deviation Method
1. Direct Method
For ungrouped data:
Var(X) = $S^2 = \frac{\sum (X - \bar{X})^2}{n} = \frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2$
For grouped data:
Var(X) = $S^2 = \frac{\sum f(X - \bar{X})^2}{\sum f} = \frac{\sum fX^2}{\sum f} - \left(\frac{\sum fX}{\sum f}\right)^2$
2. Short-Cut Method
For ungrouped data:
$S^2 = \frac{\sum D^2}{n} - \left(\frac{\sum D}{n}\right)^2$, where $D = X - A$
For grouped data:
$S^2 = \frac{\sum fD^2}{\sum f} - \left(\frac{\sum fD}{\sum f}\right)^2$, where $D = X - A$
3. Coding Method
For ungrouped data:
$S^2 = h^2\left[\frac{\sum u^2}{n} - \left(\frac{\sum u}{n}\right)^2\right]$, where $u = \frac{D}{h} = \frac{X - A}{h}$
For grouped data:
$S^2 = h^2\left[\frac{\sum fu^2}{\sum f} - \left(\frac{\sum fu}{\sum f}\right)^2\right]$
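As a quick check on these formulas, a short Python sketch (with illustrative data) confirms that the direct and short-cut methods give the same variance:

```python
data = [4, 6, 8, 10, 12]
n = len(data)
mean = sum(data) / n

# Direct method: mean of the squared deviations from the mean
var_direct = sum((x - mean) ** 2 for x in data) / n

# Short-cut method with provisional mean A = 8, D = X - A
A = 8
D = [x - A for x in data]
var_shortcut = sum(d ** 2 for d in D) / n - (sum(D) / n) ** 2

print(var_direct, var_shortcut)  # both 8.0
```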
Combined Variance ($S_c^2$)
For two sets of values:
$S_c^2 = \frac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2} + \frac{n_1 n_2 (\bar{X}_1 - \bar{X}_2)^2}{(n_1 + n_2)^2}$
In general:
$S_c^2 = \frac{\sum n_i \left[S_i^2 + (\bar{X}_i - \bar{X}_c)^2\right]}{\sum n_i}$
Relative Measures of Dispersion
Coefficient of Range = $\frac{X_m - X_0}{X_m + X_0}$
Coefficient of Quartile Deviation = $\frac{Q_3 - Q_1}{Q_3 + Q_1}$
Coefficient of M.D from Mean = $\frac{\text{M.D from } \bar{X}}{\bar{X}}$
Or
Coefficient of M.D from Median = $\frac{\text{M.D from } \tilde{X}}{\tilde{X}}$
Coefficient of S.D = $\frac{S.D}{\bar{X}}$
Coefficient of Variation = C.V = $\frac{S.D}{\bar{X}} \times 100$
For a normal distribution the measures of dispersion are related as follows:
I. Mean Deviation = M.D = $\frac{4}{5}$ S.D
II. Quartile Deviation = Q.D = $\frac{2}{3}$ S.D
III. Quartile Deviation = Q.D = $\frac{5}{6}$ M.D
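A minimal Python sketch (illustrative data) of the mean deviation, standard deviation and coefficient of variation defined above:

```python
import math

data = [2, 4, 6, 8, 10]
n = len(data)
mean = sum(data) / n  # 6.0

# Mean deviation from the mean
md = sum(abs(x - mean) for x in data) / n

# Standard deviation (direct method, divisor n)
sd = math.sqrt(sum((x - mean) ** 2 for x in data) / n)

# Coefficient of variation, expressed as a percentage
cv = sd / mean * 100

print(md, round(sd, 3), round(cv, 1))
```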
Moments
A moment designates the power to which deviations are raised before averaging them.
Kinds of Moments
1. Moments about Mean or Central Moments
2. Moments about Origin or Zero
3. Moments about Provisional Mean or Arbitrary Value (Non-Central Moments)
1. Moments about Mean or Central Moments
For ungrouped data:
$\mu_1 = m_1 = \frac{\sum (x - \bar{x})}{n} = 0$
$\mu_2 = m_2 = \frac{\sum (x - \bar{x})^2}{n} = \text{Variance}$
$\mu_3 = m_3 = \frac{\sum (x - \bar{x})^3}{n}$
$\mu_4 = m_4 = \frac{\sum (x - \bar{x})^4}{n}$
For grouped data:
$\mu_1 = m_1 = \frac{\sum f(x - \bar{x})}{\sum f} = 0$
$\mu_2 = m_2 = \frac{\sum f(x - \bar{x})^2}{\sum f} = \text{Variance}$
$\mu_3 = m_3 = \frac{\sum f(x - \bar{x})^3}{\sum f}$
$\mu_4 = m_4 = \frac{\sum f(x - \bar{x})^4}{\sum f}$
2. Moments about Origin or Zero
For ungrouped data:
$\mu'_1 = m'_1 = \frac{\sum x}{n}$, $\mu'_2 = m'_2 = \frac{\sum x^2}{n}$, $\mu'_3 = m'_3 = \frac{\sum x^3}{n}$, $\mu'_4 = m'_4 = \frac{\sum x^4}{n}$
For grouped data:
$\mu'_1 = m'_1 = \frac{\sum fx}{\sum f}$, $\mu'_2 = m'_2 = \frac{\sum fx^2}{\sum f}$, $\mu'_3 = m'_3 = \frac{\sum fx^3}{\sum f}$, $\mu'_4 = m'_4 = \frac{\sum fx^4}{\sum f}$
3. Moments about Provisional Mean or Arbitrary Value
i. Direct Method
For ungrouped data (where A is a constant):
$\mu'_1 = m'_1 = \frac{\sum (x - A)}{n}$
$\mu'_2 = m'_2 = \frac{\sum (x - A)^2}{n}$
$\mu'_3 = m'_3 = \frac{\sum (x - A)^3}{n}$
$\mu'_4 = m'_4 = \frac{\sum (x - A)^4}{n}$
For grouped data (where A is a constant):
$\mu'_1 = m'_1 = \frac{\sum f(x - A)}{\sum f}$
$\mu'_2 = m'_2 = \frac{\sum f(x - A)^2}{\sum f}$
$\mu'_3 = m'_3 = \frac{\sum f(x - A)^3}{\sum f}$
$\mu'_4 = m'_4 = \frac{\sum f(x - A)^4}{\sum f}$
ii. Short-Cut Method (where D = X − A)
For ungrouped data:
$\mu'_1 = m'_1 = \frac{\sum D}{n}$, $\mu'_2 = m'_2 = \frac{\sum D^2}{n}$, $\mu'_3 = m'_3 = \frac{\sum D^3}{n}$, $\mu'_4 = m'_4 = \frac{\sum D^4}{n}$
For grouped data:
$\mu'_1 = m'_1 = \frac{\sum fD}{\sum f}$, $\mu'_2 = m'_2 = \frac{\sum fD^2}{\sum f}$, $\mu'_3 = m'_3 = \frac{\sum fD^3}{\sum f}$, $\mu'_4 = m'_4 = \frac{\sum fD^4}{\sum f}$
iii. Step-Deviation Method (where $u = \frac{D}{h} = \frac{X - A}{h}$)
For ungrouped data:
$\mu'_1 = m'_1 = h\frac{\sum u}{n}$, $\mu'_2 = m'_2 = h^2\frac{\sum u^2}{n}$, $\mu'_3 = m'_3 = h^3\frac{\sum u^3}{n}$, $\mu'_4 = m'_4 = h^4\frac{\sum u^4}{n}$
For grouped data:
$\mu'_1 = m'_1 = h\frac{\sum fu}{\sum f}$, $\mu'_2 = m'_2 = h^2\frac{\sum fu^2}{\sum f}$, $\mu'_3 = m'_3 = h^3\frac{\sum fu^3}{\sum f}$, $\mu'_4 = m'_4 = h^4\frac{\sum fu^4}{\sum f}$
Relations between Central Moments and Moments about an Arbitrary Value
$\mu_1 = m_1 = \mu'_1 - \mu'_1 = 0$
$\mu_2 = m_2 = \mu'_2 - (\mu'_1)^2 = \text{Variance}$
$\mu_3 = m_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3$
$\mu_4 = m_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4$
Moment Ratios
$\beta_1 = b_1 = \frac{m_3^2}{m_2^3}$
$\beta_2 = b_2 = \frac{m_4}{m_2^2}$
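The central moments and the moment ratios $b_1$ and $b_2$ can be computed directly; a small Python sketch with illustrative data:

```python
data = [1, 2, 3, 4, 5]
n = len(data)
mean = sum(data) / n

# Central moments m_r = sum((x - mean)^r) / n for r = 1..4
m = {r: sum((x - mean) ** r for x in data) / n for r in (1, 2, 3, 4)}

b1 = m[3] ** 2 / m[2] ** 3   # moment ratio b1 (zero for symmetric data)
b2 = m[4] / m[2] ** 2        # moment ratio b2

print(m[1], m[2], b1, b2)
```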
Sheppard's Corrections for Moments of Grouped Data
$\mu_2(\text{corrected}) = \mu_2(\text{uncorrected}) - \frac{h^2}{12}$
$\mu_3(\text{corrected}) = \mu_3(\text{uncorrected})$
$\mu_4(\text{corrected}) = \mu_4(\text{uncorrected}) - \frac{h^2}{2}\mu_2(\text{uncorrected}) + \frac{7}{240}h^4$
Charlier's Check
i. $\sum f(u+1) = \sum fu + \sum f$
ii. $\sum f(u+1)^2 = \sum fu^2 + 2\sum fu + \sum f$
iii. $\sum f(u+1)^3 = \sum fu^3 + 3\sum fu^2 + 3\sum fu + \sum f$
iv. $\sum f(u+1)^4 = \sum fu^4 + 4\sum fu^3 + 6\sum fu^2 + 4\sum fu + \sum f$
Symmetry
In a symmetrical distribution a deviation below the mean exactly equals the corresponding
deviation above the mean. This is called symmetry.
For a symmetrical distribution the following relations hold.
Mean = Median = Mode
$Q_3 - \text{Median} = \text{Median} - Q_1$
$\mu_3 = m_3 = 0$
$\beta_1 = b_1 = 0$
Skewness
Skewness is the lack of symmetry in a distribution around some central value, i.e. the mean,
median or mode. It is the degree of asymmetry. For a skewed distribution:
Mean ≠ Median ≠ Mode
$Q_3 - \text{Median} \ne \text{Median} - Q_1$
$\mu_3 = m_3 \ne 0$
$\beta_1 = b_1 \ne 0$
There are two types of skewness.
1. Positive Skewness
If the frequency curve has a longer tail to the right, the distribution is said to be positively
skewed.
2. Negative Skewness
If the frequency curve has a longer tail to the left, the distribution is said to be negatively
skewed.
Coefficients of Skewness
$SK = \frac{\text{Mean} - \text{Mode}}{S.D}$  Or  $SK = \frac{3(\text{Mean} - \text{Median})}{S.D}$
$SK = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$
Kurtosis
Moment coefficient of kurtosis:
$\beta_2 = b_2 = \frac{m_4}{m_2^2}$
If $\beta_2 > 3$ the distribution is Leptokurtic.
If $\beta_2 = 3$ the distribution is Normal or Mesokurtic.
If $\beta_2 < 3$ the distribution is Platykurtic.
Or, the percentile coefficient of kurtosis:
$K = \frac{Q.D}{P_{90} - P_{10}}$
For a Normal distribution, K = 0.263.
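Pearson's second coefficient of skewness, $SK = 3(\text{Mean} - \text{Median})/S.D$, can be sketched in Python (illustrative data with a longer right tail):

```python
import math
import statistics

data = [1, 2, 2, 3, 3, 3, 4, 10]  # a longer tail to the right
n = len(data)
mean = sum(data) / n
median = statistics.median(data)
sd = math.sqrt(sum((x - mean) ** 2 for x in data) / n)

# Positive value indicates a positively skewed distribution
sk = 3 * (mean - median) / sd

print(round(sk, 3))
```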
Index Numbers
An index number is a relative number which indicates the relative change in a group of
variables collected at different times. Index numbers are a device for estimating trends in
prices, wages, production and other economic variables; they are also known as economic
barometers.
Or
An index number is a number that measures a relative change in a variable, or an average
relative change in a group of related variables, with respect to a base. A base may be a
particular time, place or professional class with reference to which the changes are to be
measured.
Types of Index Numbers
1. Simple Index Numbers
2. Composite Index Numbers
Composite index numbers may be unweighted or weighted, and each of these may be computed
by the aggregative method or by the average-of-relatives method.
Price Relative
Price relatives are obtained by dividing the price in the current year by the price in the base
year and expressing the result as a percentage:
Price Relative = $P_{0n} = \frac{p_n}{p_0} \times 100$
Where $p_n$ is the price in the current year and $p_0$ is the price in the base year.
Quantity Relative
They are obtained by dividing the quantity in the current year by the quantity in the base year
and expressing the result as a percentage:
Quantity Relative = $Q_{0n} = \frac{q_n}{q_0} \times 100$
Where $q_n$ is the quantity in the current year and $q_0$ is the quantity in the base year.
Link Relatives
Price Link Relative = $P_{n-1,n} = \frac{p_n}{p_{n-1}} \times 100$
Quantity Link Relative = $Q_{n-1,n} = \frac{q_n}{q_{n-1}} \times 100 = \frac{\text{Quantity in the current year}}{\text{Quantity in the preceding year}} \times 100$
In the second step we take just the reverse of step 1. Hence, to get chain indices we multiply
the link relative of the current period by the chain index of the immediately preceding period
and divide the product by 100.
Chain Index = $\frac{\text{Link Relative of current period} \times \text{Chain Index of preceding period}}{100}$
Unweighted (Simple Aggregative) Index Numbers
Price Index = $P_{0n} = \frac{\sum p_n}{\sum p_0} \times 100$  and  Quantity Index = $Q_{0n} = \frac{\sum q_n}{\sum q_0} \times 100$
Weighted Aggregative Price Index Numbers
(1). Laspeyres' Price Index (Base Year Weighted): $P_{0n} = \frac{\sum p_n q_0}{\sum p_0 q_0} \times 100$
(2). Paasche's Price Index (Current Year Weighted): $P_{0n} = \frac{\sum p_n q_n}{\sum p_0 q_n} \times 100$
(3). Marshall–Edgeworth Price Index: $P_{0n} = \frac{\sum p_n (q_0 + q_n)}{\sum p_0 (q_0 + q_n)} \times 100$
(4). Fisher's Ideal Price Index: $P_{0n} = \sqrt{L \times P} = \sqrt{\frac{\sum p_n q_0}{\sum p_0 q_0} \times \frac{\sum p_n q_n}{\sum p_0 q_n}} \times 100$
Weighted Aggregative Quantity Index Numbers
(1). Laspeyres' Quantity Index (Base Year Weighted): $Q_{0n} = \frac{\sum q_n p_0}{\sum q_0 p_0} \times 100$
(2). Paasche's Quantity Index (Current Year Weighted): $Q_{0n} = \frac{\sum q_n p_n}{\sum q_0 p_n} \times 100$
(3). Marshall–Edgeworth Quantity Index: $Q_{0n} = \frac{\sum q_n (p_0 + p_n)}{\sum q_0 (p_0 + p_n)} \times 100$
(4). Fisher's Ideal Quantity Index: $Q_{0n} = \sqrt{L \times P} = \sqrt{\frac{\sum q_n p_0}{\sum q_0 p_0} \times \frac{\sum q_n p_n}{\sum q_0 p_n}} \times 100$
Weighted Average of Price Relatives
$P_{0n} = \frac{\sum IW}{\sum W}$, where Price Relative $I = \frac{p_n}{p_0} \times 100$
With base-year weights $W = p_0 q_0$ (Base Year Weighted) this reduces to Laspeyres' index;
with weights $W = p_0 q_n$ (Current Year Weighted) it reduces to Paasche's index.
Weighted Average of Quantity Relatives
$Q_{0n} = \frac{\sum IW}{\sum W}$, where Quantity Relative $I = \frac{q_n}{q_0} \times 100$
With base-year weights $W = q_0 p_0$ (Base Year Weighted) this reduces to Laspeyres' quantity
index; with weights $W = q_0 p_n$ (Current Year Weighted) it reduces to Paasche's quantity index.
Limitations of Index Numbers
1. All index numbers are not suitable for all purposes; they are suitable only for the purpose
for which they were constructed.
2. Comparisons of changes in variables over long periods are not reliable.
3. Index numbers are subject to sampling error.
4. It is not possible to take into account all changes in the quality of a product.
5. Index numbers obtained by different methods of construction may give different results.
Let $p_n$ and $q_n$ be the prices and quantities in the current year, and $p_0$ and $q_0$ be
those of the base year. The base-year weighted average of price relatives reduces to the
Laspeyres aggregative index:
$P_{0n} = \frac{\sum IW}{\sum W} = \frac{\sum \left(\frac{p_n}{p_0} \times 100\right) p_0 q_0}{\sum p_0 q_0} = \frac{\sum p_n q_0}{\sum p_0 q_0} \times 100$, where $I = \frac{p_n}{p_0} \times 100$ and $W = p_0 q_0$.
Tests of Index Numbers
1. Time Reversal Test: $P_{0n} = \frac{1}{P_{n0}}$, or $P_{0n} \times P_{n0} = 1$
2. Factor Reversal Test: $P_{0n} \times Q_{0n} = \frac{\sum p_n q_n}{\sum p_0 q_0}$
3. Circular Test: if the index for year b based upon year a is $P_{ab}$ and that for year c based
upon year b is $P_{bc}$, then the circular test requires that the index for year c based upon
year a, i.e. $P_{ac}$, should be the same as if it were compounded of these two stages, i.e.
$P_{ab} \times P_{bc} = P_{ac}$
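A minimal Python sketch of the Laspeyres, Paasche and Fisher price indices (all prices and quantities below are hypothetical):

```python
import math

# Hypothetical base-year and current-year prices and quantities
# for three commodities
p0 = [10, 8, 5]    # base-year prices
pn = [12, 10, 6]   # current-year prices
q0 = [30, 15, 20]  # base-year quantities
qn = [25, 20, 30]  # current-year quantities

laspeyres = (sum(p * q for p, q in zip(pn, q0)) /
             sum(p * q for p, q in zip(p0, q0)) * 100)
paasche = (sum(p * q for p, q in zip(pn, qn)) /
           sum(p * q for p, q in zip(p0, qn)) * 100)
fisher = math.sqrt(laspeyres * paasche)  # geometric mean of the two

print(round(laspeyres, 2), round(paasche, 2), round(fisher, 2))
```

Fisher's index always lies between the Laspeyres and Paasche values, since it is their geometric mean.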
Linear Regression
When the dependence of one variable on another is represented by a straight line, it is called
linear regression; otherwise it is said to be non-linear or curvilinear regression. For example, if
X is the independent variable and Y the dependent variable, then the relation Y = a + bX is
called a linear regression.
3. $\sum(Y - \hat{Y}) = 0$, $\sum(X - \hat{X}) = 0$
4. The sum of squared deviations between the observed and estimated values is always a
minimum, i.e. $\sum(Y - \hat{Y})^2$ = minimum, $\sum(X - \hat{X})^2$ = minimum
Regression Line of Y on X
$\hat{Y} = a + bX$
Or
$\hat{Y} - \bar{Y} = b(X - \bar{X})$
Or
$\hat{Y} - \bar{Y} = b_{YX}(X - \bar{X})$
General Method
Normal Equations
$\sum Y = na + b\sum X$
$\sum XY = a\sum X + b\sum X^2$
Direct formulas for a:
(1). $a = \bar{Y} - b\bar{X}$
(2). $a = a_{YX} = \frac{\sum X^2 \sum Y - \sum X \sum XY}{n\sum X^2 - (\sum X)^2}$
Direct formulas for b:
(1). $b = b_{YX} = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - (\sum X)^2}$
(2). $b = b_{YX} = \frac{\sum XY - \frac{\sum X \sum Y}{n}}{\sum X^2 - \frac{(\sum X)^2}{n}}$
(3). $b = b_{YX} = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2}$
(4). $b = b_{YX} = \frac{\sum XY - n\bar{X}\bar{Y}}{nS_X^2}$
(5). $b = b_{YX} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}$
(6). $b = b_{YX} = \frac{\sum D_X D_Y - \frac{\sum D_X \sum D_Y}{n}}{\sum D_X^2 - \frac{(\sum D_X)^2}{n}}$, where $D_X = X - A$, $D_Y = Y - B$ (A and B constants)
(7). $b = b_{YX} = r\frac{S_Y}{S_X}$
(8). $b = b_{YX} = \frac{S_{XY}}{S_X^2}$
Where
$S_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n}$
$S_X = \sqrt{\frac{\sum (X - \bar{X})^2}{n}} = \sqrt{\frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2}$
$S_Y = \sqrt{\frac{\sum (Y - \bar{Y})^2}{n}} = \sqrt{\frac{\sum Y^2}{n} - \left(\frac{\sum Y}{n}\right)^2}$
Regression Line of X on Y
$\hat{X} = c + dY$
Or
$\hat{X} - \bar{X} = d(Y - \bar{Y})$
Or
$\hat{X} - \bar{X} = b_{XY}(Y - \bar{Y})$
General Method
Normal Equations
$\sum X = nc + d\sum Y$
$\sum XY = c\sum Y + d\sum Y^2$
Direct formulas for c:
(1). $c = \bar{X} - d\bar{Y}$
(2). $c = a_{XY} = \frac{\sum Y^2 \sum X - \sum Y \sum XY}{n\sum Y^2 - (\sum Y)^2}$
Direct formulas for d:
(1). $d = b_{XY} = \frac{n\sum XY - \sum X \sum Y}{n\sum Y^2 - (\sum Y)^2}$
(2). $d = b_{XY} = \frac{\sum XY - \frac{\sum X \sum Y}{n}}{\sum Y^2 - \frac{(\sum Y)^2}{n}}$
(3). $d = b_{XY} = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum Y^2 - n\bar{Y}^2}$
(4). $d = b_{XY} = \frac{\sum XY - n\bar{X}\bar{Y}}{nS_Y^2}$
(5). $d = b_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (Y - \bar{Y})^2}$
(6). $d = b_{XY} = \frac{\sum D_X D_Y - \frac{\sum D_X \sum D_Y}{n}}{\sum D_Y^2 - \frac{(\sum D_Y)^2}{n}}$, where $D_X = X - A$, $D_Y = Y - B$ (A and B constants)
(7). $d = b_{XY} = r\frac{S_X}{S_Y}$
(8). $d = b_{XY} = \frac{S_{XY}}{S_Y^2}$
Where $S_{XY}$, $S_X$ and $S_Y$ are defined as before.
Scatter Diagram
A scatter diagram is obtained by plotting the paired observations (X, Y) as points, with X
measured along the X-axis and Y along the Y-axis.
The observed values of (X, Y) do not all fall on the regression line; they scatter away from
it. The degree of scatter (or dispersion) of the observed values about the regression line is
measured by what is called the standard deviation of regression, or the standard error of
estimate, of Y on X and of X on Y.
1. Standard Error of Estimate of Y on X ($\hat{Y} = a + bX$)
For ungrouped data:
$s_{y.x} = \sqrt{\frac{\sum Y^2 - a\sum Y - b\sum XY}{n - 2}}$
Or
$s_{y.x} = \sqrt{\frac{\sum (Y - \hat{Y})^2}{n - 2}}$, where $\hat{Y}$ are the trend (estimated) values.
For grouped data:
$s_{y.x} = k\sqrt{\frac{\sum fv^2 - a\sum fv - b\sum fuv}{\sum f - 2}}$, where k is the class-interval constant.
2. Standard Error of Estimate of X on Y ($\hat{X} = c + dY$)
For ungrouped data:
$s_{x.y} = \sqrt{\frac{\sum X^2 - c\sum X - d\sum XY}{n - 2}}$
Or
$s_{x.y} = \sqrt{\frac{\sum (X - \hat{X})^2}{n - 2}}$, where $\hat{X}$ are the trend (estimated) values.
For grouped data:
$s_{x.y} = h\sqrt{\frac{\sum fu^2 - c\sum fu - d\sum fuv}{\sum f - 2}}$, where h is the class-interval constant.
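Fitting the least squares line $\hat{Y} = a + bX$ by the direct formulas can be sketched in Python (illustrative data); the fitted residuals sum to (numerically) zero:

```python
# Least squares fit of Y = a + bX (illustrative data)
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)

sum_x, sum_y = sum(X), sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_x2 = sum(x * x for x in X)

# b = (n*sum(XY) - sum(X)*sum(Y)) / (n*sum(X^2) - (sum(X))^2)
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
# a = Ybar - b*Xbar
a = sum_y / n - b * sum_x / n

# Fitted values; the residuals sum to zero
Yhat = [a + b * x for x in X]
residual_sum = sum(y - yh for y, yh in zip(Y, Yhat))

print(round(a, 2), round(b, 2), round(residual_sum, 10))
```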
Multiple Regression
A regression which involves two or more independent variables is called a multiple
regression. For example, the yield of a crop depends upon the fertility of the land, fertilizer
applied, rainfall, quality of seed, etc.; likewise, the systolic blood pressure of a person
depends upon one's weight, age, etc.
The multiple linear regression model is
$Y_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$
For two independent variables:
$Y_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
The estimated multiple linear regression based on sample data is
$\hat{Y} = a + b_1 X_1 + b_2 X_2$
Normal Equations are
$\sum Y = na + b_1\sum X_1 + b_2\sum X_2$
$\sum X_1 Y = a\sum X_1 + b_1\sum X_1^2 + b_2\sum X_1 X_2$
$\sum X_2 Y = a\sum X_2 + b_1\sum X_1 X_2 + b_2\sum X_2^2$
Regression of $X_1$ on $X_2$ and $X_3$
$b_{12.3}$ and $b_{13.2}$ are the regression coefficients of the multiple regression line of $X_1$ on $X_2$ and $X_3$.
Multiple Linear Regression / Multiple Regression Line / Least Squares Multiple Regression Line
$\hat{X}_1 = a + b_{12.3} X_2 + b_{13.2} X_3$
Or
$(\hat{X}_1 - \bar{X}_1) = b_{12.3}(X_2 - \bar{X}_2) + b_{13.2}(X_3 - \bar{X}_3)$
General Method
Normal Equations
$\sum X_1 = na + b_{12.3}\sum X_2 + b_{13.2}\sum X_3$
$\sum X_1 X_2 = a\sum X_2 + b_{12.3}\sum X_2^2 + b_{13.2}\sum X_2 X_3$
$\sum X_1 X_3 = a\sum X_3 + b_{12.3}\sum X_2 X_3 + b_{13.2}\sum X_3^2$
We get the values of $a$, $b_{12.3}$ and $b_{13.2}$ by solving the above equations simultaneously.
Alternative Methods
Direct formula for a: $a = \bar{X}_1 - b_{12.3}\bar{X}_2 - b_{13.2}\bar{X}_3$
Direct formulas for $b_{12.3}$ and $b_{13.2}$:
$b_{12.3} = -\frac{S_1}{S_2} \cdot \frac{\omega_{12}}{\omega_{11}} = \frac{S_1}{S_2} \cdot \frac{r_{12} - r_{13}r_{23}}{1 - r_{23}^2}$
$b_{13.2} = -\frac{S_1}{S_3} \cdot \frac{\omega_{13}}{\omega_{11}} = \frac{S_1}{S_3} \cdot \frac{r_{13} - r_{12}r_{23}}{1 - r_{23}^2}$
Where r denotes a correlation coefficient and $\omega_{ij}$ is the (signed) cofactor of $r_{ij}$ in the correlation determinant
$\begin{vmatrix} 1 & r_{12} & r_{13} \\ r_{21} & 1 & r_{23} \\ r_{31} & r_{32} & 1 \end{vmatrix}$, e.g. $\omega_{11} = 1 - r_{23}^2$.
Direct Method to Solve the Multiple Regression Equation of $X_1$ on $X_2$ and $X_3$
$\omega_{11}\frac{X_1 - \bar{X}_1}{S_1} + \omega_{12}\frac{X_2 - \bar{X}_2}{S_2} + \omega_{13}\frac{X_3 - \bar{X}_3}{S_3} = 0$
Or
$\hat{X}_1 = \bar{X}_1 + \frac{S_1}{S_2} \cdot \frac{r_{12} - r_{13}r_{23}}{1 - r_{23}^2}(X_2 - \bar{X}_2) + \frac{S_1}{S_3} \cdot \frac{r_{13} - r_{12}r_{23}}{1 - r_{23}^2}(X_3 - \bar{X}_3)$
Regression of $X_2$ on $X_1$ and $X_3$
$b_{21.3}$ and $b_{23.1}$ are the regression coefficients of the multiple regression line of $X_2$ on $X_1$ and $X_3$.
Multiple Linear Regression / Multiple Regression Line / Least Squares Multiple Regression Line
$\hat{X}_2 = a + b_{21.3} X_1 + b_{23.1} X_3$
Or
$(\hat{X}_2 - \bar{X}_2) = b_{21.3}(X_1 - \bar{X}_1) + b_{23.1}(X_3 - \bar{X}_3)$
General Method
Normal Equations
$\sum X_2 = na + b_{21.3}\sum X_1 + b_{23.1}\sum X_3$
$\sum X_1 X_2 = a\sum X_1 + b_{21.3}\sum X_1^2 + b_{23.1}\sum X_1 X_3$
$\sum X_2 X_3 = a\sum X_3 + b_{21.3}\sum X_1 X_3 + b_{23.1}\sum X_3^2$
We get the values of $a$, $b_{21.3}$ and $b_{23.1}$ by solving the above equations simultaneously.
Alternative Methods
Direct formula for a: $a = \bar{X}_2 - b_{21.3}\bar{X}_1 - b_{23.1}\bar{X}_3$
Direct formulas for $b_{21.3}$ and $b_{23.1}$:
$b_{21.3} = -\frac{S_2}{S_1} \cdot \frac{\omega_{21}}{\omega_{22}} = \frac{S_2}{S_1} \cdot \frac{r_{21} - r_{23}r_{13}}{1 - r_{13}^2}$
$b_{23.1} = -\frac{S_2}{S_3} \cdot \frac{\omega_{23}}{\omega_{22}} = \frac{S_2}{S_3} \cdot \frac{r_{23} - r_{21}r_{31}}{1 - r_{13}^2}$
Where r denotes a correlation coefficient and $\omega_{ij}$ is the (signed) cofactor of $r_{ij}$ in the correlation determinant, e.g. $\omega_{22} = 1 - r_{13}^2$.
Direct Method to Solve the Multiple Regression Equation of $X_2$ on $X_1$ and $X_3$
$\omega_{21}\frac{X_1 - \bar{X}_1}{S_1} + \omega_{22}\frac{X_2 - \bar{X}_2}{S_2} + \omega_{23}\frac{X_3 - \bar{X}_3}{S_3} = 0$
Or
$\hat{X}_2 = \bar{X}_2 + \frac{S_2}{S_1} \cdot \frac{r_{21} - r_{23}r_{13}}{1 - r_{13}^2}(X_1 - \bar{X}_1) + \frac{S_2}{S_3} \cdot \frac{r_{23} - r_{21}r_{31}}{1 - r_{13}^2}(X_3 - \bar{X}_3)$
Regression of $X_3$ on $X_1$ and $X_2$
$b_{31.2}$ and $b_{32.1}$ are the regression coefficients of the multiple regression line of $X_3$ on $X_1$ and $X_2$.
Multiple Linear Regression / Multiple Regression Line / Least Squares Multiple Regression Line
$\hat{X}_3 = a + b_{31.2} X_1 + b_{32.1} X_2$
Or
$(\hat{X}_3 - \bar{X}_3) = b_{31.2}(X_1 - \bar{X}_1) + b_{32.1}(X_2 - \bar{X}_2)$
General Method
Normal Equations
$\sum X_3 = na + b_{31.2}\sum X_1 + b_{32.1}\sum X_2$
$\sum X_1 X_3 = a\sum X_1 + b_{31.2}\sum X_1^2 + b_{32.1}\sum X_1 X_2$
$\sum X_2 X_3 = a\sum X_2 + b_{31.2}\sum X_1 X_2 + b_{32.1}\sum X_2^2$
We get the values of $a$, $b_{31.2}$ and $b_{32.1}$ by solving the above equations simultaneously.
Alternative Methods
Direct formula for a: $a = \bar{X}_3 - b_{31.2}\bar{X}_1 - b_{32.1}\bar{X}_2$
Direct formulas for $b_{31.2}$ and $b_{32.1}$:
$b_{31.2} = -\frac{S_3}{S_1} \cdot \frac{\omega_{31}}{\omega_{33}} = \frac{S_3}{S_1} \cdot \frac{r_{31} - r_{32}r_{21}}{1 - r_{12}^2}$
$b_{32.1} = -\frac{S_3}{S_2} \cdot \frac{\omega_{32}}{\omega_{33}} = \frac{S_3}{S_2} \cdot \frac{r_{32} - r_{31}r_{21}}{1 - r_{12}^2}$
Where r denotes a correlation coefficient and $\omega_{ij}$ is the (signed) cofactor of $r_{ij}$ in the correlation determinant, e.g. $\omega_{33} = 1 - r_{12}^2$.
Direct Method to Solve the Multiple Regression Equation of $X_3$ on $X_1$ and $X_2$
$\omega_{31}\frac{X_1 - \bar{X}_1}{S_1} + \omega_{32}\frac{X_2 - \bar{X}_2}{S_2} + \omega_{33}\frac{X_3 - \bar{X}_3}{S_3} = 0$
Or
$\hat{X}_3 = \bar{X}_3 + \frac{S_3}{S_1} \cdot \frac{r_{31} - r_{32}r_{21}}{1 - r_{12}^2}(X_1 - \bar{X}_1) + \frac{S_3}{S_2} \cdot \frac{r_{32} - r_{31}r_{21}}{1 - r_{12}^2}(X_2 - \bar{X}_2)$
Positive Correlation
Correlation in the same direction is called positive correlation: if one variable increases, the
other also increases, and if one decreases, the other also decreases. For example, an increase in
the heights of children is usually accompanied by an increase in their weights, and the length of
an iron bar will increase as the temperature increases.
Negative Correlation
Correlation in the opposite direction is called negative correlation: if one variable increases, the
other decreases, and vice versa. For example, the volume of a gas will decrease as the pressure
increases.
No Correlation
If there is no relationship between the two variables, it is called no correlation or zero
correlation.
Coefficient of Correlation
It is a measure of the degree of interdependence between two variables. It is a pure number
which lies between −1 and +1; the intermediate value of zero indicates the absence of
correlation. It is denoted by r.
$r = \sqrt{b \times d}$  Or  $r = \sqrt{b_{XY} \times b_{YX}}$
The correlation coefficient is independent of origin and unit of measurement, i.e.
$r_{XY} = r_{UV}$
$-1 \le r \le 1$
For ungrouped data:
(1). $r = r_{XY} = r_{YX} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2}\sqrt{\sum (Y - \bar{Y})^2}}$
(2). $r = \frac{\sum XY - \frac{\sum X \sum Y}{n}}{\sqrt{\sum X^2 - \frac{(\sum X)^2}{n}}\sqrt{\sum Y^2 - \frac{(\sum Y)^2}{n}}}$
(3). $r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{n\sum X^2 - (\sum X)^2}\sqrt{n\sum Y^2 - (\sum Y)^2}}$
(4). $r = \frac{\sum XY - n\bar{X}\bar{Y}}{\sqrt{\sum X^2 - n\bar{X}^2}\sqrt{\sum Y^2 - n\bar{Y}^2}}$
(5). $r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{nS_X S_Y}$
(6). $r = \frac{\sum XY - n\bar{X}\bar{Y}}{nS_X S_Y}$
(7). $r = r_{UV} = \frac{\sum UV - \frac{\sum U \sum V}{n}}{\sqrt{\sum U^2 - \frac{(\sum U)^2}{n}}\sqrt{\sum V^2 - \frac{(\sum V)^2}{n}}}$, where $U = \frac{X - A}{h} = \frac{D_X}{h}$, $V = \frac{Y - B}{k} = \frac{D_Y}{k}$
(8). $r = \frac{\sum D_X D_Y - \frac{\sum D_X \sum D_Y}{n}}{\sqrt{\sum D_X^2 - \frac{(\sum D_X)^2}{n}}\sqrt{\sum D_Y^2 - \frac{(\sum D_Y)^2}{n}}}$, where $D_X = X - A$, $D_Y = Y - B$ (A and B constants)
Also $r = r_{XY} = r_{YX} = \sqrt{b \times d}$, where $b = b_{YX} = r\frac{S_Y}{S_X}$ and $d = b_{XY} = r\frac{S_X}{S_Y}$.
For grouped data (bivariate frequency tables):
(1). $r = r_{XY} = r_{YX} = \frac{\sum fXY - \frac{\sum fX \sum fY}{\sum f}}{\sqrt{\sum fX^2 - \frac{(\sum fX)^2}{\sum f}}\sqrt{\sum fY^2 - \frac{(\sum fY)^2}{\sum f}}}$
(2). $r = \frac{\sum fD_X D_Y - \frac{\sum fD_X \sum fD_Y}{\sum f}}{\sqrt{\sum fD_X^2 - \frac{(\sum fD_X)^2}{\sum f}}\sqrt{\sum fD_Y^2 - \frac{(\sum fD_Y)^2}{\sum f}}}$
(3). $r = r_{UV} = r_{VU} = \frac{\sum fUV - \frac{\sum fU \sum fV}{\sum f}}{\sqrt{\sum fU^2 - \frac{(\sum fU)^2}{\sum f}}\sqrt{\sum fV^2 - \frac{(\sum fV)^2}{\sum f}}}$
(4). $r = r_{XY} = r_{YX} = \sqrt{b \times d}$
Where $b_{YX} = \frac{k}{h} b_{VU}$ and $b_{XY} = \frac{h}{k} b_{UV}$.
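A Python sketch of the correlation coefficient for ungrouped data (illustrative values), also verifying the property $r^2 = b_{YX} \times b_{XY}$:

```python
import math

X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)
mx, my = sum(X) / n, sum(Y) / n

sxy = sum((x - mx) * (y - my) for x, y in zip(X, Y))
sxx = sum((x - mx) ** 2 for x in X)
syy = sum((y - my) ** 2 for y in Y)

r = sxy / math.sqrt(sxx * syy)

# The product of the two regression coefficients equals r^2
b_yx = sxy / sxx
b_xy = sxy / syy
print(round(r, 4), round(r ** 2 - b_yx * b_xy, 10))
```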
Rank Correlation
Sometimes the actual measurements or counts of individuals or objects are either not
available, or accurate assessment is not possible. They are then arranged in order
according to some characteristic of interest. Such an ordered arrangement is called a
ranking, and the order given to an individual or object is called its rank. The correlation
between two such sets of rankings is known as rank correlation.
Rank Correlation Coefficient (Spearman's Formula)
$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$
For tied (repeated) ranks:
$r_s = 1 - \frac{6\left[\sum d^2 + \frac{1}{12}(t_1^3 - t_1) + \frac{1}{12}(t_2^3 - t_2) + \cdots\right]}{n(n^2 - 1)}$
where $t_1, t_2, \ldots$ are the numbers of observations tied at each set of equal ranks.
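Spearman's formula can be sketched in Python (the ranks below are illustrative and contain no ties):

```python
# Spearman's rank correlation for two sets of rankings (no ties)
rank_x = [1, 2, 3, 4, 5]   # ranks given by judge 1 (illustrative)
rank_y = [2, 1, 4, 3, 5]   # ranks given by judge 2 (illustrative)
n = len(rank_x)

# d = difference between paired ranks
d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
rs = 1 - 6 * d2 / (n * (n ** 2 - 1))

print(rs)
```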
Multiple Correlation
The multiple correlation coefficient measures the degree of relationship between one variable
and a group of variables, the variable itself not being included in that group, e.g. $R_{y.12}$, $R_{1.23}$.
(1). $R_{1.23} = R_{1.32} = \sqrt{1 - \frac{\Delta}{\omega_{11}}}$, with $0 \le R_{1.23} \le 1$
(2). $R_{2.13} = R_{2.31} = \sqrt{1 - \frac{\Delta}{\omega_{22}}}$
(3). $R_{3.12} = R_{3.21} = \sqrt{1 - \frac{\Delta}{\omega_{33}}}$
Where $\Delta$ is the correlation determinant
$\Delta = \begin{vmatrix} 1 & r_{12} & r_{13} \\ r_{21} & 1 & r_{23} \\ r_{31} & r_{32} & 1 \end{vmatrix} = 1 - r_{12}^2 - r_{13}^2 - r_{23}^2 + 2r_{12}r_{13}r_{23}$
(since $r_{12} = r_{21}$, $r_{23} = r_{32}$, $r_{13} = r_{31}$), and $\omega_{11} = 1 - r_{23}^2$, $\omega_{22} = 1 - r_{13}^2$, $\omega_{33} = 1 - r_{12}^2$ are the cofactors of its diagonal elements. For example,
$R_{1.23} = \sqrt{\frac{r_{12}^2 + r_{13}^2 - 2r_{12}r_{13}r_{23}}{1 - r_{23}^2}}$
$R_{1.23}^2$, $R_{2.13}^2$ and $R_{3.12}^2$ are known as coefficients of multiple determination.
Partial Correlation
The correlation between two variables, keeping the effects of all other variables constant, is
called partial correlation, for example $r_{12.3}$, $r_{13.2}$, $r_{23.1}$.
(1). $r_{12.3} = r_{21.3} = \frac{r_{12} - r_{13}r_{23}}{\sqrt{1 - r_{13}^2}\sqrt{1 - r_{23}^2}}$
(2). $r_{13.2} = r_{31.2} = \frac{r_{13} - r_{12}r_{32}}{\sqrt{1 - r_{12}^2}\sqrt{1 - r_{32}^2}}$
(3). $r_{23.1} = r_{32.1} = \frac{r_{23} - r_{21}r_{31}}{\sqrt{1 - r_{21}^2}\sqrt{1 - r_{31}^2}}$
Time Series
Examples of time series include observations recorded for a number of years: hourly
temperatures recorded at a locality for a period of years, the weekly prices of wheat in Lahore,
the monthly consumption of electricity in a certain town, the monthly totals of passengers
carried by rail, the quarterly sales of a certain fertilizer, the annual rainfall at Karachi for a
number of years, the enrolment of students in a college or university over a number of years,
and so forth.
Components of a Time Series
$Y = T \times C \times S \times I$
where T is the secular trend, C the cyclical movement, S the seasonal variation and I the
irregular component.
Fitting a Linear Trend: $\hat{Y} = a + bX$
Normal Equations
$\sum Y = na + b\sum X$
$\sum XY = a\sum X + b\sum X^2$
Fitting a Second-Degree (Quadratic) Trend: $\hat{Y} = a + bX + cX^2$
Normal Equations
$\sum Y = na + b\sum X + c\sum X^2$
$\sum XY = a\sum X + b\sum X^2 + c\sum X^3$
$\sum X^2 Y = a\sum X^2 + b\sum X^3 + c\sum X^4$
Fitting a Third-Degree (Cubic) Trend: $\hat{Y} = a + bX + cX^2 + dX^3$
Normal Equations
$\sum Y = na + b\sum X + c\sum X^2 + d\sum X^3$
$\sum XY = a\sum X + b\sum X^2 + c\sum X^3 + d\sum X^4$
$\sum X^2 Y = a\sum X^2 + b\sum X^3 + c\sum X^4 + d\sum X^5$
$\sum X^3 Y = a\sum X^3 + b\sum X^4 + c\sum X^5 + d\sum X^6$
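When the time values are coded so that $\sum X = 0$ (odd number of periods, middle period taken as origin), the normal equations for the linear trend reduce to $a = \sum Y / n$ and $b = \sum XY / \sum X^2$. A Python sketch with hypothetical annual figures:

```python
# Fitting a linear trend Yhat = a + bX with coded time so that sum(X) = 0
years = [2018, 2019, 2020, 2021, 2022]
Y = [10, 12, 13, 15, 20]   # e.g. hypothetical annual sales
X = [-2, -1, 0, 1, 2]      # coded time, middle year as origin

n = len(Y)
a = sum(Y) / n                                                # a = sum(Y)/n
b = sum(x * y for x, y in zip(X, Y)) / sum(x * x for x in X)  # b = sum(XY)/sum(X^2)

trend = [a + b * x for x in X]
print(a, b)
```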
Finite Population
A population is said to be finite if it consists of a finite or fixed number of elements, for
example all university students in Pakistan, or the weights of all students enrolled at Punjab
University.
Infinite Population
A population is said to be infinite if there is no limit to the number of elements. For example,
all heights between 2 and 3 metres.
Existent Population
A population which consists of concrete objects is called an existent population.
Hypothetical Population
A population which does not contain concrete objects or items is called hypothetical
population.
Sample
A representative small part of a population is called a sample. The number of elements
included in a sample is called the sample size. It is denoted by n.
Sampling
Sampling is the procedure of selecting a sample from a population. The number of possible
samples of size n drawn without replacement from a population of size N is
$\binom{N}{n} = {}^N C_n = \frac{N!}{(N - n)!\,n!}$
If, for example, we have N = 5 and n = 2, the number of possible samples will be
${}^5 C_2 = \frac{5!}{(5 - 2)!\,2!} = 10$
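The count of possible samples can be checked with Python's standard library:

```python
import math

# Number of possible samples of size n from a population of size N,
# sampling without replacement
N, n = 5, 2
num_samples = math.factorial(N) // (math.factorial(N - n) * math.factorial(n))

print(num_samples)        # 10
print(math.comb(N, n))    # same result via the standard library helper
```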
Parameter
Numerical values computed from a population are called parameters. These are fixed
numbers, usually denoted by Greek or capital letters, for example the population mean $\mu$
and standard deviation $\sigma$.
Statistic
Numerical values computed from a sample are called statistics. They vary from sample to
sample drawn from the same population, and are denoted by Roman or small letters, for
example the sample mean $\bar{X}$ and sample standard deviation S.
Sampling Units
The basic elements or objects which we select for a sample are called sampling units. For
example, if we want to measure the average height of college students, the individual students
are the sampling units.
Sampling Frame
A complete list of all the sampling units in the population is called a sampling frame.
Census
Complete enumeration of similar and dissimilar units is termed as census.
Sample Survey
In a sample Survey, enumeration is limited to only a part, or a sample select from the
population.
Sampling Error
The difference between the statistic and the parameter, arising because only a sample rather
than the whole population is observed, is called sampling error. It can be reduced by increasing
the sample size to a sufficient level.
Sampling Error = $\bar{x} - \mu$
Where $\bar{x}$ = Sample Mean and $\mu$ = Population Mean.
Non-Sampling Error
Non-sampling errors are those errors that arise due to a defective sampling frame or to
information not being provided correctly. For example, income, sales, production, age, etc. are
not quoted correctly in most cases.
Sampling Bias
Bias is a cumulative component of error which arises due to defective selection of the
sample or negligence of the investigator. Errors due to bias increase with an increase in
the size of sample.
Standard Error
The standard deviation of the sampling distribution of a statistic is called its standard error
(abbreviated to S.E). For the sample mean,
$S.E(\bar{x}) = \frac{\sigma}{\sqrt{n}}$
Sampling Distribution
Frequency distribution of statistics from all samples is called sampling distribution. For
example, sampling distributions of sample mean or sample distribution of sample
variance.
Stratified Sampling
When a population contains highly variable material, simple random sampling fails to give
accurate results. In this case the population is heterogeneous, and it is divided into
homogeneous subgroups called strata. A sample is then selected separately from each
stratum at random, and these are combined into a single sample. This method is called
stratified random sampling.
Systematic Sampling
Systematic sampling is a method of selecting a sample that calls for taking every Kth
element in the population. The first unit of the sample is selected at random from the first 1 to
K units of the population, and thereafter every Kth unit is included in the sample.
Cluster Sampling
Cluster sampling is a method of selecting a sample in which the population is divided into
natural groups, such as households, agricultural farms, etc., which are called clusters; taking
these clusters as sampling units, a sample of clusters is drawn at random.
Quota Sampling
Quota sampling is a method of selecting a sample of convenience with certain controls, to
avoid some of the more serious biases involved in taking those most conveniently
available. In this method quotas are set up, for example by specifying the number of
interviews from urban and rural areas, males and females, etc.
Notation
Population Size = N, Sample Size = n
Population Mean = $\mu = \frac{\sum X}{N}$
Population Variance = $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
Population Proportion = $P = \frac{X}{N}$
Sample Mean = $\bar{x} = \frac{\sum x}{n}$
Sample Proportion = $\hat{p} = \frac{x}{n}$
Biased Sample Variance = $S^2 = \frac{\sum (x - \bar{x})^2}{n}$, $S = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
Unbiased Sample Variance = $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$, $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
Number of possible samples of size n: with replacement $N^n$; without replacement $\binom{N}{n} = {}^N C_n = \frac{N!}{(N - n)!\,n!}$
Sampling Distribution of the Sample Mean ($\bar{x}$)
1) Mean of the sampling distribution: $\mu_{\bar{x}} = E(\bar{x}) = \sum \bar{x} f(\bar{x})$
2) Variance of the sampling distribution: $\sigma_{\bar{x}}^2 = E(\bar{x}^2) - [E(\bar{x})]^2 = \sum \bar{x}^2 f(\bar{x}) - \left[\sum \bar{x} f(\bar{x})\right]^2$
3) Population Mean = $\mu = \frac{\sum X}{N}$
4) Population Variance = $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
Verification
Sampling with replacement:
a. $\mu_{\bar{x}} = \mu$
b. $\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$
c. $\sigma_{\bar{x}} = S.E(\bar{x}) = \frac{\sigma}{\sqrt{n}}$
Sampling without replacement:
a. $\mu_{\bar{x}} = \mu$
b. $\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$
c. $\sigma_{\bar{x}} = S.E(\bar{x}) = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$
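The with-replacement results $\mu_{\bar{x}} = \mu$ and $\sigma_{\bar{x}}^2 = \sigma^2/n$ can be verified by enumerating every possible sample from a tiny hypothetical population:

```python
from itertools import product

population = [2, 4, 6]  # tiny hypothetical population
N = len(population)
mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N

# All possible samples of size n = 2, drawn with replacement
n = 2
sample_means = [sum(s) / n for s in product(population, repeat=n)]

mu_xbar = sum(sample_means) / len(sample_means)
var_xbar = sum((m - mu_xbar) ** 2 for m in sample_means) / len(sample_means)

# mu_xbar equals mu, and var_xbar equals sigma2 / n
print(mu_xbar, var_xbar, sigma2 / n)
```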
Sampling Distribution of the Difference between Two Means ($\bar{x}_1 - \bar{x}_2$)
1) Mean of the sampling distribution: $\mu_{\bar{x}_1 - \bar{x}_2} = E(\bar{x}_1 - \bar{x}_2) = \sum (\bar{x}_1 - \bar{x}_2) f(\bar{x}_1 - \bar{x}_2)$
2) Variance of the sampling distribution: $\sigma_{\bar{x}_1 - \bar{x}_2}^2 = E[(\bar{x}_1 - \bar{x}_2)^2] - [E(\bar{x}_1 - \bar{x}_2)]^2$
3) Population Means: $\mu_1 = \frac{\sum X_1}{N_1}$, $\mu_2 = \frac{\sum X_2}{N_2}$
4) Population Variances: $\sigma_1^2 = \frac{\sum (X_1 - \mu_1)^2}{N_1}$, $\sigma_2^2 = \frac{\sum (X_2 - \mu_2)^2}{N_2}$
Verification
Sampling with replacement:
a. $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
b. $\sigma_{\bar{x}_1 - \bar{x}_2}^2 = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$
c. $S.E(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
Sampling without replacement:
a. $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
b. $\sigma_{\bar{x}_1 - \bar{x}_2}^2 = \frac{\sigma_1^2}{n_1} \cdot \frac{N_1 - n_1}{N_1 - 1} + \frac{\sigma_2^2}{n_2} \cdot \frac{N_2 - n_2}{N_2 - 1}$
c. $S.E(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{\sigma_1^2}{n_1} \cdot \frac{N_1 - n_1}{N_1 - 1} + \frac{\sigma_2^2}{n_2} \cdot \frac{N_2 - n_2}{N_2 - 1}}$
Sampling Distribution of the Sample Proportion ($\hat{P}$)
1) Mean: $\mu_{\hat{P}} = E(\hat{P}) = \sum \hat{P} f(\hat{P})$
2) Variance: $\sigma^2_{\hat{P}} = E(\hat{P}^2) - [E(\hat{P})]^2 = \sum \hat{P}^2 f(\hat{P}) - \left[\sum \hat{P} f(\hat{P})\right]^2$
3) Population proportion $= P = \frac{X}{N}$, where $X$ is the number of units in the population possessing the attribute.
4) Verification (sampling with replacement):
a. $\mu_{\hat{P}} = P$
b. $\sigma^2_{\hat{P}} = \frac{Pq}{n}$, where $q = 1 - P$
c. $S.E(\hat{P}) = \sqrt{\frac{Pq}{n}}$
Verification (sampling without replacement):
a. $\mu_{\hat{P}} = P$
b. $\sigma^2_{\hat{P}} = \frac{Pq}{n}\cdot\frac{N-n}{N-1}$
c. $S.E(\hat{P}) = \sqrt{\frac{Pq}{n}\cdot\frac{N-n}{N-1}}$
Sampling Distribution of the Difference between Two Proportions ($\hat{P}_1 - \hat{P}_2$)
1) Mean: $\mu_{\hat{P}_1-\hat{P}_2} = E(\hat{P}_1-\hat{P}_2) = \sum (\hat{P}_1-\hat{P}_2) f(\hat{P}_1-\hat{P}_2)$
2) Variance: $\sigma^2_{\hat{P}_1-\hat{P}_2} = E[(\hat{P}_1-\hat{P}_2)^2] - [E(\hat{P}_1-\hat{P}_2)]^2 = \sum (\hat{P}_1-\hat{P}_2)^2 f(\hat{P}_1-\hat{P}_2) - \left[\sum (\hat{P}_1-\hat{P}_2) f(\hat{P}_1-\hat{P}_2)\right]^2$
3) Population proportions: $P_1 = \frac{X_1}{N_1}$, $P_2 = \frac{X_2}{N_2}$
4) Verification (sampling with replacement):
a. $\mu_{\hat{P}_1-\hat{P}_2} = P_1 - P_2$
b. $\sigma^2_{\hat{P}_1-\hat{P}_2} = \frac{P_1 q_1}{n_1} + \frac{P_2 q_2}{n_2}$, where $q_1 = 1 - P_1$ and $q_2 = 1 - P_2$
c. $S.E(\hat{P}_1-\hat{P}_2) = \sqrt{\frac{P_1 q_1}{n_1} + \frac{P_2 q_2}{n_2}}$
Verification (sampling without replacement):
a. $\mu_{\hat{P}_1-\hat{P}_2} = P_1 - P_2$
b. $\sigma^2_{\hat{P}_1-\hat{P}_2} = \frac{P_1 q_1}{n_1}\cdot\frac{N_1-n_1}{N_1-1} + \frac{P_2 q_2}{n_2}\cdot\frac{N_2-n_2}{N_2-1}$
c. $S.E(\hat{P}_1-\hat{P}_2) = \sqrt{\frac{P_1 q_1}{n_1}\cdot\frac{N_1-n_1}{N_1-1} + \frac{P_2 q_2}{n_2}\cdot\frac{N_2-n_2}{N_2-1}}$
Sampling Distribution of the Biased Variance ($S^2$)
1) Mean: $\mu_{S^2} = E(S^2) = \sum S^2 f(S^2)$
2) Variance: $\sigma^2_{S^2} = E[(S^2)^2] - [E(S^2)]^2 = \sum (S^2)^2 f(S^2) - \left[\sum S^2 f(S^2)\right]^2$
3) Population mean $= \mu = \frac{\sum x}{N}$; population variance $= \sigma^2 = \frac{\sum (x-\mu)^2}{N}$
Verification: when sampling with replacement, $E(S^2) = \frac{n-1}{n}\sigma^2 \neq \sigma^2$, which is why $S^2 = \frac{\sum (x-\bar{x})^2}{n}$ is called a biased estimator of $\sigma^2$.
Sampling Distribution of the Unbiased Variance ($s^2$)
1) Mean: $\mu_{s^2} = E(s^2) = \sum s^2 f(s^2)$
2) Variance: $\sigma^2_{s^2} = E[(s^2)^2] - [E(s^2)]^2 = \sum (s^2)^2 f(s^2) - \left[\sum s^2 f(s^2)\right]^2$
3) Population mean $= \mu = \frac{\sum x}{N}$; population variance $= \sigma^2 = \frac{\sum (x-\mu)^2}{N}$
Verification: when sampling with replacement, $E(s^2) = \sigma^2$, so $s^2 = \frac{\sum (x-\bar{x})^2}{n-1}$ is an unbiased estimator of $\sigma^2$.
Chapter 9 Estimation
Confidence Interval for Population Mean With Replacement (Z-Test)
When the population standard deviation ($\sigma$) is known:
$P\left(\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$
Or
$\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
When the population standard deviation ($\sigma$) is unknown and $n > 30$:
$P\left(\bar{X} - Z_{\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{S}{\sqrt{n}}\right) = 1 - \alpha$
Or
$\bar{X} - Z_{\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{S}{\sqrt{n}}$
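A minimal Python sketch of the known-$\sigma$ interval (the sample figures are illustrative; 1.96 is $Z_{\alpha/2}$ for a 95% interval):

```python
import math

def z_interval(xbar, sigma, n, z_half_alpha=1.96):
    """Confidence interval for mu when sigma is known:
    xbar +/- Z_{alpha/2} * sigma / sqrt(n)."""
    margin = z_half_alpha * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

# Example: xbar = 50, sigma = 10, n = 100 -> 50 +/- 1.96
lo, hi = z_interval(50.0, 10.0, 100)
print(round(lo, 2), round(hi, 2))   # 48.04 51.96
```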
Confidence Interval for Population Mean Without Replacement (Z-Test)
When the population standard deviation ($\sigma$) is known:
$P\left(\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}\right) = 1 - \alpha$
Or
$\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$
When the population standard deviation ($\sigma$) is unknown and $n > 30$:
$P\left(\bar{X} - Z_{\alpha/2}\frac{S}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{S}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}\right) = 1 - \alpha$
Or
$\bar{X} - Z_{\alpha/2}\frac{S}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{S}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$
Confidence Interval for Difference Between Two Population Means With Replacement (Z-Test)
When the population standard deviations ($\sigma_1, \sigma_2$) are known:
$P\left((\bar{X}_1 - \bar{X}_2) - Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le (\bar{X}_1 - \bar{X}_2) + Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right) = 1 - \alpha$
Or
$(\bar{X}_1 - \bar{X}_2) - Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le (\bar{X}_1 - \bar{X}_2) + Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
When the population standard deviations are unknown and $n_1, n_2 > 30$, replace $\sigma_1^2, \sigma_2^2$ by the sample variances $S_1^2, S_2^2$:
$(\bar{X}_1 - \bar{X}_2) - Z_{\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} \le \mu_1 - \mu_2 \le (\bar{X}_1 - \bar{X}_2) + Z_{\alpha/2}\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$
The interval for $\mu_2 - \mu_1$ is identical with the subscripts interchanged:
$(\bar{X}_2 - \bar{X}_1) - Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_2 - \mu_1 \le (\bar{X}_2 - \bar{X}_1) + Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
Confidence Interval for Difference Between Two Population Means Without Replacement (Z-Test)
When the population standard deviations ($\sigma_1, \sigma_2$) are known:
$P\left((\bar{X}_1 - \bar{X}_2) - Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1}\cdot\frac{N_1-n_1}{N_1-1} + \frac{\sigma_2^2}{n_2}\cdot\frac{N_2-n_2}{N_2-1}} \le \mu_1 - \mu_2 \le (\bar{X}_1 - \bar{X}_2) + Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1}\cdot\frac{N_1-n_1}{N_1-1} + \frac{\sigma_2^2}{n_2}\cdot\frac{N_2-n_2}{N_2-1}}\right) = 1 - \alpha$
When the population standard deviations are unknown and $n_1, n_2 > 30$, replace $\sigma_1^2, \sigma_2^2$ by $S_1^2, S_2^2$:
$(\bar{X}_1 - \bar{X}_2) - Z_{\alpha/2}\sqrt{\frac{S_1^2}{n_1}\cdot\frac{N_1-n_1}{N_1-1} + \frac{S_2^2}{n_2}\cdot\frac{N_2-n_2}{N_2-1}} \le \mu_1 - \mu_2 \le (\bar{X}_1 - \bar{X}_2) + Z_{\alpha/2}\sqrt{\frac{S_1^2}{n_1}\cdot\frac{N_1-n_1}{N_1-1} + \frac{S_2^2}{n_2}\cdot\frac{N_2-n_2}{N_2-1}}$
The interval for $\mu_2 - \mu_1$ uses the same limits with $(\bar{X}_2 - \bar{X}_1)$ in place of $(\bar{X}_1 - \bar{X}_2)$.
Confidence Interval for Population Proportion (Z-Test)
$P\left(\hat{P} - Z_{\alpha/2}\sqrt{\frac{\hat{P}(1-\hat{P})}{n}} \le P \le \hat{P} + Z_{\alpha/2}\sqrt{\frac{\hat{P}(1-\hat{P})}{n}}\right) = 1 - \alpha$
Or
$\hat{P} - Z_{\alpha/2}\sqrt{\frac{\hat{P}(1-\hat{P})}{n}} \le P \le \hat{P} + Z_{\alpha/2}\sqrt{\frac{\hat{P}(1-\hat{P})}{n}}$
Confidence Interval for Difference Between Two Population
Proportions ($P_1 - P_2$) (Z-Test)
$P\left((\hat{P}_1 - \hat{P}_2) - Z_{\alpha/2}\sqrt{\frac{\hat{P}_1(1-\hat{P}_1)}{n_1} + \frac{\hat{P}_2(1-\hat{P}_2)}{n_2}} \le P_1 - P_2 \le (\hat{P}_1 - \hat{P}_2) + Z_{\alpha/2}\sqrt{\frac{\hat{P}_1(1-\hat{P}_1)}{n_1} + \frac{\hat{P}_2(1-\hat{P}_2)}{n_2}}\right) = 1 - \alpha$
Or
$(\hat{P}_1 - \hat{P}_2) - Z_{\alpha/2}\sqrt{\frac{\hat{P}_1(1-\hat{P}_1)}{n_1} + \frac{\hat{P}_2(1-\hat{P}_2)}{n_2}} \le P_1 - P_2 \le (\hat{P}_1 - \hat{P}_2) + Z_{\alpha/2}\sqrt{\frac{\hat{P}_1(1-\hat{P}_1)}{n_1} + \frac{\hat{P}_2(1-\hat{P}_2)}{n_2}}$
The interval for $P_2 - P_1$ uses the same limits with $(\hat{P}_2 - \hat{P}_1)$ in place of $(\hat{P}_1 - \hat{P}_2)$.
Confidence Interval for the Population Correlation Coefficient ($\rho$) (Z-Test)
$P\left(Z_f - Z_{\alpha/2}\frac{1}{\sqrt{n-3}} \le z_\rho \le Z_f + Z_{\alpha/2}\frac{1}{\sqrt{n-3}}\right) = 1 - \alpha$
Or
$Z_f - Z_{\alpha/2}\frac{1}{\sqrt{n-3}} \le z_\rho \le Z_f + Z_{\alpha/2}\frac{1}{\sqrt{n-3}}$
where $Z_f = 1.1513\log_{10}\frac{1+r}{1-r}$ is the Fisher transformation of the sample correlation coefficient $r$, and $z_\rho$ is the corresponding transformation of $\rho$.
Testing of Hypotheses concerning the Population Mean ($\mu$) (Z-Test)
1. Formulation of Hypotheses
Null $H_0: \mu = \mu_0$    Alternative $H_1: \mu \neq \mu_0$, $\mu > \mu_0$, or $\mu < \mu_0$
2. Significance Level
$\alpha$ = 5% / 1% or 0.05 / 0.01. If the significance level is not given, then we take 5% by default.
3. Critical Region (C.R)
For a two-tailed test (table area $\frac{1}{2} - \frac{\alpha}{2} \to Z_{\alpha/2}$):
If $H_1: \mu \neq \mu_0$, C.R $= Z \le -Z_{\alpha/2}$ or $Z \ge Z_{\alpha/2}$ (acceptance region $-Z_{\alpha/2} < Z < Z_{\alpha/2}$)
For a one-tailed test (table area $0.5 - \alpha \to Z_{\alpha}$):
If $H_1: \mu > \mu_0$, C.R $= Z \ge Z_{\alpha}$
If $H_1: \mu < \mu_0$, C.R $= Z \le -Z_{\alpha}$
4. Test Statistic
When the population S.D ($\sigma$) is known: $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
When the population S.D ($\sigma$) is unknown and $n > 30$: $Z = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$
5. Conclusion
If z-cal falls in the critical region (z-cal is greater than or equal to z-tab), then $H_0$ is rejected; otherwise $H_0$ is accepted.
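The five steps can be mechanized. A sketch of step 4 for the known-$\sigma$ case (all numbers illustrative):

```python
import math

def one_sample_z(xbar, mu0, sigma, n):
    """Z = (xbar - mu0) / (sigma / sqrt(n)) for H0: mu = mu0."""
    return (xbar - mu0) / (sigma / math.sqrt(n))

# xbar = 52, mu0 = 50, sigma = 10, n = 100 -> Z = 2.0
z = one_sample_z(52.0, 50.0, 10.0, 100)
# Two-tailed test at alpha = 0.05: z-tab = 1.96, so H0 is rejected.
print(round(z, 2), abs(z) >= 1.96)   # 2.0 True
```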
Testing of Hypotheses concerning the difference between two
Population Means ($\bar{X}_1 - \bar{X}_2$) (Z-Test)
1. Formulation of Hypotheses
Null $H_0: \mu_1 - \mu_2 = \Delta_0$    Alternative $H_1: \mu_1 - \mu_2 \neq \Delta_0$, $\mu_1 - \mu_2 > \Delta_0$, or $\mu_1 - \mu_2 < \Delta_0$
(here $\Delta_0$ is the hypothesized value of $\mu_1 - \mu_2$; $\Delta_0 = 0$ when testing equality of the two means)
2. Significance Level
$\alpha$ = 5% / 1% or 0.05 / 0.01. If the significance level is not given, then we take 5% by default.
3. Critical Region (C.R)
For a two-tailed test ($\frac{1}{2} - \frac{\alpha}{2} \to Z_{\alpha/2}$): if $H_1: \mu_1 - \mu_2 \neq \Delta_0$, C.R $= Z \le -Z_{\alpha/2}$ or $Z \ge Z_{\alpha/2}$
For a one-tailed test ($0.5 - \alpha \to Z_{\alpha}$): if $H_1: \mu_1 - \mu_2 > \Delta_0$, C.R $= Z \ge Z_{\alpha}$; if $H_1: \mu_1 - \mu_2 < \Delta_0$, C.R $= Z \le -Z_{\alpha}$
4. Test Statistic
When the population S.Ds ($\sigma_1, \sigma_2$) are known: $Z = \frac{(\bar{X}_1 - \bar{X}_2) - \Delta_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
When the population S.Ds are unknown and $n_1, n_2 > 30$: $Z = \frac{(\bar{X}_1 - \bar{X}_2) - \Delta_0}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$
5. Conclusion
If z-cal is greater than or equal to z-tab, then $H_0$ is rejected; otherwise $H_0$ is accepted.
Testing of Hypotheses concerning the difference between two
Population Means ($\bar{X}_2 - \bar{X}_1$) (Z-Test)
1. Formulation of Hypotheses
Null $H_0: \mu_2 - \mu_1 = \Delta_0$    Alternative $H_1: \mu_2 - \mu_1 \neq \Delta_0$, $\mu_2 - \mu_1 > \Delta_0$, or $\mu_2 - \mu_1 < \Delta_0$
(here $\Delta_0$ is the hypothesized value of $\mu_2 - \mu_1$)
2. Significance Level
$\alpha$ = 5% / 1% or 0.05 / 0.01. If the significance level is not given, then we take 5% by default.
3. Critical Region (C.R)
For a two-tailed test ($\frac{1}{2} - \frac{\alpha}{2} \to Z_{\alpha/2}$): if $H_1: \mu_2 - \mu_1 \neq \Delta_0$, C.R $= Z \le -Z_{\alpha/2}$ or $Z \ge Z_{\alpha/2}$
For a one-tailed test ($0.5 - \alpha \to Z_{\alpha}$): if $H_1: \mu_2 - \mu_1 > \Delta_0$, C.R $= Z \ge Z_{\alpha}$; if $H_1: \mu_2 - \mu_1 < \Delta_0$, C.R $= Z \le -Z_{\alpha}$
4. Test Statistic
When the population S.Ds ($\sigma_1, \sigma_2$) are known: $Z = \frac{(\bar{X}_2 - \bar{X}_1) - \Delta_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
When the population S.Ds are unknown and $n_1, n_2 > 30$: $Z = \frac{(\bar{X}_2 - \bar{X}_1) - \Delta_0}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$
5. Conclusion
If z-cal is greater than or equal to z-tab, then $H_0$ is rejected; otherwise $H_0$ is accepted.
Testing of Hypotheses concerning the Population Proportion ($P$) (Z-Test)
1. Formulation of Hypotheses
Null $H_0: P = P_0$    Alternative $H_1: P \neq P_0$, $P > P_0$, or $P < P_0$
2. Significance Level
$\alpha$ = 5% / 1% or 0.05 / 0.01. If the significance level is not given, then we take 5% by default.
3. Critical Region (C.R)
For a two-tailed test ($\frac{1}{2} - \frac{\alpha}{2} \to Z_{\alpha/2}$): if $H_1: P \neq P_0$, C.R $= Z \le -Z_{\alpha/2}$ or $Z \ge Z_{\alpha/2}$
For a one-tailed test ($0.5 - \alpha \to Z_{\alpha}$): if $H_1: P > P_0$, C.R $= Z \ge Z_{\alpha}$; if $H_1: P < P_0$, C.R $= Z \le -Z_{\alpha}$
4. Test Statistic
$Z = \frac{\hat{P} - P_0}{\sqrt{\frac{P_0(1-P_0)}{n}}}$, where $\hat{P} = \frac{X}{n}$
Or
$Z = \frac{\hat{P} - P_0}{\sqrt{\frac{P_0 q_0}{n}}}$, where $q_0 = 1 - P_0$
5. Conclusion
If z-cal is greater than or equal to z-tab, then $H_0$ is rejected; otherwise $H_0$ is accepted.
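A sketch of the test statistic (illustrative counts; $P_0 = 0.5$ plays the role of the hypothesized proportion):

```python
import math

def prop_z(x, n, p0):
    """Z = (phat - p0) / sqrt(p0 * (1 - p0) / n), with phat = x / n."""
    phat = x / n
    return (phat - p0) / math.sqrt(p0 * (1 - p0) / n)

# x = 60 successes in n = 100 trials against P0 = 0.5 -> Z = 2.0
print(round(prop_z(60, 100, 0.5), 2))   # 2.0
```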
Testing of Hypotheses concerning the difference between two
Population Proportions ($P_1 - P_2$) (Z-Test)
1. Formulation of Hypotheses
Null $H_0: P_1 - P_2 = \Delta_0$    Alternative $H_1: P_1 - P_2 \neq \Delta_0$, $P_1 - P_2 > \Delta_0$, or $P_1 - P_2 < \Delta_0$
2. Significance Level
$\alpha$ = 5% / 1% or 0.05 / 0.01. If the significance level is not given, then we take 5% by default.
3. Critical Region (C.R)
For a two-tailed test ($\frac{1}{2} - \frac{\alpha}{2} \to Z_{\alpha/2}$): if $H_1: P_1 - P_2 \neq \Delta_0$, C.R $= Z \le -Z_{\alpha/2}$ or $Z \ge Z_{\alpha/2}$
For a one-tailed test ($0.5 - \alpha \to Z_{\alpha}$): if $H_1: P_1 - P_2 > \Delta_0$, C.R $= Z \ge Z_{\alpha}$; if $H_1: P_1 - P_2 < \Delta_0$, C.R $= Z \le -Z_{\alpha}$
4. Test Statistic
$Z = \frac{(\hat{P}_1 - \hat{P}_2) - \Delta_0}{\sqrt{\frac{\hat{P}_1 \hat{q}_1}{n_1} + \frac{\hat{P}_2 \hat{q}_2}{n_2}}}$, where $\hat{P}_1 = \frac{X_1}{n_1}$, $\hat{P}_2 = \frac{X_2}{n_2}$
Or, when $\Delta_0 = 0$, the pooled form is used:
$Z = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{\hat{P}_c \hat{q}_c \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, where $\hat{P}_c = \frac{n_1 \hat{P}_1 + n_2 \hat{P}_2}{n_1 + n_2}$ and $\hat{q}_c = 1 - \hat{P}_c$
5. Conclusion
If z-cal is greater than or equal to z-tab, then $H_0$ is rejected; otherwise $H_0$ is accepted.
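A sketch of the pooled form (used when $\Delta_0 = 0$; the counts are illustrative):

```python
import math

def two_prop_z(x1, n1, x2, n2):
    """Pooled two-proportion Z statistic for H0: P1 = P2."""
    p1, p2 = x1 / n1, x2 / n2
    pc = (x1 + x2) / (n1 + n2)                 # pooled proportion Pc
    qc = 1 - pc
    se = math.sqrt(pc * qc * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# 60/100 vs 40/100 -> Pc = 0.5, Z is about 2.83
print(round(two_prop_z(60, 100, 40, 100), 2))
```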
Testing of Hypotheses concerning the difference between two
Population Proportions ($P_2 - P_1$) (Z-Test)
1. Formulation of Hypotheses
Null $H_0: P_2 - P_1 = \Delta_0$    Alternative $H_1: P_2 - P_1 \neq \Delta_0$, $P_2 - P_1 > \Delta_0$, or $P_2 - P_1 < \Delta_0$
2. Significance Level
$\alpha$ = 5% / 1% or 0.05 / 0.01. If the significance level is not given, then we take 5% by default.
3. Critical Region (C.R)
For a two-tailed test ($\frac{1}{2} - \frac{\alpha}{2} \to Z_{\alpha/2}$): if $H_1: P_2 - P_1 \neq \Delta_0$, C.R $= Z \le -Z_{\alpha/2}$ or $Z \ge Z_{\alpha/2}$
For a one-tailed test ($0.5 - \alpha \to Z_{\alpha}$): if $H_1: P_2 - P_1 > \Delta_0$, C.R $= Z \ge Z_{\alpha}$; if $H_1: P_2 - P_1 < \Delta_0$, C.R $= Z \le -Z_{\alpha}$
4. Test Statistic
$Z = \frac{(\hat{P}_2 - \hat{P}_1) - \Delta_0}{\sqrt{\frac{\hat{P}_1 \hat{q}_1}{n_1} + \frac{\hat{P}_2 \hat{q}_2}{n_2}}}$, where $\hat{P}_1 = \frac{X_1}{n_1}$, $\hat{P}_2 = \frac{X_2}{n_2}$
Or, when $\Delta_0 = 0$, the pooled form is used:
$Z = \frac{\hat{P}_2 - \hat{P}_1}{\sqrt{\hat{P}_c \hat{q}_c \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, where $\hat{P}_c = \frac{n_1 \hat{P}_1 + n_2 \hat{P}_2}{n_1 + n_2}$ and $\hat{q}_c = 1 - \hat{P}_c$
5. Conclusion
If z-cal is greater than or equal to z-tab, then $H_0$ is rejected; otherwise $H_0$ is accepted.
Testing of Hypotheses concerning the Population Correlation
Coefficient ($\rho$) when ($\rho = 0$ or $\rho = \rho_0$) (Z-Test)
1. Formulation of Hypotheses
Null $H_0: \rho = \rho_0$    Alternative $H_1: \rho \neq \rho_0$, $\rho > \rho_0$, or $\rho < \rho_0$
2. Significance Level
$\alpha$ = 5% / 1% or 0.05 / 0.01. If the significance level is not given, then we take 5% by default.
3. Critical Region (C.R)
For a two-tailed test ($\frac{1}{2} - \frac{\alpha}{2} \to Z_{\alpha/2}$): if $H_1: \rho \neq \rho_0$, C.R $= Z \le -Z_{\alpha/2}$ or $Z \ge Z_{\alpha/2}$
For a one-tailed test ($0.5 - \alpha \to Z_{\alpha}$): if $H_1: \rho > \rho_0$, C.R $= Z \ge Z_{\alpha}$; if $H_1: \rho < \rho_0$, C.R $= Z \le -Z_{\alpha}$
4. Test Statistic
$Z = \frac{Z_f - z_\rho}{\frac{1}{\sqrt{n-3}}}$
where $Z_f = 1.1513\log_{10}\frac{1+r}{1-r}$ and $z_\rho = 1.1513\log_{10}\frac{1+\rho_0}{1-\rho_0}$
5. Conclusion
If z-cal is greater than or equal to z-tab, then $H_0$ is rejected; otherwise $H_0$ is accepted.
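A sketch of the Fisher transformation and the resulting statistic (the values are illustrative; $1.1513\log_{10}$ is the base-10 form of $\frac{1}{2}\ln$):

```python
import math

def fisher_z(r):
    """Z_f = 1.1513 * log10((1 + r) / (1 - r))."""
    return 1.1513 * math.log10((1 + r) / (1 - r))

def corr_z(r, rho0, n):
    """Z = (Z_f - z_rho) / (1 / sqrt(n - 3)) for H0: rho = rho0."""
    return (fisher_z(r) - fisher_z(rho0)) / (1 / math.sqrt(n - 3))

# r = 0.5 gives Z_f of about 0.5493; with rho0 = 0 and n = 28,
# Z is simply sqrt(25) = 5 times Z_f.
print(round(fisher_z(0.5), 4))   # 0.5493
```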
Testing of Hypotheses concerning the difference between two
Population Correlation Coefficients ($r_1 - r_2$) (Z-Test)
1. Formulation of Hypotheses
Null $H_0: \rho_1 - \rho_2 = \Delta_0$    Alternative $H_1: \rho_1 - \rho_2 \neq \Delta_0$, $\rho_1 - \rho_2 > \Delta_0$, or $\rho_1 - \rho_2 < \Delta_0$
2. Significance Level
$\alpha$ = 5% / 1% or 0.05 / 0.01. If the significance level is not given, then we take 5% by default.
3. Critical Region (C.R)
For a two-tailed test ($\frac{1}{2} - \frac{\alpha}{2} \to Z_{\alpha/2}$): if $H_1: \rho_1 - \rho_2 \neq \Delta_0$, C.R $= Z \le -Z_{\alpha/2}$ or $Z \ge Z_{\alpha/2}$
For a one-tailed test ($0.5 - \alpha \to Z_{\alpha}$): if $H_1: \rho_1 - \rho_2 > \Delta_0$, C.R $= Z \ge Z_{\alpha}$; if $H_1: \rho_1 - \rho_2 < \Delta_0$, C.R $= Z \le -Z_{\alpha}$
4. Test Statistic
$Z = \frac{(Z_{f1} - Z_{f2}) - \Delta_0}{\sqrt{\frac{1}{n_1-3} + \frac{1}{n_2-3}}}$
where $Z_{f1} = 1.1513\log_{10}\frac{1+r_1}{1-r_1}$, $Z_{f2} = 1.1513\log_{10}\frac{1+r_2}{1-r_2}$, and $\Delta_0 = z_{\rho_1} - z_{\rho_2}$ ($\Delta_0 = 0$ under $H_0: \rho_1 = \rho_2$)
5. Conclusion
If z-cal is greater than or equal to z-tab, then $H_0$ is rejected; otherwise $H_0$ is accepted.
Testing of Hypotheses concerning the difference between two
Population Correlation Coefficients ($r_2 - r_1$) (Z-Test)
1. Formulation of Hypotheses
Null $H_0: \rho_2 - \rho_1 = \Delta_0$    Alternative $H_1: \rho_2 - \rho_1 \neq \Delta_0$, $\rho_2 - \rho_1 > \Delta_0$, or $\rho_2 - \rho_1 < \Delta_0$
2. Significance Level
$\alpha$ = 5% / 1% or 0.05 / 0.01. If the significance level is not given, then we take 5% by default.
3. Critical Region (C.R)
For a two-tailed test ($\frac{1}{2} - \frac{\alpha}{2} \to Z_{\alpha/2}$): if $H_1: \rho_2 - \rho_1 \neq \Delta_0$, C.R $= Z \le -Z_{\alpha/2}$ or $Z \ge Z_{\alpha/2}$
For a one-tailed test ($0.5 - \alpha \to Z_{\alpha}$): if $H_1: \rho_2 - \rho_1 > \Delta_0$, C.R $= Z \ge Z_{\alpha}$; if $H_1: \rho_2 - \rho_1 < \Delta_0$, C.R $= Z \le -Z_{\alpha}$
4. Test Statistic
$Z = \frac{(Z_{f2} - Z_{f1}) - \Delta_0}{\sqrt{\frac{1}{n_1-3} + \frac{1}{n_2-3}}}$
where $Z_{f1} = 1.1513\log_{10}\frac{1+r_1}{1-r_1}$, $Z_{f2} = 1.1513\log_{10}\frac{1+r_2}{1-r_2}$, and $\Delta_0 = z_{\rho_2} - z_{\rho_1}$
5. Conclusion
If z-cal is greater than or equal to z-tab, then $H_0$ is rejected; otherwise $H_0$ is accepted.
One-Way ANOVA Table
Source of Variation (S.O.V), Degrees of Freedom (d.f), Sum of Squares (S.O.S), Mean Square (M.S) and the F-ratio:
Treatment (between samples): d.f $= k-1$; Treatment S.S $= \sum \frac{T_j^2}{n_j} - C.F$; Treatment M.S $= s_1^2 = \frac{\text{Treatment S.S}}{k-1}$; $F = \frac{s_1^2}{s^2}$
Error (within samples): d.f $= n-k$; E.S.S $=$ T.S.S $-$ Treatment S.S; Error M.S $= s^2 = \frac{\text{E.S.S}}{n-k}$
Total: d.f $= n-1$; T.S.S $= \sum\sum X_{ij}^2 - C.F$
(here $T_j$ is the total of the $j$th treatment, $n_j$ the number of observations in it, and C.F the correction factor $\frac{T_0^2}{n}$, with $T_0$ the grand total)
Conclusion
If F-cal is greater than or equal to F-tab, then $H_0$ is rejected; otherwise $H_0$ is accepted.
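The table's computing formulas can be sketched directly (the three small groups are illustrative):

```python
def one_way_anova(groups):
    """One-way ANOVA F by the computing formulas:
    C.F = T0^2/n, T.S.S = sum(x^2) - C.F,
    Treatment S.S = sum(Tj^2/nj) - C.F, E.S.S = T.S.S - Treatment S.S."""
    data = [x for g in groups for x in g]
    n, k = len(data), len(groups)
    cf = sum(data) ** 2 / n                                # correction factor
    tss = sum(x * x for x in data) - cf                    # total S.S
    trss = sum(sum(g) ** 2 / len(g) for g in groups) - cf  # treatment S.S
    ess = tss - trss                                       # error S.S
    s1_sq = trss / (k - 1)                                 # treatment M.S
    s_sq = ess / (n - k)                                   # error M.S
    return s1_sq / s_sq                                    # F = s1^2 / s^2

print(one_way_anova([[1, 2, 3], [2, 3, 4], [4, 5, 6]]))   # 7.0
```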
Two-Way ANOVA
1. Formulation of Hypotheses
Row: Null $H_0: B_1 = B_2 = B_3 = \ldots = B_r$    Alternative $H_1$: at least two of the row (block) means differ
Column: Null $H_0': A_1 = A_2 = A_3 = \ldots = A_k$    Alternative $H_1'$: at least two of the column (treatment) means differ
2. Significance Level
$\alpha$ = 5% / 1% or 0.05 / 0.01. If the significance level is not given, then we take 5% by default.
3. Critical Region (C.R)
Row: C.R $= F \ge F(\alpha, V_1, V_2)$ with $V_1 = r-1$, $V_2 = (r-1)(k-1)$, where $V_1$ and $V_2$ are the degrees of freedom
Column: C.R $= F \ge F(\alpha, V_1, V_2)$ with $V_1 = k-1$, $V_2 = (r-1)(k-1)$, where $V_1$ and $V_2$ are the degrees of freedom
4. Test Statistics
The data are arranged with columns $A_1, A_2, A_3, \ldots, A_k$ and rows $B_1, B_2, B_3, \ldots, B_r$; the observation in row $i$ and column $j$ is $X_{ij}$. Each row total is $T_{i.}$, each column total is $T_{.j}$, and the grand total is $T_0 = \sum\sum X_{ij}$.
I. Correction Factor $=$ C.F $= \frac{(T_0)^2}{rk}$
II. Total Sum of Squares $=$ T.S.S $= \sum\sum X_{ij}^2 - $ C.F
III. Column Sum of Squares $=$ C.S.S $= \sum \frac{T_{.j}^2}{r} - $ C.F
IV. Row Sum of Squares $=$ R.S.S $= \sum \frac{T_{i.}^2}{k} - $ C.F
V. Error Sum of Squares $=$ E.S.S $=$ T.S.S $-$ C.S.S $-$ R.S.S
ANOVA Table
Source of Variation (S.O.V), Degrees of Freedom (d.f), Sum of Squares (S.O.S), Mean Square (M.S) and the F-ratio:
Column: d.f $= k-1$; Column S.S $= \sum \frac{T_{.j}^2}{r} - $ C.F; Column M.S $= s_1^2 = \frac{\text{Column S.S}}{k-1}$; $F_{Column} = \frac{s_1^2}{s^2}$
Row: d.f $= r-1$; Row S.S $= \sum \frac{T_{i.}^2}{k} - $ C.F; Row M.S $= s_2^2 = \frac{\text{Row S.S}}{r-1}$; $F_{Row} = \frac{s_2^2}{s^2}$
Error: d.f $= (r-1)(k-1)$; E.S.S $=$ T.S.S $-$ Column S.S $-$ Row S.S; Error M.S $= s^2 = \frac{\text{E.S.S}}{(r-1)(k-1)}$
Total: d.f $= n-1$; T.S.S $= \sum\sum X_{ij}^2 - $ C.F
5. Conclusion
Row: If $F_{Row}$-cal is greater than or equal to F-tab, then $H_0$ is rejected; otherwise $H_0$ is accepted.
Column: If $F_{Column}$-cal is greater than or equal to F-tab, then $H_0'$ is rejected; otherwise $H_0'$ is accepted.