Statistics FULL Assignment
Statistics:
Statistics is the set of procedures by which we collect, organize, classify, analyze, present,
control and finally evaluate data.
Statistics is expressed numerically, but not every mathematical or numerical expression is
statistics.
According to Professor R.A Fisher, “The science of statistics is essentially a branch of applied
mathematics and may be regarded as mathematics applied to observational data.”
1. Definiteness: Numerical expressions are convincing and therefore one of the most
important functions of statistics is to present general statements in a precise and definite
form. Statements or facts conveyed in exact quantitative terms are always more
convincing than vague utterances. Statistics presents facts in a precise and definite form and
thus helps proper comprehension of what is stated.
2. Condensation: Not only does statistics present facts in a definite form, but it also helps in
condensing a mass of data into a few significant figures. In a way, statistical methods
present meaningful overall information from the mass of data.
3. Comparison: Unless figures are compared with others of the same kind, they are often
devoid of any meaning. For example, the statement that the production of Maruti Udyog Ltd.
has increased considerably is not meaningful unless some comparison of figures is
made. But the statement that production has increased from 200 cars a day in Sept. 2000 to
more than 2000 cars a day in Jan. 2015 definitely indicates the increasing trend in
production.
4. Formulating and testing hypotheses: Statistical methods are extremely useful in
formulating and testing hypotheses and in developing new theories. For example, hypotheses
like whether Chloromycetin is effective in preventing typhoid, whether the credit squeeze
is effective in checking price increase, whether students have benefited from the extra
coaching etc. can be tested by appropriate statistical tools.
5. Prediction: Statistics is not only concerned with the above functions, but it also predicts
the future course of action of the phenomena. We can make future policies on the basis of
estimates made with the help of Statistics. We can predict the demand for goods in 2005
if we know the population in 2004 on the basis of growth rate of population in past.
Similarly a businessman can exploit the market situation in a successful manner if he
knows about the trends in the market. The statistics help in shaping future policies.
6. Formulation of policies: With the help of statistics we can frame favourable policies. How
much food is required to be imported in 2007? It depends on the food production in 2007
and the demand for food in 2007. Without knowing these factors we cannot estimate the
amount of imports. On the basis of forecasts the government forms its policies about food
grains, housing etc. But if the forecasting is not correct, the whole set-up will be
affected.
Scope of Statistics
The scope of statistics is so vast and ever-increasing that it is not only difficult to define but also
unwise to do so. The use of statistics has permeated almost every facet of our lives. There is hardly
any field, whether it be trade, industry or commerce, economics, biology, botany, astronomy,
physics, chemistry, education, medicine, sociology, psychology or technology, where statistical
tools are not applicable. In fact the greatest victory of mankind in the 20th century, that of landing
Apollo 11 on the moon, would not have been a success in the absence of statistical help. Let's
examine a few fields in which statistics is applied.
Statistics are so significant to the state that the govt. in most countries is the biggest collector and
user of statistical data.
The main focus of this text is to discuss various statistical techniques that are indispensable in
analyzing and solving business problems.
Statistical data and statistical methods are of immense help in the proper understanding of
economic problems and in the formation of economic policies. Statistical methods help not only
in formulating appropriate economic policies but also in evaluating their effect.
The physical sciences, especially geology and physics, were among the fields in which statistical
methods were first developed and applied. Currently the physical sciences seem to be making use
of statistics.
Statistical techniques have proved to be extremely useful in the study of all natural sciences like
biology, medicine, botany etc.
Statistics is indispensable in research work. Most of the advancement in knowledge has taken
place because of experiments conducted with the help of statistical methods.
We have discussed above the significance of statistics in some important fields. Besides these,
statistics is useful to various institutions and professionals such as bankers, brokers, insurance companies,
auditors, social workers etc.
Nature of Statistical Data:
Collection of data
Data constitute the foundation of statistical analysis and interpretation. Hence the first step in
statistical work is to obtain data. Data can be obtained from three important sources, namely: (1)
Secondary source, (2) Primary source, (3) Internal records. Depending on the source, we can
have either secondary data or internal data or primary data.
1. Secondary Data:
As in all scientific pursuits, in statistics the investigator need not begin from the very
beginning; he may use, and must take into account, what has already been discovered by
others. Consequently, before starting a statistical investigation we must read the existing
literature and learn what is already known of the general area in which our specific
problem falls. When an investigator uses data which have already been collected by
others, such data are called secondary data. Secondary data can be obtained from
journals, reports, government publication, publication of research organizations, trade
and professional bodies, etc. However, secondary data must be used with utmost care.
The user should be extra-cautious in using secondary data and he should not accept it at
its face value. The reason is that such data can be full of errors because of bias,
inadequate size of the sample, substitution, errors of definition, arithmetical errors, etc.
Even if there is no error, secondary data may not be suitable and adequate for the purpose
of the inquiry. Hence before using secondary data the investigator should examine
following aspects:
(1) Whether the data are suitable for the purpose of investigation. Before using secondary data
the investigator must ensure that the data are suitable for the purpose of the inquiry. The
suitability of the data can be judged in the light of the nature and scope of the
investigation.
(2) Whether the data are adequate for the purpose of investigation. If it is found that the data are
suitable for the purpose of investigation, they should be tested for adequacy. Adequacy of the
data is to be judged in the light of the requirements of the survey and the geographical area
covered by the available data.
(3) Whether the data are reliable. To determine the reliability of secondary data is perhaps the
most important and at the same time most difficult job.
2. Primary Data:
Primary data are measurements observed and recorded as part of an original study. When the
data required for a particular study can be found neither in the internal records of the
enterprise nor in published sources, it may become necessary to collect original data, that is, to
conduct a first-hand investigation. The work of collecting original data is usually limited by
the time, money and manpower available for the study. When the data to be collected are very
large in volume, it is possible to draw reasonably accurate conclusions from the study of
a small portion of the group, called a sample. The actual procedures used in collecting data
are essentially the same whether all the items are to be included or only some items are
considered.
There are two basic methods of obtaining primary data: (1) questioning and (2)
observation.
3. Internal data:
Internal data refers to the measurements that are the by-product of routine business record
keeping like accounting, finance, production, personnel, quality control, sales, research
and development, etc.
In the statistical analysis of many business problems one may be able to use internal data
that emerge in the process of keeping records: employee earnings from a payroll, sales
amounts from a sales journal, the amount of raw materials, direct labor and
manufacturing expenses used and the units of finished product produced from production
records, and cash receipts from the cash book. Thus the chief source of internal data is
the accounting records kept in most business firms.
Since internal data originate within the business, collecting the desired information does
not usually offer much difficulty. The particular procedure depends largely upon the
nature of the facts being collected and the form in which they exist. The problem of collection
is primarily that of having the proper record made at the time the information is secured.
The information wanted is frequently to be found in more than one department of the
business, which increases the difficulty of getting just the information one wants.
After collection and editing of data an important step towards processing the data is
classification. Classification is the grouping of related facts into different classes. Facts in one
class differ from those of another class with respect to some characteristics called a basis of
classification.
Types of classification
Frequency Distribution:
The word 'distribution' refers to the way in which the observations are distributed in different
classes. The process of preparing this type of distribution is very simple. We have just to count
the number of times a particular value is repeated which is called the frequency of that class. It is
obtained when the observations of a continuous variable are distributed among different classes
of the variable. Actually, it is a table that divides the whole data into a small number of classes.
This method of classifying helps in condensing the data only where values are largely repeated,
otherwise there will be hardly any condensation. In order to make the series more compact so
that its characteristics can be easily studied, data may be classified according to class-intervals.
1. Class limits: The class limits are the lowest and the highest values that can be included in the
class. For example, take the class 20-40. The lowest value of this class is 20 and the highest 40.
2. Class-intervals: The span of a class, that is, the difference between the upper limit and the
lower limit, is known as class-interval. For example in the class 20-40, the class interval is 20 (40
minus 20). The size of the class-interval is determined by the number of classes and the total
range in the data.
3. Class frequency: The number of observations corresponding to the particular class is known
as the frequency of that class or the class frequency.
4. Class mid-point: It is the value lying half-way between the lower and upper class limits of a
class-interval. The mid-point of a class is ascertained as follows:
Mid-point of a class = (Upper limit of the class + Lower limit of the class) / 2
There are two methods of classifying the data according to class-interval, namely (a) 'exclusive'
method, and (b) 'inclusive' method.
(a) 'Exclusive' method: When the class intervals are so fixed that the upper limit of one class is the
lower limit of the next class, it is known as the 'exclusive' method of classification.
(b) 'Inclusive' method: Under the 'inclusive' method of classification, the upper limit of one class is
included in that class itself.
It should be noted that both the inclusive and exclusive method give us the same class frequencies,
although the class-intervals are apparently different in the two cases.
Sturges suggested the following formula for determining the approximate number of classes:
$k = 1 + 3.322 \log N$
where N = total number of observations and log = the ordinary logarithm to base 10.
Illustration-1 (P-43):
Number of classes, by Sturges' formula:
$k = 1 + 3.322 \log N$
$= 1 + 3.322 \log 30$  [N = 30]
$= 1 + (3.322 \times 1.4771)$
$= 1 + 4.9069$
$= 5.9069 \approx 6$
Width of a class:
$i = \dfrac{\text{Range}}{\text{Number of classes}}$, where Range = 69 - 16 = 53
$= \dfrac{53}{6} = 8.83 \approx 9$
Since class widths like 3, 7, 9 etc. should be avoided, we take 10 as the class interval and
the first class as (15-25).
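The arithmetic of this illustration can be checked with a short Python sketch. The function names are hypothetical and chosen only for illustration; the inputs (N = 30, smallest value 16, largest value 69) are the ones used above.

```python
import math

def sturges_classes(n):
    """Approximate number of classes by Sturges' rule: k = 1 + 3.322 * log10(n)."""
    return 1 + 3.322 * math.log10(n)

def class_width(lowest, highest, k):
    """Class width = range / number of classes."""
    return (highest - lowest) / k

k = round(sturges_classes(30))     # 5.9069... is rounded to 6 classes
width = class_width(16, 69, k)     # 53 / 6, then rounded up to a convenient width by hand
```

The final choice of 10 as the class interval is a judgment about convenient class widths, so it is left outside the code.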
Frequency distribution of the profits
(Exclusive method)
Total N = 30
Illustration: 28
Hence, the company would need to pay 42000 tk by way of bonus, and the average bonus per
worker $= \dfrac{42000}{25} = 1680$ tk.
Ans.
In statistics, a central tendency (or measure of central tendency) is a central or typical value for a
probability distribution.[1] It may also be called a center or location of the distribution.
Colloquially, measures of central tendency are often called averages. The term central tendency
dates from the late 1920s.[2]
The most common measures of central tendency are the arithmetic mean, the median and the
mode. A central tendency can be calculated for either a finite set of values or for a theoretical
distribution, such as the normal distribution. Occasionally authors use central tendency to denote
"the tendency of quantitative data to cluster around some central value."[2][3]
The central tendency of a distribution is typically contrasted with its dispersion or variability;
dispersion and central tendency are the often characterized properties of distributions. Analysts
may judge whether data has a strong or a weak central tendency based on its dispersion.
Measures
The following may be applied to one-dimensional data. Depending on the circumstances, it may
be appropriate to transform the data before calculating a central tendency. Examples are squaring
the values or taking logarithms. Whether a transformation is appropriate and what it should be,
depend heavily on the data being analyzed.
Any of the above may be applied to each dimension of multi-dimensional data, but the results
may not be invariant to rotations of the multi-dimensional space. In addition, there are the
Geometric median
Which minimizes the sum of distances to the data points. This is the same as the median
when applied to one-dimensional data, but it is not the same as taking the median of each
dimension independently. It is not invariant to different rescaling of the different
dimensions.
Quadratic mean (often known as the root mean square)
useful in engineering, but not often used in statistics. This is because it is not a good
indicator of the center of the distribution when the distribution includes negative values.
Simplicial depth
the probability that a randomly chosen simplex with vertices from the given distribution
will contain the given center
Tukey median
A point with the property that every halfspace containing it also contains many sample
points
Illustration 33. A machine was purchased for Rs. 10 lakhs in 2000. Depreciation on the
diminishing balance was charged at 40% in the first year, 25% in the second year and 10% per
annum during the next three years. What is the average depreciation charge during the whole
period?
Solution: Since we are interested in finding the average rate of depreciation, the geometric mean
is the most appropriate average. The cost of the machine can be ignored, as it is immaterial to
the rate calculation.
DETERMINING AVERAGE RATE OF DEPRECIATION
Year    Diminishing value X (for a value of 100 tk)    log X
1       60                                             1.7782
2       75                                             1.8751
3       90                                             1.9542
4       90                                             1.9542
5       90                                             1.9542
                                                       Σ log X = 9.5159
$GM = \text{Antilog}\left(\dfrac{\sum \log X}{N}\right)$
$= \text{Antilog}\left(\dfrac{9.5159}{5}\right)$
$= \text{Antilog}(1.9032)$
$= 80$
Hence, the average depreciation charge during the whole period will be (100 - 80)% = 20%.
Answer: 20%
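As a check on the antilog arithmetic, here is a minimal Python sketch. It assumes the diminishing values per 100 tk implied by the stated rates (60 after a 40% charge, 75 after a 25% charge, and 90 for each of the three years at 10%); the helper name `geometric_mean` is illustrative.

```python
import math

def geometric_mean(values):
    """Geometric mean computed through logarithms, as in the antilog method."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Value left per 100 tk at the end of each year (assumed from the stated rates).
remaining = [60, 75, 90, 90, 90]

gm = geometric_mean(remaining)      # about 80
avg_depreciation = 100 - gm         # about 20 (per cent)
```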
20. Three groups of observations contain 8, 7 and 5 observations. Their geometric means are 8.52,
10.12 and 7.75 respectively. Find the geometric mean of the 20 observations in the single group
formed by pooling the three groups.
N1 = 8    G1 = 8.52
N2 = 7    G2 = 10.12
N3 = 5    G3 = 7.75
We know,
$\log G = \dfrac{N_1 \log G_1 + N_2 \log G_2 + N_3 \log G_3}{N_1 + N_2 + N_3}$
$= \dfrac{8 \log 8.52 + 7 \log 10.12 + 5 \log 7.75}{20}$
$= \dfrac{7.4432 + 7.0364 + 4.4465}{20}$
$= \dfrac{18.9261}{20}$
$= 0.9463$
$G = \text{antilog}(0.9463) = 8.84$
Hence, the combined geometric mean of the 20 observations taken together is 8.84 (ans.)
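The pooled geometric mean is a weighted average of the logarithms of the group means, which the following Python sketch reproduces. The function name is hypothetical; the group sizes and group means are those given in the problem.

```python
import math

def combined_geometric_mean(sizes, gms):
    """Pool group geometric means: log G = sum(N_i * log G_i) / sum(N_i)."""
    total = sum(sizes)
    log_g = sum(n * math.log10(g) for n, g in zip(sizes, gms)) / total
    return 10 ** log_g

g = combined_geometric_mean([8, 7, 5], [8.52, 10.12, 7.75])   # about 8.84
```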
30. A recent college graduate was hired by a large manufacturing corporation and placed in their
management training program. As part of her training, she was assigned to five different
departments of the corporation for various periods of time. At the end of the training period in
each department, the supervisor graded her performance on a scale from zero to ten. At the end
of the training program, the training director computed an overall mean score based on the
following considerations:
The marketing and production phases of her training were assumed to be of equal importance. Both
of these were considered to be three times as important as the purchasing and financial phases.
The accounting phase was twice as important as the latter two. If the supervisors' ratings were as
follows, compute an appropriate mean score.
Department    Score
Finance       4
Marketing     7
Production    8
Purchasing    6
Accounting    9
Solution:
Since the phases differ in importance, the weighted arithmetic mean is the appropriate average.
Taking the purchasing and finance phases as weight 1 each, marketing and production get weight 3
each and accounting gets weight 2.
Department    Score (X)    Weight (W)    WX
Finance       4            1             4
Marketing     7            3             21
Production    8            3             24
Purchasing    6            1             6
Accounting    9            2             18
Total                      ΣW = 10       ΣWX = 73
$\bar{X}_w = \dfrac{\sum WX}{\sum W} = \dfrac{73}{10} = 7.3$
(Ans.)
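A weighted mean for this problem can be sketched in Python as follows. The weights are an assumption read off the problem statement (purchasing and finance 1 each, marketing and production 3 each, accounting 2), and the function name is illustrative.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(W * X) / sum(W)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

scores = {"Finance": 4, "Marketing": 7, "Production": 8, "Purchasing": 6, "Accounting": 9}
# Assumed weights: marketing = production = 3 x (purchasing, finance); accounting = 2 x them.
weights = {"Finance": 1, "Marketing": 3, "Production": 3, "Purchasing": 1, "Accounting": 2}

departments = list(scores)
mean_score = weighted_mean([scores[d] for d in departments],
                           [weights[d] for d in departments])   # 73 / 10
```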
The word dispersion has a technical meaning in statistics. The average measures the center of the
data, and it is one aspect of observation. Another feature of the observation is how the
observations are spread about the center. The observations may be close to the center or they
may be spread away from the center. If the observations are close to the center (usually the
arithmetic mean or median), we say that dispersion, scatter or variation is small. If the
observations are spread away from the center, we say dispersion is large.
Suppose we have three groups of students who have obtained the following marks on a test. The
arithmetic means of the three groups are also given below:
In group A the observations are concentrated around the center. All students in group A
have almost the same level of performance. We say that there is consistency in the observations
in group A. In group B the mean is 50 but the observations are not close to the center. One
observation is as small as 30 and one observation is as large as 70. Thus there is greater
dispersion in group B. In group C the mean is 60 but the spread of the observations with respect
to the center 60 is the same as the spread of the observations in group B with respect to their own
center, which is 50. Thus in groups B and C the means are different but their dispersion is the
same. In groups A and C the means are different and their dispersions are also different.
Dispersion is an important feature of observation and it is measured with the help of the
measures of dispersion, scatter or variation. The word variability is also used for the idea of
dispersion.
The study of dispersion is very important in statistical data. If in a certain factory there is
consistency in the wages of workers, the workers will be satisfied. But if some workers have
high wages and some have low wages, there will be unrest among the low paid workers and they
might go on strike and arrange demonstrations. If in a certain country some people are very poor
and some are very rich, we say there is economic disparity. This means that dispersion is large.
The idea of dispersion is important in the study of workers' wages, price of commodities,
standards of living of different people, distribution of wealth, distribution of land among farmers,
and many other fields of life. Some brief definitions of dispersion are:
1. The degree to which numerical data tend to spread about an average value is called the
dispersion or variation of the data.
2. Dispersion or variation may be defined as a statistic signifying the extent of the
scatteredness of items around a measure of central tendency.
3. Dispersion or variation is the measurement of the size of the scatter of items in a series
about the average.
Properties
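Property 1: Prove that the standard deviation is independent of change of origin but not of
scale.
Proof: Let the n values of the variable x be x1, x2, ..., xn. Changing the origin by
subtracting A and changing the scale by dividing by c, we get
$\dfrac{x_1 - A}{c}, \dfrac{x_2 - A}{c}, \ldots, \dfrac{x_n - A}{c}$
Let $d_i = \dfrac{x_i - A}{c}$
or, $x_i - A = c\,d_i$
or, $x_i = A + c\,d_i$ ………… (i)
or, $\sum x_i = nA + c \sum d_i$
or, $\dfrac{\sum x_i}{n} = \dfrac{nA}{n} + \dfrac{c \sum d_i}{n}$
or, $\bar{x} = A + c\,\bar{d}$ ………… (ii)
We know, $\sigma_x = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n}}$ ………… (iii)
Now, putting the values from equations (i) and (ii) into equation (iii), we get
$\sigma_x = \sqrt{\dfrac{\sum (A + c\,d_i - A - c\,\bar{d})^2}{n}}$
$= \sqrt{\dfrac{\sum (c\,d_i - c\,\bar{d})^2}{n}}$
$= \sqrt{\dfrac{c^2 \sum (d_i - \bar{d})^2}{n}}$
$= |c|\,\sigma_d$
∴ $\sigma_x = |c|\,\sigma_d$
The result contains c but not A. Hence, the standard deviation is independent of change of origin but not of scale. (Proved)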
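A quick numerical check of this property, as a sketch only: the data values, the origin A and the scale c below are invented for illustration and do not come from the text. The helper `pop_sd` is a hypothetical name for the population standard deviation (divisor n), the definition used in the proof.

```python
import math

def pop_sd(xs):
    """Population standard deviation (divisor n), as used in the proof."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))

x = [12, 15, 20, 22, 31]              # illustrative data
A, c = 20, -5                         # hypothetical origin and (negative) scale

d = [(xi - A) / c for xi in x]        # change of origin and scale
shifted = [xi - A for xi in x]        # change of origin only

same_after_shift = math.isclose(pop_sd(shifted), pop_sd(x))          # origin has no effect
scaled_by_abs_c = math.isclose(pop_sd(x), abs(c) * pop_sd(d))        # sigma_x = |c| * sigma_d
```

A negative c is used on purpose, to show why the absolute value appears in the result.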
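Property 2: Prove that the coefficient of variation (C.V.) depends on both origin and scale.
Proof: Let the n values of the variable x be x1, x2, ..., xn and their mean be $\bar{x} = \dfrac{\sum x_i}{n}$.
Standard deviation: $\sigma_x = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n}}$
Coefficient of variation: $C.V._x = \dfrac{\sigma_x}{\bar{x}} \times 100$
Again let $d_i = \dfrac{x_i - A}{c}$
$x_i - A = c\,d_i$
$x_i = A + c\,d_i$
$\sum x_i = nA + c \sum d_i$
$\dfrac{\sum x_i}{n} = \dfrac{nA}{n} + \dfrac{c \sum d_i}{n}$
$\bar{x} = A + c\,\bar{d}$
∴ $\sigma_x = |c|\,\sigma_d$
$C.V._x = \dfrac{|c|\,\sigma_d}{A + c\,\bar{d}} \times 100$
Since this expression contains both A and c, the coefficient of variation depends on both origin and scale. (Proved)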
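Proof: Let the n values of the variable x be x1, x2, ..., xn and their mean be $\bar{x} = \dfrac{\sum x_i}{n}$.
Standard deviation: $\sigma_x = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n}}$
Coefficient of variation: $C.V._x = \dfrac{\sigma_x}{\bar{x}} \times 100$
Again let $d_i = \dfrac{x_i - A}{c}$, with c > 0
$x_i - A = c\,d_i$
$x_i = A + c\,d_i$
$\sum x_i = nA + c \sum d_i$
$\dfrac{\sum x_i}{n} = \dfrac{nA}{n} + \dfrac{c \sum d_i}{n}$
$\bar{x} = A + c\,\bar{d}$
$\bar{x} - A = c\,\bar{d}$
$\bar{d} = \dfrac{\bar{x} - A}{c}$
∴ The coefficient of variation of the variable d is
$C.V._d = \dfrac{\sigma_d}{\bar{d}} \times 100$
$= \dfrac{\sqrt{\dfrac{\sum (d_i - \bar{d})^2}{n}}}{\dfrac{\bar{x} - A}{c}} \times 100$
$= \dfrac{\sqrt{\dfrac{1}{n} \sum \left(\dfrac{x_i - A}{c} - \dfrac{\bar{x} - A}{c}\right)^2}}{\dfrac{\bar{x} - A}{c}} \times 100$
$= \dfrac{\sqrt{\dfrac{1}{n} \sum \left(\dfrac{x_i - A - \bar{x} + A}{c}\right)^2}}{\dfrac{\bar{x} - A}{c}} \times 100$
$= \dfrac{\sqrt{\dfrac{1}{n} \sum \left(\dfrac{x_i - \bar{x}}{c}\right)^2}}{\dfrac{\bar{x} - A}{c}} \times 100$
$= \dfrac{\sqrt{\dfrac{1}{c^2} \cdot \dfrac{\sum (x_i - \bar{x})^2}{n}}}{\dfrac{\bar{x} - A}{c}} \times 100$
$= \dfrac{\dfrac{1}{c} \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n}}}{\dfrac{\bar{x} - A}{c}} \times 100$
$= \dfrac{\sigma_x}{\bar{x} - A} \times 100$
∴ $C.V._d = \dfrac{\sigma_x}{\bar{x} - A} \times 100$
Hence the coefficient of variation of d depends on the origin A but not on the scale c.
(Proved)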
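Property 4: Prove that the standard deviation (σ) of the first n natural numbers can be
obtained by the following formula:
$\sigma = \sqrt{\dfrac{n^2 - 1}{12}}$
Proof:
We know that the set of the first n natural numbers is {1, 2, 3, ..., n}.
Let $x_i = 1, 2, 3, \ldots, n$
$\sum x_i = 1 + 2 + 3 + \cdots + n = \dfrac{n(n+1)}{2}$
$\sum x_i^2 = 1^2 + 2^2 + 3^2 + \cdots + n^2 = \dfrac{n(n+1)(2n+1)}{6}$
Again, we know that
$\sigma^2 = \dfrac{\sum x_i^2}{n} - \left(\dfrac{\sum x_i}{n}\right)^2$
$= \dfrac{n(n+1)(2n+1)}{6n} - \left\{\dfrac{n(n+1)}{2n}\right\}^2$
$= \dfrac{(n+1)(2n+1)}{6} - \dfrac{(n+1)^2}{4}$
$= \dfrac{n+1}{2} \left\{\dfrac{2n+1}{3} - \dfrac{n+1}{2}\right\}$
$= \dfrac{n+1}{2} \times \dfrac{n-1}{6}$
$= \dfrac{n^2 - 1}{12}$
As $\sigma^2 = \dfrac{n^2 - 1}{12}$,
∴ $\sigma = \sqrt{\dfrac{n^2 - 1}{12}}$
(Proved)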
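The closed form can be compared against a direct computation for a few values of n. This is only a numerical sanity check under the population definition of the standard deviation (divisor n); the function names are illustrative.

```python
import math

def sd_first_n(n):
    """Population standard deviation of 1, 2, ..., n computed from the definition."""
    xs = range(1, n + 1)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / n)

def sd_formula(n):
    """Closed form: sqrt((n^2 - 1) / 12)."""
    return math.sqrt((n * n - 1) / 12)

# Largest absolute gap between the two computations over several n.
max_gap = max(abs(sd_first_n(n) - sd_formula(n)) for n in (1, 2, 5, 10, 100))
```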
Property 5: When does the standard deviation take its minimum (least) value?
Proof: Let the n values of the variable x be x1, x2, ..., xn and their mean be $\bar{x} = \dfrac{\sum x_i}{n}$.
Standard deviation: $\sigma_x = \sqrt{\dfrac{\sum x_i^2}{n} - \left(\dfrac{\sum x_i}{n}\right)^2}$
Since the standard deviation can never be negative, its minimum value is 0.
If $\sqrt{\dfrac{\sum x_i^2}{n} - \left(\dfrac{\sum x_i}{n}\right)^2} = 0$
then $\dfrac{\sum x_i^2}{n} - \left(\dfrac{\sum x_i}{n}\right)^2 = 0$
$\dfrac{\sum x_i^2}{n} - \bar{x}^2 = 0$
$\dfrac{\sum x_i^2}{n} = \bar{x}^2$
$\sum x_i^2 = n\,\bar{x}^2$
Hence, the standard deviation takes its minimum value, 0, when $\sum x_i^2 = n\,\bar{x}^2$, which happens exactly when all the observations are equal.
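Property 6: Prove that the arithmetic mean of a standard (ideal) variable is zero
(0) and its variance is 1.
Proof: Let the n values of the variable x be x1, x2, ..., xn and their mean be $\bar{x}$.
Standard deviation: $\sigma_x = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n}}$
Variance: $\sigma_x^2 = \dfrac{\sum (x_i - \bar{x})^2}{n}$
∴ The standard variable is $z_i = \dfrac{x_i - \bar{x}}{\sigma}$
$\sigma z_i = x_i - \bar{x}$
$\sigma \sum z_i = \sum (x_i - \bar{x})$
$\dfrac{\sigma \sum z_i}{n} = \dfrac{\sum (x_i - \bar{x})}{n}$
$\sigma \bar{z} = \dfrac{0}{n}$  [∵ $\sum (x_i - \bar{x}) = 0$]
∴ $\bar{z} = 0$
$\sigma_z^2 = \dfrac{\sum (z_i - \bar{z})^2}{n}$
$= \dfrac{1}{n} \sum (z_i - \bar{z})^2$
$= \dfrac{1}{n} \sum \left(\dfrac{x_i - \bar{x}}{\sigma} - 0\right)^2$
$= \dfrac{1}{n} \sum \left(\dfrac{x_i - \bar{x}}{\sigma}\right)^2$
$= \dfrac{1}{n} \times \dfrac{\sum (x_i - \bar{x})^2}{\sigma^2}$
$= \dfrac{1}{\sigma^2} \times \dfrac{\sum (x_i - \bar{x})^2}{n}$
$= \dfrac{1}{\sigma^2} \times \sigma^2$
$= 1$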
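The result can be illustrated numerically by standardizing any data set. The sketch below uses an arbitrary list of eight values chosen only for illustration; `standardize` is a hypothetical helper that applies z = (x - mean) / σ with the population standard deviation.

```python
import math

def standardize(xs):
    """Return z_i = (x_i - mean) / sigma using the population standard deviation."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / sd for x in xs]

z = standardize([318, 322, 325, 312, 324, 315, 308, 319])   # illustrative values
z_mean = sum(z) / len(z)                                     # zero up to rounding error
z_var = sum((zi - z_mean) ** 2 for zi in z) / len(z)         # one up to rounding error
```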
Exercise
1. Exercise 18: Calculate the appropriate measure of dispersion from the following data.
Quartile deviation: $Q.D. = \dfrac{Q_3 - Q_1}{2}$
Coefficient of quartile deviation $= \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$
Calculation Table
Daily wages (in tk)    No. of wage earners (f)    C.F.
Less than 85           14                         14
85-87                  62                         76
88-90                  99                         175
91-93                  18                         193
Over 93                7                          200
N = 200
$Q_1$ = size of the $\dfrac{N}{4}$th observation $= \dfrac{200}{4}$ = 50th observation, which lies in the class 85-87. But
the real class limits are 84.5-87.5.
∴ $Q_1 = L + \dfrac{\frac{N}{4} - p.c.f.}{f} \times i$
$= 84.5 + \dfrac{\frac{200}{4} - 14}{62} \times 3$
$= 84.5 + \dfrac{50 - 14}{62} \times 3$
$= 84.5 + 1.74$
$= 86.24$
$Q_3$ = size of the $\dfrac{3N}{4}$th observation $= \dfrac{3 \times 200}{4}$ = 150th observation, which lies in the class 88-90.
But the real class limits are 87.5-90.5.
∴ $Q_3 = L + \dfrac{\frac{3N}{4} - p.c.f.}{f} \times i$
$= 87.5 + \dfrac{\frac{3 \times 200}{4} - 76}{99} \times 3$
$= 87.5 + \dfrac{150 - 76}{99} \times 3$
$= 87.5 + 2.24$
$= 89.74$
Hence, quartile deviation $(Q.D.) = \dfrac{Q_3 - Q_1}{2}$
$= \dfrac{89.74 - 86.24}{2}$
$= 1.75$
Coefficient of quartile deviation $= \dfrac{Q_3 - Q_1}{Q_3 + Q_1} \times 100$
$= \dfrac{89.74 - 86.24}{89.74 + 86.24} \times 100$
$= \dfrac{3.5}{175.98} \times 100$
$= 0.0199 \times 100$
$= 1.99\%$
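The interpolation used for both quartiles is the same formula with different inputs, which a small Python sketch makes explicit. The function and parameter names are hypothetical; the class boundaries, cumulative frequencies and class frequencies are those of the calculation table.

```python
def grouped_quartile(lower, cum_before, freq, width, position):
    """L + (position - preceding cumulative frequency) / f * i for the quartile class."""
    return lower + (position - cum_before) / freq * width

N = 200
q1 = grouped_quartile(lower=84.5, cum_before=14, freq=62, width=3, position=N / 4)
q3 = grouped_quartile(lower=87.5, cum_before=76, freq=99, width=3, position=3 * N / 4)

quartile_deviation = (q3 - q1) / 2
coefficient_qd = (q3 - q1) / (q3 + q1)      # multiply by 100 for a percentage
```

The quartile classes are identified by hand from the cumulative frequencies, because the first and last classes are open-ended and have no usable boundaries.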
2. Exercise 20: The following table gives the fluctuations in the prices of shares of two
companies A and B. Find out which of them shows greater variability. Comment on the
result.
Company A Company B
Price (X) d= X-A d² Price (X) d= X-A d²
318 -6 36 2542 -3 9
322 -2 4 2522 -23 529
325 1 1 2534 -11 121
312 -12 144 2532 -13 169
324(A) 0 0 2545(A) 0 0
315 -9 81 2530 -15 225
308 -16 256 2556 11 121
319 -5 25 2530 -15 225
∑d= -49 ∑d²=547 ∑d= -69 ∑d²=1399
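Company A:
Mean, $\bar{X}_A = A + \dfrac{\sum d}{N}$
$= 324 + \dfrac{(-49)}{8}$
$= 324 + (-6.125)$
$= 317.88$
Standard deviation, $\sigma_A = \sqrt{\dfrac{\sum d^2}{N} - \left(\dfrac{\sum d}{N}\right)^2}$
$= \sqrt{\dfrac{547}{8} - \left(\dfrac{-49}{8}\right)^2}$
$= \sqrt{68.38 - 37.52}$
$= \sqrt{30.86}$
$= 5.56$
$C.V._A = \dfrac{\sigma_A}{\bar{X}_A} \times 100$
$= \dfrac{5.56}{317.88} \times 100$
$= 1.75\%$
Company B:
Mean, $\bar{X}_B = A + \dfrac{\sum d}{N}$
$= 2545 + \dfrac{(-69)}{8}$
$= 2545 + (-8.63)$
$= 2536.38$
Standard deviation, $\sigma_B = \sqrt{\dfrac{\sum d^2}{N} - \left(\dfrac{\sum d}{N}\right)^2}$
$= \sqrt{\dfrac{1399}{8} - \left(\dfrac{-69}{8}\right)^2}$
$= \sqrt{174.88 - 74.39}$
$= \sqrt{100.49}$
$= 10.02$
$C.V._B = \dfrac{\sigma_B}{\bar{X}_B} \times 100$
$= \dfrac{10.02}{2536.38} \times 100$
$= 0.40\%$
Comment: Company A has the higher coefficient of variation (1.75% against 0.40%), so its share
price shows greater variability; the share price of company B is the more stable of the two.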
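The comparison can be recomputed directly from the two price series in the table, without the assumed-mean shortcut. This is a sketch using the population standard deviation (divisor N), which is the definition used in the solution; the function name is illustrative.

```python
import math

def coefficient_of_variation(xs):
    """C.V. = (population standard deviation / mean) * 100."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return sd / mean * 100

company_a = [318, 322, 325, 312, 324, 315, 308, 319]
company_b = [2542, 2522, 2534, 2532, 2545, 2530, 2556, 2530]

cv_a = coefficient_of_variation(company_a)
cv_b = coefficient_of_variation(company_b)
more_variable = "A" if cv_a > cv_b else "B"
```

Because the two companies trade at very different price levels, the relative measure (C.V.) is the fair basis for comparison rather than the raw standard deviations.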
Chapter: 4 Moments, Skewness, Kurtosis
Skewness:
The measures of central tendency and variation do not reveal the entire story about a frequency
distribution. Two distributions may have the same mean and standard deviation but may differ in
their shape of the distribution. The term ‘skewness’ refers to the lack of symmetry or departure
from symmetry, e.g., when a distribution is not symmetrical it is called a skewed distribution.
1. Symmetrical distribution: The values of mean, median and mode are alike.
Mean = Median = Mode
2. Positively skewed distribution: Mean is greater than the mode and the median lies
somewhere in between mean and mode.
Measures of Skewness
Karl Pearson's coefficient of skewness:
$Sk_p = \dfrac{\text{Mean} - \text{Mode}}{\sigma}$
If the mode is ill-defined, the above formula has to be modified. In such a case the formula is:
$Sk_p = \dfrac{3(\text{Mean} - \text{Median})}{\sigma}$
Interpretation: The value of this coefficient would be zero in a symmetrical distribution. If the
mean is greater than the mode, the coefficient of skewness would be positive, otherwise negative. The
value of this coefficient usually lies between ±1 for a moderately skewed distribution.
Bowley's coefficient of skewness:
$Sk_B = \dfrac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$
Interpretation: The value of this coefficient would be zero in a symmetrical distribution. If the
value is greater than zero, the distribution is positively skewed, otherwise negatively skewed. The
value of this coefficient lies between ±1.
Moments
1. For ungrouped data, the r-th moment of a variable X about any arbitrary point A is given by:
$\mu'_r = \dfrac{\sum (X - A)^r}{N}$
2. For a frequency distribution, the r-th moment of a variable X about any arbitrary point A is given by:
$\mu'_r = \dfrac{\sum f (X - A)^r}{N}$
Moments about the mean, for r = 1, 2, 3, 4, ...:
$\mu_1 = \dfrac{\sum (X - \bar{X})}{N}$, $\mu_2 = \dfrac{\sum (X - \bar{X})^2}{N}$, $\mu_3 = \dfrac{\sum (X - \bar{X})^3}{N}$, $\mu_4 = \dfrac{\sum (X - \bar{X})^4}{N}$, ...
Moments from an arbitrary point:
$\mu'_r = \dfrac{\sum (X - A)^r}{N}$
For r = 1, 2, 3, 4, ...:
$\mu'_1 = \dfrac{\sum (X - A)}{N}$, $\mu'_2 = \dfrac{\sum (X - A)^2}{N}$, $\mu'_3 = \dfrac{\sum (X - A)^3}{N}$, $\mu'_4 = \dfrac{\sum (X - A)^4}{N}$, ...
Skewness is also measured by using moments:
$\beta_1 = \dfrac{\mu_3^2}{\mu_2^3}$ or $\gamma_1 = \sqrt{\beta_1} = \dfrac{\mu_3}{\sigma^3}$
Interpretation:
1. If β1 = 0, the distribution is symmetric.
2. β1 as a measure of skewness has a serious limitation because it cannot tell us about the
direction of skewness, i.e., whether it is positive or negative. This drawback is
removed if we calculate Karl Pearson's γ1, where
$\gamma_1 = \sqrt{\beta_1} = \dfrac{\mu_3}{\mu_2^{3/2}} = \dfrac{\mu_3}{\sigma^3}$
Now the sign of skewness depends upon the sign of μ3. If μ3 is positive we have
positive skewness, and if μ3 is negative we have negative skewness.
With the help of the following relationships, moments about an arbitrary point can be converted
to moments about the mean:
$\mu_1 = 0$
$\mu_2 = \mu'_2 - (\mu'_1)^2$
$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3$
$\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4$
Kurtosis
• In statistics, kurtosis refers to the degree of flatness or peakedness in the region about the
mode of a frequency curve.
• If a curve is more peaked than the normal curve, it is called 'leptokurtic'.
• If it is more flat-topped than the normal curve, it is called 'platykurtic'.
• The normal curve itself is known as 'mesokurtic'.
Measures of Kurtosis
$\beta_2 = \dfrac{\mu_4}{\mu_2^2}$ or $\gamma_2 = \beta_2 - 3$
Interpretation:
If β2 is greater than 3, the curve is more peaked than the normal curve, i.e., leptokurtic.
If β2 is less than 3, the curve is less peaked than the normal curve, i.e., platykurtic.
If β2 is equal to 3, the curve is mesokurtic.
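Illustration 6. The first four central moments of a distribution are 0, 16, -36 and 120. Comment on the
skewness and kurtosis of the distribution.
µ1 = 0,
µ2 = 16,
µ3 = -36,
µ4 = 120.
$\gamma_1 = \dfrac{\mu_3}{\sigma^3}$, where $\sigma = \sqrt{\mu_2} = \sqrt{16} = 4$
$= \dfrac{-36}{(4)^3}$
$= \dfrac{-36}{64}$
$= -0.5625$
The distribution is negatively skewed. (It may be noted that if we calculate β1, its value will be
$\beta_1 = \dfrac{\mu_3^2}{\mu_2^3} = \dfrac{(-36)^2}{16^3} = 0.3164$. Read alone, this positive value would be misleading, since µ3 is
negative and β1 cannot show the direction of skewness.)
$\beta_2 = \dfrac{\mu_4}{\mu_2^2}$
$= \dfrac{120}{16^2}$
$= \dfrac{120}{256}$
$= 0.4688$
Since β2 is less than 3, the distribution is platykurtic.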
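The three coefficients in this illustration follow mechanically from µ2, µ3 and µ4, as the Python sketch below shows. The function name is hypothetical; the formulas are the moment-based measures defined earlier in this chapter.

```python
import math

def shape_coefficients(mu2, mu3, mu4):
    """Return (beta1, gamma1, beta2) from the second, third and fourth central moments."""
    beta1 = mu3 ** 2 / mu2 ** 3
    gamma1 = mu3 / math.sqrt(mu2) ** 3      # keeps the sign of mu3, unlike beta1
    beta2 = mu4 / mu2 ** 2
    return beta1, gamma1, beta2

beta1, gamma1, beta2 = shape_coefficients(16, -36, 120)
skew_direction = "negative" if gamma1 < 0 else "positive or zero"
kurtosis_type = "leptokurtic" if beta2 > 3 else "platykurtic" if beta2 < 3 else "mesokurtic"
```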
Class     Frequency (f)
21-25     5
26-30     15
31-35     28
36-40     42
41-45     15
46-50     12
51-55     3
Calculate mean, standard deviation and coefficient of skewness and comment on the result.
Solution:
Mean: $\bar{x} = A + \dfrac{\sum fd}{N} \times i$
$= 38 + \dfrac{-25}{120} \times 5$
$= 38 + (-1.04)$
$= 36.96$
Standard deviation:
$\sigma = \sqrt{\dfrac{\sum fd^2}{N} - \left(\dfrac{\sum fd}{N}\right)^2} \times i$
$= \sqrt{\dfrac{223}{120} - \left(\dfrac{-25}{120}\right)^2} \times 5$
$= 1.35 \times 5$
$= 6.75$
Coefficient of skewness:
$Sk_p = \dfrac{\bar{x} - \text{Mode}}{\sigma}$
Here, since the highest frequency is 42, the mode lies in the class 36-40, whose real class limits are 35.5-40.5.
∴ $\text{Mode} = L + \dfrac{\Delta_1}{\Delta_1 + \Delta_2} \times i$
$= 35.5 + \dfrac{14}{(42 - 28) + (42 - 15)} \times 5$
$= 35.5 + 1.71$
$= 37.21$
Coefficient of skewness: $Sk_p = \dfrac{\bar{x} - \text{Mode}}{\sigma}$
$= \dfrac{36.96 - 37.21}{6.75}$
$= -0.037$
Comment: The value of mean = 36.96 indicates that, on average, rejects per operator were about 37
in number. The value of standard deviation = 6.75 suggests that the variation in the data from the
central value is approximately 7. Coefficient of skewness = -0.037 indicates that the distribution
is slightly skewed to the left and therefore there is a greater concentration of the rejects per
operator at the upper values than at the lower values of the distribution.
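The whole calculation can be reproduced from the class frequencies. The sketch below assumes class mid-points 23, 28, ..., 53 and real class limits that start 0.5 below each stated lower limit; it also assumes the modal class is not the first or last class. Variable names are illustrative.

```python
import math

freqs = [5, 15, 28, 42, 15, 12, 3]
mids = [23, 28, 33, 38, 43, 48, 53]       # mid-points of 21-25, 26-30, ..., 51-55
width = 5
n = sum(freqs)

mean = sum(f * m for f, m in zip(freqs, mids)) / n
sd = math.sqrt(sum(f * (m - mean) ** 2 for f, m in zip(freqs, mids)) / n)

# Mode by interpolation inside the modal class (assumed to be an interior class).
k = freqs.index(max(freqs))
lower_boundary = mids[k] - width / 2      # 35.5 for the class 36-40
delta1 = freqs[k] - freqs[k - 1]
delta2 = freqs[k] - freqs[k + 1]
mode = lower_boundary + delta1 / (delta1 + delta2) * width

skewness = (mean - mode) / sd             # Karl Pearson's coefficient
```

Computed without intermediate rounding, the standard deviation comes out at about 6.74 rather than 6.75; the difference comes from rounding 1.347 to 1.35 before multiplying by 5.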
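21. From the following data pertaining to profits (Rs. lakhs) for 50 companies, calculate the
moments, β1 and β2:
Profits (Rs. lakhs)    No. of companies (f)
70-90                  8
90-110                 11
110-130                18
130-150                9
150-170                4
Taking A = 120, i = 20 and d = (m - 120)/20, where m is the class mid-point, we get
Σfd = -10, Σfd² = 68, Σfd³ = -34, Σfd⁴ = 212 and N = 50.
$\mu'_1 = \dfrac{\sum fd}{N} \times i$
$= \dfrac{-10}{50} \times 20$
$= -4$
$\mu'_2 = \dfrac{\sum fd^2}{N} \times i^2$
$= \dfrac{68}{50} \times (20)^2$
$= 544$
$\mu'_3 = \dfrac{\sum fd^3}{N} \times i^3$
$= \dfrac{-34}{50} \times (20)^3$
$= -5440$
$\mu'_4 = \dfrac{\sum fd^4}{N} \times i^4$
$= \dfrac{212}{50} \times (20)^4$
$= 678400$
Moments about mean:
$\mu_1 = 0$
$\mu_2 = \mu'_2 - (\mu'_1)^2$
$= 544 - (-4)^2$
$= 528$
$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3$
$= -5440 - 3(544)(-4) + 2(-4)^3$
$= 960$
$\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4$
$= 678400 - 4(-5440)(-4) + 6(544)(-4)^2 - 3(-4)^4$
$= 642816$
$\beta_1 = \dfrac{\mu_3^2}{\mu_2^3}$
$= \dfrac{(960)^2}{(528)^3}$
$= 0.006$
$\beta_2 = \dfrac{\mu_4}{\mu_2^2}$
$= \dfrac{642816}{(528)^2}$
$= 2.31$
(Answer)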
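The conversion from moments about an arbitrary point to moments about the mean is easy to get wrong by hand, so a small Python sketch is given here as a check. It uses the standard conversion relationships and the four raw moments computed in this solution; the function name is hypothetical.

```python
def central_from_raw(m1, m2, m3, m4):
    """Convert the first four moments about an arbitrary point to moments about the mean."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4

mu2, mu3, mu4 = central_from_raw(-4, 544, -5440, 678400)
beta1 = mu3 ** 2 / mu2 ** 3
beta2 = mu4 / mu2 ** 2
```

The same function can be reused for any of the moment exercises in this chapter by passing in the four raw moments.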
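22. A record was kept over a period of 6 months by a sales manager to determine
the average number of calls made per day by his six salesmen. The results are shown below:
Salesmen:                   A    B    C    D    E    F
Average number of calls:    8    10   12   15   7    5
Solution:
X        (X-12)    (X-12)²    (X-12)³    (X-12)⁴
8        -4        16         -64        256
10       -2        4          -8         16
12(A)    0         0          0          0
15       3         9          27         81
7        -5        25         -125       625
5        -7        49         -343       2401
Σ(X-12) = -15, Σ(X-12)² = 103, Σ(X-12)³ = -513, Σ(X-12)⁴ = 3379
$\mu'_1 = \dfrac{\sum (X - A)}{N} = \dfrac{\sum (X - 12)}{6} = \dfrac{-15}{6} = -2.5$
$\mu'_2 = \dfrac{\sum (X - 12)^2}{N} = \dfrac{103}{6} = 17.17$
$\mu'_3 = \dfrac{\sum (X - 12)^3}{N} = \dfrac{-513}{6} = -85.5$
$\mu'_4 = \dfrac{\sum (X - 12)^4}{N} = \dfrac{3379}{6} = 563.17$
Moments about mean:
$\mu_1 = 0$
$\mu_2 = \mu'_2 - (\mu'_1)^2$
$= 17.17 - (-2.5)^2$
$= 10.92$
$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3$
$= -85.5 - 3 \times 17.17 \times (-2.5) + 2 \times (-2.5)^3$
$= -85.5 + 128.775 - 31.25$
$= 12.025$
$\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4$
$= 563.17 - 4 \times (-85.5) \times (-2.5) + 6 \times 17.17 \times (-2.5)^2 - 3 \times (-2.5)^4$
$= 563.17 - 855 + 643.875 - 117.1875$
$= 234.86$
$\beta_1 = \dfrac{\mu_3^2}{\mu_2^3}$
$= (12.025)^2 \div (10.92)^3$
$= 0.11$
$\beta_2 = \dfrac{\mu_4}{\mu_2^2}$
$= (234.86) \div (10.92)^2$
$= 1.97$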
25. The arithmetic mean of a distribution is 5. The second and the third moments about the mean
are 20 and 140 respectively. Find the third moment of the distribution about 10.
Solution:
$\bar{X} = 5$
$\mu_2 = 20$
$\mu_3 = 140$
We know, $\mu'_1 = \bar{X} - A$
$= 5 - 10$
$= -5$
$\mu'_2 = \mu_2 + (\mu'_1)^2$
$= 20 + (-5)^2$
$= 45$
$\mu'_3 = \mu_3 + 3\mu_2\mu'_1 + (\mu'_1)^3$
$= 140 + 3 \times 20 \times (-5) + (-5)^3$
$= -285$ (ANS)
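27. (a) For a distribution, Bowley's coefficient of skewness is -0.48, Q3 = 10.2 and median
= 14.4. What is the quartile coefficient of dispersion?
(b) Karl Pearson's coefficient of skewness of a distribution is +0.4. Its standard deviation is 10
and mean is 40.5
(d) The following information was obtained from the records of a factory relating to wages
Solution:
(a) Q3 = 10.2
Median = 14.4
We know,
$Sk_B = \dfrac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$
or, $-0.48 = \dfrac{10.2 + Q_1 - 2 \times 14.4}{10.2 - Q_1}$
or, $-0.48(10.2 - Q_1) = -18.6 + Q_1$
or, $-4.896 + 18.6 = Q_1 - 0.48 Q_1$
or, $13.704 = 0.52 Q_1$
or, $Q_1 = 26.35$ (Ans)
(b) $\bar{X} = 40.5$, σ = 10
We know,
$Sk_p = \dfrac{3(\bar{X} - \text{Median})}{\sigma}$
or, $0.4 = \dfrac{3(40.5 - \text{Median})}{10}$
or, $4 = 121.5 - 3\,\text{Median}$
or, Median = 39.17
Again, $Sk_p = \dfrac{\bar{X} - \text{Mode}}{\sigma}$
or, $0.4 = \dfrac{40.5 - \text{Mode}}{10}$
or, Mode = 36.5
(c) We know, $Sk_B = \dfrac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$
$= \dfrac{75 + 60 - 2 \times 68}{75 - 60} = -0.067$ (Ans)
(d) $\bar{X} = 275$
Median = 260
σ = 45.8
We know,
$Sk_p = \dfrac{3(\bar{X} - \text{Median})}{\sigma}$
$= \dfrac{3(275 - 260)}{45.8}$
$= 0.98$ (Ans)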
33. The frequency distribution of weekly wages (in Rs.) in a certain factory is as follows:
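Solution:
Calculation of mean:
$$\bar{x} = A + \frac{\sum fd}{N} \times c = 445 + \frac{13}{100} \times 5 = 445.65$$
Calculation of S.D.:
$$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times c = \sqrt{\frac{311}{100} - \left(\frac{13}{100}\right)^2} \times 5 = 8.79$$
Calculation of mode, with Δ1 = 32 − 14 = 18, Δ2 = 32 − 16 = 16 and c = 5:
$$\text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times c = 442.5 + \frac{18}{18 + 16} \times 5 = 445.15$$
$$Sk_p = \frac{\text{Mean} - \text{Mode}}{\sigma} = \frac{445.65 - 445.15}{8.79} = 0.057$$
Interpretation:
Since the coefficient is positive, the distribution is slightly positively skewed: the weekly wages are
concentrated at the lower values, with a longer tail towards the higher wages.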
34. A survey was conducted by a manufacturing company to enquire the maximum price at
which persons would be willing to buy their product. The following table gives the stated prices
(in rupees) by persons:
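$$\text{Mean: } \bar{x} = A + \frac{\sum fd}{N} \times c = 105 + \frac{6}{100} \times 10 = 105.60$$
$$\text{S.D.: } \sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times 10 = 12.635$$
Median = size of the (N/2)th observation = 100/2 = 50th observation, which lies in the class 100-110.
$$\text{Median} = L + \frac{N/2 - c.f.}{f} \times c = 100 + \frac{50 - 40}{18} \times 10 = 105.56$$
$$\therefore Sk_p = \frac{3(\text{Mean} - \text{Median})}{\sigma} = \frac{3(105.60 - 105.56)}{12.635} = 0.009$$
Interpretation:
Since the coefficient is positive but very close to zero, the distribution is almost symmetrical, with only a
very slight concentration of stated prices at the lower values.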
35. The standard deviation of a symmetrical distribution is 3. What must be the value of fourth
moment about the mean in order that the distribution be mesokurtic?
Solution:
We know, μ2 = σ² = 3² = 9
For a mesokurtic distribution β2 = 3. Here,
β2 = μ4 / μ2²
⟹ 3 = μ4 / 9²
⟹ μ4 = 3 × 81 = 243
36. Calculate coefficient of variation and Karl Pearson’s coefficient of Skewness from the data
given below:
No. of Cos: 8 20 50 72 80
Solution:
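With Δ1 = 30 − 12 = 18, Δ2 = 30 − 22 = 8 and c = 10:
$$\therefore \text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times c = 50 + \frac{18}{18 + 8} \times 10 = 56.92$$
$$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{10.997}{56.25} \times 100 = 19.55$$
$$Sk_p = \frac{\text{Mean} - \text{Mode}}{\sigma} = \frac{56.25 - 56.92}{10.997} = -0.06 \quad [\text{Ans.}]$$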
37. Assume that a firm has selected a random sample of 100 from its production line and has
obtained the data shown in the table below:
Solution:
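Calculation of Sk_p
$$\text{Mean: } \bar{X} = A + \frac{\sum fd}{N} \times c = 147 + \frac{-8}{100} \times 5 = 146.6$$
$$\text{S.D.: } \sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times c = \sqrt{\frac{208}{100} - \left(\frac{-8}{100}\right)^2} \times 5 = 7.2$$
By inspection, the mode lies in the class 145-149, whose real limits are 144.5-149.5.
With Δ1 = 28 − 21 = 7 and Δ2 = 9:
$$\therefore \text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times c = 144.5 + \frac{7}{7 + 9} \times 5 = 146.69$$
$$Sk_p = \frac{\text{Mean} - \text{Mode}}{\sigma} = \frac{146.6 - 146.69}{7.2} = -0.0125$$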
38. (a) A moderately skewed distribution has mean and median as 25 and 26 respectively. Then
its mode approximately equals
(b) State whether the following statement is true or false: If a distribution has negative skewness, then
its mean is greater than its mode.
Solution:
(a) Given Mean = 25 and Median = 26
Here, Mode = 3 Median – 2 Mean
= 3(26) – 2(25)
= 28
(b) The statement is false. With negative skewness the mean is less than the mode; when the mean is
greater than the mode, the skewness is positive.
40. From the following data pertaining to the income of 5800 persons. Find Bowley`s coefficient
of skewness.
Solution:
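Calculation
Income             f       c.f.
Below 10000        170     170
10000-20000        630     800
20000-30000        1000    1800
30000-40000        1250    3050
40000-50000        1360    4410
50000-60000        1000    5410
60000 and above    400     5810
N = 5810
Q1 = size of the (N/4)th observation = 5810/4 = 1452.5th observation, which lies in the class 20000-30000.
$$Q_1 = L + \frac{N/4 - c.f.}{f} \times c = 20000 + \frac{1452.5 - 800}{1000} \times 10000 = 26525$$
Q3 = size of the (3N/4)th observation = (3 × 5810)/4 = 4357.5th observation, which lies in the class 40000-50000.
$$Q_3 = L + \frac{3N/4 - c.f.}{f} \times c = 40000 + \frac{4357.5 - 3050}{1360} \times 10000 = 49613.97$$
Median = size of the (N/2)th observation = 5810/2 = 2905th observation, which lies in the class 30000-40000.
$$\therefore \text{Median} = L + \frac{N/2 - c.f.}{f} \times c = 30000 + \frac{2905 - 1800}{1250} \times 10000 = 38840$$
Thus,
$$Sk_B = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1} = \frac{49613.97 + 26525 - 2(38840)}{49613.97 - 26525} = \frac{-1541.03}{23088.97} = -0.067$$
Answer: −0.067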
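The quartile interpolation used for the income data can be sketched in code. The helper name `grouped_quantile` is hypothetical, and the open-ended first class "Below 10000" is assumed to start at 0; that assumption does not affect the result because none of the quartiles falls in that class.

```python
def grouped_quantile(lower_bounds, width, freqs, position):
    """Interpolate the value of the observation at the given cumulative position."""
    cum = 0  # cumulative frequency of the classes already passed
    for lower, f in zip(lower_bounds, freqs):
        if cum + f >= position:
            return lower + (position - cum) / f * width
        cum += f
    raise ValueError("position exceeds total frequency")


# Lower class limits (first class assumed to start at 0) and frequencies.
lower_bounds = [0, 10000, 20000, 30000, 40000, 50000, 60000]
freqs = [170, 630, 1000, 1250, 1360, 1000, 400]
n = sum(freqs)

q1 = grouped_quantile(lower_bounds, 10000, freqs, n / 4)
median = grouped_quantile(lower_bounds, 10000, freqs, n / 2)
q3 = grouped_quantile(lower_bounds, 10000, freqs, 3 * n / 4)

# Bowley's coefficient of skewness
bowley = (q3 + q1 - 2 * median) / (q3 - q1)
print(q1, median, round(q3, 2), round(bowley, 3))
```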
Chapter: 5 Correlation Analysis
The statistical tool with the help of which the relationship between two or more
variables is studied is called correlation.
A very simple definition of correlation is that given by A.M. Tuttle: “An analysis of the
covariation of two or more variables is usually called correlation.”
The study of correlation is of immense use in practical life because of the following reasons :
1.Most of the variables show some kind of relationship between price and supply, income and
expenditure, etc. With the help of correlation analysis we can measure in one figure the degree of
relationship existing between the variables.
2.Once we know the two variables are closely related, we can estimate the value of one variable
given the value of another. This is done with the help of regression analysis.
3. Correlation analysis contributes to the understanding of economic behavior and aids in locating the
critically important variables on which others depend. It may reveal to the economist the connection by
which disturbances spread and suggest to him the paths through which stabilizing forces become
effective.
In business, correlation analysis enables the executive to estimate costs, price and other variables
on the basis of some other series with which these costs ,sales or prices may be functionally
related. Some of the guesswork can be removed from decisions when the relationship between a
variable to be estimated and the one or more other variables on which it depends are close and
reasonably invariant.
4. Progressive development in the methods of science and philosophy has been characterized by
an increase in the knowledge of relationships or correlations. Nature has been found to be a
multiplicity of inter-related forces.
However, it should be noted that the coefficient of correlation is one of the most widely used and
also one of the most widely abused statistical measures. It is abused in the sense that one
sometimes overlooks the fact that correlation measures nothing but the strength of linear
relationships and that it does not necessarily imply a cause-and-effect relationship.
1. Types of correlation
1. Positive Correlation
2. Negative Correlation
3. Partial Correlation
The correlation is partial if we study the relationship between two variables keeping all
other variables constant.
Example:
The Relationship between yield and rainfall at a constant temperature is partial correlation.
4. Linear Correlation
When the change in one variable results in the constant change in the other variable, we
say the correlation is linear. When there is a linear correlation, the points plotted will be in a
straight line
Example:
X: 10 20 30 40 50
Y: 20 40 60 80 100
Here, there is a linear relationship between the variables. There is a ratio 1:2 at all points. Also, if
we plot them they will be in a straight line.
One of the most common and basic techniques for analyzing the relationships between
variables is zero-order correlation. The value of a correlation coefficient can vary from -1 to +1.
A -1 indicates a perfect negative correlation, while a +1 indicates a perfect positive correlation.
A correlation of zero means there is no relationship between the two variables.
7. Spearman's Correlation
$$\rho = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$
When the amount of change in one variable is not in a constant ratio to the change in the
other variable, we say that the correlation is non linear.
Example:
X: 10 20 30 40 50
Y: 10 30 70 90 120
Here there is a non linear relationship between the variables. The ratio between them is not fixed
for all points. Also if we plot them on the graph, the points will not be in a straight line. It will be
a curve.
If there are only two variable under study, the correlation is said to be simple.
Example:
When one variable is related to a number of other variables, the correlation is not simple.
It is multiple if there is one variable on one side and a set of variables on the other side.
Example:
The relationship of yield with both rainfall and fertilizer together is multiple correlation.
The correlation coefficient ranges from −1 to +1. If the linear correlation
coefficient takes values close to 0, the correlation is weak.
The following are the important methods of ascertaining whether two variables are correlated or not. Four methods of
correlation:
Basic formula:
$$r_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$$
When Deviations are taken from an Assumed Mean:
When the actual means are in fractions, say the actual means of the ‘X’ and ‘Y’ series are 20.167 and
29.23, the calculation of the coefficient of correlation by the method discussed above would involve
too many calculations and would take a lot of time. In such cases we make use of an assumed
mean, and the following formula is applicable:
$$r = \frac{N\sum d_x d_y - \sum d_x \sum d_y}{\sqrt{N\sum d_x^2 - \left(\sum d_x\right)^2}\,\sqrt{N\sum d_y^2 - \left(\sum d_y\right)^2}}$$
Where dx refers to deviations of X series from an assumed mean, i.e, (X-A).
It may be noted that this form of formula is same as (iii), only difference being that whereas in
form (iii) we are dealing with original X and Y, in form (iv) we are taking deviations of X and Y
series from assumed mean.
1. There is linear relationship between the variables, i.e., when the two variables are plotted on a
scatter diagram, a straight line will be formed by the points so plotted.
2. The two variable under study are affected by a large number of independent causes so as to
form a normal distribution. Variables like height, weight, price demand, supply etc. are affected
by such forces that a normal distribution is formed.
3. There is a cause and effect relationship between the forces affecting the distribution of the
items in the two series. If such a relationship is not formed between the variables, i.e., if the
variables are independent, there cannot be any correlation. For example, there is no relationship
between income and height because the forces that affect these variables are not common.
-1 ≤ r ≤ +1
r = √ bxy .byx
2. If the value of r is more than six times the probable error, the existence of correlation is
practically certain, i.e. the value of r is significant.
3. By adding and subtracting the value of probable error from the coefficient of correlation we
get respectively the upper and lower limits within which coefficient of correlation in the
population can be expected to lie.
2. The statistical measure for which the P.E is computed must have been calculated from a
sample.
3. The sample must have been selected in an unbiased manner and the individual items must be
independent.
This method of finding out covariability or the lack of it between two variables was developed
by the British psychologist Charles Edward Spearman in 1904. This measure is especially useful
when a quantitative measure of certain factors (such as in the evaluation of leadership ability or
the judgement of female beauty) cannot be fixed, but the individuals in the group can be
arranged in order, thereby obtaining for each individual a number indicating his (her) rank in the
group.
$$R = 1 - \frac{6\sum D^2}{N(N^2 - 1)} \quad \text{or} \quad 1 - \frac{6\sum D^2}{N^3 - N}$$
where R denotes the rank coefficient of correlation and D refers to the difference of ranks between
paired items in the two series.
The value of this coefficient also lies between +1 and −1. When R is +1, there is complete agreement in
the order of the ranks and the ranks are in the same direction. When R is −1, there is complete
disagreement in the order of the ranks and they are in opposite directions. This shall be clear from
the following:
Case 1 (ranks in the same order):
R1   R2   D = (R1−R2)   D²
1    1    0             0
2    2    0             0
3    3    0             0
                ΣD² = 0
$$R = 1 - \frac{6\sum D^2}{N^3 - N} = 1 - \frac{6 \times 0}{3^3 - 3} = 1 - 0 = 1$$
Case 2 (ranks in the opposite order):
R1   R2   D = (R1−R2)   D²
1    3    −2            4
2    2    0             0
3    1    2             4
                ΣD² = 8
$$R = 1 - \frac{6\sum D^2}{N^3 - N} = 1 - \frac{6 \times 8}{3^3 - 3} = 1 - 2 = -1$$
Where actual ranks are given, the steps required for computing rank correlation are:
Take the differences of the two ranks, i.e., (R1−R2), and denote these differences by D.
Square these differences and obtain the total ΣD².
Apply the formula:
$$R = 1 - \frac{6\sum D^2}{N^3 - N}$$
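The steps for given ranks translate directly into code. The sketch below assumes untied ranks (no repeated ranks within a series); the function name `spearman_rank` is illustrative.

```python
def spearman_rank(ranks1, ranks2):
    """Rank correlation R = 1 - 6*sum(D^2) / (N^3 - N) for two lists of untied ranks."""
    n = len(ranks1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks1, ranks2))
    return 1 - 6 * sum_d2 / (n ** 3 - n)


# The two three-item cases: identical order and exactly reversed order.
print(spearman_rank([1, 2, 3], [1, 2, 3]))
print(spearman_rank([1, 2, 3], [3, 2, 1]))
```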
When we are given the actual data and not the ranks, it will be necessary to assigns the ranks.
Ranks can be assigned by taking either the highest value as 1 or the lowest value as 1. But
whether we start with the lowest value or the highest value, we must follow the same method in
case of all variables
Illustration 05 : Find the coefficient of correlation between the age and the sum assured from the
following table :
20-30 4 6 3 7
30-40 2 8 15 7
40-50 3 9 12 6
50-60 8 4 2 0
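$$\therefore r = \frac{N\sum f d_x d_y - \sum f d_x \sum f d_y}{\sqrt{N\sum f d_x^2 - \left(\sum f d_x\right)^2}\,\sqrt{N\sum f d_y^2 - \left(\sum f d_y\right)^2}}$$
$$= \frac{100(-7) - 2013}{\sqrt{(13100 - 1089)(13100 - 3721)}} = \frac{-700 - 2013}{\sqrt{12011 \times 9379}}$$
$$= \frac{-2713}{109.59 \times 96.85} = -0.256$$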
Illustration 06:Calculate the coefficient of correlation from the following bivariate frequency
distribution :
Sales Revenue
75-125 4 1 0 0
125-175 7 6 2 1
175-225 1 3 4 2
225-275 1 1 3 4
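$$\therefore r = \frac{N\sum f d_x d_y - \sum f d_x \sum f d_y}{\sqrt{N\sum f d_x^2 - \left(\sum f d_x\right)^2}\,\sqrt{N\sum f d_y^2 - \left(\sum f d_y\right)^2}}$$
$$= \frac{(40 \times 21) - (10)(-17)}{\sqrt{\{40 \times 50 - (10)^2\}\{40 \times 45 - (-17)^2\}}}$$
$$= \frac{840 + 170}{\sqrt{1900 \times 1511}} = \frac{1010}{1694.373} = 0.596$$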
Illustration 15:
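Correct ΣX = 120 − 8 − 12 + 8 + 10 = 120 − 2 = 118
Correct ΣY = 90 − 10 − 7 + 12 + 8 = 93
$$r = \frac{N\sum XY - \sum X \sum Y}{\sqrt{N\sum X^2 - \left(\sum X\right)^2}\,\sqrt{N\sum Y^2 - \left(\sum Y\right)^2}}$$
$$= \frac{10410 - 10974}{\sqrt{16680 - 13924}\,\sqrt{9270 - 8649}} = \frac{-564}{52.50 \times 24.92} = -0.43$$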
Illustration 17. Family income and its percentage spent on food in the case of one hundred
families gave the following bivariate frequency distribution. Calculate the coefficient correlation
15- 17.5 0 0 0 0 0 20 0 0 0
20 4 9 4 3
f dy2 40 20 0 20 40 ∑f dx2=120
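$$r = \frac{N\sum f d_x d_y - \left(\sum f d_x\right)\left(\sum f d_y\right)}{\sqrt{N\sum f d_x^2 - \left(\sum f d_x\right)^2}\,\sqrt{N\sum f d_y^2 - \left(\sum f d_y\right)^2}}$$
$$= \frac{(100 \times -48) - (0 \times 100)}{\sqrt{100 \times 120 - (0)^2}\,\sqrt{100 \times 200 - (100)^2}}$$
$$= \frac{-4800}{\sqrt{12000}\,\sqrt{10000}} = \frac{-48}{\sqrt{120}\,\sqrt{100}} = -0.438$$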
There seems to be a low degree of negative correlation between family income and its
percentage spent on food expenditure.
Illustration :28
15.Find the correlation by karl pearson’s method between the two kinds of assessment of
postgraduate student performance .
Internal assessment: 45 62 67 32 12 38 47 67 42 85
External assessment: 39 48 65 32 20 35 45 77 30 62
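Taking deviations dx = X − 38 and dy = Y − 45, we get Σdx = 117, Σdy = 3, Σdx² = 5325, Σdy² = 2877 and Σdxdy = 3005.
$$r = \frac{\sum d_x d_y - \dfrac{\sum d_x \sum d_y}{N}}{\sqrt{\left\{\sum d_x^2 - \dfrac{\left(\sum d_x\right)^2}{N}\right\}\left\{\sum d_y^2 - \dfrac{\left(\sum d_y\right)^2}{N}\right\}}}$$
$$= \frac{3005 - \dfrac{117 \times 3}{10}}{\sqrt{\left\{5325 - \dfrac{(117)^2}{10}\right\}\left\{2877 - \dfrac{(3)^2}{10}\right\}}} = \frac{3005 - 35.1}{\sqrt{(5325 - 1368.9)(2877 - 0.9)}}$$
$$= \frac{2969.9}{\sqrt{3956.1 \times 2876.1}} = \frac{2969.9}{3373.15} = 0.88$$
Hence there is a high degree of positive correlation between the two variables, i.e. the internal assessment
and the external assessment: as the internal assessment goes up, the external assessment also goes up.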
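The assessment data can be run through the assumed-mean formula in code. This is a minimal sketch; the function name `pearson_r` and the optional assumed means `ax` and `ay` are illustrative choices, and the values 38 and 45 are simply convenient origins (the result does not depend on them).

```python
from math import sqrt


def pearson_r(xs, ys, ax=0, ay=0):
    """Karl Pearson's r using deviations from assumed means ax and ay."""
    n = len(xs)
    dx = [x - ax for x in xs]
    dy = [y - ay for y in ys]
    s_dxdy = sum(a * b for a, b in zip(dx, dy))
    s_dx, s_dy = sum(dx), sum(dy)
    s_dx2 = sum(a * a for a in dx)
    s_dy2 = sum(b * b for b in dy)
    numerator = n * s_dxdy - s_dx * s_dy
    denominator = sqrt(n * s_dx2 - s_dx ** 2) * sqrt(n * s_dy2 - s_dy ** 2)
    return numerator / denominator


internal = [45, 62, 67, 32, 12, 38, 47, 67, 42, 85]
external = [39, 48, 65, 32, 20, 35, 45, 77, 30, 62]

r = pearson_r(internal, external, ax=38, ay=45)
print(round(r, 2))
```

Calling `pearson_r(internal, external)` with the default origins gives the same value, which illustrates that r is independent of the change of origin.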
18. The following table gives the frequency, according to age groups, of marks obtained by 68
students in a general knowledge test. Measure the degree of relationship between age and general
knowledge:
Age in years
Test mark 21 22 23 24
200-250 4 4 2 1
250-300 3 5 4 2
300-350 2 6 8 5
350-400 1 5 6 10
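Taking dx = X − 22 for age and, as a convenient choice, dy = (m.p. − 275)/50 for the test marks, the table gives
N = 68, Σf dx = 46, Σf dy = 54, Σf dx² = 102, Σf dy² = 120 and Σf dx dy = 66.
$$\therefore r = \frac{N\sum f d_x d_y - \sum f d_x \sum f d_y}{\sqrt{N\sum f d_x^2 - \left(\sum f d_x\right)^2}\,\sqrt{N\sum f d_y^2 - \left(\sum f d_y\right)^2}}$$
$$= \frac{68(66) - (46)(54)}{\sqrt{68(102) - (46)^2}\,\sqrt{68(120) - (54)^2}} = \frac{2004}{\sqrt{4820 \times 5244}} = \frac{2004}{5027.53}$$
= 0.399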
Hence there is a low degree of positive correlation between age and test marks.
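For a bivariate frequency table the same correlation can be computed by weighting each cell by its frequency. The sketch below works with the ages and the class midpoints of the test marks directly (225, 275, 325, 375 are the midpoints of the mark classes); the function name `grouped_pearson` is an illustrative assumption.

```python
from math import sqrt

ages = [21, 22, 23, 24]                  # column values (X)
mark_midpoints = [225, 275, 325, 375]    # midpoints of 200-250, ..., 350-400 (Y)
table = [                                # rows follow mark classes, columns follow ages
    [4, 4, 2, 1],
    [3, 5, 4, 2],
    [2, 6, 8, 5],
    [1, 5, 6, 10],
]


def grouped_pearson(xs, ys, freq):
    """Pearson's r for a two-way frequency table freq[row for y][column for x]."""
    n = sx = sy = sxx = syy = sxy = 0
    for y, row in zip(ys, freq):
        for x, f in zip(xs, row):
            n += f
            sx += f * x
            sy += f * y
            sxx += f * x * x
            syy += f * y * y
            sxy += f * x * y
    numerator = n * sxy - sx * sy
    denominator = sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2)
    return numerator / denominator


r = grouped_pearson(ages, mark_midpoints, table)
print(round(r, 3))
```

Because r is unaffected by a change of origin or scale, using the raw ages and midpoints gives the same coefficient as the step-deviation method.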
Chapter: 6 Regression Analysis
Regression Analysis: Regression analysis is a statistical technique developed to study the
statistical relationship among two or more variables with a view to estimating or predicting the value of the
dependent variable for some known value of the independent variable.
Correlation:
1. The correlation coefficient is a measure of the degree of relationship between X and Y.
2. The correlation coefficient measures the strength of the linear relationship between
two variables; it does not measure cause and effect. Correlation is
merely a tool for ascertaining the degree of relationship between two
variables and therefore we cannot say that one variable is the cause and the
other is the effect.
Regression:
1. The objective of regression analysis is to study the nature of the relationship
between the variables.
2. The cause and effect relation is clearly indicated through regression analysis.
Regression analysis is a branch of statistical theory that is widely used in almost all the scientific
disciplines. In economics it is the basic technique for measuring or estimating the relationship
among economic variables that constitute the essence of economic theory and economic life. The
regression analysis helps in three important ways:
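1. Regression Equation of X on Y:
$$(X - \bar{X}) = b_{xy}(Y - \bar{Y})$$
$$\text{or, } (X - \bar{X}) = \frac{SP(x, y)}{SS(y)}(Y - \bar{Y})$$
$$\text{or, } (X - \bar{X}) = \frac{SP(x, y)/N}{SS(y)/N}(Y - \bar{Y})$$
$$\text{or, } (X - \bar{X}) = \frac{Cov(x, y)}{\sigma_y^2}(Y - \bar{Y})$$
$$\text{or, } (X - \bar{X}) = \frac{r\sigma_x\sigma_y}{\sigma_y^2}(Y - \bar{Y})$$
$$\therefore (X - \bar{X}) = r\frac{\sigma_x}{\sigma_y}(Y - \bar{Y})$$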
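2. Regression Equation of Y on X:
$$(Y - \bar{Y}) = b_{yx}(X - \bar{X})$$
$$\text{or, } (Y - \bar{Y}) = \frac{SP(x, y)}{SS(x)}(X - \bar{X})$$
$$\text{or, } (Y - \bar{Y}) = \frac{SP(x, y)/N}{SS(x)/N}(X - \bar{X})$$
$$\text{or, } (Y - \bar{Y}) = \frac{Cov(x, y)}{\sigma_x^2}(X - \bar{X})$$
$$\text{or, } (Y - \bar{Y}) = \frac{r\sigma_x\sigma_y}{\sigma_x^2}(X - \bar{X})$$
$$\therefore (Y - \bar{Y}) = r\frac{\sigma_y}{\sigma_x}(X - \bar{X})$$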
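3. Prove that $r_{xy} = \sqrt{b_{xy} \times b_{yx}}$
Proof:
$$\sqrt{b_{xy} \times b_{yx}} = \sqrt{\frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (Y - \bar{Y})^2} \times \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}}$$
$$= \sqrt{\frac{\left\{\sum (X - \bar{X})(Y - \bar{Y})\right\}^2}{\sum (Y - \bar{Y})^2 \sum (X - \bar{X})^2}}$$
$$= \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}}$$
$$= r_{xy}$$
$$\therefore r_{xy} = \sqrt{b_{xy} \times b_{yx}} \quad \text{(proved)}$$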
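4. Prove that $\dfrac{b_{xy} + b_{yx}}{2} \ge r_{xy}$
Proof:
We know that,
the regression coefficient of X on Y, $b_{xy} = r\dfrac{\sigma_x}{\sigma_y}$
the regression coefficient of Y on X, $b_{yx} = r\dfrac{\sigma_y}{\sigma_x}$
Again,
$$(\sigma_x - \sigma_y)^2 \ge 0 \quad \text{(since a square can never be negative)}$$
$$\text{or, } \sigma_x^2 - 2\sigma_x\sigma_y + \sigma_y^2 \ge 0$$
$$\text{or, } \sigma_x^2 + \sigma_y^2 \ge 2\sigma_x\sigma_y$$
$$\text{or, } \frac{\sigma_x^2}{\sigma_x\sigma_y} + \frac{\sigma_y^2}{\sigma_x\sigma_y} \ge \frac{2\sigma_x\sigma_y}{\sigma_x\sigma_y}$$
$$\text{or, } \frac{\sigma_x}{\sigma_y} + \frac{\sigma_y}{\sigma_x} \ge 2$$
$$\text{or, } r\frac{\sigma_x}{\sigma_y} + r\frac{\sigma_y}{\sigma_x} \ge 2r \quad \text{(multiplying both sides by } r \text{, taking } r > 0\text{)}$$
$$\text{or, } b_{xy} + b_{yx} \ge 2r$$
$$\therefore \frac{b_{xy} + b_{yx}}{2} \ge r \quad \text{(Proved)}$$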
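5. Regression coefficients are independent of change of origin but not of scale.
Proof:
We know that the regression coefficient of X on Y is
$$b_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (y_i - \bar{y})^2} \quad \ldots (i)$$
Now, let
$$u_i = \frac{x_i - A}{C}$$
$$\text{or, } C u_i = x_i - A$$
$$\text{or, } x_i = A + C u_i \quad \ldots (a)$$
$$\text{or, } \sum x_i = NA + C\sum u_i$$
$$\text{or, } \frac{\sum x_i}{N} = \frac{NA}{N} + C\frac{\sum u_i}{N}$$
$$\therefore \bar{x} = A + C\bar{u} \quad \ldots (b)$$
And,
$$v_i = \frac{y_i - B}{D}$$
$$\text{or, } D v_i = y_i - B$$
$$\text{or, } y_i = B + D v_i \quad \ldots (c)$$
$$\text{or, } \sum y_i = NB + D\sum v_i$$
$$\text{or, } \frac{\sum y_i}{N} = \frac{NB}{N} + D\frac{\sum v_i}{N}$$
$$\therefore \bar{y} = B + D\bar{v} \quad \ldots (d)$$
Now, putting the values from (a), (b), (c) and (d) into (i), we get
$$b_{xy} = \frac{\sum (A + C u_i - A - C\bar{u})(B + D v_i - B - D\bar{v})}{\sum (B + D v_i - B - D\bar{v})^2}$$
$$= \frac{\sum (C u_i - C\bar{u})(D v_i - D\bar{v})}{\sum (D v_i - D\bar{v})^2}$$
$$= \frac{CD\sum (u_i - \bar{u})(v_i - \bar{v})}{D^2\sum (v_i - \bar{v})^2}$$
$$= \frac{C}{D} \times \frac{\sum (u_i - \bar{u})(v_i - \bar{v})}{\sum (v_i - \bar{v})^2}$$
$$= \frac{C}{D}\, b_{uv}$$
The origins A and B have dropped out, while the scale factors C and D remain. Hence, regression
coefficients are independent of change of origin but not of scale.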
EXERCISE
Exercise: 14
We are given,
ΣX= 580
ΣY= 370
ΣXY= 11494
ΣX2= 41658
ΣY2 = 17206
N= 12
Now, Regression equation of Y on X: Y = a+ bx
or, ΣY = Na + bΣx
or, 370 = 12a + 580b
∴12a + 580b = 370…………….(i)
ΣXY = aΣX+bΣX2
or, 11494 = 580a + 41658b
∴ 580a+ 41658b = 11494………..(ii)
Multiplying equation (i) by 580 and equation (ii) by 12, and deducting the first result from the second,
we get,
6960a + 499896b = 137928
6960a + 336400b = 214600
163496b = −76672
or, b = −76672/163496
∴ b = −0.47
Putting the value of ‘b’ in equation (i) we get
12a+ 580 (-.47) =370
or, 12a + (-272.6) = 370
or, 12a – 272.6 = 370
or, 12a = 370+ 272.6
642.6
or, a =
12
∴ a = 53.55
so, the required regression equation of Y on X is
Y = a+ bx
= 53.55+ (-.47)x
= 53.55-.47x
And,
Regression equation of X on Y: x = a+ by
or, Σx = Na+ bΣy
or, 580 = 12a +370b
or, 12a+ 370b = 580……….. (iii)
or, Σxy = aΣy+ bΣy2
or, 11494 = 370a+ 17206b
∴ 370a+17206b = 11494………… (iv)
Multiplying equation (iii) by 370 and equation (iv) by 12, and deducting the first result from the second,
we get,
4440a + 206472b = 137928
4440a + 136900b = 214600
69572b = −76672
or, b = −76672/69572
∴ b = −1.1
Putting the value of ‘b’ in equation (iii) we get,
12a+ 370 (-1.1) = 580
or, 12a- 407 = 580
or, 12a = 580+407
987
or, a =
12
∴ a = 82.25
So, the required regression equation of x on y is
x = a+ by
= 82.25+ (-1.1) y
= 82.25-1.1y
Ans: x = 82.25- 1.1y &
y = 53.55- .47x
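The two pairs of normal equations can also be solved in closed form from the given totals. The following is a sketch under that assumption; the helper name `fit_line` is hypothetical. Because it does not round b before computing a, the intercepts differ slightly from the hand calculation (about 53.50 and 82.31 instead of 53.55 and 82.25).

```python
def fit_line(n, sum_u, sum_v, sum_uv, sum_uu):
    """Least-squares line v = a + b*u obtained from the two normal equations."""
    b = (n * sum_uv - sum_u * sum_v) / (n * sum_uu - sum_u ** 2)
    a = (sum_v - b * sum_u) / n
    return a, b


# Totals given in the exercise.
N, SX, SY, SXY, SXX, SYY = 12, 580, 370, 11494, 41658, 17206

a_yx, b_yx = fit_line(N, SX, SY, SXY, SXX)   # regression of Y on X
a_xy, b_xy = fit_line(N, SY, SX, SXY, SYY)   # regression of X on Y

print(f"Y = {a_yx:.2f} + ({b_yx:.3f})X")
print(f"X = {a_xy:.2f} + ({b_xy:.3f})Y")
```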
Exercise: 20
We are given,
Á = 39.5 σ A = 10.8
B́ = 47.5 σ B = 16.8
rAB = .42
We know,
Regression equation of A on B is as follows:
$$(A - \bar{A}) = r_{AB}\frac{\sigma_A}{\sigma_B}(B - \bar{B})$$
or, A − 39.5 = 0.42 × (10.8/16.8) × (B − 47.5)
or, A − 39.5 = (4.536/16.8)(B − 47.5)
or, A − 39.5 = 0.27(B − 47.5)
or, A − 39.5 = 0.27B − 12.825
or, A = 0.27B − 12.825 + 39.5
∴ A = 26.675 + 0.27B
When B = 55, then A will be
A = 26.675 + (0.27 × 55)
= 26.675 + 14.85
= 41.525
(Ans.)
Exercise: 21
Calculation Table (assumed mean A = 540 for x and A = 10 for y)
No.   Pages (x)   dx = x − A   dx²      Price (Tk.) (y)   dy = y − A   dy²   dxdy
1     700         160          25600    12                2            4     320
2     540         0            0        11                1            1     0
3     210         −330         108900   5                 −5           25    1650
4     625         85           7225     10                0            0     0
5     380         −160         25600    7                 −3           9     480
6     910         370          136900   15                5            25    1850
7     610         70           4900     9                 −1           1     −70
8     420         −120         14400    8                 −2           4     240
9     750         210          44100    12                2            4     420
10    400         −140         19600    9                 −1           1     140
Σx = 5545   Σdx = 145   Σdx² = 387225   Σy = 98   Σdy = −2   Σdy² = 74   Σdxdy = 5030
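Regression coefficient of X on Y:
$$b_{xy} = \frac{N\sum d_x d_y - \sum d_x \sum d_y}{N\sum d_y^2 - \left(\sum d_y\right)^2} = \frac{(10 \times 5030) - \{145 \times (-2)\}}{(10 \times 74) - (-2)^2} = 68.74$$
Regression coefficient of Y on X:
$$b_{yx} = \frac{N\sum d_x d_y - \sum d_x \sum d_y}{N\sum d_x^2 - \left(\sum d_x\right)^2} = \frac{(10 \times 5030) - \{145 \times (-2)\}}{(10 \times 387225) - (145)^2} = 0.013$$
$$\bar{X} = \frac{\sum x}{N} = \frac{5545}{10} = 554.5$$
$$\bar{Y} = \frac{\sum y}{N} = \frac{98}{10} = 9.8$$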
a. Regression line for estimating the price of a book, i.e. Y on X:
(Y − Ȳ) = byx (X − X̄)
or, (Y − 9.8) = 0.013 (X − 554.5)
or, Y − 9.8 = 0.013x − 7.20
or, Y = 0.013x − 7.20 + 9.8
∴ Y = 0.013x + 2.60
Hence the required regression equation of Y on X is
Y = 2.6 + 0.013x
b. When x = 500, then y = 2.6 + 0.013 × 500
= 9.1
Ans.
c. If the pages of the book are increased by 100 pages, then the number of pages of the
book will be (500 + 100) or 600 pages.
∴ Y = 2.6 + 0.013 × 600
= 10.4
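The book-price regression can be reproduced from the raw data. The sketch below fits Y on X by least squares using deviations from the actual means; the function name `regression_y_on_x` is an illustrative assumption. Since the slope is not rounded to 0.013, the predictions come out marginally different from the hand calculation (about 9.08 instead of 9.1 for 500 pages).

```python
pages = [700, 540, 210, 625, 380, 910, 610, 420, 750, 400]
prices = [12, 11, 5, 10, 7, 15, 9, 8, 12, 9]


def regression_y_on_x(xs, ys):
    """Return (a, b) of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


a, b = regression_y_on_x(pages, prices)
price_500 = a + b * 500   # estimated price of a 500-page book
price_600 = a + b * 600   # estimated price after adding 100 pages
print(round(b, 4), round(a, 2), round(price_500, 2), round(price_600, 2))
```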
Regression equation of Y on X:
$$(Y - \bar{Y}) = r_{xy}\frac{\sigma_y}{\sigma_x}(X - \bar{X})$$
or, Y − 10 = 0.9 × (2/10) × (X − 50)
or, Y − 10 = (1.8/10)(X − 50)
or, Y − 10 = 0.18 (X − 50)
or, Y − 10 = 0.18X − 9
or, Y = 10 − 9 + 0.18X
∴ Y = 1 + 0.18X
Chapter-08 Probability
Introduction:
The concept of probability, which originated in the seventeenth century, has become one of the
most fascinating and debatable subjects in recent years. The probability formulae and techniques
were developed by Jacob Bernoulli (1654-1705), De Moivre (1667-1754), Thomas Bayes (1702-1761)
and Joseph Lagrange (1736-1813). “Probability” is the measure of the likelihood
that an event will occur. Probability is quantified as a
number between 0 and 1, where loosely speaking 0 indicates impossibility and 1 indicates
certainty. The higher the probability of an event, the more likely it is that the event will occur. A
simple example is the tossing of a fair (unbiased) coin. Since the coin is fair, the two outcomes
("heads" and "tails") are both equally probable; the probability of "heads" equals the probability
of "tails"; and since no other outcomes are possible, the probability of either "heads" or "tails" is
1/2 (which could also be written as 0.5 or 50%).
Modern approach to probability theory generally employs set theory and it will be used here for
the development of some fundamental concepts and tools.
The objects which comprise the set are usually referred to as elements or members of the set and
are said to belong to that set or to be contained in it. The set must be well defined in the sense that
one can decide whether any given member does or does not belong to the set. The collection,
aggregate or totality of elements is referred to simply as a set, denoted by S. Thus the following
collections are examples of sets:
The elements are usually enclosed within braces. For example, the set consisting of the
possible outcomes (Tail = T, Head = H) of a single toss of a coin may be expressed as:
S = {T, H}
The set of possible outcomes of tossing two coins may be written as:
S = {HH, HT, TH, TT}
The order in which the elements of a set are listed is of no importance. It is important, however,
that each element of a set is listed only once.
Sometimes it is helpful to have a brief and exact way to describe sets without listing elements.
For example, the set of all university students may be expressed as:
S = {x | x is a student in the university}
We read this as “S is the set of all x such that x is a student in the university”.
Universal set
The universal set U is defined as the set consisting of all the elements under consideration. Thus,
if A is any set and U is the universal set, then every element in A must be in U (since it consists
of elements under consideration)
Null set
Subset
If every element of a set A is also an element of a set B, Then A is called a subset of B. for
example, consider the set A = (3,5) and the set B= (1,2,3,4,5). We note that every element in the
set A is also an element of the set B. the set A is said to be subset of B.
Equal sets
Two sets A and B are said to be equal if and only if every element of A is also an element of B
and vice versa.
Set operation
We shall now consider certain operations on sets that will result in the formation of new sets.
The classical or a priori approach happens to be the earliest. This school of thought assumes that
all the possible outcomes of an experiment are mutually exclusive and equally likely. The words
“equally likely” convey the notion of equally probable, and “mutually exclusive” means that if one
event occurs the other event will not occur. For example, when we toss a coin the probability of
a head is equal to the probability of a tail and is equal to ½.
In the a priori method of measurement, as well as in all other methods, the probability of an event
E is a number such that 0 ≤ P(E) ≤ 1, and the sum of the probability that an event will occur and
the probability that it will not occur is equal to one.
The relative frequency theorists agree that the only valid procedure for determining event
probabilities is through repetitive experience.
The ratio of the number of occurrences of an event to the number of possible occurrences in an
experiment is referred to as the relative frequency. Two definitions of probability in terms of
relative frequency can be given:
(a) If an experiment is performed n times under the same conditions and there are ‘a’
outcomes, a ≤ n, favouring an event, then an estimate of the probability of that event is the ratio a/n.
(b) The estimate of the probability of an event, a/n, approaches a limit, the true probability of the event, as n
approaches infinity:
$$P(E) = \lim_{n \to \infty} \frac{a}{n}$$
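The relative frequency idea can be illustrated with a small simulation. This is only a sketch: the fair coin, the seed value and the function name `relative_frequency` are assumptions chosen for the example, and a simulation can suggest, but not prove, the limiting behaviour.

```python
import random


def relative_frequency(trials, seed=42):
    """Estimate P(head) as a/n from simulated tosses of a fair coin."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    heads = sum(1 for _ in range(trials) if rng.random() < 0.5)
    return heads / trials


# The ratio a/n tends to settle near 0.5 as the number of tosses grows.
for n in (10, 1000, 100000):
    print(n, relative_frequency(n))
```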
The classical approach restricts the calculation of probability to essentially equally likely and
mutually exclusive events. The resolution of non-mutually exclusive events of reality into
mutually exclusive subevents and the introduction of 'equal likelihood' among events which are
essentially not so in reality are questions not clearly treated by the classicists. On the other hand, the
empirical or relative frequency approach requires that every question of a probabilistic nature be
examined experimentally in the laboratory of the mathematician under identical conditions, and
that too over a very long period of time, through the process of repeated observations, if estimates
of the chances of occurrence of the events under consideration are required.
The axiomatic theory of probability is an honest attempt at constructing a theory of probability,
largely free from the inadequacies of both the classical and empirical approaches, in the true
mathematical tradition. It is true that the introduction of advanced logic through mathematical
abstractions renders the complex real world situation too idealized (or too simplified) to be of any
immediate practical utility. But nonetheless it plays an important role in rendering a reasonable
amount of comprehensibility and tractability to the understanding of myriad chance
phenomena observed in nature, at least in the initial stages of any scientific inquiry into their
structure and composition, where other approaches have at best left them less comprehensible and
less tractable.
Though this approach to probability is relatively recent, its application to statistical problems has
occurred virtually entirely in the post World War II period, particularly in connection with
statistical decision theory. According to the personalistic or subjective concept, the probability of
an event is the degree of confidence placed in the occurrence of an event by a particular
individual based on the evidence available to him. This evidence may consist of relative
frequency data and any other quantitative or qualitative information. According to the degree
of belief in this possible occurrence, a subjectivist would assign a weight between 0 and 1 to an
event. Thus, if one believes that it is very likely that the event will occur, he will assign it a
probability close to one, and if he believes that it is unlikely that the event will occur, he will assign
a probability close to zero.
PROBABILITY LAWS
There are several laws that can ease our task of computing probabilities. In this section, we shall
discuss two of the fundamental laws of computing probabilities, viz., the addition law and the
multiplication law.
Addition Law
The probability of occurrence of either event A or event B of two mutually exclusive events is
equal to the sum of their individual probabilities. Symbolically, we may write,
$$P(A \cup B) = P(A) + P(B)$$
[Venn diagram: disjoint events A and B within the universal set U]
Since A and B can be written as a union of simple events in which no simple event of B appears
in A, hence, the result follows.
If two events A and B are not mutually exclusive then the addition law can be stated as follows:
The probability of the occurrence of either event A or event B or both is equal to the probability
that event A occurs, plus the probability that event B occurs, minus the probability that both
events occur. Symbolically, it can be written as
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
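Proof:
$$P(A \cup B) = \frac{n(A \cup B)}{n(U)}$$
where n(A ∪ B) indicates the number of elements belonging to A ∪ B, and n(U)
is the total number of elements in the universal set U.
[Venn diagram: joint events A and B overlapping in A ∩ B within the universal set U]
By adding n(A) and n(B), we count the elements of A ∩ B twice, so n(A ∪ B) = n(A) + n(B) − n(A ∩ B). Hence
$$P(A \cup B) = \frac{n(A) + n(B) - n(A \cap B)}{n(U)}$$
$$= \frac{n(A)}{n(U)} + \frac{n(B)}{n(U)} - \frac{n(A \cap B)}{n(U)} = P(A) + P(B) - P(A \cap B)$$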
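The counting argument in the proof can be checked on a small sample space. The example below is hypothetical: it assumes one roll of a fair die, with A = "an even number" and B = "a number greater than 3", and uses exact fractions so that both sides can be compared without rounding.

```python
from fractions import Fraction

universe = set(range(1, 7))   # outcomes of one roll of a fair die (assumed example)
event_a = {2, 4, 6}           # A: an even number
event_b = {4, 5, 6}           # B: a number greater than 3


def prob(event):
    """Classical probability n(E) / n(U) for equally likely outcomes."""
    return Fraction(len(event & universe), len(universe))


lhs = prob(event_a | event_b)                              # P(A ∪ B)
rhs = prob(event_a) + prob(event_b) - prob(event_a & event_b)
print(lhs, rhs)
```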
Conditional probability
If the probability of one event depends on the occurrence of another event, that probability is called a
conditional probability. When we are dealing with probabilities of a subset rather than the whole set,
our attention is focused on the probability of an event in a subset of the whole set. Probabilities
associated with the events defined on the subsets are called conditional probabilities.
The conditional probability of A, given B, is equal to the probability of A ∩ B divided by the probability
of B, provided that the probability of B is not zero. Symbolically, we may write this as
$$P(A/B) = \frac{P(A \cap B)}{P(B)}, \quad P(B) \neq 0, \text{ i.e. } P(B) > 0$$
$$P(B/A) = \frac{P(B \cap A)}{P(A)}, \quad P(A) \neq 0, \text{ i.e. } P(A) > 0$$
Multiplication Laws
$$P(A \cap B) = P(A/B)\,P(B)$$
$$\text{or, } P(B \cap A) = P(B/A)\,P(A)$$
Proof:
$$P(A/B) = \frac{n(A \cap B)}{n(B)} = \frac{n(A \cap B)/n(U)}{n(B)/n(U)} = \frac{P(A \cap B)}{P(B)}$$
$$\therefore P(A \cap B) = P(A/B)\,P(B)$$
Generalization:
The multiplication law can be extended to more than two events. If we have three events A, B
and C which are not mutually exclusive, then the formula becomes
$$P(A \cap B \cap C) = P(A)\,P(B/A)\,P(C/A \cap B)$$
For n events $A_1, A_2, \ldots, A_n$ the formula becomes
$$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1)\,P(A_2/A_1)\,P(A_3/A_1 \cap A_2) \cdots P(A_n/A_1 \cap A_2 \cap \cdots \cap A_{n-1})$$
Dependent Event:
Two events are said to be dependent if the occurrence or non-occurrence of one event in any trial
affects the probability of other events in other trials. Thus, in the case of dependent events, the
probability of any event is conditional, or depends upon the occurrence or non-occurrence of
other events. From the definition of conditional probability, we can see that if A and B are dependent
events,
$$P(A \cap B) = P(A/B)\,P(B)$$
$$\text{or, } P(B \cap A) = P(B/A)\,P(A)$$
The order is of no significance in the intersection of two events, since A ∩ B = B ∩ A.
Therefore, we have an important property of intersection,
$$P(A \cap B) = P(A/B)\,P(B) = P(B/A)\,P(A)$$
Independent Events:
Two events are said to be independent if the occurrence of the first event does not
affect the probability of the occurrence of the second event. Independent events are those events
whose probability is in no way affected by the occurrence of any other events preceding or
occurring at the same time.
Two events A and B are said to be independent if and only if
$$P(A \cap B) = P(A)\,P(B)$$
which implies, from the two forms of the multiplication law, that
$$P(A/B) = P(A)$$
$$\text{and } P(B/A) = P(B)$$
Bayes Theorem
It is associated with the name of Thomas Bayes (1702-61) and is a theorem on probability
concerned with the method of estimating the probabilities of the causes by which an observed
event may have been produced. This theorem may be stated as follows:
Let B1, B2, ..., Bn be n mutually exclusive events whose union is the universe and let A be an
arbitrary event in the universe such that P(A) ≠ 0. Given P(A/Bi) and P(Bi)
[i = 1, 2, 3, ..., n],
$$P(B_i/A) = \frac{P(A/B_i)\,P(B_i)}{\sum_{i=1}^{n} P(B_i)\,P(A/B_i)}$$
$$\text{or, } P(B_i/A) = \frac{P(A \cap B_i)}{P(A \cap B_1) + P(A \cap B_2) + \cdots + P(A \cap B_n)}$$
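A short numerical illustration of the theorem follows. The figures are hypothetical and are not taken from the text: three machines B1, B2 and B3 are assumed to produce 50%, 30% and 20% of the output, with defect rates of 2%, 3% and 5%, and A is the event that a randomly chosen item is defective. The function name `bayes_posteriors` is likewise illustrative.

```python
from fractions import Fraction


def bayes_posteriors(priors, likelihoods):
    """Return P(Bi/A) for each i, given P(Bi) and P(A/Bi)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]   # P(A ∩ Bi)
    total = sum(joint)                                     # P(A)
    return [j / total for j in joint]


# Hypothetical shares of output and defect rates for three machines.
priors = [Fraction(5, 10), Fraction(3, 10), Fraction(2, 10)]
likelihoods = [Fraction(2, 100), Fraction(3, 100), Fraction(5, 100)]

posteriors = bayes_posteriors(priors, likelihoods)
print(posteriors)
```

The posterior probabilities always add up to one, because the events B1, ..., Bn are mutually exclusive and together make up the whole universe.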