Unit 1bs
But this definition also does not cover the entire scope of statistics. The
statistical methods are methods for the collection, analysis and
interpretation of numerical data and form a basis for the analysis and
comparison of the observed phenomena.
In the words of Croxton & Cowden, “Statistics may be defined as the
collection, presentation, analysis and interpretation of numerical data”.
Characteristics of statistics:
We cannot blame medicine for such a result. Similarly, if a child cuts his
finger with a sharp knife, it is not the knife that is to be blamed, but the
person who kept the knife in a place where the child could reach it. These
examples emphasize that if statistical facts are misused by some people,
it would be wrong to blame statistics as such. It is the people who are
to be blamed. In fact, statistics are like clay, which can be moulded in
any way.
Uses / Applications of Inferential Statistics in Managerial Decision
Making
With the aid of statistical reports, executives can gain a summary
picture of current operations. The following are some major activities
of a typical large and progressive organization, which indicate how
statistics helps in the efficient discharge of various activities.
Marketing: In the field of marketing, it is necessary first to find out what
can be sold and then to evolve a suitable strategy so that goods reach
the ultimate consumer. A skilful analysis of data on population,
purchasing power, habits of people, competition, transportation cost,
etc. should precede any attempt to establish a new market.
Production: The decisions about what to produce, how much to
produce, when to produce and for whom to produce are based largely on
facts analyzed statistically. Statistical tools are also of immense help in
quality control, in setting the optimum inventory level and in dealing
with labor problems, etc.
Finance: Financial forecasts, break-even analysis and investment
decisions under uncertainty are but part of the financial manager's
activities. Statistics also plays an important role in dealing with
inventories, cash balances, etc. Apart from this, the area of security
analysis is also highly quantitative.
Banking: Banking institutions establish a research department within
their organization for the purpose of gathering and analyzing
information not only regarding their own operations but also on general
economic conditions and on every line of business in which they might
be directly or indirectly interested.
Investment: Statistics greatly assists investors in making clear and
valid judgments in their investment decisions, such as selecting
securities which are safe and which have the best prospects of yielding
a good income. These investigations assist in determining whether to buy
or sell, at what time to invest and when to disinvest, etc.
Purchase: The purchase department in discharging its functions
makes use of statistical data to frame suitable purchase policies such
as from where to buy, at what time to buy and at what price to buy.
Accounting: Statistical methods are also applied in accounting. The
auditing function makes frequent application of statistical sampling and
estimation procedures, and the cost accountant uses regression analysis.
The accountant collects data on historical costs in the course of
auditing a company's financial records.
Control: The management control process combines statistical and
accounting methods in making the overall budget for the coming year,
including sales, material, labour and other costs, net profit and
capital requirements. It usually maintains a standard cost system for
controlling costs and setting prices of products.
Credit: The credit department performs statistical analysis to determine
how much credit to extend to various customers. In the formulation of
future credit policy, the characteristics of those who have paid and
those who have defaulted are kept in mind.
Personnel: The personnel department frames personnel policies based
on facts. It makes statistical study of wage rates, incentive plans, cost of
living, labor turn-over rates, employment trends, accident rate,
employee grievances, performance appraisal, training programs etc.
Such studies help the personnel department in the process of
manpower planning.
Research and Development: Many big organizations have research
and development departments which are primarily concerned with finding
out how existing products can be improved, what new product lines can
be added and how the optimal use of resources can be made. In the
absence of factual data it is almost impossible to carry out fruitful
research and development programs.
Measures of Central Tendency
The tendency of data to cluster around some central value is called
central tendency. The following are three important measures of
central tendency which are very commonly used in business.
1. Mean
2. Median
3. Mode.
1. Mean
One of the most important measures of central tendency is the
mean. The following three types of mean are calculated according to
the requirements of the user.
A) Arithmetic Mean
B) Geometric Mean
C) Harmonic Mean
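$$\bar{X} = \frac{\sum X}{N}$$
Where, $\bar{X}$ = Arithmetic Mean
Or, taking deviations from an arbitrary point:
$$\bar{X} = A + \frac{\sum d}{N}$$
Where, $d = (X - A)$
And, $A$ = Arbitrary Point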
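The two formulas above can be checked against each other with a short Python sketch. The function names and the sample marks are illustrative assumptions, not part of the original notes.

```python
def arithmetic_mean(values):
    """Direct method: sum of the observations divided by their number N."""
    return sum(values) / len(values)


def arithmetic_mean_shortcut(values, assumed_mean):
    """Shortcut method: A + (sum of d) / N, where d = X - A."""
    deviations = [x - assumed_mean for x in values]
    return assumed_mean + sum(deviations) / len(values)


marks = [40, 50, 55, 78, 58]  # made-up sample data (assumption)
print(arithmetic_mean(marks))
print(arithmetic_mean_shortcut(marks, assumed_mean=55))
```

Both calls should agree up to floating-point rounding, whatever arbitrary point A is chosen.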
Grouped Data:
For grouped data, the following formula is used –
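$$\bar{X} = \frac{\sum fX}{N}$$
Where, $\bar{X}$ = Arithmetic mean
$X$ = mid point of various classes
$f$ = frequency of each class
$N$ = $\sum f$, the total of the frequency column
Or, taking step deviations from an assumed mean:
$$\bar{X} = A + \frac{\sum fd}{N} \times i$$
Where, $d = \frac{X - A}{i}$
$\bar{X}$ = Arithmetic mean
$A$ = Assumed mean
$\sum fd$ = sum of the deviations from the assumed mean multiplied by
the frequencies
$i$ = Class interval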
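As a rough illustration of the grouped-data formulas, the sketch below computes the mean by the direct method and by step deviations. The class mid points, frequencies and function names are assumptions made up for the example.

```python
def grouped_mean(midpoints, frequencies):
    """Direct method: sum(f * X) / N, with N = sum of the frequencies."""
    n = sum(frequencies)
    return sum(f * x for x, f in zip(midpoints, frequencies)) / n


def grouped_mean_step_deviation(midpoints, frequencies, assumed_mean, class_interval):
    """Step-deviation method: A + (sum(f * d) / N) * i, with d = (X - A) / i."""
    n = sum(frequencies)
    sum_fd = sum(
        f * (x - assumed_mean) / class_interval
        for x, f in zip(midpoints, frequencies)
    )
    return assumed_mean + (sum_fd / n) * class_interval


# Hypothetical classes 0-10, 10-20, ..., 40-50 (mid points 5, 15, ...)
midpoints = [5, 15, 25, 35, 45]
frequencies = [3, 7, 10, 6, 4]
print(grouped_mean(midpoints, frequencies))
print(grouped_mean_step_deviation(midpoints, frequencies, assumed_mean=25, class_interval=10))
```

The two methods should give the same mean; the step-deviation form only reduces the size of the numbers handled by hand.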
Combined Mean of Two Groups:
If we have the arithmetic mean and number of observations of
two or more than two related groups, we can compute combined
average of these groups by applying the following formula.
$$\bar{X}_{12} = \frac{N_1 \bar{X}_1 + N_2 \bar{X}_2}{N_1 + N_2}$$
Where, $\bar{X}_{12}$ = Combined mean of the two groups
$\bar{X}_1$ = Arithmetic mean of the first group
$\bar{X}_2$ = Arithmetic mean of the second group
$N_1$ = No. of observations in the first group
$N_2$ = No. of observations in the second group
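A minimal sketch of the combined mean for two groups follows; the function name and the group sizes and means are invented for illustration.

```python
def combined_mean(n1, mean1, n2, mean2):
    """Combined mean of two groups: (N1*X1 + N2*X2) / (N1 + N2)."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


# Hypothetical example: 40 workers averaging 50 and 60 workers averaging 60
print(combined_mean(40, 50, 60, 60))
```

The combined mean lies closer to the mean of the larger group, because each group mean is weighted by its number of observations.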
$$\bar{X}_W = \frac{\sum WX}{\sum W}$$
Where $\bar{X}_W$ represents the weighted arithmetic mean, $X$ = the
variable and $W$ = weights attached to the variable $X$.
The term weight refers to the relative importance of the different
observations. The weights may be either actual or arbitrary, i.e.
estimated. Weights are generally given in the problems, but in
situations where no weights are given, they are assigned keeping in
view the relative importance of the observations.
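The weighted arithmetic mean can be sketched as below. The scores and weights are arbitrary illustrative values, and the function name is an assumption.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(W * X) / sum(W)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


scores = [70, 80, 90]   # hypothetical values of the variable X
weights = [1, 2, 3]     # hypothetical relative importance W
print(weighted_mean(scores, weights))
```

With equal weights the result reduces to the simple arithmetic mean.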
(B) Geometric Mean: Geometric mean is defined as the Nth root of the
product of N observations of a given data. If there are two observations
we take square root, three observations, we take cube root and so on.
Symbolically,
$$G.M. = \sqrt[N]{X_1 \times X_2 \times X_3 \times \dots \times X_N}$$
$$\log G.M. = \frac{\sum \log X}{N}$$
$$G.M. = A.L.\left(\frac{\sum \log X}{N}\right)$$
Where, A.L. = antilog
In grouped data, to calculate the geometric mean we first find the
mid points of the classes and then apply the following formula:
$$G.M. = A.L.\left(\frac{\sum f \log X}{N}\right)$$
Where, $X$ = mid point
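The logarithmic form of the geometric mean translates directly into code. This is a sketch using base-10 logarithms, with the antilog taken as a power of 10; the function names and data are assumptions.

```python
import math


def geometric_mean(values):
    """G.M. = antilog(sum(log X) / N); all values must be positive."""
    log_sum = sum(math.log10(x) for x in values)
    return 10 ** (log_sum / len(values))


def grouped_geometric_mean(midpoints, frequencies):
    """G.M. = antilog(sum(f * log X) / N), with X the class mid points."""
    n = sum(frequencies)
    log_sum = sum(f * math.log10(x) for x, f in zip(midpoints, frequencies))
    return 10 ** (log_sum / n)


print(geometric_mean([4, 6, 9]))                # cube root of 4 * 6 * 9
print(grouped_geometric_mean([4, 9], [2, 2]))   # hypothetical mid points and frequencies
```

Working with logarithms avoids forming the full product, which can become very large for long series.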
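(C) Harmonic Mean:
$$H.M. = \frac{N}{\frac{1}{X_1} + \frac{1}{X_2} + \frac{1}{X_3} + \dots + \frac{1}{X_N}}$$
$$H.M. = \frac{N}{\sum \left(\frac{1}{X}\right)}$$
For grouped data:
$$H.M. = \frac{N}{\sum \left(f \cdot \frac{1}{X}\right)}$$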
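The harmonic mean formulas can be sketched as follows, for ungrouped values and for class mid points with frequencies. Names and data are illustrative assumptions.

```python
def harmonic_mean(values):
    """H.M. = N / sum(1 / X); all values must be non-zero."""
    return len(values) / sum(1 / x for x in values)


def grouped_harmonic_mean(midpoints, frequencies):
    """H.M. = N / sum(f / X), with N = sum of the frequencies."""
    n = sum(frequencies)
    return n / sum(f / x for x, f in zip(midpoints, frequencies))


# Hypothetical example: equal distances covered at 40 and 60 km/h
print(harmonic_mean([40, 60]))
print(grouped_harmonic_mean([2, 4], [1, 2]))
```

The speed example is the typical use: the average speed over equal distances is the harmonic mean of the speeds, not the arithmetic mean.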
Median
Calculation of Median
(i) For ungrouped data
Arrange the data in ascending or descending order of magnitude.
Apply the following formula to find the position of the median:
$$\text{Median No.} = \frac{N + 1}{2}$$
The item at this position in the ordered series gives the value of the
median.
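A small sketch of the ungrouped procedure follows. When N is even the median number is fractional; the code then averages the two middle items, which is the usual convention but is an assumption here, since the notes do not cover that case.

```python
def median_ungrouped(values):
    """Median via the (N + 1) / 2 rule on the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2              # "Median No.", counted from 1
    lower = ordered[int(position) - 1]
    if position.is_integer():
        return lower
    # Even N (assumed convention): average the two middle items
    upper = ordered[int(position)]
    return (lower + upper) / 2


print(median_ungrouped([7, 3, 9, 1, 5]))   # odd number of items
print(median_ungrouped([4, 1, 3, 2]))      # even number of items
```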
(ii) For grouped data
$$\text{Median} = L + \frac{\frac{N}{2} - p.c.f.}{f} \times i$$
Where,
$L$ = Lower limit of the median class
$p.c.f.$ = cumulative frequency of the class preceding the median class
$f$ = frequency of the median class
$i$ = Class interval of the median class
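The grouped-data formula can be illustrated with the sketch below. It takes the median class to be the first class whose cumulative frequency reaches N/2; the class limits, frequencies and function name are made up for the example.

```python
def median_grouped(lower_limits, frequencies, class_interval):
    """Median = L + ((N/2 - pcf) / f) * i for equal-width classes."""
    n = sum(frequencies)
    cumulative = 0  # cumulative frequency of the classes seen so far (pcf)
    for lower, f in zip(lower_limits, frequencies):
        if cumulative + f >= n / 2:
            # Median class found: interpolate inside it
            return lower + ((n / 2 - cumulative) / f) * class_interval
        cumulative += f
    raise ValueError("frequencies must not be empty")


# Hypothetical classes 0-10, 10-20, 20-30, 30-40, 40-50
lower_limits = [0, 10, 20, 30, 40]
frequencies = [5, 8, 12, 10, 5]
print(median_grouped(lower_limits, frequencies, class_interval=10))
```

Here N/2 = 20 falls in the class 20-30, whose preceding cumulative frequency is 13 and whose own frequency is 12.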
Mode
Mode is defined as the value which occurs the maximum number of
times, that is, the value having the maximum frequency.
Graphically, it is the value on the x axis below the peak or highest point.
In statistics, the mode only tells us which single value occurs most
often; it may or may not represent a majority of the total population.
Locating Mode in the Distribution
In the histogram method we first draw a histogram and thereafter draw
two lines diagonally inside the modal class bar, each starting from an
upper corner of that bar and ending at the upper corner of the adjacent
bar. From the point of intersection of these two diagonals, we draw a
perpendicular line to the x axis, which gives the modal value.
Mode can also be determined from a frequency polygon, in which a
perpendicular is drawn to the base from the apex of the polygon; the
point where it meets the base gives the modal value.
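$$Z = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i$$
Or,
$$Z = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i$$
Where,
$L$ = Lower limit of the modal class
$f_1$ = frequency of the modal class
$f_0$ = frequency of the class preceding the modal class
$f_2$ = frequency of the class succeeding the modal class
$\Delta_1 = f_1 - f_0$ and $\Delta_2 = f_1 - f_2$
$i$ = Class interval of the modal class
In case of grouped data either of the above formulae can be used to
find out the mode.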
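A sketch of the grouped mode formula is given below. It reads f1 as the frequency of the modal class and f0 and f2 as the frequencies of the classes just before and after it, which is the usual interpretation but is stated here as an assumption. A missing neighbour is treated as having zero frequency, and ties are resolved in favour of the first class with the highest frequency; both are choices made for the example.

```python
def mode_grouped(lower_limits, frequencies, class_interval):
    """Z = L + (f1 - f0) / (2*f1 - f0 - f2) * i for equal-width classes."""
    # Index of the modal class (first class with the highest frequency)
    k = max(range(len(frequencies)), key=frequencies.__getitem__)
    f1 = frequencies[k]
    f0 = frequencies[k - 1] if k > 0 else 0                     # assumed 0 if absent
    f2 = frequencies[k + 1] if k + 1 < len(frequencies) else 0  # assumed 0 if absent
    # Note: the denominator is zero when f0 == f1 == f2; the formula fails there
    return lower_limits[k] + (f1 - f0) / (2 * f1 - f0 - f2) * class_interval


# Hypothetical classes 0-10, 10-20, 20-30, 30-40, 40-50
lower_limits = [0, 10, 20, 30, 40]
frequencies = [5, 8, 12, 10, 5]
print(mode_grouped(lower_limits, frequencies, class_interval=10))
```

Because f1 - f0 and f1 - f2 are exactly the two differences in the second form of the formula, both forms give the same value.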
NOTE:
(i) While determining the mode, it is important to see that the
class intervals are uniform throughout, otherwise the result will
be misleading.
(ii) A distribution may have more than one mode. A distribution
having a single mode is called unimodal and a distribution having
two modes is called bimodal.
(iii) In the case of a bimodal distribution, or one in which the mode
is otherwise ill defined, the following formula is used to arrive at
a single value of the mode.
Dispersion / Variation
A measure of variation or dispersion is designed to state the extent to
which the individual measures differ on an average from the mean.
A measure of variation indicates the amount and degree of variation,
but not its direction.
Measures of Variation
Following are important measures of variation / dispersion.
i. Range
ii. Quartile Deviation.
iii. Average / Mean Deviation.
iv. Standard Deviation.
v. Lorenz curve
Range
Range is the simplest method of studying variation. It is defined as the
difference between the value of the smallest observation and the value
of the largest observation included in the distribution.
Symbolically,
Range = L - S
Where,
L = Largest value
S = Smallest value.
The relative measure corresponding to range is called the coefficient
of range and is obtained by applying the following formula.
$$\text{Coefficient of Range} = \frac{L - S}{L + S}$$
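The range and its coefficient can be computed as in this brief sketch; the function names and the data are illustrative assumptions.

```python
def value_range(values):
    """Range = L - S (largest value minus smallest value)."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Coefficient of Range = (L - S) / (L + S)."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


prices = [10, 25, 40, 15, 30]  # made-up sample data
print(value_range(prices))
print(coefficient_of_range(prices))
```

The coefficient is a pure number, so it can be used to compare series measured in different units.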
Quartile Deviation
$$Q.D. = \frac{Q_3 - Q_1}{2}$$
$$\text{Coeff. of } Q.D. = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$
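The quartile deviation can be sketched with Python's standard `statistics.quantiles`. Its default "exclusive" method places the quartiles at positions based on N + 1, which matches the common textbook rule; the notes do not say which quartile rule they intend, so this choice is an assumption.

```python
import statistics


def quartile_deviation(values):
    """Q.D. = (Q3 - Q1) / 2."""
    # Assumption: quartiles from the default "exclusive" method
    q1, _, q3 = statistics.quantiles(values, n=4)
    return (q3 - q1) / 2


def coefficient_of_quartile_deviation(values):
    """Coeff. of Q.D. = (Q3 - Q1) / (Q3 + Q1)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return (q3 - q1) / (q3 + q1)


data = [2, 4, 6, 8, 10, 12, 14]  # made-up sample: Q1 = 4, Q3 = 12
print(quartile_deviation(data))
print(coefficient_of_quartile_deviation(data))
```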
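Average / Mean Deviation
$$M.D. = \frac{\sum |x|}{n}$$
Where $|x|$ is the absolute deviation of an observation from the
average used (mean or median).
$$\text{Coeff. of } A.D._{\text{Med}} = \frac{A.D.}{\text{Median}}$$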
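A minimal sketch of the mean deviation and its coefficient follows. It takes deviations from the median by default, to match the coefficient shown above; the parameter `about` and the function names are hypothetical.

```python
import statistics


def mean_deviation(values, about=None):
    """M.D. = sum(|x|) / n, where |x| is the absolute deviation from `about`."""
    # Assumption: deviations are taken from the median unless told otherwise
    center = statistics.median(values) if about is None else about
    return sum(abs(x - center) for x in values) / len(values)


def coefficient_of_mean_deviation(values):
    """Coefficient of A.D. about the median = A.D. / Median."""
    median = statistics.median(values)
    return mean_deviation(values, median) / median


data = [2, 4, 6, 8, 10]  # made-up sample data
print(mean_deviation(data))
print(coefficient_of_mean_deviation(data))
```

Passing the arithmetic mean as `about` gives the mean deviation about the mean instead.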
Standard Deviation
The concept of standard deviation was introduced by Karl Pearson in
1893. It is a measure of how much spread or variability is present in
the sample. It is the square root of the mean of the squared deviations
from the arithmetic mean.
Standard deviation is also known as the root mean square deviation and
is denoted by the Greek letter σ.
Calculation of Standard Deviation
Ungrouped data -
Standard deviation may be computed by applying any of the following
two methods.
1. By taking deviations from the actual mean; and
2. By taking deviations from assumed mean.
Deviations from actual mean – When deviations are taken from actual
mean, the following formula will be applied:
$$\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$$
If we calculate standard deviation without taking deviations, the above
formula can be used after simplification as follows:
$$\sigma = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2}$$
Or,
$$\sigma = \sqrt{\frac{\sum X^2}{N} - (\bar{X})^2}$$
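The deviation form and the simplified form of the standard deviation can be compared with this sketch; the function names and the data are assumptions for illustration.

```python
import math


def std_dev_from_mean(values):
    """sigma = sqrt(sum((X - mean)^2) / N), deviations from the actual mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def std_dev_direct(values):
    """sigma = sqrt(sum(X^2)/N - (sum(X)/N)^2), without taking deviations."""
    n = len(values)
    return math.sqrt(sum(x * x for x in values) / n - (sum(values) / n) ** 2)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample: mean 5, variance 4
print(std_dev_from_mean(data))
print(std_dev_direct(data))
```

Both divide by N, as in the formulas above, so they describe the series itself rather than estimate a wider population.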
Deviations from Assumed Mean:
When the actual mean is in fractions, it would be too cumbersome to
take deviations from it and then find squares of these deviations. In
such a case either the mean may be approximated or else the
deviations be taken from assumed mean and the necessary adjustment
be made in the value of standard deviation. The former method of
approximation is less accurate and therefore, invariably in such a case
deviations are taken from assumed mean.
When deviations are taken from assumed mean the following formula
will be applied:
$$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2}$$
Where, $d = (X - A)$
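The assumed-mean formula can be sketched as below. The result should not depend on which assumed mean A is chosen; the function name and data are illustrative.

```python
import math


def std_dev_assumed_mean(values, assumed_mean):
    """sigma = sqrt(sum(d^2)/N - (sum(d)/N)^2), with d = X - A."""
    n = len(values)
    d = [x - assumed_mean for x in values]
    return math.sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample data
print(std_dev_assumed_mean(data, assumed_mean=4))
print(std_dev_assumed_mean(data, assumed_mean=6))
```

The second term in the formula is the adjustment that corrects for A not being the actual mean.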
Calculation of Standard Deviation
Grouped data –
In a grouped frequency distribution, standard deviation can be
calculated by applying any of the following two methods:
1. By taking deviations from actual mean; and
2. By taking deviations from assumed mean.
Deviations taken from actual mean:
When deviations are taken from actual mean, the following formula is
used:
$$\sigma = \sqrt{\frac{\sum f (X - \bar{X})^2}{N}}$$
If we calculate standard deviation without taking deviations, then this
formula after simplification can be used and is given by:
$$\sigma = \sqrt{\frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2}$$
Or,
$$\sigma = \sqrt{\frac{\sum fX^2}{N} - (\bar{X})^2}$$
Deviations taken from assumed mean-
When deviations are taken from the assumed mean, the following
formula is applied:
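$$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i$$
Where, $d = \frac{X - A}{i}$ and $i$ = Class interval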
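For grouped data, the actual-mean and step-deviation formulas can be checked against each other as sketched below. The step deviation is taken as d = (X - A) / i, which the multiplication by i in the formula implies; the mid points, frequencies and names are made up.

```python
import math


def grouped_std_dev(midpoints, frequencies):
    """sigma = sqrt(sum(f * (X - mean)^2) / N), deviations from the actual mean."""
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / n
    return math.sqrt(
        sum(f * (x - mean) ** 2 for x, f in zip(midpoints, frequencies)) / n
    )


def grouped_std_dev_step_deviation(midpoints, frequencies, assumed_mean, class_interval):
    """sigma = sqrt(sum(f*d^2)/N - (sum(f*d)/N)^2) * i, with d = (X - A) / i."""
    n = sum(frequencies)
    d = [(x - assumed_mean) / class_interval for x in midpoints]
    sum_fd = sum(f * v for v, f in zip(d, frequencies))
    sum_fd2 = sum(f * v * v for v, f in zip(d, frequencies))
    return math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * class_interval


# Hypothetical classes 0-10, 10-20, ..., 40-50 (mid points 5, 15, ...)
midpoints = [5, 15, 25, 35, 45]
frequencies = [3, 7, 10, 6, 4]
print(grouped_std_dev(midpoints, frequencies))
print(grouped_std_dev_step_deviation(midpoints, frequencies, assumed_mean=25, class_interval=10))
```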
Coefficient of Variation: The corresponding relative measure is known
as the coefficient of variation and was developed by Karl Pearson. It is
used in problems where we want to compare the variability of two
or more series.
The series having the higher coefficient of variation is said to be more
variable or less consistent, and the series having the lower coefficient
of variation is said to be less variable or more consistent. The formula
for calculating the coefficient of variation is as follows:
$$C.V. = \frac{\sigma}{\bar{X}} \times 100$$
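The comparison of two series by coefficient of variation can be sketched as follows. The two series and the function name are invented for the example.

```python
import math


def coefficient_of_variation(values):
    """C.V. = (sigma / mean) * 100, with sigma computed by dividing by N."""
    n = len(values)
    mean = sum(values) / n
    sigma = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return sigma / mean * 100


series_a = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical series A
series_b = [48, 50, 52]               # hypothetical series B
print(coefficient_of_variation(series_a))
print(coefficient_of_variation(series_b))
```

Series B has the lower coefficient, so by the rule above it is the more consistent of the two.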
Skewness: The term skewness refers to lack of symmetry or
departure from symmetry. A distribution that is not symmetrical (that
is, asymmetrical) is called a skewed distribution. The measures of
skewness indicate the difference between the manner in which the
observations are distributed in a particular distribution and the manner
in which they are distributed in a symmetrical (or normal) distribution.
In a symmetrical distribution, the values of mean, median and mode are
alike. In a skewed distribution these values differ. If the value of the
mean is greater than the mode, skewness is said to be positive. On the
other hand, if the value of the mode is greater than the mean, skewness
is said to be negative.
Measures of skewness-
But where mode is ill defined the following formula can be used –
$$Sk_p = \frac{3(\text{Mean} - \text{Median})}{\sigma}$$
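The median-based measure can be sketched as below. The code assumes the usual form of Karl Pearson's coefficient, in which 3(Mean - Median) is divided by the standard deviation σ; treat that divisor, the function name and the data as assumptions.

```python
import math
import statistics


def pearson_skewness_median(values):
    """Sk_p = 3 * (Mean - Median) / sigma (assumed standard form)."""
    n = len(values)
    mean = sum(values) / n
    median = statistics.median(values)
    sigma = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return 3 * (mean - median) / sigma


print(pearson_skewness_median([2, 4, 4, 4, 5, 5, 7, 9]))  # mean above median
print(pearson_skewness_median([1, 2, 3, 4, 5]))           # symmetrical series
```

A positive value indicates positive skewness and a value of zero indicates symmetry, in line with the description above.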
Kurtosis
Kurtosis in Greek means bulginess. In statistics, kurtosis refers to
the degree of flatness or peakedness in the region about the mode of a
frequency curve. If a curve is more peaked than the normal curve, it is
called "leptokurtic"; if it is more flat-topped than the normal curve,
it is called "platykurtic". The normal curve itself is known as
mesokurtic. The concept of kurtosis is rarely used in analyzing
business data.
Measures of Kurtosis:
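Kurtosis is measured by $\beta_2$ or its derivative $\gamma_2$:
$$\beta_2 = \frac{\mu_4}{\mu_2^2} \quad \text{and} \quad \gamma_2 = \beta_2 - 3$$
Where $\mu_2$ and $\mu_4$ are the second and fourth moments about the
mean.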
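The two kurtosis measures can be sketched as below, reading μ2 and μ4 as the second and fourth moments about the mean (the standard meaning, stated here as an assumption). The function name and the data are illustrative.

```python
def kurtosis_measures(values):
    """Return (beta2, gamma2) with beta2 = mu4 / mu2^2 and gamma2 = beta2 - 3."""
    n = len(values)
    mean = sum(values) / n
    mu2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    mu4 = sum((x - mean) ** 4 for x in values) / n   # fourth central moment
    beta2 = mu4 / mu2 ** 2
    return beta2, beta2 - 3


beta2, gamma2 = kurtosis_measures([1, 2, 3, 4, 5])  # made-up sample data
print(beta2, gamma2)
```

For the normal curve β2 equals 3, so a negative γ2, as in this sample, points to a platykurtic shape and a positive γ2 to a leptokurtic one.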