Business Stats
UNIT I & II
STATISTICS
Definitions: “The classified facts respecting the condition of the people in a state, especially those facts which can be stated in numbers or in tables of numbers or in any tabular or classified arrangement.”
-Webster
“Statistics may be regarded as (i) the study of population (ii) The study of
variation (iii) The study of method of reduction of data”
-R.A. Fisher
Nature /Features /Characteristics of statistics
It is an aggregate of facts
It is numerically expressed
Division of Statistics
Statistical Methods
• By this we mean methods specially adapted to the elucidation of quantitative data affected by a multiplicity of causes. A few methods are: (1) Collection of Data (2) Classification (3) Tabulation (4) Presentation (5) Analysis (6) Interpretation (7) Forecasting.
Theoretical
• The mathematical theory which is the basis of the science of statistics is called theoretical statistics.
Applied
• It deals with the application of rules and principles developed for specific problems in different disciplines. E.g.: Time series, Sampling, Statistical Quality Control, design of experiments.
Functions of Statistics:-
It presents facts in a definite form.
It simplifies mass of figures
It facilitates comparison
It helps in prediction
It helps in formulating suitable policies.
Scope of Statistics:-
1. Statistics and state or govt.
2. Statistics and business or management.
Marketing
Production
Finance
Banking
Control
Research and Development
3. Statistics and Economics
Measures National Income
Money Market analysis
Analysis of competition, monopoly, oligopoly,
Analysis of Population etc.
4. Statistics and science
5. Statistics and Research
Limitations:-
(i) It does not deal with individual items but with aggregates.
(ii) Only an expert can use it.
(iii) It is not the only method to analyze the problem.
(iv) It can be misused etc.
STATISTICAL INVESTIGATION
Meaning: In general, it means a statistical survey.
In brief, it is the scientific and systematic collection of data, their analysis with the help of various statistical methods, and their interpretation.
Stages of a statistical investigation:
1. Planning of Investigation
2. Collection of Data
3. Editing of Data
4. Presentation and analysis of Data
5. Interpretation of Data or Report Preparation
Types of Statistical Investigation:
• Experiment or survey investigation
• Original or repetitive investigation
• Confidential or open investigation
• General purpose and specific purpose investigation
• Complete or sample investigation
• Official, semi-official, non-official investigation
Collection of Data: - It means the methods that are to be employed for obtaining the required
information from the units under investigations.
Preparation of Questionnaires:-
This method of data collection is quite popular, particularly in the case of big enquiries. It is adopted by individuals, research workers, private and public organizations and even by governments.
A questionnaire consists of a number of questions printed or typed in a definite order on a form or set of forms. The respondents have to answer the questions on their own.
Importance:-
i. Low cost and universal
ii. Free from biases.
iii. Respondents have adequate time to respond
iv. Fairly approachable
Difference between Primary Data and Secondary Data
Basis of Difference : Purpose
Primary Data : Primary data are collected according to the object of investigation and are used without correction.
Secondary Data : Secondary data are collected for some other purpose and are corrected before use.
Demerits:-
(i) Low rate of return
(ii) Can be filled in only by educated respondents
(iii) Slowest method of Response
Points to keep in mind while preparing a questionnaire:
• Prepare it in a general form.
• Prepare the questions in sequence.
• Emphasize question formulation and wordings.
• Ask logical and not misleading questions.
• Personal questions should be left to the end.
• Technical terms and vague expressions should be avoided.
iv. Quantitative Classification: When data are classified on the basis of measurable quantities such as height, weight, income, sales etc.
Tabulation of Data
A table is a systematic arrangement of statistical data in columns and Rows.
Part of Table:-
1. Table number
2. Title of the Table
3. Caption
4. Stub
5. Body of the table
6. Head note
7. Foot Note
Types of Table:-
(i) Simple and Complex Table:-
(a) Simple or one-way table:-
Age No. of Employees
25 10
30 7
35 12
40 9
45 6
2) General Purpose and Specific Purpose Table:- General purpose tables, also known as reference tables or repository tables, provide information for general use or reference.
Special purpose tables, also known as summary or analytical tables, provide information for one particular discussion or specific purpose.
METHODS OF SAMPLING
Meaning: The process of obtaining a sample and its subsequent analysis and interpretation is known as sampling, and the process of obtaining the sample is the first stage of sampling.
I Simple Random Sampling: - In this method each and every item of the population is given an equal
chance of being included in the sample.
(a) Lottery Method (b) Table of Random Numbers
Merits:
Equal opportunity to each item.
Better way of judgment
Easy analysis and accuracy
Limitations:
Difficult in investigation
Expensive and time consuming
Not suitable for field surveys
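The lottery idea behind simple random sampling can be sketched in a few lines of Python. This is only an illustration: the population of employee numbers 1 to 50, the sample size and the function name `simple_random_sample` are assumptions made for the example, not part of the notes.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n items without replacement so that every item has an equal chance."""
    rng = random.Random(seed)          # a seed only makes the draw repeatable
    return rng.sample(list(population), n)

# Hypothetical population: employee numbers 1..50
population = list(range(1, 51))
sample = simple_random_sample(population, 5, seed=42)
print(sample)
```

Using a seeded generator plays the role of a table of random numbers: the same seed reproduces the same sample.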
II Stratified Sampling:- In this method the population is first divided into homogeneous groups called strata. A sample is then taken from each group by the simple random method.
Merits:
Greater accuracy
Geographically concentrated
Limitations: Utmost care must be exercised in dividing the population into homogeneous groups. In the absence of a skilled supervisor, sample selection will be difficult.
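A minimal sketch of stratified sampling, assuming proportional allocation (the same fraction drawn from every stratum). The strata names, their sizes and the 10% fraction are hypothetical values chosen for the example.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Take a simple random sample of the given fraction from each stratum."""
    rng = random.Random(seed)
    chosen = {}
    for name, items in strata.items():
        k = max(1, round(len(items) * fraction))   # at least one item per stratum
        chosen[name] = rng.sample(items, k)
    return chosen

# Hypothetical strata: 100 clerical and 20 managerial employees
strata = {
    "clerical": list(range(100)),
    "managerial": list(range(100, 120)),
}
picked = stratified_sample(strata, 0.10, seed=1)
print({name: len(items) for name, items in picked.items()})
```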
III Systematic Sampling:- This method is popularly used in those cases where a complete list of the population from which the sample is to be drawn is available. The method is to select every k-th item from the list, where k refers to the sampling interval.
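Systematic selection of every k-th item can be sketched as follows. The list of 100 roll numbers and the interval k = 10 are assumptions for illustration; the random starting point within the first interval is a common convention rather than something stated above.

```python
import random

def systematic_sample(items, k, seed=None):
    """Pick a random start in the first interval, then every k-th item."""
    rng = random.Random(seed)
    start = rng.randrange(k)           # position 0 .. k-1
    return items[start::k]

# Hypothetical complete list of the population: roll numbers 1..100
roll_numbers = list(range(1, 101))
chosen = systematic_sample(roll_numbers, 10, seed=7)
print(chosen)
```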
IV Multi- Stage Sampling: - This method refers to a sampling procedure which is carried out in several
stages.
Non Random Sampling Method:-
I. Judgment Sampling: - The choice of sample items depends exclusively on the judgment of the investigator, i.e., the investigator exercises his judgment in the choice of sample items. This is a simple method of sampling.
II. Quota Sampling: - Quotas are set up according to given criteria, but, within the quotas the
selection of sample items depends on personal judgment.
III. Convenience Sampling: - It is also known as chunk sampling. A chunk is a fraction of the population taken for investigation because of its convenient availability. That is why a chunk is selected neither by probability nor by judgment but by convenience.
Size of Sample:- It depends upon the following things:-
Cost aspects.
The degree of accuracy desired.
Time, etc.
Normally it is 5% or 10% of the total population.
UNIT-III
MEASURES OF CENTRAL TENDENCY
The point around which the observations concentrate in general in the central part of the data is called
central value of the data and the tendency of the observations to concentrate around a central point is
known as Central Tendency.
MEAN
• arithmetic mean
• geometric mean
• harmonic mean
MEDIAN
MODE
ARITHMETIC MEAN ($\bar{x}$)
Arithmetic Mean of a group of observations is the quotient obtained by dividing the sum of all observations by their number. It is the most commonly used average or measure of central tendency, applicable only in the case of quantitative data. Arithmetic mean is also simply called “mean”. Arithmetic mean is denoted by $\bar{x}$.
MEDIAN (M)
The median is that value of the variable which divides the group into two equal parts, one part
comprising of all values greater and other of all values less than the median. For calculation of median
the data has to be arranged in either ascending or descending order. Median is denoted by M.
Uses
• When there are open-ended classes, provided it does not fall in those classes.
• When exceptionally large or small values occur at the ends of the frequency distribution.
• When the observations cannot be measured numerically but can be ranked in order.
• To determine the typical value in problems concerning the distribution of wealth etc.
MODE (Z)
Mode is the value which occurs the greatest number of times in the data. The word mode has been
derived from the French word ‘La Mode’ which implies fashion. The Mode of a distribution is the value
at the point around which the items tend to be most heavily concentrated. It may be regarded as the
most typical of a series of values. Mode is denoted by Z.
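The three averages defined so far can be compared on one small data set. The sketch below uses Python's standard `statistics` module; the ages are hypothetical figures chosen only for illustration.

```python
import statistics

# Hypothetical ages of seven employees
ages = [25, 30, 30, 35, 40, 30, 45]

mean = statistics.mean(ages)       # sum of observations / number of observations
median = statistics.median(ages)   # middle value after arranging in order
mode = statistics.mode(ages)       # value occurring the greatest number of times

print("Mean   :", round(mean, 2))
print("Median :", median)
print("Mode   :", mode)
```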
Geometric Mean is appropriate when:
Large observations are to be given less weight.
We find relative changes, such as the average rate of population growth, the average rate of interest etc.
Where some of the observations are too small and/or too large.
Also used for construction of Index Numbers.
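As an illustration of averaging relative changes, the sketch below computes the geometric mean of yearly growth factors. The three growth rates (10%, 20%, 30%) are hypothetical, and the helper name `geometric_mean` is an assumption of this example.

```python
import math

def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))

# Hypothetical yearly growth factors: +10%, +20%, +30%
factors = [1.10, 1.20, 1.30]
gm = geometric_mean(factors)
am = sum(factors) / len(factors)

print("Average growth rate (G.M.):", round((gm - 1) * 100, 2), "%")
print("Arithmetic mean of factors:", round(am, 4))
```

The geometric mean comes out below the arithmetic mean, which is why it gives less weight to large observations.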
UNIT IV
DISPERSION
Dispersion (also known as scatter, spread or variation) measures the extent to which the items vary from some central value. A measure of dispersion is also called an average of the second order (central tendency is called an average of the first order).
Two distributions of statistical data may be symmetrical and have a common mean, median or mode, yet they may differ widely in the scatter of their values about the measures of central tendency.
Measures of Dispersion
Based on selected items:
• Range (coefficient of Range)
• Inter-quartile Range (IQR), coefficient of Inter-quartile Range
Based on all items:
• Mean Deviation
• Standard Deviation
1. Range: - Range (R) is defined as the difference between the value of the largest item and the value of the smallest item in the distribution. Only the two extreme values are taken into consideration; the frequencies of the series are not considered at all.
2. Quartile Deviation: - Quartile Deviation is half of the difference between the upper quartile (Q3) and the lower quartile (Q1). It is very much affected by sampling fluctuations.
3. Mean Deviation: - Mean Deviation or Average Deviation (Alpha) is the arithmetic average of the deviations of all the values taken from a statistical average (mean, median or mode) of the series. In taking the deviations, the algebraic signs + and - are ignored, i.e., all deviations are treated as positive. This is also known as the first absolute moment.
4. Standard Deviation:- The standard deviation is the positive square root of the arithmetic mean of the squared deviations of the values from their arithmetic mean. The S.D. is denoted by Sigma ($\sigma$).
Variance
The square of the standard deviation is called variance. In other words, the arithmetic mean of the squares of the deviations of the values from their arithmetic mean is called variance and is denoted by $\sigma^2$. Variance is also known as the second moment about the mean. Put the other way, the positive square root of the variance is called the S.D.
Coefficient of Variation: To compare the dispersion between two or more series we define the coefficient of S.D. as $\sigma / \bar{x}$. The expression $\text{C.V.} = \dfrac{\sigma}{\bar{x}} \times 100$ is known as the coefficient of variation.
Interpretation of Coefficient of Variance-
Value of variance : Interpretation
Smaller the value of $\sigma^2$ : Lesser the variability, or greater the uniformity/stability/homogeneity of the population
Larger the value of $\sigma^2$ : Greater the variability, or lesser the uniformity/consistency of the population
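The measures based on all items can be computed directly from their definitions. The sketch below assumes the population form (division by N, as in the definitions above); the two series A and B are hypothetical, and the function names are choices made for this example.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def mean_deviation(values):
    """Average of absolute deviations from the arithmetic mean."""
    m = mean(values)
    return sum(abs(v - m) for v in values) / len(values)

def variance(values):
    """Arithmetic mean of squared deviations from the mean (sigma squared)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def coefficient_of_variation(values):
    """C.V. = sigma / mean x 100."""
    return sqrt(variance(values)) / mean(values) * 100

# Two hypothetical series with the same mean (50) but different scatter
series_a = [48, 50, 52, 50]
series_b = [30, 70, 50, 50]

for name, series in (("A", series_a), ("B", series_b)):
    print(name, "variance =", variance(series),
          "C.V. =", round(coefficient_of_variation(series), 2), "%")
```

Series A has the smaller coefficient of variation, so it is the more uniform of the two.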
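DISPERSION: FORMULAE
RANGE = R
For Individual Series, Discrete Series and Continuous Series alike:
Range: $R = L - S$, where L = Largest and S = Smallest Observation
Coefficient of Range: $\dfrac{L - S}{L + S}$
STANDARD DEVIATION = $\sigma$
Direct (through actual mean):
Individual Series: $\sigma = \sqrt{\dfrac{\sum (x - \bar{x})^2}{N}}$
Discrete and Continuous Series: $\sigma = \sqrt{\dfrac{\sum f (x - \bar{x})^2}{\sum f}}$
Indirect (through assumed mean A, with $dx = x - A$):
Individual Series: $\sigma = \sqrt{\dfrac{\sum dx^2}{N} - \left(\dfrac{\sum dx}{N}\right)^2}$
Discrete and Continuous Series: $\sigma = \sqrt{\dfrac{\sum f\,dx^2}{\sum f} - \left(\dfrac{\sum f\,dx}{\sum f}\right)^2}$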
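A short check of the range formulas and of the assumed-mean route to the standard deviation, for an individual series. The ages 25 to 45 are taken from the one-way table earlier in these notes and treated here as five individual observations; the assumed mean A = 30 is an arbitrary choice for the example.

```python
from math import sqrt

ages = [25, 30, 35, 40, 45]
n = len(ages)

# Range and coefficient of range
largest, smallest = max(ages), min(ages)
value_range = largest - smallest
coeff_range = (largest - smallest) / (largest + smallest)

# Standard deviation, direct method (deviations from the actual mean)
actual_mean = sum(ages) / n
sd_direct = sqrt(sum((x - actual_mean) ** 2 for x in ages) / n)

# Standard deviation, indirect method (deviations from an assumed mean A)
A = 30
dx = [x - A for x in ages]
sd_indirect = sqrt(sum(d * d for d in dx) / n - (sum(dx) / n) ** 2)

print("Range:", value_range, "Coefficient of range:", round(coeff_range, 4))
print("S.D. direct:", round(sd_direct, 4), "S.D. indirect:", round(sd_indirect, 4))
```

Both routes give the same standard deviation, whichever assumed mean is chosen.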
SKEWNESS
Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A distribution, or data set,
is symmetric if it looks the same to the left and right of the center point.
Skewness is positive if the tail on the right side of the distribution is longer or fatter than the tail on the
left side. The mean and median of positively skewed data will be greater than the mode. Skewness is
negative if the tail of the left side of the distribution is longer or fatter than the tail on the right side. The
mean and median of negatively skewed data will be less than the mode. If the data graph symmetrically,
the distribution has zero skewness, regardless of how long or fat the tails are.
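Pearson's first coefficient of skewness uses the mode: $Sk_1 = \dfrac{\bar{x} - Mo}{s}$
Where $\bar{x}$ = the mean, Mo = the mode and s = the standard deviation for the sample.
Pearson's second coefficient of skewness uses the median: $Sk_2 = \dfrac{3(\bar{x} - Md)}{s}$
Where $\bar{x}$ = the mean, Md = the median and s = the standard deviation for the sample.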
It is generally used when you don’t know the mode.
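A minimal sketch of Pearson's two coefficients of skewness, assuming the usual forms (mean - mode) / s and 3 (mean - median) / s with s the sample standard deviation. Both data sets are hypothetical.

```python
import statistics

def pearson_skewness(values):
    """Return (Sk1, Sk2): mode-based and median-based coefficients."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)               # sample standard deviation
    sk1 = (mean - statistics.mode(values)) / s
    sk2 = 3 * (mean - statistics.median(values)) / s
    return sk1, sk2

# Hypothetical data with a long right tail
right_tailed = [2, 3, 3, 3, 4, 5, 9, 12]
sk1, sk2 = pearson_skewness(right_tailed)
print("Sk1 =", round(sk1, 3), "Sk2 =", round(sk2, 3))

# Hypothetical symmetric data: mean equals median, so Sk2 is zero
symmetric = [1, 2, 3, 4, 5]
sk2_symmetric = 3 * (statistics.mean(symmetric) - statistics.median(symmetric)) / statistics.stdev(symmetric)
print("Sk2 (symmetric) =", sk2_symmetric)
```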
TIME SERIES ANALYSIS
“A Time Series” is a series of statistical data recorded in accordance with their time of occurrence. Here
it is noted that it is a set of observation taken at specified times usually (but not always) at equal
intervals. Thus a set of data depending on the time (which may be year, quarter, month, day etc.) is
called a “Time Series”.
Today the use of time series analysis is not confined to economists and businessmen; it is extensively used by scientists, sociologists, biologists, geologists, research workers etc.
According to Patterson, “A time series consists of statistical data which are collected, recorded or observed over successive increments of time.”
i. It enables us to predict or forecast the behavior of the phenomenon in future, which is essential for business planning. On the basis of past information, the trend can be estimated and projections can be made for the uncertain future. It assists in reducing the risks and uncertainties of business and industry.
ii. It helps in the evaluation of current achievements: the progress made under a plan can be reviewed and evaluated on the basis of time series.
iii. It helps in the analysis of the past behavior of the phenomenon under consideration. What changes took place in the past, what factors were responsible for these changes, under what conditions these changes took place, etc. are issues which can be studied and analyzed by time series.
iv. It helps in making comparative studies of the values of different phenomena at different times or places. It provides a scientific basis for comparison by studying and isolating the effects of the various components of a time series.
v. The segregation and study of the various components of time series is of paramount importance
to a businessman in the planning of future operations and the formulation of executive and policy
decisions.
vi. On the basis of the past performance of the various sectors of economy, we can determine
future requirements and a suitable policy can be formulated to get desired and predetermined
objectives.
If the values of a phenomenon are observed at different periods of time, the values so obtained will
show appreciable variations.
The following factors generally affect any time series:
i. Changing of tastes, habits and fashions of the people.
ii. Changing of customs, conventions of the people.
iii. Rituals and festivals.
iv. Political movements, government policies.
v. War, Famines, Drought, Flood, Earthquakes and Epidemic etc.
vi. Unusual weather or seasons.
The various forces affecting the values of a phenomenon in a time series may be broadly classified into
the following four categories, commonly known as the components of a time series.
i. Secular Trend (i.e. long-term smooth, regular movement)
ii. Seasonal Variation (periodic movement, the period being not greater than one year)
iii. Cyclical Variation (periodic movement with period greater than one year)
iv. Irregular or Random Variation.
1. Secular Trend: - There may be violent variations in a time series during a short span of time, but in the long run it has a tendency either to rise or to fall. This tendency may be upward or downward over a long period of time. It is known as the ‘secular trend’ or ‘simple trend’. It is only natural that population growth, technological progress, medical facilities, production, prices etc. cannot be judged over a day, a month or a year. The movements are upward, downward or constant over a fairly long period.
Seasonal Variation: When we read the word season, the first things that come to mind are spring, summer, autumn and winter. Generally, seasonal variations occur due to changes in weather conditions, customs, traditions, fashion etc.
Seasonal variations represent a periodic movement where the period is not longer than one year. The
factors, which mainly cause this type of variation in time series, are the climatic changes of the different
seasons. For example
i. Sales of woolens go up in winter.
ii. Sales of raincoats and umbrellas go up in the rainy season.
iii. Prices of food grains decrease with the arrival of the new crop.
iv. Sales of coolers, refrigerators etc. rise during the summer season.
Another variation occurs due to man-made conventions and customs, which people follow at different times, like Durga Pooja, Dussehra, Deepawali, Eid, X-mas etc. The seasonal variations may take place per day, per week or per month. For example:
i. Sales of departmental stores go up during festivals.
ii. Sales of clothes and jewelry pick up during the marriage season.
iii. Sales of paint, furniture and electronics go up during festivals like Deepawali, Eid, X-mas etc.
iv. Sales of vehicles increase considerably during Durga Pooja and Dussehra.
Cyclical Variations: Most business activities are characterized by the recurrence of periods of prosperity and slump, constituting a business cycle. Cyclical variations are another type of periodic movement, with a period of more than one year. Such movements are fairly regular and oscillatory in nature. One complete period is called a ‘cycle’. Cyclical variations are not as regular as seasonal variations, but the sequence of changes, marked by prosperity, decline, depression and recovery, remains more or less regular.
In the multiplicative model, the four components are not necessarily independent of each other. In fact, the model presumes that their effects are interdependent:
U = T × S × C × R
where U is the observed value and T, S, C and R are the trend, seasonal, cyclical and random components.
Measurement of Trend or Secular Trend
The different methods of determining the trend component of a time series are:
1. Moving Average Method: The moving average method is very commonly used for isolating the trend and smoothing out fluctuations in a time series. In this method, a series of arithmetic means of successive observations, known as moving averages, is calculated from the given data.
Working Rule
i. Add the values of the first 3 years (namely 1979, 1980, 1981, i.e., 80+90+70 = 240) and place the total against the middle year 1980.
ii. Leave the first year’s value and add up the values of the next 3 years (i.e., 1980, 1981, 1982, viz., 90+70+60 = 220) and place the total against the middle year, i.e., year 1981.
Illustration 2: Calculate the 5-yearly and 7-yearly moving averages for the following data:
Year : 1981 1982 1983 1984 1985 1986 1987 1988 1989 1990
Sales (‘000 Rs.) : 123 140 110 98 104 133 95 105 150 135
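The working rule can be applied to the data of this illustration with a short script. It is a sketch only: the function name `moving_average` is an assumption of the example, and each average is placed against the middle year of its window, as the working rule describes for odd periods.

```python
def moving_average(values, window):
    """Arithmetic means of successive overlapping groups of `window` values."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

years = list(range(1981, 1991))
sales = [123, 140, 110, 98, 104, 133, 95, 105, 150, 135]   # in '000 Rs.

ma5 = moving_average(sales, 5)
ma7 = moving_average(sales, 7)

# For an odd window the average is placed against the middle year
print("5-yearly moving averages")
for year, avg in zip(years[2:], ma5):
    print(year, round(avg, 2))

print("7-yearly moving averages")
for year, avg in zip(years[3:], ma7):
    print(year, round(avg, 2))
```

The longer the period of the moving average, the fewer trend values are obtained: six for the 5-yearly and four for the 7-yearly average here.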
Step 2 : Leave the first year's value, then add the values of the next four years and place the total between the 3rd and 4th years. Continue this process until the last year is taken into account.
Step 3 : Divide the 4-yearly moving totals by 4. This gives the 4-yearly moving averages.
Step 4 : Add the first two moving averages and divide by 2 to get the centered moving average. Place it against the 3rd year. Leave the first moving average, then add the next two moving averages and divide by 2 to get the next centered moving average. Place it against the 4th year. Continue this process till the last moving average is taken into account.
Alternatively, the moving totals themselves may be centered: add the next two moving totals to get the next centered moving total and place it against the 4th year, continuing in the same way.
Illustration: Construct a four-yearly centered moving average from the following data :
Year : 1970 1975 1980 1985 1990 1995 2000
Imported Cotton (in ‘000) : 129 131 106 91 95 84 93
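The centering steps can be traced on the cotton data of this illustration. The sketch below follows the route of averaging two adjacent 4-period moving averages; the function name is an assumption of the example.

```python
def centered_moving_average(values, window=4):
    """Even-period moving averages, then the mean of each adjacent pair."""
    averages = [sum(values[i:i + window]) / window
                for i in range(len(values) - window + 1)]
    return [(a + b) / 2 for a, b in zip(averages, averages[1:])]

years = [1970, 1975, 1980, 1985, 1990, 1995, 2000]
cotton = [129, 131, 106, 91, 95, 84, 93]   # imported cotton, in '000

cma = centered_moving_average(cotton, 4)

# The first centered value belongs to the 3rd period, the next to the 4th, and so on
for year, value in zip(years[2:], cma):
    print(year, value)
```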
i. $\sum (y - y_c) = 0$, i.e., the sum of the deviations of the actual values of y from the computed values $y_c$ is zero; in other words, the sum of the deviations from the line of best fit is zero.
ii. $\sum (y - y_c)^2$ is least, i.e., the sum of the squares of the deviations of the actual values from the computed values of y is least.
That is why it is called the method of least squares and the line obtained by this method is called the ‘line of best fit’.
This method may be used to fit either a straight line trend or a parabolic trend. A straight line trend is represented by the equation y = a + bx, where y represents the estimated values of the trend and x represents the deviations in the time period; a and b are constants.
‘a’ represents the intercept of the line on the y-axis and ‘b’ represents the slope of the line, i.e., the change in the value of y per unit change in the value of x. If b > 0 it shows a growth rate and if b < 0 it shows a rate of decline.
Merits:
1. This is the only method of measuring trend which provides future values that are convincing and reliable.
2. This method is used for forecasting the series.
3. For example, if other factors do not have much effect on the share market, this method can provide very reliable information about the movement of the share price of a company.
4. This method has no scope for personal bias of the investigator.
5. It is the only method which gives the rate of growth per annum.
Demerits:-
1. The method requires mathematical ability. Sometimes it involves tedious and complicated calculations.
2. The method has no flexibility, i.e., if even a single term is added to the series, all the calculations have to be done again.
3. Estimations and predictions by this method are based only on long term variations; the impact of cyclical, seasonal and irregular variations is completely ignored.
Normal equations:
$\sum Y = na + b \sum X$
$\sum XY = a \sum X + b \sum X^2$
Remarks :- The variable X can be measured from any point of time as origin. But if the middle time period is taken as origin and deviations are taken from it, then $\sum X = 0$ and the first normal equation reduces to
$\sum Y = na + b \sum X = na + 0 = na$. Thus $a = \dfrac{\sum Y}{n}$
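With the origin at the middle period, the two normal equations separate and give a and b directly. The sketch below assumes an odd number of periods; the five yearly values are hypothetical, and b = ΣXY / ΣX² follows from the second normal equation once ΣX = 0.

```python
def fit_linear_trend(values):
    """Least-squares trend y = a + b*x with x measured from the middle period."""
    n = len(values)
    if n % 2 == 0:
        raise ValueError("this sketch assumes an odd number of periods")
    middle = n // 2
    x = [i - middle for i in range(n)]            # ..., -2, -1, 0, 1, 2, ...
    a = sum(values) / n                           # from sum(Y) = n*a
    b = sum(xi * yi for xi, yi in zip(x, values)) / sum(xi * xi for xi in x)
    return a, b, x

# Hypothetical production figures for five consecutive years
production = [12, 15, 14, 18, 21]
a, b, x = fit_linear_trend(production)
trend = [a + b * xi for xi in x]
residual_sum = sum(y - yc for y, yc in zip(production, trend))

print("a =", a, "b =", round(b, 2))
print("Trend values:", [round(t, 2) for t in trend])
print("Sum of deviations from the trend line:", round(residual_sum, 10))
```

The sum of the deviations of the actual values from the trend values comes out as zero, which is the first property of the line of best fit.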
UNIT-V
CORRELATION
Introduction
1. Correlation is a statistical tool which enables us to measure and analyse the degree or extent to which two or more variables fluctuate/vary/change with respect to each other.
2. For example, demand is affected by price and price in turn is also affected by demand. Therefore we can say that demand and price are affected by each other and hence are correlated.
Other examples of correlated variables are –
3. While studying correlation between 2 variables we should make clear that there must be a cause and effect relationship between these variables. For e.g., when the price of a certain commodity changes (↑ or ↓), its demand also changes (↓ or ↑), so there is a cause & effect relationship between demand and price and thus correlation exists between them. Take another e.g. where the height of students as well as the height of a tree increases. One cannot call it a case of correlation, because neither is the height of the students affected by the height of the tree nor is the height of the tree affected by the height of the students. There is no cause & effect relationship between these 2, so no correlation exists between these 2 variables.
4. In correlation both the variables may be mutually influencing each other, so neither can be designated as the cause and the other the effect, for e.g.
Price → Demand
Demand → Price
So, both price & demand are affected by each other; therefore we cannot tell in a real sense which one is the cause and which one is the effect.
DEFINITIONS OF CORRELATION
1. “If 2 or more quantities vary in sympathy, so that movements in one tend to be accompanied by corresponding movements in the other(s), then they are said to be correlated.” Connor
2. “Correlation means that between 2 series or groups of data there exists some causal connection.” W.I. King
3. “Analysis of co-variation between 2 or more variables is usually called correlation.” A.M. Tuttle
4. “Correlation analysis attempts to determine the degree of relationship between variables.” Ya-Lun Chou
TYPES OF CORRELATION
[Figures: two price-quantity graphs, with P = price per unit on one axis and Q = quantity on the other.]
So, supply and price are positively correlated. So, demand and price are negatively correlated.
LINEAR CORRELATION AND NON-LINEAR CORRELATION
1. Linear: In linear correlation, a unit change in the value of one variable produces a constant change in the value of the other variable. The graph of such a relationship is a straight line. E.g.: if the number of workers in a factory is doubled and the production output is also doubled, the correlation is linear.
Non-linear: In non-linear or curvilinear correlation, a unit change in the value of one variable does not produce a constant change in the value of the other variable. The graph of such a relationship is a curve. E.g.: the amount spent on advertisement will not bring a change in the amount of sales in the same ratio.
2. Linear positive: If the changes in 2 variables are in the same direction and in a constant ratio, it is linear positive correlation.
X : 2 4 6 8
Y : 3 6 9 12
Non-linear positive: If the changes in 2 variables are in the same direction but not in a constant ratio, the correlation is non-linear positive.
X : 50 55 60 90 100
Y : 10 12 15 30 45
3. Linear negative: If the changes in 2 variables are in opposite directions but in a constant ratio, the correlation is linear negative. For e.g., if every 5% increase in the price of a good is associated with a 10% decrease in demand, the correlation between price and demand would be linear negative.
X : 2 4 6 8 10
Y : 21 18 15 12 9
Non-linear negative: If the changes in 2 variables are in opposite directions and not in a constant ratio, the correlation is non-linear negative. For e.g., if every 5% increase in the price of a good is associated with a 20% to 10% decrease in demand, the correlation between price & demand would be non-linear negative.
X : 80 55 50 90
Y : 50 60 75 130
TYPE 1 [BASED ON KARL PEARSON’S COEFFICIENT OF CORRELATION]
Before we move to numericals, we should understand the basic notations & concepts:
$dx$ = deviation of an $x_i$ value from the mean = $(x_i - \bar{x})$
$\bar{x}$ = mean of x values [average of x values] = $\dfrac{\sum x_i}{n}$
$\bar{y}$ = mean of y values = $\dfrac{\sum y_i}{n}$
Variance of y values = $\dfrac{\sum (y_i - \bar{y})^2}{n}$
r or $r_{xy}$ = coefficient of correlation between x & y variables.
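Using these notations, Karl Pearson's coefficient is commonly written as $r = \dfrac{\sum dx\,dy}{\sqrt{\sum dx^2 \cdot \sum dy^2}}$ with $dy = (y_i - \bar{y})$; that form is assumed in the sketch below. The two data sets are the linear positive and linear negative examples tabulated earlier in these notes.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's r from deviations taken about the actual means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    return sum_dxdy / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

# Linear positive example from the notes
r_positive = pearson_r([2, 4, 6, 8], [3, 6, 9, 12])
# Linear negative example from the notes
r_negative = pearson_r([2, 4, 6, 8, 10], [21, 18, 15, 12, 9])

print("r (linear positive):", round(r_positive, 4))
print("r (linear negative):", round(r_negative, 4))
```

Perfectly linear data give r at the ends of its range, +1 for the positive case and -1 for the negative case.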
Deviation from assumed mean method (short-cut method)
This method is used where the mean of either series (x or y) is not a whole number, i.e., is a decimal value. In this case it is advisable to take deviations from an assumed mean rather than the actual mean and then use the above formula.
In the above short-cut method
Let, A = Assumed mean of x series
B = Assumed mean of y series
then $dx = (x_i - A)$ & $dy = (y_i - B)$ &
$dx^2 = (x_i - A)^2$ & $dy^2 = (y_i - B)^2$
$dx\,dy = (x_i - A)(y_i - B)$
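The short-cut method can be checked against the actual-mean method. The sketch assumes the usual short-cut formula $r = \dfrac{N \sum dx\,dy - \sum dx \sum dy}{\sqrt{[N \sum dx^2 - (\sum dx)^2][N \sum dy^2 - (\sum dy)^2]}}$; the data are the non-linear positive example from these notes, and the assumed means A = 60 and B = 15 are arbitrary choices.

```python
from math import sqrt

def r_actual_mean(xs, ys):
    """Pearson's r with deviations taken from the actual means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    dx = [x - mx for x in xs]
    dy = [y - my for y in ys]
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(
        sum(a * a for a in dx) * sum(b * b for b in dy))

def r_assumed_mean(xs, ys, A, B):
    """Pearson's r with deviations taken from assumed means A and B."""
    n = len(xs)
    dx = [x - A for x in xs]
    dy = [y - B for y in ys]
    numerator = n * sum(a * b for a, b in zip(dx, dy)) - sum(dx) * sum(dy)
    denominator = sqrt((n * sum(a * a for a in dx) - sum(dx) ** 2)
                       * (n * sum(b * b for b in dy) - sum(dy) ** 2))
    return numerator / denominator

xs = [50, 55, 60, 90, 100]
ys = [10, 12, 15, 30, 45]

r1 = r_actual_mean(xs, ys)
r2 = r_assumed_mean(xs, ys, A=60, B=15)
print(round(r1, 4), round(r2, 4))
```

Both methods return the same r, since the coefficient does not depend on the origin from which deviations are measured.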
REGRESSION ANALYSIS
The dictionary meaning of regression is “stepping back”. The term was first used by the British biometrician Sir Francis Galton (1822-1911) in 1877, in his study of the relationship between the heights of fathers and sons. In this study he described “that sons deviated less on the average from the mean height of the race than their fathers; whether the fathers were above or below the average, the sons tended to go back or regress towards the average”. Regression thus expresses the average relationship between two or more variables in terms of the original units of the data.
Meaning
Regression Analysis is a statistical tool to study the nature and extent of the functional relationship between two or more variables and to estimate the unknown values of the dependent variable from the known values of the independent variable.
Dependent Variable – The variable which is predicted on the basis of another variable is called the dependent or explained variable (usually denoted as Y)
Independent Variable – The variable which is used to predict another variable is called the independent variable (usually denoted as X)
Definition
“Statistical techniques which attempt to establish the nature of the relationship between variables and thereby provide a mechanism for prediction and forecasting are known as regression analysis.”
– Ya-Lun Chou
Importance/uses of Regression Analysis
Forecasting
Utility in Economic and business area
Indispensable for good planning
Useful for statistical estimates
Study of more than two variables possible
Determination of the rate of change in a variable
Measurement of degree and direction of correlation
Applicable in problems having a cause and effect relationship
Regression analysis helps to estimate errors
Regression coefficients (bxy & byx) make it possible to calculate the coefficient of determination (r²) & the coefficient of correlation (r)
Regression Lines
The lines of best fit expressing mutual average relationship between two variables are known as
regression lines – there are two lines of regression
2. When there is no correlation (r = 0), both the lines will cut each other at right angles.
3. Where there is a higher degree of correlation, say r = ±0.70 or more, the two regression lines will be close to each other, whereas with a lower degree of correlation, say r = ±0.10 or less, the two regression lines will be far apart from each other.
The correlation and regression analysis, both, help us in studying the relationship between two variables
yet they differ in their approach and objectives. The choice between the two depends on the purpose of
analysis.
S.NO 1, BASE: MEANING
CORRELATION : Correlation means a relationship between two or more variables in which movements in one have corresponding movements in the other.
REGRESSION : Regression means stepping back or returning to the average value, i.e., it expresses the average relationship between two or more variables.
S.NO 2, BASE: RELATIONSHIP
CORRELATION : Correlation need not imply a cause and effect relationship between the variables under study.
REGRESSION : Regression analysis clearly indicates the cause and effect relationship. The variable(s) constituting the cause(s) is taken as the independent variable(s) and the variable constituting the effect is taken as the dependent variable.
S.NO 3, BASE: OBJECT
CORRELATION : Correlation is meant for the co-variation of the two variables. The degree of their co-variation is also reflected in correlation, but correlation does not study the nature of the relationship.
REGRESSION : Regression tells us about the relative movement in the variables. We can predict the value of one variable by taking into account the value of the other variable.
S.NO 4, BASE: NATURE
CORRELATION : There may be nonsense correlation, where the correlation between the variables has no practical relevance.
REGRESSION : There is nothing like nonsense regression.
S.NO 5, BASE: MEASURE
CORRELATION : The correlation coefficient is a relative measure of the linear relationship between X and Y. It is a pure number lying between -1 and +1.
REGRESSION : The regression coefficient is an absolute measure representing the change in the value of a variable. We can obtain the value of the dependent variable.
S.NO 6, BASE: APPLICATION
CORRELATION : Correlation analysis has limited application as it is confined only to the study of linear relationships between the variables.
REGRESSION : Regression analysis studies linear as well as non-linear relationships between variables and therefore has much wider application.
Change in independent variable)
The constants a and b can be calculated through the normal equations:
x = a + by (summing both sides over all N observations)
$\sum x = Na + b \sum y$ (1)
REGRESSION EQUATIONS –
The regression equations express the regression lines. As there are two regression lines, there are two regression equations –
Explanation is given in formulae –
REGRESSION LINES
1. Regression equation of X on Y
$X - \bar{X} = b_{xy} (Y - \bar{Y})$
where $b_{xy}$ = regression coefficient of X on Y
2. Regression equation of Y on X
$Y - \bar{Y} = b_{yx} (X - \bar{X})$
where $b_{yx}$ = regression coefficient of Y on X
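The two regression equations can be built from a small data set. The sketch assumes the usual deviation formulas $b_{yx} = \sum dx\,dy / \sum dx^2$ and $b_{xy} = \sum dx\,dy / \sum dy^2$, with deviations taken from the actual means; the five pairs of values are hypothetical.

```python
def regression_coefficients(xs, ys):
    """Return (mean_x, mean_y, b_yx, b_xy) using deviations from actual means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    b_yx = sum_dxdy / sum(a * a for a in dx)   # regression coefficient of Y on X
    b_xy = sum_dxdy / sum(b * b for b in dy)   # regression coefficient of X on Y
    return mean_x, mean_y, b_yx, b_xy

# Hypothetical paired observations
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
mean_x, mean_y, b_yx, b_xy = regression_coefficients(xs, ys)

# Regression equation of Y on X:  Y - mean_y = b_yx (X - mean_x)
estimated_y = mean_y + b_yx * (6 - mean_x)      # estimate of Y when X = 6
# Regression equation of X on Y:  X - mean_x = b_xy (Y - mean_y)
estimated_x = mean_x + b_xy * (6 - mean_y)      # estimate of X when Y = 6

print("b_yx =", b_yx, "b_xy =", b_xy)
print("Estimated Y at X = 6:", round(estimated_y, 2))
print("Estimated X at Y = 6:", round(estimated_x, 2))
```

The equation of Y on X is used to estimate Y from a known X, and the equation of X on Y for the reverse estimate; the two are not interchangeable.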
REGRESSION COEFFICIENT – There are two regression coefficients, like the regression equations. They are bxy and byx.
Properties of regression coefficients –
Same sign – Both coefficients have the same sign, either positive or negative.
Both cannot be greater than one – If one regression coefficient is greater than one (unity), the other must be less than one.
Independent of origin – Regression coefficients are independent of change of origin but not of scale.
A.M. ≥ r – The arithmetic mean of the regression coefficients is greater than or equal to r.
r is G.M. – The correlation coefficient is the geometric mean of the regression coefficients.
r, bxy and byx – They all have the same sign.
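These properties can be verified numerically. The values b_yx = 0.6 and b_xy = 1.0 used below are hypothetical regression coefficients chosen for the check; r is recovered as their geometric mean and given their common sign.

```python
from math import sqrt, copysign

b_yx = 0.6   # hypothetical regression coefficient of Y on X
b_xy = 1.0   # hypothetical regression coefficient of X on Y

# r is the geometric mean of the two coefficients, with their common sign
r = copysign(sqrt(b_yx * b_xy), b_yx)
arithmetic_mean = (b_yx + b_xy) / 2

same_sign = (b_yx > 0) == (b_xy > 0)
product_at_most_one = b_yx * b_xy <= 1      # both cannot exceed one
am_not_less_than_r = arithmetic_mean >= abs(r)

print("r =", round(r, 4))
print("Same sign:", same_sign)
print("Product <= 1:", product_at_most_one)
print("A.M. >= r:", am_not_less_than_r)
```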
INDEX NUMBERS
Index numbers are devices which measure the change in the level of a phenomenon with respect to
time, geographical location or some other characteristic. The first index number was constructed in the
year 1764 by an Italian named Carli to compare the changes in the price for the year 1750 with the
price level of the year 1500. Today, changes in production, consumption, exports,
imports, national income, cost of living, incidence of crimes, number of road accidents, business failures
and a very wide variety of other phenomena are studied with the help of index numbers. Index
numbers are often described as barometers which measure the change in the level of a phenomenon.
“An index number is a statistical measure designed to show changes in a variable or a group of related
variables with respect to time, geographical location or other characteristics.”
2. Index numbers study the effects of factors which cannot be measured directly: they are meant to
study the changes in the effects of such factors.
3. Index numbers bring out the common characteristics of a group of items.
4. Index numbers measure only relative changes in the values of a phenomenon.
Where
$P_{01}$ = Index number of the current year
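Simple Average of Relatives Method.
$P_{01} = \frac{\sum \left( \frac{p_1}{p_0} \times 100 \right)}{N}$
or
$P_{01} = \frac{\sum P}{N}$
Here $P = \frac{p_1}{p_0} \times 100$ is the price relative of each item and N is the number of items.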
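As a minimal sketch of the simple average of relatives method, assuming the arithmetic mean is used to average the price relatives (p1 / p0 × 100): the function name and prices are hypothetical.

```python
def simple_average_of_relatives(base_prices, current_prices):
    """Average the price relatives (p1 / p0 * 100) of all items."""
    relatives = [p1 / p0 * 100 for p0, p1 in zip(base_prices, current_prices)]
    return sum(relatives) / len(relatives)


# Hypothetical prices of three commodities (not taken from the notes).
base_prices = [10, 20, 40]      # p0: base year prices
current_prices = [15, 30, 50]   # p1: current year prices
index = simple_average_of_relatives(base_prices, current_prices)
print(round(index, 2))
```

Because each relative is a pure ratio, the result does not depend on the units in which the individual commodities are quoted.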
Splicing: Sometimes a series of index numbers based on a certain year is discontinued and a new series of
index numbers is prepared by taking another year as base. Two series of index numbers thus
result. The index numbers of these two series are not comparable because they are based
on different years. If they are to be compared, the new series must be converted to the basis of the old series
or vice-versa; this conversion/shifting is called splicing. Splicing may be taken as another form of
base shifting.
Formula for splicing :-
a. Splicing of new series in old series (Forward splicing):
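Forward splicing is commonly computed as: spliced index = (index number of the new series × old-series index number of the new base year) / 100. The sketch below works under that assumption; the function name, years and figures are hypothetical.

```python
def splice_forward(new_series, old_index_at_new_base):
    """Express a new index series on the base of the old series.

    new_series: mapping of year -> index number (new base year = 100)
    old_index_at_new_base: old-series index number for the new base year
    """
    return {
        year: value * old_index_at_new_base / 100
        for year, value in new_series.items()
    }


# Hypothetical series: the old series ends in 2002, the new one starts there.
old_series = {2000: 100, 2001: 120, 2002: 150}
new_series = {2002: 100, 2003: 110, 2004: 120}

spliced = splice_forward(new_series, old_series[2002])
print(spliced)
```

The spliced value for the overlapping year equals the last value of the old series, so the combined series is continuous at the joining point.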
2. Time Reversal Test- In the words of Fisher: “The test is that the formula for calculating an
index number should be such that it will give the same ratio between one point of comparison and the
other, no matter which of the two is taken as base.” This means that the index number should work both
backwards and forwards. Thus, if the index number of the current year is 400, then the index
number of the base year (based on the current year) should be 25. In other words, the two index
numbers thus calculated (without the factor 100) should be reciprocals of each other: the reciprocal of
4 is 0.25 and the reciprocal of 0.25 is 4. The product of these two ratios is always equal to one.
Thus, if $P_{01}$ represents the price change in the current year and $P_{10}$ the price change of the base
year (based on the current year), the following equation should be satisfied:-
$P_{01} \times P_{10} = 1$
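The test can be tried numerically by computing an index forwards and backwards and multiplying the two results. The sketch below assumes the standard Laspeyres (base-year weights), Paasche (current-year weights) and Fisher ideal formulas, expressed as ratios without the factor 100; the first two are not named in these notes. The data are hypothetical.

```python
from math import sqrt, isclose


def aggregate(prices, quantities):
    """Sum of price * quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))


def laspeyres(p_base, q_base, p_cur, q_cur):
    """Base-year quantities as weights (q_cur is unused by this formula)."""
    return aggregate(p_cur, q_base) / aggregate(p_base, q_base)


def paasche(p_base, q_base, p_cur, q_cur):
    """Current-year quantities as weights (q_base is unused by this formula)."""
    return aggregate(p_cur, q_cur) / aggregate(p_base, q_cur)


def fisher(p_base, q_base, p_cur, q_cur):
    """Geometric mean of the Laspeyres and Paasche ratios."""
    return sqrt(laspeyres(p_base, q_base, p_cur, q_cur)
                * paasche(p_base, q_base, p_cur, q_cur))


def satisfies_time_reversal(index_fn, p0, q0, p1, q1):
    """Check P01 * P10 = 1 by swapping the roles of the two years."""
    forward = index_fn(p0, q0, p1, q1)   # P01
    backward = index_fn(p1, q1, p0, q0)  # P10
    return isclose(forward * backward, 1.0, rel_tol=1e-9)


# Hypothetical prices and quantities of two commodities.
p0, q0 = [2, 5], [10, 4]   # base year
p1, q1 = [3, 6], [8, 5]    # current year

laspeyres_ok = satisfies_time_reversal(laspeyres, p0, q0, p1, q1)
fisher_ok = satisfies_time_reversal(fisher, p0, q0, p1, q1)
print(laspeyres_ok, fisher_ok)
```

On these figures the Laspeyres product is 1.35 × 41/54 = 1.025, so it fails the test, while the Fisher product is exactly one.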
3. Factor Reversal Test- In the words of Fisher: “Just as each formula should permit the interchange of
the two times without giving inconsistent results, so it ought to permit interchanging the prices and
quantities without giving inconsistent results, i.e., the two results multiplied together should give
the true value ratio.” It means that the change in price multiplied by the change in quantity should
be equal to the total change in value. A change in value is the result of changes in price and in
quantity, so the product of these two changes should represent the total change in value. Thus, if the
price of a commodity has doubled during a certain period and the quantity has trebled in the same
period, the total value should be six times the former level. In other words, if $p_1$ and $p_0$
represent the prices and $q_1$ and $q_0$ the quantities in the current and the base years respectively,
and if $P_{01}$ represents the change in price in the current year and $Q_{01}$ the change in quantity
in the current year, then
$P_{01} \times Q_{01} = \frac{\sum p_1 q_1}{\sum p_0 q_0}$
Among the commonly used formulae, the factor reversal test is satisfied only by Fisher’s Ideal Index Number.
The proof of it is given below:
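A sketch of the usual argument, assuming Fisher's Ideal Index in its standard form (the geometric mean of the base-weighted and current-weighted aggregative indices, a form not spelled out in these notes) and omitting the factor 100. The quantity index $Q_{01}$ is obtained from $P_{01}$ by interchanging p and q.

```latex
\begin{align*}
P_{01} &= \sqrt{\frac{\sum p_1 q_0}{\sum p_0 q_0} \times \frac{\sum p_1 q_1}{\sum p_0 q_1}},
\qquad
Q_{01} = \sqrt{\frac{\sum q_1 p_0}{\sum q_0 p_0} \times \frac{\sum q_1 p_1}{\sum q_0 p_1}} \\[6pt]
P_{01} \times Q_{01}
&= \sqrt{\frac{\sum p_1 q_0}{\sum p_0 q_0} \times \frac{\sum p_1 q_1}{\sum p_0 q_1}
   \times \frac{\sum p_0 q_1}{\sum p_0 q_0} \times \frac{\sum p_1 q_1}{\sum p_1 q_0}} \\[6pt]
&= \sqrt{\frac{\left(\sum p_1 q_1\right)^2}{\left(\sum p_0 q_0\right)^2}}
 = \frac{\sum p_1 q_1}{\sum p_0 q_0}
\end{align*}
```

In the second line $\sum p_1 q_0$ and $\sum p_0 q_1$ each appear once in the numerator and once in the denominator and cancel, leaving the ratio of current-year value to base-year value.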
Circular Test
Another test applied in index number studies is the circular test. It is a sort of extension of the time
reversal test. Suppose an index number is constructed for the year 1983 with the base of 1982 and
another index number for 1982 on the base of 1981; then it should be possible for us to directly get an
index number for 1983 on the base of 1981. If the index number calculated directly does not give an
inconsistent value, the circular test is said to be satisfied. If $P_{01}$ represents the price change of the
current year on the base year, $P_{12}$ the price change of the base year on some other base and $P_{20}$ the
price change of the current year on this second base, then the following equation should be satisfied:
$P_{01} \times P_{12} \times P_{20} = 1$
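The circular test can be tried on three periods by multiplying the indices round the circle 0 → 1 → 2 → 0 and checking that the product is one. The sketch below assumes two standard formulas that are not defined in these notes: the simple aggregative index (Σp of the later period / Σp of the earlier period) and the Laspeyres index with base-period quantity weights. The data are hypothetical.

```python
from math import isclose


def simple_aggregative(p_base, p_cur):
    """Ratio of the sum of current prices to the sum of base prices."""
    return sum(p_cur) / sum(p_base)


def laspeyres(p_base, q_base, p_cur):
    """Aggregative index weighted by the quantities of the base period."""
    numerator = sum(p * q for p, q in zip(p_cur, q_base))
    denominator = sum(p * q for p, q in zip(p_base, q_base))
    return numerator / denominator


# Hypothetical prices and quantities of two commodities in periods 0, 1, 2.
prices = {0: [2, 5], 1: [3, 6], 2: [4, 6]}
quantities = {0: [10, 4], 1: [8, 5], 2: [7, 6]}

# P01 * P12 * P20 for the simple aggregative index.
simple_product = (simple_aggregative(prices[0], prices[1])
                  * simple_aggregative(prices[1], prices[2])
                  * simple_aggregative(prices[2], prices[0]))

# P01 * P12 * P20 for the Laspeyres index; the weights change at each step.
laspeyres_product = (laspeyres(prices[0], quantities[0], prices[1])
                     * laspeyres(prices[1], quantities[1], prices[2])
                     * laspeyres(prices[2], quantities[2], prices[0]))

simple_ok = isclose(simple_product, 1.0, rel_tol=1e-9)
laspeyres_ok = isclose(laspeyres_product, 1.0, rel_tol=1e-9)
print(simple_ok, laspeyres_ok)
```

On these figures the simple aggregative index satisfies the test, while the Laspeyres index does not, because its weights change from one link of the circle to the next.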