BUSINESS DECISIONS
(MCM1C03)
STUDY MATERIAL
I SEMESTER
CORE COURSE
M.Com.
(2019 Admission onwards)
UNIVERSITY OF CALICUT
SCHOOL OF DISTANCE EDUCATION
CALICUT UNIVERSITY P.O.
MALAPPURAM - 673 635, KERALA
190603
School of Distance Education
University of Calicut
Study Material
First Semester
M.Com.
(2019 Admission onwards)
CORE COURSE:
MCM1C03: QUANTITATIVE TECHNIQUES FOR
BUSINESS DECISIONS.
Prepared by:
VINEETHAN T.
Assistant Professor
Department of Commerce
Govt. College, Madappally.
Scrutinized by:
Dr. E.K. SATHEESH
Professor
Department of Commerce & Management Studies
University of Calicut.
Disclaimer
"The author(s) shall be solely responsible
for the content and views expressed in this
book"
CONTENTS
Chapter No.   Description
1    Introduction to Quantitative Techniques
2    Correlation Analysis
3    Regression Analysis
4    Probability Distributions
5    Binomial Distribution
6    Poisson Distribution
7    Normal Distribution
8    Exponential Distribution
9    Uniform Distribution
10   Statistical Inferences
11   Chi-Square Test
12   Analysis of Variance
13   Non-parametric Tests
14   Sample Size Determination
15   Statistical Estimation
16   Software for Quantitative Methods
MCM1C03 : QUANTITATIVE TECHNIQUES FOR BUSINESS DECISIONS
CHAPTER 1
INTRODUCTION TO QUANTITATIVE
TECHNIQUES
Quantitative techniques are a powerful tool with which
business organizations can augment production, maximize
profits, minimize costs, and orient production methods
towards certain pre-determined objectives. They are used to
solve many of the problems that arise in business and
industry. In the relatively recent past, a large number of
business problems have been given a quantitative
representation with a considerable degree of success. This has
drawn business executives and public administrators alike
towards the study of these techniques.
Managerial activities have become complex, and it is
necessary to make the right decisions to avoid heavy losses.
Whether it is a manufacturing unit or a service organization,
resources have to be utilized to the maximum in an efficient
manner. The future is uncertain and fast changing, and
decision-making, a crucial activity, cannot be carried out on a
trial-and-error basis or by using a rule-of-thumb approach.
In such situations, there is a greater need for applying
scientific methods to decision-making to increase the
probability of coming up with good decisions. Quantitative
Technique is a scientific approach to managerial
decision-making. The successful use of Quantitative Technique for
Differentials
Differential is a mathematical process of finding out changes
in the dependent variable with reference to a small change in the
independent variable. It involves differential coefficients of
dependent variables with or without variables.
Integration
It is a technique that reverses the process of differentiation.
It involves the formula ∫ f(x) dx, where f(x) is the function to be
integrated.
Statistical techniques
They are techniques which are used in conducting statistical
inquiry concerning a certain phenomenon. They include all the
statistical methods beginning from the collection of data till
interpretation of those collected data. Important statistical techniques
include collection of data, classification and tabulation, measures of
central tendency, measures of dispersion, skewness and kurtosis,
correlation, regression, interpolation and extrapolation, index
numbers, time series analysis, statistical quality control, ratio
analysis, probability theory, sampling technique, variance
analysis, theory of attributes etc.
Programming techniques
These techniques focus on model building, and are widely
applied by decision makers relating to business operations. In
programming, problem is formulated in numerical form, and a
suitable model is fitted to the problem and finally a solution is
derived. Prominent programming techniques include linear
programming, queuing theory, inventory theory, theory of games,
decision theory, network programming, simulation, replacement,
non-linear programming, dynamic programming, integer
programming, etc.
Forecasting
Quantitative techniques are useful in demand forecasting.
They provide a scientific basis of coping with the uncertainties of
future demand. Demand forecasts serve as the basis for capacity
planning. Quantitative technique enables a manager to adopt the
minimum risk plan.
Inventory control
Inventory planning techniques help in deciding when to buy
and how much to buy. It enables management to arrive at
appropriate balance between the costs and benefits of holding stocks.
The integrated production models technique is very useful in
minimizing costs of inventory, production and workforce.
Statistical quality controls help us to determine whether the
production process is under control or not.
Applications of quantitative techniques in business
operations
Quantitative techniques are widely applied for solving
decision problems in the routine operations of business organizations.
They are especially useful for business managers, economists,
statisticians, administrators, technicians and others in the fields of
business, agriculture, industry, services and defense. They have
specific applications in the following functional areas of business
organizations.
Planning
In planning, quantitative techniques are applied to determine
size and location of plant, product development, factory
construction, installation of equipment and machineries etc.
Purchasing
Quantitative techniques are applied in make or buy
Mathematical techniques
They are quantitative techniques in which numerical data are
used along with the principles of mathematics such as integration,
calculus, etc. They include permutations, combinations, set theory,
matrix analysis, differentials, integration, etc.
Permutations and combinations
Permutation is a mathematical device for finding the possible
number of arrangements or groups which can be made of a certain
number of items from a set of observations. They are groupings
that take the order of arrangement into account.
Combinations are number of selections or subsets which can
be made of a certain number of items from a set of observations,
without considering order. Both combinations and permutations help
in ascertaining total number of possible cases.
Set theory
It is a modern mathematical device which solves the various
types of critical problems on the basis of sets and their operations
like Union, intersection etc.
Matrix Algebra
Matrix is an orderly arrangement of certain given numbers or
symbols in rows and columns. Matrix analysis is thus a
mathematical device of finding out the results of different types of
algebraic operations on the basis of relevant matrices. This is useful
to find values of unknown numbers connected with a number of
simultaneous equations.
CHAPTER 2
CORRELATION ANALYSIS
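OR
\[ r = \frac{n\sum d_x d_y - (\sum d_x)(\sum d_y)}{\sqrt{n\sum d_x^2 - (\sum d_x)^2}\,\sqrt{n\sum d_y^2 - (\sum d_y)^2}} \]
OR
\[ r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{n\sum X^2 - (\sum X)^2}\,\sqrt{n\sum Y^2 - (\sum Y)^2}} \]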
X 57 42 40 38 42 45 42 44 40 46 44 43
Y 10 26 30 41 29 27 27 19 18 19 31 29
Sol:
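X    Y    dx = X – 45   dy = Y – 25   dxdy   dx²   dy²
57   10   12            -15           -180   144   225
42   26   -3            1             -3     9     1
40   30   -5            5             -25    25    25
38   41   -7            16            -112   49    256
42   29   -3            4             -12    9     16
45   27   0             2             0      0     4
42   27   -3            2             -6     9     4
44   19   -1            -6            6      1     36
40   18   -5            -7            35     25    49
46   19   1             -6            -6     1     36
44   31   -1            6             -6     1     36
43   29   -2            4             -8     4     16
Σdx = -17, Σdy = 6, Σdxdy = -317, Σdx² = 277, Σdy² = 704
\[ r = \frac{(12 \times -317) - (-17 \times 6)}{\sqrt{(12 \times 277) - (-17)^2}\,\sqrt{(12 \times 704) - (6)^2}} \]
= (–3804 + 102) / √(3035 x 8412) = –3702 / 5052.76 = –0.733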
Course I    45 70 65 30 90 40 50 75 85 60
Course II   35 90 70 40 95 40 60 80 80 50
Sol:
Computation of Product Moment Correlation Coefficient
Course I (x)   Course II (y)   dx = (x – 60)   dy = (y – 60)   dxdy   dx²   dy²
45             35              -15             -25             375    225   625
70             90              10              30              300    100   900
65             70              5               10              50     25    100
30             40              -30             -20             600    900   400
90             95              30              35              1050   900   1225
40             40              -20             -20             400    400   400
50             60              -10             0               0      100   0
75             80              15              20              300    225   400
85             80              25              20              500    625   400
60             50              0               -10             0      0     100
Σdx = 10, Σdy = 40, Σdxdy = 3575, Σdx² = 3500, Σdy² = 4550
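The totals in the table above can be checked with a short script. This is a minimal sketch of the assumed-mean form of Karl Pearson's formula; the function name `pearson_r_assumed_mean` and the variable names are illustrative choices, not part of the study material.

```python
from math import sqrt

def pearson_r_assumed_mean(xs, ys, ax, ay):
    """Karl Pearson's r using deviations from assumed means ax and ay."""
    n = len(xs)
    dx = [x - ax for x in xs]
    dy = [y - ay for y in ys]
    s_dx, s_dy = sum(dx), sum(dy)
    s_dxdy = sum(a * b for a, b in zip(dx, dy))
    s_dx2 = sum(a * a for a in dx)
    s_dy2 = sum(b * b for b in dy)
    numerator = n * s_dxdy - s_dx * s_dy
    denominator = sqrt(n * s_dx2 - s_dx ** 2) * sqrt(n * s_dy2 - s_dy ** 2)
    return numerator / denominator

course_1 = [45, 70, 65, 30, 90, 40, 50, 75, 85, 60]
course_2 = [35, 90, 70, 40, 95, 40, 60, 80, 80, 50]

# Both assumed means are 60, as in the table above.
r = pearson_r_assumed_mean(course_1, course_2, 60, 60)
print(round(r, 4))
```

Because the formula corrects for the choice of assumed mean, any other pair of assumed means should give the same value of r.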
Coefficient of non-determination
= 1 – r² = 1 – 0.7225 = 0.2775 = 27.75%
Rank Correlation Method
When the variables cannot be measured in quantitative
terms, the coefficient of correlation can be found out by using
rank correlation method. Here ranks are to be assigned to the
individual observations. Ranks may be assigned in either
ascending or descending order. This method was designed by
Charles Edward Spearman in 1904. He suggested two
formulae for computing the rank correlation coefficient. The
rank correlation coefficient is denoted by ‘R’.
(1) When there is no equal rank:
\[ R = 1 - \frac{6\sum D^2}{n^3 - n} \]
Substituting ΣD² = 20 and n = 6:
R = 1 – [(6 x 20)/(6³ – 6)] = 1 – (120/210)
= 1 – 0.5714 = 0.4286
Qn: From the following data, compute Spearman’s Rank
Correlation Coefficient:
x 80 45 55 58 55 60 45 68 70 45 85
y 82 56 50 43 56 62 64 65 70 64 90
Sol:
This is the case of equal marks.
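x    y    R1    R2    D = R1 – R2   D²
80   82   10    10    0             0
45   56   2     3.5   -1.5          2.25
55   50   4.5   2     2.5           6.25
58   43   6     1     5             25
55   56   4.5   3.5   1             1
60   62   7     5     2             4
45   64   2     6.5   -4.5          20.25
68   65   8     8     0             0
70   70   9     9     0             0
45   64   2     6.5   -4.5          20.25
85   90   11    11    0             0
ΣD² = 79
\[ R = 1 - \frac{6\left[79 + (2 + 0.5 + 0.5 + 0.5)\right]}{1320} \]
= 1 – [6(79 + 3.5) ÷ 1320] = 1 – (495/1320)
= 1 – 0.375 = 0.625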
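The tied-rank calculation above can be reproduced programmatically. The sketch below assumes ranks are assigned in ascending order with tied values sharing the average rank, and that each group of m tied values adds (m³ – m)/12 to ΣD², which is how the correction terms 2, 0.5, 0.5 and 0.5 arise. The helper names are illustrative.

```python
from collections import Counter

def average_ranks(values):
    """Rank values in ascending order; tied values share the average rank."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every group of m tied values."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)

def spearman_r(xs, ys):
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    correction = tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * (sum_d2 + correction) / (n ** 3 - n)

x = [80, 45, 55, 58, 55, 60, 45, 68, 70, 45, 85]
y = [82, 56, 50, 43, 56, 62, 64, 65, 70, 64, 90]
R = spearman_r(x, y)
print(R)
```

When no values are tied, the correction is zero and the function reduces to the first formula, R = 1 – 6ΣD²/(n³ – n).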
Concurrent Deviation Method
This is a simple method for computing coefficient of
correlation. Here, we consider only the direction of change and
not the magnitude of change. The coefficient of correlation is
determined on the basis of number of concurrent deviations.
That is why this method is named as such. The coefficient of
concurrent deviation is denoted by rc.
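The formula for computing coefficient of concurrent
deviation is:
\[ r_c = \pm\sqrt{\pm\frac{2c - n}{n}} \]
where c = number of concurrent deviations and n = number of
pairs of deviations. The sign of (2c – n) is applied both
inside and outside the square root.
With c = 0 and n = 8:
rc = ± √± (2 x 0 – 8) / 8 = ± √± (0 – 8)/8 = – √(8/8) = –1
There is perfect negative correlation between x and y.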
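A minimal sketch of the method follows. The two series are made-up illustrative data (one steadily rising, one steadily falling), chosen so that c = 0 and n = 8 as in the calculation above. Treating two unchanged values as a concurrent pair is an assumption of this sketch.

```python
from math import sqrt

def concurrent_deviation(xs, ys):
    """Coefficient of concurrent deviation for two equal-length series."""
    def directions(series):
        # +1 for a rise, -1 for a fall, 0 for no change from the previous value
        return [(b > a) - (b < a) for a, b in zip(series, series[1:])]

    dx, dy = directions(xs), directions(ys)
    n = len(dx)                                    # pairs of deviations
    c = sum(1 for a, b in zip(dx, dy) if a == b)   # concurrent deviations
    inner = (2 * c - n) / n
    # The sign of (2c - n) is applied inside and outside the root.
    return sqrt(inner) if inner >= 0 else -sqrt(-inner)

x_demo = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # always rising (illustrative)
y_demo = [9, 8, 7, 6, 5, 4, 3, 2, 1]   # always falling (illustrative)
rc = concurrent_deviation(x_demo, y_demo)
print(rc)
```

Only the direction of each change enters the result, so the method ignores how large the changes are.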
PARTIAL CORRELATION
When there are more than two variables and we study
the relationship between any two variables only, assuming
other variables as constant, it is called partial correlation. For
example, the study of the relationship between rainfall and
agricultural produce, without taking into consideration the
effects of other factors such as quality of seeds, quality of soil,
use of fertilizer, etc.
Partial correlation coefficient measures the relationship
between one variable and one of the other variables assuming
that the effect of the rest of the variables is eliminated.
Suppose there are 3 variables namely x1, x2 and x3.
Here, we can find three partial correlation coefficients. They
are:
(1) Partial Correlation coefficient between x1 and x2, keeping
x3 as constant. This is denoted by r12.3
(2) Partial Correlation coefficient between x1 and x3, keeping
x2 as constant. This is denoted by r13.2
(3) Partial Correlation coefficient between x2 and x3, keeping
x1 as constant. This is denoted by r23.1
The formulae for computing the above partial
correlation coefficients are:
Qn: If r12 = 0.98, r13 = 0.44 and r23 = 0.54, find (1) r12.3,
(2) r13.2 and (3) r23.1
Sol:
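(1)
\[ r_{12.3} = \frac{r_{12} - r_{13} r_{23}}{\sqrt{1 - r_{13}^2}\,\sqrt{1 - r_{23}^2}} \]
\[ r_{12.3} = \frac{0.98 - (0.44 \times 0.54)}{\sqrt{1 - 0.44^2}\,\sqrt{1 - 0.54^2}} = \frac{0.98 - 0.2376}{\sqrt{1 - 0.1936}\,\sqrt{1 - 0.2916}} \]
= 0.7424 / (0.898 x 0.842) = 0.7424 / 0.7561 = 0.982
(2)
\[ r_{13.2} = \frac{r_{13} - r_{12} r_{23}}{\sqrt{1 - r_{12}^2}\,\sqrt{1 - r_{23}^2}} \]
\[ r_{13.2} = \frac{0.44 - (0.98 \times 0.54)}{\sqrt{1 - 0.98^2}\,\sqrt{1 - 0.54^2}} = \frac{0.44 - 0.5292}{\sqrt{1 - 0.9604}\,\sqrt{1 - 0.2916}} \]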
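The three partial correlation coefficients follow one pattern, so a single helper can evaluate all of them. This is a sketch only; the function name `partial_r` and its argument order are illustrative assumptions.

```python
from math import sqrt

def partial_r(r_ab, r_ac, r_bc):
    """Partial correlation between a and b, holding c constant."""
    return (r_ab - r_ac * r_bc) / (sqrt(1 - r_ac ** 2) * sqrt(1 - r_bc ** 2))

r12, r13, r23 = 0.98, 0.44, 0.54

r12_3 = partial_r(r12, r13, r23)   # x1 and x2, x3 held constant
r13_2 = partial_r(r13, r12, r23)   # x1 and x3, x2 held constant
r23_1 = partial_r(r23, r12, r13)   # x2 and x3, x1 held constant

print(round(r12_3, 3), round(r13_2, 3), round(r23_1, 3))
```

The first argument is always the simple correlation between the two variables of interest; the other two are their correlations with the variable held constant.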
MULTIPLE CORRELATION
When there are more than two variables and we study
the relationship between one variable and all the other
variables taken together, then it is the case of multiple
correlation. Suppose there are three variables, namely x, y and
z. The correlation between x and (y & z) taken together is
multiple correlation. Similarly, the relation between y and (x &
z) taken together is multiple correlation. Again, the relation
between z and (x & y) taken together is multiple correlation. In
all these cases, the correlation coefficient obtained will be
termed as coefficient of multiple correlation.
Suppose there are 3 variables namely x1, x2 and x3. Here,
we can find three multiple correlation coefficients. They are:
1. Multiple Correlation Coefficient between x1 on one side
and x2 and x3 together on the other side. This is denoted by
R1.23
2. Multiple Correlation Coefficient between x2 on one side
and x1 and x3 together on the other side. This is denoted by
R2.13
3. Multiple Correlation Coefficient between x3 on one side
and x1 and x2 together on the other side. This is denoted by
R3.12
The formulae for computing the above multiple
correlation coefficients are:
Qn: If r12 = 0.6, r23 = r13 = 0.8, find R1.23, R2.13 and R3.12 .
Sol:
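A sketch for checking this question, assuming the standard formula R1.23 = √[(r12² + r13² – 2 r12 r13 r23) / (1 – r23²)] and its analogues for R2.13 and R3.12. The helper name `multiple_r` is illustrative.

```python
from math import sqrt

def multiple_r(r_ab, r_ac, r_bc):
    """Multiple correlation of a with b and c taken together (R_a.bc)."""
    numerator = r_ab ** 2 + r_ac ** 2 - 2 * r_ab * r_ac * r_bc
    return sqrt(numerator / (1 - r_bc ** 2))

r12, r13, r23 = 0.6, 0.8, 0.8

R1_23 = multiple_r(r12, r13, r23)   # x1 with x2 and x3
R2_13 = multiple_r(r12, r23, r13)   # x2 with x1 and x3
R3_12 = multiple_r(r13, r23, r12)   # x3 with x1 and x2

print(round(R1_23, 4), round(R2_13, 4), round(R3_12, 4))
```

Under this formula a multiple correlation coefficient is never negative, since it is defined as a positive square root.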
REVIEW QUESTIONS:
1. What do you mean by correlation analysis?
2. Define correlation.
3. What is scatter diagram? What are its advantages?
4. What is coefficient of correlation? What are its properties?
5. What are the different types of correlation?
6. What do you mean by degree of correlation?
7. What is meant by positive and negative correlation?
8. What are the merits of Karl Pearson’s Coefficient of
Correlation?
9. What are the demerits of Karl Pearson’s Coefficient of
Correlation?
10. What is rank correlation?
11. What are the merits of rank correlation?
12. What are the demerits of rank correlation?
13. What is meant by correlation graph method?
14. What is concurrent deviation method?
15. What is the main drawback of concurrent deviation
method?
16. What is coefficient of determination?
Roll No. 1 2 3 4 5 6 7 8 9 10
Marks I 45 56 39 54 45 40 56 60 30 35
Marks II 40 56 30 44 36 32 45 42 20 36
27. If r12 = 0.7, r13 = 0.61, r23 = 0.4, find r12.3, r13.2 and r 23.1
28. If r12 = 0.98, r13 = 0.44, r23 = 0.54, find R1.23, R2.13 and R3.12
CHAPTER 3
REGRESSION ANALYSIS
Meaning and Definition of Regression Analysis
Correlation analysis helps to know whether two
variables are related or not. Once the relationship between two
variables is established, it may be used to predict the
unknown value of one variable on the basis of the known value
of the other. For this purpose, we have to examine the average
functional relationship that exists between the variables.
This is known as regression analysis.
Regression analysis may be defined as the process of
ascertaining the average functional relationship that exists
between variables so as to facilitate prediction, estimation
or forecasting. It helps to predict the unknown values of one
variable with the help of known values of the other variable.
The term regression was first used by Francis Galton.
Types of Regression
Regression may be classified as follows:
I. On the basis of number of variables:
(a) Simple Regression
(b) Multiple Regression
II. On the basis of Proportion of change in the variables:
(a) Linear Regression
(b) Non-linear Regression
1. Simple Regression
In a regression analysis, if there are only two variables,
it is called simple regression analysis.
2. Multiple Regression
In a regression analysis, if there are more than two
variables, it is called multiple regression analysis.
3. Linear Regression
In a regression analysis, if a linear relation exists
between the variables, it is called linear regression analysis.
Under this, when we plot the data on graph paper, we get a
straight line. Here, the relationship that exists between the
variables can be expressed in the form y = a + bx. In linear
regression, the change in the dependent variable is
proportionate to the change in the independent variable.
4. Non-linear Regression:
In case of non-linear regression, the relation between
the variables cannot be expressed in the form of y = a + bx.
When the data are plotted on a graph, the dots will be
concentrated, more or less, around a curve. This is also called
curvi-linear regression.
Regression Line (Line of Best Fit)
Regression line is a graphical method to show the
functional relationship between two variables, namely
dependent variable and independent variable. Since regression
line helps to estimate the unknown values of dependent
X 10 16 24 36 48
Y 20 12 32 40 55
Sol:
Regression equation X on Y : X = a + bY
Normal equations: ΣX = Na + bΣY, and ΣXY = aΣY + bΣY²
Regression equation Y on X : Y = a + bX
Normal equations: ΣY = Na + bΣX, and ΣXY = aΣX + bΣX²
Qn: From the following data, fit the two regression equations:
x 4 5 8 2 1
y 5 6 7 3 2
Sol:
Regression Equation X on Y is:
X = a + bY
The normal equations to find the values of ‘a’ and ‘b’
are:
ΣX = Na + bΣY, and
ΣXY = aΣY + bΣY²
20 = 5a + 23b .............................. (1)
114 = 23a + 123b .......................... (2)
(1) x 23 : 460 = 115a + 529b ....................... (3)
(2) x 5 : 570 = 115a + 615b ........................ (4)
(4) – (3) : 110 = 0 + 86b
86b = 110
b = 110/86 = 1.28
Substitute b = 1.28 in equation number (1)
20 = 5a + 23 x 1.28;
20 = 5a + 29.44; 5a = 20 – 29.44; 5a = – 9.44
a = -9.44/5 = – 1.89
Substitute the values of ‘a’ and ‘b’ in regression
equation X on Y:
X = – 1.89 + 1.28y
Regression Equation Y on X is:
Y = a + bX
The normal equations are:
ΣY = Na + bΣX, and
ΣXY = aΣX + bΣX²
23 = 5a + 20b ............................. (1)
114 = 20a + 110b ............................. (2)
(1) x 4 : 92 = 20a + 80b ............................. (3)
114 = 20a + 110b ............................. (2)
(2) – (3): 22 = 0 + 30b
30b = 22
b = 22/30 = 0.73
Substitute b=0.73 in equation (1)
23 = 5a + 20 x 0.73; 23 = 5a + 14.6
5a = 23 – 14.6 = 8.4; a = 8.4/5 = 1.68
Substitute a = 1.68 and b = 0.73 in the general form of
Y on X
Y = 1.68 + 0.73x
Regression Equation X on Y : X = – 1.89 + 1.28y
Regression Equation Y on X : Y = 1.68 + 0.73x
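The two pairs of normal equations can also be solved directly. The sketch below eliminates 'a' algebraically instead of by hand; the function name `fit_line` is an illustrative choice. Because it keeps full precision for b, its intercepts differ slightly from the hand-worked values, which were obtained after rounding b to two decimals.

```python
def fit_line(dep, indep):
    """Solve the two normal equations for dep = a + b * indep."""
    n = len(dep)
    s_dep, s_ind = sum(dep), sum(indep)
    s_prod = sum(d * i for d, i in zip(dep, indep))
    s_ind2 = sum(i * i for i in indep)
    b = (n * s_prod - s_dep * s_ind) / (n * s_ind2 - s_ind ** 2)
    a = (s_dep - b * s_ind) / n
    return a, b

x = [4, 5, 8, 2, 1]
y = [5, 6, 7, 3, 2]

a_xy, b_xy = fit_line(x, y)   # X on Y: X = a + bY
a_yx, b_yx = fit_line(y, x)   # Y on X: Y = a + bX

print(round(a_xy, 2), round(b_xy, 2))
print(round(a_yx, 2), round(b_yx, 2))
```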
Regression coefficient method
Under regression coefficient method, regression
equations are developed with the help of regression
coefficients. Since there are two regression equations, two
regression coefficients are to be computed.
The regression coefficient used to find the regression
equation X on Y is “regression Coefficient of X on Y”. It is
denoted by bxy
The regression coefficient used to find the regression
equation Y on X is “regression Coefficient of Y on X”. It is
denoted by byx
The regression Equation X on Y is:
X – X̄ = bxy (Y – Ȳ)
bxy = r . (σx/σy)
where bxy = Regression Coefficient of Regression
equation X on Y
r = Coefficient of correlation
σx = Standard deviation of series X
σy = Standard deviation of series Y
OR
bxy = Σxy/Σy²
where x = Deviation of X values from its actual mean
y = Deviation of Y values from its actual mean
OR
\[ b_{xy} = \frac{n\sum d_x d_y - (\sum d_x)(\sum d_y)}{n\sum d_y^2 - (\sum d_y)^2} \]
where dx = Deviation of X values from its assumed mean
dy = Deviation of Y values from its assumed mean
n = Number of paired observations
The regression Equation Y on X is:
Y – Ȳ = byx (X – X̄)
byx = r . (σy/σx)
where byx = Regression Coefficient of Regression
equation Y on X
OR
byx = Σxy/Σx²
where x = Deviation of X values from its actual mean
y = Deviation of Y values from its actual mean
OR
\[ b_{yx} = \frac{n\sum d_x d_y - (\sum d_x)(\sum d_y)}{n\sum d_x^2 - (\sum d_x)^2} \]
where dx = Deviation of X values from its assumed mean
dy = Deviation of Y values from its assumed mean
n = Number of paired observations
X 7 2 1 1 2 3 2 6
Y 2 6 4 3 2 2 8 4
Using regression coefficients:
(a) Fit the regression equation of Y on X and predict Y if X = 5
(b) Fit the regression equation of X on Y and predict X if Y = 20
Sol.
The regression Equation Y on X is:
Y – Ȳ = byx (X – X̄ )
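\[ b_{yx} = \frac{n\sum d_x d_y - (\sum d_x)(\sum d_y)}{n\sum d_x^2 - (\sum d_x)^2} = \frac{(8 \times -17) - (-8 \times 7)}{(8 \times 44) - (-8)^2} \]
= (–136 + 56) / (352 – 64) = –80/288 = – 0.278
X̄ = ΣX/n = 24/8 = 3
Ȳ = ΣY/n = 31/8 = 3.875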
The regression Equation Y on X is:
Y – 3.875 = – 0.278 (X – 3)
Y = – 0.278 X + 0.834 + 3.875 = – 0.278 X + 4.709
Y = – 0.278 X + 4.709
If X=5, Y = (-0.278 x 5) + 4.709 = -1.39 + 4.709 = 3.319
The regression Equation X on Y is:
X – X̄ = bxy (Y – Ȳ )
X – 3 = – 0.3042 (Y – 3.875)
X = – 0.3042 Y + 1.179 + 3 = – 0.3042Y + 4.179
X = – 0.3042Y + 4.179
If Y=20, X = (-0.3042 x 20) + 4.179 = -6.084 + 4.179
= –1.905
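Both regression coefficients and both predictions can be verified with a few lines of code. This sketch uses deviations from the actual means (byx = Σxy/Σx², bxy = Σxy/Σy²), which gives the same coefficients as the assumed-mean formula used above. The function name is illustrative.

```python
def regression_coefficients(xs, ys):
    """Return (mean_x, mean_y, byx, bxy) using deviations from actual means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, s_xy / s_xx, s_xy / s_yy

X = [7, 2, 1, 1, 2, 3, 2, 6]
Y = [2, 6, 4, 3, 2, 2, 8, 4]
mean_x, mean_y, byx, bxy = regression_coefficients(X, Y)

y_at_5 = mean_y + byx * (5 - mean_x)     # Y on X, evaluated at X = 5
x_at_20 = mean_x + bxy * (20 - mean_y)   # X on Y, evaluated at Y = 20

print(round(byx, 3), round(bxy, 4))
print(round(y_at_5, 3), round(x_at_20, 3))
```

Y = 20 lies far outside the observed range of Y (2 to 8), so the negative predicted X is an extrapolation and should be read with caution.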
Properties of Regression Coefficients
1. In a bivariate data, there will be two regression
coefficients. They are bxy and byx
2. bxy is the regression coefficient of regression equation X on
Y
3. byx is the regression coefficient of regression equation Y on
X
4. Both the regression coefficients will have the same sign.
5. The sign of regression coefficients and correlation
coefficient will be same.
6. The geometric mean of two regression coefficients is equal
to coefficient of correlation.
√(bxy x byx) = r
7. The product of two regression coefficients is equal to
coefficient of determination.
bxy x byx = r²
8. When there is perfect correlation between X and Y, then
bxy and byx will be reciprocals of each other.
9. When the standard deviations of both the variables are
Yule’s Notation
Yule suggested that, the above equations may be
simplified by taking (x1 – x̄1) = X1, (x2 – x̄2) = X2 and (x3 –
x̄3) = X3. Then the equations of planes of regression are:
1. Regression equation of x1 on x2 and x3:
X1 = b12.3 X2+ b13.2 X3
2. Regression equation of x2 on x1 and x3:
X2 = b21.3 X1+ b23.1 X3
3. Regression equation of x3 on x1 and x2:
X3 = b31.2 X1+ b32.1 X2
In the above three equations, we used six regression
coefficients. Following are the formulae for computing
regression coefficients:
b12.3 = (σ1/σ2) [(r12 – r13 r23)/(1 – r23²)]
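The formula for b12.3 translates directly into code. The sketch below is illustrative only: the input values are assumed for demonstration and do not come from a worked example in this material.

```python
def b12_3(sigma1, sigma2, r12, r13, r23):
    """Partial regression coefficient of x1 on x2, holding x3 constant."""
    return (sigma1 / sigma2) * (r12 - r13 * r23) / (1 - r23 ** 2)

# Assumed illustrative inputs (hypothetical values)
coef = b12_3(sigma1=2, sigma2=5, r12=0.5, r13=0.3, r23=0.2)
print(round(coef, 4))
```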
REVIEW QUESTIONS:
1. What do you mean by regression analysis?
2. What are the different types of regression?
3. What do you mean by linear and non-linear regressions?
4. What do you mean by line of best fit?
X 102 80 100 88 84 82 90 96 97 83 79 88
Y 100 97 98 83 84 72 84 101 102 88 84 87
Also find the value of X if Y = 90, and Y if X = 105.
12. In a trivariate distribution, x̄ 1 =10, x̄ 2 = 15, x̄ 3 = 12, σ1 = 3,
σ2 = 4, σ3 = 5, r23= 0.4, r31= 0.6 and r12= 0. 7. Determine the
regression equation of X1 on X2 and X3.
CHAPTER 4
PROBABILITY DISTRIBUTIONS
(THEORETICAL DISTRIBUTIONS)
Definition
Probability distribution (Theoretical Distribution) can
be defined as a distribution obtained for a random variable on
the basis of a mathematical model. It is obtained not on the
basis of actual observation or experiments, but on the basis of
probability law.
Random variable
Random variable is a variable whose value is determined
by the outcome of a random experiment. Random variable is
also called chance variable or stochastic variable.
For example, suppose we toss a coin. The number of heads
obtained in this random experiment is a random variable. Here
the random variable “obtaining heads” can take numerical values.
Now, we can prepare a table showing the values of the
random variable and the corresponding probabilities. This is
called a probability distribution or theoretical distribution.
In the above example, the probability distribution is:
X: 0 1 2 3 4
Solution
Here all values of P(X) are more than zero, and the sum of all
P(X) values is equal to 1.
Since the two conditions, namely P(X) ≥ 0 and ∑P(X) = 1, are
satisfied, the given distribution is a probability distribution.
MATHEMATICAL EXPECTATION (EXPECTED VALUE)
If X is a random variable assuming values x1, x2,
x3, ..., xn with corresponding probabilities p1, p2,
p3, ..., pn, then the Expectation of X is defined as x1p1 +
x2p2 + x3p3 + ... + xnpn.
E(X) = ∑ [x . p(x)]
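The two conditions for a probability distribution and the definition of E(X) can be expressed in a few lines. The example distribution (number of heads when two fair coins are tossed) is an illustrative assumption, not a problem taken from this material.

```python
from fractions import Fraction

def is_probability_distribution(probs):
    """True when every P(X) >= 0 and the probabilities sum to 1."""
    return all(p >= 0 for p in probs) and sum(probs) == 1

def expectation(values, probs):
    """E(X) = sum of x * p(x) over all values of X."""
    return sum(x * p for x, p in zip(values, probs))

# Number of heads when two fair coins are tossed (illustrative example)
heads = [0, 1, 2]
probs = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]

print(is_probability_distribution(probs))
print(expectation(heads, probs))
```

Exact fractions are used so that the check "sum equals 1" is not affected by floating-point rounding.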
CHAPTER 5
BINOMIAL DISTRIBUTION
P(r) = nCr p^r q^(n–r)
Where, p = probability of success in a single trial
q = 1 – p
n = number of trials
r = number of successes in ‘n’ trials.
Assumption of Binomial Distribution
(Situations where Binomial Distribution can be
applied)
Binomial distribution can be applied when:-
1. The random experiment has two outcomes i.e., success and
failure.
2. The probability of success in a single trial remains constant
from trial to trial of the experiment.
3. The experiment is repeated for finite number of times.
4. The trials are independent.
Properties (Features) of Binomial Distribution
1. It is a discrete probability distribution.
2. The shape and location of Binomial distribution changes as
‘p’ changes for a given ‘n’.
3. The mode of the Binomial distribution is equal to the value
of ‘r’ which has the largest probability.
4. Mean of the Binomial distribution increases as ‘n’
increases with ‘p’ remaining constant.
5. The mean of Binomial distribution is np.
6. The Standard deviation of Binomial distribution is √npq
7. The variance of Binomial Distribution is npq
8. If ‘n’ is large and if neither ‘p’ nor ‘q’ is too close to
zero, Binomial distribution may be approximated to Normal
Distribution.
9. If two independent random variables follow Binomial
distribution, their sum also follows Binomial distribution.
Qn: Six coins are tossed simultaneously. What is the
probability of obtaining 4 heads?
P(r) = nCr p^r q^(n–r)
(1) Probability that Sachin scores a century in exactly 2
matches is:
P (r = 2) = 5C2 (1/3)^2 (2/3)^(5–2)
= 0.329
(c) No girls
(d) At the most two girls
(a) P( having a boy) = ½
P (having a girl) = ½
n = 4
P (getting 2 boys & 2 girls) = p (getting 2 boys)
= p (r = 2) = 4C2 (½)^2 (½)^(4–2)
\[ = \frac{4!}{(4-2)!\,2!} \times (1/2)^2 \times (1/2)^2 = \frac{4 \times 3}{2} \times (1/2)^4 \]
= 6 x 1/16 = 6/16 = 3/8
∴ Percentage of families with 2 boys and 2 girls =
(3/8) x 100 = 37.5 % .
(b) Probability of having at least one boy:
= p (having one boy or having 2 boys or having 3 boys or
having 4 boys)
= p (having one boy) + p (having 2 boys) + p (having 3 boys)
+ p (having 4 boys)
= p (r=1) + p (r = 2) + p(r = 3) + p (r = 4)
= 4/16 + 6/16 + 4/16 + 1/16 = 15/16
∴ Percentage of families with at least one boy =
(15/16) x 100 = 93.75 %
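The binomial probabilities worked out above can be confirmed with a short sketch of P(r) = nCr p^r q^(n–r). The function name `binomial_pmf` is illustrative, and exact fractions are used so the results can be compared with the hand calculation.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(r, n, p):
    """P(r) = nCr * p^r * q^(n-r), with q = 1 - p."""
    q = 1 - p
    return comb(n, r) * p ** r * q ** (n - r)

half = Fraction(1, 2)

two_boys = binomial_pmf(2, 4, half)               # families with 2 boys and 2 girls
at_least_one_boy = 1 - binomial_pmf(0, 4, half)   # complement of "no boys"
four_heads = binomial_pmf(4, 6, half)             # 4 heads when 6 coins are tossed
two_centuries = binomial_pmf(2, 5, Fraction(1, 3))

print(two_boys, at_least_one_boy, four_heads, float(two_centuries))
```

Computing "at least one" as one minus the probability of "none" avoids summing four separate terms.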
Binomial Distribution
No. of Heads (x)   P(x)                            Expected Frequency = P(x) x 256
0                  8C0 (1/2)^0 (1/2)^8 = 1/256     1
1                  8C1 (1/2)^1 (1/2)^7 = 8/256     8
2                  8C2 (1/2)^2 (1/2)^6 = 28/256    28
3                  8C3 (1/2)^3 (1/2)^5 = 56/256    56
4                  8C4 (1/2)^4 (1/2)^4 = 70/256    70
Mean = np = 8 x 1/2 = 4
S.D = √npq = √(8 x 1/2 x 1/2) = √2 = 1.414
REVIEW QUESTIONS:
1. Define Binomial Distribution.
2. What are the important properties of Binomial
Distribution?
3. Examine whether the following statement is true:
“ For a Binomial Distribution, mean = 10 and S D = 4”
4. For a Binomial Distribution, mean = 6 and S D = √2. Find
parameters. Write down all the terms of the distribution.
*********
CHAPTER 6
POISSON DISTRIBUTION
\[ P(r) = \frac{e^{-m}\, m^r}{r!} \]
m = 4
∴ P (less than 4 apples are defective) = P (r < 4)
P (r < 4) = P (r = 0 or 1 or 2 or 3)
= P (r = 0) + P (r = 1) + P (r = 2) + P (r = 3)
P (r = 0) = (e^–4 . 4^0) / 0! = (0.0183 x 1) / 1 = 0.0183
P (r = 1) = (e^–4 . 4^1) / 1! = (0.0183 x 4) / 1 = 0.0732
P (r = 2) = (e^–4 . 4^2) / 2! = (0.0183 x 16) / 2 = 0.1464
P (r = 3) = (e^–4 . 4^3) / 3! = (0.0183 x 64) / 6 = 0.1952
∴ P (r < 4) = 0.0183 + 0.0732 + 0.1464 + 0.1952 = 0.4331
Qn: Out of 500 items selected for inspection, 0.2% is found to
be defective. Find how many lots will contain exactly no
defective if there are 1000 lots.
Sol:
\[ P(r) = \frac{e^{-m}\, m^r}{r!} \]
m = 500 x 0.2% = 1
∴ P (r = 0) = (e^–1 . 1^0) / 0! = (0.3679 x 1) / 1 = 0.3679
∴ No. of lots having zero defective = 0.3679 x 1000 = 368
Qn: In a certain factory producing optical lenses, there is a
small chance of 1/500 for any one lens to be defective. The
lenses are supplied in packets of 10. Use P.D to calculate the
approximate number of packets containing no defectives, one
defective, two defectives and three defective lenses
respectively in a consignment of 20,000 packets.
Sol:
\[ P(r) = \frac{e^{-m}\, m^r}{r!} \]
m = 10 x 1/500 = 0.02
∴ P (r = 0) = (e^–0.02 x 0.02^0) / 0! = (0.9802 x 1) / 1 = 0.9802
∴ No. of packets containing no defective lens = 0.9802 x
20000 = 19604
P (r = 1) = (e^–0.02 x 0.02^1) / 1! = (0.9802 x 0.02) / 1 = 0.0196
∴ No. of packets containing one defective lens = 0.0196 x
20000 = 392
P (r = 2) = (e^–0.02 x 0.02^2) / 2! = (0.9802 x 0.0004) / 2
= 0.00019604
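The expected number of packets for each count of defective lenses can be computed in one pass. This is a minimal sketch of the Poisson formula applied to the lens question; rounding each expected count to the nearest whole packet is an assumption of the sketch.

```python
from math import exp, factorial

def poisson_pmf(r, m):
    """P(r) = e^(-m) * m^r / r!"""
    return exp(-m) * m ** r / factorial(r)

m = 10 * (1 / 500)   # mean number of defective lenses per packet
packets = 20000

# Expected packets with 0, 1, 2 and 3 defective lenses
expected = [round(packets * poisson_pmf(r, m)) for r in range(4)]
print(expected)
```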
x    0    1    2    3    4    5    6
f    48   27   12   7    4    1    1    N = Σf = 100
fx   0    27   24   21   16   5    6    Σfx = 99
Mean = 99/100 = 0.99
Calculation of Expected Frequencies
X    P(x) = (e^–m . m^x) / x!                            Expected Frequency = P(x) . N
0    (e^–0.99 . 0.99^0)/0! = (0.3716 x 1)/1 = 0.3716     0.3716 x 100 = 37.16 = 37
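Fitting a Poisson distribution means estimating m from the observed mean and then multiplying each P(x) by N. The sketch below applies those two steps to the frequency table above; variable names are illustrative.

```python
from math import exp, factorial

x_values = [0, 1, 2, 3, 4, 5, 6]
observed = [48, 27, 12, 7, 4, 1, 1]

N = sum(observed)
mean = sum(x * f for x, f in zip(x_values, observed)) / N   # estimate of m

# Expected frequency for each x is N * P(x)
expected = [N * exp(-mean) * mean ** x / factorial(x) for x in x_values]

for x, f, e in zip(x_values, observed, expected):
    print(x, f, round(e, 2))
```

The unrounded expected frequencies do not sum exactly to N, because the Poisson distribution also assigns a small probability to values above 6.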
REVIEW QUESTIONS:
1. Define Poisson distribution.
2. What are the important properties of P.D?
3. What are the situations under which P D can be applied?
4. Write down the probability function of P.D. whose mean is
2. What is its variance?
5. A machine is producing 4% defectives. What is the
probability of getting at least 4 defectives in a sample of 50,
using (a) BD and (b) PD?
6. The following table gives the number of days in a 50 day
period during which automobile accidents occurred in a
certain part of the city. Fit a Poisson distribution to the
data:
No. of accidents 0 1 2 3 4
No. of days 19 18 8 4 1
********
CHAPTER 7
NORMAL DISTRIBUTION
\[ P(x) = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \]
Z = (x – μ) / σ
Z = (60 – 45) / 10 = 15/10 = 1.5
[Figure: areas under the standard normal curve: 0.4332, 0.1915, 0.3643]
Z = (x – μ) / σ
When x = 85, Z = (85 – 80) / 15 = 5/15 = 0.333
When x = 95, Z = (95 – 80) / 15 = 15/15 = 1
[Figure: areas under the standard normal curve: 0.1293, 0.212, 0.0918, 0.4082, 0.02]
Therefore, the area to the left of the above area of 0.02 is:
Z = (40 – 45) / 10 = –5/10 = –0.5
0.48
Locate the area of 0.48 in the table and find the Z-value
corresponding to it.
The table shows that the area nearest to 0.48 is 0.4798, and
the corresponding Z-value is 2.05
Z = 2.05
(x – μ)/σ = 2.05
(x – 62)/12 = 2.05, x – 62 = 2.05 x 12
x – 62 = 24.6, ∴ x = 24.6 + 62 = 86.6
∴ The minimum marks one should score to get selected
= 86.6 marks
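Table look-ups like the ones above can be cross-checked with Python's standard library. This sketch assumes Python 3.8 or later, where `statistics.NormalDist` is available. It reproduces the area for Z = 1.5 and the cut-off mark that leaves 2% of the area to its right.

```python
from statistics import NormalDist

std = NormalDist()   # standard normal: mean 0, S.D. 1

# Area between the mean and Z = (60 - 45) / 10
z = (60 - 45) / 10
area_mean_to_z = std.cdf(z) - 0.5

# Cut-off with 0.02 of the area to its right, i.e. 0.48 between mean and Z
z_cut = std.inv_cdf(0.5 + 0.48)
x_cut = 62 + z_cut * 12

print(round(area_mean_to_z, 4))
print(round(z_cut, 2), round(x_cut, 1))
```

The exact Z-value is slightly above the tabulated 2.05, so the computed cut-off may differ from the table-based answer in the second decimal place.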
Construction of Normal Distribution
Procedure:
1. Find the mean and S.D of the given distribution and take
them as μ and σ (parameters) of the normal distribution.
2. Take the lower limit of each class as the x values.
3. Calculate the z-value corresponding to each x-value by
using the formula z = (x – μ)/σ. The z-values of the first
and last x-values need not be computed.
4. Find the area corresponding to each z-value from the
standard normal distribution table. The area corresponding
to the first and last z-values will be 0.5.
5. Find the area of each class using the areas (probabilities)
of the respective class limits. (Take the difference in case of
same signs, and take the total in case of opposite signs.)
x̄ = A + [(Σfd’)/N] x C, x̄ = 35 + [(180/200) x 10]
= 35 + 9 = 44
S.D = √{(Σfd’²/N) – [(Σfd’)/N]²} x C = √1.55 x 10 = 12.45
∴ μ = 44 and σ = 12.45
Review Questions:
1. Define normal distribution.
2. What are the important properties of normal distribution?
3. Explain the importance of normal distribution.
4. Explain the procedure for construction of normal
distribution.
5. If x follows a normal distribution with mean 12 and
variance 16, find P(x≥20).
6. The weekly wages of 1000 workers are normally
distributed with mean of 70 and S.D of 5. Estimate the
number of workers whose wages lie between 69 and 72.
7. In an aptitude test administered to 900 students, the mean
score is 50 and S.D is 20. Find the number of students
*********
CHAPTER 8
EXPONENTIAL DISTRIBUTION
REVIEW QUESTIONS:
1. Define exponential distribution.
2. Write down the first four moments of exponential
distribution.
3. What is the skewness of exponential distribution?
4. What about the median of an exponential distribution?
5. What are the important properties of exponential
distribution?
CHAPTER 9
UNIFORM DISTRIBUTION
Definition of Uniform Distribution
A discrete random variable, x, follows uniform
distribution if its probability density function is :
CHAPTER 10
STATISTICAL INFERENCE
Basic Concepts
Population: In statistics, ‘Population’ refers to collection of all
individuals or objects or items or things under consideration.
Finite Population: If a population contains a finite number of
objects, it is called a finite population. Eg: Students in a college.
Infinite Population: If a population contains an infinite number
of objects, it is called an infinite population. Eg: Stars in the sky.
Sample: A sample is a representative part of the population.
Sample size: The number of units in a sample is called the
sample size. If the sample size is too small, it may not represent
the population. If it is very large, it may require more time and
money for investigation. Hence, the size of a sample should be
optimum.
Large Sample: If the size of a sample exceeds thirty, it is
called a large sample.
Small Sample: If the size of a sample does not exceed thirty, it
is called a small sample.
Parameter: It is a statistical measure derived from population
elements. If the arithmetic mean is computed from all the
elements of a population, it is a population parameter. Here it
is called population mean. Population mean is denoted by the
symbol μ. Population standard deviation is denoted by σ.
Alternative Hypothesis
Any hypothesis other than null hypothesis is called
alternative hypothesis. It is the hypothesis which is accepted
when the null hypothesis is rejected. An alternative hypothesis
is denoted by H1 or Ha.
Sampling Distribution
Sampling distribution is a distribution of sample
statistic derived from various samples drawn from the same
population. Since sample statistic is a random variable,
sampling distribution is a probability distribution.
Standard Error
Standard Error (SE) of a statistic is the standard
deviation of the sampling distribution of that statistic. For
example, the Standard deviation of the sampling distribution of
the sample mean is σ/√n, where σ = population S.D. and n =
sample size. Therefore the Standard Error (SE) of sampling
distribution of mean is σ/√n .
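The formula σ/√n can be illustrated with a minimal Python sketch. The function name and the figures used (a population S.D. of 8 and three sample sizes) are assumptions chosen only for illustration; the point is that the Standard Error falls as the sample size grows.

```python
import math

def standard_error_of_mean(sigma: float, n: int) -> float:
    """Standard Error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Illustrative (assumed) values: population S.D. = 8, increasing sample sizes.
for n in (100, 400, 1600):
    print(n, standard_error_of_mean(8, n))
```

Quadrupling the sample size halves the Standard Error, which is why a larger sample is treated as more reliable.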
Uses of Standard Error
(1) Standard Error is used for testing a given hypothesis.
(2) Standard Error gives an idea about the reliability of a
sample. The reciprocal of Standard Error is a measure
of reliability of the sample.
(3) Standard Error can be used to determine the confidence
limits for population values like mean, proportion and
standard deviation.
Rejection Region
One-tailed Test
One tailed test is one in which the rejection region is
located in only one tail of the normal curve. It may be at left
tail or right tail, depending on the alternative hypothesis. If the
alternative hypothesis is with ‘<’ (less than) sign, the rejection
region is placed on the left tail, and the test is called left-tailed
test. If the alternative hypothesis is with ‘>’ (more than) sign,
the rejection region is placed on the right tail, and the test is
called right-tailed test.
[Figure: normal curve with the rejection region in the left tail (Left-tailed Test)]
[Figure: normal curve with the rejection region in the right tail (Right-tailed Test)]
Parametric Tests
When testing of hypothesis is done, if some
assumptions are made about the nature of population
distribution, then the test statistic applied there is called
parametric test. There are a number of parametric tests. Eg:
t-test, Z test, F test, etc.
Non-Parametric Tests
When testing of hypothesis is done, if no assumptions
are made about the nature of population distribution, then the
test statistic applied there is called non-parametric test. There
are a number of non-parametric tests. Eg: Chi-square Test, Sign
tests, Signed Rank Tests, Rank Sum Tests, Run Test,
Kolmogorov-Smirnov Test, etc. Since no assumptions are made
about the nature of population, non-parametric tests are also
called distribution-free tests.
TESTING OF GIVEN POPULATION MEAN
This testing of hypothesis is used to test whether the given
population mean is true or not. In other words, it is used to
test whether there is significant difference between sample
mean and population mean.
Procedure:
1. Set up H0 and H1
H0 : There is no significant difference between sample mean
and population mean
( i.e; μ = μ0)
H1 : There is significant difference between sample mean and
population mean
( i.e; μ ≠ μ0)
2. Decide the test statistic:
The test statistic applicable here is Z-test or t-test.
If population S.D.(i.e; σ) is known, apply Z-test
If population S.D.(i.e; σ) is unknown but sample is large, apply
Z-test
If population S.D.(i.e; σ) is unknown but sample is small,
apply t-test
3. Apply the appropriate formula for computing the value of
the test statistic:
Z / t = Difference/Standard Error
Difference = Difference between sample mean and the given
population mean
Standard Error = σ / √n ( If population S.D is known)
Standard Error = s / √n ( If population S.D is unknown, but
sample is large)
Standard Error = s / √(n-1) ( If population S.D is unknown and
sample is small)
Where σ = population S.D, s = sample S.D, n = sample size
4. Specify the level of significance. If nothing is mentioned
about the level of significance, take 5%.
5. Fix the degree of freedom:
For Z-test, d.f = infinity; For t-test, d.f = n-1
Z = 69/124.8 = 0.553
Level of significance = 5%
Degree of freedom = infinity (population S D is known)
Table value (Critical value) at 5 % level of significance and
infinity degree of freedom is 1.96
Since calculated value of Z is less than the critical value, H0 is
accepted. That is, there is no significant difference between
sample mean and population mean. μ = 15200. So, we may
conclude that the claim of the company is valid.
Qn: A sample of size 400 was drawn and the sample mean
was found to be 99. Test whether this sample could have come
from the normal population with mean = 100 and S.D = 8 at 5%
level of significance.
Sol:
H0 : There is no significant difference between sample
mean and population mean
( i.e; μ = 100)
H1 : There is significant difference between sample mean and
population mean
( i.e; μ ≠ 100)
Since population S.D is known, the test statistic applicable
here is Z-test
Z = D/SE
D = |x̄ – μ| = |99 – 100| = 1
S E = σ/√n = 8/√400 = 8/20 = 0.4
Z = 1/0.4 = 2.5
Level of significance = 5%
Degree of freedom = infinity (population S D is known)
Table value (Critical value) at 5 % level of significance and
infinity degree of freedom is 1.96
Since calculated value of Z is more than the critical value, H0
is rejected. H1 is accepted. That is, there is significant
difference between sample mean and population mean. So, we
may conclude that μ ≠ 100
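The choice of Standard Error described in the procedure, and the worked example above, can be checked with a short Python sketch. The function name and its arguments are assumptions made for illustration; the critical value 1.96 is the 5% table value quoted in the text.

```python
import math

def one_sample_statistic(sample_mean, pop_mean, sd, n, sigma_known):
    """Return (statistic, test name) using the rules given in the procedure."""
    if sigma_known or n > 30:
        se = sd / math.sqrt(n)        # Z-test: sigma known, or large sample
        name = "Z"
    else:
        se = sd / math.sqrt(n - 1)    # t-test: sigma unknown and small sample
        name = "t"
    return abs(sample_mean - pop_mean) / se, name

# Data of the worked example: n = 400, sample mean = 99, mu = 100, sigma = 8.
z, name = one_sample_statistic(99, 100, 8, 400, sigma_known=True)
reject = z > 1.96                     # table value at 5% level of significance
print(name, round(z, 2), "H0 rejected" if reject else "H0 accepted")
```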
Qn: A random sample of 200 bottles of talcum powder gave
an average weight of
49.5 gram with a S.D of 2.1 gram. Do we accept the
hypothesis of weight per bottle is 50 gram at 1% level of
significance?
Sol:
H0 : There is no significant difference between sample mean
and population mean
( i.e; μ = 50)
H1 : There is significant difference between sample mean and
population mean
( i.e; μ ≠ 50)
Since sample is large, the test statistic applicable here is Z-test
Z = D/SE
D = |x̄ – μ| = |49.5 – 50| = 0.5
District I        District II
n1 = 100          n2 = 150
x̄1 = 210          x̄2 = 200
σ1 = 11           σ2 = 11
Group I 18 20 36 50 49 36 34 49 41
Group II 29 26 28 35 30 44 46
Test whether the group means are equal.
Sol: Here we have to find the Means and S.Ds of the two
samples.
S.D of Group I = √[Σ(X – x̄)² / n] = √(1134/9) = √126
S.D of Group II = √[Σ(X – x̄)² / n] = √(386/7) = √55.14
H0 : There is no significant difference between two
sample means( i.e; μ1 = μ2)
H1 : There is significant difference between two sample
means( i.e; μ1 ≠ μ2)
Since population S.Ds are unknown and samples are
small, the test statistic applicable here is t-test.
t = Difference / S E
Difference = x̄ 1 – x̄ 2 = 37 – 34 = 3
SE = √{[(n1s1² + n2s2²) / (n1 + n2 – 2)] x (1/n1 + 1/n2)}
(Population S.Ds are unknown and samples are small)
= √{[(9 x 126) + (7 x 55.14)] / (9 + 7 – 2) x (1/9 + 1/7)}
= √[(1519.98/14) x 0.254] = √27.58 = 5.25
t = 3/5.25 = 0.571 (Calculated Value)
Level of significance = 5%
Degree of freedom = 9+7-2 = 14
Table value of t at 5% level of significance and 14 degrees of
freedom = 2.145
Since the calculated value of t is less than the table value, H0 is
accepted. (i.e; μ1 = μ2). So we may conclude that the
difference in the group means is not significant. The group
means may be treated as equal.
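The working above can be reproduced with a small Python sketch. The function name is an assumption made for illustration. Because n x s² equals the sum of squared deviations when s is computed with divisor n, the sketch pools the sums of squared deviations directly.

```python
import math

def pooled_t(sample1, sample2):
    """t statistic and degrees of freedom for two small independent samples."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    ss1 = sum((x - m1) ** 2 for x in sample1)   # equals n1 * s1^2
    ss2 = sum((x - m2) ** 2 for x in sample2)   # equals n2 * s2^2
    se = math.sqrt((ss1 + ss2) / (n1 + n2 - 2) * (1 / n1 + 1 / n2))
    return abs(m1 - m2) / se, n1 + n2 - 2

group1 = [18, 20, 36, 50, 49, 36, 34, 49, 41]
group2 = [29, 26, 28, 35, 30, 44, 46]
t, df = pooled_t(group1, group2)
print(round(t, 2), df)   # compare t with the table value 2.145 at 14 d.f.
```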
t = d/SE
Where:
Before 67 24 57 55 63 54 56 68 33 43
After 70 38 58 58 56 67 68 72 42 38
t = d/SE
Since the calculated value of t is less than the critical value, the
null hypothesis is accepted. So, we may conclude that there is
no significant difference in the performance of the students.
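A hedged Python sketch of the paired test for the Before/After scores is given below. Since the Standard Error formula is not reproduced above, the sketch assumes the usual paired t-test form, SE = S.D. of the differences (with divisor n – 1) divided by √n; this is an assumption, not necessarily the exact layout of the original working.

```python
import math
import statistics

before = [67, 24, 57, 55, 63, 54, 56, 68, 33, 43]
after = [70, 38, 58, 58, 56, 67, 68, 72, 42, 38]

d = [a - b for a, b in zip(after, before)]    # difference for each student
n = len(d)
mean_d = statistics.mean(d)
se = statistics.stdev(d) / math.sqrt(n)       # stdev() uses the n - 1 divisor
t = mean_d / se
print(round(mean_d, 2), round(t, 2), "d.f. =", n - 1)
```

With 9 degrees of freedom the two-tailed table value of t at the 5% level is 2.262, so the statistic obtained here is in line with the conclusion stated above.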
TESTING OF GIVEN POPULATION
PROPORTION
This type of testing of hypothesis is used to test whether there
is any significant difference between the sample proportion
and the given population proportion.
Procedure:
1. Set up H0 and H1 :
H0 : There is no significant difference between sample
proportion and population proportion ( i.e; H0 : P = P0)
H1 : There is significant difference between sample proportion
and population proportion ( i.e; H1 : P ≠ P0)
2. Decide the test statistic:
The test statistic applicable here is Z-test
3. Apply appropriate formulae for computing the value of Z
(i.e; calculated value):
Z = Difference / S E, i.e; Z = (p – P) / S E
Where p = sample proportion, P = Population proportion
S E = √(PQ / n), where Q = 1 – P
4. Decide the level of significance (Take 5%, if nothing is
mentioned in the question).
5. Fix the degree of freedom ( Infinity d.f)
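The procedure can be sketched in Python as follows. The function name is an assumption made for illustration, and the figures are those of review question 20 at the end of this chapter (336 coffee drinkers in a sample of 600, tested against P = 0.5).

```python
import math

def proportion_z(successes, n, P):
    """Z statistic for testing a given population proportion P."""
    p = successes / n                      # sample proportion
    se = math.sqrt(P * (1 - P) / n)        # S E = sqrt(PQ / n)
    return abs(p - P) / se

z = proportion_z(336, 600, 0.5)
print(round(z, 3))   # compare with the table value of Z at the chosen level
```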
of freedom is 1.96
Since the calculated value of Z is more than the table value,
null hypothesis is rejected. We accept alternative hypothesis.
P ≠ 0.02. So, it is not possible to think that the machine
produces 2% defective items.
TESTING OF THE SIGNIFICANCE OF THE
DIFFERENCE BETWEEN TWO SAMPLE
PROPORTIONS
This testing of hypothesis is used to test whether the
difference between two sample proportions is significant or
not. If the difference is not significant, they are treated as
equal; or we may think that the two samples are drawn from
the same population.
Procedure:
1. Set up H0 and H1
H0 : There is no significant difference between two sample
proportions( i.e; p1 = p2)
H1 : There is significant difference between two sample
proportions( i.e; p1 ≠ p2)
2. Decide the test statistic:
The test statistic applicable here is Z-test.
3. Apply the appropriate formula for computing the value of
the test statistic:
Z = Difference/Standard Error
i.e; Z = (p1 – p2) / SE
Z = Difference/Standard Error
i.e; Z = (p1 – p2) / SE
p1 = 450/1000 = 0.45, p2 = 400/800 = 0.5
S E = √{p0q0 [(1/n1) + (1/n2)]}, n1 = 1000, n2 = 800
p0 = (1000 x 0.45 + 800 x 0.5)/(1000 + 800) = 850/1800 = 0.472
q0 = 1 – 0.472 = 0.528
∴ SE = √{0.472 x 0.528 [(1/1000) + (1/800)]} = √[0.249 (0.001 + 0.00125)]
= √(0.249 x 0.00225) = √0.00056 = 0.0237
∴ Z = (0.5 – 0.45) / 0.0237 = 0.05/0.0237 = 2.11
Level of significance = 5%.
Fix the degree of freedom = infinity
Table value (critical value) of Z at 5% level of significance and
infinity fixed degree of freedom is 1.96
Since the calculated value of Z is more than the table value,
null hypothesis is rejected. Alternative hypothesis is accepted.
p1 ≠ p2. So, we may conclude that there is significant difference
between the two districts regarding the coffee drinking habits
of people.
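The same working can be checked with a minimal Python sketch. The function name is an assumption made for illustration; the counts 450 out of 1000 and 400 out of 800 are those used in the solution above.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for the difference between two sample proportions."""
    p1, p2 = x1 / n1, x2 / n2
    p0 = (x1 + x2) / (n1 + n2)             # pooled proportion
    q0 = 1 - p0
    se = math.sqrt(p0 * q0 * (1 / n1 + 1 / n2))
    return abs(p1 - p2) / se

z = two_proportion_z(450, 1000, 400, 800)
print(round(z, 2), "H0 rejected" if z > 1.96 else "H0 accepted")
```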
REVIEW QUESTIONS:
1. What do you mean by inferential analysis?
2. What do you understand by sampling distributions?
3. What are the two branches of inferential analysis?
Sample I 25 32 30 32 24 14 32
Sample II 24 34 30 22 42 31 40 35 32 30
20. In a sample of 600 people in Bihar 336 are coffee drinkers and
the rest are tea drinkers. Can we assume that both coffee and tea
are equally popular in the State at 1% level of significance?
21. In a sample of 900 men from a certain large city 675 were found
to be smokers. In a random sample of 1350 men from another
large city 675 were found to be smokers. Do the data indicate
that the cities are significantly different in respect of the
prevalence of smoking among men?
22. A sample of size 50 has S.D of 10.5. Can you contradict the
hypothesis that the population S.D. is 12?
CHAPTER 11
CHI-SQUARE TEST
χ² = Σ [(O – E)²/E]
where O = Observed frequencies and E = Expected frequencies
4. Specify the level of significance. If nothing is mentioned,
take 5% level of significance.
5. Fix the degree of freedom. Degree of freedom = n – r – 1
Where n = number of pairs of observations
r = number of parameters computed from the given data to
find the expected frequencies.
6. Obtain the table value of Chi-square at specified level of
significance and fixed degree of freedom.
7. Compare the actual value of Chi-Square with the table
value and decide whether to accept or reject the null
hypothesis. If calculated value is less than the table value,
null hypothesis is accepted and otherwise it is rejected.
Qn: The numbers of road accidents per week in a certain city
were as follows:
12, 8, 20, 2, 14, 10, 15, 6, 9, 4
Are these frequencies in agreement with the belief that
accidents occurred at the same rate throughout the 10-week
period?
Sol:
H0 : There is goodness of fit between observed frequencies and
expected frequencies.
χ² = Σ [(O – E)²/E]
Here the Observed values (Actual values) are 12, 8, 20,
2, 14, 10, 15, 6, 9 and 4.
i.e; O = 12, 8, 20, 2, 14, 10, 15, 6, 9, 4
If accidents occurred at the same rate, the number of accidents
per week which we may expect is 10 (i.e; the average of the
given values).
i.e; E = 10
Now we can find the value of Chi-square as follows:
Computation of Chi-square Value
Observed Values (O)   Expected Values (E)   (O – E)²   (O – E)² / E
12                    10                    4          0.4
8                     10                    4          0.4
20                    10                    100        10.0
2                     10                    64         6.4
14                    10                    16         1.6
10                    10                    0          0.0
15                    10                    25         2.5
6                     10                    16         1.6
9                     10                    1          0.1
4                     10                    36         3.6
χ² = 26.6
Degree of Freedom = n – r – 1 = 10 – 0 – 1 = 9
Table value of χ2 at 5% level of significance and 9 d.f is
16.919.
Since calculated value is more than the table value, null
hypothesis is rejected. We accept alternative hypothesis. So we
may conclude that the given figures do not agree with the
belief that accidents occurred at the same rate throughout the
10-week period.
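The computation table above can be reproduced with a few lines of Python. The variable names are assumptions made for illustration; the table value 16.919 is the one quoted in the solution.

```python
observed = [12, 8, 20, 2, 14, 10, 15, 6, 9, 4]

# Under H0 every week is expected to have the average number of accidents.
expected = sum(observed) / len(observed)

chi_square = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 0 - 1      # n - r - 1, no parameter estimated (r = 0)

print(round(chi_square, 1), df)
print("H0 rejected" if chi_square > 16.919 else "H0 accepted")
```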
Qn: The principal of a college made a sample analysis of an
examination result of 200 students. It was found that 24
students had got first class, 62 second class, 68 third class and
the rest failed. Are these figures commensurate with the
general examination result, which is in the ratio of 2:3:3:2 for
the various categories respectively?
Sol:
H0 : There is goodness of fit between the given figures and the
figures expected in general examination
H1 : There is no goodness of fit between the given figures and
the figures expected in general examination
χ² = Σ [(O – E)²/E]
Here the Observed values (Actual values) for first,
second, third and failed categories of students are respectively
24, 62, 68 and 46.
If results are in the ratio of 2:3:3:2, then the number of students
for above categories may be expected as follows:
Testing of Independence
Procedure:
1. Set up H0 and H1
H0 : There is independence between observed frequencies and
expected frequencies.
H1 : There is no independence between observed and expected
frequencies.
2. Decide the test statistic. Here, the test statistic is Chi-
Square test.
3. Apply the appropriate formula:
χ² = Σ [(O – E)²/E]
where O = Observed frequencies and E = Expected frequencies
Here E values are obtained by using the following formula:
E Value = [(Row Total x Column Total)/Grand Total]
E Values are computed by preparing a table called
Contingency Table.
4. Specify the level of significance. If nothing is mentioned,
take 5% level of significance.
5. Fix the degree of freedom. Degree of freedom = (r – 1) x
(c – 1)
Where r = number of rows; c = number of columns
6. Obtain the table value of Chi-square at specified level of
significance and fixed degree of freedom.
             Smokers   Non-smokers
Literates    83        57
Illiterates  45        68
Sol:
H0 : There is independence between smoking habit and
literacy.
H1 : There is no independence between smoking habit and
literacy.
χ² = Σ [(O – E)²/E]
Here the Observed values (Actual values) are 83, 57, 45 and
68.
The E Values corresponding to the above ‘O’ values can be
found out by preparing a 2 X 2 contingency table:
2 X 2 Contingency Table
             Smokers                            Non-smokers              Total
Literates    [(83+57) x (83+45)]/253 = 71       (140 x 125)/253 = 69     140
Illiterates  (113 x 128)/253 = 57               (113 x 125)/253 = 56     113
Total        128                                125                      253
So, the E values are 71, 69, 57 and 56.
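The E values can also be obtained with a short Python sketch of the formula (Row Total x Column Total)/Grand Total. The function name is an assumption made for illustration. The sketch keeps the unrounded E values when summing, so its Chi-square figure may differ slightly from a hand calculation that uses the rounded values 71, 69, 57 and 56.

```python
def expected_frequencies(table):
    """Expected frequency of each cell: row total x column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

observed = [[83, 57],    # Literates: smokers, non-smokers
            [45, 68]]    # Illiterates: smokers, non-smokers
expected = expected_frequencies(observed)

chi_square = sum((o - e) ** 2 / e
                 for o_row, e_row in zip(observed, expected)
                 for o, e in zip(o_row, e_row))
df = (len(observed) - 1) * (len(observed[0]) - 1)    # (r - 1) x (c - 1)

print([[round(e) for e in row] for row in expected], round(chi_square, 3), df)
```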
χ² = Σ [(O – E)²/E]
Here all the Observed values (Actual values) are not
directly given in the question. So, we have to find the missing
figures with the help of a 2 x 2 contingency table:
Testing of Homogeneity
Procedure:
1. Set up H0 and H1
H0 : There is homogeneity between the samples on the basis of
the attribute.
H1 : There is no homogeneity between the samples on the basis
of the attribute.
2. Decide the test statistic. Here, the test statistic is Chi-
Square test.
3. Apply the appropriate formula:
χ² = Σ [(O – E)²/E]
where O = Observed frequencies and E = Expected frequencies
Here ‘E’ values are obtained by using the following formula:
‘E’ Value = [(Row Total x Column Total)/Grand Total]
‘E’ Values are computed by preparing a table called
Contingency Table.
4. Specify the level of significance. If nothing is mentioned,
take 5% level of significance.
5. Fix the degree of freedom. Degree of freedom = (r – 1) x
(c – 1)
Where r = number of rows; c = number of columns
6. Obtain the table value of Chi-square at specified level of
significance and fixed degree of freedom.
χ² = Σ [(O – E)²/E]
Here the Observed values (Actual values) are 124, 16, 56, and
10
The ‘E’ values corresponding to the above ‘O’ values
can be found out by preparing a 2 X 2 contingency table:
2 X 2 Contingency Table
                                   Smokers                  Non-smokers            Total
No. of families drinking tea       (140 x 180)/206 = 122    (140 x 26)/206 = 18    140
No. of families not drinking tea   (66 x 180)/206 = 58      (66 x 26)/206 = 8      66
Total                              180                      26                     206
So, the ‘E’ values are 122, 18, 58 and 8.
χ² = ns²/σ²
Here, n = 10, σ² = 20, Sample variance is to be computed
from the given data.
x̄ = 470/10 = 47.
No. turned up 1 2 3 4 5 6
Frequency 19 23 28 17 32 31
Test the hypothesis that the die is unbiased.
8. Following data are given:
Gender    Education
          Middle   High School   College   Total
Male      52       10            20        82
Female    44       12            26        82
Total     96       22            46        164
CHAPTER 12
ANALYSIS OF VARIANCE
Smaller variance]
Within Sample   SSE   N – C   MSE = SSE/(N – C)   F = MSC ÷ MSE, or F = MSE ÷ MSC
Total           SST   N – 1
Plot   Variety of Seeds
       P    Q    R    S
I      10   7    8    5
II     9    7    5    4
III    8    6    4    4
Sol:
H0 : There is no significant difference in the productivity of
seeds.
H1 : There is significant difference in the productivity of seeds.
Test Statistic applicable here is F-test
Sol:
Here, we are asked to test whether there is significant
difference between varieties. But varieties are given in rows,
not in columns. In one way ANOVA, the samples must be in
columns. Therefore, we have to rearrange the given data so as
to bring the samples in columns as shown below:
Varieties
Plots
I II III
A 30 51 44
B 27 47 35
C 42 37 41
D 48 36
E 42
Total SST = N –1
or F= MSE÷MSC]
FR= Larger variance ÷ Smaller variance; [i.e; F= MSR÷MSE, or F=
MSE÷MSR]
4. Specify the level of significance. Take 5% if nothing is
mentioned.
5. Fix the degrees of freedom. Fix a pair of d.f. in respect of
FC and FR.
6. Obtain table value of FC and FR at specified level of
significance and fixed degree of freedom.
7. Compare the Calculated value of Fc with the Table value,
and decide whether to accept or reject the null hypothesis.
If calculated value is less than the table value, H0 is
accepted. If calculated value is more than the table value,
H0 is rejected.
8. Compare the Calculated value of FR with the Table value,
and decide whether to accept or reject the null hypothesis.
If calculated value is less than the table value, H0 is
accepted. If calculated value is more than the table value,
H0 is rejected.
Qn: Following table shows the yield of crops using 3 varieties
of seeds:
Plots   Varieties of Seeds
        P    Q    R
I       6    7    8
II      4    6    5
III     8    6    10
IV      6    9    9
Within Sample   SSE = 10   (3 – 1) x (4 – 1) = 6   MSE = 10/6 = 1.67   FR = MSR ÷ MSE = 6/1.67 = 3.593
Total           SST = 36   12 – 1 = 11
Between Columns:
Calculated value of FC = 2.396
Level of Significance = 5%
Degrees of freedom = (2,6)
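The sums of squares for the seed-yield data can be verified with a hedged Python sketch. The function name and the dictionary keys are assumptions made for illustration, as is the term "correction factor" for T²/N. The sketch always divides by MSE, which matches the "larger variance ÷ smaller variance" rule here because both MSC and MSR exceed MSE.

```python
def two_way_anova(rows):
    """Sums of squares and F ratios for a two-way layout, one value per cell."""
    r, c = len(rows), len(rows[0])
    n = r * c
    total = sum(sum(row) for row in rows)
    cf = total ** 2 / n                                   # correction factor T^2 / N
    sst = sum(x * x for row in rows for x in row) - cf
    ssc = sum(sum(col) ** 2 for col in zip(*rows)) / r - cf
    ssr = sum(sum(row) ** 2 for row in rows) / c - cf
    sse = sst - ssc - ssr
    msc, msr = ssc / (c - 1), ssr / (r - 1)
    mse = sse / ((c - 1) * (r - 1))
    return {"SST": sst, "SSC": ssc, "SSR": ssr, "SSE": sse,
            "FC": msc / mse, "FR": msr / mse}

# Rows are plots I to IV; columns are seed varieties P, Q, R.
yields = [[6, 7, 8], [4, 6, 5], [8, 6, 10], [6, 9, 9]]
result = two_way_anova(yields)
print({key: round(value, 3) for key, value in result.items()})
```

The F ratios come out as 2.4 and 3.6; the hand-worked figures 2.396 and 3.593 differ only because MSE was rounded to 1.67 before dividing.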
Workers   Machines
          P    Q    R    S
I         44   38   47   36
II        46   40   52   43
III       34   36   44   32
IV        43   38   46   33
V         38   42   49   39
You are required to test:
(1) Whether there is significant difference in the mean
productivity of machines.
(2) Whether there is significant difference in the mean
productivity of workers.
Sol:
Let us apply coding method by subtracting 45 from each
observation of the given data. Then we get;
Workers   Machines
          P     Q     R     S
I         -1    -7    2     -9
II        1     -5    7     -2
III       -11   -9    -1    -13
IV        -2    -7    1     -12
V         -7    -3    4     -6
Total   SST = 574   20 – 1 = 19
Between Columns:
Calculated value of FC = 18.387
Level of Significance = 5%
Degrees of freedom = (3,12)
Table value of FC at 5% level of significance and (3,12)
degrees of freedom = 3.49
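The coding method works because subtracting a constant from every observation leaves all sums of squares unchanged. The Python sketch below illustrates this for the machine data; the function name is an assumption made for illustration, and the third row is taken as worker III, in line with the coded table.

```python
def sst_and_fc(rows):
    """Total sum of squares and FC = MSC / MSE for a two-way layout."""
    r, c = len(rows), len(rows[0])
    cf = sum(map(sum, rows)) ** 2 / (r * c)
    sst = sum(x * x for row in rows for x in row) - cf
    ssc = sum(sum(col) ** 2 for col in zip(*rows)) / r - cf
    ssr = sum(sum(row) ** 2 for row in rows) / c - cf
    sse = sst - ssc - ssr
    return sst, (ssc / (c - 1)) / (sse / ((r - 1) * (c - 1)))

# Rows are workers I to V; columns are machines P, Q, R, S.
original = [[44, 38, 47, 36], [46, 40, 52, 43], [34, 36, 44, 32],
            [43, 38, 46, 33], [38, 42, 49, 39]]
coded = [[x - 45 for x in row] for row in original]       # subtract 45 throughout

sst_raw, fc_raw = sst_and_fc(original)
sst_coded, fc_coded = sst_and_fc(coded)
print(round(sst_raw, 2), round(sst_coded, 2), round(fc_raw, 2), round(fc_coded, 2))
```

Both versions give the same SST and the same FC, so the smaller coded figures can be used safely to reduce the arithmetic.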
way ANOVA.
8. Explain the hypothesis testing procedure in case of two
way ANOVA.
9. What do you mean by coding method in analysis of
variance?
10. Following table shows the scores attained by trainees under
three different instructional methods:
Methods Scores
I 84 71 84 76 85
II 85 76 88 86 90
III 81 68 73 71 82
Test whether there is significant difference in the
scores under three methods.
11. A company had 4 salesmen P, Q, R and S, each of whom
was sent for a period of one month to three types of areas,
namely, urban area, rural area and semi-urban area. The
sales (in thousand rupees) achieved by the salesmen are
shown in the following table:
Area         Salesmen
             P    Q    R    S
Urban        80   80   60   100
Rural        30   30   70   30
Semi-urban   70   40   50   80
Carry out an analysis of variance and interpret the results.
CHAPTER 13
NON-PARAMETRIC TESTS
Meaning:
A test which is not concerned with testing of
parameters is called Non-parametric test. Non-parametric test
does not make any assumption about the nature of distribution.
Therefore, non-parametric tests are called distribution-free
tests.
Situations where non-parametric tests are used
1. When hypothesis does not involve population parameter
2. When observations are not accurate as required for a
parametric test.
3. When the researcher thinks that parametric test is not
applicable.
Assumptions of Non-parametric tests
1. Samples are drawn randomly
2. Sample observations are independent
3. Observations are measured on ordinal or nominal scale
4. The variable is continuous
5. The probability density function of population is
continuous
Advantages of Non-parametric tests
1. It is very simple and easy to apply the non-parametric tests
Student I 7 10 14 12 6 9 11 13 7 6 10
Student II 10 13 14 11 10 7 15 11 10 9 8
Procedure:
1. Set up null and alternative hypotheses:
H0: There is no significant difference between samples
H1: There is significant difference between samples
2. Decide the test statistic. The test statistic applicable here is
Wilcoxon matched pairs test ( i.e; Wilcoxon’s T test)
3. Use the appropriate formula for computing the value of test
statistic (Wilcoxon’s T test)
T = Sum of Positive Ranks or Sum of Negative Ranks,
whichever is less.
4. Specify the level of significance. Take 5%, if not
mentioned.
5. Degree of freedom = n-1
6. Locate the table value of Wilcoxon’s T test.
7. Compare the calculated value with table value and decide
whether to accept or reject the hypothesis. If calculated
value is less than the table value, null hypothesis is
rejected; otherwise, it is accepted.
Qn: The following table shows the details of number of units
of a product produced by two workers. Test whether there is
significant difference between the performances of the workers
using Wilcoxon matched pairs test.
Worker P: 73 43 47 53 58 47 52 58 38 61 56 56 34 55 65 75
Worker Q: 51 41 43 41 47 32 24 58 43 53 52 57 44 57 40 68
Sol:
H0: There is no significant difference between samples
H1: There is significant difference between samples
Decide the test statistic. The test statistic applicable here is
Wilcoxon matched pairs test ( i.e; Wilcoxon’s T test)
T = Sum of Positive Ranks or Sum of Negative Ranks,
whichever is less.
Worker P   Worker Q   Difference (d) (P – Q)   |d|   Rank of |d|   Ranks of +ve values   Ranks of –ve values
73         51         22                       22    13            13
43         41         2                        2     2.5           2.5
47         43         4                        4     4.5           4.5
53         41         12                       12    11            11
58         47         11                       11    10            10
47         32         15                       15    12            12
52         24         28                       28    15            15
58         58         0                        0     -             -                     -
38         43         -5                       5     6                                   -6
61         53         8                        8     8             8
56         52         4                        4     4.5           4.5
56         57         -1                       1     1                                   -1
34         44         -10                      10    9                                   -9
55         57         -2                       2     2.5                                 -2.5
65         40         25                       25    14            14
75         68         7                        7     7             7
Total of Signed Ranks                                              101.5                 18.5
The calculated value of T = 101.5 or 18.5 whichever is lower.
∴ T value = 18.5
Level of significance = 5%
Degree of freedom = n-1, (n = number of values which have
either +ve or –ve sign)
N = 15
Table value of Wilcoxon’s T test at 5% level of significance
and 15 df = 25
Since calculated value is less than the table value, null
hypothesis is rejected. So we may conclude that there is
significant difference in the performances of workers P and
Q.
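The ranking in the working table can be reproduced with a short Python sketch. The function name is an assumption made for illustration. Tied absolute differences receive the average of the ranks they occupy, zero differences are dropped, and the 13th value for Worker P is taken as 34, as in the working table.

```python
def wilcoxon_t(x, y):
    """Return (T, sum of +ve ranks, sum of -ve ranks, number of non-zero pairs)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]       # drop zero differences
    ordered = sorted(abs(d) for d in diffs)

    def average_rank(value):
        first = ordered.index(value) + 1                  # first rank held by value
        last = first + ordered.count(value) - 1           # last rank held by value
        return (first + last) / 2

    positive = sum(average_rank(abs(d)) for d in diffs if d > 0)
    negative = sum(average_rank(abs(d)) for d in diffs if d < 0)
    return min(positive, negative), positive, negative, len(diffs)

worker_p = [73, 43, 47, 53, 58, 47, 52, 58, 38, 61, 56, 56, 34, 55, 65, 75]
worker_q = [51, 41, 43, 41, 47, 32, 24, 58, 43, 53, 52, 57, 44, 57, 40, 68]
t_value, positive, negative, n = wilcoxon_t(worker_p, worker_q)
print(t_value, positive, negative, n)   # T is compared with the table value 25
```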
Signed Rank Test (When the number of matched pairs >
25)
Here, we find the difference of matched pairs and assign them
ranks. Then ranks are classified into two categories based on
their respective signs. Then take the total of two categories of
ranks. The test statistic is Z test.
Procedure:
1. Set up null and alternative hypotheses:
H0: There is no significant difference between samples
H1: There is significant difference between samples
2. Decide the test statistic. The test statistic applicable here is
Wilcoxon matched pairs test ( i.e; Z test)
3. Use the appropriate formula for computing the value of test
statistic (Z test)
Marks (before)   Marks (after)   Difference (d)   |d|   Rank of |d|   Ranks of +ve values   Ranks of –ve values
20 32 -12 12 9 9
58 62 -4 4 3 3
65 63 2 2 2 2
35 30 5 5 4 4
52 68 -16 16 13 13
Total of Signed Ranks 128 223
B,/G,/B,/G,/B,B,B,/G,/B,/G,/B,B,B,/G,G,/B,B,B,B,/G,G,/B,/G,
/B,B,B,/G,/B,B,B,/G,G,G,/B,/G,/B,B,B,/G,/B,/G,/B,B,B,B,/G,
G,/B/
Number of runs (r) = 27
n1= 30
n2= 18
μ = [2n1n2/(n1+n2)] + 1 = [2 x 30 x 18/(30+18)] + 1
= (1080/48)+1 = 22.5 + 1 = 23.5
σ = √{[2n1n2(2n1n2 – n1 – n2)] / [(n1+n2)²(n1+n2–1)]}
= √{[2 x 30 x 18(2 x 30 x 18 – 30 – 18)] / [(30+18)²(30+18–1)]}
= √[1080(1080 – 30 – 18)/(48² x 47)] = √[(1080 x 1032)/108288]
= √(1114560/108288) = √10.2926 = 3.208
∴ Z = (27 – 23.5)/3.208 = 3.5/3.208 = 1.091
Level of significance = 5%
Degree of freedom = infinity
Table value of Z at 5% level of significance and infinity
d f = 1.96
Since calculated value is less than table value, null hypothesis is
accepted. So we may conclude that the arrangement is made at
random.
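The run count and the Z value can be checked with a minimal Python sketch. The variable names are assumptions made for illustration; the sequence is typed run by run (separated by spaces) from the arrangement shown above.

```python
import math
from itertools import groupby

# The arrangement above, one run per space-separated group.
runs_text = ("B G B G BBB G B G BBB GG BBBB GG B G BBB G BBB GGG "
             "B G BBB G B G BBBB GG B")
sequence = runs_text.replace(" ", "")

r = sum(1 for _ in groupby(sequence))        # number of runs
n1 = sequence.count("B")
n2 = sequence.count("G")

mean = 2 * n1 * n2 / (n1 + n2) + 1
sd = math.sqrt(2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
               / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
z = (r - mean) / sd
print(r, n1, n2, round(mean, 1), round(sd, 3), round(z, 3))
```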
REVIEW QUESTIONS:
1. What do you mean by non-parametric tests?
2. What are the situations under which non-parametric tests
are applied?
3. What are the important assumptions of non-parametric
tests?
4. What are the important merits of non-parametric tests?
5. What are the important drawbacks of non-parametric tests?
6. What are the different types of non-parametric tests?
7. Distinguish between parametric tests and non-parametric
tests.
8. What do you mean by one sample sign test?
9. Explain the hypothesis testing procedure under one sample
sign test.
10. Explain the hypothesis testing procedure under two sample
sign test.
11. What do you mean by Wilcoxon matched pairs test?
12. Explain the hypothesis testing procedure of Wilcoxon
matched pairs test.
13. What is meant by Wilcoxon Mann Whitney U-test?
14. Explain the hypothesis testing procedure of Wilcoxon
Mann Whitney U-test.
15. What is meant by Kruskal-Wallis H-test?
16. Explain the hypothesis testing procedure of H-test.
CHAPTER 14
SAMPLE SIZE DETERMINATION
Population S D (σ) = 15
Allowable difference (e) = 6
Value of Z at 1% level of significance and infinity d f = 2.576
∴ n = [(2.576 x 15)/6]² = (38.64/6)² = 6.44² = 41.474
≈ 41
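The calculation above follows the form n = (Zσ/e)², which can be written as a one-line Python function. The function name is an assumption made for illustration.

```python
def sample_size_infinite(z, sigma, e):
    """Sample size for estimating a mean when the population is infinite."""
    return (z * sigma / e) ** 2

n = sample_size_infinite(2.576, 15, 6)
print(round(n, 3))
```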
B. Sample Size Determination While Estimating
Population Mean When Population is Finite
Sample Size (n) = [Z²Nσ²] / {[(N-1)e²] + [Z²σ²]}
where Z = table value of Z; N = Size of population; σ = S D
of Population
e = allowable difference between population mean and sample
mean.
Qn: From the details given below, determine the sample size
for estimating population mean:
(a) Population size = 5000
(b) Variance of the population = 4
(c) Estimate should be within 0.4 units of the true value of the
population mean
(d) Desired level of confidence = 99%
Sol:
n = [Z²Nσ²] / {[(N-1)e²] + [Z²σ²]}
Population Size (N) = 5000
Population S D = √4 = 2
Allowable difference (e) = 0.4
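The finite-population formula can be evaluated with a hedged Python sketch. The function name is an assumption made for illustration, and Z = 2.576 is assumed for the 99% confidence level, as in the preceding example.

```python
def sample_size_finite(z, N, sigma, e):
    """Sample size for estimating a mean when the population is finite."""
    return (z ** 2 * N * sigma ** 2) / ((N - 1) * e ** 2 + z ** 2 * sigma ** 2)

# N = 5000, S.D = 2, e = 0.4, Z assumed to be 2.576 (99% confidence).
n = sample_size_finite(2.576, 5000, 2, 0.4)
print(n)
```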
REVIEW QUESTIONS:
1. What do you mean by sample size?
2. What are the important formulae used for determining
sample size while estimating population mean?
3. What are the important formulae used for determining
sample size while estimating population proportion?
CHAPTER 15
STATISTICAL ESTIMATION
REVIEW QUESTIONS:
1. What do you mean by statistical estimation?
2. What are the two types of estimation?
3. What do you mean by Point Estimation?
4. What are the various methods used for point estimation?
5. What is meant by Interval Estimation?
6. Distinguish between Estimate and Estimator.
7. What are the important characteristics (properties) of a
good estimator?
8. Distinguish between point estimation and interval
estimation.
9. A random sample of 50 people from a population showed
incomes with a mean = 50000 and Standard Deviation =
6000. Estimate the population mean with 95% and 99%
confidence level.
10. In a sample of 500 units of a commodity from a large
consignment, 40 units were considered defective. Estimate
the percentage of defective in the whole consignment and
limits within which the percentage will probably lie.
.*********.
CHAPTER 16
SOFTWARES FOR QUANTITATIVE
METHODS
values with the new ones. Ensure you give the new column a
unique header.
Importing data
If you are importing the data from another electronic
file, check that the layout is suitable (i.e. respondents as rows,
variables as columns), add or modify variable names if
required, add respondent ID if needed and check that the data
has imported correctly.
Managing your data
Once you have created your dataset, ensure that you
back it up in a secure place, not on your PC or laptop. If you
make any changes to your master dataset, record those changes
and create a duplicate back-up.
Give files a meaningful name. It is also helpful to date them as
this makes it easier to track back if you need to do so.
Worksheet tabs can also be named to help you manage your
data.
Preparing your data
Once your data are entered you can follow the steps in
Chapter 13 to prepare your data for analysis. If you need to
carry out data transformation, such as recoding variables or
calculating summated scores, do so now. (Hint: you can use
functions such as SUM and AVERAGE to help you with
creating summated scales.) If you are creating new variables
during data transformation ensure they are given unique
column headers.
Using a function
We will introduce specific functions in the other guides
but the following example of applying the AVERAGE
function to calculate the mean age in the sample dataset in
Figure 2 illustrates their use:
Select the cell in which you wish the calculation to be
placed (Hint: if you are using the same worksheet as your
dataset, avoid cells that are immediately adjacent to your
data).
Select Formulas > More Functions > Statistical >
AVERAGE to open the Function Argument dialogue box
SPSS (Statistical Package for Social Sciences)
SPSS (Statistical package for social sciences) is the set
of software programs that are combined together in a single
.*********.