Business Statistics

Uploaded by

marlasushruta

A unit of Realwaves (P) Ltd

MBA CLASSES
Ist Semester

Business Analytics
Dear Student,
Welcome to the World of Knowledge – REAL WAVES

I have the pleasure of presenting this study material to you. It
contains exhaustive practical and theory content. It covers all
the aspects that bring into focus the important concepts
you need to study in order to fortify yourself for your
examination. The subject will be taught by eminent professors
who are highly experienced and well versed in the job.
The teaching at the Institute is exhaustive and wholly concept based.
It is also systematic, well planned and
absolutely time-bound. For a change, say goodbye to
mechanical learning. I am sure you will feel that study is a
pleasurable job and not a painful exercise.

I wish you a very happy study time.

BEST OF LUCK!

PUNEET MORE
Director
Module-I (8 Hours)
Descriptive Statistics: Measures of Central Tendency - Problems on Measures of Dispersion – Karl
Pearson Correlation, Spearman’s Rank Correlation, Simple and Multiple Regression (Problems on
Simple Regression Only).

Module-II
Probability Distribution: Concept and Definition - Rules of Probability – Random Variables –
Concept of Probability Distribution – Theoretical Probability Distributions: Binomial, Poisson,
Normal and Exponential – Bayes’ Theorem (No Derivation) (Problems Only on Binomial, Poisson
and Normal).

Module- III
Decision Theory: Introduction – Steps of Decision-Making Process – Types of Decision-Making
Environments – Decision-Making under Uncertainty – Decision-Making under Risk – Decision Tree
Analysis (Only Theory).
Design of Experiments: Introduction – Simple Comparative Experiments – Single Factor
Experiments – Introduction to Factorial Designs.

Module-IV (Only Theory)
Cluster Analysis: Introduction – Visualization Techniques – Principal Components –
Multidimensional Scaling – Hierarchical Clustering – Optimization Techniques.
Factor Analysis: Introduction – Exploratory Factor Analysis – Confirmatory Factor Analysis.
Discriminant Analysis: Introduction – Linear Discriminant Analysis

Module-V
Foundations of Analytics: Introduction – Evolution – Scope – Data for Analytics – Decision
Models – Descriptive, Predictive, Prescriptive – Introduction to Data Warehousing – Dashboards
and Reporting – Master Data Management (Only Theory).

Module-VI
Linear Programming: Structure, Advantages, Disadvantages, Formulation of LPP, Solution using
Graphical Method. Transportation Problem: Basic Feasible Solution using NWCM, LCM and VAM,
Optimisation using MODI Method.
Assignment Model: Hungarian Method – Multiple Solution Problems – Maximization Case –
Unbalanced – Restricted.

Module-VII
Project Management: Introduction – Basic Difference between PERT & CPM – Network
Components and Precedence Relationships – Critical Path Analysis – Project Scheduling – Project
Time-Cost Trade Off – Resource Allocation.

Instruction: Weightage is given to theory and problems in the ratio of 60:40.
SR NO.   NAME OF CHAPTER

1        Unit-I
         Chapter 1: Role of Statistics
         Chapter 2: Correlation
         Chapter 3: Regression

2        Unit-II
         Chapter 4: Probability
         Chapter 5: Probability Distributions

3        Unit-III
         Chapter 6: Decision Theory
         Chapter 7: Design of Experiments

4        Unit-IV
         Chapter 8: Cluster Analysis
         Chapter 9: Factor Analysis
         Chapter 10: Discriminant Analysis

5        Unit-V
         Chapter 11: Foundations of Analytics
         Chapter 12: Linear Programming
         Chapter 13: Assignment Model
         Chapter 14: Project Management

6        Chapter 15: Hypothesis
         Exam Paper
         Table

CHAPTER 3 ROLE OF STATISTICS


Meaning and Definition of Central Tendency
The central tendency of a variable means a typical value around which the other
measurable values tend to concentrate. This concentration of values in the
central part of a distribution is described by Measures of Central Tendency, also
known as Averages.
Dr. A. L. Bowley has said, "Statistics may rightly be called the science of
averages."

Objects and Functions


It is a precise and simple indicator of a group. It represents the whole group. Its
objects and functions are as under:

(1) To Present the Mass of Complex Data in Condensed Form: An average
reduces a mass of data into a single typical figure to enable one to draw a general
conclusion about the characteristics of the phenomenon under study. It is
impossible to remember the incomes of all the citizens of a country, but the average
income can be remembered.

(2) Comparative Study: An average provides a common denominator for
comparing one set of data with others. For example, we cannot compare the
economic conditions of the students of two different sections simply by knowing
their individual annual incomes. But the average annual income of the students of
the two sections enables us to conclude which section's students are
economically better off than the other's.

(3) Representative of the Group: Averages also help to obtain a picture of a
complete group. For example, the day-to-day sales of a businessman may be
inconsistent, but the average monthly sales figure is consistent to a great extent.

(4) Mathematical Relationship: When it is desired to trace the mathematical
relationship between different groups, an average becomes essential. Simply saying
that expected life of an average Indian is less than that of an average

(5) Basis of Future Planning and Decision-making: In the process of research
and experimentation it is of vital importance to know the average of a
variable. For example, a railway office will need information regarding the average
number of passengers carried by various trains.

Vidhyadhar Nagar: F – 45, Balaji Tower – I, Behind Vishal Mega Mart.
Mansarovar: 69/318, VT Road, Mansarovar
Contact: 9829959536,7737733360,9928001210

Arithmetic Average or Mean


The arithmetic mean is the most widely used and the most generally
understandable of all the averages. This is clear from the fact that when the term
'mean' is used alone, it always refers to the arithmetic mean. In the words of H.
Secrist, “Arithmetic mean is the amount secured by dividing the sum of values of
the items in a series by their number.”

Kinds of Arithmetic Mean: The arithmetic mean is of two kinds:


(1) Simple Arithmetic Mean, (2) Weighted Arithmetic Mean.

1. Simple Arithmetic Mean: When all the values of a statistical series are given
equal importance, the total of the values is divided by number of items. It is called
Simple Arithmetic Mean.

2. Weighted Arithmetic Mean: Sometimes a statistical series comprises
values of unequal importance. It then becomes essential to give each value a
weightage according to its relative importance; these weightages are
called 'Weights'. The different values are multiplied by their respective weights.
The total of the products so obtained is divided by the sum of the weights. The quotient
is called the Weighted Arithmetic Mean.
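
The two kinds of mean described above can be sketched in a few lines of Python. The marks and weights below are hypothetical values chosen only for illustration; they do not come from this material.

```python
def simple_mean(values):
    """Every value counts equally: sum of values / number of values."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Sum of (value * weight) divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical marks in three papers and the weight attached to each paper.
marks = [60, 70, 80]
weights = [1, 2, 3]

print(simple_mean(marks))                       # 70.0
print(round(weighted_mean(marks, weights), 2))  # 73.33
```

The weighted mean is pulled towards 80 because that paper carries the largest weight.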

DEFINITION OF AVERAGE
“Average is an attempt to find one single figure to describe whole of figures.”

“The average is sometimes described as a number which is typical of the whole
group.”

TYPES OF AVERAGES
The following are the important types of averages:
• Arithmetic Mean
• Median
• Mode
ARITHMETIC MEAN
Calculation of simple Arithmetic mean – Individual Observations.

$$\bar{X} = \frac{\sum X}{N}$$

where ∑X is the sum of the observations and N is the number of observations.

Illustration: 1
The following table gives the monthly income of 10 employees in an office:
Income (`) 4780 5760 6690 7750 4840 4920 6100 7810 7050 6950.

Calculate the arithmetic mean of incomes.


Ans: X = 6265
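
As a quick check of Illustration 1, the formula can be applied directly. This is only a sketch; the variable names are illustrative.

```python
# Monthly incomes of the 10 employees from Illustration 1.
incomes = [4780, 5760, 6690, 7750, 4840, 4920, 6100, 7810, 7050, 6950]

# Arithmetic mean = sum of the values / number of values.
mean_income = sum(incomes) / len(incomes)

print(mean_income)  # 6265.0
```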

Calculation of Arithmetic mean – Discrete series.


(i) Direct method or (ii) Short-cut method.

Direct Method
The formula for computing the mean is

$$\bar{X} = \frac{\sum fX}{N}$$

where f is the frequency of each value X and N = ∑f.

Illustration: 2
From the following data of the marks obtained by 60 students of a class, calculate
the arithmetic mean.
Marks No. of Students
20 8
30 12
40 20
50 10
60 6
70 4
Ans: X = 41

Short – Cut Method

$$\bar{X} = A + \frac{\sum fd}{N}$$

where A = assumed mean,
d = (X – A),
N = total number of observations, i.e. N = ∑f.

Illustration: 3
Calculate the arithmetic mean by the short cut method using frequency distribution
of illustration 2.
Ans: X = 41
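
Both methods for a discrete series can be compared on the data of Illustration 2. The sketch below assumes an assumed mean of A = 40; any other choice of A should give the same result.

```python
# Data of Illustration 2: marks and the number of students obtaining them.
marks = [20, 30, 40, 50, 60, 70]
students = [8, 12, 20, 10, 6, 4]


def mean_direct(x, f):
    """Direct method: sum(f * X) / N, with N = sum(f)."""
    return sum(xi * fi for xi, fi in zip(x, f)) / sum(f)


def mean_short_cut(x, f, assumed_mean):
    """Short-cut method: A + sum(f * d) / N, with d = X - A."""
    sum_fd = sum(fi * (xi - assumed_mean) for xi, fi in zip(x, f))
    return assumed_mean + sum_fd / sum(f)


print(mean_direct(marks, students))         # 41.0
print(mean_short_cut(marks, students, 40))  # 41.0
```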

Calculation of Arithmetic Mean- continuous series.

(i) Direct method. (ii) Short-cut method.

Direct method.
When the direct method is used,

$$\bar{X} = \frac{\sum fM}{N}$$

where M = the midpoint of each class,
f = the frequency of each class,
N = the total frequency.

$$\text{Midpoint} = \frac{\text{Lower limit} + \text{Upper limit}}{2}$$

Illustration: 4
From the following data compute arithmetic mean by direct method.
Marks 0-10 10-20 20-30 30-40 40-50 50-60
No. of students 5 10 25 30 20 10
Ans: X = 33

Short cut method.

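$$\bar{X} = A + \frac{\sum fd}{N} \times i$$

where A = assumed mean, d = (M – A)/i is the step deviation of each midpoint M,
and i = the common class interval.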

Illustration: 5
Calculate the arithmetic mean by short cut method using frequency distribution of
illustration.4.
Ans: X = 33.
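
The continuous-series methods can be sketched on the data of Illustration 4. The assumed mean of 35 and the function names are illustrative choices, not part of the original material.

```python
# Data of Illustration 4: class limits and number of students.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [5, 10, 25, 30, 20, 10]


def midpoints(classes):
    """Midpoint = (lower limit + upper limit) / 2 for each class."""
    return [(lo + hi) / 2 for lo, hi in classes]


def mean_direct(classes, freqs):
    """Direct method: sum(f * M) / N."""
    mids = midpoints(classes)
    return sum(f * m for f, m in zip(freqs, mids)) / sum(freqs)


def mean_step_deviation(classes, freqs, assumed_mean, width):
    """Short-cut method with step deviations d = (M - A) / i."""
    mids = midpoints(classes)
    sum_fd = sum(f * (m - assumed_mean) / width for f, m in zip(freqs, mids))
    return assumed_mean + sum_fd / sum(freqs) * width


print(round(mean_direct(classes, freqs), 2))                  # 33.0
print(round(mean_step_deviation(classes, freqs, 35, 10), 2))  # 33.0
```

The direct method does not need equal class widths, so the same function can be applied to unequal classes such as 0-10, 10-30, 30-60 and 60-100.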

Illustration: 6
Calculate arithmetic mean from the following data using direct method.
Marks 0-10 10-30 30-60 60-100
No. of students 5 12 25 8
Ans: X = 40.6.

Combined Mean
If the means of the different components of a group are given separately and it is
necessary to calculate the combined mean of the whole group, it can be calculated
with the help of the following formula:

$$\bar{X}_{123 \ldots n} = \frac{\bar{X}_1 N_1 + \bar{X}_2 N_2 + \bar{X}_3 N_3 + \cdots + \bar{X}_n N_n}{N_1 + N_2 + N_3 + \cdots + N_n}$$

where,
$\bar{X}_{123 \ldots n}$ = combined mean
$\bar{X}_1$ and $\bar{X}_2$ = averages of the first and second groups
$N_1$ and $N_2$ = number of items in the first and second groups, and so on

Illustration: 7
A distribution consists of 3 components with total frequencies of 100, 150 and 200
having mean 25, 15, 10 respectively. Find the combined mean for the whole
distribution.
Ans: Mean = 15

Illustration: 8
The mean age of a group of 100 children was 9.35 years. The mean age of 25 of
them was 8.75 years and that of another 65 was 10.51 years. What was the mean
age of the remaining?
Ans: The mean age of the remaining 10 children was 3.31 years.

Illustration: 9
The mean age of a combined group of men and women is 30 years. If the mean age
of the group of men is 32, and that of the group of women is 27 years, find out the
percentage of men and women in the group.
Ans: Male percentage is 60 and female percentage 40.
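
Illustrations 7 to 9 all rearrange the same combined-mean relation. A minimal sketch follows; the function name and variable names are illustrative.

```python
def combined_mean(means, sizes):
    """Combined mean = sum(mean_k * N_k) / sum(N_k)."""
    return sum(m * n for m, n in zip(means, sizes)) / sum(sizes)


# Illustration 7: three components.
print(combined_mean([25, 15, 10], [100, 150, 200]))  # 15.0

# Illustration 8: total of all ages minus the totals of the two known
# sub-groups gives the total for the remaining 10 children.
remaining_total = 100 * 9.35 - 25 * 8.75 - 65 * 10.51
print(round(remaining_total / 10, 2))  # 3.31

# Illustration 9: if p is the share of men, 32p + 27(1 - p) = 30,
# so p = (30 - 27) / (32 - 27).
share_men = (30 - 27) / (32 - 27)
print(share_men * 100)  # 60.0
```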

To find out the missing value or the missing frequency

If the arithmetic mean of a variable is given but one value (size) or one
frequency is missing, assume the missing item to be x, apply the direct method
for the mean to form an equation, and solve it for the missing value or
frequency.

Illustration: 10
If mean value is 20.5, find out the missing frequency in the following data:
Value 5-10 10-15 15-20 20-25 25-30
Frequency 2 4 6 ? 8
Ans: missing frequency = 10

Illustration: 11
From the following information calculate the missing value if mean marks are
28.5:
Marks 10 15 20 X 35 50 Total
No. of Students 3 6 8 10 17 6 50
Ans: Missing value is 25 marks.
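
The equation formed in Illustrations 10 and 11 is linear in the unknown, so it can be solved in one step. The sketch below rearranges the direct-method formula; the function names are illustrative.

```python
def missing_frequency(mean, known_values, known_freqs, value_of_missing):
    """Solve mean = (sum(f*x) + F*x_m) / (sum(f) + F) for the frequency F."""
    sum_fx = sum(x * f for x, f in zip(known_values, known_freqs))
    sum_f = sum(known_freqs)
    return (sum_fx - mean * sum_f) / (mean - value_of_missing)


def missing_value(mean, total_freq, known_values, known_freqs, freq_of_missing):
    """Solve mean * N = sum(f*x) + f_m * X for the value X."""
    sum_fx = sum(x * f for x, f in zip(known_values, known_freqs))
    return (mean * total_freq - sum_fx) / freq_of_missing


# Illustration 10: midpoints of the known classes and their frequencies;
# the class 20-25 (midpoint 22.5) has the missing frequency.
print(missing_frequency(20.5, [7.5, 12.5, 17.5, 27.5], [2, 4, 6, 8], 22.5))  # 10.0

# Illustration 11: the value with frequency 10 is missing, N = 50.
print(missing_value(28.5, 50, [10, 15, 20, 35, 50], [3, 6, 8, 17, 6], 10))  # 25.0
```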

Merits of Arithmetic Mean


It is the most commonly used measure because of the following merits:

(1) Easy Calculation: Arithmetic mean is easy to calculate. Even a layman can
easily understand it.

(2) Based on All values: All the values of variable are considered in calculating
arithmetic mean. Thus it is the most representative measure of central tendency.

(3) Exact Figure: Arithmetic mean is an exact figure which is determined by a
rigid formula. Everyone who computes the mean of the same variable will get the
same answer.

(4) Stability: Of all the averages, arithmetic mean is affected least by fluctuations
of sampling.

(5) Algebraic use possible: The calculation of arithmetic mean violates no
mathematical principle. As such it is used in the most advanced analytical studies
of statistics.

(6) Test of Accuracy: Test of accuracy of arithmetic mean is possible with the
help of Charlier's accuracy check which is not possible in any other measure of
central tendency.

(7) Arraying or Grouping not needed: The computation of arithmetic mean does
not require the arraying or grouping of items.

(8) Typical Value: It is the typical value of the variable and the centre of gravity
balancing the values on either side of it.

(9) Algebraic Properties: It possesses many algebraic properties which the other
measures of central tendency do not possess.

Demerits or Limitations of Arithmetic Mean


1. Affected by extreme items: It is disproportionately affected by the extreme
values of a variable. Thus the smallest and the largest values affect its calculation.
For example, if the marks obtained by three students are 10, 50 and 120, the
arithmetic mean would be 60 marks. It does not truly represent the group, because
the mean is largely affected by the highest value, 120.

2. Cannot be estimated by construction of a variable: Unlike the median and the mode,
which can be estimated by looking at the construction of the frequency distribution,
the arithmetic mean cannot be estimated in this way.

3. Fictitious figure: Sometimes arithmetic mean may not be an actual item in a
variable. Thus it is called a fictitious average. It may not represent even a single
item of the variable. Thus it may give a meaningless figure.

4. Absurd results: Sometimes it gives meaningless and absurd results. The 'Punch'
Journal once remarked, 'The figure of 1.6 children per adult female was felt to be
an absurd result'.

5. Graphical presentation not possible: It can neither be determined by
inspection nor be located by means of a graph.

6. All values required: It cannot be used in the case of open-end classes such as
'less than 20' and 'more than 80', since mid-values cannot be obtained for such
classes unless both limits are estimated or the total values for such classes are given.

7. Not suitable for Qualitative Phenomenon: It cannot be used when dealing
with qualitative characteristics such as honesty, beauty, intelligence etc.

Uses of Arithmetic Mean


This measure of central tendency is used in those variables where the distribution
of frequency is normal or moderately asymmetrical and equal weightage is given
to all the values of a variable. Average income, average price, average height,
average production, average weight, average imports etc. are the cases where this
measure is used. It cannot be used in qualitative phenomena.

MEDIAN
Median is a positional average. It is the value of the middle item of a variable when
the items are arranged according to their values in either ascending or descending
order. Its value is so located in the frequency distribution that it divides it in half,
with 50% of the items below it and 50% of the items above it.
In the words of Dr. A. L. Bowley, "If the numbers of the groups are ranked in order
according to the measurement under consideration, then the measurement of the
number most nearly one-half is the median."

$$\text{Median} = \text{Size of the } \left(\frac{N+1}{2}\right)^{\text{th}} \text{ item}$$

Characteristics of Median
1. It is an average of position.
2. Median is not affected much by extreme items.
3. The sum of deviations about the median, signs ignored, will be less than the sum
of deviations from any other point.
4. Median can be computed even if the items are not expressed quantitatively.

Illustration: 12
From the following data of the wages of 7 workers, compute the median wage:
Wages (in `) 4100 4150 6080 7120 5200 6160 7400
Ans: Median = ` 6080

Illustration: 13
Obtain the value of median from the following data of the monthly income of 10
employees of a company in `.
4391 5384 5591 5407 6672 6522 6777 6753 7850 7490
Ans: Median = ` 6597
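
The positional rule for individual observations can be sketched as follows. With an even number of items the (N + 1)/2th position falls between two items, so their average is taken. The function name is illustrative.

```python
def median(values):
    """Middle item of the sorted values; average of the two middle items if N is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Illustration 12: wages of 7 workers.
print(median([4100, 4150, 6080, 7120, 5200, 6160, 7400]))  # 6080

# Illustration 13: monthly income of 10 employees.
print(median([4391, 5384, 5591, 5407, 6672, 6522, 6777, 6753, 7850, 7490]))  # 6597.0
```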

Merits of Median
The following are the merits of median:

1. Simple and Easy: It is easy to calculate and readily understood, especially for
individual observations and a discrete variable.

2. Determinable in All Circumstances: It can be determined for irregular class
intervals and open-end classes of a variable (unlike arithmetic mean).

3. Not affected by Extreme Values: Median is not affected by the extreme items
of a variable.

4. Usually An Actual Value: Median is usually an actual figure of the series. In the
case of individual observations with an odd number of items, and of a discrete
variable, it is always an actual figure of the series.

5. Suitable for Qualitative Phenomenon: Median is an appropriate measure of
central tendency in qualitative phenomena like intelligence, beauty, honesty etc.

6. Location by Inspection Possible: Sometimes the median can be located by
inspection alone.

7. Graphical Presentation Possible: Median can also be determined by the graphical
method.

8. All Values not needed: The total data are not required for calculating median.
The number of items and the median group are sufficient to calculate median.

9. Rigidly Defined: Median is rigidly and clearly defined. It can always be calculated.

10. Least Sum of Absolute Deviations: The sum of the deviations of the values about
the median (ignoring signs) is the least.

Demerits of median
1. Arraying is difficult: The computation of median requires arraying of the items
which is very cumbersome.

2. Algebraic treatment not possible: Being an average of position, median is not a
mathematical concept suitable for further algebraic treatment.

3. Total values cannot be obtained: If the median and the number of items of a
distribution are given, the total value cannot be known.

4. Unstable values: Median tends to be a rather unstable value if the number of
items is small. If the items are sparse in the middle of the distribution, median will be
an unreliable measure of central tendency.

5. Based on assumption in some cases: It is not possible to obtain the actual
median when the group has an even number of observations; in such a
case it is taken as the average of the two middle values (an assumption).

6. No weightage to extreme values: Very little importance is attached to the
items on the extremes, and as such the median fails to register changes in the
values of the items on the extremes.

7. Uncertain and Indefinite value: A little change in the series may change the
value of the median substantially. For example, if marks of 7 students are 5, 12, 17,
19, 48, 52 and 54, then median is 19. If marks of two more students 55 and 60 are
added in the series, then median will be 48. Thus, it is not a representative measure
of central tendency.

Uses of Median
The median is used in such cases where individual items are comparable separately
but they are to be included into the groups for comparison. Median is also used for
problems that are not expressed quantitatively. For example, intelligence, beauty,
health etc. cannot be quantified. In the study of business and economic problems
also, the median is used. For example, to obtain average wages, distribution of
assets etc. the median is used. It is also used in cases where extreme items are not
given any importance.

Other Measures based on the Principle of Median: Just as the median divides a
series arranged in ascending or descending order into two equal parts, a variable
may similarly be divided into four, five, eight, ten or hundred equal parts. These
fractiles are:

No. of equal parts    No. of points needed    Names of Fractiles
2                     1                       Median (M)
4                     3                       Quartiles (Q)
5                     4                       Quintiles (Qn)
8                     7                       Octiles (O)
10                    9                       Deciles (D)
100                   99                      Percentiles (P)

These measures are:

(1) Quartiles: If a distribution is divided into 4 equal parts then there will be three
quartiles Q1, Q2 and Q3. Q2 is the median itself. Q1 is also called lower quartile and
Q3 as upper quartile.

(2) Quintiles: If a variable is divided into 5 equal parts then there are four
quintiles, Qn1, Qn2, Qn3 and Qn4.

(3) Octiles: If a series is divided into 8 equal parts then there are 7 octiles, O1, O2,
O3.... O7 .Here O4 = median, O2 = Q1and O6 = Q3.

(4) Deciles: If a distribution is divided into 10 equal parts, then there will be 9
deciles, D1,D2, D3....D9. D5 = Median.

(5) Percentiles: If a distribution is divided into 100 equal parts, then there will be
99 percentiles. They are denoted as P1, P2, P3....P99. P50 = Median, P25 = Q1
and P75 = Q3.
These different measures are calculated just like median. Most of these
measures are used in dispersion and skewness. The formulae which are used in
their calculations are listed below:

Measure   Individual series     Discrete series       Continuous series     Formulae used
          (size of the item)    (size of the item)    (to find out item)    (continuous series)

Q1        (N + 1)/4             (N + 1)/4             N/4                   Q1 = l1 + i/f (q1 - c)
Q3        3(N + 1)/4            3(N + 1)/4            3N/4                  Q3 = l3 + i/f (q3 - c)
Qn1       (N + 1)/5             (N + 1)/5             N/5                   Qn1 = ln1 + i/f (qn1 - c)
Qn2       2(N + 1)/5            2(N + 1)/5            2N/5                  Qn2 = ln2 + i/f (qn2 - c)
Qn4       4(N + 1)/5            4(N + 1)/5            4N/5                  Qn4 = ln4 + i/f (qn4 - c)
O1        (N + 1)/8             (N + 1)/8             N/8                   O1 = l1 + i/f (o1 - c)
O3        3(N + 1)/8            3(N + 1)/8            3N/8                  O3 = l3 + i/f (o3 - c)
O5        5(N + 1)/8            5(N + 1)/8            5N/8                  O5 = l5 + i/f (o5 - c)
O7        7(N + 1)/8            7(N + 1)/8            7N/8                  O7 = l7 + i/f (o7 - c)
D1        (N + 1)/10            (N + 1)/10            N/10                  D1 = l1 + i/f (d1 - c)
D3        3(N + 1)/10           3(N + 1)/10           3N/10                 D3 = l3 + i/f (d3 - c)
D4        4(N + 1)/10           4(N + 1)/10           4N/10                 D4 = l4 + i/f (d4 - c)
D7        7(N + 1)/10           7(N + 1)/10           7N/10                 D7 = l7 + i/f (d7 - c)
D9        9(N + 1)/10           9(N + 1)/10           9N/10                 D9 = l9 + i/f (d9 - c)
P30       30(N + 1)/100         30(N + 1)/100         30N/100               P30 = l30 + i/f (p30 - c)
P76       76(N + 1)/100         76(N + 1)/100         76N/100               P76 = l76 + i/f (p76 - c)
P99       99(N + 1)/100         99(N + 1)/100         99N/100               P99 = l99 + i/f (p99 - c)

Similarly other percentiles can be obtained. The notations used in the above table
mean:
Q1 = First Quartile, Q3 = Third Quartile, Qn1 = First Quintile,
Qn2 = Second Quintile, Qn4 = Fourth Quintile, O1 = First Octile,
O3 = Third Octile, O5 = Fifth Octile, O7 = Seventh Octile,
D1 = First Decile, D3 = Third Decile, D4 = Fourth Decile,
D7 = Seventh Decile, D9 = Ninth Decile, P30 = Thirtieth Percentile,
P76 = Seventy-sixth Percentile, P99 = Ninety-ninth Percentile,
i = (l2 – l1) = class interval of the related class, f = frequency of the related class in
which the relevant value lies, c = cumulative frequency of the class preceding
the related class.
In each formula, l with the subscript of the fractile is the lower limit of the class in
which that fractile lies, and the small letter (q1, d7, p30 and so on) is the item number
obtained from the continuous series column.

Illustration: 14
In a class test, marks obtained by 15 students are given below. Find out the Median,
Lower Quartile and Upper Quartile:
Marks = 6, 9, 10, 12, 18, 19, 23, 23, 24, 28, 37, 48, 49, 53 and 60.
Ans: M= 23 Marks, Q1 = 12 Marks, Q3 = 48 Marks.

Illustration: 15
From the following monthly income of 15 families, calculate first & third quartiles,
3rd quintile, 5th octile, 7th decile and 30th percentile:
100, 120, 140, 160, 170, 175, 180, 184, 189, 194, 220, 230, 250, 280, 330.
Ans: Q1= Rs 160, Q3 = Rs 230, Qn3 = Rs 192, O5 = Rs 194, P30 = Rs 168,
and D7 = Rs 222.
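
The individual-series positions from the table above, k(N + 1)/parts, can be sketched as a single function. When the position is fractional, this sketch interpolates linearly between the two neighbouring items, which is the approach that reproduces the answers of Illustration 15; the function name and signature are illustrative.

```python
def fractile(values, k, parts):
    """k-th fractile when the series is divided into `parts` equal parts.

    Position = k * (N + 1) / parts (1-based); a fractional position is
    resolved by linear interpolation between the neighbouring items.
    """
    ordered = sorted(values)
    n = len(ordered)
    position = k * (n + 1) / parts
    lower = int(position)          # whole part of the position
    fraction = position - lower    # fractional part of the position
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


# Illustration 15: monthly income of 15 families.
incomes = [100, 120, 140, 160, 170, 175, 180, 184, 189, 194, 220, 230, 250, 280, 330]

for name, k, parts in [("Q1", 1, 4), ("Q3", 3, 4), ("Qn3", 3, 5),
                       ("O5", 5, 8), ("D7", 7, 10), ("P30", 30, 100)]:
    print(name, round(fractile(incomes, k, parts), 2))
```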

Illustration: 16
From the data given in the following table calculate 1st and 3rd quartiles, 4th
decile, 5th octile and 70th percentile:
Size 0 2 4 6 8 10
Frequency 5 7 10 9 5 3
Ans: Q1 =2, Q3 = 6, D4 = 4, O5 = 6 and P70 = 6 units.

Illustration: 17
Calculate both the Quartiles, 2nd Quintile, 3rd Octile, 9th Decile and 45th
Percentile from the following data:
Marks Less than 10 20 30 40 50 60 70 80
No. of Students 4 16 40 76 96 112 120 125
Ans: Q1 = 26.35 marks, Q3 = 48.88, Qn2 = 32.78, O3 = 31.91, P45 = 34.51,
and D9 = 60.63
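
For a continuous series the formula l + i/f (q - c) can be sketched as below, using the "less than" data of Illustration 17. The sketch assumes equal class widths of 10 and that the first class starts at 0; the function name is illustrative.

```python
def grouped_fractile(upper_limits, cum_freqs, k, parts, width):
    """Fractile of a 'less than' cumulative series with equal class widths.

    q = k * N / parts is the item number; the fractile class is the first
    class whose cumulative frequency reaches q.
    """
    n = cum_freqs[-1]
    q = k * n / parts
    previous_cum = 0                      # c: cumulative frequency of preceding class
    for upper, cum in zip(upper_limits, cum_freqs):
        if cum >= q:
            f = cum - previous_cum        # frequency of the fractile class
            lower = upper - width         # l: lower limit of the fractile class
            return lower + (width / f) * (q - previous_cum)
        previous_cum = cum
    raise ValueError("fractile position lies outside the distribution")


upper_limits = [10, 20, 30, 40, 50, 60, 70, 80]
cum_freqs = [4, 16, 40, 76, 96, 112, 120, 125]

for name, k, parts in [("Q1", 1, 4), ("Q3", 3, 4), ("Qn2", 2, 5),
                       ("O3", 3, 8), ("P45", 45, 100), ("D9", 9, 10)]:
    print(name, f"{grouped_fractile(upper_limits, cum_freqs, k, parts, 10):.3f}")
```

With these inputs O3 evaluates to about 31.91 (item number 46.875 in the class 30-40).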

Determination of Missing Value or Frequency: When the median of a series is
given but a value or a frequency is missing, the missing item x can be calculated by
substituting the known values in the formula used to locate the median and then
simplifying it.

Illustration: 18
From the following data find out the missing frequency if median value is 50.
Class interval 10-20 20-30 30-40 40-50 50-60 60-70
Frequency 2 8 6 - 15 10
Ans: missing frequency = 9

Illustration: 19
Find out the missing frequencies from the following frequency distribution if
median value is Rs 50 and total no. of families 100:
Expenditure (Rs) 0-20 20-40 40-60 60-80 80-100 Total
No. of Families 14 ? 26 ? 16 100
Ans: Two missing frequencies are 23 and 21

Computation of median – Discrete series.

$$\text{Median} = \text{size of the } \left(\frac{N+1}{2}\right)^{\text{th}} \text{ item}$$
Illustration: 20
From the following data, find the value of median:
Income (`) 4000 4500 5800 5060 6600 5380
No.of persons 24 26 16 20 6 30
Ans: Median = ` 5060.
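
For a discrete series the median item is located through cumulative frequencies after arranging the values in ascending order. A minimal sketch for Illustration 20 follows; the helper names are illustrative, and for an even N it averages the two middle items.

```python
def value_at(pairs, k):
    """Value of the k-th item (1-based) in a sorted (value, frequency) list."""
    cumulative = 0
    for value, freq in pairs:
        cumulative += freq
        if cumulative >= k:
            return value
    raise ValueError("k is larger than the total frequency")


def discrete_median(values, freqs):
    """Median of a discrete series located by cumulative frequency."""
    pairs = sorted(zip(values, freqs))   # arrange the values in ascending order
    n = sum(freqs)
    if n % 2 == 1:
        return value_at(pairs, (n + 1) // 2)
    return (value_at(pairs, n // 2) + value_at(pairs, n // 2 + 1)) / 2


# Illustration 20: incomes are not given in order, so they are sorted first.
incomes = [4000, 4500, 5800, 5060, 6600, 5380]
persons = [24, 26, 16, 20, 6, 30]

print(discrete_median(incomes, persons))  # 5060.0
```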

Calculation of median – continuous series.

$$\text{Median} = L + \frac{\frac{N}{2} - c.f.}{f} \times i$$

L = lower limit of the median class.
c.f. = cumulative frequency of the class preceding the median class.
f = simple frequency of the median class.
i = the class interval of the median class.

Illustration: 21
Calculate the median for the following frequency distribution:

Marks No. of students


45 - 50 10
40 - 45 15
35 - 40 26
30 - 35 30
25 - 30 42
20 - 25 31
15 - 20 24
10 -15 15
5 - 10 7
Ans: Median = 27.74.

Illustration: 22
Calculate the median from the following data:
Weight (in gms) No. of Apples
410 - 419 14
420 - 429 20
430 - 439 42
440 - 449 54
450 - 459 45
460 - 469 18
470 - 479 7
Ans: Median = 443.94.
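
The continuous-series median formula can be sketched for Illustrations 21 and 22. Two assumptions are made here: the classes are first arranged in ascending order, and the inclusive classes of Illustration 22 (410-419, 420-429, ...) are converted to class boundaries by moving each limit by 0.5. The function name is illustrative.

```python
def grouped_median(boundaries, freqs):
    """Median = L + (N/2 - c.f.) / f * i for classes in ascending order.

    boundaries: list of (lower, upper) class boundaries.
    """
    n = sum(freqs)
    if n == 0:
        raise ValueError("total frequency must be positive")
    half = n / 2
    cumulative = 0                        # c.f. of the classes before the current one
    for (lower, upper), f in zip(boundaries, freqs):
        if cumulative + f >= half:
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("median class not found")


# Illustration 21, rearranged in ascending order.
marks = [(5, 10), (10, 15), (15, 20), (20, 25), (25, 30),
         (30, 35), (35, 40), (40, 45), (45, 50)]
students = [7, 15, 24, 31, 42, 30, 26, 15, 10]
print(f"{grouped_median(marks, students):.2f}")

# Illustration 22: inclusive limits converted to boundaries (409.5-419.5, ...).
limits = [(410, 419), (420, 429), (430, 439), (440, 449),
          (450, 459), (460, 469), (470, 479)]
weights = [(lo - 0.5, hi + 0.5) for lo, hi in limits]
apples = [14, 20, 42, 54, 45, 18, 7]
print(f"{grouped_median(weights, apples):.2f}")
```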

Illustration: 23
From the following data calculate median.
Marks No. of students
Less than 5 29
Less than 10 224
Less than 15 465
Less than 20 582
Less than 25 634
Less than 30 644
Less than 35 650
Less than 40 653
Less than 45 655
Ans: Median = 12.15

MODE
Mode is the value of the observation which occurs with the greatest frequency and
is thus the most fashionable value.
Zizek has defined it as, "Mode is the value occurring most frequently in a series of
items and around which the other items are distributed most densely."

Calculation of mode – Individual observations.

Illustration: 24
Calculate the mode from the following data of the marks obtained by 10 students.

Marks obtained 10 27 24 12 27 27 20 18 15 30

Ans: Mode = 27

Calculation of mode – Discrete series.

Illustration: 25
Calculate mode from the following data:
Size of garment No. of persons wearing
28 10
29 20
30 40
31 65
32 50
33 15
Ans: Mode = 31

Calculation of mode – continuous series.

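$$M_o = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i$$

$$\Delta_1 = f_1 - f_0$$
$$\Delta_2 = f_1 - f_2$$

where L = lower limit of the modal class, f1 = frequency of the modal class,
f0 = frequency of the class preceding the modal class, f2 = frequency of the class
succeeding the modal class, and i = class interval of the modal class.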

Illustration: 26
Calculate mode from the following data:

Marks No. of students


Above 0 80
Above 10 77
Above 20 72
Above 30 65
Above 40 55
Above 50 43
Above 60 28
Above 70 16
Above 80 10
Above 90 8
Above 100 0
Ans: Mo = 55

Illustration: 27
Find the value of mode from the data given below:
Weight (kg) No. of students
93 - 97 2
98 - 102 5
103 - 107 12
108 - 112 17
113 - 117 14
118 - 122 6
123 - 127 3
128 - 132 1
Ans: Mo = 110.625 kg
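
The mode formula for a continuous series can be sketched on Illustrations 26 and 27. The sketch assumes equal class widths and a single, clearly highest frequency; it does not implement the grouping method needed when the highest frequency is repeated. For Illustration 26 the "above" figures are first converted to simple frequencies, and for Illustration 27 the inclusive limits are converted to boundaries (92.5, 97.5, ...). The function name is illustrative.

```python
def grouped_mode(lower_boundaries, freqs, width):
    """Mo = L + d1 / (d1 + d2) * i, with d1 = f1 - f0 and d2 = f1 - f2."""
    m = max(range(len(freqs)), key=lambda k: freqs[k])     # index of the modal class
    f1 = freqs[m]
    f0 = freqs[m - 1] if m > 0 else 0                      # preceding class
    f2 = freqs[m + 1] if m < len(freqs) - 1 else 0         # succeeding class
    d1, d2 = f1 - f0, f1 - f2
    return lower_boundaries[m] + d1 / (d1 + d2) * width


# Illustration 27: classes 93-97, 98-102, ... become 92.5-97.5, 97.5-102.5, ...
weights_lower = [92.5, 97.5, 102.5, 107.5, 112.5, 117.5, 122.5, 127.5]
weights_freq = [2, 5, 12, 17, 14, 6, 3, 1]
print(grouped_mode(weights_lower, weights_freq, 5))  # 110.625

# Illustration 26: simple frequencies are differences of successive 'above' figures.
above = [80, 77, 72, 65, 55, 43, 28, 16, 10, 8, 0]
marks_freq = [a - b for a, b in zip(above, above[1:])]
marks_lower = list(range(0, 100, 10))
print(grouped_mode(marks_lower, marks_freq, 10))  # 55.0
```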

Illustration: 28
Calculate mode in the following series.

Items 0-5 5-10 10-15 15-20 20-25 25-30 30-35


Frequency 1 2 10 4 10 9 2
Ans: Mo = 24.28

Illustration: 29
Calculate Arithmetic Average, Median and the Mode for the following series:
Central Size (years) 5 15 25 35 45 55 65 75
Frequency (f) 5 10 14 22 21 16 9 3

Ans: Mean = 39.3 years, Median = 39.55 years, Mode = 38.89 years

Illustration: 30
For the data given below calculate the Mode:
Class Intervals 55-60 50-55 45-50 40-45 35-40 30-35 25-30 20-25
Frequency 5 7 12 15 18 8 7 5
Ans: Mode = 40
To find out missing frequency: Apply the formula for locating mode and form an
equation to find out the missing item (x)

Illustration: 31
If the mode of the following variable is 240, find the frequency of the class 300 -
400:
Expenditure in Rs 0-100 100-200 200-300 300-400 400-500 500-600
No. of Persons 140 230 270 ? 150 140
Ans: Missing frequency (f2) = 210

Principal Characteristics of Mode


(1) Mode is an average of location.
(2) Mode is not affected by the extreme items of a variable.
(3) Mode is the point of maximum concentration as such the frequency distribution
of a series can be estimated easily.
(4) Approximate value of mode can be located easily.
(5) Mode is not suitable for further mathematical treatment.

Merits of Mode

1. Simple Calculation: Mode is easy to calculate and easy to understand. In some
cases it is located simply by inspection.

2. Calculation Possible: It can also be calculated in the case of open-end classes,
which do not create any problem.

3. Least affected by extreme value: It is not affected by the extreme or abnormal
items of a variable. As such it is preferred to arithmetic mean while
dealing with extreme observations.

4. Graphic Determination Possible: Mode can be determined by means of a
graphic presentation, unlike arithmetic mean.

5. Not affected by scattered values: Like median, it is not affected by the
dispersion or scatteredness of the series.

6. Same result from different samples: It is the most probable value in the
distribution. It means that if a random value is chosen from a frequency distribution,
the probability of the modal value will be more than that of any other value.

7. Representative size: Mode is the most representative item of a variable since it
is the value of maximum concentration.

8. Most useful value: In view of the aforementioned merits, mode is an average
having more practical use compared to mean or median. An average man takes the
'average' in the sense of 'mode' in statistics.

Demerits of Mode
1. Indefinite and uncertain: Mode is sometimes uncertain and ill-defined
particularly when maximum frequencies are repeated or frequency distribution is
irregular.

2. Algebraic treatment not possible: Mode is not suitable for further mathematical
treatment. The mode of two or more combined groups cannot be ascertained from
the modes of the separate groups.

3. Affected by sampling fluctuations: It is affected to a greater extent by the
fluctuations of sampling as compared to the arithmetic mean.

4. Not based on all the values: Mode is the value with the maximum
frequency, so it is not based on all the values of a variable.

5. Unsuitable for relative weightage: It is unsuitable in cases where the relative
importance of items has to be considered.

6. No weightage to extreme items: It does not give any weightage to the extreme
items. Thus this average is unsuitable where importance is to be given to the
extreme values of a variable.

Relationship between Mean, Median and Mode:


(i) In a symmetrical distribution, the values of Mean ($\bar{X}$), Median (M) and
Mode (Z) coincide at one point, i.e., $\bar{X} = M = Z$.
(ii) In a moderately skewed or asymmetrical distribution, the average relationship
will be:
$$Z = \bar{X} - 3(\bar{X} - M) \quad \text{or} \quad Z = 3M - 2\bar{X}$$
$$M = Z + \tfrac{2}{3}(\bar{X} - Z) \quad \text{or} \quad M = \bar{X} - \tfrac{1}{3}(\bar{X} - Z)$$
$$\bar{X} = \tfrac{1}{2}(3M - Z)$$

Illustration: 32
(i) If Mode and Mean in a moderately asymmetrical series are 16 cm and 15.6 cm,
what would be the probable value of Median?
(ii)If Median and Mean of a series are 14 and 15 respectively, estimate probable
value of Mode.
(iii)If Mode and Median of a series are respectively 14 and 13, what would be its
Mean?
Ans: (i) M = 15.73cm. (ii) Z = 12 (iii) X = 12.5
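
The three parts of Illustration 32 are rearrangements of the same empirical relation Z = 3M - 2X. A minimal sketch (the function names are illustrative, not part of the notes):

```python
def empirical_mode(mean, median):
    """Z = 3M - 2X for a moderately skewed distribution."""
    return 3 * median - 2 * mean


def empirical_median(mean, mode):
    """M = (2X + Z) / 3, the same relation solved for the median."""
    return (2 * mean + mode) / 3


def empirical_mean(median, mode):
    """X = (3M - Z) / 2, the same relation solved for the mean."""
    return (3 * median - mode) / 2


print(round(empirical_median(mean=15.6, mode=16), 2))  # part (i)
print(empirical_mode(mean=15, median=14))              # part (ii)
print(empirical_mean(median=13, mode=14))              # part (iii)
```

These are estimates only: the relation holds approximately, and only for moderately asymmetrical distributions.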

MEASURES OF DISPERSION
In two or more distributions the central value may be the same but still there can be
wide disparities in the formation of the distributions. Measures of dispersion help us in
studying this important characteristic of a distribution.

DEFINITIONS
"Dispersion is the measure of the variation of the items."
"The degree to which numerical data tend to spread about an average value is
called the variation or dispersion of the data."
The study of dispersion is of great significance in practice as could well be
appreciated from the following example.

         Series A   Series B   Series C
           100        100          1
           100        105        489
           100        102          2
           100        103          3
           100         90          5
Total      500        500        500
Mean       100        100        100

All three series have the same mean (100), yet the items of Series A do not vary
at all, those of Series B vary slightly, and those of Series C vary widely. The
mean alone therefore does not describe a distribution adequately.

METHODS OF STUDYING VARIATION

The following are the important methods of studying variation:

(1) The Range
(2) The Interquartile Range and the Quartile Deviation
(3) The Mean Deviation or Average Deviation
(4) The Standard Deviation

(1) RANGE
It is the simplest of all the measures of dispersion. It is the difference between the
two extreme observations of the distribution. In other words, range is the difference
between the highest (maximum) and the lowest (minimum) values of a variable. Thus,
$$\text{Range } (R) = L - S$$
where L = largest item and S = smallest item.

Coefficient of Range: Range is an absolute measure of dispersion expressed in the
units of measurement, so it cannot be used for comparison between two or more
variables, particularly those having different units of measurement. To compare the
variability of two or more distributions given in different units of measurement, the
relative measure of range, called the Coefficient of Range, is used. The formula for
its calculation is:
$$\text{Coefficient of Range} = \frac{L - S}{L + S}$$

Illustration: 33
The following are the prices of shares of AB Co. Ltd. from Monday to Saturday.
Calculate the coefficient of range.
Day Price (₹)
Monday 200
Tuesday 210
Wednesday 208
Thursday 160
Friday 220
Saturday 250
Ans: Coefficient of Range = 0.22

Illustration: 34

Calculate the coefficient of range from the following data.

Marks      No. of students
10 - 20          8
20 - 30         10
30 - 40         12
40 - 50          8
50 - 60          4
Ans: Coefficient of Range = 0.714
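
Illustrations 33 and 34 can be checked with a short sketch. It assumes, as is usual for a continuous series, that L is the upper limit of the highest class and S the lower limit of the lowest class; the function name is illustrative.

```python
def coefficient_of_range(largest, smallest):
    """(L - S) / (L + S): a unit-free relative measure of range."""
    return (largest - smallest) / (largest + smallest)


# Illustration 33: individual observations (share prices).
prices = [200, 210, 208, 160, 220, 250]
coeff_33 = coefficient_of_range(max(prices), min(prices))
print(round(coeff_33, 2))

# Illustration 34: continuous series, so only the extreme class limits matter.
coeff_34 = coefficient_of_range(largest=60, smallest=10)
print(round(coeff_34, 3))
```

Note that the frequencies play no part in either calculation, which is exactly the fifth merit (and a weakness) of range listed below.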

Merits of Range
(1) Range is simple to calculate and easy to understand.
(2) It is rigidly defined.
(3) It gives a broad picture of the data and gives the limits within which all the
items fall.
(4) Its use is very common. It is used in quality control of the articles
produced, in geographical studies, and wherever the highest and the lowest
values are of interest.
(5) Frequencies are not needed for its calculation. Only values are considered and
so it is not affected by the frequencies.

Demerits of Range
(1) Range is not based on the complete set of data.
(2) It is not a reliable measure of variability since it is based on only two extreme values.
(3) It cannot be calculated in case of open-end classes.
(4) Range is affected very much by fluctuations of sampling. It varies widely from
sample to sample.
(5) It is not amenable to further mathematical treatment.
(6) In the words of W. I. King, "Range is too indefinite to be used as a practical
measure of dispersion."

Uses of Range
Despite its limitations, Range is a useful measure in the following areas:

(1) Quality Control: It is applied for quality control measures. Control charts
are prepared on the basis of range for controlling the quality of the products.

(2) Measurement of Fluctuations: It is a useful measure for studying variations in
data in our day-to-day life.

(3) Used in Predictions: It is used by the meteorological department for weather
forecasts, since people are interested in knowing the limits of temperature.

(2) THE INTER QUARTILE RANGE OR QUARTILE DEVIATION


Inter-quartile range is a measure of partial range. It is calculated by deducting the
value of the lower quartile from the value of the upper quartile. This measure takes into
account only the middle 50% of the items. The frequencies of class intervals are also
given weightage in its computation. The following process is adopted for its
calculation:
(1) First of all compute the values of the lower and upper quartiles (Q1 and Q3):
$$Q_1 = \text{value of the } \left(\frac{N+1}{4}\right)\text{th item}, \qquad Q_3 = \text{value of the } \left(\frac{3(N+1)}{4}\right)\text{th item}$$
(In a continuous series the N/4 th and 3N/4 th items are located instead.)
(2) Deduct the value of Q1 from the value of Q3 to obtain its value:
$$\text{Inter-quartile Range (I.Q.R.)} = Q_3 - Q_1$$

Quartile Deviation or Semi Inter-Quartile Range
It is a measure of dispersion based on the values of the upper and lower quartiles.
It is half of the Inter-Quartile Range. As such it is called the Semi Inter-Quartile Range.
The formula for its calculation is:
$$\text{Q.D.} = \frac{Q_3 - Q_1}{2}$$
Here Q.D. = quartile deviation, Q3 = upper quartile, Q1 = lower quartile.

Coefficient of Quartile Deviation: Quartile deviation is an absolute value not
suitable for doing comparative studies. For a comparative study between two or
more variables, its relative measure, called the coefficient of quartile deviation, is
computed. The formula to be used is:
$$\text{Coefficient of Q.D.} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$

Illustration: 35
Find out the value of Q.D and its coefficient from the following data:

Marks: 20 28 40 12 30 15 50
Ans: Q1= 15, Q3= 40, Q.D = 12.5, Coefficient of Q.D = 0.455
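
For individual observations, Illustration 35 can be reproduced with the sketch below. It locates the (N+1)/4 th and 3(N+1)/4 th items in the sorted data and interpolates when the position is fractional; the helper names are illustrative, and the sketch assumes at least three observations.

```python
def item_at(sorted_values, position):
    """Value at a 1-based position; fractional positions are interpolated."""
    lower = int(position)
    fraction = position - lower
    value = sorted_values[lower - 1]
    if fraction > 0:
        value += fraction * (sorted_values[lower] - sorted_values[lower - 1])
    return value


def quartile_deviation(values):
    """Return (Q1, Q3, Q.D., coefficient of Q.D.) for individual observations."""
    data = sorted(values)
    n = len(data)
    q1 = item_at(data, (n + 1) / 4)
    q3 = item_at(data, 3 * (n + 1) / 4)
    return q1, q3, (q3 - q1) / 2, (q3 - q1) / (q3 + q1)


q1, q3, qd, coeff = quartile_deviation([20, 28, 40, 12, 30, 15, 50])
print(q1, q3, qd, round(coeff, 3))
```

Library routines such as `statistics.quantiles` use related but not always identical positioning rules, so their results can differ slightly from the textbook method.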

Illustration: 36
Compute coefficient of quartile deviation from the following data:
Marks 10 20 30 40 50 60
No of students 4 7 15 8 7 2
Ans: Q1= 20, Q3= 40, Q.D = 10, Coefficient of Q.D = 0.333

Illustration: 37
Calculate quartile deviation and the coefficient of quartile deviation from the
following data.
Wages in Rupees per week     No. of wage earners
Less than 35                        14
35-37                               62
38-40                               99
41-43                               18
Over 43                              7
Ans: Q1 = 36.24, Q3 = 39.74, Q.D = 1.75, Coefficient of Q.D = 0.046

Merits of I.Q.R
(1) It is simple to calculate and easy to understand as compared to Range.
(2) It is not affected by the two extreme values of the variable.

Demerits of I.Q.R
(i) It cannot be considered a representative measure since it is based only on the
middle 50% of the items.
(ii) It is not based on all the observations of a variable.
(iii) This measure does not clarify the formation of the series.
(iv) It is not amenable to algebraic treatment. Thus it is not a satisfactory measure
of dispersion.

(3) THE MEAN DEVIATION


Mean deviation or the Average deviation is a measure of dispersion which is
based upon all the items in a variable. It is the arithmetic mean of the deviations of
the values from a measure of central tendency. In the words of Clark and Schkade,
"Average deviation is the average amount of scatter of items in a distribution from
either the mean or the median, ignoring signs of the deviations."

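Thus it is a measure obtained by calculating the absolute deviation of each
observation from the median or the mean, taking all deviations as positive, and
then averaging these deviations by taking their arithmetic mean. Mean deviation
from the arithmetic mean is also called the First Moment of Dispersion.
$$\text{M.D.} = \frac{1}{N}\sum |X - A| = \frac{\sum |D|}{N}$$
where |D| = |X - A| and A is the average (mean or median) from which the
deviations are taken.
$$\text{Coefficient of M.D.} = \frac{\text{M.D.}}{\text{Median}}$$
(If the deviations are taken from the mean, divide by the mean instead.)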

Illustration: 38
Calculate the mean deviation and its coefficient for the incomes of the members
given below:
Income (₹) 4000 4200 4400 4600 4800
Ans: Median = 4400, M.D = 240, C.M.D = 0.054.
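
A minimal sketch of Illustration 38, taking the deviations from the median as the answer does (the function name is illustrative):

```python
from statistics import median


def mean_deviation(values, center):
    """Average of the absolute deviations |X - A| from the chosen average A."""
    return sum(abs(x - center) for x in values) / len(values)


incomes = [4000, 4200, 4400, 4600, 4800]
med = median(incomes)
md = mean_deviation(incomes, med)
coefficient = md / med
print(med, md, round(coefficient, 4))
```

Passing the arithmetic mean as `center` gives the mean deviation from the mean instead.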

Merits of Mean Deviation


Mean deviation possesses the following merits as a measure of dispersion:

(1) Easy to calculate: It is simple to calculate and easy to understand as compared
to other measures of dispersion.

(2) Possible from any average: Mean deviation can be calculated from any
average, viz. mean, median or mode. But the use of the median and the mean is most
popular.

(3) Based on all the items: Mean deviation takes into account all the items of a
series. Hence it is affected by every value in the distribution. Thus it is a more
comprehensive measure of variation than the range or the quartile deviation.

(4) Less affected by extreme values: It is less affected by extreme values. It is a
good measure particularly in small samples which have extreme values.

(5) Gives importance to distribution: Mean deviation shows the significance of an
average in a distribution.

(6) Relative weightage: It gives relative importance to all the values in a variable.

(7) Calculated value: Mean deviation is rigidly defined and is a precisely
calculated value.

Demerits of Mean Deviation


The principal demerits of mean deviation are:

(1) Signs are ignored: Since signs are ignored in its calculation, it has poor
combining properties as far as its use in advanced statistical techniques is
concerned. As such it has gained very limited acceptance in statistical applications.

(2) Unreliable: In some situations it gives unsatisfactory results, particularly when
it is calculated from the mode and the mode itself is uncertain.

(3) Lack of uniformity: Mean deviation can be calculated from the mean, median or
mode. It gives a different result in each case, since the sum of the deviations
differs from one average to another. Thus it is not well defined.

Uses: Mean deviation, despite its demerits, is useful when working with small samples. It
is generally used in the statistical analysis of economic, business and social
phenomena.

Calculation of Mean Deviation in a Discrete Series

$$\text{M.D.} = \frac{\sum f|D|}{N}$$
where f = frequency, |D| = |X - A| and N = Σf.

Illustration: 39
Calculate mean deviation from the following series:
X 10 11 12 13 14
f 3 12 18 12 3
Ans: Median = 12, M.D = 0.75.

Illustration: 40
Calculate the mean deviation from the mean for the following data:
Size 2 4 6 8 10 12 14 16
Frequency 2 2 4 5 3 2 1 1
Ans: M.D = 2.8.
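
Illustrations 39 and 40 differ only in the average chosen (median in the first, mean in the second). A minimal sketch with an illustrative function name:

```python
def weighted_mean_deviation(values, freqs, center):
    """Sum of f * |X - A| divided by N, for a discrete frequency series."""
    n = sum(freqs)
    return sum(f * abs(x - center) for x, f in zip(values, freqs)) / n


# Illustration 39: deviations from the median (12).
md_39 = weighted_mean_deviation([10, 11, 12, 13, 14], [3, 12, 18, 12, 3], center=12)

# Illustration 40: deviations from the arithmetic mean, computed first.
sizes = [2, 4, 6, 8, 10, 12, 14, 16]
freqs = [2, 2, 4, 5, 3, 2, 1, 1]
mean_40 = sum(x * f for x, f in zip(sizes, freqs)) / sum(freqs)
md_40 = weighted_mean_deviation(sizes, freqs, center=mean_40)

print(md_39, mean_40, md_40)
```

For a continuous series the same function can be used with the class mid-values in place of `values`.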

Calculation of mean deviation by continuous series.

Illustration: 41
Find the median and mean deviation of the following data:
Size frequency
0-10 7
10-20 12
20-30 18
30-40 25
40-50 16
50-60 14
60-70 8
Ans: Median = 35.2, M.D = 13.1

Illustration: 42
Calculate the mean deviation and its coefficient from the following data:
Class Frequency
0-10 5
10-20 8
20-30 12
30-40 15
40-50 20
50-60 14
60-70 12
70-80 6
Ans: Median = 43, M.D = 15.37, C.M.D = 0.357

(4) THE STANDARD DEVIATION


The concept of standard deviation was first introduced by Karl Pearson in 1893.
The standard deviation is the most important and popular measure of dispersion.
Unlike mean deviation, which can be computed from any measure of central
tendency, standard deviation is always computed from the arithmetic mean. While
taking deviations of the values from the mean, algebraic signs are not ignored. These
deviations are squared and then summed. The sum of the squares of the
deviations is divided by the number of items. The square root of this average of the
squared deviations from the mean gives the value of the standard deviation.
The value obtained prior to taking the square root is called the 'Variance'.
Thus, the standard deviation is the square root of the arithmetic mean of the
squares of all deviations, the deviations being measured from the arithmetic mean of
the observations. It is represented by the Greek small letter sigma (σ).

Coefficient of Standard Deviation: Standard deviation is an absolute measure.
Where the variability of two or more series is to be compared, a
relative measure of standard deviation is computed. It is called the coefficient of
standard deviation. It is calculated by dividing the standard deviation (σ) by the mean
($\bar{X}$) of the distribution. Symbolically,
$$\text{Coefficient of Standard Deviation} = \frac{\text{S.D.}}{\text{Mean}} = \frac{\sigma}{\bar{X}}$$

Distinction between Mean Deviation and Standard Deviation

1. Mean Deviation: Deviations may be taken from the mean, median or mode.
   Standard Deviation: Deviations are taken only from the arithmetic mean.

2. Mean Deviation: Algebraic signs, plus or minus, are ignored, i.e. minus signs are
   also treated as plus.
   Standard Deviation: Algebraic signs are not ignored, but the deviations are squared,
   which removes the minus signs.

3. Mean Deviation: It is only an arithmetic mean of the deviations.
   Standard Deviation: It is the square root of the mean of the squares of the deviations.

4. Mean Deviation: It lacks further algebraic treatment since it is based on the
   absolute values of the deviations.
   Standard Deviation: It is capable of further algebraic treatment since the algebraic
   signs of the deviations are not ignored.

5. Mean Deviation: It is easy to calculate when the value of the mean, median or mode
   is a whole number. Its shortcut method is complicated and difficult to
   understand.
   Standard Deviation: Its calculation is not very easy, since the deviations are first
   squared and then the square root is taken of the average of the squared
   deviations.
Calculation of Standard Deviation

Individual Observations
• By taking deviations of the items from the actual mean.
• By taking deviations of the items from an assumed mean.

When deviations are taken from the actual mean:
$$\sigma = \sqrt{\frac{\sum x^2}{N}}, \qquad x = X - \bar{X}$$

When deviations are taken from an assumed mean A, the following formula is applied:
$$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2}, \qquad d = X - A$$

Illustration: 43
Find the standard deviation of ₹ 7, 9, 16, 24, 26.
Ans: S.D = ₹ 7.66.

Illustration: 44
Blood serum cholesterol levels of 10 persons are as under.
240, 260, 290, 245, 255, 288, 272, 263, 277, 251.
Calculate S.D with the help of assumed mean.
Ans: S.D = 16.398.
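
Both formulas for individual observations can be sketched as follows. The assumed mean of 260 used for Illustration 44 is an arbitrary choice made for this sketch; any assumed mean gives the same result.

```python
from math import sqrt


def sd_actual_mean(values):
    """sigma = sqrt(sum(x^2) / N), with x = X - mean."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / n)


def sd_assumed_mean(values, assumed):
    """sigma = sqrt(sum(d^2)/N - (sum(d)/N)^2), with d = X - A."""
    n = len(values)
    d = [x - assumed for x in values]
    return sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)


sd_43 = sd_actual_mean([7, 9, 16, 24, 26])                   # Illustration 43
cholesterol = [240, 260, 290, 245, 255, 288, 272, 263, 277, 251]
sd_44 = sd_assumed_mean(cholesterol, assumed=260)            # Illustration 44
print(round(sd_43, 2), round(sd_44, 3))
```

Both functions divide by N, as these notes do; `statistics.pstdev` follows the same convention, whereas `statistics.stdev` divides by N - 1.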

Calculation of Standard Deviation in a Discrete Series

• Actual mean method
• Assumed mean method
• Step deviation method

(i) Actual mean method:
$$\sigma = \sqrt{\frac{\sum f x^2}{N}}$$
where x = (X - $\bar{X}$).

(ii) Assumed mean method:
$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2}$$
where d = (X - A).

Illustration: 45
Calculate S.D by assumed mean method from the data given below:
Size of item Frequency
3.5 3
4.5 7
5.5 22
6.5 60
7.5 85
8.5 32
9.5 8
Ans: S.D = 1.149.

(iii) Step deviation method:
$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} \times i$$
where d = (X - A) / i and i = class interval (common factor).

Illustration: 46
The annual salaries of a group of employees are given in the following table:
Salaries (in ₹ '000) 45 50 55 60 65 70 75 80
No. of persons 3 5 8 7 9 7 4 7
Calculate the S.D by step deviation method.
Ans: S.D = 10.35.
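
The assumed mean and step deviation methods differ only in the common factor i, so one sketch covers Illustrations 45 and 46. The assumed means (6.5 and 60) are choices made for this sketch; the function name is illustrative.

```python
from math import sqrt


def sd_step_deviation(values, freqs, assumed, step=1):
    """sigma = sqrt(sum(f d^2)/N - (sum(f d)/N)^2) * i, with d = (X - A) / i.

    With step=1 this is the plain assumed mean method.
    """
    n = sum(freqs)
    d = [(x - assumed) / step for x in values]
    sum_fd = sum(f * v for f, v in zip(freqs, d))
    sum_fd2 = sum(f * v * v for f, v in zip(freqs, d))
    return sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * step


# Illustration 45: assumed mean method.
sizes = [3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5]
sd_45 = sd_step_deviation(sizes, [3, 7, 22, 60, 85, 32, 8], assumed=6.5)

# Illustration 46: step deviation method with i = 5.
salaries = [45, 50, 55, 60, 65, 70, 75, 80]
sd_46 = sd_step_deviation(salaries, [3, 5, 8, 7, 9, 7, 4, 7], assumed=60, step=5)

print(round(sd_45, 3), round(sd_46, 2))
```

Dividing by i only makes the hand arithmetic smaller; multiplying back by i at the end restores the original units.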

Calculation of Standard Deviation in a Continuous Series

$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} \times i$$
where d = (m - A) / i, m = mid-value of the class and i = class interval.

Illustration: 47
Calculate mean & S.D of following frequency distribution of marks:
Marks No. of students
0-10 5
10-20 12
20-30 30
30-40 45
40-50 50
50-60 37
60-70 21
Ans: Mean = 40.9, S.D = 14.839.

Illustration: 48

Find the S.D from the following data:


Age under No. of persons dying
10 15
20 30
30 53
40 75
50 100
60 110
70 115
80 125
Ans: S.D = 19.76.
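
Illustrations 47 and 48 can be checked with the sketch below. It works from the class mid-values and the actual mean, which gives the same result as the step deviation formula. For Illustration 48 it assumes that "age under 10, 20, ..." describes classes 0-10, 10-20, and so on, and first converts the cumulative frequencies into simple frequencies. All names are illustrative.

```python
from math import sqrt


def grouped_mean_sd(lower_limits, width, freqs):
    """Mean and standard deviation of a continuous series via class mid-values."""
    mids = [lower + width / 2 for lower in lower_limits]
    n = sum(freqs)
    mean = sum(f * m for f, m in zip(freqs, mids)) / n
    variance = sum(f * (m - mean) ** 2 for f, m in zip(freqs, mids)) / n
    return mean, sqrt(variance)


def from_cumulative(cum_freqs):
    """Turn 'less than' cumulative frequencies into simple class frequencies."""
    return [c - p for c, p in zip(cum_freqs, [0] + cum_freqs[:-1])]


# Illustration 47: marks of 200 students.
mean_47, sd_47 = grouped_mean_sd(range(0, 70, 10), 10, [5, 12, 30, 45, 50, 37, 21])

# Illustration 48: cumulative ("age under") data.
deaths = from_cumulative([15, 30, 53, 75, 100, 110, 115, 125])
mean_48, sd_48 = grouped_mean_sd(range(0, 80, 10), 10, deaths)

print(round(mean_47, 1), round(sd_47, 3), round(sd_48, 2))
```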

COEFFICIENT OF VARIATION

$$\text{Coefficient of variation (C.V.)} = \frac{\sigma}{\bar{X}} \times 100$$
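
As a quick check of the formula, the sketch below applies it to the mean and standard deviation obtained in Illustration 47 (40.9 and 14.839). The function name is illustrative.

```python
def coefficient_of_variation(sd, mean):
    """C.V. = (sigma / mean) * 100, expressed as a percentage."""
    return sd / mean * 100


cv_47 = coefficient_of_variation(sd=14.839, mean=40.9)
print(round(cv_47, 2))
```

Because C.V. is a pure percentage, it can be used to compare series measured in different units; the series with the smaller C.V. is the more consistent one.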

Year 2014
One hundred customers from a particular branch were asked to state the time they
generally take to withdraw cash from their accounts. The data is given below

Minutes             0-10   10-20   20-30   30-40
No. of customers     20      50      20      10

Calculate Mean deviation and Standard deviation.

Ans: Mean = 17, Mean deviation (from mean) = 6.8, S.D = 8.72

*****

CHAPTER 4 CORRELATION
Correlation is the relationship between two or more interrelated series of
variables. For example:
• Family income and expenditure on luxury items.
• Yield of a crop and quantity of fertilizer applied.
• Sales revenue and expenses incurred on advertising.
• Frequency of smoking and lung damage.
• Supply position and price of the commodity.
An increase in the price of a commodity reduces its demand, and vice versa.
Similarly, up to a certain age, an increase in age is associated with an increase in
the height of a child. Thus two variables are sometimes interdependent: price of
a commodity and its demand, rainfall and production, income and expenditure,
etc. Two variables are said to be correlated if a change in one variable is
accompanied by a corresponding change in the other variable.

Definition

When the relationship is of a quantitative nature, the appropriate statistical tool for
studying and measuring the relationship and expressing it in a brief formula is
known as correlation.

Kinds of Correlation

Correlation is classified in different ways. The most important ways
of classifying it are:
(i) Positive and Negative Correlation.
(ii) Linear and Non-linear or Curvilinear Correlation.
(iii) Simple, Partial and Multiple Correlation.

1. Positive and negative correlation: If the changes in two variables are in the
same direction, i.e. increase in one variable is associated with the corresponding
increase in other variable, the correlation is said to be positive.
On the other hand, if variations or fluctuations in two variables are in opposite
direction or in other words an increase in one variable is associated with the
corresponding decrease in other or vice-versa, the correlation is said to be negative.

Positive correlation               Negative correlation
(increase in both variables)       (increase in one, decrease in the other)

Price (₹)   Supply (units)         Price (₹)   Demand (units)
   10           1400                   8           1760
   11           2000                   9           1680
   12           2600                  10           1490
   13           3000                  11           1300
   14           3600                  12           1190
   15           4100                  13           1000

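The graphical presentation of positive and negative correlation may be as under:

[Figure: two graphs of y against x. Positive correlation: a line rising from P to P1
as x moves from Q to Q1. Negative correlation: a line falling from P1 to P as x
moves from Q1 to Q.]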

2. Linear and non-linear (curvilinear) correlation: The distinction between
linear and non-linear correlation is based upon the consistency of the ratio of
change between two variables. If the amount of change in one variable tends to
bear a constant ratio to the amount of change in the other variable, the correlation
is said to be linear.
Example:
x 110 210 310 410 510
y 400 600 800 1000 1200
On the other hand, correlation would be known as curvilinear (non-linear) if the
amount of change in one variable does not bear a constant ratio to the amount of
change in the other variable.

Example:
x 28 29 30 40 50 58 59 60
y 80 130 170 150 230 560 460 600
Thus both linear and non-linear correlation may be positive or negative. This is
clear from the following chart:

Correlation
   Linear:      Positive, Negative
   Non-linear:  Positive, Negative

Thus, it is clear from the above discussion that:


(i) If changes in two series of variables are in the same direction and having a
constant ratio, the correlation is linear positive.
(ii) If changes in two groups of variables are in opposite direction in a constant
ratio, the correlation will be known as linear negative.
(iii) If changes in two groups of variables are in the same direction but not in a
constant ratio, the correlation is positive non-linear.
(iv) If changes in two groups of variables are in opposite direction and not in
constant ratio, the correlation is negative curvilinear or non-linear.

The following diagrams illustrate the different types of correlation:

[Figure: four graphs of y against x, titled Linear Positive Correlation, Linear
Negative Correlation, Non-linear Positive Correlation and Non-linear Negative
Correlation.]

Degree of correlation

(1) Perfect correlation: If the variations in two variables are in a constant ratio,
the correlation is said to be perfect. If the variations in two variables are in a constant
ratio in the same direction, the correlation is perfect positive. On the other hand, if
the variations in two variables are in a constant ratio but in opposite
directions, the correlation is perfect negative.

The following diagram illustrates perfect positive and negative correlations:

[Figure: two graphs of y against x, titled Perfect negative correlation and Perfect
positive correlation.]

(2) Absence of correlation: If the variations in two groups of variables do not
correspond to each other, it is a case of absence of correlation. The correlation
may be zero in such a case. It is illustrated with the help of the following diagrams:

[Figure: two scatter diagrams of y against x with no discernible pattern, titled No
Correlation and Correlation (r = 0).]

(3) Limited degree of correlation: When there is neither perfect correlation nor
absence of correlation between the two series of variables, the correlation is
said to be of limited degree. Generally, in socio-economic studies, a limited
degree of correlation exists. In such cases the numerical value of the coefficient of
correlation is more than zero but less than one. Such correlation may be positive
as well as negative.

Degree of correlation at a glance

Degree      Positive           Negative
Perfect     +1                 -1
High        +0.75 to +1        -0.75 to -1
Moderate    +0.25 to +0.75     -0.25 to -0.75
Low         0 to +0.25         0 to -0.25
Absence     Zero (0)           Zero (0)

Karl Pearson’s coefficient of correlation

Direct method:

$$r = \frac{\sum dx\,dy}{N\,\sigma_x\,\sigma_y}$$

where $dx = (X - \bar{X})$, $dy = (Y - \bar{Y})$, and
$\sum dx\,dy$ = sum of the products of corresponding deviations of the X and Y variables.

Steps for calculation:

(1) Find out the means of the X and Y variables, i.e. $\bar{X}$ and $\bar{Y}$.
(2) Take deviations of the X variable from its actual mean, i.e. $dx = (X - \bar{X})$.
(3) Take deviations of the Y variable from its actual mean, i.e. $dy = (Y - \bar{Y})$.
(4) Multiply the corresponding deviations of the X and Y variables with each other and
total the products, i.e. $\sum dx\,dy$.
(5) Calculate the squares of the deviations of the X and Y variables separately and total
them, i.e. $\sum dx^2$ and $\sum dy^2$.
(6) Ascertain the standard deviation of both the variables with the help of the
following formulae:
$$\sigma_x = \sqrt{\frac{\sum dx^2}{N}} \quad \text{and} \quad \sigma_y = \sqrt{\frac{\sum dy^2}{N}}$$
(7) Now the coefficient of correlation is obtained by using the following formula:
$$r = \frac{\sum dx\,dy}{N\,\sigma_x\,\sigma_y}$$

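Simplification of the direct formula (direct method)

(1) Start from $r = \dfrac{\sum dx\,dy}{N\,\sigma_x\,\sigma_y}$ and put the formulae
of the standard deviations into the formula of r.

(2) The N terms cancel, giving:
$$r = \frac{\sum dx\,dy}{\sqrt{\sum dx^2 \cdot \sum dy^2}}$$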

Solve the following:

Q1 Calculate the coefficient of correlation between age of husband and age of wife
from the following data:

Age of wife     17 20 22 27 21 29 26 30 28 30
Age of husband  22 27 28 28 29 30 31 34 25 36
Ans: r =0.71
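
The direct method, in its simplified form, can be sketched as below and checked against Q1 (the data of Q2 that follows work the same way). The function name is illustrative.

```python
from math import sqrt


def pearson_r(xs, ys):
    """r = sum(dx*dy) / sqrt(sum(dx^2) * sum(dy^2)), deviations from actual means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    return sum_dxdy / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


wife = [17, 20, 22, 27, 21, 29, 26, 30, 28, 30]
husband = [22, 27, 28, 28, 29, 30, 31, 34, 25, 36]
r_q1 = pearson_r(wife, husband)
print(round(r_q1, 2))
```

From Python 3.10 onwards `statistics.correlation` computes the same coefficient; the explicit version above simply mirrors the steps listed in these notes.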

Q2 Find out the correlation between the height of father and height of son from the
following data:

Height of father (inches)  65 66 67 65 68 69 71 73
Height of son (inches)     67 68 66 68 72 70 71 70
Ans: r = 0.62

Q3 Calculate the coefficient of correlation from the following data:

Given                                              X series   Y series
No. of items                                          15         15
Mean                                                  25         18
Sum of squares of deviations from their
respective means                                     136        138
Sum of products of deviations of X and Y series
from their respective means                              122
Ans: r = 0.89

Q4 From the following data relating to X, Y and Z series, determine which two
series are the most closely correlated.

Given                                      X series   Y series   Z series
No. of items                                  20         20         20
Arithmetic mean                               10         15         20
Sum of squares of deviations from
the arithmetic mean                          320        500        720

Sum of products of deviations of X and Y
series from their respective means = 360
Sum of products of deviations of X and Z
series from their respective means = 408
Sum of products of deviations of Y and Z
series from their respective means = 564

Ans:
Coefficient of correlation between X and Y series: r = 0.90
Coefficient of correlation between X and Z series: r = 0.85
Coefficient of correlation between Y and Z series: r = 0.94
The Y and Z series are the most closely correlated.

Short cut method

Formula:
(1)
$$r = \frac{\sum dx\,dy - N(\bar{X} - A_x)(\bar{Y} - A_y)}{N\,\sigma_x\,\sigma_y}$$
where
$$\bar{X} = A_x + \frac{\sum dx}{N}, \qquad \bar{Y} = A_y + \frac{\sum dy}{N}$$
$$\sigma_x = \sqrt{\frac{\sum dx^2}{N} - \left(\frac{\sum dx}{N}\right)^2}, \qquad \sigma_y = \sqrt{\frac{\sum dy^2}{N} - \left(\frac{\sum dy}{N}\right)^2}$$

(2)
$$r = \frac{N\sum dx\,dy - \sum dx \cdot \sum dy}{\sqrt{N\sum dx^2 - (\sum dx)^2}\;\sqrt{N\sum dy^2 - (\sum dy)^2}}$$

Here $dx = X - A_x$ and $dy = Y - A_y$ are deviations from the assumed means, so that
$\sum dx$ = sum of deviations from the assumed mean of the X variable, and
$\sum dy$ = sum of deviations from the assumed mean of the Y variable.

Solve the following:

Q1 Compute Karl Pearson's coefficient of correlation between agricultural
production and industrial production from the following data of index numbers of
the two variables:
Index no. of agricultural production  98 102 114 117 117 124 115 132 127 135
Index no. of industrial production    112 113 117 129 139 151 153 157 175 194
Ans: r = 0.88

Q2 Calculate the coefficient of correlation between weight and income from the
following data. What are your conclusions?
Weight (kg) 120 130 140 150 160 170
Income (₹) 100 200 300 400 500 600
Ans: r = 1

Q3 Calculate the coefficient of correlation between X and Y series from the following
data (hint: use the first formula):

Given                     X series   Y series
No. of pairs of items       100        100
Standard deviation            9         15
Arithmetic mean              30         40
Assumed mean                 25         42
Summation of products of deviations of X and Y series from their respective
assumed means = 9260.
Ans: r = +0.76

Q4 Calculate Karl Pearson's coefficient of correlation from the following data:

Month       Jan Feb Mar Apr May June July Aug Sep Oct
Price of A  35  36  40  38  37  39   41   40  36  38
Price of B  65  72  78  77  76  77   80   79  76  75
Use 38 as assumed mean for A and 75 for B.
Ans: r = +0.827
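
The second short-cut formula can be sketched as follows and checked against Q4, using the assumed means 38 and 75 given in the question. The function name is illustrative.

```python
from math import sqrt


def pearson_r_assumed_mean(xs, ys, assumed_x, assumed_y):
    """Short-cut method: deviations dx, dy are taken from assumed means."""
    n = len(xs)
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    numerator = n * sum_dxdy - sum(dx) * sum(dy)
    denom_x = sqrt(n * sum(a * a for a in dx) - sum(dx) ** 2)
    denom_y = sqrt(n * sum(b * b for b in dy) - sum(dy) ** 2)
    return numerator / (denom_x * denom_y)


price_a = [35, 36, 40, 38, 37, 39, 41, 40, 36, 38]
price_b = [65, 72, 78, 77, 76, 77, 80, 79, 76, 75]
r_q4 = pearson_r_assumed_mean(price_a, price_b, assumed_x=38, assumed_y=75)
print(round(r_q4, 3))
```

The choice of assumed means does not affect r; convenient round values merely keep the deviations small for hand calculation.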

Q5 The following table gives the value of exports of raw cotton from India to U.S.A
and the value of imports of manufactured cotton goods into India from U.S.A
(in crores of ₹):
Year Exports Imports
1997-98 42 56
1998-99 44 49
1999-00 58 53
2000-01 55 58
2001-02 89 65
2002-03 98 76
2003-04 66 58
Calculate the coefficient of correlation between the value of the exports of raw
cotton and the value of imports of cotton-manufactured goods.
Ans: r = +0.9042

Product Moment Method of Correlation

Formula:

$$r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{N\sum X^2 - (\sum X)^2}\;\sqrt{N\sum Y^2 - (\sum Y)^2}}$$

Q1 From the following data calculate the coefficient of correlation between X and Y
series by the square of values method:
X series 7 8 10 11 9 5 6 2 3 6
Y series 10 12 8 2 4 3 5 2 4 7

Ans: r = +0.2794

Coefficient of Correlation in Grouped Series/Data

Formula:
$$r = \frac{N\sum f\,dx\,dy - \sum f\,dx \cdot \sum f\,dy}{\sqrt{N\sum f\,dx^2 - (\sum f\,dx)^2}\;\sqrt{N\sum f\,dy^2 - (\sum f\,dy)^2}}$$

Q1 The following table gives the frequency distribution of 45 clerks in a
business office according to age and pay. Find the correlation, if any, between age
and pay.

                          Pay in ₹
Age (years)   60-70   70-80   80-90   90-100   100-110   Total
20-30           4       3       1       -         -         8
30-40           2       5       2       1         -        10
40-50           1       2       3       2         1         9
50-60           -       1       3       5         2        11
60-70           -       -       1       1         5         7
Total           7      11      10       9         8        45
Ans: 0.75
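
As a cross-check of the grouped-data formula, here is a short Python sketch using the age and pay table of Q1. It assumes that a "-" in the table means a frequency of zero and that step deviations are taken from the middle class of each variable; the helper name `grouped_r` is illustrative only.

```python
from math import sqrt

def grouped_r(table, dx, dy):
    """Correlation for a bivariate frequency table.

    table[i][j] is the frequency for row class i and column class j;
    dx[i] and dy[j] are the step deviations of the row and column classes.
    """
    n = sum(sum(row) for row in table)
    s_fdx = sum(dx[i] * f for i, row in enumerate(table) for f in row)
    s_fdy = sum(dy[j] * f for row in table for j, f in enumerate(row))
    s_fdx2 = sum(dx[i] ** 2 * f for i, row in enumerate(table) for f in row)
    s_fdy2 = sum(dy[j] ** 2 * f for row in table for j, f in enumerate(row))
    s_fdxdy = sum(dx[i] * dy[j] * f
                  for i, row in enumerate(table) for j, f in enumerate(row))
    numerator = n * s_fdxdy - s_fdx * s_fdy
    denominator = sqrt(n * s_fdx2 - s_fdx ** 2) * sqrt(n * s_fdy2 - s_fdy ** 2)
    return numerator / denominator

# Rows: age classes 20-30 ... 60-70; columns: pay classes 60-70 ... 100-110
clerks = [
    [4, 3, 1, 0, 0],
    [2, 5, 2, 1, 0],
    [1, 2, 3, 2, 1],
    [0, 1, 3, 5, 2],
    [0, 0, 1, 1, 5],
]
steps = [-2, -1, 0, 1, 2]  # step deviations from the middle class

r_grouped = grouped_r(clerks, steps, steps)
print(round(r_grouped, 2))
```

Because r does not change when the deviations are divided by the class width, step deviations give the same result as deviations measured in years and rupees.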

Q2 Calculate the coefficient of correlation between the ages of husbands and the ages of wives from the following data:
                        Ages of wives (X)
Ages of husbands (Y)   10-20 20-30 30-40 40-50 50-60 Total
15-25 6 3 - - - 9

25-35 3 16 10 - - 29
35-45 - 10 15 7 - 32
45-55 - - 7 10 4 21
55-65 - - - 4 5 9
Total 9 29 32 21 9 100
Ans: +0.802

Q3 In a survey of 100 school teachers of a city, the following data were obtained regarding their income and savings. Calculate the correlation between income and savings.
                 Savings (in ₹)
Income (in ₹)   50 100 150 200 Total
400 8 4 - - 12
600 - 12 24 6 42
800 - 9 7 2 18
1000 - - 10 5 15
1200 - - 9 4 13
Total 8 25 50 17 100
Ans: r = .5237

Q4 Find out correlation coefficient between height and weight of children from the
following bivariate frequency distribution table:
                      Height (in inches)
Weight (in pounds)   40-44 44-48 48-52 52-56 56-60 60-64 Total
35-55 4 40 60 - - - 104
55-75 - - 24 88 12 - 124
75-95 - - - 8 32 8 48
95-115 - - - - 4 8 12
115-135 - - - 4 - - 04
135-155 - - - - 4 4 08
Total 4 40 84 100 52 20 300
Ans: r = 0.7750

Spearman’s Rank Difference Method

$$r_R = 1 - \frac{6\sum D^2}{N^3 - N}$$


Where rR = rank correlation coefficient
ΣD² = total of the squares of the rank differences
N = number of pairs of observations
When two or more items have equal values, each of them is given the average of the ranks they would otherwise occupy, and the correction factor (m³ – m)/12 is added to the value of ΣD². Here m means the number of times an item is repeated. The correction factor is to be added once for each repeated value:
$$r_R = 1 - \frac{6\left[\sum D^2 + \frac{1}{12}(m^3 - m) + \frac{1}{12}(m^3 - m) + \cdots\right]}{N^3 - N}$$
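
The rank difference method, including the correction for repeated values, can be sketched in Python as below. This is an illustrative sketch rather than the notes' own working: the function names are made up here, ranks are given from the highest value, and the data are the series of Q1 (no ties) and Q2 (with ties) that follow.

```python
from collections import Counter

def average_ranks(values):
    """Rank 1 = highest value; tied values share the average of their ranks."""
    ordered = sorted(values, reverse=True)
    first_position = {}
    for position, value in enumerate(ordered, start=1):
        first_position.setdefault(value, position)
    counts = Counter(values)
    return [first_position[v] + (counts[v] - 1) / 2 for v in values]

def tie_correction(values):
    """Sum of (m^3 - m)/12 over every value repeated m > 1 times."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)

def spearman_r(xs, ys):
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    sum_d2 += tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * sum_d2 / (n ** 3 - n)

# Q1: no repeated values
a_series = [70, 60, 50, 30, 40, 55, 63, 79, 80, 72]
b_series = [10.0, 10.6, 12.0, 9.0, 9.2, 9.5, 9.7, 11.0, 12.4, 10.2]
r_q1 = spearman_r(a_series, b_series)

# Q2: 95 occurs three times in X and 68 occurs twice in Y
x_series = [112, 106, 109, 84, 95, 95, 117, 97, 95, 115]
y_series = [70, 68, 80, 65, 71, 60, 77, 68, 63, 75]
r_q2 = spearman_r(x_series, y_series)

print(round(r_q1, 2), round(r_q2, 2))
```

Ranking from the lowest value instead of the highest reverses every rank but leaves each D² unchanged, so both ways of ranking give the same coefficient.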

Q1 Calculate the coefficient of correlation by the ranking method, if (i) ranks are given from the highest value, and (ii) ranks are given from the lowest value.
A series 70 60 50 30 40 55 63 79 80 72
B series 10.0 10.6 12.0 9.0 9.2 9.5 9.7 11.0 12.4 10.2
Ans: r(R) = +0.67

Q2 From the following data find the coefficient of rank correlation:
X series 112 106 109 84 95 95 117 97 95 115
Y series 70 68 80 65 71 60 77 68 63 75
Ans: r(R) = +0.73

Q3 Calculate the coefficient of correlation from the following data by the method
of rank differences: Assignment.
X 75 88 95 70 60 80 81 50
Y 120 134 150 115 110 140 142 100
Ans: rr = 0.93

Q4 The competitors in a beauty contest are ranked by two judges in the following
order:
Assignment.
1st judge 1 6 5 10 3 2 4 9 7 8
2nd judge 2 8 4 1 6 9 5 3 7 10
Ans: rr = -0.12

Q5 Calculate the coefficient of rank correlation from the following data:


Assignment.


X 48 33 40 9 16 16 65 24 16 57
Y 13 13 24 6 16 1 20 9 6 9
Ans: rr = 0.59

Last year question

Year 2014
Calculate the coefficient of correlation from the following data:

Fertilizer used 15 18 20 24 30 35 40 50
Yield (in tonnes) 85 93 95 105 120 130 150 160
Ans: r = .98

Year 2012
Calculate the coefficient of correlation between the corresponding value of x and y
in the following table:

X 2 4 5 6 8 11
Y 18 12 10 8 7 5
Ans: r = -.92

Year 2011
The competitors in a beauty contest are ranked by three judges in the following
order:

First judge 1 6 5 10 3 2 4 9 7 8
Second judge 3 5 8 4 7 10 2 1 6 9
Third judge 6 4 9 8 1 2 3 10 5 7
Use rank correlation to discuss which pair of judges has the nearest approach.
Ans: Judges 1 and 2: r = -.21; judges 2 and 3: r = -.29; judges 1 and 3: r = .64
Since the coefficient is positive and highest for judges 1 and 3, this pair has the nearest approach.

Year 2009
From the following data calculate coefficient of correlation between X and Y
series by square of values method:

X 7 8 10 11 9 5 6 2 3 6
Y 10 12 8 2 4 3 5 2 4 7


Ans: r = .2794

Year 2003
Calculate the coefficient of correlation from the following data:

X 100 200 300 400 500 600 700


Y 30 50 60 80 100 110 130

Ans: r = .9972

*****


CHAPTER 5 REGRESSION ANALYSIS


Regression analysis is the statistical technique that expresses the relationship between two or more variables as an algebraic equation, so that the value of one random variable can be estimated from given values of the others. The variable whose value is estimated using the equation is called the dependent (or response) variable, and the variable whose values are used as the basis for the estimate is called the independent (or predictor) variable. A linear algebraic equation that expresses the dependent variable in terms of the independent variable is called a linear regression equation.
Two correlated variables X and Y can each be expressed in terms of the other by a straight-line equation; these equations are called regression equations. Each line is chosen to give the best fit to the sample data. In general, a bivariate distribution has two regression lines, written as:
The regression equation of Y on X
Y = a + bX
is used for estimating the value of Y for given values of X.
The regression equation of X on Y
X = c + dY
is used for estimating the value of X for given values of Y.

Importance, Uses and Functions of Regression Analysis

Regression analysis is highly useful in almost all sciences, natural and social. Following are some of the important uses or functions of regression analysis:

1. Forecasting - Regression analysis gives an objective and scientific estimate of the values of the dependent variable based on the corresponding values of the independent variable. It establishes a functional relationship between two or more variables. Once the relationship is established and the regression equations are obtained, they can be used for various advanced analytical purposes.

2. Utility in Economics & Business Areas - Regression analysis is a highly useful tool in economic and business research, since it is based on a cause and effect relationship.


3. Indispensable for Good Planning - Regression analysis is an important tool for estimating future production, sales, prices, investments, incomes, population etc., which are indispensable for efficient planning of an economy and are of paramount importance for trade and commerce.

4. Useful for Statistical Estimation - Regression analysis is useful in the statistical estimation of demand curves, supply curves, cost functions, production functions etc.

5. Study of more than two variables possible - Regression analysis is useful not only for two variables but also for three or more.
If the regression analysis is confined to the study of only two variables at a time, it is simple regression, whereas if there are more than two variables, it is multiple regression. Only simple regression, which is based on two variables, is discussed here.

Difference between Correlation and Regression Analysis


1. Degree and Nature of Relationship: The coefficient of correlation measures the degree of covariance between two variables, whereas regression analysis describes the 'nature of relationship' between the variables, so that one is able to estimate or predict the value of one variable on the basis of another.

2. Cause and Effect Relationship: Correlation merely ascertains the degree of relationship between two variables, and therefore one cannot say that one variable is the cause and the other the effect. In regression analysis, one variable is taken as the dependent variable and the other as the independent variable, thus making it possible to study the cause and effect relationship.

3. The value of rxy in the calculation of the coefficient of correlation measures the direction and degree of relationship between two variables X and Y. The values of rxy and ryx are symmetric (i.e. rxy = ryx), which shows that it is immaterial which of X and Y is the dependent variable and which is the independent variable. In regression analysis, however, the regression coefficients bxy and byx are not symmetric (i.e. bxy ≠ byx in general), and therefore it certainly makes a difference which variable is dependent and which one is independent.

4. In the case of correlation, there may be non-sense correlation between two variables X and Y which is merely due to chance and has no practical relevance, such as an increase in income and an increase in environmental temperature. However, there cannot be a non-sense regression.

5. The value of the coefficient of correlation is independent of change of scale and change of origin. Regression coefficients, however, are independent of change of origin but not of scale.

6. While pointing out the difference between regression and correlation, Werner Z. Hirsch rightly stated that, "While correlation analysis tests the closeness with which two (or more) phenomena co-vary, regression analysis measures the nature and extent of the relation, thus enabling us to make predictions."

Methods of regression
There are two methods for determining the equation of a regression line:
(i) Least square method
(ii) Mean based method

(i) Least square method: By using calculus it can be shown that the values of the parameters a and b that satisfy the least square requirement are obtained by solving the following two simultaneous linear equations, called normal equations:

Y on X
$$\sum Y = na + b\sum X$$
$$\sum XY = a\sum X + b\sum X^2$$
By substituting the values of ΣY, n, ΣX, ΣXY and ΣX² (obtained from the given data) in the above two equations and then solving, the values of a and b can be obtained.

X on Y
$$\sum X = nc + d\sum Y$$
$$\sum XY = c\sum Y + d\sum Y^2$$

Example: 1
Obtain the two regression lines with the help of the following data:
X 1 3 4 6 8 9 11 14
Y 1 2 4 4 5 7 8 9
Ans: Regression of Y on X is: Y = 0.545 + 0.636X
Regression of X on Y is: X = -0.5 + 1.5Y
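
The normal equations can be solved mechanically. The sketch below (Python, with the illustrative function name `fit_line`) solves them in closed form for the data of Example 1; calling it with the two series swapped gives the regression of X on Y.

```python
def fit_line(xs, ys):
    """Solve the two normal equations for the line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Eliminating a from the normal equations gives the slope b directly
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # The first normal equation, sum_y = n*a + b*sum_x, then gives a
    a = (sum_y - b * sum_x) / n
    return a, b

X = [1, 3, 4, 6, 8, 9, 11, 14]
Y = [1, 2, 4, 4, 5, 7, 8, 9]

a, b = fit_line(X, Y)  # regression of Y on X: Y = a + bX
c, d = fit_line(Y, X)  # regression of X on Y: X = c + dY
print(round(a, 3), round(b, 3), round(c, 3), round(d, 3))
```

For this data ΣX = 56, ΣY = 40, ΣXY = 364, ΣX² = 524 and ΣY² = 256, so the slopes are 672/1056 ≈ 0.636 and 672/448 = 1.5.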


(ii) Method based on mean: Equations of the two regression lines based on the means are as follows:
X on Y: $X - \bar{X} = b_{xy}(Y - \bar{Y})$
Y on X: $Y - \bar{Y} = b_{yx}(X - \bar{X})$

(1) When deviations are taken from actual mean:
$$b_{xy} = \frac{\sum xy}{\sum y^2}; \quad b_{yx} = \frac{\sum xy}{\sum x^2}$$

(2) When deviations are taken from assumed mean:
$$b_{xy} = \frac{n\sum d_x d_y - (\sum d_x)(\sum d_y)}{n\sum d_y^2 - (\sum d_y)^2}$$
$$b_{yx} = \frac{n\sum d_x d_y - (\sum d_x)(\sum d_y)}{n\sum d_x^2 - (\sum d_x)^2}$$

(3) When standard deviation & coefficient of correlation are given:
$$b_{xy} = r\frac{\sigma_x}{\sigma_y}; \quad b_{yx} = r\frac{\sigma_y}{\sigma_x}$$

Where $x = X - \bar{X}$, $y = Y - \bar{Y}$
$\sigma_x$ = S.D of X - series
$\sigma_y$ = S.D of Y - series

Illustration: 1
From the following data obtain the two regression equations:
X 6 2 10 4 8
Y 9 11 5 8 7
Ans: Regression of Y on X is: Y = 11.9 – 0.65X
Regression of X on Y is: X = 16.4 – 1.3Y


Illustration: 2
From the following data obtain the two regression equations taking deviation of X
from 3 and deviation of Y from 6
X 1 2 3 4 5
Y 3 4 6 9 10
Ans: Regression of Y on X is: Y = 1.9X + .7
Regression of X on Y is: X = .51Y - .26

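Regression equations when r, σx, σy are given:

X on Y: $X - \bar{X} = r\frac{\sigma_x}{\sigma_y}(Y - \bar{Y})$
Y on X: $Y - \bar{Y} = r\frac{\sigma_y}{\sigma_x}(X - \bar{X})$
If deviations in both series are taken from actual means, the value of $r\frac{\sigma_y}{\sigma_x}$ is equal to $\frac{\sum xy}{\sum x^2}$ and that of $r\frac{\sigma_x}{\sigma_y}$ is equal to $\frac{\sum xy}{\sum y^2}$.

$$r = \pm\sqrt{b_{xy} \times b_{yx}}$$
where r takes the same sign as the two regression coefficients.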

Illustration: 3
The following information about advertisement expenditure & sales is available:
         Advertisement exp (X) (₹ crores)   Sales (Y) (₹ crores)
Mean                  20                          120
S.D                    5                           25
Correlation coefficient = 0.8
Calculate the two regression equations.
Ans: Regression of X on Y is: X = .16Y + 0.8
Regression of Y on X is: Y = 4X + 40
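
When only the means, standard deviations and r are known, both lines follow directly from bxy = r σx/σy and byx = r σy/σx. A minimal Python sketch with the figures of Illustration 3 is given below; the function name and the (intercept, slope) return format are choices made here for illustration.

```python
def regression_lines(mean_x, mean_y, sd_x, sd_y, r):
    """Return (intercept, slope) for X on Y and for Y on X."""
    b_xy = r * sd_x / sd_y  # regression coefficient of X on Y
    b_yx = r * sd_y / sd_x  # regression coefficient of Y on X
    x_on_y = (mean_x - b_xy * mean_y, b_xy)  # X = intercept + b_xy * Y
    y_on_x = (mean_y - b_yx * mean_x, b_yx)  # Y = intercept + b_yx * X
    return x_on_y, y_on_x

# Illustration 3: advertisement expenditure (X) and sales (Y)
x_on_y, y_on_x = regression_lines(mean_x=20, mean_y=120, sd_x=5, sd_y=25, r=0.8)
print(x_on_y, y_on_x)
```

The same function applies to Illustrations 4 and 5; only the five summary figures change.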

Illustration: 4
Estimate the yield when rainfall is 9 inches from the following data:


Mean S.D
Yield of wheat (kg per unit area) 10 8
Annual rainfall (inches) 8 2
Correlation coefficient r = .5
Ans: The yield of wheat is 12 kg when it rains 9 inches

Illustration: 5
The following data relate to the height (X) & weight (Y) of 100 business executives.
Mean height = 68 inches
S.D (height) = 2.5 inches
Mean weight = 150 lbs
S.D (weight) = 20 lbs
r = 0.6
Estimate from the above data:
(a) The height of an executive whose weight is 200 lbs,
(b) The weight of an executive whose height is 60 inches (5 ft).
Ans: Regression equations are: X = .075Y + 56.75 & Y = 4.8X – 176.4, so (a) X = 71.75 inches and (b) Y = 111.6 lbs

Illustration: 6
For certain X & Y series which are correlated, the two lines of regression are as
given below:
5X – 6Y + 90 = 0……(i)
15X – 8Y – 130 = 0…….(ii)
Find which is the regression of Y on X & which is X on Y. Find the means of the two series & the correlation coefficient.

Ans: Equation (i) is the regression of Y on X and equation (ii) is the regression of X on Y. Means: X̄ = 30 & Ȳ = 40. Coefficient of correlation: r = 0.667
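
The reasoning behind this type of question can be sketched in Python. The two lines intersect at the means, and the correct assignment of "Y on X" and "X on Y" is the one for which the product of the two regression coefficients does not exceed 1. The function name `analyse_lines` and the coefficient format (a, b, c) for aX + bY + c = 0 are assumptions made for this illustration.

```python
from math import sqrt, copysign

def analyse_lines(line1, line2):
    """Each line is (a, b, c) for aX + bY + c = 0.

    Returns (mean_x, mean_y, which_line_is_y_on_x, r).
    """
    (a1, b1, c1), (a2, b2, c2) = line1, line2
    # The regression lines intersect at (mean of X, mean of Y): Cramer's rule
    det = a1 * b2 - a2 * b1
    mean_x = (b1 * c2 - b2 * c1) / det
    mean_y = (a2 * c1 - a1 * c2) / det
    # Trial: line1 is Y on X (slope -a1/b1), line2 is X on Y (slope -b2/a2)
    b_yx, b_xy, y_on_x = -a1 / b1, -b2 / a2, 1
    if b_yx * b_xy > 1:  # r squared cannot exceed 1, so swap the roles
        b_yx, b_xy, y_on_x = -a2 / b2, -b1 / a1, 2
    r = copysign(sqrt(b_yx * b_xy), b_yx)  # r carries the sign of the coefficients
    return mean_x, mean_y, y_on_x, r

# Illustration 6: 5X - 6Y + 90 = 0 and 15X - 8Y - 130 = 0
mean_x, mean_y, y_on_x, r = analyse_lines((5, -6, 90), (15, -8, -130))
print(mean_x, mean_y, y_on_x, round(r, 3))
```

Here byx = 5/6 and bxy = 8/15, so r² = 40/90 and the first equation is the regression of Y on X.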

Illustration: 7
Find the value of ‘r’ if variance of X = 6.7, S.D of Y = 2.6 & regression equation
of X on Y is X = 0.95Y – 6.4
Ans: r = 0.95

Last year questions

Year 2014


(a) What do you understand by Multiple Regression?

(b) An investigation of the demand for TV sets in 5 towns has resulted in the following data:
Population (X) (in '000)    11 14 17 21 25
No. of sets demanded (Y)    15 27 34 38 46

Find a linear regression of Y on X and estimate the demand for TV sets for a population of 30,000.
Ans: X̄ = 17.6, Ȳ = 32, Y = 2.05X - 4.08; when X = 30, Y = 57.42

Year 2011
In order to study the productivity of workers in an industry, ten workers were selected at random and their scores on an aptitude test and their productivity indices were compiled:
Aptitude score (X) 60 62 65 70 72 48 53 73 65 82
Productivity index (Y) 68 60 62 80 85 40 52 62 60 81
From these details, estimate the productivity index for a worker whose test score is 75.
Ans: Y = 1.167X – 10.855; productivity index Y = 1.167 × 75 – 10.855 = 76.67

Year 2008
Table below gives the data relating to purchases and sales. Obtain the two
regression equations and estimate the likely sales when the purchases equal to 65.
Purchases 57 58 59 59 60 61 62 64
Sales 77 78 75 78 82 82 79 81
Ans: Y = .66x + 39.4, x = .545y + 16.95, sales = 82.3

Year 2006, 2013


In a partially destroyed laboratory record of an analysis of correlation data, only the following results are legible: variance of X = 9, regression equations 8X – 10Y + 66 = 0 & 40X – 18Y = 214. What are (i) the mean values of X & Y, (ii) the correlation coefficient between X & Y, and (iii) the standard deviation of Y?
Ans: X̄ = 13, Ȳ = 17, r = .6, σy = 4


Year 2005
If the regression lines are given by 3x + 2y = 26 & 6x + y = 31, find (i) the mean values of x & y; (ii) the coefficient of correlation between x & y; (iii) the value of y when x = 0 & the value of x when y = 13.
Ans: x̄ = 4, ȳ = 7, r = -1/2 (both regression coefficients are negative), y = 13, x = 3
Year 2004
For some bivariate data, the following results were obtained. The mean value of X
is 53.2 & mean value of Y is 27.9. The regression coefficient of Y on X = -1.5 &
the regression coefficient of X on Y is – 0.2. Find the most probable value of Y
when X = 60. Also find out the value of coefficient of correlation between X & Y.
Ans: r = -.548 (both regression coefficients are negative), Y = -1.5X + 107.7, Y = 17.7

Year 2000
Calculate the trend values by the method of least squares from the data given
below & estimate the sales for the year 1991.
Year                          1983 1984 1985 1986 1987
Sales of T.V sets (in '000)     12   18   20   23   27
Ans: Sales for year 1991 = 41 (in '000)

*****


CHAPTER 7 PROBABILITY
The word ‘probability’ relates to the chance of an event happening or not happening. In daily life we often estimate such chances, e.g. ‘the probability that it will rain today’, ‘the probability of getting a particular number when a dice is thrown’, ‘the probability of getting a head or a tail on tossing a coin’ etc.

PERMUTATIONS (Arrangement)
Permutation refers to the different arrangements of objects in a set where all elements are different and distinguishable.
Permutations of n different objects taken r at a time: Suppose we have n different objects and r spaces to be filled. The first space can be filled by any one of the n objects, i.e. in n different ways. The second space can be filled in (n - 1) ways. There are (n - 2) ways for the third space and so on. The final space is filled in n - (r - 1) = n – r + 1 ways after the first (r - 1) spaces have been filled.
$$^nP_r = n(n-1)(n-2)\cdots(n-r+1) = \frac{n!}{(n-r)!}$$
In this way there are two equivalent formulae for the number of permutations of n different objects taken r at a time.

Q1 How many different words can be formed using the letters J, A, I, P, U, R, taken (a) all at a time; and (b) three at a time?
Ans: (a) 720 words (b) 120 words

Combinations (selection)
Forming different groups out of different items is known as combination. From the combination point of view AB and BA are the same, but from the permutation point of view they are different: in a combination the order in which the items are placed is not important, while in a permutation it is.
$$^nC_r = \frac{n!}{(n-r)!\,r!}$$


Q1 In how many ways can a team of 11 players be formed out of 15 players?
Ans: 1365 ways
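
The permutation and combination answers above can be verified with Python's standard library. This short sketch assumes Python 3.8 or later, where `math.perm` and `math.comb` are available; the variable names are illustrative.

```python
from math import comb, factorial, perm

# Permutations Q1: words from the 6 letters J, A, I, P, U, R
all_at_a_time = perm(6, 6)    # 6!/(6-6)! = 720
three_at_a_time = perm(6, 3)  # 6!/(6-3)! = 6 x 5 x 4 = 120

# Combinations Q1: a team of 11 chosen from 15 players
teams = comb(15, 11)          # 15!/(4! x 11!) = 1365

# The factorial formulae give the same values
assert three_at_a_time == factorial(6) // factorial(6 - 3)
assert teams == factorial(15) // (factorial(15 - 11) * factorial(11))

print(all_at_a_time, three_at_a_time, teams)
```

Since order does not matter in a combination, choosing the 11 players who play is the same as choosing the 4 who sit out, so 15C11 = 15C4.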

Probability Theory
In general language the term probability is used in the sense of the chance of an event happening or not happening.

Probability = (Number of cases favorable to the event) / (Number of all possible cases)

Q1 Three coins are tossed simultaneously. What is the probability that they will fall 2 heads and 1 tail?
Ans: P = 3/8

Q2 A library received 20 books including 8 Hindi novels. If 2 of these books are selected at random, what is the probability that neither of them is a Hindi novel?
Ans: P = 0.347
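
Both answers follow from counting favorable and possible cases. The Python sketch below is only an illustration of that counting: it lists the 8 equally likely outcomes of three coins and uses combinations for the books.

```python
from fractions import Fraction
from itertools import product
from math import comb

# Q1: all equally likely outcomes of tossing three coins
outcomes = list(product("HT", repeat=3))
favourable = [o for o in outcomes if o.count("H") == 2]
p_two_heads = Fraction(len(favourable), len(outcomes))

# Q2: choose 2 of the 12 non-Hindi books out of all ways to choose 2 of 20
p_no_hindi = Fraction(comb(12, 2), comb(20, 2))

print(p_two_heads, p_no_hindi, round(float(p_no_hindi), 3))
```

The second probability is 66/190, which reduces to 33/95.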

Types of events

(i) Equally likely events: Events which have an equal chance of happening, for example getting a head or a tail on tossing a coin, or getting a one or a six in a throw of a dice.

(ii) Independent events: Events the happening of which does not affect the happening of other events. For example, getting a head on the first toss of a coin does not prevent getting a tail on the second toss.

(iii) Dependent events: If the happening of one event affects another event, the events are known as dependent events. For example, drawing a jack from a pack of cards has a probability of 4/52 or 1/13, but if the jack is not replaced in the pack, the probability of drawing a jack in the second draw is 3/51. The second event is affected by the first event, and such events are called dependent events.

(iv) Overlapping events: If a part of one event can occur together with a part of another event, the events are known as overlapping events. In general, these events are partially overlapping. For


example, drawing a diamond card or drawing an ace is an overlapping event, since the ace of diamonds is both a diamond card and an ace. In this situation the probability of drawing a diamond card or an ace is determined as follows:
Probability of drawing a diamond card
P (A) = 13/52
Probability of drawing an ace
P (B) = 4/52
Probability of drawing the ace of diamonds (overlapping event)
P (AB) = 1/52
Hence, the probability of drawing a diamond card or an ace
= 13/52 + 4/52 – 1/52 = 16/52 = 4/13

Probability Theorems
There are two important theorems of probability, namely:
1. Addition theorem 2. Multiplication theorem.

1. Addition theorem

Case I: When events are mutually exclusive


The addition theorem states that if two events A and B are mutually exclusive, the probability of the occurrence of either A or B is the sum of the individual probabilities of A and B. Symbolically
P (A ∪ B) = P (A) + P (B)
In other words
P (A or B) = P (A) + P (B)
This theorem can be extended to three or more mutually exclusive events. Thus
P (A or B or C) = P (A) + P (B) + P (C)

Q1 A bag contains 5 red, 2 black, 3 yellow and 4 green balls. What is the probability of getting a red or green ball at random in a single draw of one?
Ans: 9/14


Q2 A card is drawn at random from a pack of 52 playing cards. Find the probability that the card drawn is either a king or the ace of diamonds.
Ans: 5/52

Case II: When events are not mutually exclusive


When events are not mutually exclusive, i.e. it is possible for both events to occur together, the addition theorem must be modified. For example, consider drawing an ace or a diamond from a pack of cards. Here one card, the ace of diamonds, is common to both events. From the sum of the probabilities of drawing an ace and of drawing a diamond we must subtract the probability of drawing both together. Hence for finding the probability of one or more of two events that are not mutually exclusive we use the modified form of the addition theorem, which is as follows
P (A or B) = P (A) + P (B) – P (A and B)
In other words
P (A ∪ B) = P (A) + P (B) – P (A ∩ B)
In the case of three events,
P (A or B or C) = P (A) + P (B) + P (C) – P (AB) – P (AC) – P (BC) + P (ABC)
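
The modified addition theorem can be checked by direct counting. The Python sketch below builds a 52-card pack as a set and compares the counted probability of "ace or diamond" with the formula; it then applies the three-event form to the washer, drier and dishwasher counts of Q2 in this section. The helper names are illustrative only.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = set(product(ranks, suits))  # 52 distinct (rank, suit) cards

aces = {card for card in deck if card[0] == "A"}
diamonds = {card for card in deck if card[1] == "diamonds"}

def prob(event):
    """Classical probability: favorable cards / all cards."""
    return Fraction(len(event), len(deck))

p_counted = prob(aces | diamonds)                            # count the union
p_formula = prob(aces) + prob(diamonds) - prob(aces & diamonds)

def union_of_three(a, b, c, ab, ac, bc, abc):
    """Three-event addition theorem, applied to counts or probabilities."""
    return a + b + c - ab - ac - bc + abc

# Washers 110, driers 50, dishwashers 60; pairs 40, 35, 25; all three 20
with_any_machine = union_of_three(110, 50, 60, 40, 35, 25, 20)

print(p_counted, p_formula, with_any_machine)
```

Counting the union and applying the formula must agree, because the formula only removes the cards that would otherwise be counted twice.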

Q1 In a group of 200 dry cleaners, 70 have washing machines, 50 have cloth driers and 30 have both. Find the probability that a given dry cleaner has a washer or a drier.
Ans: 90/200

Q2 A group of 200 dry cleaners has the following distribution of washers, driers, and dishwashers.
Washers       110    Washers and driers         40
Driers         50    Dishwashers and driers     25
Dishwashers    60    Washers and dishwashers    35
All three      20
Find the probability that a dry cleaner has a washer or drier or dishwasher.
Ans: 140/200

Q3 In a city three daily newspapers X, Y and Z are published. 40% of the people of the city read X, 50% read Y, 30% read Z, 20% read both X and Y, 15% read X and Z,


10% read Y and Z and 24% read all the three papers. Calculate the percentage of
people who do not read any of the three papers.
Ans: 1%

Q4 (i) What is the probability of drawing a spade or a king from a pack of cards?
(ii) Twenty balls are serially numbered and placed in a bag. Find the chance that the first ball drawn is a multiple of 3 or 5.
(iii) A number is chosen at random from the numbers 1 to 50. What is the probability that the number chosen is a multiple of either 2 or 10?
Ans: (i) 16/52 (ii) 9/20 (iii) 1/2

Q5 If a pair of dice is thrown, what is the probability that the sum of the digits is neither 7 nor 11?
Ans: P = 7/9

Q6 What is the probability of getting a total of at least 9 in a single throw of two dice?
Ans: 10/36

Q7 A bag contains 20 balls marked 1 to 20. One ball is drawn at random. What is the probability that it is marked with a number that is a multiple of 5 or 7?
Ans: 6/20

2. Multiplication Theorem or Multiplicative law of probability

Case I: When events are independent:


The multiplication theorem states that if two events A and B are independent,
the probability that they both will occur is equal to the product of their individual
probabilities, i.e.
p(A and B) = p(A) x p(B)
The theorem can be extended to three or more independent events.
Thus,
p(A,B and C) = p(A) x p(B) x p(C)


Q1 A bag contains 5 red, 6 black and 4 green balls. What is the probability of getting a red ball followed by a green ball in two successive draws of one ball each, assuming that a ball once drawn is replaced before the second one is drawn?
Ans: 20/225

Q2 An ordinary coin and a six-faced dice are tossed simultaneously. Find the probability that the coin falls with tail upward and the dice falls with the number 2 upward.
Ans: 1/12

Q3 A university has to appoint examiners to evaluate papers in statistics. Out of a panel of 40 examiners, 10 are women, 30 know Hindi and 5 are Ph.D.s. Find the probability of selecting a Hindi-knowing Ph.D. woman teacher to evaluate the papers.
Ans: 3/128

Q4 Five men in a company of 15 are smokers. Three men are chosen. Find the probability that –
(i) All the 3 are smokers.
(ii) None of the three is a smoker.
(iii) At least one is a smoker.
Ans: (i) 2/91 (ii) 24/91 (iii) 67/91

Q5 Three cards are drawn from a pack of cards. Find the probability that they are –
(i) A king, a queen and an ace.
(ii) 2 kings and an ace.
(iii) All spade cards.
(iv) All red cards.
(v) Two red cards and 1 black card.
Ans: (i) 16/5525 (ii) 6/5525 (iii) 11/850 (iv) 2/17 (v) 13/34

Q6 A bag contains 4 white and 6 red balls. Two draws of 3 balls are made. Find the probability that the first draw will give all three white balls and the second all three red balls, if the balls are replaced before the second draw.

Ans: 1/180

Case II: When events are dependent

Conditional Probability
Q1 From a pack of cards, three cards are drawn one by one. Find the probability that all three cards are black,
(i) if the card is not replaced before the next draw;
(ii) if the card is replaced before the next draw.
Ans: (i) 2/17 (ii) 1/8
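
The difference between dependent and independent draws in Q1 can be seen by multiplying the probabilities draw by draw. In this illustrative Python sketch (the function name `all_black` is made up here), the counts of black cards and of cards in the pack are reduced after each draw only when the card is not replaced.

```python
from fractions import Fraction

def all_black(draws, replace):
    """Probability that every card drawn is black (26 black cards in 52)."""
    black, total = 26, 52
    probability = Fraction(1)
    for _ in range(draws):
        probability *= Fraction(black, total)
        if not replace:
            # Dependent events: the pack has one black card fewer
            black -= 1
            total -= 1
    return probability

p_without_replacement = all_black(3, replace=False)  # 26/52 x 25/51 x 24/50
p_with_replacement = all_black(3, replace=True)      # (26/52) cubed

print(p_without_replacement, p_with_replacement)
```

Without replacement each draw changes the conditional probability of the next one; with replacement the three draws are independent and the simple multiplication theorem applies.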

Q2 A bag contains 5 white and 3 black balls. Two balls are drawn at random one
after the other without replacement. Find the probability that both balls drawn are
black.
Ans: 3/28

Q3 Find the probability that both cards are kings if 2 cards are drawn without replacement from a pack of cards.
Ans: 1/221

Q4 The probability that a contractor will get a contract for road construction is 4/9
and the probability that he will get contract for the construction of a water tank is
5/7. What is the probability of getting at least one contract?
Ans: 53/63

Q5 A salesman is known to sell a product in 3 out of 5 attempts while another salesman sells it in 2 out of 5 attempts. Find the probability that (i) no sale will be effected when they both try to sell the product and (ii) either of them will succeed in selling the product.
Ans: (i) 6/25 (ii) 19/25

Q6 There are 5 white and 8 red balls in a bag. Two draws of 3 balls (in each draw) are made such that
(a) The balls are replaced before the second draw
(b) The balls are not replaced before the second draw.
Find, in each case, the probability of getting 3 red balls in the first draw and 3 white balls in the second draw.
Ans: (a) 140/20449 (b) 7/429


Q7 Find the chance of drawing a king, a queen and a jack in that order from a pack
of cards in three consecutive draws, the cards drawn not being replaced.
Ans: 8/16575

Last year questions

Year 2014
(a) What is Bayes' Theorem? Explain the meaning of mutually exclusive events.

(b) A bag contains 6 red and 4 white balls. Another bag contains 3 red and 5 white balls. A fair dice is tossed for the selection of the bag. If the dice shows 1 or 2 the first bag is selected, otherwise the second bag is selected. A ball is drawn from the selected bag and found to be red. What is the probability that this ball comes from the first bag?
Ans: (2/6 x 6/10) / (2/6 x 6/10 + 4/6 x 3/8) = 4/9
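
The working in this answer is an application of Bayes' Theorem: the probability of "first bag and red" divided by the total probability of drawing a red ball. A small Python sketch with exact fractions is given below; the dictionary layout and names are choices made for this illustration.

```python
from fractions import Fraction

# Prior probability of selecting each bag (dice shows 1 or 2 -> first bag)
prior = {"first": Fraction(2, 6), "second": Fraction(4, 6)}
# Probability of drawing a red ball from each bag
red_given_bag = {"first": Fraction(6, 10), "second": Fraction(3, 8)}

# Total probability of drawing a red ball
p_red = sum(prior[bag] * red_given_bag[bag] for bag in prior)

# Bayes' Theorem: P(first bag | red)
p_first_given_red = prior["first"] * red_given_bag["first"] / p_red

print(p_red, p_first_given_red)
```

The total probability of a red ball is 1/5 + 1/4 = 9/20, so the posterior probability of the first bag is (1/5) / (9/20).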

Year 2012
(a) Answer the following:
(i) Probability of throwing exactly 7 with two dice? Ans: 6/36 or 1/6
(ii) The probability of drawing a 5 or a club? Ans: 16/52
(iii) The probability that the difference between the numbers showing when two
dice are rolled is 2? Ans: 8/36 or 2/9

(b) Two students A and B are given the same problem to solve. The odds in favour of A solving the problem are 4 to 6, while the odds against B solving the problem are 6 to 5. Both students try to solve the problem. Find the probability of the problem being solved.
Ans: P(A) = 4/10, P(B) = 5/11; the problem is solved unless both fail, so P = 1 – (6/10 x 6/11) = 37/55

Year 2007
From a pack of cards, 3 cards are taken out one by one. Find the probability
(a) that all 3 cards are black,
(i) if the card is not replaced;
(ii) if the card is replaced;
(b) that a card is black or a king.
Ans: (a) (i) 2/17 (ii) 1/8

Year 2005
Show that in a single throw with two dice, the chance of throwing more than 7 is
equal to that of throwing less than 7, each being equal to 5/12.

Year 2004
(a) A bag contains 5 white and 4 black balls. Two balls are drawn at random one
after the other without replacement. Find the probability that both balls are white.
Ans: 5/9 x 4/8 = 5/18

Year 2002
(a) The probability that a man will be alive for next 30 years is 2/3. Find the
probability that at least one of them will be alive 30 years hence.
Ans: 2/3

Year 2001
(a) Suppose it is 9 to 7 against a person A who is now 35 years of age living till he
is 65 and 3 to 2 against a person B now 45 years living till he is 75. Find the
chance that at least one of these persons will be alive 30 years hence.
Ans: 1- (9/16 x 3/5) = 53/80

Year 2000
The probability that a contractor will get a plumbing contract is 2/3 and the
probability that he will not get an electric contract is 5/9. If the probability of
getting at least one contract is 4/5, what is the probability that he will get both the
contracts?
Ans: P(A ∩ B) = P(A) + P(B) − P(A ∪ B), where P(B) = 1 − 5/9 = 4/9
= 2/3 + 4/9 − 4/5 = 14/45

*****


CHAPTER 8 PROBABILITY DISTRIBUTION

Theoretical distribution
Distributions, which are not obtained by actual observations but are deduced
mathematically under certain definite hypothesis or assumptions, are called
theoretical distributions.

Types of theoretical frequency distribution

There are three types of theoretical frequency distribution:
1. Binomial distribution
2. Poisson distribution
3. Normal distribution
Of these three distributions, the first two are discrete and the last one
is continuous.

Binomial distribution
The binomial distribution describes discrete data resulting from an experiment
known as a Bernoulli process. Tossing a coin a fixed number of times is a
Bernoulli process.

Bernoulli process
The trials are independent of one another. If p is the probability of success in a
single trial and q = 1 − p is the probability of failure, the probability of r successes
in n trials of a Bernoulli process is given by
$P(r) = {}^{n}C_{r}\, p^{r} q^{n-r}$
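The binomial formula can be evaluated directly, as in this minimal Python sketch. The function name `binomial_pmf` is an illustrative assumption; `math.comb` supplies nCr.

```python
from math import comb

def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials,
    each with success probability p."""
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)

# 80% good pieces, 5 pieces produced, exactly 3 good
print(round(binomial_pmf(3, 5, 0.8), 4))  # 0.2048
```

"At least" or "at most" questions are answered by adding the values of `binomial_pmf` over the relevant values of r.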

Q1 Suppose a machine produces on average 80% good pieces. Find the
probability that out of 5 pieces produced by this machine, 3 pieces will be good.
Ans: 0.2048

Q2 There are 5 workers in K- Pharma. The owner has studied the situation over a
period of time and has determined that there is 0.4 chance of any one employee
being late and that they arrive independently of one another. Find the probability
that:
(i) No employee is late.
(ii) At least one employee is late.

(iii) 4 or more employees are late.
Ans: (i) $(0.6)^5$ = 0.07776 (ii) 1 − $(0.6)^5$ = 0.92224 (iii) 0.0768 + 0.01024 = 0.08704

Q3 Six coins are thrown simultaneously. Find the chance of obtaining (i) no head
(ii) at least one head (iii) exactly two heads (iv) not more than two heads (v) more
than 3 heads.

Ans: (i) 1/64 (ii) 63/64 (iii) 15/64 (iv) 11/32 (v) 11/32

Q4 Ten coins are tossed simultaneously. Find the probability of at least seven
heads.
Ans: 176/1024

Q5 A and B play a game in which A’s chance of winning is 2/3. In a series of 8
games, what is the probability that A will win 6 or more games?
Ans: 46.8%

Q6 In a multiple-choice quiz each question has 5 alternatives out of them only one
answer is correct. What is the probability of 6 correct answers out of 10 questions?
Ans: 0.0055

Q7 (a) The incidence of an occupational disease in an industry is such that a
workman has a 20% chance of suffering from it. What is the probability that out of
6 workmen 4 or more will contract the disease? (Year 2009)

(b) A and B play a game. The probability of A winning the game is 3/5. Find
the probability of A winning at least 4 games in a set of 6 games.
Ans: (a) 0.01696 (b) 1701/3125 ≈ 0.54

Q8 Eight coins are thrown simultaneously. Find the probability of getting at least
six heads.
Ans: 37/256

POISSON DISTRIBUTION


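The Poisson distribution arises as a limiting case of the binomial distribution. When
the number of trials n is very large (n → ∞), the probability of success p is very
small (p → 0) and np = m is a finite number, the binomial distribution is not
convenient to use and the Poisson distribution is applied instead. In other words,
the Poisson distribution is applicable where the successful events are few relative
to the total number of events. The probability of r successes is
$P(r) = \dfrac{e^{-m} m^{r}}{r!}$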
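The Poisson formula can be evaluated as in the minimal Python sketch below. The function name `poisson_pmf` is an illustrative assumption, and the mean m = 2 is taken from the bank counter question (Q1) in this section.

```python
from math import exp, factorial

def poisson_pmf(r, m):
    """Probability of exactly r occurrences when the mean number is m."""
    return exp(-m) * m**r / factorial(r)

m = 2  # average number of customers per minute
p_none = poisson_pmf(0, m)
p_three_or_more = 1 - sum(poisson_pmf(r, m) for r in range(3))

print(round(p_none, 4), round(p_three_or_more, 4))  # 0.1353 0.3233
```

The second value differs slightly from the printed answer 0.3235 because that answer uses the rounded value e^(−2) = 0.1353 rather than the full-precision exponential.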
Situations where the Poisson distribution is applicable:
1. Number of defective blades out of the total blades produced in a factory.
2. Number of mistakes found in the pages of a book published by a reputed
press.
3. Number of accidents met by a taxi driver in a year.

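Poisson distribution as a limiting form of binomial distribution

The binomial distribution tends to the Poisson distribution under the following conditions:
(i) the number of trials n is indefinitely large, i.e. n → ∞;
(ii) the probability of success p is very small, i.e. p → 0; and
(iii) np = m is finite.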
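The limiting behaviour can be illustrated numerically. The sketch below is only an illustration under assumed values (m = 2 and r = 3 are chosen arbitrarily): it keeps np = m fixed while n grows and compares the binomial probability with the Poisson probability.

```python
from math import comb, exp, factorial

def binomial_pmf(r, n, p):
    """Binomial probability of exactly r successes in n trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(r, m):
    """Poisson probability of exactly r occurrences with mean m."""
    return exp(-m) * m**r / factorial(r)

m = 2.0   # np is held fixed at this value (assumed for illustration)
r = 3     # number of successes examined (assumed for illustration)

for n in (10, 100, 1000, 10000):
    p = m / n  # p shrinks as n grows, so that np stays equal to m
    gap = abs(binomial_pmf(r, n, p) - poisson_pmf(r, m))
    print(n, round(binomial_pmf(r, n, p), 5), round(poisson_pmf(r, m), 5), round(gap, 5))
```

The gap in the last column shrinks as n increases, which is what conditions (i) to (iii) describe.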

Q1 The average number of customers who appear at a counter of a certain bank
per minute is two. Find the probability that during a given minute:
(i) No customer appears
(ii) Three or more customers appear.
Given $e^{-2}$ = 0.1353
Ans: (i) 0.1353 (ii) 0.3235

Q2 Year 2014
A manufacturer of pins knows that 5% of his product is defective. He sells pins
in boxes of 100 and guarantees that not more than 4 pins in a box will be defective. What is
the probability that a box will fail to meet the guaranteed quality? ($e^{-5}$ = 0.0067)
Ans: m = np = 100 × 0.05 = 5; P(more than 4 defectives) = 1 − [P(0) + P(1) + P(2) + P(3) + P(4)] ≈ 0.562

Q3 Suppose that a manufactured product has 2 defects per unit of product
inspected. Using the Poisson distribution, calculate the probabilities of finding a
product without any defect, with 3 defects and with 4 defects. (Given $e^{-2}$ = 0.135)
Ans: 0.135, 0.18 and 0.09

Normal distribution (a continuous distribution)


Q1 In a training programme designed to upgrade the supervisory skills of
production line supervisors, the mean length of time spent on the programme is 500
hours with a standard deviation of 100 hours. Find the probability that:
(i) A participant selected at random will require more than 500 hrs to complete the
programme.
(ii) A participant selected at random will take between 500 and 650 hrs to complete
the programme?
(iii) A participant selected at random will take more than 700 hrs to complete the
programme?
(iv) A participant selected at random will require between 550 and 650 hrs to
complete the programme?
(v) A participant selected at random will require less than 400 hrs to complete the
programme?
(vi) A participant selected at random will require between 350 and 450 hrs to
complete the programme?
(vii) A participant selected at random will require between 420 and 570 hrs to
complete the programme?
(viii) A participant selected at random will take less than 600 hrs to complete the
programme?
Ans: (i) 0.5 (ii) 0.4332 (iii) 0.0228 (iv) 0.2417 (v) 0.1587 (vi) 0.2417 (vii) 0.5461
(viii) 0.8413
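The table lookups in Q1 can be cross-checked with a short Python sketch. The function name `normal_cdf` is an illustrative assumption; it computes the area to the left of x from the error function instead of a printed table.

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """Area under the normal curve to the left of x."""
    z = (x - mean) / sd
    return 0.5 * (1 + erf(z / sqrt(2)))

mean, sd = 500, 100  # hours, from Q1

p_more_than_700 = 1 - normal_cdf(700, mean, sd)                            # part (iii)
p_between_550_650 = normal_cdf(650, mean, sd) - normal_cdf(550, mean, sd)  # part (iv)
p_less_than_400 = normal_cdf(400, mean, sd)                                # part (v)

print(round(p_more_than_700, 4), round(p_between_550_650, 4), round(p_less_than_400, 4))
# 0.0228 0.2417 0.1587
```

Every part of Q1 reduces to one of these three patterns: a right tail, a difference of two left areas, or a left tail.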

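Q2 The Mumbai Municipal Corporation installs 2000 electric bulbs in the streets
of the city. These bulbs have an average life of 1000 hours with a standard
deviation of 200 hours. If the life of the bulbs is assumed to be normally distributed,
what number of bulbs may be expected to fuse within the first 700 hours?
Given areas under the standard normal curve from 0 to z:
z:    1.00   1.25   1.50
Area: 0.3413 0.3944 0.4332

Ans: z = (700 − 1000)/200 = −1.5, so P(X < 700) = 0.5 − 0.4332 = 0.0668 and 2000 × 0.0668 ≈ 134 bulbs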

Previous years' questions

Year 2014
(a) What do you understand by Normal distribution? Give the importance of
Normal distribution.


(b) Assuming that 50% of the population of a town smokes and assuming that out
of 256 investigators each takes 10 individuals to find out if they smoke, how many
investigators would you expect to report that 3 people or less smoke?

Year 2013
(a) Six dice are thrown 729 times. How many times do you expect at least three
dice to show a five or a six?
Ans: Given N = 729, n = 6
The probability of getting either 5 or 6 = p = 1/6 + 1/6 = 2/6 = 1/3
The probability of not getting 5 or 6 = q = 1 − 1/3 = 2/3
Thus p = 1/3, q = 2/3
P(at least 3 dice show 5 or 6) = P(3) + P(4) + P(5) + P(6)
= 160/729 + 60/729 + 12/729 + 1/729 = 233/729
Hence, out of 729 throws, the number of times we expect at least 3 dice to
show a five or a six
= 729 × 233/729 = 233
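The expected frequency in this answer is N times a binomial tail probability. A minimal Python sketch, using exact fractions so the result matches the hand working, is given below; the function name `binomial_tail` is an illustrative assumption.

```python
from fractions import Fraction
from math import comb

def binomial_tail(n, p, r_min):
    """P(at least r_min successes in n trials), as an exact fraction."""
    q = 1 - p
    return sum(comb(n, r) * p**r * q**(n - r) for r in range(r_min, n + 1))

p = Fraction(2, 6)                # a die shows five or six
prob = binomial_tail(6, p, 3)     # at least three of the six dice
expected_times = 729 * prob       # expected frequency in 729 throws

print(prob, expected_times)  # 233/729 233
```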

(b) In a city, ten accidents took place in a span of 50 days. Assuming that the
number of accidents per day follows the Poisson distribution, find the probability
that there will be three or more accidents in a day. (Given: $e^{-0.2}$ = 0.8187)
Ans: m = 10/50 = 0.2; P(3 or more) = 1 − [P(0) + P(1) + P(2)] = 1 − 0.8187 × (1 + 0.2 + 0.02) = 1 − 0.9988 = 0.0012

(c) State the importance of normal distribution.

Ans: Importance of Normal Distribution

1) Study of Natural Phenomena: Many natural phenomena approximately follow the
normal distribution, such as the lengths of leaves of a tree, heights of
adults, birth rates and death rates. The normal distribution is therefore widely used in the
study of natural phenomena.

2) Basis of Sampling Theory: The normal distribution is also of great importance
in sampling theory. With the help of the normal distribution one can test whether
the samples drawn from the universe are satisfactory or not.


3) Statistical Quality Control: The normal distribution helps in determining the
tolerance or specification limits within which the quality of the product lies. The
variations in the quality of a product are acceptable within these tolerance limits.

4) Useful for Large Sample Tests: The normal distribution is also widely used in
the case of large samples. Large sample tests are based on the properties of the normal
distribution.

5) Approximation to Binomial and Poisson Distributions: The normal
distribution serves as a good approximation to many theoretical distributions such
as the binomial and the Poisson. When np > 5 and n(1 − p) > 5, the normal distribution
provides a good approximation of the binomial distribution.

Year 2012
(a) The average test mark in a particular class is 79 and the standard deviation is 5. If
the marks are normally distributed, how many students in a class of 200 did not
receive marks between 75 and 82?
Given:
Pr (0 ≤ Z ≤ 0.7) = 0.2580
Pr (0 ≤ Z ≤ 0.8) = 0.2881
Pr (0 ≤ Z ≤ 0.6) = 0.2257
where Z is a standard normal variable.

Ans: Given µ = 79, σ = 5

We know that $z = \dfrac{x - \mu}{\sigma}$
When x = 75, z = (75 − 79)/5 = −4/5 = −0.8
When x = 82, z = (82 − 79)/5 = 3/5 = 0.6
So, area between z = −0.8 (x = 75) and z = 0 = 0.2881
Area between z = 0 and z = 0.6 (x = 82) = 0.2257
Total area between x = 75 and x = 82 = 0.2881 + 0.2257 = 0.5138

Total number of students who receive marks between 75 and 82 = 200 × 0.5138 =
102.76
So, 200 − 102.76 = 97.24 ≈ 97 students did not receive marks between 75 and 82.


(b) Differentiate between Binomial and Normal distribution.

Binomial Distribution | Normal Distribution
1) The binomial distribution is a discrete probability distribution. | The normal distribution is a continuous probability distribution.
2) The binomial distribution can be approximated by the normal distribution under certain conditions. | The normal distribution is not approximated by the binomial distribution.

Year 2011
(a) When does a binomial distribution tend to become a normal and a Poisson
distribution?
Ans: According to the binomial distribution, if an event E has probability p of
occurring in each of n independent trials and the probability of failure in any trial is q (= 1 − p),
then the probability that it will occur exactly r times in n trials is given by:
$f(r) = {}^{n}C_{r}\, p^{r} q^{n-r}$
• When n is very large and neither p nor q is small, the binomial distribution tends
to the normal distribution.
• When n is very large and p is very small, the binomial distribution tends to
the Poisson distribution.

(b) A leading razor blade manufacturing factory turns out razor blades with a small
chance of one out of 1000 blades being defective. Blades are supplied in packets of
10. Using the Poisson approximation, calculate the approximate number of packets
without any defective blade and with one defective blade in a consignment of 1000
packets. (Given $e^{-0.01}$ = 0.99)
Ans: P(defective) = p = 1/1000, n = 10
Mean = m = np = 10/1000 = 0.01
Probability of zero defectives = $P(0) = \dfrac{e^{-m} m^{0}}{0!} = e^{-0.01} = 0.99$
Therefore, in a consignment of 1000 packets, about 1000 × 0.99 = 990 packets will have no defective
blades.
Probability of one defective = $P(1) = \dfrac{e^{-m} m^{1}}{1!} = 0.99 \times 0.01 = 0.0099$
Therefore, in 1000 packets, approximately 1000 × 0.0099 ≈ 10 packets will have a single
defective blade.

Year 2010


It has been noticed in the World Cup Twenty-20 that the score posted by the Indian
cricket team in a day is a normal variate N(150, 225), i.e. mean µ = 150 and variance σ² = 225, so σ = 15:
a) What is the probability they will score not more than 170 on a given day?
Ans: z = (170 − 150)/15 = 1.33; P(X ≤ 170) = 0.5 + 0.4082 = 0.9082
b) What is the probability they will score at least 140 on a given day?
Ans: z = (140 − 150)/15 = −0.67; P(X ≥ 140) = 0.5 + 0.2486 = 0.7486
c) What is the score they will post with probability equal to 0.97?
Ans: P(Z ≤ z) = 0.97 = 0.5 + 0.47, so z ≈ 1.88
1.88 = (x − 150)/15
x = 150 + 1.88 × 15 = 178.2
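This question can be checked in Python with `statistics.NormalDist`, on the assumption that N(150, 225) is read in the usual way as mean 150 and variance 225, so that the standard deviation is 15. The variable names are illustrative. Part (c) needs the inverse of the cumulative distribution, which `inv_cdf` provides.

```python
from statistics import NormalDist

# Assumed reading: N(mean, variance) = N(150, 225), hence sigma = sqrt(225) = 15
scores = NormalDist(mu=150, sigma=15)

p_at_most_170 = scores.cdf(170)        # part (a)
p_at_least_140 = 1 - scores.cdf(140)   # part (b)
score_97 = scores.inv_cdf(0.97)        # part (c): score not exceeded with probability 0.97

print(round(p_at_most_170, 3), round(p_at_least_140, 3), round(score_97, 1))
```

Values read from a printed table, where z is rounded to two decimals, will differ slightly in the third or fourth decimal place.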

Year 2009
Three Coins are tossed simultaneously. Find the probability of (i) all heads (ii) one
head (iii) at least one head (iv) all tails.

Year 2006
Ten coins are thrown simultaneously. Find the probability of getting at least 7
heads.
Ans: 176/1024

Year 2004
(a) Raju and Ramu play a game. The probability of winning the game by Ramu is
2/5. Find the probability of winning at least 4 games by Ramu in a series of 5
games.

Year 2003
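On average, five birds hit the Washington Monument and are killed every week.
The government will allocate funds for equipment to scare birds away from the
monument if the probability of more than three birds being killed in any week
exceeds 0.7. Will the funds be allocated?
Given $e^{-5}$ = 0.00674
Ans: P(more than 3) = 1 − [P(0) + P(1) + P(2) + P(3)] = 1 − 0.00674 × (1 + 5 + 12.5 + 20.83) = 1 − 0.2651 = 0.7349
Since 0.7349 > 0.7, the funds will be allocated.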
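"More than three" means 4, 5, 6, ... birds, so the required probability is 1 − P(X ≤ 3). A minimal Python sketch of this check follows; the function name `poisson_cdf` and the variable names are illustrative assumptions.

```python
from math import exp, factorial

def poisson_cdf(k, m):
    """P(X <= k) for a Poisson variable with mean m."""
    return sum(exp(-m) * m**r / factorial(r) for r in range(k + 1))

m = 5            # average number of birds killed per week
threshold = 0.7  # funding rule stated in the question

p_more_than_three = 1 - poisson_cdf(3, m)
funds_allocated = p_more_than_three > threshold

print(round(p_more_than_three, 3), funds_allocated)  # 0.735 True
```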

Year 2002
(a) A and B play a game. The probability of winning the game by A is 3/5. Find
the probability of winning at least 4 games by A in a series of 6 games.

(b) Write importance of normal distribution.


Ans: (a) P(4) = 4860/15625, P(5) = 2916/15625, P(6) = 729/15625
P(at least 4) = P(4) + P(5) + P(6) = 8505/15625 = 1701/3125 ≈ 0.54

Year 2001

(a) What is Poisson distribution? Explain with an example and state the conditions
under which this distribution is used.

(b) 5000 students appeared in an examination. The mean of marks was 39.5%
with a standard deviation of 12.5%. Assuming the distribution of marks to be
normal, find the number of students who secured more than 60% marks.
Some areas under the standard normal curve are given below:
Z:     1.60   1.62   1.64   1.66   1.68
Areas: 0.4452 0.4474 0.4495 0.4515 0.4535

*****


CHAPTER 5 ESTIMATION THEORY AND HYPOTHESIS TESTING

Introduction
Estimation theory, as the name suggests, refers to the techniques and methods by which population
parameters are estimated from sample studies. Estimation of parameters is essential whenever
a sample study has been conducted. People are interested in parameter values for a variety of reasons.
For example, a manufacturer would like to have some estimate of the future demand for his product, a
businessman would like to estimate his future sales and profits, a production engineer would
wish to know the percentage of defective articles which his machine is likely to produce over a period of
time, a manufacturer of motor tyres would like to know the approximate life of his tyres, a bulb
manufacturer would be interested in the length of life of his bulbs, and so on. Such estimates
can be obtained either by the Census Method or by the Sample Method. However, as pointed out earlier,
sample studies are generally conducted to save time, money and energy.

Objectives of Theory of Estimation

1) To Estimate Population Parameter: The primary objective of the theory of estimation is to estimate a
population parameter on the basis of a sample statistic. Sampling aims at obtaining information about
the population on the basis of a sample drawn from it. This is done by estimating the
unknown population parameter with a suitable statistic computed from a sample drawn from the
parent population.

2) To Set the Limits of Accuracy and Degree of Confidence: The theory of estimation strives to set the
limits of accuracy and degree of confidence of the estimates of the population parameter computed on the
basis of a sample statistic. Estimates of population parameters obtained from sample
statistics may not give the true values. The researcher therefore sets limits of accuracy and a degree of confidence on
such estimates in order to determine how precise they are. Thus, the precision of the estimate is
another main object of sampling theory.

3) To Test Significance: One of the objectives of estimation theory is to test significance about the
population characteristic on the basis of sample statistic. Statistical inferences and statistical conclusions
about population characteristic may be conveniently drawn on the basis of sample statistic. Thus, the
testing of statistical hypothesis and drawing of statistical conclusions are also the objective of the
sampling theory.

4) To Estimate Unknown Population: Its major aim is to help in estimating unknown population
parameter from knowledge of statistical measure based on sample studies.

5) To Compare the Observed and Expected Value: Theory of estimation aims to help compare the
observed and expected value and to find if the difference can be ascribed to the fluctuations of sampling.

6) To Estimate the Properties of the Population: The theory of estimation is concerned with estimating
the properties of the population from those of the sample and also with gauging the precision of the
estimate.

7) To Determine the Approximate Value: The theory of estimation aims to determine the approximate value of a population
parameter on the basis of a sample statistic. In other words, obtaining an estimate of a
parameter from a statistic is the main objective of the theory of estimation.

Criteria of Good Estimator


A good estimator must possess the following properties:

1) Unbiasedness: An estimator is unbiased if, on average, its value equals the true value of the parameter.
An estimator θ̂ of a population parameter θ is said to be unbiased if the expected value of the estimator is
equal to the population parameter. That is, θ̂ is unbiased if E(θ̂) = θ.
For example, the mean of the sampling distribution of the sample mean is equal to the population
mean, so we say that this estimator (the sample mean) is unbiased.
However, there are some estimators which are not unbiased but are asymptotically unbiased, which
means that the bias declines as the sample size increases, until the sample
size is so large that the bias is reduced to almost zero.
Bias in an estimator is not always undesirable. An estimator with some bias but low variability
can be preferable to an estimator with low bias but high variability. When the variability of an estimator is large,
the parameter value given by it is not dependable.
For example, the sample mean X̄ is an unbiased estimator of the population mean µ. Given a random sample, the expected value
of X̄ is µ, the same value one is trying to estimate.

2) Consistency: An estimator is said to be consistent if, as the sample size increases, its value (the statistic)
comes closer and closer to the parameter value.
For example, if the sample mean X̄ comes closer to the population mean µ as the sample size N increases, the
estimator is said to be consistent. Consistency is thus a property concerning the behaviour
of the estimator for very large values of N. As N becomes very large, moving towards infinity,
the probability that the value given by the estimator is very close to the real value of the parameter
approaches unity, and the difference between the two values
becomes negligible.

3) Efficiency: In many cases there can be more than one unbiased and consistent estimator of the
parameter value. For example, in a normal distribution both the mean and the median are unbiased and
consistent estimators of the population mean. However, the variance of the sampling distribution of the mean
is less than the variance of the sampling distribution of the median, and for this reason the mean is
considered a more efficient estimator than the median.
Therefore an estimator which has lesser variability is said to be more efficient and as such more
dependable than others.
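The comparison between mean and median can be illustrated with a small simulation. The sketch below is only an illustration under assumed settings: 2000 samples of size 25 from a normal population with mean 0 and standard deviation 1, and a fixed seed so that the run is repeatable. The function name `sampling_variances` is hypothetical.

```python
import random
import statistics

def sampling_variances(n_samples=2000, sample_size=25, mu=0.0, sigma=1.0, seed=1):
    """Draw repeated samples from a normal population and return the variance
    of the sample means and the variance of the sample medians."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    return statistics.pvariance(means), statistics.pvariance(medians)

var_mean, var_median = sampling_variances()
print(var_mean, var_median)
```

The variance of the sample means should be close to σ²/n = 1/25 = 0.04, and the variance of the sample medians should come out noticeably larger, which is what "the mean is more efficient than the median" means.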

4) Sufficiency: A statistic is said to be a sufficient estimator of the parameter if it contains all the
information in the sample about the parameter. If all the information that a sample can provide about the
parameter has been utilized by an estimator it would be termed as a sufficient estimator.
If there is a sufficient estimator for the parameter, it would also be the most efficient and the most
consistent estimator. It need not, however, be an unbiased estimator.

MEANING

Hypothesis testing (or significance testing) is a procedure that helps us to decide whether the
hypothesized population parameter value is to be accepted or rejected by making use of the information
obtained from the sample.

BASIC STEPS IN HYPOTHESIS TESTING

(i) Formulate the null and alternative hypotheses

The null hypothesis (H0) is the hypothesized parameter value, which is compared with the sample result.
H0: µ = µ0 (where µ0 is the hypothesized value of the population mean)

Suppose we want to test the assumption that the mean mileage per gallon for all the cars of the new
model is 36, based on sample evidence of test runs.

The null hypothesis is:
H0: µ = 36

The alternative hypothesis (H1) is accepted only if the null hypothesis is not supported by the sample results.

There can be three possible alternative hypotheses:

• The population mean is not equal to the hypothesized mean, or H1: µ ≠ µ0.
In the example, it means that the mean mileage of all cars is not equal to 36; it can be greater than 36 or
less than 36, i.e. H1: µ ≠ 36.

• The population mean is greater than the hypothesized mean, or H1: µ > µ0.
In the example, it means that the mean mileage of all cars is greater than 36, or H1: µ > 36.

• The population mean is less than the hypothesized mean, or H1: µ < µ0.
In the example, it means that the mean mileage of all cars is less than 36, or H1: µ < 36.

(ii) Set up a suitable significance level (α):

This is a very important concept in the context of hypothesis testing. It is always some percentage (usually
5%, α = 0.05), which should be chosen with great care, thought and reason. If we take the significance
level (α) at 5 percent, this implies that the researcher is willing to take a 5% risk of rejecting the null
hypothesis (H0) when it happens to be true. It is determined in advance, before testing the
hypothesis.

(iii) Two tailed and one tailed tests

Two tailed test – the null and alternative hypotheses are:
H0: µ = µ0
H1: µ ≠ µ0, which may mean µ > µ0 or µ < µ0

Thus there are two rejection regions, as illustrated below:
Level of significance = 5%

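5% / 2 = 2.5% (rejection region in each tail)

[Figure: Two tailed test on the normal probability distribution. The acceptance region for H0 lies between Z = −1.96 and Z = 1.96, covering 47.5% of the area on each side of Z = 0; H0 is rejected in the 2.5% tail beyond either critical value.]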

ONE TAILED TEST:

A one tailed test is used to test whether the population mean is either lower (left tailed test) or
higher (right tailed test) than some hypothesized value.

Left tailed test – the null and alternative hypotheses are:
H0: µ = µ0
H1: µ < µ0; there is one rejection region, on the left tail only, as illustrated below:
Level of significance = 5%
(the whole 5% rejection region lies in the left tail)

[Figure: Left tailed test on the normal probability distribution. H0 is rejected to the left of Z = −1.645 (5% of the area); the acceptance region covers the 45% of the area between Z = −1.645 and Z = 0 and the 50% to the right of Z = 0.]

Right tailed test – the null and alternative hypotheses are:
H0: µ = µ0
H1: µ > µ0; there is one rejection region, on the right tail only.


(I) TWO TAIL

In a two tailed test, α = 0.05 means that the area in the two tails together is 5% of the total area, i.e. 2.5% in each tail of the normal
curve. The critical values of z most commonly used in business research, for different values of
α, are shown below:

TWO TAIL
LEVEL   10% (α = 0.10)   5% (α = 0.05)   1% (α = 0.01)   0.1% (α = 0.001)
|Z|     1.645            1.96            2.58            3.291

(II) ONE TAIL (left tail or right tail)

In a one tailed test, α = 0.05 means that the area in the single tail is 5% of the total area. The critical values of z
most frequently used in business research, for different values of α, are shown below:

ONE TAIL
LEVEL   10% (α = 0.10)   5% (α = 0.05)   1% (α = 0.01)   0.1% (α = 0.001)
|Z|     1.28             1.645           2.33            3.09
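The critical values in these tables can be reproduced from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `critical_z` is an illustrative assumption.

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed=True):
    """Critical |z| for significance level alpha.
    A two tailed test puts alpha / 2 in each tail; a one tailed test
    puts the whole of alpha in a single tail."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

for alpha in (0.10, 0.05, 0.01, 0.001):
    two = critical_z(alpha)
    one = critical_z(alpha, two_tailed=False)
    print(alpha, round(two, 3), round(one, 3))
```

For a left tailed test the critical value is the negative of the one tailed figure, for example −1.645 at the 5% level.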

Types of Hypothesis

1) Research Hypotheses: The research hypothesis is a directional hypothesis, i.e., it indicates the
expected direction of the results. The direction is implied by theory or previous research. The hypothesis
would not indicate the expected direction of the results in exploratory studies where there is no strong
rationale for an expected direction. When it is time to test whether the data support or refute the research
hypothesis, it needs to be translated into a statistical hypothesis.

2) Statistical Hypothesis: It is given in statistical terms. Technically, in the context of inferential
statistics, it is a statement about one or more parameters that are measures of the populations under study.
Statistical hypotheses are often given in quantitative terms, e.g., "The mean reading achievement of the
population of third grade students taught by Method A equals the mean reading achievement of the
population taught by Method B".
The two hypotheses in a statistical test are normally referred to as:

1) Null Hypothesis: A statistical hypothesis which is stated for the purpose of possible acceptance is
called the null hypothesis. It is usually denoted by the symbol H0. For example, the null hypothesis may be
expressed symbolically as:
H0: µ = 162 cms.
According to Prof. R.A. Fisher: "Null hypothesis is the hypothesis which is tested for possible rejection
under the assumption that it is true."
The following may be borne in mind in setting the null hypothesis:

i) If we want to test the significance of the difference between a statistic and the parameter, or between
two sample statistics, then we set up a null hypothesis that the difference is not significant. This means that
the difference is just due to fluctuations of sampling:
H0: µ = X̄

ii) If we want to test any statement about the population we set up the null hypothesis that it is true.

For example, if we want to find whether the population mean has a specified value µ0, then we set up the null
hypothesis
H0: µ = µ0

2) Alternative Hypothesis: Any hypothesis which is complementary to the null hypothesis is called an
alternative hypothesis and is usually denoted by H1 or Ha. For example, if we want to test the null
hypothesis that the average height of the soldiers is 162 cms., i.e.,
H0: µ = 162 cms. = µ0 (say)
then the alternative hypothesis could be:
a) H1: µ ≠ µ0 (i.e., µ > µ0 or µ < µ0)
b) H1: µ > µ0
c) H1: µ < µ0

Formulation of Hypothesis
Step 1) Set up a Hypothesis: The null hypothesis, generally referred to as H0, is the hypothesis which is
tested for possible rejection under the assumption that it is true. Theoretically, a null hypothesis is set as
no difference or status quo and is considered true until it is proved wrong by the collected
sample data. The null hypothesis is always expressed in the form of an equation, which makes a claim
regarding the specific value of the population parameter. Symbolically, a null hypothesis is represented as:
H0: µ = µ0
where µ is the population mean and µ0 is the hypothesized value of the population mean. For example,
to test whether a population mean is equal to 150, the null hypothesis can be set as "population mean is equal
to 150".
Symbolically,
H0: µ = 150
The alternative hypothesis, generally referred to as H1 (H sub-one), is the logical opposite of the null
hypothesis. In other words, when the null hypothesis is found to be true, the alternative hypothesis must be
false, and when the null hypothesis is found to be false, the alternative hypothesis must be true. Symbolically,
the alternative hypothesis is represented as:
H1: µ ≠ µ0
or, in one-sided form, H1: µ < µ0
or H1: µ > µ0
For the above example, the alternative hypothesis can be set as "population mean is not equal to 150".
Symbolically,
H1: µ ≠ 150
This results in two more alternative hypotheses: H1: µ < 150, which indicates that the population mean is
less than 150, and H1: µ > 150, which indicates that the population mean is greater than 150.

Step 2) Set up a Suitable Significance Level: The level of significance, generally denoted by α, is the
probability of rejecting the null hypothesis when it is in fact true. The level
of significance is also known as the size of the rejection region or the size of the critical region. It is very
important to note that the level of significance must be determined before we draw samples, so that the
obtained result is free from the choice bias of the decision maker. The levels of significance which are
generally applied by researchers are 0.01, 0.05 and 0.10. The concept of "level of significance" is discussed
in detail later in this chapter.

Step 3) Test Statistic: The next step is to decide on an appropriate statistical test that will be used for
statistical analysis. The type, number and level of the data may provide a basis for deciding the statistical
test. Apart from these, the statistic used in the study (mean, proportion, variance, etc.) must also be
considered when a researcher decides on the appropriate statistical test to be applied for hypothesis
testing in order to obtain the best results.

Step 4) Doing Computations: Having taken the first three steps, one has completely designed the statistical
test. One now proceeds to the fourth step: performing the computations, from a random sample of
size n, necessary for the test. These calculations include the test statistic and the standard error of the
test statistic.

Step 5) Making Decision: Lastly, a decision should be arrived at as to whether the null hypothesis is to be
accepted or rejected. In this regard the value of the test statistic computed to test the hypothesis plays a
very important role.
i) If the computed value of the test statistic (its absolute value, in a two tailed test) is less than the critical value, then the computed value of the
test statistic falls in the acceptance region and the null hypothesis is accepted.
ii) If the computed value of the test statistic (its absolute value, in a two tailed test) is greater than the critical value, then the computed value of
the test statistic falls in the rejection region and the null hypothesis is rejected.
Usually the 5% level of significance (α = 0.05) is used in testing a hypothesis and taking a decision, unless
another level of significance is specifically stated.

Importance of Hypothesis

1) Finding Answers: A hypothesis helps the researcher to find an answer to a problem. It is expressed
in declarative form and, most importantly, it provides a guideline for approaching the problem.

2) States Purpose of Researcher: A hypothesis states what researchers are looking for. When facts are
assembled, ordered, and seen in a relationship they constitute a theory. The theory is not speculation but
is built upon fact. Now the various facts in a theory may be logically analysed and relationships other
than those stated in the theory can be deduced. At this point there is no knowledge as to whether such
deductions are correct. The formulation of the deduction however constitutes a hypothesis; if verified it
becomes part of a future theoretical construction.

3) Forward Looking: A hypothesis looks forward. It is a proposition which can be put to a test to
determine its validity. It may seem contrary to or in accord with common sense. It may prove to be
correct or incorrect. In any event however, it leads to an empirical test.

4) States Specific Relationship: A hypothesis states a specific relationship between phenomena in
such a way that this relationship can be empirically tested. The basic method of this demonstration is to
design the research so that logic will require the acceptance or rejection of the hypothesis on the basis of
the resulting data.

5) Provides Direction: It provides a direction to the research and prevents waste of time and effort of the
researcher.

6) Helps in Looking into a Particular Aspect: It helps the researcher to look into a particular aspect of the
problem, thereby bringing certain issues and facts to light.

7) Framework for Analysis: It acts as a framework for analysis and interpretation of the data to draw
conclusions.

8) Suggests Areas of Importance: It suggests the areas of importance which need more attention or
more collection of facts by the researcher.

9) Ensures Scientific Nature of Research: A hypothesis ensures that the entire research process remains
scientific and reliable, following the principles of deduction.

Limitations of Hypothesis

1) Fashion of Testing: The tests should not be used in a mechanical fashion. It should be kept in view
that testing is not decision-making itself; the tests are only useful aids for decision-making. Hence "proper
interpretation of statistical evidence is important to intelligent decisions."

2) Explanation of Difference: Tests do not explain why a difference exists, say
between the means of the two samples. They simply indicate whether the difference is due to fluctuations
of sampling or to other reasons, but the tests do not tell us which other reason(s) are
causing the difference.

3) Lack of Certainty: Results of significance tests are based on probabilities and as such cannot be
expressed with full certainty. When a test shows that a difference is statistically significant, then it simply
suggests that the difference is probably not due to chance.

4) Lack of Accuracy: Statistical inferences based on significance tests cannot be said to be entirely
correct evidence concerning the truth of the hypotheses. This is especially so in the case of small samples,
where the probability of drawing erroneous inferences is generally higher. For greater reliability,
the size of the samples should be sufficiently enlarged.

IMPORTANT TERMS

Errors in Hypothesis Testing


In hypothesis testing there are basically two kinds of errors:
1) Type I Error
2) Type II Error
When a hypothesis is tested, four outcomes are possible:
1) The hypothesis is true but our test leads to its rejection.
2) The hypothesis is false but our test leads to its acceptance.
3) The hypothesis is true and our test leads to its acceptance.
4) The hypothesis is false and our test leads to its rejection.
The first two possibilities lead to errors. If we reject a hypothesis when it should be accepted (possibility
1), we say that a Type I error has been made. On the other hand, if we accept a hypothesis when it should
be rejected (possibility 2), we say that a Type II error has been made.
The following table summarises the Type I and Type II errors:
               Accept H0        Reject H0
H0 is true     No Error         Type I Error
H0 is false    Type II Error    No Error

Type I Error
Type I Error is committed when we reject a correct or true hypothesis. The probability of a Type I Error
(of rejecting a null hypothesis when it is true) is denoted by α. Thus
α = Probability of Type I Error
= Probability of rejecting H0 when H0 is true.

Type II Error
Type II Error is committed when we accept a wrong or incorrect hypothesis. Type II Error (of accepting a
null hypothesis when it is not true) is denoted by β. Thus
β = Probability of Type II Error
= Probability of accepting H0 when H0 is not true.
If the difference between two means is in fact zero and the test indicates rejection of the null hypothesis, we
commit a Type I error.
If, on the other hand, the difference between two means is not zero but our test suggests acceptance of the null
hypothesis, we commit a Type II error.

Level of Significance
Having set up the hypothesis, it is necessary to test the validity of H0 against that of Ha at a certain level
of significance. The hypotheses are tested at a pre-determined level of significance, and as such the same
should be specified. Generally, in practice, either the 5% level or the 1% level is adopted for the purpose.
The factors that affect the level of significance are:
1) The magnitude of the difference between sample means;
2) The size of the samples;
3) The variability of measurements within samples; and
4) Whether the hypothesis is directional or non-directional (A directional hypothesis is one which
predicts the direction of the difference between, say, means). In brief, the level of significance must be
adequate in the context of the purpose and nature of enquiry.

Degree of Freedom
The degrees of freedom can be defined as the number of components in the calculation of a statistic that
are free to vary.
Suppose one knows that the mean of the data is 25 and that the values are 20, 10, 50, and one unknown
value.
To find the mean of a list of data, we add all of the data and divide by the total number of values. Let the
unknown value be x; then, using the mean formula,

$$ \frac{20 + 10 + 50 + x}{4} = 25 $$

Solving this, one finds that x = 20.
If two values are missing (only 20 and 10 are known) and they are denoted x and y, the mean formula gives
x = (70 - y). This shows that when one chooses a value for x, the value for y is determined, so
there is one degree of freedom.
If the size of the given sample is n, then the degrees of freedom will be (n - 1). For example, if the size of the
sample is 22 then the degrees of freedom will be 21. In a contingency table the degrees of freedom are
calculated in a slightly different manner. For a contingency table of size s x t, the degrees of
freedom will be (s-1) (t-1), where s refers to the number of columns and t to the number of rows. Thus in a
2 x 2 contingency table the degrees of freedom = (2-1) (2-1) = 1.


One Tailed and Two Tailed Tests


A hypothesis test in which the population parameter is hypothesized to fall to the right or the left of the centre of
the sampling distribution is called a one-tailed test. A one-tailed test looks for an increase or a decrease in the
parameter. For example, a one-tailed test would be used to test these null hypotheses:
females will not score significantly higher than males on an IQ test; blue collar workers will not have
significantly lower education than white collar workers; Superman is not significantly stronger than the
average person. In each case, the hypothesis specifies the direction of the expected difference.

There are two types of one tailed test as follows:

1) Right-tailed Test: A one-tailed test in which the sample statistic is hypothesized to be at the right tail
of the sampling distribution is called right tailed test.

2) Left-tailed Test: A one-tailed test in which the sample statistic is hypothesized to be at the left tail of
the sampling distribution is called left tailed test.
A hypothesis test in which the sample statistic might fall within either the right or the left tail of the
sampling distribution is called a two-tailed test.
A two-tailed test looks for any change in the parameter (increase or decrease).
For example, a two-tailed test would be used to test these null hypotheses: there will be no significant
difference in IQ scores between males and females; there will be no significant difference between blue
collar and white collar workers; there is no significant difference in strength between Superman and the
average person.
The critical region (or the region of rejection) which is generally 5 per cent is kept on both sides of the
normal distribution in a two tailed test. It means that 2.5 per cent of the critical region is on the extreme
left of the normal curve and 2.5 per cent on the extreme right. The middle 95% is the acceptance region.
In a single tail test the 5 percent area would be either on the extreme left of the normal curve or on the
extreme right. The remaining 95 percent area would be the acceptance region.
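
The split of the 5 per cent critical region described above can be checked numerically. The following is a minimal sketch using Python's standard library; the function name critical_z is only an illustrative choice and is not part of the study material.

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed=True):
    """Positive critical value of Z for a significance level alpha."""
    # A two-tailed test splits alpha equally between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

two_tailed_5 = critical_z(0.05, two_tailed=True)   # close to the tabled 1.96
one_tailed_5 = critical_z(0.05, two_tailed=False)  # close to the tabled 1.645
print(round(two_tailed_5, 3), round(one_tailed_5, 3))
```

The two values correspond to the table values 1.96 (two-tailed) and 1.645 or 1.65 (one-tailed) used in the illustrations of this chapter.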


Z TEST

Applications of Z Test
The Z test is applied in various areas, such as:
1) Hypothesis Testing for One Proportion (π)
2) Hypothesis Testing for Two Proportions (π1 versus π2)
3) Hypothesis Testing for One Mean (µ)
4) Hypothesis Testing for Two Means (µ1 versus µ2)
5) Hypothesis Testing for Two Standard Deviations.

FORMULAE
(i) Population infinite, sample size may be large or small, standard deviation of the population known,
hypothesis may be one sided or 2 sided.

$$ Z = \frac{\bar{x} - \mu}{\sigma_p / \sqrt{n}} $$

Where,
x̄ is sample mean
μ is population mean
σp is standard deviation of population
n is number of observations in the sample.

(ii) Population finite, sample size may be large or small, standard deviation of the population known,
hypothesis may be one sided or 2 sided.

$$ Z = \frac{\bar{x} - \mu}{\dfrac{\sigma_p}{\sqrt{n}} \times \sqrt{\dfrac{N - n}{N - 1}}} $$

Where,
N is the number of units in the population, n is the number of observations in the sample.

Illustration: 1
A sample of 400 male students is found to have a mean height of 67.47 inches. Can it be reasonably
regarded as a sample from a large population with mean height 67.39 inches and a standard deviation of
1.30 inches? Test at 5% level of significance.
Ans: Z = 1.23, table value = 1.96
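
The answer to Illustration 1 can be reproduced with formula (i). This is only a sketch: the function name z_one_mean is an illustrative assumption, and the decision uses the two-tailed table value 1.96 quoted above.

```python
import math

def z_one_mean(sample_mean, pop_mean, pop_sd, n):
    """Z statistic for one mean when the population standard deviation is known."""
    standard_error = pop_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error

z = z_one_mean(sample_mean=67.47, pop_mean=67.39, pop_sd=1.30, n=400)
reject_h0 = abs(z) > 1.96  # two-tailed table value at the 5% level
print(round(z, 2), reject_h0)
```

Since the computed value is smaller than the table value, the null hypothesis is accepted: the sample can be regarded as drawn from the stated population.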

Illustration: 2
Suppose we are interested in a population of 20 industrial units of the same size all of which are
experiencing excessive labour turnover problem. The past records show that the mean of the distribution
of annual turnover is 320 employees with a standard deviation of 75 employees. A sample of 5 of these
industrial units is taken at random which gives a mean of annual turnover as 300 employees. Is the
sample mean consistent with the population mean? Test at 5% level of significance.
Ans: Z = -.671, table value = 1.96
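
Illustration 2 needs the finite population multiplier of formula (ii), because 5 units are drawn from a population of only 20. A minimal sketch follows; the function name z_one_mean_finite is hypothetical.

```python
import math

def z_one_mean_finite(sample_mean, pop_mean, pop_sd, n, N):
    """Z statistic for one mean with the finite population multiplier."""
    finite_multiplier = math.sqrt((N - n) / (N - 1))
    standard_error = pop_sd / math.sqrt(n) * finite_multiplier
    return (sample_mean - pop_mean) / standard_error

z = z_one_mean_finite(sample_mean=300, pop_mean=320, pop_sd=75, n=5, N=20)
reject_h0 = abs(z) > 1.96  # two-tailed table value at the 5% level
print(round(z, 3), reject_h0)
```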

Illustration: 3
The mean of a certain production process is known to be 50 with a standard deviation of 2.5. The
production manager would like to safeguard against decreasing values of the mean. He takes a sample of 12
items that gives a mean of 48.5. What inference should the manager draw about the production process on
the basis of the sample results? Use 5% level of significance.
Ans: Z = - 2.078, table value = 1.65

Illustration: 4 – Year 2011


A manufacturer of dry cereal is producing 20 gm packages of his product. The weights of the packages
are known to be normally distributed with a variance of 0.25 gm². A sample of 49 packages shows an
average weight of 19.8 gm. Test the appropriate hypothesis at 5% level of significance and discuss the
results. (8+6)
Ans: Z = - 2.8, T.V = 1.96

Illustration: 5 – Year 2012


Ascertain the size of the sample from the following particulars:
Standard deviation of population σp = 4
Mean of population µ = 24
Mean of sample or Xs = 22 and
Level of confidence = 99%
(Z value at 99% = 2.5758)
Ans: n = 27 (approx.)

Illustration: 6 – Year 2010


The Kanishk Yarn Trading Company claims that its product has an average breaking strength of at least
90 lbs. The Ahmedabad weaving mills is interested in testing the Company's claim regarding the
breaking strength of the yarn. The weaving master of Ahmedabad weaving mills considers it much more
serious to buy a batch of yarn with mean breaking strength of less than or equal to 90 lbs than to reject
one with a mean breaking strength of more than 90 lbs. From the mill's past experience with this type of
cotton yarn from various cotton yarn suppliers, it was observed that the standard deviation of breaking
strength is 12 lbs. In order to test Kanishk's claim, a sample of 16 pieces of yarn was selected from a
batch of yarn supplied, and the average breaking strength was found to be 92 lbs. Given this sample
information, should the weaving master accept Kanishk's claim? (Z0.025 = 1.96; Z0.05 = 1.645)
Ans: Z = .667

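(iii) Hypothesis testing of difference between means when two samples are from two different
populations.

$$ Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_{p1}^2}{n_1} + \dfrac{\sigma_{p2}^2}{n_2}}} $$

(iv) Hypothesis testing of difference between means when samples are taken from the same population.

$$ Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} $$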

Illustration: 7
The mean produce of coffee of a sample of 100 fields is 200 kg per acre, with a standard deviation of 10
kg. Another sample of 150 fields gives a mean of 220 kg with a standard deviation of 12 kg. Can the 2
samples be considered to have been taken from the same population, whose standard deviation is 11 kg?
Use 5% level of significance.
Ans: Z = - 14.08, table value = 1.96
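
Illustration 7 uses formula (iv), in which both samples share the population standard deviation of 11 kg. The sketch below reproduces the answer; the function name is an illustrative assumption.

```python
import math

def z_two_means_same_population(mean1, mean2, pop_sd, n1, n2):
    """Z statistic for the difference of two means drawn from one population."""
    standard_error = math.sqrt(pop_sd ** 2 * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error

z = z_two_means_same_population(mean1=200, mean2=220, pop_sd=11, n1=100, n2=150)
reject_h0 = abs(z) > 1.96  # two-tailed table value at the 5% level
print(round(z, 2), reject_h0)
```

The computed value is far beyond the table value, so the two samples cannot be regarded as drawn from the same population.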

(v) Hypothesis testing of proportion.

$$ Z = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} $$

where,
p̂ (p cap) is sample proportion
p is probability of happening of an event
q is probability of non happening, q = 1 - p
n is number of observations in the sample.

Illustration: 8
The null hypothesis is that 20% of passengers travel first class, but management recognizes the possibility
that this percentage could be more or less. A random sample of 400 passengers includes 70 passengers
holding 1st class tickets. Can the null hypothesis be rejected at the 10% level of significance?
Ans: Z = - 1.25, table value = 1.65
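
Illustration 8 can be worked with formula (v). The following is a sketch only; z_one_proportion is a hypothetical helper name, and 1.65 is the two-tailed table value at the 10% level quoted in the answer.

```python
import math

def z_one_proportion(successes, n, p0):
    """Z statistic for one proportion, with p0 the proportion under H0."""
    p_hat = successes / n
    q0 = 1 - p0
    standard_error = math.sqrt(p0 * q0 / n)
    return (p_hat - p0) / standard_error

z = z_one_proportion(successes=70, n=400, p0=0.20)
reject_h0 = abs(z) > 1.65  # two-tailed table value at the 10% level
print(round(z, 2), reject_h0)
```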

Illustration: 9
A certain process produces 10% defective articles. A supplier of new material claims that the use of his
material would reduce the proportion of defective. A random sample of 400 units using this new material
was taken out of which 34 were defective. Can the supplier’s claim be accepted?
Ans: Z = - 1, table value = 1.65

Illustration: 10
A sample survey indicates that out of 3232 births, 1705 were boys and the rest were girls. Do these figures
confirm the hypothesis that the sex ratio is 50:50? Test at 5% level of significance.
Ans: Z = 3.125, table value = 1.96

Illustration: 11 – Year 2012


A committee of the Ministry of Human Resource Development knows that last year 30 per cent of graduates
were unemployed. This year the committee discovers that 5,000 are unemployed in a random sample of
20,000 graduates. At 5% level of significance, has unemployment decreased this year? (Table value at
5% level = 1.64) (7+7)
Ans: Z = -15.43

(vi) Hypothesis testing for difference between proportions.

$$ Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}_1 \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \hat{q}_2}{n_2}}} $$

Illustration: 12
An advertising agency wants to find out if there is a significant difference in the degree of loyalty for a
given brand of cereal between men & women. A random sample of 200 men & 200 women was taken
and it was determined that 58% of women and 65% of men showed brand loyalty. At α = 0.05, test the
null hypothesis that there is no significant difference between the population proportion of men & women
who are brand loyal.
Ans: Z = 1.44
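
The answer to Illustration 12 follows from formula (vi), which uses each sample's own proportion in the standard error. Some textbooks pool the two proportions under the null hypothesis instead; the sketch below assumes the unpooled form given in this chapter, and the function name is illustrative.

```python
import math

def z_two_proportions(p1_hat, n1, p2_hat, n2):
    """Z statistic for the difference of two sample proportions (unpooled)."""
    variance_1 = p1_hat * (1 - p1_hat) / n1
    variance_2 = p2_hat * (1 - p2_hat) / n2
    return (p1_hat - p2_hat) / math.sqrt(variance_1 + variance_2)

# Men: 65% of 200 brand loyal; women: 58% of 200 brand loyal.
z = z_two_proportions(p1_hat=0.65, n1=200, p2_hat=0.58, n2=200)
reject_h0 = abs(z) > 1.96  # two-tailed table value at the 5% level
print(round(z, 2), reject_h0)
```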

Illustration: 13
Among 60 literates 35 are employed, and out of 50 illiterates 26 are employed. Comment whether, in your
opinion, further samples would also show the same difference in the proportion of employed persons among
literates and illiterates. Test the significance at 5% level.
Ans: Z = 0.663, table value = 1.96

Z Test (determination of sample size)

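Always 2 tail.

(1) σ known
Infinite population:
$$ n = \frac{Z^2 \sigma^2}{E^2} $$
Finite population:
$$ n = \frac{N Z^2 \sigma^2}{E^2 (N - 1) + Z^2 \sigma^2} $$

(2) p known
Infinite population:
$$ n = \frac{Z^2 pq}{E^2} $$
Finite population:
$$ n = \frac{N Z^2 pq}{E^2 (N - 1) + Z^2 pq} $$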

Illustration: 14
A research worker wants to determine the average time it takes a worker to manufacture a unit at the 95%
confidence level, with an allowable error of .50. The researcher knows from past experience that σ is 1.6. How
many observations should the sample contain for the research to be carried out effectively?
Solution:
95% confidence level, i.e. 5% level of significance, so Z = 1.96.
Error E = .50, σ = 1.6
$$ n = \frac{Z^2 \sigma^2}{E^2} = \frac{(1.96)^2 (1.6)^2}{(.50)^2} $$
= 39.34 = 39 observations.
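
The arithmetic of Illustration 14 can be verified as follows. This is a sketch with a hypothetical function name; note that the text rounds the result to the nearest whole number, whereas rounding up is the more cautious convention when the allowable error must not be exceeded.

```python
import math

def sample_size_mean(z, sigma, error):
    """Sample size for estimating a mean from an infinite population."""
    return (z * sigma / error) ** 2

n_raw = sample_size_mean(z=1.96, sigma=1.6, error=0.50)
n_nearest = round(n_raw)        # the rounding used in the text
n_rounded_up = math.ceil(n_raw) # the more cautious choice
print(round(n_raw, 2), n_nearest, n_rounded_up)
```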

Illustration: 15 - Year 2011


A simple random sample is to be taken from a population of 50,090 sales invoices to estimate the mean
amount per invoice. The standard deviation of the population is 4,000. The allowable error is 200 and the
confidence coefficient is 90% (z = 1.64). What size of sample is appropriate?
Ans:


Illustration: 16
The finance manager of a company feels that 55% of the branches will have good yearly collection of
deposit after introducing new interest rates. Determine the sample size such that the proportion is within
5% error at 90% confidence level.
Solution:
$$ n = \frac{Z^2 pq}{E^2} $$
p = .55
q = 1 - .55 = .45
E = .05
Confidence level = 90%, so level of significance = 100 - 90 = 10% and Z = 1.65
$$ n = \frac{(1.65)^2 \times (.55) \times (.45)}{(.05)^2} $$
= 269.53 = 270

Illustration: 17
A simple random sample is to be taken from a population of 50,000 sales invoices to estimate the
mean amount per invoice. The standard deviation of the population is 4,000. The allowable error is 200
and the confidence coefficient is 90% (z = 1.64). What size of sample is appropriate?
Ans:
$$ n = \frac{N Z^2 \sigma^2}{E^2 (N - 1) + Z^2 \sigma^2} $$

$$ = \frac{50000 \times (1.64)^2 \times (4000)^2}{(200)^2 (50000 - 1) + (1.64)^2 \times (4000)^2} $$

$$ = \frac{50000 \times 2.6896 \times 16000000}{40000 \times 49999 + 2.6896 \times 16000000} $$

$$ = \frac{2151680000000}{1999960000 + 43033600} = \frac{2151680000000}{2042993600} $$

n = 1053
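
The finite population formula used in Illustration 17 is easy to mistype by hand, so a short check is useful. The sketch below assumes z = 1.64, the value used in the working; the function name is illustrative.

```python
def sample_size_mean_finite(N, z, sigma, error):
    """Sample size for estimating a mean from a finite population of size N."""
    numerator = N * z ** 2 * sigma ** 2
    denominator = error ** 2 * (N - 1) + z ** 2 * sigma ** 2
    return numerator / denominator

n = sample_size_mean_finite(N=50000, z=1.64, sigma=4000, error=200)
print(round(n))
```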

Illustration: 18 – Year 2013


ABC hotel management is interested in determining the percentage of the guests of the hotel who stay for
more than 2 days.
The reservation manager wants to be 95% confident that the percentage has been estimated to be within
±3% of the true value, what is the most conservative sample size needed for this problem? (z = 1.96 for
the given confidence level of 95%)
Ans: We have been given the following:
Population is infinite
e = 0.03 (since the estimate should be within 3% of the true value)
z = 1.96 (as per table of area under normal curve for the given confidence level of 95%).
As we want the most conservative sample size we shall take the value of p = 0.5 and q = 0.5. Using all
this information, we can determine the sample size for the given problem as under:
$$ n = \frac{Z^2 pq}{E^2} = \frac{(1.96)^2 (0.5)(1 - 0.5)}{(0.03)^2} = \frac{0.9604}{0.0009} = 1067.11 = 1067 $$
Thus, the most conservative sample size needed for the problem is 1067.

T TEST

(i) Population infinite, sample size small, standard deviation of population unknown.

$$ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} $$

$$ s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} $$

* A sample of fewer than 30 observations is treated as a small sample; otherwise the sample is treated as
large.

(ii) Population finite, sample size small, standard deviation of population unknown, hypothesis may be
one sided or 2 sided.

$$ t = \frac{\bar{x} - \mu}{\dfrac{s}{\sqrt{n}} \times \sqrt{\dfrac{N - n}{N - 1}}} $$

Illustration: 19
Specimens of copper wire have the following breaking strengths (in kg weight):
578, 572, 570, 568, 572, 578, 570, 572, 596, 544.
Test whether the mean breaking strength of the population may be taken to be 578 kg. Test at 5% level of
significance (two sided).
Ans: T = -1.49
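
Illustration 19 applies formula (i) of the T test, with s computed from the sample using the (n - 1) divisor. A minimal sketch using the standard library follows; t_one_mean is a hypothetical helper name.

```python
import math
from statistics import mean, stdev

def t_one_mean(sample, pop_mean):
    """t statistic for one mean when the population standard deviation is unknown."""
    n = len(sample)
    s = stdev(sample)  # sample standard deviation, divisor n - 1
    return (mean(sample) - pop_mean) / (s / math.sqrt(n))

strengths = [578, 572, 570, 568, 572, 578, 570, 572, 596, 544]
t = t_one_mean(strengths, pop_mean=578)
degrees_of_freedom = len(strengths) - 1
print(round(t, 2), degrees_of_freedom)
```

The statistic is compared with the t table value for 9 degrees of freedom (2.262 at the 5% level, two sided, as quoted in Illustration 21).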

Illustration: 20
Palms restaurant near the railway station has been having average sales of 500 tea cups per day. Because
of the development of a bus stand nearby, it expects its sales to increase. During the first 12 days after the
start of the bus stand the daily sales were as under: 550, 570, 490, 615, 505, 580, 570, 460, 600, 580, 530,
526. Test at 5% level of significance.
Ans: T = 3.558, table value = 1.796

Illustration: 21
The lifetime of electrical bulbs for a random sample of 10 from a large consignment gave the following
data:
Item           1     2     3     4     5     6     7     8     9     10
Life in hrs.   4.2   4.6   3.9   4.1   5.2   3.8   3.9   4.3   4.4   5.6
Can we accept the hypothesis that the average lifetime of bulbs is 4 hours?
Ans: T = 2.15, table value = 2.262

Illustration: 22 - Year 2011


The increase in the price of a share on certain days during Jan 2010 was 12, 15, 11, 16, 14, 14 and 16
respectively. The increase in the price of another share on the same days was 8, 10, 14, 10, 13, 11 and 11
respectively. Calculate the value of 't' and comment whether the trend in the prices of the two shares is
significantly different. Test at 5% level of significance.

Ans: Null Hypothesis (H0): There is no significant difference between prices of two shares.
Price of Share
             Share A              Share B
         X1       (X1)²       X2       (X2)²
         12       144          8        64
         15       225         10       100
         11       121         14       196
         16       256         10       100
         14       196         13       169
         14       196         11       121
         16       256         11       121
Total    98      1394         77       871
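
The column totals above are enough to finish the calculation. The sketch below assumes the usual pooled-variance t test for two independent small samples (the formula is not stated in this part of the material), with sums of squares computed as ΣX² - (ΣX)²/n. The function name pooled_t is illustrative.

```python
import math

def pooled_t(sample1, sample2):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = sum(sample1) / n1, sum(sample2) / n2
    # Sum of squared deviations via the totals used in the table.
    ss1 = sum(x * x for x in sample1) - sum(sample1) ** 2 / n1
    ss2 = sum(x * x for x in sample2) - sum(sample2) ** 2 / n2
    dof = n1 + n2 - 2
    pooled_variance = (ss1 + ss2) / dof
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, dof

share_a = [12, 15, 11, 16, 14, 14, 16]
share_b = [8, 10, 14, 10, 13, 11, 11]
t, dof = pooled_t(share_a, share_b)
print(round(t, 3), dof)
```

Under these assumptions the computed t exceeds the two-sided 5% table value for 12 degrees of freedom (2.179), which would lead to rejection of the null hypothesis.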


Illustration: 23 - Year 2013


Two types of drugs were used on 5 and 7 patients for reducing their weight. The decrease in the weight
after using the drugs for six months was as follows:
Drug A 10 12 13 11 14
Drug B 8 9 12 14 15 10 9

Is there a significant difference in the efficacy of the two drugs at 5% level of significance? (t.05 =2.223
at d.f. 10)


(iii) Hypothesis testing for comparing two related samples. (Paired T Test)
If the values from the two matched samples are denoted as Xi and Yi and the differences by Di (Di = Xi – Yi),
then the mean of the differences is

$$ \bar{D} = \frac{\sum D_i}{n} $$

and the standard deviation of the differences is

$$ \sigma_{diff} = \sqrt{\frac{\sum D_i^2 - (\bar{D})^2 \cdot n}{n - 1}} $$

Assuming the said differences to be normally distributed and independent, we can apply the paired T
test for judging the significance of the mean of differences and work out the test statistic t as under:

$$ t = \frac{\bar{D} - 0}{\sigma_{diff} / \sqrt{n}} $$

with (n - 1) degrees of freedom.
Where,
D̄ = mean of differences.
σ diff = standard deviation of differences.
n = number of matched pairs.

Illustration: 24
The memory capacity of 9 students was tested before and after training. State at 5% level of significance
whether the training was effective, from the following scores:
Students   1    2    3    4    5    6    7    8    9
Before     10   15   9    3    7    12   16   17   4
After      12   17   8    5    6    11   18   20   3
Use paired t-test.
Ans: T = -1.368

Illustration: 25
The sales data of an item in six shops before and after a special promotional campaign are:
Shops A B C D E F
Before 53 28 31 48 50 42
After 58 29 30 55 56 45
Can the campaign be judged to be a success?
Test at 5% level of significance. Use paired t – test.
Ans: T = -2.784
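
The paired t statistic for Illustration 25 can be checked as follows, taking Di = Before - After as in the definition above. This is a sketch with a hypothetical function name; a hand calculation that rounds the intermediate values may differ slightly in the third decimal place.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Paired t statistic and degrees of freedom for matched samples."""
    differences = [b - a for b, a in zip(before, after)]
    n = len(differences)
    standard_error = stdev(differences) / math.sqrt(n)  # stdev uses divisor n - 1
    return mean(differences) / standard_error, n - 1

before = [53, 28, 31, 48, 50, 42]
after = [58, 29, 30, 55, 56, 45]
t, dof = paired_t(before, after)
print(round(t, 2), dof)
```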

F TEST (VARIANCE RATIO TEST)

The F test is used to test the significance of difference between two variances. The technique of F test
was originated by Prof. R. A. Fisher and Prof. George. W. Snedecor. By using the F test it is ascertained
whether the two samples can be regarded as drawn from the normal population having the same variance.
Procedure to calculate variance ratio (F)

(i) Calculation of variance of both the samples

Sample with the larger variance:
$$ S_1^2 = \frac{\sum (x_1 - \bar{x}_1)^2}{n_1 - 1} $$
Here, the calculated value of variance is more.

Sample with the smaller variance:
$$ S_2^2 = \frac{\sum (x_2 - \bar{x}_2)^2}{n_2 - 1} $$
Here, the calculated value of variance is less.

(ii) Calculation of variance ratio (F)

$$ F = \frac{\text{Larger estimate of the population variance}}{\text{Smaller estimate of the population variance}} = \frac{S_1^2}{S_2^2} $$

(iii) The degrees of freedom for the sample having the larger variance are known as v1 and those of the sample
having the smaller variance are known as v2:
v1 = n1 – 1, v2 = n2 – 1

(iv) The table value from the F table is obtained at the 5% level. In the F table, v1 is located horizontally from left
to right and v2 in the first column from top to bottom. The value at the intersection of v1 and v2 is regarded as
the table value.

(v) Decision: if the computed value is more than the table value, the difference is said to be significant;
otherwise it is insignificant.

Illustration: 26
Given is the following data regarding 2 samples:
Sample 1 20 16 26 27 23 22 18 24 25 19
Sample 2 27 33 42 35 32 34 38 28 41 43 30 37
Test using F test at 5% level of significance whether the 2 samples have been drawn from the same population.
Ans: F = 2.14, table value = 3.07
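
The procedure above can be sketched in a few lines for Illustration 26. The function name variance_ratio is an illustrative assumption; the larger sample variance is always placed in the numerator, and the degrees of freedom are returned in the same order.

```python
from statistics import variance

def variance_ratio(sample_a, sample_b):
    """F statistic with (v1, v2); the larger variance goes in the numerator."""
    var_a, var_b = variance(sample_a), variance(sample_b)  # divisor n - 1
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

sample_1 = [20, 16, 26, 27, 23, 22, 18, 24, 25, 19]
sample_2 = [27, 33, 42, 35, 32, 34, 38, 28, 41, 43, 30, 37]
f_value, v1, v2 = variance_ratio(sample_1, sample_2)
print(round(f_value, 2), v1, v2)
```

Here Sample 2 has the larger variance, so v1 = 11 and v2 = 9; the computed F is below the quoted table value, so the difference between the variances is insignificant.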

Illustration: 27
Answer using F test whether the 2 samples have been drawn from the same population. Test at 5% level of
significance.
Sample 1 17 27 18 25 27 29 27 23 17
Sample 2 16 16 20 16 20 17 15 21
Ans: F = 4.15, table value = 3.73

Illustration: 28
In the following table the production of two workers A and B is shown:
Worker A 10 6 16 17 13 12 8 14 15 9
Worker B 7 13 22 15 12 14 18 8 21 23 10 17
Can these results be taken as a proof that B is more competent worker? Use F test.
Ans: F = 2.14, table value = 3.07

Illustration: 29
Two independent samples of 6 and 8 items respectively had the following values of the variables. Do the
two estimates of population variance differ significantly?
Sample 1 40 30 38 41 38 35
Sample 2 39 38 41 33 32 39 40 34
Ans: F = 1.33, table value = 3.97

Chi square test

MEANING
The Chi square test is a non-parametric test where no assumption is made about the parameters of
population.
Chi square is a measure used to evaluate the difference between observed frequencies and expected
frequencies, in order to examine whether the difference so obtained is due to chance (sampling error)
or to some real factor.

Characteristics of chi square test

• Chi square test is useful to test the hypothesis about the independence of attributes.
• The Chi square test can be used in complex contingency tables.
• The Chi square test is very widely used for research purposes in behavioral science.

Formula to calculate chi square test

$$ \chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} $$
Where, Oi = observed frequency
Ei = expected frequency
Ei = P x N

Illustration: 1
The table given below shows the data obtained during an outbreak of small pox:

                 Attacked   Not attacked   Total
Vaccinated       31         469            500
Not vaccinated   185        1315           1500
Total            216        1784           2000
Test the effectiveness of vaccination in preventing the attack from small pox. Test your result with the
help of χ² at 5% level of significance.
Ans: x2 = 14.642
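
For a contingency table, each expected frequency is (row total x column total) / grand total, and the chi square formula is then applied cell by cell. The sketch below works Illustration 1 in this way; the function name is hypothetical.

```python
def chi_square_independence(table):
    """Chi square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * column_totals[j] / grand_total
            chi_square += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, degrees_of_freedom

# Rows: vaccinated, not vaccinated; columns: attacked, not attacked.
small_pox = [[31, 469], [185, 1315]]
chi_square, dof = chi_square_independence(small_pox)
print(round(chi_square, 2), dof)
```

With 1 degree of freedom the 5% table value is 3.84, so the computed value is significant and vaccination can be regarded as effective.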

Illustration: 2
The following table shows the condition of home and the condition of child.
Condition of     Condition of child
home             Clean   Fairly clean   Dirty   Total
Clean            76      38             25      139
Not clean        43      17             47      107
Total            119     55             72      246
Do these results suggest that the condition of the home affects the condition of the child? (At 5% level of
significance, the value of χ² = 5.991 for 2 degrees of freedom)
Ans: x2 = 20.87

Illustration: 3
The following contingency table shows the classification of 1000 workers in a factory, according to the
disciplinary action taken by the management and their promotional experience:
Disciplinary      Promotional experience
action            Promoted   Not promoted   Total
Offenders         30         670            700
Non offenders     70         230            300
Total             100        900            1000
Use the χ² test to ascertain whether the disciplinary action taken and promotional experience are associated.
(Given for v = 1, χ² 0.05 = 3.84)
Ans: x2 = 84.655

Illustration: 4
The results of a survey of educational attainment among 100 persons randomly selected in a
locality are given below:
           Education
           Middle   High school   College   Total
Male       10       15            25        50
Female     25       10            15        50
Total      35       25            40        100
Can you say that education depends on sex?
(For v = 2, χ² 0.05 = 5.99) (Null hypothesis should be stated clearly)
Ans: x2 = 9.928

Illustration: 5
In a certain town the proportion of smokers was 90%. A random sample of 100 persons was taken from
the town and 85% of them were found to be smokers. By using the χ² test, test whether there is a
significant difference between the sample proportion and the population proportion of smokers in the
town (for 1 d.f. χ² 0.05 = 3.841; for 2 d.f. χ² 0.05 = 5.991). (7)
Ans: Null Hypothesis H0: There is no significant difference between the sample proportion and
population proportion.
                  Smokers                    Non smokers               Total
Population   O    90                         10                        100
             E    (175 x 100)/200 = 87.5     (25 x 100)/200 = 12.5
Sample       O    85                         15                        100
             E    (175 x 100)/200 = 87.5     (25 x 100)/200 = 12.5
Total             175                        25                        200

$$ \chi^2 = \sum \frac{(O - E)^2}{E} $$

$$ = \frac{(90 - 87.5)^2}{87.5} + \frac{(10 - 12.5)^2}{12.5} + \frac{(85 - 87.5)^2}{87.5} + \frac{(15 - 12.5)^2}{12.5} $$

= 0.0714 + 0.5 + 0.0714 + 0.5 = 1.143

Now, for v = 1 and level of significance 5%, the given table value of χ² = 3.841.

Since the calculated value of χ² is less than the given table value of χ², the null hypothesis is
accepted. Hence, it can be concluded that there is no significant difference between the sample proportion
and the population proportion of smokers in the town.

Illustration: 6
A six sided dice was thrown 792 times and the following results were obtained:
No. on dice turned up   1     2     3     4     5     6
Frequency               100   100   200   170   110   112
Test the hypothesis that the dice is unbiased.
(The table value of χ² at 5% level of significance for 5 degrees of freedom is 11.07 and for 6 degrees of
freedom is 12.592)
Ans: x2 = 68.19
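
When all categories are equally likely under the null hypothesis, every expected frequency is simply the total divided by the number of categories. The sketch below applies this to Illustration 6; the function name is illustrative, and the result agrees with the quoted answer up to rounding.

```python
def chi_square_uniform(observed):
    """Chi square goodness of fit against equal expected frequencies."""
    expected = sum(observed) / len(observed)
    chi_square = sum((o - expected) ** 2 / expected for o in observed)
    degrees_of_freedom = len(observed) - 1
    return chi_square, degrees_of_freedom

dice_frequencies = [100, 100, 200, 170, 110, 112]
chi_square, dof = chi_square_uniform(dice_frequencies)
reject_h0 = chi_square > 11.07  # table value for 5 degrees of freedom at 5%
print(round(chi_square, 2), dof, reject_h0)
```

The same function can be used for the other equal-frequency problems in this section, such as the accidents by day of the week or by month.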

Illustration: 7
A dice is thrown 132 times with following results:
Number 1 2 3 4 5 6
Frequency 16 20 25 14 29 28
Is the dice unbiased?
Ans: x2 = 9

Illustration: 8
The following table gives the number of aircraft accidents that occurred during the various days of the
week. Find whether the accidents are uniformly distributed over the week?
Days               Sun   Mon   Tues   Wed   Thurs   Fri   Sat   Total
No. of Accidents   14    16    8      12    11      9     14    84
The table values for different degrees of freedom are given below:
Degree of freedom   1      2      3      4      5       6       7       8       9
5% value            3.84   5.99   7.82   9.49   11.07   12.59   14.07   15.51   16.92
Ans: x2 = 4.165

Illustration: 9
200 digits are chosen at random from a set of tables. The frequencies of the digits are as follows:
Digits 1 2 3 4 5 6 7 8 9
Frequency 18 19 23 21 16 25 22 20 15
Use the chi-square test to ascertain the correctness of the hypothesis that the digits were distributed in equal
numbers in the table from which they were chosen.
Ans: x2 = 4.3

Illustration: 10
The demand for a particular spare part out of 6720 spare parts in a factory was found to vary from day to
day. In a sample study the following information was obtained:
Days Monday Tuesday Wednesday Thursday Friday Saturday
No. of
Parts demanded 1124 1125 1110 1120 1126 1115
Test the hypothesis that the number of parts demanded does not depend on the day of the week. Use chi
square test at 5% level of significance.
Ans: x2 = .1801

Illustration: 11 – Year 2012


Weights in kg of 10 students are given as 45, 35, 30, 41, 32, 60, 48, 31, 42 and 36. Can we say that the
standard deviation of weights of all students from which the above random sample has been drawn is
equal to 5 kg?
Test on the basis of chi-square at 5% and 1% level of significance.
Given χ²0.05 at 8 d.f. = 15.5      χ²0.01 at 8 d.f. = 20.1
      at 9 d.f. = 16.9                   at 9 d.f. = 21.7
      at 10 d.f. = 18.3                  at 10 d.f. = 23.3
Ans: Let us take the null hypothesis that standard deviation of weights of all students from which the
above random sample has been drawn, is equal to 5kgs.
x2 = Σ(x - x̄)² / σ² = 780 / 25 = 31.2

Illustration: 12 – Year 2013


The number of car accidents per month in a certain town were as follows:
12 8 20 2 14 10 15 6 9 4
Are these frequencies in agreement with the belief that accident conditions were same during the 10-
month period?
(x2 = 16.919 for 9 d.f. at 5% level of significance)
Ans: x2 = 26.6

Illustration: 13
8 coins were tossed 256 times and the following results were obtained
No. of heads 0 1 2 3 4 5 6 7 8
Observed 2 6 30 52 67 56 32 10 1
Fit a binomial distribution and then calculate the expected frequencies. Using the chi-square test, test the
hypothesis that the coins are biased. Use 10% level of significance.
Ans: 3.13

*****


CHAPTER 6 TECHNIQUES OF ASSOCIATION OF ATTRIBUTES AND TESTING

Introduction
> The analysis of variance was developed by R.A. Fisher. Analysis of variance (abbreviated as ANOVA)
is useful in the fields of economics, biology, education, psychology, sociology, and business/industry and
in research in several other disciplines. This technique is used when multiple sample cases are
involved.
> For example, the significance of the difference between the means of two samples can be judged
through either the z-test or the t-test, but the difficulty arises when we have to examine the significance of
the difference amongst more than two sample means at the same time.
> The ANOVA technique enables us to perform this simultaneous test and as such is considered to be an
important tool of analysis in the hands of a researcher. Using this technique, one can draw inferences
about whether the samples have been drawn from populations having the same mean.
> The ANOVA technique is important in the context of all those situations where one wants to compare
more than two populations, such as in comparing the yield of crop from several varieties of seeds, the
gasoline mileage of four automobiles, the smoking habits of five groups of university students and so on.
In such circumstances one generally does not want to consider all possible combinations of two
populations at a time, for that would require a great number of tests before we would be able to arrive at a
decision. This would also consume a lot of time and money, and even then certain relationships may be left
unidentified (particularly the interaction effects). Therefore, one quite often utilizes the ANOVA
technique and through it investigates the differences among the means of all the populations
simultaneously.
> ANOVA is essentially a procedure for testing the difference among different groups of data for
homogeneity. "The essence of ANOVA is that the total amount of variation in a set of data is broken
down into two types, that amount which can be attributed to chance and that amount which can be
attributed to specified causes." There may be variation between samples and also within sample items.
ANOVA consists in splitting the variance for analytical purposes. Hence, it is a method of analyzing the
variance to which a response is subject into its various components corresponding to various sources of
variation.
> Through the ANOVA technique one can, in general, investigate any number of factors which are
hypothesized or said to influence the dependent variable. One may as well investigate the differences
amongst various categories within each of these factors, which may have a large number of possible
values. If we take only one factor and investigate the differences amongst its various categories having
numerous possible values, we are said to use one-way ANOVA, and in case we investigate two factors at
the same time, then we use two-way ANOVA. In a two or more way ANOVA, the interaction (i.e., the
inter-relation between two independent variables/factors), if any, between two independent variables affecting
a dependent variable can as well be studied for better decisions.

Characteristics of Analysis of Variance (ANOVA)


The essential characteristics of the Analysis of Variance (ANOVA) may be brought about as under:
1) It makes statistical analysis of variances (i.e., squares of standard deviations) of two, or more series, or
samples.
2) It determines whether the difference in the mean values of the different samples is due to chance, or
due to any significant cause, and thereby, it reveals the true characteristics of the given series.
3) It gives the desired result by finding the appropriate variance ratio through the F-test technique.

Applications of ANOVA
> Through this technique one can explain whether various varieties of seeds, fertilizers or soils differ
significantly, so that a policy decision could be taken accordingly concerning a particular variety in the
context of agricultural research.
> The differences in various types of feed prepared for a particular class of animal or various types of
drugs manufactured for curing a specific disease may be studied and judged to be significant or not
through the application of the ANOVA technique.
> A manager of a big concern can analyze the performance of various salesmen of his concern in order to
know whether their performances differ significantly.

Analysis of variance (ANOVA)


Analysis of variance (abbreviated as ANOVA) is an extremely useful technique for
researchers in the fields of economics, biology, education, psychology, sociology, and business/industry
and for research in several other disciplines.
The ANOVA technique is important in the context of all those situations where we want to
compare more than two populations, such as in comparing the yield of crop from several varieties of
seeds, the gasoline mileage of four automobiles, the smoking habits of five groups of university students
and so on.

Illustration: 1
Set up an analysis of variance table for the following per acre production data for three varieties of wheat,
each grown on 4 plots and state if the variety differences are significant.

Plot of land     Per acre production data
                 Variety of wheat
                 A     B     C
1                6     5     5
2                7     5     4
3                3     3     3
4                8     7     4
Ans: F = 1.5, table value = 4.26
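The one-way ANOVA computation behind this answer can be sketched as follows. The function name and return layout are assumptions made for illustration; the sums of squares use the same correction factor T²/n that appears in the ANOVA table later in this chapter.

```python
def one_way_anova(groups):
    """Return (SS between, SS within, d.f. between, d.f. within, F-ratio)."""
    n = sum(len(g) for g in groups)
    grand_total = sum(sum(g) for g in groups)
    correction = grand_total ** 2 / n            # T^2 / n

    ss_total = sum(x * x for g in groups for x in g) - correction
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_within = ss_total - ss_between

    df_between = len(groups) - 1
    df_within = n - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio

# Illustration 1: per acre production for wheat varieties A, B and C.
wheat = [[6, 7, 3, 8], [5, 5, 3, 7], [5, 4, 3, 4]]
ss_b, ss_w, df_b, df_w, f = one_way_anova(wheat)

print(ss_b, ss_w, df_b, df_w, round(f, 3))
```

With 2 and 9 degrees of freedom the calculated F is compared with the table value 4.26; a smaller F means the variety differences are not significant.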

Illustration: 2 – Year 2012


The following data relate to the production of wheat in thousand tonnes of three varieties, viz., X1, X2
and X3 on 3 plots:

Varieties
Plots X1 X2 X3
Y1 10 13 4
Y2 16 19 7
Y3 19 22 13
Is there a significant difference between varieties?
Ans: As the calculated F = 3.762 < 5.143, H0 is accepted; hence there is no significant difference
between varieties.

Two – way ANOVA


Two – way ANOVA technique is used when the data are classified on the basis of two factors.
For eg:
(i) The agricultural output may be classified on the basis of different varieties of seeds and also on the
basis of different varieties of fertilizers used.
(ii) A business firm may have its sales data classified on the basis of different salesmen & also on the
basis of sales in different regions.
(iii) In a factory, the various units of a product produced during a certain period may be classified on the
basis of different varieties of machines used and also on the basis of different grades of labour.

Degree of freedom (d.f) can be worked as under:


d.f for total variance = (c.r-1)
d.f for variance between columns = (c-1)
d.f for variance between rows = (r – 1)
d.f for residual variance = (c-1) (r – 1)

where,
c = number of columns
r = number of rows.

Analysis of variance table for two – way ANOVA


Source of variation   Sum of squares (SS)             Degrees of        Mean square (MS)                F-ratio
                                                      freedom (d.f.)
Between columns       Σ(Tj)²/nj - (T)²/n              (c - 1)           SS between columns / (c - 1)    MS between columns / MS residual
(treatment)
Between rows          Σ(Ti)²/ni - (T)²/n              (r - 1)           SS between rows / (r - 1)       MS between rows / MS residual
(treatment)
Residual or error     Total SS - (SS between          (c - 1)(r - 1)    SS residual / (c - 1)(r - 1)
                      columns + SS between rows)
Total                 ΣX²ij - (T)²/n                  (c.r - 1)

In the table c = number of columns, r = number of rows.
SS residual = Total SS - (SS between columns + SS between rows)

Illustration: 3
Set up analysis of variance table for the following two-way design results.

Varieties of     Per acre production data of wheat
fertilizers      Variety of seeds
                 A     B     C
W                6     5     5
X                7     5     4
Y                3     3     3
Z                8     7     4


Also state whether variety differences are significant at 5% level.
Ans: F: between columns = 4, between rows = 6, table value: between columns = 5.14, between rows =
4.76
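The two F-ratios quoted in this answer follow from the two-way ANOVA table given above. The sketch below is an illustration only (the function name is hypothetical); it assumes one observation per cell, with rows as fertilizers and columns as seed varieties.

```python
def two_way_anova(table):
    """Two-way ANOVA without replication.

    table[i][j] is the single observation for row i and column j.
    Returns (F between columns, F between rows).
    """
    r, c = len(table), len(table[0])
    n = r * c
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / n            # T^2 / n

    ss_total = sum(x * x for row in table for x in row) - correction
    ss_cols = sum(sum(col) ** 2 for col in zip(*table)) / r - correction
    ss_rows = sum(sum(row) ** 2 for row in table) / c - correction
    ss_residual = ss_total - (ss_cols + ss_rows)

    ms_residual = ss_residual / ((c - 1) * (r - 1))
    f_cols = (ss_cols / (c - 1)) / ms_residual
    f_rows = (ss_rows / (r - 1)) / ms_residual
    return f_cols, f_rows

# Illustration 3: rows = fertilizers W, X, Y, Z; columns = seeds A, B, C.
production = [[6, 5, 5], [7, 5, 4], [3, 3, 3], [8, 7, 4]]
f_columns, f_rows = two_way_anova(production)

print(round(f_columns, 2), round(f_rows, 2))
```

Here F between columns is compared with F(2, 6) = 5.14 and F between rows with F(3, 6) = 4.76.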

Illustration: 4 – Year 2010


Set up an ANOVA table for the following information relating to three drugs tested to judge their
effectiveness in reducing blood pressure for three different groups of people:
                      Drugs
Group of people       X      Y      Z
A                     14     10     11
                      15      9     11
B                     12      7     10
                      11      8     11
C                     10     11      8
                      11     11      7
[F0.05 (2,9) = 4.26; F0.05(4,9) = 3.63]
Do the drugs act differently?
Are the different groups of people affected differently? Is the interaction term significant?
Ans:
ANOVA table
Source of Variation        SS       d.f.            MS                  F-Ratio                  5% F-Limit
Between columns (i.e.,     28.77    (3 - 1) = 2     28.77/2 = 14.385    14.385/0.389 = 36.9      F (2, 9) = 4.26
between drugs)
Between rows (i.e.,        14.78    (3 - 1) = 2     14.78/2 = 7.390     7.390/0.389 = 19.0       F (2, 9) = 4.26
between people)
Interaction                29.23*   4*              29.23/4 = 7.308     7.308/0.389 = 18.8       F (4, 9) = 3.63
Within samples (error)     3.50     (18 - 9) = 9    3.50/9 = 0.389
Total                      76.28    (18 - 1) = 17

* These figures are left-over figures and have been obtained by subtracting from the column total the
total of all other values in the said column. Thus, interaction SS = (76.28) - (28.77+14.78+3.50) = 29.23
and interaction degrees of freedom = (17) - (2+2+9) = 4.
The above table shows that all the three F-ratios are significant at 5% level which means that the drugs
act differently, different groups of people are affected differently and the interaction term is significant.
In fact, if the interaction term happens to be significant, it is pointless to talk about the differences
between various treatments, i.e., differences between drugs or differences between groups of people in
the given case.
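The sums of squares in the ANOVA table above, including the left-over interaction term, can be reproduced with the sketch below. The nested-list layout and variable names are assumptions made for illustration; each cell holds the two readings for one group of people and one drug.

```python
# data[i][j] = readings for group i (A, B, C) under drug j (X, Y, Z).
data = [
    [[14, 15], [10, 9], [11, 11]],
    [[12, 11], [7, 8], [10, 11]],
    [[10, 11], [11, 11], [8, 7]],
]
r, c, m = len(data), len(data[0]), len(data[0][0])
n = r * c * m

values = [x for row in data for cell in row for x in cell]
correction = sum(values) ** 2 / n                      # T^2 / n
ss_total = sum(x * x for x in values) - correction

col_totals = [sum(sum(data[i][j]) for i in range(r)) for j in range(c)]
row_totals = [sum(sum(cell) for cell in row) for row in data]
ss_cols = sum(t * t for t in col_totals) / (r * m) - correction
ss_rows = sum(t * t for t in row_totals) / (c * m) - correction

# Within-samples SS: squared deviations from each cell mean.
ss_within = sum((x - sum(cell) / m) ** 2
                for row in data for cell in row for x in cell)

# Interaction is the left-over figure, as in the note below the table.
ss_interaction = ss_total - (ss_cols + ss_rows + ss_within)

ms_within = ss_within / (r * c * (m - 1))
f_cols = (ss_cols / (c - 1)) / ms_within
f_rows = (ss_rows / (r - 1)) / ms_within
f_interaction = (ss_interaction / ((r - 1) * (c - 1))) / ms_within

print(round(ss_cols, 2), round(ss_rows, 2),
      round(ss_interaction, 2), round(ss_within, 2))
```

Small differences in the second decimal place against the table come from rounding at intermediate steps in the hand calculation.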

NON PARAMETRIC OR HYPOTHESIS TESTING II



RUN TEST

Run test for randomness.


The null and the alternative hypothesis in this test are as follows:
Ho: The occurrence of the runs in the given stream of symbols is random.
H1: The occurrence of the runs in the given stream of symbols is not random.
In this situation, one can approximate the sampling distribution of r to normal distribution with the
following mean & variance.

Population mean: $$\mu_r = \frac{2 n_1 n_2}{n_1 + n_2} + 1$$

Standard deviation: $$\sigma_r = \sqrt{\frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}}$$

Where,
n1 = frequency of occurrence of a particular symbol in the whole stream of symbols.
n2 = frequency of occurrence of another symbol in the whole stream of symbols.
r = the number of runs.

The formula for the standard normal Z statistic to test the significance of r is given by:
$$Z = \frac{r - \mu_r}{\sigma_r}$$

Illustration: 1
The following is an arrangement of 25 men, M, and 15 women, W, lined up to purchase tickets for a
premier picture show:
M WW MMM W MM W M W M WWW MMM W MM WWW MMMMMM WWW MMMMMM
Test for randomness at the 5% level of significance.
Ans: Z = - 0.94, table value = 1.96
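The run count and the Z statistic for this illustration can be reproduced with a short script. This is a sketch under the normal approximation stated above; the function name is an assumption, and the spaces in the printed arrangement are treated only as visual separators between runs.

```python
import math
from itertools import groupby

def runs_test(sequence):
    """Return (number of runs, mean of r, z statistic) for a two-symbol sequence."""
    symbols = sorted(set(sequence))
    if len(symbols) != 2:
        raise ValueError("the run test needs exactly two distinct symbols")
    n1 = sequence.count(symbols[0])
    n2 = sequence.count(symbols[1])

    # A run is a maximal block of identical consecutive symbols.
    r = sum(1 for _ in groupby(sequence))

    mean_r = 2 * n1 * n2 / (n1 + n2) + 1
    var_r = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
             / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    z = (r - mean_r) / math.sqrt(var_r)
    return r, mean_r, z

queue = "M WW MMM W MM W M W M WWW MMM W MM WWW MMMMMM WWW MMMMMM".replace(" ", "")
runs, mean_runs, z_value = runs_test(queue)

print(runs, mean_runs, round(z_value, 2))
```

Since |Z| is below the table value 1.96, the hypothesis of randomness is accepted at the 5% level.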

Illustration: 2
The marketing manager of a company is keen on analyzing the outcomes of different quotations
submitted to its customers. The outcome is either winning (W) or losing (L) the order. The sequence of
outcomes of 40 different quotations is listed below. Check whether the events of winning or losing
the orders are random at a significance level of 0.05.
WW LL WWWWWW LL WWW L WWW LL WW LL WW LLL W LL WWW LL WW
Ans: Z = - 1.069, table value = 1.96

Illustration: 3 - Year 2005, 2010


An economic researcher wants to find out if there is any pattern in arrivals at the entrance of the shopping
mall in terms of males and females arriving or whether such arrivals are simply random. One day, he
stationed himself at the entrance and recorded the gender of the first 30 shoppers who came in. The results are
as follows:
MMFMFFFMMMFFMFMMFFFFMMMMMFFMMM
Use the run test for randomness at 0.05 level of significance.
Ans: Z = -1.0346, table value = 1.96

RANK SUM TESTS (U TEST)


THE RANK SUM TEST IS ALSO KNOWN AS THE U TEST OR WILCOXON-MANN-WHITNEY TEST.

Wilcoxon-Mann-Whitney test (or U-test): This is a very popular test among the rank sum tests. This
test is used to determine whether two independent samples have been drawn from the same population.

$$U = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1$$

Where,
U = measurement of the difference between the ranked observations of two samples.
R1 = sum of the ranks assigned to the values of the first sample.
R2 = sum of the ranks assigned to the values of the second sample.
n1 & n2 = sample sizes

In applying the U test:
Ho: the two samples came from identical populations.
Ha: the means of the two populations are not equal.
If Ho is true, the means of the ranks assigned to the values of the two samples should be more or less the same.
Mean: $$\mu_U = \frac{n_1 n_2}{2}$$
and standard deviation (or standard error):
$$\sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$$

Acceptance region:
Upper limit = μU + z σU
Lower limit = μU - z σU
where z is the table value of the standard normal variate at the chosen level of significance.

Illustration: 1
The values in one sample are 53, 38, 69, 57, 46, 39, 73, 48, 73, 74, 60, & 78. In another sample they are
44, 40, 61, 52, 32, 44, 70, 41, 67, 72, 53 & 72.
Test at 10% level of the hypothesis that they come from populations with the same mean. Apply U – test.
Ans: U = 54.5, μU = 72, σU = 17.32
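The ranking step is the error-prone part of the U test, so a small sketch may help. It pools both samples, gives tied values the average of their rank positions, and then applies the formulas above. The helper names are assumptions made for illustration.

```python
import math

def average_ranks(values):
    """Map each distinct value to its rank, averaging the positions of ties."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # The tied block occupies 1-based positions i+1 .. j.
        ranks[ordered[i]] = (i + 1 + j) / 2
        i = j
    return ranks

def mann_whitney_u(sample1, sample2):
    ranks = average_ranks(sample1 + sample2)
    n1, n2 = len(sample1), len(sample2)
    r1 = sum(ranks[x] for x in sample1)
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, mean_u, sd_u, (u - mean_u) / sd_u

first = [53, 38, 69, 57, 46, 39, 73, 48, 73, 74, 60, 78]
second = [44, 40, 61, 52, 32, 44, 70, 41, 67, 72, 53, 72]
u, mean_u, sd_u, z = mann_whitney_u(first, second)

print(u, mean_u, round(sd_u, 2), round(z, 2))
```

At the 10% level the two-tailed table value of z is 1.645; because |z| here is smaller, U falls inside the acceptance region.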

Illustration: 2 – Year 2010


A manufacturer wants to test the hypothesis that the mean lives of two brands of machines used are equal.
The lifetime is measured by the number of operating hours between overhauls. The manufacturer
keeps overhaul statistics on all his machines. A random sample of 15 machines of each brand gives the
following details of operating hours between overhauls:
Brand X: 1050, 1150, 850, 800, 1000, 1350, 1100, 1300, 1450, 900, 1200, 1250, 1550, 825, 650.
Brand Y: 1170, 970, 880, 1410, 700, 775, 940, 1650, 950, 1190, 600, 1600, 975, 450, 1290.
Using the Mann-Whitney test, will you conclude that the lifetimes of the two brands are equal? (14)
Ans: U = 98, μU = 112.5, σU = 24.1, Z = 1.96



Illustration: 3
A researcher wants to test the hypothesis that the mean lifetimes of two brands of bulbs are equal. A random
sample of 10 bulbs from each brand gives the following results:
Sr.no 1 2 3 4 5 6 7 8 9 10
Brand A 100 125 80 110 130 140 95 116 75 85
Brand B 118 92 142 86 150 68 162 98 136 148
Use the rank sum test to test the hypothesis that the mean lifetimes of the two brands of bulbs are equal. Use 5% level
of significance.
Ans: Z = 1.96, R1 = 89, µ = 50, σ = 13.23, U = 66

Kruskal-Wallis Test – H Test


Kruskal-Wallis test was developed by Kruskal and Wallis jointly and is named after them. Kruskal-Wallis
test is a non-parametric (distribution free) test, which is used to compare three or more groups of
sample data. Kruskal-Wallis test is used when the assumptions of ANOVA are not met. ANOVA is a
statistical data analysis technique that is used when the independent variable groups are more than two. In
ANOVA, we assume that each group is normally distributed.
In Kruskal-Wallis test, we make no assumption about the distribution. So Kruskal-Wallis test is
a distribution free test. If normality assumptions are met, then the Kruskal-Wallis test is not as powerful
as ANOVA. Kruskal-Wallis test is also an improvement over the sign test and Wilcoxon's signed-rank test,
which ignore the actual magnitude of the paired observations.

Hypothesis in Kruskal-Wallis Test


Null Hypothesis: In Kruskal-Wallis test, the null hypothesis assumes that the samples are from identical
populations.
Alternative Hypothesis: In Kruskal-Wallis test, the alternative hypothesis assumes that the samples come
from different populations.
1) In Kruskal-Wallis test, we assume that the samples drawn from the population are random.
2) In Kruskal-Wallis test, we also assume that the cases of each group are independent.
3) The measurement scale for Kruskal-Wallis test should be at least ordinal.

Procedure for Kruskal-Wallis Test


1) Arrange the data of all samples in a single series in ascending order.
2) Assign ranks to them in ascending order. In the case of a repeated value, assign ranks by
averaging their rank positions.
3) Once this is complete, the ranks of the different samples are separated and summed up as R1, R2, R3, etc.
4) To calculate the value of Kruskal-Wallis test, apply the following formula:

$$H = \frac{12}{n(n + 1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(n + 1)$$

Where,
H = Kruskal-Wallis test statistic
n = total number of observations in all samples
k = number of samples
Ri = sum of the ranks of the i-th sample
ni = number of observations in the i-th sample
The Kruskal-Wallis test statistic approximately follows a Chi-square distribution with k - 1 degrees of freedom,
where each ni should be greater than 5. If the calculated value of Kruskal-Wallis test is less than the chi-square
table value, then the null hypothesis will be accepted. If the calculated value of Kruskal-Wallis test H is
greater than the Chi-square table value, then we will reject the null hypothesis and say that the samples
come from different populations.

Illustration: 4
A researcher intends to compare the education and teaching standards of three business schools in a city
with the following average marks of 20 students of the respective schools
X 65 84 74 72 56 70 68
Y 63 69 71 53 59 64 49
Z 79 43 67 57 60 76 -
Test the hypothesis using Rank sum test that there is no difference in the performance of the students of
various business schools. Use 10% level of significance. Ans: 2.820

Illustration: 5
Agribusiness researchers are interested in determining the conditions under which Christmas trees grow
fastest. A random sample of equivalent-size seedlings is divided into four groups. The trees are all grown
in the same field. One group is left to grow naturally, one group is given extra water, one group is given
fertilizer spikes, and one group is given fertilizer spikes and extra water. At the end of one year, the
seedlings are measured for growth (in height). These measurements are shown for each group. Use the
Kruskal-Wallis test to determine whether there is a significant difference in the growth of trees in these
groups. Use α = 0.05
Group 1 (Native)   Group 2 (+ Water)   Group 3 (+ Fertilizer)   Group 4 (+ Water and Fertilizer)
8                  10                  11                       18
5                  12                  14                       20
7                  11                  10                       16
11                 9                   16                       15
9                  13                  17                       14
6                  12                  12                       22
Ans: 16.77
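The H value for this illustration can be traced with the sketch below, which follows the four steps of the procedure (pool, rank with averaged ties, sum ranks per sample, apply the formula). The helper names are assumptions, and no tie correction is applied, in line with the formula given in these notes.

```python
def average_ranks(values):
    """Map each distinct value to its rank, averaging the positions of ties."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        ranks[ordered[i]] = (i + 1 + j) / 2   # positions i+1 .. j
        i = j
    return ranks

def kruskal_wallis_h(samples):
    pooled = [x for sample in samples for x in sample]
    n = len(pooled)
    ranks = average_ranks(pooled)
    rank_sums = [sum(ranks[x] for x in sample) for sample in samples]
    h = (12 / (n * (n + 1))
         * sum(r_i ** 2 / len(s) for r_i, s in zip(rank_sums, samples))
         - 3 * (n + 1))
    return rank_sums, h

growth = [
    [8, 5, 7, 11, 9, 6],        # Group 1 (Native)
    [10, 12, 11, 9, 13, 12],    # Group 2 (+ Water)
    [11, 14, 10, 16, 17, 12],   # Group 3 (+ Fertilizer)
    [18, 20, 16, 15, 14, 22],   # Group 4 (+ Water and Fertilizer)
]
rank_sums, h_value = kruskal_wallis_h(growth)

print(rank_sums, round(h_value, 2))
```

H is compared with the chi-square table value for k - 1 = 3 degrees of freedom (7.82 at the 5% level, as listed in the table of Illustration 8 of the previous chapter).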

Sign test
The sign test is one of the easiest non-parametric tests. Its name comes from the fact that it is based on the
direction of the plus or minus signs of observations in a sample and not on their numerical magnitudes.
The sign test may be one of the following two types: (a) One sample sign test (b) Two sample sign test.

(a) One sample sign test: The one sample sign test is a very simple non-parametric test applicable when
we sample a continuous symmetrical population, in which case the probability of getting a sample value
less than the mean is 1/2 and the probability of getting a sample value greater than the mean is also 1/2. To test
the null hypothesis μ = μ0 against an appropriate alternative on the basis of a random sample of size 'n',
we replace the value of each and every item of the sample with a plus (+) sign if it is greater than μ0,
and with a minus (-) sign if it is less than μ0. But if the value happens to be equal to μ0, then we simply
discard it. After doing this, we test the null hypothesis that these + and - signs are values of a random
variable having a binomial distribution with p = 1/2. For performing the one sample sign test when the
sample is small, we can use tables of binomial probabilities, but when the sample happens to be large, we use
the normal approximation to the binomial distribution. Let us take an illustration to apply the one sample sign test.



Illustration: 1 - Year 2010
Suppose that, playing four rounds of golf at the city club, 11 professionals totaled 280, 282, 290, 273, 283,
283, 275, 284, 282, 279, and 281. Use the sign test at 5% level of significance to test the null hypothesis
that professional golfers average μ = 284 for four rounds against the alternative hypothesis μ < 284.
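For a small sample like this one, the sign test reduces to a binomial probability with p = 1/2. The sketch below is one possible way to carry it out; the variable names are illustrative, and the one-tailed probability is taken in the lower tail because the alternative is μ < 284.

```python
from math import comb

scores = [280, 282, 290, 273, 283, 283, 275, 284, 282, 279, 281]
mu0 = 284

# Discard values equal to mu0, then count the plus signs.
differences = [x - mu0 for x in scores if x != mu0]
n = len(differences)
plus_signs = sum(1 for d in differences if d > 0)

# P(number of plus signs <= observed) under Binomial(n, 1/2).
p_value = sum(comb(n, k) for k in range(plus_signs + 1)) / 2 ** n

print(n, plus_signs, round(p_value, 4))
```

Because this probability is below 0.05, the null hypothesis μ = 284 would be rejected in favour of μ < 284.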

Illustration: 2
On 10 occasions Mr X has to wait 5,7,3,6,6,5,7,5,2,8 minutes for the metro that he takes to reach his
company. Test the hypothesis that on an average Mr X does not have to wait for more than 5 minutes to
catch the metro train. Use sign test at 5% level of significance.

Ans: 1.86

(b) Two sample sign test (or the sign test for paired data): The sign test has important applications in
problems where we deal with paired data. In such problems, each pair of values can be replaced with a
plus (+) sign if the value from the first sample (say X) is greater than the corresponding value from the second
sample (say Y), and with a minus (-) sign if the value of X is less than the value of Y. In case
the two values are equal, the pair concerned is discarded. (In case the two samples are not of equal size,
then some of the values of the larger sample left over after the random pairing will have to be discarded.)

Illustration: 3
The following are the numbers of artifacts dug up by two archaeologists at an ancient cliff dwelling on 30
days.
By X – 1 0 2 3 1 0 2 2 3 0 1 1 4 1 2 1 3 5 2 1 3 2 4 1 3 2 0 2 4 2
By Y - 0 0 1 0 2 0 0 1 1 2 0 1 2 1 1 0 2 2 6 0 2 3 0 2 1 0 1 0 1 0
Use the sign test at 1% level of significance to test the null hypothesis that the two archaeologists, X and
Y, are equally good at finding artifacts against the alternative hypothesis that X is better.

Illustration: 4
In a manufacturing firm goods produced by 12 workers in a week before and after holidays are given as
follows:
s.no 1 2 3 4 5 6 7 8 9 10 11 12
Before 90 80 95 100 88 84 90 69 101 98 96 85
After 86 77 87 92 79 80 93 79 98 102 98 81
Use sign test to test the hypothesis that there is no effect of holidays on the productivity of the workers
against the alternate hypothesis that the productivity has increased after holidays
Ans: 1.18

SPEARMANS RANK CORRELATION


When the data are not available in numerical form for doing correlation analysis but the
information is sufficient to rank the data as first, second, third, and so forth, we quite often use the rank
correlation method and work out the coefficient of rank correlation. In fact, the rank correlation
coefficient is a measure of correlation that exists between the two sets of ranks.
Coefficient of rank correlation

$$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$

where,
rs = coefficient of rank correlation
n = number of paired observations
Σ = notation meaning 'the sum of'
d = difference between the ranks for each pair of observations
H0 = the correlation is not significant
Ha = the correlation is significant

Illustration: 1
The following are ratings of aggressiveness (X) and amount of sales in the last year (Y) for eight
salespeople. Is there a significant rank correlation between the two measures? Use the 0.10 significance
level.

X 30 17 35 28 42 25 19 29
Y 35 31 43 46 50 32 33 42
Ans: r = .8095, table value = .6190
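The coefficient for this illustration can be verified with the sketch below. The helper names are assumptions; ties are given the average of their rank positions, although this particular data set has none.

```python
def average_ranks(values):
    """Return the rank of each value in its original order (ties averaged)."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + 1 + j) / 2   # positions i+1 .. j
        i = j
    return [rank_of[v] for v in values]

def spearman_rank_correlation(x, y):
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

aggressiveness = [30, 17, 35, 28, 42, 25, 19, 29]
sales = [35, 31, 43, 46, 50, 32, 33, 42]
rs = spearman_rank_correlation(aggressiveness, sales)

print(round(rs, 4))
```

The calculated coefficient exceeds the table value 0.6190, so the rank correlation is judged significant at the 0.10 level.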

Illustration: 2
A plant supervisor ranked a sample of eight workers on the number of hours of overtime worked and
length of employment. Is the rank correlation between the two measures significant at the 0.01 level?
Amount of overtime 5.0 8.0 2.0 4.0 3.0 7.0 1.0 6.0
Years employed 1.0 6.0 4.5 2.0 7.0 8.0 4.5 3.0
Ans: r = .185, table value = .8571

Illustration: 3
The occupational safety and health administration (OSHA) was conducting a study of the relationship
between expenditures for plant safety and the accident rate in the plants. OSHA had confined its studies
to the synthetic chemical industry. To adjust for the size differential that existed among some of the
plants, OSHA had converted its data into expenditures per production employee. The results follow:

Expenditure by chemical companies per production employee in relation to accidents per year
Company A B C D E F G H I J K
Expenditure $60 $37 $30 $20 $24 $42 $39 $54 $48 $58 $26
Accidents 2 7 6 9 7 4 8 2 4 3 8
Is there a significant correlation between expenditures and accidents in the chemical - company plants?
Use a rank correlation (with 1 representing highest expenditure and accident rate) to support your
conclusion. Test at the 1 percent significance level.
Ans: r = -0.86, table value = .7455

*****


Total No. of Pages: 4
1M6113
M.B.A I Sem. (Main & Back) Exam. Jan. 2014
M-103 A Business Mathematics & Statistics

Time: 3 Hours Maximum Marks: 70
Min. Passing Marks: 28
Instructions to Candidates:
1) The question paper is divided in two sections.
2) There are sections A & B. Section A contains 6 questions out of which the
candidate is required to attempt any 4 questions. Section B contains short
case study/application based question which is compulsory.
3) All questions are carrying equal marks.

SECTION -A

Q.1 (a) Define Matrix and Transpose of a Matrix. [5]
(b) The Matrix of technological coefficients of input-output in Agriculture and
Industry is

.rt .so I
| .167
| .l2s l
If the market demand be of 100 units of Agriculture and 80 units of Industry,
find the forecast demand. [9]

Q.2 (a) Define Index Number. What are the main ways of constructing Index Number? [6]

(b) One hundred customers from a particular branch were asked to state the time
they generally take to withdraw cash from their accounts. The data is given
below

Minutes 0-10 10-20 20-30 30-40

No. of Customers 20 50 20 10

Calculate Mean deviation and Standard deviation. [8]

Q.3 Calculate the Coefficient of correlation from the following data

Fertilizer used 15 18 20 24 30 35 40 50

Yield (in tonnes) 85 93 95 105 120 130 150 160

[14]

Q.4 (a) What do you understand by Multiple Regression? [4]

(b) An investigation of the demand for TV sets in 5 towns has resulted in the
following data

Population (X) (in '000) 11 14 17 21 25

No. of Sets Demanded (Y) 15 27 34 38 46

Find a linear regression of Y on X and estimate the demand of TV sets for a
population of 30,000. [10]

Q.5 (a) What is Baye's Theorem and explain the meaning of mutually exclusive events? [5]

(b) A bag contains 6 red and 4 white balls. Another bag contains 3 red and 5 white
balls. A fair dice is tossed for the selection of bag. If dice shows 1 or 2 the first
bag is selected otherwise the second bag is selected. A ball is drawn from the
selected bag and found to be red. What is the probability that this ball comes
from the first bag? [9]

Q.6 (a) What do you understand by Normal distribution? Give the importance of Normal
distribution. [6]

(b) A manufacturer of pins knows that 5% of his products are defective. If he sells
pins in boxes of 100 and guarantees that not more than 4 pins will be defective,
what is the probability that a box will fail to meet the guaranteed quality?
(e^-5 = 0.0067) [8]

SECTION - B

Q.7 (a) Solve the following set of linear equations by using matrix method

x1 + 4x2 + 3x3 = 1
2x1 + 5x2 + 4x3 = 4
x1 - 3x2 - 2x3 = 5 [8]

(b) Assuming that 50% of the population of a town smokes and assuming that out of
256 investigators each takes 10 individuals to find out if they smoke, how many
investigators would you expect to report that 3 people or less smoke? [6]

Roll No.

M.B.A. I Semester (Main/Back) Examination - 2015
M-103 A Business Mathematics & Statistics
[Total No. of Pages { 2112

Time : 3 Hours Maximum Marks : 70
Min. Passing Marks : 28

Instructions to Candidates:
1) The question paper is divided in two sections.
2) There are sections A & B. Section A contains 6 questions out of which the
candidate is required to attempt any 4 questions. Section B contains short
case study/application based question which is compulsory.
3) All questions are carrying equal marks.

Section - A
1. a) Verify that the transpose of the product of two matrices equals the product of the
transposes taken in reverse order; that is

(AB)^T = B^T A^T
[El
. n=1, (7)
I ona B =Lz,-l;lf
L-r j

b) Find l-' A =l:.,


27
rl (7)
'f L-r
2. a) . Calculate l,nl where

It z r 3l
j-,r34
A:l r o z 3l (7)
[-rr1t)

b) Solve the following system of linear algebraic equations by Cramer's Rule
xr*xz*xt:4,
xr-xr-xr:2,
xy2xr:O (7)
3. The weights of the first 48 Miss India contest winners are given in the following
table in pounds.
128 119 125 12lJ. 118 121 110 125
135 116 115 124 124 115 118 116
120 114 130 120 116 tU 132 118
143 119 105 140 130 123 135 125
130 118 120 t2{} 126 128 120 114
120 112 115 118 138 137 140 108
a) Compute mean x̄, variance s² and standard deviation s for above data
b) Use 10 equal length classes to construct a frequency table and to draw a
histogram for the data. (7x2=14)
4. The initial weight (x) and the amount of weight lost from using a diet for one month
(y) (both in pounds) for 12 people are
y 31 9 22 30 27 17 14 21 31 28 27 15
x 214 168 176 159 173 163 157 182 209 196 170 176
Assuming a simple linear regression model with normality, does it appear a person's
initial weight affects the amount of weight lost when using this diet? (14)
5. Count the number of different 4-letter sequences that can be made using the
letters in Mississippi (14)
6. The height a university high-jumper will clear each time he jumps is a
normal random variable with mean 2 meters and standard deviation 10 centimeters.
a) What is the greatest height he will jump with probability 0.95?
b) What is the height he will clear only 10 percent of the time? (7x2=14)
Section - B
1 a) Suppose that medical science has a cancer-diagnostic test that is 95% accurate
on both those who do and those who do not have cancer. If 0.005 of the
population actually does have cancer, compute the probability that a particular
individual has cancer given that the test says he has cancer. (7)
b) Assume a printed page in a book contains 40 lines and each line contains 75
positions (each of which may be left blank or filled with some symbol). Thus
each page has 3000 positions to be set. Assume a particular typesetter makes

i.!i..i effor psr 5(10il positions on the average"


i) What is the distribution for X, the number of errors per page?
ii) Compute the probability that a page contains no errors.
iii) What is the probability that a 16-page chapter contains no errors? (7)


Total No. of Pages:
1M6113
M. B. A. I Sem. (Main/Back) Exam., Jan. 2016
M-103A Business Mathematics and Statistics

Time: 3 Hours Maximum Marks: 70


Min. Passing Marks: 28
Instructions to Candidates :
(i) The question paper is divided in two sections.
(ii) There are sections A & B. Section A contains 6 questions out of
which the candidate is required to attempt any 4 questions.
Section B contains short case study / application based question
which is compulsory.
(iii) All questions carry equal marks.
1. NIL 2.

SECTION - A

\yr(il Forrhetwomatric"' , ,=[], ''o],'"n, (as), =Br.Ar.


"=[] I

,{o "=l):il show that e.2-4A-5I=0 Hence or otherwise, Find

A-r 171

20
'o/^,*ll
'..,_/ r
2x
02
= 6 find the value of x.
|

[1M6113] Page 1 of3 [36401



(b) Solve the following system of linear equations by matrix inversion method:
2x + y + 4z = 2, 2 + 4y + 2z = 3, 2x + 3y + z = -6 [7]
Q.3 (a) The profits and losses of a business concern for the years 2011-2015 are given
below: [4]
Year Profit (in Rs.) Loss (in Rs.)
2011 3000
2012 4000
2013 2500
2014 2000
2015 6000

Represent the above data by a Bar graph


(b) Calculate the arithmetic mean and the median of the frequency distribution given
below. Hence calculate the mode using the empirical relation between the
three. [10]

Class-Limits 130-134 135-139 140-144 145-149 150-154 155-159 160-164
Freq. 5 15 8 24 17 10 1

Q.4 (a) Six dice are thrown 729 times. How many times do you expect at least three dice
to show a 5 or 6? [7]
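The expectation can be sketched with an exact binomial tail, treating each die as a trial with success probability 2/6 (a 5 or a 6). This is an illustrative check under that standard assumption; `binom_tail` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb


def binom_tail(n: int, p: Fraction, k: int) -> Fraction:
    """Return P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(k, n + 1))


p_success = Fraction(2, 6)             # a single die shows 5 or 6
prob = binom_tail(6, p_success, 3)     # at least three of six dice succeed
expected_throws = 729 * prob           # expected count in 729 throws

if __name__ == "__main__":
    print(prob, expected_throws)
```

Because 729 = 3^6, the expected number of throws comes out as a whole number.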
(b) Find the standard deviation and coefficient of variation from the following table
giving the marks of 150 students: [7]

Marks Number of Students Marks Number of Students
1-10 5 51-60 22
11-20 12 61-70 15
21-30 20 71-80 6
31-40 25 81-90 4
41-50 40 91-100 1

[1M6113] Page 2 of3 [36401


Q.5 (a) What is meant by an Index number? Explain briefly any two methods of
construction of Index numbers. [7]
(b) The following table gives the change in the price and consumption of three
commodities. Compute Fisher's ideal price index number. [7]

2005 2015
Commodity Price (Rs.) Quantity (Rs.) Price (Rs.) Quantity (Rs.)
Wheat 100 10 110 6
Rice 150 15 170 18
Cloth 5 50 4 30
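Fisher's ideal index is the geometric mean of the Laspeyres and Paasche price indices. The sketch below applies that definition to the table as read from the scan (several digits were OCR-damaged, so the figures are an assumption); `fisher_index` is a hypothetical helper name.

```python
from math import sqrt


def fisher_index(p0, q0, p1, q1):
    """Return (Laspeyres, Paasche, Fisher) price indices, base period = 100."""
    laspeyres = sum(a * b for a, b in zip(p1, q0)) / sum(a * b for a, b in zip(p0, q0))
    paasche = sum(a * b for a, b in zip(p1, q1)) / sum(a * b for a, b in zip(p0, q1))
    return 100 * laspeyres, 100 * paasche, 100 * sqrt(laspeyres * paasche)


# Wheat, Rice, Cloth: 2005 is the base year, 2015 the current year.
p0, q0 = [100, 150, 5], [10, 15, 50]
p1, q1 = [110, 170, 4], [6, 18, 30]

lasp, paas, fisher = fisher_index(p0, q0, p1, q1)

if __name__ == "__main__":
    print(round(lasp, 2), round(paas, 2), round(fisher, 2))
```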

Q.6 Calculate the coefficient of correlation from the following data:

X: 1 2 3 4 5 6 7 8 9
Y: 9 8 10 12 11 13 14 16 15
Also, obtain the equations of the lines of regression and obtain an estimate of Y which
should correspond on the average to X = 6.2. [14]
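A minimal sketch of the computation from deviations about the means, assuming the Y values read 9, 8, 10, 12, 11, 13, 14, 16, 15 and the estimate is wanted at X = 6.2 (both readings are repaired from the scan). Variable names are illustrative.

```python
from math import sqrt

x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [9, 8, 10, 12, 11, 13, 14, 16, 15]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Sums of squares and cross-products of deviations from the means.
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

r = sxy / sqrt(sxx * syy)            # Karl Pearson's coefficient
b_yx = sxy / sxx                     # slope of the regression of Y on X
b_xy = sxy / syy                     # slope of the regression of X on Y
a_yx = mean_y - b_yx * mean_x        # intercept of Y on X
y_hat = a_yx + b_yx * 6.2            # estimate of Y at X = 6.2

if __name__ == "__main__":
    print(f"r = {r:.2f}")
    print(f"Y = {a_yx:.2f} + {b_yx:.2f} X, estimate at 6.2: {y_hat:.2f}")
    print(f"X = {mean_x - b_xy * mean_y:.2f} + {b_xy:.2f} Y")
```

As a consistency check, the product of the two regression slopes equals r squared.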

SECTION - B
Q.7 (a) Suppose a manufacturing firm produces steel pipes in three plants with daily
production volumes of 500, 1000 and 2000 units respectively. According to past
experience it is known that the fractions of defective output produced by the
three plants are respectively 0.005, 0.008 and 0.010. If a pipe is selected from a
day's total production and found to be defective, find out what is the probability
that it came from the first plant. [7]

(b) Assume that in a certain factory turning out razor blades, there is a small chance 1/500
for any blade to be defective. The blades are supplied in packets of 10. Use the
Poisson distribution to calculate the approximate number of packets containing
no defective, one defective and two defective blades respectively in a
consignment of 10000 packets, given that e^(-0.02) = 0.9802. [7]
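The Poisson mean per packet is m = 10 x (1/500) = 0.02. The sketch below uses the rounded value e^(-0.02) = 0.9802 supplied in the question rather than the exact exponential, so the counts match a hand calculation; `poisson_pmf` is a hypothetical helper name.

```python
from math import factorial


def poisson_pmf(k: int, m: float, exp_neg_m: float) -> float:
    """Poisson probability of k events, given the mean m and a value for e^(-m)."""
    return exp_neg_m * m**k / factorial(k)


m = 10 * (1 / 500)        # expected defective blades per packet
exp_neg_m = 0.9802        # e^(-0.02), as given in the question
packets = 10000

# Expected number of packets with 0, 1 and 2 defective blades.
counts = [packets * poisson_pmf(k, m, exp_neg_m) for k in range(3)]

if __name__ == "__main__":
    for k, c in enumerate(counts):
        print(f"{k} defective: about {c:.2f} packets")
```

Rounded to whole packets this gives roughly 9802, 196 and 2.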

Total No. of Pages
M.B.A. I Sem. (Main & Back) Examination Dec.-2016
M-103A Business Mathematics and Statistics
Time : 3 Hours Maximum Marks : 70
Min. Passing Marks : 28
Instructions to Candidates:
i) The question paper is divided in two sections.
ii) There are sections A & B. Section A contains 6 questions out of which the
candidate is required to attempt any 4 questions. Section B contains short
case study/application base 1 question which is compulsory.
iii) All questions carry equal marks.


Section - A
1. a) Find A² - 3A + 9I, if

    [ 1  -2   3 ]             [ 1  0  0 ]
A = [ 2   3  -1 ], where I =  [ 0  1  0 ]   (7)
    [-3   1   2 ]             [ 0  0  1 ]

-r ol lz r :l
l-r
b) ' Ir 3l-| ,=l-,
ff A=12 '=l-'
o I,l
0
rl'
l, L, o ,l
I

Find i) A' ii) B' iii) (A+B)' iv) (2A)' (7)


Compute the inverse of the matrix

Ir z -rl
rl 4i
u=l-, ,'l
z)
(7)
L3
b) Solve the following system of equations by using determinants (Cramer's rule)
x - 4y - z = 11
2x - 5y + 2z = 39 (7)
-3x + 2y + z = 1
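Cramer's rule replaces one column of the coefficient matrix at a time with the right-hand side and divides each resulting determinant by the determinant of the original matrix. The sketch below assumes the first right-hand side reads 11 (the scan is unclear at that point); `det3` and `cramer3` are hypothetical helper names.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def cramer3(coeffs, rhs):
    """Solve a 3x3 linear system by Cramer's rule; returns (x, y, z) as Fractions."""
    d = det3(coeffs)
    if d == 0:
        raise ValueError("determinant is zero: Cramer's rule does not apply")
    solution = []
    for col in range(3):
        # Replace column `col` of the coefficient matrix with the right-hand side.
        replaced = [
            [rhs[r] if c == col else coeffs[r][c] for c in range(3)]
            for r in range(3)
        ]
        solution.append(Fraction(det3(replaced), d))
    return tuple(solution)


coeffs = [[1, -4, -1], [2, -5, 2], [-3, 2, 1]]
rhs = [11, 39, 1]
x, y, z = cramer3(coeffs, rhs)

if __name__ == "__main__":
    print(x, y, z)
```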

1M6113/2016 (1) [Contd....
3. a) Find the mode and the median for the following distribution. (7)
Variable 0-5 5-10 10-15 15-20 20-25 25-30 30-35 35-40
Frequency 2 5 7 13 21 16 8
Frl

b) The following table shows the number of workers in a factory whose weekly
earnings are given against them. Determine the mean value of weekly earnings
and the standard deviation. (7)
Burrge of weelly rngs rn mbdr. of workers ,in
(9
.-r,G
{ -(b
74
6-8 T - ft'
L
"376
8- l0 1 /
?04
t0-12 t) -L J10
t2-14
l4-16 @
lf
0

1'"-
18

0
16-18
l)_ \
9
t8-20
/l A
9
20-22
\ v 0

4. a) Ten competitors in a beauty contest are ranked by three judges in the following
order.

First judge 1 6 5 10 3 2 4 9 7 8
Second judge 3 5 8 4 7 10 2 1 6 9
Third judge 6 4 9 8 1 2 3 10 5 7
Use the rank correlation coefficient to discuss which pair of judges have the nearest
approach to common tastes in beauty. (7)
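Spearman's coefficient for untied ranks is 1 - 6 Σd² / (n(n² - 1)). The sketch below compares the three pairs of judges, assuming the ranks read as 1 6 5 10 3 2 4 9 7 8, 3 5 8 4 7 10 2 1 6 9 and 6 4 9 8 1 2 3 10 5 7 (several digits in the scan are damaged); `spearman_rho` is a hypothetical helper name.

```python
from itertools import combinations


def spearman_rho(ranks_a, ranks_b):
    """Spearman's rank correlation for two rankings without ties."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n**2 - 1))


judges = {
    "first": [1, 6, 5, 10, 3, 2, 4, 9, 7, 8],
    "second": [3, 5, 8, 4, 7, 10, 2, 1, 6, 9],
    "third": [6, 4, 9, 8, 1, 2, 3, 10, 5, 7],
}

# Coefficient for every pair of judges.
rho = {
    (a, b): spearman_rho(judges[a], judges[b])
    for a, b in combinations(judges, 2)
}
closest_pair = max(rho, key=rho.get)

if __name__ == "__main__":
    for pair, value in rho.items():
        print(pair, round(value, 4))
    print("nearest approach to common tastes:", closest_pair)
```

The pair with the largest coefficient is the one with the nearest approach to common tastes.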
b) The equations of two regression lines obtained in a correlation analysis of 60
observations are 5x:6y+24 and 1000 y :76gx - 37Og

i) What is the correlation coefficient and what is its probable error?
ii) Show that the ratio of the coefficient of variability of x to that of y is 5.24
iii) What is the ratio of variances of x and y? (7)
5. a) From a pack of cards, a card is drawn. What is the probability of drawing a red
card or a king? (7)

1M6113 (2)
b) In a bolt factory, machines A, B and C manufacture 25, 35 and 40 percent of
the total. Of their output 5, 4 and 2 percent are defective. A bolt is drawn at
random and is found to be defective. What are the probabilities that it was
manufactured by machines A, B and C? (7)
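Bayes' theorem gives each posterior as prior x likelihood divided by the total probability of a defective bolt. A minimal sketch with exact fractions follows; `posterior` is a hypothetical helper name.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Bayes' theorem: P(source | evidence) for each source, as exact fractions."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)                 # total probability of the evidence
    return [j / total for j in joint]


# Share of output from machines A, B, C and their defective rates.
priors = [Fraction(25, 100), Fraction(35, 100), Fraction(40, 100)]
defect_rates = [Fraction(5, 100), Fraction(4, 100), Fraction(2, 100)]

post_a, post_b, post_c = posterior(priors, defect_rates)

if __name__ == "__main__":
    print(post_a, post_b, post_c)
```

The same helper applies to the three-plant steel pipe questions elsewhere in these papers by changing the priors to production shares and the likelihoods to the plants' defective fractions.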
6. a) From the chain base index numbers given below, prepare fixed base index
numbers. (7)
1945 1946 1947 1948 1949 1950
92 102 104 98 103 101
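Each fixed base index is the current year's chain index multiplied by the previous year's fixed base index and divided by 100. The sketch below assumes the first year's fixed base index equals its chain index (the usual textbook convention); `chain_to_fixed` is a hypothetical helper name.

```python
def chain_to_fixed(chain):
    """Convert chain base index numbers to fixed base index numbers."""
    fixed = [float(chain[0])]          # first year carried over unchanged
    for link in chain[1:]:
        fixed.append(fixed[-1] * link / 100)
    return fixed


years = [1945, 1946, 1947, 1948, 1949, 1950]
chain = [92, 102, 104, 98, 103, 101]
fixed = chain_to_fixed(chain)

if __name__ == "__main__":
    for year, value in zip(years, fixed):
        print(year, round(value, 2))
```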

b) 10% of screws produced by a machine are defective. Find the probability of
the following when they are checked at random by examining samples of 5:
i) None is defective
ii) One is defective
iii) At most one is defective (7)

Section - B
7. a) A driver has two taxis, which he hires out day by day. The number of
demands for a taxi on each day is distributed as a Poisson variate with mean
1.5. Calculate the proportion of days on which

i) Neither car is used
ii) Some demand is refused (7)
b) Construct with the help of the data given below Fisher's Ideal Index and show that it
satisfies the Factor Reversal Test.
Estimated totalproduce Harnest price per marnd in
in thousand tons in saran district
saran district
t93t-32 1932-33 t93L-32 r932-33
Rs. As. Rs. As.
WinterRrce 71 26 ,, 3
zl
J

Barley rc7 2 0 2 0

Maule 62 48 2 9 e (7)


1M6113 (3)
VIDEO AVAILABLE FOR
WWW.MOREEDUCATION.IN ALL PRACTICAL & THEORY SUBJECT RTU MBA CLASSES SINCE 2005

Roll No. Total No. of Pages:
1M6113
M. B. A. I Sem. (Main/Back) Exam., Dec. 2017
M-103A
Business Mathematics and Statistics
Time: 3 Hours Maximum Marks: 70
Min. Passing Marks: 28
Instructions to Candidates :
(i) The question paper is divided in two sections.
(ii) There are sections A & B. Section A contains 6 questions out of which
the candidate is required to attempt any 4 questions. Section B
contains short case study / application based 1 question, which is
compulsory.
(iii) All questions carry equal marks.

SECTION - A
Q.1 (a) A company produces three products every day. Their total production on a certain
day is 45 tons. It is found that the production of the third product exceeds the
production of the first product by 8 tons, while the total production of the first and
third products is twice the production of the second product. Determine the
production of each product using Cramer's rule. [10]

(b) Calculate the Inverse of a Matrix- [4]

[; ;]

[1M51]"3J Page 1 of 3 [ 40601

JAIPUR-VIDHYADHAR NAGAR 418 MANSAROVAR PLAZA PRATAP NAGAR



Q.2 (a) What do you mean by multiple regression? Explain. [4]

(b) A child specialist observed 10 school students for their average calorie intake (x)
and body weight (y) in kg. The data analyst offered the following summation
quantities based on the basic data on the two variables.

Σx = 166, Σy = 5tl7, Σxy = 9840, Σx² = 2892 and Σy² = 33927.
Using these quantities, find (a) the absolute increase in weight per unit of calorie
intake, (b) the minimum weight (y-intercept), (c) the most likely weight against a
calorie intake of 25 and (d) the standard error of estimate. [10]

Q.3 (a) Draw different types of scatter diagrams for different degrees of correlation
between two correlated variables. [6]

(b) The following is the record of goals scored by team A in the football season. [8]

No. of goals scored 0 1 2 3 4
No. of matches 1 9 7 5 3

For team B the average number of goals scored per match was 2.5 with a
standard deviation of 1.25 goals. Find which team may be more consistent?

Q.4 (a) A, B and C are bidding for a contract. It is believed that A has exactly half the chance
that B has; B, in turn, is 4/5th as likely as C to gain the contract. What is the
probability for each to win the contract? [9]

(b) A bag contains white, 4 blue and 10 green balls. Two balls are drawn at
random. Find the probability that they will both be green. [5]

Q.5 (a) Define Index numbers. What are the main ways of constructing Index
Numbers? [6]
[1M6113] Page 2 of 3 [4060]


(b) Calculate the index number of prices for 1995 on the basis of 1990 from the data
given below: [8]
Commodity Weight Price per unit 1990 (₹) Price per unit 1995 (₹)
A 40 16 20
B 25 40 50
C 20 12 15
D 15 2 3
If the weights of commodities A, B, C and D are increased in the ratio 1:2:3:4,
what will be the increase in the index number?

Q.6 (a) What do you mean by Normal distribution? Give the importance of the normal
distribution. [6]

(b) A manufacturer of dolls knows that 5% of his products are defective. If he sells
dolls in boxes of 100 and guarantees that not more than 4 dolls will be defective,
what is the probability that a box will fail to meet the guaranteed quality?
(e^(-5) = 0.0067) [8]

SECTION - B
Case Study

Q.7 A finance company has offices located in every division, every district and every
taluka in a certain state in India. Assume that there are five divisions, 30 districts and
200 talukas in the state. Each office has 1 Head Clerk, 1 Cashier, 1 Clerk and 1 Peon. A
divisional office has, in addition, 1 Office Superintendent, 2 Clerks, 1 Typist and 1
Peon. A district office has, in addition, 1 Clerk and 1 Peon.
The basic daily salaries are as follows: Office Superintendent Rs. 500, Head Clerk
Rs. 200, Cashier Rs. 175, Clerks and Typist Rs. 150 and Peon Rs. 100. Using matrix
notation, find
(a) The total number of posts of each kind in all the offices taken together. [5]
(b) The total basic daily salary bill of each kind of office and [5]
(c) The total basic daily bill of all the offices taken together. [4]
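One way to set the case up in matrix notation is a row vector of office counts, a posts-per-office matrix and a salary column vector. The sketch below is an illustration under the assumption of exactly one office per division, district and taluka; all variable names are hypothetical.

```python
# Post order: Office Superintendent, Head Clerk, Cashier, Clerk, Typist, Peon.
offices = [5, 30, 200]                  # divisional, district, taluka offices

posts_per_office = [
    [1, 1, 1, 3, 1, 2],                 # divisional: basic staff + 1 OS, 2 Clerks, 1 Typist, 1 Peon
    [0, 1, 1, 2, 0, 2],                 # district: basic staff + 1 Clerk, 1 Peon
    [0, 1, 1, 1, 0, 1],                 # taluka: basic staff only
]

salaries = [500, 200, 175, 150, 150, 100]   # daily salary per post, in Rs.

# (a) offices (1x3) times posts_per_office (3x6): total posts of each kind.
total_posts = [
    sum(offices[i] * posts_per_office[i][j] for i in range(3))
    for j in range(6)
]

# (b) posts_per_office (3x6) times salaries (6x1): daily bill of one office of each kind.
bill_per_office = [
    sum(row[j] * salaries[j] for j in range(6)) for row in posts_per_office
]

# (c) offices (1x3) times bill_per_office (3x1): total daily bill.
total_bill = sum(n * b for n, b in zip(offices, bill_per_office))

if __name__ == "__main__":
    print(total_posts)
    print(bill_per_office)
    print(total_bill)
```

Multiplying the total posts in (a) by the salary vector gives the same overall bill as (c), which is a useful cross-check.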

[1M6113] Page 3 of 3 [ 40601

Roll No. [Total No. of Pages : 4
MBA I Sem. (Main/Back) Examination Dec. - 2018
M-103, Business Mathematics & Statistics
Time : 3 Hours Maximum Marks : 70
Min. Passing Marks : 28
Instructions to Candidates:

1) The question paper is divided in two sections.

2) There are sections A & B. Section A contains 6 questions out of which the
candidate is required to attempt any 4 questions. Section B contains short
case study/application base 1 question which is compulsory.
3) All questions carry equal marks.

Section - A

         [ x - y    2x + z ]   [ -1   5 ]
1. a) If [ 2x - y   3z + w ] = [  0  13 ], find x, y, z and w. (6)

b) Solve the following set of equations by matrix method

2x + 3y + 4z = 29, x + y + 2z = 13, 3x + 2y + z = 16. (8)

2. a) Explain various measures of dispersion and their significance. (6)

b) The following data relate to age of employees and the number of days they
were reported sick in a month.

Employees: 1 2 3 4 5 6 7 8 9 10

Age (X) : 30 32 35 40 48 50 52 55 57 61
tu
SickDays(Y): I 0 Z 5 Z 4 G 5

Calculate Karl Pearson's coefficient of correlation between employees' age
and sick days and interpret the result. (8)

1M6113 t2018 (I) [Contd....


3. a) The following data give the experience of machine operators and their
performance ratings as given by the number of good parts turned out per
100 pieces.

Operator 1 2 3 4 5 6 7 8
Experience (X) 16 12 18 4 3 10 5 12
Performance Ratings (Y) 87 88 89 68 78 80 75 83

Calculate the regression line of performance ratings on experience and
estimate the probable performance if an operator has 7 years experience. (8)

b) If the regression lines are given by 3x + 2y = 26 and 6x + y = 31, find

i) the mean values of x and y.

ii) the coefficient of correlation between x and y.

iii) the estimated value of y for x = 0 and the value of x when y = 13.

4. a) What do you mean by index number? What is the significance and need of
index numbers? How do we construct index numbers? (6)

b) Calculate Fisher's ideal index from the given data. Does it satisfy the time
reversal and factor reversal tests? (8)

Commodity Price Quantity Price Quantity
(year 2012) (year 2012) (year 2013) (year 2013)
A 50 10 56
+:

B 100 2 r20

C m 6 60

D l0 30 T2 24

E 8 40 12 36

1M6113 (2)
I 5. a) The p;obability that a boy will not pass MBA examination
,2
rst and the

3
I

probability that a girl will not pass the MBA examination is Find the
5'
I

probability that only one ofthem will pass the examination. (6)
I

b) Explain the following terms with appropriate examples. (8)

i) Mutually exclusive events

ii) Conditional probability

iii) Independent & dependent events

iv) Bayes' theorem

6. a) The Shareholders Research Centre of India has recently conducted a research
study on the price behaviour of three leading industrial shares, A, B and C, for the
period 2011 to 2017, the results of which are published as follows in its
quarterly journal.

Share Average price (Rs.) Standard Deviation Current selling price (Rs.)
A 18.0 5.4 36.0
B 22.5 4.5 34.75
C 24.0 6.0 39.00

i) Which share, in your opinion, appears to be more stable in value?

ii) If you are the holder of all three shares, which one would you like to
dispose of at present, and why? (8)
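Stability is usually compared through the coefficient of variation (standard deviation as a percentage of the mean), and one common way to judge which share is most overpriced is the standardised distance of the current price from its average. The sketch below takes that approach as an assumption; the question itself does not prescribe a method, and the variable names are illustrative.

```python
shares = {
    # name: (average price, standard deviation, current selling price), in Rs.
    "A": (18.0, 5.4, 36.0),
    "B": (22.5, 4.5, 34.75),
    "C": (24.0, 6.0, 39.00),
}

# Coefficient of variation in percent: lower means more stable.
cv = {name: 100 * sd / mean for name, (mean, sd, _) in shares.items()}

# How many standard deviations the current price sits above the average.
z_now = {name: (price - mean) / sd for name, (mean, sd, price) in shares.items()}

most_stable = min(cv, key=cv.get)
most_above_average = max(z_now, key=z_now.get)

if __name__ == "__main__":
    print({k: round(v, 1) for k, v in cv.items()}, most_stable)
    print({k: round(v, 2) for k, v in z_now.items()}, most_above_average)
```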

b) What do you mean by central tendency? What are the different measures of
central tendency? Explain appropriate situations to use different measures of
central tendency. (6)

1M6113 (3)
Section - B (Compulsory)

7. a) The average monthly sales of 5000 firms are normally distributed. Its mean
and standard deviation are Rs. 36000 and Rs. 10000 respectively. Find

i) the number of firms the sales of which are over Rs. 40,000.

ii) the percentage of firms the sales of which will be between Rs. 38,500
and Rs. 41,000.

iii) the number of firms the sales of which will be between Rs. 30,000 and
Rs. 40,000. The relevant extract of the area table (under the normal curve)
is given below.

z 0.25 0.40 0.5 0.6
Area 0.0987 0.1554 0.1915 0.2257
(8)
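The three parts reduce to converting each sales figure to z = (x - mean) / s.d. and reading areas between 0 and z from the extract. The sketch below uses only the four table values given in the question; the dictionary and helper name `area` are illustrative.

```python
MEAN, SD, FIRMS = 36000, 10000, 5000

# Area under the standard normal curve between 0 and z, from the given extract.
AREA_0_TO_Z = {0.25: 0.0987, 0.40: 0.1554, 0.5: 0.1915, 0.6: 0.2257}


def area(x: float) -> float:
    """Area between the mean and x, looked up from the extract by |z|."""
    z = abs(x - MEAN) / SD
    return AREA_0_TO_Z[round(z, 2)]


# i) sales over 40,000: upper tail beyond z = 0.40
firms_over_40k = FIRMS * (0.5 - area(40000))

# ii) between 38,500 and 41,000: both above the mean, so subtract the areas
pct_38500_to_41000 = 100 * (area(41000) - area(38500))

# iii) between 30,000 and 40,000: on opposite sides of the mean, so add the areas
firms_30k_to_40k = FIRMS * (area(30000) + area(40000))

if __name__ == "__main__":
    print(round(firms_over_40k), round(pct_38500_to_41000, 2), firms_30k_to_40k)
```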
b) A manufacturing firm produces steel pipes in three plants with daily production
volumes of 500, 1000 and 2000 units respectively. According to past
experience, it is known that the fractions of defective output produced by
the three plants are respectively .005, .008 and .010. If a pipe is selected from
a day's total production and found to be defective, what is the probability
that it came from the first plant? (6)

1M6113 (4) [Contd....



A unit of Realwaves (P) Ltd

Z test (Normal Probability Distribution)


Z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879

0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2703 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389

1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319

1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767

2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936

2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990

Vidhyadhar Nagar: F – 45, Balaji Tower – I, Behind Vishal Mega Mart.


Mansarovar: 418, Mansarovar plaza, Maddhyam Marg
Contact: 9829959536,7737733360,9928001210
Roll No. Total Printed Pages:
2M5106
M. B. A. II Sem. (Main/Back) Examination, July - 2014
M-206A Research Methods in Management
Time : 3 Hours] [Total Marks : 70
[Min. Passing Marks : 28

(1) The question paper is divided in two sections.
(2) There are sections A and B. Section A contains 6 questions out
of which the candidate is required to attempt any 4 questions.
Section B contains short case study / application base 1 question
which is compulsory.
(3) All questions are carrying equal marks.

2. NIL

SECTION - A

Describe different types of Research in brief. Define the main
problems which are encountered by a researcher in business
research. 8+6

What is the significance of sample selection in Research? Describe
and distinguish probability and non-probability sampling.
6+8

"Processing and analysis of data implies Editing, Coding,
Classification and Tabulation". Describe these operations in brief,
pointing out the importance of each in context of research study.
14

2M5106 1 [Contd...

Write short notes on the following :

(a) Different sources of Secondary Data.

(b) SPSS and its use in Data Analysis.

8+6

A consumer goods manufacturing company wants to test
whether its three salesmen X, Y and Z tend to make sales of the same
size or whether they differ in their selling ability as measured by the
average size of their sales. During the last week, out of 12 sales
calls, X made 4, Y made 3 and Z made 5. The following are the
weekly sales records of the three salesmen in 100.

Determine whether the three salesmen's average sales differ in
size, taking 5% level of significance. Table value of F at 0.05 level
and 2 and 9 degrees of freedom is 4.26.
14

Give the format of Research Report in brief describing its elements
with example.
14
SECTION - B

The following table gives the number of good and defective parts
produced by each of the three shifts in a factory. Using a 5% level
of significance, find if there is any association between the shift and
the quality of the part produced.
Given value of χ² at 5% level of significance is 5.991.
Shift Good Defective Total
Day 900 130 1030
Evening 700 170 870
Night 400 200 600
Total 2000 500 2500
14
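The test of association compares each observed count with the count expected under independence, (row total x column total) / grand total, and sums (O - E)² / E over all cells. A minimal sketch in plain Python follows; `chi_square_independence` is a hypothetical helper name.

```python
def chi_square_independence(observed):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof


# Rows: Day, Evening, Night shifts; columns: Good, Defective.
observed = [[900, 130], [700, 170], [400, 200]]
chi2, dof = chi_square_independence(observed)
CRITICAL_5_PERCENT = 5.991      # table value given in the question

if __name__ == "__main__":
    print(round(chi2, 2), dof, chi2 > CRITICAL_5_PERCENT)
```

A statistic above the table value leads to rejecting the hypothesis that shift and quality are independent.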


Roll No. Total No. of Pages:
2M5106
M. B. A. II Sem. (Main / Back) Exam., July-August 2015
M-206 Research Methods in Management

Time: 3 Hours Maximum Marks: 70
Min. Passing Marks: 28
Instructions to Candidates :

(i) The question paper is divided in two sections.
(ii) There are sections A & B. Section A contains 6 questions out of
which the candidate is required to attempt any 4 questions.
Section B contains short case study / application based question
which is compulsory.
(iii) All questions carry equal marks.
1. NIL 2. NIL

SECTION - A
Q.1 (a) What do you mean by research? Explain various types of research. [7]
(b) What is a research problem? Define the main issues which should receive the
attention of the researcher in formulating the research problem. Give suitable
examples to elucidate your points. [7]
Q.2 Briefly describe the different steps involved in a research process. Discuss the
different methods which are adopted for the purpose of Research Design. [14]

Q.3 (a) What do you mean by 'sampling'? State the various methods of sampling. [7]
(b) Design a questionnaire to study the buying behaviour of consumers in a shopping
mall. [7]

[2Ms1o6] Page 1 of3 1246al


Q.4 (a) "Processing of data implies editing, coding, classification and tabulation".
Describe in brief these four operations pointing out the significance of each in
context of research study. [7]
(b) Write short notes on:
1. Use of SPSS in data analysis [3]
2. Parametric and non-parametric tests [4]

From the data given below about the treatment of 250 patients suffering from a
disease, state whether the new treatment is superior to the conventional treatment (for
degree of freedom = 1, chi-square 5 percent = 3.84): [14]

Treatment Favourable Not favourable Total
New 140 30 170
Conventional 60 20 80
Total 200 50 250

The sales data of an item in six shops before and after a special promotional campaign
are as under:

Shops A B C D E F
Before campaign 53 28 31 48 50 42
After campaign 58 29 30 55 56 45

Can the campaign be judged to be a success? (5% significance level, table value
= 2.57)
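Because the same six shops are measured twice, a paired t-test on the differences (after minus before) is the natural approach: t = mean(d) / (s_d / sqrt(n)) with n - 1 degrees of freedom. The sketch below assumes that test and compares against the table value 2.57 quoted in the question; variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

before = [53, 28, 31, 48, 50, 42]
after = [58, 29, 30, 55, 56, 45]

# Shop-by-shop change in sales.
diffs = [a - b for a, b in zip(after, before)]

n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)                    # sample standard deviation (n - 1 divisor)
t_stat = d_bar / (s_d / sqrt(n))
dof = n - 1
TABLE_VALUE = 2.57                    # as quoted in the question

if __name__ == "__main__":
    print(diffs, round(t_stat, 3), dof, t_stat > TABLE_VALUE)
```

A calculated t above the table value indicates a significant increase in sales after the campaign.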
[?M5,106] Page 2 of3 1246al

Write short notes on:

a. Layout of Research Report [7]

b. Bibliography and Annexure in the report [7]

Q.7 To assess the significance of possible variation in performance in a certain test
between the convent schools of a city, a common test was given to a number of
students taken at random from the senior fifth class of each of the four schools concerned.
The results are given below. Make an analysis of variance of the data.

(Table value = 3.24) [14]

School

A B C D
o
() t?, \u,I t8 13 {,& q
10 11 .t a.! i2 \qt4 e 8l
t2 e .6t 16 2& Lz tq Ll

8 14 l8u 6 -34 16 2st


7 46 8 6.1 15 )-Lf

[2Ms105] Page 3 of3 124681


Roll No. Total No. of Pages:
2M5106
M. B. A. II Sem. (Main / Back) Exam., June-July 2016
M-206 A Research Methods in Management

Time: 3 Hours Maximum Marks: 70
Min. Passing Marks: 28
Instructions to Candidates :

(i) The question paper is divided in two sections.
(ii) There are sections A & B. Section A contains 6 questions out of
which the candidate is required to attempt any 4 questions.
Section B contains short case study / application based question
which is compulsory.
(iii) All questions carry equal marks.

SECTION - A
Q.1 (a) Explain the concept of research and its application in various functions of
management. [7]
(b) What are the different types of business problems encountered by the
researcher? [7]
Q.2 What do you mean by research design? Explain various methods of research
design. [14]
Q.3 (a) What do you understand by primary and secondary data? Explain the various
methods of collection of primary data and sources of secondary data. [7]
(b) What is a questionnaire? What is the difference between a questionnaire and a schedule?
What precautions should be taken in drafting a good questionnaire? [7]
Q.4 (a) Explain sub-divided bar diagrams and Pie-Diagrams with illustration and their

(b) Distinguish between parametric and non-parametric tests. Give advantages of a
non-parametric test. [7]

[2M5106] Page 1 of 2 [2420]

Q.5 (a) What precautions should be taken in preparing the research report? [7]
(b) Write a short note on thesis. [7]

Q.6 (a) A certain medicine given to each of 12 patients resulted in the following
increase of blood pressure. Can it be concluded that the medicine will in general
be accompanied by an increase in blood pressure?
+5, +2, +8, -1, +3, 0, +6, -2, +1, +5, 0, +4. [7]
(t.05 for d.f. = 11 is 2.201)
(b) How many pairs of items should be included in a sample so that for r = +.42, the
calculated value of t may be more than 2.72? [7]
OR
The marks obtained in an examination follow the normal distribution with mean 180
and standard deviation 40. If 10,000 students appeared at the examination, [14]
(a) Calculate the number of students scoring between 140 and 150 marks,
(b) Lowest marks of 1000 toppers
(c) Highest marks of 500 worst performers.
[Z(P = 0.4) = 1.28]

SECTION B
Case Study

Q.7 The following table gives the yields on 15 sample fields under three varieties of seeds
(viz. A, B, C): [14]

A B C
20 18 25
21 20 28
23 17 22
16 25 28
20 15 32
Test at 5% level of significance whether the average yields of land under different
varieties of seed show significant differences. (Table value of F at 5% level for ν1 = 2
and ν2 = 12 = 3.88)
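A one-way analysis of variance splits the total variation into a between-varieties part and a within-varieties part and compares their mean squares. The sketch below assumes the yields read A: 20, 21, 23, 16, 20; B: 18, 20, 17, 25, 15; C: 25, 28, 22, 28, 32 (the scan carries handwriting over the table, so these readings are an assumption); `one_way_anova` is a hypothetical helper name.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]

    # Sum of squares between groups and within groups.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


yields = [
    [20, 21, 23, 16, 20],   # variety A
    [18, 20, 17, 25, 15],   # variety B
    [25, 28, 22, 28, 32],   # variety C
]
f_ratio, df1, df2 = one_way_anova(yields)
TABLE_F = 3.88              # 5% value for (2, 12) d.f., as given in the question

if __name__ == "__main__":
    print(round(f_ratio, 3), df1, df2, f_ratio > TABLE_F)
```

The same helper applies to the other analysis-of-variance questions in these papers once their data tables are legible.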

[2Ms 106] Page 2 ofZ 1?42ol


Roll No. Total No. of Pages:

2M5106
M.B.A. II Sem. (Main/Back) Exam., May - 2018
M-206A Research Methods in Management

Time: 3 Hours Maximum Marks: 70
Min. Passing Marks: 28
Instructions to Candidates :
(i) The question paper is divided in two sections.
(ii) There are sections A & B. Section A contains 6 questions out of which the
candidate is required to attempt any 4 questions. Section B contains short
case study / application based question which is compulsory.
(iii) All questions carry equal marks.
1. NIL 2. NIL
SECTION - A
Q.1 (a) Discuss the various methods of research. [7]
(b) Briefly describe the different steps involved in research process. [7]
Q.2 (a) What are the important concepts relating to research design? Explain. [7]
(b) What do you mean by Sample? Discuss the various types of Sampling
techniques. [7]
Q.3 Write short notes on the following: [5+5+4=14]
Primary and Secondary data.

[2M5106] Page 1 of 3 [2900]

Q.4 What is testing of hypothesis? Explain how it is useful for illustrating a research
problem with examples. [4+10=14]

Q.5 (a) What is the Chi-square test? Explain its significance in the statistical analysis of any
research problem. [7]

(b) Suppose that the thickness of a part used in a semiconductor is its critical
dimension and that measurements of the thickness of a random sample of 18 such
parts have the variance s² = 0.68, where the measurements are in thousandths of an
inch. The process is considered to be under control if the variation of the
thickness is given by a variance not greater than 0.36. Assuming that the
measurements constitute a random sample from a normal population, test the null
hypothesis σ² = 0.36 against the alternative σ² > 0.36 at the α = 0.05 significance
level. [7]

Q.6 Draft the layout of a Research Report. [14]


SECTION - B

Case Study [14]

A common admission test was conducted in four colleges. 5 students were selected at
random from each college and the marks scored by them are given below. Make an
analysis of variance.

[2M5106] Page 2 of 3 [2900]
Sample X1 Sample X2 Sample X3 Sample X4

15 20 11 14

18 24 15

20 25 L7 zs
\
lfrn )

.:
t3 \.'.
;

24 18 t9

t3 13 18 zzt[ul
ir
l*,

[2M5106] Page 3 of 3 [2900]

A unit of Realwaves (P) Ltd


Z test (Normal Probability Distribution)
Z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879

0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7 0.2580 0.2611 0.2642 0.2673 0.2703 0.2734 0.2764 0.2794 0.2823 0.2852
0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389

1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319

1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767

2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936

2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986

3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990

Branches:
(1) Vidhyadhar Nagar (2) Mansarovar (3) Lal Khothi 9.1
144

A unit of Realwaves (P) Ltd


T Test

Branches:
(1) Vidhyadhar Nagar (2) Mansarovar (3) Lal Khothi 9.2
145

A unit of Realwaves (P) Ltd


Chi-Square Test

Branches:
(1) Vidhyadhar Nagar (2) Mansarovar (3) Lal Khothi 9.3
146

A unit of Realwaves (P) Ltd


F-test 5%

Branches:
(1) Vidhyadhar Nagar (2) Mansarovar (3) Lal Khothi 9.4
147

A unit of Realwaves (P) Ltd

F-test 1%

Branches:
(1) Vidhyadhar Nagar (2) Mansarovar (3) Lal Khothi 9.5
148

A unit of Realwaves (P) Ltd


Spearmans Rank Correlation

Branches:
(1) Vidhyadhar Nagar (2) Mansarovar (3) Lal Khothi 9.6
