Exercises On Introduction To Statistics

This document is a practice book on quantitative methods and business statistics for students of Bangalore University. It contains 10 chapters that cover topics such as introduction to business statistics, classification and tabulation of data, measures of central tendency, variation, skewness, correlation, regression, index numbers, formulas and assignments. The book provides concepts, formulas, examples and exercises to help students learn and apply statistical techniques.
Copyright
© All Rights Reserved
STATISTICS

SIMPLIFIED
A Quantitative Methods practice book for BBA II Semester
Students of Bangalore University

Compiled by
Lenin Arumanayagam, Freelance Faculty
Contains Concepts, Formulae, Exercises and Assignments
Table of Contents

Sl. No. Title

1 Introduction to Business Statistics
2 Classification and Tabulation
3 Diagrammatic Representation
4 Measures of Central Tendency
5 Measures of Variation
6 Measures of Skewness
7 Correlation & Regression
8 Index Numbers
9 Formulae
10 Assignments
11 University Question Papers


Chapter 1: Introduction to Business Statistics
Murray R. Spiegel: "Statistics is concerned with scientific methods for collecting, organizing, summarizing,
presenting and analyzing data as well as drawing valid conclusions and making reasonable decisions on the
basis of such analysis."

Characteristics of Statistics
1. Statistics are numerical facts
2. Statistics are aggregate of facts
3. Statistics are affected to a great extent by multiplicity of factors
4. Statistics are either enumerated or estimated with reasonable standard of accuracy
5. Statistics are collected in a systematic manner and for a predetermined purpose
6. Statistics should be capable of being placed in relation to each other

Functions of Statistics
1. Presents facts in simple forms
2. Reduces the complexity of data
3. Facilitates comparison
4. Testing hypothesis
5. Formulation of policies
6. Forecasting and estimating
7. Derives valid inferences

Limitations of Statistics
1. Statistics does not study the qualitative phenomenon
2. Statistics does not study the individual changes
3. Statistics results are true only in general and on an average
4. Statistics can be misused by ignorant and wrongly motivated persons
5. Statistics does not reveal the entire story
6. Statistics is liable to be misused

Scope of Statistics
Statistics and Planning; Statistics and Business; Statistics and Economics; Statistics and Administration;
Statistics and Business Management; Statistics and Research; Statistics and Mathematics; Statistics and
Science. Scope of Statistics in Business: Marketing; Production; Finance; Banking; Investment; Purchase;
Accounting; Control

Data Collection
Data: Facts and figures collected for a specific purpose, processed and used to help decision-making.
Census: The method of collection of data in which every unit of the population is included. This method is
accurate and reliable but expensive, time consuming and involves much labor.
Sample: A sample is a group of units selected from a larger group (the population) for specific investigation.
Primary Data: Data collected for the first time, directly from the source, are called primary data. They may
be obtained through direct observation, interviews, questionnaires, surveys, etc.
Secondary Data: Data already collected by someone other than the user are called secondary data. They may
be obtained from newspapers, agencies, journals, records, reports, etc.

Business Statistics | Concepts and Exercises Page | 1


Chapter 2: Classification & Tabulation of Data
Data is a collection of any number of related observations on one or more variables. Raw data is information
that has not been processed to be made presentable or analyzed by statistical methods.

Classification of data is the process of arranging data into sequences and groups or classes according to their
attributes or characteristics. It refers to the sorting of a heterogeneous mass of data into a number of
homogeneous groups and sub-groups.

Tabulation is defined as the orderly or systematic presentation of numerical data in rows and columns,
designed to facilitate comparison between the figures.

Parts of a Table: Table number, Title, Titles of rows, columns, sub-rows and sub-columns, Totals, Footnotes
and Source.

Objectives and Functions of Classification of Data


1. To convert the raw data into organized data
2. To present complex data in a simple form
3. To facilitate comparison
4. To bring out the uniformity among facts
5. To present data in a condensed form

Types of Classification
Qualitative Classification: A classification in which data are classified according to attributes or qualities.
Qualitative phenomena are generally not measurable. E.g. classification based on marital status, gender,
etc.
Quantitative Classification: A classification in which data are classified according to quantities that are
measurable such as age, weights, marks, wages, etc.

Other Important Definitions:


Individual Observations/series: Data that are listed as they are observed, collected and recorded. They are in
a raw form and unorganized.
Discrete Classes: Classes in which the data do not progress from one class to the next without a break are
called discrete classes. In other words, they are classes that represent distinct categories or counts.
Continuous Data: Data that may progress from one class to the next without a break and may be expressed in
whole numbers or decimals.
Frequency: Frequency is the number of times each value of the variable occurs in the series. It is the rate of
occurrence of a particular value, thing, or event.
Frequency Distribution: It is the summary of frequency of variables according to their magnitude individually
or in groups.
Cumulative Frequency: It is the total of all the frequencies up to and including the respective class interval
when the class intervals are in the ascending or descending order of values.
Population: A collection of all the elements we are studying and about which we are trying to draw
conclusions.
Sample: A collection of some, but not all, of the elements of the population under study, used to describe the
population.
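
The definitions of frequency, frequency distribution and cumulative frequency can be illustrated with a short Python sketch. It uses the data of Exercise 2.1, question 4 (number of children in 50 families). The variable names are illustrative choices, not part of the book.

```python
from collections import Counter
from itertools import accumulate

# Number of children in 50 families (data from Exercise 2.1, question 4).
children = [
    4, 2, 0, 2, 3, 2, 2, 1, 0, 2, 3, 5,
    1, 1, 4, 2, 1, 3, 4, 2, 6, 1, 2, 2,
    2, 1, 3, 4, 1, 0, 2, 4, 3, 0, 1, 3,
    6, 1, 0, 1, 1, 3, 4, 1, 0, 1, 2, 2,
    2, 5,
]

counts = Counter(children)                  # value -> number of occurrences
values = sorted(counts)                     # distinct values in ascending order
frequencies = [counts[v] for v in values]   # frequency of each value
cumulative = list(accumulate(frequencies))  # running total up to and including each value

print("x   f   c.f.")
for x, f, cf in zip(values, frequencies, cumulative):
    print(f"{x:<3} {f:<3} {cf}")
```

The frequencies come out as 6, 13, 14, 7, 6, 2, 2, which matches the answer given in the exercise, and the last cumulative frequency equals the total number of families (50).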

Two-Way Table: A table which is used to categorize the data based on two attributes simultaneously.

Exercise 2.1
1. Draw a blank table to present the following information regarding the students of a college according to:

a. Faculty: Arts, Science and Commerce
b. Sex: Boys and Girls
c. Years: 1993 and 1994
d. Age Group: Below 20 years and above 20 years.

2. The total number of accidents in Southern Railway in 1960 was 3,500 and it decreased by 300 in 1961 and
by 700 in 1962. The total number of accidents in meter gauge section showed a progressive increase from
1960 to 1962. It was 245 in 1960, 346 in 1961 and 428 in 1962. In the meter gauge section, the number of
non-compensated cases were 49 in 1960, 77 in 1961, and 108 in 1962. The number of compensated cases
in the broad gauge section were 2,867, 2,687 and 2,152 in those years in order. Tabulate the data.

3. Present the following information in a suitable form supplying the figures not directly given:
In 1975, out of a total of 4,000 workers in a factory, 3,300 were members of a trade union. The number of
women workers employed was 500 out of which 400 did not belong to any union. In 1974, the number of
workers in the union was 3,450 of which 3,200 were men. The number of non-union workers was 760 of
which 330 were women.

4. Following data gives the number of children in 50 families. Construct a suitable frequency table.

4 2 0 2 3 2 2 1 0 2 3 5
1 1 4 2 1 3 4 2 6 1 2 2
2 1 3 4 1 0 2 4 3 0 1 3
6 1 0 1 1 3 4 1 0 1 2 2
2 5 (Answer: 6, 13, 14, 7, 6, 2, 2)

5. Following are the weights of 50 college students in kg. Construct a frequency table.

42 42 46 54 41 37 54 44 38 45 47 50

58 49 51 42 46 37 42 39 54 39 51 58

47 51 43 48 49 48 49 41 41 40 58 49

49 59 57 52 56 38 45 52 46 40 51 41

51 41 (Answer: 6, 13, 14, 11, 6)

6. Following are figures of income (x) and percentage expenditure on food (y) in 25 families. Construct a
bivariate (two-way) frequency table.

X 550 623 310 420 600 225 310 640 512 690
Y 12 14 18 16 15 25 26 20 18 12
X 680 300 425 555 325 202 255 492 587 643
Y 13 25 16 51 23 29 27 18 21 19
X 689 523 317 384 400
Y 11 12 18 17 19


Chapter 3: Diagrammatic Representation
Diagrams are visual aids for presenting data in pictures, geometric figures and curves. They present a bird’s
eye view of a huge mass of quantitative data in a condensed and attractive form.

Uses of Diagrams and Graphs


1. They present a bird’s eye view of a huge mass of information
2. They leave a lasting impression on the minds of the readers as they are attractive
3. They are easy to understand and take less time to interpret
4. The entire data is visible at a glance

Limitations of Diagrams and Graphs


1. They are useful to laymen, but their utility to experts is limited
2. They fail to furnish details
3. They present data only in a particular range.
4. They are not subject to further mathematical analysis

Types of Diagrams and Graphs:


Note: For examples and sample diagrams, please refer to your textbook.

Line Diagrams: These diagrams are used when there is a large number of values of the variable, with
variations in their values within a small range.
Simple Bar Diagram: These diagrams are suitable for individual observations and time series. The bars have
a uniform width.
Multiple Bar Diagram: These diagrams are used when two or more phenomena and a number of attributes
are compared with each other. Different shades may be used to identify the various attributes or periods.
Sub-divided (Component) Bar Diagrams: These diagrams are used when two or more components are present
in a single phenomenon.
Sub-divided Percentage Bar Diagrams: These are sub-divided diagrams which are used to depict the values of
variable in percentage. All the bars are equal in height representing the value as 100%.
Pie Charts (Circular Diagram): This is a pictorial representation of statistical data with several components in a
circular form. Pie charts consist of a circle sub-divided into several sectors by radius.
Pictograms: It is a representation in which pictures are used to represent the data. Each full diagram
represents a certain quantity.
Histograms: Histogram is a device of graphical representation of a frequency distribution. It is constructed by
erecting a set of rectangles on each interval on the horizontal axis. The height of the rectangle represents the
frequency of the class interval.
Frequency Polygon: A line graph connecting the midpoints of each class in a data set, plotted at the height
corresponding to the frequency of the class. It can also be drawn by joining the midpoints of the top of the
vertical bars of a histogram.
Frequency Curve: A frequency polygon with smoothed curve to eliminate the accidental irregularities in the
data.

Ogive Curve: This is a graphical representation of cumulative frequency distribution of a continuous series.
There are two types of Ogive Curves: 1. More-than Ogive and 2. Less-than Ogive.
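
To draw a pie chart by hand, each component is first converted into the central angle of its sector, in proportion to its share of the total (component ÷ total × 360°). A minimal Python sketch of that arithmetic, using the figures of Exercise 3.1, question 5; the dictionary and variable names are illustrative only.

```python
# Investment pattern in the state budget, in ₹ crore (Exercise 3.1, question 5).
investment = {
    "Agriculture": 600,
    "Industry": 400,
    "Education": 300,
    "Transportation": 450,
    "Social Services": 250,
}

total = sum(investment.values())

# Central angle of each sector: share of the total, scaled to 360 degrees.
angles = {sector: amount * 360 / total for sector, amount in investment.items()}

for sector, angle in angles.items():
    print(f"{sector:<16} {angle:6.1f} degrees")
```

The angles should add up to 360°, which is a quick check before drawing the sectors with a protractor.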

Exercise 3.1
1. Draw a simple bar diagram from the following data relating to the number of small scale industrial units in
various states in the year 2008

States Karnataka TN Kerala Andhra Maharashtra MP UP

No. of SS Units 10 12 15 15 18 25 22

2. Present the following data of results of BBM students in statistics examination of Bangalore University
held in June 2006, 2007 and 2008 using multiple bar diagram:

Year I Class II Class III Class Failed Total

June 2006 100 300 500 300 1200

June 2007 120 400 600 280 1400

June 2008 100 500 700 300 1600

3. Represent the following data using sub-divided bar diagram and percentage sub-divided bar diagram:

Cost Per Equipment 2006 (₹) 2007 (₹) 2008 (₹)

Raw Material 2,160 2,600 2,700

Labor 540 700 810

Direct Expenses 600 300 350

Factory Expenses 360 200 360

Office Expenses 180 200 270

Total 3,840 4,000 4,490

4. Represent the following figures using line graph:

Year 2003 2004 2005 2006 2007 2008

Exports (lakh ₹) 25 110 80 130 90 150

Imports (lakh ₹) 5 70 110 90 140 130

Balance of Payments +20 +40 -30 +40 -50 +20

5. Draw a pie diagram to represent the following data of investment pattern in the state budget (in ₹ crore):

Agriculture Industry Education Transportation Social Services

600 400 300 450 250


Chapter 4: Measures of Central Tendency
A statistical average or measure of central tendency is a single number around which the greatest proportion
of the data concentrates.

Characteristics of a Good Measure of Central Tendency


1. It should be well defined.
2. It should be easy to understand and calculate.
3. It should be based on all the observations.
4. It should be capable of further treatment.
5. It should be affected as little as possible by fluctuations of sampling.
6. It should not be affected by extreme values.

Commonly Used Measures of Central Tendency


1. Arithmetic Mean or Simple Mean
2. Median
3. Mode
4. Geometric Mean
5. Harmonic Mean

Arithmetic Mean
A mathematical representation of the typical value of a series of numbers, computed as the sum of all the
numbers divided by the count of all numbers in the series.

Merits of Arithmetic Mean


1. It is simple to understand and easy to compute.
2. All items are used in calculation.
3. Mean is well defined.
4. It is capable of further algebraic treatment.
5. It is least affected by sampling fluctuations.
6. It is the center of gravity.
7. It is a calculated value and not based on position in the series.

Limitations / Demerits of Arithmetic Mean


1. The value of mean is affected by extreme items.
2. In case of open ended classes, the value of mean cannot be calculated without making assumptions
regarding the size of the interval.
3. It may not be a good measure in some cases, for instance, asymmetrical distributions.

Formulae
Individual Series: $\bar{X} = \dfrac{\sum x}{n}$; $\bar{X} = A + \dfrac{\sum d}{n}$

Discrete Series: $\bar{X} = \dfrac{\sum fx}{N}$; $\bar{X} = A + \dfrac{\sum fd}{N}$

Continuous Series: $\bar{X} = \dfrac{\sum fm}{N}$; $\bar{X} = A + \dfrac{\sum fd}{\sum f}$; $\bar{X} = A + \dfrac{\sum fd'}{N} \times i$

Weighted Arithmetic Mean: $\bar{X} = \dfrac{\sum xw}{\sum w}$

Combined Arithmetic Mean: $\bar{X}_{(1,2)} = \dfrac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$
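
The direct, shortcut and step-deviation methods for a continuous series always give the same mean; they differ only in the amount of arithmetic. The following Python sketch checks this on the data of Exercise 4.1, question 6. The assumed mean A = 35 is an illustrative choice (any mid-value would do), and the variable names are not from the book.

```python
# Continuous series from Exercise 4.1, question 6: class limits and frequencies.
classes = [(0, 10), (10, 20), (20, 30), (30, 40),
           (40, 50), (50, 60), (60, 70), (70, 80)]
freqs = [5, 10, 25, 30, 20, 10, 5, 5]

mids = [(lo + hi) / 2 for lo, hi in classes]  # mid-values m
N = sum(freqs)                                # total frequency

# Direct method: mean = sum(f * m) / N
mean_direct = sum(f * m for f, m in zip(freqs, mids)) / N

# Shortcut method: mean = A + sum(f * d) / N, where d = m - A
A = 35  # assumed mean (illustrative choice)
mean_shortcut = A + sum(f * (m - A) for f, m in zip(freqs, mids)) / N

# Step-deviation method: mean = A + (sum(f * d') / N) * i, where d' = (m - A) / i
i = 10  # common class width
mean_step = A + sum(f * (m - A) / i for f, m in zip(freqs, mids)) / N * i

print(round(mean_direct, 2), round(mean_shortcut, 2), round(mean_step, 2))
```

All three values round to 36.36, the answer given for that exercise.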

Exercise 4.1
1. Find the AM of 5, 8, 10, 15, 24 and 28 (Answer: 15)

2. The wages of 9 workers are: 150, 80, 120, 60, 75, 125, 95, 115, 130. Find the mean wages. (Answer: 105.56)

3. In the city, 30 members were surveyed as to how many domestic appliances they had purchased and the
replies were as under. Prepare a frequency table and find the mean. (Answer: 2.83)

1, 2, 5, 1, 2, 1, 4, 2, 3, 4, 2, 4, 3, 2, 6, 3, 2, 4, 3, 6, 2, 2, 3, 3, 7, 2, 3, 0, 2, 1

4. Find the mean runs scored by a batsman during his career using direct method and shortcut method
(Answer: 46):

x 10 20 30 40 50 60 70 80 90

f 7 18 15 25 30 20 16 7 2

5. Compute the mean of the following data using direct method and shortcut method (Answer: 13.54):

x 9 10 11 12 13 14 15 16 17 18

f 1 2 3 6 10 11 7 3 2 1

6. Calculate the mean from the following data using direct, shortcut and step-deviation methods (Answer:
36.36):

CI 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60 60 – 70 70 – 80

f 5 10 25 30 20 10 5 5

7. Calculate the mean wages from the following data (Answer: 73.44):

Wages 48 – 56 56 – 64 64 – 72 72 – 80 80 – 88 88 – 96 96 – 104

No. of Workers 8 3 11 14 5 7 2

8. Calculate the mean from the following data using direct, shortcut and step-deviation methods (Answer:
49.3):

CI 10 – 19 20 – 29 30 – 39 40 – 49 50 – 59 60 – 69 70 – 79 80 – 89
f 5 9 14 20 25 15 8 4

9. Calculate the mean marks from the following data (Answer : 43.7):

Marks Below 10 20 30 40 50 60 70 80 90 100

No. of students 5 12 25 45 70 80 88 92 96 100

10. Calculate the mean sales from the following data (Answer: 28.73):

Sales less than 10 20 30 40 50 60

Frequency 4 20 35 55 62 67

11. A college wanted to give monthly scholarship to B.Com students securing 50% and above marks in the
following manner:

Percentage of Marks 50 – 55 55 – 60 60 – 65 65 – 70 70 – 75

Scholarship (₹) 25 30 35 40 45

The percentage of marks of 20 students who were eligible for scholarship are given below:

52, 62, 51, 71, 54, 53, 51, 50, 57, 64, 56, 54, 69, 63, 65, 59, 58, 68, 57, 62

Calculate the average monthly scholarship payable to the students. (Answer: 31.5)

12. A limited company wants to pay bonus to the members of its staff as under:

Salary (₹ ‘000) 100 - 120 120 – 140 140 – 160 160 – 180 180 – 200 200 - 220 Above 220

Bonus (₹ ‘000) 50 50 70 80 90 100 110

Actual salaries of the members of the staff (in ₹ ‘000) are as follows: 200, 180, 185, 195, 218, 187, 160,
250, 198, 190, 168, 170, 178, 175, 140, 120, 148, 165, 155, 145, 125, 110, 162, 130, 150

What is the total bonus paid? What is the average bonus paid per staff member? (Answer: total 1,960; average 78.4)

13. From the following data of calculation of AM, find the missing value. Mean value is 126.3 (Answer: 120):

x 60 80 100 - 160 180 200

f 5 8 12 22 10 7 6

14. The AM of the following frequency distribution is 67.45 inches. Find the missing frequency. (Answer: 126):

Height (Inches) 60 – 62 63 – 65 66 – 68 69 – 71 72 - 74

F 15 54 ? 81 24

15. The mean of the following data is 67.45. Find the missing frequencies (Answer: 42, 27).

CI 60 – 62 63 – 65 66 – 68 69 – 71 72 - 74 Total

F 5 18 - - 8 100

16. The mean of the following data is 25. Find the missing frequencies (Answer: 10, 10).

x 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 Total

f 5 - 15 - 5 45

17. Find the weighted arithmetic average price of coal purchased by an industry (Answer: 50.36):

Month January February March April May June

Price per ton (₹) 42.50 51.25 50.00 52.00 44.25 54.00

No. of tons 25 30 40 50 10 45

18. The mean weight of 25 male workers in a factory is 63 kg, and the mean weight of 35 female workers in
the same factory is 55 kg. Find the combined average weight of the 60 workers in the factory. (Answer:
58.33)

19. The arithmetic mean age of a group of 80 boys is 10 years, and that of a second group of 20 boys is 15 years.
Find the arithmetic mean of the two groups taken together. (Answer: 11)
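
The weighted and combined arithmetic means used in questions 17 to 19 can be checked with a small Python sketch. The function names `weighted_mean` and `combined_mean` are hypothetical helpers written for this illustration.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(x * w) / sum(w)."""
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)


def combined_mean(n1, mean1, n2, mean2):
    """Combined arithmetic mean of two groups: (n1*x1 + n2*x2) / (n1 + n2)."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)


# Question 17: price of coal per ton (values) weighted by tons purchased (weights).
prices = [42.50, 51.25, 50.00, 52.00, 44.25, 54.00]
tons = [25, 30, 40, 50, 10, 45]
print(round(weighted_mean(prices, tons), 2))

# Question 18: 25 male workers with mean 63 kg, 35 female workers with mean 55 kg.
print(round(combined_mean(25, 63, 35, 55), 2))

# Question 19: 80 boys with mean age 10 years, 20 boys with mean age 15 years.
print(combined_mean(80, 10, 20, 15))
```

Note that the weighted mean differs from the simple average of the six prices because the months with larger purchases count for more.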

Median
Median is the middle value of the distribution, and therefore it is called the positional average. So, the place of
median in a series is such that, an equal number of items lie on either side of it.

Merits of Median
1. Median is especially useful in case of open ended classes, since it is not necessary that the value of all
items be known.
2. Median is not influenced by extreme values.
3. In a markedly skewed distribution, median is especially useful.
4. The value of median can be determined graphically, whereas the value of mean cannot be graphically
ascertained.

Limitations / Demerits of Median


1. For calculating median it is necessary to arrange the data, whereas, other averages do not need any
arrangement.
2. It is not determined by each and every observation.
3. Median is not capable of further algebraic treatment.
4. It is affected by sampling fluctuations.
5. In some cases the median cannot be computed exactly, whereas the mean can.

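Formulae
Individual Series: $M = \left[\dfrac{n+1}{2}\right]^{\text{th}}$ term when $n$ is odd, and $M = \dfrac{\left(\frac{n}{2}\right)^{\text{th}} \text{term} + \left(\frac{n}{2}+1\right)^{\text{th}} \text{term}}{2}$ when $n$ is even.

Discrete Series: $M = \left[\dfrac{n+1}{2}\right]^{\text{th}}$ term

Continuous Series: $M = L + \dfrac{\frac{N}{2} - c.f.}{f} \times i$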
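
The median formulae above can be sketched in Python as follows. This is an illustration under stated assumptions: the function names are hypothetical, the continuous-series function expects exclusive classes of equal width, and L, c.f., f and i are read as the lower limit of the median class, the cumulative frequency before it, its frequency and the class width.

```python
def median_individual(data):
    """Median of an individual series: middle term, or mean of the two middle terms."""
    s = sorted(data)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]                # the (n+1)/2-th term
    return (s[mid - 1] + s[mid]) / 2  # mean of the (n/2)-th and (n/2 + 1)-th terms


def median_continuous(lower_limits, freqs, width):
    """Median of a continuous series: L + ((N/2 - c.f.) / f) * i."""
    N = sum(freqs)
    if N == 0:
        raise ValueError("total frequency must be positive")
    half = N / 2
    cf = 0  # cumulative frequency of the classes before the current one
    for L, f in zip(lower_limits, freqs):
        if cf + f >= half:           # first class whose cumulative frequency reaches N/2
            return L + (half - cf) / f * width
        cf += f


# Exercise 4.2, question 1 (odd n) and question 3 (even n).
print(median_individual([43, 62, 15, 80, 56, 72, 34, 8, 25]))
print(median_individual([36, 5, 19, 26, 6, 28, 56, 18, 63, 4]))

# Exercise 4.2, question 20: wages 0-20, 20-40, 40-60, 60-80, 80-100.
print(round(median_continuous([0, 20, 40, 60, 80], [82, 112, 150, 95, 48], 20), 1))
```

For inclusive classes such as 10 – 19, 20 – 29, the lower class boundaries (9.5, 19.5, ...) would be passed instead of the stated lower limits.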

Exercise 4.2
1. Find the median: 43, 62, 15, 80, 56, 72, 34, 8, 25 (Answer: 43)

2. The wages of 9 workers are: 150, 80, 120, 60, 75, 125, 95, 115, 130. Find the median. (Answer: 115)

3. Find the median: 36, 5, 19, 26, 6, 28, 56, 18, 63, 4 (Answer: 22.5)

4. Find the median: 105, 89, 93, 142, 112, 136, 82, 97, 128, 135, 110, 104 (Answer: 107.5)

5. In a class of 15 students, 5 failed in a test. The marks of those who passed were, 9, 6, 7, 8, 9, 6, 5, 4, 7 and
8. Calculate the median marks of the 15 students.

6. Find the median: (Answer: 40)

x 10 20 30 40 50 60 70 80 90 100

f 10 16 18 13 6 3 8 4 6 8

7. Find the median:

Wages 5 10 15 20 25 30

Frequency 7 12 37 25 22 11

8. Find the median (Answer: 37.7):

Age < 20 20 – 25 25 – 30 30 – 35 35 – 40 40 – 45 45 - 50 > 50

No. of Workers 13 29 46 60 112 94 45 21

9. Calculate the median (Answer: 50.3):

CI 10 – 19 20 – 29 30 – 39 40 – 49 50 – 59 60 – 69 70 – 79 80 – 89

f 5 9 14 20 25 15 8 4

10. Calculate the median (Answer: 42.6):

CI 1 – 10 11 – 20 21 – 30 31 – 40 41 – 50 51 – 60 61 – 70 71 - 80 81 - 90

f 3 7 13 19 14 11 9 9 5

11. Calculate the median marks (Answer: 42):

Marks Below 10 20 30 40 50 60 70 80 90 100

No. of students 5 12 25 45 70 80 88 92 96 100

12. Calculate the median (Answer: 36.25):

Value less than 10 20 30 40 50 60 70 80

No. of students 4 16 40 76 96 112 120 125

13. Calculate the median marks (Answer: 30):

Values above 10 20 30 40 50 60

No. of students 50 40 25 16 10 2

14. Calculate the median (Answer: 44.2):

Marks more than 10 20 30 40 50 60 70 80

Frequency 115 103 88 68 43 23 13 3

15. In a group of 1000 wage earners, the monthly wages of 4% are below ₹60 and those of 15% are under
₹62.50. 15% earned ₹95 and over, and 5% got ₹100 and over. Find the median wage (Answer: 78.75).

16. 10% of the workers in a factory employing a total of 1000 workers, earn between ₹5 and 9.99, 30%
between ₹10 and ₹14.99, 250 workers between ₹15 and 19.99 and the rest ₹20 and above. What is the
median wage? (Answer: 17)

17. Compute the median after amending the table (Answer: 14):

x f

Less than 5 7

Less than 10 20

5 – 15 38

15 and above 35

20 – 25 20

25 and above 5

30 and above 1

18. Calculate the median (Answer: 153.79):

Mid values 115 125 135 145 155 165 175 185 195

Frequencies 6 25 48 72 116 60 38 22 3

19. Calculate the median (Answer: 15.81):

Mid values 5.5 9.5 13.5 17.5 21.5 25.5

Frequencies 12 23 40 65 17 3

20. Calculate the median using Ogive curve (Answer: 46.6):

Wages 0 – 20 20 – 40 40 – 60 60 – 80 80 – 100

No. of workers 82 112 150 95 48

21. Locate the median using Ogive curve (Answer: 44):

Marks Less Than 20 30 40 50 60 70

No. of Students 5 13 24 39 52 60

22. Marks of 100 students are given below. If median is 33, find the missing frequencies. (Answer: 17, 16)

Marks 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60 60 – 70
No. of Students 12 15 - 20 - 10 10

23. Find the missing frequencies if the value of median is 36.5 and N = 120. (Answer: 30, 11)

Class Interval 20 – 25 25 – 30 30 – 35 35 – 40 40 – 45 45 – 50 50 – 55 55 – 60
Frequencies 8 15 28 - 22 - 4 2

Mode
According to A M Tuttle, “Mode is the value which has the greatest frequency in the neighborhood.” Like the
median, the mode is a positional average. The item which is repeated the maximum number of times in the
series is the mode of the series.

Merits of Mode
1. Mode is not affected by extremely large or small items.
2. Mode can be determined in open-ended classes without assuming the class limits.
3. The value of mode can be determined graphically, whereas the value of mean cannot be ascertained graphically.

Limitations / Demerits of Mode


1. The value of mode cannot always be determined. For instance, bi-modal and multi-modal series.
2. Mode is not capable of further algebraic treatment.
3. The value of mode is not based on each item.
4. It is not a rigidly defined measure. So it is the most unstable average.

Formulae
Individual Series: The variable that occurs most frequently.

Discrete Series: The value which has the greatest frequency in the neighborhood.

Continuous Series: $Z$ or $M_0 = L + \dfrac{\Delta_1}{\Delta_1 + \Delta_2} \times i$; $\Delta_1 = |f_1 - f_0|$ and $\Delta_2 = |f_1 - f_2|$

Bi-modal Class: $Z$ or $M_0 = 3\,\text{median} - 2\,\text{mean}$
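
A minimal Python sketch of the continuous-series formula and of the empirical relation Z = 3 median − 2 mean is given below. It assumes a single modal class (the class with the highest frequency) and equal class widths, and reads f1, f0 and f2 as the frequencies of the modal class, the class before it and the class after it. The function names are hypothetical.

```python
def mode_continuous(lower_limits, freqs, width):
    """Mode of a continuous series: L + (d1 / (d1 + d2)) * i."""
    k = max(range(len(freqs)), key=lambda j: freqs[j])  # modal class (first one if tied)
    f1 = freqs[k]
    f0 = freqs[k - 1] if k > 0 else 0                   # frequency of the preceding class
    f2 = freqs[k + 1] if k < len(freqs) - 1 else 0      # frequency of the succeeding class
    d1 = abs(f1 - f0)
    d2 = abs(f1 - f2)
    if d1 + d2 == 0:
        raise ValueError("formula is undefined when f0 = f1 = f2")
    return lower_limits[k] + d1 / (d1 + d2) * width


def mode_empirical(mean, median):
    """Empirical relation used when the mode is ill-defined: 3 * median - 2 * mean."""
    return 3 * median - 2 * mean


# Exercise 4.3, question 7: wages 48-56, 56-64, ..., 96-104.
print(mode_continuous([48, 56, 64, 72, 80, 88, 96], [8, 3, 11, 14, 5, 7, 2], 8))

# Exercise 4.3, question 16: classes 0-10, ..., 50-60.
print(round(mode_continuous([0, 10, 20, 30, 40, 50], [14, 23, 35, 20, 8, 5], 10), 2))
```

When two classes share the highest frequency, this sketch simply takes the first of them, so such bi-modal series are better handled by the grouping method or by `mode_empirical`.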

Exercise 4.3
1. Find the mode: 3, 5, 7, 5, 9, 7, 5, 7, 6, 3, 9, 5, 6, 6, 3

2. Find the mode: 54, 66, 42, 64, 44, 86, 104, 94, 100, 80, 72, 64, 64, 44, 64, 72, 54, 54, 48, 52, 50

3. Find the mode: 122, 234, 638, 420, 512, 234, 270, 420, 900, 195, 360

4. Find the mode (Answer: 4):

x 1 2 3 4 5 6
f 2 8 11 18 9 7

5. Compute the mode (Answer: 32):

x 8 16 24 32 40 48
f 2 4 20 19 10 5

6. Calculate the mode (Answer: 8):

x 2 4 6 8 10 12 14
f 6 8 16 16 12 6 4

7. Find the mode (Answer: 74):

Wages 48 – 56 56 – 64 64 – 72 72 – 80 80 – 88 88 – 96 96 – 104
No. of Workers 8 3 11 14 5 7 2

8. Compute the mode (Answer: 52.833):

CI 10 – 19 20 – 29 30 – 39 40 – 49 50 – 59 60 – 69 70 – 79 80 – 89
f 5 9 14 20 25 15 8 4

9. Find the mode (Answer: 11.35):

Attendance below 5 10 15 20 25 30 35 40 45

No. of students 29 224 465 582 634 644 650 653 655

10. Twenty percent of the workers in a firm employing a total of 2000 workers earn less than ₹2.00 per hour,
440 earn from ₹2.00 to ₹2.24 per hour, 24% earn from ₹2.25 to ₹2.49 per hour, 370 earn from ₹2.50 to
₹2.74 per hour, 12% earn from ₹2.75 to ₹2.99 per hour and the rest ₹3.00 or more per hour. Set up a
frequency table and calculate the modal wage. (Answer: 2.3117)

11. Compute the mode (Answer: 40):

CI 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60 60 – 70 70 – 80

f 4 12 24 32 32 16 8 2

12. Compute the mode (Answer: 89.5):

CI 40 – 49 50 – 59 60 – 69 70 – 79 80 – 89 90 – 99 100 – 109 110 – 119

f 7 9 10 6 13 10 13 10

13. Find the mode (Answer: 20):

Weight (in Kg) 5 10 15 20 25 30 35 40


No. of persons 8 19 27 45 24 45 22 10

14. Find the mode (Answer: 59.62):

Weight (in Kg) 45 48 52 56 60 64 68 72 76 80


No. of persons 110 116 116 100 96 96 96 84 72 62

15. Locate the mode using Histogram, Frequency polygon and smoothed frequency curve (Answer:50.71):

Class Interval 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60 60 – 70

Frequencies 5 30 90 180 250 260 130

16. Locate the mode using Histogram, Frequency polygon and smoothed frequency curve (Answer: 24.44):

Class Interval 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60

Frequencies 14 23 35 20 8 5

17. Find the mean, median and mode (Answer: Mean = 151.29, Median = 149.6, Mode = 146.211):

Mid values 115 125 135 145 155 165 175 185 195

Frequencies 120 116 116 100 96 96 96 84 72

18. Find the mean, median and mode:

90, 78, 86, 51, 96, 104, 51, 78, 50, 72, 49, 77, 90, 74, 69, 70, 68, 69, 104, 80, 79, 54, 79, 73, 58, 91, 78, 67,
50, 84, 76, 110, 53, 74, 40, 60, 42, 82, 41, 76, 84, 76, 42, 65, 60, 77, 61, 75, 115, 81

19. a. Z = 50 and M = 45. $\bar{X}$ = ?

b. $\bar{X}$ = 12, Z = 13, M = ?

c. If Mean = 20.2, Median = 22.1, find the mode.


Chapter 5: Measures of Variation
Kafka defines measures of variation as, “the measurement of the scatteredness of the mass of figures in a
series about an average.”

Objectives of Measuring Variation


1. To measure exactly the reliability of an average
2. To serve as the basis for the control of variability
3. To compare two or more series with regard to their variability
4. To facilitate the use of other statistical measures

Properties of a Good Measure of Variation

1. It should be simple to understand.
2. It should be easy to compute.
3. It should be well defined.
4. It should be based on each item of the distribution.
5. It should be capable of further algebraic treatment.
6. It should have sampling stability.
7. It should not be affected by extreme values.

Relative and Absolute Measures of Variation


1. Absolute measures of dispersion are expressed in the same statistical unit in which the original data are
given, such as Rupees, kg, tons etc. These measures may be used to compare the variation in two
distributions if the variables are expressed in the same units, and are of the same average size.

2. If the two sets of data are expressed in different units, such as quintals of sugar versus tons of
sugarcane, or if the average size is very different, such as the manager’s salary versus worker’s wages,
relative measures should be used. Relative measures of dispersion are also called coefficients of
dispersion.

Some important measures of dispersion are discussed below.

Range
Definition
Range is defined as “The difference between the two extreme items of the distribution” or the difference
between the largest and smallest items of the distribution.

Merits of Range
1. Range is simple to understand
2. It is easy to calculate
3. It gives a quick rather than an accurate picture of variability.

Limitations of Range

1. It is not based on each observation
2. It is affected by extreme values in the series
3. It cannot be calculated for open-ended classes
4. It is highly affected by fluctuations of sampling

Uses of Range

1. It is useful in studying the variations in the prices of shares and stock, gold, jewelry etc.
2. In weather forecasts, range is used to determine the difference between the maximum and minimum
temperature.
3. It is used in industries for statistical quality control.

Interquartile Range & Quartile Deviation


Meaning
Inter-quartile range covers the middle 50% of the distribution. In other words, it represents the difference
between the third quartile and the first quartile. Quartile deviation is half of the inter-quartile range.

Merits of Quartile Deviation

1. It is based on 50% of the observations
2. QD can be calculated for open ended classes also, because Q1 and Q3 are positional averages.
3. It is not affected by extreme values.

Limitations of Quartile Deviation


1. It ignores 50% items.
2. It is not a measure of dispersion as it does not show the scatter around an average.
3. It is not capable of further algebraic treatment.
4. It is affected if the central items are irregular.
5. It is highly affected by sampling fluctuations
6. It is not affected by distribution of items outside the two quartiles.

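Formulae
Range: $L - S$ (where $L$ = largest value and $S$ = smallest value)

Coefficient of Range: $\dfrac{L - S}{L + S}$

Interquartile Range: $IQR = Q_3 - Q_1$

Quartile Deviation: $QD = \dfrac{Q_3 - Q_1}{2}$

Coefficient of Quartile Deviation: $CQD = \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$

Quartiles, Individual & Discrete Series: $Q_1 = \left[\dfrac{n+1}{4}\right]^{\text{th}}$ term; $Q_3 = \left[\dfrac{3(n+1)}{4}\right]^{\text{th}}$ term

Quartiles, Continuous Series: $Q_1 = L + \dfrac{\frac{N}{4} - c.f.}{f} \times i$; $Q_3 = L + \dfrac{\frac{3N}{4} - c.f.}{f} \times i$ (here $L$ is the lower limit of the quartile class)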
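
The range and quartile formulae for an individual series can be sketched in Python as below. The helper names are hypothetical. When the (n + 1)/4-th position is fractional, the sketch interpolates linearly between the two neighbouring terms; this is an assumption, chosen because it reproduces the answers printed for Exercise 5.1.

```python
def range_and_coefficient(values):
    """Return (L - S, (L - S) / (L + S)) for the largest value L and smallest value S."""
    L, S = max(values), min(values)
    return L - S, (L - S) / (L + S)


def quartile(sorted_data, k):
    """k-th quartile as the k(n+1)/4-th term, interpolating between neighbouring terms."""
    n = len(sorted_data)
    pos = k * (n + 1) / 4          # 1-based position of the quartile
    whole = int(pos)               # term just below the position
    frac = pos - whole             # fractional part of the position
    if whole < 1:
        return sorted_data[0]
    if whole >= n:
        return sorted_data[-1]
    lower, upper = sorted_data[whole - 1], sorted_data[whole]
    return lower + frac * (upper - lower)


def quartile_deviation(data):
    """Return (QD, coefficient of QD) for an individual series."""
    s = sorted(data)
    q1, q3 = quartile(s, 1), quartile(s, 3)
    return (q3 - q1) / 2, (q3 - q1) / (q3 + q1)


# Exercise 5.1, question 2: the range depends only on the x values, not on f.
print(range_and_coefficient([6, 12, 18, 24, 30, 36, 42]))

# Exercise 5.1, question 4.
qd, cqd = quartile_deviation([30, 43, 48, 89, 54, 25, 84, 61, 67, 37, 72, 80])
print(qd, round(cqd, 3))
```

For question 4 the quartiles fall at the 3.25th and 9.75th terms, giving Q1 = 38.5 and Q3 = 78.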

Exercise 5.1
1. Compute the range and coefficient of range of the following series and state which is more dispersed.

a. 13, 14, 15, 16, 17 b. 9, 12, 15, 18, 21 c. 1, 8, 15, 22, 29


2. Find the range and coefficient of range of the following distribution (Answer: 36, 0.75):

x 6 12 18 24 30 36 42

f 7 18 15 25 30 20 16

3. Compute range and coefficient of range of the following series (Answer: 80, 1):

CI 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60 60 – 70 70 – 80

F 5 10 25 30 20 10 5 5

4. From the following data, calculate the Quartile Deviation and its Coefficient (Answer: 19.75, 0.339)

30, 43, 48, 89, 54, 25, 84, 61, 67, 37, 72, 80

5. Calculate the Quartile Deviation and its Coefficient from the following data (Answer: 1.5, 0.0244):

X 58 59 60 61 62 63 64 65 66

F 15 20 32 35 33 22 20 10 8

6. Compute the Quartile Deviation and its Coefficient from the following data (Answer: 5, 0.25):

Wages 5 10 15 20 25 30

Frequency 7 12 37 25 22 11

7. Calculate the Quartile Deviation and its Coefficient from the following data (Answer: 5.208, 0.2643):

Wages (₹) 4 – 8 8 – 12 12 – 16 16 – 20 20 – 24 24 – 28 28 – 32 32 – 36 36 – 40

No. of workers 6 10 18 30 15 12 10 6 2

8. Calculate the Quartile Deviation and its Coefficient from the following data (Answer: 2.273, 0.2039):

CI 5–7 8 – 10 11 – 13 14 – 16 17 – 19

f 14 24 38 20 4

Mean Deviation
Meaning
It is the average of the absolute differences between the items in a distribution and the mean or median of that series.

Merits of Mean Deviation


1. It is simple to understand and easy to compute.
2. It is based on each item of the data.
3. It is less affected by the values of extreme items than the Standard Deviation.


4. Since deviations are taken from a central value, comparison about the formation of different
distributions can easily be made.

Limitations of Mean Deviation


1. Algebraic signs are ignored.
2. It is not capable of further algebraic treatment.
3. It is rarely used in social science studies.
4. It does not give us accurate results.

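Formulae

Mean Deviation:
Individual Series: $MD = \dfrac{\Sigma |D|}{n}$, where $|D| = |x - \bar{x}|$ or $|x - M|$
Discrete Series: $MD = \dfrac{\Sigma f|D|}{N}$, where $|D| = |x - \bar{x}|$ or $|x - M|$
Continuous Series: $MD = \dfrac{\Sigma f|D|}{N}$, where $|D| = |m - \bar{x}|$ or $|m - M|$

Coefficient of MD (all series): $\dfrac{MD}{\bar{x}}$ or $\dfrac{MD}{M}$
(here $\bar{x}$ = mean, $M$ = median and $m$ = mid-value of the class)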
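As a rough illustration of the formulae above, the following Python sketch computes the mean deviation and its coefficient for an individual or discrete series. The function name and the `about` parameter are hypothetical; for a continuous series the median would first have to be found by interpolation, which this sketch does not attempt.

```python
from statistics import median


def mean_deviation(xs, fs=None, about="mean"):
    """Return (MD, coefficient of MD) for values xs with optional frequencies fs.

    about="mean" measures deviations from the arithmetic mean,
    about="median" from the median.
    """
    if fs is None:
        fs = [1] * len(xs)              # individual series: every frequency is 1
    total = sum(fs)
    if about == "mean":
        centre = sum(f * x for x, f in zip(xs, fs)) / total
    else:
        # Expand the series so the median is the middle ordered item.
        centre = median(x for x, f in zip(xs, fs) for _ in range(f))
    md = sum(f * abs(x - centre) for x, f in zip(xs, fs)) / total
    return md, md / centre


# Data of Exercise 5.2, Question 3 (discrete series)
x = [5, 10, 15, 20, 25, 30, 35, 40]
f = [16, 32, 36, 44, 28, 18, 12, 14]
md_mean, coeff_mean = mean_deviation(x, f, about="mean")
md_median, coeff_median = mean_deviation(x, f, about="median")
print(round(md_mean, 2), round(coeff_mean, 2))
print(round(md_median, 2), round(coeff_median, 2))
```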

Exercise 5.2
1. Calculate mean deviation & Coefficient of mean deviation using mean and median (Answer: 0.1193):

3000, 4300, 4000, 4800, 4200, 5800, 4600, 4500

2. Calculate mean deviation & Coefficient of mean deviation using mean and median (Answer: 0.38, 0.43):
90, 280, 65, 60, 50, 120, 100, 110, 70, 80, 75

3. Compute the mean deviation and its coefficient using mean and median (Answer: 7.66, 7.6 & 0.38, 0.38):

x 5 10 15 20 25 30 35 40

f 16 32 36 44 28 18 12 14

4. Compute the mean deviation and its coefficient using mean and median (Answer: 1.53, 1.49 & 0.407,
0.372):

No. of Home Appliances 0 1 2 3 4 5 6 7

No. of Families 14 21 25 43 51 40 39 12

5. Compute the mean deviation and its coefficient using mean and median (Answer: 11.33 & 0.252):

Marks 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60 60 – 70 70 – 80

No. of Students 4 6 10 20 10 6 4


6. Compute the mean deviation and its coefficient using mean and median (Answer: 7.6, 7.296 & 0.196,
0.194):

Mid Values 22.5 27.5 32.5 37.5 42.5 47.5 52.5 57.5 62.5

Frequency 6 12 17 28 12 10 8 5 2

7. Compute the mean deviation and its coefficient (Answer: 40.417 & 0.425):

Wages below 25 50 80 110 150 200 300

No. of Workers 4 10 20 40 50 56 60

Standard Deviation
Standard Deviation is the square root of the mean of the squared deviations from the arithmetic mean. For this reason, SD is also known as Root Mean Square Deviation. It is the most widely used measure of variation.

Differences between Mean Deviation and Standard Deviation


1. Algebraic signs are ignored while calculating mean deviation, whereas in the calculation of standard deviation, signs are taken into account.
2. Mean deviation can be computed from either the median or the mean; standard deviation is always calculated from the mean.

Merits of Standard Deviation


1. It is based on each item of the data.
2. It is amenable to further algebraic treatment. It is possible to calculate the combined SD of two or
more groups.
3. For comparing the variability of two or more groups, coefficient of variation is considered to be the
most appropriate as it is based on mean and standard deviation.
4. Standard deviation is also used in further statistical work. For example, in calculating skewness,
correlation etc., standard deviation is used.

Limitations of Standard Deviation

1. Standard deviation is difficult to compute compared to other measures.

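Formulae

Direct Method:
Individual Series: $\sigma = \sqrt{\dfrac{\Sigma d^2}{n}}$, where $d = x - \bar{x}$
Discrete & Continuous Series: $\sigma = \sqrt{\dfrac{\Sigma f d^2}{N}}$, where $d = x - \bar{x}$ or $m - \bar{x}$

Short-cut Method:
Individual Series: $\sigma = \sqrt{\dfrac{\Sigma d^2}{n} - \left(\dfrac{\Sigma d}{n}\right)^2}$, where $d = x - A$
Discrete & Continuous Series: $\sigma = \sqrt{\dfrac{\Sigma f d^2}{N} - \left(\dfrac{\Sigma f d}{N}\right)^2}$, where $d = x - A$ or $m - A$

Step-Deviation Method:
Discrete & Continuous Series: $\sigma = \sqrt{\dfrac{\Sigma f d'^2}{N} - \left(\dfrac{\Sigma f d'}{N}\right)^2} \times i$, where $d' = \dfrac{x - A}{i}$ or $\dfrac{m - A}{i}$

Variance $= \sigma^2$
Coefficient of Variation: $CV = \dfrac{\sigma}{\bar{x}} \times 100$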
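The direct and short-cut methods give the same result, which a small Python sketch can confirm. The function names and the assumed mean of 40 are illustrative choices; the data are those of Exercise 5.3, Question 2.

```python
from math import sqrt


def sd_direct(values):
    """Direct method: square root of the mean squared deviation from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / n)


def sd_shortcut(values, assumed_mean):
    """Short-cut method: deviations d = x - A are taken from an assumed mean A."""
    n = len(values)
    d = [x - assumed_mean for x in values]
    return sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)


def coefficient_of_variation(values):
    """CV = sigma / mean * 100."""
    return sd_direct(values) / (sum(values) / len(values)) * 100


data = [5, 10, 20, 25, 40, 42, 45, 48, 70, 80]
print(round(sd_direct(data), 3))                 # 23.066
print(round(sd_shortcut(data, 40), 3))           # 23.066
print(round(coefficient_of_variation(data), 2))  # 59.91
```

When two groups are compared, the group with the smaller CV is the more consistent one.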


Exercise 5.3
1. Calculate the standard deviation of the marks of 11 students (Answer: 60.49):

90, 280, 65, 60, 50, 120, 100, 110, 70, 80, 75

2. Calculate the SD and Coefficient of Variation using direct method and shortcut method (Answer: 23.066 &
59.91%):

5, 10, 20, 25, 40, 42, 45, 48, 70, 80

3. Following are the runs scored by two batsmen X and Y in ten innings. Find who is a better scorer and who
is more consistent (Answer: CV(X) = 84.072%; CV(Y) = 82.707%):

X 100 22 0 36 82 45 7 13 65 14

Y 97 12 40 96 13 8 85 8 56 16

4. Compute the coefficient of variation (Answer: 43.63%):

x 10 20 30 40 50 60

f 8 12 20 10 7 3

5. The following table gives the age distribution of boys and girls in a high school. Find which of the two
groups is more variable in age. (Answer: CV(boys) = 7.85%; CV(girls) = 7.34%)

Age 13 14 15 16 17

No. of boys 12 15 15 5 3

No. of girls 13 10 12 2 1

6. The goals scored by teams A and B in a few football matches are as follows. Which team is more
consistent? (Answer: CV(A) = 124.94%; CV(B) = 108.97%)

Goals 0 1 2 3 4

No. of matches – Team A 27 9 8 4 5

No. of Matches – Team B 17 9 6 5 3

7. Compute the variance (Answer: 311.52):

Marks 20 – 29 30 – 39 40 – 49 50 – 59 60 – 69 70 – 79 80 – 89 90 - 99

No. of Students 5 12 15 20 18 10 6 4

8. Compute the coefficient of variation from the following data (Answer: 152.77%):

Profit/Loss - 4 – -3 -3 – -2 -2 – -1 -1 – 0 0–1 1–2 2–3 3–4 4–5 5–6

No. of shops 4 10 22 28 38 56 40 24 18 10


9. Find which class is more consistent in scoring marks, from the following table (Answer: 24.99 & 23.53):

Marks 20 – 30 30 – 40 40 – 50 50 – 60 60 – 70

Class A 7 10 20 18 7

Class B 5 9 21 15 6

10. The following data relate to the wages of workers in factories A and B. In which factory are the wages more variable? (Answer: CV(A) = 54.14%; CV(B) = 49.89%)

Wages up to (₹) 5 10 15 20 25 30

No. of workers – A 20 38 68 93 113 128

No. of workers – B 15 35 70 100 118 135

11. The number of employees, average wages per employee, and the variance of wages for two factories are given below. (Answer: 2.5% and 4.71%)

Factory A Factory B

No. of employees 50 100

Average wages ₹120 ₹85

Variance 9 16

In which factory is there greater variation in the distribution of wages per employee? Which factory pays more?

12. Mean and standard deviation of the following continuous series are 31 and 15.9 respectively. The
distribution after taking step deviations is as follows. Determine the class intervals. (Answer: i = 10, CI = 0
– 10, 10 – 20 etc.).

d' -3 -2 -1 0 1 2 3

f 10 15 25 25 10 10 5

13. a. If x̅ = 56 and Variance = 144, find CV. (Answer: 21.43)

b. If Variance = 16, and CV = 50% find x̅. (Answer: 8)

c. If CV = 58% and x̅ = 36.55, find σ. (Answer: 21.2)



Chapter 6: Measures of Skewness
Skewness is a measure of the asymmetry of a statistical distribution. It characterizes the degree of symmetry or asymmetry of the distribution around its mean.

Absolute and Relative Measures of Skewness


1. Absolute measures of Skewness
Absolute measure of skewness explains the extent of asymmetry and the direction.
2. Relative Measures of Skewness
Relative measure of skewness is useful for comparative study of two or more series

Symmetrical Distribution
A distribution is symmetrical if the Mean = Median = Mode
A distribution is positively skewed if Mean > Median > Mode
A distribution is negatively skewed if Mean < Median < Mode

Interpretation of coefficient of skewness


If skewness is less than -1 or greater than +1 (Skp < -1 or Skp > +1), the distribution is highly skewed.
If skewness is between -1 and -½ or between +½ and +1 (-1 ≤ Skp < -½ or +½ < Skp ≤ +1), the distribution is moderately skewed.
If skewness is between -½ and +½ (-½ ≤ Skp ≤ +½), the distribution is approximately symmetric.
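The interpretation rules above can be written as a small Python helper. The function name is hypothetical, and treating a coefficient of exactly ±½ as approximately symmetric is an assumption made to keep the three bands from overlapping.

```python
def interpret_skewness(sk):
    """Classify a coefficient of skewness into the three bands listed above."""
    if sk < -1 or sk > 1:
        return "highly skewed"
    if abs(sk) > 0.5:
        return "moderately skewed"
    return "approximately symmetric"    # includes exactly -0.5 and +0.5 (assumption)


for value in (-0.8761, 0.162, 1.2):
    print(value, interpret_skewness(value))
```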

Uses of Skewness
1. Skewness is a measure to study whether a distribution is symmetrical or not.
2. Many models assume normal distribution; i.e., data are symmetric about the mean. The normal
distribution has a skewness of zero. But in reality, data points may not be perfectly symmetric. So, an
understanding of the skewness of the dataset indicates whether deviations from the mean are going
to be positive or negative.

Differences between Measures of Variation and Skewness


Dispersion:
1. It is concerned with the amount of dispersion
2. It gives the scatteredness of the observations
3. It does not depend on the skewness
4. It is based on the averages of the first order (Mean, Median and Mode)

Skewness:
1. It tells us about the direction of the variation or departure from the symmetry
2. It indicates to what extent and in what direction the distribution differs from the symmetry
3. It depends on the dispersion to some extent.
4. It is based on the averages of the first order (Mean, Median and Mode) and second order (SD)

Formulae
For unimodal distribution: Karl Pearson’s Coefficient of Skewness, $Sk_p = \dfrac{\bar{X} - M_0}{\sigma}$
For bimodal distribution: Karl Pearson’s Coefficient of Skewness, $Sk_p = \dfrac{3(\bar{X} - M)}{\sigma}$
Bowley’s Coefficient of Skewness, $S_B = \dfrac{Q_3 + Q_1 - 2M}{Q_3 - Q_1}$
(here $M_0$ = mode and $M$ = median)
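The three coefficients can be sketched in Python from summary values that have already been computed. The function names are illustrative only; the sample figures come from Exercise 6.1, Question 10.

```python
def pearson_skewness_mode(mean, mode, sd):
    """Karl Pearson's coefficient using the mode: (mean - mode) / sd."""
    return (mean - mode) / sd


def pearson_skewness_median(mean, median, sd):
    """Karl Pearson's coefficient using the median: 3 (mean - median) / sd."""
    return 3 * (mean - median) / sd


def bowley_skewness(q1, median, q3):
    """Bowley's coefficient: (Q3 + Q1 - 2M) / (Q3 - Q1)."""
    return (q3 + q1 - 2 * median) / (q3 - q1)


# Question 10a: mean 65, mode 80, standard deviation 25
print(round(pearson_skewness_mode(65, 80, 25), 2))
# Question 10b: mean 11.4, median 12.8, standard deviation 6
print(round(pearson_skewness_median(11.4, 12.8, 6), 2))
```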

Exercise 6.1
1. Calculate Karl Pearson’s and Bowley’s Coefficients of Skewness (Answer: 0.3453 & 0.2):
23, 45, 12, 28, 23, 19, 27, 23, 28, 30

2. Calculate Karl Pearson’s and Bowley’s Coefficients of Skewness (Answer: 0.162 & 0.164):
112, 75, 140, 89, 112, 98, 134, 129, 98, 121, 136

3. Calculate Pearson’s and Bowley’s Coefficients of Skewness (Answer: – 0.2445 & 0):

x 14.5 15.5 16.5 17.5 18.5 19.5 20.5 21.5

f 35 40 48 100 125 87 43 22

4. Compute the two Coefficients of Skewness (Answer: – 0.8761 & -0.2):

x 4 8 12 16 20 24 28 32 36

f 18 21 20 9 7 20 22 17 8

5. Which group is more skewed?


i) Mean = 22; Median = 24, SD = 10 ii) Mean = 22, Median = 25, SD = 12

6. Calculate Karl Pearson’s and Bowley’s Coefficients of Skewness. (Answer: –0.0518 & –0.0165)

Class Interval 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60 60 – 70 70 – 80

Frequency 6 12 22 48 56 32 18 6

7. Calculate Karl Pearson’s and Bowley’s Coefficients of Skewness (Answer: 0.401 & 0.3750):

Marks Above 0 10 20 30 40 50 60 70 80 90

No. of Students 100 98 95 90 80 50 35 23 13 5

8. Calculate Karl Pearson’s and Bowley’s Coefficients of Skewness (Answer: –0.2078 & –0.058):

Mid Value 21 27 33 39 45 51 57

Frequency 18 22 40 50 38 12 4

9. Compute Karl Pearson’s and Bowley’s Coefficients of Skewness (Answer: –0.310 & –0.2314):

CI 3–7 8 – 12 13 – 17 18 – 22 23 – 27 28 – 32 33 – 37 38 – 42

f 7 9 10 6 13 10 13 10

10. a. In a distribution, Mean = 65, Median = 70, Skp = – 0.6. Find i) Mode and ii) CV. (Answers: 80, 38.46%)

b. Skp = – 0.7, σ = 6, M = 12.8. Find the Mean and CV. (Answers: 11.4, 52.63%)



Chapter 7: Correlation and Regression
The statistical tool with the help of which the relationship between two or more variables is studied is called
correlation. The measure of correlation is called the Correlation Coefficient.

Uses of Correlation Coefficient


1. Helps us measure the relationship between the variables.
2. If the variables are closely related, we can estimate the value of one variable, given the value of
another with the help of Regression Analysis
3. Helps in analyzing the economic behavior
4. Helps in the study of social science, e.g., the relationship between smoking and lung cancer.

Correlation and Causation
A high degree of correlation does not by itself prove causation, for the following reasons:
1. The correlation may be due to pure chance, especially in a sample, e.g., the relationship between salary and weight.
2. Both the correlated variables may be influenced by one or more other variables, e.g., a high degree of correlation between the yield per acre of rice and wheat may be due to heavy rainfall or fertilizers used.
3. The variables may be mutually influencing each other, so that neither can be designated as the cause and the other as the effect, e.g., demand and price.
4. Nonsense / Illusory Correlation: A correlation between two variables that is not due to any causal relationship between them but arises from a third variable or from random sampling fluctuations, e.g., global warming and the number of pirates.
Types of Correlation
1. Positive or Direct Correlation: When the two variables are directly related, i.e., when one increases the other also increases, the correlation is said to be positive, e.g., supply and price.
2. Negative or Indirect Correlation: When the two variables are inversely related, i.e., when one increases the other decreases, the correlation is said to be negative, e.g., demand and supply.
3. Partial Correlation: When more than two variables are involved but the relationship between only two of them is studied, keeping the effect of the other variables constant, it is a case of partial correlation.
4. Simple Correlation: When only two variables are studied, it is called simple correlation.
5. Multiple Correlation: When three or more variables are studied, it is called multiple correlation.
6. Linear Correlation: When the two variables change in a fixed proportion, thus forming a straight line when plotted, the correlation is said to be linear.
7. Non-linear or Curvilinear Correlation: If the variables, when plotted on a graph, do not form a straight line, the correlation is said to be curvilinear. In other words, the amount of change in one variable does not bear a constant ratio to the change in the other variable.

Methods of Determining Correlation


1. Karl Pearson’s Coefficient of Correlation 2. Spearman’s Rank Coefficient of Correlation
3. Concurrent Deviation Method 4. Scatter Diagram method 5. Method of Least Squares


Karl Pearson’s Coefficient of Correlation


This is the most widely used method of measuring correlation. It is popularly known as Pearsonian coefficient
of correlation. It is denoted by the symbol ‘r’.

Assumptions While Using Karl Pearson’s Coefficient of Correlation


While using Karl Pearson’s coefficient of correlation, it is assumed that,

1. The distribution is normal.
2. There is a cause-and-effect relationship between the variables.
3. There is a linear relationship between the variables.

Properties of Karl Pearson’s Coefficient of Correlation
1. The value of r always lies between -1 and +1. Interpretation: ±1 – Perfect correlation; ±0.9 to ±1 – Very high degree; ±0.75 to ±0.9 – High degree; ±0.60 to ±0.75 – Moderate degree; ±0.30 to ±0.60 – Low degree; 0 to ±0.30 – Very low degree; 0 – No correlation.
2. It is independent of change of scale and origin of X and Y variables.
3. It is the geometric mean of the two regression coefficients: $r = \sqrt{b_{xy} \times b_{yx}}$

Merits of Karl Pearson’s Coefficient of Correlation


1. This is the most popular among the mathematical methods
2. It summarizes in one value the degree of correlation and its direction – direct or inverse.
Limitations of Karl Pearson’s Coefficient of Correlation
1. It assumes a linear relationship.
2. There are chances of misinterpretation.
3. It is more time consuming compared to other methods.

Probable Error
It is the value that helps determine the reliability of the value of the correlation coefficient in the condition of
random sampling. It helps interpret the correlation coefficient.

Methods of Interpretation
1. If r < 6PE, the value of r may not be significant.
2. If r > 6PE, the value of r is significant or practically certain.
3. Using the limits of population, we get the range within which population correlation lies. ρ = r ± PE

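Formulae
Using Actual Mean: $r = \dfrac{\Sigma dx\,dy}{\sqrt{\Sigma dx^2 \times \Sigma dy^2}}$

Using Assumed Mean: $r = \dfrac{\Sigma dx\,dy - \dfrac{\Sigma dx \cdot \Sigma dy}{N}}{\sqrt{\Sigma dx^2 - \dfrac{(\Sigma dx)^2}{N}} \times \sqrt{\Sigma dy^2 - \dfrac{(\Sigma dy)^2}{N}}}$

Probable Error: $P.E. = 0.6745 \times \dfrac{1 - r^2}{\sqrt{N}}$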
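A minimal Python sketch of the formulae above follows. The function names and the assumed means (20 and 40) are illustrative choices; the data are those of Exercise 7.1, Question 1.

```python
from math import sqrt


def pearson_r(xs, ys):
    """r using deviations dx, dy taken from the actual means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(
        sum(a * a for a in dx) * sum(b * b for b in dy))


def pearson_r_assumed(xs, ys, ax, ay):
    """r using deviations taken from assumed means ax and ay."""
    n = len(xs)
    dx = [x - ax for x in xs]
    dy = [y - ay for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy)) - sum(dx) * sum(dy) / n
    denominator = (sqrt(sum(a * a for a in dx) - sum(dx) ** 2 / n)
                   * sqrt(sum(b * b for b in dy) - sum(dy) ** 2 / n))
    return numerator / denominator


def probable_error(r, n):
    """P.E. = 0.6745 (1 - r^2) / sqrt(N)."""
    return 0.6745 * (1 - r * r) / sqrt(n)


internal = [25, 30, 22, 12, 19, 24]
external = [56, 68, 40, 24, 28, 60]
r = pearson_r(internal, external)
pe = probable_error(r, len(internal))
print(round(r, 4))          # 0.9243
print(abs(r) > 6 * pe)      # True: r exceeds six times its probable error
```

Both functions return the same value of r, because the coefficient is independent of the change of origin.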


Exercise 7.1
1. Compute the coefficient of correlation from the following data: (Ans.: +0.9243)

Internal Marks 25 30 22 12 19 24

External Marks 56 68 40 24 28 60

2. Compute the coefficient of correlation from the following data: (Ans.: +0.6051)

X 6 8 9 14 17 28 24 31 7

Y 10 12 15 15 18 25 22 26 28

3. Compute the coefficient of correlation from the following data: (Ans.: +0.8818)

X 45 55 56 58 60 65 68 70 75 80

Y 56 50 48 60 62 64 65 70 74 82

4. Compute the coefficient of correlation from the following data: (Ans.: – 0.7327)

X 43 44 46 40 44 42 45 42 38 40 42 57

Y 29 31 19 18 19 27 27 29 41 30 26 10

5. Calculate the coefficient of correlation between age and playing habits of students: (Ans.: – 0.9895)

Age 15 16 17 18 19 20

No. of Students 250 200 150 120 100 80

Regular Players 250 150 90 48 30 16

6. The following table gives the distribution of the total population and those blind among them. Calculate the
coefficient of correlation and probable error. (Ans.: 0.898)

Age 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60 60 – 70 70 – 80

No. of Persons (‘000) 100 60 40 36 24 11 6 3

Blind Persons 55 40 40 40 36 22 18 15

7. Calculate ‘r’ between age and failure of candidates in the results of B.Com students: (Ans.: 0.7745)

Age 13 – 14 14 – 15 15 – 16 16 – 17 17 – 18 18 – 19 19 – 20 20 – 21 21 – 22 22 – 23

Candidates Appeared 200 300 100 50 150 400 250 150 25 75

Candidates failed 76 120 35 16 51 148 105 69 13 42

8. a. If r = +0.111 and N = 5, find PE (Answer: 0.2984)

b. If r = +0.9668 and PE = 0.01463, find N (Answer: 9) [Hint: Round off answer to the closest whole number]

c. If PE = 0.11857 and N = 8, find r. (Answer: +0.709)


Spearman’s Rank Correlation


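Formulae
Unique Ranks: $r_s = 1 - \dfrac{6\,\Sigma d^2}{N^3 - N}$, where $d = R_1 - R_2$

Tied Ranks: $r_s = 1 - \dfrac{6\left[\Sigma d^2 + \frac{1}{12}(m_1^3 - m_1) + \frac{1}{12}(m_2^3 - m_2) + \cdots + \frac{1}{12}(m_n^3 - m_n)\right]}{N^3 - N}$, where $m$ = number of items sharing a tied rank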
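The two formulae above can be sketched in Python as follows. The function names are hypothetical, and ranking the largest value as 1 with tied values sharing the average rank is the assumed convention. The data are those of Exercise 7.2, Questions 1 and 4.

```python
def spearman_from_ranks(r1, r2, tie_correction=0.0):
    """r_s = 1 - 6 (sum d^2 + correction) / (N^3 - N) for two lists of ranks."""
    n = len(r1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * (sum_d2 + tie_correction) / (n ** 3 - n)


def average_ranks(values):
    """Rank 1 goes to the largest value; tied values share the average rank."""
    ordered = sorted(values, reverse=True)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1    # first position occupied by this value
        count = ordered.count(v)        # m = number of items tied at this value
        ranks.append(first + (count - 1) / 2)
    return ranks


def spearman(xs, ys):
    """Rank correlation from raw scores, with the (m^3 - m) / 12 tie correction."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    correction = 0.0
    for ranks in (rx, ry):
        for rank in set(ranks):
            m = ranks.count(rank)
            if m > 1:
                correction += (m ** 3 - m) / 12
    return spearman_from_ranks(rx, ry, correction)


lady_1 = [1, 3, 2, 7, 6, 4, 5]
lady_2 = [2, 1, 4, 6, 7, 3, 5]
print(round(spearman_from_ranks(lady_1, lady_2), 3))   # 0.786

accountancy = [60, 15, 20, 28, 12, 40, 80, 20]
statistics_marks = [10, 40, 30, 50, 30, 20, 60, 30]
print(round(spearman(accountancy, statistics_marks), 3))
```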

Exercise 7.2
1. Two ladies ranked seven brands of lipsticks as follows. Find the degree of agreement (Ans.: 0.786):

Lady 1 1 3 2 7 6 4 5
Lady 2 2 1 4 6 7 3 5

2. In a beauty competition, two judges ranked 12 participants as follows. What is the degree of agreement
between them? (Ans.: – 0.4546)

X 3 4 1 5 2 10 6 9 8 7 12 11
Y 6 10 12 3 9 2 5 8 7 4 1 11

3. Compute the rank correlation from the following data (Ans.: 0.8322):

X 60 34 40 50 45 41 22 43 42 66 64 46
Y 75 32 35 40 45 33 12 30 36 72 41 57

4. From the marks scored in accountancy and statistics by 8 students, compute rank correlation (Ans.: 0):

Accountancy 60 15 20 28 12 40 80 20
Statistics 10 40 30 50 30 20 60 30

5. Compute the coefficient of rank correlation (Ans.: 0.733):

X 48 33 40 9 16 16 65 24 16 57
Y 13 13 24 6 15 4 20 9 6 19

6. Compute the rank correlation between the length of service and order of merit (Ans.: 0.7937):

Length of Service 5 2 10 8 6 4 12 2 7 5 9 3
Order of Merit 6 12 1 9 8 5 2 10 3 7 4 11

7. Ten competitors in a voice contest are ranked by three judges in the following order. Find which pair of
judges have the nearest approach to common liking in voice (Ans.: -0.212, -0.297, 0.6364; Judges 1 & 3):

Judge 1 1 6 5 10 3 2 4 9 7 8
Judge 2 3 5 8 4 7 10 2 1 6 9
Judge 3 6 4 9 8 1 2 3 10 5 7


Regression
The statistical tool with the help of which we are in a position to estimate or predict the unknown values of
one variable from known values of another variable is called regression.
Correlation vs. Regression
1. Correlation coefficient is a measure of degree of co-variability between two variables, but regression
analysis helps to predict the value of one variable given the value of the other.
2. Regression analysis indicates the cause-and-effect relationship more clearly than correlation, which is mainly a tool for ascertaining the degree of relationship between the variables.

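Formulae
Equation X on Y: $(X - \bar{X}) = b_{xy}(Y - \bar{Y})$
Equation Y on X: $(Y - \bar{Y}) = b_{yx}(X - \bar{X})$

Formulae to Find the Regression Coefficients:

Using Actual Mean: $b_{xy} = \dfrac{\Sigma dx\,dy}{\Sigma dy^2}$; $b_{yx} = \dfrac{\Sigma dx\,dy}{\Sigma dx^2}$

Using Assumed Mean: $b_{xy} = \dfrac{N\,\Sigma dx\,dy - \Sigma dx \cdot \Sigma dy}{N\,\Sigma dy^2 - (\Sigma dy)^2}$; $b_{yx} = \dfrac{N\,\Sigma dx\,dy - \Sigma dx \cdot \Sigma dy}{N\,\Sigma dx^2 - (\Sigma dx)^2}$

Using Standard Deviation: $b_{xy} = r\,\dfrac{\sigma_x}{\sigma_y}$; $b_{yx} = r\,\dfrac{\sigma_y}{\sigma_x}$

Coefficient of Correlation: $r = \sqrt{b_{xy} \times b_{yx}}$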
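The actual-mean formulae above translate directly into a short Python sketch. The function names are illustrative only; the data are those of Exercise 7.3, Question 1.

```python
from math import sqrt


def regression_coefficients(xs, ys):
    """Return (mean_x, mean_y, bxy, byx) using deviations from the actual means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    bxy = sum_dxdy / sum(b * b for b in dy)   # regression coefficient of X on Y
    byx = sum_dxdy / sum(a * a for a in dx)   # regression coefficient of Y on X
    return mean_x, mean_y, bxy, byx


def estimate_x(y, mean_x, mean_y, bxy):
    """X on Y: X = mean_x + bxy (Y - mean_y)."""
    return mean_x + bxy * (y - mean_y)


def estimate_y(x, mean_x, mean_y, byx):
    """Y on X: Y = mean_y + byx (X - mean_x)."""
    return mean_y + byx * (x - mean_x)


xs = [2, 4, 6, 8, 10]
ys = [5, 7, 9, 8, 11]
mean_x, mean_y, bxy, byx = regression_coefficients(xs, ys)
print(round(bxy, 2), round(mean_x - bxy * mean_y, 2))   # 1.3 -4.4  -> X = 1.3Y - 4.4
print(round(byx, 2), round(mean_y - byx * mean_x, 2))   # 0.65 4.1  -> Y = 0.65X + 4.1
print(round(sqrt(bxy * byx), 4))                        # r for positive coefficients
```

When both regression coefficients are negative, r takes the negative square root; the last line above covers only the positive case.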

Exercise 7.3
1. Find the Regression Equations (Answer: X = 1.3Y – 4.4 & Y = 0.65X + 4.1):

X 2 4 6 8 10

Y 5 7 9 8 11

2. A panel of judges P & Q graded seven dramatic performances by awarding marks as follows. Obtain the
two Regression Equations: (Answer: X = 0.75Y + 14.5 & Y = 0.75X + 5.75)

Performance 1 2 3 4 5 6 7

Marks by P 46 42 44 40 43 41 45

Marks by Q 40 38 36 35 39 37 41

3. Following Table shows the exports of raw cotton and the imports of manufactured goods into India for
seven years.

Exports 42 44 58 55 89 98 60

Imports 56 49 53 58 67 76 58


Obtain the two Regression Equations and estimate the imports when export in a particular year was ₹ 70
crore. (Answer: 62.03; X = 2.198Y – 67.244 & Y = 0.391X + 34.651)

4. The advertisement expenses and sales data of ABC company are as follows:

Advertisement Expenses (₹ Lakh) 60 62 65 70 73 75 71

Sales (₹ Crore) 10 11 13 15 16 19 14

Find:

a. Sales for advertisement expenses of ₹ 80 lakhs. (Answer: ₹ 20.525 Crore)

b. Advertisement expenses for a sales target of ₹ 25 Crore. (Answer: ₹ 87.786 Lakh)

c. Coefficient of Correlation (Answer: 0.9870)

(The Regression Equations are: X = 1.807Y + 42.619 and Y = 0.539X – 22.613)

5. Following data are available on sales and advertisement:

Sales (₹) Advertisement Expenses (₹)

Mean 70,000 15,000

Standard Deviation 15,000 3,000

Coefficient of correlation is +0.8

Find:

a. The two Regression Equations (Answer: X = 4Y + 10,000 & Y = 0.16X + 3,800)

b. The advertisement budget if the company desires to achieve the target sales of ₹ 1,00,000
(Answer: ₹ 19,800)

6. Coefficient of correlation between the ages of brothers and sisters in a community was found to be 0.8.
Average age of the brothers was 25 and that of sisters 22 years. Their variances were 16 and 25
respectively.

Find:

a. The expected age of the brother when sister’s age is 12 years. (Answer: 18.6 years)

b. The expected age of the sister when brother’s age is 33 years. (Answer: 30 years)

(The Regression Equations are: X = 0.64Y + 10.92 and Y = X – 3)

7. a. If r = 0.42, σy = 16.8 and σx = 10.8, find bxy and byx (Answers: 0.269 & 0.653)

b. If bxy = 0.2, r = 0.533 and σx = 5, find σy (Answer: 13.325)

c. If bxy = 2.1 and byx = 0.456, find r (Answer: 0.978)

d. If bxy = 2 and r = 0.578, find byx (Answer: 0.167)



Chapter 8: Index Numbers
An index number is a specialized average designed to measure the change in the level of a phenomenon with respect to time, geographic location or other characteristics such as income, price, etc.

Features of Index Numbers


1. Index numbers are specialized averages
An average is not a suitable measure for comparing different groups of data if they are expressed in different units. Index numbers, however, help compare different groups of data even if they are expressed in different units. For instance, the spending on food, clothing, house rent etc. can be compared using index numbers.
2. Index numbers measure the change in the level of phenomenon
For instance, if the index of industrial production is 108 in 2012 compared to 100 in 2011, it means
there is a net increase of 8% in industrial production.
3. Index numbers measure the effect of change over a period of time
For instance, BSE index, introduced in 1986, is used to study the movements in the share prices till
date.

Uses of Index Numbers


1. They help in framing suitable policies: For instance, wages and salaries are adjusted based on
Consumer Price Index.
2. They reveal trends and tendencies: For instance, to study the export trend after economic
liberalization in 1991, the current index can be compared with that of 1991.
3. Useful in deflating: Deflation is the process of adjusting original data for price changes. For instance,
nominal income can be adjusted to real income.
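Deflating can be illustrated with a small Python sketch. It assumes the usual relation real income = nominal income ÷ price index × 100; the function names are hypothetical, and the figures (a base wage of ₹750 and an index of 203) are taken from Exercise 8.1, Question 8.

```python
def real_income(nominal_income, price_index):
    """Deflate a nominal figure to base-year prices: nominal / index * 100."""
    return nominal_income / price_index * 100


def required_income(base_income, price_index):
    """Income needed now to keep the purchasing power of the base-year income."""
    return base_income * price_index / 100


needed = required_income(750, 203)
print(needed)                               # wage that preserves the 2004 standard of living
print(round(real_income(needed, 203), 2))   # deflating it returns the base-year wage
```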

Types of Index Numbers


1. Unweighted Index: The method of constructing index numbers in which weights are not assigned to
the items is called Unweighted Index. It includes Simple Aggregative and Simple Average of relatives.
2. Weighted Index: The method of constructing index numbers in which weights are assigned to the
items is called weighted index. It includes Weighted Aggregative and Weighted Average of Relatives.

Some Important Definitions


1. Base Year: The base year is any reference year earlier than the year for which the indices are calculated. It is used as the reference point for comparing changes in the phenomenon.
2. Fixed Base: Refers to the base year, which remains fixed over a period of time. The fixed base year serves as a common standard of comparison for all prices during the period.
3. Chain Base: Refers to the base year which changes from year to year. Generally the previous year will be the base year for calculating the index number for the current year.
4. Consumer Price Index or Cost of Living Index: CPI measures the effect of change in prices of consumer goods, which may include food, clothing, fuel, lighting, house rent etc., on the working class families or consumers, during any year with respect to some fixed year.
5. Time Reversal Test: A formula for an index number should maintain time consistency by working both forward and backward with respect to time. This is called the time reversal test. It is expressed in the form of an equation as follows: $P_{01} \times P_{10} = 1$
6. Factor Reversal Test: The index must permit interchanging the prices and quantities without giving inconsistent results. The two results multiplied together should give a true value ratio. This is given by the expression: $P_{01} \times Q_{01} = \dfrac{\Sigma p_1 q_1}{\Sigma p_0 q_0}$

Points to be Considered While Selecting the Base Year


1. It should be a normal year
2. It should not be too distant in the past
3. Fixed base or Chain base

Limitations of Index Numbers


1. Sampling errors
2. It is assumed that the quality of the products remains the same
3. Specific index for specific purpose
4. It is assumed that there is no change in tastes, habits and customs
5. No single formula to calculate the index which may be suitable for all situations
6. Unreliable comparisons over longer periods
7. It is difficult to select a normal year as base year

Fisher’s Ideal Index Number


Fisher’s Index Number is called ideal for the following reasons:

1. It is based on geometric mean which is considered to be the best average for constructing index
numbers
2. It takes into account both, current year as well as base year prices and quantities.
3. It satisfies both Time Reversal Test (TRT) and Factor Reversal Test (FRT).
4. It is free from bias.

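Formulae
Simple Aggregative Method: $P_{01} = \dfrac{\Sigma p_1}{\Sigma p_0} \times 100$

Weighted Aggregative Method: $P_{01} = \dfrac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times 100$

Simple Average of Price Relatives Method: $P_{01} = \dfrac{\Sigma I}{n} = \dfrac{\Sigma\left(\dfrac{p_1}{p_0} \times 100\right)}{n}$

CPI/CLI: Aggregate Expenditure Method: $P_{01} = \dfrac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times 100$

Family Budget Method: $P_{01} = \dfrac{\Sigma IW}{\Sigma W}$, where $I = \dfrac{p_1}{p_0} \times 100$ and $W = p_0 q_0$

Fisher’s Ideal Index Number: $P_{01} = \sqrt{\dfrac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times \dfrac{\Sigma p_1 q_1}{\Sigma p_0 q_1}} \times 100$

Time Reversal Test: $P_{01} \times P_{10} = \sqrt{\dfrac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times \dfrac{\Sigma p_1 q_1}{\Sigma p_0 q_1}} \times \sqrt{\dfrac{\Sigma p_0 q_1}{\Sigma p_1 q_1} \times \dfrac{\Sigma p_0 q_0}{\Sigma p_1 q_0}} = 1$

Factor Reversal Test: $P_{01} \times Q_{01} = \sqrt{\dfrac{\Sigma p_1 q_0}{\Sigma p_0 q_0} \times \dfrac{\Sigma p_1 q_1}{\Sigma p_0 q_1}} \times \sqrt{\dfrac{\Sigma p_0 q_1}{\Sigma p_0 q_0} \times \dfrac{\Sigma p_1 q_1}{\Sigma p_1 q_0}} = \dfrac{\Sigma p_1 q_1}{\Sigma p_0 q_0}$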
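The index number formulae above can be checked numerically with a short Python sketch. The function names are illustrative only; the data are those of Exercise 8.1, Question 14, and the two reversal tests are applied to the Fisher index without the factor of 100.

```python
from math import sqrt


def dot(a, b):
    """Sum of products, e.g. dot(p1, q0) is the sum of p1 * q0."""
    return sum(x * y for x, y in zip(a, b))


def simple_aggregative(p0, p1):
    return sum(p1) / sum(p0) * 100


def simple_average_of_relatives(p0, p1):
    return sum(b / a * 100 for a, b in zip(p0, p1)) / len(p0)


def weighted_aggregative(p0, q0, p1):
    """Base-year quantities as weights (also the Aggregate Expenditure Method)."""
    return dot(p1, q0) / dot(p0, q0) * 100


def family_budget(p0, q0, p1):
    """Sum of I*W over sum of W, with I = p1/p0 * 100 and W = p0 * q0."""
    weights = [p * q for p, q in zip(p0, q0)]
    relatives = [b / a * 100 for a, b in zip(p0, p1)]
    return dot(relatives, weights) / sum(weights)


def fisher_ratio(p0, q0, p1, q1):
    """Fisher's ideal index as a ratio, i.e. before multiplying by 100."""
    return sqrt(dot(p1, q0) / dot(p0, q0) * dot(p1, q1) / dot(p0, q1))


p0, p1 = [16, 4, 2, 4, 2], [40, 12, 4, 10, 10]
q0, q1 = [100, 30, 40, 20, 80], [120, 20, 50, 16, 60]

p01 = fisher_ratio(p0, q0, p1, q1)
p10 = fisher_ratio(p1, q1, p0, q0)      # base and current year interchanged
q01 = fisher_ratio(q0, p0, q1, p1)      # prices and quantities interchanged

print(round(p01 * 100, 2))                                   # Fisher's ideal index
print(abs(p01 * p10 - 1) < 1e-9)                             # Time Reversal Test holds
print(abs(p01 * q01 - dot(p1, q1) / dot(p0, q0)) < 1e-9)     # Factor Reversal Test holds
```

The Aggregate Expenditure and Family Budget methods always give the same index, because the weights $W = p_0 q_0$ cancel the $p_0$ in each price relative.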


Exercise 8.1
1. Calculate the price index for 2006, 2007 and 2008 using the simple aggregative method on the basis of 1995
(Answers: 124.37, 139.42, 153.20):

Commodity Unit 1995 2006 2007 2008

Rice Kg ₹10.50 ₹12.10 ₹14.30 ₹18.60

Wheat Kg 9.25 11.40 12.70 13.40

Milk L 4.75 7.00 9.00 10.50

Sugar Kg 8.60 14.00 16.00 17.00

Oil Kg 27.50 32.00 35.00 36.50

Pulses Kg 11.20 12.80 13.10 14.00

2. Calculate the weighted aggregative index number for the following commodities for the year 2001 and 2008 taking
the year 1991 as the base year (Answers: 130.45, 156.97):

Commodity   Units Consumed (1991)   Price in 1991 (₹)   Price in 2001 (₹)   Price in 2008 (₹)

Rice 10 kg 11.00 16.50 18.00

Wheat 5 kg 10.20 12.25 14.00

Grams 3 kg 5.00 7.00 9.00

Milk 30 litres 6.70 9.00 10.50

Oil 4 kg 29.00 32.00 38.00

Sugar 12 kg 8.80 11.30 16.30

3. Calculate the price index numbers for the following data for 2007 and 2008 using simple average of price relative
method (Answers: 147, 196):

Commodity Bricks Timber Board Sand Cement

Prices – 2001 10 20 5 2 7

Prices – 2007 16 21 6 3 14

Prices – 2008 18 22 7 5 21

4. Calculate the index number for the following data using simple average of price relative method (Answer: 122.92)

Commodity A B C D E F

Prices – 2008 4 6 2 5 8 10

Prices – 2009 5 6 3 7 9 11


5. Calculate the Consumer Price Index or Cost of Living Index Number using Aggregative Expenditure Method and Family
Budget Method (Answer: 150):

Item   Quantity (2005)   Price (2005)   Price (2010)

A 5 8 15

B 2 9 12

C 3 16 20

6. Calculate the CPI using Aggregative Expenditure Method and Family Budget Method (Answer: 118.77):

Item   Quantity (2008)   Price (2008)   Price (2009)

A 6 quintals 5.75 6.00

B 6 quintals 5.00 8.00

C 1 quintal 6.00 9.00

D 6 quintals 8.00 10.00

E 4 kg 2.00 1.50

F 1 quintal 20.00 15.00

7. An enquiry into the budgets of middle class families in Bangalore gave the following information:

Commodity Food Rent Clothing Fuel Miscellaneous

Expenses – 2007 35% 15% 20% 10% 20%

Price relatives – 2008 116 120 125 125 150

What change in the cost of living index has taken place in 2008 as compared to 2007? How much dearness allowance should be given to a worker who was drawing ₹200 as wages in 2007? (Answers: 126.10 & ₹52.20)

8. Following information relating to workers in an industrial town is given:

Item   Food & Beverages   Clothing   Housing   Fuel & Lighting   Miscellaneous

Group Index – 2009 (Base 2004)   225   185   150   200   180

Proportion of Expenditure   50%   10%   10%   15%   15%

Average wage per month in 2004 is ₹750. What should be the average wage per worker in 2009 in that town so that
the standard of living of the workers does not fall below that of 2004? (Answers: 203 & ₹1,522.50)

9. An enquiry into the budget of the middle class families in a city gave the following information. What changes in the
cost of living figures of 2005 as compared to that of 2002 are seen? (Answer: 102.75)


Item   Percentage Expenses   Price in 2002 (₹)   Price in 2005 (₹)

Food 29% 140 147

Rent 15% 30 30

Clothing 25% 75 66

Fuel 10% 25 20

Miscellaneous 21% 40 52

10. The data below show the percentage increase in prices of selected food items and the weights attached to each of
them. Calculate the index number for the food group (Answer: 340, 304.6)

Food Item:             Rice    Wheat    Dal    Ghee    Oil    Spices    Milk    Fish    Vegetables    Refreshments
Weights                33      11       8      5       5      3         7       9       9             10
Increase in Price %    180     202      115    212     175    517       260     426     332           279

Using the above food index and information given below, calculate the cost of living index number:

Commodity: Food Clothing Lighting Rent Miscellaneous

Index - 310 220 150 300

Weight 60 - 8 9 18

11. The cost of living index number on a certain date was 200. From the base period, the percentage increases in prices
were Rent – 60%, Clothing – 250%, Fuel and lighting – 150%, Miscellaneous – 120%. The weights of the different groups
were Food – 60, Rent – 16, Clothing – 12, Fuel and lighting – 8, and Miscellaneous – 4. What was the percentage
increase in the food group? (Answer: 72.67)
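
Exercise 11 works backwards from the overall index to a missing group index. A minimal sketch, reading the listed group figures as percentage increases over the base period (so each group index is 100 plus the increase):

```python
overall_index = 200
weights = {"Food": 60, "Rent": 16, "Clothing": 12, "Fuel and lighting": 8, "Miscellaneous": 4}
increase_pct = {"Rent": 60, "Clothing": 250, "Fuel and lighting": 150, "Miscellaneous": 120}

# Weighted sum of the known group indices (index = 100 + percentage increase).
known = sum(weights[g] * (100 + increase_pct[g]) for g in increase_pct)

# overall_index * sum(W) = W_food * I_food + known, so solve for I_food.
food_index = (overall_index * sum(weights.values()) - known) / weights["Food"]
food_increase = food_index - 100

print(round(food_increase, 2))
```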

12. A textile worker earns ₹350 per month. The cost of living index for that particular month is known to be 136. Using
the data given below, find the amounts spent by him on clothing and house rent (Answer: 42, 49):

Commodity:     Food    Clothing    House Rent    Fuel    Miscellaneous
Expenditure    140     ?           ?             56      63
Group Index    180     150         100           110     80

13. Compute Fisher’s Ideal Index Number and prove that it satisfies the Time Reversal Test and Factor Reversal Test
(Answer: 134.41):

Year    Commodity:     A     B     C     D     E
2008    Price          10    12    18    20    22
2008    Consumption    49    25    10    5     8
2009    Price          12    15    20    40    45
2009    Consumption    50    20    12    2     5
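
Exercise 13 can be checked numerically with a short sketch. The helper names (`dot`, `fisher`) are illustrative; the index is kept as a ratio and multiplied by 100 only for display.

```python
from math import sqrt, isclose

p0 = [10, 12, 18, 20, 22]   # 2008 prices
q0 = [49, 25, 10, 5, 8]     # 2008 consumption
p1 = [12, 15, 20, 40, 45]   # 2009 prices
q1 = [50, 20, 12, 2, 5]     # 2009 consumption

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fisher(pa, qa, pb, qb):
    """Fisher index of period b on base a, returned as a ratio."""
    laspeyres = dot(pb, qa) / dot(pa, qa)
    paasche = dot(pb, qb) / dot(pa, qb)
    return sqrt(laspeyres * paasche)

p01 = fisher(p0, q0, p1, q1)   # price index, 2009 on 2008
p10 = fisher(p1, q1, p0, q0)   # price index with the periods swapped
q01 = fisher(q0, p0, q1, p1)   # quantity index: roles of p and q swapped

print(round(p01 * 100, 2))
print(isclose(p01 * p10, 1.0))                           # Time Reversal Test
print(isclose(p01 * q01, dot(p1, q1) / dot(p0, q0)))     # Factor Reversal Test
```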


14. Compute Fisher’s Ideal Index Number for the following five items (Answer: 266.615):

Commodity    Price (₹) 2008    Price (₹) 2009    Quantity 2008    Quantity 2009
A            16                40                100              120
B            4                 12                30               20
C            2                 4                 40               50
D            4                 10                20               16
E            2                 10                80               60

15. Construct Fisher’s Ideal Index Number and prove that it satisfies TRT & FRT (Answer: 165.71):

Year    Item:    Rice    Sugar    Oil
2000    Value    210     100      40
2000    Price    14      20       4
2008    Value    300     108      56
2008    Price    25      27       7

16. Compute Fisher’s Ideal Index Number and prove that it satisfies the TRT and FRT (Answer: 112.10):

Year            Item        A      B      C      D      E
Base Year       Price       10     12     20     18     28
Base Year       Value       200    108    260    144    280
Current Year    Value       300    220    250    140    320
Current Year    Quantity    25     22     10     7      10

17. Compute Fisher’s Ideal Index Number and prove that it satisfies the TRT and FRT (Answer: 219.12):

Year    Commodity:     A      B      C      D
2013    Price          20     40     10     50
2013    Expenditure    400    160    100    250
2014    Price          50     80     20     100
2014    Expenditure    750    400    240    600

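Quantitative Methods – Formulae

Arithmetic Mean

Direct Method
  Individual Series: $\bar{X} = \frac{\sum x}{n}$
  Discrete Series: $\bar{X} = \frac{\sum fx}{N}$
  Continuous Series: $\bar{X} = \frac{\sum fm}{N}$

Shortcut Method
  Individual Series: $\bar{X} = A + \frac{\sum d}{n}$; $d = x - A$
  Discrete Series: $\bar{X} = A + \frac{\sum fd}{N}$; $d = x - A$
  Continuous Series: $\bar{X} = A + \frac{\sum fd}{N}$; $d = m - A$

Step-Deviation Method
  Individual Series: -
  Discrete Series: $\bar{X} = A + \frac{\sum fd'}{N} \times i$; $d' = \frac{x - A}{i}$
  Continuous Series: $\bar{X} = A + \frac{\sum fd'}{N} \times i$; $d' = \frac{m - A}{i}$

Weighted Arithmetic Mean: $\bar{X} = \frac{\sum xw}{\sum w}$

Combined Arithmetic Mean: $\bar{X}_{(1,2)} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$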

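Median
Individual Series: $M = \left[\frac{n+1}{2}\right]^{\text{th}}$ term when n is odd, and $M = \frac{\left(\frac{n}{2}\right)^{\text{th}} \text{term} + \left(\frac{n}{2}+1\right)^{\text{th}} \text{term}}{2}$ when n is even.

Discrete Series: $M = \left[\frac{n+1}{2}\right]^{\text{th}}$ term

Continuous Series: $M = L + \frac{\frac{N}{2} - c.f.}{f} \times i$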

Mode
Individual Series: The variable that occurs most frequently.

Discrete Series: The value which has the greatest frequency in the neighborhood.
Continuous Series: Z or $M_0 = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i$; $\Delta_1 = |f_1 - f_0|$ and $\Delta_2 = |f_1 - f_2|$

Bi-modal Class: Z or M0 = 3 median − 2 mean

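Mean Deviation

Mean Deviation (MD)
  Individual Series: $MD = \frac{\sum |D|}{n}$, where $|D| = |x - \bar{x}|$ or $|x - M|$
  Discrete Series: $MD = \frac{\sum f|D|}{N}$, where $|D| = |x - \bar{x}|$ or $|x - M|$
  Continuous Series: $MD = \frac{\sum f|D|}{N}$, where $|D| = |m - \bar{x}|$ or $|m - M|$

Coefficient of MD (all series): $\frac{MD}{\bar{x}}$ or $\frac{MD}{M}$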

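Range
Range: L – S (where L = largest value and S = smallest value)
Coefficient of Range: $\frac{L - S}{L + S}$

Quartile Deviation
Interquartile Range: $IQR = Q_3 - Q_1$
Quartile Deviation: $QD = \frac{Q_3 - Q_1}{2}$
Coefficient of Quartile Deviation: $CQD = \frac{Q_3 - Q_1}{Q_3 + Q_1}$

First Quartile ($Q_1$)
  Individual & Discrete Series: $\left[\frac{n + 1}{4}\right]^{\text{th}}$ term
  Continuous Series: $L + \frac{\frac{N}{4} - c.f.}{f} \times i$

Third Quartile ($Q_3$)
  Individual & Discrete Series: $\left[\frac{3(n + 1)}{4}\right]^{\text{th}}$ term
  Continuous Series: $L + \frac{\frac{3N}{4} - c.f.}{f} \times i$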

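Standard Deviation

Direct Method
  Individual Series: $\sigma = \sqrt{\frac{\sum d^2}{n}}$; $d = x - \bar{x}$
  Discrete & Continuous Series: $\sigma = \sqrt{\frac{\sum fd^2}{N}}$; $d = x - \bar{x}$ or $m - \bar{x}$

Short-cut Method
  Individual Series: $\sigma = \sqrt{\frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2}$; $d = x - A$
  Discrete & Continuous Series: $\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2}$; $d = x - A$ or $m - A$

Step-Deviation Method
  Individual Series: -
  Discrete & Continuous Series: $\sigma = \sqrt{\frac{\sum fd'^2}{N} - \left(\frac{\sum fd'}{N}\right)^2} \times i$; $d' = \frac{x - A}{i}$ or $\frac{m - A}{i}$

Variance = $\sigma^2$
Coefficient of Variation: $CV = \frac{\sigma}{\bar{x}} \times 100$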

Coefficient of Skewness
For unimodal distribution: Karl Pearson’s Coefficient of Skewness, $Sk_p = \frac{\bar{X} - M_0}{\sigma}$

For bimodal distribution: Karl Pearson’s Coefficient of Skewness, $Sk_p = \frac{3(\bar{X} - M)}{\sigma}$

Bowley’s Coefficient of Skewness, $S_B = \frac{Q_3 + Q_1 - 2M}{Q_3 - Q_1}$

Karl Pearson’s Coefficient of Correlation

Using Actual Mean: $r = \frac{\sum d_x d_y}{\sqrt{\sum d_x^2 \times \sum d_y^2}}$

Using Assumed Mean: $r = \frac{\sum d_x d_y - \frac{\sum d_x \cdot \sum d_y}{N}}{\sqrt{\sum d_x^2 - \frac{(\sum d_x)^2}{N}} \times \sqrt{\sum d_y^2 - \frac{(\sum d_y)^2}{N}}}$

Probable Error: $P.E. = 0.6745 \times \frac{1 - r^2}{\sqrt{N}}$

Spearman’s Rank Correlation

Unique Ranks: $r_s = 1 - \frac{6 \sum d^2}{N^3 - N}$; $d = R_1 - R_2$

Tied Ranks: $r_s = 1 - \frac{6\left[\sum d^2 + \frac{1}{12}(m_1^3 - m_1) + \frac{1}{12}(m_2^3 - m_2) + \cdots + \frac{1}{12}(m_n^3 - m_n)\right]}{N^3 - N}$, where m = No. of tied ranks

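Regression
Equation X on Y: $(X - \bar{X}) = b_{xy}(Y - \bar{Y})$

Equation Y on X: $(Y - \bar{Y}) = b_{yx}(X - \bar{X})$

Formulae to Find the Regression Coefficients

Using Actual Mean: $b_{xy} = \frac{\sum d_x d_y}{\sum d_y^2}$; $b_{yx} = \frac{\sum d_x d_y}{\sum d_x^2}$

Using Assumed Mean: $b_{xy} = \frac{N \sum d_x d_y - \sum d_x \cdot \sum d_y}{N \sum d_y^2 - (\sum d_y)^2}$; $b_{yx} = \frac{N \sum d_x d_y - \sum d_x \cdot \sum d_y}{N \sum d_x^2 - (\sum d_x)^2}$

Using Standard Deviation: $b_{xy} = r \cdot \frac{\sigma_x}{\sigma_y}$; $b_{yx} = r \cdot \frac{\sigma_y}{\sigma_x}$

Coefficient of Correlation: $r = \pm\sqrt{b_{xy} \times b_{yx}}$ (r takes the sign of the regression coefficients)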

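Index Numbers
Simple Aggregative Method: $P_{01} = \frac{\sum p_1}{\sum p_0} \times 100$

Weighted Aggregative Method: $P_{01} = \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100$

Simple Average of Price Relatives Method: $P_{01} = \frac{\sum I}{n} = \frac{\sum \left(\frac{p_1}{p_0} \times 100\right)}{n}$

CPI/CLI: Aggregate Expenditure Method: $P_{01} = \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100$

Family Budget Method: $P_{01} = \frac{\sum IW}{\sum W}$ where $I = \frac{p_1}{p_0} \times 100$ and $W = p_0 q_0$

Fisher’s Ideal Index Number: $P_{01} = \sqrt{\frac{\sum p_1 q_0}{\sum p_0 q_0} \times \frac{\sum p_1 q_1}{\sum p_0 q_1}} \times 100$

Time Reversal Test: $P_{01} \times P_{10} = \sqrt{\frac{\sum p_1 q_0}{\sum p_0 q_0} \times \frac{\sum p_1 q_1}{\sum p_0 q_1}} \times \sqrt{\frac{\sum p_0 q_1}{\sum p_1 q_1} \times \frac{\sum p_0 q_0}{\sum p_1 q_0}} = 1$

Factor Reversal Test: $P_{01} \times Q_{01} = \sqrt{\frac{\sum p_1 q_0}{\sum p_0 q_0} \times \frac{\sum p_1 q_1}{\sum p_0 q_1}} \times \sqrt{\frac{\sum p_0 q_1}{\sum p_0 q_0} \times \frac{\sum p_1 q_1}{\sum p_1 q_0}} = \frac{\sum p_1 q_1}{\sum p_0 q_0}$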

Assignment 1: Classification & Tabulation
1. Prepare a blank table showing the number of persons leaving India to four different countries – USA,
Canada, Australia and to the Gulf countries for employment opportunities, according to sex from the four
metros – Mumbai, Kolkata, New Delhi and Chennai.

2. In 2012, the total number of visitors to the Wonder Land, Bangalore, was 25,000. Among them, there
were 8,600 female visitors from India and 6,500 foreign visitors out of which 3,500 were female visitors.
In 2013, the total number of visitors increased by 20% and that of Indian visitors increased by 10%.
Among them, there were 8,000 Indian male visitors and 6,000 foreign female visitors. Tabulate the data.

3. A survey of 370 students from Commerce faculty and 130 students from Science faculty revealed that 180
students were studying for only CA examinations, 140 for only Costing examinations and 80 for both CA
and Costing examinations. The rest opted for Part-time Management courses. Of those studying for
Costing, only 13 were girls and 90 boys belonged to Commerce faculty. Out of 80 studying for both CA and
Costing, 72 were from commerce faculty amongst which 70 were boys. Among those that opted for Part-
time Management courses, 50 boys were from Science faculty, and 30 boys and 10 girls were from
Commerce faculty. In all there were 110 boys in Science faculty. Present the above information in a
tabular form.

4. Prepare a frequency distribution from the following figures relating to bonus paid to workers (₹’000)

67 60 69 70 62 63 69 70 58 56 67 54

55 70 60 60 60 65 70 56 57 58 60 59

61 73 69 67 61 60 59 57

5. The following are the marks of 50 students in Statistics. Construct a suitable frequency table:

28 17 48 57 38 59 28 16 78 46 45 86

21 29 49 61 71 46 49 30 76 37 76 36

37 39 46 27 29 31 21 49 29 8 56 46

5 36 71 42 46 56 16 15 22 35 18 22

46 17

6. 25 values of two variables X and Y are given below. Form a two-way frequency table showing the
relationship between the two:

X 12 24 33 22 44 37 26 36 55 48 27

Y 140 256 360 470 470 380 280 315 420 390 440

X 57 21 51 27 42 43 52 57 44 48 48

Y 390 590 250 550 360 570 290 416 380 392 370

X 42 41 69

Y 312 330 590

Business Statistics | Assignments Page | 45


Assignment 2: Diagrammatic Representation
1. Represent the following data using a simple bar diagram:

Year 1974 1975 1976 1977 1978 1979 1980 1981

Production (tons) 45 40 44 41 49 42 55 50

2. Present the following data on profit before tax and after tax using multiple bar diagram:

Year 1979 1980 1981 1982 1983

Profit Before Tax (lakh ₹) 190 191 200 109 127

Profit After Tax (lakh ₹) 79 71 90 36 89

3. Represent the cost per scooter using sub-divided bar diagram and percentage sub-divided bar diagram:

Particulars 1979 1980 1981

Raw Material 2,160 2,600 2,700

Labor 540 700 810

Direct Expenses 360 200 360

Factory Expenses 360 300 360

Office Expenses 180 200 270

Total 3,600 4,000 4,500

4. Draw a pie diagram to represent the expenditure (in ₹) of a family:

Food Rent Clothing Education Lighting Miscellaneous Savings

540 180 180 90 40 40 10

5. Present the following data using three variable line graph:

Year 2009 2010 2011 2012 2013

Income (₹ ‘000) 150 180 160 190 170

Expenses (₹ ‘000) 90 100 120 190 200

Profit/loss (₹ ‘000) +60 +80 +40 0 -30

Assignment 3: Measures of Central Tendency
1. Find the mean, median and mode (Using G & A Table) of the following data:

Weight 58 60 61 62 63 64 65 66

No. of Persons 4 12 24 32 32 16 8 2

2. Find the mean using Direct, shortcut and Step-Deviation methods:

Wages 0 – 20 20 – 40 40 - 60 60 - 80 80 – 100

No. of Workers 82 112 150 95 48
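
The three methods asked for in question 2 should agree. A minimal sketch follows; the assumed mean A = 50 and the class width i = 20 are choices made here for illustration, and any other assumed mean gives the same result.

```python
classes = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
f = [82, 112, 150, 95, 48]

m = [(lo + hi) / 2 for lo, hi in classes]   # class mid-points
N = sum(f)

# Direct method: mean = sum(f*m) / N
mean_direct = sum(fi * mi for fi, mi in zip(f, m)) / N

# Shortcut method: mean = A + sum(f*d) / N, with d = m - A
A = 50
mean_shortcut = A + sum(fi * (mi - A) for fi, mi in zip(f, m)) / N

# Step-deviation method: mean = A + sum(f*d') / N * i, with d' = (m - A) / i
i = 20
mean_step = A + sum(fi * (mi - A) / i for fi, mi in zip(f, m)) / N * i

print(round(mean_direct, 2), round(mean_shortcut, 2), round(mean_step, 2))
```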

3. Find the mean, median and mode of the following data:

x 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60 60 – 70 70 - 80

f 5 8 7 12 28 20 10 10
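
For question 3, the grouped median and mode formulae can be sketched as below. The function names are illustrative, and the mode function assumes a single modal class with equal class widths.

```python
lower = [0, 10, 20, 30, 40, 50, 60, 70]   # lower class limits
f = [5, 8, 7, 12, 28, 20, 10, 10]
width = 10

def median_grouped(lower, f, width):
    """M = L + (N/2 - c.f.) / f * i, using the first class whose cumulative frequency reaches N/2."""
    half = sum(f) / 2
    cum = 0
    for L, fi in zip(lower, f):
        if cum + fi >= half:
            return L + (half - cum) / fi * width
        cum += fi

def mode_grouped(lower, f, width):
    """Z = L + d1 / (d1 + d2) * i, taking the class with the highest frequency as modal."""
    k = f.index(max(f))
    f0 = f[k - 1] if k > 0 else 0
    f2 = f[k + 1] if k + 1 < len(f) else 0
    d1, d2 = abs(f[k] - f0), abs(f[k] - f2)
    return lower[k] + d1 / (d1 + d2) * width

print(round(median_grouped(lower, f, width), 2))
print(round(mode_grouped(lower, f, width), 2))
```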

4. Find the mean, median and mode of the following data:

CI 4–7 8 – 11 12 – 15 16 – 19 20 – 23 24 – 27

Frequency 12 23 40 65 17 3

5. Find the mode of the following data:

Age below 5 10 15 20 25 30 35

No. of persons 24 56 84 100 132 142 150

6. 20% of the workers in a firm, employing a total of 4000 workers, earn less than ₹4 per hour, 880 earn
from ₹4 to ₹4.24 per hour, 24% earn from ₹4.25 to ₹4.49 per hour, 740 earn from ₹4.50 to ₹4.74 per hour,
12% earn from ₹4.75 to ₹4.99 per hour and the rest earn ₹5 or more per hour. Calculate the median.

7. Find the median and mode of the following data using Ogive curves and Histogram respectively:

Mid values 115 125 135 145 155 165 175 185 195

Frequency 6 25 48 72 116 60 38 22 3

8. Find the missing frequencies, if total frequency is 120 and median is 36.5:

CI 20 – 25 25 – 30 30 – 35 35 – 40 40 – 45 45 – 50 50 – 55 55 – 60

F 8 15 28 - 22 - 4 2

Answers

(1) 62.24, 62, 62 (2) 46.51 (3) 45, 46.43, 46.67 (4) 15.03, 15.81, 16.87 (5) 8.33 (6) 4.33
(7) 153.79, 154.4 (8) 30, 11

Assignment 4: Measures of Variation
1. Find the Interquartile Range, QD, CQD, and MD (using mean and median) from the following data:

Weight 58 59 60 61 62 63 64 65 66

No. of Persons 15 20 32 35 33 22 20 10 8

2. Find the Interquartile Range, QD, CQD, and MD (using mean and median) from the following data:

Value 90 – 99 80 – 89 70 – 79 60 – 69 50 – 59 40 – 49 30 – 39

Frequency 2 12 22 20 14 4 1

3. From the prices of shares of X and Y given below, state which share prices are more stable

X 55 54 53 53 56 68 52 50 51 49

Y 108 107 105 105 106 107 104 103 104 101
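
Question 3 compares stability through the coefficient of variation. A short sketch using Python's `statistics.pstdev`, which divides by n as the standard deviation formula in this book does:

```python
from statistics import mean, pstdev

x = [55, 54, 53, 53, 56, 68, 52, 50, 51, 49]
y = [108, 107, 105, 105, 106, 107, 104, 103, 104, 101]

def coefficient_of_variation(values):
    """CV = sigma / mean * 100, with sigma the population standard deviation."""
    return pstdev(values) / mean(values) * 100

cv_x = coefficient_of_variation(x)
cv_y = coefficient_of_variation(y)

# The share with the smaller CV has the more stable price.
more_stable = "X" if cv_x < cv_y else "Y"
print(round(cv_x, 2), round(cv_y, 2), more_stable)
```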

4. Find the coefficient of variation from the following data:

Wages up to       60    70    80    90    100    110    120    130
No. of workers    8     24    56    95    136    178    192    200

5. The life of two types of tyres in a sample survey is given below. Which one has a higher average? Based on
consistency, which one would you prefer?

Life (in ‘000 km) 5 – 10 10 – 15 15 – 20 20 – 25 25 – 30 30 – 35

Type A 10 18 32 40 22 18

Type B 18 22 40 32 18 10

6. Given below is the distribution of boys and girls of a school. Find which group is more variable.

Age in Years 13 14 15 16 17

No. of Boys 12 15 15 5 3

No. of Girls 13 10 12 2 1

Answers

(1) QD: 3, 1.5, 0.024, 1.74; MD: 1.74, 0.028, 1.713, 0.028 (2) QD: 18.02, 9.01, 0.132; MD: 10.41, 0.153, 10.437,
0.152 (3) CV(X) = 9.37%, CV(Y) = 1.91% (4) 16.781% (5) CV(A) = 33.35% & CV(B) = 37.13%
(6) CV(Boys) = 7.855% & CV(Girls) = 7.341%

Assignment 5: Measures of Skewness
1. Find Karl Pearson’s and Bowley’s Coefficients of Skewness: 25, 37, 48, 35, 22, 29, 37, 30, 41, 25

2. Find the Pearson’s and Bowley’s Coefficients of Skewness:

Age 12 14 15 18 21 24 26 27 31 33

No. of Persons 8 12 24 20 15 24 18 8 6 4

3. Find the Coefficient of Skewness from the following data using Pearson’s and Bowley’s methods:

Size 7 8 9 10 11 12 13 14

Frequency 2 11 36 64 39 30 22 2

4. Find the Skp and SB from the following data:

Marks Below        80     70     60     50    40    30    20    10
No. of Students    150    136    120    80    70    70    50    10

5. From the data given below, find the coefficient of skewness using both the methods:

X 23 – 27 28 – 32 33 – 37 38 – 42 43 – 47 48 – 52 53 – 57 58 – 62 63 – 67 68 – 72

F 2 6 9 14 32 16 12 6 2 1

6. From the data given below, find Skp and SB:

Marks Above        0      10    20    30    40    50    60    70    80    90
No. of students    100    89    73    64    52    49    32    20    12    5

7. Pearson’s coefficient of skewness is –0.7 and the value of the median and standard deviation are 12.8 and
6 respectively. Determine the value of mean.

8. In a distribution, mean = 65, median = 70 and Skp = – 0.6. Find i) SD, ii) Mode, iii) CV
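
Question 8 can be worked by rearranging the formulae. This sketch assumes the median-based form Skp = 3(mean − median)/σ and the empirical relation mode = 3 median − 2 mean from the formula sheet.

```python
mean_value, median_value, sk_p = 65, 70, -0.6

# Rearranging Skp = 3 * (mean - median) / sd gives the standard deviation.
sd = 3 * (mean_value - median_value) / sk_p

# Empirical relation: mode = 3 * median - 2 * mean
mode_value = 3 * median_value - 2 * mean_value

# Coefficient of variation = sd / mean * 100
cv = sd / mean_value * 100

print(round(sd, 2), mode_value, round(cv, 2))
```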

Answers

(1) 0.155, - 0.1818 (2) – 0.2597, – 0.0909 (3) 0.3665, 1 (4) – 0.7539, –0.3636 (5) 0.0572, 0.0615
(6) – 0.2276, – 0.1858 (7) 11.4 (8) 25, 80, 38.46

Assignment 6: Coefficient of Correlation
1. Calculate Karl Pearson’s Coefficient of Correlation and the probable error for the following data regarding
price and demand of a commodity:

Price 10 28 49 50 70 75 98 100 110 120

Demand 112 110 75 60 55 50 40 30 20 10

2. Find the coefficient of correlation and PE of the following data:

Age 20 – 25 25 – 30 30 – 35 35 – 40 40 – 45 45 – 50 50 – 55 55 – 60 60 – 65

Wages (‘000) 9 10 12 11 16 16 18 17 15

3. From the following data find the coefficient of correlation between average profits and average
advertisement expenditure per shop and interpret the result.

No. of Shops              30        45         14        26        12        16        22        35
Total Profits             60,000    135,000    42,000    52,000    36,000    64,000    66,000    105,000
Advertisement Expenses    3,000     45,000     7,000     13,000    6,000     4,800     8,800     14,000

4. With the following data in 6 cities, calculate the Coefficient of Correlation between the density of
population and death rates.

Cities                   A      B       C      D      E       F
Density of Population    200    500     400    700    600     300
Population (‘000)        30     90      40     42     72      24
No. of deaths            300    1440    560    840    1224    312

5. Calculate the Rank Correlation and the Probable Error:

Analyst A 15 18 12 22 15 21 15 27 16 24

Analyst B 16 19 17 21 19 26 12 16 18 20

6. Using Rank Correlation find out which pair of judges have a nearly common taste in fashion design.

Judge A 1 3 2 5 8 7 9 4 10 6

Judge B 3 5 4 6 7 9 8 1 2 10

Judge C 5 6 2 3 8 7 10 4 1 9
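
For question 6, the unique-ranks formula can be applied to each pair of judges. A minimal sketch with an illustrative function name; it assumes there are no tied ranks, which holds for this data.

```python
from itertools import combinations

ranks = {
    "A": [1, 3, 2, 5, 8, 7, 9, 4, 10, 6],
    "B": [3, 5, 4, 6, 7, 9, 8, 1, 2, 10],
    "C": [5, 6, 2, 3, 8, 7, 10, 4, 1, 9],
}

def spearman_unique(r1, r2):
    """rs = 1 - 6 * sum(d^2) / (N^3 - N), valid when no ranks are tied."""
    n = len(r1)
    d_squared = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * d_squared / (n ** 3 - n)

results = {pair: spearman_unique(ranks[pair[0]], ranks[pair[1]])
           for pair in combinations(ranks, 2)}
closest_pair = max(results, key=results.get)

for pair, rs in results.items():
    print(pair, round(rs, 4))
print("Most similar taste:", closest_pair)
```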

Answers

(1) – 0.975 (2) 0.855 (3) 0.141 (4) + 0.988 (5) 0.4182 (6) 0.3455, 0.7697, 0.2727

Assignment 7: Regression
1. The following data relate to the ages of husbands and wives:

Husband’s age 25 28 30 32 35 36 38 39 42 55

Wife’s age 20 26 29 30 25 18 26 35 35 46

Obtain the two regression equations and determine the most likely age of husband when the wife’s age is
25 years.

2. From the following data:


a. Find the two regression equations
b. Estimate the value of X when Y = 20 and the value of Y when X = 30
c. Determine the coefficient of correlation

X 20 24 26 34 36

Y 10 12 14 18 26
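
Question 2 can be sketched with the actual-mean formulae for the regression coefficients. The variable and function names below are illustrative.

```python
from math import sqrt, copysign

X = [20, 24, 26, 34, 36]
Y = [10, 12, 14, 18, 26]
n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n

# Sums of products and squares of deviations from the actual means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
s_xx = sum((x - mean_x) ** 2 for x in X)
s_yy = sum((y - mean_y) ** 2 for y in Y)

b_xy = s_xy / s_yy   # regression coefficient of X on Y
b_yx = s_xy / s_xx   # regression coefficient of Y on X

def x_on_y(y):
    return mean_x + b_xy * (y - mean_y)

def y_on_x(x):
    return mean_y + b_yx * (x - mean_x)

# r is the geometric mean of the two coefficients and carries their sign.
r = copysign(sqrt(b_xy * b_yx), b_xy)

print(round(x_on_y(20), 3), round(y_on_x(30), 3), round(r, 4))
```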

3. Find the regression lines for the following data and estimate the value of X when Y = 38.

X 25 28 35 32 36 37 29 39

Y 43 46 49 41 36 32 31 32

4. The heights (in cm) and weights (in kg) of a random sample of 9 adult males are shown below:

Height 177 163 173 182 171 168 174 176 184

Weight 71 67 77 85 69 62 73 78 80

Estimate the height when the weight is 75 and the weight when the height is 180.

5. A study of wheat prices per kg at Mysore and Bengaluru yields the following data:

Mysore Bengaluru

Average Price ₹ 24.63 ₹ 27.97

Standard Deviation ₹ 3.26 ₹ 2.07

Correlation Coefficient: 0.774

Estimate:
a. The price of wheat at Mysore when the price is ₹ 23.54 at Bengaluru.
b. The Price of wheat at Bengaluru when the price is ₹ 30.5 at Mysore.

Answers

(1) 32.6956 years (2) 32; 17.739 (3) 32.839 (4) 175.315 cm; 78.75 kg (5) ₹19.23; ₹30.855

Assignment 8: Index Numbers
1. Calculate the index using Simple Aggregate and Weighted Aggregate Methods:

Commodity    Price (₹) 1999    Price (₹) 2000    Quantity 1999
Rice         30                40                10
Wheat        20                30                5
Pulses       40                50                6
Oil          35                40                5
Milk         40                50                10
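
The two methods in question 1 differ only in whether base-year quantities are used as weights. A brief sketch with illustrative variable names:

```python
p0 = [30, 20, 40, 35, 40]   # 1999 prices
p1 = [40, 30, 50, 40, 50]   # 2000 prices
q0 = [10, 5, 6, 5, 10]      # 1999 quantities

# Simple Aggregative Method: sum(p1) / sum(p0) * 100
simple_index = sum(p1) / sum(p0) * 100

# Weighted Aggregative Method: sum(p1*q0) / sum(p0*q0) * 100
weighted_index = (sum(a * q for a, q in zip(p1, q0))
                  / sum(b * q for b, q in zip(p0, q0)) * 100)

print(round(simple_index, 2), round(weighted_index, 2))
```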

2. Calculate the price index numbers for the following data for 2007 and 2008 using simple average of price
relative method:

Rice Wheat Pulses Oil Milk


Prices – 2001 35 30 25 15 40
Prices – 2002 40 40 35 25 50

3. Calculate the Consumer Price Index or Cost of Living Index Number using the Aggregative Expenditure Method and the
Family Budget Method:

A B C D E
Quantity – 2004 50 100 60 30 40
Prices – 2004 6 2 4 10 8
Prices – 2009 10 2 6 12 12

4. The group indices and the corresponding weights for the working class cost of living index numbers in an
industrial city for 2009 and 2010 are as follows:

Group            Weight    Group Index 2009    Group Index 2010
Food             71        370                 380
Clothing         3         423                 504
Fuel             9         469                 336
House Rent       7         110                 116
Miscellaneous    10        279                 283

Compute the cost of living index number for 2009 and 2010. If a worker was getting ₹3000 per month in
2009, should he be given any extra allowance in 2010 so that he can maintain his 2009 standard of living?
Justify your answer.

5. The following table gives the cost of living index numbers for different groups with their respective
weights for the year 1992 (base year 1982). Calculate the overall cost of living index numbers.
If Mr. Bose got ₹550 in 1982, determine how much he should receive in 1992 to maintain the same
standard of living as in 1982.

Fuel &
Food Clothing Housing Miscellaneous
Lighting
Cost of Living Index 525 325 240 180 200
Weight 40 16 15 20 9

6. The relative importance of the following 8 groups of family expenditure is tabulated below. If the corresponding
increase in prices (in %) for February, 1992 compared to January 1992, are 25, 1, 22, 18, 14, 13, 20 and 11, calculate
the CPI:

Food Rent Clothing Fuel Household Miscellaneous Services Drinks

348 88 97 65 71 35 70 217

7. Compute Fisher’s Ideal Index Number and show that it satisfies the TRT & FRT:

Commodity    Price (₹) 2004    Consumption (kg) 2004    Price (₹) 2008    Consumption (kg) 2008
A            8                 6                        12                4
B            10                8                        12                8
C            14                4                        18                4
D            4                 6                        2                 10
E            10                10                       14                8

8. Compute Fisher’s Ideal Index Number and prove that it satisfies the TRT & FRT:

         Year    A     B     C     D
Price    2000    2     4     1     5
Price    2010    5     8     2     10
Value    2000    40    16    10    25
Value    2010    75    40    24    60

9. A worker earns ₹750 per month. The cost of living index for January, 2009 is known to be 160. Using the data given
below, find the amounts spent by him on food and house rent.

Food Clothing House Rent Fuel & Light Miscellaneous

Expenditure ? 125 ? 100 75

Group Index 190 181 140 118 101
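
Question 9 reduces to two equations in two unknowns. The sketch assumes the whole wage is spent across the five groups and that each group's expenditure acts as its weight in the index.

```python
income = 750
overall_index = 160

# Known groups: (expenditure, group index)
known = {"Clothing": (125, 181), "Fuel & Light": (100, 118), "Miscellaneous": (75, 101)}
index_food, index_rent = 190, 140

# Equation 1: food + rent = income - known expenditure
remaining_spend = income - sum(e for e, _ in known.values())

# Equation 2: index_food*food + index_rent*rent = overall_index*income - known weighted sum
remaining_weighted = overall_index * income - sum(e * i for e, i in known.values())

# Substitute rent = remaining_spend - food into equation 2 and solve for food.
food = (remaining_weighted - index_rent * remaining_spend) / (index_food - index_rent)
rent = remaining_spend - food

print(food, rent)
```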

Answers

(1) 127.27 & 127.57 (2) 135.86 (3) 139.71 (4) 353.2 & 351.58; No (5) 352; 1936.00 (6) 117.49
(7) 124.01 (8) 219.12 (9) 300, 150

Business Statistics | University Question Papers Page | 57