BBA 201 Business Mathematics and Statistics
STATISTICS
MCM - 105BCIBF-202
B.Com- 202/ BBA-201/
PROGRAMME COORDINATOR
Dr. Sabiha Khatoon, CDOE, Jamia Millia Islamia
COURSE WRITERS
K.B. Akhilesh, Professor, Department of Management Studies, Indian Institute of Science, Bengaluru
Units: (1.1-1.2, 2.1-2.2, 2.4-2.8)
S. Balasubrahmanyam, Research Scholar, Department of Management Studies, Indian Institute of Science, Bengaluru
Units: (1.1-1.2, 2.1-2.2, 2.4-2.8)
V.K. Khanna, Associate Professor, Deptt. of Mathematics, Kirori Mal College, University of Delhi
Units: (1.3-1.8, 3, 4.1-4.2, 4.4-4.8, 5.1-5.2, 5.4-5.8, 6.1-6.2, 6.4-6.8)
S.K. Bhamari, Associate Professor, Deptt. of Mathematics, Kirori Mal College, University of Delhi
Units: (1.3-1.8, 3, 4.1-4.2, 4.4-4.8, 5.1-5.2, 5.4-5.8, 6.1-6.2, 6.4-6.8)
Dr. Pratiksha Saxena, Assistant Professor, School of Applied Sciences, Gautam Buddha University, Greater Noida
Units: (2.3, 4.3, 5.3, 6.3, 7, 8)
J.S. Chandan, Professor, Medgar Evers College, City University of New York
Units: (9, 13, 14)
Neeru Sood, Freelance Author
Units: (10-12)
Dr. (Mrs.) Vasantha R. Patri, Former Faculty of Psychology, Lady Shri Ram College, Delhi University (1971-2001);
Chairperson, Indian Institute of Counselling
Unit: (15)
C.R. Kothari, Ex-Associate Prof - Department of Economic Administration & Financial Management, University of Rajasthan
Units: (16-18)
All rights reserved. Printed and published on behalf of the CDOE, Jamia Millia Islamia by Hi-Tech Graphics, New Delhi
March, 2023
ISBN: 978-93-5259-718-5
All rights reserved. No part of this book may be reproduced in any form or by any means, electronic or mechanical, including
photocopying, recording or by any information storage or retrieval system, without permission in writing from the CDOE,
Jamia Millia Islamia, New Delhi.
Cover Credits: Anupama Kumari, Faculty of Fine Arts, Jamia Millia Islamia
SYLLABI-BOOK MAPPING TABLE
Business Mathematics and Statistics
Syllabi: Mapping in Book
Block III: Basic Statistical Concepts
Unit-9: Meaning and Scope of Statistics (Pages 225-242)
Unit-10: Organizing a Statistical Survey (Pages 243-266)
Unit-11: Accuracy, Approximation and Errors (Pages 267-290)
Unit-12: Ratios, Percentages and Rates (Pages 291-304)
BLOCK-I
FUNCTION AND PROGRESSION
This block discusses functions and progressions. A function in mathematics is a relation
or expression involving one or more variables, while a progression is a series with
a definite pattern of advance. The block deals systematically with functions, arithmetic
progressions and series, and geometric progressions and series. It consists of three units.
The first unit explains functions and progressions. It begins by explaining the nature of
functions: a function assigns to each value of one variable a definite value of another
variable, and is denoted by a common notation such as y = f(x). The various types of functions, their
characteristics, graphical representations and solution sets of linear equations and inequalities
are discussed in detail here. A few examples dealing with functions and variables are
also solved for a better understanding.
The second unit discusses arithmetic progression and series. An arithmetic progression is a
sequence in which each term is obtained by adding a fixed number to the previous term. This
fixed number is called the common difference. The unit discusses some standard
results of arithmetic progression, geometric progression and its properties, arithmetico-geometric
series and its importance, and the sums of terms of an arithmetic series. Solved
examples on the topics are discussed for a better understanding.
The third unit examines geometric progression and series. A geometric progression, which is
also known as a geometric sequence, is a sequence of numbers where each term after the
first is obtained by multiplying the previous one by a fixed, non-zero number. This fixed
number is called the common ratio. The unit discusses geometric progressions and geometric means. It
also carries solved examples on the sum of n terms of a geometric progression and the sum
to infinity of a geometric progression.
1
Function and Progression
Objectives
After going through this unit, you will be able to:
• Discuss the properties of functions
• Analyze even and odd functions
• Assess the properties of logarithmic function
Structure
1.1 Introduction
1.2 Functions
1.3 Types of Function
1.4 Summary
1.5 Key Words
1.6 Answers to ‘Check Your Progress’
1.7 Self-Assessment Questions
1.8 Further Readings
1.1 INTRODUCTION
1.2 FUNCTIONS
Remark
Mathematically, the foregoing demand function can be transformed into
another in which Px is expressed as a function of Dx. This does not, however,
carry any practical significance because, in general, it is the price (the independent
or exogenous variable) that can be directly manipulated rather than the
demand, which is the dependent or endogenous variable.
Note
The set of values of x for which the value of the function y = f(x) is determined, is
called the domain of the function, while the set of values of y is called the range of
the function.
Interval of a Variable
The range of values that a variable can take, be it a closed (or semi-closed) interval
or an open (semi-open) interval or a combination of such intervals is known as the
interval of the variable.
Thus, if a variable ‘x’ can take any value between two real numbers a and b (a < b),
inclusive of both the values, then such an interval can be written as follows:
a ≤ x ≤ b or [a, b]
Using the notation of sets, it can be written as
{x ∈ R | a ≤ x ≤ b} or x ∈ [a, b]
Similarly, the other possibilities can be expressed as follows:
{x ∈ R | a < x < b} as x ∈ (a, b)
{x ∈ R | a ≤ x < b} as x ∈ [a, b)
and {x ∈ R | a < x ≤ b} as x ∈ (a, b]
Classification of functions
15. The even functions form a commutative algebra over the reals. However,
the odd functions do not form an algebra over the reals.
Explicit function
If the dependent variable y is expressed directly in terms of the independent variable
x, then y is called an explicit function of x and is written as y = f(x).
e.g., $y = 2x + 3$ and $y = 4x^2 + 7x - 8$ are explicit functions of x.
Implicit function
When x and y both occur together in an equation but y is not capable of being
directly expressed in terms of x, then y is said to be an implicit function of x.
e.g., $x^3 + 3x^2y + 3xy^2 + y^2 = 0$ defines y as an implicit function of x.
When the form of an implicit function is not specified, it is written as
f(x, y) = 0.
Inverse function
If y is a function of x, then x can in turn be regarded as a function of y.
The latter is called the inverse function of the former, i.e., if y = f(x), then
x = g(y)
e.g., If $y = a^x$, then $x = \log_a y$.
If $y = \dfrac{2x + 3}{x - 5}$, then $x = \dfrac{3 + 5y}{y - 2}$.
Convex function
A function f(x) defined over a convex set S (Note 3) is said to be a convex function
if for any two points x1 and x2 lying in S and for any 0 ≤ λ ≤ 1,
f(λx1 + (1 – λ)x2) ≤ λ f(x1) + (1 – λ) f(x2)
Concave Function
A function f (x) is said to be concave
if – f (x) is convex.
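The defining inequality can be spot-checked numerically. The following is only an illustrative sketch: the helper name `is_convex_on_samples`, the sample points and the tolerance are assumptions made here, and passing the check on a finite grid suggests, but does not prove, convexity.

```python
def is_convex_on_samples(f, points, steps=10, tol=1e-12):
    """Check f(l*x1 + (1-l)*x2) <= l*f(x1) + (1-l)*f(x2) on a finite grid."""
    for x1 in points:
        for x2 in points:
            for k in range(steps + 1):
                lam = k / steps  # lam runs over 0, 0.1, ..., 1
                left = f(lam * x1 + (1 - lam) * x2)
                right = lam * f(x1) + (1 - lam) * f(x2)
                if left > right + tol:
                    return False  # inequality violated: not convex
    return True


samples = [-3.0, -1.0, 0.0, 0.5, 2.0]  # hypothetical sample points
print(is_convex_on_samples(lambda x: x * x, samples))     # x^2 passes the check
print(is_convex_on_samples(lambda x: -(x * x), samples))  # -x^2 fails it (it is concave)
```

Since –f is convex exactly when f is concave, the same helper can be applied to –f to test concavity.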
Characteristics of Function
Domain and range are the two main characteristics of a function.
A function describes any situation in which one quantity depends on another. For
example, the height of a person depends on his age. The distance an object travels
in four hours depends on its speed. When such relationships exist, one variable is
said to be a function of the other. Therefore, height is a function of age and distance
is a function of speed.
The relationship between the two sets of numbers of a function can be
represented by a mathematical equation. Consider the relationship of the area of a
square to its side. This relationship is expressed by the equation $A = x^2$. Here, A,
the value for the area, depends on x, the length of a side. Consequently, A is called
the dependent variable and x is the independent variable. In fact, for a relationship
Linear Quadratic
To discuss the concept of linear equations more formally, firstly we define a linear
expression.
Definition 1. Any expression of the type ax + by + c, a, b, c in R and at least one
of a and b is non-zero, is called a linear expression (to be more precise, a linear
expression in x and y over the reals).
Definition 2. An equation of the type ax + by + c = 0, where a, b, c ∈ R, is
called a linear equation.
In other words, a linear equation is obtained by equating a linear expression to zero.
Similarly, an inequality of the type ax + by + c > 0 or ax + by + c < 0 is called a
linear inequation (more precisely a linear inequation in x and y over the reals).
Thus, 3x + 5y + 7 = 0, 2x – 1 = 0, 3y + 11 = 0, x + y – 2 = 0 are some linear
equations, while x > 0, 4x – 3y + 1 < 0, 2x – 3y + 11 > 0, $x - 1.5y + \tfrac{1}{2} > 0$ and
$3.78x - 2\tfrac{1}{3} < 0$ are some linear inequalities.
Solution Sets of Linear Equations and Inequalities
In this section, we explain what we mean by the solution set of a linear equation or a
linear inequality or of a system of linear equations and linear inequalities.
Firstly we recall the definition of an ordered pair.
Definition 3. By an ordered pair (a, b) of real numbers a and b we mean a set
{{a}, {a, b}}.
Thus (a, b) is a set with two elements namely the set {a} and the set {a, b}. With
the help of this definition it can be proved that two ordered pairs (a, b) and (c, d) are
equal if and only if a = c and b = d.
Note: Some authors take this property as the defining property for ordered pairs.
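The set-theoretic definition of an ordered pair can be tried out directly. The sketch below is illustrative only; the function name `ordered_pair` is an assumption, and Python's `frozenset` stands in for a mathematical set.

```python
def ordered_pair(a, b):
    """Encode (a, b) as the set {{a}, {a, b}} of Definition 3."""
    return frozenset({frozenset({a}), frozenset({a, b})})


print(ordered_pair(2, 1) == ordered_pair(1, 2))  # False: order matters
print(ordered_pair(2, 1) == ordered_pair(2, 1))  # True: same a and same b
```

When a = b the two inner sets coincide, so the encoded pair has a single element, yet equality of pairs still holds exactly when both components agree.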
Example 1.1: The plane of co-ordinate geometry is the set of all ordered pairs (x, y)
with x, y ∈ R. For any point P in this plane, its co-ordinates determine an ordered
pair (a, b) where a is the abscissa of P and b is the ordinate of P. Also for any ordered
pair of real numbers (c, d) there is exactly one point Q in the plane of co-ordinates
whose x co-ordinate is c and y co-ordinate is d. We note that the points (2, 1) and (1,
2) are different. In general points (a, b) and (b, a) are different whenever a ≠ b. This
explains why the co-ordinates of a point form an ordered pair.
Definition 4. Let ax + by + c = 0 be a linear equation. Then the set of all ordered
pairs (x1, y1) of real numbers such that ax1 + by1 + c = 0 is called the solution set of
the linear equation ax + by + c = 0.
Thus, (2, 1) is an element of the solution set of the equation 3x – 4y – 2 = 0, since
3·2 – 4·1 – 2 = 6 – 4 – 2 = 0. Again, (1, 2) is not in the solution set of the same equation,
as 3·1 – 4·2 – 2 = – 7 ≠ 0. Let S be the solution set of 3x – 4y – 2 = 0; then S = {(2, 1),
(6, 4), (3 1/3, 2), (– 2/3, – 1), ...}. Since it is impossible to enumerate all the ordered pairs
(x1, y1) satisfying 3x1 – 4y1 = 2, this notation for S does not convey the actual
size of the solution set. Note that
\[ (x_1, y_1) \in S \iff 3x_1 - 4y_1 = 2 \iff x_1 = \frac{2(1 + 2y_1)}{3}. \]
So we can write
\[ S = \left\{ (x_1, y_1) \;\middle|\; x_1 = \frac{2(1 + 2y_1)}{3} \right\} \]
or S = {(x1, y1) | 3x1 – 4y1 – 2 = 0}.
Similarly, we can define the solution set of the system 2x – 1 = 0 and 4x + y + 1 = 0 as the set S
of ordered pairs (x1, y1) such that 2x1 – 1 = 0 and 4x1 + y1 + 1 = 0. It can easily be
verified that here S consists of only one ordered pair, namely (1/2, – 3).
Definition 5. The set of all ordered pairs (x1, y1) of real numbers, such that
ax1 + by1 + c > 0 is called the solution set of the linear inequality ax + by + c > 0.
For example, (5, 1) is in the solution set of the inequality 2x – y – 7 > 0, since 2·5 – 1 – 7 = 2 > 0, while (1, 4)
is not in its solution set, since 2·1 – 4 – 7 = – 9 < 0.
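Membership in the solution set of a linear inequality is a simple substitution, which can be sketched as follows. The helper name `in_solution_set` is hypothetical; it tests the strict inequality ax + by + c > 0 of Definition 5.

```python
def in_solution_set(a, b, c, point):
    """Return True if (x1, y1) satisfies a*x1 + b*y1 + c > 0."""
    x1, y1 = point
    return a * x1 + b * y1 + c > 0


# The inequality 2x - y - 7 > 0 from the text
print(in_solution_set(2, -1, -7, (5, 1)))  # True:  2*5 - 1 - 7 = 2 > 0
print(in_solution_set(2, -1, -7, (1, 4)))  # False: 2*1 - 4 - 7 = -9
```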
[Fig. 1.1: Graph of the line x + y + 1 = 0, passing through the points (–3, 2), (–2, 1), (–1, 0), (0, –1) and (2, –3)]
Let P (x1, y1) be any point. Draw PQ parallel to x-axis to meet line x – y + 1 = 0
at Q (Fig. 1.2). Let the coordinates of Q be (x2, y1). Since Q (x2, y1) lies on x – y +
1 = 0, we get x2 – y1 + 1 = 0. P will lie on right of x – y + 1 = 0 if and only if x1 > x2
⇔ x1 > y1 – 1 ⇔ y1 < x1 + 1. Thus, point (x1, y1) is in solution set of x – y + 1 > 0
if and only if it lies on the right of line x – y + 1 = 0. Thus, the shaded portion (excluding
line x – y + 1 = 0) depicts the solution set of x – y + 1 > 0. Similarly, it can be verified
that a point (x1, y1) lies on left of x – y + 1 = 0 if and only if x1 < y1 – 1 ⇔ y1 > x1 +
1. So the unshaded portion is the graphical representation of the linear inequality x – y
+ 1 < 0. Note that the shaded portion together with the line x – y + 1 = 0 represents the
solution set of x – y + 1 ≥ 0, i.e., the set of points satisfying either x – y + 1 = 0 or x – y
+ 1 > 0.
Sometimes the solution set of a system is empty. In such cases no graphical
representation is possible. Consider the following example.
Example 1.2: Find the solution set of x – 1 = 0 and x < 0.
Solution: For all the points (x1, y1) lying in the solution set of x – 1 = 0, x1 = 1, while
for all points (x1, y1) satisfying x < 0 we must have x1 < 0. But 1 is never less than 0.
Hence no x1 exists which simultaneously satisfies x1 = 1 and x1 < 0. Thus, we cannot
get any point in the graphical representation of the solution of x = 1 and x < 0. In this case
the solution set is the empty set.
[Fig. 1.2: Graph of the line x – y + 1 = 0 through the points (–1, 0), (0, 1), (1, 2) and (2, 3), showing the point P and the point Q on the line]
[Fig. 1.3: Graph of the line 4x + 3y = 6, with the points (–3, 6), (–6, 2), (0, 2) and (3, –2) marked]
Clearly, the points (– 3, 6) and (– 6, 2) are such that their co-ordinates satisfy 4x
+ 3y ≤ 6, as 4(– 3) + 3 (6) = – 12 + 18 = 6 and 4(– 6) + 3(2) = – 18 < 6. We mark
these points by black dots.
Example 1.5: Find the graph of x + 2y – 5 < 0, 4x – y < 2 and y > 0. On the graph
mark three points which satisfy these inequalities.
Solution: x + 2y – 5 = 0 ⇒ y = (5 – x)/2, so for x = 0, 1, 2, 3, ...; y = 5/2, 2, 3/2, 1, etc.
Plot the points (0, 5/2), (1, 2), (2, 3/2), (3, 1) and join them to obtain the
graph of x + 2y – 5 = 0.
Again 4x – y = 2 ⇒ y = 4x – 2 so for x = 0, 1, 2, 3, ...; y = –2, 2, 6, 10, ... . Plot
the points (0, – 2), (1, 2), (2, 6), (3, 10) and join them to get the graph of 4x – y = 2.
Finally, y = 0 is the x-axis, i.e., X′OX. Now the y co-ordinate of any point is positive
if and only if that point lies above the x-axis. Further, (x1, y1) satisfies x + 2y – 5 < 0 if and
only if it lies on the left of the line x + 2y – 5 = 0. Similarly, (x1, y1) satisfies 4x – y < 2
if and only if the point (x1, y1) lies on the left of the line 4x – y – 2 = 0. Hence, the
solution set is the shaded portion of the figure excluding the lines y = 0, x + 2y = 5 and
4x – y = 2. The ordered pairs (1/2, 2), (– 1, 1), (– 5, 4) are in the solution set of the
given system, since 1/2 + 2·2 – 5 = – 1/2 < 0, 4·(1/2) – 2 = 0 < 2 and 2 > 0; – 1 + 2·1 – 5 =
– 4 < 0, 4(– 1) – 1 = – 5 < 2 and 1 > 0; – 5 + 2·4 – 5 = – 2 < 0, 4(– 5) – 4 = – 24
< 2 and 4 > 0. The points corresponding to these pairs are shown by black dots in
Figure 1.4.
Example 1.6: Find the solution set of the following system of inequalities and represent
the solution set by graph.
3x + y < 13, 7y + x > 11, 3y ≤ 9 + x.
Solution: Firstly, we draw lines 3x + y = 13, 7y + x = 11 and 3y = 9 + x.
Now, 3x + y < 13 is represented by the region in the left side of line 3x + y = 13;
7y + x > 11 is represented by the portion of plane on right side of line 7y + x = 11 and
3y ≤ 9 + x is represented by portion of plane on right of line 3y = 9 + x together with
the line 3y = 9 + x. Hence, the solution set is the interior of triangle ABC (shown by
shaded portion) and the portion of line 3y = 9 + x between the points A, C (but
excluding A and C). Note that coordinates of A, B, C are respectively equal to (– 3,
2), (4, 1), (3, 4). These are obtained by solving the pair of lines 3y = 9 + x and 7y +
x = 11; 7y + x = 11 and 3x + y = 13; 3x + y = 13 and 3y = 9 + x. The point A is not
in the solution set of the given system, since 7(2) + (– 3) = 11 ≯ 11 (where ≯ stands
for 'not greater than'). Also C is not in the solution set as 3(3) + 4 = 13 ≮ 13 (where
≮ stands for 'not less than').
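The vertices A, B and C of Example 1.6 can be checked by solving each pair of boundary lines. The sketch below uses Cramer's rule with exact fractions; the names `intersect` and `satisfies`, and the test point (2, 3), are assumptions introduced for illustration.

```python
from fractions import Fraction


def intersect(l1, l2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("lines are parallel or coincident")
    return Fraction(c1 * b2 - c2 * b1, det), Fraction(a1 * c2 - a2 * c1, det)


def satisfies(point):
    """Test the system 3x + y < 13, 7y + x > 11, 3y <= 9 + x."""
    x, y = point
    return 3 * x + y < 13 and 7 * y + x > 11 and 3 * y <= 9 + x


line1 = (3, 1, 13)   # 3x + y = 13
line2 = (1, 7, 11)   # x + 7y = 11
line3 = (-1, 3, 9)   # -x + 3y = 9, i.e. 3y = 9 + x

A = intersect(line3, line2)
B = intersect(line2, line1)
C = intersect(line1, line3)
print(A, B, C)
print(satisfies(A), satisfies(B), satisfies(C))  # the vertices are excluded
print(satisfies((2, 3)))                         # a point inside the triangle
```

Exact fractions are used so that boundary cases such as 11 > 11 are decided without rounding error.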
[Fig. 1.4: Graphs of the lines 4x – y = 2 and x + 2y – 5 = 0, with the points (2, 6), (–5, 4), (1/2, 2), (–1, 1), (3, 1), (5, 0) and (0, –2) marked]
Quadratic Equation
An equation of degree two is called a quadratic equation.
Note: In this section we shall be mainly dealing with quadratic equations having rational
numbers as coefficients.
There are two types of quadratic equations: (1) Pure and (2) Affected.
A quadratic equation is called pure if it does not contain the first power of x. In
other words, in a pure quadratic equation the coefficient of x must be zero. Thus a pure
quadratic equation is of the type $ax^2 + b = 0$ with a ≠ 0.
A quadratic equation which is not pure is called an affected quadratic equation.
Thus the most general form of an affected quadratic equation is $ax^2 + bx + c = 0$,
with ab ≠ 0. (Recall that ab ≠ 0 ⇔ a ≠ 0 and b ≠ 0).
Root. A complex number α is called a root of $ax^2 + bx + c$ if $a\alpha^2 + b\alpha + c = 0$.
For a pure quadratic equation,
\[ ax^2 = -b \;\Rightarrow\; x^2 = -\frac{b}{a} \;\Rightarrow\; x = \pm\sqrt{-\frac{b}{a}}. \]
It is clear that, for b ≠ 0, the roots of $ax^2 + b$ are real if and only if a and b are of opposite
signs.
Thus, we have
\[ x^2 + \frac{b}{a}x + \frac{b^2}{4a^2} = \frac{b^2}{4a^2} - \frac{c}{a} \]
or
\[ \left(x + \frac{b}{2a}\right)^2 = \frac{b^2 - 4ac}{4a^2}. \]
This is a pure equation in the variable $x + \frac{b}{2a}$.
So the solution is
\[ x + \frac{b}{2a} = \frac{\pm\sqrt{b^2 - 4ac}}{2a} \quad\text{or}\quad x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \]
Note: This method is useful particularly when $ax^2 + bx + c$ cannot be factored into linear factors
easily.
Hence, roots are
\[ x = \frac{-3 \pm \sqrt{3^2 - 4(2)(-1)}}{2 \cdot 2} = \frac{-3 \pm \sqrt{17}}{4}. \]
Nature of Roots
The roots of $ax^2 + bx + c = 0$ are given by $\dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}$. The expression inside the
radical sign, i.e., $b^2 - 4ac$, where a, b, c ∈ R, is called the discriminant.
Case I. $b^2 - 4ac > 0$, i.e., $b^2 > 4ac$.
In this case $\sqrt{b^2 - 4ac}$ is a non-zero real number. Hence, the two roots of the given equation
are unequal and real.
Case II. $b^2 - 4ac = 0$, i.e., $b^2 = 4ac$.
In this case both the roots are real and equal (each equal to – b/2a).
Case III. $b^2 - 4ac < 0$, i.e., $b^2 < 4ac$.
In this case $\sqrt{b^2 - 4ac}$ is an imaginary number and so both the roots are complex
and unequal.
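The three cases can be summarised in a short routine. This is a minimal sketch assuming real coefficients with a ≠ 0; the function name `quadratic_roots` and the wording of the returned labels are choices made here, not part of the text.

```python
import cmath


def quadratic_roots(a, b, c):
    """Classify and compute the roots of a*x^2 + b*x + c = 0 (a != 0)."""
    if a == 0:
        raise ValueError("not a quadratic equation: a must be non-zero")
    disc = b * b - 4 * a * c          # the discriminant
    if disc > 0:
        nature = "real and unequal"   # Case I
    elif disc == 0:
        nature = "real and equal"     # Case II
    else:
        nature = "complex and unequal"  # Case III
    root = cmath.sqrt(disc)           # works for negative discriminants too
    return nature, ((-b + root) / (2 * a), (-b - root) / (2 * a))


print(quadratic_roots(2, 3, -1))   # discriminant 17
print(quadratic_roots(1, 2, 1))    # discriminant 0
print(quadratic_roots(1, 7, 18))   # discriminant -23
```

`cmath.sqrt` is used instead of `math.sqrt` so that Case III returns complex roots rather than raising an error.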
Example 1.10: Solve
\[ \frac{x+3}{x+2} + \frac{x-3}{x-2} = \frac{2x-3}{x-1}. \]
Solution: Given equation is equivalent to
\[ \frac{(x+2)+1}{x+2} + \frac{(x-2)-1}{x-2} = \frac{2(x-1)-1}{x-1} \]
\[ \Rightarrow\; 1 + \frac{1}{x+2} + 1 - \frac{1}{x-2} = 2 - \frac{1}{x-1} \]
\[ \Rightarrow\; \frac{x-2-x-2}{x^2-4} = -\frac{1}{x-1} \]
\[ \Rightarrow\; \frac{-4}{x^2-4} = -\frac{1}{x-1} \]
⇒ $4x - 4 = x^2 - 4$
⇒ $x^2 - 4x = 0$ ⇒ x(x – 4) = 0 ⇒ x = 0 or 4.
Hence, the roots of the given equation are 0 and 4.
Example 1.11: Solve, $x^4 - 13x^2 + 36 = 0$.
Solution: This is not a quadratic equation in x, but on putting $x^2 = t$, we get a quadratic
in t, namely $t^2 - 13t + 36 = 0$.
Roots of this equation are given by (t – 4)(t – 9) = 0.
Thus, t = 4 or t = 9. In other words $x^2 = 4$ or $x^2 = 9$. Hence x = ± 2 or ± 3.
Consequently, roots of given equation are ± 2, ± 3.
Example 1.12: Solve, (x + 1)(x + 3)(x + 4)(x + 6) = 72.
Solution: Rearrange the factors on the L.H.S. so as to have the sum of constants in
first two factors same as in the case of other two factors.
Since 1 + 6 = 3 + 4, we get (x + 1)(x + 6)(x + 3)(x + 4) = 72
or $(x^2 + 7x + 6)(x^2 + 7x + 12) = 72$
Now put $x^2 + 7x = t$, to obtain
(t + 6)(t + 12) = 72
This implies $t^2 + 18t + 72 = 72$
⇒ t(t + 18) = 0 ⇒ t = 0 or t = – 18
Hence, $x^2 + 7x = 0$ or $x^2 + 7x + 18 = 0$
First quadratic has 0 and – 7 as its roots and the second quadratic has roots given
by
\[ \frac{-7 \pm \sqrt{49 - 72}}{2}, \quad\text{i.e.,}\quad \frac{-7 \pm \sqrt{-23}}{2}. \]
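\[ \sqrt{5x^2 - 6x + 8} = 8 \]
⇒ $5x^2 - 6x + 8 = 64$
⇒ $5x^2 - 6x - 56 = 0$
\[ \Rightarrow\; x = \frac{6 \pm \sqrt{36 + 1120}}{10} = \frac{6 \pm \sqrt{1156}}{10} \]
\[ \Rightarrow\; x = \frac{6 \pm 34}{10} \;\Rightarrow\; x = 4 \ \text{or}\ -2\tfrac{4}{5}. \]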
Example 1.14: Solve, $x^4 - 5x^3 + 15x + 9 = 0$.
Solution: Note that this equation can be written as
$x^4 - 5x(x^2 - 3) + 9 = 0$
$(x^4 - 6x^2 + 9) - 5x(x^2 - 3) + 6x^2 = 0$
Put $x^2 - 3 = t$.
Thus the given equation is reduced to $t^2 - 5xt + 6x^2 = 0$
This has the roots t = 2x and t = 3x.
In other words we have two quadratic equations,
$x^2 - 3 = 2x$ and $x^2 - 3 = 3x$.
The roots of the former equation are – 1 and 3 and those of the latter are $\dfrac{3 \pm \sqrt{21}}{2}$.
Example 1.15: Solve, $5^x + 5^{2-x} = 26$.
Solution: Multiplying the given equation by $5^x$ we obtain
$5^{2x} + 25 = 26 \times 5^x$
or $5^{2x} - 26 \times 5^x + 25 = 0$
Put $5^x = t$ to obtain the quadratic equation $t^2 - 26t + 25 = 0$.
The roots of this equation are t = 1 or t = 25.
Then, $5^x = 1 = 5^0$ ⇒ x = 0
or $5^x = 25 = 5^2$ ⇒ x = 2
Hence, x = 0 or 2.
\[ \Rightarrow\; x = \frac{-3 \pm \sqrt{5}}{2} \ \text{or}\ x = 1, 1. \]
Hence, the roots are $1, 1, \dfrac{-3 \pm \sqrt{5}}{2}$.
Example 1.18: Solve the equation
\[ x^2 - 6x + 9 = 4\sqrt{x^2 - 6x + 6} \]
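Example 1.19: Solve
\[ \sqrt{\frac{x}{1-x}} + \sqrt{\frac{1-x}{x}} = 2\tfrac{1}{6}. \]
Solution: Put $\dfrac{x}{1-x} = t^2$.
We get
\[ t + \frac{1}{t} = \frac{13}{6} \;\Rightarrow\; 6t^2 + 6 = 13t \]
⇒ $6t^2 - 13t + 6 = 0$
⇒ $6t^2 - 4t - 9t + 6 = 0$
⇒ (2t – 3)(3t – 2) = 0
⇒ $t = \dfrac{3}{2}$ or $\dfrac{2}{3}$
Now, $t = \dfrac{3}{2}$ ⇒ $\dfrac{x}{1-x} = \dfrac{9}{4}$ ⇒ 4x = 9 – 9x
⇒ 13x = 9 ⇒ $x = \dfrac{9}{13}$
When $t = \dfrac{2}{3}$ ⇒ $\dfrac{x}{1-x} = \dfrac{4}{9}$ ⇒ 9x = 4 – 4x
⇒ 13x = 4 ⇒ $x = \dfrac{4}{13}$
So, $x = \dfrac{4}{13}$ or $\dfrac{9}{13}$.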
Hence, x = 0 or $\dfrac{p^2 + q^2}{p + q}$ or p + q.
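Example 1.22: Solve $x + \sqrt{x} = \dfrac{6}{25}$.
Solution: Putting $\sqrt{x} = t$, we get
\[ t^2 + t = \frac{6}{25} \;\Rightarrow\; 25t^2 + 25t - 6 = 0 \]
\[ \Rightarrow\; t = \frac{-25 \pm \sqrt{625 - 4(-6)(25)}}{50} = \frac{-25 \pm \sqrt{625 + 600}}{50} = \frac{-25 \pm \sqrt{1225}}{50} = \frac{-25 \pm 35}{50} \]
\[ = \frac{10}{50} \ \text{or}\ \frac{-60}{50} = \frac{1}{5} \ \text{or}\ \frac{-6}{5} \]
Then $x = t^2 = \dfrac{1}{25}$ or $\dfrac{36}{25}$.
(The value 36/25 arises from t = – 6/5 and satisfies the given equation only if the negative square root of x is taken.)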
\[ x^2 + 3x - 40 + 10\sqrt{x^2 + 3x + 16} = 0 \]
\[ \Rightarrow\; x = \frac{-3 \pm \sqrt{9 + 720}}{2} = \frac{-3 \pm \sqrt{729}}{2} = \frac{-3 \pm 27}{2} = 12 \ \text{or}\ -15 \]
Hence, x = 0, – 3, 12, – 15.
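⇒ $3x^2 - 9x + 5x - 15 = 0$
⇒ (x – 3)(3x + 5) = 0
⇒ x = 3 or $-\dfrac{5}{3}$
Also, t = – 4 ⇒ $3x^2 - 4x - 6 = 16$ ⇒ $3x^2 - 4x - 22 = 0$
\[ \Rightarrow\; x = \frac{4 \pm \sqrt{16 - 4 \cdot 3(-22)}}{6} = \frac{4 \pm \sqrt{16 + 264}}{6} = \frac{4 \pm \sqrt{280}}{6} \]
\[ \Rightarrow\; x = \frac{2 \pm \sqrt{70}}{3} \]
Hence, $x = 3, -\dfrac{5}{3}, \dfrac{2 \pm \sqrt{70}}{3}$.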
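Example 1.26: Solve,
\[ \frac{\sqrt{1 + x^2} + \sqrt{1 - x^2}}{\sqrt{1 + x^2} - \sqrt{1 - x^2}} = 3. \]
Solution: The given equation implies
\[ \sqrt{1 + x^2} + \sqrt{1 - x^2} = 3\sqrt{1 + x^2} - 3\sqrt{1 - x^2} \;\Rightarrow\; 2\sqrt{1 + x^2} = 4\sqrt{1 - x^2} \]
\[ \Rightarrow\; \sqrt{1 + x^2} = 2\sqrt{1 - x^2} \]
⇒ $1 + x^2 = 4(1 - x^2)$
⇒ $5x^2 = 3$
⇒ $x^2 = \dfrac{3}{5}$
⇒ $x = \pm\sqrt{\dfrac{3}{5}}$.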
Logarithmic
In mathematics, the logarithmic function is a very important function. If $y = a^x$, then x is
called the logarithm of y to the base a, which is expressed mathematically as
$x = \log_a y$.
As an example, $100 = 10^2$, so $2 = \log_{10} 100$. This says that 2 is the number of times 10
must be used as a factor to get 100: 10 × 10 = 100. The base-2 logarithm of
16 is 4 because 2 must be used as a factor four times to get 16:
2 × 2 × 2 × 2 = 16. In short, since $10^2 = 100$, $\log_{10} 100 = 2$, and since $2^4 = 16$,
$\log_2 16 = 4$.
The logarithm of x to the base b is written as $\log_b(x)$. If the
base is understood, we may write simply log(x).
if $x = b^y$, then $y = \log_b(x)$
Logarithms convert the tedious task of multiplication into addition using the
formula log(x·y) = log x + log y. This made complex calculations
easier and contributed greatly to the adoption of the concept. Logarithmic tables
were compiled to make such calculations easy.
A logarithm with base e is known as a natural logarithm and one with base 10 is
known as a common logarithm. In calculus, the logarithm is taken to be the natural logarithm. In
binary mathematics, 2 is used as the base, since the binary system uses two discrete symbols to
represent numbers or characters.
$8 = 2^3 \Rightarrow \log_2(8) = 3$,
$\log_2(32) = \log_2(4 \times 8) = \log_2(4) + \log_2(8) = 2 + 3 = 5$.
A related property is the reduction of exponentiation to multiplication. Using the
identity
\[ c = b^{\log_b(c)}, \]
it follows that c to the power p (exponentiation) is:
\[ c^p = \left(b^{\log_b(c)}\right)^p = b^{p \log_b(c)}, \]
or, taking logarithms:
\[ \log_b(c^p) = p \log_b(c). \]
Hence, to raise a number to a power p, one must find the logarithm of the
number and then multiply it by p. The exponentiated value is then the inverse or anti
logarithm of this product; which means,
number to power = $b^{\text{product}}$.
With the use of logarithms lengthy numerical calculations become easier. To
make the process easy, tables of logarithms, or slide rules are used.
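The table-lookup procedure just described (take the logarithm, multiply by p, take the antilogarithm) can be imitated in a few lines. This is a sketch only: the function name `power_via_logs` is hypothetical, base 10 is assumed to mirror common-logarithm tables, and the result is subject to floating-point rounding.

```python
import math


def power_via_logs(c, p, base=10.0):
    """Compute c**p as base**(p * log_base(c)); requires c > 0."""
    if c <= 0:
        raise ValueError("the logarithm is defined only for positive numbers")
    product = p * math.log(c, base)   # step 1 and 2: logarithm, then multiply by p
    return base ** product            # step 3: antilogarithm of the product


print(power_via_logs(2.0, 10))   # close to 2**10 = 1024
print(power_via_logs(9.0, 0.5))  # close to the square root of 9
```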
Example 1.27: What is $\log_3 27$?
Solution: 3, because $27 = 3^3$
Example 1.28: What is $\log_5 (1/25)$?
Solution: –2, because $1/25 = 1/5^2 = 5^{-2}$
Logarithmic Identities
log(cd) = log(c) + log(d)
log(c/d) = log(c) – log(d)
$\log(c^d) = d \log(c)$
$\log(\sqrt[d]{c}) = \dfrac{\log(c)}{d}$
Logarithm as a Function
In early stages of development of logarithms it was taken to be an arithmetic
sequence of numbers in correspondence to a geometric sequence of other positive
real numbers. But gradually it was considered as an analytic function which can also
be extended to cover complex numbers.
The term logarithm has the form $\log_b(x)$, where the base b is fixed and the argument x is
a variable. The base must be a positive real number other than 1. The
logarithmic function with base b is the inverse of the exponential function
$b^x$. The term logarithm is normally used instead of logarithmic function.
If we are required to find the logarithm to base 2 of the number 16 with the help of
a calculator, we proceed as follows:
\[ \log_2(16) = \frac{\log(16)}{\log(2)} = 4 \]
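The calculator procedure is the change-of-base rule, which can be sketched as below. The helper name `log_base` is an assumption; it uses natural logarithms from Python's `math` module, and results may differ from the exact value in the last decimal places.

```python
import math


def log_base(x, b):
    """Logarithm of x to base b via the change-of-base rule."""
    if x <= 0 or b <= 0 or b == 1:
        raise ValueError("need x > 0 and a positive base other than 1")
    return math.log(x) / math.log(b)


print(log_base(16, 2))      # approximately 4
print(log_base(27, 3))      # approximately 3 (Example 1.27)
print(log_base(1 / 25, 5))  # approximately -2 (Example 1.28)
```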
Use of Logarithms
In equations where exponents are unknown, logarithms are very useful. Their
derivatives are simple and hence used in the solution of integrals.
Scientific Applications
Logarithms are used to define many quantities, used in scientific applications. These
broadly include the following:
pH measurement: In chemistry, pH is defined as $\mathrm{pH} = -\log_{10}[\mathrm{H}^+]$, where [H+] is the
activity of hydronium ions. The activity of hydronium ions in neutral water is $10^{-7}$ mol/L at
25 °C, so its pH value is 7. pH thus gives a scale of acidity, usually quoted from 1 to 14. A liquid is
acidic if pH < 7 and alkaline if pH > 7.
Measuring power level: Power and voltage levels in electrical, electronics and
telecommunication engineering are frequently expressed in decibels, written dB, given by
$10 \log_{10}$(ratio of powers). The neper is a related unit based on the natural logarithm; for a
power ratio it is given by (1/2) ln(ratio of powers).
Measurement of earthquakes: The magnitude of an earthquake is measured on the Richter scale,
which is a base-10 logarithmic scale.
In Astronomy: The eye responds logarithmically to brightness, hence the brightness of stars
is measured on a logarithmic scale.
In Psychophysics: The relationship between stimulus and sensation was shown by
Weber and Fechner to be logarithmic.
In computer science: Computational complexity is often expressed in terms of logarithms.
For example, sorting N items by comparison takes time proportional to N × log N. The base-2
logarithm is used to compute memory storage requirements.
In Information science: In information theory, the quantity of information is measured
in terms of logarithms. If a
message recipient may expect any one of N possible messages with equal likelihood,
then the amount of information conveyed by any one such message is quantified as
$\log_2 N$ bits.
Log-log chart: In engineering and scientific applications many log-log and semilog
charts are used.
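\[ \frac{d}{dx}\ln(x) = \frac{1}{x} \]
To find the derivative for other bases, we apply the change-of-base rule:
\[ \frac{d}{dx}\log_b(x) = \frac{d}{dx}\,\frac{\ln(x)}{\ln(b)} = \frac{1}{x\ln(b)} \]
The antiderivative of the natural logarithm is
\[ \int \ln(x)\,dx = x\ln(x) - x + C \]
The natural logarithm can be computed from the series
\[ \ln z = -(1 - z) - \frac{(1 - z)^2}{2} - \frac{(1 - z)^3}{3} - \frac{(1 - z)^4}{4} - \cdots \quad\text{for } |1 - z| < 1, \]
\[ \ln(z) = 2\sum_{n=0}^{\infty} \frac{1}{2n+1}\left(\frac{z-1}{z+1}\right)^{2n+1} \quad\text{for } z \text{ with positive real part.} \]
The second series follows from
\[ \ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots \]
and the corresponding series for ln(1 – x), obtained by replacing x with – x.
Subtraction gives:
\[ \ln\frac{1+x}{1-x} = \ln(1 + x) - \ln(1 - x) = 2x + 2\frac{x^3}{3} + 2\frac{x^5}{5} + \cdots \]
Putting $z = \dfrac{1+x}{1-x}$ and thus $x = \dfrac{z-1}{z+1}$ we get
\[ \ln z = 2\left[\frac{z-1}{z+1} + \frac{1}{3}\left(\frac{z-1}{z+1}\right)^3 + \frac{1}{5}\left(\frac{z-1}{z+1}\right)^5 + \cdots\right] \]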
The closer z is to 1, the faster the series converges. To exploit this, one can first
obtain an approximate value y ≈ ln(z) and then compute A = z/exp(y), where exp(y)
is computed using the exponential series, which converges fast if y is not very large.
Finally, ln(z) = y + ln(A), where A is approximately equal to 1, as
desired. For larger values of z we can write $z = a \times 10^b$ and use ln(z) = ln(a) + b ×
ln(10).
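The series in powers of (z – 1)/(z + 1) can be summed directly, as in the following sketch. It is illustrative only: the name `ln_series` and the default of 50 terms are assumptions, real z > 0 is assumed, and no range reduction of the kind described above is attempted, so convergence slows as z moves away from 1.

```python
import math


def ln_series(z, terms=50):
    """Approximate ln(z), z > 0, by 2 * sum of r^(2n+1)/(2n+1), r = (z-1)/(z+1)."""
    if z <= 0:
        raise ValueError("z must be positive")
    r = (z - 1) / (z + 1)
    total = 0.0
    power = r                      # holds r^(2n+1), starting with n = 0
    for n in range(terms):
        total += power / (2 * n + 1)
        power *= r * r             # advance to the next odd power of r
    return 2 * total


for z in (0.5, 1.0, 2.0):
    print(z, ln_series(z), math.log(z))  # series value next to math.log
```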
Exponential
This function is of prime importance in mathematics and finds its wide application in
calculus and many branches of science and engineering. An exponential function of x
is written as exp(x) or ex. Here e is a constant and an irrational number. It has been
estimated as 2.718281828 by Euler and bears his name. It is called ‘Euler’s
number’ and is also the base of natural logarithm. An exponential function is the
inverse of a logarithmic function and is sometimes, called anti logarithm. Inverse of
an exponential function is a logarithmic function.
The exponential function is almost flat, rising slowly, for negative values of x,
increases rapidly for positive values of x, and equals 1 when x = 0. Its ordinate at any
point equals the slope of its curve at that point. A function of the form $e^{kx}$ describes
exponential growth when k > 0 and exponential decay when k < 0. More loosely, very fast
growth, such as population growth, is called exponential growth.
The graph of an exponential function always lies above the abscissa, since $e^x$ is
always positive. The function is increasing everywhere. On the negative side of the
X-axis the curve approaches the X-axis as x decreases but never touches it.
The exponential function $e^x$ may be expanded into an infinite series, called a power
series, given below:
\[ e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots \]
This function can also be defined as a limit, as given below:
\[ e^x = \lim_{n \to \infty}\left(1 + \frac{x}{n}\right)^n \quad\text{or}\quad e^x = \lim_{n \to 0}(1 + nx)^{1/n}. \]
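Both the power series and the limit (1 + x/n)^n can be compared numerically with a library value. The sketch below is a rough illustration; the names `exp_series` and `exp_limit` and the chosen number of terms are assumptions.

```python
import math


def exp_series(x, terms=30):
    """Partial sum of the power series 1 + x + x^2/2! + ... (terms terms)."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)   # turns x^n/n! into x^(n+1)/(n+1)!
    return total


def exp_limit(x, n):
    """The approximation (1 + x/n)^n, which tends to e^x as n grows."""
    return (1 + x / n) ** n


print(exp_series(1.0), math.e)
for n in (10, 1000, 1000000):
    print(n, exp_limit(1.0, n))   # approaches e slowly as n increases
```

The series reaches machine precision with a few dozen terms for moderate x, whereas the limit form converges much more slowly.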
Proof.
\[ y = a^x \]
\[ \ln y = \ln a^x \]
\[ \ln y = x \ln a \]
\[ \frac{1}{y}\frac{dy}{dx} = \ln a \]
\[ \frac{dy}{dx} = (\ln a)\,y = (\ln a)\,a^x \]
This shows that the derivative of an exponential function is a constant multiple of the
function itself. If the rate of change of a variable is proportional to the variable itself, the solution
is an exponential function. Population growth, radioactive decay, continuously
compounded interest, etc., are examples of exponential functions in practical life. In all
these cases the variable is proportional to an exponential function of time. For a
differentiable function f(x), as per the chain rule:
\[ \frac{d}{dx} e^{f(x)} = f'(x)\, e^{f(x)} \]
The derivative rule for real quantities also holds for complex quantities, and
this can be stated as below:
\[ \frac{d}{dz} e^z = e^z \quad\text{holds in the complex plane.} \]
We can now extend the real exponential function to the complex one
by writing $e^{x + iy} = e^x e^{iy}$. The factor $e^x$ is real and $e^{iy} = \cos(y) + i \sin(y)$. Thus
the complex exponential is built on the real definition.
We can now write,
\[ e^{a + bi} = e^a(\cos b + i \sin b) \]
Here a and b are real values.
Example 1.29: Looking at the functions below, find the function(s) which is/are not
exponential.
(i) $f(x) = 3e^{-2x}$
(ii) $g(x) = 2^{x/2}$
(iii) $h(x) = x^{3/2}$
(iv) $g(x) = 15/7^x$
(v) $p(x) = x^e$
Solution: Here, h(x) and p(x) are not exponential functions. For the function to be
exponential, the independent variable should be the exponent.
Example 1.30: Find the domain and range of the function defined as $f(x) = kb^x$.
Discuss the nature of the graph of this function. How does f(x) change when (i) x tends to
infinity and (ii) x tends to negative infinity? Are there any horizontal asymptotes?
Solution: Taking k > 0 and b > 0, b ≠ 1, the domain of this function is the set of real
numbers and the range is the set of all positive real numbers.
When b > 1, the function f(x) is increasing; the graph rises to the right. (i)
When x tends to infinity, f(x) increases without bound. (ii) When x tends to negative
infinity, f(x) goes on decreasing and tends to zero. The line
y = 0, which is the x-axis, is the horizontal asymptote.
For 0 < b < 1, the situation is reversed. f(x) decreases as x increases
and tends to zero as x tends to infinity, so the graph goes from high on the left to low on the
right. The x-axis is again the horizontal asymptote.
Example 1.31: Bacteria grow exponentially in a culture. It was observed that the number of bacteria at 2:00 p.m. was 80 and at 6:00 p.m. it was 500. The growth is given by a function $f(t) = k e^{at}$. Find the population of bacteria at 10:00 p.m.
Solution: Measure t in hours from 2:00 p.m. Then f(0) = 80 gives k = 80, and f(4) = 500 gives $e^{4a} = 6.25$, so a ≈ 0.4581. The growth is given by $f(t) = 80e^{0.4581t}$ at any time t. The number of bacteria at 10:00 p.m. (t = 8) will be $80 \times 6.25^{2} = 3125$.
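The constants in Example 1.31 can be checked numerically. The following is a minimal sketch, assuming t is measured in hours from 2:00 p.m.; the helper name `fit_exponential` is illustrative and does not come from the text.

```python
import math

def fit_exponential(t1, y1, t2, y2):
    """Return (k, a) so that k * exp(a * t) passes through (t1, y1) and (t2, y2)."""
    a = math.log(y2 / y1) / (t2 - t1)   # growth constant from the ratio of the two readings
    k = y1 / math.exp(a * t1)           # scale factor from the first reading
    return k, a

# Assumed convention: t = 0 at 2:00 p.m., t = 4 at 6:00 p.m., t = 8 at 10:00 p.m.
k, a = fit_exponential(0, 80, 4, 500)
population_10pm = k * math.exp(a * 8)
print(round(a, 4), round(population_10pm))
```

Because 10:00 p.m. is exactly two four-hour periods after 2:00 p.m., the population is simply multiplied by 6.25 twice.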
Example 1.32: A European country conducted a nuclear test on an island in the Pacific Ocean in 1990. Just after the explosion, the level of Strontium-90 on the island was noted as 100 times the ‘safe level’ for human habitation. Taking the half-life of Strontium-90 as 28 years, find the number of years after which the island will once again be habitable.
Solution: After t years the level is $100 \left(\frac{1}{2}\right)^{t/28}$ times the safe level. Setting this equal to 1 gives $t = 28 \log_2 100 \approx 186$. The island will be habitable after approximately 186 years, that is, in the year 2176.
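The half-life calculation in Example 1.32 can be sketched as follows. This is only an illustration under the assumption of pure exponential decay; the function name `years_until_safe` is hypothetical.

```python
import math

def years_until_safe(initial_multiple, half_life_years):
    """Years for a quantity to fall from initial_multiple times the safe level to the safe level."""
    # Each half-life halves the level, so we need log2(initial_multiple) half-lives.
    return half_life_years * math.log2(initial_multiple)

years = years_until_safe(100, 28)
habitable_year = 1990 + round(years)
print(round(years), habitable_year)
```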
Utility
If U(x, y) denotes the satisfaction obtained by an individual when he buys quantities
x and y of two commodities X and Y, then U(x, y), the function of two variables x
and y is called the utility function or utility index of the individual.
e.g., U = (x + 3) (y + 1)
U = $(x - 1)^{0.5} (y - 2)^{0.5}$
Notes
1. Still there are other functions such as Marginal Revenue Function and
Marginal cost function, which are based on the (complete) derivatives or
partial derivatives. They are dealt with in the respective chapters of
differential/integral calculus.
2. Break-Even Analysis entails finding out the minimum quantum of production
(and sales) that a firm has to achieve in its attempt to recover its investment
(total fixed cost) whereafter profits start accruing.
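At the break-even point, profit = loss = 0
or Total Revenue = Total Cost
i.e., R(x) = C(x)
or, P·x = (TFC + AVC·x)
⇒ x (P – AVC) = TFC, where P = unit price
or $x_B = \frac{TFC}{P - AVC}$ units (Break-even output, $Q_B$)
$s_B = P \cdot x_B = P \cdot Q_B = \frac{P\,(TFC)}{P - AVC}$
or $s_B = \frac{TFC}{1 - \frac{AVC}{P}}$ or $\frac{TFC}{1 - \frac{TVC}{TR}}$ (Break-even Sales)
Here TFC is the total fixed cost, AVC the average variable cost per unit, TVC the total variable cost and TR the total revenue, so that AVC/P = TVC/TR.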
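The break-even relations above can be sketched in a few lines. The figures used (fixed cost 60,000, unit price 50, average variable cost 30) are hypothetical and chosen only for illustration, as is the function name `break_even`.

```python
def break_even(total_fixed_cost, unit_price, avg_variable_cost):
    """Return (break-even output in units, break-even sales value)."""
    if unit_price <= avg_variable_cost:
        raise ValueError("price must exceed average variable cost to break even")
    units = total_fixed_cost / (unit_price - avg_variable_cost)      # x_B = TFC / (P - AVC)
    sales = total_fixed_cost / (1 - avg_variable_cost / unit_price)  # s_B = TFC / (1 - AVC/P)
    return units, sales

# Hypothetical firm: TFC = 60000, P = 50, AVC = 30
units, sales = break_even(60000, 50, 30)
print(units, sales)
```

The two results are linked by $s_B = P \cdot x_B$, which gives a quick consistency check.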
1.4 SUMMARY
• The sum of two odd functions is odd, and any constant multiple of an odd
function is odd.
• The derivative of an even function is odd.
• The product of an even function and an odd function is an odd function.
• The Fourier series of a periodic even function includes cosine terms only
while that of a periodic odd function includes sine terms only.
• Both the even and the odd functions form a vector space over the reals. In
fact, the vector space of all real-valued functions is the direct sum of the
spaces of even and odd functions.
• The even functions form a commutative algebra over the reals. However,
the odd functions do not form an algebra over the reals.
• When x and y both occur together in an equation but y is not capable of
being directly expressed in terms of x, then y is said to be an implicit
function of x.
• If y is a function of x, then x may in turn be regarded as a function of y, provided the relation can be solved uniquely for x. The latter is called the inverse function of the former, i.e., if y = f(x), then x = g(y).
• The distance an object travels in four hours depends on its speed. When
such relationships exist, one variable is said to be a function of the other.
• The relationship between the side x of any square and its area A could be represented by $f(x) = x^{2}$, where A = f(x).
• The set of numbers created by substituting every value for x into the
equation is known as the range of the function.
• A linear equation is obtained by equating to zero a linear expression.
• We can identify an ordered pair (a, b) of real numbers with a point in the
plane of coordinate geometry.
• Whenever the solution set of a system of linear inequations is empty, we
say that the inequations are inconsistent.
• A system of linear inequations is said to be consistent if its solution set is
non-empty.
• A quadratic equation is called pure if it does not contain the first power of x. In other words, in a pure quadratic equation, the coefficient of x must be zero. Thus a pure quadratic equation is of the type $ax^{2} + b = 0$ with a ≠ 0.
Arithmetic Progression and
Series
Objectives
After going through this unit, you will be able to:
• Define sequence and its significance
• Discuss arithmetic progression and its importance
• Analyse the general term of an arithmetic progression
• Understand the concept of arithmetic mean
Structure
2.1 Introduction
2.2 Sequence
2.3 Arithmetical Mean
2.4 Summary
2.5 Key Words
2.6 Answers to ‘Check Your Progress’
2.7 Self-Assessment Questions
2.8 Further Readings
2.1 INTRODUCTION
This unit will discuss arithmetic progression and series. A sequence is an ordered list of numbers formed according to a definite rule. An arithmetic progression is a sequence in which each term, after the first, is obtained by adding a fixed number to the term before it. The fixed number added to each term is called the common difference. A sequence in which each term, except the first, is obtained by multiplying the term before it by a fixed non-zero number is called a geometric progression. A sequence whose terms are the reciprocals of the terms of an arithmetic progression is called a harmonic progression.
This unit will discuss arithmetic mean and progression in detail and will also explain the insertion of n arithmetic means between two given numbers.
2.2 SEQUENCE
Remarks
(i) In an AP, we usually denote the first term by ‘a’, the common difference by ‘d’, the general term, i.e., the nth term, by $T_n$ and the sum of the first n terms by $S_n$.
(ii) Clearly, $d = (T_2 - T_1) = (T_3 - T_2) = \dots = (T_n - T_{n-1})$
(iii) $T_n = S_n - S_{n-1}$
(iv) If a, b, c are in AP, then b is called the arithmetic mean (AM) between a and c, and $b = \frac{a + c}{2}$.
(v) If $a, x_1, x_2, \dots, x_n, b$ are in AP, then $x_1, x_2, x_3, \dots, x_n$ are called the “n arithmetic means” between a and b.
Here, $d = \frac{b - a}{n + 1}$ and $x_n = a + \frac{n(b - a)}{n + 1}$
(vi) If a fixed number is added to (or subtracted from) each term of an AP then
the resulting sequence is also an AP.
(vii) If each term of an AP is multiplied or divided by a non-zero fixed number,
then the resulting sequence is also an AP.
(viii) It is convenient (when the sum of three/five/seven/... consecutive terms of
an AP is given) to make a choice of:
three numbers in AP as (a – d), a, (a + d),
five numbers in AP as (a – 2d), (a – d), a, (a + d), (a + 2d) and so on.
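Remark (v) can be illustrated with a short sketch that inserts n arithmetic means between two numbers. The function name `arithmetic_means` and the sample values (10 means between 4 and 59) are illustrative assumptions; exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction

def arithmetic_means(a, b, n):
    """Return the n arithmetic means between integers a and b."""
    d = Fraction(b - a, n + 1)                  # common difference d = (b - a) / (n + 1)
    return [a + k * d for k in range(1, n + 1)]  # x_k = a + k*d

means = arithmetic_means(4, 59, 10)
print([int(m) for m in means])
```

With these values the common difference is 5, so the last mean is 5 less than 59.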
Geometric Progression
A geometric progression is a sequence, in which each term, except the first, is
obtained by multiplying the term immediately preceding it, with a fixed non-zero
number.
The fixed number is called the common ratio.
(iii) nth term from the end = $\frac{l}{r^{n-1}}$, where l is the last term
(iv) Sum to first n terms, $S_n = \frac{a(r^{n} - 1)}{r - 1}$ or $\frac{a(1 - r^{n})}{1 - r} = \frac{a - lr}{1 - r}$, where l = last term
(v) Sum to infinity of a GP, $S_\infty = \frac{a}{1 - r}$ when | r | < 1 or –1 < r < 1
(vi) If a, b, c are in GP, then b is called the geometric mean (GM) between a and c. In this case, $b = \sqrt{ac}$ or $b^{2} = ac$.
(vii) If $a, g_1, g_2, \dots, g_n, b$ are in GP, then $g_1, g_2, \dots, g_n$ are called “n geometric means” between a and b.
Here $r = \left(\frac{b}{a}\right)^{\frac{1}{n+1}}$ and $g_n = ar^{n} = a\left(\frac{b}{a}\right)^{\frac{n}{n+1}}$
(ix) If each term of a GP is raised to the same index, then the resulting
sequence is also a GP.
i.e., if a, b, c are in GP, then $a^{K}, b^{K}, c^{K}$ are also in GP, where K is a constant.
(x) It is convenient (when the product of three/five/seven/ ... consecutive terms of a GP is given) to make a choice of:
three terms of a GP as $\frac{a}{r}, a, ar$
five terms of a GP as $\frac{a}{r^{2}}, \frac{a}{r}, a, ar, ar^{2}$ and so on.
(xi) It is convenient (when the product of four/six/eight/... consecutive terms of a GP is given) to make a choice of:
four terms of a GP as $\frac{a}{r^{3}}, \frac{a}{r}, ar, ar^{3}$
six terms of a GP as $\frac{a}{r^{5}}, \frac{a}{r^{3}}, \frac{a}{r}, ar, ar^{3}, ar^{5}$ and so on.
Arithmetico-geometric series
A series in which each term is the product of the corresponding terms of an AP and
a GP is called an Arithmetico-geometric series.
$S_\infty = \frac{a}{1 - r} + \frac{dr}{(1 - r)^{2}}$
Harmonic progression
A sequence of numbers is said to be in Harmonic Progression when the reciprocals of its terms are in Arithmetic Progression.
Thus, the nth term of a HP is the reciprocal of the nth term of the corresponding
AP.
Remark
The sum of first n terms of an HP is not equal to the reciprocal of the sum of first n
terms of the corresponding AP.
Notes
1. There are no special formulae for HP. We have to trace the corresponding
AP and apply the results/formulae of AP and eventually find the respective
answers for HP.
2. If each term of a HP is multiplied or divided by a constant non-zero
number, then the resulting terms are also in HP.
Thus, (i) $S_7 = \frac{7}{2}[2(550) + (7 - 1)25] = 4375$ Ans.
$= \frac{20}{2}[2(5000 \times 12) + (20 - 1)(200 \times 12)]$
Thus, $\frac{5000}{200} = 25$ years are required.
Example 2.3: The cost of boring a tubewell 600 metres deep is as follows: 25
paise for the first metre and an additional 4 paise for every subsequent metre. Find
the cost of boring the 500th metre and also the total cost.
Solution: The cost of boring the 500th metre = $T_{500} = [a + (500 - 1)d]$
= [25 + 499 (4)] = 2021 paise = ₹ 20.21
Total cost of boring 600 metres = $S_{600} = \frac{600}{2}[2(25) + (600 - 1)4]$
= 300 [50 + 599 (4)]
= 733800 paise
= ₹ 7338
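The two AP formulas used in Example 2.3 can be sketched as small helper functions. The names `ap_term` and `ap_sum` are illustrative, and integer inputs are assumed so that integer division is exact.

```python
def ap_term(a, d, n):
    """nth term of an AP: T_n = a + (n - 1) d."""
    return a + (n - 1) * d

def ap_sum(a, d, n):
    """Sum of the first n terms: S_n = n/2 [2a + (n - 1) d] (integer a, d, n assumed)."""
    # n * (2a + (n - 1)d) is always even for integers, so // 2 loses nothing.
    return n * (2 * a + (n - 1) * d) // 2

cost_500th_metre = ap_term(25, 4, 500)   # in paise
total_cost = ap_sum(25, 4, 600)          # in paise
print(cost_500th_metre, total_cost)
```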
Example 2.4: A person pays ₹ 975 through monthly instalments, each less than the former by ₹ 5. The first instalment is ₹ 100. In how many instalments will the amount be paid?
Solution: The series of instalments 100, 95, 90, 85, ... forms an AP.
Let ‘n’ be the no. of instalments in which the entire amount is cleared.
$S_n = \frac{n}{2}[2a + (n - 1)d] = 975$
or $\frac{n}{2}[2(100) + (n - 1)(-5)] = 975$
or $\frac{n}{2}[200 - 5n + 5] = 975$ or $\frac{n}{2}[205 - 5n] = 975$
or $\frac{n}{2}[41 - n] = 195$ or $(41n - n^{2}) = 390$
or $(n^{2} - 41n + 390) = 0$
⇒ $[n^{2} - 26n - 15n + 390] = 0$
⇒ n (n – 26) – 15 (n – 26) = 0
⇒ (n – 26) (n – 15) = 0
⇒ n = 15 or n = 26
With n = 26 the later instalments would be zero or negative (the 21st instalment is already 100 – 5(20) = 0), so n = 26 is ruled out.
∴ n = 15
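The count in Example 2.4 can be cross-checked by simply paying the instalments one at a time. This is a sketch under the assumption that payment stops as soon as the instalment would reach zero; the function name `instalments_needed` is hypothetical.

```python
def instalments_needed(total, first, decrease):
    """Count instalments first, first - decrease, ... needed to pay exactly `total`.

    Returns None if the total cannot be met exactly with positive instalments.
    """
    paid, instalment, count = 0, first, 0
    while paid < total and instalment > 0:
        paid += instalment
        instalment -= decrease
        count += 1
    return count if paid == total else None

n = instalments_needed(975, 100, 5)
print(n)
```

The direct count picks out the smaller root of the quadratic, because the larger root would require instalments that are not positive.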
Example 2.5: The monthly salary of a person was ₹ 320 for each of the first three years. He next got annual increments of ₹ 40 per month for each of the following successive 12 years. His salary remained stationary till retirement, when he found that his average monthly salary during the service period was ₹ 698. Find the period of his service.
Solution: The monthly salary for the first 3 years (i.e., 36 months) = ₹ 320
The monthly salary in the 4th year = ₹ 360
The monthly salary in the 5th year = ₹ 400
The monthly salary in the 6th year = ₹ 440
... ... ...
... ... ...
The monthly salary in the 15th year = [320 + 40(12)] = ₹ 800
Now, as per the given problem, if the salary stays at ₹ 800 for n further years,
the average salary = $\frac{3(320) + \frac{12}{2}[2(360) + (12 - 1)40] + n(800)}{3 + 12 + n}$ = ₹ 698
⇒ $\frac{7920 + 800n}{15 + n} = 698$
⇒ n = 25
∴ The total service = (3 + 12 + 25) = 40 years.
Example 2.6: Balu arranges to pay a debt of ₹ 9600 in 48 annual instalments which form an arithmetic series. When 40 of these instalments are paid, Balu becomes insolvent and his creditor finds that ₹ 2400 still remain unpaid. Find the value of each of the first three instalments. Ignore interest.
Solution: $S_{48} = \frac{48}{2}[2a + (48 - 1)d] = 9600$
Also, $S_{40} = \frac{40}{2}[2a + (40 - 1)d] = (9600 - 2400) = 7200$
⇒ $3250 = \frac{n}{2}[2(20) + (n - 1)15]$
or $(3n^{2} + 5n - 1300) = 0$
or $[3n^{2} + 65n - 60n - 1300] = 0$
or [n (3n + 65) – 20 (3n + 65)] = 0
or (n – 20) (3n + 65) = 0
⇒ n = 20
(The other value, $-\frac{65}{3}$, being negative, is ruled out)
Example 2.9: The pth term of an AP is q and the qth term is p. Show that the rth
term is (p + q – r) and the (p + q)th term is zero.
Solution: $T_p = a + (p - 1)d = q$ and $T_q = a + (q - 1)d = p$
Subtracting, d(p – q) = (q – p) ⇒ d = –1, and a = (p + q – 1)
$T_r$ = a + (r – 1) d
= a + (r – 1) (–1)
= a – r + 1
= (p + q – 1) – r + 1
$T_r$ = (p + q – r)
Also, $T_{p+q}$ = (p + q – 1) + (p + q – 1) (–1) = 0
(Substituting a = p + q – 1, d = –1 and r = p + q in $T_r$ = [a + (r – 1)d])
Example 2.10: If pth term, qth term and rth term of an AP are a, b, c respectively,
show that (q – r) a + (r – p) b + (p – q) c = 0
Solution: Let A be the first term, then
$T_p$ = A + (p – 1) d = a
$T_q$ = A + (q – 1) d = b
$T_r$ = A + (r – 1) d = c
Now, Σ (q – r) a = Σ(q – r) [A + (p – 1) d]
= Σ A(q – r) + Σpd (q – r) – Σd (q – r)
= AΣ (q – r) + dΣp (q – r) – dΣ (q – r)
= A(0) + d(0) – d (0) = 0
Hence the result.
Example 2.11: Firm A starts producing 400 units and decreases production by 50
units annually. Firm B starts by producing 250 units and increases production by 25
units annually. Assuming that both the firms grow/decay in an arithmetic series, find
the following:
(a) In which year will both produce the same amount?
(b) When will firm A produce zero output?
(c) What will be the production of firm B in the year when firm A produces
nothing?
Production at Firm A: 400, 350, 300, ...
Production at Firm B: 250, 275, 300, ...
Solution: (a) In the third year, they produce the same quantity
Tn = [400 + (n – 1) (– 50)] = [250 + (n – 1) 25]
⇒ n=3
(b) Tn = 0 = [a + (n – 1)d] = [400 + (n – 1) (– 50)]
or 400 – 50n + 50 = 0
or 450 – 50n = 0
⇒ n = 9 years
(c) T9 for the production of Firm B
T9 = [a + (n – 1) d] = [250 + (9 – 1) 25]
= (250 + 8 × 25) = 450 units
Example 2.12: Twenty-five trees are planted in a straight line at intervals of 5 feet.
To water them, the gardener must bring water for each tree separately from a well
10 ft from the first tree in the line of the trees. How far has he walked when he has
just watered all the trees beginning with the first?
Solution:
T1 = 10 + 10 = 20 ft (both to and fro)
T2 = (10 + 5) + (10 + 5) = 30 ft (both to and fro)
T3 = (10 + 10) + (10 + 10) = 40 ft (both to and fro)
Thus, the total distance covered = $S_{24} + \frac{1}{2}(T_{25})$
Note
When he just completes watering the 25th tree, there is no need to come back to the well.
Thus, (20 + 30 + 40 + ... up to 24 terms) + $\frac{1}{2}(T_{25})$
= $\frac{24}{2}[2(20) + (24 - 1)10] + \frac{1}{2}[20 + (25 - 1)10]$ = 3370 ft Ans.
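The distance in Example 2.12 can also be obtained by simulating the gardener's walk tree by tree. This is a sketch; the function name `watering_walk` and its parameters are illustrative.

```python
def watering_walk(trees, spacing, well_to_first):
    """Total distance walked, starting at the well and ending at the last tree."""
    total = 0
    for i in range(trees):
        distance = well_to_first + i * spacing  # well to tree number i + 1
        total += distance                       # walk out with water
        if i < trees - 1:
            total += distance                   # walk back, except after the last tree
    return total

total_feet = watering_walk(25, 5, 10)
print(total_feet)
```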
Example 2.13: If $S_n$ is the sum of the first ‘n’ terms of an arithmetic series, then show that $S_{n+3} - 3S_{n+2} + 3S_{n+1} - S_n = 0$.
Solution: $S_{n+1} = (S_n + T_{n+1})$
$S_{n+2} = (S_n + T_{n+1} + T_{n+2})$
$S_{n+3} = (S_n + T_{n+1} + T_{n+2} + T_{n+3})$
Thus, $(S_{n+3} - 3S_{n+2} + 3S_{n+1} - S_n)$
$= [(S_n + T_{n+1} + T_{n+2} + T_{n+3}) - 3(S_n + T_{n+1} + T_{n+2}) + 3(S_n + T_{n+1}) - S_n]$
$= [T_{n+1} - 2T_{n+2} + T_{n+3}]$
= 0 (∵ $T_{n+1}$, $T_{n+2}$ and $T_{n+3}$ are three consecutive terms of an Arithmetic Progression)
Note
If a, b, c are in AP, then $b = \frac{a + c}{2}$
or (a + c – 2b) = 0, or (a – 2b + c) = 0
Example 2.14: If the roots of the equation $(q - r)x^{2} + (r - p)x + (p - q) = 0$ are equal, then show that p, q, r are in AP.
Solution: Since the roots of a quadratic equation $ax^{2} + bx + c = 0$ are equal when $(b^{2} - 4ac) = 0$, we have
$(r - p)^{2} - 4(q - r)(p - q) = 0$
$r^{2} + p^{2} - 2pr - 4pq + 4q^{2} + 4pr - 4qr = 0$
$r^{2} + p^{2} + (2q)^{2} + 2pr - 2(p)(2q) - 2(2q)(r) = 0$
or $(p - 2q + r)^{2} = 0$
or p – 2q + r = 0
⇒ $q = \frac{p + r}{2}$
⇒ p, q, r are in AP
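Example 2.15: Find ‘n’ such that $\frac{a^{n+1} + b^{n+1}}{a^{n} + b^{n}}$ may be the arithmetic mean between a and b.
Solution: Given, AM = $\frac{a + b}{2} = \frac{a^{n+1} + b^{n+1}}{a^{n} + b^{n}}$
⇒ $(a + b)(a^{n} + b^{n}) = 2(a^{n+1} + b^{n+1})$
⇒ $a^{n}b + ab^{n} = a^{n+1} + b^{n+1}$
⇒ $a^{n}(a - b) = b^{n}(a - b)$
⇒ $\left(\frac{a}{b}\right)^{n} = 1$ (∵ a ≠ b)
⇒ n = 0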
Example 2.16: If the sum of first ‘p’ terms of an AP is equal to the sum of first ‘q’
terms of the same progression, then show that sum of the first (p + q) terms is equal
to zero.
Solution:
$S_p = \frac{p}{2}[2a + (p - 1)d]$
$S_q = \frac{q}{2}[2a + (q - 1)d]$
Given, $S_p = S_q$
⇒ $\frac{p}{2}[2a + (p - 1)d] = \frac{q}{2}[2a + (q - 1)d]$
or 2ap + p (p – 1) d = 2aq + q (q – 1) d
or 2a (p – q) + d [p (p – 1) – q (q – 1)] = 0
or $2a(p - q) + d[(p^{2} - q^{2}) - (p - q)] = 0$
or (p – q) [2a + (p + q) d – d] = 0
or (p – q) [2a + (p + q – 1)d] = 0
⇒ [2a + (p + q – 1) d] = 0 (∵ p ≠ q)
Thus, $S_{p+q} = \frac{p + q}{2}[2a + (p + q - 1)d]$
$= \frac{p + q}{2}[0]$
= 0 (Hence the result)
Example 2.17: The sums of the first ‘n’ terms of two arithmetic series are in the
ratio of (3n + 1) : (n + 4). Find the ratio of their 4th terms.
Solution: Let a1 and a2 denote the first terms of the two series respectively.
Let d1 and d2 stand for the common differences of the two series respectively.
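Thus, $\frac{S_n}{S'_n} = \frac{3n + 1}{n + 4} = \frac{\frac{n}{2}[2a_1 + (n - 1)d_1]}{\frac{n}{2}[2a_2 + (n - 1)d_2]}$
⇒ $\frac{2a_1 + (n - 1)d_1}{2a_2 + (n - 1)d_2} = \frac{3n + 1}{n + 4}$ ...(1)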
nth mean = $x_n = T_{n+1} = a + nd = 4 + n\left(\frac{55}{n + 1}\right) = \frac{59n + 4}{n + 1}$
Given, x4 : xn = 4 : 9
∴ (4n + 224) : (59n + 4) = 4 : 9
⇒ 9(4n + 224) = 4 (59n + 4)
Note
A : B = C : D ⇒ AD = BC
⇒ 36n + 2016 = 236n + 16
⇒ n = 10
Example 2.20: If the sums of the first p, q and r terms of an AP are a, b, c respectively, then show that
$\frac{a}{p}(q - r) + \frac{b}{q}(r - p) + \frac{c}{r}(p - q) = 0$
Solution: Let A be the first term and d the common difference. Then
$S_p = \frac{p}{2}[2A + (p - 1)d] = a$ ⇒ $\frac{a}{p} = \frac{1}{2}[2A + (p - 1)d]$
$S_q = \frac{q}{2}[2A + (q - 1)d] = b$ ⇒ $\frac{b}{q} = \frac{1}{2}[2A + (q - 1)d]$
$S_r = \frac{r}{2}[2A + (r - 1)d] = c$ ⇒ $\frac{c}{r} = \frac{1}{2}[2A + (r - 1)d]$
Now, $\sum \frac{a}{p}(q - r) = \sum \frac{1}{2}[2A + (p - 1)d](q - r)$
$= \sum A(q - r) + \frac{1}{2}\sum pd(q - r) - \frac{1}{2}\sum d(q - r)$
$= A\sum (q - r) + \frac{d}{2}\sum p(q - r) - \frac{d}{2}\sum (q - r)$
= (0 + 0 – 0) = 0
Hence the result.
Example 2.21: In an organisational hierarchy, each echelon contains two managers
more than the one above it. If on the top, there are three managers and 17 at the
lowest echelon, determine the number of echelons and the total number of managers
in the entire organisation.
Solution: Let ‘n’ stand for the number of echelons.
Then, a = 3, d = 2
$T_n$ = [a + (n – 1) d] = 17
i.e., [3 + (n – 1) 2] = 17
i.e., (3 + 2n – 2) = 17
or n = 8
Total no. of managers = $S_n = S_8 = \frac{n}{2}[a + l] = \frac{8}{2}[3 + 17] = 80$
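Example 2.22: If $S_p$, $S_q$, $S_r$ denote the sums of the first p, q, r terms respectively of an AP, whose common difference is ‘d’, then prove that
$\sum \frac{S_p}{(p - q)(p - r)} = \frac{d}{2}$
Solution:
LHS $= \frac{\frac{p}{2}[2a + (p - 1)d]}{(p - q)(p - r)} + \frac{\frac{q}{2}[2a + (q - 1)d]}{(q - p)(q - r)} + \frac{\frac{r}{2}[2a + (r - 1)d]}{(r - p)(r - q)}$
$= -\left\{\frac{\frac{p}{2}[2a + (p - 1)d]}{(p - q)(r - p)} + \frac{\frac{q}{2}[2a + (q - 1)d]}{(p - q)(q - r)} + \frac{\frac{r}{2}[2a + (r - 1)d]}{(r - p)(q - r)}\right\}$
$= -\left\{\frac{ap + \frac{d}{2}(p^{2} - p)}{(p - q)(r - p)} + \frac{aq + \frac{d}{2}(q^{2} - q)}{(p - q)(q - r)} + \frac{ar + \frac{d}{2}(r^{2} - r)}{(r - p)(q - r)}\right\}$
$= -\left\{a\sum \frac{p}{(p - q)(r - p)} + \frac{d}{2}\sum \frac{p^{2}}{(p - q)(r - p)} - \frac{d}{2}\sum \frac{p}{(p - q)(r - p)}\right\}$
$= -\left\{a(0) + \frac{d}{2}(-1) - \frac{d}{2}(0)\right\}$
$= \frac{d}{2}$ = RHS.
Hence the result.
Notes
1. $\sum \frac{p}{(p - q)(r - p)} = \frac{p(q - r) + q(r - p) + r(p - q)}{(p - q)(q - r)(r - p)} = \frac{0}{(p - q)(q - r)(r - p)} = 0$
2. $\sum \frac{p^{2}}{(p - q)(r - p)} = \frac{p^{2}(q - r) + q^{2}(r - p) + r^{2}(p - q)}{(p - q)(q - r)(r - p)} = -1$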
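The identity of Example 2.22 can be spot-checked with exact rational arithmetic. The sketch below is only a numerical check for sample values, not a proof; the function names are illustrative and p, q, r are assumed to be distinct positive integers.

```python
from fractions import Fraction

def ap_sum(a, d, n):
    """Sum of the first n terms of an AP with first term a and common difference d."""
    return Fraction(n, 2) * (2 * a + (n - 1) * d)

def cyclic_lhs(a, d, p, q, r):
    """S_p/((p-q)(p-r)) + S_q/((q-p)(q-r)) + S_r/((r-p)(r-q)) for distinct p, q, r."""
    return (ap_sum(a, d, p) / ((p - q) * (p - r))
            + ap_sum(a, d, q) / ((q - p) * (q - r))
            + ap_sum(a, d, r) / ((r - p) * (r - q)))

value = cyclic_lhs(3, 4, 2, 5, 7)   # sample AP 3, 7, 11, ... with d = 4
print(value)
```

For the sample AP the result equals d/2 = 2, independent of the first term and of the choice of p, q, r.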
Example 2.23: The sum of three numbers in AP is 21 and their product is 315. Find
the numbers.
Solution: Let the three numbers be a–d, a, a+d.
∴ Their sum = (a – d) + a + (a + d) = 21
⇒ 3a = 21 ⇒ a = 7
Also, their product = (7 – d)·7·(7 + d) = 315
$7(7^{2} - d^{2}) = 315$
$(7^{2} - d^{2}) = 45$
$49 - d^{2} = 45$
⇒ d = ± 2
∴ The numbers are 7 – 2, 7, 7 + 2, or 5, 7, 9
Example 2.24: The sum of four numbers in AP is 16 and the sum of their cubes is
496. Find the numbers.
Solution: Let the four numbers be (a – 3d), (a – d), (a + d), (a + 3d) respectively.
Their sum = [(a – 3d) + (a – d) + (a + d) + (a + 3d)] = 16
⇒ 4a = 16
⇒ a=4
Sum of their cubes = $[(a - 3d)^{3} + (a - d)^{3} + (a + d)^{3} + (a + 3d)^{3}]$
$= 2[a^{3} + 3a(3d)^{2}] + 2[a^{3} + 3a(d)^{2}]$
⇔ a, b, c are in AP
⇒ $a = \frac{8190}{2^{12} - 1} = \frac{8190}{4095} = 2$
Thus, $T_1$ = a = ₹ 2
$T_{12} = ar^{12-1} = ar^{11} = 2 \times 2^{11} = 2^{12}$ = ₹ 4096
Example 2.27: The sum of 2w terms of a GP, whose first term is ‘a’ and common ratio is ‘r’, is equal to the sum of w terms of another GP, whose first term is ‘b’ and common ratio is ‘r²’. Prove that ‘b’ is equal to the sum of the first two terms of the first series.
Solution:
                GP 1    GP 2
No. of terms    2w      w
First term      a       b
Common ratio    r       $r^{2}$
Given sum1 = sum2
i.e., $\frac{a(r^{2w} - 1)}{r - 1} = \frac{b[(r^{2})^{w} - 1]}{r^{2} - 1}$
i.e., $\frac{a(r^{2w} - 1)}{r - 1} = \frac{b(r^{2w} - 1)}{(r - 1)(r + 1)}$ or $a = \frac{b}{r + 1}$ (∵ r ≠ 1)
or b = a + ar
Hence the result.
Example 2.28: A machine depreciates at 8% of its value at the beginning of a year. If the machine was purchased for ₹ 15,000, what is the minimum number of complete years at the end of which the worth of the machine will not exceed 2/5 of its original value?
Solution: The value of the machine at the end of the 1st year, the 2nd year, the third year and so on will form the following GP
$15000\left(1 - \frac{8}{100}\right)^{1},\ 15000\left(1 - \frac{8}{100}\right)^{2},\ 15000\left(1 - \frac{8}{100}\right)^{3},\ \dots$
Thus, for the value not exceeding 2/5th of its original value, we have
$15000\left(1 - \frac{8}{100}\right)^{n} \le \frac{2}{5}(15000)$
i.e., $(0.92)^{n} \le 0.4$
Since $(0.92)^{10} \approx 0.4344 > 0.4$ while $(0.92)^{11} \approx 0.3996 \le 0.4$,
∴ n = 11 years
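The smallest number of complete years in Example 2.28 can be found by repeated multiplication. The following is a sketch; `years_to_fall_below` is a hypothetical helper that works with the value as a fraction of the purchase price.

```python
def years_to_fall_below(target_fraction, rate_percent):
    """Smallest whole number of years after which value <= target_fraction of the original."""
    value, years = 1.0, 0
    while value > target_fraction:
        value *= 1 - rate_percent / 100   # one year of depreciation on the opening value
        years += 1
    return years

n = years_to_fall_below(2 / 5, 8)
print(n, round(0.92 ** 10, 4), round(0.92 ** 11, 4))
```

After 10 years the machine is still worth about 43.4% of its cost, so one more year is needed to fall to 40% or less.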
Example 2.29: A tractor was purchased for ₹ 45,000 and sold as scrap for ₹ 5000 after 10 years. Find the rate of depreciation of the tractor.
Solution: Let r% p.a. be the rate of depreciation
$T_{10} = 45000\left(1 - \frac{r}{100}\right)^{10} = 5000$
⇒ $\left(1 - \frac{r}{100}\right)^{10} = \frac{1}{9}$, i.e., $1 - \frac{r}{100} = \left(\frac{1}{9}\right)^{\frac{1}{10}}$
⇒ $1 - \frac{r}{100} = 0.80274$
$\frac{r}{100}$ = (1 – 0.80274) = 0.19726
or r = 19.726% p.a.
Example 2.30: For three consecutive months, a person deposited some amount of money on the first day of each month in a small savings fund. These three successive amounts in the deposits, the total value of which is ₹ 65, form a GP. If the two extreme amounts be multiplied each by 3 and the mean by 5, the products form an AP. Find the amounts in the first and the second deposits.
Solution: Because the product of the three amounts has not been given, there won’t be any special advantage in assuming the three amounts to be $\frac{a}{r}$, a and ar.
= ₹ 1000 $(2^{10} - 1)$
= ₹ 10,23,000
Example 2.32: ABC Company Ltd has earmarked a fund of ₹ 1 crore towards the payment of remuneration to a consultant for his advisory services rendered during a month. His pay package for that one month is as follows:
He charges ₹ 1 for the first day, ₹ 2 for the 2nd day, ₹ 4 for the 3rd day, ₹ 8 for the 4th day and so on. What is his total remuneration for that one month? Can the company afford to pay his remuneration?
Solution: His total remuneration is the sum of all the 30 terms of the GP given by
1 + 2 + 4 + 8 + ... up to 30 terms
Thus, his total remuneration = $S_{30} = \frac{1(2^{30} - 1)}{(2 - 1)}$
= $(2^{30} - 1)$
= ₹ 1,07,37,41,823
Naturally, with just an allocation of ₹ 1 crore, the company can’t afford to hire his services for ₹ 107.37 crore (approx.), which is more than 100 times the earmarked budget.
Example 2.33: The fifth term of a GP is 81 and the second term is 24. Find the
series.
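Solution: Let the GP be $a, ar, ar^{2}, \dots, ar^{n-1}$
$T_5 = ar^{4} = 81$
$T_2 = ar = 24$
∴ $\frac{T_5}{T_2} = r^{3} = \frac{81}{24} = \frac{27}{8} = \left(\frac{3}{2}\right)^{3}$
∴ $r = \frac{3}{2}$
Now, ar = 24 ⇒ $a\left(\frac{3}{2}\right) = 24$ ⇒ a = 16
Thus, the GP is $16,\ 16\left(\frac{3}{2}\right),\ 16\left(\frac{3}{2}\right)^{2},\ 16\left(\frac{3}{2}\right)^{3},\ \dots$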
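Solution:
$0.3\overline{48}$ = 0.3 + 0.048 + 0.00048 + 0.0000048 + ...
$= \frac{3}{10} + \frac{48}{10^{3}} + \frac{48}{10^{5}} + \frac{48}{10^{7}} + \dots$
$= \frac{3}{10} + \frac{\frac{48}{10^{3}}}{1 - \frac{1}{10^{2}}}$ (∵ $S_\infty = \frac{a}{1 - r}$)
$= \frac{3}{10} + \frac{48}{1000} \times \frac{100}{99}$
$= \frac{3}{10} + \frac{48}{990}$
$= \frac{345}{990} = \frac{23}{66}$
∴ $0.3\overline{48} = \frac{23}{66}$
Aliter Let x = 0.3484848...
10x = 3.484848...
1000x = 348.484848...
On subtraction, 990x = 345
$x = \frac{345}{990} = \frac{23}{66}$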
Remark
The trick lies in obtaining two different deca-multiples of the given recurring decimal,
each with the recurring part occurring immediately after the decimal and thereafter
taking the difference between these two multiples.
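The subtraction trick described in the Remark can be sketched as a small function. The name `recurring_to_fraction` and its two string arguments (the non-repeating digits and the repeating block after the decimal point) are illustrative assumptions; the repeating block must be non-empty.

```python
from fractions import Fraction

def recurring_to_fraction(non_repeating, repeating):
    """Convert 0.<non_repeating><repeating><repeating>... to an exact fraction."""
    k, m = len(non_repeating), len(repeating)
    # 10^(k+m) x - 10^k x leaves an integer: digits up to one period minus the prefix digits.
    upper = int(non_repeating + repeating)
    lower = int(non_repeating) if non_repeating else 0
    return Fraction(upper - lower, 10 ** k * (10 ** m - 1))

x = recurring_to_fraction("3", "48")   # 0.3484848...
print(x)
```

Here the two multiples are 1000x and 10x, exactly as in the alternative solution above, giving 990x = 345.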
Example 2.35: Find the first term of a GP whose second term is 2 and sum to
infinity is 8.
Solution:
Given, $T_2 = ar = 2$ ...(1)
$S_\infty = \frac{a}{1 - r} = 8$, so that $\frac{ar}{1 - r} = 8r$ ...(2)
or $\frac{2}{1 - r} = 8r$
⇒ $4r^{2} - 4r + 1 = 0$
⇒ $(2r - 1)^{2} = 0$
⇒ $r = \frac{1}{2}$
From (1), ar = 2 or $a\left(\frac{1}{2}\right) = 2$ or a = 4
Example 2.36: If $(a^{2} + b^{2})(b^{2} + c^{2}) = (ab + bc)^{2}$, then show that a, b, c are in GP.
Solution: $(a^{2} + b^{2})(b^{2} + c^{2}) = (ab + bc)^{2}$
⇒ $(a^{2}b^{2} + a^{2}c^{2} + b^{4} + b^{2}c^{2}) = (a^{2}b^{2} + b^{2}c^{2} + 2ab^{2}c)$
⇒ $(a^{2}c^{2} + b^{4}) = 2ab^{2}c$
⇒ $[(ac)^{2} + (b^{2})^{2} - 2(ac)(b^{2})] = 0$
⇒ $(ac - b^{2})^{2} = 0$
⇒ $b^{2} = ac$
⇒ a, b, c are in GP
Example 2.37: If a, b, c, d are in GP, then show that
$(a^{2} + b^{2} + c^{2})(b^{2} + c^{2} + d^{2}) = (ab + bc + cd)^{2}$
Solution: Let r be the common ratio,
then b = ar, $c = ar^{2}$, $d = ar^{3}$
∴ LHS = $(a^{2} + b^{2} + c^{2})(b^{2} + c^{2} + d^{2})$
$= (a^{2} + a^{2}r^{2} + a^{2}r^{4}) \times (a^{2}r^{2} + a^{2}r^{4} + a^{2}r^{6})$
$= a^{4}r^{2}(1 + r^{2} + r^{4})^{2}$
RHS = $(ab + bc + cd)^{2} = (a^{2}r + a^{2}r^{3} + a^{2}r^{5})^{2}$
$= a^{4}r^{2}(1 + r^{2} + r^{4})^{2}$
∴ LHS = RHS (Hence the result)
Example 2.38: Find the sum to n terms of the series 4 + 44 + 444 + ...
Solution: $S_n$ = 4 + 44 + 444 + ... (n terms)
⇒ $\frac{S_n}{4}$ = 1 + 11 + 111 + ...
⇒ $\frac{9S_n}{4}$ = 9 + 99 + 999 + ...
⇒ $\frac{9S_n}{4} = (10 - 1) + (10^{2} - 1) + (10^{3} - 1) + \dots$
⇒ $\frac{9S_n}{4} = (10^{1} + 10^{2} + 10^{3} + \dots + 10^{n}) - (1 + 1 + 1 + \dots$ up to n terms)
⇒ $\frac{9S_n}{4} = \frac{10(10^{n} - 1)}{(10 - 1)} - n$
⇒ $\frac{9S_n}{4} = \frac{10(10^{n} - 1)}{9} - n$
⇒ $S_n = \frac{40}{81}(10^{n} - 1) - \frac{4n}{9}$
Example 2.39: Find the sum to n terms of the series 0.7 + 0.77 + 0.777 + ...
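Solution:
Let $S_n$ = 0.7 + 0.77 + 0.777 + ... (n terms)
$\frac{S_n}{7}$ = 0.1 + 0.11 + 0.111 + ...
⇒ $\frac{9S_n}{7}$ = 0.9 + 0.99 + 0.999 + ...
⇒ $\frac{9S_n}{7}$ = (1 – 0.1) + (1 – 0.01) + (1 – 0.001) + ...
⇒ $\frac{9S_n}{7} = \left(1 - \frac{1}{10}\right) + \left(1 - \frac{1}{10^{2}}\right) + \left(1 - \frac{1}{10^{3}}\right) + \dots$
⇒ $\frac{9S_n}{7} = (1 + 1 + 1 + \dots$ up to n terms$) - \left(\frac{1}{10} + \frac{1}{10^{2}} + \dots + \frac{1}{10^{n}}\right)$
⇒ $\frac{9S_n}{7} = n - \frac{\frac{1}{10}\left(1 - \frac{1}{10^{n}}\right)}{1 - \frac{1}{10}}$
⇒ $\frac{9S_n}{7} = n - \frac{(1 - 10^{-n})}{9}$
⇒ $S_n = \frac{7n}{9} - \frac{7}{81}(1 - 10^{-n})$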
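The closed forms obtained in Examples 2.38 and 2.39 can be compared with direct term-by-term sums. This is a sketch using exact fractions; all function names are illustrative.

```python
from fractions import Fraction

def direct_sum_44(n):
    """4 + 44 + 444 + ... to n terms, added term by term."""
    return sum(4 * (10 ** k - 1) // 9 for k in range(1, n + 1))

def closed_form_44(n):
    """S_n = 40/81 (10^n - 1) - 4n/9."""
    return Fraction(40, 81) * (10 ** n - 1) - Fraction(4 * n, 9)

def direct_sum_07(n):
    """0.7 + 0.77 + 0.777 + ... to n terms, as exact fractions."""
    return sum(Fraction(7 * (10 ** k - 1) // 9, 10 ** k) for k in range(1, n + 1))

def closed_form_07(n):
    """S_n = 7n/9 - 7/81 (1 - 10^(-n))."""
    return Fraction(7 * n, 9) - Fraction(7, 81) * (1 - Fraction(1, 10 ** n))

print(direct_sum_44(3), closed_form_44(3))
print(direct_sum_07(2), closed_form_07(2))
```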
Example 2.40: Insert 5 GMs between 3 and 192.
Solution: Let g1, g2, g3, g4, g5 be the 5 GMs so that 3, g1, g2, g3, g4, g5, 192 are
in GP.
Let r be the common ratio of this GP.
$T_1 = a = 3$
∴ $T_7 = ar^{7-1} = 192$
⇒ $3r^{6} = 192$
⇒ $r^{6} = \frac{192}{3} = 64 = 2^{6}$
⇒ r = 2
∴ $g_1 = T_2 = 3r = 3 \times 2 = 6$
$g_2 = T_3 = 3r^{2} = 3 \times 2^{2} = 12$
$g_3 = T_4 = 3r^{3} = 3 \times 2^{3} = 24$
$g_4 = T_5 = 3r^{4} = 3 \times 2^{4} = 48$
$g_5 = T_6 = 3r^{5} = 3 \times 2^{5} = 96$
Hence the five GMs are 6, 12, 24, 48, 96.
Example 2.41: If $G_1$ and $G_2$ are two GMs between b and c and a is their AM, then show that $G_1^{3} + G_2^{3} = 2abc$.
Solution: Given $G_1$, $G_2$ are two GMs between b and c. ∴ b, $G_1$, $G_2$, c are in GP.
If r is the common ratio, then $r = \left(\frac{c}{b}\right)^{1/3}$
$G_1 = br = b\left(\frac{c}{b}\right)^{1/3}$
$G_2 = br^{2} = b\left(\frac{c}{b}\right)^{2/3}$
$G_1^{3} = b^{2}c$
$G_2^{3} = bc^{2}$
$G_1^{3} + G_2^{3} = bc(b + c)$
Since a is the AM between b and c, we have $a = \frac{b + c}{2}$
⇒ b + c = 2a
∴ $G_1^{3} + G_2^{3} = 2abc$
Example 2.42: The sum of three terms of a GP is 21 and their product is 216. Find
the terms.
Solution: Let the numbers be $\frac{a}{r}$, a, ar
∴ $\frac{a}{r} \cdot a \cdot ar = 216$
$a^{3} = 216 = 6^{3}$ ⇒ a = 6
Also, sum = 21
∴ $\frac{a}{r} + a + ar = 21$
⇒ $\frac{6}{r} + 6 + 6r = 21$
⇒ $6(1 + r + r^{2}) = 21r$
⇒ $2r^{2} - 5r + 2 = 0$
⇒ $2r^{2} - 4r - r + 2 = 0$
⇒ 2r (r – 2) – 1 (r – 2) = 0
⇒ (r – 2) (2r – 1) = 0
⇒ r = 2 or $\frac{1}{2}$
∴ The numbers are $\frac{6}{2}, 6, 6 \times 2$ or $\frac{6}{1/2}, 6, 6 \times \frac{1}{2}$, i.e., 3, 6, 12 or 12, 6, 3.
$\frac{a^{2}(1 + r + r^{2})^{2}}{a^{2}(1 + r^{2} + r^{4})} = \frac{7^{2}}{21} = \frac{49}{21} = \frac{7}{3}$
$\frac{(1 + r + r^{2})^{2}}{(1 + r^{2} + r^{4})} = \frac{7}{3}$
⇒ $\frac{(1 + r + r^{2})^{2}}{(1 + r + r^{2})(1 - r + r^{2})} = \frac{7}{3}$
Note
$1 + r^{2} + r^{4} = 1 + r^{4} + r^{2}$
$= [1^{2} + (r^{2})^{2} + 2(1)(r^{2})] - r^{2}$
$= (1 + r^{2})^{2} - r^{2}$
$= (1 + r^{2} + r)(1 + r^{2} - r)$
⇒ $\frac{1 + r + r^{2}}{1 - r + r^{2}} = \frac{7}{3}$
⇒ $(4r^{2} - 10r + 4) = 0$
⇒ $2r^{2} - 5r + 2 = 0$
⇒ (r – 2) (2r – 1) = 0
⇒ r = 2 or ½
∵ $a(1 + r + r^{2}) = 7$, a (1 + 2 + 4) = 7 ⇒ a = 1
∴ The numbers are 1, 2, 4.
Note
Even if r = ½ is used, we get the same numbers (in the reverse order).
Example 2.44: a, b, c are the three numbers in GP and their sum is 28. If ab + bc
+ ca = 224, find the numbers.
Solution: a, b, c are in GP ⇒ b2 = ac, a + b + c = 28, ab + bc + ca = 224
∴ ab + bc + b2 = 224
⇒ b (a + b + c) = 224
⇒ b (28) = 224
⇒ b=8
If r be the common ratio, then $a = \frac{8}{r}$, c = 8r
∴ $\frac{8}{r} + 8 + 8r = 28$
⇒ $2r^{2} - 5r + 2 = 0$
⇒ (r – 2) (2r – 1) = 0
⇒ r = 2 or $\frac{1}{2}$
∴ Taking r = 2, $a = \frac{8}{2} = 4$, c = 8 × 2 = 16
∴ The numbers are 4, 8 and 16.
Example 2.45: The second term of a HP is $\frac{1}{5}$ and its 9th term is $\frac{1}{19}$. Determine the series.
Solution:
$T_2$ of the corresponding AP = 5 = a + d ...(1)
$T_9$ of the corresponding AP = 19 = a + 8d ...(2)
On subtraction, 14 = 7d
d = 2
Substituting d = 2 in (1),
5 = a + 2 ⇒ a = 3
Thus, the AP is 3, 3 + 2, 3 + 4, 3 + 6, ...
or 3, 5, 7, 9, ...
Hence, the HP is $\frac{1}{3}, \frac{1}{5}, \frac{1}{7}, \frac{1}{9}, \dots$
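Example 2.46: If a, b, c are in HP (a ≠ b ≠ c), prove that $\frac{a}{c} = \frac{a - b}{b - c}$
Solution: a, b, c are in HP ⇒ $\frac{1}{a}, \frac{1}{b}, \frac{1}{c}$ are in AP
⇒ $\frac{1}{b} - \frac{1}{a} = \frac{1}{c} - \frac{1}{b}$
⇒ $\frac{a - b}{ab} = \frac{b - c}{bc}$
⇒ $\frac{a - b}{a} = \frac{b - c}{c}$
⇒ $\frac{a}{c} = \frac{a - b}{b - c}$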
Example 2.47: Find the (m + n)th term of the HP of which the mth term is n and
nth term is m. Also find the (mn)th term.
Solution: Let the corresponding AP be a, a + d, a + 2d, ...
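$T_m$ of the AP = a + (m – 1)d = $\frac{1}{n}$ (given)
$T_n$ of the AP = a + (n – 1)d = $\frac{1}{m}$ (given)
∴ On subtraction, (m – n)d = $\frac{1}{n} - \frac{1}{m} = \frac{m - n}{mn}$
$d = \frac{1}{mn}$
But, a + (m – 1)d = $\frac{1}{n}$
∴ $a + \frac{(m - 1)}{mn} = \frac{1}{n}$
$a = \frac{1}{n} - \frac{m - 1}{mn}$
$a = \frac{1}{mn}$
∴ (m + n)th term of the AP, $T_{m+n}$ = a + (m + n – 1) d
$= \frac{1}{mn} + \frac{m + n - 1}{mn}$
$= \frac{m + n}{mn}$
∴ The (m + n)th term of the given HP = $\frac{mn}{(m + n)}$
$T_{mn}$ of the HP = $\frac{1}{a + (mn - 1)d} = \frac{1}{\frac{1}{mn} + \frac{mn - 1}{mn}} = 1$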
Example 2.48: If log (a + c) + log (a + c –2b) = 2 log (a – c), then prove that a,
b, c are in HP.
Solution: log (a + c) + log (a + c – 2b) = 2 log (a – c)
⇒ $\log[(a + c)(a + c - 2b)] = \log[(a - c)^{2}]$
⇒ $(a + c)(a + c - 2b) = (a - c)^{2}$
⇒ $(a + c)^{2} - 2b(a + c) = (a - c)^{2}$
⇒ $(a + c)^{2} - (a - c)^{2} = 2b(a + c)$
⇒ 4ac = 2b (a + c)
⇒ $b = \frac{2ac}{a + c}$
⇒ a, b, c are in HP
Example 2.49: If a, b, c are in HP, then show that $\frac{b + a}{b - a} + \frac{b + c}{b - c} = 2$.
Solution: $\frac{b + a}{b - a} + \frac{b + c}{b - c} = 2$
⇔ (b + a) (b – c) + (b – a) (b + c) = 2 (b – a) (b – c)
⇔ $2b^{2} - 2ac = 2b^{2} - 2ab - 2bc + 2ac$
⇔ 2ab + 2bc = 4ac
⇔ $b = \frac{2ac}{a + c}$
⇔ a, b, c are in HP.
Example 2.50: If the pth, qth and rth terms of a HP are respectively P, Q and R,
prove that PQ (p – q) + QR (q – r) + RP (r – p) = 0.
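Solution: $P = T_p = \frac{1}{a + (p - 1)d}$; $Q = T_q = \frac{1}{a + (q - 1)d}$; $R = T_r = \frac{1}{a + (r - 1)d}$
$\sum PQ(p - q) = \sum \frac{(p - q)}{[a + (p - 1)d][a + (q - 1)d]}$
$= \sum \frac{(p - q)[a + (r - 1)d]}{[a + (p - 1)d][a + (q - 1)d][a + (r - 1)d]}$
$= \frac{a\sum (p - q) + d\sum (p - q)r - d\sum (p - q)}{[a + (p - 1)d][a + (q - 1)d][a + (r - 1)d]} = 0$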
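Example 2.51: Insert 4 HMs between $\frac{2}{3}$ and $\frac{2}{13}$.
Solution: Let $x_1, x_2, x_3, x_4$ be 4 HMs between $\frac{2}{3}$ and $\frac{2}{13}$
∴ $\frac{2}{3}, x_1, x_2, x_3, x_4, \frac{2}{13}$ are in HP
⇒ $\frac{3}{2}, \frac{1}{x_1}, \frac{1}{x_2}, \frac{1}{x_3}, \frac{1}{x_4}, \frac{13}{2}$ are in AP
$a = \frac{3}{2}$
$T_6 = a + 5d = \frac{13}{2}$
$= \frac{3}{2} + 5d = \frac{13}{2}$
⇒ d = 1
∴ $T_2 = \frac{1}{x_1} = a + d = \frac{3}{2} + 1 = \frac{5}{2}$,
$T_3 = \frac{1}{x_2} = a + 2d = \frac{3}{2} + 2 = \frac{7}{2}$,
$T_4 = \frac{1}{x_3} = a + 3d = \frac{3}{2} + 3 = \frac{9}{2}$,
$T_5 = \frac{1}{x_4} = a + 4d = \frac{3}{2} + 4 = \frac{11}{2}$.
Hence, the 4 HMs are $\frac{2}{5}, \frac{2}{7}, \frac{2}{9}$ and $\frac{2}{11}$ respectively.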
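The method of Example 2.51 (pass to the corresponding AP of reciprocals, insert arithmetic means, then invert) can be sketched as follows. The function name `harmonic_means` is illustrative, and non-zero inputs are assumed.

```python
from fractions import Fraction

def harmonic_means(a, b, n):
    """Return the n harmonic means between non-zero numbers a and b."""
    ra, rb = 1 / Fraction(a), 1 / Fraction(b)        # corresponding AP end terms
    d = (rb - ra) / (n + 1)                          # common difference of the AP
    return [1 / (ra + k * d) for k in range(1, n + 1)]  # invert each arithmetic mean

hms = harmonic_means(Fraction(2, 3), Fraction(2, 13), 4)
print(hms)
```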
Example 2.52: Find n such that $\frac{a^{n+1} + b^{n+1}}{a^{n} + b^{n}}$ may be the harmonic mean between a and b.
Solution: The harmonic mean between two numbers a and b is $\frac{2ab}{a + b}$
∴ $\frac{a^{n+1} + b^{n+1}}{a^{n} + b^{n}} = \frac{2ab}{a + b}$
⇒ n = –1
Remark
The same expression gives the AM and GM between a and b for n = 0 and n = –1/2 respectively.
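Example 2.53: If $\frac{b + c}{a + d} = \frac{bc}{ad} = \frac{3(b - c)}{(a - d)}$, then show that a, b, c, d are in HP.
Solution: $\frac{b + c}{a + d} = \frac{bc}{ad}$ ⇒ $\frac{a + d}{ad} = \frac{b + c}{bc}$
⇒ $\frac{1}{d} + \frac{1}{a} = \frac{1}{c} + \frac{1}{b}$
⇒ $\frac{1}{b} - \frac{1}{a} = \frac{1}{d} - \frac{1}{c}$ ...(1)
Also, $\frac{bc}{ad} = \frac{3(b - c)}{(a - d)}$ ⇒ $\frac{a - d}{ad} = \frac{3(b - c)}{bc}$
⇒ $\frac{1}{d} - \frac{1}{a} = \frac{3}{c} - \frac{3}{b}$ ...(2)
$\frac{1}{d} + \frac{1}{a} = \frac{1}{c} + \frac{1}{b}$ ··· from (1)
Adding, $\frac{2}{d} = \frac{4}{c} - \frac{2}{b}$
or $\frac{2}{d} + \frac{2}{b} = \frac{4}{c}$
or $\frac{1}{d} + \frac{1}{b} = \frac{1}{c} + \frac{1}{c}$
or $\frac{1}{d} - \frac{1}{c} = \frac{1}{c} - \frac{1}{b}$ ...(3)
From (1) and (3), we have $\frac{1}{b} - \frac{1}{a} = \frac{1}{c} - \frac{1}{b} = \frac{1}{d} - \frac{1}{c}$
⇒ $\frac{1}{a}, \frac{1}{b}, \frac{1}{c}, \frac{1}{d}$ are in AP
⇒ a, b, c, d are in HP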
Example 2.54: If $a_1, a_2, a_3, \dots, a_n$ are in HP, then show that
$a_1 a_2 + a_2 a_3 + a_3 a_4 + \dots + a_{n-1} a_n = (n - 1)\, a_1 a_n$
Solution: Given $a_1, a_2, a_3, \dots, a_n$ are in HP.
∴ $\frac{1}{a_1}, \frac{1}{a_2}, \frac{1}{a_3}, \dots, \frac{1}{a_n}$ are in AP.
Let ‘d’ be the common difference of the AP.
Then $\frac{1}{a_2} - \frac{1}{a_1} = d$ ⇒ $\frac{a_1 - a_2}{d} = a_1 a_2$
$\frac{1}{a_3} - \frac{1}{a_2} = d$ ⇒ $\frac{a_2 - a_3}{d} = a_2 a_3$
... ...
$\frac{1}{a_n} - \frac{1}{a_{n-1}} = d$ ⇒ $\frac{a_{n-1} - a_n}{d} = a_{n-1} a_n$
∴ $(a_1 a_2 + a_2 a_3 + \dots + a_{n-1} a_n) = \frac{(a_1 - a_2) + (a_2 - a_3) + \dots + (a_{n-1} - a_n)}{d}$
$= \frac{a_1 - a_n}{d}$ ...(1)
But, $\frac{1}{a_n}$ = $T_n$ of the AP = $\frac{1}{a_1} + (n - 1)d$
⇒ $\frac{1}{a_n} - \frac{1}{a_1} = (n - 1)d$
⇒ $\frac{a_1 - a_n}{a_1 a_n} = (n - 1)d$
⇒ $\frac{a_1 - a_n}{d} = (n - 1)\, a_1 a_n$ ...(2)
From (1) and (2), we have
$a_1 a_2 + a_2 a_3 + \dots + a_{n-1} a_n = (n - 1)\, a_1 a_n$ (Hence the result)
Example 2.55: If a, b, c are in HP, then show that $\frac{a}{b + c}, \frac{b}{c + a}, \frac{c}{a + b}$ are in HP.
Solution: $\frac{a}{b + c}, \frac{b}{c + a}, \frac{c}{a + b}$ are in HP
⇔ $\frac{b + c}{a}, \frac{c + a}{b}, \frac{a + b}{c}$ are in AP
⇔ $\frac{b + c}{a} + 1, \frac{c + a}{b} + 1, \frac{a + b}{c} + 1$ are in AP
⇔ $\frac{a + b + c}{a}, \frac{a + b + c}{b}, \frac{a + b + c}{c}$ are in AP
⇔ $\frac{1}{a}, \frac{1}{b}, \frac{1}{c}$ are in AP
⇔ a, b, c are in HP
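Solution: $\frac{a}{b + c - 2a}, \frac{b}{c + a - 2b}, \frac{c}{a + b - 2c}$ are in HP
⇔ $\frac{b + c - 2a}{a}, \frac{c + a - 2b}{b}, \frac{a + b - 2c}{c}$ are in AP
⇔ $\frac{b + c}{a} - 2, \frac{c + a}{b} - 2, \frac{a + b}{c} - 2$ are in AP
⇔ $\frac{b + c}{a}, \frac{c + a}{b}, \frac{a + b}{c}$ are in AP
⇔ a, b, c are in HP (as per the previous problem)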
Example 2.57: The sum of first three terms of a HP is 22. If the first term be 12,
find the HP.
Solution: Let the first three terms of the corresponding AP be a, a + d, a + 2d.
Thus, $\frac{1}{a} = 12$ or $a = \frac{1}{12}$
Example 2.58: The harmonic mean of two numbers is 4. The arithmetic mean A and
the geometric mean G of these numbers are connected by the relation 2A + G2
= 27. Find the numbers.
Solution: Let the numbers be a and b.
$A = \frac{a+b}{2}$; $G = \sqrt{ab}$; $H = \frac{2ab}{a+b} = 4$
⇒ 2A = (a + b); $G^2 = ab$; ab = 2 (a + b)
∴ $2A + G^2 = 27$ ⇒ (a + b) + ab = 27
⇒ (a + b) + 2 (a + b) = 27
⇒ 3 (a + b) = 27
⇒ (a + b) = 9
∴ ab = 2 (a + b) = 18
∴ $(a - b) = \pm\sqrt{(a+b)^2 - 4ab} = \pm\sqrt{81 - 72} = \pm 3$
Solving, a = 6 or 3
b = 3 or 6
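The answer to Example 2.58 can be checked numerically. The following is a minimal sketch, not part of the original text: the helper name `numbers_from_sum_and_product` is hypothetical, and it simply recovers the two numbers from their sum 9 and product 18 and then recomputes the three means.

```python
import math

def numbers_from_sum_and_product(s, p):
    """Return the two numbers whose sum is s and product is p (real roots assumed)."""
    disc = s * s - 4 * p              # (a - b)^2 = (a + b)^2 - 4ab
    root = math.sqrt(disc)
    return (s + root) / 2, (s - root) / 2

a, b = numbers_from_sum_and_product(9, 18)
arithmetic_mean = (a + b) / 2
geometric_mean_sq = a * b             # G^2 = ab
harmonic_mean = 2 * a * b / (a + b)

print(a, b)
print(harmonic_mean, 2 * arithmetic_mean + geometric_mean_sq)
```

The two printed checks correspond to the conditions H = 4 and 2A + G² = 27 stated in the example.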
Arithmetical Progression
Quantities $a_1, a_2, a_3, \ldots, a_n, \ldots$ are said to be in Arithmetical Progression if $a_n - a_{n-1}$ is constant for all integers n > 1. The constant quantity $a_n - a_{n-1}$ is called the
common difference of the arithmetical progression.
Notation. A.P. stands for an arithmetical progression. Consider the following
series.
1, 3, 5, 7, 9, 11, ...
0, $\sqrt{2}$, $2\sqrt{2}$, $3\sqrt{2}$, $4\sqrt{2}$, ...
1, $\frac{1}{2}$, 0, $-\frac{1}{2}$, –1, $-\frac{3}{2}$, ...
x + y , x, x – y, x – 2y, ...
⇒ 9 = 49 – 5n + 5
or 5n = 54 – 9 = 45
n =9
Thus 9th term of the given A.P. is 9.
Sum of Finite Number of Quantities in an Arithmetic Progression
Let $a_1, a_2, \ldots, a_n$ be n quantities in A.P., and let the last term $a_n$ be denoted by l. If d
is their common difference, then
$a_n = a_1 + (n-1)d = l$
Put $S_n = a_1 + a_2 + \cdots + a_n$
Thus $S_n = a_1 + (a_1 + d) + (a_1 + 2d) + \cdots + [a_1 + (n-1)d]$
$= a_1 + (a_1 + d) + (a_1 + 2d) + \cdots + (l - d) + l$ ...(1)
Writing the above series in reverse order, we get
$S_n = l + (l - d) + (l - 2d) + \cdots + (a_1 + d) + a_1$ ...(2)
Adding Equations (1) and (2), we get
$2S_n = (a_1 + l) + (a_1 + l) + \cdots + (a_1 + l)$ (n times)
$= n(a_1 + l)$
Therefore, $S_n = \frac{n}{2}(a_1 + l)$
$= \frac{n}{2}\{a_1 + [a_1 + (n-1)d]\}$
Consequently, $S_n = \frac{n}{2}[2a_1 + (n-1)d]$
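The closed form for the sum of an A.P. can be compared with term-by-term addition. This is an illustrative sketch only; the function names `ap_sum` and `ap_sum_direct` are hypothetical and do not come from the text.

```python
def ap_sum(first, diff, n):
    """Sum of the first n terms of an A.P. using S_n = n/2 * [2*a1 + (n - 1)*d]."""
    return n * (2 * first + (n - 1) * diff) / 2

def ap_sum_direct(first, diff, n):
    """Same sum obtained by adding the n terms one by one."""
    return sum(first + k * diff for k in range(n))

# The series 1, 3, 5, 7, 9, 11 has first term 1 and common difference 2.
print(ap_sum(1, 2, 6), ap_sum_direct(1, 2, 6))
```

Both functions should agree for any first term, common difference and number of terms.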
Check Your Progress - 1
If a1, a2, ..., an are in A.P., then the quantities a2, a3, ..., an–1 are called Arithmetic
Means (A.M.) between a1 and an.
Thus in the series, 1, 3, 5, 7, 9, 11, 13, 15, ...
3, 5 are arithmetic means between 1 and 7.
9, 11, 13 are arithmetic means between 7 and 15.
Example 2.62: If pth, qth, rth term of an A.P. are a, b, c, respectively, show that
(q – r) a + (r –p)b + (p – q)c = 0
Solution. Here, pth term = a = a1 + (p – 1)d ...(1)
qth term = b = a1 + (q – 1)d ...(2)
rth term = c = a1 + (r – 1)d ...(3)
where a1 is the first term and d is the common difference of the A.P.
Multiply equations (1) by q – r, (2) by r – p, (3) by p – q and add to obtain
(q – r) a + (r – p) b + (p – q) c= a1(q – r) + a1(r – p) +a1(p – q) + d[(p – 1) (q – r)
or $\frac{a_1 + 10d_1}{b_1 + 10d_2} = \frac{148}{111}$
or $\frac{a_{11}}{b_{11}} = \frac{148}{111}$
where $a_{11}$ and $b_{11}$ are the 11th terms of the two A.P.s respectively.
The ratio of their 11th terms is 4:3.
Example 2.64: If $\frac{1}{b+c}, \frac{1}{c+a}, \frac{1}{a+b}$ are in A.P., prove that $a^2, b^2, c^2$ are also in A.P.
Solution: Since $\frac{1}{b+c}, \frac{1}{c+a}, \frac{1}{a+b}$ are in A.P.,
we have $\frac{1}{c+a} - \frac{1}{b+c} = \frac{1}{a+b} - \frac{1}{c+a}$
⇒ $\frac{(b+c) - (c+a)}{(b+c)(c+a)} = \frac{(c+a) - (a+b)}{(a+b)(c+a)}$
⇒ $\frac{b-a}{b+c} = \frac{c-b}{b+a}$
⇒ $b^2 - a^2 = c^2 - b^2$
⇒ $a^2, b^2, c^2$ are in A.P.
Example 2.65: The monthly salary of a person was ₹ 320 for each of the first three
years. He then got annual increments of ₹ 40 per month for each of the following
12 successive years. His salary remained stationary till retirement, when he found that
his average monthly salary during the service period was ₹ 698. Find the period of his
service.
Solution: Let n be the total number of years of the person’s service.
His total salary = ₹ 12n × 698
(As his monthly average is ₹ 698)
Total salary in first three years of service
= 320 × 3 × 12 = ₹ 960 × 12
In the 4th year, his monthly salary was ₹ (320 + 40) = ₹ 360
In the 5th year his monthly salary was ₹ 400, and so on.
Then for the next 12 years, his total salary
= ₹ 12 × [360 + 400 + ... up to 12 terms]
= ₹ 12 × $\frac{12}{2}$ [2 × 360 + (12 – 1) × 40]
= ₹ 12 × 6 (720 + 440)
= ₹ 12 × 6 × 1160
= ₹ 12 × 6960
At the end of the following 12 years, his monthly salary was
₹ [360 + (12 – 1) × 40] = ₹ 800
He got ₹ 800 as monthly salary for the remaining (n – 15) years. So his total salary for the
remaining (n – 15) years was ₹ (n – 15) × 800 × 12
Hence his total salary throughout his service period
= 12[960 + 6960 + 800(n – 15)]
= 12(7920 + 800n – 12000)
= 12 (800n – 4080)
This must be the same as 12n × 698
i.e., 12n × 698 = 12(800n – 4080)
⇒ 102n = 4080 ⇒ n = 40 years.
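The result of Example 2.65 can also be confirmed by simulating the salary year by year. The sketch below is only an illustration; the function `average_monthly_salary` and the search range of 15 to 99 years are assumptions made for the check, not part of the original solution.

```python
def average_monthly_salary(years):
    """Average monthly salary over the given number of service years."""
    total = 0
    for year in range(1, years + 1):
        if year <= 3:
            monthly = 320                      # first three years
        elif year <= 15:
            monthly = 320 + 40 * (year - 3)    # twelve annual increments of 40
        else:
            monthly = 800                      # stationary until retirement
        total += 12 * monthly
    return total / (12 * years)

# Search for the service period whose average monthly salary is exactly 698.
service = next(n for n in range(15, 100) if average_monthly_salary(n) == 698)
print(service)
```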
$y = \sqrt{-2x + 3}$
Solution: The function is defined for every x for which the expression under the square
root is non-negative; a negative value under the square root would give a complex number.
∴ –2x + 3 ≥ 0
–2x ≥ –3
or 2x ≤ 3
or x ≤ 3/2
Domain is {x : x ≤ 3/2}
Range is {y : y ≥ 0}
Sigma Notation
Sigma notation uses the symbol Σ, the upper-case Greek letter sigma (corresponding to S),
which stands for sum. It indicates that the values of the expression written after it are to be added up.
For example, Σn means that we sum the values of n.
$\sum_{n=1}^{4} n$ = [1 + 2 + 3 + 4]
= 10
In the same way, more complex terms can be evaluated under sigma
notation, as
$\sum_{n=1}^{4} (2n+1)$ = (3 + 5 + 7 + 9)
= 24
5
= 55
Example 2.70: Write 1 + 2 + 3 + … + 7 + 8 using sigma notation.
Solution: $\sum_{n=1}^{8} n$
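Sigma notation maps directly onto a loop that adds up a term for each value of the index. The following is a minimal sketch under that reading; the helper name `sigma` is hypothetical.

```python
def sigma(term, lower, upper):
    """Evaluate the sum of term(n) for n = lower, ..., upper (both inclusive)."""
    return sum(term(n) for n in range(lower, upper + 1))

print(sigma(lambda n: n, 1, 4))          # 1 + 2 + 3 + 4
print(sigma(lambda n: 2 * n + 1, 1, 4))  # 3 + 5 + 7 + 9
print(sigma(lambda n: n, 1, 8))          # the sum written in Example 2.70
```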
2.4 SUMMARY
• Quantities $a_1, a_2, a_3, \ldots, a_n, \ldots$ are said to be in Arithmetical Progression
if $a_n - a_{n-1}$ is constant for all integers n > 1. The constant quantity
$a_n - a_{n-1}$ is called the common difference of the arithmetical progression.
• Sigma notation uses the symbol Σ, the upper-case Greek letter sigma (corresponding to S),
which stands for sum. It indicates that the values of the expression written after it are to be added up.
4. A manufacturer of TV sets produced 670 units in the third year and 770
units in the seventh year. Assuming that the increase in production every
year is the same, find what was (i) the total production in 9 years and
(ii) the production in the 11th year?
5. The monthly salary of a person was ₹ 320 for each of the first three years.
He next got annual increments of ₹ 40 per month for each of the following
12 successive years. His salary remained stationary till retirement, when he
found that his average monthly salary during the service period was ₹ 698.
Find the period of his service.
6. Two posts were offered to a man. In the first one, the starting salary was
₹ 500 per month and the annual increment was ₹ 15. In the second one,
the salary commenced at ₹ 320 per month, but the annual increment was
₹ 22. He decided to accept the post which would give him more earnings
in the first 20 years of service. Which post was acceptable to him?
Justify your answer.
7. If $S_n$ is the sum of the first ‘n’ terms of an arithmetic series, then show that
$S_{n+3} - 3S_{n+2} + 3S_{n+1} - S_n = 0$.
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
Geometric Progression and
Series
Objectives
After going through this unit, you will be able to:
• Define nth term of a geometric progression
• Analyse the sum to infinity of a geometric progression
• Assess the sum of integrity of a geometric progression
• Discuss geometric mean and its significance
Structure
3.1 Introduction
3.2 Geometric Progression and Geometric Means
3.3 Sum of Geometric Progression
3.4 Summary
3.5 Key Words
3.6 Answers to ‘Check Your Progress’
3.7 Self-Assessment Questions
3.8 Further Readings
3.1 INTRODUCTION
This unit discusses geometric progressions and series. A geometric progression is a
sequence of numbers in which each term after the first is found by multiplying the
previous one by a fixed, non-zero number called the common ratio. The common ratio
can be found by dividing the second term by the first term. The nth term of a geometric
progression with initial value a and common ratio r is given by $a_n = ar^{n-1}$.
The unit also covers the harmonic progression, whose terms are the reciprocals of the
terms of an arithmetic progression, and the geometric mean, an average which indicates
the central tendency or typical value of a set of numbers. The various aspects of
geometric progressions and series are discussed in detail, ranging from the sum of the
first n terms of a geometric progression to its sum to infinity.
The aspects of geometric progression, its nth term and mean are discussed here.
In (iv) 1st term = a, r = $\frac{1}{b}$; hence, 11th term = $ar^{10} = \frac{a}{b^{10}}$.
In (v) a = 1, r = $\frac{1}{5}$; hence, 8th term = $ar^{7} = \frac{1}{5^{7}} = \frac{1}{78125}$.
Sum of First n Terms of a G.P.
Let a, ar, $ar^2$, ... be a given G.P. and let $S_n$ be the sum of its first n terms.
Then, $S_n = a + ar + ar^2 + \cdots + ar^{n-1}$.
This gives that $rS_n = ar + ar^2 + \cdots + ar^{n-1} + ar^n$
Subtracting, we get, $S_n - rS_n = a - ar^n = a(1 - r^n)$
In case r ≠ 1, $S_n = \frac{a(1 - r^n)}{1 - r}$
= $\frac{2^{11} + 1}{3 \cdot 2^{10}} = \frac{683}{1024}$.
Harmonic Series
Non-zero quantities whose reciprocals are in A.P. are said to be in Harmonical
Progression (H.P.)
Consider the following examples:
1. 1, $\frac{1}{3}$, $\frac{1}{5}$, $\frac{1}{7}$, ...
2. $\frac{1}{2}$, $\frac{1}{5}$, $\frac{1}{8}$, $\frac{1}{11}$, ...
3. 2, $\frac{5}{2}$, $\frac{10}{3}$, ...
4. $\frac{1}{a}$, $\frac{1}{a+b}$, $\frac{1}{a+2b}$, ... where a, b > 0.
5. 5, $\frac{55}{9}$, $\frac{55}{7}$, 11, ...
It can be easily checked that in each case the series obtained by taking the
reciprocal of each term is an A.P.
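The check described above, taking reciprocals and testing for a common difference, can be sketched as follows. The function name `is_harmonic_progression` is hypothetical, and exact fractions are used so that no rounding error affects the comparison.

```python
from fractions import Fraction

def is_harmonic_progression(terms):
    """True if the reciprocals of the non-zero terms share one common difference."""
    reciprocals = [1 / Fraction(t) for t in terms]
    differences = {b - a for a, b in zip(reciprocals, reciprocals[1:])}
    return len(differences) <= 1

print(is_harmonic_progression([1, Fraction(1, 3), Fraction(1, 5), Fraction(1, 7)]))
print(is_harmonic_progression([2, Fraction(5, 2), Fraction(10, 3)]))
print(is_harmonic_progression([5, Fraction(55, 9), Fraction(55, 7), 11]))
print(is_harmonic_progression([1, 2, 4]))   # a G.P., not an H.P.
```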
Geometric Means
Geometric mean or GM is the mean or average which indicates the central tendency
or typical value of a set of numbers.
If α, β, γ are in G.P., then β is called a geometric mean between α and γ (written as
G.M.).
If a1, a2, ..., an are in G.P., then a2, ..., an–1 are called geometric means between
a1 and an.
Thus 3, 9, 27 are three geometric means between 1 and 81.
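So, $G_1 = ar = a\left(\frac{b}{a}\right)^{\frac{1}{n+1}} = a^{\frac{n}{n+1}}\, b^{\frac{1}{n+1}}$
$G_2 = ar^2 = a\left(\frac{b}{a}\right)^{\frac{2}{n+1}} = a^{\frac{n-1}{n+1}}\, b^{\frac{2}{n+1}}$
... ... ...
$G_n = ar^n = a\left(\frac{b}{a}\right)^{\frac{n}{n+1}} = a^{\frac{1}{n+1}}\, b^{\frac{n}{n+1}}$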
Example 3.3: Find 7 G.M.’s between 1 and 256.
Solution. Let G1, G2, ... G7, be 7 G.M.’s between 1 and 256.
Then 256 = 9th term of the G.P.
= 1 · $r^8$, where r is the common ratio of the G.P.
This gives that $r^8$ = 256 ⇒ r = 2 (taking the positive real root).
Thus $G_1 = ar$ = 1 · 2 = 2
$G_2 = ar^2$ = 1 · 4 = 4
$G_3 = ar^3$ = 1 · 8 = 8
$G_4 = ar^4$ = 1 · 16 = 16
$G_5 = ar^5$ = 1 · 32 = 32
$G_6 = ar^6$ = 1 · 64 = 64
$G_7 = ar^7$ = 1 · 128 = 128.
Hence required G.M.’s are 2, 4, 8, 16, 32, 64, 128.
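The procedure of Example 3.3 can be sketched in code for positive end values. This is an assumption-laden illustration: the function name `geometric_means` is hypothetical, only the positive real common ratio is used, and the results are floating-point values that may differ from the exact means by rounding error.

```python
def geometric_means(a, b, n):
    """Return n geometric means between positive numbers a and b."""
    r = (b / a) ** (1 / (n + 1))     # b is the (n + 2)th term, so b = a * r**(n + 1)
    return [a * r ** k for k in range(1, n + 1)]

means = geometric_means(1, 256, 7)
print([round(g, 6) for g in means])
```

Rounding is applied only for display; for 1 and 256 the seven means are the powers of 2 found in the worked solution.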
Example 3.4: Sum the series $1 + 3x + 5x^2 + 7x^3 + \cdots$ up to n terms, x ≠ 1.
Solution. Note that nth term of this series = $(2n-1)x^{n-1}$.
Let $S_n = 1 + 3x + 5x^2 + \cdots + (2n-1)x^{n-1}$.
Then $xS_n = x + 3x^2 + \cdots + (2n-3)x^{n-1} + (2n-1)x^n$.
Subtracting, we get
$S_n(1 - x) = 1 + 2x + 2x^2 + \cdots + 2x^{n-1} - (2n-1)x^n$
$= 1 + 2x \cdot \frac{1 - x^{n-1}}{1 - x} - (2n-1)x^n$
$= \frac{1 - x + 2x - 2x^n - (2n-1)x^n(1 - x)}{1 - x}$
$= \frac{1 + x - 2x^n - (2n-1)x^n + (2n-1)x^{n+1}}{1 - x}$
$= \frac{1 + x - (2n+1)x^n + (2n-1)x^{n+1}}{1 - x}$
Hence $S_n = \frac{1 + x - (2n+1)x^n + (2n-1)x^{n+1}}{(1 - x)^2}$
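The closed form obtained in Example 3.4 can be compared with direct summation for a few values of x and n. The sketch below uses exact fractions; the function names are hypothetical and chosen only for this check.

```python
from fractions import Fraction

def series_direct(x, n):
    """1 + 3x + 5x^2 + ... up to n terms, added term by term."""
    return sum((2 * k - 1) * x ** (k - 1) for k in range(1, n + 1))

def series_closed_form(x, n):
    """[1 + x - (2n + 1)x^n + (2n - 1)x^(n + 1)] / (1 - x)^2, valid for x != 1."""
    numerator = 1 + x - (2 * n + 1) * x ** n + (2 * n - 1) * x ** (n + 1)
    return numerator / (1 - x) ** 2

x = Fraction(1, 2)
print(series_direct(x, 5), series_closed_form(x, 5))
```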
Example 3.5: If in a G.P., (p + q)th term = m and (p – q)th term = n, then find its
pth and qth terms.
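Solution. Suppose that the given G.P. is a, ar, $ar^2$, $ar^3$, ...
By hypothesis, (p + q)th term = m = $ar^{p+q-1}$
(p – q)th term = n = $ar^{p-q-1}$.
Then $\frac{m}{n} = r^{2q}$ ⇒ $r = \left(\frac{m}{n}\right)^{1/(2q)}$
Hence $m = a\left(\frac{m}{n}\right)^{(p+q-1)/(2q)}$ ⇒ $a = m^{(q-p+1)/(2q)}\, n^{(p+q-1)/(2q)}$.
pth term = $ar^{p-1} = m^{(q-p+1)/(2q)}\, n^{(p+q-1)/(2q)} \left(\frac{m}{n}\right)^{(p-1)/(2q)} = m^{1/2}\, n^{1/2} = \sqrt{mn}$
qth term = $ar^{q-1} = m^{\frac{2q-p}{2q}}\, n^{\frac{p}{2q}}$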
Example 3.6: Sum the series 5 + 55 + 555 + ... up to n terms.
Solution. Let $S_n$ = 5 + 55 + 555 + ... up to n terms.
$S_n$ = 5 (1 + 11 + 111 + ...)
$= \frac{5}{9}$ (9 + 99 + 999 + ...)
$= \frac{5}{9}$ [(10 – 1) + (100 – 1) + (1000 – 1) + ...]
$= \frac{5}{9}\left[(10 + 10^2 + 10^3 + \cdots + 10^n) - (1 + 1 + \cdots \text{ n terms})\right]$
$= \frac{5}{9}\left[(10 + 10^2 + 10^3 + \cdots + 10^n) - n\right]$
$= \frac{5}{9}\left[\frac{10(1 - 10^n)}{1 - 10} - n\right]$
$= \frac{5}{9}\left[\frac{10(10^n - 1)}{9} - n\right]$
$= \frac{50}{81}(10^n - 1) - \frac{5n}{9}$.
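As a check on Example 3.6, the formula can be evaluated against the series added directly. This is a minimal sketch with hypothetical function names, using exact fractions so the comparison is exact.

```python
from fractions import Fraction

def repeated_fives_direct(n):
    """5 + 55 + 555 + ... up to n terms, built from the digit strings."""
    return sum(int("5" * k) for k in range(1, n + 1))

def repeated_fives_formula(n):
    """(50/81)(10^n - 1) - 5n/9, the closed form from the worked solution."""
    return Fraction(50, 81) * (10 ** n - 1) - Fraction(5 * n, 9)

for n in (1, 2, 3, 6):
    print(n, repeated_fives_direct(n), repeated_fives_formula(n))
```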
Example 3.7: If a, b, c, d are in G.P., prove that $a^2 - b^2$, $b^2 - c^2$ and $c^2 - d^2$ are
also in G.P.
Solution. Since $\frac{b}{a} = \frac{c}{b} = \frac{d}{c} = k$ (say)
we have b = ak, c = bk, d = ck
i.e., b = ak, c = $ak^2$, d = $ak^3$.
Now $(b^2 - c^2)^2 = (a^2k^2 - a^2k^4)^2$
$= a^4k^4(1 - k^2)^2$.
Also $(a^2 - b^2)(c^2 - d^2) = (a^2 - a^2k^2)(a^2k^4 - a^2k^6)$
$= a^4(1 - k^2)(k^4 - k^6)$
$= a^4k^4(1 - k^2)^2$
Hence $(b^2 - c^2)^2 = (a^2 - b^2)(c^2 - d^2)$.
This gives that $a^2 - b^2$, $b^2 - c^2$, $c^2 - d^2$ are in G.P.
Example 3.8: Three numbers are in G.P. Their product is 64 and sum is $\frac{124}{5}$.
Find them.
Solution. Let the numbers be $\frac{a}{r}$, a, ar.
Since $\frac{a}{r} + a + ar = \frac{124}{5}$ and $\frac{a}{r} \cdot a \cdot ar = 64$,
we have $a^3$ = 64 ⇒ a = 4.
This gives that $\frac{4}{r} + 4 + 4r = \frac{124}{5}$
⇒ $\frac{1}{r} + 1 + r = \frac{31}{5}$
⇒ $\frac{r^2 + 1}{r} = \frac{26}{5}$
⇒ $5r^2 + 5 = 26r$
⇒ r = $\frac{1}{5}$ or 5
In either case, the numbers are $\frac{4}{5}$, 4, and 20.
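Example 3.9: If a, b, c are in G.P. and $a^x = b^y = c^z$, prove that
$\frac{1}{x} + \frac{1}{z} = \frac{2}{y}$
Solution. Since a, b, c are in G.P., $b^2 = ac$
But $b^y = a^x$ ⇒ $a = b^{y/x}$
and $b^y = c^z$ ⇒ $c = b^{y/z}$
So we get $b^2 = b^{y/x} \cdot b^{y/z} = b^{y\left(\frac{1}{x} + \frac{1}{z}\right)}$
⇒ $2 = y\left(\frac{1}{x} + \frac{1}{z}\right)$
⇒ $\frac{1}{x} + \frac{1}{z} = \frac{2}{y}$.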
Example 3.10: Sum to n terms the series
.7 + .77 + .777 + . . .
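Solution. Given series
= .7 + .77 + .777 + ... up to n terms
= 7 (.1 + .11 + .111 + ... up to n terms)
$= \frac{7}{9}$ (.9 + .99 + .999 + ... up to n terms)
$= \frac{7}{9}\left[\left(1 - \frac{1}{10}\right) + \left(1 - \frac{1}{10^2}\right) + \left(1 - \frac{1}{10^3}\right) + \cdots\right]$
$= \frac{7}{9}\left[n - \left(\frac{1}{10} + \frac{1}{10^2} + \cdots \text{ up to n terms}\right)\right]$
$= \frac{7}{9}\left[n - \frac{\frac{1}{10}\left(1 - \frac{1}{10^n}\right)}{1 - \frac{1}{10}}\right]$
$= \frac{7}{9}\left[n - \frac{1}{9}\left(1 - \frac{1}{10^n}\right)\right]$.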
Example 3.11: The sum of three numbers in G.P. is 35 and their product is 1000.
Find the numbers.
$S_4$ = Sum of first four terms = $\frac{a(1 - r^4)}{1 - r}$
By hypothesis $S_8 = 5S_4$ ⇒ $\frac{a(1 - r^8)}{1 - r} = \frac{5a(1 - r^4)}{1 - r}$
⇒ $1 - r^8 = 5(1 - r^4)$
⇒ $(1 - r^4)(1 + r^4) = 5(1 - r^4)$
In case $r^4 - 1 = 0$ we get $r^2 - 1 = 0$ ⇒ r = ±1
(Note that $r^2 + 1 = 0$ ⇒ r is imaginary)
Now r = 1 ⇒ the given series is a + a + a + ...
but then $S_8 = 8a$ and $S_4 = 4a$. So $S_8 \neq 5S_4$.
In case r = –1, we get $S_8 = 0$ and $S_4 = 0$, hence the hypothesis is satisfied.
Suppose now $r^4 - 1 \neq 0$; then $1 + r^4 = 5$
⇒ $r^4 = 4$ ⇒ $r^2 = 2$ ($r^2 \neq -2$)
⇒ $r = \pm\sqrt{2}$
Hence r = –1 or $\pm\sqrt{2}$
Example 3.13: If S is the sum, P the product and R the sum of reciprocals of n
terms in G.P., prove that
$P^2 R^n = S^n$.
Solution. Let a, ar, $ar^2$, ... be the given G.P.
Then S = a + ar + $ar^2$ + ... up to n terms
$= \frac{a(1 - r^n)}{1 - r}$ ...(1)
$P = a \cdot ar \cdot ar^2 \cdots ar^{n-1}$
$= a^n r^{1 + 2 + 3 + \cdots + (n-1)}$
$= a^n r^{\frac{n-1}{2}(2 + n - 2)}$
$= a^n r^{\frac{n(n-1)}{2}}$ ...(2)
$R = \frac{1}{a} + \frac{1}{ar} + \frac{1}{ar^2} + \cdots$ up to n terms
$= \frac{\frac{1}{a}\left(1 - \frac{1}{r^n}\right)}{1 - \frac{1}{r}} = \frac{r(r^n - 1)}{a(r - 1)r^n}$
$= \frac{1 - r^n}{a(1 - r)r^{n-1}}$ ...(3)
By (2) and (3), $P^2R^n = a^{2n} r^{n(n-1)} \cdot \frac{(1 - r^n)^n}{a^n (1 - r)^n r^{n(n-1)}}$
$= \frac{a^n(1 - r^n)^n}{(1 - r)^n} = S^n$ by (1).
Example 3.14: The ratio of the 4th to the 12th term of a G.P. with positive
common ratio is $\frac{1}{256}$. If the sum of the two terms is 61.68, find the sum of series
to 8 terms.
Solution. Let the series be a, ar, $ar^2$, ...
$T_4$ = 4th term = $ar^3$
$T_{12}$ = 12th term = $ar^{11}$
By hypothesis $\frac{T_4}{T_{12}} = \frac{1}{256}$
i.e., $\frac{ar^3}{ar^{11}} = \frac{1}{256}$
$\frac{1}{r^8} = \frac{1}{256}$
⇒ $r^8 = 256$
⇒ r = ±2
Since r is given to be positive, we reject negative sign.
Again it is given that
$T_4 + T_{12}$ = 61.68
i.e., $a(r^3 + r^{11})$ = 61.68
a (8 + 2048) = 61.68
$a = \frac{61.68}{2056}$ = 0.03
Hence $S_8$ = sum to eight terms
$= \frac{a(1 - r^8)}{1 - r} = \frac{a(r^8 - 1)}{r - 1} = \frac{(.03)(256 - 1)}{(2 - 1)}$ = 7.65
= 18750 × $\frac{64}{125}$ × $\frac{16}{25}$
= 750 × $\frac{1024}{125}$
= 1024 × 6
= 6144 rupees
Example 3.16: Show that a given sum of money accumulated at 20 per cent per
annum more than doubles itself in 4 years at compound interest.
Solution. Let the given sum be a rupees. After 1 year, it becomes $\frac{6a}{5}$ (it is increased
by $\frac{a}{5}$).
At the end of two years, it becomes $\frac{6}{5} \cdot \frac{6a}{5} = \left(\frac{6}{5}\right)^2 a$.
Proceeding in this manner, we get that at the end of 4th year, the amount will be
$\left(\frac{6}{5}\right)^4 a = \frac{1296}{625}a$
Now $\frac{1296}{625}a - 2a = \frac{46}{625}a$, a +ve quantity, so the amount after 4 years is more than
double of the original amount.
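The growth factor in Example 3.16 is a power of the common ratio 6/5, which can be computed exactly. The sketch below is illustrative; the function name `growth_factor` is hypothetical.

```python
from fractions import Fraction

def growth_factor(rate_percent, years):
    """Factor by which a sum grows at compound interest, compounded annually."""
    return (1 + Fraction(rate_percent, 100)) ** years

factor = growth_factor(20, 4)
print(factor, factor > 2, factor - 2)
```

The same helper shows that three years are not enough at this rate, since the factor after three years is still below 2.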
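Example 3.17: If $x = a + \frac{a}{r} + \frac{a}{r^2} + \cdots \infty$
$y = b - \frac{b}{r} + \frac{b}{r^2} - \cdots \infty$
and $z = c + \frac{c}{r^2} + \frac{c}{r^4} + \cdots \infty$
Show that $\frac{xy}{z} = \frac{ab}{c}$
Solution. Clearly $x = \frac{a}{1 - \frac{1}{r}} = \frac{ar}{r - 1}$,
$y = \frac{b}{1 - (-1/r)} = \frac{br}{r + 1}$
and $z = \frac{c}{1 - \frac{1}{r^2}} = \frac{cr^2}{r^2 - 1}$
Now $\frac{xy}{z} = \frac{ab\,r^2}{r^2 - 1} \div \frac{cr^2}{r^2 - 1}$
$= \frac{ab}{c}$.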
Example 3.18: If $a^2 + b^2$, ab + bc and $b^2 + c^2$ are in G.P., prove that a, b, c are
also in G.P.
Solution. Since $a^2 + b^2$, ab + bc and $b^2 + c^2$ are in G.P., we get
$(ab + bc)^2 = (a^2 + b^2)(b^2 + c^2)$
⇒ $b^2(a^2 + 2ac + c^2) = a^2b^2 + a^2c^2 + b^4 + b^2c^2$
⇒ $2ab^2c = a^2c^2 + b^4$
⇒ $a^2c^2 - 2ab^2c + b^4 = 0$
⇒ $(ac - b^2)^2 = 0$
⇒ $ac = b^2$
⇒ a, b, c are in G.P.
So, $S_n = \frac{a(1 - r^n)}{1 - r} = \frac{3(1 - 3^{14})}{1 - 3}$
$= \frac{3}{2}(3^{14} - 1)$.
Example 3.20: Find the sum of first 11 terms of a G.P. given by
1, $-\frac{1}{2}$, $\frac{1}{4}$, $-\frac{1}{8}$, ...
Solution. Here a = 1, r = $-\frac{1}{2}$, n = 11.
So, $S_n = \frac{a(1 - r^n)}{1 - r} = \frac{1 - \left(-\frac{1}{2}\right)^{11}}{1 + \frac{1}{2}}$
$= \frac{2^{11} + 1}{3 \cdot 2^{10}} = \frac{683}{1024}$.
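The sum formula for a G.P. can be evaluated exactly to confirm results of this kind. The following is a sketch under stated assumptions: the function name `gp_sum` is hypothetical, and the first call assumes the series 1, –1/2, 1/4, –1/8, ... (common ratio –1/2), which is the reading consistent with the stated result 683/1024.

```python
from fractions import Fraction

def gp_sum(a, r, n):
    """Sum of the first n terms of a G.P. with first term a and common ratio r."""
    if r == 1:
        return a * n                  # the formula below does not apply when r = 1
    return a * (1 - r ** n) / (1 - r)

print(gp_sum(Fraction(1), Fraction(-1, 2), 11))
print(gp_sum(Fraction(3), Fraction(3), 14) == Fraction(3, 2) * (3 ** 14 - 1))
```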
3.4 SUMMARY
• Non-zero quantities a1, a2, a3, ..., an,...., each term of which is equal to the
product of preceding term and a constant number, form a Geometrical
Progression.
• A geometrical progression is also written as G.P.
• The constant number is termed as the common ratio of the G.P.
• Non-zero quantities whose reciprocals are in A.P. are said to be in
Harmonical Progression (H.P.)
• Geometric mean or GM is the mean or average which indicates the central
tendency or typical value of a set of numbers.
• If α, β, γ are in G.P., then β is called a geometric mean between α and
γ (written as G.M.).
1. The sum of three numbers in G.P. is 75 and their product is 1050. Find the
numbers.
2. The sum of the first eight terms of a G.P. (of real terms) is five times the sum
of the first four terms. Find the common ratio.
3. The ratio of the 4th to the 12th term of a G.P. with positive common ratio
is 1/256. If the sum of the two terms is 61.68, find the sum of series to 8
terms.
4. Evaluate the recurring decimal 17.
5. Define a geometric progression series. Use an example to support your
answer.
6. How is a geometric progression different from a harmonic progression?
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
Fundamental Principles of
Counting
BLOCK-II
PERMUTATION AND COMBINATION
This block discusses Permutation and Combination. Permutation and combination are
methods for finding the total number of possible outcomes in a given situation.
For example, taking two different digits at a time, the digits 1, 2 and 3 can be written as
12, 13, 21, 23, 31 and 32. If the digits may be repeated, there are three more
arrangements: 11, 22 and 33. These nine arrangements are therefore all the two-digit numbers
that can be formed from the three digits. The block discusses the fundamental principles of
counting, permutation and combination, matrices and determinants, differentiation, and
integration and its applications.
This block consists of five units.
The fourth unit discusses the fundamental principles of counting. The fundamental principle of
counting implies that if one event has m possible outcomes and another has n
possible outcomes, then the total number of outcomes for both events together is m × n. The unit
covers the multiplication rule of counting. It also discusses the other mathematical
operations used for counting events.
The fifth unit discusses permutation and combination. Permutation and combinations help
find the total number of possible outcomes from any event. It is in other words the several
possible ways a set or number of things can be ordered or arranged. The cases of repetition
and non-repetition are discussed in this unit.
The sixth unit explains matrices and determinants. Matrices are arrays of numbers, symbols,
or expressions, arranged in rows and columns. The various types of matrices, row, column,
square, null, diagonal, scalar, identity and triangular; along with the various operations on
matrices are also discussed in the unit. Along with matrices, determinants of order one, two,
three and four are explained with suitable examples. The properties of determinants are also
explained with examples.
The seventh unit discusses differentiation. Differentiation is the
mathematical process of obtaining the derivative of a function. Limit and continuity,
properties of continuous functions, differentiability, applications of derivatives, and
derivatives of functions multiplied by a constant are discussed in this unit.
The eighth unit lists integration and its applications. Integration is a calculus operation by
which the integral of a function is determined. There are various applications of integration,
ranging from economics to accounting and business, determination of cost functions, total
revenue functions, consumer surplus and producer surplus. Integration can be computed by
various methods, namely, indefinite integral, integration by substitution, integration of rational,
irrational and trigonometric functions. These are discussed in detail in the unit.
Objectives
After going through this unit, you will be able to:
• Discuss the fundamental principles of counting
• Explain multiplication rule
• Describe the rule of the product
• Analyse the principles of inclusion and exclusion
• Understand the basics of factorial notation
Structure
4.1 Introduction
4.2 Multiplication Rule
4.3 Addition Rule
4.4 Summary
4.5 Key Words
4.6 Answers to ‘Check Your Progress’
4.7 Self-Assessment Questions
4.8 Further Readings
4.1 INTRODUCTION
Rule of Sum: Suppose two tasks T1 and T2 cannot be performed simultaneously, and
suppose that T1 can be performed in n1 ways and T2 can be performed in n2 ways.
Then one of the two tasks T1 or T2 can be performed in n1 + n2 ways. In general, suppose a
task T1 can be performed in n1 ways, a second task T2 in n2 ways, a third task in
n3 ways, and so on, and if no two tasks can be performed simultaneously, then one
of the tasks can be performed in n1 + n2 + n3 + … ways.
In set theoretical notation, the rule of sum can be interpreted as follows for disjoint sets A and B:
n (A ∪ B) = n (A) + n (B)
Multiplication Rule
Rule of Product: Suppose a task T1 can be performed in n1 ways, and
independent of this task, a second task T2 can be performed in n2 ways. Then
these two tasks, when combined, can be performed in n1n2 ways. In general, suppose
a task T1 can be performed in n1 ways, and following T1, a second task T2 can be
performed in n2 ways, and following task T2 a third task T3 can be performed in n3
ways, and so on, then all k tasks can be performed in the sequence T1T2…Tk in
exactly n1n2 …nk different ways.
In set theoretical notation, the rule of product can be interpreted as follows:
n (A × B) = n (A) × n (B)
where n (A) and n (B) denote the number of elements in the sets A and B,
respectively.
Example 4.1: Suppose a questionnaire contains 5 questions, of which 3 questions
have 2 possible answers each and the remaining 2 questions have 3 possible answers each.
In how many ways can the questionnaire be answered?
Solution: The 3 questions with 2 possible answers can be answered in 2 × 2 × 2 ways and the remaining
2 questions can be answered in 3 × 3 ways. Hence, the total number of ways in which
the questionnaire can be answered is 2 × 2 × 2 × 3 × 3 = 72.
As another example, suppose BCA students have 3 different optional papers to select from in one semester
and 2 different optional papers to select from in another semester. According
to the rule of product, there will be 3 × 2 choices for students who want to select
one paper in each of these semesters. On the other hand, as per the rule of sum,
students will have 3 + 2 choices to select only one paper.
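The rule of product can be illustrated by listing every combination explicitly. This is a minimal sketch, assuming answers are labelled 0, 1, 2 purely for enumeration; the variable names are hypothetical.

```python
from itertools import product

# Number of possible answers for each of the 5 questions in Example 4.1.
answers_per_question = [2, 2, 2, 3, 3]
all_ways = list(product(*(range(k) for k in answers_per_question)))
print(len(all_ways))

# Optional papers: one paper in each semester (rule of product)
# versus exactly one paper overall (rule of sum).
first_semester = ["A1", "A2", "A3"]
second_semester = ["B1", "B2"]
one_in_each = list(product(first_semester, second_semester))
only_one = first_semester + second_semester
print(len(one_in_each), len(only_one))
```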
Example 4.2: A label identifier in a computer program consists of one letter followed by three digits.
If repetitions are allowed, how many different label identifiers are possible?
Solution: There are 26 letters in the English alphabet and 10 digits from 0 to 9. The letter can
be chosen in 26 ways and each of the three digits in 10 ways. Hence, the total number of
different label identifiers possible is 26 × 10 × 10 × 10 = 26,000.
Example 4.3: A football stadium has 4 gates on the South boundary and 3 gates
on the North boundary.
(i) In how many ways can a person enter through a South gate and leave by
a North gate?
(ii) In how many different ways in all can a person enter and get out through
different gates?
Solution:
(i) Since there are 4 gates on South side, the person can enter in 4 different
ways from South side into the stadium. If he wants to exit from North side,
he can do so in 3 ways because there are 3 exit gates in North side.
Hence, the total number of ways in which he can enter from South gate and
go out from a North gate is 4 × 3 = 12 ways.
(ii) Since he has the choice for entrance from any of 4 + 3 = 7 gates, there are
7 ways in which he can enter and can get out from any one of the remaining
6 gates because he cannot go out from the gate through which he had entered.
Hence, the total number of ways in which he can enter and go out is 7 × 6 = 42.
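Both parts of Example 4.3 can be verified by enumerating the gate pairs. The sketch below is an illustration only; the gate labels S1 to S4 and N1 to N3 are hypothetical names introduced for the enumeration.

```python
from itertools import permutations, product

south = [f"S{i}" for i in range(1, 5)]   # 4 gates on the South boundary
north = [f"N{i}" for i in range(1, 4)]   # 3 gates on the North boundary

# (i) enter by a South gate and leave by a North gate
south_in_north_out = list(product(south, north))

# (ii) enter by any gate and leave by a different gate
different_gates = list(permutations(south + north, 2))

print(len(south_in_north_out), len(different_gates))
```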
Example 4.4: In how many different ways, can 3 rings of a lock be combined
when each ring has 10 digits 0 to 9? If the lock opens with only one combination
of 3 digits, how many unsuccessful events are possible?
Solution: The ways in which 3 rings can be combined are 10 × 10 × 10 = 1000.
But the lock opens with only one combination of 3 digits, therefore the unsuccessful
events (attempts) will be 1000 – 1 = 999.
Example 4.5: How many 8-digit telephone numbers are possible, if
(i) Only even digits may be used?
(ii) The number must be a multiple of 100?
Solution:
(i) Taking the even digits to be 2, 4, 6 and 8 (0 is excluded here), each of the 8 places
can be filled in 4 ways by even digits to form an 8-digit number. Hence, there can be
4 × 4 × 4 × 4 × 4 × 4 × 4 × 4 = $4^8$ different numbers.
(ii) A telephone number that needs to be a multiple of 100 should have its last two
digits as zero (0). Thus, while forming such numbers by using digits 0 to 9,
the first digit can be 1 to 9 and each of the next 5 places can be filled in 10 ways. Thus
there are 9 × $10^5$ ways.
We may not have exact sequences of heads starting on the 1st and 2nd throws.
Let us write H for head and T for tails. If N1 is the number of sequences exactly
starting with 3 H then the 4th throw must be the tail (T) leaving two possibilities for
the 5th throw. Thus, N1 = 2.
In a similar way, if there is an exact sequence of 3 H starting with the 2nd toss
then this means that both the 1st and 5th tosses must be T or tail. This leads to N2
= 1. Hence, if a sequence of H starts on the 3rd throw, then the 2nd throw has to
be T while the 1st throw may be either H or T. This gives N3 = 2. Hence, V12345 =
V123 = 2 + 1 + 2 = 5.
Example 4.7: A coin is flipped 5 times. In how many ways can it be done so that
there must be a sequence of at least 3 heads in a row? If Ni be the number of
sequences of tosses with at least 3 heads beginning on the ith throw then what will
be V12345?
Solution: In this case, Ni is the number of sequences of tosses in which a run of
at least 3 heads starts on the ith throw. Here, at least 3 means
3 or more heads. As in the previous example, V12345 = V123 and V123 = N1 + N2 + N3
– N12 – N23 – N13 + N123.
In this case also, only the first three terms can be non-zero. N1 = 4, because
each of the 4th and 5th tosses can be H or T. Also, N2 = 2, because the 1st throw must be T
and the 5th throw can be either H or T. In a similar way, N3 = 2, as the 1st throw may be
H or T but the 2nd throw must be tails. Hence, V12345 = V123 = 4 + 2 + 2 = 8.
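With only 32 possible outcomes, both coin-flip counts can be confirmed by brute force. This is a sketch under the assumption that "exactly 3 heads in a row" means a maximal run of heads of length 3 and "at least 3" means a run of 3 or more; the helper `head_runs` is a hypothetical name.

```python
import re
from itertools import product

def head_runs(sequence):
    """Lengths of the maximal runs of consecutive heads in a string of H and T."""
    return [len(run) for run in re.findall(r"H+", sequence)]

sequences = ["".join(flips) for flips in product("HT", repeat=5)]
exactly_three = [s for s in sequences if 3 in head_runs(s)]
at_least_three = [s for s in sequences if "HHH" in s]

print(len(exactly_three), exactly_three)
print(len(at_least_three), at_least_three)
```

The two counts correspond to the values of V12345 obtained for the run of exactly 3 heads and for the run of at least 3 heads.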
Example 4.8: In a classroom there are 15 students. Of these, 6 study
mechanics, 9 study general science, and 9 study computer science. Also, 2 study
mechanics and general science, 3 study mechanics and computer science, and 6
study general science and computer science. One student in the class studies all
three subjects. How many of these students study none of the three subjects?
Solution: Let M, G, and C denote the sets of students who study mechanics,
general science and computer science respectively, and let U be the entire set of 15
students. Then |M| = 6, |G| = 9, and |C| = 9. Also, |MG| = |M ∩ G| = 2, |MC| = |M
∩ C| = 3, |GC| = |G ∩ C| = 6 and |MGC| = |M ∩ G ∩ C| = 1. The number of students
who study none of the three subjects is the size of the complement of M ∪ G ∪ C:
|U| – (|M| + |G| + |C| – |MG| – |MC| – |GC| + |MGC|) = 15 – (6 + 9
+ 9 – 2 – 3 – 6 + 1) = 15 – (24 – 11 + 1) = 15 – 14 = 1.
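The inclusion and exclusion computation for three sets can be written as a short function. This is an illustrative sketch; the function name `studying_none` is hypothetical, and the call uses |G ∩ C| = 6, the value that appears in the worked computation.

```python
def studying_none(total, m, g, c, mg, mc, gc, mgc):
    """Number of elements outside M, G and C, by inclusion and exclusion."""
    in_at_least_one = m + g + c - mg - mc - gc + mgc
    return total - in_at_least_one

print(studying_none(total=15, m=6, g=9, c=9, mg=2, mc=3, gc=6, mgc=1))
```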
The fundamental principle of addition says that if there are two events which cannot
occur together and which can occur in p and q ways respectively, then either of the two
events can occur in (p + q) ways.
It can also be stated as follows: if E1 and E2 are mutually exclusive events and E is the
event that either E1 or E2 will occur, then the number of ways in which event E can occur is given
as,
n(E) = n(E1) + n(E2)
where n(E1) = number of outcomes of event E1
n(E2) = number of outcomes of event E2
n(E) = number of outcomes of event E
For m events, this principle can be extended as,
n(E) = n(E1) + n(E2) + … + n(Em)
where E is the event that one of the mutually exclusive events E1, E2, …, Em will occur, and
n(E1), n(E2), …, n(Em) represent the numbers of outcomes of events E1, E2, …, Em.
Factorial notation
The product of all consecutive integers starting from 1 to t is denoted by t! (also written ⌊t in
older notation) and read as t-factorial.
Thus t! = 1 × 2 × 3 × ... × t.
In this way 1! = 1, 2! = 1 × 2 = 2, 3! = 1 × 2 × 3 = 6
4! = 1 × 2 × 3 × 4 = 24 etc.
Note that for n > 1, n! = n(n – 1)!
Now $^nP_r = n(n-1)(n-2) \cdots (n-r+1)$
$= \frac{n(n-1) \cdots (n-r+1)(n-r)(n-r-1) \cdots 3 \cdot 2 \cdot 1}{(n-r)(n-r-1) \cdots 3 \cdot 2 \cdot 1}$
$= \frac{n!}{(n-r)!}$
Convention: As a convention we take 0! equal to 1.
Example 4.9: Find the value of $^6P_4$.
Solution. $^6P_4 = \frac{6!}{(6-4)!} = \frac{6!}{2!} = \frac{6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{2 \cdot 1} = 360$.
Solution. $^nP_4 = \frac{n!}{(n-4)!}$ and $^nP_2 = \frac{n!}{(n-2)!}$
By hypothesis $\frac{n!}{(n-4)!} = 12 \cdot \frac{n!}{(n-2)!}$
⇒ 12(n – 4)! = (n – 2)!
⇒ 12(n – 4)! = (n – 2)(n – 3)(n – 4)!
⇒ 12 = $n^2$ – 5n + 6
⇒ $n^2$ – 5n – 6 = 0
⇒ (n – 6)(n + 1) = 0
⇒ n = 6 or n = –1
Since n is a positive integer, we reject the second value of n. Thus
n = 6.
Example 4.11: In how many ways can 5 passengers sit in a compartment
having 16 vacant seats?
Solution. Required number of ways = $^{16}P_5$
$= \frac{16!}{(16-5)!} = \frac{16!}{11!}$
$= \frac{16 \cdot 15 \cdot 14 \cdot 13 \cdot 12 \cdot 11!}{11!}$
= 16 · 15 · 14 · 13 · 12 = 524160.
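The formula for the number of permutations of n things taken r at a time translates directly into code. The sketch below is illustrative; the function name `n_p_r` is hypothetical, and the final search over n from 4 to 49 is an assumed range used only to confirm the equation solved above, in which the number of permutations taken 4 at a time equals 12 times the number taken 2 at a time.

```python
import math

def n_p_r(n, r):
    """Number of permutations of n different things taken r at a time: n! / (n - r)!."""
    return math.factorial(n) // math.factorial(n - r)

print(n_p_r(6, 4))
print(n_p_r(16, 5))

# Values of n for which nP4 = 12 * nP2.
solutions = [n for n in range(4, 50) if n_p_r(n, 4) == 12 * n_p_r(n, 2)]
print(solutions)
```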
4.4 SUMMARY
• Rule of sum: It is a basic counting principle which states that if
there are a ways of doing something and b ways of doing another thing, and the two
things cannot be done at the same time, then there are a + b ways of choosing one of them to do.
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Permutation and
Combination
Objectives
After going through this unit, you will be able to:
• Discuss the concepts of permutation
• Analyse ordered samples and permutations
• Differentiate between ordered samples with and without repetitions
• Understand the concept of restricted combination
Structure
5.1 Introduction
5.2 Permutation
5.3 Combination
5.4 Summary
5.5 Key Words
5.6 Answers to ‘Check Your Progress’
5.7 Self-Assessment Questions
5.8 Further Readings
5.1 INTRODUCTION
This unit will discuss permutation and combination. A permutation is an arrangement
of things in a definite order, while a combination is a selection of things in which
the order does not matter. Permutations are thus ordered samples: each of the several
possible ways in which a set or number of things can be ordered or arranged. Supposing
there are three things represented as a, b and c, then the selections taken two at a
time are the three combinations ab, ac and bc. Taken in the other order, these give
ba, ca and cb, so that there are six permutations in all. Permutation and combination
help determine the number of possible outcomes that can be derived from a set
of given things or commodities. Permutations and combinations can also be
determined for things with or without repetition. This unit
will discuss in detail the various aspects of permutation and combination.
5.2 PERMUTATION
Example 5.1: How many numbers between 1,000 and 10,000 can be formed
with the digits 1, 3, 5, 7, 9, each digit being used only once in each
number?
Solution: Each number should consist of 4 digits, and the required number is the
same as the number of permutations of 5 different things, taken 4 at a time = 5P4=120.
Example 5.2: Eleven papers are set for the engineering examination of which, two
are in Mathematics. In how many ways can the papers be arranged, if the two
Mathematics papers do not come together?
Solution: We shall find the total number of ways of arranging all the papers, if there
is no restriction and subtract from this, the number of ways in which the two
Mathematics papers come together.
The total number of ways in which all the papers can be arranged, if there is no restriction
= 11!
To find the number of ways in which Mathematics papers come together, consider
that the two papers are bound together; this can be done in 2! ways. Now the
number of ways in which the resulting 10 papers can be arranged is, 10!. Hence,
the total number of ways in which the Mathematics papers come together is 2! ×
10!. Hence the number of ways in which the Mathematics papers do not come
together is, 11! – 2! × 10! = 9 × 10!
Example 5.3: There are 35 microcomputers in a computer centre. Each
microcomputer has 18 ports. How many different ports are there in the centre?
Solution: The procedure for choosing a port consists of two jobs: first picking a
microcomputer, and then picking a port on this microcomputer. There are
35 ways to choose the microcomputer and 18 ways to choose the port, no matter
which microcomputer has been chosen. Hence, by rule 3, there are 35 × 18
= 630 ports.
Example 5.4: How many functions are there from a set with P elements to one
with Q elements?
Solution: A function corresponds to a choice of one of the Q elements in the
codomain for each of the P elements in the domain. Hence by rule 3, there are $Q^P$
functions from a set with P elements to one with Q elements.
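The count in Example 5.4 can be confirmed for small sets by listing every function explicitly. The sketch below is only an illustration; `count_functions` is a hypothetical helper name, and the sets are represented by the integers 0, 1, 2, ... for convenience.

```python
from itertools import product

def count_functions(p: int, q: int) -> int:
    """Count the functions from a p-element set to a q-element set by listing them."""
    codomain = range(q)
    # A function is one choice of image (an element of the codomain)
    # for each of the p elements of the domain.
    return sum(1 for _ in product(codomain, repeat=p))

print(count_functions(3, 4), 4 ** 3)  # both values agree
```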
Example 5.5: In how many ways can the letters of the word EDINBURGH be
arranged,
(i) With the vowels only in the odd places;
(ii) Beginning and ending with vowels;
(iii) Beginning and ending with consonants.
Solution:
(i) There are five odd places and as the three vowels should be in these places
only, they can be first, arranged in 5P3 = 60 ways.
When the vowels have been arranged in any one of these ways,
the remaining six places are to be filled up by the six consonants, and this
can be done in 6! = 720 ways. Hence, the total number of arrangements is
60 × 720 = 43,200.
(ii) The first and last places should be occupied by vowels, and this can be done
in 3P2 = 6 ways.
Further, for each of these ways the other 7 letters can be arranged in 7! = 5040
ways.
Hence, the total number of arrangements is 6 × 5040 = 30240.
(iii) The total number of ways = 6P2 × 7! = 30 × 5040 = 151200.
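Since EDINBURGH has only 9 distinct letters, the three counts of Example 5.5 can be verified by brute force over all 9! arrangements. This is a sketch for checking purposes only, with variable names chosen for illustration.

```python
from itertools import permutations

WORD = "EDINBURGH"
VOWELS = set("EIU")

counts = {"odd_places": 0, "vowel_ends": 0, "consonant_ends": 0}
for p in permutations(WORD):
    # (i) vowels only in odd places: places 2, 4, 6, 8 (indices 1, 3, 5, 7) hold consonants
    if all(p[i] not in VOWELS for i in (1, 3, 5, 7)):
        counts["odd_places"] += 1
    # (ii) beginning and ending with vowels
    if p[0] in VOWELS and p[-1] in VOWELS:
        counts["vowel_ends"] += 1
    # (iii) beginning and ending with consonants
    if p[0] not in VOWELS and p[-1] not in VOWELS:
        counts["consonant_ends"] += 1

print(counts)
```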
Example 5.6: How many numbers between 5,000 and 10,000 can be formed from
the digits 1, 2, 3, 4, 5, 6, 7, 8, 9, each digit not appearing more than once in each
number?
Solution: The first digit from the left may be 5 or 6 or 7 or 8 or 9; and so, the first
place from the left can be filled in 5 different ways; as the number should consist of
4 digits, the remaining 3 digits can be arranged in 8P3 = 336 ways. Hence, the total
number of numbers that can be formed is 5 × 336 = 1,680.
Circular Permutations
Consider n persons seated at a round table in any order, and at the same time,
consider them arranged in the same order in a line, as shown above.
The number of circular arrangements of n persons is (n – 1)!
Example 5.7: In how many ways can 6 different beads be strung together to form
a necklace?
Solution: The number of circular permutations of 6 different things is 5!
When 6 persons are seated at a round table, the two arrangements shown above
will have to be considered different; but in the case of the necklace, the second
arrangement can be obtained from the first by simply turning the necklace over,
and so the two must be considered identical.
Hence the 5! circular permutations give us only $\tfrac{1}{2}\cdot 5! = 60$ ways in which a necklace
of 6 beads can be formed.
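The two counts for circular arrangements, (n – 1)! when turning over is not allowed and half of that for a necklace, can be checked by brute force for small n. The sketch below treats two arrangements as the same if one is a rotation of the other, and optionally also if one is the mirror image of the other; the function names are illustrative assumptions.

```python
from itertools import permutations

def canonical(arr, allow_flip):
    """Smallest representative among all rotations (and, optionally, reflections)."""
    n = len(arr)
    variants = [arr[i:] + arr[:i] for i in range(n)]
    if allow_flip:
        rev = arr[::-1]
        variants += [rev[i:] + rev[:i] for i in range(n)]
    return min(variants)

def count_circular(n, allow_flip=False):
    """Count distinct circular arrangements of n different things."""
    things = tuple(range(n))
    return len({canonical(p, allow_flip) for p in permutations(things)})

print(count_circular(6))                   # persons at a round table
print(count_circular(6, allow_flip=True))  # beads on a necklace
```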
Check Your Progress - 1
1. How can different selections be obtained from the same number from a
given collection?
5.3 COMBINATIONS
sequence of numbers 1, 2, 3, which is considered one sample, and here ordering is not
important. If order is taken into account, then there are six arrangements: 123,
132, 231, 213, 312 and 321. So an unordered sample is like a set; here it is the set of three
elements 1, 2, 3. Here we will discuss those samples where repetition is not allowed, and so we
are dealing with combinations without repetition.
The number of ordered arrangements of n things taken r at a time (without repetition)
is given by n!/(n – r)!. If the same n things are selected r at a time without regard to
order, the number of selections is given by n!/{(n – r)! r!}. Thus, when the restriction
of ordering is removed, the number of arrangements is reduced by a factor of r!.
Symbolically, we represent the first case of ordered arrangement as a permutation, nPr,
and the second case as a combination, nCr, and nPr = r! nCr.
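The relation nPr = r! nCr can be illustrated by listing ordered and unordered samples side by side. The following sketch uses a hypothetical collection of four items taken two at a time.

```python
from itertools import combinations, permutations
from math import factorial

items = "abcd"  # hypothetical collection of n = 4 distinct things
r = 2

ordered = list(permutations(items, r))    # order matters: ('a', 'b') and ('b', 'a') both appear
unordered = list(combinations(items, r))  # order ignored: only ('a', 'b') appears

print(len(ordered), len(unordered), factorial(r) * len(unordered))
```

Each unordered selection of r things corresponds to r! ordered arrangements, which is why the first and third printed numbers agree.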
Here, among the n things, p things of them are of one kind; q of them are of second
kind, and so on.
(p + q + r +........ = n)
Example 5.13: How many different words can be made out of the letters which form
the word ALLAHABAD?
Solution: There are 9 letters, of which 4 letters are of one sort (A, A, A, A); 2 are
of a second sort (L, L); 1 is of a third sort (H); 1 is of a fourth sort (B); and 1 is of
a fifth sort (D).
The required number = $\dfrac{9!}{4!\,2!} = 7{,}560$.
Example 5.14: In how many ways can the letters of the word ENGINEERING
be arranged (i) without changing the order of the consonants (ii) without changing
the relative positions of the vowels and the consonants?
Solution:
(i) The consonants are required to be in the same order as in the given word,
so there can be no interchange of positions among them, and they
may be replaced by letters, say c, c, c, c, c, c. All these 6 c’s can be arranged
in one and only one way. Now, we have to find the number of permutations of
11 letters of which the 6 consonants are alike; the 3 vowels e, e, e are alike; and
the two vowels i, i are alike.
Hence, the required number = $\dfrac{11!}{6!\,3!\,2!} = 4{,}620$.
(ii) The places originally occupied by vowels must always be occupied by vowels
and those occupied by consonants, always by consonants. The vowels e,
e, e, i, i can be arranged among themselves in $\dfrac{5!}{3!\,2!} = 10$ ways and the
consonants n, n, n, g, g, r can be arranged among themselves in $\dfrac{6!}{3!\,2!} = 60$
ways.
Hence, the required number is 10 × 60 = 600 ways.
In such a case, if among the n things, p things are of one kind, q things
are of a second kind, and so on, such that p + q + r + ... = n, the number
of permutations of the n things taken all together, when the things are not all different, is given
by n! / (p!q!r!...), since the number of permutations is reduced by a factor of p!q!r!... from the
number obtained when the things were all different.
In the above example number of permutation of all the letters of the word
COMMITTEE can be given by 9! / (2!2!2!) = 45360. If these 9 letters would all
have been different then it would have been just 9! = 362880.
The following example will make the concept clear on permutations involving
indistinguishable objects.
Example 5.15: How many different letter arrangements can be formed using the
letters T E N N E S S E E ?
Solution: There are 9! possible permutations of the letters T E N N E S S E E if the
letters are distinguishable.
However, 4 E’s are indistinguishable. There are 4! ways to order the E’s.
2 S’s and 2 N’s are indistinguishable. There are 2! orderings of each.
Once all letters are ordered, there is only one place for the T.
If the E’s, N’s and S’s are indistinguishable among themselves, then there are 9!/
(4!.2!.2!) = 3,780 different orderings of T E N N E S S E E.
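The formula n!/(p!q!r!...) can be evaluated mechanically for any word by counting how often each letter occurs. The following is a small sketch; `distinct_arrangements` is an illustrative name, not one used in the text.

```python
from collections import Counter
from math import factorial

def distinct_arrangements(word: str) -> int:
    """Number of distinct arrangements of the letters of word: n! / (p! q! r! ...)."""
    total = factorial(len(word))
    for count in Counter(word).values():
        # Divide out the orderings of each group of identical letters.
        total //= factorial(count)
    return total

for w in ("COMMITTEE", "TENNESSEE", "ALLAHABAD"):
    print(w, distinct_arrangements(w))
```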
Restricted Combinations
We know that combination presents a way of choosing elements from a set, where
order does not matter. When additional restrictions are added, it is called restricted
combinations.
Case 1: When p particular things are always to be included
Number of combinations of n distinct things taking r at a time, when p particular
things are always to be included in each selection, is $^{(n-p)}C_{(r-p)}$.
Case 5: Number of ways of selecting one or more things from ‘n’ different things
is given by $2^n - 1$.
Proof: Number of ways of selecting one thing out of n things = nC1
Number of ways of selecting two things out of n things = nC2
Number of ways of selecting three things out of n things = nC3
...
Number of ways of selecting ‘n’ things out of ‘n’ things = nCn
→ Total number of ways of selecting one or more things out of n different things
$= {}^{n}C_{1} + {}^{n}C_{2} + {}^{n}C_{3} + \cdots + {}^{n}C_{n}$
$= ({}^{n}C_{0} + {}^{n}C_{1} + \cdots + {}^{n}C_{n}) - {}^{n}C_{0} = 2^n - 1$
Case 6: Number of ways of selecting zero or more things from ‘n’ identical things
is given by n + 1.
Example 5.16: In how many ways can a cricket eleven be chosen out of 15
players, if
(i) A particular player is always chosen.
(ii) A particular player is never chosen.
Solution:
(i) A particular player is always chosen; it means that 10 players are selected
out of the remaining 14 players.
→ Required number of ways = 14C10 = 14C4 [since nCr = nC(n–r)]
= 14!/(4! × 10!) = 1001
(ii) A particular player is never chosen; it means that 11 players are selected
out of 14 players.
→ Required number of ways = 14C11
= 14!/(11! × 3!) = 364
Example 5.17: Kamal has 8 friends. In how many ways can he invite one or more
of them to dinner?
Solution. Kamal can select one or more than one of his 8 friends.
→ Required number of ways = $2^8 - 1$ = 256 – 1 = 255.
Example 5.18: In how many ways can zero or more letters be selected from the
letters AAAAA?
Solution. Number of ways of:
Selecting zero ‘A’s = 1
Selecting one ‘A’ = 1
Selecting two ‘A’s = 1
Selecting three ‘A’s = 1
Selecting four ‘A’s = 1
Selecting five ‘A’s = 1
Required number of ways = 5 + 1 = 6.
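The difference between selecting from different things and selecting from identical things can be seen by enumerating the selections. The sketch below lists the non-empty selections from 8 different friends (labelled A to H purely for illustration) and the distinct selections of zero or more letters from AAAAA.

```python
from itertools import combinations

def selections(items, min_size):
    """All selections of at least min_size items, as tuples."""
    return [c for r in range(min_size, len(items) + 1)
            for c in combinations(items, r)]

friends = "ABCDEFGH"  # 8 different friends (labels are hypothetical)
distinct_count = len(selections(friends, 1))

# With identical letters, selections of the same size are indistinguishable,
# so duplicates collapse when collected into a set.
identical_count = len(set(selections("AAAAA", 0)))

print(distinct_count, identical_count)
```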
n
= p pp
1 2 r
n r
= p pp
1 2 r
5.4 SUMMARY
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
Matrices and Determinants
Objectives
After going through this unit, you will be able to:
• Describe matrices and determinants
• Analyse the various types of matrices
• Assess the operations of matrices
• Describe minors and cofactors of determinants
• Understand scalar multiplication of matrix
Structure
6.1 Introduction
6.2 Matrix
6.3 Subtraction of Matrix and System of Linear Equations
6.4 Summary
6.5 Key Words
6.6 Answers to ‘Check Your Progress’
6.7 Self-Assessment Questions
6.8 Further Readings
6.1 INTRODUCTION
This unit will discuss matrices and determinants. A matrix (plural matrices) is a
rectangular array of numbers, symbols, or expressions, arranged in rows and
columns. A determinant, however, refers to a quantity obtained by the addition of
signed products of the elements of a square matrix. Matrices are classified as row, column,
square, null, diagonal, scalar, identity and triangular. Two matrices may or may not be
equal: matrices A and B are equal when they are of the same order and their
corresponding elements are the same. The basic operations of addition, subtraction
and multiplication can also be applied on matrices. Another operation that can be
applied to a matrix is transpose. This unit discusses the operations of matrices in
detail.
6.2 MATRIX
The aspects, types and characteristics of matrix are discussed here below in detail.
What is a Matrix?
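Let n, m be two integers ≥ 1. An array of elements of the type is as follows:
$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$
This is called a matrix. We denote this matrix by (aij), i = 1, ..., m and j = 1, ..., n. We
say that it is an m × n matrix (or matrix of order m × n). It has m rows and n columns.
For example, the first row is (a11 a12 ... a1n) and the first column is
$\begin{pmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{pmatrix}$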
Also, aij denotes the element of the matrix (aij) lying in ith row and jth column and we
call this element as the (i, j)th element of the matrix.
For example, in the matrix
$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$
a11 = 1, a12 = 2, a32 = 8, i.e.,
the (1, 1)th element is 1, the (1, 2)th element is 2 and the (3, 2)th element is 8, respectively.
Notes: 1. Matrices are a key tool in linear algebra.
2. A matrix is simply an arrangement of elements and has no numerical
value.
Example 6.1: If $A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ 0 & 1 & 2 \end{pmatrix}$, find a11, a22, a33, a31, a41.
Types of Matrices
1. Row Matrix. A matrix which has exactly one row is called a row matrix.
For example, (1 2 3 4) is a row matrix.
2. Column Matrix. A matrix which has exactly one column is called a
column matrix.
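For example, $\begin{pmatrix} 5 \\ 6 \\ 7 \end{pmatrix}$ is a column matrix.
For example, $\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ is a 2 × 2 square matrix.
For example, $\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ is a 2 × 3 null matrix.
square matrix (aij). For example, in the matrix
$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$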
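A square matrix whose every element other than the diagonal elements is zero
is called a diagonal matrix. For example, $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}$ is a diagonal matrix.
Note that the diagonal elements in a diagonal matrix may also be zero. For
example, $\begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix}$ and $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ are also diagonal matrices.
called a scalar matrix. For example, $\begin{pmatrix} 5 & 0 \\ 0 & 5 \end{pmatrix}$, $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$, $\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ are
scalar matrices.
7. Identity Matrix. A diagonal matrix whose diagonal elements are all equal
to 1 (unity) is called an identity matrix (or unit matrix). For example,
$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ is an identity matrix.
$\begin{pmatrix} 1 & 0 & 0 \\ 4 & 5 & 0 \\ 7 & 8 & 9 \end{pmatrix}$, $\begin{pmatrix} 1 & 0 \\ 2 & 0 \end{pmatrix}$ are lower triangular matrices
and $\begin{pmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 0 & 0 & 6 \end{pmatrix}$, $\begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix}$ are upper triangular matrices.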
Algebra of Matrices
Equality
Two matrices A and B are said to be equal if,
(i) A and B are of same order.
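(ii) Corresponding elements in A and B are same. For example, the following
two matrices are equal.
$\begin{pmatrix} 3 & 4 & 9 \\ 16 & 25 & 64 \end{pmatrix} = \begin{pmatrix} 3 & 4 & 9 \\ 16 & 25 & 64 \end{pmatrix}$
But the following two matrices are not equal.
$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \neq \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$
As the matrix on the left is of order 2 × 3, while the one on the right is of order 3 × 3.
The following two matrices are also not equal.
$\begin{pmatrix} 1 & 2 & 3 \\ 7 & 8 & 9 \end{pmatrix} \neq \begin{pmatrix} 1 & 2 & 3 \\ 4 & 8 & 9 \end{pmatrix}$
As the (2, 1)th element in the LHS matrix is 7 while in the RHS matrix it is 4.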
Operations on Matrices
The following operations can be performed on matrices.
Addition of Matrices
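If A and B are two matrices of the same order then the addition of A and B is defined to
be the matrix obtained by adding the corresponding elements of A and B.
For example, if
$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$, $B = \begin{pmatrix} 2 & 3 & 4 \\ 5 & 6 & 7 \end{pmatrix}$
Then, $A + B = \begin{pmatrix} 1+2 & 2+3 & 3+4 \\ 4+5 & 5+6 & 6+7 \end{pmatrix} = \begin{pmatrix} 3 & 5 & 7 \\ 9 & 11 & 13 \end{pmatrix}$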
Note that addition (or subtraction) of two matrices is defined only when A and
B are of the same order.
Properties of Matrix Addition
(i) Matrix addition is commutative.
i.e., A+B=B+A
(i, j)th element of A + B is (aij + bij) and of B + A is (bij + aij), and they are
same as, aij and bij are real numbers.
(ii) Matrix addition is associative,
i.e., A + (B + C) = (A + B) + C
For, (i, j)th element of A + (B + C) is aij + (bij + cij) and of (A + B) + C is
(aij + bij) + cij which are same.
(iii) If O denotes null matrix of the same order as that of A then,
A+O=A=O+A
(i, j)th element of A + O is aij + 0 which is same as (i, j)th element
of A.
(iv) To each matrix A there corresponds a matrix B such that,
A + B = O = B + A.
Let (i, j)th element of B be – aij. Then (i, j)th element of A + B is,
aij – aij = 0.
Thus, the set of m × n matrices forms an abelian group under the composition of
matrix addition.
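Example 6.2: If $A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$ and $B = \begin{pmatrix} 0 & 1 & 2 \\ 3 & 4 & 5 \end{pmatrix}$,
verify A + B = B + A.
Solution: $A + B = \begin{pmatrix} 1+0 & 2+1 & 3+2 \\ 4+3 & 5+4 & 6+5 \end{pmatrix} = \begin{pmatrix} 1 & 3 & 5 \\ 7 & 9 & 11 \end{pmatrix}$
$B + A = \begin{pmatrix} 0+1 & 1+2 & 2+3 \\ 3+4 & 4+5 & 5+6 \end{pmatrix} = \begin{pmatrix} 1 & 3 & 5 \\ 7 & 9 & 11 \end{pmatrix}$
So, A + B = B + A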
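Element-wise addition is easy to express in code. The following is a minimal sketch that represents a matrix as a list of rows; the function name `mat_add` is an illustrative assumption, and the order check mirrors the requirement that addition is defined only for matrices of the same order.

```python
def mat_add(a, b):
    """Add two matrices (lists of rows) of the same order, element by element."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must be of the same order")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

A = [[1, 2, 3], [4, 5, 6]]
B = [[0, 1, 2], [3, 4, 5]]

print(mat_add(A, B))
print(mat_add(A, B) == mat_add(B, A))  # commutativity for this pair
```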
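Example 6.3: If A and B are matrices as in Example 6.2
and $C = \begin{pmatrix} -1 & 0 & 1 \\ 1 & 2 & 3 \end{pmatrix}$, verify (A + B) + C = A + (B + C).
Solution: Now $A + B = \begin{pmatrix} 1 & 3 & 5 \\ 7 & 9 & 11 \end{pmatrix}$
So, $(A + B) + C = \begin{pmatrix} 1-1 & 3+0 & 5+1 \\ 7+1 & 9+2 & 11+3 \end{pmatrix} = \begin{pmatrix} 0 & 3 & 6 \\ 8 & 11 & 14 \end{pmatrix}$
Again, $B + C = \begin{pmatrix} 0-1 & 1+0 & 2+1 \\ 3+1 & 4+2 & 5+3 \end{pmatrix} = \begin{pmatrix} -1 & 1 & 3 \\ 4 & 6 & 8 \end{pmatrix}$
So, $A + (B + C) = \begin{pmatrix} 1-1 & 2+1 & 3+3 \\ 4+4 & 5+6 & 6+8 \end{pmatrix} = \begin{pmatrix} 0 & 3 & 6 \\ 8 & 11 & 14 \end{pmatrix}$
Therefore, (A + B) + C = A + (B + C)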
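Example 6.4: If $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix}$, find a matrix B such that A + B = 0.
Solution: Let $B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{pmatrix}$
Then, $A + B = \begin{pmatrix} 1+b_{11} & 2+b_{12} \\ 3+b_{21} & 4+b_{22} \\ 5+b_{31} & 6+b_{32} \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}$
It implies b11 = –1, b12 = –2, b21 = –3, b22 = –4,
b31 = –5, b32 = –6
Therefore the required $B = \begin{pmatrix} -1 & -2 \\ -3 & -4 \\ -5 & -6 \end{pmatrix}$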
Multiplication of Matrices
The product AB of two matrices A and B is defined only when the number of columns
of A is same as the number of rows in B, and by definition the product AB is a matrix G
of order m × p if A and B were of order m × n and n × p, respectively. The following
example will give the rule to multiply two matrices:
Let $A = \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \end{pmatrix}$, $B = \begin{pmatrix} d_1 & e_1 \\ d_2 & e_2 \\ d_3 & e_3 \end{pmatrix}$
Order of A = 2 × 3, Order of B = 3 × 2
Then $AB = G = \begin{pmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{pmatrix}$
g11 : Multiply elements of the first row of A with corresponding elements of the first
column of B and add.
g12 : Multiply elements of the first row of A with corresponding elements of the second
column of B and add.
g21 : Multiply elements of the second row of A with corresponding elements of the
first column of B and add.
g22 : Multiply elements of the second row of A with corresponding elements of
the second column of B and add.
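The row-by-column rule described above can be sketched in a few lines of Python. The function name `mat_mul` and the numerical matrices are illustrative assumptions; a matrix is represented as a list of rows.

```python
def mat_mul(a, b):
    """Multiply an m x n matrix a by an n x p matrix b (both lists of rows)."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("columns of A must equal rows of B")
    cols_b = list(zip(*b))  # the columns of B
    # Entry (i, j): multiply row i of A with column j of B term by term and add.
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

A = [[1, 2, 3], [4, 5, 6]]    # order 2 x 3
B = [[1, 0], [0, 1], [1, 1]]  # order 3 x 2

print(mat_mul(A, B))  # a 2 x 2 matrix
print(mat_mul(B, A))  # a 3 x 3 matrix, so AB and BA need not even have the same order
```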
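Notes: 1. In general, if A and B are two matrices then AB may not be equal to BA.
For example, if
$A = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ then $AB = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$
and $BA = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$. So, AB ≠ BA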
$= \begin{pmatrix} 16 & 3 \\ 6 & 9 \end{pmatrix}$
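Example 6.6: Verify the associative law A(BC) = (AB)C for the following matrices.
$A = \begin{pmatrix} -1 & 0 \\ 7 & -2 \end{pmatrix}$, $B = \begin{pmatrix} -1 & 5 \\ 7 & 0 \end{pmatrix}$, $C = \begin{pmatrix} -1 & -1 \\ 2 & 0 \end{pmatrix}$
Solution: $AB = \begin{pmatrix} -1 & 0 \\ 7 & -2 \end{pmatrix}\begin{pmatrix} -1 & 5 \\ 7 & 0 \end{pmatrix} = \begin{pmatrix} 1+0 & -5+0 \\ -7-14 & 35+0 \end{pmatrix} = \begin{pmatrix} 1 & -5 \\ -21 & 35 \end{pmatrix}$
$BC = \begin{pmatrix} -1 & 5 \\ 7 & 0 \end{pmatrix}\begin{pmatrix} -1 & -1 \\ 2 & 0 \end{pmatrix} = \begin{pmatrix} 1+10 & 1+0 \\ -7+0 & -7+0 \end{pmatrix} = \begin{pmatrix} 11 & 1 \\ -7 & -7 \end{pmatrix}$
$A(BC) = \begin{pmatrix} -1 & 0 \\ 7 & -2 \end{pmatrix}\begin{pmatrix} 11 & 1 \\ -7 & -7 \end{pmatrix} = \begin{pmatrix} -11+0 & -1+0 \\ 77+14 & 7+14 \end{pmatrix} = \begin{pmatrix} -11 & -1 \\ 91 & 21 \end{pmatrix}$
$(AB)C = \begin{pmatrix} 1 & -5 \\ -21 & 35 \end{pmatrix}\begin{pmatrix} -1 & -1 \\ 2 & 0 \end{pmatrix} = \begin{pmatrix} -1-10 & -1+0 \\ 21+70 & 21+0 \end{pmatrix} = \begin{pmatrix} -11 & -1 \\ 91 & 21 \end{pmatrix}$
Therefore, A(BC) = (AB)C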
Solution: $A^2 = \begin{pmatrix} 1 & 0 \\ 3 & 4 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 3 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 15 & 16 \end{pmatrix}$
(Similarly, we can define A³, A⁴, A⁵, ... for any square matrix A.)
$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$ and k = 2
Then, $kA = \begin{pmatrix} 2 & 4 & 6 \\ 8 & 10 & 12 \end{pmatrix}$
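Example 6.8: (i) If $A = \begin{pmatrix} 0 & 1 & 2 \\ 2 & 3 & 4 \\ 4 & 5 & 6 \end{pmatrix}$ and k1 = i, k2 = 2, verify (k1 + k2)A = k1A + k2A.
Solution: (i) Now $k_1A = \begin{pmatrix} 0 & i & 2i \\ 2i & 3i & 4i \\ 4i & 5i & 6i \end{pmatrix}$ and $k_2A = \begin{pmatrix} 0 & 2 & 4 \\ 4 & 6 & 8 \\ 8 & 10 & 12 \end{pmatrix}$
So, $k_1A + k_2A = \begin{pmatrix} 0 & 2+i & 4+2i \\ 4+2i & 6+3i & 8+4i \\ 8+4i & 10+5i & 12+6i \end{pmatrix}$
Also, $(k_1 + k_2)A = \begin{pmatrix} 0 & 2+i & 4+2i \\ 4+2i & 6+3i & 8+4i \\ 8+4i & 10+5i & 12+6i \end{pmatrix}$
Therefore, (k1 + k2)A = k1A + k2A
(ii) $2A = \begin{pmatrix} 0 & 4 & 6 \\ 4 & 2 & 8 \end{pmatrix}$
$3B = \begin{pmatrix} 21 & 18 & 9 \\ 3 & 12 & 15 \end{pmatrix}$
So, $2A + 3B = \begin{pmatrix} 21 & 22 & 15 \\ 7 & 14 & 23 \end{pmatrix}$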
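Example 6.9: If $A = \begin{pmatrix} 1 & 2 \\ -3 & 0 \end{pmatrix}$, find A² + 3A + 5I where I is the unit matrix of order 2.
Solution: $A^2 = \begin{pmatrix} 1 & 2 \\ -3 & 0 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ -3 & 0 \end{pmatrix} = \begin{pmatrix} -5 & 2 \\ -3 & -6 \end{pmatrix}$
$3A = \begin{pmatrix} 3 & 6 \\ -9 & 0 \end{pmatrix}$
$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, $5I = \begin{pmatrix} 5 & 0 \\ 0 & 5 \end{pmatrix}$
So, $A^2 + 3A + 5I = \begin{pmatrix} -5 & 2 \\ -3 & -6 \end{pmatrix} + \begin{pmatrix} 3 & 6 \\ -9 & 0 \end{pmatrix} + \begin{pmatrix} 5 & 0 \\ 0 & 5 \end{pmatrix} = \begin{pmatrix} 3 & 8 \\ -12 & -1 \end{pmatrix}$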
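Example 6.10: If $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$, show that AB = – BA and A² = B² = I.
Solution: Now, $AB = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$
$BA = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}$
So, AB = – BA
Also, $A^2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I$
$B^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I$
This proves the result.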
Example 6.11: In an examination of Mathematics, 20 students from college A, 30
students from college B and 40 students from college C appeared. Only 15 students
from each college could get through the examination. Out of them 10 students from
college A and 5 students from college B and 10 students from college C secured full
marks. Write down the above data in matrix form.
Solution: Consider the matrix
$\begin{pmatrix} 20 & 30 & 40 \\ 15 & 15 & 15 \\ 10 & 5 & 10 \end{pmatrix}$
First row represents the number of students in college A, college B and college C
respectively.
Second row represents the number of students who got through the examination
in three colleges respectively.
Third row represents the number of students who got full marks in the three colleges
respectively.
Example 6.12: A publishing house has two branches. In each branch, there are three
offices. In each office, there are 3 peons, 4 clerks and 5 typists. In one office of a
branch, 6 salesmen are also working. In each office of other branch 2 head clerks are
also working. Using matrix notation find (i) the total number of posts of each kind in all
the offices taken together in each branch, (ii) the total number of posts of each kind in
all the offices taken together from both the branches.
Solution: (i) Consider the following row matrices,
A1 = (3 4 5 6 0), A2 = (3 4 5 0 0), A3 = (3 4 5 0 0)
These matrices represent the three offices of the branch (say A) where elements
appearing in the row represent the number of peons, clerks, typists, salesmen and
head clerks taken in that order working in the three offices.
Then, A1 + A2 + A3 = (3 + 3 + 3 4 + 4 + 4 5 + 5 + 5 6 + 0 + 0 0 + 0 + 0)
= (9 12 15 6 0)
Thus, total number of posts of each kind in all the offices of branch A are the
elements of matrix A1 + A2 + A3 = (9 12 15 6 0)
Now consider the following row matrices,
B1 = (3 4 5 0 2), B2 = (3 4 5 0 2), B3 = (3 4 5 0 2)
Then B1, B2, B3 represent three offices of other branch (say B) where the elements
in the row represent number of peons, clerks, typists, salesmen and head clerks
respectively.
Thus, total number of posts of each kind in all the offices of branch B are the
elements of the matrix B1 + B2 + B3 = (9 12 15 0 6)
(ii) The total number of posts of each kind in all the offices taken together from
both branches are the elements of matrix,
(A1 + A2 + A3) + (B1 + B2 + B3) = (18 24 30 6 6)
Solution: $5A = \begin{pmatrix} 50 & 100 \\ 150 & 200 \end{pmatrix}$
It represents the number of table fans and ceiling fans that the manufacturing units
A and B produce in five days.
Example 6.14: Let $A = \begin{pmatrix} 2 & 3 & 4 & 5 \\ 3 & 4 & 5 & 6 \\ 4 & 5 & 6 & 7 \end{pmatrix}$, where the rows represent the number of items of
type I, II, III, respectively. The four columns represent the four shops A1, A2, A3, A4
respectively.
Let $B = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 2 & 3 \\ 3 & 2 & 1 & 2 \end{pmatrix}$, $C = \begin{pmatrix} 1 & 2 & 2 & 3 \\ 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 4 \end{pmatrix}$
where the elements in B represent the number of items of different types delivered at the
beginning of a week and matrix C represents the sales during that week. Find,
(i) The number of items immediately after delivery of items.
(ii) The number of items at the end of the week.
(iii) The number of items needed to bring stocks of all items in all shops to 6.
Solution: (i) $A + B = \begin{pmatrix} 3 & 5 & 7 & 9 \\ 5 & 5 & 7 & 9 \\ 7 & 7 & 7 & 9 \end{pmatrix}$
represents the number of items immediately after delivery of items.
(ii) $(A + B) - C = \begin{pmatrix} 2 & 3 & 5 & 6 \\ 4 & 3 & 4 & 5 \\ 5 & 4 & 3 & 5 \end{pmatrix}$
represents the number of items at the end of the week.
(iii) We want that all elements in (A + B) – C should be 6.
Let $D = \begin{pmatrix} 4 & 3 & 1 & 0 \\ 2 & 3 & 2 & 1 \\ 1 & 2 & 3 & 1 \end{pmatrix}$
Then (A + B) – C + D is a matrix in which all elements are 6. So, D represents the
number of items needed to bring stocks of all items of all shops to 6.
Example 6.15: The following matrix represents the results of the examination of
B. Com. class:
$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \end{pmatrix}$
The rows represent the three sections of the class. The first three columns represent
the number of students securing 1st, 2nd, 3rd divisions respectively in that order and
fourth column represents the number of students who failed in the examination.
(i) How many students passed in three sections respectively?
(ii) How many students failed in three sections respectively?
(iii) Write down the matrix in which number of successful students is shown.
(iv) Write down the column matrix where only failed students are shown.
(v) Write down the column matrix showing students in 1st division from three
sections.
Solution: (i) The number of students who passed in three sections respectively are
1 + 2 + 3 = 6, 5 + 6 + 7 = 18, 9 + 10 + 11 = 30.
(ii) The number of students who failed from three sections respectively are 4, 8,
12.
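(iii) $\begin{pmatrix} 1 & 2 & 3 \\ 5 & 6 & 7 \\ 9 & 10 & 11 \end{pmatrix}$
(iv) $\begin{pmatrix} 4 \\ 8 \\ 12 \end{pmatrix}$ represents the column matrix where only failed students are shown.
(v) $\begin{pmatrix} 1 \\ 5 \\ 9 \end{pmatrix}$ represents the column matrix of students securing 1st division.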
Transpose of a Matrix
Let A be a matrix. The matrix obtained from A by interchange of its rows and columns,
is called the transpose of A. For example,
If $A = \begin{pmatrix} 1 & 0 & 2 \\ 2 & 1 & 0 \end{pmatrix}$ then transpose of $A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \\ 2 & 0 \end{pmatrix}$
Transpose of A is denoted by A′.
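Interchanging rows and columns can be sketched very compactly in Python, where `zip(*m)` yields the columns of a matrix stored as a list of rows. The function name `transpose` is an illustrative choice.

```python
def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    # zip(*m) groups the first entries of all rows, then the second entries, and so on,
    # so each column of m becomes a row of the result.
    return [list(col) for col in zip(*m)]

A = [[1, 0, 2],
     [2, 1, 0]]

print(transpose(A))
print(transpose(transpose(A)) == A)  # transposing twice gives back A
```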
Again, $A + B = \begin{pmatrix} 3 & 5 & 7 \\ 5 & 13 & 12 \end{pmatrix}$
So, $(A + B)' = \begin{pmatrix} 3 & 5 \\ 5 & 13 \\ 7 & 12 \end{pmatrix}$
Therefore, (A + B)′ = A′ + B′
Square Matrix
A square matrix is a matrix which has the same number of rows and columns. An
n-by-n matrix is known as a square matrix of order n.
Symmetric Matrices
A square matrix A such that A' = A is called a symmetric matrix.
A square matrix A such that A' = – A is called skew symmetric. Its leading diagonal has all
zeros. For example,
$\begin{pmatrix} 0 & -1 & 2 \\ 1 & 0 & 3 \\ -2 & -3 & 0 \end{pmatrix}$ is skew symmetric.
Note: If A is a square matrix then, (i) A + A' is symmetric, (ii) A – A' is skew symmetric,
(iii) The square matrix A is the sum of the symmetric matrix $\dfrac{A + A'}{2}$ and the skew
symmetric matrix $\dfrac{A - A'}{2}$. These results can be easily proved.
If A, B are square and AB = BA, then A, B are commutative. If AB = – BA, then A, B
are anti-commutative.
If A2 = A, then A is called idempotent.
Determinant of a Matrix
A square matrix A has a uniquely defined determinant | A | associated with the matrix.
The determinant of
$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ is $|A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}$
The determinant of the product of two matrices is the product of their determinants.
| AB | = | A | | B |
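The product rule for determinants can be checked numerically for matrices of order 2. In the sketch below, the two matrices P and Q are hypothetical values chosen only for illustration, and the helper names are not from the text.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix: a11*a22 - a12*a21."""
    (a11, a12), (a21, a22) = m
    return a11 * a22 - a12 * a21

def mul2(a, b):
    """Product of two 2 x 2 matrices by the row-by-column rule."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

P = [[2, 1], [5, 3]]   # hypothetical example matrices
Q = [[1, 4], [2, -1]]

print(det2(mul2(P, Q)), det2(P) * det2(Q))  # the two values agree
```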
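Students should verify the above result for,
(i) $A = \begin{pmatrix} 1 & 1 & 1 \\ 4 & -1 & 1 \\ 0 & 1 & -1 \end{pmatrix}$, $B = \begin{pmatrix} 2 & 2 & 1 \\ 1 & 0 & 2 \\ 2 & 1 & 2 \end{pmatrix}$
(ii) $A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$, $B = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}$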
Example 6.17: Is the square matrix $A = \begin{pmatrix} 12 & 3 \\ 20 & 5 \end{pmatrix}$ singular or non-singular?
Solution: $A = \begin{pmatrix} 12 & 3 \\ 20 & 5 \end{pmatrix}$ is singular because | A | = 60 – 60 = 0.
20 5
150
Matrices and Determinants
Adjoint Matrix
The adjoint matrix of A is obtained by replacing the elements of A by their respective
cofactors and then transposing.
If A = [aij] and B = [Aij] where Aij is the cofactor of aij in A then we have the adjoint
matrix of A, written as
adj A = [Aij]' = [Aji]
For example, if $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ then det A = 4 – 6 = – 2
Suppose $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ is invertible.
Let $B = \begin{pmatrix} x & y \\ z & w \end{pmatrix}$, where x, y, z, w are complex numbers such that AB = I = BA
The above identity implies,
a11x + a12z = 1, a11 y + a12w = 0
a21x + a22z = 0, a21 y + a22w = 1
which in turn implies
∆x = a22, ∆y = – a12
∆z = – a21, ∆w = a11
where ∆ = a11a22 – a12a21
Clearly ∆ ≠ 0, for otherwise x, y, z, w will be indeterminate. This means that det
A ≠ 0. Conversely, if A is a square matrix of order 2 such that det A ≠ 0, then A is
invertible as,
$x = \dfrac{a_{22}}{\Delta}$, $y = -\dfrac{a_{12}}{\Delta}$, $z = -\dfrac{a_{21}}{\Delta}$, $w = \dfrac{a_{11}}{\Delta}$
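The formulas for x, y, z, w give a direct recipe for inverting a matrix of order 2 whenever ∆ ≠ 0. The following sketch uses exact fractions so that no rounding occurs; the function name `inverse_2x2` is an illustrative assumption.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix using x = a22/D, y = -a12/D, z = -a21/D, w = a11/D."""
    (a11, a12), (a21, a22) = m
    delta = a11 * a22 - a12 * a21
    if delta == 0:
        raise ValueError("matrix is singular (det A = 0), so it has no inverse")
    return [[Fraction(a22, delta), Fraction(-a12, delta)],
            [Fraction(-a21, delta), Fraction(a11, delta)]]

A = [[1, 2], [3, 4]]  # det A = -2, as computed in the text
print(inverse_2x2(A))
```

Calling the function on a matrix whose determinant is zero raises an error, in line with the observation that ∆ ≠ 0 is needed for invertibility.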
Then we define,
det A = a11(a22a33 – a32a23) – a12(a21a33 – a31a23) + a13(a21a32 – a31a22)
The above definition may be explained as follows:
The first bracket is determinant of matrix obtained after removing first row and
first column.
The second bracket is determinant of matrix obtained after removing first row and
second column.
The third bracket is determinant of matrix obtained after removing first row and
third column.
The elements before three brackets are first, second, third element respectively of
first row with alternate positive and negative signs.
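For example, let $A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$
To find det A.
The first bracket in the definition of det A is the determinant of
$\begin{pmatrix} 5 & 6 \\ 8 & 9 \end{pmatrix}$ = 45 – 48 = – 3
The second bracket is the determinant of
$\begin{pmatrix} 4 & 6 \\ 7 & 9 \end{pmatrix}$ = 36 – 42 = – 6
The third bracket is the determinant of
$\begin{pmatrix} 4 & 5 \\ 7 & 8 \end{pmatrix}$ = 32 – 35 = – 3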
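The expansion along the first row can be written as a short routine: remove the first row and one column to obtain each bracket, then combine the brackets with alternating signs. This is a sketch with illustrative helper names, not a method prescribed by the text.

```python
def minor(m, col):
    """2 x 2 matrix left after removing the first row and the given column of a 3 x 3 matrix."""
    return [[row[j] for j in range(3) if j != col] for row in m[1:]]

def det2(m):
    (a, b), (c, d) = m
    return a * d - b * c

def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    brackets = [det2(minor(m, j)) for j in range(3)]
    # First-row elements multiply the brackets with alternating + and - signs.
    return m[0][0] * brackets[0] - m[0][1] * brackets[1] + m[0][2] * brackets[2]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print([det2(minor(A, j)) for j in range(3)])  # the three brackets
print(det3(A))
```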
Note: A determinant of order 2, $\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}$, can also be obtained when we eliminate x, y
Properties of Determinants
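We list below some important properties of determinants.
1. If two rows (or columns) are interchanged in a determinant it retains its absolute
value but changes its sign.
i.e., $\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = -\begin{vmatrix} b_1 & b_2 & b_3 \\ a_1 & a_2 & a_3 \\ c_1 & c_2 & c_3 \end{vmatrix}$
2. If rows are changed into columns and columns into rows the determinant remains
unchanged.
i.e., $\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}$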
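5. If to any row (or column) is added k times the corresponding elements of another
row (or column), the determinant remains unchanged.
i.e., $\begin{vmatrix} a_1 + kb_1 & a_2 + kb_2 & a_3 + kb_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}$
6. If each element of any row (or column) is the sum of two or more terms, then the determinant
can be expressed as the sum of two or more determinants.
i.e., $\begin{vmatrix} a_1 + k_1 & a_2 + k_2 & a_3 + k_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} + \begin{vmatrix} k_1 & k_2 & k_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}$
i.e., $\begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{vmatrix}$ has (a – b) as one of its factors (By putting a = b, first and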
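$\begin{vmatrix} 1 & a & b+c \\ 1 & b & c+a \\ 1 & c & a+b \end{vmatrix} = 0$
Solution: Now,
$\begin{vmatrix} 1 & a & b+c \\ 1 & b & c+a \\ 1 & c & a+b \end{vmatrix} = \begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ b+c & c+a & a+b \end{vmatrix}$ [Interchanging rows and columns]
Applying C2 → C2 – C1, C3 → C3 – C1
$= \begin{vmatrix} 1 & 0 & 0 \\ a & b-a & c-a \\ b+c & a-b & a-c \end{vmatrix}$
$= (a-b)(a-c)\begin{vmatrix} 1 & 0 & 0 \\ a & -1 & -1 \\ b+c & 1 & 1 \end{vmatrix} = 0$, by property 3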
Applying R1 → R1 + R2 + R3
$= \begin{vmatrix} 0 & 0 & 0 \\ b-c & c-a & a-b \\ c-a & a-b & b-c \end{vmatrix}$
= 0
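Solution: $\begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{vmatrix} = \begin{vmatrix} 1 & 0 & 0 \\ a & b-a & c-a \\ a^2 & b^2-a^2 & c^2-a^2 \end{vmatrix}$
Applying C2 → C2 – C1 and C3 → C3 – C1
$= (b-a)(c-a)\begin{vmatrix} 1 & 0 & 0 \\ a & 1 & 1 \\ a^2 & b+a & c+a \end{vmatrix}$
= (b – a)(c – a)(c + a – b – a)
= (b – a)(c – a)(c – b)
= (a – b)(b – c)(c – a)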
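Example 6.21: Prove that
$\begin{vmatrix} a-b-c & 2a & 2a \\ 2b & b-c-a & 2b \\ 2c & 2c & c-a-b \end{vmatrix} = (a + b + c)^3$
Solution: $\begin{vmatrix} a-b-c & 2a & 2a \\ 2b & b-c-a & 2b \\ 2c & 2c & c-a-b \end{vmatrix}$
Applying R1 → R1 + R2 + R3
$= \begin{vmatrix} a+b+c & a+b+c & a+b+c \\ 2b & b-c-a & 2b \\ 2c & 2c & c-a-b \end{vmatrix}$
$= (a+b+c)\begin{vmatrix} 1 & 1 & 1 \\ 2b & b-c-a & 2b \\ 2c & 2c & c-a-b \end{vmatrix}$
Applying C2 → C2 – C1, C3 → C3 – C1
$= (a+b+c)\begin{vmatrix} 1 & 0 & 0 \\ 2b & -(a+b+c) & 0 \\ 2c & 0 & -(a+b+c) \end{vmatrix}$
$= (a+b+c)^3$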
157
Matrices and Determinants
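Solution:
\[ \begin{vmatrix} 1+a & 1 & 1 \\ 1 & 1+b & 1 \\ 1 & 1 & 1+c \end{vmatrix} = abc \begin{vmatrix} \frac{1}{a}+1 & \frac{1}{a} & \frac{1}{a} \\ \frac{1}{b} & \frac{1}{b}+1 & \frac{1}{b} \\ \frac{1}{c} & \frac{1}{c} & \frac{1}{c}+1 \end{vmatrix} \]
Applying R1 → R1 + R2 + R3
\[ = abc \begin{vmatrix} 1+\frac{1}{a}+\frac{1}{b}+\frac{1}{c} & 1+\frac{1}{a}+\frac{1}{b}+\frac{1}{c} & 1+\frac{1}{a}+\frac{1}{b}+\frac{1}{c} \\ \frac{1}{b} & 1+\frac{1}{b} & \frac{1}{b} \\ \frac{1}{c} & \frac{1}{c} & 1+\frac{1}{c} \end{vmatrix} \]
\[ = abc \left(1+\frac{1}{a}+\frac{1}{b}+\frac{1}{c}\right) \begin{vmatrix} 1 & 1 & 1 \\ \frac{1}{b} & 1+\frac{1}{b} & \frac{1}{b} \\ \frac{1}{c} & \frac{1}{c} & 1+\frac{1}{c} \end{vmatrix} \]
Applying C2 → C2 – C1, C3 → C3 – C1
\[ = abc \left(1+\frac{1}{a}+\frac{1}{b}+\frac{1}{c}\right) \begin{vmatrix} 1 & 0 & 0 \\ \frac{1}{b} & 1 & 0 \\ \frac{1}{c} & 0 & 1 \end{vmatrix} \]
\[ = abc \left(1+\frac{1}{a}+\frac{1}{b}+\frac{1}{c}\right) \]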
Example 6.23: Prove that x = 2 and x = 3 are roots of the equation,
\[ \begin{vmatrix} x-5 & 2 \\ -3 & x \end{vmatrix} = 0 \]
Solution: Now,
\[ \begin{vmatrix} x-5 & 2 \\ -3 & x \end{vmatrix} = 0 \]
⇒ x² – 5x + 6 = 0
⇒ (x – 3)(x – 2) = 0
⇒ x = 3, x = 2 are roots of the given equation.
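If A is invertible, let $B = \begin{pmatrix} x & x' \\ y & y' \end{pmatrix}$ be the inverse of A.
Then AB = I implies
\[ \begin{pmatrix} x+y & x'+y' \\ x+y & x'+y' \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]
⇒ x + y = 1, x′ + y′ = 0, x + y = 0, x′ + y′ = 1, which is absurd.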
This proves our assertion.
In the present section, we give a method to determine the inverse of a matrix.
Consider the identity A = IA.
We reduce the matrix A on left hand side to the unit matrix I by elementary row
operations only and apply all those operations in same order to the prefactor I on the
right hand side of the above identity. In this way, unit matrix I is reduced to some
matrix B such that I = BA. Matrix B is then the inverse of A.
We illustrate the above method by the following examples.
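Example 6.24: Find the inverse of the matrix,
\[ A = \begin{pmatrix} 1 & 3 & 3 \\ 1 & 4 & 3 \\ 1 & 3 & 4 \end{pmatrix} \]
\[ \begin{pmatrix} 1 & 3 & 3 \\ 1 & 4 & 3 \\ 1 & 3 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 3 & 3 \\ 1 & 4 & 3 \\ 1 & 3 & 4 \end{pmatrix} \]
Applying R2 → R2 – R1, then R3 → R3 – R1, we have,
\[ \begin{pmatrix} 1 & 3 & 3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 3 & 3 \\ 1 & 4 & 3 \\ 1 & 3 & 4 \end{pmatrix} \]
Applying R1 → R1 – 3R2 – 3R3, we have,
\[ \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 7 & -3 & -3 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 3 & 3 \\ 1 & 4 & 3 \\ 1 & 3 & 4 \end{pmatrix} \]
Hence the inverse of the given matrix is
\[ \begin{pmatrix} 7 & -3 & -3 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix} \]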
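The row-reduction method above can be sketched in code. The following is a minimal illustration, not part of the original text: it assumes Python with exact fractions, and the function name `inverse` is a hypothetical helper. Instead of writing A = IA, it places I beside A and applies each row operation to both halves, which is the same bookkeeping.

```python
from fractions import Fraction

def inverse(matrix):
    """Invert a square matrix by row-reducing [A | I] to [I | B]."""
    n = len(matrix)
    # Place the identity matrix beside A, using exact fractions.
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row with a non-zero entry in this column to act as the pivot row.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular and has no inverse")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot becomes 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Make every other entry in this column zero.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    # The right-hand half now holds the inverse.
    return [row[n:] for row in aug]

A = [[1, 3, 3], [1, 4, 3], [1, 3, 4]]
print(inverse(A))
```

For the matrix of Example 6.24 the rows returned agree with the inverse found by hand; a matrix with two identical rows raises the error instead.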
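\[ A = \begin{pmatrix} 1 & 3 & -2 \\ -3 & 0 & -5 \\ 2 & 5 & 0 \end{pmatrix} \]
\[ \begin{pmatrix} 1 & 3 & -2 \\ -3 & 0 & -5 \\ 2 & 5 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} A \]
Applying R2 → R2 + 3R1, R3 → R3 – 2R1 and then R3 → 9R3 + R2, we have,
\[ \begin{pmatrix} 1 & 3 & -2 \\ 0 & 9 & -11 \\ 0 & 0 & 25 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ -15 & 1 & 9 \end{pmatrix} A \]
Applying R3 → (1/25) R3, we have,
\[ \begin{pmatrix} 1 & 3 & -2 \\ 0 & 9 & -11 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ -\frac{3}{5} & \frac{1}{25} & \frac{9}{25} \end{pmatrix} A \]
Applying R1 → R1 + 2R3, R2 → R2 + 11R3, we have,
\[ \begin{pmatrix} 1 & 3 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} -\frac{1}{5} & \frac{2}{25} & \frac{18}{25} \\ -\frac{18}{5} & \frac{36}{25} & \frac{99}{25} \\ -\frac{3}{5} & \frac{1}{25} & \frac{9}{25} \end{pmatrix} A \]
Applying R2 → (1/9) R2, we have,
\[ \begin{pmatrix} 1 & 3 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} -\frac{1}{5} & \frac{2}{25} & \frac{18}{25} \\ -\frac{2}{5} & \frac{4}{25} & \frac{11}{25} \\ -\frac{3}{5} & \frac{1}{25} & \frac{9}{25} \end{pmatrix} A \]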
Example 6.26: Find the inverse of the matrix
\[ \begin{pmatrix} 1 & 2 & -1 \\ -4 & -7 & 4 \\ -4 & -9 & 5 \end{pmatrix} \]
The aspects of matrix subtraction and system of linear equations are discussed here.
3
2 1 1 x
or 8 2
1 3 y
7
\[ X = A^{-1}B \]
as AX = B
\[ A^{-1}(AX) = A^{-1}B \]
\[ (A^{-1}A)X = A^{-1}B \]
\[ (I)X = A^{-1}B \]
\[ X = A^{-1}B \]
1 2 x 4
3
5 y
= 1
1
x 1 2 4
or y =
3 5 1
1 2
A = 3 5
\[ A^{-1} = \frac{1}{|A|}\,\operatorname{adj} A \]
1 2 1
= (3) (2) (1) (4) 4 3
1 2 1
= 2 4 3
1
1 2
= 3
2
2
∴ X = A–1B
1
1 2 4
1
= 3
2
2
7
2
x
y = 13
2
7 13
or x = , y=
2 2
1 3 1 x 1
2 5 0 y 3
3 1 2 z 2
X = A–1 B
1 3 1
0
A= 2 5
3 1 2
\[ A^{-1} = \frac{1}{|A|}\,\operatorname{adj} A \]
1 5 2
= (1) ( 5) (2) (3) 3 1
1 5 2
= 11 3 1
X = A–1 B
1 5 2 4
= 11 3 1 1
1 22
= 11 11
x 2
y = 1
or x = 2, y = 1
Example 6.29: Solve 3x + y = 2
4x + 2y = 3
3 1 x 2
4
2 y
= 3
1
x 3 1 2
or y =
4 2 3
3 1
Given A = 4 2
\[ A^{-1} = \frac{1}{|A|}\,\operatorname{adj} A \]
10 7 5
9 9 9
4 1 2
= 9 9 9
13 10 11
9 9 9
X = A–1B
10 7 5
9 9 9 1
4 1 2
3
= 9 9 9
2
13 10 11
9 9 9
x 21
y
= 3
z 39
It can also be done by partial pivoting: the matrix is converted to an upper
triangular matrix by using elementary row operations.
The same operations are also performed on the R.H.S. matrix. For a system of linear
equations AX = B, form the augmented matrix C = [A : B] and reduce it to upper
triangular form.
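Example 6.30: Solve 3x + 3y + 4z = 20
x + y + z = 6
2x + y + 3z = 13
Solution: Given matrix AX = B is
\[ \begin{pmatrix} 3 & 3 & 4 \\ 1 & 1 & 1 \\ 2 & 1 & 3 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 20 \\ 6 \\ 13 \end{pmatrix} \]
Augmented matrix C = [A : B]
\[ C = \left(\begin{array}{ccc|c} 3 & 3 & 4 & 20 \\ 1 & 1 & 1 & 6 \\ 2 & 1 & 3 & 13 \end{array}\right) \]
Applying R2 → R2 – (1/3) R1 and R3 → R3 – (2/3) R1,
\[ C \sim \left(\begin{array}{ccc|c} 3 & 3 & 4 & 20 \\ 0 & 0 & -\frac{1}{3} & -\frac{2}{3} \\ 0 & -1 & \frac{1}{3} & -\frac{1}{3} \end{array}\right) \]
Applying R2 ↔ R3,
\[ C \sim \left(\begin{array}{ccc|c} 3 & 3 & 4 & 20 \\ 0 & -1 & \frac{1}{3} & -\frac{1}{3} \\ 0 & 0 & -\frac{1}{3} & -\frac{2}{3} \end{array}\right) \]
Now from this matrix, –(1/3) z = –2/3 (last row)
z = 2
–y + (1/3) z = –1/3 (second row)
–y + (1/3)(2) = –1/3
y = 1
and 3x + 3y + 4z = 20 [from first row]
3x + 3(1) + 4(2) = 20
x = 3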
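The elimination and back-substitution steps of Example 6.30 can be sketched as a short program. This is an illustrative sketch only, assuming Python with exact fractions; the function name `solve` is a hypothetical helper, and the pivot rule used (largest absolute entry in the column) is one common choice rather than the one prescribed by the text.

```python
from fractions import Fraction

def solve(a, b):
    """Solve AX = B by forward elimination and back substitution."""
    n = len(a)
    # Build the augmented matrix C = [A : B] with exact fractions.
    c = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column to the pivot row.
        pivot = max(range(col, n), key=lambda r: abs(c[r][col]))
        if c[pivot][col] == 0:
            raise ValueError("no unique solution")
        c[col], c[pivot] = c[pivot], c[col]
        # Make every entry below the pivot zero.
        for r in range(col + 1, n):
            factor = c[r][col] / c[col][col]
            c[r] = [x - factor * y for x, y in zip(c[r], c[col])]
    # Back substitution, starting from the last row.
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        known = sum(c[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (c[i][n] - known) / c[i][i]
    return x

print(solve([[3, 3, 4], [1, 1, 1], [2, 1, 3]], [20, 6, 13]))
```

For the system of Example 6.30 the values returned are x = 3, y = 1, z = 2, matching the hand computation.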
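Example 6.31: Solve x + 2y + 3z = 7
– 2x + 3y – z = 5
– x – 2y + 3z = – 1
Solution: Given matrix AX = B is
\[ \begin{pmatrix} 1 & 2 & 3 \\ -2 & 3 & -1 \\ -1 & -2 & 3 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 7 \\ 5 \\ -1 \end{pmatrix} \]
Augmented matrix C = [A : B]
\[ C = \left(\begin{array}{ccc|c} 1 & 2 & 3 & 7 \\ -2 & 3 & -1 & 5 \\ -1 & -2 & 3 & -1 \end{array}\right) \]
Applying R2 → R2 + 2R1 and R3 → R3 + R1,
\[ C \sim \left(\begin{array}{ccc|c} 1 & 2 & 3 & 7 \\ 0 & 7 & 5 & 19 \\ 0 & 0 & 6 & 6 \end{array}\right) \]
From the last row, 6z = 6, so z = 1. From the second row, 7y + 5(1) = 19, so y = 2.
From the first row, x + 2(2) + 3(1) = 7, so x = 0.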
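For example,
\[ A = \begin{pmatrix} 2 & 3 & 9 \\ 8 & 5 & 7 \\ 3 & 6 & 4 \end{pmatrix} \]
Then
\[ B = \begin{pmatrix} 2 & 3 \\ 8 & 5 \end{pmatrix} \quad \text{(Removing third row and third column)} \]
\[ C = \begin{pmatrix} 5 & 7 \\ 6 & 4 \end{pmatrix} \quad \text{(Removing first row and first column)} \]
\[ D = \begin{pmatrix} 8 & 7 \\ 3 & 4 \end{pmatrix} \quad \text{(Removing first row and second column)} \]
\[ C_1 = \begin{vmatrix} 5 & 7 \\ 6 & 4 \end{vmatrix} = 20 - 42 = -22 \]
\[ D_1 = \begin{vmatrix} 8 & 7 \\ 3 & 4 \end{vmatrix} = 32 - 21 = 11 \]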
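For example, to find the cofactor of $a_{21}$ in the matrix
\[ A = \begin{pmatrix} 3 & 2 & 1 \\ 1 & 8 & 7 \\ 5 & 6 & 9 \end{pmatrix} \]
here $a_{21} = 1$; by deleting the second row and first column, the smaller matrix is
$\begin{pmatrix} 2 & 1 \\ 6 & 9 \end{pmatrix}$ and the cofactor of $a_{21}$ is
$-\begin{vmatrix} 2 & 1 \\ 6 & 9 \end{vmatrix}$ [with a negative sign, as for $a_{21}$, 2 + 1 = 3 is odd].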
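The value of a determinant is equal to the sum of the products of the elements
of a line by its corresponding cofactors.
\[ \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \]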
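The expansion of a determinant by cofactors translates directly into a recursive routine. The sketch below is illustrative and not from the original text; it assumes Python, the names `minor`, `cofactor` and `det` are hypothetical helpers, and indices are 0-based, so the element $a_{21}$ of the text is addressed as row 1, column 0.

```python
def minor(matrix, i, j):
    """Return the matrix with row i and column j removed (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(matrix) if k != i]

def cofactor(matrix, i, j):
    """Signed minor: (-1)**(i + j) times the determinant of the minor."""
    return (-1) ** (i + j) * det(minor(matrix, i, j))

def det(matrix):
    """Expand along the first row using cofactors."""
    if len(matrix) == 1:
        return matrix[0][0]
    return sum(matrix[0][j] * cofactor(matrix, 0, j) for j in range(len(matrix)))

A = [[3, 2, 1], [1, 8, 7], [5, 6, 9]]
print(cofactor(A, 1, 0))  # cofactor of a21, i.e. -(2*9 - 1*6)
print(det(A))
```

Because the sign factor depends only on the parity of the row and column sum, 0-based indices give the same signs as the 1-based indices used in the text.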
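Example 6.32: Find the cofactor of $a_{22}$ in $\begin{vmatrix} 2 & 5 \\ 9 & 7 \end{vmatrix}$.
Solution: By deleting the second row and second column of $\begin{vmatrix} 2 & 5 \\ 9 & 7 \end{vmatrix}$, the value is 2 (with
+ve sign).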
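Example 6.33: Find the cofactors of $a_{31}$ and $a_{23}$ for
\[ A = \begin{pmatrix} 3 & 2 & 1 \\ 0 & 2 & -5 \\ 2 & 1 & 4 \end{pmatrix} \]
Solution: By deleting the third row and first column,
\[ \text{Cofactor of } a_{31} = \begin{vmatrix} 2 & 1 \\ 2 & -5 \end{vmatrix} \quad \text{(with positive sign, as 3 + 1 = 4 is even)} \]
= 2(– 5) – 2(1)
= – 10 – 2
= – 12
By deleting the second row and third column,
\[ A_2 = \begin{pmatrix} 3 & 2 \\ 2 & 1 \end{pmatrix} \]
\[ \text{Cofactor of } a_{23} = -\begin{vmatrix} 3 & 2 \\ 2 & 1 \end{vmatrix} \quad \text{(with negative sign, as 2 + 3 = 5 is odd)} \]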
6.4 SUMMARY
• A matrix which has exactly one row is called a row matrix. For example,
(1 2 3 4) is a row matrix.
• A matrix which has exactly one column is called a column matrix.
• A matrix in which the number of rows is equal to the number of columns is
called a square matrix.
• A matrix each of whose elements is zero is called a null matrix or zero
matrix.
• A square matrix whose every element other than diagonal elements is zero,
is called a diagonal matrix.
• A diagonal matrix whose diagonal elements are equal, is called a scalar
matrix.
• A square matrix (aij), whose elements aij = 0 when i < j is called a lower
triangular matrix.
• The product AB of two matrices A and B is defined only when the number
of columns of A is same as the number of rows in B and by definition the
product AB is a matrix G of order m × p if A and B were of order m × n
and n × p, respectively.
• A square matrix is a matrix which has the same number of rows and
columns. An n-by-n matrix is known as a square matrix of order n.
• In matrix algebra, the determinant is a special number associated with any
square matrix. In linear transformation the determinant acts as a scale factor
or coefficient for measure.
• If two rows (or columns) are interchanged in a determinant it retains its
absolute value but changes its sign.
• Subtraction of matrices is done element by element. Two matrices can be
subtracted only if both are of the same dimensions, i.e., have the same
number of rows and columns; the result is obtained by subtracting
corresponding elements.
• By the matrix inversion method, a system of linear equations can be solved. In
this method the system of equations is written as AX = B, where A is the
coefficient matrix, X is the variable matrix and B is the matrix of right hand
side values of the system of linear equations.
• An element on the left hand side of the matrix is said to be a pivot element
when the elements above and below it are made zero. Elementary operations
are performed to achieve this.
• A minor of a matrix is defined as the determinant of a smaller square matrix
which is obtained by removing one or more rows or columns or both from
matrix A.
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
UNIT–7 DIFFERENTIATION
Objectives
After going through this unit, you will be able to:
• Discuss differentiation and its related concepts
• Understand limit and its types
• Analyse continuity in an interval
• Explain geometrical interpretation of continuity
Structure
7.1 Introduction
7.2 Limit
7.3 Differentiability
7.4 Summary
7.5 Key Words
7.6 Answers to ‘Check Your Progress’
7.7 Self-Assessment Questions
7.8 Further Readings
7.1 INTRODUCTION
7.2 LIMIT
Example 7.1. Find $\lim_{x \to 2} (2x^2 + 5x + 3)$.
Solution:
\[ \lim_{x \to 2} (2x^2 + 5x + 3) = 2(2)^2 + 5(2) + 3 = 21 \]
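Right Hand Limit: $\lim_{x \to a^+} f(x) = L$, provided f(x) can be made as close to L as desired for all x
sufficiently close to a with x > a.
Left Hand Limit: $\lim_{x \to a^-} f(x) = L$, provided f(x) can be made as close to L as desired for all x
sufficiently close to a with x < a.
[Figure: graph of f(x), with the value L marked on the y-axis and the point a marked on the x-axis]
f(x) approaches L on the y-axis as x approaches the value a on the
x-axis.
It means that f(x) approaches L as x approaches a from the right or left.
$\lim_{x \to a} f(x)$ is said to exist when both the left and right hand limits exist and are equal.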
\[ \lim_{x \to a} f(x) = L \]
or f(x) → L when x → a+ and x → a–.
Example 7.2. Find $\lim_{x \to 3} (x^2 + 9)$.
Solution:
\[ \lim_{x \to 3} (x^2 + 9) = (3)^2 + 9 = 18 \]
f(x) is approaching 18 when x is approaching 3.
Example 7.3.
\[ \lim_{x \to 3} \frac{x^2 - 9}{x - 3} = \lim_{x \to 3} \frac{(x + 3)(x - 3)}{x - 3} = \lim_{x \to 3} (x + 3) = 6 \]
f(x) is approaching 6 when x is approaching 3. It can also be defined as,
Let f(x) be a function defined on an interval containing x = a, except possibly
at x = a, then
\[ \lim_{x \to a} f(x) = L \]
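A limit can be explored numerically by evaluating the function at points that approach a from each side. The sketch below is an illustration added for this purpose, assuming Python; the helper name `one_sided_values` is hypothetical. It uses the function of Example 7.3, which is undefined at x = 3 itself.

```python
def f(x):
    """The function from Example 7.3; undefined at x = 3."""
    return (x**2 - 9) / (x - 3)

def one_sided_values(func, a, side, steps=5):
    """Evaluate func at points approaching a from the left (-1) or right (+1)."""
    return [func(a + side * 10.0**-k) for k in range(1, steps + 1)]

left = one_sided_values(f, 3, -1)   # x = 2.9, 2.99, 2.999, ...
right = one_sided_values(f, 3, +1)  # x = 3.1, 3.01, 3.001, ...
print(left)
print(right)
```

Both lists settle towards 6, one from below and one from above, which is consistent with the left and right hand limits being equal. A numerical table of this kind suggests the value of a limit but does not prove it.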
Let $\lim_{x \to a} f(x)$ and $\lim_{x \to a} g(x)$ both exist and let k be any constant. Then
(iv) $\lim_{x \to a} \dfrac{f(x)}{g(x)} = \dfrac{\lim_{x \to a} f(x)}{\lim_{x \to a} g(x)}$ [if $\lim_{x \to a} g(x) \neq 0$]
(v) $\lim_{x \to a} k = k$
(vi) $\lim_{x \to a} [f(x)]^n = \left[\lim_{x \to a} f(x)\right]^n$, n is a positive integer
2
Find lim(4 x 2 x 1)
x 1
2 f ( x ) 3 g ( x)
Example 7.5. If xlima f ( x) = 5 and xlima g ( x) = – 1 find xlima f ( x) g ( x) .
2 lim f ( x) 3 lim g ( x )
2 f ( x) 3 g ( x ) x a x a
Solution: xlima f ( x) g ( x) lim f ( x) lim g ( x)
x a x a
2(5) 3( 1)
= =
5 ( 1)
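Example 7.6. Find $\lim_{x \to 3} \dfrac{x^2 - 1}{2x}$.
Solution:
\[ \lim_{x \to 3} \frac{x^2 - 1}{2x} = \frac{\lim_{x \to 3} x^2 - \lim_{x \to 3} 1}{2 \lim_{x \to 3} x} = \frac{(3)^2 - 1}{2(3)} = \frac{8}{6} = \frac{4}{3} \]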
Continuity of a Function
is equal to f(a).
3. lim f ( x) f (a)
x a
Let f be a real function on a subset of the real numbers and let a be a point in
the domain of f. Then f is continuous at a if
\[ \lim_{x \to a} f(x) = f(a) \]
More elaborately, f is continuous at x = a if the left hand limit, right hand limit and
the value of the function at x = a exist and are equal to each other, i.e.,
\[ \lim_{x \to a^-} f(x) = f(a) = \lim_{x \to a^+} f(x) \]
Continuity in an Interval
(i) f is said to be continuous in an open interval (A, B) if it is continuous at
every point in this interval.
(ii) f is said to be continuous in the closed interval [A, B] if
• f is continuous in (A, B)
• $\lim_{x \to A^+} f(x) = f(A)$
• $\lim_{x \to B^-} f(x) = f(B)$
4. |x – a| : (– ∞, ∞)
5. $x^{-n}$, n is a positive integer : (– ∞, ∞) – {0}
6. p(x)/q(x), where p(x) and q(x) are polynomials in x : R – {x : q(x) = 0}
7. sin x, cos x : R
7.3 DIFFERENTIABILITY
Differentiation
Let f be a function defined on an open interval I and let a be a point of I. The function
f is said to be differentiable at a if and only if the rate of change of the function f
at a has a finite limit l, i.e.:
\[ \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} = l \]
Another definition for derivative is, “the change of a property with respect to
a unit change of another property.”
Let f(x) be a function of an independent variable x. If a small change (∆x) is
caused in the independent variable x, a corresponding change ∆f(x) is caused in the
function f(x); then the ratio ∆f(x)/∆x is a measure of the rate of change of f(x) with
respect to x. The limiting value of this ratio as ∆x tends to zero, $\lim_{\Delta x \to 0} \dfrac{\Delta f(x)}{\Delta x}$, is
called the first derivative of the function f(x) with respect to x; in other words, the
instantaneous rate of change of f(x) at a given point x.
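The ratio ∆f(x)/∆x can be computed for smaller and smaller ∆x to see it settle on the derivative. The sketch below is illustrative only, assuming Python; the name `difference_quotient` and the choice f(x) = x² at the point 3 are assumptions made for the example, not taken from the text.

```python
def difference_quotient(f, a, h):
    """Average rate of change of f between a and a + h."""
    return (f(a + h) - f(a)) / h

def f(x):
    return x**2  # illustrative function; its derivative is 2x

for h in (0.1, 0.01, 0.001, -0.001):
    print(h, difference_quotient(f, 3, h))
```

As h shrinks, from either side, the printed ratios approach 6, the value of 2x at x = 3.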
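Derivative of a constant: \[ \frac{dc}{dx} = 0 \]
Derivative of constant multiple: \[ \frac{d}{dx}(cu) = c\,\frac{du}{dx} \]
(We could also write (cf)′ = cf′, and could use the “prime notation” in the other
formulas as well)
Derivative of sum or difference: \[ \frac{d}{dx}(u \pm v) = \frac{du}{dx} \pm \frac{dv}{dx} \]
Product Rule: \[ \frac{d}{dx}(uv) = u\,\frac{dv}{dx} + v\,\frac{du}{dx} \]
Quotient Rule: \[ \frac{d}{dx}\left(\frac{u}{v}\right) = \frac{v\,\dfrac{du}{dx} - u\,\dfrac{dv}{dx}}{v^2} \]
Chain Rule: \[ \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} \]
\[ \frac{d}{dx} x^n = n x^{n-1} \qquad \frac{d}{dx} u^n = n u^{n-1}\,\frac{du}{dx} \]
\[ \frac{d}{dx} a^x = (\ln a)\,a^x \qquad \frac{d}{dx} a^u = (\ln a)\,a^u\,\frac{du}{dx} \]
\[ \text{(If } a = e\text{)} \quad \frac{d}{dx} e^x = e^x \qquad \frac{d}{dx} e^u = e^u\,\frac{du}{dx} \]
\[ \frac{d}{dx} \log_a x = \frac{1}{(\ln a)\,x} \qquad \frac{d}{dx} \log_a u = \frac{1}{(\ln a)\,u}\,\frac{du}{dx} \]
\[ \text{(If } a = e\text{)} \quad \frac{d}{dx} \ln x = \frac{1}{x} \qquad \frac{d}{dx} \ln u = \frac{1}{u}\,\frac{du}{dx} \]
\[ \frac{d}{dx} \sin x = \cos x \qquad \frac{d}{dx} \sin u = \cos u\,\frac{du}{dx} \]
\[ \frac{d}{dx} \cos x = -\sin x \qquad \frac{d}{dx} \cos u = -\sin u\,\frac{du}{dx} \]
\[ \frac{d}{dx} \tan x = \sec^2 x \qquad \frac{d}{dx} \tan u = \sec^2 u\,\frac{du}{dx} \]
\[ \frac{d}{dx} \cot x = -\csc^2 x \qquad \frac{d}{dx} \cot u = -\csc^2 u\,\frac{du}{dx} \]
\[ \frac{d}{dx} \sec x = \sec x \tan x \qquad \frac{d}{dx} \sec u = \sec u \tan u\,\frac{du}{dx} \]
\[ \frac{d}{dx} \csc x = -\csc x \cot x \qquad \frac{d}{dx} \csc u = -\csc u \cot u\,\frac{du}{dx} \]
\[ \frac{d}{dx} \sin^{-1} x = \frac{d}{dx} \arcsin x = \frac{1}{\sqrt{1 - x^2}} \qquad \frac{d}{dx} \sin^{-1} u = \frac{d}{dx} \arcsin u = \frac{1}{\sqrt{1 - u^2}}\,\frac{du}{dx} \]
\[ \frac{d}{dx} \tan^{-1} x = \frac{d}{dx} \arctan x = \frac{1}{1 + x^2} \qquad \frac{d}{dx} \tan^{-1} u = \frac{d}{dx} \arctan u = \frac{1}{1 + u^2}\,\frac{du}{dx} \]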
Differentiability
The function defined by $f'(x) = \lim_{h \to 0} \dfrac{f(x + h) - f(x)}{h}$, wherever the limit exists, is
defined to be the derivative of f at x. In other words, we say that a function f is
differentiable at a point a in its domain if both $\lim_{h \to 0^-} \dfrac{f(a + h) - f(a)}{h}$, called the left hand
derivative, denoted by Lf ′(a), and $\lim_{h \to 0^+} \dfrac{f(a + h) - f(a)}{h}$, called the right hand
derivative, denoted by Rf ′(a), are finite and equal.
(i) The function y = f(x) is said to be differentiable in an open interval (A, B)
if it is differentiable at every point of (A, B).
(ii) The function y = f(x) is said to be differentiable in the closed interval [A,
B] if Rf ′(A) and Lf ′(B) exist and f ′(x) exists for every point of (A, B).
(iii) Every differentiable function is continuous, but the converse is not true.
Solved Examples
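Example 7.7. Find the value of the constant k so that the function f defined below
is continuous at x = 0, where
\[ f(x) = \begin{cases} \dfrac{1 - \cos 4x}{8x^2}, & x \neq 0 \\ k, & x = 0 \end{cases} \]
Solution: f is continuous at x = 0 if $\lim_{x \to 0} f(x) = f(0) = k$
\[ \Rightarrow \lim_{x \to 0} \frac{1 - \cos 4x}{8x^2} = k \]
\[ \Rightarrow \lim_{x \to 0} \frac{2 \sin^2 2x}{8x^2} = k \]
\[ \Rightarrow \lim_{x \to 0} \left(\frac{\sin 2x}{2x}\right)^2 = k \]
⇒ k = 1
Thus, f is continuous at x = 0 if k = 1.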
Example 7.8. Discuss the continuity of the function f(x) = sin x ⋅ cos x.
Solution: Since sin x and cos x are continuous functions and the product of two
continuous functions is a continuous function, f(x) = sin x ⋅ cos x is a
continuous function.
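Example 7.9. Let f be a piecewise function defined by:
\[ f(x) = \begin{cases} x^2 - 2x - 2 & \text{if } x \le 1 \\ \dfrac{x - 4}{x} & \text{if } x > 1 \end{cases} \]
\[ \lim_{x \to 1^+} \frac{x - 4}{x} = -3 \quad \text{and} \quad f(1) = 1^2 - 2 \times 1 - 2 = -3 \]
So $\lim_{x \to 1^+} \dfrac{x - 4}{x} = f(1)$ and the function f is continuous at 1.
For h > 0,
\[ \frac{f(1 + h) - f(1)}{h} = \frac{\dfrac{1 + h - 4}{1 + h} + 3}{h} = \frac{4h}{h(1 + h)} = \frac{4}{1 + h} \]
So, $\lim_{h \to 0^+} \dfrac{4}{1 + h} = 4$, then f+′(1) = 4.
Graphically, the graph of f is a single unbroken curve and has a cusp at the
point A.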
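\[ f(x) = \begin{cases} \dfrac{x^2 - 6x + 8}{x - 2} & \text{if } x \neq 2 \\ 3 & \text{if } x = 2 \end{cases} \]
\[ \lim_{x \to 2} f(x) = \lim_{x \to 2} \frac{(x - 4)(x - 2)}{x - 2} = 2 - 4 = -2 \]
\[ f(x) = \begin{cases} \dfrac{x^2 - 6x + 8}{x - 2} & \text{if } x \neq 2 \\ -2 & \text{if } x = 2 \end{cases} \]
\[ f(x) = \begin{cases} x - 4 & \text{if } x \neq 2 \\ -2 & \text{if } x = 2, \end{cases} \]
so all we did is remove the point (2, – 2) in the line y = x – 4 and then fill it
in again. The other way would be to show
\[ \lim_{x \to 2} \frac{f(x) - f(2)}{x - 2} \]
exists. Using this rewritten form of f(x) for the limit is easier, and I’ll leave it
to you to check.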
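\[ f(x) = \begin{cases} 9\cos(x) & \text{if } x \le \pi \\ ax + b & \text{if } x > \pi \end{cases} \]
\[ f'(x) = \begin{cases} -9\sin(x) & \text{if } x < \pi \\ a & \text{if } x > \pi, \\ ? & \text{if } x = \pi \end{cases} \]
\[ \lim_{x \to \pi^-} f'(x) = \lim_{x \to \pi^+} f'(x) \]
So, if you check this, you get the equation 0 = a. Plug this into the above
equation to get b = –9. So with these two values, f(x) is differentiable at x = π.
The reason this works is because each piece of f ′(x) is continuous individually.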
Also, what I wanted to demonstrate in the review session with the limit:
\[ \lim_{x \to 4} \frac{\sqrt{5 - x} - 1}{2 - \sqrt{x}}. \]
This is a 0/0 limit. Here, you could multiply by the conjugate on the bottom to
get
\[ \lim_{x \to 4} \frac{(\sqrt{5 - x} - 1)(2 + \sqrt{x})}{4 - x} \]
but nothing cancels. So this tells you that you should also multiply by the
conjugate on the top:
\[ \lim_{x \to 4} \frac{(\sqrt{5 - x} - 1)(\sqrt{5 - x} + 1)(2 + \sqrt{x})}{(4 - x)(\sqrt{5 - x} + 1)} = \lim_{x \to 4} \frac{(4 - x)(2 + \sqrt{x})}{(4 - x)(\sqrt{5 - x} + 1)} = \frac{4}{2} = 2 \]
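\[ \frac{dy}{dx} = \frac{1}{2\sqrt{\tan \sqrt{x}}}\,\frac{d}{dx}\left(\tan \sqrt{x}\right) \]
\[ = \frac{1}{2\sqrt{\tan \sqrt{x}}}\,\sec^2 \sqrt{x}\,\frac{d}{dx}\left(\sqrt{x}\right) \]
\[ = \frac{1}{2\sqrt{\tan \sqrt{x}}}\,(\sec^2 \sqrt{x})\,\frac{1}{2\sqrt{x}} \]
\[ = \frac{\sec^2 \sqrt{x}}{4\sqrt{x}\,\sqrt{\tan \sqrt{x}}}. \]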
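Example 7.12. If y = tan (x + y), find $\dfrac{dy}{dx}$.
\[ \frac{dy}{dx} = \sec^2 (x + y)\,\frac{d}{dx}(x + y) = \sec^2 (x + y)\left(1 + \frac{dy}{dx}\right) \]
\[ \text{or} \quad \frac{dy}{dx}\left[1 - \sec^2 (x + y)\right] = \sec^2 (x + y) \]
\[ \text{Therefore,} \quad \frac{dy}{dx} = \frac{\sec^2 (x + y)}{1 - \sec^2 (x + y)} = -\operatorname{cosec}^2 (x + y). \]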
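Example 7.13. If $e^x + e^y = e^{x+y}$, find $\dfrac{dy}{dx}$.
\[ e^x + e^y\,\frac{dy}{dx} = e^{x+y}\left(1 + \frac{dy}{dx}\right) \]
\[ \text{or} \quad \frac{dy}{dx}\left(e^y - e^{x+y}\right) = e^{x+y} - e^x, \]
\[ \frac{dy}{dx} = \frac{e^{x+y} - e^x}{e^y - e^{x+y}} = \frac{e^x (e^y - 1)}{e^y (1 - e^x)}. \]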
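Example 7.14. If $x^y = e^{x-y}$, prove that $\dfrac{dy}{dx} = \dfrac{\log x}{(1 + \log x)^2}$.
\[ \text{i.e.} \quad y = \frac{x}{1 + \log x} \]
\[ \frac{dy}{dx} = \frac{(1 + \log x) \cdot 1 - x \cdot \dfrac{1}{x}}{(1 + \log x)^2} = \frac{\log x}{(1 + \log x)^2} \]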
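Example 7.15. If y = tan x + sec x, prove that $\dfrac{d^2 y}{dx^2} = \dfrac{\cos x}{(1 - \sin x)^2}$.
\[ \frac{dy}{dx} = \sec^2 x + \sec x \tan x = \frac{1 + \sin x}{\cos^2 x} = \frac{1 + \sin x}{1 - \sin^2 x} \]
\[ \text{Thus} \quad \frac{dy}{dx} = \frac{1}{1 - \sin x}. \]
Differentiating again,
\[ \frac{d^2 y}{dx^2} = \frac{\cos x}{(1 - \sin x)^2}. \]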
Example 7.16. If
\[ f(x) = \begin{cases} \dfrac{x^3 + x^2 - 16x + 20}{(x - 2)^2}, & x \neq 2 \\ k, & x = 2 \end{cases} \]
is continuous at x = 2, find the value of k.
Solution: Given f(2) = k.
\[ \text{Now,} \quad \lim_{x \to 2^-} f(x) = \lim_{x \to 2^+} f(x) = \lim_{x \to 2} \frac{x^3 + x^2 - 16x + 20}{(x - 2)^2} \]
\[ = \lim_{x \to 2} \frac{(x + 5)(x - 2)^2}{(x - 2)^2} = \lim_{x \to 2} (x + 5) = 7 \]
As f is continuous at x = 2, we have
\[ \lim_{x \to 2} f(x) = f(2) \]
⇒ k = 7.
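\[ f(x) = \begin{cases} x \sin \dfrac{1}{x}, & x \neq 0 \\ 0, & x = 0 \end{cases} \]
is continuous at x = 0.
Solution: Left hand limit at x = 0 is given by
\[ \lim_{x \to 0^-} f(x) = \lim_{x \to 0^-} x \sin \frac{1}{x} = 0 \quad \left[\text{since } -1 \le \sin \frac{1}{x} \le 1\right] \]
Similarly, $\lim_{x \to 0^+} f(x) = \lim_{x \to 0^+} x \sin \dfrac{1}{x} = 0$. Moreover f(0) = 0.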
7.4 SUMMARY
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Integration and Its
Application
Objectives
After going through this unit, you will be able to:
• Discuss the aspects of integration
• Analyse integration by substitution
• Understand the applications of integration
• Determine a cost function
Structure
8.1 Introduction
8.2 Integration
8.3 Application of Integration
8.4 Summary
8.5 Key Words
8.6 Answers to ‘Check Your Progress’
8.7 Self-Assessment Questions
8.8 Further Readings
8.1 INTRODUCTION
This unit will introduce you to integration and its applications. Integration is a calculus
operation whereby the integral of a function is determined. It is, in other words, the
process of calculating either a definite integral or an indefinite integral. Integration is
denoted by the symbol ∫. Differentiation is closely related to integration: the two are
inverse operations, and the difference between them can be likened to the difference
between squaring and taking the square root. The topics covered include the
indefinite integral, integration by substitution, and integration of
rational, irrational and trigonometric functions.
Integration has wide application, from economics to accounting and business:
determination of cost functions, total revenue functions, consumer surplus and
producer surplus. The aspects of integration and its varied applications are discussed
in detail in this unit.
8.2 INTEGRATION
[Figure: region under a curve above the X-axis between the points a and b]
Definite Integral
For a real function f(x) and a closed interval [a, b] on the real line, the definite
integral, $\int_a^b f(x)\,dx$, is defined as the area between the graph of the function, the
horizontal axis and the two vertical lines at the end points of the interval.
Indefinite Integral
When a specific interval is not given, it is known as indefinite integral. A definite
integral can be calculated using anti-derivatives.
Given a function f(x), the indefinite integral (or antiderivative) of f(x) is a function
F(x) whose derivative is equal to f(x). This means that F′(x) = f(x).
Symbol of Integration
\[ \int f(x)\,dx = F(x) + C \]
The C is called the constant of integration. From the rules of differentiation the
derivative of any constant is simply 0. That is how differentiation and integration are
related to each other.
number that is squared. Similarly, if integration is applied to the result
obtained by differentiating a continuous function f(x), it leads back to the
original function, and vice versa.
For example, let F(x) be the integral of the function f(x) = x; therefore,
$F(x) = \int f(x)\,dx = (x^2/2) + c$, where c is an arbitrary constant. When differentiating F(x)
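Integration Formulas
Indefinite Integral
Method of substitution
\[ \int f(g(x))\,g'(x)\,dx = \int f(u)\,du \]
Integration by parts
\[ \int f(x)\,g'(x)\,dx = f(x)\,g(x) - \int g(x)\,f'(x)\,dx \]
\[ \int x^n\,dx = \frac{x^{n+1}}{n+1} + C, \quad (n \neq -1) \]
\[ \int \frac{1}{x}\,dx = \ln|x| + C \]
\[ \int c\,dx = cx + C \]
\[ \int x\,dx = \frac{x^2}{2} + C \]
\[ \int x^2\,dx = \frac{x^3}{3} + C \]
\[ \int \frac{1}{x^2}\,dx = -\frac{1}{x} + C \]
\[ \int \sqrt{x}\,dx = \frac{2x\sqrt{x}}{3} + C \]
\[ \int \frac{1}{1 + x^2}\,dx = \arctan x + C \]
\[ \int \frac{1}{\sqrt{1 - x^2}}\,dx = \arcsin x + C \]
\[ \int \sin x\,dx = -\cos x + C \]
\[ \int \cos x\,dx = \sin x + C \]
\[ \int \tan x\,dx = \ln|\sec x| + C \]
\[ \int \sin^2 x\,dx = \frac{1}{2}(x - \sin x \cos x) + C \]
\[ \int \cos^2 x\,dx = \frac{1}{2}(x + \sin x \cos x) + C \]
\[ \int \tan^2 x\,dx = \tan x - x + C \]
\[ \int \sec^2 x\,dx = \tan x + C \]
\[ \int e^{ax} \sin bx\,dx = \frac{e^{ax}}{a^2 + b^2}\,[a \sin bx - b \cos bx] + C \]
\[ \int e^{ax} \cos bx\,dx = \frac{e^{ax}}{a^2 + b^2}\,[a \cos bx + b \sin bx] + C \]
\[ \int \ln x\,dx = x \ln x - x + C \]
\[ \int x^n \ln x\,dx = \frac{x^{n+1}}{n+1} \ln x - \frac{x^{n+1}}{(n+1)^2} + C \]
\[ \int e^x\,dx = e^x + C \]
\[ \int b^x\,dx = \frac{b^x}{\ln b} + C \]
\[ \int \sinh x\,dx = \cosh x + C \]
\[ \int \cosh x\,dx = \sinh x + C \]
\[ \int (ax + b)^n\,dx = \frac{(ax + b)^{n+1}}{a(n+1)} + C, \quad (n \neq -1) \]
\[ \int \frac{1}{ax + b}\,dx = \frac{1}{a} \ln|ax + b| + C \]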
Integration by Substitution
Integration by substitution is a method for handling comparatively complex
integrals. A difficult integral can be made easier by using this method, which
changes both the variable and the integrand. The simple substitution method can be understood
through the example of the linear substitution ax + b = u. The substitution
yields a simpler integral involving the variable u.
Let u = ax + b
Step 1: Choose a new variable u
Step 2: Determine dx in terms of du
Step 3: Make the substitution
Step 4: Integrate the resulting integral
Step 5: Return to the initial variable x
Solution: Let x + 4 = u
\[ du = \frac{du}{dx}\,dx \]
Now, in this example, because u = x + 4 it follows immediately that $\dfrac{du}{dx} = 1$ and
so du = dx. So, substituting both for x + 4 and for dx in Equation (1) we have
\[ \int (x + 4)^5\,dx = \int u^5\,du \]
The resulting integral can be evaluated immediately to give $\dfrac{u^6}{6} + c$. We can
revert to an expression involving the original variable x by recalling that
u = x + 4, giving
\[ \int (x + 4)^5\,dx = \frac{(x + 4)^6}{6} + c \]
Solution: Let 3x + 4 = u
\[ du = \frac{du}{dx}\,dx \]
and so with u = 3x + 4 and $\dfrac{du}{dx} = 3$
It follows that
\[ du = \frac{du}{dx}\,dx = 3\,dx \]
So, substituting u for 3x + 4, and with $dx = \dfrac{1}{3}\,du$ in Equation (2) we have
\[ \int \cos(3x + 4)\,dx = \int \frac{1}{3} \cos u\,du = \frac{1}{3} \sin u + c \]
\[ \int \cos(3x + 4)\,dx = \frac{1}{3} \sin(3x + 4) + c \]
Solution:
Step 1: Choose a substitution function u = 2x + 3
Step 2: Determine the value du = 2dx + 0
\[ dx = \frac{du}{2} \]
\[ \int (2x + 3)^4\,dx = \int u^4\,\frac{du}{2} = \frac{1}{2} \int u^4\,du = \frac{1}{2} \cdot \frac{u^5}{5} + c = \frac{u^5}{10} + c \]
\[ = \frac{(2x + 3)^5}{10} + c \]
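Example 8.4. Find $\displaystyle\int \frac{15}{3 - 2x}\,dx$
Solution:
Step 1: Choose a substitution function u = 3 – 2x
Step 2: Determine the value
du = 0 – 2dx
\[ dx = -\frac{1}{2}\,du \]
\[ \int \frac{15}{3 - 2x}\,dx = \int \frac{15}{u}\left(-\frac{1}{2}\right) du = -\frac{15}{2} \int \frac{du}{u} = -\frac{15}{2} \ln|u| + c \]
\[ = -\frac{15}{2} \ln|3 - 2x| + c \]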
\[ \int f[g(x)]\,g'(x)\,dx = \int f(u)\,du \]
Solution:
Let x2 = u
∴ 2x dx = du
Now integrate:
Example 8.6. $\displaystyle\int 2x\sqrt{1 + x^2}\,dx$
Solution:
And so with u = 1 + x² and $\dfrac{du}{dx} = 2x$
It follows that
\[ du = \frac{du}{dx}\,dx = 2x\,dx \]
\[ \int 2x\sqrt{1 + x^2}\,dx = \int \sqrt{u}\,du = \int u^{1/2}\,du = \frac{2}{3} u^{3/2} + c \]
\[ \int 2x\sqrt{1 + x^2}\,dx = \frac{2}{3} (1 + x^2)^{3/2} + c \]
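A result obtained by substitution can be checked numerically by comparing a definite integral of the original integrand with the difference of the antiderivative at the end points. The sketch below is an illustration only, assuming Python; Simpson's rule, the interval [0, 1] and the helper names are choices made for this example and do not come from the text.

```python
import math

def simpson(f, a, b, n=1000):
    """Approximate the definite integral of f on [a, b] with Simpson's rule (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(x):
    return 2 * x * math.sqrt(1 + x**2)

def antiderivative(x):
    return (2 / 3) * (1 + x**2) ** 1.5  # result of the substitution u = 1 + x^2

numeric = simpson(integrand, 0, 1)
exact = antiderivative(1) - antiderivative(0)
print(numeric, exact)
```

The two printed values agree to many decimal places, which supports the antiderivative found in Example 8.6. The constant c cancels in the difference, so it does not appear in the check.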
1. What is integration?
If C denotes the total cost and $MC = \dfrac{dC}{dx}$ is the marginal cost, then we write
$C = \int (MC)\,dx + k$, where k is the constant of integration.
\[ \therefore\ C(x) = \int (5 + 16x - 3x^2)\,dx = 5x + 16\,\frac{x^2}{2} - 3\,\frac{x^3}{3} + k \]
C(x) = 5x + 8x² – x³ + k
When x = 5, C(x) = C(5) = ₹ 500
or, 500 = 25 + 200 – 125 + k
This gives, k = 400
∴ C(x) = 5x + 8x² – x³ + 400
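The step of fixing the constant of integration from one known cost can be sketched in a few lines. This is an illustrative sketch, assuming Python; the function name `total_cost` is hypothetical, and the figures are those of the worked example above.

```python
def total_cost(x, k):
    """Antiderivative of MC = 5 + 16x - 3x^2 plus the constant of integration k."""
    return 5 * x + 8 * x**2 - x**3 + k

# Fix k from the known point C(5) = 500.
k = 500 - total_cost(5, 0)
print(k)
print(total_cost(5, k))
```

Evaluating the antiderivative with k = 0 at the known output level and subtracting from the known cost isolates k, which is the same arithmetic as 500 = 25 + 200 – 125 + k.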
Example 8.8. The marginal cost function of producing x units of a product is given
by $MC = \dfrac{x}{\sqrt{x^2 + 2500}}$. Find the total cost function and the average cost function
Solution: $MC = \dfrac{x}{\sqrt{x^2 + 2500}}$
\[ \therefore\ C(x) = \int \frac{x}{\sqrt{x^2 + 2500}}\,dx + k \]
Let x² + 2500 = t² ⇒ x dx = t dt
\[ \therefore\ C(x) = \int \frac{t\,dt}{t} + k \]
\[ C(x) = \int dt + k = t + k = \sqrt{x^2 + 2500} + k \]
\[ \therefore\ 1000 = \sqrt{2500} + k = 50 + k \]
Or, k = 950
\[ AC = \sqrt{1 + \frac{2500}{x^2}} + \frac{950}{x} \]
\[ MR = \frac{d}{dx}[R(x)] \]
Also, where R(x) is known, the demand function can be found as $p = \dfrac{R(x)}{x}$
\[ \therefore\ R = \int (12 - 3x^2 + 4x)\,dx + k \]
∴ $p = \dfrac{R}{x}$ = 12 + 2x – x² is the demand function.
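\[ MR = \frac{6}{(x - 3)^2} - 4 \]
Solution: $MR = \dfrac{6}{(x - 3)^2} - 4$
\[ \therefore\ R = \int \left[\frac{6}{(x - 3)^2} - 4\right] dx = -\frac{6}{x - 3} - 4x + k \]
x = 0, R = 0 ⇒ k = – 2
∴ $R = -\dfrac{6}{x - 3} - 4x - 2$, which is the required revenue function.
\[ \text{Now,} \quad p = \frac{R}{x} = -\frac{6}{x(x - 3)} - 4 - \frac{2}{x} \]
\[ = -\frac{6}{x(x - 3)} - \frac{2}{x} - 4 \]
\[ = \frac{-6 - 2x + 6}{x(x - 3)} - 4 \]
\[ = -\frac{2}{x - 3} - 4 = \frac{2}{3 - x} - 4 \]
∴ The demand function is given by $p = \dfrac{2}{3 - x} - 4$.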
[Figure: supply curve S(x) and demand curve D(x), plotted as price against quantity]
In an ideal free market both consumers and producers gain by buying and
selling at the equilibrium price. The goal of this section is to compute exactly how
much the consumers gain by buying at the equilibrium price rather than at a higher
price.
Consider the total amount spent by the consumers if everyone buys at the equilibrium
price $p_e$. In this case $q_e$ units are supplied and bought, and the total amount spent is
the number of units bought times the price per unit, i.e., $p_e q_e$. If instead every
consumer paid the maximum price he or she is willing to pay, then
\[ \text{total amount paid at maximum prices} = \int_0^{q_e} D(q)\,dq. \]
The quantity in the integral is the area under the demand curve from q = 0 to
q = $q_e$. As the figure shows, it is greater than $p_e q_e$, which is the area of the rectangle
with sides [0, $q_e$] and [0, $p_e$], and which according to the formula represents the
total amount spent by consumers at the equilibrium price. The difference between
these two areas represents the total that consumers save by buying at the equilibrium
price.
This is called the consumer surplus for this product (See picture above). To
summarize
\[ \text{Consumer surplus} = \int_0^{q_e} D(q)\,dq - p_e q_e = \int_0^{q_e} [D(q) - p_e]\,dq. \qquad \text{...(i)} \]
A similar analysis (which you should try out) shows that the producers also gain
by trading at the equilibrium price. Their gain, called producer surplus, is given by the
following quantity
\[ \text{Producer surplus} = p_e q_e - \int_0^{q_e} S(q)\,dq = \int_0^{q_e} [p_e - S(q)]\,dq. \qquad \text{...(ii)} \]
\[ p = D(q) = \frac{20}{q + 1} \]
\[ \frac{20}{q + 1} = q + 2. \]
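We compute consumer and producer surplus using formulae (i) and (ii) above:
\[ CS = \int_0^{q_e} D(q)\,dq - p_e q_e \]
\[ = \int_0^3 \frac{20}{q + 1}\,dq - (5)(3) \]
\[ = \Big[20 \ln(q + 1)\Big]_0^3 - 15 \]
= 20 ln 4 – 15
≈ 12.73.
\[ \text{Similarly} \quad PS = p_e q_e - \int_0^{q_e} S(q)\,dq \]
\[ = (5)(3) - \int_0^3 (q + 2)\,dq \]
\[ = 15 - \left[\frac{q^2}{2} + 2q\right]_0^3 \]
\[ = 15 - \left(\frac{9}{2} + 6\right) = 4.50. \]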
Example 8.12. Find consumer and producer surplus for demand equation
p = –50q + 2000 and supply equation p = 10q + 500.
Both areas can be found using a definite integral. The form the integral
takes is:
The shaded area for Consumer Surplus is shown in the figure. The left edge of
the triangle has an x-coordinate of 0, and the right edge is our equilibrium point,
which has an x-coordinate of 25. The top of the triangle is the demand equation
p = –50q + 2000, and the bottom of the triangle is our constant equilibrium price,
750. So,
\[ \text{Consumer Surplus} = \int_0^{25} [(-50q + 2000) - (750)]\,dq \]
\[ = \Big[-25q^2 + 2000q - 750q\Big]_0^{25} \]
= Rs. 15,625
Producer Surplus can be found the same way: The left edge and the right edge
are still at 0 and 25, but now the top of the triangle is our equilibrium price, and
the bottom of the triangle is our supply equation p = 10q + 500. So,
\[ \text{Producer Surplus} = \int_0^{25} [(750) - (10q + 500)]\,dq \]
\[ = \Big[750q - 5q^2 - 500q\Big]_0^{25} \]
= Rs. 3125
Example 8.13. The demand function is given by q = –0.5p + 70 and the supply
function is given by q = 0.7p – 50. Price is on the x-axis, and quantity is on the y-axis.
Find consumer and producer surplus.
Sol. As before, we set the supply and demand equations equal to each other.
Supply = demand
0.7p – 50 = –0.5p + 70
1.2p = 120
p = 100
So we know that Ep = Rs. 100. To find Eq, we could use either the supply or
the demand equation. Again, both will give the same answer:
supply: demand:
q = 0.7(100) – 50 q = –0.5(100) + 70
q = 70 – 50 q = – 50+70
q = 20 q = 20
Both solutions agree, so we can be sure that Eq = 20 units.
Definite integral =
Consumer Surplus
The left edge of Consumer Surplus is the equilibrium line, and the x-coordinate of
that line is our equilibrium price, or ₹100. The right edge is the point where the
demand function crosses the x-axis. To find this point, we set the demand function
equal to zero and solve:
Demand = 0
–0.5p + 70 = 0
–0.5p = –70
p = ₹140
So the bounds of our integral will be at ₹100 and ₹140.
\[ \int_{100}^{140} [(-0.5p + 70) - (0)]\,dp = \int_{100}^{140} (-0.5p + 70)\,dp \]
\[ = \left[-\frac{0.5p^2}{2} + 70p\right]_{100}^{140} \]
\[ = \left(-\frac{0.5(140)^2}{2} + 70(140)\right) - \left(-\frac{0.5(100)^2}{2} + 70(100)\right) \]
= ₹400
Producer Surplus
The left edge of Producer Surplus is the point where the supply function crosses the
x-axis, and so to find this point, we set the supply function equal to zero and solve:
Supply = 0
0.7p – 50 = 0
0.7p = 50
p = ₹71.43
The right edge is the equilibrium line, and the x-coordinate of that line is ₹100. So the bounds
of our integral will be ₹71.43 and ₹100.
\[ \int_{71.43}^{100} [(0.7p - 50) - (0)]\,dp = \int_{71.43}^{100} (0.7p - 50)\,dp \]
\[ = \left[\frac{0.7p^2}{2} - 50p\right]_{71.43}^{100} \]
\[ = \left(\frac{0.7(100)^2}{2} - 50(100)\right) - \left(\frac{0.7(71.43)^2}{2} - 50(71.43)\right) \]
≈ ₹285.71
Example 8.14. Suppose that when it is t years old, a particular industrial machine
generates revenue at the rate R′(t) = 5,000 – 20t2 rupees per year and that
operating and servicing costs related to the machine accumulate at the rate C’(t) =
2,000 + 10t2 rupees per year.
(a) How many years pass before the profitability of the machine begins to
decline?
(b) Compute the net earnings generated by the machine over the time period
determined in part (a).
Solution:
(a) The profit associated with the machine after t years of operation is
P(t) = R(t) – C(t) and the rate of profitability is
P′(t) = R′(t) – C′(t)
= (5,000 – 20t2) – (2,000 + 10t2)
= 3,000 – 30t2
The profitability begins to decline when
P′(t) = 0
3,000 – 30t2 = 0
t2 = 100
t = 10 years
(b) The net earnings NE over the time period 0 ≤ t ≤ 10 is given by the
difference NE = P(10) – P(0), which can be computed by the integral
\[ NE = P(10) - P(0) = \int_0^{10} P'(t)\,dt \]
\[ = \int_0^{10} (3{,}000 - 30t^2)\,dt \]
\[ = \Big[3{,}000t - 10t^3\Big]_0^{10} = ₹\,20{,}000 \]
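Both parts of Example 8.14 can be reproduced with a brief calculation. The sketch below is illustrative, assuming Python; the function names are hypothetical, and the antiderivative is the one obtained in part (b).

```python
def marginal_profit(t):
    """P'(t) = R'(t) - C'(t) = 3000 - 30 t^2 (rupees per year)."""
    return (5000 - 20 * t**2) - (2000 + 10 * t**2)

def profit_antiderivative(t):
    """An antiderivative of P'(t): 3000 t - 10 t^3."""
    return 3000 * t - 10 * t**3

t_star = (3000 / 30) ** 0.5  # positive root of 3000 - 30 t^2 = 0
net_earnings = profit_antiderivative(t_star) - profit_antiderivative(0)
print(t_star, net_earnings)
```

The positive root of P′(t) = 0 marks the time at which profitability stops growing, and the difference of the antiderivative at that time and at t = 0 gives the net earnings over the period.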
8.4 SUMMARY
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
Meaning and Scope of Statistic
BLOCK-III
BASIC STATISTICAL CONCEPTS
The basic statistical concepts are discussed in this block. Statistics relates to our
daily life in many ways. It generally involves reaching decisions on the basis of a
number of tests and surveys. Surveys are carried out on a particular group drawn from a
population, called a sample. A sample is a representative group that is studied in order
to reach a result about the population as a whole. This block discusses the meaning and
scope of statistics, the methods of organizing a statistical survey, accuracy,
approximation and errors, and ratios, percentages and rates. This block consists of four units.
The ninth unit of this book discusses the meaning and scope of statistics. Statistics is an
important aspect of our daily life. It pertains to the various financial and numerical decisions
that we take in a day, and its subject matter ranges from rising stock prices to literacy rates.
In recent years, statistics has spread from mathematics to various other fields. The unit
discusses the many aspects of statistics in detail.
The tenth unit describes the method of organizing a statistical survey. Surveys are generally
research based and are carried out with the objective of reaching some conclusion. A
statistical survey, however, targets a particular population and is intended to help solve
the problems and issues of that population. The unit discusses the methods of conducting
surveys in detail.
The eleventh unit explains accuracy, approximation and errors. Any statistical data, when
collected, involves these three things. As statistical data is generally collected on a large
scale, it is likely to contain errors, because many figures are based on an approximate
value or count. Accuracy is important for reaching a conclusion at the end of a survey, but
it has to be weighed against errors and approximations. This unit explains how this is done.
The twelfth unit discusses ratios, percentages and rates. Any survey or statistical data
collected over a large area involves ratios, percentages and rates. These are computed as
required for the end result. The unit discusses the role of ratios, percentages and rates
in statistics.
Meaning and Scope of Statistics
Objectives
After going through this unit, you will be able to:
• Describe the nature and scope of statistics
• Assess the concept of business statistics
• Analyse the importance of statistics in various fields
• Discuss the evaluation of statistics as a subject of study
Structure
9.1 Introduction
9.2 An Introduction to Statistics
9.3 Evaluating Statistics
9.4 Summary
9.5 Key Words
9.6 Answers to ‘Check Your Progress’
9.7 Self-Assessment Questions
9.8 Further Readings
9.1 INTRODUCTION
Most business decisions are made today on the basis of relevant information and
statistical analysis of such information. Quantitative analysis has replaced intuition and
experienced guess work in solving most business problems. One of the tools to
understand information is statistics.
In general, business statistics can be defined as ‘a body of methods for
obtaining, organizing, summarizing, presenting, interpreting, analysing and acting
upon numerical facts related to an activity of interest. Numerical facts are usually
subjected to statistical analysis with a view to helping a decision-maker make wise
decisions in the face of uncertainty’.
The word ‘statistics’ is used in two senses. In its common sense, it refers
simply to numerical statements of facts such as the number of children in a family, the
number of books on statistics in the college library, the number of students enrolled
in the department of economics in Delhi University, and so on. The following
statements indicate the use of statistics as referring to numbers:
• Around 20 million Americans have a serious drinking problem.
• Nearly 52,000 Americans died in automobile accidents last year.
• More than 76 per cent voters turned out to vote during elections in Punjab
in February 2007.
• A majority of Americans consider Japanese cars superior in quality to
American cars.
All these statements represent statistical conclusions in some form. These
conclusions help us in formulating specific policies and attitudes with respect to
diverse areas of interest.
The second meaning of statistics refers to the field of study rather than simply to
numerical statements. As an area of study, it is primarily concerned with making
scientific and rational decisions about various properties and characteristics of some
population of interest, such as stock market trends, interest rates, demographic
shifts, inflation rates over the years, and so on. Consider the following statistical
statements:
• The crime rate in the city has gone up by 15 per cent over what it was last
year. (This statistical conclusion could help us in making decisions regarding
our safety and security in the city).
• The rate of inflation is expected to remain less than 5 per cent per year over
the next five years. (This could help us in making more educated
judgements about the general economic health of the country in the near
future).
• Less than 20 per cent of all high school graduates enter colleges for higher
education and less than 40 per cent of those who do enter colleges actually
graduate. (This statement gives us a good indication of the educational
philosophy of the country and the community and the reasons for such low
rates of admission into colleges and graduation could be investigated).
All these statements represent statistical conclusions in some form, which help us
to understand our environment better, and further help us in formulating specific
policies and attitudes to address and solve issues of interest.
Functions of Statistics
Statistics is no longer confined to the domain of mathematics. It has spread to most
of the branches of knowledge including social sciences and behavioural sciences.
One of the reasons for its phenomenal growth is the variety of different functions
attributed to it. Some of the most important functions of statistics are described as
follows:
1. It condenses and summarizes voluminous data into a few
presentable, understandable and precise figures: The raw data, as is
usually available, is voluminous and haphazard. It is generally not possible
to draw any conclusions from the raw data as collected. Hence, it is
necessary and desirable to express this data in a few numerical values. For
instance, the average salary of a policeman is derived from a mass of data
from surveys. But just one summarized figure gives us a pretty good idea
about the income of police officers. Similarly, stock market prices of
individual stocks and their trends are highly complex to comprehend, but a
graph of price trends gives us the overall picture at a glance.
2. It facilitates classification and comparison of data: Arrangement of
data with respect to different characteristics facilitates comparison and
interpretation. For instance, data on age, height, sex and family income of
college students gives us a much better picture of students when the data is
categorized relative to these characteristics. Additionally, bare statements of
these figures do not convey any significant meaning. It is
their comparison that helps us draw conclusions.
3. It helps in determining functional relationships between two or more
phenomena: Statistical techniques such as correlation analysis assist in
establishing the degree of association between two or more
variables. For instance, the coefficient of correlation between literacy and
employment gives us the degree of association between the two, just as a
coefficient of correlation between extent of training and industrial
productivity would. Similarly, correlation between average rainfall
and agricultural productivity can be obtained by using such statistical tools.
Some statistical methods can also be used in formulating and testing
hypotheses about a certain phenomenon. For instance, it can be tested
whether a credit squeeze is effective in controlling prices of consumer
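Two of the functions described above, condensing data into a single figure and measuring the degree of association between two variables, can be illustrated with a short sketch. The rainfall and crop-yield figures below are hypothetical and chosen only for illustration; the helper names `mean` and `pearson_r` are likewise assumptions of this sketch.

```python
import math

# Hypothetical data: average rainfall (cm) and crop yield (quintals per hectare)
rainfall = [60, 75, 80, 95, 110]
crop_yield = [18, 22, 23, 27, 30]

def mean(values):
    """Condense a list of figures into one summary value."""
    return sum(values) / len(values)

def pearson_r(x, y):
    """Coefficient of correlation between two equally long lists."""
    mx, my = mean(x), mean(y)
    covariation = sum((a - mx) * (b - my) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mx) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - my) ** 2 for b in y))
    return covariation / (spread_x * spread_y)

print(mean(crop_yield))                            # 24.0
print(round(pearson_r(rainfall, crop_yield), 3))   # close to +1 for this data
```

A coefficient close to +1 indicates that, in this made-up data, higher rainfall goes together with higher yield. It measures association only and does not by itself establish cause and effect.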
Scope of Statistics
There is hardly any walk of life which has not been affected by statistics—ranging
from a simple household to big businesses and the government. Some of the
important areas where the knowledge of statistics is usefully applied are explained in
the following paragraphs:
Statistics in Government
Since the beginning of organized society, the rulers and the heads of states have
relied heavily on statistics in the form of collecting data on various aspects for
formulating sound military and fiscal policies. This data may have involved
population, taxes collected, military strength and so on. In the current structure of
democratic societies, the government is, perhaps, the biggest collector of data and
user of statistics. Various departments of the government collect and interpret vast
amounts of data and information for efficient functioning and decision-making.
Limitations of Statistics
Statistics is essential for almost all sciences such as social, physical and natural. In
spite of the extensive scope of the subject it has the following limitations:
1. Statistics does not study qualitative phenomena because it deals with facts
and figures. So the quality aspect of a variable or the subjective
phenomenon falls out of the scope of statistics. For example, qualities like
beauty, honesty, intelligence, etc., cannot be numerically expressed. So
these characteristics cannot be examined statistically.
2. Statistics does not study individuals. Statistics deals with aggregate of facts.
Single or isolated figures are not statistics.
3. Statistics can be misused. Statistics is mostly a tool of analysis. Statistical
techniques are used to analyse and interpret the collected information in an
enquiry. Statements supported by statistics are more appealing and are
commonly believed. For this reason, statistics is often misused.
4. Statistical methods rightly used are beneficial but if misused these become
harmful. Statistical methods used by less expert hands will lead to
inaccurate results. Here the fault does not lie with the subject of statistics
but with the person who makes wrong use of it.
5. Statistical methods cannot be applied to heterogeneous data.
6. If sufficient care is not exercised in collecting, analysing and interpreting
the data, statistical results might be misleading.
4. How can you say that statistics influences the operations of business and
management?
2. What are the various statistical factors which should be considered while
planning the development activities of the state?
9.4 SUMMARY
• Most business decisions are made today on the basis of relevant information
and statistical analysis of such information.
• Business statistics can be defined as a body of methods for obtaining,
organizing, summarizing, presenting, interpreting, analysing and acting upon
numerical facts related to an activity of interest.
• In its second sense, statistics refers to a field of study rather than simply
to numerical statements.
• Single or isolated facts or figures cannot be called statistics as these cannot
be compared or related to other figures within the same framework.
• All statistics are stated in numerical figures which means that these are
quantitative information only.
• The procedures for collecting data should be predetermined and well
planned and such data collection should be undertaken by trained
investigators.
• The two methods which are used for collecting data are as follows:
(a) Actual counting or measuring
(b) Estimation
• The main objective of data collection is to facilitate a comparative or
relative study of the desired characteristics of the data.
• According to Schaum’s Outline of Business Statistics, Statistics refers to
the body of techniques used for collecting, organizing, analysing and
interpreting data.
• Statistics is no longer confined to the domain of mathematics and has
spread to most of the branches of knowledge including social sciences and
behavioural sciences.
Organizing a Statistical Survey
Objectives
After going through this unit, you will be able to:
• Discuss the steps in a statistical survey
• Understand the sources of statistical data
• Analyse the factors affecting the type of enquiry
• Describe the various types of enquiries
• Assess non-probability sampling methods
Structure
10.1 Introduction
10.2 An Overview to Statistical Survey
10.3 Sampling Methods
10.4 Statistical Unit
10.5 Summary
10.6 Key Words
10.7 Answers to ‘Check Your Progress’
10.8 Self-Assessment Questions
10.9 Further Readings
10.1 INTRODUCTION
This unit discusses the statistical survey and its various aspects. A statistical
survey is a collection of information about items in a population. It is a statistical
inquiry on a specific target population, carried out to discover facts that
can then be used to solve problems pertaining to that segment of the
population. The unit describes the different types of enquiry and the laws pertaining
to sampling. As you advance further, it explains the statistical unit and the degree of
accuracy, both of which are important to keep in mind when conducting a survey.
The various aspects of statistical survey have been discussed below in detail.
purpose of survey is not complete till the findings are clearly presented and
communicated to people in a systematic manner. The results of the survey
may then serve as a source of knowledge. For all these reasons,
survey reports are significant.
want it to serve. If secondary data is used without editing or scrutiny, it may lead to
errors, and the investigation would then be incorrect. For these reasons it is
essential to use secondary data with caution.
Types of Enquiries
A statistical survey is incomplete without deciding the type of enquiry, even if the
other preliminary steps have already been taken. The enquiry may be direct or
indirect, census or sample, original or repetitive, and confidential or open.
However, it is important first to understand the factors that influence the choice
of enquiry before discussing the different kinds.
Perhaps for this reason the government undertakes a population census only once in a decade.
Another thing to remember is that at times it is difficult to examine all the items within a
population. Moreover, accurate results can sometimes be obtained even when only a
part of the population is studied. When this is the case, there is no need
for a census survey.
A sample enquiry requires only a partial study of the population. Time, cost,
convenience and other such factors form the basis for choosing a sample survey.
In a sample survey, the sample items are selected to represent the population in
its totality. These items help the investigator to estimate the characteristics of
the population without bias, which in turn helps in producing reliable and valid results.
The advantages of a sample enquiry are as follows:
i) Conducting a sample study is cheaper and requires fewer financial resources
than a census study. The results are also obtained in a shorter time.
ii) The measurements are more accurate as the data collection is done by
experienced and trained investigators.
iii) With a large population, the sample survey is the best means of
data collection.
iv) The sample survey is the best means of survey when the object under study
is destroyed in the course of the study. The best example of
this comes from the physical sciences, wherein fresh samples are required each
time the chemicals are used.
v) With this method, the errors that arise as a result of sampling can be
estimated.
Despite these advantages, a sample enquiry is of little use when the area of
enquiry is small or narrow. The choice of a means of enquiry depends on
factors such as the availability of resources and the nature, objective and
scope of the enquiry.
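The idea that a sample can stand in for a census, and that the error due to sampling can itself be estimated, can be sketched as follows. The population of household incomes, the sample size of 400 and the random seed are all hypothetical choices made for this illustration.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so that the sketch is repeatable

# Hypothetical population: monthly incomes (in rupees) of 10,000 households
population = [random.randint(10_000, 60_000) for _ in range(10_000)]

# Sample enquiry: study only 400 households chosen at random
sample = random.sample(population, 400)
sample_mean = statistics.mean(sample)

# Estimated standard error of the sample mean
# (the finite-population correction is ignored in this sketch)
standard_error = statistics.stdev(sample) / math.sqrt(len(sample))

# What a full census of all 10,000 households would report
census_mean = statistics.mean(population)

print(f"census mean             : {census_mean:.0f}")
print(f"sample mean             : {sample_mean:.0f}")
print(f"estimated sampling error: {standard_error:.0f}")
```

The sample covers only 4 per cent of the households, yet its mean is typically within a few standard errors of the census figure. The standard error is computed from the sample alone, which is what advantage (v) refers to.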
Original or Repetitive Enquiry: An original enquiry is one conducted for the first
time, whereas a repetitive enquiry is carried out in continuation of previous
surveys. In an original survey, one has the liberty to adopt any
means of data collection, but in a repetitive enquiry the old
method has to be followed throughout the study. If a new situation is encountered,
the method is modified accordingly. However, in a repetitive enquiry one needs to be
careful not to change the definitions of the terms used, as this would lead to
inaccurate comparisons.
Confidential or Open Enquiry: In a confidential enquiry, as the name suggests,
the survey results are kept secret and are not revealed to the public.
In an open enquiry, by contrast, the results are made public. The two kinds of
enquiry are therefore handled differently. Most of the enquiries
conducted by the state or the government, including those by institutions, are
non-confidential. When private bodies such as trade unions and other
associations are involved, the information collected is kept among themselves and
confined to the few members involved.
Direct or Indirect Enquiry: A direct enquiry involves direct
quantitative measurement. For example, factors such as height, weight and income
can be expressed in quantitative terms. An indirect enquiry is different, as direct
quantitative measurement is not possible in it. Things like
honesty, intelligence and efficiency cannot be measured directly.
However, these factors are still taken into consideration in an indirect
enquiry, as they influence the problem even though they are not
measured quantitatively. Factors that are not quantifiable have to be
measured indirectly. For instance, to study the intelligence of students, their
marks can be included as part of the study.
Regular or Ad-hoc Enquiry: A regular enquiry involves collecting data at regular
intervals over a period of time, whereas an ad-hoc enquiry collects data as and when
required, without any fixed interval or timing. The enquiry is conducted whenever the
data is needed.
Official or Semi-official or Non-official Enquiry: An official enquiry is one in which
the government conducts the survey. A semi-official enquiry is
conducted by bodies that enjoy government patronage. A non-official or private
enquiry is carried out by private institutions, bodies or individuals. The
facilities available vary with the type of enquiry. For example, in
an official enquiry people are obliged to supply
information. In a semi-official enquiry the information is acquired on request.
In a private enquiry the investigator may face numerous
difficulties in collecting data.
Data in a statistical investigation is collected with the help of surveys. There are two
types of surveys; these are discussed below:
(1) Census survey - This covers an entire group.
(2) Sample survey - This covers selected representative items of a
group.
In a sample survey, the representative items constitute the sample. The methods
through which samples are selected can be categorized as follows:
(1) Probability sampling methods
(2) Non-probability sampling methods
these groups and select samples that are suitable. This is what is called stratified
sampling.
Cluster Sampling: In this method the population is divided into heterogeneous
groups, called clusters, and a few of these are selected by
the random sampling technique. The survey is then carried out on
the selected clusters, including all the items in them. In the previous example,
all the five different elements are included in each institution, so that the
employees of an institution form a heterogeneous group. Each institution in the
list would be a cluster. A few institutions are selected by random sampling and
all of their employees are surveyed. This is cluster sampling.
Area Sampling: This method is similar to cluster sampling. It is used
when a survey is widely spread and a geographical area has to be covered.
The area is divided into smaller areas, and random selection is then
applied to these smaller areas. All the units in the selected areas are
studied and examined to complete the survey.
Multi-stage Sampling: This method is used when the survey has to cover a
large area or when the population comprises heterogeneous groups. For example,
suppose the survey you are about to conduct covers families from the whole country.
This requires a sampling method with several stages. The first
stage is a random selection of states. Next, a few districts are selected
randomly from each selected state. After this, a few towns are selected from each of
the districts. Finally, families are selected randomly from the selected towns.
The selection is thus carried out in four stages to arrive at the final
sample. With this method every item has a chance of being selected.
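The difference between cluster sampling and multi-stage sampling can be sketched in a few lines. The sampling frame below (3 states, 4 districts per state, 5 towns per district, 20 families per town) and the function names are hypothetical and serve only to illustrate the two procedures described above.

```python
import random

random.seed(7)  # fixed seed so that the sketch is repeatable

# Hypothetical frame: state -> district -> town -> list of families
frame = {
    f"State-{s}": {
        f"District-{s}.{d}": {
            f"Town-{s}.{d}.{t}": [f"Family-{s}.{d}.{t}.{k}" for k in range(1, 21)]
            for t in range(1, 6)
        }
        for d in range(1, 5)
    }
    for s in range(1, 4)
}

def multi_stage_sample(frame, n_states, n_districts, n_towns, n_families):
    """Select states, then districts, then towns, then families, each at random."""
    chosen = []
    for state in random.sample(sorted(frame), n_states):
        districts = frame[state]
        for district in random.sample(sorted(districts), n_districts):
            towns = districts[district]
            for town in random.sample(sorted(towns), n_towns):
                chosen.extend(random.sample(towns[town], n_families))
    return chosen

def cluster_sample(clusters, n_clusters):
    """Pick whole clusters at random and survey every item in them."""
    picked = random.sample(sorted(clusters), n_clusters)
    return [item for name in picked for item in clusters[name]]

# Four stages: 2 states x 2 districts x 2 towns x 5 families = 40 families
families = multi_stage_sample(frame, 2, 2, 2, 5)
print(len(families))  # 40

# Cluster sampling within one district: 2 whole towns of 20 families each
whole_towns = cluster_sample(frame["State-1"]["District-1.1"], 2)
print(len(whole_towns))  # 40
```

In the multi-stage sample only some families of each selected town are surveyed, whereas in the cluster sample every family of each selected town is included.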
The conditions for the law of statistical regularity are as follows:
i) The selection should be random, that is, every item should have an
equal chance of being selected.
ii) The number of items included should be large enough for the sample to be
sufficiently representative.
From this it follows that if a sufficiently large sample is chosen at random
from a population, the sample is almost sure to possess, on average, the same
characteristics as the population.
slow and possibly gradual. This shows that the larger the number of items included, the
smaller the deviation.
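The two ideas above, that a large random sample resembles its population and that the deviation becomes smaller as more items are included, can be observed in a small simulation. The population of heights, the sample sizes and the seed are hypothetical choices made for this sketch.

```python
import random
import statistics

random.seed(1)  # fixed seed so that the sketch is repeatable

# Hypothetical population: heights (in cm) of 50,000 people
population = [random.gauss(165, 10) for _ in range(50_000)]
population_mean = statistics.fmean(population)

def deviation_of_sample_mean(n):
    """Draw n items at random (each equally likely) and compare the means."""
    sample = random.sample(population, n)
    return abs(statistics.fmean(sample) - population_mean)

for n in (10, 100, 1_000, 10_000):
    print(f"sample size {n:>6}: deviation = {deviation_of_sample_mean(n):.3f} cm")
```

In any single run the deviation need not fall at every step, because each sample is random, but it tends to become much smaller as the sample size grows.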
It is important to define the unit properly before the statistical survey begins.
The unit is defined by the investigator in terms of the measures adopted for the
variables selected for enumeration, analysis and interpretation. A proper
definition is essential for collecting relevant data. If the unit is not well
defined, the collected data may lack relevant information that should have been
included. Defining the unit is therefore not as easy as it may seem at first.
price. If retail price is suitable to the sample then this should be used
instead of choosing another type of price.
ii) The unit should be specific and unambiguous: It is essential to define
the unit specifically to avoid ambiguity. If this is not done, the data
collected will be full of errors and inaccurate.
iii) The unit must be stable: If the value of the unit fluctuates, data collected
at different places or times would not be comparable and would be
misleading.
iv) The unit must be homogeneous: Once the statistical unit has been defined,
it must be kept uniform throughout the enquiry. This is essential for
valid comparisons based on the data collected.
v) The unit must be simple: The statistical unit should be simple, so that it
is easily understood, and it should be complete.
Types of Units
Now take a look at different kinds of statistical units.
1. A statistical unit may be either a physical unit or an arbitrary unit.
Examples of physical units are grams, kilograms, tons, meters and
so on. These units are common and need no explanation. In some studies,
however, physical units are not suitable. For example, when studying
the wages of workers in an industry, the unit will be the wage.
Different kinds of wages could be included, such as piece wage, daily wage,
monthly wage, money wage and so on. This situation requires an arbitrary decision
about the kind of wage on which data is to be collected, which must then be defined.
2. The statistical units have categories that include
(i) Units of estimation or enumeration
(ii) Units of analysis and interpretation
(i) Units of enumeration are the terms in which data is collected. They can
be simple or composite units. A simple unit represents a
single condition without any qualification. Examples are
hour, house, meter and worker. A composite unit consists of a simple unit
to which a qualifying word has been added. The qualifying word limits the
scope of the unit, and for this reason a composite unit is more difficult to define.
For example, worker and skilled worker are two units. The first is
simple and the second is composite. In the second case, the additional
component must be well defined. Other similar
examples are kilowatt-hour and machine-hour.
(ii) Units of analysis and interpretation are comparative in nature and
therefore include coefficients, rates, percentages and so on.
Degree of Accuracy
The degree of accuracy desired should be decided in advance of the enquiry, so that
data collection can be planned to achieve it.
There are two aspects that you should keep in mind:
(i) The accuracy
(ii) The degree of accuracy that needs to be achieved in the given
investigation.
However, it is to be remembered that absolute accuracy in describing a
phenomenon is not possible to achieve. The accuracy of a result is limited by
imperfections on the part of the investigator and by imperfect
measuring instruments. For these reasons complete accuracy cannot be
expected. Even in the physical sciences, with their controlled experiments,
absolute accuracy is not possible; it is even less attainable in the social sciences.
From the figures obtained, it would be misleading to state the average age of a
student as follows:
(16+17+16+15+15)/5 = 15.8 years.
Since the ages are recorded in complete years, it is better to express the average
to the same accuracy, that is, as 15 complete years.
The accuracy implied by the figure 15.8 years is called spurious
accuracy. One needs to guard against spurious accuracy whenever numerical
facts are presented.
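The calculation above can be reproduced in a few lines. This is only a sketch of the arithmetic; reporting the result in completed years follows the suggestion in the text.

```python
ages = [16, 17, 16, 15, 15]  # ages in completed years, as in the example

average = sum(ages) / len(ages)
print(average)       # 15.8

# The recorded ages carry no decimal place, so the extra digit is spurious.
# Expressed in completed years, the average is:
print(int(average))  # 15
```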
10.5 SUMMARY
• Statistical surveys are fact-finding enquiries on a subject of interest. They
need to be planned properly and executed with caution so that the
results depict reality.
• Following are the steps that you need to consider when it comes to
statistical survey:
o Defining the problem
o Determining the objective and scope of the survey
o Accomplishing the initial steps like deciding the sources of data, type
of enquiry, statistical unit and the degree of accuracy desired
o Data collection
o Editing the data
o Classification and tabulation of data
o Data analysis
o Data interpretation
o Writing the report
• The sources of data can be primary or secondary. Primary data is
original data collected for the first time by an
investigator. Secondary data is data that has already been collected by
someone else and on which the investigator then works.
• Data can be collected by several methods, using
techniques such as interview, questionnaire, schedule, observation and many
more. It is up to the investigator to use the most suitable technique, which
depends on the scope, nature and object of the enquiry as well as on the
constraints of money and time.
• Secondary data can be obtained from secondary
sources such as newspapers, journals, books, reports and
other published sources. If needed, unpublished sources
can also be referred to.
• It is important to know, before conducting a survey, that there are different
kinds of survey: it can be a census or a sample survey. A census is used when a whole
group is to be surveyed, whereas a sample survey covers only a part of the
group.
• In practice, the sample survey is usually the best method to employ because of
the numerous advantages it offers. Other forms of enquiry include
direct and indirect, open and confidential, original or repetitive, and regular or
ad-hoc.
• After deciding the kind of enquiry that you will employ, the next step is to
understand the factors that you need to be concerned about.
Sample selection has several methods, such as:
o Probability sampling methods
o Non-probability sampling methods
Accuracy, Approximation and Errors
Objectives
After going through this unit, you will be able to:
• Discuss the errors in statistics
• Explain the measurement of errors of approximation
• Analyse the effect of mathematical operations on error
• Assess sampling and non-sampling errors
Structure
11.1 Introduction
11.2 Approximation and Errors
11.3 Estimation and Sampling of Errors
11.4 Summary
11.5 Key Words
11.6 Answers to ‘Check Your Progress’
11.7 Self-Assessment Questions
11.8 Further Readings
11.1 INTRODUCTION
This unit will discuss accuracy, approximation and errors. Statistical data should
meet a reasonable standard of accuracy, and for this the degree of accuracy
needs to be clearly defined. It is important to understand
how accuracy is measured and how approximations are made so that the desired
accuracy is achieved. There are certain aspects that one needs to keep in mind,
foremost among which are the errors that occur. Errors generally arise when the
measurements are inaccurate, when the methods employed are inappropriate, or when
the figures are approximate values.
This unit will teach you the concepts related to accuracy. Along with it you will
also learn about the methods and concepts of approximation, and how to recognize the
errors that arise from different methods of measurement.
Accuracy
Statistical data are obtained by counting, measuring or estimating. If the data
relate to cars, the cars have to be counted. If they relate to milk, the quantity
produced has to be measured, and milk powder has to be weighed. However, when the
government is trying to find data on wheat production before and after the
harvest, the total production can only be estimated. Exact figures are obtained
only when counting is done properly; estimates and measurements may not always
be exact.
When a truckload of powder is weighed on a weighbridge, a difference of a
kilogram or more hardly matters. However, when a pinch of powder is weighed on a
chemical balance, even a milligram matters. The accuracy of a measurement thus
depends on the smallest unit that the measuring instrument can record.
Many other factors also influence accuracy and cause errors, and these need to be
understood. For this reason it is difficult to achieve perfect accuracy, even in
the different fields of science. In statistical measurement, a reasonable degree
of accuracy is enough; what is reasonable depends on the nature of the data, the
way and purpose for which they are used, and the cost of obtaining them.
Statistical surveys generally do not need a very high degree of accuracy. For
instance, if the population of a given country is 1,004,601, it is more
meaningful for most purposes to state it as 1 million. The round figure is easier
to grasp, and the exact figure adds little clarity.
Similarly, the annual sales of polyester and cotton cloth in a retail shop are
better described by the ratio 3:2 than by the actual figures, which may be
₹ 60,340 for polyester and ₹ 40,105 for cotton cloth.
The accuracy required depends on the situation under consideration. For a
blacksmith weighing iron, a few grams are irrelevant; when it comes to gold,
however, each measure counts
Rounding to significant places means retaining only those digits of a number
that are relevant to the accuracy required.
For example, the number 3.4752 can be rounded off by dropping the digits 752.
Since the dropped part is more than half a unit of the last retained place, the
rounded-off number is 3.5. Similarly, the number 2,23,490 becomes 2,23,500 when
rounded off to the nearest hundred.
Significant figures are the digits that convey real information and are
accurate. Now look at the following Example.
Example 11.1
Errors in Statistics
Error is an important term in statistics. It is defined as the difference between
the true value and the estimated value of a specific item. Errors arise because
estimates are based on sample observations and because the methods of
calculation use approximated or rounded figures.
For example, suppose you are to find the percentage of nitrogen present in a
fertilizer. The samples you collect would come from different parts of the lot,
from different fertilizer mixes and from different days. Each sample is then sent
to a laboratory, where different methods may be applied to analyse it. There are
high chances of slight variations in the composition of the mixture due to
environmental conditions such as temperature and humidity. Variation between the
different samples results in sampling errors. Differences that arise from the
methods of analysis also affect the final results. Errors can also arise in the
process of measurement; these are called errors of observation.
Surveys and experiments are thus subject to different kinds of errors, such as:
(i) Sampling errors
(ii) Analytical errors
(iii) Errors due to observations and measurements
Errors should not be confused with the mistakes that happen in compiling data.
An arithmetical miscalculation, for example, is a mistake and not an error.
Because statistics deals with approximate or estimated values, errors are
inevitable. Errors cannot be eliminated, but they can be minimized. Mistakes, on
the other hand, can be completely eliminated.
Sources of Errors
Following are the sources of errors:
1. Errors of origin: When variables such as height, distance and weight are
measured, complete precision cannot be achieved because of the
limitations of the measuring instruments. For this reason some
difference between the actual value and the measured value cannot be
eliminated. Measurement is also incorrect when unsuitable statistical
units are used. Another possibility is incorrect information from the
source: a person may be biased in supplying information, which results
in errors in measurement. These errors are referred to as errors of
origin. They increase as the number of observations increases.
2. Errors of inadequacy: The sample taken in any enquiry should represent
the population. If the sample is too small or is not representative of
the population, errors result. These are referred to as errors of
inadequacy.
3. Errors of manipulation: Errors may be made unconsciously by the
investigator in classifying and counting the objects. These errors,
along with those of approximation, are referred to as errors of
manipulation.
Following are the three types of errors that happen during statistical investigation:
Errors of Approximation
Statistical figures are often rounded off for convenience. The accuracy of a
rounded-off figure can be stated in the following ways:
1. Stating the unit to which the figure is rounded. For example, 4,672.4 is
approximated to 5,000 when rounded to the nearest thousand, and to 4,672
when rounded to the nearest whole number.
2. Using the + and – signs to indicate the approximation in absolute terms,
e.g. 5,000 ± 500. This means that the actual value lies within 500 of
5,000, i.e. it may be up to 500 more or less than 5,000.
3. Using the + and – signs to indicate the proportion of error, e.g.
5,000 ± 0.1. This means that the error can be up to 0.1 of 5,000, so the
actual value may be up to 500 more or less than 5,000.
4. Using a percentage, which is similar to the third case above. For
example, 5,000 ± 10% means that the error is up to 10% of 5,000.
5. Stating the number of significant figures to which the figure is correct.
For example, 4,672.4 is correct to five significant figures.
The ± symbol is thus useful for indicating the degree of approximation or error.
It denotes the limits of error. The possible error can be defined as the limits
within which the actual error lies.
Take, for instance, 5,000 ± 500. If a figure is rounded off to the closest
thousand, the maximum error is 500: +500 at the upper limit and –500 at the
lower limit. The error is therefore written as ± 500. Similarly, when a figure
is rounded off to the closest hundred or the closest ten, the maximum error is
± 50 and ± 5 respectively.
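The limits of error described above can be illustrated with a short Python sketch. The function names are illustrative assumptions and not part of any standard library; the sketch assumes only that a figure rounded to the nearest unit can be wrong by at most half of that unit.

```python
def max_rounding_error(unit):
    """Largest possible absolute error when rounding to the nearest `unit`."""
    return unit / 2


def limits(estimate, absolute_error):
    """Lower and upper limits within which the actual value lies."""
    return estimate - absolute_error, estimate + absolute_error


def limits_from_percent(estimate, percent):
    """Limits when the error is stated as a percentage of the estimate."""
    return limits(estimate, estimate * percent / 100)


# Rounding to the nearest thousand, hundred and ten
for unit in (1000, 100, 10):
    print(f"rounded to nearest {unit}: error is ± {max_rounding_error(unit):g}")

print(limits(5000, 500))              # limits for 5,000 ± 500
print(limits_from_percent(5000, 10))  # limits for 5,000 ± 10%
```

Both calls at the end describe the same interval, which is why the absolute and the percentage notations are interchangeable for a given estimate.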
An absolute error can be either positive or negative. When the figure is
5,000 ± 500, the maximum absolute error is 500 in either direction. If
the true value is greater than the estimated value, the error is
positive; if it is less than the estimated value, the error is negative.
For instance, suppose a state has a population of 2,71,70,314 and its
capital has a population of 26,39,766. Approximated to the nearest lakh,
the population of the state is 272 lakhs and that of its capital is
26 lakhs. The approximated value in the first case is more than the true
value, and in the latter case it is less than the true value.
For the state population
A.E. = True Value – Approximated Value = 2,71,70,314 – 2,72,00,000 =
– 29,686
For the state capital
A.E. = True Value – Approximated Value = 26,39,766 – 26,00,000 =
39,766
In the first case the A.E. is negative and in the second case it is positive.
2. Relative Error: The absolute errors in the two cases are of similar size,
even though the population of the state is about ten times that of the
capital. The absolute error alone therefore does not show which of the
two errors is more significant. To judge this, the error has to be
expressed as a fraction of either the true value or the approximated
value. This is where the relative error becomes useful. The relative
error (RE) is the ratio of the absolute error to the estimated or
approximated value.
This can be expressed as follows:
Relative Error (RE) = Absolute Error (AE) ÷ Corresponding
Approximated Value
Now let us take the example used for absolute error and estimate the
relative error in approximating the populations of the state and its
capital.
RE in approximating population = –29,686 ÷ 2,72,00,000 = – 0.0011
RE in approximating capital = 39,766 ÷ 26,00,000 = 0.0153
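The two calculations above can be checked with a minimal Python sketch. The function names are illustrative assumptions; the definitions follow the text, with the absolute error taken as true value minus approximated value and the relative error taken on the approximated value.

```python
def absolute_error(true_value, approx_value):
    """A.E. = True Value - Approximated Value (the sign is kept)."""
    return true_value - approx_value


def relative_error(true_value, approx_value):
    """R.E. = A.E. / Approximated Value."""
    return absolute_error(true_value, approx_value) / approx_value


# Figures from the text (Indian digit grouping removed)
cases = {
    "state": (27_170_314, 27_200_000),    # 2,71,70,314 approximated to 272 lakhs
    "capital": (2_639_766, 2_600_000),    # 26,39,766 approximated to 26 lakhs
}

for name, (true_value, approx_value) in cases.items():
    ae = absolute_error(true_value, approx_value)
    re = relative_error(true_value, approx_value)
    print(f"{name}: AE = {ae}, RE = {re:.4f}, percentage error = {re * 100:.2f}%")
```

Although the two absolute errors are of similar size, the relative error for the capital is many times larger, because its base is much smaller.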
Example 11.2
Find the relative and percentage errors when the figure 2,234.752 is rounded
to the
1. Closest two digits after the decimal point
2. Closest whole number
3. Closest hundred
4. Closest thousand
Solution:
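One way to work through these four cases is sketched below in Python. It assumes the definitions used earlier in this unit (absolute error = true value minus rounded value, relative error = absolute error ÷ rounded value) and assumes that halves are rounded up; the helper names are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

TRUE_VALUE = Decimal("2234.752")

# Rounding unit for each of the four cases in the example
CASES = {
    "two decimal places": Decimal("0.01"),
    "whole number": Decimal("1"),
    "hundred": Decimal("100"),
    "thousand": Decimal("1000"),
}


def round_to_unit(value, unit):
    """Round `value` to the nearest multiple of `unit` (halves round up)."""
    return (value / unit).quantize(Decimal("1"), rounding=ROUND_HALF_UP) * unit


def errors(value, unit):
    """Return rounded value, absolute error, relative error, percentage error."""
    rounded = round_to_unit(value, unit)
    absolute = value - rounded
    relative = absolute / rounded
    return rounded, absolute, relative, relative * 100


for label, unit in CASES.items():
    rounded, ae, re, pe = errors(TRUE_VALUE, unit)
    print(f"{label:>18}: rounded = {rounded}, AE = {ae}, "
          f"RE = {re:.6f}, PE = {pe:.4f}%")
```

Decimal arithmetic is used here so that the small differences are not blurred by binary floating-point representation.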
A careful study of the above Example leads to the following observations:
1. The maximum absolute error increases as the order of rounding increases,
that is, as more digits are left out.
2. The relative error also increases significantly as the order of rounding
increases.
A higher order of rounding therefore means lower precision.
Effect of Addition
The absolute error of a sum is equal to the sum of the absolute errors of its
components. For example, suppose we add 500 (correct to the closest 10) and 400
(correct to the closest 100). This is written as follows:
(500 ± 5) + (400 ± 50) = 900 ± 55
This can be explained in more detail as follows:
Effect of Subtraction
The absolute error of a difference is also equal to the sum of the absolute
errors of its components. For instance, suppose 400 (correct to the closest 100)
is subtracted from 500 (correct to the closest 10). The difference is
500 – 400 = 100, and its error is 5 + 50 = 55. This is written as follows:
(500 ± 5) – (400 ± 50) = 100 ± 55
The maximum error occurs when the greater figure is at its highest value and the
lower figure is at its lowest value, or the other way round. The absolute error
of the difference is calculated in the following manner:
The absolute error of the difference is thus ± 55 (i.e., 5 + 50), the relative
error is ± 0.55 (± 55/100), and the percentage error is ± 55% (± 0.55 × 100).
Comparing the errors for addition and subtraction, the absolute errors are
equal, but the relative error for subtraction is larger than that for addition
(± 55/900, or about ± 6.1%), because the base is smaller. In both addition and
subtraction, the absolute error of the result is the sum of the absolute errors
of the components.
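The rules for addition and subtraction can be put side by side in a small Python sketch. The function names are illustrative assumptions; the figures 500 ± 5 and 400 ± 50 are those used in the text.

```python
def add_with_error(a, err_a, b, err_b):
    """Sum of two approximate figures; the absolute errors add up."""
    return a + b, err_a + err_b


def subtract_with_error(a, err_a, b, err_b):
    """Difference of two approximate figures; the absolute errors still add up."""
    return a - b, err_a + err_b


for name, operation in (("sum", add_with_error),
                        ("difference", subtract_with_error)):
    value, error = operation(500, 5, 400, 50)
    relative = error / value
    print(f"{name}: {value} ± {error}, relative error ± {relative:.4f} "
          f"(± {relative * 100:.1f}%)")
```

The absolute error is the same in both cases, while the relative error of the difference is much larger because it is measured against the smaller base of 100.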
Effect of Multiplication
The relative error of a product is approximately equal to the sum of the relative
errors of its components. Suppose we multiply 500 (correct to the closest 10) by
40 (correct to the closest unit). The absolute error in 500 is ± 5 and its
relative error is ± 1%. The absolute error in 40 is ± 0.5 and its relative error
is ± 1.25%. The product of 500 and 40 is 20,000.
This can be written as follows:
(500 ± 1%) × (40 ± 1.25%) = 20,000 ± 2.25%
Here the relative error ± 2.25% is the sum of ± 1% and ± 1.25%. This can be
explained further as follows.
The maximum value of the product is:
(500 + 5) × (40 + 0.5) = (500 × 40) + (500 × 0.5) + (5 × 40) + (5 × 0.5) (a)
Similarly, the minimum value of the product is:
(500 – 5) × (40 – 0.5) = (500 × 40) – (5 × 40) – (0.5 × 500) + (5 × 0.5) (b)
When the errors are small, the product of the two errors, i.e. the term
(5 × 0.5) in (a) and (b), is small enough to be ignored. The absolute error in
the product 500 × 40 = 20,000 is then (5 × 40) + (0.5 × 500), which is equal to
450. The relative error is therefore 450/20,000 = 0.0225, and the percentage
error is 2.25%.
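The approximation used above can be compared with the exact extremes in a few lines of Python. This is only a sketch; the function name is an assumption, and the product of the two errors is ignored exactly as in the text.

```python
def product_with_error(a, err_a, b, err_b):
    """Product, approximate absolute error and relative error.

    The term err_a * err_b is ignored because it is small.
    """
    product = a * b
    absolute = a * err_b + b * err_a
    return product, absolute, absolute / product


product, absolute, relative = product_with_error(500, 5, 40, 0.5)
print(f"product = {product}, absolute error = ± {absolute}, "
      f"relative error = ± {relative * 100:.2f}%")

# Exact minimum and maximum of the product, for comparison
minimum = (500 - 5) * (40 - 0.5)
maximum = (500 + 5) * (40 + 0.5)
print(f"exact range: {minimum} to {maximum}")
```

The exact range differs from 20,000 ± 450 only by the ignored term 5 × 0.5 = 2.5 at each end.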
Effect of Division
The relative error of a quotient is approximately equal to the sum of the
relative errors of its components. This can be understood in the same way as for
multiplication. Consider 500/40 = 12.5. With the relative errors, this is
written as follows:
(500 ± 1%)/(40 ± 1.25%) = 12.5 ± 2.25%
The relative error in the quotient, 2.25%, is the sum of the two relative errors
1% and 1.25%. To see this, find the smallest and the largest values of the
quotient and how much each differs from 500/40, i.e. 12.5. The smallest value is
obtained when the smallest value of the numerator (500 – 1%) is divided by the
largest value of the denominator (40 + 1.25%).
Now, 500 – 1% = 500 – 5 = 495, and 40 + 1.25% = 40 + 0.5 = 40.5
The smallest value of the quotient is therefore 495/40.5 = 12.22. The difference
between 12.50 and 12.22, i.e. 0.28, is the absolute error. The relative error is
then:
0.28/12.5 × 100 = 2.24% or approximately 1% + 1.25% which is the sum of
the relative errors in two numbers.
Next, find the largest value of the quotient and how much it exceeds
500/40 = 12.5. This is:
(500 + 5)/(40 – 0.5) – 12.5 = 505/39.5 – 12.5 = 0.28. This difference is the same
as the former difference. Thus the relative error of a quotient is
approximately equal to the sum of the relative errors of its components.
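The smallest and largest values of the quotient can be verified with a brief Python sketch. The function name is an illustrative assumption; the figures are those of the text.

```python
def quotient_limits(a, err_a, b, err_b):
    """Smallest, central and largest values of a / b for approximate a and b."""
    smallest = (a - err_a) / (b + err_b)  # smallest numerator, largest denominator
    largest = (a + err_a) / (b - err_b)   # largest numerator, smallest denominator
    return smallest, a / b, largest


smallest, central, largest = quotient_limits(500, 5, 40, 0.5)
sum_of_relative_errors = 5 / 500 + 0.5 / 40

print(f"smallest = {smallest:.2f}, central = {central}, largest = {largest:.2f}")
print(f"shortfall = {central - smallest:.2f}, excess = {largest - central:.2f}")
print(f"sum of relative errors = {sum_of_relative_errors * 100:.2f}%")
```

The shortfall and the excess both round to 0.28, which as a fraction of 12.5 is close to the 2.25% obtained by adding the two relative errors.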
Biased Errors
Biased errors are errors in one direction: the sum of the estimated figures is
either larger or smaller than the sum of the actual figures.
Suppose all the numbers are rounded down. The result is a biased error, because
every rounded figure is below its true value.
For example, 14 is rounded down to 10, the figure 132 to 100, and the figure
5,396 to 5,000.
The errors are all in one direction: +4, +32 and +396. The total error, i.e. the
difference between the true sum 14 + 132 + 5,396 = 5,542 and the rounded sum
10 + 100 + 5,000 = 5,110, is the sum of the errors 4 + 32 + 396 = 432, which
agrees with 5,542 – 5,110. Errors of this kind are cumulative, and for this
reason they are also called cumulative errors.
Biased errors also arise when data are collected, due to the bias of persons or
of instruments.
Respondents may understate or overstate the facts because of personal bias.
Another example is measuring cloth with a metre rod that is shorter than the
true length. In both cases the result is biased or cumulative errors. Rounding
all numbers up, or all numbers down, has the same effect. However, when figures
are rounded to the closest digit, this kind of error does not arise. With a
large number of observations, about half of the figures are likely to be raised
and the rest lowered, so the errors in the total tend to cancel out.
Unbiased Errors
Errors that tend to cancel out are referred to as compensating errors or
unbiased errors.
For instance, suppose the six numbers 21, 22, 24, 26, 27 and 28 are rounded off
to the nearest ten. The first three figures are approximated to 20 each, giving
a total error of +7.
The other three figures, 26, 27 and 28, are approximated to 30 each, giving a
total error of –9. The total error in the sum of all six figures is then
7 – 9 = –2 only. With unbiased errors, the approximated value is less than the
true value in some cases and more in others. The errors are therefore both
positive and negative, and they largely cancel each other out. The larger the
number of observations, the smaller the unbiased error will be in relative
terms.
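The contrast between biased and unbiased errors can be reproduced with the figures from the text. The Python sketch below is illustrative only; the helper names are assumptions, and the error is taken as true value minus rounded value, as elsewhere in this unit.

```python
def round_down_to_leading_digit(value):
    """Keep only the leading digit, e.g. 14 -> 10, 132 -> 100, 5396 -> 5000."""
    magnitude = 10 ** (len(str(value)) - 1)
    return (value // magnitude) * magnitude


def round_to_nearest_ten(value):
    """Round a non-negative whole number to the nearest ten."""
    return ((value + 5) // 10) * 10


def total_error(values, rounding):
    """Sum of (true value - rounded value) over all the values."""
    return sum(value - rounding(value) for value in values)


biased = total_error([14, 132, 5396], round_down_to_leading_digit)
unbiased = total_error([21, 22, 24, 26, 27, 28], round_to_nearest_ten)

print(f"always rounding down: total error = {biased}")
print(f"rounding to nearest ten: total error = {unbiased}")
```

Rounding every figure down makes all the individual errors positive, so they accumulate; rounding to the nearest ten produces errors of both signs, so most of them cancel.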
Example 11.3
Sampling Errors
These are the errors that result from drawing inferences about the population on
the basis of samples. Sampling errors may occur because of bias in the selection
of sample units. They also occur because the study is based on only a part of
the population. When the entire population is studied, sampling errors are
eliminated. When more than one sample is drawn by the random sampling method,
the results of the samples differ from one another and from the result for the
population, because the items selected in each sample are different. Thus,
sampling error means precisely the difference between the sample result and
that of the population when both results are obtained by the same procedure or
method of calculation. For this reason the exact amount of sampling error
differs from sample to sample. Sampling errors cannot be completely eliminated
or avoided, but they can be minimized by a systematic survey.
Sampling errors are of two types:
(i) Biased sampling errors
(ii) Unbiased sampling errors
Biased Sampling Errors: These arise when the values of the statistics obtained
from the survey deviate in one direction only, so that they do not cancel out.
They are caused by factors such as bias in the selection of units, faulty data
collection and bias in analysis. For example, biased sampling errors are more
likely when the sample units are selected by a deliberate sampling method
instead of a random sampling method. When information is difficult to obtain
from some of the sampling units included in the random selection, the
investigator is likely to substitute other units of the population. This leads
to bias if the substitute units are not selected randomly. Sometimes, when
information is missing, the investigator proceeds with only the information that
remains; this too results in bias. The information itself may be biased if the
respondent wishes to conceal some facts from the investigator. Any error that is
consistent results in bias. Bias can also arise from improper data collection
instruments and from an incompetent investigator. Limitations in the coding, the
collection procedure and the methods of analysis also result in bias. Biased
sampling errors increase with the number of observations; they are cumulative
in nature.
Unbiased Sampling Errors: These errors arise from chance differences between the
units of the population included in the sample and those not included. Errors
due to chance are called unbiased sampling errors. They are not due to any form
of bias. They do not accumulate as the number of observations increases; on the
contrary, they tend to be neutralized as the number of observations increases.
For this reason they are often referred to as compensating errors or
non-cumulative errors.
Thus, the total sampling error comprises both biased and unbiased errors. The
primary objective of the statistical method in any survey is to design sampling
schemes so that biased errors are removed as far as possible and unbiased errors
are reduced to the minimum.
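How unbiased sampling errors vary from sample to sample, and how they shrink as the sample grows, can be seen in a small simulation. The Python sketch below uses a made-up population of 5,000 random values (an assumption for illustration, not data from any survey) and takes the sampling error as the difference between the sample mean and the population mean.

```python
import random


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def average_sampling_error(population, sample_size, trials, rng):
    """Average absolute difference between sample mean and population mean."""
    population_mean = mean(population)
    errors = []
    for _ in range(trials):
        sample = rng.sample(population, sample_size)  # simple random sample
        errors.append(abs(mean(sample) - population_mean))
    return mean(errors)


rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.randint(1, 100) for _ in range(5000)]  # hypothetical values

small_error = average_sampling_error(population, 10, 200, rng)
large_error = average_sampling_error(population, 1000, 200, rng)

print(f"average sampling error with samples of 10:   {small_error:.2f}")
print(f"average sampling error with samples of 1000: {large_error:.2f}")
```

Because the samples are drawn at random, the errors are of both signs and differ between samples; increasing the sample size makes them smaller on average but does not remove them.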
Non-sampling Errors
These can occur in complete enumeration as well as in sampling. Non-sampling
errors include mistakes and biases. They are not chance errors.
Most of the factors that cause bias in sampling, described earlier, also cause
bias in complete enumeration. They also include lack of information, careless
definition of the population, a vague idea of the information sought,
inefficient methods of interview and so on. Mistakes arise from improper coding,
errors in computation and mistakes in processing. Non-sampling errors are
attributable to one or more of the following reasons:
(i) Improper and ambiguous data specifications that are inconsistent with
the census or survey objectives.
(ii) Inappropriate methods of sampling, incomplete questionnaires and
incorrect interviewing.
(iii) Personal bias of investigators or informants.
(iv) Unavailability of trained and qualified investigators.
(v) Errors in compilation and tabulation.
These are only some of the many possible reasons.
The total error is the sum of the sampling errors and the non-sampling errors.
The objective of any survey is to minimize both. Non-sampling errors can be
controlled by defining the population precisely, designing the questionnaire
carefully and pre-testing it, training the investigators, and checking and
monitoring each step. However, this is practicable only with a small number of
items; otherwise it becomes time-consuming and very costly. On the other hand,
when the sample is small, sampling errors increase. When planning a survey, it
is therefore essential to allocate the limited resources of manpower, capital
and time in such a manner that sampling and non-sampling errors together are
minimized and the maximum level of accuracy is achieved.
11.4 SUMMARY
• When addition and subtraction are carried out with rounded figures, the
answer obtained cannot be more accurate than the least accurate of the
figures.
• The relative error of a quotient is approximately equal to the sum of the
relative errors of its components.
2. Sampling errors result from drawing inferences about the population on
the basis of samples.
3. Total errors include the sum of sampling errors and non-sampling errors.
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
290
Ratios, Percentages and
Rates
Objectives
After going through this unit, you will be able to:
• Discuss ratio, percentages and rates
• Analyse the various statistical derivatives
• Explain the differences between ratios and percentages
• Assess the purpose of statistical derivatives
Structure
12.1 Introduction
12.2 Meaning of Various Statistical Derivatives
12.3 Purpose of Statistical Derivatives
12.4 Summary
12.5 Key Words
12.6 Answers to ‘Check Your Progress’
12.7 Self-Assessment Questions
12.8 Further Readings
12.1 INTRODUCTION
This unit explains percentages, ratios and rates, and the computational aspects
involved in calculating them. You will also learn the precautions to be taken
when working out percentages, rates and ratios in administration and business,
and the use of logarithms in computations that involve roots, multiplications
and divisions.
Collecting reasonable data is not enough for drawing the desired conclusions;
the figures should be able to speak. The data collected should be analysed and
compared, and presented in a meaningful form that helps in reaching conclusions.
Quantitative data should be condensed, but in a manner that keeps them
meaningful, so that they are easy to interpret and to understand. Statistical
derivatives are a good means of doing this. The simplest derivatives are rates,
percentages, ratios, etc. They measure the relationship between factors and so
allow a better interpretation. This unit will also help you learn about the
utility of logarithms in making extensive computations.
The simplest statistical derivatives are of three kinds: ratios, percentages and
rates.
Ratio
A ratio expresses the relation between two quantities of the same kind and
denotes the number of times one is contained in the other. For two quantities A
and B, the ratio is written as A:B and read as 'A is to B'. In this ratio A is
the antecedent and B is the consequent. Another form of expression is A/B. The
ratio between the two is thus a division, A divided by B. The division may be
implied or actually carried out; for convenience, however, a ratio is usually
not left as an unreduced division.
For example, if male and female workers are compared, a figure such as 550/110
is hard to grasp. A better representation of the same is 5:1, which is easy to
understand. A ratio is reduced in this way to make the figures smaller and
easier to grasp. When the two terms of a ratio are interchanged, the ratio
obtained is the inverse ratio of the first. The inverse of A:B is B:A.
Let us now take another example of 80 students in a class, of whom 50 are boys
and 30 are girls. The ratio of boys to girls is 5:3, and the inverse ratio, of
girls to boys, is 3:5. Note that one ratio is greater than the other. A ratio
can be greater than, less than or equal to another, depending on the situation.
A ratio is the relation of one quantity or number to another, expressed as the
quotient of the first divided by the second. However, a ratio may be extended to
more than two numbers. Three or more numbers can be compared and expressed as
A:B:C:D.
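Reducing a ratio to its simplest terms amounts to dividing every term by the greatest common divisor. A minimal Python sketch, with illustrative function names, is given below using the figures from the examples above.

```python
from functools import reduce
from math import gcd


def simplify_ratio(*terms):
    """Divide every term of a ratio by the greatest common divisor of all terms."""
    divisor = reduce(gcd, terms)
    return tuple(term // divisor for term in terms)


def inverse_ratio(a, b):
    """Interchange the two terms of a simplified ratio."""
    return tuple(reversed(simplify_ratio(a, b)))


def show(ratio):
    """Format a ratio in the A:B form used in the text."""
    return ":".join(str(term) for term in ratio)


print(show(simplify_ratio(550, 110)))  # the two groups of workers
print(show(simplify_ratio(50, 30)))    # boys to girls
print(show(inverse_ratio(50, 30)))     # girls to boys
```

The same function accepts three or more terms, so a comparison of the form A:B:C:D can be reduced in exactly the same way.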
Take the example of a class of 100 law students, of whom 70 come from commerce,
20 from science and 10 from arts. The comparison between these streams is
expressed as 7:2:1. As the number of categories increases, the proportion is a
more convenient derivative, as it is easier and less confusing. If a total of N
items is divided into three categories as
Percentage
A percentage is simply a proportion expressed on a base of 100. To calculate a
percentage we multiply a proportion by 100. Proportions are special kinds of
ratios in which the denominator is the total and the numerator is a subpart of
the total. A proportion tells us what part the numerator is of the total. Thus,
while the ratio of females to males in a city is 1.06, females represent a
proportion of 0.515 (51.5 per cent) of the total.
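The step from the ratio 1.06 to the proportion 0.515 can be sketched in Python as follows. The function names are illustrative; the sketch assumes that the total consists only of the two groups being compared.

```python
def proportion_from_ratio(ratio):
    """Share of the first group in the total, given the ratio first:second."""
    return ratio / (1 + ratio)


def percentage(proportion):
    """A percentage is a proportion multiplied by 100."""
    return proportion * 100


females_to_males = 1.06  # ratio quoted in the text
share = proportion_from_ratio(females_to_males)
print(f"proportion of females = {share:.3f}")
print(f"percentage of females = {percentage(share):.1f}%")
```

For every 1 male there are 1.06 females, so the total is 2.06 parts, of which females form 1.06.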
Rate
When two quantities of the same kind are compared, the result is a ratio. For
example, the male and female workers in a factory are both workers, so they are
of the same kind. When the two quantities are of different kinds, the result is
a rate. In per capita income, for example, the total income is the numerator and
the total population is the denominator. Other examples are the accident rate,
the death rate and the birth rate. Rates deal with quantities that vary; they
are dynamic and are related to time. A rate of change is a quotient in which the
change in the numerator is related to the denominator. Thus a rate is a
standardized relation to the denominator. When a number is divided by a related
number and the quotient is multiplied by 1,000, the resulting figure is the rate
per thousand.
For example, if the number of deaths is divided by the entire population and the
quotient is multiplied by 1,000, the death rate is obtained. The coefficient is
the rate per unit. If the death rate is 1.9%, or 19 per thousand, the
coefficient of death is 0.019. If this is multiplied by the entire population,
the result is the total number of deaths.
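The relation between a rate per thousand, a coefficient and the underlying counts can be shown with a short Python sketch. The population of 1,00,000 and the 1,900 deaths are hypothetical figures chosen only to reproduce the rate of 19 per thousand mentioned above; the function names are likewise assumptions.

```python
def rate_per(events, population, base=1000):
    """Number of events per `base` units of population."""
    return events * base / population


def coefficient(events, population):
    """Rate per unit of population."""
    return events / population


deaths = 1_900        # hypothetical number of deaths
population = 100_000  # hypothetical population

print(f"death rate = {rate_per(deaths, population):g} per thousand")
print(f"coefficient of death = {coefficient(deaths, population)}")
print(f"total deaths = {round(coefficient(deaths, population) * population)}")
```

Changing `base` to 100 gives the same rate as a percentage, which is why 19 per thousand and 1.9% describe the same death rate.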
The primary purpose of statistical derivatives is comparison. Percentages,
coefficients and ratios all give a relative picture. When numbers are compared
with each other, the standard figure with which the others are compared is the
base. Which figure should be selected as the base depends on the given
situation. A derivative is not meaningful by itself for analysing a given
problem. If a company has earned an 18% ROI in the current year, the question is
whether this is a high ROI or not. Derivatives are meaningful only when they are
compared. The 18% return can be compared with that of the previous year or with
the figures of competing firms, provided they are comparable.
Types of Ratios
Statistical work involves several kinds of ratios. A ratio depends on its base:
when numbers are compared, the figure with which the others are compared becomes
the base. It is therefore important to know the different kinds of ratios.
for the purpose of the sex ratio, which is expressed as females per 1,000 males and
not as females per 1,000 population.
Time Ratio
A time ratio is a measure that expresses the change in a series of values arranged
in a time sequence. It is usually expressed as a percentage and is also referred to
as a past to present ratio.
There are two important classes of time ratios:
(i) Those involving a fixed base period, and (ii) those involving a moving base.
Take, for instance, the tea production of the current year.
Under the fixed base method, a particular year, say 1980, serves as the base year,
and the current year's production is compared with the production of the base year.
Under the moving base method, the base varies. When the current year is under
study, the previous year's production is used as the base. When the next year is
under study, the current year becomes the base. These calculations therefore express
comparisons between the data of two consecutive time periods.
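The two classes of time ratios can be compared with a short Python sketch. The production figures are hypothetical (the text gives none), and the function names `fixed_base_ratios` and `moving_base_ratios` are assumptions made for illustration.

```python
def fixed_base_ratios(series, base_year):
    """Each year's value as a percentage of the value in one fixed base year."""
    base = series[base_year]
    return {year: value * 100 / base for year, value in series.items()}


def moving_base_ratios(series):
    """Each year's value as a percentage of the preceding year's value."""
    years = sorted(series)
    return {
        current: series[current] * 100 / series[previous]
        for previous, current in zip(years, years[1:])
    }


# Hypothetical tea production by year (units are arbitrary).
production = {1980: 400, 1981: 440, 1982: 484, 1983: 460}

fixed = fixed_base_ratios(production, base_year=1980)
moving = moving_base_ratios(production)

for year in sorted(production):
    print(year, fixed[year], moving.get(year))
```

With the fixed base, every year is read against 1980. With the moving base, 1983 is read against 1982, so the first year has no moving-base ratio.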
Hybrid Ratio
A hybrid ratio compares corresponding quantities that belong to different
categories of data, so the numerator and the denominator are expressed in different
units. For example, the statement that a car is travelling at 30 mph (miles per
hour) is a hybrid ratio.
The number of miles is divided by the number of hours, so the statement involves
two units, miles and hours.
Other examples of hybrid ratios are per capita income, persons per square
kilometre, output per hour or per day, cost per passenger mile, number of children
per family, investment per mile, and so on.
Hybrid ratios are stated on a per unit basis rather than as percentages. This is
because the numerator and the denominator of such a ratio belong to different
categories. For this reason, hybrid ratios can also be referred to as rates.
Computation of Ratios
Variables to be Related: There must be a clear relationship between the
numerator and the denominator. For example, if you are interested in computing
the earnings of a company in the current year, the current year's investment must
be considered and not the investment at the time of its inception. Another example
is agricultural production per acre. In this ratio, agricultural production per acre
of land cultivated is more meaningful than agricultural production per acre of total
land (which includes wastelands, forests, deserts, and so on).
Choice of Base: The base or denominator of a statistical ratio is always a standard
with which the numerator is being compared. Through a ratio we establish a
relationship between two items, so it is important to decide which of the two items
is to be used as the base. Sometimes the choice of the base is obvious, while in
other cases it is not. However, certain generalizations can be made about the choice
of the base.
(i) In a comparison between a part and the whole, the whole is always the
base. For example, in relating the number of unemployed to the total
labour force, the number of persons in the labour force would be the
denominator of the ratio.
(ii) In time comparisons between similar items (time ratios), the earlier
period is invariably taken as the base. For instance, in computing the
percentage change of the current year's sales over the previous year, you
should take the previous year's sales as the base.
(iii) If the relationship is to be studied between two variables, one of which
may be dependent upon the other, then the independent variable is
generally used as the base of comparison. For example, in relating the
number of accidents to total passenger miles, the latter would generally
be taken as the base of comparison.
Application of Ratios
Ratios, rates and coefficients are used in all kinds of studies. Per capita income,
population per square kilometre, production per acre, turnover ratio, fixed assets
ratio, intelligence quotient, freight revenue per mile, investment per mile, labour
to output ratio, capital output ratio, and so on are examples of popular ratios.
Some details of commonly used ratios are given below.
Each of these ratios can be refined, and such ratios are called refined ratios. A
refined ratio is one in which the numerator or the denominator or both are adjusted
to exclude the extraneous factors which tend to obscure the direct relationship
between them. For example, the ratio of labour cost in a factory to the total cost
of manufacture is a useful ratio. However, the denominator contains two types of
cost, fixed cost and variable cost. The ratio of labour cost to total variable cost
is more meaningful to the management in analysing the operations. A ratio may be
standardized by adjusting its component parts for better comparability with other
ratios. The use of standardized ratios is important in the field of vital statistics,
where standardized death rates, birth rates, and so on are used to compare different
cities or regions of the country. The computation of standardized rates involves the
concept of the weighted average and is, therefore, outside the scope of this unit.
by 125%. If any value declines by 100%, the result is a zero value. A decline of
more than 100% cannot occur with quantities like prices, wages, employment, and so
on, and if it does, it indicates an error.
For example, if a price of Rs 2,000 is reduced to Rs 800, the decline of Rs 1,200
may be computed as 150% of the final price. This is an incorrect statement because
the base is not correctly chosen. With the original price as the base, the decline
is 1,200/2,000, that is, 60%.
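The effect of the choice of base in the price example can be checked with a short Python sketch. The function name `percent_change` is an assumption made for illustration; it follows the rule that the earlier value is the base.

```python
def percent_change(earlier, later):
    """Percentage change from the earlier value to the later value,
    using the earlier value as the base."""
    if earlier == 0:
        raise ValueError("the base (earlier value) must not be zero")
    return (later - earlier) * 100 / earlier


original_price, reduced_price = 2000, 800

correct = percent_change(original_price, reduced_price)
wrong_base = (reduced_price - original_price) * 100 / reduced_price

print(correct)     # decline measured on the original price
print(wrong_base)  # misleading figure obtained with the final price as the base
```

A result below -100 from `percent_change` for a quantity such as a price is a signal that the wrong base was used or that the data contain an error.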
Distortions Caused by Small Bases: Consider another case where incorrect
conclusions can be drawn because of the distortions caused by small bases.
Considerable caution is to be exercised in the interpretation of these figures.
Clearly, a conclusion that the management of the firm is more efficient cannot be
justified. Since a percentage shows a relative magnitude only, no inference should
be drawn from it regarding the absolute amounts. In such a situation, a correct
picture can be obtained only if the absolute figures are presented.
Comparisons Based on Dissimilar Situations: The data should be homogeneous for
the computation and the use of ratios and percentages. Before one can draw
significant conclusions from a comparison, it is always necessary to see whether
the data analysed is comparable or not. Arithmetic mistakes involving misplaced
decimal points may lead to gross misinterpretations.
Improper Averaging: Averaging deserves some discussion as it is done in several
situations. To find the appropriate average it is necessary to know the number of
bolts produced by each machine. From the above discussion it is evident that the
computation of ratios and percentages must be done carefully, so that meaningful
conclusions can be drawn. Whenever possible, the data from which these ratios are
derived should also be given, so that the reader can verify the relationship,
identify any errors and make his or her own interpretation.
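The point about averaging percentages can be illustrated with a sketch in Python. The figures below are hypothetical, since the original worked example is not reproduced in this text; the sketch assumes two machines with different outputs and different percentages of defective pieces.

```python
# Hypothetical data: output and percentage of defective pieces per machine.
machines = [
    {"name": "A", "bolts": 1000, "defective_pct": 2.0},
    {"name": "B", "bolts": 4000, "defective_pct": 5.0},
]

# Simple average of the percentages: ignores how much each machine produced.
simple_average = sum(m["defective_pct"] for m in machines) / len(machines)

# Weighted average: each percentage is weighted by the machine's output.
total_bolts = sum(m["bolts"] for m in machines)
weighted_average = (
    sum(m["bolts"] * m["defective_pct"] for m in machines) / total_bolts
)

print(simple_average)    # 3.5
print(weighted_average)  # 4.4
```

The weighted figure is the overall percentage of defective pieces, which is why the output of each machine must be known before the percentages are averaged.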
1. What is a ratio?
12.4 SUMMARY
• Data collection alone is not enough for drawing the desired conclusions; the
figures should be able to speak. The collected data should be analysed and
compared, and should be presented in a meaningful form that helps in drawing
conclusions.
• A ratio expresses the relation between two similar quantities and denotes the
number of times one is contained in the other.
• Ratios can be converted into percentages by taking one figure as the base and
multiplying the quotient by 100.
• A rate is a quotient that relates the change in the numerator to the
denominator.
• When a number is divided by a related number and the quotient is multiplied by
1,000, the resulting figure is the rate per thousand.
• Percentages, coefficients and ratios all represent a relative picture.
• Derivatives are used to draw comparisons between different groups. The figures
are reduced to a common denominator so that the comparison is meaningful yet
simple.
• Statistical work involves several kinds of ratios. A ratio depends on its base:
when numbers are compared, the figure with which the others are compared
becomes the base.
• An interpart ratio relates one part of a total to another part of the same total.
The base here is one of the two parts, since the comparison is drawn between
two parts.
BLOCK-IV
COLLECTION, CLASSIFICATION AND PRESENTATION OF DATA
This block discusses the collection, classification and presentation of data. The
previous block covered the basic statistical concepts: their meaning, organizing a
statistical survey, accuracy and approximation of errors, and ratios, percentages and
rates. This block deals with the collection, classification, tabular presentation and
diagrammatic representation of data.
This block consists of three units.
The thirteenth unit discusses the collection and classification of data, including the
methods and techniques of data collection. Research begins by finding and collecting
data. The sources of the collected data are classified as primary and secondary.
Primary data is first-hand data, while secondary data is data that has been collected
from various other sources. The unit discusses this in detail.
The fourteenth unit is about tabular presentation. Data, once collected, is classified
and then presented in tables. Tables may have a single column, a single row or multiple
columns, depending upon the nature of the data. For example, numerical data is
presented in statistical tables, while a contingency table presents observed data. The
unit discusses the aspects of data presentation in detail.
The fifteenth unit explains the diagrammatic and graphic presentation of data. Data,
when it is classified into different segments, needs to be stored in such a manner that
it remains easy to retrieve. Diagrammatic and graphic representations are used for this
purpose. They help to compare and differentiate data across its different forms and
types. The different types of charts and diagrams are discussed in this unit.
Collection and
Classification of Data
Objectives
After going through this unit, you will be able to:
• Describe collection of data
• Assess the drafting of questionnaire
• Analyse the features of specimen questionnaire
• Discuss sampling and non-sampling errors
Structure
13.1 Introduction
13.2 Collection of Data
13.3 Classification of Data
13.4 Summary
13.5 Key Words
13.6 Answers to ‘Check Your Progress’
13.7 Self-Assessment Questions
13.8 Further Readings
13.1 INTRODUCTION
In this unit, you will learn about the methods and techniques of data collection.
Determining the sources of data is one of the most important steps in conducting
research. Sources of data can be classified into primary and secondary sources. A
primary source, also known as first-hand information, includes all the data that is
closest to the information or concept being examined. Primary data can be obtained
through observation or through direct communication with the persons associated
with the selected subject, by performing surveys or descriptive research. A
secondary source, also known as second-hand information, is data that relates to or
discusses information originally presented elsewhere. In this unit, you will also
learn about editing the data, which includes the correction of data.
Merits
(i) The first-hand information obtained by the investigator is bound to be
more reliable and accurate since the investigator can extract the
correct information by removing doubts, if any, in the minds of the
respondents regarding certain questions.
(ii) High response rate since the answers to various questions are obtained
on the spot.
(iii) It permits explanation of questions concerning difficult subject matter.
(iv) It permits evaluation of respondent, his circumstances and reliability.
(v) This method is useful where spontaneity of response is required.
(vi) It provides personal rapport which helps to overcome reluctance to
respond.
(vii) Where the investigator and informant talk face to face, it becomes
possible to explore questions in depth.
(viii) Information is collected promptly and there is no dribbling in.
Limitations
(i) This method is suitable only for intensive studies and not for extensive
enquiries.
(ii) This method is time-consuming and the investigation may have to be
spread over a long period.
(iii) This method is highly subjective in nature and the results of the enquiry
may be adversely affected by the personal biases, whims and
prejudices of the investigator.
(b) Telephone survey: Under this method, the investigator, instead of
presenting himself before the informants, contacts them on telephone and
collects information from them.
Merits
(i) The method is more convenient than personal interview.
(ii) This method is less time-consuming and can be applied even to
extensive fields of enquiries. Telephone survey has all the other merits
of personal interview.
Limitations
(i) This method excludes those who do not have a telephone as also
those who have unlisted telephones.
(ii) This method is also subjective in nature and personal bias, whim and
prejudices of the investigator may adversely affect the results of the
enquiry.
(c) Indirect personal interview: Under this method, instead of directly
approaching the informants, the investigator interviews several third persons
who are directly or indirectly concerned with the subject matter of the
enquiry and who are in possession of the requisite information. Such a
procedure is followed by the enquiry committees and commissions
appointed by the Government of India. The committee selects persons
known as witnesses and collects information from them by getting answers
to questions decided in advance. This method is highly suitable where direct
personal investigation is not practicable, either because the informants
are unwilling or reluctant to supply the information or because the information
desired is complex and the study in hand is extensive.
Merits
(i) This method is less costly and less time-consuming than the direct
personal investigation.
(ii) Under this method, the enquiry can be formulated and conducted
more effectively and efficiently as it is possible to obtain the views and
suggestions of the experts on the given problem.
Limitations
The success of this method depends upon:
(i) The representative character of the witnesses
(ii) The personal knowledge of the witnesses about the subject matter of
enquiry
(iii) The personal prejudices of the witnesses as regards definiteness in
stating what is wanted
(iv) The ability of the interviewer to extract information from the witnesses
by asking appropriate questions and cross-questions
(d) Information received through local agents: Under this method, the
information is not collected formally by the investigator, but local agents,
commonly known as correspondents, are appointed in different parts of the
area under investigation. These agents collect information in their areas and
transmit the same to the investigator. They apply their own judgement as to
the best method of obtaining information. This method is usually employed
by newspaper or periodical agencies which require information in different
fields such as economic trends, business, stock and share markets, sports,
politics, and so on.
Merits
(i) This method is very cheap and economical for extensive investigations.
(ii) The required information can be obtained expeditiously since only
rough estimates are required.
Limitations
Since the correspondents apply their own judgement about the method of
collecting the information, the results are often vitiated by the personal
prejudices and whims of the correspondents. The data so obtained are thus
not very reliable. This method is suitable only if the purpose of the
investigation is to obtain rough and approximate estimates. It is unsuited
where a high degree of accuracy is desired.
(e) Mailed questionnaire method: Under this method, the investigator
prepares a questionnaire containing a number of questions pertaining to the
field of enquiry. These questionnaires are sent by post to the informants
together with a polite covering letter explaining in detail the aims and
objectives of collecting the information, and requesting the respondents to
cooperate by furnishing the correct replies and returning the questionnaire
duly filled in. In order to ensure quick response, the return postage
expenses are usually borne by the investigator. This method is usually
adopted by the research workers, private individuals and non-official
agencies. The success of this method depends upon the proper drafting of
the questionnaire and the cooperation of the respondents.
Merits
(i) By this method, a large field of investigation may be covered at a very
low cost. In fact, this is the most economical method in terms of time,
money and manpower.
(iii) The success of the method depends upon the skill and efficiency of the
enumerators to collect the information as also on the efficiency and
wisdom with which the questionnaire is drafted.
16. Lastly, the questionnaire should be made attractive by a proper layout and
an appealing get up.
Specimen Questionnaire
This hypothetical study is adapted from a study developed by Deepak Mehendru in
India. Assume that this study involves 200 professors in New York area colleges
who are asked about their interest in buying automobiles. The basic objective of this
survey is to determine certain marketing trends among the population of professors
in the New York area regarding their automobile buying patterns. The trends are
based upon the following factors:
• The profile of the decision-maker who finally decides to buy a particular
type of car
• People around the decision-maker who influence the decision-making
process
• The factors affecting the selection of a particular dealer of cars
• People in the family who make or affect decisions regarding the maximum
budget that can be allocated for purchasing a car
• The effect of various options available in the car
• The image and reliability of the company that makes these cars
• The effect of heavy promotion on television about the utility of the car on
the decision maker
(For the sake of simplicity, it is assumed that the professors have only one car in
the family.)
The Questionnaire
1. General
Name................................................................................
Age...................................................................................
Sex.........M..........F............................................................
Marital Status ....... Married ....... Unmarried ...................
Number of members in the family
1–2...................
3–4...................
5–6...................
Over 6..............
Yearly income
Less than $30,000...................
$30,000–$39,999......................
$40,000–$49,999......................
$50,000 and more...................
2. What type of car do you own now?
.................American
.................Japanese
.................European
3. What size of car do you own?
.................Luxury
.................Mid-size
.................Compact
4. Did you buy this car new or used?
.................New....................Used
5. If you bought a used car, did you buy it from a dealer or a private party?
.................Dealer.................Private party
6. If you bought a new car, how long have you owned this car?
.................Number of years
7. If you bought a used car, how old is this car now?
..............Number of years
8. Price paid for the car..........New..........Used
9. Who influenced your decision to purchase the above brand of car? Indicate
if more than one.
...............Yourself ......................Your wife
...............Your children ...................... Your friend
...............Your neighbour ......................Your colleague
Others.................................................................................. .
10. Indicate as to who decided about the budget allocation for the car?
...............Yourself
...............Your spouse
.............. Family decision
11. If you bought your car from a dealer, then who influenced your decision
regarding the selection of a particular dealer?
...............Yourself
...............Your friend
...............Your colleague
...............Family decision
12. How did you come to know about this dealer?
...............TV commercial
...............Newspapers
...............Personal references
...............Others
13. Rank the following factors that affected the final decision at the time of
purchasing the car (A rank of 1 measures the most important factor, a rank
of 2 measures the second most important factor, and so on).
...............Very inconvenient without the car
...............Money was available
...............Reputation of car manufacturer
...............Discounts offered
...............Interest rate on financing
...............Guarantees and warranties offered
...........................Others
14. Did you make an extensive survey regarding price comparisons after you
decided to buy the particular car? ............ Yes......... No.
15. If you bought a used car, how did you learn about it?............ Newspapers
...............Friend ............... Others
16. In order of preference, what were the major reasons for buying a used
car?
Secondary Data
The chief sources of secondary data may be broadly classified into the following two
groups:
(i) Published sources
(ii) Unpublished sources
Published sources: There are a number of national organizations and international
agencies which collect and publish statistical data relating to business, trade, labour,
price, consumption, production, etc. These publications are useful sources of
secondary data. Some of these published sources are as follows:
1. Official publications of the Central and State Governments such as monthly
abstract of statistics, national income statistics, vital statistics of India, etc.
2. Publications of semi-government organizations, e.g., the Reserve Bank of
India bulletin
3. Publications of research institutions, e.g., the publications of the Indian
Council of Agricultural Research (I.C.A.R.), New Delhi
4. Publications of commercial and financial institutions, e.g., the publications of
the F.I.C.C.I.
5. Reports of various committees and commissions appointed by the
government, such as the Wanchoo Commission Report on Taxation
Correction in Data
When the researcher collects the data, it is in raw form and it needs to be edited,
organized and analyzed. The first step in the correction of data is to edit that data.
The edited data is then coded and inferences are drawn. The editing of the data is
not a complex task, but it requires an experienced, knowledgeable and talented
person to do so.
The next step in the processing of data is the editing of the data instruments.
Editing is a process of checking data to detect and correct errors and omissions, if
any. Data editing happens at two stages: first at the time of recording the data, and
second at the time of analysing the data.
• Has the first counting of the data been compared with the original
documents of the researcher?
The editing steps check for the completeness, accuracy and uniformity of the
data as created by the researcher.
Completeness: The first step of editing or correction of data is to check whether
there is an answer to each of the questions/variables set out in the data set. If there
are any omissions, the researcher is sometimes able to deduce the correct answer
from other related data on the same instrument. If this is possible, the data set has
to be rewritten on the basis of the new information. For example, the approximate
family income can be inferred from the answers to other probes, such as the
occupation of family members, sources of income, and the approximate spending,
saving and borrowing habits of family members. If the information is vital and has
been found to be incomplete, the researcher can contact the respondent personally
again and solicit the requisite data. If none of these steps helps in furnishing the
required data, the data must be marked ‘missing’.
Accuracy: Apart from checking for omissions, the accuracy of each recorded
answer should be checked. A random check process can be applied to trace the
errors at this step. Consistency in responses can also be checked at this step.
Cross-verification of a few related responses helps in checking for consistency
in responses. The reliability of the data set depends heavily on this step of error
correction. While clear inconsistencies should be rectified in the data sets, fake
responses should be dropped from the data sets altogether.
Uniformity: In editing data sets, another keen look-out should be kept for any lack
of uniformity in the interpretation of questions and instructions by the data
recorders. For instance, the responses towards a specific feeling could have been
queried from a positive as well as a negative angle. While interpreting the answers,
care should be taken to record each answer as a ‘positive question’ response or as a
‘negative question’ response. Uniformity checks should also verify that the coding is
consistent throughout the questionnaire, interview schedule or data set.
The final step in the editing of data is to maintain a log of all corrections that
have been carried out at this stage. The documentation of these corrections helps the
researcher to retain the original data set.
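The completeness check and the correction log described above can be sketched in Python. This is a minimal illustration under stated assumptions: the record layout, the field names and the function name `check_completeness` are all hypothetical, and a real study would follow its own instrument.

```python
MISSING = "missing"


def check_completeness(record, required_fields):
    """Return an edited copy of the record in which every unanswered
    required field is marked 'missing', plus a log of the corrections.
    The original record is left untouched."""
    edited = dict(record)
    log = []
    for field in required_fields:
        value = edited.get(field)
        if value is None or str(value).strip() == "":
            edited[field] = MISSING
            log.append({"field": field, "old": value, "new": MISSING,
                        "reason": "no answer recorded"})
    return edited, log


# Hypothetical raw response from one questionnaire.
raw = {"name": "Respondent 1", "age": 34, "income": ""}
edited, corrections = check_completeness(
    raw, ["name", "age", "income", "occupation"]
)

print(edited)
for entry in corrections:
    print(entry)
```

Because the corrections are logged and the raw record is not modified, the original data set can always be recovered, which is the purpose of the log mentioned in the text.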
In the data preparation step, the data is prepared in a format that allows the
analyst to use modern analysis software such as SAS or SPSS. The major
requirement here is to define the data structure. A data structure is a dynamic
collection of related variables and can be conveniently represented as a graph
whose nodes are labelled by variables. The data structure also defines the
preliminary relationships between variables/groups that have been pre-planned
by the researcher. Most data structures can be presented graphically to give clarity
as to the frame of the researched hypothesis. A sample structure could be a linear
structure, in which one variable leads to the other and, finally, to the resultant
end variable.
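A linear data structure of the kind just described can be sketched as a small directed graph in Python. The variable names are hypothetical and the function `linear_order` is an assumption made for illustration; it is not a feature of SAS or SPSS.

```python
def linear_order(graph):
    """Return the variables in order if the graph is a single chain
    (each variable leads to at most one other and is reached from at
    most one other); return None otherwise."""
    targets = [node for successors in graph.values() for node in successors]
    if any(len(successors) > 1 for successors in graph.values()):
        return None  # a variable leads to more than one variable
    if len(targets) != len(set(targets)):
        return None  # a variable is reached from more than one variable
    starts = [node for node in graph if node not in targets]
    if len(starts) != 1:
        return None  # no unique starting variable
    order, node = [starts[0]], starts[0]
    while graph.get(node):
        node = graph[node][0]
        order.append(node)
    return order if len(order) == len(graph) else None


# Hypothetical structure: each variable leads to the next one.
structure = {"income": ["spending"], "spending": ["saving"], "saving": []}
print(linear_order(structure))  # ['income', 'spending', 'saving']
```

The last variable in the returned order plays the role of the resultant end variable of the linear structure.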
The identification of the nodal points and the relationships among the nodes
could sometimes be a more complex task than estimated. When the task is complex,
involving several types of instruments being collected for the same research question,
the procedure for drawing the data structure would involve a series of steps. In
several intermediate steps, the heterogeneous data structure of the individual data
sets can be harmonized to a common standard and the separate data sets are then
integrated into a single data set. However, the clear definition of such data structures
would help in the further processing of data.
                      Bad               2
                      Worst             1
5.1   Age             Upto 20 years     1
                      21-40 years       2
                      41-60 years       3
5.2   Occupation      Salaried          S
                      Professional      P
                      Technical         T
                      Business          B
                      Retired           R
                      Housewife         H
                      Others            =
= could be treated as a separate variable/observation and the actual response could be
recorded. The new variable cannot be termed as ‘other occupation’.
The coding sheet needs to be prepared carefully, if the data recording is not
done by the researcher, but is outsourced to a data entry firm or individual. In order
to enter the data from the same perspective as the researcher would like to view it,
the data coding sheet is to be prepared first and a copy of the data coding sheet
should be given to the outsourcer to help him or her in the data entry procedure.
Sometimes, the researcher might not be able to code the data from the primary
instrument itself. He or she may need to classify the responses and then code them.
For this purpose, classification of data is also necessary at the data entry stage.
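The occupation part of the coding sheet can be expressed as a small lookup in Python. This is a sketch only: the function name `code_occupation` is an assumption, and the handling of ‘Others’ follows the note to the coding sheet, which says that the actual response could be recorded as a separate observation.

```python
# Codes taken from the occupation rows of the coding sheet.
OCCUPATION_CODES = {
    "Salaried": "S",
    "Professional": "P",
    "Technical": "T",
    "Business": "B",
    "Retired": "R",
    "Housewife": "H",
}
OTHERS_CODE = "="


def code_occupation(response):
    """Return (code, other_text). Responses not on the coding sheet get
    the 'Others' code, and the actual response is kept separately."""
    code = OCCUPATION_CODES.get(response)
    if code is not None:
        return code, None
    return OTHERS_CODE, response


for answer in ["Salaried", "Retired", "Farmer"]:
    print(answer, code_occupation(answer))
```

Giving the same lookup to whoever enters the data helps to ensure that the data is coded from the same perspective as the researcher's.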
Classification
When open-ended responses have been received, classification is necessary to code
the responses. For instance, the income of the respondents could be an open-ended
question. A suitable classification can be arrived at from all responses. A
classification method should meet certain requirements or should be guided by
certain rules.
First, classification should be linked to the theory and the aim of the particular
study. The objectives of the study will determine the dimensions chosen for coding.
The categorization should meet the information required to test the hypothesis or
investigate the questions.
Second, the scheme of classification should be exhaustive, that is, there must be
a category for every response. For example, the classification of marital status into
three categories, viz., ‘married’, ‘single’ and ‘divorced’, is not exhaustive, because
responses like ‘widower’ or ‘separated’ cannot be fitted into the scheme. Here, an
open-ended question will be the best mode of getting the responses. From the
responses collected, the researcher can fit a meaningful and theoretically supportive
classification. The ‘others’ category has to be used carefully by the researcher
for this purpose. Overuse of this category tends to defeat the very purpose of
classification, which is to distinguish between observations in terms of the properties
under study. The ‘others’ category can, however, be very useful when a minority of
respondents in the data set give varying answers. For instance, suppose a survey is carried
out to find out the newspaper reading habits of people. 95 respondents out of 100
could be easily classified into 5 large reading groups, while the remaining 5 respondents
could each have given a unique answer. These answers, rather than being separately considered,
could be clubbed under the ‘others’ heading for meaningful interpretation of
respondents and reading habits.
Third, the categories must also be mutually exclusive, so that each case is
classified only once. This requirement is violated when some of the categories
overlap or different dimensions are mixed up.
The number of categories for a specific question/observation at the coding
stage should be the maximum permissible, since reducing the number of categories at the
analysis level is easier than splitting an already classified group of responses.
However, the number of categories is limited by the number of cases and the
anticipated statistical analysis that is to be used on the observation.
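The two formal requirements, that a scheme be exhaustive and mutually exclusive, can be checked mechanically before coding begins. The following is a minimal sketch; the function names are hypothetical, and the age classes are the ones printed in the coding sheet earlier in this unit, treated here as closed intervals of whole years.

```python
def unclassified(responses, categories):
    """Responses that fit no category, i.e. evidence the scheme is not exhaustive."""
    return sorted({r for r in responses if r not in categories})


def overlapping(intervals):
    """Neighbouring closed intervals (low, high) that share at least one value.

    After sorting by lower limit, any overlap in the scheme shows up
    between some pair of neighbours, so checking neighbours is enough
    to tell whether the classes are mutually exclusive.
    """
    ordered = sorted(intervals)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b[0] <= a[1]]


marital_scheme = {"married", "single", "divorced"}
responses = ["married", "widower", "single", "separated", "divorced"]
print(unclassified(responses, marital_scheme))

age_classes = [(0, 20), (21, 40), (40, 60)]  # 'Upto 20', '21-40', '40-60'
print(overlapping(age_classes))
```

With these inputs the first check reports ‘separated’ and ‘widower’ as having no category, and the second reports that a respondent aged exactly 40 would fall into two classes.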
Transcription of Data
When the observations collected by the researcher are not very large, the simple
inferences, which can be drawn from the observations, can be transferred to a data
sheet, which is a summary of all responses on all observations from a research
instrument. The main aim of transcription is to minimize the shuffling between
several responses and observations. Suppose a research instrument contains 120
responses and the observations have been collected from 200 respondents; a
simple summary of one response from all 200 observations would require shuffling
of 200 pages. The process is quite tedious if several summary tables are to be
prepared from the instrument. The transcription process helps in the presentation of
all responses and observations on data sheets which can help the researcher to arrive
at preliminary conclusions as to the nature of the collected sample. Transcription is,
hence, an intermediary process between data coding and data tabulation.
Methods of Transcription
The researcher may adopt a manual or computerized transcription. Long
worksheets, sorting cards or sorting strips could be used by the researcher to
transcribe the responses manually. Computerized transcription could be done
using a spreadsheet, text file or database package.
The main requisite for a transcription process is the preparation of data sheets
where observations are the rows of the data sheet and the responses/variables are the
columns. Each variable should be given a label so that long
questions can be covered under the label names. The label names are thus the links
to specific questions in the research instrument. For instance, opinion on consumer
satisfaction could be identified through a number of statements (say 10); the data
sheet does not contain the details of the statements, but gives a link to the question in
the research instrument through variable labels. In this instance, the variable names
could be given as CS1, CS2, CS3, CS4, CS5, CS6, CS7, CS8, CS9 and CS10.
The label CS indicates consumer satisfaction and the numbers 1 to 10 indicate the
statements measuring consumer satisfaction. Once the labelling process has been
done for all the responses in the research instrument, the transcription of the
responses is done.
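The layout described above, with observations as rows and labelled variables as columns, can be sketched in a few lines. The coded answers below are hypothetical values invented for illustration, and `make_row` and `frequency` are assumed helper names rather than part of any package.

```python
from collections import Counter

# Variable labels CS1..CS10 link each column to a statement in the instrument.
LABELS = [f"CS{i}" for i in range(1, 11)]


def make_row(respondent_id, answers):
    """Build one data-sheet row: an observation with one value per label."""
    if len(answers) != len(LABELS):
        raise ValueError("expected one answer per variable label")
    row = {"id": respondent_id}
    row.update(zip(LABELS, answers))
    return row


# Hypothetical coded responses on a 1-5 scale from three respondents.
data_sheet = [
    make_row(1, [5, 4, 4, 3, 5, 2, 4, 4, 3, 5]),
    make_row(2, [3, 3, 4, 2, 4, 2, 3, 4, 3, 4]),
    make_row(3, [5, 5, 4, 4, 5, 3, 4, 5, 4, 5]),
]


def frequency(sheet, label):
    """Frequency table of one variable, read straight off the data sheet."""
    return Counter(row[label] for row in sheet)


print(frequency(data_sheet, "CS1"))
```

Because every response sits in a named column, a summary of one variable no longer requires shuffling through the original instruments.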
1. Manual Transcription
When the sample size is manageable, the researcher need not use any
computerization process to analyze the data. The researcher could prefer a manual
transcription and analysis of responses. Manual transcription would be chosen
when the number of responses in a research instrument is very small, say 10
responses, and the number of observations collected is within 100. A transcription
sheet of 100 × 50 rows/columns (assuming each response has 5 options) can be
easily managed by a researcher manually. If, on the other hand, the variables have 20
options, this leads to a worksheet of size 100 × 200, which might not be easily managed
by the researcher manually. In the second instance, if the number of responses is less
than 30, the worksheet could still be attempted manually. In all other
instances, it is advisable to use a computerized transcription process.
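The worksheet sizes quoted above follow from one multiplication: one row per observation and one column per option of every response. A minimal sketch, using the hypothetical function name `worksheet_size`:

```python
def worksheet_size(observations, responses, options_per_response):
    """Rows and columns of a tick-mark worksheet.

    One row per observation; one column per option of every response.
    """
    return observations, responses * options_per_response


print(worksheet_size(100, 10, 5))   # 10 responses with 5 options each
print(worksheet_size(100, 10, 20))  # 10 responses with 20 options each
```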
2. Long Worksheets
Long worksheets require quality paper; preferably chart sheets thick enough to last
several usages. These worksheets are normally ruled both horizontally and vertically
allowing responses to be written in the boxes. If one sheet is not sufficient, the
researcher may use multiple rule sheets to accommodate all the observations.
The headings of the responses, which are the variable names, and their codings (options) are filled
in the first two rows. The first column contains the code of observations. For each
variable, the responses from the research instrument are now transferred to the
worksheet by ticking the specific option that the respondent has chosen. If the variable
cannot be coded into categories, sufficient space for recording the actual response of
the respondent should be provided in the worksheet.
The worksheet can then be used for preparing the summary tables or can be
subjected to further analysis of data. The original instruments can now be kept aside
as safe documents. Copies of the data sheets can also be kept for further
reference. As has been discussed under the editing section, the transcribed data
has to be tested to ensure error-free transcription.
A sample worksheet is given below for reference:
Sl Vehicle Occupation Vehicle
No Owner Performance
Age Age
Y N S P T B R ROTHER occ 1 2 3 4 5 1 2 3 4
1 x x x x
2 x x x x
3 x x x x
4 x x x x
5 x x x x
6 x x x x x
7 x STUDENT x x
8 x ARTIST x x
Transcription can be made as and when the edited instrument is ready for
processing. Once all schedules/questionnaires have been transcribed, the frequency
tables can be constructed straight from the worksheet. Other methods of manual
transcription involve adoption of sorting strips or cards.
Earlier, data entry and processing were done through mechanical and
semi-metric devices such as key punch using punch cards. The arrival of computers has
changed the data processing methodology altogether.
13.4 SUMMARY
• The quality of the results obtained from statistical data, and hence their
usefulness for managerial decision-making, depends upon the
quality of the information collected.
• It is important that a sound investigative process be established to ensure
that the data are highly representative and highly unbiased.
• Before any procedures for data collection are established, the purpose and
the scope of the study must be clearly specified.
• The scope of the study must take into consideration the field to be
covered, and the time period in which to conduct the study.
• The first-hand information obtained by the investigator is bound to be more
reliable and accurate since the investigator can extract the correct
information by removing doubts, if any, in the minds of the respondents
regarding certain questions.
• Where the investigator and informant talk face to face, it becomes possible
to explore questions in depth.
• In indirect personal interview method, instead of directly approaching the
informants, the investigator interviews several third persons who are
directly or indirectly concerned with the subject matter of the enquiry and
who are in possession of the requisite information.
• The committee selects persons known as witnesses and collects information
from them by getting answers to questions decided in advance.
• Under the mailed questionnaire method, the investigator prepares a
questionnaire containing a number of questions pertaining to the field of
enquiry.
• These questionnaires are sent by post to the informants together with a
polite covering letter explaining in detail the aims and objectives of
Tabular Presentation
Objectives
After going through this unit, you will be able to:
• Explain the concept of tabular presentation and the types of tables
• Discuss the components of a table
• Analyse the framing of tables
• Describe the concept of statistical tables
• Differentiate between classification and tabulation
Structure
14.1 Introduction
14.2 Tabulation of Data
14.3 Classification and Tabulation
14.4 Summary
14.5 Key Words
14.6 Answers to ‘Check Your Progress’
14.7 Self-Assessment Questions
14.8 Further Readings
14.1 INTRODUCTION
This unit will introduce you to tabulation, its concepts and objectives. It refers to the
tabulation of data into appropriate tables. Tables are generally one of the following
types: single-column or single-row table, multiple-column or multiple-row table. The
components of a table include table number, title of the table, headnotes, footnotes
and sources. A statistical table, which is an orderly and systematic presentation of
numerical data in columns and rows, also has the same components.
Tabular presentation means tabulating the data in the form of appropriate tables. A
statistical table contains data arranged into a convenient number of rows
and/or columns. The number of rows or columns into which data may be classified
(or distributed) helps bring the broad features of the data to the fore, so that they can be seen at
a glance.
The basic function of a table is to simplify data and to present them in a manner
that facilitates comparison. Simplifying data means that the information desired
becomes easy to locate. Comparison involves bringing all related data together at
one place such that a relational picture can be conveniently and efficiently drawn.
Types of Tables
Statistical tables can be laid out in various ways. The form of a table must suit the data
at hand and be convenient to achieve the objective(s) in mind. Generally, a table is
of the following types:
(i) Single-column or single-row tables: Such tables are the simplest to
construct. The data in these tables are arranged in a single row or a single
column according to time, place, region of space, or an attribute of interest.
The table is vertically laid when the data are arranged in a single column,
and horizontally laid when the data are arranged in a single row.
In fact, laying the table either vertically or horizontally means the same thing.
How the available space allows laying the table is perhaps the only
important consideration that goes into deciding it. A horizontally laid table
obviously consumes less space. Otherwise, these two ways of tabulating
data constitute essentially a single type of table.
(ii) Multiple-column and multiple-row tables: As against single-column and
single-row tables, the given data on a variable may also be arranged in
multiple columns and rows. The data break-up and the kind of relational
comparative picture intended determine the number of columns and rows
required. If the number of rows is represented by r and of columns by c,
such tables are known as ‘r by c’ tables.
The intersection of each row with each column makes a cell. This means
that any ‘r by c’ table consists of r × c cells. A table so constructed is
known as a cross-classification table, with a format as in Table 14.1.
It shows the following:
(a) There are three columns and four rows with 4 × 3 = 12 cells
comprising the body of the table, each containing a figure.
(b) While columns describe one characteristic of the data, rows describe
the other.
(c) Either columns or rows may represent time, place, region of space, or
some other attribute of the data.
Table 14.1 Format of a Cross-classification Table (showing the Title and Head Note)
Components of a Table
Components of a table are functional parts that constitute the structure of the table.
Almost invariably, there are eight (8) components of a statistical table. Each of these
may be understood with reference to the typical format of Table 14.1.
• Source(s): A source mentions where the data presented in the table have
come from. This is an important component of the table, since the source
enables the reader to check and re-check the data from where these may
have been borrowed. It may also help draw, if relevant and necessary,
more information from the source. The source must indicate all information
about itself, such as publication, place and year of publication, and page(s)
and table(s) where the concerned data appear.
A Contingency Table
A contingency table is an important form of presenting observed data. It is amenable
to the application of a number of useful statistical tools of data analysis. It follows
largely the same format as that of Table 14.1. Running into r number of rows and c
number of columns, there are ‘r x c’ cell entries which make the body structure of
the table.
Consider, for example, Table 14.2, which gives the distribution of 2,000
collegiate students according to sex and economic status. As a contingency table, it
deviates from the normal multi-column and multi-row format of Table 14.1 as under:
Table 14.2 Classification of 2000 Collegiate Students According to Sex and Economic Status
(A 2 × 3 Contingency Table)
(for column totals) lies a cell containing the total number of frequencies, or
the number of subjects or objects/items observed in terms of the two
attributes of interest. The row totals and column totals are known as
marginal frequencies.
A look at Table 14.2 shows that the data provided in the cells are count data.
The row totals and column totals both add to 2000, the total number of students
observed. The last column presents the row totals and the last row the column totals.
All this is unlike a normal cross-classification table, where the data are the
measurements of a continuous quantitative variable.
The cell frequencies in a contingency table are amenable to meaningful
interpretations. For example, the first cell frequency (that is, 120) means that out of
all the 2,000 collegiate students there are 120 boys who have high-income means.
Similarly, among the 200 collegiate students out of 2,000 who have high-income means, 120
are boys and 80 are girls, and so on. An important point that must weigh heavily in the
construction of a contingency table is that the two classifying attributes are clearly
and objectively defined. This helps stating the various column heads and row heads
in unambiguous terms as to their meaning and coverage. Any ambiguity in defining
the attributes and, consequently, the column and row heads seriously erodes an
objective classification of the observed data. It also does not allow the cell
frequencies to offer precise and meaningful interpretations.
Statistical Tables
A statistical table is an orderly and systematic presentation of numerical data in
columns and rows. Columns are vertical arrangements; rows are horizontal. The
main objective of a statistical table is to so arrange the physical presentation of
numerical facts that the attention of the reader is automatically directed to the
relevant information. Some of the main advantages of tabular presentation over
descriptive statements are as follows:
• Tabulated data can be more easily understood than facts stated in the form of
descriptions.
• They leave a lasting impression.
• They facilitate quick comparison.
• Statistical tables make easier the summation of items and detection of
errors and omissions.
• When data are tabulated all unnecessary details and repetitions are avoided.
• A tabular arrangement makes it unnecessary to repeat explanations,
phrases and headings.
Parts of Tables
The following parts must be present in all tables:
• Title
• Caption
• Stubs
• Body
There are, however, other parts whose presence depends upon the specific
purpose. They are Headnote (or prefatory note), footnote and source note.
• Title: A complete title explains in brief and concise language (a) what the
data are, (b) where the data are, (c) time period of data and (d) how the
data are classified.
• Captions: The titles of the columns are given in captions. In case there is a
sub-division of any column, there would be sub-caption headings also.
• Stubs: The titles of the rows are called stubs. The box over the stub on the
left of the table gives description of the stub contents, and each stub labels
the data found in its row of the table.
• Body: The body of the table contains the numerical information.
• Headnote (or prefatory note): It is a statement, given below the title, which
clarifies the contents of the table.
• Footnote: It is a statement which clarifies some specific items given in the
table or explains the omission thereof. Thus, in a table giving
yearly figures of wheat production in India, a footnote may explain that the sudden fall in the figure for
1947 arises because the figures from that year relate to India after partition.
• Source: The source from where the data contained in the table has been
obtained should be stated. This would permit the reader to check the
figures and gather, if necessary, additional information.
Table 14.3 Title (Description of Units and Year, Place etc) Headnote
Types of Tables
Tables may be classified according to the number of characteristics used for
tabulation. A simple or one-way table uses only one characteristic against which the
frequency distribution is given, as in Table 14.4, where the characteristic used is the
age of the student.
Table 14.4 Age Wise Distribution of the Students of a College
‘sex’ and ‘course’ of the students, the table would take the form shown in
Table 14.6 and would be called a higher order table.
Table 14.6 Table Showing Distribution of the Students of a College According to ‘Age’, ‘Sex’, and ‘Course’

                                      Course
Age in Years    Arts              Science           Commerce          Total
                Male    Female    Male    Female    Male    Female
16–17
17–18
18 and Over
Total
Example 14.1: Draft a form of tabulation to show:
(a) Sex,
(b) Three ranks–supervisors, assistants and clerks,
(c) Years–1970 and 1979
(d) Age group–18 years and under, over 18 but less than 55 years, over 55
years.
Solution: In the above question, we have to prepare a table to show four
characteristics, i.e., sex, the three ranks of employees, the two
years and the age groups given. We
can prepare a blank table to incorporate all these characteristics (Table 14.7).
Table 14.7 Table Showing the Division of Three Ranks of Employees According to Sex and Age Group for 1976 and 1979

                          1976                                        1979
Sex       Age Group       Supervisors  Assistants  Clerks  Total      Supervisors  Assistants  Clerks  Total
Males     0–18
          18–55
          55 and above
          Total
Females   0–18
          18–55
          55 and above
          Total
Example 14.2: The city of Timbuktu was divided into three areas: the administrative
district, other urban districts and rural districts. A survey of housing conditions was
carried out and the following information was gathered:
There were 6,77,100 buildings, of which 1,76,100 were in rural districts. Of the
buildings in other urban districts, 4,06,400 were inhabited and 4,500 were under
construction. In the administrative district, 4,000 buildings were uninhabited and 500
were under construction, out of a total of 61,600.
The total number of buildings in the city under construction is 6,200 and the number
uninhabited is 44,900.
Tabulate the above information so as to give the maximum possible information.
How many buildings are under construction in rural areas?
Solution
Table 14.8 Distribution of Buildings in the Three Districts of Timbuktu According to Inhabitation
(in hundreds)

District          Inhabited   Uninhabited   Under Construction   Total
Administrative        571          40                5             616
Other Urban          4064         285               45            4394
Rural                1625         124               12            1761
Total                6260         449               62            6771
The table clearly shows that there are 1,200 buildings under construction in rural
areas.
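Every missing cell in Table 14.8 can be obtained by subtracting known figures from a row or column total. The sketch below reproduces that arithmetic in hundreds of buildings. It assumes, as Table 14.8 does, that the 4,000 buildings mentioned for the administrative district are the uninhabited ones; the variable names are illustrative.

```python
# Figures stated in the problem, in hundreds of buildings.
city_total = 6771
city_under_construction = 62
city_uninhabited = 449
rural_total = 1761
admin = {"uninhabited": 40, "under_construction": 5, "total": 616}
other_urban = {"inhabited": 4064, "under_construction": 45}

# Fill each missing cell by subtraction from a known row or column total.
admin["inhabited"] = admin["total"] - admin["uninhabited"] - admin["under_construction"]
other_urban["total"] = city_total - admin["total"] - rural_total
other_urban["uninhabited"] = (
    other_urban["total"] - other_urban["inhabited"] - other_urban["under_construction"]
)
rural = {"total": rural_total}
rural["under_construction"] = (
    city_under_construction - admin["under_construction"] - other_urban["under_construction"]
)
rural["uninhabited"] = city_uninhabited - admin["uninhabited"] - other_urban["uninhabited"]
rural["inhabited"] = rural["total"] - rural["uninhabited"] - rural["under_construction"]

for name, row in [("Administrative", admin), ("Other Urban", other_urban), ("Rural", rural)]:
    print(name, row["inhabited"], row["uninhabited"], row["under_construction"], row["total"])
```

The rural figure for buildings under construction comes out as 12 hundred, that is, 1,200 buildings.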
Example 14.3: An investigation conducted by the education department in a public
library revealed the following facts. You are required to tabulate the information as
neatly and clearly as you can.
‘In 1960, the total number of readers was 46,000 and they borrowed some
16,000 volumes. In 1965, the number of books borrowed increased by 4,000 and
the borrowers by 50 per cent.’
The classification was on the basis of three sections: Literature, Fiction and
Illustrated News. There were 10,000 and 30,000 readers in the sections Literature
and Fiction, respectively, in the year 1960. In the same year, 2,000 and 10,000 books
were lent in the sections Illustrated News and Fiction, respectively.
Marked changes were seen in 1965. There were 7,000 and 42,000 readers in the
Literature and Fiction sections, respectively. Also, 4,000 and 13,000 books were
lent in the sections Illustrated News and Fiction, respectively.
Solution:
                    1960                      1965                      Changes in 1965 over 1960
Types of books      Number of   Number of     Number of   Number of     Readers     Books
                    readers     books         readers     books                     borrowed
                                borrowed                  borrowed
Fiction             30,000      10,000        42,000      13,000        +12,000     +3,000
Literature          10,000       4,000         7,000       3,000         –3,000     –1,000
Illustrated news     6,000       2,000        20,000       4,000        +14,000     +2,000
Total               46,000      16,000        69,000      20,000        +23,000     +4,000
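The figures not stated in the problem can be recovered from the totals: the 1965 totals follow from the stated increases, and the remaining section in each column is the total minus the sections that are given. A minimal arithmetic sketch, using the years as in the problem statement and illustrative variable names:

```python
readers_1960 = {"Fiction": 30_000, "Literature": 10_000}
total_readers_1960 = 46_000
total_books_1960 = 16_000

# 1965: borrowers up by 50 per cent, books borrowed up by 4,000.
total_readers_1965 = total_readers_1960 * 3 // 2
total_books_1965 = total_books_1960 + 4_000
readers_1965 = {"Fiction": 42_000, "Literature": 7_000}
books_1965 = {"Fiction": 13_000, "Illustrated news": 4_000}

# The remaining section in each column is the total minus the known sections.
readers_1960["Illustrated news"] = total_readers_1960 - sum(readers_1960.values())
readers_1965["Illustrated news"] = total_readers_1965 - sum(readers_1965.values())
books_1965["Literature"] = total_books_1965 - sum(books_1965.values())

change_in_readers = {s: readers_1965[s] - readers_1960[s] for s in readers_1960}
print(readers_1960["Illustrated news"], readers_1965["Illustrated news"])
print(books_1965["Literature"], change_in_readers)
```

The changes in readers across the three sections add up to the change in the total, 69,000 − 46,000 = 23,000, which is a useful check on the table.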
Example 14.4: Prepare a two-way frequency table and marginal frequency tables
for 25 values of the two variables x and y given below. Take class interval of x as
10–20, 20–30, etc., and that of y as 100–200, 200–300 etc.
x 12 24 33 22 44 37 26 36
y 140 256 360 470 470 380 280 315
x 55 48 27 57 21 51 27 42
y 420 390 440 390 590 250 550 360
x 43 52 57 44 48 48 52 41 69
y 570 290 416 280 452 370 312 330 590
Solution:
Table 14.10 Bivariate Frequency Table

y \ x       10–20   20–30   30–40   40–50   50–60   60–70   Total
100–200       1       –       –       –       –       –       1
200–300       –       2       –       –       2       –       4
300–400       –       –       3       5       2       –      10
400–500       –       2       –       2       2       –       6
500–600       –       2       –       1       –       1       4
Total         1       6       3       8       6       1      25
x f
10 – 20 1
20 – 30 6
30 – 40 3
40 – 50 8
50 – 60 6
60 – 70 1
Total 25
y f
100 – 200 1
200 – 300 4
300 – 400 10
400 – 500 6
500 – 600 4
Total 25
Example 14.5: In a trip organized by a college, there were 80 persons, each of whom paid
₹ 15.50 on average. There were 60 students, each of whom paid ₹ 16. Members of the
teaching staff were charged at a higher rate. The number of servants (all males) was six and
they were not charged anything. The number of ladies was 20 per cent of the total and there
was only one lady staff member. Tabulate this information.
Solution:
Total contribution = 80 × 15.50 = ₹ 1,240.00
Table 14.13 Showing Participants, Sex and Class wise
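The entries needed for such a table follow from the problem statement by subtraction. The sketch below works them out step by step; the variable names are illustrative, and the only facts used are those given in Example 14.5.

```python
persons, average_paid = 80, 15.50
students, student_rate = 60, 16.00
servants = 6                    # all male, charged nothing
ladies = persons * 20 // 100    # 20 per cent of the total
lady_staff = 1

total_contribution = persons * average_paid
staff = persons - students - servants
staff_contribution = total_contribution - students * student_rate
staff_rate = staff_contribution / staff

lady_students = ladies - lady_staff   # servants are all male
male_students = students - lady_students
male_staff = staff - lady_staff

print("Staff members:", staff, "each paying", staff_rate)
print("Students (male, female):", male_students, lady_students)
print("Staff (male, female):", male_staff, lady_staff)
print("Servants (male):", servants)
```

Under these figures there are 14 staff members who together contribute the ₹ 280 not paid by students, so each staff member pays ₹ 20.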
Example 14.6: Prepare a bivariate frequency distribution for the following data:
Marks in Law:        10 11 10 11 11 14 12 12 13 10 13
Marks in Statistics: 20 21 22 21 23 23 22 21 24 23 24
Marks in Law:        12 11 12 10 14 12 13 10 14
Marks in Statistics: 23 22 23 22 22 20 24 23 24
Solution:
                     Marks in Statistics
Marks in Law     20    21    22    23    24    Total
10                1     –     2     2     –      5
11                –     2     1     1     –      4
12                1     1     1     2     –      5
13                –     –     –     –     3      3
14                –     –     1     1     1      3
Total             2     3     5     6     4     20
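The bivariate table of Example 14.6 can be checked by counting the (Law, Statistics) pairs directly. The following is a small sketch using only the standard library; the variable names are illustrative.

```python
from collections import Counter

law = [10, 11, 10, 11, 11, 14, 12, 12, 13, 10, 13,
       12, 11, 12, 10, 14, 12, 13, 10, 14]
stats_marks = [20, 21, 22, 21, 23, 23, 22, 21, 24, 23, 24,
               23, 22, 23, 22, 22, 20, 24, 23, 24]

# Each (law, statistics) pair is one student; count identical pairs.
cells = Counter(zip(law, stats_marks))
row_totals = Counter(law)          # marginal distribution of marks in Law
col_totals = Counter(stats_marks)  # marginal distribution of marks in Statistics

stat_values = sorted(col_totals)
print("Law", *stat_values, "Total")
for mark in sorted(row_totals):
    print(mark, *(cells.get((mark, s), 0) for s in stat_values), row_totals[mark])
print("Total", *(col_totals[s] for s in stat_values), len(law))
```

The row and column totals printed here are the two marginal frequency distributions, and both add up to the 20 students observed.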
Classification of Data
Classification in statistics refers to the process of separating data into various
groups or classes on the basis of properties in the data set. For example, the
interests of a particular class or group can be separated on the basis of gender.
Classification condenses the raw data into forms suitable for statistical analysis,
removes complex data patterns and highlights the core features of the raw
data. After classification, the data can be used for comparison or inference. Classified
data can also reveal relationships or correlations.
Raw data is classified using four key characteristics: geographical,
chronological, qualitative and quantitative properties. Consider a data set
gathered for the analysis of the consumption of petrol per day around the world.
The consumption of petrol can be classified on the basis of countries and types of
vehicles. Here, geographical factors and vehicle types are the bases for
classification. A further, chronological classification can separate out older vehicles,
which have a higher rate of consumption. The maintenance and serviceability of the
vehicles can act as the qualitative basis of classification, and the gross average
claimed by the manufacturer can act as the quantitative basis for classification of the
consumption.
Tabulation of Data
Tabulation in statistics is a method of summarising data by using a systematic
arrangement of data into rows and columns. Tabulation is carried out in order to
investigate and compare data, to identify errors or omissions, to study a prevailing trend,
to simplify the raw data, to use space economically and to provide a
future reference.
14.4 SUMMARY
• A source mentions where the data presented in the table have come from.
This is an important component of the table, since the source enables the
reader to check and re-check the data from where these may have been
borrowed. It may also help draw, if relevant and necessary, more
information from the source.
• There are no hard and fast rules governing how to frame a statistical table.
It all depends on the kind of data available and the objective(s) one wishes
to achieve.
• Where availability of space is a constraint in deciding the size of a table, it
should be so designed that the available space accommodates the table
with all the information it is supposed to contain.
• A contingency table is an important form of presenting observed data. It is
amenable to the application of a number of useful statistical tools of data
analysis.
• The data appearing as cell entries in a contingency table are essentially
qualitative count data. To be more specific, the cell entries are observed
frequencies/counts of an item or the outcome of an event possessing or not
possessing a certain attribute.
• A statistical table is an orderly and systematic presentation of numerical
data in columns and rows. Columns are vertical arrangements; rows are
horizontal. The main objective of a statistical table is to so arrange the
physical presentation of numerical facts that the attention of the reader is
automatically directed to the relevant information.
• The following parts must be present in all tables:
o Title
o Caption
o Stubs
o Body
• Classification, in statistics refers to the process of separation of data into
various groups or classes with the help of properties in the data set.
• Raw data is classified using four key characteristics:
geographical, chronological, qualitative and quantitative properties.
For example, a data set on the daily consumption of petrol around the world
can be classified by country and by type of vehicle.
Diagrammatic and Graphic
Presentation
Objectives
After going through this unit, you will be able to:
• Explain the diagrammatic representation of data
• Analyse pictogram as a sign language
• Discuss graphic representation of data
• Differentiate between histograms, frequency polygon and ogives
Structure
15.1 Introduction
15.2 Diagrammatic and Graphic Presentation
15.3 Graphical Presentation
15.4 Summary
15.5 Key Words
15.6 Answers to ‘Check Your Progress’
15.7 Self-Assessment Questions
15.8 Further Readings
15.1 INTRODUCTION
This unit will introduce you to graphic representation of data. Graphical or pictorial
representation of data helps in giving a visual indication of magnitudes, groupings,
trends and patterns in the data. These also help facilitate comparisons between two
or more sets of data. Diagrammatic representations include bar diagrams, pie charts
and pictograms, whereas graphic representation includes histograms, frequency
polygons and cumulative frequency curves or ogives.
The data we collect can often be more easily understood for interpretation if it is presented
graphically or pictorially. Diagrams and graphs give visual indications of magnitudes,
groupings, trends and patterns in the data. These important features are more simply
presented in the form of graphs. Also, diagrams facilitate comparisons between two or
more sets of data.
The diagrams should be clear and easy to read and understand. Too much
information should not be represented through the same diagram; otherwise, it may
become cumbersome and confusing. Each diagram should include a brief and self-
explanatory title dealing with the subject matter. The scale of the presentation should
be chosen in such a way that the resulting diagram is of appropriate size. The
intervals on the vertical as well as the horizontal axis should be of equal size;
otherwise, distortions would occur.
Diagrams are more suitable to illustrate discrete data, while continuous data is
better represented by graphs. The following are the diagrammatic and graphic
representation methods that are commonly used.
Diagrammatic representation
Year Revenue
1989 110
1990 95
1991 65
Solution:
The bar diagram for this data can be constructed as follows with the revenues represented
on the vertical axis and the years represented on the horizontal axis.
The bars drawn can be subdivided into components depending upon the type of information
to be shown in the diagram. This will be clear by the following example in which we are
presenting three components in a bar.
Example 15.2: Construct a subdivided bar chart for the three types of expenditures in
dollars for a family of four for the years 1988, 1989, 1990 and 1991 as given as follows:
Year Food Education Other Total
1988 3000 2000 3000 8000
1989 3500 3000 4000 10500
1990 4000 3500 5000 12500
1991 5000 5000 6000 16000
Solution:
(ii) Pie chart: This type of diagram enables us to show the partitioning of a
total into its component parts. The diagram is in the form of a circle and is
also called a pie because the entire diagram looks like a pie and the
components resemble slices cut from it. The size of the slice represents the
proportion of the component out of the whole.
Example 15.3: The following figures relate to the cost of the construction
of a house. The various components of cost that go into it are represented
as percentages of the total cost.
Item % Expenditure
Labour 25
Cement, Bricks 30
Steel 15
Timber, Glass 20
Miscellaneous 10
Construct a pie chart for the above data.
Solution:
The pie chart for this data is presented as follows:
[Pie chart: Labour 25%; Cement, bricks 30%; Steel 15%; Timber, glass 20%; Misc. 10%]
Pie charts are very useful for comparison purposes, especially when there
are only a few components. If there are too many components, it may
become confusing to differentiate the relative values in the pie.
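To draw the pie chart of Example 15.3 by hand, each percentage has to be converted into a central angle, using the fact that the full circle of 360 degrees stands for 100 per cent. The sketch below is a minimal illustration of that conversion; the function name pie_angles is an assumption made for this example and does not come from the text.

```python
# Percentage shares from Example 15.3 (cost of constructing a house).
shares = {
    "Labour": 25,
    "Cement, Bricks": 30,
    "Steel": 15,
    "Timber, Glass": 20,
    "Miscellaneous": 10,
}


def pie_angles(percentages):
    """Convert component shares into central angles (degrees) of a pie chart."""
    total = sum(percentages.values())
    return {item: 360 * pct / total for item, pct in percentages.items()}


angles = pie_angles(shares)
for item, angle in angles.items():
    print(f"{item}: {angle:.0f} degrees")
```

Under these assumptions the slices come to 90, 108, 54, 72 and 36 degrees, which together make up the full circle.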
(iii) Pictogram: Pictogram means presentation of data in the form of pictures.
It is quite a popular method used by governments and other organizations
for informational exhibitions. Its main advantage is its attractive value.
Pictograms stimulate interest in the information being presented.
News magazines are very fond of presenting data in this form. For
example, in comparing the strength of the armed forces of USA and
Russia, they will simply make sketches of soldiers where each sketch may
Source: http://www.scratchinginfo.net/wp-content/uploads/2013/04/Modern-Pictograms.png
Source: http://kudesign.co.nz/studio/wp-content/uploads/pictograms.jpg
3. Name the chart that shows the partitioning of a total into component parts.
proportional to the respective frequency and the width represents the class
interval. Each rectangle is joined with the other and any blank spaces
between the rectangles would mean that the category is empty and there
are no values in that class interval.
As an example, let us construct a histogram for our example of ages of 30
workers. For convenience sake, we will present the frequency distribution
along with the mid-point of each interval, where the mid-point is simply the
average of the values of the lower and upper boundary of each class
interval. The frequency distribution table is shown as follows:
[Frequency distribution table and histogram of the ages of 30 workers]
enclosed so that the starting point is joined with a fictitious preceding point
whose value is zero, so that the start of the curve is at horizontal axis and
the last point is joined with a fictitious succeeding point whose value is also
zero, so that the curve ends at the horizontal axis. This enclosed diagram is
known as the frequency polygon.
We can construct the frequency polygon from the preceding table as
follows:
[Frequency polygon through the points (20, 5), (30, 3), (40, 7), (50, 5), (60, 3) and (70, 7)]
(a) Less than ogive: In this case, less than cumulative frequencies are plotted
against upper boundaries of their respective class intervals.
(b) Greater than ogive: In this case, greater than cumulative frequencies are
plotted against the lower boundaries of their respective class intervals.
These ogives can be used for comparison purposes. Several ogives can be
drawn on the same grid, preferably with different colours for easier visualization and
differentiation.
Although diagrams and graphs are powerful and effective media for presenting
statistical data, they can only represent a limited amount of information and they are
not of much help when intensive analysis of data is required.
Solved Problems
Example 15.4: Standard tests were administered to 30 students to determine their
IQ scores. These scores are recorded in the following table.
120 115 118 132 135 125 122 140 137 127
129 130 116 119 132 127 133 126 120 125
130 134 135 127 116 115 125 130 142 140
(d) Compute:
• Relative frequency
• A histogram
• A frequency polygon
Solution:
(a) The ordered array for this data is as follows:
115 115 116 116 118 119 120 120 122 125
125 125 126 127 127 127 129 130 130 130
132 132 133 134 135 135 137 140 140 142
(b) Let there be six groupings, so that the size of the class interval is five. The
frequency distribution is shown as follows:
Class Interval (CI)        Frequency (f)
115 to less than 120       6
120 to less than 125       3
125 to less than 130       8
130 to less than 135       7
135 to less than 140       3
140 to less than 145       3
(c) The required elements are computed in the following table.
(d) The computed values of relative frequency, cumulative relative frequency (<)
and cumulative relative frequency (>) are shown in the following table:
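The quantities named in parts (c) and (d) can be computed directly from the frequency distribution in part (b). The following is a minimal sketch of that calculation, not a reproduction of the original tables; the variable names and the class labels are assumptions chosen for this illustration.

```python
from itertools import accumulate

# Frequency distribution from part (b) of Example 15.4.
classes = ["115-120", "120-125", "125-130", "130-135", "135-140", "140-145"]
freq = [6, 3, 8, 7, 3, 3]
n = sum(freq)  # 30 students

# Relative frequency: the share of all observations falling in each class.
rel_freq = [f / n for f in freq]

# "Less than" cumulative frequency: observations below each upper boundary.
cum_less = list(accumulate(freq))

# "Greater than" cumulative frequency: observations at or above each lower boundary.
cum_more = list(accumulate(reversed(freq)))[::-1]

# Cumulative relative frequencies, (<) and (>).
cum_rel_less = [c / n for c in cum_less]
cum_rel_more = [c / n for c in cum_more]

for row in zip(classes, freq, rel_freq, cum_less, cum_more):
    print("{:8s} f={:2d} rel={:.3f} cum(<)={:2d} cum(>)={:2d}".format(*row))
```

The "less than" column rises from 6 to 30 and the "greater than" column falls from 30 to 3; these are the values plotted in the two ogives.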
(e) Before we construct the histogram and other diagrams, let us first
determine the midpoint (X) of each class interval.
A histogram
A frequency polygon
Example 15.5: Construct a stem and leaf display for the data of IQ scores presented
in the preceding example.
Solution:
The IQ scores of the given thirty students are presented in an ordered array, as follows:
115 115 116 116 118 119 120 120 122 125
125 125 126 127 127 127 129 130 130 130
132 132 133 134 135 135 137 140 140 142
The stem would consist of the first two digits and the leaf would consist of the last digit.
Stem Leaves
11 556689
12 00255567779
13 0002234557
14 002
Example 15.6: Suppose the Office of the Management and Budget (OMB) has
determined that the Federal Budget for 2008 would be utilized for proportionate
spending in the following categories. Construct a pie chart to represent this data.
Solution:
The pie chart is presented as follows. Care must be taken so that the percentage
allocation of budget is represented by the appropriate proportion of the pie.
1. What is a histogram?
15.4 SUMMARY
• The data we collect can often be more easily understood for interpretation
if it is presented graphically or pictorially. Diagrams and graphs give visual
indications of magnitudes, groupings, trends and patterns in the data.
• The diagrams should be clear and easy to read and understand. Too much
information should not be represented through the same diagram;
otherwise, it may become cumbersome and confusing.
• Bars are simply vertical lines where the lengths of the bars are proportional
to their corresponding numerical values. The width of the bar is unimportant
but all bars should have the same width so as not to confuse the reader of
the diagram.
• A pie chart enables us to show the partitioning of a total into its
component parts. The diagram is in the form of a circle and is also called a
pie because the entire diagram looks like a pie and the components
resemble slices cut from it.
• Pictogram means presentation of data in the form of pictures. It is quite a
popular method used by governments and other organizations for
informational exhibitions. Its main advantage is its attractive value.
Pictograms stimulate interest in the information being presented.
• Pictograms or pictographs are symbols of representation of the pictorial
graphic system. Pictographs originated from prehistoric drawings on ancient
rocks signifying an object or thing with its depiction. They are meant to
convey, share or represent an idea or concept.
• Better known as ‘icons’, pictograms have been popularised by the use of,
and growing familiarity with, software. Today the term is used widely and
casually to cover the many icons that represent things.
• The Pictogram is a friendly visual language that is developed for all classes
of people and even those with no ability to speak, read or write.
• Pie charts: They are basically circle charts, which are usually drawn for
component-wise per cent data.
• Component charts: These charts are meant for exhibiting the changes in
the components or parts of a given total in relative terms.
• Pictogram: These are symbols of representation of the pictorial graphic
system.
• Frequency polygon: It is a line chart of frequency distribution in which
either the values of discrete variables or mid-points of class intervals are
plotted against the frequencies.
Chandan, J. S. 1998. Statistics for Business and Economics. New Delhi: Vikas
Publishing House Pvt. Ltd.
Monga, G. S. 2000. Mathematics and Statistics for Economics. New Delhi:
Vikas Publishing House Pvt. Ltd.
Kothari, C. R. 1984. Quantitative Technique. New Delhi: Vikas Publishing House
Pvt. Ltd.
Hooda, R. P. 2002. Statistics for Business and Economics. New Delhi: Macmillan
India Ltd.
Chaudhary, C. M. 2009. Research Methodology. Jaipur: RBSA Publishers.
Kothari, C. R. 2009. Research Methodology. New Jersey: John Wiley and Sons
Ltd.
Pande, G. C. 2003. Research Methodology in Social Sciences. New Delhi: Anmol
Publications (P) Ltd.
370
Concept of Central Tendency, Mean, Median, Mode,
and Geometric, Harmonic and Moving Averages
BLOCK-V
MEASURES OF CENTRAL TENDENCY, DISPERSION AND SKEWNESS
This block discusses the measures of central tendency, dispersion and skewness. The
concepts of central tendency, mean, median, mode and Geometric, Harmonic and Moving
Averages along with the methods of dispersion and skewness are discussed in this block.
This block consists of three units.
The sixteenth unit, as per this book, discusses the concept of central tendency. Central
tendency is the tendency for the values of a random variable to cluster round its mean, mode,
or median. Along with the basics and features of central tendency, the unit also discusses
mean, median and mode. Geometric, harmonic and moving averages are also covered in
this unit.
The seventeenth unit explains the measures of dispersion. Dispersion refers to the extent to
which values of a variable differ from a fixed value such as the mean. The measures of
dispersion can be expressed in an absolute form or in a relative form. The common measures
of dispersion, range and standard deviation are discussed in this unit.
The eighteenth unit discusses the measures of skewness. Skewness refers to a measure of
the asymmetry of the probability distribution of a real-valued random variable about its mean.
The skewness value can be either positive or negative, or it can even be undefined.
However, the qualitative interpretation of skewness remains complicated. The unit discusses
the measures, aspects and features of skewness in detail.
Objectives
After going through this unit, you will be able to:
• Discuss the measures of central tendency
• Describe the concepts of mean
• Analyse arithmetic mean of grouped data
• Assess the advantages and disadvantages of mean
Structure
16.1 Introduction
16.2 Measures of Central Tendency
16.3 Mean
16.4 Summary
16.5 Key Words
16.6 Answers to ‘Check Your Progress’
16.7 Self-Assessment Questions
16.8 Further Readings
16.1 INTRODUCTION
This unit will discuss the concepts of central tendency, mean, median, mode and
geometric, harmonic and moving averages. Central tendency refers to the tendency
for the values of a random variable to cluster round its mean, mode, or median.
The mean, median and mode are the three common forms of statistical averages.
The mean is an average of n numbers computed by adding some function of the
numbers and dividing by some function of n. The median, on the other hand, is the
value below which 50% of the cases fall, and the mode is the most frequent value of a
random variable. The measures of central tendency, the characteristics of the mean,
median and mode, and the various types of means are discussed in this unit.
Measures of central tendency indicate the location of the frequency curve along the
X-axis and ignore all other features of the distribution. There are various possible
measures that can be used to ‘locate’ a frequency distribution, as shown in Fig. 16.1.
If the shape of the frequency distribution were fixed, then all these measures would be
equally descriptive and would fix the location of the curve. But the practical distributions
that we deal with always have some change in shape depending on the samples we
take, even though the general shapes are quite similar. It is, therefore, necessary that
we choose those measures of location which are not very sensitive to the specific
values of items, in particular the extreme values. Thus, measures A and E are
generally meaningless because they depend on the values of the lowest and the
highest items, respectively. The other measures, on the contrary, are less susceptible
to extreme values because they are related to the entire distribution.
Thus, we treat B, C and D as the most common measures of location. There are
some more such measures which we will consider later.
The most important object of calculating and measuring central tendency is to
determine a ‘single figure’ which may be used to represent a whole series involving
magnitudes of the same variable. In that sense, it is an even more compact
description of the statistical data than the frequency distribution.
Since an ‘average’ represents the entire data, it facilitates comparison within one
group or between groups of data. Thus, the performance of the members of a group
16.3 MEAN
There are several commonly used measures of central tendency, such as the arithmetic
mean, mode and median. These values are very useful not only in presenting the overall
picture of the entire data but also for the purpose of making comparisons among two or
more sets of data.
While the arithmetic mean is the most commonly used measure of central location, the
mode and median are more suitable measures under certain conditions and for
certain types of data. However, each measure of central tendency should meet the
following requisites.
1. It should be easy to calculate and understand.
2. It should be rigidly defined. It should have only one interpretation so
that the personal prejudice or bias of the investigator does not affect its
usefulness.
3. It should be representative of the data. If it is calculated from a sample,
then the sample should be random enough to represent the population
accurately.
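\[ \bar{X} = \frac{19 + 20 + 22 + 22 + 17}{5} = \frac{100}{5} = 20 \]
In general, if there are n values in the sample, then
\[ \bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} \]
In other words,
\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}, \quad i = 1, 2, \ldots, n \tag{16.1} \]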
The above formula says: add up all the values of $X_i$, where the value of i
starts at 1 and ends at n with unit increments so that i = 1, 2, 3, ..., n, and
divide the total by n.
If, instead of taking a sample, we take the entire population in our calculations
of the mean, then the symbol for the mean of the population is μ (mu) and the size
of the population is N, so that:
\[ \mu = \frac{\sum_{i=1}^{N} X_i}{N}, \quad i = 1, 2, \ldots, N \tag{16.2} \]
If we have the data in grouped discrete form with frequencies, then the sample mean
is given by:
\[ \bar{X} = \frac{\sum f(X)}{\sum f} \tag{16.3} \]
where Σf = summation of all frequencies = n, and
Σf(X) = summation of each value of X multiplied by its
corresponding frequency (f).
Example 16.1: Let us take the ages of 10 students as follows:
19, 20, 22, 22, 17, 22, 20, 23, 17, 18
Solution: This data can be arranged in a frequency distribution as follows:
Age (X)    Frequency (f)    f(X)
17         2                34
18         1                18
19         1                19
20         2                40
22         3                66
23         1                23
Total      Σf = 10          Σf(X) = 200
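The table in Example 16.1 can be checked with a few lines of code. The sketch below builds the frequency distribution from the ten ages and applies equation (16.3); the variable names are assumptions made for this illustration.

```python
from collections import Counter

# Ages of the 10 students in Example 16.1.
ages = [19, 20, 22, 22, 17, 22, 20, 23, 17, 18]

# Frequency distribution: each distinct age X mapped to its frequency f.
freq = Counter(ages)

sum_f = sum(freq.values())                      # sum of f
sum_fx = sum(x * f for x, f in freq.items())    # sum of f(X)

# Equation (16.3): mean = sum of f(X) divided by sum of f.
grouped_mean = sum_fx / sum_f
print(sum_f, sum_fx, grouped_mean)
```

The totals agree with the table (Σf = 10 and Σf(X) = 200), so the mean age is 200/10 = 20 years, the same value that the ungrouped formula (16.1) gives for these data.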
Marks (X)    Frequency (f)
9            1
10           2
11           3
12           6
13           10
14           11
15           7
16           3
17           2
18           1
Total        46
Solution: This is a discrete frequency distribution, so the mean is calculated using
equation (16.3). The following table shows the method of obtaining Σf(X).
Marks (X) Frequency ( f ) f(X)
9 1 9
10 2 20
11 3 33
12 6 72
13 10 130
14 11 154
15 7 105
16 3 48
17 2 34
18 1 18
Σf = 46 Σf(X) = 623
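Substituting the column totals of the table into equation (16.3) gives the mean mark. The worked step is shown here as a check on the table; the rounding to two decimal places is our own choice.

```latex
\[
\bar{X} = \frac{\sum f(X)}{\sum f} = \frac{623}{46} \approx 13.54
\]
```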
\[ \bar{X} = \frac{\sum f(m)}{\sum f} \tag{16.4} \]
The determination of the midpoint of a class interval requires some
consideration. The position of the midpoint is determined by real as distinguished
from apparent class limits.
Advantages of Mean
1. Its concept is familiar to most people and is intuitively clear.
2. Every data set has a mean, which is unique and describes the entire data to
some degree. For example, when we say that the average salary of a
professor is ₹25,000 per month, it gives us a reasonable idea about the
salaries of professors.
3. It is a measure that can be easily calculated.
4. It includes all values of the data set in its calculation.
5. Its value varies very little from sample to sample taken from the same
population.
6. It is useful for performing statistical procedures such as computing and
comparing the means of several data sets.
Disadvantages of Mean
1. It is affected by extreme values, and hence, not very reliable when the data
set has extreme values especially when these extreme values are on one
side of the ordered data. Thus, a mean of such data is not truly a
representative of such data. For example, the average age of three persons
of ages 4, 6 and 80 is 30.
2. It is tedious to compute for a large data set as every point in the data set is
to be used in computations.
3. We are unable to compute the mean for a data set that has open-ended
classes either at the high or at the low end of the scale.
4. The mean cannot be calculated for qualitative characteristics such as beauty
or intelligence, unless these can be converted into quantitative figures such
as intelligence into IQs.
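The first disadvantage can be made concrete with the three ages quoted in it. The sketch below, which uses only Python's standard statistics module, compares the mean with and without the extreme value; the variable names are assumptions for this illustration.

```python
from statistics import mean

# Ages from the example in disadvantage 1.
ages = [4, 6, 80]

mean_all = mean(ages)                 # pulled upwards by the single value 80
mean_without_extreme = mean(ages[:2])  # the two young children only

print(mean_all, mean_without_extreme)
```

With the extreme value included the mean is 30, an age that describes none of the three persons; without it the mean is 5.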
Median
The second measure of central tendency that has a wide usage in statistical works is
the median. Median is that value of a variable which divides the series in such a
manner that the number of items below it is equal to the number of items above it. Half
the total number of observations lie below the median, and half above it. The median
is thus a positional average.
The median of ungrouped data is found easily if the items are first arranged in
order of magnitude. The median may then be located simply by counting, and its value
can be obtained by reading the value of the middle observations. If we have five
observations whose values are 8, 10, 1, 3 and 5, the values are first arrayed: 1, 3, 5,
8 and 10. It is now apparent that the value of the median is 5, since two observations
are below that value and two observations are above it. When there is an even number
of cases, there is no actual middle item and the median is taken to be the average of the
values of the items lying on either side of position (N + 1)/2, where N is the total number
of items. Thus, if the values of six items of a series are 1, 2, 3, 5, 8 and 10, the median is
the value of item number (6 + 1)/2 = 3.5, which is approximated as the average of the
third and the fourth items, i.e., (3 + 5)/2 = 4.
Thus the steps required for obtaining median are:
1. Arrange the data as an array of increasing magnitude.
2. Obtain the value of the (N + 1)/2th item.
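The two steps above can be sketched in code as follows. This is a minimal illustration for ungrouped data; the function name median_ungrouped is an assumption, and for an even number of items it averages the two items on either side of position (N + 1)/2, as described in the text.

```python
def median_ungrouped(values):
    """Median of ungrouped data: sort, then read off the middle item."""
    ordered = sorted(values)          # step 1: array of increasing magnitude
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd N: the (N + 1)/2th item exists (index mid when counting from 0).
        return ordered[mid]
    # Even N: average the items on either side of position (N + 1)/2.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_ungrouped([8, 10, 1, 3, 5]))      # five observations
print(median_ungrouped([1, 2, 3, 5, 8, 10]))   # six observations
```

For the two series used in the text, the function returns 5 and 4 respectively.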
Even in the case of grouped data, the procedure for obtaining median is
straightforward as long as the variable is discrete or non-continuous as is clear from
the following examples.
Example 16.3: Obtain the median size of shoes sold from the following data.
Number of Shoes Sold by Size in One Year
Size     Number of Pairs    Cumulative Total
5        30                 30
5½       40                 70
6        50                 120
6½       150                270
7        300                570
7½       600                1170
8        950                2120
8½       820                2940
9        750                3690
9½       440                4130
10       250                4380
10½      150                4530
11       40                 4570
11½      39                 4609
Total    4609
Solution: The median is the value of the $\frac{N + 1}{2}$th $= \frac{4609 + 1}{2}$th $= 2305$th item. Since the
items are already arranged in ascending order (size-wise), the size of the 2305th item is
easily determined by constructing the cumulative frequency. Thus, the median size of
shoes sold is 8½, the size of the 2305th item.
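The search through the cumulative totals in Example 16.3 can be sketched as below. The function name discrete_grouped_median is an assumption made for this illustration; it simply returns the first value whose cumulative frequency reaches position (N + 1)/2, which is adequate when N is odd, as it is here.

```python
# Shoe sizes and number of pairs sold, from the table in Example 16.3.
sizes = [5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, 10, 10.5, 11, 11.5]
pairs = [30, 40, 50, 150, 300, 600, 950, 820, 750, 440, 250, 150, 40, 39]


def discrete_grouped_median(values, freqs):
    """Return the value holding the (N + 1)/2th item of a discrete distribution."""
    n = sum(freqs)
    position = (n + 1) / 2
    running_total = 0
    for value, f in zip(values, freqs):
        running_total += f            # cumulative frequency up to this value
        if running_total >= position:
            return value
    raise ValueError("empty frequency distribution")


median_size = discrete_grouped_median(sizes, pairs)
print(sum(pairs), median_size)
```

The cumulative total is 2120 up to size 8 and 2940 up to size 8½, so the 2305th pair is a size 8½.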
In the case of grouped data with a continuous variable, the determination
of the median is a bit more involved. Consider an example: the data relating to the
distribution of male workers by average monthly earnings is given in the following
table. Clearly the median of 6291 cases is the earnings of the (6291 + 1)/2 = 3146th
worker arranged in ascending order of earnings.
From the cumulative frequency, it is clear that this worker has his income in the
class interval 67.5–72.5. But it is impossible to determine his exact income. We,
therefore, resort to approximation by assuming that the 795 workers of this class are
distributed uniformly across the interval 67.5 to 72.5. The median worker is
(3146–2713) = 433rd of these 795, and hence, the value corresponding to him can
be approximated as,
\[ 67.5 + \frac{433}{795} \times (72.5 - 67.5) = 67.5 + 2.72 = 70.22 \]
\[ Me = l + \frac{\frac{N + 1}{2} - C}{f} \times i \]
where l is the lower limit of the median class, i its width, f its frequency, C the
cumulative frequency up to (but not including) the median class, and N is the total
number of cases.
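The interpolation formula can be written as a small function. The sketch below is illustrative only; the function name grouped_median is an assumption, and the figures passed to it are the ones quoted in the earnings example (median class 67.5 to 72.5, 795 workers in the class, 2713 workers below it, 6291 workers in all).

```python
def grouped_median(l, i, f, c, n):
    """Median of a continuous grouped distribution by linear interpolation.

    l: lower limit of the median class
    i: width of the median class
    f: frequency of the median class
    c: cumulative frequency up to (not including) the median class
    n: total number of cases
    """
    return l + ((n + 1) / 2 - c) / f * i


median_earnings = grouped_median(l=67.5, i=5, f=795, c=2713, n=6291)
print(round(median_earnings, 2))
```

Under the assumption that the workers of the median class are spread uniformly across it, the result is about 70.2.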
Fig. 16.3
Mode
The mode is that value of the variable which occurs or repeats itself the greatest
number of times. The mode is the most ‘fashionable’ size in the sense that it is the most
common and typical, and is defined by Zizek as ‘the value occurring most frequently in
a series (or group of items) and around which the other items are distributed most
densely.’
The mode of a distribution is the value at the point around which the items tend to
be most heavily concentrated. It is the most frequent or the most common value,
provided that a sufficiently large number of items are available to give a smooth
Example 16.5: Determine the mode for the data given in the following table.
Solution: In the given data, 22–26 is the modal class, since it has the largest frequency.
The lower limit of the modal class is 22, its upper limit is 26 and its frequency is 19; the
frequency of the preceding class is 18, and that of the following class is 12. The class
interval is 4. Using the various methods of determining the mode, we have:
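\[ \text{(i)} \quad Mo = 22 + \frac{12}{18 + 12} \times 4 = 22 + \frac{8}{5} = 23.6 \]
\[ \text{(ii)} \quad Mo = 26 - \frac{18}{18 + 12} \times 4 = 26 - \frac{12}{5} = 23.6 \]
\[ \text{(iii)} \quad Mo = 22 + \frac{19 - 18}{(19 - 18) + (19 - 12)} \times 4 = 22 + \frac{4}{8} = 22.5 \]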
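The three interpolation formulae of Example 16.5 can be compared side by side in code. The sketch below is a minimal illustration; the function names are assumptions, and f0, f1 and f2 stand for the frequencies of the preceding class, the modal class and the following class.

```python
def mode_from_lower_limit(l, f0, f2, i):
    """Formula (i): move up from the lower limit in proportion to f2."""
    return l + f2 / (f0 + f2) * i


def mode_from_upper_limit(u, f0, f2, i):
    """Formula (ii): move down from the upper limit in proportion to f0."""
    return u - f0 / (f0 + f2) * i


def mode_by_differences(l, f0, f1, f2, i):
    """Formula (iii): interpolate using the differences f1 - f0 and f1 - f2."""
    return l + (f1 - f0) / ((f1 - f0) + (f1 - f2)) * i


# Modal class 22-26 with f1 = 19, preceding f0 = 18, following f2 = 12, width 4.
mo_i = mode_from_lower_limit(22, 18, 12, 4)
mo_ii = mode_from_upper_limit(26, 18, 12, 4)
mo_iii = mode_by_differences(22, 18, 19, 12, 4)
print(round(mo_i, 2), round(mo_ii, 2), round(mo_iii, 2))
```

Formulae (i) and (ii) are two ways of writing the same estimate (23.6), while formula (iii) gives 22.5, much closer to the lower limit of the modal class.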
In formulae (i) and (ii), the frequency of the classes adjoining the modal class is
used to pull the estimate of the mode away from the midpoint towards either the upper
or lower class limit. In this particular case, the frequency of the class preceding the
modal class is more than the frequency of the class following and, therefore, the estimated
mode is less than the midvalue of the modal class. This seems quite logical. If the
frequencies are more on one side of the modal class than on the other, it can be
reasonably concluded that the items in the modal class are concentrated more towards
the class limit of the adjoining class with the larger frequency.
The formula (iii) is also based on a logic similar to that of (i) and (ii). In this case,
to interpolate the value of the mode within the modal class, the differences between
the frequency of the modal class, and the respective frequencies of the classes adjoining
it are used. This formula usually gives better results than the values obtained by the
others, and results exactly equal to those obtained by the graphic method. The formulae (i)
and (ii) give values which are different from the value obtained by formula (iii) and are
closer to the central point of the modal class. If the frequencies of the classes adjoining
the modal class are equal, the mode is expected to be located at the midvalue of the modal
class, but if the frequency on one of the sides is greater, the mode will be pulled away
from the central point. It will be pulled further as the difference between the
frequencies of the classes adjoining the modal class becomes larger. In the example
given above, the frequency of the modal class is 19 and that of the preceding class is 18.
So, the mode should be quite close to the lower limit of the modal class. The midpoint
of the modal class is 24 and the lower limit of the modal class is 22.
Locating the Mode by the Graphic Method. The method of graphic
interpolation is illustrated in Fig. 16.3. The upper corners of the rectangle over the
modal class have been joined by straight lines to those of the adjoining rectangles as
shown in the diagram; the right corner to the corresponding one of the adjoining rectangle
on the left, etc. If a perpendicular is drawn from the point of intersection of these lines,
we have a value for the mode indicated on the base line. The graphic approach is, in
principle, similar to the arithmetic interpolation explained earlier.
The mode may also be determined graphically from an ogive or cumulative frequency
curve. It is found by drawing a perpendicular to the base from that point on the curve
where the curve is most nearly vertical, i.e., steepest (in other words, where it passes
through the greatest distance vertically and smallest distance horizontally). The point
where it cuts the base gives us the value of the mode. How accurately this method
determines the mode is governed by: (1) The shape of the ogive, (2) The scale on
which the curve is drawn.
Estimating the Mode from the Mean and the Median. There usually exists a
relationship among the mean, median and mode for moderately asymmetrical
distributions. If the distribution is symmetrical, the mean, median and mode will have
identical values, but if the distribution is skewed (moderately) the mean, median and
mode will pull apart. If the distribution tails off towards higher values, the mean and the
median will be greater than the mode. If it tails off towards lower values, the mode will
be greater than either of the other two measures. In either case, the median will be
about one-third as far away from the mean as the mode is. This means that,
Mode = Mean – 3 (Mean – Median)
= 3 Median – 2 Mean
In the case of the average monthly earnings (refer table of example 3) the mean is
68.53 and the median is 70.2. If these values are substituted in the above formula, we
get,
Mode = 68.5 – 3(68.5 –70.2)
= 68.5 + 5.1 = 73.6
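According to the formula used earlier,
\[ \text{Mode} = l_1 + \frac{f_2}{f_0 + f_2} \times i = 72.5 + \frac{745}{795 + 745} \times 5 = 72.5 + 2.4 = 74.9 \]
OR
\[ \text{Mode} = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i = 72.5 + \frac{915 - 795}{2 \times 915 - 795 - 745} \times 5 = 72.5 + 2.07 = 74.57 \]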
The difference between the two estimates is due to the fact that the assumed
relationship between the mean, median and mode may not always hold; it is obviously
not valid in this case.
Example 16.6: (a) In a moderately asymmetrical distribution, the mode and mean are
32.1 and 35.4 respectively. Calculate the median.
(b) If the mode and median of a moderately asymmetrical series are respectively
16'' and 15.7'', what would be its most probable mean?
(c) In a moderately skewed distribution, the mean and the median are respectively
25.6 and 26.1 inches. What is the mode of the distribution?
Solution: (a) We know,
Mean – Mode = 3 (Mean – Median)
or 3 Median = Mode + 2 Mean
\[ \text{or} \quad \text{Median} = \frac{32.1 + 2 \times 35.4}{3} = \frac{102.9}{3} = 34.3 \]
(b) 2 Mean = 3 Median – Mode
\[ \text{or} \quad \text{Mean} = \frac{1}{2}(3 \times 15.7 - 16.0) = \frac{31.1}{2} = 15.55 \]
(c) Mode = 3 Median – 2 Mean
= 3 × 26.1 – 2 × 25.6 = 78.3 – 51.2 = 27.1
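All three parts of Example 16.6 rearrange the same empirical relationship, Mean – Mode = 3 (Mean – Median). The sketch below solves it for each quantity in turn; the function names are assumptions made for this illustration, and the relationship itself holds only approximately, for moderately skewed distributions.

```python
def median_from(mean, mode):
    """3 Median = Mode + 2 Mean."""
    return (mode + 2 * mean) / 3


def mean_from(median, mode):
    """2 Mean = 3 Median - Mode."""
    return (3 * median - mode) / 2


def mode_from(mean, median):
    """Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean


part_a = median_from(mean=35.4, mode=32.1)
part_b = mean_from(median=15.7, mode=16.0)
part_c = mode_from(mean=25.6, median=26.1)
print(round(part_a, 2), round(part_b, 2), round(part_c, 2))
```

The three results, 34.3, 15.55 and 27.1, agree with the worked solution.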
the product of these values by simple arithmetic is a tedious work. To facilitate the
computation of geometric mean we make use of logarithms. The above formula when
reduced to its logarithmic form will be:
\[ \log GM = \frac{\log x_1 + \log x_2 + \log x_3 + \cdots + \log x_n}{n} \]
The logarithm of the geometric mean is equal to the arithmetic mean of the logarithms
of individual values.
Example 16.7: Find the GM of 2, 4, 8, 12, 16, 24.
Solution:
x        log x
2        0.3010
4        0.6021
8        0.9031
12       1.0792
16       1.2041
24       1.3802
Total    5.4697
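The remaining steps of Example 16.7 (dividing the sum of the logarithms by n and taking the antilogarithm) can be sketched as follows. This is an illustration that uses Python's math module in place of log tables; the variable names are assumptions.

```python
import math

# Values from Example 16.7.
values = [2, 4, 8, 12, 16, 24]

# Sum of the common (base-10) logarithms, as in the log table.
log_sum = sum(math.log10(x) for x in values)

# log GM is the arithmetic mean of the logarithms; GM is its antilogarithm.
log_gm = log_sum / len(values)
gm = 10 ** log_gm

print(round(log_sum, 4), round(log_gm, 4), round(gm, 2))
```

The sum of the logarithms agrees with the table total of 5.4697, and the geometric mean comes to roughly 8.16, the sixth root of the product of the six values.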
Harmonic Mean
Another important mean is the Harmonic Mean (HM) which is used for averaging the
rates. It is defined by,
\[ \frac{1}{HM} = \frac{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \cdots + \frac{1}{x_n}}{n} \]
Where n is the number of items in the series x1, x2, x3, ..., xn.
Thus, if a man travels 200 km each on three days at speeds of 60, 50 and 40
kmph, respectively, his average speed is given by the HM of the three speeds, namely
\[ HM = \frac{3}{\frac{1}{60} + \frac{1}{50} + \frac{1}{40}} = 48.65 \text{ kmph} \]
Note: HM gives the correct average speed because the man travelled equal distances at the three
speeds. If, however, he had travelled for equal times, the AM would have been the correct
average.
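The note can be verified numerically: the harmonic mean of the three speeds should equal the total distance divided by the total time. The sketch below is a minimal check under the figures given in the text (200 km a day at 60, 50 and 40 kmph); the function name harmonic_mean is an assumption.

```python
def harmonic_mean(values):
    """Harmonic mean: n divided by the sum of the reciprocals."""
    return len(values) / sum(1 / v for v in values)


speeds = [60, 50, 40]     # kmph on the three days
distance_per_day = 200    # km travelled each day

hm = harmonic_mean(speeds)

# Direct check: total distance divided by total travelling time.
total_distance = distance_per_day * len(speeds)
total_time = sum(distance_per_day / s for s in speeds)
average_speed = total_distance / total_time

print(round(hm, 2), round(average_speed, 2))
```

Both figures come to about 48.65 kmph, whereas the arithmetic mean of the three speeds would be 50 kmph.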
Moving Averages
Moving averages are defined as a succession of averages derived from successive,
overlapping segments of constant size of a series of values. A moving average is
calculated by creating a series of averages of different subsets of the full data set.
After fixing the size of the subset, the first element of the moving average is obtained by
taking the average of the initial fixed subset of the number series.
For example, for the series 3, 5, 9, 11, 2, 8, 7, 6, 4, 2, a 3-year moving average can be
calculated as
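For the series above, the 3-year moving average can be sketched in code as follows. This is an illustrative calculation, not the original worked table; the function name moving_average is an assumption, and each average is taken over three consecutive values, with the window shifted forward by one value at a time.

```python
def moving_average(series, window):
    """Averages of successive overlapping segments of a fixed size."""
    return [
        sum(series[k:k + window]) / window
        for k in range(len(series) - window + 1)
    ]


series = [3, 5, 9, 11, 2, 8, 7, 6, 4, 2]
ma3 = moving_average(series, window=3)

for k, value in enumerate(ma3):
    print(series[k:k + 3], round(value, 2))
```

The first average is (3 + 5 + 9)/3 = 5.67, the second is (5 + 9 + 11)/3 = 8.33, and so on; a series of ten values yields eight 3-year averages.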
16.4 SUMMARY
• Measures of central tendency indicate the location of the frequency curve
along the X-axis and ignore all other features of the distribution.
• The most important object of calculating and measuring central tendency is
to determine a ‘single figure’ which may be used to represent a whole series
involving magnitudes of the same variable.
• While arithmetic mean is the most commonly used measure of central
location, mode and median are more suitable measures under certain set of
conditions and for certain types of data.
• The mean is computed by adding all the data values and dividing it by the
number of such values.
• The mean cannot be calculated for qualitative characteristics such as beauty
or intelligence, unless these can be converted into quantitative figures such
as intelligence into IQs.
• Half the total number of observations lie below the median, and half above
it. The median is thus a positional average.
• The median of ungrouped data is found easily if the items are first arranged
in order of magnitude.
• The median can quite conveniently be determined by reference to the ogive
which plots the cumulative frequency against the variable.
• The mode is the most ‘fashionable’ size in the sense that it is the most
common and typical, and is defined by Zizek as ‘the value occurring most
frequently in a series (or group of items) and around which the other items
are distributed most densely.’
• The modal wage, for example, is the wage received by more individuals
than any other wage.
• It may be noted that the occurrence of one or a few extremely high or low
values has no effect upon the mode.
• If a series of data are unclassified, not having been either arrayed or put
into a frequency distribution, the mode cannot be readily located.
• There are several methods of estimating the value of the mode. But, it is
seldom that the different methods of ascertaining the mode give us identical
results.
• The four important methods of estimating mode of a series are: (i) Locating
the most frequently repeated value in the array; (ii) Estimating the mode by
interpolation; (iii) Locating the mode by graphic method; and (iv)
Estimating the mode from the mean and the median.
• In the case of continuous frequency distributions, the problem of
determining the value of the mode is not so simple as it might have
appeared from the foregoing description.
• There usually exists a relationship among the mean, median and mode for
moderately asymmetrical distributions.
• If the distribution is symmetrical, the mean, median and mode will have
identical values, but if the distribution is skewed (moderately) the mean,
median and mode will pull apart.
• If the distribution tails off towards higher values, the mean and the median
will be greater than the mode.
• The Geometric Mean (GM) of n positive values is defined as the nth root
of their product.
Measures of Dispersion–I & II
Objectives
After going through this unit, you will be able to:
• Define the concept of measures of dispersion and its significance in statistical
analysis
• Differentiate between quartile deviation and standard deviation
• Describe how to calculate coefficient of mean deviation and standard
deviation
• Analyse standard deviation by short-cut method
• Assess the various measures of dispersion
Structure
17.1 Introduction
17.2 Measures of Dispersion
17.3 Standard Deviation
17.4 Summary
17.5 Key Words
17.6 Answers to ‘Check Your Progress’
17.7 Self-Assessment Questions
17.8 Further Readings
17.1 INTRODUCTION
In this unit, you will learn about the measures of dispersion. A measure of central
tendency is computed to see through the variability or dispersion of the individual
values. But dispersion is in itself a very important property of a distribution and
needs to be measured by an appropriate statistic.
The measure of dispersion can be expressed in an ‘absolute form’, or in a
‘relative form’. It is said to be in an absolute form when it states the actual amount
by which the value of an item on an average deviates from a measure of central
tendency. A relative measure of dispersion is a quotient obtained by dividing the
absolute measures by a quantity in respect to which absolute deviation has been
computed. Relative measures are used for making comparisons between two or
more distributions. The common measures of dispersion are range, semi-interquartile
range or the quartile deviation, mean deviation and standard deviation. Of these, the
standard deviation is the best measure. All these measures are discussed in this unit.
Range
The crudest measure of dispersion is the range of the distribution. The range of any
series is the difference between the highest and the lowest values in the series. If the
marks received in an examination taken by 248 students are arranged in ascending
order, then the range will be equal to the difference between the highest and the
lowest marks.
In a frequency distribution, the range is taken to be the difference between the
lower limit of the class at the lower extreme of the distribution and the upper limit of
the class at the upper extreme.
Table 17.1 Weekly Earnings of Labourers in Four Workshops of the Same Type

Weekly earnings                    No. of Workers
(₹)               Workshop A   Workshop B   Workshop C   Workshop D
15–16                ...          ...           2           ...
17–18                ...           2            4           ...
19–20                ...           4            4            4
21–22                 10          10           10           14
23–24                 22          14           16           16
25–26                 20          18           14           16
27–28                 14          16           12           12
29–30                 14          10            6           12
31–32                ...           6            6            4
33–34                ...          ...           2            2
35–36                ...          ...          ...          ...
37–38                ...          ...           4           ...
Total                 80          80           80           80
Mean                25.5        25.5         25.5         25.5
Consider the data on the weekly earnings of workers in four workshops given in
Table 17.1. We note the following:
Workshop Range
A 9
B 15
C 23
D 15
From these figures, it is clear that the greater the range, the greater is the
variation of the values in the group.
The range is a measure of absolute dispersion and as such cannot be usefully
employed for comparing the variability of two distributions expressed in different
units. The amount of dispersion measured, say, in pounds, is not comparable with
dispersion measured in inches. So the need of measuring relative dispersion arises.
An absolute measure can be converted into a relative measure if we divide it by
some other value regarded as standard for the purpose. We may use the mean of
the distribution or any other positional average as the standard.
\[ \text{Workshop C} = \frac{23}{15 + 38} = \frac{23}{53} \qquad \text{Workshop D} = \frac{15}{19 + 34} = \frac{15}{53} \]
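As a rough check on these figures, the sketch below recomputes the range and the coefficient of range for the four workshops from the class limits in Table 17.1. It assumes the coefficient is (highest - lowest)/(highest + lowest), which is the form the Workshop C and D fractions take; the dictionary and function names are illustrative only.

```python
# Lowest and highest class limits with non-zero frequency, read from Table 17.1.
limits = {
    "A": (21, 30),
    "B": (17, 32),
    "C": (15, 38),
    "D": (19, 34),
}

def value_range(low, high):
    """Absolute range: highest value minus lowest value."""
    return high - low

def coefficient_of_range(low, high):
    """Relative range (assumed form): (high - low) / (high + low)."""
    return (high - low) / (high + low)

for shop, (low, high) in limits.items():
    print(shop, value_range(low, high), round(coefficient_of_range(low, high), 3))
```

Because the coefficient is a pure ratio, it can be compared across series measured in different units, which the absolute range cannot.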
Class             No. of Students
           Section A   Section B   Section C
0–10          ...         ...         ...
10–20           1         ...         ...
20–30          12          12          19
30–40          17          20          18
40–50          29          35          16
50–60          18          25          18
60–70          16          10          18
70–80           6           8          21
80–90          11         ...         ...
90–100        ...         ...         ...
Total         110         110         110
Range          80          60          60
The table is designed to illustrate three distributions with the same number of
cases but different variability. The removal of two extreme students from section A
would make its range equal to that of B or C.
The greater range of A is not a description of the entire group of 110 students,
but of the two most extreme students only. Further, though sections B and C have
the same range, the students in section B cluster more closely around the central
tendency of the group than they do in section C. Thus, the range fails to reveal the
greater homogeneity of B or the greater dispersion of C. Due to this defect, it is
seldom used as a measure of dispersion.
(b) In the study of prices of securities, range has a special field of activity.
Thus to highlight fluctuations in the prices of shares or bullion it is a
common practice to indicate the range over which the prices have moved
during a certain period of time. This information, besides being of use to the
operators, gives an indication of the stability of the bullion market, or that
of the investment climate.
(c) In statistical quality control, the range is used as a measure of variation.
For example, we determine the range over which variations in quality are due to
random causes, and this is made the basis for fixing the control limits.
vary between (25.3 – 2.1) = ₹ 23.2 and (25.3 + 2.1) = ₹ 27.4, shall be just half of the
total cases. The other half of the workers will be more than ₹ 2.1 removed from the
median wage. As this distribution is not symmetrical, the distance between Q1 and the
median Q2 is not the same as between Q3 and the median. Hence, the interval defined
by median plus and minus semi-interquartile range will not be exactly the same as
given by the value of the two quartiles. Under such conditions the range between
₹ 23.2 and ₹ 27.4 will not include precisely 50 per cent of the workers.
If quartile deviation is to be used for comparing the variability of any two series,
it is necessary to convert the absolute measure to a coefficient of quartile deviation.
To do this, the absolute measure is divided by the average size of the two quartiles.
Symbolically,
\[ \text{Coefficient of Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1} \]
Applying this to our illustration of four workshops, the coefficients of Q.D. are
as given below.
Table 17.3 Calculation of Quartile Deviation

                  Workshop A                Workshop B                Workshop C                Workshop D
Location of Q2    N/2 = 80/2 = 40           80/2 = 40                 80/2 = 40                 80/2 = 40
Q2                24.5 + (40 − 30)/22 × 2   24.5 + (40 − 30)/18 × 2   24.5 + (40 − 30)/16 × 2   24.5 + (40 − 30)/16 × 2
                  = 24.5 + 0.9              = 24.5 + 1.1              = 24.5 + 0.75             = 24.5 + 0.75
                  = 25.4                    = 25.61                   = 25.25                   = 25.25
Location of Q1    N/4 = 80/4 = 20           80/4 = 20                 80/4 = 20                 80/4 = 20
Q1                22.5 + (20 − 10)/22 × 2   22.5 + (20 − 16)/14 × 2   20.5 + (20 − 10)/10 × 2   22.5 + (20 − 18)/16 × 2
                  = 22.5 + 0.91             = 22.5 + 0.57             = 20.5 + 2                = 22.5 + 0.25
                  = 23.41                   = 23.07                   = 22.5                    = 22.75
Location of Q3    3N/4 = 3 × 80/4 = 60      60                        60                        60
Q3                26.5 + (60 − 52)/14 × 2   26.5 + (60 − 48)/16 × 2   26.5 + (60 − 50)/12 × 2   26.5 + (60 − 50)/12 × 2
                  = 26.5 + 1.14             = 26.5 + 1.5              = 26.5 + 1.67             = 26.5 + 1.67
                  = 27.64                   = 28.0                    = 28.17                   = 28.17
We can measure the deviations from any measure of central tendency, but the
most commonly employed ones are the median and the mean. The median is
preferred because it has the important property that the average deviation from it is
the least.
Calculation of the mean deviation then involves the following steps:
(a) Calculate the median or the mean, Md or Me (X̄).
(b) Record the deviations | d | = | x – Me | of each of the items, ignoring the
sign.
(c) Find the average value of deviations.
\[ \text{Mean Deviation} = \frac{\sum |d|}{N} \]
Example 17.1: Calculate the mean deviation from the following data giving marks
obtained by 11 students in a class test.
14, 15, 23, 20, 10, 30, 19, 18, 16, 25, 12
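Solution: Median = Size of the (11 + 1)/2 th item = Size of the 6th item = 18, once
the marks are arranged in ascending order.
\[ \sum |d| = 50 \]
\[ \text{Mean Deviation from Median} = \frac{\sum |d|}{N} = \frac{50}{11} = 4.5 \text{ marks} \]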
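The three steps can be sketched in Python using the marks of Example 17.1. This is a minimal sketch: the helper name `mean_deviation` and its `centre` argument are assumptions made for illustration, and the median is taken from the standard library.

```python
from statistics import median

def mean_deviation(values, centre=None):
    """Average absolute deviation from `centre` (the median by default)."""
    if centre is None:
        centre = median(values)
    # Deviations are taken ignoring their sign, then averaged.
    return sum(abs(x - centre) for x in values) / len(values)

marks = [14, 15, 23, 20, 10, 30, 19, 18, 16, 25, 12]
print(median(marks))                     # 18
print(round(mean_deviation(marks), 2))   # 4.55
```

The result, 50/11 ≈ 4.55, is the figure that the worked solution rounds to 4.5 marks. Passing `centre=sum(marks) / len(marks)` would give the mean deviation about the arithmetic mean instead.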
For grouped data, it is easy to see that the mean deviation is given by,
\[ \text{Mean Deviation (M.D.)} = \frac{\sum f |d|}{\sum f} \]
where | d | = | x – median | for grouped discrete data, and | d | = | M – median |
for grouped continuous data, with M as the mid-value of a particular group. The
following examples illustrate the use of this formula.
Example 17.2: Calculate the mean deviation from the following data:
Size of Item 6 7 8 9 10 11 12
Frequency 3 6 9 13 8 5 4
Solution:
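A minimal sketch of how the grouped formula could be applied to the Example 17.2 data, taking deviations from the median. The helper names are hypothetical, and the median rule used here (the value of the (N + 1)/2 th item) is a simplification suited to discrete data.

```python
def grouped_median(sizes, freqs):
    """Median of a discrete frequency distribution: the value of the middle item."""
    n = sum(freqs)
    middle = (n + 1) / 2          # position of the median item
    running = 0
    for x, f in zip(sizes, freqs):
        running += f
        if running >= middle:     # cumulative frequency first reaches the middle
            return x
    raise ValueError("empty distribution")

def grouped_mean_deviation(sizes, freqs, centre):
    """Sum of f * |x - centre| divided by the total frequency."""
    total = sum(f * abs(x - centre) for x, f in zip(sizes, freqs))
    return total / sum(freqs)

sizes = [6, 7, 8, 9, 10, 11, 12]
freqs = [3, 6, 9, 13, 8, 5, 4]
md_centre = grouped_median(sizes, freqs)
print(md_centre, grouped_mean_deviation(sizes, freqs, md_centre))   # 9 1.25
```

Here N = 48, the median is 9, ∑f|d| = 60 and the mean deviation is 60/48 = 1.25.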
Example 17.3: Calculate the mean deviation from the following data:
Solution:
This is a frequency distribution with continuous variable. Thus, deviations are
calculated from midvalues.
Total: ∑f = 80, ∑f|d| = 1182
\[ \text{Median} = \text{Size of the } \frac{80}{2}\text{th item} = 20 + \frac{6}{15} \times 10 = 24 \]
and then,
\[ \text{Mean Deviation} = \frac{\sum f |d|}{\sum f} = \frac{1182}{80} = 14.775 \]
Merits
(i) It is easy to understand.
(ii) As compared to standard deviation (discussed later), its computation is
simple.
(iii) As compared to standard deviation, it is less affected by extreme values.
(iv) Since it is based on all values in the distribution, it is better than range or
quartile deviation.
Demerits
(i) It lacks those algebraic properties which would facilitate its computation
and establish its relation to other measures.
(ii) Due to this, it is not suitable for further mathematical processing.
\[ \text{Coefficient of M.D.} = \frac{\text{Mean Deviation}}{\text{Mean}} \quad \text{(when deviations were recorded from the mean)} \]
\[ = \frac{\text{M.D.}}{\text{Median}} \quad \text{(when deviations were recorded from the median)} \]
Applying the above formula to Example 17.3,
\[ \text{Coefficient of Mean Deviation} = \frac{14.775}{24} = 0.616 \]
2. What is range?
By far the most universally used and the most useful measure of dispersion is the
standard deviation or root mean square deviation about the mean. We have seen that
all the methods of measuring dispersion so far discussed are not universally adopted
for want of adequacy and accuracy. The range is not satisfactory as its magnitude is
determined by the most extreme cases in the entire group. Further, the range is not stable
because it is dependent on the item whose size is largely a matter of chance. Mean
deviation method is also an unsatisfactory measure of scatter, as it ignores the
algebraic signs of deviation. We desire a measure of scatter which is free from these
shortcomings. To some extent standard deviation is one such measure.
The calculation of standard deviation differs in the following respects from that
of mean deviation. First, in calculating standard deviation, the deviations are squared.
This is done so as to get rid of negative signs without committing algebraic violence.
Further, the squaring of deviations provides added weight to the extreme items, a
desirable feature for certain types of series.
Secondly, the deviations are always recorded from the arithmetic mean, because
although the sum of deviations is the minimum from the median, the sum of squares
of deviations is minimum when deviations are measured from the arithmetic average.
The deviation from x̄ is represented by d.
Thus, standard deviation, σ (sigma), is defined as the square root of the mean of
the squares of the deviations of individual items from their arithmetic mean.
\[ \sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{N}} \tag{17.1} \]
For grouped data (discrete variables),
\[ \sigma = \sqrt{\frac{\sum f (x - \bar{x})^2}{\sum f}} \tag{17.2} \]
and, for grouped data (continuous variables),
\[ \sigma = \sqrt{\frac{\sum f (M - \bar{x})^2}{\sum f}} \tag{17.3} \]
where M is the midvalue of the group.
The use of these formulae is illustrated by the following examples.
Example 17.4: Compute the standard deviation for the following data:
11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21
Solution: Here formula (17.1) is appropriate. We first calculate the mean as
x̄ = ∑x / N = 176/11 = 16, and then calculate the deviations as follows:
   x     (x – x̄)   (x – x̄)²
  11       –5         25
  12       –4         16
  13       –3          9
  14       –2          4
  15       –1          1
  16        0          0
  17       +1          1
  18       +2          4
  19       +3          9
  20       +4         16
  21       +5         25
Total 176             110
\[ \sigma = \sqrt{\frac{110}{11}} = \sqrt{10} = 3.16 \]
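The definition in formula (17.1) can be sketched directly in Python and checked against Example 17.4. The function name is illustrative; note that it divides by N, as the text does, rather than by N - 1.

```python
from math import sqrt

def standard_deviation(values):
    """Root mean square deviation about the arithmetic mean, as in formula (17.1)."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / n)

data = list(range(11, 22))   # 11, 12, ..., 21
print(round(standard_deviation(data), 2))   # 3.16
```

Dividing by N matches what Python's `statistics.pstdev` computes; `statistics.stdev` divides by N - 1 and would give a slightly larger figure.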
Example 17.5: Find the standard deviation of the data in the following distribution:
x    12   13   14   15   16   17   18   20
f     4   11   32   21   15    8    5    4
Solution: For this discrete variable grouped data, we use formula (17.2). Since for
the calculation of x̄ we need ∑fx, and then for σ we need ∑f(x – x̄)², the calculations
are conveniently made in the following format.
   x      f      fx    d = x – x̄    d²     fd²
  12      4      48       –3         9      36
  13     11     143       –2         4      44
  14     32     448       –1         1      32
  15     21     315        0         0       0
  16     15     240        1         1      15
  17      8     136        2         4      32
  18      5      90        3         9      45
  20      4      80        5        25     100
Total   100    1500                         304
Here x̄ = ∑fx / ∑f = 1500/100 = 15
and
\[ \sigma = \sqrt{\frac{\sum f d^2}{\sum f}} = \sqrt{\frac{304}{100}} = \sqrt{3.04} = 1.74 \]
Class    Midvalue M     f     fM    d = M – x̄    d²     fd²
1–3          2          1      2       –6        36      36
3–5          4          9     36       –4        16     144
5–7          6         25    150       –2         4     100
7–9          8         35    280        0         0       0
9–11        10         17    170        2         4      68
11–13       12         10    120        4        16     160
13–15       14          3     42        6        36     108
Total                 100    800                        616
\[ \sigma = \sqrt{\frac{\sum f d^2}{\sum f}} = \sqrt{\frac{616}{100}} = 2.48 \]
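Formulas (17.2) and (17.3) differ only in whether the x values or the class midvalues are used, so one helper can cover both. The sketch below is illustrative (the name `grouped_sd` is an assumption) and uses the frequencies of the two worked tables, with f = 5 for x = 18 as in the calculation table of Example 17.5.

```python
from math import sqrt

def grouped_sd(values, freqs):
    """Mean and standard deviation of a frequency distribution.

    `values` holds the x values (discrete data) or class midvalues (continuous data).
    """
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
    return mean, sqrt(variance)

# Example 17.5 (discrete values)
mean_d, sd_d = grouped_sd([12, 13, 14, 15, 16, 17, 18, 20],
                          [4, 11, 32, 21, 15, 8, 5, 4])
# Midvalues and frequencies of the classes 1-3, 3-5, ..., 13-15
mean_c, sd_c = grouped_sd([2, 4, 6, 8, 10, 12, 14],
                          [1, 9, 25, 35, 17, 10, 3])
print(mean_d, round(sd_d, 2))
print(mean_c, round(sd_c, 2))
```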
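\[ \sigma = \sqrt{\frac{\sum x'^2}{N} - \left(\frac{\sum x'}{N}\right)^2} \tag{17.4} \]
and for grouped data,
\[ \sigma = \sqrt{\frac{\sum f x'^2}{\sum f} - \left(\frac{\sum f x'}{\sum f}\right)^2} \tag{17.5} \]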
This formula is valid for both discrete and continuous variables. In case of
continuous variables, x in the equation x' = x – A stands for the midvalue of the class
in question.
Note that the second term in each of the formulae is a correction term because
of the difference in the values of A and x̄. When A is taken as x̄ itself, this correction
is automatically reduced to zero. The following examples explain the use of these
formulae.
Example 17.7: Compute the standard deviation by the short-cut method for the
following data:
11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21
\[ \sigma = \sqrt{\frac{\sum x'^2}{N} - \left(\frac{\sum x'}{N}\right)^2} = \sqrt{\frac{121}{11} - \left(\frac{11}{11}\right)^2} = \sqrt{11 - 1} = \sqrt{10} = 3.16 \]
Another Method: If we assumed A as zero, then the deviation of each item from
the assumed mean is the same as the value of item itself. Thus, 11 deviates from the
assumed mean of zero by 11, 12 deviates by 12, and so on. As such, we work with
deviations without having to compute them, and the formula takes the following
shape:
x x2
11 121
12 144
13 169
14 196
15 225
16 256
17 289
18 324
19 361
20 400
21 441
176 2926
\[ \sigma = \sqrt{\frac{\sum x^2}{N} - \left(\frac{\sum x}{N}\right)^2} = \sqrt{\frac{2926}{11} - \left(\frac{176}{11}\right)^2} = \sqrt{266 - 256} = 3.16 \]
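The short-cut formula (17.4) gives the same result whatever assumed mean A is chosen, because the second term corrects for the gap between A and the true mean. A small sketch (function name illustrative) shows this for the data 11, 12, ..., 21. The value A = 15 is an inference from the printed totals ∑x′ = 11 and ∑x′² = 121; A = 0 is the "Another Method" case and A = 16 is the true mean.

```python
from math import sqrt

def sd_shortcut(values, assumed_mean=0):
    """Standard deviation via deviations x' = x - A from an assumed mean A (17.4)."""
    n = len(values)
    devs = [x - assumed_mean for x in values]
    mean_sq = sum(d * d for d in devs) / n       # sum(x'^2) / N
    correction = (sum(devs) / n) ** 2            # (sum(x') / N)^2
    return sqrt(mean_sq - correction)

data = list(range(11, 22))
for a in (0, 15, 16):
    print(a, round(sd_shortcut(data, a), 2))     # 3.16 for every choice of A
```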
Example 17.8: Calculate the standard deviation of the following data by short-cut
method.
Person 1 2 3 4 5 6 7
Monthly Income
(Rupees) 300 400 420 440 460 480 580
Solution: In this data, the values of the variable are very large, making calculations
cumbersome. It is advantageous to take a common factor out. Thus, we use
x′ = (x – A)/20. The standard deviation is calculated using x′ and then the true value of σ is
obtained by multiplying back by 20. The effective formula used is,
\[ \sigma = C \times \sqrt{\frac{\sum x'^2}{N} - \left(\frac{\sum x'}{N}\right)^2} \]
where C represents the common factor.
Using x′ = (x – 420)/20:
   x     Deviation from Assumed Mean     x′      x′²
                 (x – 420)
  300              –120                  –6      36
  400               –20                  –1       1
  420                 0                   0       0
  440                20                   1       1
  460                40                   2       4
  480                60                   3       9
  580               160                   8      64
N = 7                          ∑x′ = –7 + 14 = 7      ∑x′² = 115
\[ \sigma = 20 \times \sqrt{\frac{\sum x'^2}{N} - \left(\frac{\sum x'}{N}\right)^2} = 20 \times \sqrt{\frac{115}{7} - \left(\frac{7}{7}\right)^2} = 78.56 \]
Example 17.9: Calculate the standard deviation from the following data:
Size 6 9 12 15 18
Frequency 7 12 19 10 2
Solution:
N = 50, ∑fx′ = –12, ∑fx′² = 58
\[ \sigma = C \sqrt{\frac{\sum f x'^2}{N} - \left(\frac{\sum f x'}{N}\right)^2} = 3 \sqrt{\frac{58}{50} - \left(\frac{-12}{50}\right)^2} = 3 \sqrt{1.1600 - 0.0576} = 3 \times 1.05 = 3.15 \]
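The step-deviation form, in which a common factor C is divided out and multiplied back at the end, can be sketched as below. The function name and argument names are illustrative. For Example 17.9, the choices A = 12 and C = 3 are assumptions, though they are consistent with the printed totals ∑fx′ = –12 and ∑fx′² = 58.

```python
from math import sqrt

def sd_step_deviation(values, freqs, assumed_mean, common_factor):
    """sigma = C * sqrt(sum(f x'^2)/N - (sum(f x')/N)^2), with x' = (x - A) / C."""
    n = sum(freqs)
    steps = [(x - assumed_mean) / common_factor for x in values]
    sum_fx = sum(f * s for f, s in zip(freqs, steps))
    sum_fx2 = sum(f * s * s for f, s in zip(freqs, steps))
    return common_factor * sqrt(sum_fx2 / n - (sum_fx / n) ** 2)

# Example 17.8: seven incomes, each with frequency 1, A = 420, C = 20
incomes = [300, 400, 420, 440, 460, 480, 580]
print(round(sd_step_deviation(incomes, [1] * 7, 420, 20), 2))   # 78.56

# Example 17.9: A = 12 and C = 3 (assumed, see above)
sizes = [6, 9, 12, 15, 18]
freqs = [7, 12, 19, 10, 2]
print(round(sd_step_deviation(sizes, freqs, 12, 3), 2))         # 3.15
```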
Example 17.10: Obtain the mean and standard deviation of the first N natural numbers,
i.e., of 1, 2, 3, ..., N – 1, N.
Solution: Let x denote the variable which assumes the values of the first N natural
numbers.
Then,
\[ \bar{x} = \frac{\sum_{1}^{N} x}{N} = \frac{N(N + 1)}{2N} = \frac{N + 1}{2} \]
since
\[ \sum x = 1 + 2 + 3 + \dots + (N - 1) + N = \frac{N(N + 1)}{2} \]
To calculate the standard deviation σ, we use 0 as the assumed mean A. Then,
\[ \sigma = \sqrt{\frac{\sum x^2}{N} - \left(\frac{\sum x}{N}\right)^2} \]
But,
\[ \sum x^2 = 1^2 + 2^2 + 3^2 + \dots + (N - 1)^2 + N^2 = \frac{N(N + 1)(2N + 1)}{6} \]
Therefore,
\[ \sigma = \sqrt{\frac{N(N + 1)(2N + 1)}{6N} - \frac{N^2 (N + 1)^2}{4N^2}} = \sqrt{\frac{N + 1}{2}\left[\frac{2N + 1}{3} - \frac{N + 1}{2}\right]} = \sqrt{\frac{(N + 1)(N - 1)}{12}} \]
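The closed forms for the first N natural numbers can be checked numerically. The sketch below (function name illustrative) computes the mean and standard deviation directly from the data and compares them with (N + 1)/2 and the square root of (N + 1)(N – 1)/12.

```python
from math import sqrt, isclose

def natural_numbers_stats(n):
    """Mean and standard deviation of 1, 2, ..., n computed directly from the data."""
    values = range(1, n + 1)
    mean = sum(values) / n
    sd = sqrt(sum((x - mean) ** 2 for x in values) / n)
    return mean, sd

for n in (5, 10, 100):
    mean, sd = natural_numbers_stats(n)
    # Closed forms: (N + 1)/2 and sqrt((N + 1)(N - 1)/12)
    assert isclose(mean, (n + 1) / 2)
    assert isclose(sd, sqrt((n + 1) * (n - 1) / 12))
    print(n, mean, round(sd, 4))
```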
∑f = 79, ∑fx′ = 60 – 52 = 8, ∑fx′² = 242
Solution: Since the deviations are from assumed mean and expressed in terms of
class interval units,
\[ \sigma = i \times \sqrt{\frac{\sum f x'^2}{N} - \left(\frac{\sum f x'}{N}\right)^2} = 10 \times \sqrt{\frac{242}{79} - \left(\frac{8}{79}\right)^2} = 10 \times 1.75 = 17.5 \]
and
\[ \sigma = \sqrt{\frac{N_1 \sigma_1^2 + N_2 \sigma_2^2 + N_1 (\bar{x} - \bar{x}_1)^2 + N_2 (\bar{x} - \bar{x}_2)^2}{N_1 + N_2}} \tag{17.7} \]
Example 17.12: The mean and standard deviations of two distributions of 100 and
150 items are 50, 5 and 40, 6 respectively. Find the standard deviation of all taken
together.
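Solution: Combined mean,
\[ \bar{x} = \frac{N_1 \bar{x}_1 + N_2 \bar{x}_2}{N_1 + N_2} = \frac{100 \times 50 + 150 \times 40}{100 + 150} = 44 \]
\[ \sigma = \sqrt{\frac{N_1 \sigma_1^2 + N_2 \sigma_2^2 + N_1 (\bar{x} - \bar{x}_1)^2 + N_2 (\bar{x} - \bar{x}_2)^2}{N_1 + N_2}} = \sqrt{\frac{100 \times 25 + 150 \times 36 + 100 \times 36 + 150 \times 16}{250}} = \sqrt{55.6} = 7.46 \]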
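The combined mean and formula (17.7) can be sketched for any number of component series. The function name and the (size, mean, sd) tuple layout are assumptions made for illustration.

```python
from math import sqrt

def combined_mean_sd(groups):
    """Combine (size, mean, sd) triples into the mean and sd of the pooled data."""
    total = sum(n for n, _, _ in groups)
    mean = sum(n * m for n, m, _ in groups) / total
    # Within-group spread plus the spread of group means about the combined mean.
    pooled = sum(n * s ** 2 + n * (mean - m) ** 2 for n, m, s in groups)
    return mean, sqrt(pooled / total)

mean, sd = combined_mean_sd([(100, 50, 5), (150, 40, 6)])
print(mean, round(sd, 2))   # 44.0 7.46
```

The same function accepts three or more triples, which corresponds to extending the two-series formula to further component series.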
Example 17.13: A distribution consists of three components with 200, 250, 300
items having mean 25, 10 and 15 and standard deviation 3, 4 and 5, respectively.
Find the standard deviation of the combined distribution.
Solution: In the usual notations, we are given here:
N1 = 200, N2 = 250, N3 = 300
x̄1 = 25, x̄2 = 10, x̄3 = 15
σ1 = 3, σ2 = 4, σ3 = 5
The formulae (17.6) and (17.7) can easily be extended for combination of three
series as
\[ \bar{x} = \frac{N_1 \bar{x}_1 + N_2 \bar{x}_2 + N_3 \bar{x}_3}{N_1 + N_2 + N_3} = \frac{12000}{750} = 16 \]
and,
\[ \sigma = \sqrt{\frac{N_1 \sigma_1^2 + N_2 \sigma_2^2 + N_3 \sigma_3^2 + N_1 (\bar{x} - \bar{x}_1)^2 + N_2 (\bar{x} - \bar{x}_2)^2 + N_3 (\bar{x} - \bar{x}_3)^2}{N_1 + N_2 + N_3}} \]
quality control where the same sample size is repeatedly used, so that comparison of
ranges are not distorted by differences in sample size.
The quartile deviations and other such positional measures of dispersions are
also easy to calculate but suffer from the disadvantage that they are not amenable to
algebraic treatment. Similarly, the mean deviation is not suitable because we cannot
obtain the mean deviation of a combined series from the deviations of component
series. However, it is easy to interpret and easier to calculate than the standard
deviation.
The standard deviation of a set of data, on the other hand, is one of the most
important statistics describing it. It lends itself to rigorous algebraic treatment, is
rigidly defined and is based on all observations. It is, therefore, quite insensitive to
sample size (provided the size is ‘large enough’) and is least affected by sampling
variations.
It is used extensively in testing of hypothesis about population parameters based
on sampling statistics.
In fact, the standard deviation has such stable mathematical properties that it is
used as a standard scale for measuring deviations from the mean. If we are told that
the performance of an individual is 10 points better than the mean, it really does not
tell us enough, for 10 points may or may not be a large enough difference to be of
significance. But if we know that the σ for the score is only 4 points, so that on this
scale, the performance is 2.5 σ better than the mean, the statement becomes meaningful.
This indicates an extremely good performance. This sigma scale is a very commonly
used scale for measuring and specifying deviations which immediately suggest the
significance of the deviation.
The only disadvantages of the standard deviation lie in the amount of work
involved in its calculation, and in the large weight it attaches to extreme values because
of the process of squaring involved in its calculation.
Solved Problems
Example 17.14: The arithmetic mean and standard deviation of a series of 20 items
were calculated by a student as 20 cm and 5 cm respectively. But while calculating
them an item 13 was misread as 30. Find the correct arithmetic mean and standard
deviation.
\[ \text{Corrected } \sigma^2 = \frac{\text{Corrected } \sum X^2}{N} - \left(\frac{\text{Corrected } \sum X}{N}\right)^2 = \frac{7769}{20} - \left(\frac{383}{20}\right)^2 = 388.45 - 366.72 \]
\[ \sigma = 4.66 \]
Hence, the correct mean is 19.15 and the correct standard deviation is 4.66.
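The correction can be sketched as follows. The reported mean and standard deviation are first turned back into ∑X = N × mean and ∑X² = N(σ² + mean²); the misread item is then swapped for the correct one in both sums. The function and argument names are illustrative only.

```python
from math import sqrt

def correct_mean_sd(n, mean, sd, wrong, right):
    """Recover the mean and sd after replacing one misread value."""
    sum_x = n * mean                      # sum(X) implied by the reported mean
    sum_x2 = n * (sd ** 2 + mean ** 2)    # sum(X^2) implied by the reported sd
    sum_x += right - wrong                # 400 - 30 + 13 = 383
    sum_x2 += right ** 2 - wrong ** 2     # 8500 - 900 + 169 = 7769
    new_mean = sum_x / n
    new_sd = sqrt(sum_x2 / n - new_mean ** 2)
    return new_mean, new_sd

mean, sd = correct_mean_sd(20, 20, 5, wrong=30, right=13)
print(round(mean, 2), round(sd, 2))   # 19.15 4.66
```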
Example 17.15: The mean and standard deviation of the following continuous series are
31 and 15.9 respectively. The distribution after taking step deviations is as follows:
X' : –3 –2 –1 0 1 2 3
f : 10 15 25 25 10 10 5
Determine the actual class intervals.
Solution:
X' : –3 –2 –1 0 1 2 3 Total
f : 10 15 25 25 10 10 5 100
fX' : –30 –30 –25 0 10 20 15 –40
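fX'² : 90 60 25 0 10 40 45 270
\[ \text{Standard Deviation} = \sqrt{\frac{\sum f X'^2}{N} - \left(\frac{\sum f X'}{N}\right)^2} \times i, \qquad N = 100 \]
Putting the known values, we have
\[ 15.9 = \sqrt{\frac{270}{100} - \left(\frac{-40}{100}\right)^2} \times i = \sqrt{2.70 - 0.16} \times i = 1.59 \times i \]
\[ \therefore i = \frac{15.9}{1.59} = 10 \]
\[ \text{Arithmetic Mean} = A + \frac{\sum f X'}{N} \times i \]
∴ Putting the known values, we have
\[ 31 = A + \frac{-40}{100} \times 10 \]
or A = 31 + 4 = 35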
A, the assumed mean, is the midpoint of the class having X′ value 0.
As the class interval is 10 and the variable under study is a continuous one, the
class for which X′ = 0 will be 35 – 5 to 35 + 5, i.e., 30 to 40. The class next lower than
this is 30 – 10 to 30, i.e., 20 to 30.
Similarly other classes can be calculated. So all the class intervals are:
0–10 10–20 20–30 30–40 40–50 50–60 60–70
Example 17.16: The mean of 50 readings of a variable was 7.43 and their S.D.
was 0.28. The following ten additional readings become available: 6.80, 7.81, 7.58,
7.70, 8.05, 6.98, 7.78, 7.85, 7.21 and 7.40. If these are included with original 50
readings, find (i) The mean, (ii) The standard deviation of the whole set of 60
readings.
Solution: Mean of 50 readings = 7.43
\[ \text{Mean of 10 additional readings} = \frac{\sum X}{N} = \frac{6.80 + 7.81 + 7.58 + 7.70 + 8.05 + 6.98 + 7.78 + 7.85 + 7.21 + 7.40}{10} = 7.516 \]
Standard Deviation,
\[ 0.28 = \sqrt{\frac{\sum X^2}{50} - (7.43)^2} \]
\[ 0.0784 = \frac{\sum X^2}{50} - 55.2 \]
\[ \therefore \text{S.D. of 60 readings} = \sqrt{\frac{3330.71}{60} - (7.44)^2} \]
Example 17.17: The first of two subgroups has 100 items with mean 15 and S.D.
3. If the whole group has 250 items with mean 15.6 and S.D. √13.44, find the S.D.
of the second group.
Solution:
\[ \text{Combined A.M.} = \bar{X} = \frac{N_1 \bar{X}_1 + N_2 \bar{X}_2}{N_1 + N_2} \]
so that 15.6 = (100 × 15 + 150 X̄2)/250, which gives X̄2 = 16.
Combined S.D.,
\[ 13.44 = \frac{900 + 36 + 150 \sigma_2^2 + 24}{250} \]
or 3360 = 960 + 150σ2², so that σ2² = 16 and σ2 = 4.
           A                          B
Midpoint   Frequency      Midpoint   Frequency
   15         15            100         340
   20         33            150         492
   25         56            200         890
   30        103            250        1420
   35         40            300         620
   40         32            350         360
   45         10            400         187
                            450         140
Variable A
Q1 has N/4, i.e., 289/4 or 72.25 items below it.
∴ It lies in the group 22.5–27.5
\[ Q_1 = 22.5 + \frac{72.25 - 48}{56} \times 5 = 24.67 \]
Q3 has 3N/4 or 216.75 items below it.
\[ Q_3 = 33.72 \]
\[ \text{Coefficient of Q.D.} = \frac{33.72 - 24.67}{33.72 + 24.67} = 0.15 \]
Variable B
Q1 has N/4, i.e., 4449/4 or 1112.25 items below it.
∴ It lies in the group 175–225
\[ Q_1 = 175 + \frac{1112.25 - 832}{890} \times 50 = 190.7 \]
Q3 has 3N/4 items below it.
\[ Q_3 = 290.7 \]
\[ \text{Coefficient of Q.D.} = \frac{290.7 - 190.7}{290.7 + 190.7} = 0.21 \]
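The quartile interpolation used for Variable A can be sketched in Python. The helper names are hypothetical, and the class boundaries (12.5–17.5, 17.5–22.5, and so on) are inferred from the midpoints 15, 20, ..., 45 with a class width of 5.

```python
def grouped_quantile(lower_bounds, width, freqs, position):
    """Interpolate the value below which `position` items lie in a grouped table."""
    below = 0
    for lower, f in zip(lower_bounds, freqs):
        if below + f >= position:
            # Move into the class in proportion to the items still needed.
            return lower + (position - below) / f * width
        below += f
    raise ValueError("position exceeds the total frequency")

def coefficient_of_qd(q1, q3):
    """Coefficient of quartile deviation: (Q3 - Q1) / (Q3 + Q1)."""
    return (q3 - q1) / (q3 + q1)

# Variable A: lower class boundaries and frequencies
lower_a = [12.5, 17.5, 22.5, 27.5, 32.5, 37.5, 42.5]
freq_a = [15, 33, 56, 103, 40, 32, 10]
n_a = sum(freq_a)
q1_a = grouped_quantile(lower_a, 5, freq_a, n_a / 4)
q3_a = grouped_quantile(lower_a, 5, freq_a, 3 * n_a / 4)
print(round(q1_a, 2), round(q3_a, 2), round(coefficient_of_qd(q1_a, q3_a), 3))
```

Variable B can be handled the same way with lower boundaries 75, 125, ..., 425 and a width of 50.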
Solution:
Sub-group   Men N   Average X̄     NX̄      σ      Nσ²    X̄ – X̄c   N(X̄ – X̄c)²
A             50      61          3050     8     3200     –13        8450
B            100      70          7000     9     8100      –4        1600
C            120      80.5        9660    10    12000      6.5       5070
D             30      83          2490    11     3630      9.0       2430
Total        300                 22200          26930               17550
\[ \text{Combined Mean } (\bar{X}_c) = \frac{\sum N \bar{X}}{\sum N} = \frac{22200}{300} = \text{₹ } 74 \]
\[ (\text{Combined Standard Deviation})^2 = \frac{\sum N \sigma^2}{\sum N} + \frac{\sum N (\bar{X} - \bar{X}_c)^2}{\sum N} = \frac{26930}{300} + \frac{17550}{300} = \frac{44480}{300} = 148.27 \]
\[ \sigma = \sqrt{148.27} = \text{₹ } 12.18 \]
Example 17.20: For a certain group of wage-earners, the median and quartile wages
per week were ₹ 44.3, ₹ 43.0 and ₹ 45.9 respectively. Wages for the group ranged
between ₹ 40 and ₹ 50. 10 per cent of the group had under ₹ 42 per week, 13 per
cent had ₹ 47 and over and 6 per cent ₹ 48 and over. Put these data into the form of
a frequency distribution, and hence obtain an estimate of the mean wage and the
standard deviation.
Solution: Assuming that the group has 100 workers, the frequency distribution will
take the following shape.
\[ \bar{X} = \frac{\sum f X}{N} = \frac{4450.15}{100} = \text{₹ } 44.50 \]
\[ \sigma = \sqrt{\frac{437.69}{100}} = \text{₹ } 2.1 \text{ (approx.)} \]
Check Your Progress - 2
17.4 SUMMARY
Measures of Skewness
Objectives
After going through this unit, you will be able to:
• Discuss the measures of skewness
• Describe Karl Pearson’s measure of skewness
• Analyse Kelly’s measure of skewness
Structure
18.1 Introduction
18.2 Measures of Skewness
18.3 Karl Pearson’s Measure of Skewness
18.4 Summary
18.5 Key Words
18.6 Answers to ‘Check Your Progress’
18.7 Self-Assessment Questions
18.8 Further Readings
18.1 INTRODUCTION
This unit discusses the measures of skewness. Skewness, in probability theory and
statistics, is a measure of the asymmetry of the probability distribution of a real-
valued random variable about its mean. A distribution, or data set, is symmetric if it
looks the same to the left and right of the center point. Conceptually, skewness
describes which side of a distribution has a longer tail. If the tail is longer on the right,
then the skewness is rightward or positive; if the tail is longer to the left, then the
skewness is leftward or negative. Right skewness is common when a variable is
bounded on the left but unbounded on the right. Left skewness is less common in
practice, but it can occur when a variable tends to be closer to its maximum than its
minimum value. This unit will discuss skewness, its characteristics and types in detail.
Class Interval     A (f)    B (f)    C (f)    D (f)
56.5–58.5            5        3        0        4
58.5–60.5           25        5        4        8
60.5–62.5           15       20       40       20
62.5–64.5           10       44       24       24
64.5–66.5           15       20       20       40
66.5–68.5           25        5        8        4
68.5–70.5            5        3        4        0
N                  100      100      100      100
Mean (Me)         63.5     63.5     63.5     63.5
Median (Md)       63.5     63.5     63       64
Mode (Mo)           —      63.5     61.9     65.1
The histograms and the corresponding curves are drawn in Figures 18.1 and 18.2.
[Fig. 18.1: Histograms and frequency curves of distributions A and B over the range 56.5–70.5, with X̄ = Md marked for A and X̄ = Mo = Md marked for B]
A glance at the data of each of the four classes given above makes a very
interesting study.
The shape of the curves, histograms and placement of equal items at equal
distances on either side of the median clearly show that distributions A and B are
symmetrical. If we fold these curves, or histograms on the ordinate at the mean, the
two halves of the curve or histograms will coincide. In distribution B, all the three
measures of central tendency are identical. In A, which is a bimodal distribution,
mean and median have the same value.
Distributions C and D are asymmetrical. This is evident from the shape of the
histograms and curves, and also from the fact that items at equal distances from the
median are not equal in number. The three measures of central tendency for each of
these distributions are of different sizes.
A point of difference between the asymmetry of distribution C and that of D
should be carefully noted. In distribution C, where the mean (63.5) is greater than
the median (63) and the mode (61.9), the curve is pulled more to the right. In
distribution D where mean (63.5) is lesser than the median (64) and mode (65.1) the
curve is pulled more to the left.
In other words, we may say that if the extreme variations in a given distribution
are towards higher values they give the curve a longer tail to the right and this pulls
the median and mean in that direction from the mode. If, however, extreme
variations are towards lower values, the longer tail is to the left and the median and
mean are pulled to the left of the mode.
It could also be shown that in a symmetrical distribution the lower and upper
quartiles are equidistant from the median, so also are corresponding pairs of deciles
and percentiles. This means that in an asymmetrical distribution the distances of the
upper and lower quartiles from the median are unequal.
[Fig. 18.2: Histograms and frequency curves of distributions C and D over the range 56.5–70.5, with the positions of the mode, median and mean marked]
From the above discussion, we can summarize the tests for the presence of
skewness as follows:
1. When the graph of the distribution does not show a symmetrical curve.
2. When the three measures of central tendency differ from one another.
3. When the sum of the positive deviations from the median is not equal to
the sum of the negative deviations from the same value.
4. When the distances from the median to the quartiles are unequal.
5. When corresponding pairs of deciles or percentiles are not equidistant from
the median.
Measures of Skewness
On the basis of the above tests, the following measures of skewness have been
developed:
1. Relationship between three measures of central tendency—commonly
known as the Karl Pearson’s measure of skewness.
2. Quartile measure of skewness—known as Bowley’s measure of skewness.
3. Percentile measure of skewness—also called the Kelly’s measure of
skewness.
4. Measures of skewness based on moments.
All these measures tell us both the direction and the extent of the skewness.
1. State the nature of the distances of the upper and lower quartiles from the median
in an asymmetrical distribution.
It has been shown earlier that in a perfectly symmetrical distribution, the three
measures of central tendency, viz., mean, median and mode will coincide. As the
distribution departs from symmetry these three values are pulled apart, the difference
between the mean and mode being the greatest. Karl Pearson has suggested the use
of this difference in measuring skewness. Thus Absolute Skewness = Mean – Mode.
The sign (+ or –) obtained from this formula shows the direction of the skewness.
If it is positive, the extreme variations in the given distribution are towards higher
values. If it is negative, the extreme variations are towards lower values.
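The sign rule can be sketched in a few lines of Python. The function names and the sample values of mean and mode below are hypothetical and serve only to illustrate the formula Absolute Skewness = Mean – Mode.

```python
def absolute_skewness(mean, mode):
    """Karl Pearson's absolute skewness: Mean - Mode."""
    return mean - mode

def skew_direction(value):
    """Translate the sign of the absolute skewness into a direction."""
    if value > 0:
        return "positive (extreme values towards the higher end)"
    if value < 0:
        return "negative (extreme values towards the lower end)"
    return "none (mean and mode coincide)"

# Hypothetical values, not taken from the text
print(skew_direction(absolute_skewness(mean=52.0, mode=48.0)))
print(skew_direction(absolute_skewness(mean=40.0, mode=45.0)))
```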
Height (in inches)   58  59  60  61  62  63  64  65
Number of Persons    10  18  30  42  35  28  16   8
Solution: Height is a continuous variable, and hence 58″ must be treated as 57.5″–
58.5″, 59″ as 58.5″–59.5″, and so on.
Height              Frequency   x′ (deviation   fx′     fx′²    Cumulative
(in inches)         f           from 61)                        Frequency
57.5″–58–58.5″      10          –3              –30     90      10
58.5″–59–59.5″      18          –2              –36     72      28
59.5″–60–60.5″      30          –1              –30     30      58
60.5″–61–61.5″      42           0                0      0      100
61.5″–62–62.5″      35           1               35     35      135
62.5″–63–63.5″      28           2               56     112     163
63.5″–64–64.5″      16           3               48     144     179
64.5″–65–65.5″       8           4               32     128     187
Total               187                          +75     611
(Sum of fx′ = –96 + 171 = +75)
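\[ \text{Mean} = 61 + \frac{75}{187} = 61.4, \qquad \text{Mode} = 60.5 + \frac{35}{65} = 61.04 \]
\[ \sigma = \sqrt{\frac{611}{187} - \left(\frac{75}{187}\right)^{2}} = \sqrt{3.27 - 0.16} = \sqrt{3.11} = 1.76 \]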
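The arithmetic of this solution can be reproduced with a short Python sketch. Two points are assumptions rather than statements from the text: the mode is interpolated as L + f2/(f0 + f2) × i, which is inferred from the fraction 35/65 in the solution, and the final step uses the standard form of Karl Pearson's coefficient, (Mean – Mode)/σ, which is not shown above. All variable names are illustrative.

```python
from math import sqrt

heights = [58, 59, 60, 61, 62, 63, 64, 65]
freqs = [10, 18, 30, 42, 35, 28, 16, 8]
assumed_mean = 61  # origin for the deviations x'

n = sum(freqs)
devs = [h - assumed_mean for h in heights]
sum_fx = sum(f * d for f, d in zip(freqs, devs))       # sum of f x'
sum_fx2 = sum(f * d * d for f, d in zip(freqs, devs))  # sum of f x'^2

mean = assumed_mean + sum_fx / n
sigma = sqrt(sum_fx2 / n - (sum_fx / n) ** 2)

# Modal class is the one with the highest frequency (61, i.e. 60.5 to 61.5).
k = freqs.index(max(freqs))
lower_boundary = heights[k] - 0.5
class_width = 1.0
# Assumed interpolation: L + f_next / (f_prev + f_next) * width
mode = lower_boundary + freqs[k + 1] / (freqs[k - 1] + freqs[k + 1]) * class_width

# Assumed final step: Karl Pearson's coefficient = (Mean - Mode) / sigma
coefficient = (mean - mode) / sigma

print(n, sum_fx, sum_fx2)
print(round(mean, 2), round(mode, 2), round(sigma, 2), round(coefficient, 2))
```

Working from the unrounded mean, mode and standard deviation gives a coefficient close to 0.2, indicating a mild positive skew.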
If the median and the lower quartile of a series coincide, SK becomes (+1).
If the median and the upper quartile coincide, SK becomes (–1).
This measure of skewness is rigidly defined and easily computable. Further, such
a measure of skewness has the advantage that its value lies between (+1) and
(–1), with the result that it is sufficiently sensitive for many requirements. The only
criticism levelled against such a measure is that it does not take into consideration all
the items of the series, i.e., extreme items are neglected.
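The two limiting cases can be verified with a small Python sketch. It assumes that SK here denotes the quartile coefficient (Q3 + Q1 – 2 Median)/(Q3 – Q1) used in the worked examples of this unit; the function name and the sample quartile values are hypothetical.

```python
def bowley_coefficient(q1, median, q3):
    """Quartile coefficient of skewness: (Q3 + Q1 - 2*Median) / (Q3 - Q1)."""
    if q3 == q1:
        raise ValueError("Q3 and Q1 coincide; the coefficient is undefined")
    return (q3 + q1 - 2 * median) / (q3 - q1)

# Hypothetical quartile values chosen to show the limits
print(bowley_coefficient(q1=10, median=10, q3=20))  # median equals Q1
print(bowley_coefficient(q1=10, median=20, q3=20))  # median equals Q3
print(bowley_coefficient(q1=10, median=15, q3=20))  # median midway between quartiles
```

Because the median always lies between Q1 and Q3, the numerator can never exceed the denominator in absolute value, which is why the coefficient stays within the limits of –1 and +1.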
Example 18.2: Calculate the coefficient of skewness based on quartiles for the data
in the table given in Example 18.1.
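Solution: With reference to the table given in Example 18.1, we have,
\[ Q_1 = \text{the size of the } \left(\frac{N}{4}\right)\text{th item} = \left(\frac{187}{4}\right)\text{th} = 46.75\text{th item} \]
\[ = 59.5 + \frac{18.75}{30} = 59.5 + 0.63 = 60.13 \]
\[ Q_3 = \text{the size of the } \left(\frac{3N}{4}\right)\text{th item} = \left(\frac{3 \times 187}{4}\right)\text{th} = 140.25\text{th item} \]
\[ = 62.5 + \frac{5.25}{28} = 62.5 + 0.19 = 62.69 \]
Skewness = 62.69 + 60.13 – 2 (61.35) = 0.12 (using formula 18.1)
\[ \text{Coefficient of Skewness} = \frac{0.12}{62.69 - 60.13} \ \text{(using formula 18.2)} = \frac{0.12}{2.56} = 0.047 \]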
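The interpolation behind Q1, the median and Q3 can be sketched in Python as follows. The helper `grouped_quantile` is a hypothetical name; it assumes the usual rule of locating the class that contains the required item and interpolating linearly within it, which matches the fractions 18.75/30 and 5.25/28 in the solution.

```python
def grouped_quantile(lower_bounds, freqs, width, position):
    """Interpolate the value of the `position`-th item in a grouped table."""
    cumulative = 0
    for lower, f in zip(lower_bounds, freqs):
        if f > 0 and cumulative + f >= position:
            # items still needed inside this class, spread evenly over its width
            return lower + (position - cumulative) / f * width
        cumulative += f
    raise ValueError("position exceeds the total frequency")

# Class boundaries and frequencies from the table of Example 18.1
lower_bounds = [57.5, 58.5, 59.5, 60.5, 61.5, 62.5, 63.5, 64.5]
freqs = [10, 18, 30, 42, 35, 28, 16, 8]
n = sum(freqs)

q1 = grouped_quantile(lower_bounds, freqs, 1, n / 4)
median = grouped_quantile(lower_bounds, freqs, 1, n / 2)
q3 = grouped_quantile(lower_bounds, freqs, 1, 3 * n / 4)

skewness = q3 + q1 - 2 * median          # formula 18.1
coefficient = skewness / (q3 - q1)       # formula 18.2

print(q1, median, q3)
print(skewness, coefficient)
```

The results agree with the hand calculation to within rounding; small differences in the third decimal place arise because the solution rounds the quartiles before combining them.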
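Class     Frequency   Mid-value   x′ = (m – 35)/10   fx′     fx′²    Cumulative
          f           m                                              Frequency
0–10      10           5          –3                 –30     90      10
10–20     40          15          –2                 –80     160     50
20–30     20          25          –1                 –20     20      70
30–40      0          35           0                   0      0      70
40–50     10          45           1                  10     10      80
50–60     40          55           2                  80     160     120
60–70     16          65           3                  48     144     136
70–80     14          75           4                  56     224     150
Total     150                                        +64     808
(Sum of fx′ = –130 + 194 = +64)
\[ \sigma = 10 \times \sqrt{5.387 - 0.182} = 10 \times 2.28 = 22.8 \]
\[ \text{Skewness} = \frac{3(\bar{X} - \text{Median})}{\sigma} = \frac{3(39.27 - 45)}{22.8} = \frac{3(-5.73)}{22.8} = \frac{-17.19}{22.8} = -0.75 \]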
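The figures in this solution can be checked with the following Python sketch, which uses the step-deviation method on the grouped table. The assumed mean of 35 and the class width of 10 are read off the table; the variable names and the inline median interpolation are illustrative assumptions.

```python
from math import sqrt

lower = [0, 10, 20, 30, 40, 50, 60, 70]   # lower class limits
freqs = [10, 40, 20, 0, 10, 40, 16, 14]
width = 10
assumed_mean = 35

n = sum(freqs)
# Step deviations x' = (mid-value - assumed mean) / class width
steps = [(lb + width / 2 - assumed_mean) / width for lb in lower]
sum_fx = sum(f * s for f, s in zip(freqs, steps))
sum_fx2 = sum(f * s * s for f, s in zip(freqs, steps))

mean = assumed_mean + width * sum_fx / n
sigma = width * sqrt(sum_fx2 / n - (sum_fx / n) ** 2)

# Median: interpolate within the class containing the (N/2)th item
cumulative = 0
for lb, f in zip(lower, freqs):
    if f > 0 and cumulative + f >= n / 2:
        median = lb + (n / 2 - cumulative) / f * width
        break
    cumulative += f

skewness = 3 * (mean - median) / sigma
print(round(mean, 2), median, round(sigma, 1), round(skewness, 2))
```

The empty class 30–40 is skipped when locating the median, so the 75th item falls in the class 40–50.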
Example 18.4: From the following data compute quartile deviation and the
coefficient of skewness.
Size 5–7 8–10 11–13 14–16 17–19
Frequency 14 24 38 20 4
Solution:
Size Frequency Cumulative Frequency
4.5–7.5 14 14
7.5–10.5 24 38
10.5–13.5 38 76
13.5–16.5 20 96
16.5–19.5 4 100
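\[ Q_1 = 7.5 + \frac{3 \times 11}{24} = 8.87 \]
\[ Q_3 = 10.5 + \frac{3 \times 37}{38} = 10.5 + \frac{111}{38} = 10.5 + 2.92 = 13.42 \]
\[ \text{Median} = 10.5 + \frac{3 \times 12}{38} = 10.5 + \frac{36}{38} = 10.5 + 0.947 = 11.447 \]
\[ \text{Quartile Deviation} = \frac{Q_3 - Q_1}{2} = \frac{13.42 - 8.87}{2} = \frac{4.55}{2} = 2.275 \]
\[ \text{Skewness} = \frac{Q_3 + Q_1 - 2Me}{Q_3 - Q_1} = \frac{13.42 + 8.87 - 22.89}{13.42 - 8.87} = \frac{-0.6}{4.55} = -0.13 \]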
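The solution does not say where the numerators 11, 37 and 12 come from. Presumably they follow from the usual interpolation rule, value = l + (i/f)(position – c.f.), where l is the lower boundary of the class containing the item, i = 3 is the class width, f is the frequency of that class and c.f. is the cumulative frequency of the preceding class. Under that assumption, with N = 100:

```latex
\[
\begin{aligned}
Q_1 &: \tfrac{N}{4} = 25,  & 25 - 14 &= 11, & &\text{class } 7.5\text{--}10.5,\ f = 24 \\
\text{Median} &: \tfrac{N}{2} = 50,  & 50 - 38 &= 12, & &\text{class } 10.5\text{--}13.5,\ f = 38 \\
Q_3 &: \tfrac{3N}{4} = 75, & 75 - 38 &= 37, & &\text{class } 10.5\text{--}13.5,\ f = 38
\end{aligned}
\]
```

Both the median and Q3 fall in the class 10.5–13.5 because its cumulative frequency, 76, is the first to reach 50 and also the first to reach 75.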
Example 18.5: In a certain distribution the following results were obtained:
Mean (X̄) = 45.00; Median = 48.00
Coefficient of Skewness = – 0.4
You are required to estimate the value of standard deviation.
Solution:
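Using Coefficient of Skewness = 3(Mean – Median)/σ, we have
\[ -0.4 = \frac{3(45 - 48)}{\sigma} \]
\[ -0.4\,\sigma = -9 \]
\[ \sigma = \frac{9}{0.4} = 22.5 \]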
Example 18.6: Karl Pearson’s coefficient of skewness of a distribution is +0.32.
Its standard deviation is 6.5 and mean is 29.6. Find the mode and median of the
distribution.
Solution:
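\[ 0.32 = \frac{3(29.6 - \text{Median})}{6.5} \]
\[ 6.5 \times 0.32 = 88.8 - 3\,\text{Median} \]
\[ \text{Median} = \frac{88.8 - 2.08}{3} = 28.91 \]
Since Mean – Mode = 3(Mean – Median) = 2.08, we also have
\[ \text{Mode} = 29.6 - 2.08 = 27.52 \]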
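Examples 18.5 and 18.6 both rearrange the same relation, Coefficient of Skewness = 3(Mean – Median)/σ, for a different unknown. A minimal Python sketch of these rearrangements follows; the function names are hypothetical, and the mode is obtained under the assumption that Mean – Mode = 3(Mean – Median), the empirical relation on which this form of the coefficient rests.

```python
def sigma_from_skewness(coef, mean, median):
    """Solve coef = 3 (mean - median) / sigma for sigma."""
    return 3 * (mean - median) / coef

def median_from_skewness(coef, mean, sigma):
    """Solve coef = 3 (mean - median) / sigma for the median."""
    return mean - coef * sigma / 3

def mode_from_skewness(coef, mean, sigma):
    """Assumes Mean - Mode = 3 (Mean - Median), so Mode = Mean - coef * sigma."""
    return mean - coef * sigma

# Example 18.5: mean 45, median 48, coefficient -0.4
print(sigma_from_skewness(-0.4, 45.0, 48.0))

# Example 18.6: coefficient +0.32, sigma 6.5, mean 29.6
print(median_from_skewness(0.32, 29.6, 6.5))
print(mode_from_skewness(0.32, 29.6, 6.5))
```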
Example 18.7: You are given the position in a factory before and after the
settlement of an industrial dispute. Comment on the gains or losses from the point of
view of the workers and that of the management.
Before After
No. of Workers 2400 2350
Mean Wages 45.5 47.5
Median Wages 49.0 45.0
Standard Deviation 12.0 10.0
Solution:
Employment. Since the number of workers employed after the settlement is less
than the number employed before, it has gone against the interest of the workers.
Wages. The total wages paid after the settlement were 2350 × 47.5 =
₹ 1,11,625; before the settlement the amount disbursed was 2400 × 45.5 =
₹ 1,09,200.
This means that the workers as a group are better off now than before the
settlement, and unless the productivity of workers has gone up, this may be against
the interest of management.
Uniformity in the wage structure. The extent of relative uniformity in the wage
structure before and after the settlement can be determined by comparing the
coefficients of variation.
\[ \text{Coefficient of Variation, Before} = \frac{12}{45.5} \times 100 = 26.4 \]
\[ \text{Coefficient of Variation, After} = \frac{10}{47.5} \times 100 = 21.05 \]
This clearly means that after the settlement there is comparatively less disparity in the
wages received by the workers. Such a position is good for both the workers and the management.
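The wage bill and the coefficients of variation can be recomputed with a short Python sketch. The worker counts 2400 and 2350 are those used in the arithmetic of the solution; the dictionary layout and the helper names are illustrative assumptions.

```python
before = {"workers": 2400, "mean": 45.5, "median": 49.0, "sd": 12.0}
after = {"workers": 2350, "mean": 47.5, "median": 45.0, "sd": 10.0}

def wage_bill(position):
    """Total wages = number of workers x mean wage."""
    return position["workers"] * position["mean"]

def coefficient_of_variation(position):
    """Standard deviation as a percentage of the mean."""
    return position["sd"] / position["mean"] * 100

for label, position in (("Before", before), ("After", after)):
    print(
        label,
        wage_bill(position),
        round(coefficient_of_variation(position), 2),
        # sign of mean - median hints at the direction of skewness
        position["mean"] - position["median"],
    )
```

The last printed column, mean minus median, changes sign after the settlement, which is the basis for comparing the pattern of the wage structure.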
Pattern of the wage structure. A comparison of the mean with the median
leads to the obvious conclusion that before the settlement more than 50 per cent of
the workers were getting a wage higher than the mean (₹ 45.5). After the
settlement the proportion of workers whose wages were more than ₹ 45.5 fell below
50 per cent. This means that the settlement has not been beneficial to all the
workers. Only about 50 per cent of the workers have benefited from the
increase in the total wage bill.
18.4 SUMMARY
7. You are given the position in a factory before and after the settlement of an
industrial dispute. Comment on the gains or losses from the point of view of the
workers and that of the management.
Before After
No. of Workers 2543 2766
Mean Wages 51.5 50.5
Median Wages 48.0 43.0
Standard Deviation 14.0 10.0