Quantitative Aptitude
© Amity University Press
All Rights Reserved
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise without the prior permission of the publisher.
Advisory Committee
Chairman : Ms. Monica Agarwal
Members : Prof. Arun Bisaria
Dr. Priya Mary Mathew
Prof. Aindril De
Mr. Alok Awtans
Dr. Coral J Barboza
Dr. Monica Rose
Mr. Sachit Paliwal
Published by Amity University Press for exclusive use of Amity Directorate of Distance and Online Education,
Amity University, Noida-201313
Contents
Module - 1: Algebra
1.1 Algebra
1.1.1 Introduction to Indices, properties and application in real life
1.1.2 Introduction to Surds, properties and application in real life
1.1.3 Introduction to Logarithms, properties and application of common logarithms
1.1.4 Introduction to equations - Linear and Quadratic, roots of quadratic equation
1.1.5 Methods to solve quadratic equations
1.1.6 Introduction to simultaneous equations and methods to solve them with two or three unknowns
1.1.7 Introduction to Inequalities with Graphs
Module - 2: Data Arrangement
2.1 Arithmetic Progression
2.1.1 Definition of Arithmetic Progression and the nth term of AP
2.1.2 Sum of n terms of AP
2.1.3 Representation of terms in AP
2.1.4 Arithmetic mean between a and b
2.1.5 Arithmetic Progression and its applications in business
2.2 Geometric Progression
2.2.1 Definition of Geometric Progression, nth term of GP
2.2.2 Sum of n terms of GP
3.1.2 Representation of a Set - Roster form, Descriptive form and Set Builder form
3.1.3 Types of Sets - Empty, Singleton, Finite, Infinite Sets, Equal, Power and Universal Sets
3.2 Relations and Functions
1.1.2 Graphical Representation
1.1.3 Histogram
1.1.4 Frequency Polygon and Frequency Curve
1.1.5 Ogive - Part 1
1.1.6 Ogive - Part 2
4.2 Descriptive Measures
2.1.1 Measure of the Central Tendency - I
2.1.2 Measure of the Central Tendency - II
2.1.3 Measure of Dispersion
2.1.4 Kurtosis, skewness
Module - 5: Forecasting Techniques
5.1 Correlation
1.1.1 Correlation-Coefficient_Introduction
1.1.2 Correlation_Coefficient_Application
1.1.3 Introduction_Rank_Correlation
1.1.4 Comparison_Pearson_Spearman_Correlation
1.1.5 Application_Rank_Correlation
5.2 Regression
2.1.1 Introduction_Linear_Regression_model
2.1.2 Population_sample_Regression
2.1.3 Method_Least_Square_Understanding
2.1.4 Maths_Behind_Least_Square
Module - 1: Algebra
Structure:
1.1 Algebra
1.1.1 Introduction to Indices, properties and application in real life
1.1.2 Introduction to Surds, properties and application in real life
1.1.3 Introduction to Logarithms, properties and application of common logarithms
1.1.4 Introduction to equations - Linear and Quadratic, roots of quadratic equation
1.1.5 Methods to solve quadratic equations
1.1.6 Introduction to simultaneous equations and methods to solve them with two or three unknowns
1.1.7 Introduction to Inequalities with Graphs
Unit - 1: Algebra
Objectives:
At the end of this unit, you will be able to understand:
●● Indices, Properties and Application in Real Life
●● Surds, Properties and Application in Real Life
●● Logarithms, Properties and Application of Common Logarithms
●● Equations - Linear and Quadratic, Roots of Quadratic Equations
●● Methods to Solve Quadratic Equations
●● Simultaneous Equations and Methods to solve with 2 or 3 unknowns
●● Inequalities with Graphs
Introduction
In this unit, we will discuss indices, surds and logarithms, together with their properties and applications in real life; linear and quadratic equations; simultaneous equations; and inequalities with graphs.
1.1.1 Introduction to Indices, Properties and Application in Real Life
An index (plural: indices) in Maths is the power or exponent to which a number or a variable is raised. For example, in the number $2^4$, 4 is the index of 2. In algebra, we come across constants and variables. A constant is a value which cannot be changed, whereas a variable quantity can be assigned any number, that is, its value can be changed. In algebra, we deal with indices in terms of numbers. Let us learn the laws/rules of indices along with formulas and solved examples.
Indices
The index says that a particular number (the base) is to be multiplied by itself as many times as the index indicates. It is a compact method of writing big numbers and calculations.
Example: $2^3 = 2 \times 2 \times 2 = 8$
If n is a positive integer, then $a^n$ means the continued product of n factors, each equal to a. Here a is called the base and n is termed the index or exponent or power of $a^n$.
Thus, $4^5 = 4 \times 4 \times 4 \times 4 \times 4$, where 4 is the base and 5 is the index.
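Laws of indices (for positive integral exponents)
If m, n are positive integers and a, b are any two non-zero real numbers, then
(i) $a^m \times a^n = a^{m+n}$
(ii) $a^m \div a^n = a^{m-n}$, where $m > n$
(iii) $(a^m)^n = a^{mn}$
(iv) $(ab)^m = a^m b^m$
(v) $\left(\dfrac{a}{b}\right)^m = \dfrac{a^m}{b^m}$
Ex. Simplify: $(3 \times 10^8)^3$
Solution:
$(3 \times 10^8)^3 = 3^3 \times (10^8)^3$
$= 27 \times 10^{24}$
$= 2.7 \times 10^{25}$
Ex. Simplify: $\dfrac{3^{10} \times 10^8}{3^6 \times 10^5}$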
Laws of Indices
There are some fundamental rules or laws of indices which are necessary to understand before we start dealing with indices. These laws are used while performing algebraic operations on indices and while solving algebraic expressions that include them.
Rule 1: If a constant or variable has index '0', then the result is equal to one, regardless of the (non-zero) base value.
$a^0 = 1$
Example: $5^0 = 1$, $12^0 = 1$, $y^0 = 1$
Rule 2: If the index is a negative value, then it can be shown as the reciprocal of the positive index raised to the same variable.
$a^{-p} = 1/a^p$
Rule 3: To multiply two variables with the same base, we need to add the powers.
$a^p \cdot a^q = a^{p+q}$
Rule 4: To divide two variables with the same base, we need to subtract the power of the denominator from the power of the numerator.
$a^p / a^q = a^{p-q}$
Rule 5: When a variable with some index is again raised to a different index, then both the indices are multiplied together and raised to the same base.
$(a^p)^q = a^{pq}$
Rule 6: When two variables with different bases, but the same index, are multiplied together, we multiply the bases and raise the product to the same index.
$a^p \cdot b^p = (ab)^p$
Rule 7: When two variables with different bases, but the same index, are divided, we divide the bases and raise the quotient to the same index.
$a^p / b^p = (a/b)^p$
Rule 8: An index in the form of a fraction can be represented in radical form.
$a^{p/q} = \sqrt[q]{a^p}$
Example: $6^{1/2} = \sqrt{6}$
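The laws of indices can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `check_index_laws` is hypothetical, and exact fractions are used so that negative indices do not introduce rounding errors.

```python
from fractions import Fraction


def check_index_laws(a: int, b: int, p: int, q: int) -> bool:
    """Return True if the index laws hold for non-zero bases a, b and integer indices p, q."""
    a, b = Fraction(a), Fraction(b)  # exact arithmetic, also for negative indices
    return all([
        a ** 0 == 1,                      # a^0 = 1
        a ** -p == 1 / a ** p,            # a^-p = 1/a^p
        a ** p * a ** q == a ** (p + q),  # a^p . a^q = a^(p+q)
        a ** p / a ** q == a ** (p - q),  # a^p / a^q = a^(p-q)
        (a ** p) ** q == a ** (p * q),    # (a^p)^q = a^(pq)
        a ** p * b ** p == (a * b) ** p,  # a^p . b^p = (ab)^p
        a ** p / b ** p == (a / b) ** p,  # a^p / b^p = (a/b)^p
    ])


print(check_index_laws(2, 5, 3, 4))

# Worked simplification: (3^10 x 10^8) / (3^6 x 10^5) = 3^4 x 10^3
print(Fraction(3 ** 10 * 10 ** 8, 3 ** 6 * 10 ** 5) == 3 ** 4 * 10 ** 3)
```

Both checks are expected to print `True`; the second line subtracts the powers of equal bases, giving $3^4 \times 10^3 = 81000$.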
1.1.2 Introduction to Surds, Properties and Application in Real Life
In Mathematics, surds are root values that cannot be further simplified into whole numbers or integers. Surds are irrational numbers. Examples of surds are √2, √3, √5, etc., as these values cannot be further simplified. If we evaluate them, we get non-terminating decimal values, such as:
√2 = 1.4142135…
√3 = 1.7320508…
√5 = 2.2360679…
Surds Definition
Surds are the square roots (√) of numbers that cannot be simplified into a whole or rational number. A surd cannot be accurately represented as a fraction. In other words, a surd is a root of a whole number that has an irrational value. Consider an example, √2 ≈ 1.414213. It is more accurate if we leave it as a surd √2.
In the set of irrational numbers some are algebraic (like $\sqrt{2}$, $\sqrt[3]{9}$ etc.) and these numbers are termed surds. Of course, non-algebraic irrational numbers, such as π, e etc. are not surds. Thus, all surds are irrational numbers, but not all irrational numbers are surds.
Let m and p respectively denote any positive rational number and any positive
A surd is called a pure surd if no rational number (except 1) can be extracted out of its radical (e.g. $\sqrt{5}$, $\sqrt[3]{6}$ etc.). Any surd which is not pure is known as a mixed surd. Two mixed surds are called similar if their irrational factors are the same.
The order of a surd is indicated by the index of the radical present in the surd. Thus $\sqrt{5}$, $\sqrt[3]{6}$ and $\sqrt[4]{7}$ are respectively surds of the second, third and fourth order. A given surd is called a
Types of Surds
Simple Surds – A surd that has only one term is called a simple surd. Example: √2, √5, …
Similar Surds – Surds having the same common surd factor
Mixed Surds – Surds that are not completely irrational and can be expressed as a product of a rational number and an irrational number
Compound Surds – An expression which is the addition or subtraction of two or more surds
Binomial Surds – A surd that is made of two other surds
1. Mixed Surds: A surd of the form $k\sqrt[n]{a}$, where k is a rational number, $k \neq 0$ and $k \neq 1$, is called a mixed surd. For e.g. $2\sqrt[3]{7}$, $8\sqrt[3]{7}$ etc. are mixed surds.
2. Pure Surds: A surd with no rational factor other than 1 is called a pure surd. For e.g. $\sqrt{3}$, $\sqrt{5}$ and $\sqrt[3]{2}$ etc. are pure surds.
Rules of Surds:
Given below are some rules that one needs to follow:
The root of a positive real quantity is called a surd if its value cannot be exactly determined.
$\sqrt{a} \times \sqrt{a} = a$ ⇒ $\sqrt{5} \times \sqrt{5} = 5$
The sum and the difference of two quadratic surds, for example $\sqrt{a} + \sqrt{b}$ and $\sqrt{a} - \sqrt{b}$, are called complementary or conjugate surds of each other.
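As an illustration of the rules above, the product of two conjugate surds is always rational, because $(\sqrt{a} + \sqrt{b})(\sqrt{a} - \sqrt{b}) = a - b$. The sketch below checks this numerically; the helper name `conjugate_product` is hypothetical, and floating-point values are compared with a tolerance because surds have no exact decimal form.

```python
import math


def conjugate_product(a: float, b: float) -> float:
    """Multiply the conjugate surds (sqrt(a) + sqrt(b)) and (sqrt(a) - sqrt(b))."""
    return (math.sqrt(a) + math.sqrt(b)) * (math.sqrt(a) - math.sqrt(b))


# sqrt(5) x sqrt(5) = 5, up to floating-point rounding
print(math.isclose(math.sqrt(5) * math.sqrt(5), 5))

# (sqrt(5) + sqrt(3)) x (sqrt(5) - sqrt(3)) = 5 - 3 = 2
print(math.isclose(conjugate_product(5, 3), 2))
```

This is why a denominator such as $\sqrt{5} - \sqrt{3}$ is usually rationalised by multiplying numerator and denominator by its conjugate.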
1.1.3 Introduction to Logarithms, Properties and Applications of Common Logarithms
Meaning of logarithm
If $a^x = n$, where a and n are positive real numbers such that $a \neq 1$, then x is said to be the logarithm of the number n to the base a. Symbolically it can be expressed as follows:
$\log_a n = x$
For example
1. $4^3 = 64 \Rightarrow \log_4 64 = 3$
2. $3^3 = 27 \Rightarrow \log_3 27 = 3$
Example 1 Find the value of $\log_2 8 + \log_2 4$.
Solution:
$\log_2 8 + \log_2 4 = \log_2 2^3 + \log_2 2^2$
$= 3\log_2 2 + 2\log_2 2$
$= 3 + 2$ (since $\log_a a = 1$)
$= 5$
Example 2 Find the value of $\log_3 162 - \log_3 2$.
Solution:
$\log_3 162 - \log_3 2 = \log_3 \dfrac{162}{2}$
$= \log_3 81$
$= \log_3 3^4$
$= 4\log_3 3$ (since $\log_x x = 1$)
$= 4$
Example 3 If $\log_6 (2x - 4) + \log_6 4 = \log_6 40$, then find the value of x.
Solution:
$\log_6 (2x - 4) + \log_6 4 = \log_6 40$
$\log_6 4(2x - 4) = \log_6 40$
$8x - 16 = 40$
$8x = 56$
$x = 7$
Natural Logarithm: Logarithms of numbers to the base e are known as natural logarithms.
Common Logarithm: Logarithms of numbers to the base 10 are known as common logarithms.
log N or $\log_{10} N$: Decadic logarithms and decimal logarithms are other names for common logarithms.
If log N = x, then this logarithmic form can be represented in exponential form, i.e., $10^x = N$.
The Richter scale, which is used to measure earthquakes, and the decibel scale, which is used to measure sound, are both usually expressed in logarithmic form. The common logarithm is so widely used that if you find no base written, you can assume it is $\log_{10} x$, the common log.
Laws of Logarithm
$\log_a (mn) = \log_a m + \log_a n$
$\log_a \left(\dfrac{m}{n}\right) = \log_a m - \log_a n$
$\log_a (m^n) = n \log_a m$
$\log_a m = \dfrac{1}{\log_m a}$
$\log_a m = \dfrac{\log_b m}{\log_b a}$
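The laws of logarithm and the worked examples of this section can be verified with Python's standard `math.log`, which accepts an optional base as its second argument. This is only a sketch; the helper name `log_base` is an assumption introduced for readability, and results are compared with a tolerance because floating-point logarithms are approximate.

```python
import math


def log_base(a: float, n: float) -> float:
    """Logarithm of n to the base a, for a > 0, a != 1 and n > 0."""
    return math.log(n, a)


m, n, a, b = 8.0, 4.0, 2.0, 10.0

# Product, quotient and power laws
print(math.isclose(log_base(a, m * n), log_base(a, m) + log_base(a, n)))
print(math.isclose(log_base(a, m / n), log_base(a, m) - log_base(a, n)))
print(math.isclose(log_base(a, m ** 3), 3 * log_base(a, m)))

# Reciprocal and change-of-base laws
print(math.isclose(log_base(a, m), 1 / log_base(m, a)))
print(math.isclose(log_base(a, m), log_base(b, m) / log_base(b, a)))

# Example 1: log2 8 + log2 4 = 5; Example 2: log3 162 - log3 2 = 4
print(math.isclose(log_base(2, 8) + log_base(2, 4), 5))
print(math.isclose(log_base(3, 162) - log_base(3, 2), 4))
```

Each line is expected to print `True`.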
1.1.4 Introduction to Equations - Linear and Quadratic, Roots of Quadratic Equation
An equation having the form $ax + b = 0$, where $a \neq 0$ and a, b are arbitrary constants, is called a linear equation in x. This equation is also called a first degree equation, the highest power of the unknown x being 1. Again, two linear equations of the form $a_1x + b_1y + c_1 = 0$, $a_2x + b_2y + c_2 = 0$ are known as simultaneous equations in x and y.
Linear equations are widely used in Physics, Engineering and Mathematics, partly because they can be used to approximate non-linear equations. Linear equations are equations of a straight line and are of the first order.
Equation: An equation is a statement that two algebraic expressions are equal.
For example,
$3x + 4 = 16$
$x^2 - 7x + 12 = 0$
$x - y = -1$
$2x - y = 7$
In each of the above equations, the parts separated by the sign = are termed the sides of the equation.
Quadratic equation
If $f(x)$ is a quadratic polynomial, then $f(x) = 0$ is known as a quadratic equation. The general form of a quadratic equation is $ax^2 + bx + c = 0$, where $a, b, c \in \mathbb{R}$ and $a \neq 0$.
Roots of an equation
The values of the variable which satisfy a given equation are known as the roots of the equation, i.e., if $f(x) = 0$ is a polynomial equation and $f(a) = 0$, then a is a root of $f(x) = 0$.
The roots of the quadratic equation $ax^2 + bx + c = 0$ are $x = \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}$, where $b^2 - 4ac$ is known as the discriminant and is denoted by D.
If the value of the discriminant is negative, then the quadratic equation has no real roots.
If the value of the discriminant is positive, then the quadratic equation has two distinct real roots.
Nature of roots
The roots are irrational if a, b, c are rational and D is positive but not a perfect square.
For Ex. $3x^2 + 2x + 4 = 0$, $5x^2 + 8x + 9 = 0$ and $2x^2 + 8x + 13 = 0$ are quadratic equations.
1.1.5 Methods to Solve Quadratic Equations
Ex. 1.15: Solve the equation $x^2 + 6x + 8 = 0$.
Solution:
$x^2 + 6x + 8 = x^2 + 2x + 4x + 8$
$= (x^2 + 2x) + (4x + 8)$
$= x(x + 2) + 4(x + 2)$
$= (x + 2)(x + 4)$
$\Rightarrow x = -2, -4$
Ex. 1.16: Solve the equation $x^2 + 4x - 21 = 0$.
Solution:
$x^2 + 4x - 21 = x^2 + 7x - 3x - 21$
$= (x^2 + 7x) - (3x + 21)$
$= x(x + 7) - 3(x + 7)$
$= (x + 7)(x - 3)$
$\Rightarrow x = -7, 3$
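Ex. 1.17: Solve the equation $3x^2 - 10x + 3 = 0$.
Solution: Here, a = 3, b = -10, c = 3
So, the roots are
$x = \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
$= \dfrac{10 \pm \sqrt{100 - 36}}{6}$
$= \dfrac{10 \pm \sqrt{64}}{6}$
$= \dfrac{10 \pm 8}{6}$
$= \dfrac{10 + 8}{6}$ or $\dfrac{10 - 8}{6}$
$= 3$ or $\dfrac{1}{3}$
Ex. 1.18: Solve the equation $2x^2 + 2x - 3 = 0$.
Solution: Here, a = 2, b = 2, c = -3
So, the roots are
$x = \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
$= \dfrac{-2 \pm \sqrt{4 + 24}}{4}$
$= \dfrac{-2 \pm \sqrt{28}}{4}$
$= \dfrac{-2 \pm 2\sqrt{7}}{4}$
$= \dfrac{-1 + \sqrt{7}}{2}$ or $\dfrac{-1 - \sqrt{7}}{2}$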
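The quadratic formula used in the examples above can be sketched in a few lines of Python. The function name `solve_quadratic` is hypothetical, and returning an empty tuple when the discriminant is negative is a design choice made here to mirror the statement that no real roots exist in that case.

```python
import math


def solve_quadratic(a: float, b: float, c: float) -> tuple:
    """Return the real roots of ax^2 + bx + c = 0 using the discriminant D = b^2 - 4ac."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic equation")
    d = b * b - 4 * a * c  # discriminant
    if d < 0:
        return ()  # no real roots
    root_d = math.sqrt(d)
    return ((-b + root_d) / (2 * a), (-b - root_d) / (2 * a))


print(solve_quadratic(1, 6, 8))    # roots of x^2 + 6x + 8 = 0
print(solve_quadratic(3, -10, 3))  # roots of 3x^2 - 10x + 3 = 0
print(solve_quadratic(3, 2, 4))    # negative discriminant
```

When D = 0 the two returned values coincide, which corresponds to a repeated root.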
1.1.6 Introduction to Simultaneous Equations and Methods to Solve them with Two or Three Unknowns
Simultaneous (linear) equations
Two linear equations of the form $a_1x + b_1y + c_1 = 0$, $a_2x + b_2y + c_2 = 0$ are known as simultaneous equations in x and y.
So, some solutions of the linear equations $x + y = 5$ and $x + y = 2$ are:
x: -10, -5, 0, 5, 10
y = 5 − x: 15, 10, 5, 0, -5
x: -4, -2, 0, 2, 4
y = 2 − x: 6, 4, 2, 0, -2
[Figure: graphs of the two equations; the blue line shows the graph of $x + y = 5$ and the red line shows the graph of $x + y = 2$.]
Methods of solving simultaneous equations
There are three methods to solve any simultaneous equations. These methods are as follows:
1. Method of elimination by substitution.
Step 1:
We obtain the value of one variable in terms of the other from any of the given two equations.
Step 2:
Substitute the value of the variable, obtained in step 1, in the other equation and solve it.
Step 3:
Substitute the value of the variable, obtained in step 2, in the result of step 1 and get the value of the remaining unknown variable.
2. Method of elimination by equating coefficients.
Step 1:
Multiply one or both of the given equations by suitable non-zero numbers so that the coefficients of one of the variables become numerically equal.
Step 2:
Add both the equations, as obtained in step 1, or subtract one equation from the other, so that the terms with equal numerical coefficients cancel mutually.
Step 3:
Solve the resulting equation to find the value of one of the unknowns.
Step 4:
Substitute this value in any of the two given equations and find the value of the other unknown.
3. Method of cross-multiplication.
Step 1.
Write the given equations in the form
$a_1x + b_1y + c_1 = 0$
and, $a_2x + b_2y + c_2 = 0$
Step 2.
Write the equations in the following form in order to obtain the solution of the given equations.
$\dfrac{x}{b_1c_2 - b_2c_1} = \dfrac{y}{c_1a_2 - c_2a_1} = \dfrac{1}{a_1b_2 - a_2b_1}$
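The cross-multiplication formula translates directly into code. The sketch below assumes integer coefficients and both equations written in the form $a x + b y + c = 0$; the function name `solve_by_cross_multiplication` is hypothetical. It is applied to the pair $x - y = -1$, $2x - y = 7$ quoted earlier in this unit.

```python
from fractions import Fraction


def solve_by_cross_multiplication(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y + c1 = 0 and a2*x + b2*y + c2 = 0 for integer coefficients."""
    denominator = a1 * b2 - a2 * b1
    if denominator == 0:
        raise ValueError("the equations have no unique solution")
    x = Fraction(b1 * c2 - b2 * c1, denominator)
    y = Fraction(c1 * a2 - c2 * a1, denominator)
    return x, y


# x - y = -1  ->  x - y + 1 = 0 ;  2x - y = 7  ->  2x - y - 7 = 0
x, y = solve_by_cross_multiplication(1, -1, 1, 2, -1, -7)
print(x, y)
```

Substituting back, $8 - 9 = -1$ and $2 \times 8 - 9 = 7$, so both equations are satisfied by x = 8, y = 9.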
1.1.7 Introduction to Inequalities with Graphs
Mathematical sentences of the type $5 \neq 7$, $7 > 5$, $-1 < 2$, $3y < 15$, $5t > 20$ are called inequalities or inequations. These sentences say that one thing is not equal to another.
Universal set
If the variable x in the equation (a) is replaced by the number 4, it yields a true statement. We say that the equation is satisfied. Similarly, the inequation (b) is satisfied
Sometimes it is necessary to specify a set from which the replacement for the variable should be chosen. For example, if we write the open sentence "_______ is a beautiful city", it would be difficult to give any precise answer, unless the set of cities is specified.
The set of elements from which the replacement for the variable is taken is called the replacement set.
Solution Set or Truth Set
Consider the inequation $1 + x > 5$. We can obtain the following solutions for the replacement sets shown.
Replacement set          Solution set
A = {0, 1, 2, 5, 7, 8}   {5, 7, 8}
B = {0, 1, 2, 3, 4, 6}   {6}
C = {0, 1, 2, 3, 4}      φ
The above example illustrates the fact that the solution of an inequation depends upon the replacement set used. It also follows from the above that an inequation may have one, many or no solutions, depending upon the replacement set. The solution or solutions of a given inequation from a set form what we call the solution set or the truth set. It is obviously a subset of the replacement set.
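The dependence of the solution set on the replacement set can be shown with a short sketch. The function name `solution_set` is hypothetical; it simply keeps those elements of the replacement set for which the inequation is true.

```python
def solution_set(replacement_set, condition):
    """Return the elements of the replacement set that satisfy the inequation."""
    return {x for x in replacement_set if condition(x)}


def inequation(x):
    """The inequation 1 + x > 5 from the text."""
    return 1 + x > 5


A = {0, 1, 2, 5, 7, 8}
B = {0, 1, 2, 3, 4, 6}
C = {0, 1, 2, 3, 4}

for name, s in (("A", A), ("B", B), ("C", C)):
    print(name, sorted(solution_set(s, inequation)))
```

For C the result is the empty set, and in every case the solution set is a subset of the replacement set.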
Properties:
1. Adding the same number to, or subtracting the same number from, both sides of an inequality produces an equivalent inequality.
For Ex.: $x - 3 > 5$ is equivalent to
$x - 3 + 3 > 5 + 3$
$x > 8$
2. Multiplying or dividing both sides of an inequality by the same positive number produces an equivalent inequality.
For Ex.: $\dfrac{x}{3} > 2$ is equivalent to
$\dfrac{x}{3} \times 3 > 2 \times 3$
$x > 6$
3. Multiplying both sides of an inequality by the same negative number produces an inequality with its direction reversed.
4. (i) Squaring both sides of an inequality produces an inequality in the same direction if both sides are positive.
In other words,
If $x > y$, $x > 0$, $y > 0$ then $x^2 > y^2$
For Ex.: $5 > 3$ produces
$5^2 > 3^2$
$25 > 9$
(ii) Squaring both sides of an inequality produces an inequality with its direction reversed if both sides are negative.
In other words,
If $x > y$, $x < 0$, $y < 0$ then $x^2 < y^2$ or
If $x < y$, $x < 0$, $y < 0$ then $x^2 > y^2$
For Ex.: $-3 > -5$ produces
$(-3)^2 < (-5)^2$
$9 < 25$
Graphing the inequality
Place the variable on the left; the inequality symbol then points in the direction to be shaded. The arrow at the end of the number line matches the inequality sign.
a. When the ≤ or ≥ signs are used, the point is included → use the solid circle •.
b. When the < or > signs are used, the point is not included → use the open circle ○.
c. The line goes to the right if > (greater than) is used.
d. The line goes to the left if < (less than) is used.
Here are some examples of inequalities, which show how such inequalities are represented.
[Figure: number-line graphs of example inequalities]
Check your Understanding
1. ________ of a variable (or a constant) is a value that is raised to the power of the variable.
2. __________ are the square roots (√) of numbers that cannot be simplified into a whole or rational number.
3. All surds are irrational numbers, but not all irrational numbers are surds. True / False
7. An equation having the form $ax + b = 0$, where $a \neq 0$ and a, b are arbitrary constants, is called a __________ equation in x.
8. Linear equations are equations for a ___________ line and are of the first order.
9. The set of elements from which the replacement for the variable is taken is called the _____________ set.
10. The values of the variable which satisfy a given equation are known as __________ of the equation.
Multiple Choice Questions
1. Solve $4^{\log_9 3} + 9^{\log_2 4} = 10^{\log_x 83}$
a. 3
b. 11
c. 4
d. 9
2. Find the value of $\left(\dfrac{x^a}{x^b}\right)^{a+b} \times \left(\dfrac{x^b}{x^c}\right)^{b+c} \times \left(\dfrac{x^c}{x^a}\right)^{c+a}$
a. 1
b. 6
c. 9
d. 0
3. If $x = \dfrac{\sqrt{3}}{2}$, then find the value of $\dfrac{\sqrt{1+x} + \sqrt{1-x}}{\sqrt{1+x} - \sqrt{1-x}}$
a. √3
b. √5
c. √7
d. 1
4. Find the value of $\log_{0.01} 10000$
a. 2
b. 6
c. -2
d. -8
5. Find the value of $\log_3 \log_2 \log_2 256$
a. 7
b. 5
c. 3
d. 1
Summary
●● The index says that a particular number (the base) is to be multiplied by itself as many times as the index indicates. It is a compact method of writing big numbers and calculations.
●● Surds are the square roots (√) of numbers that cannot be simplified into a whole or rational number. A surd cannot be accurately represented as a fraction. In other words, a surd is a root of a whole number that has an irrational value.
●● Logarithms of numbers to the base e are known as natural logarithms.
●● Logarithms of numbers to the base 10 are known as common logarithms.
●● An equation is a statement that two algebraic expressions are equal.
●● The values of the variable which satisfy a given equation are known as the roots of the equation, i.e., if $f(x) = 0$ is a polynomial equation and $f(a) = 0$, then a is a root of $f(x) = 0$.
●● An equation having the form $ax + b = 0$, where $a \neq 0$ and a, b are arbitrary constants, is called a linear equation in x. This equation is also called a first degree equation, the highest power of the unknown x being 1.
●● There are three methods to solve any simultaneous equations. These methods are as follows:
●● Method of elimination by substitution.
●● Method of elimination by equating coefficients.
●● Method of cross-multiplication.
●● Mathematical sentences of the type $5 \neq 7$, $7 > 5$, $-1 < 2$, $3y < 15$, $5t > 20$ are called inequalities or inequations. These sentences say that one thing is not equal to another.
●● The solution or solutions of a given inequation from a set form what we call the solution set or the truth set. It is obviously a subset of the replacement set.
Activity
1. Simplify: $\dfrac{3^{10} \times 10^8}{3^6 \times 10^5}$
2. Solve the equation: $4x^2 + 5x - 10 = 0$. What are the roots?
Glossary
●● Binomial Surds: A surd that is made of two other surds.
●● Common Logarithm: Logarithms of numbers to the base 10 are known as common logarithms.
●● Compound Surds: An expression which is the addition or subtraction of two or more surds.
●● Index: A value that is raised to the power of the variable (or a constant).
●● Natural Logarithm: Logarithms of numbers to the base e are known as natural logarithms.
●● Simple Surds: A surd that has only one term is called a simple surd.
●● Surds: The square roots (√) of numbers that cannot be simplified into a whole or rational number.
●● Pure Surds: Surds which are completely irrational.
Further Reading
1. McCune, Sandra Luna. McGraw-Hill Education Algebra I Review and Workbook. McGraw Hill; 1st edition (January 8, 2019)
2. Selby, Peter H.; Slavin, Steve. Practical Algebra: A Self-Teaching Guide, Second Edition. John Wiley & Sons; 2nd edition (February 14, 1991)
2. Surds
3. True
4. Equation
5. False
6. Common
7. Linear
8. Straight
9. Replacement
10. Roots
Multiple Choice
1. B 2. A 3. A
4. C 5. D
Module - 2: Data Arrangement
Structure:
2.1 Arithmetic Progression
2.1.1 Definition of Arithmetic Progression and the nth term of AP
2.1.2 Sum of n terms of AP
2.1.3 Representation of terms in AP
2.1.4 Arithmetic mean between a and b
2.1.5 Arithmetic Progression and its applications in business
2.2 Geometric Progression
2.2.1 Definition of Geometric Progression, nth term of GP
2.2.2 Sum of n terms of GP
2.2.3 Representation of terms in GP
2.2.4 Geometric mean between a and b
2.2.5 Geometric Progression and its applications in business
2.3 Permutation and Combination
2.3.1 Introduction to Permutations and Combination and its applications in business
Objectives:
At the end of this unit, you will be able to understand:
●● Definition of Arithmetic Progression and the nth term
●● Sum of n terms of AP
●● Representation of terms in AP
●● Arithmetic mean between a and b
●● Arithmetic Progression and its applications in Business
Introduction
In mathematics, the word sequence is used in the same way as it is in ordinary English. When we say that a collection of objects is listed in a sequence, we usually mean that the collection is ordered in such a way that it has an identified first member, then a second member, then a third member and so on. In this unit, we will learn about arithmetic progression, the nth term of an A.P., the sum of the first n terms of an A.P. and the arithmetic mean.
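2.1.1 Definition of Arithmetic Progression and the nth Term of AP
Arithmetic Progression
An arithmetic progression (A.P.) is a sequence whose terms increase or decrease by a fixed number, called the common difference d, so that
$a_2 - a_1 = a_3 - a_2 = \dots = a_n - a_{n-1} = d$
If a is the first term and d is the common difference, then the A.P. can be written as
$a,\ a + d,\ a + 2d,\ \dots,\ a + (n-1)d$
For example: 4, 8, 12, 16, ......
If the same number is added to or subtracted from each term of an arithmetic progression, then the new series formed will also be an arithmetic series.
If each term of an arithmetic progression is multiplied or divided by the same non-zero number, then the new series formed will also be an arithmetic series.
If corresponding terms of two arithmetic series are added or subtracted, then the new series formed will also be an arithmetic series.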
2.1.2 Sum of n terms of AP
The nth term of an A.P.
Suppose a is the first term, d is the common difference and $a_n$ is the last term of an A.P.; then the nth term is given by
$a_n = a + (n-1)d$, where $d = a_n - a_{n-1}$
The rth term from the end of a finite A.P. is the $(n - r + 1)$th term from the beginning.
The sum of n terms of an A.P.
Suppose there are n terms of a sequence, whose first term is a, the common difference is d and the last term is $a_n$; then the sum of n terms is given by
$S_n = \dfrac{n}{2}\left[2a + (n-1)d\right]$
Example
Find the 10th term of the following sequence:
Sequence: 1, 3, 5, 7, …
Answer:
Given: 1, 3, 5, 7, …
$a_1 = 1$, $a_2 = 3$, $a_3 = 5$, $a_4 = 7$, $a_{10} = ?$
Common difference,
$d_1 = a_2 - a_1 = 3 - 1 = 2$
$d_2 = a_3 - a_2 = 5 - 3 = 2$
$d = d_1 = d_2 = 2$
$a_n = a + (n-1)d$
$a_{10} = 1 + 9d = 1 + 9 \times 2 = 19$
Example
Find the sum of the first 10 terms of the sequence 10, 15, 20, 25, …
Answer:
$a_1 = 10$, $a_2 = 15$, $a_3 = 20$, $a_4 = 25$, $S_{10} = ?$
Common difference,
$d_1 = a_2 - a_1 = 15 - 10 = 5$
$d_2 = a_3 - a_2 = 20 - 15 = 5$
So, the given sequence is an A.P.
$S_n = \dfrac{n}{2}\left[2a + (n-1)d\right]$
$S_{10} = \dfrac{10}{2}\left[2 \times 10 + 9 \times 5\right] = 5 \times 65 = 325$
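The two formulas used in these examples, $a_n = a + (n-1)d$ and $S_n = \frac{n}{2}[2a + (n-1)d]$, can be written as small functions. This is a sketch only; the names `ap_nth_term` and `ap_sum` are hypothetical.

```python
def ap_nth_term(a, d, n):
    """nth term of an A.P. with first term a and common difference d."""
    return a + (n - 1) * d


def ap_sum(a, d, n):
    """Sum of the first n terms of an A.P. with first term a and common difference d."""
    return n * (2 * a + (n - 1) * d) / 2


print(ap_nth_term(1, 2, 10))  # 10th term of 1, 3, 5, 7, ...
print(ap_sum(10, 5, 10))      # sum of the first 10 terms of 10, 15, 20, 25, ...
```

The closed form can be cross-checked against a direct sum of the generated terms, for example `sum(ap_nth_term(10, 5, k) for k in range(1, 11))`.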
2.1.3 Representation of Terms in AP
A progression is a type of sequence for which a formula for the nth term can be found. The Arithmetic Progression is the most commonly used sequence in mathematics.
In AP, we will encounter three main terms, which are denoted as follows: the first term, the common difference between the two terms, and the nth term. Let's suppose $a_1, a_2, a_3, \dots, a_n$ is an AP; then the common difference 'd' can be obtained as:
$d = a_2 - a_1 = a_3 - a_2 = \dots = a_n - a_{n-1}$
The common difference can be positive, negative or zero.
The AP can therefore be written as
$a,\ a + d,\ a + 2d,\ a + 3d,\ a + 4d,\ \dots,\ a + (n-1)d$
where 'a' is the progression's first term.
Consider an Arithmetic Progression to be: $a_1, a_2, a_3, \dots, a_n$
Position   Term   Value
1          a1     a = a + (1-1) d
2          a2     a + d = a + (2-1) d
3          a3     a + 2d = a + (3-1) d
4          a4     a + 3d = a + (4-1) d
.          .      .
.          .      .
n          an     a + (n-1) d
2.1.4 Arithmetic Mean Between a and b
Arithmetic mean $= \dfrac{a + b}{2}$
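Discrete Frequency Distribution: if the value $x_1$ occurs $f_1$ times, the value $x_2$ occurs $f_2$ times, and so on, then the arithmetic mean can be calculated by any of the following methods:
a. Direct method
b. Short-cut method
c. Step-deviation method
These methods are applicable to any type of series.
a. Direct method:
$\bar{x} = \dfrac{\sum fx}{\sum f}$
b. Short-cut method: In this method we first assume any number, say A (often called the assumed mean). Then
$\bar{x} = A + \dfrac{\sum f(x - A)}{\sum f} = A + \dfrac{\sum f(x - A)}{N}$
c. Step-deviation method:
$\bar{x} = A + \dfrac{\sum fu}{N} \times h$, with $u = \dfrac{x - A}{h}$
Where,
A = assumed mean
N = $\sum f$ = total frequency
h = class size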
Example
Evaluate $\sum_{n=1}^{10} 3(-2)^{n-1}$.
Answer: $\sum_{n=1}^{10} 3(-2)^{n-1} = 3 - 6 + 12 - 24 + \dots$
First term, a = 3
Common ratio, r = −2
$S_n = \dfrac{a(1 - r^n)}{1 - r}$
$S_{10} = \dfrac{3\left(1 - (-2)^{10}\right)}{1 - (-2)}$
$= \dfrac{3(1 - 1024)}{3}$
$= -1023$
Example
Insert one arithmetic mean between 3 and 5.
Answer:
Given a = 3 and b = 5
Arithmetic mean $= \dfrac{a + b}{2}$
$= \dfrac{3 + 5}{2}$
$= 4$
If we select terms from an Arithmetic Progression at regular intervals, these terms will also be in Arithmetic Progression.
Example: Mr. Singh wants to start a savings plan. He decides that the bare minimum he should save in any given year is Rs. 100,000. He saves Rs. 100,000 at the end of the first year and saves an additional Rs. 5000 each year after that. How much money will he have saved by the end of his 30th year?
Solution: a = Rs. 100,000
d = Rs. 5000
n = 30
$S_{30} = \dfrac{30}{2}\left[2 \times 100000 + (30 - 1) \times 5000\right] = 15 \times 345000$ = Rs. 5,175,000
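A savings plan of this kind can also be checked year by year. The sketch below assumes that the amount saved in each year exceeds the previous year's amount by Rs. 5000 (so the yearly savings form an A.P.); the function name `total_savings` is hypothetical.

```python
def total_savings(first_year: int, yearly_increase: int, years: int) -> int:
    """Add up the yearly savings, which grow by a fixed amount each year."""
    total = 0
    amount = first_year
    for _ in range(years):
        total += amount            # amount saved at the end of this year
        amount += yearly_increase  # next year's saving is larger by the common difference
    return total


a, d, n = 100_000, 5_000, 30
by_loop = total_savings(a, d, n)
by_formula = n * (2 * a + (n - 1) * d) // 2  # S_n = n/2 [2a + (n - 1)d]
print(by_loop, by_formula)
```

Both approaches give Rs. 5,175,000, which illustrates why the closed-form sum is preferred for long horizons.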
Some real-life applications of an arithmetic progression include:
◌◌ Arithmetic Progression is used in the calculation of straight-line depreciation.
◌◌ Arithmetic Progression is used to forecast any sequence, for example while someone is waiting for a cab. If the traffic is moving at a steady pace, they can predict when the next cab will arrive.
◌◌ Arithmetic Progression is used in pyramid-like patterns where things are constantly changing, among other things, and in a variety of other applications.
◌◌ Online account checking is a fundamental application of everyday arithmetic. With the risk of identity theft in online banking, a general understanding of fundamental mathematics is essential.
◌◌ You can see a real-life application of arithmetic progression when you take a taxi. Once you board a taxi, you will be charged an initial fee followed by a per mile or per kilometre fee. The fares form an arithmetic sequence in which you are charged the starting rate plus a fixed (constant) rate for each kilometre travelled.
4. Arithmetic mean is independent of the change of origin and scale. True / False
5. Arithmetic Progression is used in Pyramid-like patterns where things are constantly changing, among other things, and in a variety of other applications. True / False
Multiple Choice
1. Find the 10th term of the following sequence: 2, 4, 6, 8, .........
a. 12
b. 10
c. 40
d. 20
2. Find the sum of the first 10 terms of the following sequence: 1, 3, 5, 7, .........
a. 250
b. 88
c. 100
d. 125
3. Find the number of all natural numbers between 20 and 80 which are divisible by 3.
a. 15
b. 20
c. 6
d. 18
4. The sum of three terms in A.P. is 33 and their product is 1155. Find the terms.
a. 15, 11, 7
b. 1, 4, 3
c. 6, 8, 4
d. 4, 2, 9
Summary
●● An arithmetic progression (A.P.) is a sequence whose terms increase or decrease by a fixed number, called the common difference.
●● If each term of an arithmetic progression is multiplied or divided by the same non-zero number, then the new series formed will also be an arithmetic series.
●● If corresponding terms of two arithmetic series are added or subtracted, then the new series formed will also be an arithmetic series.
●● A series formed by multiplying or dividing corresponding terms of two arithmetic progressions will not, in general, be an arithmetic series.
●● The terms used in this progression for a given series are the first term, the common difference between the two terms, and the nth term.
●● Arithmetic mean of a group of observations is the quotient obtained by dividing the sum of all the observations by their number.
Activity
1. Rose wants to start saving money. She decides that the minimum amount she needs to save in any year is Rs. 90,000. She deposits Rs. 90,000 at the end of the first year and saves an additional Rs. 8000 each year following that. How much money will she have saved after 25 years?
2. How does one calculate the arithmetic mean using the direct method, the short-cut method and the step-deviation method? Explain the steps and give original example problems to illustrate how each method is used and in which scenarios.
Glossary
●● Progression: A type of sequence for which a formula for the nth term can be found.
2. Mean
3. True
4. False
5. True
Multiple Choice
1. D
2. C
3. B
4. A
Objectives:
At the end of this unit, you will be able to understand:
●● Definition of Geometric Progression and the nth term of GP
●● Sum of terms in GP
●● Representation of terms in GP
●● Geometric mean between a and b
●● Geometric Progression and its Applications in Business
Introduction
In this unit, we will learn about geometric progression, the nth term of a G.P., the sum of the first n terms of a G.P. and the geometric mean.
2.2.1 Definition of Geometric Progression, nth Term of GP
A geometric progression (G.P.) is a sequence of numbers whose first term is non-zero and in which each subsequent term is obtained by multiplying the preceding term by a constant quantity. This constant quantity is called the common ratio of the G.P.
If a is the first term and r is the common ratio, then the G.P. can be written as $a, ar, ar^2, \dots, ar^{n-1}$, where $a \neq 0$.
For example:
3, 6, 12, 24, ……
$a_1 = 3$, $a_2 = 6$, $a_3 = 12$, $a_4 = 24$
(6/3) = (12/6) = (24/12) = 2
2, 6, 18, 54, ……
$a_1 = 2$, $a_2 = 6$, $a_3 = 18$, $a_4 = 54$
(6/2) = (18/6) = (54/18) = 3
The nth term of a G.P.
Suppose a is the first term, r is the common ratio and $a_n$ is the last term of a G.P.; then the nth term is given by
$a_n = ar^{n-1}$
2.2.2 Sum of n Terms of GP
Suppose there are n terms of a sequence, whose first term is a, the common ratio is r, and the last term is $a_n$; then the sum of n terms is given by
$S_n = \dfrac{a(r^n - 1)}{r - 1}$ if $r > 1$, and $S_n = \dfrac{a(1 - r^n)}{1 - r}$ if $r < 1$
Example Find the sum of 5 terms of the G.P.: 3 + 6 + 12 + 24 + ………….
Answer:
Given a = 3
Common ratio $r = \dfrac{6}{3} = 2$
$r > 1$. So, the sum of 5 terms is
$S_n = \dfrac{a(r^n - 1)}{r - 1}$
$S_5 = \dfrac{3(2^5 - 1)}{2 - 1}$
$S_5 = 3 \times 31$
$S_5 = 93$
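The sum formula for a G.P. can be sketched as a single function. The two textbook forms, $\frac{a(r^n - 1)}{r - 1}$ and $\frac{a(1 - r^n)}{1 - r}$, are algebraically equal, so one expression covers every $r \neq 1$; the case r = 1 is handled separately here as an assumption, since every term then equals a. The name `gp_sum` is hypothetical.

```python
def gp_sum(a, r, n):
    """Sum of the first n terms of a G.P. with first term a and common ratio r."""
    if r == 1:
        return a * n  # all n terms are equal to a
    return a * (r ** n - 1) / (r - 1)


print(gp_sum(3, 2, 5))    # 3 + 6 + 12 + 24 + 48
print(gp_sum(3, -2, 10))  # 3 - 6 + 12 - 24 + ... (10 terms)
```

The first call reproduces $S_5 = 93$; the second, with a negative common ratio, gives −1023.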
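Example Find the sum of 10 terms of the G.P.: $1 + \sqrt{3} + 3 + 3\sqrt{3} + \dots$
Answer:
Given a = 1
Common ratio $r = \dfrac{\sqrt{3}}{1} = \sqrt{3}$
$r > 1$. So, the sum of 10 terms is
$S_n = \dfrac{a(r^n - 1)}{r - 1}$
$S_{10} = \dfrac{1\left((\sqrt{3})^{10} - 1\right)}{\sqrt{3} - 1}$
$= \dfrac{3^5 - 1}{\sqrt{3} - 1} \times \dfrac{\sqrt{3} + 1}{\sqrt{3} + 1}$
$= \dfrac{243 - 1}{3 - 1} \times (\sqrt{3} + 1)$
$= 121(\sqrt{3} + 1)$
Hence, the required sum is $121(\sqrt{3} + 1)$.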
The geometric sequence is generally written as $a, ar, ar^2, \ldots$, where a is the first term and r is the common ratio of the sequence. The common ratio can be negative or positive. In a geometric progression, each successive term is obtained by multiplying the preceding term by the common ratio.
Let a be the first term and r be the common ratio of a geometric sequence.
The common ratio of a geometric progression: $r = \dfrac{a_2}{a_1}$
The nth term of a geometric progression with first term a and common ratio r: $a_n = ar^{n-1}$
The sum of an infinite geometric progression: $S_\infty = \dfrac{a}{1 - r}$, where $|r| < 1$.
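The formulas above can be checked with a few lines of Python. This is only an illustrative sketch: the function names `gp_nth_term`, `gp_sum` and `gp_sum_infinite` are hypothetical helpers chosen for this example, not part of the course material.

```python
import math
from fractions import Fraction


def gp_nth_term(a, r, n):
    """Return the nth term a * r**(n - 1) of a G.P. (n counts from 1)."""
    return a * r ** (n - 1)


def gp_sum(a, r, n):
    """Return the sum of the first n terms of a G.P."""
    if r == 1:
        return a * n  # every term equals a
    return a * (r ** n - 1) / (r - 1)


def gp_sum_infinite(a, r):
    """Return a / (1 - r); only valid when |r| < 1."""
    if abs(r) >= 1:
        raise ValueError("the infinite sum exists only for |r| < 1")
    return a / (1 - r)


# 3 + 6 + 12 + 24 + ... : sum of the first 5 terms
print(gp_sum(3, 2, 5))
# 1 + sqrt(3) + 3 + ... : sum of the first 10 terms, compared with 121(sqrt(3) + 1)
print(gp_sum(1, math.sqrt(3), 10), 121 * (math.sqrt(3) + 1))
# 5th term of the G.P. starting 125, 25, ... (exact arithmetic with Fraction)
print(gp_nth_term(125, Fraction(1, 5), 5))
```

Using `Fraction` for the common ratio keeps results such as 1/5 exact instead of producing a rounded decimal.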
2.2.4 Geometric Mean Between a and b
The geometric mean is defined as the nth root of the product of n numbers, i.e., for a set of numbers $a_1, a_2, a_3, \ldots, a_n$, the geometric mean is defined as
Geometric mean $= \sqrt[n]{a_1 a_2 a_3 \cdots a_n}$
Also, if we insert one geometric mean between two numbers a and b, then
Geometric mean $= \sqrt{ab}$
2.2.5 Geometric Progression and its Applications in Business
●● Geometric progression is commonly used to calculate interest earned.
●● Geometric progression is also widely used to calculate the balance in a savings account.
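As a rough illustration of the savings-account point, suppose interest is compounded once a year at a fixed rate (this compounding scheme, the function name `yearly_balances` and the sample figures are assumptions made for the example). The year-end balances then form a G.P. whose common ratio is 1 + rate.

```python
def yearly_balances(principal, rate, years):
    """Return the year-end balances under annual compounding (assumed scheme)."""
    balances = []
    balance = principal
    for _ in range(years):
        balance = balance * (1 + rate)  # each balance is the previous one times (1 + rate)
        balances.append(balance)
    return balances


# Hypothetical figures: Rs. 1000 deposited at 10% per year for 3 years
balances = yearly_balances(1000, 0.10, 3)
print(balances)
# Consecutive balances share the same ratio, so they are in geometric progression
print(balances[1] / balances[0], balances[2] / balances[1])
```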
Solved Examples:
Example What is the geometric mean of 10, 51.2 and 8?
Answer:
Product of all given numbers = 10 × 51.2 × 8
= 4096
G.M. $= \sqrt[3]{4096}$
$= \sqrt[3]{16^3}$
= 16
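The calculation above can be reproduced with a short sketch. The helper name `geometric_mean` is a hypothetical choice for this example; because floating-point roots are approximate, the result is compared with a tolerance rather than tested for exact equality.

```python
import math


def geometric_mean(values):
    """Return the nth root of the product of the n given positive numbers."""
    product = math.prod(values)
    return product ** (1 / len(values))


gm = geometric_mean([10, 51.2, 8])
print(gm)                      # very close to 16
print(math.isclose(gm, 16))    # tolerance-based comparison

# One geometric mean between a and b is sqrt(a * b)
print(geometric_mean([2, 32]))
```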
Answer:
G.M. $= \sqrt[5]{59049}$
$= \sqrt[5]{9^5}$
= 9
Example 3.9: If the first two terms of a G.P. are 125 and 25, find its 5th term.
Answer:
Given a = 125 and ar = 25
∴ Common ratio $r = \dfrac{ar}{a} = \dfrac{25}{125} = \dfrac{1}{5}$
So, the fifth term of the G.P. is
$ar^4 = 125 \times \left(\dfrac{1}{5}\right)^4 = \dfrac{1}{5}$
Hence, the fifth term is $\dfrac{1}{5}$.
2. A geometric progression is a sequence where each term has the same ________
ratio, known as the common ratio.
3. The geometric sequence is generally written as a, ar, ar2..., where a is the first term
5. The geometric ________ is defined as the nth root of the product of n numbers
6. Geometric progression is a common method for calculating ______________
earned.
8. In a geometric progression, each successive term is obtained by multiplying the
common ratio by the following term. True / False
Multiple Choice
1. If the first two terms of a G.P. are 125 and 25, find its 5th term.
a. 1/5
b. 5
c. 1
d. 2
2. Find the sum of 5 terms of the G.P.: 3 + 6 + 12 + 24 ………….
a. 65
b. 78
c. 44
d. 93
3. Find the sum of 10 terms of the geometric progression $1 + \sqrt{3} + 3 + 3\sqrt{3} + \ldots$
a. 154(√3+3)
b. 100(√3+5)
c. 121(√3+1)
d. 145(√3+3)
4. The sum of few terms of any ratio series is 728, if common ratio is 3 and last term is
c. 9
d. 22
5. Calculate the following geometric series: $5 + \dfrac{5}{3} + \dfrac{5}{9} + \dfrac{5}{27} + \ldots$
a. 7.5
b. 9
c. 4
d. 3.5
Summary
●● The geometric sequence is generally written as a, ar, ar², ..., where a is the first term and r is the common ratio of the sequence. The common ratio can be negative or positive. In a geometric progression, each successive term is obtained by multiplying the preceding term by the common ratio.
●● The geometric mean is defined as the nth root of the product of n numbers.
●● Characteristics of Geometric Progression
●● If each term of a geometric series is multiplied or divided by the same non-zero quantity, then the new series formed will also be in geometric progression.
●● The series formed by multiplying or dividing the corresponding terms of two geometric series will also be in geometric progression.
●● The series formed by the reciprocals of the terms of a geometric series is also in geometric progression.
Activity
1. The 6th term and 8th term of a G.P. are 32 and 128 respectively. Find the common ratio of the G.P.
2. If three geometric means are inserted between 2 and 32, find the value of the third geometric mean.
Glossary
●● Common Ratio: The constant quantity by which each term of a G.P. is multiplied to obtain the next term.
●● Geometric Progression: A sequence of numbers whose first term is non-zero and in which each subsequent term is obtained by multiplying the preceding term by a constant quantity.
Further Readings
1. Geometric
2. Fixed
3. Common
4. True
5. Mean
6. Interest
7. True
8. False
Multiple Choice
1. A
2. D
3. C
4. B
5. A
Objectives:
At the end of this unit, you will be able to understand:
●● Introduction to Permutations and Combination and its applications in Business
Introduction
Many problems in today's business world can be solved with the help of permutations and combinations. This unit covers such problems in depth and shows how to solve them.
Applications in Business
Each of the different arrangements which can be made by taking some or all of a number of things is called a permutation. E.g., the arrangements of the 3 objects (a, b, c) taken 2 at a time are ab, bc, ca, cb, ac, ba, so the total number of arrangements is 6, each of which is known as a permutation.
Meaning of $^nP_r$
The number of permutations of n distinct objects taken r at a time is denoted by $^nP_r$.
$^nP_r = \dfrac{n!}{(n-r)!},\ \forall\ 0 \le r \le n$
$= n(n-1)(n-2)\cdots(n-r+1),\ \forall\ n \in N$ and $r \in W$
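The definition can be illustrated by listing the arrangements directly. The sketch below uses Python's standard `itertools.permutations` and `math.perm`; the helper name `n_p_r` is a hypothetical choice for this example.

```python
from itertools import permutations
from math import factorial, perm


def n_p_r(n, r):
    """Number of permutations of n distinct objects taken r at a time: n! / (n - r)!."""
    return factorial(n) // factorial(n - r)


# All arrangements of the objects a, b, c taken 2 at a time
arrangements = ["".join(p) for p in permutations("abc", 2)]
print(arrangements)
print(len(arrangements), n_p_r(3, 2), perm(3, 2))
```

The three counts agree: listing the arrangements, applying the factorial formula and calling the built-in `math.perm` all give 6.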
Properties of $^nP_r$
●● $^{n-1}P_r = (n - r) \cdot {}^{n-1}P_{r-1}$
Application of Permutation and Combination
Many problems in today's business world can be solved with the help of permutations and combinations. A few of them are discussed below:
Ex.: It is required to seat 5 men and 4 women in a row so that the women occupy the even places. How many such arrangements are possible?
Solution: We are given that there are 5 men and 4 women, so there are 9 places in all.
The even positions are the 2nd, 4th, 6th and 8th places.
These four places can be occupied by the 4 women in P(4, 4) ways = 4! = 4 × 3 × 2 × 1 = 24.
The remaining 5 places can be occupied by the 5 men in P(5, 5) ways = 5! = 120.
Therefore, the required number of arrangements = 24 × 120 = 2880.
Ex.: In how many different ways can five males and five females form a circle for a business meeting such that the males and females sit alternately?
Solution: After fixing one male at the table, the remaining 4 males can be arranged in 4! ways, but males and females are to sit alternately.
There will be 5 places, one between every two males, and these 5 places can be filled by the 5 females in 5! ways.
The required number of ways = 4! × 5! = 2880.
Ex.: A board meeting of a company is organised in a room for 24 persons along the two sides of a table with 12 chairs on each side. 6 persons want to sit on a particular side and 3 persons want to sit on the other side. In how many ways can they be seated?
Solution: The 6 persons can be arranged on the 12 chairs on the particular side in $^{12}P_6$ ways and the 3 persons can be arranged on the 12 chairs on the other side in $^{12}P_3$ ways.
Remaining persons = 24 – 6 – 3 = 15
Now the remaining 15 persons can be arranged on the remaining 15 chairs in $^{15}P_{15}$, i.e., 15! ways.
Therefore, the required number of ways = $^{12}P_6 \times {}^{12}P_3 \times 15!$
Ex.: Mr. Rajan has 10 employees and he wants to invite 6 of them to a party. In how many ways can he do so if 3 particular employees never attend the party?
Solution: Excluding the 3 particular employees, Mr. Rajan can invite 6 employees from the remaining 7 employees. This can be done in $^7C_6$ = 7 ways.
Each of the different groups or selections which can be made by taking some or all of a number of given things, without reference to the order of the things in each group, is called a combination. E.g., the groups made by taking 2 objects at a time from the three objects (a, b, c) are ab, bc, ca, so the number of groups is 3, each of which is known as a combination.
Meaning of $^nC_r$
$^nC_r = \dfrac{n!}{r!\,(n-r)!} = \dfrac{^nP_r}{r!},\ \forall\ 0 \le r \le n$
$= \dfrac{n(n-1)(n-2)\cdots(n-r+1)}{r(r-1)(r-2)\cdots 2 \cdot 1},\ \forall\ n \in N$ and $r \in W$
Properties of $^nC_r$
●● $^nC_r$ is a natural number.
●● $^nC_0 = {}^nC_n = 1$, $^nC_1 = n$
●● $^nC_r = {}^nC_{n-r}$
●● $^nC_r + {}^nC_{r-1} = {}^{n+1}C_r$
●● $^nC_0 + {}^nC_1 + {}^nC_2 + \ldots + {}^nC_n = 2^n$
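The properties listed above can be spot-checked numerically. The following sketch is not a proof: it only verifies the identities for one chosen n. The names `n_c_r` and `check_properties` are hypothetical helpers for this example; `math.comb` and `math.perm` are standard library functions.

```python
from math import comb, factorial, perm


def n_c_r(n, r):
    """Number of combinations of n distinct objects taken r at a time."""
    return factorial(n) // (factorial(r) * factorial(n - r))


def check_properties(n):
    """Return True if the listed properties of nCr hold for this n."""
    ok = comb(n, 0) == comb(n, n) == 1 and comb(n, 1) == n
    for r in range(1, n + 1):
        ok = ok and n_c_r(n, r) == comb(n, r) == perm(n, r) // factorial(r)
        ok = ok and comb(n, r) == comb(n, n - r)
        ok = ok and comb(n, r) + comb(n, r - 1) == comb(n + 1, r)
    ok = ok and sum(comb(n, r) for r in range(n + 1)) == 2 ** n
    return ok


print(n_c_r(3, 2))           # groups of 2 chosen from a, b, c
print(check_properties(10))
```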
Ex.1.32: Find n, if $^{n-1}P_3 : {}^nP_4 = 1 : 9$
Solution:
$^{n-1}P_3 : {}^nP_4 = 1 : 9$
$\dfrac{(n-1)!}{(n-1-3)!} : \dfrac{n!}{(n-4)!} = 1 : 9$
$\dfrac{(n-1)!}{(n-4)!} : \dfrac{n(n-1)!}{(n-4)!} = 1 : 9$
$\dfrac{(n-1)!}{(n-4)!} \times \dfrac{(n-4)!}{n(n-1)!} = \dfrac{1}{9}$
$\dfrac{1}{n} = \dfrac{1}{9}$
n = 9
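As a cross-check of the algebra, one can simply search for the values of n that satisfy the ratio. The function name `find_n` and the search limit of 50 are assumptions made for this sketch.

```python
from math import perm


def find_n(max_n=50):
    """Return every n in [4, max_n] with (n-1)P3 : nP4 = 1 : 9."""
    return [n for n in range(4, max_n + 1) if 9 * perm(n - 1, 3) == perm(n, 4)]


print(find_n())
```

The comparison is written as a cross-multiplication so that only integers are involved.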
Ex.: In how many of the distinct permutations of the letters in the word 'MISSISSIPPI' do the four I's not come together?
Solution: The word 'MISSISSIPPI' has 11 letters, in which the 4 I's, the 4 S's and the 2 P's are alike. So, the total number of distinct permutations is
$\dfrac{11!}{4!\,4!\,2!}$
If all the I's are together, they are considered as one letter, which together with the remaining 7 letters makes 8 letters. So, the number of such permutations is
$\dfrac{8!}{4!\,2!}$
Hence, the number of arrangements in which the four I's do not come together is
$\dfrac{11!}{4!\,4!\,2!} - \dfrac{8!}{4!\,2!}$
$= \dfrac{11 \times 10 \times 9 \times 8 \times 7 \times 6 \times 5 \times 4!}{4!\,4!\,2!} - \dfrac{8 \times 7 \times 6 \times 5 \times 4!}{4!\,2!}$
$= 34650 - 840$
$= 33810$
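The count can also be confirmed by brute force. The sketch below generates every distinct arrangement of the letters and counts those that do not contain the block IIII; the generator name `distinct_permutations` is a hypothetical helper, and the approach is only practical for short words such as this one.

```python
from collections import Counter
from math import factorial


def distinct_permutations(word):
    """Yield each distinct arrangement of the letters of word exactly once."""
    counts = Counter(word)

    def build(prefix, remaining):
        if remaining == 0:
            yield prefix
            return
        for letter in counts:
            if counts[letter] > 0:
                counts[letter] -= 1          # use one copy of this letter
                yield from build(prefix + letter, remaining - 1)
                counts[letter] += 1          # put it back for the next branch

    yield from build("", len(word))


total = 0
not_together = 0
for arrangement in distinct_permutations("MISSISSIPPI"):
    total += 1
    if "IIII" not in arrangement:
        not_together += 1

formula_total = factorial(11) // (factorial(4) * factorial(4) * factorial(2))
formula_together = factorial(8) // (factorial(4) * factorial(2))
print(total, formula_total)
print(not_together, formula_total - formula_together)
```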
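Ex.1.34: In a polygon the number of diagonals is 54. The number of sides of the polygon is ____?
Solution: Let the number of sides of the polygon be n. The number of sides of a polygon is equal to the number of its vertices, and joining any two of the n vertices gives either a side or a diagonal. So, the number of diagonals is
$^nC_2 - n = \dfrac{n(n-3)}{2} = 54$
$n^2 - 3n = 108$
$n^2 - 3n - 108 = 0$
$(n + 9)(n - 12) = 0$
Since n cannot be negative, n = 12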
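A small counting sketch supports this result: it counts the pairs of vertices that are not adjacent and then searches for the polygon with 54 such pairs. The function names and the search limit of 200 sides are assumptions for this example.

```python
from itertools import combinations


def count_diagonals(n):
    """Count pairs of vertices of an n-sided polygon that are not joined by a side."""
    return sum(
        1
        for i, j in combinations(range(n), 2)
        if (j - i) not in (1, n - 1)  # adjacent vertices form a side, not a diagonal
    )


def sides_for_diagonals(diagonals, max_sides=200):
    """Return the first n (from 3 up to max_sides) whose polygon has that many diagonals."""
    for n in range(3, max_sides + 1):
        if count_diagonals(n) == diagonals:
            return n
    return None


print(count_diagonals(12))
print(sides_for_diagonals(54))
```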
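Ex.1.35: Find the number of ways of selecting 9 balls from 6 red balls, 5 white balls and 5 blue balls, if each selection consists of 3 balls of each colour.
Solution: The required number of ways
$= {}^6C_3 \times {}^5C_3 \times {}^5C_3$
$= {}^6C_3 \times {}^5C_2 \times {}^5C_2$
$= \dfrac{6 \times 5 \times 4}{6} \times \dfrac{5 \times 4}{2} \times \dfrac{5 \times 4}{2}$
$= 20 \times 10 \times 10$
= 2000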
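Ex.1.36: Four boys picked up 30 mangoes. In how many ways can they divide them, if all mangoes are identical?
Solution: 30 identical mangoes can be distributed among 4 boys such that each boy can receive any number of mangoes. The number of ways of distributing n identical things among r persons is $^{n+r-1}C_{r-1}$, so with n = 30 and r = 4 the required number of ways is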
Amity Directorate of Distance & Online Education
Quantitative Aptitude 45
$= {}^{33}C_3$
$= \dfrac{33 \cdot 32 \cdot 31}{1 \cdot 2 \cdot 3}$
= 5456
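The distribution count can be compared with a direct enumeration. In the sketch below, `count_distributions` applies the formula and `brute_force_four` (a hypothetical helper written only for the four-person case) counts every way of splitting the items.

```python
from math import comb


def count_distributions(items, people):
    """Ways to share identical items among people, anyone may receive zero."""
    return comb(items + people - 1, people - 1)


def brute_force_four(items):
    """Count the shares (a, b, c, d) with a + b + c + d == items by enumeration."""
    count = 0
    for a in range(items + 1):
        for b in range(items + 1 - a):
            for c in range(items + 1 - a - b):
                count += 1  # the fourth share d = items - a - b - c is then fixed
    return count


print(count_distributions(30, 4))
print(brute_force_four(30))
```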
Ex.1.37: In how many different ways can five boys and five girls form a circle such that the boys and girls sit alternately?
Solution: After fixing one boy at the table, the remaining 4 boys can be arranged in 4! ways, but boys and girls are to sit alternately.
There will be 5 places, one between every two boys, and these 5 places can be filled by the 5 girls in 5! ways.
The required number of ways
= 4! × 5!
= 2880
Ex. 1.38: It is required to seat 5 men and 4 women in a row so that the women occupy the even places. How many such arrangements are possible?
Solution: The even positions are the 2nd, 4th, 6th and 8th places. These can be occupied by the 4 women in 4! = 24 ways, and the remaining 5 places by the 5 men in 5! = 120 ways, giving 24 × 120 = 2880 arrangements.
Ex. 1.39: In how many different ways can five males and five females form a circle for a business meeting such that the males and females sit alternately?
Solution: After fixing one male at the table, the remaining 4 males can be arranged in 4! ways, but males and females are to sit alternately.
There will be 5 places, one between every two males, and these 5 places can be filled by the 5 females in 5! ways.
The required number of ways
= 4! × 5!
= 2880
Ex. 1.40: A board meeting of a company is organised in a room for 24 persons along the two sides of a table with 12 chairs on each side. 6 persons want to sit on a particular side and 3 persons want to sit on the other side. In how many ways can they be seated?
Solution: The 6 persons can be arranged on the 12 chairs on the particular side in $^{12}P_6$ ways and the 3 persons can be arranged on the 12 chairs on the other side in $^{12}P_3$ ways.
Remaining persons = 24 – 6 – 3 = 15 and remaining chairs = 24 – 6 – 3 = 15.
Now the remaining 15 persons can be arranged on the remaining 15 chairs in $^{15}P_{15}$, i.e., 15! ways.
Therefore, the required number of ways = $^{12}P_6 \times {}^{12}P_3 \times 15!$
Ex. 1.41: Mr. Rajan has 10 employees and he wants to invite 6 of them to a party. In how many ways can he do so if 3 particular employees never attend the party?
Solution: Excluding the 3 particular employees, Mr. Rajan can invite 6 employees from the remaining 7 employees. This can be done in $^7C_6$ = 7 ways.
called a ____________.
4. In today’s business world scenarios, there are many such problems which we can
solve with the help of Permutation and combination. True / False
Multiple Choice
1. In how many different ways can five boys and five girls form a circle such that the boys and girls sit alternately?
a. 3456
b. 2880
c. 4321
d. 2112
2. Four boys picked up 30 mangoes. In how many ways can they divide them, if all mangoes are identical?
a. 5456
b. 6666
c. 4347
d. 2112
3. Find the number of ways of selecting 9 balls from 6 red balls, 5 white balls and 5 blue balls, if each selection consists of 3 balls of each colour.
a. 5000
b. 2000
c. 1500
d. 4000
4. Find n, if $^{n-1}P_3 : {}^nP_4 = 1 : 9$
a. 9
b. 5
c. 7
d. 11
Summary
●● Each of the different arrangements which can be made by taking some or all of a number of things is called a permutation. E.g., the arrangements of the 3 objects (a, b, c) taken 2 at a time are ab, bc, ca, cb, ac, ba, so the total number of arrangements is 6, each of which is known as a permutation.
●● Each of the different groups or selections which can be made by taking some or all of a number of given things, without reference to the order of the things in each group, is called a combination. E.g., the groups made by taking 2 objects at a time from the three objects (a, b, c) are ab, bc, ca, so the number of groups is 3, each of which is known as a combination.
Activity
1. In how many ways can you arrange the letters of the word 'Leading' such that all the vowels always come together?
2. How many words with three consonants and two vowels can you form if you have seven consonants and four vowels?
3. How many three-digit numbers which are divisible by five can be formed out of the digits 2, 3, 5, 6, 7, 9?
2. What are two examples on how permutations solve business related problems?
Glossary
●● Combination: Each of the different groups or selections which can be made by taking some or all of a number of given things, without reference to the order of the things in each group.
●● Permutation: Each of the different arrangements which can be made by taking some or all of a number of things.
Further Readings
1. Carnielli, Antonio Lucio. Mathematics on Several Subjects: Theory and tests: Combinatorics, Logarithms, Arithmetic Progressions and Geometric Progressions and other topics. Independently published (November 1, 2016)
2. Future Point Coaching Center. Important questions of Quadratic Equations and Arithmetic Progression for competitive exams and CBSE Xth class board exam 2020: Useful for achievers. Kindle Edition January 25, 2020
Check your Understanding: Answers
1. Permutation
2. Arrangement
3. Combination
4. True
Multiple Choice
1. B
2. A
3. B
4. A
Structure:
3.1 Sets
3.1.1 Definition of Sets and Subsets
3.1.2 Representation of a Set - Roster form, Descriptive form and Set Builder form
3.1.3 Types of Sets - Empty, Singleton, Finite, Infinite Sets, Equal, Power and Universal Sets
3.2 Relations and Functions
3.2.1 Relations and functions
3.2.2 Function - Definition and Types
3.3 Limit and Continuity
3.3.1 Limit of a function - Definition and Methods
3.3.2 Methods to solve limit
3.3.3 Continuous and discontinuous functions - Introduction and Application
Objectives:
At the end of this unit, you will be able to understand:
●● Definition of Set and Subset
●● Representation of Set - Roster Form, Descriptive Form and Set Builder Form
●● Types of Sets – Empty, Singleton, Finite, Infinite Sets, Equal, Power, Universal
3.1.1 Definition of Sets and Subsets
Set theory was developed by the German mathematician Georg Cantor (1845-1918). Nowadays, set theory is used in almost all branches of mathematics. We also use sets to define relations and functions. The knowledge of sets is required in the study of geometry, sequences, probability, etc. In this unit, we will discuss some basic definitions related to sets.
Definition of a Set
“A well-defined collection of distinct objects is called a set”.
The term “well-defined” implies that, for a given set, it is possible to decide whether or not a certain object belongs to the set. The term “distinct” implies that a given object should not be repeated in the collection or group. Each object in the set is called its member or element. A set is represented by { }.
Generally, sets are denoted by capital letters X, Y, Z, … and their elements are denoted by small letters x, y, z, …
Example: Suppose we have a set X defined as X = Set of all days in a week. In this set, Sunday, Monday, Tuesday, Wednesday, Thursday, Friday and Saturday are members of the set.
Definition of a Subset
Having defined a set, we now define subset, superset and proper subset.
Let X and Y be two non-empty sets. If each element of set X is an element of set Y, then set X is known as a subset of set Y. It is written as X ⊆ Y, and Y is said to be a superset of X.
Proper Subset: If each element of X is in set Y but set Y has at least one element which is not in X, then set X is known as a proper subset of set Y. If X is a proper subset of Y, it is written as X ⊂ Y and read as X is a proper subset of Y.
Example:
If N = {1, 2, 3, 4, ...}
and W = {0, 1, 2, 3, 4, ...}
then N ⊂ W
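Python's built-in `set` type supports these comparisons directly. Since N and W are infinite, the sketch below uses small finite stand-ins for them; the variable names and the cut-off at 4 are assumptions made for the illustration.

```python
# Finite stand-ins for N = {1, 2, 3, 4, ...} and W = {0, 1, 2, 3, 4, ...}
N_part = {1, 2, 3, 4}
W_part = {0, 1, 2, 3, 4}

is_subset = N_part <= W_part          # every element of N_part is in W_part
is_proper_subset = N_part < W_part    # subset, and W_part has an extra element (0)
is_superset = W_part >= N_part        # W_part is a superset of N_part

print(is_subset, is_proper_subset, is_superset)
print(N_part < N_part)                # a set is never a proper subset of itself
```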
3.1.2 Representation of a Set- Roster Form, Descriptive Form and
Set Builder Form
Sets can be represented by the following two methods:
1. Tabular Method or Roster Method
2. Set Builder Method
Tabular Method or Roster Method: In this method, the elements are listed within braces { } and separated by commas.
Set Builder Method: In this method, instead of listing all elements of a set, we list the property or properties satisfied by the elements of the set and write it as X = {x : P(x)}. It is read as “X is the set of all elements x such that x has the property P(x).” The symbol ‘:’ stands for ‘such that’.
Example: Suppose we have a set X defined as X = Set of all even natural numbers less than 15. Then X = {x : x = 2n, n ∈ N and 1 ≤ n ≤ 7} or X = {x : x is an even natural number less than 15}.
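The two methods have close counterparts in Python: a set literal plays the role of the roster form, and a set comprehension plays the role of the set builder form. The variable names below are hypothetical, chosen only for this sketch.

```python
# Roster (tabular) form: list the elements explicitly
roster_form = {2, 4, 6, 8, 10, 12, 14}

# Set builder form {x : x = 2n, n in N and 1 <= n <= 7}
builder_form = {2 * n for n in range(1, 8)}

# Set builder form {x : x is an even natural number less than 15}
by_property = {x for x in range(1, 15) if x % 2 == 0}

print(sorted(builder_form))
print(roster_form == builder_form == by_property)
```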
3.1.3 Types of Sets- Empty, Singleton, Finite, Infinite Sets, Equal, Power and
Universal Sets
Types of Set:
(i) Empty (Void/Null) Set: A set which has no element is called an empty set. It is denoted by φ or { }.
Example, Let X = Set of all even prime numbers greater than 3.
Example, Let Y = Set of all prime numbers less than 2.
(ii) Singleton Set: A set which has only one element or member is known as a singleton set.
Example, Let X = {x : x is an even prime number} and Y = {a}
(iii) Finite Set: A set which has a finite number of elements or members is known as a finite set.
Example, Let X = {x : x is an even natural number less than 9} and Y = {1, 3, 5, 7, 11, 13, 15}
(iv) Infinite Set: A set which has an infinite number of elements or members is known as an infinite set.
Example, Let X = {x : x is a natural number} and Y = {2, 4, 6, 8, 10, 12, 14, …}
(v) Equivalent Sets: If two finite sets X and Y have the same number of elements, then the sets are known as equivalent sets.
Example, Let X = {2, 4, 6, 8} and Y = {1, 3, 5, 7}
(vi) Equal Sets: If X and Y are two non-empty sets and each element of set X is an element of set Y and each element of set Y is an element of set X, then sets X and Y are called equal sets.
Example, Let X = {x : x = 2n, n ∈ N and 1 ≤ n ≤ 5} and Y = {2, 4, 6, 8, 10}
(vii) Universal Set: If there are some sets under consideration, then there happens to be a set which is a superset of each one of the given sets. Such a set is known as the universal set and it is denoted by U.
Example, Suppose we have three sets X = {a, b}, Y = {c, d, e} and Z = {f, g, h, i, j}.
∴ U = {a, b, c, d, e, f, g, h, i, j} is a universal set for all the given sets.
(viii) Power Set: If X is a non-empty set, then the collection of all possible subsets of set X is referred to as the power set of X. It is denoted by P(X). The total number of elements in the power set of a set X containing n elements is $2^n$.
Example, Let X = {a, b, c} ∴ P(X) = {φ, {a}, {b}, {c}, {a, b}, {b, c}, {c, a}, {a, b, c}}
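The power set example can be generated mechanically. The sketch below builds all subsets with `itertools.combinations`; the function name `power_set` is a hypothetical helper, and the subsets are returned as a list because Python sets cannot contain ordinary (mutable) sets.

```python
from itertools import combinations


def power_set(elements):
    """Return all subsets of the given collection as a list of sets."""
    items = sorted(elements)
    return [
        set(subset)
        for size in range(len(items) + 1)      # subset sizes 0, 1, ..., n
        for subset in combinations(items, size)
    ]


subsets = power_set({"a", "b", "c"})
for subset in subsets:
    print(subset if subset else "φ")
print(len(subsets), 2 ** 3)
```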
1. The set theory was developed by a German Mathematician Georg Cantor (1845-
1918).
8. If X is a non-empty set, then the collection of all possible subsets of set X can be
referred to as _______ set.
Summary:
●● A set is a well-defined collection of distinct objects.
●● A set can be represented in two ways: (i) Tabular form or Roster form (ii) Set builder method.
●● In the tabular form, the elements of a set are actually written down, separated by commas and enclosed within braces.
●● In the set builder method, a set is described by a characterizing property of its elements.
●● A set that does not contain a single element or member is called a null or empty set.
●● A set with only one element or member is known as a singleton set.
●● A set which has a finite number of elements or members is known as a finite set; otherwise it is called an infinite set.
●● Two sets X and Y are said to be equal if every element of set X is in set Y and every element of set Y is in set X.
●● The collection of all subsets of a set X is called the power set of X.
●● Two sets X and Y are said to be equivalent if the number of elements in both sets is equal.
●● A set of which all the sets under consideration are subsets is called the universal set.
Activity
1. Find ten real-life examples of where sets are used. What method was used to represent them? What kind of sets are they?
Glossary
●● Set: A well-defined collection of distinct objects.
●● Set Builder Method: In this method, instead of listing all elements of a set, we list the property or properties satisfied by the elements of the set.
Further Reading
1. Boelkins, Matthew. Active Calculus 2018: Single Variable. CreateSpace
Independent Publishing Platform; 2018th edition (August 13, 2018)
3. Stewart, James. Calculus 8th Edition. Cengage Learning; 8th edition (May 19,
2015)
Check your Understanding: Answers
1. Set
2. True
3. Distinct
4. Well-defined
5. Tabular
6. Empty
7. False
8. Power
Objective:
At the end of this unit, you will be able to understand:
●● Relations and Functions
Introduction
In mathematics, a function is a relation between a set of inputs and a set of possible outputs. Functions have the property that they have exactly one output for every input. In this unit, we will learn what functions are and how they are classified, and we will also discuss the inverse function.
3.2.1 Relations and Functions
Definition of Function
A relation f from a set X to another set Y is said to be a function (or mapping) from X to Y if, with every element of X, the relation f relates a unique element of Y. The element of set Y is called the f-image of the element of set X. Also, the element of set X is called the pre-image of the element of set Y under f.
Classification of Functions
Constant Function: A function which does not change as its parameters vary, i.e., the function whose rate of change is zero.
Or
The function f(x) = c, ∀ x ∈ R, is known as a constant function.
x 1 2 3 4 5
y = f(x) 2 2 2 2 2
Now we can draw a graph according to the data obtained from the above table.
Fig. 3.1.3 The graph of constant function f(x) = 2
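Polynomial Function: The function $y = f(x) = a_0x^n + a_1x^{n-1} + \ldots + a_n$, where $a_0, a_1, a_2, \ldots, a_n$ are real coefficients and n is a non-negative integer, is known as a polynomial function.
If $a_0 \neq 0$, then the degree of the polynomial function is n.
The domain of f(x) = R and the range varies from function to function.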
Example 3.2.2 The graph of the function $f(x) = x^2$
Table for the function at different values of x.
x -5 -4 -3 -2 -1 0 1 2 3 4 5
f(x) = x² 25 16 9 4 1 0 1 4 9 16 25
Now we can draw a graph according to the data obtained from the above table.
Graph of f(x) = x²
Rational Function: If P(x) and Q(x) are polynomial functions and Q(x) ≠ 0, then the function $f(x) = \dfrac{P(x)}{Q(x)}$ is known as a rational function.
The domain of f(x) = R − {x : Q(x) = 0} and the range varies from function to function.
Irrational Function: A function containing one or more terms having non-integral rational powers of x is called an irrational function.
Example 3.2.3.1 $y = f(x) = \sqrt{x}$
Domain varies from function to function.
Identity Function: The function f(x) = x, ∀ x ∈ R, is known as the identity function.
The domain of f(x) = R and Range of f(x) = R
Example 3.2.3 The graph of the function f(x) = x
Table for the function at different values of x.
x -5 -4 -3 -2 -1 0 1 2 3 4 5
f(x) = x -5 -4 -3 -2 -1 0 1 2 3 4 5
Now we can draw a graph according to the data obtained from the above table.
Graph of f(x) = x
Square Root Function: The function that associates every non-negative real number x with $+\sqrt{x}$ is called the square root function, i.e., $f(x) = +\sqrt{x}$
Domain of f(x) = [0, ∞)
Range of f(x) = [0, ∞)
Exponential Function: A function of the form $f(x) = a^x$, where a is a positive real number, is an exponential function. The value of the function depends upon the value of a: for 0 < a < 1, the function is decreasing and for a > 1, the function is increasing.
Domain of f(x) = R
And Range of f(x) = (0, ∞)
Example 3.2.4 The graph of the function $f(x) = e^x$
Table for the function at different values of x.
x f(x) = e^x
-5 0.0067379
-4 0.0183156
-3 0.0497871
-2 0.1353353
-1 0.3678794
0 1
1 2.7182818
2 7.3890561
3 20.085537
4 54.59815
5 148.41316
Now we can draw a graph according to the data obtained from the above table.
Graph of f(x) = e^x
Logarithmic Function: The function $f(x) = \log_a x$, (x, a > 0) and a ≠ 1, is known as the logarithmic function.
Domain of f(x) = (0, ∞)
And Range of f(x) = R
Example 3.2.5 The graph of the function $f(x) = \log_{10} x$
Table for the function at different values of x (rounded to three decimal places).
x 0.1 1 10 25 50 75 100
f(x) = log10 x -1 0 1 1.398 1.699 1.875 2
Now we can draw a graph according to the data obtained from the above table.
Graph of f(x) = log10 x
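The table values for the common logarithm can be recomputed with the standard `math.log10` function. The helper name `log10_table` and the rounding to three decimal places are choices made for this sketch.

```python
import math


def log10_table(xs):
    """Return (x, log10 x) pairs with the logarithm rounded to 3 decimal places."""
    return [(x, round(math.log10(x), 3)) for x in xs]


for x, value in log10_table([0.1, 1, 10, 25, 50, 75, 100]):
    print(x, value)
```

Note that the logarithm grows slowly: increasing x from 10 to 100 raises log10 x only from 1 to 2.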
Modulus Function: The function $y = f(x) = |x|$ is known as the modulus function.
$y = f(x) = |x| = \begin{cases} x, & x \ge 0 \\ -x, & x < 0 \end{cases}$
x –5 –4 –3 –2 –1 0 1 2 3 4 5
f(x) = |x| 5 4 3 2 1 0 1 2 3 4 5
Now we can draw a graph according to the data obtained from the above table.
Fig. 3.1.8 The graph of Function f(x) = |x|
Inverse Function: Let f be a one-one onto function from X to Y. Let y be an arbitrary element of Y. Then, f being onto, there exists an element x ∈ X such that f(x) = y. Also, f being one-one, this x must be unique. Thus, for each y ∈ Y, there exists a unique element x ∈ X such that f(x) = y. So, we may define a function,
$f^{-1} : Y \to X$
∴ $f^{-1}(y) = x \Leftrightarrow f(x) = y$
The above function $f^{-1}$ is called the inverse of f.
respectively.
We are given y = f(x) = x + 3
So, to find the inverse of this function, we need to find the value of x in terms of y, which gives x = y − 3.
So, $f^{-1}(y) = y - 3$, i.e., $f^{-1}(x) = x - 3$
x 0 1 2 3 4 5
f(x) 3 4 5 6 7 8
x 0 1 2 3 4 5
f⁻¹(x) –3 –2 –1 0 1 2
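The two tables can be reproduced, and the inverse relationship checked, with a short sketch. The function names `f` and `f_inverse` mirror the notation of the example and are otherwise arbitrary.

```python
def f(x):
    """The function of the example: f(x) = x + 3."""
    return x + 3


def f_inverse(y):
    """Its inverse: subtract 3 to recover the original input."""
    return y - 3


xs = list(range(0, 6))
print([f(x) for x in xs])            # second row of the first table
print([f_inverse(x) for x in xs])    # second row of the second table

# Applying f and then its inverse returns every starting value
print(all(f_inverse(f(x)) == x for x in xs))
```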
Now we can draw a graph according to the data obtained from the above table.
Fig. 3.1.9 The graph of function f(x) = x + 3
Now we can draw a graph of the inverse of the above function according to the data obtained from the above table.
Graph of $f^{-1}(x)$
It is clear from the above figure that the points of the line of the two graphs are the
function.
3. Functions have the property that they have an output for every input. True / False
Multiple Choice
1. If , then x is
a. 81
b. 36
c. 64
d. None of these
2. A two-digit number is equal to three times the sum of its digits. Find the number.
a. 72
b. 63
c. 24
d. 27
3.
6 6 6........ is equal to
3
6
a. 2
b. 1
3
2
c. 6
d. 3
c) log39
d) log24
Summary
●● In mathematics, a function is a relation between a set of inputs and a set of possible outputs. Functions have the property that they have exactly one output for every input.
●● There are many types of functions such as constant function, polynomial function, rational function, irrational function, identity function, square root function, exponential function, logarithmic function, modulus function and inverse function.
Activity
1. Draw a graph of the inverse of x
2. Describe a rational function.
3. How is a square root function depicted?
4. What does a logarithmic function look like when graphed?
5. Describe a modulus function.
Glossary
●● Constant Function: A function which does not change as its parameters vary.
●● Exponential Function: A function of the form $f(x) = a^x$, where a is a positive real number.
●● Irrational Function: A function containing one or more terms having non-integral rational powers of x.
Further Reading
1. Boelkins, Matthew. Active Calculus 2018: Single Variable. CreateSpace Independent Publishing Platform; 2018th edition (August 13, 2018)
2. Edwards, Bruce H.; Larson, Ron. Calculus: Early Transcendental Functions. Cengage Learning; 7th edition (January 1, 2018)
3. Stewart, James. Calculus 8th Edition. Cengage Learning; 8th edition (May 19, 2015)
2. Constant
3. Exponential
4. True
Multiple Choice
1. A
2. D
3. D
4. B
Objectives:
At the end of this unit, you will be able to understand:
●● Function – Definition and Types
●● Limit of a function – Definition and Methods
●● Methods to Solve Limit
●● Continuous and Discontinuous Functions – Introduction and Application
3.3.1 Function – Definition and Types
In mathematics, a function is a relation between a set of inputs and a set of possible outputs. Functions have the property that they have exactly one output for every input. In this unit, we will learn what functions are and how they are classified, and we will also discuss the inverse function.
Definition of Function
A relation f from a set X to another set Y is said to be a function (or mapping) from X to Y if, with every element of X, the relation f relates a unique element of Y. The element of set Y is called the f-image of the element of set X. Also, the element of set X is called the pre-image of the element of set Y under f. The image given below is an example of a function in mathematics:
Constant Function: A function which does not change as its parameters vary, i.e., the function whose rate of change is zero.
Or
The function f(x) = c, ∀ x ∈ R, is known as a constant function.
Table for the function at different values of x.
x 1 2 3 4 5
y = f(x) 2 2 2 2 2
Now we can draw a graph according to the data obtained from the above table.
Polynomial Function: The function $y = f(x) = a_0x^n + a_1x^{n-1} + \ldots + a_n$, where $a_0, a_1, a_2, \ldots, a_n$ are real coefficients and n is a non-negative integer, is known as a polynomial function.
If $a_0 \neq 0$, then the degree of the polynomial function is n.
x -5 -4 -3 -2 -1 0 1 2 3 4 5
f(x) = x² 25 16 9 4 1 0 1 4 9 16 25
Now we can draw a graph according to the data obtained from the above table.
Fig. 3.1.2 The graph of function f(x) = x²
Rational Function: If P(x) and Q(x) are polynomial functions and Q(x) ≠ 0, then the function $f(x) = \dfrac{P(x)}{Q(x)}$ is known as a rational function.
The domain of f(x) = R − {x : Q(x) = 0} and the range varies from function to function. In other words, the domain is the set of all values for which the function is defined. The range of the function is the set of all the values taken by f.
A graph for the function $f(x) = \dfrac{5}{x - 1}$ is shown below.
Irrational Function: A function containing one or more terms having non-integral rational powers of x is called an irrational function.
Ex. 3.3 $y = f(x) = \sqrt{x}$
Domain varies from function to function.
For example, if we are to find the graph of the function f(x) = √(x + 2), the following will be the solution:
Identity Function: The function f(x) = x, ∀ x ∈ R, is known as the identity function. Its graph is a straight line passing through the origin with unit slope.
x -5 -4 -3 -2 -1 0 1 2 3 4 5
f(x) = x -5 -4 -3 -2 -1 0 1 2 3 4 5
Now we can draw a graph according to the data obtained from the above table.
Fig. 3.1.5 The graph of function f(x) = x
Square root Function: The function that associates every positive real number x
( )
to + x is called the square root function, i.e., f x = + x
Range of f ( x=
) [0, ∞ ) , which means that the range of a function can be 0 but not
ve
∞.
Exponential Function: A function of the form $f(x) = a^x$, where a is a positive real number and $a \neq 1$, is an exponential function. The behaviour of the function depends upon the value of a: for 0 < a < 1 the function is decreasing, and for a > 1 the function is increasing.
Domain of f(x) = R
and Range of f(x) = (0, ∞), since $a^x$ is always positive and never equals 0.
Ex. 3.5 The graph of the function $f(x) = e^x$
x     f(x) = e^x
-5    0.0067379
-4    0.0183156
-3    0.0497871
-2    0.1353353
-1    0.3678794
 0    1
 1    2.7182818
 2    7.3890561
 3    20.085537
 4    54.59815
 5    148.41316
Now we can draw a graph from the data in the above table.
Fig. 3.1.6 The graph of function f(x) = e^x
Logarithmic Function: The function $f(x) = \log_a x$, (x, a > 0) and $a \neq 1$, is known as the logarithmic function.
Domain of f(x) = (0, ∞)
and Range of f(x) = R
Ex. 3.6 The graph of the function $f(x) = \log_{10} x$
Table of the function at different values of x (values rounded to three decimal places):
x                  0.1   1   10   25      50      75      100
f(x) = log10 x     -1    0   1    1.398   1.699   1.875   2
Now we can draw a graph from the data in the above table.
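A table of common-logarithm values like the one in Ex. 3.6 can be checked numerically. The following is a minimal Python sketch that uses only the standard `math` module; the helper name `log10_table` is illustrative and does not come from the text.

```python
import math

def log10_table(xs):
    """Return (x, log10(x)) pairs, rounded to three decimal places."""
    return [(x, round(math.log10(x), 3)) for x in xs]

# x values taken from the table in Ex. 3.6
table = log10_table([0.1, 1, 10, 25, 50, 75, 100])
for x, y in table:
    print(f"x = {x:>5}   log10(x) = {y}")
```

Only exact powers of 10 give whole-number logarithms; every other x gives a fractional value.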
Modulus Function: The function $y = f(x) = |x|$ is known as the modulus function.
Ex. 3.7 The graph of the function $f(x) = |x|$
x             -5  -4  -3  -2  -1   0   1   2   3   4   5
f(x) = |x|     5   4   3   2   1   0   1   2   3   4   5
Now we can draw a graph from the data in the above table.
Fig. 3.1.8 The graph of function f(x)=|x|
Inverse Function: Let f be a one-one and onto function from X to Y, so that every element of Y is the image of some element of X. Let y be an arbitrary element of Y. Then, f being onto, there exists an element x ∈ X such that f(x) = y. Also, f being one-one, this x must be unique. Thus, for each y ∈ Y, there exists a unique element x ∈ X such that f(x) = y. So, we may define a function
$f^{-1} : Y \to X$
$\therefore f^{-1}(y) = x \iff f(x) = y$
respectively.
We are given y = f(x) = x + 3.
So, to find the inverse of this function, we need to find the value of x in terms of y, which gives x = y − 3.
So, $f^{-1}(y) = y - 3$, or, writing the variable as x, $f^{-1}(x) = x - 3$.
x            0   1   2   3   4   5
f(x)         3   4   5   6   7   8
y            0   1   2   3   4   5
f^{-1}(y)   -3  -2  -1   0   1   2
Now we can draw a graph from the data in the above table.
Fig. 3.1.9 The graph of function f(x) =x + 3
We can now draw a graph of the inverse of the above function according to the data from the above table.
It is clear from the above figure that the two graphs are mirror images of each other in the line y = x.
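The relation between a function and its inverse can be checked with a short Python sketch. It assumes the function f(x) = x + 3 from the example above; the names `f` and `f_inverse` are illustrative.

```python
def f(x):
    """The function from the example: f(x) = x + 3."""
    return x + 3

def f_inverse(y):
    """Solve y = x + 3 for x, giving x = y - 3."""
    return y - 3

# Rebuild both tables of values
print([f(x) for x in range(6)])          # values of f at x = 0..5
print([f_inverse(y) for y in range(6)])  # values of the inverse at y = 0..5

# Applying f and then its inverse returns the starting value
print(all(f_inverse(f(x)) == x for x in range(-5, 6)))
```

Each point (x, y) on the graph of f corresponds to the point (y, x) on the graph of the inverse, which is why the two graphs are reflections of each other in the line y = x.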
If f(x) is a function of x such that, as x approaches a constant value 'a', the value of f(x) approaches another constant k, then the constant k is known as the limit of f(x) at x = a. The limit is written as
$\lim_{x \to a} f(x) = k$
Or
A real number b is called the limit of the function f as x approaches a if, for every ε > 0, however small, there exists δ > 0 such that $|f(x) - b| < \varepsilon$ whenever $0 < |x - a| < \delta$. We write
$\lim_{x \to a} f(x) = b$
The working rule for finding the left-hand limit is: put a − h for x in f(x), where h is positive and tends to zero, i.e.,
$f(a - 0) = \lim_{h \to 0} f(a - h)$
The right-hand limit is written as $\lim_{x \to a^+} f(x) = b$ or f(a + 0) = b.
The working rule for finding the right-hand limit is: put a + h for x in f(x), where h is positive and tends to zero, i.e.,
$f(a + 0) = \lim_{h \to 0} f(a + h)$
The right and left limits are shown in the diagram below. The region to the right of a
shows the right-hand limit and the region to the left of a shows the left-hand limit.
Existence of Limit
If both the right-hand limit and the left-hand limit exist and are equal, then their common value is the limit of f as x → a, i.e.,
if $\lim_{x \to a^+} f(x) = \lim_{x \to a^-} f(x) = b$, then $\lim_{x \to a} f(x) = b$.
If, however, either or both of these limits do not exist, or both exist but are not equal in value, then $\lim_{x \to a} f(x)$ does not exist.
Algebra of Limits
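Let f and g be two functions such that both $\lim_{x \to a} f(x)$ and $\lim_{x \to a} g(x)$ exist. Then,
Limit of sum of two functions is sum of the limits of the functions, i.e.,
$\lim_{x \to a} [f(x) + g(x)] = \lim_{x \to a} f(x) + \lim_{x \to a} g(x)$
Limit of difference of two functions is difference of the limits of the functions, i.e.,
$\lim_{x \to a} [f(x) - g(x)] = \lim_{x \to a} f(x) - \lim_{x \to a} g(x)$
Limit of product of two functions is product of the limits of the functions, i.e.,
$\lim_{x \to a} [f(x) \cdot g(x)] = \lim_{x \to a} f(x) \cdot \lim_{x \to a} g(x)$
Limit of quotient of two functions is quotient of the limits of the functions, provided the limit of the denominator is not zero, i.e.,
$\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{\lim_{x \to a} f(x)}{\lim_{x \to a} g(x)}$, where $\lim_{x \to a} g(x) \neq 0$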
Ex. 3.9: Find the value of $\lim_{x \to 3} (x^4 + x^3 - 3x)$.
Solution:
$\lim_{x \to 3} (x^4 + x^3 - 3x) = 3^4 + 3^3 - 3 \times 3$
= 81 + 27 − 9
= 99
$\lim_{x \to 4} \frac{4x + 3}{x - 2} = \frac{4 \times 4 + 3}{4 - 2} = \frac{19}{2}$
Ex. 3.11: Find the value of $\lim_{x \to 4} \frac{|x - 4|}{x - 4}$.
Solution: Let $f(x) = \frac{|x - 4|}{x - 4}$
At x = 4,
RHL $= \lim_{x \to 4^+} f(x)$
$= \lim_{h \to 0} f(4 + h)$
$= \lim_{h \to 0} \frac{|4 + h - 4|}{4 + h - 4}$
$= \lim_{h \to 0} \frac{(4 + h - 4)}{4 + h - 4}$
= 1
At x = 4,
LHL $= \lim_{x \to 4^-} f(x)$
$= \lim_{h \to 0} f(4 - h)$
$= \lim_{h \to 0} \frac{|4 - h - 4|}{4 - h - 4}$
$= \lim_{h \to 0} \frac{-(4 - h - 4)}{4 - h - 4}$
= −1
RHL ≠ LHL, so the limit does not exist.
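The working rules "put a + h" and "put a − h" can be tried numerically. The sketch below is a Python illustration under the assumption that the function in Ex. 3.11 is f(x) = |x − 4| / (x − 4); the variable names are illustrative.

```python
def f(x):
    """f(x) = |x - 4| / (x - 4); undefined at x = 4 itself."""
    return abs(x - 4) / (x - 4)

steps = [0.1, 0.01, 0.001]                 # values of h shrinking towards 0
right_values = [f(4 + h) for h in steps]   # approach 4 from the right
left_values = [f(4 - h) for h in steps]    # approach 4 from the left

print("right-hand values:", right_values)
print("left-hand values: ", left_values)
```

The right-hand values stay at 1 and the left-hand values stay at −1 however small h becomes, which is the numerical counterpart of RHL ≠ LHL.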
Solution: Given, $\lim_{x \to 1} \frac{x^4 - 1}{x - 1} = \lim_{x \to k} \frac{x^3 - k^3}{x^2 - k^2}$
Using $\lim_{x \to a} \frac{x^n - a^n}{x - a} = n a^{n-1}$,
⇒ $4(1)^{4-1} = \frac{3}{2} k^{3-2}$
⇒ $4 = \frac{3}{2} k$
⇒ $k = \frac{8}{3}$
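The standard result used above, that (x^n − a^n)/(x − a) tends to n·a^(n−1) as x → a, can be checked numerically. This is a rough Python sketch, not a proof; the helper name `power_quotient` and the step size `h` are assumptions made for illustration.

```python
def power_quotient(x, a, n):
    """Evaluate (x**n - a**n) / (x - a) for x close to, but not equal to, a."""
    return (x**n - a**n) / (x - a)

h = 1e-6  # small offset from the limit point

# Left side of the equation: x -> 1 with n = 4, expected value 4 * 1**3 = 4
lhs = power_quotient(1 + h, 1, 4)

# Right side with k = 8/3: (x^3 - k^3)/(x^2 - k^2) -> (3/2) * k = 4
k = 8 / 3
x = k + h
rhs = (x**3 - k**3) / (x**2 - k**2)

print(lhs, rhs)
```

Both printed values are close to 4, which is consistent with k = 8/3.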
The two broad areas of calculus known as differential and integral calculus are built on the foundation concept of a limit. In this section our approach to this important concept will be intuitive, concentrating on understanding what a limit is using numerical and graphical examples. In the next section, our approach will be analytical, that is, we will use algebraic methods to compute the value of a limit of a function.
1. Limit of a function: informal approach. Consider the function
$f(x) = \frac{16 - x^2}{4 + x}$
whose domain is the set of all real numbers except –4. Although f cannot be evaluated at –4, because substituting –4 for x results in the undefined quantity 0/0, f(x) can be calculated at any number x that is very close to –4. The two tables
(2)
show that as x approaches –4 from either the left or the right, the function values f(x) appear to be approaching 8; in other words, when x is near –4, f(x) is near 8. To interpret the numerical information in (2) graphically, observe that for every number x ≠ –4, the function f can be simplified by cancellation:
$f(x) = \frac{16 - x^2}{4 + x} = \frac{(4 + x)(4 - x)}{4 + x} = 4 - x$
As seen in the figure, the graph of f is essentially the graph of y = 4 – x, with the exception that the graph of f has a hole at the point that corresponds to x = –4. For x sufficiently close to –4, represented by the two arrowheads on the x-axis, the two arrowheads on the y-axis, representing function values f(x), simultaneously get closer and closer to the number 8. Indeed, in view of the numerical results in (2), the arrowheads can be made as close as we like to the number 8. We say 8 is the limit of f(x) as x approaches –4.
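Tables of values on either side of x = –4 can be generated with a few lines of Python. This sketch assumes the function is f(x) = (16 − x²)/(4 + x), the form that is undefined only at x = –4 and gives 0/0 there; the choice of sample points is illustrative.

```python
def f(x):
    """f(x) = (16 - x^2) / (4 + x); undefined at x = -4."""
    return (16 - x**2) / (4 + x)

offsets = [0.1, 0.01, 0.001]

print("approaching -4 from the left:")
for h in offsets:
    print(f"  x = {-4 - h:<8}  f(x) = {f(-4 - h):.4f}")

print("approaching -4 from the right:")
for h in offsets:
    print(f"  x = {-4 + h:<8}  f(x) = {f(-4 + h):.4f}")
```

On both sides the computed values move towards 8, even though f(-4) itself cannot be evaluated.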
Which form the backbone of differential calculus, also have the indeterminate form 0/0.
Limit Theorems
Introduction: The intention of the informal discussion in Section 2.1 was to give you an intuitive grasp of when a limit does or does not exist. However, it is neither desirable nor practical, in every instance, to reach a conclusion about the existence of a limit based on a graph or on a table of numerical values. We must be able to evaluate a limit, or discern its non-existence, in a somewhat mechanical fashion. The theorems that we shall consider in this section establish such a means. The proofs of some of these results are given in the appendix.
The first theorem gives two basic results that will be used throughout the discussion of this section.
concepts. A proof of the existence of a limit can never be based on one’s ability to sketch graphs or on tables of numerical values. Although a good intuitive understanding of lim f(x) is sufficient for proceeding with the study of the calculus in this text, an intuitive understanding is admittedly too vague to be of any use in proving theorems.
Limits: Graphic Solutions
Graphical Limits
Let f(x) be a function defined on the interval [–6, 11] whose graph is given as:
The limits are defined as the value that the function approaches as it goes to an x value. Using this definition, it is possible to find the value of the limits given a graph.
In general, you can see that these limits are equal to the value of the function. This is true if the function is continuous.
Continuity
x = –3: Discontinuous at this point. The value is not defined at –3. “Removable discontinuity”
x = 0: Discontinuous at this point. The limit from the left is not equal to the limit from the right. “Jump discontinuity”
x = 2: Discontinuous at this point. The limit from the left is equal to the limit from the right, but is not equal to the value of the function. “Removable discontinuity”
x = 4: Continuous at this point. The limit from the left is equal to the limit from the right and equal to the value of the function.
x = 5: Continuous at this point. The limit from the left is equal to the limit from the right and equal to the value of the function.
x = 6: Discontinuous at this point. The value of the limit is equal to negative infinity and therefore not defined. “Infinite discontinuity”
One-sided limits are differentiated as right-hand limits (when the limit approaches from the right) and left-hand limits (when the limit approaches from the left), whereas ordinary limits are sometimes referred to as two-sided limits. Right-hand limits approach the specified point from the direction of positive infinity, that is, through values greater than the point. Left-hand limits approach this point from the direction of negative infinity, that is, through values less than the point.
From this information, a more formal definition can be found. Continuity at a point a is defined when the limit of the function from the left equals the limit from the right and this value is also equal to the value of the function. Using notation, the function is continuous at all points a where
$\lim_{x \to a^-} f(x) = \lim_{x \to a^+} f(x) = f(a)$
Now that you know how to solve a limit graphically, you may be asking yourself: ‘That’s great, but what about when there isn’t a graph in the problem?’ That is a good question, and that is what this next section is about. There are many better (and more accurate) ways to find the value of the limit than graphing or plugging in numbers that get closer and closer to the value of interest. These solution methods fall under three categories: substitution, factoring, and the conjugate method. But first things first, let’s discuss some of the general rules for limits.
expressions. If a function is considered rational and the denominator is not zero, the limit can be found by substitution. This can be seen in the example below (which is similar to example #3 above, but now done in one quick, convenient step):
This can be defined more formally as: If P(x) and Q(x) are algebraic expressions and Q(c) ≠ 0, then:
$\lim_{x \to c} \frac{P(x)}{Q(x)} = \frac{P(c)}{Q(c)}$
Factoring Method
Consider the function $f(x) = \frac{x^2 - 9}{x + 3}$. How would you find the limit of f as x approaches –3? If you try to use substitution to find the limit, world-ending paradoxes ensue:
$\lim_{x \to -3} \frac{x^2 - 9}{x + 3} = \frac{(-3)^2 - 9}{-3 + 3} = \frac{0}{0}$
But fear not, this answer just tells us that we must use a different method to find the limit, because the function likely has a “hole” at the given point. Therefore, the factoring method can be tried. To start this method, the numerator and denominator must be factored (in this case the denominator is “factored” already):
$\frac{x^2 - 9}{x + 3} = \frac{(x + 3)(x - 3)}{x + 3}$
The factor (x + 3) can be canceled to get the much simpler limit expression $\lim_{x \to -3} (x - 3)$, which can easily be evaluated via substitution:
$\lim_{x \to -3} (x - 3) = -3 - 3 = -6$
Therefore, the result of the limit can be found, with the understanding that there is a hole in the graph at x = –3:
$\lim_{x \to -3} \frac{x^2 - 9}{x + 3} = -6$
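The factoring method can be illustrated with a small Python sketch. It assumes the function f(x) = (x² − 9)/(x + 3) discussed in this section; `f_cancelled` is an illustrative name for the simplified form x − 3 obtained after cancelling the common factor.

```python
def f(x):
    """Original form: gives 0/0 at x = -3."""
    return (x**2 - 9) / (x + 3)

def f_cancelled(x):
    """Form after cancelling the common factor (x + 3)."""
    return x - 3

# Direct substitution fails because the denominator is zero
try:
    f(-3)
except ZeroDivisionError:
    print("substitution at x = -3 gives 0/0")

# The cancelled form can be evaluated at x = -3
print(f_cancelled(-3))

# Away from x = -3 the two forms agree
print(f(-3.0001), f_cancelled(-3.0001))
```

The two forms agree at every x other than –3, which is why the limit of the original function equals the value of the cancelled form at –3.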
Conjugate Method
The conjugate of a binomial expression (i.e., an expression with two terms; the Latin root bi- means two) is the same expression with the opposite middle sign. For example, the conjugate of (√x – 5) is (√x + 5). This is really useful if you have a radical in your limit, because the product of two conjugates containing radicals will, itself, contain no radical expressions. See below:
$(\sqrt{x} - 5)(\sqrt{x} + 5) = x - 25$
The conjugate method applies to limits containing radicals for which substitution does not work.
Example:
Well, another hole in the universe, or at least the graph, indicating that you’ll need another method to find the limit, since the function probably has a hole at x = 5. To start, multiply both the numerator and denominator by the conjugate of the radical expression, $(\sqrt{x + 11} + 4)$:
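The conjugate method can be sketched in Python. The exact expression of the example is not recoverable here, so the sketch assumes a hypothetical function g(x) = (√(x + 11) − 4)/(x − 5), chosen only because its radical expression has the conjugate √(x + 11) + 4 and it gives 0/0 at x = 5. Multiplying numerator and denominator by the conjugate turns the numerator into (x + 11) − 16 = x − 5, which cancels with the denominator.

```python
import math

def g(x):
    """Hypothetical example: (sqrt(x + 11) - 4) / (x - 5), giving 0/0 at x = 5."""
    return (math.sqrt(x + 11) - 4) / (x - 5)

def g_rationalised(x):
    """Form after multiplying by the conjugate and cancelling (x - 5)."""
    return 1 / (math.sqrt(x + 11) + 4)

# The rationalised form can be evaluated directly at x = 5
print(g_rationalised(5))

# Near x = 5 the original form gives nearly the same value
print(g(5.0001), g(4.9999))
```

For this assumed function the limit as x approaches 5 is 1/8.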
Limits: Advanced Topics
Previously, when we found that the result of a limit by straight substitution yielded 0/0, we used factoring or conjugation to be able to solve the problem. What happens when neither of those methods proves useful? You become very grateful for the 17th-century French mathematician Guillaume de L’Hôpital. L’Hôpital was the man who derived a method of solving these types of limits, known as indeterminate forms. This method, known as L’Hôpital’s Rule, is formally defined below.
right? Well, that depends on the function. But half of the answer can be discovered by allowing the independent variable to take on increasingly large, positive values and keeping an eye on the output (the graph); this investigates what is happening as we go further and further to the right. The other half is discovered by allowing the independent variable to take on increasingly large, negative values and, again, keeping an eye on the output; this investigates what is happening as we go further and further to the left.
Here are some basic facts and some generalizations that will be sufficient to evaluate most “limits to infinity”.
Consider the function as an algebraic fraction, and consider the ratio of the leading terms. Let the algebraic expression in the numerator be expressed as n(x) and the algebraic expression found in the denominator be expressed as d(x), then
◌◌ If the degree of the numerator is lower than the degree of the denominator, then $\lim_{x \to \pm\infty} \frac{n(x)}{d(x)} = 0$. In general, whenever the denominator grows faster than the numerator, the limit will go to zero. Thus, in these cases, as the graph extends far to both the left and the right, the output (i.e., the graph) approaches zero.
Examples:
◌◌ If the degree of the numerator is higher than the degree of the denominator, then $\lim_{x \to \pm\infty} \frac{n(x)}{d(x)} = \infty$ or $-\infty$. In general, whenever the numerator grows faster than the denominator, the limit will go to positive or negative infinity. Thus, in these cases, as the graph extends far to both the left and the right, the output (i.e., the graph) increases or decreases without bound. In these cases, each side needs to be considered separately.
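The degree comparison described above can be written as a short Python sketch for the limit as x → +∞. The function name `limit_at_plus_infinity` and the coefficient-list representation (highest power first, non-zero leading coefficient) are assumptions for illustration. The equal-degree case, where the limit is the ratio of the leading coefficients, is a standard fact added here for completeness and is not stated in the text above.

```python
import math

def limit_at_plus_infinity(num_coeffs, den_coeffs):
    """Limit of n(x)/d(x) as x -> +infinity.

    Coefficients are listed from the highest power down, e.g.
    3x^2 + 1 is [3, 0, 1]. Leading coefficients must be non-zero.
    """
    num_degree = len(num_coeffs) - 1
    den_degree = len(den_coeffs) - 1
    lead_ratio = num_coeffs[0] / den_coeffs[0]

    if num_degree < den_degree:
        return 0.0                     # denominator grows faster
    if num_degree == den_degree:
        return lead_ratio              # standard result, not from the text
    # numerator grows faster: sign follows the ratio of leading terms
    return math.inf if lead_ratio > 0 else -math.inf

print(limit_at_plus_infinity([3, 1], [1, 0, 2]))   # (3x + 1) / (x^2 + 2)
print(limit_at_plus_infinity([2, 0, 1], [1, 5]))   # (2x^2 + 1) / (x + 5)
```

For x → −∞ the sign in the last case also depends on whether the difference in degrees is odd or even, which is why each side has to be considered separately.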
3.3.4 Continuous and Discontinuous functions – Introduction and
Application
Continuity of a Function at a Point
A function f(x) is said to be continuous at an interior point x = a of its domain if $\lim_{x \to a} f(x) = f(a)$. In other words, a function f(x) is said to be continuous at a point x = a provided the left-hand limit, the right-hand limit and the value of the function are equal:
A function f(x) is continuous at a point x = a if
$\lim_{h \to 0} f(a - h) = \lim_{h \to 0} f(a + h) = f(a)$
Continuity of a function on an open interval
A function f(x) is said to be continuous on an open interval (a, b) if it is continuous at each point of (a, b).
Discontinuity of a function
Ex. 3.13: Check whether the function
$f(x) = \begin{cases} 4x^2, & \text{if } x \le 2 \\ 3, & \text{if } x > 2 \end{cases}$
is continuous at x = 2, or not?
Solution: Given,
$f(x) = \begin{cases} 4x^2, & \text{if } x \le 2 \\ 3, & \text{if } x > 2 \end{cases}$
At x = 2,
LHL $= \lim_{x \to 2^-} f(x)$
$= \lim_{x \to 2^-} (4x^2)$
= 16
At x = 2,
RHL $= \lim_{x \to 2^+} f(x)$
= 3
f(2) = 16
LHL = f(2) ≠ RHL
∴ function is not continuous at x = 2.
Solution: Given,
$f(x) = \begin{cases} x + 1, & \text{if } x \ge 1 \\ x^2 + 1, & \text{if } x < 1 \end{cases}$
At x = 1,
RHL $= \lim_{x \to 1^+} f(x)$
$= \lim_{x \to 1^+} (x + 1)$
= 2
At x = 1,
LHL $= \lim_{x \to 1^-} f(x)$
$= \lim_{x \to 1^-} (x^2 + 1)$
= 1 + 1
= 2
f(1) = 1 + 1
= 2
LHL = RHL = f(1)
∴ function is continuous at x = 1.
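Ex. 3.15: If the function
$f(x) = \begin{cases} kx^2, & \text{if } x \le 2 \\ 3, & \text{if } x > 2 \end{cases}$
is continuous at x = 2, then find the value of k.
Solution: Given,
$f(x) = \begin{cases} kx^2, & \text{if } x \le 2 \\ 3, & \text{if } x > 2 \end{cases}$
At x = 2,
LHL $= \lim_{x \to 2^-} f(x)$
$= \lim_{x \to 2^-} (kx^2)$
= 4k
At x = 2,
RHL $= \lim_{x \to 2^+} f(x)$
= 3
f(2) = 4k
Since the function is continuous at x = 2, LHL = RHL = f(2). So,
4k = 3
$k = \frac{3}{4}$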
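The condition used in Ex. 3.15 can be solved mechanically. The Python sketch below assumes a branch of the form k·x² on the left of the point and a known right-hand limit; the helper name `k_for_continuity` is illustrative, and `fractions.Fraction` is used so that the answer stays exact.

```python
from fractions import Fraction

def k_for_continuity(a, right_limit):
    """Solve k * a**2 = right_limit for k (requires a != 0)."""
    return Fraction(right_limit) / Fraction(a) ** 2

k = k_for_continuity(2, 3)
print(k)

# Check: with this k the left branch reaches the right-hand limit at x = 2
print(k * 2**2 == 3)
```

With a = 2 and a right-hand limit of 3, the condition 4k = 3 gives k = 3/4.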
Check your Understanding
1. The two broad areas of calculus known as differential and integral calculus are built on the foundation concept of a __________.
2. A _________ of the existence of a limit can never be based on one’s ability to sketch graphs or on tables of numerical values.
3. _________ of a graph is loosely defined as the ability to draw a graph without having to lift your pencil.
4. Left-hand limits approach the specified point from positive infinity. True / False
5. A _________ function is a function that can be written as the ratio of two algebraic expressions.
Multiple Choice
(1 + ) −1
lim (1 + )
1. The value of is
→ −1
a. 20
b. 21
c. 15
d. 17
2. The value of $\lim_{x \to \sqrt{2}} \frac{x^4 - 4}{x^2 + 3\sqrt{2}x - 8}$ is
a. $-\frac{8}{5}$
b. $\frac{7}{5}$
c. $\frac{8}{5}$
d. None of these
3. The value of $\lim_{x \to 1} \frac{x^7 - 2x^5 + 1}{x^3 - 3x^2 + 2}$ is
a. 0
b. 1
c. -1
d. 3
4. The value of $\lim_{x \to 2} \frac{x^2 - 4}{\sqrt{3x - 2} - \sqrt{x + 2}}$ is
a. 8
b. -8
c. 7
d. 5
5. Find n, if $\lim_{x \to 2} \frac{x^n - 2^n}{x - 2} = 80$, n ∈ N.
a. 5
b. 6
c. 4
d. 7
Summary
●● The two broad areas of calculus known as differential and integral calculus are built on the foundation concept of a limit.
●● The limit of the nth root of a function is the nth root of the limit whenever the limit exists and has a real nth root.
●● A proof of the existence of a limit can never be based on one’s ability to sketch graphs or on tables of numerical values.
●● Continuity of a graph is loosely defined as the ability to draw a graph without having to lift your pencil.
●● One-sided limits are differentiated as right-hand limits and left-hand limits, whereas ordinary limits are sometimes referred to as two-sided limits. Right-hand limits approach the specified point from positive infinity. Left-hand limits approach this point from negative infinity.
●● A general limit does not exist wherever a function increases or decreases infinitely as it approaches a given x-value.
●● A rational function is a function that can be written as the ratio of two algebraic expressions. If a function is considered rational and the denominator is not zero, the limit can be found by substitution.
●● The conjugate of a binomial expression is the same expression with the opposite middle sign.
●● A function f(x) is said to be continuous at a point x = a provided the left-hand limit, the right-hand limit and the value of the function are equal.
●● A function f(x) is said to be continuous on an open interval (a, b) if it is continuous at each point of (a, b).
●● A function f(x), which is not continuous at a point x = a, is said to be discontinuous at that point.
Activity
Glossary
●● Continuity of a graph: is loosely defined as the ability to draw a graph without having to lift your pencil.
●● Rational function: is a function that can be written as the ratio of two algebraic expressions.
Further Reading
●● Boelkins, Matthew. Active Calculus 2018: Single Variable. CreateSpace Independent Publishing Platform; 2018th edition (August 13, 2018)
●● Edwards, Bruce H.; Larson, Ron. Calculus: Early Transcendental Functions. Cengage Learning; 7th edition (January 1, 2018)
●● Stewart, James. Calculus 8th Edition. Cengage Learning; 8th edition (May 19, 2015)
Check your Understanding: Answers
1. Limit
2. Proof
3. Continuity
4. False
5. Rational
6. Conjugate
Multiple Choice
1. B
2. C
3. B
4. A
5. A
Structure:
4.1 Data interpretation
4.1.1 Data and Statistical Data, Frequency Distribution
4.1.2 Graphical Representation
4.1.3 Histogram
4.1.4 Frequency Polygon and Frequency Curve
4.1.5 Ogive - Part 1
4.1.6 Ogive - Part 2
4.2 Descriptive Measures
4.2.1 Measure of the Central Tendency - I
4.2.2 Measure of the Central Tendency - II
4.2.3 Measure of Dispersion
4.2.4 Kurtosis, skewness
Objectives:
At the end of this unit, you will be able to understand:
●● Data and Statistical Data, Frequency Distribution
●● Graphical Representation
●● Histogram
●● Frequency Polygon and Frequency Curve
●● Ogive Part 1
●● Ogive Part 2
4.1.1 Data and Statistical Data, Frequency Distribution
Introduction
The word Statistics is derived from the Italian word ‘Stato’, which means ‘state’; and the word ‘Statista’ refers to a person who is involved with the affairs of state. Thus, statistics was originally meant for the collection of facts useful for affairs of the state, like taxes, land records, population demography, etc. There is evidence of the use of some of the principles of statistics by ancient Indian civilization as well. Some of the techniques find mention in Vedic Mathematics. However, the modern statistical methods spread from Italy to France, Holland and Germany in the 16th century.
Definitions of Statistics
The definitions of statistics are as follows: “Statistics are the classified facts representing the conditions of the people in the state, specially those facts which can be stated in number or in table of numbers or in any tabular or classified arrangement.” – Webster
Functions of Statistics
for planning and decision-making. Predictions based on gut feeling or hunch can be harmful for the business. For example, to decide the refining capacity for a petrochemical plant, it is required to predict the demand for the petrochemical product mix, the supply of crude oil, the cost of crude, substitution products, etc., for the next 10 to 20 years, before committing an investment.
3. Testing of hypotheses: Hypotheses are statements about population parameters based on past knowledge or information. A hypothesis must be checked for its validity in the light of current information. Inductive inference about the population based on the sample estimates involves an element of risk. However, sampling keeps the decision-making costs low. Statistics provides a quantitative base for testing our beliefs about the population.
4. Relationship between Facts: Statistical methods are used to investigate the cause and effect relationship between two or more facts. The relationship between demand and supply, or between money supply and price level, can be best understood with the help of statistical methods.
5. Expectation: Statistics provides the basic building block for framing suitable policies. For example, how much raw material should be imported, how much capacity should be installed, or how much manpower recruited, etc., depends upon the expected value of the outcome of our present decisions.
Limitations of Statistics
Statistical techniques, because of their flexibility, have become popular and are used in numerous fields. But statistics is not a cure-all technique and has a few limitations. It cannot be applied to all kinds of situations and cannot be made to answer all queries. The major limitations are:
1. Statistics deals with only those problems which can be expressed in quantitative terms and are amenable to mathematical and numerical analysis. It is not suitable for qualitative data such as customer loyalty, employee integrity, emotional bonding, motivation, etc.
2. Statistics deals only with a collection (aggregate) of data, and no importance is attached to an individual item.
3. Statistical results are only an approximation and not mathematically exact. There is always a possibility of random error.
5. Statistical laws are not exact laws and are liable to be misused.
6. The greatest limitation is that the statistical data can be used properly only by a professional. A person having thorough knowledge of the methods of statistics
Data Collection
The collection and analysis of data constitute the primary stages of execution of any statistical investigation. The procedure for collection of data depends upon various considerations such as the scope, objective, nature of investigation, etc. Availability of resources such as time, money, manpower, etc., also affects the choice of procedure. Data may be collected either from a primary or from a secondary source, which are described below.
Types of Data – Primary and Secondary
The data used in a statistical study is termed either ‘primary’ or ‘secondary’, depending upon whether it was collected specifically for the undertaken study or for some other purpose.
When the data used in a statistical study is collected under the control and supervision of the investigator, such data is referred to as ‘primary data’. Primary data is collected afresh and for the first time, and thus happens to be original in character. On the other hand, when the data is not collected for this purpose, but is derived from other sources, then such data is referred to as ‘secondary data’. Often, secondary data is collected by some other organization to satisfy its own needs, but it is used by someone else for entirely different reasons.
The difference between primary and secondary data is only in terms of degree. For example, data which are primary in the hands of one become secondary in the hands of another. Suppose an investigator wants to study the working conditions of labourers in an industry. If the investigator or their agent collects the data directly, then it is called ‘primary data’. But if subsequently someone else uses this collected data for some other purpose, then it becomes ‘secondary data’ for that user.
Types of Statistics
The study of statistics can be categorized into two main branches. These branches are descriptive statistics and inferential statistics.
Descriptive statistics is used to summarize and graph the data for a chosen group. Descriptive statistics give information that describes the data in some manner. For example, suppose a pet shop sells cats, dogs, birds and fish. If 100 pets are sold, and 35 out of the 100 were dogs, then one description of the data on the pets sold would be that 35% were dogs.
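The pet shop description can be computed with a few lines of Python. In this sketch only the figure of 35 dogs out of 100 pets comes from the example; the counts for cats, birds and fish are hypothetical values chosen so that the total is 100, and the helper name `percentage_shares` is illustrative.

```python
def percentage_shares(counts):
    """Return each category's share of the total, as a percentage."""
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}

# 35 dogs out of 100 is from the example; the other counts are hypothetical
pets_sold = {"dogs": 35, "cats": 30, "birds": 20, "fish": 15}

shares = percentage_shares(pets_sold)
for name, share in shares.items():
    print(f"{name}: {share}%")
```

Each percentage is a descriptive statistic: it summarizes the data that was collected without making any claim beyond it.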
and generalizing them to a population, we need to be sure that our sample accurately represents the population. This requirement affects our process. At a broad level, we must do the following:
◌◌ Use analyses that incorporate the sampling error.
Methods of Collecting Data
Generally, for managerial decision-making, it is necessary to analyze information regarding a large number of characteristics. Collection of primary data can be time consuming and expensive, and hence requires a great deal of deliberation. According to the nature of the information required, one of the following methods or a combination of them can be selected.
1. Observation Method: In this method, the investigator collects the data through personal observations. This method is very useful if data is created in the system through capturing transactions. Computerized transaction processing can be modified to generate the necessary data or information. An investigator well versed with the system or a part of the system is ideally suited for collecting this kind of data. Since the investigator is solely involved in collecting the data, their training, knowledge and skills play an important role as far as the quality of the data is concerned. Sometimes, audio/video aids can also be used to record the observations.
2. Indirect Investigation: In this case, the information collected by oral or written interrogation forms the primary data. Usually enquiry commissions, boards of investigation, investigation teams and committees collect data in this manner. The quality of the data largely depends upon the person interviewed, their motives, memory, overall cooperation, and the interviewer’s repute with the person being interviewed.
3. Questionnaire with Personal Interview: This is the most common and popular method for data collection. In this method, individuals are personally interviewed and answers are recorded to collect the data. The questionnaire is structured and followed in a specific sequence. Occasionally, a part of the questionnaire may be unstructured to motivate the interviewee to give additional information or information on intimate matters. Accuracy of the data depends on the ability, sincerity and tactfulness of the interviewer to conduct the interview in a friendly and professional environment.
4. Mailed Questionnaire: In this method, the structured questionnaire is mailed to selected people with a request to fill it in and return it. Along with the questions, it is necessary to explain the reason for the data collection and to alleviate the respondent’s fears, if any. The respondents are assumed to be literate and able to answer the questions without any confusion. This is a less expensive and faster method to collect a large volume of data, over a wide geographic area, in a standard form, and at the convenience of the respondent. Hence this method is most popular and extensively used. However, this method needs a guard against two drawbacks, viz. the absence of an interviewer, which results in a large proportion of non-response, and the possibility of reducing the reliability of the replies if the respondent
5. Telephonic Interview: This method is less expensive but is limited in scope, as the respondent must possess a telephone and have it listed. Further, the respondent must be available and in the frame of mind to provide correct answers. This method is comparatively less reliable for public surveys. However, for industrial surveys, in developed regions, and with known customers, this method is best suited. There is a limit to the number of questions that the interviewee could answer in three to four minutes. The method is efficient if there are just three to five yes/no type questions and two to three short questions.
6. Internet Surveys: Of late, Internet surveys have become popular. These are less expensive, fast and can be interactive. However, their scope is limited to those who have regular Internet access. With rapid growth in technology and Internet connectivity, this would be one of the main methods of collecting primary data. With its interactivity and multimedia facilities it also combines the advantages of other methods.
for another purpose. Sources of secondary data are:
authentic.
3. Journals of trade, commerce, economics, scientific, engineering, medicine, etc. This data could be very reliable for a specific purpose.
can provide most authentic and much cheaper information provided we could identify the source.
6. Diaries, letters, mailers can also provide secondary data. The problem with unpublished data is that it is difficult to locate and access.
Applications of Statistics
Data is a collection of any number of related observations. We can collect the
agriculture, medicine, psychology, education. All these fields lean heavily on data and its analysis. The application of data is so vast and ever expanding that it is very difficult to define. Its use has permeated almost every facet of our lives.
to almost every realm of business. Statistics is about scientific methods to gather, organize, summarize and analyze data. More important still is to draw valid conclusions and make effective decisions based on such analysis. To a large degree, company performance depends on the preciseness and accuracy of the forecast. Statistics is an indispensable instrument for manufacturing control and market research. Statistical tools are extensively used in business for time and motion study, consumer behaviour study, investment decisions, credit ratings, performance measurements and compensations, inventory management, accounting, quality control, distribution channel design, etc. For managers, therefore, understanding statistical concepts and knowing how to use statistical tools is essential. With an increase in a company’s size and market uncertainty due to increased competition, the need for statistical knowledge and statistical analysis of various business circumstances has greatly increased. Prior to this, when the size of a business used to be small without many complexities, a single person, usually the owner or manager of the firm, used to take all decisions regarding the business. Example: A manager used to decide from where the necessary raw materials and other factors of production were to be acquired, how much output would be produced, where it would be sold, etc. This type of decision making was usually based on the experience and expectations of this single individual and as such had no scientific basis.
ve
Classification of Data
Classification refers to the grouping of data into homogeneous classes and
categories. It is the process of arranging things in groups or classes as per their
resemblances and affinities.
1. To condense the data mass in such a way that salient features can be readily noticed;
for example, household incomes can be grouped as higher income group, middle-
income group and lower income group based on certain criterion.
Bases of Classification
Some common types of bases of classification are:
2. Chronological classification: In this type, data is classified according to the time of its
occurrence; for example, monthly sales, daily demand, yearly production, etc.
3. Qualitative classification: When the data is classified according to some attributes
which are not capable of measurement, it is known as qualitative classification. In
dichotomous classification, an attribute is divided into two classes, one possessing
the attribute and the other not possessing it; for example, smoker and non-smoker, employed
and unemployed, etc. In many-fold classification, an attribute is divided so as to form several
classes, like education level, religion, mother tongue, etc.
4. Classification of data according to characteristics (quantitative classification): It refers
to the classification of data according to some characteristics which can be measured; for
example, age, salary, height, etc. Quantitative data may be further classified into two types,
namely discrete and continuous. In the discrete type, the values taken by the variable are
countable (the set of possible values may still be infinite, as with the integers). Examples of
these are the number of accidents, the number of defectives, etc. In the case of continuous
quantities, data can take any real value; for example, weight, height, distance, volume, etc.
Frequency Distribution
A classification of data that shows the different values of a variable and their respective
frequencies of occurrence is called a frequency distribution of the values.
The process of preparing a discrete frequency distribution is simple. First, all the
possible values of the variable are arranged in ascending order in a column. Then another
column of ‘Tally’ marks is prepared to count the number of times a particular value of the
variable is repeated. To facilitate counting, tally marks are grouped in blocks of five. The
last column contains the frequency. To illustrate this, let us consider one example.
Example:
Construct a frequency distribution table for the following data of the number of family
members in 30 families:
4 3 2 3 4 5 5 7 3 2
3 4 2 1 1 6 3 4 5 4
2 7 3 4 5 6 2 1 5 3

Number of Family Members    ‘Tally Marks’    Frequency
1                           |||              3
2                           ||||             5
3                           |||| ||          7
4                           |||| |           6
5                           ||||             5
6                           ||               2
7                           ||               2
Total                                        N = 30
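The tally procedure above can also be sketched in a few lines of Python. This is only an illustrative sketch: the function name `frequency_table` is a hypothetical helper, not something defined in this unit, and the data are the 30 family sizes from the example.

```python
from collections import Counter

# Family sizes for the 30 families in the example above.
family_sizes = [
    4, 3, 2, 3, 4, 5, 5, 7, 3, 2,
    3, 4, 2, 1, 1, 6, 3, 4, 5, 4,
    2, 7, 3, 4, 5, 6, 2, 1, 5, 3,
]


def frequency_table(values):
    """Return (value, frequency) pairs sorted by value in ascending order."""
    counts = Counter(values)  # counts how often each value occurs
    return sorted(counts.items())


table = frequency_table(family_sizes)
for value, freq in table:
    # One '|' per observation plays the role of the tally column.
    print(f"{value:>2}  {'|' * freq:<8} {freq}")
print("Total N =", sum(freq for _, freq in table))
```

The printed frequencies should agree with the last column of the table, and their total should equal the number of families.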
b. Continuous Frequency Distribution
For continuous data a ‘grouped frequency distribution’ is necessary. For discrete
data, a discrete frequency distribution is better than an array, but it does not condense the
data. A ‘grouped frequency distribution’ is useful for condensing discrete data by putting
them into smaller groups or classes called class-intervals. Some important terms used
in the case of continuous frequency distribution are as follows:
1. Class limits: Class limits denote the lowest and highest values which can be included
in the class. The two boundaries of a class are known as the lower limit and upper limit
of the class. For example, 10-18, 20-28, where 10 and 18 are the limits of the first class;
20 and 28 are the limits of the second class, etc.
2. Class intervals: The class interval represents the width, the span or the size of a
class. The width may be determined by subtracting the lower limit of one class from
the lower limit of the following class. For example, classes 10-15, 15-20, etc. have
class interval 20 – 15 = 5.
4. Class mark: The class mark (mid-point) of a class is the sum of two
successive lower limits divided by two. Thus the class mark is the value lying halfway
between the lower and upper class limits. For example, classes 10-20, 20-30, etc. have
class marks 15, 25, etc.
5. Types of class intervals: There are many different ways in which the limits of class
intervals can be shown.
6. Exclusive method: In this method, the class intervals are so arranged that the upper
limit of one class is the lower limit of the next class. This method always presumes that
the upper limit is excluded from the class; for example, with class limits 20-25, 25-30,
an observation with value 25 is included in class 25-30.
7. Inclusive method: In this method, the upper limit of the class is included in the
same class itself. In such a case there is no overlap between the upper limit of one class and
the lower limit of the succeeding class. For example, with class limits 20-29.5, 30-39.5,
40-49.5, etc. there is no ambiguity, but values from 29.5 to 30 or 39.5 to 40 etc. are not
allowed.
8. Open end: In an open-end distribution, the lower limit of the very first class or upper
limit of the last class is not given. For example, while stating the distribution of
monthly salary of managers in rupees, one may specify class limits as, below 10000,
9. Unequal class interval: This method is used where the width of the classes is not
equal for all classes. It is of practical use when there are large gaps in the data, or
the distribution of the data is uneven. It is used for explaining, visualizing and
plotting data with unequal class intervals. However, we must adjust the formulae for
calculations accordingly.
Cumulative and Relative Frequency
In many situations, rather than listing the actual frequency opposite each class, it
may be appropriate to list either cumulative frequencies or relative frequencies or both.
Cumulative Frequencies
The cumulative frequency of a given class interval represents the total of all
the previous class frequencies, including that of the class against which it is written.
Relative Frequencies
Relative frequency is obtained by dividing the frequency of each class by the total
number of observations, i.e. the total frequency.
◌◌ If the relative frequency is multiplied by 100, we get the percentage frequency.
◌◌ There are two important advantages in looking at relative frequencies
(percentages) instead of the absolute frequencies in a frequency distribution.
These are:
◌◌ Relative frequencies facilitate the comparison of two or more sets of data.
◌◌ Relative frequencies constitute the basis of understanding the probability
concept.
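As a small illustration of these definitions, the sketch below computes cumulative, relative and percentage frequencies. The frequencies reused here are those of the family-size example earlier in this unit (values 1 to 7 with frequencies 3, 5, 7, 6, 5, 2, 2); the variable names are illustrative only.

```python
from itertools import accumulate

# Values and frequencies from the family-size example (30 families).
values = [1, 2, 3, 4, 5, 6, 7]
frequencies = [3, 5, 7, 6, 5, 2, 2]

total = sum(frequencies)

# Cumulative frequency: running total up to and including each class.
cumulative = list(accumulate(frequencies))

# Relative frequency: class frequency divided by the total frequency.
relative = [f / total for f in frequencies]

# Percentage frequency: relative frequency multiplied by 100.
percentage = [100 * r for r in relative]

for v, f, cf, rf, pf in zip(values, frequencies, cumulative, relative, percentage):
    print(f"{v}  f={f}  cf={cf}  rf={rf:.3f}  {pf:.1f}%")
```

The last cumulative frequency always equals the total frequency, and the relative frequencies always add up to 1.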
22 21 37 33 28 42 56 33 32 59
40 47 29 65 45 48 55 43 42 40
37 39 56 54 38 49 60 37 28 27
32 33 47 36 35 42 43 55 53 48
29 30 32 37 43 54 55 47 38 62
measures.
4.1.2 Graphical Representation
One of the most convincing and appealing ways in which statistical results may be
represented is through graphs and diagrams.
Diagrams and graphs are extensively used for the following reasons:
(i) Diagrams and graphs are attractive to the eye.
(ii) They are easier to remember.
(iii) They facilitate easy comparison of data from one period to another.
(iv) Diagrams and graphs give a bird’s eye view of the entire data; therefore, they convey
meaning very quickly.
a. Bar Diagram
In a bar diagram, only the length of the bar is taken into account but not the width.
In other words, a bar is a thick line whose width is shown merely for appearance; only the
length of the bar is taken into account, and so it is called a one-dimensional diagram.
It represents only one variable. Since the bars are of the same width and vary only in
length (height), a comparative study becomes very easy. Simple bar diagrams
are very popular in practice. A bar chart can be either vertical or horizontal; for example,
sales, production, population figures etc. for various years may be shown by simple bar
charts.
Illustration - 1
The following table gives the birth rate per thousand of different countries over a
certain period of time.
[Figure: Simple Bar Diagram. x-axis: Countries (India, Germany, UK, New Zealand, Sweden,
China); y-axis: Birth Rate.]
Comparing the sizes of the bars, China’s birth rate is the highest, followed by India, whereas
Germany and Sweden share the lowest positions.
Countries:                          A    B    C    D    E    F
Production of Rice (000’s tons):    38   42   29   28   18   11
[Figure: Simple bar diagram, “Production of Rice (000's Tons)”, with countries A to F on the
x-axis and bars of height 38, 42, 29, 28, 18 and 11.]
Sub-divided Bar Diagram
In a subdivided bar diagram, each bar representing the magnitude of a given value is
further subdivided into various components. Each component occupies a part of the bar
Illustration -
2015-2016 280 610 280
[Figure: Sub-divided bar diagram for 2014-15 and 2015-16. Index: Sci, Hum, Com.
Scale: 1 cm = 200.]
Illustration – 2
The numbers of students in University X during 2008 to 2011 are as follows.
Represent the data by a similar diagram.
Year           Arts      Commerce    Science    Total
2008 - 2009    20,000    10,000      5,000      35,000
2009 - 2010    26,000    9,000       7,000      42,000
[Figure: Sub-divided bar diagram of the number of students by year, with components Arts,
Commerce and Science. y-axis: Number of students (0 to 50000).]
components are shown as separate adjoining bars. The height of each bar represents
the actual value of the component. The components are shown by different shades or
colours.
Illustration 1 - Construct a suitable bar diagram for the following data on the number of
students in two different colleges in different faculties.
College    Arts    Science    Commerce    Total
A          1200    800        600         2600
B          700     500        600         1800
[Figure: Multiple bar diagram. y-axis: No. of students. College ‘A’: 1200, 800, 600;
College ‘B’: 700, 500, 600.]
Fig: A multiple bar diagram showing numbers of students in two different colleges
in different departments.
Illustration 2
Illustration 1
Year    Men    Women    Children
1995    45%    35%      20%
1996    44%    34%      22%
1997    48%    36%      16%
[Figure: Bar diagram for the years 2006, 2007 and 2008 with components Ist Class,
IInd Class, IIIrd Class and Failed; vertical scale 0 to 700.]
axis on the graph paper. Make sure to write the title above the table so that it
conveys the purpose of the graph.
◌◌ For instance, if one of the factors is time, it goes on the horizontal axis,
referred to as the x-axis. The other factor then goes on the
vertical axis, which is known as the y-axis. Both axes are to be labelled as
per their respective factors. For example, the x-axis can be labelled as time or
day.
◌◌ Afterwards, with the help of the given data, the exact values can be
plotted as points on the graph. Once the points are joined, a clear inference about the
trend can be made.
Pie Chart
A pie chart or a circle chart is a circular statistical graphic that is divided into
slices to illustrate a numerical proportion. In a pie chart, the arc length of each slice
is proportional to the quantity it represents. While it is named for its resemblance to a
pie which has been sliced, there are variations in the way it can be presented. In a pie
chart, categories of data are represented by wedges in the circle that are proportional in
size to the percentage of individuals in each category.
Pie charts are very widely used in the business world and the mass media. Pie
charts are generally used to show percentage or proportional data, and usually the
percentage represented by each category is provided next to the corresponding slice of
the pie. Pie charts are good for displaying data for around six categories or fewer.
Example:
Show the following data of expenditure of an average working class family by a
suitable diagram
Item                 Expenditure (%)
Food                 65
Clothing             10
Housing              12
Fuel and Lighting    5
Miscellaneous        8
Solution:
The angles of the different sectors are calculated as shown below:
1. Food = 65/100 x 360 = 234°
2. Clothing = 10/100 x 360 = 36°
3. Housing = 12/100 x 360 = 43.2°
4. Fuel and Lighting = 5/100 x 360 = 18°
5. Miscellaneous = 8/100 x 360 = 28.8°
[Figure: Pie chart of the family's expenditure, with Food as the largest sector.]
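The sector-angle calculation for a pie chart can be sketched as follows. The helper name `sector_angles` is a hypothetical one introduced for illustration; the shares are the expenditure percentages from the example, with Food taken as 65 as in the solution.

```python
# Percentage shares of expenditure from the example.
expenditure = {
    "Food": 65,
    "Clothing": 10,
    "Housing": 12,
    "Fuel and Lighting": 5,
    "Miscellaneous": 8,
}


def sector_angles(shares):
    """Convert each share into a central angle in degrees (share / total x 360)."""
    total = sum(shares.values())
    return {item: value / total * 360 for item, value in shares.items()}


angles = sector_angles(expenditure)
for item, angle in angles.items():
    print(f"{item}: {angle:.1f} degrees")
```

Dividing by the total rather than by 100 means the same function also works when the data are given as absolute amounts instead of percentages. The angles should always add up to 360 degrees.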
4.1.3 Histogram
A histogram consists of contiguous boxes and has both a horizontal axis and a
vertical axis. The horizontal axis is labelled with what the data represents (for instance,
distance from your home to school). The vertical axis is labelled either frequency or
relative frequency. The graph will have the same shape with either label. The histogram
(like the stemplot) can give you the shape of the data, the center, and the spread of the
data. (The next section tells you how to calculate the center and the spread.)
The relative frequency is equal to the frequency for an observed value of the data
divided by the total number of data values in the sample. (In the chapter on Sampling
and Data (Section 1.1), we defined frequency as the number of times an answer
occurs.)
RF = f/n
where f is the frequency, n is the total number of data values (or the sum of the
individual frequencies), and RF is the relative frequency.
Example – If 3 students in a Mathematics class of 40 students received from 90% to
100%, then
f = 3, n = 40 and
RF = f/n
= 3/40
= 0.075
Seven and a half percent of the students received 90% to 100%. Ninety percent to
100% are quantitative measures.
Example:
Formulate the Histogram from the following data –
Solution:
[Figure: Histogram.]
When frequencies are added, they are called cumulative frequencies. The
curve obtained by plotting cumulative frequencies is called a cumulative frequency
curve or an ogive (pronounced as ojive).
class, to get the cumulative frequencies. (ii) Plot classes on the horizontal (x-axis) and
cumulative frequencies on the vertical (y-axis).
Less than Ogive: To plot a less than ogive, the data is arranged in ascending order
of magnitude and the frequencies are cumulated from the top, i.e. by adding. Cumulative
frequencies are plotted against the upper class limits. Ogives under this method give
a rising (positively sloped) curve.
Greater than Ogive: To plot a greater than ogive, the data is arranged in
ascending order of magnitude and the frequencies are cumulated from the bottom, or
subtracted from the total starting from the top. Cumulative frequencies are plotted against
the lower class limits. Ogives under this method give a falling (negatively sloped) curve.
Uses: Certain values like median, quartiles, quartile deviation, co-efficient of
skewness etc. can be located using ogives. Ogives are helpful in the comparison of the
two distributions.
Illustration 1 –
Draw less than and more than ogive curves for the following frequency distribution
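Class    0-20    20-40    40-60    60-80    80-100    100-120    120-140    140-160
f        5       12       18       25       15        12         8          5

Upper limit    Less than c.f.    More than c.f.    Lower limit
20             5                 100               0
40             17                95                20
60             35                83                40
80             60                65                60
100            75                40                80
120            87                25                100
140            95                13                120
160            100               5                 140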
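The two cumulative columns used for the ogives can be computed as in the sketch below. The class limits (0-20 up to 140-160) are read off the lower-limit and upper-limit columns of the table, and the variable names are illustrative only.

```python
from itertools import accumulate

# Class limits and frequencies from Illustration 1.
lower_limits = list(range(0, 160, 20))           # 0, 20, ..., 140
upper_limits = [low + 20 for low in lower_limits]  # 20, 40, ..., 160
frequencies = [5, 12, 18, 25, 15, 12, 8, 5]

# "Less than" cumulative frequencies: add from the top.
less_than = list(accumulate(frequencies))
total = less_than[-1]

# "More than" cumulative frequencies: total minus everything below the class.
more_than = [total - below for below in [0] + less_than[:-1]]

# Points to plot: less-than values against upper limits,
# more-than values against lower limits.
less_than_points = list(zip(upper_limits, less_than))
more_than_points = list(zip(lower_limits, more_than))

print(less_than_points)
print(more_than_points)
```

Plotting `less_than_points` gives the rising ogive and `more_than_points` the falling one; the x-coordinate where the two curves cross gives the median.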
[Figure: Less than and more than ogives for the distribution above. x-axis: 0 to 180;
y-axis: 0 to 180.]
Example: Find the median from the following series. Also draw the less than ogive and the
more than ogive, and locate the median on a graph.
Income (₹)    No. of Persons
0-20          82
20-40         112
40-60         150
60-80         95
80-100        48
Solution:
C.I.      F      Class (Less than)    L.C.F.    Class (More than)    M.C.F.
0-20      82     20                   82        0                    487
20-40     112    40                   194       20                   405
40-60     150    60                   344       40                   293
60-80     95     80                   439       60                   143
80-100    48     100                  487       80                   48
[Figure: Less than ogive and more than ogive. x-axis: Income (20 to 100); y-axis: No. of
Persons (0 to 600). The median income is located at the intersection of the two ogives.]
4.1.6 Ogive – Part 2
If, from a cumulative frequency table, the upper limits of the classes are taken as
x-coordinates and the cumulative frequencies as y-coordinates, and the points
are plotted and joined by straight lines, we obtain a less than type
cumulative frequency polygon.
If the more than cumulative frequency is plotted against the corresponding lower limit
of each class and the points plotted are joined by straight lines, we obtain a more than
type cumulative frequency polygon.
However, when the points plotted are joined by a free-hand smooth curve, we
obtain a cumulative frequency curve.
Class     Frequency    Cumulative frequency
0-10      2            2
10-20     4            6
20-30     10           16
30-40     4            20
40-50     3            23
50-60     8            31
60-70     1            32
70-80     5            37
80-90     11           48
90-100    2            50
Activity 1: Construct an ogive curve for the following frequency distribution of
Cotton Mills in Bombay according to the quantities of cotton consumed:
8-10       8
10-12      4
12-14      1
14-16      3
16-18      1
18-20      1
over 20    2
The following table shows the frequency distribution for the number of students per
Students x    Frequency F
1             7
4             46
7             165
10            195
13            189
16            89
19            28
22            19
25            9
28            3
Check your Understanding
1. ___________ helps in forecasting by analysing trends, which are essential for
planning and decision-making.
2. ______________ are the statements about population parameters based on past
knowledge or information.
3. Inductive ___________ about the population based on the sample estimates involves
an element of risk.
4. Statistics deals with only those problems which can be expressed in quantitative
terms and are amenable to mathematical and numerical analysis. True / False
5. _____________ statistics is used to sum up and graph the data for a category
picked.
6. ______________ refers to the grouping of data into homogeneous classes and their
categories.
7. In a _________ graph, only the length of the bar is taken into account but not the
width.
8. A _________ graph is a type of chart used to show information changing over time.
9. A ________ chart or a circle chart is a circular statistical graphic that is divided into
slices to illustrate a numerical proportion.
10. A ___________ consists of contiguous boxes and has both a horizontal axis and a
vertical axis.
11. The ___________ frequency is equal to the frequency for an observed value of the
data divided by the total number of data values in the sample.
Summary
●● Statistics was originally meant for the collection of facts useful for affairs of the state,
like taxes, land records, population demography, etc.
●● “Statistics are the classified facts representing the conditions of the people in the
state. Specially those facts which can be stated in number or in table of numbers
or in any tabular or classified arrangement.”
●● The collection and analysis of data constitute the primary stages of execution of
any statistical investigation. The procedure for collection of data depends upon
●● Inferential statistics are techniques that allow us to use samples to
generalize about the populations from which the samples were taken. Hence, it is crucial
that the sample represents the population accurately.
●● Statistical tools are extensively used in business for time and motion study,
consumer behaviour study, investment decisions, credit ratings, performance
measurements and compensations, inventory management, accounting, quality
control, distribution channel design, etc. For managers, therefore, understanding
statistical concepts and knowing how to use statistical tools is essential.
●● Classification refers to the grouping of data into homogeneous classes and their
categories. It is the process of arranging things in groups or classes as per their
resemblances and affinities.
●● One of the most convincing and appealing ways in which statistical results may be
represented is through graphs and diagrams.
●● A line graph is a type of chart used to show information changing over time. We
use multiple dots connected by straight lines to plot line graphs. It is also known as
a line chart. It consists of two axes defined as the ‘x’ and ‘y’ axis.
●● A pie chart or a circle chart is a circular statistical graphic that is divided into slices
to illustrate a numerical proportion. Pie charts are widely used in the business
world and the mass media. They are generally used to show percentages or
proportional data and usually the percentages represented by each category is
curve or an ogive.
Activity
Create the cumulative frequency table and draw the Ogive for the data below.
●● Classification: the grouping of data into homogeneous classes and categories.
●● Line Graph: a type of chart used to show information changing over time.
●● Primary Data: data used in a statistical study that is collected under the
control and supervision of the investigator.
●● Relative frequency: the frequency for an observed value of the data
divided by the total number of data values in the sample.
●● Secondary Data: data derived from other sources that was not
collected for the specific purpose of the research.
Further Reading
●● Bekes, Gabor. Data Analysis for Business, Economics, and Policy. Cambridge
University Press (May 6, 2021)
●● Huberman, A. Michael; Miles, Matthew B.; Saldana, Johnny. Qualitative Data
Analysis: A Methods Sourcebook
●● Knaflic, Cole Nussbaumer. Storytelling with Data: A Data Visualization Guide for
Business Professionals. Wiley; 1st edition (November 2, 2015)
Check your Understanding: Answers
1. Statistics
2. Hypotheses
3. Inference
4. True
5. Descriptive
6. Classification
7. Bar
8. Line
9. Pie
10. Histogram
11. Relative
12. True
Objectives:
At the end of this unit, you will be able to understand
●● Measure of the Central Tendency – I
●● Measure of the Central Tendency – II
●● Measure of Dispersion
●● Kurtosis, Skewness
Introduction
A measure of central tendency is a single value which can be considered as
representative of a set of observations. The value around which the observations
can be considered as centered is known as an average, an average value or a
location center. Since such representative values tend to lie centrally within a set of
observations when arranged according to magnitude, these averages are called
measures of central tendency.
central value.
●● Mean - The mean is the average of the numbers. It is easy to calculate: add up all
the numbers, then divide by how many numbers there are. In other words it is the
the average. The median is often used as opposed to the mean when the series
includes outliers that may distort the average of the values.
●● Mode - The mode is the number most frequently seen in a dataset. A collection of
numbers may have one mode, more than one mode, or no mode at all. Other popular central
Average
An average is a single figure that sums up the characteristics of a whole group of
figures.
In the words of Clark, “average is an attempt to find one single figure to describe
In the words of Croxton and Cowden, “an average is a single value within the
range of the data that is used to represent all of the values in the series. Since an average
is somewhere within the range of the data, it is called a measure of central value.”
Objectives served by Averages
Averages serve the following purposes:
2. To compare different groups by means of averages.
3. To obtain a clear picture of a whole group by studying sample data.
4. To provide definite rates to the relationship between different groups.
Characteristics of good average
1. It is rigidly defined and its value is always definite.
2. It is easy to understand and calculate, hence it is very popular.
3. It is based on all the observations, so that it can be a good representative.
4. It can be easily used for comparisons.
5. It is capable of further algebraic treatment, like finding the sum of the observation
values from the mean and the total number of observations, and finding the
combined arithmetic mean when different groups are given, etc.
6. It is not affected much by sampling fluctuations.
Arithmetic Mean
Arithmetic mean is defined as the value obtained by dividing the total of the values
of all items in the series by their number. In other words, it is defined as the sum of the
given observations divided by the number of observations, i.e., add the values of all items
and divide the total by the number of items.
Symbolically: $\bar{x} = \dfrac{x_1 + x_2 + x_3 + \dots + x_n}{n}$
1. The sum of the deviations, of all the values of x, from their arithmetic mean, is zero.
2. The product of the arithmetic mean and the number of items gives the total of all
items.
3. Finding the combined arithmetic mean when different groups are given.
Demerits of Arithmetic Mean
1. Arithmetic mean is affected by extreme values.
2. Arithmetic mean cannot be determined by inspection and cannot be located
graphically.
3. Arithmetic mean cannot be obtained if a single observation is lost or missing.
4. Arithmetic mean cannot be calculated when open-end class intervals are present in
the data.
Arithmetic Mean for Ungrouped Data
A) Individual Series
1. Direct Method
The following steps are involved in calculating the arithmetic mean of an individual
series using the direct method:
- Add up the values of all the items in the series.
- Divide the sum of the values by the number of items. The result is the arithmetic
mean.
The following formula is used: $\bar{X} = \dfrac{\sum x}{N}$
Illustration 1 – Value (x): 125 128 132 135 140 148 155 157 159 161
Solution –
$\sum x = 125 + 128 + 132 + 135 + 140 + 148 + 155 + 157 + 159 + 161 = 1440$
$\bar{X} = \dfrac{\sum x}{N} = \dfrac{1440}{10} = 144$
N = Number of items
Illustration - 1
Calculate the arithmetic average of the data given below using the short-cut method.
Roll No    1     2     3     4     5     6     7     8     9     10
Marks      43    48    65    57    31    60    37    48    78    59
Solution –
Taking the assumed mean a = 60, the deviations d = x − a are:
Roll No    Marks (x)    d = x − 60
1          43           -17
2          48           -12
3          65           5
4          57           -3
5          31           -29
6          60           0
7          37           -23
8          48           -12
9          78           18
10         59           -1
                        ∑d = −74
$\bar{X} = a + \dfrac{\sum d}{N} = 60 + \dfrac{-74}{10} = 60 - 7.4 = 52.6$
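The short-cut method can be checked against the direct method with a short sketch. The assumed mean of 60 is inferred from the deviations listed in the solution (for example 43 − 60 = −17); the function name `shortcut_mean` is a hypothetical helper for illustration.

```python
# Marks of the ten students from the illustration.
marks = [43, 48, 65, 57, 31, 60, 37, 48, 78, 59]


def shortcut_mean(values, assumed_mean):
    """Short-cut method: assumed mean plus the average deviation from it."""
    deviations = [x - assumed_mean for x in values]
    return assumed_mean + sum(deviations) / len(values)


# Direct method for comparison: sum of values divided by their number.
direct = sum(marks) / len(marks)
shortcut = shortcut_mean(marks, assumed_mean=60)

print("Sum of deviations:", sum(x - 60 for x in marks))
print("Direct mean:", direct)
print("Short-cut mean:", shortcut)
```

Both methods give the same mean whatever assumed mean is chosen; a value close to the centre of the data simply keeps the deviations small and the hand calculation easy.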
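When the arithmetic means and the numbers of items of two or more related groups are
known, the mean of the entire group, called the combined mean, can be obtained. The
combined average of two series can be calculated by the given formula:
$\bar{x}_{12} = \dfrac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}$
where n1 = No. of items of the first group, n2 = No. of items of the second group, and
$\bar{x}_1$, $\bar{x}_2$ are the arithmetic means of the first and second groups.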
Example - From the following data ascertain the combined mean salary of a factory
consisting of 2 branches, namely branch A and branch B. In branch A the number of
workers is 500 and their average salary is 300. In branch B the number of workers is
1,000 and their average salary is 250.
Solution:
Let the number of workers in branch A be n1 = 500 with mean salary $\bar{x}_1$ = 300, and the
number of workers in branch B be n2 = 1,000 with mean salary $\bar{x}_2$ = 250.
$\bar{x}_{12} = \dfrac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2} = \dfrac{500(300) + 1000(250)}{500 + 1000} = \dfrac{1{,}50{,}000 + 2{,}50{,}000}{1500} = 266.67$
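The combined-mean formula extends naturally to any number of groups. The sketch below is a minimal illustration using the two branches from the example; `combined_mean` is a hypothetical helper name.

```python
def combined_mean(groups):
    """Combined mean of several groups given as (number_of_items, group_mean) pairs."""
    total_items = sum(n for n, _ in groups)
    total_value = sum(n * mean for n, mean in groups)  # n * mean recovers each group's total
    return total_value / total_items


# Branch A: 500 workers, average salary 300; Branch B: 1000 workers, average salary 250.
result = combined_mean([(500, 300), (1000, 250)])
print(round(result, 2))
```

Note that the combined mean lies closer to 250 than to 300 because branch B has twice as many workers as branch A.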
Sometimes, some observations carry relatively more importance than other
observations. A weight must be given to each such observation on the basis of its
relative importance. In the weighted arithmetic mean, the value of
each item is multiplied by its weight, and the sum of these products is divided by the sum
of the weights.
Symbolically: $\bar{x}_w = \dfrac{\sum wx}{\sum w}$
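Example – Calculate the simple and weighted averages from the following data –
Month            Jan     Feb      March    April    May      June
Price            42.5    51.25    50       52       44.25    54
No. of tonnes    25      30       40       50       10       45
Solution:
Month    Price (x)    No. of tonnes (w)    wx
Jan      42.5         25                   1062.5
Feb      51.25        30                   1537.5
March    50           40                   2000
April    52           50                   2600
May      44.25        10                   442.5
June     54           45                   2430
Total    294          200                  10072.5
Simple AM
$\bar{X} = \dfrac{\sum x}{n} = \dfrac{294}{6} = 49$
Weighted AM
$\bar{X}_w = \dfrac{\sum wx}{\sum w} = \dfrac{10072.5}{200} = 50.36$
The correct average price paid is ₹50.36 and not ₹49, i.e., the weighted arithmetic mean
is more appropriate here than the simple arithmetic mean.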
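The difference between the simple and the weighted mean can be verified with a short sketch, using the monthly prices as the values and the tonnes purchased as the weights. The function name `weighted_mean` is a hypothetical helper for illustration.

```python
# Monthly prices (Jan to June) and the number of tonnes bought at each price.
prices = [42.5, 51.25, 50, 52, 44.25, 54]
tonnes = [25, 30, 40, 50, 10, 45]


def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


simple = sum(prices) / len(prices)
weighted = weighted_mean(prices, tonnes)

print("Simple mean:", simple)
print("Weighted mean:", round(weighted, 2))
```

The weighted mean comes out higher than the simple mean of 49 because the larger purchases were made in the months with the higher prices.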
Median
Median is defined as the value of the item dividing the series into two equal
halves, where one half contains all values less than (or equal to) it and the other half
contains all values greater than (or equal to) it. It is also defined as the “central value”
of the variable. To find the median, the values of the items must be arranged in order of
their size or magnitude.
Median is a positional average. The term position refers to the place of a value in
the series; the place of the median is such that an equal number of items
lie on either side of it. Therefore it is also called a locative average.
Merits of Median
Following are the advantages of median:
1. It is rigidly defined.
2. It is easy to calculate and understand.
3. It can be located graphically.
4. It is not affected by extreme values like the arithmetic mean.
5. It can be found by mere inspection.
6. It can be used for qualitative studies.
7. Even if the extreme values are unknown, the median can be calculated if one knows the
number of items.
Demerits of Median
1. In the case of individual observations, the values are to be arranged in order of their
size to locate the median. Such an arrangement of data is a tedious task if the number of
items is large.
2. If the median is multiplied by the number of items, the total value of all the items
Application of Median
Example – Determine the median from the following –
S. No    Value or Size
1        15
2        20
3        23
4        23
5        25
6        25
7        25
8        27
9        40
Median = value of the (N + 1)/2 th item = (9 + 1)/2 = 10/2
= 5th term
= 25
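For an individual series, locating the median amounts to sorting the values and picking the middle item. The sketch below illustrates this; `median_individual` is a hypothetical helper, and the rule of averaging the two middle items for an even number of observations is the usual convention rather than something stated in the example above.

```python
def median_individual(values):
    """Median of an individual series: the middle item of the sorted values."""
    ordered = sorted(values)  # arrange in ascending order of size
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of items: the (n + 1)/2 th item (index mid when counting from 0).
        return ordered[mid]
    # Even number of items: average of the two middle items (usual convention).
    return (ordered[mid - 1] + ordered[mid]) / 2


values = [15, 20, 23, 23, 25, 25, 25, 27, 40]
print(median_individual(values))
```

With nine items the function returns the 5th item of the ordered series, in agreement with the worked example.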
The following steps are involved in calculating the median in a continuous series:
1) Find the cumulative frequencies.
2) Find the median item, i.e., the N/2 th item.
3) Find the group or class containing the median.
4) Estimate the median by applying the following formula:
$Me = l + \dfrac{\frac{N}{2} - cf}{f_m} \times i$
where Me = Median
l = Lower limit of the median class
N = Total frequency
cf = Cumulative frequency of the class preceding the median class
$f_m$ = Frequency of the median class
i = Width of the median class
Example 1:
Marks    Cumulative frequency
0-10     5
0-20     13
0-30     20
0-40     32
0-50     60
0-60     80
0-70     90
Solution:
Mark     F     CF
0-10     5     5
10-20    8     13
20-30    7     20
30-40    12    32
40-50    28    60
50-60    20    80
60-70    10    90
$M = \dfrac{N}{2} = \dfrac{90}{2} = 45$, so the median class is 40-50.
$Me = l + \dfrac{\frac{N}{2} - cf}{f_m} \times i = 40 + \dfrac{45 - 32}{28} \times 10 = 40 + \dfrac{13}{28} \times 10$
= 40 + 4.64 = 44.64
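The interpolation formula for the median of a continuous series can be sketched as below. The class frequencies are obtained by differencing the cumulative frequencies 5, 13, 20, 32, 60, 80, 90 given in the example (so the second class has 13 − 5 = 8 items); `grouped_median` is a hypothetical helper name.

```python
def grouped_median(classes):
    """Median of a continuous series.

    classes: list of (lower_limit, upper_limit, frequency) in ascending order.
    Uses Me = l + ((N/2 - cf) / fm) * i, where cf is the cumulative frequency
    before the median class, fm its frequency and i its width.
    """
    n = sum(f for _, _, f in classes)
    half = n / 2
    cf = 0  # cumulative frequency of the classes already passed
    for lower, upper, f in classes:
        if f > 0 and cf + f >= half:
            return lower + (half - cf) / f * (upper - lower)
        cf += f
    raise ValueError("the distribution has no observations")


marks = [
    (0, 10, 5), (10, 20, 8), (20, 30, 7), (30, 40, 12),
    (40, 50, 28), (50, 60, 20), (60, 70, 10),
]
print(round(grouped_median(marks), 2))
```

The loop stops at the first class whose cumulative frequency reaches N/2, which is exactly the median class identified in step 3 of the procedure.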
Mode
The word “mode” is derived from the French word “la mode”, which means fashion.
So it can be regarded as the most fashionable item in the series or the group.
Croxton and Cowden regard mode as “the most typical of a series of values”.
As a result it can sum up the characteristics of a group more satisfactorily than the
arithmetic mean or median. Mode is defined as the value of the variable occurring most
frequently in a distribution. In other words, it is the most frequent size of item in a series.
Merits of Mode
The following are the merits of mode:
1. The most important advantage of mode is that it is usually an actual value.
5. It is easy to understand, and this average is used by people in their everyday speech.
Demerits of Mode
ity
Applications of Mode
Mode in Ungrouped Data
(c
a) IndividualSeries
The mode of this series can be obtained by mere inspection. The number which
occurs most often is the mode.
Amity Directorate of Distance & Online Education
138 Quantitative Aptitude
Illustration - 1
Notes
Locate mode in the data 7, 12, 8, 5, 9, 6, 10, 9, 4, 9, 9
e
Solution:
On inspection, the number 9 occurs 4 times, more often than any other number.
Therefore, mode (Z) = 9.
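Locating the mode by inspection amounts to counting how often each value occurs. A minimal Python sketch using the data of Illustration 1 (the variable names are illustrative only):

```python
from collections import Counter

data = [7, 12, 8, 5, 9, 6, 10, 9, 4, 9, 9]

counts = Counter(data)                 # value -> number of occurrences
highest = max(counts.values())         # the largest frequency
# Keep every value that reaches the largest frequency (there may be ties).
modes = sorted(v for v, c in counts.items() if c == highest)

print(modes, highest)
```

Returning a list rather than a single number makes it visible when a series has more than one mode.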
b) Discrete Series
The mode is calculated by applying a grouping table and an analysis table.
i) Grouping table: consists of six columns. The 1st column holds the frequencies,
the 2nd and 3rd columns hold the frequencies grouped in twos, and the 4th, 5th
and 6th columns hold the frequencies grouped in threes.
ii) Analysis table: consists of two columns, namely tally bar and frequency.
Steps in Calculating Mode in Discrete Series
The following steps are involved in calculating mode in discrete series:
1. Group the frequencies by twos.
2. Leave the first frequency and group the other frequencies in twos.
3. Group the frequencies in threes.
4. Leave the frequency of the first size and add the frequencies of other sizes in
threes.
5. Leave the frequencies of the first two sizes and add the frequencies of the other
sizes in threes.
6. Prepare an analysis table to know the size occurring the maximum number
of times. Find out the size which occurs the largest number of times. That
particular size is the mode.
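The grouping and analysis tables described in the steps above can be sketched in Python. This is an illustrative sketch only: the function name `grouping_mode` and the sample frequencies are hypothetical and do not come from the text.

```python
from collections import Counter

def grouping_mode(sizes, freqs):
    """Mode of a discrete series by the grouping-table / analysis-table method."""
    tally = Counter()                       # analysis table: size -> tally marks
    n = len(freqs)
    # (group width, starting offset) for the six columns of the grouping table
    columns = [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]
    for width, start in columns:
        groups = [
            (sum(freqs[i:i + width]), range(i, i + width))
            for i in range(start, n - width + 1, width)
        ]
        if not groups:
            continue
        best = max(total for total, _ in groups)
        for total, members in groups:
            if total == best:               # mark every size in the largest group
                for i in members:
                    tally[sizes[i]] += 1
    top = max(tally.values())
    return [s for s in sizes if tally[s] == top], tally

# Hypothetical data: sizes 1..7 with their frequencies
modes, tally = grouping_mode([1, 2, 3, 4, 5, 6, 7], [2, 5, 9, 12, 8, 4, 1])
print(modes, tally[4])
```

Each column contributes one tally mark to every size inside its largest group; the size with the most marks is taken as the mode.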
c) Continuous Series
1. Find out the modal class. Modal class can be easily found out by inspection.
The group containing maximum frequency is the modal group. Where two or
more classes appear to be a modal class group, it can be decided by grouping
Marks F CF
0-10 5 5
Solution:
Here, the maximum frequency is 12, corresponding to the class interval (35-40),
which is the modal class. Therefore, l = 35, fm = 12, f1 = 8, f2 = 7 and the class width i = 5.
X        F
20-25    1
25-30    3
30-35    8     (f1)
35-40    12    (fm)
40-45    7     (f2)
45-50    5

$$\text{Mode} = l + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times i = 35 + \frac{12 - 8}{2(12) - 8 - 7} \times 5 = 35 + \frac{4}{24 - 15} \times 5 = 35 + \frac{20}{9} = 35 + 2.22 = 37.22$$
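The modal-class interpolation can be checked with a short Python sketch. The function name `grouped_mode` is hypothetical; the inputs are the lower limit of the modal class, the class width, and the frequencies of the preceding, modal and following classes (8, 12 and 7 for the modal class 35-40 in this example).

```python
def grouped_mode(lower, width, f_prev, f_modal, f_next):
    """Mode = l + (fm - f1) / (2*fm - f1 - f2) * i for a continuous series."""
    denominator = 2 * f_modal - f_prev - f_next
    if denominator == 0:
        raise ValueError("formula undefined when 2*fm equals f1 + f2")
    return lower + (f_modal - f_prev) / denominator * width

mode_value = grouped_mode(lower=35, width=5, f_prev=8, f_modal=12, f_next=7)
print(round(mode_value, 2))
```

The guard on the denominator covers the degenerate case in which the modal class is no higher than the average of its two neighbours.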
Example 2:
Less than 10 20 30 40 50 60 70 80
Solution:
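The “less than” limits must first be converted into continuous classes. The lower limit
of each class is its upper limit minus the class length: LL = UL – CL, where
CL = 20 – 10 = 10, i.e., 10 – 10 = 0, and so on.

$$Z = l + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times i = 30 + \frac{36 - 24}{2(36) - 24 - 20} \times 10 = 30 + \frac{12}{72 - 44} \times 10 = 30 + \frac{120}{28} = 30 + 4.285 = 34.285$$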
Empirical Relationship between Mean, Median and Mode
When the mode is ill-defined, its value is difficult to find directly. In moderately
skewed distributions an empirical relationship exists among the mean, median and mode:
the median lies between the mode and the mean, with the mode about 2/3 of the
mode-to-mean distance on one side of the median and the mean about 1/3 of that distance
on the other side. In a positively skewed distribution the mode lies to the left of the
median and the mean to the right; in a negatively skewed distribution the order is
reversed. Karl Pearson expressed this relationship as Z = 3M – 2X, where Z is the mode,
M is the median and X is the mean.
Example - M is 28, AM is 29; find the mode.
Solution:
Z = 3M – 2X
= 3(28) – 2(29)
= 84 – 58
= 26
29 > 28 > 26, i.e., mean > median > mode.
Example - M = ?, AM = 39, Z = 36.5
Solution:
Z = 3M – 2X
36.5 = 3M – 2(39)
36.5 = 3M – 78
3M = 36.5 + 78 = 114.5
M = 114.5/3
= 38.17
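The relationship Z = 3M – 2X can be rearranged to give whichever of the three averages is missing. A small Python sketch of the two cases worked in the examples (the function names are illustrative):

```python
def mode_from_mean_median(mean, median):
    """Z = 3M - 2X (Karl Pearson's empirical relationship)."""
    return 3 * median - 2 * mean

def median_from_mean_mode(mean, mode):
    """Rearranged form: M = (Z + 2X) / 3."""
    return (mode + 2 * mean) / 3

z = mode_from_mean_median(mean=29, median=28)
m = median_from_mean_mode(mean=39, mode=36.5)
print(z, round(m, 2))
```

Because the relationship is empirical, the results are estimates that hold reasonably well only for moderately skewed distributions.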
Key Takeaways
●● Measures of central tendency: It is a single value which can be considered as
representative of a set of observations and around which the observations can be
●● Median: It is defined as the value of that item which divides the series into two
equal halves; one half contains all values less than (or equal to) it and the other half
contains all values greater than (or equal to) it. It is also defined as the “central
value” of the variable.
may differ. One of these characteristics is central tendency. A central tendency measure is
a single value that attempts to describe a set of data by identifying the central position
within that set of data. As a result, measures of central tendency are also known as
measures of central location. They’re also known as summary statistics. The mean
(also known as the average) is probably the most familiar measure of central tendency,
but there are others, such as the median and the mode. The mean, median, and mode
are all valid measures of central tendency, but depending on the circumstances, some
measures of central tendency are more appropriate to use than others.
Measures of central tendency are averages of the first order. Following are some
general rules:
◌◌ The mean is the most commonly used and generally regarded as the best
measure of central tendency. However, in some cases, either the median or
the mode is preferable.
◌◌ When determining central tendency, the median is the preferred measure when:
In the data distribution, there are a few outliers (remember that a single
outlier can have a significant impact on the mean).
Your data contains some missing or undetermined values.
There is an open-ended distribution. (For example, if you have a data field
that measures the number of children and your options are 0, 1, 2, 3, 4, 5
or “6 or more,” the “6 or more” field is open ended and makes calculating
the mean impossible because we do not know exact values for this field.)
Arithmetic Mean
The mean (or average) is the most well-known and widely used measure of
central tendency. It can be applied to both discrete and continuous data, but it is most
commonly applied to continuous data. The sum of all the values in the data set divided
by the number of values in the data set equals the mean. So, if we have ‘n’ values in a
data set with values x1, x2, ..., xn, the sample mean, usually denoted by x̄ (pronounced
“x bar”), is:

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
This formula is usually written in a slightly different style, with the Greek capital
letter Σ, pronounced “sigma,” which means “sum of...”:

$$\bar{x} = \frac{\sum x}{n}$$
You may have noticed that the preceding formula refers to the sample mean. So, why
is it called the “sample mean”? This is because, in statistics, samples and populations
have very different meanings, and these differences are very important, even though the
mean is calculated in the same way in both cases. To indicate that we are calculating
the population mean rather than the sample mean, we use the Greek lower case letter
“mu,” which is represented as μ:

$$\mu = \frac{\sum x}{n}$$
The mean is essentially a model of the data set: a single value chosen to stand for all
the values. However, you will notice that the mean is not always one of the actual values
in your data set. Nevertheless, one of its most important characteristics is that it
minimises error in predicting any single value in your data set. That is, it is the value
that produces the least total squared error when compared to all the values in the data set.
The mean has the important property of including every value in your data set as part of
the calculation. Furthermore, the mean is the only measure of central tendency in which
the sum of the deviations from the mean is always zero.
The mean has one major drawback: it is especially vulnerable to the influence of
outliers. These are values that are out of the ordinary in comparison to the rest of the
data set because they are unusually small or large in numerical value. Consider the
following wages for factory workers:
Staff 1 2 3 4 5 6 7 8 9 10
Salary 1500 1800 1600 1400 1500 1500 1200 1700 9000 9500
The average salary for these ten employees is Rs. 3070. However, an examination
of the raw data suggests that this mean value may not be the best way to accurately
reflect a worker’s typical salary, as most workers earn between Rs. 1200 and Rs. 1800.
The two high salaries have skewed the mean. As a result, in this situation, we would
like to have a more accurate measure of central tendency. As we will see later, taking
the median is a better measure of central tendency in this case.
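The effect of the two high salaries can be seen by computing both averages. A short Python sketch using the standard `statistics` module and the ten salaries from the table:

```python
from statistics import mean, median

salaries = [1500, 1800, 1600, 1400, 1500, 1500, 1200, 1700, 9000, 9500]

average_salary = mean(salaries)     # pulled upward by the two high salaries
middle_salary = median(salaries)    # average of the 5th and 6th ordered values

print(average_salary, middle_salary)
```

The median stays inside the Rs. 1200 to Rs. 1800 band in which most workers fall, while the mean does not.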
When our data is skewed (i.e., the frequency distribution for our data is skewed), we
usually prefer the median over the mean (or mode). When the data follows a perfectly
normal distribution, the distribution most commonly assumed in statistics, the mean,
median, and mode are identical. Furthermore, they all represent the most
common value in the data set. However, as the data becomes skewed, the mean loses
its ability to provide the best central location for the data because the skewed data
pulls it away from the average value. The median, on the other hand, best retains this
Median
The median is the middle score of a set of data arranged in order of magnitude.
Outliers and skewed data have less of an impact on the median. Assume we have the
following data to calculate the median:
65 55 89 56 35 14 56 55 87 45 92
We must first rearrange the data in ascending order of magnitude (smallest first):
14 35 45 55 55 56 56 65 87 89 92
Our median score is the middle value, in this case 56 (the sixth score). It is the
middle score because there are five scores before it and five scores after it. When you
have an odd number of scores, this works fine, but what happens when you have an
even number of scores? What if you only had ten scores? Simply take the middle two
scores and average the result. So, consider the following example:
65 55 89 56 35 14 56 55 87 45
We reorder the data in the following order of magnitude (smallest first):
14 35 45 55 55 56 56 65 87 89
We now only need to average the fifth and sixth scores in our data set (55 and 56) to
get a median of 55.5.
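Both cases, an odd and an even number of scores, can be handled by one short routine. A minimal Python sketch (the function name `median_score` is illustrative), applied to the two data sets above:

```python
def median_score(scores):
    """Middle value of the ordered scores; mean of the two middle values if even."""
    ordered = sorted(scores)            # smallest first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]             # odd count: the single middle score
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average the pair

odd_case = median_score([65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92])
even_case = median_score([65, 55, 89, 56, 35, 14, 56, 55, 87, 45])
print(odd_case, even_case)
```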
Mode
In our data set, the mode is the most frequent score. On a bar chart or histogram,
the mode corresponds to the highest bar. As a result, the mode may
be regarded as the most popular option at times. A mode is illustrated below as an
example:
The mode is typically used for categorical data where we want to know which
category is the most common, as shown below:
We can see from the data above that the bus is the most common mode of
transportation in this data set. However, the mode is not necessarily unique, so we have
problems when two or more values share the highest frequency, as shown
below:
We are now at a loss as to which mode best describes the data’s central tendency.
This is especially problematic with continuous data because we are less likely to have
any one value that is more frequent than the others. Consider measuring the weights of
30 people (to the nearest 0.1 kg). How likely is it that we will come across two or more
people who are exactly the same weight (e.g., 67.4 kg)?
The answer is that it is very unlikely: many people may be close, but with such a
small sample (30 people) and such a wide range of possible weights, it is unlikely that
you will find two people who are exactly the same weight, that is, to the nearest 0.1 kg.
As a result, the mode is rarely used with continuous data. Another issue with the mode
is that it does not provide a good measure of central tendency when the most common
mark is far from the rest of the data in the data set, as shown in the diagram below:
The mode has a value of 2 in the diagram above. However, we can clearly see
that the mode is not representative of the data, which is primarily concentrated in the
20 to 30 value range. It would be misleading to use the mode to describe the central
tendency of this data set.
Skewed Distributions
When you have a normally distributed sample, you can use either the mean
or the median as your measure of central tendency. In fact, the mean, median, and mode
of any symmetrical, single-peaked distribution are all equal. However, in this case, the
mean is widely regarded as the best measure of central tendency because it is the only
measure that uses all of the values in the data set to calculate its value, and any change
in any of the scores will affect the mean’s value. This is not true for the median or mode.
In a skewed distribution, the mean is dragged in the direction of the skew. In these
cases, the median is generally regarded as the best representative of the data’s central
location. The greater the difference between the median and the mean, the greater the
emphasis should be placed on using the median rather than the mean. Income (salary) is a
classic example of a right-skewed distribution, with higher earners providing a
false representation of typical income when expressed as a mean rather than a median.
When the data is expected to follow a normal distribution but normality tests reveal
that it does not, it is customary to use the median rather than the mean. However, this is
more of a rule of thumb than a strict rule. Researchers may wish to report the mean of
a skewed distribution if the median and mean are not noticeably different (a subjective
assessment), and if it allows for easier comparisons to previous research.
Please see the summary table below to find out what the best measure of central
tendency is for each type of variable.
Measures of central tendency are averages of the first order. Measures of dispersion
are averages of the second order. A measure of dispersion gives an idea about the
extent of lack of uniformity in the sizes and qualities of the items in a series. It helps us
to know the degree of uniformity and consistency in the series. If the difference between
items is large, the dispersion or variation is large, and vice versa.
A measure of dispersion or variation in any data shows the extent to which the
numerical values tend to spread about an average. If the difference between items is
small, the average represents and describes the data adequately. For large differences
it is proper to supplement information by calculating a measure of dispersion in addition
to an average. A measure of dispersion is useful for the following purposes:
◌◌ To compare two or more sets of observations.
◌◌ To suggest methods to control variation in the data.
A study of variations helps us in knowing the extent of uniformity or consistency in
any data. Uniformity in production is an essential requirement in industry. Quality control
methods are based on the laws of dispersion.
particular subject; the absolute dispersion will provide the value in Marks. The only
difficulty is that if two or more series are expressed in different units, the series cannot
be compared on the basis of dispersion.
Definition: A precise measure of dispersion is one that gives the magnitude of the
variation in a series, i.e. it measures in numerical terms, the extent of the scatter of the
values around the average.
values. A good measure of dispersion should have properties similar to those described
for a good measure of central tendency.
The Standard Deviation
Graphical Method
In statistics, dispersion measures help to understand data variability, i.e. how
homogeneous or heterogeneous the data is. It shows how squeezed or dispersed the
variable is, in simple terms.
There are two main types of dispersion methods in statistics which are the absolute
Measure of Dispersion and the relative Measure of Dispersion
Their practical implications involve:
◌◌
represents the data.
It is usually used in conjunction with a measure of central tendency, such as
the mean or median, to provide an overall description of a set of data.
◌◌ The range is the difference between the highest and lowest scores in a data
set and is the simplest measure of spread.
◌◌ Quartiles tell about the spread of a data set by breaking the data set into
Range
The ‘Range’ of the data is the difference between the largest value of data and
smallest value of data.
of data, ‘Range’ may not give a true picture. In such case, relative measure of range,
called coefficient of range is used. This is given by,
Example 1:
Solution: L = 12, S = 5
Range = L – S
= 12 – 5
= 7
Coefficient of range = (L – S) / (L + S)
= (12 – 5) / (12 + 5)
= 7/17
= 0.4118
Example 2:
Compute the range and the co-efficient of range from the following distribution.
130 - 140 9
140 - 150 16
150 - 160 12
160-170 5
Solution:
In finding the range, the frequencies are never taken into account. Only the upper limit
of the highest class and the lower limit of the lowest class are taken into account.
Range = L – S
= 170 – 120 = 50
Coefficient of range = (L – S) / (L + S)
= (170 – 120) / (170 + 120)
= 50/290
= 0.1724
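Both range examples use the same two formulas, which can be sketched in Python as follows (the function names are illustrative; L and S are the largest and smallest values, or the outer class limits for a grouped series):

```python
def value_range(largest, smallest):
    """Range = L - S."""
    return largest - smallest

def coefficient_of_range(largest, smallest):
    """Coefficient of range = (L - S) / (L + S)."""
    return (largest - smallest) / (largest + smallest)

print(value_range(12, 5), round(coefficient_of_range(12, 5), 4))
print(value_range(170, 120), round(coefficient_of_range(170, 120), 4))
```

The coefficient is a pure ratio, which is why it can be used to compare series expressed in different units.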
Inter-quartile range and deviations are described in the following sub sections.
Inter-quartile Range
Inter-quartile range is the difference between the upper quartile (third quartile) and the
lower quartile (first quartile).
Quartile deviation
quartiles from Median.
QD = (Q3 – Q1) / 2
Coefficient of QD = (Q3 – Q1) / (Q3 + Q1)
Example 1:
Weekly wages of labourers is given below. Calculate Q.D. and coefficient of Q.D.
No. of Weeks: 5 8 21 12 6 52
Solution:
Weekly wages (X)    No. of weeks (f)    c.f.
100                 5                   5
200                 8                   13
400                 21                  34
500                 12                  46
600                 6                   52
N = 52
Q1 = size of the (N + 1)/4 th item
= (52 + 1)/4
= 13.25th item
= 200 + 0.25 × (400 – 200)
= 200 + 50
= 250
Q3 = size of the 3(N + 1)/4 th item
= 3 × 13.25
= 39.75th item
= 500 + 0.75 × (500 – 500)
= 500 + 0.75 × 0
= 500
Q.D. = (Q3 – Q1) / 2
= (500 – 250) / 2
= 250/2
= 125
Coefficient of Q.D. = (Q3 – Q1) / (Q3 + Q1)
= (500 – 250) / (500 + 250)
= 250/750
= 0.333
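The quartile positions in this example can be reproduced with a short Python sketch. It is a minimal illustration under the convention used in the solution, where the q-th quartile is the q(N + 1)/4 th item with linear interpolation between neighbouring items; the function name `discrete_quartile` is hypothetical.

```python
def discrete_quartile(values, freqs, q):
    """q-th quartile of a discrete series: the q*(N+1)/4 th ordered item."""
    # Expand the frequency table into the full ordered list of items.
    items = [v for v, f in sorted(zip(values, freqs)) for _ in range(f)]
    position = q * (len(items) + 1) / 4
    whole = int(position)                  # item number just below the position
    fraction = position - whole
    if whole < 1:
        return items[0]
    if whole >= len(items):
        return items[-1]
    lower, upper = items[whole - 1], items[whole]
    return lower + fraction * (upper - lower)

wages = [100, 200, 400, 500, 600]
weeks = [5, 8, 21, 12, 6]
q1 = discrete_quartile(wages, weeks, 1)
q3 = discrete_quartile(wages, weeks, 3)
qd = (q3 - q1) / 2
coefficient_qd = (q3 - q1) / (q3 + q1)
print(q1, q3, qd, round(coefficient_qd, 3))
```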
Example 2:
Determine the interquartile range and percentile range of the following distribution:
Class interval    Frequency
11 - 13           8
13 - 15           10
15 - 17           15
17 - 19           20
19 - 21           12
21 - 23           11
23 - 25           4
Solution:
Class interval    Frequency (f)    Cumulative frequency (c.f.)
11 - 13           8                8
13 - 15           10               18
15 - 17           15               33
17 - 19           20               53
19 - 21           12               65
21 - 23           11               76
23 - 25           4                80
Calculation of Q1
Since N/4 = 80/4 = 20, the first quartile class is 15–17.
∴ lQ1 = 15, fQ1 = 15, h = 2 and C = 18

$$Q_1 = 15 + \frac{20 - 18}{15} \times 2 = 15.27$$

Calculation of Q3
Since 3N/4 = (3 × 80)/4 = 60, the third quartile class is 19–21.
∴ lQ3 = 19, fQ3 = 12, h = 2 and C = 53

$$Q_3 = 19 + \frac{60 - 53}{12} \times 2 = 20.17$$

Thus, the interquartile range = 20.17 – 15.27 = 4.90
Calculation of P10
Since 10N/100 = (10 × 80)/100 = 8, P10 lies in the class interval 11–13.
∴ lP10 = 11, fP10 = 8, h = 2 and C = 0

$$P_{10} = 11 + \frac{8 - 0}{8} \times 2 = 13$$

Calculation of P90
Since 90N/100 = (90 × 80)/100 = 72, P90 lies in the class interval 21–23.
∴ lP90 = 21, fP90 = 11, h = 2 and C = 65

$$P_{90} = 21 + \frac{72 - 65}{11} \times 2 = 22.27$$

Thus, the percentile range = P90 – P10 = 22.27 – 13 = 9.27
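Quartiles and percentiles of a continuous series all follow the same interpolation, l + ((kN/parts – C) / f) × h. A minimal Python sketch for the distribution in this example (the function name `grouped_quantile` and its parameters are illustrative):

```python
def grouped_quantile(lower_limits, width, freqs, k, parts):
    """k-th of `parts` equal divisions (parts=4 quartiles, parts=100 percentiles)."""
    n = sum(freqs)
    target = k * n / parts              # e.g. N/4, 3N/4, 10N/100
    cumulative = 0                      # C: cumulative frequency before the class
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("empty distribution")

lowers = [11, 13, 15, 17, 19, 21, 23]
freqs = [8, 10, 15, 20, 12, 11, 4]

q1 = grouped_quantile(lowers, 2, freqs, 1, 4)
q3 = grouped_quantile(lowers, 2, freqs, 3, 4)
p10 = grouped_quantile(lowers, 2, freqs, 10, 100)
p90 = grouped_quantile(lowers, 2, freqs, 90, 100)
print(round(q1, 2), round(q3, 2), round(p10, 2), round(p90, 2))
```

The interquartile range is then `q3 - q1` and the percentile range is `p90 - p10`.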
Mean Deviation
Mean deviation is the arithmetic mean of the absolute deviations of the values
about their arithmetic mean or median or mode. Mean Deviation (MD) is an average
value of absolute deviation of observations from the data mean (or the median or the
Where,
The average used for calculating deviation can be the mean, the median or the mode.
However, usually the mean is used. There is also an advantage of taking deviations
from the median, because ‘Mean Deviation’ from median is lowest as compared to any
other ‘Mean Deviations’. Since absolute values of deviations ignoring sign are taken
for calculating Mean Deviation, the mean deviation is not amenable to further algebraic
treatment.
coefficient of Mean Deviation’. It is defined as:
also be expressed in percentage by multiplying it with 100.
Formulae:
Mean deviation about mean = Σ|x – x̄| / N
Coefficient of mean deviation (about mean) = Mean deviation about mean / Mean
Mean deviation about median = Σf|x – M| / N
Coefficient of mean deviation (about median) = Mean deviation about median / Median
Mean deviation about mode = Σf|x – Z| / N
Coefficient of mean deviation (about mode) = Mean deviation about mode / Mode
Example:
12 7 9 7 7 4 10 9 15 20
Solution:
X̄ = (12 + 7 + 9 + 7 + 7 + 4 + 10 + 9 + 15 + 20) / 10
= 100/10
= 10
MD = Σ|x – X̄| / N
= (2 + 3 + 1 + 3 + 3 + 6 + 0 + 1 + 5 + 10) / 10
= 34/10
= 3.4
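The same calculation can be written as a short Python sketch. The function name `mean_deviation` is illustrative, and deviations are taken about the arithmetic mean:

```python
def mean_deviation(values):
    """Mean of the absolute deviations from the arithmetic mean."""
    average = sum(values) / len(values)
    return sum(abs(v - average) for v in values) / len(values)

data = [12, 7, 9, 7, 7, 4, 10, 9, 15, 20]
md = mean_deviation(data)
coefficient_md = md / (sum(data) / len(data))   # MD divided by the mean
print(md, coefficient_md)
```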
Example 1:
MD in Individual series
Value (x) 125 128 132 135 140 148 155 157 159 161
Solution:
Step 1: Compute the arithmetic mean (AM). Step 2: Take the deviations from the mean,
ignoring negative signs.
Sl. No.    Value (x)    Deviation from mean (x – X̄) = Dx
A          125          125 – 144 = –19
B          128          128 – 144 = –16
C          132          132 – 144 = –12
D          135          135 – 144 = –9
E          140          140 – 144 = –4
F          148          148 – 144 = +4
G          155          155 – 144 = +11
H          157          157 – 144 = +13
I          159          159 – 144 = +15
J          161          161 – 144 = +17
n = 10     Σx = 1,440   Σ|Dx| = 120
X̄ = Σx / n = 1440/10 = 144
MD = Σ|Dx| / n = 120/10 = 12
Coefficient of MD = MD / X̄ = 12/144 = 0.083
Example 2:
MD in Discrete Series
X 35 40 45 50 55 60 65 70 75 80 85 90 95
f 3 8 12 9 4 7 15 5 10 7 5 3 2
Solution:
Example 3:
MD in Continuous Series
Calculate mean deviation and its co-efficient for the following data:
X f
10-20 5
20-30 4
30-40 7
40-50 12
50-60 10
60-70 8
70-80 4
Solution:
X        f     Mid-point (x)    fx       |x – x̄| = dx    f·dx
10-20    5     15               75       31.6             158.0
20-30    4     25               100      21.6             86.4
30-40    7     35               245      11.6             81.2
40-50    12    45               540      1.6              19.2
50-60    10    55               550      8.4              84.0
60-70    8     65               520      18.4             147.2
70-80    4     75               300      28.4             113.6
Total    50                     2,330                     689.6
x̄ = Σfx / n = 2330/50 = 46.6
MD = Σf·dx / n = 689.6/50 = 13.79
Coefficient of MD = MD / x̄ = 13.79/46.6 = 0.296
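For a continuous series, each class is represented by its mid-point and weighted by its frequency. A minimal Python sketch for the data of Example 3 (the function name `grouped_mean_deviation` is hypothetical; deviations are taken about the mean):

```python
def grouped_mean_deviation(class_limits, freqs):
    """Return (mean, mean deviation, coefficient of MD) for a grouped series."""
    mids = [(low + high) / 2 for low, high in class_limits]
    n = sum(freqs)
    mean = sum(f * m for f, m in zip(freqs, mids)) / n
    md = sum(f * abs(m - mean) for f, m in zip(freqs, mids)) / n
    return mean, md, md / mean

classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]
freqs = [5, 4, 7, 12, 10, 8, 4]
mean, md, coefficient = grouped_mean_deviation(classes, freqs)
print(round(mean, 1), round(md, 2), round(coefficient, 3))
```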
Measures of Skewness
In addition to measures of central tendency and measures of variation, there are
two attributes of frequency distribution of a data set that may be of interest to managers
for effective decision-making. These are the Skewness and Kurtosis.
When the distribution stretches more to the right than it does to the left, the
distribution is said to be ‘right skewed’ or ‘positively skewed’. Similarly, a left-skewed
distribution is the one that stretches asymmetrically to the left. Thus, the skewness is
$$Sk_k = \frac{(P_{90} + P_{10}) - 2 \times Md}{P_{90} - P_{10}}$$
2. Bowley’s Coefficient of Skewness (SKB) (quartile coefficient of skewness). It is
defined as:

$$SK_B = \frac{(Q_3 - Q_2) - (Q_2 - Q_1)}{(Q_3 - Q_2) + (Q_2 - Q_1)} = \frac{(Q_3 + Q_1) - 2 \times Md}{Q_3 - Q_1}$$

Where, Q is quartile.
Skewness =
N
Where, P is percentile.
Skewness is also defined in term of the moment about mean. One such
measure is defined as:
(x–i )
Absolute Kurtosis =
N
percentage curve comparing the population and factor under study. For example, we
could plot a graph of percentage of population and percentage of their wealth. Lorenz
curve is very useful for comparing two populations particularly when their means and
SD are same
$$SK_1 = \frac{\bar{x} - Mo}{s}$$

Where x̄ = the mean, Mo = the mode and s = the standard deviation for the sample.
3.1.17 Bowley’s Coefficient of Skewness
Bowley skewness is a way to figure out whether a distribution is positively skewed or
negatively skewed. It is used as an alternative measure to find out
more about the asymmetry of a distribution. It is very useful if there are extreme data
values, i.e., outliers, or if the distribution is open-ended.
Skewness = 0 means that the curve is symmetrical.
Skewness > 0 means the curve is positively skewed.
Skewness < 0 means the curve is negatively skewed.
In a symmetric distribution, like the normal distribution, the first (Q1) and third
(Q3) quartiles are at equal distances from the median (Q2). In other words, (Q3-Q2) and
(Q2-Q1) will be equal. If you have a skewed distribution then there will be a difference
between those two values.
Skewness.
Example:
1 60 120
2 50 170
3 20 190
4 25 215
5 10 225
6 or more 5 230
Solution –
Step 1: Find the quartiles for the data set by looking for the “nth” observation
using the following formulas, with N = 230:
Q1 = (N + 1)/4 th observation = 57.75th observation
Q2 = (N + 1)/2 th observation = 115.5th observation
Q3 = 3(N + 1)/4 th observation = 173.25th observation
Step 2: Look in the table to find the nth observations calculated in Step 1:
Q1 = 57.75th observation = 0
Q2 = 115.5th observation = 1
Q3 = 173.25th observation = 3
Skq = (Q3 + Q1 – 2Q2) / (Q3 – Q1)
Skq = (3 + 0 – 2 × 1) / (3 – 0) = 1/3
Skq = +1/3, so the distribution is positively skewed.
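The two steps can be sketched in Python by locating each quartile position in the cumulative-frequency column. This is an illustrative sketch: the first row of the table (value 0 with cumulative frequency 60) is not shown above and is assumed here from the cumulative total of 120 at value 1, and the value 6 stands in for the open-ended “6 or more” class.

```python
from bisect import bisect_left

def bowley_skewness(q1, q2, q3):
    """Sk = (Q3 + Q1 - 2*Q2) / (Q3 - Q1)."""
    return (q3 + q1 - 2 * q2) / (q3 - q1)

values = [0, 1, 2, 3, 4, 5, 6]                     # 6 represents "6 or more"
cumulative = [60, 120, 170, 190, 215, 225, 230]    # first entry is an assumption
n = cumulative[-1]

def observation(position):
    """Value whose cumulative frequency first reaches the given position."""
    return values[bisect_left(cumulative, position)]

q1 = observation((n + 1) / 4)        # 57.75th observation
q2 = observation((n + 1) / 2)        # 115.5th observation
q3 = observation(3 * (n + 1) / 4)    # 173.25th observation
skewness = bowley_skewness(q1, q2, q3)
print(q1, q2, q3, round(skewness, 3))
```

Because only quartiles are used, the open-ended last class does not prevent the calculation.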
Measure of Kurtosis
Kurtosis is a measure of the peakedness of a distribution. The larger the kurtosis, the
more peaked the distribution. The kurtosis is calculated either as an absolute or
a relative value. Absolute kurtosis is always a positive number. Absolute kurtosis of a
normal distribution (symmetric bell-shaped distribution) is taken as 3. Absolute and
relative kurtosis can be calculated as follows:

$$\text{Absolute Kurtosis} = \frac{1}{N} \sum \left( \frac{x_i - \bar{x}}{\sigma} \right)^4$$

Relative kurtosis = Absolute kurtosis – 3
Example:
The first four central moments of a distribution are 0, 2.5, 0.7 and 18.75. Test the
skewness and kurtosis of the distribution.
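Testing Skewness:

$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(0.7)^2}{(2.5)^3} = \frac{0.49}{15.625} = 0.031$$

Since β1 is close to zero, the distribution is nearly symmetrical.
Testing Kurtosis:
When a distribution is more peaked than the normal, β2 is more than 3 and when it
is less peaked than the normal, β2 is less than 3.

$$\beta_2 = \frac{\mu_4}{\mu_2^2}$$

μ4 = 18.75, μ2 = 2.5

$$\beta_2 = \frac{18.75}{(2.5)^2} = \frac{18.75}{6.25} = 3$$

Since β2 = 3, the distribution has the same peakedness as the normal distribution.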
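The two moment ratios can be computed directly from the central moments. A minimal Python sketch using the moments given in the example (the function name `beta_coefficients` is illustrative):

```python
def beta_coefficients(mu2, mu3, mu4):
    """Return (beta1, beta2): beta1 = mu3^2 / mu2^3, beta2 = mu4 / mu2^2."""
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    return beta1, beta2

# Second, third and fourth central moments (the first central moment is 0).
beta1, beta2 = beta_coefficients(mu2=2.5, mu3=0.7, mu4=18.75)
relative_kurtosis = beta2 - 3
print(round(beta1, 3), beta2, relative_kurtosis)
```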
Key Takeaways:
Measure of dispersion: It gives an idea about the extent of lack of uniformity in the
sizes and qualities of the items in a series. It helps us to know the degree of uniformity
and consistency in the series. If the difference between items is large, the dispersion or
variation is large, and vice versa.
Range: The ‘Range’ of the data is the difference between the largest value of data
and smallest value of data.
Inter-quartile range: It is a difference between upper quartile (third quartile) and
lower quartile (first quartile). Quartile Deviation is the average of the difference between
upper quartile and lower quartile.
Variance: It is the average squared deviation of the data from their mean. For
sample data, we take the average by dividing with (n-1) where n is a sample size. This
is to cater for degree of freedom. For population data, we average by dividing with the
population size N.
effectively.
The Standard Deviation (SD) of a set: It is the positive square root of the
variance of the set. This is also referred to as the Root Mean Square (RMS) value of the
deviations of the data points. SD of a sample is the square root of the sample variance.
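The difference between dividing by N and by (n – 1) can be seen with Python's standard `statistics` module. The data set below is hypothetical and chosen only so that the population figures come out as round numbers:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical observations, mean = 5

population_variance = statistics.pvariance(data)   # divides by N
sample_variance = statistics.variance(data)        # divides by n - 1
population_sd = statistics.pstdev(data)            # square root of pvariance
sample_sd = statistics.stdev(data)                 # square root of variance

print(population_variance, round(sample_variance, 3), population_sd, round(sample_sd, 3))
```

The sample figures are slightly larger than the population figures because the sum of squared deviations is divided by 7 instead of 8.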
multiplied by its weight and then the product are divided by the number of weights.
5. The word “mode” is derived from the French word “la mode” which means fashion.
True / False
6. A central ___________ measure is a single value that attempts to describe a set of
data by identifying the central position within that set of data.
7. When you have a ____________ distributed sample, you can use either the mean
or the median to calculate central tendency.
8. Different series may possess different dispersions of items around the average. True
/ False
9. Absolute measures of __________ are expressed in the same units in which the
original data are expressed.
10. __________ or Coefficient of dispersion is the ratio or the percentage of a measure
of absolute dispersion to an appropriate average.
11. Negative kurtosis indicates a flatter distribution than the normal distribution, and
called as ____________.
12. A positive kurtosis means more peaked curve, called _____________.
13. A peak of normal distribution is called Leptokurtic. True / False
Multiple Choice
1. The mean height of 25 male workers in a factory is 61 cm and the mean height of
35 female workers in the same factory is 58cm. The combined mean height of 60
workers in the factory is?
a. 59.25
b. 59.5
c. 59.75
d. 58.75
2. Mean of 100 items is 49. It was discovered that three items which should have been
60, 70, and 80 were wrongly read as 40, 20, and 50, respectively. The correct mean
is
a. 48
b. 89
c. 50
d. 80
3. If the mode of the data is 18 and the mean is 24, what is the median?
a. 18
b. 24
c. 22
d. 21
4. The mean of 30 given numbers, when it is given that the mean of 10 of them is 12
and the mean of the remaining 20 is 9, is equal to
a. 11
b. 10
c. 9
d. 5
Summary
●● Central tendency has three main measures: mode, median and mean. Each of
those measurements represents a specific indication of the distribution’s typical or
central value.
●● Arithmetic Mean is defined as the value obtained by dividing the total values of all
items in the series by their number. In other words, it is defined as the sum of the
given observations divided by the number of observations, i.e., add values of all
items together and divide this sum by the number of observations.
●● Arithmetic mean for ungrouped data of individual series can be calculated using
either the direct method or the short-cut method (indirect method).
●● Sometimes, some observations get relatively more importance than other
observations; the weight for such observations must be given on the basis of their
relative importance. In weighted arithmetic mean, for finding an average the value
of each item is multiplied by its weight and then the sum of the products is divided
by the sum of the weights.
●● Median is defined as the value of the item dividing the series into two equal
halves, where one half contains all values less than (or equal to) it and the other
half contains all values greater than (or equal to) it. It is also defined as the ‘central
value’ of the variable.
●● Mode is defined as the value of the variable occurring most frequently in a
distribution.
●● A central tendency measure is a single value that attempts to describe a set
of data by identifying the central position within that set of data. As a result,
measures of central tendency are also known as measures of central location.
They’re also known as summary statistics. The mean (also known as the average)
is probably the most familiar measure of central tendency, but there are others,
such as the median and the mode. The mean, median, and mode are all valid
dispersion.
●● Relative or Coefficient of dispersion is the ratio or the percentage of a measure of
absolute dispersion to an appropriate average.
●● The range of the data is the difference between the largest value of data and the
smallest value of data.
●● Mean deviation is the arithmetic mean of the absolute deviations of the values
about their arithmetic mean or median or mode. There is also an advantage of
taking deviations from the median, because ‘Mean Deviation’ from median is
lowest as compared to any other ‘Mean Deviations’.
●● In addition to measures of central tendency and measures of variation, there are
two attributes of frequency distribution of a data set that may be of interest to
managers for effective decision-making. These are the Skewness and Kurtosis.
●● Pearson’s coefficient of skewness is calculated by multiplying the difference
between the mean and the median by 3. The result is then divided by the
standard deviation.
●● Bowley skewness is a method to figure out whether there is a positively skewed or
negatively skewed distribution.
●● Kurtosis is a measure of the peakedness of a distribution. The larger the kurtosis,
the more peaked the distribution. The kurtosis is calculated either as an
absolute or a relative value. Absolute kurtosis is always a positive number.
Activity
1. Using Karl Pearson’s Coefficient of Skewness formula, solve for the median and
mode.
a. Skewness of Distribution = 0.32
b. Standard Deviation = 6.5
c. Mean = 29.6
2. What is the coefficient of skewness if beta one is 9 and beta two is 11?
3. The median of a skewed distribution is 8, 3rd quartile is 12, 1st quartile is 8, and inter-
●● Arithmetic Mean: is defined as the value obtained by dividing the total values of
all items in the series by their number.
●● Mean: an average, it is a single figure that sums up the characteristics of a whole
group of figures.
●● Median: is defined as the value of the item dividing the series into two equal
halves, where one half contains all values less than (or equal to) it and the other
half contains all values greater than (or equal to) it.
●● Mode: is defined as the value of the variable occurring most frequently in a
distribution.
●● Range: is the difference between the largest value of data and the smallest value
of data.
●● Skewness: is a measure of the extent of symmetry or asymmetry of the
distribution.
Multiple Choice
1. A
2. C
3. C
4. B
Structure:
5.1 Correlation
1.1.1 Correlation Coefficient Introduction
1.1.2 Correlation Coefficient Application
1.1.3 Introduction Rank Correlation
1.1.4 Comparison Pearson Spearman Correlation
1.1.5 Application Rank Correlation
5.2 Regression
2.1.1 Introduction Linear Regression Model
2.1.2 Population Sample Regression
2.1.3 Method Least Square Understanding
2.1.4 Maths Behind Least Square
Objectives:
●● Correlation-Coefficient Introduction
●● Correlation-Coefficient Application
●● Introduction Rank Correlation
●● Application Rank Correlation
Introduction
Correlation is the degree of linear association between two random variables. We do
not differentiate these two variables as dependent and independent
variables. It may be the case that one is the cause and the other is an effect, i.e.,
independent and dependent variables respectively. On the other hand, both may be
dependent on a third variable. In some cases there may not be any cause-effect
relationship at all. Therefore, if we do not consider and study the underlying
economic or physical relationship, correlation may sometimes give absurd results.
Frequently, data is available in the form of some kind of ranking for different
variables. Other times, data can have instances where it is difficult to measure the
ve
cause-effect variables. An example of this instance is when selecting a candidate
for scenario. There are a number of factors on which the experts need to base their
assessment on, therefore, it is not possible to measure many of these parameters in
physical units e.g., sincerity, loyalty, integrity, tactfulness, initiative, etc. The purpose of
ni
j) represents the frequency or count that falls in both groups of a particular range of
values of Xi and Yj. In this case correlation coefficient is given by
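$$
r = \frac{\sum f\, m_x m_y - \dfrac{1}{n}\sum (f\, m_x)\sum (f\, m_y)}
{\sqrt{\sum (f\, m_x^2) - \dfrac{\left(\sum f\, m_x\right)^2}{n}}\;\sqrt{\sum (f\, m_y^2) - \dfrac{\left(\sum f\, m_y\right)^2}{n}}}
$$
where $m_x$ and $m_y$ are the class marks of X and Y, f is the cell frequency and n is the total frequency.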
Y \ X        0-500   500-1000   1000-1500   1500-2000   2000-2500   Total
0-200          12        6          -           -           -         18
200-400         2       18          4           2           1         27
400-600         -        4          7           3           -         14
600-800         -        1          -           2           1          4
800-1000        -        -          1           2           3          6
Total          14       29         12           9           5         69
Solution:
Let the assumed mean for X be a = 1250 and the scaling factor g = 500. Therefore, we can calculate f × dx and f × dx² from the marginal distribution of X as,
Class X      Class Mark mx   dx = (mx - a)/g   Frequency f   f × dx   f × dx²
0-500             250              -2               14         -28       56
500-1000          750              -1               29         -29       29
1000-1500        1250               0               12           0        0
1500-2000        1750               1                9           9        9
2000-2500        2250               2                5          10       20
Total                                               69         -38      114
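The grouped-data formula can be checked numerically. The following is a minimal sketch, not the textbook's own working: it assumes the X classes are those of the marginal table above (0-500 to 2000-2500), takes class marks as class midpoints, and reads each "-" in the table as a frequency of 0. The function name `grouped_correlation` is illustrative only.

```python
from math import sqrt

# Class marks (midpoints), assumed from the class limits in the table.
x_marks = [250, 750, 1250, 1750, 2250]   # X classes 0-500 ... 2000-2500
y_marks = [100, 300, 500, 700, 900]      # Y classes 0-200 ... 800-1000

# freq[i][j] = frequency for Y class i and X class j ("-" taken as 0).
freq = [
    [12, 6, 0, 0, 0],
    [2, 18, 4, 2, 1],
    [0, 4, 7, 3, 0],
    [0, 1, 0, 2, 1],
    [0, 0, 1, 2, 3],
]


def grouped_correlation(x_marks, y_marks, freq):
    """Correlation coefficient for a bivariate frequency table."""
    n = sum(sum(row) for row in freq)
    s_x = s_y = s_xx = s_yy = s_xy = 0.0
    for i, my in enumerate(y_marks):
        for j, mx in enumerate(x_marks):
            f = freq[i][j]
            s_x += f * mx
            s_y += f * my
            s_xx += f * mx * mx
            s_yy += f * my * my
            s_xy += f * mx * my
    numerator = s_xy - s_x * s_y / n
    denominator = sqrt(s_xx - s_x ** 2 / n) * sqrt(s_yy - s_y ** 2 / n)
    return numerator / denominator


r = grouped_correlation(x_marks, y_marks, freq)
print(round(r, 3))
```

Because r is unaffected by a change of origin and scale, the same value is obtained whether the class marks or the step deviations dx and dy are used.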
$$
r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y}
$$
$$
r = \frac{\dfrac{1}{n}\sum (x - \bar{x})(y - \bar{y})}{\sigma_x \sigma_y}
$$
variables.
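The formula can be modified as:
$$
r = \frac{\dfrac{1}{n}\sum (x - \bar{x})(y - \bar{y})}{\sigma_x \sigma_y}
= \frac{\dfrac{1}{n}\sum (xy - x\bar{y} - \bar{x}y + \bar{x}\bar{y})}{\sigma_x \sigma_y} \tag{1}
$$
$$
= \frac{\dfrac{\sum xy}{n} - \dfrac{\sum x}{n}\cdot\dfrac{\sum y}{n}}
{\sqrt{\dfrac{\sum x^2}{n} - \left(\dfrac{\sum x}{n}\right)^2}\;\sqrt{\dfrac{\sum y^2}{n} - \left(\dfrac{\sum y}{n}\right)^2}} \tag{2}
$$
$$
= \frac{E[XY] - E[X]\,E[Y]}{\sqrt{E[X^2] - (E[X])^2}\;\sqrt{E[Y^2] - (E[Y])^2}} \tag{3}
$$
Equations (2) and (3) are alternate forms of equation (1). These have the advantage that we do not have to subtract each value from the mean.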
Solution:
We shall take U to be the deviation of X values from the assumed mean of 30 divided by 5. Similarly, V represents the deviation of Y values from the assumed mean of 400 divided by 10.
S. No.    X     Y     U     V    UV    U²    V²
4        40   500     2    10    20     4   100
5        30   450     0     5     0     0    25
6        20   400    -2     0     0     4     0
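$$
r = \frac{\sum_{i=1}^{n} u_i v_i - \dfrac{1}{n}\sum_{i=1}^{n} u_i \sum_{i=1}^{n} v_i}
{\sqrt{\sum_{i=1}^{n} u_i^2 - \dfrac{1}{n}\left(\sum_{i=1}^{n} u_i\right)^2}\;\sqrt{\sum_{i=1}^{n} v_i^2 - \dfrac{1}{n}\left(\sum_{i=1}^{n} v_i\right)^2}}
$$
$$
= \frac{561 - \dfrac{(-2)(26)}{10}}{\sqrt{110 - \dfrac{4}{10}}\;\sqrt{3136 - \dfrac{676}{10}}}
= \frac{561 + 5.2}{\sqrt{109.6}\;\sqrt{3068.4}} = 0.976
$$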
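The step-deviation method works because r does not change when each variable is shifted by an assumed mean and divided by a positive scaling factor. A minimal sketch follows; the data are hypothetical (not the data of the example above), and the assumed means 30 and 400 with divisors 5 and 10 are reused purely for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Correlation coefficient from raw sums, as in the computational form."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = sqrt(sum_xx - sum_x ** 2 / n) * sqrt(sum_yy - sum_y ** 2 / n)
    return numerator / denominator


# Hypothetical observations, chosen only to illustrate the invariance.
xs = [10, 20, 30, 40, 50]
ys = [350, 400, 450, 430, 520]

# Step deviations: U = (X - 30) / 5 and V = (Y - 400) / 10.
us = [(x - 30) / 5 for x in xs]
vs = [(y - 400) / 10 for y in ys]

r_raw = pearson_r(xs, ys)
r_step = pearson_r(us, vs)
print(round(r_raw, 4), round(r_step, 4))  # the two values agree
```

Working with U and V keeps the sums small, which is the practical reason for the method in hand calculation.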
Interpretation of r
●● The correlation coefficient, r, ranges from −1 to 1. A value of 1 implies that a linear equation describes the relationship between X and Y perfectly, with all data points lying on a line for which Y increases as X increases. A value of −1 implies that all data points lie on a line for which Y decreases as X increases. A value of 0 implies that there is no linear correlation between the variables.
●● More generally, note that $(X_i - \bar{X})(Y_i - \bar{Y})$ is positive if and only if $X_i$ and $Y_i$ lie on the same side of their respective means. Thus the correlation coefficient is positive if $X_i$ and $Y_i$ tend to be simultaneously greater than, or simultaneously less than, their respective means.
which the experts base their assessment. It is not possible to measure many of these parameters in physical units, e.g., sincerity, loyalty, integrity, tactfulness, initiative, etc. The same is the case in dance contests. However, in these cases the experts may rank the candidates. It is then necessary to find out whether the two sets of ranks are in agreement with each other. This is measured by the Rank Correlation Coefficient. The purpose of computing a correlation coefficient in such situations is to determine the extent to which the two sets of ranking are in agreement. The coefficient that is determined from these ranks is known as Spearman's rank coefficient, $r_s$.
$$
r_s = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}
$$
Where, n = Number of observation pairs
$d_i = X_i - Y_i$ (the difference between the two ranks of the i-th pair)
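Solution:
S. No.   Rank X (x_i)   Rank Y (y_i)   d_i = y_i - x_i   d_i²
1             1              3               +2            4
2             2              1               -1            1
3             3              4               +1            1
4             4              2               -2            4
5             5              6               +1            1
6             6              9               +3            9
7             7              8               +1            1
8             8             10               +2            4
9             9              5               -4           16
10           10              7               -3            9
Total                                                     50
Now, n = 10 and $\sum_{i=1}^{n} d_i^2 = 50$, so
$$
r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)} = 1 - \frac{6 \times 50}{10(100 - 1)} = 1 - \frac{300}{990} = 0.697
$$
We can say that there is a high degree of correlation between the two rankings.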
Example: Find the rank correlation coefficient for the following data.
X    75    88    95    70    60    80    81    50
Y   120   134   150   115   110   140   142   100
X     Y     R1   R2   d = R1 - R2   d²
75    120    5    5        0         0
88    134    2    4       -2         4
95    150    1    1        0         0
70    115    6    6        0         0
60    110    7    7        0         0
80    140    4    3        1         1
81    142    3    2        1         1
50    100    8    8        0         0
Total                                6
$$
r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \times 6}{8(64 - 1)} = 1 - \frac{36}{504} = +0.93
$$
In this method the biggest item gets the first rank, the next biggest the second rank, and so on.
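The ranking and the formula can be sketched in a few lines of code. This is an illustrative sketch only, using the eight (X, Y) pairs of the example above; the helper names `ranks_descending` and `spearman_rs` are not from the text, and the ranking helper assumes there are no tied values.

```python
def ranks_descending(values):
    """Rank 1 for the biggest item, 2 for the next biggest, and so on.

    Assumes all values are distinct (no ties).
    """
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def spearman_rs(xs, ys):
    """Spearman's rank coefficient: 1 - 6 * sum(d^2) / (n (n^2 - 1))."""
    n = len(xs)
    r1 = ranks_descending(xs)
    r2 = ranks_descending(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


x = [75, 88, 95, 70, 60, 80, 81, 50]
y = [120, 134, 150, 115, 110, 140, 142, 100]

rs = spearman_rs(x, y)
print(round(rs, 3))  # 0.929, i.e. about +0.93
```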
In case of a tie, i.e., when two or more individuals have the same rank, each individual is assigned a rank equal to the mean of the ranks that would have been assigned to them in the event of there being slight differences in their values. To understand this, let us consider the series 20, 21, 21, 24, 25, 25, 25, 26, 27, 28. Here the value 21 is repeated two times and the value 25 is repeated three times. When we rank these values, rank 1 is given to 20. The values 21 and 21 could have been assigned ranks 2 and 3 if these were slightly different from each other. Thus, each value will be assigned a rank equal to the mean of 2 and 3, i.e., 2.5. Further, the value 24 will be assigned a rank equal to 4 and each of the values 25 will be assigned a rank equal to 6, the mean of 5, 6 and 7, and so on.
Since Spearman's formula is based upon the assumption of different ranks to different individuals, its correction becomes necessary in case of tied ranks. It should be noted that the means of the ranks will remain unaffected.
5.1.5 Application Rank Correlation
When two or more items have the same rank, a correction has to be applied to $\sum d_i^2$.
For example, if the ranks of X are 1, 2, 3, 3, 5, ..., showing that there are two items with the same 3rd rank and the fourth rank is skipped, then instead of writing 3, we write 3½ for both. Thus the sum of these ranks, which is 7 (3 + 4 = 3½ + 3½ = 7), remains the same, keeping the mean of ranks unaffected. But in such cases the standard deviation is affected. Therefore, a correction is required for the Rank Correlation Coefficient. For this, $\sum d_i^2$ is increased by $\dfrac{m^3 - m}{12}$ for each tie, where m is the number of items in each tie.
We must remember that if there is more than one group of items with a common rank, this correction factor is to be added once for each group.
Example: Twelve salesmen are ranked for efficiency and length of service as below:
Salesman                A   B   C   D   E   F   G   H   I   J    K    L
Efficiency (X)          1   2   3   4   4   4   7   8   9   10   11   12
Length of service (Y)   2   1   5   3   9   7   7   6   4   11   10   11
Solution:
Salesman   Rank X (x_i)     Rank Y (y_i)       d_i = x_i - y_i   d_i²
A          1                2                       -1            1
B          2                1                        1            1
C          3                5                       -2            4
D          (4+5+6)/3 = 5    3                        2            4
E          (4+5+6)/3 = 5    9                       -4           16
F          (4+5+6)/3 = 5    (7+8)/2 = 7.5           -2.5          6.25
G          7                (7+8)/2 = 7.5           -0.5          0.25
H          8                6                        2            4
I          9                4                        5           25
J          10               (11+12)/2 = 11.5        -1.5          2.25
K          11               10                       1            1
L          12               (11+12)/2 = 11.5         0.5          0.25
Total                                                            65
Now, n = 12 and $\sum_{i=1}^{n} d_i^2 = 65$.
Using the formula
$$
r_s = 1 - \frac{6\left\{\sum_{i=1}^{n} d_i^2 + \frac{1}{12}(3^3 - 3) + \frac{1}{12}(2^3 - 2) + \frac{1}{12}(2^3 - 2)\right\}}{n(n^2 - 1)}
= 1 - \frac{6(65 + 2 + 0.5 + 0.5)}{12(144 - 1)} = 1 - \frac{408}{1716} = 0.762
$$
We can conclude that there is a high degree of correlation between efficiency and length of service.
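The tie handling for the twelve salesmen can be sketched as follows. This is a sketch under stated assumptions: the Y ranks are read off the solution table, taking the tied pairs to have been given as ranks 7, 7 and 11, 11, and the function names are hypothetical. Tied items receive the mean of the positions they occupy, and $(m^3 - m)/12$ is added once for each group of m tied items.

```python
from collections import Counter


def average_ranks(raw_ranks):
    """Give tied items the mean of the rank positions they occupy."""
    order = sorted(range(len(raw_ranks)), key=lambda i: raw_ranks[i])
    result = [0.0] * len(raw_ranks)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next item has the same raw rank.
        while end + 1 < len(order) and raw_ranks[order[end + 1]] == raw_ranks[order[pos]]:
            end += 1
        mean_rank = ((pos + 1) + (end + 1)) / 2
        for k in range(pos, end + 1):
            result[order[k]] = mean_rank
        pos = end + 1
    return result


def tie_correction(raw_ranks):
    """Sum of (m^3 - m) / 12 over every group of m tied items."""
    return sum((m ** 3 - m) / 12 for m in Counter(raw_ranks).values() if m > 1)


def spearman_with_ties(x_ranks, y_ranks):
    n = len(x_ranks)
    rx, ry = average_ranks(x_ranks), average_ranks(y_ranks)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    corrected = sum_d2 + tie_correction(x_ranks) + tie_correction(y_ranks)
    return 1 - 6 * corrected / (n * (n ** 2 - 1))


# Salesmen A..L; Y ranks assumed from the solution table.
x_ranks = [1, 2, 3, 4, 4, 4, 7, 8, 9, 10, 11, 12]
y_ranks = [2, 1, 5, 3, 9, 7, 7, 6, 4, 11, 10, 11]

rs = spearman_with_ties(x_ranks, y_ranks)
print(round(rs, 3))  # 0.762
```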
the extent to which the two sets of ranking are in agreement. True / False
3. The coefficient that is determined from these ranks is known as _____________ rank coefficient.
4. Many times the observations are grouped into a “two way” frequency distribution
table. True / False
Multiple Choice
d. Y variable
2. Correlation r contains no unit because
a. Absolute number
b. It is a relative number
c. It denotes variables
d. It shows the value of attributes
3. Value of r lying outside the range of -1 and +1 indicates
a. Zero correlation
b. Strong correlation
c. Error in calculation
d. Weak correlation
4. A scatter diagram displaying plotted points on an upward sloping straight line denotes that X and Y have
a. Positive correlation
b. Zero correlation
c. Perfect positive correlation
d. Perfect negative correlation
Summary
●● A table displaying correlation coefficients between variables is a correlation matrix. The association between two variables is presented by each cell in the table. A correlation matrix is used to summarise results, as an input into a more advanced analysis, and as a diagnostic for advanced analysis.
●● Quite often the data is available in the form of some ranking for different variables. Also, there are occasions where it is difficult to measure the cause-effect variables. It is necessary to find out whether the two sets of ranks are in agreement with each other. This is measured by the Rank Correlation Coefficient. The purpose of computing a correlation coefficient in such situations is to determine the extent to which the two sets of ranking are in agreement. The coefficient that is determined from these ranks is known as Spearman's rank coefficient.
Activity
1. Sara and Lily are studying for a test. Sara has spent eight hours studying for this test whereas Lily has only put in two hours. The tests were graded and Sara received an A on the test and Lily received a C. Seeing this, Lily wondered if there was a correlation between the hours studied and the test score. Below is the data that Lily collected. Using that, find the correlation coefficient.
X (time)    8    2    6    4    2
Y (grade)   98   74   87   82   72
2. The scores of ten students in physics and biology are as follows. Calculate the Spearman Rank Correlation.
a. Physics: 35, 23, 47, 17, 10, 43, 9, 6, 28, 32
b. Biology: 30, 33, 45, 23, 8, 49, 12, 4, 31, 42
Questions and Exercises
1. What are the key decisions that need to be made when creating a correlation matrix?
2. What is Spearman's rank coefficient formula?
3. How does one determine the rank for tied ranks?
Glossary
●● Bivariate Frequency Distribution: observations that are grouped into a "two way" frequency distribution table.
●● Correlation Matrix: A table displaying correlation coefficients between variables.
Further Reading
1. Athanasopoulos, George; Hyndman, Rob J. Forecasting: Principles and Practice. Otexts; 3rd ed. edition (May 31, 2021)
2. Gilliland, Michael. Business Forecasting (Wiley and SAS Business Series). Wiley; 1st edition (December 29, 2015)
2. True
3. Spearman’s
4. True
Multiple Choice
1. A
2. B
3. C
4. C
Objectives:
At the end of this unit, you will be able to understand
●● Introduction to Linear Regression Model
●● Population Sample Regression
●● Method Least Square Understanding
●● Math Behind Least Square
Introduction
There is a need for a statistical model that will extract information from the given data to establish the regression relationship between independent and dependent variables. The model should capture the systematic behaviour of the data. The non-systematic behaviour cannot be captured and is called error. The error is due to a random component that cannot be predicted, as well as to components not adequately considered in the statistical model. A good statistical model captures the entire systematic component, leaving only random errors.
The best fit is calculated as per Legendre's principle of least sum of squares of deviations of the observed data points from the corresponding values on the 'best fit' curve. This is called the minimum squared error criterion.
Linear regression is a linear model, i.e., a model that assumes a linear relationship between the input variables (x) and the single output variable (y). More specifically, that y can be calculated from a linear combination of the input variables (x).
A simple and widely used kind of predictive analysis is linear regression. Two questions are discussed in the general concept of regression:
1. Does a collection of predictor variables do a good job of predicting an outcome (dependent) variable?
2. In particular, which variables are important predictors of the outcome variable, and how do they influence the outcome variable, as shown by the magnitude and sign of the beta estimates?
The relationship between one dependent variable and one or more independent variables is explained by these regression estimates. The formula y = c + b*x describes the simplest form of the regression equation with one dependent and one independent variable, where y = expected dependent variable score, c = constant, b = coefficient of regression and x = score on the independent variable.
When there is a single input variable (x), the method is referred to as simple linear regression. When there are multiple input variables, literature from statistics often refers to the method as multiple linear regression.
Regression Lines
For bivariate data $(X_i, Y_i)$, i = 1, 2, ..., n, we can have either X or Y as the independent variable. If X is the independent variable then we can estimate the average values of Y for a given value of X. The relation used for such estimation is called the regression of Y on X. If on the other hand Y is used for estimating the average values of X, the relation will be called the regression of X on Y.
For bivariate data, there will always be two lines of regression. It will be shown later that these two lines are different, i.e., one cannot be derived from the other by mere transfer of terms, because the derivation of each line is dependent on a different set of assumptions.
Line of Regression of Y on X
The general form of the line of regression of Y on X is $Y_{Ci} = a + bX_i$, where $Y_{Ci}$ denotes the average or predicted or calculated value of Y for a given value of X = $X_i$.
[Figure: line of regression of Y on X, $Y_{Ci} = a + bX_i$, with intercept a on the Y-axis and the observed point $(X_i, Y_i)$]
The above line is known if the values of a and b are known. These values are estimated from the observed data $(X_i, Y_i)$, i = 1, 2, ..., n.
Line of Regression of X on Y
The general form of the line of regression of X on Y is $X_{Ci} = c + dY_i$, where $X_{Ci}$ denotes the predicted or calculated or estimated value of X for a given value of Y = $Y_i$, and c and d are constants. d is known as the regression coefficient of the regression of X on Y.
[Figure: line of regression of X on Y, $X_{Ci} = c + dY_i$, with intercept c on the X-axis and the observed point $(X_i, Y_i)$]
In this case, we have to calculate the values of c and d so that $S' = \sum (X_i - X_{Ci})^2$ is minimised.
This shows that the line of regression also passes through the point $(\bar{X}, \bar{Y})$. Since both the lines of regression pass through the point $(\bar{X}, \bar{Y})$, this point is their point of intersection, as shown.
[Figure: the two lines of regression, $Y_{Ci} = a + bX_i$ and $X_{Ci} = c + dY_i$, intersecting at $(\bar{X}, \bar{Y})$]
We can write $c = \bar{X} - d\bar{Y}$.
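A short numerical check may help fix the idea that the two lines of regression are different yet meet at the point of means. The sketch below uses hypothetical data and the standard least-squares slopes, b = Σxy/Σx² for Y on X and d = Σxy/Σy² for X on Y (with x and y measured as deviations from their means). The expression for d is stated here as an assumption by symmetry, since its derivation does not appear in this section.

```python
def regression_lines(xs, ys):
    """Return (a, b) for Y = a + bX and (c, d) for X = c + dY, plus the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    b = s_xy / s_xx          # regression coefficient of Y on X
    a = y_bar - b * x_bar
    d = s_xy / s_yy          # regression coefficient of X on Y
    c = x_bar - d * y_bar
    return a, b, c, d, x_bar, y_bar


# Hypothetical bivariate data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 5, 3, 8, 7]

a, b, c, d, x_bar, y_bar = regression_lines(xs, ys)

# Both lines pass through the point of means.
print(abs((a + b * x_bar) - y_bar) < 1e-9)   # True
print(abs((c + d * y_bar) - x_bar) < 1e-9)   # True
# The lines coincide only when b * d = 1, i.e. when the correlation is perfect.
print(round(b * d, 4))
```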
ŷ = b0 + b1x
μy = β0 + β1x, where μy is the population mean response, β0 is the y-intercept, and β1 is the slope for the population model.
In a population, there can be many different responses for a value of x. In simple linear regression, the model assumes that for each value of x the observed values of the response variable y are normally distributed with a mean that depends on x. To represent such means, μy is used. It is also assumed that these means all lie on a straight line when plotted against x (a line of means). The sample data then fit the statistical model:
Data = fit + residual
yi = (β0 + β1xi) + εi
where the errors (εi) are independent and normally distributed N(0, σ). Linear regression also assumes equal variance of y (σ is the same for all values of x). We use ε (Greek epsilon) to stand for the residual part of the statistical model. A response y is the sum of its mean and a chance deviation ε from the mean. The deviations ε represent the "noise" in the data. In other words, the noise is the variation in y due to other causes that prevent the observed (x, y) from forming a perfectly straight line.
The sample data used for regression are the observed values of y and x. The response y to a given x is a random variable, and the regression model describes the mean and standard deviation of this random variable y. The intercept β0, slope β1, and standard deviation σ of y are the unknown parameters of the regression model and must be estimated from the sample data.
◌◌ The value of ŷ from the least squares regression line is really a prediction of the mean value of y (μy) for a given value of x.
$$
\sum x_i^2 = \sum X_i^2 - n\bar{X}^2
$$
$$
\sum y_i^2 = \sum Y_i^2 - n\bar{Y}^2
$$
$$
\sum x_i y_i = \sum X_i Y_i - n\bar{X}\bar{Y}
$$
$$
b = \frac{\sum x_i y_i}{\sum x_i^2}, \qquad a = \bar{Y} - b\bar{X}
$$
These measures define a and b which will give the best possible fit through the original X and Y points, and the value of r can then be worked out as under:
$$
r = \frac{b\sqrt{\sum x_i^2}}{\sqrt{\sum y_i^2}}
$$
Thus, regression analysis is a statistical method to deal with the formulation of a mathematical model depicting the relationship amongst variables, which can be used for the purpose of prediction of the values of the dependent variable, given the values of the independent variable. Alternatively, for fitting a regression equation of the type Y = a + bX to the given values of X and Y variables, we can find the values of the two constants, viz., a and b, by using the following two normal equations:
$$
\sum Y_i = na + b\sum X_i
$$
$$
\sum X_i Y_i = a\sum X_i + b\sum X_i^2
$$
and then solving these equations for a and b. Once these values are obtained and have been put in the equation Y = a + bX, we say that we have fitted the regression equation of Y on X to the given data. In a similar fashion, we can develop the regression equation of X on Y, viz., X = a + bY, presuming Y as an independent variable and X as dependent variable.
The mathematical form of a parabolic trend is given by $Y_t = a + bt + ct^2$ or $Y = a + bt + ct^2$ (dropping the subscript for convenience). Here a, b and c are constants to be determined from the given data. Using the method of least squares, the normal equations for the simultaneous solution of a, b, and c are:
$$
\sum Y = na + b\sum t + c\sum t^2
$$
$$
\sum tY = a\sum t + b\sum t^2 + c\sum t^3
$$
$$
\sum t^2 Y = a\sum t^2 + b\sum t^3 + c\sum t^4
$$
By selecting a suitable year of origin, i.e., defining X = t - origin such that ΣX = 0, the computation work can be considerably simplified. Also note that if ΣX = 0, then ΣX³ will also be equal to zero. Thus, the above equations can be rewritten as:
$$
\sum Y = na + c\sum X^2 \quad ...(i)
$$
$$
\sum XY = b\sum X^2 \quad ...(ii)
$$
$$
\sum X^2 Y = a\sum X^2 + c\sum X^4 \quad ...(iii)
$$
From equation (ii), we get
$$
b = \frac{\sum XY}{\sum X^2} \quad ...(iv)
$$
Further, from equation (i), we get
$$
a = \frac{\sum Y - c\sum X^2}{n} \quad ...(v)
$$
and substituting (v) in (iii) gives
$$
c = \frac{n\sum X^2 Y - \sum X^2 \sum Y}{n\sum X^4 - \left(\sum X^2\right)^2} \quad ...(vi)
$$
Thus, equations (iv), (v) and (vi) can be used to determine the values of the constants a, b and c.
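Equations (iv), (v) and (vi) can be sketched in code as below. The series is hypothetical, generated from a known parabola so that the fit can be checked, and the origin is placed at the middle of the series so that ΣX = 0 and ΣX³ = 0, as assumed in the simplified normal equations. The function name `fit_parabolic_trend` is illustrative only.

```python
def fit_parabolic_trend(ys):
    """Fit Y = a + bX + cX^2 with X measured from the middle of the series."""
    n = len(ys)
    xs = [t - (n - 1) / 2 for t in range(n)]   # centred time, so sum(xs) == 0
    s_y = sum(ys)
    s_x2 = sum(x ** 2 for x in xs)
    s_x4 = sum(x ** 4 for x in xs)
    s_xy = sum(x * y for x, y in zip(xs, ys))
    s_x2y = sum(x ** 2 * y for x, y in zip(xs, ys))
    b = s_xy / s_x2                                           # equation (iv)
    c = (n * s_x2y - s_x2 * s_y) / (n * s_x4 - s_x2 ** 2)     # equation (vi)
    a = (s_y - c * s_x2) / n                                  # equation (v)
    return a, b, c


# Hypothetical seven-year series built from Y = 5 + 2X + 0.5X^2, X = -3..3.
series = [5 + 2 * x + 0.5 * x ** 2 for x in range(-3, 4)]

a, b, c = fit_parabolic_trend(series)
print(round(a, 4), round(b, 4), round(c, 4))  # recovers 5.0 2.0 0.5
```

With an even number of periods the centred X values are half-integers; the same formulas apply because ΣX and ΣX³ are still zero.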
different points of time in its cycle gets completely neutralised, i.e., S = 0 t in one year
and C = 0 t in the period of cyclical variations.
Method of Least Square - Exponential Trend
The general form of an exponential trend is $Y = a \cdot b^t$, where a and b are constants to be determined from the observed data. Taking logarithms of both sides, we have log Y = log a + t log b. This is a linear equation in log Y and t and can be fitted in a similar way as done in the case of a linear trend. Let A = log a and B = log b; then the above equation can be written as log Y = A + Bt. The normal equations are:
$$
\sum \log Y = nA + B\sum t
$$
$$
\text{and} \quad \sum t \log Y = A\sum t + B\sum t^2
$$
By selecting a suitable origin, i.e., defining X = t - origin, such that ΣX = 0, the computation work can be simplified. The values of A and B are given by
$$
A = \frac{\sum \log Y}{n} \quad \text{and} \quad B = \frac{\sum X \log Y}{\sum X^2}
$$
respectively. Thus, the fitted trend equation can be written as log Y = A + BX.
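As an illustration of the logarithmic transformation, the sketch below fits an exponential trend to a hypothetical series generated from Y = 3 · 2^X, with the origin at the middle of the series so that ΣX = 0. Common logarithms are used, and a and b are recovered as antilogarithms of A and B. The function name is an assumption made for this example.

```python
from math import log10


def fit_exponential_trend(ys):
    """Fit Y = a * b**X via log Y = A + BX, with X centred so that sum(X) = 0."""
    n = len(ys)
    xs = [t - (n - 1) / 2 for t in range(n)]
    logs = [log10(y) for y in ys]
    big_a = sum(logs) / n                                            # A = sum(log Y) / n
    big_b = sum(x * l for x, l in zip(xs, logs)) / sum(x * x for x in xs)
    return 10 ** big_a, 10 ** big_b                                  # a and b


# Hypothetical five-period series: Y = 3 * 2**X for X = -2, -1, 0, 1, 2.
series = [0.75, 1.5, 3.0, 6.0, 12.0]

a, b = fit_exponential_trend(series)
print(round(a, 4), round(b, 4))  # recovers 3.0 2.0
```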
do.”
probability and sampling distributions, we must be familiar with some common terms
used in theory of probability. Although these terms are commonly used in business, they
have precise technical meaning.
data; the proof employs simple calculus and linear algebra. The fundamental issue is determining the best-fitting straight line
y = ax + b
given that, for n ∈ {1, ..., N}, the pairs $(x_n, y_n)$ are observed. The method easily generalises to determining the best fit of the form
$y = a_1 f_1(x) + \cdots + a_K f_K(x)$;
It is not necessary for the functions $f_k$ to be linear in x; instead, y must be a linear combination of these functions.
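Step 1: For each (x, y) point calculate x² and xy
Step 2: Sum all x, y, x² and xy, which gives us Σx, Σy, Σx² and Σxy (Σ means "sum up")
Step 3: Calculate the slope m, where N is the number of points:
m = (N Σxy − Σx Σy) / (N Σx² − (Σx)²)
Step 4: Calculate the intercept b:
b = (Σy − m Σx) / N
Step 5: Assemble the equation of the line:
y = mx + b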
Example: Sam found how many hours of sunshine vs how many ice creams were sold:
"x"                 "y"
Hours of Sunshine   Ice Creams Sold
2                   4
3                   5
5                   7
7                   10
9                   15
Let us find the best m (slope) and b (y-intercept) that suits that data
y = mx + b
x        y        x²        xy
2        4        4         8
3        5        9         15
5        7        25        35
7        10       49        70
9        15       81        135
Σx: 26   Σy: 41   Σx²: 168  Σxy: 263
m = (N Σxy − Σx Σy) / (N Σx² − (Σx)²)
= (5 × 263 − 26 × 41) / (5 × 168 − 26²)
= 249 / 164 = 1.5183...
b = (Σy − m Σx) / N
= (41 − 1.5183 × 26) / 5
= 0.3049...
y = mx + b
y = 1.518x + 0.305
x    y    y = 1.518x + 0.305   error
2    4          3.34           −0.66
3    5          4.86           −0.14
5    7          7.89            0.89
7    10        10.93            0.93
9    15        13.97           −1.03
Here are the (x, y) points and the line y = 1.518x + 0.305 on a graph:
[Figure: the five (x, y) points plotted with the line y = 1.518x + 0.305]
Nice fit!
Sam hears the weather forecast which says "we expect 8 hours of sun tomorrow", so he uses the above equation to estimate that he will sell
y = 1.518 × 8 + 0.305 = 12.45 ice creams
Sam makes fresh waffle cone mixture for 14 ice creams just in case.
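The steps of the least squares method translate directly into a few lines of code. The sketch below is one possible way to write them, using Sam's five data points from the example; the function name `least_squares_line` is illustrative only.

```python
def least_squares_line(points):
    """Return slope m and intercept b of the least squares line y = mx + b."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_x2 = sum(x * x for x, _ in points)
    sum_xy = sum(x * y for x, y in points)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = (sum_y - m * sum_x) / n
    return m, b


# (hours of sunshine, ice creams sold)
points = [(2, 4), (3, 5), (5, 7), (7, 10), (9, 15)]

m, b = least_squares_line(points)
print(round(m, 4), round(b, 4))   # 1.5183 0.3049

# Estimate for a day with 8 hours of sunshine.
estimate = m * 8 + b
print(round(estimate, 2))         # 12.45
```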
1. There is a need for a statistical model that will extract information from the given data to establish the _________ relationship between independent and dependent variables.
2. The non-systematic behaviour cannot be captured and is called _________.
3. The error is due to a _________ component that cannot be predicted as well as the component not adequately considered in the statistical model.
4. If the variables in a bivariate distribution are correlated, the points in the scatter diagram approximately cluster around some curve. True / False
5. The equation of the curve which is closest to the observations is called the "___________".
6. When there are multiple input variables, literature from statistics often refers to the
x   1   2   3   4   5
y   1   2   3   4   5
a) y = x
b) y = x + 1
c) y = 2x
d) y = 2x + 1
2. Fit a second degree parabola to the following data.
x   1   2   3   4   5    6    7    8    9
y   2   6   7   8   10   11   11   10   9
c) y = 0.2673x² + 3.5232x + 0.9286
d) y = -0.2673x² + 3.5232x + 0.9286
3. Fit the straight line to the following data.
x   0   5    10   15   20
y   7   11   16   20   26
a) y = 0.94x + 6.6
b) y = 6.6x + 0.94
c) y = 0.04x + 5.6
d) y = 5.6x + 0.04
4. Fit the straight line curve to the following data.
x   75   80   93   65   87   71   98   68   84   77
y   82   78   86   72   91   80   95   72   89   74
a) y = 0.9288x + 7.78155
b) y = 7.78155x + 0.9288
c) y = 0.8288x + 6.78155
d) y = 6.78155x + 0.8288
Summary
●● There is a need for a statistical model that will extract information from the given data to establish the regression relationship between independent and dependent variables. The model should capture the systematic behaviour of the data. The non-systematic behaviour cannot be captured and is called error. The error is due to a random component that cannot be predicted as well as the component not adequately considered in the statistical model.
●● When there is a single input variable (x), the method is referred to as simple linear regression. When there are multiple input variables, literature from statistics often refers to the method as multiple linear regression.
●● To construct an ordinary least-squares regression line, we use the means and standard deviations of our sample data to calculate the slope and y-intercept.
●● In the method of moving average, successive arithmetic averages are computed from overlapping groups of successive values of a time series.
Activity
1. Determine the forecast for the four year simple moving average
Year    1    2    3    4    5    6    7    8    9    10
Sales   20   21   23   22   25   24   27   26   28   30
Questions and Exercises
1. What are the two questions discussed in the general concept of regression?
2. What is the mathematical form of a parabolic trend?
3. What is the applicability of using the moving average method?
4. What is the mathematical equation for the method of least squares?
Glossary
●● Linear regression is a linear model, i.e., a model that assumes a linear relationship between the input variables (x) and the single output variable (y).
●● Moving average method: successive arithmetic averages are computed from overlapping groups of successive values of a time series.
Further Reading
1. Athanasopoulos, George; Hyndman, Rob J. Forecasting: Principles and
1. Regression
2. Errors
3. Random
4. True
5. Best fit
6. Multiple
7. Means
8. True
Multiple Choice
a. a
b. a
c. a
d. a