Book - Applied - Mathematics - ClassXII-208-382

Unit 4
4.0 LEARNING OUTCOMES

After completion of this unit the students will be able to:
• Understand the concept of Random Variables
• Distinguish between discrete and continuous random variables
• Understand and apply the concept of Probability Distribution
• Write the probability distribution of a discrete random variable
• Calculate the mathematical expectation and variance of a discrete random variable
• Understand and apply the concept of Binomial Distribution
• Calculate the mathematical expectation and variance for a binomial distribution
• Understand and apply the concept of Poisson Distribution
• Calculate the mathematical expectation and variance for a Poisson distribution
• Understand and apply the concept of Normal Distribution
• Calculate the mathematical expectation and variance for a normal distribution
• Calculate the Z-Score and use the Z-Table to interpret a normally distributed data set
4.0.0 BEFORE YOU START, YOU SHOULD KNOW

1. Random experiment, Sample space, Event associated with a sample space
2. Mutually exclusive and exhaustive events
3. Independent and dependent events
4. Multiplication theorem of Probability
5. Addition theorem of Probability
6. Total Probability
7. Bayes’ theorem
Probability Distribution 4.1

4.1 CONCEPT MAP
4.2 INTRODUCTION
Suhani has two black sweaters and a white sweater in her cupboard. She takes out a sweater
at random, notes the colour and puts it back in the cupboard. She repeats the process once more
before making up her mind.
What is the sample space of the situation stated above?
Let us label the sweaters B1, B2 and W1.
For two draws with replacement,
the sample space is S = {B1B1, B1B2, B2B1, B2B2, B1W1, B2W1, W1B1, W1B2, W1W1}
Clearly, these draws form a random experiment whose outcomes cannot be predicted.
Let X represent the number of white sweaters drawn in this situation. What can you
say about the value of X?
Here, X(B1B1) = X(B1B2) = X(B2B1) = X(B2B2) = 0, as these sample points do not contain any white
sweater.
Also, X(B1W1) = X(B2W1) = X(W1B1) = X(W1B2) = 1, as these sample points contain one white sweater.
And X(W1W1) = 2, as this sample point contains two white sweaters.
So X can take the values 0, 1 or 2.
Here, X is a function whose domain is the set of possible outcomes (the sample space) of a random
experiment. Since X takes real values, its co-domain is the set of real numbers.
In such a case X is called a random variable.
Definition: A random variable is a real valued function whose domain is the sample space of
a random experiment.
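The idea that a random variable is a function on the sample space can be checked with a short program. The following is only an illustrative sketch: the labels B1, B2, W1 come from the text, while the names `sample_space`, `X` and `values` are chosen here for the illustration.

```python
from itertools import product

# Labels for the three sweaters: two black, one white.
sweaters = ["B1", "B2", "W1"]

# Two draws with replacement: every ordered pair is a sample point.
sample_space = list(product(sweaters, repeat=2))

def X(outcome):
    """Number of white sweaters in an outcome (a pair of labels)."""
    return sum(1 for s in outcome if s.startswith("W"))

# X assigns exactly one real number to every sample point.
values = {outcome: X(outcome) for outcome in sample_space}
range_of_X = sorted(set(values.values()))

print(len(sample_space))  # number of sample points
print(range_of_X)         # values that X can take
```

The dictionary `values` plays the role of the function X: each of the 9 sample points appears once as a key, and the set of its values is the range {0, 1, 2}.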
Let us consider an experiment of tossing a coin two times in succession.
Clearly the sample space of this experiment is S = {HH, HT, TH, TT}.
If X represents the number of heads obtained in this situation,
Then X(HH) = 2
X (HT) = X (TH) = 1
and X (TT) = 0.
Let Y represent the number of tails minus the number of heads for each outcome of the
above sample space S.
Then Y(TT) = 2
Y(TH) = Y(HT) = 0
and Y(HH) = −2
In this case, Y is a random variable which can take the values 2, 0 or −2.
Please note that more than one random variable can be defined on a given sample space. In both
the situations above, we shall assume that each random outcome is equally likely to be selected.
Example 1
Rajat is playing a game of rolling a die with his friends. According to the game rules, he wins
Rs 5 for rolling an even number and loses Rs 2 for rolling an odd number. Let X represent the
amount of money Rajat wins or loses. Show that X is a random variable and represent it as a
function on the sample space of the game.
Solution: Sample space of the game S = {1, 2, 3, 4, 5, 6}

As X represents the amount of money Rajat wins or loses, its value is determined by the random
outcome of the roll; therefore it is a random variable.
X(2) = X(4) = X(6) = 5, as Rajat wins Rs 5 when he rolls an even number on the die,
X(1) = X(3) = X(5) = −2, as Rajat loses Rs 2 when he rolls an odd number on the die.
Thus, for each element of the sample space, X takes a unique value. Hence X is a function on
the sample space whose range is {5, −2}.
4.2.1 Discrete and Continuous Random Variables

Recall that a variable is a quantity whose value can vary.
Let us consider the toss of a fair coin and let X be the random variable defined as
$$X = \begin{cases} 0, & \text{if the coin toss results in a head} \\ 1, & \text{if the coin toss results in a tail} \end{cases}$$
Here, the random variable takes two distinct values, with no value in between them.
A random variable whose possible values are distinct and countable is called a discrete
random variable; therefore X is a discrete random variable.
A wrist watch with only an hour and minute display shows the time as 12:00, then
12:01, 12:02, and so on, and no time in between is shown. In this case the
displayed time changes in distinct and countable steps. Therefore the time shown in this
case is a discrete random variable.
Each possible value of a discrete random variable can be associated with a
non-zero probability.
In contrast, a wrist watch that also displays the seconds count shows the times
22:31:25 and 22:32:17 and the elapsed time in between as well. A random
variable whose value is obtained by measuring, and which can take any value between
two given values, is called a continuous random variable.
In other words, a continuous random variable is a random variable whose set
of possible values (known as the range) is infinite and uncountable.
4.2.2 Probability Distribution of Discrete Random Variable

Let us consider another random variable X defined as sum of digits on rolling of two dice.
The following grid shows the sample space of this random experiment:
+ 1 2 3 4 5 6
1 2 3 4 5 6 7
2 3 4 5 6 7 8
3 4 5 6 7 8 9
4 5 6 7 8 9 10
5 6 7 8 9 10 11
6 7 8 9 10 11 12
Clearly, X will take values 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 and 12 which are distinct and countable,
hence X is a discrete random variable in this case.
Now let us find the probability of each value of X:
X      2     3     4     5     6     7     8     9     10    11    12
P(X)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
Table (i)
Observe that in table (i); for all possible values of the discrete random variable X, all elements of
the sample space are covered.
This table of possible outcomes and their respective probability is called Probability distribution
table for the given random variable X. A probability distribution table links every possible outcome
of the random experiment with the probability of the event to occur.
In a probability distribution table, the sum of all the probabilities is one. (Refer table (i) )
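The probability distribution of the sum of two dice can be rebuilt by counting outcomes. This is a minimal sketch, assuming two fair six-sided dice; the variable names are chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Count how many of the 36 equally likely outcomes give each sum.
counts = Counter(a + b for a, b in product(faces, repeat=2))
total = sum(counts.values())

# Probability of each value of X as an exact fraction.
distribution = {x: Fraction(n, total) for x, n in sorted(counts.items())}

for x in distribution:
    print(x, f"{counts[x]}/{total}")

# The probabilities of a distribution must add up to one.
print(sum(distribution.values()))
```

Using `Fraction` keeps the probabilities exact, so the check that they add up to one does not suffer from rounding.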
Example 2
A coin is tossed thrice and outcomes are recorded. Prepare the probability distribution table for the
number of heads.
Solution: Let the random variable X denote the number of heads in three tosses of a coin.
In a single toss of a fair coin: P(head) = 1/2 and P(tail) = 1/2
Here the sample space S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}

This means that X can take the value 0 for no heads, 1 for one head and two tails, 2 for two
heads and one tail, or 3 for three heads.
X = 0, 1, 2 and 3
The probability of no heads, i.e. P(TTT), can also be written as
$P(X = 0) = \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{8}$
(Recall the multiplication theorem of probability from class XI)

The probability of obtaining one head and two tails, i.e. P(HTT, THT, TTH), is denoted by
$P(X = 1) = 3 \times \frac{1}{8} = \frac{3}{8}$
The probability of obtaining two heads and one tail, i.e. P(HHT, HTH, THH), is denoted by
$P(X = 2) = 3 \times \frac{1}{8} = \frac{3}{8}$
And the probability of obtaining three heads, i.e. P(HHH), is written as
$P(X = 3) = \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{8}$
Therefore, the probability distribution table is:
xi            0     1     2     3
P(xi) = pi    1/8   3/8   3/8   1/8
Note: In a probability distribution, the sum of all the probabilities is always one.
4.3 MATHEMATICAL EXPECTATION OF DISCRETE PROBABILITY DISTRIBUTION

Recall that mean is a measure of central tendency as it locates a rough estimation of a middle
or average value of a random variable in an experiment.
Definition: In an experiment, for a given random variable X whose possible finite values $x_1, x_2, x_3, \ldots, x_n$
occur with probabilities $p_1, p_2, p_3, \ldots, p_n$ respectively such that $\sum_{i=1}^{n} p_i = 1$,
the mathematical expectation is the weighted average of the possible values of X, given by:
$$E(X) = x_1 p_1 + x_2 p_2 + x_3 p_3 + \cdots + x_n p_n = \sum_{i=1}^{n} x_i p_i$$
In a nutshell, the mathematical expectation, also known as expected value for a random variable X
is the summation of product of all possible values for the given random variable X and their
respective probabilities.
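The weighted-average formula translates directly into code. The sketch below is illustrative only: the helper name `expectation` is an assumption of this example, and the data are the probabilities 1/8, 3/8, 3/8, 1/8 for the number of heads in three tosses of a fair coin (Example 2).

```python
from fractions import Fraction

def expectation(dist):
    """Return E(X), the sum of x_i * p_i, for a dict {x_i: p_i}."""
    if sum(dist.values()) != 1:
        raise ValueError("probabilities must add up to 1")
    return sum(x * p for x, p in dist.items())

# Number of heads in three tosses of a fair coin (Example 2).
heads_in_three = {
    0: Fraction(1, 8),
    1: Fraction(3, 8),
    2: Fraction(3, 8),
    3: Fraction(1, 8),
}

mean_heads = expectation(heads_in_three)
print(mean_heads)  # 3/2
```

The result 3/2 is not itself a possible value of X; the expectation is a long-run average, not an outcome.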
Example 3
A coin is tossed twice and outcomes are recorded. Prepare the probability distribution table for
random variable X which represents the number of heads in the experiment. Also calculate the
mathematical expectation of X.
Solution: Let the random variable X denote the number of heads in two tosses of a coin.
The sample space S = {HH, HT, TH, TT}

Clearly X = 0, 1 and 2
The probability of occurrence of a head = probability of occurrence of a tail = 1/2
The probability distribution table:
xi              0                1                      2
Sample event    TT               HT, TH                 HH
P(xi) = pi      1/4              2 × 1/4 = 1/2          1/4
xi pi           0 × 1/4 = 0      1 × 1/2 = 1/2          2 × 1/4 = 1/2
Note that $\sum_{i=1}^{n} p_i = \frac{1}{4} + \frac{1}{2} + \frac{1}{4} = 1$
Therefore, $E(X) = \sum_{i=1}^{n} x_i p_i = 0 + \frac{1}{2} + \frac{1}{2} = 1$
Example 4
In a manufacturing unit inspection, from a lot of 20 baskets which include 6 defectives, a sample
of 2 baskets is drawn at random without replacement. Prepare the probability distribution of the
number of defective baskets. Also calculate E(X) for the random variable X.
Solution:
As X denotes the number of defective baskets in a draw of 2 without replacement,
X = 0, 1 and 2
Therefore, in a draw of two baskets:
X             0                          1                              2
Event         No defective baskets       One defective basket           Two defective baskets
P(xi) = pi    14/20 × 13/19 = 182/380    2 × 14/20 × 6/19 = 168/380     6/20 × 5/19 = 30/380
xi pi         0 × 182/380 = 0            1 × 168/380 = 168/380          2 × 30/380 = 60/380
Note that $\sum_{i=1}^{n} p_i = \frac{182}{380} + \frac{168}{380} + \frac{30}{380} = 1$
Therefore, $E(X) = \sum_{i=1}^{n} x_i p_i = 0 + \frac{168}{380} + \frac{60}{380} = \frac{228}{380} = \frac{3}{5}$
E(X) of a random variable X is the theoretical mean of X. It is based not on sample data
but on the probability distribution of X.
So, the mathematical expectation is a parameter and not a statistic.
It is also denoted by the Greek letter mu (μ).

Random variables with different probability distributions can have equal, or nearly equal, means.
Let us take an example to study this statement in detail.
Have a look at the probability distributions of two different random variables X and Y given below:
X             1                2                3                4
P(xi) = pi    1/7              3/7              2/7              1/7
xi pi         1 × 1/7 = 1/7    2 × 3/7 = 6/7    3 × 2/7 = 6/7    4 × 1/7 = 4/7
Here, $E(X) = \sum_{i=1}^{n} x_i p_i = \frac{1}{7} + \frac{6}{7} + \frac{6}{7} + \frac{4}{7} = \frac{17}{7} \approx 2.43$
Y             −1                 0              4                 5
P(yi) = pi    1/7                2/7            3/7               1/7
yi pi         −1 × 1/7 = −1/7    0 × 2/7 = 0    4 × 3/7 = 12/7    5 × 1/7 = 5/7
Here, $E(Y) = \sum_{i=1}^{n} y_i p_i = -\frac{1}{7} + 0 + \frac{12}{7} + \frac{5}{7} = \frac{16}{7} \approx 2.29$
The means of X and Y are almost the same, yet the distributions are clearly different: X takes
values only from 1 to 4, while the values of Y range from −1 to 5. The mean alone cannot
distinguish such distributions. In such cases, we need a technique to measure the variability,
that is, the extent to which the values of a random variable are spread out.
4.4 VARIANCE OF DISCRETE PROBABILITY DISTRIBUTION

While the mean is a measure of central tendency (the average of a group of data), the variance
measures the average degree to which each value differs from the mean.
Variance enables us to study the variability of a random variable about its mathematical expectation.
When the values of the random variable lie in a narrow range, they are close to the mathematical
expectation and hence the variance is small.
When the values are spread over a wide range, they lie far from the mathematical expectation
and thus the variance is large.
In a probability distribution of a discrete random variable X, the variance, denoted by Var(X), is the
sum of the products of the squared deviations of xi from the mean E(X) and the corresponding
probabilities pi.
Definition: Let X be a discrete random variable whose possible finite values $x_1, x_2, x_3, \ldots, x_n$ occur
with probabilities $p_1, p_2, p_3, \ldots, p_n$ respectively. Expanding the squared deviations in
$\sum_{i=1}^{n} (x_i - E(X))^2 p_i$ gives the form used for calculation:
$$\sigma_x^2 = \mathrm{Var}(X) = \sum_{i=1}^{n} x_i^2 p_i - \left( \sum_{i=1}^{n} x_i p_i \right)^2$$
In other words, $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$, where $E(X^2) = \sum_{i=1}^{n} x_i^2 p_i$
And the standard deviation, denoted by $\sigma_x$, is given by:
$$\sigma_x = \sqrt{\mathrm{Var}(X)}$$
Example 5
A class XII has 20 students whose marks (out of 30) are 14, 17, 25, 14, 21, 17, 17, 19, 18, 26, 18,
17, 17, 26, 19, 21, 21, 25, 14 and 19. A random variable X denotes the marks of a selected
student, where each student is equally likely to be selected.
a) Prepare the probability distribution of the random variable X.
b) Find the mean, variance and standard deviation of X.
Solution: Based on the given data, let us prepare a table:
Marks        14   17   18   19   21   25   26
Frequency    3    5    2    3    3    2    2
As each student is equally likely to be selected,
P(a student is selected) = 1/20
Therefore, the probability distribution is:
Marks = xi        14      17      18      19        21        25      26
Frequency = fi    3       5       2       3         3         2       2
P(xi) = pi        3/20    5/20    2/20    3/20      3/20      2/20    2/20
xi pi             42/20   85/20   36/20   57/20     63/20     50/20   52/20
xi² pi            147/5   289/4   162/5   1083/20   1323/20   125/2   338/5
Here, $E(X) = \sum_{i=1}^{n} x_i p_i = \frac{385}{20} = 19.25$
And, $E(X^2) = \sum_{i=1}^{n} x_i^2 p_i = \frac{7689}{20} = 384.45$
$$\mathrm{Var}(X) = \sum_{i=1}^{n} x_i^2 p_i - \left( \sum_{i=1}^{n} x_i p_i \right)^2 = \frac{7689}{20} - \left( \frac{385}{20} \right)^2 \approx 13.9$$
And standard deviation, $\sigma_x = \sqrt{\mathrm{Var}(X)} = \sqrt{13.9} \approx 3.73$
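The calculation in Example 5 can be reproduced with a few lines of code. This is a sketch under the assumptions of the example (20 students, each equally likely to be selected); the variable names are chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction
from math import sqrt

marks = [14, 17, 25, 14, 21, 17, 17, 19, 18, 26,
         18, 17, 17, 26, 19, 21, 21, 25, 14, 19]

n = len(marks)

# Probability of each mark = frequency / number of students.
dist = {x: Fraction(f, n) for x, f in Counter(marks).items()}

mean = sum(x * p for x, p in dist.items())         # E(X)
mean_sq = sum(x * x * p for x, p in dist.items())  # E(X^2)
variance = mean_sq - mean ** 2                     # E(X^2) - [E(X)]^2
std_dev = sqrt(variance)

print(float(mean), float(mean_sq))
print(float(variance), round(std_dev, 2))
```

The exact variance is 1111/80 = 13.8875, which the worked solution rounds to 13.9.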
Example 6
Let X denote the number of hours a person watches television during a randomly selected day. The
probability that X takes the value xi has the following form, where k is some unknown constant.
$$P(X = x_i) = \begin{cases} 0.2, & \text{if } x_i = 0 \\ k x_i, & \text{if } x_i = 1 \text{ or } 2 \\ k(5 - x_i), & \text{if } x_i = 3 \\ 0, & \text{otherwise} \end{cases}$$
a) Find the value of k.
b) What is the probability that the person watches two hours of television on a selected day?
c) What is the probability that the person watches at least two hours of television on a selected day?
d) What is the probability that the person watches at most 2 hours of television on a selected day?
e) Calculate the mathematical expectation.
f) Find the variance and standard deviation of the random variable X.
Solution:
xi            0     1    2     3
P(xi) = pi    0.2   k    2k    2k
a) As $\sum_{i=1}^{n} p_i = 1$
0.2 + k + 2k + 2k = 1
5k = 0.8
k = 0.16 = 4/25
b) Probability that the person watches two hours of television
= P(xi = 2)
= 2k = 2 × 4/25 = 8/25
c) Probability that the person will watch at least two hours of television
= P(xi = 2, 3)
= 2k + 2k = 4k = 4 × 4/25 = 16/25
d) Probability that the person will watch at most two hours of television
= P(xi = 0, 1, 2)
= 0.2 + k + 2k = 0.2 + 3k = 0.2 + 3 × 4/25 = 17/25
e)
xi            0            1       2       3
P(xi) = pi    0.2 = 1/5    4/25    8/25    8/25
xi pi         0            4/25    16/25   24/25
xi² pi        0            4/25    32/25   72/25
$E(X) = \sum_{i=1}^{n} x_i p_i = 0 + \frac{4}{25} + \frac{16}{25} + \frac{24}{25} = \frac{44}{25} = 1.76$
f) $\sum_{i=1}^{n} x_i^2 p_i = 0 + \frac{4}{25} + \frac{32}{25} + \frac{72}{25} = \frac{108}{25}$
$$\mathrm{Var}(X) = \sum_{i=1}^{n} x_i^2 p_i - \left( \sum_{i=1}^{n} x_i p_i \right)^2 = \frac{108}{25} - \left( \frac{44}{25} \right)^2 = \frac{108}{25} - \frac{1936}{625} = \frac{2700 - 1936}{625} = \frac{764}{625} \approx 1.22$$
And standard deviation, $\sigma_x = \sqrt{\mathrm{Var}(X)} = \sqrt{1.22} \approx 1.1$
4.5 BINOMIAL DISTRIBUTION

When you toss a coin, it shows either a ‘head’ or a ‘tail’. When you are asked to calculate 3 +
4, the answer is either ‘correct’ or ‘incorrect’. In such experiments the outcome is either
a ‘success’ or a ‘failure’.
In the case of a discrete random variable X denoting a prime number on the throw of a die, the
numbers 2, 3 and 5 will be considered a ‘success’, while 1, 4 and 6 will be counted as a
‘failure’ in the experiment.
Each time you roll a die or perform any experiment in probability, it is called a trial. If in an
experiment a die is rolled thrice, then the number of trials is counted as 3, each trial having
exactly two outcomes, namely, success or failure.
Note that in each trial the probability of success or failure remains constant, and the outcome of
any trial is independent of the outcome of any other trial.

Independent trials which have only two outcomes usually referred as ‘success’ and ‘failure’ are
called Bernoulli trials. Here, the probability of success and failure remains same.
Definition: In a random experiment, a collection of trials is called Bernoulli trials, if:
a) The number of trials is finite.
b) The trials are independent by nature.
c) Each trial has exactly two outcomes defined as success and failure.
d) The probability of success remains the same in each trial.
Recall Example 4, where the random variable X denotes the number of defective baskets in a draw
of two baskets without replacement.
What would happen if 5 baskets are drawn with replacement?
X = 0, 1, 2, 3, 4 and 5
In this case, drawing a defective basket will be considered a success, with probability
denoted by p, and drawing a non-defective basket will be a failure, with probability denoted by q.
Also, each draw of a basket will be called a trial, and since we are drawing 5 baskets, the number
of trials is 5.
Do you think that these trials qualify as Bernoulli trials?
Here p = 6/20 = 3/10 and q = 1 − p = 7/10. Since the draws are independent and the probability
of success remains the same in all the trials, we can say these are Bernoulli trials.
When the drawing is done without replacement, the probability of success (i.e., drawing a defective
basket) in the first trial is 6/20; in the 2nd trial it will be 5/19 if the first basket drawn
was defective (6/19 otherwise), and so on.
Clearly, the probability of success is not the same for all trials; hence the trials in Example 4 are not
Bernoulli trials.
Probability of ‘r’ successes in ‘n’ Bernoulli trials is given by:
$$P(r \text{ successes}) = {}^{n}C_{r}\, p^{r} q^{n-r} = \frac{n!}{r!\,(n-r)!}\, p^{r} q^{n-r}$$
Where n = number of trials
r = number of successful trials = 0, 1, 2, 3, …, n
p = probability of a success in a trial
q = probability of a failure in a trial
And, p + q = 1
Clearly, P(‘r’ successes) is the (r + 1)th term in the binomial expansion of $(q + p)^n$.
The probability distribution of the number of successes for a random variable X can be written as:
X = ri        0                         1                         2                         …   r                         …   n
P(ri) = pi    ${}^{n}C_{0}\,p^{0}q^{n}$   ${}^{n}C_{1}\,p^{1}q^{n-1}$   ${}^{n}C_{2}\,p^{2}q^{n-2}$   …   ${}^{n}C_{r}\,p^{r}q^{n-r}$   …   ${}^{n}C_{n}\,p^{n}q^{0}$
This probability distribution is called the Binomial distribution with parameters n and p.

The binomial distribution with n Bernoulli trials and probability of success p is also denoted by B(n, p).
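The binomial formula can be written as a small function. The sketch below is illustrative: the name `binomial_pmf` is chosen here, and the data are the 5 draws with replacement from the lot of 20 baskets with 6 defectives discussed above, so p = 6/20 = 3/10.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(r, n, p):
    """P(r successes in n Bernoulli trials with success probability p)."""
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)

# Five baskets drawn with replacement; success = drawing a defective basket.
n, p = 5, Fraction(3, 10)
distribution = [binomial_pmf(r, n, p) for r in range(n + 1)]

for r, prob in enumerate(distribution):
    print(r, prob)

# The terms are the expansion of (q + p)^n, so they add up to 1.
print(sum(distribution))
```

Passing p as a `Fraction` keeps every probability exact; a float such as 0.3 also works but introduces rounding.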

Example 7:
Prepare the Binomial distribution B(4, 2/3)
Solution: Here total number of trials = n = 4 and p = 2/3
As p + q = 1, q = 1 − 2/3 = 1/3
Now, number of successes = r = 0, 1, 2, 3 or 4

The binomial distribution can be given as:
X = ri        0                       1                       2                       3                       4
P(ri) = pi    ${}^{4}C_{0}(\frac{2}{3})^{0}(\frac{1}{3})^{4}$   ${}^{4}C_{1}(\frac{2}{3})^{1}(\frac{1}{3})^{3}$   ${}^{4}C_{2}(\frac{2}{3})^{2}(\frac{1}{3})^{2}$   ${}^{4}C_{3}(\frac{2}{3})^{3}(\frac{1}{3})^{1}$   ${}^{4}C_{4}(\frac{2}{3})^{4}(\frac{1}{3})^{0}$
              = 1/81                  = 8/81                  = 24/81                 = 32/81                 = 16/81
Now let us calculate the mathematical expectation, variance and standard deviation for Example 7.
Recall that $E(X) = \sum_{i=1}^{n} x_i p_i = 0 + \frac{8}{81} + \frac{48}{81} + \frac{96}{81} + \frac{64}{81} = \frac{216}{81} \approx 2.67$
Also see that $np = 4 \times \frac{2}{3} \approx 2.67$
$$\mathrm{Var}(X) = \sum_{i=1}^{n} x_i^2 p_i - \left( \sum_{i=1}^{n} x_i p_i \right)^2 = \frac{648}{81} - \left( \frac{216}{81} \right)^2 = 8 - \frac{64}{9} = \frac{8}{9} \approx 0.89$$
Also see that $npq = 4 \times \frac{2}{3} \times \frac{1}{3} = \frac{8}{9} \approx 0.89$
And standard deviation $= \sqrt{\mathrm{Var}(X)} = \sqrt{\frac{8}{9}} \approx 0.94$
In a binomial distribution having ‘n’ Bernoulli trials,
where p denotes the probability of success and q denotes the probability of failure,
Mean = np
Variance = npq
Standard Deviation = $\sqrt{npq}$
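The shortcuts Mean = np and Variance = npq can be checked against the definitions of E(X) and Var(X). The following sketch does this for B(4, 2/3), taking p = 2/3 and q = 1/3 as in the table of Example 7; the variable names are chosen here for illustration.

```python
from fractions import Fraction
from math import comb, sqrt

n, p = 4, Fraction(2, 3)
q = 1 - p

# Binomial probabilities for r = 0, 1, ..., n.
pmf = {r: comb(n, r) * p**r * q**(n - r) for r in range(n + 1)}

# Mean and variance computed from the definitions.
mean = sum(r * pr for r, pr in pmf.items())
variance = sum(r * r * pr for r, pr in pmf.items()) - mean**2

print(mean, n * p)          # both equal 8/3
print(variance, n * p * q)  # both equal 8/9
print(round(sqrt(variance), 2))
```

The agreement is exact here because the arithmetic is done with fractions rather than rounded decimals.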
Example 8
If a fair coin is tossed 9 times, find the probability of
a) exactly five tails
b) At least five tails
c) At most five tails

Solution:
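Repeated tosses of a fair coin qualify as Bernoulli trials.
Let X denote the number of tails in an experiment of 9 such trials; X then follows a binomial distribution.
Here, n = 9, p = 1/2 and q = 1 − p = 1/2
As $P(r \text{ successes}) = {}^{9}C_{r} \left(\frac{1}{2}\right)^{r} \left(\frac{1}{2}\right)^{9-r} = \frac{{}^{9}C_{r}}{512}$
a) Probability of exactly 5 successes in 9 trials $= P(X = 5) = \frac{{}^{9}C_{5}}{512} = \frac{126}{512} = \frac{63}{256}$
b) Probability of at least 5 successes in 9 trials $= P(X \ge 5)$
$= P(X = 5) + P(X = 6) + P(X = 7) + P(X = 8) + P(X = 9)$
$= \frac{126 + 84 + 36 + 9 + 1}{512} = \frac{256}{512} = \frac{1}{2}$
c) Probability of at most 5 successes in 9 trials $= P(X \le 5) = 1 - P(X \ge 6)$
$= 1 - \frac{84 + 36 + 9 + 1}{512}$
$= 1 - \frac{130}{512} = \frac{382}{512} = \frac{191}{256}$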
Example 9
In a manufacturing unit inspection, from a lot of 20 baskets which include 6 defectives, a sample
of 2 baskets is drawn at random with replacement. Prepare the binomial distribution of the number
of defective baskets. Also find E(X) and Var(X) for the random variable X
Solution: Here, X denotes the number of defective baskets in a draw of 2 baskets with replacement.
Clearly, the trials are Bernoulli trials
And X = 0, 1 and 2
Also number of trials = n = 2
If drawing a defective basket is considered a success,
then p = 6/20 = 3/10 and q = 1 − p = 7/10
X             0                       1                       2
Event         No defective baskets    One defective basket    Two defective baskets
P(ri) = pi    ${}^{2}C_{0}(\frac{3}{10})^{0}(\frac{7}{10})^{2}$   ${}^{2}C_{1}(\frac{3}{10})^{1}(\frac{7}{10})^{1}$   ${}^{2}C_{2}(\frac{3}{10})^{2}(\frac{7}{10})^{0}$
              = 49/100                = 42/100                = 9/100
$E(X) = np = 2 \times \frac{3}{10} = 0.6$ and $\mathrm{Var}(X) = npq = 2 \times \frac{3}{10} \times \frac{7}{10} = 0.42$

Example 10
The probability that Rohit will hit a shooting target is 2/3. While preparing
for an international shooting competition, Rohit aims to achieve a
probability of at least 0.99 of hitting the target at least once. What is the
minimum number of shots he must take to attain this probability?
Solution: Let the number of shots Rohit takes at the target be n.
Here, the trials are Bernoulli trials, with p = 2/3 the probability of success (hitting
the target) and q = 1 − p = 1/3 the probability of failure (missing the target).
Then $P(r \text{ successes}) = {}^{n}C_{r} \left(\frac{2}{3}\right)^{r} \left(\frac{1}{3}\right)^{n-r}$
As Rohit wants to hit the target at least once with a probability of at least 0.99,
$P(r = 1, 2, 3, \ldots) \ge 0.99$
$1 - P(r = 0) \ge 0.99$
$1 - {}^{n}C_{0} \left(\frac{2}{3}\right)^{0} \left(\frac{1}{3}\right)^{n} \ge 0.99$
$1 - \left(\frac{1}{3}\right)^{n} \ge 0.99$
$\left(\frac{1}{3}\right)^{n} \le 0.01$
$3^{n} \ge 100$
As $3^4 = 81 < 100$ and $3^5 = 243 \ge 100$, we get $n \ge 5$
Rohit must shoot at the target at least 5 times to achieve his aim.
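The smallest n in Example 10 can also be found by direct search. This is a sketch under the assumption that the hit probability is p = 2/3, the value consistent with the step comparing 3^5 with 100 in the solution; the function name `min_trials` is chosen here for illustration.

```python
from fractions import Fraction

def min_trials(p, target):
    """Smallest n with P(at least one success in n trials) >= target."""
    q = 1 - p
    n = 1
    # P(at least one success) = 1 - P(no success) = 1 - q^n
    while 1 - q**n < target:
        n += 1
    return n

# Assumed hit probability 2/3 and required probability 0.99.
shots = min_trials(Fraction(2, 3), Fraction(99, 100))
print(shots)  # 5
```

With 4 shots the probability is 1 − 1/81 = 80/81, which is just below 0.99, so 5 is indeed the minimum.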
Example 11
Sonal and Anannya are playing a game by throwing a die alternately till one of them gets a ‘1’
and wins the game. Find their respective probabilities of winning, if Sonal starts first.
Solution: Clearly the trials are Bernoulli trials, and the number of throws is not fixed in advance.
Getting a 1 on a single throw of the die is considered a success
p = 1/6 and q = 1 − p = 5/6
Sonal starts the game by throwing the die first

P(Sonal wins in the first throw) = 1/6
When will Sonal get a chance to win next?

Sonal will get to try winning in the third throw, when Sonal fails in the first throw and then Anannya
fails to win in the second throw
$P(\text{Sonal wins in the third throw}) = \frac{5}{6} \times \frac{5}{6} \times \frac{1}{6} = \left(\frac{5}{6}\right)^{2} \frac{1}{6}$
The next time Sonal can win is the fifth throw
$P(\text{Sonal wins in the fifth throw}) = \left(\frac{5}{6}\right)^{4} \frac{1}{6}$ and so on

Hence, P(Sonal will win) = P(in first throw) + P(in third throw) + P(in fifth throw) + …
It is an infinite geometric series with $a = \frac{1}{6}$, $r = \left(\frac{5}{6}\right)^{2} = \frac{25}{36}$
As $S_{\infty} = \frac{a}{1 - r}$, $P(\text{Sonal wins}) = \frac{1/6}{1 - 25/36} = \frac{1}{6} \times \frac{36}{11} = \frac{6}{11}$
And, P(Anannya wins) = 1 − P(Sonal wins) = 1 − 6/11 = 5/11
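The infinite geometric series in Example 11 can be checked numerically. The sketch below compares the closed form a/(1 − r) with a partial sum of the first 50 terms; the cut-off of 50 terms is an arbitrary choice made for this illustration.

```python
from fractions import Fraction

p = Fraction(1, 6)  # probability of getting a 1 on one throw
q = 1 - p

# Sonal wins on throws 1, 3, 5, ...: first term a, common ratio r = q^2.
a, r = p, q * q
sonal = a / (1 - r)   # sum of the infinite geometric series
anannya = 1 - sonal

# Partial sum of the first 50 terms, to see the series settle down.
partial = sum(a * r**k for k in range(50))

print(sonal, anannya)
print(float(partial), float(sonal))
```

Sonal's chance, 6/11, is larger than Anannya's 5/11 purely because she throws first.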
Example 12
A die is thrown again and again until three 5s are obtained. Find the probability of obtaining the
third 5 in the seventh throw of the die.
Solution: Clearly the first six throws are Bernoulli trials with n = 6
P(a 5 on a single throw of the die) = p = 1/6 and q = 1 − p = 5/6
For the third 5 to occur in the seventh throw of the die, there must
have been exactly two 5s in the previous six throws
P(third 5 on seventh throw of die) = P(two 5s in six throws) × P(a 5 on the next single throw
of die)
$= {}^{6}C_{2} \left(\frac{1}{6}\right)^{2} \left(\frac{5}{6}\right)^{4} \times \frac{1}{6} = \frac{15 \times 625}{6^{7}} = \frac{9375}{279936} \approx 0.033$
4.6 POISSON DISTRIBUTION

Let us consider the car sales of a car dealer's showroom in a city, on a given day.
Do you think that the number of car sales on a given day is a random variable?
Assume that each car sale is an independent event, meaning that one car sale gives no
information about when the next sale will happen, and that the probability of a car sale in
a given length of time does not change over time.
In other words, the rate at which car sales occur does not change through time.
Therefore, we can conclude that the events defined as car sales in such a case are occurring
randomly and independently.
Based on these conditions, a random variable X, representing the number of events in a given
length of time, has a Poisson distribution.
A discrete probability distribution that expresses the probability of a given number of events
occurring over a fixed period of time or space is called a Poisson distribution if:
1. The events occur with a known constant mean rate
2. The events are independent of the time from the occurrence of the previous event.

3. The rate of occurrence of events is constant and not based on time
4. The probability of an event is proportional to the length of the period of time
The Poisson distribution can also be used for the number of events in other specified intervals
such as distance, area or volume.
DEFINITION: Let X be the discrete random variable which represents the number of occurrences
of events over a period of time.
If X follows the Poisson distribution, then the probability of occurrence of ‘k’ events
over a period of time is given by:
$$P(X = k) = f(k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$$
Where e is Euler’s number (e = 2.71828…)
‘k’ is the number of occurrences of the event such that k = 0, 1, 2, …
And $\lambda = E(X) = \mathrm{Var}(X)$ is a positive real number
With existence conditions:
1. $\sum_{k=0}^{\infty} \frac{\lambda^{k} e^{-\lambda}}{k!} = 1$
2. $f(k) \ge 0$ for k = 0, 1, 2, …
Formula courtesy: https://en.wikipedia.org/wiki/Poisson_distribution
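The Poisson formula and the statement that the mean and variance both equal the parameter can be checked numerically. This is a minimal sketch: the function name `poisson_pmf`, the rate 3.0 and the cut-off at k = 59 are all assumptions made for this illustration (the tail beyond the cut-off is negligible for this rate).

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return lam**k * exp(-lam) / factorial(k)

lam = 3.0  # hypothetical mean number of events per interval

# Truncate the infinite sum at k = 59; the remaining tail is negligible here.
ks = range(60)
total = sum(poisson_pmf(k, lam) for k in ks)
mean = sum(k * poisson_pmf(k, lam) for k in ks)
variance = sum(k * k * poisson_pmf(k, lam) for k in ks) - mean**2

print(round(total, 6), round(mean, 6), round(variance, 6))
```

The three printed numbers correspond to the existence condition (total probability 1), E(X) and Var(X).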
A restaurant is doing booming business. It was recorded that during its peak business time,
an average of 30 customers per hour arrive at the restaurant. Can we develop a Poisson probability
distribution model for the arrivals of customers, if 30 customers arrive in an interval of 1 hour on
average?
You might say that the arrival average is 1 customer every 2 minutes.
But the thing to remember here is that the arrival time of each customer is random, and hence this
approach is inappropriate.
Let us try another approach and divide the hour into 60 one-minute intervals, such that in each
interval a customer is equally likely to arrive or not arrive.
During each minute, let us assume that at most one customer arrives, in the middle of that time interval.
As the probability for a customer to arrive in a given minute is 1/2, this is going to be a binomial
distribution B(60, 1/2).
Thus the process will average np = 60 × 1/2 = 30 arrivals during an hour.
But then again, we cannot assume that the customers are arriving at a uniform pace and at regular
time intervals.
What if we divide the hour into seconds and consider that the probability for a customer to
arrive in a given second is not 1/2 but 1/120? In such a case, the binomial distribution will be
B(3600, 1/120),
and the process will again average np = 3600 × 1/120 = 30 arrivals during an hour.
To summarize this process, we can say that:

a) As $n \to \infty$, the time intervals get smaller and smaller and p gets smaller than before, while E(X) =
np is kept constant at 30 customers per hour
b) In the limit as $n \to \infty$, the number of customers arriving during an hour is Poisson(30)
c) With intervals of width 1/n of an hour and an arrival rate of 30 customers per hour:
i) Probability of an arrival during an interval = p = 30/n
ii) Probability of more than one arrival during an interval is 0
iii) Probability of an arrival during an interval is independent of the previous arrivals
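The passage from the binomial to the Poisson distribution can be observed numerically. The sketch below keeps np = 30 fixed and compares P(X = 30) under B(n, 30/n) with Poisson(30) for n = 60 (minutes), n = 3600 (seconds) and a finer, hypothetical split into n = 216000 intervals; the function names are chosen here for illustration.

```python
from math import comb, exp, factorial

LAM = 30  # mean number of arrivals per hour, as in the restaurant example

def binomial_pmf(k, n, p):
    """P(k successes in n Bernoulli trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return lam**k * exp(-lam) / factorial(k)

k = 30
target = poisson_pmf(k, LAM)

gaps = []
for n in (60, 3600, 216000):
    p = LAM / n  # keeps the mean n * p fixed at 30
    value = binomial_pmf(k, n, p)
    gaps.append(abs(value - target))
    print(f"n = {n:>6}: B(n, 30/n) gives {value:.5f}, Poisson(30) gives {target:.5f}")
```

The gap between the two probabilities shrinks as n grows, which is the sense in which Poisson(30) is the limit of B(n, 30/n).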
One of the famous historical studies shows an application of the Poisson distribution to estimate
the number of Prussian cavalry soldiers killed due to horse-kicks in a year!
Picture credits: https://mindyourdecisions.com/blog/2013/06/21/what-do-deaths-from-horse-kicks-have-to-do-with-statistics/
Example 13
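As the story goes, the Prussian army monitored 10 cavalry corps over a period of 20 years, giving
200 observations (one per corps per year). The number of observations with ‘k’ recorded deaths
due to horse-kicks is shown in the table:
k (deaths in a corps in a year)    0     1    2    3   4   Total
Number of observations             109   65   22   3   1   200
Does the Poisson distribution provide an adequate description of this data?

Solution:
Here $E(X) = \frac{0 \times 109 + 1 \times 65 + 2 \times 22 + 3 \times 3 + 4 \times 1}{200} = \frac{122}{200} = 0.61 = \lambda$

As k = 0, 1, 2, 3, 4
By the Poisson distribution formula: $P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!} = \frac{(0.61)^{k} e^{-0.61}}{k!}$

Then the Poisson model probabilities will be
$P(k = 0) = e^{-0.61} \approx 0.54$
$P(k = 1) = 0.61\, e^{-0.61} \approx 0.33$
$P(k = 2) = \frac{(0.61)^{2} e^{-0.61}}{2!} \approx 0.1$
$P(k = 3) = \frac{(0.61)^{3} e^{-0.61}}{3!} \approx 0.02$
$P(k = 4) = \frac{(0.61)^{4} e^{-0.61}}{4!} \approx 0.003$
Now the Poisson prediction for each k will be 200 × P(k):
k                         0                  1                 2                3                4
Observed number           109                65                22               3                1
Poisson predicted number  200×0.54 = 108     200×0.33 = 66     200×0.1 = 20     200×0.02 = 4     200×0.003 = 0.6 ≈ 1
Yes, the Poisson predictions are adequate for the given data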
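The comparison in Example 13 can be repeated without rounding the probabilities first. This is an illustrative sketch using the observed counts from the table; the variable names are chosen here.

```python
from math import exp, factorial

# Observed number of corps-years with k deaths (from the table).
observed = {0: 109, 1: 65, 2: 22, 3: 3, 4: 1}

total = sum(observed.values())

# Estimate the Poisson parameter by the sample mean.
lam = sum(k * f for k, f in observed.items()) / total

# Expected counts under the Poisson model: total * P(X = k).
expected = {k: total * lam**k * exp(-lam) / factorial(k) for k in observed}

for k in observed:
    print(k, observed[k], round(expected[k], 1))
```

The unrounded predictions (about 108.7, 66.3, 20.2, 4.1 and 0.6) differ slightly from the worked table because the table rounds each probability before multiplying by 200.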
Example 14
A traffic engineer records the number of bicycle riders that use a particular cycle track. He records
that an average of 3.2 bicycle riders use the cycle track every hour. Given that the number of bicycles
that use the cycle track follow a Poisson distribution, what is the probability that:
a) 2 or fewer bicycle riders will use the cycle track within an hour?
b) 3 or more bicycle riders will use the cycle track within an hour?
Also write the mathematical expectation and variance for the random variable X.
Solution:
For this problem, $E(X) = \mathrm{Var}(X) = \lambda = 3.2$
a) The goal is to find $P(X \le 2)$
As $P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$
$P(X = 0) = \frac{(3.2)^{0} e^{-3.2}}{0!} \approx 0.041$
$P(X = 1) = \frac{(3.2)^{1} e^{-3.2}}{1!} \approx 0.13$
$P(X = 2) = \frac{(3.2)^{2} e^{-3.2}}{2!} \approx 0.21$
Therefore, $P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2) \approx 0.041 + 0.13 + 0.21 = 0.381$

b) The goal is to find $P(X \ge 3)$
The probability that there are 3 or more bicycle riders using the track within an hour has no
upper limit on the value of ‘k’, which means that this probability cannot be calculated directly.
But, using the rule of complement, we can say that
$P(X \ge 3) = 1 - P(X \le 2) = 1 - 0.381 = 0.619$
In the given Poisson distribution, $E(X) = \mathrm{Var}(X) = \lambda = 3.2$

For the calculation of powers of Euler’s number: http://eguruchela.com/math/calculator/e-power-x
Example 15
A particular river near a small town floods and overflows twice in every 10 years on average.
Assuming that the Poisson distribution is appropriate, what is the mathematical expectation? Also calculate
the probability of 3 or fewer overflow floods in a 10-year interval.
Solution: As the average number of overflow floods in every 10 years is two,
in the given Poisson distribution, $\lambda = 2$
The goal is to find $P(X \le 3)$
As $P(X = k) = \frac{2^{k} e^{-2}}{k!}$
$P(X = 0) = \frac{2^{0} e^{-2}}{0!} \approx 0.14$
$P(X = 1) = \frac{2^{1} e^{-2}}{1!} \approx 0.27$
$P(X = 2) = \frac{2^{2} e^{-2}}{2!} \approx 0.27$
$P(X = 3) = \frac{2^{3} e^{-2}}{3!} \approx 0.18$
Therefore, $P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)$

$\approx 0.14 + 0.27 + 0.27 + 0.18 = 0.86$
Example 16:
For a Poisson distribution model, the arrival rate of passengers at
an airport is recorded as 30 per hour on a given day. Find:
a) The expected number of arrivals in the first 10 minutes of an hour
b) The probability of exactly 4 arrivals in the first 10 minutes of an hour
c) The probability of 4 or fewer arrivals in the first 10 minutes of an hour
d) The probability of 10 or more arrivals in an hour given that there are 8 arrivals in the
first 10 minutes of that hour

Solution:
a) As 10 minutes = (1/6)th of an hour,
in the given Poisson distribution, X is defined as the number of arrivals in the first 10 minutes,
and (1/6)th of an hour is the width of the time interval.
Here, the arrival rate is 30, as the number of arrivals is 30 per hour
Therefore, $E(X) = \lambda = 30 \times \frac{1}{6} = 5$ for the first 10 minutes of the hour
and $P(X = k) = \frac{5^{k} e^{-5}}{k!}$, where k = 0, 1, 2, 3, …
b) Probability of exactly 4 arrivals in the first 10 minutes of an hour $= P(k = 4) = \frac{5^{4} e^{-5}}{4!} \approx 0.175$
c) The probability of 4 or fewer arrivals in the first 10 minutes of an hour $= P(k \le 4)$
$= P(k = 0) + P(k = 1) + P(k = 2) + P(k = 3) + P(k = 4)$
$= \frac{5^{0} e^{-5}}{0!} + \frac{5^{1} e^{-5}}{1!} + \frac{5^{2} e^{-5}}{2!} + \frac{5^{3} e^{-5}}{3!} + \frac{5^{4} e^{-5}}{4!}$
$\approx 0.007 + 0.03 + 0.08 + 0.14 + 0.18 \approx 0.44$
d) We are given that there have been 8 arrivals in the first 10 minutes (= 1/6 hour),
and we need to find the probability of 10 or more arrivals in the hour.

That means we need to find the probability of at least 10 − 8 = 2 arrivals in the last 60 − 10 = 50 minutes
(= 5/6 hour)
Therefore, $E(X) = \lambda = 30 \times \frac{5}{6} = 25$ for the last 50 minutes of the hour
And, in this case $P(X = k) = \frac{25^{k} e^{-25}}{k!}$, where k = 0, 1, 2, 3, …
Probability of 10 or more arrivals in an hour given that there are 8 arrivals in the first 10
minutes of that hour $= P(k = 2, 3, 4, 5, \ldots)$
$= 1 - P(k = 0, 1)$
$= 1 - \left( e^{-25} + 25\, e^{-25} \right) = 1 - 26\, e^{-25} \approx 1$
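The answers to Example 16 can be checked by scaling the hourly rate to the length of each interval. This is a sketch under the assumptions of the example (30 arrivals per hour, arrivals in disjoint intervals independent of each other); the function and variable names are chosen here for illustration.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return lam**k * exp(-lam) / factorial(k)

rate_per_hour = 30

# a) Expected arrivals in 10 minutes: scale the hourly rate by 10/60.
lam_10 = rate_per_hour * 10 / 60

# b) Exactly 4 arrivals in the first 10 minutes.
p_exactly_4 = poisson_pmf(4, lam_10)

# c) At most 4 arrivals in the first 10 minutes.
p_at_most_4 = sum(poisson_pmf(k, lam_10) for k in range(5))

# d) At least 2 more arrivals in the remaining 50 minutes.
lam_50 = rate_per_hour * 50 / 60
p_at_least_2 = 1 - poisson_pmf(0, lam_50) - poisson_pmf(1, lam_50)

print(lam_10, round(p_exactly_4, 3), round(p_at_most_4, 3))
print(lam_50, p_at_least_2)
```

Part d) comes out extremely close to 1: with 25 arrivals expected in the last 50 minutes, fewer than 2 arrivals is practically impossible.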
4.7 NORMAL DISTRIBUTION

In this unit, the distributions discussed so far are applicable when the random variable
is discrete. A continuous random variable, like height or weight, has infinitely many
values between any two distinct values, so the total probability cannot be distributed among
individual values.
Therefore, a continuous random variable X is described in terms of its probability density function,
also known as PDF.

A continuous random variable X is said to follow a normal distribution with constant
parameters $\mu = E(X)$ and $\sigma^2 = \mathrm{Var}(X)$, written as $X \sim N(\mu, \sigma^2)$, if its
probability density function satisfies
$f(x) \ge 0, \ \forall x \in (-\infty, \infty)$
such that
$$f(x) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^{2}} \qquad \text{(i)}$$
where $\mu \in (-\infty, \infty)$ is the mean of the normal distribution
and $\sigma > 0$ is the standard deviation.
A continuous random variable (refer 4.2.1) whose probability density function is given by (i)
is said to have a normal distribution or Gaussian distribution. A
random variable with a Gaussian distribution is said to be normally distributed, and is called a normal
deviate.
In the normal distribution function given by (i), the curve known as probability curve is bell-
shaped with one peak point as shown below:
Picture courtesy: https://www.lsssimplified.com/normal-distribution-for-lean-six-sigma/
The normal distribution is used in cases where we need to make inferences by taking
random samples and the distribution of the random variable is not known. This type of
distribution is applied to fit the actual observed frequency distribution of many phenomena,
such as weights and heights.
A normal distribution has key features that are easy to spot in graphs:
1. The mean, median and mode of the distribution are exactly the same.
2. The bell-shaped probability curve has one peak point, which means that the normal distribution
has a unique mode.
3. The two tails of the curve extend indefinitely on both sides and never touch the axis. The line
through $x = \mu$ divides the normal curve into two equal parts, which means that the normal curve
is symmetrical about $x = \mu$: half the values fall below the mean and half above the mean.
4. In a normal distribution curve the total area below the curve is always equal to 1 unit; i.e.,
$\int_{-\infty}^{\infty} f(x)\,dx = 1$
5. The distribution can be described by two values: the mean and the standard deviation.
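The features listed above can be checked numerically for the density (i). The sketch below uses a simple midpoint rule for the area; the function names `normal_pdf` and `integrate`, and the choice of a mean of 50 with a standard deviation of 3, are illustrative assumptions rather than anything prescribed by the text.

```python
import math

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density (i): f(x) = exp(-0.5 * ((x - mu) / sigma)**2) / (sigma * sqrt(2*pi))."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def integrate(f, a: float, b: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the area under f between a and b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

mu, sigma = 50.0, 3.0   # illustrative parameters

def density(x: float) -> float:
    return normal_pdf(x, mu, sigma)

# Ten standard deviations on either side stand in for (-infinity, infinity).
total_area = integrate(density, mu - 10 * sigma, mu + 10 * sigma)
left_half = integrate(density, mu - 10 * sigma, mu)

print(f"total area  = {total_area:.6f}")   # feature 4: close to 1
print(f"left half   = {left_half:.6f}")    # feature 3: close to 0.5
print(f"f(mu - 2) == f(mu + 2): {math.isclose(density(mu - 2), density(mu + 2))}")
```

The peak of the curve sits at x = μ, which is why the mean, median and mode coincide.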
Galton board is a real-life example of how normal distribution can show the probability distribution: https://www.mathsisfun.com/data/quincunx.html
Picture credits: https://en.wikipedia.org/wiki/Bean_machine
4.7.1 Standard Normal Distribution

As discussed above, X is a normal variate based on two parameters, namely the mean $\mu$ and
the standard deviation $\sigma$.
But in a real-life situation, there can be one data set with a mean of 50 and a standard deviation
of 3, while another data set has a mean of 100 and a standard deviation of 5. How do
we compare such different normally distributed data sets?
4.7.2 Z-Score of Normal Distribution

When mean $\mu = 0$ and standard deviation $\sigma = 1$ for a data set, the normal distribution
is called the standard normal distribution.

Picture courtesy: https://www.scribbr.com/statistics/normal-distribution/
We make use of data by converting it into a standard normal variate. Every normal variate can be
converted to the standard normal distribution. To do so, we calculate the standard score or
Z-score for each data value of the normal variate, which lets us compare information because
all values are then on the same scale. This distribution is also called a Z-Distribution.
Basically, the Z-score in a standard normal distribution represents how far the data point is
from the mean ($\mu$), measured in units of standard deviation.
How do we find the Z-Score?
Recall that a normal distribution function is given by
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$
In this case $Z = \frac{x - \mu}{\sigma}$ is called the Z-Score.
Example 17:
Calculate the Z-Score for a normal distribution of the lengths of 7 rare species of Indian
butterfly that you have in your garden.
Butterfly          1    2    3    4    5    6    7
Length (in cm)     2    2    3    2    5    1    6
Solution: Here mean $\mu = \frac{2+2+3+2+5+1+6}{7} = \frac{21}{7} = 3$
and $\sigma = \sqrt{\frac{20}{7}} \approx 1.69$ [as $\sigma = \sqrt{\frac{\sum (x_i - \mu)^{2}}{n}}$]
Butterfly                 1       2       3      4       5       6       7
Length x (in cm)          2       2       3      2       5       1       6
x – μ                    –1      –1       0     –1       2      –2       3
Z-Score = (x – μ)/σ    –0.59   –0.59     0    –0.59    1.18   –1.18    1.78

Notice that butterfly number 3 has the Z-Score = 0, it means that this data point is the mean of
the data.
Also note that the Z-score is positive if the data point lies above the mean, and negative if it lies
below the mean.
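The butterfly calculation can be reproduced in a few lines. This is a minimal sketch with illustrative variable names; it uses the population form of the standard deviation (dividing by n), as in the solution. Because the script keeps the unrounded value of σ rather than 1.69, the last Z-score may differ from the hand calculation in the second decimal place.

```python
import math

lengths_cm = [2, 2, 3, 2, 5, 1, 6]   # butterfly lengths from Example 17

n = len(lengths_cm)
mean = sum(lengths_cm) / n
# Population variance: average squared distance from the mean.
variance = sum((x - mean) ** 2 for x in lengths_cm) / n
std_dev = math.sqrt(variance)

# Z-score of each butterfly: distance from the mean in standard-deviation units.
z_scores = [round((x - mean) / std_dev, 2) for x in lengths_cm]

print(f"mean = {mean}, standard deviation = {std_dev:.2f}")
for butterfly, z in enumerate(z_scores, start=1):
    print(f"butterfly {butterfly}: Z = {z}")
```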
Picture courtesy: https://www.simplypsychology.org/normal-distribution.html

As shown in the graph above, if the data values in a normal distribution are converted into
Z-scores in a standard normal distribution, then the percentage of the data that falls within a
specific number of standard deviations ($\sigma$) from the mean ($\mu$) is constant for the bell-shaped curve:
1. Data points are symmetrical about the mean ($\mu$).
2. The Z-score describes the position of each data point in terms of its distance from the mean, when
measured in standard deviation units.
3. The Z-score is positive if the data point lies above the mean, and negative if it lies below the
mean.
4. There is a 68.27% probability of randomly selecting a Z-score between -1 and +1 standard
deviations from the mean:
$\int_{\mu-\sigma}^{\mu+\sigma} f(x)\,dx = 0.6827$, where f(x) is the normal density given by (i).
5. There is a 95.45% probability of randomly selecting a score between -2 and +2 standard deviations
from the mean:
$\int_{\mu-2\sigma}^{\mu+2\sigma} f(x)\,dx = 0.9545$
6. There is a 99.73% probability of randomly selecting a score between -3 and +3 standard deviations
from the mean:
$\int_{\mu-3\sigma}^{\mu+3\sigma} f(x)\,dx = 0.9973$
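The three percentages above can be confirmed without a printed Z-table. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the dictionary name `within` is an illustrative choice.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)   # the standard normal (Z) distribution

# Probability that Z falls within k standard deviations of the mean.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, prob in within.items():
    print(f"P(-{k} < Z < {k}) = {prob * 100:.2f}%")
```

Each value is the area under the bell curve between -k and +k, which is what the integrals in points 4 to 6 express.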

Example 18:
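Given that the mean of a normal variate X is 12 and the standard deviation is 4, find:
a) The Z-Score of the data point 20
b) The data point if its Z-Score is 5
c) The data point if its Z-Score is -2
Solution:
a) As $\mu = 12$, $\sigma = 4$ and $x = 20$,
$Z = \frac{x-\mu}{\sigma} = \frac{20-12}{4} = 2$
b) As $\mu = 12$, $\sigma = 4$ and Z = 5,
$Z = \frac{x-\mu}{\sigma}$ gives $5 = \frac{x-12}{4}$, so $x = 12 + 5 \times 4 = 32$
c) As $\mu = 12$, $\sigma = 4$ and Z = -2,
$Z = \frac{x-\mu}{\sigma}$ gives $-2 = \frac{x-12}{4}$, so $x = 12 - 2 \times 4 = 4$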
4.7.3 Z-Test for Normal Distribution

If a drug company announces one day that it has found a new drug that cures diabetes, you
would want to know how true the claim is. Hypothesis tests can tell you whether it is probably
true or probably not true. Some of the popular hypothesis tests used with probability
distributions are the F-test, chi-square test, t-test and Z-test.
We are going to discuss one of these tests, used for normally distributed data sets. To use the
Z-test, we need to see that:
a) the sample size is greater than 30.
b) data points are independent of each other.
c) data are randomly selected from a population, where each data point is equally likely
to be selected.
d) sample sizes are equal if at all possible.
Let us now see how to use the Z-test on a given normally distributed data set.
Example 19
In a district, the exam scores of 300 students of class XII are recorded at the end of the session.
a) Ramesh scored 800 marks in total out of 1000. The average score for the batch was 700 and
the standard deviation was calculated to be 180. Find out how Ramesh has scored compared
to his batch mates in the whole district.

b) Sudha scored 420 marks in the same batch. What can you say about her performance
compared to the batch of 300 students?
c) How much has Abhay scored if he has done better than 44.83% of his batchmates?
Solution:
a) First, we need to find Ramesh’s Z-Score and use the respective Z-table before we determine
how well he has performed compared to his batch mates.
As $\mu = 700$, $\sigma = 180$ and $x = 800$,
$Z = \frac{x-\mu}{\sigma} = \frac{800-700}{180} \approx 0.56$
Once you have the Z-Score, the next step is choosing between the two Z-Tables. (Refer Appendix
at the end)
In the Z-table, go vertically down the leftmost column to find the first two digits
of your Z-Score (0.5 in this case) and then go along the topmost row to find the
digit in the second decimal position (.06 in this case). The intersection of that row and
column gives the value 0.7123, i.e. the area to the left of the ordinate corresponding to
Z = 0.56. This area also represents the probability of scoring < 800 marks.
Lastly, to get this as a percentage we multiply that number by 100, i.e. 0.7123 x 100 = 71.23%.
Hence, we can say that Ramesh did better than 71.23% of students in the district.
b) In the case of Sudha, $\mu = 700$, $\sigma = 180$ and $x = 420$
$Z = \frac{420-700}{180} \approx -1.56$
Looking at the Z-Table, this maps to 0.0594, and hence we can say that Sudha did
better than 0.0594 x 100 = 5.94% of students in the district.
c) If Abhay has done better than 44.83% of his batchmates, then the area on the Z-Table is 44.83 ÷
100 = 0.4483, which corresponds to Z-Score = -0.13
Here $\mu = 700$, $\sigma = 180$ and Z = -0.13
Therefore $Z = \frac{x-\mu}{\sigma}$ gives $-0.13 = \frac{x-700}{180}$, so $x = 700 - 0.13 \times 180 = 676.6$
which means that Abhay has scored approximately 677 marks out of 1000.
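The table look-ups in this example can be cross-checked in code. This is a sketch, assuming Python 3.8 or later for `statistics.NormalDist`: its `cdf` method plays the role of reading the Z-table, and `inv_cdf` reads it in reverse. Z-scores are rounded to two decimals to mirror the table; the variable names are illustrative.

```python
from statistics import NormalDist

mean_score, sd_score = 700, 180
std_normal = NormalDist()            # standard normal distribution (mean 0, SD 1)

# (a) Ramesh: Z-score for 800 marks and the area to its left
z_ramesh = round((800 - mean_score) / sd_score, 2)
share_below_ramesh = std_normal.cdf(z_ramesh)

# (b) Sudha: Z-score for 420 marks and the area to its left
z_sudha = round((420 - mean_score) / sd_score, 2)
share_below_sudha = std_normal.cdf(z_sudha)

# (c) Abhay: the Z-score whose left-hand area is 0.4483, converted back to marks
z_abhay = round(std_normal.inv_cdf(0.4483), 2)
score_abhay = mean_score + z_abhay * sd_score

print(f"Ramesh: Z = {z_ramesh}, better than {share_below_ramesh * 100:.2f}%")
print(f"Sudha:  Z = {z_sudha}, better than {share_below_sudha * 100:.2f}%")
print(f"Abhay:  Z = {z_abhay}, score = {score_abhay:.1f}")
```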
Example 20:
Given that the scores of a set of candidates on an IQ test are normally distributed. If the IQ test has
a mean of 100 and a standard deviation of 10, what is the probability that a candidate who takes
the test will score between 90 and 110?
Solution: P(90 < X < 110) = P( X < 110 ) - P( X < 90 )
P( 90 < X < 110 ) = P (–1 < Z < 1) = P (Z < 1) – P(Z < –1)
= 0.8413 – 0.1587 = 0.6826

4.8 CHECK YOUR PROGRESS
1. State which of the following are not the probability distributions of a random variable. Give
reasons for your answer
a. X -1 0 1 2
P(X) 0.1 0.8 0.001 0.2
b. X 1 2 3 4 5
P(X) 0.1 0.4 0.05 -0.2 0.2
c. X -2 2 5
P(X) 0.5 0.2 0.3
2. A lady’s bag contains 2 black pens and 1 red pen. One pen is drawn at random and then put
back in the bag after noting its colour. The process is repeated once more. If X denotes the
number of red pens recorded in the two draws, describe X.
3. What is the mean of the numbers obtained on throwing a die having written 1 on three faces,
2 on two faces and 5 on one face?
4. Raheem tossed a fair coin 10 times, find the probability of (i) exactly six heads (ii) at least six
heads (iii) at most six heads.
5. Find the probability of getting 5 exactly twice in 7 throws of a fair die.
6. Let X denote the number of hours a class XII student studies during a randomly selected
school day. The probability that X takes the value $x_i$ is given below, for an unknown constant ‘k’:
$P(X = x_i) = \begin{cases} 0.1, & \text{if } x_i = 0 \\ k\,x_i, & \text{if } x_i = 1 \text{ or } 2 \\ k(5 - x_i), & \text{if } x_i = 3, 4 \end{cases}$
a. Find the value of k.

b. What is the probability that the student studied for at least two hours? Exactly two hours?
At most two hours?
7. How many times must Sumit toss a fair coin so that the probability of getting at least one
head is more than 90%?
8. A pair of dice is thrown and the random variable X represents the sum of the numbers that
appear on the two dice. Calculate the mathematical expectation of X.
9. Find the variance of a bernoulli random variable whose probability of success is 0.6.
10. If the mean and variance of a binomial distribution are 4/3 and 8/9 respectively, find P(X = 1).
11. What is the expected value of number of tails on a throw of a fair coin?
12. A customer care company receives an average of 4.5 calls every 5 minutes. Each customer
executive can handle one of these calls over the 5-minute period. If no executive is
available to take a call, the call is put on hold. Assuming that the calls received by
the customer care company follow a Poisson distribution, what is the minimum number of
customer executives needed on duty so that calls are placed on hold at
most 10% of the time?

13. A statistician records the number of trucks approaching a particular intersection to analyze
the flow of traffic. He observes that on average 1.6 trucks approach the intersection every
minute. Assuming that the number of trucks approaching the intersection follows a Poisson
distribution model, what is the probability that 3 or more trucks will approach the intersection
within a minute?
14. A computer disk manufacturer tests disk quality on a random basis before approving it. The
approval is based on the number of errors in a test area on each disk, which follows a Poisson
distribution with $\lambda = 0.2$. What is the percentage of test areas having two or fewer
errors?
15. In a Poisson distribution, if mean is 2, what is the variance?
16. It is given that 3% of the electric bulbs manufactured by a company are defective. Using the
Poisson distribution, find the probability that a sample of 100 bulbs will contain no defective bulb. (Use $e^{-3} = 0.05$)
17. The mortality rate for a certain disease is 0.007. Using Poisson distribution, calculate the
probability for 2 deaths in a group of 400 people
18. An ice-cream parlour receives a customer at an average rate of 4 per minute. If the number
of customers received by the parlour follows a Poisson distribution, what is the approximate
probability that 16 customers will be coming to the parlour in a particular 4-minute period
on a given day?
19. Using Z-Table, Calculate
a) P (Z < 1.20)
b) P (Z 1.20)
c) P (–0.5 ≤ Z ≤ 1.0)
d) P (–1.0 ≤ Z ≤ 1.0)
20. A company conducted an IQ test for 50 randomly selected employees. Volunteer A scored 74
out of the possible 120 points. If the average IQ test score was recorded as 62 and the
standard deviation was 11, how well did volunteer A perform on the test compared to the
other volunteers?
21. An average ceiling fan manufactured by the Jagdeep Corporation lasts 300 days with a
standard deviation of 50 days. Assuming that the ceiling fan’s life is normally distributed,
what is the probability that a ceiling fan will last at most 365 days?
22. In a survey, the daily travel time (in minutes) of students to reach their school was recorded
as follows:
26, 33, 65, 28, 34, 55, 25, 44, 50, 36, 26, 37, 43, 62, 35, 38, 45, 32, 28, 34
If the mean travelling time is 38.8 minutes and the standard deviation is 11.4 minutes,
convert the travel time of each student into a Z-Score.
23. In an examination, 2000 students appeared and the mean of the normal distribution of marks
is 30 with standard deviation as 6.25. Find out how many students are expected to score.
i. between 20 and 40 marks.
ii. less than 25 marks
24. In a normal distribution, 31% of the articles are under 45 and 8% are over 64. Calculate the
mean and standard deviation of the distribution

4.9 QUICK RECAPITULATION
1. A random variable is a real valued function whose domain is the sample space of a random
experiment, denoted by X
2. More than one random variable can be defined on a given sample space.
3. If the random variable X can take only distinct, countable values, with no values in between
them, then it is called a discrete random variable.
4. A random variable whose value is obtained by measuring, and which can take any of the many
values between two given values, is called a continuous random variable.
5. The table of possible outcomes and their respective probability is called Probability distribution
table for the given random variable X
6. In a probability distribution the sum of all the probabilities is always one
7. In an experiment, for a given random variable X whose possible finite values x1, x2, x3, ….,
xn occur with probabilities p1, p2, p3, …, pn respectively such that
$\sum_{i=1}^{n} p_i = 1$
the mathematical expectation is the weighted average of the possible values of X, given
by $E(X) = \mu = \sum_{i=1}^{n} x_i p_i$
8. Mean expectation value is a parameter and not a statistic.

9. Variance enables us to study the variability of a random variable about its mean (expectation).
10. Let X be a discrete random variable whose possible finite values x1, x2, x3, ..., xn occur with
probabilities p1, p2, p3, …, pn respectively.
Let $\mu = E(X)$ be the mean of X. Then the variance of X, denoted by Var(X) or $\sigma_x^{2}$, is given
by $\sigma_x^{2} = \mathrm{Var}(X) = \sum_{i=1}^{n} x_i^{2} p_i - \left(\sum_{i=1}^{n} x_i p_i\right)^{2}$
In other words, $\mathrm{Var}(X) = E(X^{2}) - [E(X)]^{2}$, where $E(X^{2}) = \sum_{i=1}^{n} x_i^{2} p_i$
11. The standard deviation is denoted by $\sigma_x = \sqrt{\mathrm{Var}(X)}$

12. In a random experiment, a collection of trials is called Bernoulli trials, if:
i. The number of trials is finite.
ii. The trials are independent by nature.
iii. Each trial has exactly two outcomes defined as success and failure.
iv. The probability of success remains the same in each trial.
13. Probability of ‘r’ successes in ‘n’ Bernoulli trials is given by
P(‘r’ successes) = $^{n}C_{r}\, p^{r} q^{n-r}$
where n = number of trials
r = number of successful trials = 0, 1, 2, 3, …, n
p = probability of a success in a trial
q = probability of a failure in a trial
and p + q = 1
14. P(‘r’ successes) is the (r+1)th term in the binomial expansion of $(q + p)^{n}$
15. The binomial distribution with n Bernoulli trials and Probability of success p is also denoted
by B(n, p)
16. In a binomial distribution having ‘n’ Bernoulli trials, where p denotes the probability
of success and q denotes the probability of failure,
i. Mean = np
ii. Variance = npq
iii. Standard Deviation = $\sqrt{npq}$

17. Let X be the discrete random variable which represents the number of occurrences of an event
over a period of time. If X follows the Poisson distribution, then the probability of occurrence of
‘k’ events over a period of time is given by
$P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$
where e is Euler’s number (e = 2.71828…),
‘k’ is the number of occurrences of the event such that k = 0, 1, 2, …
and $\lambda = E(X) = \mathrm{Var}(X)$ is a positive real number
With existence conditions:
i.
ii. for k = 0, 1, 2, …
18. A continuous random variable X is defined in terms of its probability density function,
also known as PDF.
19. A continuous random variable X is said to follow a normal distribution with constant
parameters $\mu = E(X)$ and $\mathrm{Var}(X) = \sigma^{2}$, written as $X \sim N(\mu, \sigma^{2})$,
such that $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}} \qquad \text{(i)}$
where $\mu$ is the mean of the normal distribution and $\sigma$ is the standard deviation.

20. A continuous random variable whose probability density function has the form (i) is said to
have a normal distribution or Gaussian distribution.
21. A random variable with a normal/Gaussian distribution is said to be normally distributed,
and is called a normal deviate. Its key features are:
a. The mean, median and mode of the distribution are exactly the same.
b. The bell-shaped probability curve has one peak point, which means that the normal distribution
has a unique mode.
c. The two tails of the curve extend indefinitely on both sides and never touch the axis. The
line through $x = \mu$ divides the normal curve into two equal parts, which means that the
normal curve is symmetrical about $x = \mu$: half the values fall below the mean and half
above the mean.
d. In a normal distribution curve the total area below the curve is always equal to 1 unit;
i.e., $\int_{-\infty}^{\infty} f(x)\,dx = 1$
e. The distribution can be described by two values: the mean and the standard deviation.
22. When mean $\mu = 0$ and standard deviation $\sigma = 1$ for a data set, the normal distribution
is called the standard normal distribution.
23. In a normal distribution of data, the Z-score is given by $Z = \frac{x-\mu}{\sigma}$
24. The Z-score is positive if the data point lies above the mean, and negative if it lies below
the mean.
25. When the data values in a normal distribution are converted into Z-scores in a standard
normal distribution, the percentage of the data that falls within a specific number of
standard deviations ($\sigma$) from the mean ($\mu$) is constant for the bell-shaped curve.
i. Data points are symmetrical about the mean ($\mu$).
ii. The Z-score describes the position of each data point in terms of its distance from the mean,
when measured in standard deviation units.
iii. The Z-score is positive if the data point lies above the mean, and negative if it lies below
the mean.
iv. There is a 68.27% probability of randomly selecting a Z-score between -1 and +1 standard
deviations from the mean:
$\int_{\mu-\sigma}^{\mu+\sigma} f(x)\,dx = 0.6827$, where f(x) is the normal density given by (i).
v. There is a 95.45% probability of randomly selecting a score between -2 and +2 standard
deviations from the mean:
$\int_{\mu-2\sigma}^{\mu+2\sigma} f(x)\,dx = 0.9545$
vi. There is a 99.73% probability of randomly selecting a score between -3 and +3 standard
deviations from the mean:
$\int_{\mu-3\sigma}^{\mu+3\sigma} f(x)\,dx = 0.9973$

26. Some of the popular hypothesis tests used with probability distributions are the F-test,
chi-square test, t-test and Z-test.
27. To use the Z-test, we need to see that:
i. the sample size is greater than 30.
ii. data points are independent of each other.
iii. data are randomly selected from a population, where each data point is equally
likely to be selected.
iv. sample sizes are equal if at all possible.
v. for convenient calculation with Z-Scores, we use the Z-Table to interpret a normally distributed data set.
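Recap items 7 to 16 (expectation, variance and the binomial formula) can be tried out with exact fractions. The sketch below is illustrative only: the function names are assumptions, the first table is the number of white sweaters in two draws with replacement from the unit's opening example (probabilities 4/9, 4/9 and 1/9 read off its nine-outcome sample space), and the second is ten tosses of a fair coin as in Check Your Progress question 4.

```python
from fractions import Fraction
from math import comb

def expectation(dist):
    """E(X) = sum of x_i * p_i over a {value: probability} table."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var(X) = E(X^2) - [E(X)]^2."""
    mean = expectation(dist)
    return sum(x * x * p for x, p in dist.items()) - mean ** 2

def binomial_pmf(r, n, p):
    """P(r successes) = nCr * p^r * q^(n-r), with q = 1 - p."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

# X = number of white sweaters in two draws with replacement (2 black, 1 white).
white_sweaters = {0: Fraction(4, 9), 1: Fraction(4, 9), 2: Fraction(1, 9)}
mean_white = expectation(white_sweaters)
var_white = variance(white_sweaters)

# Number of heads in ten tosses of a fair coin: B(10, 1/2).
n, p = 10, Fraction(1, 2)
heads = {r: binomial_pmf(r, n, p) for r in range(n + 1)}
p_exactly_six = heads[6]
p_at_least_six = sum(heads[r] for r in range(6, n + 1))

print("white sweaters: mean =", mean_white, ", variance =", var_white)
print("ten tosses: P(6 heads) =", p_exactly_six, ", P(at least 6) =", p_at_least_six)
print("binomial mean =", expectation(heads), ", variance =", variance(heads))
```

For the coin example, the mean and variance computed from the table agree with np and npq from item 16.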
4.10 ANSWER KEY TO CHECK YOUR PROGRESS
1. (a) no, p  1 (b) no, p < 0 (c) yes 2. X = 0,1,2 3. 2 4. 105/512, 193/512, 53/64 5.
6. k = 0.15, 0.75, 0.3, 0.55 7. 4 or more times 8. 7 9. 0.24 10. 32/81 11. 0.5 12. 7 13. 0.217 14. 99.89%
15. 2 16. 0.05 17. 0.235 18. 0.099 19. (a) 0.11507 , (b) 0.88493, (c) 0.5328, (d) 0.6826 20. better than
43 other volunteers 21. 90% 23. (i) 1781, (ii). 424 24. Mean = 50, SD = 10

4.11 APPENDIX
Z- SCORE TABLE

Table Courtesy: https://www.dummies.com/education/math/statistics/how-to-use-the-z-table/


UNIT 5
After the completion of this unit, the students will be able to:

• develop an understanding of population and sample
• understand the concept of parameter and statistical inferences
• understand the idea of hypothesis testing
• use and extend the knowledge of inferential statistics and their applications in
real-life situations
Inferential Statistics 5.1

Concept Map
Introduction
One of the most important applications of statistics is making estimates about an entire population
based on the information from a small sample. This process is known as statistical inference. It
can be achieved only if we are confident that our sample accurately reflects the desired population.
An example is an exit poll, which uses the opinions of a small group of a thousand voters to
predict the outcome of an election in which millions of votes are cast.
This chapter on inferential statistics will show you how to draw conclusions from a sample
and generalize them to a larger population.
5.1 Population and sample

Several real-life problems are statistical in nature. Let’s take some examples:
1. You are a part of a fitness campaign in your school. You are concerned about the overall
wellbeing of fellow students and want to know what proportion of students exercise
regularly.

2. As a quality control expert, you want to know what percentage of the computer chips
produced by the manufacturing unit of your company in a week are good.
In example 1, the population under study is all the students enrolled in the school, as you
want to conduct the study on them. In example 2, the population is all the computer chips
produced by the manufacturing unit in a week, out of which you will see what proportion is good.
Thus a population is the group of all distinct individuals or objects that you want to draw
conclusions about. The number of individuals/objects in a population is called the population size.
In statistics, we commonly use a sample, that is, a small subset of a larger set of data, for making
inferences about the larger set. Here the larger set is the population from which the sample is drawn.
NOTE : • The sample size is always smaller than the population’s total size.
• The population refers to the entire group about which you want to draw conclusions.
Sampling
Sampling is a technique of selecting a small group (subset) of a population to estimate its characteristics
without having to investigate every individual. It involves selecting a group of people, events, behaviours,
or other elements of interest from which to draw conclusions. The results obtained from the sample
group can then be extended to the entire population.
Suppose a vaccine company has manufactured a new vaccine for COVID-19 and would
like to study its adverse effects on the country’s population. It is almost impossible to perform clinical
trials that include everyone. So in this scenario, researchers select a group of people from each demographic,
conduct the tests on them, and estimate the impact on the whole population.
Steps involved in Sampling
There are a number of ways in which the sampling process can be carried out. But in this chapter,
we shall limit ourselves to simple random sampling and systematic random sampling only.

1. Probability Sampling: Randomization (choosing something at random) is used in this sampling
method to ensure that every member of the population has an equal chance of being included
in the selected sample.
2. Non-Probability Sampling: Randomization is not used in non-probability sampling. The
result of this method can be biased, because not all elements of the population have an
equal chance of being included in the sample.
Simple random sampling

As the name suggests, every individual is chosen entirely by chance, and every member of the
population has an equal chance of being included in the sample.
Suppose from a finite population of size N we take a sample of size ‘n’. Then there are
$^{N}C_{n}$ possible samples to choose from. A sampling method in which each of the $^{N}C_{n}$ samples has an
equal chance of being selected is known as random sampling, and the sample obtained by this method
is called a random sample.
For example: From a class of 50 students, 10 students are selected at random so that every student has
an equal opportunity of being selected. On any single draw from the full class, each student is picked with probability 1/50.
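The class-of-50 example can be illustrated as follows. This is a sketch with assumed names: students are represented by roll numbers 1 to 50, `math.comb` counts the possible samples, and `random.Random.sample` draws one of them with every sample equally likely. The seed is fixed only so that the run is repeatable.

```python
import math
import random

class_size, sample_size = 50, 10

# Number of different samples of 10 that can be drawn from 50 students (50C10).
num_possible_samples = math.comb(class_size, sample_size)

# Draw one simple random sample of roll numbers, without repetition.
rng = random.Random(2024)                   # fixed seed for repeatability
roll_numbers = range(1, class_size + 1)
random_sample = rng.sample(roll_numbers, sample_size)

print("possible samples:", num_possible_samples)
print("one random sample:", sorted(random_sample))
```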
Systematic random sampling

Here the members of the sample are chosen at regular intervals from the population. This requires a
starting point and a sample size, which together fix the selection interval that is then repeated. The
items of the population can first be arranged alphabetically, numerically or in any increasing/decreasing
order, and then individuals are chosen at regular intervals.
For example: Suppose all twelve students of your class are listed by roll number. You
randomly select 2 as the starting point; from 2 onwards, every 3rd student is selected (2, 5, 8, 11), and you
end up with a sample of 4 students selected by the systematic method.
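The roll-number example maps directly onto list slicing. In this sketch the names `start` and `step` are illustrative; the start is fixed at 2 to match the example, whereas in practice it would itself be chosen at random.

```python
roll_numbers = list(range(1, 13))   # twelve students, roll numbers 1 to 12

start = 2   # starting roll number, as in the example
step = 3    # every 3rd student after the start

# Slicing takes the element at index start - 1 and then every `step`-th one.
systematic_sample = roll_numbers[start - 1::step]

print(systematic_sample)   # [2, 5, 8, 11]
```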

ACTIVITIES
 What should be the suitable sample to collect data on how people use smart phones these days?
 Use simple random sampling to collect data on dress codes among women. Choose your sample
across age, profession and family background.
Representative and unrepresentative Sample

Representative Sample: A sample that accurately represents, reflects, or matches the
features of your population. It has to be an unbiased reflection of the population. For example, if you
have selected a representative sample of Indian cricket team fans and found that 75% of your
sample are male, then you can infer that about 75% of your target population is also male.
In order to draw correct inferences, you should always try to make your sample representative
of your target population. However, sometimes you might deliberately choose not to study a
representative sample.
A sample needs to fulfil the following conditions to be a representative sample:
(i) The sampling process must have a component of random selection.
(ii) The sample size must be large enough to give us a good picture of the variability of the
population.
Unrepresentative Sample: When the sample statistic does not represent the population parameter,
the sample is called an unrepresentative sample. For example, a very small group of people is usually
not a sound basis for generalizations about a very large group.
This is also known as a biased sample. The bias that results from an unrepresentative sample is
called selection bias.
Suppose that, to obtain a sample of households, a TV rating service dials numbers taken at
random from telephone directories. This gives an unrepresentative sample, as some
households may have unlisted telephone numbers.
Unbiased and biased sampling

Unbiased Sampling:
If every individual or element in the population has an equal chance of being part of the
selected sample, then the sampling process is called unbiased. “Probability sampling is unbiased in
nature.” Some unbiased sampling methods are simple random, stratified and systematic random
sampling.
For example:
i. Every week, the teacher randomly selects one student to review the homework answers
with the rest of the class. This is unbiased because it uses the simple random sampling process.
ii. To find out the students’ favourite sport, every fifth student who enters the school is asked to
name their favourite sport. It is unbiased, as the systematic sampling process is being
followed.
Biased Sampling:
If a sampling process systematically favours certain outcomes over others, it is said to be biased.
Convenience sampling is one type of biased sampling. The following example shows how a
sample can be biased.
For example:
i. Suppose that, in making selections for a competition, a teacher selects those students whose roll
numbers end with the digit 2. This is not simple random sampling, because not every
student has a chance of being chosen.
Types of bias
i. Voluntary response bias: When individuals can choose whether to participate.
ii. Undercoverage: When the sample gives too little representation to some part of the population.
iii. Convenience: When a sample is taken from individuals that are conveniently available.
iv. Response bias: Anything in a survey that influences the responses.
Sampling errors
The difference between a population parameter and a sample statistic is known as a sampling
error. Even a randomly selected sample contains sampling error, because random samples are not
identical to the population in terms of numerical measures like means and standard deviations. It
can be either positive or negative, and the estimated sampling error decreases as the sample size
grows.
Sampling error = $\bar{x} - \mu$
where $\bar{x}$ = sample mean and $\mu$ = population mean
Population Mean ($\mu$) = $\frac{\sum X_i}{N}$, Sample Mean ($\bar{x}$) = $\frac{\sum X_i}{n}$
where N = population size and n = sample size.
Reasons for sampling errors:
i. The population parameter is estimated differently by different samples.
ii. Faulty selection of sample.
iii. Small size of sample.
iv. Sample results have potential variability.
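The behaviour of the sampling error can be seen in a small simulation. Everything in this sketch is hypothetical: the population is 1000 made-up exam marks generated with a fixed seed, and the helper names are illustrative. It computes the sampling error as sample mean minus population mean and compares its typical size for small and large samples.

```python
import random
from statistics import mean

rng = random.Random(7)   # fixed seed so the run is repeatable

# Hypothetical population: marks (0 to 100) of 1000 students.
population = [rng.randint(0, 100) for _ in range(1000)]
mu = mean(population)    # population mean

def sampling_error(sample_size):
    """Sample mean minus population mean for one random sample."""
    sample = rng.sample(population, sample_size)
    return mean(sample) - mu

def average_abs_error(sample_size, trials=200):
    """Typical size of the sampling error over many repeated samples."""
    return mean(abs(sampling_error(sample_size)) for _ in range(trials))

print("one sample of 10, error:", round(sampling_error(10), 2))
print("average |error|, n = 10 :", round(average_abs_error(10), 2))
print("average |error|, n = 500:", round(average_abs_error(500), 2))
```

Individual errors may be positive or negative, and the average size of the error shrinks as the sample grows. A sample that contains the whole population has no sampling error at all.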

5.2 Parameter and Statistics and Statistical Inferences
5.2.1 Parameter and Statistics

| Parameter | Statistic |
|---|---|
| It is a characteristic of a population. | It is a characteristic of a sample. |
| A parameter is a numerical value that is taken from the entire population, such as the population mean. | A statistic is the numerical value taken from a sample and calculated from the sample observations alone, i.e. some subset of the entire population. |
| The value of a parameter is computed from all the population observations. | The value of a statistic is computed from a portion of the population (sample). |
| Generally denoted by Greek letters (mean $\mu$, S.D. $\sigma$, variance $\sigma^{2}$, etc.) | Generally denoted by English letters (mean $\bar{X}$, S.D. $S$, variance $S^{2}$, etc.) |
| Example: Under a study calculating the average income of people of some specific region, the mean income and standard deviation of these incomes are parameters. | Example: Mean and standard deviation of income of 1000 residents from South Delhi. |
| The average height of adults in India is a parameter which is nearly impossible to calculate. | Mean and standard deviation of height of 50 Indian adults. |
Image Source: https://www.cliffsnotes.com/study-guides/statistics/sampling/populations-samples-parameters-and-statistics

Thus, Parameter and statistic both are related yet distinct measures. The first refers to the whole
population, while the second refers to part of the population.
5.2.2 Statistical Significance and Sampling distribution

Statistical Significance
Statistical significance is a measure of the reliability of findings: when a finding
is significant, it simply means we are confident that it is real and that the sample was framed wisely.

To decide whether a data set’s outcome is statistically significant, statistical hypothesis testing is used.
When a statistic has high significance, it is thought to be more reliable.
Sampling distribution
The sampling distribution of a statistic is the distribution of all possible values taken by the
statistic when all possible samples of a fixed size n are taken from the population. It is a theoretical
idea; we do not actually build it.
To put it another way, suppose we repeatedly take samples of the same size from
the population, compute a statistic (mean, S.D.) for each sample, and then draw a histogram of those
values. The distribution that the histogram tends to is called the sampling distribution of that
particular statistic (mean, S.D.).
Central Limit Theorem (CLT)

Central limit theorem (CLT) implies that the distribution of a sample leads to become a normal
distribution (bell curve shaped) as the sample size becomes larger, considering that all the sizes of
samples are identical, whatever be the shape of the population distribution.
A sample size of 30 or more is considered to be sufficient to hold CLT and as the sample size becomes
large the prediction of characteristics of population becomes more accurate.
NOTE : As per CLT, when sample size increases the mean of a sample of data becomes close to mean
of overall population.
The interesting thing about CLT is that as N increases, the sampling distribution of the mean
approaches a normal distribution, regardless of the shape of the parent population.
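The statement above can be checked with a small simulation. The sketch below is only illustrative: the skewed parent population (an exponential distribution with mean 1), the seed and the number of repetitions are all assumptions chosen for the demonstration, not values from the text.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so repeated runs give the same numbers


def draw_population_value():
    # Assumed parent population: exponential with mean 1 and S.D. 1 (strongly skewed).
    return rng.expovariate(1.0)


def sample_means(n, repeats=2000):
    """Means of `repeats` independent samples, each of size n."""
    return [
        statistics.fmean([draw_population_value() for _ in range(n)])
        for _ in range(repeats)
    ]


def skewness(values):
    """Third standardised moment; close to 0 for a symmetric (bell-shaped) distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


means_n2 = sample_means(2)
means_n30 = sample_means(30)

for label, means in (("n = 2", means_n2), ("n = 30", means_n30)):
    print(label,
          "| mean of sample means:", round(statistics.fmean(means), 3),
          "| S.D. of sample means:", round(statistics.stdev(means), 3),
          "| skewness:", round(skewness(means), 3))
```

With the larger sample size the sample means should cluster more tightly around the population mean and their histogram should be far less skewed, which is the behaviour the CLT describes.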
Confidence Interval (CI):

We use a Confidence Interval (CI) to express the precision and uncertainty of a sampling process.
It has three components: a confidence level, a statistic and a margin of error. The margin of
error describes the accuracy of a sampling method, while the confidence level explains its uncertainty.
Consider the case where we are computing an interval estimate of a population parameter with
a 95% confidence level. It means that if we use the same sampling method to pick different samples
and compute a different interval estimate from each, the true population parameter would fall
within the margin of error around the sample statistic 95% of the time.
For example: Assume a news channel conducts pre-election survey and predicts that the candidate
A will get 30% of the vote. According to news channel the survey had margin of error of 5% and
a confidence level of 95%. This means that we are 95% sure that the candidate A will receive
between 25% and 35% of the vote.
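The "95% of the time" reading of a confidence level can be illustrated by simulation. This is a hedged sketch, not the method of the text: the population (normal, mean 50, S.D. 10), the sample size, the seed and the use of the multiplier 1.96 with a known S.D. are all assumptions made for the illustration.

```python
import random
import statistics

rng = random.Random(7)          # fixed seed for a repeatable run
MU, SIGMA, N = 50.0, 10.0, 40   # hypothetical population mean, S.D. and sample size
Z95 = 1.96                      # assumed multiplier for a 95% confidence level
TRIALS = 2000


def interval_from_sample(sample):
    """Sample mean plus/minus the margin of error (population S.D. assumed known)."""
    centre = statistics.fmean(sample)
    margin = Z95 * SIGMA / len(sample) ** 0.5
    return centre - margin, centre + margin


hits = 0
for _ in range(TRIALS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    low, high = interval_from_sample(sample)
    if low <= MU <= high:       # did this interval capture the true mean?
        hits += 1

coverage = hits / TRIALS
print(f"Share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should come out close to 0.95: roughly 95 of every 100 intervals built this way contain the true parameter.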
5.3 t-Test (a test of difference for parametric data)
5.3.1 Hypothesis
In order to make decisions it is useful to make some assumptions about the population. Such
assumptions, which may or may not be true, are known as hypothesis. These are the tentative,
declarative statement about the relationship between two or more variables. There are two types of
statistical hypotheses for each situation: the null hypothesis and the alternative hypothesis. Both
of these hypotheses contain opposite view points.
• Null Hypothesis (H0): The null hypothesis states that there is no difference between a
parameter and a specific value, or that there is no difference between two parameters (i.e. H0 is
a statement of no relationship). It explicitly says that the two groups we are studying are the same.
• Alternative Hypothesis (H1): The alternative hypothesis states the existence of a difference
between a parameter and a specific value, or states that there is a difference between two
parameters. In other words, it says that the two groups we are studying are different.
Symbols used in hypothesis

H0                               H1
equal (=)                        not equal (≠) or greater than (>) or less than (<)
greater than or equal to (≥)     less than (<)
less than or equal to (≤)        more than (>)

Example 1
Suppose we want to examine the claim that, on average, college students take less than five years
to complete their education. The null and alternative hypotheses are:
H0 : µ ≥ 5
H1 : µ < 5
Writing null hypothesis

Case 1: Suppose a cake baked through conventional method has an average life span of µ days
and it is proposed to test a new process of baking cakes. So, we have two populations of cakes (one
by conventional method and other by new process). Here hypothesis can be formed like:
(i) New method is better than conventional method.
(ii) New method is inferior to conventional method.
(iii) There is no difference between the two methods.
Since first two statements display a preferential mentality, they tend to be biased. As a result,
adopting the hypothesis of no difference, i.e. a neutral or null attitude toward the outcome, is the
safest course of action. Thus, if the average life of cakes baked using the new method is µ0, the null
hypothesis is:
H0: µ = µ0
Case 2 : Suppose a departmental store is planning to have its own android application (app)
conditioned that new service will be introduced only if more than 60% of its customers use internet
to shop. So here null hypothesis would be that % of customers using internet is less or equal to 60%
and the alternative hypothesis will be its opposite.
H0: Proportion of customers using internet for shopping ≤ 60%
H1: Proportion of customers using internet for shopping > 60%
If the null hypothesis is rejected, then the alternative hypothesis H1 will be accepted and as a
result e-commerce shopping service will be introduced.
ACTIVITY
Choose the type of hypothesis for each of the following statements and write the hypotheses (H0, H1)
in terms of the appropriate parameter (µ or p).
(i) During the COVID-19 pandemic, the chance of getting infected by the virus is under 25% for
school students.
(ii) Fewer than 7% of students ride a two-wheeler to reach the school on time.
(iii) The average salary package for Delhi University graduates is at least ₹10,00,000 per annum.
Answers:
(i) H0: p ≥ 0.25; H1: p < 0.25
(ii) H0: p = 0.07; H1: p < 0.07
(iii) H0: µ ≥ 10,00,000; H1: µ < 10,00,000

NOTE
 After stating the hypothesis, the researcher designs the study. The researcher selects the correct
statistical test, chooses an appropriate level of significance, and formulates a plan for conducting the
study.
 If we discard the null hypothesis, then we can assume there is enough evidence to support the
alternative hypothesis.
Standard Error of Mean (σM)

When we take a sample from a population, we pick one of many possible samples. Some of them will
have the same mean whereas some will have very different means. Standard error of the mean
(SEM) measures how much dispersion there is likely to be in a sample's mean compared to the
population mean, i.e. it measures the standard deviation of the sampling distribution of the mean.

$$\sigma_M = \frac{\sigma}{\sqrt{N}}$$

σ = Standard deviation of original distribution
N = Sample size
• Small SEM: Having a large number of observations, all of them close to the sample
mean (large N, small SD), gives us confidence that our estimate of the population mean
(i.e., that it equals the sample mean) is relatively accurate.
• Large SEM: Having a small number of observations that vary a lot (small N, large SD)
means the estimate of the population mean is likely to be quite inaccurate.
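The two cases above can be made concrete with a short calculation of the standard error, SD divided by the square root of N. The numbers below are hypothetical and chosen only to contrast a "large N, small SD" sample with a "small N, large SD" one.

```python
import math


def standard_error(sd, n):
    """Standard error of the mean: SD / sqrt(N)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sd / math.sqrt(n)


# Hypothetical illustrations, not data from the text.
small_sem = standard_error(sd=2.0, n=100)  # many observations, little spread
large_sem = standard_error(sd=8.0, n=4)    # few observations, wide spread

print("Large N, small SD -> SEM =", small_sem)
print("Small N, large SD -> SEM =", large_sem)
```

Note that quadrupling the sample size only halves the standard error, because N enters through its square root.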
Degrees of freedom
The number of independent pieces of information on which an approximation is based is known
as the degrees of freedom. You can also think of it as the number of values that are free to vary as
you estimate parameters.
Example 1
Consider a classroom having seating capacity of 30 students. The first 29 students have a choice to
sit but the 30th student can only sit on the one remaining seat. Therefore, the degrees of freedom
is 29.
Example 2
For scheduling three hour-long tasks (read, eat and nap) between the hours of 5 p.m. and 8 p.m.
we have two degrees of freedom as any two tasks can be scheduled at will, but after two of them
have been set in time slots, the time slot for the third is decided by default.
Degrees of freedom are related to the size of the sample: higher degrees of freedom
generally mean a larger sample size.

Note: A higher degree of freedom means more power to reject a false null hypothesis and find a significant
result.
Df = N–1
where: Df = degrees of freedom and
N = sample size
The t-Test (for one sample and two independent groups)

The t test is a statistical test for the mean of a population and is used when the population is
normally or approximately normally distributed with an unknown variance.
The inferential statistic calculated in the t-test is called the t-ratio and denoted by "t". The larger
the t-ratio (in absolute value), the more likely we will reject the null hypothesis because the more
evidence in the data that the two groups differ from each other.
Note: "t" statistic is used to determine whether the null hypothesis should be rejected or not.
Use following procedure for testing the hypotheses by using the t test (traditional method):
NOTE :
 If the population is roughly normally distributed and the population standard deviation is unknown, then
only t test should be used.
 Perform a two-tailed t-test if you only want to see if the two populations are different from one another.
 Perform one-tailed t-test if you wish to know whether one population mean is greater than or less than
the other.
Reading and locating t-value from the t-table;

The t distribution table values are critical values of the t distribution. The column headers are the
t distribution probabilities (alpha), whereas the rows give the degrees of freedom (df). The table can be
used for both one-sided (lower and upper) and two-sided tests using the appropriate value of α.

Source: http://www.math.odu.edu/stat130/t-tables.pdf
For a one-tailed test, find the α level by looking at the top row of the table and finding the appropriate
column. Look down the left-hand column for the degrees of freedom.

Example:
Find the critical t value for α = 0.05 with d.f. = 16 for a right-tailed t test.
Solution: Look for the 0.05 column in the top row and 16 in the left-hand column. The critical value is
found where the row and column meet. It is +1.746.
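A table lookup like this one can be cross-checked numerically. The sketch below is one possible approach using only the Python standard library: it integrates the t density with Simpson's rule and finds the critical value by bisection. The function names are hypothetical helpers written for this illustration, not part of any library.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(alpha, df):
    """Right-tail critical value: the t for which P(T > t) = alpha."""
    lo, hi = 0.0, 100.0
    for _ in range(60):                  # bisection on the upper-tail area
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > alpha:
            lo = mid                     # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2


print("alpha = 0.05, df = 16:", round(t_critical(0.05, 16), 3))
```

For a two-tailed test the same function can be called with alpha divided by 2, and the critical values are then plus and minus the result.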
5.3.2 One sample t- test

The one sample t-test is used to compare a sample mean to a specific value. In this test, we draw
a random sample from the population and then compare the sample mean with the population
mean and make a statistical decision as to whether or not the sample mean is different from the
population.
$$t = \frac{\text{Mean} - \text{Comparison value}}{\text{Standard Error}}$$
$$t = \frac{\bar{x} - \mu_0}{S/\sqrt{n}}$$
µ0 = The test value, x̄ = Sample mean, n = Sample size and S = Sample standard deviation
This t value is compared to the critical t value from the t distribution table with degrees of
freedom df = n – 1 and the chosen confidence level. We reject the null hypothesis if the absolute
value of the calculated t is greater than the critical t value.
Example:
Let us consider the average rainfall in a given area is 8 inches. However, a local meteorologist claims
that rainfall was above average from 2016-2020 and argues that average rainfall during this period
was significantly different from overall average rainfall. The following is the average rainfall for the
observed period of 2016-2020:
Year                 2016   2017   2018   2019   2020
Rainfall (inches)       8      5      7      5      6

Solution: Sample mean ( x ) = 6.2, Sample size (n) = 5 and
Sample standard deviation (S) = 1.30. Here we are comparing a single sample mean (6.2 inches)
to a known population mean (8 inches).
Step 1: Null Hypothesis: The average annual rainfall from 2016-2020 is the same as the overall
average annual rainfall of 8 inches. If any difference is observed it is purely due to the random error.
Alternative Hypothesis: The average annual rainfall from 2016-2020 was not the same as the
overall average annual rainfall of 8 inches, but was significantly higher. The observed difference is
not solely due to random error, but rather indicates a true difference in average annual rainfall.
Step 2: $$t = \frac{6.2 - 8}{1.3/\sqrt{5}} = -3.10$$
Step 3: The critical value t.025 from the t distribution is found using:
df = N–1 = 5–1 = 4
For df = 4, the t-table gives t.025 = 2.776.
Step 4: Since t(4) = –3.10, p < .05; reject the null hypothesis.
The null hypothesis is rejected since the obtained value is more extreme than the critical value
(p = .05).
Hence, we can say that there was less-than-average rainfall during 2016-2020. The observed average
rainfall for this period does not appear to be due to random error alone, but suggests that the
weather pattern for the local area was different during the period studied.
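The arithmetic of this example can be reproduced with a few lines of Python. This is a sketch using only the standard library; the critical value 2.776 is the two-tailed t-table entry for df = 4 at the 0.05 level and is typed in by hand rather than computed.

```python
import math
import statistics

rainfall = [8, 5, 7, 5, 6]   # inches, 2016-2020, from the table in the example
mu0 = 8                      # overall average rainfall being tested against

n = len(rainfall)
mean = statistics.mean(rainfall)
s = statistics.stdev(rainfall)            # sample S.D. (divides by n - 1)
t = (mean - mu0) / (s / math.sqrt(n))     # one-sample t statistic
df = n - 1

critical = 2.776                          # t-table value, two-tailed, df = 4
reject_null = abs(t) > critical

print(f"mean = {mean}, S = {s:.2f}, t = {t:.2f}, df = {df}, reject H0: {reject_null}")
```

The unrounded statistic is about -3.09; the worked solution obtains -3.10 because it rounds S to 1.3 before dividing. The decision is the same either way.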
5.3.3 t- test for two independent groups

This compares two groups (for example, experimental and control groups) and helps us to see whether a
statistically significant difference exists between the two means. Thus, the two-sample t test compares
two group means. For example, we can use the t-test to test the following hypothesis:
"It is expected that boys will have higher Mathematics scores than the girls."
Hypotheses for two independent samples can be expressed in the following way:
H0: µ1 = µ2 ("the two population means are equal")
H1: µ1 ≠ µ2 ("the two population means are not equal")

$$t = \frac{\text{Sample one mean} - \text{Sample two mean}}{\text{Standard error of the difference in means}}$$
There are two forms of the test statistic for this test.
Case 1: When Variances are assumed to be equal
When the two independent samples are assumed to be drawn from populations with identical
population variances (i.e., σ1² = σ2²), the test statistic t is computed as

$$t = \frac{\bar{x}_1 - \bar{x}_2}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \quad \text{where} \quad S_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
s1 = standard deviation of first sample
s2 = standard deviation of second sample
sp = pooled standard deviation (a combined estimate of the overall standard deviation)
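The equal-variance case can be sketched in code as follows. The function name and the two sets of summary statistics are hypothetical, made up purely to show how the pooled standard deviation feeds into the t statistic.

```python
import math


def pooled_t(mean1, mean2, s1, s2, n1, n2):
    """Return (t, sp, df) for two independent samples with variances assumed equal."""
    # Pooled S.D.: a weighted combination of the two sample variances.
    sp = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    standard_error = sp * math.sqrt(1 / n1 + 1 / n2)
    t = (mean1 - mean2) / standard_error
    df = n1 + n2 - 2                      # degrees of freedom of the pooled estimate
    return t, sp, df


# Hypothetical summary statistics (not from the text).
t, sp, df = pooled_t(mean1=20.0, mean2=18.0, s1=3.0, s2=3.0, n1=10, n2=10)
print(f"sp = {sp:.2f}, t = {t:.3f}, df = {df}")
```

When both samples have the same standard deviation, as here, the pooled value simply equals that common value.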
Case 2 : When Variances are assumed to be unequal
x x
t 1 2 Use x1  x2 if x1  x2
s12 s22
 Use x2  x1 if x1  x2
n1 n2
The calculated t value is then compared to the critical t value from the t distribution table with
following formula of degrees of freedom and chosen confidence level:
s12 s22 2
(  )
n1 n2
df 
1 s12 2 1 s2
( )  ( 2 )2
n1  1 n1 n2  1 n2
If the absolute value of the calculated t is greater than the critical t value, then we reject the null hypothesis.
Example
Country A has an average farm size of 191 acres, while Country B has an average farm size of 199
acres. Assume the data were attained from two samples with standard deviations of 38 and 12 acres
and sample sizes of 8 and 10, respectively. Is it possible to infer that the average size of the farms
in the two countries is different at α = 0.05? Assume that the populations are normally distributed.
Solution:
Step 1: Hypothesis H0: µ1 = µ2 and H1: µ1 ≠ µ2 (claim)
Step 2: Find the critical values. The test is two-tailed and α = 0.05. Since the variances are unequal,
the degrees of freedom are taken as the smaller of n1–1 or n2–1. In this case, the degrees of freedom are
8 – 1 = 7. Hence, from the t-table, the critical values are +2.365 and –2.365.
Step 3: $$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{191 - 199}{\sqrt{\dfrac{38^2}{8} + \dfrac{12^2}{10}}} = -0.57$$
Step 4 : Make the decision.
Do not reject the null hypothesis, since - 0.57 > -2.365.
Step 5: Make Conclusion. There is not enough evidence to support the claim that the average
size of the farms is different.
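The farm-size example can be re-computed as a check. The sketch below evaluates the unequal-variance t statistic and, for comparison, both choices of degrees of freedom mentioned in this section: the simpler "smaller of n1 – 1 and n2 – 1" rule used in the solution and the longer formula given under Case 2. The critical value 2.365 is the t-table entry for df = 7 and is entered by hand.

```python
import math

# Summary statistics from the example.
mean1, s1, n1 = 191, 38, 8     # Country A
mean2, s2, n2 = 199, 12, 10    # Country B

v1 = s1 ** 2 / n1              # s1^2 / n1
v2 = s2 ** 2 / n2              # s2^2 / n2

t = (mean1 - mean2) / math.sqrt(v1 + v2)

df_simple = min(n1 - 1, n2 - 1)                                     # rule used in the solution
df_formula = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))  # Case 2 formula

critical = 2.365               # two-tailed, alpha = 0.05, df = 7 (from the t-table)
reject_null = abs(t) > critical

print(f"t = {t:.2f}, df (simple rule) = {df_simple}, df (formula) = {df_formula:.2f}")
print("Reject H0:", reject_null)
```

The formula gives a df slightly above 8, so the simple rule (df = 7) is the more cautious choice; with |t| this small the conclusion does not depend on which one is used.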

5.4 Chapter Summary
 Process of drawing statistical inferences is as follows
 Hypothesis is an educated guess which needs to be tested.

 Sampling distribution: A sampling distribution is a distribution of possible values of a statistic
for a given size sample selected from population.
 Estimation: The process by which one makes inferences about a population, based on the
information obtained from a sample.
 Confidence Interval: It is the amount of uncertainty associated with a sample estimate of a
population parameter.
 Hypothesis testing: It is the procedure used by statisticians to accept or reject statistical
hypotheses.
• Sampling error = x̄ – µ
• Central Limit Theorem (CLT): The sampling distribution of the mean tends to be normal (bell curve
shaped) if n is large, no matter what the shape of the population is.
• Degree of Freedom (Df) = N–1, where N = sample size
• t-test for one sample:
$$t = \frac{\bar{x} - \mu_0}{S/\sqrt{n}}$$
• t-test for two independent groups:
When variances are equal:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
When variances are unequal:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
5.5 Online resources

1. Virtual Laboratories in Probability and Statistics
https://www.randomservices.org/random/
2. Statistics online support
http://sites.utexas.edu/sos/
3. Website for Statistical Computation
http://vassarstats.net/
4. Intro to Inferential Statistics (Free course on Udacity)
https://www.udacity.com/course/intro-to-inferential-statistics--ud201?irclickid=x7HRwbRQBxyLUMFwUx0Mo3QnUkEXcow3a1SlSI0&irgwc=1&utm_source=affiliate&utm_medium=&aff=259799&utm_term=&utm_campaign=_gtc_search_&utm_content=&adid=788805
5. Statistical Resources
https://sixsigmastats.com/
Exercise– 5.1
1. Identify the below statement as biased or Unbiased statement. Justify your answer.
"For a survey about daily mobile uses by students, random selection of twenty students from a
school"
2. (i) Find the critical t value for α = 0.01 with d.f. = 22 for a left-tailed test.
(ii) Find the critical t values for α = 0.10 with d.f. = 18 for a two-tailed t test.
3. Suppose that a 95% confidence interval states that population mean is greater than 100 and less
than 300. How would you interpret this statement?
4. A shoe maker company produces a specific model of shoes having 15 months average lifetime. One
of the employees in their R & D division claims to have developed a product that lasts longer. This
latest product was worn by 30 people and lasted on average for 17 months. The variability of the
original shoe is estimated based on the standard deviation of the new group which is 5.5 months.
Is the designer's claim of a better shoe supported by the findings of the trial? Make your decision
using two tailed testing using a level of significance of p < .05.
5. An electric light bulbs manufacturer claims that the average life of their bulb is 2000 hours. A random
sample of bulbs is tested and the life (x) in hours recorded. The following were the outcomes:
Σx = 127808 and Σ(x – x̄)² = 9694.6
Is there sufficient evidence, at the 1% level, that the manufacturer is over estimating the life span
of light bulbs?
6. A fertilizer company packs the bags labelled 50 kg and claims that the mean mass of bags is 50 kg
with a standard deviation 1kg. An inspector points out doubt on its weight and tests 60 bags. As a
result, he finds that mean mass is 49.6 kg. Is the inspector right in his suspicions?
7. The average heart rate for Indians is 72 beats/minute. To lower their heart rate, a group of 25 people
participated in an aerobics exercise programme. The group was tested after six months to see if the
group had significantly slowed their heart rate. The average heart rate for the group was 69 beats/
minute with a standard deviation of 6.5. Was the aerobics program effective in lowering heart rate?
Answers for Exercise 5.1

1. Unbiased because it is random sampling.
2. (i) -2.508 (ii) +1.734 and -1.734
4. Yes. Null hypothesis accepted.
5. No sufficient evidence to reject Null Hypothesis.
6. Yes. Null hypothesis is accepted.
7. Yes. There was significant effect of the aerobics in lowering heart rate.


Unit 6

At the end of this unit, the student will be able to:
• Understand the concept of an index number
• Appreciate the purpose and usefulness of index numbers in the global economy
• Construct index numbers for simple and weighted data sets
• Become familiar with the limitations of index numbers (Simple and Weighted)
• Classify index numbers and test their adequacy (Unit test and Time reversal test)
• Become familiar with the characteristics and components of a time series
• Acquire an understanding of time series and analyse time series for univariate data sets
• Learn to compute and review trend analysis by the method of moving averages
• Learn to compute and review straight line trend analysis by using the least squares method
Index Numbers and Time-based Data

6.1 Concept Map
[Concept map figure not reproduced; the only surviving label is "Seasonal component".]

6.2 Introduction
Recall when a teacher compares the change in your math test scores from one term to the
next, and says that performance is 25% better or 10% lower than earlier.
We often read in the newspaper how the cost of consumer goods has increased by 30%
in the last decade, or economists announce that the crude oil cost is up by 5.8 points.
As it turns out, these percentages or numbers are index numbers: numbers used in
statistics and markets to describe changes in various fields of the economy.
The index number is a concept applied very widely in the economic sphere. Many of the
prominent economic indicators are presented as index numbers, including the Consumer Prices
Index (CPI), Gross Domestic Product (GDP) and the Retail Sales Index (RSI).
The NSE (National Stock Exchange) and the BSE (Bombay Stock Exchange) make use of the
Nifty and Sensex indexes.
However, index numbers are increasingly being used in the social sphere as well, through
composite measures that quantify complex concepts such as poverty and prosperity in a country.
What exactly is an index number?
Through this unit, we shall discuss the concept of index numbers and the purpose and application
of an index. We shall also learn how to construct an index number.
6.2.1 Index Number

An index number is a measure of change in a group of related variables over two different
situations with respect to time, geographical location or other characteristics. The different situations
may be two different times or places. It is a measure that tracks the movement in the general level
of prices of consumer goods and services.
A collection of index numbers for different years or locations etc. is called an Index series
6.3 Use of Index Numbers

1. Index numbers help to measure changes in the standard of living as well as price fluctuations.
2. Government policies are framed on the basis of index numbers of prices.
3. Index numbers not only help in the study of past and present behaviour, they are also used
for forecasting economic and business activities.
4. Index numbers facilitate comparative study with respect to time and place, especially where
units (weights) are different.

6.4 Construction of Index Number
Let us discuss a few issues that arise in the construction of index numbers. The problems may be
categorized as follows:
1. Stating the purpose of the index number: the purpose should be clearly and unambiguously
stated, as an improperly constructed index may be misleading and incorrect for the analysis
of data.
2. Selection of data: the data sample of the relevant commodity should be sufficiently large
to obtain reliable index numbers.
3. Choice of base period: The base period should be a normal year, meaning that the prices of
that period should not be subject to an erratic boom or depression or to the effect of a
calamity or natural catastrophe. The base period should not be too short or too long, as
the prices for too short a period are highly unreliable.
4. Choice of commodity: while constructing an index number, including the prices of all commodities
is not possible as it would take a lot of time and effort. Hence a suitable sample of commodities
is to be taken that can reflect changes in the overall price. Selection of commodities should be
by judgement sampling and not done randomly; the results are therefore approximate and not
perfect.
All the points listed above are of varying importance and are not interdependent.
There are different methods of construction of index numbers that we shall learn in this module.
Broadly, the calculation of price index can be divided into two subgroups, namely (a) Simple
(unweighted) Aggregate method. (b) Weighted Aggregates method.
6.4.1 Relative Index Number

Example 1
Suppose we want to compare price level of crude oil for the years 2005 through 2020.
Period            Price (thousand ₹ / gallon)
December 2005     2.5
January 2010      3.5
November 2015     2.8
October 2020      2.9
Source courtesy:
https://www.indexmundi.com/commodities/?commodity=crudeoil&months=180&currency=inr
In this case, year 2005 is called the base period and the year for which the index number is
constructed, is called the current period.
To facilitate comparisons with other years, the actual price per gallon of the current period is
converted into a relative index number with respect to the base period. The relative index number is
calculated for prices, quantities, volume of consumption, export etc.

Choice of the base period shall be done keeping in mind that
i) The base period shall be a normal period when the prices should not be subject to boom or
a depression or to the effect of war or natural calamities
ii) Base period shall not be too short or too long
Let p0 and pn denote prices of an article in two situations denoted by 0 and n respectively. A
change in the price of the article can be expressed as pn / p0 (as a relative term)
In this case, the simplest form of price relative shows how the current price per unit is placed
in comparison to a particular base period price per unit.
$$\text{Relative Price Index number in time period } n = I_n = \frac{p_n}{p_0} \times 100$$
In the same way, we can compute quantity index using the formula:
$$\text{Relative Quantity Index number in time period } n = I_n = \frac{Q_n}{Q_0} \times 100$$
For easy comparison, we shall consider the price level for the base period as 100 and the price
level of a particular year in consideration is expressed relative to the base period.
Considering year 2005 as base period -
Table (i)
Period           Price (Rupees/gallon)   Relative Index
December 2005    2,576.49                I2005 = (2.5 / 2.5) × 100 = 100
January 2010     3,541.88                I2010 = (3.5 / 2.5) × 100 = 140
November 2015    2,847.43                I2015 = (2.8 / 2.5) × 100 = 112
October 2020     2,931.66                I2020 = (2.9 / 2.5) × 100 = 116
NOTE: For the sake of comparison, we shall consider the price level for the base period as 100 and the
price level of a particular year in consideration is expressed relative to the base period.
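The index series for the crude oil example can be generated with a short script. This is a minimal sketch using the rounded prices (in thousand rupees per gallon) from Example 1; the variable names are chosen only for this illustration.

```python
# Rounded crude oil prices from Example 1 (thousand rupees per gallon).
prices = {
    "December 2005": 2.5,
    "January 2010": 3.5,
    "November 2015": 2.8,
    "October 2020": 2.9,
}

base_price = prices["December 2005"]      # 2005 is the base period

# Price relative: current price as a percentage of the base-period price.
index_series = {
    period: round(price / base_price * 100)
    for period, price in prices.items()
}

for period, index in index_series.items():
    change = index - 100                  # percentage change from the base period
    print(f"{period}: index = {index} ({change:+d}% relative to 2005)")
```

The base period always gets the index 100, so subtracting 100 from any other index gives the percentage change from the base period directly.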
As we considered the price level of year 2005 as the base period, the relative index of year 2005
will be 100. If we compare the relative index of other years with respect to year 2005, we can say
that the crude oil price in year 2010 was 140 – 100 = 40% above the 2005 base-year price.
An index series is a list of index numbers for two or more periods of time, on the basis of the
same base period.
In this case, the price relative index number for year 2010 is 140
(there is no need to write the % symbol).

Similarly, the price relative index is 116 in year 2020 showing a 16% increase in crude oil price
in year 2020 from the 2005 base period price.
The price relative index numbers, such as above are extremely useful in understanding and
interpreting variations in ever so changing economic and business conditions over time.
Here, a list of index numbers has been calculated by employing the same base year 2005. Therefore,
table (i) above is called an index series.
Example 2
A departmental store paid annually for newspaper and television advertisements in 1990 and 2000
as shown below:
Expenditure                                    1990   2000
Newspaper advertisement (in ten thousand ₹)    1.3    2.9
Television advertisement (in ₹)                1.8    3
Using 1990 as the base year, compute a 2000 price index for newspaper and television advertisement
prices.
Also compare the relative expenditure increase between the two modes of advertisement.
Solution:
$$I_{2000}(\text{Newspaper}) = \frac{2.9}{1.3} \times 100 = 223$$
$$I_{2000}(\text{Television}) = \frac{3}{1.8} \times 100 = 167$$
Clearly, the newspaper advertising cost increased at a greater rate than the television
advertisement cost.
NOTE: An important consideration in the construction of index numbers is the objective of the index
numbers as they are constructed with specific purpose. No single index is ‘all purpose’ index number.
6.4.2 Simple (Unweighted) Aggregative Method

As discussed above, the relative index is useful to measure price changes over a period of time
for individual items. But if we want to construct an index to be based on the price change for a
group of items such as housing, food, medical care cost, stock market, transportation etc., we use an
aggregated index. The purpose of an aggregated index is to measure the collective change in a group
of items.
This method consists of expressing aggregate of prices in any year as a percentage of their
aggregate in base year
A simple aggregative index is constructed as follows:
$$\text{Price index number in time period } n = I_n = \frac{\sum p_n}{\sum p_0} \times 100$$
Where, In represents the unweighted aggregate index,
Σp0 represents the sum of unit prices for the base period 0
And, Σpn represents the sum of unit prices for the current period n
And, the quantity index:
$$\text{Quantity index number in time period } n = I_n = \frac{\sum Q_n}{\sum Q_0} \times 100$$
Example 3
A manufacturer purchases four distinct raw materials, that differ in unit price as given below:
COMMODITY    UNIT PRICE (₹)    UNIT PRICE (₹)
             Year 2000         Year 2008
A            3.20              3.8
B            1.70              2.1
C            148.10            149.50
D            34                45
Calculate an unweighted aggregate price index for year 2008 using year 2000 as the base period.
Solution:
Commodity    Year 2000         Year 2008
A            3.20              3.8
B            1.70              2.1
C            148.10            149.50
D            34                45
Total        Σp0 = 187         Σpn = 200.4
Therefore, the index number of year 2008 on the base year 2000
$$= \frac{\sum p_n}{\sum p_0} \times 100 = \frac{200.4}{187} \times 100 = 107.165 \approx 107.2$$
From the above example, we can conclude that the price index number has increased by
only 7.2% over the period from 2000 to 2008.
But note that the unweighted aggregate approach is heavily influenced by the commodities with
large per-unit prices. The commodities with relatively low unit prices, such as A and
B, are dominated by the high unit price commodities like C and D.
Because the unweighted index is so sensitive to high unit prices, as shown in the above example,
this form of index number is not very accurate or useful. This is the major flaw of using absolute
values and not relatives: high unit prices become concealed weights and tend to give
a biased index number.
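The calculation in Example 3, and the dominance of the high-priced commodities, can be seen in a short script. This is an illustrative sketch using the prices from the example; the function name is a hypothetical helper.

```python
# Unit prices from Example 3: commodity -> (price in 2000, price in 2008).
prices = {
    "A": (3.20, 3.8),
    "B": (1.70, 2.1),
    "C": (148.10, 149.50),
    "D": (34, 45),
}


def simple_aggregate_index(price_pairs):
    """Sum of current prices as a percentage of the sum of base prices."""
    base_total = sum(p0 for p0, _ in price_pairs)
    current_total = sum(pn for _, pn in price_pairs)
    return current_total / base_total * 100


index_all = simple_aggregate_index(prices.values())
print(f"Simple aggregate index, all commodities: {index_all:.1f}")

# Leaving out the cheap items A and B barely moves the index,
# which shows how C and D dominate the unweighted total.
index_c_d = simple_aggregate_index([prices["C"], prices["D"]])
print(f"Simple aggregate index, C and D only:    {index_c_d:.1f}")
```

The two results differ by well under one index point even though A and B each rose by roughly 20%.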

6.4.3 Simple Average of relatives Method
To address the concerns shared for the simple aggregative method, let us construct a simple average
of relatives. In this method, we shall convert the price of each commodity in the table of Example 3
into a percentage of its base period price. To construct the index number, we shall calculate the
average of all such relatives, because the index number calculated from relatives will remain the
same regardless of the units of each commodity.
Example 4

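Commodity    Year 2000    Year 2008    Relative Index
A            3.20         3.8          (3.8 / 3.2) × 100 = 119
B            1.70         2.1          (2.1 / 1.7) × 100 = 124
C            148.10       149.50       (149.5 / 148.1) × 100 = 101
D            34           45           (45 / 34) × 100 = 132
Solution
$$\text{Simple average of relative index} = \frac{1}{N}\sum \left(\frac{p_n}{p_0} \times 100\right) = \frac{119 + 124 + 101 + 132}{4} = 119$$
where N = 4 is the number of commodities.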
Though there is an improvement over previously calculated index number, this method is also
flawed to an extent as it gives equal importance to each commodity’s relative. This amounts to
incorrectness in case of different weights or quantities because the individually calculated relatives
disregard the absolute quantity of each commodity.
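The simple average of relatives for the four commodities can be computed as below. This is a sketch that follows the worked example by rounding each relative to a whole number before averaging; that rounding step mirrors the table above and is not a general requirement of the method.

```python
import statistics

# Unit prices from Example 3: commodity -> (price in 2000, price in 2008).
prices = {
    "A": (3.20, 3.8),
    "B": (1.70, 2.1),
    "C": (148.10, 149.50),
    "D": (34, 45),
}

# Price relative of each commodity, rounded as in the worked table.
relatives = [round(pn / p0 * 100) for p0, pn in prices.values()]

average_of_relatives = statistics.mean(relatives)

print("Relatives:", relatives)
print("Simple average of relatives:", average_of_relatives)
```

Each commodity contributes one relative with equal weight, so the low-priced items A and B now influence the result as much as C and D.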
6.4.4 Weighted Aggregative Method

Due to the limitations of methods discussed above, constructing a weighted aggregate index
number provides a better and more accurate comparison when the data items have variation of
weights.
Since the index number does not depend on the units in which the prices are quoted, we shall
weigh prices by quantities and price relatives by values
In such case, use of a weighted index number allows greater importance to be attached to some
items. Moreover, the Information also includes factors such as quantity sold or quantity consumed
for each item.
In this method appropriate weights are assigned to different commodities to make them
comparable and thus compatible for summation. The advantage of this method of computing index
number is that the allotment of weights enables the commodities of greater importance to have more
impact on index number.
A weighted aggregative index is constructed as follows:
$$\text{Index number in time period } n = I_n = \frac{\sum p_{ni} Q_i}{\sum p_{0i} Q_i} \times 100$$

Where, In represents the weighted aggregative index
Qi is the quantity of usage for commodity i
p0i represents the unit price of commodity i for the base period 0
And, pni represents the unit price of commodity i for the current period n
Here, the quantity is fixed and assumed not to vary over time as prices change.
Example 5:
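With reference to the example above, what will happen to the index number for year 2008 if the
commodities are used in different weights (quantities)?
COMMODITY    Quantity (weights)    UNIT PRICE (₹)    UNIT PRICE (₹)
                                   Year 2000         Year 2008
A            100                   3.20              3.8
B            20                    1.70              2.1
C            15                    148.10            149.50
D            50                    34                45
Σ pni Qi = 100(3.8) + 20(2.1) + 15(149.50) + 50(45) = 4914.5
Σ p0i Qi = 100(3.20) + 20(1.70) + 15(148.10) + 50(34) = 4275.5
$$I_{2008} = \frac{4914.5}{4275.5} \times 100 = 114.9 \approx 115$$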
From this calculation of weighted aggregative index, we can conclude that the cost of raw
material used by the manufacturer has increased by 15% over the period from year 2000 to year
2008. In general, a weighted aggregative index along with the quantity of usage of commodities is
a preferred method to establish a price index for a group of commodities.
Clearly, compared to the simple (unweighted) aggregative index, the weighted index provides
more accurate indication of the price change over a period of time. Taking the quantity of usage of
each commodity into account helps to find a more precise index.
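The weighted aggregative index of Example 5 can be verified with the following sketch, which uses the quantities and prices from the table; the function name is a hypothetical helper written for this illustration.

```python
# Example 5 data: commodity -> (quantity, price in 2000, price in 2008).
data = {
    "A": (100, 3.20, 3.8),
    "B": (20, 1.70, 2.1),
    "C": (15, 148.10, 149.50),
    "D": (50, 34, 45),
}


def weighted_aggregate_index(rows):
    """Quantity-weighted current cost as a percentage of quantity-weighted base cost."""
    base_cost = sum(q * p0 for q, p0, _ in rows)
    current_cost = sum(q * pn for q, _, pn in rows)
    return current_cost / base_cost * 100


index_2008 = weighted_aggregate_index(data.values())
print(f"Weighted aggregative index for 2008 (base 2000): {index_2008:.1f}")
```

The result rounds to 115, higher than the unweighted figure of about 107, because the heavily used commodities A and D had the larger price rises.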
But what if the quantity of usage in current period differs from that of base period?
6.4.5 Laspeyres’ Index

In the special case where the fixed-quantity weights are taken from base period usage, the weighted
aggregative index is known by a new name, the Laspeyres Index.
In 1871, the German economist Laspeyres recommended that the quantities of commodities consumed in
the base year be taken as weights for the purpose of calculating index numbers.
According to him, the weighted aggregative index is constructed as follows:
$$\text{Index number in time period } n = I_n^{La} = \frac{\sum p_{ni} Q_{0i}}{\sum p_{0i} Q_{0i}} \times 100$$

Where, InLa represents the weighted aggregative index using Laspeyres' method
Q0i is the quantity of usage for commodity i in the base period
p0i represents the unit price of commodity i for the base period 0
And, pni represents the unit price of commodity i for the current period n

Hence, in the example above, if the quantities(weights) for the group of commodities are of year
2000, then the calculated Index is based on Laspeyres Index
Since the Laspeyres index uses base period weights, it has a disadvantage of overestimating the
rise in the cost of living (because people may have reduced their consumption of items that have
become proportionately dearer than others)
6.4.6 Paasche Index

In 1874, the German statistician Paasche suggested that the quantities (weights) should be revised
over time. When the quantity weights are taken from current period usage,
the weighted aggregative index is known by another name, the Paasche Index.
In this case, the weighted aggregative index is constructed as follows:
Index number in time period n: $I_n^{Pa} = \dfrac{\sum p_{ni} Q_{ni}}{\sum p_{0i} Q_{ni}} \times 100$
Where, $I_n^{Pa}$ represents the weighted aggregative index by Paasche's method,
$Q_{ni}$ is the quantity of usage for commodity i in the current period n,
$p_{0i}$ represents the unit price of commodity i for the base period 0,
and $p_{ni}$ represents the unit price of commodity i for the current period n.

Why do we need this weighted aggregative Index?
So in the example above, if the quantities (weights) for the group of commodities are of year
2008, then the calculated Index is based on Paasche Index.
Paasche method has the advantage of being based on current need and usage of commodities
though this method may underestimate the rise in the cost of living as the calculations are based on
the current period weights. Also Paasche index construction requires a new set of weights for the
year in consideration, and gathering such data-information can be time-consuming and expensive.
Let us compare and analyze the application of the two stated methods of Index construction
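Example 6
The following table shows the data on energy consumption and expenditure at Badarpur Thermal
Power Station, in Delhi region. Construct an aggregative price index for the energy expenditures in
year 2015 using
i) Laspeyres' index
ii) Paasche index.

Sector        Quantity (weights)        Unit Price (₹/kWh)
              Year 1987   Year 2015     Year 1987   Year 2015
Commercial    5416        6015          1.97        10.92
Residential   15293       20262         2.32        6.16
Industrial    21287       17832         0.79        5.13
Agriculture   9473        8804          2.25        8.10
(Laspeyres Index) $I_{2015}^{La} = \dfrac{(10.92)(5416) + (6.16)(15293) + (5.13)(21287) + (8.10)(9473)}{(1.97)(5416) + (2.32)(15293) + (0.79)(21287) + (2.25)(9473)} \times 100 = \dfrac{339281.21}{84280.26} \times 100$
= 403
(Paasche Index) $I_{2015}^{Pa} = \dfrac{(10.92)(6015) + (6.16)(20262) + (5.13)(17832) + (8.10)(8804)}{(1.97)(6015) + (2.32)(20262) + (0.79)(17832) + (2.25)(8804)} \times 100 = \dfrac{353288.28}{92753.67} \times 100$
= 381
NOTE: The Paasche value being less than the Laspeyres value indicates that usage has increased faster in the lower
priced sectors.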
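The two indexes of Example 6 differ only in which year's quantities are used as weights. The short Python sketch below makes this visible; the function names `laspeyres_index` and `paasche_index` and the variable names are assumptions chosen for illustration.

```python
def laspeyres_index(p0, pn, q0):
    """Price index with base-period quantities q0 as weights."""
    return sum(p * q for p, q in zip(pn, q0)) / sum(p * q for p, q in zip(p0, q0)) * 100


def paasche_index(p0, pn, qn):
    """Price index with current-period quantities qn as weights."""
    return sum(p * q for p, q in zip(pn, qn)) / sum(p * q for p, q in zip(p0, qn)) * 100


# Data of Example 6: Commercial, Residential, Industrial, Agriculture
q_1987 = [5416, 15293, 21287, 9473]
q_2015 = [6015, 20262, 17832, 8804]
p_1987 = [1.97, 2.32, 0.79, 2.25]
p_2015 = [10.92, 6.16, 5.13, 8.10]

laspeyres = laspeyres_index(p_1987, p_2015, q_1987)
paasche = paasche_index(p_1987, p_2015, q_2015)
print(round(laspeyres), round(paasche))  # 403 381
```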
6.4.7 Fisher’s Ideal method

This index is the geometric mean* of the Laspeyres and Paasche indexes.
Index number in time period n: $I_n^{F} = \sqrt{\dfrac{\sum p_1 Q_0}{\sum p_0 Q_0} \times \dfrac{\sum p_1 Q_1}{\sum p_0 Q_1}} \times 100$
* When a, b and c are in a geometric progression, then b is called the geometric mean, as $b^2 = ac$.
6.4.8 Marshall-Edgeworth’s Method
The statistician duo, Marshall and Edgeworth, proposed that the index
number be calculated by taking the average of the base year and
the current year quantities as weights.
Index number in time period n: $I_n^{ME} = \dfrac{\sum p_n (Q_0 + Q_n)}{\sum p_0 (Q_0 + Q_n)} \times 100$
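As an illustration, Fisher's Ideal index and the Marshall-Edgeworth index can be evaluated for the data of Example 6. This is a sketch under assumptions: the helper `total` and the function names are hypothetical, and the values it produces are not worked out in the text.

```python
from math import sqrt


def total(prices, quantities):
    """Sum of price x quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))


def fisher_index(p0, pn, q0, qn):
    """Geometric mean of the Laspeyres and Paasche price ratios, times 100."""
    laspeyres_ratio = total(pn, q0) / total(p0, q0)
    paasche_ratio = total(pn, qn) / total(p0, qn)
    return sqrt(laspeyres_ratio * paasche_ratio) * 100


def marshall_edgeworth_index(p0, pn, q0, qn):
    """Price index weighted by the combined quantities Q0 + Qn."""
    combined = [a + b for a, b in zip(q0, qn)]
    return total(pn, combined) / total(p0, combined) * 100


# Data of Example 6
q_1987 = [5416, 15293, 21287, 9473]
q_2015 = [6015, 20262, 17832, 8804]
p_1987 = [1.97, 2.32, 0.79, 2.25]
p_2015 = [10.92, 6.16, 5.13, 8.10]

fisher = fisher_index(p_1987, p_2015, q_1987, q_2015)
marshall_edgeworth = marshall_edgeworth_index(p_1987, p_2015, q_1987, q_2015)
print(round(fisher, 1), round(marshall_edgeworth, 1))
```

Both values fall between the Paasche and Laspeyres figures for the same data, which is expected because each method blends base-year and current-year weights.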
6.4.9 Weighted Average Of Relatives

This method makes use of price relatives. When the base and current prices of a number of
commodities with varying weights or quantities are given, then this method to construct index
number is recommended.

Index number in time period n: $I_n = \dfrac{\sum \frac{p_n}{p_0}\,(p_0 Q_0)}{\sum p_0 Q_0} \times 100$
Example 7
Calculate the price index using the weighted average of relatives method for the food consumption in a
student hostel in a month. Use data of year 1997 as base year for calculations.
COMMODITY   WEIGHT        PRICE PER UNIT (1997)   PRICE PER UNIT (2001)
Rice        14 quintals   90                      120
Wheat       20 kg         30                      46
Pulses      35 kg         22                      34
Milk        15 litre      50                      90
Solution
Commodity   Quantity Q0   Price 1997 p0   Price 2001 pn   Price relative pn/p0   Value weight p0Q0 (base period 1997)   Weighted price relative (pn/p0)(p0Q0) = pnQ0
Rice        14            90              120             120/90                 14 × 90 = 1260                         (120/90) × 14 × 90 = 1680
Wheat       20            30              46              46/30                  20 × 30 = 600                          (46/30) × 20 × 30 = 920
Pulses      35            22              34              34/22                  35 × 22 = 770                          (34/22) × 35 × 22 = 1190
Milk        15            50              90              90/50                  15 × 50 = 750                          (90/50) × 15 × 50 = 1350
Total                                                                            Σ p0Q0 = 3380                          Σ (pn/p0)(p0Q0) = 5140
Weighted price relative for year 2001 on the base period 1997
$= \dfrac{\sum \frac{p_n}{p_0}\,(p_0 Q_0)}{\sum p_0 Q_0} \times 100 = \dfrac{5140}{3380} \times 100$
= 152.07
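The table of Example 7 can be reproduced with a short Python sketch. The function name `weighted_average_of_relatives` is an assumption for this illustration; each price relative pn/p0 is weighted by the base-period value p0Q0, exactly as in the columns above.

```python
def weighted_average_of_relatives(p0, pn, q0):
    """Weight each price relative pn/p0 by the base-period value p0*Q0."""
    value_weights = [p * q for p, q in zip(p0, q0)]
    relatives = [new / old for old, new in zip(p0, pn)]
    weighted_sum = sum(r * w for r, w in zip(relatives, value_weights))
    return weighted_sum / sum(value_weights) * 100


# Data of Example 7: Rice, Wheat, Pulses, Milk
quantity_1997 = [14, 20, 35, 15]
price_1997 = [90, 30, 22, 50]
price_2001 = [120, 46, 34, 90]

hostel_index = weighted_average_of_relatives(price_1997, price_2001, quantity_1997)
print(round(hostel_index, 2))  # 152.07
```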

6.5 Types of Index Numbers
We have learnt how to compute the index number for a single item or a group of items. Now
let us consider some price indexes that are important measure of business and economy.
1. Value Index compares the average value for a particular period with the
average value for the base period. It is used to keep stock of inventory, sales and trading etc.
Picture credit: https://www.insightsonindia.com/2020/12/17/human-development-index-2/
2. Quantity Index is the measure of change in the quantity of goods (produced/ consumed/
sold) within a stipulated period of time. An example of quantity index is the Index of
Industrial Production, known as IIP
3. Price Index is the measure of relative price change over a period of time. An example of price
index is the Consumer Price Index, known as CPI
Picture credit: https://www.investopedia.com/terms/c/consumerpriceindex.asp

6.6 Limitations of an Index Number
• Index numbers are computed from samples, so there is scope for error. The samples themselves
are put together after analysis and deliberation, which adds further scope for error
• An index may be calculated from items that are no longer in trend, which in turn gives an
inaccurate analysis
• Multiple methods are used to formulate index numbers. Different methods may give
different values for the same data, which may cause confusion
• The selection of representative commodities may be skewed, as it is based on the samples collected
or considered.
6.7 Index Series

Refer to example 1 where index numbers of two or more periods of time are constructed, on the
basis of same base period.
Such a list of indexes is called Index series.
Many businesses and economies make use of various types of index series, such as company sales,
industry sales, and inventories, measured in dollar amounts.
The purpose of such series is often to indicate increased usage (physical, for example volume)
associated with the activities.
6.8 Test of Adequacy of Index Numbers

As discussed in 6.3, there are many methods to construct an index number. The important thing
is to choose the method that is appropriate for the data being analysed.
It is therefore essential to test the consistency of an index number. The following tests are available
for checking the adequacy of an index number:
1. Unit test
2. Time reversal test
3. Factor reversal test
4. Circular test
These tests check whether a method of constructing index numbers behaves consistently. Let us learn how to verify
the adequacy of indexes using the unit test and the time-reversal test.
6.8.1 Unit Test

According to this test, the index number given by a method of construction should be
independent of the units in which the prices or quantities of commodities are quoted. For example,
the quantity of wheat may be given in kilograms while the quantity of milk is given in litres.
This test is satisfied by all the methods discussed above except the simple
aggregative method.
6.8.2 Time-reversal Test

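The time-reversal test is used to check whether the method of constructing an index number works
consistently in both directions of time. This test says that the method used should give the same ratio
between one point in time and another, no matter which of the two is taken as the base period.

Basically, if the time subscripts (0 and 1) of a price or quantity index number are interchanged,
then the resulting price/quantity relative should be the reciprocal of the original price/quantity relative
– i.e. if $p_0$ represents the price of wheat in year 2013 and $p_1$ represents the price in year 2018;
then $P_{01} \times P_{10} = 1$
Here, $P_{01}$ is the index for current year '1' on the basis of base year '0'
And, $P_{10}$ is the index for year '0' based on year '1'
(both taken as ratios, that is, without the factor 100).
Clearly, Laspeyres' method and Paasche's method do not satisfy this test of adequacy.
Laspeyres' method of index number fails the test because, in general,
$P_{01}^{La} \times P_{10}^{La} = \dfrac{\sum p_1 Q_0}{\sum p_0 Q_0} \times \dfrac{\sum p_0 Q_1}{\sum p_1 Q_1} \neq 1$
Also, the Paasche method of index number fails the test as, in general,
$P_{01}^{Pa} \times P_{10}^{Pa} = \dfrac{\sum p_1 Q_1}{\sum p_0 Q_1} \times \dfrac{\sum p_0 Q_0}{\sum p_1 Q_0} \neq 1$
Whereas, the Fisher's Ideal index number satisfies the time-reversal test, since
$P_{01}^{F} \times P_{10}^{F} = \sqrt{\dfrac{\sum p_1 Q_0}{\sum p_0 Q_0} \times \dfrac{\sum p_1 Q_1}{\sum p_0 Q_1}} \times \sqrt{\dfrac{\sum p_0 Q_1}{\sum p_1 Q_1} \times \dfrac{\sum p_0 Q_0}{\sum p_1 Q_0}} = 1$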
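Example 8
Calculate Fisher's price index number for the given data and verify that it satisfies the time-reversal test.
Commodity   Price 2008 (p0)   Price 2012 (p1)   Quantity 2008 (Q0)   Quantity 2012 (Q1)
Rice        10                13                4                    6
Wheat       15                18                7                    8
Rent        25                29                5                    9
Fuel        11                14                8                    10
Solution–
Commodity   Price 2008 (p0)   Price 2012 (p1)   Quantity 2008 (Q0)   Quantity 2012 (Q1)   p0Q0   p0Q1   p1Q0   p1Q1
Rice        10                13                4                    6                    40     60     52     78
Wheat       15                18                7                    8                    105    120    126    144
Rent        25                29                5                    9                    125    225    145    261
Fuel        11                14                8                    10                   88     110    112    140
Total                                                                                     358    515    435    623
Fisher's price index number $= \sqrt{\dfrac{\sum p_1 Q_0}{\sum p_0 Q_0} \times \dfrac{\sum p_1 Q_1}{\sum p_0 Q_1}} \times 100 = \sqrt{\dfrac{435}{358} \times \dfrac{623}{515}} \times 100 \approx 121.24$

Here, $P_{01} \times P_{10}$:
$= \sqrt{\dfrac{435}{358} \times \dfrac{623}{515}} \times \sqrt{\dfrac{515}{623} \times \dfrac{358}{435}}$ (as $P_{10} = \sqrt{\dfrac{\sum p_0 Q_1}{\sum p_1 Q_1} \times \dfrac{\sum p_0 Q_0}{\sum p_1 Q_0}}$)
$= \sqrt{1} = 1$
According to the time-reversal test, the adequacy of Fisher's Ideal index number is verified because $P_{01} \times P_{10} = 1$.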
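The time-reversal check of Example 8 can also be carried out numerically. In the sketch below the helper `total` and the ratio functions are hypothetical names introduced for illustration; the indexes are kept as ratios (without the factor 100), and reversing time simply swaps the roles of the two years.

```python
from math import isclose, sqrt


def total(prices, quantities):
    """Sum of price x quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))


def laspeyres_ratio(p_base, p_cur, q_base):
    return total(p_cur, q_base) / total(p_base, q_base)


def paasche_ratio(p_base, p_cur, q_cur):
    return total(p_cur, q_cur) / total(p_base, q_cur)


def fisher_ratio(p_base, p_cur, q_base, q_cur):
    return sqrt(laspeyres_ratio(p_base, p_cur, q_base) * paasche_ratio(p_base, p_cur, q_cur))


# Data of Example 8: Rice, Wheat, Rent, Fuel
p0, p1 = [10, 15, 25, 11], [13, 18, 29, 14]
q0, q1 = [4, 7, 5, 8], [6, 8, 9, 10]

# Forward index (base 2008) multiplied by the reversed index (base 2012)
fisher_forward = fisher_ratio(p0, p1, q0, q1)
fisher_product = fisher_forward * fisher_ratio(p1, p0, q1, q0)
laspeyres_product = laspeyres_ratio(p0, p1, q0) * laspeyres_ratio(p1, p0, q1)

print(round(fisher_forward * 100, 2))   # Fisher's price index
print(isclose(fisher_product, 1.0))     # True: the test is satisfied
print(isclose(laspeyres_product, 1.0))  # False: the test fails
```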
6.9 Time Series

A time series is a set of numerical data points for a given variable, recorded sequentially and arranged in
successive order to track variation. For thorough analysis, these data points are recorded at
successive times or over successive periods, to provide the information being sought for analysis or forecast.
An essential aspect of managing any business or economic model is planning for the future. Time
series analysis is useful in analyzing how a given asset, security, or economic variable changes over
a period of time. It also gives an insight into how changes in the chosen variable
compare with the changes in other variables over the same period of time.
For example, let us analyse a time series of daily opening stock prices for a particular stock over
a period of one year. In such a case, you will collect a list of all opening prices of the stock for each
day in chronological order as one-year, daily opening price time-series for the stock. A time series
data is analyzed using technical analysis tools to see if the stock prices show any pattern or seasonality.
Such information is found useful to determine when the stock goes through peaks and troughs.
Analyzing the stock time series and relating it to other variables like employment rate can provide
valuable information to benefit businesses and economy.
Time series forecasting tools use information based on historic data and associated patterns to
predict the future activity such as trend analysis, fluctuation analysis; though the success in predicting
future patterns is not guaranteed.
Data for such analysis is of three types:
• Time series data: when data of the variable is collected at distinct time intervals, for a specified
period of time.
• Cross-sectional data: when data for one or more variables is collected at the same point in
time.
• Pooled data: when a combination of time series data and cross-sectional data is
collected.
Forecasting methods can be classified as quantitative and qualitative. A quantitative method of
forecasting can be used:
• When past information about the variable is available
• When information and data of the variable can be quantified
• On the assumption that the pattern of the past will continue in the future
• When the variable has a cause-and-effect relationship with one or more other variables

When forecasting is done based on historic data of past values, it is called a time series method.
Qualitative method is generally based on expert judgement and analytical opinion to develop
forecasts. One of the benefits of using these methods is that they can be applied when information on
the data of the variable cannot be quantified or historic information is neither available nor applicable.
6.9.1 Time series analysis

A time series in which data of only one variable is varying over time is called a univariate time
series/data set. For example, data collected from a temperature sensor measuring the temperature
of a place every second will show only a one-dimensional value: temperature.
Figure 6.8.1 (i)
Source: Rural Electrification Corporation Ltd data; Power Ministry press release.
When a time series is a collection of data for multiple variables and shows how they vary over
time, it is called a multivariate time series/data set.
Figure 6.8.1 (ii)

Source credit: https://www.americanexperiment.org/2019/06/global-co2-emissions-skyrocket-india-plans-build-42-shercos-coming-years/

The patterns and behavior of the data in any time series are based on four components:
1. Secular trend component – also known as trend series, is the smooth, regular and long-term
variations of the series, observed over a long period of time. Figure 6.8.1 (iii) shows an
upward trend for annual electricity consumption per household in a certain residential locality
from years 1990 – 2002. In general, trend variations can be either linear or non-linear.
Figure 6.8.1 (iii)
2. Seasonal component – captures the regular, periodic pattern of variability in the data
within one-year periods. The main causes of such fluctuations are usually changes in climate
and seasons, and the customs and habits which people follow at different times.
Figure 6.8.1 (iv) shows seasonal electricity consumption and variations of peak demand in
Nepal and India in year 2018.
Figure 6.8.1 (iv)
Source credits: https://www.researchgate.net/figure/ndia-Nepal-Peak-Demand-seasonal-variation-in-a-year_fig2_337444939
3. Cyclical component – when a time series shows an oscillatory movement where the period of
oscillation is more than a year; one complete period is called a cycle.
4. Irregular component – these kinds of fluctuations are unaccountable, unpredictable or sometimes
caused by unforeseen circumstances like floods, natural calamities, labor strikes etc. Such random
variations in the time series are caused by short-term, unanticipated and nonrecurring factors
that affect the time series.
6.9.2 Trend analysis by fitting linear trend line

Among the four components of the time series discussed above, the secular trend (also
known simply as the trend) depicts the long-term direction of the series. One of the mathematical
techniques most widely used in practice for finding the trend values is the method of least squares. It
plays an important role in finding trend forecasts for future economic and business time
series data.

Trend can be measured using the following methods:
1. Graphical method
2. Semi averages method
3. Moving averages method
4. Method of least squares
We shall be studying two methods to compute trend line from the above list.
6.9.2 (i) Trend analysis by moving average method

This method is used to draw a smooth curve through time series data. It is mostly used for eliminating
the seasonal variations of a given variable. The moving average method helps to establish a trend
line by smoothing out the cyclical, seasonal and random variations present in the time series. The period
of the moving average depends upon the length of the time series data. As shown in figure 6.8.2
(i), the red smooth curve is the trend-cycle, which is noticeably smoother than the original data and
captures the main movement of the time series while ignoring the minor fluctuations. The order of the
moving average determines the smoothness of the trend-cycle estimate.
While using this method:
1. The choice of the length of the moving average is very important
2. An appropriate length plays an important role in smoothing the variations
Figure 6.8.2 (i)

Picture credits: https://otexts.com/fpp2/moving-averages.html
Procedure for calculating Moving average for odd number of years (n = odd)
Let us take an example for n = 3 years moving averages to understand the procedure
1. Add up the values of the first 3 years and place the sum against the median (middle)
year. (This sum is called the 3-year moving total.)
2. Continue this process: leave out the first-year value, add up the next three yearly values and
place the sum against its median year.
3. This process must be continued till all the values of the data are taken for calculation.
4. Calculate the n-year average by dividing each n-yearly moving total by n to get the n-year
moving averages, which are our required trend values.
5. There will be no trend value for the beginning period and the ending period in this method.

Example 9 :
Calculate the 3-year moving averages for the loans (in lakh ₹) issued by co-operative banks for
farmers in different states of India based on the values given below.
Year                      2006    2007   2008    2009   2010   2011   2012    2013    2014
Loan amount (in lakh ₹)   41.85   40.2   38.12   26.5   55.5   23.6   28.36   33.31   41.1
Solution:
Year   Loan amount   3-year moving total   3-year moving average
2006   41.85         -                     -
2007   40.2          120.17                120.17/3 = 40.06
2008   38.12         104.82                34.94
2009   26.5          120.12                40.04
2010   55.5          105.6                 35.2
2011   23.6          107.46                35.82
2012   28.36         85.27                 28.42
2013   33.31         102.77                34.26
2014   41.1          -                     -
The graph shows the observed data in blue, whereas the red curve shows the smooth trend
curve obtained by calculating moving averages of 3 years.
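The odd-period procedure can be sketched in Python as follows. The function name `moving_average` is an assumption for this illustration; years without a trend value are marked with `None`, matching the dashes in the solution table.

```python
def moving_average(values, window):
    """Moving averages for an odd window, placed against the middle period."""
    half = window // 2
    trend = [None] * len(values)  # no trend value at the two ends
    for i in range(half, len(values) - half):
        trend[i] = sum(values[i - half:i + half + 1]) / window
    return trend


# Data of Example 9 (loan amounts for 2006 to 2014)
loans = [41.85, 40.2, 38.12, 26.5, 55.5, 23.6, 28.36, 33.31, 41.1]
trend = moving_average(loans, 3)

for year, value in zip(range(2006, 2015), trend):
    print(year, "-" if value is None else round(value, 2))
```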
Procedure for calculating Moving average for even number of years (n = even)
Let us take an example for n = 4 years moving averages to understand the procedure
1. Add up the values of the first 4 years and place the sum against the middle of the 2nd and
3rd years. (This sum is called the 4-year moving total.)
2. For the next moving total, leave out the first year value, add the next 4 values from the 2nd till
the 5th year and write the sum against its middle position, i.e. in the middle of the 3rd year and
the 4th year.
3. This process will continue till the value of the last observation is taken into account.
4. Now, calculate the average of each 4-year moving total by dividing each moving total by 4.
5. In the next column, calculate the sum of the first two 4-year moving averages and write the
sum against the 3rd year, in the middle (known as the centered total).
6. After this, leave out the first 4-year moving average, add the next two 4-year moving averages
and place the sum against the 4th year.
7. This process of finding centered totals will continue till all pairs of 4-yearly moving averages
of the previous column are summed up and centered.
8. Divide the newly obtained centered totals by 2 to get the moving averages, which are our
required trend values based on 4-year moving averages.
Example 10
Compute the trends by the method of moving averages, assuming that a 4-year cycle is present in the
following series.
Year           1980   1981   1982   1983   1984   1985   1986   1987   1988   1989
Index number   400    470    450    410    432    475    461    500    480    430
Solution: The 4-year moving averages are shown in the last column as centered averages
Year   Index number   4-year moving total   4-year moving average   Centered total   Centered moving average
1980   400            -                     -                       -                -
1981   470            -                     -                       -                -
                      1730                  1730/4 = 432.5
1982   450                                                          873              873/2 = 436.5
                      1762                  1762/4 = 440.5
1983   410                                                          882.25           882.25/2 = 441.13
                      1767                  1767/4 = 441.75
1984   432                                                          886.25           443.13
                      1778                  1778/4 = 444.5
1985   475                                                          911.5            455.75
                      1868                  1868/4 = 467
1986   461                                                          946              473
                      1916                  1916/4 = 479
1987   500                                                          946.75           473.38
                      1871                  1871/4 = 467.75
1988   480            -                     -                       -                -
1989   430            -                     -                       -                -
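For an even period, the extra centering step can be sketched in Python as below. The function name `centered_moving_average` is an assumption for illustration: it first forms the plain moving averages, then averages each adjacent pair so that the result falls against an actual year.

```python
def centered_moving_average(values, window):
    """Centered moving averages for an even window (for example, 4 years)."""
    n = len(values)
    # Plain moving averages; each falls between two periods
    averages = [sum(values[i:i + window]) / window for i in range(n - window + 1)]
    half = window // 2
    trend = [None] * n  # no trend value for the first and last `half` periods
    for i in range(len(averages) - 1):
        trend[i + half] = (averages[i] + averages[i + 1]) / 2
    return trend


# Data of Example 10 (index numbers for 1980 to 1989)
index_numbers = [400, 470, 450, 410, 432, 475, 461, 500, 480, 430]
trend = centered_moving_average(index_numbers, 4)

for year, value in zip(range(1980, 1990), trend):
    print(year, "-" if value is None else round(value, 2))
```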
6.9.2 (ii) Computation of Straight-line trend by using Method of Least squares

The method of least squares is a technique for finding the equation
which best fits a given set of observations. In this technique, the
sum of squares of deviations of the actual values from the computed values
is least, which eliminates personal bias.
Suppose we are given n observations and it is
required to fit a straight line to these data.
Note that n, the number of observations, can be odd or even.
Recall that the general linear equation to represent a straight
line is:
y = a + bx, --------------------- (i)
where y is the actual value, x is time; a and b are real numbers
In order to fit the trend line with the help of the general equation y = a + bx for the given
time series, we try to find estimated values $\hat{y}_i$ that are close to the observed values $y_i$ for
i = 1, 2, …, n.
According to the principle of least squares, the best fitting equation is obtained by minimizing the
sum of squares of differences, which leads us to two conditions:
1. The sum of the deviations of the actual values y from ŷ (the estimated values of y) is zero:
$\sum (y_i - \hat{y}_i) = 0$
2. The sum of squares of the deviations of the actual values y from ŷ (the estimated values of y) is
least: $\sum (y_i - \hat{y}_i)^2$ is least
For the purpose of plotting the best fitted line for trend analysis, the values of constants 'a'
and 'b' are estimated by solving the following two equations:
$\sum Y = na + b \sum X$ --------------------- (ii)
$\sum XY = a \sum X + b \sum X^2$ -------------------- (iii)
Where 'n' = number of years given in the data.
1. Remember that the time unit is usually of successive uniform duration. Therefore, when the
middle time period is taken as the point of origin, the sum of the time variable X reduces to zero.
This means that by taking the mid-point of the time as the origin,
we get $\sum X = 0$
2. When $\sum X = 0$, the equations (ii) and (iii) reduce to:
$\sum Y = na + b(0) \Rightarrow a = \dfrac{\sum Y}{n}$
And, $\sum XY = a(0) + b \sum X^2 \Rightarrow b = \dfrac{\sum XY}{\sum X^2}$
3. By substituting the obtained values of 'a' and 'b' in equation (i), we get the trend line of best fit.
Example 11
Given below are the consumer price index numbers (CPI) of the industrial workers.
Year           2014   2015   2016   2017   2018   2019   2020
Index Number   145    140    150    190    200    220    230
Find the best fitted trend line by the method of least squares and tabulate the trend values.
Solution
Note that the number of years is odd, i.e. n = odd
Procedure:
1. Take the middle year value as A, i.e. A = 2017
2. Find X = xi – A
3. Find X² and XY
Year (xi)   Index number (Y)   X = xi – 2017   X²   XY     Trend value Yt = a + bX
2014        145                -3              9    -435   182.1 + (-3) × 16.6 = 132.3
2015        140                -2              4    -280   182.1 + (-2) × 16.6 = 148.9
2016        150                -1              1    -150   182.1 + (-1) × 16.6 = 165.5
2017        190                0               0    0      182.1 + (0) × 16.6 = 182.1
2018        200                1               1    200    182.1 + (1) × 16.6 = 198.7
2019        220                2               4    440    182.1 + (2) × 16.6 = 215.3
2020        230                3               9    690    182.1 + (3) × 16.6 = 231.9
n = 7       ΣY = 1275          ΣX = 0          ΣX² = 28   ΣXY = 465   ΣYt = 1274.7
$a = \dfrac{\sum Y}{n} = \dfrac{1275}{7} = 182.14$
and $b = \dfrac{\sum XY}{\sum X^2} = \dfrac{465}{28} = 16.6$
Therefore, the required equation of the straight-line trend is given by
y = a + bx ⟹ y = 182.1 + 16.6x
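The least-squares constants for an odd number of years can be computed directly from the data of Example 11. This is a minimal sketch; the function name `fit_trend_odd` is hypothetical, and it relies on the middle year being the origin so that ΣX = 0.

```python
def fit_trend_odd(years, values):
    """Fit Y = a + bX with X measured from the middle year (odd number of years)."""
    origin = years[len(years) // 2]
    x = [year - origin for year in years]  # deviations from the middle year, sum to 0
    a = sum(values) / len(values)
    b = sum(xi * yi for xi, yi in zip(x, values)) / sum(xi * xi for xi in x)
    return a, b, x


# Data of Example 11 (CPI of industrial workers, 2014 to 2020)
years = list(range(2014, 2021))
cpi = [145, 140, 150, 190, 200, 220, 230]

a, b, x = fit_trend_odd(years, cpi)
trend_values = [a + b * xi for xi in x]
print(round(a, 2), round(b, 2))
print([round(t, 1) for t in trend_values])
```

A quick check on any fitted trend line is that the trend values add up to the same total as the observed values, as required by the first least-squares condition.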

Example 12
Based on the data available for the sales of an item in a district, by the method of least squares
(i) tabulate the trend values
(ii) find the best fit for a straight-line trend
(iii) compute expected sale trend for year 2002
Year                1996   1997   1998   1999   2000   2001
Sales (in lakh ₹)   6.5    5.3    4.3    6.1    5.6    7.8
Note that the number of years is even, i.e. n = even
Procedure:
1. Take the middle value A as (sum of two middle years)/2, i.e. A = (1998 + 1999)/2 = 1998.5
2. Find X = (xi – A)/0.5 (we divide by 0.5 to avoid cumbersome calculations)
3. Find X² and XY
Solution:
Year (xi)   Sales (Y)   X = (xi – 1998.5)/0.5   X²   XY      Trend value Yt = a + bX
1996        6.5         -5                      25   -32.5   5.9 + (-5) × 0.13 = 5.25
1997        5.3         -3                      9    -15.9   5.9 + (-3) × 0.13 = 5.51
1998        4.3         -1                      1    -4.3    5.9 + (-1) × 0.13 = 5.77
1999        6.1         1                       1    6.1     5.9 + (1) × 0.13 = 6.03
2000        5.6         3                       9    16.8    5.9 + (3) × 0.13 = 6.29
2001        7.8         5                       25   39      5.9 + (5) × 0.13 = 6.55
n = 6       ΣY = 35.6   ΣX = 0                  ΣX² = 70   ΣXY = 9.2   ΣYt = 35.4
$a = \dfrac{\sum Y}{n} = \dfrac{35.6}{6} = 5.9$
and $b = \dfrac{\sum XY}{\sum X^2} = \dfrac{9.2}{70} = 0.13$

Therefore, the required equation of the straight-line trend is given by
y = a + bx ⟹ y = 5.9 + 0.13x
According to the trend line, the predicted sales for year 2002 (X = (2002 – 1998.5)/0.5 = 7):
y = 5.9 + 0.13(7) = 6.81 lakh ₹
Note: 1. Future trend forecast made by using this method are based only on the trend values
2. The predicted trend values by using this method are more reliable than any other method
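For an even number of years, the only change is the coding of the time variable. The sketch below uses the data of Example 12; the helper `coded_time` is a hypothetical name, and the half-year step of 0.5 follows the procedure of that example. Because the code keeps a and b unrounded, its forecast comes out slightly higher than a hand calculation that rounds a and b first.

```python
def coded_time(year, origin, half_step=0.5):
    """Convert a year into the coded variable X = (year - origin) / 0.5."""
    return (year - origin) / half_step


# Data of Example 12 (sales for 1996 to 2001)
years = list(range(1996, 2002))
sales = [6.5, 5.3, 4.3, 6.1, 5.6, 7.8]

# Origin midway between the two middle years
origin = (years[len(years) // 2 - 1] + years[len(years) // 2]) / 2
x = [coded_time(year, origin) for year in years]

a = sum(sales) / len(sales)
b = sum(xi * yi for xi, yi in zip(x, sales)) / sum(xi * xi for xi in x)

forecast_2002 = a + b * coded_time(2002, origin)
print(x)
print(round(a, 2), round(b, 2), round(forecast_2002, 2))
```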
6.10 CHECK YOUR UNDERSTANDING REFLECTIVE QUESTIONS

Q1. Judge the correctness or otherwise of the following statements:
i) An index number is a pure number
ii) Index numbers are independent of choice of unit
iii) An index number can be a negative quantity
iv) The purchase power of money decreases as the wholesale index increases
Q2. A price index which is based on the prices of the items in the composite, weighted by their
relative importance, is called:
i) price relatives ii) Consumer price index
iii) Weighted aggregative price index iv) Simple aggregative index
Answer: iii)
Q.3 A weighted aggregate price index in which the weight for each variable is considered its
current-period quantity is:
i) Aggregative index ii) Consumer Price index
iii) Laspeyres Index iv) Paasche’s’ index Answer: iv)
Q.4 An index constructed to measure changes in quantities over a period of time is:
i) Quantity index ii) Time series index
iii) Quality index iv) Value index Answer: i)
Q.5 For calculating the weighted index number, which of the following uses quantities consumed
in the base period as weights:
i) Fisher’s method ii) Paasche’s method
iii) Laspeyres method iv) Aggregative method Answer : iii)
Q.6 What is the index number of the base period?
i) 200 ii) 300
iii) 10 iv) 100 Answer: iv)
Q.7 Index number is a special type of :
i) Average ii) Dispersion
iii) Correlation iv) None of the above Answer: i)
Q.8 Index number is always expressed in
i) Percentage ii) Ratio
iii) Proportion iv) None of the above Answer : i)

Q.9 Which index number is called as ideal index number
i) Laspeyres ii) Paasches
iii) Fisher iv) None of the above Answer iii)
Q.10 In Laspeyres price index number weight is considered as
i) Quantity in base year ii) Quantity during current year
iii) Prices in base year iv) Prices in current year. Answer i)
Q.11 In Paasche’s price index number weight is considered as
i) Quantity in base year ii) Quantity in current year
iii) Prices in base year iv) Prices in current year Answer: ii)
Q.12 Fishers price index number is the
i) A.M. of Laspeyres and Paasche’s
ii) G.M. of Laspeyres and Paasche’s
iii) Difference between Laspeyres and Paasche’s
iv) None of the above. Answer: ii)
Q.13 When the prices of rice are to be compared, we compute:
i) Volume index ii) Value index
iii) Price index iv) Aggregative index Answer: iii)
Q.14 Purchasing power of money can be assessed through:
i) Simple index ii) Fisher’s index
iii) Consumer price index iv) Volume index Answer: iii)
Q.15 Cost of living at two different cities can be compared with the help of:
i) Value index ii) Consumer price index
iii) Volume index iv) Un-weighted index Answer: ii)
6.11 PRACTICE EXERCISE

Q.1 Calculate index numbers from the following data by simple aggregate method taking prices
of 1995 as base period.
Commodity                      A    B    C     D
Price (in Rupees/unit), 1995   80   50   90    30
Price (in Rupees/unit), 2005   95   60   100   45
Q.2 Construct price index number from the following data using
i) Laspeyre’s Method and ii) Paasche’s method iii) Fisher’s Ideal method
Commodity Price Quantity
Year Year Year Year
2008 2010 2008 2010
P 2 4 8 5
Q 5 6 12 10
R 4 5 15 12
S 2 4 18 20

Q.3 Taking 1995 as base year calculate relative index number for the years 1997-2005
Year 1995 1997 1999 2001 2003 2005
Price (in ) 12 14 13 20 25 21
Q.4 Compute the weighted aggregative index number for the following data:
Variable Price Weights

Current year Base year
X 5 4 60
Y 3 2 50
Z 2 1 30
Q.5 Calculate price index number for 2010 taking 1990 as the base year from the following data
by simple aggregative method:
Item Rice Wheat Pulses Millets Oil

Price in year
1990 ( in ) 60 40 100 60 90
Price in year
2010 ( in ) 140 60 205 70 100
Q.6 Based on the data on the expenses of middle-class families in a certain city, calculate the
cost-of-living index during the year 2003 as compared with 1990:
Expenses Year Food Fuel Clothing Rent Miscellaneous

Price (in ) 2003 1500 250 750 300 425
Price (in ) 1990 1400 200 400 200 250
Q7. From the data given below, obtain the index of retail sales in India for years 1982, 1983,
1984 with the year 1981 as base period.
Year Index of sales volume Index of sales value

1995 101 105
1996 113 108
1997 106 124
Q8. Calculate the price index number for the following data using weighted aggregative method:
Commodity Unit Weight Price

Base year Current year
P Quintal 14 90 120
Q Kg 20 10 17
R Dozen 35 40 60
S Litre 15 50 93

Q9. Based on the given data, check whether i) Paasche’s formula and, ii) Fisher’s formula will
satisfy the time reversal test:
Commodity Base Year Current Year

Price Quantity Price Quantity
P 4 10 6 15
Q 6 15 4 20
R 8 5 10 4
Q10. The annual rainfall (in mm) was recorded for Cherrapunji, Meghalaya:
Year Rainfall (in cm)

2001 1.2
2002 1.9
2003 2
2004 1.4
2005 2.1
2006 1.3
2007 1.8
2008 1.1
2009 1.3
Determine the trend of rainfall by 3-year moving averages

Q11. Compute the seasonal indices by 4-year moving averages from the given data of production
of paper (in thousand tons)
Year 1980 1981 1982 1983 1984 1985 1986 1987 1988 1989
Index
number 2450 1470 2150 1800 1210 1950 2300 2500 2480 2680
Q12. Given below is the data of workers welfare expenses (in lakh ) in steel industries during
2001 - 2005. Use method of least squares to:
i) tabulate the trend values
ii) find the best fit for a straight-line trend
iii) compute the expected welfare expenses trend for year 2006
Year 2001 2002 2003 2004 2005
Welfare expenses
(in lakh ) 160 185 220 300 510
Q13. Fit a straight-line trend by the method of least squares for the following data and also find the
trend value for year 1998:
Year 1992 1993 1994 1995 1996 1997
Production
(in tons) 210 225 275 220 240 235

6.11 UNIT SUMMARY
1. An index number is a measure of change in a group of related variables over two different situations
with respect to time, geographical location or other characteristics.
2. Factors influencing construction of index numbers:
 Selection of data  Base period
 Selection of weights  Choice of variables
3. Index number for time period n is represented as In
4. A list of indexes is called an Index series
5. Methods to construct index number:
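i. Relative Index number = $\dfrac{p_n}{p_0} \times 100$
ii. Simple (Unweighted) Aggregative Method = $\dfrac{\sum p_n}{\sum p_0} \times 100$
iii. Simple Average of relatives Method = $\dfrac{1}{N} \sum \dfrac{p_n}{p_0} \times 100$, where N is the number of commodities
iv. Weighted Aggregative Method = $\dfrac{\sum p_n Q}{\sum p_0 Q} \times 100$
v. Laspeyres' method = $\dfrac{\sum p_n Q_0}{\sum p_0 Q_0} \times 100$
vi. Paasche's method = $\dfrac{\sum p_n Q_n}{\sum p_0 Q_n} \times 100$
vii. Fisher's Ideal method = $\sqrt{\dfrac{\sum p_n Q_0}{\sum p_0 Q_0} \times \dfrac{\sum p_n Q_n}{\sum p_0 Q_n}} \times 100$
viii. Marshall-Edgeworth's method = $\dfrac{\sum p_n (Q_0 + Q_n)}{\sum p_0 (Q_0 + Q_n)} \times 100$
ix. Weighted averages of relatives = $\dfrac{\sum \frac{p_n}{p_0}\,(p_0 Q_0)}{\sum p_0 Q_0} \times 100$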
6. Types of index numbers:

i. Value index
ii. Quantity Index
iii. Price Index
7. There are tests to check consistency to verify adequacy of an index number:
i. Unit test ii. Time reversal test
iii. Factor reversal test iv. Circular test
8. Time reversal test: po1 × p10 = 1
Here, po1 is the index for current year ‘1’ on the basis of base year ‘0’
And, p10 is the index for year ‘0’ based on year ‘1’
9. A time series is a set of numerical data points for a given variable, recorded sequentially and arranged in
successive order to track variation
10. The purpose of a time series is to show how a variable changes over time

11. Laspeyres' method and Paasche's method do not satisfy the time reversal test of adequacy
12. When data of the variable is collected at distinct time intervals, for a specified period of time, it is
called time series data.
13. When data for one or more variables is collected at the same point in time, it is called cross-sectional data.
14. When data is collected in a combination of time series data and cross-sectional data, it is called
pooled data.
15. A time series in which data of only one variable is varying over time is called a univariate time series.
16. When a time series is a collection of data for multiple variables and how they are varying over time,
it is called multivariate time series.
17. Secular trend component or simply called trend series, is the smooth, regular and long-term variations
of the series, observed over a long period of time.
18. Seasonal component captures the regular, periodic pattern of variability in the data
within one-year periods.
19. Cyclical component shows an oscillatory movement where the period of oscillation is
more than a year; one complete period is called a cycle. The real Gross Domestic Product
(GDP) provides a good example of a time series that displays cyclical behavior.
20. Irregular component consists of fluctuations that are unaccountable, unpredictable or
sometimes caused by unforeseen circumstances like floods, natural calamities, labor strikes etc.
21. Trend can be measured by the following methods:
i. Graphical method
ii. Semi averages method
iii. Moving averages method
iv. Method of least squares
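Points 8 and 11 of this summary can be checked numerically. The following is a minimal Python sketch, assuming the standard Laspeyres, Paasche and Fisher price index formulas written as ratios (without the factor 100). The prices and quantities for three commodities are hypothetical, and the function names are illustrative only.

```python
from math import sqrt

def laspeyres(p0, q0, p1, q1):
    """Laspeyres price index (as a ratio): base-year quantities as weights."""
    return sum(a * b for a, b in zip(p1, q0)) / sum(a * b for a, b in zip(p0, q0))

def paasche(p0, q0, p1, q1):
    """Paasche price index (as a ratio): current-year quantities as weights."""
    return sum(a * b for a, b in zip(p1, q1)) / sum(a * b for a, b in zip(p0, q1))

def fisher(p0, q0, p1, q1):
    """Fisher's ideal index: geometric mean of Laspeyres and Paasche."""
    return sqrt(laspeyres(p0, q0, p1, q1) * paasche(p0, q0, p1, q1))

def time_reversal_product(index, p0, q0, p1, q1):
    """P01 * P10, where P10 is obtained by swapping base and current year."""
    return index(p0, q0, p1, q1) * index(p1, q1, p0, q0)

# Hypothetical data: base-year and current-year prices and quantities
p0, q0 = [10, 8, 5], [30, 15, 20]
p1, q1 = [12, 10, 6], [25, 20, 22]

for index in (laspeyres, paasche, fisher):
    print(index.__name__, round(time_reversal_product(index, p0, q0, p1, q1), 4))
```

For this data only Fisher's index gives a product equal to 1, which is what the time reversal test requires.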
6.12 Check your Understanding Answer Key

ANSWERS TO REFLECTIVE QUESTIONS
2. iii) 3. iv) 4. i) 5. i) 6. iv) 7. i) 8. i) 9. iii) 10. i) 11. ii) 12. ii) 13. iii) 14. iii) 15. ii)
ANSWERS TO PRACTICE QUESTIONS

1. 120
2. Laspeyre’s = 146; Paasche’s = 149; Fisher’s = 147
3. Year            1995  1997  1999  2001  2003  2005
   Price (in ₹)      12    14    13    20    25    21
   Index number     100   117   108   167   208   175
4. 137
5. 164
6. 132
7. Year
1995 104
1996 96
1997 117
8. 153
9. Paasche's formula - no, Fisher's formula - yes
10. Year 3-year moving average
2001 -
2002 2.55
2003 2.65
2004 2.75
2005 2.4
2006 2.6
2007 2.1
2008 2.25
2009 -
11. Year 4-year moving averages

1980 -
1981 -
1982 1812.5
1983 1712.5
1984 1791.25
1985 1897.5
1986 2138.75
1987 2393.75
1988 -
1989 -
12. Year ($x_i$)    Trend value ($Y_t = 275 + 81.5X$)
2001    112
2002    193.5
2003    275
2004    356.5
2005    438
Trend line: $Y_t = 275 + 81.5X$

Predicted trend for year 2006 = 519.5 lakh rupees
13. Year ($x_i$)    Trend value ($Y_t = 234 + 1.6X$)
1992    226
1993    229.2
1994    232.4
1995    235.6
1996    238.8
1997    242
Trend line: $Y_t = 234 + 1.6X$
Predicted trend for 1998 = 245.2 tons

Unit 7

• Explain the concept of perpetuity and sinking fund.
• Calculate perpetuity.
• Differentiate between sinking fund and savings account.
• Define the concept of valuation of bond and related terms.
• Calculate value of bond using present value approach.
• Explain the concept of EMI.
• Calculate EMI using various methods.
• Explain the concept of rate of return and nominal rate of return.
• Calculate rate of return and nominal rate of return.
• Understand the concept of Compound Annual Growth Rate.
• Differentiate between Compound Annual Growth Rate and Annual Growth Rate.
• Calculate Compound Annual Growth Rate.
• Explain the concept of Stocks, shares and Debentures.
• Enlist features related to equity shares and debentures.
• Interpret case studies related to shares and debentures (Simple Case Studies only).
• Define the concept of linear method of depreciation.
• Interpret cost, residual value and useful life of an asset from the given information.
• Calculate depreciation using linear method of depreciation.
Financial Mathematics 7.1

CONCEPT MAP
Introduction
Financial mathematics is of great importance in our day-to-day life. Operations in
banking, insurance, property dealing etc. are all based on the idea that money belonging to one
individual may be used by others in return for periodic payments. Interest plays an important
role in almost all financial activities. Many people have set up their own finance companies and
earn their living from them.
In this chapter, we shall discuss some of the basic topics of finance.
7.1.1 Perpetuity:
Perpetuity: A perpetuity is an annuity whose payments continue forever.
Amount of a Perpetuity: The amount of a perpetuity is undefined, since it increases beyond all
bounds as time goes on.
Present value of Perpetuity: We consider two types of perpetuity, which are as follows:
(i) The present value of a perpetuity of ₹R payable at the end of each period, the first payment
due one period hence, is the sum of money which, invested now at the rate i per period, will
yield ₹R at the end of each period forever. It is given by
$P = R(1+i)^{-1} + R(1+i)^{-2} + R(1+i)^{-3} + \cdots$
This is an infinite geometric series with first term $R(1+i)^{-1}$ and common ratio $(1+i)^{-1}$.
Its sum is given by
$\dfrac{R(1+i)^{-1}}{1 - (1+i)^{-1}} = \dfrac{R}{(1+i) - 1} = \dfrac{R}{i}$
Hence the present value of a perpetuity of ₹R payable at the end of each period, the first being due one
period hence, is
$P = \dfrac{R}{i}$
where R = size of each payment

i = rate per period

(ii) Perpetuity of ₹R payable at the beginning of each period, the first payment being due at
present. This annuity can be considered as an initial payment of ₹R followed by a perpetuity
of ₹R of the above type.
Thus, the present value is given by
$P = R + \dfrac{R}{i}$
where, R = size of each payment

i = rate per period
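The two present-value formulas, P = R/i for payments at the end of each period and P = R + R/i for payments at the beginning, can be sketched in Python as follows. The function name and the `at_beginning` flag are illustrative choices, not part of the text.

```python
def perpetuity_pv(payment, rate_per_period, at_beginning=False):
    """Present value of a perpetuity.

    End-of-period payments:       P = R / i
    Beginning-of-period payments: P = R + R / i
    """
    if rate_per_period <= 0:
        raise ValueError("rate per period must be positive")
    pv = payment / rate_per_period
    return payment + pv if at_beginning else pv

# 60 at the end of every 6 months, money worth 4% compounded semi-annually
print(round(perpetuity_pv(60, 0.04 / 2), 2))
# 2500 at the beginning of every year, money worth 3% compounded annually
print(round(perpetuity_pv(2500, 0.03, at_beginning=True), 2))
```

The rate passed in must be the rate per payment period, so a nominal annual rate is first divided by the number of conversion periods per year.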
Example 1
Find the present value of a sequence of payments of ₹60 made at the end of each 6 months and
continuing forever, if money is worth 4% compounded semi-annually.
Solution: This is a perpetuity of type (i), since payments are made at the end of each period. Given
that
$R = 60$ and $i = \dfrac{0.04}{2} = 0.02$
Then the present value of the perpetuity is
$P = \dfrac{R}{i} = \dfrac{60}{0.02} = 3000$
Example 2
At 6% converted quarterly, find the present value of a perpetuity of ₹600 payable at the end of each
quarter.
Solution: Given that
$R = 600$, $i = \dfrac{0.06}{4} = 0.015$
Then the present value of the perpetuity is
$P = \dfrac{R}{i} = \dfrac{600}{0.015} = 40{,}000$
Example 3:
At what rate converted semi-annually will the present value of a perpetuity of ₹450 payable at the
end of each 6 months be ₹20,000?
Solution: Let r be the interest rate converted semi-annually. Then i, the interest rate per period, is $\dfrac{r}{2}$.
Since $P = \dfrac{R}{i}$,
where P = 20,000 and R = 450,
we have $i = \dfrac{R}{P} = \dfrac{450}{20{,}000} = 0.0225$
$\dfrac{r}{2} = 0.0225$
$r = 0.045$ or 4.5 %

Example 4:
How much money is needed to endow a series of lectures costing ₹2500 at the beginning of each
year indefinitely, if money is worth 3% compounded annually?
Solution: We have R = 2500, i = 0.03. The money needed to endow a series of lectures costing ₹2500 at
the beginning of each year is the present value of a perpetuity of ₹2500 payable at the beginning
of each year:
$P = R + \dfrac{R}{i} = 2500 + \dfrac{2500}{0.03} = 2500 + 83333.33$
$= 85833.33$
Example 5:
The present value of a perpetual income of ₹x at the end of each six months is ₹40,000. Find the
value of x if money is worth 6% compounded semi-annually.
Solution: We have P = 40,000
$i = \dfrac{0.06}{2} = 0.03$
We know that
$P = \dfrac{x}{i}$
$40{,}000 = \dfrac{x}{0.03}$
$x = 40{,}000 \times 0.03 = 1200$
7.1.2 Sinking Fund:

A sinking fund is a fund established by a company or business entity by setting aside revenue
over a period of time to fund a future capital expense, or repayment of a long-term debt. It is a fund
that is accumulated for the purpose of paying off a financial obligation at some future designated
date.
The periodic payment R, made at the end of each period, required to accumulate a sum of
A over n periods with interest charged at the rate i per period is
$R = \dfrac{A\,i}{(1+i)^n - 1}$, or equivalently $A = R\left[\dfrac{(1+i)^n - 1}{i}\right]$
where
R = Size of each instalment or payment

i = rate per period
n = number of instalments
A = lumpsum amount to be accumulated
Remark: The problems relating to Sinking Fund are solved by using known formulas for the
amount of an ordinary annuity or annuity due as the case may be depending on whether the
payments are set aside at the end or beginning of each payment interval.
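As the remark says, the deposit follows from the amount of an ordinary annuity (deposits at the end of each period) or of an annuity due (deposits at the beginning). A minimal Python sketch of both cases is given below. The function name and the `at_beginning` flag are illustrative, and the annuity-due amount formula A = R[((1+i)^(n+1) - 1)/i - 1] is the standard one, assumed here.

```python
def sinking_fund_payment(target, rate, periods, at_beginning=False):
    """Periodic deposit R needed to accumulate `target` (A) with `periods` (n)
    deposits at `rate` (i) per period.

    End of period (ordinary annuity):  A = R * ((1+i)**n - 1) / i
    Beginning of period (annuity due): A = R * (((1+i)**(n+1) - 1) / i - 1)
    """
    if at_beginning:
        factor = ((1 + rate) ** (periods + 1) - 1) / rate - 1
    else:
        factor = ((1 + rate) ** periods - 1) / rate
    return target / factor

# 48,000 needed after 10 years, deposits at year end, 7% per annum
print(round(sinking_fund_payment(48_000, 0.07, 10), 2))
# 1,00,000 needed after 10 years, deposits at the start of each year, 12% per annum
print(round(sinking_fund_payment(100_000, 0.12, 10, at_beginning=True), 2))
```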

Difference between Sinking Fund and Savings Account
Sinking fund and savings account, both, involve setting aside an amount of money for the future.
The main difference is that the sinking fund is set up for a particular purpose and is to be used at
a particular time, while the savings account is set up for any purpose that it may serve.
Example 6:
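A company establishes a sinking fund to provide for the payment of a ₹1,00,000 debt maturing in 4
years. Contributions to the fund are to be made at the end of every year. Find the amount of each
annual deposit if interest is 18% per annum.
Solution: Let each annual deposit to the sinking fund be R. Then R is given by
$A = R\left[\dfrac{(1+i)^n - 1}{i}\right]$
$\Rightarrow 1{,}00{,}000 = R\left[\dfrac{(1.18)^4 - 1}{0.18}\right]$
$= R\left[\dfrac{1.9388 - 1}{0.18}\right]$
$= R\left[\dfrac{0.9388}{0.18}\right] = R\,(5.2156)$
$\Rightarrow R = \dfrac{1{,}00{,}000}{5.2156} = 19{,}173.25$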
Example 7
In 10 years, a machine costing 40,000 will have a salvage value of 4,000. A New Machine at that
time is expected to sell for 52,000. In order to provide funds for the difference between the
replacement cost and the salvage cost, a sinking fund is set up into which equal payments are placed
at the end of each year. If the fund earns interest at the rate 7% compounded annually, how much
should each payment be?
Solution: Amount needed after 10 years
= Replacement Cost - Salvage Cost
= 52,000 – 4,000 = 48,000
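The payments into the sinking fund consist of 10 annual payments at the rate of 7% per year, so R is
given by
$48{,}000 = R\left[\dfrac{(1.07)^{10} - 1}{0.07}\right]$
$= R\left[\dfrac{1.967151 - 1}{0.07}\right] = R\,(13.816448)$
$\Rightarrow R = \dfrac{48{,}000}{13.816448} = 3474.12$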

Example 8:
Mr X plans to save amount for higher studies of his son, required after 10 years. He expects the cost
of these studies to be 1,00,000. How much should he save at the beginning of each year to
accumulate this amount at the end of 10 years, if the interest rate is 12% compounded annually?
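Solution: Let the size of each annual payment be R. These payments form an annuity due
consisting of 10 annual payments at the rate 0.12 per annum. Thus, using the following formula for
the amount of an annuity due:
$A = R\left[\dfrac{(1+i)^{n+1} - 1}{i} - 1\right]$
where A = 1,00,000, n = 10 and i = 0.12,
$1{,}00{,}000 = R\left[s_{\overline{11}|0.12} - 1\right]$
$= R\left[\dfrac{(1.12)^{11} - 1}{0.12} - 1\right]$
$= R\,(19.65458)$
$\Rightarrow R = \dfrac{1{,}00{,}000}{19.65458}$
$= 5087.87$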
EXERCISE 7.1
1. Find the present value of a sequence of payments of 80 made at the end of each 6 months and
continuing forever, if money is worth 4% compounded semi-annually.
2. Find the present value of an annuity of 1800 made at the end of each quarter and continuing
forever, if money is worth 5% compounded quarterly.
3. If the cash equivalent of a perpetuity of 300 payable at the end of each quarter is 24,000, find
the rate of interest compounded quarterly.
4. Find the present value of a perpetuity of 780 payable at the beginning of each year, if money is
worth 6% effective.
5. The present value of a perpetual income of x at the end of each 6 months is 36000.
Find the value of x if money is worth 6% compounded semi-annually.
6. If you need 20,000 for your daughter’s education, how much must you set aside each quarter for
10 years to accumulate this amount at the rate of 6% compounded quarterly?
7. To save for child’s education, a sinking fund is created to have 1,00,000 at the end of 25 years.
How much money should be retained out of the profit each year for the sinking fund, if the investment
can earn interest at the rate 4% per annum.
8. A machine costs 1,00,000 and its effective life is estimated to be 12 years. A sinking fund is created
for replacing the machine by a new model at the end of its lifetime when its scrap realises a sum
of 5,000 only. Find what amount should be set aside at the end of each year, out of the profits,
for the sinking fund if it accumulates at 5% effective.
9. Suppose a machine costing 50,000 is to be replaced at the end of 10 years, at that time it will have
a salvage value of 5,000. In order to provide money at that time for a machine costing the same
amount, a sinking fund is set up. The amount in the fund at that time is to be the difference between
the replacement cost and salvage value. If equal payments are placed in the fund at the end of each
quarter and the fund earns 8% compounded quarterly. What should each payment be?

VALUATION OF BONDS
Bond: It is a written contract between a borrower and a lender (bond holder). Through this
contract, the borrower promises to pay a specified sum at a specific future date and to pay interest
payments at a specific rate at equal intervals of time until the bond is redeemed (repaid).
A bond is characterized by following terms:

Face Value: The face value (also known as par value) of a bond is the price at which the bond
is sold to buyers (investors) at the time of issue. It is also the price at which the bond is redeemed
at maturity.
Redemption Price: It is the amount the bond issuer pays at maturity. It is usually equal to the
face value in case the bond is redeemed at par.
Discount: Where the market price of bond is less than its face value (par value), the bond is
selling at a discount.
Premium: if the market price of bond is greater than its face value, the bond is selling at a
premium.
Bond valuation: is the determination of the fair price of a bond. As with any security or capital
investment, the theoretical fair value of a bond is the present value of the stream of cash flows it
is expected to generate. Hence, the value of a bond is obtained by discounting the bond’s expected
cash flows to the present using an appropriate discount rate.
Nominal rate of interest: It is the rate at which a bond yields interest. It’s also known as coupon
rate.
Coupon Rate: A bond’s coupon rate denotes the annual interest rate paid by the bond issuer to
the bond holder. It is simply the coupon payment C as a percentage of the face value F. Coupon
yield is also called nominal yield.
Coupon rate = $\dfrac{C}{F} \times 100\%$
Current Yield: The current yield is simply the coupon payment C as a percentage of the (current)
bond price $P_0$.
Current yield = $\dfrac{C}{P_0} \times 100\%$
Yield to Maturity (YTM): The yield to maturity (YTM) is the discount rate which returns the
market price of a bond without embedded optionality; it is identical to required return. YTM is thus
the internal rate of return of an investment in the bond made at the observed price. Since YTM can
be used to price a bond, bond prices are often quoted in terms of YTM.
To achieve a return equal to YTM, the bond owner must:
• Buy the bond at a price $P_0$
• Hold the bond until maturity
• Redeem the bond at par
Relationship between Bond, YTM and Coupon yield

The concept of current yield is closely related to the other bond concepts, including yield to
maturity and coupon yield. The relationship between yield to maturity and the coupon rate is as
follows:

• When a bond sells at a discount, YTM > current yield > coupon yield.
• When a bond sells at a premium, coupon yield > current yield > YTM.
• When a bond sells at par, YTM = current yield = coupon yield.
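These orderings can be checked numerically. The sketch below is only an illustration: it assumes a hypothetical bond with annual coupons that is redeemed at par, prices it as the present value of its coupons plus the present value of its face value, and finds the YTM by bisection (a numerical technique chosen for this sketch, not one prescribed by the text).

```python
def bond_price(face, coupon_rate, yield_rate, periods):
    """Present value of the annual coupons plus redemption at par."""
    coupon = face * coupon_rate
    discount = (1 + yield_rate) ** -periods
    return coupon * (1 - discount) / yield_rate + face * discount

def yield_to_maturity(price, face, coupon_rate, periods, lo=1e-6, hi=1.0):
    """Yield at which bond_price equals `price`, found by bisection.
    This works because the price falls as the yield rises."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if bond_price(face, coupon_rate, mid, periods) > price:
            lo = mid   # computed price too high, so the yield must be higher
        else:
            hi = mid
    return (lo + hi) / 2

face, coupon_rate, periods = 1000, 0.08, 10   # hypothetical bond
price = 950                                   # selling at a discount

coupon_yield = coupon_rate
current_yield = face * coupon_rate / price
ytm = yield_to_maturity(price, face, coupon_rate, periods)
print(f"coupon {coupon_yield:.4f}  current {current_yield:.4f}  YTM {ytm:.4f}")
```

Changing `price` to a value above 1000 reverses the ordering, as the premium case states.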
Present Value Approach (for bonds with a maturity period)

• When a bond or debenture has a maturity date, its value is calculated from the periodic
interest payments and its terminal (redemption) value: using the present value concept,
the discounted value of these cash flows is calculated.
• By comparing the present value of a bond with its current market value, it can be determined
whether the bond is overvalued or undervalued.
• In the present value approach, we first calculate the present value of each expected cash flow
and then add all the individual present values to obtain the fair value (purchase price)
of the bond.
Let there be a bond where
Value of bond, market price of bond or purchase price of bond = V
Face value = F
Redemption price or maturity value = C
Number of cash flows or number of periodic payments = n
Rate of interest (coupon rate) per period = $i_d$
Yield rate or interest rate per period = i
Coupon payment or periodic interest (dividend) payment = R
The periodic dividend payment R is calculated on the face value and is given by
$R = F \times i_d$
The present value of the annuity of periodic dividend payments of R for n periods is given by
$P_1 = R\left[\dfrac{1 - (1+i)^{-n}}{i}\right]$
The present value of the redemption price of the bond is given by:

$P_2 = C\,(1+i)^{-n}$
Let V be the purchase price of the bond, then
$V = P_1 + P_2$
$V = R\left[\dfrac{1 - (1+i)^{-n}}{i}\right] + C\,(1+i)^{-n}$
In other words,
Bond Value = Present value of first periodic payment + Present value of second periodic payment +
. . . + Present value of nth periodic payment + Present value of Redemption price/Maturity value
$= \dfrac{R}{(1+i)} + \dfrac{R}{(1+i)^2} + \cdots + \dfrac{R}{(1+i)^n} + \dfrac{C}{(1+i)^n}$
$= \sum_{k=1}^{n} \dfrac{R}{(1+i)^k} + \dfrac{C}{(1+i)^n}$

Bond Value (V) $= R\left[\dfrac{1 - (1+i)^{-n}}{i}\right] + C\,(1+i)^{-n}$
Note: If a bond is redeemed at par, then C = F and
$V = R\left[\dfrac{1 - (1+i)^{-n}}{i}\right] + F\,(1+i)^{-n}$
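The bond value formula, V = R[(1 - (1+i)^(-n))/i] + C(1+i)^(-n), can be sketched in Python as follows. It assumes the periodic dividend is computed on the face value; the function and parameter names are illustrative only.

```python
def bond_value(face, coupon_rate_per_period, yield_per_period, periods,
               redemption=None):
    """Purchase price of a bond by the present value approach.

    V = R * (1 - (1+i)**-n) / i + C * (1+i)**-n, with R = F * i_d.
    If `redemption` is omitted the bond is redeemed at par (C = F).
    """
    if redemption is None:
        redemption = face
    dividend = face * coupon_rate_per_period          # R
    discount = (1 + yield_per_period) ** -periods     # (1+i)^(-n)
    annuity_factor = (1 - discount) / yield_per_period
    return dividend * annuity_factor + redemption * discount

# 600 bond, 8% paid semi-annually (4% per period), yield 4% per period, 10 periods
print(round(bond_value(600, 0.04, 0.04, 10), 2))
# 2,000 bond, 10% annual coupon, yield 11%, 5 years, redeemed at par
print(round(bond_value(2000, 0.10, 0.11, 5), 2))
```

When the coupon rate per period equals the yield rate per period and the bond is redeemed at par, the value equals the face value, which the first call illustrates.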
Example 9
Find the purchase price of a 600, 8% bond, dividends payable semi-annually redeemable at par in
5 years, if the yield rate is to be 8% compounded semi-annually.
Solution: Face value of the bond F = 600. As the bond is redeemable at par, redemption price C = 600
Nominal rate of interest = 8% or 0.08
As dividends are paid semi-annually,
rate of interest per period $i_d = \dfrac{0.08}{2} = 0.04$
Therefore, periodic dividend payment $R = F \times i_d = 600 \times 0.04 = 24$

So, the semi-annual dividend R is ₹24
Yield rate is 8% = 0.08, compounded semi-annually
Therefore $i = \dfrac{0.08}{2} = 0.04$
No. of years = 5
Therefore, no. of dividend periods n = 5 x 2 = 10
Purchase price (V) of the bond is given by
$V = R\left[\dfrac{1 - (1+i)^{-n}}{i}\right] + C\,(1+i)^{-n}$
$= 24\left[\dfrac{1 - (1.04)^{-10}}{0.04}\right] + 600\,(1.04)^{-10}$
$= 24\left[\dfrac{1 - 0.6755}{0.04}\right] + 600\,(0.6755)$
$= 24\,(8.1125) + 600\,(0.6755)$
$= 194.7 + 405.3 = 600$

Therefore, the purchase price of the bond is ₹600.
Example 10
A 2,000, 8% bond is redeemable at the end of 10 years at 105. Find the purchase price to yield
10% effective rate.
Solution: Face value of the bond F = 2,000
As the bond is redeemable at 105, the redemption price of the bond is 105 % of its face value.
Therefore, redemption value C = 1.05 x 2,000 = 2,100

Nominal rate $i_d$ = 8 % or 0.08
So $R = F \times i_d = 2{,}000 \times 0.08 = 160$
No. of periods before redemption n = 10
Annual yield rate i = 10 % or 0.1
Therefore, purchase price V is given by,
$V = R\left[\dfrac{1 - (1+i)^{-n}}{i}\right] + C\,(1+i)^{-n}$
$= 160\left[\dfrac{1 - (1.1)^{-10}}{0.1}\right] + 2100\,(1.1)^{-10}$
$= 160\left[\dfrac{1 - 0.38554}{0.1}\right] + 2100\,(0.38554)$
$= 160\,(6.1446) + 2100\,(0.38554)$

$= 983.14 + 809.63$
$= 1792.77$
Therefore, the present value of the bond is approximately ₹1,792.77.
Example 11
Consider a bond with a coupon rate of 10% charged annually. The par value is ₹2,000 and the bond
has 5 years to maturity. The yield to maturity is 11 %. What is the value of the bond?
Solution: Face value F = 2,000; the bond is redeemed at par, so C = 2,000
Coupon rate $i_d$ = 10 % annually or 0.1
Therefore $R = F \times i_d = 2{,}000 \times 0.1 = 200$
No. of periods before redemption (n) = 5
Yield rate i = 11 % or 0.11
Therefore
$V = R\left[\dfrac{1 - (1+i)^{-n}}{i}\right] + C\,(1+i)^{-n}$
$= 200\left[\dfrac{1 - (1.11)^{-5}}{0.11}\right] + 2000\,(1.11)^{-5}$
$= 200\left[\dfrac{1 - 0.593451}{0.11}\right] + 2000\,(0.593451)$

$= 200\,(3.6959) + 1186.902$
$= 739.18 + 1186.902$
$= 1926.08$

Therefore, the value of the bond is ₹1,926.08.
Relative price approach: Under this approach, the bond will be priced relative to a benchmark
usually a government security. Here, the yield to maturity on the bond is determined based on the
bond’s credit rating relative to a government security with similar maturity/duration. The better the
quality of the bond, the smaller the spread between its required return and the YTM of the benchmark.
Exercise 7.2
1. What should be the price of the bond to yield an effective interest rate of 8% if it has a face value
of 1,000 and maturity period of 15 years? The nominal interest rate is 10%.
2. Suppose a bond has a face value of 1,000, redeemable at the end of 12 years at 15% premium
and paying annual interest at 8%. If the yield rate is to be 10% p.a. effective then what will be the
purchase price of the bond?
3. An investor is considering purchasing a 5 year bond of 1,00,000 at par value and an annual fixed
coupon rate of 12% while coupon payments are made semi-annually. The minimum yield that the
investor would accept is 6.75%. Find the fair value of the bond.
4. Suppose that a bond has a face value of 1,000 and will mature in 10 years. The annual coupon
rate is 5%, the bond makes semi-annual coupon payments. With a price of 950, what is the bond’s
YTM?
5. A bond with a face value of 1,000 matures in 10 years. The nominal rate of interest on bond is
11% p.a. paid annually. What should be the price of the bond so as to yield effective rate of return
equal to 8%?
6. What is the value of the bond, considering a bond has a coupon rate of 10% charged annually, par
value being 1,000 and the bond has 5 years to maturity. The yield to maturity is 11%.
7.3 Calculation of EMI or Amortization of Loans

People spend the money they earn on housing, gadgets etc., and sometimes have to meet extra
expenditures as well. For example, one may want to buy a car or a house, set up a
business or go on a foreign trip. Some people plan and manage to put aside
money for such expenditures, but most people have to borrow money (take a loan) to meet
them. The loan is repaid by the borrower to the lender within a defined length of time.
When we talk about loans and how to pay them back, the most important term to
understand is EMI. Before discussing EMI, we need to understand some basic terms related to it.
Principal: It is the initial amount of money borrowed (or invested).
Interest: It is the price paid by a borrower for the use of lender’s money. It is the difference
between initial amount borrowed and end payment made to the lender.
Rate of Interest: It is the percentage of the sum borrowed which is charged for a defined length
of time for using the principal generally on a yearly basis.
Term of Loan: It is the defined length of time it will take for a loan to be completely paid off
when the borrower is making regular payments.
Meaning of EMI
EMI stands for equated monthly instalment. It is a monthly payment that we make towards a
loan we opted for at a fixed date of every month.
A loan is said to be amortized if it can be discharged by a sequence of equal payments (EMI)
made over equal periods of time. Each payment can be considered as consisting of two parts:

(i) Interest on the outstanding loan, and
(ii) Repayment of part of the loan
Thus, a loan is amortized when part of each periodic payment is used to pay interest and the
remaining part is used to reduce the principal.
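The split of each payment into an interest part and a principal part can be traced period by period. The sketch below is a minimal illustration with assumed figures (a loan of 10,00,000 at 0.75% per month repaid with an instalment of 12,668); the function name is hypothetical.

```python
def amortization_schedule(principal, rate_per_period, payment, periods):
    """Return a list of (period, interest_part, principal_part, balance)."""
    rows = []
    balance = principal
    for k in range(1, periods + 1):
        interest = balance * rate_per_period   # (i) interest on the outstanding loan
        repaid = payment - interest            # (ii) repayment of part of the loan
        balance -= repaid
        rows.append((k, interest, repaid, balance))
    return rows

for k, interest, repaid, balance in amortization_schedule(1_000_000, 0.0075, 12_668, 3):
    print(k, round(interest, 2), round(repaid, 2), round(balance, 2))
```

Because the outstanding balance falls after every payment, the interest part shrinks and the principal part grows from one instalment to the next.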
7.3.1 Methods of calculation of EMI or Instalment

EMI or Instalment can be calculated by two methods:
1. Flat Rate Method
2. Reducing-balance method or Amortization of Loan
Flat Rate Method: - In the flat-rate method, each interest charge is calculated based on the
original loan amount, even though the loan balance outstanding is gradually being paid down. The
EMI amount is calculated by adding the total principal of the loan and the total interest on the
principal together, then dividing the sum by the number of EMI payments, which is the number of
months during the loan term.
Let P, I and n be the principal of the loan, the total interest on the principal and the number of
months in the loan period respectively. EMI is given by the formula
$\text{EMI} = \dfrac{P + I}{n}$
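A minimal Python sketch of the flat-rate calculation follows. It assumes the total interest I is simple interest on the original principal for the whole term, which is what the flat-rate description implies; the function name is illustrative.

```python
def flat_rate_emi(principal, annual_rate, years):
    """Flat-rate EMI = (P + I) / n, with I = P * annual_rate * years
    and n = 12 * years monthly instalments."""
    total_interest = principal * annual_rate * years
    months = 12 * years
    return (principal + total_interest) / months

# Loan of 2,00,000 at 10% per annum for 5 years
print(round(flat_rate_emi(200_000, 0.10, 5), 2))
```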
Reducing-Balance Method or Amortization Formulas

When one is amortizing a loan, at the beginning of any period, the principal outstanding is the
present value of the remaining payments. Using this fact, we obtain the formulas in table that
describe the amortization of an interest bearing loan of P, at a rate i per period by n equal
payments of R each and such that a payment is made at the end of each period.
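Table: Amortization Formulas
1. Periodic payment or Instalment
$R = P\left[\dfrac{i}{1 - (1+i)^{-n}}\right] = \dfrac{P}{a_{\overline{n}|i}}$
2. Principal outstanding at the beginning of the kth period $= R\,a_{\overline{n-k+1}|i}$
$= R\left[\dfrac{1 - (1+i)^{-(n-k+1)}}{i}\right]$
3. Total interest paid = nR - P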

Where P= amount of the loan
R= size of equal payment
i = rate per period
n = number of equal payments
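The amortization formulas can be sketched in Python as follows, assuming payments at the end of each period as stated in the text. The function names are illustrative only.

```python
def emi(principal, rate_per_period, periods):
    """Periodic payment R = P * i / (1 - (1+i)**-n)."""
    return principal * rate_per_period / (1 - (1 + rate_per_period) ** -periods)

def outstanding_at_start(payment, rate_per_period, periods, k):
    """Principal outstanding at the beginning of period k:
    the present value of the remaining n - k + 1 payments."""
    remaining = periods - k + 1
    return payment * (1 - (1 + rate_per_period) ** -remaining) / rate_per_period

def total_interest(principal, rate_per_period, periods):
    """Total interest paid = n * R - P."""
    return periods * emi(principal, rate_per_period, periods) - principal

# 8,00,000 at 9% per annum compounded monthly for 25 years (300 months)
monthly = emi(800_000, 0.09 / 12, 300)
print(round(monthly, 2), round(total_interest(800_000, 0.09 / 12, 300)))
```

At the beginning of the first period (k = 1) all n payments remain, so the outstanding principal equals the loan amount P, which is a quick consistency check on the first two formulas.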
Example 12
Mr. X takes a loan of 2,00,000 with 10% annual interest rate for 5 years. Calculate EMI under Flat
Rate system.

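Solution: We are given that
P = 2,00,000
$I = \dfrac{10}{100} \times 2{,}00{,}000 \times 5 = 1{,}00{,}000$
n = 5 years = 5 x 12 = 60 months
EMI is given by the formula
$\text{EMI} = \dfrac{P + I}{n}$
$\text{EMI} = \dfrac{2{,}00{,}000 + 1{,}00{,}000}{60}$
$= \dfrac{3{,}00{,}000}{60} = 5000$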
Example 13
A couple wishes to purchase a house for 10,00,000 with a down payment of 2,00,000. If they
can amortize the balance at 9% per annum compounded monthly for 25 years, what is their
monthly payment? What is the total interest paid?
Solution: The monthly payment R needed to pay off the balance of ₹8,00,000 at 9% per annum
compounded monthly for 25 years (300 months) is given by
$R = \dfrac{P\,i}{1 - (1+i)^{-n}} = \dfrac{8{,}00{,}000 \times 0.0075}{1 - (1.0075)^{-300}} = 6713.57$
The total interest paid = nR - P

= (6713.57)(300) - 8,00,000
= 12,14,071
Example 14
Mr. M borrowed 10,00,000 from a bank to purchase a house and decided to repay by monthly
equal instalments in 10 years. The bank charges interest at 9% compounded monthly. The bank
calculated his EMI as ₹12,668. Find the principal and interest paid in the first year.
Solution: Principal left unpaid after one year (12 payments)
= present value of the remaining 108 payments $= R\,a_{\overline{n}|i}$

where R = 12,668, n = 108 and $i = \dfrac{0.09}{12} = 0.0075$
= 12,668 x 73.83916 = 9,35,395

Principal paid during first year = 10,00,000 - 9,35,395 = 64,605
Interest paid during first year
= (12,668 x 12) - 64,605
= 87,411
Exercise 7.3
1. Mohan takes a loan of 5,00,000 with 8% annual interest rate for 6 years. Calculate EMI under Flat-
Rate system.
2. XYZ company borrows 3,00,000 with 7% annual interest rate for 4 years. Calculate EMI under
Reducing Balance method.
3. Rajesh borrows 6,00,000 with 9% annual interest rate for 5 years. Calculate EMI under Reducing
Balance method.
4. A person amortizes a loan of 1,50,000 for a new home by obtaining a 10 year mortgage at the rate
of 12% compounded monthly. Find
(i) The monthly payments (ii) Total interest paid
5. A couple wishes to purchase a house for 12,00,000 with a down payment of 2,50,000. If they
can amortize the balance at 9% per annum compounded monthly for 20 years
(i) What is their monthly payment (ii) What is the total interest paid?
7.4 Nominal and Effective Rate of Interest:

Nominal Rate of Interest: The announced or stated rate of interest is called nominal rate of
interest.
Effective Rate of Interest: The actual rate by which the money grows during each year is called
the effective rate of interest.
Relation between effective rate of interest and nominal rate of interest:
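Let r be the nominal rate of interest converted m times in a year and $r_{\text{eff}}$ be the effective rate of
interest.
Then the rate per conversion period is $i = \dfrac{r}{m}$.
Then the principal P amounts in one year to $P(1+i)^m$

Since an effective rate is the actual rate compounded annually, at the effective rate $r_{\text{eff}}$
the principal P amounts in one year to $P(1 + r_{\text{eff}})$. Thus,

$P(1 + r_{\text{eff}}) = P(1+i)^m$
$1 + r_{\text{eff}} = (1+i)^m$
$r_{\text{eff}} = (1+i)^m - 1 = \left(1 + \dfrac{r}{m}\right)^m - 1$
If r is compounded continuously, then
$r_{\text{eff}} = \lim_{m \to \infty}\left[\left(1 + \dfrac{r}{m}\right)^m - 1\right]$
$= \lim_{m \to \infty}\left(1 + \dfrac{r}{m}\right)^m - 1$
$= \lim_{m \to \infty}\left[\left(1 + \dfrac{r}{m}\right)^{m/r}\right]^r - 1$
Let $\dfrac{r}{m} = x$, then as $m \to \infty$, $x \to 0$
Then $r_{\text{eff}} = \left[\lim_{x \to 0}(1+x)^{1/x}\right]^r - 1$
$= e^r - 1$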
Hence,
Relation between the nominal rate and effective rate
$r_{\text{eff}} = \left(1 + \dfrac{r}{m}\right)^m - 1$
where $r_{\text{eff}}$ = effective rate of interest

r = nominal rate of interest
m = number of conversion periods per year
In case of continuous compounding of nominal rate r, the effective rate of interest is
$r_{\text{eff}} = e^r - 1$
where $r_{\text{eff}}$ = effective rate of interest
r = nominal rate of interest
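Both relations can be sketched in Python to see how the effective rate approaches the continuous-compounding limit as the number of conversion periods grows. The nominal rate of 8% used in the loop is an assumed figure, and the function names are illustrative.

```python
from math import exp

def effective_rate(nominal, periods_per_year):
    """r_eff = (1 + r/m)**m - 1 for a nominal rate r converted m times a year."""
    return (1 + nominal / periods_per_year) ** periods_per_year - 1

def effective_rate_continuous(nominal):
    """r_eff = e**r - 1 for continuous compounding."""
    return exp(nominal) - 1

for m in (1, 2, 4, 12, 365):
    print(m, round(effective_rate(0.08, m), 6))
print("continuous", round(effective_rate_continuous(0.08), 6))
```

For a fixed nominal rate, the effective rate increases with m but never exceeds the continuous value.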
Example 15
Mr X took a loan of 2,000 for 6 months. Lender deducts 200 as interest while lending. Find the
effective rate of interest charged by lender.
Solution: Since the lender deducts ₹200 as interest while lending ₹2,000 for 6
months, the borrower actually receives ₹1,800, so ₹200 may be treated as interest on ₹1,800
for 6 months. Consequently, the interest rate per six months is
$i = \dfrac{200}{1800} = \dfrac{1}{9}$
Thus, the equivalent effective rate of interest, $r_{\text{eff}}$, is given by
$r_{\text{eff}} = (1+i)^2 - 1$
$= \left(1 + \dfrac{1}{9}\right)^2 - 1 = \dfrac{19}{81} \approx 0.2346$
= 23.46 %

Example 16
What effective rate is equivalent to a nominal rate of 8% converted quarterly?
Solution: When compounded quarterly we have r = 0.08 and m = 4
Using the formula, the effective rate $r_{\text{eff}}$ equivalent to the nominal rate is given by
$r_{\text{eff}} = \left(1 + \dfrac{r}{m}\right)^m - 1$
$= \left(1 + \dfrac{0.08}{4}\right)^4 - 1 = (1.02)^4 - 1$
$= 1.0824 - 1 = 0.0824$ or 8.24 %

Thus, the effective rate is 8.24%. This means that the rate 8.24% compounded annually yields
the same interest as the nominal rate 8% compounded quarterly.
Example 17
Mr. Y has two investment options - either at 10% per annum compounded semi-annually or 9.5 %
per annum compounded continuously. Which option is preferable and why?
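Solution: When compounded semi-annually we have r = 0.10, m = 2
Now, $r_{\text{eff}} = \left(1 + \dfrac{r}{m}\right)^m - 1$
$= \left(1 + \dfrac{0.10}{2}\right)^2 - 1 = (1.05)^2 - 1$
$= 0.1025$ or 10.25 %
When compounded continuously we have r = 0.095
$r_{\text{eff}} = e^r - 1 = e^{0.095} - 1$
$= 0.0997$ = 9.97 %
Thus, the first investment is preferable, since it has the higher effective rate.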
Example 18:
Find the effective rate of interest equivalent to a nominal rate of 6% compounded (i) Semi-annually
(ii) Quarterly (iii) Continuously
Solution:
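(i) When compounded semi-annually
We have r = 0.06 and m = 2
$r_{\text{eff}} = \left(1 + \dfrac{r}{m}\right)^m - 1 = \left(1 + \dfrac{0.06}{2}\right)^2 - 1$
$= 0.0609$ or 6.09 %
(ii) When compounded quarterly
We have r = 0.06 and m = 4
$r_{\text{eff}} = \left(1 + \dfrac{r}{m}\right)^m - 1$
$= \left(1 + \dfrac{0.06}{4}\right)^4 - 1 = (1.015)^4 - 1$
$= 0.0614$ or 6.14 %

(iii) When compounded continuously
$r_{\text{eff}} = e^r - 1 = e^{0.06} - 1$
$= 1.0618 - 1$
$= 0.0618$ or 6.18 %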
EXERCISE 7.4
1. What is the effective annual rate of interest compounding equivalent to a nominal rate of interest 5%
per annum compounded quarterly?
2. Which is the better investment, 3% per year compounded monthly or 3.1% per year compounded
quarterly?
3. What effective rate of interest is equivalent to a nominal rate of 8% converted quarterly?
4. To what amount will 12000 accumulate in 12 years if invested at an effective rate of 5%?
5. Which yields more interest: 8% effective or 7.8% compounded semi-annually?
7.5 Compound Annual Growth Rate

Meaning of Compound Annual Growth Rate
Compound annual growth rate (CAGR) depicts the cumulative performance of a particular
variable over a period of time via compounding effect. It is often used to evaluate the performance
of different investments by an individual or enterprise through annual rate of return. The basic
concept of compound growth rate can be explained with the help of following example:
If you had invested ₹1,000, and it grew at a compound rate of 10% annually,
Year 1: 1,000 + (1,000 x 10%) = 1,100
Year 2: 1,100 + (1,100 x 10%) = 1,210
Year 3: 1,210 + (1,210 x 10%) = 1,331
Year 4: 1,331 + (1,331 x 10%) = 1,464.10
So, the amount would be worth ₹1,464.10 after 4 years.
Formula for calculation of CAGR
$\text{CAGR} = \left[\left(\dfrac{EV}{SV}\right)^{1/n} - 1\right] \times 100$
where: EV = Investment’s ending value

SV = Investment’s starting value
n = Number of investment periods (months, years, etc.)
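The CAGR formula, and its inversion to find the number of periods, can be sketched in Python as follows. The function names are illustrative; the inverse n = log(EV/SV) / log(1 + CAGR/100) follows from taking logarithms of the formula.

```python
from math import log

def cagr(start_value, end_value, periods):
    """CAGR in percent: ((EV / SV) ** (1 / n) - 1) * 100."""
    return ((end_value / start_value) ** (1 / periods) - 1) * 100

def periods_for_cagr(start_value, end_value, cagr_percent):
    """Number of periods n such that SV grows to EV at the given CAGR."""
    return log(end_value / start_value) / log(1 + cagr_percent / 100)

# 1,000 growing to 1,464.10 in 4 years corresponds to 10% compounded annually
print(round(cagr(1000, 1464.10, 4), 2))
# Periods needed for 15,000 to grow to 25,000 at a CAGR of 8.88%
print(round(periods_for_cagr(15_000, 25_000, 8.88), 2))
```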
Example 19
Assume an investment's starting value is ₹10,000 and it grows to ₹60,000 in 4 years. Calculate
CAGR.
Solution:
$\text{CAGR} = \left[\left(\dfrac{60{,}000}{10{,}000}\right)^{1/4} - 1\right] \times 100$
$\text{CAGR} = (1.56508 - 1) \times 100$

Hence, CAGR = 56.51%

IMPORTANT POINTS
• CAGR is expressed in percentage
• CAGR can be used to compare historical returns in different investment portfolios
• CAGR smooths out the effects of volatility on periodic investments
Difference between Average Annual Growth rate and Compound Annual Growth Rate
Average Annual Growth Rate is calculated by dividing the cumulative return by the number of
years. It usually inflates the results. Compound Annual Growth Rate is determined by compounding
effect on the return or any variable taken into consideration. Many investors prefer CAGR because
it smoothens out the volatile nature of year-by-year growth rates and provides more accurate measure
of performance as compared to Average Annual Growth rate.
Use of Compound Annual Growth Rate

The CAGR can be used to calculate the average growth of a single investment. As we know, due
to market volatility, the year-to-year growth of an investment is likely to appear uneven. For example,
an investment may increase in value by 9% in one year, decrease in value by 3% the second year
and increase in value by 5% in the next. CAGR helps smooth returns when growth rates are
expected to be volatile and inconsistent.
CAGR is also used to track the performance of various business measures of one or multiple
companies alongside one another. For example, over a five-year period, a Retail Store’s market share
CAGR was 1.75%, but its customer satisfaction CAGR for the same period was -0.51%. Thus,
comparing the CAGRs of measures within a company reveals its strengths and weaknesses.
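The smoothing effect can be illustrated with the yearly changes of +9%, -3% and +5% mentioned above. The sketch below assumes a starting value of 1 unit and uses the text's description of the average annual growth rate (cumulative return divided by the number of years); the variable names are illustrative.

```python
yearly_growth = [0.09, -0.03, 0.05]   # +9%, -3%, +5%

value = 1.0                            # assumed starting value of 1 unit
for g in yearly_growth:
    value *= 1 + g                     # compound each year's change

n = len(yearly_growth)
cumulative_return = value - 1
average_annual_growth = cumulative_return / n * 100   # in percent
cagr = (value ** (1 / n) - 1) * 100                    # in percent

print(round(average_annual_growth, 2), round(cagr, 2))
```

The CAGR is the single constant yearly rate that would produce the same ending value, and here it comes out lower than the average annual growth rate.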
Example 20:
Calculate CAGR of unit sales on the basis of given information:

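Year    2012     2013     2014     2015     2016
Sales   53,000   60,786   73,450   86,000   105,000
Solution:
EV = 105,000 units, SV = 53,000 units, n = 4
$\text{CAGR} = \left[\left(\dfrac{EV}{SV}\right)^{1/n} - 1\right] \times 100$
$= \left[\left(\dfrac{105{,}000}{53{,}000}\right)^{1/4} - 1\right] \times 100$
$= \left[(1.98113)^{0.25} - 1\right] \times 100$
$= [0.18639] \times 100 = 18.64\%$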
Example 21:
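Suppose a person invested ₹15,000 in a mutual fund and the value of the investment at the time of
redemption was ₹25,000. If the CAGR for this investment is 8.88%, calculate the number of years for
which he has invested the amount.

Solution:
EV = 25,000, SV = 15,000, CAGR = 8.88%, n = ?
$\text{CAGR} = \left[\left(\dfrac{EV}{SV}\right)^{1/n} - 1\right] \times 100$
$8.88 = \left[\left(\dfrac{25{,}000}{15{,}000}\right)^{1/n} - 1\right] \times 100$
$0.0888 + 1 = (1.6667)^{1/n}$
$1.0888 = (1.6667)^{1/n}$
$\log(1.0888) = \dfrac{1}{n}\log(1.6667)$
$n = \dfrac{\log(1.6667)}{\log(1.0888)} = \dfrac{0.22185}{0.03695} \approx 6.004 \approx 6$ years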
Exercise 7.5
1. An investment has a starting value of 5000 and it grows to 25,000 in 4 years. What will be its
CAGR?
2. An investment has a starting value of 2000 and it grows to 18,000 in 3 years. What will be its
CAGR?
3. Calculate CAGR from the following data
Year          2015       2016       2017       2018
Revenue (₹)   3,00,000   3,50,000   4,00,000   4,50,000
4. Mr. Kumar has invested 20,000 in year 2014 for 5 years. If CAGR for that investment turned out
to be 11.84%. What will be the end balance?
5. Mr. Naresh has bought 200 shares of City Look Company at 100 each in 2015. After selling them
he has received 30,000 which accounts for 22.47% CAGR. Calculate the number of years for which
he was holding the shares.
7.6 Stock, shares and Debentures:

To start a big business or an industry a large amount of money is needed. It is beyond the
capacity of one or two persons to arrange such a huge amount. However, some persons associate
together to form a company. They, then, draft a proposal, issue a prospectus (in the name of the
company) explaining the plan of the project and invite the public, to invest money in this project.
They, thus pool up the funds from the public, by assigning them shares of the company.
Important facts and formulae

Stock capital: The total amount of money needed to run the company is called the stock capital.
Shares or Stock: The whole capital is divided into small units, called shares or stock.
For each investment, the company issues a share certificate, showing the value of each share and
the number of shares held by a person. The person who subscribes in stock or shares is called a
shareholder or stock holder.

For example: Reliance Industries Ltd., incorporated in the year 1973, is operating in Diversified
sector. Company has reported net profit after tax of 14,819.00 Crore in latest quarter. Reliance
Industries Ltd. share price moved up by 0.37% from its previous close of 2,002.85. Reliance Industries
Ltd. stock last traded price is 2,007.10. As on 31-12-2020, the company has a total of 676.21 Crore
shares outstanding.
Explanation: Reliance Industries Ltd. is engaged in issue of Equity Shares. The cost of 1 equity
share of this company is 2007.10 as recorded on 31.12.2020. Till this date, the company carries
with itself 676.21 crore shares outstanding.
For example: The board of directors of the UCO Bank, on 7th April 2021, approved the proposal
for the issue of equity shares on preferential basis to the Government of India against capital infusion
of 2,600 crore.
Explanation: UCO Bank issued preference shares worth ₹2,600 crore to the Government on 7th
April 2021. This means that the Government would hold all the preferential rights in the company,
i.e. payment of preference dividend, right to participate in meetings, etc. Also, if the Bank incurs
losses, the Government will be paid before the equity shareholders because it owns preference shares.
Debentures: The word ‘debenture’ is derived from the Latin word ‘debere’, which means to
owe. Debentures are written instruments of debt that companies issue under their common
seal. They are similar to a loan certificate.
Debentures are issued to the public as a contract of repayment of money borrowed from them.
These debentures are for a fixed period and a fixed interest rate that can be paid yearly or half-
yearly.
For example: L&T Finance Limited came up with a public issue of secured, redeemable
non-convertible debentures of face value ₹1,000 each for an amount of ₹500 crore on 16th
December 2019.
Explanation: L&T Finance Limited issued debentures worth ₹500 crore to the general public on
16th December 2019. The face value is ₹1,000 per debenture. These debentures are backed by
some fixed assets in the form of security and shall be redeemable after the said period. Also, these
debentures, being non-convertible, cannot be converted into preference shares or
equity shares.
Dividend: The annual profit distributed among shareholders is called dividend. Dividend is
usually paid annually as per share or as a percentage.
Face Value: The original value of a share or stock printed on the share certificate is called its face
value or nominal value or par value. The dividend is calculated as a percentage of face value.
Market value: The stocks of different companies are sold and bought in the open market through
brokers at stock-exchanges. A share or stock is said to be
(i) at premium or above par, if its market value is more than its face value.
(ii) At par, if its market value is the same as its face value.
(iii) At discount, if its market value is less than its face value.
Brokerage: The broker’s charge is called brokerage.
(i) when stock is purchased, brokerage is added to the cost price.
(ii) when stock is sold, brokerage is subtracted from the selling price.
Remember:
(i) The face value of a share always remains the same.

(ii) The market value of a share changes from time to time.
(iii) Dividend is always paid on the face value of a share.
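(iv) Number of shares held by a person
$= \frac{\text{Total investment}}{\text{Investment in 1 share}} = \frac{\text{Total income}}{\text{Income from 1 share}} = \frac{\text{Total face value}}{\text{Face value of 1 share}}$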
Features of Equity Shares:

 Equity Shares are permanent in nature.
 Equity shareholders are the owners of the company, and also bear the highest risk.
 They are transferable, i.e. ownership of equity shares can be transferred with or without
consideration to another person.
 Dividend payable to equity shareholders is an appropriation of profit.
 Equity shareholders may get a fixed or a fluctuating rate of dividend.
 Equity shareholders have the right to participate in and control the affairs of an organization.
 The liability of equity shareholders is limited to the extent of their investment in the company.
Features of Debentures
 Debentures are the instruments of debt, which means that debenture holders become creditors
of the company.
 Debentures are a certificate of debt, with the date of redemption and the amount of repayment
mentioned on it. This certificate is also known as a Debenture Deed.
 Debentures have a fixed rate of interest, and such interest amount is payable yearly or half-
yearly.
 Debenture holders are not entitled to any voting rights. This is because they are not instruments
of equity, so debenture holders are not owners of the company, only creditors.
The interest payable to these debenture holders is a charge against the profits of the company.
So these payments have to be made even in case of a loss.
Example 22
Find the cost of
(i) 7200, 8% stock at 90
(ii) 4500, 8.5% stock at 4 premium
(iii) 6400, 10% stock at 15 discount
Solution:
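(i) Cost of ₹100 stock = ₹90
Cost of ₹7200 stock = ₹ $\left(\frac{90}{100} \times 7200\right)$ = ₹6480
(ii) Cost of ₹100 stock = ₹(100 + 4) = ₹104
Cost of ₹4500 stock = ₹ $\left(\frac{104}{100} \times 4500\right)$ = ₹4680
(iii) Cost of ₹100 stock = ₹(100 − 15) = ₹85
Cost of ₹6400 stock = ₹ $\left(\frac{85}{100} \times 6400\right)$ = ₹5440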
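The three parts of Example 22 follow one rule: the cost of stock is its nominal amount multiplied by the quoted price per ₹100. A minimal sketch of that rule, with `stock_cost` as a hypothetical helper name:

```python
def stock_cost(nominal_amount, price_per_100):
    """Cost of buying stock quoted as a price per 100 of nominal value."""
    return nominal_amount * price_per_100 / 100

# (i) at 90, (ii) at 4 premium = 100 + 4, (iii) at 15 discount = 100 - 15
costs = [
    stock_cost(7200, 90),
    stock_cost(4500, 100 + 4),
    stock_cost(6400, 100 - 15),
]
print(costs)
```

A premium is added to 100 and a discount is subtracted from 100 before the function is called, exactly as in the worked solution.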

Example 23:
Which is the better investment:
7.5% stock at 105 or 6.5% stock at 94?
Solution: Let the investment in each case be ₹(105 × 94).
Case I: 7.5% stock at 105
On investing ₹105, income = ₹ $\frac{15}{2}$
On investing ₹(105 × 94), income
= ₹ $\left(\frac{15}{2} \times \frac{1}{105} \times 105 \times 94\right)$
= ₹705
Case II: 6.5% stock at 94
On investing ₹94, income = ₹ $\frac{13}{2}$
On investing ₹(105 × 94), income
= ₹ $\left(\frac{13}{2} \times \frac{1}{94} \times 105 \times 94\right)$
= ₹682.50
Clearly, the income from 7.5% stock at 105 is more.
Hence, the investment in 7.5% stock at 105 is better.
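The comparison in Example 23 amounts to computing the income that the same sum earns in each stock. The sketch below assumes, as the example does, that the dividend is paid on ₹100 of nominal stock bought at the quoted price; `income_on_investment` is a hypothetical helper name.

```python
def income_on_investment(dividend_pct, market_price, amount_invested):
    """Annual income when amount_invested buys stock quoted at market_price per 100."""
    return dividend_pct * amount_invested / market_price

common_sum = 105 * 94  # the same investment is used in both cases
income_case_1 = income_on_investment(7.5, 105, common_sum)
income_case_2 = income_on_investment(6.5, 94, common_sum)

better = "7.5% stock at 105" if income_case_1 > income_case_2 else "6.5% stock at 94"
print(income_case_1, income_case_2, better)
```

Choosing 105 × 94 as the common sum only keeps the arithmetic free of fractions; any other common sum gives the same ranking.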
Example 24
Find the cost of 96 shares of ₹10 each at ₹¾ discount, brokerage being ₹¼ per share.
Solution: Cost of 1 share = ₹[(10 − ¾) + ¼] = ₹ $\frac{19}{2}$
Cost of 96 shares = ₹ $\left(\frac{19}{2} \times 96\right)$
= ₹912
Example 25
A man sells ₹5000, 12% stock at 156 and invests the proceeds partly in 8% stock at 90 and partly in 9% stock
at 108. He thereby increases his income by ₹70. How much of the proceeds was invested in each
stock?
Solution: S.P. of ₹5000 stock
= ₹ $\left(\frac{156}{100} \times 5000\right)$
= ₹7800
Income from this stock
= ₹ $\left(\frac{12}{100} \times 5000\right)$ = ₹600
Let the investment in 8% stock be s, so that the investment in 9% stock is 7800 − s.
Therefore $\left(s \times \frac{8}{90}\right) + (7800 - s) \times \frac{9}{108} = 600 + 70$
⇒ $\frac{4s}{45} + \frac{7800 - s}{12} = 670$
⇒ 16s + 117000 − 15s = 670 × 180
⇒ s = 3600
Therefore, money invested in 8% stock at 90 = ₹3600
Money invested in 9% stock at 108
= ₹(7800 − 3600)
= ₹4200
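The linear equation in Example 25 can be solved exactly with rational arithmetic. This is a sketch under the example's own figures; the variable names are illustrative only.

```python
from fractions import Fraction

proceeds = Fraction(156, 100) * 5000      # selling price of the 5000 stock
old_income = Fraction(12, 100) * 5000     # income from the 12% stock
target_income = old_income + 70           # income after reinvesting

yield_8 = Fraction(8, 90)                 # income per rupee in 8% stock at 90
yield_9 = Fraction(9, 108)                # income per rupee in 9% stock at 108

# s * yield_8 + (proceeds - s) * yield_9 = target_income, solved for s
s = (target_income - proceeds * yield_9) / (yield_8 - yield_9)
in_8_percent = s
in_9_percent = proceeds - s
print(in_8_percent, in_9_percent)
```

Using `Fraction` avoids rounding, so the result matches the hand calculation term for term.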
Exercise 7.6
1. Find the cash required to purchase 3200, 7 ½ % stock at 107 (brokerage ½ %)
2. Find the cash realised by selling 2440, 9.5 % stock at 4 discount(brokerage ¼%)
3. Which is better investment 11% stock at 143 or 9 ¾% stock at 117
4. Find the income derived from 88 shares of 25 each at 5 premium, brokerage being ¼ per share
and the rate of dividend being 7 ½ % per annum. Also find the rate of interest on the investment.
5. A man buys 25 shares in a company which pays 9% dividend. The money invested is such that
it gives 10% on investment. At what price did he buy the shares?
7.7 Depreciation
The decrease in the value of assets such as buildings, machinery and equipment of all kinds
is called depreciation.
Scrap value, Residual value or salvage value: The value of a depreciable asset at the end of
its useful life is called the scrap value.
Total depreciation or wearing value: The difference between the original cost and the scrap
value is called total depreciation.
Book value: The difference between the original cost of the asset and the accumulated depreciation
at any given date is called the book value of that asset on that date.
Methods of computing the annual depreciation:

We will discuss the following three methods of computing the annual depreciation:
1. Straight line method
2. Sum of the years digit method
3. Written down value method or reducing balance method
Linear or Straight line method:

The linear method of depreciation is the simplest and the most widely used method of calculating
depreciation for fixed assets. Buildings, machinery, computers, automobiles and electronic items are
examples of assets that last for more than one year but do not last indefinitely. The value of such
assets decreases year by year because of the passage of time, wear and tear, obsolescence, accidents, etc. The
working efficiency of the asset decreases and the expenses on repairs increase. Under this method, a fixed percentage
of the original cost is written off every year. As a result, the amount of depreciation is uniform
every year.
According to this method the annual depreciation is given by
$D = \frac{C - S}{n}$
where D = the annual depreciation
C = the original cost of the asset
S = estimated scrap value or salvage value
n = the useful life in years
Remark: In the above formula, C-S is the total depreciation.
It should be noted that:
1. When rate of depreciation is given with the words per annum (e.g. 10% p.a.) and the date
of acquisition is given then Depreciation is charged only for the period for which the asset
is held.
2. When the date of acquisition is not given, then depreciation is charged for full year.
3. When rate of depreciation is given without the words per annum, then depreciation is
charged for the full year.
Advantages of the straight line method:
(i) It is a simple method of calculating depreciation.
(ii) In this method, the asset can be depreciated up to the estimated scrap value.
(iii) In this method, it is easy to know the amount of depreciation as it is uniform every year.
Sum of the years digit method: In this method, the fraction of the asset to be depreciated each
year is obtained by putting the digit of the year in reverse order over the sum of the digits of the
life periods. A greater fraction of the cost of the asset is depreciated in the earlier years of the life
of the asset.
Written down value method or reducing balance method: This method is also called the constant
percentage method or diminishing balance method. In this method, the annual depreciation is a
constant percentage of the book value of the depreciated asset at the end of the preceding year.
This constant percentage must be determined so that the book value of the asset at the end of
its estimated life is reduced to the scrap value. The book value at the end of the nth year is given by
$S = C(1 - r)^n$
where S = book value at the end of the nth year
C = original cost of the asset
r = rate of depreciation
Example 26
On 1st April, 2020, Ram purchased a machinery costing 40,000 and spent 5,000 on its erection.
The estimated effective life of the machinery is 10 years with a scrap value of 5,000. Calculate the
depreciation using the Linear/Straight line method with accounting year ending on 31st March,
2021.

Solution: Cost of the machinery = ₹40,000 + ₹5,000 = ₹45,000
Annual depreciation = $\frac{C - S}{n} = \frac{45000 - 5000}{10}$
= ₹4,000 p.a.
Example 27
A machine costing 30,000 is expected to have a useful life of 4 years and a final scrap value of
4000. Find the annual depreciation charge using the straight-line method. Prepare the depreciation
schedule.
C = ₹30,000; n = 4; S = ₹4,000
Annual depreciation = $\frac{C - S}{n} = \frac{30000 - 4000}{4}$
= ₹6,500
Depreciation schedule
Year   Annual depreciation (₹)   Accumulated depreciation (₹)   Book value (₹)
0      0                         0                              30,000
1      6,500                     6,500                          23,500
2      6,500                     13,000                         17,000
3      6,500                     19,500                         10,500
4      6,500                     26,000                         4,000
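The schedule in Example 27 can be generated row by row from the straight line formula D = (C − S)/n. The function name `straight_line_schedule` is a hypothetical helper used only for this sketch.

```python
def straight_line_schedule(cost, scrap, life_years):
    """Return rows of (year, annual depreciation, accumulated depreciation, book value)."""
    annual = (cost - scrap) / life_years
    rows = []
    for year in range(life_years + 1):
        accumulated = annual * year
        charge = 0 if year == 0 else annual  # no charge at the time of purchase
        rows.append((year, charge, accumulated, cost - accumulated))
    return rows

schedule = straight_line_schedule(30_000, 4_000, 4)
for row in schedule:
    print(row)
```

The book value in the last row equals the scrap value, which is a quick check that the schedule is consistent.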
Example 28
An asset costing 10,000 is expected to have a useful life of 4 years and a scrap value of zero. Find
the annual depreciation charge using the sum-of- the-years digits method.
C = 10,000 ; n = 4 ; S = 0
The annual depreciation charged each year is determined by putting the digits of the year in
reverse order over the sum of the digits of the life periods.

Sum of the digits of the life periods = 1 + 2 + 3 + 4 = 10
Depreciation schedule
Year   Digit of the year in reverse order   Fraction of the asset to be depreciated   Annual depreciation (₹)   Accumulated depreciation (₹)
1      4                                    4/10                                      4/10 × 10000 = 4000       4000
2      3                                    3/10                                      3/10 × 10000 = 3000       7000
3      2                                    2/10                                      2/10 × 10000 = 2000       9000
4      1                                    1/10                                      1/10 × 10000 = 1000       10000
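The sum-of-the-years-digits schedule of Example 28 can be reproduced as follows. This is a minimal sketch; `syd_schedule` is a hypothetical helper name.

```python
def syd_schedule(cost, scrap, life_years):
    """Return rows of (year, reverse digit, annual depreciation, accumulated depreciation)."""
    digit_sum = life_years * (life_years + 1) // 2   # 1 + 2 + ... + n
    depreciable = cost - scrap
    rows = []
    accumulated = 0
    for year in range(1, life_years + 1):
        reverse_digit = life_years - year + 1        # n, n-1, ..., 1
        charge = depreciable * reverse_digit / digit_sum
        accumulated += charge
        rows.append((year, reverse_digit, charge, accumulated))
    return rows

syd_rows = syd_schedule(10_000, 0, 4)
for row in syd_rows:
    print(row)
```

The charges fall each year, and they add up to the whole depreciable amount because the fractions 4/10, 3/10, 2/10 and 1/10 sum to 1.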
Example 29
A machine costing 50,000 depreciates at a constant rate of 8%. What is the depreciation charge
for the 8th year. If the estimated useful life of the machine is 10 years, determine its scrap value.
Solution: It is given that C = 50,000 and r = 0.08
The depreciation charge for the 8th year is obtained by subtracting the book value at the end of
the 8th year from the book value at the end of the 7th year
The book value at the end of the 7th year
$= 50,000 (1 - 0.08)^7$
$= 50,000 (0.92)^7$
= 50,000 (0.5578466)
= ₹27892.33
The book value at the end of the 8th year
$= 50,000 (1 - 0.08)^8$
$= 50,000 (0.92)^8$
= 50,000 (0.5132188)
= ₹25660.94
Hence the depreciation charge for the 8th year
= 27892.33 − 25660.94
= ₹2231.39
The scrap value of the machine is given by
$S = 50,000 (1 - 0.08)^{10}$
$= 50,000 (0.92)^{10}$
= 50,000 (0.4343884)
= ₹21719.42
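The written down value figures in Example 29 all come from the single formula S = C(1 − r)^n. A short sketch, with `book_value` as a hypothetical helper name:

```python
def book_value(cost, rate, years):
    """Book value after `years` under the reducing balance method: C * (1 - r) ** n."""
    return cost * (1 - rate) ** years

cost, rate = 50_000, 0.08
value_after_7 = book_value(cost, rate, 7)
value_after_8 = book_value(cost, rate, 8)
charge_year_8 = value_after_7 - value_after_8   # depreciation during the 8th year
scrap_value = book_value(cost, rate, 10)        # value at the end of the useful life

print(round(value_after_7, 2), round(value_after_8, 2))
print(round(charge_year_8, 2), round(scrap_value, 2))
```

The 8th-year charge is also 8% of the book value at the end of the 7th year, which gives the same figure.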

Exercise 7.7
1. A machine costing 30000 is expected to have a useful life of 13 years and a final scrap value of
4000. Find the annual depreciation charge using the straight line method.
2. An asset costing 15,000 is expected to have a useful life of 5 years and a scrap value of 3000.
Find the annual depreciation charge using the straight-line method.
3. A piece of machinery costing 10000 is expected to have a useful life of 4 years and a scrap value
of zero. Find the annual depreciation charge using the sum- of- the- years digits method.
4. A machine, the life of which is estimated to be 15 years, costs 40,000. Calculate the scrap value
at the end of its life if it is depreciated at a constant rate of 10% per annum.
5. A machine costing 5000 depreciates at a constant rate of 5%. What is the depreciation charge for
the 5th year?
6. A firm bought machinery for ₹7,40,000 on 1st April, 2018 and spent ₹60,000 on its installation.
Its useful life is estimated to be 5 years. Its realisable or scrap value at the end of the
period was estimated at ₹40,000. Find the amount of annual depreciation and the rate of depreciation.
7. Shiv & Co. purchased a mobile phone for 21,000 on 1st April, 2019. The estimated life of the mobile
phone is 10 years, after which its residual value will be 1,000 only. Find out the amount of annual
depreciation according to linear method.
8. On 1st April, 2015, Dreams Ltd. purchased an AC for 3,00,000 and incurred 21,000 towards
freight, 3,000 towards carriage and 6,000 towards installation charges. It has been estimated that
the machinery will have a scrap value of 30,000 at the end of the useful life which is four years.
What will be the annual depreciation and the value of machinery after four years according to linear
method?
ANSWERS
EXERCISE 7.1
1. 4000
2. 144000
3. 5%
4. 13,780
5. 1080
6. 373.60
7. 2408.19
8. 5968.8
9. 745
Exercise 7.2
1. 1,171.19
2. 911.53
3. 94,671
4. 5.66%
5. 1,201.20 963

EXERCISE 7.3
1. 10,278
2. 7,179
3. 12,455
4. 2,152.42, 108290.4
5. 8547.20, 1101376.17
EXERCISE 7.4
1. Effective annual rate of interest = 5.09 %
2. Better investment is 3.1 % per year compounded quarterly
3. Effective annual rate of interest = 8.24 %
4. 21560 approximately
5. First option
EXERCISE 7.5
1. 49.53%
2. 108%
3. 14.47%
4. 35,000
5. 2 years
EXERCISE 7.6
1. 3440
2. 2298
3. 9¾ % stock at 117 is better.
4. 165, 6.2%
5. 22.50
EXERCISE 7.7
1. 2000
2. 2400
3. 4000, 3000, 2000, 1000
4. 8224
5. 203.50
6. 1,52,000 p.a.; 19% p.a.
7. 2,000 p.a.
8. 75,000 p.a.; 3,30,000


UNIT 8

 Understand the concept of Linear Programming Problem.
 Know the Mathematical Formulation of Linear Programming Problem.
 Conceptualize the feasible region and infeasible region.
 Distinguish between the feasible solution and optimal solution.
 Find the optimal solution of LPP by Graphical Method.
 Know the meaning of Optimization.
Before you start you should know:
 Graphing a given linear equation or a linear inequality
 Knowledge of linear inequalities
 Solving the simultaneous linear equations.
 Finding the coordinates of intersection point of linear equations/ inequalities
CONTENT
 Introduction and related terminologies (constraints, objective function, optimization)
 Mathematical formulation of LPP
 Application of LPP on different types of real life situations
 Graphical method of solution for problems in two variables
Corner point method
Iso-profit/iso-cost method
 Feasible and Infeasible regions (Bounded and Unbounded)
 Feasible and Infeasible solution, optimal feasible solution (up to three non-trivial
constraints)
Linear Programming Problem 8.1

MIND MAP
[Mind map: Linear Programming Problem, with branches for decision variables, objective function, linear constraints and non-negative conditions]
8.0 INTRODUCTION
Most organizations, big or small, are concerned with the problem of planning and optimizing
their available resources to yield the maximum production (or to maximize profit) or, in some cases, to
minimize the cost of production. Such problems, when dealt with mathematically, are referred to as
problems of constrained optimization.
Linear Programming is one of the techniques for determining an optimal solution under
interdependent constraints and factors, in view of the available resources. It refers to a particular
plan of action chosen from amongst several alternatives for maximizing profit or production, or minimizing
the cost of production or transport, etc. The word linear indicates that all inequations or
equations used in a particular problem are linear.
Thus, a linear programming problem deals with the optimization (Minimization or Maximization)
of a linear function having number of variables; subject to a number of conditions on the variables
in the form of linear inequations or equations in the variables involved.
In this chapter, we shall discuss mathematical formulation of LPP and also learn graphical
method to solve it. We shall also try to understand and appreciate the wide applicability of LPP in
industry, commerce, management and sciences. The graphical method is used to optimize and find
possible solutions for an LPP in two-variables.
8.1 LINEAR PROGRAMMING PROBLEM:

A Linear programming problem (LPP) consists of three important components:
(i) Decision variables
(ii) The Objective function
(iii) The Linear Constraints
1. Decision Variable: - The decision variables refer to the limitations or the activities that are
competing with one another for sharing the available resources. These variables are usually inter-
related in terms of utilization of resources and need simultaneous solution. All the decision variables
are considered to be continuous, controllable and non-negative and represented as variables x, y etc.
2. The Objective function: - Every linear programming problem has an objective
that can be measured in quantitative terms, such as profit (sales) maximization, cost (time) minimization
and so on. The relationship among the variables representing the objective must be linear.
A linear objective function is a real valued function, represented as Z = ax + by, where a, b are arbitrary
constants and Z is to be maximized or minimized.
3. The Constraints: - There are always certain limitations (constraints) on the use of resources,
such as labor, space, availability of raw material or restrictions on transportation variables etc. that
limit the extent to which an objective can be achieved. Such constraints are expressed as linear
inequalities or equalities in terms of decision variables.
The conditions x ≥ 0, y ≥ 0 are called non-negative restrictions on the decision variables.
Basic Assumptions:
A Linear programming problem is based on the following four basic assumptions:
(i) Certainty: It is assumed that in LPP, all the parameters; such as availability of resources,
profit (or cost) contribution of a unit of decision variable and consumption of resources by
a unit decision variable must be known and fixed.
(ii) Divisibility (continuity): Another assumption of LPP is that the decision variables are continuous.
This means a combination of outputs can be used with the fractional values along with the
integer values.
(iii) Proportionality: This requires the contribution of each decision variable in both the objective
function and the constraints to be directly proportional to the value of the variable.
(iv) Additivity: The value of objective function and the total amount of each resources used must
be equal to the sum of the respective individual contributions (profit or cost) by decision
variables.

8.2 MATHEMATICAL FORMULATION OF A LINEAR PROGRAMMING PROBLEM
Let us take an example to understand how LPP is used to
solve real-life problems.
Rajat wishes to purchase a number of table-fans and sewing
machines. He has Rs.57600 to invest and has available space for
at most 20 items. A table-fan costs Rs. 360 and a sewing machine
costs Rs.240. Rajat wishes to sell one table-fan at a profit of Rs.22
and a sewing machine at a profit of Rs. 18.
Rajat is unsure how many table-fans and
sewing machines he should purchase with the available money
to get the maximum profit, assuming that he can sell all the items
which he buys.
To maximize the profit, let us suppose that Rajat purchases
x table-fans and y sewing machines; x and y
are the decision variables for the LPP.
Clearly, x ≥ 0 and y ≥ 0, which are
sometimes also referred to as trivial constraints.
Since Rajat has space for at most 20 items,
the total number of table-fans and sewing machines should be less than or equal to 20.
⇒ x + y ≤ 20 …(i)
Also, we are given that a table-fan costs Rs. 360 and a sewing machine costs Rs. 240.
Total cost of x table-fans and y sewing machine is (360x+240y)
Since he has only Rs. 57600 to invest,
the total cost of x table-fans and y sewing machines should be less than or
equal to Rs. 57600.
⇒ 360x + 240y ≤ 57600 …(ii)
Rajat can sell all the items that he buys; the profit is Rs. 22 on a table-fan and Rs. 18
on a sewing machine.
Total profit on x table-fans and y sewing machine is Rs. (22x +18y)
Let Z denote the total profit, which is to be maximized in this case
Therefore, the linear objective function Z = 22x +18y
The above situation gives the description of the type of a Linear programming Problem.
Hence the given LPP can be mathematically formulated as:
(Objective function) To maximize Z = 22x + 18y
Subject to constraints:
x ≥ 0, y ≥ 0
x + y ≤ 20
360x + 240y ≤ 57600
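Because Rajat buys whole items and the space limit is small, the formulation above can be checked by simply trying every whole-number purchase. This brute-force sketch is not the graphical method taught in this unit; `best_purchase` and its parameter names are hypothetical, with defaults taken from the figures stated in the text.

```python
def best_purchase(budget=57600, space=20, fan_cost=360, machine_cost=240,
                  fan_profit=22, machine_profit=18):
    """Try every whole-number (x, y) with x + y <= space and return (Z, x, y) with largest Z."""
    best = None
    for x in range(space + 1):
        for y in range(space + 1 - x):            # enforces x + y <= space
            if fan_cost * x + machine_cost * y <= budget:
                z = fan_profit * x + machine_profit * y
                if best is None or z > best[0]:
                    best = (z, x, y)
    return best

z, x, y = best_purchase()
print(f"Maximum Z = {z} at x = {x}, y = {y}")
```

With the figures as stated, 20 table-fans cost only Rs. 7200, so the space limit rather than the budget decides the answer; passing a smaller `budget` shows how the second constraint can become binding.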

8.3 TYPES OF LINEAR PROGRAMMING PROBLEMS
The application of LPP can be found in various daily life situations.
Some of the important LP problems we shall study are:
1. Manufacturing Problem
2. Diet problem
3. Transportation Problem
4. Assignment Problem
Manufacturing Problem - These problems involve the production and sale of different products
by a company. The production of the products requires optimization of labour force, machine hours,
raw material, storage space, etc. Different products are produced to satisfy the aforementioned
constraints and the investment available.
Diet Problem - Very often dieticians and nutritionists are required to prepare health and diet
charts. The objective of these diet charts is to include, at a reasonable cost, all the important kinds of nutrients that are
required by the human body to stay healthy. Thus, in diet problems, the constraints specify the
minimum amount of each nutrient required, and the objective is to minimize the cost of such a diet plan.
Transportation Problem - These problems are related to the study of efficient transportation
routes, i.e. how the product from different sources of production can be transported to
different markets such that the total transportation cost is minimized. Analysis of such problems is
crucial for big companies with several production plants and a widespread area to cater to. In this
type of problem, the constraints are the specific supply and demand patterns and the objective function
is the transportation cost, which is to be minimized.
Assignment Problem - Problems of this type are concerned with the completion of a particular task or assignment of a
company by choosing a certain number of employees to complete it within the required
deadline, given that a single person works on only one job within the assignment.
In this type of problem, the number of employees, the work-hours of each employee, etc. are
considered as constraints and the total assignment to be done is treated as the objective function.
Example 1
A furniture manufacturer makes two products: chairs and tables. Processing of these products is done
on two machines A and B. A Chair requires 2 hours on machine A and 6 hours on machine B. A
table requires 5 hours on machine A and no time on machine B. There are 16 hours per day available
on machine A and 30 hours on machine B. Profit gained by the manufacturer from a chair and a
table is Rs. 2 and Rs.10, respectively. Formulate this problem as a linear programming problem to
maximize the total profit of the manufacturer.
Solution:
The given problem can be tabulated as follows for convenience:
Machine Chair Table Available time
A 2 hours 5 hours 16 hours
B 6 hours 0 30 hours
Profit per unit Rs. 2 Rs. 10

Let x chairs and y tables be produced.
Then the total profit to be maximized is Z = 2x + 10y.
Since the number of chairs and tables cannot be negative,
x ≥ 0 and y ≥ 0.
A chair requires 2 hours on machine A and a table requires 5 hours on machine A,
so the total time used on machine A is (2x + 5y) hours. This must be less than or equal to the 16 hours available on machine A:
2x + 5y ≤ 16
Similarly, for machine B,
6x ≤ 30, or x ≤ 5
Hence the mathematical form of the given LPP is as follows:
Maximize Z = 2x + 10y
Subject to the constraints:
x ≥ 0, y ≥ 0
2x + 5y ≤ 16
x ≤ 5
Example 2:
A small manufacturing firm produces two types of gadgets A and B, which are first processed in
the foundry shop, and then sent to the machine shop for finishing. The number of man-hours of
labor required in each shop for the production of each unit of A and B, the number of man hours
the firm has available per week are as follows:
Gadget Foundry Machine-shop
A 10 5
B 6 4
Firm’s capacity per week 1000 600
The profit on the sale of gadget A is Rs. 30 per unit as compared with Rs. 20 per unit of gadget
B. Formulate this problem as LPP to maximize the total profit
Solution:
Let x and y be the weekly production of gadgets A and B, respectively.
Therefore, the total profit is Z = 30x + 20y.
Since the weekly production of gadgets A and B cannot be negative,
x ≥ 0 and y ≥ 0.
It is given that 10 and 6 man-hours of labor are required in the foundry shop for the production of each
unit of gadgets A and B, respectively.
Therefore, the total man-hours of labor required in the foundry shop for the production of x units of A
and y units of B is (10x + 6y).
But the firm’s total foundry capacity per week is 1,000 man-hours of labor.
So,
10x + 6y ≤ 1000
⇒ 5x + 3y ≤ 500
Similarly, for finishing in the machine shop,
5x + 4y ≤ 600
Hence the mathematical form of the given LPP is as follows:
Maximize Z = 30x + 20y
Subject to the constraints:
x ≥ 0, y ≥ 0
5x + 3y ≤ 500
and 5x + 4y ≤ 600
Example 3
A firm is engaged in breeding pigs. The pigs are fed on various products grown on the farm. In view
of the need to ensure certain nutrients constituents (call them X, Y and Z), It is necessary to buy two
additional products, say A and B. One unit of product A contains 36 units of nutrient X, 3 units of
nutrient Y and 20 units of nutrient Z. One unit of product B contains 6 units of nutrient X, 12 units
of nutrient Y and 10 units of nutrient Z. The minimum requirement of nutrients X, Y and Z is 108
units, 36 units and 100 units respectively. Product A costs 20 per unit and product B costs 40
per unit. Formulate the above as a linear programming problem to minimize total cost.
Solution: Let x and y be the number of units of products A and B purchased, respectively.
Therefore, total cost = 20x + 40y.
The given data can be tabulated as follows:
Nutrient constituent   Content in product A   Content in product B   Minimum amount of nutrient
X                      36                     6                      108
Y                      3                      12                     36
Z                      20                     10                     100
Cost of product        Rs. 20                 Rs. 40
Making use of the above information, the appropriate mathematical formulation of the linear
programming problem is:
Minimize Z = 20x + 40y
Subject to the constraints: x ≥ 0, y ≥ 0
36x + 6y ≥ 108 ⇒ 6x + y ≥ 18
3x + 12y ≥ 36 ⇒ x + 4y ≥ 12
20x + 10y ≥ 100 ⇒ 2x + y ≥ 10
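A formulation like the one in Example 3 can be tested by checking whether a proposed purchase meets every nutrient requirement and what it costs. The sketch below uses the data of Example 3; the names `nutrients`, `is_feasible` and `total_cost` are hypothetical helpers, and the trial points are chosen only for illustration.

```python
MINIMUM = {"X": 108, "Y": 36, "Z": 100}   # minimum requirement of each nutrient

def nutrients(x, y):
    """Nutrient totals supplied by x units of product A and y units of product B."""
    return {"X": 36 * x + 6 * y, "Y": 3 * x + 12 * y, "Z": 20 * x + 10 * y}

def is_feasible(x, y):
    """True when (x, y) is non-negative and meets every minimum requirement."""
    supplied = nutrients(x, y)
    return x >= 0 and y >= 0 and all(supplied[k] >= MINIMUM[k] for k in MINIMUM)

def total_cost(x, y):
    """Objective function Z = 20x + 40y."""
    return 20 * x + 40 * y

for point in [(4, 2), (2, 2), (0, 18)]:
    print(point, is_feasible(*point), total_cost(*point))
```

A feasibility check of this kind does not find the minimum by itself; it only confirms whether a candidate point lies in the feasible region.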
Example 4
There is a factory located at each of the two places P and Q. From these locations, a certain commodity
is delivered to each of the three depots situated at A, B and C. The weekly requirements of the depots
are respectively 5, 5 and 4 units of the commodity, while the production capacities of the factories at
P and Q are 8 and 6 units respectively. The cost of transportation per unit is given below:

Costs (in Rs)
From/To   A    B    C
P         16   10   15
Q         10   12   10
How many units should be transported from each factory to each depot in order that the
transportation cost is minimum. Formulate above as a linear programming problem.
Solution: The given problem can be represented diagrammatically as follows:
[Figure: factories P and Q supplying depots A, B and C]
Let the factory at P transport x units of the commodity to the depot at A and y units to the depot at B.
Since the quantities transported are always non-negative, x ≥ 0 and y ≥ 0.
Also, the factory at P has a capacity of 8 units of the commodity.
Therefore, the remaining (8 − x − y) units will be transported to the depot at C.
Clearly, 8 − x − y ≥ 0
⇒ x + y ≤ 8
Since the weekly requirement of the depot at A is 5 units of the commodity and x units are
transported from the factory at P,
the remaining (5 − x) units are to be transported from the factory at Q.
Similarly, (5 − y) units of the commodity will be transported from the factory at Q to the depot
at B.
But the factory at Q has a capacity of 6 units only, therefore the remaining
6 − (5 − x + 5 − y) = x + y − 4 units will be transported to the depot at C.
As the quantities transported to the depots at A, B and C are always non-negative,
5 − x ≥ 0, 5 − y ≥ 0, and x + y − 4 ≥ 0
⇒ x ≤ 5,
y ≤ 5,
and x + y ≥ 4
The transportation costs from the factory at P to the depots at A, B and C are respectively Rs. 16x,
10y and 15(8 − x − y).
Similarly, the transportation costs from the factory at Q to the depots at A, B and C are respectively
Rs. 10(5 − x), 12(5 − y) and 10(x + y − 4).
Therefore, the total transportation cost Z is given by:
Z = 16x + 10y + 15(8 − x − y) + 10(5 − x) + 12(5 − y) + 10(x + y − 4)
= x − 7y + 190
Hence, the above LPP can be stated mathematically as follows:
Minimize Z = x − 7y + 190
Subject to the constraints:
x ≥ 0, y ≥ 0
x + y ≤ 8
x + y ≥ 4
x ≤ 5, y ≤ 5
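The simplification of the transportation cost to Z = x − 7y + 190 can be verified by building the full shipment table for given x and y and adding up the route costs. This is a sketch using the data of Example 4; the names `shipments`, `total_cost` and `simplified_cost` are hypothetical helpers.

```python
COST = {"P": {"A": 16, "B": 10, "C": 15},
        "Q": {"A": 10, "B": 12, "C": 10}}   # cost per unit on each route, in Rs

def shipments(x, y):
    """Units sent on every route when P sends x units to A and y units to B."""
    return {"P": {"A": x, "B": y, "C": 8 - x - y},
            "Q": {"A": 5 - x, "B": 5 - y, "C": x + y - 4}}

def total_cost(x, y):
    """Sum of (units sent * cost per unit) over all six routes."""
    plan = shipments(x, y)
    return sum(plan[f][d] * COST[f][d] for f in COST for d in COST[f])

def simplified_cost(x, y):
    """The simplified objective function from the worked solution."""
    return x - 7 * y + 190

plan = shipments(3, 2)
print(plan, total_cost(3, 2), simplified_cost(3, 2))
```

For any x and y, the shipments leaving P add up to 8, those leaving Q add up to 6, and each depot receives exactly its requirement, which is why only two decision variables are needed.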
Example 5
A company has two groups of inspectors, namely group A and group B, who are assigned to do quality
inspection work. It is required that at least 1800 pieces are inspected in an 8-hour day. It is known that
inspectors of group A can check pieces at the rate of 25 per hour with an accuracy of 98%, while
inspectors of group B can check at the rate of 15 pieces per hour with an accuracy of 95%. The
inspectors of group A and B are paid Rs 40 and Rs 30 per hour respectively to do the work. Each
time an error is made by an inspector, it costs the company Rs 20. The company
has 8 inspectors in group A and 10 in group B. The company wants to determine the optimal
assignment of inspectors to minimise the total inspection cost. Formulate an LPP.
Solution:
Let x and y be the number of inspectors of group A and group B, respectively, assigned to the
inspection work.
The data of the given problem can be summarized as follows:
                                  Group A Inspector    Group B Inspector
Number of inspectors              8                    10
Rate of checking per hour         25 pieces            15 pieces
Inaccuracy in checking            1 – 0.98 = 0.02      1 – 0.95 = 0.05
Cost of inaccuracy in checking    Rs. 20               Rs. 20
Wage rate per hour                Rs. 40               Rs. 30
The hourly cost of each inspector, including the expected cost of errors, is given by:
Group A Inspector: Rs. (40 + 20 × 0.02 × 25) = Rs. 50
Group B Inspector: Rs. (30 + 20 × 0.05 × 15) = Rs. 45
Each inspector works an 8-hour day, so the total daily inspection cost is
Z = 8 × 50x + 8 × 45y = 400x + 360y.
At least 1800 pieces must be inspected in a day, so 8 × 25x + 8 × 15y ≥ 1800, i.e., 5x + 3y ≥ 45.
Using the above information, the appropriate LPP is
Minimize Z = 400x + 360y
Subject to the constraints:
5x + 3y ≥ 45
x ≤ 8, y ≤ 10
x ≥ 0, y ≥ 0
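The hourly cost figures of Rs. 50 and Rs. 45 combine the wage with the expected cost of errors. A minimal sketch of that arithmetic follows; the function name hourly_cost and its parameter names are hypothetical, chosen only for this illustration.

```python
def hourly_cost(wage, pieces_per_hour, accuracy_percent, cost_per_error):
    """Wage plus the expected cost of the errors made in one hour."""
    # Expected errors per hour = pieces checked x error rate.
    expected_errors = pieces_per_hour * (100 - accuracy_percent) / 100
    return wage + cost_per_error * expected_errors

cost_a = hourly_cost(wage=40, pieces_per_hour=25, accuracy_percent=98, cost_per_error=20)
cost_b = hourly_cost(wage=30, pieces_per_hour=15, accuracy_percent=95, cost_per_error=20)
print(cost_a, cost_b)
```

Passing the accuracy as a whole-number percentage keeps the arithmetic free of rounding noise.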
8.4 SOLVING A LINEAR PROGRAMMING PROBLEM

In this section we are going to learn how to solve an LPP. Let us first understand a few terms used
while solving it.
Solution: A set of values of the decision variables xj (j = 1, 2, …, n) which satisfies the constraints
of an LP problem is said to constitute a solution to that LP problem.
Feasible Solution: A set of values of the decision variables xj (j = 1, 2, …, n) which satisfies all the
constraints and the non-negativity conditions of an LP problem is said to constitute a feasible
solution to that LP problem.
In other words, a solution that also satisfies the non-negativity restrictions of an LPP is called a
feasible solution.
Infeasible Solution: A set of values of the decision variables xj (j = 1, 2, …, n) which does not satisfy
all the constraints and non-negativity conditions of an LP problem is said to constitute an infeasible
solution to that LP problem.
Feasible region: The feasible region is the common region determined by all the constraints, including
the non-negativity constraints, of an LPP. Every point in this region is a feasible solution of the given
LPP.
Optimal Feasible Solution: A feasible solution of an LPP that optimizes (maximizes or minimizes)
the objective function is called an optimal solution of the LPP. At times, an LPP can have no solution
or more than one optimal solution.
Theorem 1: Let R be the feasible region for a linear programming problem and let
Z = ax + by be the objective function.
When Z has an optimal value (maximum or minimum), where the variables x and y are subject
to constraints described by linear inequalities, this optimal value must occur at a corner point
of the feasible region.
(A corner point of a feasible region is a point in the region which is the intersection of two
boundary lines.)
Theorem 2: Let R be the feasible region for a linear programming problem, and let Z = ax + by
be the objective function.
(i) If R is bounded, then the objective function Z has both a maximum and a minimum value
on R and each of these occurs at a corner point of R.
(ii) If R is unbounded, then a maximum or a minimum value of the objective function may not exist.
However, if it exists, it must occur at a corner point of the feasible region.
An LPP can be solved using many methods. In the next section we shall learn to solve a given
LPP using the graphical method.

8.5 GRAPHICAL METHOD OF SOLVING LINEAR PROGRAMMING PROBLEM:
In the previous sections we learnt how to formulate a linear programming problem in mathematical
form. The next step is to solve the problem to get the optimal solution of the given LPP.
In this unit we shall focus on solving linear programming problems with only two variables
using the graphical method, as it provides a pictorial representation of the solution
process and a great deal of insight into the basic concepts.
The following methods are used to solve LP problems graphically:
(i) Corner-Point Method
(ii) Iso-Profit or Iso-Cost Method
8.5.1 CORNER - POINT METHOD

In this method, the coordinates of all corner (extreme) points of the feasible region are determined
and the values of the objective function at these points are computed, because the mathematical
theory of LP states that if an LP problem has an optimal solution, then it lies at one of the corner
points of the feasible region.
This method consists of the following steps:
(i) Formulate the given LPP in mathematical form.
(ii) Draw the X-axis and Y-axis on the graph paper. The non-negativity restrictions,
i.e., x ≥ 0, y ≥ 0, imply that the values of the variables x and y can lie only
in the first quadrant.
(Note: The feasible region will always be in the first quadrant.)
(iii) Plot the inequality constraints on the graph and decide the area of the feasible
region according to the inequality sign of the constraints.
(Note: To determine the region represented by an inequation, replace x and y both by zero.
If the inequation reduces to a valid statement, then the region containing the origin is
the region represented by the given inequation; otherwise it is the region not containing
the origin.)
(iv) Shade the common region of the graph that satisfies all the constraints. The common region
is called the feasible region of the given LPP. Any point on or inside the feasible region is the
feasible solution of the given LPP. The feasible region can be bounded (closed) or unbounded
(open) as shown below:

(v) Now determine the coordinates of corner points of the feasible region
(vi) Now evaluate the objective function Z at each corner point of the feasible region. The point
where the objective function attains its optimum (maximum or minimum) value is the optimal
solution of the given LP problem.
Now let us discuss the two possibilities of feasible region in detail:
Case (i) If the feasible region of an LPP is bounded:

In this case the objective function has both a maximum value and a minimum value, and each
occurs at a corner point of the feasible region. For example:
Corner Points    Z = x + 2y
A (2, 3)         8  (Minimum)
B (1, 7)         15
C (4, 9)         22 (Maximum)
Case (ii) If the feasible region of an LPP is unbounded:

In this case the objective function may or may not have a maximum or a minimum value. If such a
value exists, it occurs at a corner point of the feasible region. For example:
Corner Points    Z = x + 2y
A (1, 5)         11
B (3, 5)         13
C (5, 8)         21 (largest value, M)
D (0, 2)         4  (smallest value, m)

Let M and m be the largest and the smallest values of Z at the corner points. In order to check
whether M and m are actually the maximum and minimum values of Z, we proceed as follows:
1) Draw the line ax + by = M and find the open half plane ax + by > M.
If the open half plane represented by ax + by > M has no point in common with the unbounded
feasible region, then M is the maximum value of Z; otherwise Z has no maximum value.
2) Draw the line ax + by = m and find the open half plane ax + by < m.
If the open half plane represented by ax + by < m has no point in common with the unbounded
feasible region, then m is the minimum value of Z; otherwise Z has no minimum value.
We shall now illustrate these steps of Corner Point Method by considering some examples:
Example 6
Solve the following Linear Programming Problem graphically:
Maximize Z = 5x + 3y
Subject to the constraints:
3x + 5y ≤ 15
5x + 2y ≤ 10
x ≥ 0, and y ≥ 0
Solution:
By plotting the given linear inequalities, we can see that the boundary line 3x + 5y = 15 of the
inequality 3x + 5y ≤ 15 meets the co-ordinate axes at the points (5, 0) and A (0, 3) respectively.
Also, the boundary line 5x + 2y = 10 of the inequality 5x + 2y ≤ 10 meets the co-ordinate axes at
the points C (2, 0) and (0, 5) respectively.
Graph (i)
As shown in graph (i), the shaded bounded region OABCO represents the common region of
the above inequations. This region is the feasible region of the given LPP.
The coordinates of the vertices (corner points) of the shaded bounded feasible region are
O (0, 0), A (0, 3), B (20/19, 45/19) and C (2, 0).
These points have been obtained by solving the equations of the corresponding intersecting lines
simultaneously. The values of the objective function at these points are given in the following table:
Corner Points    Coordinates        Objective Function Z = 5x + 3y
O                (0, 0)             0
A                (0, 3)             9
B                (20/19, 45/19)     235/19 (Max.)
C                (2, 0)             10
Clearly, Z is maximum at B (20/19, 45/19).
Hence, x = 20/19, y = 45/19 is the optimal solution of the given LPP.
The optimal (maximum) value of Z is 235/19 when x = 20/19 and y = 45/19.
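The corner-point steps used in this example can also be sketched in code. The following is an illustrative sketch only: the helper names corner_points and objective are hypothetical, and the approach (intersect every pair of boundary lines, keep the feasible intersections, evaluate Z at each) is one straightforward way to mimic the graphical procedure for a small two-variable LPP.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint a*x + b*y <= c is stored as (a, b, c).
# x >= 0 and y >= 0 are written as -x <= 0 and -y <= 0.
CONSTRAINTS = [(3, 5, 15), (5, 2, 10), (-1, 0, 0), (0, -1, 0)]

def corner_points(constraints):
    """Intersect every pair of boundary lines and keep the feasible points."""
    points = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundary lines never meet
        # Cramer's rule for a1*x + b1*y = c1, a2*x + b2*y = c2.
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x + b * y <= c for a, b, c in constraints):
            points.add((x, y))
    return sorted(points)

def objective(x, y):
    return 5 * x + 3 * y

corners = corner_points(CONSTRAINTS)
best = max(corners, key=lambda p: objective(*p))
for p in corners:
    print(p, objective(*p))
print("maximum", objective(*best), "at", best)
```

Exact fractions are used so that the corner B is obtained as (20/19, 45/19) rather than as a rounded decimal.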

Example 7
Maximize Z = 2x + 4y
Subject to the constraints:
x + 2y ≤ 5
x + y ≤ 4
x ≥ 0 and y ≥ 0
Solution: By plotting the given linear inequalities, we can see that the boundary line x + 2y = 5 meets
the co-ordinate axes at the points (5, 0) and A (0, 2.5) respectively.
Similarly, the boundary line x + y = 4 meets the co-ordinate axes at the points C (4, 0) and (0, 4)
respectively.
Graph (ii)
As shown in the graph above, the shaded bounded region OABCO represents the common
region of the above inequations. This region is the feasible region of the given LPP.
The coordinates of the vertices (corner points) of the shaded feasible region are O (0, 0), A (0, 2.5),
B (3, 1) and C (4, 0).
The values of the objective function at these corner points are given in the following table:
Corner Points    Coordinates    Objective Function Z = 2x + 4y
O                (0, 0)         0
A                (0, 2.5)       10 (Max.)
B                (3, 1)         10 (Max.)
C                (4, 0)         8
Clearly, Z is maximized at two corner points, A (0, 2.5) and B (3, 1).
Hence, every point on the line segment joining A and B gives the maximum value
Z = 10 of the objective function.
The optimal (maximum) value of Z is 10, attained when x = 0 and y = 2.5, when x = 3 and y = 1, and
at every point of the segment AB in between.
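That every point of the segment AB gives the same value of Z can be checked directly. This is a small sketch under the assumption that points of the segment are written as A + t(B – A) with 0 ≤ t ≤ 1; the helper names z and point_on_segment are hypothetical.

```python
from fractions import Fraction

A = (Fraction(0), Fraction(5, 2))
B = (Fraction(3), Fraction(1))

def z(x, y):
    return 2 * x + 4 * y

def point_on_segment(t):
    """Point A + t(B - A) for 0 <= t <= 1."""
    return (A[0] + t * (B[0] - A[0]), A[1] + t * (B[1] - A[1]))

# Sample eleven equally spaced points from A (t = 0) to B (t = 1).
values = [z(*point_on_segment(Fraction(k, 10))) for k in range(11)]
print(values)
```

The reason is that the objective line 2x + 4y = 10 is parallel to (in fact coincides with) the boundary line x + 2y = 5 on which A and B lie.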

Example 8:
Minimise Z = x + 2y
Subject to the constraints:
2x + y ≥ 3
x + 2y ≥ 6
x ≥ 0 and y ≥ 0
Show that the minimum of Z occurs at more than two points.
Graph (iii)
Solution: By plotting the given linear inequalities, we can see that the boundary line x + 2y = 6 meets
the co-ordinate axes at the points A (6, 0) and B (0, 3) respectively.
Also, the boundary line 2x + y = 3 meets the co-ordinate axes at the points (3/2, 0) and (0, 3)
respectively.
As shown in the graph above, the shaded feasible region is unbounded.
The coordinates of the vertices (corner points) of the shaded feasible region are A (6, 0) and
B (0, 3).
The values of Z at the corner points are as follows:
Corner Points    Coordinates    Objective function Z = x + 2y
A                (6, 0)         6 (Min.)
B                (0, 3)         6 (Min.)
The value of Z is the same, namely 6, at both corner points A and B.
As the feasible region is unbounded, we check the open half plane x + 2y < 6. Every point of the
feasible region satisfies x + 2y ≥ 6, so this half plane has no point in common with the feasible
region and 6 is the minimum value of Z.
Hence the minimum value of Z occurs at more than two points, i.e., every point lying
on the line segment AB minimizes the objective function.

Example 9
Maximize Z = 6x + y
Subject to the constraints:
2x + y ≥ 3
y – x ≥ 0
x ≥ 0, and y ≥ 0
Solution: By plotting the given linear inequalities, we can see that the boundary line 2x + y = 3 meets
the co-ordinate axes at the points (1.5, 0) and (0, 3) respectively.
Similarly, the boundary line y – x = 0 passes through the origin O (0, 0).
Graph (iv)
As shown in the graph above, the shaded feasible region is unbounded.

The coordinates of the vertices (corner points) of the shaded feasible region are A (0, 3) and B (1, 1).
The values of the objective function at these points are given in the following table:
Corner Points    Coordinates    Objective Function Z = 6x + y
A                (0, 3)         3
B                (1, 1)         7 (largest)
From this table, we find that 7 is the largest value of Z at the corner points, attained at B (1, 1).
As the feasible region is unbounded, 7 may or may not be the maximum value of Z.
To decide this issue, we graph the inequality 6x + y > 7 on the same graph and check whether the
resulting open half plane has points in common with the feasible region or not.
As shown in the figure, the red line 6x + y = 7 passes through the corner point B (1, 1), and the
open half plane 6x + y > 7 has points in common with the feasible region; for example, the point
(2, 2) is feasible and gives Z = 14.
Hence Z has no maximum value on the feasible region, and the given LP problem has no optimal
solution (the solution is unbounded).
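The half-plane check can be illustrated numerically by exhibiting feasible points whose Z value exceeds 7. The sketch below reads the second constraint as y – x ≥ 0, an assumption that is consistent with the corner points A (0, 3) and B (1, 1) listed above; the function names feasible and z are hypothetical.

```python
def feasible(x, y):
    """Constraints of Example 9, with the second one read as y - x >= 0."""
    return 2 * x + y >= 3 and y - x >= 0 and x >= 0 and y >= 0

def z(x, y):
    return 6 * x + y

# Walk along the ray (t, t), which runs along the boundary line y = x.
for t in (1, 2, 5, 10, 100):
    assert feasible(t, t)
    print((t, t), z(t, t))
```

Since Z = 7t along this ray, Z grows without limit inside the feasible region, which is why no maximum exists.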

Example 10:
Maximize Z = x + y
Subject to the constraints: x, y ≥ 0
x – y ≤ –1
x ≥ y
Solution:
Graph (v)
Plotting the graph, we can see that there is no region common to all the given constraints: the
constraint x – y ≤ –1 requires y ≥ x + 1, while x ≥ y requires y ≤ x.
Hence the given LPP has no feasible solution, and Z cannot be maximized.
Example 11
Minimize Z = 3x + 5y
Subject to the constraints: x, y ≥ 0
x + 3y – 3 ≥ 0
x + y – 2 ≥ 0
Solution: The feasible region determined by the system of constraints x + 3y ≥ 3, x + y ≥ 2, and
x, y ≥ 0 is given below:
Graph (vi) (boundary lines x + 3y = 3 and x + y = 2)

Here, the feasible region is unbounded.
The corner points of the feasible region are A (3, 0), B (3/2, 1/2) and C (0, 2).
The values of Z at these corner points are given below:
Corner Points    Coordinates    Objective Function Z = 3x + 5y
A                (3, 0)         9
B                (3/2, 1/2)     7 (smallest)
C                (0, 2)         10
As the feasible region is unbounded and we wish to minimize Z, we draw the graph of the inequality
3x + 5y < 7 and check whether the resulting open half plane has any point in common with the
feasible region or not.
The line 3x + 5y = 7 passes through the corner point B (3/2, 1/2), and the open half plane
3x + 5y < 7 has no point in common with the feasible region.
Therefore, the corner point B (3/2, 1/2) minimizes Z, and the minimum value of Z is 7, attained
when x = 3/2, y = 1/2.
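The conclusion that no feasible point gives Z < 7 can be cross-checked algebraically, without the graph. The sketch below is not the graphical method itself but a supplementary check: it looks for non-negative multipliers u and v (hypothetical names) such that u(x + y) + v(x + 3y) equals 3x + 5y, and then uses x + y ≥ 2 and x + 3y ≥ 3.

```python
from fractions import Fraction

# Match coefficients in u*(x + y) + v*(x + 3y) = 3x + 5y:
#   u + v  = 3   (coefficient of x)
#   u + 3v = 5   (coefficient of y)
v = Fraction(5 - 3, 3 - 1)   # subtract the first equation from the second
u = 3 - v
assert u >= 0 and v >= 0     # multipliers must be non-negative

# Every feasible point has x + y >= 2 and x + 3y >= 3, hence Z >= 2u + 3v.
lower_bound = u * 2 + v * 3

# Value of Z at the corner point B (3/2, 1/2).
z_at_b = 3 * Fraction(3, 2) + 5 * Fraction(1, 2)
print(u, v, lower_bound, z_at_b)
```

Because the lower bound on Z over the whole feasible region equals the value of Z at B, the corner point B indeed gives the minimum.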
8.5.2 ISO-PROFIT/ ISO-COST METHOD:

In this section we are going to learn another method to solve a given LP problem. Iso-Profit
method is another way to find the optimal solution by using the slope of the objective function line
(or equation).
An iso-profit (or iso-cost) line is a collection of points which designate solutions with the same
value of the objective function. By assigning various values to Z, we get different profit (cost) lines.
Graphically, many such lines can be plotted parallel to each other.
The steps of the iso-profit (iso-cost) method are as follows:
1. Formulate the given LPP in mathematical form.
2. Identify the feasible region and the extreme (corner) points of the feasible region (as discussed
in the Corner-Point method).
3. Give some convenient value to Z and draw the line so obtained in the xy-plane.
4. If the objective function is to be maximized, then draw lines parallel to the line in step 3.
5. Obtain a line which is farthest from the origin and has at least one point common to the
feasible region.
6. If the objective function is to be minimized, then draw lines parallel to the line in step 3 and obtain
a line which is nearest to the origin and has at least one point common to the feasible region.
7. Find the co-ordinates of the common point obtained in step 5 or step 6. The point so obtained
determines the optimal solution, and the value of the objective function at this point gives the
optimal value.
Example 12
Maximize Z = 15x + 10y
Subject to 4x + 6y ≤ 360
3x ≤ 180
5y ≤ 200
x ≥ 0, and y ≥ 0

Solution:
Graph (vii)
To begin with, the inequality constraints are treated as equations and plotted, as shown in the
above figure.
The bounded feasible area is formed by considering the area to the lower left side of each line
(towards the origin). A family of lines that represents various levels of the objective function is
drawn (black lines in the figure).
These lines are called iso-profit lines.
(Note: For choosing an arbitrary value for Z, we can use the LCM (a, b), or a multiple of it, for the
objective function Z = ax + by.)
Let us select an arbitrary value of Z, say 300.
Hence, the equation of the iso-profit line becomes 15x + 10y = 300.
This equation can be plotted in the same manner as the constraint lines were plotted. The
line is then moved parallel to itself, away from the origin, as far as possible while it still has a
point in common with the feasible region. The last point it touches is the corner B.
The coordinates of corner point B can be read from the graph or can be computed as the
intersection of the two lines 4x + 6y = 360 and 3x = 180.
The coordinates x = 60 and y = 20 of corner point B satisfy the given constraints, and the total
profit obtained is Z = 1100.
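The sliding of the iso-profit line can be mimicked with a short calculation. In the sketch below, the list of corner points is an assumption of this illustration: it was worked out by hand from the constraints 4x + 6y ≤ 360, 3x ≤ 180, 5y ≤ 200 and x, y ≥ 0, and only the corner (60, 20) is named in the text. The function names profit and line_touches_region are hypothetical.

```python
# Corner points of the feasible region (hand-computed for this sketch).
CORNERS = [(0, 0), (60, 0), (60, 20), (30, 40), (0, 40)]

def profit(x, y):
    return 15 * x + 10 * y

corner_values = [profit(x, y) for x, y in CORNERS]
z_min, z_max = min(corner_values), max(corner_values)

def line_touches_region(z):
    """The line 15x + 10y = z meets a bounded convex region exactly when
    z lies between the smallest and largest corner values."""
    return z_min <= z <= z_max

# Try increasing profit levels, as when sliding the line away from the origin.
for z in (300, 600, 900, 1100, 1200):
    print(z, line_touches_region(z))
```

The largest level for which the line still touches the region is the maximum profit, which here agrees with the value at corner B.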
Example 13

Minimize Z = 18x + 10y
Subject to 4x + y ≥ 20
2x + 3y ≥ 30
and x ≥ 0 and y ≥ 0

Solution: As shown in the graph below, the feasible region of the LPP is unbounded.
Graph (viii)
Give Z a value, say 180 (equal to 2 times the LCM of 18 and 10), to obtain the line 18x + 10y = 180.
This line meets the co-ordinate axes at (10, 0) and (0, 18).
Join these points by a black line. Move this line parallel to itself in the decreasing direction, towards
the origin, until it passes through only one point of the feasible region. Clearly, PQ is such a line,
passing through the vertex B of the feasible region. The coordinates of B are obtained by solving the
equations 4x + y = 20 and 2x + 3y = 30.
Solving these equations, we get x = 3 and y = 8.
Putting x = 3 and y = 8 in the objective function Z = 18x + 10y, we get Z = 134.
The minimum value of Z is 134 at x = 3 and y = 8.
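The arithmetic of this example (the trial value of Z, the coordinates of B and the value of Z there) can be reproduced as follows. This is a minimal sketch; the helper names lcm and intersect are hypothetical, and the other two corner points (0, 20) and (15, 0) used for comparison were computed by hand from the constraints.

```python
from fractions import Fraction
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def intersect(line1, line2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    (a1, b1, c1), (a2, b2, c2) = line1, line2
    det = a1 * b2 - a2 * b1
    return (Fraction(c1 * b2 - c2 * b1, det), Fraction(a1 * c2 - a2 * c1, det))

def z(x, y):
    return 18 * x + 10 * y

trial_z = 2 * lcm(18, 10)                    # convenient first iso-cost line
x, y = intersect((4, 1, 20), (2, 3, 30))     # vertex B
print(trial_z, (x, y), z(x, y))

# Z at the other corner points, for comparison with the value at B.
print(z(0, 20), z(15, 0))
```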
8.6 CHECK YOUR PROGRESS

1) To maintain his health, a person must fulfil certain minimum daily requirements for several
kinds of nutrients. Assume that there are only three kinds of nutrients (calcium, protein
and calories) and that the person’s diet consists of only two food items, I and II, whose prices and
nutrient contents are shown in the table below:
Nutrients         Food I      Food II     Minimum daily
                  (per lb)    (per lb)    requirement
Calcium           10          4           20
Protein           5           5           20
Calories          2           6           13
Price (in Rs.)    0.60        1.00
What combination of the two food items will satisfy the daily requirement and entail the least cost?
Formulate this problem as an LPP.

2) Vitamins A and B are found in two different foods F1 and F2. One unit of food F1 contains
2 units of Vitamin A and 3 units of Vitamin B, and one unit of food F2 contains 4 units of Vitamin
A and 2 units of Vitamin B. One unit of food F1 and F2 cost 5 and 2.5 respectively. The minimum
daily requirements for a person of Vitamin A and B are 40 and 50 units respectively. Assuming that
anything in excess of the daily minimum requirement of Vitamin A and B is not harmful, find out the
optimum mixture of food F1 and F2 at the minimum cost which meets the daily minimum requirement
of Vitamin A and B. Formulate this problem as an LPP.
3) A brick manufacturer has two depots, A and B with stocks of 30,000 and 20,000 bricks
respectively. He receives orders from three builders P, Q and R for 15,000, 20,000 and 15,000 bricks
respectively. The cost in Rs. of transporting 1,000 bricks to the builders from the depots are given
below:
From\ To P Q R
A 40 20 30
B 20 60 40
How should the manufacturer fulfil the orders so as to keep the cost of transportation minimum?
Formulate the above problem as linear programming problem.
4) A Cooperative Society of farmers has 50 hectares of land to grow two crops X and Y. The
profits from crops X and Y per hectare are estimated as 10,500 and 9,000 respectively. To control
weeds, a liquid herbicide has to be used for crops X and Y at the rate of 20 litres and 10 litres per
hectare. Further, no more than 800 litres of herbicide should be used in order to protect fish and wild
life using a pond which collects drainage from this land. How much land should be allocated to each
crop so as to maximize the total profit of the society? Formulate the above problem as a linear
programming problem.
5) A company has two grades of inspectors, I and II, to undertake quality control inspection. At
least 1,500 pieces must be inspected in an 8-hour day. A Grade I inspector can check 20 pieces in an
hour with an accuracy of 96%. A Grade II inspector checks 14 pieces an hour with an accuracy of
92%. Wages of a Grade I inspector are Rs. 5 per hour while those of a Grade II inspector are Rs. 4 per
hour. Any error made by an inspector costs Rs. 3 to the company. If there are, in all, 10 Grade I
inspectors and 15 Grade II inspectors in the company, find the optimal assignment of inspectors that
minimizes the daily inspection cost. Formulate the above problem as a linear programming problem.
6) Solve the following Linear Programming Problems graphically:
i. Maximize Z = 3x + 2y
Subject to the constraints: –2x + y ≤ 1,
x ≤ 2,
x + y ≤ 3, and x ≥ 0, y ≥ 0
ii. Minimize Z = 5x – 2y
Subject to the constraints: 2x + 3y ≤ 1,
and x ≥ 0, y ≥ 0
iii. Minimize Z = –x + 2y
Subject to the constraints: –x + 3y ≤ 10,
x + y ≤ 6,
x – y ≤ 2,
and x ≥ 0, y ≥ 0

iv. Maximize Z = –x + 2y
Subject to the constraints: –0.5x + y ≤ 2,
x – y ≤ –1,
and x ≥ 0, y ≥ 0
v. Maximize Z = 5x + 4y
Subject to the constraints: x – 2y ≤ 1,
x + 2y ≥ 6,
x – y ≤ 3,
and x ≥ 0, y ≥ 0
Solve the following Linear Programming Problem graphically by using Iso-cost method:
7) Minimize Z = 4x – 2y
Subject to the constraints: x + y ≤ 14,
2x + y ≤ 24,
3x + 2y ≥ 36,
and x ≥ 0, y ≥ 0
8) Maximize Z = 3x + 9y
Subject to the constraints: x + 4y ≤ 8,
x + 2y ≤ 4,
and x ≥ 0, y ≥ 0
9) Maximize Z = 3x + 2y
Subject to the constraints: –2x + y ≤ 1,
x + y ≤ 3,
x ≤ 2,
and x ≥ 0, y ≥ 0
CHECK YOUR PROGRESS ANSWERS

1) Minimize Z = 0.60x + 1.00y
Subject to the constraints: 10x + 4y ≥ 20
5x + 5y ≥ 20
2x + 6y ≥ 13
and x ≥ 0, y ≥ 0
2) Minimize Z = 5x + 2.5y
Subject to the constraints: 2x + 4y ≥ 40
3x + 2y ≥ 50
and x ≥ 0, y ≥ 0
3) Minimize Z = 30x – 30y + 1800
Subject to the constraints: x + y ≤ 30
x ≤ 15, y ≤ 20, x + y ≥ 15
and x ≥ 0, y ≥ 0

4) Maximize Z = 10500x + 9000y
Subject to the constraints: x + y ≤ 50
2x + y ≤ 80
and x ≥ 0, y ≥ 0
5) Minimize Z = 59.20x + 58.88y
Subject to the constraints: x, y ≥ 0
x ≤ 10, y ≤ 15
160x + 112y ≥ 1500
6) i. x = 2, y = 1, max. Z = 8
ii. x = 0, y = 1/3, min. Z = –2/3
iii. x = 2, y = 0, min. Z = –2
iv. Multiple optimal solutions: max. Z = 4 at (0, 2) and (2, 3) and at every point on the
line segment joining them.
v. No solution (unbounded solution)
7) x = 8, y = 6, min. Z = 20
8) x = 0, y = 2, max. Z = 18
9) x = 2, y = 1, max. Z = 8
8.7 UNIT SUMMARY

1. Linear programming problem deals with the optimization (Minimization or Maximization) of
a linear function of a number of variables subject to a number of conditions on the variables
in the form of linear inequations or equations in variables involved.
2. A Linear programming problem (LPP) consists of three important components:
(i) Decision variables
(ii) The Objective function
(iii) The Linear Constraints
3. The decision variables refer to the activities that are competing with one
another for sharing the available resources.
4. A linear objective function is a real-valued function of the form Z = ax + by, where a and b are
constants, which is to be maximized or minimized.
5. The conditions x ≥ 0, y ≥ 0 are called the non-negativity restrictions on the decision variables.
6. Some of the important LP problem we shall study are:
a. Manufacturing Problem
b. Diet problem
c. Transportation Problem
d. Assignment Problem
7. A set of values of the decision variables xj (j = 1, 2, …, n) which satisfies the constraints of an LP
problem is said to constitute a solution to that LP problem.

A set of values of the decision variables xj (j = 1, 2, …, n) which satisfies all the constraints and
the non-negativity conditions of an LP problem is said to constitute a feasible solution to that LP
problem.
8. A set of values of the decision variables xj (j = 1, 2, …, n) which does not satisfy all the constraints
and non-negativity conditions of an LP problem is said to constitute an infeasible solution to
that LP problem.
9. Feasible region is the common region determined by all the constraints including non-negative
constraints of a LPP and every point in this region is the feasible solution of the given LPP.
10. A feasible solution of a LPP that optimizes (maximizes or minimizes) the objective function
is called the optimal solution of the LPP.
11. An LPP can have no solution or more than one optimal solution.
12. Theorem: Let R be the feasible region for a linear programming problem and let Z = ax + by
be the objective function. When Z has an optimal value (maximum or minimum), where the
variables x and y are subject to constraints described by linear inequalities, this optimal value
must occur at a corner point of the feasible region.
(A corner point of a feasible region is a point in the region which is the intersection of two
boundary lines.)
13. Theorem: Let R be the feasible region for a linear programming problem, and let Z = ax + by
be the objective function.
If R is bounded, then the objective function Z has both a maximum and a minimum value
on R and each of these occurs at a corner point of R.
If R is unbounded, then a maximum or a minimum value of the objective function may not exist.
However, if it exists, it must occur at a corner point of the feasible region.
14. There are two methods to solve an LPP graphically:
i. Corner-Point Method
ii. Iso-Profit or Iso-cost method


Unit 9
LEARNING OUTCOMES
After completion of the unit the students will be able to:
• Plot the different graphs of functions on Excel.
• Learn different operations performed on the matrix.
• Learn how to perform simulation.
• Comprehend practical applications of demand and supply associated with economics.
• Understand the meaning of Data Analysis and Data Visualization.
• Apply Analytical Methods for business decision-making.
• Analyse different data and develop meaningful inferences for decision-making.
Practical and Project Work

Assessment Plan
1. Overall Assessment of the course is out of 100 marks.
2. The assessment plan consists of an External Exam and Internal Assessment.
3. The External Exam will be a pen-and-paper test of 3 hours' duration carrying 80 marks.
4. The weightage of the Internal Assessment is 20 marks. Internal Assessment can be a
combination of activities spread throughout the semester/academic year. Internal
Assessment activities include projects and excel based practicals. Teachers can choose
activities from the suggested list of practicals or they can plan activities of a similar
nature. For data-based practical, teachers are encouraged to use data from local sources
to make it more relevant for students.
5. Weightage for each area of internal assessment may be as under:
Sl. No.   Area and Weightage          Assessment Area                              Marks allocated
1.        Project work (10 marks)     Project work and record                      5
                                      Year-end Presentation/Viva of the Project    5
2.        Practical work (10 marks)   Performance of practical and record          5
                                      Year-end test of any one practical           5
          Total                                                                    20

Practical: Use of spreadsheet
Learning Outcomes
Students shall be able to:
• Draw the graph of an exponential function
• Comprehend demand and supply functions on Excel
• Study the nature of a function at various points
• Perform matrix operations using Excel
Suggested practicals using the spreadsheet
• Plot the graphs of functions on Excel and study the graph to find out the point of
maxima/minima
• Probability and dice roll simulation
• Matrix multiplication and the inverse of a matrix
• Stock market data sheet on Excel
• Collect data on weather, price, inflation or pollution. Analyse the data and make
meaningful inferences
• Collect relevant data from newspapers on traffic, sports, stock markets and use Excel to
study future trends
About Project Work

Some suggested project works are given below (according to the syllabus). For more details, you
can refer to the syllabus. The project work may be conducted individually or in a group (3 to 4
students). Students select a topic related to the practical implementation of mathematics in different
domains. Project work includes data analysis, predicting market trends for better decision making,
making inferences based on collected data, doing simulations, and finding the best solutions to
real-time problems. You can collect data sets of various domains from different platforms like
Kaggle (https://www.kaggle.com, the world’s largest data science community), Google Public Datasets,
Amazon Web Services (AWS) datasets, and many more for data analysis and better decision
making. Students may also work on various projects based on the prediction of future sales, stock
markets, air pollution, the COVID-19 outbreak, etc.
Evaluation of each project should be based on the learning outcomes and the practical implementation
of its results. The objective of the project should be clear to every student. Students should explore
different approaches for solving the problem in order to get the best, optimized solution to it.
Teachers should guide the students in making their projects successful.
List of Suggested Projects (Class XII)

1. COVID-19 Data Analysis: pre-process and merge datasets to calculate the needed measures and
prepare them for an analysis. In this, you can work with the COVID-19 dataset published
by Johns Hopkins University, which consists of data on the cumulative number of
confirmed cases, per day, in each country.
2. Earthquake prediction using past data.
3. The Project Manager uses cost trend analysis to identify project budget under- and overruns
and to solve many budget issues. One of the responsibilities of a project manager is to manage
the project budget. This covers estimating the initial budget (or, in most cases, at least reviewing it),
creating a monthly budget plan, tracking actual spend, and then re-forecasting the
budget required (known as the estimate to complete – ETC).
The objective of conducting this analysis is to identify where a project will under- or overrun
the budget, so a project manager should be completing budget reviews and re-forecasts
regularly. The data needed for this is:
• List of projects
• Budget for each project
• Previous month's Year to Date (YTD) Actuals for each project
• Current month's YTD Actuals for each project
• Estimate to Complete (ETC) for each project
This data should be produced by the project monthly, so obtaining it should not present an
issue.
4. Another popular project is to obtain product and pricing data from e-commerce sites for data
analysis and decision making. For instance, extract product information about Bluetooth
speakers on Amazon, or collect reviews and prices of various tablets and laptops. You
can start with a product that has a small number of reviews, analyse the
previous feedback and reviews on that particular product, and then scale up according to the
parameters you choose for better sales.
5. Global Suicide Rates Analysis, taking the dataset from Kaggle
(https://www.kaggle.com/russellyates88/suicide-rates-overview-1985-to-2016). It covers suicide rates
in various countries, with additional data including year, gender, age, population, GDP, and more.
When carrying out your EDA (Exploratory Data Analysis), ask yourself: What patterns can you see? Are
suicide rates climbing or falling in various countries? What variables (such as gender or age)
can you find that might correlate to suicide rates?
6. Most Followed on Instagram: Whether you’re interested in social media, or celebrity and
brand culture, this dataset of most-followed people on Instagram
https://data.world/socialmediadata/most-followed-on-instagram has great potential for
visualization. You could create an interactive bar chart that tracks changes in the most
followed accounts over time. Or you could explore whether brand or celebrity accounts are
more effective at influencer marketing. Otherwise, why not find another social media dataset
to create a visualization?
7. Superstore Sales Forecasting Data: Areas such as product placement, inventory management,
and customization of offers, are sought to improve constantly through the application of data
science. Collect the data of various stores and on basis of historical data predict the sales. The
goal is to predict the department-wise sales of each store using the historical data spanning
across 143 weeks.
8. Forecasting Uber/ Rapido Demands
Demand and Supply functions on Excel

A spreadsheet provides various types of charts to display demand and supply information.
Which chart to use depends on the kind of analysis you are going
to perform. If you need to find the equilibrium of supply and demand, a line graph gives
the best results and visualization, but if you want to show the variations between supply and
demand, a column chart is more suitable.

For creating different charts on supply and demand, the steps are as follows:
1. Open a new spreadsheet and add the data in different columns, as in the image shown below,
i.e. price, quantity demanded and quantity supplied.
Fig 1: Supply and Demand Data
2. Select the columns of price, quantity demanded, and quantity supplied, click on the Insert
tab, and choose a line chart from the chart group.
Fig 2: Line Chart
3. After you choose a line chart, it will appear with the data values plotted against the x-axis and
y-axis in different colors. By default, the price would be on the x-axis.

Fig 3: Supply and Demand Line Chart
With the same data, you can also visualize the data by using a scatter plot. The scatter chart is
shown below. The supply and demand curves intersect at a point, which is the equilibrium point,
and it is visible in both charts.
Fig 4: Supply and Demand Scatter Chart
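The equilibrium point read off the chart can also be computed from the table of values. The sketch below is written in Python rather than as a spreadsheet formula, and its data are invented purely for illustration (they are not the values of Fig 1). The function name equilibrium is hypothetical; it assumes the table is sorted by price and uses straight-line interpolation between the two rows where demand and supply cross.

```python
# Hypothetical data: price, quantity demanded, quantity supplied.
PRICES = [10, 20, 30, 40, 50]
DEMAND = [100, 80, 60, 40, 20]
SUPPLY = [10, 40, 70, 100, 130]

def equilibrium(prices, demand, supply):
    """Return (price, quantity) where demand equals supply, or None."""
    rows = list(zip(prices, demand, supply))
    for (p0, d0, s0), (p1, d1, s1) in zip(rows, rows[1:]):
        e0, e1 = d0 - s0, d1 - s1          # excess demand at each price
        if e0 == 0:
            return p0, d0                  # curves meet exactly at a data row
        if e0 * e1 < 0:                    # sign change: curves cross in between
            t = e0 / (e0 - e1)             # fraction of the way from row 0 to row 1
            return p0 + t * (p1 - p0), d0 + t * (d1 - d0)
    if rows and rows[-1][1] == rows[-1][2]:
        return rows[-1][0], rows[-1][1]
    return None

price, quantity = equilibrium(PRICES, DEMAND, SUPPLY)
print(price, quantity)
```

With real data the crossing may fall between two rows, which is why the sketch interpolates instead of only looking for an exact match.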

Matrix multiplication
Before going to matrix multiplication, we need to know the basics of a matrix.
A matrix is an arrangement of numbers in rows and columns.
For example, Matrix A has two rows and two columns.
1. In Microsoft Excel, the MMULT function is used for multiplying two matrices. Let’s take
two 3x3 matrices, A and B, as shown in the figure below.
Fig 5: Two 3x3 Matrices A and B
2. To multiply two matrices A and B, the number of columns in the first matrix must be equal
to the number of rows in the second matrix. Select a 3x3 range of cells for the result and
type the MMULT() function. As you can see in the figure below, once you type the MMULT
function it asks for array1 and array2.
Fig 6: MMULT () function

Array 1 is the first matrix and Array 2 is the second matrix. After opening the bracket of the
MMULT function, select the cells of Matrix A (A2:C4) and then the cells of Matrix B (G2:I4),
separated by a comma, as shown in fig 7 and 8. The formula becomes =MMULT(A2:C4, G2:I4).
Fig 7: Array 1(Matrix A)
Fig 8: Array 2(Matrix B)
3. To calculate the result, press the key combination Ctrl+Shift+Enter and you will get the
desired output, as you can see in fig 9.
Note: Do not press Enter alone. Pressing Enter alone shows only one value instead of the whole matrix.

Fig 9: Matrix Multiplication using MMULT ( ) Function
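The row-by-column rule behind MMULT can be sketched in a few lines of Python. This is an illustrative sketch only: the two matrices below are assumed values, not the ones in Fig 5, and the helper name `mmult` is chosen simply to mirror the Excel function.

```python
def mmult(a, b):
    """Multiply matrix a by matrix b (both given as lists of rows)."""
    if len(a[0]) != len(b):
        raise ValueError("columns of the first matrix must equal rows of the second")
    # Entry (i, j) is the sum of products of row i of a with column j of b.
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Assumed 3x3 matrices for illustration.
matrix_a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
matrix_b = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]

product = mmult(matrix_a, matrix_b)
for row in product:
    print(row)
# [30, 24, 18]
# [84, 69, 54]
# [138, 114, 90]
```

The check at the top of `mmult` is the same condition stated in step 2: the number of columns of the first matrix must equal the number of rows of the second.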
Finding the Inverse matrix using MS Excel

1. To calculate the inverse of a given matrix, the MINVERSE function is used. First, create
the 3x3 matrix A shown below in fig 10. Enter the values in rows and columns.
Fig 10: 3×3 Matrix
2. Once you have created the matrix, type the MINVERSE function in the highlighted cells where
you want to place the resulting matrix. For the array, select the cells of Matrix A (B2:D4),
so that the formula is =MINVERSE(B2:D4). You can see this in fig 11 below.

Fig 11: MINVERSE Function
3. To calculate the result, press the key combination Ctrl+Shift+Enter and you will get the
desired output, as you can see in fig 12.
Note: Do not press Enter alone. Pressing Enter alone shows only one value instead of the whole matrix.
Fig 12: Inverse of Matrix
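For readers who want to see what an inverse computation involves, here is a minimal Python sketch using Gauss-Jordan elimination with exact fractions. It is not a description of how Excel implements MINVERSE; the matrix is an assumed example (not the one in Fig 10) and the name `minverse` is used only for illustration.

```python
from fractions import Fraction


def minverse(matrix):
    """Return the inverse of a square matrix using Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment the matrix with the identity matrix: [A | I].
    aug = [
        [Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular and has no inverse")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot entry becomes 1.
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    # The right half of the augmented matrix is now the inverse.
    return [row[n:] for row in aug]


# Assumed 3x3 matrix for illustration; its determinant is 3.
matrix_a = [[2, 0, 1], [1, 1, 0], [0, 1, 1]]
inverse = minverse(matrix_a)
for row in inverse:
    print([str(v) for v in row])
```

Multiplying `matrix_a` by `inverse` gives the identity matrix, which is a quick way to check any result obtained from MINVERSE as well.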

How to Calculate Probability Using MS Excel?
Probability is the branch of mathematics that measures how likely an event is to happen.
MS Excel has a built-in function, PROB, for calculating probability; it is classified under
the MS Excel statistical functions. Probability is applied in different domains such as business,
sports and financial analysis, for example in estimating long-term business losses and gains.
Formula:
=PROB(x_range, prob_range, [lower_limit], [upper_limit])
• x_range – the range of numeric values containing our data
• prob_range – the range of probabilities for each corresponding value in x_range
• lower_limit – the lower limit of the values for which we want to calculate the probability
• upper_limit – the upper limit of the values for which we want to calculate the probability
Let’s take an example to understand the use of the built-in function PROB for calculating
probability.
The table below contains grades and their corresponding probabilities. Here we set the lower
limit to 50 and the upper limit to 80.
Fig 13: Table for Grades
1. After entering the data in the table, type the PROB function in the formula bar along with
its arguments. Select the cells A2:A7 for the grades and B2:B7 for the corresponding
probabilities, with lower limit 50 and upper limit 80, i.e. =PROB(A2:A7, B2:B7, 50, 80).

Fig 14: PROB function
2. After applying the formula, you will get the result, 0.95, in the desired cell, i.e. cell C2.
Fig 15: Result
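The calculation that PROB performs can be sketched in Python as adding up the probabilities of all values lying between the two limits, both inclusive. The grades and probabilities below are hypothetical (the table of Fig 13 is not reproduced here), so the result differs from the 0.95 shown in the figure; the function name `prob` is illustrative.

```python
def prob(x_range, prob_range, lower_limit, upper_limit=None):
    """Sum the probabilities of the values between lower_limit and upper_limit.

    Both limits are inclusive. If upper_limit is omitted, the probability of
    the value equal to lower_limit is returned.
    """
    if upper_limit is None:
        upper_limit = lower_limit
    return sum(
        p for x, p in zip(x_range, prob_range) if lower_limit <= x <= upper_limit
    )


# Hypothetical grades and probabilities (the probabilities add up to 1).
grades = [40, 50, 60, 70, 80, 90]
probabilities = [0.05, 0.10, 0.25, 0.30, 0.20, 0.10]

result = prob(grades, probabilities, 50, 80)
print(round(result, 2))  # 0.85
```

Here the grades 50, 60, 70 and 80 fall inside the limits, so the answer is 0.10 + 0.25 + 0.30 + 0.20 = 0.85.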

Dice roll simulation
This example shows how to simulate a dice roll in MS Excel. The steps are given below:
1. First choose the dice symbols: go to the INSERT tab, click on Symbols and select the
Segoe UI Symbol font, as shown in fig 16.
Fig 16: Dice Selection
2. After selecting the Segoe UI Symbol font, insert the die faces from 1 to 6 in column B.
Fig 17: Dice

3. Put each die face in its own cell: select one face, cut it with Ctrl+X and paste it in
another cell, next to the numbers 1 to 6, as shown in fig 18.
Fig 18: Copying of Dice into individual cells
4. The next step is to generate a random number between 1 and 6, just like the roll of a die,
by using the built-in function RANDBETWEEN. This function requires a bottom value and a
top value.
Fig 19: RANDBETWEEN Function

5. Enter 1 as the bottom value and 6 as the top value, i.e. =RANDBETWEEN(1, 6), and press
Enter. A random number between 1 and 6 is generated, as you can see in Fig 21.
Fig 20: Top and Bottom value of RANDBETWEEN Function
Fig 21: Random Number Generation

6. Apply the VLOOKUP function to select the die face for each outcome. Place the number for
the first die in column D and the number for the second die in column E, as shown in fig 22.
Fig 22: VLOOKUP Function
7. Supply the parameters: lookup value, table array, column index and range lookup. Type
the formula in cell D4, open the bracket, select the value in D3 and type a comma. For the
table array, select columns A and B together with their values and press F4 to lock the
reference. Set the column index to 2 and, for range lookup, choose the argument FALSE.
Then press Enter.
Syntax:
=VLOOKUP(value, table, col_index, [range_lookup])
Arguments:
Value: The value to look for in the first column.
Table: The table from which to retrieve a value.
Col_index: The column in the table from which to retrieve a value.
Range_lookup: [optional] TRUE = approximate match (default), FALSE = exact match.

Fig 23: VLOOKUP Arguments
Fig 24: VLOOKUP Function

8. After you press Enter, you will get the dice roll in the desired cell, as shown in fig 25
and 26.
Fig 25: Dice Roll
Fig 26: Two Dice Roll

9. To increase the size of the dice, click on the HOME tab, go to the alignment options and
select Increase Font Size. Adjust the size of both dice.
Fig 27: Resizing the Dice
10. Now select one die, copy it, paste it on another sheet and check the formula bar.
Fig 28: Formula for Dice 1

Fig 29: Two dices
11. Similarly, check the formula for dice 2. To roll the dice again, make a change in the given
cell through the formula bar so that the sheet recalculates.
Fig 30: Dice rolling simulation
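The same simulation can be sketched in Python, where `random.randint(1, 6)` plays the role of RANDBETWEEN(1, 6) and a dictionary lookup plays the role of VLOOKUP. The names `DIE_FACES`, `roll_die` and `roll_two_dice` are illustrative, and the Unicode die-face characters are assumed to display correctly in your font.

```python
import random

# Lookup table from a number to its die-face symbol (like columns A and B).
DIE_FACES = {
    1: "\u2680",
    2: "\u2681",
    3: "\u2682",
    4: "\u2683",
    5: "\u2684",
    6: "\u2685",
}


def roll_die(rng=random):
    """Return a random whole number from 1 to 6, both ends included."""
    return rng.randint(1, 6)


def roll_two_dice(rng=random):
    """Roll two dice and return (number, face) for each of them."""
    first, second = roll_die(rng), roll_die(rng)
    return (first, DIE_FACES[first]), (second, DIE_FACES[second])


(first_number, first_face), (second_number, second_face) = roll_two_dice()
print(first_number, first_face, second_number, second_face)
```

Every run prints a different pair of numbers and faces, just as the worksheet shows new dice each time it recalculates.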

Plot the Graphs of functions on Excel
Linear Function: In mathematics, linear functions have the form f(x) = ax + b, where a and b
are constants. The graph of any linear function is a straight line. Linear functions are widely
used in economics, for example in comparing rates of pay.
Let’s take an example: if one company offers to pay you $450 per week and another offers $10
per hour, and both companies ask you to work 40 hours per week, which company offers the
better rate of pay? A linear equation answers this: the second company pays 10 × 40 = $400 per
week, so the first offer of $450 is better. Other everyday applications of linear equations include
budgeting, making predictions and variable costs.
To plot the graph of a linear function in Excel, take different values of x and substitute them
in the formula, here f(x) = 3x-3, as you can see in fig 31. Column A holds the values of x, i.e.
from 0 to 5, and Column B holds f(x) = 3x-3. To calculate the value for x = 0, type =3*A2-3 in
the second cell of Column B (B2) and press Enter; you will get the value -3.
Fig 31: Different Values of x
After getting the first value, drag the fill handle of cell B2 down to B7 and the formula will
fill in all the remaining values. See the values in fig 32 below. Now select both column A and
column B, go to the INSERT tab, click on Charts and choose a scatter plot. Your linear function
is plotted for the given values, as shown in Fig 33. The graph is a straight line.

Fig 32: Scatter Plot Graph
Fig 33: Graph of Linear Function
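The table of values behind the graph, and the pay comparison mentioned earlier, can be reproduced with a short Python sketch. The variable names are illustrative; the numbers come from the text.

```python
def f(x):
    """Linear function used in the text: f(x) = 3x - 3."""
    return 3 * x - 3


# Values of x from 0 to 5, as in Column A.
x_values = list(range(0, 6))
table = [(x, f(x)) for x in x_values]
for x, y in table:
    print(x, y)
# 0 -3
# 1 0
# 2 3
# 3 6
# 4 9
# 5 12

# Pay comparison: $450 per week against $10 per hour for 40 hours.
HOURS_PER_WEEK = 40
weekly_offer = 450
hourly_offer = 10 * HOURS_PER_WEEK
better_offer = "weekly" if weekly_offer > hourly_offer else "hourly"
print(hourly_offer, better_offer)  # 400 weekly
```

Each step of 1 in x raises f(x) by 3, which is why the plotted points lie on a straight line.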
Quadratic Function: A quadratic function, or polynomial of degree 2, has the form
$f(x) = ax^2 + bx + c$
where a, b, and c are real numbers and a ≠ 0. The graph of a quadratic function is a curve
called a parabola. If a > 0, the parabola opens upward. If a < 0, the parabola opens downward.
Parabolas can vary in their “width” but they all have the same basic “U” shape.

The quadratic functions are used in real-world situations or everyday life like optimizing profits
for business, calculating areas of certain land, speed of an object like throwing a ball, determining
a product’s profit, etc.
In Excel, to plot the graph of the quadratic function $f(x) = ax^2 + bx + c$, we follow these steps.
1. Choose values of a, b and c for the function $f(x) = ax^2 + bx + c$. Let’s take a=1, b=4
and c=2. Substituting these values gives $f(x) = x^2 + 4x + 2$. Put the values of a, b, and c
in different cells of column B as shown in figure 34 below.
2. After assigning the values of a, b, and c, assign different values to x. Now evaluate
$f(x) = ax^2 + bx + c$ for each value of x: select the cell, then enter the formula in the
formula bar using the cells that hold a, b, c and x, as in fig 34.
Fig 34: Formula for implementing Quadratic Function
3. After applying the formula, press Enter and you will get the value of f(x) for x = -5,
i.e. 7, as in Fig 35.
4. To use auto-completion (the fill handle) for the remaining values of x, first lock the cells
containing the values of a, b, and c with absolute references, as shown in fig 36.

Fig 35: Result
Fig 36: Auto-completion (lock the cells)

5. After applying auto-completion, select columns D and E with their values. To plot the
graph, click on the INSERT tab, click on Charts, select the scatter plot and choose a scatter
chart. You can see the parabolic curve in fig 37 below. In this way, you can plot the
graph.
Fig 37: Quadratic Function Graph
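The values that are plotted can be checked with a small Python sketch using the same coefficients a=1, b=4 and c=2. The range of x values (from -5 to 1) is an assumption made for illustration; only x = -5 is stated in the text.

```python
def quadratic(x, a=1, b=4, c=2):
    """Evaluate f(x) = a*x^2 + b*x + c (defaults: a=1, b=4, c=2)."""
    return a * x**2 + b * x + c


# Assumed x values for illustration.
x_values = list(range(-5, 2))
points = [(x, quadratic(x)) for x in x_values]
for x, y in points:
    print(x, y)
# -5 7
# -4 2
# -3 -1
# -2 -2
# -1 -1
# 0 2
# 1 7

# The lowest point of the parabola (its vertex) is at x = -b / (2a).
vertex = min(points, key=lambda point: point[1])
print(vertex)  # (-2, -2)
```

Since a = 1 is positive, the parabola opens upward, and the values are symmetric about the vertex at x = -2.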
Exponential Function: An exponential function is a mathematical function of the form
$f(x) = a^x$
where “x” is a variable and “a” is a constant called the base of the function, which should
be greater than 0.
The real-world applications of exponential functions are bacterial growth/decay, population
growth/decline, and compound interest, pandemics (exponential growth of disease),
smartphone uptake and sale, the exponential growth of cancer cells, etc.
In Excel, there is a built-in exponential function called EXP that returns a numeric
value equal to e raised to the power of a given number.
The syntax for the EXP function is
=EXP(number)
The EXP function takes only one input, which is required: the exponent to which the base e
is raised.
The number e is an irrational constant approximately equal to 2.71828. It is also known as
Euler’s number.
Now let’s see how to plot an exponential graph in Excel. Take different values of x and
apply the EXP function to each. In fig 38 below, the function EXP(1) is applied for the
value 1 in Column B.

Fig 38: EXP () Function
After applying the function for the value 1 and pressing Enter, you will get the value 2.718282
approximately, as you can see in fig 39. Similarly, calculate the exponential values for the
other values of x.
Fig 39: EXP (1) Result

After calculating the values, plot the exponential function graph by clicking on the INSERT tab,
selecting Charts and choosing the scatter chart. The graph is plotted automatically and is
shown in fig 40.
Fig 40: Exponential Function Graph
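The values returned by EXP can be compared with Python's `math.exp`, which also computes e raised to a given power. The list of x values below is an assumption for illustration.

```python
import math

# Assumed x values for illustration.
x_values = [0, 1, 2, 3, 4, 5]
exp_values = [math.exp(x) for x in x_values]

for x, y in zip(x_values, exp_values):
    print(x, round(y, 6))

# EXP(1) is the number e itself.
print(round(math.exp(1), 6))  # 2.718282
```

Each value is e times the previous one, which is the constant-ratio growth that gives the exponential curve its shape.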
If you want to add a trendline, right-click on the plotted graph line and select the Add
Trendline option, as shown in fig 41.
Fig 41: Add TRENDLINE

After choosing Add Trendline, select the Exponential option in the list of trendline options and
apply it. The final exponential function graph is shown in fig 42 and 43.
Fig 42: Exponential TRENDLINE
Fig 43: Exponential Function Graph with TRENDLINE

Stock Market Data Sheet on Excel
In Microsoft 365 Excel, stock and geographic data are easy to obtain. Text typed in cells can be
converted into the Stocks data type or the Geography data type.
In Fig A below, the cells of column A contain company names with the Stocks data type, which
is indicated by an icon to the left of each name. The Stocks data type is connected to an
online source that contains more information. Columns B and C are used to extract that information.
Specifically, the values for price and change in price are extracted from the Stocks data type
in column A.
Fig A: Stock Data Type
Let’s take an example to understand how to convert text into the Stocks and Geography data
types.
As in fig 44 below, first type some text in a cell; here the text Google has been typed in
cell A1 of column A.
If you want stock information, type a ticker symbol, company name, or fund name into each cell.
If you want geographic data, type a country, province, territory, or city name into each cell.
Fig 44: Stock Information
After you enter the stock information, the table of stock information is placed on the Excel
worksheet, as you can see in fig 45.

Fig 45: Stock Information
The next step is to convert the text into the Stocks data type. Select the cells, go to the
Data tab and select the Stocks data type or the Geography data type according to your data.
Here the Stocks data type is chosen. As you can see in fig 46, the cells are converted if they
show the stocks icon or the geography icon.
Fig 46: Stock and Geography Data Type
After selecting the Stocks data type, you will see that the first company name, Microsoft
Corporation, is converted into the Stocks data type, with the symbol at the left corner of the
text. Similarly, you can convert all the company names into the Stocks data type one by one, as
you can see in fig 47 and 48.

Fig 47: Stock Data Type
Fig 48: Stock Data Type
Once all the cells have the Stocks data type, you can add different metrics or information about
the stocks to your data set. Select one or more cells with the data type and the Insert Data
button will appear, as you can see in fig 48 above. Click that button and you will see many
fields associated with that particular stock. Choose one field and it will be added to the right
of your current data set.
In the example shown in fig 49, the 52-week high is chosen; you can also choose price. For
Geography, you might pick Population.

Fig 49: 52-week high field
Once you press Enter, you will get the value associated with the chosen field. Similarly, apply
the same formula to the other cells of the data set, or choose different fields.
See the results in fig 50 and 51 below.
Fig 50: Result of the selected field
Fig 51: 52 Week High field result

Click the Insert Data button again to add more fields. If you’re using a table, here’s a tip: type
a field name in the header row. For example, type Change in the header row for stocks, and the
change in price column will appear.
Collect Data on weather, price, inflation and pollution: Analyse the Data and make meaningful
Inferences
Before drawing inferences, we must know about data analysis. Data analysis is the process of
cleaning, transforming and modelling data to discover useful information for business
decision-making. Its main purpose is to extract useful information from data, suggest
conclusions and support better decisions based on that analysis.
Data Analysis is defined by the statistician John Tukey in 1961 as “Procedures for analysing
data, techniques for interpreting the results of such procedures, ways of planning the gathering of
data to make its analysis easier, more precise or more accurate, and all the machinery and results
of (mathematical) statistics which apply to analysing data.”
Data Analysis through Excel

Excel provides different functions, commands, and tools for making data analysis easy. You can
also do data analysis of large data sets by using PivotTables and produce desired reports.
Data Analysis consists of different phases:
Fig 52: Data Analysis Phases

Generating Inference from Data
Pivot tables are a powerful tool for analysing large data sets. They also help in summarizing
data and exploring data insights.
To create a pivot table and analyse data, follow these steps. You can import a data set from an
external source by clicking on the DATA tab, choosing the option From Other Sources and selecting
the option that fits your requirement. You can also import data through other options such as
From Access, From Web and From Text. You can see the options in fig 53.
Fig 53: Import Dataset in Excel
Suppose you are working with company data and you have questions like “How much revenue
is contributed by branches of the North region?” or “What was the average number of customers
for product A or B?” and many others. A pivot table is the best tool for analysing this kind
of data.
A pivot table performs operations such as count, average and sum, and other calculations,
according to the fields you have selected. In other words, it converts a data table into an
inference table that we can analyse to make better decisions.
The table in fig 54 has the sales detail of each customer with the region and product mapping.
In the table in fig 55, the information is summarized at the regional level, which helps us to
infer that the South region has the highest sales.

Fig 54: Dataset
Fig 55: Inference Table

To create a pivot table, follow the steps given below:
1. First click anywhere in the dataset (here the cell D7 of the Premium column is selected),
then click on the INSERT tab and choose Pivot Table at the left corner of the tab. Excel
automatically selects the area that contains the data values, including the headings. If it
doesn’t select the area automatically, drag over the cells to select it manually. See fig 56
below for creating the pivot table.
Fig 56: Create Pivot Table from INSERT Tab

2. Select the option New Worksheet to place the pivot table on a new sheet and click OK.
You can now see the pivot table in the new worksheet, as shown in fig 57.
Fig 57: Pivot Table creation in the new worksheet
3. The pivot table panel at the right-hand side contains the list of fields. Choose the
fields you need for analysing the data and making inferences. Here the Product id, Premium
and Region fields are selected.

4. Your pivot table is now created, as you can see in fig 58.
Fig 58: Pivot Table
5. If you want to change the rows and columns of the fields you have selected, right-click
the row or column label or the item in a label, point to Move, and then use one of the
commands on the Move menu to move the item to another location. See fig 59.
Fig 59: Pivot Table (Generated Inference)

6. In Fig 59 above, Region is arranged in rows, Product Id in columns, and the sum of premium
is taken as the value. The pivot table is now created and inferences can be drawn from it.
You can also use other summary functions such as average, min and max instead of sum. The
information is summarized at the region level, which helps us to infer that the South region
has the highest sales.
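The summarizing that a pivot table performs can be sketched in Python by grouping records and adding up a value field. The records below are hypothetical (the dataset of fig 54 is not reproduced here), chosen only so that the South region comes out highest as in the text; the function name `pivot_sum` is illustrative.

```python
from collections import defaultdict

# Hypothetical records with the three fields used in the text.
records = [
    {"region": "North", "product_id": "A", "premium": 1200},
    {"region": "North", "product_id": "B", "premium": 800},
    {"region": "South", "product_id": "A", "premium": 1500},
    {"region": "South", "product_id": "B", "premium": 1700},
    {"region": "East", "product_id": "A", "premium": 900},
    {"region": "West", "product_id": "B", "premium": 1100},
]


def pivot_sum(rows, row_field, column_field, value_field):
    """Sum value_field for every combination of row_field and column_field."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row[row_field]][row[column_field]] += row[value_field]
    return {key: dict(columns) for key, columns in table.items()}


# Region in rows, Product Id in columns, sum of premium as the value.
pivot = pivot_sum(records, "region", "product_id", "premium")

# Grand total per region, then the region with the highest total.
region_totals = {region: sum(cols.values()) for region, cols in pivot.items()}
top_region = max(region_totals, key=region_totals.get)
print(pivot)
print(region_totals)
print(top_region)  # South
```

Swapping the `row_field` and `column_field` arguments corresponds to moving a field between rows and columns in the pivot table panel.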
Creating Charts: Data visualization through charts or graphs is very useful for making inferences
and taking better business decisions. In Excel, to make a chart, click on the INSERT tab, select
Charts and choose a chart type.
For the data set given above, the visualization of the pivot table looks like this.
Fig 60: Creating Charts
Similarly, we can analyse the weather by collecting a weather dataset and making inferences
based on different parameters or metrics. Get the daily temperature and weather conditions. You
can take Weather Type as a metric and, for visualization, choose a different colour for each kind
of weather, whether it is rain, cloud, snow or sun. Make a table with two columns, one for the
weather description and another for the type.
For the analysis, use the date, day, temperature and weather columns to make predictions and
decisions.
Collect data from newspapers on traffic, sports activities and on market trends and use excel
to study future trends.
Before studying future trends, we must know some terms that will help in predicting and
analysing trends.
Forecasting: It is the process of predicting the future by analysing past and present data.
Quantitative forecasting works on time series data; for example, we can estimate the number of
passengers flying on planes every year from the time series of past passenger numbers.

Some important components of a time series are Trend and Seasonality.
Trend: It is the long-term tendency of the variable to increase or decrease over a period of time.
For example, the number of passengers flying on planes every year may rise steadily over time.
Another example is examining sales patterns to see whether sales are declining because of
specific customers, products or sales regions.
Seasonality: Here a pattern repeats at a regular interval of time. For example, retail sales
tend to peak in the Christmas season and then decline after the holidays. So a time series of
retail sales will typically show increasing sales from September through December and declining
sales in January and February.
Forecasting is the most important technique for predicting future trends and opportunities.
Companies use it to make better business decisions.
Let’s see more examples to get a better understanding of forecasting.
• Economists use forecasting techniques to predict future recessions and ups and downs in
market values, and recommend plans accordingly.
• Government bodies use forecasting to make plans and build their policies.
• Company managers use forecasting to predict sales, and accordingly make budgets,
plan activities and hire employees.
• Profiling customer behaviour, preferences and buying habits.
• Finding uncommon developments over the period, which we need to address in our forecasts.
• Studying sales patterns to predict future performance.
In Excel, some important functions are available for forecasting:
• FORECAST.LINEAR(): It predicts a future value from existing values using linear regression.
• FORECAST.ETS(): It predicts a future value from existing values using exponential smoothing.
• FORECAST.ETS.SEASONALITY(): It returns the length of the seasonal pattern that Excel detects in the time series.
• FORECAST.ETS.CONFINT(): It returns a confidence interval for the forecast value at the specified target date.
Excel TREND function:

Suppose that you are analysing some data for a sequential period of time and you want to find
a trend or pattern.
In this example, we have the month numbers (independent x-values) in cells A2:A13 and sales
numbers (dependent y-values) in cells B2:B13. Based on this data, we want to determine the overall
trend in the time series. To do this, select the cells C2:C13, type the formula given below and
press Ctrl + Shift + Enter to complete it:
=TREND (B2:B13, A2:A13)
If you want to draw the trend line, select the sales and trend value cells (B1:C13) and insert
a line chart (Insert tab > Charts group > Line or Area Chart).
The result consists of both the numeric values for the line of best fit and a visual
representation of those values in a graph.

Fig 61: TREND Function
Fig 62: Visual Representation of Trend Analysis
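The line of best fit that TREND returns can be sketched in Python with the least-squares formulas for slope and intercept. The sales figures below are hypothetical (the data of Fig 61 is not reproduced here) and the function name `fit_line` is illustrative. The last lines extend the same line to month 13, which is the kind of prediction FORECAST.LINEAR gives.

```python
def fit_line(x_values, y_values):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Month numbers 1 to 12 and hypothetical sales for each month.
months = list(range(1, 13))
sales = [120, 135, 128, 150, 162, 158, 171, 180, 176, 195, 204, 210]

slope, intercept = fit_line(months, sales)

# Fitted value for every month: the values TREND would place in C2:C13.
trend = [intercept + slope * month for month in months]
for month, value in zip(months, trend):
    print(month, round(value, 2))

# Extending the line one step ahead gives a forecast for month 13.
forecast_month_13 = intercept + slope * 13
print(round(forecast_month_13, 2))
```

A positive slope indicates an upward trend in sales, and a negative slope a downward one.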

