Numerical Analysis
UNIT-1
NUMERICAL METHOD
We use numerical methods to find approximate solutions of problems by numerical calculation, with the aid of
a calculator. For better accuracy we have to minimize the error.
For example, π = 3.14159... is approximated as 3.141 by chopping (deleting all digits after the third decimal place).
Significant digits:
The significant digits of a number are all of its digits except the zeros to the left of the first non-zero digit,
which serve only to fix the position of the decimal point. For example, 0.0045 has two significant digits.
Intermediate Value Theorem: If a function f(x) is continuous in the closed interval [a, b] and satisfies f(a)f(b) < 0,
then there exists at least one real root of the equation f(x) = 0 in the open interval (a, b).
Algebraic equations are equations containing only algebraic terms (different powers of x). For example, x^2 - 7x + 6 = 0.
Transcendental equations are equations containing non-algebraic terms like trigonometric, exponential or
logarithmic terms. For example, sin x - e^x = 0.
Fixed Point Iteration Method
Procedure
Step-I We rewrite the equation f(x) = 0 in the form x = h(x). Several such forms, say x = h(x), x = g(x), x = D(x),
are usually possible.
Step-II We choose a form, say x = h(x), which satisfies |h'(x)| < 1 in an interval (a, b) containing the
solution (called the root).
Step-III We take x_{n+1} = h(x_n) as the iteration formula to find the approximate solution (root) of the
equation f(x) = 0.
Step-IV Let x = x0 be the initial guess or initial approximation to the root of the equation f(x) = 0.
Then x1 = h(x0), x2 = h(x1), x3 = h(x2) and so on. We continue this process till we get the solution (root) of
the equation f(x) = 0 up to the desired accuracy.
Convergence: Let x = s be a root of the equation f(x) = 0 lying in the interval (a, b), and let h(x) and h'(x),
defined by x = h(x), be continuous in (a, b). Then the approximations x1 = h(x0), x2 = h(x1), x3 = h(x2), ...
converge to the root x = s provided |h'(x)| < 1 for all values of x in the interval (a, b) containing the root.
Problems
1. Solve x^3 - sin x - 1 = 0 by the fixed point iteration method, correct up to 2 decimal places.
Let f(x) = x^3 - sin x - 1. As f(1)f(2) < 0, by the Intermediate Value Theorem a real root of the equation
f(x) = 0 lies between 1 and 2.
Rewriting the equation as x = h1(x) = (1 + sin x)^(1/3), we see that |h1'(x)| < 1 for all values of x in the
interval (1, 2) containing the root.
We use x_{n+1} = (1 + sin x_n)^(1/3) as the iteration formula to find the approximate solution (root) of the
equation (1).
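The iteration above can be sketched in a few lines of Python. This is only an illustrative sketch: the helper name `fixed_point`, the starting value 1.5 and the tolerance are assumptions, not part of the notes.

```python
import math

def fixed_point(h, x0, tol=1e-6, max_iter=100):
    """Iterate x_{n+1} = h(x_n) until two successive iterates differ by less than tol."""
    x = x0
    for n in range(1, max_iter + 1):
        x_new = h(x)
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    raise RuntimeError("fixed point iteration did not converge")

# h1(x) = (1 + sin x)^(1/3), the rearrangement of x^3 - sin x - 1 = 0 used in Problem 1
h1 = lambda x: (1.0 + math.sin(x)) ** (1.0 / 3.0)

root, steps = fixed_point(h1, 1.5)   # 1.5 is an assumed initial guess inside (1, 2)
print(round(root, 2), "after", steps, "iterations")
```

Because |h1'(x)| is small near the root, only a handful of iterations are needed.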
Newton-Raphson Method
Procedure
Step-I We find the interval (a, b) containing the solution (called the root) of the equation f(x) = 0.
Step-II Let x = x0 be the initial guess or initial approximation to the root of the equation f(x) = 0.
Step-III We use x_{n+1} = x_n - [f(x_n) / f'(x_n)] as the iteration formula to find the approximate solution (root)
of the equation f(x) = 0.
Step-IV Then x1, x2, x3, ... and so on are calculated, and we continue this process till we get the
root of the equation f(x) = 0 up to the desired accuracy.
2. Solve x - 2 sin x - 3 = 0 by the Newton-Raphson method, correct up to 5 significant digits.
Let f(x) = x - 2 sin x - 3. Then
f(0) = -3, f(1) = -2 - 2 sin 1, f(2) = -1 - 2 sin 2, f(3) = -2 sin 3, f(4) = 1 - 2 sin 4
As f(3)f(4) < 0, by the Intermediate Value Theorem a real root of the equation f(x) = 0 lies
between 3 and 4.
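A minimal Python sketch of the Newton-Raphson iteration for this problem follows. The function name `newton`, the starting value x0 = 3 and the stopping tolerance are assumptions chosen for illustration; the derivative f'(x) = 1 - 2 cos x is obtained by differentiating f(x) = x - 2 sin x - 3.

```python
import math

def newton(f, df, x0, tol=1e-8, max_iter=50):
    """Newton-Raphson: x_{n+1} = x_n - f(x_n) / f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        x_new = x - f(x) / df(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("Newton-Raphson did not converge")

f = lambda x: x - 2.0 * math.sin(x) - 3.0
df = lambda x: 1.0 - 2.0 * math.cos(x)

root = newton(f, df, 3.0)   # x0 = 3 is an assumed starting point in (3, 4)
print(f"{root:.5g}")        # root printed to 5 significant digits
```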
Secant Method
Procedure
Step-I We find the interval (a,b) containing the solution (called root) of the equation f(x) = 0 .
Step-II Let x = x0 and x = x1 be two initial approximations to the root of the equation f(x) = 0.
Step-III We use x_{n+1} = x_n - [(x_n - x_{n-1}) f(x_n)] / [f(x_n) - f(x_{n-1})] as the iteration formula to find the
approximate solution (root) of the equation f(x) = 0.
Step-IV Then x2, x3, x4, ... and so on are calculated, and we continue this process till we get the
root of the equation f(x) = 0 up to the desired accuracy.
3. Solve cos x = x e^x by the Secant method, correct up to 2 decimal places.
Let f(x) = cos x - x e^x. As f(0)f(1) < 0, by the Intermediate Value Theorem a real root of the equation
f(x) = 0 lies between 0 and 1.
Taking x0 = 0 and x1 = 1, we get
$x_2 = x_1 - \frac{(x_1 - x_0) f(x_1)}{f(x_1) - f(x_0)} = 0.31467$
Then
$x_3 = x_2 - \frac{(x_2 - x_1) f(x_2)}{f(x_2) - f(x_1)} = 0.31467 - \frac{(0.31467 - 1) f(0.31467)}{f(0.31467) - f(1)} = 0.44673$
$x_4 = x_3 - \frac{(x_3 - x_2) f(x_3)}{f(x_3) - f(x_2)} = 0.53171$
$x_5 = x_4 - \frac{(x_4 - x_3) f(x_4)}{f(x_4) - f(x_3)} = 0.51690$
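The secant iterates for this problem can be reproduced with the short Python sketch below. The helper name `secant` and the number of steps are illustrative assumptions; the starting values x0 = 0 and x1 = 1 are the end points of the bracketing interval.

```python
import math

def secant(f, x0, x1, n_steps):
    """Return [x0, x1, x2, ...] after n_steps secant updates."""
    xs = [x0, x1]
    for _ in range(n_steps):
        a, b = xs[-2], xs[-1]
        fa, fb = f(a), f(b)
        # x_{n+1} = x_n - (x_n - x_{n-1}) f(x_n) / (f(x_n) - f(x_{n-1}))
        xs.append(b - (b - a) * fb / (fb - fa))
    return xs

f = lambda x: math.cos(x) - x * math.exp(x)

xs = secant(f, 0.0, 1.0, 6)
for n, x in enumerate(xs):
    print(n, round(x, 5))
```

The printed list shows the iterates settling near the root, so the required two-decimal accuracy is reached after a few steps.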
Let f(x) = x^4 - x - 7
As f(1)f(2) < 0, by the Intermediate Value Theorem a real root of the equation f(x) = 0 lies
between 1 and 2.
Interpolation is the method of finding value of the dependent variable y at any point x using the
following given data.
x x0 x1 x2 x3 .. .. .. xn
y y0 y1 y2 y3 .. .. .. yn
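This means that for the function y = f(x) the known values at x = x0, x1, x2, ..., xn are respectively
y0, y1, y2, ..., yn.
For this purpose we fit a polynomial to these data, called the interpolating polynomial. After getting the
polynomial p(x), which is an approximation to f(x), we can find the value of y at any point x.
Newton's forward difference interpolation applies when the data are equispaced,
i.e. x1 = x0 + h, x2 = x1 + h, ..., xn = x_{n-1} + h. The interpolating polynomial is
$P_n(x) = y_0 + p\,\Delta y_0 + \frac{p(p-1)}{2!}\Delta^2 y_0 + \dots + \frac{p(p-1)\cdots(p-n+1)}{n!}\Delta^n y_0$
where p = (x - x0)/h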
Problems
5. Using following data find the Newton’s interpolating polynomial and also find the value of y at x=5
x 0 10 20 30 40
y 7 18 32 51 87
Solution
x1 - x0= 10 = x2 - x1 = x3 - x2 = x4 - x3
As x= 5 lies between 0 and 10 and at the start of the table and data is equispaced, we have to use
Newton’s forward difference Interpolation.
x    y    Δy   Δ²y  Δ³y  Δ⁴y
0    7    11   3    2    10
10   18   14   5    12
20   32   19   17
30   51   36
40   87
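Here x0 = 0, y0 = 7, h = x1 - x0 = 10 - 0 = 10, so p = (x - x0)/h = x/10
Δy0 = 11, Δ²y0 = 3,
Δ³y0 = 2, Δ⁴y0 = 10
Newton's forward difference interpolating polynomial is
$P_4(x) = y_0 + p\,\Delta y_0 + \frac{p(p-1)}{2!}\Delta^2 y_0 + \frac{p(p-1)(p-2)}{3!}\Delta^3 y_0 + \frac{p(p-1)(p-2)(p-3)}{4!}\Delta^4 y_0$
= 0.0000417 x^4 - 0.0021667 x^3 + 0.0508333 x^2 + 0.7666667 x + 7
To find the approximate value of y at x = 5 we put x = 5 (p = 0.5) in the interpolating polynomial to get
y(5) = P4(5) = 7 + (0.5)(11) + (0.5)(-0.5)(3)/2! + (0.5)(-0.5)(-1.5)(2)/3! + (0.5)(-0.5)(-1.5)(-2.5)(10)/4! = 11.859375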
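As a cross-check, Newton's forward difference formula can be evaluated directly from the difference table. The Python sketch below uses illustrative helper names (`forward_differences`, `newton_forward`) and takes the y values from the difference table of Problem 5.

```python
def forward_differences(y):
    """Return [y0, Δy0, Δ²y0, ...], the leading entries of the forward difference table."""
    lead, col = [y[0]], list(y)
    while len(col) > 1:
        col = [b - a for a, b in zip(col, col[1:])]
        lead.append(col[0])
    return lead

def newton_forward(xs, ys, x):
    """Evaluate Newton's forward difference polynomial at x (equispaced xs assumed)."""
    h = xs[1] - xs[0]
    p = (x - xs[0]) / h
    total, term = 0.0, 1.0
    for k, d in enumerate(forward_differences(ys)):
        total += term * d            # term = p(p-1)...(p-k+1)/k!
        term *= (p - k) / (k + 1)
    return total

xs = [0, 10, 20, 30, 40]
ys = [7, 18, 32, 51, 87]
y5 = newton_forward(xs, ys, 5)
print(y5)
```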
6. Using following data find the Newton’s interpolating polynomial and also find the value of y at x=24
x 20 35 50 65 80
y 3 11 24 50 98
Solution
x1 - x0= 15 = x2 - x1 = x3 - x2 = x4 - x3
As x= 24 lies between 20 and 35 and at the start of the table and data is equispaced, we have to use
Newton’s forward difference Interpolation.
Here x0 = 20, y0 = 3, h= x1 - x0 = 35 - 20 = 15
Δy0 = 8, Δ²y0 = 5,
Δ³y0 = 8, Δ⁴y0 = 1
x    y    Δy   Δ²y  Δ³y  Δ⁴y
20   3    8    5    8    1
35   11   13   13   9
50   24   26   22
65   50   48
80   98
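Newton's backward difference interpolation uses the formula
$P_n(x) = y_n + p\,\nabla y_n + \frac{p(p+1)}{2!}\nabla^2 y_n + \dots + \frac{p(p+1)\cdots(p+n-1)}{n!}\nabla^n y_n$
where p = (x - xn)/h
7. Using the following data find the value of y at x = 35
x 0 10 20 30 40
y 7 18 32 51 87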
Solution :
x1 - x0= 10 = x2 - x1 = x3 - x2 = x4 - x3
As x = 35 lies between 30 and 40, at the end of the table, and the given data are equispaced, we have
to use Newton's backward difference interpolation.
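x    y    Δy   Δ²y  Δ³y  Δ⁴y
0    7    11   3    2    10
10   18   14   5    12
20   32   19   17
30   51   36
40   87
Here xn = 40, yn = 87, h = 10, p = (x - xn)/h = (35 - 40)/10 = -0.5
∇yn = 36, ∇²yn = 17, ∇³yn = 12, ∇⁴yn = 10
y(35) = 87 + (-0.5)(36) + (-0.5)(-0.5+1)(17)/2! + (-0.5)(-0.5+1)(-0.5+2)(12)/3!
+ (-0.5)(-0.5+1)(-0.5+2)(-0.5+3)(10)/4!
= 65.734375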
Inverse Interpolation
The process of finding the independent variable x for given values of f(x) is called Inverse
Interpolation .
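8. Solve ln x = 1.3 by inverse interpolation using x = G(y) with G(1) = 2.718, G(1.5) = 4.481, G(2) =
7.387, G(2.5) = 12.179, and find the value of x
y     x        Δx      Δ²x     Δ³x
1     2.718    1.763   1.143   0.743
1.5   4.481    2.906   1.886
2     7.387    4.792
2.5   12.179
Here y0 = 1, h = 0.5, p = (1.3 - 1)/0.5 = 0.6
Δx0 = 1.763, Δ²x0 = 1.143, Δ³x0 = 0.743
x = 2.718 + (0.6)(1.763) + (0.6)(0.6-1)(1.143)/2! + (0.6)(0.6-1)(0.6-2)(0.743)/3!
= 3.680248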
Linear Lagrange interpolation is interpolation by the line through the points (x0, y0) and (x1, y1), given by the formula
P1(x) = l0 y0 + l1 y1
where l0 = (x - x1)/(x0 - x1) and l1 = (x - x0)/(x1 - x0)
Quadratic Lagrange interpolation is the interpolation through three given points (x0, y0), (x1, y1) and
(x2, y2), given by the formula
P2(x) = l0 y0 + l1 y1 + l2 y2
where
$l_0 = \frac{(x - x_2)(x - x_1)}{(x_0 - x_2)(x_0 - x_1)}$, $l_1 = \frac{(x - x_2)(x - x_0)}{(x_1 - x_2)(x_1 - x_0)}$ and $l_2 = \frac{(x - x_1)(x - x_0)}{(x_2 - x_1)(x_2 - x_0)}$
9. Using quadratic Lagrange Interpolation find the Lagrange interpolating polynomial P2(x)
and hence find value of y at x=2 Given y(0) = 15, y(1) = 48, y(5) = 85
Solution :
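x1 - x0 = 1 ≠ x2 - x1 = 4, so the data are not equispaced. With x0 = 0, x1 = 1, x2 = 5:
$l_0 = \frac{(x - x_2)(x - x_1)}{(x_0 - x_2)(x_0 - x_1)} = \frac{(x - 5)(x - 1)}{(0 - 5)(0 - 1)} = \frac{x^2 - 6x + 5}{5}$
$l_1 = \frac{(x - x_2)(x - x_0)}{(x_1 - x_2)(x_1 - x_0)} = \frac{(x - 5)(x - 0)}{(1 - 5)(1 - 0)} = \frac{x^2 - 5x}{-4}$
and $l_2 = \frac{(x - x_1)(x - x_0)}{(x_2 - x_1)(x_2 - x_0)} = \frac{(x - 1)(x - 0)}{(5 - 1)(5 - 0)} = \frac{x^2 - x}{20}$
$y = l_0 y_0 + l_1 y_1 + l_2 y_2 = \frac{x^2 - 6x + 5}{5}(15) - \frac{x^2 - 5x}{4}(48) + \frac{x^2 - x}{20}(85)$
$= -4.75x^2 + 37.75x + 15$
Hence y(2) = P2(2) = -4.75(4) + 37.75(2) + 15 = 71.5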
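The Lagrange formula can also be evaluated numerically without expanding the polynomial. The following Python sketch (the function name `lagrange` is an illustrative choice) evaluates the interpolating polynomial for the data of Problem 9.

```python
def lagrange(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        li = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                li *= (x - xj) / (xi - xj)   # builds the basis polynomial l_i(x)
        total += li * yi
    return total

xs = [0, 1, 5]
ys = [15, 48, 85]
y2 = lagrange(xs, ys, 2)
print(y2)
```

The same function works for any number of (not necessarily equispaced) points, which is the general Lagrange formula.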
General Lagrange interpolation is the interpolation through n + 1 given points (x0, y0), (x1, y1),
(x2, y2), ..., (xn, yn), given by the formula
Pn(x) = l0 y0 + l1 y1 + l2 y2 + ... + ln yn
where
$l_0 = \frac{(x - x_n) \cdots (x - x_2)(x - x_1)}{(x_0 - x_n) \cdots (x_0 - x_2)(x_0 - x_1)}$
$l_1 = \frac{(x - x_n) \cdots (x - x_2)(x - x_0)}{(x_1 - x_n) \cdots (x_1 - x_2)(x_1 - x_0)}$
$l_2 = \frac{(x - x_n) \cdots (x - x_1)(x - x_0)}{(x_2 - x_n) \cdots (x_2 - x_1)(x_2 - x_0)}$
...
and $l_n = \frac{(x - x_{n-1}) \cdots (x - x_1)(x - x_0)}{(x_n - x_{n-1}) \cdots (x_n - x_1)(x_n - x_0)}$
Solution :
x1 - x 0 = 1 ≠ x2 - x1 = 6
y = l0 y0 + l1 y1 + l2 y2 + l3 y3
= (1/9)(18) + (1/6)(42) + (2/3)(57) + (7/18)(90)
= 2 + 7 + 38 + 35 = 82
The divided differences are defined by
$f[x_0, x_1] = \frac{f(x_1) - f(x_0)}{x_1 - x_0}$
$f[x_0, x_1, x_2] = \frac{f[x_1, x_2] - f[x_0, x_1]}{x_2 - x_0}$
$f[x_1, x_2, x_3] = \frac{f[x_2, x_3] - f[x_1, x_2]}{x_3 - x_1}$
$f[x_0, x_1, x_2, x_3] = \frac{f[x_1, x_2, x_3] - f[x_0, x_1, x_2]}{x_3 - x_0}$
$f[x_0, x_1, x_2, \dots, x_n] = \frac{f[x_1, x_2, \dots, x_n] - f[x_0, x_1, \dots, x_{n-1}]}{x_n - x_0}$
Problems
11. Using following data find the Newton’s divided difference interpolating polynomial and also
find the value of y at x= 15
x 0 6 11 26
y 30 48 88 238
Newton's divided difference table
x    y     1st divided difference    2nd divided difference
0    30    (48-30)/6 = 3             (8-3)/11 = 0.45
6    48    (88-48)/5 = 8             (10-8)/20 = 0.1
11   88    (238-88)/15 = 10
26   238
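The divided difference table and the resulting Newton polynomial can be computed as in the Python sketch below. The helper names are illustrative, and the x values 0, 6, 11, 26 are the ones used in the divided difference table above.

```python
def divided_differences(xs, ys):
    """Return [f[x0], f[x0,x1], ..., f[x0,...,xn]], the top edge of the table."""
    coef = list(ys)
    n = len(xs)
    for order in range(1, n):
        # update from the bottom up so lower-order values are still available
        for i in range(n - 1, order - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - order])
    return coef

def newton_divided(xs, ys, x):
    """Evaluate the Newton divided difference polynomial at x by nested multiplication."""
    coef = divided_differences(xs, ys)
    result = coef[-1]
    for k in range(len(xs) - 2, -1, -1):
        result = result * (x - xs[k]) + coef[k]
    return result

xs = [0, 6, 11, 26]
ys = [30, 48, 88, 238]
coef = divided_differences(xs, ys)
y15 = newton_divided(xs, ys, 15)
print(coef, y15)
```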
NUMERICAL DIFFERENTIATION
When a function y = f(x) is unknown but its values are given at some points like (x0, y0), (x1, y1), ...,
(xn, yn), its derivatives can be found approximately by numerical differentiation.
Sometimes it is also difficult to differentiate a composite or complicated function; this can be done easily,
in less time and in fewer steps, by numerical differentiation.
For Newton's forward difference formula, p = (x - x0)/h
For Newton's backward difference formula, p = (x - xn)/h
12. Using following data find the first and second derivative of y at x=0
x 0 10 20 30 40
y 7 18 32 51 87
Solution
x    y    Δy   Δ²y  Δ³y  Δ⁴y
0    7    11   3    2    10
10   18   14   5    12
20   32   19   17
30   51   36
40   87
Here x0 = 0, y0 = 7, h = x1 - x0 = 10 - 0 = 10
Δy0 = 11, Δ²y0 = 3,
Δ³y0 = 2, Δ⁴y0 = 10
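For a tabulated point x = x0, differentiating Newton's forward difference polynomial and putting p = 0 gives the standard formulas y'(x0) = (1/h)[Δy0 - Δ²y0/2 + Δ³y0/3 - Δ⁴y0/4] and y''(x0) = (1/h²)[Δ²y0 - Δ³y0 + (11/12)Δ⁴y0]. The Python sketch below applies them to the table above; it is an illustration under that assumption, not the worked solution from the notes.

```python
def forward_differences(y):
    """Return [y0, Δy0, Δ²y0, ...] from a list of equispaced function values."""
    lead, col = [y[0]], list(y)
    while len(col) > 1:
        col = [b - a for a, b in zip(col, col[1:])]
        lead.append(col[0])
    return lead

h = 10.0
d = forward_differences([7, 18, 32, 51, 87])   # [7, 11, 3, 2, 10]

# derivatives of the forward difference polynomial at x = x0 (p = 0)
dy = (d[1] - d[2] / 2 + d[3] / 3 - d[4] / 4) / h
d2y = (d[2] - d[3] + 11 * d[4] / 12) / h**2
print(dy, d2y)
```

These values agree with the coefficients of the interpolating polynomial of Problem 5, whose linear coefficient is y'(0) and whose quadratic coefficient is y''(0)/2.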
Linear Interpolation
$y'(x_0) = \frac{y(x_1) - y(x_0)}{x_1 - x_0} = \frac{y_1 - y_0}{x_1 - x_0}$
Quadratic Interpolation
The second derivative is constant, i.e. the same at all points, because the interpolating polynomial
of quadratic interpolation is of degree two. Hence we must have
y''(x0) = (y0 - 2 y1 + y2) / h^2
Problems
13. Using following data find the value of first and second derivatives of y at x=30
x 10 30 50
y 42 64 88
Solution
y0 = 42, y1 = 64, y2 = 88
Linear Interpolation
$y'(x_0) = \frac{y(x_1) - y(x_0)}{x_1 - x_0} = \frac{y_1 - y_0}{x_1 - x_0} = \frac{64 - 42}{30 - 10} = 1.1$
Quadratic Interpolation
14. Using following data find the value of first and second derivatives of y at x=12
x 0 10 20 30 40
y 7 18 32 51 87
Solution
x    y    Δy   Δ²y  Δ³y  Δ⁴y
0    7    11   3    2    10
10   18   14   5    12
20   32   19   17
30   51   36
40   87
Here x0 = 0, y0 = 7, h = x1 - x0 = 10 - 0 = 10
Δy0 = 11, Δ²y0 = 3,
Δ³y0 = 2, Δ⁴y0 = 10
NUMERICAL INTEGRATION
Consider the integral $I = \int_a^b f(x)\,dx$, where the integrand f(x) is a given function and a, b are the known end points of the interval [a, b].
Let us divide the interval [a, b] into n equal subintervals so that the length of each subinterval
is h = (b - a)/n
The end points of the subintervals are a = x0, x1, x2, x3, ..., xn = b
Let us approximate the integrand f by a line segment in each subinterval. Then the coordinates of the end points
of the subintervals are (x0, y0), (x1, y1), (x2, y2), ..., (xn, yn). Then from x = a to x = b the area under the
curve y = f(x) is approximately equal to the sum of the areas of the n trapezoids over the n subintervals.
The error in the trapezoidal rule is $-\frac{b-a}{12} h^2 f''(\theta)$ where a < θ < b
Consider the integral $I = \int_a^b f(x)\,dx$, where the integrand f(x) is a given function and a, b are the known end points of the interval [a, b].
We take two strips at a time instead of one strip as in the trapezoidal rule. For this reason the
number of subintervals in Simpson's 1/3rd rule of numerical integration must be even, say n = 2m.
The formula is
$I = \int_a^b f(x)\,dx = \frac{h}{3}\left[ y_0 + y_{2m} + 4(y_1 + y_3 + \dots + y_{2m-1}) + 2(y_2 + y_4 + \dots + y_{2m-2}) \right]$
The error in Simpson's 1/3rd rule is $-\frac{b-a}{180} h^4 f^{iv}(\theta)$ where a < θ < b
Consider the integral $I = \int_a^b f(x)\,dx$, where the integrand f(x) is a given function and a, b are the known end points of the interval [a, b].
We take three strips at a time instead of one strip as in the trapezoidal rule. For this reason
the number of subintervals in Simpson's 3/8th rule of numerical integration must be a multiple of 3, say n = 3m.
The formula is
$I = \int_a^b f(x)\,dx = \frac{3h}{8}\left[ y_0 + y_{3m} + 3(y_1 + y_2 + y_4 + y_5 + \dots + y_{3m-1}) + 2(y_3 + y_6 + \dots + y_{3m-3}) \right]$
The error in Simpson's 3/8th rule is $-\frac{b-a}{80} h^4 f^{iv}(\theta)$ where a < θ < b
15. Using Trapezoidal and Simpsons rule evaluate the following integral with number of subintervals n =6
e
( x2 )
dx
0
Solution:
y0 y1 y2 y3 y4 Y5 y6
By Simpson's 1/3rd rule, I = (h/3) [ y0 + y6 + 4(y1 + y3 + y5) + 2(y2 + y4) ]
By Simpson's 3/8th rule, I = (3h/8) [ y0 + y6 + 3(y1 + y2 + y4 + y5) + 2(y3) ]
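16. Using the Trapezoidal and Simpson's rules evaluate the following integral with number of subintervals n = 8
$I = \int_0^{0.8} \frac{dx}{4 + x^2}$
Solution:
Here the integrand is y = f(x) = (4 + x^2)^(-1) and h = (0.8 - 0)/8 = 0.1, so the values y0, y1, y2, y3, y4, y5, y6, y7, y8
of (4 + x^2)^(-1) are needed at x = 0, 0.1, 0.2, ..., 0.8.
The exact value is
$\int_0^{0.8} \frac{dx}{4 + x^2} = \left[ \frac{1}{2} \tan^{-1} \frac{x}{2} \right]_0^{0.8} = 0.5 \tan^{-1} 0.4 - 0.5 \tan^{-1} 0 = 0.5 \tan^{-1} 0.4$
= 0.190253 (with the angle measured in radians)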
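The sketch below evaluates this integral with the composite trapezoidal rule, (h/2)[y0 + yn + 2(y1 + ... + y_{n-1})], and with Simpson's 1/3rd rule, and compares both with the exact value 0.5 arctan(0.4). The function names are illustrative assumptions.

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal subintervals."""
    h = (b - a) / n
    inner = sum(f(a + i * h) for i in range(1, n))
    return h / 2 * (f(a) + f(b) + 2 * inner)

def simpson_13(f, a, b, n):
    """Composite Simpson's 1/3rd rule; n must be even."""
    if n % 2 != 0:
        raise ValueError("n must be even for Simpson's 1/3rd rule")
    h = (b - a) / n
    odd = sum(f(a + i * h) for i in range(1, n, 2))
    even = sum(f(a + i * h) for i in range(2, n, 2))
    return h / 3 * (f(a) + f(b) + 4 * odd + 2 * even)

f = lambda x: 1.0 / (4.0 + x * x)
exact = 0.5 * math.atan(0.4)          # math.atan returns radians

t = trapezoid(f, 0.0, 0.8, 8)
s = simpson_13(f, 0.0, 0.8, 8)
print(t, s, exact)
```

Since n = 8 is not a multiple of 3, Simpson's 3/8th rule cannot be applied to this subdivision.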
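$I = \int_0^{0.6} \frac{dx}{\sqrt{1 + x}}$
Solution:
Here the integrand is $y = f(x) = \frac{1}{\sqrt{1 + x}}$ and h = 0.1
x                 0     0.1        0.2        0.3        0.4        0.5        0.6
y = 1/√(1 + x)    1     0.953462   0.912871   0.877058   0.845154   0.816496   0.790569
                  y0    y1         y2         y3         y4         y5         y6
By Simpson's 1/3rd rule, I = (h/3) [ y0 + y6 + 4(y1 + y3 + y5) + 2(y2 + y4) ]
By Simpson's 3/8th rule, I = (3h/8) [ y0 + y6 + 3(y1 + y2 + y4 + y5) + 2(y3) ]
= (0.3/8) [ 1 + 0.790569 + 3(0.953462 + 0.912871 + 0.845154 + 0.816496) + 2(0.877058) ]
= 0.529824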
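Simpson's 3/8th rule for this integral can be checked with the following Python sketch. The function name `simpson_38` is illustrative; the exact value 2(√1.6 - 1) comes from integrating (1 + x)^(-1/2) directly.

```python
import math

def simpson_38(f, a, b, n):
    """Composite Simpson's 3/8th rule; n must be a multiple of 3."""
    if n % 3 != 0:
        raise ValueError("n must be a multiple of 3 for Simpson's 3/8th rule")
    h = (b - a) / n
    y = [f(a + i * h) for i in range(n + 1)]
    mult3 = sum(y[i] for i in range(3, n, 3))          # y3, y6, ...
    rest = sum(y[i] for i in range(1, n) if i % 3)     # y1, y2, y4, y5, ...
    return 3 * h / 8 * (y[0] + y[n] + 3 * rest + 2 * mult3)

f = lambda x: 1.0 / math.sqrt(1.0 + x)
approx = simpson_38(f, 0.0, 0.6, 6)
exact = 2.0 * (math.sqrt(1.6) - 1.0)
print(approx, exact)
```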
UNIT-II
Gauss-Seidel Iteration Method
This is an iterative method used to find an approximate solution of a system of linear equations.
Iterative methods sometimes converge faster when the matrix has large diagonal
elements, whereas the Gauss elimination method then requires more steps and more row
operations. Also, sometimes a system has many zero coefficients, or numbers with many zeros (for example
30 zeros after or before the decimal point), which require more space to store. In such cases the Gauss-Seidel
iteration method is very useful to overcome these difficulties and find an approximate solution of
a system of linear equations.
Procedure:
We shall find a solution x of the system of equations Ax=b with given initial guess x0.
Step-I Rewrite the given equations in such a way that in the first equation the coefficient of x1 is
maximum, in the second equation the coefficient of x2 is maximum, in the third equation the coefficient of
x3 is maximum and so on.
Step-III
If an initial guess is given we take that value; otherwise we assume X = (1, 1, 1) as the initial guess.
Put the values of x1, x2 obtained in (1) and (2) into the third equation to get the value of x3.
Step-IV
18. Solve the following linear equations using the Gauss-Seidel iteration method starting from 1, 1, 1
x1 + x2 + 2 x3 = 10
2x1 + 3 x2 + x3 = 12
5x1 + x2 + x3 = 15
Solution Rewriting the given equations so that each equation leads with the variable that has the largest coefficient, we get
5x1 + x2 + x3 = 15 ..........................................................(1)
2x1 + 3 x2 + x3 = 12 ..........................................................(2)
x1 + x2 + 2 x3 = 10 ..........................................................(3)
Solving (1) for x1, (2) for x2 and (3) for x3 gives
x1 = (15 - x2 - x3)/5
x2 = (12 - 2x1 - x3)/3
x3 = (10 - x1 - x2)/2
Step-1
Step-2
Step-3
Step-4
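The individual steps can be generated with the Python sketch below, which applies the Gauss-Seidel update to equations (1) to (3) as written in the solution (right-hand sides 15, 12, 10). The function name and the iteration counts are illustrative assumptions.

```python
def gauss_seidel(A, b, x0, n_iter):
    """Run n_iter Gauss-Seidel sweeps, always using the newest available values."""
    n = len(b)
    x = list(x0)
    for _ in range(n_iter):
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x

A = [[5, 1, 1],
     [2, 3, 1],
     [1, 1, 2]]
b = [15, 12, 10]

for step in range(1, 5):
    print("Step", step, [round(v, 4) for v in gauss_seidel(A, b, [1, 1, 1], step)])

x = gauss_seidel(A, b, [1, 1, 1], 50)   # many sweeps, to see the limiting values
print([round(v, 4) for v in x])
```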
Power Method
This is an iterative method used to find approximate values of the eigenvalues and eigenvectors
of an n x n non-singular matrix A.
Procedure:
Starting from an initial vector x0, we compute
x1 = A x0
x2 = A x1
x3 = A x2
..........................
xn = A x_{n-1}
scaling each vector after every step. For any n x n non-singular matrix A we can apply this method, and we get the dominant
eigenvalue λ, i.e. the eigenvalue whose absolute value is greater than that of all other
eigenvalues.
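Theorem: Let A be an n x n real symmetric matrix. Let x ≠ 0 be any real vector with n
components. Let y = Ax, m0 = x^T x, m1 = x^T y, m2 = y^T y. Then the quotient r = m1/m0 is an
approximation for an eigenvalue λ of A, and
assuming r = λ - ϵ we have $|\epsilon| \le \sqrt{\frac{m_2}{m_0} - r^2}$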
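19. Find the eigenvalues and eigenvectors of the matrix $\begin{bmatrix} 6 & 3 \\ 3 & 2 \end{bmatrix}$ by the Power method,
taking x0 = [1 1]^T
Solution Let $A = \begin{bmatrix} 6 & 3 \\ 3 & 2 \end{bmatrix}$. Given x0 = [1 1]^T
$x_1 = A x_0 = \begin{bmatrix} 6 & 3 \\ 3 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 9 \\ 5 \end{bmatrix} = 9 \begin{bmatrix} 1 \\ 5/9 \end{bmatrix}$
The dominant eigenvalue is approximately 9 and the eigenvector is [1 5/9]^T
$x_2 = A x_1 = \begin{bmatrix} 6 & 3 \\ 3 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 5/9 \end{bmatrix} = \begin{bmatrix} 7.666 \\ 4.111 \end{bmatrix} = 7.666 \begin{bmatrix} 1 \\ 0.536 \end{bmatrix}$
The dominant eigenvalue is approximately 7.666 and the eigenvector is [1 0.536]^T
$x_3 = A x_2 = \begin{bmatrix} 6 & 3 \\ 3 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 0.536 \end{bmatrix} = \begin{bmatrix} 7.608 \\ 4.072 \end{bmatrix} = 7.608 \begin{bmatrix} 1 \\ 0.535 \end{bmatrix}$
The dominant eigenvalue is approximately 7.608 and the eigenvector is [1 0.535]^T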
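20. Find the eigenvalues and eigenvectors of the matrix $\begin{bmatrix} 6 & 3 & 1 \\ 3 & 2 & 0 \\ 1 & 4 & 5 \end{bmatrix}$ by the Power method,
taking x0 = [1 1 1]^T
Solution Let $A = \begin{bmatrix} 6 & 3 & 1 \\ 3 & 2 & 0 \\ 1 & 4 & 5 \end{bmatrix}$. Given x0 = [1 1 1]^T
$x_1 = A x_0 = \begin{bmatrix} 10 \\ 5 \\ 10 \end{bmatrix} = 10 \begin{bmatrix} 1 \\ 0.5 \\ 1 \end{bmatrix}$
The dominant eigenvalue is approximately 10 and the eigenvector is [1 0.5 1]^T
$x_2 = A x_1 = \begin{bmatrix} 6 & 3 & 1 \\ 3 & 2 & 0 \\ 1 & 4 & 5 \end{bmatrix} \begin{bmatrix} 1 \\ 0.5 \\ 1 \end{bmatrix} = \begin{bmatrix} 8.5 \\ 4 \\ 8 \end{bmatrix} = 8.5 \begin{bmatrix} 1 \\ 0.4705 \\ 0.9411 \end{bmatrix}$
The dominant eigenvalue is approximately 8.5 and the eigenvector is [1 0.4705 0.9411]^T
$x_3 = A x_2 = \begin{bmatrix} 6 & 3 & 1 \\ 3 & 2 & 0 \\ 1 & 4 & 5 \end{bmatrix} \begin{bmatrix} 1 \\ 0.4705 \\ 0.9411 \end{bmatrix} = \begin{bmatrix} 8.3526 \\ 3.941 \\ 7.5875 \end{bmatrix} = 8.3526 \begin{bmatrix} 1 \\ 0.4718 \\ 0.9084 \end{bmatrix}$
The dominant eigenvalue is approximately 8.3526 and the eigenvector is [1 0.4718 0.9084]^T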
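The power method iterations of Problems 19 and 20 can be reproduced with the Python sketch below. The helper names are illustrative, and scaling by the component of largest magnitude is an assumption that matches the scaling by the first component used in the worked solutions.

```python
def mat_vec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def power_method(A, x0, n_iter):
    """Return (eigenvalue estimate, scaled eigenvector) after n_iter multiplications."""
    x = list(x0)
    lam = None
    for _ in range(n_iter):
        y = mat_vec(A, x)
        lam = max(y, key=abs)          # scaling factor: component of largest magnitude
        x = [c / lam for c in y]
    return lam, x

A2 = [[6, 3], [3, 2]]
A3 = [[6, 3, 1], [3, 2, 0], [1, 4, 5]]

lam2_step3, v2_step3 = power_method(A2, [1, 1], 3)
lam2, v2 = power_method(A2, [1, 1], 50)
lam3_step3, v3_step3 = power_method(A3, [1, 1, 1], 3)
print(lam2_step3, v2_step3)
print(lam2, v2)
print(lam3_step3, v3_step3)
```

For the 2 x 2 matrix the estimates approach 4 + √13, the larger root of the characteristic equation λ² - 8λ + 3 = 0.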
Unit III: Solution of IVP by Euler’s method, Heun’s method and Runge-Kutta fourth order
method. Basic concept of optimization, Linear programming, simplex method, degeneracy,
and Big-M method.
Consider the initial value problem (IVP)
y' = f(x, y)
y(x0) = y0 ..........................................................(1)
The sufficient conditions for the existence of a unique solution on the interval [x0, b] are the
well-known Lipschitz conditions. However, in numerical analysis one finds values of y at
successive steps x = x1, x2, ..., xn with spacing h. There are many numerical methods
available to find the solution of an IVP, such as Picard's method, Euler's method, Taylor series
method, Runge-Kutta method etc.
Here we solve the IVP using a numerical scheme applied at the discrete nodes x_n = x0 + nh, where h is the step size, by
Euler's method, Heun's method and the Runge-Kutta method.
In Euler's method we use the slope evaluated at the current level (x_n, y_n) and use
that value as an approximation of the slope throughout the interval (x_n, x_{n+1}).
Heun's method samples the slope at the beginning and at the end of the interval and uses the average
as the final approximation of the slope. It is also known as the Runge-Kutta method of
order 2.
The Runge-Kutta method of order 4 improves on Euler's method by looking at the slope at
multiple points.
Euler's method:
$y_{j+1} = y_j + h f(x_j, y_j)$, j = 0, 1, 2, ..., n - 1.
Heun's method:
$y_{j+1} = y_j + \frac{1}{2}(k_1 + k_2)$, j = 0, 1, 2, ..., n - 1.
where $k_1 = h f(x_j, y_j)$, $k_2 = h f(x_j + h, y_j + k_1)$
The necessary formula for solution of (1) by the Runge-Kutta method of order 4 is:
$y_{j+1} = y_j + \frac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4)$, j = 0, 1, 2, ..., n - 1.
where $k_1 = h f(x_j, y_j)$
$k_2 = h f(x_j + \frac{1}{2}h, y_j + \frac{1}{2}k_1)$
$k_3 = h f(x_j + \frac{1}{2}h, y_j + \frac{1}{2}k_2)$
$k_4 = h f(x_j + h, y_j + k_3)$
Example: Use the Euler method to solve numerically the initial value problem
u' = -2tu^2, u(0) = 1
with step size h = 0.2 on the interval [0, 1].
We have
$u_{j+1} = u_j - 2 h t_j u_j^2$, j = 0, 1, 2, 3, 4. [Here x and y are replaced by t and u
respectively]
For j = 0: t0 = 0, u0 = 1, so u(0.2) = u1 = 1.
For j = 1: t1 = 0.2, u1 = 1, so u(0.4) = u2 = 0.92.
Similarly, we get
u(0.6) = u3 = 0.78458,
u(0.8) = u4 = 0.63684,
u(1.0) = u5 = 0.50706.
Note: In a similar way the IVP can be solved by Heun's method and the Runge-Kutta fourth order
method.
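A compact Python sketch of all three schemes applied to this example is given below. The function names and the comparison with the exact solution u(t) = 1/(1 + t²), which can be verified by substitution into u' = -2tu², are additions for illustration.

```python
def euler(f, t, u, h, n):
    for _ in range(n):
        u += h * f(t, u)
        t += h
    return u

def heun(f, t, u, h, n):
    for _ in range(n):
        k1 = h * f(t, u)
        k2 = h * f(t + h, u + k1)
        u += 0.5 * (k1 + k2)
        t += h
    return u

def rk4(f, t, u, h, n):
    for _ in range(n):
        k1 = h * f(t, u)
        k2 = h * f(t + h / 2, u + k1 / 2)
        k3 = h * f(t + h / 2, u + k2 / 2)
        k4 = h * f(t + h, u + k3)
        u += (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return u

f = lambda t, u: -2.0 * t * u * u

u_euler = euler(f, 0.0, 1.0, 0.2, 5)
u_heun = heun(f, 0.0, 1.0, 0.2, 5)
u_rk4 = rk4(f, 0.0, 1.0, 0.2, 5)
print(u_euler, u_heun, u_rk4, 1.0 / (1.0 + 1.0 ** 2))
```

With the same step size, the higher order methods land much closer to the exact value u(1) = 0.5 than Euler's method does.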
Optimization
Optimization is the means by which scarce resources can be utilized in an efficient manner
so as to maximize the profit or minimize the loss.
A set of constraints are those which allow the unknowns to take on certain values but
exclude others. In the manufacturing problem, one cannot spend negative amount of time on
any activity, so one constraint is that the "time" variables are to be non-negative. In the pier
design problem, one would probably want to limit the breadth of the base and to constrain its
size.
The optimization problem is then to find values of the variables that minimize or maximize
the objective function while satisfying the constraints.
Objective Function
As already stated, the objective function is the mathematical function one wants to maximize
or minimize, subject to certain constraints. Many optimization problems have a single
objective function.
In the present context we will apply the optimization technique to the linear programming
problem.
The general form of a linear programming problem is:
…. …. ….
To solve Linear Programming problem (LPP), Graphical method helps to visualize the
procedure explicitly. It also helps to understand the different terminologies associated with
the solution of LPP. Let us discuss these aspects with the help of an example. However, this
visualization is possible for a maximum of two decision variables. Thus, a LPP with two
decision variables is opted for discussion. However, the basic principle remains the same for
more than two decision variables also, even though the visualization beyond two-dimensional
case is not easily possible.
Let us consider the same LPP (general form) discussed in previous class, stated here once
again for convenience.
Maximize Z = 6x +5y
subject to 2x −3y ≤ 5 (C −1)
x +3y ≤11 (C − 2)
4x + y ≤15 (C −3)
x, y ≥ 0 (C − 4) & (C −5)
First step to solve above LPP by graphical method, is to plot the inequality constraints one-
by-one on a graph paper. Fig. 1a shows one such plotted constraint.
[Fig. 1a: Plot of the constraint 2x − 3y ≤ 5]
Fig. 1b shows all the constraints including the nonnegativity of the decision variables (i.e., x
≥ 0 and y ≥ 0 ).
[Fig. 1b: Plot of all the constraints 2x − 3y ≤ 5, x + 3y ≤ 11, 4x + y ≤ 15, x ≥ 0 and y ≥ 0]
Common region of all these constraints is known as feasible region (Fig. 1c). Feasible region
implies that each and every point in this region satisfies all the constraints involved in the
LPP.
[Fig. 1c: Feasible region]
As the (optimum) value of Z is not known, the objective function is plotted by considering any
constant k (Fig. 1d). The straight line 6x + 5y = k (constant) is known as the Z line (Fig. 1d).
This line can be shifted in its perpendicular direction (as shown in Fig. 1d) by changing
the value of k. Note that the Z line shown in Fig. 1d has intercept c = 3 on the
y axis. Since 6x + 5y = k => 5y = −6x + k => y = −(6/5)x + k/5, we have slope m = −6/5 and
c = k/5 = 3 => k = 15.
[Fig. 1d: Z line]
[Fig. 1e: Z line through the optimal point]
Now it can be visually noticed that the value of the objective function will be maximum when the Z line
passes through the intersection of x + 3y = 11 and 4x + y = 15 (the straight lines associated with the
second and third inequality constraints). This is known as the optimal point (Fig. 1e). Thus the
optimal point of the present problem is x* = 3.091 and y* = 2.636, and the optimal solution is
Z* = 6x* + 5y* = 31.727
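Since the optimum of a bounded LPP lies at a vertex of the feasible region, the graphical result can be checked by intersecting the constraint lines pairwise and keeping the feasible intersection points. The Python sketch below does this for the present example; it is an illustrative check that assumes a bounded feasible region, not a general LPP solver.

```python
from itertools import combinations

# each constraint is (a, b, c), meaning a*x + b*y <= c
constraints = [(2, -3, 5), (1, 3, 11), (4, 1, 15), (-1, 0, 0), (0, -1, 0)]

def feasible_vertices(cons, eps=1e-9):
    """Intersect every pair of boundary lines and keep the points satisfying all constraints."""
    points = []
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < eps:
            continue                       # parallel lines: no intersection point
        x = (c1 * b2 - c2 * b1) / det      # Cramer's rule
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c + eps for a, b, c in cons):
            points.append((x, y))
    return points

objective = lambda p: 6 * p[0] + 5 * p[1]
best = max(feasible_vertices(constraints), key=objective)
print(best, objective(best))
```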
Visual representation of different cases of solution of LPP
A linear programming problem may have i) a unique, finite solution, ii) an unbounded
solution, iii) multiple (or an infinite number of) optimal solutions, iv) an infeasible solution and v) a
unique feasible point. In the context of the graphical method it is easy to visually demonstrate the
different situations which may result in different types of solutions.
The example demonstrated above is an example of an LPP having a unique, finite solution. In
such cases, the optimum value occurs at an extreme point or vertex of the feasible region.
Unbounded solution
If the feasible region is not bounded, it is possible that the value of the objective function
goes on increasing without leaving the feasible region. This is known as unbounded solution
(Fig 2).
[Fig. 2: Unbounded solution, showing the Z line]
Multiple (infinite) solutions
If the Z line is parallel to any side of the feasible region, all the points lying on that side
are optimal solutions.
[Fig. 3: Z line parallel to one side of the feasible region]
Infeasible solution
Sometimes, the set of constraints does not form a feasible region at all due to inconsistency in
the constraints. In such situation the LPP is said to have infeasible solution. Fig 4 illustrates
such a situation.
[Fig. 4: Infeasible solution, showing the Z line]
Unique feasible point
This situation arises when the feasible region consists of a single point. This situation may occur
only when the number of constraints is at least equal to the number of decision variables. An
example is shown in Fig 5. In this case, there is no need for optimization as there is only one
solution.
[Fig. 5: Unique feasible point]
Recall from the previous discussion that the optimal solution of an LPP, if it exists, lies at
one of the vertices of the feasible region. Thus one way to find the optimal solution is to
find all the basic feasible solutions of the standard form and investigate them one by
one to get the optimal one. However, again recall that for m equations with n variables
there exists a huge number (nCm) of basic feasible solutions. In such a case, inspection
of all the solutions one by one is not practically feasible. However, this can be
overcome by the simplex method. The conceptual principle of this method can be easily
understood for a three dimensional case (however, the simplex method is applicable for any
higher dimensional case as well).
Imagine a feasible region (i.e., volume) bounded by several surfaces. Each vertex of
this volume, which is a basic feasible solution, is connected to three other adjacent
vertices by a straight line to each being the intersection of two surfaces. Being at any
one vertex (one of the basic feasible solutions), simplex algorithm helps to move to
another adjacent vertex which is closest to the optimal solution among all the adjacent
vertices. Thus, it follows the shortest route to reach the optimal solution from the
starting point. It can be noted that the shortest route consists of a sequence of basic
feasible solutions which is generated by simplex algorithm.
Simplex algorithm
Simplex algorithm is discussed using an example of LPP. Let us consider the
following problem.
It can be recalled that x4, x5 and x6 are slack variables. The above set of equations, including the
objective function, can be transformed to canonical form as follows:
Z = 0. It can be noted that x4, x5 and x6 are known as basic variables and x1, x2 and x3 are
known as nonbasic variables of the canonical form shown above. Let us denote each equation
of the above canonical form as:
For ease of discussion, the right hand side constants and the coefficients of the variables are
symbolized as follows:
The left-most column is known as the basis, as it consists of the basic variables. The
coefficients in the first row (c1, ..., c6) are known as cost coefficients. Other subscript
notations are self explanatory and used for ease of discussion. For each coefficient, the first
subscript indicates the subscript of the basic variable in that equation. The second subscript
indicates the subscript of the variable with which the coefficient is associated. For example, c52 is
the coefficient of x2 in the equation having the basic variable x5 with nonzero coefficient (i.e.,
c55 is nonzero).
This completes first step of calculation. After completing each step (iteration) of calculation,
three points are to be examined:
1. Is there any possibility of further improvement?
The entering nonbasic variable is decided such that a unit change of this variable
has the maximum effect on the objective function. Thus the variable having
the minimum coefficient among all the cost coefficients is to be
entered, i.e., x_s is to be entered if the cost coefficient c_s is minimum.
After deciding the entering variable x_s, x_r (from the set of basic variables) is
decided to be the exiting variable if $b_r / c_{rs}$ is minimum for all possible r, provided
c_rs is positive.
It can be noted that c_rs is considered as the pivotal element to obtain the next
canonical form.
In this example, c1 (= −4) is the minimum. Thus, x1 is the entering variable for the next step
of calculation. r may take any value from 4, 5 and 6. It is found that b4/c41 = 6/2 = 3,
b5/c51 = 0/1 = 0 and b6/c61 = 4/5 = 0.8. As b5/c51 is minimum, r is 5. Thus x5 is to be exited, c51 is
the pivotal element and x5 is replaced by x1 in the basis. The set of equations is transformed
through pivotal operation to another canonical form considering c51 as the pivotal element.
The procedure of pivotal operation is already explained in first class. However, as a refresher
it is explained here once again.
1. Pivotal row is transformed by dividing it with the pivotal element. In this case, pivotal
element is 1.
2. For other rows: Let the coefficient of the element in the pivotal column of a particular row
be “l”. Let the pivotal element be “m”. Then the pivotal row is multiplied by l / m and then
subtracted from that row to be transformed. This operation ensures that the coefficients of
the element in the pivotal column of that row becomes zero, e.g., Z row: l = -4 , m = 1. So,
pivotal row is multiplied by l / m = -4 / 1 = -4, obtaining
After the pivotal operation, the canonical form obtained is shown below.
Z = 0. However, this is not the optimum solution as the cost coefficient c2 is negative. It is
observed that c2 (= −15) is minimum. Thus, s = 2 and x2 is the entering variable. r may take
any value from 4, 1 and 6. However, c12 (= −4) is negative. Thus, r may be either 4 or 6. It is
found that b4/c42 = 6/9 = 0.667 and b6/c62 = 4/18 = 0.222. As b6/c62 is minimum, r is 6 and x6 is to
be exited from the basis. c62 (= 18) is to be treated as the pivotal element. The canonical form for
the next iteration is as follows:
(Z)   0x1 + 0x2 − 4x3 + 0x4 − (1/6)x5 + (5/6)x6 + Z = 10/3
(x4)  0x1 + 0x2 + 4x3 + 1x4 + (1/2)x5 − (1/2)x6 = 4
(x1)  1x1 + 0x2 − (2/3)x3 + 0x4 − (1/9)x5 + (2/9)x6 = 8/9
(x2)  0x1 + 1x2 − (2/3)x3 + 0x4 − (5/18)x5 + (1/18)x6 = 2/9
It is observed that c3 (= −4) is negative. Thus, the optimum is not yet achieved. Following a similar
procedure as above, it is decided that x3 should be entered in the basis and x4 should be
exited from the basis. Thus, x4 is replaced by x3 in the basis. The set of equations is
transformed to another canonical form considering c43 (= 4) as the pivotal element. By doing so,
the canonical form is shown below.
(Z)   0x1 + 0x2 + 0x3 + 1x4 + (1/3)x5 + (1/3)x6 + Z = 22/3
(x3)  0x1 + 0x2 + 1x3 + (1/4)x4 + (1/8)x5 − (1/8)x6 = 1
(x1)  1x1 + 0x2 + 0x3 + (1/6)x4 − (1/36)x5 + (5/36)x6 = 14/9
(x2)  0x1 + 1x2 + 0x3 + (1/6)x4 − (7/36)x5 − (1/36)x6 = 8/9
All the cost coefficients are now nonnegative, so the optimum is reached, with
x1 = 14/9 = 1.556,
x2 = 8/9 = 0.889,
x3 = 1
The calculation shown above can be presented in a tabular form, which is known as Simplex
Tableau. Construction of Simplex Tableau will be discussed next.
Construction of Simplex Tableau
Same LPP is considered for the construction of simplex tableau. This helps to compare the
calculation shown above and the construction of simplex tableau for it.
After preparing the canonical form of the given LPP, simplex tableau is constructed as
follows.
Iteration  Basis  Z   x1   x2   x3   x4   x5   x6   b_r   b_r/c_rs
1          Z      1   -4   1    -2   0    0    0    0     --
           x4     0   2    1    2    1    0    0    6     3
           x5     0   1    -4   2    0    1    0    0     0
           x6     0   5    -2   -2   0    0    1    4     4/5
After completing each iteration, the steps given below are to be followed.
Logically, these steps are exactly similar to the procedure described earlier. However, steps
described here are somewhat mechanical and easy to remember!
1. Investigate whether all the elements in the first row (i.e., Z row) are nonnegative
or not. Basically these elements are the coefficients of the variables headed by
that column. If all such coefficients are nonnegative, optimum solution is
obtained and no need of further iterations. If any element in this row is negative,
the operation to obtain simplex tableau for the next iteration is as follows:
Operations to obtain next simplex tableau:
3. The exiting variable from the basis is identified (described earlier). The
corresponding row is marked as Pivotal Row as shown above.
6. All the elements in the pivotal row are divided by pivotal element.
7. For any other row, an elementary operation is identified such that the coefficient in
the pivotal column in that row becomes zero. The same operation is applied for all
other elements in that row and the coefficients are changed accordingly. A similar
procedure is followed for all other rows.
For example, say, (2 x pivotal element + pivotal coefficient in first row) produce zero
in the pivotal column in first row. The same operation is applied for all other
elements in the first row and the coefficients are changed accordingly.
Simplex tableaus for successive iterations are shown below. Pivotal Row, Pivotal Column
and Pivotal Element for each tableau are marked as earlier for the ease of understanding.
Iteration  Basis  Z   x1   x2    x3     x4    x5      x6      b_r    b_r/c_rs
2          Z      1   0    -15   6      0     4       0       0      --
           x4     0   0    9     -2     1     -2      0       6      2/3
           x1     0   1    -4    2      0     1       0       0      --
           x6     0   0    18    -12    0     -5      1       4      2/9
3          Z      1   0    0     -4     0     -1/6    5/6     10/3   --
           x4     0   0    0     4      1     1/2     -1/2    4      1
           x1     0   1    0     -2/3   0     -1/9    2/9     8/9    --
           x2     0   0    1     -2/3   0     -5/18   1/18    2/9    --
4          Z      1   0    0     0      1     1/3     1/3     22/3
           x3     0   0    0     1      1/4   1/8     -1/8    1
           x1     0   1    0     0      1/6   -1/36   5/36    14/9
           x2     0   0    1     0      1/6   -7/36   -1/36   8/9
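As all the elements in the Z row are nonnegative, the optimum is reached. The optimum value of Z is 22/3, the
values of the basic variables are x1 = 14/9 = 1.556, x2 = 8/9 = 0.889, x3 = 1 and those of
the nonbasic variables are all zero.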
It can be noted that at any iteration the following two points must be satisfied:
1. All the basic variables (other than Z) have a coefficient of zero in the Z row.
If any of these points is violated at any iteration, it indicates a wrong calculation. However,
the reverse is not true.
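The pivoting operations described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the notes' own procedure: the function names `pivot` and `simplex` are hypothetical, the Z column is omitted, and the starting data are the rows of the iteration-2 tableau. Exact fractions are used so the results can be compared with the tableaus by hand.

```python
from fractions import Fraction

def pivot(tableau, row, col):
    """Divide the pivotal row by the pivotal element, then zero the rest of the column."""
    pivot_element = tableau[row][col]
    tableau[row] = [v / pivot_element for v in tableau[row]]
    for r in range(len(tableau)):
        if r != row:
            factor = tableau[r][col]
            tableau[r] = [v - factor * p for v, p in zip(tableau[r], tableau[row])]

def simplex(tableau, basis):
    """Repeat pivoting until no Z-row coefficient is negative (maximization)."""
    while True:
        z_row = tableau[0][:-1]
        col = min(range(len(z_row)), key=lambda j: z_row[j])  # entering variable
        if z_row[col] >= 0:
            return
        # Ratio test br/crs over rows with a positive pivotal-column coefficient.
        ratios = [(tableau[r][-1] / tableau[r][col], r)
                  for r in range(1, len(tableau)) if tableau[r][col] > 0]
        if not ratios:
            raise ValueError("no departing variable: unbounded solution")
        _, row = min(ratios)                                   # exiting variable
        pivot(tableau, row, col)
        basis[row - 1] = col

names = ["x1", "x2", "x3", "x4", "x5", "x6"]
rows = [
    [0, -15,   6, 0,  4, 0, 0],   # Z row (last entry is br)
    [0,   9,  -2, 1, -2, 0, 6],   # x4 row
    [1,  -4,   2, 0,  1, 0, 0],   # x1 row
    [0,  18, -12, 0, -5, 1, 4],   # x6 row
]
tableau = [[Fraction(v) for v in row] for row in rows]
basis = [3, 0, 5]                 # column indices of x4, x1, x6

simplex(tableau, basis)
solution = {names[b]: tableau[i + 1][-1] for i, b in enumerate(basis)}
optimum = tableau[0][-1]
print(optimum, solution)
```

Choosing the most negative Z-row coefficient and the smallest positive ratio mirrors the selection rules of the text; other pivoting rules exist and are not covered here.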
Big-M method
Introduction
In the previous lecture, the simplex method was discussed along with the required transformation
of the objective function and constraints. However, all the constraints were inequalities of the
‘less-than-equal-to’ ( ≤ ) type. ‘Greater-than-equal-to’ ( ≥ ) and ‘equality’ ( = )
constraints are also possible. In such cases a modified approach is followed, which is
discussed in this lecture. Different types of LPP solutions in the context of the simplex method
are also discussed. Finally, minimization versus maximization is
presented.
• Cost coefficients, which are supposed to be placed in the Z-row in the initial simplex
tableau, are transformed by ‘pivotal operation’ considering the column of artificial
variable as ‘pivotal column’ and the row of the artificial variable as ‘pivotal row’.
• If there is more than one artificial variable, step 3 is repeated for each artificial
variable, one by one.
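where x3 is a surplus variable, x4 is a slack variable, and a1 and a2 are the artificial variables.
The cost coefficients in the objective function are modified considering the first constraint as
follows, with the a1 column as the pivotal column:
$$Z - 3x_1 - 5x_2 + M a_1 + M a_2 = 0$$
$$x_1 + x_2 - x_3 + a_1 = 2$$
Subtracting M times the constraint from the objective row gives
$$Z - (3 + M)x_1 - (5 + M)x_2 + M x_3 + 0\,a_1 + M a_2 = -2M.$$
Next, the revised objective function is considered with the third constraint as follows, with the
a2 column as the pivotal column:
$$3x_1 + 2x_2 + a_2 = 18$$
Subtracting M times this constraint from the revised objective row gives
$$Z - (3 + 4M)x_1 - (5 + 3M)x_2 + M x_3 + 0\,a_1 + 0\,a_2 = -20M.$$
The modified cost coefficients are to be used in the Z-row of the first simplex tableau.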
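The elimination of the artificial variables from the Z-row can be checked with a short sketch. It assumes the objective row Z − 3x1 − 5x2 + M a1 + M a2 = 0, which is inferred from the Z-row of the first tableau rather than stated in this section, and it stores each coefficient a + bM as the pair (a, b). The helper name `subtract_M_times` is hypothetical.

```python
def subtract_M_times(z_row, constraint):
    """Return z_row - M * constraint, coefficient by coefficient.

    Each Z-row entry a + b*M is stored as (a, b); constraint entries are plain numbers.
    """
    return [(const, m_part - c) for (const, m_part), c in zip(z_row, constraint)]

# Columns: x1, x2, x3, x4, a1, a2, right-hand side.
z_row = [(-3, 0), (-5, 0), (0, 0), (0, 0), (0, 1), (0, 1), (0, 0)]  # assumed objective row
first_constraint = [1, 1, -1, 0, 1, 0, 2]    # x1 + x2 - x3 + a1 = 2
third_constraint = [3, 2, 0, 0, 0, 1, 18]    # 3x1 + 2x2 + a2 = 18

z_row = subtract_M_times(z_row, first_constraint)   # removes a1 from the Z-row
z_row = subtract_M_times(z_row, third_constraint)   # removes a2 from the Z-row
print(z_row)
```

The resulting pairs correspond to −3 − 4M, −5 − 3M, M, 0, 0, 0 and −20M, the entries of the Z-row in the first tableau.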
Next, let us move to the construction of simplex tableau. Pivotal column, pivotal row and
pivotal element are marked (same as used in the last class) for the ease of understanding.
Iteration  Basis  Z   x1        x2        x3   x4   a1   a2   br      br/crs
1          Z      1   -3 - 4M   -5 - 3M   M    0    0    0    -20M    --
           a1     0   1         1         -1   0    1    0    2       2
           x4     0   0         1         0    1    0    0    6       --
           a2     0   3         2         0    0    0    1    18      6
Iteration  Basis  Z   x1   x2       x3        x4   a1       a2   br        br/crs
2          Z      1   0    -2 + M   -3 - 3M   0    3 + 4M   0    6 - 12M   --
           x1     0   1    1        -1        0    1        0    2         --
           x4     0   0    1        0         1    0        0    6         --
           a2     0   0    -1       3         0    -3       1    12        4
Iteration  Basis  Z   x1   x2     x3   x4     a1   a2      br   br/crs
3          Z      1   0    -3     0    0      M    1 + M   18   --
           x1     0   1    2/3    0    0      0    1/3     6    9
           x4     0   0    1      0    1      0    0       6    6
           x3     0   0    -1/3   1    0      -1   1/3     4    --
4          Z      1   0    0      0    3      M    1 + M   36   --
           x1     0   1    0      0    -2/3   0    1/3     2    --
           x2     0   0    1      0    1      0    0       6    --
           x3     0   0    0      1    1/3    -1   1/3     6    --
The optimum solution is Z = 36 with x1 = 2 and x2 = 6. The methodology explained above is known as the Big-M method.
As already discussed in lecture notes 2, a linear programming problem may have different
types of solutions corresponding to different situations. A visual demonstration of these
situations was also given in the context of the graphical method. Here, the
same is discussed in the context of the simplex method.
Unbounded solution
If at any iteration no departing variable can be found corresponding to entering variable, the
value of the objective function can be increased indefinitely, i.e., the solution is unbounded.
Multiple (infinite) solutions
If, in the final tableau, one of the non-basic variables has a coefficient of 0 in the Z-row, it
indicates that an alternative solution exists. This non-basic variable can be incorporated in the
basis to obtain another optimal solution. Once two such optimal solutions are obtained, an
infinite number of optimal solutions can be obtained by taking a weighted average (convex
combination) of the two optimal solutions.
Curious readers may find that the only modification is that the coefficient of x2 is changed
from 5 to 2 in the objective function. Thus the slope of the objective function and that of third
constraint are now same. It may be recalled from lecture notes 2, that if the Z line is parallel to
any side of the feasible region (i.e., one of the constraints) all the points lying on that side
constitute optimal solutions (refer fig 3 in lecture notes 2). So, reader should be able to
imagine graphically that the LPP is having infinite solutions. However, for this particular set
of constraints, if the objective function is made parallel (with equal slope) to either the first
constraint or the second constraint, it will not lead to multiple solutions. The reason is very
simple and left for the reader to find out. As a hint, plot all the constraints and the objective
function on an arithmetic paper.
Now, let us see how it can be found in the simplex tableau. Coming back to our problem,
final tableau is shown as follows. Full problem is left to the reader as practice.
Final tableau:
Iteration  Basis  Z   x1   x2     x3   x4   a1   a2      br   br/crs
3          Z      1   0    0      0    0    M    1 + M   18   --
           x1     0   1    2/3    0    0    0    1/3     6    9
           x4     0   0    1      0    1    0    0       6    6
           x3     0   0    -1/3   1    0    -1   1/3     4    --
Coefficient of non-basic variable x2 is zero
As there is no negative coefficient in the Z-row, the optimum is reached. The solution is Z = 18
with x1 = 6 and x2 = 0. However, the coefficient of the non-basic variable x2 is zero, as shown in
the final simplex tableau. So, another solution is possible by incorporating x2 in the basis.
Based on the ratio br/crs, x4 will be the exiting variable. The next tableau will be as follows:
Iteration  Basis  Z   x1   x2   x3   x4     a1   a2      br   br/crs
4          Z      1   0    0    0    0      M    1 + M   18   --
           x1     0   1    0    0    -2/3   0    1/3     2    --
           x2     0   0    1    0    1      0    0       6    6
           x3     0   0    0    1    1/3    -1   1/3     6    18
It may be noted that the coefficient of the non-basic variable x4 is zero, as shown in the tableau. If one
more similar step is performed, the same simplex tableau as at iteration 3 will be obtained.
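Thus, we have two sets of solutions, $(x_1, x_2) = (6, 0)$ and $(x_1, x_2) = (2, 6)$. Other optimal
solutions are obtained as
$$\beta \begin{pmatrix} 6 \\ 0 \end{pmatrix} + (1 - \beta) \begin{pmatrix} 2 \\ 6 \end{pmatrix}, \quad \text{where } \beta \in [0, 1].$$
For example, let β = 0.4; the corresponding solution is
$$\begin{pmatrix} 3.6 \\ 3.6 \end{pmatrix},$$
i.e., x1 = 3.6 and x2 = 3.6. Note that the value of the objective function is unchanged (Z = 18).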
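The weighted combination of the two optimal solutions can be verified numerically. The sketch below assumes the modified objective Z = 3x1 + 2x2 (the x1 coefficient is inferred from the tableau, the x2 coefficient from the remark that it was changed from 5 to 2); the names `blend` and `objective` are illustrative only.

```python
from fractions import Fraction

def blend(p, q, beta):
    """Return beta*p + (1 - beta)*q for two points p and q."""
    return tuple(beta * a + (1 - beta) * b for a, b in zip(p, q))

def objective(point):
    # Assumed objective Z = 3*x1 + 2*x2.
    x1, x2 = point
    return 3 * x1 + 2 * x2

corner_a = (Fraction(6), Fraction(0))   # first optimal solution
corner_b = (Fraction(2), Fraction(6))   # second optimal solution

for beta in (Fraction(0), Fraction(2, 5), Fraction(1)):
    point = blend(corner_a, corner_b, beta)
    print(beta, point, objective(point))
```

Every β in [0, 1] gives the same objective value, which is what makes all points on that edge of the feasible region optimal.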
Infeasible solution
If, in the final tableau, at least one of the artificial variables still remains in the basis, the
problem has no feasible solution.
Reader may check this situation both graphically and in the context of Simplex method by
considering following problem:
2. While selecting the entering nonbasic variable, the variable having the maximum
coefficient among all the cost coefficients is to be entered. In such cases, optimal
solution would be determined from the tableau having all the cost coefficients as non-
positive ( ≤ 0 )
Still one difficulty remains in the minimization problem. Generally the minimization problems consist of
constraints with ‘greater-than-equal-to’ ( ≥ ) sign. For example, minimize the price (to compete in the
market); however, the profit should cross a minimum threshold. Whenever the goal is to minimize some
objective, lower bounded requirements play the leading role. Constraints with ‘greater-than-equal-to’ ( ≥ )
sign are obvious in practical situations.
To deal with constraints with the ‘greater-than-equal-to’ ( ≥ ) or ‘equality’ ( = ) sign, the Big-M method is to be
followed, as explained earlier.
UNIT-IV
These are n = 14 measurements of the tensile strength of sheet steel in kg/mm², recorded
in the order obtained and rounded to integer values. To see what is going on, we sort
these data, that is, we order them by size:
78 81 83 84 86 87 87 89 89 89 89 90 91 99. (2)
The numbers in (1) range from 78 to 99. We divide these numbers into 5 groups,
75 − 79, 80 − 84, 85 − 89, 90 − 94, 95 − 99. The integers in the tens position of the groups
are 7, 8, 8, 9, 9. These form the stem. The first leaf is 8. The second leaf is 134 (representing
81, 83, 84), and so on.
The number of times a value occurs is called its absolute frequency. Thus 78 has
absolute frequency 1, the value 89 has absolute frequency 4, etc. The column to the
extreme left shows the cumulative absolute frequency, that is, the sum of the absolute
frequencies of the values up to the line of the leaf. Thus the number 4 in the second line
on the left shows that (1) has 4 values up to and including 84. The number 11 in the
next line shows that there are 11 values not exceeding 89, etc. Dividing the cumulative
absolute frequencies by n (= 14) gives the cumulative relative frequencies.
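The frequency counts quoted above can be reproduced with a short sketch using the sorted data (2). The helper name `cumulative_absolute` is hypothetical; the upper limits are the ends of the five groups.

```python
from collections import Counter

data = [78, 81, 83, 84, 86, 87, 87, 89, 89, 89, 89, 90, 91, 99]
n = len(data)
absolute = Counter(data)          # absolute frequency of each value

def cumulative_absolute(limit):
    """Number of data values not exceeding `limit`."""
    return sum(count for value, count in absolute.items() if value <= limit)

for upper in (79, 84, 89, 94, 99):
    cum = cumulative_absolute(upper)
    # group upper end, cumulative absolute frequency, cumulative relative frequency
    print(upper, cum, round(cum / n, 3))
```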
Histogram
For large sets of data, histograms are better at displaying the distribution of data than
stem-and-leaf plots.
Center and Spread of data: Median
As a center of the location of data values we can simply take the median, the data
value that falls in the middle when the values are ordered. In (2) we have 14 values.
The seventh of them is 87, the eighth is 89, and we split the difference, obtaining the
median 88.
Mean, Standard deviation, Variance
The average size of the data values can be measured in a more refined way by the
mean
$$\bar{x} = \frac{1}{n}\sum_{j=1}^{n} x_j = \frac{1}{n}(x_1 + x_2 + \cdots + x_n). \qquad (3)$$
This is the arithmetic mean of the data values, obtained by taking their sum and dividing
by the data size n. Thus in (1),
$$\bar{x} = \frac{1}{14}(89 + 84 + \cdots + 89) = \frac{611}{7} \approx 87.3.$$
Similarly, the spread of the data values can be measured in a more refined way by the
standard deviation s or by its square, the variance
$$s^2 = \frac{1}{n-1}\sum_{j=1}^{n} (x_j - \bar{x})^2 = \frac{1}{n-1}\left[(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2\right].$$
Thus to obtain the variance of the data, take the difference $x_j - \bar{x}$ of each data value from
the mean, square it, take the sum of these n squares, and divide it by n − 1. To get the
standard deviation s, take the square root of $s^2$.
For example, using $\bar{x} = 611/7$, we get for the data (1) the variance $s^2 = 176/7 \approx 25.14$,
and hence the standard deviation $s \approx 5.01$.
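As a quick check of the median, mean and variance formulas, the following sketch applies Python's standard `statistics` module to the sorted data (2). Exact fractions are used so that the mean can be compared directly with 611/7; `statistics.variance` divides by n − 1, matching the definition of s² above.

```python
import statistics
from fractions import Fraction

data = [78, 81, 83, 84, 86, 87, 87, 89, 89, 89, 89, 90, 91, 99]
values = [Fraction(v) for v in data]

median = statistics.median(data)          # middle of the 7th and 8th ordered values
mean = statistics.mean(values)            # sum of the values divided by n
variance = statistics.variance(values)    # sample variance with divisor n - 1
std_dev = float(variance) ** 0.5          # standard deviation s

print(median, mean, variance, round(std_dev, 3))
```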
3 Probability
The “probability” of an event A in an experiment is supposed to measure how frequently
A is about to occur if we make many trials.
Definition 1. Probability
If the sample space S of an experiment consists of finitely many outcomes that are
equally likely, then the probability P(A) of an event A is
$$P(A) = \frac{\text{Number of points in } A}{\text{Number of points in } S}.$$
Thus in particular,
P (S) = 1.
Definition 2. Probability
Given a sample space S, with each event A of S there is associated a number P(A),
called the probability of A, such that the following axioms of probability are satisfied.
1. For every A in S,
$$0 \le P(A) \le 1.$$
2. The entire sample space S has the probability
$$P(S) = 1.$$
3. For mutually exclusive events A and B ($A \cap B = \emptyset$),
$$P(A \cup B) = P(A) + P(B).$$
this is the probability that X will assume any value not exceeding x. From (4) we obtain
the fundamental formula for the probability corresponding to an interval a < x ≦ b,
$$P(a < X \le b) = F(b) - F(a).$$
or at most countably many values x1, x2, · · · , called the possible values of X, with
positive probabilities p1 = P(X = x1), p2 = P(X = x2), · · · , whereas the probability
P(X ∈ I) is zero for any interval I containing no possible value.
Obviously, the discrete distribution is also determined by the probability function f(x)
of X, defined by
$$f(x) = \begin{cases} p_j & \text{if } x = x_j \quad (j = 1, 2, \cdots), \\ 0 & \text{otherwise.} \end{cases}$$
From this we get the values of the distribution function F(x) by taking sums,
$$F(x) = \sum_{x_j \le x} f(x_j) = \sum_{x_j \le x} p_j,$$
where for any given x we sum all the probabilities $p_j$ for which $x_j$ is smaller than or equal
to that x.
For the probability corresponding to intervals we have
$$P(a < X \le b) = \sum_{a < x_j \le b} p_j.$$
$$f(x) = F'(x)$$
Note:
$$\int_{-\infty}^{\infty} f(v)\,dv = 1.$$
Example
Let the random variable X = sum of the two numbers that turn up when two dice are thrown. The
probability function and the distribution function are as follows:
x      2      3      4      5      6      7      8      9      10     11     12
f(x)   1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36
F(x)   1/36   3/36   6/36   10/36  15/36  21/36  26/36  30/36  33/36  35/36  36/36
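The table for the two-dice example can be generated by enumerating all 36 equally likely outcomes. This is a minimal sketch; the dictionary names `f` and `F` simply follow the notation of the text.

```python
from fractions import Fraction
from itertools import product

# Count how many of the 36 outcomes give each sum.
counts = {}
for a, b in product(range(1, 7), repeat=2):
    counts[a + b] = counts.get(a + b, 0) + 1

# Probability function f(x) = (number of outcomes with sum x) / 36.
f = {x: Fraction(c, 36) for x, c in sorted(counts.items())}

# Distribution function F(x): running sum of f over all values not exceeding x.
F = {}
running = Fraction(0)
for x, p in f.items():
    running += p
    F[x] = running

for x in f:
    print(x, f[x], F[x])
```

Note that `Fraction` prints values in lowest terms, so 6/36 appears as 1/6.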
Example
Let X have the density function
$$f(x) = \begin{cases} 0.75(1 - x^2) & \text{if } -1 \le x \le 1, \\ 0 & \text{otherwise.} \end{cases}$$
Find the distribution function. Find the probabilities $P(-\tfrac{1}{2} \le X \le \tfrac{1}{2})$ and
$P(-\tfrac{1}{4} \le X \le 2)$. Find x such that $P(X \le x) = 0.95$.
Solution:
From the definition, it is clear that F(x) = 0 if x ≦ −1,
$$F(x) = 0.75 \int_{-1}^{x} (1 - v^2)\,dv = 0.5 + 0.75x - 0.25x^3, \quad -1 < x \le 1,$$
and F(x) = 1 if x > 1.
Finally,
$$P(X \le x) = F(x) = 0.5 + 0.75x - 0.25x^3 = 0.95.$$
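The remaining parts of this example can be evaluated numerically from the distribution function. The sketch below is one possible approach, not the notes' own solution: it uses F(b) − F(a) for the interval probabilities and a simple bisection (the hypothetical helper `solve_quantile`) for the cubic equation F(x) = 0.95.

```python
def F(x):
    """Distribution function of the density 0.75*(1 - x**2) on [-1, 1]."""
    if x <= -1:
        return 0.0
    if x > 1:
        return 1.0
    return 0.5 + 0.75 * x - 0.25 * x ** 3

def solve_quantile(target, lo=-1.0, hi=1.0, tol=1e-12):
    """Bisection for F(x) = target; valid because F is increasing on [-1, 1]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

p_half = F(0.5) - F(-0.5)       # P(-1/2 <= X <= 1/2)
p_quarter = F(2) - F(-0.25)     # P(-1/4 <= X <= 2)
x95 = solve_quantile(0.95)      # x with P(X <= x) = 0.95

print(p_half, p_quarter, round(x95, 4))
```

Because X is continuous, including or excluding the end points of an interval does not change its probability.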
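$$\mu = \begin{cases} \displaystyle\sum_j x_j f(x_j) & \text{(Discrete distribution),} \\ \displaystyle\int_{-\infty}^{\infty} x f(x)\,dx & \text{(Continuous distribution).} \end{cases}$$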
σ is called the standard deviation of X and its distribution. f is the probability function
(or probability mass function) in the discrete case and the density function in the
continuous case.
6 Binomial Distribution
Consider a set of n independent trials (n being finite) in which the probability p of success
is the same for each trial; then q = 1 − p is the probability of failure in any
trial.
A random variable X is said to follow the binomial distribution if it assumes only non-negative
values and its probability mass function is given by:
$$P(X = x) = p(x) = \begin{cases} \dbinom{n}{x} p^x q^{n-x} & x = 0, 1, 2, \cdots, n;\ q = 1 - p, \\ 0 & \text{otherwise.} \end{cases}$$
The mean of the binomial distribution is $\mu = np$.
Example:
Ten coins are thrown simultaneously. Find the probability of getting at least seven
heads.
Solution: Here p = probability of getting a head = 1/2 and
q = probability of not getting a head = 1/2.
∴ The probability of getting at least seven heads is given by:
$$P(X \ge 7) = p(7) + p(8) + p(9) + p(10) = \frac{176}{1024}.$$
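The coin example can be checked directly from the binomial probability mass function. A minimal sketch, with the hypothetical helper name `binomial_pmf` and exact fractions so the result can be compared with 176/1024:

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) * p**x * q**(n - x) with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

n, p = 10, Fraction(1, 2)
at_least_seven = sum(binomial_pmf(x, n, p) for x in range(7, n + 1))
print(at_least_seven, float(at_least_seven))
```

The printed fraction is in lowest terms (11/64), which equals 176/1024.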
7 Poisson Distribution
A random variable X is said to follow the Poisson distribution if it assumes only non-negative
values and its probability mass function is given by:
$$P(X = x) = p(x, \lambda) = \begin{cases} \dfrac{e^{-\lambda} \lambda^x}{x!} & x = 0, 1, 2, \cdots;\ \lambda > 0, \\ 0 & \text{otherwise.} \end{cases}$$
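The Poisson probability mass function is easy to evaluate term by term. In the sketch below the value λ = 2 is a hypothetical choice for illustration only, and the helper name `poisson_pmf` is not from the text.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) = exp(-lam) * lam**x / x! for x = 0, 1, 2, ...; zero otherwise."""
    if x < 0:
        return 0.0
    return exp(-lam) * lam ** x / factorial(x)

lam = 2.0                                         # hypothetical parameter value
probs = [poisson_pmf(x, lam) for x in range(50)]  # terms beyond 50 are negligible here
print(probs[:4], sum(probs))
```

The probabilities sum to 1 (up to rounding), as required of a probability mass function.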
8 Normal Distribution
A random variable X is said to have a normal distribution with parameters µ (called
mean) and σ² (called variance) if its probability density function (pdf) is given by the
probability law:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right], \quad -\infty < x < \infty,\ -\infty < \mu < \infty,\ \sigma > 0.$$
Remark: When a r.v. X is normally distributed with mean µ and standard deviation
σ, it is customary to say that X is distributed as N(µ, σ²), written X ∼ N(µ, σ²).
Distribution function F(x)
The normal distribution has the distribution function
$$F(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left[-\frac{1}{2}\left(\frac{v - \mu}{\sigma}\right)^2\right] dv.$$
For the corresponding standardized normal distribution with mean 0 and standard
deviation 1 we denote F(x) by
$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-u^2/2}\,du.$$
Result 1
The distribution function F(x) of the normal distribution with any µ and σ is related
to the standardized distribution function Φ(z) by the formula
$$F(x) = \Phi\left(\frac{x - \mu}{\sigma}\right).$$
Result 2
The probability that a normal random variable X with mean µ and standard deviation
σ assumes any value in an interval a < x ≦ b is
$$P(a < X \le b) = F(b) - F(a) = \Phi\left(\frac{b - \mu}{\sigma}\right) - \Phi\left(\frac{a - \mu}{\sigma}\right).$$
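Results 1 and 2 translate directly into code once Φ is available. The sketch below expresses Φ through the error function `math.erf` from the Python standard library, using the identity Φ(z) = ½[1 + erf(z/√2)]; the values µ = 5 and σ = 2 are hypothetical.

```python
from math import erf, sqrt

def phi(z):
    """Standardized normal distribution function, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """Result 1: F(x) = Phi((x - mu) / sigma)."""
    return phi((x - mu) / sigma)

def interval_probability(a, b, mu, sigma):
    """Result 2: P(a < X <= b) = F(b) - F(a)."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

mu, sigma = 5.0, 2.0    # hypothetical parameters
within_one_sigma = interval_probability(mu - sigma, mu + sigma, mu, sigma)
print(round(within_one_sigma, 4))
```

The probability of falling within one standard deviation of the mean does not depend on the particular µ and σ chosen, which is exactly what standardization expresses.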
9 Regression Analysis
Regression analysis is a mathematical measure of the average relationship between two
or more variables in terms of the original units of the data.
Linear regression
If the variables in a bivariate distribution are related, we will find that the points in the
scatter diagram cluster around some curve called the “curve of regression”. If the
curve is a straight line, it is called the line of regression and there is said to be linear
regression between the two variables; otherwise regression is said to be curvilinear.
The line of regression is the line which gives the best estimate to the value of one
variable for any specific value of other variable. Thus the line of regression is the line of
“best fit” and is obtained by the principle of least squares.
Let us suppose that in the bivariate distribution (xi, yi), i = 1, 2, · · · , n, Y is the dependent
variable and X is the independent variable. Let the line of regression of Y on X be
y = a + bx.
The above equation represents a family of straight lines for different values of the arbitrary
constants a and b. The problem is to determine a and b so that the line y = a + bx is the
line of “best fit”.
Using the method of least squares, we find that the line of regression of Y on X passes through
the point $(\bar{x}, \bar{y})$ and is given by
$$y - \bar{y} = k_1 (x - \bar{x}),$$
where $\bar{x}$ and $\bar{y}$ are the means of the x- and y-values in our sample, and the slope $k_1$,
called the regression coefficient, is given by
$$k_1 = \frac{s_{xy}}{s_x^2}.$$
∴ The equation of the line of regression of Y on X is:
$$y - 69 = \frac{2.9892}{(2.12)^2}(x - 68) \;\Rightarrow\; y = 0.665x + 23.78$$
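The regression line can be computed from a sample with a few helper functions. The sketch below uses hypothetical data chosen to lie exactly on y = 1 + 2x, so the expected result is known in advance; the function names are illustrative. The divisor n − 1 is used for the sample covariance and variance, but it cancels in the ratio that defines the slope.

```python
def mean(values):
    return sum(values) / len(values)

def sample_covariance(xs, ys):
    """s_xy with divisor n - 1; sample_covariance(xs, xs) is the sample variance s_x^2."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)

def regression_line(xs, ys):
    """Return (a, b) of the line y = a + b*x, using the slope k1 = s_xy / s_x^2."""
    k1 = sample_covariance(xs, ys) / sample_covariance(xs, xs)
    intercept = mean(ys) - k1 * mean(xs)   # the line passes through (x_bar, y_bar)
    return intercept, k1

xs = [1.0, 2.0, 3.0, 4.0, 5.0]             # hypothetical sample
ys = [3.0, 5.0, 7.0, 9.0, 11.0]            # exactly y = 1 + 2x
a, b = regression_line(xs, ys)
print(a, b)
```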
10 Correlation Analysis
Correlation analysis is concerned with the relation between X and Y in a two-dimensional
random variable (X, Y). For a sample consisting of n ordered pairs of values (x1, y1), · · · ,
(xn, yn), we shall use the sample means $\bar{x}$ and $\bar{y}$, the sample variances $s_x^2$ and $s_y^2$, and the
sample covariance $s_{xy}$.
The sample correlation coefficient is
$$r = \frac{s_{xy}}{s_x s_y}.$$
Remarks:
1. The correlation coefficient r satisfies −1 ≦ r ≦ 1, and r = ±1 if and only if the sample
values lie on a straight line.
2. Two independent variables are uncorrelated.
3. Correlation coefficient is independent of change of origin and scale.
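Remarks 1 and 3 can be illustrated numerically. The sketch below computes r = s_xy / (s_x s_y) for hypothetical samples; the function name `correlation` and all data values are assumptions made for illustration.

```python
from math import sqrt

def correlation(xs, ys):
    """Sample correlation coefficient r = s_xy / (s_x * s_y)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    return s_xy / (s_x * s_y)

xs = [1.0, 2.0, 3.0, 4.0, 5.0]            # hypothetical sample
on_line = [2.0 * x + 1.0 for x in xs]     # points on a rising straight line
scattered = [2.0, 1.0, 4.0, 3.0, 6.0]     # points not on a line

r_line = correlation(xs, on_line)
r_scattered = correlation(xs, scattered)
# Change of origin and scale (with positive scale factors) leaves r unchanged.
r_shifted = correlation([10.0 * x + 3.0 for x in xs], [0.5 * y - 7.0 for y in scattered])
print(r_line, r_scattered, r_shifted)
```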
11 Tests of Significance
A very important aspect of sampling theory is the study of tests of significance,
which enable us to decide, on the basis of sample results, whether
(i) the deviation between the observed sample statistic and the hypothetical parameter
value, or
(ii) the deviation between two independent sample statistics,
is significant or might be attributed to chance or the fluctuations of sampling.
hypothesis that the population has a specified mean µ0 (say), i.e., H0 : µ = µ0, then the
alternative hypothesis could be:
(i) H1 : µ ≠ µ0 (i.e., µ > µ0 or µ < µ0),
(ii) H1 : µ > µ0,
(iii) H1 : µ < µ0.
The alternative hypothesis in (i) is known as a “two-tailed alternative”, and the alternatives
in (ii) and (iii) are known as “right-tailed” and “left-tailed” alternatives, respectively.
$$Z = \frac{p - P}{\sqrt{PQ/n}}, \quad \text{where } p = X/n,\ Q = 1 - P.$$
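The test statistic for a single proportion is a one-line computation. In the sketch below the sample (60 successes in 100 trials) and the hypothesized proportion P = 0.5 are hypothetical numbers, and the function name `proportion_z` is illustrative.

```python
from math import sqrt

def proportion_z(successes, n, P):
    """Z = (p - P) / sqrt(P*Q/n) with p = X/n and Q = 1 - P."""
    p = successes / n
    Q = 1 - P
    return (p - P) / sqrt(P * Q / n)

z = proportion_z(successes=60, n=100, P=0.5)   # hypothetical sample
print(round(z, 4))
```

The computed Z is then compared with the critical value of the standardized normal distribution for the chosen level of significance and the type of alternative (two-tailed, right-tailed or left-tailed).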
11.3.2 Test of significance for difference of proportions
Suppose we want to compare two distinct populations with respect to the prevalence of
a certain attribute, say A, among their members. Let X1 and X2 be the numbers of
persons possessing the given attribute A in random samples of sizes n1 and n2 from the two
populations, respectively. Then the sample proportions are given by p1 = X1/n1 and
p2 = X2/n2.
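Under H0 : P1 = P2, the test statistic for the difference of proportions is given by
$$z = \frac{p_1 - p_2}{\sqrt{\hat{P}\hat{Q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \quad \text{where } \hat{P} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2},\ \hat{Q} = 1 - \hat{P}.$$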
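The statistic for the difference of two proportions can be sketched in the same way, with the pooled estimate of the common proportion computed first. The sample counts below are hypothetical, and `two_proportion_z` is an illustrative name.

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """z for H0: P1 = P2, using the pooled estimate P_hat of the common proportion."""
    p1, p2 = x1 / n1, x2 / n2
    P_hat = (n1 * p1 + n2 * p2) / (n1 + n2)
    Q_hat = 1 - P_hat
    return (p1 - p2) / sqrt(P_hat * Q_hat * (1 / n1 + 1 / n2))

z = two_proportion_z(x1=60, n1=100, x2=50, n2=100)   # hypothetical samples
print(round(z, 4))
```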
Let $\bar{x}_1$ be the mean of a sample of size n1 from a population with mean µ1 and variance
$\sigma_1^2$, and let $\bar{x}_2$ be the mean of an independent random sample of size n2 from another
population with mean µ2 and variance $\sigma_2^2$.
Here the test statistic becomes
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{(\sigma_1^2 / n_1) + (\sigma_2^2 / n_2)}}.$$
For more details about the theory, worked-out examples and questions, see the
book:
“Fundamentals of Mathematical Statistics” by S.C. Gupta and V.K. Kapoor.