Unit 1-2
Editor
Deekshant Awasthi
Published by:
Department of Distance and Continuing Education
Campus of Open Learning, School of Open Learning,
University of Delhi, Delhi-110007
Printed by:
School of Open Learning, University of Delhi
IT SKILLS AND DATA ANALYSIS-II
Reviewer
Ms. Asha Yadav
Disclaimer
Printed at: Taxmann Publications Pvt. Ltd., 21/35, West Punjabi Bagh,
New Delhi - 110026 (...... Copies, 2025)
Syllabus Mapping
Unit - I: Functions and Their Graphical Representations
This unit introduces the graphical visualisation of functions to understand the relationship between two variables.
Lesson 1: Functions and Graphs (Pages 3–15)
Lesson 2: Linear and Non-Linear Functions (Pages 16–36)
Lesson 3: Reciprocal, Exponential and Logarithmic Functions (Pages 37–50)
Unit - II: Relationship between Variables
Students will learn about scatter diagrams and correlation analysis as a means to describe the nature and strength of association between two variables. The concept of regression analysis will be introduced as a method for quantifying the relationship between two variables. Further, multiple linear regression will be discussed for situations where more than one independent variable is needed to estimate the dependent variable. The focus will be mainly on interpreting estimated regression coefficients.
Lesson 4: Correlation (Pages 53–71)
Lesson 5: Regression (Pages 72–82)
Lesson 6: Multiple Regression and Correlation (Pages 83–93)
Lesson 7: Curve Fitting: Principle of Least Squares (Pages 94–121)
PAGE
UNIT-I
Lesson 1: Functions and Graphs 3–15
UNIT-II
Lesson 4: Correlation 53–71
Department of Distance & Continuing Education, Campus of Open Learning,
School of Open Learning, University of Delhi
1
Functions and Graphs
STRUCTURE
1.1 Introduction to Functions
1.2 Graphical Representation of Functions
1.3 Graphs
1.4 Vertical Line Test
1.5 Important Notes on Linear Functions
1.6 Graphical Behavior of Functions
1.7 Constant Functions
1.8 Exercise
1.3 Graphs
Graphs provide a visual representation of functions, showing the relationship between the input values and output values, that is, between the independent and dependent variables of the function.
A graph can be defined as a diagram displaying data; in particular, one showing the relationship between two or more variables, quantities, measurements, or numbers.
By graphically representing a function we mean choosing some values
for the independent variable x and putting them into the function to get
values of y. This gives us a set of ordered pairs (x, y). This can also be
written as (x, f(x)). These sets of ordered pairs are then plotted on the
graph and connected through a line.
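The procedure described above (choose x values, compute y, collect ordered pairs) can be sketched in a few lines of Python. This is only an illustration: the function f(x) = 3x + 5 and the chosen x values are assumptions made for the example, and the helper name `ordered_pairs` is hypothetical.

```python
def f(x):
    """Illustrative function: f(x) = 3x + 5."""
    return 3 * x + 5


def ordered_pairs(func, xs):
    """Return the ordered pairs (x, func(x)) for the chosen x values."""
    return [(x, func(x)) for x in xs]


# Choose a few values of the independent variable and compute y for each.
pairs = ordered_pairs(f, [-1, 0, 1, 2])
print(pairs)  # [(-1, 2), (0, 5), (1, 8), (2, 11)]
```

Plotting these pairs on graph paper and joining them gives the graph of the function.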
This is a linear graph, since the power of the independent variable is 1 and the plotted points form a straight line.
Example 2: Let the function be
f(x) = 3x + 5
Let x take the values –1 and 0. Substituting these values into the above function gives the ordered pairs (–1, 2) and (0, 5). Two points are sufficient to plot a line. So, plotting these points forms the following graph:
Example 4: Plot the graph for the given function: $f(x) = -\frac{1}{3}x + 1$
Let the ordered pairs be (0, 1) and (3, 0). Following is the graph obtained
on plotting these points.
If every vertical line intersects the graph at no more than one point, then the relation is a function.
If some vertical line intersects the graph at more than one point, then the relation is NOT a function.
In the first graph, a single vertical line drawn through the red dots would intersect the curve 3 times. Thus, it fails the vertical line test and does not represent a function. Any vertical line in the second graph passes through the curve only once; hence the graph passes the vertical line test and represents a function.
If a vertical line x = a cuts the curve y = f(x) at only one point (a, f(a)), then the curve y = f(x) represents a function.
A vertical line must cut the curve at only one point for the curve to represent a function. If the vertical line x = a cuts the graph at more than one point, say at two points (a, y1) and (a, y2), then there are different y-values for the same x-value. Thus, a single value in the domain is mapped to more than one value in the codomain, which contradicts the basic definition of a function, and the curve does not represent a function.
To use the vertical line test, take a ruler and draw a line parallel to the y-axis for any selected value of x. If the vertical line intersects the graph more than once for any value of x, then the graph is not the graph of a function. If, alternatively, a vertical line intersects the graph no more than once, no matter where the vertical line is placed, then the graph is the graph of a function. For example, any straight line other than a vertical line is the graph of a function.
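For a relation given as a finite list of ordered pairs, the vertical line test amounts to checking that no x value is paired with two different y values. The sketch below illustrates this idea; the helper name `is_function` and the sample points are assumptions made for the example, and a check on sampled points cannot replace the test on the full curve.

```python
def is_function(pairs):
    """Return True if no x value is paired with two different y values."""
    seen = {}
    for x, y in pairs:
        if x in seen and seen[x] != y:
            # A vertical line at this x would meet the graph more than once.
            return False
        seen[x] = y
    return True


# Sample points on the line y = x + 1.
line = [(x, x + 1) for x in range(-3, 4)]
# Sample points on the sideways parabola x = y^2.
sideways_parabola = [(y * y, y) for y in range(-3, 4)]

print(is_function(line))               # True
print(is_function(sideways_parabola))  # False
```

The sideways parabola fails because, for example, x = 9 is paired with both y = 3 and y = –3.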
Some examples of relations that are also functions because they pass the
vertical line test.
Graph of the line f(x)=x+1
If a vertical line intersects the graph at more than one point, then the
relation is not a function.
Some examples of relations that are not functions because they fail the vertical line test.
Graph of the “sideways” parabola $x = y^2$
1.5 Important Notes on Linear Functions
A linear function is of the form f(x) = ax + b and hence its graph
is a line.
When its slope is 0 (a = 0), the graph of a linear function f(x) = ax + b is a
horizontal line, and the function is known as a constant function.
For a linear function f(x) = ax + b with a ≠ 0, the domain and range are both R (all
real numbers), whereas for a constant function f(x) = b the domain is R and the range
is {b}.
These linear functions are useful to represent the objective function
in linear programming.
A vertical line is NOT a linear function as it fails the vertical line
test.
If f(x1) < f(x2) whenever x1 < x2, then the function is called a strictly increasing function.
A decreasing function is one where f(x1) ≥ f(x2) for every x1 ≤ x2.
If f(x1) > f(x2) whenever x1 < x2, then the function is called a strictly decreasing function.
In terms of a linear function f(x) = mx + b, if m is positive, the func-
tion is increasing, if m is negative, it is decreasing, and if m is zero, the
function is a constant function.
The average rate of change of an increasing function is positive, and the
average rate of change of a decreasing function is negative. The figure
below shows examples of increasing and decreasing intervals on a function.
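The link between the sign of m and the behaviour of f(x) = mx + b can be checked on sample values. The following is a minimal sketch: the helper name `classify` and the sample slopes are assumptions for illustration, and the check only looks at a finite set of outputs taken at increasing x values.

```python
def classify(values):
    """Classify outputs f(x1), f(x2), ... that were taken at increasing x."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    if all(d > 0 for d in diffs):
        return "strictly increasing"
    if all(d < 0 for d in diffs):
        return "strictly decreasing"
    if all(d == 0 for d in diffs):
        return "constant"
    if all(d >= 0 for d in diffs):
        return "increasing"
    if all(d <= 0 for d in diffs):
        return "decreasing"
    return "neither"


xs = [-2, -1, 0, 1, 2]
print(classify([2 * x + 1 for x in xs]))   # strictly increasing (m = 2)
print(classify([-3 * x + 4 for x in xs]))  # strictly decreasing (m = -3)
print(classify([5 for _ in xs]))           # constant (m = 0)
```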
1.8 Exercise
1. Draw the graph of each of these equations, and use the vertical line
test to check whether or not it represents a function:
i. y = x
ii. $y = x^2$
iii. y = 3
iv. y = |x|
v. y = sin x
vi. $y = x^3$
vii. y = 3√x
viii. $x = y^2$
ix. $x^2 + y^2 = 9$
x. x = 4
xi. y = √x
2. Draw the graph of the following equations:
i. y = 3x + 1
ii. y = 2x – 3
iii. y = 5x – 2
iv. y = 6 – 2x
3. Plot the coordinates below on a graph and answer the following
questions.
(2, 210), (5, 420), (7, 560), (6, 490), (3, 280), (1, 140), (8, 630)
i. Does the graph drawn from the above coordinates represent a
linear graph?
ii. Find the value of x-coordinate if y-coordinate corresponds to
350.
iii. Calculate the value of y-coordinate for which x-coordinate is
11, if the graph drawn is linear.
4. Mrs Mary asks John to identify whether the given equation 3x −
7y = 16 forms a linear graph or not without plotting its values.
Now help John to figure out whether it is a linear graph or not.
5. Mikel has to prepare a linear graph for the equation 2x + y = 8.
Complete the table below for the above equation.
x    __    4     −2
y    8     __    __
2
Linear and Non-Linear
Functions
STRUCTURE
2.1 Polynomial Functions
2.2 Degree of a Polynomial Function
2.3 Adding and Subtracting Polynomials
2.4 Graphing Polynomial Functions
2.5 How to Determine a Polynomial Function
2.6 Quadratic Functions
2.7 The Quadratic Formula
2.8 Differences between Quadratics and Linear Functions
2.9 A Graphical Interpretation of Quadratic Solutions
2.10 Cubic Function
2.11 Y-Intercept of Cubic Function
2.12 Exercise
Multiplying Polynomials
To multiply two polynomials together, multiply every term of one poly-
nomial by every term of the other polynomial. The degree of a product
of two polynomials equals the sum of the degrees of said polynomials.
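The term-by-term rule can be sketched in Python by storing each polynomial as a list of coefficients, lowest degree first. This representation and the helper name `multiply` are assumptions chosen for the illustration; the degree rule shown assumes both leading coefficients are non-zero.

```python
def multiply(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            # The term a*x^i times the term b*x^j contributes to x^(i+j).
            result[i + j] += a * b
    return result


p = [1, 2]      # 1 + 2x    (degree 1)
q = [3, 0, 1]   # 3 + x^2   (degree 2)
product = multiply(p, q)

print(product)           # [3, 6, 1, 2], i.e. 3 + 6x + x^2 + 2x^3
print(len(product) - 1)  # 3, the sum of the degrees 1 and 2
```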
2.4 Graphing Polynomial Functions
Some of the most commonly used polynomial functions are:
1. Constant polynomial function $f(x) = ax^0 = a$ (a polynomial of degree zero), whose graph is a horizontal line parallel to the x-axis.
2. Linear polynomial function f(x) = ax + b. Linear polynomial functions are also known as first-degree polynomials. The graph of a linear polynomial function is a straight line.
3. Quadratic polynomial function $f(x) = ax^2 + bx + c$. The graph of a second-degree or quadratic polynomial function is a curve referred to as a parabola.
4. Cubic polynomial function $f(x) = ax^3 + bx^2 + cx + d$, which is a polynomial function of the third degree.
5. Quartic polynomial function $f(x) = ax^4 + bx^3 + cx^2 + dx + e$, which is a polynomial function of the fourth degree.
2.5 How to Determine a Polynomial Function
In order to determine if a function is polynomial or not, the function
needs to be checked against certain conditions for the exponents of the
variables. These conditions are as follows:
The exponent of the variable in the function in every term must only
be a non-negative whole number i.e., the exponent of the variable
should not be a fraction or negative number.
The variable of the function should not be inside a radical i.e., it
should not contain any square roots, cube roots, etc.
The variable should not be in the denominator.
We now study these polynomials in detail.
Linear Functions
As defined in the previous chapter, a linear function is an algebraic equa-
tion in which each term is either a constant or the product of a constant
and (the first power of) a single variable.
Mathematically, it is expressed as a function f(x) = ax + b; where a and
b are constants, x is the independent variable.
Both the blue and red lines are linear functions. The blue line, $y = \frac{1}{2}x - 3$, has $m = \frac{1}{2}$ and b = −3, which implies that the slope is positive with value 1/2 and the intercept is −3. The red line, y = −x + 5, has m = −1 and b = 5, which implies that the slope is negative with value −1 and the intercept is 5.
Slope
Slope describes the direction and steepness of a line, and can be calculated
given two points on the line. Its sign indicates the direction, while its
magnitude indicates the steepness which is measured by the absolute value
of the slope. A slope with greater absolute value indicates a steeper line.
Thus, a line with a slope of -8 is steeper than a line with a slope of 6.
Slope is calculated by finding the ratio of the “vertical change” to the “horizontal change” between any two distinct points on a line. This ratio is represented by a quotient and gives the same number for any two distinct points on the same line.
The slope of the line is calculated using the formula:
$$m = \frac{y_2 - y_1}{x_2 - x_1}$$
where $(x_1, y_1)$ and $(x_2, y_2)$ are points on the line.
The direction of a line is either increasing, decreasing, horizontal or verti-
cal. Thus, the slope of a line can be positive, negative, zero, or undefined.
A line is increasing if it goes up from left to right which implies
that the slope is positive (m > 0).
A line is decreasing if it goes down from left to right and the slope
is negative (m < 0).
If a line is horizontal the slope is zero and is a constant function
(y = c).
If a line is vertical the slope is undefined.
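The four cases above can be illustrated with a short Python sketch. The helper names `slope` and `direction` and the sample points are assumptions made for the example; a vertical line is represented here by returning `None` for its undefined slope.

```python
def slope(p1, p2):
    """Return the slope through two distinct points, or None for a vertical line."""
    (x1, y1), (x2, y2) = p1, p2
    if x2 == x1:
        return None  # horizontal change is zero, so the slope is undefined
    return (y2 - y1) / (x2 - x1)


def direction(m):
    """Describe the direction of a line from its slope."""
    if m is None:
        return "vertical"
    if m > 0:
        return "increasing"
    if m < 0:
        return "decreasing"
    return "horizontal"


print(slope((0, -3), (5, 1)))            # 0.8, i.e. 4/5
print(direction(slope((0, -3), (5, 1))))  # increasing
print(direction(slope((0, 5), (3, 3))))   # decreasing
print(direction(slope((1, 4), (6, 4))))   # horizontal
print(direction(slope((2, 1), (2, 7))))   # vertical
```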
Example 2: Find the slope of the line shown on the coordinate plane below.
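Locate two points on the graph, preferably with integer coordinates. Let them be (0, −3) and (5, 1). Here (0, −3) is the point $(x_1, y_1)$ and (5, 1) is the point $(x_2, y_2)$. Substituting into the slope formula gives $m = \frac{1 - (-3)}{5 - 0} = \frac{4}{5}$.
Locate any two points with integer coordinates on the graph. Let them be (0, 5) and (3, 3). Let (0, 5) be $(x_1, y_1)$ and (3, 3) be $(x_2, y_2)$. Substituting into the slope formula gives $m = \frac{3 - 5}{3 - 0} = -\frac{2}{3}$.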
$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$
Example 4: Find the distance between two points (4, 6) and (6, 9).
Substitute the given values of points $(x_1, y_1)$ and $(x_2, y_2)$ into the distance formula:
$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} = \sqrt{(6 - 4)^2 + (9 - 6)^2} = \sqrt{2^2 + 3^2} = \sqrt{13}$$
Example 5: Find the distance between two points (−1, 4) and (5, 7).
Substitute the given values of points $(x_1, y_1)$ and $(x_2, y_2)$ into the distance formula:
$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} = \sqrt{(5 + 1)^2 + (7 - 4)^2} = \sqrt{6^2 + 3^2} = \sqrt{45}$$
Mid-point of a Line Segment
Mid-point of a line segment is defined as the point which divides the line segment into two segments of equal length. It is the middle point of a line segment, or the middle point of two points on a line, and thus is equidistant from both end points.
Let the two points be $(x_1, y_1)$ and $(x_2, y_2)$. Then the mid-point of the segment connecting the two points is obtained from the following formula:
$$\left(\frac{x_1 + x_2}{2}, \frac{y_1 + y_2}{2}\right)$$
Example 6: Find the mid-point between (4, 5) and (6, 9).
Substitute the values of the points in the formula for the mid-point.
$$\text{Mid-point} = \left(\frac{x_1 + x_2}{2}, \frac{y_1 + y_2}{2}\right) = \left(\frac{4 + 6}{2}, \frac{5 + 9}{2}\right) = (5, 7)$$
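The distance and mid-point formulas can be checked numerically. The sketch below re-computes Examples 4 to 6; the function names `distance` and `midpoint` are assumptions chosen for the illustration.

```python
import math


def distance(p1, p2):
    """Distance between two points: sqrt((x2 - x1)^2 + (y2 - y1)^2)."""
    (x1, y1), (x2, y2) = p1, p2
    return math.hypot(x2 - x1, y2 - y1)


def midpoint(p1, p2):
    """Mid-point of the segment joining two points."""
    (x1, y1), (x2, y2) = p1, p2
    return ((x1 + x2) / 2, (y1 + y2) / 2)


print(distance((4, 6), (6, 9)))   # sqrt(13), about 3.606
print(distance((-1, 4), (5, 7)))  # sqrt(45), about 6.708
print(midpoint((4, 5), (6, 9)))   # (5.0, 7.0)
```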
Thus, if two lines, say f(x) = mx + b and g(x) = nx + c, are parallel,
then n must equal m.
Example 7: Let the two lines be f(x) = 2x + 3 and g(x) = 2x − 1. Then
these two lines are parallel since they have the same slope, m = 2.
Perpendicular Lines
Two lines in the same plane are perpendicular to each other if their slopes are negative reciprocals of each other. That is, if the slope of one line is m, then the slope of the other line will be $-\frac{1}{m}$.
An alternate way of defining perpendicular lines is: Two lines are said
to be perpendicular to each other if they form congruent adjacent angles.
In other words, they are perpendicular if the angles at their intersection
are right angles. The perpendicular symbol is ⊥.
Symbolically, two perpendicular lines, $f(x) = m_1 x + b_1$ and $g(x) = m_2 x + b_2$, are denoted as f(x) ⊥ g(x).
For example, the two lines f(x) = 3x − 2 and $g(x) = -\frac{1}{3}x + 1$ are perpendicular, since their slopes (3 and −1/3) are negative reciprocals of each other.
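Both slope conditions are easy to test numerically: parallel lines have equal slopes, and perpendicular lines have slopes whose product is −1 (which is what "negative reciprocals" means). The following is a minimal sketch for non-vertical lines; the helper names are assumptions, and `math.isclose` is used only to tolerate floating-point rounding.

```python
import math


def are_parallel(m1, m2):
    """Two non-vertical lines are parallel when their slopes are equal."""
    return math.isclose(m1, m2)


def are_perpendicular(m1, m2):
    """Two non-vertical lines are perpendicular when m1 * m2 = -1."""
    return math.isclose(m1 * m2, -1.0)


print(are_parallel(2, 2))            # True: slopes of 2x + 3 and 2x - 1
print(are_perpendicular(3, -1 / 3))  # True: negative reciprocals
print(are_perpendicular(3, 1 / 3))   # False: reciprocals, but not negative
```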
Example 8: An athlete begins his normal practice for the next marathon
during the evening. At 6:00 pm he starts to run and leaves his home. At
7:30 pm, the athlete finishes the run at home and has run a total of 7.5
miles. How fast was his average speed over the course of the run? How
many miles did he run after the first half hour? If he kept running at
the same pace for a total of 3 hours, how many miles will he have run?
Solution: Here the slope of the line is the rate of change of distance with respect to time, that is, his average speed. Therefore, the two variables are time (x) and distance (y).
The first point is (0,0): it corresponds to his house, where his watch read 6:00 pm and he had not run anywhere yet. Let us measure time in hours. The second point is 1.5 hours later, when the distance covered is 7.5 miles. Therefore, the second point is (1.5,7.5).
Now the speed (rate of change) is simply the slope of the line connecting the two points. The slope is given by:
$$m = \frac{y_2 - y_1}{x_2 - x_1} = \frac{7.5 - 0}{1.5 - 0} = 5 \text{ miles per hour.}$$
To graph this line, we need the y-intercept and the slope to write the
equation. The slope was 5 miles per hour and since the starting point
was at (0,0), the y-intercept is 0. Therefore, the linear function is y = 5x
Using the graph, predictions can be made assuming that his average
speed remains the same.
The number of miles he ran after the first half hour is obtained using the equation y = 5x with $x = \frac{1}{2}$.
Thus, y = 5(0.5) = 2.5 miles.
If he kept running at the same pace for a total of 3 hours, then the number
of miles he ran will be obtained by substituting x = 3 in the equation y
= 5x. Thus, he ran 15 miles.
Example 9: A rental company charges a flat fee of Rs. 30 and an addi-
tional Rs. 0.25 per mile to rent a moving van. Write a linear equation to
approximate the cost y in terms of x, the number of miles driven. How
much would a 75-mile trip cost?
Solution: The equation is formed using the slope-intercept form of a
linear equation, with the total cost being the dependent variable y and
the miles being the independent variable x.
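One way to read the problem statement is that the flat fee is the y-intercept and the per-mile charge is the slope, which gives y = 0.25x + 30. The sketch below evaluates that model under this reading; the constant and function names are assumptions made for the illustration.

```python
FLAT_FEE = 30.0  # Rs., charged once (the y-intercept)
PER_MILE = 0.25  # Rs. per mile driven (the slope)


def rental_cost(miles):
    """Total cost y = 0.25x + 30 for x miles driven."""
    return PER_MILE * miles + FLAT_FEE


print(rental_cost(0))   # 30.0, the flat fee alone
print(rental_cost(75))  # 48.75, the cost of a 75-mile trip
```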
The quadratic formula can always be used to find the roots of a quadratic equation, regardless of whether the roots are real or complex, whole numbers or fractions, and so on. The symbol ± indicates there will be two solutions, one that involves adding the square root and the other found by subtracting said square root. The resulting x values (zeros) may or may not be distinct, and may or may not be real.
Example 10: Find the roots of the following quadratic function:
$f(x) = 2x^2 + 5x + 3$
Solution: First, set the function equal to zero, as the roots are where the function equals zero.
$2x^2 + 5x + 3 = 0$
Second, identify the constants in the equation: a = 2, b = 5, and c = 3.
Substitute these values into the quadratic formula and solve:
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{-5 \pm \sqrt{5^2 - 4(2)(3)}}{2(2)} = \frac{-5 \pm 1}{4}$$
so $x = \frac{-5 + 1}{4}$ and $x = \frac{-5 - 1}{4}$.
Thus $x = -1$ and $x = -\frac{6}{4} = -\frac{3}{2}$.
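The quadratic formula translates directly into code. The following sketch uses Python's `cmath.sqrt` so that a negative discriminant produces complex roots instead of an error; the function name `quadratic_roots` is an assumption chosen for the illustration.

```python
import cmath


def quadratic_roots(a, b, c):
    """Return both roots of ax^2 + bx + c = 0 using the quadratic formula."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic equation")
    root_of_discriminant = cmath.sqrt(b * b - 4 * a * c)
    plus = (-b + root_of_discriminant) / (2 * a)
    minus = (-b - root_of_discriminant) / (2 * a)
    return plus, minus


# f(x) = 2x^2 + 5x + 3: the roots are -1 and -1.5 (returned as complex numbers
# whose imaginary part is zero).
print(quadratic_roots(2, 5, 3))

# x^2 + 1 = 0 has no real roots; the formula gives the complex pair i and -i.
print(quadratic_roots(1, 0, 1))
```

When the discriminant is zero, the two returned roots coincide, which matches the remark that the roots may or may not be distinct.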
pairs of unique independent variables will produce the same dependent
variable, with only one exception for a given quadratic function.
iii. The slope of a quadratic function, unlike the slope of a linear
function, is constantly changing.
From the graph we can see that the parabola intersects the x-axis at two
points: (−1,0) and (2,0).
We know that the roots of the quadratic function are given by x-intercepts
of a parabola. Therefore, there are roots at x = −1 and x = 2
Now, let’s obtain the roots of the function $y = x^2 - x - 2$ algebraically using the quadratic formula. Here, a = 1, b = –1 and c = –2.
Substituting in the quadratic formula:
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{1 \pm \sqrt{(-1)^2 - 4(1)(-2)}}{2(1)} = \frac{1 \pm \sqrt{9}}{2}$$
so $x = \frac{1 + 3}{2}$ and $x = \frac{1 - 3}{2}$.
Thus, the roots are x = 2 and x = –1.
These are the same values of the roots which were obtained graphically.
2.12 Exercise
1. Subtract the polynomials: $(5x^3 + x^2 + 9) - (4x^2 + 7x - 3)$
2. Initially, trains A and B are 325 miles away from each other. Train A
is traveling towards B at 50 miles per hour and train B is traveling
towards A at 80 miles per hour. At what time will the two trains
meet? At this time how far did the trains travel?
3. Write an equation of the line (in slope-intercept form) that is parallel
to the line y = −2x + 4 and passes through the point (−1,1).
4. Write an equation of the line (in slope-intercept form) that is
perpendicular to the line $y = \frac{1}{4}x - 3$ and passes through the point
(2,4).
5. Find the roots of the quadratic function $f(x) = x^2 - 4x + 4$. Solve
graphically and algebraically.
6. Find the x-intercept(s) and y-intercept of the cubic function:
f(x) = 3(x – 1)(x – 2)(x – 3).
3
Reciprocal, Exponential
and Logarithmic Functions
STRUCTURE
3.1 Reciprocal Function
3.2 How to Find Reciprocal of a Function
3.3 Properties of Reciprocal Functions
3.4 Graph of Reciprocal Functions
3.5 Domain and Range of Reciprocal Function
3.6 How to Solve Reciprocal Functions?
3.7 Exponential Function
3.8 Logarithmic Functions
3.9 Exercise
The reciprocal of a number is a number which when multiplied with the
actual number produces a result of 1.
For example, let us take the number 5. The reciprocal is 1/5. Also, when
we multiply the reciprocal with the original number we get 1.
Some more examples of reciprocal functions:
$$f(x) = \frac{2}{x^2}$$
$$f(x) = \frac{1}{x + 1} - 4$$
$$f(x) = -\frac{1}{x + 2} + 3$$
3.3 Properties of Reciprocal Functions
The reciprocal functions can be easily identified with the following
properties.
Reciprocal functions are in the form of a fraction. A numerator is
a real number and the denominator is either a number or a variable
or a polynomial.
The reciprocal of x is 1/x.
The denominator of a reciprocal function cannot be 0.
The domain and range of the reciprocal function f(x) = 1/x are both the set of all
real numbers excluding 0.
The graph of the equation f(x) = 1/x is symmetric about the line
y = x.
the x-axis and y-axis. The y-axis is said to be the vertical asymptote, as
the curve gets very close to it but never touches it. Similarly, the x-axis is the
horizontal asymptote, as the curve never touches the x-axis.
3.7 Exponential Function
An exponential function is a mathematical function. It is used in many
practical situations, such as to find the exponential decay or exponential
growth, to compute investments, to model populations and so on.
Exponential Curve
The shape of the exponential curve depends on the exponential function and on the value of x. If the base of the function is negative, the function is not defined as a real number for many fractional values of x, including values in –1 < x < 1; for this reason the base is taken to be positive.
The growth or the decay of an exponential curve depends on the exponential function. Any quantity that grows or decays by a fixed per cent at regular intervals exhibits either exponential growth or exponential decay.
Exponential Growth
In exponential growth, the quantity increases very slowly at first, and then rapidly. The rate of change increases over time, so growth becomes faster as time passes. This rapid growth is referred to as an “exponential increase”. The formula that defines exponential growth is:
$y = a(1 + r)^x$
where a is the initial value and r is the growth rate per period, written as a decimal (for example, r = 0.05 for 5 per cent). The graph of the function in exponential growth is increasing. The exponential growth formulas are used to find doubling time, to model population growth, compound interest, etc.
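The growth formula and the idea of doubling time can be illustrated numerically. In the sketch below the starting amount (1000) and the 5 per cent rate are hypothetical values, and the function names are assumptions; the doubling time is found by solving (1 + r)^x = 2 with logarithms.

```python
import math


def grow(a, r, x):
    """Value after x periods: y = a(1 + r)^x, with r as a decimal (0.05 = 5%)."""
    return a * (1 + r) ** x


def doubling_time(r):
    """Number of periods needed to double: solve (1 + r)^x = 2 for x."""
    return math.log(2) / math.log(1 + r)


print(round(grow(1000, 0.05, 10), 2))  # 1628.89
print(round(doubling_time(0.05), 1))   # 14.2 periods
```

Substituting the doubling time back into `grow` returns roughly twice the starting amount, which is a quick consistency check on both functions.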
The properties of the exponential function graph when the base is greater
than 1 are given below.
The graph is an increasing function passing through the point (0,1).
The domain is all real numbers and the range is y > 0.
The graph is asymptotic to the x-axis as x approaches negative
infinity.
The graph increases without bound as x approaches positive infinity.
The graph is continuous and smooth.
Example 4: Construct the graph for the exponential function
$g(x) = (1/2)^x$.
Following the similar steps as done in the above example, we get the
following graph.
The properties of the exponential function and its graph when the base is between 0 and 1 are as follows:
The graph is a decreasing function passing through the point (0,1).
The domain includes all real numbers and the range is y > 0.
The graph is asymptotic to the x-axis as x approaches positive infinity.
The graph increases without bound as x approaches negative infinity.
The graph is continuous and smooth.
Thus, from the above two graphs, we can see that when $f(x) = 2^x$ the function is increasing and when $g(x) = (1/2)^x$ the function is decreasing. Thus, the graph of the exponential function $f(x) = a^x$ increases when a > 1 and decreases when 0 < a < 1.
function, the higher its growth. But a function which grows faster than any polynomial function is the exponential function $y = f(x) = a^x$, where a > 1. That is, for any positive integer n, the function f(x) grows faster than $f_n(x)$.
Thus, the exponential function having base greater than 1, i.e., a > 1, is defined as $y = f(x) = a^x$. The domain of the exponential function is the set of real numbers and the range is the set of all positive real numbers. Also note that the exponential function is increasing and the point (0, 1) always lies on the graph of an exponential function. Also, its value is very close to zero when x is large and negative.
The base of the logarithm is a. This can be read as log base a of x. The two most common bases used in logarithmic functions are base 10 and base e. Thus, log functions include the natural logarithm (ln), which has base e and is denoted by $\log_e$, and the common logarithm (log), which has base 10 and is denoted by $\log_{10}$ or simply log.
Some examples of logarithmic functions are:
f(x) = ln (x – 3)
$f(x) = \log_2 (x + 6) - 2$
f(x) = 2 log x, etc.
Logarithms can be calculated for positive whole numbers, fractions and decimals, but not for zero or negative values. Logarithms are generally calculated with a base of 10. The logarithmic value of any positive number can be found using a Napier logarithm table.
The domain of log function y = log x is (0, ∞) and the range of any
log function is the set of all real numbers (R).
Example 5: Simplify $\log_2 (1/128)$.
Solution: We use the properties of logarithmic function to simplify the
given logarithm.
$\log_2 (1/128) = \log_2 1 - \log_2 128$ (since log (a/b) = log a – log b)
$= 0 - \log_2 2^7$ (since $\log_a 1 = 0$)
$= -\log_2 2^7$
$= -7 \log_2 2$ (since $\log a^x = x \log a$)
$= -7(1)$ (since $\log_a a = 1$)
$= -7$
Hence $\log_2 (1/128) = -7$
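Each step of this simplification can be checked numerically with Python's standard `math` module. This is only a sanity check of the log properties used above, not a proof; results are compared with a tolerance because floating-point logarithms may carry small rounding errors.

```python
import math

direct = math.log2(1 / 128)                  # log2(1/128) evaluated directly
quotient_rule = math.log2(1) - math.log2(128)  # log(a/b) = log a - log b
power_rule = -7 * math.log2(2)               # log(a^x) = x log a, with 128 = 2^7

# All three expressions are expected to equal -7 (up to rounding).
print(direct, quotient_rule, power_rule)
print(math.isclose(direct, -7) and math.isclose(quotient_rule, -7))
```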
Example 6: Find the domain and range of the logarithmic function
f(x) = 2 log (2x – 4) + 5.
Solution: For finding domain, we set the argument of the function greater
than 0 and solve for x.
2x – 4 > 0
2x > 4
x > 2
Logarithmic Graph
Since the exponential and log functions are inverses of each other, their graphs are symmetric with respect to the line y = x. Also, when x = 1, y = 0, as $y = \log_a 1 = 0$ for any ‘a’. Thus, all such functions have an x-intercept of (1, 0). Since $\log_a 0$ is not defined, a logarithmic function does not have a y-intercept. The domain of the basic logarithmic function $y = \log_a x$ is the set of positive real numbers and the range is the set of all real numbers. Using all these, the graph of the logarithmic function will be
Join the two points and extend the curve on both sides with respect to the vertical asymptote.
Example 7: Graph the logarithmic function $f(x) = 2 \log_3 (x + 1)$.
Solution: Here, the base is 3 > 1 so the curve will be an increasing curve.
For domain: Since x + 1 > 0 ⇒ x > –1 so domain = (–1, ∞).
Range = R.
Vertical asymptote is x = –1.
At x = 0, $y = 2 \log_3 (0 + 1) = 2 \log_3 1 = 2(0) = 0$
At x = 2, $y = 2 \log_3 (2 + 1) = 2 \log_3 3 = 2(1) = 2$
Thus, the two points on the curve are (0, 0) and (2, 2). Thus, the logarithmic function graph will be:
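The points used to sketch this curve can be re-computed in Python. The sketch below is illustrative: the function name `f` mirrors the example, the extra point x = 8 is an added assumption to show the curve continuing to rise, and values are rounded because `math.log(x, 3)` works in floating point.

```python
import math


def f(x):
    """f(x) = 2 * log_3(x + 1), defined only for x > -1."""
    if x <= -1:
        raise ValueError("x must be greater than -1 (vertical asymptote at x = -1)")
    return 2 * math.log(x + 1, 3)


for x in (0, 2, 8):
    print(x, round(f(x), 6))
# 0 0.0
# 2 2.0
# 8 4.0
```

Values of x closer and closer to –1 give increasingly negative outputs, which is the behaviour expected near the vertical asymptote.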
3.9 Exercise
1. Find the reciprocal of the following.
i. 5
ii. 3x
iii. $x^2 + 6$
iv. 5/8
2. Find the domain and range of the reciprocal function y = 1/(x+3).
3. Find the vertical and horizontal asymptote of the function
f(x) = 2/(x – 7).
4. Solve the exponential equation: $(1/4)^x = 64$.
5. Graph an exponential function $(1/3)^x - 1$.
6. Solve for x: 8(4x-1) = 45x.
7. Solve the exponential equation for x: –5x-3 = 25/40.
8. Simplify the following exponential expression: $3^x - 3^{x+2}$.
9. The half-life of carbon-14 is 5,730 years. If there were initially
1000 grams of carbon, then what is the amount of carbon left after
2000 years? Round your answer to the nearest integer.
10. Express $4^3 = 64$ in logarithmic form.
11. Given the logarithmic function $f(x) = 3 \log_2 (2x - 3) - 7$. Find the
domain, range, vertical and horizontal asymptotes of this function.
4
Correlation
STRUCTURE
4.1 Covariance
4.2 Correlation
4.3 Types of Correlation
4.4 Scatter or Dot Diagram
4.5 Karl Pearson’s Coefficient of Correlation
4.6 Coefficient of Correlation of Grouped Data
4.7 Rank Correlation
4.8 Spearman’s Rank Correlation Coefficient
4.9 Equal Ranks
4.10 Correlation Factor
4.11 Exercise
4.1 Covariance
Let the corresponding values of two variables X and Y be given by the ordered pairs (x1, y1),
(x2, y2), (x3, y3), …, (xn, yn).
Then the covariance between X and Y is denoted by Cov (X, Y).
It is defined as
$\mathrm{Cov}(X, Y) = \dfrac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$
OR
Cov (X, Y) = E (XY) – E (X) E (Y)
where E (XY), E (X), E (Y) are the corresponding means, i.e.
$E(X) = \dfrac{1}{n}\sum x_i$, $E(Y) = \dfrac{1}{n}\sum y_i$ and $E(XY) = \dfrac{1}{n}\sum x_i y_i$,
so that
$\mathrm{Cov}(X, Y) = \dfrac{1}{n}\sum x_i y_i - \bar{x}\,\bar{y}$
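The two forms of the covariance can be checked against each other numerically. The sketch below is only an illustration: the function name `covariance` is hypothetical, and the sample values are borrowed from a later example in this lesson (x = 5, 9, 13, 17, 21 and y = 12, 20, 25, 33, 35).

```python
def covariance(xs, ys):
    """Return Cov(X, Y) computed two ways: from deviations about the means,
    and as E(XY) - E(X)E(Y). Both use the divisor n, as in the text."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    by_deviations = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    by_expectations = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    return by_deviations, by_expectations


# Illustrative data (assumption: borrowed from a later worked example)
xs = [5, 9, 13, 17, 21]
ys = [12, 20, 25, 33, 35]
cov_dev, cov_exp = covariance(xs, ys)
print(cov_dev, cov_exp)
```

Both values agree (about 47.2 for this data), which is what the identity Cov (X, Y) = E (XY) – E (X) E (Y) predicts.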
Solution:
4.2 Correlation
Whenever two variables x and y are so related that an increase in
one is accompanied by an increase or decrease in the other, the
variables are said to be correlated.
For example, the yield of a crop varies with the amount of rainfall.
In the third diagram, the points are scattered in such a way that there is no
correlation between them. In the fourth and fifth diagrams, the points lie exactly on a straight
line, since an increase in one variable x is proportional to the increase or
decrease in the other variable y. These diagrams show perfect positive and perfect negative correlation
respectively.
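$r = \dfrac{\sum XY}{n\,\sigma_x \sigma_y} = \dfrac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}}$
where X = x – x̄, Y = y – ȳ,
i.e. X, Y are the deviations measured from their respective means,
$\dfrac{1}{n}\sum XY$ = covariance,
and σx, σy are the standard deviations of these series.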
Example 3: Calculate the coefficient of correlation between x and y series
from the following data:
$\sum (x - \bar{x})^2 = 136,\quad \sum (y - \bar{y})^2 = 138,\quad \sum (x - \bar{x})(y - \bar{y}) = 122$
$r = \dfrac{\sum XY}{\sqrt{\sum X^2 \sum Y^2}}$
x 5 9 13 17 21
y 12 20 25 33 35
Solution: Here
$\bar{x} = \dfrac{5 + 9 + 13 + 17 + 21}{5} = \dfrac{65}{5} = 13$
$\bar{y} = \dfrac{12 + 20 + 25 + 33 + 35}{5} = \dfrac{125}{5} = 25$
Let $X = (x - \bar{x})$ and $Y = (y - \bar{y})$
x  21  23  30  54  57   58  72   78  87   90
y  60  71  72  83  110  84  100  92  113  135
Solution:
$\text{Mean of } x = \bar{x} = \dfrac{\sum x}{n} = \dfrac{570}{10} = 57$
$\text{Mean of } y = \bar{y} = \dfrac{\sum y}{n} = \dfrac{920}{10} = 92$
If X and Y are the deviations of the x's and y's from their respective means, then
the data may be arranged in the following form:
x      y      X = x – 57   Y = y – 92   X²     Y²     XY
21     60     –36          –32          1296   1024   1152
23     71     –34          –21          1156   441    714
30     72     –27          –20          729    400    540
54     83     –3           –9           9      81     27
57     110    0            18           0      324    0
58     84     1            –8           1      64     –8
72     100    15           8            225    64     120
78     92     21           0            441    0      0
87     113    30           21           900    441    630
90     135    33           43           1089   1849   1419
TOTAL  Σx = 570   Σy = 920   ΣX² = 5846   ΣY² = 4688   ΣXY = 4594
ΣX² = 5846, ΣY² = 4688 and ΣXY = 4594
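The table above can be reproduced with a few lines of code. This is a minimal sketch, not part of the original solution; the function name `pearson_r` is an assumption, and it applies the deviation form r = ΣXY / √(ΣX² ΣY²) to the ten pairs of the example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Karl Pearson's r from deviations about the actual means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]          # X = x - mean of x
    dev_y = [y - mean_y for y in ys]          # Y = y - mean of y
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))
    sum_x2 = sum(a * a for a in dev_x)
    sum_y2 = sum(b * b for b in dev_y)
    return sum_xy / sqrt(sum_x2 * sum_y2)


xs = [21, 23, 30, 54, 57, 58, 72, 78, 87, 90]
ys = [60, 71, 72, 83, 110, 84, 100, 92, 113, 135]
r = pearson_r(xs, ys)
print(round(r, 4))
```

With ΣXY = 4594, ΣX² = 5846 and ΣY² = 4688, the value works out to roughly 0.88, a high positive correlation.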
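X′ = x – a, Y′ = y – b
Solution: Let the assumed means for the x and y series be 5 and 9 respectively.
x      y     X′ = x – 5   Y′ = y – 9   X′²   Y′²   X′Y′
3      7     –2           –2           4     4     4
7      12    2            3            4     9     6
5      8     0            –1           0     1     0
4      8     –1           –1           1     1     1
6      10    1            1            1     1     1
8      13    3            4            9     16    12
2      5     –3           –4           9     16    12
7      10    2            1            4     1     2
Total        2            1            32    49    38
Now, with n = 8,
$r = \dfrac{n\sum X'Y' - \sum X' \sum Y'}{\sqrt{[n\sum X'^2 - (\sum X')^2][n\sum Y'^2 - (\sum Y')^2]}} = \dfrac{8(38) - (2)(1)}{\sqrt{[8(32) - 2^2][8(49) - 1^2]}} = \dfrac{302}{\sqrt{252 \times 391}} \approx 0.96$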
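The assumed-mean method can be verified in code. The sketch below is illustrative only (the function name `pearson_r_assumed_mean` is hypothetical); it evaluates r with the assumed means 5 and 9 used in the table and again with assumed means 0 and 0, to show that r does not depend on the choice of origin.

```python
from math import sqrt


def pearson_r_assumed_mean(xs, ys, a, b):
    """r computed from deviations X' = x - a, Y' = y - b about assumed means."""
    n = len(xs)
    u = [x - a for x in xs]
    v = [y - b for y in ys]
    su, sv = sum(u), sum(v)
    suu = sum(t * t for t in u)
    svv = sum(t * t for t in v)
    suv = sum(s * t for s, t in zip(u, v))
    return (n * suv - su * sv) / sqrt((n * suu - su ** 2) * (n * svv - sv ** 2))


xs = [3, 7, 5, 4, 6, 8, 2, 7]
ys = [7, 12, 8, 8, 10, 13, 5, 10]
r_5_9 = pearson_r_assumed_mean(xs, ys, 5, 9)   # assumed means from the table
r_0_0 = pearson_r_assumed_mean(xs, ys, 0, 0)   # a different origin
print(round(r_5_9, 4), round(r_0_0, 4))
```

Both calls give the same value, about 0.96, because the numerator and both factors under the square root are unchanged by a shift of origin.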
Example 7: From the following data, examine whether input of oil and
output of electricity can be said to be correlated.
Input of Oil           6.9  8.2  7.8  4.8  9.6  8.0  7.7
Output of Electricity  1.9  3.5  6.5  1.3  5.5  3.5  2.2
Hence the age and the sum assured are negatively correlated, i.e., as age
goes up the sum assured comes down.
Example 10: Calculate the coefficient of correlation for the following table:
Here,
Solution:
Example 11: A computer operator, while calculating the correlation coefficient
between two variates x and y for 25 pairs of observations, obtained the
following constants:
n = 25, Σx = 125, Σx² = 650, Σy = 100, Σy² = 460, Σxy = 508
It was, however, later discovered at the time of checking that he had
copied down two pairs as (6, 14) and (8, 6) while the correct pairs were
(8, 12) and (6, 8). Obtain the correct value of the correlation coefficient.
Solution:
Corrected Σx = Incorrect Σx – (6 + 8) + (8 + 6) = 125 – 14 + 14 = 125
Corrected Σy = Incorrect Σy – (14 + 6) + (12 + 8) = 100 – 20 + 20 = 100
Corrected Σx² = 650 – (6² + 8²) + (8² + 6²) = 650 – 100 + 100 = 650
Corrected Σy² = 460 – (14² + 6²) + (12² + 8²) = 460 – 232 + 208 = 436
Corrected Σxy = 508 – [(6) (14) + (8) (6)] + [(8) (12) + (6) (8)]
= 508 – (84 + 48) + (96 + 48) = 508 – 132 + 144 = 520
Corrected value of the correlation coefficient is
$r = \dfrac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} = \dfrac{25(520) - (125)(100)}{\sqrt{[25(650) - 125^2][25(436) - 100^2]}} = \dfrac{500}{\sqrt{625 \times 900}} = \dfrac{500}{750} \approx 0.67$
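The correction procedure is mechanical: subtract the contribution of each wrongly copied pair from every sum and add the contribution of the correct pair. A small sketch of this, with hypothetical helper names `corrected_sums` and `r_from_sums`, is given below.

```python
from math import sqrt


def corrected_sums(sums, wrong_pairs, right_pairs):
    """Remove the wrong pairs from each running sum and add the right ones."""
    fixed = dict(sums)
    for sign, pairs in ((-1, wrong_pairs), (+1, right_pairs)):
        for x, y in pairs:
            fixed["x"] += sign * x
            fixed["y"] += sign * y
            fixed["x2"] += sign * x * x
            fixed["y2"] += sign * y * y
            fixed["xy"] += sign * x * y
    return fixed


def r_from_sums(n, s):
    """Correlation coefficient from raw sums (no deviations needed)."""
    num = n * s["xy"] - s["x"] * s["y"]
    den = sqrt((n * s["x2"] - s["x"] ** 2) * (n * s["y2"] - s["y"] ** 2))
    return num / den


n = 25
reported = {"x": 125, "x2": 650, "y": 100, "y2": 460, "xy": 508}
fixed = corrected_sums(reported, [(6, 14), (8, 6)], [(8, 12), (6, 8)])
r_corrected = r_from_sums(n, fixed)
print(fixed, round(r_corrected, 4))
```

The corrected sums come out as Σy² = 436 and Σxy = 520 (the other sums are unchanged), giving r = 500/750, i.e. about 0.67.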
4.8 Spearman’s Rank Correlation Coefficient
Spearman’s rank correlation coefficient measures the strength of association
between two ranked variables.
Working Rule
Step I: Assign ranks to each item of both series, if they are not given.
Step II: Calculate the difference d between the rank of X and the rank of Y and
write it in a separate column.
Step III: Square the difference d and write d² in a separate column.
Step IV: Apply the formula to get the rank correlation coefficient:
$r = 1 - \dfrac{6\sum d^2}{n(n^2 - 1)}$
Solution:
Person   Rank in Statistics R1   Rank in Income R2   d = R1 – R2   d²
A        9                       1                   8             64
B        10                      2                   8             64
C        6                       3                   3             9
D        5                       4                   1             1
E        7                       5                   2             4
F        2                       6                   –4            16
G        4                       7                   –3            9
H        8                       8                   0             0
I        1                       9                   –8            64
J        3                       10                  –7            49
                                                                   Σd² = 280
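The working rule and the table above can be checked with a short script. This is a minimal sketch under the assumption that the ranks are already assigned and contain no ties; the function name `spearman_from_ranks` is hypothetical.

```python
def spearman_from_ranks(ranks_1, ranks_2):
    """Return (sum of d^2, rank correlation) for two lists of untied ranks."""
    n = len(ranks_1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_1, ranks_2))
    r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
    return sum_d2, r_s


# Ranks of persons A to J, as listed in the table
rank_statistics = [9, 10, 6, 5, 7, 2, 4, 8, 1, 3]
rank_income = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
sum_d2, r_s = spearman_from_ranks(rank_statistics, rank_income)
print(sum_d2, round(r_s, 4))
```

For this table Σd² = 280 and n = 10, so r = 1 – 1680/990, about –0.70, indicating a fairly strong negative association between the two rankings.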
In the formula of the rank correlation coefficient, add the factor
$\dfrac{m(m^2 - 1)}{12}$ to $\sum d^2$,
where m is the number of times an item (say a1) is repeated. This factor
is added for each repeated value in both the series. The total number of
observations is denoted by n.
The modified formula for the rank correlation coefficient is given below.
$r = 1 - \dfrac{6\left[\sum d^2 + \sum \dfrac{m(m^2 - 1)}{12}\right]}{n(n^2 - 1)}$
Example 13: Obtain the rank correlation coefficient for the following data:
x 68 64 75 50 64 80 75 40 55 64
y 62 58 68 45 81 60 68 48 50 70
Solution:
x     y     Rank in x = x′   Rank in y = y′   d = x′ – y′   d²
68    62    4                5                –1            1
64    58    6                7                –1            1
75    68    2.5              3.5              –1            1
50    45    9                10               –1            1
64    81    6                1                5             25
80    60    1                6                –5            25
75    68    2.5              3.5              –1            1
40    48    10               9                1             1
55    50    8                8                0             0
64    70    6                2                4             16
Total                                                       Σd² = 72
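The ranking with ties and the correction factor m(m² – 1)/12 can be sketched in code as follows. The helper names (`average_ranks`, `tie_correction`, `spearman_with_ties`) are hypothetical; the sketch assumes, as in the table, that the largest value gets rank 1 and that tied values share the average of the ranks they occupy.

```python
from collections import Counter


def average_ranks(values):
    """Rank in descending order; tied values receive their average rank."""
    ordered = sorted(values, reverse=True)
    counts = Counter(values)
    rank_of = {v: ordered.index(v) + 1 + (m - 1) / 2 for v, m in counts.items()}
    return [rank_of[v] for v in values]


def tie_correction(values):
    """Sum of m(m^2 - 1)/12 over every value repeated m > 1 times."""
    return sum(m * (m ** 2 - 1) / 12 for m in Counter(values).values() if m > 1)


def spearman_with_ties(xs, ys):
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    correction = tie_correction(xs) + tie_correction(ys)
    r = 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))
    return sum_d2, correction, r


xs = [68, 64, 75, 50, 64, 80, 75, 40, 55, 64]
ys = [62, 58, 68, 45, 81, 60, 68, 48, 50, 70]
sum_d2, correction, r = spearman_with_ties(xs, ys)
print(sum_d2, correction, round(r, 4))
```

Here 75 is repeated twice and 64 three times in x, and 68 twice in y, so the total correction is 0.5 + 2 + 0.5 = 3, and r = 1 – 6(72 + 3)/990, about 0.545.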
II Judge 1 6 5 10 3 2 4 9 7 8
III Judge 6 4 9 8 1 2 3 10 5 7
Use the correlation coefficient to determine which pair of Judges has the
nearest approach to common flair in beauty rankings.
Solution:
r1   r2   r3   d12 = r1 – r2   d13 = r1 – r3   d23 = r2 – r3   d12²   d13²   d23²
5    1    6    4               –1              –5              16     1      25
3    6    4    –3              –1              2               9      1      4
10   5    9    5               1               –4              25     1      16
7    10   8    –3              –1              2               9      1      4
2    3    1    –1              1               2               1      1      4
1    2    2    –1              –1              0               1      1      0
4    4    3    0               1               1               0      1      1
10   9    10   1               0               –1              1      0      1
4    7    5    –3              –1              2               9      1      4
6    8    7    –2              –1              1               4      1      1
Total: Σd12² = 75, Σd13² = 9, Σd23² = 60
Since r13 = 0.9455 is maximum, so the pair of first and third Judge has
the nearest approach to the common taste of beauty.
4.11 Exercise
1. Calculate the coefficient of correlation from the data given below.
x 4 6 8 10 12
y 2 3 4 6 10
2. Find the coefficient of correlation between x and y from the table
of their values:
x 1 3 4 6 8 9 11 14
y 1 2 4 4 5 7 8 9
3. Find the coefficient of correlation of the following data taking new
origin of x at 70 and for y at 67:
x 67 68 64 68 72 70 69 70
y 65 66 67 67 68 69 71 71
4. Calculate the coefficient of correlation for the following data:
x 1 2 3 4 5 6 7 8 9
y 9 8 10 12 11 13 14 16 15
5. Calculate Karl Pearson’s coefficient of correlation from the following
data, using 20 as working mean for price and 70 as working mean
for demand.
Price 14 16 17 18 19 20 21 22 23
Demand 84 78 70 75 66 67 62 58 60
6. Calculate the correlation coefficient between the following pairs of
values:
x 100 110 115 116 120 120 125 130 135
y 18 18 17 16 16 16 15 13 10
7. The following marks have been obtained by a class of students in
statistics (out of 100):
Paper I 45 55 56 58 60 65 68 70 75 80 85
Paper II 56 50 48 60 62 64 65 70 74 82 90
Compute the coefficient of correlation for the above data.
8. Calculate the coefficient of correlation between the values of x and
y from the following data:
x 78 89 97 69 59 79 61 61
y 125 137 156 112 107 136 123 108
[Hint: You may use 69 as working mean for x and 112 as that for y].
9. Find the correlation coefficient between the income and expenditure
of a wage earner and comment on the result:
Month Jan. Feb. March April May June July
Income 46 54 56 56 58 60 62
Expenditure 36 40 44 54 42 58 54
10. Two girls Asgari and Mumtaz were asked to rank 7 different types
of lipsticks. The ranks given by them are given below:
Lipsticks A B C D E F G
Asgari 2 1 4 3 5 7 6
Mumtaz 1 3 2 4 5 6 7
Calculate Spearman’s rank correlation coefficient.
5
Regression
STRUCTURE
5.1 Introduction
5.2 Curve of Regression
5.3 Regression Coefficients
5.4 Properties of Regression Coefficients
5.5 Relation between Regression Analysis and Correlation Analysis
5.6 Exercises
5.1 Introduction
If the scatter diagram indicates some relationship between two variables x and y, then
the dots of the scatter diagram will be concentrated around a curve. This curve is called
the curve of regression.
Regression analysis is the method used for estimating the unknown values of one variable
corresponding to the known values of another variable.
Suppose we have to find the unknown value of y for a certain value
of x; then we use the line of regression of y on x, i.e., y = a + bx. Here
y is the dependent variable and x is the independent variable.
If we have to find the unknown value of x for a given value of y, then we
use the line of regression of x on y, i.e., x = a + by. Here x is the dependent
variable and y is the independent variable.
So, we have two lines of regression.
When the curve is a straight line, it is called a line of regression. A line
of regression is the straight line which gives the best fit in the least
squares sense to the given data.
Regression is called non-linear if there exists a relationship (parabola
etc.) other than a straight line between the variables under consideration.
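Let S be the sum of the squares of such distances, then
$S = \sum (y - a - bx)^2$
According to the principle of least squares, we have to choose a and b
so that S is minimum. The method of least squares gives the conditions
for the minimum value of S:
$\dfrac{\partial S}{\partial a} = 0$ and $\dfrac{\partial S}{\partial b} = 0$,
which give the normal equations
$\sum y = na + b\sum x$   (2)
$\sum xy = a\sum x + b\sum x^2$   (3)
Dividing (2) by n, we get
$\bar{y} = a + b\bar{x}$
where x̄ and ȳ are the means of the x series and the y series. This shows that
(x̄, ȳ) lies on the line of regression (1). Shifting the origin to (x̄, ȳ),
the equation (3) becomes
$\sum XY = b\sum X^2$, where X = x – x̄ and Y = y – ȳ.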
But $r = \dfrac{\sum XY}{n\,\sigma_x \sigma_y}$ and $\sigma_x^2 = \dfrac{\sum X^2}{n}$, so the regression coefficient of y on x is
$b_{yx} = r\dfrac{\sigma_y}{\sigma_x}$
Regression coefficient of x on y is
$b_{xy} = r\dfrac{\sigma_x}{\sigma_y}$
5.4 Properties of Regression Coefficients
1. The geometric mean between regression coefficients is the coefficient
of correlation r, i.e. $r = \pm\sqrt{b_{xy}\, b_{yx}}$.
(a) If byx is positive, bxy will also be positive.
(b) If byx is negative, bxy will also be negative.
(c) Both regression coefficients must have the same sign. If bxy
and byx are both positive, r will be positive and if bxy and byx
are both negative, r will also be negative.
(d) bxy byx ≤ 1, since bxy byx = r².
2. If one regression coefficient is greater than unity, then the other
regression coefficient must be less than unity.
3. The arithmetic mean of byx and bxy is equal to or greater than the
coefficient of correlation,
i.e. $\dfrac{b_{xy} + b_{yx}}{2} \ge r$
4. Regression coefficients are independent of origin but not of scale.
5. If θ is the acute angle between the two regression lines in the case
of two variables x and y, then
$\tan\theta = \dfrac{1 - r^2}{|r|} \cdot \dfrac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}$
Hence, the correlation coefficient between x and y is $\dfrac{2}{3}$.
Example 2: The following regression equations were obtained from a
correlation table:
y = 0.516x + 33.73, x = 0.512y + 32.52
Find the value of (a) the correlation coefficient, (b) the mean of x and
the mean of y.
Solution:
y = 0.516x + 33.73   (1)
x = 0.512y + 32.52   (2)
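Both parts of this example follow from two facts stated earlier: r is the geometric mean of the two regression coefficients, and the point (x̄, ȳ) lies on both regression lines. The sketch below is an illustration under those two facts; the variable names are hypothetical.

```python
from math import sqrt

# Line of y on x: y = b_yx * x + c_yx ; line of x on y: x = b_xy * y + c_xy
b_yx, c_yx = 0.516, 33.73
b_xy, c_xy = 0.512, 32.52

# (a) r is the geometric mean of the regression coefficients;
#     the sign is positive because both coefficients are positive.
r = sqrt(b_yx * b_xy)

# (b) The means satisfy both equations. Substituting the first line into
#     the second gives mean_x * (1 - b_xy * b_yx) = b_xy * c_yx + c_xy.
mean_x = (b_xy * c_yx + c_xy) / (1 - b_xy * b_yx)
mean_y = b_yx * mean_x + c_yx

print(round(r, 3), round(mean_x, 2), round(mean_y, 2))
```

The values come out as r ≈ 0.514, x̄ ≈ 67.67 and ȳ ≈ 68.65.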
Solution:
S. N.   x    y    xy    x²
1       1    1    1     1
2       3    2    6     9
3       4    4    16    16
4       6    4    24    36
5       8    5    40    64
6       9    7    63    81
7       11   8    88    121
8       14   9    126   196
Total   56   40   364   524
Let y = a + bx be the line of regression of y on x, where a and b are
given by the following equations:
$\sum y = na + b\sum x$
$\sum xy = a\sum x + b\sum x^2$
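The two normal equations for a and b can be solved directly from the column totals. The following sketch uses exact fractions so that no rounding is introduced; the function name `regression_line_y_on_x` is an assumption made for illustration.

```python
from fractions import Fraction


def regression_line_y_on_x(xs, ys):
    """Solve sum(y) = n*a + b*sum(x) and sum(xy) = a*sum(x) + b*sum(x^2)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b = Fraction(n * sxy - sx * sy, n * sxx - sx ** 2)
    a = (Fraction(sy) - b * sx) / n
    return a, b


xs = [1, 3, 4, 6, 8, 9, 11, 14]
ys = [1, 2, 4, 4, 5, 7, 8, 9]
a, b = regression_line_y_on_x(xs, ys)
print(a, b)
```

With Σx = 56, Σy = 40, Σxy = 364 and Σx² = 524, this gives a = 6/11 and b = 7/11, i.e. y ≈ 0.545 + 0.636x.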
Example 6: The following data regarding the heights (y) and the weights
(x) of 100 college students is given:
Σx = 15000    Σx² = 2272500
Σy = 6800     Σy² = 463025
Σxy = 1022250
Find the correlation coefficient between height and weight and state the
equation of regression of height on weight.
Solution:
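The required quantities can be obtained directly from the given sums. The sketch below is one way to organise the arithmetic, assuming n = 100 as stated in the example; the variable names are hypothetical.

```python
from math import sqrt

n = 100
sum_x, sum_x2 = 15000, 2272500      # weights (x)
sum_y, sum_y2 = 6800, 463025        # heights (y)
sum_xy = 1022250

s_xy = n * sum_xy - sum_x * sum_y   # n*Σxy - Σx Σy
s_xx = n * sum_x2 - sum_x ** 2      # n*Σx² - (Σx)²
s_yy = n * sum_y2 - sum_y ** 2      # n*Σy² - (Σy)²

r = s_xy / sqrt(s_xx * s_yy)        # correlation coefficient
b_yx = s_xy / s_xx                  # regression coefficient of y on x
intercept = sum_y / n - b_yx * sum_x / n   # line passes through the means

print(r, b_yx, intercept)
```

This gives r = 0.6 and the regression line of height on weight y – 68 = 0.1 (x – 150), i.e. y = 0.1x + 53.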
5.5 Relation between Regression Analysis and Correlation
Analysis
Sr. No. | Correlation Analysis | Regression Analysis
1. | The relationship between two variables is given by the coefficient of correlation. | In this case some points are stepped up and some are stepped down for making an average value.
2. | It is a measure of direction and degree of relationship between x and y. | bxy and byx are mathematical measures of the average relationship between the two variables.
3. | Here, rxy = ryx. | bxy ≠ byx
4. | It does not reflect upon the nature of the variable (dependent or independent variable). | It indicates which is the dependent variable and which is the independent variable.
5. | It does not imply a cause and effect relationship between the variables. | It indicates the cause and effect relationship between the variables.
6. | It is a relative measure and has no units. | It is an absolute measure.
7. | It indicates the degree of association. | It is used to forecast the nature of the dependent variable when the value of the independent variable is given.
8. | It is confined to the study of linear relationships. | It applies not only to linear relationships but to non-linear relationships also.
5.6 Exercises
1. Find the regression line of y on x for the data:
x 1 4 2 3 5
y 3 1 2 5 4
2. Find the equation of regression lines for the following values of x
& y.
x 2 4 6 8 10
y 6 5 4 3 2
3. Compute the regression line of y on x for the following data:
x 1 2 3 4 5 6 7 8 9
y 9 8 10 12 11 13 14 16 15
4. Find the regression lines of x on y and y on x from the given data:
x 1 2 3 4 5 6 7 8 9 10
y 10 12 16 28 25 36 41 49 40 50
5. Find the equations to the lines of regression and the coefficient of
correlation for the following data:
x 2 4 5 6 8 11
y 18 12 10 8 7 5
6. For the following data, find the equation of line of regression:
x 10 12 13 16 17 20 25
y 10 22 24 27 29 33 37
7. Find the regression line of y on x if
x 40 70 50 60 80 50 90 40 60 60
y 2.5 6.0 4.5 5.0 4.5 2.0 5.5 3.0 4.5 3.0
8. The following marks have been obtained by a class of students in
statistics
Paper I 80 45 55 56 58 60 65 68 70 75 85
Paper II 81 56 50 48 60 62 64 65 70 74 90
Compute the coefficient of correlation for the above data. Find the
lines of regression.
9. The following results were obtained from records of age (x) and
systolic blood pressure (y) of a group of 10 men:
x y
Mean 53 142
Variance 130 165
and Σ (x – x̄) (y – ȳ) = 1220.
Find the appropriate regression equation and use it to estimate the
blood pressure of a man whose age is 45.
10. The following results were obtained from lineups in Applied
Mechanics and Engineering Mathematics in an examination:
Applied Mechanics (x) Engg. Maths. (y)
Mean 47.5 39.5
S.D. 16.8 10.8
Find both the regression equations. Also estimate the value of y for
x = 30.
6
Multiple Regression and
Correlation
STRUCTURE
6.1 Multiple Regression
6.2 Multiple Correlation
6.3 Formulae for the Calculation of Multiple Correlation Coefficient
6.4 Properties of Multiple Correlation
6.5 Multiple Regression Analysis
6.6 Exercise
Solution:
Here, we have
x     y     x²    xy     x³    x⁴    x²y
1     1.7   1     1.7    1     1     1.7
2     1.8   4     3.6    8     16    7.2
3     2.3   9     6.9    27    81    20.7
4     3.2   16    12.8   64    256   51.2
Total: Σx = 10, Σy = 9.0, Σx² = 30, Σxy = 25.0, Σx³ = 100, Σx⁴ = 354, Σx²y = 80.8
Let the non-linear relationship be
y = a0 + a1x + a2x²
Normal equations are
Σy = na0 + a1Σx + a2Σx²
Σxy = a0Σx + a1Σx² + a2Σx³
Σx²y = a0Σx² + a1Σx³ + a2Σx⁴
Substituting the values of Σy, n, Σx, Σx² etc. in these equations, we get
9 = 4a0 + 10a1 + 30a2
25 = 10a0 + 30a1 + 100a2
80.8 = 30a0 + 100a1 + 354a2
Solving these equations, we get
a0 = 2, a1 = – 0.5 and a2 = 0.2
Then the non-linear relationship is
y = 2 – 0.5x + 0.2x²
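The three normal equations form a 3 × 3 linear system, which can be solved by any standard method. The sketch below uses Cramer's rule with exact fractions; this is only one possible choice (not necessarily the method intended in the text), and the helper names `det3`, `solve3`, `power_sum` and `moment` are hypothetical.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(matrix, rhs):
    """Cramer's rule: replace each column in turn by the right-hand side."""
    d = det3(matrix)
    solution = []
    for col in range(3):
        replaced = [row[:col] + [rhs[r]] + row[col + 1:] for r, row in enumerate(matrix)]
        solution.append(det3(replaced) / d)
    return solution


xs = [1, 2, 3, 4]
ys = [Fraction("1.7"), Fraction("1.8"), Fraction("2.3"), Fraction("3.2")]
n = len(xs)


def power_sum(p):
    return sum(x ** p for x in xs)                        # Σx^p


def moment(p):
    return sum(x ** p * y for x, y in zip(xs, ys))        # Σx^p y


matrix = [[n, power_sum(1), power_sum(2)],
          [power_sum(1), power_sum(2), power_sum(3)],
          [power_sum(2), power_sum(3), power_sum(4)]]
rhs = [moment(0), moment(1), moment(2)]
a0, a1, a2 = solve3(matrix, rhs)
print(a0, a1, a2)
```

The solution is a0 = 2, a1 = –1/2 and a2 = 1/5, in agreement with y = 2 – 0.5x + 0.2x².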
$= \sqrt{\dfrac{0.16 + 0.25 - 0.24}{1 - 0.36}} = \sqrt{\dfrac{0.17}{0.64}} = \sqrt{0.2656} = 0.515$
Example 3: If r12 = 0.6, r23 = 0.35 and r31 = 0.4 then find R3.12.
Solution: We have r12 = 0.6, r23 = 0.35, and r31 = 0.4
We know that
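$R_{3.12} = \sqrt{\dfrac{r_{31}^2 + r_{32}^2 - 2r_{12} r_{13} r_{23}}{1 - r_{12}^2}}$
$= \sqrt{\dfrac{(0.4)^2 + (0.35)^2 - 2(0.6)(0.4)(0.35)}{1 - (0.6)^2}} = \sqrt{\dfrac{0.1145}{0.64}} = \sqrt{0.1789}$
= 0.423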
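Since the same formula is applied with the subscripts permuted, it is convenient to write it once as a function. The sketch below is illustrative; the function name `multiple_correlation` and its argument names are assumptions. The first two arguments are the correlations of the dependent variable with each of the other two, and the third is the correlation between those two.

```python
from math import sqrt


def multiple_correlation(r_ab, r_ac, r_bc):
    """R_{a.bc}: multiple correlation of variable a on variables b and c."""
    return sqrt((r_ab ** 2 + r_ac ** 2 - 2 * r_ab * r_ac * r_bc) / (1 - r_bc ** 2))


# R_{3.12} with r31 = 0.4, r32 = 0.35 and r12 = 0.6
R_3_12 = multiple_correlation(r_ab=0.4, r_ac=0.35, r_bc=0.6)
print(round(R_3_12, 3))
```

For these values the result is about 0.423. Note that the denominator always uses the correlation between the two independent variables.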
Example 4: If r12 = 0.25, r13 = 0.35 and r23 = 0.45 then find R2.13.
Solution: Here, we have r12 = 0.25, r13 = 0.35, and r23 = 0.45
We know that,
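$R_{2.13} = \sqrt{\dfrac{r_{21}^2 + r_{23}^2 - 2r_{12} r_{13} r_{23}}{1 - r_{13}^2}}$
$= \sqrt{\dfrac{(0.25)^2 + (0.45)^2 - 2(0.25)(0.35)(0.45)}{1 - (0.35)^2}} = \sqrt{\dfrac{0.0625 + 0.2025 - 0.07875}{1 - 0.1225}}$
$= \sqrt{\dfrac{0.18625}{0.8775}} = \sqrt{0.2123}$
= 0.461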
Example 5: Given the following data:
x1 3 5 6 8 12 14
x2 16 10 7 4 3 2
x3 90 72 54 42 30 12
Compute the coefficient of linear multiple correlation of x3 on x1 and x2.
Solution:
Now
$r_{12} = \dfrac{\sum X_1 X_2}{\sqrt{\sum X_1^2 \times \sum X_2^2}} = -0.89$
Also,
$r_{13} = \dfrac{\sum X_1 X_3}{\sqrt{\sum X_1^2 \times \sum X_3^2}} = \dfrac{-582}{\sqrt{90 \times 4008}} = -0.97$
Again
$r_{23} = \dfrac{\sum X_2 X_3}{\sqrt{\sum X_2^2 \times \sum X_3^2}} = \dfrac{720}{\sqrt{140 \times 4008}} = 0.96$
We know that
$R_{3.12} = \sqrt{\dfrac{r_{13}^2 + r_{23}^2 - 2r_{12} r_{13} r_{23}}{1 - r_{12}^2}}$   (1)
Substituting the values of r12, r13 and r23 in (1), we get
$R_{3.12} = \sqrt{\dfrac{(-0.97)^2 + (0.96)^2 - 2(-0.89)(-0.97)(0.96)}{1 - (-0.89)^2}} = \sqrt{\dfrac{0.9409 + 0.9216 - 1.66}{1 - 0.7921}} = \sqrt{\dfrac{0.2025}{0.2079}} = \sqrt{0.9740}$
= 0.987
6.5 Multiple Regression Analysis
We have considered two types of regression equations, one of x on y and
the other of y on x.
Multiple regression analysis represents an extension of regression from two
variables to three or more variables.
We take x1 as the dependent variable and x2 and x3 as independent variables.
(1)
Normal equation of multiple regression equation is
(2)
Putting the value of X1 from (1) in (2), we get
(4)
(5)
Since Σx1 = Σ(X1 − X̅1) = 0 , Σx2 = Σ(X 2 − X̅2 ) = 0, Σx3 = Σ(X3 − X̅3) =
0 [Sum of the deviations from the mean = 0]
Therefore from (3), a = 0
Putting the value of a = 0 in (4) and (5), we get
On solving (6) and (7), we get the values of b12.3, and b13.2.
On putting the values of a, b12.3 and b13.2 in (1), we get the required
regression equation.
Similarly
x2 = b21.3x1 + b23.1x3
x3 = b31.2x1 + b32.1x2
Second Method:
On putting the values of Σx1 x2 etc. in (6) and (7), we get
Where ∆ij is the co-factor of the element in the ith row and jth column
in the determinant.
Hence, on substituting the values of b12.3 and b13.2 the equation to the
regression plane of x1 on x2 and x3 is
(12)
Similarly,
b21.3 =
=
and
b23.1 =
We know that
6.6 Exercise
1. If r12 = 0.59, r13 = 0.46, and r23 = 0.77 then find R1.23.
2. If r12 = 0.5, r13 = 0.6, and r23 = 0.7 then find R2.13.
3. If r12 = 0.6, r13 = 0.7, and r23 = 0.65 then find R3.12.
4. If r12 = 0.8, r13 = – 0.5, and r23 = 0.40 then prove that R1.23 = R1.32
5. If r12 = 0.45, r13 = 0.32, and r23 = 0.61 then find R1.23
6. If r12 = 0.8, r13 = 0.7, r23 = 0.6 and σ1 = 10, σ2 = 8, σ3 = 5, then
find the regression equation of x1 on x2 and x3.
7. If σ1 = 3, σ2 = 4, σ3 = 5 and r12 = 0.7, r23 = 0.4, r31 = 0.6, then
determine the regression equation of x1 on x2 and x3.
8. If r12 = 0.28, r23 = 0.49, r31 = 0.51 and σ1 = 2.7, σ2 = 2.4, σ3 = 2.7,
then find the regression equation of x3 on x1 and x2.
9. Find the multiple linear regression equation of x1 and x2 on x3 from
the data given below:
x1 2 4 6 8
x2 3 5 7 9
x3 4 6 8 10
7
Curve Fitting: Principle of
Least Squares
STRUCTURE
7.1 Introduction
7.2 Principle of Least Squares
7.3 Method of Least Squares (Fitting of Straight Lines)
7.4 Fitting a Curve of Type Y = A/X + B√X
7.5 To Fit a Parabola (Up to Second Degree)
7.6 Change of Scale in Second Degree Equations
7.7 To Fit an Exponential Curve (Y = ABX + C)
7.8 Change of Origin and Scale
7.9 Exercise
7.1 Introduction
Consider a number of pairs of x and y, i.e., (x1, y1), (x2, y2), (x3, y3) … and so on. To get
a relationship between x and y, plot these points on graph paper. The diagram obtained
is known as a scatter diagram or dot diagram. This diagram shows a rough relationship
between the two variables x and y. An exact relationship between the two variables is obtained
by curve fitting. Here we get the algebraic equation of the curve.
Some of the errors e1, e2, e3, …, en will be positive and others negative.
In finding the total error, all the errors are added. On addition, some
negative and some positive errors may cancel, and in some cases the sum
of all the errors may be zero, which leads to a false result. To avoid such a
situation, we make all the errors positive by squaring. Sum, S = e1²
+ e2² + e3² + … + en².
The curve of best fit is that for which the sum of the squares of the
errors (S) is minimum.
This is called the principle of least squares.
For S to be minimum
x         y          xy          x²
1         14         14          1
2         27         54          4
3         40         120         9
4         55         220         16
5         68         340         25
Σx = 15   Σy = 204   Σxy = 748   Σx² = 55
Here,
n = 5
Normal equations are
Σy = na + bΣx   (2)
Σxy = aΣx + bΣx²   (3)
On putting the values of Σx, Σy, Σxy and Σx² in (2) and (3), we have
204 = 5a + 15b   (4)
748 = 15a + 55b   (5)
On solving equations (4) and (5), we get
a = 0, b = 13.6
On substituting the values of a and b in (1), we get
y = 13.6x
Example 3: Find the least squares fit of the form y = a + bx² to the
following data:
x   –1   0   1   2
y   2    5   3   0
Solution:
y = a + bx²   (1)
Let
x² = z, y = a + bz   (2)
x         y    z = x²   yz        z²
–1        2    1        2         1
0         5    0        0         0
1         3    1        3         1
2         0    4        0         16
Σy = 10   Σz = 6   Σyz = 5   Σz² = 18
Here,
n = 4
Normal equations are
Σy = na + bΣz   (3)
Σyz = aΣz + bΣz²   (4)
On putting the values of Σy, Σz, Σyz, Σz² in (3) and (4), we have
10 = 4a + 6b   (5)
5 = 6a + 18b   (6)
On solving equations (5) and (6), we get $a = \dfrac{25}{6}$, $b = -\dfrac{10}{9}$
On substituting the values of a, b in (2), we obtain
$y = \dfrac{25}{6} - \dfrac{10}{9}z$, i.e. $y = \dfrac{25}{6} - \dfrac{10}{9}x^2$
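The substitution z = x² turns the problem into an ordinary straight-line fit, so the same pair of normal equations applies. The sketch below, with a hypothetical helper `fit_line`, reproduces the result with exact fractions.

```python
from fractions import Fraction


def fit_line(zs, ys):
    """Least-squares a, b for y = a + b*z from the two normal equations."""
    n = len(zs)
    sz, sy = sum(zs), sum(ys)
    szy = sum(z * y for z, y in zip(zs, ys))
    szz = sum(z * z for z in zs)
    b = Fraction(n * szy - sz * sy, n * szz - sz ** 2)
    a = (Fraction(sy) - b * sz) / n
    return a, b


xs = [-1, 0, 1, 2]
ys = [2, 5, 3, 0]
zs = [x ** 2 for x in xs]      # substitution z = x^2
a, b = fit_line(zs, ys)
print(a, b)
```

The output is a = 25/6 and b = –10/9, so the fitted curve is y = 25/6 – (10/9)x².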
On putting the values of Σx, Σy, Σxy, Σx² in (2) and (3), we have
24 = 5a + 10b   (4)
67 = 10a + 30b   (5)
On solving (4) and (5), we get
a = 1, b = 1.9
On substituting the values of a and b in (1), we get
y = 1 + 1.9x
Example 5: Fit a curve y = ax² + b for the following data:
x   12     16    20    22      24      26      30
y   4.44   7.5   4.9   10.76   10.76   11.76   14.0
Solution: Given parabola is
y = ax² + b   (1)
Putting x² = X in (1), we get
y = aX + b   (2)
Normal equations are
Σy = aΣX + nb   (3)
ΣXy = aΣX² + bΣX   (4)
n   x    y       x² = X   x⁴ = X²   Xy
1   12   4.44    144      20736     639.36
2   16   7.5     256      65536     1920
3   20   4.9     400      160000    1960
4   22   10.76   484      234256    5207.84
5   24   10.76   576      331776    6197.76
6   26   11.76   676      456976    7949.76
7   30   14.0    900      810000    12600.00
Total: Σy = 64.12, ΣX = 3436, ΣX² = 2079280, ΣXy = 36474.72
Putting the values of Σy, n, ΣX in (3), we get
64.12 = 7b + 3436a ⇒ 3436a + 7b – 64.12 = 0   (5)
Here n = 6
x    y    1/x      1/x²     y/x
1    36   1        1        36
3    29   0.3333   0.1111   9.6667
5    28   0.2000   0.0400   5.6000
7    26   0.1429   0.0204   3.7143
9    24   0.1111   0.0123   2.6667
10   15   0.1000   0.0100   1.5
Total: Σy = 158, Σ(1/x) = 1.8873, Σ(1/x²) = 1.1938, Σ(y/x) = 59.1477
On putting these values in the above normal equations, we have
158 = 6a + 1.8873b   (1)
59.1477 = 1.8873a + 1.1938b   (2)
On solving the equations (1) and (2), we get
a = 21.3810, b = 15.7441
Hence, the equation of the curve is
xy = 21.3810x + 15.7441
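The curve xy = ax + b is the same as y = a + b(1/x), so it can be fitted as a straight line in the variable u = 1/x. The sketch below is illustrative (the function name `fit_a_plus_b_over_x` is hypothetical) and works from the raw data rather than the rounded column totals, so the last decimal places may differ slightly from the values in the table.

```python
def fit_a_plus_b_over_x(xs, ys):
    """Least-squares a, b for y = a + b/x, treated as a line in u = 1/x."""
    n = len(xs)
    u = [1 / x for x in xs]
    su = sum(u)                                   # Σ(1/x)
    suu = sum(t * t for t in u)                   # Σ(1/x²)
    sy = sum(ys)                                  # Σy
    suy = sum(t * y for t, y in zip(u, ys))       # Σ(y/x)
    b = (n * suy - su * sy) / (n * suu - su ** 2)
    a = (sy - b * su) / n
    return a, b


xs = [1, 3, 5, 7, 9, 10]
ys = [36, 29, 28, 26, 24, 15]
a, b = fit_a_plus_b_over_x(xs, ys)
print(round(a, 3), round(b, 3))
```

The result is a ≈ 21.38 and b ≈ 15.74, matching the fitted curve xy = 21.38x + 15.74 to two decimal places.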
i.e., $\sum \dfrac{y}{x} = a\sum \dfrac{1}{x^2} + b\sum \dfrac{1}{\sqrt{x}}$   (2)
and $\sum y\sqrt{x} = a\sum \dfrac{1}{\sqrt{x}} + b\sum x$   (3)
On solving equations (2) and (3), we get the values of a and b. On putting
the values of a and b in (1), we get the equation of the required curve
$y = \dfrac{a}{x} + b\sqrt{x}$
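Example 7: Use the method of least squares to fit the curve $y = \dfrac{c_0}{x} + c_1\sqrt{x}$
to the following table of values:
x   0.1   0.2   0.4   0.5   1   2
y   21    11    7     6     5   6
Solution: The equation of the curve to be fitted is $y = \dfrac{c_0}{x} + c_1\sqrt{x}$
The normal equations are:
$\sum \dfrac{y}{x} = c_0\sum \dfrac{1}{x^2} + c_1\sum \dfrac{1}{\sqrt{x}}$   (1)
$\sum y\sqrt{x} = c_0\sum \dfrac{1}{\sqrt{x}} + c_1\sum x$   (2)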
x     y    y/x     y√x       1/√x      1/x²
0.1   21   210     6.64078   3.16228   100
0.2   11   55      4.91935   2.23607   25
0.4   7    17.5    4.42719   1.58114   6.25
0.5   6    12      4.24264   1.41421   4
1     5    5       5         1         1
2     6    3       8.48528   0.70711   0.25
Total: Σx = 4.2, Σ(y/x) = 302.5, Σy√x = 33.71524, Σ(1/√x) = 10.10081, Σ(1/x²) = 136.5
From equations (1) and (2), we have
302.5 = 136.5c0 + 10.10081c1
and
33.71524 = 10.10081c0 + 4.2c1
Solving these, we get
c0 = 1.97327 and c1 = 3.28182
Hence the required equation of the curve is
$y = \dfrac{1.97327}{x} + 3.28182\sqrt{x}$
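The two normal equations for this curve can be formed and solved directly from the data. The sketch below is one way to do so; the function name `fit_c0_over_x_plus_c1_sqrt_x` is hypothetical, and the 2 × 2 system is solved by the usual determinant formula.

```python
from math import sqrt


def fit_c0_over_x_plus_c1_sqrt_x(xs, ys):
    """Least-squares c0, c1 for y = c0/x + c1*sqrt(x)."""
    s_inv_x2 = sum(1 / x ** 2 for x in xs)                    # Σ(1/x²)
    s_inv_sqrt = sum(1 / sqrt(x) for x in xs)                 # Σ(1/√x)
    s_x = sum(xs)                                             # Σx
    s_y_over_x = sum(y / x for x, y in zip(xs, ys))           # Σ(y/x)
    s_y_sqrt = sum(y * sqrt(x) for x, y in zip(xs, ys))       # Σ(y√x)
    det = s_inv_x2 * s_x - s_inv_sqrt ** 2
    c0 = (s_y_over_x * s_x - s_inv_sqrt * s_y_sqrt) / det
    c1 = (s_inv_x2 * s_y_sqrt - s_inv_sqrt * s_y_over_x) / det
    return c0, c1


xs = [0.1, 0.2, 0.4, 0.5, 1, 2]
ys = [21, 11, 7, 6, 5, 6]
c0, c1 = fit_c0_over_x_plus_c1_sqrt_x(xs, ys)
print(round(c0, 4), round(c1, 4))
```

The computed coefficients are c0 ≈ 1.973 and c1 ≈ 3.282, in agreement with the worked solution.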
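Example 8: A person runs the same racetrack for five consecutive days
and is timed as follows:
Days (x)   1      2      3    4      5
Time (y)   15.3   15.1   15   14.5   14
Make a least squares fit to the above data using the function
$y = a + \dfrac{b}{x} + \dfrac{c}{x^2}$
Solution: The given equation is $y = a + \dfrac{b}{x} + \dfrac{c}{x^2}$. Therefore, the normal
equations are
$\sum y = na + b\sum \dfrac{1}{x} + c\sum \dfrac{1}{x^2}$
$\sum \dfrac{y}{x} = a\sum \dfrac{1}{x} + b\sum \dfrac{1}{x^2} + c\sum \dfrac{1}{x^3}$
$\sum \dfrac{y}{x^2} = a\sum \dfrac{1}{x^2} + b\sum \dfrac{1}{x^3} + c\sum \dfrac{1}{x^4}$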
Here n = 5
x   y      1/x      1/x²     1/x³     1/x⁴     y/x      y/x²
1   15.3   1        1        1        1        15.3     15.3
2   15.1   0.5      0.25     0.1250   0.0625   7.5500   3.7550
3   15     0.3333   0.1111   0.0370   0.0123   5.0000   1.6667
4   14.5   0.2500   0.0625   0.0156   0.0039   3.6250   0.9063
5   14     0.2000   0.0400   0.0080   0.0016   2.8000   0.5600
Total: Σy = 73.9, Σ(1/x) = 2.2833, Σ(1/x²) = 1.4636, Σ(1/x³) = 1.1856, Σ(1/x⁴) = 1.0803, Σ(y/x) = 34.2750, Σ(y/x²) = 22.1880
On putting the values of Σy, Σ(1/x), …, etc., in the normal equations, we get
73.9 = 5a + 2.2833b + 1.4636c   (1)
34.2750 = 2.2833a + 1.4636b + 1.1856c   (2)
22.1880 = 1.4636a + 1.1856b + 1.0803c   (3)
On solving these equations (1), (2) and (3), we get
a = 12.6751, b = 8.2676, and c = – 5.7071
Hence, the equation is
$y = 12.6751 + \dfrac{8.2676}{x} - \dfrac{5.7071}{x^2}$
x 1 2 3 4 5
y 1.8 5.1 8.9 14.1 19.8
Steps for solution of (5), (6) and (7) are the following:
3 × (5), 21720 = 15a + 45b + 165c (8)
(6) – (8), 2055 = 10b + 60c (9)
11 × (5), 79640 = 55a + 165b + 605c (10)
(7) – (10), 12715 = 60b + 374c (11)
6 × (9), 12330 = 60 b + 360c (12)
y = 100(1.2)^x
Example 16: Obtain a relation of the form y = ab^x for the following
data by the method of least squares:
x   2     3      4      5      6
y   8.3   15.4   33.1   65.2   127.4
Solution: The curve to be fitted is y = ab^x, n = 5
Taking log on both the sides, we get
log y = log a + x log b   …(1)
On putting Y = log y
A = log a
B = log b
The transformed equation of (1) becomes
Y = A + Bx   …(2)
x y Y = log y x^2 xY
2 8.3 0.9191 4 1.8382
3 15.4 1.1875 9 3.5625
4 33.1 1.5198 16 6.0792
5 65.2 1.8142 25 9.0710
6 127.4 2.1052 36 12.6312
Σx = 20, ΣY = 7.5458, Σx^2 = 90, ΣxY = 33.1821
The normal equations are
ΣY = nA + BΣx (3)
ΣxY = AΣx + BΣx^2 (4)
On putting the values of Σx, etc. from the above table in (3) and (4),
we get
7.5458 = 5A + 20B (5)
33.1821 = 20A + 90B (6)
On solving (5) and (6), we get
A = 0.3096 and B = 0.2999
log a = 0.3096 ⇒ a = Antilog (0.3096) ⇒ a = 2.04
log b = 0.2999 ⇒ b = Antilog (0.2999) ⇒ b = 1.995
On substituting the values of a and b in y = ab^x, we get
y = 2.04 (1.995)^x
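The fit in Example 16 can be checked directly from the raw data. The following is a minimal sketch, assuming base-10 logarithms as in the table; the function name fit_exponential is hypothetical and is not taken from the text. It builds the sums that enter the normal equations ΣY = nA + BΣx and ΣxY = AΣx + BΣx^2, solves them for A and B, and takes antilogarithms.

```python
import math

def fit_exponential(xs, ys):
    """Fit y = a * b**x by least squares on log10(y) = log10(a) + x*log10(b)."""
    n = len(xs)
    Ys = [math.log10(y) for y in ys]          # Y = log y
    sum_x = sum(xs)
    sum_Y = sum(Ys)
    sum_xx = sum(x * x for x in xs)
    sum_xY = sum(x * Y for x, Y in zip(xs, Ys))
    # Closed-form solution of the two normal equations.
    B = (n * sum_xY - sum_x * sum_Y) / (n * sum_xx - sum_x ** 2)
    A = (sum_Y - B * sum_x) / n
    return 10 ** A, 10 ** B                   # a = antilog A, b = antilog B

xs = [2, 3, 4, 5, 6]
ys = [8.3, 15.4, 33.1, 65.2, 127.4]
a, b = fit_exponential(xs, ys)
print(round(a, 2), round(b, 3))               # approximately 2.04 and 1.995

# Compare fitted values with the observations as a sanity check.
for x, y in zip(xs, ys):
    print(x, y, round(a * b ** x, 1))
```

The fitted values printed in the last loop should lie close to the observed y, which is a quick way to detect an arithmetic slip in the table.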
Example 17: Fit the curve y = ax^b to the following data by the method
of least squares.
x 1 2 3 4 5 6
y 2.98 4.26 5.21 6.1 6.8 7.5
Solution:
The curve to be fitted is
y = ax^b …(1)
Taking logarithms (base 10) on both sides, log y = log a + b log x
⇒ Y = A + bX (a straight line)
where Y = log y, A = log a, X = log x and n = 6
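The remaining computation for Example 17 can be sketched as follows. This is an illustration rather than the text's own working: it assumes base-10 logarithms and the usual normal equations for a straight line in X = log x and Y = log y, namely ΣY = nA + bΣX and ΣXY = AΣX + bΣX^2, and the function name fit_power is hypothetical.

```python
import math

def fit_power(xs, ys):
    """Fit y = a * x**b by least squares on log10(y) = log10(a) + b*log10(x)."""
    n = len(xs)
    Xs = [math.log10(x) for x in xs]          # X = log x
    Ys = [math.log10(y) for y in ys]          # Y = log y
    sum_X = sum(Xs)
    sum_Y = sum(Ys)
    sum_XX = sum(X * X for X in Xs)
    sum_XY = sum(X * Y for X, Y in zip(Xs, Ys))
    # Solve ΣY = nA + bΣX and ΣXY = AΣX + bΣX^2 for b and A.
    b = (n * sum_XY - sum_X * sum_Y) / (n * sum_XX - sum_X ** 2)
    A = (sum_Y - b * sum_X) / n
    return 10 ** A, b                         # a = antilog A; b is used as is

xs = [1, 2, 3, 4, 5, 6]
ys = [2.98, 4.26, 5.21, 6.1, 6.8, 7.5]
a, b = fit_power(xs, ys)
print(round(a, 3), round(b, 4))               # roughly a = 2.98, b = 0.514
```

Note that only a is recovered through an antilogarithm; the exponent b is the slope of the transformed line itself.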
Example 18: Find the normal equations to the curve 2^x = ax^2 + bx + c
Solution: The given curve is 2^x = ax^2 + bx + c
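A possible derivation is sketched below, assuming that the left-hand side is the exponential 2^x treated as the observed quantity and that a, b and c are chosen to minimise the sum of squared residuals S over the n data points. Setting the three partial derivatives of S to zero gives one normal equation per unknown.

```latex
\begin{align*}
S &= \sum_{i=1}^{n} \left( 2^{x_i} - a x_i^{2} - b x_i - c \right)^{2} \\
\frac{\partial S}{\partial a} = 0 &\;\Rightarrow\; \sum x^{2}\, 2^{x} = a \sum x^{4} + b \sum x^{3} + c \sum x^{2} \\
\frac{\partial S}{\partial b} = 0 &\;\Rightarrow\; \sum x\, 2^{x} = a \sum x^{3} + b \sum x^{2} + c \sum x \\
\frac{\partial S}{\partial c} = 0 &\;\Rightarrow\; \sum 2^{x} = a \sum x^{2} + b \sum x + n c
\end{align*}
```

These have the same shape as the normal equations for a parabola y = a x^2 + b x + c, with y replaced by 2^x.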
x y u = x – 69 v = y – 68 uv u^2
71 69 2 1 2 4
68 72 –1 4 –4 1
73 70 4 2 8 16
69 70 0 2 0 0
67 68 –2 0 0 4
65 67 –4 –1 4 16
66 68 –3 0 0 9
67 64 –2 –4 8 4
Total Σu = –6 Σv = 4 Σuv = 18 Σu^2 = 54
On putting the values of Σu, Σv, Σuv, Σu^2 in (3) and (4), we get
4 = 8a + b (–6) (5)
18 = –6a + 54b (6)
On solving (5) and (6), we get
a = 9/11, b = 14/33
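The shifted-origin computation can be verified with exact fractions. The sketch below assumes that the fitted line is v = a + bu, so that (3) and (4) are Σv = na + bΣu and Σuv = aΣu + bΣu^2, which is consistent with (5) and (6); the conversion back to x and y at the end is an added illustration and is not taken from the text.

```python
from fractions import Fraction

xs = [71, 68, 73, 69, 67, 65, 66, 67]
ys = [69, 72, 70, 70, 68, 67, 68, 64]

# Shift the origin to reduce the arithmetic: u = x - 69, v = y - 68.
us = [x - 69 for x in xs]
vs = [y - 68 for y in ys]

n = len(us)
sum_u = sum(us)                               # -6
sum_v = sum(vs)                               # 4
sum_uv = sum(u * v for u, v in zip(us, vs))   # 18
sum_uu = sum(u * u for u in us)               # 54

# Solve Σv = n*a + b*Σu and Σuv = a*Σu + b*Σu^2 exactly.
b = Fraction(n * sum_uv - sum_u * sum_v, n * sum_uu - sum_u ** 2)
a = (sum_v - b * sum_u) / n
print(a, b)                                   # 9/11 14/33

# Back in the original variables: y - 68 = a + b*(x - 69).
slope = b
intercept = 68 + a - 69 * b
print(slope, intercept)                       # 14/33 435/11
```

Under this assumption the line in the original variables is y = (14/33)x + 435/11, or approximately y = 0.424x + 39.55.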
7.9 Exercise
1. Find the linear least-squares polynomial based on the given data.
(Punjab 2018)
x –2 –1 0 1
y 6 3 2 2
2. Find the least-squares straight-line approximation to the data.
x 1 2 3 4
y 3 7 13 21
3. Find the least-squares line for the data points
(–1, 10), (0, 9), (1, 7), (2, 5), (3, 4), (4, 3), (5, 0) and (6, –1)
4. Fit a straight line to the following data regarding x as the independent
variable:
x 1 2 3 4 5 6
y 1200 900 600 200 110 50
5. The following table shows the number of salesmen working for a
certain concern:
year 1998 1999 2000 2001 2002 2003
number 28 38 46 40 56 60
Use the method of least squares to fit a straight line trend.
6. Given the following experimental values
x 0 1 2 3
y 2 4 10 15
Fit by the method of least squares a parabola of the type
y = a + bx^2.
7. If V (km/hr) and R (kg/ton) are related by a relation of the type
R = a + bV^2, find a and b by the method of least squares with the
help of the following table:
V 10 20 30 40 50
R 8 10 15 21 30