
Unit 1-2

The document outlines the syllabus for the course 'IT Skills and Data Analysis-II' offered by the University of Delhi, focusing on functions, their graphical representations, and the relationship between variables. It includes lessons on linear and non-linear functions, correlation, regression analysis, and curve fitting. The material is designed for distance learning and includes a disclaimer about incorporating feedback for future editions.


IT SKILLS AND DATA ANALYSIS-II

[FOR LIMITED CIRCULATION]

Editor

Prof. Shikha Gupta


Content Writers

Dr. Archana Verma, Ms. Priyanka Gupta


Academic Coordinator

Deekshant Awasthi

Department of Distance and Continuing Education


E-mail: [email protected]
[email protected]

Published by:
Department of Distance and Continuing Education
Campus of Open Learning, School of Open Learning,
University of Delhi, Delhi-110007

Printed by:
School of Open Learning, University of Delhi
IT SKILLS AND DATA ANALYSIS-II

Reviewer
Ms. Asha Yadav
Disclaimer

Corrections/Modifications/Suggestions proposed by Statutory Body, DU/Stakeholder/s
in the Self Learning Material (SLM) will be incorporated in
the next edition. However, these corrections/modifications/suggestions will be
uploaded on the website https://sol.du.ac.in. Any feedback or suggestions may
be sent at the email- [email protected]

Printed at: Taxmann Publications Pvt. Ltd., 21/35, West Punjabi Bagh,
New Delhi - 110026 (...... Copies, 2025)

Department of Distance & Continuing Education, Campus of Open Learning,


School of Open Learning, University of Delhi
Syllabus
IT Skills and Data Analysis - II

Syllabus Mapping

Unit - I: Functions and Their Graphical Representations
This unit introduces the graphical visualisation of functions to understand
the relationship between two variables.
Lesson 1: Functions and Graphs
Lesson 2: Linear and Non-Linear Functions
Lesson 3: Reciprocal, Exponential and Logarithmic Functions

Unit - II: Relationship between Variables
Students will learn about scatter diagrams and correlation analysis as a
means to describe the nature and strength of association between two variables.
The concept of regression analysis will be introduced as a method
for quantifying the relationship between two variables. Further, multiple
linear regression will be discussed for situations where more than one
independent variable is needed to estimate the dependent variable. The focus
will be mainly on interpreting estimated regression coefficients.
Lesson 4: Correlation
Lesson 5: Regression
Lesson 6: Multiple Regression and Correlation
Lesson 7: Curve Fitting: Principle of Least Squares

Contents

UNIT-I
Lesson 1: Functions and Graphs
Lesson 2: Linear and Non-Linear Functions
Lesson 3: Reciprocal, Exponential and Logarithmic Functions

UNIT-II
Lesson 4: Correlation
Lesson 5: Regression
Lesson 6: Multiple Regression and Correlation
Lesson 7: Curve Fitting: Principle of Least Squares

UNIT - I

LESSON 1
Functions and Graphs

STRUCTURE
1.1 Introduction to Functions
1.2 Graphical Representation of Functions
1.3 Graphs
1.4 Vertical Line Test
1.5 Important Notes on Linear Functions
1.6 Graphical Behavior of Functions
1.7 Constant Functions
1.8 Exercise

1.1 Introduction to Functions


Definition
Let there be two sets X and Y. In mathematics, a function from set X to set Y is a
relation that assigns to each element of X exactly one element of Y. The set X is called
the domain of the function and the set Y is called the codomain of the function.
Thus the domain of a function is the set of all possible input values, and the set of
values the function actually takes on as output is called the range. The range is a
subset of the codomain.
A function can be expressed as an equation, as a set of ordered pairs, as a table, or as a
graph in the coordinate plane. Thus, a function can also be defined as a relation with the
property that each input is related to exactly one output. It is usually denoted by the letter f.
For example, f(x) = x + 2 is a function of x, where f(x) represents the output of the function
f corresponding to an input x. For x = 1, the output is f(1) = 1 + 2 = 3.
In the case of a function with just one input variable, the input and output of the function
can be expressed as an ordered pair, (x, f(x)), where the first element is the argument and
the second is the output.


In the above example, the ordered pair is (x, f(x)) = (1, 3) for x = 1.

Another commonly used notation for a function is f: X → Y, which reads as
"f is a function that maps values from the set X to values of the set Y".
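The different ways of expressing a function can be illustrated with a short Python sketch. It writes f(x) = x + 2 as an equation, as a table and as a set of ordered pairs. The domain {0, 1, 2, 3} is an assumption chosen only for illustration; the text does not fix a domain.

```python
def f(x):
    """The function f(x) = x + 2 from the example, written as an equation."""
    return x + 2

# Assumed domain, chosen only for illustration.
domain = [0, 1, 2, 3]

# Table form: each input is paired with exactly one output.
table = {x: f(x) for x in domain}

# Ordered-pair form: (x, f(x)) for every x in the domain.
ordered_pairs = [(x, f(x)) for x in domain]

# The range is the set of outputs actually produced.
value_range = set(table.values())

print(table)
print(ordered_pairs)
print(value_range)
```

All three forms carry the same information: looking up x = 1 in the table and reading the pair that starts with 1 both give the output 3.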

1.2 Graphical Representation of Functions


A function can also be defined as a relation between two variables
such that one variable depends on the other. The function
has an independent and a dependent variable. The variable in an equation
or function whose value depends on another variable is called the dependent
variable, and the variable whose value does
not depend on any other variable is called the independent variable.
Suppose we have f(x) = 2x + 3. Then we call the variable that we are
changing, in this case x, the independent variable. We assign the value
of the function to a variable, in this case y, that we call the dependent
variable. Function notation f(x) is read as "f of x", which means "the
value of the function at x". Since the output, or dependent variable, is y,
in function notation f(x) is often taken as y. The ordered pairs
normally stated in linear equations as (x, y) are written in function notation
as (x, f(x)). The variable x is called independent since we can
pick any value of x for which the function is defined.

1.3 Graphs
Graphs provide a visual representation of functions, showing the relationship
between the input values and output values, that is, between the independent and
dependent variables of the function.
A graph can be defined as a diagram displaying data, in particular one
showing the relationship between two or more variables, quantities,
measurements, or numbers.
By graphically representing a function we mean choosing some values
for the independent variable x and putting them into the function to get
values of y. This gives us a set of ordered pairs (x, y), which can also be
written as (x, f(x)). These ordered pairs are then plotted on the
graph and connected by a line or curve.


Graphing a Function


Linear Function
A linear function is a function whose graph is a straight line.
It is a polynomial function of degree at most 1.
In general, a linear function can be written as
y = f(x) = px + q
where p and q are constants. Here, p is the slope of the line, q is the
y-intercept of the line, x is the independent variable and y (= f(x)) is the
dependent variable. If q = 0, the line passes
through the origin. Thus a linear function is an algebraic equation in
which each term is either a constant or the product of a constant and a single
variable (of degree one).
Example 1: Let the function be
f(x) = 5x – 10
We choose a few values for the independent variable x and substitute them in the
function to get the ordered pairs. Let x = –8, 0, 8; then f(x) = –50, –10, 30
respectively and the ordered pairs will be (–8, –50), (0, –10) and (8, 30).
We now plot these points on the graph and connect them with a line.

This is a linear graph since the power of the independent variable is 1 and
the plotted points form a straight line.
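The substitution step can be sketched in Python as follows. The helper name `rate_of_change` is hypothetical and introduced only for this illustration; it checks that consecutive points rise at the same rate, which is what makes them lie on one straight line.

```python
def f(x):
    """The linear function f(x) = 5x - 10 from Example 1."""
    return 5 * x - 10

xs = [-8, 0, 8]

# Substitute each chosen x into the function to get the ordered pairs.
points = [(x, f(x)) for x in xs]
print(points)

def rate_of_change(p, q):
    """Change in y divided by change in x between two points (hypothetical helper)."""
    return (q[1] - p[1]) / (q[0] - p[0])

# Equal rates between consecutive points mean the points are collinear.
print(rate_of_change(points[0], points[1]), rate_of_change(points[1], points[2]))
```

Both rates equal 5, the coefficient of x in the function.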
Example 2: Let the function be
f(x) = 3x + 5


Let x take the values –1 and 0. Substituting these values into the above
function gives the ordered pairs (–1, 2) and (0, 5). Two points are sufficient
to plot a line, so plotting these points forms the following graph:

(Graph: Courtesy Boundless Algebra)

Example 3: A T-shirt company charges a one-time fee of Rs. 100 and
Rs. 10 per T-shirt to print logos on T-shirts. Then the total fee is
expressed by the linear function y = f(x) = 10x + 100, where x is the number
of T-shirts. On plotting we get the following graph.
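As a minimal sketch, the fee described in words (a one-time fee of Rs. 100 plus Rs. 10 per T-shirt) can be written as a Python function. The names `FIXED_FEE`, `FEE_PER_SHIRT` and `total_fee` are assumptions made for this illustration.

```python
FIXED_FEE = 100      # one-time fee in Rs., as stated in the example
FEE_PER_SHIRT = 10   # printing fee per T-shirt in Rs.

def total_fee(shirts):
    """Total fee in Rs. for printing logos on the given number of T-shirts."""
    return FEE_PER_SHIRT * shirts + FIXED_FEE

# Ordered pairs (number of T-shirts, total fee) that could be plotted.
for shirts in [0, 5, 10]:
    print((shirts, total_fee(shirts)))
```

The fixed fee is the y-intercept of the line (the fee when no T-shirts are printed) and the per-shirt fee is its slope.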

Example 4: Plot the graph for the given function: $f(x) = -\frac{1}{3}x + 1$
Let the ordered pairs be (0, 1) and (3, 0). Following is the graph obtained
on plotting these points.


(Graph: Courtesy Boundless Algebra)

1.4 Vertical Line Test


The vertical line test is a method that is used to determine whether a
given relation is a function or not. A function can have only one output,
y, for each unique input, x. If any x value in a curve is associated with
more than one y value, then the curve does not represent a function.
The approach is rather simple. Draw a vertical line cutting through the
graph of a relation, and then observe the points of intersection. If a
vertical line intersects a curve on an xy-plane more than once, then for
one value of x the curve has more than one value of y and the curve
does not represent a function.
The vertical line test supports the definition of a function. That is, every
x-value of a function must be paired with a single y-value. If every vertical line
intersects the graph of a relation at no more than one point, it implies that each
x-value is paired with a unique value of y.
On the other hand, if a vertical line intersects the graph more than
once, this shows that a single x-value is associated with more
than one value of y. This condition causes the relation to be "disqualified"
as a function.


If a vertical line intersects the graph in all places at exactly one
point, then the relation is a function.
If a vertical line intersects the graph in some places more than once,
then the relation is NOT a function.

(Graph: Courtesy Boundless Algebra)

In the first graph, a single vertical line drawn through the red dots
intersects the curve 3 times. Thus, the curve fails the vertical line test and does
not represent a function. Any vertical line in the second graph crosses the curve
only once, so the curve passes the vertical line test and
represents a function.

Why Vertical Line Test?


A function is expected to have a unique output for each input in its domain,
and if an input has more than one output, then the relation is not considered a
function. A vertical line test helps to find if the graph is a function or
not. If every vertical line intersects the graph of the relation at only one point,
then it is a function, and if some vertical line intersects it at more than one point then the
graph does not represent a function.
A vertical line in a coordinate system represents an infinite set of points
having the same x coordinate and different y coordinates.
The vertical line is drawn parallel to the y-axis;
if it cuts the curve at one distinct point then the curve has one y-value for the
given x value and it follows the basic definition of a function. For every
x value in the domain, there is only one y value in the range. The

vertical line x = a must cut the curve y = f(x) at only one point, (a,
f(a)), for the curve y = f(x) to represent a function.

(Graph: Courtesy Cuemath.com)

A vertical line is supposed to cut the curve at only one point for the
curve to represent a function. If the vertical line x = a cuts the
graph at more than one point, for example at two points (a, y1) and
(a, y2), then the graph has different y values for the same x value. Thus,
one value in the domain has more than one output value, which contradicts the
basic definition of a function, and the curve does not represent
a function.
To use the vertical line test, take a ruler and draw a line parallel to the
y-axis for any selected value of x. If the vertical line intersects the graph
more than once for any value of x then the graph is not the graph of
a function. If, alternatively, a vertical line intersects the graph no more
than once, no matter where the vertical line is placed, then the graph is
the graph of a function. For example, a curve which is any straight line
other than a vertical line will be the graph of a function.
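The test can be approximated numerically on a finite sample of points from a graph. The following Python sketch is only an illustration under that assumption: the helper name `passes_vertical_line_test` and the tolerance `tol` are hypothetical, and passing on a sample does not prove that the whole curve passes.

```python
from collections import defaultdict

def passes_vertical_line_test(points, tol=1e-9):
    """Return False if some sampled x value is paired with two different y values."""
    ys_by_x = defaultdict(list)
    for x, y in points:
        ys_by_x[x].append(y)
    # A vertical line x = a meets the sample at every y stored under a.
    for ys in ys_by_x.values():
        if max(ys) - min(ys) > tol:
            return False
    return True

# Sample of the line y = x + 1.
line = [(x, x + 1) for x in range(-3, 4)]

# Sample of the circle x^2 + y^2 = 9: each x gives an upper and a lower point.
circle = []
for x in range(-3, 4):
    y = (9 - x * x) ** 0.5
    circle.append((x, y))
    circle.append((x, -y))

print(passes_vertical_line_test(line))    # the line is a function
print(passes_vertical_line_test(circle))  # the circle is not
```

For the circle, the vertical line x = 0 meets the sample at both (0, 3) and (0, −3), so the check fails.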
The following are some examples of relations that are also functions because they pass the
vertical line test.
Graph of the line f(x) = x + 1


(Graph: Courtesy Chilimath.com)


Graph of the quadratic function (parabola) $f(x) = x^2 - 2$

(Graph: Courtesy Chilimath.com)

Graph of the cubic function $f(x) = x^3$

(Graph: Courtesy Chilimath.com)

If a vertical line intersects the graph at more than one point, then the
relation is not a function.


The following are some examples of relations that are not functions because they fail the
vertical line test.
Graph of the "sideways" parabola $x = y^2$

(Graph: Courtesy Chilimath.com)

Graph of the circle $x^2 + y^2 = 9$

(Graph: Courtesy Chilimath.com)


Graph of the relation $x = y^3 - y + 2$

(Graph: Courtesy Chilimath.com)

1.5 Important Notes on Linear Functions
• A linear function is of the form f(x) = ax + b and hence its graph
is a line.
• When its slope is 0, a linear function f(x) = ax + b is a
horizontal line and is known as a constant function.
• For a linear function f(x) = ax + b with a ≠ 0, the domain and range are both R (all
real numbers), whereas for a constant function f(x) = b the range
is {b}.
• Linear functions are useful to represent the objective function
in linear programming.
• A vertical line is NOT a linear function as it fails the vertical line
test.

Increasing, Decreasing and Constant Functions


Functions can be constant, increasing as x increases, or decreasing
as x increases.
Increasing Function: Any function of a real variable whose value
increases (or remains constant) as the independent variable increases is
called an increasing function.
Decreasing Function: Any function of a real variable whose value
decreases (or remains constant) as the independent variable increases is
called a decreasing function.
Constant Function: Any function of a real variable whose value remains the
same for all the elements of its domain is called a constant function.

1.6 Graphical Behavior of Functions


As part of exploring how functions change, we can identify intervals over
which the function is changing in specific ways. We say that a function
is increasing on an interval if the function values increase as the input
values increase within that interval. Similarly, a function is decreasing
on an interval if the function values decrease as the input values increase
over that interval.
An increasing function is one where $f(x_1) \ge f(x_2)$ for every $x_1 \ge x_2$.

If $f(x_1) > f(x_2)$ whenever $x_1 > x_2$, then the function is called a strictly increasing function.
A decreasing function is one where $f(x_1) \le f(x_2)$ for every $x_1 \ge x_2$.
If $f(x_1) < f(x_2)$ whenever $x_1 > x_2$, then the function is called a strictly decreasing function.
In terms of a linear function f(x) = mx + b, if m is positive, the function
is increasing, if m is negative, it is decreasing, and if m is zero, the
function is a constant function.
The average rate of change of a strictly increasing function is positive, and the
average rate of change of a strictly decreasing function is negative. The figure
below shows examples of increasing and decreasing intervals on a function.

Graph: Courtesy Boundless Algebra
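The definitions above can be checked on sampled values with a short Python sketch. The function name `classify` is hypothetical, and the result describes the behaviour only at the sampled inputs, not on the whole interval.

```python
def classify(f, xs):
    """Classify f on the sampled inputs xs by comparing consecutive outputs."""
    ys = [f(x) for x in sorted(xs)]
    diffs = [b - a for a, b in zip(ys, ys[1:])]
    if all(d == 0 for d in diffs):
        return "constant"
    if all(d > 0 for d in diffs):
        return "strictly increasing"
    if all(d >= 0 for d in diffs):
        return "increasing"
    if all(d < 0 for d in diffs):
        return "strictly decreasing"
    if all(d <= 0 for d in diffs):
        return "decreasing"
    return "neither"

xs = range(-3, 4)
print(classify(lambda x: 2 * x + 1, xs))  # linear with m > 0
print(classify(lambda x: -x + 5, xs))     # linear with m < 0
print(classify(lambda x: 4, xs))          # linear with m = 0
print(classify(lambda x: x * x, xs))      # decreases, then increases
```

The last case shows why increasing and decreasing behaviour is usually stated for an interval: x² is decreasing for x ≤ 0 and increasing for x ≥ 0.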

1.7 Constant Functions


A constant function is a function whose values do not vary, regardless
of the input into the function. That is, a function is a constant function
if f(x) = c for all values of x and some constant c.
The graph of the constant function f(x) = c is a horizontal line in the
plane that passes through the point (0, c).
For example, the graph of the function f(x) = 4 is given as


(Graph: Courtesy Boundless Algebra)

1.8 Exercise
1. Draw the graph for each of these equations, and use the vertical line
test to check whether or not each is a function:
i. $y = x$
ii. $y = x^2$
iii. $y = 3$
iv. $y = |x|$
v. $y = \sin x$
vi. $y = x^3$
vii. $y = \sqrt[3]{x}$
viii. $x = y^2$
ix. $x^2 + y^2 = 9$
x. $x = 4$
xi. $y = \sqrt{x}$
2. Draw the graph of the following equations:
i. y = 3x + 1
ii. y = 2x – 3
iii. y = 5x – 2
iv. y = 6 – 2x


3. Plot the coordinates below on the graph and answer the following
questions.
(2, 210), (5, 420), (7, 560), (6, 490), (3, 280), (1, 140), (8, 630)
i. Does the graph drawn from the above coordinates represent a
linear graph?
ii. Find the value of x-coordinate if y-coordinate corresponds to
350.
iii. Calculate the value of y-coordinate for which x-coordinate is
11, if the graph drawn is linear.
4. Mrs Mary asks John to identify whether the given equation 3x −
7y = 16 forms a linear graph or not without plotting its values.
Now help John to figure out whether it is a linear graph or not.
5. Mikel has to prepare a linear graph for the equation 2x + y = 8.
Complete the table below for the above equation.
x | __ | 4  | -2
y | 8  | __ | __



LESSON 2
Linear and Non-Linear Functions

STRUCTURE
2.1 Polynomial Functions
2.2 Degree of a Polynomial Function
2.3 Adding and Subtracting Polynomials
2.4 Graphing Polynomial Functions
2.5 How to Determine a Polynomial Function
2.6 Quadratic Functions
2.7 The Quadratic Formula
2.8 Differences between Quadratics and Linear Functions
2.9 A Graphical Interpretation of Quadratic Solutions
2.10 Cubic Function
2.11 Y-Intercept of Cubic Function
2.12 Exercise

2.1 Polynomial Functions


A polynomial function is a function involving only non-negative integer powers of x. For
example, a quadratic, a cubic, a quartic, and so on are polynomial functions. Polynomial
functions are expressions built from constants and terms in which a coefficient multiplies
the variable raised to a non-negative integer exponent.
Some examples of polynomial functions:
• $f(x) = 3x^2 - 5$
• $f(x) = -7x^3 + 2x - 7$
• $f(x) = 3x^4 + 7x^3 - 12x^2$


In general, a polynomial function is expressed as:
$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_2 x^2 + a_1 x + a_0$$
where $a_n, a_{n-1}, \dots, a_0$ are real number constants with $a_n$ not equal to zero and
n is a non-negative integer. This algebraic expression is known as the
polynomial function in variable x. Here, all the powers of a polynomial
function must be whole numbers.

2.2 Degree of a Polynomial Function


The degree of a polynomial function is the highest power to which the
variable is raised.
For example, in the polynomial function $f(x) = -7x^3 + 6x^2 + 11x - 19$,
the highest exponent found is 3, from $-7x^3$. This means that the
degree of this particular polynomial is 3.
Consider another polynomial function $f(x) = 4x^{13} + 3x + 1$. Here the
highest exponent of x is 13. Thus the degree of this polynomial is 13.
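A minimal Python sketch of this rule follows. It assumes a polynomial is stored as a dictionary that maps each exponent to its coefficient; this representation and the function name `degree` are choices made for illustration only.

```python
def degree(poly):
    """Return the highest exponent that has a non-zero coefficient."""
    exponents = [e for e, c in poly.items() if c != 0]
    if not exponents:
        raise ValueError("the zero polynomial has no defined degree")
    return max(exponents)

# f(x) = -7x^3 + 6x^2 + 11x - 19
p = {3: -7, 2: 6, 1: 11, 0: -19}

# f(x) = 4x^13 + 3x + 1
q = {13: 4, 1: 3, 0: 1}

print(degree(p))
print(degree(q))
```

Terms with a zero coefficient are ignored, so a stored entry such as `{5: 0}` does not raise the degree.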

2.3 Adding and Subtracting Polynomials


Any two polynomials can be added or subtracted, regardless of the number
of terms in each, or the degrees of the polynomials. The sum or difference
of two polynomials has a degree no higher than that of the polynomial with
the higher degree; it is lower only when the leading terms cancel. The rules for adding and subtracting
algebraic expressions apply to polynomials. That is, polynomials can be
added or subtracted by combining like terms.
Example 1: Find the sum of $4x^2 - 5x + 1$ and $3x^2 - 8x - 9$.
Solution: Polynomials are added by grouping the like terms together:
$(4x^2 + 3x^2) + (-5x - 8x) + (1 - 9) = 7x^2 - 13x - 8$
Similarly we can subtract two polynomials.

Multiplying Polynomials
To multiply two polynomials together, multiply every term of one poly-
nomial by every term of the other polynomial. The degree of a product
of two polynomials equals the sum of the degrees of said polynomials.
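Both operations can be sketched in Python. The sketch assumes each polynomial is stored as a dictionary mapping exponents to coefficients; that representation and the names `add` and `multiply` are hypothetical choices for illustration, applied to the two polynomials from Example 1.

```python
def add(p, q):
    """Add two polynomials by combining like terms (equal exponents)."""
    out = dict(p)
    for exponent, coeff in q.items():
        out[exponent] = out.get(exponent, 0) + coeff
    return {e: c for e, c in out.items() if c != 0}

def multiply(p, q):
    """Multiply every term of p by every term of q and collect like terms."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

p = {2: 4, 1: -5, 0: 1}    # 4x^2 - 5x + 1
q = {2: 3, 1: -8, 0: -9}   # 3x^2 - 8x - 9

total = add(p, q)          # 7x^2 - 13x - 8
product = multiply(p, q)   # 12x^4 - 47x^3 + 7x^2 + 37x - 9

print(total)
print(product)
print(max(product))        # degree of the product: 2 + 2 = 4
```

The product has degree 4, the sum of the degrees of the two quadratic factors.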

2.4 Graphing Polynomial Functions
Some of the most commonly used polynomial functions are:
1. Constant (zero-degree) polynomial function $f(x) = ax^0 = a$, whose graph is a
horizontal line parallel to the x-axis.
2. Linear polynomial function $f(x) = ax + b$. Linear
polynomial functions are also known as first-degree polynomials.
The graph of a linear polynomial function is a straight line.
3. Quadratic polynomial function $f(x) = ax^2 + bx + c$. The graph of a
second-degree or quadratic polynomial function is a curve referred
to as a parabola.
4. Cubic polynomial function $f(x) = ax^3 + bx^2 + cx + d$, which is a
polynomial function of the third degree.
5. Quartic polynomial function $f(x) = ax^4 + bx^3 + cx^2 + dx + e$, which
is a polynomial function of the fourth degree.

(Graph: Courtesy Boundless Algebra)

2.5 How to Determine a Polynomial Function
In order to determine if a function is polynomial or not, the function
needs to be checked against certain conditions for the exponents of the
variables. These conditions are as follows:
• The exponent of the variable in every term of the function must
be a non-negative whole number, i.e., the exponent of the variable
should not be a fraction or a negative number.
• The variable of the function should not be inside a radical, i.e., the function
should not contain any square roots, cube roots, etc. of the variable.
• The variable should not be in the denominator.
We now study these polynomials in detail.

Linear Functions
As defined in the previous lesson, a linear function is an algebraic
equation in which each term is either a constant or the product of a constant
and (the first power of) a single variable.
Mathematically, it is expressed as a function f(x) = ax + b, where a and
b are constants and x is the independent variable.

Linear Function in Terms of Slope and Intercept


Linear functions are algebraic equations whose graphs are straight lines
with unique values for their slope and y-intercept.
A linear function is commonly expressed as
y = mx + b
This is called the slope-intercept form, where m is the slope or gradient
of the line and b is the constant or y-intercept that gives the point at
which the line crosses the y-axis.
It is a linear function because it meets both criteria, with x and y as
variables and m and b as constants. It is linear since the exponent of the
x term is one (first power), and it follows the definition of a function,
that is, for each input x there is exactly one output y. Also, its graph is
a straight line.


Graph of Linear Functions


When plotted the linear equation forms a straight line in the plane.

(Graph: Courtesy Boundless Algebra)

Both the blue and red lines are linear functions. The blue line, $y = \frac{1}{2}x - 3$,
has $m = \frac{1}{2}$ and b = −3, which implies that its slope is positive with value $\frac{1}{2}$ and its
intercept is −3. The red line, y = −x + 5, has m = −1 and b = 5, which
implies that its slope is negative with value −1 and its intercept is 5.

Vertical and Horizontal Lines


Vertical lines have an undefined slope, and cannot be represented in the
form y = mx + b.
Their equation is of the form x = c, where c is a constant. It is the
point on the x-axis where the vertical line intersects it.
For example, the graph of the equation x = 6 includes the same input
value of 6 for all points on the line, but has different output values,
giving ordered pairs such as (6, −1), (6, 0), (6, 3), (6, 7) etc.
Vertical lines are therefore not functions, since one input is related to
more than one output.


Horizontal lines have a slope of zero and are represented by the form y = b,
where b is the y-intercept. A graph of the equation y = 3 includes the
same output value of 3 for all input values on the line, such as (−3, 3),
(0, 3), (2, 3), (5, 3) etc.
Horizontal lines are functions because the relation has the characteristic
that each input is related to exactly one output.

Slope
Slope describes the direction and steepness of a line, and can be calculated
given two points on the line. Its sign indicates the direction, while its
magnitude indicates the steepness which is measured by the absolute value
of the slope. A slope with greater absolute value indicates a steeper line.
Thus, a line with a slope of -8 is steeper than a line with a slope of 6.


Slope is calculated by finding the ratio of the "vertical change" to the
"horizontal change" between any two distinct points on a line. This ratio
is represented by a quotient and gives the same number for any two
distinct points on the same line.
The slope of a line is calculated using the formula:
$$m = \frac{y_2 - y_1}{x_2 - x_1}$$
where $(x_1, y_1)$ and $(x_2, y_2)$ are points on the line.
The direction of a line is either increasing, decreasing, horizontal or vertical.
Thus, the slope of a line can be positive, negative, zero, or undefined.
• A line is increasing if it goes up from left to right, which implies
that the slope is positive (m > 0).
• A line is decreasing if it goes down from left to right, in which case the slope
is negative (m < 0).
• If a line is horizontal, the slope is zero and the line is a constant function
(y = c).
• If a line is vertical, the slope is undefined.

(Graph: Courtesy Boundless Algebra)

Example 2: Find the slope of the line shown on the coordinate plane below.

(Graph: Courtesy Boundless Algebra)


Locate two points on the graph, preferably with integer coordinates. Let them be (0, −3)
and (5, 1). Here (0, −3) is point $(x_1, y_1)$ and (5, 1) is point $(x_2, y_2)$.

(Graph: Courtesy Boundless Algebra)

Substituting the values of the points $(x_1, y_1)$ and $(x_2, y_2)$ into the
formula for the slope, we get the slope of the line as
$$m = \frac{1 - (-3)}{5 - 0} = \frac{4}{5}$$
The slope is positive since the line slants upward from left to right.
Example 3: Find the slope of the line shown on the coordinate plane
below.

(Graph: Courtesy Boundless Algebra)

Locate any two points with integer coordinates on the graph. Let them be (0, 5) and (3, 3).
Let (0, 5) be $(x_1, y_1)$ and (3, 3) be $(x_2, y_2)$.


(Graph: Courtesy Boundless Algebra)

Substituting the corresponding values into the slope formula, we get:
$$m = \frac{3 - 5}{3 - 0} = \frac{-2}{3}$$
We can see that the slope is negative since the line slants downward
from left to right.
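The slope calculations in Examples 2 and 3 can be reproduced with a short Python sketch. The function name `slope` and the choice to return `None` for a vertical line are assumptions made for this illustration; exact fractions are used so that results such as 4/5 are not rounded.

```python
from fractions import Fraction

def slope(p1, p2):
    """Slope of the line through two distinct points, or None if the line is vertical."""
    (x1, y1), (x2, y2) = p1, p2
    if x2 == x1:
        return None  # vertical line: the slope is undefined
    return Fraction(y2 - y1, x2 - x1)

print(slope((0, -3), (5, 1)))   # Example 2: 4/5, an increasing line
print(slope((0, 5), (3, 3)))    # Example 3: -2/3, a decreasing line
print(slope((-3, 3), (5, 3)))   # horizontal line y = 3: slope 0
print(slope((6, -1), (6, 7)))   # vertical line x = 6: undefined
```

The sign of the result gives the direction of the line and its absolute value gives the steepness.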

Distance between Two Points

The distance between two points on a line or on a line
segment can be defined as the amount of space between them.

Let the two points be $(x_1, y_1)$ and $(x_2, y_2)$. Then the distance between
these two points is
$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$

Example 4: Find the distance between the two points (4, 6) and (6, 9).
Substitute the coordinates of the points (x1, y1) = (4, 6) and (x2, y2) = (6, 9)
into the distance formula:

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} = \sqrt{(6 - 4)^2 + (9 - 6)^2} = \sqrt{2^2 + 3^2} = \sqrt{13} \approx 3.61$$


Example 5: Find the distance between the two points (−1, 4) and (5, 7).
Substitute the coordinates of the points (x1, y1) = (−1, 4) and (x2, y2) = (5, 7)
into the distance formula:

$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} = \sqrt{(5 + 1)^2 + (7 - 4)^2} = \sqrt{6^2 + 3^2} = \sqrt{45} = 3\sqrt{5} \approx 6.71$$
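
The distance formula can be checked numerically. The sketch below uses math.hypot from the Python standard library, which returns the square root of the sum of the squares of its arguments. The function name distance is an illustrative choice.

```python
import math

def distance(p1, p2):
    """Distance between two points in the plane, using the distance formula."""
    (x1, y1), (x2, y2) = p1, p2
    return math.hypot(x2 - x1, y2 - y1)

d4 = distance((4, 6), (6, 9))    # Example 4: square root of 13
d5 = distance((-1, 4), (5, 7))   # Example 5: square root of 45

print(round(d4, 2), round(d5, 2))
```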

Mid-point of a Line Segment
The mid-point of a line segment is the point that divides the segment
into two segments of equal length. It is the middle point of the line
segment and is thus equidistant from both end points.
Let the two points be (x1, y1) and (x2, y2). Then the mid-point of the
segment connecting the two points is obtained from the following formula:
$$\left( \frac{x_1 + x_2}{2}, \frac{y_1 + y_2}{2} \right)$$
Example 6: Find the mid-point between (4, 5) and (6, 9).
Substitute the coordinates of the points into the formula for the mid-point.
$$\text{Mid-point} = \left( \frac{x_1 + x_2}{2}, \frac{y_1 + y_2}{2} \right) = \left( \frac{4 + 6}{2}, \frac{5 + 9}{2} \right) = (5, 7)$$
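
As a quick check of the mid-point formula, the following sketch averages the coordinates of the two end points. The function name midpoint is an illustrative choice.

```python
def midpoint(p1, p2):
    """Mid-point of the segment joining p1 and p2: the average of the coordinates."""
    (x1, y1), (x2, y2) = p1, p2
    return ((x1 + x2) / 2, (y1 + y2) / 2)

mid = midpoint((4, 5), (6, 9))  # the points from Example 6
print(mid)  # (5.0, 7.0)
```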

Parallel and Perpendicular Lines


Two lines in a plane are said to be parallel if they do not intersect or
touch at any point, however far they are extended. Equivalently, two
distinct non-vertical lines are parallel if they have the same slope.
The parallel symbol is ‖.


Thus, if two lines, say f(x) = mx + b and g(x) = nx + c, are parallel,
then n must equal m.
Example 7: Let the two lines be f(x) = 2x + 3 and g(x) = 2x − 1. Then
these two lines are parallel since they have the same slope, m = 2.

(Graph: Courtesy Boundless Algebra)

Perpendicular Lines
Two lines in the same plane are perpendicular to each other if their slopes
are negative reciprocals of each other. That is, if the slope of one line is m
(with m ≠ 0), then the slope of the other line is −1/m.
An alternative way of defining perpendicular lines is: two lines are
perpendicular to each other if they form congruent adjacent angles.
In other words, they are perpendicular if the angles at their intersection
are right angles. The perpendicular symbol is ⊥.
Symbolically, two perpendicular lines f(x) = m1x + b1 and g(x) = m2x + b2
are denoted as f(x) ⊥ g(x).


For example, the lines f(x) = 3x − 2 and g(x) = −(1/3)x + 1 are perpendicular,
since their slopes, 3 and −1/3, are negative reciprocals of each other.
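
The slope conditions for parallel and perpendicular lines can be tested directly. The sketch below assumes that neither line is vertical, so both slopes exist. It uses exact fractions so that a slope such as −1/3 is not rounded. The function names are illustrative.

```python
from fractions import Fraction

def are_parallel(m1, m2):
    """Two non-vertical lines are parallel when their slopes are equal."""
    return m1 == m2

def are_perpendicular(m1, m2):
    """Two non-vertical lines are perpendicular when the product of their slopes is -1."""
    return m1 * m2 == -1

print(are_parallel(2, 2))                      # True: slopes 2 and 2
print(are_perpendicular(3, Fraction(-1, 3)))   # True: 3 and -1/3 are negative reciprocals
print(are_perpendicular(3, Fraction(1, 3)))    # False: 1/3 is the reciprocal, but not negative
```

The condition m1 · m2 = −1 is just another way of writing m2 = −1/m1.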

(Graph: Courtesy Boundless Algebra)

Example 8: An athlete begins his regular evening practice for the next
marathon. At 6:00 pm he leaves his home and starts to run. At 7:30 pm he
finishes the run back at home, having run a total of 7.5 miles. What was
his average speed over the course of the run? How many miles had he run
after the first half hour? If he kept running at the same pace for a
total of 3 hours, how many miles would he have run?
Solution: Here the slope of the line is the rate of change of distance
with respect to time, that is, his speed. Therefore, the two variables
are time (x) and distance (y).
The first point is (0, 0): he is at his house, his watch reads 6:00 pm,
and he has not run anywhere yet. We measure time in hours. The second
point is 1.5 hours later, when the distance covered is 7.5 miles.
Therefore, the second point is (1.5, 7.5).
Now the speed (rate of change) is simply the slope of the line connecting
the two points. The slope is given by:


$$m = \frac{y_2 - y_1}{x_2 - x_1} = \frac{7.5 - 0}{1.5 - 0} = 5 \text{ miles per hour.}$$
To graph this line, we need the y-intercept and the slope to write the
equation. The slope was 5 miles per hour and since the starting point
was at (0,0), the y-intercept is 0. Therefore, the linear function is y = 5x

(Graph: Courtesy Boundless Algebra)

Using the graph, predictions can be made assuming that his average
speed remains the same.
The number of miles he had run after the first half hour is obtained
from the equation y = 5x with x = 1/2.
Thus, y = 5(0.5) = 2.5 miles.
If he kept running at the same pace for a total of 3 hours, the number
of miles run is obtained by substituting x = 3 in the equation y = 5x.
Thus, he would have run 15 miles.
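
The reasoning in Example 8 can be reproduced as a short sketch: the average speed is the slope between the two (time, distance) points, and predictions follow from the resulting line through the origin. The function names are illustrative, and the constant pace is the same assumption made in the example.

```python
def average_speed(t1, d1, t2, d2):
    """Slope between two (time, distance) points: distance covered per unit time."""
    return (d2 - d1) / (t2 - t1)

speed = average_speed(0, 0, 1.5, 7.5)  # miles per hour

def distance_run(hours):
    """Distance predicted by the line y = speed * x, assuming a constant pace."""
    return speed * hours

print(speed)              # 5.0
print(distance_run(0.5))  # 2.5
print(distance_run(3))    # 15.0
```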
Example 9: A rental company charges a flat fee of Rs. 30 and an additional
Rs. 0.25 per mile to rent a moving van. Write a linear equation for the
cost y in terms of x, the number of miles driven. How much would a
75-mile trip cost?
Solution: The equation is formed using the slope-intercept form of a
linear equation, with the total cost being the dependent variable y and
the miles being the independent variable x.

The linear equation is y = mx + b.
The total cost is equal to the rate per mile times the number of miles
driven plus the flat fee: y = 0.25x + 30.
To calculate the cost of a 75-mile trip, substitute 75 for x in the
equation. We get
y = 0.25x + 30 = 0.25(75) + 30 = 18.75 + 30 = 48.75, so the trip costs Rs. 48.75.
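
The cost model of Example 9 is a linear function in slope-intercept form, with the rate per mile as the slope and the flat fee as the y-intercept. A minimal sketch, with illustrative parameter names:

```python
def rental_cost(miles, flat_fee=30.0, rate_per_mile=0.25):
    """Total cost in Rs.: y = rate_per_mile * x + flat_fee."""
    return rate_per_mile * miles + flat_fee

cost = rental_cost(75)
print(cost)  # 48.75
```

Calling rental_cost(0) returns the flat fee alone, which is the y-intercept of the line.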

2.6 Quadratic Functions


Quadratic functions are second-degree polynomials, and have the standard
form $f(x) = ax^2 + bx + c$,
where a is a nonzero constant, b and c are constants of any finite value,
and x is the independent variable.
When all the constants are known, the quadratic equation f(x) = 0 can be
solved for x. The solutions to a quadratic equation are known as
its zeros, or roots.
A quadratic function can also be expressed as
f(x) = a(x − x1)(x − x2)
This form is known as factored form, where x1 and x2 are the zeros, or
roots, of the equation. These are the x values at which the graph crosses
the x-axis, and thus where y equals zero.

2.7 The Quadratic Formula


The zeros or roots of a quadratic equation can be found using the
quadratic formula.
For the quadratic equation $ax^2 + bx + c = 0$,
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
where a and b are the coefficients of the $x^2$ and x terms, respectively,
and c is the constant term.
Note that to use the quadratic formula, $ax^2 + bx + c$ must be equal to
zero and a must not be zero.


The quadratic formula can always be used to find the roots of a quadratic
equation, regardless of whether the roots are real or complex, whole
numbers or fractions, and so on. The symbol ± indicates that there will be
two solutions, one found by adding the square root and the other by
subtracting it. The resulting x values (zeros) may or may not be distinct,
and may or may not be real.
Example 10: Find the roots of the following quadratic function:
$f(x) = 2x^2 + 5x + 3$
Solution: First, set the function equal to zero, as the roots are where the
function equals zero.
$2x^2 + 5x + 3 = 0$
Second, identify the constants in the equation: a = 2, b = 5, and c = 3.
Substitute these values into the quadratic formula and solve:
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{-5 \pm \sqrt{5^2 - 4(2)(3)}}{2(2)} = \frac{-5 \pm 1}{4}$$
so that
$$x = \frac{-5 + 1}{4} \quad \text{and} \quad x = \frac{-5 - 1}{4}$$
Thus x = −1 and x = −6/4 = −3/2.
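
The quadratic formula translates directly into code. The sketch below uses cmath.sqrt from the Python standard library so that a negative discriminant gives complex roots instead of an error. The function name quadratic_roots is an illustrative choice.

```python
import cmath

def quadratic_roots(a, b, c):
    """Return both roots of a*x**2 + b*x + c = 0 using the quadratic formula."""
    if a == 0:
        raise ValueError("a must be nonzero for a quadratic equation")
    root_of_discriminant = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + root_of_discriminant) / (2 * a),
            (-b - root_of_discriminant) / (2 * a))

r1, r2 = quadratic_roots(2, 5, 3)   # the function from Example 10
print(r1.real, r2.real)             # the imaginary parts are zero here

c1, c2 = quadratic_roots(1, 0, 1)   # x**2 + 1 = 0 has no real roots
print(c1, c2)
```

The results are returned as complex numbers even when the roots are real. In that case the imaginary part is zero and the real part is the root.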

2.8 Differences between Quadratics and Linear Functions


Quadratic functions differ from linear functions in a few key ways.
i. Linear functions either always decrease (if they have negative slope)
or always increase (if they have positive slope). All quadratic
functions both increase and decrease.
ii. With a linear function, each output comes from exactly one input
(assuming the function is not constant). With a quadratic function,
pairs of distinct inputs produce the same output, with only one
exception for a given quadratic function: the vertex.
iii. The slope of a quadratic function, unlike the slope of a linear
function, is constantly changing.

Graphs of Quadratic Form


The graph of a quadratic function is a U-shaped curve called a parabola.
The sign on the coefficient a of the quadratic function affects whether
the graph opens up or down. If a < 0, the graph opens down and if a >
0 then the graph opens up.
The extreme point of a parabola (that is, the point at which the parabola
changes direction, corresponding to the minimum or maximum value of the
quadratic function) is called the vertex, and the axis of symmetry
is a vertical line that passes through the vertex. If the parabola opens
up, the vertex represents the lowest point on the graph, or the minimum
value of the quadratic function. If the parabola opens down, the vertex
represents the highest point on the graph, or the maximum value. In
either case, the vertex is a turning point on the graph.
The x-intercepts are the points at which the parabola crosses the x-axis. If
they exist, the x-intercepts represent the zeros, or roots, of the quadratic
function. There may be zero, one, or two x-intercepts.
The y-intercept is the point at which the parabola crosses the y-axis.
There cannot be more than one such point, for the graph of a quadratic
function. If there were, the curve would not be a function.
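
The features of a parabola described above can be computed from the coefficients. The sketch below relies on the standard result, not stated in this lesson, that the vertex of $y = ax^2 + bx + c$ lies at x = −b/(2a). The y-intercept is the value at x = 0, and the sign of a decides whether the graph opens up or down. The function names are illustrative.

```python
def vertex(a, b, c):
    """Vertex (x, y) of y = a*x**2 + b*x + c, using x = -b / (2a)."""
    if a == 0:
        raise ValueError("a must be nonzero for a quadratic function")
    x = -b / (2 * a)
    return (x, a * x**2 + b * x + c)

def opens_up(a):
    """The parabola opens up when a > 0 and down when a < 0."""
    return a > 0

def y_intercept(a, b, c):
    """The graph crosses the y-axis at x = 0, where y = c."""
    return (0, c)

print(vertex(1, -1, -2))       # (0.5, -2.25)
print(opens_up(1))             # True
print(y_intercept(1, -1, -2))  # (0, -2)
```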


(Graph: Courtesy Boundless Algebra)

2.9 A Graphical Interpretation of Quadratic Solutions


The roots of a quadratic function can be found algebraically with the
quadratic formula, and graphically by making observations about its
parabola. The points where the graph cuts the x-axis give the roots of a
quadratic equation.
Example 11: Find the roots of the following quadratic equation algebraically
and graphically.
$y = x^2 - x - 2$
Solution: Plotting the equation, we get

(Graph: Courtesy Boundless Algebra)


From the graph we can see that the parabola intersects the x-axis at two
points: (−1, 0) and (2, 0).
We know that the roots of the quadratic function are given by the x-intercepts
of its parabola. Therefore, there are roots at x = −1 and x = 2.
Now, let us obtain the roots of the function $y = x^2 - x - 2$ algebraically
using the quadratic formula. Here, a = 1, b = −1 and c = −2.
Substituting in the quadratic formula,
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{1 \pm \sqrt{(-1)^2 - 4(1)(-2)}}{2(1)} = \frac{1 \pm \sqrt{9}}{2}$$
so that
$$x = \frac{1 + 3}{2} \quad \text{and} \quad x = \frac{1 - 3}{2}$$
Thus, the roots are x = 2 and x = −1.
These are the same values of the roots that were obtained graphically.

2.10 Cubic Function


A polynomial function of degree 3 is called a cubic function. It is of the
form $f(x) = ax^3 + bx^2 + cx + d$, where a, b, c, and d are real numbers
and a ≠ 0. A cubic function has at most 3 roots. Since complex roots of a
polynomial with real coefficients always occur in conjugate pairs, a cubic
function has either 1 or 3 real roots when repeated roots are counted with
their multiplicity. It always has at least one real root.
The graph of a cubic function may intersect the x-axis at a maximum of 3 points.
Here are two examples of a cubic function.


(Graph: Courtesy Cuemath.com)

X-Intercept of Cubic Function


The x-intercepts of a function are also known as its roots or zeros. As
the degree of a cubic function is 3, it can have a maximum of 3 roots.
Since the complex roots of a polynomial with real coefficients always occur
in pairs, a cubic function has either 0 or 2 complex roots. Thus, counting
repeated roots with their multiplicity, it has one or three real roots,
and its graph always has at least one x-intercept.
To find the x-intercept(s) of a cubic function, we just substitute y =
0 (or f(x) = 0) and solve for x-values.
Example 12: To find the x-intercept(s) of $f(x) = x^3 - 4x^2 + x - 4$,
substitute f(x) = 0. Then
$x^3 - 4x^2 + x - 4 = 0$
$x^2(x - 4) + 1(x - 4) = 0$
$(x - 4)(x^2 + 1) = 0$
$x - 4 = 0$; $x^2 + 1 = 0$
$x = 4$; $x^2 = -1$
$x = 4$; $x = \pm i$
Complex numbers cannot be x-intercepts. Therefore, f(x) has only
one x-intercept, which is (4, 0).
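
The three roots found in Example 12 can be verified by substituting them back into the function. Python has built-in complex numbers, with the imaginary unit written as 1j, so the complex roots can be checked in the same way as the real one. This is only a verification sketch, not a general method for solving cubic equations.

```python
def f(x):
    """The cubic function from Example 12."""
    return x**3 - 4 * x**2 + x - 4

candidates = [4, 1j, -1j]

for root in candidates:
    # Each value should make f(x) equal to zero, up to rounding.
    print(root, abs(f(root)) < 1e-12)

# Only the real roots correspond to x-intercepts on the graph.
real_roots = [r for r in candidates if complex(r).imag == 0]
print(real_roots)  # [4]
```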

2.11 Y-Intercept of Cubic Function


A cubic function always has exactly one y-intercept.
To find the y-intercept of a cubic function, we just substitute x = 0 and
solve for y-value.

Example 13: To find the y-intercept of $f(x) = x^3 - 4x^2 + x - 4$, substitute
x = 0. Then $f(0) = 0^3 - 4(0)^2 + (0) - 4 = -4$.
Therefore, the y-intercept of the function is (0, −4).

End Behavior of Cubic Function


The end behavior of any polynomial function depends upon its degree and the
sign of its leading coefficient. A cubic function $f(x) = ax^3 + bx^2 + cx + d$
is a polynomial of odd degree, so its end behavior is as follows:
i. When the leading coefficient is positive (a > 0): f(x) → ∞ as x
→ ∞ and f(x) → −∞ as x → −∞.
In this case, the graph rises from the bottom left to the top right.

(Graph: Courtesy Cuemath.com)


ii. When the leading coefficient is negative (a < 0): f(x) → −∞ as
x → ∞ and f(x) → ∞ as x → −∞.
In this case, the graph falls from the top left to the bottom right.

(Graph: Courtesy Cuemath.com)

2.12 Exercise
1. Subtract the polynomials: $(5x^3 + x^2 + 9) - (4x^2 + 7x - 3)$
2. Initially, trains A and B are 325 miles away from each other. Train A
is traveling towards B at 50 miles per hour and train B is traveling
towards A at 80 miles per hour. After how much time will the two trains
meet? How far will each train have traveled by then?
3. Write an equation of the line (in slope-intercept form) that is parallel
to the line y = −2x + 4 and passes through the point (−1, 1).
4. Write an equation of the line (in slope-intercept form) that is
perpendicular to the line y = (1/4)x − 3 and passes through the point
(2, 4).
5. Find the roots of the quadratic function $f(x) = x^2 - 4x + 4$. Solve
graphically and algebraically.
6. Find the x-intercept(s) and y-intercept of the cubic function
f(x) = 3(x − 1)(x − 2)(x − 3).



LESSON 3
Reciprocal, Exponential and Logarithmic Functions

STRUCTURE
3.1 Reciprocal Function
3.2 How to Find Reciprocal of a Function
3.3 Properties of Reciprocal Functions
3.4 Graph of Reciprocal Functions
3.5 Domain and Range of Reciprocal Function
3.6 How to Solve Reciprocal Functions?
3.7 Exponential Function
3.8 Logarithmic Functions
3.9 Exercise

3.1 Reciprocal Function


A reciprocal function is obtained by taking the multiplicative inverse of a given function.
For a function f(x) = x, the reciprocal function is 1/f(x) = 1/x.
The general form of a reciprocal function is
$$f(x) = \frac{a}{x + h} + k,$$
where a, h and k are constants.
Reciprocals can be taken of trigonometric functions, logarithmic functions, and
polynomial functions.
The common form of a reciprocal function is y = k/x, where k is any real number and x
can be a variable, a number or a polynomial. When graphed, this forms a hyperbola. This
means that reciprocal functions are functions that contain a constant in the numerator and
an algebraic expression in the denominator.


The reciprocal of a number is the number which, when multiplied by the
original number, gives 1.
For example, take the number 5. Its reciprocal is 1/5, and multiplying
the reciprocal by the original number gives 5 · 1/5 = 1.
Some more examples of reciprocal functions:
$$f(x) = \frac{2}{x^2}$$
$$f(x) = \frac{1}{x + 1} - 4$$
$$f(x) = -\frac{1}{x + 2} + 3$$

3.2 How to Find Reciprocal of a Function


Reciprocals and reciprocal functions share similar characteristics and
properties. We can determine a number’s reciprocal by dividing 1 by the
given number. In the same manner, to find a function’s reciprocal
function, we divide 1 by the function’s expression.
Here is a table comparing the reciprocal of a number with reciprocal functions:

Reciprocal | Reciprocal Function
Given a number k, its reciprocal is 1/k. | Given a function f(x), its reciprocal function is 1/f(x).
The product of k and its reciprocal is k · 1/k = 1. | The product of f(x) and its reciprocal function is f(x) · 1/f(x) = 1.
1/k is undefined when k = 0. | 1/f(x) is undefined when f(x) = 0.

For example, let the function be f(x) = 2x − 1. Then its reciprocal function is
1/f(x) = 1/(2x − 1).
We can also confirm that the product of 2x − 1 and its reciprocal is
(2x − 1) · 1/(2x − 1) = 1.
This also means that 2x − 1 must never be zero, so x must never be 1/2.

3.3 Properties of Reciprocal Functions
The reciprocal functions can be easily identified with the following
properties.
‹ Reciprocal functions are in the form of a fraction. The numerator is
a real number and the denominator is a number, a variable
or a polynomial.
‹ The reciprocal of x is 1/x.
‹ The denominator of a reciprocal function cannot be 0.
‹ The domain and range of the reciprocal function f(x) = 1/x are both
the set of all real numbers excluding 0.
‹ The graph of f(x) = 1/x is symmetric about the line y = x.

3.4 Graph of Reciprocal Functions


There are many forms of reciprocal functions. One of them is of the form
k/x, where k is a real number and x cannot be 0. Let us draw the graph of
the function f(x) = 1/x by taking different values of x and computing the
corresponding values of y.

x | −3 | −2 | −1 | −1/2 | −1/3 | 1/3 | 1/2 | 1 | 2 | 3
y | −1/3 | −1/2 | −1 | −2 | −3 | 3 | 2 | 1 | 1/2 | 1/3
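
The table of values for f(x) = 1/x can be generated programmatically. The sketch below uses the Fraction type from the Python standard library so that the values appear as exact fractions rather than rounded decimals. The variable names are illustrative.

```python
from fractions import Fraction

# The same x-values as in the table; x = 0 is left out because 1/0 is undefined.
xs = [Fraction(-3), Fraction(-2), Fraction(-1), Fraction(-1, 2), Fraction(-1, 3),
      Fraction(1, 3), Fraction(1, 2), Fraction(1), Fraction(2), Fraction(3)]

table = [(x, 1 / x) for x in xs]

for x, y in table:
    print(f"x = {x!s:>5}   y = {y!s:>5}")
```

Every pair in the table satisfies x · y = 1, which is the defining property of a reciprocal.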

(Graph: Courtesy Boundless Algebra)


For the reciprocal function f(x) = 1/x, x can never be 0, and 1/x can
also never be equal to 0. From the graph, we observe that the curve never
touches the x-axis or the y-axis. The y-axis is said to be the vertical
asymptote, as the curve gets very close to it but never touches it.
Similarly, the x-axis is the horizontal asymptote, as the curve approaches
the x-axis but never touches it.

3.5 Domain and Range of Reciprocal Function


Reciprocal functions have a domain and range, just as other functions do.
The domain of a reciprocal function is the set of all real numbers except
those at which the denominator is zero, where the function is undefined.
The range is the set of all real values that the function can take.
For f(x) = 1/x, the domain is the set of all real numbers except 0, for
which the function is undefined, and the range is also the set of all
real numbers except 0, since 1/x is never equal to 0.

3.6 How to Solve Reciprocal Functions?


The reciprocals of numbers, variables, expressions and fractions can be
obtained by simply interchanging the numerator and the denominator. The
method for some important cases is as follows:
‹ Reciprocal of a Number: To find the reciprocal of a number, we divide
1 by the number. For example, the reciprocal of 6 is 1/6.
‹ Reciprocal of a Variable: The reciprocal of a variable y can be
found by dividing 1 by the variable. For example, the reciprocal
of y is 1/y.
‹ Reciprocal of an Expression: The reciprocal of an expression can
be found by exchanging the positions of the numerator and denominator.
For example, the reciprocal of x/(x − 4) is (x − 4)/x.
‹ Reciprocal of a Fraction: The reciprocal of a fraction can be obtained
by interchanging the numerator and denominator. For example, the
reciprocal of 5/8 is 8/5.
‹ Reciprocal of a Mixed Fraction: The reciprocal of a mixed fraction
can be obtained by converting it to an improper fraction and then finding
its reciprocal. For example, to find the reciprocal of 3 3/4, we first write
it as the improper fraction 15/4 and then take the reciprocal, 4/15.
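
The numeric cases in the list above can be reproduced with exact fractions. In this sketch the function name reciprocal is an illustrative choice, and the mixed fraction 3 3/4 is entered as 3 + 3/4, which Python converts to the improper fraction 15/4.

```python
from fractions import Fraction

def reciprocal(value):
    """Return 1 / value as an exact fraction; 0 has no reciprocal."""
    value = Fraction(value)
    if value == 0:
        raise ZeroDivisionError("0 has no reciprocal")
    return 1 / value

print(reciprocal(6))                # 1/6
print(reciprocal(Fraction(5, 8)))   # 8/5

mixed = 3 + Fraction(3, 4)          # the mixed fraction 3 3/4
print(mixed)                        # 15/4
print(reciprocal(mixed))            # 4/15
```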
3.7 Exponential Function
An exponential function is a mathematical function in which the variable
appears in the exponent. It is used in many practical situations, such as
modelling exponential decay or exponential growth, computing returns on
investments, modelling populations and so on.

Definition: Exponential Function


An exponential function is defined as a mathematical function of the form
$f(x) = a^x$, where x is a variable and a is the base of the function, a
constant greater than 0 and not equal to 1. The transcendental number e,
whose value is approximately 2.71828, is the most commonly used base. In
that case the function is $f(x) = e^x$.

Formula for Exponential Function


The formula for an exponential function is
$f(x) = a^x$
where a is a constant, known as the base of the function, with a > 0 and
a ≠ 1. The input variable x occurs as an exponent and can be any real number.
The different forms of an exponential function are:
‹ $f(x) = b^x$
‹ $f(x) = ab^x$
‹ $f(x) = ab^{cx}$
‹ $f(x) = e^x$
‹ $f(x) = e^{kx}$
‹ $f(x) = pe^{kx}$
where a, b, c, p and k are constants and x is a variable. In each
exponential function the base must be a positive number, i.e., b > 0 and e >
0. Also, b ≠ 1 (for if b = 1, then the function $f(x) = b^x$ becomes f(x) =
1, which is a constant function). The domain of a function
y = f(x) is the set of all x-values (inputs) and the range is the set of all
y-values (outputs) of the function.


Some examples of exponential functions are:


‹ $f(x) = 2^x$
‹ $f(x) = 1/2^x = 2^{-x}$
‹ $f(x) = 2^{x+3}$
‹ $f(x) = 0.5^x$
Example 1: Simplify the exponential function $f(x) = 2^x - 2^{x+1}$.
Solution: Given exponential function: $2^x - 2^{x+1}$
To simplify the function, we use the property $a^x a^y = a^{x+y}$. Thus, the given
function is written as:
$2^x - 2^{x+1} = 2^x - 2^x \cdot 2$
$= 2^x(1 - 2)$
$= 2^x(-1)$
$= -2^x$
Therefore, the given exponential function $f(x) = 2^x - 2^{x+1}$ simplifies
to $-2^x$.

Exponential Curve
The shape of the exponential curve depends on the base of the exponential
function and on the value of x. The base must be positive: if the base were
negative, the function would not be defined as a real number for many values
of x, such as x = 1/2.
Whether the curve shows growth or decay depends on the exponential function.
Any quantity that grows or decays by a fixed per cent at regular intervals
exhibits either exponential growth or exponential decay.

Exponential Growth
In exponential growth, the quantity increases slowly at first, and
then rapidly. The rate of change increases over time, so growth becomes
faster as time passes. This rapid growth is called an “exponential
increase”. The formula for exponential growth is:
$y = a(1 + r)^x$
where a is the initial quantity and r is the growth rate per period,
expressed as a decimal. The graph of an exponential growth function is
increasing. The exponential growth formula is used to find doubling time
and to model population growth, compound interest, etc.


Exponential Decay


In exponential decay, the quantity decreases rapidly at first, and
then slowly. The rate of change decreases over time, so the decrease
becomes slower as time passes. This is called an “exponential
decrease”. The formula for exponential decay is:
$y = a(1 - r)^x$
where a is the initial quantity and r is the decay rate per period,
expressed as a decimal. The graph of an exponential decay function is
decreasing. Exponential decay is used to find half-life, to
model population decline, etc.
Example 2: In 1990, there were 100,000 citizens in a town. If the popu-
lation increases by 8% every year, then how many citizens will be there
in 10 years?
Solution: The initial population, a = 100,000.
The rate of growth, r = 8% = 0.08.
The time, x = 10 years.
Using the exponential growth formula,
$f(x) = a(1 + r)^x$
$f(10) = 100000(1 + 0.08)^{10}$
≈ 215,892
Therefore, the number of citizens in 10 years will be approximately 215,892.
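
The growth and decay formulas can be written as two small functions. This is a sketch with illustrative names. The rate is entered as a decimal (8% as 0.08), and the decay example (a quantity of 100 halving three times) is a made-up illustration rather than a figure from the text.

```python
def exponential_growth(initial, rate, periods):
    """y = a * (1 + r)**x, with the rate r given as a decimal."""
    return initial * (1 + rate) ** periods

def exponential_decay(initial, rate, periods):
    """y = a * (1 - r)**x, with the rate r given as a decimal."""
    return initial * (1 - rate) ** periods

population = exponential_growth(100_000, 0.08, 10)  # the town from Example 2
print(round(population))  # 215892

# Hypothetical decay example: 100 units losing 50% per period for 3 periods.
print(exponential_decay(100, 0.5, 3))  # 12.5
```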

Exponential Function Graph


The exponential function graph and its properties are explained through
the following examples.
Example 3: Let the function be $f(x) = 2^x$. We take some values
of x to construct the graph of the function. On the graph these points are
connected by a curve, which is then extended on both ends. The graph
generated is given below.


(Graph: Courtesy Cuemath.com)

The properties of the exponential function graph when the base is greater
than 1 are given below.
‹ The graph is an increasing function passing through the point (0,1).
‹ The domain is all real numbers and the range is y > 0.
‹ The graph is asymptotic to the x-axis as x approaches negative
infinity.
‹ The graph increases without bound as x approaches positive infinity.
‹ The graph is continuous and smooth.
Example 4: Construct the graph for the exponential function
$g(x) = (1/2)^x$.
Following steps similar to those in the above example, we get the
following graph.

(Graph: Courtesy Cuemath.com)



The properties of the exponential function and its graph when the base
is between 0 and 1 are as follows:
‹ The graph is a decreasing function passing through the
point (0, 1).
‹ The domain includes all real numbers and the range is y > 0.
‹ The graph is asymptotic to the x-axis as x approaches positive
infinity.
‹ The graph increases without bound as x approaches negative infinity.
‹ The graph is continuous and smooth.
Thus, from the above two graphs, we can see that $f(x) = 2^x$ is
increasing and $g(x) = (1/2)^x$ is decreasing.
In general, the graph of the exponential function $f(x) = a^x$ increases
when a > 1 and decreases when 0 < a < 1.

Exponential Function Asymptotes


There is no vertical asymptote of an exponential function, since the function
is defined for every real x and is continuously increasing or decreasing. But
the function has a horizontal asymptote. The equation of the horizontal
asymptote of an exponential function $f(x) = ab^x + c$ is always y = c, i.e.,
the constant which is added to the exponential part of the function. In the
above two examples ($f(x) = 2^x$ and $g(x) = (1/2)^x$), the horizontal
asymptote is y = 0, as nothing is being added to the exponential part in
either function. Thus, we can note that:
‹ An exponential function never has a vertical asymptote.
‹ The horizontal asymptote of an exponential function $f(x) = ab^x + c$
is y = c.
The domain of $f(x) = 2^x$ and $g(x) = (1/2)^x$ is the set of all
real numbers (−∞, ∞), and the range is obtained from the horizontal
asymptote of the graph, y = c, by seeing whether the graph lies above y = c
or below y = c. Thus, for an exponential function $f(x) = ab^x + c$,
the domain is the set of all real numbers, (−∞, ∞), and the range is f(x) >
c if a > 0 and f(x) < c if a < 0.
The following figure compares polynomial functions with an exponential
function. We can see that the nature of polynomial functions depends on
their degree: the higher the degree of a polynomial function, the faster
its growth. But the exponential function $y = f(x) = a^x$, where a > 1,
grows faster than any polynomial function.
That is, for any positive integer n, f(x) eventually grows faster than
$f_n(x) = x^n$.
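
A small numerical comparison illustrates this claim. The sketch below compares the polynomial $x^3$ with the exponential $2^x$ at a few sample points. The degree 3, the base 2 and the sample values of x are illustrative choices, not taken from the figure.

```python
def compare_growth(x, degree=3, base=2):
    """Return the pair (x**degree, base**x) for a given x."""
    return (x ** degree, base ** x)

for x in (1, 5, 10, 20):
    poly, expo = compare_growth(x)
    print(f"x = {x:>2}   x^3 = {poly:>5}   2^x = {expo:>8}")
```

At x = 5 the polynomial is still larger (125 against 32), but by x = 10 the exponential has overtaken it (1024 against 1000), and by x = 20 the gap is very wide.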

(Graph: Courtesy Byju’s)

Thus, the exponential function with base greater than 1, i.e., a > 1, is
defined as $y = f(x) = a^x$. The domain of the exponential function is the
set of real numbers and the range is the set of all positive real
numbers. Also note that the exponential function is increasing and the
point (0, 1) always lies on its graph. Also, its value is very close to
zero when x is a large negative number.

3.8 Logarithmic Functions


Logarithmic functions are important in mathematical calculations.
They have numerous applications in astronomical and scientific calculations
involving huge numbers. Logarithmic functions are related to exponential
functions and are defined as the inverse of the exponential function.
The exponential equation $a^x = y$ is transformed to the logarithmic equation
$\log_a y = x$.

Definition of Logarithmic Functions


The inverse of the exponential function $a^x = y$ is defined as the
logarithmic function. The exponential equation $a^x = y$ can
be transformed into the logarithmic equation $\log_a y = x$. The logarithmic
function is of the form
$y = f(x) = \log_a x$, where a > 0 and a ≠ 1


The base of the logarithm is a. This can be read as log base a of x. The
two most common bases used in logarithmic functions are base 10 and
base e. Thus, log functions include the natural logarithm (ln), which has
base e and is denoted by $\log_e$, and the common logarithm (log), which has
base 10 and is denoted by $\log_{10}$ or simply log.
Some examples of logarithmic functions are:
‹ f(x) = ln (x − 3)
‹ $f(x) = \log_2 (x + 6) - 2$
‹ f(x) = 2 log x, etc.
Logarithms can be calculated for positive whole numbers, fractions and
decimals, but not for zero or negative values. Logarithms are generally
calculated with a base of 10. The logarithmic value of any number can be
found using a Napier logarithm table.
The domain of the log function y = log x is (0, ∞) and the range of any
log function is the set of all real numbers (R).
Example 5: Simplify log_2 (1/128).
Solution: We use the properties of logarithmic functions to simplify the
given logarithm.
log_2 (1/128) = log_2 1 – log_2 128 (since log (a/b) = log a – log b)
= 0 – log_2 2^7 (since log_a 1 = 0 and 128 = 2^7)
= –log_2 2^7
= –7 log_2 2 (since log_a (b^x) = x log_a b)
= –7 (1) (since log_a a = 1)
= –7
Hence log_2 (1/128) = –7
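
The result of Example 5 can be cross-checked numerically. This is a minimal sketch using Python's standard `math` module; the variable names are illustrative.

```python
import math

# log base 2 of 1/128, computed directly
direct = math.log2(1 / 128)

# the same value assembled from the properties used in Example 5:
# log_2(1/128) = log_2(1) - log_2(2^7) = 0 - 7 * log_2(2)
via_properties = math.log2(1) - 7 * math.log2(2)

print(direct, via_properties)
```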
Example 6: Find the domain and range of the logarithmic function
f(x) = 2 log (2x – 4) + 5.
Solution: For finding domain, we set the argument of the function greater
than 0 and solve for x.
2x – 4 > 0
2x > 4
x > 2

Thus, domain = (2, ∞).


Now we know that the range of any log function is R, so the range of
f(x) is R.

Logarithmic Graph
Since the exponential and log functions are inverses of each other, their
graphs are symmetric with respect to the line y = x. Also, when x = 1,
y = 0, as y = log_a 1 = 0 for any ‘a’. Thus, all such functions have an
x-intercept of (1, 0). Since log_a 0 is not defined, a logarithmic function
does not have a y-intercept. The domain of the basic logarithmic function
y = log_a x is the set of positive real numbers and the range is the set of
all real numbers. Using all these, the graph of the logarithmic function will be

(Graph: Courtesy Cuemath.com)

Graphing Logarithmic Functions


Before drawing a log function graph, check whether the curve is
increasing or decreasing. If the base > 1, the curve is increasing,
whereas if 0 < base < 1, the curve is decreasing. The steps for
graphing logarithmic functions are:
• Find the domain and range.
• Find the vertical asymptote by setting the argument equal to 0. Note
that a log function doesn’t have any horizontal asymptote.
• To find the x-intercept, substitute the value of x that makes the
argument equal to 1 and use the property log_a 1 = 0.
• To get another point on the graph, substitute the value of x that makes
the argument equal to the base and use the property log_a a = 1.
• Join the two points and extend the curve on both sides with respect
to the vertical asymptote.
Example 7: Graph the logarithmic function f(x) = 2 log_3 (x + 1).
Solution: Here, the base is 3 > 1, so the curve will be an increasing curve.
For domain: Since x + 1 > 0 ⇒ x > –1, domain = (–1, ∞).
Range = R.
Vertical asymptote is x = –1.
• At x = 0, y = 2 log_3 (0 + 1) = 2 log_3 1 = 2 (0) = 0
• At x = 2, y = 2 log_3 (2 + 1) = 2 log_3 3 = 2 (1) = 2
Thus, the two points on the curve are (0, 0) and (2, 2), and the
logarithmic function graph will be:

(Graph: Courtesy Cuemath.com)
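
The key points used in Example 7 can be computed with a short script. This is a minimal sketch, assuming a hypothetical helper `f` that evaluates 2 log_3 (x + 1); the names are illustrative only.

```python
import math

def f(x, base=3.0):
    """f(x) = 2 * log_base(x + 1), defined only for x > -1."""
    if x + 1 <= 0:
        raise ValueError("argument of the logarithm must be positive")
    return 2 * math.log(x + 1, base)

vertical_asymptote = -1        # from x + 1 = 0
x_intercept = (0, f(0))        # argument equals 1, so log is 0
extra_point = (2, f(2))        # argument equals the base 3, so log is 1

print(vertical_asymptote, x_intercept, extra_point)
```

Evaluating `f` at a few more values of x greater than –1 gives additional points for a smoother sketch of the curve.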

3.9 Exercise
1. Find the reciprocal of the following.
i. 5
ii. 3x
iii. x^2 + 6
iv. 5/8

2. Find the domain and range of the reciprocal function y = 1/(x+3).
3. Find the vertical and horizontal asymptote of the function
f(x) = 2/(x – 7).
4. Solve the exponential equation: (¼)^x = 64.
5. Graph an exponential function (⅓)x – 1.
6. Solve for x: 8(4x-1) = 45x.
7. Solve the exponential equation for x: –5x-3 = 25/40.
8. Simplify the following exponential expression: 3^x – 3^(x+2).
9. The half-life of carbon-14 is 5,730 years. If there were initially
1000 grams of carbon, then what is the amount of carbon left after
2000 years? Round your answer to the nearest integer.
10. Express 4^3 = 64 in logarithmic form.
11. Given the logarithmic function f(x) = 3 log2 (2x – 3) – 7. Find the
domain, range, vertical and horizontal asymptotes of this function.

UNIT - II

L E S S O N

4
Correlation

STRUCTURE
4.1 Covariance
4.2 Correlation
4.3 Types of Correlation
4.4 Scatter or Dot Diagram
4.5 Karl Pearson’s Coefficient of Correlation
4.6 Coefficient of Correlation of Grouped Data
4.7 Rank Correlation
4.8 Spearman’s Rank Correlation Coefficient
4.9 Equal Ranks
4.10 Correction Factor
4.11 Exercise

4.1 Covariance
Let the corresponding values of two variables X and Y be given by the ordered pairs (x1, y1),
(x2, y2), (x3, y3), …, (xn, yn).
Then the covariance between X and Y is denoted by Cov (X, Y).
It is defined as
Cov (X, Y) = (1/n) Σ (xi – x̄)(yi – ȳ)
OR
Cov (X, Y) = E (XY) – E (X) E (Y)
where E (XY), E (X) and E (Y) are the means of the products xi yi, of the xi and of the yi respectively.

Working Rule to Calculate Covariance


Step I: Calculate the sums Σxi and Σyi (i = 1 to n).

Step II: Calculate the sum Σxi yi of the products of xi and yi.

Step III: Divide the values obtained in Steps I and II by n to get
Σxi / n, Σyi / n and Σxi yi / n.

Step IV: Obtain the difference Σxi yi / n – (Σxi / n)(Σyi / n) to get Cov (X, Y).

Example 1: Calculate the covariance of the following pairs of observations
of two variates.
(1, 4) (2, 2) (3, 4) (4, 8) (5, 9) (6, 12)
Solution: Here n = 6.
Σxi = 1 + 2 + 3 + 4 + 5 + 6 = 21
Σyi = 4 + 2 + 4 + 8 + 9 + 12 = 39
Σxi yi = (1 × 4) + (2 × 2) + (3 × 4) + (4 × 8) + (5 × 9) + (6 × 12)
= 4 + 4 + 12 + 32 + 45 + 72 = 169

Cov (X, Y) = Σxi yi / n – (Σxi / n)(Σyi / n) = 169/6 – (21/6)(39/6)
= 28.17 – 22.75 = 5.42 (approximately)
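
The working rule for covariance can be sketched in Python as follows. This is a minimal illustration using the pairs of Example 1; the function name `covariance` is an assumption made for the sketch, not part of the text.

```python
def covariance(pairs):
    """Cov(X, Y) = sum(xy)/n - (sum(x)/n) * (sum(y)/n), following Steps I-IV."""
    n = len(pairs)
    sum_x = sum(x for x, _ in pairs)        # Step I
    sum_y = sum(y for _, y in pairs)        # Step I
    sum_xy = sum(x * y for x, y in pairs)   # Step II
    # Steps III and IV: divide by n and take the difference
    return sum_xy / n - (sum_x / n) * (sum_y / n)

pairs = [(1, 4), (2, 2), (3, 4), (4, 8), (5, 9), (6, 12)]
cov = covariance(pairs)
print(round(cov, 4))
```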

Example 2: Find the covariance of the following pairs of observations
of two variates:
(10, 35) (15, 20) (20, 30) (25, 30) (30, 35)
(35, 38) (40, 42) (45, 30) (50, 40) (55, 70)

Solution: Here n = 10.
Σxi = 10 + 15 + 20 + 25 + 30 + 35 + 40 + 45 + 50 + 55 = 325
Σyi = 35 + 20 + 30 + 30 + 35 + 38 + 42 + 30 + 40 + 70 = 370
Σxi yi = 350 + 300 + 600 + 750 + 1050 + 1330 + 1680 + 1350 + 2000 + 3850 = 13260
Cov (X, Y) = 13260/10 – (325/10)(370/10) = 1326 – 1202.5 = 123.5

4.2 Correlation
Whenever two variables x and y are so related that an increase in
one is accompanied by an increase or decrease in the other, the
variables are said to be correlated.
For example, the yield of crop varies with the amount of rainfall.

4.3 Types of Correlation


1. Positive Correlation: If an increase in the value of one variable X
results, on average, in a corresponding increase in the value of the
other variable Y, or a decrease in the value of X results, on average,
in a corresponding decrease in the value of Y, the correlation is said
to be positive.
2. Negative Correlation: If an increase in the values of one variable X
results in a corresponding decrease in the values of the other variable Y,
or a decrease in the values of X results in an increase in the
corresponding values of Y, the correlation between X and Y is said
to be negative.
3. Linear Correlation: When all the plotted points lie approximately on
a straight line, then the correlation is said to be linear correlation.
4. Perfect Correlation: If two variables vary in such a way that their
ratio is always constant, then the correlation is said to be perfect.
In this case the plotted points on a graph lie exactly on a straight
line.
5. Perfect Positive Correlation: If an increase in one variable X is
proportional to the increase in the other variable Y, the correlation
is perfect positive. The graph will be exactly a straight line.
6. Perfect Negative Correlation: If an increase in one variable X is
proportional to the decrease in the other variable Y, the correlation
is perfect negative. The graph will be exactly a straight line.
7. Independent Correlation: If there is no relationship between two
variables, they are said to have independent correlation.
Independence implies zero correlation, but the converse may not be
true: zero correlation does not always mean that the variables are
independent.

4.4 Scatter or Dot Diagram


When we plot the corresponding values of two variables, taking one along
the x-axis and the other along the y-axis, we get a collection of dots.
This collection of dots is called a dot diagram or a scatter diagram.
There are five types of scatter diagrams given below. In the first diagram,
as x increases, y also increases. This is positive correlation. In the
second diagram, as x increases, y decreases. This relation is called
negative correlation.
In the third diagram, the points are scattered in such a way that there is
no correlation between them. In the fourth and fifth diagrams, the points
lie exactly on a straight line, as an increase in one variable x is
proportional to the increase or decrease in the other variable y for
perfect positive or perfect negative correlation respectively.

4.5 Karl Pearson’s Coefficient of Correlation


Karl Pearson’s coefficient of correlation between two variables x and y, denoted by r, is defined as:
r = ΣXY / √(ΣX² · ΣY²) = ΣXY / (n σx σy) ... (1)
where X = x – x̄, Y = y – ȳ,
i.e., X, Y are the deviations measured from their respective means,
ΣXY / n = covariance of x and y,
and σx, σy are the standard deviations of the two series.
Example 3: Calculate the coefficient of correlation between the x and y series
from the following data:
Σ(x – x̄)² = 136, Σ(y – ȳ)² = 138, Σ(x – x̄)(y – ȳ) = 122

Solution: Here, we have
ΣX² = Σ(x – x̄)² = 136
ΣY² = Σ(y – ȳ)² = 138
ΣXY = Σ(x – x̄)(y – ȳ) = 122

r = ΣXY / √(ΣX² · ΣY²) ... (1)

Putting the values of ΣXY, ΣX² and ΣY² in (1), we get
r = 122 / √(136 × 138) = 122 / √18768 ≈ 122 / 137.0 ≈ 0.89

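Example 4: Calculate the correlation coefficient between the following
data:

x 5 9 13 17 21
y 12 20 25 33 35
Solution: Here
x̄ = (5 + 9 + 13 + 17 + 21)/5 = 65/5 = 13
ȳ = (12 + 20 + 25 + 33 + 35)/5 = 125/5 = 25
Let X = (x – x̄) and Y = (y – ȳ).
x     y     X = x – 13   Y = y – 25   X²    Y²    XY
5     12    –8           –13          64    169   104
9     20    –4           –5           16    25    20
13    25    0            0            0     0     0
17    33    4            8            16    64    32
21    35    8            10           64    100   80
Total                                 160   358   236
r = ΣXY / √(ΣX² · ΣY²) = 236 / √(160 × 358) = 236 / √57280 ≈ 0.986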
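
Karl Pearson's formula in terms of deviations from the means can be sketched in Python. This is a minimal illustration on the data of Example 4; the function name `pearson_r` is an assumption made for the sketch.

```python
import math

def pearson_r(xs, ys):
    """r = sum(XY) / sqrt(sum(X^2) * sum(Y^2)), X and Y being deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    sum_xy = sum(X * Y for X, Y in zip(dev_x, dev_y))
    sum_x2 = sum(X * X for X in dev_x)
    sum_y2 = sum(Y * Y for Y in dev_y)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)

x = [5, 9, 13, 17, 21]
y = [12, 20, 25, 33, 35]
r = pearson_r(x, y)
print(round(r, 3))
```

A value of r close to +1 indicates a strong positive linear relationship between the two series.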

Example 5: Calculate the correlation coefficient between x and y for
the following data:

x 21 23 30 54 57 58 72 78 87 90
y 60 71 72 83 110 84 100 92 113 135

Solution:
Mean of x = x̄ = Σx / n = 570/10 = 57
Mean of y = ȳ = Σy / n = 920/10 = 92
If X and Y are the deviations of the x’s and y’s from their respective means, then
the data may be arranged in the following form:
x     y     X = x – 57   Y = y – 92   X²     Y²     XY
21    60    –36          –32          1296   1024   1152
23    71    –34          –21          1156   441    714
30    72    –27          –20          729    400    540
54    83    –3           –9           9      81     27
57    110   0            18           0      324    0
58    84    1            –8           1      64     –8
72    100   15           8            225    64     120
78    92    21           0            441    0      0
87    113   30           21           900    441    630
90    135   33           43           1089   1849   1419
TOTAL: Σx = 570, Σy = 920, ΣX² = 5846, ΣY² = 4688, ΣXY = 4594
Hence
r = ΣXY / √(ΣX² · ΣY²) = 4594 / √(5846 × 4688) ≈ 4594 / 5235.1 ≈ 0.878

General Formula for Coefficient of Correlation

r = [N ΣX′Y′ – ΣX′ ΣY′] / √{[N ΣX′² – (ΣX′)²] [N ΣY′² – (ΣY′)²]}

Here the deviations are taken from the assumed means a, b; previously
the deviations were taken from the actual means x̄ and ȳ.
X′ = x – a, Y′ = y – b, and N is the number of pairs of observations.

Example 6: Calculate Karl Pearson’s coefficient of correlation for the
data given below:
Independent variable x 3 7 5 4 6 8 2 7
Dependent variable y 7 12 8 8 10 13 5 10

Solution: Let the assumed mean for x and y series be 5 and 9 respectively.
x     y     X′ = x – 5   Y′ = y – 9   X′²   Y′²   X′Y′
3     7     –2           –2           4     4     4
7     12    2            3            4     9     6
5     8     0            –1           0     1     0
4     8     –1           –1           1     1     1
6     10    1            1            1     1     1
8     13    3            4            9     16    12
2     5     –3           –4           9     16    12
7     10    2            1            4     1     2
Total       2            1            32    49    38

Now, with N = 8,
r = [8 × 38 – 2 × 1] / √{[8 × 32 – 2²] [8 × 49 – 1²]}
= 302 / √(252 × 391) = 302 / √98532 ≈ 0.962
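
The assumed-mean formula can be sketched in Python and compared against the same computation with the origin as the assumed mean. This is a minimal illustration on the data of Example 6; the function name `pearson_r_assumed` is hypothetical.

```python
import math

def pearson_r_assumed(xs, ys, a=0.0, b=0.0):
    """General formula with deviations X' = x - a and Y' = y - b from assumed means."""
    n = len(xs)
    dx = [x - a for x in xs]
    dy = [y - b for y in ys]
    s_dx, s_dy = sum(dx), sum(dy)
    s_dxdy = sum(p * q for p, q in zip(dx, dy))
    s_dx2 = sum(p * p for p in dx)
    s_dy2 = sum(q * q for q in dy)
    numerator = n * s_dxdy - s_dx * s_dy
    denominator = math.sqrt((n * s_dx2 - s_dx ** 2) * (n * s_dy2 - s_dy ** 2))
    return numerator / denominator

x = [3, 7, 5, 4, 6, 8, 2, 7]
y = [7, 12, 8, 8, 10, 13, 5, 10]
r_assumed = pearson_r_assumed(x, y, a=5, b=9)  # assumed means used in Example 6
r_origin = pearson_r_assumed(x, y)             # assumed means 0 and 0
print(round(r_assumed, 3), round(r_origin, 3))
```

Both calls give the same value, which illustrates that the coefficient of correlation does not depend on the choice of assumed means.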

Example 7: From the following data, examine whether input of oil and
output of electricity can be said to be correlated.

Input of Oil 6.9 8.2 7.8 4.8 9.6 8.0 7.7
Output of Electricity 1.9 3.5 6.5 1.3 5.5 3.5 2.2

Solution: Assumed mean of x = 4.8, assumed mean of y = 1.3

x     y     X′ = x – 4.8   Y′ = y – 1.3   X′²     Y′²     X′Y′
6.9   1.9   2.1            0.6            4.41    0.36    1.26
8.2   3.5   3.4            2.2            11.56   4.84    7.48
7.8   6.5   3.0            5.2            9.00    27.04   15.60
4.8   1.3   0              0              0       0       0
9.6   5.5   4.8            4.2            23.04   17.64   20.16
8.0   3.5   3.2            2.2            10.24   4.84    7.04
7.7   2.2   2.9            0.9            8.41    0.81    2.61
ΣX′ = 19.4, ΣY′ = 15.3, ΣX′² = 66.66, ΣY′² = 55.53, ΣX′Y′ = 54.15

Now, with N = 7,
r = [7 × 54.15 – 19.4 × 15.3] / √{[7 × 66.66 – (19.4)²] [7 × 55.53 – (15.3)²]}
= 82.23 / √(90.26 × 154.62) ≈ 82.23 / 118.14 ≈ 0.70
Hence input of oil and output of electricity are positively correlated.

Note: Let the assumed means of x and y be 0, 0, i.e.,
a = 0, b = 0 ⇒ X′ = x and Y′ = y. Then
r = [N Σxy – Σx Σy] / {√[N Σx² – (Σx)²] · √[N Σy² – (Σy)²]}

Example 8: Calculate the coefficient of correlation between the marks
obtained by 8 students in mathematics and statistics.
Students A B C D E F G H
Mathematics 25 30 32 35 37 40 42 45
Statistics 08 10 15 17 20 23 24 25
Solution: Let the marks of two subjects be denoted by x and y respectively.
Let the assumed mean for x marks be 0 and that for y be 0, and N = 8.
x     y     x²      y²     xy
25    08    625     64     200
30    10    900     100    300
32    15    1024    225    480
35    17    1225    289    595
37    20    1369    400    740
40    23    1600    529    920
42    24    1764    576    1008
45    25    2025    625    1125
Σx = 286, Σy = 142, Σx² = 10532, Σy² = 2808, Σxy = 5368
r = [8 × 5368 – 286 × 142] / √{[8 × 10532 – (286)²] [8 × 2808 – (142)²]}
= 2332 / √(2460 × 2300) ≈ 2332 / 2378.7 ≈ 0.98

4.6 Coefficient of Correlation of Grouped Data

r = [N Σf X′Y′ – (Σf X′)(Σf Y′)] / √{[N Σf X′² – (Σf X′)²] [N Σf Y′² – (Σf Y′)²]}
where r is the coefficient of correlation,
f = frequency of the corresponding cell,
X′ = Deviation from assumed mean of x = x – a
Y′ = Deviation from assumed mean of y = y – b
N = Total number of items.
Example 9: Find the coefficient of correlation between the age of the
beneficiaries and the amount of LIC from the following table.
                      Amount of LIC in Rs.
Age-group   10,000   20,000   30,000   40,000   50,000   No. of Beneficiaries
20–30       4        6        3        7        1        21
30–40       2        8        15       7        1        33
40–50       3        9        12       6        2        32
50–60       8        4        2        –        –        14
Total       17       27       32       20       4        100

Solution: Let the amount be denoted by x and the age group by y.

Hence the age and the sum assured are negatively correlated, i.e., as age
goes up the sum assured comes down.
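
The grouped-data computation for Example 9 can be sketched in Python. This is a minimal illustration that assumes each age group is represented by its class mid-point (25, 35, 45, 55) and treats the two empty cells as zero frequency; the variable names are illustrative.

```python
import math

amounts = [10_000, 20_000, 30_000, 40_000, 50_000]  # x: amount of LIC in Rs.
age_mid = [25, 35, 45, 55]                          # y: assumed class mid-points
freq = [
    [4, 6, 3, 7, 1],    # 20-30
    [2, 8, 15, 7, 1],   # 30-40
    [3, 9, 12, 6, 2],   # 40-50
    [8, 4, 2, 0, 0],    # 50-60 (blank cells taken as 0)
]

N = sum(sum(row) for row in freq)
s_fx = s_fy = s_fx2 = s_fy2 = s_fxy = 0
for y, row in zip(age_mid, freq):
    for x, f in zip(amounts, row):
        s_fx += f * x
        s_fy += f * y
        s_fx2 += f * x * x
        s_fy2 += f * y * y
        s_fxy += f * x * y

r = (N * s_fxy - s_fx * s_fy) / math.sqrt(
    (N * s_fx2 - s_fx ** 2) * (N * s_fy2 - s_fy ** 2)
)
print(N, round(r, 3))
```

The negative sign of r agrees with the conclusion that the sum assured tends to fall as age rises.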

Example 10: Calculate the coefficient of correlation for the following table:

Here,

Solution:

Example 11: A computer operator, while calculating the correlation coefficient
between two variates x and y for 25 pairs of observations, obtained the
following constants:
n = 25, Σx = 125, Σx² = 650, Σy = 100, Σy² = 460, Σxy = 508
It was however later discovered at the time of checking that he had
copied down two pairs as (6, 14) and (8, 6) while the correct pairs were
(8, 12) and (6, 8). Obtain the correct value of the correlation coefficient.
Solution:
Corrected Σx = Incorrect Σx – (6 + 8) + (8 + 6) = 125 – 14 + 14 = 125
Corrected Σy = Incorrect Σy – (14 + 6) + (12 + 8)
= 100 – 20 + 20 = 100
Corrected Σx² = 650 – (6² + 8²) + (8² + 6²) = 650 – 100 + 100
= 650
Corrected Σy² = 460 – (14² + 6²) + (12² + 8²) = 460 – 232 + 208 = 436
Corrected Σxy = 508 – [(6) (14) + (8) (6)] + [(8) (12) + (6) (8)]
= 508 – (84 + 48) + (96 + 48) = 508 – 132 + 144 = 520
Corrected value of correlation coefficient is
r = [n Σxy – Σx Σy] / √{[n Σx² – (Σx)²] [n Σy² – (Σy)²]}
= [25 × 520 – 125 × 100] / √{[25 × 650 – (125)²] [25 × 436 – (100)²]}
= 500 / √(625 × 900) = 500/750 ≈ 0.67
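
The correction of the sums in Example 11 can be sketched in Python: remove the contribution of the wrongly copied pairs, add that of the correct pairs, then apply the formula for r in terms of raw sums. The dictionary keys and variable names below are illustrative.

```python
import math

n = 25
sums = {"x": 125, "x2": 650, "y": 100, "y2": 460, "xy": 508}
wrong_pairs = [(6, 14), (8, 6)]
right_pairs = [(8, 12), (6, 8)]

# Subtract the wrong pairs (sign = -1), then add the right pairs (sign = +1).
for sign, pairs in ((-1, wrong_pairs), (+1, right_pairs)):
    for x, y in pairs:
        sums["x"] += sign * x
        sums["y"] += sign * y
        sums["x2"] += sign * x * x
        sums["y2"] += sign * y * y
        sums["xy"] += sign * x * y

r_corrected = (n * sums["xy"] - sums["x"] * sums["y"]) / math.sqrt(
    (n * sums["x2"] - sums["x"] ** 2) * (n * sums["y2"] - sums["y"] ** 2)
)
print(sums, round(r_corrected, 3))
```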

4.7 Rank Correlation


Arranging or numbering people or items in the order of merit or expertise
in a field, subject or characteristic is known as the ranking. The number
or grade indicating the position of people, items or products is called
their Rank.
If ranks of individuals or items are available for two characteristics then
correlation between these two characteristics is known as rank correlation.

4.8 Spearman’s Rank Correlation Coefficient
Spearman’s rank correlation coefficient measures the strength of association
between two ranked variables.

Working Rule
Step I: Assign ranks to each item of both series, if they are not given.
Step II: Calculate the difference d = R1 – R2 between the rank of each item
in the first series and its rank in the second series, and write it in a
separate column.
Step III: Square the difference d and write d² in a separate column.
Step IV: Apply the formula
r = 1 – 6 Σd² / [n (n² – 1)]
to get the rank correlation coefficient, where n is the number of pairs of
observations.

Example 12: Compute Spearman’s rank correlation coefficient r for the
following data:
Person A B C D E F G H I J
Rank in statistics 9 10 6 5 7 2 4 8 1 3
Rank in income 1 2 3 4 5 6 7 8 9 10

Solution:

Person   Rank in Statistics R1   Rank in Income R2   d = R1 – R2   d²
A        9                       1                   8             64
B        10                      2                   8             64
C        6                       3                   3             9
D        5                       4                   1             1
E        7                       5                   2             4
F        2                       6                   –4            16
G        4                       7                   –3            9
H        8                       8                   0             0
I        1                       9                   –8            64
J        3                       10                  –7            49
Σd² = 280
r = 1 – 6 Σd² / [n (n² – 1)] = 1 – (6 × 280)/(10 × 99) = 1 – 1680/990 ≈ –0.697
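
Spearman's working rule can be sketched in Python for the case where no ranks are repeated. This is a minimal illustration on the ranks of Example 12; the function name `spearman_rho` is an assumption made for the sketch.

```python
def spearman_rho(rank1, rank2):
    """r = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), valid when no ranks are tied."""
    n = len(rank1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank1, rank2))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

rank_statistics = [9, 10, 6, 5, 7, 2, 4, 8, 1, 3]
rank_income = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
rho = spearman_rho(rank_statistics, rank_income)
print(round(rho, 3))
```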

4.9 Equal Ranks


If there is more than one item with the same value, each of the equal
items is assigned the average of the ranks they would otherwise occupy.
For example, suppose an item is repeated at rank 5, i.e., the 5th
and 6th items have the same value; then the common rank assigned
to the 5th and 6th items = (5 + 6)/2 = 5.5, which is the average of 5 and 6. The
next rank assigned will be 7.
If an item is repeated thrice at rank 2, then the common rank assigned
to each value will be (2 + 3 + 4)/3 = 3, which is the arithmetic mean of 2, 3
and 4. The next rank to be assigned would be 5.
To find the rank correlation coefficient when ranks are repeated, a correction
factor is added to Spearman's correlation formula.

4.10 Correction Factor

In the formula of the rank correlation coefficient, add the factor
m(m² – 1)/12 to Σd²,
where m is the number of times an item (say a1) is repeated. This factor
is added for each repeated value in both the series. The total number of
observations is denoted by n.
The modified formula for the rank correlation coefficient is given below.
r = 1 – 6 [Σd² + m1(m1² – 1)/12 + m2(m2² – 1)/12 + …] / [n (n² – 1)]
Example 13: Obtain the rank correlation coefficient for the following data:
x 68 64 75 50 64 80 75 40 55 64
y 62 58 68 45 81 60 68 48 50 70

Solution:
x     y     Rank in x = x′   Rank in y = y′   d = x′ – y′   d²
68    62    4                5                –1            1
64    58    6                7                –1            1
75    68    2.5              3.5              –1            1
50    45    9                10               –1            1
64    81    6                1                5             25
80    60    1                6                –5            25
75    68    2.5              3.5              –1            1
40    48    10               9                1             1
55    50    8                8                0             0
64    70    6                2                4             16
Total Σd² = 72

Repeated rank of x column   No. of times   Repeated rank of y column   No. of times
2.5                         2 = m1         3.5                         2 = m2
6                           3 = m3
Correction = [m1(m1² – 1) + m2(m2² – 1) + m3(m3² – 1)]/12 = (6 + 6 + 24)/12 = 3
With n = 10,
r = 1 – 6 (72 + 3)/(10 × 99) = 1 – 450/990 ≈ 0.545
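
The treatment of equal ranks can be sketched in Python: tied values share the average of the positions they occupy, and the factor m(m² – 1)/12 is added to Σd² for every repeated value. This is a minimal illustration on the data of Example 13; the helper names are hypothetical.

```python
from collections import Counter

def average_ranks(values):
    """Rank 1 goes to the largest value; tied values share the mean of their positions."""
    ordered = sorted(values, reverse=True)
    positions = {}
    for pos, v in enumerate(ordered, start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def correction(values):
    """Sum of m(m^2 - 1)/12 over every value that occurs m > 1 times."""
    return sum(m * (m * m - 1) / 12 for m in Counter(values).values() if m > 1)

def spearman_tied(xs, ys):
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    adjusted = sum_d2 + correction(xs) + correction(ys)
    return 1 - 6 * adjusted / (n * (n * n - 1))

x = [68, 64, 75, 50, 64, 80, 75, 40, 55, 64]
y = [62, 58, 68, 45, 81, 60, 68, 48, 50, 70]
rho = spearman_tied(x, y)
print(average_ranks(x), round(rho, 4))
```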
Example 14: In a college contest of Miss Engineering, ten competitors are
ranked by three judges in the following order:
I Judge 5 3 10 7 2 1 4 10 4 6
II Judge 1 6 5 10 3 2 4 9 7 8
III Judge 6 4 9 8 1 2 3 10 5 7

Use the rank correlation coefficient to determine which pair of judges has the
nearest approach to common taste in beauty.

Solution: Let d12 = r1 – r2, d13 = r1 – r3 and d23 = r2 – r3.
r1    r2    r3    d12   d13   d23   d12²   d13²   d23²
5     1     6     4     –1    –5    16     1      25
3     6     4     –3    –1    2     9      1      4
10    5     9     5     1     –4    25     1      16
7     10    8     –3    –1    2     9      1      4
2     3     1     –1    1     2     1      1      4
1     2     2     –1    –1    0     1      1      0
4     4     3     0     1     1     0      1      1
10    9     10    1     0     –1    1      0      1
4     7     5     –3    –1    2     9      1      4
6     8     7     –2    –1    1     4      1      1
Σd12² = 75, Σd13² = 9, Σd23² = 60
r12 = 1 – (6 × 75)/(10 × 99) = 1 – 450/990 = 0.5455
r13 = 1 – (6 × 9)/(10 × 99) = 1 – 54/990 = 0.9455
r23 = 1 – (6 × 60)/(10 × 99) = 1 – 360/990 = 0.6364

Since r13 = 0.9455 is maximum, the pair of first and third judges has
the nearest approach to the common taste of beauty.

4.11 Exercise
1. Calculate the coefficient of correlation from the data given below.
x 4 6 8 10 12
y 2 3 4 6 10
2. Find the coefficient of correlation between x and y from the table
of their values:
x 1 3 4 6 8 9 11 14
y 1 2 4 4 5 7 8 9
3. Find the coefficient of correlation of the following data taking new
origin of x at 70 and for y at 67:
x 67 68 64 68 72 70 69 70
y 65 66 67 67 68 69 71 71
4. Calculate the coefficient of correlation for the following data:
x 1 2 3 4 5 6 7 8 9
y 9 8 10 12 11 13 14 16 15
5. Calculate Karl Pearson’s coefficient of correlation from the following
data, using 20 as working mean for price and 70 as working mean
for demand.
Price 14 16 17 18 19 20 21 22 23
Demand 84 78 70 75 66 67 62 58 60
6. Calculate the correlation coefficient between the following pairs of
values:
x 100 110 115 116 120 120 125 130 135
y 18 18 17 16 16 16 15 13 10
7. The following marks have been obtained by a class of students in
statistics (out of 100):
Paper I 45 55 56 58 60 65 68 70 75 80 85
Paper II 56 50 48 60 62 64 65 70 74 82 90
Compute the coefficient of correlation for the above data.
8. Calculate the coefficient of correlation between the values of x and
y from the following data:
x 78 89 97 69 59 79 61 61
y 125 137 156 112 107 136 123 108

[Hint: You may use 69 as working mean for x and 112 as that for y].
9. Find the correlation coefficient between the income and expenditure
of a wage earner and comment on the result:
Month Jan. Feb. March April May June July
Income 46 54 56 56 58 60 62
Expenditure 36 40 44 54 42 58 54

10. Two girls Asgari and Mumtaz were asked to rank 7 different types
of lipsticks. The ranks given by them are given below:
Lipsticks A B C D E F G
Asgari 2 1 4 3 5 7 6
Mumtaz 1 3 2 4 5 6 7
Calculate Spearman’s rank correlation coefficient.


L E S S O N

5
Regression

STRUCTURE
5.1 Introduction
5.2 Curve of Regression
5.3 Regression Coefficients
5.4 Properties of Regression Coefficients
5.5 Relation between Regression Analysis and Correlation Analysis
5.6 Exercises

5.1 Introduction
If the scatter diagram indicates some relationship between two variables x and y, then
the dots of the scatter diagram will be concentrated round a curve. This curve is called
the curve of regression.
Regression analysis is the method used for estimating the unknown values of one variable
corresponding to the known values of another variable.

5.2 Curve of Regression


When a number of pairs of two correlated variates are plotted on graph paper, the
points of the scatter diagram are concentrated around a curve. The curve is called the
curve of regression and the relationship is called curvilinear regression. The equation
of the regression curve is called the regression equation.

5.3 Regression Coefficients


Let a number of pairs of two correlated variables be (x1, y1), (x2, y2), (x3, y3)……..
(xn, yn).

Suppose we have to find the unknown value of y for a certain value
of x; then we use the line of regression of y on x, i.e., y = a + bx. Here
y is the dependent variable and x is the independent variable.
If we have to find the unknown value of x for a given value of y, then we
use the line of regression of x on y, i.e., x = a + by. Here x is the dependent
variable and y is the independent variable.
So, we have two lines of regression.
When the curve is a straight line, it is called a line of regression. A line
of regression is the straight line which gives the best fit in the least
squares sense to the given data.
Regression is called non-linear if there exists a relationship (parabola
etc.) other than a straight line between the variables under consideration.
Let the line of regression of y on x be
y = a + bx ... (1)
and let S be the sum of the squares of the vertical distances of the plotted
points from this line; then
S = Σ(y – a – bx)²
According to the principle of least squares, we have to choose a and b
so that S is minimum. The method of least squares gives the conditions
∂S/∂a = 0 and ∂S/∂b = 0 for the minimum value of S, i.e.,
Σy = na + bΣx ... (2)
Σxy = aΣx + bΣx² ... (3)
Dividing (2) by n, we get
ȳ = a + b x̄
where x̄ and ȳ are the means of the x series and y series. This shows that
(x̄, ȳ) lies on the line of regression (1). Shifting the origin to (x̄, ȳ),
equation (3) becomes
Σ(x – x̄)(y – ȳ) = aΣ(x – x̄) + bΣ(x – x̄)²
But Σ(x – x̄) = 0, so
b = Σ(x – x̄)(y – ȳ) / Σ(x – x̄)² = r σy / σx
and the line of regression of y on x is
y – ȳ = r (σy / σx)(x – x̄)

In the regression line of y on x, i.e., y = a + bx, the coefficient ‘b’, which
is the slope of the line, is called the coefficient of regression of y on x
and is denoted by byx.
Similarly, in the regression equation of x on y, i.e., x = c + dy, the
coefficient ‘d’ is called the coefficient of
regression of x on y and is denoted by bxy. These coefficients can be
obtained using the following formulae:
Regression coefficient of y on x is:
byx = r σy / σx = Σ(x – x̄)(y – ȳ) / Σ(x – x̄)²
Regression coefficient of x on y is:
bxy = r σx / σy = Σ(x – x̄)(y – ȳ) / Σ(y – ȳ)²

5.4 Properties of Regression Coefficients
1. The geometric mean between the regression coefficients is the coefficient
of correlation r, i.e., r = ±√(byx · bxy).
(a) If byx is positive, bxy will also be positive.
(b) If byx is negative, bxy will also be negative.
(c) Both regression coefficients must have the same sign. If bxy
and byx are both positive, r will be positive and if bxy and byx
are both negative, r will also be negative.
(d) bxy byx ≤ 1, since bxy byx = r².
2. If one regression coefficient is greater than unity, then the other
regression coefficient must be less than unity.
3. The arithmetic mean of byx and bxy is equal to or greater than the
coefficient of correlation, i.e., (bxy + byx)/2 ≥ r.
4. Regression coefficients are independent of origin but not of scale.
5. If θ is the acute angle between the two regression lines in the case
of two variables x and y, then
tan θ = [(1 – r²)/|r|] · [σx σy / (σx² + σy²)]
6. The geometric mean of the coefficients of regression is the coefficient
of correlation, i.e., r = √(byx × bxy), taken with the common sign of the
two coefficients.
7. The arithmetic mean of the coefficients of regression is greater than
or equal to the coefficient of correlation.
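
Several of these properties can be checked numerically. The following is a minimal sketch on an illustrative data set (the eight pairs also used in Example 4 of this lesson); the function name `regression_coefficients` is an assumption made for the sketch.

```python
import math

def regression_coefficients(xs, ys):
    """Return (b_yx, b_xy, r) computed from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / s_xx, s_xy / s_yy, s_xy / math.sqrt(s_xx * s_yy)

x = [1, 3, 4, 6, 8, 9, 11, 14]
y = [1, 2, 4, 4, 5, 7, 8, 9]
b_yx, b_xy, r = regression_coefficients(x, y)

print(round(b_yx, 4), round(b_xy, 4), round(r, 4))
print("product equals r^2:", math.isclose(b_yx * b_xy, r * r))
print("arithmetic mean >= r:", (b_yx + b_xy) / 2 >= r)
```

Here one coefficient exceeds unity while the other is below it, in line with property 2.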
Example 1: Find the correlation coefficient between x and y, when the
lines of regression are:
2x – 9y + 6 = 0 and x – 2y + 1 = 0.
Solution: Let the line of regression of x on y be 2x – 9y + 6 = 0. Then,
the line of regression of y on x is x – 2y + 1 = 0.
From the first line, x = (9/2)y – 3, so bxy = 9/2.
From the second line, y = (1/2)x + 1/2, so byx = 1/2.
Then r² = bxy · byx = 9/4 > 1, which is not possible.
So our choice of regression lines is incorrect.
∴ The regression line of x on y is x – 2y + 1 = 0, i.e., x = 2y – 1, so bxy = 2.
And the regression line of y on x is 2x – 9y + 6 = 0, i.e., y = (2/9)x + 2/3, so byx = 2/9.
r² = bxy · byx = 2 × (2/9) = 4/9
Hence, the correlation coefficient between x and y is r = 2/3.
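
The trial-and-check reasoning of this example can be sketched in Python: try each assignment of the two lines and keep the one for which the product of the regression coefficients lies between 0 and 1. The function names and the (A, B, C) representation of a line A·x + B·y + C = 0 are assumptions made for the sketch.

```python
import math

def coefficients(line_x_on_y, line_y_on_x):
    """Each line is a tuple (A, B, C) standing for A*x + B*y + C = 0."""
    a1, b1, _ = line_x_on_y
    a2, b2, _ = line_y_on_x
    b_xy = -b1 / a1   # solve the first line for x: coefficient of y
    b_yx = -a2 / b2   # solve the second line for y: coefficient of x
    return b_xy, b_yx

def correlation_from_lines(line1, line2):
    for x_on_y, y_on_x in ((line1, line2), (line2, line1)):
        b_xy, b_yx = coefficients(x_on_y, y_on_x)
        product = b_xy * b_yx
        if 0 <= product <= 1:  # r^2 must lie between 0 and 1
            # r carries the common sign of the regression coefficients
            return math.copysign(math.sqrt(product), b_xy)
    raise ValueError("no valid assignment of the two lines")

r = correlation_from_lines((2, -9, 6), (1, -2, 1))
print(round(r, 4))
```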
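Example 2: The following regression equations were obtained from a
correlation table:
y = 0.516x + 33.73, x = 0.512y + 32.52
Find the value of (a) the correlation coefficient, (b) the mean of x and
the mean of y.
Solution:
y = 0.516x + 33.73 ... (1)
x = 0.512y + 32.52 ... (2)
(a) From (1), byx = 0.516 ... (3)
From (2), bxy = 0.512 ... (4)
Multiplying (3) and (4), we get
r² = byx × bxy = 0.516 × 0.512 = 0.264192, so r = 0.514 (positive, since both
coefficients are positive).
Coefficient of correlation = 0.514.
(b) Since both regression lines pass through (x̄, ȳ),
ȳ = 0.516 x̄ + 33.73 ... (5)
x̄ = 0.512 ȳ + 32.52 ... (6)
Solving (5) and (6), we get x̄ ≈ 67.67 and ȳ ≈ 68.65.

Example 3: The two regression equations of the variables x and y are:
x = 19.13 – 0.87y and y = 11.64 – 0.50x
Find (i) Mean of x’s; (ii) Mean of y’s; (iii) The correlation coefficient
between x and y.
Solution: Both lines pass through (x̄, ȳ), so
x̄ = 19.13 – 0.87 ȳ and ȳ = 11.64 – 0.50 x̄
Solving these, we get (i) x̄ ≈ 15.93 and (ii) ȳ ≈ 3.67.
(iii) Here bxy = –0.87 and byx = –0.50, so r² = 0.87 × 0.50 = 0.435. Since both
coefficients are negative, r = –√0.435 ≈ –0.66.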

Example 4: Find the regression line of y on x for the following data
x 1 3 4 6 8 9 11 14
y 1 2 4 4 5 7 8 9
Estimate the value of y, when x = 10.
Solution:
S. N.   x     y     xy     x²
1       1     1     1      1
2       3     2     6      9
3       4     4     16     16
4       6     4     24     36
5       8     5     40     64
6       9     7     63     81
7       11    8     88     121
8       14    9     126    196
Total   56    40    364    524
Let y = a + bx be the line of regression of y on x, where a and b are
given by the following equations:
Σy = na + bΣx, i.e., 40 = 8a + 56b
Σxy = aΣx + bΣx², i.e., 364 = 56a + 524b
Solving, we get a = 6/11 ≈ 0.545 and b = 7/11 ≈ 0.636.
Hence the regression line of y on x is y = 0.545 + 0.636x.
When x = 10, y = 0.545 + 0.636 × 10 ≈ 6.91.
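
The normal equations Σy = na + bΣx and Σxy = aΣx + bΣx² can be solved with a few lines of Python. This is a minimal sketch on the data of this example; the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Solve the two normal equations for the line y = a + b*x."""
    n = len(xs)
    s_x, s_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys))
    s_x2 = sum(x * x for x in xs)
    b = (n * s_xy - s_x * s_y) / (n * s_x2 - s_x ** 2)
    a = (s_y - b * s_x) / n
    return a, b

x = [1, 3, 4, 6, 8, 9, 11, 14]
y = [1, 2, 4, 4, 5, 7, 8, 9]
a, b = fit_line(x, y)
y_at_10 = a + b * 10   # estimate of y when x = 10
print(round(a, 3), round(b, 3), round(y_at_10, 2))
```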

Example 5: In a study between the amount of rainfall and the quantity
of air pollution removed, the following data was collected
Daily Rainfall in 0.01 cm 4.3 4.5 5.9 5.6 6.1 5.2 3.8 2.1
Pollution Removed (mg/m³) 12.6 12.1 11.6 11.8 11.4 11.8 13.2 14.1
Find the regression line of y on x.
Solution:
S. No.   x (10⁻² cm)   y (mg/m³)   xy      x²
1        4.3           12.6        54.18   18.49
2        4.5           12.1        54.45   20.25
3        5.9           11.6        68.44   34.81
4        5.6           11.8        66.08   31.36
5        6.1           11.4        69.54   37.21
6        5.2           11.8        61.36   27.04
7        3.8           13.2        50.16   14.44
8        2.1           14.1        29.61   4.41
Σx = 37.5, Σy = 98.6, Σxy = 453.82, Σx² = 188.01
Let y = a + bx be the equation of the line of regression of y on x, where
a and b are given by the following equations.
Σy = na + bΣx, i.e., 98.6 = 8a + 37.5b
Σxy = aΣx + bΣx², i.e., 453.82 = 37.5a + 188.01b
Solving, we get b ≈ –0.684 and a ≈ 15.53.
Hence the regression line of y on x is y = 15.53 – 0.684x.

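Example 6: The following data regarding the heights (y) and the weights
(x) of 100 college students is given:
Σx = 15000 Σx² = 2272500
Σy = 6800 Σy² = 463025
Σxy = 1022250
Find the correlation coefficient between height and weight and state the
equation of regression of height on weight.
Solution: Here n = 100, x̄ = 15000/100 = 150 and ȳ = 6800/100 = 68.
r = [n Σxy – Σx Σy] / √{[n Σx² – (Σx)²] [n Σy² – (Σy)²]}
= [102225000 – 102000000] / √(2250000 × 62500) = 225000/375000 = 0.6
byx = [n Σxy – Σx Σy] / [n Σx² – (Σx)²] = 225000/2250000 = 0.1
The equation of regression of height on weight is
y – 68 = 0.1 (x – 150), i.e., y = 0.1x + 53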

5.5 Relation between Regression Analysis and Correlation Analysis
1. Correlation analysis: The relationship between two variables is given by the coefficient of correlation.
   Regression analysis: In this case some points are stepped up and some are stepped down for making an average value.
2. Correlation analysis: It is a measure of direction and degree of relationship between x and y.
   Regression analysis: bxy and byx are mathematical measures of average relationship between the two variables.
3. Correlation analysis: Here, rxy = ryx.
   Regression analysis: bxy ≠ byx.
4. Correlation analysis: It does not reflect upon the nature of variable (dependent or independent variable).
   Regression analysis: It indicates which is dependent variable and which is independent variable.
5. Correlation analysis: It does not imply cause and effect relationship between the variables.
   Regression analysis: It indicates the cause and effect relationship between the variables.
6. Correlation analysis: It is a relative measure and has no units.
   Regression analysis: It is an absolute measure.
7. Correlation analysis: It indicates the degree of association.
   Regression analysis: It is used to forecast the nature of the dependent variable when the value of independent variable is given.
8. Correlation analysis: It is confined to the study of linear relationship.
   Regression analysis: It has not only application of linear relationship but non-linear relationship also.

5.6 Exercises
1. Find the regression line of y on x for the data:
x 1 4 2 3 5
y 3 1 2 5 4
2. Find the equation of regression lines for the following values of x
& y.
x 2 4 6 8 10
y 6 5 4 3 2
3. Compute the regression line of y on x for the following data:
x 1 2 3 4 5 6 7 8 9
y 9 8 10 12 11 13 14 16 15
4. Find the regression lines of x on y and y on x from the given data:
x 1 2 3 4 5 6 7 8 9 10
y 10 12 16 28 25 36 41 49 40 50
5. Find the equations to the lines of regression and the coefficient of
correlation for the following data:
x 2 4 5 6 8 11
y 18 12 10 8 7 5
6. For the following data, find the equation of line of regression:
x 10 12 13 16 17 20 25
y 10 22 24 27 29 33 37
7. Find the regression line of y on x if
x 40 70 50 60 80 50 90 40 60 60
y 2.5 6.0 4.5 5.0 4.5 2.0 5.5 3.0 4.5 3.0
8. The following marks have been obtained by a class of students in
statistics
Paper I 80 45 55 56 58 60 65 68 70 75 85
Paper II 81 56 50 48 60 62 64 65 70 74 90
Compute the coefficient of correlation for the above data. Find the
lines of regression.
9. The following results were obtained from records of age (x) and
systolic blood pressure (y) of a group of 10 men:
x y
Mean 53 142
Variance 130 165
and Σ(x − x̄)(y − ȳ) = 1220.
Find the appropriate regression equation and use it to estimate the
blood pressure of a man whose age is 45.
10. The following results were obtained from lineups in Applied
Mechanics and Engineering Mathematics in an examination:
Applied Mechanics (x) Engg. Maths. (y)
Mean 47.5 39.5
S.D. 16.8 10.8
Find both the regression equations. Also estimate the value of y for
x = 30.

11. The regression equations are: 7x – 16y + 9 = 0, 5y – 4x – 3 = 0.
Find x̄, ȳ and r.
12. The following regression equations and variances are obtained from
a correlation table:
20x – 9y – 107 = 0, 4x – 5y + 33 = 0, variance of x = 9.
Find (i) the mean values of x and y, (ii) the standard deviation of y
13. Two random variables have the least square regression lines with
equations 3x + 2y = 26 and 6x + y = 31. Find mean values and
correlation coefficient between x and y.
14. Two lines of regression are given by x + 2y = 5 & 2x + 3y = 8.
Calculate:
(i) mean values of x and y
(ii) the coefficient of correlation
(iii) the ratio of the regression coefficients
15. Find the correlation coefficient and regression lines for the data:
x 1 2 3 4 5
y 2 5 3 8 7


L E S S O N

6
Multiple Regression and
Correlation

STRUCTURE
6.1 Multiple Regression
6.2 Multiple Correlation
6.3 Formulae for the Calculation of Multiple Correlation Coefficient
6.4 Properties of Multiple Correlation
6.5 Multiple Regression Analysis
6.6 Exercise

6.1 Multiple Regression


We know that the production of wheat depends not only on the amount of rainfall x1 but
also on the fertilizers x2, pesticides x3, quality of seeds x4, quality of soil x5, etc. In multiple
regression the dependent variable is a function of more than one independent variable.
Multiple linear regression is a linear relationship between y and x1, x2, x3, ...:
y = a0 + a1 x1 + a2 x2 + a3 x3 + ...
In multiple non-linear regression, the equation is not linear; for example:
y = a0 + a1 xα + a2 xβ + a3 xγ +......
Non-Linear Relationship
Example 1: Fit a non-linear relationship between the following data:
x 1 2 3 4
y 1.7 1.8 2.3 3.2

Solution:
Here, we have
x      y     x²   xy     x³    x⁴    x²y
1      1.7   1    1.7    1     1     1.7
2      1.8   4    3.6    8     16    7.2
3      2.3   9    6.9    27    81    20.7
4      3.2   16   12.8   64    256   51.2
Total  10    9.0  30     25.0  100   354   80.8
(Totals: Σx = 10, Σy = 9.0, Σx² = 30, Σxy = 25.0, Σx³ = 100, Σx⁴ = 354, Σx²y = 80.8)
Let the non-linear relationship be
y = a0 + a1x + a2x²
Normal equations are
Σy = na0 + a1Σx + a2Σx2
Σxy = a0Σx + a1Σx2 + a2Σx3
Σx2y = a0Σx2 + a1Σx3 + a2Σx4
Substituting the values of Σy, n, Σx, Σx2 etc. in these equations, we get
9 = 4a0 + 10a1 + 30a2
25 = 10a0 + 30a1 + 100a2
80.8 = 30a0 + 100a1 + 354a2
Solving these equations, we get
a0 = 2, a1 = – 0.5 and a2 = 0.2
Then the non-linear relationship is
y = 2 – 0.5x + 0.2x2
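The three normal equations of Example 1 can also be solved by machine. The following is a minimal sketch, assuming a small hand-written Gaussian elimination routine (the names solve and fit_quadratic are illustrative and not part of this lesson). It builds the sums Σx, Σx², ..., Σx²y from the data and solves for a0, a1 and a2.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):          # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_quadratic(xs, ys):
    """Least-squares fit of y = a0 + a1*x + a2*x^2 via the normal equations."""
    S = lambda p: sum(x ** p for x in xs)                       # sum of x^p
    T = lambda p: sum((x ** p) * y for x, y in zip(xs, ys))     # sum of x^p * y
    A = [[len(xs), S(1), S(2)],
         [S(1), S(2), S(3)],
         [S(2), S(3), S(4)]]
    return solve(A, [T(0), T(1), T(2)])

# Data of Example 1
a0, a1, a2 = fit_quadratic([1, 2, 3, 4], [1.7, 1.8, 2.3, 3.2])
print(round(a0, 4), round(a1, 4), round(a2, 4))
```

For this data the fitted parabola passes through all four points, so the residuals are zero up to rounding.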

6.2 Multiple Correlation


In the earlier chapter we studied correlation between two variables only.
Very often, however, we need the correlation among more than two variables.
In multiple correlation we study three or more variables at a time, so that
the combined effect of all the independent variables on a dependent variable is
studied. Let the three variables be x1, x2 and x3.
R1.23 = Multiple correlation coefficient with x1 as dependent variable and
x2 and x3 as independent variables.
R2.13 = Multiple correlation coefficient with x2 as dependent variable and
x1 and x3 as independent variables.

R3.12 = Multiple correlation coefficient with x3 as dependent variable and
x1 and x2 as independent variables.

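6.3 Formulae for the Calculation of Multiple Correlation Coefficient

\[ R_{1.23} = \sqrt{\frac{r_{12}^2 + r_{13}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{23}^2}} \]

\[ R_{2.13} = \sqrt{\frac{r_{12}^2 + r_{23}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{13}^2}} \]

\[ R_{3.12} = \sqrt{\frac{r_{13}^2 + r_{23}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{12}^2}} \]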

6.4 Properties of Multiple Correlation


1. Its value lies between 0 and 1.
2. If R1.23 = 0, then r12 = 0 and r13 = 0
3. R1.23 ≥ r12 and R1.23 ≥ r13
4. R1.23 = R1.32

Coefficient of Multiple Correlation between Four Variables:

\[ R_{1.234} = \sqrt{1 - (1 - r_{14}^2)(1 - r_{13.4}^2)(1 - r_{12.34}^2)} \]
Example 2: The simple correlation coefficients between the quantity of
production of wheat (x1), fertilizer (x2) and rainfall (x3) are r12 = 0.4,
r13 = 0.5 and r23 = 0.6.
Find the coefficient of multiple correlation R1.23.
Solution: Here, we have
r12 = 0.4, r13 = 0.5 and r23 = 0.6
We know that
\[
\begin{aligned}
R_{1.23} &= \sqrt{\frac{r_{12}^2 + r_{13}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{23}^2}}
          = \sqrt{\frac{(0.4)^2 + (0.5)^2 - 2(0.4)(0.5)(0.6)}{1 - (0.6)^2}} \\
         &= \sqrt{\frac{0.16 + 0.25 - 0.24}{1 - 0.36}}
          = \sqrt{\frac{0.17}{0.64}} = \sqrt{0.2656} = 0.515
\end{aligned}
\]
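The formula of Section 6.3 is easy to evaluate in code. The sketch below is illustrative only: the function name multiple_r and its argument names are assumptions. Its first two arguments are the correlations of the dependent variable with each independent variable, and the third is the correlation between the two independent variables.

```python
from math import sqrt

def multiple_r(r_dep_a, r_dep_b, r_ab):
    """Multiple correlation of a dependent variable on two independent
    variables a and b, from the three simple correlation coefficients."""
    num = r_dep_a ** 2 + r_dep_b ** 2 - 2 * r_dep_a * r_dep_b * r_ab
    return sqrt(num / (1 - r_ab ** 2))

# Example 2: R1.23 with r12 = 0.4, r13 = 0.5, r23 = 0.6
r12, r13, r23 = 0.4, 0.5, 0.6
R1_23 = multiple_r(r12, r13, r23)
R1_32 = multiple_r(r13, r12, r23)   # same value: R1.23 = R1.32
print(round(R1_23, 3))
```

For R2.13 the call would be multiple_r(r12, r23, r13), and for R3.12 it would be multiple_r(r13, r23, r12).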
Example 3: If r12 = 0.6, r23 = 0.35 and r31 = 0.4 then find R3.12.
Solution: We have r12 = 0.6, r23 = 0.35, and r31 = 0.4
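We know that
\[
\begin{aligned}
R_{3.12} &= \sqrt{\frac{r_{31}^2 + r_{32}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{12}^2}}
          = \sqrt{\frac{(0.4)^2 + (0.35)^2 - 2(0.6)(0.4)(0.35)}{1 - (0.6)^2}} \\
         &= \sqrt{\frac{0.16 + 0.1225 - 0.168}{1 - 0.36}}
          = \sqrt{\frac{0.1145}{0.64}} = \sqrt{0.1789} = 0.423
\end{aligned}
\]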
Example 4: If r12 = 0.25, r13 = 0.35 and r23 = 0.45 then find R2.13.
Solution: Here, we have r12 = 0.25, r13 = 0.35, and r23 = 0.45
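We know that,
\[
\begin{aligned}
R_{2.13} &= \sqrt{\frac{r_{21}^2 + r_{23}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{13}^2}}
          = \sqrt{\frac{(0.25)^2 + (0.45)^2 - 2(0.25)(0.35)(0.45)}{1 - (0.35)^2}} \\
         &= \sqrt{\frac{0.0625 + 0.2025 - 0.07875}{1 - 0.1225}}
          = \sqrt{\frac{0.18625}{0.8775}} = \sqrt{0.2123} = 0.461
\end{aligned}
\]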
Example 5: Given the following data:
x1 3 5 6 8 12 14
x2 16 10 7 4 3 2
x3 90 72 54 42 30 12
Compute the coefficient of linear multiple correlation of x3 on x1 and x2.
Solution:
We have to compute the values of r13, r23 and r12.
Here x̄1 = 8, x̄2 = 7 and x̄3 = 50. Let
X1 = x1 – x̄1, X2 = x2 – x̄2, X3 = x3 – x̄3
x1   X1   X1²   x2   X2   X2²   x3   X3    X3²    X1X2   X1X3   X2X3
3    –5   25    16   +9   81    90   +40   1600   –45    –200   360
5    –3   9     10   +3   9     72   +22   484    –9     –66    66
6    –2   4     7    0    0     54   +4    16     0      –8     0
8    0    0     4    –3   9     42   –8    64     0      0      24
12   +4   16    3    –4   16    30   –20   400    –16    –80    80
14   +6   36    2    –5   25    12   –38   1444   –30    –228   190
ΣX1² = 90, ΣX2² = 140, ΣX3² = 4008, ΣX1X2 = –100, ΣX1X3 = –582, ΣX2X3 = 720

Now
\[ r_{12} = \frac{\Sigma X_1 X_2}{\sqrt{\Sigma X_1^2 \times \Sigma X_2^2}} = \frac{-100}{\sqrt{90 \times 140}} = -0.89 \]
Also,
\[ r_{13} = \frac{\Sigma X_1 X_3}{\sqrt{\Sigma X_1^2 \times \Sigma X_3^2}} = \frac{-582}{\sqrt{90 \times 4008}} = -0.97 \]
Again
\[ r_{23} = \frac{\Sigma X_2 X_3}{\sqrt{\Sigma X_2^2 \times \Sigma X_3^2}} = \frac{720}{\sqrt{140 \times 4008}} = 0.96 \]

We know that
\[ R_{3.12} = \sqrt{\frac{r_{13}^2 + r_{23}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{12}^2}} \qquad (1) \]
Substituting the values of r12, r13 and r23 in (1), we get
\[
\begin{aligned}
R_{3.12} &= \sqrt{\frac{(-0.97)^2 + (0.96)^2 - 2(-0.89)(-0.97)(0.96)}{1 - (-0.89)^2}}
          = \sqrt{\frac{0.9409 + 0.9216 - 1.66}{1 - 0.7921}} \\
         &= \sqrt{\frac{0.2025}{0.2079}} = \sqrt{0.9740} = 0.987
\end{aligned}
\]
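The whole of Example 5 can be reproduced from the raw data. The sketch below is a hedged illustration (the helper name corr is an assumption). It computes r12, r13 and r23 from deviations about the means and then applies the formula for R3.12. Because it does not round the correlations to two decimals, its value of R3.12 comes out slightly larger than the hand-computed 0.987, at about 0.99.

```python
from math import sqrt

def corr(u, v):
    """Simple correlation coefficient from deviations about the means."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    du = [a - mu for a in u]
    dv = [b - mv for b in v]
    num = sum(a * b for a, b in zip(du, dv))
    return num / sqrt(sum(a * a for a in du) * sum(b * b for b in dv))

# Data of Example 5
x1 = [3, 5, 6, 8, 12, 14]
x2 = [16, 10, 7, 4, 3, 2]
x3 = [90, 72, 54, 42, 30, 12]

r12, r13, r23 = corr(x1, x2), corr(x1, x3), corr(x2, x3)
R3_12 = sqrt((r13 ** 2 + r23 ** 2 - 2 * r12 * r13 * r23) / (1 - r12 ** 2))
print(round(r12, 2), round(r13, 2), round(r23, 2), round(R3_12, 2))
```

The small gap between the two answers shows how sensitive the formula is to rounding when the correlations are close to ±1.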

6.5 Multiple Regression Analysis
We have considered two types of regression equations one of x on y and
the other of y on x.
Multiple regression analysis represents an extension of two variables
to three or more variables.
We take x1 as dependent variable and x2 and x3 as independent variables.
 (1)
Normal equation of multiple regression equation is
(2)
Putting the value of X1 from (1) in (2), we get

Differentiating partially the above equation w.r.t. a, b12.3 and b13.2, we get


(3)

(4)

(5)

Equation (1) can be rewritten as

Since Σx1 = Σ(X1 − X̄1) = 0, Σx2 = Σ(X2 − X̄2) = 0, Σx3 = Σ(X3 − X̄3) = 0
[Sum of the deviations from the mean = 0]
Therefore from (3), a = 0
Putting the value of a = 0 in (4) and (5), we get

Σx1x2 − b12.3 Σx2² − b13.2 Σx2x3 = 0 (6)

Σx1x3 − b12.3 Σx2x3 − b13.2 Σx3² = 0 (7)

On solving (6) and (7), we get the values of b12.3, and b13.2.
On putting the values of a, b12.3 and b13.2 in (1), we get the required
regression equation.

Similarly
x2 = b21.3x1 + b23.1x3
x3 = b31.2x1 + b32.1x2
Second Method:
On putting the values of Σx1 x2 etc. in (6) and (7), we get

Where ∆ij is the co-factor of the element in the ith row and jth column
in the determinant.

Hence, on substituting the values of b12.3 and b13.2 the equation to the
regression plane of x1 on x2 and x3 is

(12)

The above equation can also be written as:

Similarly,

Standard Error of the estimate for multiple regression and multiple correlation
The standard error of the estimate of X1 on X2 and X3 is
\[ S_{1.23} = \sqrt{\frac{\Sigma (X_1 - Y_1)^2}{N - 3}} \]
where S1.23 is the standard error of the estimate of X1 on X2 and X3, X1 is the
original value and Y1 is the estimated value on the basis of the
regression equation.
Standard error in terms of the correlation coefficients
\[ S_{1.23} = \sigma_1 \sqrt{\frac{1 - r_{12}^2 - r_{13}^2 - r_{23}^2 + 2 r_{12} r_{13} r_{23}}{1 - r_{23}^2}} \]
Example 6: If r12 = 0.6, r13 = 0.8, r23 = 0.3 and σ1 = 8, σ2 = 9, σ3 = 5
Determine regression equation of x1 on x2 and x3.
Solution:
Let x1, x2 and x3 be respective deviations from means of X1, X2 and X3
series.
Regression equation of X1 on X2 and X3 is

x1 = b12.3 x2 + b13.2 x3 (1)
Now
\[ b_{12.3} = \frac{\sigma_1}{\sigma_2} \cdot \frac{r_{12} - r_{13} r_{23}}{1 - r_{23}^2} = \frac{8}{9} \cdot \frac{0.6 - (0.8)(0.3)}{1 - (0.3)^2} = 0.35 \]
\[ b_{13.2} = \frac{\sigma_1}{\sigma_3} \cdot \frac{r_{13} - r_{12} r_{23}}{1 - r_{23}^2} = \frac{8}{5} \cdot \frac{0.8 - (0.6)(0.3)}{1 - (0.3)^2} = 1.09 \]
Substituting the values of b12.3 and b13.2 in the equation (1), we get
x1 = 0.35 x2 + 1.09 x3
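The coefficients of Example 6 can be checked with a short routine. This is a sketch under the assumption that the partial regression coefficients take the usual form b12.3 = (σ1/σ2)(r12 – r13 r23)/(1 – r23²) and b13.2 = (σ1/σ3)(r13 – r12 r23)/(1 – r23²), which reproduces the answer quoted above. The function name is illustrative.

```python
def regression_on_two(r12, r13, r23, s1, s2, s3):
    """Coefficients of x1 = b12.3*x2 + b13.2*x3 (variables measured
    as deviations from their means)."""
    b12_3 = (s1 / s2) * (r12 - r13 * r23) / (1 - r23 ** 2)
    b13_2 = (s1 / s3) * (r13 - r12 * r23) / (1 - r23 ** 2)
    return b12_3, b13_2

# Example 6: r12 = 0.6, r13 = 0.8, r23 = 0.3, sigma1 = 8, sigma2 = 9, sigma3 = 5
b12_3, b13_2 = regression_on_two(0.6, 0.8, 0.3, 8, 9, 5)
print(round(b12_3, 2), round(b13_2, 2))
```

The same function gives the regression of x2 on x1 and x3, or of x3 on x1 and x2, if the subscripts of the arguments are permuted so that the dependent variable comes first.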
Example 7: If r12 = 0.75, r13 = 0.65, r23 = 0.55, and σ1 = 9, σ2 = 7, σ3 = 4.
Determine the regression equation of X2 on X1 and X3.
Solution:
The regression equation of X2 on X1 and X3 is given by
x2 = b21.3 x1 + b23.1 x3(1)
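We know that
\[ b_{21.3} = \frac{\sigma_2}{\sigma_1} \cdot \frac{r_{12} - r_{13} r_{23}}{1 - r_{13}^2} = \frac{7}{9} \cdot \frac{0.75 - (0.65)(0.55)}{1 - (0.65)^2} = 0.5286 \]
and
\[ b_{23.1} = \frac{\sigma_2}{\sigma_3} \cdot \frac{r_{23} - r_{12} r_{13}}{1 - r_{13}^2} = \frac{7}{4} \cdot \frac{0.55 - (0.75)(0.65)}{1 - (0.65)^2} = 0.1894 \]
Substituting the values of b21.3 and b23.1 in (1), we get
x2 = 0.5286x1 + 0.1894x3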
Example 8: If σ1 = 3, σ2 = 2.5, σ3 = 3.5 and r12 = 0.3, r13 = 0.5, r23 =
0.4. Find the regression equation of x3 on x1 and x2
Solution: Here, we have
σ1 = 3, σ2 = 2.5, σ3 = 3.5
r12 = 0.3, r13 = 0.5, r23 = 0.4
The regression equation of x3 on x1 and x2 is
x3 = b31.2x1 + b32.1x2 (1)

We know that
\[ b_{31.2} = \frac{\sigma_3}{\sigma_1} \cdot \frac{r_{13} - r_{23} r_{12}}{1 - r_{12}^2} = \frac{3.5}{3} \cdot \frac{0.5 - (0.4)(0.3)}{1 - (0.3)^2} = 0.487 \]
\[ b_{32.1} = \frac{\sigma_3}{\sigma_2} \cdot \frac{r_{23} - r_{13} r_{12}}{1 - r_{12}^2} = \frac{3.5}{2.5} \cdot \frac{0.4 - (0.5)(0.3)}{1 - (0.3)^2} = 0.385 \]
Substituting the values of b31.2 and b32.1 in (1), we get
x3 = 0.487x1 + 0.385x2
Example 9: If r12 = 0.60, r13 = 0.70, r23 = 0.65, and σ1 = 1.0, find S1.23.
Solution: Here, we have
r12 = 0.60, r13 = 0.70, r23 = 0.65, and σ1 = 1.0

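We know that
\[
\begin{aligned}
S_{1.23} &= \sigma_1 \sqrt{\frac{1 - r_{12}^2 - r_{13}^2 - r_{23}^2 + 2 r_{12} r_{13} r_{23}}{1 - r_{23}^2}}
          = 1.0 \sqrt{\frac{1 - 0.36 - 0.49 - 0.4225 + 0.546}{1 - 0.4225}} \\
         &= \sqrt{\frac{0.2735}{0.5775}} = \sqrt{0.4736} = 0.688
\end{aligned}
\]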
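The standard error formula in terms of the correlation coefficients can be evaluated directly for Example 9. The function name below is an illustrative assumption, and the code simply evaluates the expression for S1.23 given earlier in this section.

```python
from math import sqrt

def standard_error_1_23(r12, r13, r23, sigma1):
    """Standard error of the estimate of x1 on x2 and x3."""
    num = 1 - r12 ** 2 - r13 ** 2 - r23 ** 2 + 2 * r12 * r13 * r23
    return sigma1 * sqrt(num / (1 - r23 ** 2))

# Example 9: r12 = 0.60, r13 = 0.70, r23 = 0.65, sigma1 = 1.0
S1_23 = standard_error_1_23(0.60, 0.70, 0.65, 1.0)
print(round(S1_23, 3))
```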

6.6 Exercise
1. If r12 = 0.59, r13 = 0.46, and r23 = 0.77 then find R1.23.
2. If r12 = 0.5, r13 = 0.6, and r23 = 0.7 then find R2.13.
3. If r12 = 0.6, r13 = 0.7, and r23 = 0.65 then find R3.12.
4. If r12 = 0.8, r13 = – 0.5, and r23 = 0.40 then prove that R1.23 = R1.32

5. If r12 = 0.45, r13 = 0.32, and r23 = 0.61 then find R1.23
6. If r12 = 0.8, r13 = 0.7, r23 = 0.6 and σ1 = 10, σ2 = 8, σ3 = 5, then
find the regression equation of x1 on x2 and x3.
7. If σ1 = 3, σ2 = 4, σ3 = 5 and r12 = 0.7, r23 = 0.4, r31 = 0.6, then
determine the regression equation of x1 on x2 and x3.
8. If r12 = 0.28, r23 = 0.49, r31 = 0.51 and σ1 = 2.7, σ2 = 2.4, σ3 = 2.7,
then find the regression equation of x3 on x1 and x2.
9. Find the multiple linear regression equation of x1 and x2 on x3 from
the data given below:
x1 2 4 6 8
x2 3 5 7 9
x3 4 6 8 10


L E S S O N

7
Curve Fitting: Principle of
Least Squares
STRUCTURE
7.1 Introduction
7.2 Principle of Least Squares
7.3 Method of Least Squares (Fitting of Straight Lines)
7.4 Fitting a Curve of Type Y = A/X + B√X
7.5 To Fit a Parabola (Up to Second Degree)
7.6 Change of Scale in Second Degree Equations
7.7 To Fit an Exponential Curve (Y = ABX + C)
7.8 Change of Origin and Scale
7.9 Exercise

7.1 Introduction
Consider a number of pairs of x and y i.e.; (x1, y1), (x2, y2), (x3, y3) … and so on. To get
a relationship between x and y, plot these points on a graph paper. The diagram obtained
is known as scatter diagram or dot diagram. This diagram shows a rough relationship
between two variables x and y. An exact relationship between two variables is obtained
by curve fitting. Here we get algebraic equation of the curve.

7.2 Principle of Least Squares


The method of least squares is probably the most systematic procedure to fit a unique curve
through the given points. Let y = f(x) be the equation of the curve to be fitted to the given data
(observed or experimental) points (x1, y1), (x2, y2), ..., (xn, yn). At x = x1, the observed (or
experimental) value of the ordinate is y1 = P1M1 and the corresponding value on the fitting curve is
N1M1 = f(x1). The difference of the observed and the expected (theoretical) value is
P1M1 – N1M1 = P1N1 = e1
This difference is called the error.
e1 = y1 – f(x1)
Similarly, e2 = y2 – f(x2)
e3 = y3 – f(x3)
.........................
en = yn – f(xn)

Some of the errors e1, e2, e3, ..., en will be positive and others negative.
If the errors are simply added to find the total error, positive and
negative errors may cancel, and in some cases the sum of all the errors
may be zero, which leads to a false result. To avoid such a
situation, we make all the errors positive by squaring:
S = e1² + e2² + e3² + ... + en²
The curve of the best fit is that for which the sum of the squares of
errors (S) is minimum.
This is called the principle of least squares.
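The reason for squaring can be seen with a small numerical experiment. The data and the two candidate curves below are chosen only for illustration (they are assumptions, not part of this section): a horizontal line through the mean of y, and a sloping line. Both have errors that add up to zero, yet their sums of squared errors differ widely.

```python
# Illustrative data (hypothetical for this sketch)
xs = [0, 1, 2, 3, 4]
ys = [1.0, 1.8, 3.3, 4.5, 4.3]

mean_y = sum(ys) / len(ys)

def flat(x):
    """Horizontal line through the mean of y."""
    return mean_y

def line(x):
    """A sloping candidate line."""
    return 1.12 + 0.93 * x

def sum_errors(f):
    """Plain sum of the errors e_i = y_i - f(x_i)."""
    return sum(y - f(x) for x, y in zip(xs, ys))

def sum_squares(f):
    """S = sum of the squared errors."""
    return sum((y - f(x)) ** 2 for x, y in zip(xs, ys))

for name, f in (("flat", flat), ("line", line)):
    print(name, round(sum_errors(f), 6), round(sum_squares(f), 4))
```

The plain sum of errors cannot tell the two candidates apart, while S clearly prefers the sloping line.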

7.3 Method of Least Squares (Fitting of Straight Lines)


Let
y = a + bx (1)
be the straight line to be fitted to the given data points (x1, y1), (x2, y2), ..., (xn, yn).
Let yt1 be the theoretical ordinate for x1, so that for the point P(x1, y1)
PM = y1, NM = yt1
PN = PM – NM
Then
e1 = y1 – yt1 (PN = e1)
e1 = y1 – (a + bx1) (yt1 = a + bx1)
On squaring, we get e1² = (y1 – a – bx1)²
S = e1² + e2² + e3² + ... + en² = Σ(yi – a – bxi)²
For S to be minimum,
\[ \frac{\partial S}{\partial a} = \Sigma\, 2 (y_i - a - b x_i)(-1) = 0 \quad \text{or} \quad \Sigma (y - a - bx) = 0 \qquad (2) \]
[To generalize, yi is written as y and xi is written as x]
\[ \frac{\partial S}{\partial b} = \Sigma\, 2 (y_i - a - b x_i)(-x_i) = 0 \quad \text{or} \quad \Sigma (xy - ax - bx^2) = 0 \qquad (3) \]
On simplification equations (2) and (3) become
Σy = na + bΣx (4)
Σxy = aΣx + bΣx² (5)
The equations (4) and (5) are known as Normal equations.
On solving equations (4) and (5), we get the values of a and b.
On putting the values of a and b in (1), we get the equation of required
line.
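To Remember: The normal equations (4) and (5) are for
y = a + bx
Equation (4) is obtained by putting Σ before all the terms on both sides
of (1),
i.e., Σy = Σa + Σbx
Σy = na + bΣx
Equation (5) is obtained on multiplying (1) by x and putting Σ before
all the terms on both sides of the obtained equation,
i.e., Σxy = Σax + Σbx²
Σxy = aΣx + bΣx²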

Example 1: Fit a straight line to the following data:


x 0 1 2 3 4
y 1 1.8 3.3 4.5 4.3
Solution:
y = a + bx (1)
x y xy x²
0 1.0 0 0
1 1.8 1.8 1
2 3.3 6.6 4
3 4.5 13.5 9
4 4.3 17.2 16
Σx = 10 Σy = 14.9 Σxy = 39.1 Σx2 = 30
Here,
n = 5
Normal equations are
Σy = na + bΣx (2)
Σxy = aΣx + bΣx2(3)
On putting the values of Σx, Σy, Σxy, Σx2 and n in (2) and (3),
we have
14.9 = 5a + 10b (4)
39.1 = 10a + 30b (5)
On solving (4) and (5), we get
a = 1.12, b = 0.93
On substituting the values of a and b in (1), we get
y = 1.12 + 0.93x
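As a check on Example 1, the sketch below solves the two normal equations by determinants and then confirms that the fitted line satisfies conditions (2) and (3) of Section 7.3, namely Σ(y – a – bx) = 0 and Σx(y – a – bx) = 0. The variable names are illustrative assumptions.

```python
# Data of Example 1
xs = [0, 1, 2, 3, 4]
ys = [1.0, 1.8, 3.3, 4.5, 4.3]

n = len(xs)
sx, sy = sum(xs), sum(ys)
sxy = sum(x * y for x, y in zip(xs, ys))
sxx = sum(x * x for x in xs)

# Solve  sy = n*a + b*sx  and  sxy = a*sx + b*sxx  by determinants
det = n * sxx - sx * sx
a = (sy * sxx - sx * sxy) / det
b = (n * sxy - sx * sy) / det

residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
check_2 = sum(residuals)                               # condition (2)
check_3 = sum(x * e for x, e in zip(xs, residuals))    # condition (3)
print(round(a, 2), round(b, 2), round(check_2, 9), round(check_3, 9))
```

Both checks vanish up to rounding, which is exactly what the normal equations demand.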
Example 2: By the method of least squares, find the straight line that
best fits the following data:
x 1 2 3 4 5
y 14 27 40 55 68
Solution: Let the equation of the straight line best fit be
y = a + bx (1)

x y xy x²
1 14 14 1
2 27 54 4
3 40 120 9
4 55 220 16
5 68 340 25
Σx = 15 Σy = 204 Σxy = 748 Σx2 = 55
Here,
n = 5
Normal equations are
Σy = na + bΣx (2)
Σxy = aΣx + bΣx2(3)
On putting the values of Σx, Σy, Σxy and Σx2 in (2) and (3), we have
204 = 5a + 15b(4)
748 = 15a + 55b(5)
On solving equations (4) and (5), we get
a = 0, b = 13.6
On substituting the values of a and b in (1), we get
y = 13.6 x
Example 3: Find the least squares fit of the form y = a + bx2 to the
following data:
x – 1 0 1 2
y 2 5 3 0
Solution:
y = a + bx2(1)
Let
x2 = z, y = a + bz (2)
x y z = x2 yz z2
–1 2 1 2 1
0 5 0 0 0
1 3 1 3 1
2 0 4 0 16
Σy = 10 Σz = 6 Σyz = 5 Σz2 = 18

Here,
n = 4
Normal equations are
Σy = na + bΣz (3)
Σyz = aΣz + bΣz² (4)
On putting the values of Σy, Σz, Σyz, Σz² in (3) and (4), we have
10 = 4a + 6b (5)
5 = 6a + 18b (6)
On solving equations (5) and (6), we get a = 25/6, b = –10/9
On substituting the values of a, b in (2), we obtain
y = 25/6 – (10/9)z, i.e., y = 25/6 – (10/9)x²

Example 4: Find the least squares values of a and b so that y = a + bx fits the
data given in the table.
x 0 1 2 3 4
y 1.0 2.9 4.8 6.7 8.6
Solution:
y = a + bx (1)
x y xy x2
0 1.0 0 0
1 2.9 2.9 1
2 4.8 9.6 4
3 6.7 20.1 9
4 8.6 34.4 16
Σx = 10 Σy = 24 Σxy = 67.0 Σx² = 30

Normal equations are


Σy = na + bΣx (2)
Σxy = aΣx + bΣx2(3)

On putting the values of Σx, Σy, Σxy, Σx² in (2) and (3), we have
24 = 5a + 10 b (4)
67 = 10a + 30b (5)
On solving (4) and (5), we get
a = 1, b = 1.9
On substituting the values of a and b in (1), we get
y = 1 + 1.9x
Example 5: Fit a curve y = ax2 + b for the following data:
x 12 16 20 22 24 26 30
y 4.44 7.5 4.9 10.76 10.76 11.76 14.0
Solution: Given parabola is
y = ax2 + b (1)
Putting x2 = X in (1), we get
y = aX + b (2)
Normal equations are
Σy = aΣX + nb (3)
ΣXy = aΣX2 + bΣX (4)
n x y x2 = X x4 = X2 Xy
1 12 4.44 144 20736 639.36
2 16 7.5 256 65536 1920
3 20 4.9 400 160000 1960
4 22 10.76 484 234256 5207.84
5 24 10.76 576 331776 6197.76
6 26 11.76 676 456976 7949.76
7 30 14.0 900 810000 12600.00
Total Σy = 64.12 ΣX = 3436 ΣX² = 2079280 ΣXy = 37562.72
Putting the values of Σy, n, ΣX in (3), we get
64.12 = 7b + 3436a ⇒ 3436a + 7b – 64.12 = 0 (5)
Putting the values of ΣXy, ΣX² and ΣX in (4), we get
37562.72 = 2079280a + 3436b ⇒ 2079280a + 3436b – 37562.72 = 0 (6)
Solving (5) and (6) by cross multiplication method, we get
a = 0.0155, b = 1.5489
Putting the values of a and b in (2), we get
y = 0.0155X + 1.5489
⇒ y = 0.0155x² + 1.5489 [∵ X = x²]
Example 6: Fit a curve of the type xy = ax + b to the following data
x 1 3 5 7 9 10
y 36 29 28 26 24 15
Solution: The equation of the given curve is
xy = ax + b, i.e., y = a + b/x
Normal equations are
Σy = na + bΣ(1/x)
Σ(y/x) = aΣ(1/x) + bΣ(1/x²)
Here n = 6
x     y     1/x      1/x²     y/x
1     36    1        1        36
3     29    0.3333   0.1111   9.6667
5     28    0.2000   0.0400   5.6000
7     26    0.1429   0.0204   3.7143
9     24    0.1111   0.0123   2.6667
10    15    0.1000   0.0100   1.5
Total Σy = 158 Σ(1/x) = 1.8873 Σ(1/x²) = 1.1938 Σ(y/x) = 59.1477
On putting these values in the above normal equations, we have
158 = 6a + 1.8873b (1)
59.1477 = 1.8873a + 1.1938b (2)
On solving the equations (1) and (2), we get
a = 21.3810, b = 15.7441
Hence, the equation of the curve is
xy = 21.3810x + 15.7441
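Example 6 reduces to a straight-line fit once xy = ax + b is rewritten as y = a + b·(1/x). The sketch below assumes that substitution, with z = 1/x, and an illustrative helper fit_line that is not part of this lesson. Because it carries full precision rather than four-decimal table entries, its answers may differ from the hand computation in the last decimal places.

```python
def fit_line(zs, ys):
    """Least-squares line y = a + b*z from the two normal equations."""
    n = len(zs)
    sz, sy = sum(zs), sum(ys)
    szy = sum(z * y for z, y in zip(zs, ys))
    szz = sum(z * z for z in zs)
    b = (n * szy - sz * sy) / (n * szz - sz ** 2)
    a = (sy - b * sz) / n
    return a, b

# Data of Example 6
xs = [1, 3, 5, 7, 9, 10]
ys = [36, 29, 28, 26, 24, 15]

zs = [1 / x for x in xs]          # substitution z = 1/x
a, b = fit_line(zs, ys)
print(round(a, 2), round(b, 2))   # coefficients in xy = a*x + b
```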

7.4 Fitting a Curve of Type Y = A/X + B√X


The curve to be fitted is
\[ y = \frac{a}{x} + b\sqrt{x} \qquad (1) \]
Let the given points be (x1, y1), (x2, y2), …, (xn, yn).
Then the residual for x = xi is given by
\[ E_i = y_i - \frac{a}{x_i} - b\sqrt{x_i} \]
By the principle of least squares, S = ΣEi² must be minimum,
i.e., ∂S/∂a = 0, which gives
\[ \Sigma \frac{y}{x} = a\, \Sigma \frac{1}{x^2} + b\, \Sigma \frac{1}{\sqrt{x}} \qquad (2) \]
and ∂S/∂b = 0, which gives
\[ \Sigma\, y\sqrt{x} = a\, \Sigma \frac{1}{\sqrt{x}} + b\, \Sigma x \qquad (3) \]
On solving equations (2) and (3), we get the values of a and b. On putting
the values of a and b in (1), we get the equation of the required curve
y = a/x + b√x
Example 7: Use the method of least squares to fit the curve
y = c0/x + c1√x
to the following table of values:
x 0.1 0.2 0.4 0.5 1 2
y 21 11 7 6 5 6
Solution: The equation of the curve to be fitted is y = c0/x + c1√x.
The normal equations are:
Σ(y/x) = c0 Σ(1/x²) + c1 Σ(1/√x) (1)
Σy√x = c0 Σ(1/√x) + c1 Σx (2)
x     y    y/x    y√x       1/√x      1/x²
0.1   21   210    6.64078   3.16228   100
0.2   11   55     4.91935   2.23607   25
0.4   7    17.5   4.42719   1.58114   6.25
0.5   6    12     4.24264   1.41421   4
1     5    5      5         1         1
2     6    3      8.48528   0.70711   0.25
Total Σx = 4.2 Σ(y/x) = 302.5 Σy√x = 33.71524 Σ(1/√x) = 10.10081 Σ(1/x²) = 136.5
From equations (1) and (2), we have
302.5 = 136.5c0 + 10.10081c1
and
33.71524 = 10.10081c0 + 4.2c1
Solving these, we get
c0 = 1.97327 and c1 = 3.28182
Hence the required equation of the curve is
y = 1.97327/x + 3.28182√x
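The normal equations of Section 7.4 can be solved numerically for Example 7. The following is a minimal sketch with illustrative variable names. It forms the five sums Σ(1/x²), Σ(1/√x), Σx, Σ(y/x) and Σy√x from the data and solves the 2 × 2 system by determinants.

```python
from math import sqrt

# Data of Example 7
xs = [0.1, 0.2, 0.4, 0.5, 1, 2]
ys = [21, 11, 7, 6, 5, 6]

s_inv_x2 = sum(1 / x ** 2 for x in xs)                 # sum of 1/x^2
s_inv_sqrt = sum(1 / sqrt(x) for x in xs)              # sum of 1/sqrt(x)
s_x = sum(xs)                                          # sum of x
s_y_over_x = sum(y / x for x, y in zip(xs, ys))        # sum of y/x
s_y_sqrt = sum(y * sqrt(x) for x, y in zip(xs, ys))    # sum of y*sqrt(x)

# Normal equations:
#   s_y_over_x = c0*s_inv_x2   + c1*s_inv_sqrt
#   s_y_sqrt   = c0*s_inv_sqrt + c1*s_x
det = s_inv_x2 * s_x - s_inv_sqrt ** 2
c0 = (s_y_over_x * s_x - s_inv_sqrt * s_y_sqrt) / det
c1 = (s_inv_x2 * s_y_sqrt - s_inv_sqrt * s_y_over_x) / det
print(round(c0, 4), round(c1, 4))
```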
Example 8: A person runs the same racetrack for five consecutive days
and is timed as follows:
Days (x) 1 2 3 4 5
Time (y) 15.3 15.1 15 14.5 14

Make a least squares fit to the above data using the function
y = a + b/x + c/x²
Solution: The given equation is y = a + b/x + c/x². Therefore, the normal
equations are
Σy = na + bΣ(1/x) + cΣ(1/x²)
Σ(y/x) = aΣ(1/x) + bΣ(1/x²) + cΣ(1/x³)
Σ(y/x²) = aΣ(1/x²) + bΣ(1/x³) + cΣ(1/x⁴)
Here n = 5
x    y      1/x      1/x²     1/x³     1/x⁴     y/x      y/x²
1    15.3   1        1        1        1        15.3     15.3
2    15.1   0.5      0.25     0.1250   0.0625   7.5500   3.7550
3    15     0.3333   0.1111   0.0370   0.0123   5.0000   1.6667
4    14.5   0.2500   0.0625   0.0156   0.0039   3.6250   0.9063
5    14     0.2000   0.0400   0.0080   0.0016   2.8000   0.5600
Total Σy = 73.9 Σ(1/x) = 2.2833 Σ(1/x²) = 1.4636 Σ(1/x³) = 1.1856 Σ(1/x⁴) = 1.0803 Σ(y/x) = 34.2750 Σ(y/x²) = 22.1880
On putting the values of Σy, Σ(1/x), ..., etc., in the normal equations, we get
73.9 = 5a + 2.2833b + 1.4636c (1)
34.2750 = 2.2833a + 1.4636b + 1.1856c (2)
22.1880 = 1.4636a + 1.1856b + 1.0803c (3)
On solving these equations (1), (2) and (3), we get
a = 12.6751, b = 8.2676, and c = – 5.7071
Hence, the equation is
y = 12.6751 + 8.2676/x – 5.7071/x²
Example 9: Fit a curve of the form y = ax + bx² to
the following data.
x 1 2 3 4 5
y 1.8 5.1 8.9 14.1 19.8
Solution: Normal equations are
Σxy = aΣx² + bΣx³ (1)
and
Σx²y = aΣx³ + bΣx⁴ (2)

let us form the table below:


x y x2 x3 x4 xy x 2y
1 1.8 1 1 1 1.8 1.8
2 5.1 4 8 16 10.2 20.4
3 8.9 9 27 81 26.7 80.1
4 14.1 16 64 256 56.4 225.6
5 19.8 25 125 625 99 495
Total Σx² = 55 Σx³ = 225 Σx⁴ = 979 Σxy = 194.1 Σx²y = 822.9
Substituting these values in the two normal equations, we get
194.1 = 55a + 225b (3)
and
822.9 = 225a + 979b (4)
Solving (3) and (4) by cross multiplication method, we have
a = 1.51, b = 0.49
Hence the required parabolic curve is y = 1.51x + 0.49x².
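The two normal equations of Example 9 can be solved directly from the data. This sketch uses illustrative variable names and assumes the model y = ax + bx² with no constant term, so only Σx², Σx³, Σx⁴, Σxy and Σx²y are needed.

```python
# Data of Example 9
xs = [1, 2, 3, 4, 5]
ys = [1.8, 5.1, 8.9, 14.1, 19.8]

s2 = sum(x ** 2 for x in xs)
s3 = sum(x ** 3 for x in xs)
s4 = sum(x ** 4 for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))
sx2y = sum(x * x * y for x, y in zip(xs, ys))

# Normal equations:  sxy = a*s2 + b*s3   and   sx2y = a*s3 + b*s4
det = s2 * s4 - s3 ** 2
a = (sxy * s4 - s3 * sx2y) / det
b = (s2 * sx2y - s3 * sxy) / det
print(round(a, 2), round(b, 2))
```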

7.5 To Fit a Parabola (Up to Second Degree)


Let
y = a + bx + cx² (1)
be the equation of a parabola. As in Section 7.3, the normal equations are
Σy = na + bΣx + cΣx² (2)
Σxy = aΣx + bΣx² + cΣx³ (3)
Σx²y = aΣx² + bΣx³ + cΣx⁴ (4)
On solving these three normal equations, we get the values of a, b and c.
On putting the values of a, b and c in (1), we get the required equation
of parabola.
To remember the normal equations (2), (3) and (4) for y = a + bx + cx2.
(i) Equation (2) is obtained by putting Σ before each term on both sides
of (1).
(ii) Equation (3) is obtained on multiplying (1) by x and putting Σ before
each term on both sides of obtained equation.
(iii) Equation (4) is obtained on multiplying (1) by x2 and putting Σ
before each term on both sides of obtained equation.
Example 10: Using method of least squares, derive the normal equations
to fit the curve y = ax2 + bx. Hence fit this curve to the following data.
x 1 2 3 4 5 6 7 8
y 1 1.2 1.8 2.5 3.6 4.7 6.6 9.1
Solution:
Here, we have, y = ax2 + bx
Normal equations are
Σxy = aΣx3 + bΣx2(1)
Σx2y = aΣx4 + bΣx3(2)
Putting the values in (1) and (2), we get
184 = 1296a + 204b(3)
1227 = 8772a + 1296b (4)
Multiplying (3) by 1296 and (4) by 204, we get
238464 = 1679616a + 264384b (5)
250308 = 1789488a + 264384b (6)
Subtracting (5) from (6), we get
11844 = 109872a ⇒ a = 0.1078
Putting the value of a in (3), we get
184 = 1296 (0.1078) + 204b
⇒ 184 – 139.7088 = 204b ⇒ 204b = 44.2912
⇒ b = 0.2171
The required equation of the curve is
y = 0.1078x² + 0.2171x
Example 11: Fit a parabola y = ax2 + bx + c to the following data taking
x as independent variable
x 1 2 3 5 7 11 13 17 19 23
y 2 3 5 7 11 13 17 19 23 29
Solution: Writing the parabola with the constant term first, we have
y = a + bx + cx² (1)
x y xy x2 x 2y x3 x4
1 2 2 1 2 1 1
2 3 6 4 12 8 16
3 5 15 9 45 27 81
5 7 35 25 175 125 625
7 11 77 49 539 343 2401
11 13 143 121 1573 1331 14641
13 17 221 169 2873 2197 28561
17 19 323 289 5491 4913 83521
19 23 437 361 8303 6859 130321
23 29 667 529 15341 12167 279841
Σx = Σy = Σxy = Σx2 = Σx2y = Σx3 = Σx4 =
101 129 1926 1557 34354 27971 540009
Normal equations are
Σy = na + bΣx + cΣx2(2)
Σxy = aΣx + bΣx2 + cΣx3(3)
Σx2y = aΣx2 + bΣx3 + cΣ x4(4)
On putting the values of Σx, Σy, Σxy, Σx², Σx²y, Σx³, Σx⁴ in equations (2),
(3) and (4), we get
129 = 10a + 101b + 1557c (5)
1926 = 101a + 1557b + 27971c (6)
34354 = 1557a + 27971b + 540009c (7)
On solving (5), (6), (7), we get
a = 1.41259297
b = 1.089013957
c = 0.003136583595
Hence, the equation of the required parabola is
y = 1.41259297 + 1.089013957x + 0.003136583595x²
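The three normal equations of Example 11 can be solved by Cramer's rule. The sketch below is an illustration: the helper names det3 and replace_col are assumptions. It takes a as the constant term, b as the coefficient of x and c as the coefficient of x², matching the way the normal equations are written in this section.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def replace_col(m, j, col):
    """Copy of m with column j replaced by the list col."""
    return [[col[r] if k == j else m[r][k] for k in range(3)] for r in range(3)]

# Data of Example 11
xs = [1, 2, 3, 5, 7, 11, 13, 17, 19, 23]
ys = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

S = lambda p: sum(x ** p for x in xs)                        # sum of x^p
T = lambda p: sum((x ** p) * y for x, y in zip(xs, ys))      # sum of x^p * y

A = [[len(xs), S(1), S(2)],
     [S(1), S(2), S(3)],
     [S(2), S(3), S(4)]]
rhs = [T(0), T(1), T(2)]

D = det3(A)
a, b, c = (det3(replace_col(A, j, rhs)) / D for j in range(3))
print(a, b, c)   # constant term, coefficient of x, coefficient of x^2
```

All the sums here are integers, so the determinants are exact and only the final divisions introduce rounding.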
Example 12: Fit a second degree parabola to the following:
x 1 2 3 4 5
y 1090 1220 1390 1625 1915
Solution: Let the equation of the parabola be
y = a + bx + cx2 (1)
Normal equations are
Σy = na + bΣx + cΣx2(2)
Σxy = aΣx + bΣx2 + cΣx3 (3)
Σx2y = aΣx2 + bΣx3 + cΣx4(4)
On putting the values of n, Σx, Σx², Σx³, Σx⁴, Σy, Σxy, Σx²y in (2), (3)
and (4), we get
7240 = 5a + 15b + 55c (5)
23775 = 15a + 55b + 225c [∵ n = 5] (6)
92355 = 55a + 225b + 979c (7)

Steps for solution of (5), (6) and (7) are the following:
3 × (5), 21720 = 15a + 45b + 165c (8)
(6) – (8), 2055 = 10b + 60c (9)
11 × (5), 79640 = 55a + 165b + 605c (10)
(7) – (10), 12715 = 60b + 374c (11)
6 × (9), 12330 = 60 b + 360c (12)

PAGE 109
Department of Distance & Continuing Education, Campus of Open Learning,
School of Open Learning, University of Delhi

IT Skills and Data Analysis - II.indd 109 06-Feb-25 12:18:50 PM


IT SKILLS AND DATA ANALYSIS - II

Notes

The equation of the required parabola is 2y = 2048 + 81x + 55x2
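Because the data of Example 12 are integers, the normal equations can also be solved exactly. The sketch below swaps the elimination steps for Cramer's rule with Python's `fractions.Fraction`; the helper names `det3` and `cramer3` are illustrative, not taken from the text.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def cramer3(matrix, rhs):
    """Solve a 3x3 system exactly with Cramer's rule (assumes det != 0)."""
    d = det3(matrix)
    solution = []
    for col in range(3):
        # Replace one column by the right-hand side and take the determinant.
        replaced = [row[:col] + [rhs[i]] + row[col + 1:]
                    for i, row in enumerate(matrix)]
        solution.append(Fraction(det3(replaced), d))
    return solution


xs = [1, 2, 3, 4, 5]
ys = [1090, 1220, 1390, 1625, 1915]
s = [sum(x**k for x in xs) for k in range(5)]                  # s[k] = sum of x**k, s[0] = n
t = [sum(x**k * y for x, y in zip(xs, ys)) for k in range(3)]  # t[k] = sum of x**k * y

# Normal equations for y = a + b*x + c*x**2, columns ordered a, b, c.
matrix = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
a, b, c = cramer3(matrix, t)
print(a, b, c)  # 1024 81/2 55/2
```

The exact fractions 81/2 and 55/2 explain why the answer is conveniently stated as $2y = 2048 + 81x + 55x^2$.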

7.6 Change of Scale in Second Degree Equations


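If the values of x are equally spaced and large in magnitude, we change the
origin and the scale by putting
$u = \frac{x - x_0}{h}$ and $v = y - y_0$,
where h is the common interval between successive values of x and $(x_0, y_0)$
is the new origin.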
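Once a parabola $v = A + Bu + Cu^2$ has been fitted in the new variables, the equation in x and y follows by substituting back. The derivation below is a short worked sketch; A, B and C denote the coefficients of the transformed equation.

```latex
\begin{aligned}
v &= A + Bu + Cu^2, \qquad u = \frac{x - x_0}{h}, \quad v = y - y_0 \\
y &= y_0 + A + \frac{B}{h}(x - x_0) + \frac{C}{h^2}(x - x_0)^2 \\
  &= \left(y_0 + A - \frac{B x_0}{h} + \frac{C x_0^2}{h^2}\right)
   + \left(\frac{B}{h} - \frac{2 C x_0}{h^2}\right) x
   + \frac{C}{h^2}\, x^2
\end{aligned}
```

When h = 1 the coefficient of $x^2$ is simply C, the coefficient of x is $B - 2Cx_0$, and the constant term is $y_0 + A - Bx_0 + Cx_0^2$.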
Example 13: Fit a second degree parabola to the following data by the least
squares method:
x 1929 1930 1931 1932 1933 1934 1935 1936 1937
y 352 356 357 358 360 361 361 360 359
Solution:
Taking $x_0 = 1933$, $y_0 = 357$ and h = 1, we put
$u = x - 1933$, $v = y - 357$; here n = 9.
The equation $y = a + bx + cx^2$ is transformed to $v = A + Bu + Cu^2$.
x      u = x - 1933   y     v = y - 357   uv    $u^2$   $u^2v$   $u^3$   $u^4$
1929   -4             352   -5            20    16      -80      -64     256
1930   -3             356   -1            3     9       -9       -27     81
1931   -2             357   0             0     4       0        -8      16
1932   -1             358   1             -1    1       1        -1      1
1933   0              360   3             0     0       0        0       0
1934   1              361   4             4     1       4        1       1
1935   2              361   4             8     4       16       8       16
1936   3              360   3             9     9       27       27      81
1937   4              359   2             8     16      32       64      256
Total: $\Sigma u = 0$, $\Sigma v = 11$, $\Sigma uv = 51$, $\Sigma u^2 = 60$, $\Sigma u^2v = -9$,
$\Sigma u^3 = 0$, $\Sigma u^4 = 708$
Normal equations are
$\Sigma v = nA + B\Sigma u + C\Sigma u^2$
$\Rightarrow 11 = 9A + 0B + 60C$
$\Rightarrow 11 = 9A + 60C$
$\Sigma uv = A\Sigma u + B\Sigma u^2 + C\Sigma u^3$
$\Rightarrow 51 = 0A + 60B + 0C$
$\Rightarrow 51 = 60B \Rightarrow B = \frac{17}{20} = 0.85$
$\Sigma u^2v = A\Sigma u^2 + B\Sigma u^3 + C\Sigma u^4$
$\Rightarrow -9 = 60A + 0B + 708C$
$\Rightarrow -9 = 60A + 708C$
On solving these equations, we get
$A = \frac{694}{231} \approx 3.0043$, $B = 0.85$, $C = -\frac{247}{924} \approx -0.2673$
so that $v = 3.0043 + 0.85u - 0.2673u^2$.
Putting v = y – 357 and u = x – 1933, we get
$y = 357 + 3.0043 + 0.85(x - 1933) - 0.2673(x - 1933)^2$
which, on expanding with the unrounded values of A, B and C, gives approximately
$y = -1000106.40 + 1034.2937x - 0.267316x^2$
Because x is large, the expanded coefficients must be carried to many
significant figures; the form in (x – 1933) is safer for computation.
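The effect of rounding in Example 13 can be seen numerically. The sketch below is an illustration under stated assumptions, not part of the original solution: it refits the data in the shifted variables (using the fact that $\Sigma u = \Sigma u^3 = 0$ decouples the normal equations) and compares evaluation in the shifted form with evaluation of the expanded form after rounding. The function names `predict_shifted` and `predict_expanded` are hypothetical.

```python
xs = list(range(1929, 1938))
ys = [352, 356, 357, 358, 360, 361, 361, 360, 359]
x0, y0 = 1933, 357

us = [x - x0 for x in xs]
vs = [y - y0 for y in ys]
n = len(us)
su2 = sum(u**2 for u in us)
su4 = sum(u**4 for u in us)
sv = sum(vs)
suv = sum(u * v for u, v in zip(us, vs))
su2v = sum(u**2 * v for u, v in zip(us, vs))

# Because sum(u) = sum(u**3) = 0 here, the normal equations decouple:
#   sv   = n*A   + su2*C
#   suv  = su2*B
#   su2v = su2*A + su4*C
B = suv / su2
C = (n * su2v - su2 * sv) / (n * su4 - su2**2)
A = (sv - su2 * C) / n


def predict_shifted(x):
    """Evaluate the fit in the shifted variable u = x - x0."""
    u = x - x0
    return y0 + A + B * u + C * u**2


def predict_expanded(x, digits):
    """Evaluate the expanded form after rounding each coefficient to `digits` decimals."""
    c0 = round(y0 + A - B * x0 + C * x0**2, digits)
    c1 = round(B - 2 * C * x0, digits)
    c2 = round(C, digits)
    return c0 + c1 * x + c2 * x**2


print(A, B, C)
print(predict_shifted(1933))        # fitted value at the origin year
print(predict_expanded(1933, 2))    # expanded form, coefficients rounded to 2 decimals
print(predict_expanded(1933, 6))    # expanded form, coefficients rounded to 6 decimals
```

With two decimals the expanded form is off by thousands, because a small error in the coefficient of $x^2$ is multiplied by $1933^2$; this is why the shifted form is preferred for calculation.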


Example 14: Fit a second degree parabola to the following data:
x 1 2 3 4 5 6 7 8 9 10
y 124 129 140 159 228 289 315 302 263 210
Solution: Taking $x_0 = 6$ and $y_0 = 228$, we put $u = x - 6$ and $v = y - 228$;
here, n = 10.
x     u = x - 6   y     v = y - 228   uv     $u^2$   $u^2v$   $u^3$   $u^4$
1     -5          124   -104          520    25      -2600    -125    625
2     -4          129   -99           396    16      -1584    -64     256
3     -3          140   -88           264    9       -792     -27     81
4     -2          159   -69           138    4       -276     -8      16
5     -1          228   0             0      1       0        -1      1
6     0           289   61            0      0       0        0       0
7     1           315   87            87     1       87       1       1
8     2           302   74            148    4       296      8       16
9     3           263   35            105    9       315      27      81
10    4           210   -18           -72    16      -288     64      256
Total: $\Sigma u = -5$, $\Sigma v = -121$, $\Sigma uv = 1586$, $\Sigma u^2 = 85$, $\Sigma u^2v = -4842$,
$\Sigma u^3 = -125$, $\Sigma u^4 = 1333$
Let the equation of the parabola be $v = a + bu + cu^2$. The normal equations
$\Sigma v = na + b\Sigma u + c\Sigma u^2$, $\Sigma uv = a\Sigma u + b\Sigma u^2 + c\Sigma u^3$ and
$\Sigma u^2v = a\Sigma u^2 + b\Sigma u^3 + c\Sigma u^4$ become
$-121 = 10a - 5b + 85c$
$1586 = -5a + 85b - 125c$
$-4842 = 85a - 125b + 1333c$
On solving, we get a = 31.8121, b = 14.1576, c = –4.3333.
Putting the values of a, b and c in $v = a + bu + cu^2$, we get
$v = 31.8121 + 14.1576u - 4.3333u^2$
Putting the values of u = x – 6 and v = y – 228, we get
$y - 228 = 31.8121 + 14.1576(x - 6) - 4.3333(x - 6)^2$
$= 31.8121 + 14.1576x - 84.9455 - 4.3333x^2 + 52x - 156$
$y = -4.3333x^2 + 66.1576x + 18.8667$

7.7 To Fit an Exponential Curve ($y = ab^x + c$)


Example 15: Fit a curve of the form $y = ab^x$ to the following data:
x 2 3 4 5 6
y 144 172.3 207.4 248.8 298.5
Solution:
$y = ab^x$ (1)
Taking logarithms (base 10) on both sides, we get
$\log y = \log a + x \log b$
$Y = A + Bx$ (straight line)
where $Y = \log y$, $A = \log a$, $B = \log b$.
So we have a table of the following form
x     y       Y = log y   xY        $x^2$
2     144     2.1584      4.3168    4
3     172.3   2.2363      6.7089    9
4     207.4   2.3168      9.2672    16
5     248.8   2.3959      11.9795   25
6     298.5   2.4749      14.8494   36
Total: $\Sigma x = 20$, $\Sigma Y = 11.5823$, $\Sigma xY = 47.1218$, $\Sigma x^2 = 90$
Normal equations are
$\Sigma Y = nA + B\Sigma x$ (2)
$\Sigma xY = A\Sigma x + B\Sigma x^2$ (3)
Putting the values of $\Sigma Y$, n and $\Sigma x$ in (2), we get
$11.5823 = 5A + 20B$ (4)
Putting the values of $\Sigma xY$, $\Sigma x$ and $\Sigma x^2$ in (3), we get
$47.1218 = 20A + 90B$ (5)
Multiplying (4) by 4, we get
$46.3292 = 20A + 80B$ (6)
Subtracting (6) from (5), we get
$0.7926 = 10B \Rightarrow B = \frac{0.7926}{10} = 0.07926$
By putting the value of B in (4), we get
$11.5823 = 5A + 20(0.07926)$
$\Rightarrow 5A = 11.5823 - 1.5852 = 9.9971 \Rightarrow A = 1.9994$
Hence a = antilog (1.9994) ≈ 99.9, which we round to 100, and
b = antilog (0.07926) ≈ 1.2. The required curve is
$y = 100(1.2)^x$
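The log transformation used for $y = ab^x$ can be sketched in a few lines of Python. This is an illustrative check rather than the book's procedure: it uses full-precision base-10 logarithms instead of four-figure table values, so the results agree with the worked example only after rounding. The names `fit_line` and `fit_exponential` are assumptions made for this sketch.

```python
import math


def fit_line(xs, ys):
    """Least-squares intercept and slope for ys = intercept + slope * xs."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Closed-form solution of the two normal equations.
    slope = (n * sxy - sx * sy) / (n * sxx - sx**2)
    intercept = (sy - slope * sx) / n
    return intercept, slope


def fit_exponential(xs, ys):
    """Fit y = a * b**x by fitting a straight line to (x, log10 y)."""
    big_a, big_b = fit_line(xs, [math.log10(y) for y in ys])
    # Antilogarithms recover a and b from A = log a and B = log b.
    return 10**big_a, 10**big_b


xs = [2, 3, 4, 5, 6]
ys = [144, 172.3, 207.4, 248.8, 298.5]
a, b = fit_exponential(xs, ys)
print(f"a = {a:.2f}, b = {b:.4f}")
```

Note that this minimises the squared errors in $\log y$, not in y itself, which is exactly what the normal equations in Y and x do.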
Example 16: Obtain a relation of the form $y = ab^x$ for the following
data by the method of least squares:
x 2 3 4 5 6
y 8.3 15.4 33.1 65.2 127.4
Solution: The curve to be fitted is $y = ab^x$, n = 5
Taking log on both the sides, we get
$\log y = \log a + x \log b$ …(1)
On putting $Y = \log y$, $A = \log a$ and $B = \log b$, the transformed equation
of (1) becomes
$Y = A + Bx$ …(2)
x     y       Y = log y   $x^2$   xY
2     8.3     0.9191      4       1.8382
3     15.4    1.1875      9       3.5625
4     33.1    1.5198      16      6.0792
5     65.2    1.8142      25      9.0710
6     127.4   2.1052      36      12.6312
Total: $\Sigma x = 20$, $\Sigma Y = 7.5458$, $\Sigma x^2 = 90$, $\Sigma xY = 33.1821$
Normal equations are
$\Sigma Y = nA + B\Sigma x$ (3)
$\Sigma xY = A\Sigma x + B\Sigma x^2$ (4)
On putting the values of $\Sigma x$, etc. from the above table in (3) and (4),
we get
$7.5458 = 5A + 20B$ (5)
$33.1821 = 20A + 90B$ (6)
Multiplying (5) by 4 and subtracting from (6), we get
$2.9989 = 10B \Rightarrow B = 0.2999$, and then from (5), $A = 0.3096$
$\log a = 0.3096 \Rightarrow a = \text{Antilog}(0.3096) \Rightarrow a = 2.040$
$\log b = 0.2999 \Rightarrow b = \text{Antilog}(0.2999) \Rightarrow b = 1.995$
On substituting the values of a and b in $y = ab^x$, we get
$y = 2.040(1.995)^x$
As a check, x = 2 gives y ≈ 8.12 and x = 6 gives y ≈ 128.6, close to the
observed 8.3 and 127.4.
Example 17: Fit the curve $y = ax^b$ to the following data by the least squares
method.
x 1 2 3 4 5 6
y 2.98 4.26 5.21 6.1 6.8 7.5
Solution:
The curve to be fitted is
$y = ax^b$ …(1)
Taking logarithm on both sides, $\log y = \log a + b \log x$
$\Rightarrow Y = A + bX$ (straight line)
where $Y = \log y$, $A = \log a$ and $X = \log x$, and n = 6.
So, we have a table of the following form
x       X = log x   y      Y = log y   XY       $X^2$
1       0           2.98   0.4742      0        0
2       0.3010      4.26   0.6294      0.1894   0.0906
3       0.4771      5.21   0.7168      0.3420   0.2276
4       0.6021      6.10   0.7853      0.4728   0.3625
5       0.6990      6.80   0.8325      0.5819   0.4886
6       0.7782      7.50   0.8751      0.6810   0.6056
Total   2.8574             4.3133      2.2671   1.7749
The normal equations for estimating A and b are
$\Sigma Y = nA + b\Sigma X$ (2)
$\Sigma XY = A\Sigma X + b\Sigma X^2$ (3)
Putting the values of $\Sigma Y$, n and $\Sigma X$ in (2), we get
$4.3133 = 6A + 2.8574b \Rightarrow 6A + 2.8574b - 4.3133 = 0$ (4)
Putting the values of $\Sigma XY$, $\Sigma X$ and $\Sigma X^2$ in (3), we get
$2.2671 = 2.8574A + 1.7749b$
$\Rightarrow 2.8574A + 1.7749b - 2.2671 = 0$ (5)
Solving (4) and (5) by cross-multiplication method, we have
$\frac{A}{(2.8574)(-2.2671) - (1.7749)(-4.3133)} = \frac{b}{(-4.3133)(2.8574) - (-2.2671)(6)} = \frac{1}{(6)(1.7749) - (2.8574)^2}$
$\Rightarrow \frac{A}{1.1777} = \frac{b}{1.2778} = \frac{1}{2.4847}$
$\Rightarrow A = 0.4740$, $b = 0.5143$
$a = \text{antilog}(0.4740) \approx 2.98$
Hence the required curve is $y = 2.98x^{0.5143}$
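For the power curve $y = ax^b$ both variables are transformed, which the following sketch makes explicit. It is a check under the assumption of base-10 logarithms at full floating-point precision (the worked table uses four-figure values), and it solves the two normal equations by the same cross-multiplication idea, written as Cramer's rule.

```python
import math

xs = [1, 2, 3, 4, 5, 6]
ys = [2.98, 4.26, 5.21, 6.1, 6.8, 7.5]

# Linearise y = a * x**b as Y = A + b*X with X = log10(x), Y = log10(y).
X = [math.log10(x) for x in xs]
Y = [math.log10(y) for y in ys]
n = len(X)
sX, sY = sum(X), sum(Y)
sXX = sum(v * v for v in X)
sXY = sum(p * q for p, q in zip(X, Y))

# Cross-multiplication (Cramer's rule) on
#   sY  = n*A  + sX*b
#   sXY = sX*A + sXX*b
det = n * sXX - sX**2
A = (sY * sXX - sX * sXY) / det
b = (n * sXY - sX * sY) / det
a = 10**A  # antilogarithm of A = log10(a)

print(f"A = {A:.4f}, b = {b:.4f}, a = {a:.3f}")
```

An exponent b close to 0.5 indicates that y grows roughly like the square root of x over this range.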

Example 18: Find the normal equations to the curve $2^x = ax^2 + bx + c$
Solution: Given curve $2^x = ax^2 + bx + c$
Normal equations are
$\Sigma 2^x x^2 = a\Sigma x^4 + b\Sigma x^3 + c\Sigma x^2$
$\Sigma 2^x x = a\Sigma x^3 + b\Sigma x^2 + c\Sigma x$
$\Sigma 2^x = a\Sigma x^2 + b\Sigma x + nc$
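These three equations come directly from the principle of least squares. As a brief sketch, assuming the left-hand side of the curve is $2^x$, let S be the sum of squared residuals and set its partial derivatives with respect to a, b and c to zero:

```latex
\begin{aligned}
S &= \sum \left(2^{x} - a x^{2} - b x - c\right)^{2} \\
\frac{\partial S}{\partial a} = 0 &\;\Rightarrow\; \sum x^{2}\, 2^{x} = a\sum x^{4} + b\sum x^{3} + c\sum x^{2} \\
\frac{\partial S}{\partial b} = 0 &\;\Rightarrow\; \sum x\, 2^{x} = a\sum x^{3} + b\sum x^{2} + c\sum x \\
\frac{\partial S}{\partial c} = 0 &\;\Rightarrow\; \sum 2^{x} = a\sum x^{2} + b\sum x + nc
\end{aligned}
```

Each equation is obtained by multiplying the curve through by $x^2$, x and 1 in turn and summing over the n data points.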
Example 19: Use the least-squares method to fit a curve of the form
$y = ae^{bx}$ to the data:
x 1 2 3 4 5 6
y 7.209 5.265 3.846 2.809 2.052 1.499
Solution:
$y = ae^{bx}$ (1)
On taking log of both sides, we get
$\log_e y = \log_e a + bx$ (2)
On putting $\log_e y = Y$, $\log_e a = c$ in (2), we get
$Y = c + bx$ (3)
x     y       $Y = \log_e y$   xY        $x^2$
1     7.209   1.97533          1.97533   1
2     5.265   1.66108          3.32216   4
3     3.846   1.34703          4.04109   9
4     2.809   1.03283          4.13132   16
5     2.052   0.71881          3.59405   25
6     1.499   0.40480          2.42880   36
Total: $\Sigma x = 21$, $\Sigma Y = 7.13988$, $\Sigma xY = 19.49275$, $\Sigma x^2 = 91$
Normal equations are
$\Sigma Y = nc + b\Sigma x$ (4)
$\Sigma xY = c\Sigma x + b\Sigma x^2$ (5)
On putting the values of n = 6, $\Sigma x$, $\Sigma Y$, $\Sigma xY$ and $\Sigma x^2$ in equations (4) and
(5), we get
$7.13988 = 6c + 21b$ (6)
$19.49275 = 21c + 91b$ (7)
On solving (6) and (7), we obtain b = – 0.3141, c = 2.28933
$c = \log_e a \Rightarrow 2.28933 = \log_e a \Rightarrow a = e^{2.28933} = 9.86832$
On substituting the values of a and b in (1), we get
$y = 9.86832\, e^{-0.3141x}$

7.8 Change of Origin and Scale


In some problems the magnitude of the variables in the given data is
so large that the calculation becomes very tedious. The size of the data
can be reduced by assuming some origin for the x and y series. The problem is
further simplified by taking a suitable scale for the values of x and y if
these values are equally spaced. Let h be the width of the interval and
$(x_0, y_0)$ be taken as the origin.
Then put $u = \frac{x - x_0}{h}$ and $v = y - y_0$
Example 20: Fit a straight line to the following data:
x 71 68 73 69 67 65 66 67
y 69 72 70 70 68 67 68 64
Solution:
$y = a + bx$ (1)
Put u = x – 69 and v = y – 68; here n = 8.
The transformed equation is $v = A + Bu$ (2)
where A and B are the constants of the line in the new variables.
x      y     u = x - 69   v = y - 68   uv    $u^2$
71     69    2            1            2     4
68     72    -1           4            -4    1
73     70    4            2            8     16
69     70    0            2            0     0
67     68    -2           0            0     4
65     67    -4           -1           4     16
66     68    -3           0            0     9
67     64    -2           -4           8     4
Total: $\Sigma u = -6$, $\Sigma v = 4$, $\Sigma uv = 18$, $\Sigma u^2 = 54$
Normal equations are
$\Sigma v = nA + B\Sigma u$ (3)
$\Sigma uv = A\Sigma u + B\Sigma u^2$ (4)
On putting the values of $\Sigma u$, $\Sigma v$, $\Sigma uv$, $\Sigma u^2$ in (3) and (4), we get
$4 = 8A + B(-6)$ (5)
$18 = -6A + 54B$ (6)
On solving (5) and (6), we get
$A = \frac{9}{11}$, $B = \frac{14}{33}$
On putting the values of A and B in (2), we get
$v = \frac{9}{11} + \frac{14}{33}u$ (7)
On putting u = x – 69 and v = y – 68 in (7), we get
$y - 68 = 0.8182 + 0.4242x - 29.2727$
$y = 68.8182 - 29.2727 + 0.4242x$
$y = 39.5455 + 0.4242x$
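The change of origin in Example 20 can be reproduced exactly with rational arithmetic. The sketch below is illustrative only: it shifts the data by the assumed origin (69, 68), solves the two normal equations in closed form with `fractions.Fraction`, and then undoes the shift. The names `slope` and `intercept` refer to the line in the original variables x and y.

```python
from fractions import Fraction

xs = [71, 68, 73, 69, 67, 65, 66, 67]
ys = [69, 72, 70, 70, 68, 67, 68, 64]
x0, y0 = 69, 68  # assumed origin, as in the worked example

us = [x - x0 for x in xs]
vs = [y - y0 for y in ys]
n = len(us)
su, sv = sum(us), sum(vs)
suu = sum(u * u for u in us)
suv = sum(u * v for u, v in zip(us, vs))

# Solve  sv = n*A + su*B  and  suv = su*A + suu*B  exactly.
det = n * suu - su**2
A = Fraction(sv * suu - su * suv, det)
B = Fraction(n * suv - su * sv, det)

# Undo the shift: y - y0 = A + B*(x - x0).
slope = B
intercept = y0 + A - B * x0

print(A, B)  # 9/11 14/33
print(float(intercept), float(slope))
```

A shift of origin leaves the slope unchanged and alters only the intercept, which is why the calculation with the smaller numbers u and v gives the same line.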

7.9 Exercise
1. Find the linear least squares polynomial for the given data. (Punjab
2018)
x –2 –1 0 1
y 6 3 2 2
2. Find the least square straight line approximation to the data.
x 1 2 3 4
y 3 7 13 21
3. Find the least square line for the data points
(–1, 10), (0, 9), (1, 7), (2, 5), (3, 4), (4, 3), (5, 0) and (6, –1)
4. Fit a straight line to the following data regarding x as the independent
variable:
x 1 2 3 4 5 6
y 1200 900 600 200 110 50
5. The following table shows the number of salesmen working for a
certain concern:
year 1998 1999 2000 2001 2002 2003
number 28 38 46 40 56 60
Use the method of least squares to fit a straight line trend.
6. Given the following experimental values
x 0 1 2 3
y 2 4 10 15
Fit by the method of least squares a parabola of the type $y = a + bx^2$
7. If V (km/hr) and R (kg/ton) are related by a relation of the type
$R = a + bV^2$, find by the method of least squares, a and b with the help
of the following table:
V 10 20 30 40 50
R 8 10 15 21 30

8. Find the values of a, b, c so that $y = a + bx + cx^2$ is the best fit
to the data:
x 0 1 2 3 4
y 1 0 3 10 21
9. Fit a second degree parabola to the following data by the least squares
method:
x 0 1 2 3 4
y 1 1.8 1.3 2.5 4.3
10. Fit a second degree parabola to the following data taking y as
dependent variable:
x 1 2 3 4 5 6 7 8 9
y 2 6 7 8 10 11 11 10 9
11. Use the least-square method to obtain a parabola that approximates
the data
x 1.0 1.2 1.4 1.6 1.8 2
y 2.345 2.419 2.592 2.863 3.233 3.702
12. Employ the method of least squares to fit a parabola
$y = a + bx + cx^2$ to the following data:
(x, y): (–1, 2), (0, 0), (0, 1), (1, 2).
13. Fit a second degree parabola in the following data
x 0 1 2 3 4
y 1 4 10 17 30
14. Fit a parabolic curve of regression of y on x to the following data:
x 1 1.5 2 2.5 3 3.5 4
y 1.1 1.3 1.6 2.0 2.7 3.4 4.1
15. Fit a least-squares geometric curve $y = ax^b$ to the following data:
x 1 2 3 4 5
y 0.5 2 4.5 8 12.5
