Batch 5

The document discusses curve fitting and linear regression. It describes how the method of least squares is used to fit a straight line to a set of data points by minimizing the sum of the squares of the differences between the observed and predicted y-values. Specifically, it shows how to derive the normal equations to solve for the slope and y-intercept of the best-fit line, and provides a Java code example to calculate the line of best fit.

Uploaded by

Doctor Stranger

1. INTRODUCTION

1.1 Curve Fitting


Curve fitting is the process of constructing a curve, or mathematical function, that has the best fit to a series of data points, possibly subject to constraints. Curve fitting can involve either interpolation, where an exact fit to the data is required, or smoothing, in which a "smooth" function is constructed that approximately fits the data.

A related topic is regression analysis, which focuses more on questions of statistical inference, such as how much uncertainty is present in a curve that is fitted to data observed with random errors. Fitted curves can be used as an aid for data visualization, to infer values of a function where no data are available, and to summarize the relationships among two or more variables. Extrapolation refers to the use of a fitted curve beyond the range of the observed data and is subject to a degree of uncertainty, since it may reflect the method used to construct the curve as much as it reflects the observed data.

1.2 Java Language

Java is a programming language created by James Gosling at Sun Microsystems (Sun) in 1991. The goal of Java is to write a program once and then run it on multiple operating systems. The first publicly available version of Java (Java 1.0) was released in 1995. Sun Microsystems was acquired by Oracle Corporation in 2010, and Oracle now has stewardship of Java. In 2006 Sun started to make Java available under the GNU General Public License (GPL); Oracle continues this project, called OpenJDK. Over time, new enhanced versions of Java have been released. At the time of writing, the current version of Java is Java 1.8, which is also known as Java 8.

Java is defined by a specification and consists of a programming language, a compiler, core libraries and a runtime (the Java virtual machine). The Java runtime allows software developers to write program code in languages other than the Java programming language that still runs on the Java virtual machine. The Java platform is usually associated with the Java virtual machine and the Java core libraries.

The Java language was designed with the following properties:

• Platform independent: Java programs use the Java virtual machine as an abstraction and do not access the operating system directly. This makes Java programs highly portable. A Java program (which is standard-compliant and follows certain rules) can run unmodified on all supported platforms, e.g., Windows or Linux.

• Object-oriented programming language: Except for the primitive data types, all elements in Java are objects.

• Strongly typed programming language: Java is strongly typed, i.e., the types of the variables used must be pre-defined, and conversion to other types is relatively strict, i.e., in most cases it must be done explicitly by the programmer.

• Interpreted and compiled language: Java source code is translated into bytecode, which does not depend on the target platform. These bytecode instructions are interpreted by the Java virtual machine (JVM). The JVM contains a so-called HotSpot compiler, which translates performance-critical bytecode instructions into native code instructions.

• Automatic memory management: Java manages the memory allocation and de-allocation for creating new objects. The program does not have direct access to the memory. The so-called garbage collector automatically deletes objects to which no active reference exists.

USES OF JAVA PROGRAMMING LANGUAGE: The Java programming language is used in

• Working in the cloud

• Exploring space at NASA

• Working with the Internet of Things

• Developing self-driving cars

• Performing big data analysis

• Making games

2. METHOD OF LEAST SQUARES

The method of least squares is a curve-fitting method that has been popular for a long time. Least squares minimizes the sum of the squares of the errors between the original data and the values predicted by the fitted equation.

Let the given data points be $(x_i, y_i)$, $i = 1, 2, 3, \dots, m$.

Suppose the curve $y = f(x)$ is fitted to this data. Let the observed value at $x = x_i$ be $y_i$, and let the corresponding value on the fitting curve be $f(x_i)$.

If $e_i$ is the error of approximation at $x = x_i$, then $e_i = y_i - f(x_i)$. Consider

$$S = [y_1 - f(x_1)]^2 + [y_2 - f(x_2)]^2 + \dots + [y_m - f(x_m)]^2 = e_1^2 + e_2^2 + \dots + e_m^2$$

The method of least squares consists in minimizing S, that is, the sum of the squares of the errors.
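
The quantity S can be evaluated directly for any candidate curve. The following is a minimal sketch, not taken from the original report: the class name `SumOfSquaresSketch`, the method `sumOfSquaredErrors`, the three data points and the two candidate lines are all hypothetical and serve only to show that a smaller S means a closer fit.

```java
import java.util.function.DoubleUnaryOperator;

public class SumOfSquaresSketch {

    // Returns S = sum over i of (y_i - f(x_i))^2 for a candidate curve f.
    static double sumOfSquaredErrors(double[] x, double[] y, DoubleUnaryOperator f) {
        double s = 0.0;
        for (int i = 0; i < x.length; i++) {
            double e = y[i] - f.applyAsDouble(x[i]); // error e_i at x = x_i
            s += e * e;
        }
        return s;
    }

    public static void main(String[] args) {
        // Made-up data, used only for illustration.
        double[] x = {1, 2, 3};
        double[] y = {2, 4, 7};

        // Two hypothetical candidate lines; the one with the smaller S fits better.
        System.out.println(sumOfSquaredErrors(x, y, t -> 2 * t));         // prints 1.0
        System.out.println(sumOfSquaredErrors(x, y, t -> 2.5 * t - 0.5)); // prints 0.25
    }
}
```

Least squares fitting amounts to searching, within a chosen family of curves, for the member that makes this value as small as possible.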

3. FITTING OF A STRAIGHT LINE

3.1 Mathematical Approach


Let the set of data points be $(x_i, y_i)$, $i = 1, 2, 3, \dots, m$.

Let the straight line to be fitted to the given data be

$$y = a_0 + a_1 x \qquad (1)$$

where $a_0$ and $a_1$ are arbitrary constants.

Then the sum of the squares of the errors is

$$S = [y_1 - (a_0 + a_1 x_1)]^2 + [y_2 - (a_0 + a_1 x_2)]^2 + \dots + [y_m - (a_0 + a_1 x_m)]^2$$

For S to be a minimum, we must have $\partial S / \partial a_0 = 0$ and $\partial S / \partial a_1 = 0$.

For $\partial S / \partial a_0 = 0$:

$$\Rightarrow -2[y_1 - (a_0 + a_1 x_1)] - 2[y_2 - (a_0 + a_1 x_2)] - \dots - 2[y_m - (a_0 + a_1 x_m)] = 0$$

$$\Rightarrow y_1 + y_2 + \dots + y_m = (a_0 + a_0 + \dots + a_0) + a_1 (x_1 + x_2 + \dots + x_m)$$

$$\Rightarrow m a_0 + a_1 \sum_{i=1}^{m} x_i = \sum_{i=1}^{m} y_i \qquad (2)$$

For $\partial S / \partial a_1 = 0$:

$$\Rightarrow -2x_1[y_1 - (a_0 + a_1 x_1)] - 2x_2[y_2 - (a_0 + a_1 x_2)] - \dots - 2x_m[y_m - (a_0 + a_1 x_m)] = 0$$

$$\Rightarrow x_1 y_1 + x_2 y_2 + \dots + x_m y_m = (a_0 x_1 + a_0 x_2 + \dots + a_0 x_m) + (a_1 x_1^2 + a_1 x_2^2 + \dots + a_1 x_m^2)$$

$$\Rightarrow a_0 \sum_{i=1}^{m} x_i + a_1 \sum_{i=1}^{m} x_i^2 = \sum_{i=1}^{m} x_i y_i \qquad (3)$$

Since the values of $x_i$ and $y_i$ are known, equations (2) and (3) can be solved for the two unknowns $a_0$ and $a_1$. These equations are called the normal equations.
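
For reference, the two normal equations (2) and (3) can be solved in closed form by elimination. The expressions below are a standard result rather than something stated in the derivation above, and they assume that the $x_i$ are not all equal, so that the denominator is non-zero.

```latex
a_1 = \frac{m \sum_{i=1}^{m} x_i y_i - \sum_{i=1}^{m} x_i \sum_{i=1}^{m} y_i}
           {m \sum_{i=1}^{m} x_i^2 - \left( \sum_{i=1}^{m} x_i \right)^2},
\qquad
a_0 = \frac{\sum_{i=1}^{m} y_i - a_1 \sum_{i=1}^{m} x_i}{m}
```

These appear to be the same expressions that the program in Section 3.3 evaluates, with its variables m and c playing the roles of $a_1$ and $a_0$, and n the number of data points.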

3.2 Example
Find the straight line that best fits the following data.

Length (x)    1    2    3    4    5
Width (y)    14   27   40   55   68

Solution:

x_i    y_i    x_i^2    x_i y_i
1      14     1        14
2      27     4        54
3      40     9        120
4      55     16       220
5      68     25       340
Sum:
15     204    55       748

From the above table, we have

$\sum x_i = 15$, $\sum y_i = 204$, $\sum x_i^2 = 55$, $\sum x_i y_i = 748$

The normal equations of the straight line $y = a_0 + a_1 x$ are

$$m a_0 + a_1 \sum x_i = \sum y_i$$

$$a_0 \sum x_i + a_1 \sum x_i^2 = \sum x_i y_i$$

With m = 5 data points, these become

$$5a_0 + 15a_1 = 204 \qquad (1)$$

$$15a_0 + 55a_1 = 748 \qquad (2)$$

Solving (1) and (2), we have $a_0 = 0$, $a_1 = 13.6$.

∴ The best-fit straight line $y = a_0 + a_1 x$ is $y = 0 + 13.6x$.
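
The solving step can be written out explicitly. The following is a sketch of one possible elimination, not necessarily the route taken by the author; the equation numbers are those of the example.

```latex
\begin{aligned}
3 \times (1):\quad & 15a_0 + 45a_1 = 612 \\
(2) - 3 \times (1):\quad & 10a_1 = 748 - 612 = 136 \;\Rightarrow\; a_1 = 13.6 \\
\text{from } (1):\quad & 5a_0 = 204 - 15(13.6) = 204 - 204 = 0 \;\Rightarrow\; a_0 = 0
\end{aligned}
```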

3.3 Source Code

// Java program to find the slope m and intercept c of the
// least-squares straight line y = m*x + c for given x and y

import static java.lang.Math.pow;

public class A {

    // Calculates and prints the m and c that best fit the points
    // represented by x[] and y[]
    static void bestApproximate(int[] x, int[] y) {
        int n = x.length;
        double m, c, sum_x = 0, sum_y = 0,
               sum_xy = 0, sum_x2 = 0;

        // Accumulate the sums that appear in the normal equations
        for (int i = 0; i < n; i++) {
            sum_x += x[i];
            sum_y += y[i];
            sum_xy += x[i] * y[i];
            sum_x2 += pow(x[i], 2);
        }

        // Closed-form solution of the two normal equations
        m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - pow(sum_x, 2));
        c = (sum_y - m * sum_x) / n;

        System.out.println("m = " + m);
        System.out.println("c = " + c);
    }

    // Driver main function
    public static void main(String[] args) {
        int[] x = { 1, 2, 3, 4, 5 };
        int[] y = { 14, 27, 40, 55, 68 };
        bestApproximate(x, y);
    }
}

3.4 OUTPUT

Graph:

Figure 1: Plot of the y-values (width) against x with the linear trend line f(x) = 13.6x + 0, R² = 1.

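4. FITTING OF nth DEGREE POLYNOMIAL
4.1 Mathematical Approach
Let the given set of data points be $(x_i, y_i)$, $i = 1, 2, 3, \dots, m$.

Let the polynomial of nth degree to be fitted to the given data points be

$$y = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n \qquad (1)$$

where $a_0, a_1, a_2, \dots, a_n$ are arbitrary constants.

Then the sum of the squares of the errors is

$$S = [y_1 - (a_0 + a_1 x_1 + a_2 x_1^2 + \dots + a_n x_1^n)]^2 + [y_2 - (a_0 + a_1 x_2 + a_2 x_2^2 + \dots + a_n x_2^n)]^2 + \dots + [y_m - (a_0 + a_1 x_m + a_2 x_m^2 + \dots + a_n x_m^n)]^2 \qquad (2)$$

For S to be a minimum, we equate the first partial derivatives of S with respect to $a_0, a_1, a_2, \dots, a_n$ to 0. Simplifying, we obtain the normal equations as

$$m a_0 + a_1 \sum x_i + a_2 \sum x_i^2 + \dots + a_n \sum x_i^n = \sum y_i$$

$$a_0 \sum x_i + a_1 \sum x_i^2 + a_2 \sum x_i^3 + \dots + a_n \sum x_i^{n+1} = \sum x_i y_i$$

$$\vdots$$

$$a_0 \sum x_i^n + a_1 \sum x_i^{n+1} + a_2 \sum x_i^{n+2} + \dots + a_n \sum x_i^{2n} = \sum x_i^n y_i \qquad (3)$$

where every sum runs over $i = 1, 2, \dots, m$.

The system (3) consists of n+1 equations in the n+1 unknowns $a_0, a_1, a_2, \dots, a_n$. By solving these equations we get the values of the unknowns and hence the required polynomial of nth degree.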
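
The procedure described above can be sketched for a general degree n as follows. This is a hedged illustration, not the report's own program: the class name `PolyFitSketch` and the method `fit` are hypothetical, and Gaussian elimination with partial pivoting is simply one convenient way to solve the normal equations. For high degrees the normal equations tend to become ill-conditioned, so this sketch is meant for small n only.

```java
import java.util.Arrays;

public class PolyFitSketch {

    // Fits y = a0 + a1*x + ... + an*x^n by building the normal equations
    // and solving them. Returns {a0, a1, ..., an}.
    static double[] fit(double[] x, double[] y, int degree) {
        int k = degree + 1;
        double[][] aug = new double[k][k + 1]; // augmented matrix [N | rhs]

        for (int r = 0; r < k; r++) {
            for (int c = 0; c < k; c++) {
                double sumPow = 0.0;
                for (double xi : x) {
                    sumPow += Math.pow(xi, r + c); // sum of x_i^(r+c)
                }
                aug[r][c] = sumPow;
            }
            double sumRhs = 0.0;
            for (int i = 0; i < x.length; i++) {
                sumRhs += Math.pow(x[i], r) * y[i]; // sum of x_i^r * y_i
            }
            aug[r][k] = sumRhs;
        }

        // Forward elimination with partial pivoting.
        for (int p = 0; p < k; p++) {
            int best = p;
            for (int r = p + 1; r < k; r++) {
                if (Math.abs(aug[r][p]) > Math.abs(aug[best][p])) {
                    best = r;
                }
            }
            double[] tmp = aug[p];
            aug[p] = aug[best];
            aug[best] = tmp;

            for (int r = p + 1; r < k; r++) {
                double factor = aug[r][p] / aug[p][p];
                for (int c = p; c <= k; c++) {
                    aug[r][c] -= factor * aug[p][c];
                }
            }
        }

        // Back substitution.
        double[] coeff = new double[k];
        for (int r = k - 1; r >= 0; r--) {
            double s = aug[r][k];
            for (int c = r + 1; c < k; c++) {
                s -= aug[r][c] * coeff[c];
            }
            coeff[r] = s / aug[r][r];
        }
        return coeff;
    }

    public static void main(String[] args) {
        // Three points that lie on y = 1 + 2x + 3x^2.
        double[] x = {0.0, 1.0, 2.0};
        double[] y = {1.0, 6.0, 17.0};
        System.out.println(Arrays.toString(fit(x, y, 2)));
    }
}
```

The sketch assumes at least n+1 distinct x values; otherwise the system is singular and the division by the pivot fails to give a meaningful result.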

4.2 Example

Fit a polynomial of second degree to the data points given in the following table.

x    0.0    1.0    2.0
y    1.0    6.0    17.0

SOLUTION: Let the second-degree polynomial to be fitted to the given data be

$$y = a_0 + a_1 x + a_2 x^2 \qquad (1)$$

where $a_0$, $a_1$, $a_2$ are arbitrary constants.

The normal equations for equation (1) are

$$\sum_{i=1}^{m} y_i = m a_0 + a_1 \sum_{i=1}^{m} x_i + a_2 \sum_{i=1}^{m} x_i^2$$

$$\sum_{i=1}^{m} x_i y_i = a_0 \sum_{i=1}^{m} x_i + a_1 \sum_{i=1}^{m} x_i^2 + a_2 \sum_{i=1}^{m} x_i^3 \qquad (2)$$

$$\sum_{i=1}^{m} x_i^2 y_i = a_0 \sum_{i=1}^{m} x_i^2 + a_1 \sum_{i=1}^{m} x_i^3 + a_2 \sum_{i=1}^{m} x_i^4$$

From the given data, the following table is obtained.

x_i    y_i     x_i^2    x_i^3    x_i^4    x_i y_i    x_i^2 y_i
0.0    1.0     0        0        0        0          0
1.0    6.0     1        1        1        6          6
2.0    17.0    4        8        16       34         68
Sum:
3      24      5        9        17       40         74

From the above table, we have

$\sum x_i = 3$, $\sum y_i = 24$, $\sum x_i^2 = 5$, $\sum x_i^3 = 9$, $\sum x_i^4 = 17$, $\sum x_i y_i = 40$, $\sum x_i^2 y_i = 74$

With m = 3 data points, the normal equations of the second-degree polynomial are

$$24 = 3a_0 + 3a_1 + 5a_2 \qquad (3)$$

$$40 = 3a_0 + 5a_1 + 9a_2 \qquad (4)$$

$$74 = 5a_0 + 9a_1 + 17a_2 \qquad (5)$$

Solving (3), (4) and (5), we get

$a_0 = 1$, $a_1 = 2$, $a_2 = 3$

Substituting these values in equation (1), the required second-degree polynomial is $y = 1 + 2x + 3x^2$.
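
The solution of equations (3), (4) and (5) can be checked by hand. The following is a sketch of one possible elimination order, not necessarily the one used by the author.

```latex
\begin{aligned}
(4) - (3):\quad & 2a_1 + 4a_2 = 16 \;\Rightarrow\; a_1 = 8 - 2a_2 \\
3 \times (5) - 5 \times (3):\quad & 12a_1 + 26a_2 = 222 - 120 = 102 \;\Rightarrow\; 6a_1 + 13a_2 = 51 \\
\text{substituting } a_1:\quad & 6(8 - 2a_2) + 13a_2 = 51 \;\Rightarrow\; a_2 = 3,\quad a_1 = 8 - 6 = 2 \\
\text{from } (3):\quad & 3a_0 = 24 - 3(2) - 5(3) = 3 \;\Rightarrow\; a_0 = 1
\end{aligned}
```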

4.3 SOURCE CODE

import java.util.Arrays;
import java.util.function.IntToDoubleFunction;
import java.util.stream.IntStream;

public class PolynomialRegression {

    // Fits y = a + b*x + c*x^2 to the points (x[i], y[i]) and prints
    // the fitted polynomial together with its value at each x[i].
    private static void polyRegression(int[] x, int[] y) {
        int n = x.length;

        // Means of x, y, x^2, x^3 and x^4, all taken over the x values
        double xm = Arrays.stream(x).average().orElse(Double.NaN);
        double ym = Arrays.stream(y).average().orElse(Double.NaN);
        double x2m = Arrays.stream(x).map(v -> v * v).average().orElse(Double.NaN);
        double x3m = Arrays.stream(x).map(v -> v * v * v).average().orElse(Double.NaN);
        double x4m = Arrays.stream(x).map(v -> v * v * v * v).average().orElse(Double.NaN);

        // Mean of x*y
        double xym = 0.0;
        for (int i = 0; i < x.length && i < y.length; ++i) {
            xym += x[i] * y[i];
        }
        xym /= Math.min(x.length, y.length);

        // Mean of x^2*y
        double x2ym = 0.0;
        for (int i = 0; i < x.length && i < y.length; ++i) {
            x2ym += x[i] * x[i] * y[i];
        }
        x2ym /= Math.min(x.length, y.length);

        // Centred second moments used to solve the normal equations
        double sxx = x2m - xm * xm;
        double sxy = xym - xm * ym;
        double sxx2 = x3m - xm * x2m;
        double sx2x2 = x4m - x2m * x2m;
        double sx2y = x2ym - x2m * ym;

        double b = (sxy * sx2x2 - sx2y * sxx2) / (sxx * sx2x2 - sxx2 * sxx2);
        double c = (sx2y * sxx - sxy * sxx2) / (sxx * sx2x2 - sxx2 * sxx2);
        double a = ym - b * xm - c * x2m;

        IntToDoubleFunction abc = (int xx) -> a + b * xx + c * xx * xx;

        System.out.println("y = " + a + " + " + b + "x + " + c + "x^2");
        System.out.println(" Input Approximation");
        System.out.println(" x y y1");
        for (int i = 0; i < n; ++i) {
            System.out.printf("%2d %3d %5.1f\n", x[i], y[i], abc.applyAsDouble(x[i]));
        }
    }

    public static void main(String[] args) {
        int[] x = IntStream.range(0, 3).toArray();
        int[] y = new int[]{1, 6, 17};
        polyRegression(x, y);
    }
}

4.4 OUTPUT

Graph:

Figure 2: Plot of the y-values against x with the polynomial trend line f(x) = 3x² + 2x + 1, R² = 1.

5. FITTING OF AN EXPONENTIAL CURVE


5.1 Mathematical Approach
Let the given data points be $(x_i, y_i)$, $i = 1, 2, \dots, m$.

Let the exponential function to be fitted to the given data be

$$y = a e^{bx} \qquad (1)$$

where a and b are arbitrary constants.

Taking the natural logarithm (base e) of both sides,

$$\ln y = \ln(a e^{bx}) = \ln a + bx \ln e = \ln a + bx \qquad [\because \ln e = \log_e e = 1]$$

Let $Y = \ln y$ and $A = \ln a$; then

$$Y = A + bx \qquad (2)$$

Equation (2) has the form of a straight line, and its normal equations are

$$mA + b \sum_{i=1}^{m} x_i = \sum_{i=1}^{m} Y_i \qquad (3)$$

$$A \sum_{i=1}^{m} x_i + b \sum_{i=1}^{m} x_i^2 = \sum_{i=1}^{m} x_i Y_i \qquad (4)$$

where $Y_i = \ln y_i$.

By solving the normal equations (3) and (4), we get the values of A and b, and then we obtain the value of a from $\ln a = A$, which implies that $a = e^A$.

By substituting the values of a and b in equation (1), we get the required exponential function fitted to the given data.
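
The steps above can be sketched in plain Java, without any external library. This is a minimal illustration under stated assumptions rather than the report's own program: the class name `ExponentialFitSketch` and the method `fit` are hypothetical, every y value is assumed to be positive so that its logarithm exists, and the data in `main` are the values used in the example of Section 5.2.

```java
public class ExponentialFitSketch {

    // Fits y = a * e^(b*x) by least squares on the linearised form
    // ln y = ln a + b*x. Returns {a, b}. Assumes every y[i] > 0.
    static double[] fit(double[] x, double[] y) {
        int m = x.length;
        double sumX = 0.0, sumY = 0.0, sumXX = 0.0, sumXY = 0.0;
        for (int i = 0; i < m; i++) {
            double bigY = Math.log(y[i]); // Y_i = ln y_i
            sumX += x[i];
            sumY += bigY;
            sumXX += x[i] * x[i];
            sumXY += x[i] * bigY;
        }
        // Solve the two normal equations for b and A = ln a.
        double b = (m * sumXY - sumX * sumY) / (m * sumXX - sumX * sumX);
        double bigA = (sumY - b * sumX) / m;
        return new double[] {Math.exp(bigA), b}; // a = e^A
    }

    public static void main(String[] args) {
        double[] x = {2, 4, 6, 8, 10};
        double[] y = {4.077, 11.084, 30.128, 81.897, 222.62};
        double[] ab = fit(x, y);
        System.out.println("a = " + ab[0] + ", b = " + ab[1]);
    }
}
```

Note that this minimizes the squared errors in ln y rather than in y itself, so the result can differ slightly from a direct nonlinear least squares fit of the exponential.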

5.2 Example

Determine the constants a and b by the method of least squares such that $y = a e^{bx}$ fits the following data.

Length (x)    2        4         6         8         10
Width (y)     4.077    11.084    30.128    81.897    222.62

SOLUTION:

Let the exponential function to be fitted to the given data be

$$y = a e^{bx} \qquad (1)$$

where a and b are arbitrary constants.

Taking the logarithm to base e of both sides,

$$\log_e y = \log_e(a e^{bx}) = \log_e a + bx \log_e e$$

$$\log_e y = \log_e a + bx$$

Let $\log_e y = Y$ and $\log_e a = A$; then

$$Y = A + bx \qquad (2)$$

Equation (2) has the form of a straight line, and the normal equations for equation (2) are

$$mA + b \sum_{i=1}^{m} x_i = \sum_{i=1}^{m} Y_i \qquad (3)$$

$$A \sum_{i=1}^{m} x_i + b \sum_{i=1}^{m} x_i^2 = \sum_{i=1}^{m} x_i Y_i \qquad (4)$$

The table below follows from the data.

x_i    y_i       Y_i = ln y_i    x_i^2    x_i Y_i
2      4.077     1.4054          4        2.8108
4      11.084    2.4055          16       9.6220
6      30.128    3.4055          36       20.4330
8      81.897    4.4055          64       35.2440
10     222.62    5.4055          100      54.0550
Sum:
30               17.0274         220      122.1648

From the above table, we have

$\sum x_i = 30$, $\sum Y_i = 17.0274$, $\sum x_i^2 = 220$, $\sum x_i Y_i = 122.1648$

Substituting these values, with m = 5, in (3) and (4):

$$5A + 30b = 17.0274 \qquad (5)$$

$$30A + 220b = 122.1648 \qquad (6)$$

Solving (5) and (6), we get

$A = 0.4054$, $b = 0.5$

Since $\log_e a = A$, this implies that

$$a = e^A = e^{0.4054} = 1.499902341$$

∴ The required exponential equation is $y = (1.499902341)\, e^{0.5x}$.
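
The step of solving equations (5) and (6) can be written out as follows. This is a sketch of one possible elimination, carried to five decimal places before rounding; it is not necessarily the route taken by the author.

```latex
\begin{aligned}
6 \times (5):\quad & 30A + 180b = 102.1644 \\
(6) - 6 \times (5):\quad & 40b = 122.1648 - 102.1644 = 20.0004 \;\Rightarrow\; b = 0.50001 \approx 0.5 \\
\text{from } (5):\quad & A = \frac{17.0274 - 30(0.50001)}{5} = \frac{2.0271}{5} = 0.40542 \approx 0.4054
\end{aligned}
```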

5.3 SOURCE CODE


import java.util.Arrays;
import org.apache.commons.math3.fitting.PolynomialCurveFitter;
import org.apache.commons.math3.fitting.WeightedObservedPoints;

public class CurveFitting {

    public static void main(String[] args) {
        // The fit is done on the linearised form ln y = ln a + b*x,
        // so each observation stores ln y rather than y.
        final WeightedObservedPoints obs = new WeightedObservedPoints();
        obs.add(2, Math.log(4.077));
        obs.add(4, Math.log(11.084));
        obs.add(6, Math.log(30.128));
        obs.add(8, Math.log(81.897));
        obs.add(10, Math.log(222.62));

        // Instantiate a first-degree polynomial fitter.
        final PolynomialCurveFitter fitter = PolynomialCurveFitter.create(1);

        // Retrieve fitted parameters (coefficients of the polynomial function):
        // coeff[0] = A = ln a, coeff[1] = b.
        final double[] coeff = fitter.fit(obs.toList());
        System.out.println(Arrays.toString(coeff));

        // Recover a from A = ln a.
        System.out.println("a = " + Math.exp(coeff[0]) + ", b = " + coeff[1]);
    }
}

5.4 OUTPUT

Graph:

Figure 3: Plot of the y-values against x with the exponential trend line f(x) = 1.5 exp(0.5x), R² = 1.

CONCLUSION
In real-life problems, it is often necessary to fit curves to hundreds of data points. It is not practical to solve problems of this size by hand, so this project, together with its source code, helps in dealing with larger numbers of data points.

In this project, the concepts of fitting a straight line, a polynomial of nth degree and an exponential curve by the method of least squares were reviewed. Source code for these methods was written in Java and executed, and the output was shown.

