03 A Polynomial Linear Regression

This document explains and demonstrates polynomial linear regression. Polynomial linear regression adds higher-order terms such as x², x³ to the feature x, so it can fit nonlinear datasets that a straight line cannot. The document implements polynomial regression in Python on a nonlinear position-level/salary dataset, where it fits the data far better than simple linear regression, reaching an R² score above 0.99.

Uploaded by

Gabriel Gheorghe

What is Polynomial Linear Regression?

In Simple Linear Regression, we fit a straight line with the formula Y = MX + C.

The straight line is chosen to best fit the dataset.
In some datasets, however, a straight line cannot fit the data well. In those cases we fit a polynomial curve through the data points instead.
The polynomial linear regression formula extends the Simple Linear Regression formula with higher-order terms:

y = b₀ + b₁x + b₂x² + ... + bₙxⁿ

- If the dataset looks like this, we do not use Linear Regression.

In this case we use Polynomial Regression or another nonlinear technique, such as decision trees.
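The formula above can be sketched directly with NumPy before turning to scikit-learn: build the design matrix [1, x, x², ...] by hand and solve for the coefficients b₀, b₁, b₂ with ordinary least squares. The data values here are made up for illustration (y = x² + 1 exactly):

```python
import numpy as np

# Toy nonlinear data (made up): y = x^2 + 1
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = x**2 + 1.0

# Polynomial design matrix with columns [1, x, x^2]
degree = 2
X_design = np.vander(x, N=degree + 1, increasing=True)

# Ordinary least squares for the coefficients b0, b1, b2
coeffs, *_ = np.linalg.lstsq(X_design, y, rcond=None)

print(np.round(coeffs, 6))  # close to [1, 0, 1]
```

Because the toy data is exactly quadratic, least squares recovers b₀ = 1, b₁ = 0, b₂ = 1.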

Implementation of the polynomial linear regression in Python

In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [2]:

dataset = pd.read_csv('data/Position_Salaries.csv')
In [3]:

dataset.head(10)

Out[3]:

            Position  Level   Salary
0   Business Analyst      1    45000
1  Junior Consultant      2    50000
2  Senior Consultant      3    60000
3            Manager      4    80000
4    Country Manager      5   110000
5     Region Manager      6   150000
6            Partner      7   200000
7     Senior Partner      8   300000
8            C-level      9   500000
9                CEO     10  1000000

In [4]:

dataset = dataset.drop(['Position'], axis=1)

In [5]:

dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Level 10 non-null int64
1 Salary 10 non-null int64
dtypes: int64(2)
memory usage: 288.0 bytes
In [6]:

sns.pairplot(dataset)

Out[6]:

<seaborn.axisgrid.PairGrid at 0x2870ff66c08>

As this plot shows, the relationship between Level and Salary is nonlinear, so we use polynomial regression here

In [7]:

X = dataset.drop(['Salary'], axis=1)
y = dataset['Salary']
In [8]:

# Now split the data


from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=42)

In [9]:

X_train.shape,X_test.shape,y_train.shape,y_test.shape

Out[9]:

((8, 1), (2, 1), (8,), (2,))

Training the Linear Regression model


In [10]:

from sklearn.linear_model import LinearRegression


lin_reg = LinearRegression()
lin_reg.fit(X_train, y_train)

Out[10]:

LinearRegression()

Accuracy (R² score) For Linear Regression

In [11]:

print("Training Accuracy :", lin_reg.score(X_train, y_train))


print("Testing Accuracy :", lin_reg.score(X_test, y_test))

Training Accuracy : 0.6366049276570868


Testing Accuracy : 0.8451346684575975

Training the Polynomial Regression model


In [12]:

from sklearn.preprocessing import PolynomialFeatures


poly_reg = PolynomialFeatures(degree = 4)
X_poly = poly_reg.fit_transform(X_train)
lin_reg_2 = LinearRegression()
lin_reg_2.fit(X_poly, y_train)
X_poly_test = poly_reg.transform(X_test)
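For intuition, here is what PolynomialFeatures(degree = 4) does to a single feature column: each value x is expanded into the row [1, x, x², x³, x⁴]. The sample values below are made up:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# A single feature column with two sample values (made up)
X_small = np.array([[2.0], [3.0]])

poly = PolynomialFeatures(degree=4)
X_expanded = poly.fit_transform(X_small)

print(X_expanded)
# row for x=2: [1, 2, 4, 8, 16]
# row for x=3: [1, 3, 9, 27, 81]
```

The linear model is then fitted on these five columns, which is why it can trace a curved relationship between Level and Salary.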

Accuracy (R² score) For Polynomial Regression


In [13]:

print("Training Accuracy :", lin_reg_2.score(X_poly, y_train))


print("Testing Accuracy :", lin_reg_2.score(X_poly_test, y_test))

Training Accuracy : 0.9995857211026754


Testing Accuracy : 0.9714666803841844

See above that when I apply only Linear Regression on this data, the score is not good, but when I apply Polynomial Regression, the score is very high.

- So I hope you now understand when to apply Linear Regression and when to
apply Polynomial Regression
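A note on terminology: for regressors, `.score` returns the R² score (coefficient of determination), not a classification accuracy. As a sketch, R² can be computed by hand; the numbers below are toy values, not the dataset above:

```python
import numpy as np

# R^2 (what .score reports for a regressor), computed by hand on toy numbers
y_true = np.array([100.0, 200.0, 300.0])
y_pred = np.array([110.0, 190.0, 310.0])

ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
r2 = 1 - ss_res / ss_tot

print(round(r2, 4))  # 0.985
```

An R² of 1.0 means a perfect fit; values near 0 mean the model does no better than predicting the mean.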

Graph of the Linear Regression fit

In [14]:

plt.scatter(X_train, y_train, color = 'red')


plt.plot(X_train, lin_reg.predict(X_train), color = 'blue')
plt.title('Truth or Bluff (Linear Regression)')
plt.xlabel('Position Level')
plt.ylabel('Salary')
plt.show()

Graph of the Polynomial Regression fit


In [15]:

plt.scatter(X, y, color = 'red')


plt.plot(X, lin_reg_2.predict(poly_reg.transform(X)), color = 'blue')
plt.title('Truth or Bluff (Polynomial Regression)')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.show()
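The polynomial curve above connects predictions only at the ten integer levels, so it can look jagged. A common refinement is to predict on a dense grid of x values. The sketch below is self-contained: it uses np.polyfit as a stand-in for the fitted pipeline above, with made-up salary values, and saves the figure instead of showing it:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

# Made-up stand-ins for the dataset and model fitted above
x = np.arange(1, 11, dtype=float)
y = 45000 * 1.35 ** (x - 1)        # rough stand-in for the salary curve
coeffs = np.polyfit(x, y, deg=4)   # degree-4 fit, like PolynomialFeatures(degree=4)

# A dense grid gives a smooth curve instead of straight segments between levels
x_grid = np.linspace(x.min(), x.max(), 91)
y_grid = np.polyval(coeffs, x_grid)

plt.scatter(x, y, color='red')
plt.plot(x_grid, y_grid, color='blue')
plt.title('Truth or Bluff (Polynomial Regression, dense grid)')
plt.xlabel('Position level')
plt.ylabel('Salary')
plt.savefig('poly_fit_smooth.png')
```

The same idea applies to the notebook's own model: build the grid with np.linspace, expand it with poly_reg.transform, and plot lin_reg_2's predictions on it.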
