
!"#$%&'((')*+, · -.

'/0 &'((') A7=8%78 U*0%,06+0*4

J?NTTY@%SX

*(.8'5()B(.-'%
=03?$
Q606%A57*807,0%684
HA%A0/4*80%60 X'/%26F*%M%>+**%<*<.*+K'8(3%,0'+3%(*>0%027,%<'802G%A7=8%/:%>'+%H*47/<%684%=*0%68%*R0+6%'8*
S',0'8%987F*+,703G
?*64%<3%.('=Z
200:,Z[[+*=*8*+607F*0
'463G5'<[

&'((')

DC

12'0'%.3%+*4526+(7*%'8%98,:(6,2

!"#$%"&'(#)*+,-+..'"%)/-"&
01-(213)'%)!$23"%
;*6+8%0'%7<:(*<*80%:'(38'<76(%+*=+*,,7'8%>+'<%,5+6052%)702%,'<*
,7<:(*%:302'8%5'4*

?6,2746%@6,+78%A/5B3 -/=%CD · E%<78%+*64

Polynomial regression is an improved version of linear regression. If you know linear regression, it will be simple for you. If not, I will explain the formulas here in this article. There are other, more advanced and more efficient machine learning algorithms out there, but it is a good idea to learn linear-based regression techniques because they are simple and fast and work with well-known formulas, though they may not work well with a complex set of data.

!"#$%"&'(#)*+,-+..'"%)/"-&0#(
Linear regression can perform well only if there is a linear correlation between the input variables and the output variable. As I mentioned before, polynomial regression is built on linear regression. If you need a refresher on linear regression, here is the link:

Linear Regression Algorithm in Python
Learn the concepts of linear regression and develop a complete linear regression algorithm from scratch in python
towardsdatascience.com

Polynomial regression can find the relationship between the input features and the output variable in a better way, even if the relationship is not linear. It uses the same formula as linear regression:

Y = BX + C

I am sure we all learned this formula in school. For linear regression, we use symbols like this:

h = θ₀ + θ₁X

Here, we get X and Y from the dataset. X is the input feature and Y is the output variable. The theta values are initialized randomly.

For polynomial regression, the formula becomes like this:

h = θ₀ + θ₁X + θ₂X² + θ₃X³ + … + θₙXⁿ

We are adding more terms here. We are using the same input feature and taking different powers of it to make more features. That way, our algorithm will be able to learn the data better.

The powers do not have to be 2, 3, or 4. They could be 1/2, 1/3, or 1/4 as well. Then the formula will look like this:

h = θ₀ + θ₁X + θ₂X^(1/2) + θ₃X^(1/3) + …
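As a small illustration of fractional-power features (my own sketch; the column names are invented, not from the article), they can be added as extra columns exactly the way integer powers are:

import pandas as pd

# Hypothetical example: expand a single input column with fractional powers.
X = pd.DataFrame({'Level': [1, 2, 3, 4, 5]})
X['Level_sqrt'] = X['Level']**0.5      # X^(1/2)
X['Level_cbrt'] = X['Level']**(1/3)    # X^(1/3)
print(X)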

1".2)/0%32'"%)4%5)6-(5'+%2)7+.3+%2
The cost function gives an idea of how far the predicted hypothesis is from the actual values. The formula is:

J(θ) = (1/2m) · Σ (hᵢ − yᵢ)²

This equation may look complicated, but it is doing a simple calculation: first, deduct the hypothesis from the original output variable; take the square to eliminate the negative values; then divide the sum by 2 times the number of training examples m.
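For a quick worked check (my own numbers, not the article's): with predictions h = [2, 3], actual values y = [1, 5], and m = 2, the cost is ((2 − 1)² + (3 − 5)²) / (2 · 2) = (1 + 4) / 4 = 1.25.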

What is gradient descent? It helps in fine-tuning our randomly initialized theta values. I am not going into the differential calculus here. If you take the partial differential of the cost function with respect to each theta, you can derive these formulas (since the hypothesis is linear in the thetas, the derivative of h with respect to θⱼ is simply the corresponding feature column Xⱼ):

θⱼ := θⱼ − (α/m) · Σ (hᵢ − yᵢ) · Xᵢⱼ

Here, alpha is the learning rate. You choose the value of alpha.

!$28"%)9&:#+&+%2(2'"%)";)!"#$%"&'(#)*+,-+..'"%
Here is the step-by-step implementation of polynomial regression.

1. We will use a simple dummy dataset for this example that gives the salaries for job positions. Import the dataset:

import pandas as pd
import numpy as np
df = pd.read_csv('position_salaries.csv')
df.head()
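If you do not have the CSV file handy, a comparable dummy dataset can be constructed inline; the positions and numbers below are invented for illustration, not the article's actual data:

import pandas as pd
import numpy as np

# Hypothetical stand-in for position_salaries.csv: one row per job position.
df = pd.DataFrame({
    'Position': ['Analyst', 'Consultant', 'Manager', 'Director', 'CEO'],
    'Level': [1, 2, 3, 4, 5],
    'Salary': [45000, 60000, 90000, 150000, 300000],
})
df.head()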

2. Add the bias column for theta 0. This bias column will contain only 1s, because multiplying a number by 1 does not change it.

df = pd.concat([pd.Series(1, index=df.index, name='00'), df], axis=1)
df.head()

3. Delete the ‘Position’ column, because it contains strings and the algorithm cannot work with strings. We have the ‘Level’ column to represent the positions.

df = df.drop(columns='Position')

4. Define our input variable X and the output variable y. In this example, ‘Level’ is the input feature and ‘Salary’ is the output variable. We want to predict the salary from the level.

y = df['Salary']
X = df.drop(columns = 'Salary')
X.head()

5. Take powers of the ‘Level’ column to make the ‘Level1’ (squared) and ‘Level2’ (cubed) columns:

X['Level1'] = X['Level']**2
X['Level2'] = X['Level']**3
X.head()
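As a side note, this step generalizes to any degree with a loop; the degree variable below is my own addition, and the naming follows the article's pattern (‘Level1’ for the square, ‘Level2’ for the cube, and so on):

# Hypothetical generalization: add polynomial features up to a chosen degree.
degree = 4
for power in range(2, degree + 1):
    X['Level' + str(power - 1)] = X['Level']**power
X.head()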

6. Now, normalize the data. Divide each column by the maximum value of that column. That way, the values of each column will range from 0 to 1 (the levels here are all positive). The algorithm should work even without normalization, but it helps the training converge faster. Also, calculate m, the length of the dataset:

m = len(X)
X = X/X.max()

7. Define the hypothesis function, which uses X and theta to predict y:

def hypothesis(X, theta):
    y1 = theta*X               # element-wise product theta_j * X_j per column
    return np.sum(y1, axis=1)  # sum across columns: theta_0*1 + theta_1*X + ...

8. Define the cost function, using our cost formula from above:

def cost(X, y, theta):
    y1 = hypothesis(X, theta)
    return np.sum((y1 - y)**2)/(2*m)  # sum of squared errors over 2m
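A quick sanity check (my own addition, not part of the article): with all-zero theta, the hypothesis predicts 0 everywhere, so the cost reduces to the sum of the squared salaries divided by 2m:

theta_zero = np.array([0.0]*len(X.columns))
print(cost(X, y, theta_zero))   # cost with all-zero parameters
print(np.sum(y**2)/(2*m))       # should print the same value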

9. Write the function for gradient descent. We will keep updating the theta values until we find the optimum cost. For each iteration, we will record the cost for later analysis:

def gradientDescent(X, y, theta, alpha, epoch):
    J = []  # cost history, one entry per epoch
    k = 0
    while k < epoch:
        y1 = hypothesis(X, theta)
        # update each theta using the partial derivative of the cost
        for c in range(0, len(X.columns)):
            theta[c] = theta[c] - alpha*sum((y1 - y)*X.iloc[:, c])/m
        j = cost(X, y, theta)
        J.append(j)
        k += 1
    return J, theta

10. All the functions are defined. Now, initialize the theta. I am initializing an array of zeros; you can use other random values instead. I am choosing an alpha of 0.05, and I will iterate the theta values for 700 epochs.

theta = np.array([0.0]*len(X.columns))
J, theta = gradientDescent(X, y, theta, 0.05, 700)
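Optionally (my own addition), you can inspect the learned parameters and the final cost before plotting:

print(theta)    # learned parameters, one per feature column
print(J[-1])    # cost after the last epoch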

11. We got our final theta values, as well as the cost from each iteration. Let's make the salary prediction using our final theta.

y_hat = hypothesis(X, theta)

12. Now plot the original salary and our predicted salary against the levels.

%matplotlib inline
import matplotlib.pyplot as plt
plt.figure()
plt.scatter(x=X['Level'], y=y)      # original salaries
plt.scatter(x=X['Level'], y=y_hat)  # predicted salaries
plt.show()

Our prediction does not follow the trend of the salaries exactly, but it is close. Linear regression can only return a straight line, whereas with polynomial regression we can get a curved line like this. Even if the trend were not a nice curve, polynomial regression could learn more complex shapes as well.

13. Let's plot the cost we calculated in each epoch of our gradient descent function.

plt.figure()
plt.scatter(x=list(range(0, 700)), y=J)
plt.show()

The cost fell drastically in the beginning and then the fall slowed. In a good machine learning algorithm, the cost should keep going down until convergence. Please feel free to try different numbers of epochs and different learning rates (alpha).
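One way to experiment with this, sketched under my own assumptions rather than taken from the article, is a variant of the gradient descent function that stops early once the cost stops improving by more than a tolerance:

def gradientDescentEarlyStop(X, y, theta, alpha, max_epoch, tol=1e-9):
    # Same updates as gradientDescent, but stop when the cost barely changes.
    J = []
    for _ in range(max_epoch):
        y1 = hypothesis(X, theta)
        for c in range(0, len(X.columns)):
            theta[c] = theta[c] - alpha*sum((y1 - y)*X.iloc[:, c])/m
        J.append(cost(X, y, theta))
        if len(J) > 1 and abs(J[-2] - J[-1]) < tol:
            break
    return J, theta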

Here is the dataset: salary_data

Follow this link for the full working code: Polynomial Regression

*+3"&&+%5+5)-+(5'%,<

6%2+-(12'7+)8+".9(2'(#):(2();'.<(#'=(2'"%)'%)!$23"%
H6:%,:*57>75%:6+0,%'>%02*%)'+(4I%:+*,*80%02*%*F*80,%'8%70%684
86F7=60*%6+'/84
0')6+4,4606,57*85*G5'<

0'&'#(-)>+?2.)0+(-13)6%)!$23"%)@'23)5)/+A)4'%+.)BC)D"E+F)5%)G4!
!-"H+12
&784%,7<7(6+%J7B7:*476%:+'>7(*,%/,78=%5'/80KF*50'+7L*+%684%8*6+*,0K
8*7=2.'+%<*02'4%78%1302'8I%6%,7<:(*%684%/,*>/(M
<*47/<G5'<

4",'.2'1)*+,-+..'"%)'%)!$23"%)>"):+2+12)I+(-2):'.+(.+
N<:'+0680%*O/607'8,%0'%4*F*(':%6%('=7,075%+*=+*,,7'8%6(='+702<%684
P')%0'%4*F*(':%6%('=7,075%+*=+*,,7'8%6(='+702<%)702M
0')6+4,4606,57*85*G5'<

J<'#E)5)G+<-(#)G+2A"-K)/-"&)01-(213)6%)!$23"%
Q*067(%*R:(68607'8%684%,0*:%.3%,0*:%7<:(*<*80607'8%'>%6%@*/+6(
@*0)'+B
<*47/<G5'<

J<'#E)5)*+1"&&+%E(2'"%)0$.2+&)L.'%,)0'&9#+)D"E+.)'%)!$23"%
P')%0'%./7(4%6%<'F7*%+*5'<<*84607'8%,3,0*<%78%1302'8
<*47/<G5'<

=',%)0:);"-)>8+)7('#$)!'3?
S3%T')6+4,%Q606%A57*85*

P684,K'8%+*6(K)'+(4%*R6<:(*,I%+*,*6+52I%0/0'+76(,I%684%5/0078=K*4=*%0*5287O/*,
4*(7F*+*4%H'8463%0'%T2/+,463G%H6B*%(*6+878=%3'/+%467(3%+70/6(G%T6B*%6%(''B

X'/+%*<67( U*0%027,%8*),(*00*+

S3%,7=878=%/:I%3'/%)7((%5+*60*%6%H*47/<%655'/80%7>%3'/%4'8V0%6(+*643%26F*%'8*G%?*F7*)%'/+%1+7F653%1'(753%>'+%<'+*%78>'+<607'8
6.'/0%'/+%:+7F653%:+65075*,G

DC%

H65278*%;*6+878= Q606%A57*85* 1+'=+6<<78= -+0>7576(%N80*((7=*85* T*528'('=3

@"-+);-"&)>"A(-5.)7(2()=3'+%3+ &'((')

-%H*47/<%:/.(75607'8%,26+78=%5'85*:0,I%74*6,I%684%5'4*,G

?*64%<'+*%>+'<%T')6+4,%Q606%A57*85*

@"-+)/-"&)@+5'0&

B"(&)18"&.?$)"%)28+ 4%)+%5D2"D+%5)&(38'%+ >+%)7++:)C+(-%'%, G0K+-%+2+.)'.


/020-+)";)7++:)C+(-%'%, #+(-%'%,):-"E+32)A'28 1"%3+:2.)I"0)=8"0#5 5+:-+3(2'%,)7"3?+-)'%
-84+*)%$/'%78%T')6+4,%Q606
!$28"%)!(%5(.F)G+-(.F G%"A);"-)7(2()=3'+%3+ 28+)0:3"&'%,)-+#+(.+
A57*85*
/#(.?F)7"3?+-)(%5 9%2+-J'+A. Q7<70+7,%1'/(':'/(',%78
H+-"?0 T*+*85*%A%78%T')6+4,%Q606 T')6+4,%Q606%A57*85*
?368%;6<.%78%T')6+4,%Q606 A57*85*
A57*85*

H"A)9)=A'238+5)2")7(2( !$28"%)4#"%+)L"%M2)6+2 >":)OP)!$28"%)6Q9 H"A)&(%$)5(2()2$:+.


=3'+%3+ I"0)()7(2()=3'+%3+)N"K /-(&+A"-?.);"- 3(%)$"0)%(&+R
?6,2746%@6,+78%A/5B3%78 H'26<<*4%-36+%78%T')6+4,
7+J+#":+-. W6,,7*%$'L3+B'F%78%T')6+4,
T')6+4,%Q606%A57*85* Q606%A57*85* W(67+*%QG%W',06%78%T')6+4,%Q606 Q606%A57*85*
A57*85*

-.'/0 P*(: ;*=6(

You might also like