Practical 5

This document describes using linear regression with multiple variables to predict housing prices. It loads the housing data, splits it into training and test sets, initializes the model parameters randomly, computes predictions and the squared-error cost, and then runs gradient descent over many iterations to minimize that cost.


LINEAR REGRESSION WITH MULTIPLE VARIABLES

IMPORTING LIBRARIES

In [1]: import numpy as np
import matplotlib.pyplot as plt


IMPORTING DATA SET

In [2]: #loading the dataset with the help of numpy
#the delimiter below is assumed to be a comma for this file
Data_Set = np.loadtxt("LinearReg_Multivariate.txt", dtype = int, delimiter = ",")
Data_Set

Out[2]: array([[ 2104, 3, 399900],


[ 1600, 3, 329900],
[ 2400, 3, 369000],
[ 1416, 2, 232000],
[ 3000, 4, 539900],
[ 1985, 4, 299900],
[ 1534, 3, 314900],
[ 1427, 3, 198999],
[ 1380, 3, 212000],
[ 1494, 3, 242500],
[ 1940, 4, 239999],
[ 2000, 3, 347000],
[ 1890, 3, 329999],
[ 4478, 5, 699900],
[ 1268, 3, 259900],
[ 2300, 4, 449900],
[ 1320, 2, 299900],
[ 1236, 3, 199900],
[ 2609, 4, 499998],
[ 3031, 4, 599000],
[ 1767, 3, 252900],
[ 1888, 2, 255000],
[ 1604, 3, 242900],
[ 1962, 4, 259900],
[ 3890, 3, 573900],
[ 1100, 3, 249900],
[ 1458, 3, 464500],
[ 2526, 3, 469000],
[ 2200, 3, 475000],
[ 2637, 3, 299900],
[ 1839, 2, 349900],
[ 1000, 1, 169900],
[ 2040, 4, 314900],
[ 3137, 3, 579900],
[ 1811, 4, 285900],
[ 1437, 3, 249900],
[ 1239, 3, 229900],
[ 2132, 4, 345000],
[ 4215, 4, 549000],
[ 2162, 4, 287000],
[ 1664, 2, 368500],
[ 2238, 3, 329900],
[ 2567, 4, 314000],
[ 1200, 3, 299000],
[ 852, 2, 179900],
[ 1852, 4, 299900],
[ 1203, 3, 239500]])


PLOTTING THE DATA-POINTS

In [3]: plt.axes(projection='3d')
plt.plot(Data_Set[:,0],Data_Set[:,1],Data_Set[:,2],'.')

Out[3]: [<mpl_toolkits.mplot3d.art3d.Line3D at 0x20315413990>]

SPLITTING DATASET FOR TRAINING AND TESTING

In [4]: #Training set of x
#Here x_Train is a combination array of x0(all ones), x1, x2(given)
x_Train = np.array(([np.ones(35),Data_Set[:35,0],Data_Set[:35,1]]),dtype = np.int64)
#Training set of y
y_Train = np.array(([Data_Set[:35,2]]),dtype = np.int64)
#Testing set of x
x_Test = np.array(([np.ones(12),Data_Set[35:,0],Data_Set[35:,1]]),dtype = np.int64)
#Testing set of y
y_Test = np.array(([Data_Set[35:,2]]),dtype = np.int64)
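
A quick sanity check (an added sketch, not part of the original notebook): because each example is stored as a column with the bias feature x0 = 1 in the first row, the arrays built above should have the following shapes.

In [ ]: #Added sketch: confirm the column-wise layout of the split arrays
print(x_Train.shape, y_Train.shape)   #expected: (3, 35) (1, 35)
print(x_Test.shape, y_Test.shape)     #expected: (3, 12) (1, 12)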

SIZES OF THE TRAINING AND TESTING DATA

In [5]: s1 = 35 #size of training set
s2 = 12 #size of testing set


INITIALIZING THETA MATRIX RANDOMLY

In [6]: #Here W = Theta matrix
#we are initializing it with random values with the help of numpy
rng = np.random.default_rng()
W = rng.random([3,1])
W

Out[6]: array([[0.41848923],
[0.32338244],
[0.43007395]])

EVALUATING HYPOTHESIS (predicted y)

In [7]: #hypothesis
h = np.dot((W.T), x_Train)
h

Out[7]: array([[ 682.1053649 , 519.12061512, 777.82656714, 459.18817221,


972.28610511, 644.05292849, 497.77737408, 463.175453 ,
447.97647832, 484.84207648, 629.50071868, 648.47359113,
612.90152273, 1450.67542542, 411.75764504, 745.91839709,
428.14345796, 401.40940695, 845.84357106, 982.31096075,
573.12548261, 611.8246839 , 520.41414488, 636.61513236,
1259.66640278, 357.42939511, 473.20030864, 818.57275459,
713.15007914, 854.46820543, 595.97894434, 324.2310032 ,
661.83896269, 1016.15942544, 587.78438392]])
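
Written out, each predicted value above is the dot product of the parameter vector with one column of x_Train (with W standing in for the usual theta and x0 = 1):

$h_W(x) = W^{\top}x = W_0 \cdot 1 + W_1 x_1 + W_2 x_2$

np.dot(W.T, x_Train) evaluates this for all 35 training examples at once, producing the (1, 35) row of predictions shown.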

EVALUATING SQUARE ERROR

In [8]: #SE = square error = array of square errors of every individual data-point
SE = ((h - y_Train) ** 2)
print(SE)

[[1.59374927e+11 1.08491764e+11 1.35587569e+11 5.36111475e+10


2.90443081e+11 8.95541219e+10 9.88487576e+10 3.94164736e+10
4.47542587e+10 5.85713367e+10 5.72977572e+10 1.19959380e+11
1.08495202e+11 4.87831459e+11 6.73341479e+10 2.01739389e+11
8.96833929e+10 3.97996876e+10 2.49152875e+11 3.57625156e+11
6.36688516e+10 6.47133437e+10 5.87478636e+10 6.72175027e+10
3.27916952e+11 6.22714945e+10 2.15320871e+11 2.19193849e+11
2.24948016e+11 8.94282301e+10 1.22013299e+11 2.87559414e+10
9.87456219e+10 3.35106501e+11 8.14030604e+10]]

SUM OF SQUARE-ERROR

In [9]: #Sum_SE = sum of square errors
Sum_SE = np.sum(SE)
Sum_SE

Out[9]: 4917023281209.861


EVALUATING THE COST FUNCTION

In [10]: #Cost function
Cost_Func = (Sum_SE / (2 * s1))
Cost_Func

Out[10]: 70243189731.56944
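
The value above is the standard half mean-squared-error cost, written here for reference with m = s1 = 35 training examples:

$J(W) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_W(x^{(i)}) - y^{(i)} \right)^2$

which is exactly what Cost_Func = Sum_SE / (2 * s1) computes.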

GIVING THE VALUES OF HYPER-PARAMETERS

In [11]: #giving the no. of iterations
Iter_No = 60000
#giving the learning rate
Alpha = 0.00000000001
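
The learning rate has to be this tiny because the raw features sit on very different scales (living area roughly 850-4500 versus 1-5 rooms). A common alternative, sketched below as an addition to the notebook rather than part of the original run, is to standardize the features first; with scaled features a much larger Alpha (for example 0.01) usually converges in far fewer iterations.

In [ ]: #Added sketch (assumption): mean/std feature scaling before gradient descent.
#Only the feature rows (1 and 2) are scaled; the bias row of ones is left untouched.
mu = x_Train[1:, :].mean(axis = 1, keepdims = True)
sigma = x_Train[1:, :].std(axis = 1, keepdims = True)
x_Train_scaled = x_Train.astype(float)
x_Train_scaled[1:, :] = (x_Train[1:, :] - mu) / sigma
#The test set must reuse the training-set mu and sigma
x_Test_scaled = x_Test.astype(float)
x_Test_scaled[1:, :] = (x_Test[1:, :] - mu) / sigma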

CREATING AN EMPTY LIST FOR COST-FUNCTIONS

In [12]: #Creating an empty list for cost functions
L = list([])
L.append(Cost_Func)


APPLYING GRADIENT DESCENT ALGORITHM

In [13]: print(f"Iteration 1: Cost function = {Cost_Func}")
loss = h - y_Train
#Evaluating the derivatives
Diff_W = (np.sum(loss * x_Train)) / s1
for i in range(2, Iter_No+1):
    #Evaluating new / updated W (Thetas)
    W -= (Alpha)*(Diff_W)
    #new hypothesis
    h1 = np.dot((W.T),x_Train)
    #new Square Error
    SE = ((h1 - y_Train) ** 2)
    Sum_SE = np.sum(SE)
    #new updated cost function
    Cost_Func = (Sum_SE / (2 * s1))
    loss = h1 - y_Train
    #new derivatives
    Diff_W = (np.sum(loss * x_Train)) / s1
    #appending the new cost function into the list
    L.append(Cost_Func)
    if (i % 6000 == 0):
        print(f"Iteration {i}: Cost function = {Cost_Func}")

Iteration 1: Cost function = 70243189731.56944
Iteration 6000: Cost function = 40767959093.04384
Iteration 12000: Cost function = 24069605076.11786
Iteration 18000: Cost function = 14611694226.564194
Iteration 24000: Cost function = 9254754401.990297
Iteration 30000: Cost function = 6220595512.084834
Iteration 36000: Cost function = 4502054615.477992
Iteration 42000: Cost function = 3528676845.7321734
Iteration 48000: Cost function = 2977357825.9138227
Iteration 54000: Cost function = 2665091950.258117
Iteration 60000: Cost function = 2488225247.6887264
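
Note that Diff_W above is a single scalar (np.sum collapses the whole loss * x_Train product), so every iteration applies the same correction to all three parameters; that is why the final thetas below come out nearly identical. The textbook batch-gradient-descent update computes one partial derivative per parameter. A minimal sketch of that variant, reusing the notebook's variable names but not part of the original run:

In [ ]: #Added sketch (alternative): one partial derivative per parameter
for i in range(2, Iter_No + 1):
    h1 = np.dot(W.T, x_Train)               #(1, 35) predictions
    loss = h1 - y_Train                     #(1, 35) residuals
    Diff_W = np.dot(x_Train, loss.T) / s1   #(3, 1) gradient of the cost
    W -= Alpha * Diff_W                     #simultaneous update of all thetas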

FINAL THETAS

In [14]: #Final Thetas
W

Out[14]: array([[159.96456673],
[159.86945994],
[159.97615145]])


PLOTTING THE COST-FUNCTION VS NO. OF ITERATIONS GRAPH

In [15]: #I = no. of Iterations array
I = np.arange(1,60001)
fig = plt.figure()
fig.set_figheight(8)
fig.set_figwidth(15)
print(plt.xlabel("No.of Iterations"))
print(plt.ylabel("Cost Function"))
print(plt.plot(I, L, 'green', label = 'J vs Iter_No'))
print(plt.title("Cost function vs No of Iterations graph"))
print(plt.legend())

Text(0.5, 0, 'No.of Iterations')


Text(0, 0.5, 'Cost Function')
[<matplotlib.lines.Line2D object at 0x000002031541F850>]
Text(0.5, 1.0, 'Cost function vs No of Iterations graph')
Legend


VALIDATION

In [16]: h_Valid = np.dot(W.T,x_Train)
loss_Valid = h_Valid - y_Train
loss_Valid

Out[16]: array([[ -62894.76326182, -73468.97107238, 15326.59688089,


-5144.92785308, -59491.7510027 , 18240.74715659,
-69020.35542852, 29774.61235773, 9259.74774048,
-3015.13382618, 70947.62145922, -26621.18709574,
-27205.82768932, 16955.28694241, -56545.63177298,
-81400.37296181, -88392.39600747, -1661.45449111,
-82098.70983986, -113635.79774452, 30229.22873787,
47313.45723935, 14170.50676739, 54563.74957793,
48632.09219385, -73403.70104317, -230770.43438408,
-64529.85116647, -122647.29510743, 122315.65888705,
-55420.14629779, -9710.59934023, 12033.56745338,
-77749.61114216, 4423.46112675]])
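
To summarize the residuals above in a single number, the root-mean-square error can be computed on the training set; this is an added sketch, not part of the original notebook, and the same expression applies to loss_Test in the testing step further down.

In [ ]: #Added sketch: RMSE of the training-set residuals, in the same units as the price
rmse_train = np.sqrt(np.mean(loss_Valid ** 2))
rmse_train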


VALIDATION GRAPH

In [17]: #Validation plot
h_Valid = np.reshape(h_Valid, newshape = ([35,1]))
y_Train = np.reshape(y_Train, newshape = ([35,1]))
print(plt.plot(y_Train, h_Valid, '.',label = 'Distortion in Validation'))
x = np.arange(750000)
y = x
print(plt.plot(x, y, '-',label = 'Y=X line'))
print(plt.ylabel("Predicted Values"))
print(plt.xlabel("Actual Values"))
print(plt.title("Validation Graph"))
plt.legend()

[<matplotlib.lines.Line2D object at 0x0000020316E284D0>]


[<matplotlib.lines.Line2D object at 0x0000020316E2B590>]
Text(0, 0.5, 'Predicted Values')
Text(0.5, 0, 'Actual Values')
Text(0.5, 1.0, 'Validation Graph')

Out[17]: <matplotlib.legend.Legend at 0x20315c5fb50>


TESTING

In [18]: #Testing
h_Test = np.dot(W.T,x_Test)
loss_Test = h_Test - y_Test
loss_Test

Out[18]: array([[ -19527.69304285, -31181.84611129, -3358.442232 ,


125649.64282632, 59437.64156625, -101997.30178757,
28527.74437035, 97184.77284259, -106516.75504901,
-43211.30326013, -3021.89101564, -46537.14666918]])


TESTING GRAPH

In [19]: #Testing plot
h_Test = np.reshape(h_Test, newshape = ([12,1]))
y_Test = np.reshape(y_Test, newshape = ([12,1]))
print(plt.plot(y_Test, h_Test, '.',label = 'Distortion in Testing'))
x = np.arange(750000)
y = x
print(plt.plot(x, y, '-',label = 'Y=X line'))
print(plt.ylabel("Predicted Values"))
print(plt.xlabel("Actual Values"))
print(plt.title("Testing Graph"))
plt.legend()

[<matplotlib.lines.Line2D object at 0x0000020313F1C9D0>]


[<matplotlib.lines.Line2D object at 0x0000020316EB32D0>]
Text(0, 0.5, 'Predicted Values')
Text(0.5, 0, 'Actual Values')
Text(0.5, 1.0, 'Testing Graph')

Out[19]: <matplotlib.legend.Legend at 0x20316e86410>


FINAL HYPER-PLANE and DATA-POINTS

In [20]: plt.axes(projection='3d')
plt.plot(Data_Set[:,0],Data_Set[:,1],Data_Set[:,2],'.',color = 'green')
plt.title('Data-Points')
ax = fig.add_subplot(111,projection='3d')
plt.xlabel('X1(Feature-1)')
plt.ylabel('X2(Feature-2)')
#Final Hyper-plane
a = W[1]
b = -1
c = W[2]
d = -(W[0])
x = np.linspace(-100,100,10)
y = np.linspace(-100,100,10)
X,Y = np.meshgrid(x,y)
Z = (d + a*X + b*Y) / c
fig = plt.figure()
ax = fig.add_subplot(111,projection='3d')
surf = ax.plot_surface(X, Z, Y , label = 'oooo')
ax.view_init(30,160)
plt.title('Final Hyper-Plane')
plt.xlabel('X1(Feature-1)')
plt.ylabel('X2(Feature-2)')

Out[20]: Text(0.5, 0.5, 'X2(Feature-2)')
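
The surface in the cell above is evaluated over x1, x2 in [-100, 100], which lies well outside the range of the actual data (area roughly 850-4500, rooms 1-5). A variant that evaluates the fitted plane price = W0 + W1*x1 + W2*x2 directly over the observed feature range is sketched below; this is an addition, not part of the original notebook.

In [ ]: #Added sketch: plot the fitted plane over the observed feature range
fig = plt.figure()
ax = fig.add_subplot(111, projection = '3d')
x1 = np.linspace(Data_Set[:,0].min(), Data_Set[:,0].max(), 10)
x2 = np.linspace(Data_Set[:,1].min(), Data_Set[:,1].max(), 10)
X1, X2 = np.meshgrid(x1, x2)
Price = W[0] + W[1]*X1 + W[2]*X2   #hypothesis evaluated on the grid
ax.plot_surface(X1, X2, Price, alpha = 0.5)
ax.scatter(Data_Set[:,0], Data_Set[:,1], Data_Set[:,2], color = 'green')
ax.set_xlabel('X1(Feature-1)')
ax.set_ylabel('X2(Feature-2)')
ax.set_zlabel('Price')
plt.title('Fitted Plane over the Data Range')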

