Ash Regression
May 8, 2024
Mounted at /content/drive
[2]:
[3]: df.rename(columns={'MEDV':'Price'},inplace=True)
[4]: df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 506 entries, 0 to 505
Data columns (total 14 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 CRIM 486 non-null float64
1 ZN 486 non-null float64
2 INDUS 486 non-null float64
3 CHAS 486 non-null float64
4 NOX 506 non-null float64
5 RM 506 non-null float64
6 AGE 486 non-null float64
7 DIS 506 non-null float64
8 RAD 506 non-null int64
9 TAX 506 non-null int64
10 PTRATIO 506 non-null float64
11 B 506 non-null float64
12 LSTAT 486 non-null float64
13 Price 506 non-null float64
dtypes: float64(12), int64(2)
memory usage: 55.5 KB
[5]: df=df.dropna()
df.info()
<class 'pandas.core.frame.DataFrame'>
Index: 394 entries, 0 to 504
Data columns (total 14 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 CRIM 394 non-null float64
1 ZN 394 non-null float64
2 INDUS 394 non-null float64
3 CHAS 394 non-null float64
4 NOX 394 non-null float64
5 RM 394 non-null float64
6 AGE 394 non-null float64
7 DIS 394 non-null float64
8 RAD 394 non-null int64
9 TAX 394 non-null int64
10 PTRATIO 394 non-null float64
11 B 394 non-null float64
12 LSTAT 394 non-null float64
13 Price 394 non-null float64
dtypes: float64(12), int64(2)
memory usage: 46.2 KB
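The two `df.info()` listings show six columns with 486 non-null values out of 506. Because `dropna()` keeps only rows that are complete in every column, 394 rows remain. The sketch below uses a small made-up frame (the values are illustrative, not the housing data) to count missing values per column before dropping them.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the housing data; the values are made up.
toy = pd.DataFrame({
    "CRIM": [0.1, np.nan, 0.3, 0.4],
    "AGE": [65.0, 78.0, np.nan, 45.0],
    "Price": [24.0, 21.6, 34.7, 33.4],
})

# Missing values per column (the complement of "Non-Null Count" in df.info()).
missing_per_column = toy.isna().sum()
print(missing_per_column)

# dropna() removes every row with at least one missing value, so the number
# of rows lost can exceed the missing count of any single column.
complete = toy.dropna()
print(len(toy), "->", len(complete))
```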
[6]: df
[7]: df.drop('CHAS',axis=1,inplace=True)
<ipython-input-7-0b8b043076ef>:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
[8]: df.drop('ZN',axis=1,inplace=True)
<ipython-input-8-2e9e1705e643>:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
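The `SettingWithCopyWarning` is most likely raised because `df` came from `dropna()` and was then modified with `inplace=True`. One possible way to avoid it is to take an explicit copy after filtering and reassign the result of `drop`. The sketch below uses a made-up frame, and the names `raw` and `clean` are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the housing frame.
raw = pd.DataFrame({
    "CHAS": [0.0, 1.0, np.nan],
    "ZN": [18.0, 0.0, 0.0],
    "Price": [24.0, 21.6, 34.7],
})

# .copy() makes the filtered frame independent of `raw`.
clean = raw.dropna().copy()

# Reassign the returned frame instead of using inplace=True;
# both columns can be dropped in a single call.
clean = clean.drop(columns=["CHAS", "ZN"])
print(clean.columns.tolist())
```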
[9]: df
LSTAT Price
0 4.98 24.0
1 9.14 21.6
2 4.03 34.7
3 2.94 33.4
5 5.21 28.7
.. … …
499 15.10 17.5
500 14.33 16.8
502 9.08 20.6
503 5.64 23.9
504 6.48 22.0
[10]: df.plot.scatter(x='CRIM',y='Price')
[11]: df.plot.scatter(x='TAX',y='Price')
[12]: df.plot.scatter(x='B',y='Price')
[13]: df.plot.scatter(x='LSTAT',y='B')
[14]: import numpy as np
[15]:
[16]: LinearRegression()
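The cells that built `x_train`, `x_test`, `y_train`, `y_test` and `model` did not survive the export, so the following is only a sketch of what such a step might look like. It rests on three assumptions: every column except `Price` is a feature, the split is 80/20 (which would match the 315 training rows reported later for 394 cleaned rows), and `random_state=42` is an arbitrary choice. The data is synthetic.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in with the same row count as the cleaned frame (394).
rng = np.random.default_rng(0)
df_demo = pd.DataFrame({
    "RM": rng.normal(6.0, 0.7, 394),
    "LSTAT": rng.uniform(2.0, 35.0, 394),
})
df_demo["Price"] = (
    5.0 * df_demo["RM"] - 0.6 * df_demo["LSTAT"] + rng.normal(0.0, 2.0, 394)
)

# Assumption: every column except the target is used as a feature.
x = df_demo.drop(columns="Price")
y = df_demo["Price"]

# Assumption: 80/20 split; random_state is arbitrary.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=42
)

model = LinearRegression()
model.fit(x_train, y_train)
print(x_train.shape, y_train.shape)
```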
[17]: y_pred=model.predict(x_test)
[18]: model.score(x_test,y_test)
[18]: 0.6975020387554531
[19]: model.score(x_train,y_train)
[19]: 0.7725543921852038
MAE: 3.81048024985643
MSE: 37.14034003813928
RMSE: 6.094287492245445
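The cell that printed these three metrics is missing from the export. As a sketch of how they relate, the example below computes them with NumPy on made-up numbers. `sklearn.metrics.mean_absolute_error` and `mean_squared_error` would give the same MAE and MSE. RMSE is the square root of MSE, which is consistent with the values above (the square root of 37.14 is about 6.09).

```python
import numpy as np

# Made-up targets and predictions, for illustration only.
y_true = np.array([24.0, 21.6, 34.7, 33.4])
y_hat = np.array([25.0, 20.6, 30.7, 35.4])

errors = y_true - y_hat
mae = np.mean(np.abs(errors))  # mean absolute error
mse = np.mean(errors ** 2)     # mean squared error
rmse = np.sqrt(mse)            # root mean squared error, same units as the target

print("MAE:", mae)
print("MSE:", mse)
print("RMSE:", rmse)
```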
[21]: print("Accuracy:",model.score(x_test,y_test)*100,"%")
Accuracy: 69.75020387554531 %
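For a scikit-learn regressor, `score` returns the coefficient of determination R², not a classification accuracy. The printed figure is therefore better read as "about 70% of the variance in the test prices is explained". The sketch below computes R² by hand on made-up numbers to show what the value means.

```python
import numpy as np

# Made-up targets and predictions, for illustration only.
y_true = np.array([24.0, 21.6, 34.7, 33.4])
y_hat = np.array([25.0, 20.6, 30.7, 35.4])

# Residual sum of squares: error left after using the model's predictions.
ss_res = np.sum((y_true - y_hat) ** 2)
# Total sum of squares: error of always predicting the mean of y_true.
ss_tot = np.sum((y_true - y_true.mean()) ** 2)

r2 = 1.0 - ss_res / ss_tot
print(f"R^2: {r2:.4f}")
```

A value of 1 means perfect predictions, 0 means no better than predicting the mean, and negative values are possible for a model that does worse than the mean.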
[27]: x_train.shape
y_train.shape
[27]: (315,)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-23-07e70f9b4ff2> in <cell line: 1>()
----> 1 plt.scatter(x_train,y_train, color = 'red')
2 plt.plot(x_train,model.predict(x_train), color='blue')
3 plt.x_label("crime")
4 plt.y_label("house prices")
/usr/local/lib/python3.10/dist-packages/matplotlib/pyplot.py in scatter(x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, edgecolors,
/usr/local/lib/python3.10/dist-packages/matplotlib/__init__.py in inner(ax, data, *args, **kwargs)
/usr/local/lib/python3.10/dist-packages/matplotlib/axes/_axes.py in scatter(self, x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths,
4582 y = np.ma.ravel(y)
4583 if x.size != y.size:
-> 4584 raise ValueError("x and y must be the same size")
4585
4586 if s is None:
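The error most likely arises because `x_train` holds several feature columns while `y_train` has one value per row, and `scatter` needs `x` and `y` to have the same number of elements. Separately, the pyplot labelling functions are `plt.xlabel` and `plt.ylabel`; `plt.x_label` does not exist. The sketch below plots a single column against the target on made-up data. It assumes column 0 stands in for `CRIM`, and the blue line is a one-feature `np.polyfit` fit added for illustration, not the notebook's multi-feature model.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Made-up feature matrix with several columns, and a target.
x_train = rng.uniform(0.0, 10.0, size=(315, 3))
y_train = 30.0 - 2.0 * x_train[:, 0] + rng.normal(0.0, 1.0, 315)

# Pick ONE column so that x and y have the same number of elements.
crime = x_train[:, 0]  # assumption: column 0 plays the role of CRIM

# One-feature least-squares line for the overlay.
slope, intercept = np.polyfit(crime, y_train, 1)
order = np.argsort(crime)  # sort so the line is drawn left to right

fig, ax = plt.subplots()
ax.scatter(crime, y_train, color="red")
ax.plot(crime[order], slope * crime[order] + intercept, color="blue")
ax.set_xlabel("crime")
ax.set_ylabel("house prices")
```

In Colab the `matplotlib.use("Agg")` line can be dropped, and the figure renders inline.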