
Boston Housing Kaggle Challenge With Linear Regression

This document discusses using linear regression on the Boston Housing dataset to predict housing prices. It loads the Boston Housing dataset, splits it into training and test sets, fits a linear regression model to the training data, uses the model to make predictions on the test set, and calculates the mean squared error to evaluate the model's performance. The result is reported as 66.55% accuracy, although mean squared error is an error measure rather than an accuracy; the stated conclusion is that the model is not highly effective at predicting housing prices.

Boston Housing Kaggle Challenge with Linear Regression

Boston Housing Data: This dataset was taken from the StatLib library, which is maintained at
Carnegie Mellon University. It concerns housing prices in the city of Boston.
The dataset has 506 instances with 13 features.

# Importing libraries and dataset.


# Importing Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
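# Importing Data
# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this import requires scikit-learn < 1.2.
from sklearn.datasets import load_boston
boston = load_boston()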

boston.data.shape

boston.feature_names
# Converting data from nd-array to dataframe and adding feature names to the data
data = pd.DataFrame(boston.data)
data.columns = boston.feature_names
data.head(10)

# Adding 'Price' (target) column to the data


boston.target.shape

data['Price'] = boston.target
data.head()

data.describe()
data.info()
# Getting input and output data, then splitting the data into training and testing datasets.
# Input Data
x = boston.data
# Output Data
y = boston.target

# splitting data to training and testing dataset.


#from sklearn.cross_validation import train_test_split ( old version)

from sklearn.model_selection import train_test_split


xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2,
                                                random_state=0)
print("xtrain shape : ", xtrain.shape)
print("xtest shape : ", xtest.shape)
print("ytrain shape : ", ytrain.shape)
print("ytest shape : ", ytest.shape)
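To make the split concrete, here is a minimal sketch of a shuffled hold-out split written in plain NumPy. It is an illustration only, not scikit-learn's implementation: the helper name `holdout_split` and the synthetic arrays are assumptions, and the rows it selects will differ from those `train_test_split` picks for the same seed. The test-set size is rounded up in this sketch, so 506 rows with `test_size=0.2` give 102 test rows and 404 training rows.

```python
import math
import numpy as np

def holdout_split(x, y, test_size=0.2, seed=0):
    """Shuffle row indices and hold out a fraction of rows for testing (hypothetical helper)."""
    n_samples = x.shape[0]
    n_test = math.ceil(test_size * n_samples)  # round the test-set size up
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)         # random ordering of row indices
    test_idx, train_idx = order[:n_test], order[n_test:]
    return x[train_idx], x[test_idx], y[train_idx], y[test_idx]

# Synthetic stand-in with the same shape as the Boston data (506 rows, 13 features)
x_demo = np.arange(506 * 13, dtype=float).reshape(506, 13)
y_demo = np.arange(506, dtype=float)

xtr, xte, ytr, yte = holdout_split(x_demo, y_demo)
print(xtr.shape, xte.shape, ytr.shape, yte.shape)
```

Fixing the seed, like `random_state = 0` in the call above, makes the split reproducible from run to run.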

# Applying Linear Regression Model to the dataset and predicting the prices.
# Fitting multiple linear regression model to the training set
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(xtrain, ytrain)
# predicting the test set results
y_pred = regressor.predict(xtest)
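`LinearRegression` fits ordinary least squares: it chooses an intercept and one coefficient per feature so that the sum of squared residuals on the training data is as small as possible. The sketch below reproduces that idea with `np.linalg.lstsq` on small synthetic data. The data, the coefficient values, and the helper names `fit_ols` and `predict_ols` are assumptions for illustration and are not part of the original walkthrough.

```python
import numpy as np

def fit_ols(x, y):
    """Return (intercept, coefficients) that minimize the sum of squared residuals."""
    # Prepend a column of ones so the first solved parameter is the intercept
    design = np.column_stack([np.ones(x.shape[0]), x])
    solution, *_ = np.linalg.lstsq(design, y, rcond=None)
    return solution[0], solution[1:]

def predict_ols(x, intercept, coef):
    """Apply the fitted linear model to new rows."""
    return intercept + x @ coef

# Synthetic, noiseless data: the fit should recover the generating parameters
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(50, 3))
true_coef = np.array([2.0, -1.0, 0.5])
y_demo = 4.0 + x_demo @ true_coef

intercept, coef = fit_ols(x_demo, y_demo)
print(intercept, coef)
```

On real data such as the housing prices the targets are noisy, so the fitted line only approximates them and the residuals do not vanish.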
# Plotting scatter graph to show the prediction
# results - 'ytrue' value vs 'y_pred' value
plt.scatter(ytest, y_pred, c = 'green')
plt.xlabel("Price: in $1000's")
plt.ylabel("Predicted value")
plt.title("True value vs predicted value : Linear Regression")
plt.show()

# Results of Linear Regression, i.e. Mean Squared Error.


# Results of Linear Regression.
from sklearn.metrics import mean_squared_error
mse = mean_squared_error(ytest, y_pred)
print("Mean Squared Error : ", mse)

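This result has been read as the model being only 66.55% accurate, but mean squared error is not
an accuracy: it is the average squared prediction error, in squared units of the target, and
lower is better. The conclusion drawn is that the prepared model is not very good for
predicting the housing prices.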
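Two related quantities are easier to interpret than MSE alone: the root mean squared error (RMSE), which is expressed in the target's own units ($1000's here), and the coefficient of determination R², the fraction of the target's variance that the model explains. A minimal NumPy sketch of all three follows. The helper name `regression_report` and the four-value toy arrays are assumptions and are not results from the Boston data.

```python
import numpy as np

def regression_report(y_true, y_pred):
    """Compute MSE, RMSE and R^2 from true and predicted values (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    mse = np.mean(residuals ** 2)                   # average squared error
    rmse = np.sqrt(mse)                             # back in the target's units
    ss_res = np.sum(residuals ** 2)                 # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean
    r2 = 1.0 - ss_res / ss_tot
    return {"mse": mse, "rmse": rmse, "r2": r2}

# Toy values for illustration only
report = regression_report([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0])
print(report)
```

With the variables used above, `regression_report(ytest, y_pred)` should give the same MSE as `mean_squared_error(ytest, y_pred)`, and its R² should agree with `regressor.score(xtest, ytest)`.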

https://towardsdatascience.com/linear-regression-on-boston-housing-dataset-f409b7e4a155
