Predicting House Prices

Uploaded by

RB GEMAR
PREDICTING HOUSE PRICES

A MACHINE LEARNING MICRO-PROJECT REPORT

Submitted by
SHARMA KUSH PARESHBHAI
ENROLLMENT NO. : 226230316198
&
RANGANI AYUSH NAVEENBHAI
ENROLLMENT NO. : 226230316182

In partial fulfillment for the award of the diploma of

DIPLOMA IN ENGINEERING
In
INFORMATION TECHNOLOGY

GOVERNMENT POLYTECHNIC, GANDHINAGAR

2 | Page
INDEX

Sr. no.  Title

1.       Introduction to Predicting House Prices Model
2.       Code of Predicting House Prices Model
3.       Explanation of Code
4.       Output

1. Introduction to Predicting House Prices Model

Predicting house prices is a common application of machine learning and data
analysis, often used by real estate professionals, investors, and analysts to estimate the
value of properties. This involves building a model that can accurately predict the price
of a house based on various features such as square footage, number of bedrooms,
number of bathrooms, location, and other relevant attributes.

Objectives:

1. Load and Explore the Dataset:
Understand the structure and contents of the dataset, including the features and target
variable.

2. Preprocess the Data:
Handle missing values, encode categorical variables, and scale numerical features to
ensure compatibility with machine learning algorithms.

3. Build and Train the Model:
Use a linear regression model to learn from the training data and fit the model.

4. Evaluate the Model:
Assess the performance of the model using metrics such as Mean Squared Error (MSE).

5. Predict New House Prices:
Use the trained model to predict the price of new houses based on their features.
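
The preprocessing objective (missing values, categorical encoding, scaling) can be sketched with a scikit-learn pipeline. This is only an illustration and is not the code used in this report: the toy data and the column names `area`, `bedrooms`, and `mainroad` are assumptions, and the real Housing.csv may look different.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data with one missing value; not the real Housing.csv.
houses = pd.DataFrame({
    "area": [1000.0, 1500.0, None, 2000.0, 2500.0, 3000.0],
    "bedrooms": [2, 3, 3, 4, 4, 5],
    "mainroad": ["yes", "no", "yes", "yes", "no", "yes"],
    "price": [100.0, 140.0, 150.0, 200.0, 230.0, 300.0],
})
X = houses.drop("price", axis=1)
y = houses["price"]

numeric_cols = ["area", "bedrooms"]
categorical_cols = ["mainroad"]

# Numeric columns: fill missing values with the median, then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: one-hot encode, ignoring categories unseen during fit.
preprocess = ColumnTransformer([
    ("num", numeric_steps, numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

# Preprocessing and the regression model are fitted together.
pipeline = Pipeline([
    ("preprocess", preprocess),
    ("model", LinearRegression()),
])
pipeline.fit(X, y)
fitted_prices = pipeline.predict(X)
print(fitted_prices)
```

Keeping the preprocessing inside the pipeline means the same transformations are applied automatically whenever the model predicts on new rows.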

2. Code of Predicting House Prices Model

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Load the dataset
data = pd.read_csv('Housing.csv')

# Display the first few rows of the dataset
print(data.head())

# Separate features (X) and target (y)
X = data.drop('price', axis=1)  # Features
y = data['price']               # Target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a linear regression model
model = LinearRegression()

# Handle categorical variables with one-hot encoding
X_train = pd.get_dummies(X_train)

# Train the model on the training data
model.fit(X_train, y_train)

# Encode the testing data the same way, then align its columns with the
# training columns (a category missing from the test split becomes a 0 column)
X_test = pd.get_dummies(X_test)
X_test = X_test.reindex(columns=X_train.columns, fill_value=0)

# Make predictions on the testing data
predictions = model.predict(X_test)

# Evaluate the model using mean squared error
mse = mean_squared_error(y_test, predictions)
print("Mean Squared Error:", mse)

# Example of predicting the price of a new house
# Replace the values with the features of the new house.
# The row needs one value per column of X_train, in the same order
# (check with print(X_train.columns)); one-hot columns take 1 or 0.
new_house_features = pd.DataFrame(
    [[2000, 3, 2, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]],
    columns=X_train.columns,
)

predicted_price = model.predict(new_house_features)
print("Predicted price of the new house:", predicted_price[0])

3. Explanation of Code

This Python code demonstrates the process of building and evaluating a linear
regression model to predict house prices. It covers essential steps such as data
loading, preprocessing, model training, evaluation, and making predictions for
new data points. Below is a theoretical explanation of each step involved in the
code.

Importing Libraries

The code begins by importing the necessary libraries:

- `pandas`: For data manipulation and analysis.
- `train_test_split` from `sklearn.model_selection`: For splitting the dataset into
training and testing sets.
- `LinearRegression` from `sklearn.linear_model`: For creating the linear
regression model.
- `mean_squared_error` from `sklearn.metrics`: For evaluating the model's
performance.

Loading the Dataset

The dataset, stored in a CSV file named 'Housing.csv', is loaded into a pandas
DataFrame. This step involves reading the file and converting its contents into a
structured format that can be easily manipulated and analyzed.

Displaying the Dataset

The first few rows of the dataset are printed to the console. This initial
examination helps understand the structure of the data, including the columns
(features) and their types.
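
Beyond `head()`, a few more pandas calls help with this first look at the data. The sketch below uses a small hypothetical DataFrame in place of Housing.csv, so the column names and values are assumptions made only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the contents of Housing.csv.
data = pd.DataFrame({
    "price": [100.0, 140.0, 150.0, 200.0],
    "area": [1000.0, 1500.0, None, 2000.0],
    "bedrooms": [2, 3, 3, 4],
    "mainroad": ["yes", "no", "yes", "yes"],
})

print(data.head())    # first rows
print(data.shape)     # (number of rows, number of columns)
print(data.dtypes)    # data type of each column

# Count missing values per column.
missing_per_column = data.isna().sum()
print(missing_per_column)

# Non-numeric columns will need encoding before linear regression.
categorical_cols = data.select_dtypes(exclude="number").columns.tolist()
print(categorical_cols)
```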

Separating Features and Target

The dataset is divided into features (`X`) and the target variable (`y`).
- `X` contains all the columns except for the target column 'price'.
- `y` contains the target variable, which is the 'price' column.

This separation is crucial for training the model, as it allows the model to learn
the relationship between the features and the target.

Splitting the Data

The data is split into training and testing sets using the `train_test_split` function.
- The training set (80% of the data) is used to train the model.
- The testing set (20% of the data) is used to evaluate the model's performance.

This split ensures that the model is tested on unseen data, providing a measure
of its generalization ability.
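
As a small illustration of the 80/20 split, the sketch below applies `train_test_split` to ten hypothetical rows (the values are made up). Fixing `random_state` makes the shuffle reproducible, so the same rows land in each set on every run.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Ten hypothetical rows; values are made up for illustration.
data = pd.DataFrame({
    "area": [1000, 1200, 1400, 1600, 1800, 2000, 2200, 2400, 2600, 2800],
    "price": [100, 115, 135, 150, 170, 190, 205, 225, 240, 260],
})
X = data.drop("price", axis=1)
y = data["price"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(len(X_train), len(X_test))  # 8 2

# The same random_state gives the same split again.
X_train_again, X_test_again, _, _ = train_test_split(
    X, y, test_size=0.2, random_state=42
)
same_split = list(X_test.index) == list(X_test_again.index)
print(same_split)
```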

Creating the Linear Regression Model

A linear regression model is instantiated using `LinearRegression()`. This model
assumes a linear relationship between the input features and the target variable.
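
In other words, the predicted price is an intercept plus a weighted sum of the features: y_hat = b0 + b1*x1 + ... + bn*xn. The sketch below builds a toy target from a known rule (an assumption made purely for illustration) and shows that `LinearRegression` recovers the weights in `coef_` and the constant in `intercept_`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy features: five rows, two columns.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])

# Target built from a known linear rule: y = 3*x1 + 2*x2 + 5
y = 3 * X[:, 0] + 2 * X[:, 1] + 5

model = LinearRegression()
model.fit(X, y)

print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)

# A prediction is the intercept plus the weighted sum of the features.
manual_prediction = model.intercept_ + X[0] @ model.coef_
library_prediction = model.predict(X[:1])[0]
print(manual_prediction, library_prediction)
```

Real housing data will not follow an exact linear rule, so the fitted coefficients there are only the best linear approximation in the least-squares sense.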

Training the Model

Before training the model, categorical variables in the training data are converted
into numeric values using one-hot encoding (`pd.get_dummies`). This step
transforms categorical features into a format suitable for the linear regression
model. The model is then trained on the training data (`X_train` and `y_train`),
learning the relationships between the features and the target variable.
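
To make the effect of one-hot encoding concrete, the sketch below applies `pd.get_dummies` to a tiny hypothetical frame. The column names `area` and `mainroad` are assumptions for illustration. Numeric columns pass through unchanged, and each category of a text column becomes its own indicator column.

```python
import pandas as pd

# Hypothetical training features with one categorical column.
X_train = pd.DataFrame({
    "area": [1000, 1500, 2000],
    "mainroad": ["yes", "no", "yes"],
})

X_train_encoded = pd.get_dummies(X_train)
print(list(X_train_encoded.columns))
# ['area', 'mainroad_no', 'mainroad_yes']
print(X_train_encoded)
```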

Making Predictions

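Similar to the training data, categorical variables in the testing data are also
converted into numeric values using one-hot encoding. Because the two sets are
encoded separately, the encoded test columns must match the encoded training
columns in name and order; a category that appears in only one of the sets would
otherwise cause a mismatch at prediction time. The trained model is then used
to make predictions on the testing set (`X_test`). This step involves applying the
learned relationships to estimate the target variable (price) for the test set.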
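
One way to keep separately encoded sets consistent is to reindex the test columns against the training columns. The sketch below uses hypothetical data in which the test set lacks two of the categories seen in training; the column name `furnishing` is an assumption for illustration.

```python
import pandas as pd

# Hypothetical data: the test set contains only one furnishing category.
train = pd.DataFrame({
    "area": [1000, 1500, 2000],
    "furnishing": ["furnished", "unfurnished", "semi"],
})
test = pd.DataFrame({
    "area": [1200],
    "furnishing": ["furnished"],
})

train_encoded = pd.get_dummies(train)
test_encoded = pd.get_dummies(test)
print(list(test_encoded.columns))  # fewer columns than the training set

# Add the missing columns (filled with 0) and match the training order.
test_aligned = test_encoded.reindex(columns=train_encoded.columns, fill_value=0)
print(list(test_aligned.columns))
```

An alternative with the same goal is to encode the full feature table once, before splitting it into training and testing sets.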

Evaluating the Model

The model's performance is evaluated using the Mean Squared Error (MSE),
which measures the average squared difference between the actual and
predicted prices. A lower MSE indicates a better fit. Because the value is in
squared price units, it is mainly useful for comparing models on the same data.
The MSE is printed as a quantitative measure of the model's error on the test set.
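
Written out, MSE = (1/n) * sum((y_i - y_hat_i)^2). The sketch below computes it by hand and with `mean_squared_error` on three made-up prices, and also takes the square root (RMSE), which is in the same units as the price and is often easier to read.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Made-up actual and predicted prices, for illustration only.
y_true = np.array([100.0, 200.0, 300.0])
y_pred = np.array([110.0, 190.0, 330.0])

# Errors are -10, 10, -30; squared: 100, 100, 900; mean: 1100 / 3.
mse_manual = np.mean((y_true - y_pred) ** 2)
mse_sklearn = mean_squared_error(y_true, y_pred)

# Root mean squared error, in the same units as the prices.
rmse = np.sqrt(mse_sklearn)

print("MSE (manual): ", mse_manual)
print("MSE (sklearn):", mse_sklearn)
print("RMSE:", rmse)
```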

Predicting New House Prices

The code also demonstrates predicting the price of a new house based on its
features. An example row of features is created, following the same column order
and preprocessing steps as the training data (including one-hot encoding where
a feature is categorical). The trained model is used to predict the price of the
new house, and the predicted price is printed. This step shows how the model can
be used in practical applications to estimate the price of new properties based
on their attributes.
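
Typing a long row of numbers by hand is error-prone, so one option is to describe the new house with named columns and let pandas do the encoding and alignment. The sketch below is a minimal illustration on hypothetical data; the column names and prices are assumptions, not values from Housing.csv.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical training data, for illustration only.
train = pd.DataFrame({
    "area": [1000, 1500, 2000, 2500],
    "bedrooms": [2, 3, 3, 4],
    "mainroad": ["no", "yes", "yes", "no"],
    "price": [100.0, 160.0, 200.0, 240.0],
})
X_train = pd.get_dummies(train.drop("price", axis=1), dtype=int)
y_train = train["price"]

model = LinearRegression()
model.fit(X_train, y_train)

# Describe the new house with the original (unencoded) column names.
new_house = pd.DataFrame([{"area": 1800, "bedrooms": 3, "mainroad": "yes"}])

# Encode it, then align to the training columns (missing dummies become 0).
new_house_encoded = pd.get_dummies(new_house, dtype=int).reindex(
    columns=X_train.columns, fill_value=0
)

predicted_price = model.predict(new_house_encoded)[0]
print("Predicted price of the new house:", predicted_price)
```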

By following these steps, the code builds a linear regression model capable of
predicting house prices, evaluates its performance, and demonstrates how to use
it for new predictions. Each step affects how accurate and reliable the model's
predictions are on new data.

4. Output
