
Lab6 Housing Price Regression

The document outlines a lab exercise on regression analysis using a housing prices dataset, which includes house size, number of bedrooms, and price. It details the creation of a sample dataset, the application of multiple linear regression using the statsmodels library, and the interpretation of results including coefficients and R-squared values. Additionally, it provides instructions for visualizing the data and regression lines using matplotlib.


ADY201 Hieu Nguyen

Lab 6: Regression
Subject: ADY201 Author: Hieu Nguyen

Problem Statement

Suppose you are working with a housing prices dataset. The dataset includes the following
information:

• size (size of the house in square meters)
• bedrooms (number of bedrooms)
• price (house price in thousand dollars)

hoursing_price.csv

Part 1: Creating the Dataset

First, let's create a sample dataset.

import numpy as np
import pandas as pd

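# Create a sample dataset (fixed seed so the results are reproducible)
np.random.seed(0)
size = np.random.randint(50, 150, 100)   # 100 house sizes, 50 to 149 square meters
bedrooms = np.random.randint(1, 5, 100)  # 100 bedroom counts, 1 to 4
# True relationship: intercept 50, 0.5 per square meter, 10 per bedroom,
# plus Gaussian noise with standard deviation 10
price = 50 + 0.5 * size + 10 * bedrooms + np.random.randn(100) * 10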

# Create a DataFrame
df = pd.DataFrame({'size': size, 'bedrooms': bedrooms, 'price': price})
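
Before fitting a model, it can help to take a quick look at what was generated. The following is a minimal sketch that rebuilds the same dataset and prints a few standard pandas summaries; the choice of checks is an assumption, not part of the original lab. Note that the upper bound of np.random.randint is exclusive, so sizes fall in 50 to 149 and bedrooms in 1 to 4.

```python
import numpy as np
import pandas as pd

# Rebuild the sample dataset exactly as in Part 1
np.random.seed(0)
size = np.random.randint(50, 150, 100)
bedrooms = np.random.randint(1, 5, 100)
price = 50 + 0.5 * size + 10 * bedrooms + np.random.randn(100) * 10
df = pd.DataFrame({'size': size, 'bedrooms': bedrooms, 'price': price})

# Basic inspection: dimensions, first rows, summary statistics
print(df.shape)
print(df.head())
print(df.describe())

# Pairwise correlations give a first hint of how each variable relates to price
print(df.corr())
```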

Part 2: Performing Multiple Linear Regression

Use the statsmodels library to perform multiple linear regression.

import statsmodels.api as sm

# Define the dependent variable (price) and independent variables (size and bedrooms)
X = df[['size', 'bedrooms']]
X = sm.add_constant(X) # Add a constant term to the predictor
y = df['price']

# Fit the regression model
model = sm.OLS(y, X).fit()

# Print the summary of the regression
print(model.summary())
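
As a cross-check on what ordinary least squares computes, the same kind of fit can be sketched with NumPy alone. This is an illustration under stated assumptions, not a description of how statsmodels works internally: it uses np.linalg.lstsq on a design matrix whose first column is all ones, which plays the role of the constant added by sm.add_constant.

```python
import numpy as np

# Rebuild the sample data exactly as in Part 1
np.random.seed(0)
size = np.random.randint(50, 150, 100)
bedrooms = np.random.randint(1, 5, 100)
price = 50 + 0.5 * size + 10 * bedrooms + np.random.randn(100) * 10

# Design matrix: a column of ones (intercept), then size and bedrooms
A = np.column_stack([np.ones(len(size)), size, bedrooms])

# Least-squares solution minimizing ||A @ coef - price||^2
coef, residual_ss, rank, _ = np.linalg.lstsq(A, price, rcond=None)
intercept, b_size, b_bedrooms = coef

print("intercept:", intercept)
print("size coefficient:", b_size)
print("bedrooms coefficient:", b_bedrooms)
```

Because the data were generated with an intercept of 50, a size coefficient of 0.5, and a bedrooms coefficient of 10, the estimates should land near those values, with some deviation due to the added noise.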


Part 3: Interpreting the Results

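• Constant (Intercept): The predicted price of a house when both the size and the number of
bedrooms are zero. No real house has zero size, so read it as the baseline of the fitted
model rather than as a meaningful price.
• Coefficients (size, bedrooms): The expected change in price (in thousand dollars) for a
one-unit increase in size (one square meter) or in the number of bedrooms (one bedroom),
holding the other variable constant.
• R-squared: The proportion of the variance in price that is explained by size and bedrooms.
It ranges from 0 to 1, and values closer to 1 indicate a better fit.
• P-values: For each coefficient, the probability of seeing an estimate at least this extreme
if the true coefficient were zero. A small p-value (commonly below 0.05) indicates that the
coefficient is statistically significant.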

Part 4: Visualizing the Data and the Regression Line

Use the matplotlib library to visualize the relationship between the variables and the regression
line.

import matplotlib.pyplot as plt

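# Plot the data and the regression line for 'size'.
# The model has two predictors, so the line is drawn with 'bedrooms' held at its mean.
plt.figure(figsize=(12, 6))

plt.subplot(1, 2, 1)
plt.scatter(df['size'], df['price'], label='Data')
size_grid = np.linspace(df['size'].min(), df['size'].max(), 100)
X_size = pd.DataFrame({'const': 1.0, 'size': size_grid,
                       'bedrooms': df['bedrooms'].mean()})
plt.plot(size_grid, model.predict(X_size), color='red', label='Regression Line')
plt.xlabel('Size (square meters)')
plt.ylabel('Price (thousand dollars)')
plt.title('Price vs Size')
plt.legend()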

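# Plot the data and the regression line for 'bedrooms'.
# Here 'size' is held at its mean so the line shows the effect of bedrooms alone.
plt.subplot(1, 2, 2)
plt.scatter(df['bedrooms'], df['price'], label='Data')
bedrooms_grid = np.linspace(df['bedrooms'].min(), df['bedrooms'].max(), 100)
X_bedrooms = pd.DataFrame({'const': 1.0, 'size': df['size'].mean(),
                           'bedrooms': bedrooms_grid})
plt.plot(bedrooms_grid, model.predict(X_bedrooms), color='red',
         label='Regression Line')
plt.xlabel('Bedrooms')
plt.ylabel('Price (thousand dollars)')
plt.title('Price vs Bedrooms')
plt.legend()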

plt.tight_layout()
plt.show()
