Assignment7
Prerequisites
statsmodels: used to explore data, estimate statistical models, and perform statistical tests.
After importing the libraries, load the data into the notebook using the pandas function
read_csv() (for CSV files) or read_excel() (for Excel files).
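The loading step could look like the minimal sketch below. The assignment does not name the data file, so the sketch reads a few made-up rows from an in-memory string; the column names SAT and GPA are taken from the later sections, and the values are illustrative only.

```python
import io

import pandas as pd

# Made-up sample rows standing in for the real data file (assumption).
csv_text = """SAT,GPA
1714,2.40
1664,2.52
1760,2.54
1685,2.74
1693,2.83
"""

# read_csv accepts a file path or any file-like object.
data = pd.read_csv(io.StringIO(csv_text))

# Show the first rows to confirm the data loaded as expected.
print(data.head())
```

With a real file, the in-memory string would be replaced by the path to the CSV file.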
3. Descriptive Statistics
It is good practice to review the descriptive statistics first, as they help us understand the
dataset (e.g., whether any outliers are present).
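One way to get these statistics is the pandas describe() method, sketched below on a small made-up table (the values are illustrative, not the assignment's data).

```python
import pandas as pd

# Illustrative values only; the real dataset would be loaded from file.
data = pd.DataFrame(
    {
        "SAT": [1714, 1664, 1760, 1685, 1693],
        "GPA": [2.40, 2.52, 2.54, 2.74, 2.83],
    }
)

# describe() reports count, mean, std, min, quartiles and max per column.
summary = data.describe()
print(summary)
```

A minimum or maximum far from the quartiles is a quick hint that outliers may be present.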
4. Create Your First Linear Regression
To create a linear regression, you’ll have to define the dependent (targets) and the independent
variable(s) (inputs/features).
We have to predict GPA based on SAT scores, so our dependent variable would be GPA and the
independent variable would be SAT.
When plotting the data with pyplot, the first argument is the data for the x-axis and the
second argument is the data for the y-axis.
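A scatter plot of the two variables might be drawn as in the sketch below. The variable names y and x1 and the sample values are assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative values only; the real dataset would be loaded from file.
data = pd.DataFrame(
    {
        "SAT": [1714, 1664, 1760, 1685, 1693],
        "GPA": [2.40, 2.52, 2.54, 2.74, 2.83],
    }
)

y = data["GPA"]   # dependent variable (target)
x1 = data["SAT"]  # independent variable (input/feature)

fig, ax = plt.subplots()
ax.scatter(x1, y)  # first argument: x-axis data, second: y-axis data
ax.set_xlabel("SAT")
ax.set_ylabel("GPA")
```

In a notebook the figure is displayed automatically; in a script, call plt.show() afterwards.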
Linear Regression
statsmodels does not add an intercept automatically, so we must add the bias term, or intercept
(b0), ourselves. With statsmodels imported as sm (import statsmodels.api as sm), this is done with:
sm.add_constant(independent_variable)
This adds a bias column consisting only of 1s, equal in length to the independent variable.
Next, fit the linear regression using the Ordinary Least Squares (OLS) model, passing the
dependent variable as the first argument and the independent variable (with the constant added)
as the second.
To plot the regression line on the graph, define the linear regression equation:
y_hat = b0 + (b1*x1)
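The equation can be drawn over the scatter plot as sketched below. Here np.polyfit stands in for the fitted model as the source of b0 and b1 (it computes the same least-squares line), and the data values are made up.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative values only (assumption, not the assignment's data).
x1 = np.array([1714.0, 1664.0, 1760.0, 1685.0, 1693.0])  # SAT
y = np.array([2.40, 2.52, 2.54, 2.74, 2.83])             # GPA

# Least-squares slope and intercept; polyfit returns the slope first.
b1, b0 = np.polyfit(x1, y, 1)

# The regression equation from the text.
y_hat = b0 + (b1 * x1)

fig, ax = plt.subplots()
ax.scatter(x1, y)                # observed data
ax.plot(x1, y_hat, color="red")  # fitted regression line
ax.set_xlabel("SAT")
ax.set_ylabel("GPA")
```

When the coefficients come from a fitted statsmodels model instead, b0 and b1 can be read from its params attribute.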
Program:
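import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
# Load the data ("data.csv" is a placeholder file name, not given in the assignment)
data = pd.read_csv("data.csv")
X, y = data[["SAT"]], data["GPA"]
# Hold out 20% of the rows for testing (the split ratio is an assumption)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
# Fit the model on the training set
model = LinearRegression().fit(X_train, y_train)
# Make predictions
y_pred = model.predict(X_test)
print("MSE:", mean_squared_error(y_test, y_pred), "R2:", r2_score(y_test, y_pred))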
Output:
Conclusion:
In this exercise, we implemented a Linear Regression model using Python and the
scikit-learn library.
Prepared By H.O.D