Linear Regression

Uploaded by

ns4676

Unit-2

Supervised learning

● Supervised learning trains machines using labeled data, where each input has a corresponding
correct output. This labeled data acts like a teacher guiding the machine to learn the
relationship between inputs and outputs, enabling it to predict outputs for new, unseen data.

● The aim of a supervised learning algorithm is to find a mapping function that maps the input
variable (x) to the output variable (y).

● In the real world, supervised learning can be used for risk assessment, image
classification, fraud detection, spam filtering, etc.
Types of supervised Machine learning Algorithms: regression (predicting a continuous value) and
classification (predicting a discrete label).
What is Linear Regression?
● Linear Regression is a key data science tool for predicting continuous outcomes.
● Linear regression models the relationship between two variables by assuming they
have a straight-line connection.
● It finds the line that minimizes the sum of squared differences between predicted and
actual values.

Types of Linear Regression

Simple Linear Regression involves one independent variable, while Multiple Linear
Regression involves two or more independent variables.
Simple Linear Regression
● In a simple linear regression, there is one independent variable and one dependent variable.
● The model estimates the slope and intercept of the line of best fit, which represents the
relationship between the variables.
● The slope represents the change in the dependent variable for each unit change in the
independent variable, while the intercept represents the predicted value of the dependent
variable when the independent variable is zero.
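Equation of a Line: The relationship in simple linear regression (with one independent variable)
is described by the equation:
y = b0 + b1·x
where y is the dependent variable, x is the independent variable, b0 is the intercept, and b1 is
the slope.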
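A minimal sketch of evaluating such a line in Python may help make the roles of the two parameters concrete. The function name `predict` and the parameter values below are hypothetical, chosen only for illustration; they are not fitted to any data.

```python
def predict(x, intercept, slope):
    """Return the value on the line y = intercept + slope * x."""
    return intercept + slope * x


# Hypothetical parameters, chosen only for illustration.
intercept, slope = 2.0, 3.0

y_at_zero = predict(0, intercept, slope)  # prediction when x is zero
y_at_one = predict(1, intercept, slope)   # prediction one unit later
step = y_at_one - y_at_zero               # change in y per unit change in x

print("Prediction at x = 0:", y_at_zero)
print("Change per unit of x:", step)
```

The prediction at x = 0 equals the intercept, and the change between consecutive unit steps equals the slope.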
Steps to Perform Linear Regression
Dataset

Hours Studied (x)    Test Score (y)
1                    50
2                    55
3                    65
4                    70
5                    75
https://docs.google.com/spreadsheets/d/1yyMW4g9jkmzHGOMKuGnFoF7BsL1qsQc0HPR2oBoq9zY/edit?usp=sharing
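Mean and Slope
Compute the means of x and y, then the slope = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and the
intercept = ȳ − slope·x̄. For this dataset, x̄ = 3 and ȳ = 63, so the slope is 65 / 10 = 6.5 and
the intercept is 63 − 6.5 × 3 = 43.5.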
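The mean and slope step can be sketched with NumPy using the standard least-squares formulas for simple linear regression. This is an illustrative calculation on the dataset above, not necessarily the exact procedure used in the linked spreadsheet; the variable names are assumptions.

```python
import numpy as np

# Dataset from the table above.
x = np.array([1, 2, 3, 4, 5], dtype=float)       # hours studied
y = np.array([50, 55, 65, 70, 75], dtype=float)  # test scores

# Step 1: means of x and y.
x_mean = x.mean()
y_mean = y.mean()

# Step 2: slope = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean)^2).
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)

# Step 3: intercept = y_mean - slope * x_mean.
intercept = y_mean - slope * x_mean

print("Mean of x:", x_mean)
print("Mean of y:", y_mean)
print("Slope:", slope)
print("Intercept:", intercept)
```

The slope and intercept obtained this way agree with the coefficient and intercept reported by the scikit-learn model later in these notes.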
What is the Best Fit Line?
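In simple terms, the best-fit line is the line that passes as close as possible to the points of
the given scatter plot. Mathematically, you obtain the best-fit line by minimizing the Residual
Sum of Squares (RSS), the sum of the squared differences between the actual and predicted values.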
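To illustrate what minimizing the RSS means, the sketch below compares the RSS of the fitted line for the study-hours dataset (intercept 43.5, slope 6.5, as reported in the implementation section) against one alternative line. The alternative parameters are hypothetical and chosen only for comparison; the helper name `rss` is also an assumption.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)       # hours studied
y = np.array([50, 55, 65, 70, 75], dtype=float)  # test scores


def rss(intercept, slope):
    """Residual Sum of Squares of the line y = intercept + slope * x."""
    residuals = y - (intercept + slope * x)
    return float(np.sum(residuals ** 2))


rss_best = rss(43.5, 6.5)   # fitted line from these notes
rss_other = rss(45.0, 6.0)  # hypothetical alternative line

print("RSS of fitted line:", rss_best)
print("RSS of alternative line:", rss_other)
```

Any line other than the least-squares line gives a larger RSS on this data, which is what makes the fitted line the "best" fit under this criterion.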

Evaluating the Model (Cost Function for Linear Regression)

● Mean Squared Error (MSE)
● Root Mean Squared Error (RMSE)
● R-squared (R²)
Mean Squared Error (MSE)

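The MSE is the average of the squared differences between the actual values and the predicted
values: MSE = (1/n) Σ (y_i − ŷ_i)², where n is the number of observations, y_i is an actual
value, and ŷ_i is the corresponding predicted value. Lower values indicate a better fit. The
RMSE is the square root of the MSE and is expressed in the same units as y.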
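As a rough sketch, the MSE (and the RMSE derived from it) can be computed directly with NumPy. The predictions below assume the fitted line for the study-hours dataset, with intercept 43.5 and slope 6.5 as reported in the implementation section.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)       # hours studied
y = np.array([50, 55, 65, 70, 75], dtype=float)  # actual test scores

# Predictions from the fitted line (parameters taken from these notes).
y_pred = 43.5 + 6.5 * x

errors = y - y_pred          # difference between actual and predicted
mse = np.mean(errors ** 2)   # average of the squared errors
rmse = np.sqrt(mse)          # square root of the MSE, same units as y

print("Mean Squared Error:", mse)
print("Root Mean Squared Error:", rmse)
```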
R-squared (R²)
The R-squared value of approximately 0.9826 indicates that about 98.26% of the variance in
test scores is explained by the number of hours studied. This suggests a very good fit of the
model to the data.
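R-squared can be computed by hand as 1 − RSS / TSS, where TSS is the total sum of squares around the mean of y. The sketch below assumes the fitted line for the study-hours dataset (intercept 43.5, slope 6.5); the variable names are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)       # hours studied
y = np.array([50, 55, 65, 70, 75], dtype=float)  # actual test scores

y_pred = 43.5 + 6.5 * x                # predictions from the fitted line

rss = np.sum((y - y_pred) ** 2)        # variation the line does not explain
tss = np.sum((y - y.mean()) ** 2)      # total variation around the mean
r2 = 1 - rss / tss

print("RSS:", rss)
print("TSS:", tss)
print("R-squared:", r2)
```

Here RSS is 7.5 and TSS is 430, so R² = 1 − 7.5 / 430, which is about 0.9826 and matches the value returned by `r2_score` in the implementation section.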
Gradient Descent for Linear Regression

Gradient descent is an optimization algorithm used to minimize the cost function in linear
regression. It iteratively adjusts the model parameters to find the minimum value of the cost
function.

Cost Function

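The cost function (also called the loss function) for linear regression is the Mean Squared Error
(MSE):
J(b0, b1) = (1/n) Σ (y_i − (b0 + b1·x_i))²
where b0 is the intercept, b1 is the slope, and n is the number of observations.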
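A minimal sketch of batch gradient descent on the MSE cost for the study-hours dataset follows. The learning rate, iteration count, zero initialization, and function name are assumptions picked for illustration; this is not necessarily the implementation in the linked notebook.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)       # hours studied
y = np.array([50, 55, 65, 70, 75], dtype=float)  # test scores


def gradient_descent(x, y, learning_rate=0.05, n_iterations=5000):
    """Fit y = intercept + slope * x by minimizing the MSE."""
    intercept, slope = 0.0, 0.0  # assumed starting point
    n = len(x)
    for _ in range(n_iterations):
        error = (intercept + slope * x) - y         # predicted minus actual
        grad_intercept = (2.0 / n) * np.sum(error)  # dMSE/d(intercept)
        grad_slope = (2.0 / n) * np.sum(error * x)  # dMSE/d(slope)
        # Step against the gradient to reduce the cost.
        intercept -= learning_rate * grad_intercept
        slope -= learning_rate * grad_slope
    return intercept, slope


intercept, slope = gradient_descent(x, y)
final_mse = np.mean((y - (intercept + slope * x)) ** 2)

print("Intercept:", round(intercept, 4))
print("Slope:", round(slope, 4))
print("Final MSE:", round(final_mse, 4))
```

With these settings the parameters settle close to the closed-form solution (slope 6.5, intercept 43.5). A learning rate that is too large makes the updates diverge on this unscaled data, while a much smaller one needs many more iterations.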
https://colab.research.google.com/drive/1hcrChy5_KIjn2JstJ4QPLTvn51gCVbSx?usp=sharing
Implementation in Python
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt

# Your data (features reshaped to a 2D array, as scikit-learn expects)
hours_studied = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
test_scores = np.array([50, 55, 65, 70, 75])

# Create a linear regression model
model = LinearRegression()

# Fit the model to the data
model.fit(hours_studied, test_scores)

# Make predictions
predicted_scores = model.predict(hours_studied)

# Print coefficients
print("Coefficient:", model.coef_)
print("Intercept:", model.intercept_)

# Plot the data and the regression line
plt.scatter(hours_studied, test_scores)
plt.plot(hours_studied, predicted_scores, color='red')
plt.xlabel('Hours Studied')
plt.ylabel('Test Score')
plt.show()

Output:
Coefficient: [6.5]
Intercept: 43.5

from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Make predictions (continues from the fitted model above)
y_pred = model.predict(hours_studied)
y_true = test_scores  # Actual values

# Calculate evaluation metrics
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)

print("Mean Squared Error:", mse)
print("Root Mean Squared Error:", rmse)
print("Mean Absolute Error:", mae)
print("R-squared:", r2)

Output:
Mean Squared Error: 1.5
Root Mean Squared Error: 1.224744871391589
Mean Absolute Error: 1.0
R-squared: 0.9825581395348837
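Once the line is fitted, it can be used to predict the score for an input that was not in the training data. The sketch below swaps in `np.polyfit` with degree 1 for scikit-learn so that it runs with NumPy alone; the value of 6 hours is a hypothetical new input.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)       # hours studied
y = np.array([50, 55, 65, 70, 75], dtype=float)  # test scores

# A degree-1 polynomial fit returns the coefficients as [slope, intercept].
slope, intercept = np.polyfit(x, y, deg=1)

new_hours = 6.0  # hypothetical unseen input
predicted_score = intercept + slope * new_hours

print("Predicted score for", new_hours, "hours:", round(predicted_score, 2))
```

Because 6 hours lies outside the observed range of 1 to 5 hours, this is an extrapolation and should be treated with more caution than a prediction inside that range.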
