
Module 3: Regression Algorithms

1. Linear Regression

Definition:

Linear Regression is a supervised learning algorithm that models the relationship between a dependent variable (target) and one or more independent variables (features) using a straight line.

Key Points:

• Formula: y = b0 + b1x
o y: Dependent variable
o x: Independent variable
o b0: Intercept
o b1: Coefficient (slope)
• Objective: Minimize the residual sum of squares to find the best-fit line (see the sketch after this list).
• Applications:
o Predicting house prices
o Forecasting sales
o Stock market trends
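
To make the objective concrete, here is a minimal sketch (an added illustration, not part of the original notes) that computes b0 and b1 directly from the least-squares normal equations, using a small made-up hours/scores dataset:

import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)        # hours studied (made-up data)
y = np.array([20, 40, 60, 80, 100], dtype=float)  # exam scores (made-up data)

# Slope and intercept that minimize the residual sum of squares
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

print("Intercept b0:", b0)  # 0.0 for this data
print("Slope b1:", b1)      # 20.0 for this data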

Steps to Implement:

1. Load and preprocess the data.
2. Fit a linear model to the data.
3. Predict target values using the model.

Code Implementation:
# Import libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Load dataset
# Example: Using a dummy dataset
data = {'Hours': [1, 2, 3, 4, 5], 'Scores': [20, 40, 60, 80, 100]}
df = pd.DataFrame(data)

# Split dataset
X = df[['Hours']] # Independent variable
y = df['Scores'] # Dependent variable
# With only 5 samples, keep at least 2 in the test set so R-squared is defined
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)

# Train model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict
y_pred = model.predict(X_test)

# Evaluate
print("Mean Squared Error:", mean_squared_error(y_test, y_pred))
print("R-squared:", r2_score(y_test, y_pred))

2. Logistic Regression

Definition:

Logistic Regression is a classification algorithm used to predict the probability of a target variable belonging to a certain class.

Key Points:

• Formula: p = 1 / (1 + e^-(b0 + b1x))
o The output is a probability, mapped to [0, 1] by the sigmoid function (see the sketch after this list).
• Objective: Find the best-fit logistic function to classify data points.
• Applications:
o Spam email detection
o Loan default prediction
o Disease diagnosis
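
The sigmoid itself is simple to sketch; the following snippet (an illustrative addition, not from the notes) shows how it squashes any real-valued score z = b0 + b1x into a probability between 0 and 1:

import numpy as np

def sigmoid(z):
    # Maps any real number into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(-2.0), sigmoid(0.0), sigmoid(2.0))  # ~0.12, 0.5, ~0.88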

Steps to Implement:

1. Load and preprocess the data.
2. Fit a logistic regression model to the data.
3. Predict probabilities and classify based on a threshold (see the snippet after the code below).

Code Implementation:
# Import libraries
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix

# Load dataset
iris = load_iris()
X = iris.data[:, :2] # Taking only the first two features for simplicity
y = (iris.target == 0).astype(int)  # Binary classification: Is the species Setosa?

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
log_model = LogisticRegression()
log_model.fit(X_train, y_train)

# Predict
y_pred = log_model.predict(X_test)

# Evaluate
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))

3. Evaluation Metrics of Regression Algorithms

Definition:

Evaluation metrics are used to measure the performance of regression models by comparing predicted values with actual values.

Key Metrics:

1. Mean Absolute Error (MAE):
o Formula: MAE = (1/n) Σ |y_i − ŷ_i|
o Interpretation: Measures the average absolute error.
2. Mean Squared Error (MSE):
o Formula: MSE = (1/n) Σ (y_i − ŷ_i)²
o Interpretation: Penalizes larger errors more than smaller ones.
3. Root Mean Squared Error (RMSE):
o Formula: RMSE = √MSE
o Interpretation: Gives the error in the same unit as the target variable.
4. R-squared (R²):
o Formula: R² = 1 − Σ (y_i − ŷ_i)² / Σ (y_i − ȳ)²
o Interpretation: Measures the proportion of variance explained by the model.

Code Implementation:
# Example: Evaluating a regression model
import numpy as np  # needed below for np.sqrt
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Sample actual and predicted values
y_actual = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

# Calculate metrics
mae = mean_absolute_error(y_actual, y_pred)
mse = mean_squared_error(y_actual, y_pred)
rmse = np.sqrt(mse)

# Print results
print("Mean Absolute Error:", mae)
print("Mean Squared Error:", mse)
print("Root Mean Squared Error:", rmse)
Summary Table

Algorithm           | Key Use Case                 | Output                     | Evaluation Metrics
Linear Regression   | Predict continuous values    | Best-fit line              | MSE, RMSE, R², Adjusted R²
Logistic Regression | Classify binary/multi-class  | Probability/Classification | Accuracy, Confusion Matrix
