Module 3
Algorithms
1. Linear Regression
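Definition: Linear regression models the relationship between a dependent variable and one or more independent variables by fitting a straight line to the observed data.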
Key Points:
• Formula: y = b0 + b1x
o y: Dependent variable
o x: Independent variable
o b0: Intercept
o b1: Coefficient (slope)
• Objective: Minimize the residual sum of squares to find the best-fit line.
• Applications:
o Predicting house prices
o Forecasting sales
o Stock market trends
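The objective above can be made concrete with a small sketch. For a single independent variable, the values of b0 and b1 that minimize the residual sum of squares have a closed form: b1 is the covariance of x and y divided by the variance of x, and b0 places the line through the point of means. The function names and the sample values below are illustrative assumptions, not part of the original material.

```python
import numpy as np


def fit_simple_linear_regression(x, y):
    """Return (b0, b1) that minimize the residual sum of squares for y = b0 + b1*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: co-variation of x and y divided by the variation of x
    b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the best-fit line passes through (x_mean, y_mean)
    b0 = y_mean - b1 * x_mean
    return b0, b1


def residual_sum_of_squares(x, y, b0, b1):
    """Sum of squared differences between observed y and the line's predictions."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum((y - (b0 + b1 * x)) ** 2))


# Illustrative data (assumed): hours studied vs. exam score, with some noise
hours = [1, 2, 3, 4, 5]
scores = [22, 38, 61, 79, 100]

b0, b1 = fit_simple_linear_regression(hours, scores)
rss_best = residual_sum_of_squares(hours, scores, b0, b1)
# Any other line, e.g. one with a slightly different slope, has a larger RSS
rss_other = residual_sum_of_squares(hours, scores, b0, b1 + 1.0)

print("Intercept b0:", b0)
print("Slope b1:", b1)
print("RSS of fitted line:", rss_best)
print("RSS of perturbed line:", rss_other)
```

Comparing the two RSS values shows why the fitted line is called the best fit: nudging the slope away from the least-squares value increases the total squared error.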
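Steps to Implement: Load the data, split it into training and test sets, fit a LinearRegression model, predict on the test set, and evaluate with mean squared error and R-squared.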
Code Implementation:
# Import libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
# Load dataset
# Example: Using a dummy dataset
data = {'Hours': [1, 2, 3, 4, 5], 'Scores': [20, 40, 60, 80, 100]}
df = pd.DataFrame(data)
# Split dataset
X = df[['Hours']] # Independent variable
y = df['Scores'] # Dependent variable
# Hold out 2 of the 5 rows: R-squared is undefined for a single test sample
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=42)
# Train model
model = LinearRegression()
model.fit(X_train, y_train)
# Predict
y_pred = model.predict(X_test)
# Evaluate
print("Mean Squared Error:", mean_squared_error(y_test, y_pred))
print("R-squared:", r2_score(y_test, y_pred))
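After training, it can help to read the fitted intercept and slope back from the model and relate them to y = b0 + b1x. The sketch below assumes the same dummy Hours/Scores data as above and fits on all five rows for simplicity; the 6-hour input is a hypothetical value chosen only for illustration.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Same dummy dataset as in the example above
df = pd.DataFrame({'Hours': [1, 2, 3, 4, 5], 'Scores': [20, 40, 60, 80, 100]})

model = LinearRegression()
model.fit(df[['Hours']], df['Scores'])

b0 = model.intercept_   # fitted intercept
b1 = model.coef_[0]     # fitted slope for the 'Hours' feature
print("Intercept b0:", b0)
print("Slope b1:", b1)

# Predict the score for a hypothetical student who studies 6 hours
new_hours = pd.DataFrame({'Hours': [6]})
predicted = model.predict(new_hours)[0]
print("Predicted score for 6 hours:", predicted)

# The prediction matches the formula y = b0 + b1 * x
print("Formula result:", b0 + b1 * 6)
```

Because the dummy scores are exactly 20 times the hours, the fitted slope comes out close to 20 and the intercept close to 0, up to floating-point error.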
2. Logistic Regression
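Definition: Logistic regression is a classification algorithm that estimates the probability that an input belongs to a class by passing a linear combination of the features through the sigmoid function.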
Key Points:
Formula: p = 1 / (1 + e^(-(b0 + b1x))), where p is the predicted probability of the positive class
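Logistic regression turns the linear expression b0 + b1x into a probability by applying the sigmoid function, then assigns a class by comparing that probability with a threshold. A minimal sketch follows; the coefficient values and the 0.5 threshold are assumptions chosen for illustration, not fitted values.

```python
import numpy as np


def sigmoid(z):
    """Map any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def predict_probability(x, b0, b1):
    """Probability of the positive class for input x under p = sigmoid(b0 + b1*x)."""
    return sigmoid(b0 + b1 * x)


# Hypothetical coefficients (assumed, not learned from data)
b0, b1 = -4.0, 2.0

inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
probabilities = [predict_probability(x, b0, b1) for x in inputs]
# Classify as positive (1) when the probability reaches the 0.5 threshold
labels = [int(p >= 0.5) for p in probabilities]

for x, p, label in zip(inputs, probabilities, labels):
    print(f"x={x}: probability={p:.3f}, predicted class={label}")
```

With these coefficients the decision boundary sits at x = 2, where b0 + b1x is zero and the probability is exactly 0.5.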
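Steps to Implement: Load the Iris data, build a binary target, split it into training and test sets, fit a LogisticRegression model, predict on the test set, and evaluate with accuracy and a confusion matrix.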
Code Implementation:
# Import libraries
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
# Load dataset
iris = load_iris()
X = iris.data[:, :2] # Taking only the first two features for simplicity
y = (iris.target == 0).astype(int) # Binary classification: Is the species Setosa?
# Split dataset
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
# Train model
log_model = LogisticRegression()
log_model.fit(X_train, y_train)
# Predict
y_pred = log_model.predict(X_test)
# Evaluate
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
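To show how the two evaluation outputs relate, the sketch below unpacks a binary confusion matrix into its four counts and recomputes accuracy from them. The label arrays are small hypothetical values, used only so the counts are easy to check by hand.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true labels and predictions (assumed for illustration)
y_true = np.array([1, 0, 0, 1, 0, 1, 0, 0])
y_hat = np.array([1, 0, 1, 1, 0, 0, 0, 0])

# For binary labels 0/1, rows are actual classes and columns are predicted classes
cm = confusion_matrix(y_true, y_hat)
tn, fp, fn, tp = cm.ravel()

# Accuracy is the share of correct predictions: the diagonal of the matrix
accuracy_from_cm = (tp + tn) / cm.sum()

print("Confusion Matrix:\n", cm)
print("True negatives:", tn, "False positives:", fp)
print("False negatives:", fn, "True positives:", tp)
print("Accuracy from matrix:", accuracy_from_cm)
print("Accuracy from accuracy_score:", accuracy_score(y_true, y_hat))
```

The confusion matrix carries more information than accuracy alone, since it separates false positives from false negatives.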
Definition:
Key Metrics:
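1. Mean Absolute Error (MAE):
o Formula: MAE = (1/n) Σ |y_i − ŷ_i|
2. Mean Squared Error (MSE):
o Formula: MSE = (1/n) Σ (y_i − ŷ_i)²
3. Root Mean Squared Error (RMSE):
o Formula: RMSE = √MSE = √((1/n) Σ (y_i − ŷ_i)²)
o Interpretation: Gives the error in the same unit as the target variable.
4. R-squared (R²):
o Formula: R² = 1 − SS_res / SS_tot, where SS_res = Σ (y_i − ŷ_i)² and SS_tot = Σ (y_i − ȳ)²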
Code Implementation:
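# Example: Evaluating a regression model
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
# Illustrative values (assumed); replace with your own targets and model predictions
y_actual = np.array([20, 40, 60, 80, 100])
y_pred = np.array([22, 38, 61, 79, 103])
# Calculate metrics
mae = mean_absolute_error(y_actual, y_pred)
mse = mean_squared_error(y_actual, y_pred)
rmse = np.sqrt(mse)  # square root brings the error back to the target's unit
r2 = r2_score(y_actual, y_pred)
# Print results
print("Mean Absolute Error:", mae)
print("Mean Squared Error:", mse)
print("Root Mean Squared Error:", rmse)
print("R-squared:", r2)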
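The same metrics can be computed directly from their definitions, which makes clear what the library calls are doing. This is a minimal sketch using NumPy only; the function name `regression_metrics` and the sample values are assumptions introduced for illustration.

```python
import numpy as np


def regression_metrics(y_actual, y_pred):
    """Compute MAE, MSE, RMSE and R-squared from their definitions."""
    y_actual = np.asarray(y_actual, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_actual - y_pred

    mae = np.mean(np.abs(errors))        # average absolute error
    mse = np.mean(errors ** 2)           # average squared error
    rmse = np.sqrt(mse)                  # same unit as the target variable

    ss_res = np.sum(errors ** 2)                         # residual sum of squares
    ss_tot = np.sum((y_actual - y_actual.mean()) ** 2)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot

    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Hypothetical actual values and predictions (assumed for illustration)
metrics = regression_metrics([3, 5, 7, 9], [2, 5, 8, 9])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that R-squared needs at least two distinct actual values; otherwise the total sum of squares is zero and the ratio is undefined.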
Summary Table