ML Exp 8

Uploaded by

22b61a6603
Experiment 8

• Implementation of Logistic
Regression using sklearn
Logistic Regression
• Logistic regression is a basic machine learning method that is
frequently used for binary classification tasks.
• It uses the sigmoid function to model the probability that an
instance belongs to a specific class, producing values
between 0 and 1.
• Because it is interpretable, simple, and computationally
efficient, logistic regression is widely applied in fields such
as marketing, finance, and healthcare, where its probability
estimates support decision-making.
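The sigmoid mapping described above can be sketched in a few lines. This is a minimal illustration, not part of the original experiment; the helper name `sigmoid` and the example scores are assumptions chosen only to show the shape of the function.

```python
import numpy as np

def sigmoid(z):
    """Map any real-valued score z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical linear scores, chosen only for illustration
scores = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])
probs = sigmoid(scores)

# Larger scores give larger probabilities; a score of 0 maps to exactly 0.5
print(np.round(probs, 3))
```

A score of 0 corresponds to a probability of 0.5, which is why the usual 0.5 threshold places the decision boundary where the linear score equals zero.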
• Logistic regression is a statistical model for binary
classification.
• Using the sigmoid function, it predicts the probability
that an instance belongs to a particular class, which
guarantees outputs between 0 and 1.
• The model computes a linear combination of the input
features, transforms it with the sigmoid, and optimizes its
coefficients with methods such as gradient descent so as to
minimize the log loss.
• These coefficients establish the decision boundary that
divides the classes.
• Because of its ease of use, interpretability, and
versatility across multiple domains, logistic regression
is widely used in machine learning for problems that
involve binary outcomes.
• Overfitting can be reduced by applying
regularization.
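The training loop summarized in the bullets above (linear combination, sigmoid, log loss, gradient descent) can be sketched from scratch as follows. This is an illustrative sketch under stated assumptions: the function names `fit_logistic_gd` and `log_loss`, the learning rate, the iteration count, and the toy data are all hypothetical, and sklearn itself uses a different solver by default.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(y, p, eps=1e-12):
    """Average binary cross-entropy; p is clipped to avoid log(0)."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def fit_logistic_gd(X, y, lr=0.1, n_iter=500, l2=0.0):
    """Plain batch gradient descent; l2 > 0 adds an L2 penalty on the weights."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    losses = []
    for _ in range(n_iter):
        p = sigmoid(X @ w + b)                    # predicted probabilities
        losses.append(log_loss(y, p))             # data term of the loss only
        error = p - y                             # gradient of log loss w.r.t. the score
        w -= lr * (X.T @ error / n_samples + l2 * w)
        b -= lr * error.mean()
    return w, b, losses

# Toy data (an assumption for illustration): class 1 when x0 + x1 > 0
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(200, 2))
y_toy = (X_toy[:, 0] + X_toy[:, 1] > 0).astype(float)

w, b, losses = fit_logistic_gd(X_toy, y_toy)
train_acc = np.mean((X_toy @ w + b > 0) == (y_toy == 1))
print(f"first loss={losses[0]:.4f}, last loss={losses[-1]:.4f}, accuracy={train_acc:.2f}")
```

With all coefficients initialized to zero, every prediction starts at 0.5, so the first recorded loss equals ln 2; the loss then decreases as the coefficients move toward the separating direction.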
Logistic regression uses a linear equation to combine the
input features and the sigmoid function to restrict
predictions to values between 0 and 1.
Gradient descent and other techniques are used to
optimize the model's coefficients so that the log loss is
minimized.
These coefficients determine the resulting decision
boundary, which divides instances into two classes.
• For binary classification, logistic regression is a
strong default choice because it is easy to understand,
straightforward to implement, and useful in a variety of
settings.
• Generalization can be improved by using regularization.
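In sklearn, the strength of regularization is controlled by the `C` parameter of `LogisticRegression` (smaller `C` means stronger regularization). The sketch below uses a synthetic dataset from `make_classification`, which is an assumption made only for illustration and is not the dataset used in this experiment.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic data, used here only to illustrate the effect of C
X_demo, y_demo = make_classification(
    n_samples=300, n_features=10, n_informative=4, random_state=0)

norms = {}
for C in (0.01, 1.0, 100.0):
    clf = LogisticRegression(C=C, max_iter=1000)
    clf.fit(X_demo, y_demo)
    # Norm of the coefficient vector: stronger regularization shrinks it
    norms[C] = float(np.linalg.norm(clf.coef_))
    print(f"C={C:<6} ||coef|| = {norms[C]:.3f}")
```

Shrinking the coefficients makes the model less sensitive to individual features, which is the mechanism by which regularization can improve generalization.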
Python Code Import Libraries

# Import necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, roc_curve, auc)
Read and Explore the data

# Load the diabetes dataset (a regression dataset whose target is a
# quantitative measure of disease progression)
diabetes = load_diabetes()
X, y = diabetes.data, diabetes.target

# Convert the target variable to binary
# (1 for above-median progression, 0 otherwise)
y_binary = (y > np.median(y)).astype(int)
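Before training, it can help to check the column order and the class balance produced by the median split. The following is a small exploratory sketch; the variable name `counts` is an assumption introduced here.

```python
import numpy as np
from sklearn.datasets import load_diabetes

diabetes = load_diabetes()
y = diabetes.target
y_binary = (y > np.median(y)).astype(int)

# Column order matters later when features are selected by index
print(diabetes.feature_names)

# Number of samples in class 0 and class 1 after the median split
counts = np.bincount(y_binary)
print("class 0:", counts[0], "class 1:", counts[1])
```

Because the threshold is the median, the two classes come out roughly balanced, which makes plain accuracy a reasonable first metric for this task.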
Splitting The Dataset: Train and Test dataset

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y_binary, test_size=0.2, random_state=42)
Feature Scaling

# Standardize features (fit on the training set only, then reuse
# the same statistics for the test set)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
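The scaling step above fits the scaler on the training data and only applies it to the test data. A minimal sketch of what that means, using synthetic arrays (an assumption for illustration; the names `X_train_demo` and `X_test_demo` are hypothetical):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic features with mean around 50 and standard deviation around 10
rng = np.random.default_rng(0)
X_train_demo = rng.normal(loc=50.0, scale=10.0, size=(80, 3))
X_test_demo = rng.normal(loc=50.0, scale=10.0, size=(20, 3))

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train_demo)  # learns mean_ and scale_
X_test_scaled = scaler.transform(X_test_demo)        # reuses the training statistics

# transform() is just (x - training mean) / training standard deviation
manual = (X_test_demo - scaler.mean_) / scaler.scale_

print("train mean:", X_train_scaled.mean(axis=0).round(3))
print("train std: ", X_train_scaled.std(axis=0).round(3))
print("test mean: ", X_test_scaled.mean(axis=0).round(3))
```

The test-set means are close to zero but not exactly zero, which is expected: the test data is scaled with statistics it did not contribute to, so no information leaks from the test set into training.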
Train The Model

# Train the Logistic Regression model
model = LogisticRegression()
model.fit(X_train, y_train)
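Once a model is fitted, its learned coefficients can be inspected through `coef_` and `intercept_`, and the predicted probabilities can be reproduced by hand. The sketch below refits on the full dataset purely for illustration (an assumption; the experiment itself fits on the training split), and the names `clf`, `z`, and `manual_prob` are hypothetical.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

data = load_diabetes()
X_demo = StandardScaler().fit_transform(data.data)
y_demo = (data.target > np.median(data.target)).astype(int)

clf = LogisticRegression().fit(X_demo, y_demo)

# Linear combination of the inputs, then the sigmoid
z = X_demo @ clf.coef_.ravel() + clf.intercept_[0]
manual_prob = 1.0 / (1.0 + np.exp(-z))

# One coefficient per feature; the sign shows the direction of the effect
for name, coef in zip(data.feature_names, clf.coef_.ravel()):
    print(f"{name:>4}: {coef:+.3f}")

print("matches predict_proba:",
      np.allclose(manual_prob, clf.predict_proba(X_demo)[:, 1]))
```

The decision boundary is the set of points where `z` equals 0, that is, where the predicted probability equals 0.5.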
Evaluation Metrics

# Evaluate the model
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy: {:.2f}%".format(accuracy * 100))
Output
• Accuracy: 73.03%
Confusion Matrix and
Classification Report
# Print the confusion matrix and the classification report
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
print("\nClassification Report:\n", classification_report(y_test, y_pred))
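To make the confusion matrix and the report easier to read, the sketch below derives accuracy, precision, and recall from the four cells of a small confusion matrix. The labels `y_true` and `y_hat` are made up for illustration and are not results from this experiment.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical true labels and predictions (10 samples)
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 0])
y_hat = np.array([0, 1, 0, 0, 1, 1, 0, 1, 1, 0])

# For binary labels, sklearn orders the cells as [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_hat).ravel()

accuracy_demo = (tp + tn) / (tp + tn + fp + fn)
precision_demo = tp / (tp + fp)   # of the predicted positives, how many are correct
recall_demo = tp / (tp + fn)      # of the actual positives, how many are found

print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(f"accuracy={accuracy_demo:.2f} precision={precision_demo:.2f} recall={recall_demo:.2f}")
```

The per-class precision and recall in `classification_report` are computed in the same way, once for each class treated as the positive class.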
Visualizing the performance of our model.

# Scatter plot of two standardized test features, colored by true class
# (this shows the test points; no decision boundary is drawn)
plt.figure(figsize=(8, 6))
sns.scatterplot(x=X_test[:, 2], y=X_test[:, 8], hue=y_test,
                palette={0: 'blue', 1: 'red'}, marker='o')
plt.xlabel("BMI")
plt.ylabel("s5")  # column 8 of load_diabetes is s5; age is column 0
plt.title("Logistic Regression: Test Set\nAccuracy: {:.2f}%".format(
    accuracy * 100))
plt.legend(title="Above-median progression", loc="upper right")
plt.show()
OUTPUT
Plotting ROC Curve
# Plot ROC Curve
y_prob = model.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, y_prob)
roc_auc = auc(fpr, tpr)

plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, color='darkorange', lw=2,
         label=f'ROC Curve (AUC = {roc_auc:.2f})')
plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--', label='Random')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic (ROC) Curve\nAccuracy: {:.2f}%'.format(
    accuracy * 100))
plt.legend(loc="lower right")
plt.show()
OUTPUT
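One way to interpret the AUC reported on the ROC plot: it equals the fraction of (positive, negative) pairs in which the positive instance receives the higher score. The sketch below checks this on a tiny made-up example; the labels and scores are assumptions for illustration only.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc, roc_auc_score

# Hypothetical labels and predicted probabilities (no tied scores)
y_true_demo = np.array([0, 0, 1, 1, 0, 1])
scores_demo = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7])

# Area under the ROC curve, computed as in the experiment
fpr_demo, tpr_demo, _ = roc_curve(y_true_demo, scores_demo)
area = auc(fpr_demo, tpr_demo)

# Same quantity from its ranking interpretation: compare every
# positive score against every negative score
pos = scores_demo[y_true_demo == 1]
neg = scores_demo[y_true_demo == 0]
pairwise = np.mean(pos[:, None] > neg[None, :])

print(f"auc={area:.4f} pairwise={pairwise:.4f} "
      f"roc_auc_score={roc_auc_score(y_true_demo, scores_demo):.4f}")
```

Here 8 of the 9 positive-negative pairs are ranked correctly, so all three numbers agree at 8/9. An AUC of 0.5 corresponds to the dashed "Random" diagonal in the plot.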
