Logistic Regression
Logistic regression predicts a categorical dependent variable from a given set of independent
variables.
Logistic regression is a supervised machine learning algorithm mainly used for classification tasks,
where the goal is to predict the probability that an instance belongs to a given class.
The outcome must therefore be a categorical or discrete value, such as Yes or No, 0 or 1, True or
False. Instead of returning exactly 0 or 1, however, the model returns a probability that lies
between 0 and 1.
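The difference between the predicted probability and the predicted class can be sketched as follows. This is a minimal illustration on hypothetical toy data (`X_toy` and `y_toy` are made up for this example and are not taken from the diabetes dataset); it assumes scikit-learn's default `LogisticRegression` settings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: one feature and a binary label (not the diabetes dataset).
X_toy = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0], [7.0], [8.0]])
y_toy = np.array([0, 0, 0, 1, 0, 1, 1, 1])

model = LogisticRegression()
model.fit(X_toy, y_toy)

# predict_proba returns one column per class; column 1 holds P(class = 1).
proba = model.predict_proba(X_toy)[:, 1]

# predict returns the more probable class, which for two classes
# amounts to a 0.5 cutoff on the probability of class 1.
labels = model.predict(X_toy)

print(proba)
print(labels)
```

Each entry of `proba` lies strictly between 0 and 1, while `labels` contains only the discrete values 0 and 1.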
Logistic regression is named for the function at the core of the method, the logistic function, also
called the sigmoid function. It is an S-shaped curve that maps any real-valued number to a value
between 0 and 1, without ever reaching those limits exactly.
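The sigmoid function can be written in a few lines. The sketch below uses the standard definition 1 / (1 + e^(-z)); the helper name `sigmoid` and the sample inputs are chosen here for illustration only.

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Sample inputs: a negative value, zero, and a positive value.
z = np.array([-5.0, 0.0, 5.0])
p = sigmoid(z)

# Negative inputs map below 0.5, zero maps to exactly 0.5,
# and positive inputs map above 0.5.
print(p)
```

In floating-point arithmetic, inputs of very large magnitude can still round to exactly 0 or 1, even though the mathematical function never reaches those limits.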
EXAMPLE CODE.
This code uses the diabetes dataset, which is attached to this file.
1. Importing Libraries.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn import metrics
- Imports the libraries needed for data manipulation (pandas, NumPy), visualization (seaborn,
matplotlib), and logistic regression, data splitting, and model evaluation (scikit-learn).
6. Train-Test Split.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=7)
- Splits the data into training and testing sets using `train_test_split`. 30% of the data is reserved for
testing and the remaining 70% is used for training. Fixing `random_state` makes the split reproducible:
the same rows land in each set on every run.
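To show what the split produces, here is a minimal sketch that runs on synthetic data. `X_demo` and `y_demo` are hypothetical stand-ins generated with `make_classification`, not the diabetes features and target, and the `max_iter=1000` setting is an assumption made only so the solver converges comfortably.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn import metrics

# Synthetic stand-in for the features (X) and target (y); 100 rows, 5 columns.
X_demo, y_demo = make_classification(n_samples=100, n_features=5, random_state=7)

# Same call as above: 30% of the rows go to the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.30, random_state=7
)

# 100 rows split into 70 for training and 30 for testing.
print(X_train.shape, X_test.shape)

# Fit on the training rows only, then score on the held-out test rows.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = metrics.accuracy_score(y_test, model.predict(X_test))
print(accuracy)
```

Evaluating on rows the model never saw during fitting gives a more honest estimate of how it behaves on new data than scoring it on the training set.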