
Dav Exp4 66

This document discusses implementing logistic regression in Python. It provides an overview of logistic regression, describing it as a classification algorithm that predicts the probability of class membership. The document then shows code to load and preprocess data, train a logistic regression classifier on the data, make predictions on new data, and evaluate the model's performance.

Uploaded by

godizlatan

EXPERIMENT-4: Implement Logistic Regression in Python

AIM: To implement logistic regression in Python/R.

THEORY: Logistic regression is a supervised machine learning algorithm used for classification tasks,
where the goal is to predict the probability that an instance belongs to a given class. It is a
statistical method that models the relationship between one or more independent variables and a
categorical dependent variable. This report covers the fundamentals of logistic regression, its types,
and its implementation.

What is Logistic Regression?

Logistic regression is used for binary classification. It applies the sigmoid function to a linear
combination of the independent variables and produces a probability value between 0 and 1.

For example, suppose we have two classes, Class 0 and Class 1. If the value of the logistic function for
an input is greater than 0.5 (the threshold value), the input is assigned to Class 1; otherwise it is
assigned to Class 0. It is called regression because it extends linear regression, but it is mainly used
for classification problems.
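
The thresholding rule described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the method used in the experiment: the weights, bias, and sample points are hypothetical values chosen only to show the mechanics.

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights and bias, chosen only for illustration
weights = np.array([0.8, -0.5])
bias = 0.1

def predict_proba(x):
    """Probability of Class 1: sigmoid of the linear score."""
    return sigmoid(np.dot(x, weights) + bias)

def predict_class(x, threshold=0.5):
    """Assign Class 1 when the probability exceeds the threshold, else Class 0."""
    return (predict_proba(x) > threshold).astype(int)

# Two hypothetical inputs, one on each side of the decision boundary
samples = np.array([[2.0, 1.0], [-1.0, 3.0]])
probabilities = predict_proba(samples)
classes = predict_class(samples)

print(probabilities)
print(classes)
```

The first sample has a positive linear score and is assigned to Class 1; the second has a negative score and is assigned to Class 0.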

Key Points:

Logistic regression predicts the output of a categorical dependent variable, so the outcome must be a
categorical or discrete value.

The outcome can be Yes or No, 0 or 1, True or False, etc., but instead of giving an exact value of 0 or 1,
the model gives probabilistic values that lie between 0 and 1.

In logistic regression, instead of fitting a regression line, we fit an “S”-shaped logistic function
whose output approaches two limiting values (0 and 1).

Logistic Function – Sigmoid Function

The sigmoid function is a mathematical function used to map predicted values to probabilities.

It maps any real value to a value between 0 and 1. The output of logistic regression cannot go beyond
these limits, so the function forms an “S”-shaped curve.

This S-shaped curve is called the sigmoid function or the logistic function.
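
As a compact reference for the curve described above, the sigmoid can be written as follows. The symbols z, w, x and b (linear score, weight vector, feature vector and bias) are notation assumed here; they do not appear in the original text.

```latex
\[
\begin{aligned}
\sigma(z) &= \frac{1}{1 + e^{-z}}, \qquad z = w^{\top} x + b \\
\lim_{z \to -\infty} \sigma(z) &= 0, \qquad \sigma(0) = \tfrac{1}{2}, \qquad \lim_{z \to +\infty} \sigma(z) = 1
\end{aligned}
\]
```

Because the output equals 0.5 exactly when z = 0, a threshold of 0.5 on the probability corresponds to checking the sign of the linear score z.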

In logistic regression, we use a threshold value to convert the probability into a class label of 0 or 1:
values above the threshold are assigned to 1, and values below the threshold are assigned to 0.
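
To illustrate how the threshold affects the predicted labels, here is a small sketch using scikit-learn. It assumes synthetic data from make_classification rather than the dataset used in this experiment, and the alternative threshold of 0.8 is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic two-feature data, used only for illustration
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

model = LogisticRegression(random_state=0).fit(X, y)

# Column 1 of predict_proba holds the estimated probability of class 1
prob_class1 = model.predict_proba(X)[:, 1]

# Apply two different thresholds to the same probabilities
default_labels = (prob_class1 > 0.5).astype(int)
strict_labels = (prob_class1 > 0.8).astype(int)

print("Predicted as class 1 at threshold 0.5:", default_labels.sum())
print("Predicted as class 1 at threshold 0.8:", strict_labels.sum())
```

Raising the threshold can only reduce (or leave unchanged) the number of instances labelled as class 1, which is useful when false positives are costlier than false negatives.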

Types of Logistic Regression

On the basis of the number and ordering of the categories, logistic regression can be classified into
three types:

Binomial: In binomial logistic regression, the dependent variable has only two possible values, such as
0 or 1, Pass or Fail, etc.

Manav Mangela T13 66 1


Multinomial: In multinomial logistic regression, the dependent variable has three or more possible
unordered values, such as “cat”, “dog”, or “sheep”.

Ordinal: In ordinal logistic regression, the dependent variable has three or more possible ordered
values, such as “Low”, “Medium”, or “High”.
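
The multinomial case can be sketched with scikit-learn as follows. This is an illustration only: it assumes the built-in Iris dataset (three unordered classes), which is not part of this experiment, and relies on the default behaviour of LogisticRegression in recent scikit-learn versions. The ordinal case is not covered by this sketch.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Iris has three unordered classes (labelled 0, 1, 2)
X, y = load_iris(return_X_y=True)

# max_iter is raised so the solver converges on the unscaled features
model = LogisticRegression(max_iter=1000).fit(X, y)

# One probability per class for each sample; each row sums to 1
probabilities = model.predict_proba(X[:5])

print(model.classes_)
print(probabilities.round(3))
print(probabilities.sum(axis=1))
```

The predicted class for each sample is the one with the highest probability in its row.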

CODE:

# Import the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Load the dataset: every column except the last is a feature, the last column is the label
dataset = pd.read_csv('Social_Network_Ads.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

# Split the data into a training set (75%) and a test set (25%)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
print(X_train)
print(y_train)
print(X_test)
print(y_test)

# Feature scaling: fit the scaler on the training set only, then apply it to the test set
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
print(X_train)
print(X_test)

# Train the logistic regression model on the training set
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state = 0)
classifier.fit(X_train, y_train)



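# Predict the class of a new observation (age 30, estimated salary 87000)
print(classifier.predict(sc.transform([[30,87000]])))

# Predict the test set and print the predictions next to the true labels
y_pred = classifier.predict(X_test)
print(np.concatenate((y_pred.reshape(len(y_pred),1), y_test.reshape(len(y_test),1)),1))

# Evaluate the model with a confusion matrix and the accuracy score
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
print(cm)
print(accuracy_score(y_test, y_pred))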

# Visualise the decision regions on the training set (in the original feature units)
from matplotlib.colors import ListedColormap
X_set, y_set = sc.inverse_transform(X_train), y_train
# Note: a step of 0.25 on the salary axis gives a very dense grid and may be slow and
# memory-hungry if the salary range is large
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 10, stop = X_set[:, 0].max() + 10, step = 0.25),
                     np.arange(start = X_set[:, 1].min() - 1000, stop = X_set[:, 1].max() + 1000, step = 0.25))
plt.contourf(X1, X2, classifier.predict(sc.transform(np.array([X1.ravel(), X2.ravel()]).T)).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('salmon', 'dodgerblue')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
# Plot the training points, one colour per class
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                color = ListedColormap(('salmon', 'dodgerblue'))(i), label = j)
plt.title('Logistic Regression (Training set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()



# Visualise the decision regions on the test set (in the original feature units)
from matplotlib.colors import ListedColormap
X_set, y_set = sc.inverse_transform(X_test), y_test
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 10, stop = X_set[:, 0].max() + 10, step = 0.25),
                     np.arange(start = X_set[:, 1].min() - 1000, stop = X_set[:, 1].max() + 1000, step = 0.25))
plt.contourf(X1, X2, classifier.predict(sc.transform(np.array([X1.ravel(), X2.ravel()]).T)).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('salmon', 'dodgerblue')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
# Plot the test points, one colour per class
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                color = ListedColormap(('salmon', 'dodgerblue'))(i), label = j)
plt.title('Logistic Regression (Test set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()
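
The listing above prints a confusion matrix and an accuracy score. As a short sketch of how the two relate, the following example derives accuracy, precision, and recall from the four cells of a binary confusion matrix. The labels here are hypothetical values made up for illustration; they are not results from this experiment.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score

# Hypothetical true and predicted labels, used only for illustration
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0, 1, 1])
y_hat = np.array([0, 0, 1, 0, 1, 0, 1, 0, 1, 1])

# For binary labels, scikit-learn lays the matrix out as [[TN, FP], [FN, TP]]
cm = confusion_matrix(y_true, y_hat)
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()   # fraction of all predictions that are correct
precision = tp / (tp + fp)        # of the predicted 1s, how many are truly 1
recall = tp / (tp + fn)           # of the true 1s, how many were found

print(cm)
print(accuracy, precision, recall)
```

The accuracy computed from the matrix matches the value returned by accuracy_score on the same labels.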

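CONCLUSION: Thus, we successfully implemented logistic regression in Python, trained it on the
Social_Network_Ads dataset, and evaluated it using a confusion matrix and the accuracy score.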
