Diabetes Prediction System

The document loads diabetes patient data and performs exploratory data analysis including checking for missing values, calculating statistics, and visualizing correlations. It then splits the data into training and test sets, trains a logistic regression model on the training set, uses the model to make predictions on the test set, and calculates the accuracy score of the predictions.

Uploaded by

saurabh khairnar

In [17]: import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

import warnings
warnings.filterwarnings('ignore')

In [2]: df=pd.read_csv("diabetes.csv")
df.head()

Out[2]:    Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin   BMI  DiabetesPedigreeFunction  Age  Outcome
        0            6      148             72             35        0  33.6                     0.627   50        1
        1            1       85             66             29        0  26.6                     0.351   31        0
        2            8      183             64              0        0  23.3                     0.672   32        1
        3            1       89             66             23       94  28.1                     0.167   21        0
        4            0      137             40             35      168  43.1                     2.288   33        1

In [3]: df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 768 entries, 0 to 767
Data columns (total 9 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Pregnancies 768 non-null int64
1 Glucose 768 non-null int64
2 BloodPressure 768 non-null int64
3 SkinThickness 768 non-null int64
4 Insulin 768 non-null int64
5 BMI 768 non-null float64
6 DiabetesPedigreeFunction 768 non-null float64
7 Age 768 non-null int64
8 Outcome 768 non-null int64
dtypes: float64(2), int64(7)
memory usage: 54.1 KB

In [5]: df.describe()

Out[5]:        Pregnancies     Glucose  BloodPressure  SkinThickness     Insulin         BMI  DiabetesPedigreeFunctio
        count   768.000000  768.000000     768.000000     768.000000  768.000000  768.000000                768.00000
        mean      3.845052  120.894531      69.105469      20.536458   79.799479   31.992578                  0.47187
        std       3.369578   31.972618      19.355807      15.952218  115.244002    7.884160                  0.33132
        min       0.000000    0.000000       0.000000       0.000000    0.000000    0.000000                  0.07800
        25%       1.000000   99.000000      62.000000       0.000000    0.000000   27.300000                  0.24375
        50%       3.000000  117.000000      72.000000      23.000000   30.500000   32.000000                  0.37250
        75%       6.000000  140.250000      80.000000      32.000000  127.250000   36.600000                  0.62625
        max      17.000000  199.000000     122.000000      99.000000  846.000000   67.100000                  2.42000

CHECKING FOR MISSING VALUES
In [6]: df.isnull().sum()

Out[6]: Pregnancies 0
Glucose 0
BloodPressure 0
SkinThickness 0
Insulin 0
BMI 0
DiabetesPedigreeFunction 0
Age 0
Outcome 0
dtype: int64
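
A note on what this check covers: isnull() counts only NaN entries, so a count of 0 does not rule out placeholder values. The df.describe() output reports a minimum of 0 for Glucose, BloodPressure, SkinThickness, Insulin and BMI. If a 0 in those columns stands for an unrecorded measurement (an assumption; the notebook does not say so), counting zeros gives a rough picture of hidden gaps. The sketch below does this in plain Python on the five rows shown by df.head(). The list SUSPECT_COLUMNS and the helper count_zeros are illustrative names, not part of the notebook.

```python
# First five rows of the dataset, copied from the df.head() output.
COLUMNS = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
           "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome"]
ROWS = [
    [6, 148, 72, 35, 0, 33.6, 0.627, 50, 1],
    [1, 85, 66, 29, 0, 26.6, 0.351, 31, 0],
    [8, 183, 64, 0, 0, 23.3, 0.672, 32, 1],
    [1, 89, 66, 23, 94, 28.1, 0.167, 21, 0],
    [0, 137, 40, 35, 168, 43.1, 2.288, 33, 1],
]

# ASSUMPTION: a zero in these columns means "not measured".
SUSPECT_COLUMNS = ["Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI"]


def count_zeros(rows, columns, suspect_columns):
    """Return {column: number of rows in which that column equals 0}."""
    counts = {}
    for name in suspect_columns:
        position = columns.index(name)
        counts[name] = sum(1 for row in rows if row[position] == 0)
    return counts


zero_counts = count_zeros(ROWS, COLUMNS, SUSPECT_COLUMNS)
print(zero_counts)
```

On the full DataFrame the same count can be written as (df[SUSPECT_COLUMNS] == 0).sum().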

In [12]: sns.heatmap(df.isnull())

Out[12]: <AxesSubplot:>

CORRELATION MATRIX
In [14]: df.corr()
Out[14]:                           Pregnancies   Glucose  BloodPressure  SkinThickness    Insulin       BMI  DiabetesP
         Pregnancies                  1.000000  0.129459       0.141282      -0.081672  -0.073535  0.017683
         Glucose                      0.129459  1.000000       0.152590       0.057328   0.331357  0.221071
         BloodPressure                0.141282  0.152590       1.000000       0.207371   0.088933  0.281805
         SkinThickness               -0.081672  0.057328       0.207371       1.000000   0.436783  0.392573
         Insulin                     -0.073535  0.331357       0.088933       0.436783   1.000000  0.197859
         BMI                          0.017683  0.221071       0.281805       0.392573   0.197859  1.000000
         DiabetesPedigreeFunction    -0.033523  0.137337       0.041265       0.183928   0.185071  0.140647
         Age                          0.544341  0.263514       0.239528      -0.113970  -0.042163  0.036242
         Outcome                      0.221898  0.466581       0.065068       0.074752   0.130548  0.292695
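
By default df.corr() computes the Pearson correlation coefficient for every pair of columns, a value between -1 and 1 that measures how closely two columns follow a straight-line relationship. In the Outcome row, Glucose has the largest value among the columns shown (0.466581). As a rough illustration of what a single cell of the matrix represents, the sketch below computes the coefficient in plain Python. The function name pearson_r and the toy data are hypothetical and are not taken from the dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data, chosen so the relationships are exactly linear.
rising = [1, 2, 3, 4, 5]
r_up = pearson_r(rising, [2, 4, 6, 8, 10])    # increasing together
r_down = pearson_r(rising, [10, 8, 6, 4, 2])  # one rises, the other falls
print(r_up, r_down)
```

A coefficient near 0 only says that there is no linear relationship; it does not rule out other kinds of dependence.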

Visualizing the correlation

In [16]: sns.heatmap(df.corr(), cmap="pink")

TRAINING THE MODEL WITH THE HELP OF TRAIN TEST SPLIT
In [20]: x=df.drop('Outcome',axis=1)
y=df['Outcome']
x_train,x_test,y_train,y_test= train_test_split(x,y,test_size=0.2)

x holds all the independent variables (the features); y holds the target variable, Outcome. A train-test
split is a technique used in machine learning to assess model performance on data the model has not seen
during training. It divides the dataset into a training set and a testing set; test_size=0.2 means that 20%
of the data is used for testing and 80% for training.
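
To make the 80/20 arithmetic concrete, the following plain-Python sketch shuffles the 768 row indices and cuts off a test share, rounding the test count up. It is only an illustration of the idea, not scikit-learn's implementation; split_indices and seed are hypothetical names. With 768 rows this gives 154 test rows, which matches the 154 predictions printed by print(prediction) in this notebook.

```python
import math
import random


def split_indices(n_rows, test_size=0.2, seed=0):
    """Shuffle row indices and split them into (train, test) index lists."""
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be strictly between 0 and 1")
    indices = list(range(n_rows))
    # A seeded generator makes the shuffle, and so the split, repeatable.
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(n_rows * test_size)  # round the test share up
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_indices(768)
print(len(train_idx), len(test_idx))  # 614 154
```

In scikit-learn itself, passing random_state to train_test_split fixes the shuffle in the same way, so that the split and the resulting accuracy are reproducible.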

Training the model

In [22]: model=LogisticRegression()
model.fit(x_train,y_train)

Out[22]: LogisticRegression()

The model is fitted to the training data (x_train and y_train); the fitted estimator is stored in the variable model.
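
For a binary target, a fitted logistic regression model combines the features into a weighted sum, passes that sum through the logistic (sigmoid) function to obtain a probability, and predicts class 1 when the probability exceeds 0.5. The sketch below shows that mechanism in plain Python. WEIGHTS and BIAS are hypothetical numbers picked for illustration, not the coefficients learned in this notebook, and only Glucose and BMI are used to keep the example short.

```python
import math


def sigmoid(z):
    """Logistic function in a numerically stable form; result lies between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(features, weights, bias):
    """Estimated probability of class 1 for one sample."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict(features, weights, bias, threshold=0.5):
    """Class label: 1 if the estimated probability exceeds the threshold, else 0."""
    return 1 if predict_proba(features, weights, bias) > threshold else 0


# HYPOTHETICAL coefficients for (Glucose, BMI); not fitted to the data.
WEIGHTS = [0.03, 0.08]
BIAS = -6.5

patient_a = [148, 33.6]  # Glucose and BMI of row 0 in the df.head() output
patient_b = [85, 26.6]   # Glucose and BMI of row 1 in the df.head() output
label_a = predict(patient_a, WEIGHTS, BIAS)
label_b = predict(patient_b, WEIGHTS, BIAS)
print(label_a, label_b)
```

Fitting the model amounts to choosing the weights and the bias so that these probabilities agree with the training labels as well as possible.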

Making Prediction
In [23]: prediction =model.predict(x_test)

In [24]: print(prediction)

[1 0 1 0 0 1 0 0 1 1 1 1 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0 0 0 1 0 1 1 0 0 0 0 0
0 0 0 0 0 0 1 0 1 1 0 1 0 0 0 0 1 0 1 1 0 0 0 0 1 0 0 1 1 0 1 0 1 1 0 1 0
0 0 0 1 0 0 0 1 1 0 0 1 0 1 1 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 1 0 1 0]

The accuracy of the model is then calculated.

In [25]: accuracy=accuracy_score(y_test,prediction)

In [26]: print(accuracy)

0.7987012987012987

The model predicts about 79.9% of the test samples correctly. Because the split is not seeded, this figure can change from run to run.
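
Accuracy is the share of test samples whose predicted label equals the true label. The reported value of 0.7987012987012987 is consistent with 123 correct predictions out of the 154 test rows. The sketch below computes the same quantity in plain Python; the function name accuracy and the toy labels are hypothetical and serve only as an illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty sequences of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical toy labels, not taken from the notebook.
toy_true = [1, 0, 1, 0, 0]
toy_pred = [1, 0, 0, 0, 1]
toy_accuracy = accuracy(toy_true, toy_pred)
print(toy_accuracy)  # 3 of 5 labels match, so 0.6

# The notebook's score expressed as a ratio of counts.
notebook_accuracy = 123 / 154
print(round(notebook_accuracy, 4))
```

Accuracy treats both kinds of error alike, so it does not show whether the misclassified rows are mostly missed diabetic cases or mostly false alarms.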
