Volume 8, Issue 11, November 2023 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165

Prediction of Diabetes using Machine Learning: A Modern User-Friendly Model

S. Umar Kalimulla1, V. AlekyaSatyasri2, K. Srunvitha3, S. H. N. V. V. D. S. Sai Charan4, A. V. Satya Sai Ram5, Dr. V. Venkateswara Rao6
1,2,3,4,5 Department of Computer Science and Artificial Intelligence (CAI),
6 Professor, Department of Computer Science and Engineering (CSE),
Sri Vasavi Engineering College, Tadepalligudem, Andhra Pradesh, India

Abstract:- Diabetes is a prevalent chronic disease affecting a significant portion of the global population. Early detection and accurate prediction of diabetes can play a crucial role in managing the condition and preventing complications. Machine learning (ML) techniques have shown promising results in diabetes prediction based on patient data. In this study, we propose a user-understandable approach utilizing the Random Forest classifier algorithm for accurate and interpretable diabetes prediction. To build our prediction model, we utilized a comprehensive dataset comprising various patient attributes, including age, body mass index (BMI), blood pressure, glucose levels, and medical history. Pre-processing techniques were applied to handle missing values and normalize the data, followed by feature selection to identify the most relevant attributes for diabetes prediction. The user-understandable representation of the model facilitated effective interpretation and communication of the prediction results. This allows healthcare professionals to explain the prediction rationale to patients, promoting shared decision-making and patient engagement.

I. INTRODUCTION

In an era where healthcare is increasingly intertwined with advanced technology, our project examines a crucial area of public health: the prediction of diabetes. Early identification is essential for optimal care of diabetes, a chronic metabolic condition that affects millions of people worldwide. Our study aims to create a prediction model that can identify people at risk of diabetes before clinical signs appear by leveraging the power of machine learning. This project aims to contribute to proactive healthcare initiatives and enable people to make well-informed decisions about their future health by utilizing a wide range of health-related attributes.

II. LITERATURE SURVEY

Four pertinent papers that investigate diabetes prediction are reviewed here.

N.A. Farooqui, Ritika, and A. Tyagi proposed a research article, "Expectation Model for Diabetes Mellitus Utilizing AI Procedures", published in the Global Diary of Science and Medical Services in 2020. The authors used Decision Tree and k-Nearest Neighbours to build their model. They apply different ML methods, viz., decision trees, k-nearest neighbours, random forest, and support vector machines, and predict the performance of the various classification techniques.

Dr. Mohammed Abdul Raheem, Shaik Ehetesham, Mohammad Faiz Ahmed Subhani, and Sayed Abdul Zakir proposed a research article, "Man-made Intelligence Calculation Framework for Expectations of Diabetes Utilizing Moderate Web Add IBM Cloud", published in the Global Diary of Science and Medical Services in 2020. They use machine learning, IBM Cloud, artificial intelligence algorithms, and artificial neural networks. Recognizing diabetes in its early stages is vital. Although the accuracy achieved by these ML models is high, the work still has a few limitations.

Rinkal Keniya, Aman Khakharia, Vruddhi Shah, Vrushabh Gada, Rachi Manjalkar, Tirth Thaker, Mahesh Warang, and Ninad Mehendale proposed a research article, "Infection expectation from different side effects utilizing AI", published on SSRN in 2020. They used the k-nearest neighbours method. The weighted k-nearest neighbour model gave the highest accuracy, 93.5%, for the prediction of diseases from symptoms.

Min Chen, Yixue Hao, Kai Hwang, Lu Wang, and Lin Wang proposed a research article, "Sickness Expectation by AI over Enormous Information from Medical Care Networks", published by IEEE in 2017. They use CNN, K-NN, Decision Tree, and NB, and compare their proposed algorithm with several typical prediction algorithms, reporting its prediction accuracy.

III. PROBLEM STATEMENT IN EXISTING SYSTEM

In the day-to-day routine of an ordinary person, predicting a chronic illness like diabetes at an early stage is crucial. Such diseases can be predicted with high accuracy using machine learning (ML) models. The existing application consumes more human time for determining and displaying important information from the model. A user-understandable model that is less time-consuming can be introduced to obtain predictions from user data in a user-friendly, efficient, and understandable way.

IJISRT23NOV1802 www.ijisrt.com 1413

IV. DISADVANTAGES IN THE EXISTING SYSTEM

• Data Privacy and Security: Data privacy and security are major concerns in the healthcare industry when handling sensitive patient data. A significant challenge is maintaining patient data security and privacy while prediction models are being developed and used.
• Limited Accuracy: Many of the models now in use may not be very accurate at predicting diabetes, which could result in false positives and false negatives.
• Existing models consume a greater amount of human time.
• Existing models are not effective or user-friendly.

V. PROPOSED SYSTEM

• The patient's clinical data is taken as input to the model. A large record of data on diabetes patients is used to train the model. The Random Forest Classifier model is trained with this dataset of diabetic patients.
• The model then predicts the outcome from the patient's data. Finally, a WhatsApp message is sent to the given mobile number stating whether the user is diabetic or not.

The proposed model therefore has the following advantages, which address the problems in the existing system:

• Early Detection: ML models can analyze patient data, such as medical history, lab results, and lifestyle factors, to predict the likelihood of developing diabetes in the future. Early identification allows healthcare providers to intervene promptly and start preventive measures.
• Risk Stratification: ML algorithms can separate patients into different risk categories based on their likelihood of developing diabetes. This enables healthcare providers to allocate resources and interventions more effectively, focusing on people at higher risk.
• Personalized Medicine: ML can help tailor treatment plans for diabetic patients. By analyzing data from a patient's health records, the model can suggest individualized treatment options and lifestyle changes.
• Remote Monitoring: ML-powered wearable devices and mobile applications can continuously monitor blood glucose levels and other relevant data, giving real-time insights to patients and their healthcare providers. This can lead to better disease management and timely interventions.
• Interpretability: While Random Forest models are not as interpretable as simpler models like linear regression, they can provide some degree of interpretability through feature importance scores. This can help clinicians and researchers understand which features are driving the model's predictions.

VI. DATASET DESCRIPTION

Collecting data from the dataset obtained from Kaggle is the initial phase of implementation. For our study, a dataset associated with diabetes is required. This dataset includes labels for the following terms: pregnancies, glucose, blood pressure, skin thickness, insulin, BMI, and age. Each record is classified on the basis of these features.
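As a rough illustration of how a Random Forest classifier could be trained on records with these features, and how the feature importance scores mentioned under Interpretability can be read off, the following sketch uses scikit-learn on synthetic data. The column names mirror the feature list in the dataset description; the synthetic values, the labelling rule, and the hyperparameters are assumptions made for illustration and are not taken from the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Feature names follow the dataset description; the values below are synthetic.
FEATURES = ["Pregnancies", "Glucose", "BloodPressure",
            "SkinThickness", "Insulin", "BMI", "Age"]

rng = np.random.default_rng(0)
n = 400
X = np.column_stack([
    rng.integers(0, 10, n),    # Pregnancies
    rng.normal(120, 30, n),    # Glucose
    rng.normal(70, 12, n),     # BloodPressure
    rng.normal(20, 8, n),      # SkinThickness
    rng.normal(80, 40, n),     # Insulin
    rng.normal(32, 6, n),      # BMI
    rng.integers(21, 70, n),   # Age
]).astype(float)

# Hypothetical labelling rule, used only to give the toy data some signal:
# 1 = diabetic, 0 = not diabetic.
y = ((X[:, 1] > 130) | (X[:, 5] > 38)).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# Impurity-based importances: one non-negative score per feature, summing to 1.
importances = dict(zip(FEATURES, model.feature_importances_))
ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
for name, score in ranked:
    print(f"{name:14s} {score:.3f}")
```

Because the toy labels depend only on glucose and BMI, those two features should receive the largest scores here; on real patient data the ranking would of course differ.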

Fig. 1: Dataset Description

VII. METHODOLOGY

A decision tree is an ML algorithm and a visual representation of the decision-making process. Decision trees are built by recursively dividing a data set into subsets based on the values of input features. The main goal of a decision tree is to divide data into groups that are homogeneous in terms of the target variable, making it easier to make predictions or decisions.
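One common way to quantify how homogeneous a group is with respect to the target variable is Gini impurity. The text does not state which split criterion was used, so the sketch below is only an illustration; the helper names (`gini`, `split_impurity`), the toy readings, and the thresholds are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def split_impurity(values, labels, threshold):
    """Weighted Gini impurity after splitting on `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# Toy data: glucose readings with labels (1 = diabetic, 0 = not diabetic).
glucose = [85, 90, 110, 140, 150, 165]
label = [0, 0, 0, 1, 1, 1]

before = gini(label)                         # 0.5: classes evenly mixed
after = split_impurity(glucose, label, 120)  # 0.0: both subsets are pure
```

A tree-building procedure would evaluate many candidate thresholds in this way and keep the one with the lowest weighted impurity, then repeat the process on each resulting subset.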


Fig. 2: Methodology of the proposed model

• Data Collection: Begin the process by choosing a suitable diabetes dataset from an online source (Kaggle in this work) that has significant features for prediction.
• Pre-Processing Data: Inspect the data to understand its structure, missing values, and data types.
• Handle missing data by imputing or removing incomplete rows.
• Encode categorical variables using techniques like one-hot encoding.
• Scale or normalize numerical features to have the same scale.
• Feature Selection: Determine which features are most relevant for predicting diabetes and remove the other irrelevant features.
• Training and testing data: Split the pre-processed data into training and testing sets. In general, the training set is 80% and the testing set is 20%. Train the random forest model on the training set and evaluate its performance on the testing set.
• Deployment: The model can be deployed after being trained and evaluated successfully.

User inputs are given to the model, which predicts outputs by constructing decision trees. Finally, a WhatsApp report message is sent to the user stating whether the user is diabetic or not.
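The pre-processing and splitting steps listed above might look roughly like the following in plain Python. This is a minimal sketch rather than the authors' code: median imputation, min-max scaling, the fixed random seed, the helper names, and the toy rows are all assumptions chosen for illustration.

```python
import random
from statistics import median

def impute_median(column):
    """Replace missing values (None) with the median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in column]

def min_max_scale(column):
    """Rescale values to the range [0, 1]; constant columns map to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle the rows and split them into training and testing sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Toy records: (glucose, bmi, label); None marks a missing value.
raw = [(148, 33.6, 1), (85, None, 0), (None, 23.3, 1), (89, 28.1, 0),
       (137, 43.1, 1), (116, 25.6, 0), (78, 31.0, 1), (115, None, 0),
       (197, 30.5, 1), (125, 27.4, 0)]

glucose = min_max_scale(impute_median([r[0] for r in raw]))
bmi = min_max_scale(impute_median([r[1] for r in raw]))
rows = [(g, b, r[2]) for g, b, r in zip(glucose, bmi, raw)]

train, test = split_train_test(rows)  # 80% training, 20% testing
```

In practice the imputation and scaling statistics would normally be computed on the training set only and then applied to the testing set; they are computed over all rows here only to keep the sketch short.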

VIII. SYSTEM ARCHITECTURE

Fig. 3: System Architecture

• A decision tree is a straightforward and interpretable ML model used for both classification and regression.
• It is a hierarchical, tree-like structure that makes decisions by recursively dividing the data into subsets based on the values of input features.
• The structure of the tree comprises nodes, branches, and leaves. Nodes represent decisions, branches represent the possible outcomes of decisions, and leaves represent the final predictions.
• Decision trees make splits based on feature values to create nodes. The objective is to make splits that best separate the data into classes for classification. The splitting process continues until stopping criteria are met. For classification, a leaf node typically represents a class label. The final prediction in a random forest is obtained by aggregating the predictions of all individual decision trees; this can be done by majority voting.
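The majority-voting step can be sketched as follows. The three "trees" here are hypothetical hand-written threshold rules standing in for trained decision trees, and the thresholds are invented for illustration. Library implementations may aggregate differently; scikit-learn's random forest, for example, averages per-tree class probabilities rather than counting hard votes.

```python
from collections import Counter

# Hypothetical stand-ins for trained trees: each maps a patient record to a
# class label (1 = diabetic, 0 = not diabetic). Thresholds are illustrative.
def tree_a(p):
    return 1 if p["Glucose"] > 130 else 0

def tree_b(p):
    return 1 if p["BMI"] > 35 and p["Age"] > 40 else 0

def tree_c(p):
    return 1 if p["Glucose"] > 120 or p["Insulin"] > 200 else 0

def forest_predict(trees, patient):
    """Return the label predicted by most trees (hard majority vote)."""
    votes = Counter(tree(patient) for tree in trees)
    return votes.most_common(1)[0][0]

patient = {"Glucose": 125, "BMI": 36.0, "Age": 45, "Insulin": 90}
prediction = forest_predict([tree_a, tree_b, tree_c], patient)  # two of three vote 1
```

Using an odd number of trees avoids ties in a two-class problem such as this one.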

IX. EXPERIMENTAL RESULTS

The model is trained on the dataset using a random forest classifier, and the dataset should be free from noise, which means it should not contain any irrelevant data or null values. User inputs, for example, pregnancies, glucose, blood pressure, skin thickness, insulin, BMI, and age, are given to the model. After the user inputs are provided, they are compared with the other data. The following figure shows the training data and the given user inputs.

Fig. 4: Interface of the proposed model

A visualized patient report is generated that shows comparisons between the given user input data and the other data. This visualization report states whether a user is healthy or unhealthy. Here, '0' indicates a user is healthy, and '1' indicates a user is unhealthy. The accompanying figure shows a pregnancy count chart in which the x-axis represents age and the y-axis represents pregnancies.

Fig. 5: Visualization of Pregnancies


Fig. 6: Visualization of Insulin

Fig. 7: Visualization of Skin Thickness

Fig. 8: Visualization of Blood Pressure


Fig. 9: Visualization of Glucose

Fig. 10: Visualization of BMI

Lastly, based on the reports obtained, the system shows whether a user is diabetic or not. To get the report, the user's name and mobile number are provided. A WhatsApp message is sent to the given mobile number, with text such as "hi client, you are not diabetic" if healthy and "hi client, you are diabetic" if unhealthy. The accompanying figure shows a WhatsApp message report.
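The mapping from the model's output label to the report text could look like the sketch below. It only builds the message string; actually sending it through WhatsApp requires a messaging service that the text does not name, so that step is left out. The function name and the exact wording are assumptions based on the example messages quoted above.

```python
def build_report_message(name: str, prediction: int) -> str:
    """Turn a model output (0 = healthy, 1 = unhealthy) into report text."""
    if prediction == 0:
        return f"Hi {name}, you are not diabetic."
    if prediction == 1:
        return f"Hi {name}, you are diabetic."
    # Any other label means something went wrong upstream.
    raise ValueError(f"unexpected prediction label: {prediction!r}")

message = build_report_message("client", 0)
```

Rejecting unexpected labels explicitly keeps a malformed model output from being reported to a user as a diagnosis.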

Fig. 11: Final output and Messaging feature

X. CONCLUSION

In conclusion, the implementation of this modern diabetes prediction model can have a greater impact on people's lives. In simple terms, user inputs are compared with other data to visualize a report with greater accuracy, and a WhatsApp message is sent stating whether the user is diabetic or not. In this fast-paced world, predicting diabetes at an early stage is crucial. The existing models are time-consuming and are neither user-friendly nor user-understandable. Our proposed model overcomes these limitations, i.e., it is less time-consuming, easy to understand, and gives highly accurate results.
