
DIAPRO - Diabetes Prediction Application

This document describes a project to develop a machine learning model called DIAPRO to predict diabetes. It aims to analyze diabetes prediction using 10 different machine learning techniques and propose an effective early detection technique. The project will create a model, web app using Flask, and deploy it on Heroku. It discusses the dataset used, pre-processing steps, feature selection using ANOVA, and results showing Gradient Boosting and KNN achieve the best performance with ROC-AUC scores above 80%. The conclusion is that machine learning can help revolutionize diabetes risk prediction.

Uploaded by

Dhyeaya

Exposys Data Labs Internship

Project

"DIAPRO – Diabetes Prediction Application"

Made By: Dhyey Joshi
Supervisor: Mr. Vishnu Vardhan Sir
Semester: 5th
Branch: CSE
Introduction
1 Analysis Of Our Project Title
Abstract

⊹ Diabetes is a chronic disease with the potential to cause a worldwide health care crisis. According to the International
Diabetes Federation, 382 million people are living with diabetes worldwide, and this figure is projected to rise to
592 million by 2035. Diabetes mellitus, or simply diabetes, is a disease caused by an increased level of blood glucose.
Various traditional methods, based on physical and chemical tests, are available for diagnosing diabetes. However, early
prediction of diabetes is a challenging task for medical practitioners because of the complex interdependence of many
factors, as diabetes affects organs such as the kidneys, eyes, heart, nerves, and feet. Data science methods have the
potential to benefit other scientific fields by shedding new light on common questions. One such task is to help make
predictions on medical data. Machine learning is an emerging field within data science that deals with the ways in
which machines learn from experience. The aim of this project is to develop a system that can perform early
prediction of diabetes for a patient with higher accuracy by combining the results of different machine learning
techniques. The project predicts diabetes with 10 different supervised and ensemble machine learning methods:
SVM, K-Nearest Neighbors, Naive Bayes, Logistic Regression, Random Forest Classifier, AdaBoost,
XGBoost, Gradient Boosting, LightGBM, and Extra Trees Classifier. It also aims to propose an effective technique for
earlier detection of diabetes.

Proposed System
⊹ The whole project is completed in 3 steps:
⊹ a. Create a model using machine learning.
⊹ b. Create a web app using Flask and connect it to the model.
⊹ c. Upload the project to GitHub, connect Heroku to the GitHub account, name the application, and click
Deploy Branch. The application is then live.
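Step (b) could look roughly like the sketch below. It is only an illustration, not the project's actual code: the `/predict` route, the feature names in `FEATURES`, and the `StubModel` class standing in for the trained model are all assumptions.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical feature names; the real dataset columns may differ.
FEATURES = ["glucose", "bmi", "age"]


class StubModel:
    """Stand-in for the trained classifier (assumption for illustration)."""

    def predict(self, rows):
        # Toy rule, not a medical criterion: label 1 when the first feature exceeds 140.
        return [1 if row[0] > 140 else 0 for row in rows]


model = StubModel()  # in practice, the saved model would be loaded here


@app.route("/predict", methods=["POST"])
def predict():
    # Read the JSON body; fall back to an empty dict if it is missing or invalid.
    payload = request.get_json(silent=True) or {}
    missing = [name for name in FEATURES if name not in payload]
    if missing:
        return jsonify({"error": f"missing fields: {missing}"}), 400
    row = [float(payload[name]) for name in FEATURES]
    label = model.predict([row])[0]
    return jsonify({"diabetic": bool(label)})
```

For local testing the app can be started with `app.run()`; a POST request to `/predict` with the three fields as JSON returns the predicted label.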

⊹ Classification is one of the most important decision-making techniques in
many real-world problems.
⊹ In this work, the main objective is to classify the data as diabetic or
non-diabetic and to improve the classification accuracy. For many classification
problems, a larger number of samples does not necessarily lead to higher
classification accuracy.
⊹ In many cases, algorithms perform well in terms of speed
but classify the data with low accuracy. The main objective of our model
is to achieve high accuracy.
⊹ Classification accuracy can be improved by using most of the dataset for
training and holding out a smaller portion for testing. This work has analyzed various
classification techniques for classifying diabetic and non-diabetic data.
It is observed that techniques like Gradient Boosting and K-Nearest
Neighbors are most suitable for implementing the diabetes prediction system.
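The train/test split and the accuracy measure mentioned above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the 80/20 split, the seed, and the toy data are assumptions, not values taken from the project.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy data: (feature, label) pairs, purely illustrative.
data = [(i, i % 2) for i in range(10)]
train, test = train_test_split(data, test_fraction=0.2)
```

With 10 rows and `test_fraction=0.2`, 8 rows go to training and 2 to testing; accuracy is then measured only on the held-out rows.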
Current System and its limitations

Existing problems
⊹ Still no effective solution
⊹ Time-consuming clinical analysis
⊹ High cost
⊹ Experienced manpower

Proposed system
⊹ To develop an intelligent system to classify pd patients
⊹ To contribute to the medical sector
⊹ Reduce the cost of overall clinical analysis
⊹ Diagnose patients in early stages
⊹ Reduce mortality rate
Hardware and Software
Requirements
a) Python programming language.
b) Jupyter Notebook.
c) Google Colab.
d) Windows 7 / 10 operating system.
e) RAM: minimum 4 GB.

System Flow Chart

Overall Workflow

Fig : Graphical representation of the overall proposed workflow. Stages shown: Feature Extraction, Data Pre-Processing (Data Standardization), Feature Selection, Classification (Naïve Bayes, Logistic Regression, K-nearest neighbors, Random Forest, SVM (Linear), SVM (RBF), SVM (Poly)), Ensemble Voting, Result.
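The ensemble voting stage of the workflow can be illustrated with a hard (majority) vote over the labels predicted by several classifiers. This is a sketch under assumptions: the slides do not say whether hard or soft voting was used, and the three prediction lists below are made up for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample.

    `predictions` holds one list of predicted labels per model.
    On a tie, the label that appears first among the votes wins.
    """
    combined = []
    for sample_votes in zip(*predictions):
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three classifiers on four patients.
knn = [1, 0, 1, 0]
gb = [1, 1, 0, 0]
nb = [0, 1, 1, 0]
ensemble = majority_vote([knn, gb, nb])
```

Each patient receives the label that most of the models agree on, so a single model's mistake can be outvoted by the others.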
Dataset description

Fig : Dataset Description [5]
Dataset pre-processing

Fig : Correlation of features (high presence of correlation) [6]

Fig : Non-Diabetic (0) – Diabetic (1) ratio in dataset [7]
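The two quantities shown in these figures, the correlation between features and the class ratio, can be computed as in the sketch below. It uses plain Python and made-up values; the function names `pearson` and `class_ratio` are introduced here for illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def class_ratio(labels):
    """Share of each class label in the dataset."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return {label: count / len(labels) for label, count in counts.items()}


# Illustrative values only, not taken from the project's dataset.
r = pearson([1, 2, 3], [2, 4, 6])
ratio = class_ratio([0, 0, 0, 1])
```

A coefficient close to 1 or -1 marks a strongly correlated feature pair, and a lopsided class ratio signals an imbalanced dataset.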
Feature Selection (ANOVA)

Fig : Feature Importance by ANOVA [8]

Fig : Correlation After Feature
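ANOVA scores a feature by comparing how much its mean differs between classes with how much it varies inside each class. A minimal sketch of the one-way ANOVA F statistic for a single feature is given below; the values are made up, and the project may well have used a library routine (for example scikit-learn's `f_classif`) instead.

```python
def anova_f_score(values, labels):
    """One-way ANOVA F statistic of one feature across class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    n = len(values)
    k = len(groups)
    grand_mean = sum(values) / n

    # Variation of the class means around the overall mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Variation of the values around their own class mean.
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )
    # Assumes ss_within > 0, i.e. the feature varies inside at least one class.
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Illustrative feature values for three non-diabetic (0) and three diabetic (1) rows.
f_score = anova_f_score([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])  # 13.5
```

Features with a higher F score separate the two classes better and are the ones kept by the selection step.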


Best Results of Two Models: KNN & GB

Fig : ROC & AUC Score

Fig : Classification Results
Gradient Boosting

Fig : ROC & AUC Score

Fig : Classification Results
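The ROC-AUC score reported for both models can be read as the probability that a randomly chosen diabetic sample gets a higher predicted score than a randomly chosen non-diabetic one. The sketch below computes it directly from that definition; the labels and scores are illustrative, not the project's results.

```python
def roc_auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: two non-diabetic (0) and two diabetic (1) samples.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect separation, so scores above 0.8 indicate that the model ranks most diabetic cases above non-diabetic ones.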
Conclusion
⊹ Machine learning has great potential to revolutionize diabetes risk
prediction with the help of advanced computational methods and the availability
of large epidemiological and genetic diabetes risk datasets.
Detecting diabetes in its early stages is the key to treatment. This work
has described a machine learning approach to predicting diabetes. The
technique may also help researchers develop an accurate and effective tool
that reaches clinicians and helps them make better decisions
about disease status.
Previous works
⊹ [1] Debadri Dutta, Debpriyo Paul, Parthajeet Ghosh, "Analyzing Feature Importances for Diabetes Prediction
using Machine Learning". IEEE, pp 942-928, 2018.
⊹ [2] K. VijiyaKumar, B. Lavanya, I. Nirmala, S. Sofia Caroline, "Random Forest Algorithm for the Prediction of
Diabetes". Proceeding of International Conference on Systems Computation Automation and Networking,
2019.
⊹ [3] Md. Faisal Faruque, Asaduzzaman, Iqbal H. Sarker, "Performance Analysis of Machine Learning
Techniques to Predict Diabetes Mellitus". International Conference on Electrical, Computer and
Communication Engineering (ECCE), 7-9 February, 2019.
⊹ [4] Tejas N. Joshi, Prof. Pramila M. Chawan, "Diabetes Prediction Using Machine Learning Techniques". Int.
Journal of Engineering Research and Application, Vol. 8, Issue 1, (Part -II) January 2018, pp. 09-13.
⊹ [5] Nonso Nnamoko, Abir Hussain, David England, "Predicting Diabetes Onset: an Ensemble Supervised
Learning Approach". IEEE Congress on Evolutionary Computation (CEC), 2018.
Thanks!
Any questions?
You can find us at:
Email: [email protected]
LinkedIn: https://www.linkedin.com/in/dhyey-joshi12/
