Risab

Uploaded by

Rishi Chourasia

Introduction

01 Diabetes is a chronic disease that affects the body’s insulin production, leading to irregular carbohydrate metabolism and high blood glucose levels.
02 Traditional diagnostic methods are time-consuming, delaying timely treatment.
03 Machine learning can provide predictive analytics to assist in early diabetes detection, improving patient outcomes.
04 This project applies machine learning techniques to build an accurate model for diabetes prediction, supporting healthcare professionals in faster decision-making.
Project Objectives
•Data Analysis: Examine data to find correlations between features and diabetes diagnosis.
•High Prediction Accuracy: Aim to achieve the highest accuracy possible in predicting diabetes.
•Model Comparison: Determine the most effective machine learning algorithms for diabetes prediction.
•Healthcare Support: Support doctors in predicting diabetes early using an automated, data-driven approach.
Motivation
•Growing Diabetes Prevalence: The increasing
incidence of diabetes due to modern lifestyle choices.
•Diagnostic Errors in Current Systems:
•False Negatives: Diabetes undiagnosed when
present.
•False Positives: Diabetes diagnosed incorrectly.
•Unclassifiable Cases: Insufficient data for
classification.
•Role of Machine Learning: ML models can reduce
errors, save time, and provide consistent results.
Flow Chart
Dataset Overview
•Source: The dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. Its objective is to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset.
Dataset Overview
•Dataset Characteristics:
•Includes medical measurements like glucose level, BMI, and skin thickness.
•Data from female patients aged 21 and above, of Pima Indian heritage.
•Objective: Predict whether a person has diabetes based on these measurements.
Data Preprocessing
•Data Cleaning: Remove or impute missing values to ensure consistency.
•Feature Scaling: Apply StandardScaler so that all features share a consistent scale; fit it on the training set only, so that no information leaks in from the validation or testing sets.
•Dataset Splitting:
•Training Set: For model fitting.
•Validation Set: For hyperparameter tuning.
•Testing Set: For evaluating model performance.
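The splitting and scaling steps can be sketched as follows. This is a minimal, dependency-free illustration rather than the project's actual code: it reimplements the standardization that StandardScaler performs (subtract the mean, divide by the population standard deviation) so that the leakage point is visible. The function names, the 70/15/15 split, and the sample rows are assumptions made for the example.

```python
import random
import statistics

def train_val_test_split(rows, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle the rows and cut them into training, validation, and test sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_frac)
    n_val = round(len(rows) * val_frac)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test

def fit_scaler(train_rows):
    """Learn per-feature mean and standard deviation from the training set only."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(col) for col in columns]
    # Population standard deviation; fall back to 1.0 for constant features.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds

def transform(rows, means, stds):
    """Standardize rows with statistics that were learned elsewhere."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

# Hypothetical (glucose, BMI) rows, for illustration only.
rows = [[85 + 3 * i, 22.0 + 0.5 * i] for i in range(20)]

train, val, test = train_val_test_split(rows)
means, stds = fit_scaler(train)              # statistics come from train only
train_scaled = transform(train, means, stds)
val_scaled = transform(val, means, stds)     # reuse the training statistics
test_scaled = transform(test, means, stds)   # reuse the training statistics
```

Fitting the scaler on the full dataset before splitting would let the validation and testing rows influence the means and standard deviations, which is the leakage the slide warns about.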
Exploratory Data Analysis (EDA)
•Initial Insights: Visualize distributions through histograms for parameters like Glucose, BMI, and Blood Pressure.
•Correlation Analysis: Identify relationships between features (e.g., strong correlation between BMI and Skin Thickness).
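To make the correlation analysis concrete, here is a small sketch that computes the Pearson correlation coefficient between two features. The slides do not say which correlation measure was used, so Pearson is an assumption, and the sample values are hypothetical rather than taken from the dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values, for illustration only (not taken from the dataset).
bmi = [22.1, 25.4, 28.0, 31.2, 35.6, 40.3]
skin_thickness = [15, 20, 23, 29, 33, 41]

r = pearson(bmi, skin_thickness)
print(f"BMI vs. skin thickness: r = {r:.2f}")
```

A value near +1 or -1 indicates a strong linear relationship, while a value near 0 indicates little linear relationship.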
Machine Learning Models Implemented
•Logistic Regression: Probabilistic model for binary classification.
•Gaussian Naive Bayes: Suitable for small datasets and probabilistic classification.
•K-Nearest Neighbors (KNN): Non-parametric method for classification.
•Support Vector Machine (SVM): Finds the best boundary between classes.
•Random Forest: Ensemble of decision trees, reducing overfitting.
•Decision Tree: Simple, tree-based model for decision rules.
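As an illustration of one of the listed models, the following is a from-scratch sketch of K-Nearest Neighbors. The project most likely relied on a library implementation, so this is only meant to show the idea: a new sample receives the majority label of its k closest training samples. The function name, k=3, and the data points are assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical, already-scaled (glucose, BMI) points; 1 = diabetic, 0 = not.
train_X = [[-1.2, -0.8], [-0.9, -1.1], [-0.7, -0.5],
           [0.8, 0.9], [1.1, 0.6], [1.3, 1.2]]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_X, train_y, [1.0, 1.0]))    # near the label-1 cluster
print(knn_predict(train_X, train_y, [-1.0, -1.0]))  # near the label-0 cluster
```

Because KNN is distance-based, it is sensitive to feature scale, which is one reason the scaling step in preprocessing matters.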
Comparative Analysis of Models
Test Results and Analysis
•Overall Test Accuracy: Ranged from 73% to 81% across models.
•Best Model: Random Forest achieved the highest accuracy and recall.
•Analysis: Glucose levels show a strong correlation with diabetes, confirming the hypothesis of the project.
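The two metrics mentioned above, accuracy and recall, can be computed from the counts of true and false positives and negatives. The sketch below shows the arithmetic; the labels are hypothetical and are not the project's results.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = diabetic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Share of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def recall(tp, fn):
    """Share of actual diabetic cases that the model caught."""
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(accuracy(tp, tn, fp, fn))  # 0.75
print(recall(tp, fn))            # 0.75
```

Recall matters for screening because each false negative is a diabetic patient the model missed, which is one of the diagnostic errors listed under Motivation.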
Conclusion
•Successful Outcome: Machine learning successfully assists in diabetes prediction.
•Effective Algorithms: Random Forest, SVM, and AdaBoost are top performers for binary classification.
•Impact: Provides doctors with a reliable tool for early detection, improving treatment outcomes.
Future Scope
•Additional Data: More features, including lifestyle factors, could improve accuracy.
•Enhanced Models: Try deep learning models for potentially higher accuracy.
•Broader Applicability: Aim to refine the model for use across diverse populations.