Diabetes Assignment Report
Class: BSCS-F21
Table of Contents
1. Introduction
2. Problem Statement
3. Key Questions
5. Data Preprocessing
5.3 Normalization
7. Conclusion
1. Introduction
Diabetes is a growing health concern affecting millions globally. Using data science, we can analyze
medical records and predict diabetes risk, allowing for early intervention and better healthcare
planning.
2. Problem Statement
The goal of this project is to analyze patient data to identify patterns that
indicate diabetes risk. By examining attributes such as glucose
levels and BMI, we can build predictive models to aid medical professionals in
early diagnosis.
3. Key Questions
The dataset used for this analysis is the Diabetes Data Set from Kaggle. It
consists of 768 patient records with medical attributes such as glucose levels,
BMI, and insulin measurements. This dataset was chosen for its relevance and
comprehensiveness.
5. Data Preprocessing
The dataset was checked for missing values, and none were found.
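A check of this kind can be sketched with pandas as follows. This is only an illustration under stated assumptions: the column names and the toy records are hypothetical, and one value is deliberately left empty to show how a gap would be reported. On the real dataset, every count would be expected to be zero.

```python
import pandas as pd

# Hypothetical toy records; the column names are assumed, not taken from the report.
df = pd.DataFrame({
    "Glucose": [148, 85, 183],
    "BMI": [33.6, None, 23.3],   # None is stored as NaN, i.e. a missing value
    "Insulin": [0, 94, 168],
})

# Count missing (NaN) entries in each column.
missing_per_column = df.isnull().sum()
total_missing = int(missing_per_column.sum())

print(missing_per_column)
print("Total missing values:", total_missing)
```

Note that `isnull()` only counts entries stored as NaN or None; it does not flag placeholder values such as 0.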
5.3 Normalization
Numerical features were normalized using Min-Max scaling, which rescales each feature to the range
0 to 1.
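Min-Max scaling maps each value x of a feature to (x - min) / (max - min), so the smallest value becomes 0 and the largest becomes 1. The sketch below shows the arithmetic in plain Python. The helper name `min_max_scale` and the sample glucose readings are hypothetical, and the report does not say which tool was actually used.

```python
def min_max_scale(values):
    """Rescale a list of numbers to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature would cause division by zero; map it to 0.0 instead.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical glucose readings, for illustration only.
glucose = [85, 148, 183, 89]
scaled = min_max_scale(glucose)
print(scaled)  # smallest reading maps to 0.0, largest to 1.0
```

In practice the same per-column transformation is available as `MinMaxScaler` in scikit-learn, which also remembers the fitted minimum and maximum so that new data can be scaled consistently.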
After preprocessing, the dataset is clean and ready for further analysis. Key
predictors such as glucose levels and BMI may play a crucial role in predicting
diabetes.
7. Conclusion
Preprocessing consisted of checking for
missing values, removing duplicates, and scaling numerical data. The cleaned
data is now ready for further analysis, such as building predictive models.