ML Minor May

The document summarizes a machine learning project to predict diabetes using a Pima Indian diabetes dataset. It introduces the dataset and objective. It then lists the packages used in the code and provides a step-by-step description of the data preprocessing, modeling, and evaluation process. This includes handling missing values, splitting data, performing logistic regression, and selecting the best model. The results show the final model can predict diabetes with 74% accuracy.

Uploaded by

govind kumar


ML-MINOR-MAY

Submitted by: Chodapaneedi Govind kumar

Submitted to : [email protected]

Contents:
1.Problem
2.Imported packages
3.Procedure of solving
4.Code(Screenshot)
5.Conclusion

Project question:
This dataset is originally from the National Institute of Diabetes and Digestive and Kidney
Diseases. The objective of the dataset is to diagnostically predict whether or not a patient
has diabetes, based on certain diagnostic measurements included in the dataset. Several
constraints were placed on the selection of these instances from a larger database. In
particular, all patients here are females at least 21 years old of Pima Indian heritage.

Imported packages in the code:

pandas
numpy
matplotlib.pyplot
train_test_split from sklearn.model_selection
LogisticRegression from sklearn.linear_model
KNeighborsClassifier from sklearn.neighbors

1.Procedure of solving:
2.Loaded the data from my drive.
3.Imported pandas, numpy and matplotlib.pyplot.
4.Read the data.
5.Described the data
Here we can see that there are 9 columns in the data. All columns are numeric, which is
convenient for modelling. Had there been character string columns, we could have encoded
them as dummy numeric variables for modelling. The columns here are:
1. Number of times pregnant.
2. Plasma glucose concentration at 2 hours in an oral glucose tolerance test.
3. Diastolic blood pressure (mm Hg).
4. Triceps skinfold thickness (mm).
5. 2-Hour serum insulin (mu U/ml).
6. Body mass index (weight in kg/(height in m)^2).
7. Diabetes pedigree function.
8. Age (years).
9. Outcome: Class variable (0 or 1).
6.Taking the info of the data

All values are in integer or float format, which is suitable for modelling. Hence no type
conversions are required on the dataset. The row count of the data is 768. Hence the shape
of our data is 768 rows by 9 columns.

7.Checking missing Data

Here we can see that there are no missing or null values. But in the head of the data we
had spotted a 0 value. Now we should check the data range and basic summary statistics of
our data.
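Since the code screenshots are not reproduced here, the following is only a minimal sketch of steps 4 to 7. The column names are assumed (they follow the commonly distributed CSV of this dataset), and the five rows are illustrative stand-ins for the file that would normally be read with pd.read_csv.

```python
import pandas as pd

# Assumed column names; the report's own screenshots are not available.
columns = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
           "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Outcome"]

# Illustrative rows only, standing in for the real file read with pd.read_csv.
rows = [
    [6, 148, 72, 35, 0, 33.6, 0.627, 50, 1],
    [1, 85, 66, 29, 0, 26.6, 0.351, 31, 0],
    [8, 183, 64, 0, 0, 23.3, 0.672, 32, 1],
    [1, 89, 66, 23, 94, 28.1, 0.167, 21, 0],
    [0, 137, 40, 35, 168, 43.1, 2.288, 33, 1],
]
df = pd.DataFrame(rows, columns=columns)

print(df.head())          # first rows, where a 0 value can be spotted
print(df.shape)           # (rows, columns)
df.info()                 # dtypes and non-null counts per column
print(df.isnull().sum())  # number of null values per column
```

On the full dataset the same calls would report the 768 rows and 9 columns described above.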

8.Describing the data

Missing Values
Here we can see that certain columns have a minimum value of 0, which is not physiologically
plausible, so these zeros most likely stand for missing measurements.
The columns are:
1. Glucose
2. Blood Pressure
3. Skin Thickness
4. Insulin
5. BMI

Next we need to check the amount of missing information in these columns. We can check
this by counting the rows that have a 0 value in each of them.
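A minimal sketch of this zero check follows. The column names and the small table are assumptions made for illustration, not the project data.

```python
import pandas as pd

# Illustrative values only; column names are assumed.
df = pd.DataFrame({
    "Glucose":       [148, 85, 0, 89, 137],
    "BloodPressure": [72, 66, 64, 0, 40],
    "SkinThickness": [35, 0, 0, 23, 35],
    "Insulin":       [0, 0, 0, 94, 168],
    "BMI":           [33.6, 26.6, 23.3, 0.0, 43.1],
    "Outcome":       [1, 0, 1, 0, 1],
})

zero_cols = ["Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI"]

# Minimum per column: a minimum of 0 flags a column with hidden missing values.
print(df[zero_cols].min())

# Number of rows with a 0 in each of these columns.
zero_counts = (df[zero_cols] == 0).sum()
print(zero_counts)
```

Columns with many zeros are candidates for imputation, while columns with only a few zeros could also be handled by dropping rows.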
9.Defining x

10.Defining df

11.Creating a dataframe df1 in which the missing (0) values of Skin Thickness and of
Insulin are replaced with the median values

12.Creating a dataframe df2 in which the missing (0) values of Skin Thickness are replaced
with the median value but the rows with missing Insulin data are removed

13.Training and testing values

14.Logistic Regression

15.Selecting the appropriate data for Evaluation

Based on the model accuracy scores on the two datasets, df1 and df2, we can see that the
accuracy on df1 is higher. This suggests that, even though we made certain assumptions
about the missing values in our dataset, the model performed better than in the case where
we removed the rows with missing data. This is a positive outcome, as we can run our
modelling on the entire dataset using df1.
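Steps 13 to 15 can be sketched as one helper that splits a dataframe, fits a logistic regression and returns the hold-out accuracy, so that it can be called once per candidate dataset. The helper name holdout_accuracy, the 25% test size, the random seed and the synthetic data are all assumptions for illustration; the report does not state the settings actually used.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def holdout_accuracy(frame, target="Outcome", test_size=0.25, seed=0):
    """Fit logistic regression on a train split and score it on the held-out split."""
    X = frame.drop(columns=[target])
    y = frame[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y)
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test))

# Synthetic stand-in for df1 / df2 (not the project data).
rng = np.random.default_rng(0)
n = 200
glucose = rng.normal(120, 30, n)
bmi = rng.normal(32, 6, n)
outcome = (glucose + 2 * bmi + rng.normal(0, 20, n) > 185).astype(int)
demo = pd.DataFrame({"Glucose": glucose, "BMI": bmi, "Outcome": outcome})

acc = holdout_accuracy(demo)
print(f"hold-out accuracy: {acc:.3f}")
```

With the real data, the comparison would be holdout_accuracy(df1) against holdout_accuracy(df2). Note that the two scores come from test sets of different sizes, so a small gap between them should be read with caution.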
16.The result:
Here, we can see that the final model has an accuracy of 0.7447916666666666 (about 74.5%).
It has an Area Under the Curve (AUC) of 0.78.
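As a sketch of how these two metrics are usually computed with scikit-learn, the example below uses hypothetical labels and predicted probabilities for 8 patients (not the project's test set). It also shows one fraction that is consistent with the reported accuracy: 143 correct predictions out of 192, which would match a 25% hold-out of 768 rows. That split is an inference, not something the report states.

```python
from sklearn.metrics import accuracy_score, roc_auc_score

# Hypothetical true labels and predicted probabilities of class 1.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_prob = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90, 0.60, 0.70]

# Accuracy needs hard predictions; here a 0.5 threshold is assumed.
y_pred = [int(p >= 0.5) for p in y_prob]
accuracy = accuracy_score(y_true, y_pred)   # 6 of 8 correct -> 0.75

# AUC is computed from the probabilities, not from the thresholded labels.
auc = roc_auc_score(y_true, y_prob)         # 14 of 16 pairs ranked correctly -> 0.875

print(accuracy, auc)

# One fraction consistent with the reported accuracy (an inference only).
print(143 / 192)
```

Accuracy depends on the chosen threshold, while AUC summarises how well the model ranks patients across all thresholds, which is why the two numbers differ.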

Code (screenshot):

5.Conclusion:

Diabetes is a serious disease in our society. It is very common in developing nations. In
India, it is said that nearly 7% of the adult population has diabetes, and it is common
in my family as well. A machine learning model, if used in the right manner, could help in
detecting the symptoms that lead up to diabetes. This could have tremendous health and cost
benefits for the users.
From our model we are able to predict diabetes with about 74% accuracy. It is also
important to note that the 2 most important factors in detecting diabetes in our analysis
are:
1. Glucose
2. Body Mass Index
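The report does not show how these two factors were identified. One common way to rank features with logistic regression, given here only as a sketch and not as the method used in this project, is to standardise the inputs and compare the absolute values of the fitted coefficients. The data below is synthetic and built so that Glucose matters most, BMI less, and Age not at all.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic data (not the project data); column names are assumed.
rng = np.random.default_rng(42)
n = 500
X = pd.DataFrame({
    "Glucose": rng.normal(120, 30, n),
    "BMI": rng.normal(32, 6, n),
    "Age": rng.normal(35, 10, n),
})
# Outcome driven mostly by Glucose, then BMI; Age plays no role here.
score = (2.0 * (X["Glucose"] - 120) / 30
         + 1.0 * (X["BMI"] - 32) / 6
         + rng.normal(0, 1, n))
y = (score > 0).astype(int)

# Standardise so that coefficient sizes are comparable across features.
X_scaled = StandardScaler().fit_transform(X)
model = LogisticRegression(max_iter=1000).fit(X_scaled, y)

importance = pd.Series(np.abs(model.coef_[0]), index=X.columns)
importance = importance.sort_values(ascending=False)
print(importance)
```

Without standardisation the coefficients would depend on the units of each column, so their sizes could not be compared directly.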
