Final Project
• INTRODUCTION
• LITERATURE REVIEW
• EXISTING SYSTEM
• PROPOSED SYSTEM
• FLOW DIAGRAM
• MODULE SPLIT UP
• MODULE EXPLANATION
• PERFORMANCE METRICS
• RESULT
• FUTURE ENHANCEMENT
• CONCLUSION
• REFERENCES
Objective
• To develop a high-accuracy machine learning model for predicting heart attacks in diabetic
adults.
• The ultimate goal is to enhance early detection, reduce mortality rates, and contribute to the
medical community by improving predictive analytics in healthcare.
Introduction
• Heart disease is a major cause of death globally, especially in diabetics, who face increased
cardiovascular risks.
• Traditional diagnostic methods may not predict risks early enough.
• This study explores machine learning techniques to analyze medical data and accurately
identify diabetic patients at risk for heart attacks by comparing various ML models based on
health parameters.
Literature Review
| Published Organization | Published Year | Name of the Research Paper | Author(s) | Models used | Pros | Cons | Accuracy |
|---|---|---|---|---|---|---|---|
| IEEE | 2025 | Analysis and Prediction of Heart Attack using Machine Learning Models | Paras Negi, Manoj Kumar Bisht | Various ML models | Emphasizes the importance of early detection to reduce mortality rates. | Specific models and their performances are not detailed. | Not specified |
| IEEE | 2024 | Heart Attack Risk Prediction Using Advanced Machine Learning Techniques | Yasaswini Bonthu, Subbarao Mannam, Gayithri Kandikunta, Vikranth Goud Keshagani, Greeshma Sarath | Advanced ML techniques | Focuses on proactive healthcare solutions by employing advanced ML techniques for accurate and personalized predictions. | Specific models and their performances are not detailed. | Not specified |
| IEEE | 2023 | Heart Attack Prediction Using Machine Learning Classification Models | Jesslyn Audrey, Mochammad Haldi Widianto | Support Vector Machine (SVM) | SVM model obtained the highest accuracy, F1-score, recall, and precision values. | Specific details about the dataset and feature selection are not provided. | 85.53% |
| IEEE | 2023 | Heart Attack Prediction using Machine Learning | Janaraniani N, Divya P, Madhukiruba E, R. Santhosh, R. Reshma, D. Selvapandian | Decision Tree | Achieved the highest prediction accuracy with a fast rate of exactness. | Specific dataset details and feature selection process are not detailed. | 99.5% |
Existing System
Most existing systems focus on diabetes prediction rather than heart attack risk assessment for diabetic
patients. Studies predominantly use KNN, Logistic Regression, and Random Forest for binary
classification of diabetes presence. Limitations of current approaches include:
• Lack of Heart Attack Prediction: Existing models are unable to assess the cardiovascular
complications associated with diabetes.
• Limited Feature Scope: Most models focus on glucose levels and BMI, neglecting key
cardiovascular risk indicators such as cholesterol and hypertension.
Proposed System
The proposed system introduces:
• High-accuracy prediction of heart attack risk specifically for diabetic adults.
• A clear report of each algorithm's performance on the dataset using the metrics
accuracy, precision, recall, F1-score, and ROC-AUC.
Flow Diagram
Module Split-Up
• Phase 1: Data Preprocessing & Feature Engineering
• Phase 2: Model Training & Evaluation
• Phase 3: Prediction & Risk Classification
• Phase 4: Data Visualization & Interpretation
Phase 1 – Data Preprocessing & Feature Engineering
Content:
• Objective: Clean and prepare the dataset for accurate modeling.
• Steps:
• Missing Value Handling: Imputed using statistical methods (mean/median).
• Feature Scaling: Standardization using StandardScaler for uniformity.
• Categorical Encoding: Transformed categorical variables via label encoding.
• Data Splitting: 80% training, 20% testing.
• Exploratory Data Analysis: Identified correlations, patterns, and outliers.
• Feature Engineering: Created new features and selected the most relevant ones using SHAP
and correlation analysis.
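The preprocessing steps above can be sketched with pandas and scikit-learn. This is a minimal illustration, not the project's actual pipeline: the toy data and the column names (`glucose`, `bmi`, `smoker`, `heart_attack`) are assumptions made for the example. The sketch fits the imputer and scaler on the training split only, which is one common way to keep test-set statistics out of training; the slides do not say which order the project used.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "glucose": [110.0, 150.0, np.nan, 180.0, 95.0, 130.0, 160.0, 140.0, 120.0, 170.0],
    "bmi": [24.1, 31.5, 28.0, np.nan, 22.3, 27.8, 33.0, 29.4, 25.6, 30.2],
    "smoker": ["no", "yes", "no", "yes", "no", "no", "yes", "yes", "no", "yes"],
    "heart_attack": [0, 1, 0, 1, 0, 0, 1, 1, 0, 1],
})

# Categorical encoding: map string categories to integer labels.
df["smoker"] = LabelEncoder().fit_transform(df["smoker"])

X = df.drop(columns="heart_attack")
y = df["heart_attack"]

# Data splitting: 80% training, 20% testing, keeping the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Missing value handling (median) and feature scaling (StandardScaler),
# both fitted on the training split and then applied to the test split.
imputer = SimpleImputer(strategy="median")
scaler = StandardScaler()
X_train_prepared = scaler.fit_transform(imputer.fit_transform(X_train))
X_test_prepared = scaler.transform(imputer.transform(X_test))

print(X_train_prepared.shape, X_test_prepared.shape)
```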
Phase 2 – Model Training & Evaluation
Content:
• Objective: Build and compare predictive models.
• Models Used:
• Logistic Regression
• K-Nearest Neighbors (KNN)
• Random Forest
• XGBoost, LightGBM, CatBoost
• Techniques:
• Stratified K-Fold Cross Validation
• Grid & Random Search for hyperparameter tuning
• Evaluation Metrics:
• Accuracy, Precision, Recall, F1-Score
• ROC-AUC Score
• Confusion Matrix & Classification Report
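A rough sketch of how the training and evaluation loop above might look with scikit-learn. The data is synthetic (`make_classification`), the parameter grids are placeholders, and tuning on recall is an assumption made here; none of these come from the project. XGBoost, LightGBM, and CatBoost are left out only to keep the example free of extra dependencies; they could be added to the same dictionary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic, imbalanced stand-in data (NOT the project's dataset).
X, y = make_classification(
    n_samples=400, n_features=8, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Stratified K-fold keeps the class ratio the same in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Candidate models with small, illustrative hyperparameter grids.
candidates = {
    "logistic_regression": (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}),
    "knn": (KNeighborsClassifier(), {"n_neighbors": [3, 5, 7]}),
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [3, None]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Grid search over the grid, scored by recall on each validation fold.
    search = GridSearchCV(estimator, grid, scoring="recall", cv=cv)
    search.fit(X_train, y_train)
    best = search.best_estimator_
    proba = best.predict_proba(X_test)[:, 1]
    results[name] = {
        "best_params": search.best_params_,
        "cv_recall": search.best_score_,
        "test_roc_auc": roc_auc_score(y_test, proba),
        "confusion_matrix": confusion_matrix(y_test, best.predict(X_test)),
    }

for name, r in results.items():
    print(name, r["best_params"], round(r["cv_recall"], 3), round(r["test_roc_auc"], 3))
```

`RandomizedSearchCV` takes the same arguments plus `n_iter` and can replace `GridSearchCV` when the grid is too large to search exhaustively.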
Phase 3 – Prediction & Risk Classification
Content:
• Objective: Classify users into heart attack risk categories based on diabetic
indicators.
• Functionality:
• Accepts new patient inputs (glucose, cholesterol, blood pressure, etc.)
• Predicts heart attack likelihood using the trained ML model.
• Outputs classification: Low Risk, Moderate Risk, or High Risk.
• Model Used: Best-performing model (e.g., Logistic Regression with the highest
recall).
• Impact: Enables early detection and proactive medical intervention.
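One way to turn a predicted probability into the three risk categories is to apply fixed cut-offs. The sketch below assumes thresholds of 0.33 and 0.66 and a logistic regression trained on synthetic data. The slides do not state the actual cut-offs or input format, so `LOW_MAX`, `MODERATE_MAX`, and `risk_category` are hypothetical names introduced for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Hypothetical probability cut-offs; the project's real thresholds are not given.
LOW_MAX = 0.33
MODERATE_MAX = 0.66


def risk_category(probability: float) -> str:
    """Map a predicted heart attack probability to a risk label."""
    if probability < LOW_MAX:
        return "Low Risk"
    if probability < MODERATE_MAX:
        return "Moderate Risk"
    return "High Risk"


# Synthetic stand-in for the trained model and for new patient inputs.
X, y = make_classification(n_samples=300, n_features=5, random_state=1)
model = LogisticRegression(max_iter=1000).fit(X, y)

new_patients = X[:3]  # placeholder rows standing in for new patient records
probabilities = model.predict_proba(new_patients)[:, 1]
categories = [risk_category(p) for p in probabilities]

for p, c in zip(probabilities, categories):
    print(f"{p:.2f} -> {c}")
```

In practice the cut-offs would be chosen with clinicians, since moving them trades missed cases against false alarms.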
Phase 4 – Result Interpretation & SHAP Analysis
Content:
• Objective: Make model predictions explainable and trustworthy.
• SHAP Analysis:
• Visualizes feature impact on individual predictions.
• Highlights key contributing factors like glucose, age, smoking, etc.
• Helps doctors understand why a certain risk level is predicted.
• Benefits:
• Increases model transparency.
• Supports clinical decision-making.
• Builds trust with healthcare providers and users.
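To illustrate what a per-prediction SHAP explanation contains, the sketch below computes SHAP values for a logistic regression by hand. For a linear model, and assuming the features are independent, the SHAP value of feature i in log-odds space is w_i * (x_i - mean(x_i)). The slides do not show how the project computed SHAP values (the `shap` library would be the usual choice); the data here is synthetic and the feature names are illustrative labels only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Illustrative labels attached to synthetic columns (NOT real patient data).
feature_names = ["glucose", "age", "smoking", "cholesterol"]
X, y = make_classification(
    n_samples=300, n_features=4, n_informative=3, n_redundant=1, random_state=2
)
model = LogisticRegression(max_iter=1000).fit(X, y)

weights = model.coef_[0]
background_mean = X.mean(axis=0)


def linear_shap_values(x: np.ndarray) -> np.ndarray:
    """SHAP values of a linear model in log-odds space, assuming independent features."""
    return weights * (x - background_mean)


patient = X[0]
phi = linear_shap_values(patient)

# Additivity: base value plus all contributions equals the patient's log-odds.
base_value = model.decision_function(background_mean.reshape(1, -1))[0]
patient_log_odds = model.decision_function(patient.reshape(1, -1))[0]
assert np.isclose(base_value + phi.sum(), patient_log_odds)

# Rank features by the size of their contribution to this one prediction.
ranking = sorted(zip(feature_names, phi), key=lambda item: abs(item[1]), reverse=True)
for name, value in ranking:
    print(f"{name}: {value:+.3f}")
```

A positive value pushes this patient's prediction toward higher risk and a negative value pushes it lower, which is the kind of per-patient reading a clinician could use.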
Performance Metrics
| Models | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) | ROC-AUC (%) |
|---|---|---|---|---|---|
| Logistic Regression | 99.46 | 35.46 | 100.00 | 52.36 | 99.73 |
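As a consistency check on the table, F1 is the harmonic mean of precision and recall, so it can be recomputed from the reported values. The second part of the sketch uses made-up confusion-matrix counts, which are not project results. They only show how very high accuracy and low precision can occur together when positive cases are rare.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported for Logistic Regression in the table (in percent).
f1 = f1_from(35.46, 100.00)
print(round(f1, 2))  # 52.35, matching the reported 52.36 up to rounding

# HYPOTHETICAL counts chosen only for illustration (not project results).
tp, fp, fn, tn = 10, 18, 0, 3300
accuracy_pct = 100 * (tp + tn) / (tp + fp + fn + tn)
precision_pct = 100 * tp / (tp + fp)
recall_pct = 100 * tp / (tp + fn)
print(round(accuracy_pct, 2), round(precision_pct, 2), round(recall_pct, 2))
```

With so few positives, the false alarms barely affect accuracy but dominate precision, so recall, F1, and ROC-AUC are more informative here than accuracy alone.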
Queries?