4th Year Project
“CARDIAC CARE”
Submitted in Partial fulfillment of the Requirements for the Degree of
CERTIFICATE
Certified that the project work entitled “CARDIAC CARE” was carried out by Mr. SAKSHAM SINGH,
USN 1CR21CS161, Mr. SAMEER SINGH, USN 1CR21CS162, and Ms. SAMPADA R DESAI, USN
1CR21CS163, bonafide students of CMR Institute of Technology, in partial fulfillment for the award
of Bachelor of Engineering in Computer Science and Engineering of the Visvesvaraya
Technological University, Belgaum during the year 2024-2025. It is certified that all
corrections/suggestions indicated for Internal Assessment have been incorporated in the Report
deposited in the departmental library.
The project report has been approved as it satisfies the academic requirements in respect of Project
work prescribed for the said Degree.
External Viva
1. ___________________________ ________________________
2. ___________________________ ________________________
DECLARATION
We, the students of Computer Science and Engineering, CMR Institute of Technology,
Bangalore declare that the work entitled "CARDIAC CARE" has been successfully completed
under the guidance of Prof. Kavitha P, Computer Science and Engineering Department, CMR
Institute of Technology, Bangalore. This dissertation work is submitted in partial fulfillment of
the requirements for the award of Degree of Bachelor of Engineering in Computer Science and
Engineering during the academic year 2024 - 2025. Further, the matter embodied in the project
report has not been submitted previously by anybody for the award of any degree or diploma to
any university.
Place: Bengaluru
Date: 18/12/24
ABSTRACT
This project presents a web-based Clinical Decision Support System (We-CDSS) developed
using Django, focused on improving healthcare accessibility and decision-making. The system
integrates predictive analytics with the LWGMK-NN algorithm to assess Coronary Artery
Disease risk and utilizes prescriptive analytics to generate personalized lifestyle
recommendations. With a user-friendly interface, compatible with both computers and mobile
devices, We-CDSS enables early diagnosis, prevention, and personalized care management. It
bridges the gap between technology and healthcare, empowering clinicians and individuals to
make informed decisions, promoting better health outcomes, and fostering a proactive approach
to heart disease management.
ACKNOWLEDGEMENT
I take this opportunity to express my sincere gratitude and respect to CMR Institute of
Technology, Bengaluru for providing me a platform to pursue my studies and carry out my final
year project.
I have great pleasure in expressing my deep sense of gratitude to Dr. Sanjay Jain,
Principal, CMRIT, Bangalore, for his constant encouragement.
I would like to thank Dr. R Kesavamoorthy, Professor and Head, Department of
Computer Science and Engineering, CMRIT, Bangalore, who has been a constant source of
support and encouragement throughout the course of this project.
I consider it a privilege and honor to express my sincere gratitude to my guide
Dr. Kavitha P, Associate Professor, Department of Computer Science and Engineering, for the
valuable guidance throughout the tenure of this review.
I also extend my thanks to all the faculty of Computer Science and Engineering who
directly or indirectly encouraged me.
Finally, I would like to thank my parents and friends for all the moral support they have
given me during the completion of this work.
TABLE OF CONTENTS
Certificate
Declaration
Abstract
Acknowledgement
Table of contents
List of Figures
List of Tables
List of Abbreviations
1 INTRODUCTION
1.1 Problem Statement
1.2 Objectives
1.3 Methodology
1.4 Relevance
1.5 Gantt Chart
2 LITERATURE SURVEY
3 SYSTEM DESIGN
3.1 System Architecture
3.2 Software Requirements
3.3 Hardware Requirements
4 IMPLEMENTATION
4.1 Algorithm
5 RESULTS AND DISCUSSION
5.1 OUTPUT
6 TESTING
6.1 Unit Testing
6.2 Integration Testing
6.3 Hybrid Testing
7 CONCLUSION AND FUTURE SCOPE
7.1 CONCLUSION
7.2 FUTURE SCOPE
8 REFERENCES
APPENDIX
LIST OF FIGURES
Fig 5.11 Home-page
Fig 5.12 Clinician’s Form
Fig 5.13 Prediction for Clinician’s Form
Fig 5.14 Patient’s Form
Fig 5.15 Prediction for Patient’s Form
Fig 5.16 Comparison of various algorithms with LWGMKNN
LIST OF TABLES
LIST OF ABBREVIATIONS
Cardiac Care
CHAPTER 1
INTRODUCTION
1.1 Problem Statement
The challenge lies in creating a system that integrates both predictive and prescriptive
analytics, providing accurate predictions for CAD risk and actionable, personalized
recommendations. The goal is to make this solution available on web and mobile
platforms for maximum accessibility.
User Accessibility: Ensure the system is available on both computers and mobile
devices for use by clinicians and the general public alike.
1.2 Objectives
Develop a Predictive Model for CAD: Implement the LWGMK-NN algorithm to
accurately predict the likelihood of Coronary Artery Disease based on user input,
addressing the need for early diagnosis in both clinical and public settings.
Integrate Prescriptive Analytics: Create a prescriptive engine that generates
personalized lifestyle recommendations for users based on their predicted CAD risk,
helping reduce the likelihood of disease progression.
Ensure Multi-Platform Accessibility: Build a web-based system that is responsive
and accessible on both computers and mobile devices, making it easy for clinicians
and the public to use the CDSS from anywhere.
Provide a User-Friendly Interface: Design an intuitive user interface for both
medical professionals and non-expert users, ensuring ease of use while delivering
actionable insights on CAD risk and prevention.
1.3 Methodology
Data Collection:
Use publicly available datasets or clinical data that contain patient information,
including health metrics (e.g., age, blood pressure, cholesterol levels, BMI) and
lifestyle factors (e.g., smoking, exercise habits).
Ensure the data includes labeled instances of CAD cases and non-CAD cases for
training and validation.
Predictive Analytics:
Implement a machine learning model, such as the LWGMK-NN algorithm
(Lightweight Generalized Minkowski k-Nearest Neighbors), for CAD risk
prediction.
Train the model to classify individuals as high or low risk based on health and
lifestyle factors.
Feature Selection:
Identify key predictors of CAD, such as age, gender, family history, blood pressure,
cholesterol, and exercise frequency, to improve model performance and
interpretability.
System Design:
Frontend: Create a web-based interface using Django for ease of access.
Backend: Integrate the predictive model with a clinical decision support system
(We-CDSS).
Database: Use a relational database to store user information, risk scores, and
recommendations securely.
Prescriptive Analytics:
Design an algorithm to provide personalized lifestyle recommendations, such as
diet plans, exercise routines, and habits to reduce CAD risk.
Use evidence-based clinical guidelines to create recommendations tailored to
user profiles.
Testing and Validation:
Validate the system using clinical and real-world data to ensure accuracy and
usability.
Conduct user testing with clinicians and general users to refine the interface and
functionality.
Deployment and Maintenance:
Host the system on a cloud platform for scalability and accessibility.
Regularly update the model with new data and clinical guidelines.
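The prescriptive step in the methodology above can be illustrated with a small rule-based sketch. This is a sketch under stated assumptions and not the project's actual engine: the Profile fields, the rule order, and the exercise threshold are hypothetical, and a real rule set would have to be drawn from the clinical guidelines mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    """Hypothetical user profile; field names are illustrative only."""
    high_risk: bool               # output of the predictive model
    smoker: bool
    bmi: float
    exercise_days_per_week: int


def recommend(profile):
    """Return a list of lifestyle recommendations for one profile."""
    tips = []
    if profile.smoker:
        tips.append("Stop smoking")
    if profile.bmi >= 25.0:  # WHO adult overweight cut-off
        tips.append("Work towards a healthy weight")
    if profile.exercise_days_per_week < 3:  # illustrative threshold, not a guideline
        tips.append("Increase weekly physical activity")
    if profile.high_risk:
        # Put the clinical follow-up first for high-risk users.
        tips.insert(0, "Consult a cardiologist")
    if not tips:
        tips.append("Maintain current habits")
    return tips


print(recommend(Profile(high_risk=True, smoker=True, bmi=27.5, exercise_days_per_week=1)))
```

Keeping the rules in one function makes it straightforward to review them against a guideline document and to update them when the guidelines change.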
1.4 Relevance
Early Detection: Facilitates timely identification of CAD risk, enabling individuals to
take preventive measures early.
Accessibility: Provides an easy-to-use platform that bridges the gap between clinical
expertise and public healthcare needs, especially in remote or underserved areas.
Personalized Care: Offers tailored lifestyle interventions, improving the effectiveness
of preventive measures and promoting healthier living.
Support for Clinicians: Enhances decision-making by providing clinicians with a
risk assessment tool backed by predictive analytics.
Cost-Effective Solution: Reduces healthcare costs by preventing severe
complications through early intervention.
Scalability and Public Health Impact: A scalable system that can be adapted for
global use, potentially lowering the overall burden of CAD on healthcare systems.
CHAPTER 2
LITERATURE SURVEY
CHAPTER 3
SYSTEM DESIGN
The system architecture for the Coronary Artery Disease (CAD) Prediction and
Recommendation System is designed using a three-layer architecture comprising
the User Interface Layer, Application Layer, and Data Layer. This modular
structure ensures seamless interaction between users, the backend server, and
machine learning components. The User Interface Layer serves as the access point
for users through a web-based frontend, enabling data input and visualization of
results. The Application Layer, powered by the Django framework, handles the
business logic, processes user inputs, and interacts with machine learning models to
generate predictions. Finally, the Data Layer manages data storage in the database
and hosts machine learning models, ensuring efficient computation and reliable
storage of inputs and outputs. This design bridges the gap between users and
advanced predictive analytics, providing an accessible, scalable, and user-friendly
system for CAD risk assessment and lifestyle recommendations.
1. User Interface Layer
This module serves as the entry point for users to interact with the system.
Users can access the web application through browsers on devices like laptops,
desktops, or mobile phones.
It allows users to input data (e.g., health details) and view predictions or
recommendations.
The frontend is developed using standard web technologies like HTML, CSS,
and JavaScript.
This layer ensures a seamless user experience with interactive components and
visualization tools.
2. Application Layer
Web Server:
The web server acts as a bridge between the user interface and the backend logic.
It manages HTTP requests from the user's browser and returns the appropriate
responses.
Django Framework:
The Django framework handles the core application logic of the system.
It processes user inputs, interacts with the database and machine learning models,
and manages the flow of data.
Django ensures data validation, routing, and integration with the backend
services to generate predictions.
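The validation step that the Application Layer performs before a prediction can be sketched independently of any web framework. This is a minimal sketch, not the project's view code: the field names, the plausibility bounds, and the 0.5 decision threshold are assumptions made for illustration.

```python
import json

# Hypothetical fields and plausibility bounds, chosen only for illustration.
FIELD_BOUNDS = {
    "age": (1, 120),
    "resting_bp": (50, 250),
    "cholesterol": (50, 700),
}


def parse_request(body):
    """Return (features, None) on success or (None, error_dict) on failure."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None, {"error": "invalid JSON"}
    if not isinstance(payload, dict):
        return None, {"error": "expected a JSON object"}

    features, errors = [], {}
    for name, (low, high) in FIELD_BOUNDS.items():
        value = payload.get(name)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors[name] = "missing or not a number"
        elif not low <= value <= high:
            errors[name] = f"expected a value between {low} and {high}"
        else:
            features.append(float(value))
    if errors:
        return None, {"errors": errors}
    return features, None


def build_response(probability_high_risk):
    """Serialise the risk label and probability as a JSON string."""
    # The 0.5 cut-off is an assumption, not a value taken from the report.
    label = "High Risk" if probability_high_risk >= 0.5 else "Low Risk"
    return json.dumps({"risk": label, "probability": round(probability_high_risk, 3)})
```

In a Django view, parse_request would typically receive the request body and the string from build_response would be returned to the browser. Collecting all field errors in one pass lets the form report every problem at once.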
3. Data Layer
Database:
The database stores user inputs, CAD-related risk data, and historical information.
It allows efficient data retrieval and management, ensuring the system can store
large amounts of health and user data securely.
Example: User health metrics (e.g., age, cholesterol levels, blood pressure) can be
stored for future reference.
ML Models Server:
This module hosts the machine learning models used for predictive analytics.
It processes input data, applies the trained CAD prediction models, and generates
results.
It ensures that the predictive analytics are accurate, reliable, and delivered
quickly to the user.
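Storing health metrics together with the resulting risk score, as described for the database above, can be sketched with Python's built-in sqlite3 module. The table name and columns are hypothetical and do not reflect the project's actual schema. An in-memory database is used so the sketch runs without any setup.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) assessments table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS assessments (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               age INTEGER NOT NULL,
               cholesterol REAL NOT NULL,
               resting_bp REAL NOT NULL,
               risk_probability REAL NOT NULL
           )"""
    )
    return conn


def save_assessment(conn, age, cholesterol, resting_bp, risk_probability):
    """Insert one assessment and return its row id."""
    cur = conn.execute(
        "INSERT INTO assessments (age, cholesterol, resting_bp, risk_probability) "
        "VALUES (?, ?, ?, ?)",  # placeholders keep user input out of the SQL text
        (age, cholesterol, resting_bp, risk_probability),
    )
    conn.commit()
    return cur.lastrowid


def history(conn):
    """Return all stored assessments in insertion order."""
    return conn.execute(
        "SELECT age, cholesterol, resting_bp, risk_probability "
        "FROM assessments ORDER BY id"
    ).fetchall()
```

In a Django project the same table would normally be declared as a model and accessed through the ORM. The raw SQL here only makes the stored fields explicit.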
Django:
A high-level Python web framework used for building robust and scalable web
applications.
Python 3.8+:
Python is versatile, easy to learn, and widely used for both web development and
machine learning tasks.
CSS:
Used for styling the web pages (e.g., layout, colors, fonts, and responsive design).
JavaScript:
For example, it enables real-time input validation, data visualization, and user
interactions.
Bootstrap (Optional):
React/Vue (Optional):
PostgreSQL/MySQL (Production):
Both are relational databases that efficiently store and manage structured data.
SQLite (Development):
A lightweight, file-based database used during development for faster testing and
deployment.
scikit-learn:
It supports tasks like classification, regression, and clustering for CAD risk
analysis.
NumPy:
pandas:
A data analysis library used for cleaning, manipulating, and analyzing input data
in a structured format (e.g., tables).
Firebase:
Git:
A distributed version control system used for tracking code changes, collaboration,
and maintaining a development history.
GitHub/GitLab:
Platforms for hosting Git repositories to manage codebase, collaborate among teams,
and ensure smooth deployment.
Gunicorn:
It handles requests efficiently and works with Nginx/Apache for load balancing.
Nginx/Apache:
Web servers used to serve static files, handle client requests, and reverse proxy
requests to Gunicorn.
3. Storage: 10 GB SSD
10 GB SSD:
Ensures fast read/write operations, which are critical for database queries and
system responsiveness.
Regular backups are necessary to prevent data loss due to hardware failures or
accidental deletions.
Cloud Storage: Services like AWS, Google Cloud, or Azure provide secure and
scalable backup solutions.
Physical Storage: External hard drives or on-premise servers for local backup.
CHAPTER 4
IMPLEMENTATION
The project implements a modified K-Nearest Neighbors algorithm, LWGMK-NN,
which uses a weighted geometric mean for predictions. It calculates distances using
the Euclidean and Manhattan metrics and applies inverse distance weighting to the
neighbors. Input data is preprocessed with RobustScaler to handle outliers, and grid
search optimizes the value of k. The prediction functionality is implemented using
predict() views, which process JSON inputs, validate them, and return risk predictions
with probability scores for Coronary Artery Disease assessment.
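The outlier handling mentioned above can be made concrete. scikit-learn's RobustScaler, with its default settings, centres each feature on its median and divides by its interquartile range, so a single extreme value does not distort the scale of the remaining data. The sketch below reproduces that idea for one column using only the standard library. The cholesterol values are invented for illustration.

```python
import statistics


def robust_scale(column):
    """Centre on the median and divide by the interquartile range (IQR)."""
    median = statistics.median(column)
    q1, _, q3 = statistics.quantiles(column, n=4, method="inclusive")
    iqr = (q3 - q1) or 1.0  # avoid dividing by zero for constant columns
    return [(value - median) / iqr for value in column]


# Invented cholesterol readings with one extreme outlier (600).
cholesterol = [180, 200, 220, 240, 600]
print(robust_scale(cholesterol))  # [-1.0, -0.5, 0.0, 0.5, 9.5]
```

The outlier remains visible as 9.5, while the four typical readings keep a spread of -1.0 to 0.5. Mean and standard deviation scaling would instead compress the typical readings, because the outlier inflates the standard deviation.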
4.1 ALGORITHM
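from sklearn.preprocessing import RobustScaler


class LWGMKNN:
    """Modified k-NN classifier; holds the scaler and the training data."""

    def __init__(self, k=5, distance_metric='euclidean'):
        self.k = k  # number of neighbours consulted per prediction
        self.distance_metric = distance_metric  # 'euclidean' or 'manhattan'
        self.scaler = RobustScaler()  # outlier-robust feature scaling
        self.X_train = None
        self.y_train = None

    def fit(self, X, y):
        # Only part of fit() survives in the source; scaling X here is an
        # assumption based on the RobustScaler preprocessing described above.
        self.X_train = self.scaler.fit_transform(X)
        self.y_train = y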
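The prediction step is not shown in the listing above. The sketch below is one possible reading of the description in this chapter (Euclidean or Manhattan distance, inverse distance weighting, weighted geometric mean), not the project's implementation. The report does not specify how the geometric mean is combined with the class decision, so that part is an assumption: for each class, the k nearest training points of that class are found, their distances are combined into a weighted geometric mean, and the class with the smaller mean receives the higher probability. All function names are hypothetical.

```python
import math


def minkowski(a, b, p=2):
    """Distance between two feature vectors: p=2 is Euclidean, p=1 is Manhattan."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def weighted_geometric_mean(values, weights):
    """exp(sum(w * ln v) / sum(w)); every value must be positive."""
    total = sum(weights)
    return math.exp(sum(w * math.log(v) for v, w in zip(values, weights)) / total)


def predict_proba(X_train, y_train, query, k=3, p=2, eps=1e-9):
    """Return {class label: probability} for one query point."""
    scores = {}
    for label in sorted(set(y_train)):
        # Distances to the k nearest training points of this class.
        # eps keeps log() and 1/d defined when a distance is exactly zero.
        dists = sorted(
            minkowski(x, query, p) + eps
            for x, y in zip(X_train, y_train)
            if y == label
        )[:k]
        weights = [1.0 / d for d in dists]  # inverse distance weighting
        # Assumed decision rule: a smaller mean distance gives a higher score.
        scores[label] = 1.0 / weighted_geometric_mean(dists, weights)
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y = [0, 0, 0, 1, 1, 1]
print(predict_proba(X, y, [0.2, 0.1]))
```

Because the scores are normalised, the returned values sum to one and can be reported as the probability scores mentioned above.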
CHAPTER 5
RESULTS AND DISCUSSION
The LWGMKNN model was developed to predict Coronary Artery Disease (CAD)
based on a range of clinical features, utilizing the Least Weighted Geometrical Mean
Kernel K-Nearest Neighbors (LWGMKNN) algorithm for classification. The model
achieved an accuracy of 98%, demonstrating its ability to discriminate between
high-risk and low-risk patients. It also exhibited balanced precision and recall,
indicating that both false positives and false negatives were kept low. Key features
that contributed significantly to the model's predictions included age, chest pain
type, cholesterol levels, and maximum heart rate, highlighting their importance in
clinical diagnostics.
The model was designed to classify patients into high-risk or low-risk categories with
assigned probabilities, which facilitates early diagnosis and personalized treatment
plans. This approach provides a more comprehensive assessment of a patient's
condition, empowering healthcare professionals with decision support for timely
interventions. By leveraging clinical data such as blood pressure, ECG results, and
other cardiac markers, the model demonstrated the ability to offer predictions with a
high degree of confidence, ensuring that it is both reliable and robust in various
clinical settings.
5.1 OUTPUT
This homepage serves as an entry point for the CARDIAC CARE platform,
prompting users to select their role, Clinician or Patient, which enables tailored
access for heart condition prediction. It provides a user-friendly interface with a
clean design for easy navigation.
The Clinician's Form is designed to collect essential patient details like age, sex, chest
pain type, blood pressure, cholesterol, and other key health indicators. This data is
used to predict Coronary Artery Disease (CAD) risk.
The Prediction Result section provides the risk level for Coronary Artery Disease
(CAD), such as "High Risk," along with actionable recommendations. These include
consulting a cardiologist, undergoing further diagnostic tests, and considering
medications to manage the condition effectively.
The Patient's Form is designed to collect health-related data such as age, gender,
height, weight, blood pressure, and cholesterol levels. This data serves as input to
the clinical decision support system for assessing the patient's CAD risk.
Table 5.11 Performance Comparison of Machine Learning Algorithms for Predictive Analytics
CAD prediction is highly sensitive to features such as age, cholesterol, and chest pain
type. Unlike traditional KNN, LWGMKNN assigns weights to neighbors based on
their distances and importance. This localized feature weighting ensures more
accurate and clinically relevant predictions.
Performance: Achieves competitive accuracy (85%) with good precision and recall
while avoiding the complexity of algorithms like Random Forest and SVM.
5. Robustness to Outliers
Forest and SVM. Suitable for deployment in real-time systems with low latency
requirements.
Every CAD prediction model must adapt to specific clinical datasets and population
demographics. Parameters like the number of neighbors (k) and generalized mean
calculation can be fine-tuned to optimize for CAD-specific datasets. Easily extendable
to include additional clinical features.
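Tuning the number of neighbors can be illustrated with a small search over candidate values of k. The sketch below scores each candidate by leave-one-out accuracy and keeps the best one. For brevity it uses a plain majority-vote k-NN as a stand-in classifier rather than LWGMKNN, and the toy data is invented, so it shows only the search procedure and not the project's tuned values.

```python
from collections import Counter


def knn_predict(X, y, query, k):
    """Plain majority-vote k-NN with squared Euclidean distance (stand-in classifier)."""
    order = sorted(
        range(len(X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], query)),
    )
    votes = Counter(y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(X, y, k):
    """Leave-one-out accuracy: predict each point from all the others."""
    hits = 0
    for i in range(len(X)):
        X_rest = X[:i] + X[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        hits += knn_predict(X_rest, y_rest, X[i], k) == y[i]
    return hits / len(X)


def best_k(X, y, candidates):
    """Return the candidate with the highest accuracy (first one wins on ties)."""
    return max(candidates, key=lambda k: loo_accuracy(X, y, k))


# Invented toy data: two well-separated groups of three points each.
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y = [0, 0, 0, 1, 1, 1]
print(best_k(X, y, [1, 3, 5]))
```

On this toy data k=5 scores zero. Once a point is left out, its own class holds only two of the five remaining points and is always outvoted. This is the same effect that makes an overly large k risky on imbalanced data.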
CAD datasets often have class imbalance (e.g., more low-risk patients than
high-risk). Weighted neighbors and localized distance measures ensure the minority
class is not ignored, unlike unweighted methods like traditional KNN.
Studies using LWGM-like techniques for healthcare have shown high efficacy in
predicting diseases with complex relationships among features. For CAD specifically,
this approach aligns well with the multi-factorial nature of the disease.
The bar chart visualizes the performance comparison of different machine learning
algorithms. LWGMKNN has the highest F1-Score, while Logistic Regression has the
lowest training time. Random Forest has the highest accuracy and precision, but it has
the lowest interpretability.
CHAPTER 6
TESTING
Testing played a critical role in ensuring the accuracy, reliability, and robustness of
the LWGMKNN model developed for Coronary Artery Disease (CAD) prediction.
Various testing methodologies were implemented to evaluate the model’s
performance and verify its functionality across different stages of development. The
testing process involved unit testing, integration testing, and hybrid testing, which
were essential for validating the core components and the entire system. Below is an
overview of the testing done during the project.
6.1 Unit Testing
Output Verification: The output generated by the model was also tested using
known test cases to ensure that the predicted classes (high-risk or low-risk) and the
probabilities of each class were computed accurately.
6.2 Integration Testing
Data Flow: The integration of data input, model processing, and output generation
was tested. For instance, data preprocessed through various functions was passed into
the model, and the output was checked for consistency and correctness.
Model Accuracy Evaluation: We also tested how well the LWGMKNN model
interacted with the feature selection and data preprocessing modules to ensure that the
model was trained correctly and its predictions aligned with the expected results. We
used performance metrics like accuracy, precision, and recall to evaluate this
integration.
System Performance: Integration tests were run to verify that the system could
handle large datasets and that the model's predictions remained consistent with
various sets of test data.
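The metrics named above are computed directly from the true and predicted labels. The following sketch shows the calculation for a binary high-risk/low-risk setting. The label vectors are invented for illustration and are not results from the project.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision and recall for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    correct = sum(t == p for t, p in pairs)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Invented labels: 1 = high risk, 0 = low risk.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(classification_metrics(y_true, y_pred))
# {'accuracy': 0.75, 'precision': 0.75, 'recall': 0.75}
```

Reporting precision and recall next to accuracy matters on imbalanced data. A model that always predicts the majority class can have high accuracy while its recall for high-risk patients is zero.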
6.3 Hybrid Testing
End-to-End Testing: Hybrid testing was crucial for verifying the system's ability to
predict CAD risk from raw input data to the final output. This end-to-end process
tested whether the entire pipeline (data collection, preprocessing, model training,
and prediction) worked effectively.
Performance under Load: The hybrid testing approach also involved assessing how
the model handled a variety of data inputs and tested its scalability. Large datasets
were used to simulate real-world scenarios, verifying that the model could handle
varying patient profiles without compromising performance.
Real-World Data Simulation: Hybrid testing included testing the model under
various conditions such as noisy data, missing values, and data imbalances (common
in healthcare datasets). This was important to ensure the model could handle
variations in real-world clinical data effectively.
Model Interpretability: Hybrid tests also assessed the interpretability of the model’s
predictions, ensuring healthcare providers could trust the risk categorization (high-
risk or low-risk) and the probability scores provided by the system.
CHAPTER 7
CONCLUSION AND FUTURE SCOPE
7.1 CONCLUSION
The LWGMKNN-based Coronary Artery Disease (CAD) Prediction System was
successfully developed and tested, demonstrating its potential as a reliable tool for
early diagnosis of CAD. The model achieved a 98.5% accuracy rate, with balanced
performance across precision and recall, making it suitable for classifying patients
into high-risk and low-risk categories. By incorporating clinical features such as age,
cholesterol levels, chest pain type, and maximum heart rate, the model was able to
predict the likelihood of CAD, helping healthcare professionals make informed
decisions.
The system's ability to predict CAD with reasonable accuracy demonstrates its value
in clinical settings, offering an aid for early intervention and personalized treatment
plans. The unit, integration, and hybrid testing confirmed the robustness of the model,
ensuring its reliability and performance across various scenarios. This CAD
prediction system can serve as a Clinical Decision Support System (CDSS), helping
healthcare providers assess patient risks and recommend appropriate lifestyle
interventions.
7.2 FUTURE SCOPE
Data Handling: Addressing class imbalance with techniques like SMOTE could
improve predictions for low-risk patients.
Real-Time Use: Integrating the model into clinical systems for live risk assessments
and wearable devices for continuous monitoring would enhance its clinical utility.
Interpretability: Using tools like SHAP or LIME would improve model transparency
for healthcare professionals.
Dataset Expansion: Testing on larger and diverse datasets would improve the
model's robustness.
REFERENCES
Machine Learning in Cardiovascular Risk Prediction
Yang, L., Wu, H., Jin, X., Zheng, P., Hu, S., Xu, X., Yu, W., and Yan, J., “Study of
cardiovascular disease prediction model based on random forest in eastern
China,” Sci. Rep., vol. 10, no. 1, p. 5245, Dec. 2020.
Predictive Models for Coronary Artery Disease
Anooj, P. K., “Clinical decision support system: Risk level prediction of heart disease
using weighted fuzzy rules,” J. King Saud Univ.-Comput. Inf. Sci., vol. 24, no.
1, pp. 27–40, 2012.
APPENDIX
DATASETS
Clinician
https://ieee-dataport.org/open-access/heart-disease-dataset-comprehensive
Patient
https://www.kaggle.com/datasets/sulianova/cardiovascular-disease-dataset