
VISVESVARAYA TECHNOLOGICAL UNIVERSITY

“Jnana Sangama”, Belagavi-590018

A
Project Report
On
“HEART ATTACK RISK PREDICTION USING RETINAL
EYE IMAGES”
SUBMITTED IN PARTIAL FULFILLMENT FOR THE AWARD OF DEGREE OF
BACHELOR OF ENGINEERING
IN
COMPUTER SCIENCE AND ENGINEERING
SUBMITTED BY

DEEKSHITH KUMAR S (1JB21CS038)


DHANUSH B G (1JB21CS039)
DHANUSH M R (1JB21CS040)
HARSHAN GOWDA L V (1JB21CS055)

Under the Guidance of

Dr. Arun Kumar D R


Asst. Professor,
Dept. of CSE

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


SJB INSTITUTE OF TECHNOLOGY
No.67, BGS Health & Education City, Dr.Vishnuvardhan Rd, Kengeri, Bengaluru, Karnataka 560060

2024 - 2025
|| Jai Sri Gurudev ||
Sri Adichunchanagiri Shikshana Trust ®
SJB INSTITUTE OF TECHNOLOGY
No.67, BGS Health & Education City, Dr.Vishnuvardhan Rd, Kengeri, Bengaluru, Karnataka 560060

Department of Computer Science and Engineering

CERTIFICATE

Certified that the Project Work entitled “HEART ATTACK RISK PREDICTION USING
RETINAL EYE IMAGES”, carried out by DEEKSHITH KUMAR S, DHANUSH B G, DHANUSH M
R, and HARSHAN GOWDA L V, bearing USN 1JB21CS038, 1JB21CS039, 1JB21CS040, and
1JB21CS055, who are bonafide students of SJB Institute of Technology, is in partial
fulfilment of the 7th semester requirements of BACHELOR OF ENGINEERING in COMPUTER
SCIENCE AND ENGINEERING of Visvesvaraya Technological University, Belagavi, during the
academic year 2024-25. It is certified that all corrections/suggestions indicated for Internal
Assessment have been incorporated in the report deposited in the departmental library. The
project report has been approved as it satisfies the academic requirements in respect of the
Project work prescribed for the said Degree.

Signature of Guide Signature of HOD Signature of Principal


Dr. Arun Kumar D R Dr. Krishna A N Dr. K. V. Mahendra Prashanth
Asst. Professor Professor and Head Principal, SJBIT
Dept. of CSE, SJBIT Dept. of CSE, SJBIT

Name of the Examiners Signatures

1. __________________________ ______________________________

2.___________________________ ______________________________
ACKNOWLEDGEMENT

We would like to express our profound gratitude to His Divine Soul Jagadguru Padmabhushan Sri Sri
Sri Dr. Balagangadharanatha Mahaswamiji and His Holiness Jagadguru Sri Sri Sri Dr.
Nirmalanandanatha Mahaswamiji for providing us the opportunity to complete our academics in this
esteemed institution.

We would also like to express our profound thanks to Revered Sri Sri Dr. Prakashnath Swamiji, BGS
& SJB Group of Institutions, for his continuous support in providing the amenities needed to carry out
this Project Work in this admired institution.

We express our gratitude to Dr. Puttaraju, Academic Director, BGS & SJB Group of Institutions, for
providing us with excellent facilities and an academic ambience, which have helped us in the
satisfactory completion of our Project work.

We express our gratitude to Dr. K. V. Mahendra Prashanth, Principal, SJB Institute of Technology, for
providing us with excellent facilities and an academic ambience, which have helped us in the
satisfactory completion of our Project work.

We extend our sincere thanks to Dr. Babu N V, Dean Academic, SJB Institute of Technology, for
providing us invaluable support throughout the period of our Project work.

We extend our sincere thanks to Dr. Krishna A N, Head of the Department, Computer Science and
Engineering, for providing us invaluable support throughout the period of our Project work.

We wish to express our heartfelt gratitude to our Project Coordinator and guide, Dr. Arun Kumar D R,
Assistant Professor, Department of CSE, for his valuable guidance, suggestions and cheerful
encouragement during the entire period of our Project work.

Finally, we take this opportunity to extend our earnest gratitude and respect to our parents, the teaching
and non-teaching staff of the department, the library staff and all our friends, who have directly or
indirectly supported us during the period of our Project work.

Regards,
DEEKSHITH KUMAR S [1JB21CS038]
DHANUSH B G [1JB21CS039]
DHANUSH M R [1JB21CS040]
HARSHAN GOWDA L V [1JB21CS055]
ABSTRACT

The structure and function of the microvasculature are significantly influenced by hypertension
and heart attacks, key cardiovascular disease risk factors. Images taken with a fundus camera can
be used to spot irregularities in the blood vessels of the retina that indicate the extent of injury
inflicted on blood vessels by hypertension and heart attacks. Machine learning and AI techniques
can detect preclinical signs that fall below the threshold of a human observer. The proposed
methodology aims to investigate the effects of hypertension and heart attacks on the morphological
characteristics of retinal blood vessels. Retinal images are collected from patients diagnosed with
hypertension and heart attack. Interference data, i.e. information about structures other than the
retinal vasculature, is removed using a vessel segmentation method, leaving only morphological
details about the retinal blood vessels. The method aims to create a system for image-based heart
disease detection, especially in young people. In this study, a dataset of retinal images is used, and
retinal vessel segmentation is applied to separate the vessels in the images. In a number of
specialties, such as laryngology, neurosurgery, and ophthalmology, the analysis of blood vessels is
crucial for diagnosis, therapy planning and execution, and assessment of clinical outcomes. Vessel
segmentation is therefore a crucial method for detecting heart disease from retinal images. Changes
in the eyes may be a sign of many conditions.
CHAPTER 1
INTRODUCTION

1.1 BACKGROUND

The heart is a muscular organ that pumps blood through the body and is the
central part of the body’s cardiovascular system, which also includes the lungs. The
cardiovascular system further comprises a network of blood vessels, for example veins,
arteries, and capillaries, which deliver blood all over the body. Abnormalities in normal
blood flow from the heart cause several types of heart disease, commonly known
as cardiovascular diseases (CVD). Heart diseases are the leading cause of death worldwide.
According to a survey by the World Health Organization (WHO), 17.5 million global
deaths occur because of heart attacks and strokes. More than 75% of deaths from
cardiovascular diseases occur in middle-income and low-income countries, and 80%
of the deaths that occur due to CVDs are because of stroke and heart attack. Therefore,
prediction of cardiac abnormalities at an early stage, and tools for the prediction of heart
diseases, can save many lives and help doctors design effective treatment plans, which
ultimately reduces the mortality rate due to cardiovascular diseases.
Due to the development of advanced healthcare systems, a large amount of patient data is
nowadays available (i.e. Big Data in Electronic Health Record systems) which can be used
for designing predictive models for cardiovascular diseases. Data mining or machine
learning is a discovery method for analyzing big data from assorted perspectives and
encapsulating it into useful information. “Data mining is the non-trivial extraction of
implicit, previously unknown and potentially useful information from data.” Nowadays, a
huge amount of data pertaining to disease diagnosis, patients, etc. is generated by healthcare
industries. Data mining provides a number of techniques that discover hidden patterns or
similarities in data.
Therefore, in this report, a machine learning algorithm is proposed for the
implementation of a heart disease prediction system, validated on two open-access
heart disease prediction datasets. Data mining is the computer-based process of extracting
useful information from enormous databases, and it is most helpful in explorative analysis
because it can surface non-trivial information from large volumes of evidence.

However, the available raw medical data are widely distributed, voluminous and
heterogeneous in nature. These data need to be collected in an organized form and can
then be integrated to form a medical information system. Data mining provides a
user-oriented approach to novel and hidden patterns in the data. Data mining tools are
useful for answering business questions and for predicting various diseases in the
healthcare field. Disease prediction plays a significant role in data mining, and this report
analyzes heart disease predictions using classification algorithms. These invisible patterns
can be utilized for health diagnosis in healthcare data.

Data mining technology affords an efficient approach to the latest and previously unknown
patterns in the data. The information that is identified can be used by healthcare
administrators to improve their services. Heart disease is among the leading causes of death
in countries like India and the United States. In this project we predict heart disease
using classification algorithms; machine learning techniques such as DNN classification
and Logistic Regression are used to explore different kinds of heart-based problems.

1.2 MOTIVATION

A major challenge facing healthcare organizations (hospitals, medical centers) is the
provision of quality services at affordable costs. Quality service implies diagnosing patients
correctly and administering treatments that are effective. Poor clinical decisions can lead to
disastrous consequences and are therefore unacceptable. Hospitals must also minimize the
cost of clinical tests. They can achieve these results by employing appropriate computer-
based information and/or decision support systems.
Most hospitals today employ some sort of hospital information systems to manage
their healthcare or patient data [12]. These systems typically generate huge amounts of data
which take the form of numbers, text, charts and images. Unfortunately, these data are rarely
used to support clinical decision making. There is a wealth of hidden information in these
data that is largely untapped. This raises an important question: “How can we turn data into
useful information that can enable healthcare practitioners to make intelligent clinical
decisions?” This is the main motivation for this research.


1.3 PROBLEM STATEMENT

Many hospital information systems are designed to support patient billing, inventory
management and generation of simple statistics. Some hospitals use decision support systems,
but they are largely limited. They can answer simple queries like “What is the average age of
patients who have heart disease?”, “How many surgeries had resulted in hospital stays longer
than 10 days?” “Identify the female patients who are single, above 30 years old, and who
have been treated for cancer.” However, they cannot answer complex queries like “Identify
the important preoperative predictors that increase the length of hospital stay”, “Given patient
records on cancer, should treatment include chemotherapy alone, radiation alone, or both
chemotherapy and radiation?”, and “Given patient records, predict the probability of patients
getting a heart disease.”

Clinical decisions are often made based on doctors’ intuition and experience rather
than on the knowledge-rich data hidden in the database. This practice leads to unwanted
biases, errors and excessive medical costs, which affect the quality of service provided to
patients. Wu et al. proposed that integration of clinical decision support with computer-based
patient records could reduce medical errors, enhance patient safety, decrease unwanted
practice variation, and improve patient outcomes [17]. This suggestion is promising, as data
modeling and analysis tools, e.g. data mining, have the potential to generate a knowledge-
rich environment which can help to significantly improve the quality of clinical decisions.

1.4 RESEARCH OBJECTIVES

The main objective of this research is to develop a prototype Intelligent Heart Disease
Prediction System (IHDPS) using three data mining modeling techniques, namely, Decision
Trees, Naïve Bayes and Neural Network.

IHDPS can discover and extract hidden knowledge (patterns and relationships)
associated with heart disease from a historical heart disease database. It can answer complex
queries for diagnosing heart disease and thus assist healthcare practitioners in making
intelligent clinical decisions which traditional decision support systems cannot. By providing
effective treatments, it can also help to reduce treatment costs.


1.5 SCOPE

The scope of the project is to develop a fully functional, real-time healthcare application that
integrates machine learning models with an intuitive user interface. The key features
include:

1. User Authentication:

The application will include a robust authentication system to ensure that only authorized
users (doctors, scientists, patients) can access their respective dashboards. Users will be
able to sign up, log in, and reset their passwords as needed.

2. Role-Based Dashboards:

Three types of users will interact with the system: doctors, scientists, and patients. Each user
type will have a personalized dashboard:

• Doctors will have access to patient reports, predictions, and tools for making medical
decisions.

• Scientists will have access to aggregated health data for research purposes, with
advanced analytics tools for exploring trends and patterns.

• Patients will have a simple, intuitive interface to submit their medical reports and view
predictions about their health.

3. Data Submission by Patients:

Patients will be able to submit medical reports, including lab results, test reports, and
historical health data. This data will be used to generate predictions and track health
progress over time.

4. Real-Time Predictions and Visualizations:

The system will use machine learning models to generate predictions based on the data
submitted by patients. Streamlit will be used to create interactive visualizations and graphs
that represent the predictions, trends, and health data, helping users better understand their
health status.



CHAPTER 2

LITERATURE SURVEY

Machine Learning techniques are used to analyze and predict medical data from
information resources. Diagnosis of heart disease is a significant and tedious task in medicine.
The term heart disease encompasses the various diseases that affect the heart. Detecting
heart disease from various factors or symptoms is a problem that is not free from false
presumptions and is often accompanied by unpredictable effects. The data classification is
based on supervised Machine Learning algorithms, which yield better accuracy. Here we
use DNN classification as the training algorithm to train on the heart disease dataset
and to predict heart disease. The results showed that the designed prediction system,
together with the medicinal prescriptions, is capable of predicting heart attacks successfully.
Machine Learning techniques have been used to indicate early mortality by analyzing heart
disease patients and their clinical records (Richards, G. et al., 2001). Sung, S.F. et al. (2015)
applied two Machine Learning techniques, a k-nearest neighbor model and an existing
multilinear regression, to predict the stroke severity index (SSI) of patients. Their study
shows that k-nearest neighbor performed better than the multilinear regression model.
Arslan, A. K. et al. (2016) evaluated various Machine Learning techniques such as support
vector machines (SVM) and penalized logistic regression (PLR) to predict heart stroke.
Their results show that SVM produced the best prediction performance compared to the
other models. Boshra Brahmi et al. [20] developed different Machine Learning techniques
to evaluate the prediction and diagnosis of heart disease. The main objective was to evaluate
different classification techniques such as J48, Decision Tree, KNN and Naïve Bayes, and
then to assess performance measures such as accuracy, precision, sensitivity and specificity.
Deep Neural Networks (DNN), Support Vector Machines (SVM), k-Nearest Neighbors
(KNN), and Naïve Bayes have shown promising results in predicting heart disease with
high accuracy. Studies have demonstrated that SVMs perform exceptionally well in
prediction tasks, while KNN outperforms multilinear regression models in certain scenarios
like stroke severity index prediction.

Data Source
Clinical databases have collected a significant amount of information about patients and
their medical conditions. A record set with medical attributes was obtained from the
Cleveland Heart Disease database. With the help of this dataset, the patterns significant to
heart attack diagnosis are extracted. The records were split equally into two datasets: a
training dataset and a testing dataset. A total of 303 records with 76 medical attributes were
obtained. All the attributes are numeric-valued. We work on a reduced set of only 14
attributes.

1. Arti Gupta, Maneesh Shreevastava, IJETAE, 2011. Medical Diagnosis using Back
Propagation Algorithm. In this paper, a feed-forward back propagation algorithm is
described that is used as a classifier to distinguish between infected and non-infected
persons in medical diagnosis. The back propagation algorithm presented in the paper
trains a multilayer neural network with a very small learning rate, especially when using
a large training set.

2. Shraddha Subhash Shirsath, Prof. Shubhangi Patil, IJIRSET, June 2018. Disease
Prediction using Machine Learning over Big Data. This paper discusses a machine
learning approach for accurate disease prediction. A latent factor model is used to handle
incomplete data. A DNN algorithm is used for classification over large volumes of
hospital data, and a Convolutional Neural Network based Multimodal Disease Risk
Prediction (CNN-MDRP) algorithm then provides the disease prediction result.

3. Nikita Kamble, Manjiri Harmalkar, Manali Bhoir, Supriya Chaudhary, IJSRCSEIT, 2017.
Smart Health Prediction System Using Machine Learning. The paper presents an
overview of Machine Learning techniques with their applications and the medical and
educational aspects of clinical predictions. In medical and health care areas, a large
amount of data is becoming available due to regulations and the availability of
computers. Such a large amount of data cannot be processed by humans quickly enough
to make diagnoses and treatment schedules. A major objective is to evaluate Machine
Learning techniques in clinical and health care applications to develop accurate decisions.

It also gives a detailed discussion of medical Machine Learning techniques which can
improve various aspects of clinical predictions. Machine learning is a powerful technology
of high interest in the computing world. It is a subfield of computer science that uses
existing data in different databases and transforms it into new research and results. It
makes use of machine learning and database management to extract new patterns from
large data sets along with the knowledge associated with these patterns. The actual task is
to extract data by automatic or semi-automatic means. The different techniques included in
Machine Learning include clustering, forecasting, path analysis and predictive analysis.

4. Nilesh Borisagar, Dipa Barad, Priyanka Raval, conference paper (PICCN), April 2017.
Chronic Kidney Disease Prediction using Back Propagation Neural Network
Algorithm. In this paper, various training algorithms such as Levenberg-Marquardt,
Bayesian regularization, scaled conjugate gradient and resilient back propagation are
discussed. After the neural network is trained using these back propagation algorithms,
the trained network is used for detection of kidney disease in the human body. The back
propagation algorithms presented here can distinguish between infected and non-infected
persons.

5. Sellappan Palaniappan, Rafiah Awang, IEEE, 2008. Intelligent Heart Disease Prediction
System Using Machine Learning Technique. This paper discusses the development of a
prototype using Machine Learning techniques, namely Decision Trees, Naïve Bayes and
Neural Network. It can answer complex “what if” queries which traditional decision
support systems cannot. It is web-based, user-friendly, scalable, reliable and expandable.

6. M.A. Nishara Banu, B. Gomathy, IJTRA, Dec 2013. Disease Prediction System Using
Machine Learning Techniques. This paper analyzes heart disease predictions using
different classification algorithms. Medical Machine Learning techniques like
Association Rule Mining, Clustering, and classification algorithms such as DNN
classification and the C4.5 algorithm are implemented to analyze different kinds of
heart-based problems. The Maximal Frequent Itemset Algorithm (MAFIA) is used for
mining maximal frequent itemsets from a transactional database.


The literature survey on heart disease prediction using Machine Learning (ML) techniques
has provided several significant insights into the effectiveness, challenges, and practical
applications of these methods:

1. Performance of Machine Learning Algorithms

• Support Vector Machines (SVM): Consistently outperformed other algorithms in
terms of accuracy, particularly in handling small to medium-sized datasets. Its ability
to manage high-dimensional spaces makes it suitable for complex medical datasets.
• Deep Neural Networks (DNN): Effective for large datasets due to their capacity to
model intricate patterns and interactions in the data. However, they require significant
computational resources and large labeled datasets for optimal performance.
• k-Nearest Neighbor (KNN): Demonstrates strong performance in scenarios requiring
non-linear modeling but is computationally expensive for large datasets due to its
dependency on distance calculations.
• Decision Trees and Ensemble Methods: Methods like Random Forest and Gradient
Boosting show robust performance by reducing overfitting and enhancing predictive
accuracy. They are particularly effective in handling categorical and numerical data.
• Naïve Bayes: A fast and simple approach, often used for initial experimentation, but its
assumption of feature independence can limit its accuracy for complex datasets.

2. Key Evaluation Metrics

• Accuracy: Widely used but may be misleading for imbalanced datasets where the
majority class dominates.
• Precision, Recall, and F1-Score: These metrics provide a more nuanced
understanding of model performance, especially for datasets with class imbalances
(e.g., heart disease vs. no heart disease cases).
• Specificity and Sensitivity: Critical for medical applications to minimize false
negatives (missing a diagnosis) and false positives (incorrectly diagnosing heart
disease).
• ROC-AUC Curve: Provides a comprehensive measure of the model's ability to
discriminate between positive and negative classes across all decision thresholds.
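
The following minimal sketch shows how these metrics can be computed with scikit-learn;
the arrays y_true, y_pred, and y_score are hypothetical placeholders, not results from this
project:

    # Hedged sketch: computing the evaluation metrics above with scikit-learn.
    from sklearn.metrics import (accuracy_score, precision_score,
                                 recall_score, f1_score, roc_auc_score)

    y_true = [0, 1, 1, 0, 1]             # ground-truth labels (1 = heart disease)
    y_pred = [0, 1, 0, 0, 1]             # hard predictions from a classifier
    y_score = [0.2, 0.9, 0.4, 0.1, 0.8]  # predicted probabilities for class 1

    print("Accuracy :", accuracy_score(y_true, y_pred))
    print("Precision:", precision_score(y_true, y_pred))
    print("Recall   :", recall_score(y_true, y_pred))   # sensitivity
    print("F1-score :", f1_score(y_true, y_pred))
    print("ROC-AUC  :", roc_auc_score(y_true, y_score))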


3. Data Challenges
• High Dimensionality: Medical datasets often contain numerous features, requiring
dimensionality reduction techniques such as Principal Component Analysis (PCA) or
feature selection methods.
• Class Imbalance: Heart disease datasets frequently exhibit imbalanced distributions,
with significantly fewer positive cases. Techniques like SMOTE (Synthetic Minority
Oversampling Technique) are employed to address this.
• Missing Data: Clinical records often have missing values, necessitating imputation
methods or careful preprocessing to avoid bias.
• Heterogeneous Data: Data may come from various sources (e.g., electronic health
records, imaging, or wearable devices), requiring standardization and integration.
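
As an illustration of the class-imbalance point above, the sketch below applies SMOTE from
the imbalanced-learn package to a synthetic dataset standing in for heart disease records; it
is an example pattern, not this project's pipeline:

    # Hedged sketch: balancing a minority class with SMOTE (imbalanced-learn).
    from collections import Counter
    from imblearn.over_sampling import SMOTE
    from sklearn.datasets import make_classification

    # Synthetic imbalanced dataset: ~90% negative, ~10% positive cases.
    X, y = make_classification(n_samples=1000, weights=[0.9, 0.1],
                               random_state=42)
    print("Before:", Counter(y))

    # SMOTE synthesizes new minority-class samples between existing neighbors.
    X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)
    print("After :", Counter(y_res))     # classes balanced by oversampling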

4. Processing and Feature Engineering

• Feature selection (e.g., age, blood pressure, cholesterol, ECG results) plays a crucial
role in improving model performance.
• Scaling techniques (e.g., normalization or standardization) are essential for algorithms
sensitive to feature magnitudes, such as SVM and KNN.
• Temporal data (e.g., time-series analysis of patient history) requires specialized
preprocessing techniques to capture trends over time.

5. Algorithm Limitations

• Overfitting: Particularly prevalent in complex models like DNNs, requiring
regularization techniques and cross-validation to mitigate.
• Interpretability: Black-box models such as DNNs and SVMs lack explainability,
making them less preferred in scenarios where interpretability is critical, such as in
clinical decision-making.
• Computational Costs: Advanced models like DNNs and ensemble methods require
significant computational resources, which may limit their deployment in resource-
constrained environments.


6. Real World Applications

• ML models have been integrated into decision support systems for clinicians, aiding in
early diagnosis and personalized treatment planning.
• Predictive models are being used in public health for identifying high-risk populations
and designing preventive interventions.
• Continuous monitoring systems leveraging wearable devices and ML algorithms
enable real-time prediction and alerts for heart disease patients.

7. Emerging Trends

• Hybrid Models: Combining multiple algorithms (e.g., ensemble methods or hybrid
approaches) is gaining traction for enhancing accuracy and robustness.
• Explainable AI (XAI): Focus on developing interpretable models to improve trust and
adoption in clinical settings.
• Integration with Big Data: Leveraging large-scale datasets for better generalization
and deeper insights into heart disease patterns.
• Real-Time Analysis: Using streaming data from wearable devices and IoT for
continuous, real-time monitoring and prediction.

These observations highlight the potential of ML to transform heart disease diagnosis and
prevention, while emphasizing the challenges that must be addressed to maximize its
impact. By enabling early detection, personalized treatment planning, and real-time
monitoring, ML can significantly improve patient outcomes and reduce mortality rates. To
fully harness these capabilities, it is essential to ensure high-quality and diverse datasets,
enhance model interpretability for clinical adoption, mitigate overfitting and computational
overhead, and integrate ML models seamlessly into existing healthcare infrastructures.
With continued advances in algorithms, data processing techniques, and interdisciplinary
collaboration, ML can pave the way for more precise, accessible, and impactful
cardiovascular care.



CHAPTER 3

SYSTEM REQUIREMENTS SPECIFICATIONS

3.1 SOFTWARE REQUIREMENTS

1. Operating System:
• Windows: For development and testing environments.
• macOS: For compatibility with developers using Apple devices.
• Linux (Ubuntu Recommended): Preferred for server deployment due to
stability and compatibility with Python libraries and machine learning
frameworks.
2. Programming Language:
• Python: Version 3.8 or higher to ensure compatibility with the latest machine
learning and deep learning libraries.
3. Frameworks and Libraries:
• Deep Learning Frameworks:
• TensorFlow: For building and training deep learning models.
• PyTorch: Alternative framework for experimentation and model
prototyping.
• Machine Learning Libraries:
• Scikit-learn: For preprocessing and classical machine learning tasks.
• XGBoost: For high-performance gradient boosting models.
• Image Processing Libraries:
• OpenCV: For handling and preprocessing retinal images.
• PIL (Pillow): For image manipulation and augmentation.
• Data Analysis and Visualization Libraries:
• Numpy and Pandas: For numerical computations and data handling.
• Matplotlib and Seaborn: For creating plots and analyzing visual data
trends.
• Plotly: For interactive dashboards and advanced data visualization.
4. Database Management System (DBMS):

• MongoDB: To store structured patient data, predictions, and risk assessment


results.
• SQLite: Lightweight database for local storage during initial development
phases.
5. Development Tools:
• Jupyter Notebook: For model development and experimentation.
• Integrated Development Environment (IDE):
• PyCharm: For Python-based development.
• VS Code: Lightweight alternative IDE.
• Version Control:
• Git: For source code versioning and collaboration.
• GitHub or GitLab: For repository management and team collaboration.
6. APIs and External Services:
• Google Colab: For GPU/TPU-based training.
• Retinal Image Datasets:
• Public datasets such as EyePACS, DRIVE, or STARE for model
training.
7. Web Frameworks (For User Interface):
• Django: For backend development.
• Flask: Lightweight alternative for building APIs.
• Bootstrap: For responsive front-end design.
8. Testing and Deployment Tools:
• Selenium: For testing the web interface.
• Docker: For containerizing the application for deployment.
• Kubernetes: For orchestrating deployment in a scalable manner.

3.2 HARDWARE REQUIREMENTS

1. Development Machine:
o Processor: Intel i5 or higher / AMD Ryzen 5 or higher.
o RAM: Minimum 16 GB for efficient model training and data processing.
o Storage: SSD with at least 512 GB for storing datasets and models.


o GPU: NVIDIA GPU with CUDA support (e.g., NVIDIA GTX 1660 or higher).

2. Server Requirements:
o Processor: Intel Xeon or equivalent with multi-core architecture.
o RAM: Minimum 32 GB for handling concurrent user requests and large
datasets.
o Storage: 2 TB HDD for logs and backups, 1 TB SSD for operational data.
o GPU: NVIDIA Tesla series or equivalent for deep learning inference.
3. User Devices:
o Desktop or Laptop:
▪ Processor: Intel i3 or equivalent.
▪ RAM: 4 GB or higher.
▪ Browser: Latest version of Chrome, Firefox, or Edge.
o Mobile Devices:
▪ Android or iOS with at least 2 GB of RAM and a modern browser.

3.3 FUNCTIONAL REQUIREMENTS

1. Data Acquisition:

o Accept retinal images in standard formats (e.g., JPEG, PNG).

o Validate image quality and provide feedback if images are blurry or improperly
formatted.

2. Preprocessing:

o Convert images to grayscale for computational efficiency.

o Resize images to a uniform dimension (e.g., 224x224 pixels).

o Perform image augmentation (e.g., rotation, flipping) to enhance model
robustness; a code sketch of these preprocessing steps follows this list.

3. Model Training:

o Train a convolutional neural network (CNN) for feature extraction.

o Use transfer learning with pre-trained models like ResNet or EfficientNet.

o Validate model performance using metrics like accuracy, precision, recall, and
F1-score.

4. Prediction and Risk Assessment:

o Process uploaded retinal images through the trained model.

o Provide a numerical risk score for heart attack likelihood.

o Categorize results as low, medium, or high risk.

5. Data Storage:

o Store patient information and predictions in the database.

o Ensure compliance with privacy regulations like GDPR and HIPAA.

6. User Interface:

o Intuitive dashboard for uploading images and viewing results.

o Visualize risk predictions with color-coded graphs and heatmaps.

7. Reporting and Notifications:

o Generate detailed reports summarizing the risk analysis.

o Notify users about results via email or SMS (if opted in).

3.4 NON-FUNCTIONAL REQUIREMENTS

1. Performance:

o Predict heart attack risk within 5 seconds of image upload.

o Handle concurrent requests from up to 50 users.

2. Scalability:

o Support future integration of additional biomarkers for comprehensive risk
assessment.

3. Security:

o Encrypt all data in transit and at rest.

o Implement role-based access control for sensitive operations.

4. Availability:


o Ensure 99.9% uptime with failover mechanisms.

5. Usability:

o Design for accessibility, supporting screen readers and keyboard navigation.

6. Maintainability:

o Modularize code for easy updates and troubleshooting.

3.5 DATA REQUIREMENTS

1. Input Data:

o Retinal images (RGB or grayscale) with a resolution of at least 1024x1024
pixels.

2. Output Data:

o Risk predictions as numerical scores and categorized levels (low, medium,
high).

3. Training Data:

o Labeled datasets with ground truth for cardiovascular risk.

o Diverse demographics to ensure model fairness and inclusivity.

3.6 REGULATORY AND ETHICS REQUIREMENTS

1. Compliance:

o Ensure compliance with HIPAA for data privacy and security.

o Adhere to GDPR for handling European user data.

2. Bias Mitigation:

o Regularly audit models for bias against specific demographics.

3. Transparency:

o Provide users with interpretable explanations of model predictions.

3.7 SOFTWARE REQUIREMENTS: PYTHON WITH ANACONDA AS ENVIRONMENT

3.7.1 Python Environment in Anaconda


Anaconda is a powerful distribution of Python and R that is widely used for data science,
machine learning, and scientific computing tasks. It simplifies package management and
deployment, especially when working with multiple projects or datasets. Anaconda
provides a complete and self-contained environment for Python development, making it
the ideal platform for data scientists and researchers. It includes Python, essential libraries,
tools, and an easy-to-use package manager called conda.
This section outlines the specific software requirements for setting up Python with
Anaconda as the development environment.

3.7.2 System Configuration


1. Operating System:
• Windows: Windows 10 or higher is recommended for running Python within
Anaconda. Anaconda provides a straightforward installation for Python
developers.
• macOS: Ideal for macOS users, with full compatibility for Python-based
libraries and frameworks.
• Linux (Ubuntu Recommended): Linux is often preferred for Python-based
environments, especially for running heavy computational tasks or deep
learning models. Ubuntu is the most commonly used Linux distribution for this
purpose.
2. Python Version:
• Python 3.8 or higher: The minimum version of Python required for this
environment is 3.8, as it ensures compatibility with modern Python libraries
and frameworks for machine learning and data analysis.
3. Key Libraries and Frameworks:
Anaconda simplifies the installation of Python libraries and frameworks. Here’s a
list of essential libraries that need to be installed within the Anaconda environment:
• Deep Learning Libraries:
o TensorFlow: For building and training deep learning models.
o PyTorch: A flexible framework for machine learning and deep learning
tasks.
• Machine Learning Libraries:
o Scikit-learn: For machine learning algorithms such as classification,
regression, and clustering.


o XGBoost: For high-performance boosting algorithms.
• Data Processing and Image Manipulation:
o Pandas: For data analysis, manipulation, and cleaning.
o NumPy: For efficient numerical operations and array manipulations.
o OpenCV: For computer vision and image processing.
o Pillow (PIL): For image manipulation and augmentation tasks.
• Visualization Libraries:
o Matplotlib: For creating static, interactive, and animated visualizations.
o Seaborn: For statistical data visualization built on top of Matplotlib.
o Plotly: For interactive plots and dashboards.
• Web Development Frameworks (if applicable for deployment):
o Django: A high-level web framework for Python.
o Flask: A lightweight alternative for building APIs and web services.
o Bootstrap: For responsive front-end development.
4. Database and Storage:
• SQLite: Ideal for lightweight local storage during development.
• MongoDB: A NoSQL database to store structured data such as results and
predictions.
5. Development Tools:
• Jupyter Notebook: For an interactive environment to write and run Python
code, visualize results, and document workflows.
• IDEs for Python Development:
o PyCharm: A popular Python IDE for full-scale development.
o VS Code: A lightweight and highly customizable IDE.
6. Package Management Tools:
• Conda: An open-source package and environment management system that is
the core of Anaconda. It simplifies the installation, removal, and updating of
Python libraries and tools.

3.7.3 Hardware Requirements


For a smooth development and execution experience when using Python in
Anaconda, the following hardware specifications are recommended.

1. Development Machine:
o Processor: Intel i5 or AMD Ryzen 5 (or higher) for efficient model training and
development.
o RAM: Minimum of 16 GB of RAM is recommended to handle large datasets
and models.
o Storage: SSD with a minimum of 512 GB to store datasets, models, and
intermediate results.
o Graphics Processing Unit (GPU): NVIDIA GPU with CUDA support (e.g.,
NVIDIA GTX 1660 or higher) for accelerated model training using libraries
like TensorFlow and PyTorch.
2. Server Requirements (if using Anaconda for server-side applications or
deployments):
o Processor: Intel Xeon or equivalent, multi-core architecture for handling large-
scale operations and concurrent user requests.
o RAM: 32 GB or higher to manage large data and simultaneous processes.
o Storage: 1 TB SSD for operational data and 2 TB HDD for logs and backups.
o GPU: NVIDIA Tesla series or equivalent for deep learning model inference.
3. User Devices:
o Desktop or Laptop:
▪ Processor: Intel i3 or equivalent.
▪ RAM: 4 GB or higher for running small models or viewing results.
▪ Browser: Latest versions of Chrome, Firefox, or Edge for accessing
Jupyter Notebooks or web applications.
o Mobile Devices:
▪ Android or iOS with at least 2 GB of RAM for interacting with
deployed applications or APIs.

3.7.4 Development Tools And Libraries


1. Jupyter Notebook: Jupyter is an essential tool in the Python environment for running
code interactively, making it an integral part of the Anaconda distribution. It supports data
exploration and visualization through inline plots and charts.
2. Integrated Development Environments (IDEs):
• PyCharm: A feature-rich IDE specifically designed for Python development. It
provides extensive debugging, testing, and code navigation features.


• VS Code: A lightweight code editor that supports Python development with
extensions for linting, debugging, and version control integration.
3. Version Control:
• Git: For version control and collaboration, ensuring code integrity and
collaboration between teams.
• GitHub or GitLab: Online platforms for hosting repositories and managing
version-controlled code.
4. Package Management:
• Conda: As the default package manager in Anaconda, it ensures the correct
installation and management of libraries and dependencies for projects.
• Pip: While Conda handles most packages, pip can be used to install Python-
specific packages not available via Conda.

3.7.5 Scalability And Performance Requirements


Python environments within Anaconda are scalable, and the framework supports parallel and
distributed computing:
1. Scalability:
o Ability to scale environments to deploy machine learning models with frameworks
such as Dask and Apache Spark, which Anaconda supports.
o Anaconda environments can be containerized using Docker and orchestrated with
Kubernetes for efficient deployment and scaling in cloud or server environments.
2. Performance:
o Python applications in Anaconda should process and predict within milliseconds
for most tasks.
o Parallelization with multi-threading or multi-processing can be used to speed up
computations in machine learning tasks.
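
A minimal sketch of the parallelization point above, using the Python standard library's
ProcessPoolExecutor to run a CPU-bound stand-in task (such as preprocessing one image)
across worker processes; the task itself is a hypothetical placeholder:

    # Hedged sketch: CPU parallelism with the standard library.
    from concurrent.futures import ProcessPoolExecutor
    import math

    def heavy_task(n):
        # Stand-in for a CPU-bound step, e.g. preprocessing one retinal image.
        return sum(math.sqrt(i) for i in range(n))

    if __name__ == "__main__":
        inputs = [1_000_000] * 8                 # eight independent work items
        with ProcessPoolExecutor() as pool:      # one worker per CPU core by default
            results = list(pool.map(heavy_task, inputs))
        print(len(results), "tasks completed in parallel")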

3.7.6 Security Considerations


1. Data Privacy:
• Sensitive data (e.g., medical records, financial information) must be encrypted both at
rest and in transit, especially when using Anaconda for applications involving personal
data.

2. Access Control:
• Role-based access control (RBAC) can be implemented for restricting access to certain
parts of the application or environment, especially in shared or multi-user settings.

3.7.7 Regulatory And Compliance Requirements


1. Compliance:
• Ensure compliance with regulations such as GDPR for data privacy when processing
personal data in Python applications.
• Adhere to HIPAA or other healthcare-related standards if the environment involves
processing sensitive health data.
2. Bias and Fairness:
• Regularly evaluate and audit models and data pipelines for potential biases, especially
in machine learning models used for decision-making.

The system requirements outlined for the project highlight the critical role that
software, hardware, and data play in the development and deployment of an AI-based
solution for heart attack risk prediction from retinal images. The chosen environment, such
as Anaconda, Python, and associated libraries, ensures a stable and efficient platform for
machine learning and deep learning tasks. Additionally, the hardware specifications,
including high-performance GPUs for model training and robust server capabilities for
handling large-scale user data and concurrent requests, are crucial for the seamless
functioning of the system.
The functional requirements emphasize the need for accurate image preprocessing,
effective model training, and reliable risk predictions that are essential for the clinical
decision-making process. The system should ensure that all patient data is handled
securely, complying with privacy regulations like HIPAA and GDPR, which is a core
component of building trust in such a system.
In summary, the combination of advanced technologies, rigorous regulatory
adherence, and a user-centered approach will make this system not only efficient and
scalable but also trustworthy and effective in predicting heart attack risks, ultimately
improving patient care and health outcomes.



CHAPTER 4
SYSTEM DESIGN

4.1 USE CASE DIAGRAM

A use case diagram shows the system’s functionality and the interactions between users
(or external systems) and the system itself. For the heart attack risk prediction system,
the following use cases can be defined:
Actors:
• Patient/User: The end user who uploads retinal images and views the results.
• Admin: The user managing the backend, including training the model, monitoring
system performance, and maintaining user accounts.
• Model (System): The machine learning model that processes retinal images and
provides risk predictions.
Use Cases:
• Upload Retinal Image: A patient uploads an image of their retina in a supported
format (e.g., JPEG, PNG).
• Preprocess Image: The system preprocesses the uploaded image (grayscale
conversion, resizing).
• Predict Heart Attack Risk: The system uses the trained model to predict the risk
of heart attack based on the processed retinal image.
• View Prediction Results: The patient views the heart attack risk prediction,
categorized as low, medium, or high.
• Store Data: The system stores the image and its corresponding prediction in the
database.
• Generate Report: The system generates a detailed risk assessment report.
• Admin Dashboard: Admin monitors system logs, updates the model, and
manages user data.
Diagram Overview:
The diagram will show the relationships between the actors and use cases, with the
patient primarily interacting with the system to upload images and view results. The
admin would interact with the backend to monitor, update, and maintain the system.

Fig 4.1: Use Case Diagram

4.2 ACTIVITY DIAGRAM

An activity diagram represents the workflow of the system for a specific use case,
detailing the steps and decisions involved in processing an uploaded retinal image and
predicting the heart attack risk.
Workflow for Predicting Heart Attack Risk:
1. Start
2. Upload Retinal Image:
o The patient uploads a retinal image.
3. Preprocessing:
o Convert image to grayscale.
o Resize image to a standard size (e.g., 224x224 pixels).
o Augment image (optional).
4. Image Validation:
o Check if the image format is correct (JPEG/PNG).
o Validate the image quality (e.g., ensure it is not blurry).


o If invalid, prompt the user to upload a valid image.


5. Predict Heart Attack Risk:
o Pass the preprocessed image to the model for prediction.
6. Categorize Risk:
o The model outputs a risk score.
o The system categorizes the result as low, medium, or high risk.
7. Store Data:
o Store the uploaded image and the corresponding prediction in the database.
8. Display Results:
o The system displays the results to the patient.
9. Generate Report:
o The system generates a detailed report summarizing the prediction.
Diagram Overview:
• The activity diagram will use nodes and edges to represent the flow from one
activity to another, including decision points (e.g., image validation) and the final
actions (storing data, generating reports).

Fig 4.2: Activity Diagram


4.3 SEQUENCE DIAGRAM

A sequence diagram illustrates the interaction between objects or components over time
for a specific process. For this project, a sequence diagram can describe the process of
predicting heart attack risk from the moment the patient uploads a retinal image to when
the prediction is displayed.
Sequence of Operations:
1. Patient: Uploads the retinal image.
2. System: Receives the image and initiates preprocessing.
o Converts the image to grayscale.
o Resizes the image to the correct dimensions.
o Validates the image quality.
3. Model: Processes the preprocessed image and returns the risk score.
4. System: Categorizes the result (low, medium, or high risk).
Diagram Overview:
• The sequence diagram will show each step as a vertical lifeline (e.g., Patient,
System, Model) with horizontal arrows representing interactions between them
(e.g., uploading the image, calling the model, displaying the result).

Fig 4.3: Sequence Diagram


4.4 DATA FLOW DIAGRAM LEVEL 0

In Level 0, the DFD represents the entire system as a single process. This provides a broad
overview of how data flows from collection to the output.
Processes:
1. Collection
o External Entity: Patient
o Description: The patient uploads retinal images for analysis.
o Data Flow: Retinal Image Data → System
2. Processing
o External Entity: Data Processing Service
o Description: The system processes the image data to generate predictions.
It involves all necessary steps like preprocessing, applying algorithms, and
extracting features from the image.
o Data Flow: Retinal Image Data → Processed Data → Prediction Model
3. Random Selection
o External Entity: Algorithm Selection Module
o Description: Randomly selects a specific model or set of features for
training, which helps in optimizing and testing various configurations of the
model.
o Data Flow: Processed Data → Random Selection → Trained Model
4. Trained Dataset
o External Entity: Model Training
o Description: The processed data is used to train the deep learning model.
After model training, the dataset is applied for predictions to determine
heart attack risks.
o Data Flow: Trained Data → Model Training → Predictions → Output
Results


Fig 4.4: Level 0 DFD

4.5 DATA FLOW DIAGRAM LEVEL 1

In Level 1 of the Data Flow Diagram (DFD), we break down the high-level processes
identified in Level 0 into smaller, more detailed components. This provides an in-depth
look at how data is processed within each phase and the interactions between the
processes, data stores, and the system components. Below is an elaboration of each process
and its corresponding flow.

1. Collection Process
Input:
• Retinal Images uploaded by the patient: These images are captured by the
patient and uploaded to the system for analysis. The images may come in various
formats such as JPEG, PNG, or TIFF.
Process:
• Receive and store retinal images:
o The Collection process begins when the patient uploads their retinal images
to the system. These images are first received by the system and stored in a
Data Store for further processing.
o The system may verify the format and quality of the images to ensure they
are suitable for further processing.
Output:
• Image Data: The images are stored and organized in the system for later use in the
Preprocessing stage. They are saved in a Retinal Image Data Store where they
can be easily accessed by subsequent processes.
Data Flow:
• Retinal Image Data → System: The images are transferred from the patient to the
system, initiating the collection process.

2. Preprocessing
Input:
• Retinal Image Data: The raw retinal images collected in the first step are fed into
the preprocessing system for further refinement.
Process:
• Preprocessing:
o The Preprocessing stage is essential for improving the quality of the
images before extracting meaningful features. The process includes several
steps:
▪ Noise Removal: Filters are applied to remove any irrelevant or
distracting data (e.g., blur or distortion) from the images.
▪ Resizing: The images are resized to a uniform dimension to make
them easier to analyze and reduce computational load during later
stages.
▪ Grayscale Conversion: The images are converted to grayscale, as
this reduces complexity by eliminating color data and focusing
solely on the intensity of the light.


▪ The output of this process is a cleaned and simplified version of the
retinal images that can be used for feature extraction.
Output:
• Preprocessed Image Data: After the image is cleaned and transformed, the
processed version is sent forward to the Feature Extraction step.
Data Flow:
• Retinal Image → Preprocessed Image → Further Processing: The raw retinal
image flows into the preprocessing stage and is converted into preprocessed data.

3. Feature Extraction
Input:
• Preprocessed Image Data: The processed images (post noise removal, resizing,
and grayscale conversion) are fed into the Feature Extraction system for the next
phase of analysis.
Process:
• Feature Extraction:
o During the Feature Extraction process, specific patterns and attributes are
identified in the retinal images that are most relevant for heart attack risk
prediction. These features help the machine learning model to understand
the image and make accurate predictions.
o Common features that might be extracted from retinal images include:
▪ Blood Vessel Patterns: Analysis of blood vessel morphology
(shape, branching patterns) which can indicate cardiovascular
conditions.
▪ Anomalies: Detection of abnormal regions such as hemorrhages,
exudates, or other vascular irregularities, which are indicative of
heart problems.
o Specialized algorithms like edge detection or texture analysis could be used
to identify these features from the images.
Output:
• Extracted Features: After processing, the key features of the image (such as blood
vessel patterns or anomalies) are isolated and stored in a structured format. These
extracted features will be used as input for the predictive algorithm in the next
stage.
Data Flow:
• Preprocessed Image → Extracted Features: The preprocessed image is passed to
the feature extraction system, which generates a set of meaningful features.
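
One classical way to realize this feature extraction step is contrast enhancement on the
green channel followed by black-hat morphology, which highlights thin dark vessel
structures. The sketch below (OpenCV) is an illustrative baseline under that assumption,
not the exact pipeline used in this report:

    # Hedged sketch: classical vessel enhancement and a simple handcrafted feature.
    import cv2
    import numpy as np

    def extract_vessel_map(bgr_image):
        green = bgr_image[:, :, 1]            # vessels contrast best in the green channel
        clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
        enhanced = clahe.apply(green)         # local contrast enhancement
        # Black-hat morphology highlights thin dark structures (vessels).
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
        blackhat = cv2.morphologyEx(enhanced, cv2.MORPH_BLACKHAT, kernel)
        _, vessels = cv2.threshold(blackhat, 15, 255, cv2.THRESH_BINARY)
        return vessels                        # binary vessel mask

    def vessel_features(vessel_mask):
        """A simple handcrafted feature: fraction of pixels belonging to vessels."""
        density = np.count_nonzero(vessel_mask) / vessel_mask.size
        return {"vessel_density": density}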

4. Apply Algorithm
Input:
• Extracted Features: These are the relevant patterns and attributes that have been
identified from the retinal images. They serve as the input for the machine learning
model to generate predictions.
Process:
• Apply Algorithm:
o The Apply Algorithm process is where the actual prediction happens. The
extracted features are input into a machine learning model or algorithm to
evaluate the risk of a heart attack.
o Various types of algorithms can be used for this prediction, including:
▪ Convolutional Neural Networks (CNNs): These are deep learning
algorithms particularly suited for image analysis, as they can
automatically detect patterns like blood vessel structures or
anomalies in the retinal images.
▪ Other Machine Learning Models: Depending on the architecture
of the system, one may also apply other algorithms such as decision
trees, random forests, or support vector machines, based on the
extracted features.
o The machine learning model processes the features and outputs a prediction
about the heart attack risk. This prediction could be a:
▪ Numerical Score: A value representing the likelihood of the patient
experiencing a heart attack (e.g., 0.75 means 75% risk).
▪ Classification: The risk could be categorized into low, medium, or
high based on thresholds set during training.
Output:
• Risk Prediction Result: This is the final output of the system, where a prediction
about the heart attack risk is made. The result is typically a numerical value or a
categorical classification (low, medium, high).
Data Flow:
• Extracted Features → Algorithm → Prediction Model → Prediction Result:
The extracted features are passed into the prediction model (algorithm), which
generates a risk prediction that flows to the final output stage.
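
A hedged sketch of this step, assuming the transfer-learning route suggested in Chapter 3 (a
pre-trained ResNet50 backbone in Keras) with a softmax output over the four risk categories
used in this report; the layer sizes are illustrative assumptions:

    # Hedged sketch: transfer-learning CNN for four-way risk classification.
    import tensorflow as tf
    from tensorflow.keras import layers, models

    def build_risk_model(num_classes=4):
        base = tf.keras.applications.ResNet50(
            include_top=False, weights="imagenet",
            input_shape=(224, 224, 3), pooling="avg")
        base.trainable = False                  # freeze the pre-trained features
        model = models.Sequential([
            base,
            layers.Dense(128, activation="relu"),
            layers.Dropout(0.3),
            layers.Dense(num_classes, activation="softmax"),  # risk classes
        ])
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        return model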

Fig 4.5: Level 1 DFD


4.6 ALGORITHMS

4.6.1 LOGISTIC REGRESSION:

A popular statistical technique to predict binomial outcomes (y = 0 or 1) is
Logistic Regression. Logistic regression predicts categorical outcomes (binomial/
multinomial values of y). The predictions of Logistic Regression (henceforth LogR)
are in the form of probabilities of an event occurring, i.e. the probability of y = 1
given certain values of the input variables x. Thus, the results of LogR range
between 0 and 1.

LogR models the data points using the standard logistic function, an S-shaped
curve also called the sigmoid curve, given by the equation:

f(z) = 1 / (1 + e^(-z))

where z is a linear combination of the input variables.
Logistic Regression Assumptions:

• Logistic regression requires the dependent variable to be binary.

• For a binary regression, the factor level 1 of the dependent variable should
represent the desired outcome.

• Only the meaningful variables should be included.

• The independent variables should be independent of each other.

• Logistic regression requires quite large sample sizes.

• Although logistic (logit) regression is frequently used for binary variables (2
classes), it can also be used for categorical dependent variables with more than 2
classes. In this case it is called Multinomial Logistic Regression.
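
A minimal scikit-learn sketch of logistic regression on a synthetic 14-attribute dataset
(echoing the reduced Cleveland attribute set); the data and parameters are placeholders,
not this project's actual experiment:

    # Hedged sketch: binary logistic regression with feature scaling.
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=500, n_features=14, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)

    scaler = StandardScaler().fit(X_train)      # LogR benefits from scaled inputs
    clf = LogisticRegression(max_iter=1000)
    clf.fit(scaler.transform(X_train), y_train)

    # predict_proba returns P(y=1 | x), i.e. values in the 0-1 range.
    probs = clf.predict_proba(scaler.transform(X_test))[:, 1]
    print("Test accuracy:", clf.score(scaler.transform(X_test), y_test))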


Fig 4.6 Logistic Regression

4.6.2 RNN CLASSIFICATION:

The RNN classification module is applied as a supervised learning algorithm that can be used for both classification and regression, although it is mainly used for classification problems. It follows an ensemble strategy: just as a forest is made up of trees, and more trees make a more robust forest, the classifier builds decision trees on data samples, obtains a prediction from each of them, and finally selects the best solution by voting. Such an ensemble method is more robust than a single decision tree.


The classification module works through the following steps (see the sketch below):

• First, start with the selection of random samples from the given dataset.

• Next, the algorithm constructs a decision tree for every sample and obtains a prediction result from every decision tree.

• Finally, voting is performed over the predicted results, and the majority vote becomes the final prediction.
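The voting behaviour described above corresponds to a random-forest-style ensemble. A minimal sketch with scikit-learn and synthetic placeholder data:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Each tree votes; predict() returns the majority class, while
# predict_proba() reports the fraction of trees voting for each class.
print(forest.predict(X[:5]))
print(forest.predict_proba(X[:5]))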

Fig 4.7 RNN Classification

4.7 SYSTEM ARCHITECTURE

The Heart Attack Risk Prediction System analyzes retinal images to predict heart disease
risk and health indicators like age group, SBP, BMI, and HbA1c. The system begins with an
input layer, where retinal images are fed into the model. In the preprocessing layer, images
are normalized, resized, and cleaned for better quality. Feature extraction is done through
methods like handcrafted feature extraction, focusing on key retinal characteristics such as
blood vessel patterns. The extracted features are then passed to a Recurrent Neural
Network (RNN), which classifies the data and predicts health metrics. The RNN model
assigns probabilities to risk levels (No Risk, Low Risk, Mild Risk, High Risk) and
continuous values like SBP, BMI, and HbA1c. A softmax function is used for classification
probabilities, while regression techniques predict continuous outputs. The output layer
displays the final predictions for heart attack risk and health metrics.
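As a small illustration of the classification head, the sketch below converts a vector of raw model outputs (logits) into a probability distribution over the four risk levels using the softmax function; the logit values are made up for the example.

import numpy as np

def softmax(z):
    z = z - np.max(z)          # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([0.2, 1.5, 0.3, -0.1])   # hypothetical outputs for the 4 risk levels
probs = softmax(logits)
print(probs, probs.sum())                  # probabilities that sum to 1.0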
Fig 4.8 System Architecture


1. Input Features:
o Features extracted from the retinal images, such as blood vessel density, optic disk
size, and other biomarkers.
o These features are numerical values that represent potential indicators of heart attack
risk.
2. Trained RNN Model:
o A pre-trained RNN model that has learned the relationship between the input features
and the risk levels.
o The model uses its layers to process the input features and predict the risk
probabilities.
3. Output Probabilities:
o The model outputs a probability distribution across the four risk levels:
▪ No Risk: P1
▪ Low Risk: P2
▪ Mild Risk: P3
▪ High Risk: P4
4. Classification Logic:
o The logic selects the class with the highest probability.
o Example: If the probabilities are [0.1, 0.6, 0.2, 0.1], the classification result is Low
Risk.
5. Final Risk Classification:
o The final output of the system, which could be displayed to the user as No Risk, Low
Risk, Mild Risk, or High Risk.
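The classification logic in step 4 reduces to an argmax over the output probabilities; a minimal sketch using the example values from the text:

import numpy as np

RISK_LEVELS = ["No Risk", "Low Risk", "Mild Risk", "High Risk"]
probs = np.array([0.1, 0.6, 0.2, 0.1])        # P1..P4 from the model
print(RISK_LEVELS[int(np.argmax(probs))])     # -> "Low Risk"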
CHAPTER 5

IMPLEMENTATION

5.1 DJANGO

Django is a high-level Python web framework that promotes rapid development and clean,
pragmatic design. It’s one of the most popular frameworks for building web applications
because it helps developers efficiently create robust, scalable, and secure websites.

Key Features of Django:

MVC Framework: Django follows the Model-View-Controller (MVC) pattern, though it refers to it as MTV (Model-Template-View):

Model: Manages the data and database structure.

Template: Handles the presentation layer (HTML).

View: Contains the business logic and interacts with the model and template.

Built-in Admin Interface: Django provides an automatic and customizable admin interface for managing the website’s data.

ORM (Object-Relational Mapping): Django includes a powerful ORM to interact with the database. Instead of writing raw SQL queries, you can manipulate the database using Python code.

Security: Django includes built-in protection against common security threats, such as SQL
injection, cross-site scripting (XSS), cross-site request forgery (CSRF), and clickjacking.

Scalability: Django is designed to handle high traffic, making it suitable for large-scale
applications.

Batteries-Included: Django comes with many built-in features like user authentication, session management, caching, and more, so developers don’t have to reinvent the wheel.
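To make the MTV pattern and the ORM concrete, here is a minimal sketch; the model and template names (RetinalReport, reports/list.html) are illustrative, not the project's actual code.

# models.py -- the Model layer: one row per submitted retinal report
from django.db import models

class RetinalReport(models.Model):
    patient_name = models.CharField(max_length=100)
    image = models.ImageField(upload_to="reports/")
    uploaded_at = models.DateTimeField(auto_now_add=True)

# views.py -- the View layer: business logic that feeds a Template
from django.shortcuts import render
from .models import RetinalReport

def report_list(request):
    reports = RetinalReport.objects.order_by("-uploaded_at")  # ORM query, no raw SQL
    return render(request, "reports/list.html", {"reports": reports})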
Fig 5.1: Django

5.1.1 FRONTEND

The frontend of the application is built using Django, which handles the presentation layer, user interface, and data interaction. Key components of the frontend include user authentication, role-specific dashboards, and data submission.

1. User Authentication:

• Django Authentication System: Django’s built-in authentication system provides a secure method for user login and signup. It uses sessions to manage user authentication, ensuring that each user is identified correctly.

• Signup and Login: The signup and login forms are created using Django forms. Users
can register by providing basic details, and after successful registration, they are
redirected to their respective dashboards.

• Role-Based Authentication: Based on the user’s role (Doctor, Scientist, or Patient), they
are assigned a role-specific dashboard after login. This is achieved through Django’s
Group and Permission system, ensuring that users only access data and features relevant
to their roles.
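A sketch of how such role-based redirection might be written, assuming users are assigned to Django Groups named "doctor", "scientist", and "patient", and that the dashboard URL names exist in urls.py (both assumptions for illustration):

from django.contrib.auth import authenticate, login
from django.shortcuts import redirect, render

def login_view(request):
    if request.method == "POST":
        user = authenticate(request,
                            username=request.POST["username"],
                            password=request.POST["password"])
        if user is not None:
            login(request, user)                       # start the session
            if user.groups.filter(name="doctor").exists():
                return redirect("doctor_dashboard")    # hypothetical URL names
            if user.groups.filter(name="scientist").exists():
                return redirect("scientist_dashboard")
            return redirect("patient_dashboard")
    return render(request, "login.html")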

2. Dashboard:

• Role-Specific Dashboards: The application’s dashboard is dynamic, changing its content based on the user's role.

• Doctor Dashboard: Displays a list of patients, their medical reports, and prediction results. Doctors can add, update, or view patient information and manage appointments.


• Patient Dashboard: Patients can view their personal health data, submitted medical
reports, and any predictions made by the system. They can also upload new reports for
further analysis.

• Scientist Dashboard: Scientists have access to anonymized patient data for research
purposes, allowing them to analyze trends and run predictive models for healthcare
research.

• Data Display: The dashboard displays key metrics, trends, and predictions in a user-
friendly format, such as tables, charts, and graphs, making it easier for users to interpret
the data.

3. Data Submission:

• Medical Reports: Patients can submit medical reports directly through the dashboard.
The submission form allows patients to upload files, such as lab reports or diagnostic
tests, in a structured format.

• Data Validation: Django’s form validation ensures that the data submitted by patients is
accurate and complete. Upon successful submission, the reports are stored in the
database and linked to the corresponding patient record.

• File Handling: The reports are stored as files in the backend, with file paths saved in the
database. Django handles the file management efficiently, ensuring easy access and
retrieval when required.
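A minimal sketch of this submission flow, reusing the hypothetical RetinalReport model from the earlier sketch; Django's ModelForm supplies the validation described above:

from django import forms
from django.shortcuts import redirect, render
from .models import RetinalReport   # hypothetical model sketched earlier

class ReportForm(forms.ModelForm):
    class Meta:
        model = RetinalReport
        fields = ["patient_name", "image"]

def upload_report(request):
    form = ReportForm(request.POST or None, request.FILES or None)
    if request.method == "POST" and form.is_valid():   # validation step
        form.save()            # file written to storage, path saved in the DB
        return redirect("patient_dashboard")
    return render(request, "upload.html", {"form": form})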

5.1.2 BACKEND

The backend of the system is responsible for data processing, integrating machine learning
models for predictions, and managing communication between the frontend and Streamlit
for real-time interactions.

1. Data Processing:

• Handling User Data: Upon receiving data from the frontend, Django processes the
information before storing it in the database. This includes ensuring that the data entered
by users (e.g., medical reports, patient history) is correctly formatted and validated.


• Integration with Machine Learning Models: Django integrates machine learning models
to make predictions based on the patient’s medical data. When a patient submits a report,
the system processes the data and runs it through pre-trained models to generate
predictions, such as the likelihood of a disease, possible health risks, or treatment
recommendations.

• Prediction Flow: Once predictions are made, they are stored in the database and
displayed to the user on their dashboard. This helps doctors, patients, and scientists
make informed decisions based on real-time analysis.

2. Integration with Streamlit:

• Data Communication: Django communicates with Streamlit through API calls, ensuring
that the system can pass patient data to Streamlit for real-time predictions and
visualizations.

• Real-Time Predictions: When a patient submits a medical report, the backend processes the data, sends it to Streamlit for prediction, and then displays the results in the frontend. This allows for a seamless flow of data between the backend and the Streamlit app, providing instant predictions and visualizations.

• Streamlit API Integration: Streamlit is embedded within the Django application, where it
is accessed through an iframe or URL redirection. This ensures that the predictions and
visualizations provided by Streamlit are easily accessible within the user interface.

5.2 STREAMLIT INTEGRATION

Streamlit serves as the core tool for real-time predictions and interactive data visualizations. It makes machine learning models and predictions accessible to users, offering a simple and intuitive interface.


Fig 5.2: Streamlit

1. User-Friendly Interface:

• Interactive Dashboards: Streamlit is used to create intuitive, interactive dashboards that allow users to engage with predictions and data visualizations. The dashboards can display various forms of data, such as tables, graphs, and charts, in real time.

• Real-Time Interactions: Streamlit’s interactivity allows users to adjust input parameters dynamically (e.g., age, gender, medical history) and instantly see how these changes affect predictions or visualizations. This feature is particularly useful for doctors and patients who want to explore how different factors influence health outcomes.

2. Real-Time Visualizations:

• Prediction Results: The real-time predictions generated by machine learning models are displayed in the form of charts, graphs, and statistical analyses. These visualizations help doctors and patients understand the implications of the predictions and how they can take appropriate action.

• Health Data Trends: Streamlit allows users to visualize trends in health data, such as the
progression of a disease or the effects of treatments over time. These visualizations are
crucial for both doctors and scientists to track patient health and make informed
decisions.

3. Machine Learning Models:

• Model Deployment: Streamlit is used to deploy machine learning models that analyze patient data and provide predictions in real time. The models may include algorithms for disease prediction, risk analysis, or treatment recommendations.

• Instant Predictions: When new data is entered or when a patient submits a medical
report, Streamlit processes this data using the deployed machine learning models and
instantly generates predictions. These predictions are then displayed to the user via the
interactive dashboard.

4. Real-Time Data Updates:

• Instant Results: As the patient submits new data or modifies existing information,
Streamlit updates the visualizations and predictions in real-time. This ensures that
healthcare professionals can access the most up-to-date information at any time.

• Charts and Graphs: Streamlit provides built-in support for rendering dynamic charts and
graphs (e.g., line charts, bar graphs, and pie charts), making it easier for users to interpret
complex medical data visually.
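Putting these pieces together, a minimal Streamlit dashboard might look like the sketch below; the model file (risk_model.pkl) and the three input features are assumptions for illustration. It would be launched with: streamlit run streamlit_app.py

# streamlit_app.py -- minimal interactive prediction dashboard (sketch)
import joblib
import numpy as np
import streamlit as st

st.title("Heart Attack Risk Prediction")

age = st.slider("Age", 18, 90, 45)              # inputs re-run the script on change
sbp = st.slider("Systolic BP", 80, 220, 120)
bmi = st.slider("BMI", 12.0, 45.0, 24.0)

model = joblib.load("risk_model.pkl")           # hypothetical pre-trained model
prob = float(model.predict_proba(np.array([[age, sbp, bmi]]))[0, 1])

st.metric("Predicted risk", f"{prob:.0%}")      # updates instantly on interaction
st.progress(prob)                               # simple visual risk gauge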

5.3 GPT-2 MODEL

The GPT-2 (Generative Pre-trained Transformer 2) model, developed by OpenAI, is one of the most influential language models in the field of natural language processing (NLP). Released in 2019, it quickly garnered attention due to its remarkable ability to generate human-like text based on a given input prompt. GPT-2 is the successor to the original GPT model and is based on the Transformer architecture, which revolutionized the field of NLP when it was introduced in 2017 by Vaswani et al.

Transformer Architecture

At the heart of GPT-2 is the Transformer architecture, which leverages self-attention mechanisms to process sequential data efficiently. Traditional models like Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks processed text sequentially, making them computationally expensive for longer texts. In contrast, the Transformer allows for parallel processing of the entire input sequence, significantly speeding up training times and improving performance.

The Transformer consists of two main components: the encoder and the decoder. GPT-2,
however, uses only the decoder portion of the Transformer, which is optimized for
generating sequences of text.

Pre-training and Fine-tuning

GPT-2 follows a two-step training process: pre-training and fine-tuning.

• Pre-training: In this phase, the model is trained on a massive corpus of text data using an unsupervised learning approach. During pre-training, GPT-2 learns to predict the next word in a sentence given the previous words, thereby learning grammar, facts about the world, and some level of reasoning. The pre-training dataset for GPT-2 includes a wide variety of internet text sources, such as articles, books, and websites. This diverse training data enables GPT-2 to generate coherent text across various domains.

• Fine-tuning: While the pre-trained model is capable of generating impressive text, it is often fine-tuned on domain-specific datasets to improve performance on particular tasks, such as summarization, question answering, or dialogue generation. Fine-tuning is typically supervised, where the model is trained on a smaller, more specific dataset with labeled examples.

Model Sizes and Parameters


GPT-2 is available in several versions, each with a different number of parameters, ranging
from 117 million to 1.5 billion parameters. These parameters refer to the weights in the
neural network that the model learns during training. The larger the model, the more
complex patterns and relationships it can capture, enabling it to generate more coherent and
contextually appropriate text. However, larger models require more computational
resources for both training and inference.

The largest GPT-2 model, with 1.5 billion parameters, is capable of generating high-quality
text that is often indistinguishable from text written by humans. It can handle a wide range
of NLP tasks, including writing essays, generating code, creating poetry, and answering
questions.
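As a brief illustration (not part of the project's own prediction pipeline), the smallest GPT-2 checkpoint can be exercised in a few lines with the Hugging Face transformers library; the prompt text is arbitrary:

from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")   # 117M-parameter GPT-2
out = generator("Retinal imaging can reveal", max_new_tokens=30)
print(out[0]["generated_text"])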

Key Features of GPT-2

1. Contextual Understanding: GPT-2 excels at understanding context and generating text that is coherent over long passages. It can take a prompt and generate a continuation that aligns with the style and content of the input, making it useful for a wide range of creative applications, such as storytelling or content creation.

2. Text Generation: One of GPT-2’s most notable features is its ability to generate human-like text based on a given prompt. Given an initial seed, GPT-2 generates a continuation that fits naturally with the prompt, showing a high level of fluency and creativity.

3. Zero-shot Learning: GPT-2 demonstrates an ability to perform tasks without explicit training on task-specific data. For example, it can answer questions, summarize text, or translate between languages simply by being provided with a prompt that describes the task. This is known as zero-shot learning.

4. Scalability: The performance of GPT-2 improves as the size of the model increases.
Larger models, with more parameters, can handle more complex language tasks and
generate more sophisticated outputs.

Applications

GPT-2 has been used in various applications, such as:

• Content Creation: GPT-2 is widely used in automated content generation for articles, blog posts, and even creative writing.

• Chatbots: It powers conversational agents capable of having realistic and engaging dialogues with users.

• Code Generation: GPT-2 has been used to generate code snippets based on user input, assisting in programming tasks.

• Text Summarization: GPT-2 can summarize long documents, retaining key information and presenting it concisely.

Ethical Considerations

While GPT-2 is powerful, it raises important ethical concerns. One significant concern is the potential for misuse in generating misleading or harmful content, such as fake news or deepfakes. OpenAI initially withheld the release of the largest version of GPT-2, citing these concerns. Additionally, GPT-2’s ability to generate text with little oversight means that it could be used to create harmful content on a large scale, leading to issues related to misinformation, bias, and fairness.
The system design for the heart attack risk prediction model emphasizes the importance of
using advanced machine learning techniques to analyze diverse patient data and predict heart
attack risks. By considering factors such as medical history, age, gender, blood pressure,
cholesterol levels, smoking habits, and physical activity, the model leverages predictive
analytics to provide accurate risk assessments. The architecture ensures that data is securely
collected, processed, and analyzed, prioritizing patient confidentiality and compliance with
healthcare standards.
Furthermore, the design is highly modular, allowing for easy updates and integration with
new medical research or datasets. The use of APIs ensures seamless interaction with other
healthcare systems, facilitating quick access to real-time patient information. The application
also provides actionable insights through visualizations and personalized recommendations,
helping healthcare professionals make more informed decisions.



CHAPTER 6

TESTING

Testing and evaluation are critical steps in ensuring that the system functions as expected, meets
performance requirements, and provides a positive user experience. This chapter outlines the testing
strategies used to evaluate the functionality, performance, and usability of the healthcare
application.

6.1 FUNCTIONAL TESTING

Functional testing focuses on verifying that the application performs its intended functions correctly.
This includes testing the core features like user authentication, data submission, and prediction
accuracy.

1. User Authentication:

• Sign-Up and Login: Test the entire user authentication flow to ensure users can sign up with
valid credentials, log in successfully, and are directed to their respective dashboards based on
their roles (Doctor, Patient, or Scientist).

• Role-Based Access: Ensure that the application correctly assigns permissions and displays role-
specific content. This involves verifying that a user with the Doctor role can access doctor-
related functionalities, while a Patient can only access personal health data and prediction
results.

2. Data Submission:

• Medical Report Submission: Verify that the data submission feature works correctly. Patients
should be able to upload medical reports without errors. The reports should be validated for
completeness and accuracy before being stored in the database.

• Database Integrity: Test the system’s ability to store and retrieve patient data accurately. Ensure
that each medical report is linked to the correct patient and that all relevant data (such as patient
history, medical details, etc.) is stored appropriately.

3. Prediction Accuracy:

• Machine Learning Model Testing: Evaluate the performance and accuracy of the machine
learning models integrated within the Streamlit application. This includes assessing how well
the models predict health outcomes based on patient data and medical reports.

• Test Dataset: Use a test dataset with known outcomes to verify the accuracy of predictions.
Compare the predictions with actual outcomes and compute performance metrics such as
accuracy, precision, recall, and F1 score.

• Real-Time Predictions: Test the real-time prediction feature to ensure that predictions are
accurate and updated promptly when new data is submitted.
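A sketch of how these metrics would be computed offline with scikit-learn; the label vectors here are tiny placeholders standing in for the real test dataset:

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # known outcomes (placeholder)
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # model predictions (placeholder)

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("f1       :", f1_score(y_true, y_pred))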

Fig 6.1: Testing Model

6.2 PERFORMANCE TESTING

Performance testing ensures that the application can handle the required load and performs
efficiently, even under heavy usage conditions. This section evaluates the responsiveness,
scalability, and efficiency of the system.

1. Response Time:


• Prediction Response Time: Measure the system’s response time when a user submits data for
predictions. The system should be able to provide predictions in real-time or within an
acceptable timeframe.

• Load Time: Test the time it takes for the application to load user dashboards and display data.
Any delay in loading dashboards or visualizations could hinder user experience.

2. Scalability:

• User Load Testing: Simulate a large number of users accessing the system simultaneously.
Measure how well the application handles multiple users interacting with the system at the same
time. The system should scale to accommodate a growing user base without significant
degradation in performance.

• Data Volume Testing: Evaluate the system's ability to handle large volumes of data, such as
medical reports and predictions, especially when dealing with multiple patient records. The
database should perform well under heavy data loads, and the application should continue to
function smoothly.
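One way to script such a simulation is with a load-testing tool like Locust (an assumed tooling choice; the report does not name one). The endpoints below are hypothetical. Run with: locust -f loadtest.py

from locust import HttpUser, between, task

class DashboardUser(HttpUser):
    wait_time = between(1, 3)            # seconds between simulated actions

    @task
    def view_dashboard(self):
        self.client.get("/dashboard/")   # hypothetical dashboard endpoint

    @task
    def submit_report(self):
        self.client.get("/reports/")     # hypothetical reports endpoint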

3. Stress Testing:

• Peak Load: Simulate peak load conditions to test the system’s resilience. Test how the system
behaves under stress and if it can recover gracefully in case of high traffic or overload situations.

• Server Resources: Monitor server resources (CPU, memory, network bandwidth) during stress
tests to ensure the system does not exceed resource limits and crash during high traffic.

6.3 USABILITY TESTING

Usability testing assesses how easy and intuitive the system is for users to interact with. It involves
testing the application with real users from different roles (Doctors, Patients, Scientists) to gather
feedback and ensure that the system meets user expectations.

1. User Interface and Navigation:

• Ease of Use: Evaluate the user interface (UI) for intuitiveness and ease of navigation. Ensure
that users can easily understand how to perform tasks like signing up, logging in, submitting
reports, and viewing predictions.


• Role-Specific Dashboards: Test whether each role-specific dashboard (Doctor, Patient, Scientist) is easy to navigate and contains relevant information. The UI should be clean and organized, allowing users to focus on their tasks without confusion.

2. Feedback on Interactive Visualizations:

• Visualization Clarity: Collect feedback on the clarity and usefulness of the interactive
visualizations (charts, graphs, etc.) displayed in the Streamlit dashboards. Ensure that
predictions and trends are presented in a way that is easy for users to understand and act upon.

• Interactivity: Test the interactive features, such as adjusting input parameters to see real-time
prediction updates. Ensure that users can interact with the visualizations smoothly without
delays or errors.

3. User Feedback:

• Patient Feedback: Gather feedback from patients regarding the ease of submitting reports and
accessing their predictions. Patients should be able to navigate the application without difficulty
and find the information they need quickly.

• Doctor and Scientist Feedback: Doctors and scientists should provide feedback on how useful
the system is for patient analysis, including the accuracy of predictions and the clarity of
visualizations. Their feedback is essential in refining the system’s medical features.

• Satisfaction Survey: After conducting usability testing, administer a satisfaction survey to gauge
overall user experience. This can help identify areas for improvement, such as UI design,
functionality, or data presentation.

4. Accessibility Testing:

• Cross-Platform Testing: Ensure that the application works seamlessly on various devices and
platforms (e.g., desktop, mobile, and tablet) to cater to different user needs.

• Accessibility Features: Test the application’s accessibility features for users with disabilities,
such as providing alternative text for images, ensuring keyboard navigation works, and testing
color contrast for readability.

6.4 EVALUATION AND RECOMMENDATIONS


Once testing is completed, the next step is to evaluate the results based on the functional,
performance, and usability tests. The findings will be analyzed to identify areas of improvement,
such as:

• Optimizing Prediction Models: If the prediction accuracy is not up to the required standard, the
machine learning models can be fine-tuned or retrained using a larger dataset.

• Improving System Performance: If the system’s response time is slower than desired, it may
require optimization through better database indexing, query optimization, or caching strategies.

• Enhancing User Experience: If users encounter difficulties in navigating the system or understanding predictions, UI and UX improvements will be recommended to make the system more user-friendly.



CHAPTER 7

RESULTS AND SNAPSHOTS

This chapter presents the outcomes of the testing phase, followed by a detailed discussion on the
effectiveness of combining Django and Streamlit for the healthcare application. The focus is on
assessing the functionality, performance, and accuracy of the prediction models, as well as the
challenges faced during development.

The results section presents the findings from the functional, performance, and machine learning
model tests.

7.1 RESULT

1. Functionality of the Application:

• User Authentication: The authentication system performed as expected. All users (Doctors,
Patients, Scientists) were able to sign up, log in, and access their role-based dashboards without
any issues. The role-based access control worked seamlessly, ensuring that users could only
view and interact with relevant data.

• Data Submission: Patients were able to successfully upload medical reports to the system. The
submission feature was tested across various file formats, and the system accurately processed
and stored the data in the database without errors.

• Real-Time Predictions: Real-time predictions worked effectively. When medical reports were
submitted, the machine learning models provided predictions almost instantaneously through
Streamlit. The predictions were displayed on interactive visualizations, allowing users to
analyze the results in real time.

2. Performance Benchmarks:

• Response Time: The application’s response time for data submission and real-time prediction
was satisfactory. On average, it took about 3–5 seconds for predictions to be generated after data
submission, which is well within the expected range for such applications.

• Scalability: During load testing with a high number of simultaneous users, the system
demonstrated scalability. The application was able to handle 100+ concurrent users without
significant delays or performance degradation. Server resource usage was monitored, and the
application did not experience any memory or CPU spikes during peak loads.

• Stress Testing: Stress tests revealed that the system could handle a high volume of data
submissions (medical reports) without any performance degradation. However, as the volume
of data increased exponentially, the system showed slight slowdowns in data retrieval times,
indicating a need for further optimization in database queries and indexing.

3. Accuracy of the Prediction Models:

The accuracy of the machine learning models used for predictions was evaluated using a test dataset.
The models achieved an accuracy rate of 93%, which is considered satisfactory for a real-time
healthcare prediction system. The predictions were particularly accurate in diagnosing common
conditions but showed some variability for more complex cases.

• Precision and Recall: For disease prediction tasks, the models demonstrated high precision and recall (93% and 100%, respectively), which means that the system was good at predicting positive cases and minimizing false negatives.

• Real-Time Prediction Evaluation: In real-time, predictions were generated based on the data
submitted by users. The results were consistent with offline testing, where the models produced
reliable predictions for disease diagnosis and patient health analysis.

7.2 DISCUSSION

This section discusses the effectiveness of combining Django and Streamlit in this healthcare
application and highlights some of the challenges encountered during development.

1. Effectiveness of Combining Django and Streamlit:

• Seamless Integration: Combining Django for backend and frontend functionalities with
Streamlit for real-time predictions proved to be highly effective. Django handled user
authentication, data submission, and database management efficiently, while Streamlit provided
a user-friendly interface for displaying real-time predictions and visualizations. This separation
of concerns allowed for easier management of both the application’s backend and its interactive
frontend.

• Real-Time Data Interaction: Streamlit's ability to display real-time predictions and interactive visualizations added significant value to the application. Doctors and scientists could analyze
patient reports in real time, make quick decisions, and provide timely medical interventions. The
integration of machine learning models in Streamlit allowed for instant feedback based on the
submitted medical data, which improved the overall user experience.

• User Experience: The role-specific dashboards (for Doctors, Patients, and Scientists) allowed
users to focus on their respective tasks and access relevant data. Streamlit’s interactive
visualizations helped doctors and scientists easily interpret predictions and spot potential issues.
Patients were also able to view their health predictions and access their medical reports with
minimal navigation.

2. Challenges Faced:

• Real-Time Predictions with Streamlit: One of the main challenges was integrating real-time
predictions within the Streamlit interface. While Streamlit is designed for quick and easy
deployment of machine learning models, ensuring the predictions were updated in real-time as
users interacted with the system required careful handling of data flow between Django and
Streamlit. This involved establishing an efficient data pipeline to send data from Django to
Streamlit without introducing latency.

• Handling Large Datasets: Another challenge encountered during the development phase was
handling large datasets. As the number of medical reports and patient records grew, the system
faced slowdowns in retrieving data from the database and generating predictions. This issue was
mitigated by optimizing database queries and indexing, but it highlighted the need for future
improvements in data storage and retrieval strategies to handle large volumes of healthcare data.

• Model Performance on Complex Cases: While the machine learning models performed well for
common diseases, they showed some inconsistencies when predicting more complex medical
conditions. This is a known issue in healthcare prediction systems, as data quality and model
training play a significant role in performance. More diverse training datasets and model tuning
would be required to enhance the accuracy of predictions for complex or rare conditions.

• User Interface Improvements: Although the system provided role-specific dashboards, feedback
from user testing indicated that some users found the interface slightly cluttered, especially when
viewing predictions. Future iterations of the system will focus on improving the layout and
simplifying the visual elements to make it more user-friendly.

3. Future Improvements:


• Improving Prediction Models: One of the primary future goals is to improve the accuracy of the
machine learning models, particularly for complex medical conditions. This can be achieved by
incorporating more diverse datasets, experimenting with different algorithms, and conducting
further model training.

• Enhanced Scalability: To address the performance issues related to large datasets, the system
will benefit from improved database architecture, such as introducing sharding or partitioning
strategies for large patient records and reports. Additionally, optimizing the deployment
environment for handling higher traffic will ensure the system remains performant as the number
of users increases.

• AI-based Assistance: Future versions of the system could incorporate AI-based assistance to help doctors and patients interpret the predictions. A language model such as GPT-2 (discussed in Chapter 5) could be used to generate reports and explanations for the predictions, making the results more comprehensible and actionable.

7.3 SNAPSHOTS

Fig 7.1: Home Page

Fig 7.1 shows the home page of the Heart Attack Risk Prediction System, which serves as an introduction to the project, displaying a concise abstract that outlines its purpose and functionality. It provides users with a brief overview of how the system predicts the likelihood of a heart attack based on various medical factors. This page is designed to offer an easy-to-understand summary of the project's goals.


Fig 7.2: Admin Login page

Fig 7.2 shows the Admin Login Page, a secure gateway for administrators to access and manage the Heart Attack Risk Prediction System. It ensures that only authorized personnel can view, update, and maintain the system's data, including user records and system configurations.

Fig 7.3: Image Upload Page

Fig 7.3 shows the Retinal Image Upload Page, which enables users to upload retinal images for analysis as part of the heart attack risk prediction process. This page provides a simple and intuitive interface, ensuring a seamless upload experience.


Fig 7.4: Image Clustering Page

Fig 7.4 shows the Image Clustering functionality as demonstrated in the command-line interface (CMD), showcasing the system's ability to group similar retinal images based on extracted features. Using clustering algorithms such as K-Means or hierarchical clustering, this step helps in organizing and analyzing image data efficiently; a minimal sketch of this step follows.
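The sketch below groups per-image feature vectors with scikit-learn's K-Means; the feature values are randomly generated placeholders standing in for the real extracted retinal features.

import numpy as np
from sklearn.cluster import KMeans

features = np.random.default_rng(0).normal(size=(60, 16))   # one row per image
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(features)
print(kmeans.labels_[:10])   # cluster index assigned to each image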

Fig 7.5: Clustered Image Page

Fig 7.5 shows the Clustered Image Page, which visually represents the results of the image clustering process, displaying retinal images grouped into their respective clusters. Each cluster is organized based on shared features, making it easier to identify patterns and correlations among the data.


Fig 7.6: Risk Prediction Page

Fig 7.6 shows the Risk Prediction Page, which displays the outcomes of the heart attack risk prediction model. Users can view the predicted risk level based on the uploaded retinal image and associated data. The page provides clear and concise results, ensuring that users can easily understand the prediction and take necessary actions or seek further medical advice based on the outcome.

Fig 7.7: Test Image Upload Page

Fig 7.7 shows the Test Image Upload Page, which allows users to upload retinal images for testing the heart attack risk prediction model. This page ensures a user-friendly experience with straightforward options for selecting and uploading images. It serves as a crucial step in the workflow, facilitating accurate predictions based on the uploaded medical data.


Fig 7.8: Risk Prediction Result Page

Fig 7.8 shows the Risk Prediction Result Page, which displays the outcome of the heart attack risk analysis based on the uploaded retinal image. This page provides users with clear and concise results, highlighting the risk level with detailed insights. It ensures that users receive actionable information, aiding in better understanding and potential decision-making regarding their health.

Fig 7.9: Standard Conditions Page

Fig 7.9 shows the Standard Normal Conditions Page for a Healthy Heart, which presents baseline parameters and guidelines that define a healthy cardiovascular state. It serves as a reference for users to compare their prediction results with normal standards. This page promotes awareness of ideal health metrics, empowering users to take proactive measures towards maintaining a healthy heart.


Fig 7.10: Accuracy Curve Page

Fig 7.10 shows the Accuracy Curve Page for test and training images, which showcases the performance metrics of the heart attack risk prediction model. It includes visual representations, such as graphs or tables, highlighting the accuracy achieved during the model's training and testing phases. This page emphasizes the model's reliability and effectiveness in analyzing and predicting heart health based on retinal images.

The snapshots of the Heart Attack Risk Prediction project visually capture the system's key
functionalities and flow. The Home page presents an overview of the project's objective, providing
users with a concise introduction to the application. The Admin Login page showcases the secure
authentication process, ensuring that only authorized personnel can access sensitive system features.
The Retinal Image Upload page allows users to upload images for analysis, demonstrating the
system's capability to process medical data seamlessly. The Image Clustering in CMD snapshot
highlights the backend processes, where retinal images are clustered for improved data segmentation
and feature extraction. The Clustered Image page displays how the system organizes images into
meaningful clusters to assist in prediction accuracy. The Prediction page serves as the interface for
users to interact with the model and view predictions, while the Risk Prediction Result page clearly
displays the outcome, giving users an understanding of their heart attack risk. The Standard Normal
Conditions page shows the baseline for a healthy heart, offering a comparison point for the
prediction results. The Accuracy for Test and Training Image page demonstrates the system's
effectiveness in both training and testing phases, validating its performance.



CONCLUSION

To conclude, the Heart Attack Risk Prediction System showcases the potential of artificial
intelligence and machine learning in revolutionizing healthcare. By utilizing retinal images for risk
analysis, the system offers a novel, non-invasive method for predicting heart attack risks. The
application of image clustering techniques, coupled with a user-friendly interface for test uploads
and results interpretation, makes it both practical and accessible. Through training with a
comprehensive dataset, the system achieves high accuracy, ensuring reliable predictions. This
project highlights the importance of integrating advanced technologies with healthcare, paving the
way for more proactive and personalized health monitoring. The system's effectiveness and
efficiency could serve as a stepping stone for further developments in predictive healthcare.
