HD Paper
ABSTRACT: In recent years, heart disease has become steadily more common, and the number of deaths attributed to it is rising rapidly. Predicting the disease early is therefore very important. The primary concern is to forecast it quickly and accurately, which demands early detection and prevention. Many attributes contribute to heart disease, and the disease can be predicted from a patient's medical history. This paper provides a literature survey of the applications of machine learning algorithms that can detect patterns and trends in data originating from various domains. It focuses on the existing machine learning methods used to predict heart disease and on the accuracy obtained with each algorithm. Insights from this survey can guide researchers and practitioners in understanding the current landscape and future directions of CAD prediction. The paper consolidates existing knowledge on heart disease prediction and will be a valuable resource for those developing heart disease risk assessment.

Keywords: Heart Disease, CAD Prediction, Pattern Recognition, LR, NB, KNN, DT, SVM Application Algorithm.

I. INTRODUCTION

Heart disease is a significant global health concern and is responsible for numerous deaths worldwide. Prevention depends on two aspects of the prediction process: time and accuracy. The earlier and more accurately the disease is predicted, the easier it is for clinicians to save a life. Heart disease remains a substantial burden on the health care system, and innovative approaches to prediction are required. In recent years, machine learning algorithms have emerged as a promising approach.

Machine learning algorithms learn patterns and relationships within large datasets, which helps to identify risk factors and to predict the probability of cardiovascular events. With the advancement of electronic health records, machine learning offers data-driven methods that improve the understanding of cardiovascular risk factors. Machine learning can analyze varied datasets and provides more accurate results when trained with proper datasets. It is a subfield of AI and can also identify risk factors in the given datasets. The use of technology in the health care sector holds great power in enhancing early detection of the disease. Machine learning plays a vital role in the prediction of heart disease by offering several benefits to the healthcare sector. A few of its most pivotal roles in heart disease prediction are:

Data Analysis and Pattern Recognition: Machine learning algorithms are well suited to analyzing large and complex datasets, identifying hidden patterns, and finding correlations among patient information such as demographics, medical history, and lifestyle factors.
Personalized Identification: Machine learning allows personalized prediction by considering each individual's unique health profile, enabling accurate and targeted identification of risk factors.
Risk Categorization: Machine learning models can classify individuals into different risk categories by considering the various risk factors, which enables reliable identification of individuals at higher risk.
Primary detection: Machine learning algorithms can detect early signs of heart disease before clinical symptoms manifest. This early detection is important for implementing precautionary measures.

The following steps are carried out when predicting heart disease using machine learning algorithms:

Data collection: Gather data about the individual, such as medical history, lifestyle factors, demographics, and diagnostic test results.
Preprocessing: Clean the dataset by handling missing values and outliers to ensure consistency.
Feature extraction: Identify or create features that contribute more to the prediction process.
Training the model: Labeled datasets are used to train the machine learning models. Common algorithms for training include logistic regression, decision trees, and neural networks.
Validation: To check the performance of the model, it is run on a separate dataset that was not used during training.
Testing and evaluation: The model is tested and evaluated to measure its accuracy, sensitivity, and specificity.
Prediction: Once trained, the model predicts heart disease in individuals based on the inputs.
Continuous learning: Updating the model periodically with new data lets it adapt to evolving patterns.

Types of machine learning algorithms:
There are four main types of machine learning algorithms:
1. Supervised learning: The algorithm is trained with labeled datasets in which each input has a corresponding output.
Example: Linear regression, decision trees, support vector machines, and neural networks.
2. Unsupervised learning: The algorithm is given unlabeled data and finds patterns without any guidance by exploring the data's inherent structure.
Example: K-means clustering, hierarchical clustering, principal component analysis (PCA).
3. Reinforcement learning: The algorithm learns by interacting with the surrounding environment and uses feedback to learn through trial and error.
Example: Q-learning, deep Q-network (DQN), policy gradient methods.
4. Semi-supervised learning: It contains elements of both supervised and unsupervised learning.

II. RELATED WORK

1. The WHO reported that in 2019 an estimated 17.9 million people died from cardiovascular disease, the leading cause of death worldwide. This represents 32% of global deaths. Of these deaths, 85% were due to heart attack and stroke. Of the 17 million premature deaths (under the age of 70) due to noncommunicable diseases in 2019, 38% were caused by cardiovascular disease.

2. To reduce fatal medical errors and increase patient safety, a computer-based diagnosis system is needed in the medical sector. Medical diagnosis uses different
classification techniques and tools for decision support. Q. K. Al-Shayea classified each patient into one of two categories, infected and non-infected, using an artificial neural network. The proposed solution is an ANN-based decision support system that identifies mainly three heart diseases: mitral stenosis, aortic stenosis, and ventricular septal defect. The study was carried out on a dataset obtained from the UCI ML repository. The classifier obtained 95% in simulation using a feedforward backpropagation network. The compared result reveals that the system classifies heart disease with an accuracy of 92%.

3. Milan Kumari et al. employed data mining classification methods including LR, NB, KNN, DT, SVM, classification, clustering, NN, and RF. For CVD prediction, the SPM, ANN, DT, RT, and PCR algorithms are employed. The results revealed that the SVM predicts CVD with the lowest error rate and the highest accuracy.

4. Neonatal disorders include prematurity, respiratory dysfunction, birth trauma, congenital malformation, neonatal infection, and hemolytic disorders in newborns. Dilip Roy Chowdhury et al. proposed a pattern recognition approach for diagnosing and predicting neonatal disorders in newborns. An ANN model was employed to diagnose neonatal disorders. The resulting system achieves 75% accuracy with stable predictions.

5. N. Al-milli developed a heart disease prediction system based on a neural network. The aim was an intelligent decision support system that can improve the ability of physicians in medical diagnosis. N. Al-milli states that neural networks can be used widely for predicting heart diseases. Doctors predict heart disease using knowledge and experience, but discovering and predicting the disease manually is a difficult task. The paper states that prediction can be done effectively and systematically by training the system with appropriate datasets.

6. Shinde R et al. implemented a prediction system that combines NB and the k-means clustering algorithm to predict heart disease. The proposed system uses the k-means algorithm to group the attributes and the NB algorithm for prediction.
The data mining steps involved in the prediction process are data cleaning, data integration, data selection, data transformation, pattern discovery, and knowledge presentation.
Data segmentation, also known as clustering, groups data objects of the same type. The algorithm works as follows:
1. Select k centers from the data (random selection)
2. Divide the data into k cluster groups
3. Calculate the mean of each of the k clusters to find the new centers
4. Repeat steps 2 and 3 until the centers do not change
Naïve Bayes is a supervised learning method that requires only a small amount of data for training. It is based on Bayes' theorem:
P(A|B) = P(B|A) P(A) / P(B)    (1)

7. Machine learning is a tool that can extract information, identify patterns, and make predictions once the system is trained. Z. Ge et al. review state-of-the-art data mining with eight unsupervised learning algorithms, ten supervised learning algorithms, and semi-supervised learning algorithms.
Reinforcement learning algorithms are used in robotics, gaming, and navigation. Z. Ge et al. have two aims: first, to provide a systematic review of data mining and analytics
in the process industry. The second aim is to provide a new scope for the industry.

Unsupervised Learning methods:
This learning method explores data and finds the hidden structure among the data. The learning methods used in the industry include principal component analysis, independent component analysis, k-means clustering, kernel density estimation, self-organizing map, Gaussian mixture model, manifold learning, and support vector data description.

Supervised Learning Method:
It uses labeled data samples that are either discrete or continuous. Applications include monitoring, fault classification and identification, and online operating mode. The methods include principal component regression, partial least squares, linear regression, neural network, support vector machine, nearest neighbor, decision tree, and random forest.

Semi-supervised Learning Method:
Includes generative model-based methods, self-training, and co-training methods.

8. A. Jagtap et al. state that early diagnosis of heart diseases is crucial in saving lives, but that predicting heart disease early is very complex. The objective is to use ML to enhance the reliability and simplicity of heart disease prediction. KNN, GB, RF, NB, DP, and LR algorithms were tested. Of these, LR consistently achieved accuracies of 91.6% and 90.8% in most of the tests. Overall, however, RF outperformed the others on all the datasets, achieving 98.6% accuracy.

9. A. K. Dwivedi et al. address the issue of CVD, arguing that steps must be taken to prevent and manage the disease effectively by spreading awareness, supported by machine-based research. Machine learning algorithms can be used effectively to identify the disease earlier by training the system with proper datasets. Algorithms have so far been developed to automatically classify and analyze heart sounds. The article reviews state-of-the-art algorithms, each with its own advantages and limitations, for identifying and classifying heart conditions.
The study describes advanced algorithms that are currently used for heart sound analysis and classification. SVM is one such approach. It classifies heart valve disease and also distinguishes innocent murmurs from other sounds, outperforming ANN. Neural networks require larger training datasets and more computational power than SVM. The KNN algorithm is used to classify heart sounds as normal or abnormal, but it demands a large amount of memory for processing.

10. S. I. Ansarullah et al. explain that identifying and examining heart diseases manually is prone to bias, whereas ML algorithms are efficient and reliable in detecting and categorizing patients suffering from heart diseases. To achieve this, nine machine learning classifiers are used: NB, LR, ET, MNB, CART, SVM, LDA, RF, and XGB.
Accuracy = (TP + TN) / (TP + TN + FN + FP)
where TP, TN, FN, and FP denote true positives, true negatives, false negatives, and false positives.
A standard k-fold cross-validation technique was used to train and evaluate the ML algorithms. The results indicated an improvement in the accuracy of the prediction classifiers. The dataset was processed three times to eliminate discrepancies and missing values. The results made clear that SVM combined with NB increased accuracy to 87.91%. In general, SVM achieved 96.72%. The authors recommend XGBoost for heart disease prediction in children.

11. A. K. Dwivedi et al. believe that the clinical information obtained by tracking heart
sounds can assist in diagnosing heart disease. They provide an in-depth review of existing approaches for the identification and classification of heart sounds. A. K. Dwivedi et al. conclude that extracting and analyzing the signals is a challenging task. The steps involved in the automated analysis and classification of heart sounds are:
Feature Extraction and Selection
Classification of Heart Sounds
Fig. 1 Classification based on Heart Sounds Flowchart

11. I. D. Mienye et al. partitioned a dataset into smaller subsets using a mean-based splitting approach. To predict the risk of heart disease with an improved ML method, the partitions were then modeled using CART. A homogeneous ensemble is formed using an accuracy-based weighted aging classifier ensemble (WAE). The results show accuracies of 93% and 91% on the Cleveland and Framingham datasets, respectively.

12. Shah et al. state that heart disease is the leading cause of 12 million deaths worldwide every year. Machine learning algorithms are used in the healthcare sector to identify anomalies in physical movement, heart ailments, and other medical conditions. Decision Tree, Naïve Bayes, Linear Regression, Support Vector Machine, and Random Forest classifiers are used in this prediction scenario, resulting in better accuracy.

13. Valle Harsha Vardhan et al. explain that the lack of accuracy of instruments in the healthcare sector has led to the late recognition of diseases, resulting in an increase in the death rate of patients. They state that machine learning is a vital tool in the healthcare industry for identifying heart disease. Machine learning, a branch of AI, predicts events after being trained on natural events; effective training and testing helps in the development of machine intelligence. When properly trained to process data, machine learning uses data effectively and produces efficient output.

14. Sameh Ghwanmeh et al. explain that an unconventional computer-based diagnosis system can produce fatal medical errors. They proposed a solution based on ANN that provides a decision support system for the identification of heart diseases. This system encourages the development of an operational screening and testing device for heart disease diagnosis that acts as great
assistance for clinicians. A series of experiments was conducted on medical data to examine performance and accuracy, and revealed that the system achieves 92% classification performance. SVM and feedforward backpropagation methods were applied to data from 300 patients, of which 250 were used for training and 50 for evaluation. It was concluded that with two neural network techniques the output obtained was 50% to 60%, which is not reliable for the prediction of heart disease.

15. Even though there are several technologies and algorithms to predict medical diagnoses, the main aim is to identify the algorithm that predicts most accurately. The method proposed by Harshit Jindal et al. identifies medical issues in an individual using KNN and LR algorithms. The techniques showed good accuracy when functioning under NB. The implementation is based on three data mining techniques: 1. LR 2. KNN 3. RF. With proper functioning, this technique produced an accuracy of 87.5% for KNN and an accuracy of 85.2%. The result is that KNN performs better than the other two techniques.

16. Shah Samkitkumar Rajnikant et al. state that machine learning algorithms can accurately predict cardiovascular disease. The technique preprocesses the datasets by removing noise, removing missing data, replacing default values, and classifying attributes for the decision-making process. For the prediction of the disease, SVM, GB, RF, NB, and LR are applied to the collected data. With the hybrid RF and linear model, the prediction performance for CAD increased from 30% to 90%. In addition, the device can also predict asthma and heart disease, which is an early detection symptom of CAD. The results show clearly that Decision Tree forecast better than the other methods.

17. Galla Siva Sai Bindhika et al. propose an enhanced prediction model that produced an accuracy of 92% with a hybrid random forest and linear model. Random forest is used in this implementation to identify the age group of the patients, and Python and Pandas are used to perform heart disease prediction on data from the UCI repository. RF gives the best accuracy. A Decision Tree is built as a top-down recursive model based on the divide and conquer (DAC) approach. A Random Forest is built from several decision trees and combines all of them to produce the best result. The Support Vector Machine acts as a linear support vector machine that identifies the optimal hyperplane.

18. Though several machine learning techniques are available to predict heart disease, Safial Islam Ayon et al. compare seven techniques for intelligent computation: logistic regression, random forest, Naïve Bayes, support vector machine, deep neural network, k-nearest neighbor, and decision tree. The Statlog and Cleveland heart disease datasets were used for the evaluation, and the deep neural network obtained 98.15% accuracy.

19. Bhatt et al. focus on the development of a machine learning model that can accurately predict cardiovascular disease. The proposed method uses k-modes clustering and models such as multilayer perceptron, decision tree, random forest, and XGBoost. The evaluation was carried out on a real-world dataset of about 70,000 instances and achieved a high rate of accuracy. Finally, the multilayer perceptron with cross-validation outperformed all other algorithms, achieving the highest accuracy of 87.28%.
III. DATA ANALYSIS COMPARISON

A COMPARATIVE ANALYSIS OF CLASSIFICATION TECHNIQUES IN HEART DISEASE PREDICTION

Many researchers have studied the prediction of heart disease using machine learning algorithms with different datasets, different attributes, and different algorithmic techniques. The survey papers in this area are summarized further in tabular form. The comparison is presented in tabular form across various machine learning algorithms and their performance in predicting heart disease, based on accuracy.
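
Because the surveyed studies are compared on accuracy, it may help to see how accuracy, together with the sensitivity and specificity mentioned in the introduction, follows from the four confusion-matrix counts. The sketch below is illustrative only. The function name `classification_metrics` and the label lists are assumptions invented for the example, with 1 marking heart disease and 0 marking no disease.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, sensitivity and specificity for binary labels (1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        # Share of all predictions that are correct.
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        # Share of diseased patients that are detected.
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        # Share of healthy patients that are correctly cleared.
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


# Made-up labels for ten patients: 4 with disease, 6 without.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

Here the accuracy is 0.8 while the sensitivity is 0.75, which shows why a single accuracy figure can hide missed cases when the two classes are unevenly represented.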
CONCLUSION

Machine learning algorithms offer several advantages, such as data-driven decision making, automation, efficiency, adaptability, continuous learning, pattern recognition, fast learning, scalability, and the ability to handle big data. Machine learning can identify complex patterns when it is trained properly with adequate inputs, and it has the capability to enhance the efficiency of the healthcare system. This paper has surveyed the literature on machine learning algorithms for the prediction of heart disease. Many existing papers that used machine learning algorithms to predict CAD are examined and catalogued here. The accuracy and performance vary with the algorithm, the tools, the datasets, and the number of attributes used. Machine learning can predict heart disease effectively given a proper dataset, proper training of the algorithm, and testing. Machine learning techniques have shown that they offer a broad platform for the medical sector.

REFERENCES:

1. World Health Organization Syria Crisis, World Health Organization, 2021.
2. Q. K. Al-Shayea, "Artificial Neural Networks in Medical Diagnosis."
3. Milan Kumari and Sunila Godara, "Comparative Study of Data Mining Classification Methods in Cardiovascular Disease Prediction."
4. Dilip Roy Chowdhury, Mridula Chatterjee, and R. K. Samanta, "An Artificial Neural Network Model for Neonatal Disease Diagnosis."
5. N. Al-milli, "Backpropagation neural network for prediction of heart disease," J. Theor.
6. Shinde R, Arjun S, Patil P, and Waghmare J (2015), "An intelligent heart disease prediction system using k-means clustering and Naïve Bayes algorithm."
7. Z. Ge, Z. Song, S. X. Ding, and B. Huang, "Data Mining and Analytics in the Process Industry: The Role of Machine Learning."
8. A. Jagtap, P. Malewadkar, O. Baswat, and H. Rambade, "Heart disease prediction using machine learning."
9. A. K. Dwivedi, S. A. Imtiaz, and E. R. Villegas, "Algorithms for automatic analysis and classification of heart sounds - a systematic review."
10. S. I. Ansarullah and P. Kumar, "A systematic literature review on cardiovascular disorder identification using knowledge mining and machine learning method."
11. A. K. Dwivedi, S. A. Imtiaz, and E. R. Villegas, "Algorithms for automatic analysis and classification of heart sounds - a systematic review."
12. I. D. Mienye, Y. Sun, and Z. Wang, "An improved ensemble learning approach for the prediction of heart disease risk."
13. Shah, D., Patel, S., and Bharti, S. K., "Heart Disease Prediction using Machine Learning Techniques."
14. Valle Harsha Vardhan, Uppala Rajesh Kumar, Vanumu Vardhini, Sabbi Leela Varalakshmi, and A. Suraj Kumar, "Heart disease prediction using machine learning."
15. Sameh Ghwanmeh, Adel Mohammad, and Ali Al-Ibrahim, "Innovative Artificial Neural Networks-Based Decision Support System for Heart Diseases Diagnosis."
16. Shah Samkitkumar Rajnikant and Harsh Lohiya, "Cardiovascular disease forecasting using machine learning techniques."
17. Harshit Jindal, Sarthak Agrawal, Rishabh Khera, Rachna Jain, and Preeti Nagrath.
18. Galla Siva Sai Bindhika, Munaga Meghana, Manchuri Sathvika Reddy, and Rajalakshmi, "Heart Disease Prediction Using Machine Learning Techniques."
19. Safial Islam Ayon, Md. Milon Islam, and Md. Rahat Hossain (2022), "Coronary Artery Heart Disease Prediction: A Comparative Study of Computational Intelligence Techniques."
20. Chintan M. Bhatt, Parth Patel, Tarang Ghetia, and Pier Luigi Mazzeo, "Effective Heart Disease Prediction Using Machine Learning Techniques."
21. M. Kavitha, G. Gnaneswar, R. Dinesh, Y. Rohith Sai, and R. Sai Suraj, "Heart Disease Prediction using Hybrid machine Learning Model."