Article
Enhancing Heart Disease Prediction Accuracy through Machine
Learning Techniques and Optimization
Nadikatla Chandrasekhar and Samineni Peddakrishna *
Abstract: In the medical domain, early identification of cardiovascular issues poses a significant
challenge. This study enhances heart disease prediction accuracy using machine learning techniques.
Six algorithms (random forest, K-nearest neighbor, logistic regression, Naïve Bayes, gradient boosting,
and AdaBoost classifier) are utilized, with datasets from the Cleveland and IEEE Dataport. To
optimize model accuracy, GridSearchCV and five-fold cross-validation are employed. In the Cleveland
dataset, logistic regression surpassed others with 90.16% accuracy, while AdaBoost excelled in the
IEEE Dataport dataset, achieving 90% accuracy. A soft voting ensemble classifier combining all six
algorithms further enhanced accuracy, resulting in a 93.44% accuracy for the Cleveland dataset and
95% for the IEEE Dataport dataset. This surpassed the performance of the logistic regression and Ad-
aBoost classifiers on both datasets. This study’s novelty lies in the use of GridSearchCV with five-fold
cross-validation for hyperparameter optimization, determining the best parameters for the model,
and assessing performance using accuracy and negative log loss metrics. This study also examined
accuracy loss for each fold to evaluate the model’s performance on both benchmark datasets. The
soft voting ensemble classifier approach improved accuracies on both datasets and, when compared
to existing heart disease prediction studies, this method notably exceeded their results.
Keywords: heart disease prediction; machine learning; soft voting ensemble classifier; performance metrics
there are some tried-and-true methods for improving the accuracy of the model. These
include adding more information to the dataset, treating missing and outlier values, feature
selection, algorithm tuning, cross-validation, and ensembling. This paper implements Grid-
searchCV hyperparameter tuning and five-fold cross-validation to evaluate the model’s
performance on both benchmark datasets. It also employs an ensemble voting classifier to
further improve model accuracy. This article presents the
following significant work:
• This work examines and implements six major ML algorithms on the Cleveland and
IEEE Dataport heart disease datasets, analyzing performance classification metrics.
• In the early phase, various ML classifier techniques, including random forest (RF), K-
nearest neighbor (KNN), logistic regression (LR), Naive Bayes (NB), gradient boosting
(GB), and AdaBoost (AB) were trained.
• The GridsearchCV hyperparameter tuning method with five-fold cross-validation and
performance assessment using accuracy and negative log loss metrics was employed
to achieve the highest level of accuracy.
• Finally, all classifiers were combined using a soft voting ensemble method in order to
increase the accuracy of the model.
2. Literature Review
Several new research opportunities in healthcare have been enabled by advances in
ML and advances in computing capabilities [8]. Various researchers have proposed ML
algorithms to enhance the accuracy of disease prediction [9–11]. To refine the precision of
the outcomes, much of the research has meticulously evaluated the presence of missing
data in the dataset, a crucial aspect in the data preprocessing process. Gupta et al. [12]
used Pearson correlation coefficients and different ML classifiers to replace missing values
in the Cleveland dataset. Rani et al. [13] have investigated multiple imputations by the
chained equations (MICE) method to deal with the missing values problem. In this case,
missing values are imputed through a series of iterative predictive models. During each
iteration, each variable in the dataset is assigned using the other variables. In another
work, Jordanov et al. [14] proposed a KNN imputation method for the prediction of both
continuous (average of the nearest neighbors) and categorical variables (most frequent).
Another study used an LR model to classify cardiac disease with an accuracy of 87.1% after
cleaning the dataset and identifying missing values at the time of preprocessing [15]. In
contrast, some researchers have eliminated missing values. Using DT, LR, and Gaussian
NB algorithms, the features were reduced from 13 to 4 with a feature selection method,
and an accuracy of 82.75% was reported [16]. A hybrid random forest (RF) with a linear
model was developed by Mohan et al. [17], improving prediction accuracy on the 297 records
and 13 attributes of the Cleveland dataset for heart disease prediction. Kodati et al. [18]
tested several types of classifiers using Orange and Weka data-mining tools to predict heart
disease with 297 records and 13 features.
In addition, the feature selection method plays an important role in improving the
accuracy of the model. To select features, Shah et al. [19] utilized probabilistic principal
component analysis (PCA). The Cleveland dataset was used by R. Perumal et al. [20] to
develop LR and support vector machine (SVM) models with similar accuracy levels (87%
and 85%, respectively). To train the ML classifiers, they used a dataset of 303 data instances
and standardized and reduced features using PCA. In another study, a particle swarm opti-
mization (PSO) technique was used to select features [21]. In contrast, Yekkala et al. [22]
used a rough set-based feature selection method along with the RF algorithm and obtained
an accuracy of 84%. Saw et al. [23] used a random search to find the best parameters to build
an accurate prediction model. It was found that this approach uses LR for classification
and is 87% accurate at predicting heart attacks. Other works have used both methods
and predicted the accuracy using different algorithms. The model presented by Otoom
et al. [24] used NB, SVMs, and available trees to achieve an accuracy of 84.5%. Vembandasamy
et al. [25] proposed an NB classifier for predicting heart disease and achieved an
accuracy of 84.4%.
Further, to determine the optimum combination of heart disease predictors, Gazeloglu
et al. [26] evaluated 18 ML models and three feature selection techniques for the Cleveland
dataset of 303 instances and 13 variables. Recently, ten classifiers were trained to identify
the most effective prediction models for precise prediction [27]. The most suitable attributes
were identified using three methods of attribute selection, including a feature subset
evaluator based on correlation, a chi-squared attribute evaluator, and a relief attribute
evaluator. Furthermore, a hybrid feature selection method aimed at enhancing accuracy
by incorporating RF, AB, and linear correlation was suggested by Pavithra et al. [28].
The implementation of this technique led to a 2% increase in the accuracy of the hybrid
model, following the selection of 11 features through a combination of filter, wrapper, and
embedded methods. To further enhance the accuracy, researchers have used the ensemble
technique to combine different algorithms. The ensemble method for detecting heart
disease was developed by Latha et al. [29] by combining NB, RF, multilayer perceptrons
(MLP), and Bayesian networks based on majority voting (MV). They achieved an accuracy
of 85.48%. Another ensemble model employed five classifiers, including a memory-based
learner (MBI), an SVM, DT induction with information gain (DT-IG), NB, and DT induction
with the Gini index (DT-GI) [30]. As the datasets in the authors' study
contained only pertinent attributes, there was no feature selection. A pre-processing step
has been performed to eliminate outliers and missing values from the data. Tama et al. [31]
developed an ensemble model to diagnose heart disease with an accuracy rate of 85.71%.
The ensemble model utilized GB, RF, and extreme GB classifiers. Alqahtani et al. [32]
developed an ensemble of ML and deep learning (DL) models to predict the disease with
an accuracy rate of 88.70%. This study employed a total of six classification algorithms.
Trigka et al. [33] developed a stacking ensemble model after applying SVM, NB, and
KNN with a 10-fold cross-validation synthetic minority oversampling technique (SMOTE)
in order to balance out imbalanced datasets. This study demonstrated that a stacking
SMOTE with a 10-fold cross-validation achieved an accuracy of 90.9%. Another study used
stochastic gradient descent classifiers, LR, and SVM to develop a model with an accuracy of
93% using multiple datasets [34]. For further improving accuracy, Cyriac et al. [35] utilized
seven different machine-learning models as well as two ensemble methods (soft voting
and hard voting). With this approach, the highest accuracy score was achieved at 94.2%.
Another study developed a combined multiple-classifier predictive model approach for
better prediction accuracy [36]. Five classifier models are combined with Cleveland and
Hungarian datasets. A total of 590 data-valid instances and 13 attributes were taken into
consideration. A baseline accuracy of 93% was achieved using the Weka data-mining tool.
In 2020, Manu Siddhartha created a new dataset by combining five well-known heart
disease datasets—Switzerland, Cleveland, Hungary, Statlog, and Long Beach VA. This
new dataset includes all the characteristics shared by the five datasets [37]. In the same
dataset, Mert Ozcan et al. [38] investigated the use of a supervised ML technique known
as the Classification and Regression Tree (CART) algorithm to predict the prevalence of
heart disease and to extract decision rules that clarify the associations between the input
and output variables. The outcomes of the investigation further ranked the heart disease
influencing features based on their significance. The model’s reliability was corroborated
by an 87% accuracy in the prediction. Other researchers, Rüstem Yilmaz et al. [39], worked to
compare the predictive classification performances of ML techniques for coronary heart
disease. Three distinct models using RF, LR, and SVM algorithms were developed. Hyper-
parameter optimization was performed using a 10-fold repeated cross-validation approach.
Model performance was assessed using various metrics. Results showed that the RF
model exhibited the highest accuracy of 92.9%, specificity of 92.9%, sensitivity of 92.8%,
F1 score of 92.8%, and negative predictive and positive predictive values of 92.9% and
92.8%, respectively.
In the field of predictive modeling, there is a constant pursuit to enhance the accuracy of
classification and forecast models. Classification models are deployed to label data points, while
forecast models are used to predict future values. A suitable combination of models and features
can enhance the accuracy of these models. Bhanu Prakash Doppala et al. [40] proposed a model that
was evaluated on diverse datasets to determine its efficacy in improving accuracy. The evaluation
involved testing the model on three datasets: the Cleveland dataset, a comprehensive dataset from
IEEE Dataport, and a cardiovascular disease dataset from the Mendeley Data Center. The results of
the proposed model exhibited high accuracy rates of 96.75%, 93.39%, and 88.24% on the respective
datasets.
In contrast to the above work, the ensemble classifier is implemented using six ML models on
the Cleveland heart disease dataset [41] and the IEEE Dataport heart disease dataset
(comprehensive) [42]. This study used six ML algorithms: RF, KNN, LR, NB, GB, and AB. A
GridSearchCV hyperparameter method and a five-fold cross-validation method were employed to
obtain the best accuracy results before implementing the models. The hyperparameter values
provided by GridSearchCV enhance the accuracy of the model. Using these parameters, the accuracy
of the six algorithms is verified and the most accurate algorithm is determined. Additionally, the
ensemble method was applied to the proposed algorithms in order to enhance their accuracy. This
method boosts overall model accuracy from 90.16% (LR) to 93.44% and from 90% (AB) to 95% using
soft-voting ensemble classifiers on the Cleveland and IEEE datasets.
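To make this tuning step concrete, the following minimal sketch (using scikit-learn, with a
synthetic stand-in for the pre-processed heart disease data and an illustrative parameter grid,
not the exact settings of this study) shows how GridSearchCV can be combined with five-fold
cross-validation and scored on both accuracy and negative log loss:

# Illustrative sketch of GridSearchCV with five-fold cross-validation.
# The parameter grid and synthetic data are assumptions; in this study the
# pre-processed Cleveland / IEEE Dataport features would be used instead.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

X, y = make_classification(n_samples=300, n_features=13, random_state=42)

param_grid = {"n_estimators": [100, 200, 500], "max_depth": [None, 5, 10]}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    scoring={"accuracy": "accuracy", "neg_log_loss": "neg_log_loss"},
    refit="accuracy",  # keep the parameter set with the best mean accuracy
    cv=cv,
)
search.fit(X, y)
print("best parameters:", search.best_params_)
print("mean negative log loss of best model:",
      search.cv_results_["mean_test_neg_log_loss"][search.best_index_])

The same pattern applies to each of the other five classifiers, each with its own parameter grid.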
3. Resources and Approaches
This section describes the methods to predict heart disease using the two benchmark publicly
available datasets. This study consists of various phases, from the collection of data to the
prediction of heart disease. In the first phase, the data are pre-processed using feature scaling
and data transformation methods. The proposed model is built using multiple ML algorithms as the
next step. An ensemble approach is used in the next phase of the process to enhance the model's
accuracy. Figure 1 shows a detailed diagram of the workflow architecture.
Figure 1. The proposed system model for predicting heart disease.
dataset of 302 unique instances was obtained, with 164 instances corresponding to patients
with heart disease and 138 instances corresponding to patients without heart disease. In
Dataset II, we identified no missing values. Additionally, 272 duplicate instances were
identified; these duplicate instances were removed to clean the dataset.
Among these, 508 instances correspond to patients with heart disease, and the remaining
410 instances belong to patients without heart disease.
However, some attributes in the data have large input values that are incompatible
with other attributes, which results in poor learning performance. Therefore, data exploration
was performed to visually examine the attributes and identify the relationships between them, and
the categorical attributes were made compatible with the rest of the data through a one-hot
encoding method. One-hot encoding is applied to the cp, thal, and slope features of the available
datasets. Those three features are further subdivided into cp_0 to cp_3, thal_0 to thal_3, and
slope_0 to slope_2 features and merged into the original datasets.
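As a brief illustration (with a toy data frame standing in for the actual datasets), this encoding
step can be performed with pandas:

# Sketch of the one-hot encoding step for the cp, thal and slope features.
# The toy frame below is illustrative; on the full datasets this produces the
# cp_0..cp_3, thal_0..thal_3 and slope_0..slope_2 indicator columns.
import pandas as pd

df = pd.DataFrame({"age": [63, 45, 58],
                   "cp": [3, 0, 2],
                   "thal": [1, 2, 3],
                   "slope": [0, 2, 1],
                   "target": [1, 0, 1]})

# Expand the categorical features into indicator columns and merge them back.
df = pd.get_dummies(df, columns=["cp", "thal", "slope"])
print(df.columns.tolist())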
After exploring the data, the data were scaled for further processing. This is essential when
using the dataset with a KNN. To make the data compatible with all algorithms, features with
large values were scaled down, which helps the ML models perform better.
Feature scaling involves two essential techniques: standardization and normalization. In
standardization, the mean is subtracted from each data point and the result is divided by the
standard deviation. The act of subtracting the average from the data points is referred to as
centering, while dividing each data point by the standard deviation is called scaling.
Standardization retains the presence of outliers while making the resulting algorithm less
susceptible to their influence than one that has not undergone standardization. Standardizing a
value can be accomplished using Equations (1) to (3).
x′ = (x − µ) / σ    (1)
Here, x is the original feature value, x′ is the standardized value, µ is the mean, and σ is the
standard deviation. These can be calculated as follows:
µ = (1/N) ∑_{i=1}^{N} x_i    (2)
When referring to a dataset, N represents the total number of values in the attribute being
scaled. In the available datasets, the age, trestbps, chol, and oldpeak features have large
dimensional values. Hence, the standard scaler is used to convert these feature values into
uniform scaling.

σ = √( (1/N) ∑_{i=1}^{N} (x_i − µ)² )    (3)
After scaling the large feature values, min-max scaling is applied for normaliza-
tion. This technique is appropriate for data distributions that do not follow a Gaussian
distribution. As a result of normalization, feature values become bounded intervals be-
tween the minimum and maximum. For min-max scaling, normalize the data using
Equation (4) below.
x′ = (x − x_min) / (x_max − x_min)    (4)
Here, xmin and xmax are the minimum and maximum values of the respective feature
in the dataset. With the use of the above equation, all the features are normalized to the range [0, 1]. The
last step in pre-processing involved dividing the data into two subsets, known as training
and testing data, after normalizing the data. The split was carried out in such a way that
80% of the available data was allocated for training and the remaining 20% for testing.
This division enabled the training and evaluation of various ML classifiers by testing their
accuracy using the training and testing datasets. An exploratory data analysis (EDA) is also
conducted prior to discussing each algorithm used to predict heart disease. The descriptive
statistics and the correlation matrix information are not presented here for brevity.
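A compact sketch of these pre-processing steps (standard scaling of the large-valued columns,
min-max normalization, and the 80/20 split), with synthetic stand-in data and hypothetical column
positions:

# Sketch of feature scaling and the train/test split described above.
# The data and the positions of the large-valued columns are illustrative.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X, y = make_classification(n_samples=300, n_features=13, random_state=0)

# Standardize the large-valued columns (e.g. age, trestbps, chol, oldpeak):
# x' = (x - mean) / std, as in Equations (1)-(3).
large_cols = [0, 1, 2, 3]  # hypothetical positions of the large-valued features
X[:, large_cols] = StandardScaler().fit_transform(X[:, large_cols])

# Min-max normalization to [0, 1], as in Equation (4).
X = MinMaxScaler().fit_transform(X)

# 80% training / 20% testing split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42, stratify=y)
print(X_train.shape, X_test.shape)

In practice, fitting the scalers on the training portion only and then applying them to the test
portion avoids information leakage.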
The confusion matrix relates the predicted condition to the actual condition: for the actual
positive class (P), correct predictions are true positives (TP) and misses are false negatives
(FN); for the actual negative class (N), errors are false positives (FP) and correct predictions
are true negatives (TN).

Figure 3. The confusion matrix.
• Precision
Precision is calculated based on the total number of predictions made by the model. The
percentage of correct predictions is then divided by the total number of predictions [44]. It can
be defined as the ratio of the TP to the total predictions (TP + FP) made by the model and can be
expressed as Equation (5).

Precision = TP / (TP + FP)    (5)

• Recall
A second significant metric is recall, which is also known as sensitivity or the true positive
rate [45]. It is determined as the proportion of positive observations that were accurately
predicted in relation to the overall number of positive observations. Thus, recall indicates the
coverage of the positive class. As an equation, it can be written as (6).

Recall = TP / (TP + FN)    (6)

• F1 score
A good classifier should have precision and recall of one, which corresponds to FP and FN equal
to zero. It is better to consider both precision and recall if the costs of FP and FN are very
different. Consequently, precision and recall need to be considered together when there is an
uneven distribution of classes. Therefore, the F1 score can be regarded as a measure of both
precision and recall [46]. The F1 score is obtained by taking the harmonic mean of precision and
recall. This metric has generally been considered to be a reliable method for comparing the
performance of different classifiers, particularly when the data are unbalanced. F1 scores are
calculated by considering both the number of prediction errors and the types of errors the model
makes. As an equation, it can be written as (7).

F1-Score = 2 × (Precision × Recall) / (Precision + Recall)    (7)

• Learning curve
Using a learning curve, we can determine how much more training data will benefit our model. It
illustrates the relationship between training and test scores for an ML model with a variable
number of training samples. The cross-validation procedure is carried out behind the scenes when
we call the learning curve.

• ROC curve and AUC
The ROC curve represents how well a classification model performs across all classification
thresholds. On this curve, two parameters are plotted: the true positive rate (TPR) and the false
positive rate (FPR), where the FPR is given by Equation (8).

FPR = FP / (FP + TN)    (8)

The TPR is plotted on the Y-axis, while the FPR is plotted on the X-axis. Thus, it is necessary
to utilize a method referred to as AUC in order to calculate the values at any threshold level
efficiently [48]. AUC measures the performance of a classifier across different thresholds as
indicated by the ROC curve. In general, the AUC value ranges from 0 to 1; a good model has an AUC
close to 1, which indicates a high degree of separation. The ROC space is divided by the diagonal:
points above the diagonal indicate successful classification, and points below the line indicate
unsuccessful classification. The valuation of the AUC curve is explained in Table 3.

Table 3. Valuation of the area under the curve.

Area Under the Curve (AUC) | Understanding
0.90 ≤ AUC | Exceptional
0.80 ≤ AUC < 0.90 | Decent
0.70 ≤ AUC < 0.80 | Reasonable
0.60 ≤ AUC < 0.70 | Unfortunate
0.50 ≤ AUC < 0.60 | Flop

• Precision-recall curve
Plotting recall on the x-axis and precision on the y-axis obtains the precision-recall curve.
This curve depicts the false positive to false negative ratio. The precision-recall curve is not
constructed using the number of true negative results [49].

3.4. Accuracy and Loss of Each Fold Measurement
In ML classifiers, the accuracy and loss of each fold have a significant impact on the model's
overall performance. The accuracy of each fold determines how well the model has learned from the
training data and how accurately it can predict new data. If the accuracy of a fold is high, it
indicates that the model has successfully learned the underlying patterns in the data and can make
accurate predictions. However, if the accuracy of a fold is low, it implies that the model needs
further improvement and fine-tuning to achieve better results.

In addition to accuracy, the negative log loss of each fold is examined. The binary log loss can
be written as −[y ln(p) + (1 − y) ln(1 − p)], where y is the true label (either 0 or 1), p is the
predicted probability of the positive class, and the log is the natural logarithm. The log loss
ranges from 0 to infinity, with a perfect model having a log loss of 0. A model that always
predicts the same probability of 0.5 for all samples would have a log loss of approximately 0.693.
Log loss penalizes highly confident but wrong predictions more than it penalizes predictions that
are only slightly wrong. As a result, it is a popular loss function for classification problems
where the focus is on predicting probabilities rather than hard class labels.
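These measures map directly onto standard scikit-learn helpers; the following sketch computes them
for one classifier on stand-in data (the data and fitted model are illustrative, not the study's
tuned models):

# Sketch: computing the confusion matrix, precision, recall, F1, ROC-AUC and
# log loss for one classifier. Data and model are illustrative stand-ins.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (confusion_matrix, precision_score, recall_score,
                             f1_score, roc_auc_score, log_loss)
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=13, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
y_pred = clf.predict(X_te)
y_prob = clf.predict_proba(X_te)[:, 1]          # probability of the positive class

print(confusion_matrix(y_te, y_pred))                 # [[TN, FP], [FN, TP]]
print("precision:", precision_score(y_te, y_pred))    # TP / (TP + FP), Eq. (5)
print("recall:   ", recall_score(y_te, y_pred))       # TP / (TP + FN), Eq. (6)
print("F1 score: ", f1_score(y_te, y_pred))           # Eq. (7)
print("ROC AUC:  ", roc_auc_score(y_te, y_prob))
print("log loss: ", log_loss(y_te, y_prob))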
selection of features is also made. A majority vote is used to combine the predictions of
multiple trees [51]. For Dataset I, the model’s confusion matrix revealed that it successfully
predicted 19 positive cases and 33 negative cases. However, there were nine incorrect
predictions, consisting of eight false negatives and one false positive. In the case of Dataset
II, the confusion matrix showed that the model accurately predicted 71 positive cases
and 92 negative cases, but it also made 21 incorrect predictions, which included 17 false
negatives and 4 false positives. Table 6 showcases the performance of the RF in predicting
heart disease for two datasets: Dataset I (Cleveland) and Dataset II (IEEE Dataport). The
metrics used to evaluate the model include precision, recall, and F1 score, for both classes,
0 (no heart disease) and 1 (having heart disease).
For Dataset I, Class 0 has a precision of 95%, recall of 70%, F1 score of 81%, and
27 instances. Class 1 has a precision of 80%, recall of 97%, F1 score of 88%, and 34 instances.
The overall accuracy, macro average, and weighted average are 85%, 88%, and 87%, respec-
tively, for the 61-instance dataset. For Dataset II, Class 0 has a precision of 94%, recall of
82%, F1 score of 87%, and 88 instances. Class 1 has a precision of 85%, recall of 95%, F1
score of 90%, and 96 instances. The overall accuracy, macro average, and weighted average
are 89% for the 184-instance dataset. Figures 4 and 5 represent the RF model’s performance
measuring plots on Dataset I and Dataset II.
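Performance measuring curves like those in Figures 4 and 5 can be produced with scikit-learn's
display utilities; a sketch with stand-in data and an untuned RF model follows:

# Sketch: ROC and precision-recall curves for a fitted classifier, similar in
# spirit to the performance measuring curves shown in Figures 4 and 5.
# Data and model are illustrative stand-ins for the study's RF model.
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import PrecisionRecallDisplay, RocCurveDisplay
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=13, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)
rf = RandomForestClassifier(random_state=42).fit(X_tr, y_tr)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
RocCurveDisplay.from_estimator(rf, X_te, y_te, ax=axes[0])         # ROC / AUC
PrecisionRecallDisplay.from_estimator(rf, X_te, y_te, ax=axes[1])  # PR curve
plt.tight_layout()
plt.show()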
Figure 4. Performance measuring curves of RF on Dataset I.
Figure 5. Performance measuring curves of RF on Dataset II.
In Dataset I, Class 0 shows a precision of 81%, recall of 81%, F1 score of 81%, and
27 instances. Class 1 displays a precision of 85%, recall of 85%, F1 score of 85%, and
34 instances. The dataset, containing 61 instances, has an overall accuracy, macro average,
and weighted average of 84%, 83%, and 84%, respectively. For Dataset II, Class 0 has a
precision of 90%, recall of 84%, F1 score of 87%, and 88 instances. Class 1 demonstrates
a precision of 86%, recall of 92%, F1 score of 89%, and 96 instances. The overall accuracy,
macro average, and weighted average for the 184-instance dataset are 88%, 88%, and 89%,
Processes 2023, 11, 1210 13 of 31
Processes
Processes 2023,
2023, 11,
11, xx FOR
FOR PEER
PEER REVIEW
REVIEW 14
14 of
of 34
34
respectively. Figures 6 and 7 represent the KNN model’s performance measuring plots on
Dataset I and Dataset II.
Figure 6. Performance measuring curves of KNN on Dataset I.
Figure 7. Performance measuring curves of KNN on Dataset II.
make predictions [53]. The confusion matrix for the model reveals the following results
for Dataset I and Dataset II: In Dataset I, the model accurately predicted 21 positive and
34 negative cases while making 6 incorrect predictions, all of which were false negatives
and no false positives. In Dataset II, the model successfully predicted 75 positive and
88 negative cases, but it also made 21 incorrect predictions, comprising 13 false negatives
and 8 false positives. Table 8 illustrates the performance of a Logistic Regression (LR)
classifier in predicting heart disease for two datasets: Dataset I (Cleveland) and Dataset II
(IEEE Dataport). The evaluation metrics presented include precision, recall, F1 score, and
support for both classes: 0 (no heart disease) and 1 (having heart disease).
Table 8. Performance measure curve values of LR (Datasets I and II).
Figure 8. Performance measuring curves of LR on Dataset I.
Figure 9. Performance measuring curves of LR on Dataset II.
In Dataset I, Class 0 has a precision of 88%, recall of 85%, F1 score of 87%, and 27 instances.
Class 1 exhibits a precision of 89%, recall of 91%, F1 score of 90%, and 34 instances. With
61 instances in total, the overall accuracy, macro average, and weighted average are 89%.
For Dataset II, Class 0 displays a precision of 88%, recall of 85%, F1 score of 87%,
and 88 instances. Class 1 shows a precision of 89%, recall of 91%, F1 score of 90%, and
96 instances. With 184 instances, the overall accuracy, macro average, and weighted average
are 89%. Figures 10 and 11 represent the NB model’s performance measuring plots on
Dataset I and Dataset II.
Figure 10. Performance measuring curves of NB on Dataset I.
Figure 11. Performance measuring curves of NB on Dataset II.
For Dataset I (Cleveland), Class 0 has a precision of 88%, recall of 78%, F1 score of 82%,
and 27 instances. Class 1 displays a precision of 84%, recall of 91%, F1 score of 87%, and
34 instances. The dataset, with 61 instances, has an overall accuracy, macro average, and
weighted average of 85%.
In Dataset II (IEEE Dataport), Class 0 exhibits a precision of 91%, recall of 85%, F1 score
of 88%, and 88 instances. Class 1 presents a precision of 87%, recall of 93%, F1 score of 90%,
and 96 instances. With 184 instances, the overall accuracy, macro average, and weighted
average are 89%. Figures 12 and 13 represent the GB model’s performance measuring plots
on Dataset I and Dataset II.
Figure 12. Performance measuring curves of GB on Dataset I.
Figure 13. Performance measuring curves of GB on Dataset II.
Figure 14. Performance measuring curves of AB on Dataset I.
4.9. Assessing the Accuracy and Accuracy Loss of Each Fold: Measurement and
Performance Evaluation
The loss and accuracy values for each fold provide an estimate of how well the model
is performing on different subsets of the data. Figures 16–27 show six models’ five-fold
accuracy and loss value plots for Datasets I and II. Table 13 presents the values of accuracy,
accuracy loss of each fold, and mean and standard deviation values of six models.
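The per-fold values summarized in Table 13 can be collected as in the following sketch (the model
and data are illustrative stand-ins, and the "accuracy loss" of a fold is 1 minus its accuracy, as
defined later in this section):

# Sketch: five-fold accuracy and negative log loss per fold, plus mean/std,
# in the spirit of Table 13. Model and data are illustrative stand-ins.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate

X, y = make_classification(n_samples=300, n_features=13, random_state=0)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

scores = cross_validate(RandomForestClassifier(random_state=42), X, y, cv=cv,
                        scoring=["accuracy", "neg_log_loss"])

acc = scores["test_accuracy"]
loss = 1.0 - acc                                  # "accuracy loss" of each fold
nll = scores["test_neg_log_loss"]
print("fold accuracies:", acc, "mean:", acc.mean(), "std:", acc.std())
print("fold accuracy loss:", loss)
print("fold negative log loss:", nll, "mean:", nll.mean(), "std:", nll.std())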
Figure 16. RF model's 5-fold accuracy and loss value plots for Dataset I.
Figure 17. RF model's 5-fold accuracy and loss value plots for Dataset II.
Figure 18. KNN model's 5-fold accuracy and loss value plots for Dataset I.
Figure 19. KNN model's 5-fold accuracy and loss value plots for Dataset II.
Figure 20. LR model’s 5-fold accuracy and loss value plots for Dataset I.
Figure 21. LR model's 5-fold accuracy and loss value plots for Dataset II.
Figure 22. NB model's 5-fold accuracy and loss value plots for Dataset I.
Figure 23. NB model's 5-fold accuracy and loss value plots for Dataset II.
Figure 24. GB model's 5-fold accuracy and loss value plots for Dataset I.
Figure 25. GB model's 5-fold accuracy and loss value plots for Dataset II.
Figure 26. AB model's 5-fold accuracy and loss value plots for Dataset I.
Figure 27. AB model's 5-fold accuracy and loss value plots for Dataset II.
Table 13. The values of accuracy, accuracy loss of each fold and mean and standard deviation values
of six models.
In this study, the performance of various ML models, including RF, KNN, LR, NB, GB,
and AB, was evaluated on two different datasets (I and II). The models’ performance was
assessed using five-fold cross-validation, and three metrics were reported: accuracy, loss
(1—accuracy), and negative log loss.
The results indicate that, for Dataset I, the RF model achieved the highest mean
accuracy (0.826) with a standard deviation of 0.089, followed closely by the NB and LR
models with mean accuracies of 0.839 and 0.834, respectively. On the other hand, the KNN
model had the lowest negative log loss (−1.009) with the largest standard deviation (0.740),
which could suggest overfitting or instability in model performance across different folds.
For Dataset II, the GB showed the best performance with a mean accuracy of 0.887
and a standard deviation of 0.016. The other models, including KNN, LR, and NB, also
demonstrated relatively high mean accuracies, ranging between 0.854 and 0.866. The
negative log loss values were more stable for this dataset, with the AB model having
the most consistent performance, indicated by a mean negative log loss of −0.567 and a
standard deviation of 0.002. Further, this study reveals that selecting the best model requires
careful consideration of the evaluation metrics and their respective standard deviations.
The soft voting ensemble (SVE) prediction is obtained as shown in Equation (11):

SVE = arg max( (1/N) × (P(RF) + P(KNN) + P(LR) + P(NB) + P(GB) + P(AB)) )    (11)

where "N" denotes the number of base classifiers, "P" represents the predicted class probability
of each base classifier, and arg max (argument of the maximum) is the function that returns the
class with the highest probability. Figure 28 illustrates an ensemble classifier model for soft
voting.
Figure 28. Proposed soft voting ensemble classifier.
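A minimal sketch of the soft voting ensemble of Equation (11) using scikit-learn's
VotingClassifier follows; the base classifiers use default hyperparameters here rather than the
GridSearchCV-tuned values, and the data are a synthetic stand-in:

# Sketch of the soft voting ensemble (SVE) of Equation (11): class probabilities
# from the six base classifiers are averaged and the arg max class is returned.
# Hyperparameters and data are illustrative, not the tuned values of this study.
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, n_features=13, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

sve = VotingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=42)),
                ("knn", KNeighborsClassifier()),
                ("lr", LogisticRegression(max_iter=1000)),
                ("nb", GaussianNB()),
                ("gb", GradientBoostingClassifier(random_state=42)),
                ("ab", AdaBoostClassifier(random_state=42))],
    voting="soft",  # average predicted probabilities, then take the arg max
)
sve.fit(X_tr, y_tr)
print("ensemble test accuracy:", sve.score(X_te, y_te))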
6. Comparative Study
Figure 29 presents a performance analysis of six ML classifiers applied to Dataset I. The results
show that LR achieves the highest accuracy of 90% among the classifiers, with notable precision
and recall values for both classes. Other classifiers, such as RF, KNN, NB, GB, and AB, display
varying levels of performance across the different metrics. Upon examining the results, it is
evident that the classifiers exhibit different strengths and weaknesses. For instance, while RF
and AB have high precision for Class 0, they show lower recall values for the same class.
Conversely, LR demonstrates remarkable precision for Class 0 and recall for Class 1.

Figure 29. Performance measures comparison for six models on Dataset I.

Figure 30 represents the performance analysis for the six ML classifiers on Dataset II. The
analysis reveals that the classifiers demonstrate relatively similar performance on Dataset II,
with the AB classifier achieving the highest accuracy of 90%. Precision, recall, and F1 score
values are also consistent across the classifiers. However, there are some differences in
performance, such as RF having a higher precision for Class 0 and a lower recall for the same
class.

Figure 30. Performance measures comparison for six models on Dataset II.
Upon analyzing Figure 31, it is evident that the SVE classifier consistently outperforms the
individual ML classifiers on both datasets, achieving 93.44% accuracy on Dataset I and 95% on
Dataset II. Among the individual classifiers, the maximum observed accuracy is only 90%, which is
obtained by the AB classifier on both datasets. It is also notable that the performance of all
classifiers improves from Dataset I to Dataset II. The SVE classifier effectively combines the
strengths of the six individual classifiers, leading to enhanced accuracy on both datasets. This
demonstrates the potential of ensemble methods for improved performance in heart disease
prediction tasks. Tables 14 and 15 compare the previous researchers' accuracy with the accuracy of
the proposed work on Dataset I and Dataset II. Compared to the previous results, our proposed
model produced higher accuracy.
Figure 31. Proposed ensemble classifier accuracy compared with other six ML classifiers (Dataset I
and Dataset II).
Table 14. Comparison of the proposed system with existing heart disease prediction systems on
Dataset I.

Ref. | Year | Dataset | Classifiers Used | Methodology Used | Maximum Accuracy (%)
[28] | 2019 | Cleveland | NB Net, C4.5, MLP, PART, Bagging, Boosting, majority voting, Stacking | Ensemble techniques such as bagging and boosting are employed for improving prediction accuracy. | 85.48
[27] | 2020 | Cleveland | LR, SVM, KNN | Feature normalization and dimensionality reduction utilizing principal component analysis (PCA). | 87.00
[16] | 2020 | Cleveland | LR, DT, and Gaussian naïve Bayes (GNB) | Dimensionality reduction was executed through singular value decomposition. | 82.75
[34] | 2022 | Cleveland | Stochastic gradient descent classifiers, LR, SVM, NB, ConvSGLV, and ensemble methods | Majority voting; a CNN has been utilized for feature extraction, with a flatten layer converting 3D data into 1D as ML models work on 1D data. | 93.00
[59] | 2023 | Cleveland | LR, KNN, DT, XGB, SVM, RF | GridSearchCV hyperparameter tuning. | 87.91
Proposed | — | Cleveland dataset | RF, KNN, LR, NB, GB, AB, SVE classifier | Soft voting ensemble method. | 93.44
Table 15. Comparison of the proposed system with existing heart disease prediction systems on
Dataset II.

Ref. | Year | Dataset | Classifiers Used | Methodology Used | Maximum Accuracy (%)
[38] | 2022 | Heart disease dataset (IEEE Dataport) | CART | Classification and regression tree algorithm. | 87.00
[39] | 2021 | Heart disease dataset (IEEE Dataport) | RF, LR, SVM | A 10-fold repeated cross-validation method was employed. | 92.00
[40] | 2022 | Heart disease dataset (IEEE Dataport) | NN, MLPNN, AB, SVM, LR, ANN, RF | An ensemble strategy that combines multiple classifiers. | 93.39
Proposed | — | Heart disease dataset (IEEE Dataport) | RF, KNN, LR, NB, GB, AB, SVE classifier | Soft voting ensemble method. | 95.00
The limitation of this model is that it is based on a limited amount of patient data: only 303 and
1190 patients in the two datasets. Future work includes more
patient data, the application of the feature selection method, and the development of a deep
learning-based system for early heart disease detection. Additionally, utilizing medical IoT
devices and sensors for the simultaneous collection of clinical parameters such as ECG,
blood oxygen level, and body temperature can further improve the performance of the
proposed system.
7. Conclusions
In conclusion, this research presents an efficient ML-based diagnosis system for de-
tecting heart disease. To get the best accuracy results, the GridsearchCV hyperparameter
method and the five-fold cross-validation method have been used before implementing
models. Six ML classifiers were implemented and compared using accuracy, precision,
recall, and F1 score metrics. The results indicate that the LR and AB classifiers attained
the highest accuracies of 90.16% and 89.67% on the two datasets, respectively. However,
when the soft voting ensemble classifier method was applied to all six models on both
datasets, it yielded even greater accuracies of 93.44% and 95%. To use this ML model for
real-time heart disease prediction, it is necessary to integrate the model into a practical
application. This can be achieved through a web application, mobile app, or other software
systems. By deploying the model in a real-world setting, such as a hospital or clinic, it
can be used to predict heart disease risk for patients. The model can also be integrated
into an electronic health record (EHR) system and make use of the patient’s EHR data for
real-time predictions.
References
1. World Health Statistics. Cardiovascular Diseases, Key Facts. 2021. Available online: https://www.who.int/news-room/fact-
sheets/detail/cardiovascular-diseases-(cvds) (accessed on 10 December 2022).
2. Choudhury, R.P.; Akbar, N. Beyond Diabetes: A Relationship between Cardiovascular Outcomes and Glycaemic Index. Cardiovasc.
Res. 2021, 117, E97–E98. [CrossRef] [PubMed]
3. Ordonez, C. Association Rule Discovery with the Train and Test Approach for Heart Disease Prediction. IEEE Trans. Inf. Technol.
Biomed. 2006, 10, 334–343. [CrossRef] [PubMed]
4. Magesh, G.; Swarnalatha, P. Optimal Feature Selection through a Cluster-Based DT Learning (CDTL) in Heart Disease Prediction.
Evol. Intell. 2021, 14, 583–593. [CrossRef]
5. Rohit Chowdary, K.; Bhargav, P.; Nikhil, N.; Varun, K.; Jayanthi, D. Early Heart Disease Prediction Using Ensemble Learning
Techniques. J. Phys. Conf. Ser. 2022, 2325, 012051. [CrossRef]
6. Liu, J.; Dong, X.; Zhao, H.; Tian, Y. Predictive Classifier for Cardiovascular Disease Based on Stacking Model Fusion. Processes
2022, 10, 749. [CrossRef]
7. Devi, A.G. A Method of Cardiovascular Disease Prediction Using Machine Learning. Int. J. Eng. Res. Technol. 2021, 9, 243–246.
8. Uddin, S.; Khan, A.; Hossain, M.E.; Moni, M.A. Comparing Different Supervised Machine Learning Algorithms for Disease
Prediction. BMC Med. Inform. Decis. Mak. 2019, 19, 281. [CrossRef]
9. Patro, S.P.; Nayak, G.S.; Padhy, N. Heart Disease Prediction by Using Novel Optimization Algorithm: A Supervised Learning
Prospective. Inform. Med. Unlocked 2021, 26, 100696. [CrossRef]
10. Song, Q.; Zheng, Y.J.; Yang, J. Effects of Food Contamination on Gastrointestinal Morbidity: Comparison of Different Machine-
Learning Methods. Int. J. Environ. Res. Public Health 2019, 16, 838. [CrossRef]
11. Pasha, S.J.; Mohamed, E.S. Novel Feature Reduction (NFR) Model with Machine Learning and Data Mining Algorithms for
Effective Disease Risk Prediction. IEEE Access 2020, 8, 184087–184108. [CrossRef]
12. Gupta, A.; Kumar, R.; Singh Arora, H.; Raman, B. MIFH: A Machine Intelligence Framework for Heart Disease Diagnosis. IEEE
Access 2020, 8, 14659–14674. [CrossRef]
13. Rani, P.; Kumar, R.; Ahmed, N.M.O.S.; Jain, A. A Decision Support System for Heart Disease Prediction Based upon Machine
Learning. J. Reliab. Intell. Environ. 2021, 7, 263–275. [CrossRef]
14. Jordanov, I.; Petrov, N.; Petrozziello, A. Classifiers Accuracy Improvement Based on Missing Data Imputation. J. Artif. Intell. Soft
Comput. Res. 2018, 8, 31–48. [CrossRef]
15. Ambrish, G.; Ganesh, B.; Ganesh, A.; Srinivas, C.; Mensinkal, K. Logistic Regression Technique for Prediction of Cardiovascular
Disease. Glob. Transit. Proc. 2022, 3, 127–130. [CrossRef]
16. Ananey-Obiri, D.; Sarku, E. Predicting the Presence of Heart Diseases Using Comparative Data Mining and Machine Learning
Algorithms. Int. J. Comput. Appl. 2020, 176, 17–21. [CrossRef]
17. Mohan, S.; Thirumalai, C.; Srivastava, G. Effective Heart Disease Prediction Using Hybrid Machine Learning Techniques. IEEE
Access 2019, 7, 81542–81554. [CrossRef]
18. Kodati, S.; Vivekanandam, R. Analysis of Heart Disease Using in Data Mining Tools Orange and Weka. Glob. J. Comput. Sci. Technol. C 2018, 18, 17–21.
19. Shah, S.M.S.; Batool, S.; Khan, I.; Ashraf, M.U.; Abbas, S.H.; Hussain, S.A. Feature Extraction through Parallel Probabilistic
Principal Component Analysis for Heart Disease Diagnosis. Phys. A Stat. Mech. Its Appl. 2017, 482, 796–807. [CrossRef]
20. Perumal, R. Early Prediction of Coronary Heart Disease from Cleveland Dataset Using Machine Learning Techniques. Int. J. Adv.
Sci. Technol. 2020, 29, 4225–4234.
21. Vijayashree, J.; Sultana, H.P. A Machine Learning Framework for Feature Selection in Heart Disease Classification Using Improved
Particle Swarm Optimization with Support Vector Machine Classifier. Program. Comput. Softw. 2018, 44, 388–397. [CrossRef]
22. Yekkala, I.; Dixit, S. Prediction of Heart Disease Using Random Forest and Rough Set Based Feature Selection. Int. J. Big Data
Anal. Healthc. 2018, 3, 12. [CrossRef]
23. Saw, M.; Saxena, T.; Kaithwas, S.; Yadav, R.; Lal, N. Estimation of Prediction for Getting Heart Disease Using Logistic Regression
Model of Machine Learning. In Proceedings of the 2020 International Conference on Computer Communication and Informatics
(ICCCI), Coimbatore, India, 22–24 January 2020. [CrossRef]
24. Otoom, A.F.; Abdallah, E.E.; Kilani, Y.; Kefaye, A. Effective Diagnosis and Monitoring of Heart Disease. Int. J. Softw. Eng. Its Appl.
2015, 9, 143–156.
25. Vembandasamy, K.; Sasipriya, R.; Deepa, E. Heart Diseases Detection Using Naive Bayes Algorithm. Int. J. Innov. Sci. Eng. Technol.
2015, 2, 441–444.
26. Gazeloğlu, C. Prediction of Heart Disease by Classifying with Feature Selection and Machine Learning Methods. Prog. Nutr. 2020,
22, 660–670. [CrossRef]
27. Reddy, K.V.V.; Elamvazuthi, I.; Aziz, A.A.; Paramasivam, S.; Chua, H.N.; Pranavanand, S. Heart Disease Risk Prediction Using
Machine Learning Classifiers with Attribute Evaluators. Appl. Sci. 2021, 11, 8352. [CrossRef]
28. Pavithra, V.; Jayalakshmi, V. Hybrid Feature Selection Technique for Prediction of Cardiovascular Diseases. Mater. Today Proc.
2021; in press. [CrossRef]
29. Latha, C.B.C.; Jeeva, S.C. Improving the Accuracy of Prediction of Heart Disease Risk Based on Ensemble Classification Techniques.
Inform. Med. Unlocked 2019, 16, 100203. [CrossRef]
30. Bashir, S.; Qamar, U.; Khan, F.H.; Javed, M.Y. MV5: A Clinical Decision Support Framework for Heart Disease Prediction Using
Majority Vote Based Classifier Ensemble. Arab. J. Sci. Eng. 2014, 39, 7771–7783. [CrossRef]
31. Tama, B.A.; Im, S.; Lee, S. Improving an Intelligent Detection System for Coronary Heart Disease Using a Two-Tier Classifier
Ensemble. BioMed Res. Int. 2020, 2020, 9816142. [CrossRef]
32. Alqahtani, A.; Alsubai, S.; Sha, M.; Vilcekova, L.; Javed, T. Cardiovascular Disease Detection Using Ensemble Learning. Comput.
Intell. Neurosci. 2022, 2022, 5267498. [CrossRef]
33. Trigka, M.; Dritsas, E. Long-Term Coronary Artery Disease Risk Prediction with Machine Learning Models. Sensors 2023, 23, 1193.
[CrossRef] [PubMed]
34. Rustam, F.; Ishaq, A.; Munir, K.; Almutairi, M.; Aslam, N.; Ashraf, I. Incorporating CNN Features for Optimizing Performance of
Ensemble Classifier for Cardiovascular Disease Prediction. Diagnostics 2022, 12, 1474. [CrossRef] [PubMed]
35. Cyriac, S.; Sivakumar, R.; Raju, N.; Woon Kim, Y. Heart Disease Prediction Using Ensemble Voting Methods in Machine Learning.
In Proceedings of the 2022 13th International Conference on Information and Communication Technology Convergence (ICTC),
Jeju Island, Republic of Korea, 19–21 October 2022; pp. 1326–1331. [CrossRef]
36. Jan, M.; Awan, A.A.; Khalid, M.S.; Nisar, S. Ensemble Approach for Developing a Smart Heart Disease Prediction System Using
Classification Algorithms. Res. Rep. Clin. Cardiol. 2018, 9, 33–45. [CrossRef]
37. Siddhartha, M. Heart Disease Dataset (Comprehensive). Available online: https://ieee-dataport.org/authors/manu-siddhartha (accessed on 12 November 2022).
38. Ozcan, M.; Peker, S. A Classification and Regression Tree Algorithm for Heart Disease Modeling and Prediction. Healthc. Anal.
2023, 3, 100130. [CrossRef]
39. Yilmaz, R.; Yağin, F.H. Early Detection of Coronary Heart Disease Based on Machine Learning Methods. Med. Rec. 2021, 4, 1–6.
[CrossRef]
40. Doppala, B.P.; Bhattacharyya, D.; Janarthanan, M.; Baik, N. A Reliable Machine Intelligence Model for Accurate Identification of
Cardiovascular Diseases Using Ensemble Techniques. J. Healthc. Eng. 2022, 2022, 2585235. [CrossRef]
41. UCI Machine Learning Repository Heart Disease Data Set. Available online: https://archive.ics.uci.edu/ml/datasets/Heart+
Disease (accessed on 10 December 2022).
42. IEEE Dataport Heart Disease Dataset. Available online: https://ieee-dataport.org/open-access/heart-disease-dataset-
comprehensive (accessed on 12 November 2022).
43. Bharti, R.; Khamparia, A.; Shabaz, M.; Dhiman, G.; Pande, S.; Singh, P. Prediction of Heart Disease Using a Combination of
Machine Learning and Deep Learning. Comput. Intell. Neurosci. 2021, 2021, 8387680. [CrossRef]
44. Kumari, M.; Ahlawat, P. DCPM: An Effective and Robust Approach for Diabetes Classification and Prediction. Int. J. Inf. Technol.
2021, 13, 1079–1088. [CrossRef]
45. Biswas, P.; Samanta, T. Anomaly Detection Using Ensemble Random Forest in Wireless Sensor Network. Int. J. Inf. Technol. 2021,
13, 2043–2052. [CrossRef]
46. Sengupta, S.; Mayya, V.; Kamath, S.S. Detection of Bradycardia from Electrocardiogram Signals Using Feature Extraction and
Snapshot Ensembling. Int. J. Inf. Technol. 2022, 14, 3235–3244. [CrossRef]
47. Sahu, A.; Gm, H.; Gourisaria, M.K.; Rautaray, S.S.; Pandey, M. Cardiovascular Risk Assessment Using Data Mining Inferencing
and Feature Engineering Techniques. Int. J. Inf. Technol. 2021, 13, 2011–2023. [CrossRef]
48. Saqlain, M.; Jargalsaikhan, B.; Lee, J.Y. A Voting Ensemble Classifier for Wafer Map Defect Patterns Identification in Semiconductor
Manufacturing. IEEE Trans. Semicond. Manuf. 2019, 32, 171–182. [CrossRef]
49. Miao, J.; Zhu, W. Precision–Recall Curve (PRC) Classification Trees. Evol. Intell. 2022, 15, 1545–1569. [CrossRef]
50. Pal, M.; Parija, S. Prediction of Heart Diseases Using Random Forest. J. Phys. Conf. Ser. 2021, 1817, 012009. [CrossRef]
51. Polat, K.; Güneş, S. A New Feature Selection Method on Classification of Medical Datasets: Kernel F-Score Feature Selection.
Expert Syst. Appl. 2009, 36, 10367–10373. [CrossRef]
52. Verma, P. Ensemble Models for Classification of Coronary Artery Disease Using Decision Trees. Int. J. Recent Technol. Eng. 2020, 8,
940–944. [CrossRef]
53. Sharma, A.; Mishra, P.K. Performance Analysis of Machine Learning Based Optimized Feature Selection Approaches for Breast
Cancer Diagnosis. Int. J. Inf. Technol. 2022, 14, 1949–1960. [CrossRef]
54. Sarwar, A.; Ali, M.; Manhas, J.; Sharma, V. Diagnosis of Diabetes Type-II Using Hybrid Machine Learning Based Ensemble Model.
Int. J. Inf. Technol. 2020, 12, 419–428. [CrossRef]
55. Al Bataineh, A.; Manacek, S. MLP-PSO Hybrid Algorithm for Heart Disease Prediction. J. Pers. Med. 2022, 12, 1208. [CrossRef]
56. Guleria, P.; Naga Srinivasu, P.; Ahmed, S.; Almusallam, N.; Alarfaj, F.K. XAI Framework for Cardiovascular Disease Prediction
Using Classification Techniques. Electronics 2022, 11, 4086. [CrossRef]
57. Ali, S.; Hussain, A.; Aich, S.; Park, M.S.; Chung, M.P.; Jeong, S.H.; Song, J.W.; Lee, J.H.; Kim, H.C. A Soft Voting Ensemble-Based
Model for the Early Prediction of Idiopathic Pulmonary Fibrosis (IPF) Disease Severity in Lungs Disease Patients. Life 2021, 11, 1092.
[CrossRef] [PubMed]
58. Manconi, A.; Armano, G.; Gnocchi, M.; Milanesi, L. A Soft-Voting Ensemble Classifier for Detecting Patients Affected by
COVID-19. Appl. Sci. 2022, 12, 7554. [CrossRef]
59. Ahamad, G.N.; Fatima, H.; Zakariya, S.M.; Abbas, M. Influence of Optimal Hyperparameters on the Performance of Machine
Learning Algorithms for Predicting Heart Disease. Processes 2023, 11, 734. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.