Biomedical Signal Processing and Control 70 (2021) 103033

Contents lists available at ScienceDirect

Biomedical Signal Processing and Control

journal homepage: www.elsevier.com/locate/bspc

Heart disease prediction using hyper parameter optimization (HPO) tuning

R. Valarmathi a, *, T. Sheela b
a
Department of Computer Science and Engineering, Sri Sairam Engineering College, Chennai, India
b
Department of Information Technology, Sri Sairam Engineering College, Chennai, India

ARTICLE INFO

Keywords: Hyper parameter tuning; Heart disease; Grid search; Randomized search; TPOT classifier

ABSTRACT

Coronary artery disease prediction is considered one of the most challenging tasks in the health care industry. In our research, we propose a prediction system to detect heart disease. Three Hyper Parameter Optimization (HPO) techniques, Grid Search, Randomized Search and Genetic Programming (TPOT Classifier), were used to optimize the performance of the Random Forest and XGBoost classifier models. The performance of the two models was compared with existing studies. The models were evaluated on two publicly available datasets, the Cleveland Heart Disease dataset (CHD) and the Z-Alizadeh Sani dataset. Random Forest with the TPOT Classifier achieved the highest accuracy of 97.52% on the CHD dataset. Random Forest with Randomized Search achieved the highest accuracies of 80.2%, 73.6% and 76.9% for the diagnosis of stenosis of the three vessels LAD, LCX and RCA, respectively, on the Z-Alizadeh Sani dataset. Compared with existing studies on heart disease prediction, the proposed models were found to outperform their results significantly.

* Corresponding author.
E-mail addresses: [email protected] (R. Valarmathi), [email protected] (T. Sheela).

https://doi.org/10.1016/j.bspc.2021.103033
Received 9 March 2021; Received in revised form 12 July 2021; Accepted 30 July 2021
Available online 12 August 2021
1746-8094/© 2021 Elsevier Ltd. All rights reserved.

1. Introduction

By 2030, deaths due to cardiovascular disease are expected to increase to 23.3 million [1]. The blood vessels of the heart supply oxygen, and when these vessels become blocked or narrowed, heart disease or stroke can result [2]. According to the WHO, around 12 million people fall victim to heart disease every year and 80% of the people die due to heart ailments [3]. High blood pressure, high cholesterol, stress, tension, alcohol consumption, sedentary lifestyle, obesity and diabetes are the major factors that affect the heart. These attributes help in the prediction of heart disease. Increased blood pressure thickens the walls of the arteries and causes blockage, which can increase the mortality rate [4]. Early diagnosis and proper treatment of heart disease can reduce the mortality rate of patients. One common method to diagnose the abnormal narrowing of a heart vessel is angiography. Symptom, examination and ECG features were investigated with SMO, Naïve Bayes and an ensemble method, reaching an accuracy of 88.5% in predicting the presence of CAD [31]. Numerous supervised and unsupervised machine learning algorithms have been applied by researchers in the medical field for the diagnosis and prediction of heart disease [5].

With its ability to detect patterns and turn data into information, data mining serves as a strong base for analysis [6]. Machine learning techniques help to derive useful knowledge for decision making from vast datasets [7]. These algorithms have been broadly used in health care, computer vision, speech recognition, social science, cosmology and education [8]. They provide a variety of algorithms to identify the different patterns in large datasets [9]. ML has been widely used in the health care industry for identifying disease and making effective decisions. It helps to classify patients with significant risk factors [10–12]. Massive amounts of data are collected by the healthcare industry, and ML provides different models to train on and analyze the data quickly [13]. These algorithms search through a large space of solutions and find an optimal solution by training on the dataset. The performance of the models can be examined with performance metrics such as Accuracy, Sensitivity, Specificity, Precision and F1-Score. By tuning the hyperparameters, the best hyperparameters are selected and applied to the ML models, and the performance is improved. Robust models can be built by adjusting the hyperparameters. Overfitting or underfitting can be prevented by tuning the hyperparameters. The major contributions of our proposed model are as follows:

1. The performance of the models is tested on a subset of features selected by the Sequential Forward Selection (SFS) method with 10-fold cross validation for the Cleveland Heart Disease dataset.

2. The SMOTE technique is applied to balance the Z-Alizadeh Sani dataset.
3. Next, the model performance is tuned and tested with Grid Search CV, Randomized Search CV and Genetic Programming (TPOT Classifier) with 10-fold cross validation.
4. Our work suggests the combination of ML model and optimization technique that predicts heart disease with the highest accuracy.

The remainder of this paper is structured as follows: Section 2 covers related work on previous heart disease prediction systems using different machine learning algorithms. Section 3 describes the methodology of the proposed work in more detail. Section 4 describes the experimental results and a comparative analysis of the previous studies and techniques used. Section 5 outlines our findings and future research directions.

2. Literature review

The performance on different datasets was analyzed using Bayesian Optimization based on Gaussian processes [14]. The performance of 6 different machine learning models was examined on the heart dataset, and logistic regression was found to predict heart disease with the highest accuracy [15]. The least significant features were eliminated using backward feature elimination. Association rules were mined using frequent item sets, and a genetic algorithm was applied to predict heart disease [16]. A fitness function was used to remove redundant rules and to optimize the generated association rules.

Three neural network models were used to construct an ensemble model to diagnose heart disease [17]. SAS Enterprise Miner 5.2 was used to evaluate the performance. The number of neural networks was increased, but no improvement in performance was observed. 270 patient records were trained and tested using a cascaded neural network, a self-organizing network and a Support Vector Machine with RBF function [18]. A Naïve Bayes machine learning model was developed to predict heart disease [19].

An enhanced SVM classifier was presented to classify linear and nonlinear inputs. PSO was used for feature extraction and Fuzzy C-means clustering was applied to improve the accuracy [20]. Bhatla and Jyoti [21] employed the Weka tool with different data mining techniques for heart disease prediction and observed that neural networks showed good results compared to the other data mining techniques. Krishnaiah et al. [22] incorporated a fuzzy approach to remove the uncertainty in the data and applied a KNN classifier to classify heart patients.

Amin et al. [23] identified the risk factors of 50 patients and implemented an integrated model of genetic algorithm and neural network to predict the presence of heart disease. Abdeldjouad et al. [24] used two different software tools, Weka and Keel, to build two different models. The first model was built in Weka by applying the PCA feature selection method to extract the significant features, followed by 3 different classification algorithms: Multi-Objective Evolutionary Fuzzy Classifier (MOEFC), Logistic Regression (LR) and Adaptive Boosting (AdaBoostM1). The second model was built in Keel by applying the Wrapper method for feature extraction, followed by Genetic Fuzzy System-LogitBoost (GFS-LB), Fuzzy Unordered Rule Induction Algorithm (FURIA) and Fuzzy Hybrid Genetic Based Machine Learning (FH-GBML) for classification. The two models were completely trained and their performance was evaluated.

Purusothaman and Krishnakumari [25] surveyed different research findings based on single-model and hybrid-model approaches and concluded that hybrid models are better at predicting disease than a single model. Khourdifi et al. [26] optimized KNN, RF, SVM, Naïve Bayes and ANN with a combination of Particle Swarm Optimization and Ant Colony Optimization. Kalaiselvi and Nasira [27] used PSO for extraction of data, and ANFIS with AGKNN was used for classification. Santhanam and Ephzibah [28] used a genetic algorithm for feature extraction and fuzzy logic for prediction. Mohan et al. [29] developed a hybrid approach of random forest and a linear method for classification of heart disease. Kaan Uyar et al. [30] proposed genetic algorithm based recurrent fuzzy neural networks (RFNN) to classify the data. None of the aforementioned studies implemented hyperparameter optimization (HPO) techniques to boost the accuracy of the heart disease prediction system. Thus, in our proposed model we used HPO techniques to improve the accuracy of the model.

Alizadehsani et al. [32] employed a rule based classifier and a cost sensitive algorithm along with Sequential Minimal Optimization (SMO) to diagnose CAD. Alizadehsani et al. [33] handled data uncertainty. Different evolutionary algorithms were used for feature selection [33–38,65]. Some of the existing studies presented real time predictions and the performance of heart disease detection using hardware. Cardiac arrhythmias were differentiated based on five characteristics using multi-level Support Vector Machine classifiers [66]. A Patient Specific SCAD processor [67], a Smart ECG processor [68] and a Wearable ECG processor [69] were designed to discriminate the CA in real time. An ECG processor and STAC algorithm were presented to improve the accuracy of heart rate detection [70].

3. Materials and methods

3.1. Proposed methodology

The original datasets were collected and preprocessed. Relevant features were selected using the Sequential Forward Selection method. The parameters of Random Forest and XGBoost were tuned using the hyperparameter optimization techniques Grid Search, Randomized Search and Genetic Programming (TPOT Classifier). Finally, the models were validated and analyzed to predict heart disease. In our proposed model, 10-fold cross validation is used to validate the data. Fig. 1 shows the flowchart of the proposed model.

3.2. Dataset description

3.2.1. Cleveland heart disease (CHD)

The Cleveland Heart Disease (CHD) dataset is a heart disease prediction dataset available online in the UCI repository [45]. The actual dataset contains 76 attributes, but most published articles use only 14 of them. The dataset used here contains 13 independent variables and one dependent target variable, a total of 14 columns. The target variable has two classes that represent the presence and absence of heart disease. The dataset has 303 rows and no missing values. The dataset description is given in Table 1.

3.2.2. Z-Alizadeh Sani Dataset

The Z-Alizadeh Sani dataset contains 303 samples and 54 attributes. There are two diagnosis classes that represent normal patients and patients affected with coronary artery disease (CAD). Among the 303 samples, 216 samples indicate normal patients and 87 samples indicate the presence of heart disease [46]. The dataset description is given in Table 2.

3.3. Feature selection

Irrelevant features decrease the performance of the model. In our paper, we used Sequential Forward Selection with ten-fold cross validation to remove the irrelevant features. Sequential Forward Selection (SFS) first selects the single best feature, then the best pair, then the best triplet of features, and this procedure is continued until n relevant features are selected. Table 2 shows the number of features and the highest accuracy obtained with irrelevant features removed. Random Forest and XGBoost are tested with SFS. Table 3 shows the model and their highest accuracies obtained with


reduced number of features. Random Forest achieved the highest accuracy of 83.96% with 9 relevant features. XGBoost obtained the highest accuracy of 82.7% with 11 relevant features. Selecting relevant features improves the accuracy of the proposed system. Figs. 2 and 3 show the accuracy obtained with Random Forest and XGBoost applied with Sequential Forward Selection (SFS). Tables 4 and 5 show the subsets of features selected with Random Forest and XGBoost.

Fig. 1. The Proposed Model of the Heart Disease Prediction system.

Table 1
Dataset description (CHD).
S.No  Feature   Description
1     age       Age
2     sex       male, female
3     cp        chest pain type
4     trestbps  resting blood pressure
5     chol      serum cholesterol
6     fbs       fasting blood sugar
7     restecg   resting electrocardiographic results
8     thalach   maximum heart rate achieved
9     exang     exercise induced angina
10    oldpeak   ST depression induced by exercise relative to rest
11    slope     slope of the peak exercise ST segment
12    ca        number of major vessels
13    thal      Type of Defect
14    target    Risk, No risk

Table 2
Dataset Description (Z-Alizadeh Sani Dataset).
Feature Type             Attribute (Values)
Demographic              Age (30–86)
                         Weight (48–120)
                         Length (140–188)
                         Sex (M, F)
                         Body Mass Index (BMI in 18.1–40.9)
                         Diabetes Mellitus (DM) (Y, N)
                         Hypertension (HTN) (Y, N)
                         Current Smoker (Y, N)
                         Ex-Smoker (Y, N)
                         Family History (FH) (Y, N)
                         Obesity (Y, N)
                         Chronic Renal Failure (CRF) (Y, N)
                         Cerebrovascular Accident (CVA) (Y, N)
                         Airway Disease (Y, N)
                         Thyroid Disease (Y, N)
                         Congestive Heart Failure (CHF) (Y, N)
                         Dyslipidemia (DLP) (Y, N)
Symptom and Examination  Blood Pressure (BP in 90–190)
                         Pulse Rate (PR in 50–110)
                         Edema (Y, N)
                         Weak peripheral pulse (Y, N)
                         Lung Rales (Y, N)
                         Systolic murmur (Y, N)
                         Diastolic murmur (Y, N)
                         Typical Chest Pain (Y, N)
                         Dyspnea (Y, N)
                         Function Class (1, 2, 3, 4)
                         Atypical (Y, N)
                         Nonanginal (Y, N)
                         Exertional CP (Y, N)
                         LowTHAng (low threshold angina) (Y, N)
                         Rhythm (Y, N)
ECG                      Q Wave (0, 1)
                         ST Elevation (0, 1)
                         ST Depression (0, 1)
                         T inversion (0, 1)
                         Left Ventricular Hypertrophy (LVH) (Y, N)
                         Poor R Progression (Y, N)
Laboratory and Echo      Fasting Blood Sugar (FBS in 62–100 mg/dl)
                         Creatinine (Cr in 0.5–2.2 mg/dl)
                         Triglyceride (TG in 37–1050 mg/dl)
                         Low density lipoprotein (LDL in 18–232 mg/dl)
                         High density lipoprotein (HDL in 15.9–111 mg/dl)
                         Blood Urea Nitrogen (BUN in 6–52 mg/dl)
                         Erythrocyte Sedimentation Rate (ESR in 1–90 mm/h)
                         Hemoglobin (HB in 8.9–17.6 g/dl)
                         Potassium (K in 3.0–6.6 mEq/lit)
                         Sodium (Na in 128–156 mEq/lit)
                         White Blood Cells (WBC in 3700–18000 cells/ml)
                         Lymphocyte (Lymph in 7–60%)
                         Neutrophil (Neut in 32–89%)
                         Platelet (PLT in 25–742/ml)
                         Ejection Fraction (EF in 15–60%)
                         Regional Wall Motion Abnormality (RWMA) (0, 1, 2, 3, 4)
                         Valvular Heart Disease (Normal, Mild, Moderate, Severe)

Table 3
Model, number of features and their accuracy using Sequential Forward Selection (SFS).
Model          No of Features  Accuracy  No of Features  Accuracy
Random Forest  13              82.68%    9               83.96%
XGBoost        13              79.36%    11              82.7%
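The greedy procedure behind Sequential Forward Selection (Section 3.3) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the feature names `a` to `d`, the `toy_gain` values and the `toy_score` function are made up for the example. In the setting described in the paper, the score of a subset would instead be the 10-fold cross-validated accuracy of Random Forest or XGBoost on that subset.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

def sequential_forward_selection(
    features: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
    n_select: int,
) -> Tuple[List[str], List[Tuple[List[str], float]]]:
    """Greedily add the feature that most improves the score of the current subset."""
    selected: List[str] = []
    remaining = list(features)
    history = []  # (subset, score) after each step
    while remaining and len(selected) < n_select:
        # Try every remaining feature together with the ones already chosen.
        best = max(remaining, key=lambda f: score(frozenset(selected + [f])))
        selected.append(best)
        remaining.remove(best)
        history.append((list(selected), score(frozenset(selected))))
    return selected, history

# Hypothetical per-feature usefulness (NOT values from the paper).
toy_gain = {"a": 0.30, "b": 0.20, "c": 0.05, "d": 0.00}

def toy_score(subset: FrozenSet[str]) -> float:
    """Stand-in for cross-validated accuracy: useful features help, extra features cost a little."""
    return 0.5 + sum(toy_gain[f] for f in subset) - 0.01 * len(subset)

chosen, steps = sequential_forward_selection(list(toy_gain), toy_score, n_select=2)
```

Recording the score after every step makes it possible to pick the subset size with the highest score afterwards, which is how a table such as Table 3 could be produced.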


3.4. Building Machine learning model

3.4.1. Random Forest (RF)

Random forest is a combination of tree predictors proposed by Breiman [39]. He defines it as follows: “A random forest is a classifier consisting of a collection of tree-structured classifiers {h(x,Θk), k = 1, …} where the {Θk} are independent identically distributed random vectors and each tree casts a unit vote for the most popular class at input x”. In Random Forest, several decision trees are built by selecting the features and observations randomly, and their predictions are averaged. Many different trees are grown, and depending on how the trees are built and how randomness is introduced, a variety of random forests exist. The hyperparameters most commonly used to optimize the RF model are n_estimators, max_features, min_samples_leaf, max_depth and criterion. The parameters n_estimators, max_features and min_samples_leaf influence the prediction accuracy.
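The "unit vote" in Breiman's definition can be illustrated with a small sketch. The three `toy_trees` below are hypothetical threshold rules standing in for trained decision trees; the thresholds are made up for illustration and are not taken from the paper or from any fitted model.

```python
from collections import Counter
from typing import Callable, Dict, List

Sample = Dict[str, float]
Tree = Callable[[Sample], int]

# Hypothetical stand-ins for trained trees (1 = disease present, 0 = absent).
toy_trees: List[Tree] = [
    lambda s: 1 if s["trestbps"] > 140 else 0,
    lambda s: 1 if s["chol"] > 240 else 0,
    lambda s: 1 if s["age"] > 55 else 0,
]

def forest_predict(trees: List[Tree], sample: Sample) -> int:
    """Each tree casts one vote; the most popular class wins."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]

patient = {"trestbps": 150.0, "chol": 210.0, "age": 60.0}
prediction = forest_predict(toy_trees, patient)  # two of the three toy trees vote 1
```

In a real forest, n_estimators would set the number of trees, and each tree would be trained on a random sample of rows and features rather than written by hand.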

Fig. 2. Performance of RF with Sequential Forward Selection (SFS).

3.4.2. Extreme Gradient Boosting (XGBoost)

Extreme Gradient Boosting (XGBoost) is an ensemble technique [40] in which a set of weak learners is combined to improve the accuracy. The training speed and learning effect of the XGBoost model have attracted wide attention from the research community. XGBoost is an enhancement of the Gradient Boosting Decision Tree (GBDT) algorithm. It uses the CART model for classification and regression problems. The model is trained by adding a tree in each iteration and splitting on the features to grow the tree. At the end of training, the score of each leaf node is obtained by multiplying the weight with the predicted value of the tree. The hyperparameters min_samples_split, min_samples_leaf and max_depth control overfitting. These parameters influence each individual tree in the model. The parameters learning_rate, n_estimators and subsample enhance the boosting operation. These hyperparameters are tuned using the hyperparameter optimization techniques Grid Search CV, Randomized Search CV and Genetic Programming (TPOT Classifier), and the best hyperparameters are selected to improve the performance of the proposed system.
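To make the roles of n_estimators and learning_rate concrete, the following is a minimal sketch of plain gradient boosting with one-split regression stumps and squared-error loss. It is not XGBoost itself (there is no regularization, subsampling or second-order information), and the one-dimensional toy data are invented for the example.

```python
from typing import List, Sequence, Tuple

Stump = Tuple[float, float, float]  # (threshold, value if x <= threshold, value otherwise)

def fit_stump(xs: Sequence[float], residuals: Sequence[float]) -> Stump:
    """Fit a one-split regression stump to the residuals by least squares."""
    best_err, best_stump = float("inf"), None
    for threshold in sorted(set(xs))[:-1]:  # skip the maximum so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - left_value) ** 2 for r in left) + sum((r - right_value) ** 2 for r in right)
        if err < best_err:
            best_err, best_stump = err, (threshold, left_value, right_value)
    if best_stump is None:
        raise ValueError("need at least two distinct x values")
    return best_stump

def fit_boosting(xs, ys, n_estimators: int = 20, learning_rate: float = 0.3):
    """Start from the mean, then add one stump per iteration fitted to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps: List[Stump] = []
    for _ in range(n_estimators):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lv, rv = fit_stump(xs, residuals)
        stumps.append((t, lv, rv))
        # Shrink each new tree's contribution by the learning rate.
        preds = [p + learning_rate * (lv if x <= t else rv) for x, p in zip(xs, preds)]
    return base, stumps

def predict(base: float, stumps: List[Stump], x: float, learning_rate: float = 0.3) -> float:
    return base + sum(learning_rate * (lv if x <= t else rv) for t, lv, rv in stumps)

def mse(ys, preds) -> float:
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # toy inputs (invented)
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]   # toy targets (invented)
base, stumps = fit_boosting(xs, ys)
baseline_mse = mse(ys, [base] * len(ys))
boosted_mse = mse(ys, [predict(base, stumps, x) for x in xs])
```

A smaller learning_rate generally needs more estimators to reach the same training error, which is one reason the two are usually tuned together.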

Fig. 3. Performance of XGBoost with Sequential Forward Selection (SFS).

Table 4
Features selected with Random Forest Model.
Feature no.  Feature Name
0            age
1            sex
3            trestbps
4            chol
5            fbs
8            exang
10           slope
11           ca
12           thal

Table 5
Features selected with XGBoost Model.
Feature no.  Feature Name
0            age
1            sex
3            trestbps
4            chol
5            fbs
6            restecg
7            thalach
8            exang
10           slope
11           ca
12           thal

3.5. Hyper parameter optimization (HPO)

Selecting the best hyperparameters has a significant impact on model performance. Various optimization techniques exist, each with its own advantages and disadvantages. Experiments were performed with different optimization techniques to discover the best combination of hyperparameters, which was then applied to Random Forest and XGBoost. Tuning a machine learning model is itself an optimization problem. A machine learning model has two types of parameters: (1) hyperparameters and (2) model parameters. Hyperparameters must be set by the user before training the machine learning model and are used to control the learning process. For the same type of machine learning model, different learning rates or weights are used to control the learning process and to discover the patterns hidden in the data. These hyperparameters are tuned to minimize the error and maximize the accuracy of the model. The parameters are tuned by trial and error, and the best hyperparameters are determined. The best hyperparameters balance overfitting and underfitting. Choosing good hyperparameters helps in exploring the search space efficiently. The performance of the RF and XGBoost models can be improved by tuning the hyperparameters. Model parameters, in contrast, are learned during the training phase.

Grid and Random Search approaches are often used in hyperparameter optimization. Grid Search is a traditional technique in which all hyperparameter combinations are evaluated. For example, Grid Search can be applied with learning rate and number of layers as hyperparameters. Initially, a subset of values is defined for each hyperparameter. In each iteration, one combination of hyperparameters is evaluated. Finally, the best hyperparameter combination is taken and applied to the learning process. The search space is restricted to a grid shaped subset and is not


suitable for high dimensional space. Random search samples the search space from the equally distributed search space [41].

The researchers in [42] stated that only a few parameters have a large impact on the optimization of the model score. Grid Search spends more time exploring unimportant parameters. Random search provided a better choice of hyperparameter combination compared to Grid Search. Random search focuses on exploring the hyperparameters that have a greater impact on improving the model score. Random search works better under the assumption that not all parameters are equally important. In random search the experiments are conducted separately, and there is no way to carry the information obtained in one experiment over to the next. These two approaches avoid the model falling into local optima [43]. Their disadvantages are that they are time consuming and not suitable for data with a high dimensional space. Fig. 4 shows the comparison between the grid layout and the random layout.

The Tree-Based Pipeline Optimization Tool (TPOT) is a Genetic Programming (GP) based AutoML system that optimizes ML models automatically [44]. TPOT uses meta learning techniques to optimize machine learning pipelines, using GP primitives to solve a particular problem. It is an AutoML tool in which feature selection, feature preprocessing, feature construction, model selection and parameter optimization take place automatically. The genetic algorithm finds the best parameters. This AutoML system considers multiple machine learning algorithms, multiple preprocessing steps and multiple ways to ensemble.

3.6. Performance metrics

A confusion matrix was employed to evaluate the performance of the two models. True Positive (TP) is the count of predictions that correctly identified the presence of disease. True Negative (TN) is the count of predictions that correctly identified the absence of disease. False Positive (FP) is the count of predictions incorrectly classified as positive (when the actual class was negative). False Negative (FN) is the count of predictions incorrectly classified as negative (when the actual class was positive).

Once the model is trained, the risk of heart disease is predicted and evaluated with ten-fold cross validation. The analysis was done with the performance metrics Accuracy, Specificity, Sensitivity, Precision and ROC-AUC.

$$\text{Accuracy} = \frac{TN + TP}{TN + TP + FN + FP} \tag{1}$$

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{2}$$

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{3}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{4}$$

4. Experimental results

4.1. Experimental setup

The experiments were implemented in Python 3.0 on a DELL machine with an Intel(R) Core(TM) i5-8250U CPU @ 1.60 GHz, 8 GB RAM and Windows 10.

4.2. Results and analysis

In this study, the effect of hyperparameters on the predictive performance of two machine learning models, Random Forest and XGBoost, was examined. Three hyperparameter optimization methods, Grid Search, Randomized Search and Genetic Programming (TPOT Classifier), were used to optimize the ML models. A comparative analysis of the predictive performance of the 2 algorithms, RF and XGBoost, with the 3 hyperparameter techniques was carried out in the experiments. Each algorithm is analyzed by selecting different hyperparameters. We compared the hyperparameter optimization results of Randomized Search, Grid Search and Genetic Programming with the results of existing techniques. 80% and 20% of the data were taken for training and testing, respectively. In our study we used 10-fold cross validation. Previous studies demonstrated that 10-fold cross validation provides a generalized model and avoids overfitting [47,48]. In our research, the presence of heart disease is represented as 1 and the absence of heart disease as 0. We implemented the three hyperparameter optimization techniques using the scikit-learn Python library, which comes with built-in functionality for hyperparameter tuning.

4.3. Experimental results with Random Forest model (Cleveland Dataset)

The Random Forest model was evaluated with different hyperparameter values. Table 6 shows the best hyperparameters obtained with the different optimization techniques for Random Forest.

A confusion matrix is used to validate the performance of the model. The correct and incorrect predictions of the classifier are represented in a 2 × 2 matrix. The confusion matrices for the Random Forest classifier with ten-fold cross validation are given in Tables 7–9.

Table 10 shows the performance analysis of Random Forest with the different optimization techniques.

4.4. Result analysis with XGBoost Model (Cleveland Dataset)

The XGBoost model was evaluated with different hyperparameter values. Table 11 shows the best hyperparameters obtained with the different optimization techniques for the XGBoost model.

The confusion matrices for the XGBoost classifier with ten-fold cross validation are given in Tables 12–14.

Table 15 shows the experimental results of the Extreme Gradient Boosting (XGBoost) model.
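Equations (1)–(4) can be evaluated directly from the four confusion-matrix counts, as in the short sketch below. The counts are those of Table 9 (Random Forest with TPOT); which cell corresponds to TP, TN, FP and FN is our assumption, chosen because it reproduces the sensitivity and specificity reported in Table 10 up to rounding.

```python
from typing import Dict

def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity and precision as defined in Eqs. (1)-(4)."""
    return {
        "accuracy": (tn + tp) / (tn + tp + fn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }

# Counts from Table 9; the TP/TN/FP/FN assignment is an assumption (see above).
metrics = classification_metrics(tp=108, tn=128, fp=3, fn=3)
percentages = {name: round(100 * value, 2) for name, value in metrics.items()}
```

With these counts the accuracy works out to 236/242, in line with the 97.52% reported for this configuration.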

Table 6
Best hyper parameters obtained with different optimization techniques for Random Forest for Cleveland Dataset.
Model          Parameters         Grid Search  Randomized Search  Genetic Programming (TPOT)
Random Forest  n_estimators       200          555                1333
               min_samples_split  3            2                  2
               min_samples_leaf   3            2                  2
               max_features       sqrt         log2               sqrt
               max_depth          890          670                780
               criterion          gini         gini               gini

Fig. 4. Comparison between Grid and Random search Layout [42].
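The difference between the grid and random layouts described in Section 3.5 can be sketched with a toy objective. Everything below is illustrative: `objective` is a hypothetical stand-in for a cross-validated model score in which learning_rate matters far more than max_depth, and the value ranges are assumptions rather than the search spaces used in the paper.

```python
import itertools
import random
from typing import Callable, Dict, List, Tuple

def objective(learning_rate: float, max_depth: int) -> float:
    """Hypothetical score: dominated by learning_rate, barely affected by max_depth."""
    return 1.0 - (learning_rate - 0.1) ** 2 - 0.0001 * (max_depth - 4) ** 2

def grid_search(grid: Dict[str, List]) -> Tuple[Dict, float]:
    """Evaluate every combination of the listed values."""
    names = list(grid)
    candidates = [dict(zip(names, combo)) for combo in itertools.product(*grid.values())]
    best = max(candidates, key=lambda params: objective(**params))
    return best, objective(**best)

def random_search(space: Dict[str, Callable[[random.Random], object]],
                  n_iter: int, seed: int = 0) -> Tuple[Dict, float]:
    """Evaluate n_iter independently sampled combinations."""
    rng = random.Random(seed)
    candidates = [{name: sample(rng) for name, sample in space.items()} for _ in range(n_iter)]
    best = max(candidates, key=lambda params: objective(**params))
    return best, objective(**best)

# Nine grid points, but only three distinct learning rates are ever tried.
grid_best, grid_score = grid_search(
    {"learning_rate": [0.01, 0.3, 0.6], "max_depth": [2, 4, 6]}
)

# Nine random points: up to nine distinct learning rates for the same budget.
random_best, random_score = random_search(
    {
        "learning_rate": lambda rng: rng.uniform(0.01, 0.6),
        "max_depth": lambda rng: rng.randint(2, 6),
    },
    n_iter=9,
)
```

For the same number of evaluations, the random layout probes more distinct values of the influential parameter, which is the argument attributed to [42]; whether it actually finds a better score depends on the samples drawn.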


Table 7
Confusion Matrix for RF Using Grid Search for Cleveland Dataset.
                 Predicted Class
Actual Class     Absent        Present       Actual Value
Absent           98 (88.28%)   8             106
Present          13            123 (93.89%)  136
Total predicted  111           131

Table 8
Confusion Matrix for RF Using Random Search for Cleveland Dataset.
                 Predicted Class
Actual Class     Absent        Present       Actual Value
Absent           103 (92.79%)  4             107
Present          8             127 (96.95%)  135
Total predicted  111           131

Table 9
Confusion Matrix for RF Using Genetic Programming (TPOT Classifier) for Cleveland Dataset.
                 Predicted Class
Actual Class     Absent        Present       Actual Value
Absent           108 (97.29%)  3             111
Present          3             128 (97.70%)  131
Total predicted  111           131

Table 10
Performance analysis of Random Forest with different optimization techniques for Cleveland Dataset.
                 Grid Search  Randomized Search  Genetic Programming (TPOT)
AUC-ROC (%)      91.09        94.80              97.50
Accuracy (%)     91.32        95.04              97.52
Sensitivity (%)  88.28        92.79              97.29
Specificity (%)  93.89        96.95              97.70
Precision (%)    92           96                 97
F1-Score (%)     90           94                 97

Table 11
Best hyper parameters of XGBoost model with different optimization techniques for Cleveland Dataset.
Model    Parameters        Grid Search  Randomized Search  Genetic Programming (TPOT)
XGBoost  Learning rate     0.01         0.01               0.01
         gamma             1            0.5                2
         max_depth         3            4                  3
         min_child_weight  8            1                  1
         n_estimators      600          266                1111
         subsample         0.8          0.8                0.70

Table 12
Confusion Matrix for XGBoost Using Grid Search for Cleveland Dataset.
                 Predicted Class
Actual Class     Absent        Present       Actual Value
Absent           92 (82.88%)   14            106
Present          19            117 (89.31%)  136
Total predicted  111           131

Table 13
Confusion Matrix for XGBoost Using Random Search for Cleveland Dataset.
                 Predicted Class
Actual Class     Absent        Present       Actual Value
Absent           100 (90.09%)  8             108
Present          11            123 (93.89%)  134
Total predicted  111           131

Table 14
Confusion Matrix for XGBoost Using Genetic Programming (TPOT Classifier) for Cleveland Dataset.
                 Predicted Class
Actual Class     Absent        Present       Actual Value
Absent           97 (87.38%)   9             108
Present          14            122 (93.12%)  136
Total predicted  111           131

Table 15
Performance analysis of XGBoost Model with different optimization techniques for Cleveland Dataset.
                 Grid Search  Randomized Search  Genetic Programming (TPOT)
AUC-ROC (%)      86.07        91.99              88.21
Accuracy (%)     86.36        92.14              90.50
Sensitivity (%)  82.88        90.09              87.38
Specificity (%)  89.31        93.89              93.12
Precision (%)    87           93                 92
F1-Score (%)     85           91                 89

Table 16
Performance analysis of Random Forest for Z-Alizadeh Sani Dataset.
                   Artery  Accuracy (%)  Sensitivity (%)  Specificity (%)
Grid Search        LAD     78.0          75.6             80.0
                   LCX     70.3          79.6             56.8
                   RCA     79.1          92.20            61.5
Randomized Search  LAD     80.2          75.6             84
                   LCX     73.6          85.1             56.8
                   RCA     76.9          94.2             74.0
TPOT               LAD     75.8          73.2             78
                   LCX     68.1          72.2             62.1
                   RCA     74.7          88.4             56.4

4.5. Experimental results with Random Forest (Z-Alizadeh Sani Dataset)

The extension of the Z-Alizadeh Sani dataset consists of 3 attributes: LAD, LCX and RCA. Abnormal narrowing of these vessels is considered stenotic and all other cases normal. The Synthetic Minority Oversampling Technique (SMOTE) is applied to overcome the imbalance problem in the dataset. Table 16 shows the performance analysis of Random Forest for the Z-Alizadeh Sani dataset.

4.6. Experimental results with XGBoost (Z-Alizadeh Sani Dataset)

Table 17 shows the experimental results of the XGBoost model.

4.7. Comparative analysis of the models with previous studies

Verma et al. [55] classified heart disease using a hybrid model (CFS, PSO, K-means, and MLP) and obtained an accuracy of 90.28%. Chadha and Mayank [50] carried out experiments with Decision Tree (DT) and Naïve Bayes (NB) models and obtained accuracies of 88.03% and 85.86%, respectively. Mohan et al. [60] developed a hybrid


Table 17
Performance of XGBoost for Z-Alizadeh Sani Dataset.

HPO technique       Artery   Accuracy (%)   Sensitivity (%)   Specificity (%)
Grid Search         LAD      74.7           78.0              72
                    LCX      65.9           75.9              51.3
                    RCA      70.3           80.76             56.4
Randomized Search   LAD      75.8           75.6              76
                    LCX      71.4           79.6              59.5
                    RCA      78.0           86.5              66.6
TPOT                LAD      74.7           73.1              76
                    LCX      69.2           75.9              59.4
                    RCA      78.0           90.3              61.5

model (RF with Linear) and obtained an accuracy of 88.4%. Haq et al. [57] carried out experiments on combinations of various feature selection methods and different machine learning models. Their research concluded that the hybrid of Relief-based feature selection and logistic regression achieved the highest accuracy, 89%, compared to the other models. Saqlain et al. [58] selected the significant features using the Fisher score algorithm, and the resulting feature subset was given to an SVM and validated. The combination of MFSFSA and SVM obtained an accuracy of 81.19%.

Soni et al. [52] applied association rules to classify the disease and obtained an accuracy of 81.51%. Latha and Jeeva [59] employed a voting model with NB, BN, RF and MLP and found an accuracy of 85.48%. Leema et al. [54] proposed a Computer-Aided Diagnostic system which uses Differential Evolution for global search and back propagation for local search. The accuracy obtained from the system was 86.6%. Kumari and Godara [53] proposed an SVM model and obtained an accuracy of 84.12%.

Long et al. [51] proposed a chaos firefly algorithm to predict the disease and removed the irrelevant features using rough sets. The highest accuracy obtained from CFARS-AR is 88.3%. Six different machine learning models were compared by Dwivedi et al. [49], and the LR model obtained the highest accuracy, 85%, compared to the other models. Amin et al. [56] developed a hybrid voting model with Naïve Bayes and logistic regression. The model was trained with the significant features. The hybrid model showed an accuracy of 87.41%. Ayon et al. [61] made a comparative study of seven machine learning models and found that DNN performed better than all the other algorithms. The outcome of the study showed that random forest obtained an accuracy of 87.45% with 10-fold cross validation. Table 18 shows that our proposed models are effective in classifying heart disease when compared to the previous studies.

Babaoglu et al. [62] employed ANN and achieved accuracies of 73%, 64.8% and 69.4% for LAD, LCX and RCA respectively. Alizadehsani et al. [63] studied the presence of CAD by applying two feature selection measures, information gain and Gini index, to extract the effective features, and evaluated the performance with the C4.5 and Bagging algorithms. Alizadehsani et al. [64] examined the demographic, examination and ECG features and obtained accuracies of 74.20%, 63.76% and 68.33% for LAD, LCX and RCA respectively. Table 19 shows that our proposed models are effective when compared to the previous studies.

4.8. Performance analysis with ROC_AUC for Cleveland Dataset

The receiver operating characteristic (ROC) curve is a visualization that plots the “true positive rate” against the “false positive rate”. The area under the ROC curve (AUC) is compared for both models across all three hyperparameter optimization techniques. The closer the AUC value is to 1, the better the model. Fig. 5 shows the AUC-ROC curve for the experiments conducted with Grid Search, in which Random Forest and XGBoost achieved an AUC of 91.09% and 86.09% respectively.

The AUC-ROC curve for the experiments conducted with Randomized Search is illustrated in Fig. 6. With Randomized Search, Random Forest and XGBoost achieved an AUC of 94.80% and 91.99% respectively.

Fig. 7 shows the AUC-ROC curve for the experiments conducted with Genetic Programming (TPOT Classifier). With Genetic Programming (TPOT Classifier), Random Forest and XGBoost achieved an AUC of 97.50% and 88.21% respectively.

4.9. Performance Analysis with ROC_AUC for Z-Alizadeh Sani Dataset

Figs. 8–10 show the AUC-ROC curves for the experiments conducted with the Z-Alizadeh Sani dataset. The diagnosis of stenosis in the LAD vessel achieved 77.8%, 79.8% and 75.5% with Random Forest and 75.0%, 75.8% and 74.5% with XGBoost. The diagnosis of stenosis in the LCX vessel achieved 68.1%, 71% and 67.2% with Random Forest and 63.6%, 69.5% and 67.7% with XGBoost. The diagnosis of stenosis in the RCA vessel achieved 76.9%, 74.0% and 72.4% with Random Forest and 68.5%, 76.6% and 75.9% with XGBoost.
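The ROC analysis described above can be made concrete with a short, dependency-free sketch. It sweeps the decision threshold over the predicted scores, collects (false positive rate, true positive rate) points, and integrates them with the trapezoidal rule. The labels and scores are hypothetical, the function names are illustrative, and the sketch assumes both classes are present; it is not the evaluation code used in the paper.

```python
def roc_points(y_true, scores):
    """Return ROC points (FPR, TPR) by sweeping the threshold from high to low."""
    pos = sum(y_true)
    neg = len(y_true) - pos  # assumes at least one positive and one negative
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    prev_score = None
    for i in order:
        # Emit a point only when the score changes, so tied scores share one point.
        if prev_score is not None and scores[i] != prev_score:
            points.append((fp / neg, tp / pos))
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        prev_score = scores[i]
    points.append((fp / neg, tp / pos))  # final point is (1.0, 1.0)
    return points


def auc(points):
    """Area under a piecewise-linear curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical labels (1 = diseased) and predicted probabilities.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
roc_auc = auc(roc_points(y_true, scores))
print(f"AUC = {roc_auc:.4f}")
```

An AUC of 1.0 corresponds to a ranking that places every diseased case above every healthy one, which is why values closer to 1 indicate a better model.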

Fig. 5. ROC curve with Grid Search.
Fig. 6. ROC curve with Randomized Search.
Fig. 7. ROC curve with Genetic Programming (TPOT Classifier).
Fig. 8. ROC curve for LAD using Random Forest and XGBoost.
Fig. 9. ROC curve for LCX using Random Forest and XGBoost.
Fig. 10. ROC curve for RCA using Random Forest and XGBoost.

Table 18
Comparison of the model from previous studies for Cleveland Dataset.

Authors                  Method                                                  Results
Chadha and Mayank [50]   DT                                                      88.03%
                         NB                                                      85.86%
Long et al. [51]         CFARS-AR                                                88.3%
Soni et al. [52]         Association rules                                       81.51%
Kumari and Godara [53]   SVM                                                     84.12%
Leema et al. [54]        DE + BP                                                 86.6%
Verma et al. [55]        CFS + PSO + K-means + MLP                               90.28%
Amin et al. [56]         Hybrid (NB + LR)                                        87.41%
A.K. Dwivedi [49]        SVM                                                     82.00%
                         LR                                                      85.00%
Haq et al. [57]          Relief-based feature selection + Logistic regression    89%
Saqlain et al. [58]      MFSFSA + Support Vector Machine                         81.19%
Latha and Jeeva [59]     Naïve Bayes + BN + Random Forest + MLP                  85.48%
Mohan et al. [60]        RF + Linear Model                                       88.4%
Ayon et al. [61]         RF                                                      87.45%
Proposed Model           RF with Grid Search                                     91.32%
                         RF with Randomized Search                               95.04%
                         RF with TPOT Classifier                                 97.52%
                         XGBoost with Grid Search                                86.36%
                         XGBoost with Randomized Search                          92.14%
                         XGBoost with TPOT Classifier                            90.50%

Table 19
Comparison of the model from previous studies for Z-Alizadeh Sani Dataset.

Authors                                          Method                                 Accuracy (%)
                                                                                        LAD    LCX    RCA
Babaoglu et al. [62]                             ANN                                    73.0   64.8   69.4
Alizadehsani, Habibi, Alizadehsani et al. [63]   Bagging                                79.5   65.1   68.0
Alizadehsani, Habibi, Bahadorian et al. [64]     Decision Tree                          74.2   63.8   68.3
Proposed Method                                  Random Forest with Randomized Search   80.2   73.6   76.9

5. Conclusion

Every year, heart disease causes a large number of deaths. Predicting heart disease early can save many lives. In this study, sequential forward selection is applied to remove the insignificant features. Removing the irrelevant features significantly improved performance. The machine learning models Random Forest and XGBoost were tuned and tested with three different hyperparameter tuning techniques, and their accuracies were compared with the existing techniques. By tuning the parameters of Random Forest with Grid Search, Randomized Search and Genetic Programming, we obtained accuracies of 91.32%, 95.04% and 97.52% respectively for CHD, better than XGBoost and other state-of-the-art algorithms. On the Z-Alizadeh Sani dataset, Random Forest with Randomized Search shows the highest accuracy among the proposed models, with 80.2%, 73.6% and 76.9% for the diagnosis of abnormality in the LAD, LCX and RCA vessels respectively. In the future, heart disease predictions can be done in real time and the performance can be evaluated in hardware.
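To illustrate the difference between two of the tuning strategies compared in this study, the following minimal sketch contrasts an exhaustive grid search with a randomized search over the same space. Everything here is an assumption made for illustration: the search space, the `validation_score` stand-in (a toy function replacing cross-validated accuracy) and the function names are hypothetical and are not the settings or code used by the authors.

```python
import itertools
import random

# Hypothetical search space; the names mirror common Random Forest settings,
# but the candidate values are illustrative only.
SEARCH_SPACE = {
    "n_estimators": [50, 100, 200, 400],
    "max_depth": [2, 4, 8, 16],
    "min_samples_split": [2, 5, 10],
}


def validation_score(params):
    """Toy stand-in for cross-validated accuracy; peaks at (200, 8, 5)."""
    return (1.0
            - abs(params["n_estimators"] - 200) / 1000
            - abs(params["max_depth"] - 8) / 100
            - abs(params["min_samples_split"] - 5) / 100)


def grid_search(space, score_fn):
    """Evaluate every combination in the Cartesian product of the space."""
    names = list(space)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def randomized_search(space, score_fn, n_iter, seed=0):
    """Evaluate n_iter configurations sampled uniformly (with replacement)."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


grid_params, grid_score = grid_search(SEARCH_SPACE, validation_score)
rand_params, rand_score = randomized_search(SEARCH_SPACE, validation_score, n_iter=10)
print("grid search best:", grid_params, grid_score)
print("randomized search best:", rand_params, rand_score)
```

Grid search here evaluates all 48 combinations, while randomized search evaluates only 10 sampled ones, trading a guarantee of finding the best grid point for a smaller evaluation budget.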


Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] C.D. Mathers, D. Loncar, Projections of global mortality and burden of disease from 2002 to 2030, PLoS Med. 3 (11) (2006).
[2] M. Akhil, Dr. Priti Chandra, Dr. B.L. Deekshatulu, “Heart Disease Prediction System using Associative Classification and Genetic Algorithm”, International Conference on Emerging Trends in Electrical, Electronics and Communication Technologies-ICECIT, 2012.
[3] Ibrahim Umar Said, Abdullahi Haruna Adam, Dr. Ahmed Baita Garko, “Association Rule Mining On Medical Data To Predict Heart Disease”, International Journal of Science Technology and Management, August 2015, pp. 26–35.
[4] A. Chauhan, A. Jain, P. Sharma, V. Deep, Heart disease prediction using evolutionary rule learning, in: 2018 4th International Conference on Computational Intelligence & Communication Technology (CICT), Ghaziabad, 2018, pp. 1–4.
[5] Ilaria Castelli, Edmondo Trentin, Combination of supervised and unsupervised learning for training the activation functions of neural networks, Pattern Recogn. Lett. 37 (2014) 178–191.
[6] R. Safdari, T. Samad-Soltani, M. GhaziSaeedi, M. Zolnoori, Evaluation of classification algorithms vs knowledge-based methods for differential diagnosis of asthma in Iranian patients, Int. J. Inform. Syst. Serv. Sect. 10 (2) (2018) 22–26.
[7] F.S. Alotaibi, Implementation of machine learning model to predict heart failure disease, Int. J. Adv. Comput. Sci. Appl. 10 (6) (2019) 261–268.
[8] M.I. Jordan, T.M. Mitchell, Machine learning: trends, perspectives, and prospects, Science 349 (6245) (2015) 255–260, https://doi.org/10.1126/science.aaa8415.
[9] M. Motwani, D. Dey, D.S. Berman, G. Germano, S. Achenbach, M.H. Al-Mallah, D. Andreini, M.J. Budoff, F. Cademartiri, T.Q. Callister, “Machine learning for prediction of all-cause mortality in patients with suspected coronary artery disease: a 5-year multicentre prospective registry analysis”, Eur. Heart J. 38(7), pp. 500–507, 2016.
[10] A. Sani, “Machine Learning for Decision Making”, Université de Lille 1, 2015.
[11] W. Raghupathi, V. Raghupathi, “Big data analytics in healthcare: promise and potential”, Health Inf. Sci. Syst. 2(3), 2014.
[12] P. Groves, B. Kayyali, D. Knott, S.V. Kuiken, “The ‘Big Data’ Revolution in Healthcare: Accelerating Value and Innovation”, 2016.
[13] T. Condie, P. Mineiro, N. Polyzotis, M. Weimer, Machine learning on big data, in: 2013 IEEE 29th International Conference on Data Engineering (ICDE), IEEE, 2013, pp. 1242–1244.
[14] Jia Wu, Xiu-Yun Chen, Hao Zhang, Li-Dong Xiong, Hang Lei, Si-Hao Deng, Hyperparameter optimization for machine learning models based on Bayesian optimization, J. Electron. Sci. Technol. 17 (1) (2019).
[15] Debabrata Swain, Preeti Ballal, Vishal Dolase, Banchhanidhi Dash, Jayasri Santhappan, An efficient heart disease prediction system using machine learning, Advances in Intelligent Systems and Computing, Springer Nature Singapore Pvt Ltd, 2020.
[16] M.A. Jabbar, B.L. Deekshatulu, P. Chandra, “An Evolutionary Algorithm for Heart Disease Prediction”, in: K.R. Venugopal, L.M. Patnaik (eds.) Wireless Networks and Computational Intelligence. ICIP 2012. Communications in Computer and Information Science, vol 292. Springer, Berlin, Heidelberg. DOI:10.1007/978-3-642-31686-9_44.
[17] R. Das, I. Turkoglu, A. Sengur, Effective diagnosis of heart disease through neural networks ensembles, Expert Syst. Appl. 36 (4) (2009) 7675–7680.
[18] R. Chitra, V. Seenivasagam, Heart disease prediction system using supervised learning classifier, Bonfring Int. J. Software Eng. Soft Comput. 3 (1) (2013) 1.
[19] K. Vembandasamy, R. Sasipriya, E. Deepa, Heart diseases detection using Naive Bayes Algorithm, Int. J. Innov. Sci. Eng. Technol. 2 (9) (2015).
[20] R. Kavitha, T. Christopher, “An effective classification of heart rate data using PSO-FCM clustering and enhanced support vector machine,” Indian Journal of Science and Technology, 8(30), 2015.
[21] N. Bhatla, K. Jyoti, An analysis of heart disease prediction using different data mining techniques, Int. J. Eng. Res. Technol. 1 (8) (2012) 1–4.
[22] V. Krishnaiah, G. Narsimha, N.S. Chandra, “Heart Disease Prediction System Using Data Mining Technique by Fuzzy K-NN Approach”, in: Satapathy S., Govardhan A., Raju K., Mandal J. (eds) Emerging ICT for Bridging the Future - Proceedings of the 49th Annual Convention of the Computer Society of India (CSI) Volume 1. Advances in Intelligent Systems and Computing, vol 337. Springer, Cham, 2015. DOI:10.1007/978-3-319-13728-5_42.
[23] S. U. Amin, K. Agarwal and R. Beg, “Genetic neural network based data mining in prediction of heart disease using risk factors,” in: 2013 IEEE Conference on Information & Communication Technologies, Thuckalay, Tamil Nadu, India, 2013, pp. 1227–1231, doi: 10.1109/CICT.2013.6558288.
[24] F.Z. Abdeldjouad, M. Brahami, N. Matta, “A Hybrid Approach for Heart Disease Diagnosis and Prediction Using Machine Learning Techniques”, in: M. Jmaiel, M. Mokhtari, B. Abdulrazak, H. Aloulou, S. Kallel (eds.) The Impact of Digital Technologies on Public Health in Developed and Developing Countries. ICOST 2020. Lecture Notes in Computer Science, vol 12157. Springer, Cham, 2020. DOI: 10.1007/978-3-030-51517-1_26.
[25] G. Purusothaman, P. Krishnakumari, “A survey of data mining techniques on risk prediction: heart disease”, Indian J. Sci. Technol. 8(12), 2015.
[26] Youness Khourdifi, Mohamed Bahaj, Heart disease prediction and classification using machine learning algorithms optimized by particle swarm optimization and ant colony optimization, Int. J. Intell. Eng. Syst. 12 (1) (2018) 242–252.
[27] C. Kalaiselvi, G. Nasira, Prediction of heart diseases and cancer in diabetic patients using data mining techniques, Indian J. Sci. Technol. 8(14), 2015.
[28] T. Santhanam, E. Ephzibah, Heart disease prediction using hybrid genetic fuzzy model, Indian J. Sci. Technol. 8 (9) (2015) 797–803.
[29] Senthilkumar Mohan, Chandrasegar Thirumalai, Gautam Srivastava, Effective heart disease prediction using hybrid machine learning techniques, IEEE Access 7 (2019) 81542–81554, https://doi.org/10.1109/ACCESS.2019.2923707.
[30] Kaan Uyar, Ahmet İlhan, Diagnosis of heart disease using genetic algorithm based trained recurrent fuzzy neural networks, Procedia Comput. Sci. 120 (2017) 588–593.
[31] Roohallah Alizadehsani, Jafar Habibi, Javad Hosseini Mohammad, Reihane Boghrati, Asma Ghandeharioun, Behdad Bahadorian, et al., Diagnosis of Coronary Artery Disease Using Data Mining Techniques Based on Symptoms and ECG Features, European Journal of Scientific Research 82 (4) (2012) 542–553.
[32] Alizadehsani Roohallah et al. “Exerting Cost-Sensitive and Feature Creation Algorithms for Coronary Artery Disease Diagnosis.” IJKDB 3.1 (2012): 59-79. Web. 1 May. 2021. doi:10.4018/jkdb.2012010104.
[33] Roohallah Alizadehsani, Mohamad Roshanzamir, Moloud Abdar, Adham Beykikhoshk, Mohammad Hossein Zangooei, Abbas Khosravi, Saeid Nahavandi, Ru San Tan, U. Rajendra Acharya, Model uncertainty quantification for diagnosis of each main coronary artery stenosis, Soft Computing 24 (13) (2020) 10149–10160, https://doi.org/10.1007/s00500-019-04531-0.
[34] Roohallah Alizadehsani, Jafar Habibi, Mohammad Javad Hosseini, Hoda Mashayekhi, Reihane Boghrati, Asma Ghandeharioun, Behdad Bahadorian, Zahra Alizadeh Sani, A data mining approach for diagnosis of coronary artery disease, Comput. Methods Programs Biomed. 111 (1) (2013) 52–61, https://doi.org/10.1016/j.cmpb.2013.03.004.
[35] R. Alizadehsani, M. Roshanzamir, M. Abdar, A. Beykikhoshk, A. Khosravi, M. Panahiazar, A. Koohestani, F. Khozeimeh, S. Nahavandi, N. Sarrafzadegan, A database for using machine learning and data mining techniques for coronary artery disease diagnosis, Scientific Data 6 (1) (2019), https://doi.org/10.1038/s41597-019-0206-3.
[36] Roohallah Alizadehsani, Moloud Abdar, Mohamad Roshanzamir, Abbas Khosravi, Parham M. Kebria, Fahime Khozeimeh, Saeid Nahavandi, Nizal Sarrafzadegan, U. Rajendra Acharya, Machine learning-based coronary artery disease diagnosis: a comprehensive review, Comput. Biol. Med. 111 (2019) 103346, https://doi.org/10.1016/j.compbiomed.2019.103346.
[37] Roohallah Alizadehsani, Mohammad Hossein Zangooei, Mohammad Javad Hosseini, Jafar Habibi, Abbas Khosravi, Mohamad Roshanzamir, Fahime Khozeimeh, Nizal Sarrafzadegan, Saeid Nahavandi, Coronary artery disease detection using computational intelligence methods, Knowledge-Based Syst. 109 (2016) 187–197, https://doi.org/10.1016/j.knosys.2016.07.004.
[38] R. Alizadehsani, M.J. Hosseini, Z.A. Sani, A. Ghandeharioun, R. Boghrati, Diagnosis of coronary artery disease using cost-sensitive algorithms, in: Paper presented at the 2012 IEEE 12th International Conference on Data Mining Workshops, Brussels, Belgium, 2012.
[39] L. Breiman, Random Forests, Machine Learning 45 (2001) 5–32.
[40] T. Chen and C. Guestrin, “XGBoost: A scalable tree boosting system,” in: Proc. 22nd ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, San Francisco, CA, USA, Aug. 2016, pp. 785–794, doi:10.1145/2939672.2939785.
[41] Douglas C. Montgomery, Design and Analysis of Experiments, John Wiley & Sons, 2017.
[42] J. Bergstra, Y. Bengio, Random search for hyper-parameter optimization, J. Mach. Learn. Res. 13 (2012) 281–305.
[43] R. Liessner, J. Schmitt, A. Dietermann, and B. Bäker, “Hyperparameter Optimization for Deep Reinforcement Learning in Vehicle Energy Management”, in: Proceedings of the 11th International Conference on Agents and Artificial Intelligence (ICAART 2019), pages 134–144, ISBN: 978-989-758-350-6.
[44] R.S. Olson, J.H. Moore, TPOT: A Tree-Based Pipeline Optimization Tool for Automating Machine Learning, in: F. Hutter, L. Kotthoff, J. Vanschoren (Eds.), Automated Machine Learning. The Springer Series on Challenges in Machine Learning, Springer, Cham, 2019, https://doi.org/10.1007/978-3-030-05318-5_8.
[45] https://archive.ics.uci.edu/ml/machine-learning-databases/heart-disease/heart-disease.names.
[46] http://archive.ics.uci.edu/ml/datasets/extention+of+Z-Alizadeh+sani+dataset.
[47] R. Kohavi, D. Wolpert, Bias plus variance decomposition for zero-one loss functions, in: Proc. 13th Int. Conf. Mach. Learn., San Francisco, CA, USA, 1996, pp. 275–283.
[48] J. Han, M. Kamber, Data Mining: Concepts and Techniques, Elsevier, San Diego, CA, USA, 2012.
[49] Ashok Kumar Dwivedi, Performance evaluation of different machine learning techniques for prediction of heart disease, Neural Comput. Appl. 29 (10) (2018) 685–693.
[50] Ritika Chadha, Shubhankar Mayank, Prediction of heart disease using data mining techniques, CSI Trans. ICT 4 (2-4) (2016) 193–198.
[51] N.C. Long, P. Meesad, H. Unger, A highly accurate firefly-based algorithm for heart disease prediction, Expert Syst. Appl. 42 (2015) 8221–8231.
[52] J. Soni, U. Ansari, D. Sharma, S. Soni, Intelligent and effective heart disease prediction system using weighted associative classifiers, Int. J. Comput. Sci. Eng. 3 (6) (2011) 2385–2392.
[53] M. Kumari, S. Godara, Comparative study of data mining classification methods in cardiovascular disease prediction, IJCST 2 (2) (2011) 304–308.


[54] N. Leema, H. Khanna Nehemiah, A. Kannan, Neural network classifier optimization using differential evolution with global information and back propagation algorithm for clinical datasets, Appl. Soft Comput. 49 (2016) 834–844.
[55] L. Verma, S. Srivastava, P.C. Negi, A hybrid data mining model to predict coronary artery disease cases using non-invasive clinical data, J. Med. Syst. 40 (7) (2016) 178, https://doi.org/10.1007/s10916-016-0536-z.
[56] M. S. Amin, Y. K. Chiam, and K. D. Varathan, “Identification of significant features and data mining techniques in predicting heart disease,” Telematics Inform., vol. 36, pp. 82–93, Mar. 2019, doi: 10.1016/j.tele.2018.11.007.
[57] A. U. Haq, J. P. Li, M. H. Memon, S. Nazir, and R. Sun, “A hybrid intelligent system framework for the prediction of heart disease using machine learning algorithms,” Mobile Inf. Syst., vol. 2018, pp. 1–21, Dec. 2018, doi: 10.1155/2018/3860146.
[58] S. M. Saqlain, M. Sher, F. A. Shah, I. Khan, M. U. Ashraf, M. Awais, and A. Ghani, “Fisher score and Matthews correlation coefficient-based feature subset selection for heart disease diagnosis using support vector machines,” Knowl. Inf. Syst., 58(1), pp. 139–167, Jan. 2019, doi:10.1007/s10115-018-1185-y.
[59] C. B. C. Latha and S. C. Jeeva, “Improving the accuracy of prediction of heart disease risk based on ensemble classification techniques,” Inform. Med. Unlocked, 16, Jan. 2019, Art. no. 100203, doi:10.1016/j.imu.2019.100203.
[60] Senthilkumar Mohan, Chandrasegar Thirumalai, Gautam Srivastava, Effective heart disease prediction using hybrid machine learning techniques, IEEE Access 7 (2019) 81542–81554, https://doi.org/10.1109/ACCESS.2019.2923707.
[61] Safial Islam Ayon, Md. Milon Islam, Md. Rahat Hossain, Coronary Artery heart disease prediction: a comparative study of computational intelligence techniques, IETE J. Res. (2020), https://doi.org/10.1080/03772063.2020.1713916.
[62] I. Babaoglu, O. Fındık, M. Bayrak, Effects of principle component analysis on assessment of coronary artery diseases using support vector machine, Expert Syst. Appl. 37 (3) (2010) 2182–2185, https://doi.org/10.1016/j.eswa.2009.07.055.
[63] Roohallah Alizadehsani, Jafar Habibi, Zahra Alizadeh Sani, Hoda Mashayekhi, Reihane Boghrati, Asma Ghandeharioun, Fahime Khozeimeh, Fariba Alizadeh-Sani, Diagnosing coronary artery disease via data mining algorithms by considering laboratory and echocardiography features, Res. Cardiovasc. Med. 2 (3) (2013) 133, https://doi.org/10.5812/cardiovascmed.10888.
[64] Roohallah Alizadehsani, Jafar Habibi, Behdad Bahadorian, Hoda Mashayekhi, Asma Ghandeharioun, Reihane Boghrati, et al., Diagnosis of coronary arteries stenosis using data mining, J. Medical Signals Sens. 2 (3) (2012) 57–65.
[65] Seyed Mohammad Jafar Jalali, Abbas Khosravi, Roohallah Alizadehsani, Syed Moshfeq Salaken, Parham Mohsenzadeh Kebria, Rishi Puri, Saeid Nahavandi, Parsimonious evolutionary-based model development for detecting artery disease, in: ICIT 2019: Proceedings of the IEEE International Conference on Industrial Technology, IEEE, Piscataway, N.J., 2019, pp. 800–805, doi: 10.1109/ICIT.2019.8755107.
[66] M. A. Sohail, Z. Taufique, S. M. Abubakar, W. Saadeh and M. A. Bin Altaf, “An ECG Processor for the Detection of Eight Cardiac Arrhythmias with Minimum False Alarms,” in: 2019 IEEE Biomedical Circuits and Systems Conference (BioCAS), 2019, pp. 1–4, doi: 10.1109/BIOCAS.2019.8919053.
[67] S.M. Abubakar, M. Rizwan Khan, W. Saadeh, M. A. Bin Altaf, “A Wearable Auto-Patient Adaptive ECG Processor for Shockable Cardiac Arrhythmia,” in: 2018 IEEE Asian Solid-State Circuits Conference (A-SSCC), 2018, pp. 267–268, doi: 10.1109/ASSCC.2018.8579263.
[68] Shihui Yin, Minkyu Kim, Deepak Kadetotad, Yang Liu, Chisung Bae, Sang Joon Kim, Yu Cao, Jae-Sun Seo, A 1.06-µW Smart ECG Processor in 65-nm CMOS for Real-Time Biometric Authentication and Personal Cardiac Monitoring, IEEE J. Solid-State Circ. 54 (8) (2019) 2316–2326, https://doi.org/10.1109/JSSC.2019.2912304.
[69] S. M. Abubakar, W. Saadeh and M. A. B. Altaf, “A wearable long-term single-lead ECG processor for early detection of cardiac arrhythmia,” in: 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2018, pp. 961–966, doi: 10.23919/DATE.2018.8342148.
[70] S. Izumi, et al., A 14 µA ECG processor with robust heart rate monitor for a wearable healthcare system, in: 2013 Proceedings of the ESSCIRC (ESSCIRC), 2013, pp. 145–148, https://doi.org/10.1109/ESSCIRC.2013.6649093.


Heart Disease Prediction Using Hyper Parameter Optimization (HPO) Tuning

Biomedical Signal Processing and Control 70 (2021) 103033

3.4. Building Machine learning model

3.4.1. Random Forest (RF)

3.5. Hyper parameter optimization (HPO)

Selecting the best hyper parameters has a significant impact on the

Random n_estimators 200 555 1333

Grid Randomized Genetic Programming
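
Two of the HPO techniques named above, grid search and randomized search, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the search space values, the `toy_score` function (a stand-in for cross-validated accuracy), and the helper names `grid_search` and `randomized_search` do not come from the paper, and no real classifier or dataset is involved.

```python
import itertools
import random

# Hypothetical search space; the values are illustrative, not the paper's grid.
SEARCH_SPACE = {
    "n_estimators": [100, 200, 400, 800],
    "max_depth": [2, 4, 8, 16],
}


def toy_score(params):
    """Stand-in for cross-validated accuracy (an assumption, not real data).

    The score peaks at n_estimators=400 and max_depth=8.
    """
    return (
        1.0
        - abs(params["n_estimators"] - 400) / 1000
        - abs(params["max_depth"] - 8) / 100
    )


def grid_search(space, score):
    """Evaluate every combination in the grid and return the best one."""
    names = list(space)
    candidates = [
        dict(zip(names, values))
        for values in itertools.product(*(space[name] for name in names))
    ]
    return max(candidates, key=score)


def randomized_search(space, score, n_iter, seed=0):
    """Evaluate n_iter randomly sampled combinations and return the best one."""
    rng = random.Random(seed)
    candidates = [
        {name: rng.choice(values) for name, values in space.items()}
        for _ in range(n_iter)
    ]
    return max(candidates, key=score)


best_grid = grid_search(SEARCH_SPACE, toy_score)
best_random = randomized_search(SEARCH_SPACE, toy_score, n_iter=5)
```

Grid search evaluates all 16 combinations, so it always finds the best point in the grid. Randomized search evaluates only the 5 sampled combinations, so it is cheaper but can miss that point. In a real experiment `toy_score` would be replaced by a cross-validated evaluation of the classifier.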

4.6. Experimental results with XGBoost (Z-Alizadeh Sani Dataset)
