
SN Computer Science (2020) 1:344
https://doi.org/10.1007/s42979-020-00370-1

ORIGINAL RESEARCH

Sequential Feature Selection and Machine Learning Algorithm-Based Patient's Death Events Prediction and Diagnosis in Heart Disease

Ritu Aggrawal1 · Saurabh Pal1

Received: 3 October 2020 / Accepted: 6 October 2020
© Springer Nature Singapore Pte Ltd 2020

This article is part of the topical collection "Advances in Computational Approaches for Artificial Intelligence, Image Processing, IoT and Cloud Applications" guest edited by Bhanu Prakash K N and M. Shivakumar.

* Corresponding author: Saurabh Pal, [email protected]
Ritu Aggrawal, [email protected]
1 VBS Purvanchal University, Jaunpur, India

Abstract
Because data with many features are now widely accessible, many feature selection techniques are available in the literature. Such features give the data extremely high dimensionality. Feature selection provides a way to reduce computation time, improve prediction performance, and gain a better understanding of the data in machine learning and pattern recognition applications. As the related works reviewed here point out, existing studies generally focus only on maximizing classification accuracy, whereas for real-world applications the selected feature subset should instead be stable. This research proposes a sequential feature selection algorithm for detecting death events in heart disease patients during treatment, used to select the most important features. Several machine learning algorithms (LDA, RF, GBC, DT, SVM, and KNN) are employed, and the accuracy obtained with this method (SFS) is compared with the accuracy of each plain classifier. The confusion matrix, ROC curve, precision, recall, and f1-score are also calculated to verify the results obtained by the SFS algorithm. The experimental results show that with the SFS method, RandomForestClassifier_FS reaches 86.67% accuracy.

Keywords Heart disease · SFS · RF · SVM · KNN · Confusion matrix · ROC · Precision · Recall

Introduction

Heart failure is a complex clinical syndrome, not a single disease [1]. It is difficult to identify coronary heart disease from some common risk factors, such as diabetes, high blood pressure, elevated cholesterol, abnormal heart rhythm and breathing difficulty, or from signs of underlying disease such as raised jugular venous pressure, pulmonary crackles and peripheral edema [2]. The characteristics of coronary artery disease are complex, so the disease must be managed with caution from the outset; not doing so may damage the heart or cause premature death. Heart failure is a serious condition associated with high morbidity and mortality [3]. According to statistics from the Society of Cardiology, 26 million adults worldwide suffer from heart failure, and about 3.6 million new cases are diagnosed each year. Of patients with heart failure, 17–45% die within the first year, and the rest die within 5 years. The cost of managing diagnosed heart failure accounts for approximately 1–2% of all healthcare expenditure, most of which is related to recurrent hospital admissions [4]. The prognosis of coronary artery disease depends on clinical measurements, especially heart rate, gender, age and many other factors.

Although great progress has been made in understanding the complex pathophysiology of heart failure, the volume and complexity of the information and data to be analyzed and monitored turn accurate and effective diagnosis of heart failure, and the evaluation of treatment options, into a very demanding task [5]. These factors, plus the benefits of early detection of heart failure, explain the massive use of machine learning programs to investigate, predict and classify clinical information. Machine learning strategies are a kind of data mining program, and these programs have stimulated great enthusiasm for exploring clinical data. Precisely ordering disease stages, etiologies or subtypes allows medication and intervention to be delivered in an effective
and focused manner, and makes it possible to evaluate the patient's disease progression [6]. Even when heart failure is only diagnosed at a later stage, data mining strategies may still be beneficial, because in that case the benefits of intervention and the chances of survival are limited, and these strategies can predict mortality, morbidity and the risk of readmission. They use the information recorded in the subject's health record: demographic data, clinical history data, presenting symptoms, physical examination results, laboratory information and ECG examination results. This article presents a comprehensive application of machine learning strategies to solve the aforementioned problems [7], as shown in Fig. 1.

In the process of recognizing heart failure, choosing the ideal subset of features is a crucial task. The advantages of discarding trivial features include the removal of correlated features, which reduces the complexity of the calculations, as well as improving the treatment cycle that includes detecting heart failure [8]. In any case, the ideal feature subset to be used in a disease prediction and analysis framework is still an open question in the literature. Existing related works in the literature generally revolve around selecting a subset of features to maximize the accuracy of a single/large-scale classifier.

Fig. 1  Heart failure management structure

Literature Review

In the research literature, most studies employ feature selection strategies together with other machine learning algorithms as a means of classifying subjects as healthy or as patients with heart failure. These techniques are described in Table 1. The fundamental difference between these strategies lies in the characteristics they use to identify heart failure.

Tools and Techniques

The following section examines each model, instrument, method and algorithm used in this study, all of which are of great significance to the method proposed in this article.
Table 1  Literature survey

Kumar et al. [9]
• Disease/prediction: heart murmur classification.
• Method: nonlinear classifier (SVM), sequential floating forward selection (SFFS).
• Features: out of 17 features, 10 were selected. Feature Set 1: loudness 1, zcr 1, transition ratio 1, spectral power 2, fundamental frequency 1, spectral shape 1, spectral power 3, flux 1, stat skewness 1, Lyapunov exponent 1. Feature Set 2: PoS 1–32, no selection, kept as in the original work. Feature Set 3: WT detail 7, VFD 8, Shannon energy 5, Shannon energy 6, GMM cycle 5, Shannon energy 4, GMM murmur 5, Eigenfrequency1 2, WT entropy 10, GMM cycle 4, Eigenfrequency1 1, Shannon energy 8, VFD 2, HOS 1. Feature Set 4: loudness 1, transition ratio 1, spectral power 2, stat skewness 1, Lyapunov exponent 1, PoS 13, PoS 16, PoS 17, PoS 18, PoS 22, Shannon energy 8, WT details 5, WT entropy 6, ST map 3, Eigenfrequencies1 4, Eigenfrequencies2 10, Eigentimes1 5, HOS 6, HOS 13, GMMx 7, GMMx murmur 4, VFD 3.
• Evaluation measures: sensitivity (Se, %): Set 1: 95.74, Set 2: 93.02, Set 3: 90.51, Set 4: 96.15; specificity (Sp, %): Set 1: 95.01, Set 2: 96.79, Set 3: 91.26, Set 4: 96.16.

Khemphila et al. [12]
• Disease/prediction: heart disease classification.
• Method: multi-layer perceptron (MLP) with backpropagation learning algorithm, feature selection algorithm, artificial neural networks (ANN).
• Features: out of 13 features, 8 were selected by the feature selection algorithm: Thal, Chest Pain Type, Number Colored Vessels, Old Peak, Maximum Heart Rate, Induced Angina, Slope, Age.
• Evaluation measures: with 13 features, training accuracy 88.46%, validation accuracy 80.17%; with 8 features, training accuracy 89.56%, validation accuracy 80.99%.

Mokeddem et al. [10]
• Disease/prediction: coronary artery disease.
• Method: genetic algorithm (GA) wrapped Naïve Bayes (BN), best first search (BFS), sequential floating forward search (SFFS).
• Features selected by the FS approaches: GA-wrapped BN: cp, sex, restecg, oldpeak, slope, ca, thal; GA-wrapped SVM: cp, age, exang, oldpeak, slope, ca, thal; GA-wrapped MLP: cp, age, sex, restbps, slope, ca, thal; GA-wrapped C4.5: cp, fbs, ca, thal; BFS-wrapped BN: chol, fbs, thalach, exang, ca, thal; SFFS-wrapped BN: cp, restecg, thalach, oldpeak, ca, thal.
• Evaluation measures: classification accuracy SVM 83.5%, MLP 83.16%, C4.5 80.85%; wrapper-based feature selection accuracy: GA wrapper 85.50%.

Usman et al. [13]
• Disease/prediction: heart disease prediction.
• Method: cuckoo search algorithm (CSA) and cuckoo optimization algorithm (COA) with SVM, MLP, NB and RFC.
• Features selected by COA and CSA on five heart disease data sets (data set: total features / COA / CSA): Eric: 7 / 4 / 4; Echocardiogram: 12 / 5 / 4; Hungarian: 13 / 6 / 4; Statlog: 13 / 6 / 4; Z-Alizadeh Sani: 55 / 14 / 7.
• Evaluation measures: on all data sets, CSA performed better than COA after feature selection; SVM had the highest accuracy before and after feature selection among all five data sets.

Haq et al. [11]
• Disease/prediction: heart disease prediction.
• Method: sequential backward selection (SBS), k-nearest neighbor (k-NN).
• Features: 13 features, eliminated by the SBS approach one feature at a time.
• Evaluation measures: with k = 8 for the k-NN kernel, the highest average accuracy is 90%, reached after eliminating 6 features.

Javeed et al. [14]
• Disease/prediction: heart failure risk prediction.
• Method: floating window with adaptive size for feature elimination with an artificial neural network (FWAFE-ANN) and with a deep neural network (FWAFE-DNN).
• Features: Experiment 1: the FWAFE method selected feature subsets of size n = 6, n = 7 and n = 11; the optimal accuracy was found at n = 11 by applying ANN. Experiment 2: the optimal accuracy was found at n = 11 by applying DNN.
• Evaluation measures: Experiment 1: FWAFE-ANN accuracy 91.11%; Experiment 2: FWAFE-DNN accuracy 93.23%.

Yadav et al. [15]
• Disease/prediction: heart disease.
• Method: Pearson correlation, recursive feature elimination and lasso regularization with M5P, random tree, reduced error pruning and the random forest ensemble method.
• Features selected by the three methods: Pearson correlation: cp, exang, oldpeak and target; recursive feature elimination: 12 features selected; lasso regularization: 10 features selected.
• Evaluation measures: accuracy: Pearson correlation 99.9%; recursive feature elimination 94.12%; lasso regularization 99.9%.
Feature Selection Algorithms

Feature selection is a broad topic in machine learning. In general, feature selection techniques can be divided into two categories: wrapper-based methods and filter-based methods. Although wrapper-based methods carry a higher computational overhead than filter-based methods, they can generally provide better solutions. Wrapper-based methods include sequential forward feature selection (SFS), sequential backward feature selection (SBS), and others. For feature selection in our framework, we use the sequential FS algorithm, which selects the important features [16].

Because of their iterative nature, these algorithms are called sequential algorithms. The sequential feature selection (SFS) algorithm starts with an empty set and, in the first step, adds the single feature that gives the highest value of the objective function. From the second step onward, the remaining features are added one at a time to the current subset and the new subset is evaluated [17]. The loop is repeated until the required number of features has been included. This is a naive SFS procedure because it does not account for dependencies between features. The sequential backward selection (SBS) algorithm is set up like SFS, but the calculation starts from the full set of variables and eliminates one feature at a time, removing the feature whose withdrawal has the least impact on the predictor's performance [18].
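As a concrete illustration of the forward loop just described, the sketch below uses scikit-learn's SequentialFeatureSelector on the heart failure data. The paper does not name the library or parameter values it used, so the API choice, the file name, and n_features_to_select=5 are assumptions.

```python
# Sketch: sequential forward selection (SFS) wrapped around a classifier.
# Assumes scikit-learn >= 0.24 and the UCI heart failure CSV.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector

df = pd.read_csv("heart_failure_clinical_records_dataset.csv")
X = df.drop(columns=["DEATH_EVENT"])
y = df["DEATH_EVENT"]

# Forward SFS: start from an empty set and greedily add the feature that
# most improves fivefold cross-validated accuracy.
sfs = SequentialFeatureSelector(
    RandomForestClassifier(random_state=0),
    n_features_to_select=5,   # stop once 5 features are included
    direction="forward",      # direction="backward" gives SBS instead
    scoring="accuracy",
    cv=5,
)
sfs.fit(X, y)
print("Selected features:", list(X.columns[sfs.get_support()]))
```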
Machine Learning Classifiers

To classify heart disease patients and healthy individuals, machine learning classification algorithms are used. This article briefly discusses some well-known classification algorithms and their assumptions.

Linear Discriminant Analysis (LDA)

LDA is used when the variance-covariance matrix of all populations is homogeneous. In LDA, the decision rule depends on the linear score function, which is a function of the population means \mu_i of each of our g groups and the pooled variance-covariance matrix [19]. The linear score function is

s_i^L(X) = -\frac{1}{2}\mu_i' \Sigma^{-1} \mu_i + \mu_i' \Sigma^{-1} X + \log P(\Pi_i) = d_{i0} + \sum_{j=1}^{p} d_{ij} x_j + \log P(\Pi_i) = d_i^L(X) + \log P(\Pi_i),

where d_{i0} = -\frac{1}{2}\mu_i' \Sigma^{-1} \mu_i, d_{ij} is the jth element of \Sigma^{-1}\mu_i, and d_i^L(X) is a linear discriminant function. The linear score function depends on the unknown parameters \mu_i and \Sigma, so we must estimate their values from the training data.

Random Forest (RF)

Random forest builds many individual decision trees during training. The final prediction summarizes the predictions of all trees, by majority vote for classification or by averaging for regression. Because several results are combined to reach the final conclusion, random forest is called an "ensemble method". To select important features, importance is normalized over all trees at the random forest level [20]. The overall importance of a feature is determined in each tree and divided by the total number of trees:

RFfi_i = \frac{\sum_{j \in \text{all trees}} normfi_{ij}}{T},

where RFfi_i is the importance of feature i calculated from all trees in the random forest model, normfi_{ij} is the normalized importance of feature i in tree j, and T is the number of trees.
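Read against this formula, scikit-learn's feature_importances_ attribute on a fitted forest already returns each tree's normalized impurity-based importances averaged over all T trees, so RFfi_i can be read off directly. A minimal sketch (the library choice and file name are assumptions, as before):

```python
# Sketch: per-feature random forest importance, RFfi_i = (sum_j normfi_ij) / T.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

df = pd.read_csv("heart_failure_clinical_records_dataset.csv")
X, y = df.drop(columns=["DEATH_EVENT"]), df["DEATH_EVENT"]

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# feature_importances_ averages the per-tree normalized importances,
# so the values below correspond to RFfi_i for each feature i.
for name, imp in sorted(zip(X.columns, rf.feature_importances_),
                        key=lambda pair: pair[1], reverse=True):
    print(f"{name:25s} {imp:.3f}")
```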
Decision Tree Classifier

Decision trees are standard machine learning algorithms. A decision tree is simply a tree in which each node is either a leaf node or a decision node. The decision tree technique is simple and effective for making decisions. A decision tree contains interconnected internal and external nodes [21]. An internal node is the decision-making part that determines the choice, and it is the part from which the child nodes can be reached. A leaf node, in contrast, has no child nodes and is associated with a class label.

Gradient Boosting Classifier

Gradient boosting is a machine learning strategy for regression and classification problems. It provides a prediction model in the form of an ensemble of weak prediction models, usually decision trees. Like other boosting techniques, it constructs the model in a stagewise fashion and generalizes it by allowing an arbitrary differentiable loss function [22].

Broadly, "gradient boosting" follows gradient descent to minimize the objective. In each cycle, the base learners are fitted to the negative gradient of the loss, scaled by a step size, and the result is added to the value accumulated in the previous iterations:

F_m(x) = F_{m-1}(x) - \gamma_m \sum_{i=1}^{n} \nabla_{F_{m-1}} L\left(y_i, F_{m-1}(x_i)\right),

\gamma_m = \underset{\gamma}{\operatorname{arg\,min}} \sum_{i=1}^{n} L\left(y_i, F_{m-1}(x_i) - \gamma \nabla_{F_{m-1}} L\left(y_i, F_{m-1}(x_i)\right)\right),

where L(y, F(x)) is a differentiable loss function.
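The stagewise construction can be observed directly. The sketch below uses scikit-learn's GradientBoostingClassifier (an assumed implementation, not one the paper names) and its staged_predict method, which yields the predictions of the intermediate models F_1, ..., F_M one stage at a time:

```python
# Sketch: watch the additive model F_m improve as boosting stages are added.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("heart_failure_clinical_records_dataset.csv")
X, y = df.drop(columns=["DEATH_EVENT"]), df["DEATH_EVENT"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

gbc = GradientBoostingClassifier(n_estimators=50, learning_rate=0.1,
                                 random_state=0).fit(X_tr, y_tr)

# staged_predict yields the predictions of F_1, F_2, ..., F_M in turn.
for m, y_pred in enumerate(gbc.staged_predict(X_te), start=1):
    if m % 10 == 0:
        print(f"stage {m:2d}: test accuracy = {accuracy_score(y_te, y_pred):.3f}")
```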

K-Nearest Neighbor

K-NN is a supervised learning classification algorithm. K-NN computes class labels with which it can predict new data; it tests new inputs against the examples in its training data source, and if the new information is similar, the corresponding examples in the training set are similar [23]. The classification performance of K-NN can nevertheless be unsatisfactory. Formally, given observations (x, y), the goal is to learn a function h: x → y such that, for a given observation x, h(x) can determine y.
Support Vector Machine

SVM is a commonly deployed machine learning classification algorithm. SVM uses a maximum-margin method, which leads to a quadratic programming problem [24]. Because of its advantages in classification, many different applications use it.

Performance Metrics

To check the performance of the classifiers, different performance evaluation metrics are used in this study.

Correlation Matrix

The correlation matrix is a table showing correlation coefficients between variables. Each cell in the table shows the correlation between two variables. The correlation matrix is used to summarize the data, as an input to more advanced analyses, and as a diagnostic for those analyses [25].

The correlation matrix is "square", with the same variables appearing in its rows and columns. The line of 1.00 values running from the upper left corner to the lower right corner is the principal diagonal, which shows that each variable is always perfectly correlated with itself. The matrix is symmetric: the same correlations appear above the principal diagonal as in their mirror image below it.
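For example, with pandas (an assumed tooling choice; the paper does not say how its matrix was produced), the square, symmetric matrix described above is one call:

```python
# Sketch: correlation matrix of the heart failure attributes.
import pandas as pd

df = pd.read_csv("heart_failure_clinical_records_dataset.csv")
corr = df.corr()        # square, symmetric matrix of pairwise correlations
print(corr.round(2))    # principal diagonal entries are 1.00 (self-correlation)
```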

Correlation with Target Variable

Feature selection is one of the most important steps when performing any machine learning task. When we receive a data set, not every column will really affect the output variable, and we are very likely to include such unnecessary features in the model; this creates the need for feature selection.

Embedded techniques can be described as iterative: they handle each cycle of the model training process and carefully extract the features that contribute the most to the training of that particular iteration [26]. The regularization strategy is the most commonly used embedded technique; it penalizes features whose coefficients fall within a given limit.

Here, we include the use of lasso regularization for features. Whenever a feature is not important, the lasso penalty sets its coefficient to 0. Features with coefficient = 0 are eliminated and the remaining features are adopted.
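A minimal sketch of this lasso-based embedded selection, assuming scikit-learn; the regularization strength alpha is an arbitrary illustrative value, and the file and column names follow the earlier sketches:

```python
# Sketch: lasso regularization as an embedded feature selector.
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("heart_failure_clinical_records_dataset.csv")
features = df.drop(columns=["DEATH_EVENT"])
X = StandardScaler().fit_transform(features)   # lasso is scale-sensitive

lasso = Lasso(alpha=0.05).fit(X, df["DEATH_EVENT"])

# Features whose coefficient was penalized to exactly 0 are eliminated;
# the remaining features are adopted.
kept = [name for name, coef in zip(features.columns, lasso.coef_) if coef != 0]
print("Retained features:", kept)
```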
Validation Accuracy Metrics

To verify the accuracy of the classifiers, the validation measures are described as follows. We use the confusion matrix to place each observation in the test set into exactly one cell. Given that there are two outcome categories, it is a 2 × 2 grid; it records the classifier's two kinds of correct predictions and its two kinds of incorrect predictions. Table 2 shows the confusion matrix [27].

Table 2  Confusion matrix

                      Predicted death = yes   Predicted death = no
Actual death = yes    TP                      FN
Actual death = no     FP                      TN

From the confusion matrix, we draw the following conclusions:

TP  The predicted output is a true positive: the characteristics of a heart disease patient are accurately classified, and the patient does have heart disease.
TN  The predicted output is a true negative: the subject is healthy and is correctly classified as healthy.
FP  The predicted output is a false positive: a healthy subject is incorrectly classified as having heart disease (a type 1 error).
FN  The predicted output is a false negative: a subject with heart disease is incorrectly classified as healthy (a type 2 error).

Accuracy of the classifiers  Accuracy describes the overall performance of the classification system:

\text{Classification Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100.

Precision  Precision is the ratio of predicted positives that are truly positive:

\text{Precision} = \frac{TP}{TP + FP}.

Recall (sensitivity)  Recall is the ratio of actual positives (the real "yes" class) that are correctly predicted:

\text{Recall} = \frac{TP}{TP + FN}.

F1 score  The F1 score is the weighted harmonic mean of precision and recall; it therefore accounts for both false positives and false negatives:

F1 = \frac{2 \times (\text{Recall} \times \text{Precision})}{\text{Recall} + \text{Precision}}.

ROC and AUC  The receiver operating characteristic curve analyzes the predictive capability of the machine learning classifier used for classification. ROC analysis is a graphical representation that considers the true positive rate and the false positive rate of the classification results. AUC summarizes the ROC of a classifier: the larger the AUC value, the better the performance of the classifier [28].
features. ing. ROC inspection is a portrayal based on graphics, which

Experimental Methodology

The proposed framework was created with the aim of identifying patients who died of heart disease. In the proposed model, we evaluate various machine learning models on both the full feature set and the selected features of the heart disease data set. For feature selection, SFS is used to select the important features, and the classifiers are then also evaluated on these selected features. The well-known machine learning classifiers LDA, RF, DT, GBC, K-NN and SVM are used in the framework, together with model validation and performance evaluation metrics. Figure 2 shows the experimental framework for predicting death events due to heart disease.

Fig. 2  System framework for predicting death from heart disease

The strategy of the proposed framework is divided into five stages: data set preprocessing, feature selection, cross-validation methods, machine learning classifiers, and classifier performance evaluation techniques. One way the stages fit together is sketched below.
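The following sketch is one hypothetical way to compose these five stages with a scikit-learn Pipeline; the paper does not describe its code structure, so every name and parameter here is illustrative:

```python
# Sketch: preprocessing -> feature selection -> classifier, scored by fivefold CV.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("heart_failure_clinical_records_dataset.csv")
X, y = df.drop(columns=["DEATH_EVENT"]), df["DEATH_EVENT"]

pipe = Pipeline([
    ("scale", StandardScaler()),                        # stage 1: preprocessing
    ("sfs", SequentialFeatureSelector(                  # stage 2: feature selection
        RandomForestClassifier(random_state=0),
        n_features_to_select=5, direction="forward", cv=5)),
    ("clf", RandomForestClassifier(random_state=0)),    # stage 4: classifier
])

# Stages 3 and 5: cross-validation and performance evaluation.
scores = cross_val_score(pipe, X, y, cv=5, scoring="accuracy")
print("fold accuracies:", scores.round(4), "mean:", round(scores.mean(), 4))
```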
Experimental Setup

The following subsections briefly discuss the research materials and methods of the paper. All calculations are performed in Python 3.7 on an Intel(R) Core™ i3-1800 CPU @ 2.93 GHz PC.

Data Set

The "Heart Failure Clinical Records Data Set 2020" has been used by different researchers [29] and can be obtained from the online data mining archive of the UCI machine learning repository. This data set was used in the present study to design a machine-learning-based heart failure framework. The UCI heart failure data set has a sample size of 299 patients, has 13 features, and has no missing values. The most suitable independent input features and the target output marker are extracted and used to diagnose heart failure. The target class has two categories, classifying a patient as dead or alive from heart failure during the follow-up period. The extracted data set is therefore a 299 × 13 feature matrix. Table 3 gives the complete list and descriptions of the 13 features for the 299 cases in the data set.

Table 3  Attribute information

Data Preprocessing

Preprocessing the data is essential for describing the data effectively to the machine learning classifiers, which must be trained and tested in a feasible way. Preprocessing methods (for example, removal of missing values, the standard scaler and the MinMax scaler) have been applied to the data set before it is passed to the classifiers [30]. The standard scaler guarantees that each feature has mean 0 and variance 1, so that all features have coefficients of similar magnitude. Similarly, the MinMax scaler shifts the data so that all features lie in the range 0–1.
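A short sketch of the two scaling schemes, assuming scikit-learn's StandardScaler and MinMaxScaler; the file name matches the UCI distribution of the data set but should be treated as an assumption:

```python
# Sketch: the two scaling schemes described above.
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, StandardScaler

df = pd.read_csv("heart_failure_clinical_records_dataset.csv")
print("missing values:", int(df.isna().sum().sum()))   # the data set has none

X = df.drop(columns=["DEATH_EVENT"])
X_std = StandardScaler().fit_transform(X)   # each feature: mean 0, variance 1
X_mm = MinMaxScaler().fit_transform(X)      # each feature shifted into [0, 1]
print(X_std.mean(axis=0).round(2))          # ~0 for every feature
print(X_mm.min(), X_mm.max())               # 0.0 and 1.0
```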

Cross Validation

In k-fold cross validation, the data set is divided into k equal-sized parts, where k − 1 parts are used to train the classifier and the remaining part is used to check its performance at each step [31]. The validation cycle is repeated k times, and the classifier's performance is computed from the k results. Various values of k can be chosen for CV; in our analysis we use k = 5, because its performance is acceptable. In the fivefold CV measurement, 70% of the data is used for training and 30% is used for testing. For each loop over the folds, the split is redrawn, and all the cases in the training and test partitions are randomly divided over the entire data set before training and testing on the new split for the new loop. Finally, at the end of the fivefold measurement, the average of all performance measurements is computed.
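The procedure above maps onto a standard k-fold loop. A sketch assuming scikit-learn (stratification, shuffling and the seed are illustrative choices):

```python
# Sketch: fivefold cross validation with per-fold and averaged accuracy.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold

df = pd.read_csv("heart_failure_clinical_records_dataset.csv")
X, y = df.drop(columns=["DEATH_EVENT"]), df["DEATH_EVENT"]

accs = []
splitter = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in splitter.split(X, y):
    clf = RandomForestClassifier(random_state=0)
    clf.fit(X.iloc[train_idx], y.iloc[train_idx])      # k - 1 folds for training
    accs.append(accuracy_score(y.iloc[test_idx],       # held-out fold for testing
                               clf.predict(X.iloc[test_idx])))

print("per-fold accuracy:", [round(a, 3) for a in accs])
print("average accuracy :", round(sum(accs) / len(accs), 3))
```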
Results and Discussion

This part of the article discusses the classification models and results from several perspectives. First, we checked the performance of the various machine learning algorithms, namely linear discriminant analysis, random forest, decision tree, gradient boosting classifier, k-nearest neighbor and support vector machine, on the complete feature set of the heart failure clinical records data set. Second, we used the feature selection algorithm SFS to determine the important features. Third, performance is considered on the selected features. The k-fold cross-validation strategy is likewise used. To check the performance of the classifiers, the performance evaluation measures are applied. All features are standardized before being passed to the classifiers.

Result of Image Analysis

This experiment includes people who had a heart attack and either died or survived during follow-up [32]. Figure 3 shows the set of attributes that contain binary values, 1 or 0 (present or absent). This category includes the attributes anemia, diabetes, hypertension, sex, smoking, and death event.

Fig. 3  Attributes with Boolean values
• Anemia (hemoglobin): People without anemia are less likely to die than people with severe anemia.
• Diabetes (whether the patient has diabetes): According to Fig. 3, diabetes is not a major risk for people who have already had a cardiac attack.
• High blood pressure (hypertension): Patients with hypertension have a high risk of death.
• Sex (woman or man): Compared with female patients (0), male patients (1) have a higher risk of death.
• Smoking (whether the patient smokes): As shown in this experiment, smoking has a small effect on the number of deaths.
• Death event (whether the patient died during the follow-up period): During the follow-up period, there were fewer deaths than survivors.

Figure 4 shows the attributes with continuous values: age, creatinine phosphokinase, ejection fraction, platelets, serum creatinine, serum sodium, and time.

Fig. 4  Attributes with continuous values

• Age (years): In Fig. 4, most patients died around the age of 60 during the follow-up period, and the age range 60–80 is the most dangerous for heart disease patients.
• Creatinine phosphokinase (level of the CPK enzyme in the blood): The CPK enzyme level range 0–2000 is the most dangerous, because more people die in this range.
• Ejection fraction (percentage of blood leaving the heart at each contraction): An ejection fraction between 20 and 40% is very dangerous; most of the people who died fell in this range.
• Platelets (platelets in the blood): The people who died during the follow-up period had blood platelet counts of 200,000–400,000.
• Serum creatinine (level of serum creatinine in the blood): The dangerous level of serum creatinine is 0–2, at which most people died.
• Serum sodium (level of serum sodium in the blood): The fatal level is 130–140.
• Time (follow-up period in days): For a patient who is already at risk (heart attack), the follow-up period (0–100 days) matters most for whether he or she will survive.

The correlation matrix in Fig. 5 represents the relationships between attributes. The closer a positive value is to 1, the more strongly correlated the features are, while a negative value represents a negative correlation between features: if one feature increases, the other decreases, and vice versa [33]. If the value is 0, there is no association between the attributes. In the figure, age, serum creatinine, gender, and smoking are highly correlated with death events, while ejection fraction, serum sodium and time are negatively correlated with the target variable.

Fig. 5  Correlation matrix

The correlation with the target variable is another measure of feature importance. In Fig. 6, the two characteristics gender and smoking have a low correlation with the target, while the other variables have a strong correlation with the target variable.

Fig. 6  Correlation with target variable
Result of Classifiers (Fivefold Cross Validation) with All Features (n = 13) and with Selected Features (SFS)

In this experiment, all features of the data set are fed to the six machine learning classifiers using fivefold cross-validation, with 70% used to train each classifier and only 30% used for testing. Finally, the averaged measurement of the fivefold technique is obtained. In addition, various parameter estimates were passed to the classifiers. Table 4 lists the fivefold cross-validation results of the six classifiers with all features and the results with sequential feature selection.

In Table 4, the random forest classifier with feature selection shows the best performance, with an accuracy of 86.67%. Next come the random forest and gradient boosting classifiers, whose accuracies on all features are equal at 85.56%. Distinguishing the classifiers with feature selection from those without, the classifiers RandomForestClassifier_FS, LinearDiscriminantAnalysis_sfs, KNeighborsClassifier_sfs, GradientBoostingClassifier_sfs, RandomForestClassifier_sfs and DecisionTreeClassifier_sfs have descending accuracies of 86.67%, 82.22%, 80.00%, 77.78%, 75.56% and 74.44%, respectively. On the other hand, the classifiers without feature selection, RandomForestClassifier, GradientBoostingClassifier, SVM_rbf, LinearDiscriminantAnalysis, SVC_linear, DecisionTreeClassifier, SVC_poly and KNeighborsClassifier, have descending accuracies of 85.56%, 85.56%, 84.44%, 82.22%, 82.22%, 78.89%, 77.78% and 76.67%, respectively. Apart from the random forest classifier with feature selection, only random forest and gradient boosting on all features perform comparably well, though slightly lower. Figure 7 shows the performance of the different classifiers on the training and test data sets in the same order.
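The comparison reported here can be reproduced in outline with the sketch below, assuming scikit-learn implementations of the six classifiers; hyperparameters are library defaults rather than the paper's tuned values, so the printed numbers will not match Table 4 exactly:

```python
# Sketch: fivefold cross-validated accuracy of the six classifiers, all 13 features.
import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("heart_failure_clinical_records_dataset.csv")
X, y = df.drop(columns=["DEATH_EVENT"]), df["DEATH_EVENT"]

models = {
    "LDA": LinearDiscriminantAnalysis(),
    "RF": RandomForestClassifier(random_state=0),
    "GBC": GradientBoostingClassifier(random_state=0),
    "DT": DecisionTreeClassifier(random_state=0),
    "SVM_rbf": SVC(kernel="rbf"),
    "KNN": KNeighborsClassifier(),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    print(f"{name:8s} mean fivefold accuracy: {acc:.4f}")
```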
Results of Validation Metrics

To compile and verify the results from the six classifiers, we have Table 5, which lists the different classifiers together with the important features selected for each. Observing the table, all six classifiers share two prominent selected features: ejection_fraction and serum_creatinine.

Precision, recall, and f1-score have their usual meanings, and these values validate the results obtained by each classifier. The confusion matrix gives the TP, FN, TN and FP predictions of the different classifiers for patient deaths during the follow-up period.

The ROC curve plots the true positive rate against the false positive rate, and the area under it is reported here under fivefold cross validation. Different folds give different results, so to remove this ambiguity the average is also calculated. The highest average ROC_AUC, 74%, is obtained by the GBC classifier.
Fig. 7  Performance of the classifiers with and without SFS

Table 4  Classifiers accuracy with and without SFS

Classifier name                   Test accuracy (%)   Train accuracy (%)
RandomForestClassifier_FS         86.67               100.00
RandomForestClassifier            85.56               100.00
GradientBoostingClassifier        85.56               100.00
SVM_rbf                           84.44                90.43
LinearDiscriminantAnalysis        82.22                86.60
LinearDiscriminantAnalysis_sfs    82.22                85.17
SVC_linear                        82.22                85.65
KNeighborsClassifier_sfs          80.00                80.38
DecisionTreeClassifier            78.89               100.00
GradientBoostingClassifier_sfs    77.78                89.95
SVC_poly                          77.78                88.52
KNeighborsClassifier              76.67                77.51
RandomForestClassifier_sfs        75.56                99.52
DecisionTreeClassifier_sfs        74.44                94.74
Table 5  Classifiers performance validation with SFS (ROC curve column not reproduced)

LDA. Selected features: age, creatinine_phosphokinase, ejection_fraction, serum_creatinine, time.

               precision   recall   f1-score   support
  0            0.83        0.94     0.88       62
  1            0.80        0.57     0.67       28
  micro avg    0.82        0.82     0.82       90
  macro avg    0.81        0.75     0.77       90
  weighted avg 0.82        0.82     0.81       90

RF. Selected features: age, diabetes, ejection_fraction, serum_creatinine.

               precision   recall   f1-score   support
  0            0.79        0.87     0.83       62
  1            0.64        0.50     0.56       28
  micro avg    0.76        0.76     0.76       90
  macro avg    0.72        0.69     0.70       90
  weighted avg 0.75        0.76     0.75       90

DT. Selected features: diabetes, ejection_fraction, serum_creatinine, smoking.

               precision   recall   f1-score   support
  0            0.81        0.82     0.82       62
  1            0.59        0.57     0.58       28
  micro avg    0.74        0.74     0.74       90
  macro avg    0.70        0.70     0.70       90
  weighted avg 0.74        0.74     0.74       90

GBC. Selected features: ejection_fraction, serum_creatinine, smoking.

               precision   recall   f1-score   support
  0            0.80        0.90     0.85       62
  1            0.70        0.50     0.58       28
  micro avg    0.78        0.78     0.78       90
  macro avg    0.75        0.70     0.72       90
  weighted avg 0.77        0.78     0.77       90

KNN. Selected features: anaemia, ejection_fraction, platelets, serum_creatinine, time.

               precision   recall   f1-score   support
  0            0.79        0.97     0.87       62
  1            0.86        0.43     0.57       28
  micro avg    0.80        0.80     0.80       90
  macro avg    0.82        0.70     0.72       90
  weighted avg 0.81        0.80     0.78       90

SVM. Selected features: serum_creatinine, ejection_fraction, smoking.

               precision   recall   f1-score   support
  0            0.89        0.75     0.81       44
  1            0.80        0.91     0.85       47
  avg/total    0.84        0.84     0.83       91
Conclusion

In this study, a prediction system based on hybrid intelligent machine learning was proposed to diagnose deaths during follow-up. The system was tested on the heart failure clinical records data set. Six well-known classifiers (LDA, RF, GBC, DT, SVM and KNN) are used together with the feature selection algorithm SFS to select important features, and the system uses the k-fold cross-validation method for verification. Different evaluation indicators are used to check the performance of the classifiers. The feature selection algorithm selects important features that improve the classification accuracy, precision, recall, f1-score and ROC_AUC performance of the classifiers and reduce the computation time of the algorithms. When RandomForestClassifier_FS is run with fivefold cross validation on features selected by the FS algorithm SFS, its best accuracy is 86.67%; owing to this good performance, it is the better prediction system in terms of accuracy. The random forest and gradient boosting classifiers followed closely, both performing well at 85.56%. In Table 5, the GBC results give the highest average ROC_AUC, at 74%. As Table 4 shows, feature selection algorithms should be used before classification to improve the classification accuracy of the classifier. Using feature selection (SFS), we obtained two important features (ejection_fraction and serum_creatinine) from which death events can be predicted. The FS algorithm can therefore reduce the computation time and improve the classification accuracy of the classifier, and it selects the important features that distinguish death events from survivors.

The novelty of this exploratory work is the construction of a discovery framework that can foresee death events in advance. The framework uses SFS calculations, six classifiers, a cross-validation strategy and performance evaluation metrics. By using a machine learning strategy to design the decision support network, the analysis of heart disease becomes more reliable. In addition, unrelated features reduce the performance of the model and extend the computation time; therefore, another creative element of this research is the use of feature selection calculations to pick the best features. These best features can reduce the execution time of the classification model and improve the classification accuracy. In future work, we will conduct more experiments using other feature selection and simplification procedures to build on these results of the prerequisite classifiers for determining heart disease.

Compliance with Ethical Standards

Conflict of Interest  The authors declare that they have no conflict of interest.
sis. Int J Refrig. 2018;86:401–9.
18. Ruan F, Qi J, Yan C, Tang H, Zhang T, Li H. Quantitative detec-
tion of harmful elements in alloy steel by LIBS technique and
References sequential backward selection-random forest (SBS-RF). J Anal
Atom Spectrom. 2017;32(11):2194–9.
1. Towbin JA, Bowles NE. The failing heart. Nature. 19. Chaurasia V, Pal S. Skin diseases prediction: binary classification
2002;415(6868):227–33. machine learning and multi model ensemble techniques. Res J
2. Ponikowski P, Voors AA, Anker SD, Bueno H, Cleland JG, Coats Pharmacy Technol. 2019;12(8):3829–32.
AJ, et al. 2016 ESC Guidelines for the diagnosis and treatment of 20. Chaurasia V, Pal S. Applications of machine learning techniques to
acute and chronic heart failure: the Task Force for the diagnosis predict diagnostic breast cancer. SN Comput Sci. 2020;1(5):1–11.
and treatment of acute and chronic heart failure of the European 21. Safavian SR, Landgrebe D. A survey of decision tree clas-
Society of Cardiology (ESC) Developed with the special contribu- sifier methodology. IEEE Trans Syst Man Cybernet.
tion of the Heart Failure Association (HFA) of the ESC. Eur Heart 1991;21(3):660–74.
J. 2016;37(27):2129–200. 22. Son J, Jung I, Park K, Han B Tracking-by-segmentation with
3. Aljaaf AJ, Al-Jumeily D, Hussain AJ, Dawson T, Fergus P, Al- online gradient boosting decision tree. In: Proceedings of the
Jumaily M. Predicting the likelihood of heart failure with a multi IEEE International Conference on Computer Vision; 2015. p.
level risk assessment using decision tree. In: 2015 Third Interna- 3056–3064.
tional Conference on Technological Advances in Electrical, Elec- 23. Keller JM, Gray MR, Givens JA. A fuzzy k-nearest neighbor algo-
tronics and Computer Engineering (TAEECE); 2015. p. 101–106. rithm. IEEE Trans Syst Man Cybernet. 1985;4:580–5.
IEEE. 24. Vishwanathan SVM, Murty MN. SSVM: a simple SVM algo-
4. Cowie MR. The heart failure epidemic: a UK perspective. Echo rithm. In: Proceedings of the 2002 International Joint Conference
Res Pract. 2017;4(1):R15–20. on Neural Networks. IJCNN’02 (Cat. No. 02CH37290) Vol. 3;
5. Dokken BB. The pathophysiology of cardiovascular disease and 2002. p. 2393–2398. IEEE.
diabetes: beyond blood pressure and lipids. Diabetes Spectrum. 25. Higham NJ. Computing the nearest correlation matrix—a problem
2008;21(3):160–5. from finance. IMA J Numerical Anal. 2002;22(3):329–43.
6. Liu M, Jiang Y, Wedow R, Li Y, Brazel DM, Chen F, et al. Asso- 26. McColl KA, Vogelzang J, Konings AG, Entekhabi D, Piles M,
ciation studies of up to 1.2 million individuals yield new insights Stoffelen A. Extended triple collocation: estimating errors and
into the genetic etiology of tobacco and alcohol use. Nat Genetics. correlation coefficients with respect to an unknown target. Geo-
2019;51(2):237–44. phys Res Lett. 2014;41(17):6229–36.
7. Kadkhoda M, Jahani H. Problem-solving capacities of spiritual 27. Oberkampf WL, Barone MF. Measures of agreement between
intelligence for artificial intelligence. Procedia-Soc Behav Sci. computation and experiment: validation metrics. J Comput Phys.
2012;32:170–5. 2006;217(1):5–36.
8. Shilaskar S, Ghatol A. Feature selection for medical diagno- 28. Pritom AI, Munshi MAR, Sabab SA, Shihab S. Predicting breast
sis: evaluation for cardiovascular diseases. Expert Syst Appl. cancer recurrence using effective classification and feature selec-
2013;40(10):4146–53. tion technique. In: 2016 19th International Conference on Com-
9. Kumar D, Carvalho P, Antunes M, Paiva RP, Henriques J. Heart puter and Information Technology (ICCIT); 2016. p. 310–314.
murmur classification with feature selection. In: 2010 Annual IEEE
International Conference of the IEEE Engineering in Medicine 29. Chapman B, DeVore AD, Mentz RJ, Metra M. Clinical profiles in
and Biology; 2010. p. 4566–4569. IEEE. acute heart failure: an urgent need for a new approach. ESC Heart
10. Mokeddem S, Atmani B, Mokaddem M. Supervised feature selec- Failure. 2019;6(3):464–74.
tion for diagnosis of coronary artery disease based on genetic 30. Shaheen H, Agarwal S, Ranjan P. MinMaxScaler binary PSO for
algorithm. arXiv preprint. 2013; arXiv​:1305.6046 feature selection. In: First International Conference on Sustain-
11. Haq AU, Li J, Memon MH, Memon MH, Khan J, Marium SM. able Technologies for Computational Intelligence. Singapore:
Heart disease prediction system using model of machine learning Springer; 2020. p. 705–16.

SN Computer Science
344 Page 16 of 16 SN Computer Science (2020) 1:344

31. Chaurasia V, Pal S, Tiwari BB. Chronic Kidney disease: a 33. Rebonato R, Jäckel P. The most general methodology to create a
predictive model using decision tree. Int J Eng Res Technol. valid correlation matrix for risk management and option pricing
2018;11(11):1781–94. purposes. Available at SSRN 1969689. 2011.
32. Tantimongcolwat T, Naenna T, Isarankura-Na-Ayudhya C, Embre-
chts MJ, Prachayasittikul V. Identification of ischemic heart dis- Publisher’s Note Springer Nature remains neutral with regard to
ease via machine learning analysis on magnetocardiograms. Com- jurisdictional claims in published maps and institutional affiliations.
put Biol Med. 2008;38(7):817–25.

SN Computer Science
