
SN Computer Science (2020) 1:344
https://doi.org/10.1007/s42979-020-00370-1

ORIGINAL RESEARCH

Sequential Feature Selection and Machine Learning Algorithm-Based Patient's Death Events Prediction and Diagnosis in Heart Disease

Ritu Aggrawal1 · Saurabh Pal1

Received: 3 October 2020 / Accepted: 6 October 2020
© Springer Nature Singapore Pte Ltd 2020

This article is part of the topical collection "Advances in Computational Approaches for Artificial Intelligence, Image Processing, IoT and Cloud Applications" guest edited by Bhanu Prakash K N and M. Shivakumar.

* Corresponding author: Saurabh Pal, [email protected]
Ritu Aggrawal, [email protected]
1 VBS Purvanchal University, Jaunpur, India

Abstract
Because data with many features are now widely accessible, many feature selection techniques are available in the literature. Such features give the data extremely high dimensionality. Feature selection provides a way to reduce computation time, improve prediction performance, and gain a better understanding of the data in machine learning and pattern recognition applications. As the related works reviewed here point out, existing studies generally focus only on maximizing classification accuracy, whereas for real-world applications the selected feature subset should instead be stable. This research proposes a sequential feature selection algorithm for detecting death events in heart disease patients during treatment, used to select the most important features. Several machine learning algorithms (LDA, RF, GBC, DT, SVM, and KNN) are employed, and the accuracy obtained with this method (SFS) is compared with the accuracy of each plain classifier. The confusion matrix, ROC curve, precision, recall, and f1-score are also calculated to verify the results obtained by the SFS algorithm. The experimental results show that with the SFS method, RandomForestClassifier_FS reaches 86.67% accuracy.

Keywords Heart disease · SFS · RF · SVM · KNN · Confusion matrix · ROC · Precision · Recall

Introduction

Heart failure is a complex clinical syndrome, not a single disease [1]. It is difficult to identify coronary heart disease from some common risk factors, such as diabetes, high blood pressure, elevated cholesterol, abnormal heart rhythm and breathing difficulty, or from signs of underlying disease such as raised jugular venous pressure, pulmonary crackles and peripheral edema [2]. The characteristics of coronary artery disease are complex, so the disease must be managed with caution from the outset; not doing so may damage the heart or cause premature death. Heart failure is a serious condition associated with high morbidity and mortality [3]. According to statistics from the Society of Cardiology, 26 million adults worldwide suffer from heart failure, and about 3.6 million new cases are diagnosed each year. Of patients with heart failure, 17–45% die within the first year, and the rest die within 5 years. The cost of managing diagnosed heart failure accounts for approximately 1–2% of all healthcare expenditure, most of which is related to recurrent hospital admissions [4]. The prognosis of coronary artery disease depends on clinical measurements, especially heart rate, gender, age and many other factors.

Although great progress has been made in understanding the complex pathophysiology of heart failure, the volume and complexity of the information and data to be analyzed and monitored turn accurate and effective diagnosis of heart failure, and the evaluation of treatment options, into a very demanding task [5]. These factors, plus the benefits of early detection of heart failure, explain the massive use of machine learning programs to investigate, predict and classify clinical information. Machine learning strategies are a kind of data mining program, and these programs have stimulated great enthusiasm for exploring clinical data. Precisely ordering disease stages, etiologies or subtypes allows medication and intervention to be delivered in an effective
and focused manner, and makes it possible to evaluate the patient's disease progression [6]. Even when heart failure is only diagnosed at a later stage, data mining strategies may still be beneficial, because in that case the benefits of intervention and the chances of survival are limited, and these strategies can predict mortality, morbidity and the risk of readmission. They use the information recorded in the subject's health record: demographic data, clinical history data, presenting symptoms, physical examination results, laboratory information and ECG examination results. This article presents a comprehensive application of machine learning strategies to solve the aforementioned problems [7], as shown in Fig. 1.

In the process of recognizing heart failure, choosing the ideal subset of features is a crucial task. The advantages of discarding trivial features include the removal of correlated features, which reduces the complexity of the calculations, as well as improving the treatment cycle that includes detecting heart failure [8]. In any case, the ideal feature subset to be used in a disease prediction and analysis framework is still an open question in the literature. Existing related works in the literature generally revolve around selecting a subset of features to maximize the accuracy of a single/large-scale classifier.

Fig. 1  Heart failure management structure

Literature Review

In the research literature, most studies employ feature selection strategies together with other machine learning algorithms as a means of classifying subjects as healthy or as patients with heart failure. These techniques are described in Table 1. The fundamental difference between these strategies lies in the characteristics they use to identify heart failure.

Tools and Techniques

The following section examines each model, instrument, method and algorithm used in this study, all of which are of great significance to the method proposed in this article.
Table 1  Literature survey

Kumar et al. [9]
• Disease/prediction: heart murmur classification.
• Method: nonlinear classifier (SVM), sequential floating forward selection (SFFS).
• Features: out of 17 features, 10 were selected. Feature Set 1: loudness 1, zcr 1, transition ratio 1, spectral power 2, fundamental frequency 1, spectral shape 1, spectral power 3, flux 1, stat skewness 1, Lyapunov exponent 1. Feature Set 2: PoS 1–32, no selection, kept as in the original work. Feature Set 3: WT detail 7, VFD 8, Shannon energy 5, Shannon energy 6, GMM cycle 5, Shannon energy 4, GMM murmur 5, Eigenfrequency1 2, WT entropy 10, GMM cycle 4, Eigenfrequency1 1, Shannon energy 8, VFD 2, HOS 1. Feature Set 4: loudness 1, transition ratio 1, spectral power 2, stat skewness 1, Lyapunov exponent 1, PoS 13, PoS 16, PoS 17, PoS 18, PoS 22, Shannon energy 8, WT details 5, WT entropy 6, ST map 3, Eigenfrequencies1 4, Eigenfrequencies2 10, Eigentimes1 5, HOS 6, HOS 13, GMMx 7, GMMx murmur 4, VFD 3.
• Evaluation measures: sensitivity (Se, %): Set 1: 95.74, Set 2: 93.02, Set 3: 90.51, Set 4: 96.15; specificity (Sp, %): Set 1: 95.01, Set 2: 96.79, Set 3: 91.26, Set 4: 96.16.

Khemphila et al. [12]
• Disease/prediction: heart disease classification.
• Method: multi-layer perceptron (MLP) with backpropagation learning algorithm, feature selection algorithm, artificial neural networks (ANN).
• Features: out of 13 features, 8 were selected by the feature selection algorithm: Thal, Chest Pain Type, Number Colored Vessels, Old Peak, Maximum Heart Rate, Induced Angina, Slope, Age.
• Evaluation measures: with 13 features, training accuracy 88.46%, validation accuracy 80.17%; with 8 features, training accuracy 89.56%, validation accuracy 80.99%.

Mokeddem et al. [10]
• Disease/prediction: coronary artery disease.
• Method: genetic algorithm (GA) wrapped Naïve Bayes (BN), best first search (BFS), sequential floating forward search (SFFS).
• Features selected by the FS approaches: GA-wrapped BN: cp, sex, restecg, oldpeak, slope, ca, thal; GA-wrapped SVM: cp, age, exang, oldpeak, slope, ca, thal; GA-wrapped MLP: cp, age, sex, restbps, slope, ca, thal; GA-wrapped C4.5: cp, fbs, ca, thal; BFS-wrapped BN: chol, fbs, thalach, exang, ca, thal; SFFS-wrapped BN: cp, restecg, thalach, oldpeak, ca, thal.
• Evaluation measures: classification accuracy SVM 83.5%, MLP 83.16%, C4.5 80.85%; wrapper-based feature selection accuracy: GA wrapper 85.50%.

Usman et al. [13]
• Disease/prediction: heart disease prediction.
• Method: cuckoo search algorithm (CSA) and cuckoo optimization algorithm (COA) with SVM, MLP, NB and RFC.
• Features selected by COA and CSA on five heart disease data sets (data set: total features / COA / CSA): Eric: 7 / 4 / 4; Echocardiogram: 12 / 5 / 4; Hungarian: 13 / 6 / 4; Statlog: 13 / 6 / 4; Z-Alizadeh Sani: 55 / 14 / 7.
• Evaluation measures: on all data sets, CSA performed better than COA after feature selection; SVM had the highest accuracy before and after feature selection among all five data sets.

Haq et al. [11]
• Disease/prediction: heart disease prediction.
• Method: sequential backward selection (SBS), k-nearest neighbor (k-NN).
• Features: 13 features, eliminated by the SBS approach one feature at a time.
• Evaluation measures: with k = 8 for the k-NN kernel, the highest average accuracy is 90%, reached after eliminating 6 features.

Javeed et al. [14]
• Disease/prediction: heart failure risk prediction.
• Method: floating window with adaptive size for feature elimination with an artificial neural network (FWAFE-ANN) and with a deep neural network (FWAFE-DNN).
• Features: Experiment 1: the FWAFE method selected feature subsets of size n = 6, n = 7 and n = 11; the optimal accuracy was found at n = 11 by applying ANN. Experiment 2: the optimal accuracy was found at n = 11 by applying DNN.
• Evaluation measures: Experiment 1: FWAFE-ANN accuracy 91.11%; Experiment 2: FWAFE-DNN accuracy 93.23%.

Yadav et al. [15]
• Disease/prediction: heart disease.
• Method: Pearson correlation, recursive feature elimination and lasso regularization with M5P, random tree, reduced error pruning and the random forest ensemble method.
• Features selected by the three methods: Pearson correlation: cp, exang, oldpeak and target; recursive feature elimination: 12 features selected; lasso regularization: 10 features selected.
• Evaluation measures: accuracy: Pearson correlation 99.9%; recursive feature elimination 94.12%; lasso regularization 99.9%.
Feature Selection Algorithms

Feature selection is a broad topic in machine learning. In general, feature selection techniques can be divided into two categories: wrapper-based methods and filter-based methods. Although wrapper-based methods carry a higher computational overhead than filter-based methods, they can generally provide better solutions. Wrapper-based methods include sequential forward feature selection (SFS), sequential backward feature selection (SBS), and others. For feature selection in our framework, we use the sequential FS algorithm, which selects the important features [16].

Because of their iterative nature, these algorithms are called sequential algorithms. The sequential feature selection (SFS) algorithm starts with an empty set and, in the first step, adds the single feature that gives the highest value of the objective function. From the second step onward, the remaining features are added one at a time to the current subset and the new subset is evaluated [17]. The loop is repeated until the required number of features has been included. This is a naive SFS procedure because it does not account for dependencies between features. The sequential backward selection (SBS) algorithm is set up like SFS, but the calculation starts from the full set of variables and eliminates one feature at a time, removing the feature whose withdrawal has the least impact on the predictor's performance [18].
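As a concrete illustration of the forward loop just described, the sketch below uses scikit-learn's SequentialFeatureSelector on the heart failure data. The paper does not name the library or parameter values it used, so the API choice, the file name, and n_features_to_select=5 are assumptions.

```python
# Sketch: sequential forward selection (SFS) wrapped around a classifier.
# Assumes scikit-learn >= 0.24 and the UCI heart failure CSV.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector

df = pd.read_csv("heart_failure_clinical_records_dataset.csv")
X = df.drop(columns=["DEATH_EVENT"])
y = df["DEATH_EVENT"]

# Forward SFS: start from an empty set and greedily add the feature that
# most improves fivefold cross-validated accuracy.
sfs = SequentialFeatureSelector(
    RandomForestClassifier(random_state=0),
    n_features_to_select=5,   # stop once 5 features are included
    direction="forward",      # direction="backward" gives SBS instead
    scoring="accuracy",
    cv=5,
)
sfs.fit(X, y)
print("Selected features:", list(X.columns[sfs.get_support()]))
```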
Machine Learning Classifiers

To classify heart disease patients and healthy individuals, machine learning classification algorithms are used. This article briefly discusses some well-known classification algorithms and their assumptions.

Linear Discriminant Analysis (LDA)

LDA is used when the variance-covariance matrix of all populations is homogeneous. In LDA, the decision rule depends on the linear score function, which is a function of the population means \mu_i of each of our g groups and the pooled variance-covariance matrix [19]. The linear score function is

s_i^L(X) = -\frac{1}{2}\mu_i' \Sigma^{-1} \mu_i + \mu_i' \Sigma^{-1} X + \log P(\Pi_i) = d_{i0} + \sum_{j=1}^{p} d_{ij} x_j + \log P(\Pi_i) = d_i^L(X) + \log P(\Pi_i),

where d_{i0} = -\frac{1}{2}\mu_i' \Sigma^{-1} \mu_i, d_{ij} is the jth element of \Sigma^{-1}\mu_i, and d_i^L(X) is a linear discriminant function. The linear score function depends on the unknown parameters \mu_i and \Sigma, so we must estimate their values from the training data.

Random Forest (RF)

Random forest builds many individual decision trees during training. The final prediction summarizes the predictions of all trees, by majority vote for classification or by averaging for regression. Because several results are combined to reach the final conclusion, random forest is called an "ensemble method". To select important features, importance is normalized over all trees at the random forest level [20]. The overall importance of a feature is determined in each tree and divided by the total number of trees:

RFfi_i = \frac{\sum_{j \in \text{all trees}} normfi_{ij}}{T},

where RFfi_i is the importance of feature i calculated from all trees in the random forest model, normfi_{ij} is the normalized importance of feature i in tree j, and T is the number of trees.
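Read against this formula, scikit-learn's feature_importances_ attribute on a fitted forest already returns each tree's normalized impurity-based importances averaged over all T trees, so RFfi_i can be read off directly. A minimal sketch (the library choice and file name are assumptions, as before):

```python
# Sketch: per-feature random forest importance, RFfi_i = (sum_j normfi_ij) / T.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

df = pd.read_csv("heart_failure_clinical_records_dataset.csv")
X, y = df.drop(columns=["DEATH_EVENT"]), df["DEATH_EVENT"]

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# feature_importances_ averages the per-tree normalized importances,
# so the values below correspond to RFfi_i for each feature i.
for name, imp in sorted(zip(X.columns, rf.feature_importances_),
                        key=lambda pair: pair[1], reverse=True):
    print(f"{name:25s} {imp:.3f}")
```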
Decision Tree Classifier

Decision trees are standard machine learning algorithms. A decision tree is simply a tree in which each node is either a leaf node or a decision node. The decision tree technique is simple and effective for making decisions. A decision tree contains interconnected internal and external nodes [21]. An internal node is the decision-making part that determines the choice, and it is the part from which the child nodes can be reached. A leaf node, in contrast, has no child nodes and is associated with a class label.

Gradient Boosting Classifier

Gradient boosting is a machine learning strategy for regression and classification problems. It provides a prediction model in the form of an ensemble of weak prediction models, usually decision trees. Like other boosting techniques, it constructs the model in a stagewise fashion and generalizes it by allowing an arbitrary differentiable loss function [22].

Broadly, "gradient boosting" follows gradient descent to minimize the objective. In each cycle, the base learners are fitted to the negative gradient of the loss, scaled by a step size, and the result is added to the value accumulated in the previous iterations:

F_m(x) = F_{m-1}(x) - \gamma_m \sum_{i=1}^{n} \nabla_{F_{m-1}} L\left(y_i, F_{m-1}(x_i)\right),

\gamma_m = \underset{\gamma}{\operatorname{arg\,min}} \sum_{i=1}^{n} L\left(y_i, F_{m-1}(x_i) - \gamma \nabla_{F_{m-1}} L\left(y_i, F_{m-1}(x_i)\right)\right),

where L(y, F(x)) is a differentiable loss function.
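The stagewise construction can be observed directly. The sketch below uses scikit-learn's GradientBoostingClassifier (an assumed implementation, not one the paper names) and its staged_predict method, which yields the predictions of the intermediate models F_1, ..., F_M one stage at a time:

```python
# Sketch: watch the additive model F_m improve as boosting stages are added.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("heart_failure_clinical_records_dataset.csv")
X, y = df.drop(columns=["DEATH_EVENT"]), df["DEATH_EVENT"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

gbc = GradientBoostingClassifier(n_estimators=50, learning_rate=0.1,
                                 random_state=0).fit(X_tr, y_tr)

# staged_predict yields the predictions of F_1, F_2, ..., F_M in turn.
for m, y_pred in enumerate(gbc.staged_predict(X_te), start=1):
    if m % 10 == 0:
        print(f"stage {m:2d}: test accuracy = {accuracy_score(y_te, y_pred):.3f}")
```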

K-Nearest Neighbor

K-NN is a supervised learning classification algorithm. K-NN computes class labels with which it can predict new data; it tests new inputs against the examples in its training data source, and if the new information is similar, the corresponding examples in the training set are similar [23]. The classification performance of K-NN can nevertheless be unsatisfactory. Formally, given observations (x, y), the goal is to learn a function h: x → y such that, for a given observation x, h(x) can determine y.
Support Vector Machine

SVM is a commonly deployed machine learning classification algorithm. SVM uses a maximum-margin method, which leads to a quadratic programming problem [24]. Because of its advantages in classification, many different applications use it.

Performance Metrics

To check the performance of the classifiers, different performance evaluation metrics are used in this study.

Correlation Matrix

The correlation matrix is a table showing correlation coefficients between variables. Each cell in the table shows the correlation between two variables. The correlation matrix is used to summarize the data, as an input to more advanced analyses, and as a diagnostic for those analyses [25].

The correlation matrix is "square", with the same variables appearing in its rows and columns. The line of 1.00 values running from the upper left corner to the lower right corner is the principal diagonal, which shows that each variable is always perfectly correlated with itself. The matrix is symmetric: the same correlations appear above the principal diagonal as in their mirror image below it.
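For example, with pandas (an assumed tooling choice; the paper does not say how its matrix was produced), the square, symmetric matrix described above is one call:

```python
# Sketch: correlation matrix of the heart failure attributes.
import pandas as pd

df = pd.read_csv("heart_failure_clinical_records_dataset.csv")
corr = df.corr()        # square, symmetric matrix of pairwise correlations
print(corr.round(2))    # principal diagonal entries are 1.00 (self-correlation)
```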

Correlation with Target Variable

Feature selection is one of the most important steps when performing any machine learning task. When we receive a data set, not every column will really affect the output variable, and we are very likely to include such unnecessary features in the model; this creates the need for feature selection.

Embedded techniques can be described as iterative: they handle each cycle of the model training process and carefully extract the features that contribute the most to the training of that particular iteration [26]. The regularization strategy is the most commonly used embedded technique; it penalizes features whose coefficients fall within a given limit.

Here, we include the use of lasso regularization for features. Whenever a feature is not important, the lasso penalty sets its coefficient to 0. Features with coefficient = 0 are eliminated and the remaining features are adopted.
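A minimal sketch of this lasso-based embedded selection, assuming scikit-learn; the regularization strength alpha is an arbitrary illustrative value, and the file and column names follow the earlier sketches:

```python
# Sketch: lasso regularization as an embedded feature selector.
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("heart_failure_clinical_records_dataset.csv")
features = df.drop(columns=["DEATH_EVENT"])
X = StandardScaler().fit_transform(features)   # lasso is scale-sensitive

lasso = Lasso(alpha=0.05).fit(X, df["DEATH_EVENT"])

# Features whose coefficient was penalized to exactly 0 are eliminated;
# the remaining features are adopted.
kept = [name for name, coef in zip(features.columns, lasso.coef_) if coef != 0]
print("Retained features:", kept)
```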
Validation Accuracy Metrics

To verify the accuracy of the classifiers, the validation measures are described as follows. We use the confusion matrix to place each observation in the test set into exactly one cell. Given that there are two outcome categories, it is a 2 × 2 grid; it records the classifier's two kinds of correct predictions and its two kinds of incorrect predictions. Table 2 shows the confusion matrix [27].

Table 2  Confusion matrix

                      Predicted death = yes   Predicted death = no
Actual death = yes    TP                      FN
Actual death = no     FP                      TN

From the confusion matrix, we draw the following conclusions:

TP  The predicted output is a true positive: the characteristics of a heart disease patient are accurately classified, and the patient does have heart disease.
TN  The predicted output is a true negative: the subject is healthy and is correctly classified as healthy.
FP  The predicted output is a false positive: a healthy subject is incorrectly classified as having heart disease (a type 1 error).
FN  The predicted output is a false negative: a subject with heart disease is incorrectly classified as healthy (a type 2 error).

Accuracy of the classifiers  Accuracy describes the overall performance of the classification system:

\text{Classification Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100.

Precision  Precision is the ratio of predicted positives that are truly positive:

\text{Precision} = \frac{TP}{TP + FP}.

Recall (sensitivity)  Recall is the ratio of actual positives (the real "yes" class) that are correctly predicted:

\text{Recall} = \frac{TP}{TP + FN}.

F1 score  The F1 score is the weighted harmonic mean of precision and recall; it therefore accounts for both false positives and false negatives:

F1 = \frac{2 \times (\text{Recall} \times \text{Precision})}{\text{Recall} + \text{Precision}}.

ROC and AUC  The receiver operating characteristic curve analyzes the predictive capability of the machine learning classifier used for classification. ROC analysis is a graphical representation that considers the true positive rate and the false positive rate of the classification results. AUC summarizes the ROC of a classifier: the larger the AUC value, the better the performance of the classifier [28].
features. ing. ROC inspection is a portrayal based on graphics, which

Experimental Methodology

The proposed framework was created with the aim of identifying patients who died of heart disease. In the proposed model, we evaluate various machine learning models on both the full feature set and the selected features of the heart disease data set. For feature selection, SFS is used to select the important features, and the classifiers are then also evaluated on these selected features. The well-known machine learning classifiers LDA, RF, DT, GBC, K-NN and SVM are used in the framework, together with model validation and performance evaluation metrics. Figure 2 shows the experimental framework for predicting death events due to heart disease.

Fig. 2  System framework for predicting death from heart disease

The strategy of the proposed framework is divided into five stages: data set preprocessing, feature selection, cross-validation methods, machine learning classifiers, and classifier performance evaluation techniques. One way the stages fit together is sketched below.
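The following sketch is one hypothetical way to compose these five stages with a scikit-learn Pipeline; the paper does not describe its code structure, so every name and parameter here is illustrative:

```python
# Sketch: preprocessing -> feature selection -> classifier, scored by fivefold CV.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("heart_failure_clinical_records_dataset.csv")
X, y = df.drop(columns=["DEATH_EVENT"]), df["DEATH_EVENT"]

pipe = Pipeline([
    ("scale", StandardScaler()),                        # stage 1: preprocessing
    ("sfs", SequentialFeatureSelector(                  # stage 2: feature selection
        RandomForestClassifier(random_state=0),
        n_features_to_select=5, direction="forward", cv=5)),
    ("clf", RandomForestClassifier(random_state=0)),    # stage 4: classifier
])

# Stages 3 and 5: cross-validation and performance evaluation.
scores = cross_val_score(pipe, X, y, cv=5, scoring="accuracy")
print("fold accuracies:", scores.round(4), "mean:", round(scores.mean(), 4))
```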
Experimental Setup

The following subsections briefly discuss the research materials and methods of the paper. All calculations are performed in Python 3.7 on an Intel(R) Core™ i3-1800 CPU @ 2.93 GHz PC.

Data Set

The "Heart Failure Clinical Records Data Set 2020" has been used by different researchers [29] and can be obtained from the online data mining archive of the UCI machine learning repository. This data set was used in the present study to design a machine-learning-based heart failure framework. The UCI heart failure data set has a sample size of 299 patients, has 13 features, and has no missing values. The most suitable independent input features and the target output marker are extracted and used to diagnose heart failure. The target class has two categories, classifying a patient as dead or alive from heart failure during the follow-up period. The extracted data set is therefore a 299 × 13 feature matrix. Table 3 gives the complete list and descriptions of the 13 features for the 299 cases in the data set.

Table 3  Attribute information

Data Preprocessing

Preprocessing the data is essential for describing the data effectively to the machine learning classifiers, which must be trained and tested in a feasible way. Preprocessing methods (for example, removal of missing values, the standard scaler and the MinMax scaler) have been applied to the data set before it is passed to the classifiers [30]. The standard scaler guarantees that each feature has mean 0 and variance 1, so that all features have coefficients of similar magnitude. Similarly, the MinMax scaler shifts the data so that all features lie in the range 0–1.
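A short sketch of the two scaling schemes, assuming scikit-learn's StandardScaler and MinMaxScaler; the file name matches the UCI distribution of the data set but should be treated as an assumption:

```python
# Sketch: the two scaling schemes described above.
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, StandardScaler

df = pd.read_csv("heart_failure_clinical_records_dataset.csv")
print("missing values:", int(df.isna().sum().sum()))   # the data set has none

X = df.drop(columns=["DEATH_EVENT"])
X_std = StandardScaler().fit_transform(X)   # each feature: mean 0, variance 1
X_mm = MinMaxScaler().fit_transform(X)      # each feature shifted into [0, 1]
print(X_std.mean(axis=0).round(2))          # ~0 for every feature
print(X_mm.min(), X_mm.max())               # 0.0 and 1.0
```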

Cross Validation

In k-fold cross validation, the data set is divided into k equal-sized parts, where k − 1 parts are used to train the classifier and the remaining part is used to check its performance at each step [31]. The validation cycle is repeated k times, and the classifier's performance is computed from the k results. Various values of k can be chosen for CV; in our analysis we use k = 5, because its performance is acceptable. In the fivefold CV measurement, 70% of the data is used for training and 30% is used for testing. For each loop over the folds, the split is redrawn, and all the cases in the training and test partitions are randomly divided over the entire data set before training and testing on the new split for the new loop. Finally, at the end of the fivefold measurement, the average of all performance measurements is computed.
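The procedure above maps onto a standard k-fold loop. A sketch assuming scikit-learn (stratification, shuffling and the seed are illustrative choices):

```python
# Sketch: fivefold cross validation with per-fold and averaged accuracy.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold

df = pd.read_csv("heart_failure_clinical_records_dataset.csv")
X, y = df.drop(columns=["DEATH_EVENT"]), df["DEATH_EVENT"]

accs = []
splitter = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in splitter.split(X, y):
    clf = RandomForestClassifier(random_state=0)
    clf.fit(X.iloc[train_idx], y.iloc[train_idx])      # k - 1 folds for training
    accs.append(accuracy_score(y.iloc[test_idx],       # held-out fold for testing
                               clf.predict(X.iloc[test_idx])))

print("per-fold accuracy:", [round(a, 3) for a in accs])
print("average accuracy :", round(sum(accs) / len(accs), 3))
```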
Results and Discussion

This part of the article discusses the classification models and results from several perspectives. First, we checked the performance of the various machine learning algorithms, namely linear discriminant analysis, random forest, decision tree, gradient boosting classifier, k-nearest neighbor and support vector machine, on the complete feature set of the heart failure clinical records data set. Second, we used the feature selection algorithm SFS to determine the important features. Third, performance is considered on the selected features. The k-fold cross-validation strategy is likewise used. To check the performance of the classifiers, the performance evaluation measures are applied. All features are standardized before being passed to the classifiers.

Result of Image Analysis

This experiment includes people who had a heart attack and either died or survived during follow-up [32]. Figure 3 shows the set of attributes that contain binary values, 1 or 0 (present or absent). This category includes the attributes anemia, diabetes, hypertension, sex, smoking, and death event.

Fig. 3  Attributes with Boolean values
• Anemia (hemoglobin): People without anemia are less likely to die than people with severe anemia.
• Diabetes (whether the patient has diabetes): According to Fig. 3, diabetes is not a major risk for people who have already had a cardiac attack.
• High blood pressure (hypertension): Patients with hypertension have a high risk of death.
• Sex (woman or man): Compared with female patients (0), male patients (1) have a higher risk of death.
• Smoking (whether the patient smokes): As shown in this experiment, smoking has a small effect on the number of deaths.
• Death event (whether the patient died during the follow-up period): During the follow-up period, there were fewer deaths than survivors.

Figure 4 shows the attributes with continuous values: age, creatinine phosphokinase, ejection fraction, platelets, serum creatinine, serum sodium, and time.

Fig. 4  Attributes with continuous values

• Age (years): In Fig. 4, most patients died around the age of 60 during the follow-up period, and the age range 60–80 is the most dangerous for heart disease patients.
• Creatinine phosphokinase (level of the CPK enzyme in the blood): The CPK enzyme level range 0–2000 is the most dangerous, because more people die in this range.
• Ejection fraction (percentage of blood leaving the heart at each contraction): An ejection fraction between 20 and 40% is very dangerous; most of the people who died fell in this range.
• Platelets (platelets in the blood): The people who died during the follow-up period had blood platelet counts of 200,000–400,000.
• Serum creatinine (level of serum creatinine in the blood): The dangerous level of serum creatinine is 0–2, at which most people died.
• Serum sodium (level of serum sodium in the blood): The fatal level is 130–140.
• Time (follow-up period in days): For a patient who is already at risk (heart attack), the follow-up period (0–100 days) matters most for whether he or she will survive.

The correlation matrix in Fig. 5 represents the relationships between attributes. The closer a positive value is to 1, the more strongly correlated the features are, while a negative value represents a negative correlation between features: if one feature increases, the other decreases, and vice versa [33]. If the value is 0, there is no association between the attributes. In the figure, age, serum creatinine, gender, and smoking are highly correlated with death events, while ejection fraction, serum sodium and time are negatively correlated with the target variable.

Fig. 5  Correlation matrix

The correlation with the target variable is another measure of feature importance. In Fig. 6, the two characteristics gender and smoking have a low correlation with the target, while the other variables have a strong correlation with the target variable.

Fig. 6  Correlation with target variable
Result of Classifiers (Fivefold Cross Validation) with All Features (n = 13) and with Selected Features (SFS)

In this experiment, all features of the data set are fed to the six machine learning classifiers using fivefold cross-validation, with 70% used to train each classifier and only 30% used for testing. Finally, the averaged measurement of the fivefold technique is obtained. In addition, various parameter estimates were passed to the classifiers. Table 4 lists the fivefold cross-validation results of the six classifiers with all features and the results with sequential feature selection.

In Table 4, the random forest classifier with feature selection shows the best performance, with an accuracy of 86.67%. Next come the random forest and gradient boosting classifiers, whose accuracies on all features are equal at 85.56%. Distinguishing the classifiers with feature selection from those without, the classifiers RandomForestClassifier_FS, LinearDiscriminantAnalysis_sfs, KNeighborsClassifier_sfs, GradientBoostingClassifier_sfs, RandomForestClassifier_sfs and DecisionTreeClassifier_sfs have descending accuracies of 86.67%, 82.22%, 80.00%, 77.78%, 75.56% and 74.44%, respectively. On the other hand, the classifiers without feature selection, RandomForestClassifier, GradientBoostingClassifier, SVM_rbf, LinearDiscriminantAnalysis, SVC_linear, DecisionTreeClassifier, SVC_poly and KNeighborsClassifier, have descending accuracies of 85.56%, 85.56%, 84.44%, 82.22%, 82.22%, 78.89%, 77.78% and 76.67%, respectively. Apart from the random forest classifier with feature selection, only random forest and gradient boosting on all features perform comparably well, though slightly lower. Figure 7 shows the performance of the different classifiers on the training and test data sets in the same order.
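The comparison reported here can be reproduced in outline with the sketch below, assuming scikit-learn implementations of the six classifiers; hyperparameters are library defaults rather than the paper's tuned values, so the printed numbers will not match Table 4 exactly:

```python
# Sketch: fivefold cross-validated accuracy of the six classifiers, all 13 features.
import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("heart_failure_clinical_records_dataset.csv")
X, y = df.drop(columns=["DEATH_EVENT"]), df["DEATH_EVENT"]

models = {
    "LDA": LinearDiscriminantAnalysis(),
    "RF": RandomForestClassifier(random_state=0),
    "GBC": GradientBoostingClassifier(random_state=0),
    "DT": DecisionTreeClassifier(random_state=0),
    "SVM_rbf": SVC(kernel="rbf"),
    "KNN": KNeighborsClassifier(),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    print(f"{name:8s} mean fivefold accuracy: {acc:.4f}")
```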
Results of Validation Metrics

To compile and verify the results from the six classifiers, we have Table 5, which lists the different classifiers together with the important features selected for each. Observing the table, all six classifiers share two prominent selected features: ejection_fraction and serum_creatinine.

Precision, recall, and f1-score have their usual meanings, and these values validate the results obtained by each classifier. The confusion matrix gives the TP, FN, TN and FP predictions of the different classifiers for patient deaths during the follow-up period.

The ROC curve plots the true positive rate against the false positive rate, and the area under it is reported here under fivefold cross validation. Different folds give different results, so to remove this ambiguity the average is also calculated. The highest average ROC_AUC, 74%, is obtained by the GBC classifier.
Fig. 7  Performance of the classifiers with and without SFS

Table 4  Classifiers accuracy with and without SFS

Classifier name                   Test accuracy (%)   Train accuracy (%)
RandomForestClassifier_FS         86.67               100.00
RandomForestClassifier            85.56               100.00
GradientBoostingClassifier        85.56               100.00
SVM_rbf                           84.44                90.43
LinearDiscriminantAnalysis        82.22                86.60
LinearDiscriminantAnalysis_sfs    82.22                85.17
SVC_linear                        82.22                85.65
KNeighborsClassifier_sfs          80.00                80.38
DecisionTreeClassifier            78.89               100.00
GradientBoostingClassifier_sfs    77.78                89.95
SVC_poly                          77.78                88.52
KNeighborsClassifier              76.67                77.51
RandomForestClassifier_sfs        75.56                99.52
DecisionTreeClassifier_sfs        74.44                94.74
Table 5  Classifiers performance validation with SFS (ROC curve column not reproduced)

LDA. Selected features: age, creatinine_phosphokinase, ejection_fraction, serum_creatinine, time.

               precision   recall   f1-score   support
  0            0.83        0.94     0.88       62
  1            0.80        0.57     0.67       28
  micro avg    0.82        0.82     0.82       90
  macro avg    0.81        0.75     0.77       90
  weighted avg 0.82        0.82     0.81       90

RF. Selected features: age, diabetes, ejection_fraction, serum_creatinine.

               precision   recall   f1-score   support
  0            0.79        0.87     0.83       62
  1            0.64        0.50     0.56       28
  micro avg    0.76        0.76     0.76       90
  macro avg    0.72        0.69     0.70       90
  weighted avg 0.75        0.76     0.75       90

DT. Selected features: diabetes, ejection_fraction, serum_creatinine, smoking.

               precision   recall   f1-score   support
  0            0.81        0.82     0.82       62
  1            0.59        0.57     0.58       28
  micro avg    0.74        0.74     0.74       90
  macro avg    0.70        0.70     0.70       90
  weighted avg 0.74        0.74     0.74       90

GBC. Selected features: ejection_fraction, serum_creatinine, smoking.

               precision   recall   f1-score   support
  0            0.80        0.90     0.85       62
  1            0.70        0.50     0.58       28
  micro avg    0.78        0.78     0.78       90
  macro avg    0.75        0.70     0.72       90
  weighted avg 0.77        0.78     0.77       90

KNN. Selected features: anaemia, ejection_fraction, platelets, serum_creatinine, time.

               precision   recall   f1-score   support
  0            0.79        0.97     0.87       62
  1            0.86        0.43     0.57       28
  micro avg    0.80        0.80     0.80       90
  macro avg    0.82        0.70     0.72       90
  weighted avg 0.81        0.80     0.78       90

SVM. Selected features: serum_creatinine, ejection_fraction, smoking.

               precision   recall   f1-score   support
  0            0.89        0.75     0.81       44
  1            0.80        0.91     0.85       47
  avg/total    0.84        0.84     0.83       91
Conclusion

In this study, a prediction system based on hybrid intelligent machine learning was proposed to diagnose deaths during follow-up. The system was tested on the heart failure clinical records data set. Six well-known classifiers (LDA, RF, GBC, DT, SVM and KNN) are used together with the feature selection algorithm SFS to select important features, and the system uses the k-fold cross-validation method for verification. Different evaluation indicators are used to check the performance of the classifiers. The feature selection algorithm selects important features that improve the classification accuracy, precision, recall, f1-score and ROC_AUC performance of the classifiers and reduce the computation time of the algorithms. When RandomForestClassifier_FS is run with fivefold cross validation on features selected by the FS algorithm SFS, its best accuracy is 86.67%; owing to this good performance, it is the better prediction system in terms of accuracy. The random forest and gradient boosting classifiers followed closely, both performing well at 85.56%. In Table 5, the GBC results give the highest average ROC_AUC, at 74%. As Table 4 shows, feature selection algorithms should be used before classification to improve the classification accuracy of the classifier. Using feature selection (SFS), we obtained two important features (ejection_fraction and serum_creatinine) from which death events can be predicted. The FS algorithm can therefore reduce the computation time and improve the classification accuracy of the classifier, and it selects the important features that distinguish death events from survivors.

The novelty of this exploratory work is the construction of a discovery framework that can foresee death events in advance. The framework uses SFS calculations, six classifiers, a cross-validation strategy and performance evaluation metrics. By using a machine learning strategy to design the decision support network, the analysis of heart disease becomes more reliable. In addition, unrelated features reduce the performance of the model and extend the computation time; therefore, another creative element of this research is the use of feature selection calculations to pick the best features. These best features can reduce the execution time of the classification model and improve the classification accuracy. In future work, we will conduct more experiments using other feature selection and simplification procedures to build on these results of the prerequisite classifiers for determining heart disease.

Compliance with Ethical Standards

Conflict of Interest  The authors declare that they have no conflict of interest.
sis. Int J Refrig. 2018;86:401–9.
18. Ruan F, Qi J, Yan C, Tang H, Zhang T, Li H. Quantitative detec-
tion of harmful elements in alloy steel by LIBS technique and
References sequential backward selection-random forest (SBS-RF). J Anal
Atom Spectrom. 2017;32(11):2194–9.
1. Towbin JA, Bowles NE. The failing heart. Nature. 19. Chaurasia V, Pal S. Skin diseases prediction: binary classification
2002;415(6868):227–33. machine learning and multi model ensemble techniques. Res J
2. Ponikowski P, Voors AA, Anker SD, Bueno H, Cleland JG, Coats Pharmacy Technol. 2019;12(8):3829–32.
AJ, et al. 2016 ESC Guidelines for the diagnosis and treatment of 20. Chaurasia V, Pal S. Applications of machine learning techniques to
acute and chronic heart failure: the Task Force for the diagnosis predict diagnostic breast cancer. SN Comput Sci. 2020;1(5):1–11.
and treatment of acute and chronic heart failure of the European 21. Safavian SR, Landgrebe D. A survey of decision tree clas-
Society of Cardiology (ESC) Developed with the special contribu- sifier methodology. IEEE Trans Syst Man Cybernet.
tion of the Heart Failure Association (HFA) of the ESC. Eur Heart 1991;21(3):660–74.
J. 2016;37(27):2129–200. 22. Son J, Jung I, Park K, Han B Tracking-by-segmentation with
3. Aljaaf AJ, Al-Jumeily D, Hussain AJ, Dawson T, Fergus P, Al- online gradient boosting decision tree. In: Proceedings of the
Jumaily M. Predicting the likelihood of heart failure with a multi IEEE International Conference on Computer Vision; 2015. p.
level risk assessment using decision tree. In: 2015 Third Interna- 3056–3064.
tional Conference on Technological Advances in Electrical, Elec- 23. Keller JM, Gray MR, Givens JA. A fuzzy k-nearest neighbor algo-
tronics and Computer Engineering (TAEECE); 2015. p. 101–106. rithm. IEEE Trans Syst Man Cybernet. 1985;4:580–5.
IEEE. 24. Vishwanathan SVM, Murty MN. SSVM: a simple SVM algo-
4. Cowie MR. The heart failure epidemic: a UK perspective. Echo rithm. In: Proceedings of the 2002 International Joint Conference
Res Pract. 2017;4(1):R15–20. on Neural Networks. IJCNN’02 (Cat. No. 02CH37290) Vol. 3;
5. Dokken BB. The pathophysiology of cardiovascular disease and 2002. p. 2393–2398. IEEE.
diabetes: beyond blood pressure and lipids. Diabetes Spectrum. 25. Higham NJ. Computing the nearest correlation matrix—a problem
2008;21(3):160–5. from finance. IMA J Numerical Anal. 2002;22(3):329–43.
6. Liu M, Jiang Y, Wedow R, Li Y, Brazel DM, Chen F, et al. Asso- 26. McColl KA, Vogelzang J, Konings AG, Entekhabi D, Piles M,
ciation studies of up to 1.2 million individuals yield new insights Stoffelen A. Extended triple collocation: estimating errors and
into the genetic etiology of tobacco and alcohol use. Nat Genetics. correlation coefficients with respect to an unknown target. Geo-
2019;51(2):237–44. phys Res Lett. 2014;41(17):6229–36.
7. Kadkhoda M, Jahani H. Problem-solving capacities of spiritual 27. Oberkampf WL, Barone MF. Measures of agreement between
intelligence for artificial intelligence. Procedia-Soc Behav Sci. computation and experiment: validation metrics. J Comput Phys.
2012;32:170–5. 2006;217(1):5–36.
8. Shilaskar S, Ghatol A. Feature selection for medical diagno- 28. Pritom AI, Munshi MAR, Sabab SA, Shihab S. Predicting breast
sis: evaluation for cardiovascular diseases. Expert Syst Appl. cancer recurrence using effective classification and feature selec-
2013;40(10):4146–53. tion technique. In: 2016 19th International Conference on Com-
9. Kumar D, Carvalho P, Antunes M, Paiva RP, Henriques J. Heart puter and Information Technology (ICCIT); 2016. p. 310–314.
murmur classification with feature selection. In: 2010 Annual IEEE
International Conference of the IEEE Engineering in Medicine 29. Chapman B, DeVore AD, Mentz RJ, Metra M. Clinical profiles in
and Biology; 2010. p. 4566–4569. IEEE. acute heart failure: an urgent need for a new approach. ESC Heart
10. Mokeddem S, Atmani B, Mokaddem M. Supervised feature selec- Failure. 2019;6(3):464–74.
tion for diagnosis of coronary artery disease based on genetic 30. Shaheen H, Agarwal S, Ranjan P. MinMaxScaler binary PSO for
algorithm. arXiv preprint. 2013; arXiv​:1305.6046 feature selection. In: First International Conference on Sustain-
11. Haq AU, Li J, Memon MH, Memon MH, Khan J, Marium SM. able Technologies for Computational Intelligence. Singapore:
Heart disease prediction system using model of machine learning Springer; 2020. p. 705–16.

SN Computer Science
344 Page 16 of 16 SN Computer Science (2020) 1:344

31. Chaurasia V, Pal S, Tiwari BB. Chronic Kidney disease: a 33. Rebonato R, Jäckel P. The most general methodology to create a
predictive model using decision tree. Int J Eng Res Technol. valid correlation matrix for risk management and option pricing
2018;11(11):1781–94. purposes. Available at SSRN 1969689. 2011.
32. Tantimongcolwat T, Naenna T, Isarankura-Na-Ayudhya C, Embre-
chts MJ, Prachayasittikul V. Identification of ischemic heart dis- Publisher’s Note Springer Nature remains neutral with regard to
ease via machine learning analysis on magnetocardiograms. Com- jurisdictional claims in published maps and institutional affiliations.
put Biol Med. 2008;38(7):817–25.

SN Computer Science
