Informatics in Medicine Unlocked: Sciencedirect
ARTICLE INFO

Keywords: Cardiovascular disease; Deep learning; CNN-LSTM; Feature engineering; Explainable AI

ABSTRACT

Cardiovascular disease (CVD) is a leading cause of death worldwide, with millions dying each year. The identification and early diagnosis of CVD are critical in preventing adverse health outcomes. Hence, this study proposes a hybrid deep learning (DL) model that combines a convolutional neural network (CNN) and long short-term memory (LSTM) to identify CVD from clinical data. The CNN extracts the relevant features from the input data, and the LSTM network processes sequential data and captures dependencies and patterns over time. This study provides insights into the potential of a hybrid DL model combined with feature engineering and explainable AI to improve the accuracy and interpretability of CVD prediction. We evaluated our model on a publicly available dataset, where the proposed CNN-LSTM achieved an accuracy of 73.52% without and 74.15% with feature engineering in identifying individuals with CVD, which is the best result compared to the current state-of-the-art model. The results of this study demonstrate the potential of DL models for the early diagnosis of CVD. Our proposed CNN-LSTM model also incorporates explainable AI to identify the top features responsible for CVD. These features could be used to develop more effective screening tools in clinical practice.
1. Introduction

Cardiovascular disease (CVD) is a condition that causes blood vessel obstruction and heart attacks with chest discomfort, as well as other heart diseases and heart failure that may result in death or other severe issues [1]. It has been number one on the list of the top ten causes of death over the previous 15 years, accounting for 15 million deaths in 2015 [2]. Research from January 2017 demonstrated that cardiovascular diseases are the main cause of death on a global scale. According to the World Health Organization, in 2020 this disease was the primary cause of death worldwide, with an estimated 17.9 million fatalities annually [3]. Additionally, coronary disease mortality increases annually. It is anticipated that the annual number of deaths will surpass 23.6 million by 2030. CVD is considered the leading cause of death worldwide. Coronary heart disease, cerebrovascular disease, peripheral arterial disease, strokes and transient ischaemic attacks (TIA), vascular illness, chronic heart illness,
* Corresponding author. Department of Computer Science and Engineering, Islamic University, Kushtia, 7003, Bangladesh.
** Corresponding author. Department of Biomedical Engineering, Islamic University, Kushtia, 7003, Bangladesh.
Authors: M.M. Hossain, M.S. Ali, M.M. Ahmed, M.R.H. Rakib, M.A. Kona, S. Afrin, M.K. Islam, M.M. Ahsan, S.M.R.H. Raj, M.H. Rahman.
https://doi.org/10.1016/j.imu.2023.101370
Received 2 April 2023; Received in revised form 27 September 2023; Accepted 2 October 2023
Available online 4 October 2023
2352-9148/© 2023 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
M.M. Hossain et al. Informatics in Medicine Unlocked 42 (2023) 101370
congenital heart defects, thrombosis of deep veins, and pulmonary embolism are the most common cardiovascular conditions [4]. Several contributory risk factors, including hypertension, elevated blood pressure, excessive lipids, and abnormal pulse rate, complicate the diagnosis of cardiac disease. Detecting CVD at an early stage is essential to reducing this toll.

Machine learning (ML) is one of the most swiftly evolving areas of Artificial Intelligence (AI). ML algorithms can analyze vast amounts of data from numerous disciplines, including the vital medical field [5]. Moreover, data mining is among the numerous methods for enhancing disease detection and diagnosis. Furthermore, early detection of CVD reduces costs and CVD mortality. By employing classification algorithms, which are an essential part of medical research, data analysis strategies can complete the task efficiently and affordably. Using a variety of ML algorithms, such as Logistic Regression, Naive Bayes (NB), Support Vector Machines, Decision Trees, and others, we constructed a model.

Through supervised learning and binary analysis of user data, we addressed the identified issue. By dividing the data into training and testing sets, applying different methods or combinations, and evaluating for precision, we developed an accurate predictive model for assessing an individual's risk of CVD. Using this procedure, an adverse factor is identified with high accuracy. This study anticipated the dangers and etiology of CVD from one generation to the next. Informatics in healthcare for CVD is changing in various disciplines, including storing and sending data. The medical consequences of ML for CVD in 2020 are that it accesses the material in a supervised format and provides more accurate data for the anticipated condition.

1.1. Motivation

Late identification of the most critical behavioural risk factors for heart disease and stroke underscores the need for this study. Early detection of these conditions through routine screening exams could improve treatment options and lengthen survival times. Public health initiatives have been established to encourage communities to undergo regular screening for chronic conditions such as CVD. That is why we approach this study using ML. There are many algorithm-based studies, but we aim to bring various algorithms together in one place and report them with good accuracy. Here we employed the Support Vector Machine (SVM), Decision Tree (DT), K-Nearest Neighbors (KNN), NB, Multilayer Perceptron (MLP), CatBoost, Gradient Boosting Machine (GBM), AdaBoost, Random Forest (RF), and a hybrid CNN-LSTM model. Based on our model and its performance, we are motivated to conclude with the proposed CNN-LSTM method. Pre-screening systems for early disease prediction, detection, and prevention are a crucial source of information that applies cutting-edge ICT methods to address issues with health data collection, analysis, and interpretation, as well as to enhance current health systems for the advanced screening of diseases, which we discuss explicitly in this work. We aim to overcome the previous restrictions and produce a successful outcome.

1.2. Contribution

A summary of the primary contributions of our study includes the following:

• We introduce our novel CNN-LSTM model, achieving exceptional accuracy in early-stage cardiac condition diagnosis for patients. As deep learning pioneers, we are the first to develop such a model using CSV datasets.
• Furthermore, our model excels in both accuracy and efficiency, setting new standards. Utilizing CNNs and LSTM networks, we achieve faster diagnosis compared to current ML algorithms, facilitating prompt interventions and potentially improving patient outcomes.
• Additionally, we used feature engineering techniques to identify and incorporate relevant features such as 'blood_diff', 'BMI', 'obese', and 'hypertension' into our model for CVD detection. We also employed preprocessing strategies to refine the dataset, enhancing its specificity and improving the overall performance of our model.
• Lastly, we utilize explainable AI techniques to identify the key features affecting cardiac patients in our CNN-LSTM model to gain valuable insights for clinical decision-making and improving patient outcomes.

1.3. Organization

The remaining sections of the paper are organized as follows: Section 2 provides a comprehensive literature review, offering an overview of existing research. Section 3 delves into the challenges associated with CVD detection. The methodology is outlined in Section 4, detailing the research design and data collection techniques. Sections 5 and 6 present the results and discussions, respectively, analyzing the findings and their implications. Finally, Section 7 concludes the paper, summarizing the main conclusions and proposing directions for future research.

2. Literature review

Cardiovascular disease identification using ML techniques is expanding rapidly, with numerous studies investigating the application of ML algorithms and DL models. By utilizing these techniques, healthcare can be revolutionized by enabling more precise diagnoses and tailored treatment plans for patients. This section reviews prior studies on identifying CVD through diverse ML approaches.

In [6], the authors proposed an RF algorithm correlating diabetes and heart disease. The method used to estimate the likelihood of heart disease accounts for the connection with diabetes and the extent to which diabetes affects coronary artery disease. Better performance can be obtained by using more parameters. The authors of [1] concentrated on leveraging healthcare data for cardiovascular disease prediction through a mobile-based iOS application, obtaining an accuracy of 72.7%. They propose extending the model to encompass other diseases and exploring deep learning and CNN methods for potential efficiency enhancements.

The authors [7] proposed a model employing four methods of classification for data mining: KNN, NB, DT, and RF. They utilized numerous data mining techniques, including regression, clustering, and association rules, to retrieve valuable data and details from large databases. They discovered the maximum accuracy with KNN (k = 7). Implementing more complex models and incorporating additional data mining techniques, including time-series analysis, the combination of association and clustering rules, SVM, and genetic algorithms, can improve the accuracy of early heart disease prediction. In Ref. [8], the authors applied several ML techniques, namely, DT, NB, and Neural Networks, and developed a prototype of the Intelligent Heart Disease Prediction System (IHDPS). This approach has improved accuracy and also saved time and money. They observed that the accuracy of NB was 60%, logistic regression was 61.45%, and SVM was 64.4%. The authors [9] employed a method to improve the coronary artery disease prediction rate. They utilized effective prediction techniques, such as Gaussian NB, SVM, RF, Hoeffding Tree, and LMT, to boost accuracy. RF gives the most accurate results among all the algorithms used. Adding more input attributes and reducing the data size can enhance the genetic algorithm's performance. In Ref. [10], the authors proposed a method for predicting cardiac disease utilizing sklearn libraries, pandas, matplotlib, and other required libraries with a TkInter Python-based application. The hybrid model achieved 88% accuracy. Applying DL techniques to predict heart disease may yield improved results. The authors [11] analyzed heart disease prediction via an ML technique. Comparing all the algorithms used in this method, the RF shows the best
result of 95.60%. The results can be improved by implementing other algorithms and fruitful deep-learning techniques. In Ref. [12], the authors proposed a method for evaluating and summarizing the overall predictive capability of ML algorithms in CVD. They demonstrated that the predictive ability of ML algorithms in CVD, particularly SVM and boosting algorithms, is promising. They might obtain an optimal accuracy rate using more feature selection methodologies and techniques. The authors [13] employed a hybrid approach using ML classifiers for CVD forecasts. They have constructed their framework using a neural network based on general principles. Using the confusion matrix, the proposed method employing the ML classifier obtained a higher accuracy of 85.71%. By varying the number of testing datasets, accuracy can be increased.

In [14], Sonam et al. developed a heart disease prediction system using two classification techniques: NB and DT classifiers. The DT classifier outperformed the NB classifier, but removing irrelevant attributes from the dataset improved the NB classifier's performance. This research emphasizes the importance of classification techniques and data preprocessing for accurate heart disease prediction. In Ref. [4], the authors proposed a method for predicting CVD using symptom input. Six classification algorithms were used to analyze 14 attributes of the Cleveland dataset, with SVM and MLP achieving the highest accuracy of 91.7%. Ensembles and exploring more parameter settings could further improve performance. This method shows promise in providing accurate and immediate disease prediction. The authors [15] employed a Convolutional Neural Network (CNN) approach for predicting CVD. Traditional ML methods, such as Logistic Regression, K-Nearest Neighbors (KNN), NB, Support Vector Machine (SVM), and Neural Networks (NN), are compared to the proposed CNN model. The paper claims a high accuracy of 94% for the CNN model using the UCI ML repository dataset. The CNN model aims to improve efficiency and accuracy by leveraging DL capabilities to analyze complex patterns in medical data. In Ref. [16], the authors addressed the global health burden of CVD, specifically coronary heart disease. The authors proposed an enhanced deep neural network (DNN) model for accurate and reliable diagnosis. The model achieved 83.67% diagnostic accuracy, 93.51% sensitivity, 72.86% specificity, and 79.12% precision. Investigating enhanced DL methods and advanced models can be effective in further improving the accuracy of heart disease diagnoses in patients worldwide. In Ref. [17], the author of the paper employed a model using DL Techniques (DLTs), specifically Artificial Neural Networks (ANN), to analyze and predict CVD in the Robust Healthcare Industry (RHI). The ANN model achieves a high accuracy of 98.4% compared to other models, highlighting its effectiveness in disease analysis and prognosis.

The existing literature on CVD detection using ML algorithms reveals that several studies have been conducted in this domain. However, there is a research gap in developing more effective and productive models that can improve the accuracy and efficiency of these algorithms. Although most studies have focused on using ML algorithms for classification and prediction, the lack of research on developing efficient preprocessing techniques to enhance algorithm performance is evident. Although some studies have achieved high accuracy rates, there is still a need to improve the prediction of CVD at an early stage and identify more accurate risk indicators. There is no available related work to detect CVD using the DL method on tabular data. Furthermore, it has been observed that most researchers used a limited dataset, raising
Table 1
Details of dataset features.
SN    Attribute name    Description
Table 2
Descriptive statistics for the Dataset.
Variable Min Max Mean Median STD Variance MAD RMS Skewness Missing
ML requires laborious feature engineering, which helps DL and improved text classification rules learning. Our study incorporates a visual aid to illustrate the feature engineering process systematically.

When f is used as a splitting feature, its weighted information gain in that node is ascribed. When used as an argument of a splitting part, it is credited with the splitting feature's weighted info-gain, divided by the
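The engineered features named in the contributions ('blood_diff', 'BMI', 'obese', 'hypertension') can be illustrated with a minimal Python sketch. The paper does not give their formulas, so everything here is an assumption made for illustration: the input field names other than `ap_hi`, the definition of `blood_diff` as systolic minus diastolic pressure, the BMI cut-off of 30 for obesity, and the 140/90 mmHg cut-off for hypertension.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """One record. Field names other than ap_hi are assumed for illustration."""
    height_cm: float  # assumed column: height in centimetres
    weight_kg: float  # assumed column: weight in kilograms
    ap_hi: int        # systolic pressure, as named in the paper
    ap_lo: int        # diastolic pressure (assumed column name)


def engineer_features(p: Patient) -> dict:
    """Derive the four engineered features named in the paper.

    The formulas and cut-offs are assumptions, not the authors' definitions.
    """
    bmi = p.weight_kg / (p.height_cm / 100.0) ** 2
    return {
        "blood_diff": p.ap_hi - p.ap_lo,  # assumed: systolic minus diastolic
        "BMI": round(bmi, 2),
        "obese": int(bmi >= 30.0),        # assumed cut-off (BMI of 30 or more)
        "hypertension": int(p.ap_hi >= 140 or p.ap_lo >= 90),  # assumed cut-off
    }


# Hypothetical patient used only to exercise the function.
features = engineer_features(Patient(height_cm=170, weight_kg=90, ap_hi=150, ap_lo=95))
print(features)
```

Keeping the derivation in one small function makes it easy to apply the same transformation to the training and test portions of the data.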
Fig. 5. Heatmap showing correlations among all the features of the dataset.
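A heatmap such as the one in Fig. 5 is built from a matrix of pairwise correlation coefficients. The sketch below assumes the Pearson coefficient (the text does not name the coefficient used) and runs on a few hypothetical toy values rather than the study's data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (assumes non-zero variance)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Return a nested dict: matrix[a][b] is the correlation of columns a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical toy values, chosen only to show positive and negative correlation.
toy = {
    "age": [40, 50, 60, 70],
    "ap_hi": [110, 120, 140, 150],
    "alco": [1, 1, 0, 0],
}
matrix = correlation_matrix(toy)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

Each cell of the heatmap corresponds to one entry of `matrix`; the diagonal is always 1 because every feature correlates perfectly with itself.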
which processes the sequence of feature vectors to model temporal dependencies and capture long-term context [36]. The LSTM network contains memory cells that store information over time and gates that control the flow of information into and out of the cells. This allows the model to learn and remember patterns in the data that occur over long periods of time [37]. The final output of the model can be either a classification label or a prediction of the next value in the sequence, depending on the task at hand. The model is trained using back-propagation through time, which allows the gradients to flow through the entire sequence and update the model's parameters. The architecture of the model we proposed in this study is shown in Fig. 4.

The proposed CNN-LSTM model performs sequential data processing. Firstly, the CNN component extracts features automatically from the raw data, reducing the need for manual feature engineering. Secondly, the LSTM component can capture long-term dependencies and context, allowing the model to make more accurate predictions or classifications. Finally, the model can be trained end-to-end using back-propagation through time, making it more efficient and effective than traditional models that require separate feature extraction and sequence modeling steps.

In our study, we proposed a hybrid CNN-LSTM DL model for analyzing the dataset that leverages the strengths of both architectures. This allows us to capture spatial and temporal dependencies in the data effectively. Our approach fills a gap in the literature, as no previous work applies this hybrid model to the dataset. The integration of CNN and LSTM enables us to extract complex patterns and capture long-term dependencies, resulting in superior performance compared to existing methods.

In our proposed model for the CSV dataset, adjusting the CNN parameters (number of layers, filter size, stride, pooling size), LSTM parameters (number of layers, units, dropout rate), training parameters (learning rate, batch size, number of epochs), feature engineering (feature selection, scaling), and model architecture (skip connections, attention mechanisms) can improve accuracy. Hyperparameter optimization methods like grid search or Bayesian optimization are essential for finding optimal combinations. Thorough experimentation and validation are crucial for achieving the best performance.

4.5. SHAP as XAI

SHAP (SHapley Additive exPlanations) is a well-known Explainable Artificial Intelligence (XAI) technique that describes ML model outcomes. It provides a method for measuring the impact of each model feature on the output of a specific data point. This can help to understand why a model made a particular prediction and which features were most influential in producing that prediction [38]. By utilizing SHAP, insights into a model's predictions can be obtained, allowing biases to be uncovered and rectified, performance to be enhanced, and end-user confidence to be built. Firstly, it calculates the SHAP values to see the insightful features that led to further steps [39]. The SHAP value can be determined via the formula below:

\[
\varphi_i(f, x) = \sum_{z' \subseteq x'} \frac{|z'|!\,(M - |z'| - 1)!}{M!} \left[ f_x(z') - f_x(z' \setminus i) \right] \qquad (5)
\]

Where,

$\varphi_i$ = Shapley value for feature i
$f$ = black-box model
$x$ = input data point
$z'$ = subset
$x'$ = simplified data input
$M$ = number of simplified input features
$f_x(z')$ = model output with feature i
$f_x(z' \setminus i)$ = model output without feature i

The formula calculates the contribution of a single feature by considering all potential subsets of features that include that feature. The contribution is then weighted by the number of subgroups with the feature and averaged over all potential subsets. This produces a single SHAP value for each feature, which may be utilized to explain the model prediction for a specific instance.
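To make the Shapley formula in Eq. (5) concrete, the following Python sketch computes exact Shapley values by enumerating every coalition of features. It is an illustration under stated assumptions, not the SHAP library's algorithm: the sum is written in the classic form over subsets S that exclude feature i (so |S| takes the role of the coalition size in the weight), the toy model `toy_value` is hypothetical, and "absent" features are replaced by a baseline of 0.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values by full enumeration (feasible only for small M)."""
    m = n_features
    phi = [0.0] * m
    for i in range(m):
        others = [j for j in range(m) if j != i]
        for size in range(m):
            # Weight |S|! (M - |S| - 1)! / M! for coalitions S of this size.
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for subset in combinations(others, size):
                without_i = frozenset(subset)
                with_i = without_i | {i}
                phi[i] += weight * (value_fn(with_i) - value_fn(without_i))
    return phi


def toy_value(coalition):
    """Hypothetical model f = 2*x0 + 3*x1 + 4*x0*x2 evaluated at x = (1, 1, 1).

    Features outside the coalition are set to a baseline value of 0.
    """
    x = [1.0 if j in coalition else 0.0 for j in range(3)]
    return 2 * x[0] + 3 * x[1] + 4 * x[0] * x[2]


phi = shapley_values(toy_value, 3)
print([round(v, 6) for v in phi])
```

In this toy case the interaction term is shared equally between features 0 and 2, and the values sum to the difference between the full prediction and the baseline prediction. Exact enumeration grows exponentially with the number of features, which is why practical tools rely on approximations.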
Fig. 6. KDE plot representing diseased and non-diseased patients based on the age distribution.

5. Results

5.1. Experiment setup

This study employed a conventional laptop with Windows 10, an Intel Core i7 processor, and 16 GB of RAM. The experiment was conducted five times, and the final outcome was determined by averaging the five results.

5.2. Performance evaluation

\[
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (11)
\]

In order to gain a better understanding of the dataset's characteristics, exploratory data analytics were performed. The results of these analyses are presented in the following section. Fig. 5 is a heatmap depicting the associated values and inter-feature correlations. All the colored cells show a correlation between the two characteristics and their associated values. The cell color indicates the strength of the connection, with negative values represented by a distinct color and zero indicating no correlation between variables.

In addition, Fig. 6 depicts the density distribution of a dataset of patients with and without illness. According to the dataset, the afflicted group consists largely of patients aged 50 to 60. The graph indicates that age is a significant risk factor for heart disease and that the likelihood of developing the condition grows with age.
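The accuracy in Eq. (11) and the other measures reported in the result tables (sensitivity, specificity, precision, F-measure, kappa, and MCC) can all be derived from the four confusion-matrix counts. The sketch below uses the standard textbook definitions, which are assumed to match the ones used in the study, and hypothetical counts rather than the paper's results.

```python
from math import sqrt


def classification_metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total                      # Eq. (11)
    sensitivity = tp / (tp + fn)                      # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
        "kappa": kappa,
        "mcc": mcc,
    }


# Hypothetical counts chosen for easy checking, not taken from the paper.
metrics = classification_metrics(tp=40, tn=35, fp=15, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Kappa and MCC both correct for chance agreement, which is why they come out lower than plain accuracy for the same confusion matrix.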
Table 3
Classification results of the different classification algorithms regarding Sensitivity, Specificity, and Accuracy.
Algorithm Sensitivity Specificity Accuracy
Table 4
Comparison based on precision, recall, and F-Measures.
Classifier Algorithm Precision Recall F-Measure
Table 5
Evaluation by kappa and MCC.
Classifier Algorithm   Kappa                   MCC
                       Without FE  With FE     Without FE  With FE
GNB                    19.61%      43.89%      23.46%      44.59%
SVM                    26.21%      44.06%      26.27%      44.98%
Decision Tree          45.12%      45.12%      45.21%      44.85%
Logistic Regression    28.48%      46.35%      28.48%      46.66%
CatBoost               46.13%      46.94%      46.30%      46.99%
AdaBoost               45.73%      46.88%      46.62%      47.44%
GBM                    45.66%      47.58%      45.69%      47.61%
RF                     46.76%      47.58%      47.12%      47.88%
MLP                    44.59%      48.00%      44.63%      48.02%
XGBoost                45.48%      48.01%      45.49%      48.14%
Proposed CNN-LSTM      47.07%      47.97%      47.33%      48.41%

Table 6
Value of area under ROC and PRC.
Classifier Algorithm   AUROC                   AUPRC
                       Without FE  With FE     Without FE  With FE
GNB                    0.5982      0.7160      0.6733      0.76254
SVM                    0.6311      0.7060      0.6935      0.7346
DT                     0.7256      0.7234      0.7653      0.7593
LR                     0.6423      0.7213      0.7726      0.7592
CatBoost               0.7307      0.7342      0.7858      0.7801
AdaBoost               0.7288      0.7292      0.7766      0.7717
GBM                    0.7283      0.7373      0.7886      0.7828
RF                     0.7326      0.7370      0.7894      0.7804
MLP                    0.7273      0.5273      0.7787      0.6081
XGBoost                0.7274      0.7383      0.7862      0.7829
Proposed CNN-LSTM      0.7344      0.7395      0.7873      0.7829

FE. Using FE, MLP, SVM, and GNB get the closest results, while GBM, XGBoost, and CNN-LSTM demonstrate superior performance.

Fig. 8a and b depict the ROC curves of the different classification algorithms with and without FE, constructed from the true and false positive rates. They are a graphical depiction of the area under ROC (AUROC) values in Table 6. Fig. 9a and b illustrate the area under the precision-recall curve (AUPRC) for several classification methods, with and without FE, respectively. The AUPRC values agree with the values reported in Table 6, and these figures provide a visual representation of them. Table 7 displays the five most relevant characteristics based on correlation value and feature relevance.

According to the table, high arterial pressure (ap_hi) is the most significant factor for identifying and predicting heart disease with and without FE for various classification algorithms. In addition to age, the level of cholesterol in the blood (cholesterol), whether the patient is underweight, normal weight, or overweight (BMI), and the alcohol level in the patient's blood (alco) are other important predictors of heart disease.

Table 8 displays the feature importance and coefficient scores of the applicable classification algorithms, excluding SVM, MLP, and GNB, which do not produce such values. The importance and coefficients of each feature are presented in the table. Fig. 12a and b visually represent the important features of our proposed CNN-LSTM model as presented in Table 8. The figures illustrate the feature ranking based on their significance and coefficient scores, providing insight into the significant risk factors for CVD.

Without FE, the classifiers had an accuracy range of 59.70%–73.52%. However, when utilizing the same splitting with FE, the
Fig. 8. ROC curve analysis obtained from the proposed hybrid CNN-LSTM model.
Fig. 9. PRC curve analysis obtained from the proposed hybrid CNN-LSTM model.
Table 7
Top five features for heart disease according to applied algorithms.
Feature Ranking 1st 2nd 3rd 4th 5th
Table 8
Feature importance and coefficient scores of different applied algorithms.
Algorithm DT CatBoost AdaBoost GBM RF XGBoost Proposed CNN-LSTM
Table 9
Results comparison of different ML algorithms against our proposed CNN-LSTM classifier using a 5-fold cross-validation technique.
                   1st fold C (%)    2nd fold C (%)    3rd fold C (%)    4th fold C (%)    5th fold C (%)    CV Means (%)      CV STD (%)
Algorithms         w/o FE   with FE  w/o FE   with FE  w/o FE   with FE  w/o FE   with FE  w/o FE   with FE  w/o FE   with FE  w/o FE   with FE
GNB                0.596    0.715    0.595    0.716    0.587    0.713    0.593    0.712    0.594    0.711    0.593    0.713    0.003    0.0019
SVM                0.677    0.483    0.689    0.480    0.679    0.489    0.689    0.489    0.685    0.491    0.684    0.486    0.0051   0.0043
DT                 0.727    0.727    0.732    0.732    0.733    0.733    0.727    0.727    0.723    0.723    0.729    0.728    0.0036   0.0037
LR                 0.714    0.719    0.725    0.724    0.718    0.720    0.729    0.723    0.718    0.7111   0.721    0.720    0.0055   0.0044
CatBoost           0.731    0.727    0.739    0.735    0.733    0.734    0.732    0.732    0.729    0.727    0.733    0.731    0.0033   0.0031
AdaBoost           0.729    0.508    0.731    0.732    0.729    0.729    0.726    0.727    0.725    0.491    0.728    0.638    0.0024   0.1127
GBM                0.732    0.509    0.743    0.739    0.735    0.734    0.737    0.734    0.733    0.648    0.736    0.672    0.0040   0.0885
RF                 0.527    0.676    0.738    0.736    0.736    0.731    0.73     0.731    0.716    0.730    0.690    0.721    0.082    0.0222
MLP                0.728    0.518    0.742    0.492    0.732    0.491    0.734    0.659    0.729    0.509    0.733    0.534    0.0048   0.0635
XGBoost            0.731    0.738    0.743    0.740    0.734    0.732    0.737    0.735    0.734    0.732    0.736    0.734    0.0039   0.0041
Proposed CNN-LSTM  0.736    0.742    0.743    0.746    0.736    0.739    0.732    0.738    0.734    0.744    0.746    0.749    0.0042   0.0040
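The CV mean and CV STD columns of Table 9 summarize the five fold scores of each classifier. The sketch below shows one way such a summary could be computed, using the GNB "without FE" fold scores from the table as input. The helper `k_fold_indices` is a hypothetical illustration of contiguous, unshuffled folds, and the use of the population standard deviation is an assumption (the paper does not state which estimator was used).

```python
from statistics import mean, pstdev


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds without shuffling."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Fold scores of GNB without FE, copied from Table 9.
gnb_without_fe = [0.596, 0.595, 0.587, 0.593, 0.594]
cv_mean = mean(gnb_without_fe)
cv_std = pstdev(gnb_without_fe)  # population standard deviation (assumed)
print(f"CV mean = {cv_mean:.3f}, CV STD = {cv_std:.4f}")

# Each sample lands in exactly one test fold.
folds = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in folds:
    print("test:", test_idx, "train size:", len(train_idx))
```

Under these assumptions, the GNB row rounds to the 0.593 mean and 0.003 standard deviation shown in the table.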
Fig. 10. Accuracy curve analysis of our proposed hybrid CNN-LSTM model.
F1-score were 74.02%, 73.72%, and 73.84%, respectively, without FE, which improved to 75.09%, 75.22%, and 75.15%, respectively, with FE. These results indicate that the proposed CNN-LSTM model predicts CVD risk more effectively when FE is applied. The study conducted an evaluation of the proposed CNN-LSTM model, comparing its performance with and without FE. Figs. 10a and 11a illustrate the outcomes of the model without FE, indicating a test accuracy of 73.52% and the convergence of training and validation loss
Fig. 11. Model loss curve analysis of our proposed hybrid CNN-LSTM model.
Fig. 12. Important features according to SHAP using our proposed CNN-LSTM model.
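A feature ranking like the one in Fig. 12 is typically obtained by averaging the absolute Shapley values of each feature over all samples and sorting the result in descending order. The sketch below illustrates that aggregation step only. The per-sample values are hypothetical numbers, not outputs of the proposed model, and the three feature names are used purely as labels.

```python
def global_importance(shap_rows, feature_names):
    """Rank features by mean absolute Shapley value across samples."""
    n_samples = len(shap_rows)
    mean_abs = [
        sum(abs(row[j]) for row in shap_rows) / n_samples
        for j in range(len(feature_names))
    ]
    # Sort from most to least important.
    return sorted(zip(feature_names, mean_abs), key=lambda pair: pair[1], reverse=True)


# Hypothetical per-sample Shapley values (one row per patient, one column per feature).
rows = [
    [0.30, -0.10, 0.05],
    [-0.20, 0.15, 0.00],
    [0.40, -0.05, -0.10],
]
ranking = global_importance(rows, ["ap_hi", "age", "cholesterol"])
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking absolute values before averaging keeps positive and negative contributions from cancelling, so a feature that pushes predictions strongly in either direction still ranks high.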
Fig. 13. Confusion matrix of our proposed CNN-LSTM without FE, using 80% of the data for training and 20% for testing.
Fig. 14. Confusion matrix of our proposed CNN-LSTM with FE, using 80% of the data for training and 20% for testing.
Fig. 15. The confusion matrix of all the experimented classifiers on Dataset before feature engineering (FE) and after feature engineering (FE), respectively.
Table 10
Comparison between the state-of-the-art approaches and our proposed CNN-LSTM model on the same dataset.
Study Precision Recall F- Measure Kappa Mcc AUROC Sensitivity Specificity Accuracy
after the 55th epoch. Subsequently, Figs. 10b and 11b present the network's performance on training and validation data with FE.

These figures provide a comprehensive overview of accuracy and loss for each epoch. With FE, the CNN-LSTM model exhibited convergence at the 82nd epoch, achieving the highest validation score. The model demonstrated a classification accuracy of 74.15% on unseen test data not encountered during training. These results indicate that FE enhances the CNN-LSTM model's performance, improving accuracy and convergence. The findings emphasize the value of incorporating FE techniques in the proposed model for enhanced predictive capabilities. The significance of individual features in a model can be determined by analyzing their absolute Shapley values. The absolute Shapley values for each feature are averaged across the dataset to obtain a global understanding of feature importance. Afterwards, the features are sorted in descending order of importance, and a plot is generated to visualize the results. In this context, the proposed CNN-LSTM model was utilized to predict CVD. Fig. 12a depicts the Shapley values for important features without employing FE techniques. The plot indicates that the most significant feature is ap_hi. Conversely, Fig. 12b shows the important features after applying FE techniques, and the plot reveals that age is the most significant feature, followed by ap_hi.

In our study, we utilized the confusion matrix to compare the performance of the proposed CNN-LSTM model with and without FE. Fig. 13 displays the confusion matrix obtained from the CNN-LSTM model without FE.

The confusion matrix shows that the model produced 5479 true positives (TP) and 4814 true negatives (TN). Fig. 14 displays the confusion matrix obtained from the CNN-LSTM model with FE, showing 6008 true positives and 4018 true negatives. Overall, adding FE led to an increase in the number of correctly identified positive cases, indicating that the model could better identify cases of heart disease.

Fig. 15 displays the confusion matrices of all the classifiers on the dataset. The dataset was divided into training and testing portions, with 80% of the data used for training and 20% for testing, to assess the performance of the various classifiers.

The confusion matrix provides a valuable visual representation of the model's performance and can be used to guide further improvements to the model. Furthermore, it is worth noting that some classification algorithms, such as RF, AdaBoost, and XGBoost, achieved high accuracy and F1-score with FE. However, their performance decreased without FE, indicating that FE is crucial for these algorithms' performance.

On the other hand, MLP and CatBoost achieved high accuracy and …

… diverse dataset, and employing XAI and feature engineering techniques. These contributions enhance the development of effective and efficient ML models for CVD prediction, providing valuable insights for future research and clinical applications.

However, some limitations of this study should be acknowledged. First, the proposed model was evaluated on a single dataset, and its performance should be validated on other datasets. Second, the model was trained using a supervised learning approach, which requires a large amount of labeled data. Obtaining labeled data can be challenging, and alternative approaches, such as semi-supervised or unsupervised learning, may be necessary.

7. Conclusions

In conclusion, our study makes significant strides in the accurate prediction of heart disease using data mining and machine learning techniques. Through the evaluation of various ML algorithms, we have demonstrated that the CNN-LSTM hybrid model emerges as the most effective algorithm, achieving an accuracy of 74.15%. This finding highlights the potential of advanced ML models in the realm of cardiovascular disease (CVD) prediction, providing a valuable tool for early intervention and prevention. Moreover, we employed the SHAP technique to interpret the importance of features in the models, offering crucial insights into the underlying mechanisms of disease prediction. By identifying key features such as age, systolic blood pressure, and cholesterol levels as essential contributors to heart disease prediction, we have deepened our understanding of the factors influencing CVD development. This valuable information can potentially aid clinicians in devising personalized treatment plans and risk management strategies for patients, ultimately leading to improved clinical outcomes. Our study addresses several research gaps and limitations by leveraging a hybrid CNN-LSTM model, utilizing a diverse dataset, and incorporating feature engineering techniques. These contributions enhance the effectiveness and efficiency of ML models for CVD prediction, positioning them as promising tools in the domain of cardiovascular health management. However, we acknowledge that further research is warranted to validate our findings on larger and more diverse datasets. Continued exploration of ML techniques and their clinical applications in predicting and preventing heart disease will strengthen the field's knowledge base and enable better-informed decision-making for medical practitioners.
F1-score with and without FE, indicating their robustness to FE. Our
proposed CNN-LSTM model performs better in predicting cardiovascular CRediT authorship contribution statement
disease (CVD) risk than other classification algorithms. One of the recent
study [1] on the same dataset achieved an accuracy of 72.7%, while our Md. Maruf Hossain: Conceptualization, Methodology, Software,
model demonstrates an exceptional accuracy of 74.15% in diagnosing Validation, Writing - Original Draft, Md Shahin Ali: Conceptualization,
early-stage cardiac conditions. This represents a significant improve Methodology, Software, Validation, Writing - reviewing & editing. Md.
ment of 1.45% over the previous state-of-the-art approaches. Mahfuz Ahmed: Data curation, Writing, Validation. Md. Rakibul
Comparison between the state-of-the-art approaches and our pro Hasan Rakib: Data curation, Writing, Validation. Moutushi Akter
posed CNN-LSTM model on the same dataset, are shown in Table 10. Kona: Data curation, Writing, Validation. Sadia Afrin: Data curation,
However, in this study, the author did not employ explainable AI and Writing, Validation. Md Khairul Islam: Methodology, Formal analysis,
feature engineering techniques to visualize the specific features Writing - Review & Editing and Supervising. Md Manjurul Ahsan:
responsible for the disease. We utilized these, which underscores the Investigation, Writing - Review & Editing. Sheikh Md. Razibul Hasan
uniqueness of our study. Raj: Data curation, Writing, Validation. Md Habibur Rahman: Formal
However, other algorithms such as RF, AdaBoost, XG- Boost, MLP, analysis, Writing - Review & Editing and Supervising.
and CatBoost also achieved high accuracy and F1-score with FE.
Therefore, FE is an essential step in improving the performance of these Data availability
algorithms. Moreover, our study addressed these research gaps and
limitations by utilizing a hybrid CNN-LSTM model, incorporating a The data used to support the findings of this study are available from
the corresponding author upon request.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

We would like to acknowledge the support provided by the Bio-Imaging Research Lab, Department of Biomedical Engineering, Islamic University, Kushtia 7003, Bangladesh, in carrying out our research successfully.

References

[1] Kedia V, et al. Time efficient IOS application for CardioVascular disease prediction using machine learning. In: 2021 5th international conference on computing methodologies and communication (ICCMC). IEEE; 2021.
[2] Omar S, Mohamed N, Elbendary N. A cardiovascular disease prediction using machine learning algorithms. In: The international undergraduate research conference. The Military Technical College; 2021.
[3] Ali MM, et al. Heart disease prediction using supervised machine learning algorithms: performance analysis and comparison. Comput Biol Med 2021;136:104672.
[4] Arunachalam S. Cardiovascular disease prediction model using machine learning algorithms. Int J Res Appl Sci Eng Technol 2020;8:1006–19.
[5] Ali MS, et al. An enhanced technique of skin cancer classification using deep convolutional neural network with transfer learning models. Machine Learning with Applications 2021;5:100036.
[6] Rubini P, et al. A cardiovascular disease prediction using machine learning algorithms. Annals of the Romanian Society for Cell Biology 2021:904–12.
[7] Shah D, Patel S, Bharti SK. Heart disease prediction using machine learning techniques. SN Computer Science 2020;1:1–6.
[8] Jagtap A, et al. Heart disease prediction using machine learning. International Journal of Research in Engineering, Science and Management 2019;2(2):352–5.
[9] Motarwar P, et al. Cognitive approach for heart disease prediction using machine learning. In: 2020 international conference on emerging trends in information technology and engineering (ic-ETITE). IEEE; 2020.
[10] Kavitha M, et al. Heart disease prediction using hybrid machine learning model. In: 2021 6th international conference on inventive computation technologies (ICICT). IEEE; 2021.
[11] Katarya R, Meena SK. Machine learning techniques for heart disease prediction: a comparative study and analysis. Health Technol 2021;11:87–97.
[12] Krittanawong C, et al. Machine learning prediction in cardiovascular diseases: a meta-analysis. Sci Rep 2020;10(1):16057.
[13] Kumar NK, et al. Analysis and prediction of cardio vascular disease using machine learning classifiers. In: 2020 6th international conference on advanced computing and communication systems (ICACCS). IEEE; 2020.
[14] Nikhar S, Karandikar A. Prediction of heart disease using machine learning algorithms. International Journal of Advanced Engineering, Management and Science 2016;2(6):239484.
[15] Sajja TK, Kalluri HK. A deep learning method for prediction of cardiovascular disease using convolutional neural network. Rev. d’Intelligence Artif. 2020;34(5):601–6.
[16] Miao KH, Miao JH. Coronary heart disease diagnosis using deep neural networks. Int J Adv Comput Sci Appl 2018;9(10).
[17] Junejo A, et al. Notice of retraction: molecular diagnostic and using deep learning techniques for predict functional recovery of patients treated of cardiovascular disease. IEEE Access 2019;7:120315–25.
[18] Farzadfar F. Cardiovascular disease risk prediction models: challenges and perspectives. Lancet Global Health 2019;7(10):e1288–9.
[19] Thabtah F, et al. Data imbalance in classification: experimental evaluation. Inf Sci 2020;513:429–41.
[20] Islam MK, et al. Brain tumor detection in MR image using superpixels, principal component analysis and template based K-means clustering algorithm. Machine Learning with Applications 2021;5:100044.
[21] Ahsan MM, et al. Deep transfer learning approaches for Monkeypox disease diagnosis. Expert Syst Appl 2023;216:119483.
[22] Al-Rawahnaa ASM, Al Hadid AYB. Data mining for Education Sector, a proposed concept. Journal of Applied Data Sciences 2020;1(1):1–10.
[23] Ahsan MM, et al. Monkeypox diagnosis with interpretable deep learning. IEEE Access; 2023.
[24] Henderi H, Wahyuningsih T, Rahwanto E. Comparison of min-max normalization and Z-score normalization in the K-nearest neighbor (kNN) algorithm to test the accuracy of types of breast cancer. Int J Intell Inf Syst 2021;4(1):13–20.
[25] Lou R, et al. Automated detection of radiology reports that require follow-up imaging using natural language processing feature engineering and machine learning classification. J Digit Imag 2020;33:131–6.
[26] Dai D, et al. Using machine learning and feature engineering to characterize limited material datasets of high-entropy alloys. Comput Mater Sci 2020;175:109618.
[27] Amin A, et al. Customer churn prediction in telecommunication industry using data certainty. J Bus Res 2019;94:290–301.
[28] Amin A, et al. Cross-company customer churn prediction in telecommunication: a comparison of data transformation methods. Int J Inf Manag 2019;46:304–19.
[29] Ali MS, et al. Alzheimer’s disease detection using m-random forest algorithm with optimum features extraction. In: 2021 1st international conference on artificial intelligence and data analytics (CAIDA). IEEE; 2021.
[30] Géron A. Hands-on machine learning with scikit-learn, keras, and TensorFlow. O’Reilly Media, Inc.; 2022.
[31] Dahouda MK, Joe I. A deep-learned embedding technique for categorical features encoding. IEEE Access 2021;9:114381–91.
[32] Baughman A, et al. Study of feature importance for quantum machine learning models. 2022. arXiv preprint arXiv:2202.11204.
[33] Hind M, et al. TED: teaching AI to explain its decisions. In: Proceedings of the 2019 AAAI/ACM conference on AI, Ethics, and Society; 2019.
[34] Molnar C, et al. Model-agnostic feature importance and effects with dependent features: a conditional subgroup approach. Data Min Knowl Discov 2023:1–39.
[35] Torres JF, et al. Deep learning for time series forecasting: a survey. Big Data 2021;9(1):3–21.
[36] Khorram A, Khalooei M, Rezghi M. End-to-end CNN+LSTM deep learning approach for bearing fault diagnosis. Appl Intell 2021;51:736–51.
[37] Muruganandam NS, Arumugam U. Seminal stacked long short-term memory (SS-LSTM) model for forecasting particulate matter (PM2.5 and PM10). Atmosphere 2022;13(10):1726.
[38] Das A, Rad P. Opportunities and challenges in explainable artificial intelligence (xai): a survey. 2020. arXiv preprint arXiv:2006.11371.
[39] Islam MK, et al. Enhancing lung abnormalities detection and classification using a Deep Convolutional Neural Network and GRU with explainable AI: a promising approach for accurate diagnosis. Machine Learning with Applications; 2023, 100492.
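
Supplementary sketch (global feature importance). The discussion of Fig. 12 describes averaging the absolute Shapley values of each feature across the dataset and sorting the features in descending order of importance. The following is a minimal sketch of that ranking step only, assuming the per-sample Shapley values have already been computed by some explainer. The function name `global_importance`, the feature name `cholesterol`, and all numeric values are illustrative assumptions, not results or code from this study.

```python
def global_importance(shap_values, feature_names):
    """Rank features by mean absolute Shapley value, most important first.

    shap_values: list of rows, one per sample, each holding one Shapley
                 value per feature (hypothetical input layout).
    feature_names: list of feature names, in the same column order.
    """
    if not shap_values:
        raise ValueError("shap_values must contain at least one row")
    n_features = len(feature_names)
    if any(len(row) != n_features for row in shap_values):
        raise ValueError("every row needs one value per feature")

    n_samples = len(shap_values)
    # Average |Shapley value| over all samples, feature by feature.
    scores = {
        name: sum(abs(row[j]) for row in shap_values) / n_samples
        for j, name in enumerate(feature_names)
    }
    # Sort in descending order of importance.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy values for illustration only (NOT taken from the paper's figures).
feature_names = ["age", "ap_hi", "cholesterol"]
shap_values = [
    [0.30, -0.20, 0.05],
    [-0.10, 0.40, -0.05],
    [0.20, -0.30, 0.10],
]

ranking = global_importance(shap_values, feature_names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking the absolute value before averaging matters here: positive and negative contributions of the same feature would otherwise cancel, and a feature with a strong but sign-varying effect would appear unimportant.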
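
Supplementary sketch (confusion-matrix arithmetic). The comparison of Figs. 13 and 14 and the comparison with the earlier study [1] rest on simple arithmetic, which can be checked as follows. The counts and accuracies are the ones quoted in the text; reading the "correctly predicted" counts as true positives (TP) is an assumption, and the variable names are illustrative. False positives and false negatives are not quoted in the text, so no accuracy is recomputed here.

```python
# Counts quoted in the discussion of Figs. 13 and 14.
# Assumption: "correctly predicted" instances are true positives (TP).
without_fe = {"tp": 5479, "tn": 4814}
with_fe = {"tp": 6008, "tn": 4018}

# Total correct predictions are TP + TN in each setting.
correct_without_fe = without_fe["tp"] + without_fe["tn"]
correct_with_fe = with_fe["tp"] + with_fe["tn"]

# Change in each cell when FE is added.
tp_change = with_fe["tp"] - without_fe["tp"]
tn_change = with_fe["tn"] - without_fe["tn"]

# Accuracy gap to the earlier study [1], in percentage points.
accuracy_gap = round(74.15 - 72.7, 2)

print("correct without FE:", correct_without_fe)
print("correct with FE:", correct_with_fe)
print("TP change:", tp_change, "TN change:", tn_change)
print("accuracy gap (percentage points):", accuracy_gap)
```

Under the stated assumption, the gain from FE is in true positives, while the true-negative count falls; the gap to the earlier study is a difference in percentage points rather than a relative improvement.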