Automated Detection of Atrial Fibrillation Using Wavelets
Project Report Submitted in Partial Fulfillment of the Requirements for the Degree of
Bachelor of Technology
in
Electronics and Communication Engineering
Submitted by
Harshvardhan Paithane (Roll No. 16ECE1008)
Mahima Tendulkar (Roll No. 16ECE1012)
Sukkhada Joshii (Roll No. 16ECE1028)
CERTIFICATE
This is to certify that the work contained in this report entitled “Auto-
mated Detection of Atrial Fibrillation Using Wavelets” is submitted
by the group members Mr. Harshvardhan Paithane (Roll. No: 16ECE1008),
Ms. Mahima Tendulkar (Roll. No: 16ECE1012) and Ms. Sukkhada Joshii
(Roll. No: 16ECE1028) to the Department of Electronics and Communi-
cation Engineering, National Institute of Technology Goa, for the partial
fulfillment of the requirements for the degree of Bachelor of Technology
in Electronics and Communication Engineering.
They have carried out their work under my supervision. This work has
not been submitted elsewhere for the award of any other degree or diploma.
The project work, in our opinion, has reached the standard required to
fulfill the requirements for the degree of Bachelor of Technology in
Electronics and Communication Engineering in accordance with the
regulations of the Institute.
Sukkhada Joshii
Roll. No: 16ECE1028
Department of ECE
NIT Goa
Acknowledgment
We would like to express our gratitude for the earnest support and
encouragement given by the Director of National Institute of Technology Goa,
Dr. Gopal Mugeraya. We wish to convey our appreciation and indebtedness to
our project guide, Dr. Shivnarayan Patidar, and the Head of the Department,
Dr. Nithin Kumar Y.B., for giving us the chance to work in the area of
Biomedical Signal Processing. We thank Dr. Shivnarayan Patidar for believing
in us and constantly motivating us to take up new challenges and strive for
solutions that can benefit mankind and the fields of signal processing and
healthcare. We would also like to acknowledge the moral support of our
parents and friends throughout the journey of the project.
Abstract
Atrial fibrillation, abbreviated as AF or AFib, is a type of arrhythmia and
the most common of all cardiac arrhythmias. It leads to various
health-related complications and can increase the risk of heart failure or
stroke. It is found especially in hypertensive and elderly patients.
Therefore, early testing and diagnosis can reduce the consequences of AF.
The aim of the project is to implement an efficient algorithm that
automatically detects atrial fibrillation by classifying the ECG records of
different patients in the datasets used as either normal rhythm or AF
rhythm, on the basis of abnormalities present in the ECG signal. This work
is an implementation and extended study of an existing work [1]. The project
aims to enhance the generalization ability of the algorithm by training it
over multiple datasets. To achieve this, it uses the wavelet packet
transform and the correlation function for physiological signal analysis and
to devise an efficient feature extraction strategy. The constructed feature
set is the input to the various classifiers used for detection. The project
also aims to lower the degree of human intervention needed to discover
anomalies in the human heart by applying machine learning and artificial
intelligence concepts to biomedical signal processing. The method is shown
to perform well, with an accuracy of about 98% or higher and a high F1
score.
Contents
1 Introduction
1.1 Motivation
1.2 Problem description and Objectives
2 Background
2.1 Heart
2.2 Electrocardiogram (ECG)
2.3 Atrial Fibrillation
2.4 Wavelet packet transform (WPT)
3 Method
3.1 Data Acquisition
3.2 Pre-Processing
3.2.1 Noise filtering
3.2.2 Segmentation
3.3 Wavelet Packet Transform
3.4 Feature Extraction
3.4.1 Features Evaluation
3.5 Classification
3.6 Performance Evaluation
3.6.1 Cross-validation
3.6.2 Evaluation metrics
4 Results
4.1 Binary classification using MIT-BIH AF Dataset
4.2 Binary classification using PhysioNet challenge 2017 Dataset
4.3 Binary classification using PhysioNet challenge 2020 Dataset
4.4 Binary classification for All datasets combined
4.5 Parameter Analysis
5 Conclusion
6 Future Work
References
Chapter 1
Introduction
save lives of many people, enhance the quality of life of millions of individuals
who are at risk, and will help in reducing the socio-economic burden due to
AF and such cardiac diseases.
1.1 Motivation
Atrial fibrillation, a kind of cardiac arrhythmia, may lead to several
cardiac complications such as a fall in blood pressure, blood clots, risk of
heart failure, stroke, and other heart diseases. It affects the lives of
about 10% of the global population above the age of 75 years. As the world
population ages, the prevalence of AF in the adult population is
increasing. A recent study on the global prevalence of AF stated that almost
one in four adults in the US and Europe will be affected by AF [2].
In recent times, the ability of machines and software has improved
significantly, such that they are able to perform classification,
quantification, and identification of patterns in biomedical signals [3].
Since the global population is aging fast and healthcare costs are also
increasing, there is a need for an automated AF detection method to monitor
the health status of patients. It will make life easier for heart patients
and improve their condition. Hence it is important to design and develop
new methods that help detect atrial fibrillation by analyzing the heart
rhythm, i.e. the ECG signal.
and to help forward diagnosis which will reduce the burden on physicians.
The primary objectives of the project are as follows:
Chapter 2
Background
2.1 Heart
The heart is a pump-like muscular organ located slightly to the left of and
behind the breastbone. It is hollow inside and about the size of a clenched
fist. It functions by pumping blood through the network of blood vessels
(veins and arteries). The pumped blood carries oxygen and nutrients to the
rest of the body and carries carbon dioxide, a metabolic waste product, to
the lungs. The heart is divided into an upper and a lower section, each
comprising two chambers: the left and right atria in the upper section and
the left and right ventricles in the lower section, as shown in Figure 2.1.
A regular electrical impulse is sent by the sinoatrial (SA) node, also
called the pacemaker of the heart. This impulse causes the upper heart
chambers to contract. The impulse then flows to the ventricles through the
atrioventricular (AV) node, causing them to contract and pump out blood.
The right atrium receives blood from the veins and pumps it to the right
ventricle. The right ventricle then supplies it to the lungs, where it is
oxygenated before being received by the left atrium. The left atrium pumps
it to the left ventricle, which is the most powerful chamber and supplies
blood to the whole body [5].
Figure 2.1: Anatomy of the heart [5].
Figure 2.2: A typical one-cycle ECG signal tracing [6].
normal rhythm of the heart can lead to severe heart-related issues such as a
drop in blood pressure, stroke, and risk of heart failure. During a regular
beat, a normal heart contracts and relaxes. In atrial fibrillation, however,
the two upper chambers of the heart are not coordinated with the two lower
chambers. The atria beat irregularly and in a disordered manner (quiver),
out of coordination with the ventricles, which impairs the movement of blood
into the ventricles. Heart palpitations, shortness of breath, and weakness
are some symptoms of AF. Conventionally, AF is diagnosed manually by visual
inspection of the ECG signal: trained physicians observe the ECG signals to
diagnose AF, as shown in Figure 2.3. AF is also detected visually by
observing irregularity in the occurrence of R peaks. However, the absence of
P waves on the ECG signal, which is obtained by non-invasive cardiac
monitoring, is a stronger indicator of AF [7].
Figure 2.3: Electrical conduction system and ECG signal during normal
heartbeat and atrial fibrillation [7].
• Persistent: The episode lasts more than a week and medication
can be used to stop it. If the treatment or medicines do not work,
physicians resort to electrical cardioversion, in which a low-voltage
current is used to restore the normal rhythm.
where the low-pass and high-pass filter coefficients are denoted by h(n)
and g(n) respectively.
A WPT-based signal decomposition process is schematically illustrated
in Figure 2.4. A three-level WPT produces 8 sub-bands: the signal frequency
spectrum is divided into eight parts, with each sub-band covering one part.
The discrete wavelet transform (DWT) gives adjustable time–frequency
resolution, but it suffers from poor resolution in the high-frequency
region, which makes the separation of high-frequency transient components
challenging. WPT overcomes this limitation of DWT by further decomposing the
signal in the high-frequency region, which provides more detailed
information about the signal [8].
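The decomposition described above can be illustrated with a short sketch. It is written in Python rather than the MATLAB used in this work, and it assumes the Haar wavelet purely for simplicity; the report does not state which wavelet was used, and the function names `haar_split` and `wavelet_packet` are hypothetical. The sketch only shows that splitting both the low-pass and the high-pass branch at every level yields 2^L sub-bands at level L.

```python
import math

def haar_split(signal):
    """One filter-bank stage: Haar low-pass h(n) and high-pass g(n), then downsample by 2."""
    s = 1 / math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)
    approx = [s * (signal[k] + signal[k + 1]) for k in pairs]  # low-pass branch
    detail = [s * (signal[k] - signal[k + 1]) for k in pairs]  # high-pass branch
    return approx, detail

def wavelet_packet(signal, level):
    """Split every node at every level (unlike the DWT, which splits only the low-pass node)."""
    nodes = [list(signal)]
    for _ in range(level):
        nodes = [band for node in nodes for band in haar_split(node)]
    return nodes  # 2**level sub-bands, in natural filter-bank order

# Hypothetical 64-sample test signal, decomposed to level 3
test_signal = [float(k % 8) for k in range(64)]
bands = wavelet_packet(test_signal, level=3)
```

For this 64-sample input, `bands` holds eight sub-bands of eight coefficients each. Because the Haar filters are orthonormal, the total energy of the coefficients equals the energy of the input signal.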
Chapter 3
Method
3.1 Data Acquisition
A total of three datasets were used to implement the method: the MIT-BIH AF
Dataset [9], the PhysioNet challenge 2017 Dataset, and the PhysioNet
challenge 2020 Dataset [10]. All three databases contain ECG signals and
were obtained from the PhysioNet website. The MIT-BIH database includes an
arrhythmia dataset, an AF dataset, a noise stress test dataset, and other
datasets, and it is publicly available for medical research. The other two
datasets also include ECG signals of various patients around the globe. They
include a variety of abnormalities, which makes them well suited to
enhancing the generalization ability of the implemented method. The
algorithm is trained, validated, and tested on each of them. These datasets
are used to validate the reliability and practicability of the implemented
algorithm.
ECG signal files present in MIT-BIH AF Dataset are two-lead ECG sam-
pled at 250 Hz (samples per second), whereas ECG signals present in Phy-
sioNet challenge 2017 dataset are single-lead ECG and are sampled at 300
Hz. ECG signals present in PhysioNet challenge 2020 dataset are 12-lead
ECG and are sampled at 500 Hz. In this work, we have used a single lead
from each of the datasets to implement the method.
3.2 Pre-Processing
The ECG signal data is pre-processed before it is used for feature extraction.
The pre-processing consists of two steps: noise filtering (denoising) and
segmentation. Noise filtering is carried out first, and then the signal is
segmented. MATLAB R2018a was used to perform all the signal processing and
analysis of the ECG signals.
Wander, Channel Noise, power-line interference, and other AWGN noise
[11]. These noise sources hinder the feature extraction process during ECG
analysis.
We therefore designed and implemented two filters to remove the noise
present in the datasets: a 50 Hz notch filter and a 0.3-45 Hz bandpass
filter [12]. The ECG segment before and after denoising by this set of
filters is illustrated in Figure 3.2. Using these filters, we filtered out
baseline wander, 50 Hz power-line interference, and electrode motion noise.
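As a rough illustration of the notch stage only, the following Python sketch (not the MATLAB implementation used in this work) applies a generic second-order IIR notch at 50 Hz to synthetic sine waves sampled at 250 Hz. The quality factor `q = 30`, the biquad design, and the function names are assumptions; the actual filter design is not given in this section, and the 0.3-45 Hz bandpass stage is not shown.

```python
import math

def notch_coefficients(f0, fs, q=30.0):
    """Second-order IIR notch (biquad) centred at f0 Hz, normalised so that a[0] = 1."""
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = [1 / a0, -2 * math.cos(w0) / a0, 1 / a0]
    a = [1.0, -2 * math.cos(w0) / a0, (1 - alpha) / a0]
    return b, a

def biquad_filter(x, b, a):
    """Apply the difference equation y[n] = b0 x[n] + b1 x[n-1] + b2 x[n-2] - a1 y[n-1] - a2 y[n-2]."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

fs = 250                                   # sampling rate of the MIT-BIH AF records
t = [k / fs for k in range(10 * fs)]       # one 10 s window
b, a = notch_coefficients(50.0, fs)
hum = biquad_filter([math.sin(2 * math.pi * 50 * tk) for tk in t], b, a)   # synthetic power-line tone
slow = biquad_filter([math.sin(2 * math.pi * 5 * tk) for tk in t], b, a)   # synthetic in-band tone
```

After the initial transient, the 50 Hz tone in `hum` is suppressed almost completely, while the 5 Hz tone in `slow` passes with a gain close to one.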
Figure 3.2: ECG Signal before denoising and after denoising. (MIT-BIH AF
dataset record #4015)
3.2.2 Segmentation
The procedure of partitioning a signal into multiple sections of a fixed
length is called segmentation. It can be performed with or without an
overlapping window. In this step, ECG records from all three datasets are
divided into segments of 10 seconds each, without overlap between windows.
These filtered and segmented ECG segments are used to find the WPT
coefficients, a prerequisite step for the feature extraction required for
binary classification in the next section.
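A minimal Python sketch of non-overlapping segmentation follows. The function name `segment_signal` is hypothetical, and dropping an incomplete final window is an assumption, since the report does not say how trailing samples were handled.

```python
def segment_signal(samples, fs, seconds=10):
    """Cut a record into non-overlapping windows of `seconds` duration.

    Assumption: a trailing window shorter than `seconds` is dropped.
    """
    size = int(fs * seconds)
    return [samples[start:start + size]
            for start in range(0, len(samples) - size + 1, size)]

record = list(range(250 * 25))             # hypothetical 25 s record sampled at 250 Hz
segments = segment_signal(record, fs=250)  # two full windows of 2500 samples each
```

At the three sampling rates used in this work (250, 300, and 500 Hz), a 10-second window holds 2500, 3000, and 5000 samples respectively.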
3.3 Wavelet Packet Transform
After ECG records undergo pre-processing, the wavelet packet transform is
performed on each ECG segment. The segments are decomposed at level
l = 5. Figure 3.3 illustrates the WPT decomposition tree.
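As a rough worked example, a level-l decomposition produces 2^l sub-bands, so l = 5 gives 32 sub-bands. Assuming ideal half-band filters and segments kept at their native sampling rate f_s (neither is stated in this section), each sub-band covers an equal share of the range from 0 to f_s/2:

```latex
\[
\Delta f = \frac{f_s/2}{2^{l}} = \frac{f_s}{2^{l+1}}, \qquad
l = 5:\quad
\Delta f_{250\,\mathrm{Hz}} = \frac{250}{64} \approx 3.91\ \mathrm{Hz},\quad
\Delta f_{300\,\mathrm{Hz}} = \frac{300}{64} \approx 4.69\ \mathrm{Hz},\quad
\Delta f_{500\,\mathrm{Hz}} = \frac{500}{64} \approx 7.81\ \mathrm{Hz}.
\]
```

Which frequency range a given node such as Band51 or Band52 covers depends on the node ordering convention of the toolbox, so no mapping is assumed here.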
Figure 3.4: Plot of the sub-bands Band51 and Band52 of WPT decomposed
NSR signal (top) and AF signal (bottom). (MIT-BIH AF dataset record
#4015)
3.4 Feature Extraction
There are many existing AF detection algorithms in which the features are
extracted from the morphology of ECG signals. The two main features of the
ECG signal in AF are the absence of P-waves and RR interval irregularity. In
the AFib ECG signal, P-waves are often replaced by f-waves, which are fast
and disordered fibrillatory waves. All such algorithms involve some
pre-processing procedures such as detection of R-peaks and P-waves, and the
algorithm performance suffers if these parameters are not accurately
calculated [13]. To avoid this situation, the feature extraction method in
this work is based on the correlation between the WPT coefficients extracted
from selected sub-bands (Band51 and Band52) [1]. It is known that if there
is any kind of disorder present in the ECG signal, the correlation among the
coefficients of the wavelet coefficient series decreases. For any random
series, the correlation function has a superior ability to quantify specific
characteristics, and because of this property it is used for sequential data
analysis. This property highlights the changes in the atrial activity [14].
The feature set is therefore constructed by calculating the information
entropy and weighted sums of the selected sub-bands.
The feature extraction steps are explained below:
Step 1. The wavelet coefficients are obtained from the selected sub-bands
(Band51 and Band52) by performing WPT decomposition of the filtered ECG
segment. The coefficient series is divided into n segments of equal length,
with each segment represented as
$\bar{d}(t_i) = \{d_1(t_i), d_2(t_i), d_3(t_i), \ldots, d_m(t_i)\}$,
where i = 1, 2, 3, 4, ..., n [1].
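Step 2. A correlation matrix is computed from these segments, where τ =
0, 1, 2, 3, ..., n-1, and it is normalized as given below [1]:

\[
R_{\bar{d}} =
\begin{bmatrix}
B_{1,1} & \cdots & B_{1,1+\tau} & \cdots & B_{1,n} \\
\vdots  &        & \vdots       &        & \vdots  \\
B_{i,1} & \cdots & B_{i,1+\tau} & \cdots & B_{i,n} \\
\vdots  &        & \vdots       &        & \vdots
\end{bmatrix}
\tag{3.1}
\]

Step 3. The weighted sum $W_B$ and the information entropy $H_B$ are
calculated as follows:

\[
W_B = \sum_{i=1}^{n} \{ B_{i,i+\tau} \cdot n_{i,i+\tau} \}
\tag{3.2}
\]

\[
H_B = -\sum_{i=1}^{n} \{ p_{i,i+\tau} \cdot \log p_{i,i+\tau} \}
\tag{3.3}
\]

where,
$n_{i,i+\tau}$ = number of $B_{i,i+\tau}$ in a given precision.
$p_{i,i+\tau}$ = proportion of $n_{i,i+\tau}$ in the total number.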
Step 4. The features are assembled into the feature set for the classifier.
We have used K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and
Artificial Neural Network (ANN) learning models for classification.
The correlation function estimate of any two segments is denoted as
$\hat{B}_{\bar{d}}(\tau_0)$, which is a sequence of numbers with
$\tau_0 = 0, \pm 1, \pm 2, \ldots, \pm(m-1)$, presented as follows [1]:

\[
\hat{B}_{\bar{d}}(\tau_0) = \frac{1}{m} \sum_{j=1}^{m} \{ d_j(t_i) \cdot d_{j+\tau_0}(t_{i+\tau}) \}
\tag{3.4}
\]

\[
\bar{B}_{i,i+\tau} = \frac{1}{2m-1} \sum_{\tau_0=-(m-1)}^{m-1} \{ \hat{B}_{\bar{d}}(\tau_0) \}
\tag{3.5}
\]
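The feature computation in Equations (3.2) to (3.5) can be sketched in Python as follows. This is one literal reading of the equations, not the implementation used in this work or in [1]. Several details are assumptions because the text leaves them open: products whose index falls outside a segment are treated as zero, the lag τ is fixed and only pairs where segment i+τ exists are used, the "given precision" is taken as rounding to two decimals, and all function names are hypothetical.

```python
import math
from collections import Counter

def split_segments(coeffs, n):
    """Step 1: split a coefficient series into n equal-length segments (remainder dropped)."""
    m = len(coeffs) // n
    return [coeffs[k * m:(k + 1) * m] for k in range(n)]

def cross_corr(a, b, lag):
    """Eq. (3.4): (1/m) * sum_j a[j] * b[j + lag]; out-of-range terms count as zero (assumption)."""
    m = len(a)
    total = 0.0
    for j in range(m):
        k = j + lag
        if 0 <= k < m:
            total += a[j] * b[k]
    return total / m

def mean_corr(a, b):
    """Eq. (3.5): average of the estimates over lags -(m-1) .. (m-1)."""
    m = len(a)
    return sum(cross_corr(a, b, lag) for lag in range(-(m - 1), m)) / (2 * m - 1)

def correlation_features(coeffs, n, tau, precision=2):
    """Eqs. (3.2) and (3.3) for one fixed lag tau between segments."""
    segs = split_segments(coeffs, n)
    values = [round(mean_corr(segs[i], segs[i + tau]), precision)
              for i in range(n - tau)]
    counts = Counter(values)            # n_{i,i+tau}: occurrences of each rounded value
    total = len(values)
    weighted_sum = sum(v * counts[v] for v in values)
    entropy = -sum((counts[v] / total) * math.log(counts[v] / total) for v in values)
    return weighted_sum, entropy

# Hypothetical toy coefficient series: four segments of two coefficients each
w_b, h_b = correlation_features([1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0], n=4, tau=1)
```

In this toy series the three rounded correlation values are all different, so each proportion is 1/3 and the entropy equals log 3. When all values coincide, the entropy is zero.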
3.4.1 Features Evaluation
After the feature extraction step, it is important to check whether the
extracted features are good enough to be used as input to the classifier
model. They can be evaluated by their p-values or box-plots. If the p-value
is less than 0.05, the feature is considered suitable for classification
[15]; if not, it is discarded from the feature set. To establish the
significance of the features, the Kruskal–Wallis test was conducted. The
p-values obtained from the test for each dataset are given in Table 3.1.
Visual hypothesis testing using box-plots was also carried out. The
box-plots of the features show the higher classification ability of the
selected sub-bands, which helped confirm the significance of the features
for the classification. The box-plots for the MIT-BIH AF dataset are shown
in Figures 3.5 and 3.6.
The p-values and the box-plots obtained for each dataset suggest
that the constructed feature set can be used as an input to the classifier
model for automatic detection of AF segments.
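For reference, the Kruskal–Wallis test for two groups (for example, one feature measured on NSR segments and on AF segments) can be sketched in Python as below. This is a generic textbook version with average ranks and tie correction, not the MATLAB routine used in this work. The function name and the sample values are hypothetical, and the sketch assumes the pooled values are not all identical. With two groups the H statistic has one degree of freedom, so the chi-square tail probability reduces to erfc(sqrt(H/2)).

```python
import math

def kruskal_wallis_two_groups(x, y):
    """Return (H, p) for two samples, using average ranks and the standard tie correction."""
    pooled = sorted((v, g) for g, grp in enumerate((x, y)) for v in grp)
    n_total = len(pooled)
    rank_sums = [0.0, 0.0]
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2          # mean of the ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = (12 / (n_total * (n_total + 1))
         * (rank_sums[0] ** 2 / len(x) + rank_sums[1] ** 2 / len(y))
         - 3 * (n_total + 1))
    h /= 1 - tie_term / (n_total ** 3 - n_total)
    h = max(h, 0.0)
    p = math.erfc(math.sqrt(h / 2))         # chi-square survival function, 1 degree of freedom
    return h, p

# Hypothetical feature values for two fully separated groups
h_stat, p_value = kruskal_wallis_two_groups([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
```

For these two fully separated groups of five values, the p-value falls below the 0.05 threshold, so under the rule above the feature would be retained.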
(a) p-value= 7.4424e-55 (b) p-value= 9.1575e-60
3.5 Classification
To extend the study, the algorithm was tested with different classification
techniques, namely ANN, KNN, and SVM. All three classifiers were implemented
for each dataset. The default settings were used for training all the
classifier models. The results obtained are presented in the following
chapter. The classifiers perform supervised binary classification by
separating the ECG data into two classes, i.e. AFib or NSR.
etc. The ANN classifier predicts the output by assigning the input vectors
to various categories according to their properties [16].
The neural network structure used in our framework contains three layers,
i.e. an input layer, a hidden layer, and an output layer. The number of
neurons (nodes) in the input layer was set to four, equal to the dimension
of the feature vector. Because the classification is binary, the number of
neurons in the output layer was set to two. The hidden node count was set to
ten. Figure 3.7 shows the ANN topology in the proposed scheme. The adaptive
learning rate was set to 0.1. The Levenberg-Marquardt backpropagation
algorithm was used to obtain the best-performing model. The activation
function was the sigmoid function for the hidden layer and the softmax
function for the output layer.
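The 4-10-2 topology described above can be sketched as a forward pass in Python. The weights below are random placeholders and the feature vector is hypothetical; training with the Levenberg-Marquardt algorithm, as done in this work in MATLAB, is not shown.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    peak = max(zs)                              # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(features, w_hidden, b_hidden, w_out, b_out):
    hidden = [sigmoid(z) for z in dense(features, w_hidden, b_hidden)]  # 10 hidden neurons
    return softmax(dense(hidden, w_out, b_out))                         # 2 class probabilities

rng = random.Random(0)                          # placeholder weights, not trained values
w_hidden = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(10)]
b_hidden = [0.0] * 10
w_out = [[rng.uniform(-1, 1) for _ in range(10)] for _ in range(2)]
b_out = [0.0] * 2
probs = forward([0.2, -0.1, 0.4, 0.3], w_hidden, b_hidden, w_out, b_out)
```

The two softmax outputs sum to one and can be read as the scores for the NSR and AFib classes; which output corresponds to which class is a labeling choice.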
ability of the model in an unseen example set, a unique test set is chosen.
Usually, this is done by dividing a large dataset into training, validation,
and test sets [15].
In this work, each dataset used is divided into three parts: 80% for
training, 10% for validation, and 10% for testing. The training and
validation sets are used during the training stage. At the final stage, the
test set is used for performance evaluation of the model. However, if the
dataset is limited, we can take advantage of cross-validation.
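The 80/10/10 partition can be sketched in Python as below. Shuffling before the split and the fixed random seed are assumptions, and `split_dataset` is a hypothetical helper name.

```python
import random

def split_dataset(items, train=0.8, val=0.1, seed=0):
    """Shuffle, then cut into training, validation, and test parts (the test part takes the rest)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

# Hypothetical dataset of 1000 segment indices
train_set, val_set, test_set = split_dataset(range(1000))
```

With 1000 segments this gives 800 training, 100 validation, and 100 test segments, and no segment appears in more than one part.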
3.6.1 Cross-validation
To assess the generalization ability of the method, we implemented 10-fold
cross-validation. This is performed to obtain a reliable estimate of the
classification performance of the respective classifier [17]. The dataset
was divided into 10 parts in random order; in each iteration, nine parts
were used as the training set and the remaining one as the test set, with
each part serving as the test set in turn. The results of all iterations
were averaged to obtain the final classification performance.
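The fold construction and averaging can be sketched in Python as follows. The helper names and the `evaluate` callback are hypothetical; in practice the callback would train a classifier on the training indices and return its score on the test indices.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; every sample lands in exactly one test fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[f::k] for f in range(k)]
    for f in range(k):
        train = [idx for g in range(k) if g != f for idx in folds[g]]
        yield train, folds[f]

def cross_validate(n_samples, evaluate, k=10):
    """Average the score returned by evaluate(train, test) over the k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)

# Hypothetical scoring callback that only reports the size of each test fold
mean_test_size = cross_validate(100, lambda train, test: float(len(test)))
```

The interleaved slicing keeps fold sizes within one sample of each other even when the number of segments is not a multiple of ten.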
Figure 3.8: A template of the confusion matrix for the binary classification.
\[
SPE = \frac{TN}{TN + FP}
\tag{3.7}
\]

\[
ACC = \frac{TP + TN}{TP + TN + FN + FP}
\tag{3.8}
\]

F1 score (F-measure or F-score): It is the harmonic mean of Precision
(PRE) and Recall (REC). Both false positives and false negatives are taken
into account while calculating the F1 score. It is calculated as below:

\[
F_1\ \text{Score} = 2 \cdot \frac{REC \cdot PRE}{REC + PRE}
\tag{3.9}
\]

\[
REC = \frac{TP}{TP + FN}
\tag{3.10}
\]
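The metrics above can be computed from the confusion matrix counts with a short Python sketch. The function name and the example counts are hypothetical. Precision is taken with its standard definition TP/(TP + FP), which is an assumption here because its equation does not appear in this part of the report.

```python
def binary_metrics(tp, tn, fp, fn):
    """Evaluation metrics from the four confusion-matrix counts."""
    specificity = tn / (tn + fp)                      # Eq. (3.7)
    accuracy = (tp + tn) / (tp + tn + fn + fp)        # Eq. (3.8)
    recall = tp / (tp + fn)                           # Eq. (3.10)
    precision = tp / (tp + fp)                        # standard definition (assumption)
    f1 = 2 * (recall * precision) / (recall + precision)  # Eq. (3.9)
    return {"SPE": specificity, "ACC": accuracy, "REC": recall,
            "PRE": precision, "F1": f1}

# Hypothetical counts, not results from this work
metrics = binary_metrics(tp=90, tn=95, fp=5, fn=10)
```

For these made-up counts, the specificity is 0.95, the accuracy is 0.925, the recall is 0.9, and the F1 score is 180/195, about 0.923.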
Chapter 4
Results
Table 4.1: Performance evaluation parameter comparison of the algorithm
using MIT-BIH AF dataset with different classifiers.
4.3 Binary classification using PhysioNet challenge 2020 Dataset
The algorithm was trained and tested on the PhysioNet challenge 2020
Dataset, which is available on the PhysioNet website. The results obtained
are given below. Table 4.3 shows the performance evaluation parameter
comparison of the algorithm using the PhysioNet challenge 2020 dataset with
different classifiers.
Table 4.4: Performance evaluation parameter comparison of the algorithm
using all three datasets combined with different classifiers.
4.5 Parameter Analysis
Parameter analysis for the SVM classifier using different datasets.
It was observed that while ANN gave better accuracy when the model was
trained on each dataset separately, SVM classification was more effective
for the generalized approach. Therefore, we chose to closely examine the
output of the SVM classifier on all datasets. Table 4.5 shows the evaluation
parameter comparison of the algorithm using different datasets with the SVM
classifier.
Chapter 5
Conclusion
Chapter 6
Future Work
Although the results obtained are satisfactory, the implementation can still
be improved by training the method on a larger number of diverse databases
covering a variety of patients and cardiac defects. Machine learning and
neural network tasks can be further accelerated by using computers with
faster processors. The classification can be improved by using deep neural
networks such as recurrent neural networks.
The algorithm could be further simplified by exploring other wavelet
transforms in place of the wavelet packet transform. Another direction for
future work is the detection of other cardiac conditions such as atrial
flutter, paroxysmal tachycardia, and heart-valve related diseases. Further,
the algorithm can be improved by deploying it under realistic scenarios and
analyzing the results. Finally, the method can be clinically validated in
the future, and the possibility of implementing a low-cost device can be
explored.
References
[1] Jibin Wang, Ping Wang, Suping Wang, “Automated detection of atrial
fibrillation in ECG signals based on wavelet packet transform and cor-
relation function of random process”, Biomedical Signal Processing and
Control, Volume 55, 101662, August 2019.
[2] Chugh, S.S., Havmoeller, R., Narayanan, K., Singh, D., Rienstra, M.,
Benjamin, E.J., Gillum, R.F., Kim, Y.H., McAnulty Jr, J.H., Zheng,
Z.J. and Forouzanfar, “Worldwide epidemiology of atrial fibrillation: a
Global Burden of Disease 2010 Study”, Circulation 129(8), pp. 837-847,
2014.
[6] https://ecgwaves.com/topic/cardiac-electrophysiology-ecg-action-potential-automaticity-vector/
[7] http://www.nhlbi.nih.gov/health-topics/atrial-fibrillation/
[8] Abouelanouar, Bouchra, et al., “Application of wavelet analysis and
its interpretation in rotating machines monitoring and fault diagnosis.
A review”, International Journal of Engineering & Technology, 7(4),
3465-3471, 2018.
[9] https://physionet.org/content/afdb/1.0.0/
[10] https://physionet.org/about/challenge/
[11] Velayudhan, Aswathy, and Soniya Peter, “Noise analysis and differ-
ent denoising techniques of ECG signal-a survey”, IOSR Journal of
Electronics and Communication Engineering (IOSR-JECE), pp 40-44,
eISSN-2278, 2016.
[13] Tateno, K., and L. Glass, “A method for detection of atrial fibrillation
using RR intervals”, Computers in Cardiology 2000, Vol. 27, pp. 391-
394. IEEE, 2000.
[17] Refaeilzadeh P., Tang L., Liu H., “Cross-validation”, Encyclopedia of
database systems, pp. 532–538, Springer, 2009.