Automated Detection of Atrial Fibrillation Using Wavelets: Submitted by

The project report focuses on the automated detection of atrial fibrillation (AF) using wavelet packet transform and machine learning techniques. The aim is to develop an efficient algorithm that classifies ECG signals into normal or AF rhythms, achieving an accuracy of over 98%. This work is part of the Bachelor of Technology degree requirements in Electronics and Communication Engineering at the National Institute of Technology Goa.


Automated Detection of Atrial Fibrillation

Using Wavelets
Project Report Submitted in Partial Fulfillment of the Requirements for the Degree of

Bachelor of Technology
in
Electronics and Communication Engineering

Submitted by
Harshvardhan Paithane:(Roll No. 16ECE1008)
Mahima Tendulkar:(Roll No. 16ECE1012)
Sukkhada Joshii:(Roll No. 16ECE1028)

Under the Supervision of


Dr. Shivnarayan Patidar
Assistant Professor

Department of Electronics and Communication Engineering

National Institute of Technology Goa


June, 2020
Department of Electronics and Communication Engineering
National Institute of Technology Goa
Farmagudi, Goa, India-403401

CERTIFICATE

This is to certify that the work contained in this report entitled “Auto-
mated Detection of Atrial Fibrillation Using Wavelets” is submitted
by the group members Mr. Harshvardhan Paithane (Roll. No: 16ECE1008),
Ms. Mahima Tendulkar (Roll. No: 16ECE1012) and Ms. Sukkhada Joshii
(Roll. No: 16ECE1028) to the Department of Electronics and Communi-
cation Engineering, National Institute of Technology Goa, for the partial
fulfillment of the requirements for the degree of Bachelor of Technology
in Electronics and Communication Engineering.

They have carried out their work under my supervision. This work has
not been submitted elsewhere for the award of any other degree or diploma.

The project work, in our opinion, has reached the standard required for
the degree of Bachelor of Technology in Electronics and Communication
Engineering, in accordance with the regulations of the Institute.

Dr. Shivnarayan Patidar
Assistant Professor
Department of ECE
NIT Goa

Dr. Nithin Kumar Y.B.
Assistant Professor and Head
Department of ECE
NIT Goa
DECLARATION
We, Harshvardhan Paithane (16ECE1008), Mahima Tendulkar (16ECE1012)
and Sukkhada Joshii (16ECE1028), hereby declare that the project work
titled “Automated Detection of Atrial Fibrillation Using Wavelets”,
which is being submitted to National Institute of Technology Goa,
Farmagudi, Ponda, in partial fulfillment of the requirements for the award
of the degree of Bachelor of Technology in Electronics and Communication
Engineering, is a bona fide work carried out by us. The material submitted
in this work has not been submitted to any other university or institution
for the grant of any degree.

Harshvardhan Paithane
Roll. No: 16ECE1008
Department of ECE
NIT Goa

Mahima Tendulkar
Roll. No: 16ECE1012
Department of ECE
NIT Goa

Sukkhada Joshii
Roll. No: 16ECE1028
Department of ECE
NIT Goa
Acknowledgment
We would like to express our gratitude for the sincere support and
encouragement given by the Director of National Institute of Technology
Goa, Dr. Gopal Mugeraya. We wish to convey our appreciation and
indebtedness to our project guide, Dr. Shivnarayan Patidar, and the Head
of the Department, Dr. Nithin Kumar Y.B., for giving us the chance to work
in the area of Biomedical Signal Processing. We thank Dr. Shivnarayan
Patidar for believing in us and constantly motivating us to take up new
challenges and strive for solutions that can be beneficial to mankind and
a boon to the world of signal processing and healthcare. We would also
like to acknowledge the moral support of our parents and friends
throughout the journey of the project.
Abstract
Atrial fibrillation, abbreviated as AF or AFib, is a type of arrhythmia and
the most common of the cardiac arrhythmias. It leads to various
health-related complications and can increase the risk of heart failure or
stroke. It is found especially in hypertensive and elderly patients.
Therefore, early testing and diagnosis can reduce the consequences of AF.
The aim of the project is to implement an efficient algorithm that
automatically detects atrial fibrillation by classifying ECG recordings of
different patients as either normal rhythm or atrial fibrillation (AF)
rhythm on the basis of abnormalities present in the ECG signal. This work
is an implementation and extended study of an existing work [1]. The
project aims to enhance the generalization ability of the algorithm by
training it over multiple datasets. It uses the wavelet packet transform
and the correlation function for physiological signal analysis and to
devise an efficient feature extraction strategy. The constructed feature
set is the input to the various classifiers used for detection. The
project also aims to lower the degree of human intervention needed to
discover anomalies of the human heart by applying machine learning and
artificial intelligence to biomedical signal processing. The method is
shown to perform well, with an accuracy of about 98% or higher and a high
F1 score.
Contents

1 Introduction
1.1 Motivation
1.2 Problem description and Objectives

2 Background
2.1 Heart
2.2 Electrocardiogram (ECG)
2.3 Atrial Fibrillation
2.4 Wavelet packet transform (WPT)

3 Method
3.1 Data Acquisition
3.2 Pre-Processing
3.2.1 Noise filtering
3.2.2 Segmentation
3.3 Wavelet Packet Transform
3.4 Feature Extraction
3.4.1 Features Evaluation
3.5 Classification
3.6 Performance Evaluation
3.6.1 Cross-validation
3.6.2 Evaluation metrics

4 Results
4.1 Binary classification using MIT-BIH AF Dataset
4.2 Binary classification using PhysioNet challenge 2017 Dataset
4.3 Binary classification using PhysioNet challenge 2020 Dataset
4.4 Binary classification for All datasets combined
4.5 Parameter Analysis

5 Conclusion

6 Future Work

References
List of Figures

2.1 Anatomy of the heart [5].
2.2 A typical one-cycle ECG signal tracing [6].
2.3 Electrical conduction system and ECG signal during normal heartbeat and atrial fibrillation [7].
2.4 WPT decomposition for level l = 3 [8].

3.1 The AF detection algorithm framework [1].
3.2 ECG Signal before denoising and after denoising. (MIT-BIH AF dataset record #4015)
3.3 A part of a 5-level wavelet packet transform tree [1].
3.4 Plot of the sub-bands Band51 and Band52 of WPT decomposed NSR signal (top) and AF signal (bottom). (MIT-BIH AF dataset record #4015)
3.5 The example of box-plot for WB in (a) Band51 (b) Band52 (MIT-BIH AF dataset)
3.6 The example of box-plot for HB in (a) Band51 (b) Band52 (MIT-BIH AF dataset)
3.7 ANN topology in proposed scheme.
3.8 A template of the confusion matrix for the binary classification.

4.1 ROC curve for the SVM classifier.
List of Tables

3.1 p-values of the constructed features for respective dataset.

4.1 Performance evaluation parameter comparison of the algorithm using MIT-BIH AF dataset with different classifiers.
4.2 Performance evaluation parameter comparison of the algorithm using PhysioNet Challenge 2017 dataset with different classifiers.
4.3 Performance evaluation parameter comparison of the algorithm using PhysioNet Challenge 2020 dataset with different classifiers.
4.4 Performance evaluation parameter comparison of the algorithm using all three datasets combined with different classifiers.
4.5 Evaluation parameter comparison of the algorithm using different datasets with SVM classifier.
Chapter 1

Introduction

Atrial Fibrillation (AF or AFib) is a heart condition that causes an
abnormal or irregular heart rhythm. It is the most common cardiac
arrhythmia. Although AF is not in itself fatal, it can lead to cardiac
complications such as a fall in blood pressure and an increased risk of
heart failure and stroke. Therefore, it is very important to detect AF in
order to prevent cardiac threats. Approximately 33.5 million people around
the globe were affected by AF in 2010. This number is expected to grow as
the population ages globally. The proportion of strokes associated with AF
was found to be 6.6% for ages 50 to 60 years and 36.2% for ages above 80
years [2].
The world is going through rapid technological, socio-economic, and
environmental change, and the healthcare sector is also growing at a great
pace. The recent COVID-19 pandemic and its effects nevertheless show that
healthcare needs even more attention in the coming years. A lot of
research is taking place on detecting cardiac diseases by collecting
patient data and analyzing the collected datasets with various methods.
Hence, we aim to implement a method that detects atrial fibrillation
automatically. Such a method can help save lives, enhance the quality of
life of the millions of individuals who are at risk, and reduce the
socio-economic burden of AF and similar cardiac diseases.

1.1 Motivation
Atrial fibrillation, a kind of cardiac arrhythmia, may lead to several
cardiac complications such as a fall in blood pressure, blood clots, heart
failure, stroke, and other heart diseases. About 10% of the global
population above the age of 75 years is affected by it. As the world
population ages, the prevalence of AF in the adult population is
increasing. A recent study of the global prevalence of AF stated that
almost one in four adults in the US and Europe will be affected by AF [2].
In recent times, the capabilities of machines and software have improved
significantly, to the point that they can classify, quantify, and identify
patterns in biomedical signals [3]. Since the global population is aging
fast and healthcare costs are also increasing, there is a need for an
automated AF detection method to monitor the health status of patients. It
will make life easier for heart patients and help improve their condition.
Hence it is important to design and develop new methods that help detect
atrial fibrillation by analyzing the heart rhythm, i.e. the ECG signal.

1.2 Problem description and Objectives


Conventionally, AF is diagnosed manually by trained physicians who
visually inspect the electrocardiogram (ECG) signals. This makes detection
inefficient and dependent on the individual physician. The efficiency of
AF detection is further affected by the huge amount of ECG data [4].
Hence, there is a pressing need for an automated AF detection process that
can analyze this huge amount of ECG data and support diagnosis, which will
reduce the burden on physicians.
The primary objectives of the project are as follows:

• To implement the algorithm for detection of AF using wavelet packet
transform and correlation function. It is an implementation of existing
work [1].

• To perform the analysis by using different supervised binary classifiers
and compare the results.

• To enhance the generalization ability of the algorithm by training it
over multiple datasets.
Chapter 2

Background

2.1 Heart
The heart is a pump-like muscular organ located slightly to the left of
and behind the breastbone. It is hollow inside and about the size of a
clenched fist. It functions by pumping blood through the circuitry of
blood vessels (veins and arteries). The pumped blood carries oxygen and
nutrients to the rest of the body and carries carbon dioxide, a metabolic
waste product, to the lungs. The heart is partitioned into two sections,
upper and lower, each comprising two chambers: the left and right atria in
the upper section and the left and right ventricles in the lower section,
as shown in Figure 2.1.
A regular electrical impulse is sent by the sinoatrial (SA) node, also
called the pacemaker of the heart. This impulse causes the upper heart
chambers to contract. The impulse then flows to the ventricles through the
atrioventricular (AV) node, causing them to contract and pump out blood.
The right atrium receives blood from the veins and pumps it to the right
ventricle, which supplies it to the lungs. There the blood is oxygenated,
after which it is received by the left atrium. The left atrium pumps it to
the left ventricle, the most powerful chamber, which supplies it to the
whole body [5].

Figure 2.1: Anatomy of the heart [5].

2.2 Electrocardiogram (ECG)


An electrocardiogram (EKG or ECG) is a test that measures the heart's rate
and rhythm and records its electrical activity. The method of recording
the cardiac electrical activity of a patient over a given time interval
with the aid of electrical sensors is called electrocardiography. ECG is
an important tool that can be used, with the help of various signal
analysis techniques, to detect the presence of cardiac abnormalities or
disease. The ECG signal consists of a repetitive pattern, one cycle of
which is shown in Figure 2.2 [6].

2.3 Atrial Fibrillation


Atrial fibrillation, abbreviated as AF or AFib, is a kind of heart
arrhythmia and the most common type of cardiac arrhythmia. It causes a
quivering or irregular heartbeat. Due to the irregular and rapid
electrical stimulus in the atria, the heartbeat becomes irregular and
accelerated. This abnormal rhythm can lead to severe heart-related issues
such as a drop in blood pressure, stroke, and heart failure. During a
regular beat, a normal heart contracts and relaxes. In atrial
fibrillation, however, the two upper chambers of the heart are not
coordinated with the two lower chambers. The atria beat irregularly and in
a disordered manner (quiver), out of coordination with the ventricles, and
so do not move blood into the ventricles effectively. Heart palpitations,
shortness of breath, and weakness are some symptoms of AF. Conventionally,
AF is diagnosed manually by visual inspection of the ECG signal. Trained
physicians observe the ECG signals to diagnose AF, as shown in Figure 2.3.
One visual sign of AF is irregularity in the occurrence of R peaks.
However, the absence of P waves on the ECG signal, a non-invasive cardiac
monitoring technique, is a stronger indicator of AF [7].

Figure 2.2: A typical one-cycle ECG signal tracing [6].

Figure 2.3: Electrical conduction system and ECG signal during normal
heartbeat and atrial fibrillation [7].

In addition, AF cases are sometimes asymptomatic, so doctors are not able
to diagnose them from symptoms. They then have to depend on incidental
detection of atrial fibrillation during a physical examination or on the
ECG signal. Going through ECG recordings manually to find AF episodes is
quite a challenge. If AF episodes occur at random, a long ECG recording
needs to be analyzed, which becomes impractical for doctors. This problem
justifies the need for automated atrial fibrillation detection methods.
The causes of the condition are many; some are controllable and some are
not. Factors like high blood pressure (BP), heart valve diseases,
congenital heart disease, and past heart surgery play a big role. AF can
be classified on two bases: the cause or the time period it lasts. Each
kind has a different treatment [7]:

• Paroxysmal (holiday heart syndrome): An episode of AF that usually lasts
from a few minutes to a few days, but less than a week. In most cases
treatment is not needed.

• Persistent: The episode lasts more than a week, and medication can be
used to stop it. If the treatment or medicines do not work, physicians
resort to electrical cardioversion, a low-voltage current used to reset
the normal rhythm.

• Permanent: This is chronic AF in which the normal rhythm cannot be
restored. Long-term medication is prescribed to the patient to reduce
heart-related complications.

To diagnose AF, doctors check signs and symptoms together with the
patient's medical history and conduct different kinds of tests such as
electrocardiogram, blood tests, echocardiogram, stress test, chest X-ray,
etc.

2.4 Wavelet packet transform (WPT)


The wavelet packet transform (WPT) is an efficient method to study
non-stationary time series such as ECG signals. It decomposes the signal
into several independent time-frequency signals called packets. These
packets are weighted sums of the wavelet basis function at different
levels. The outputs of the WPT decomposition in the various sub-bands are
called wavelet coefficients. These wavelet coefficients expose more of the
intrinsic physiological information of the signal in the time-frequency
space.
Assuming that the signal S(t) is decomposed by WPT, the wavelet
coefficients of the signal are given as $d_l^p(t)$, where $l$ is the
decomposition level and $p$ denotes the serial number of the node in the
WPT tree, with $p = 0, 1, \ldots, 2^l - 1$. The wavelet coefficients can
be computed using the following expressions [1]:

\[ d_l^{2p}(t) = \sum_{n \in \mathbb{Z}} h(n) \cdot d_{l-1}^{p}(2t - n) \tag{2.1} \]

\[ d_l^{2p+1}(t) = \sum_{n \in \mathbb{Z}} g(n) \cdot d_{l-1}^{p}(2t - n) \tag{2.2} \]

where the low-pass and high-pass filter coefficients are denoted by h(n)
and g(n) respectively.
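The recursion in Eqs. (2.1) and (2.2) can be illustrated with a small sketch in Python. This is not the code used in the project (the report states that MATLAB was used); the Haar filter pair, the periodic handling of the signal boundary, and the function names `wpt_split` and `wpt` are assumptions made only for illustration.

```python
import math

# Haar low-pass h(n) and high-pass g(n) filters (an assumed wavelet choice).
HAAR_H = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]
HAAR_G = [1.0 / math.sqrt(2.0), -1.0 / math.sqrt(2.0)]


def wpt_split(d, h, g):
    """Split one parent node into its low-pass and high-pass children.

    Each child is computed as sum_n filter(n) * d(2t - n), the form of
    Eqs. (2.1)-(2.2). Indices outside the node wrap around periodically,
    which is an assumption of this sketch.
    """
    size = len(d)
    low, high = [], []
    for t in range(size // 2):
        low.append(sum(h[n] * d[(2 * t - n) % size] for n in range(len(h))))
        high.append(sum(g[n] * d[(2 * t - n) % size] for n in range(len(g))))
    return low, high


def wpt(signal, level, h=HAAR_H, g=HAAR_G):
    """Return the 2**level nodes at the given level, in natural tree order.

    Node p at one level produces nodes 2p (low-pass) and 2p + 1 (high-pass)
    at the next level.
    """
    nodes = [list(signal)]
    for _ in range(level):
        children = []
        for node in nodes:
            low, high = wpt_split(node, h, g)
            children.extend([low, high])
        nodes = children
    return nodes
```

For a signal of 32 samples, `wpt(signal, 3)` returns eight nodes of four coefficients each. Because the Haar pair is orthogonal, the total energy of the coefficients equals the energy of the input signal.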
A WPT-based signal decomposition process is schematically illustrated in
Figure 2.4. A three-level WPT produces 8 sub-bands: the signal frequency
spectrum is divided into eight parts, with each sub-band covering one
part. The discrete wavelet transform (DWT) gives adjustable time–frequency
resolution, but it suffers from poor resolution in the high-frequency
region, which makes separating high-frequency transient components
challenging. WPT overcomes this limitation of the DWT by further
decomposing the signal in the high-frequency region, providing more
detailed information about the signal [8].

Figure 2.4: WPT decomposition for level l = 3 [8].

Chapter 3

Method

This chapter describes the framework of the implemented algorithm for AF
detection: the data collection procedure, pre-processing, wavelet packet
transform decomposition, feature extraction, and classification. Figure
3.1 shows the framework of the implemented algorithm.

Figure 3.1: The AF detection algorithm framework [1].

3.1 Data Acquisition
A total of three datasets were used to implement the method: the MIT-BIH
AF Dataset [9], the PhysioNet challenge 2017 Dataset, and the PhysioNet
challenge 2020 Dataset [10]. All three contain ECG signals and were
obtained from the PhysioNet website. The MIT-BIH database includes an
arrhythmia dataset, an AF dataset, a noise stress test dataset, and other
datasets, and it is publicly available to medical researchers. The other
two datasets also include ECG signals of various patients around the
globe. They cover a variety of abnormalities, which makes them well suited
to enhancing the generalization ability of the implemented method. The
algorithm is trained, validated, and tested on each of them. These
datasets are used to validate the soundness and practicability of the
implemented algorithm.
The ECG records in the MIT-BIH AF Dataset are two-lead ECG sampled at 250
Hz (samples per second), whereas the records in the PhysioNet challenge
2017 dataset are single-lead ECG sampled at 300 Hz. The records in the
PhysioNet challenge 2020 dataset are 12-lead ECG sampled at 500 Hz. In
this work, we have used a single lead from each of the datasets to
implement the method.

3.2 Pre-Processing
The ECG signal data is pre-processed before it is used for feature
extraction. The pre-processing consists of two steps: noise filtering
(denoising) and segmentation. Noise filtering is carried out first, and
then the signal is segmented. MATLAB R2018a was used to perform all the
signal processing and analysis of the ECG signals.

3.2.1 Noise filtering


The ECG signal is often affected by different kinds of noise during its
acquisition and transmission, such as electromyogram (EMG) noise, baseline
wander, channel noise, power-line interference, and additive white
Gaussian noise (AWGN) [11]. These noise sources hinder the feature
extraction process during ECG analysis.
We therefore designed and implemented the following filters to remove the
noise present in the datasets: a 50 Hz notch filter and a 0.3–45 Hz
bandpass filter [12]. The ECG segment before and after denoising by this
set of filters is illustrated in Figure 3.2. These filters remove baseline
wander, 50 Hz power-line interference, and electrode motion noise.
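As a rough illustration of this filter chain, the sketch below cascades three second-order IIR sections in plain Python: a 50 Hz notch, a 0.3 Hz high-pass, and a 45 Hz low-pass (the last two together act as the bandpass). The report does not state the filter designs or orders used in MATLAB, so the biquad formulas (the widely used audio-EQ "cookbook" forms), the Q values, and the names `biquad`, `apply_biquad`, and `denoise` are all assumptions of this sketch.

```python
import math


def biquad(kind, f0, fs, q):
    """Return normalized (b, a) coefficients of one second-order section."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    c = math.cos(w0)
    if kind == "notch":
        b = [1.0, -2.0 * c, 1.0]
    elif kind == "lowpass":
        b = [(1.0 - c) / 2.0, 1.0 - c, (1.0 - c) / 2.0]
    elif kind == "highpass":
        b = [(1.0 + c) / 2.0, -(1.0 + c), (1.0 + c) / 2.0]
    else:
        raise ValueError("unknown filter kind: " + kind)
    a = [1.0 + alpha, -2.0 * c, 1.0 - alpha]
    return [v / a[0] for v in b], [v / a[0] for v in a]


def apply_biquad(x, b, a):
    """Filter the sequence x with one section (direct form I)."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def denoise(ecg, fs):
    """Apply the 50 Hz notch, then the 0.3 Hz high-pass and 45 Hz low-pass."""
    butterworth_q = 1.0 / math.sqrt(2.0)   # assumed Q for the band edges
    stages = [
        ("notch", 50.0, 30.0),             # assumed notch Q of 30
        ("highpass", 0.3, butterworth_q),  # removes baseline wander
        ("lowpass", 45.0, butterworth_q),  # removes high-frequency noise
    ]
    out = list(ecg)
    for kind, f0, q in stages:
        b, a = biquad(kind, f0, fs, q)
        out = apply_biquad(out, b, a)
    return out
```

A call such as `denoise(record, 250.0)` would be applied to a whole record before segmentation. This causal cascade introduces phase delay; a zero-phase (forward and backward) pass is a common alternative when waveform timing matters.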

Figure 3.2: ECG Signal before denoising and after denoising. (MIT-BIH AF
dataset record #4015)

3.2.2 Segmentation
Segmentation is the procedure of partitioning a signal into sections of a
fixed size. It can be performed with or without an overlapping window. In
this step, the ECG records from all three datasets are divided into
segments of 10 seconds each, without overlap. These filtered and segmented
ECG segments are used to find the WPT coefficients, a prerequisite for the
feature extraction required for binary classification, as described in the
next section.

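A minimal sketch of non-overlapping 10-second segmentation is given below. The function name `segment` is hypothetical, and dropping an incomplete trailing segment is an assumption, since the report does not say how leftover samples were handled.

```python
def segment(signal, fs, seconds=10.0):
    """Split a record into consecutive, non-overlapping fixed-length segments.

    fs is the sampling rate in Hz. Samples left over at the end that do not
    fill a whole segment are discarded (an assumption of this sketch).
    """
    length = int(round(fs * seconds))
    last_start = len(signal) - length
    return [signal[i:i + length] for i in range(0, last_start + 1, length)]
```

With the sampling rates given in Section 3.1, one segment holds 2500 samples at 250 Hz, 3000 samples at 300 Hz, and 5000 samples at 500 Hz.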
3.3 Wavelet Packet Transform
After the ECG records undergo pre-processing, the wavelet packet transform
is performed on each ECG segment. The segments are decomposed at level
l = 5. Figure 3.3 illustrates the WPT decomposition tree.

Figure 3.3: A part of a 5-level wavelet packet transform tree [1].

The energy of the P-wave and the f-wave is mainly concentrated in the
4–12 Hz band, so both are considered low-frequency waves. Therefore, for
feature extraction, Band51 and Band52 of the WPT decomposition tree are
chosen as the frequency intervals [1]. An example of a normal ECG signal
and an AF ECG signal, together with their corresponding selected
sub-bands, is shown in Figure 3.4. Record #4015 from the MIT-BIH AF
dataset was chosen for this illustration.
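The link between the 4–12 Hz range and the two chosen sub-bands can be checked with a short calculation. The sketch below assumes that the 32 sub-bands at level 5 are numbered from 0 in order of increasing frequency, so that Band51 and Band52 correspond to indices 1 and 2; the report does not state its numbering convention, and the function name `subband_edges` is hypothetical.

```python
def subband_edges(fs, level, index):
    """Return the (low, high) frequency edges in Hz of one WPT sub-band.

    The usable spectrum from 0 to fs / 2 is divided into 2**level equal
    parts, and index counts sub-bands from 0 in order of increasing
    frequency (an assumed convention).
    """
    width = (fs / 2.0) / (2 ** level)
    return index * width, (index + 1) * width
```

At the 250 Hz sampling rate of the MIT-BIH AF dataset, each level-5 sub-band is 3.90625 Hz wide, so indices 1 and 2 together span 3.90625 Hz to 11.71875 Hz, which agrees closely with the 4–12 Hz range stated above. At 300 Hz or 500 Hz the band edges differ accordingly.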

Figure 3.4: Plot of the sub-bands Band51 and Band52 of WPT decomposed
NSR signal (top) and AF signal (bottom). (MIT-BIH AF dataset record
#4015)

3.4 Feature Extraction
Many existing AF detection algorithms extract their features from the
morphology of the ECG signal. The two main characteristics of the ECG
signal in AF are the absence of P-waves and RR interval irregularity. In
the AFib ECG signal, P-waves are often replaced by f-waves, which are fast
and disordered fibrillatory waves. All such algorithms involve
pre-processing procedures such as detection of R-peaks and P-waves, and
the algorithm performance suffers if these are not detected accurately
[13]. To avoid this situation, the feature extraction method in this work
is based on the correlation between the WPT coefficients extracted from
the selected sub-bands (Band51 and Band52) [1]. It is known that if any
kind of disorder is present in the ECG signal, the correlation among the
coefficients of the wavelet coefficient series decreases. For a random
series, the correlation function has a superior ability to quantify
specific characteristics, and because of this property it is used for
sequential data analysis. This property highlights the changes in atrial
activity [14]. The feature set is therefore constructed by calculating the
information entropy and weighted sums for the selected sub-bands.
The feature extraction steps are explained below:

Step 1. The wavelet coefficients are obtained from the selected sub-bands
(Band51 and Band52) by performing WPT decomposition of the filtered ECG
segment. The coefficient series is divided into n segments of equal length
m, and one such segment is represented as [1]:

\[ \bar{d}(t_i) = \left[ d^{(1)}(t_i),\ d^{(2)}(t_i),\ d^{(3)}(t_i),\ \ldots,\ d^{(m)}(t_i) \right], \quad i = 1, 2, 3, 4, \ldots, n \]

Step 2. A correlation matrix is computed with these segments, where
τ = 0, 1, 2, 3, . . . , n−1, and it is normalized as given below [1]:

\[
R_{\bar{d}} =
\begin{bmatrix}
B_{1,1} & \cdots & B_{1,1+\tau} & \cdots & B_{1,n} \\
\vdots  &        & \vdots       &        & \vdots  \\
B_{i,1} & \cdots & B_{i,1+\tau} & \cdots & B_{i,n} \\
\vdots  &        & \vdots       &        & \vdots  \\
B_{n,1} & \cdots & B_{n,1+\tau} & \cdots & B_{n,n}
\end{bmatrix}
\tag{3.1}
\]


Step 3. The features are extracted from the normalized correlation
matrix [1]:

\[ W_B = \sum_{i=1}^{n} B_{i,i+\tau} \cdot n_{i,i+\tau} \tag{3.2} \]

\[ H_B = -\sum_{i=1}^{n} p_{i,i+\tau} \cdot \log p_{i,i+\tau} \tag{3.3} \]

where
$n_{i,i+\tau}$ = number of occurrences of $B_{i,i+\tau}$ at a given precision,
$p_{i,i+\tau}$ = proportion of $n_{i,i+\tau}$ in the total number.
Step 4. The features are assembled into the feature set for the
classifier. We have used K-Nearest Neighbor (KNN), Support Vector Machine
(SVM), and Artificial Neural Network (ANN) learning models for
classification.
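The correlation function estimate of any two segments is denoted as
$\hat{B}_{\bar{d}}(\tau_0)$, which is a sequence of numbers with
$\tau_0 = 0, \pm 1, \pm 2, \ldots, \pm(m-1)$, given as follows [1]:

\[ \hat{B}_{\bar{d}}(\tau_0) = \frac{1}{m} \sum_{j=1}^{m} \bar{d}_j(t_i) \cdot \bar{d}_{(j+\tau_0)}(t_{i+\tau}) \tag{3.4} \]

Here $\bar{d}_j(t_i)$ denotes the j-th coefficient of the segment
$\bar{d}(t_i)$. The mean value of this sequence is $\bar{B}_{i,i+\tau}$,

\[ \bar{B}_{i,i+\tau} = \frac{1}{2m-1} \sum_{\tau_0=-(m-1)}^{m-1} \hat{B}_{\bar{d}}(\tau_0) \tag{3.5} \]

and $B_{i,i+\tau}$ is the normalized value of $\bar{B}_{i,i+\tau}$.
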
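Steps 1 to 3 can be sketched in Python as follows. Several details are not specified in the text and are filled in here as labeled assumptions: terms of Eq. (3.4) whose index falls outside the segment are treated as zero, the matrix is normalized by its largest absolute entry, "a given precision" is taken to mean rounding to two decimals, and the counts are taken over all entries of the matrix. All function names are hypothetical.

```python
import math
from collections import Counter


def split_segments(coeffs, n):
    """Step 1: divide a coefficient series into n segments of equal length m."""
    m = len(coeffs) // n
    return [coeffs[i * m:(i + 1) * m] for i in range(n)]


def mean_correlation(a, b):
    """Eqs. (3.4)-(3.5): average the lagged estimate over all 2m - 1 lags."""
    m = len(a)
    total = 0.0
    for lag in range(-(m - 1), m):
        acc = 0.0
        for j in range(m):
            if 0 <= j + lag < m:       # out-of-range terms count as zero
                acc += a[j] * b[j + lag]
        total += acc / m               # Eq. (3.4) for this lag
    return total / (2 * m - 1)         # Eq. (3.5)


def correlation_matrix(segments):
    """Step 2: matrix of mean correlations, scaled by its largest magnitude."""
    raw = [[mean_correlation(a, b) for b in segments] for a in segments]
    peak = max(abs(v) for row in raw for v in row) or 1.0
    return [[v / peak for v in row] for row in raw]


def features(matrix, decimals=2):
    """Step 3: weighted sum W_B and information entropy H_B of the entries."""
    values = [round(v, decimals) for row in matrix for v in row]
    counts = Counter(values)           # n: occurrences of each rounded value
    total = len(values)
    wb = sum(value * count for value, count in counts.items())
    hb = -sum((c / total) * math.log(c / total) for c in counts.values())
    return wb, hb
```

Applying `features(correlation_matrix(split_segments(coeffs, n)))` to the coefficients of Band51 and of Band52 would give the four values (WB and HB for each sub-band) that make up one feature vector. The number of segments n is a free parameter of this sketch.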
3.4.1 Features Evaluation
After the feature extraction step, it is important to check whether the
extracted features are good enough to be used as input to the classifier
model. They can be evaluated by their p-values or box-plots. If the
p-value is less than 0.05, the feature is considered suitable for
classification [15]; otherwise, it is discarded from the feature set. To
establish the significance of the features, the Kruskal–Wallis test was
conducted. The p-values obtained from the test for each dataset are given
in Table 3.1.
Visual hypothesis testing using box-plots was also carried out. The
box-plots of the features show how well the features from the selected
sub-bands separate the two classes, which helped confirm the significance
of the features for classification. The box-plots for the MIT-BIH AF
dataset are shown in Figures 3.5 and 3.6.
The p-values and the box-plots obtained for each dataset suggest that the
constructed feature set can be used as an input to the classifier model
for automatic detection of AF segments.
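For two classes (AF and NSR), the Kruskal–Wallis test can be sketched in plain Python as shown below. The function name `kruskal_two_groups` is hypothetical, and the sketch is only meant to show how such a p-value arises; it is not the routine used to produce Table 3.1. With two groups the H statistic follows a chi-square distribution with one degree of freedom, whose upper tail probability equals erfc(sqrt(H / 2)).

```python
import math


def kruskal_two_groups(x, y):
    """Return (H, p) of the Kruskal-Wallis test for two groups of values.

    Tied values receive the average of their ranks and the usual tie
    correction is applied. The inputs must not all be identical.
    """
    pooled = sorted((v, g) for g, group in enumerate((x, y)) for v in group)
    total = len(pooled)
    rank_sums = [0.0, 0.0]
    tie_term = 0.0
    i = 0
    while i < total:
        j = i
        while j < total and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0           # mean of ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += avg_rank
        ties = j - i
        tie_term += ties ** 3 - ties
        i = j
    h = (12.0 / (total * (total + 1))
         * (rank_sums[0] ** 2 / len(x) + rank_sums[1] ** 2 / len(y))
         - 3.0 * (total + 1))
    h /= 1.0 - tie_term / (total ** 3 - total)
    h = max(h, 0.0)                             # guard against rounding error
    p = math.erfc(math.sqrt(h / 2.0))           # chi-square tail, 1 d.o.f.
    return h, p
```

Here `x` would hold one feature (for example WB in Band51) computed over all NSR segments and `y` the same feature over all AF segments. A p-value below 0.05 marks the feature as suitable under the criterion described above.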

Table 3.1: p-values of the constructed features for respective dataset.

Features         MIT-BIH AF    PNC 2017       PNC 2020
WB in Band51     7.4424e-55    4.6259e-223    1.8753e-31
WB in Band52     9.1575e-60    1.5020e-135    6.4447e-33
HB in Band51     9.3999e-08    6.5043e-145    2.7211e-19
HB in Band52     2.9049e-20    5.9010e-140    5.8739e-15

(a) p-value = 7.4424e-55 (b) p-value = 9.1575e-60

Figure 3.5: The example of box-plot for WB in (a) Band51 (b) Band52
(MIT-BIH AF dataset)

(a) p-value = 9.3999e-08 (b) p-value = 2.9049e-20

Figure 3.6: The example of box-plot for HB in (a) Band51 (b) Band52
(MIT-BIH AF dataset)

3.5 Classification
To extend the study, the algorithm was tested with different
classification techniques: ANN, KNN, and SVM. All three classifiers were
implemented for each dataset, and the default settings were used for
training all the classifier models. The results obtained are presented in
the following chapter. The classifiers perform supervised binary
classification by separating the ECG data into two classes, i.e. AFib or
NSR.

Support Vector Machine (SVM)

A Support Vector Machine, abbreviated as SVM, is a type of supervised
machine learning algorithm commonly preferred for binary classification.
It maximizes the margin of separation between the two classes in the data
by constructing an optimal hyperplane which acts as a decision surface. In
this work, polynomial SVMs (quadratic and cubic) were found to give better
results than other types.

K-Nearest Neighbor (KNN)

KNN is a classification technique that is usually preferred for
multi-class classification. It separates data based on distance: using a
specified distance metric, the nearest neighbor search locates the k
nearest neighbors of a query data point, or all neighbors within a
specified distance of it. In this work, the weighted and cosine variants
of KNN were found to give better results than other variants.
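The idea of a distance-weighted KNN vote can be sketched as follows. The Euclidean metric, inverse-squared-distance weights, the default k, and the function name `knn_predict` are assumptions chosen for illustration; the report only states that default classifier settings were used.

```python
import math
from collections import defaultdict


def knn_predict(train_x, train_y, query, k=3):
    """Classify one feature vector by a distance-weighted vote of k neighbors.

    Each of the k nearest training points votes for its own label with a
    weight of 1 / distance**2, so closer neighbors count more.
    """
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = defaultdict(float)
    for distance, label in neighbors[:k]:
        votes[label] += 1.0 / (distance * distance + 1e-12)
    return max(votes, key=votes.get)
```

In the setting of this report, each training vector would hold the four features (WB and HB in Band51 and Band52) of one ECG segment, and the labels would be "AF" or "NSR".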

Artificial Neural Network (ANN)


An Artificial Neural Network is a classification technique inspired by the
functioning of the human brain. The brain is made up of neurons and the
connections between them, i.e. synapses, and each neuron is connected to many
other neurons. Neurons are the computational units of the brain and synapses
are its signal-transferring units. Neural networks are widely used for tasks
such as image classification, market prediction, and speech recognition. An
ANN classifier predicts the output by assigning the input vectors to
categories according to their properties [16].
The neural network used in our framework contains three layers, i.e. an
input layer, a hidden layer, and an output layer. The input layer has four
nodes, equal to the dimension of the feature vector. Because the
classification is binary, the output layer has two nodes. The hidden layer
has ten nodes. Figure 3.7 shows the ANN topology of the proposed scheme. The
adaptive learning rate was set to 0.1, and the Levenberg-Marquardt
backpropagation algorithm was used to obtain the best-performing model. The
sigmoid function was used as the activation of the hidden layer, and the
softmax function as that of the output layer.

Figure 3.7: ANN topology in proposed scheme.
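
The topology described above (4 input nodes, 10 sigmoid hidden nodes, 2 softmax output nodes) can be sketched as a single forward pass. This is only an illustration of the layer structure: the weights below are random and untrained, the input vector is made up, and Levenberg-Marquardt training is not shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(values):
    shift = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """One fully connected layer: each row of `weights` feeds one node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(features, w_hidden, b_hidden, w_out, b_out):
    hidden = [sigmoid(z) for z in dense(features, w_hidden, b_hidden)]
    return softmax(dense(hidden, w_out, b_out))


# Layer sizes from the text; the weights are random placeholders.
N_IN, N_HIDDEN, N_OUT = 4, 10, 2
rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(N_IN)] for _ in range(N_HIDDEN)]
b_hidden = [0.0] * N_HIDDEN
w_out = [[rng.uniform(-1, 1) for _ in range(N_HIDDEN)] for _ in range(N_OUT)]
b_out = [0.0] * N_OUT

# Made-up four-dimensional feature vector.
probs = forward([0.2, -0.4, 0.7, 0.1], w_hidden, b_hidden, w_out, b_out)
```

The two softmax outputs sum to one and can be read as class scores for AFib and NSR; which output corresponds to which class is a labeling convention and is assumed here.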

3.6 Performance Evaluation


A machine learning algorithm is considered good if it generalizes, i.e. if
it performs well on new, previously unseen samples. A model that lacks
generalization produces a large error on such samples. A model can
nevertheless reach very high accuracy on a dataset of limited size if it is
iterated over that dataset many times, but this leads to a situation called
overfitting. To avoid this situation and to improve the reliability of the
model on unseen examples, a separate test set is held out. Usually, this is
done by dividing a large dataset into training, validation, and test sets
[15].
In this work, each dataset was divided into three parts: 80% for training,
10% for validation, and 10% for testing. The training and validation sets
are used during the training stage, and the test set is used at the final
stage to evaluate the performance of the model. When the dataset is limited,
however, cross-validation can be used to make better use of the data.
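
An 80/10/10 split of this kind can be sketched as follows. The function name, the fixed random seed, and the sample count of 1000 are assumptions for illustration; the report does not describe how the split was actually drawn.

```python
import random


def split_indices(n_samples, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle sample indices and cut them into train/validation/test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Shuffling before cutting keeps the class proportions of the three parts roughly similar; a stratified split would enforce this exactly.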

3.6.1 Cross-validation
To enhance the generalization ability of the method, we implemented 10-fold
cross-validation, which gives a more reliable estimate of the classification
performance of each classifier [17]. The dataset was divided in random order
into 10 parts; in each iteration, nine parts were taken as the training set
and the remaining one as the test set, with each part serving as the test
set in turn. The results of the iterations were averaged to give the final
classification performance.
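
The fold rotation described above can be sketched with index lists alone. The helper names and the `evaluate` callback are hypothetical: `evaluate(train, test)` stands for "train a classifier on the training indices and return its score on the test indices".

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each fold is the test set exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal parts
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(n_samples, evaluate, k=10):
    """Average the score returned by `evaluate` over the k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


# Example with 95 samples: five folds hold 10 samples and five hold 9.
splits = list(k_fold_indices(95, k=10))
print([len(test) for _, test in splits])
```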

3.6.2 Evaluation metrics


To gauge the performance of a classifier, suitable metrics are needed. A
convenient way to summarize the performance of a model is a table called the
contingency table or confusion matrix. Metrics such as sensitivity (SEN),
accuracy (ACC), specificity (SPE), false positive rate (FPR), and error rate
(Err) can be calculated from the confusion matrix [18]. For binary
classification with a positive class (afflicted by AF) and a negative class
(not afflicted by AF), the confusion matrix is a two-by-two table, shown in
Figure 3.8. Its entries are:
True Positives (TP): No. of AF samples correctly classified as AF.
True Negatives (TN): No. of non-AF samples correctly classified as non-AF.
False Positives (FP): No. of non-AF samples misclassified as AF.
False Negatives (FN): No. of AF samples misclassified as non-AF.

Figure 3.8: A template of the confusion matrix for the binary classification.
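
The four entries of the confusion matrix can be counted directly from true and predicted labels, as in the short sketch below. The label strings "AF" and "NSR" and the six example labels are made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive="AF"):
    """Return (TP, TN, FP, FN), treating `positive` as the positive class."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


# Six made-up segments: true rhythm and predicted rhythm.
y_true = ["AF", "AF", "NSR", "NSR", "NSR", "AF"]
y_pred = ["AF", "NSR", "NSR", "AF", "NSR", "AF"]
counts = confusion_counts(y_true, y_pred)
print(counts)  # (2, 2, 1, 1)
```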

In this work, we employed the standard metrics sensitivity (SEN),
specificity (SPE), and accuracy (ACC) to assess the effectiveness of the
classification. The expressions used to compute them are given below:
\[ \mathrm{SEN} = \frac{TP}{TP + FN} \tag{3.6} \]

\[ \mathrm{SPE} = \frac{TN}{TN + FP} \tag{3.7} \]

\[ \mathrm{ACC} = \frac{TP + TN}{TP + TN + FN + FP} \tag{3.8} \]
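F1 score (F measure or F score): It is the harmonic mean of Precision (PRE)
and Recall (REC), so both false positives and false negatives are taken into
account while calculating it. Recall is identical to sensitivity (3.6). The
F1 score is calculated as below:

\[ \mathrm{F1\ Score} = 2 \cdot \frac{\mathrm{REC} \cdot \mathrm{PRE}}{\mathrm{REC} + \mathrm{PRE}} \tag{3.9} \]

\[ \mathrm{REC} = \frac{TP}{TP + FN} \tag{3.10} \]

\[ \mathrm{PRE} = \frac{TP}{TP + FP} \tag{3.11} \]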
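
Equations (3.6) to (3.11) translate directly into code. The sketch below uses made-up counts (TP = 90, TN = 95, FP = 5, FN = 10) that are not results from this work, and it assumes every denominator is non-zero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute SEN, SPE, ACC, PRE and F1 from confusion-matrix counts."""
    sen = tp / (tp + fn)                   # sensitivity = recall, Eq. (3.6)/(3.10)
    spe = tn / (tn + fp)                   # specificity, Eq. (3.7)
    acc = (tp + tn) / (tp + tn + fn + fp)  # accuracy, Eq. (3.8)
    pre = tp / (tp + fp)                   # precision, Eq. (3.11)
    f1 = 2 * (sen * pre) / (sen + pre)     # F1 score, Eq. (3.9)
    return {"SEN": sen, "SPE": spe, "ACC": acc, "PRE": pre, "F1": f1}


# Made-up counts for illustration only.
metrics = classification_metrics(tp=90, tn=95, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

For these counts the sketch gives SEN = 0.9000, SPE = 0.9500, ACC = 0.9250, and F1 = 0.9231, which equals 2TP/(2TP + FP + FN) = 180/195.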

Chapter 4

Results

The implemented algorithm was tested on several datasets to assess and
improve its generalization ability. A total of three datasets were used. The
evaluation parameters, namely accuracy, specificity, sensitivity, and F1
score, were calculated from the confusion matrix obtained for each dataset
with each classifier. The validation performance plot and the ROC curve of
the SVM classifier model are also illustrated for binary classification on
the combined dataset.
One efficient measure of the classification performance of the implemented
method is the area under the ROC curve (AUC). The higher the AUC value, the
better the classification [19].
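
The AUC can be computed without drawing the curve, because it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below uses this pairwise definition on six made-up scores; the function name and the data are illustrative assumptions, not values from this work.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Made-up classifier scores: 1 = AF, 0 = NSR.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))  # 8 of the 9 pairs are ranked correctly
```

An AUC of 1.0 means the scores separate the two classes perfectly, while 0.5 corresponds to random ranking.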

4.1 Binary classification using MIT-BIH AF Dataset
The algorithm was trained and tested on MIT-BIH AF Dataset. The
obtained results are given in this section. Three classifiers were used to
perform the classification. Table 4.1 shows the performance evaluation pa-
rameter comparison of AF detection results using ANN, KNN, and SVM.

Table 4.1: Performance evaluation parameter comparison of the algorithm
using MIT-BIH AF dataset with different classifiers.

            ANN      KNN      SVM
ACC (%)     100      99.44    99.44
SEN (%)     100      99.44    100
SPE (%)     100      99.44    98.90
F1 Score    1.000    0.994    0.994

4.2 Binary classification using PhysioNet Challenge 2017 Dataset
The algorithm was trained and tested on PhysioNet challenge 2017 Dataset
and the obtained results are given in this section. Table 4.2 shows the per-
formance evaluation parameter comparison of the algorithm using PhysioNet
Challenge 2017 dataset with different classifiers.

Table 4.2: Performance evaluation parameter comparison of the algorithm
using PhysioNet Challenge 2017 dataset with different classifiers.

            ANN      KNN      SVM
ACC (%)     99.28    99.17    99.27
SEN (%)     98.35    99.33    99.46
SPE (%)     99.54    98.58    98.59
F1 Score    0.983    0.994    0.995

4.3 Binary classification using PhysioNet Challenge 2020 Dataset
The algorithm was trained and tested on the PhysioNet Challenge 2020
Dataset, which is available on the PhysioNet website. The results obtained
are given below. Table 4.3 shows the performance evaluation parameter
comparison of the algorithm using PhysioNet Challenge 2020 dataset with
different classifiers.

Table 4.3: Performance evaluation parameter comparison of the algorithm
using PhysioNet Challenge 2020 dataset with different classifiers.

            ANN      KNN      SVM
ACC (%)     99.26    98.96    99.15
SEN (%)     99.08    98.50    98.88
SPE (%)     99.44    99.43    99.43
F1 Score    0.992    0.989    0.991

4.4 Binary classification for all datasets combined
After implementing the algorithm on each individual dataset, we combined all
three datasets and used the combined set to train and test the algorithm.
The model was trained over 17175 segments, of which 11935 were NSR and 5240
were AF segments. The performance evaluation parameter comparison of the
algorithm using all three datasets combined with different classifiers is
shown in Table 4.4. The ROC curve for the SVM classifier model is illustrated
in Figure 4.1.
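
The segment counts above imply an imbalanced dataset, which is one reason sensitivity, specificity, and F1 score are reported alongside accuracy. The short calculation below, using only the counts stated in the text, shows the share of AF segments and the accuracy that a trivial classifier labeling every segment as NSR would reach. The variable names are illustrative.

```python
n_nsr, n_af = 11935, 5240  # segment counts stated in the text
total = n_nsr + n_af

af_share = n_af / total
# Accuracy of a trivial classifier that always predicts the majority class.
majority_baseline_acc = max(n_nsr, n_af) / total

print(total)                                  # 17175
print(round(100 * af_share, 2))               # 30.51
print(round(100 * majority_baseline_acc, 2))  # 69.49
```

Such a trivial classifier would have a sensitivity of zero, so accuracy alone would overstate its usefulness.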

Table 4.4: Performance evaluation parameter comparison of the algorithm
using all three datasets combined with different classifiers.

            ANN      KNN      SVM
ACC (%)     98.03    97.78    98.50
SEN (%)     95.05    97.66    98.43
SPE (%)     99.40    98.04    98.39
F1 Score    0.968    0.983    0.988

Figure 4.1: ROC curve for the SVM classifier.

4.5 Parameter Analysis
This section analyzes the evaluation parameters of the SVM classifier across
the different datasets. It was observed that while ANN gave better accuracy
when the classifiers were trained on each dataset separately, SVM
classification was more effective for the generalized approach, i.e. on the
combined dataset. Therefore, we chose to examine the output of the SVM
classifier closely on all datasets. Table 4.5 shows the evaluation parameter
comparison of the algorithm using different datasets with the SVM classifier.

Table 4.5: Evaluation parameter comparison of the algorithm using different
datasets with SVM classifier.

            DS1      DS2      DS3      All Combined
ACC (%)     99.44    99.27    99.15    98.50
SEN (%)     100      99.46    98.88    98.43
SPE (%)     98.90    98.59    99.43    98.39
F1 Score    0.994    0.995    0.991    0.988

Chapter 5

Conclusion

The project aimed at implementing an efficient algorithm to detect atrial
fibrillation using machine learning. This study is an implementation of
existing work [1]. The study emphasizes the danger of atrial fibrillation
and the need for automated tools that can detect it, and it achieved a high
detection accuracy.
The algorithm for detection of atrial fibrillation was implemented
successfully. Its generalization ability was improved by implementing it on
a larger and more diverse dataset. On the combined dataset, the SVM
classification model gave the highest accuracy, 98.5%. It was observed that
while ANN gave better accuracy when the classifiers were trained on each
dataset separately, SVM classification was more effective for the
generalized approach.
Along the way, we learned about the electrocardiogram, atrial fibrillation,
and the wavelet packet transform, and gained practical knowledge of
biomedical signal processing and machine learning, as well as the skills
needed to implement an existing method. We believe that this knowledge and
these skills will be useful in the future.

Chapter 6

Future Work

Although the results we obtained are satisfactory, the implementation can
still be improved by training the method on a larger number of diverse
databases covering a variety of patients and a variety of cardiac defects.
Tasks involving machine learning and neural networks can be sped up further
by using computers with faster processors. The classification may be
improved by using deep neural networks such as recurrent neural networks.
The algorithm could be further simplified by exploring other wavelet
transforms in place of the wavelet packet transform. Another direction for
future work is the detection of other cardiac conditions such as atrial
flutter, paroxysmal tachycardia, and heart-valve-related diseases. Further,
the algorithm can be improved by deploying it under realistic scenarios and
analyzing the results. Finally, the method can be clinically validated in
the future, and the possibility of implementing it on a low-cost device can
be explored.

References

[1] Jibin Wang, Ping Wang, Suping Wang, “Automated detection of atrial
fibrillation in ECG signals based on wavelet packet transform and cor-
relation function of random process”, Biomedical Signal Processing and
Control, Volume 55, 101662, August 2019.

[2] Chugh, S.S., Havmoeller, R., Narayanan, K., Singh, D., Rienstra, M.,
Benjamin, E.J., Gillum, R.F., Kim, Y.H., McAnulty Jr, J.H., Zheng,
Z.J. and Forouzanfar, “Worldwide epidemiology of atrial fibrillation: a
Global Burden of Disease 2010 Study”, Circulation 129(8), pp. 837-847,
2014.

[3] Muthuchudar, A. and Baboo, L.D.S.S., “A study of the processes involved
in ECG signal analysis”, International Journal of Scientific and
Research Publications, 3(3), pp. 1-5, 2013.

[4] Padmavathi K., and K. Sri Ramakrishna, “Detection of atrial fibrillation
using autoregressive modeling”, International Journal of Electrical and
Computer Engineering (IJECE), 5, pp. 64-70, 2015.

[5] http://www.sads.org.uk/heart function.htm.sads.org.uk

[6] https://ecgwaves.com/topic/cardiac-electrophysiology-ecg-action-potential-automaticity-vector/

[7] http://www.nhlbi.nih.gov/health-topics/atrial-fibrillation/

[8] Abouelanouar, Bouchra, et al., “Application of wavelet analysis and
its interpretation in rotating machines monitoring and fault diagnosis.
A review”, International Journal of Engineering & Technology, 7(4),
3465-3471, 2018.

[9] https://physionet.org/content/afdb/1.0.0/

[10] https://physionet.org/about/challenge/

[11] Velayudhan, Aswathy, and Soniya Peter, “Noise analysis and differ-
ent denoising techniques of ECG signal-a survey”, IOSR Journal of
Electronics and Communication Engineering (IOSR-JECE), pp 40-44,
eISSN-2278, 2016.

[12] S. Patidar, A. Sharma and N. Garg, “Automated detection of atrial
fibrillation using Fourier-bessel expansion and teager energy operator
from electrocardiogram signals”, 2017 Computing in Cardiology (CinC),
Rennes, 349-105, pp. 1-4, 2017.

[13] Tateno, K., and L. Glass, “A method for detection of atrial fibrillation
using RR intervals”, Computers in Cardiology 2000, Vol. 27, pp. 391-
394. IEEE, 2000.

[14] Moews, Ben, J. Michael Herrmann, and Gbenga Ibikunle, “Lagged
correlation-based deep learning for directional trend change prediction
in financial time series”, Expert Systems with Applications, 120, pp. 197-
206, 2019.

[15] Ostertagová, Eva, et al., “Methodology and Application of the
Kruskal-Wallis Test”, Applied Mechanics and Materials, vol. 611,
pp. 115–120, Aug 2014.

[16] Nagesh Singh Chauhan,
“https://towardsdatascience.com/introduction-to-artificial-neural-networks-ann”.

[17] Refaeilzadeh P., Tang L., Liu H., “Cross-validation”, Encyclopedia of
database systems, pp. 532–538, Springer, 2009.

[18] Parikh R, Mathai A, Parikh S, Chandra Sekhar G, Thomas R,
“Understanding and using Sensitivity, Specificity and Predictive values”,
Indian J Ophthalmol, 56(1), pp. 45-50, June 2008.

[19] Ladavich, Steven, and Behnaz Ghoraani, “Developing an atrial
activity-based algorithm for detection of atrial fibrillation”, 36th Annual
International Conference of the IEEE Engineering in Medicine and Biology
Society, pp. 54-57. IEEE, 2014.
