
Multi-Label Cardiac Abnormality Classification from Electrocardiogram Using Deep Convolutional Neural Networks

Nima L Wickramasinghe1, Mohamed Athif2

1 Department of Electronic and Telecommunication Engineering, University of Moratuwa, Sri Lanka
2 Department of Biomedical Engineering, Boston University, Boston, USA

Abstract

This paper proposes a deep neural network architecture to perform multi-label classification of 26 cardiac abnormalities from 12-lead and reduced-lead ECG data. The model was created by team "NIMA" for the PhysioNet/Computing in Cardiology Challenge 2021. ECG signals of at most 20 seconds in length were used for training. The data are preprocessed by normalizing, resampling, and zero-padding to obtain a constant-sized array. The preprocessed ECG signals and the Fast Fourier Transforms obtained from them are each fed into two separate deep Convolutional Neural Networks. Spatial dropouts and average pooling are used between the convolutional layers to reduce overfitting and model complexity. Following the convolutional layers, the time-domain and frequency-domain network outputs are concatenated and passed through two dense layers that output an array of size 26. A threshold of 0.13 is applied to the output array to determine the classes while addressing data imbalance. The method achieved scores of 0.55, 0.51, 0.56, 0.55, and 0.56, ranking 2nd, 5th, 3rd, 3rd, and 3rd out of 39 officially ranked teams on the 12-lead, 6-lead, 4-lead, 3-lead, and 2-lead hidden test datasets, respectively, according to the challenge evaluation metric. Our model performs comparably to the 12-lead model when using smaller subsets of leads.

1. Introduction

Cardiovascular diseases have a high prevalence (49.5% of adults >20 years in the USA) and are a major cause of death [1]. An electrocardiogram (ECG) measures the heart's electrical activity and aids the diagnosis of cardiovascular diseases by identifying various abnormalities in its patterns. Automatic detection of cardiac abnormalities from ECGs has many benefits, including early detection and better prognosis. While such implementations often use the standard 12-lead ECG, using a smaller number of leads would enable low-cost, portable, and user-friendly point-of-care devices. However, it remains largely unexplored whether similar outcomes are achievable using reduced leads. The objective of the PhysioNet/Computing in Cardiology Challenge 2021 was to find automated, open-source approaches to identify multiple cardiovascular diseases from 12-lead and reduced-lead ECG data [2, 3].

Deep learning methods have recently gained popularity for classifying various cardiac abnormalities, in addition to traditional methods such as support vector machines, linear regression, decision trees, and feed-forward neural networks [4, 5]. In this work, we propose and investigate the suitability of a deep learning-based method that uses the time and frequency domains of ECG signals to classify 26 classes of cardiac abnormalities using 12-lead and reduced-lead ECG data.

2. Methods

2.1. Dataset

Local training is done using the CPSC [6], PTB [7], PTB-XL [8], INCART [9], Chapman-Shaoxing [10], Ningbo [11], and Georgia databases, comprising 88,259 12-lead ECG signals. Recordings longer than 20 seconds are not used for training (<3% of the dataset). The final dataset, containing 85,811 records, is randomly divided into a training dataset (79,791 records) and a local validation dataset (6,020 records). Reduced-lead ECG signals are generated by selecting the required leads from the 12-lead ECG signals, as shown in Table 1.

Leads   Leads used
12      I, II, III, aVR, aVL, aVF, V1, V2, V3, V4, V5, V6
6       I, II, III, aVR, aVL, aVF
4       I, II, III, V2
3       I, II, V2
2       I, II

Table 1. Leads used in different lead sets

Computing in Cardiology 2021; Vol 48 Page 1 ISSN: 2325-887X DOI: 10.22489/CinC.2021.352
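As a concrete illustration of the lead selection in Table 1 and the preprocessing described in Section 2.2, the pipeline can be sketched roughly as follows. This NumPy/SciPy code is our own sketch, not the authors' implementation; the function name, argument layout, and the assumption that the baseline and ADC gain are given as per-lead arrays are ours.

```python
# Illustrative sketch (hypothetical, not the authors' code) of Table 1
# lead selection and Section 2.2 preprocessing, using NumPy and SciPy.
import numpy as np
from scipy.signal import resample

LEAD_NAMES = ["I", "II", "III", "aVR", "aVL", "aVF",
              "V1", "V2", "V3", "V4", "V5", "V6"]
LEAD_SETS = {12: LEAD_NAMES,
             6: ["I", "II", "III", "aVR", "aVL", "aVF"],
             4: ["I", "II", "III", "V2"],
             3: ["I", "II", "V2"],
             2: ["I", "II"]}

def preprocess(recording, fs, baseline, adc_gain, lead_set=12):
    """recording: (12, samples) raw ADC values.
    Returns the time-domain input (N, 2000) and the
    frequency-domain input (2N, 2000) for N selected leads."""
    # Select the reduced-lead subset from the 12-lead signal (Table 1).
    idx = [LEAD_NAMES.index(name) for name in LEAD_SETS[lead_set]]
    x = (recording[idx] - baseline[idx, None]) / adc_gain[idx, None]
    # Zero-pad to 20 s, then resample to 200 Hz -> 4,000 time points.
    n20 = int(20 * fs)
    if x.shape[1] < n20:
        x = np.pad(x, ((0, 0), (0, n20 - x.shape[1])))
    x = resample(x, 4000, axis=1)
    # FFT; keep only the first 2,000 bins (the spectrum of a real
    # signal is symmetric), then stack magnitude and phase -> (2N, 2000).
    spec = np.fft.fft(x, axis=1)[:, :2000]
    freq_input = np.concatenate([np.abs(spec), np.angle(spec)], axis=0)
    # Resample the time-domain array again to 100 Hz -> (N, 2000).
    time_input = resample(x, 2000, axis=1)
    return time_input, freq_input
```

For example, for a 10-second recording sampled at 500 Hz, `preprocess(rec, 500, baseline, gain, lead_set=3)` would return arrays of shape (3, 2000) and (6, 2000).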


2.2. Preprocessing

Firstly, all the recordings are normalized by subtracting the baseline and dividing by the Analog-to-Digital Converter (ADC) gain. Recordings shorter than 20 seconds are zero-padded to 20 seconds and then resampled at 200 Hz, giving an array of 4,000 time points. The Fast Fourier Transform (FFT) is applied to the resulting recordings, giving a complex array of 4,000 points representing the frequency domain of the recording. Since the frequency spectrum of a real signal is symmetric, only the first 2,000 points are retained. The magnitude and phase of the resulting array are separated and concatenated, giving an array of shape (2N, 2,000), where N is the number of leads. The time-domain array is resampled again at 100 Hz, giving an array of shape (N, 2,000). The 26 diagnosis classes are one-hot encoded to give a binary array of size 26.

The SciPy library is used for resampling and for obtaining the FFT of the signals. Both the time-domain and frequency-domain arrays are fed as inputs to the deep learning model.

2.3. Model Description

The model is created with the Python TensorFlow library using the Keras functional API.

Figure 1. The neural network architecture. The numbers in brackets show the shape of the output array from the particular block. N = number of leads

The time-domain input is sent into convolution block A (Figure 2). It is first passed through a 1D convolutional layer, and the Swish activation function [12] is applied to the outputs. A skip connection passes the information from the previous layer directly to the next layer. Values from both the skip connection and the convolutional layer are added, and the Swish activation function is applied to the outputs. We then use 1D spatial dropout [13] with a rate of 0.2 to reduce overfitting. The outputs are passed through an average pooling layer to reduce the dimensions of the feature maps and the computational cost. This process is repeated four times for the time-domain signal, with the number of filters doubling and the kernel size increasing by 2 each time, as shown in Table 2. Finally, instead of the average pooling layer, a global average pooling layer is used to produce an output array of size 576. We used multiples of the number of leads for the number of filters in the convolutional layers, a large initial kernel size, and padding in all convolutional layers to maintain consistent dimensionality across convolutional layers and to facilitate skip connections.

Figure 2. Convolution block A

The frequency-domain input is sent into convolution block B (Figure 3). It is first passed through a 1D convolutional layer, and the Swish activation function is applied to the resulting output. Spatial dropout is then applied with a rate of 0.1, followed by average pooling with a size of 2. This is repeated five times, with the number of filters doubling and the kernel size increasing by 2 with each repetition, as shown in Table 2. Finally, a global average pooling layer produces an output array of size 1,152.

The outputs from convolution block A and convolution block B are concatenated (Figure 3), and the Swish activation function is applied. The resultant output is fed into a fully connected layer with 576 units. A dropout layer with a rate of 0.5 is applied, and a final dense layer of 26 units follows.

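For illustration, one repetition of convolution block A (1D convolution, Swish, skip-connection addition, 1D spatial dropout, average pooling) might look like the sketch below. This is our own simplified NumPy rendering, not the authors' TensorFlow/Keras code; the function names and shapes are hypothetical.

```python
# Hypothetical NumPy sketch of one repetition of convolution block A
# (Section 2.3). Shapes follow the (channels, time) convention.
import numpy as np

def swish(x):
    return x / (1.0 + np.exp(-x))  # x * sigmoid(x)

def conv1d_same(x, kernels):
    """x: (C_in, T); kernels: (C_out, C_in, K). 'Same' zero padding,
    computed as a cross-correlation, as in deep-learning conv layers."""
    c_out, c_in, k = kernels.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))
    out = np.zeros((c_out, x.shape[1]))
    for o in range(c_out):
        for i in range(c_in):
            out[o] += np.convolve(xp[i], kernels[o, i][::-1],
                                  mode="valid")[:x.shape[1]]
    return out

def block_a_step(x, kernels, drop_rate=0.2, pool=2, rng=None):
    # 1D convolution followed by Swish activation.
    y = swish(conv1d_same(x, kernels))
    # Skip connection: add the block input when the shapes match
    # ('same' padding keeps the time dimension aligned).
    if y.shape == x.shape:
        y = y + x
    # 1D spatial dropout: zero out entire feature maps (training only).
    if rng is not None:
        keep = rng.random(y.shape[0]) >= drop_rate
        y = y * keep[:, None] / (1.0 - drop_rate)
    # Average pooling with window `pool` halves the time dimension.
    t = y.shape[1] // pool
    return y[:, :t * pool].reshape(y.shape[0], t, pool).mean(axis=2)
```

In the actual model, the filter counts are multiples of the number of leads and all convolutional layers are padded, which keeps dimensions consistent across layers and makes the skip-connection addition possible.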
A sigmoid activation on the final 26-unit dense layer outputs an array of size 26, which is used to predict whether each specific cardiac abnormality is present. Our method classifies the 26 scored classes, treating the 4 pairs of similar diagnoses as 4 classes, and is able to identify multiple diagnoses from a single recording.

Figure 3. Convolution block B (Left) and fully connected block (Right)

The loss and the Area Under the Precision-Recall Curve (AUPRC) are used as metrics to evaluate model performance during training. Training is discontinued if the validation AUPRC does not increase by 0.01 for 5 epochs. The data are shuffled between epochs, and a batch size of 128 is used for training. First, the time-domain and frequency-domain network hyperparameters were optimized independently and then fine-tuned together. We used a grid search to find the optimal hyperparameters. The leads are used as the channel dimension of the input to the convolutional layers, so only the number of input channels changes in the model when different subsets of leads are used.

Block                12   72  144  288  576  72  144  288  576  1152
(see rows below)     A1   A2   A3   A4   B1   B2   B3   B4   B5

Block                A1   A2   A3   A4   B1   B2   B3   B4   B5
Num. of filters (x)  72  144  288  576   72  144  288  576  1152
Kernel size (y)      15    3    5    7    3    5    7    9   11

Table 2. Hyperparameters used in the final model. A and B stand for convolutional block A (Figure 2) and convolutional block B (Figure 3), respectively. The number following A or B indicates the repetition of each block.

Binary cross-entropy is used as the loss. We used Adam optimization with an initial learning rate of 0.001 for all models. The learning rate is reduced tenfold after a number of epochs, as shown in Table 3.

Lead set                  12-lead  6-lead  4-lead  3-lead  2-lead
Epoch at which LR=0.0001       17      23      16      16      15
Total number of epochs         21      26      22      18      17

Table 3. Number of epochs for which each lead set is trained

2.4. Model Evaluation

When evaluating, if the length of a recording is greater than 20 seconds, we segment the recording sequentially into 20-second windows with no overlap. If the last window is shorter than 5 seconds, it is discarded; otherwise, it is zero-padded to 20 seconds. All segmented windows are fed to the trained model. If a cardiac abnormality is found in at least one of the windows, that abnormality is marked as present. We empirically determined the optimal threshold to be 0.13 by varying the threshold between 0 and 1 in steps of 0.01 and validating against the training data using the challenge scoring metric.

3. Results

Table 4 shows the challenge scores obtained by the above models trained on the different lead sets on the local validation set, the hidden validation set, and the hidden test set, along with the rankings.

Leads  Training  Validation  Test  Ranking
12     0.75      0.65        0.55  2
6      0.70      0.59        0.51  5
4      0.72      0.63        0.56  3
3      0.73      0.63        0.55  3
2      0.70      0.61        0.56  3

Table 4. Challenge scores for the final accepted entry (team NIMA) on the local validation set, hidden validation set, and hidden test set, along with the rankings

Table 5 shows the F1 scores obtained for each class on the local validation set.

Dx           F1 Score   Dx         F1 Score
AF           0.59       PAC/SVPB   0.57
AFL          0.83       PR         0.89
BBB          0.28       PRWP       0.31
Brady        0.62       PVC/VPB    0.63
CLBBB/LBBB   0.75       LPR        0.46
CRBBB/RBBB   0.85       LQT        0.48
IAVB         0.68       QAb        0.35
IRBBB        0.51       RAD        0.62
LAD          0.72       SA         0.68
LAnFB        0.72       SB         0.96
LQRSV        0.50       STach      0.94
NSIVCB       0.38       TAb        0.59
NSR          0.91       TInv       0.49

Table 5. F1 scores obtained by the final model for the 26 scored diagnoses on the local validation set

4. Discussion and Conclusions

The classifier performs with scores ranging from 0.59 to 0.65 on the hidden validation set. Overall, the 6-lead and 2-lead sets, which do not contain lead V2, show the lowest scores. Lead V2 therefore appears to play an important role in classifying cardiac abnormalities. The performance on the hidden test set is lower than on the hidden validation set but similar across all lead sets.

Multi-label data imbalance is especially hard to address because upsampling one of the rare classes may result in oversampling one of the abundant classes, and vice versa. Upsampling and downsampling the data, both as individual classes and as supersets of all 26 classes, to create balanced datasets did not yield better performance. Therefore, to address the class imbalance, we used a single optimized threshold on the sigmoid output.

Having a shorter input array reduced the training time of the model significantly while preserving informative features in the signals. Therefore, we resampled the 20-second recordings at 100 Hz. We did not include any filtering steps to clean the ECG, since filtering may eliminate discriminating features between classes. Further investigation is required into whether filtering improves performance.

Model variation                          Score
Final model                              0.745
Replacing spatial dropout with dropout   0.733
Without using the frequency domain       0.734
Without using the time domain            0.679

Table 6. Challenge scores for different variations of our final model on the local validation set

Using the frequency domain as an input in addition to the time domain, using spatial dropouts, using numbers of convolutional filters in multiples of the number of leads, using large initial kernel sizes, using Swish activation instead of Rectified Linear Unit activation, and using residual networks through skip connections improved the performance and stability of the network (Table 6).

Acknowledgments

The authors would like to thank the Sustainable Education Foundation, Sri Lanka for facilitating the collaboration and Richie Wheelock for proofreading.

References

[1] Virani SS, Alonso A, Aparicio HJ, Benjamin EJ, Bittencourt MS, Callaway CW, et al. Heart Disease and Stroke Statistics – 2021 Update: a Report from the American Heart Association. Circulation 2021;143(8):e254–e743.
[2] Perez Alday EA, Gu A, Shah A, Robichaux C, Wong AKI, Liu C, et al. Classification of 12-lead ECGs: the PhysioNet/Computing in Cardiology Challenge 2020. Physiological Measurement 2020;41.
[3] Reyna MA, Sadr N, Perez Alday EA, Gu A, Shah A, Robichaux C, et al. Will Two Do? Varying Dimensions in Electrocardiography: the PhysioNet/Computing in Cardiology Challenge 2021. Computing in Cardiology 2021;48:1–4.
[4] Aurore L, Ana M, Pablo MJ, Pablo L, Blanca R. Computational Techniques for ECG Analysis and Interpretation in Light of their Contribution to Medical Advances, 2018.
[5] Ribeiro AH, Ribeiro MH, Paixão GMM, et al. Automatic Diagnosis of the 12-lead ECG using a Deep Neural Network, 2020.
[6] Liu F, Liu C, Zhao L, Zhang X, Wu X, Xu X, et al. An Open Access Database for Evaluating the Algorithms of Electrocardiogram Rhythm and Morphology Abnormality Detection. Journal of Medical Imaging and Health Informatics 2018;8(7):1368–1373.
[7] Bousseljot R, Kreiseler D, Schnabel A. Nutzung der EKG-Signaldatenbank CARDIODAT der PTB über das Internet. Biomedizinische Technik 1995;40(S1):317–318.
[8] Wagner P, Strodthoff N, Bousseljot RD, Kreiseler D, Lunze FI, Samek W, et al. PTB-XL, a Large Publicly Available Electrocardiography Dataset. Scientific Data 2020;7(1):1–15.
[9] Tihonenko V, Khaustov A, Ivanov S, Rivin A, Yakushenko E. St Petersburg INCART 12-lead Arrhythmia Database. PhysioBank, PhysioToolkit, and PhysioNet 2008; doi:10.13026/C2V88N.
[10] Zheng J, Zhang J, Danioko S, Yao H, Guo H, Rakovski C. A 12-lead Electrocardiogram Database for Arrhythmia Research Covering More Than 10,000 Patients. Scientific Data 2020;7(48):1–8.
[11] Zheng J, Cui H, Struppa D, Zhang J, Yacoub SM, El-Askary H, et al. Optimal Multi-Stage Arrhythmia Classification Approach. Scientific Reports 2020;10(2898):1–17.
[12] Ramachandran P, Zoph B, Le QV. Searching for Activation Functions, 2017.
[13] Tompson J, Goroshin R, Jain A, LeCun Y, Bregler C. Efficient Object Localization Using Convolutional Networks, 2015.

Address for correspondence:

Nima L. Wickramasinghe
195/2, Galgediyawa, Gampola, Sri Lanka
[email protected]
