
Signal, Image and Video Processing (2018) 12:941–949

https://doi.org/10.1007/s11760-018-1237-5

ORIGINAL PAPER

Intelligent hybrid approaches for human ECG signals identification


Mahmoud M. Bassiouni (1) · El-Sayed A. El-Dahshan (1,2) · Wael Khalefa (3) · Abdelbadeeh M. Salem (3)

Received: 15 January 2017 / Revised: 12 December 2017 / Accepted: 9 January 2018 / Published online: 10 February 2018
© Springer-Verlag London Ltd., part of Springer Nature 2018

Abstract
This paper presents hybrid approaches for human identification based on the electrocardiogram (ECG). The proposed approaches consist of four phases, namely data acquisition, preprocessing, feature extraction and classification. In the first phase, the data acquisition phase, data sets are collected from two different databases, the ECG-ID and the MIT-BIH Arrhythmia database. In the second phase, noise reduction of the ECG signals is performed using the wavelet transform and a series of de-noising filters. In the third phase, features are obtained using three different intelligent approaches: a non-fiducial approach, a fiducial approach and a fusion of the two. In the last phase, the classification phase, three classifiers are developed to identify subjects. The first classifier is based on an artificial neural network (ANN), the second on the K-nearest neighbor (KNN) rule with the Euclidean distance, and the last on a support vector machine (SVM). A classification accuracy of 95% is obtained for the ANN, 98% for the KNN and 99% for the SVM on the ECG-ID database, while 100% is obtained for the ANN, KNN and SVM on the MIT-BIH Arrhythmia database. The results show that the proposed approaches are robust and effective compared with other recent works.

Keywords Biometric · ECG signals · Intelligent hybrid approach · Machine learning · Fiducial and non-fiducial

Corresponding author: Mahmoud M. Bassiouni ([email protected])
1 Egyptian E-Learning University, El-Giza, Egypt
2 Department of Physics, Faculty of Science, Ain Shams University, Abbassia, Cairo, Egypt
3 Faculty of Computer and Information Science, Ain Shams University, Cairo, Egypt

1 Introduction

Biometric recognition provides an important tool for security by identifying an individual based on physiological or behavioral characteristics [1]. The electrocardiogram (ECG) is one such biometric tool: it describes the electrical activity of the heart, which is related to the impulses that travel through the heart. An ECG signal of a normal heartbeat consists of a P wave, a QRS complex and a T wave [2–6].

There are many systems that use the ECG as a biometric. These ECG-based biometric systems can be categorized into fiducial [7,8] or non-fiducial [9–13] systems, according to the approach used for feature extraction. Some of the existing systems combine fiducial and non-fiducial features in a hierarchical manner to improve performance [2,12,13].

This study introduces three different approaches to feature extraction. The first is a non-fiducial approach, the second is a fiducial approach, and the third is a fusion of the fiducial and non-fiducial approaches, which achieves a higher accuracy than either approach alone. The paper is organized as follows: Sect. 2 describes the data sets, the preprocessing and the feature extraction techniques used, as well as the classification algorithms applied to ECG identification. Sect. 3 presents the experimental results and discussion. Finally, Sect. 4 presents the conclusion.

2 Methodology

The methodology of a biometric system usually mimics that of a pattern recognition system. Thus, it can be divided into four main phases: (1) data acquisition, (2) preprocessing, (3) feature extraction and (4) classification (subject identification).


2.1 Data acquisition

Data sets are collected from two databases. The first is the ECG-ID [14] database, containing 310 ECG recordings obtained from 90 persons. The second is the MIT-BIH Arrhythmia [15] database, containing 47 subjects, of which 30 are used in classification. Those 30 subjects are classified according to the different heart beats in this database. Six types of ECG heart beats are selected: one normal beat (NORMAL) and five ECG arrhythmias, namely premature ventricular contraction (PVC), paced beat (PACE), right bundle branch block beat (RBBB), left bundle branch block beat (LBBB) and atrial premature contraction (APC).

2.2 Preprocessing

Preprocessing is the process of de-noising and filtering the signal by removing the most common types of noise, so that features can be extracted from a clean signal. There are three types of noise in the ECG signal: power-line noise, high-frequency noise and baseline drift. As a result of a series of experiments, the following combination of methods has been selected for the preprocessing phase.

Firstly, baseline drift correction is done using wavelet decomposition. Donoho and Johnstone proposed the universal 'VisuShrink' threshold given by [16]

Thr = σ √(2 log(N))    (1)

where N is the number of data points and σ is an estimate of the noise level. The wavelet-based de-noising process is summarized as follows: the discrete wavelet transform (DWT) detail coefficients are thresholded with the soft shrinkage strategy, and reconstructing the sequence from the thresholded wavelet coefficients removes the baseline drift. Baseline drift correction uses wavelet decomposition with the 'db8' wavelet at level N = 9 and a soft threshold of 4.29. Secondly, an adaptive band-stop filter suppresses the power-line noise, with stop-band corner frequency Ws = 50 Hz. Then, a low-pass Butterworth filter is used with pass-band corner frequency Wp = 40 Hz, stop-band corner frequency Ws = 60 Hz, pass-band ripple Rp = 0.1 dB and stop-band attenuation Rs = 30 dB to remove the remaining noise components caused by possible high-frequency distortions. The last step is smoothing the signal with N = 5, where N is the smoothing window length, to produce the preprocessed signal. Figure 1a shows the original ECG signal and Fig. 1b its de-noised version.

Fig. 1 a ECG-ID person 1 signal 1 before preprocessing and b ECG-ID person 1 signal 1 after preprocessing
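A minimal Python sketch of this preprocessing chain is given below (the authors used the MATLAB wavelet toolbox; this is not their implementation). The sampling rate fs = 500 Hz, the notch quality factor and the helper name are assumptions, while the wavelet name, decomposition level, threshold rule and filter corner specifications follow the text above.

import numpy as np
import pywt
from scipy.signal import butter, buttord, filtfilt, iirnotch

def preprocess_ecg(x, fs=500.0):
    # 1) Baseline-drift removal: level-9 'db8' decomposition; detail coefficients
    #    are soft-thresholded with the universal (VisuShrink) threshold of Eq. (1)
    #    and the coarse approximation (the drift) is dropped before reconstruction.
    coeffs = pywt.wavedec(x, 'db8', level=9)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745          # noise estimate from the finest details
    thr = sigma * np.sqrt(2.0 * np.log(len(x)))              # Eq. (1)
    details = [pywt.threshold(c, thr, mode='soft') for c in coeffs[1:]]
    x = pywt.waverec([np.zeros_like(coeffs[0])] + details, 'db8')[:len(x)]

    # 2) Band-stop (notch) filter around the 50 Hz power-line frequency.
    b, a = iirnotch(w0=50.0, Q=30.0, fs=fs)
    x = filtfilt(b, a, x)

    # 3) Low-pass Butterworth filter designed from Wp = 40 Hz, Ws = 60 Hz,
    #    Rp = 0.1 dB and Rs = 30 dB, as specified in the text.
    n, wn = buttord(wp=40.0, ws=60.0, gpass=0.1, gstop=30.0, fs=fs)
    b, a = butter(n, wn, btype='low', fs=fs)
    x = filtfilt(b, a, x)

    # 4) Smooth with a 5-sample moving average.
    return np.convolve(x, np.ones(5) / 5.0, mode='same')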
2.3 Feature extraction

Feature extraction is the process of finding a transformation that converts the ECG signal into a relatively low-dimensional feature space. In the following subsections, we discuss the three different types of feature extraction. Firstly, the non-fiducial features, which capture holistic patterns by examining the ECG data in the frequency domain, are discussed. Secondly, the fiducial features, which capture local patterns by measuring distances, amplitudes and angles of the ECG data, are tackled. Finally, the fusion between the fiducial and non-fiducial approaches is discussed in detail.

2.3.1 Non-fiducial approach based on autocorrelation/discrete cosine transform (AC/DCT)

The AC/DCT method considers a window of ECG heart beats: it creates a window of samples from the ECG signal with a predefined length N, greater than a complete ECG heart beat, to ensure that all of the ECG peaks are included in the window. In our case, a window of 10 s is appropriate for our feature extraction method. The method combines all of the samples found in the window into a sum of products. The normalized autocorrelation used in our feature extraction technique is

R̂xx[m] = ( Σ_{i=0}^{N−|m|−1} x[i] · x[i+m] ) / R̂xx[0]    (2)

where x[i] is the windowed ECG, x[i+m] is the time-shifted version of the windowed ECG with time lag m = 0, 1, ..., (M − 1), and M is much smaller than N. M is a parameter to be chosen, R̂xx[m] is the autocorrelation sequence and R̂xx[0] is the average power, which must be greater than 0 [17]. The DCT is then used to reduce the dimensionality of the produced features, keeping only the most important coefficients, as shown in Fig. 2a–d.
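As a rough illustration (not the authors' code), the AC/DCT feature computation can be sketched as follows. The number of lags M = 400 follows Fig. 2 and the K = 30 retained coefficients follow the feature counts given in Sect. 2.3.3; the helper name and mean removal are our assumptions.

import numpy as np
from scipy.fftpack import dct

def ac_dct_features(window, M=400, K=30):
    x = window - np.mean(window)
    ac = np.correlate(x, x, mode='full')[len(x) - 1:]   # non-negative lags 0 .. N-1
    ac = ac[:M] / ac[0]                                  # normalized AC of Eq. (2); ac[0] is the average power
    return dct(ac, norm='ortho')[:K]                     # keep the K leading DCT coefficients

# usage sketch: feats = ac_dct_features(window_10s), where window_10s is 10 s of the preprocessed signal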


Fig. 2 a ECG-ID subject 1 preprocessed signal, b the normalized autocorrelation sequence, c zoom on the 400 AC coefficients from the maximum and d DCT of the 400 AC coefficients from 10 ECG windows, including the one on top

Fig. 3 a ECG-ID subject 1 detected peaks (P, Q, R, S and T) and b a zoomed view of the ECG-ID subject 1 preprocessed signal with the detected peaks

2.3.2 Fiducial approach based on a modified super set

Fiducial features represent durations, amplitude differences and angles between 11 fiducial points detected in each heartbeat. These points are three peaks (P, R and T), two valleys (Q and S) and six onsets and offsets.

Fig. 4 Variants of cardiac cycle information fragments [25]

Peaks detection Our feature extraction starts with detecting the QRS complex using a chain of filters: cancelation of the DC shift and normalization, low-pass filtering, high-pass filtering, a derivative filter, squaring and moving-window integration. This produces a vector of zeros and ones with the same size as the preprocessed signal, in which the ones mark the QRS intervals in the whole signal. This vector is split into two vectors, left and right, marking the start of the Q peaks and the start of the S peaks, respectively. Then, a window of 60 samples is created before the left edge (Q peak) and its maximum sample is taken as the P peak. Another window of 125 samples is created after the right edge (S peak) and its maximum value is taken as the T peak. The R peak is detected as the maximum value between the left and right edges. Q is detected by selecting the minimum value between the left edge (start of Q) and the R location, and S by selecting the minimum value between the R location and the right edge (start of S). We have therefore detected the P, Q, R, S and T locations and peaks, as well as the R interval (the samples from Q to S), as shown in Fig. 3a, b.
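A simplified, Pan-Tompkins-style sketch of this detection procedure is shown below, assuming fs = 500 Hz. The band-pass corner frequencies, integration window and detection threshold are our assumptions; the 60- and 125-sample search windows for the P and T peaks follow the text.

import numpy as np
from scipy.signal import butter, filtfilt

def detect_pqrst(x, fs=500.0):
    x = (x - np.mean(x)) / (np.max(np.abs(x)) + 1e-12)       # DC cancelation + normalization
    b, a = butter(3, [5.0, 15.0], btype='band', fs=fs)         # low-pass + high-pass stage
    y = filtfilt(b, a, x)
    y = np.diff(y, prepend=y[0]) ** 2                          # derivative filter + squaring
    w = int(0.15 * fs)
    y = np.convolve(y, np.ones(w) / w, mode='same')            # moving-window integration
    qrs = (y > 0.35 * y.max()).astype(int)                     # ones over the QRS regions
    left = np.flatnonzero(np.diff(qrs) == 1)                   # rising edges: start of Q
    right = np.flatnonzero(np.diff(qrs) == -1)                 # falling edges: end of the QRS region
    if right.size and left.size and right[0] < left[0]:
        right = right[1:]
    beats = []
    for l, r in zip(left, right):
        if l < 60 or r + 125 >= len(x):
            continue                                           # P/T search windows must fit in the record
        R = l + np.argmax(x[l:r + 1])                          # max between left and right edges
        Q = l + np.argmin(x[l:R + 1])                          # min between left edge and R
        S = R + np.argmin(x[R:r + 1])                          # min between R and right edge
        P = (l - 60) + np.argmax(x[l - 60:l])                  # max in the 60 samples before the QRS
        T = r + np.argmax(x[r:r + 125])                        # max in the 125 samples after the QRS
        beats.append({'P': P, 'Q': Q, 'R': R, 'S': S, 'T': T})
    return beats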
ECG peaks selection After the peaks and their locations have been detected, we need to determine the most informative fragment of the signal, i.e. the one with the most significant effect on the final classification outcome. The most important and effective component is the QRS complex, while P and T are considered the most uncertain components: the P wave has a low amplitude and can be affected by noise, while the T wave has a dynamic location that depends on the heart rate. Although the T and P waves can provide useful information for improving the system accuracy, they also complicate the extraction and processing techniques. In order to assess the contribution of the T and P waves, four informative fragments are considered: QRS, P-QRS, QRS-T and P-QRS-T, as shown in Fig. 4. An experiment is carried out to determine which of these fragments is the most informative; from this experiment, the P-QRS-T fragment is selected, as it produces a better performance than the other fragments.

P-QRS-T fragment We want to select the most discriminant fragments, those that best describe the signal and can represent it in the identification process. Accordingly, a number of checks are made on each P-QRS-T fragment in the signal to decide whether to keep it or discard it. Firstly, the R interval of each P-QRS-T fragment is checked: the number of samples in the R interval must be greater than 30 and less than 70, otherwise the fragment is rejected. Next, for each P-QRS-T fragment, we calculate several amplitudes, distances and means, such as the RQ and RS amplitudes, the PR, RT, QS, RQ and RS distances, their means and the mean of the R interval. For each two successive P-QRS-T fragments, the RR distance is calculated, together with the mean and the median of the RR distances. The minimum of the mean and the median of the RR distances is chosen as the RR threshold, as it shows better performance experimentally.


The last step is selecting the most similar P-QRS-T fragments. This is done by applying a set of restrictions, conditions and a weighted sum to the ECG fragments. For each two successive P-QRS-T fragments, if the RR distance is less than 0.9 times the RR threshold, the first P-QRS-T fragment is accepted; otherwise, the first P-QRS-T fragment is rejected. While this thresholding condition is satisfied for each two successive P-QRS-T fragments, starting from the beginning of the ECG signal, we apply a series of conditions, the first being: "If the difference between the R interval of the first P-QRS-T fragment and the mean R interval is less than the difference between the R interval of the second P-QRS-T fragment and the mean R interval, then increase the weight sum of the first P-QRS-T fragment by 0.3; otherwise, increase that of the second fragment." The same is done for the RT distance of each two successive fragments with weight 0.3, and for the PR distance, RQ amplitude, RS amplitude and QS distance with a higher weight of 0.75, as these amplitudes and distances are experimentally more effective. At the end, the fragments that accumulate the highest weights are selected from the ECG signal to represent it. The P-QRS-T fragment length is fixed at 281 samples for each cardiac cycle, regardless of the actual lengths of the PR, QRS and QT intervals: 281 samples (110 samples to the left of the R peak and 170 samples to the right) are extracted and analyzed. Figure 5a shows the selected and extracted P-QRS-T fragments satisfying the conditions and restrictions imposed on the ECG signals, as illustrated by the sketch below.
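A simplified reading of this selection heuristic is sketched here (hypothetical helper: the exact bookkeeping and the RR-threshold acceptance test are omitted or approximated, and the tie-breaking between successive beats is our assumption). The beats come from detect_pqrst() above, and each returned fragment has the fixed length of 281 samples.

import numpy as np

def select_fragments(x, beats):
    # keep beats whose R interval (Q..S) is 30-70 samples and whose
    # 281-sample window (110 before R, 170 after) fits inside the record
    beats = [b for b in beats
             if 30 < (b['S'] - b['Q']) < 70 and b['R'] >= 110 and b['R'] + 171 <= len(x)]
    weights = np.zeros(len(beats))
    # per-beat criteria: (values, weight) pairs compared between successive beats
    criteria = [
        (np.array([b['S'] - b['Q'] for b in beats], float), 0.3),         # R interval
        (np.array([b['T'] - b['R'] for b in beats], float), 0.3),         # RT distance
        (np.array([b['R'] - b['P'] for b in beats], float), 0.75),        # PR distance
        (np.array([x[b['R']] - x[b['Q']] for b in beats], float), 0.75),  # RQ amplitude
        (np.array([x[b['R']] - x[b['S']] for b in beats], float), 0.75),  # RS amplitude
        (np.array([b['S'] - b['Q'] for b in beats], float), 0.75),        # QS distance
    ]
    for i in range(len(beats) - 1):                     # compare each pair of successive fragments
        for vals, w in criteria:
            dev = np.abs(vals - vals.mean())            # deviation from the per-record mean
            weights[i if dev[i] < dev[i + 1] else i + 1] += w
    order = np.argsort(weights)[::-1]                   # highest-weight fragments first
    return [x[beats[i]['R'] - 110: beats[i]['R'] + 171] for i in order]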
Fig. 5 a P-QRS-T fragments computed from ECG-ID, b mean P-QRS-T fragment computed from ECG-ID subject 1 and c mean P-QRS-T fragment computed from ECG-ID subject 1 with the points Pb, Pe, P, Q, R, S, Tb, T and Te marked

P-QRS-T fragment mean For each ECG record, the P-QRS-T fragments are extracted. Since the P-QRS-T fragment samples are used as informative features, the extracted fragments are processed to enhance their similarity, and their size is 281 samples. Figure 5b shows the mean P-QRS-T fragment obtained from the selected P-QRS-T fragments.

After obtaining the mean fragment, the modified super set features are obtained by detecting the most important points in the mean fragment, namely Pb, Pe, Tb, Te, P, Q, R, S and T. Our modified super set of 38 features, which covers the majority of the features used in the literature, is extracted from the mean heartbeat, as shown in Table 1. These features comprise 21 temporal features (distances between fiducial points), 12 amplitude features and 3 angle features, to which we add 2 spectral features, entropy and energy, as shown in Fig. 5c.

Table 1 The labels for the 38 fiducial features of the modified super set

Temporal features: 1) QR, 2) RS, 3) PbPe, 4) TbTe, 5) PQ, 6) QS, 7) ST, 8) PeR, 9) RTb, 10) PR, 11) RT, 12) PS, 13) QT, 14) PbR, 15) RTe, 16) PbQ, 17) STe, 18) PeQRSb, 19) QRSeTb, 20) PeTe, 21) PT (b and e denote begin and end)
Amplitude features: 22) RQ, 23) PT, 24) PS, 25) PQ, 26) P↑, 27) RP, 28) RT, 29) QS, 30) T↑, 31) TQ, 32) TS, 33) RS (↑ denotes amplitude)
Angle features: 34) <PQR, 35) <QRS, 36) <RST (< denotes angle)
Spectral features: 37) Entropy, 38) Energy
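As an illustration of how a few of the Table 1 features might be computed from the mean fragment and its fiducial points, consider the sketch below. This is our reading, not the authors' code: the fiducial-point dictionary, the particular angle construction and the entropy definition are assumptions.

import numpy as np

def superset_features(m, fid):
    # m: mean P-QRS-T fragment; fid: dict of fiducial indices such as
    # {'Pb':..,'P':..,'Pe':..,'Q':..,'R':..,'S':..,'Tb':..,'T':..,'Te':..}
    f = {}
    # temporal features: distances (in samples) between fiducial points
    f['QR'], f['RS'] = fid['R'] - fid['Q'], fid['S'] - fid['R']
    f['PQ'], f['QT'] = fid['Q'] - fid['P'], fid['T'] - fid['Q']
    f['PbPe'], f['TbTe'] = fid['Pe'] - fid['Pb'], fid['Te'] - fid['Tb']
    # amplitude features: differences of signal values at fiducial points
    f['RQ_amp'] = m[fid['R']] - m[fid['Q']]
    f['RS_amp'] = m[fid['R']] - m[fid['S']]
    f['P_amp'], f['T_amp'] = m[fid['P']], m[fid['T']]
    # angle feature: the QRS angle at R formed by the Q-R and R-S segments
    v1 = np.array([fid['Q'] - fid['R'], m[fid['Q']] - m[fid['R']]])
    v2 = np.array([fid['S'] - fid['R'], m[fid['S']] - m[fid['R']]])
    cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    f['angle_QRS'] = np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))
    # spectral features: Shannon entropy of the normalized energy and total energy
    p = m ** 2 / np.sum(m ** 2)
    f['entropy'] = float(-np.sum(p * np.log2(p + 1e-12)))
    f['energy'] = float(np.sum(m ** 2))
    return f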

 
2.3.3 Fusion between fiducial and non-fiducial approach

This feature extraction is based on a fusion between the fiducial and non-fiducial approaches. It uses the fiducial approach only to obtain the P-QRS-T fragments and their 281-sample mean fragment, without the modified super set features. The mean fragment is then passed to a discrete wavelet transform as a non-fiducial step.

A wavelet can be described by two functions: the scaling function φ(x), also known as the 'father wavelet', and the wavelet or 'mother wavelet' function ψ(x), which undergoes translation and scaling operations to give a self-similar wavelet family:

ψ_{a,τ}(x) = a^(−1/2) ψ( (x − τ) / a )    (3)

There is a limitation to the continuous wavelet and scaling functions, so the DWT was developed. The DWT can be implemented as a set of filter banks comprising a high-pass filter and a low-pass filter, and the signal can be decomposed into many levels using different wavelet families. In our approach, a first-level discrete wavelet decomposition is applied to the mean P-QRS-T fragment using the discrete 'db5' wavelet, a mother wavelet belonging to the Daubechies family [18]. Thus, the 281 samples of the mean P-QRS-T fragment are passed to a first-level DWT decomposition, and a feature vector of 145 values is formed and used as the classifier input for each ECG signal. In summary, the non-fiducial approach uses 30 features generated from the AC/DCT, the fiducial approach using the modified super set produces 38 features, and the fusion between fiducial and non-fiducial produces 145 features using the DWT.
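A minimal sketch of this fusion feature vector follows. Our reading (an assumption) is that the first-level approximation coefficients are kept, which gives exactly 145 values for a 281-sample fragment with 'db5'; the function name is also an assumption.

import pywt

def fusion_features(mean_fragment):
    # single-level DWT of the 281-sample mean P-QRS-T fragment with 'db5'
    cA, cD = pywt.dwt(mean_fragment, 'db5')
    return cA        # approximation coefficients: 145 values for a 281-sample input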
2.4 Classification

Classification is the process in which the extracted features are compared against the stored templates to generate match scores. In the following subsections, we discuss three different types of classifiers, SVM, ANN and KNN, whose identification results are discussed in detail in Sect. 3.

2.4.1 Support vector machine (SVM)

Support vector machines (SVM), introduced by Vapnik, are a powerful technique for pattern classification [19]. The major advantage of the SVM is its ability to classify unknown data points with high accuracy. The classifier performance on small-sample learning problems has been improved by applying sequential minimal optimization (SMO) with a polynomial kernel function [20], and the SVM has shown good generalization performance in many practical applications. The SVM decision function is defined as

F(y) = Σ_{i=1}^{N} α_i K(x_i, y) + b    (4)

where y is the unclassified test vector, x_i are the support vectors, α_i their weights, and b is a constant bias. K(x_i, y) is the kernel function (here a polynomial kernel), which performs an implicit mapping into a high-dimensional feature space.

2.4.2 Artificial neural network (ANN)

The classification operation of the neural network begins with the sum of the products of weights and inputs, plus a bias, at each neuron [21]. Mathematically, the internal activity of this neuron model can be written as

Y_k = f( Σ_{j=1}^{n} w_{jk} Z_j + w_{k0} ),  k = 1, 2, ..., L    (5)

where Z_j = f( Σ_{i=1}^{d} w_{ij} x_i + w_{j0} ) for j = 1, 2, ..., n; x_i are the inputs of the network, w_{ij} the weights between the input and hidden layers, w_{j0} the initial bias of the hidden nodes, and f a transfer function. Z_j are the outputs of the hidden layer, Y_k is the output of the network, w_{jk} is the weight between the hidden and output layers, and w_{k0} is the initial bias of the output layer.

2.4.3 K-nearest neighbor (KNN)

The K-nearest neighbor (KNN) algorithm is a classification method based on the closest training samples [22]. We compare distances using the Euclidean distance. Let N be a test set described by the parameters [N_{1,1}, N_{1,2}, N_{1,3}, ..., N_{j,k}] and M be a training set described as [M_{1,1}, M_{1,2}, M_{1,3}, ..., M_{i,k}]. The decision rule of highest similarity in KNN can then be written as

C = argmax_{C_u} score(N_j, C_u) = Σ_{M_i ∈ KNN(N_j)} Sim(N_j, M_i) · δ(M_i, C_u)    (6)

where C is the label assigned to the test feature N_j; score(N_j, C_u) is the score of class C_u with respect to N_j; Sim(N_j, M_i) is the similarity between N_j and the training feature M_i; and δ(M_i, C_u) indicates whether M_i belongs to class C_u.


3 Experiments and discussion

The experiments are carried out on a platform with a Core i7 processor (3 GHz) and 6 GB of memory, running a 64-bit Windows 8 operating system. The algorithms are developed with the discrete wavelet transform toolbox of MATLAB 2014b (The MathWorks), and the classification algorithms are taken from the Weka software. From the ECG-ID database, which contains 310 recordings, we report results on 308 records: 200 records are used for training and 108 for testing, while two recordings are discarded because their files are corrupted. For the MIT-BIH Arrhythmia database, we report results on 30 subjects. For each subject, 40,000 samples are used for training and 20,000 samples for testing; since each 10,000-sample segment represents a record, this gives 120 records for training and 60 records for testing.

This section provides experimental results comparing the identification performance of the SVM, KNN and ANN classifiers on the two ECG data sets, using the feature extraction algorithms based on the non-fiducial, fiducial and fusion approaches. We have tested our algorithms on two different databases, ECG-ID and MIT-BIH Arrhythmia. Table 2 shows the number of TP (correctly classified subjects) and FP (incorrectly classified subjects) for each of the three classifiers on each of the two databases. Three measurements are used to evaluate the performance of our approaches: accuracy, precision and error rate. The accuracy A (%) is the percentage of correctly classified records, the precision P is the fraction of correctly classified records among the total number of records, and the error rate ER is the fraction of incorrectly classified records among all records:

A = (TP / (TP + FP)) × 100,  P = TP / (TP + FP),  ER = FP / (TP + FP)    (7)
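In code form, Eq. (7) is simply the following (illustrative helper; the TP/FP values in the example call are consistent with the 108 ECG-ID test records and the roughly 99% SVM accuracy reported below, but the call itself is only an example):

def metrics(tp, fp):
    total = tp + fp
    return {'A%': 100.0 * tp / total, 'P': tp / total, 'ER': fp / total}

print(metrics(107, 1))   # 107 correct / 1 wrong -> A% ~ 99.1, P ~ 0.99, ER ~ 0.009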
Table 2 The classification results for ECG-ID and MIT-BIH with the three approaches in feature extraction and classification (TP true positive, FP false positive, P precision, ER error rate, A accuracy %; NFid non-fiducial, Fid fiducial, Fus fusion)

Table 2 shows the identification performance of our approaches. The classification results for the MIT-BIH Arrhythmia database using the non-fiducial and the fiducial approach alone do not reach the highest accuracy, although the accuracy results remain over 90%; with the fusion approach, the accuracy rises and reaches 100% for each of the three classifiers. On the larger ECG-ID database, which contains 90 subjects, the accuracy decreases for both the non-fiducial and the fiducial approach, while with the fusion approach it reaches 99% using SVM, 98% using KNN and about 95% using ANN.

In comparison with previous works that used broadly similar techniques, we can conclude that our approach gives better results. For the ECG-ID database comparison, shown in Table 3, Biel et al. [23] have discussed in their work 20 subjects from the ECG-ID database, with 85 records for training and 50 records for testing, using heartbeat wave amplitudes and interval durations for feature extraction and soft independent modeling of class analogy for classification, achieving an accuracy of 98%. Shen et al. [13] have discussed 20 subjects from the database, with 20 heartbeats for each record and 1 heartbeat for each test ECG, using the QRS complex as well as T wave amplitudes and interval durations for feature extraction; classification with template matching and with a decision-based neural network achieves accuracies of 95 and 80%, respectively, and 100% for their combination.

Table 3 Classification performance comparisons of the proposed scheme with some existing schemes using ECG-ID and MIT-BIH

DB | References | Subjects | Train | Test | Feature extraction | Classification | Results
ECG-ID | Biel et al. [23] | 20 | 85 R | 50 R | Amplitudes and intervals | SIM | A = 98.0%
ECG-ID | Shen et al. [13] | 20 | 20 HB | 1 HB | QRS complex and T wave amplitudes and intervals | TM / DBNN / TM+DBNN | A = 95.0% / 80.0% / 100%
ECG-ID | Yi et al. [24] | 9 | 9 R | 9 R | WT + PCA | PNN | A = 95.0%
ECG-ID | Nemirko and Lugovaya [25] | 90 | 195 R, 6-10 HB | 115 R, 6-10 HB | QRS complex, P and T waves + PCA + WT | NM / WNM / LDA / MVC | A = 87.0% / 94.0% / 96.0% / 96.0%
ECG-ID | Dar et al. [26] | 90 | 20-200 S per subject | - | DWT + variation in heart rate + best first search | RF | A = 83.8%, FAR = 16.1%, FRR = 0.3%
ECG-ID | Chun [27] | 89 | 2 R per subject | - | GF + simple distance measurements | DTW / ED / PCA | IIDR = 4.2, AUC = 0.99, EER = 5.2% / IIDR = 22.6, AUC = 0.98, EER = 2.4% / IIDR = 26.1, AUC = 0.99, EER = 2.4%
ECG-ID | Our proposed method | 90 | 200 R | 108 R | Mean P-QRS-T fragment + DWT | ANN / KNN / SVM | A = 95.0% / 98.0% / 99.0%
MIT-BIH Arrhythmia | Tang and Shu [28] | 10 | 20 HB | 40 HB | WT + RS | BP / RBF / QNN | A = 83.4% / 86.6% / 91.7%
MIT-BIH Arrhythmia | Wang et al. [29] | - | 9800 samples of 8 different HB | - | LDA / PCA / PCA+LDA | LDA+PNN / PCA+PNN / PCA+LDA+PNN | A = 98.2%, Sen = 89.9%, Spec = 98.2% / A = 98.6%, Sen = 97.1%, Spec = 99.0% / A = 99.7%, Sen = 97.8%, Spec = 99.1%
MIT-BIH Arrhythmia | Ting and Salleh [11] | 13 | Data Set I = 8 R, Data Set II = 15 R | - | EKF + temporal and amplitude scoring | Log-likelihood | A = 87.5% for Data Set I, A = 61.5% for Data Set II
MIT-BIH Arrhythmia | Islam et al. [30] | 26 | - | - | Features extracted from heartbeat shape | Chi-squared (χ²) | A = 99.8%, EER = 0.38%
MIT-BIH Arrhythmia | Our proposed method | 30 | 40,000 S | 20,000 S | Mean P-QRS-T fragment + DWT | ANN / KNN / SVM | A = 100% / 100% / 100%

R records, HB heart beats, S samples, TM template matching, DBNN decision-based neural network, WT coefficients of the wavelet, PCA principal component analysis, PNN probabilistic neural network, NM nearest mean, WNM weighted nearest mean, LDA linear discriminant analysis, GF guided filter, IIDR inter/intra-distance ratio, AUC area under curve, EER equal error rate, RS rough set, FAR false acceptance rate, FRR false rejection rate, EKF extended Kalman filtering, SIM soft independent modeling, MVC majority vote classifier, RF random forest, QNN quantum neural network, Sen sensitivity, Spec specificity, DB database


Yi et al. [24] have worked on 9 subjects, with 9 records from one day (30 fragments per record) used for training and 9 records from another day (all fragments per record) used for testing. They use the coefficients of the wavelet decomposition of successive 10-s ECG fragments together with principal component analysis for feature extraction and reduction, and a probabilistic neural network for classification, achieving 95% accuracy. Nemirko and Lugovaya [25] have achieved great progress on this database, using all of its subjects: 195 records with 6 to 10 heartbeats for training and 115 records with 6 to 10 heartbeats for testing. They use samples of the cardiac cycle fragment containing the QRS complex and the P and T waves, with principal component analysis or the wavelet transform for feature extraction and reduction, and linear discriminant analysis with a majority vote classifier for classification, achieving an accuracy of 96%. Dar et al. [26] have presented an identification approach based on DWT features combined with heart rate variability features, reduced with best-first search and classified with a random forest; working on the 90 subjects, they achieve an accuracy of 83.8%. Chun [27] has addressed the challenging ECG-ID database with a guided filter (GF) and simple distance measures (dynamic time warping (DTW), Euclidean distance and PCA) for authentication, working on 89 of the 90 subjects with 2 records per subject and achieving an equal error rate of 2.4% using PCA combined with GF. Our proposed method works on this database using all the subjects, with 200 records (all of their heartbeats) for training and 108 records for testing, and achieves an accuracy of 99%, higher than the previous studies.

For the MIT-BIH Arrhythmia database comparison, also shown in Table 3, Tang and Shu [28] have used in their work 10 subjects of the MIT-BIH Arrhythmia database, with the wavelet transform and a rough set for feature extraction and reduction and a quantum neural network for classification, achieving an accuracy of 91%. Wang et al. [29] have discussed the MIT-BIH Arrhythmia database, working on 9800 samples from eight different heart beat types and using principal component analysis and linear discriminant analysis for feature extraction with a probabilistic neural network for classification, achieving about 99.71%. Ting and Salleh [11] have worked on 13 subjects, using extended Kalman filtering and log-likelihood scoring for classification, achieving an accuracy of 87.50%. Islam et al. [30] have addressed 26 subjects, using heartbeat shape (HBS) features and achieving an accuracy of 99.85%. Most of the previous studies have focused on a small number of subjects and a small number of samples; our proposed method focuses on 30 of the 47 subjects, covering different heart beat types, and uses 40,000 samples for training and 20,000 samples for testing per subject, achieving an accuracy of 100%.

Our contribution in this paper is to prove that the ECG can be used as a biometric. This goal is achieved through a number of important points. Firstly, we have worked on a large database consisting of 90 subjects and have achieved a high accuracy of about 99%. Secondly, we have addressed another database containing cardiac diseases to show the strength of our fusion approach, achieving a performance of about 100%. Thirdly, we have used a large number of samples and ECG heart beats in training and testing; other systems use only a small number of ECG heart beats and achieve a high accuracy, but their performance starts to degrade when the number of heart beats increases. Fourthly, we have shown the use of a non-fiducial approach, a fiducial approach and a fusion between them, and how the fusion increases the performance. The times consumed are 1.2, 0.75 and 1.1 min for the non-fiducial, fiducial and fusion approaches, respectively, on the largest database used.

4 Conclusions

This paper proposes hybrid approaches for ECG-based identification. The proposed approaches contain data acquisition, preprocessing, feature extraction and classification phases. ECG signals obtained from the MIT-BIH and ECG-ID databases are used for the training and testing processes. We have applied three different methods for feature extraction, based on the non-fiducial, fiducial and fusion approaches, and three classifiers, namely SVM, ANN and KNN. The results of the system are compared with other methods; according to this comparison, the proposed method is able to provide robust ECG signal classification. The results show an accuracy of 100% for MIT-BIH using SVM, ANN and KNN, and of 99% for ECG-ID using SVM and 98% using KNN. Our main strength is the fusion approach between fiducial and non-fiducial features.

With the help of the above approaches, one can develop software for a biometric system based on the ECG signals of different individuals. We have worked on a database that contains a large set of subjects and achieved a high accuracy, to ensure that the ECG can be applied in security system applications. Further studies are ongoing to improve the classification accuracy and to work on larger data sets, in order to create a generalized system for ECG identification.

Acknowledgements The authors express their special thanks to the Editor-in-Chief, anonymous referees and the production editor for their cooperative comments that enhanced the manuscript.

References

1. Jain, K., Ross, A., Prabhakar, S.: An introduction to biometric recognition. IEEE Trans. Circuits Syst. Video Technol. 14(1), 4-20 (2004)
2. Wang, Y., Agrafioti, F., Hatzinakos, D., Plataniotis, K.N.: Analysis of human electrocardiogram for biometric recognition. EURASIP J. Adv. Signal Process. 2008(1), 1-11 (2007)


3. Hegde, C., Prabhu, H.R., Sagar, D.S., Shenoy, P.D., Venugopal, K.R., Patnaik, L.M.: Heartbeat biometrics for human authentication. Signal Image Video Process. 5(4), 485-493 (2011)
4. Agrafioti, F., Hatzinakos, D.: ECG biometric analysis in cardiac irregularity conditions. Signal Image Video Process. 3(4), 329-337 (2009)
5. Abo-Zahhad, M., Ahmed, S.M., Abbas, S.N.: Biometric authentication based on PCG and ECG signals: present status and future directions. Signal Image Video Process. 8(4), 739-751 (2014)
6. Porée, F., Kervio, G., Carrault, G.: ECG biometric analysis in different physiological recording conditions. Signal Image Video Process. 10(2), 267-276 (2016)
7. Singla, S., Sharma, A.: ECG based biometrics verification system using LabVIEW. Songklanakarin J. Sci. Technol. 32, 241-246 (2010)
8. Tantawi, M., Revett, K., Tolba, M.F., Salem, A.: Fiducial feature reduction analysis for electrocardiogram (ECG) based biometric recognition. J. Intell. Inf. Syst. 40(1), 17-39 (2013)
9. Roshan, J., Rajendra, U.: ECG beat classification using PCA, LDA, ICA and discrete wavelet transform. Biomed. Signal Process. Control 8(5), 437-448 (2013)
10. Tantawi, M., Revett, K., Salem, A.B., Tolba, M.F.: A wavelet feature extraction method for electrocardiogram (ECG)-based biometric recognition. Signal Image Video Process. 9(6), 1271-1280 (2013)
11. Ting, C., Salleh, S.: ECG based personal identification using extended Kalman filter. In: 10th International Conference on Information Sciences, Signal Processing and their Applications, pp. 774-777 (2010)
12. Sufi, F., Khalil, I., Habib, I.: Polynomial distance measurement for ECG based biometric authentication. Secur. Commun. Netw. 3(4), 303-319 (2008)
13. Shen, T.W., Tompkins, W.J., Hu, Y.H.: One-lead ECG for identity verification. In: Proceedings of the 2nd Joint EMBS/BMES Conference, pp. 62-63 (2002)
14. https://physionet.org/physiobank/database/ecgiddb/. ECG-ID database. Accessed 16 Nov 2017
15. https://www.physionet.org/physiobank/database/mitdb/. MIT-BIH Arrhythmia database. Accessed 16 Nov 2017
16. Donoho, D.L., Johnstone, J.M.: Ideal spatial adaptation by wavelet shrinkage. Biometrika 81(3), 425-455 (1994)
17. Hejazi, M., Al-Haddad, S.A.R., Singh, Y.P., Hashim, S.J., Aziz, A.F.A.: ECG biometric authentication based on non-fiducial approach using kernel methods. Digit. Signal Process. 52, 72-86 (2016)
18. Daubechies, I.: Ten Lectures on Wavelets. Phila. Soc. Ind. Appl. Math. 61, 198-202 (1992)
19. Vapnik, V.: The Nature of Statistical Learning Theory. Springer, Berlin (2013)
20. Platt, J.C.: Fast training of support vector machines using sequential minimal optimization. In: Advances in Kernel Methods, pp. 185-208 (1999)
21. Haykin, S.: Neural Networks: A Comprehensive Foundation, 2nd edn. Prentice Hall, Upper Saddle River (1999)
22. Araújo, T., Nunes, N., Gamboa, H., Fred, A.: Generic biometry algorithm based on signal morphology information. In: Pattern Recognition: Applications and Methods, pp. 301-310. Springer, Berlin (2015)
23. Biel, L., Pettersson, O., Philipson, L., Wide, P.: ECG analysis: a new approach in human identification. IEEE Trans. Instrum. Meas. 50(3), 808-812 (2001)
24. Yi, W.J., Park, K.S., Jeong, D.U.: Personal identification from ECG measured without body surface electrodes using probabilistic neural networks. In: World Congress on Medical Physics and Biomedical Engineering (2003)
25. Nemirko, A.P., Lugovaya, T.S.: Biometric human identification based on electrocardiogram. In: Proceedings of the XII-th Russian Conference on Mathematical Methods of Pattern Recognition, Moscow, MAKS Press, pp. 387-390. ISBN 5-317-01445-X (2005)
26. Dar, M.N., Akram, M.U., Shaukat, A., Khan, M.A.: ECG based biometric identification for population with normal and cardiac anomalies using hybrid HRV and DWT features. In: (ICITCS), pp. 1-5 (2015)
27. Chun, S.Y.: Single pulse ECG-based small scale user authentication using guided filtering. In: International Conference on Biometrics (ICB), pp. 1-7. IEEE (2016)
28. Tang, X., Shu, L.: Classification of electrocardiogram signals with RS and quantum neural networks. Int. J. Multimed. Ubiquitous Eng. 9(2), 363-372 (2014)
29. Wang, J.S., Chiang, W.C., Hsu, Y.L., Yang, Y.T.C.: ECG arrhythmia classification using a probabilistic neural network with a feature reduction method. Neurocomputing 116, 38-45 (2013)
30. Islam, M.S., Alajlan, N., Bazi, Y., Hichri, H.S.: HBS: a novel biometric feature based on heartbeat morphology. IEEE Trans. Inf. Technol. Biomed. 16(3), 445-453 (2012)

