https://doi.org/10.1007/s11760-018-1237-5
ORIGINAL PAPER
Received: 15 January 2017 / Revised: 12 December 2017 / Accepted: 9 January 2018 / Published online: 10 February 2018
© Springer-Verlag London Ltd., part of Springer Nature 2018
Abstract
This paper presents hybrid approaches for human identification based on electrocardiogram (ECG). The proposed approaches
consist of four phases, namely data acquisition, preprocessing, feature extraction and classification. In the first phase, data
acquisition phase, data sets are collected from two different databases, ECG-ID and MIT-BIH Arrhythmia database. In the
second phase, noise reduction of ECG signals is performed by using the wavelet transform and a series of denoising filters.
In the third phase, features are obtained by using three different intelligent approaches: a non-fiducial approach, a fiducial approach and
a fusion of the two. In the last phase, the classification phase, three classifiers are developed to classify
subjects. The first classifier is based on an artificial neural network (ANN). The second classifier is based on K-nearest neighbor
(KNN), relying on Euclidean distance. The last classifier is a support vector machine (SVM). A classification accuracy of 95% is
obtained for ANN, 98% for KNN and 99% for SVM on the ECG-ID database, while 100% is obtained for ANN, KNN and
SVM on the MIT-BIH Arrhythmia database. The results show that the proposed approaches are robust and effective compared
with other recent works.
Keywords Biometric · ECG signals · Intelligent hybrid approach · Machine learning · Fiducial and non-fiducial
Mahmoud M. Bassiouni
[email protected]
1 Egyptian E-Learning University, El-Giza, Egypt
2 Department of Physics, Faculty of Science, Ain Shams University Abbassia, Cairo, Egypt
3 Faculty of Computer and Information Science, Ain Shams University, Cairo, Egypt

2 Methodology
A methodology of a biometric system usually mimics that of a pattern recognition system. Thus, it can be divided into four main phases: (1) data acquisition, (2) preprocessing, (3) feature extraction and (4) classification (subject identification).
942 Signal, Image and Video Processing (2018) 12:941–949
Fig. 2 a ECG-ID subject 1 the preprocessed signal, b the normalized autocorrelation sequence, c zoomed in to 400 AC coefficients from the maximum and d DCT of the 400 AC coefficient from 10 ECG windows including the one on top
Fig. 3 a ECG-ID subject 1 the detected peaks (P, Q, R, S, and T) and b a zoomed ECG-ID subject 1 preprocessed signal detected
$$\phi_{a,\tau}(x) = a^{-1/2}\,\phi\left(\frac{x-\tau}{a}\right) \tag{3}$$

There is a limitation for the wavelet and scaling function, so DWT was developed. DWT can be implemented as a set of filter banks, comprising a high-pass filter and low-pass filter. The signal can be decomposed into many levels using different families of wavelets. In our approach, first-level discrete wavelet decomposition is applied to the mean P-QRS-T fragment by using discrete 'db5' wavelet. This mother wavelet belongs to Daubechies wavelet family [18]. Thus, regarding the mean P-QRS-T, about 281 samples are passed to first level DWT decomposition. Then, a feature vector is formed by combining fiducial and non-fiducial approach, forming a feature vector that consists of 145 samples used as input to the classifiers for each ECG signal. The non-fiducial approach uses 30 features generated from the (AC/DCT). For the fiducial approach using the modified super set, 38 features are produced. The fusion between fiducial and non-fiducial produces 145 features, using DWT.

2.4.1 Support vector machine (SVM)
Support vector machines (SVM) are a powerful technique for pattern classification by Vapnik [19].
The major advantage of SVM is its ability to classify unknown data points with high accuracy. The classifier performances for small sample learning problems have been improved by applying sequential minimal optimization (SMO) and polynomial kernel function idea [20]. The SVM has shown a better generalization performance in many practical applications. The SVM decision function is defined as follows:

2.4.2 Artificial neural network (ANN)
The classification operation of the neural network begins with sum of multiplication of weights and inputs, plus bias at the neuron [21]. Mathematically, the equation from this model of the neuron interval activity can be shown as:

$$Y_k = f\left(\sum_{j=1}^{n} w_{jk} Z_j + w_{k0}\right) \quad \text{for } k = 1, 2, \ldots, L \tag{5}$$

where $Z_j = f\left(\sum_{i=1}^{d} w_{ij} x_i + w_{j0}\right)$ for $j = 1, 2, \ldots, n$; $x_i$ are the inputs of the network, $w_{ij}$ the weights between the input and hidden layer, $w_{j0}$ is the initial bias of hidden nodes and $f$ is some transfer function. $Z_j$ the outputs of the hidden layer. $Y_k$ is the output of the network, $w_{jk}$ is the weight between the hidden and output layer, and $w_{k0}$ is the initial bias of the output layer.

2.4.3 K-nearest neighbor (KNN)

$$C = \arg\max_{C_u} \mathrm{score}(N_j, C_u) = \arg\max_{C_u} \sum_{N_j \in \mathrm{KNN}(M_i)} \mathrm{Sim}(N_j, M_i)\,\delta(M_i, C_u) \tag{6}$$

where $C$ is the label assigned to the test feature $N_j$; $\mathrm{score}(N_j, C_u)$ is the score of the class $C_u$ with respect to $N_j$; $\mathrm{Sim}(N_j, M_i)$ is the maximum similarity between $N_j$ and the training feature $M_i$; $\delta(M_i, C_u)$ indicates whether $M_i$ is a part of class $C_u$.

3 Experiments and discussion
files have been corrupted. For MIT-BIH Arrhythmia, we have shown our results on 30 subjects. For each subject, 40,000 samples are used for training and 20,000 samples are used for testing. This leads to 120 records for training and 60 records for testing. Note that each 10,000 samples represent a record.

This section provides experimental results that consider the identification performance comparison of SVM, KNN and ANN classifiers. This will be on the ECG data sets, predefined by using several different feature extraction algorithms based on non-fiducial, fiducial and fusion approach between them. We have tested our algorithms on two different databases: ECG-ID and MIT-BIH Arrhythmia. Table 2 shows the number of the TP (number of correctly classified subjects) and FP (number of incorrectly classified subjects) from each of the three classifiers on each of the two databases. Three measurements are used to evaluate the performance of our approaches: accuracy, precision and error rate. Accuracy % (A) is the percentage of correctly classified records, and precision (P) is the fraction of the correctly classified records among the total amount of records. Finally, the error rate (ER) is the total error value and it is considered the fraction of the incorrectly classified among all of the records.

$$A = \frac{TP}{TP + FP} \times 100, \quad P = \frac{TP}{TP + FP}, \quad ER = \frac{FP}{TP + FP} \tag{7}$$

Table 2 shows the identification performance of our approaches. The classification results for the MIT-BIH Arrhythmia database, using Non-fiducial and fiducial approach have not achieved a high accuracy by using each feature extraction algorithm alone. Although the accuracy results have remained over 90 % while using the fusion approach, the accuracy has risen and has reached 100 % accuracy in each of SVM, KNN and ANN.

In comparison with previous works that used almost the same techniques, we can conclude that our approach has

Table 2 The classification results for ECG-ID and MIT-BIH with the three approaches in feature extraction and classification
Database   Classifier   Feature extraction   TP    FP   P      ER      A%
ECG-ID     SVM          NFid or Fid          56    52   0.51   0.48    51.8
ECG-ID     SVM          NFid or Fid          78    30   0.72   0.27    72.2
ECG-ID     SVM          Fus                  107   1    0.99   0.009   99.0
ECG-ID     KNN          NFid or Fid          57    51   0.52   0.47    52.7
ECG-ID     KNN          NFid or Fid          76    32   0.70   0.29    70.3
ECG-ID     KNN          Fus                  106   2    0.98   0.01    98.1
ECG-ID     ANN          NFid or Fid          67    41   0.62   0.37    62.0
ECG-ID     ANN          NFid or Fid          88    20   0.81   0.18    81.4
ECG-ID     ANN          Fus                  103   5    0.95   0.04    95.3
MIT-BIH    SVM          NFid or Fid          55    5    0.91   0.08    91.0
MIT-BIH    SVM          NFid or Fid          59    1    0.93   0.01    98.3
MIT-BIH    SVM          Fus                  60    0    1      0.0     100
MIT-BIH    KNN          NFid or Fid          53    7    0.88   0.11    88.3
MIT-BIH    KNN          NFid or Fid          59    1    0.93   0.01    98.3
MIT-BIH    KNN          Fus                  60    0    1.0    0.0     100
MIT-BIH    ANN          NFid or Fid          58    2    0.96   0.03    96.6
MIT-BIH    ANN          NFid or Fid          59    1    0.93   0.01    98.3
MIT-BIH    ANN          Fus                  60    0    1.0    0.0     100
TP True Positive, FP False Positive, P Precision, ER Error Rate and A Accuracy %, NFid Non-fiducial, Fid Fiducial and Fus Fusion
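The three measures of Eq. (7), and the record counts quoted for MIT-BIH, can be reproduced with a few lines of Python. This is a minimal sketch and not the authors' code: the helper names `identification_metrics` and `record_count` are hypothetical, and the example counts (TP = 107, FP = 1) are taken from the SVM results of Table 2.

```python
def identification_metrics(tp: int, fp: int) -> dict:
    """Accuracy (%), precision and error rate as defined in Eq. (7)."""
    total = tp + fp
    if total == 0:
        raise ValueError("at least one classified record is required")
    return {
        "accuracy_pct": 100.0 * tp / total,  # A  = TP / (TP + FP) * 100
        "precision": tp / total,             # P  = TP / (TP + FP)
        "error_rate": fp / total,            # ER = FP / (TP + FP)
    }


def record_count(subjects: int, samples_per_subject: int,
                 samples_per_record: int = 10_000) -> int:
    """Number of fixed-length records obtained over all subjects."""
    return subjects * (samples_per_subject // samples_per_record)


if __name__ == "__main__":
    # Hypothetical usage with counts from Table 2 (SVM, 108 test records).
    m = identification_metrics(tp=107, fp=1)
    print(f"A = {m['accuracy_pct']:.2f}%  P = {m['precision']:.2f}  "
          f"ER = {m['error_rate']:.3f}")
    # 30 subjects, 40,000 training and 20,000 testing samples per subject.
    print(record_count(30, 40_000), record_count(30, 20_000))
```

With TP = 107 and FP = 1 the accuracy comes out at about 99.07 % and the error rate at about 0.009, in line with the 99.0 % and 0.009 listed in Table 2. Because Eq. (7) uses the same denominator for P and ER, the two always sum to one.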
Table 3 Classification performance comparisons of the proposed scheme with some existing schemes using ECG-ID and MIT-BIH
DB References Subject Train Test Approach Results
Feature extraction Classification
ECG-ID Biel et al. [23] 20 85 R 50R Amplitudes and intervals SIM A = 98.0%
Shen et al. [13] 20 20 HB 1HB QRS complex and T wave TM A = 95.0%
amplitudes and intervals DBNN A = 80.0%
TM+DBNN A = 100%
Yi et al. [24] 9 9R 9R WT + PCA PNN A = 95.0%
Nemirko and Lugovaya 90 195 R 6-10 HB 115R 6-10 HB QRS complex, P and T waves + NN A = 87.0%
[25] PCA + WT WNM A = 94.0%
LDA A = 96.0%
MVC A = 96.0%
Dar et al. [26] 90 20-200 S for each subject (DWT) + variant in heart rate + RF A = 83.8%
best first search
FAR = 16.1% FRR = 0.3%
Chun [27] 89 2R for each Subject GF + Simple distance DTW IIDR = 4.2 AUC = 0.99 EER = 5.2%
measurements
ED IIDR = 22.6 AUC = 0.98 EER = 2.4%
PCA IIDR = 26.1 AUC = 0.99 EER = 2.4%
Our proposed method 90 200 R 108 R Mean P-QRS-T Fragment + ANN A = 95.0%
DWT KNN A = 98.0%
SVM A = 99.0%
MIT-BIH Tang and Shu [28] 10 20 HB 40 HB WT+RS BP, RBF, QNN A = 83.4%, A = 86.6%, A = 91.7%
Arrhythmia
Wang et al. [29] − 9800 samples of 8 different HB LDA LDA + PNN A = 98.2% Sen = 89.9% Spec = 98.2%
PCA PCA+PNN A = 98.6% Sen = 97.1% Spec = 99.0%
PCA + LDA PCA+LDA+PNN A = 99.7% Sen = 97.8% Spec = 99.1%
Ting and Salleh [11] 13 Data Set I = 8R EKF + temporal and amplitude Log-likelihood A = 87.5% for Data Set I
Data Set II = 15R scoring A = 61.5% for Data Set II
Islam et al. [30] 26 − − Features are extracted from chi-squared (χ 2 ) A = 99.8%
heartbeat Shape EER = 0.38%
Our proposed method 30 40000 S 20000 S Mean P-QRS-T Fragment + ANN A = 100%
DWT KNN A = 100%
SVM A = 100%
R records, HB heart beats, S samples, TM template matching, DBNN decision-based neural network, WT coefficients of the wavelet, PCA principal component analysis, PNN probabilistic neural network, NM nearest mean, WNM weighted nearest mean, LDA linear discriminant analysis, GF guided filter, IIDR inter/intra-distance ratio, AUC area under curve, EER equal error rate, RS rough set, FAR false acceptance rate, FRR false rejection rate, EKF extended Kalman filtering, SIM soft independent modeling, MVC majority vote classifier, RF random forest, QNN quantum neural network, Sen sensitivity, Spec specificity, DB database
95, 80%, respectively, and 100% for both. Yi et al. [24] have worked on 9 subjects, using 9 records from one day (30 fragments per record) for training and 9 records from another day (all fragments of each record) for testing. They use the coefficients of the wavelet decomposition of successive 10 s ECG fragments and principal component analysis for feature extraction and reduction, and a probabilistic neural network for classification, achieving 95% accuracy. Nemirko and Lugovaya [25] have made great progress on this database, using all the database subjects. They have used 195 records (6 to 10 heartbeats each) for training and 115 records (6 to 10 heartbeats each) for testing. Their features are samples of the cardiac cycle fragment containing the QRS complex and the P and T waves, with principal component analysis or wavelet transform for feature extraction and reduction, and linear discriminant analysis and a majority vote classifier for classification, achieving an accuracy of 96%. Chun [27] has presented an approach for identification on ECG-ID based on a guided filter (GF), Euclidean measures, dynamic time warping (DTW) and PCA for authentication. This work used 89 of the 90 subjects and 2 records for each subject, achieving an error rate of 2.4% using PCA combined with GF. Dar et al. [26] have addressed the challenging ECG-ID database on all 90 subjects by fusing DWT features and heart rate variability-based features, with reduction by best first search and a random forest for classification, achieving an accuracy of 83.33%. Our proposed method has worked on this database using all the subjects, with 200 records for training and 108 records for testing, using all the heartbeats in the records. It has achieved an accuracy of 99%, higher than the previous studies.

For the MIT-BIH arrhythmia database comparison, as shown in Table 3, Tang and Shu [28] have used 10 subjects from the MIT-BIH arrhythmia database, using wavelet transform and a rough set for feature extraction and reduction and a quantum neural network for classification, achieving an accuracy of 91%. Wang et al. [29] have discussed the MIT-BIH arrhythmia database, working on 9800 samples from eight different heartbeats and using principal component analysis and linear discriminant analysis for feature extraction and a probabilistic neural network for classification, achieving about 99.71%. Ting and Salleh [11] have worked on 13 subjects, using extended Kalman filtering and log-likelihood for classification, achieving an accuracy of 87.50%. Islam et al. [30] have addressed 26 subjects, using heartbeat shape (HBS) as features and achieving an accuracy of 99.85%. Most of the previous studies have focused on a small number of subjects and a small amount of samples, but our proposed method focuses on 30 of the 47 subjects, containing different heartbeats, and uses 20,000 samples for testing and 40,000 samples for training for each subject, achieving an accuracy of 100%.

Our contribution in this paper is to prove that ECG can be used as a biometric. This goal has been achieved in several important respects. Firstly, we have worked on a large database consisting of 90 subjects and have achieved a high accuracy of about 99%. Secondly, we have addressed another database containing cardiac diseases to show the strength of our fusion approach and have achieved a high performance of about 100%. Thirdly, we addressed a large number of samples and ECG heartbeats in training and testing. Other systems use only a small number of ECG heartbeats to achieve a high accuracy, and when the number of heartbeats increases, their performance starts to degrade. Fourthly, we have shown the use of non-fiducial, fiducial and fusion approaches, and how the fusion can increase the performance. The times consumed are 1.2, 0.75 and 1.1 min for non-fiducial, fiducial and fusion respectively on the largest database used.

4 Conclusions
This paper proposes a hybrid ECG identification system. The proposed approaches contain data acquisition, preprocessing, feature extraction and classification phases. ECG signals obtained from the MIT-BIH and ECG-ID databases are used for the training and testing processes. We have applied three different methods for feature extraction (non-fiducial, fiducial and fusion) and three classifiers (SVM, ANN and KNN). The results of the system are compared with other methods. According to the comparison results, the proposed method is able to provide robust ECG signal classification. The results show an accuracy of 100% for MIT-BIH using SVM, ANN and KNN, and 99% for ECG-ID using SVM and KNN. Our main strength is the fusion approach between fiducial and non-fiducial features.

With the help of the above approaches, one can develop software for a biometric system for the detection of ECG signals of different individuals. We have worked on a database that contains a large set of subjects and achieved a high accuracy, which supports the use of ECG in security system applications. Further studies are ongoing to improve the classification accuracy and to work on larger datasets, in order to create a generalized system for ECG identification.

Acknowledgements The authors express their special thanks to the Editor-in-Chief, anonymous referees and the production editor for their cooperative comments that enhanced the manuscript.

References
1. Jain, A.K., Ross, A., Prabhakar, S.: An introduction to biometric recognition. IEEE Trans. Circuits Syst. Video Technol. 14(1), 4–20 (2004)
2. Wang, Y., Agrafioti, F., Hatzinakos, D., Plataniotis, K.N.: Analysis of human electrocardiogram for biometric recognition. EURASIP J. Adv. Signal Process. 2008(1), 1–11 (2007)
3. Hegde, C., Prabhu, H.R., Sagar, D.S., Shenoy, P.D., Venugopal, K.R., Patnaik, L.M.: Heartbeat biometrics for human authentication. Signal Image Video Process. 5(4), 485–493 (2011)
4. Agrafioti, F., Hatzinakos, D.: ECG biometric analysis in cardiac irregularity conditions. Signal Image Video Process. 3(4), 329–337 (2009)
5. Abo-Zahhad, M., Ahmed, S.M., Abbas, S.N.: Biometric authentication based on PCG and ECG signals: present status and future directions. Signal Image Video Process. 8(4), 739–751 (2014)
6. Porée, F., Kervio, G., Carrault, G.: ECG biometric analysis in different physiological recording conditions. Signal Image Video Process. 10(2), 267–276 (2016)
7. Singla, S., Sharma, A.: ECG based biometrics verification system using LabVIEW. Songklanakarin J. Sci. Technol. 32, 241–246 (2010)
8. Tantawi, M., Revett, K., Tolba, M.F., Salem, A.: Fiducial feature reduction analysis for electrocardiogram (ECG) based biometric recognition. J. Intell. Inf. Syst. 40(1), 17–39 (2013)
9. Roshan, J., Rajendra, U.: ECG beat classification using PCA, LDA, ICA and Discrete Wavelet Transform. Biomed. Signal Process. Control 8(5), 437–448 (2013)
10. Tantawi, M., Revett, K., Salem, A.B., Tolba, M.F.: A wavelet feature extraction method for electrocardiogram (ECG)-based biometric recognition. Signal Image Video Process. 9(6), 1271–1280 (2013)
11. Ting, C., Salleh, S.: ECG based personal identification using extended Kalman filter. In: 10th International Conference on Information Sciences Signal Processing and their Applications, pp. 774–777 (2010)
12. Sufi, F., Khalil, I., Habib, I.: Polynomial distance measurement for ECG based biometric authentication. Secur. Commun. Netw. 3(4), 303–319 (2008)
13. Shen, T.W., Tompkins, W.J., Hu, Y.H.: One-lead ECG for identity verification. In: Proceedings of the 2nd Joint EMBS/BMES Conference, pp. 62–63 (2002)
14. https://physionet.org/physiobank/database/ecgiddb/. ECG-ID database. Accessed 16 Nov 2017
15. https://www.physionet.org/physiobank/database/mitdb/. MIT-BIH arrhythmia database. Accessed 16 Nov 2017
16. Donoho, D.L., Johnstone, J.M.: Ideal spatial adaptation by wavelet shrinkage. Biometrika 81(3), 425–455 (1994)
17. Hejazi, M., Al-Haddad, S.A.R., Singh, Y.P., Hashim, S.J., Aziz, A.F.A.: ECG biometric authentication based on non-fiducial approach using kernel methods. Digit. Signal Process. 52, 72–86 (2016)
18. Daubechies, I.: Ten lectures on wavelets. Phila. Soc. Ind. Appl. Math. 61, 198–202 (1992)
19. Vapnik, V.: The Nature of Statistical Learning Theory. Springer, Berlin (2013)
20. Platt, J.C.: Fast training of support vector machines using sequential minimal optimization. In: Advances in Kernel Methods, pp. 185–208 (1999)
21. Haykin, S.: Neural Networks: A Comprehensive Foundation, 2nd edn. Prentice Hall, Upper Saddle River (1999)
22. Araújo, T., Nunes, N., Gamboa, H., Fred, A.: Generic biometry algorithm based on signal morphology information. In: Pattern Recognition: Applications and Methods, pp. 301–310. Springer, Berlin (2015)
23. Biel, L., Pettersson, O., Philipson, L., Wide, P.: ECG analysis: a new approach in human identification. IEEE Trans. Instrum. Meas. 50(3), 808–812 (2001)
24. Yi, W.J., Park, K.S., Jeong, D.U.: Personal identification from ECG measured without body surface electrodes using probabilistic neural networks. In: World Congress on Medical Physics and Biomedical Engineering (2003)
25. Nemirko, A.P., Lugovaya, T.S.: Biometric human identification based on electrocardiogram. In: Proceedings of the XII-th Russian Conference on Mathematical Methods of Pattern Recognition, Moscow, MAKS Press, pp. 387–390. ISBN: 5-317-01445-X (2005)
26. Dar, M.N., Akram, M.U., Shaukat, A., Khan, M.A.: ECG based biometric identification for population with normal and cardiac anomalies using hybrid HRV and DWT features. In: (ICITCS), pp. 1–5 (2015)
27. Chun, S.Y.: Single pulse ECG-based small scale user authentication using guided filtering. In: International Conference on Biometrics (ICB), pp. 1–7. IEEE (2016)
28. Tang, X., Shu, L.: Classification of electrocardiogram signals with RS and quantum neural networks. Int. J. Multimed. Ubiquitous Eng. 9(2), 363–372 (2014)
29. Wang, J.S., Chiang, W.C., Hsu, Y.L., Yang, Y.T.C.: ECG arrhythmia classification using a probabilistic neural network with a feature reduction method. Neurocomputing 116, 38–45 (2013)
30. Islam, M.S., Alajlan, N., Bazi, Y., Hichri, H.S.: HBS: a novel biometric feature based on heartbeat morphology. IEEE Trans. Inf. Technol. Biomed. 16(3), 445–453 (2012)