Phonocardiogram Event-Based Delineation Method Usi
Introduction
A phonocardiogram (PCG) is a recording of heart sounds, the sounds
made by the various cardiac structures pulsing and moving blood [1]. These sounds
are usually caused by the acceleration and deceleration of blood and by the turbulence
that develops during rapid blood flow [1]. An electronic stethoscope remains the
doctor's primary tool for hearing such sounds [2]. When an electronic
stethoscope is placed on the subject’s chest, the first cardiac sound (S1) and the
second cardiac sound (S2) are audible [2]. A stethoscope system consisting of an
electronic stethoscope and a digital assistant like a smartphone or a laptop is
used to assess the condition of the heart [2]. The digital assistant uses an
application based on the various types of signal processing techniques that have
been developed to analyze the condition of the heart [2]. The signal processing
technique involved in dividing the cardiac sounds into S1-S2 and S2-S1 cycles is
called segmentation [2]. Segmentation helps to assess the occurrence of cardiac
events [2]. The most popular approach to segmenting cardiac sounds is the wavelet transform
[3,4]. Dinesh Kumar et al. segmented cardiac sounds using
the wavelet decomposition simplicity filter [3,4]. Early work used
threshold-based techniques based on Shannon energy, the S-transform, and the wavelet
transform to segment phonocardiograms [5-12]. Later,
statistical models based on HMM, HSMM, GMM, and SVM-referred HSMM were
used to segment phonocardiograms [13-17]. More recently, classifier-based methods
using SVM, ANN, CNN, and deep neural networks have been used to segment
phonocardiograms with good accuracy [18-22]. One of the main drawbacks of the
classifier-based methods is their reliance on a feature extraction or activity
detection mechanism, which is required for PCG delineation [22]. We have
developed a simple spectrogram-based method involving a 2D continuous
wavelet transform (CWT) that can detect the boundaries of the cardiac events.
The fast Fourier transform (FFT) is used to identify the beats. No additional signal
processing overheads such as a noise threshold, activity detection, or feature extraction
are involved.
Segmentation of Phonocardiograms
Segmentation of the PCG signal using the CWT is divided into five steps:
preprocessing, CWT computation, evaluation of wavelet energy, boundary
extraction, and S1-systole-S2-diastole identification. Each step is explained
in the sections below.
Preprocessing
CWT Computation
After noise removal, the continuous wavelet transform of the filtered PCG signal is
computed. The Morlet wavelet is used as the mother wavelet for the CWT
computation because of its morphological
resemblance to the PCG signal. The Matlab CWT function is used to
compute the transform.
Here x(t) is the PCG signal and ψ_{a,b}(t) represents the mother wavelet (Morlet) with
scale a and translation b. Fig 3 shows the CWT spectrogram of a normal PCG
signal, and Fig 4 shows a zoomed view.
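For reference, a common textbook form of the CWT that is consistent with these symbols is sketched below. The name $W(a,b)$ for the coefficients and the $1/\sqrt{a}$ normalization are assumptions of this sketch, and the Matlab function may normalize differently.

```latex
\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-b}{a}\right), \qquad
W(a,b) = \int_{-\infty}^{\infty} x(t)\,\psi_{a,b}^{*}(t)\,dt, \qquad a > 0
```

Here $a$ stretches or compresses the Morlet wavelet, $b$ shifts it along the time axis, and $^{*}$ denotes complex conjugation.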
Evaluating the CWT coefficients of the filtered signal yields the CWT matrix. The
row sum of the CWT matrix is computed to obtain the wavelet energy. Fig 5 Pane
1 shows the wavelet energy. The wavelet energy signal contains many sharp
spikes, which are smoothened using a moving average filter. The window
size is fixed at 10, and the moving average is given by

$$ MA = \frac{\sum W_E}{n} \tag{3} $$

where $W_E$ represents the data points and $n$ represents the total number of data points in the window.
Fig 5 Pane 2 shows the wavelet energy signal after smoothening.
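The energy and smoothing steps can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names are hypothetical, each row of the CWT matrix is assumed to hold the coefficients of all scales for one time sample, coefficient magnitudes are summed, and the window is taken as trailing. The text does not specify any of these choices.

```python
def wavelet_energy(cwt_matrix):
    """Sum coefficient magnitudes across each row of the CWT matrix.

    Assumption: one row per time sample, so the row sum gives one
    energy value per sample.
    """
    return [sum(abs(c) for c in row) for row in cwt_matrix]


def moving_average(signal, window=10):
    """Trailing moving average, MA = sum(W_E) / n, over the last n points.

    Near the start fewer than `window` points exist, so n shrinks to the
    number of points available (an assumption of this sketch).
    """
    smoothed = []
    running = 0.0
    for i, value in enumerate(signal):
        running += value
        if i >= window:
            # Drop the sample that has just left the window.
            running -= signal[i - window]
        n = min(i + 1, window)
        smoothed.append(running / n)
    return smoothed
```

A call such as `moving_average(wavelet_energy(cwt_matrix), window=10)` would then give the smoothened wavelet energy signal used in the next step.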
Boundary Extraction
After smoothening the signal in step 3, the mean value of the smoothened wavelet
energy signal is computed. The mean value is then subtracted from the
smoothened wavelet energy signal to obtain the corrected smoothened wavelet
energy signal with a zero line as the reference axis. The zero crossings of the
corrected smoothened wavelet energy signal with the zero-reference line are
noted. The zero crossings indicate the boundaries of the sounds in the PCG
signal.
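A minimal sketch of this boundary extraction follows, assuming the smoothened wavelet energy is available as a plain list of numbers. The function name `find_boundaries` is hypothetical, and a sample that lies exactly on the zero line is treated as non-negative, which is a choice of this sketch.

```python
def find_boundaries(smoothed_energy):
    """Return sample indices where the mean-corrected signal crosses zero."""
    mean = sum(smoothed_energy) / len(smoothed_energy)
    # Subtracting the mean makes the zero line the reference axis.
    corrected = [v - mean for v in smoothed_energy]
    boundaries = []
    for i in range(1, len(corrected)):
        prev, curr = corrected[i - 1], corrected[i]
        # A crossing occurs when consecutive samples lie on opposite
        # sides of the zero line.
        if (prev < 0 <= curr) or (prev >= 0 > curr):
            boundaries.append(i)
    return boundaries
```

For example, `find_boundaries([0, 0, 4, 4, 0, 0, 4, 4, 0, 0])` returns `[2, 4, 6, 8]`: each pair of indices marks the start and end of one burst of energy.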
The Fast Fourier Transform of the sounds within the boundaries is computed.
Generally, the first heart sound has frequency components in the range 30 Hz to
150 Hz, while the second heart sound has frequency components in the range
200 Hz to 280 Hz [22]. Using these ranges, we identify the heart sounds
as S1-systole-S2-diastole. Fig 5 Pane 3 shows the segmented sounds in the
smoothened wavelet energy signal, and Fig 6 shows a zoomed view. Fig 7 shows the
segmented original PCG signal, and Fig 8 shows a zoomed view. Red
indicates the S1 sound and pink indicates the S2 sound, as
identified using the Fast Fourier Transform.
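The labeling idea can be illustrated as follows. This is a sketch, not the authors' implementation: the function names are hypothetical, the 175 Hz split is an assumed threshold placed between the two quoted ranges, and a naive DFT stands in for the FFT to keep the example dependency-free (it produces the same spectrum, only more slowly).

```python
import cmath
import math


def dominant_frequency(segment, fs):
    """Return the frequency in Hz of the largest DFT magnitude bin."""
    n = len(segment)
    mean = sum(segment) / n
    centered = [v - mean for v in segment]  # remove the DC component
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            centered[t] * cmath.exp(-2j * math.pi * k * t / n)
            for t in range(n)
        )
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * fs / n


def label_sound(segment, fs, split_hz=175.0):
    """Label a sound as S1 or S2 from its dominant frequency.

    split_hz is a hypothetical threshold between the 30-150 Hz and
    200-280 Hz ranges quoted in the text.
    """
    return "S1" if dominant_frequency(segment, fs) < split_hz else "S2"
```

With a synthetic 50 Hz tone sampled at a hypothetical 1000 Hz, `label_sound` returns `"S1"`, and with a 240 Hz tone it returns `"S2"`. The sampling rate must exceed twice the highest frequency of interest for both ranges to be distinguishable.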
This section describes the results obtained from applying the CWT-based
segmentation on PCG signals from the Peter Bentley PCG database.
Dataset
Two datasets were provided for the Peter Bentley PCG segmentation challenge
[23]. Dataset A comprises data crowd-sourced from the general public via the
iStethoscope Pro iPhone app [23]. Dataset B comprises data collected from a
clinical trial in hospitals using the digital stethoscope Digi-Scope [23]. The audio
files are of varying lengths, between 1 second and 30 seconds (some have been
clipped to reduce excessive noise and provide the salient fragment of the sound).
Most information in heart sounds is contained in the low-frequency components,
with noise in the higher frequencies [23]. It is common to apply a low-pass filter at
160 Hz [23]. Fast Fourier transforms are also likely to provide useful information
about volume and frequency over time [23].
Segmentation of PCG
The cardiac sounds are selected from the Peter Bentley PCG database. The PCG
signals are first preprocessed by down-sampling by a factor of 100, from a sampling
frequency of 44100 Hz to 441 Hz, and then filtered by a Type II Chebyshev
bandpass filter with a passband of 20 Hz to 160 Hz. The CWT segmentation
procedure is then applied to the filtered sounds. In the segmentation procedure,
the CWT matrix is computed first. Then the wavelet energy is found from the CWT
matrix by taking the row-sum. The wavelet energy signal is smoothened by a
moving-average filter with a window size fixed at 10. The mean value of the
smoothened signal is then evaluated by taking the average of the samples. The
mean value is then subtracted from the original smoothened signal to obtain the
corrected smoothened wavelet energy signal. The zero-reference line of the
corrected smoothened wavelet energy signal is used to extract the boundaries of
the PCG. The zero crossings of the corrected smoothened wavelet energy signal
are noted. The sounds within the boundaries are either the first heart sound or the
second heart sound. The Fast Fourier Transform is evaluated for the sounds
within the boundaries, giving the energy of each sound as a function of
frequency. The sounds with a lower frequency range are marked as the first heart
sound, while those with higher frequencies are marked as the second heart sound.
In this way, the Fast Fourier Transform is used to label the occurrence of
S1-systole-S2-diastole in the PCG signal.
Tables 1-6 give the breakdown of the metrics obtained by applying the segmentation
algorithm. Altogether, 643 PCG signals were involved, of which 134 signals were
from dataset A and the remaining 509 signals were from dataset B. Ten percent
of the signals were rejected due to the presence of high noise and murmurs that
could not be removed even after filtering. Of the retained 580 PCG signals,
121 signals were from dataset A and 459 signals were
from dataset B. Both dataset A and dataset B contain signals in the categories
Normal, Murmur, Extrasystole, and Unlabeled. In dataset A, the
segmentation procedure yielded 909 true positive S1 and S2 sounds from all
categories. 62 sounds were categorized as false positives due to bad segmentation.
28 sounds were categorized as false negatives with no segmentation. In dataset B,
the segmentation procedure yielded 4182 true positive S1 and S2 sounds from all
categories, with 349 false positives and 118 false negatives (Table 3).
Sounds        TP    FP   FN
Normal        213   15   5
Murmur        331   10   11
Extrasystole  74    20   1
Unlabeled     291   17   11
Total         909   62   28
Table 1 True Positive, False Positive and False Negative of the various PCG sounds
(dataset A)
Sounds        TP    FP   FN
Normal        2223  78   75
Murmur        273   48   1
Extrasystole  459   88   13
Unlabeled     1227  135  29
Total         4182  349  118
Table 3 True Positive, False Positive and False Negative of the various PCG sounds
(dataset B)
Sounds        TP    FP   FN
Normal        2436  93   80
Murmur        604   58   12
Extrasystole  533   108  14
Unlabeled     1518  152  40
Total         5091  411  146
Table 5 True Positive, False Positive and False Negative of the various PCG sounds
(overall dataset)
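The summary metrics can be recomputed from these counts. The sketch below assumes the usual definitions of precision, sensitivity and F1, and assumes that accuracy is taken as TP / (TP + FP + FN), since no true negatives are reported. The function name is hypothetical. Under these assumptions, the overall totals of Table 5 give an accuracy of about 90.1% and an F1 score of about 94.8%.

```python
def segmentation_metrics(tp, fp, fn):
    """Compute detection metrics from true/false positives and false negatives."""
    return {
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
        # Assumed definition: no true negatives exist for event detection.
        "accuracy": tp / (tp + fp + fn),
    }


# Totals from Table 5 (overall dataset).
overall = segmentation_metrics(tp=5091, fp=411, fn=146)
for name, value in overall.items():
    print(f"{name}: {100 * value:.1f}%")
```

The same function can be applied to any row of Tables 1, 3 and 5 to obtain per-category figures.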
Conclusion
The results above show that the PCG signal can be
segmented by the CWT technique. We achieved an accuracy of 90.1% and an
F1 score of 94.8% over datasets A and B combined. The segmentation procedure is
a simple technique that involves no overheads such as a noise threshold, activity
detection, or feature extraction. The technique can be incorporated into a
stethoscope system connected to a laptop or a smartphone. The computational
efficiency is high and the cost of implementation is low.
References