Algorithm Motion Artefact
Abstract—Objective: The challenging task of heart rate (HR) estimation from the photoplethysmographic (PPG) signal during intensive physical exercise is tackled in this paper. Methods: The study presents a detailed analysis of a novel algorithm (WFPV) that exploits a Wiener filter to attenuate the motion artifacts, a phase vocoder to refine the HR estimate, and user-adaptive post-processing to track the subject physiology. Additionally, an offline version of the HR estimation algorithm that uses Viterbi decoding (WFPV+VD) is designed for scenarios that do not require online HR monitoring. The performance of the HR estimation systems is rigorously compared with existing algorithms on the publicly available database of 23 PPG recordings. Results: On the whole dataset of 23 PPG recordings, the algorithms result in average absolute errors of 1.97 and 1.37 BPM in the online and offline modes, respectively. On the test dataset of 10 PPG recordings which were most corrupted with motion artifacts, WFPV has an error of 2.95 BPM on its own and 2.32 BPM in an ensemble with two existing algorithms. Conclusion: The error rate is significantly reduced when compared with the state-of-the-art PPG-based HR estimation methods. Significance: The proposed system is shown to be accurate in the presence of strong motion artifacts and, in contrast to existing alternatives, has very few free parameters to tune. The algorithm has a low computational cost and can be used for fitness tracking and health monitoring in wearable devices. The MATLAB implementation of the algorithm is provided online.

Index Terms—Heart rate, photoplethysmographic, phase vocoder, spectrum estimation, Viterbi decoding, Wiener filter.

Manuscript received January 4, 2017; revised February 23, 2017; accepted February 23, 2017. Date of publication March 1, 2017; date of current version August 18, 2017. This work was supported in part by a Science Foundation Ireland Research Centre Award (12/RC/2272) and Wellcome Trust Seed Award in Science (200704/Z/16/Z). This paper was presented in part at EMBC 2015 and ICASSP 2017. Asterisk indicates corresponding author.
∗ A. Temko is with the Irish Centre for Fetal and Neonatal Translational Research (INFANT), Department of Electrical and Electronic Engineering, University College Cork, Cork, Ireland (e-mail: [email protected]).
This paper has supplementary downloadable material available at (File size: <1 MB).
Digital Object Identifier 10.1109/TBME.2017.2676243

I. INTRODUCTION
WEARABLE devices have gradually increased their functionality over the last decades. Modern wearable devices are equipped with a number of internal and external sensors and can offer many useful fitness tracking features such as counting steps, calories, tracking sleep, etc. Photoplethysmography (PPG) based heart rate (HR) monitoring during physical exercise is one of these features [29], [30]. Implemented in smart-watches or wristbands, HR monitoring can guide exercisers to adapt their training load and better match their training goals [26]. PPG signals have become a popular alternative to traditional Electrocardiography (ECG) based HR estimation, which measures the bio-potential generated by electrical signals that control the expansion and contraction of heart chambers. However, ECG requires the presence of ground and reference sensors that must be attached to the chest. PPG-based HR monitoring at peripheral positions such as earlobes, fingertips or wrists is seen as a much more convenient solution.
The PPG signals [1]–[3], [29], [30] come from PPG sensors which are embedded in these wearable devices. A PPG sensor emits light to the skin and measures the changes of intensity of the light which is reflected or transmitted through the skin. The periodicity of these measurements in most cases corresponds to the cardiac rhythm, and thus HR can be estimated from the PPG signal.
Motion artifacts (MAs) are known to be a limiting factor that prevents the straightforward usage of PPG, especially in free living conditions. MAs are considered to result from sensor-tissue motion and sensor deformation. Strong movements during physical exercise make the HR estimate inaccurate, as shown in Fig. 1. Due to motion, the sensor may be displaced far enough from the skin that the true HR peak is absent in the PPG spectrum. A number of methods have been proposed to detect, remove or attenuate MAs in PPG signals, including adaptive filtering [4], [5], [22], [25], independent component analysis [6], empirical mode decomposition [7], [23] or other decomposition models [3], [27], spectral subtraction [8], [21], [24], and Kalman filtering [9].
A three-stage TROIKA method has recently been proposed to estimate HR from PPG signals for scenarios where MAs are strong [2]. The method was based on signal decomposition, sparsity-based high-resolution spectrum estimation, and spectral peak tracking and verification. The average absolute error of 2.34 beats per minute (BPM) was reported on 12 PPG [...] spectra of PPG and acceleration signals were jointly estimated using a common sparsity constraint on the spectral coefficients (JOSS). This was achieved by means of a multiple measurement vector model [10]. The error was reduced to 1.28 BPM when evaluated on the same 12 PPG recordings.
For the IEEE Signal Processing Cup 2015 the database of 23 PPG recordings of people running on a treadmill [...] The evaluation
Fig. 1. The challenge of HR estimation during physical exercises. Plot (a) shows a spectrogram of a 5 min PPG recording. Plots (b) and (d) show
examples of a PPG signal from the spectrogram in (a). Plots (c) and (e) show the spectral envelopes of the PPGs in (b) and (d), respectively. The
true HR is denoted with a circle. In the presence of MAs, plots (b) and (c), the highest peak of the spectral envelope does not coincide with the true
HR. Best viewed in color.
rules and metrics are defined, which facilitates the comparison between different approaches. In the period since the provision of this dataset, several HR estimation algorithms have been designed and tested [2], [3], [21]–[25]. However, the reported improvements in performance are usually accompanied by an increased number of free parameters, which may be a sign of overfitting given the fixed size of the dataset on which the algorithms are both designed and tested.
In this paper a new approach to HR estimation which is based on Wiener filtering and the phase vocoder is detailed. In contrast to the previously presented systems, the proposed solution does not rely on a set of heuristic rules and thresholds and requires very few parameters to be tuned. The noise signature is estimated from accelerometer signals and the Wiener filter is used to attenuate the noise components in the PPG signal. The phase vocoder is exploited to overcome the limited resolution of the discrete Fourier transform and to refine the initial dominant frequency estimation. A user-adaptive post-processing step is introduced, and additionally an offline version of post-processing based on Viterbi decoding is proposed that requires no tunable parameters.
The main contributions of this study are:
1) A detailed description of the HR estimation algorithm (WFPV) including the proposed user-adaptive post-processing.
2) An offline version that uses Viterbi decoding is designed (WFPV + VD) for applications where online HR monitoring is not required (e.g. for fitness trackers for swimmers or offline fitness statistics). A significantly higher performance is obtained by trading online monitoring for accuracy.
3) The performance of the most accurate alternatives published to date is summarised and thoroughly compared with both the online WFPV and offline WFPV + VD.
4) Benefits of combining different HR estimation algorithms are discussed and an ensemble is designed. Its superior performance is presented for the first time.
5) The MATLAB implementation of the designed HR estimation algorithms is made available through online resources.
This paper is organized as follows. The developed system is described in detail in Section II. Section III describes the database, metrics and the performance assessment routine used in the study. The results on the provided data are presented in Section IV and conclusions are drawn in Section V.

II. HEART RATE ESTIMATION ALGORITHM
The flowchart of the developed system is shown in Fig. 2. The system consists of four main blocks: pre-processing, signal de-noising to attenuate the influence of MAs, HR estimation, and post-processing. Examples of the signal transformations carried out at each stage are illustrated in Fig. 3.

A. Preprocessing
During the preprocessing stage, the two PPG signals and three accelerometer signals are filtered with a 4th order Butterworth band-pass filter (0.4–4 Hz) as shown in Fig. 3(a). The two PPG signals are then normalized to zero mean and unit variance (z-score normalization) and averaged. The averaged PPG signal and the three accelerometer signals are down-sampled from 125 to 25 Hz for further processing. The signals are then subjected to the Discrete Fourier Transform (DFT) with the number of bins set to 1024. Fig. 3(b) shows the spectral envelope of the PPG signal for HRs ranging from 60 to 180 BPM. This range of HRs is chosen based on the specifics of the database used in this study, which will be explained in Section III, and can be changed accordingly for other corpora. Overall, the pre-processing stage
Fig. 3. Signal transformation in the developed WFPV system. Plot (a) shows z-score normalised PPG from two channels, average PPG, and three
accelerometer signals after filtering, (b) shows the spectral envelopes of PPG and noise, measured as average accelerometer signals, and the peak
frequency of the PPG, (c) shows the processed spectral envelope and its maximum after MAs were attenuated with Wiener filtering, and (d) shows
the maximum of the spectral envelope before and after the phase vocoder.
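The pre-processing chain shown in Fig. 3(a) and (b) can be sketched as follows. This is a minimal, dependency-free illustration rather than the released MATLAB implementation: the function names (`zscore`, `preprocess`, `spectrum_bpm`) and the synthetic two-channel test signal are hypothetical, and the 4th order Butterworth band-pass step is assumed to have been applied already, so down-sampling is done by simply keeping every fifth sample.

```python
import cmath
import math

FS_IN, FS_OUT, N_FFT = 125, 25, 1024  # sampling rates and DFT size quoted in the text


def zscore(x):
    """Normalise a sequence to zero mean and unit variance."""
    mean = sum(x) / len(x)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    return [(v - mean) / std for v in x]


def preprocess(ppg1, ppg2):
    """Average two z-scored PPG channels and keep every 5th sample (125 -> 25 Hz).

    Assumption: both channels were already band-pass filtered to 0.4-4 Hz,
    so plain sample dropping does not alias.
    """
    avg = [(a + b) / 2 for a, b in zip(zscore(ppg1), zscore(ppg2))]
    return avg[:: FS_IN // FS_OUT]


def spectrum_bpm(x, lo_bpm=60.0, hi_bpm=180.0):
    """Zero-padded DFT magnitudes inside the HR band as (bpm, magnitude) pairs."""
    out = []
    for k in range(N_FFT // 2):
        bpm = 60.0 * k * FS_OUT / N_FFT
        if lo_bpm <= bpm <= hi_bpm:
            coeff = sum(v * cmath.exp(-2j * math.pi * k * n / N_FFT)
                        for n, v in enumerate(x))
            out.append((bpm, abs(coeff)))
    return out


# Synthetic 8 s window carrying a 2 Hz (120 BPM) pulse on both channels.
t = [n / FS_IN for n in range(8 * FS_IN)]
ch1 = [math.sin(2 * math.pi * 2.0 * v) for v in t]
ch2 = [0.5 * math.sin(2 * math.pi * 2.0 * v) + 3.0 for v in t]
spec = spectrum_bpm(preprocess(ch1, ch2))
peak_bpm = max(spec, key=lambda p: p[1])[0]
```

With 1024 bins at 25 Hz the bin spacing is about 1.46 BPM, so the dominant bin of this toy signal falls within one bin of 120 BPM.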
in this work repeats that of other papers that report results on this dataset [1], [3], [6].

B. De-noising Using Wiener Filtering
The Wiener filter is a common tool to estimate a desired signal by linear time-invariant filtering of an observed noisy process [15], [31]. Assuming known stationary signal and additive noise spectra, the Wiener filter performs the minimum mean square error estimation of the desired signal given another related process. Causal Wiener filtering is applied here to estimate the clean PPG signal from the observed PPG signal. The noisy PPG signal, $X(f)$, is assumed to be corrupted by additive MA noise:

$$X(f) = S(f) + N(f) \tag{1}$$

where $S(f)$ and $N(f)$ are the spectra of the clean PPG signal and the MAs, respectively. The estimation of the clean signal can then be obtained as:

$$\tilde{S}(f) = X(f) - N(f) = \left(1 - \frac{N(f)}{X(f)}\right) X(f) = W(f)\,X(f) \tag{2}$$

For a signal observed in uncorrelated additive random noise, the frequency-domain Wiener filter is given as:

$$W(f) = \frac{P_{XX}(f) - P_{NN}(f)}{P_{XX}(f)} = \frac{P_{SS}(f)}{P_{SS}(f) + P_{NN}(f)} \tag{3}$$

where $P_{SS}(f)$, $P_{NN}(f)$ and $P_{XX}(f)$ are the power spectrums of the clean signal, noise and observed signal. The filter convolution in time domain is equivalent to multiplication in frequency
and thus the Wiener filter acts as an adaptive signal-to-noise dependent attenuator, where frequencies which are more affected by the noise are given less importance.
The filter in Eq. (3) requires separate estimates of the noise and signal power spectra. The noise spectrum can be directly estimated from the accelerometer signals, which is done by averaging the spectra of the three accelerometer signals. The clean PPG spectrum, $P_{SS}(f)$, can be estimated by subtracting the noise from the observed signal, $P_{XX}(f) - P_{NN}(f)$, or recursively from previous filter outputs. Depending on how the power spectrum of the clean PPG signal is estimated, two Wiener filters are implemented, with frequency-domain filter coefficients given as:

$$w_1(t,k) = 1 - \frac{P_{NN}(t,k)}{\frac{1}{C}\sum_{i=t-C+1}^{t} P_{XX}(i,k)} \tag{4}$$

$$w_2(t,k) = \frac{\frac{1}{C}\sum_{i=t-C}^{t-1} w_2(i,k)\,P_{XX}(i,k)}{\frac{1}{C}\sum_{i=t-C}^{t-1} w_2(i,k)\,P_{XX}(i,k) + P_{NN}(t,k)} \tag{5}$$

where $w(t,k)$ is the weight of the $k$-th frequency bin at time $t$. The power spectra of noise and PPG in Eq. (4) and Eq. (5) are normalized by their maximum values to be commensurable. It can be seen that in both equations the power spectrum of the observed signal is averaged over the past $C$ spectral envelopes ($C = 15$ in this work). In Eq. (4), the power spectrum of the clean PPG signal is estimated by subtracting the observed noise from the observed PPG signal. If $C = 1$, then the Wiener filter in Eq. (4) performs a simple version of spectral subtraction [16], [17]. In Eq. (5), the spectrum of the clean PPG signal is computed recursively by averaging the previous filtered signal outputs.
The spectral envelope of the cleaned PPG signal is then obtained by multiplying the spectral envelope of the observed signal, $P_{XX}(t,k)$, with the calculated filter coefficients, $w(t,k)$. It can be seen that in the current implementation the Wiener filter requires only one parameter to be specified, $C$.
The spectral envelopes processed with the two designed filters are scaled by their standard deviation because, unlike $w_2$, $w_1$ can have negative values when the scaled power of the observed noise is larger than the scaled power of the observed signal for certain frequencies. The resultant signals are averaged to give a final spectral envelope of the cleaned PPG signal. The dominant frequency (the frequency with the highest magnitude) is converted to the HR estimate in BPM as shown in Fig. 3(c).

C. HR Estimation and Refinement Using Phase Vocoder
In this work, the phase vocoder technique [11]–[13], [32] is employed to refine the initial HR estimate through the estimation of the instantaneous frequency as the rate of change of the phase angle with time [33]. For signals that are not truly sinusoidal or for nonstationary signals one needs to account for the time-varying nature of the process, and that can be done with estimation of the instantaneous frequency.
The effective frequency resolution (the minimum frequency that can be estimated, the Rayleigh frequency) of the data is limited by the size of the window of the analyzed data (8 s) and equals 1/8 × 60 = 7.5 BPM. Zero-padding before the DFT is used to interpolate the spectral envelope to other frequencies, thus decreasing the frequency spacing between neighboring DFT bins. This does not create new information but better reveals the information already present in the signal.
The phase vocoder is a technique used in audio processing to manipulate audio length without changing its pitch, or to change its pitch without affecting its length, by preserving the coherence of phase information. The phase vocoder uses a polar representation of the DFT, and the instantaneous frequency estimation is computed as a discrete derivative of the phase. When analyzing the signal with multiple overlapping windows, the individual signal components (sinusoids) will be correlated in time and spread over multiple adjacent DFT frequency bins (spectral leakage). The deviation of the true frequency from the bin center frequency is encoded in phase changes of two consecutive frames, so that the instantaneous frequency can be given as:

$$f(t) = \frac{1}{2\pi}\,\frac{d\theta(t)}{dt} \tag{6}$$

The DFT phases, $\theta_2$, $\theta_1$, from the current and previous frames, of the chosen frequency peak in the magnitude spectrum, $f$, are used to refine the initial frequency estimation:

$$\arg\min_{n} \left|\tilde{f}(n) - f\right|; \quad \tilde{f}(n) = \frac{\theta_2 - \theta_1 + 2\pi n}{2\pi\,(t_2 - t_1)}, \quad \forall n \in \mathbb{N} \tag{7}$$

where $n$ is a positive integer and $t_2$, $t_1$ are the time stamps of the two frames; here $t_2 - t_1 = 2$ s, which is the window shift. The series, $\tilde{f}(n)$, is computed for several $n$ using Eq. (7), and the value of $\tilde{f}$ which is closest to the initial frequency estimation, $f$, is chosen. As a result, the previous dominant frequency value is refined to the new value, $f \leftarrow \tilde{f}$. This is illustrated in Fig. 3(d), where the estimated DFT HR of 161.3 BPM was refined to 158.2 BPM, with the true HR being 159.1 BPM. It can be seen that the phase vocoder technique requires no parameters to set.

D. Post-processing
1) Online Post-processing With Heuristic Rules and Thresholds: The post-processing steps include history tracking and smoothing. The history of the past HR estimation is preserved and used to guide the search range for the maximum DFT magnitude in the current frame. For instance, if the past HR estimation was 125 BPM, then the current HR is expected to be within a certain range around 125 BPM. Here it is set to the maximum absolute HR difference between consecutive HR estimates, $f(t)$, observed so far for this user:

$$\tau_i = \max\{\,|f(t) - f(t-1)| : k < t < i - 1\,\} \tag{8}$$

The search range, $\pm\tau_i$, is initialized to be wide enough (±25 BPM) for $t \le k$, i.e. the first 30 s to 1 min of each recording, and then adapts to the specifics of the user's physiology. This eliminates the need to tune another threshold in the post-processing.
For the final smoothing, the weighted average between the current estimate and its prediction using linear regression is
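The two Wiener weightings of Eq. (4) and Eq. (5) in Section II-B can be illustrated on a toy spectrum. The sketch below is an assumption-laden example, not the published code: the helper names, the five-bin spectra, the `EPS` guard against division by zero, and the choice to bootstrap the recursive filter with the observed spectrum as its "previous output" are all hypothetical. A history of length C = 1 is used for brevity, whereas the paper uses C = 15.

```python
import statistics

EPS = 1e-12  # numerical guard only; not part of the paper


def normalise(power):
    """Scale a power spectrum by its maximum so PPG and noise are commensurable."""
    peak = max(power)
    return [p / peak for p in power]


def mean_spectrum(frames):
    """Bin-wise mean over a list of C spectra."""
    return [sum(col) / len(frames) for col in zip(*frames)]


def wiener_w1(pxx_frames, pnn):
    """Eq. (4): 1 - P_NN / mean of the last C observed spectra (current frame included)."""
    return [1.0 - n / (a + EPS) for n, a in zip(pnn, mean_spectrum(pxx_frames))]


def wiener_w2(clean_frames, pnn):
    """Eq. (5): P_SS / (P_SS + P_NN), P_SS being the mean of the previous C filtered spectra."""
    return [s / (s + n + EPS) for s, n in zip(mean_spectrum(clean_frames), pnn)]


def apply_filter(weights, pxx):
    return [w * p for w, p in zip(weights, pxx)]


def scale_by_std(spec):
    sd = statistics.pstdev(spec)
    return [v / sd for v in spec]


# Toy spectra: bin 1 is a strong motion component, bin 3 is the cardiac component.
pxx = normalise([0.2, 4.0, 0.3, 3.0, 0.2])   # observed PPG power
pnn = normalise([0.1, 5.0, 0.1, 0.2, 0.1])   # averaged accelerometer power

w1 = wiener_w1([pxx], pnn)
w2 = wiener_w2([pxx], pnn)                   # bootstrap: previous output := observed spectrum
clean = [(a + b) / 2 for a, b in zip(scale_by_std(apply_filter(w1, pxx)),
                                     scale_by_std(apply_filter(w2, pxx)))]

raw_peak = pxx.index(max(pxx))        # dominated by motion
clean_peak = clean.index(max(clean))  # cardiac bin after attenuation
```

In this toy case the raw maximum sits on the motion bin, while the averaged, filtered envelope peaks on the cardiac bin.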
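The phase-vocoder refinement of Eq. (7) in Section II-C amounts to picking, among the frequencies consistent with the observed phase advance, the one nearest to the DFT peak. A minimal sketch follows; `refine_frequency`, `wrap` and the bound `n_max` are illustrative names, and the phases are generated from an ideal sinusoid, so the refinement recovers the true frequency exactly, which real PPG data would not.

```python
import math


def wrap(phi):
    """Wrap a phase angle into (-pi, pi], as a DFT phase would be reported."""
    return math.atan2(math.sin(phi), math.cos(phi))


def refine_frequency(f_init, theta_prev, theta_curr, dt, n_max=50):
    """Eq. (7): candidate (theta2 - theta1 + 2*pi*n) / (2*pi*dt) closest to f_init (Hz)."""
    dtheta = theta_curr - theta_prev
    candidates = [(dtheta + 2.0 * math.pi * n) / (2.0 * math.pi * dt)
                  for n in range(1, n_max + 1)]  # n is a positive integer
    return min(candidates, key=lambda f: abs(f - f_init))


# Ideal sinusoid at 159.1 BPM observed in two frames 2 s apart (the window shift).
f_true = 159.1 / 60.0
dt = 2.0
theta1 = 0.3
theta2 = wrap(theta1 + 2.0 * math.pi * f_true * dt)

f_dft = 161.3 / 60.0  # coarse DFT peak, as in the Fig. 3(d) example
refined_bpm = 60.0 * refine_frequency(f_dft, theta1, theta2, dt)
```

Because the candidates are spaced 1/(t2 − t1) = 0.5 Hz (30 BPM) apart, the coarse DFT estimate only needs to be within 15 BPM of the true value for the correct candidate to be selected.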
Fig. 4. (a) Pearson correlation between the estimated HR and the ground truth HR, (b) Bland-Altman plot, and (c) the distribution of the HR in the
DB. Correlation coefficient of the WFPV algorithm is 0.9908.
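The user-adaptive search range of Eq. (8) in Section II-D can be sketched as below. This is one possible reading of the equation: the exact index bounds, the value k = 15 frames (30 s at a 2 s shift), the fallback when no bin lies inside the window, and the function names are assumptions made for illustration.

```python
def search_range(history_bpm, k=15, init_range=25.0):
    """Half-width (BPM) of the peak-search window for the next frame.

    The fixed +/-25 BPM range is used until estimates beyond frame k exist;
    afterwards the range is the largest jump between consecutive estimates
    observed after frame k.
    """
    if len(history_bpm) <= k + 1:
        return init_range
    jumps = [abs(b - a) for a, b in zip(history_bpm[k:], history_bpm[k + 1:])]
    return max(jumps)


def pick_peak(spectrum, prev_bpm, half_width):
    """Highest-magnitude bin within prev_bpm +/- half_width; spectrum holds (bpm, magnitude)."""
    window = [p for p in spectrum if abs(p[0] - prev_bpm) <= half_width]
    if not window:           # assumption: keep the previous estimate if nothing is in range
        return prev_bpm
    return max(window, key=lambda p: p[1])[0]


history = [120.0] * 16 + [121.5, 123.0, 122.0, 125.0]
tau = search_range(history)                       # largest jump after frame k
spectrum = [(110.0, 5.0), (124.0, 2.0), (126.0, 3.0), (140.0, 9.0)]
tracked = pick_peak(spectrum, history[-1], tau)   # ignores the stronger 140 BPM peak
```

Restricting the search in this way keeps a strong but physiologically implausible peak from capturing the track.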
| Rec | Activity | TROIKA∗ [2] | JOSS∗ [3] | SpaMa [21] | EEMD [23] | Spectrap [24] (Offline) | IMAT [25] | MC-SMD [27] | WFPV (This study) | WFPV+VD (This study, Offline) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | T1 | 2.29 / 2.18 | 1.33 / 1.19 | 1.23 / 1.14 | 1.70 / - | 1.18 / 1.04 | 1.72 / 1.50 | 1.16 / 0.91 | 1.25 / 1.15 | 0.93 / 0.89 |
| 2 | T1 | 2.19 / 2.37 | 1.75 / 1.66 | 1.59 / 1.30 | 0.84 / - | 2.42 / 2.33 | 1.33 / 1.30 | 1.07 / 0.87 | 1.41 / 1.30 | 0.82 / 0.73 |
| 3 | T1 | 2.00 / 1.50 | 1.47 / 1.27 | 0.57 / 0.45 | 0.56 / - | 0.86 / 0.66 | 0.90 / 0.75 | 0.80 / 0.62 | 0.71 / 0.59 | 0.64 / 0.54 |
| 4 | T1 | 2.15 / 2.00 | 1.48 / 1.41 | 0.44 / 0.31 | 1.15 / - | 1.38 / 1.31 | 1.28 / 1.20 | 1.13 / 0.84 | 0.97 / 0.88 | 0.83 / 0.80 |
| 5 | T1 | 2.01 / 1.22 | 0.69 / 0.51 | 0.47 / 0.31 | 0.77 / - | 0.92 / 0.74 | 0.93 / 0.69 | 0.98 / 0.68 | 0.75 / 0.57 | 0.50 / 0.38 |
| 6 | T1 | 2.76 / 2.51 | 1.32 / 1.09 | 0.61 / 0.45 | 1.06 / - | 1.37 / 1.14 | 1.41 / 1.20 | 1.29 / 0.96 | 0.92 / 0.75 | 0.78 / 0.61 |
| 7 | T1 | 1.67 / 1.27 | 0.71 / 0.54 | 0.54 / 0.40 | 0.63 / - | 1.53 / 1.36 | 0.61 / 0.50 | 0.88 / 0.65 | 0.65 / 0.50 | 0.50 / 0.40 |
| 8 | T1 | 1.93 / 1.47 | 0.56 / 0.47 | 0.40 / 0.33 | 0.53 / - | 0.64 / 0.55 | 0.88 / 0.80 | 0.81 / 0.64 | 0.97 / 0.83 | 0.67 / 0.56 |
| 9 | T1 | 1.86 / 1.28 | 0.49 / 0.41 | 0.40 / 0.42 | 0.52 / - | 0.60 / 0.52 | 0.59 / 0.50 | 0.55 / 0.43 | 0.55 / 0.48 | 0.45 / 0.38 |
| 10 | T1 | 4.70 / 2.49 | 3.81 / 2.43 | 2.63 / 1.59 | 2.56 / - | 3.65 / 2.27 | 3.78 / 2.40 | 3.18 / 1.95 | 2.06 / 1.29 | 1.43 / 0.90 |
| 11 | T1 | 1.72 / 1.29 | 0.78 / 0.51 | 0.64 / 0.42 | 1.05 / - | 0.92 / 0.65 | 0.85 / 0.60 | 0.79 / 0.51 | 1.03 / 0.68 | 0.74 / 0.48 |
| 12 | T1 | 2.84 / 2.30 | 1.04 / 0.81 | 1.20 / 0.86 | 0.91 / - | 1.25 / 1.02 | 0.71 / 0.50 | 0.72 / 0.53 | 0.99 / 0.70 | 0.75 / 0.53 |
| 13 | T2 | - | - | 3.41 / 4.25 | - | - | - | - | 3.54 / 4.08 | 2.77 / 3.19 |
| 14 | T2 | 6.63 / 8.76 | 8.07 / 10.9 | 7.29 / 9.80 | - | 4.89 / 6.29 | - | - | 9.59 / 12.2 | 8.68 / 10.9 |
| 15 | T2 | 1.94 / 2.56 | 1.61 / 2.01 | 2.73 / 2.21 | - | 1.58 / 1.98 | - | - | 2.57 / 3.16 | 1.99 / 2.43 |
| 16 | T3 | 1.35 / 1.04 | 3.10 / 2.69 | 3.18 / 2.11 | - | 1.83 / 1.49 | - | - | 2.25 / 1.87 | 1.83 / 1.51 |
| 17 | T3 | 7.82 / 4.88 | 7.01 / 4.49 | 3.01 / 2.52 | - | 3.05 / 2.00 | - | - | 3.01 / 1.99 | 2.22 / 1.49 |
| 18 | T3 | 2.46 / 2.00 | 2.99 / 2.52 | 4.46 / 3.23 | - | 1.62 / 1.36 | - | - | 2.73 / 2.29 | 2.01 / 1.70 |
| 19 | T3 | 1.73 / 1.27 | 1.67 / 1.23 | 3.58 / 3.98 | - | 1.24 / 0.92 | - | - | 1.57 / 1.15 | 1.23 / 0.90 |
| 20 | T2 | 3.33 / 3.90 | 2.80 / 3.46 | 1.94 / 1.66 | - | 2.04 / 2.23 | - | - | 2.10 / 2.41 | 1.53 / 1.78 |
| 21 | T3 | 3.41 / 2.43 | 1.88 / 1.32 | 2.56 / 2.02 | - | 2.49 / 1.81 | - | - | 3.44 / 2.45 | 2.74 / 1.96 |
| 22 | T3 | 2.69 / 2.12 | 0.92 / 0.74 | 3.12 / 3.28 | - | 1.16 / 0.92 | - | - | 1.61 / 1.26 | 1.02 / 0.80 |
| 23 | T2 | 0.51 / 0.59 | 0.49 / 0.57 | 1.72 / 1.97 | - | 0.66 / 0.79 | - | - | 0.75 / 0.88 | 0.51 / 0.59 |
| Mean T1, Rec 1-12 | avAE | 2.34 | 1.28 | 0.89 | 1.02 | 1.50 | 1.25 | 1.11 | 1.02 | 0.65 |
| | avRE | 1.82 | 1.01 | 0.65 | - | 1.12 | 0.99 | 0.80 | 0.81 | 0.55 |
| | sdAE | 2.47 | 2.61 | - | 1.79 | 1.95 | - | 1.99 | 1.25 | 1.00 |
| T2-T3, Rec 13-23 | avAE | - | - | 3.36 | - | - | - | - | 3.01 | 2.16 |
| | avRE | - | - | 3.33 | - | - | - | - | 3.06 | 2.21 |
| | sdAE | - | - | - | - | - | - | - | 3.83 | 2.89 |
| Test, Rec 14-23 | avAE | 3.19 | 3.05 | 3.35 | - | 2.13 | - | - | 2.95 | 2.11 |
| | avRE | 2.95 | 3.00 | 3.27 | - | 2.77 | - | - | 2.96 | 2.12 |
| | sdAE | 3.61 | 3.35 | - | - | 2.04 | - | - | 3.71 | 2.82 |
| Rec 1-12, 14-23 | avAE | 2.73 | 2.08 | 2.01 | - | 1.79 | - | - | 1.90 | 1.31 |
| | avRE | 2.33 | 1.91 | 1.84 | - | 1.87 | - | - | 1.98 | 1.26 |
| | sdAE | 2.99 | 2.79 | - | - | 1.99 | - | - | 2.37 | 1.83 |
| All, Rec 1-23 | avAE | - | - | 2.07 | - | - | - | - | 1.97 | 1.37 |
| | avRE | - | - | 1.95 | - | - | - | - | 1.89 | 1.34 |
| | sdAE | - | - | - | - | - | - | - | 2.48 | 1.91 |
The HRs generated by TROIKA and JOSS on recordings 14-23 are obtained from
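The three summary metrics reported in the table can be computed per recording as sketched below. The definitions are assumptions, since Section III is where the paper specifies them: avAE is taken as the mean absolute error in BPM, avRE as the mean absolute error relative to the ground truth in percent, and sdAE as the (population) standard deviation of the absolute error. The function name `hr_errors` and the sample values are hypothetical.

```python
import statistics


def hr_errors(est_bpm, true_bpm):
    """Return avAE (BPM), avRE (%) and sdAE (BPM) for paired HR estimates."""
    abs_err = [abs(e - t) for e, t in zip(est_bpm, true_bpm)]
    rel_err = [100.0 * a / t for a, t in zip(abs_err, true_bpm)]
    return {
        "avAE": statistics.mean(abs_err),
        "avRE": statistics.mean(rel_err),
        "sdAE": statistics.pstdev(abs_err),  # population form; the paper may use the sample form
    }


# Made-up estimates against a constant 100 BPM ground truth.
metrics = hr_errors([100.0, 102.0, 98.0, 110.0], [100.0, 100.0, 100.0, 100.0])
```

For these made-up values the absolute errors are 0, 2, 2 and 10 BPM, giving an avAE of 3.5 BPM.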
# tunable thresholds TROIKA [2] JOSS [3] SpaMa [21] EEMD [23] Spectrap [24] IMAT [25] MC-SMD [27] WFPV This study WFPV + VD This study
The number of parameters does not include the preprocessing parameters such as filter length, cut-off points, down-sampling, etc.
TABLE V

| | | TROIKA [2] | JOSS [3] | WFPV | Ensemble |
|---|---|---|---|---|---|
| Rec. 14-23 | avAE | 3.19 | 3.05 | 2.95 | 2.32 |
| | avRE | 2.95 | 3.00 | 2.96 | 2.29 |
| | sdAE | 3.61 | 3.35 | 3.71 | 2.37 |

JOSS [3] and WFPV methods by taking a simple average of their HR estimates. It can be seen that even a late decision-level combination of the HR estimates significantly reduces the error in all reported metrics. In fact, Bland-Altman plots that show the distribution of errors for a given algorithm (as shown in Fig. 4(b) for WFPV) can be used to assess the level of complementarity of various approaches; for instance, algorithms that produce most errors in the region of high HR would be good candidates to form an ensemble with WFPV. For this purpose, the availability of the algorithms' implementations for building more accurate solutions is essential.

An algorithm based on the Wiener filtering and the phase vocoder is proposed. It provides a simple but effective solution
