A Multi-Channel Ratio-of-Ratios Method For Noncont

This paper presents a noncontact method for monitoring blood oxygen saturation (SpO2) using smartphone cameras, addressing limitations of traditional contact-based pulse oximeters. The proposed multi-channel ratio-of-ratios (RoR) method utilizes all three RGB channels for improved accuracy, achieving a mean absolute error of 1.26% compared to the pulse oximeter reference. This approach is particularly beneficial during the COVID-19 pandemic, as it allows for SpO2 monitoring without skin contact, reducing irritation and sanitary concerns.

A Multi-Channel Ratio-of-Ratios Method for Noncontact Hand Video Based SpO2 Monitoring Using Smartphone Cameras
Xin Tian, Student Member, IEEE, Chau-Wai Wong, Member, IEEE, Sushant M. Ranadive, and Min Wu, Fellow, IEEE

X. Tian and M. Wu are with the Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, 20742 USA (e-mail: {xtian17, minwu}@umd.edu).
C.-W. Wong is with the Department of Electrical and Computer Engineering and the Forensic Sciences Cluster, NC State University, Raleigh, NC, 27695 USA (email: [email protected]).
S. M. Ranadive is with the Department of Kinesiology, School of Public Health, University of Maryland, College Park, MD, 20742 USA (e-mail: [email protected]).

arXiv:2107.08528v1 [eess.SP] 18 Jul 2021

Abstract: Blood oxygen saturation (SpO2) is an important indicator for pulmonary and respiratory functionalities. Clinical findings on COVID-19 show that many patients had dangerously low blood oxygen levels not long before conditions worsened. It is therefore recommended, especially for the vulnerable population, to regularly monitor the blood oxygen level for precaution. Recent works have investigated how ubiquitous smartphone cameras can be used to infer SpO2. Most of these works are contact-based, requiring users to cover a phone's camera and its nearby light source with a finger to capture reemitted light from the illuminated tissue. Contact-based methods may lead to skin irritation and sanitary concerns, especially during a pandemic. In this paper, we propose a noncontact method for SpO2 monitoring using hand videos acquired by smartphones. Considering the optical broadband nature of the red (R), green (G), and blue (B) color channels of the smartphone cameras, we exploit all three channels of RGB sensing to distill the SpO2 information beyond the traditional ratio-of-ratios (RoR) method that uses only two wavelengths. To further facilitate an accurate SpO2 prediction, we design adaptive narrow bandpass filters based on accurately estimated heart rate to obtain the most cardiac-related AC component for each color channel. Experimental results show that our proposed blood oxygen estimation method can reach a mean absolute error of 1.26% when a pulse oximeter is used as a reference, outperforming the traditional RoR method by 25%.

Index Terms: Blood oxygen saturation, contact-free, smartphone, hand videos, ratio-of-ratios model.

I. INTRODUCTION

Peripheral blood oxygen saturation (SpO2) shows the ratio of oxygenated hemoglobin to total hemoglobin in the blood, which serves as a vital health signal for the operational functions of organs and tissues [1]. It has become increasingly important in the COVID-19 pandemic, where many patients have experienced "silent hypoxia," a low level of SpO2 even before obvious breathing difficulty is observed [2]–[4]. The vulnerable population with a high possibility of infection is recommended to monitor their oxygen status continuously for early COVID-19 detection [2], [5].

Pulse oximeters, in the form of a finger clip or earlobe clip, have been widely used for SpO2 measurement at home and in hospitals [6], [7]. They adopt the ratio-of-ratios (RoR) principle, which is based on the different optical absorption rates of oxygenated hemoglobin (HbO2) and deoxygenated hemoglobin (Hb) at 660 nm (red) and 940 nm (infrared) wavelengths. When red and infrared light illuminates the peripheral microvascular bed of tissue such as the fingertip, the intensity of the transmitted light at the receiver end of the pulse oximeter contains pulsatile information from which the level of blood oxygen saturation is derived. The gold standard for measuring the blood oxygen saturation is blood gas analysis, which is invasive and painful and requires well-trained healthcare providers to perform the test. In contrast, the pulse oximeter is noninvasive and provides readings in nearly real time, and is therefore better tolerated and more convenient for daily use. The pulse oximeter is known to have a deviation of ±2% from the gold standard when the blood oxygen saturation is in the range of 70% to 99% [8], which is well known and accepted in clinical use.

With the ubiquity of smartphones and the growing market of smart fitness devices, the RoR principle has been applied to new nonclinical settings for SpO2 measurement. Apple Watch Series 6 has blood oxygen measurement functionality, and it requires skin contact with the watch neither too tight nor too loose for the best results [9]. The recent scientific literature has also explored methods for SpO2 estimation using a smartphone. These methods require a user to cover an optical sensor (a built-in camera or a dedicated sensor) and a nearby light source (a built-in flashlight or purposely designated lights) with a fingertip to capture the reemitted light from the illuminated tissue [10]–[14]. The aforementioned SpO2 estimation methods based on smartphones and smartwatches are contact-based. They are sensitive to motion, and may irritate sensitive skin or cause a burning sensation from the heat that builds up if the fingertip stays in contact with the lit flashlight for an extended period of time.

Researchers have recently investigated measuring the saturation of blood oxygen by means of contactless techniques [15]–[18]. These methods typically acquire a user's face video under ambient light with CCD cameras to estimate SpO2 from pulsatile information of monochromatic wavelengths. Shao et al. [19] also use a facial video-based method to monitor SpO2 that is implemented using a CMOS camera with a light source alternating at two different wavelengths.
Tsai et al. [20] acquire hand images with CCD cameras under two monochromatic lights to analyze SpO2 from the reflective intensity of the shallow skin tissue. These contactless methods can provide alternatives to contact-based SpO2 measurements for individuals with finger injuries or nail polish [21], [22], for whom the traditional pulse oximeters may be inaccurate. However, the setups used in the abovementioned studies are expensive and not common for daily use.

More economical camera devices, namely low-cost webcams [23]–[26] and smartphones [27], have also been applied for contactless SpO2 estimation. Unlike pulse oximeters, all three color channels in these economical RGB cameras capture a wide range of wavelengths from the ambient light. Most works using RGB cameras [23]–[27] directly apply the conventional narrowband red–infrared RoR principle to the red and blue (or green) channels of RGB videos without addressing the broadband nature of the color channels.

In this paper, we propose a multi-channel RoR method for noncontact SpO2 monitoring using hand videos captured by smartphone cameras under ambient light. The contributions of our work include:
• We exploit all three RGB channels to extract features for SpO2 prediction, instead of limiting to two wavelengths/color channels as in traditional RoR methods. Efficiently utilizing three channels is nontrivial and is one of the key research issues we shall address in this paper. We will take into consideration the underlying optophysiological model given the smartphone camera as the remote sensor and the ambient light environment. Our proposed multi-channel RoR based method can achieve a mean absolute error of 1.26% in SpO2 estimation with the pulse oximeter as the reference, which is 25% lower than that of the traditional RoR model.
• We filter the RGB signals with a narrow adaptive bandpass (ABP) filter centered at an accurately estimated heart rate (HR) to obtain the most relevant cardiovascular-related AC component from each color channel for feature extraction. We systematically analyze and verify the important roles of both the narrow ABP filter and the accurate HR tracking for accurate SpO2 monitoring.
• We investigate a more favorable scenario for data collection by using the hand as the signal source while face-covering mandates are in effect. It has a few practical advantages compared to facial videos, including being applicable with masks on, raising fewer privacy concerns, and potentially being more tolerant to different skin tones than the face. We analyze the impact of the sides of the hand and skin tones on the SpO2 estimation performance. Given our collected dataset, we find that using the palm side for video capturing has a good SpO2 estimation performance regardless of the skin tones. We also do not observe significant performance differences between skin-tone subgroups if they use the palm side for video capturing.

II. BACKGROUND AND RELATED WORK

In this section, we review the ratio-of-ratios (RoR) model for SpO2 measurement. Consider a light source with spectral distribution $I(\lambda)$ illuminating the skin, and a remote color camera with spectral responsivity $r(\lambda)$ recording a video. The light from the source travels through the tissue and part of the light in the tissue is reemitted to be received by the color camera. During each cardiac cycle, the heart muscle contracts and relaxes, so that the blood is pumped to the body and travels back to the heart. During this process, the blood volume increases and decreases in the arterial vessels, causing increased and decreased light absorption [28]. According to the skin-reflection model [29], [30], the color camera will receive the specularly reflected light from the skin surface and the diffusely reemitted light from the tissue–light interaction that contains the cardiac-related pulsatile information. Based on the verified assumption proposed in [17] that the specular reflection components can be ignored if the movement is minimized, the camera sensor response at time $t$ can be expressed as:

$$S_c(t) = \int_{\Lambda_c} I(\lambda)\, e^{-\mu_d(\lambda, t)}\, r_c(\lambda)\, d\lambda, \tag{1}$$

where $\lambda$ is the wavelength, the integral range $\Lambda_c$ captures the responsive wavelength band of channel $c$ of the camera, $I(\lambda)$ is the spectral intensity of the light source, $\mu_d(\lambda, t)$ is the diffusion coefficient, and $r_c(\lambda)$ is the sensor response of channel $c$ of the camera.

According to the Beer–Lambert law [16], the diffusion coefficient $\mu_d(\lambda, t)$ can be expanded into:

$$\mu_d(\lambda, t) = \varepsilon_t(\lambda)\, C_t\, l_t + \left[\varepsilon_{\mathrm{Hb}}(\lambda)\, C_{\mathrm{Hb}} + \varepsilon_{\mathrm{HbO_2}}(\lambda)\, C_{\mathrm{HbO_2}}\right] l(t), \tag{2}$$

where $\varepsilon_{\mathrm{Hb}}$, $\varepsilon_{\mathrm{HbO_2}}$, and $\varepsilon_t$ are the extinction coefficients of arterial deoxyhemoglobin, arterial oxyhemoglobin, and other tissues including the venous blood vessel, respectively; $C_{\mathrm{Hb}}$, $C_{\mathrm{HbO_2}}$, and $C_t$ are the concentrations of the corresponding substances; $l_t$ is the path length that the light travels in the tissue and is assumed to be time-invariant; $l(t)$ is the path length that the light travels in the arterial blood vessels. Note that $l(t)$ is time-varying because the arteries will dilate with increased blood during systole compared to diastole.

The integral range $\Lambda_c$ can be simplified to a single value $\lambda_i$ when the camera is monochromatic, incoming light is filtered by a narrowband optical filter, or alternatively, the light source is a narrowband LED. Then the response of the camera sensor in (1) can be written as:

$$S_c(t) = I(\lambda_i)\, e^{-\varepsilon_t(\lambda_i)\, C_t\, l_t}\, r_c(\lambda_i)\, e^{-\left[\varepsilon_{\mathrm{Hb}}(\lambda_i)\, C_{\mathrm{Hb}} + \varepsilon_{\mathrm{HbO_2}}(\lambda_i)\, C_{\mathrm{HbO_2}}\right] l(t)}. \tag{3}$$

Let $\Delta l = l_{\max} - l_{\min}$ denote the difference of the light path of the pulsatile arterial blood between diastole when $l(t) = l_{\min}$ and systole when $l(t) = l_{\max}$. The log-ratio of the response of the $c$th channel of the camera sensor during diastole and systole can then be written as:

$$R(\lambda_i) = \log\left(\frac{S_c\big|_{l(t) = l_{\min}}}{S_c\big|_{l(t) = l_{\max}}}\right) \tag{4a}$$

$$\phantom{R(\lambda_i)} = \left[\varepsilon_{\mathrm{Hb}}(\lambda_i)\, C_{\mathrm{Hb}} + \varepsilon_{\mathrm{HbO_2}}(\lambda_i)\, C_{\mathrm{HbO_2}}\right] \Delta l. \tag{4b}$$
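As a quick numerical sanity check of the narrowband model, the following sketch evaluates the sensor response of (3) at diastole and systole and compares the log-ratio of (4a) with the closed form of (4b). The function name `sensor_response` and all parameter values are hypothetical and unitless, chosen only for illustration; they are not taken from the paper.

```python
import math


def sensor_response(l, *, intensity, responsivity, eps_t, c_t, l_t,
                    eps_hb, c_hb, eps_hbo2, c_hbo2):
    """Eq. (3): narrowband sensor response for arterial path length l."""
    # Time-invariant attenuation by non-arterial tissue.
    tissue = math.exp(-eps_t * c_t * l_t)
    # Pulsatile attenuation by arterial Hb and HbO2.
    blood = math.exp(-(eps_hb * c_hb + eps_hbo2 * c_hbo2) * l)
    return intensity * tissue * responsivity * blood


# Hypothetical, unitless parameters (illustration only, not from the paper).
params = dict(intensity=1.0, responsivity=0.8, eps_t=0.3, c_t=1.0, l_t=2.0,
              eps_hb=3.2, c_hb=0.05, eps_hbo2=0.3, c_hbo2=0.95)
l_min, l_max = 0.10, 0.12  # arterial path length at diastole and systole

s_dia = sensor_response(l_min, **params)
s_sys = sensor_response(l_max, **params)

# Eq. (4a): log-ratio of the diastolic and systolic responses.
log_ratio = math.log(s_dia / s_sys)

# Eq. (4b): hemoglobin absorbance times the path-length change.
absorbance = (params["eps_hb"] * params["c_hb"]
              + params["eps_hbo2"] * params["c_hbo2"])
closed_form = absorbance * (l_max - l_min)

print(log_ratio, closed_form)
```

The source intensity, the sensor responsivity, and the static tissue term cancel in the ratio, which is why only the hemoglobin absorbance and the path-length change remain in (4b).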
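[Figure 1: block diagram. Blocks: Hand Video, ROI Generation, Spatial Combining, Feature Extraction, SpO2 Prediction, rPPG Extraction, Heart Rate Estimation (to assist the design of the adaptive bandpass filter). Output plots: estimated and reference SpO2 (%) and HR (bpm).]
Fig. 1: System illustration for the SpO2 prediction using the smartphone captured hand videos. The pixels from the hand region are utilized for prediction, and an rPPG signal is extracted for heart rate (HR) estimation. Multi-channel RoR features are derived from the spatially combined RGB signals with the help of the HR-guided filters. The extracted features are then used for SpO2 prediction.

For two different wavelengths $\lambda_1$ and $\lambda_2$, the ratio of ratios (RoR) can then be defined as:

$$\mathrm{RoR}(\lambda_1, \lambda_2) = \frac{R(\lambda_1)}{R(\lambda_2)} = \frac{\varepsilon_{\mathrm{Hb}}(\lambda_1)\, C_{\mathrm{Hb}} + \varepsilon_{\mathrm{HbO_2}}(\lambda_1)\, C_{\mathrm{HbO_2}}}{\varepsilon_{\mathrm{Hb}}(\lambda_2)\, C_{\mathrm{Hb}} + \varepsilon_{\mathrm{HbO_2}}(\lambda_2)\, C_{\mathrm{HbO_2}}}. \tag{5}$$

Since $\mathrm{SpO_2} = C_{\mathrm{HbO_2}} / (C_{\mathrm{HbO_2}} + C_{\mathrm{Hb}})$, the relation between RoR and SpO2 can be written as:

$$\mathrm{SpO_2} = \frac{\varepsilon_{\mathrm{Hb}}(\lambda_1) - \varepsilon_{\mathrm{Hb}}(\lambda_2) \cdot \mathrm{RoR}}{\varepsilon_{\mathrm{Hb}}(\lambda_1) - \varepsilon_{\mathrm{HbO_2}}(\lambda_1) + \left[\varepsilon_{\mathrm{HbO_2}}(\lambda_2) - \varepsilon_{\mathrm{Hb}}(\lambda_2)\right] \cdot \mathrm{RoR}} \tag{6a}$$

$$\phantom{\mathrm{SpO_2}} \approx \alpha \cdot \mathrm{RoR} + \beta, \tag{6b}$$

where the linear approximation can be obtained by a Taylor expansion.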
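The relation between (5), (6a), and (6b) can be explored numerically. The sketch below maps an SpO2 value to RoR through (5), inverts it through (6a), and builds a first-order approximation of the form (6b) around a chosen operating point using a numerical derivative. The function names are hypothetical, and the four extinction coefficients are placeholder values of roughly the magnitude commonly tabulated near 660 nm and 940 nm; they are assumptions for illustration, not numbers from the paper.

```python
def ror_from_spo2(spo2, eps_hb1, eps_hbo2_1, eps_hb2, eps_hbo2_2):
    """Eq. (5) with C_HbO2 = spo2 and C_Hb = 1 - spo2 (total concentration cancels)."""
    num = eps_hb1 * (1.0 - spo2) + eps_hbo2_1 * spo2
    den = eps_hb2 * (1.0 - spo2) + eps_hbo2_2 * spo2
    return num / den


def spo2_from_ror(ror, eps_hb1, eps_hbo2_1, eps_hb2, eps_hbo2_2):
    """Eq. (6a): exact inversion of Eq. (5)."""
    num = eps_hb1 - eps_hb2 * ror
    den = eps_hb1 - eps_hbo2_1 + (eps_hbo2_2 - eps_hb2) * ror
    return num / den


def linearize(ror0, eps, h=1e-6):
    """First-order Taylor expansion of Eq. (6a) around ror0, returning (alpha, beta)."""
    slope = (spo2_from_ror(ror0 + h, *eps) - spo2_from_ror(ror0 - h, *eps)) / (2.0 * h)
    intercept = spo2_from_ror(ror0, *eps) - slope * ror0
    return slope, intercept


# Placeholder coefficients: (eps_Hb, eps_HbO2) at lambda_1, then at lambda_2.
EPS = (3226.56, 319.6, 693.44, 1214.0)

ror_95 = ror_from_spo2(0.95, *EPS)         # RoR at 95% saturation
round_trip = spo2_from_ror(ror_95, *EPS)   # should recover 0.95
alpha, beta = linearize(ror_95, EPS)       # Eq. (6b) around the 95% point

ror_90 = ror_from_spo2(0.90, *EPS)
approx_90 = alpha * ror_90 + beta          # linear estimate at 90% saturation

print(round_trip, alpha, beta, approx_90)
```

With these placeholder coefficients the slope is negative, so a larger RoR corresponds to a lower saturation, and the linear estimate stays close to the exact value a few percentage points away from the expansion point.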
The linear RoR model in (6b) has been applied under different SpO2 measurement scenarios. For pulse oximeters, $\lambda_1 = 660$ nm and $\lambda_2 = 940$ nm are used to leverage the optical absorption difference of Hb and HbO2 at the two wavelengths. In some prior art using narrowband light sources or monochromatic camera sensors [15], [19] for contactless SpO2 monitoring, different combinations of $(\lambda_1, \lambda_2)$ have been explored. In the prior art using consumer-grade RGB cameras [23]–[27], only two out of the three available RGB channels were used for the linear RoR model.

Among the abovementioned SpO2 estimation methods using consumer-grade RGB cameras, the SpO2 data collected in [23], [24] only cover a small dynamic range (mostly above 95%), which is not very meaningful. Bal et al. [25] and Tarassenko et al. [26] show a fitted linear relation between RoR and SpO2 for data that last for merely several minutes. These limitations can be attributed to the fact that, unlike the signals captured in the narrowband setting, which are modeled precisely by (3) and (4), all three RGB color channels capture a wide range of wavelengths from the ambient light, as described in (1). The aggregation of the broad range of wavelengths lowers the optical difference between Hb and HbO2 and makes the channels less optically selective than the narrowband sensors used in oximeters. To address this issue, we disentangle the aggregation through a careful combination of the pulsatile signals from all three channels of RGB videos to efficiently distill the SpO2 information.

III. PROPOSED MULTI-CHANNEL RoR METHOD

Fig. 1 illustrates the proposed procedure for the SpO2 estimation from the smartphone captured hand videos. First, the hand is detected as the region of interest (ROI) for each frame. Second, the spatial average from the ROI is calculated to obtain three time-varying signals of RGB channels. The averaged RGB signals are extracted for two purposes: i) to estimate the heart rate (HR), and ii) to acquire the filtered cardio-related AC components using an HR-based adaptive bandpass filter. Third, the ratio between the AC and the DC components for each color channel and the pairwise ratios of the resulting three ratios are computed as the features for a regression model where SpO2 is treated as the label. The details of each step are provided as follows.

A. ROI Generation via Thresholding and Spatial Combining

To facilitate the data collection with good quality, we use a rectangle to enclose the target hand region. We use an interactive user interface for this step, which can be replaced by an automated hand detection algorithm [31] if desired. The pixels in this region are converted from the RGB color space to the YCrCb color space, and the Cr channel is used [32] to determine a threshold that best differentiates the skin pixels from the background using the Otsu algorithm [33]. We apply morphological erosion and dilation operations with a median filter to exclude noise pixels outside the region of the binary hand mask. The final hand-shaped mask is considered as the ROI, and an example is shown in the second picture in Fig. 1. For all $n$ frames in the video, we calculate the spatial average values of the red, green, and blue channels in the ROI and denote them as $\bar{r}, \bar{g}, \bar{b} \in \mathbb{R}^{1 \times n}$, and arrange them into a matrix $A = [\bar{r}; \bar{g}; \bar{b}] \in \mathbb{R}^{3 \times n}$.

B. rPPG Extraction and HR Estimation

In a typical RoR method, after matrix $A$ is calculated, the AC component for each channel of $A$ is quantified by either the standard deviation [10] or the peak-to-valley amplitude [19]. Since the signal-to-noise ratio (SNR) is lower for the video captured by a smartphone in a contactless manner, we propose to use an adaptive bandpass filter centered at the HR frequency to clean the RGB channel signals so as to extract the AC components more precisely.

The HR can be measured contact-free by capturing the pulse-induced subtle color variations of the skin. The pulse signal, referred to as remote photoplethysmogram (rPPG), can be obtained by applying the plane-orthogonal-to-skin (POS) algorithm [29], which defines a plane orthogonal to the skin tone in the RGB space for robust rPPG extraction. The HR is then tracked from the rPPG signal via a state-of-the-art adaptive multi-trace carving (AMTC) [34] algorithm that tracks the HR from the spectrogram of rPPG by dynamic programming and adaptive trace compensation.

To study the role of accurate HR tracking for feature extraction, we also implemented a peak-finding method and a weighted energy method for frequency estimation [35] to compare with AMTC. The peak-finding method takes the peaks of the squared magnitude of the Fourier transform of rPPG as the
estimated HR values, which was used in [16] and [17]. The weighted energy method finds the heart rate by weighting the frequency bins in the corresponding frame of the spectrogram of rPPG. Compared to the peak-finding method, the weighted energy method is more robust to outliers in frequency. Fig. 2 illustrates an example of the HR estimation results by the peak-finding method, the weighted energy algorithm, and AMTC, respectively.

[Figure 2: two panels, frequency (bpm) versus time (min) over 0 to 5 min. Panel (b) legend: HR (Ref), HR (Peak-finding), HR (Weighted), HR (AMTC). Panel annotation: MAE: Peak-finding = 6.00 bpm, Weighted = 4.94 bpm, AMTC = 2.50 bpm.]
Fig. 2: (a) Spectrogram of an rPPG signal. (b) Reference HR signal and HR signals estimated by the peak-finding method, the weighted energy frequency estimation algorithm, and AMTC algorithm, respectively. The mean absolute errors (MAE) of the HR estimation algorithms are 6.00, 4.94, and 2.50 bpm, respectively.

C. Feature Extraction

We use a processing window of 10 seconds with a step size of 1 second to segment the whole video into $L$ windows. Within each window, the DC and AC components of the RGB channels are calculated to build a feature vector $f$.

DC component: We use a second-order lowpass Butterworth filter with a cutoff frequency at 0.1 Hz. The DC component is estimated using the median of the lowpass filtered signal of each window.

AC component: The estimated heart rate values from Section III-B are used as the center frequencies for the adaptive bandpass (ABP) filters to extract the AC components of the RGB channels, which eliminates all frequency components that are unrelated to the cardiac pulse. We adopt an 8th-order Butterworth bandpass filter with ±0.1 Hz (±6 bpm) bandwidth, centered at the estimated HR of the current window. The magnitude of the AC component is estimated using the average of the peak-to-valley amplitudes of the filtered signals within the current processing window.

We define the normalized AC components at the $i$th window as $R(i, c) = \frac{AC(i, c)}{DC(i, c)}$, where $c \in \{r, g, b\}$ represents the color channel and $i \in \{1, 2, \ldots, L\}$. We define the multi-channel ratio-of-ratios based feature vector of the $i$th window as

$$f_i = \left[R(i, r),\; R(i, g),\; R(i, b),\; \frac{R(i, r)}{R(i, g)},\; \frac{R(i, r)}{R(i, b)},\; \frac{R(i, g)}{R(i, b)}\right] \in \mathbb{R}^{1 \times 6}.$$

D. Regression and Prediction

In this first proof-of-concept work leveraging multi-channel ratio features, we use linear regression (LR) and support vector regression (SVR) to learn the mapping between the features and the SpO2 level. Since LR captures only the linear relationship, it has limited learning capability and will serve as a baseline. The objective function is

$$\min_{w}\; \|y - Fw\|_2^2 + \lambda \|w\|_2^2,$$

where $y = [y_1, \ldots, y_L]^T \in \mathbb{R}^{L \times 1}$ contains the target SpO2 values, $F = [f_1; \ldots; f_L] \in \mathbb{R}^{L \times 6}$ is the feature/data matrix derived from the input, and $w \in \mathbb{R}^{6 \times 1}$ contains the weights. An $\ell_2$-regularization term is added to the objective function to avoid rank deficiency caused by the collinearity among features. To select the optimal regularization parameter $\lambda$, we use a 5-fold cross-validation. In addition to LR, we use the SVR to better capture the nonlinearity of the features. The Libsvm library [36] is used for training the $\epsilon$-SVR,

$$\min_{w, b}\; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{L} L_\epsilon\!\left(y_i,\; w^T \phi(f_i) + b\right),$$

where $L_\epsilon$ is the linear $\epsilon$-insensitive loss function [37]. Our implementation uses the radial basis function (RBF) kernel to capture the nonlinearity. The hyperparameters, including the penalty cost $C$ and the kernel parameter $\gamma$ of the kernel function $K(f_i, f_j) = \phi(f_i)^T \phi(f_j) = \exp\left(-\gamma \|f_i - f_j\|^2\right)$, are selected via a grid search over a 5-fold cross-validation loss.

Once an estimated weight vector $\hat{w}$ is learned from the linear or support vector regression, $\hat{w}$ is then used to predict a preliminary SpO2 signal. Finally, a 10-second moving average window is applied to smooth out the preliminarily predicted signal to obtain the final predicted SpO2 signal.

Fig. 3: Experimental setup for data collection of hand videos and reference signals using an oximeter. The left index finger was placed in a CMS-50E pulse oximeter to record the reference HR and SpO2 signals. The smartphone camera is recording the video of both hands.

IV. EXPERIMENTAL RESULTS

A. Data Collection

Fourteen volunteers, including eight females and six males, were enrolled in our study under protocol #1376735-2 approved by the University of Maryland Institutional Review Board (IRB), with age range between 21 and 30, and Fitzpatrick skin types II–V. There are two, eight, one, and three participants having skin types II, III, IV, and V, respectively. None of the participants had any known cardiovascular or respiratory diseases. During the data collection, participants
were asked to hold their breath to induce a wide dynamic range of SpO2 levels. The typical SpO2 range for a healthy person is from 95% to 100%. By holding breath, the SpO2 level can drop below 90%. Once the participant resumes normal breathing, the SpO2 will return to the level before the breath-holding.

[Figure 4: fourteen panels, (a) to (n), each plotting SpO2 (%) versus time (s) for a training and a testing session, with a reference signal and a predicted signal. Values annotated on the panels:]

| Panel | Participant | Skin type | Training MAE | Training ρ | Test MAE | Test ρ |
|---|---|---|---|---|---|---|
| (a) | 1 | III | 1.66% | 0.87 | 1.23% | 0.84 |
| (b) | 2 | III | 0.97% | 0.88 | 1.34% | 0.71 |
| (c) | 3 | III | 1.08% | 0.76 | 1.34% | 0.65 |
| (d) | 4 | III | 1.23% | 0.74 | 1.57% | 0.65 |
| (e) | 5 | III | 0.96% | 0.89 | 1.37% | 0.47 |
| (f) | 6 | III | 0.41% | 0.81 | 0.54% | 0.75 |
| (g) | 7 | II | 0.83% | 0.77 | 0.74% | 0.82 |
| (h) | 8 | IV | 2.38% | 0.74 | 1.50% | 0.76 |
| (i) | 9 | V | 1.70% | 0.60 | 1.36% | 0.64 |
| (j) | 10 | V | 0.99% | 0.67 | 1.01% | 0.62 |
| (k) | 11 | II | 2.00% | 0.61 | 1.29% | 0.58 |
| (l) | 12 | III | 1.26% | 0.83 | 1.71% | 0.71 |
| (m) | 13 | V | 1.96% | 0.74 | 1.64% | 0.69 |
| (n) | 14 | III | 1.13% | 0.80 | 1.02% | 0.68 |

Fig. 4: Predicted SpO2 signals for all participants using SVR when the palm is facing the camera, i.e., the palm-up scenario. Prediction results of training and testing sessions are shown for each participant with reference SpO2 in red dashed lines and predicted SpO2 in solid black lines. The higher the correlation ρ and the lower the MAE, the better the predicted SpO2 captures the trend of the reference signal.

Each participant was recorded for two sessions. During the recording, the participant sat comfortably in an upright position and put both hands on a clean dark foam sheet placed on a table. As shown in Fig. 3, the palm side of the right hand and the back side of the left hand were facing the camera. These two hand-video capturing positions are defined as palm up (PU) and palm down (PD), respectively. Simultaneously, a Contec CMS-50E pulse oximeter was clipped to the left index finger to measure the participant's SpO2 level. As reviewed earlier, the oximeter is accepted clinically as being within ±2% of the invasive gold standard for SpO2 [8], so we use the oximeter measurement results as the reference in our experiments. An iPhone 7 Plus camera was used for video recording. The video recording started 30 seconds before the oximeter started and stopped immediately after the oximeter ended to allow for proper time synchronization. The participants were asked to hold their breath to lower the SpO2 level and were told to resume normal breathing if they felt discomfort. The aforementioned process is defined as one breath-holding cycle. In each session, the breath-holding cycles were repeated three times. After the first session, the participants were asked to relax for at least 15 minutes before attending the second session for data collection. From our data collection protocol using breath-holding, we were able to obtain the SpO2 measurements with a dynamic range from 89% to 99%.

The total length of recording time for all fourteen participants is 138.9 minutes. The current data size is relatively small for large-scale neural network training. This is in large part due to the restrictions on human-subject data collection imposed during the COVID-19 pandemic. The available data, however, is adequate for our principled multi-channel signal based approach to SpO2 monitoring, showing a benefit of combining signal processing and biomedical knowledge and modeling with data over a primarily data-driven approach.

Delay Estimation of Pulse Oximeter: When the CMS-50E oximeter is turned on and ready for measurement, the first reading is displayed a few seconds after the finger is inserted. This delay may be due to the oximeter's internal firmware startup and algorithmic processing. Since we need to synchronize the
video and the oximeter readings using their precise starting time stamps, the delay in the oximeter can introduce misalignment errors in the reference data that we use to train the regression model. To avoid the misalignment, we first estimate the delay and then compensate for it in the training and testing. To do so, we asked one participant to repeatedly place the left index finger, middle finger, and ring finger into the oximeter 50 times each and obtained average delay times of 1.8 s, 1.9 s, and 1.7 s, respectively. Because the left index finger is used for reference data collection in our setup, we take 1.8 s as the delay. To further examine whether there exists any difference among the delays from the three fingers, we conducted a one-way ANOVA test. The p-value is 0.14, which shows no statistically significant difference in delay among the three fingers.

TABLE I: Performance of the proposed method. Results using linear regression (LR) and support vector regression (SVR) for both sides of the hand are quantified in terms of the sample mean and sample standard deviation (in parentheses).

| Regression | Side | Training MAE | Training correlation ρ | Testing MAE | Testing correlation ρ |
|---|---|---|---|---|---|
| LR | PU | 1.69% (±0.57%) | 0.63 (±0.16) | 1.52% (±0.54%) | 0.62 (±0.11) |
| LR | PD | 1.74% (±0.76%) | 0.61 (±0.21) | 1.53% (±0.53%) | 0.56 (±0.21) |
| SVR | PU | 1.33% (±0.54%) | 0.76 (±0.09) | 1.26% (±0.33%) | 0.68 (±0.10) |
| SVR | PD | 1.35% (±0.45%) | 0.75 (±0.09) | 1.28% (±0.40%) | 0.65 (±0.14) |
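The MAE and correlation columns of Table I follow the definitions in (7a) and (7b). A minimal pure-Python sketch of both metrics is given below; the function names are hypothetical, and the two short SpO2 traces are made-up numbers for illustration, not data from the study.

```python
import math


def mean_absolute_error(y, y_hat):
    """Eq. (7a): average absolute difference between reference and estimate."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def pearson_correlation(y, y_hat):
    """Eq. (7b): inner product of the centered vectors over the product of their norms."""
    mean_y = sum(y) / len(y)
    mean_hat = sum(y_hat) / len(y_hat)
    centered_y = [a - mean_y for a in y]
    centered_hat = [b - mean_hat for b in y_hat]
    numerator = sum(a * b for a, b in zip(centered_y, centered_hat))
    norm_y = math.sqrt(sum(a * a for a in centered_y))
    norm_hat = math.sqrt(sum(b * b for b in centered_hat))
    return numerator / (norm_y * norm_hat)


# Made-up SpO2 traces in percent (one breath-holding dip), illustration only.
reference = [98.0, 97.0, 95.0, 92.0, 90.0, 93.0, 96.0, 98.0]
predicted = [97.5, 97.0, 96.0, 93.0, 91.5, 93.0, 95.0, 97.0]

mae = mean_absolute_error(reference, predicted)
rho = pearson_correlation(reference, predicted)
print(mae, rho)
```

MAE measures how far the prediction is from the reference on average, while the correlation is insensitive to a constant offset and reflects only how well the trend of the dips is followed, which is why the two are reported together.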

B. Performance Metrics

The performance of the algorithm is evaluated using the mean absolute error (MAE) (7a) and Pearson's correlation coefficient ρ (7b) given below:

$$\mathrm{MAE}(y, \hat{y}) = \frac{1}{N} \sum_{i=1}^{N} |y_i - \hat{y}_i|, \tag{7a}$$

$$\rho(y, \hat{y}) = \frac{(y - \bar{y})^T (\hat{y} - \bar{\hat{y}})}{\|y - \bar{y}\|_2\, \|\hat{y} - \bar{\hat{y}}\|_2}, \tag{7b}$$

where $y = [y_1, \ldots, y_N]^T$, $\hat{y} = [\hat{y}_1, \ldots, \hat{y}_N]^T$, $\bar{y}$, and $\bar{\hat{y}}$ denote the reference SpO2 signal, the estimated SpO2 signal, and the average values of all coordinates of vectors $y$ and $\hat{y}$, respectively. We adopt the correlation metric to evaluate how well the trend of the SpO2 signal is tracked.

C. Results From Proposed Algorithm

In this subsection, we use the training data from one participant to train the regression model for the prediction of his/her testing session recorded a period of time later. We call the aforementioned training and testing procedure the participant-specific mode, in which the models are specifically learned for each participant. We will discuss the leave-one-out mode of the performance of the proposed algorithm in Section IV-E.

Fig. 4 presents the learning results for all the participants using SVR for PU cases. Both training and testing sessions are shown for each participant. The SpO2 curves in each session contain three dips that result from breath holding, except for participant #8, who had a shorter session due to limited tolerance of breath-holding. For each participant, we provide the skin-tone information in the subplot and show the accuracy indicators, MAE and ρ, for SpO2 prediction. In all training sessions, MAE is below 2.4% and ρ is above 0.6. From this observation, we find that all the predicted SpO2 signals in the training sessions closely follow the reference signals. Furthermore, all testing MAE values are within 1.8%, suggesting that the trained models adapt well to the testing data. While there are a few cycles in which the predicted signal does not fully follow the reference signal, such as the second dip for participant #4 and participant #11, the trends are consistent.

[Figure 5: boxplots in three panels, (a), (b), and (c). Vertical axis: correlation coefficient. Legend: palm down, palm up.]
Fig. 5: Boxplots of testing correlation coefficient ρ for all participants when grouped using different criteria. (a) Distributions contrasting linear and support vector regressions. (b) Distributions of palm-up and palm-down cases. (c) Detailed breakdown of (b) in terms of skin-tone subgroups.

Table I summarizes the training and testing SpO2 estimation performance of both LR and SVR based methods for both PU and PD cases. The best performance is achieved using the SVR method in the PU case. We further examine the difference between the two regression methods using boxplots in Fig. 5(a) that show the distributions of the correlation ρ for testing by LR and SVR, respectively. Each boxplot in Fig. 5(a) contains both PU and PD cases from all participants. The results are compared in terms of the median and the interquartile range (IQR). IQR quantifies the spread of the distribution by measuring the difference between the first quartile and the third quartile. The boxplots in Fig. 5(a) reveal that the SVR method outperforms LR with a higher median of 0.68 compared to 0.59 and with a narrower IQR of 0.17 compared to 0.19. This suggests that there may exist a nonlinear relationship between the extracted features and the SpO2 values.

To examine the impact of the side of a hand and the skin tone on the performance of SpO2 estimation, we analyze the following two research questions: (i) whether the side of the hand makes a difference in lighter skin (types II and III), darker skin (types IV and V), or mixed skins (all participants), and (ii) whether the different skin tones matter in the PU or PD case.

To answer question (i), we first focus on the distributions from PU and PD cases in Fig. 5(b) with each boxplot representing the correlation ρ in testing for all participants. We observe that the PU case outperforms the PD case with a higher median of 0.64 compared to 0.60 and a narrower IQR of 0.15 compared to 0.25. We then zoom into each subgroup of skin tones shown in Fig. 5(c). For the lighter skin group, even
7

TABLE II: Configurations for the ablation study of the proposed


pipeline. The controlled experiments are conducted by replacing or
removing one component at a time.
0.68
Configuration 0.60 0.57

Correlation
0.56
Method Index Multi-channel Narrow Accurate
RoR features? ABP filter? HR tracking?
I Two-channel RoR X X 0.33
II X No ABP n/a 0.22
III X Wide ABP X
IV X X Peak-finding
V X X Weighted energy I II III IV V Proposed
Proposed X X X
(a)

1.67 1.69
though the median of PD case is 0.71, which is 9% better than 1.39 1.43 1.40

MAE (%)
1.26
that of PU, the IQR of PD case is 0.24, which is worse than
the IQR of 0.17 of PU case. This suggests that the distributions
are comparable between PU and PD cases for the lighter skin
group. For the darker skin group, the PU case outperforms
the PD case with a higher median of 0.62 compared to 0.54 I II III IV V Proposed
and a narrower IQR of 0.07 compared to 0.15. In summary, Method Index
(b)
there is no substantial difference between PU and PD cases in
the lighter skin group, whereas for the darker skin group and Fig. 6: Ablation study of the proposed method. The bar plots are
overall participants, the PU case is better than the PD case. from testing sessions (SVR, PU case) of all participants. The error
bars correspond to the 95% confidence intervals.
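
As a minimal sketch of the quantities compared in this subsection, the Python snippet below computes the MAE, the correlation coefficient of (7b), and the median and IQR used to summarize each boxplot. The function names are illustrative, the MAE is assumed to be the usual mean of absolute differences, and the quartile convention (`method="inclusive"`) is an assumption, since the boxplot software is not specified here.

```python
import math
import statistics


def mae(ref, est):
    """Mean absolute error between reference and estimated SpO2 (percentage points)."""
    return sum(abs(r - e) for r, e in zip(ref, est)) / len(ref)


def correlation(ref, est):
    """Pearson correlation as in (7b): centered inner product over the two norms."""
    ref_mean = statistics.fmean(ref)
    est_mean = statistics.fmean(est)
    num = sum((r - ref_mean) * (e - est_mean) for r, e in zip(ref, est))
    den = math.sqrt(sum((r - ref_mean) ** 2 for r in ref)) * math.sqrt(
        sum((e - est_mean) ** 2 for e in est)
    )
    return num / den


def median_iqr(values):
    """Median and interquartile range (third quartile minus first quartile)."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, q3 - q1


# Toy data (not from the paper): a reference dip and an estimate that tracks it.
reference = [98.0, 96.0, 93.0, 95.0, 98.0]
estimate = [97.0, 96.0, 94.0, 95.0, 97.0]
print(mae(reference, estimate), correlation(reference, estimate))
```

In this reading, each participant contributes one testing ρ, and `median_iqr` is applied to the list of those values within a group (for example, all PU sessions of the darker skin group).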
To answer question (ii), we first focus on the left two boxplots of Fig. 5(c). In the PD case, the median of the lighter skin group is larger than that of the darker skin group by 31%; however, the lighter skin group also has a larger IQR. This makes it difficult to draw a conclusion from the median–IQR analysis, hence we apply the t-test to complement our analysis. We note that the p-value is 0.037 < 0.05, showing that there is a significant difference between these two groups. In the PU case shown in the right half of Fig. 5(c), the medians of the lighter skin group and darker skin group are 0.65 and 0.62, with IQR being 0.17 and 0.07, respectively. Thus, in our current dataset, no substantial performance difference is observed between lighter and darker skin tones in the PU case.

D. Ablation Study of Proposed Pipeline

In Sections III-B and III-C, we have proposed three key designs in our algorithm, including a) the feature vector f containing pulsatile information from all RGB channels, b) the narrow ABP filter, and c) the passband of the ABP filter centered at the precise HR frequency tracked by AMTC. To study the importance of each component, we conducted controlled experiments by replacing or removing one factor at a time. The configurations of the methods corresponding to the experiments are listed in Table II, and the results are illustrated in Fig. 6. The height of each bar shows the average of the correlation coefficient ρ or the MAE of SpO2 estimation results from testing sessions (SVR, PU case) of all participants. Each pair of error bars corresponds to the 95% confidence interval that is calculated as ±1.96σ̂/√N, where σ̂ is the sample standard deviation and N is the sample size, i.e., the number of participants.

1) Advantage of The Proposed Multi-Channel RoR Over Two-Channel RoR: In this part, we compare our proposed algorithm with “Method (I): RoR with nABP (AMTC).” Method (I) follows the feature extraction method proposed in Section III-C, including the narrow adaptive bandpass filter (nABP) centered at the AMTC-tracked HR. The only exception is that, instead of using the feature vector f that contains multi-channel information, only the ratio of ratios between the red and blue channels is used, as in traditional RoR methods.

Fig. 6 reveals that our proposed method outperforms method (I) by a big margin. More specifically, our proposed method improves the correlation coefficient from 0.22 to 0.68 and the MAE from 1.67% to 1.26%. This improvement confirms that our proposed multi-channel feature set contributes to more accurate SpO2 monitoring.

2) Contribution of Narrowband ABP Filter for Feature Extraction: Here we compare with the following two methods to show the necessity of using a narrowband HR-guided bandpass filter:
• Method (II): Feature vector without ABP uses a nonadaptive, generic bandpass filter with the passband over [1, 2] Hz, covering the normal range of heart rate in sedentary mode, to replace the HR-based narrow ABP filter proposed in Section III-C for feature extraction.
• Method (III): Feature vector with wide ABP (AMTC) applies a wider ABP filter, with ±0.5 Hz bandwidth instead of the ±0.1 Hz used in our proposed method. This wider ABP filter’s center frequency is provided by the AMTC tracking algorithm of the HR described in Section III-B.

The bandpass filters used for methods (II) and (III) have the same bandwidth, 1 Hz. In terms of center frequency, method (II) uses a fixed setting at 1.5 Hz, while method (III) is adaptively centered at the estimated HR value. Compared to method (II), method (III) improves the testing MAE by 18%. Furthermore, compared to method (III), our proposed method with a narrow ABP filter improves the correlation coefficient ρ for testing by 13% and the MAE by 9%, suggesting the contribution of the narrow HR-based ABP filter strategy for AC computation.
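
To make the three passband choices concrete, the sketch below builds the band edges for method (II), method (III), and the proposed narrow filter, and applies them with an idealized frequency-domain mask. The mask is an assumption standing in for the actual ABP filter design of Section III-C, which is not reproduced here; `passband` and `bandpass_dft` are illustrative names, and the 30 Hz frame rate, the 1.2 Hz heart rate, and the test tones are made up.

```python
import cmath
import math


def passband(method, hr_hz=None):
    """Band edges in Hz: fixed [1, 2] for (II), HR +/- 0.5 for (III), HR +/- 0.1 for proposed."""
    if method == "II":
        return 1.0, 2.0
    half_width = {"III": 0.5, "proposed": 0.1}[method]
    return hr_hz - half_width, hr_hz + half_width


def bandpass_dft(x, fs, low, high):
    """Ideal bandpass (assumption): zero all DFT bins outside [low, high] Hz, then invert.

    A plain O(n^2) DFT is used so the sketch needs only the standard library.
    """
    n = len(x)
    spec = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    tol = 1e-9  # keep bins that sit exactly on a band edge
    for k in range(n):
        f = min(k, n - k) * fs / n  # bin frequency folded to [0, fs/2]
        if not (low - tol <= f <= high + tol):
            spec[k] = 0.0
    return [
        (sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


fs, n = 30.0, 300  # assumed 30 frames per second, 10 s of samples
hr = 1.2           # assumed heart rate in Hz (72 bpm)
pulse = [math.sin(2 * math.pi * hr * t / fs) for t in range(n)]
interference = [0.5 * math.sin(2 * math.pi * 1.8 * t / fs) for t in range(n)]
signal = [p + q for p, q in zip(pulse, interference)]

narrow = bandpass_dft(signal, fs, *passband("proposed", hr))
fixed = bandpass_dft(signal, fs, *passband("II"))
```

With these toy tones, the narrow band keeps only the 1.2 Hz pulse component, while the fixed [1, 2] Hz band passes the 1.8 Hz interference unchanged, which illustrates why a passband centered on an accurate HR estimate can yield a cleaner AC component.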

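The percentages quoted in the ablation study follow from the bar heights in Fig. 6. The snippet below is a worked check of that arithmetic together with the ±1.96σ̂/√N error-bar formula. The pairing of each bar value with a method index is inferred from the numbers cited in the text, the helper names are illustrative, and any per-participant sample passed to `ci95_half_width` would have to come from data not listed in this section.

```python
import math
import statistics

# Bar heights as read from Fig. 6 (testing sessions, SVR, PU case).
CORR = {"I": 0.22, "II": 0.33, "III": 0.60, "IV": 0.56, "V": 0.57, "Proposed": 0.68}
MAE = {"I": 1.67, "II": 1.69, "III": 1.39, "IV": 1.43, "V": 1.40, "Proposed": 1.26}


def gain(baseline, new):
    """Relative increase of a higher-is-better metric (correlation)."""
    return (new - baseline) / baseline


def reduction(baseline, new):
    """Relative decrease of a lower-is-better metric (MAE)."""
    return (baseline - new) / baseline


def ci95_half_width(samples):
    """Half-width of the 95% interval: 1.96 * sample standard deviation / sqrt(N)."""
    return 1.96 * statistics.stdev(samples) / math.sqrt(len(samples))


for name in ("I", "II", "III", "IV", "V"):
    corr_gain = round(100 * gain(CORR[name], CORR["Proposed"]))
    mae_drop = round(100 * reduction(MAE[name], MAE["Proposed"]))
    print(f"Proposed vs. method ({name}): correlation +{corr_gain}%, MAE -{mae_drop}%")
```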
3) Importance of Accurate HR Tracking on SpO2 Monitoring: We consider the following two methods to compare with our proposed method:
• Method (IV): Feature vector with narrow ABP (peak-finding) applies a narrow ABP filter of bandwidth ±0.1 Hz for extracting the feature vector f. The center frequency of the ABP filter is the HR estimated from the peak-finding algorithm described in Section III-B.
• Method (V): Feature vector with narrow ABP (weighted) is similar to method (IV), except that the frequency estimation algorithm is replaced by the weighted energy algorithm in Section III-B.

The average MAEs of the HR estimation for all participants by the peak-finding algorithm, the weighted frequency estimation algorithm, and the AMTC algorithm are 7.11 (±3.66) bpm, 6.42 (±3.02) bpm, and 4.14 (±1.72) bpm, respectively. Fig. 6 shows that methods (IV) and (V) perform similarly, with 0.56 vs. 0.57 for correlation ρ and 1.43% vs. 1.40% for MAE, respectively. Our proposed method guided by the AMTC-tracked HR outperforms methods (IV) and (V) by 21% and 19% in correlation, and by 12% and 10% in MAE, respectively. These results suggest that accurate HR estimation for the ABP filter design improves the quality of the AC magnitude by preserving the most cardiac-related signal from the RGB channels, which in turn helps with accurate SpO2 monitoring.

E. Leave-One-Out Experiments

As a proof of concept and considering the currently limited amount of available data, we have so far discussed the SpO2 estimation under the participant-specific (PS) scenario in Section IV, where the models are calibrated for each individual. This PS mode corresponds well to the trending “precision telehealth” that tailors the healthcare service to individuals. In this subsection, we consider a more practical scenario where the test participant’s data are never seen or only form a limited portion of the training data. In this scenario, we can develop a group-based model based on skin tone or other determinants of health, and for each subgroup, the model is “universal” and participant-independent. We will examine this group-based model through the following two modes of leave-one-out experiments:
• Leave-one-session-out (LOSessO): when testing on a given participant, we include his/her training session data together with other participants’ data for training.
• Leave-one-participant-out (LOPartO): when testing on a given participant, we only use other people’s data for training and leave out the data from this test participant.

We group the participants by skin type into lighter skin color (skin types II and III) and darker skin color (skin types IV and V) groups. We conduct LOSessO and LOPartO experiments on each subgroup and report the SVR generated testing results from all participants in Table III. The MAE and correlation coefficient ρ improve from LOPartO to LOSessO to PS for both PU and PD cases. This result suggests that the precision-telehealth-inspired PS mode is the most accurate approach to monitoring SpO2 for an individual. Based on the overall results shown in Table III, most participants demonstrate a consistent trend in the accuracy of SpO2 estimation from LOPartO to LOSessO to the PS case. The correlation ρ of participant #12 is less than −0.5 in both leave-one-out modes, suggesting that this participant may have an uncommon relation between the extracted features and SpO2 values compared to the others.

TABLE III: Testing results of leave-one-participant-out (LOPartO) and leave-one-session-out (LOSessO) experiments, measured in the sample mean and the sample standard deviation (in parentheses).

      LOPartO               LOSessO               PS
      MAE        ρ          MAE        ρ          MAE        ρ
PU    1.70%      0.53       1.59%      0.55       1.26%      0.68
      (±0.60%)   (±0.38)    (±0.58%)   (±0.36)    (±0.33%)   (±0.10)
PD    1.76%      0.48       1.70%      0.50       1.28%      0.65
      (±0.59%)   (±0.38)    (±0.59%)   (±0.39)    (±0.40%)   (±0.14)

V. DISCUSSION

A. Performance on Contact SpO2 Monitoring

In addition to contact-free SpO2 monitoring, we evaluate whether our proposed algorithm can be applied to a contact-based smartphone setup. To collect data, the left index finger covers the smartphone’s illuminating flashlight and the nearby built-in camera, and the camera captures a pulse video at the fingertip. Another smartphone is used to simultaneously record a top-view video of the back side of the right hand, whose index finger is placed in the oximeter for SpO2 reference data collection. One participant took part in this extended experiment, in which one training session with three breath-holding cycles was recorded, and three testing sessions were recorded 30 minutes after the training session.

TABLE IV: Comparison of the proposed algorithm in both contact and contact-free SpO2 estimation settings. The testing results are measured in the average MAE and correlation coefficient ρ.

                                   Training           Testing
Setting        Method              MAE      ρ         MAE      ρ
Contact        RoR [10] (LR)       1.60%    0.54      1.38%    0.64
Contact        RoR [10] (SVR)      1.14%    0.73      1.32%    0.60
Contact        RoR [11] (LR)       1.47%    0.62      1.39%    0.63
Contact        RoR [11] (SVR)      0.99%    0.83      1.27%    0.66
Contact        Proposed            0.91%    0.84      1.17%    0.81
Contact-free   RoR (2-channel)     1.61%    0.73      1.75%    0.36
Contact-free   Proposed            1.36%    0.62      1.29%    0.68

In Table IV, we compare the performance of our proposed algorithm in both the contact-based and contact-free SpO2 measurement settings. The conventional RoR models used in [10] and [11] were implemented as baseline models for contact-based SpO2 measurement. In [10], the mean and standard deviation of each window from the red and blue channels are calculated as the DC and AC components. A linear model was built to relate the ratio-of-ratios from the two color channels with SpO2. In [11], the median of the pulsatile peak-to-valley amplitude is regarded as the AC component. For the two RoR methods, we implemented both LR and SVR. For contact-free SpO2 measurement, we take the traditional two-color channel RoR method implemented in Section IV-D as the baseline to compare with the proposed method.
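
The two-channel baseline of [10], as summarized above, can be sketched in a few lines: the window mean serves as DC, the window standard deviation as AC, and a straight line maps the ratio of ratios to SpO2. This is an illustrative reading rather than the authors' implementation; the use of the population standard deviation, the function names, and all numbers are assumptions.

```python
import statistics


def ratio_of_ratios(red_window, blue_window):
    """(AC_red / DC_red) / (AC_blue / DC_blue) with AC = std and DC = mean of the window."""
    ac_red = statistics.pstdev(red_window)   # population std is an assumption
    dc_red = statistics.fmean(red_window)
    ac_blue = statistics.pstdev(blue_window)
    dc_blue = statistics.fmean(blue_window)
    return (ac_red / dc_red) / (ac_blue / dc_blue)


def fit_line(x, y):
    """Ordinary least squares for y ~ slope * x + intercept."""
    x_mean = statistics.fmean(x)
    y_mean = statistics.fmean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


# Made-up windows of mean pixel intensity per frame.
red = [100.0, 102.0, 100.0, 98.0]
blue = [50.0, 52.0, 50.0, 48.0]
ror = ratio_of_ratios(red, blue)

# Made-up calibration pairs (RoR, reference SpO2 in %).
slope, intercept = fit_line([0.4, 0.5, 0.6], [99.0, 97.0, 95.0])
spo2_hat = slope * ror + intercept
```

The proposed method differs from this baseline in that it feeds a multi-channel feature vector to the regressor instead of a single red-blue ratio, so the scalar `ror` here would be replaced by the feature vector f.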
Table IV reveals that our proposed algorithm outperforms other conventional RoR models in contact-based SpO2 monitoring. Even in the contact-free case, our algorithm presents a performance comparable to that of the contact-based cases, even though the SNR of the fingertip video is better than the SNR of a remote hand video.

B. Resilience Against Blurring

In this subsection, we explore the robustness of our algorithm to the blurring effect on hand images. In the current setup, the hands are placed on a stable table with a cellphone camera acquiring the skin color of both hands. Ideal laboratory conditions are often not satisfied in practical scenarios, and the hand images captured by the cellphone cameras may be blurred due to being out of focus. The point spread function is modeled as a 2D homogeneous Gaussian kernel. The finite support of the kernel is defined manually to generate perceptually different blur effects, and then the standard deviation σ is computed based on the given support. To test different blur effects, we experimented with two blur levels, σ = 1.1 (5 × 5 pixels) and σ = 2.6 (15 × 15 pixels). The blurring effects are demonstrated in Fig. 7.

Fig. 7: Illustration of blurring effects using different blur levels σ on hand videos. The wider the kernel is, the blurrier the videos are.

Table V presents the SVR generated results for PU cases with different σ and kernel sizes. We use the SVR, PU scenario as the showcase here because it achieves the best SpO2 prediction performance, as verified in Section IV-C. From the table, we find that our algorithm is robust to the Gaussian blurring effect. After the σ = 1.1 blurring, the testing ρ remains the same, and the testing MAE is 6.3% higher than in the no-blurring case. After the σ = 2.6 blurring, the testing ρ is 1.5% lower and the MAE is 4.0% higher than in the no-blurring case.

TABLE V: Simulation for Gaussian blurring effect on hand videos. SVR generated results for PU cases are listed for different σ and Gaussian kernel sizes. The results are quantified in terms of the sample mean and sample standard deviation (in parentheses).

                     Training               Testing
                     MAE        ρ           MAE        ρ
σ = 2.6 blur         1.41%      0.72        1.31%      0.67
(15 × 15 pixels)     (±0.50%)   (±0.11)     (±0.35%)   (±0.09)
σ = 1.1 blur         1.42%      0.70        1.34%      0.68
(5 × 5 pixels)       (±0.59%)   (±0.16)     (±0.41%)   (±0.10)
No blur              1.33%      0.76        1.26%      0.68
                     (±0.54%)   (±0.09)     (±0.33%)   (±0.10)

C. Limitations and Further Verification with Different Hypoxia Protocols

From the recordings of our data collection protocol for voluntary breath-holding, we observed that HR and SpO2 are correlated for many participants. That is, in one breath-holding cycle, when the participant starts to hold breath, his/her HR increases and SpO2 drops as the oxygen runs out. As he/she resumes normal breathing, his/her HR and SpO2 recover to within the normal range. Due to individuals’ different physical conditions, in some participants, the peak of the HR signal and the valley of the SpO2 signal happen within such a short time interval that HR and SpO2 are significantly negatively correlated. This observation is in line with the biological literature [38]. In the literature, breath-holding exercises were found to yield significant changes in the cardiovascular system. In the central circulation, they caused significant changes in heart rate, and in the peripheral circulation, they caused significant changes in arterial blood flow and oxygen saturation.

Based on the above observation that HR is correlated with SpO2 during breath-holding, we are curious whether our method also works for a different protocol where the HR remains relatively constant compared to SpO2. An intermittent hypoxia protocol used in the literature shows that by receiving hypoxic air (inspired fraction of oxygen between 12% and 15%) intermittently with normoxic air, the participant can have a much milder HR change than in breath-holding, while a significant decrease in SpO2 can be achieved during the hypoxia [39], [40]. The research restriction affecting human subject research in many U.S. institutions limited our ability to carry out the abovementioned hypoxia protocol. Once the restriction is eased, we will investigate the performance of our proposed algorithm when applied to other hypoxia protocols.

VI. CONCLUSION AND FUTURE WORK

This paper presents a contact-free method of measuring blood oxygen saturation from hand videos captured by smartphone cameras. Our proposed method is a synergistic combination of several key components, including the multi-channel ratio-of-ratios feature set, the narrowband filtering that is adaptively centered at the heart rate, and the accurately estimated heart rate. We have seen encouraging results of a mean absolute error of 1.26% with a commercial pulse oximeter as the reference, outperforming the conventional ratio-of-ratios method by 25%. We have also analyzed the impact of the side of the hand and the skin tone on the SpO2 estimation. We have found that, given our collected dataset, the palm side performs well regardless of the skin tone. Besides, we do not observe significant performance differences between lighter and darker skin tones in the palm-up case.

Future work includes verifying our methods with data collected under different hypoxia protocols, enlarging the dataset with more variations in skin color, and applying large-scale and interpretable neural networks with optophysiological insights.

REFERENCES

[1] M. C. Simon and B. Keith, “The role of oxygen availability in embryonic development and stem cell function,” Nature Reviews Molecular Cell Biology, vol. 9, no. 4, pp. 285–296, Apr. 2008.
[2] N. Shenoy, R. Luchtel, and P. Gulani, “Considerations for target oxygen saturation in COVID-19 patients: Are we under-shooting?” BMC Medicine, vol. 18, no. 1, pp. 1–6, Dec. 2020.
[3] J. Couzin-Frankel, “The mystery of the pandemic’s ‘happy hypoxia’,” Science, 2020. [Online]. Available: https://science.sciencemag.org/content/368/6490/455
[4] M. J. Tobin, F. Laghi, and A. Jubran, “Why COVID-19 silent hypoxemia is baffling to physicians,” American Journal of Respiratory and Critical Care Medicine, vol. 202, no. 3, pp. 356–360, Aug. 2020.
[5] J. Teo, “Early detection of silent hypoxia in COVID-19 pneumonia using smartphone pulse oximetry,” J. Medical Systems, vol. 44, no. 8, pp. 1–2, Aug. 2020.
[6] J. G. Webster, Design of pulse oximeters. CRC Press, Oct. 1997.
[7] J. W. Severinghaus, “Takuo Aoyagi: Discovery of pulse oximetry,” Anesthesia and Analgesia, vol. 105, no. 6, pp. S1–S4, Dec. 2007.
[8] A. Plüddemann, M. Thompson, C. Heneghan, and C. Price, “Pulse oximetry in primary care: Primary care diagnostic technology update,” British J. General Practice, vol. 61, no. 586, pp. 358–359, May 2011.
[9] “How to use the Blood Oxygen app on Apple Watch Series 6,” https://support.apple.com/en-us/HT211027, Accessed: 2021-05-17.
[10] C. G. Scully, J. Lee, J. Meyer, A. M. Gorbach, D. Granquist-Fraser, Y. Mendelson, and K. H. Chon, “Physiological parameter monitoring from optical recordings with a mobile phone,” IEEE Trans. Biomed. Eng., vol. 59, no. 2, pp. 303–306, Jul. 2011.
[11] Z. Lu, X. Chen, Z. Dong, Z. Zhao, and X. Zhang, “A prototype of reflection pulse oximeter designed for mobile healthcare,” IEEE J. Biomed. Health Inform., vol. 20, no. 5, pp. 1309–1320, Aug. 2015.
[12] X. Ding, D. Nassehi, and E. C. Larson, “Measuring oxygen saturation with smartphone cameras using convolutional neural networks,” IEEE J. Biomed. Health Inform., vol. 23, no. 6, pp. 2603–2610, Dec. 2018.
[13] N. Bui, A. Nguyen, P. Nguyen, H. Truong, A. Ashok, T. Dinh, R. Deterding, and T. Vu, “Smartphone-based SpO2 measurement by exploiting wavelengths separation and chromophore compensation,” ACM Trans. Sens. Netw., vol. 16, no. 1, pp. 1–30, Jan. 2020.
[14] İ. Tayfur and M. A. Afacan, “Reliability of smartphone measurements of vital parameters: A prospective study using a reference method,” The American J. Emergency Medicine, vol. 37, no. 8, pp. 1527–1530, Aug. 2019.
[15] L. Kong, Y. Zhao, L. Dong, Y. Jian, X. Jin, B. Li, Y. Feng, M. Liu, X. Liu, and H. Wu, “Non-contact detection of oxygen saturation based on visible light imaging device using ambient light,” Opt. Exp., vol. 21, no. 15, pp. 17464–17471, Jul. 2013.
[16] M. Van Gastel, S. Stuijk, and G. de Haan, “New principle for measuring arterial blood oxygenation, enabling motion-robust remote monitoring,” Scientific Reports, vol. 6, no. 1, pp. 1–16, Dec. 2016.
[17] A. R. Guazzi, M. Villarroel, J. Jorge, J. Daly, M. C. Frise, P. A. Robbins, and L. Tarassenko, “Non-contact measurement of oxygen saturation with an RGB camera,” Biomed. Opt. Express, vol. 6, no. 9, pp. 3320–3338, Sep. 2015.
[18] M. van Gastel, W. Verkruysse, and G. de Haan, “Data-driven calibration estimation for robust remote pulse-oximetry,” Applied Sciences, vol. 9, no. 18, p. 3857, Jan. 2019.
[19] D. Shao, C. Liu, F. Tsow, Y. Yang, Z. Du, R. Iriya, H. Yu, and N. Tao, “Noncontact monitoring of blood oxygen saturation using camera and dual-wavelength imaging system,” IEEE Trans. Biomed. Eng., vol. 63, no. 6, pp. 1091–1098, Sep. 2015.
[20] H.-Y. Tsai, K.-C. Huang, and J. A. Yeh, “No-contact oxygen saturation measuring technology for skin tissue and its application,” IEEE Instrum. Meas. Magazine, vol. 19, no. 5, pp. 57–64, Sep. 2016.
[21] C. J. Coté, E. A. Goldstein, W. H. Fuchsman, and D. C. Hoaglin, “The effect of nail polish on pulse oximetry,” Anesthesia and Analgesia, vol. 67, no. 7, pp. 683–686, Jul. 1988.
[22] G. H. Yönt, E. A. Korhan, and B. Dizer, “The effect of nail polish on pulse oximetry readings,” Intensive and Critical Care Nursing, vol. 30, no. 2, pp. 111–115, Apr. 2014.
[23] A. d. F. G. Rosa and R. C. Betini, “Noncontact SpO2 measurement using Eulerian video magnification,” IEEE Trans. Instrum. Meas., vol. 69, no. 5, pp. 2120–2130, May 2019.
[24] G. Casalino, G. Castellano, and G. Zaza, “A mHealth solution for contact-less self-monitoring of blood oxygen saturation,” in IEEE Symp. Comp. Comm., Jul. 2020, pp. 1–7.
[25] U. Bal, “Non-contact estimation of heart rate and oxygen saturation using ambient light,” Biomed. Opt. Exp., vol. 6, no. 1, pp. 86–97, Jan. 2015.
[26] L. Tarassenko, M. Villarroel, A. Guazzi, J. Jorge, D. Clifton, and C. Pugh, “Non-contact video-based vital sign monitoring using ambient light and auto-regressive models,” Physiol. Meas, vol. 35, no. 5, p. 807, Mar. 2014.
[27] Z. Sun, Q. He, Y. Li, W. Wang, and R. K. Wang, “Robust non-contact peripheral oxygenation saturation measurement using smartphone-enabled imaging photoplethysmography,” Biomed. Opt. Exp., vol. 12, no. 3, pp. 1746–1760, Mar. 2021.
[28] Y. Sun and N. Thakor, “Photoplethysmography revisited: From contact to noncontact, from point to imaging,” IEEE Trans. Biomed. Eng., vol. 63, no. 3, pp. 463–477, 2015.
[29] W. Wang, A. C. den Brinker, S. Stuijk, and G. de Haan, “Algorithmic principles of remote PPG,” IEEE Trans. Biomed. Eng., vol. 64, no. 7, pp. 1479–1491, Sep. 2016.
[30] R. Szeliski, Computer Vision: Algorithms and Applications. Springer Science & Business Media, 2010.
[31] A. Urooj and A. Borji, “Analysis of hand segmentation in the wild,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 4710–4719.
[32] D. Chai and K. N. Ngan, “Face segmentation using skin-color map in videophone applications,” IEEE Trans. Circuits and Systems for Video Technology, vol. 9, no. 4, pp. 551–564, Jun. 1999.
[33] N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Trans. Syst., Man, and Cybernet., vol. 9, no. 1, pp. 62–66, Jan. 1979.
[34] Q. Zhu, M. Chen, C.-W. Wong, and M. Wu, “Adaptive multi-trace carving for robust frequency tracking in forensic applications,” IEEE Trans. Inf. Forensics Security, vol. 16, pp. 1174–1189, May 2020.
[35] A. Hajj-Ahmad, R. Garg, and M. Wu, “Instantaneous frequency estimation and localization for ENF signals,” in Proc. 4th Annu. Summit and Conf. (APSIPA), Dec. 2012, pp. 1–10.
[36] C.-C. Chang and C.-J. Lin, “LIBSVM: A library for support vector machines,” ACM Trans. Intelligent Systems and Technology, vol. 2, no. 3, pp. 1–27, May 2011.
[37] S. Theodoridis, Machine Learning: A Bayesian and Optimization Perspective. Academic Press, 2015.
[38] A. Grunovas, E. Trinkunas, A. Buliuolis, E. Venskaityte, and J. Poderys, “Cardiovascular response to breath-holding explained by changes of the indices and their dynamic interactions,” Biological Systems: Open Access, vol. 5, p. 152, 2016.
[39] M. Faulhaber, H. Gatterer, T. Haider, T. Linser, N. Netzer, and M. Burtscher, “Heart rate and blood pressure responses during hypoxic cycles of a 3-week intermittent hypoxia breathing program in patients at risk for or with mild COPD,” Int. J. Chronic Obstructive Pulmonary Disease, vol. 10, p. 339, 2015.
[40] M. Koehle, W. Sheel, W. Milsom, and D. McKenzie, “The effect of two different intermittent hypoxia protocols on ventilatory responses to hypoxia and carbon dioxide at rest,” in Integration in Respiratory Control. Springer, 2008, pp. 218–223.
