Multi-occupancy Fall Detection using Non-Invasive Thermal Vision Sensor
Index Terms—Multi-occupancy Fall Detection, Thermal Vision Sensor, MoT-LoGNN, smart environments, Neural Networks
1530-437X (c) 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: Carleton University. Downloaded on November 06,2020 at 13:53:21 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JSEN.2020.3032728, IEEE Sensors Journal
focused on the single-occupancy scenario, because they think that in the multi-occupancy scenario, standing people can provide help for the fallen person. However, they may not be able to provide timely help for the fallen person. Alternatively, a machine that actively detects the occurrence of falls can immediately contact the nearest professional medical staff. Besides, when there is an accident, bystanders may hardly be able to save themselves, not to mention provide help.

In this paper, we introduce a novel end-to-end solution for the remote management of falls within a multi-occupancy living environment. A low-cost sensing solution is presented, which has been developed based on the use of low-resolution thermal sensors. This configuration of sensors enables capturing activity in an unobtrusive manner and integrating data into a scalable sensor platform where an innovative approach for thermal image processing is deployed. The classification of fall or non-fall is computed in real-time using image decomposition and classification with a neural network (NN), which is trained via minimizing the Localized Generalization Error with features extracted by Convolutional Neural Networks (CNN). The developed approach has been deployed within 2 smart lab environments in the UK and Spain and has been evaluated by means of collection and analysis of labeled data sets.

The remainder of this paper is organized as follows. Section II provides a brief review of sensors which have been used for fall detection and approaches which address fall detection problems using thermal cameras. Section III describes the thermal camera used in this work and the proposed fall detection method. Section IV discusses the experimental settings in the smart labs and experimental results. Finally, conclusions are drawn and an outlook for future work is presented in Section V.

II. RELATED WORK

A. Sensors for Fall Detection

A number of approaches have been implemented in an attempt to improve the process of detecting falls. From a sensing perspective, these have been centered around either exploiting wearable sensors or environmental sensing approaches [6].

From a wearable sensing perspective, efforts have been directed towards the processing of data gleaned through sensors such as accelerometers, gyroscopes, and barometers [7]. To further improve the detection accuracy, multi-sensor data can be utilized jointly as well [8]. More recently the sensing platforms within smart phones [9] or within smart shoes [10] have also been leveraged to detect falls. Although accuracy levels of detecting a fall have been reported to be in excess of 90% in some studies, these solutions have a major disadvantage: they must be worn to offer their functionality. To a certain extent, this requirement can be viewed as being both an inconvenience and intrusive for the user, and in some instances the device can be forgotten or, in the worst case, not used at all.

Approaches based on environmental sensors rely on the technology being deployed at mostly fixed locations. A wide variety of sensors have been proposed to describe the fall of a person, such as vibration detection sensors, video cameras, pressure sensors, and thermal sensors [11]. All of these approaches have their advantages and disadvantages. Acoustic sensors can be used to detect a noise which is atypical of a fall event. These are, however, compromised in noisy environments where background noise interferes with the underlying sound of the fall [12]. Vibration sensors can be tuned to detect a sudden impact which can be representative of a fall. They are, however, subject to false positives due to activities such as heavy walking in the environment. Video cameras provide what may be the only definitive solution to record what has happened in an environment [13]. Nevertheless, they suffer from the significant issue of perceived intrusiveness when they are deployed to monitor the daily activities of users in real homes. A potential alternative to video-based sensing is the use of thermal cameras. Due to the low resolution of thermal images, the intrusiveness issue is overcome whilst the heat of the human body still yields sufficient information to capture a fall event.

B. Fall Detection Methods based on Thermal Cameras

Many methods have been applied in an attempt to improve the performance of automated fall detection based on thermal cameras. W. K. Wong, et al. [14] utilized the width-height ratio of the rectangle bounding the human as the feature and set up artificial rules to detect falls. The x- and y-axis histograms were utilized as input features for an SVM (Support Vector Machine) model to detect falls of patients [15]. The FallSense method was proposed in [17], which adopts a fuzzy inference system based on acceleration, infrared, and ultrasonic sensors to detect falls. Experimental results show that FallSense achieves an overall 16% improvement over comparative methods on average. P. Mazurek, et al. [18] applied traditional machine learning classifiers (i.e. support vector machine, artificial neural network, and naïve Bayes classifier) for fall detection using kinematic features and mel-cepstrum-related features extracted from the thermal images. Experimental results show that the accuracy of the proposed method is more than 90% on two data sets.

A number of studies placed the thermal sensors on the ceiling as an alternative to a wall-mounted solution in an effort to provide a broader view and to reduce occlusions. In [18], the thermal pixels of occupants were identified through a certain temperature range, and then the thermal pixel count of the occupant was used to detect the fall. Focusing only on the number of pixels made this method ignore the shape and edge of a detected person, thus affecting the fall recognition accuracy. In [19], the authors separated the foreground from the background based on temperature values. Manually set features based on temperature difference and temporal information were proposed and evaluated by several classifiers to detect falls. Experimental evaluation demonstrated that the system achieved real-time operation and over 94% fall recognition rate at room temperatures up to 24 °C.
Cankun Zhong et al.: Multi-occupancy Fall Detection using Non-Invasive Thermal Vision Sensor 3
In the following sub-sections, the thermal vision sensor hardware is firstly introduced. Following this, four major components of the MoT-LoGNN: the MOD, the T-LoGNN, the fine-tuning mechanism, and the SSS are presented in Sections II-B, II-C-1, II-C-2, and II-D, respectively. Finally, the time complexity analysis of the MoT-LoGNN in the testing phase is presented in II-E.

A. Low Cost and Non-invasive Thermal Sensor

Among the range of thermal vision sensors [25], both high resolution [26] and low resolution devices are used in smart environments [27]. In the case of fall detection, a comparison of thermal sensor devices [28] has shown that non-invasive and low resolution thermal sensors offer better performance and reduced learning time.

Based on the encouraging results on fall detection from previous works, in this work we select the thermal sensor [20] Heimann HTPA 32x31, a suitable device with an operating temperature range of -20 to 85 °C and powered by a 3.3 Volt supply. The thermal sensor generates a 32×31 matrix, where each value defines a heat point of temperature. The data are collected in real-time by means of an Ethernet crossover cable which is connected to the local area network. The middleware [29] collects and recovers the data from the sensor in real time within a Web Service in JSON format.

As suggested in [20][28], the thermal sensor was affixed to the ceiling of the Smart Lab at Ulster University to provide a zenith view of the space to be monitored. It was deployed at a height of 2.5 meters in a removable plaster ceiling, where the Ethernet connection and power supply are kept hidden by the ceiling. It provides a viewable area of approximately 6 meters by 5.6 meters, which makes it possible to monitor multiple individuals at the same time. The field of view of the sensor is 86° by 83°. A picture of the sensor deployed in the ceiling is provided in Figure 2(a) together with the operating range in Figure 2(b) and an example of an actual recording in Figure 2(c).

Fig. 2. (a) The sensor deployed in the ceiling. (b) The operating range of the sensor. (c) An example of a thermal image presented in a web interface.

Frames are sampled from the sensor through an I2C interface at a rate of 6 Hz and processed by a listener which communicates via WiFi directly with endpoints on the SensorCentral platform [30]. Once captured by SensorCentral, image processing techniques are invoked.

In this study, the aim is to identify whether a fall has occurred in the thermal images rather than consider what has happened to each individual in the thermal images. Therefore, the fallen or not labels were applied to the thermal frame images during the data collection, not a bounding-box label for each individual of each thermal image.

B. Multi-occupancy Image Decomposer (MOD)

The MOD consists of three steps: 1) Image Binarization, 2) Contour Detection, and 3) Single-occupancy Thermal Sub-images Generation.

1) Image Binarization: The image binarization process aims to distinguish the human heat points from the floor heat points using two pixel values: 0 and 255, for the floor and the human shape respectively. The determination of the binarization threshold influences both the decomposition result and the detection accuracy of the T-LoGNN. In this study, the binarization threshold is set to 201, which is determined using a validation set. More details are presented in Section IV.

2) Contour Detection: The border-following algorithm in [31] is applied to find the contour of each person in the binarized thermal images. Owing to noise and blur in thermal images, very small contours are created, which are likely to be caused by high-temperature floor regions and are subsequently removed. For the 28×28 thermal images that are cropped from the center of the original 32×31 images to contain the most relevant visual information, small contours whose areas are less than 4 pixels are removed since these areas would be too small to represent a human at the intended sensor deployment height.

3) Single-occupancy Thermal Sub-images Generation: For each multi-occupancy thermal image, k single-occupancy sub-images are generated if k contours are found (k>1). If there is either zero or one contour found, the entire thermal image is treated as a thermal sub-image. For each contour, pixels located outside it are set to 0 while others are set to 255. In this way, a single-occupancy thermal sub-image is created and the detected person (i.e. contour) appears at the original location of the entire multi-occupancy thermal image. Besides, the fallen or not fallen class label of a thermal sub-image is inherited from its original multi-occupancy thermal image.

C. T-LoGNN

The T-LoGNN consists of a robust LG-RBFNN and thermal image features extracted by a CNN. The LG-RBFNN and the fine-tuning mechanism [32] for the T-LoGNN are introduced in II-B.1 and II-B.2, respectively.

The CNN is adopted in the T-LoGNN for feature extraction from both binarized thermal sub-images and single-occupancy images. One of the major contributions of this work is the use of the LG-RBFNN as the classifier with features extracted from the CNN. In contrast to the Softmax classifier, the LG-RBFNN is expected to yield higher generalization capability to future unseen samples since it is trained via minimizing the generalization error estimated by the L-GEM. Furthermore, the RBFNN is used here because it is a nonlinear classifier [33] with fast convergence.

A class balanced weighting trick is utilized in the training of the CNN when the model performance is hindered by the class imbalance problem. The class weight of each class is the reciprocal of its number of samples multiplied by a constant.
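As a concrete illustration, the three MOD steps above can be sketched in Python. This is a minimal sketch, not the authors' implementation: connected-component labelling stands in for the border-following contour algorithm of [31], and the function names are our own; the threshold of 201 and the 4-pixel minimum area follow the values reported in the text.

```python
import numpy as np
from collections import deque

def decompose(frame, threshold=201, min_area=4):
    """Sketch of the MOD: binarize, find connected regions, emit sub-images."""
    # Step 1: binarization -- floor -> 0, human heat points -> 255
    # (whether the threshold itself counts as "human" is an assumption).
    binary = np.where(frame >= threshold, 255, 0).astype(np.uint8)

    # Step 2: connected-component labelling (stand-in for border following).
    h, w = binary.shape
    visited = np.zeros((h, w), dtype=bool)
    components = []
    for i in range(h):
        for j in range(w):
            if binary[i, j] == 255 and not visited[i, j]:
                q, pixels = deque([(i, j)]), []
                visited[i, j] = True
                while q:  # BFS over 8-connected neighbours
                    y, x = q.popleft()
                    pixels.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and binary[ny, nx] == 255
                                    and not visited[ny, nx]):
                                visited[ny, nx] = True
                                q.append((ny, nx))
                if len(pixels) >= min_area:  # drop noise blobs (< 4 pixels)
                    components.append(pixels)

    # Step 3: zero or one contour -> whole image is the sub-image;
    # otherwise one sub-image per contour, person kept at its location.
    if len(components) <= 1:
        return [binary]
    sub_images = []
    for pixels in components:
        sub = np.zeros_like(binary)
        for y, x in pixels:
            sub[y, x] = 255
        sub_images.append(sub)
    return sub_images
```

Each returned sub-image keeps the detected person at its original position in the frame, as described in step 3.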
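The class-balanced weighting used when training the CNN, where each class weight is the reciprocal of its sample count multiplied by a constant, can be sketched as follows. The helper name and the constant's value are assumptions; the paper does not specify the constant.

```python
import numpy as np

def class_weights(labels, constant=1.0):
    """Class weight = constant / (number of samples in that class)."""
    classes, counts = np.unique(labels, return_counts=True)
    return {int(c): constant / int(n) for c, n in zip(classes, counts)}

# e.g. 8 non-fallen (0) samples vs 2 fallen (1) samples:
weights = class_weights([0] * 8 + [1] * 2, constant=10.0)
```

The rarer fallen class receives the larger weight, counteracting the class imbalance described in Section II-C.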
To prevent overfitting, the most robust T-LoGNN among n iterations is selected as the final T-LoGNN to be used in the MoT-LoGNN. A validation robustness measure (VRM) is proposed to measure the robustness of the T-LoGNN as follows:

VRM = (1/V) Σ_{v=1}^{V} I(y_v ≠ F(X_v)) + (1/W) Σ_{w=1}^{W} SSM(X_w, f)    (5)

where I(·) denotes the indicator function, and F(·), f(·), V, W, X_v, y_v, X_w denote the MoT-LoGNN, the T-LoGNN, the number of samples in the multi-occupancy validation set, the number of single-occupancy sub-images generated from the multi-occupancy validation set, a multi-occupancy image from the validation set, the label of X_v, and a single-occupancy sub-image generated from the multi-occupancy validation set, respectively. The best T-LoGNN, yielding the minimum VRM, needs to yield both low classification error on the validation set and low sensitivity to small perturbations to thermal sub-images. This follows the idea of the minimization of the L-GEM in [33]-[34] and aligns well with the multi-occupancy falling classification problem in this work.

D. Sensitivity-based Samples Selector (SSS)

An SSS approach is proposed to select useful samples for fine-tuning the T-LoGNN. The main difficulty is to identify misclassified single-occupancy thermal sub-images since the class label is assigned to the entire multi-occupancy thermal image only and we do not have the real label of the sub-images.

The three following possible cases are considered by the SSS to select useful sub-images for fine-tuning the T-LoGNN, which are presented in Fig. 3. Both Cases 1 and 2 identify misclassified single-occupancy scenarios whilst Case 3 provides additional samples for people who have not fallen. In Case 2, multiple sub-images are classified as non-fallen, but at least one of the sub-images should be classified as fallen. In Case 3, multiple sub-images are classified as fallen, but the classification of multiple fallen sub-images from the same multi-occupancy thermal image has a high chance of being mistaken, because multiple falls occurring at the same time is quite rare. However, we do not have a bounding-box label for each participant in the dataset, therefore it is not possible to know which sub-image is misclassified.

1) For non-fallen multi-occupancy thermal images (i.e. all individuals in a multi-occupancy thermal image are standing, or no one is in the thermal image) misclassified as fallen, the sub-images which have been classified as fallen are incorrect and are subsequently selected;

2) For fallen thermal images misclassified as non-fallen, the sub-image yielding the largest SSM value among the sub-images decomposed from the same multi-occupancy thermal image is selected;

3) For correctly classified fallen multi-occupancy thermal images with more than one sub-image being classified as fallen, the thermal sub-image yielding the largest stochastic sensitivity measure value among the sub-images classified as fallen and decomposed from the same thermal image is selected.

In the L-GEM framework, a sample yielding a large SSM value is informative to Neural Network training [36] because it has a higher chance of being misclassified by the Neural Network. Therefore, for Cases 2 and 3, the sub-image yielding the largest SSM (SSM(x_f, g) in Equation (4)) is selected. Selected samples are labeled with the opposite labels with respect to their corresponding classification results from the T-LoGNN. Then, these samples are added to the ensuing round of fine-tuning the T-LoGNN.
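The three SSS selection rules can be sketched as follows. This is our reading of the text rather than the authors' code: the label strings and argument names are assumptions, as is the convention that a whole image is predicted fallen whenever any of its sub-images is.

```python
def sss_select(true_label, sub_preds, sub_ssm):
    """Select (sub-image index, corrected label) pairs for fine-tuning.

    true_label : 'fallen' / 'non-fallen' label of the whole multi-occupancy image
    sub_preds  : per-sub-image T-LoGNN predictions
    sub_ssm    : per-sub-image stochastic sensitivity (SSM) values
    """
    fallen_idx = [i for i, p in enumerate(sub_preds) if p == 'fallen']
    if true_label == 'non-fallen' and fallen_idx:
        # Case 1: false alarm -- every sub-image predicted fallen is wrong.
        return [(i, 'non-fallen') for i in fallen_idx]
    if true_label == 'fallen' and not fallen_idx:
        # Case 2: missed fall -- take the sub-image with the largest SSM.
        i = max(range(len(sub_preds)), key=lambda i: sub_ssm[i])
        return [(i, 'fallen')]
    if true_label == 'fallen' and len(fallen_idx) > 1:
        # Case 3: several simultaneous falls are unlikely -- flip the most
        # sensitive of the fallen-predicted sub-images.
        i = max(fallen_idx, key=lambda i: sub_ssm[i])
        return [(i, 'non-fallen')]
    return []
```

Selected samples carry the opposite label to their T-LoGNN prediction, matching the relabeling rule described above.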
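Under the reading that the first term of Equation (5) is the validation error rate of the MoT-LoGNN and the second term is the mean stochastic sensitivity of the sub-images, the VRM can be sketched as below; the function and argument names are assumptions.

```python
import numpy as np

def vrm(y_true, y_pred, ssm_values):
    """Validation Robustness Measure: mean validation error plus mean SSM."""
    error = np.mean(np.asarray(y_true) != np.asarray(y_pred))  # first term of Eq. (5)
    sensitivity = np.mean(ssm_values)                          # second term of Eq. (5)
    return float(error + sensitivity)
```

The T-LoGNN with the smallest VRM across the n training iterations would then be kept as the final model.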
For both multi-occupancy and single-occupancy datasets, ten independent runs were performed for all experiments, with 80% of thermal images being randomly selected for training and the remaining 20% used for testing.

3) Evaluation metrics: The performance of different models is validated on the single-occupancy dataset and the multi-occupancy dataset. The mean and the standard deviation values of different metrics for different models are calculated, and the symbol "*" denotes a statistically significant difference between the MoT-LoGNN and the corresponding method by Student's t-test with 95% confidence. We regard fallen as the positive class and non-fallen as the negative class. The Accuracy measures the overall performance of different models. The False Rejection Rate (FRR) and the False Acceptance Rate (FAR) measure the missing report rate of fallen and the false alarm rate, respectively. These performance metrics are defined as follows:

Accuracy = (TP + TN) / (TP + FN + TN + FP)    (6)

FRR = FN / (FN + TP)    (7)

FAR = FP / (FP + TN)    (8)

F1 = 2 × Precision × Recall / (Precision + Recall)    (9)

Gmean = √(Recall × Specificity)    (10)

where TP, TN, FP, and FN denote the True Positive, the True Negative, the False Positive, and the False Negative, respectively. Precision = TP/(TP+FP), Recall = TP/(TP+FN), and Specificity = TN/(TN+FP).

B. Comparison Test with Other Fall Detection Studies using Thermal Sensor

In this section, the proposed MoT-LoGNN is compared with two artificial feature extraction based methods (the RectFall [14], using the width-height ratio of the rectangle bounding the human as the feature, and the HistFall [15], using the histograms of the x-axis and y-axis as the feature) and two deep learning based methods (the manually designed CNN [21] and the popular Inception-v3 [22] model, which was pre-trained on the ImageNet database).

1) Performance on single-occupancy dataset: It can be seen from Table I that in the single-occupancy scenario, compared with other thermal sensor based fall detection methods, the proposed MoT-LoGNN achieves the best performance in terms of all metrics except the FRR. As to the FRR, the MoT-LoGNN reaches the second lowest, while the RectFall with the lowest FRR has a high FAR of 40.62%, which means that the system easily raises false alarms, resulting in a waste of resources. Therefore, the MoT-LoGNN is clearly superior to the other comparison methods in the single-occupancy scenario.

TABLE I
MEAN (± STDEV) OF PERFORMANCE METRICS OF DIFFERENT METHODS ON THE SINGLE-OCCUPANCY DATA

Model | Accuracy (%) | FAR (%) | FRR (%) | F1 (%) | Gmean (%)
RectFall [14] | 86.43 (±3.69) | 40.62 (±5.63) | 0.00 (±0.00) | 74.35 (±6.53) | 76.97 (±3.81)
HistFall [15] | 92.95 (±0.22) | 6.27 (±1.46) | 7.29 (±0.35) | 89.45 (±1.76) | 93.22 (±0.56)
CNN [21] | 91.80 (±2.95) | 13.15 (±4.31) | 4.37 (±1.98) | 87.53 (±3.09) | 90.15 (±3.44)
Inception-v3 [22] | 95.63 (±1.47) | 5.45 (±2.23) | 2.5 (±1.88) | … (±2.15) | … (±2.74)
MoT-LoGNN | 97.31 (±1.33) | 0.91 (±0.48) | 3.50 (±2.00) | 95.63 (±2.75) | 97.93 (±0.79)

2) Performance on multi-occupancy dataset: Since the RectFall and the HistFall methods were proposed for single-occupancy scenarios, they cannot be directly applied to fall detection problems in multi-occupancy scenarios. In this experiment, the MOD proposed in this paper is combined with these two methods, so that they can be applied to multi-occupancy scenarios. As shown in Table II, in multi-occupancy scenarios, the performance of the deep learning based fall detection methods is clearly better than that of the RectFall and the HistFall. Moreover, the FARs of the RectFall and HistFall methods are significantly higher than those of the other methods. This is mainly due to the characteristics of the MOD framework: as long as one thermal sub-image is classified as a fall, the whole image is considered to contain a fall. Therefore, the RectFall and the HistFall, which already have high FAR in the single-occupancy scenario, have their disadvantages further amplified under the multi-occupancy scenario and the MOD. Overall, the MoT-LoGNN has the highest average classification accuracy in multi-occupancy scenarios, the lowest average FAR, and the second lowest average FRR, while its FAR is dozens of times lower than that of the RectFall, which has the lowest FRR.

TABLE II
MEAN (± STDEV) OF PERFORMANCE METRICS OF DIFFERENT METHODS ON THE MULTI-OCCUPANCY DATA

Model | Accuracy (%) | FAR (%) | FRR (%) | F1 (%) | Gmean (%)
RectFall [14] | 64.46 (±1.36) | 79.38 (±1.47) | 0.12 (±0.09) | 34.13 (±2.04) | 45.36 (±1.62)
HistFall [15] | 75.72 (±2.32) | 48.60 (±7.00) | 4.56 (±1.91) | 65.10 (±5.64) | 69.84 (±4.50)
CNN [21] | 85.46 (±0.77) | 13.15 (±4.31) | 15.42 (±4.53) | 84.19 (±0.99) | 85.60 (±4.48)
Inception-v3 [22] | 92.12 (±0.69) | 8.35 (±3.91) | 7.33 (±3.65) | … (±1.12) | … (±3.88)
MoT-LoGNN | 95.89 (±0.50) | 4.12 (±1.32) | 3.89 (±1.07) | 95.42 (±0.55) | 95.92 (±0.68)
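The metrics in Equations (6)-(10) can be computed directly from the confusion-matrix counts; a minimal sketch (the function name is our own):

```python
def fall_metrics(tp, tn, fp, fn):
    """Equations (6)-(10); fallen is the positive class."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    frr = fn / (fn + tp)                      # missed-fall rate
    far = fp / (fp + tn)                      # false-alarm rate
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    gmean = (recall * specificity) ** 0.5
    return {"accuracy": accuracy, "frr": frr, "far": far,
            "f1": f1, "gmean": gmean}
```

FRR penalizes missed falls while FAR penalizes false alarms, which is why the two are reported separately alongside the overall Accuracy.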
TABLE III
SENSITIVITY ANALYSIS OF MOT-LOGNN TRAINING WITH DIFFERENT AMOUNT OF TRAINING DATA ON SINGLE-OCCUPANCY SCENARIO

Amount of Training Data | Accuracy (%) | FAR (%) | FRR (%) | F1 (%) | Gmean (%)
10% | 91.72 (±2.44) | 11.60 (±6.20) | 6.00 (±2.41) | 87.62 (±2.43) | 91.09 (±3.27)
50% | 96.40 (±1.51) | 1.06 (±1.10) | 4.63 (±1.92) | 94.26 (±3.21) | 97.14 (±1.02)
100% | 97.31 (±1.33) | 0.91 (±0.48) | 3.50 (±2.00) | 95.63 (±2.75) | 97.93 (±0.79)

TABLE IV
SENSITIVITY ANALYSIS OF MOT-LOGNN TRAINING WITH DIFFERENT AMOUNT OF TRAINING DATA ON MULTI-OCCUPANCY SCENARIO

Amount of Training Data | Accuracy (%) | FAR (%) | FRR (%) | F1 (%) | Gmean (%)
10% | 83.83 (±7.90) | 24.04 (±20.53) | 8.71 (±3.75) | 79.49 (±12.01) | 82.16 (±10.43)
50% | 95.44 (±0.29) | 4.26 (±1.19) | 4.68 (±1.51) | 94.92 (±3.21) | 95.52 (±1.96)
100% | 95.89 (±0.50) | 4.12 (±1.32) | 3.89 (±1.07) | 95.42 (±0.55) | 95.92 (±0.68)

(b) Multi-occupancy

Fig. 5. The data distribution of thermal images. The green points denote the non-fallen class and the blue points denote the fallen class.
2) Non-linearity of the Thermal Image Data: In this study, the CNN is utilized to extract features of thermal images and the LG-RBFNN serves as a non-linear classifier. To show the non-linearity of thermal image data, the data is transformed into a new feature space by a CNN and then reduced to a two-dimensional space by using the t-Distributed Stochastic Neighbor Embedding (tSNE), as shown in Fig. 5. Fig. 5 shows that the thermal image data in the feature space constructed by the CNN is not linearly separable in either scenario.

Due to this non-linearity characteristic of the thermal sensor data, a non-linear classifier is preferred. Tables V and VI summarize the performance of different classifiers under the single-occupancy and multi-occupancy scenarios, respectively, where [cnn+softmax], [cnn+dt], and [cnn+svm] denote the models using the softmax, decision tree, and support vector machine as a classifier with the features extracted by the CNN, respectively. In both cases, the T-LoGNN yields the best performance, which shows the superiority of the LG-RBFNN for dealing with the thermal sensor fall detection problem using the CNN features.
TABLE VI
COMPARISON OF DIFFERENT CLASSIFIERS USING CNN FEATURES ON MULTI-OCCUPANCY DATA

Model | Accuracy (%) | FAR (%) | FRR (%) | F1 (%) | Gmean (%)
[cnn+softmax] | 88.77 (±1.49) | 13.31 (±2.40) | 9.42 (±1.65) | 87.37 (±1.40) | 88.60 (±1.48)
[cnn+dt] | 90.16 (±0.83) | 11.85 (±2.29) | 8.09 (±0.98) | 88.90 (±0.68) | 90.00 (±0.90)
[cnn+svm] | 89.98 (±1.42) | 11.59 (±1.39) | 8.59 (±2.82) | 88.74 (±1.65) | 89.88 (±1.33)
T-LoGNN | 90.81 (±0.43) | 11.55 (±2.21) | 7.07 (±2.83) | 89.57 (±0.62) | 90.63 (±0.26)

TABLE VII
MEAN (± STDEV) OF PERFORMANCE METRICS OF DIFFERENT METHODS ON THE SINGLE-OCCUPANCY DATA

Model | Accuracy (%) | FAR (%) | FRR (%) | F1 (%) | Gmean (%)
T-LoGNN | 95.10 (±1.88) | 9.92 (±5.22) | 1.91 (±0.41) | 92.58 (±1.88) | 93.95 (±2.57)
MOD-LoGNN | 97.09 (±0.28) | 1.70 (±2.00) | 3.66 (±2.08) | 95.24 (±2.81) | 97.50 (±0.63)
MoT-LoGNN | 97.31 (±1.33) | 0.91 (±0.48) | 3.50 (±2.00) | 95.63 (±2.75) | 97.93 (±0.79)

TABLE VIII
MEAN (± STDEV) OF PERFORMANCE METRICS OF DIFFERENT METHODS ON THE MULTI-OCCUPANCY DATA

Model | Accuracy (%) | FAR (%) | FRR (%) | F1 (%) | Gmean (%)
T-LoGNN | 90.81 (±0.43) | 11.55 (±2.21) | 7.07 (±2.83) | 89.57 (±0.62) | 90.63 (±0.26)
MOD-LoGNN | 92.57 (±1.10) | 12.40 (±1.43) | 0.85 (±0.62) | 91.08 (±1.06) | 91.57 (±1.06)
MoT-LoGNN | 95.89 (±0.50) | 4.12 (±1.32) | 3.89 (±1.07) | 95.42 (±0.55) | 95.92 (±0.68)

3) Determination of the Binarization Threshold: As one of the components of the MoT-LoGNN, the MOD is responsible for decomposing the multi-occupancy thermal images into single-occupancy sub-images, where the determination of the binarization threshold directly influences the effect of the MOD. Intuitively, the optimal binarization threshold should separate the person and the background clearly, which is helpful for the
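The validation-set selection of the binarization threshold (the paper reports 201) might be sketched as a simple sweep. The `evaluate` callback, which is assumed to return validation accuracy for a candidate threshold, and the candidate range are both assumptions.

```python
def pick_threshold(frames, labels, evaluate, candidates=range(180, 221)):
    """Choose the binarization threshold that maximizes validation accuracy.

    evaluate(threshold, frames, labels) -> accuracy of the downstream
    detector on the validation set when binarizing at `threshold`.
    """
    return max(candidates, key=lambda t: evaluate(t, frames, labels))
```

In practice `evaluate` would run the MOD and T-LoGNN end-to-end on the validation images for each candidate threshold.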
5) Mis-classified Cases by MoT-LoGNN: The examples shown in Fig. 7 are representative samples misclassified by the proposed MoT-LoGNN; the subtitle of each sample shows its actual label. These samples are extremely difficult to distinguish. Sample (a) is misclassified mainly because the fallen person is not completely within the monitoring area. In sample (b), the fallen person with a curled body is easily misclassified as a standing person. Sample (c) shows a case where the MoT-LoGNN misclassifies a standing person as a fallen person when the background temperature is sufficiently high. When two standing persons are close enough, the MoT-LoGNN may wrongly judge them as a fallen person, as shown in sample (d).
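Several of these failure cases stem from a single ambiguous frame. Since per-frame decisions are vulnerable to noise and blur, one simple smoothing scheme is a majority vote over a short sliding window of frame-level labels (an illustrative sketch of our own, not the authors' method; the window length of 5 is an assumption):

```python
from collections import Counter, deque

def smoothed_decisions(frame_labels, window=5):
    """Replace each frame's label with the majority vote over the last `window` frames."""
    buf = deque(maxlen=window)
    out = []
    for label in frame_labels:
        buf.append(label)
        out.append(Counter(buf).most_common(1)[0][0])
    return out

# A spurious 'fall' at frame 3 (e.g. a blurred frame) is voted away,
# while a sustained fall starting at frame 6 is still detected.
stream = ["ok", "ok", "ok", "fall", "ok", "ok", "fall", "fall", "fall", "fall"]
print(smoothed_decisions(stream))
```

The trade-off is a detection delay of a few frames in exchange for suppressing isolated misclassifications; for a fall detector the window should stay short so that genuine falls are still reported promptly.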
(a) fallen (b) fallen
(c) non-fallen (d) non-fallen
Fig. 7. Misclassified samples of the MoT-LoGNN.
V. CONCLUSIONS AND FUTURE WORKS
In this study, we have proposed a robust approach that distinguishes fall and non-fall shapes in low-resolution images captured by a low-cost, non-invasive thermal vision sensor. The device provides a zenithal point of view from the ceiling where it is located to detect falls in both single- and multi-occupancy instances. We propose the use of a Convolutional Neural Network to extract features from the thermal images, and then a Radial Basis Function Neural Network trained via minimization of the Localized Generalization Error Bound as the classifier (T-LoGNN), to reduce the effect of thermal images with strong noise and blurred areas on the classification results. In addition, we propose a multi-occupancy fall detection method, MoT-LoGNN, in response to the decrease in classification accuracy caused by the increasing complexity of thermal images in multi-occupancy scenarios. Experimental results demonstrated that the MoT-LoGNN achieved the best performance in both single- and multi-occupancy scenarios.
However, the proposed MoT-LoGNN still has some limitations. The binarization threshold is set for a given data set in this work; therefore, its fall detection performance may decline if the MoT-LoGNN is applied directly to other environments. Meanwhile, the proposed MoT-LoGNN conducts fall detection using only a single thermal image, where the decision may be wrong owing to noise and blur in that image; a decision made using a sequence of successive thermal images rather than a single one would therefore be preferable.
In the future, we will verify the proposed MoT-LoGNN in more real application scenarios. Research on image decomposition algorithms adapted to different environments is a meaningful direction for optimizing the MoT-LoGNN. In addition, owing to the uncertainty of thermal images acquired by this low-cost device, temporal information will be added to improve the robustness of the recognition system.
REFERENCES
[1] World Health Organization, "World report on ageing and health," World Health Organization, 2015.
[2] Z. He, et al., "Prevalence of multiple chronic conditions among older adults in Florida and the United States: comparative analysis of the OneFlorida data trust and national inpatient sample," Journal of Medical Internet Research, vol. 20, no. 4, p. e137, 2018.
[3] D. Marikyan, et al., "A systematic review of the smart home literature: A user perspective," Technological Forecasting and Social Change, vol. 138, pp. 139–154, 2019.
[4] M. L. Shuwandy, et al., "Sensor-based mHealth authentication for real-time remote healthcare monitoring system: A multilayer systematic review," Journal of Medical Systems, vol. 43, no. 2, p. 33, 2019.
[5] E. A. Kramarow, et al., "Deaths from Unintentional Injury Among Adults Aged 65 and Over, United States, 2000–2013," US Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Health Statistics, 2015.
[6] P. Vallabh and R. Malekian, "Fall detection monitoring systems: A comprehensive review," Journal of Ambient Intelligence and Humanized Computing, vol. 9, no. 6, pp. 1809–1833, 2018.
[7] S. S. Kendri, et al., "Development and monitoring of a fall detection system through wearable sensor belt," Development, vol. 6, no. 12, 2019.
[8] L. Wang, et al., "Pre-impact fall detection based on multi-source CNN ensemble," IEEE Sensors Journal, vol. 20, no. 10, pp. 5442–5451, 2020.
[9] J.-S. Lee and H.-H. Tseng, "Development of an enhanced threshold-based fall detection system using smartphones with built-in accelerometers," IEEE Sensors Journal, vol. 19, no. 18, pp. 8293–8302, 2019.
[10] L. Montanini, et al., "A footwear-based methodology for fall detection," IEEE Sensors Journal, vol. 18, no. 3, pp. 1233–1242, 2018.
[11] T. Xu, et al., "New advances and challenges of fall detection systems: A survey," Applied Sciences, vol. 8, no. 3, p. 418, 2018.
[12] S. M. Adnan, et al., "Fall detection through acoustic local ternary patterns," Applied Acoustics, vol. 140, pp. 296–300, 2018.
[13] E. Cippitelli, et al., "Radar and RGB-depth sensors for fall detection: A review," IEEE Sensors Journal, vol. 17, no. 12, pp. 3585–3604, 2017.
[14] W. K. Wong, et al., "Home alone faint detection surveillance system using thermal camera," in 2010 Second International Conference on Computer Research and Development. IEEE, 2010, pp. 747–751.
[15] K.-S. Song, et al., "Histogram based fall prediction of patients using a thermal imagery camera," in 2017 14th International Conference on Ubiquitous Robots and Ambient Intelligence (URAI). IEEE, 2017, pp. 161–164.
[16] S. Moulik and S. Majumdar, "FallSense: An automatic fall detection and alarm generation system in IoT-enabled environment," IEEE Sensors Journal, vol. 19, no. 19, pp. 8452–8459, 2018.
[17] P. Mazurek, et al., "Use of kinematic and mel-cepstrum-related features for fall detection based on data from infrared depth sensors," Biomedical Signal Processing and Control, vol. 40, pp. 102–110, 2018.