A Tree-Based Approach for Visible and Thermal Sensor Fusion
https://doi.org/10.1007/s00138-024-01546-y
RESEARCH
Received: 20 November 2023 / Revised: 14 March 2024 / Accepted: 17 April 2024 / Published online: 3 May 2024
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024
Abstract
Research on autonomous vehicles has recently peaked. One of the most studied aspects is the performance degradation of sensors in harsh weather conditions such as rain, snow, fog, and hail. This work addresses that degradation by fusing multiple sensor modalities inside the neural network used for detection. The proposed fusion method removes the pre-processing fusion stage and produces detection boxes directly from several input images, reducing the computation cost by performing detection and fusion simultaneously. By separating the network in its initial layers, the network can easily be modified for new sensors. Intra-network fusion improves robustness to missing inputs, applies to all compatible types of inputs, and reduces the peak computing cost through a valley-fill algorithm. Our experiments demonstrate that adopting a parallel multimodal network to fuse thermal images inside the network improves object detection in difficult weather conditions such as harsh winters by up to 5% mAP while reducing dataset bias. It does so with around 50% fewer parameters than late-fusion approaches, which duplicate the whole network instead of only the first section of the feature extractor.
Keywords Pedestrian detection · Deep learning · Machine learning · Road vehicle identification · Winter · ADAS
However, thermal imaging has multiple drawbacks. Working with temperature instead of light rays makes it unable to see through Lambertian surfaces [25]. Another disadvantage of thermal imaging is that, being a costly sensor, data available for training is scarce. The imbalance between available visible and thermal images must be counteracted with a process allowing training with missing data without losing the benefit of the large-scale datasets of the visible spectrum.

Fusion is a widely researched topic in the literature, and several fusion methods have been proposed. Three main approaches exist: High-level Fusion (HLF), Mid-level Fusion (MLF), and Low-level Fusion (LLF). High-level fusion is fusion post-detection [26]. HLF cannot use cross-channel information and generally results in lower classification performance in harsh conditions [26]. Low-level fusion is the fusion of the raw input of the sensors. While it allows all information to improve the task's performance, it also requires precise extrinsic calibration and is the least tolerant to the distance between sensors [26]. Lastly, MLF is an abstraction between low-level and high-level fusion. It fuses some relevant subset of the information from a statistical point of view [27]. MLF requires extensive training to extract those relevant features. Current approaches to MLF are independent of the detection phase and tend to fuse raw input into features [28]. Current MLF approaches need to be revised for level 4 and 5 autonomy. However, those MLF methods bring many advantages, such as fidelity to spatial details, low computational complexity, and robustness to imprecise calibration. Following the MLF approach is therefore the best option for the current work. The proposed network applies fusion inside the network to achieve lower processing cost and high performance.

Fusion at the input or in the network, as in LLF and MLF methods, requires sensors to be calibrated [26]. Two types of calibration are needed. The first type is spatial calibration [29], which requires knowing the distance and rotation between each sensor. The second type, temporal calibration, is synchronizing the inputs in a compatible way [30]. The primary approach to temporal synchronization is dynamic programming to minimize the time between the sensors' captures, hence reducing latency from the fusion. However, that approach cannot be performed in real-time without added latency, and the primary technique for a real-time process is simply waiting for all inputs to be obtained. These methods require that all information be present, making the synchronizing approach vulnerable to missing sensor readings [31]. During real-time execution, sensors do not adhere to a perfect frequency. Factors such as bandwidth or simply imprecise sensor clocks cause variations in the frequency of the sensor capture. Thermal sensors are known to have a lower frequency than visible sensors, and a well-designed fusion algorithm can handle that difference in sensor order in tandem with the variation in sensor frequency.

This research uses a novel time synchronization technique that is both low latency and resilient to sensor failures. The proposed fusion network uses an approach based on MLF that avoids the intermediary fusion stage to improve the results. It combines the fusion and detection networks into a low-cost network adapted to edge devices.

A summary of the contributions of this article is presented below.

1. A network-level fusion strategy for multimodal image data is proposed to improve the overall performance of computer vision-based detection methods for self-driving vehicles in challenging environmental conditions.
2. A novel synchronization method is introduced that provides resilience to sensor failures, low latency, and real-time synchronization.
3. End-to-end learning from data fusion to object detection is adopted for various deep-learning models (fusion & detection) for better performance and reduced operational complexity.
4. The proposed fusion strategy accommodates imbalanced datasets with minimal bias and therefore handles the data scarcity problem associated with specialized imaging modalities.
5. A fully automated network splitting optimizer is proposed to realize network-level fusion.

The rest of this paper is organized as follows. Sect. 2 presents related works, Sect. 3 discusses the proposed method of modularity instead of manual fusion, and Sect. 4 offers the proof of concept for the multimodal input. Finally, Sect. 5 concludes this paper.
2 Related works

2.1 Fusion

2.1.1 CNN based

Image fusion has multiple families of approaches. The first family is the pan-sharpening method [8]. This method allows the fusion of images of different resolutions and spectrums, like satellite imaging. One of the best-performing pan-sharpening methods is the convolutional auto-encoder, which improves the quality of low-resolution images to be fused with the high-quality ones after processing [32, 33]. Component Substitution (CS) [34–37] is also a popular fusion algorithm that converts the original inputs into another set of features combining the originals [8]. Most current approaches convert RGB + Thermal inputs into a transformed RGB image containing the thermal information inside the three channels. Many of those approaches utilize deep-learning-based networks to fuse the image using an encoder-decoder architecture [8]. Modern techniques such as auto-encoders work on the encoder-decoder architecture and have been applied to CS fusion [32, 38]. However, the main problem of those independent pre-network fusions is the additional cost of the fusion network, which is independent of the detection. Current state-of-the-art approaches include but are not limited to DenseFuse [39], belonging to the family of CNN deep learning.

2.1.2 GAN and auto-encoder based

Two other families of deep learning exist: generative adversarial networks (GAN) and auto-encoders (AE) [38]. GAN-based fusion [40–42] generates a new image from the feature vector representing it, while auto-encoders perform component substitution of an image to reproduce an output of the same dimension. Auto-encoders are mainly used to denoise the input and improve its quality in a whole, semi-supervised way, as their output should be their input. To apply auto-encoders to images, they must be convolutional; as such, auto-encoders are encoder-decoder CNNs in which the output size is the same as the input [38]. Auto-encoders cannot produce a new image and can only regenerate the input or a variation of it when denoising [38]. In [38], the authors introduce variational auto-encoders, generating hidden features using a Gaussian distribution following the equivalent-sized layer in the encoder network. The generative aspect of that method allows for generating higher quality fused images than the other deep-learning-based methods. These SOTA auto-encoder methods constitute one of the many basic architectures the proposed method uses, and they are introduced as backbones.

2.2 Detectors

Fusion alone does not provide object detection capabilities; it only generates a combined image from inputs. Object detection must be performed on the fusion output using dedicated object detectors. Many state-of-the-art networks have been implemented in multiple frameworks such as Pytorch [43] and offer plug-and-play pre-trained networks that work in ideal weather. When using that detection in other weather situations or for detecting other objects, fine-tuning the network is encouraged, as many patterns can be reused from the existing training [24].

Classic networks like the Faster-RCNN [44] were based on a typical shared process. Those networks hypothesized bounding boxes, resampled the content of those boxes, and applied a classifier on the resulting sub-image. This pipeline has been the de facto way for all networks. Still, it has the inconvenience of being computationally intensive, restricting its use to high-end devices, and those methods are also too slow for real-time applications such as self-driving [45].

The SSD [45] method introduced the first deep network-based object detector that didn't resample pixels or features for classification. That method works by having a convolutional-based classifier at a few steps through the network to allow different aspect ratios and scales of objects. Removing the independent classification network has achieved real-time detection with a slight drop in mAP to 25.1%.

3 The proposed model

This section describes the proposed method of creating branches and how to size them appropriately. It also explains how to modify existing networks without creating a new one and how to select the fusion point. Next, it presents the required input preparation and the implementation of the proposed method. Finally, it provides the necessary syncing process and the training protocol. The proposed method aims to improve the detection performance of road elements, including pedestrians, vehicles, traffic signs, lights, motorcyclists, and cyclists, particularly under challenging conditions like winter, nighttime, or rain, while reducing the processing cost and enhancing the robustness for autonomous vehicles.

3.1 Proposed branches approach

Inspired by SSD [45] and YOLO [46–49], which combine region proposal and object detection, the proposed fusion method combines the fusion network and the detection network into an end-to-end fusion detector. The need for an intermediary representation or transformation is removed by having intra-network fusion. Consequently, the need for manual labeling of the fusion is eliminated by removing the intermediary model. However, this implies that the fusion cannot be visualized. A fusion layer must be included in the network to enable the network to learn data fusion. This paper proposes splitting the network into branches that treat part of the information independently. These branches merge at a level decided at the modeling step. They induce modality-specific feature extraction in the first layers of the network while keeping the abstraction in deep layers. Each of those branches adds computational cost to the network.

Using the branching approach lowers the processing cost compared to HLF fusion networks. The cost of branching a network is the following:

C_{FO}(k) = \sum_{i=1}^{k} \sum_{c=1}^{q} FO_i^c + \sum_{i=k+1}^{K} FO_i    (1)

where FO_i^c is the number of operations for layer i for channel c, C_{FO}(·) is the cost in floating-point operations, FO_i is the number of operations for layer i, k is the layer before which we merge, K is the total count of layers, and q is the number of channels.

The cost of evaluation of a convolutional layer FO_i is defined as:

FO_i = d_i \times w_i \times h_i \times c_{x_i} \times c_{y_i} \times d_{i-1}    (2)

where w_i and h_i are the sizes of the feature map at layer i, c_{x_i} and c_{y_i} are the sizes of the convolutional filter, and d_i is the number of features.

Effectively, branches can be split with different sizes, which, when sizing is adequately done, will reduce the bias towards data in high quantity by providing neurons proportional to the desired importance. Proper sizing of the branches can further reduce the cost of processing by eliminating the need for extra neuron calculation.

A simplified equation can be used if each branch has the same number of features, which is defined by:

C_{FO}(k) = \sum_{i=1}^{k} (n \times FO_i) + \sum_{i=k+1}^{K} FO_i    (3)

where n is the number of channels to merge.
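To make the cost bookkeeping above concrete, the short Python sketch below evaluates the per-layer cost of Eq. (2) and the simplified branched cost of Eq. (3) for a toy backbone with two branches. The layer dimensions and the two-branch setup are illustrative assumptions, not the configuration used in this work.

```python
# Minimal sketch of the branching-cost bookkeeping in Eqs. (2)-(3).
# The layer list below is a made-up 6-layer backbone, not the paper's.

def conv_flops(d_out, w, h, cx, cy, d_in):
    """Eq. (2): operations of one convolutional layer."""
    return d_out * w * h * cx * cy * d_in

def branched_cost(layer_flops, k, n_branches):
    """Eq. (3): layers 1..k are duplicated once per branch (same width),
    layers k+1..K form the shared trunk and are computed once."""
    branch_part = n_branches * sum(layer_flops[:k])
    trunk_part = sum(layer_flops[k:])
    return branch_part + trunk_part

# Hypothetical per-layer FLOPs for two branches (e.g. visible + thermal).
flops = [conv_flops(64, 300, 300, 3, 3, 3),
         conv_flops(128, 150, 150, 3, 3, 64),
         conv_flops(256, 75, 75, 3, 3, 128),
         conv_flops(512, 38, 38, 3, 3, 256),
         conv_flops(512, 19, 19, 3, 3, 512),
         conv_flops(512, 10, 10, 3, 3, 512)]

for k in range(1, len(flops) + 1):
    print(f"k={k}: {branched_cost(flops, k, n_branches=2):.3e} FLOPs")
```

Running the loop shows how the total cost grows as the merge point k moves deeper into the network, which is the trade-off discussed in Sect. 3.4.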
3.2 Proper branches sizing

Neural networks tend to split the weight across all the inputs almost evenly, as shown in Table 1, which reports the weights applied to each input channel. The red channel averages 33.13% of the weights of the pre-trained networks, while the green and blue channels account for 41.79% and 25.06%. A slight bias is shown towards the green channel, which is more determinant than blue in an image. However, that value is still in the range that statistically follows the tendency to separate the weight equally across inputs. Using the distribution of weights mentioned above, adding a modality such as thermal to a color image would place around 25% of the weights on that modality. However, due to local maximums of the weights in the network, there is a possibility that this fourth channel might tend to have a deficient proportion. Additional information would not follow this distribution using a pre-trained network that has already attained its maximum. This misintegration of the additional data would not affect the result enough to increase global accuracy. Multiple solutions exist to this misintegration of information. As each branch is independent of the others, it is possible to partially train the network or reuse branches.

Table 1 Statistics on the weight percentage of each channel in pre-trained networks from Pytorch [43]

Model                        Red    Green  Blue
fasterrcnn_mnet_v3_lg_fpn    28.83  56.11  15.05
fasterrcnn_resnet50_fpn      34.04  38.21  27.74
ssd300_vgg16 [45]            29.70  36.58  33.70
ssdlite320_mnet_v3_large     33.95  46.43  19.61
resnet50 [50]                34.04  38.21  27.74
alexnet [51]                 35.56  35.73  28.70
densenet121 [52]             34.14  40.36  25.49
efficientnet_b0 [53]         30.74  45.20  24.04
inception_v3 [54]            39.56  31.09  29.34
mnasnet0_5 [55]              31.16  46.49  22.34
regnet_x_16gf                32.91  42.81  24.27
shufflenet_v2_x0_5           33.09  45.22  21.67
vgg11 [56]                   34.59  37.33  28.07
Average                      33.13  41.79  25.06
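Per-channel shares like those of Table 1 can be approximated with a few lines of PyTorch: the sketch below measures, for one pretrained torchvision model, how the absolute weight mass of the first convolution is distributed over the red, green, and blue input channels. Using the sum of absolute first-layer weights as the statistic is an assumption; the exact statistic behind Table 1 may differ.

```python
# Rough per-channel weight share of a pretrained first convolution.
import torch
from torchvision.models import resnet50

model = resnet50(weights="IMAGENET1K_V1")
w = model.conv1.weight.detach().abs()          # shape: [64, 3, 7, 7]
per_channel = w.sum(dim=(0, 2, 3))             # one value per input channel
share = 100.0 * per_channel / per_channel.sum()
for name, s in zip(("red", "green", "blue"), share.tolist()):
    print(f"{name}: {s:.2f}%")
```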
By adjusting the size of branches, the desired proportion of neurons and filters associated with a specific modality can be enforced. The branch size is crucial: it forces the network to consider the inputs, no matter the ratio of that data in the information. For similar accuracy, the network size can be adapted to a lower calculation point than the vast network. The output of those branches must be of a compatible spatial size to be concatenated, but there is no restriction on the size of the branch. The same spatial resolution is needed to concatenate the branches' output, but they can have any number of features. To alleviate the additional number of floating-point operations caused by the branches, proper sizing must be provided. Higher accuracy on smaller-sized datasets has been proven to be achieved by smaller NNs, as they need fewer images to be trained [57]. According to the desired problem, dataset size, and dataset quality, the proper size for each branch needs to be defined. The proposed branch size is the inverse proportion of the image count for each branch. To remove the bias toward data, the proportion of the dataset multiplied by the ratio of neurons in the output of the branch should be similar across branches. This allows reducing the branches' size, and thereby the cost. However, it should be noted that depending on the data size, the branch size can be adjusted, and in the worst case (scarce dataset), overfitting may occur in the corresponding data branch only.
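A minimal sketch of this sizing rule follows, assuming branch widths inversely proportional to each modality's image count so that dataset share times neuron share stays roughly constant; the image counts and the 64-feature budget are hypothetical values, not the ones used for the proposed network.

```python
# Give each modality a branch width inversely proportional to its image count.

def branch_widths(image_counts, total_features):
    inv = {m: 1.0 / n for m, n in image_counts.items()}
    norm = sum(inv.values())
    return {m: max(1, round(total_features * v / norm)) for m, v in inv.items()}

counts = {"visible": 120_000, "thermal": 5_322}   # e.g. COCO-scale vs FLIR ADAS-scale
widths = branch_widths(counts, total_features=64)
print(widths)   # the scarce thermal modality receives most of the 64 features

# Check the balancing criterion: (dataset share) x (neuron share) is similar.
total = sum(counts.values())
for m in counts:
    print(m, round((counts[m] / total) * (widths[m] / 64), 4))
```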
3.3 Modifications to existing networks

To apply the current approach, networks must receive minimal changes. A state-of-the-art backbone network, such as Alexnet [51] or VGG-16 [56], is first chosen as the base network. The first modification to be applied to the network is duplicating the first k layers, as seen in Fig. 1. For applying the weights, duplication doesn't alter the size, making weights compatible except for the fusion and first layers. The first layer will have an input size of one feature per branch. The fusion layer at k has n times the features, where n is the number of branches. For that layer to receive all the required inputs, the output of all branches is concatenated. However, the convolutional size and the number of filters will be changed to specialize the network's branches and prevent reusing existing kernels. Each branch being independent, they can be processed on different processing units or spread out using a valley-fill algorithm to be run during downtimes between captures.

The choice of k impacts the detection performance and processing cost. If k is chosen to be close to the beginning of the network, it acts as an LLF. The main problem with such approaches is the lack of modularity. In contrast, a k value next to the network's end acts as an HLF, significantly increasing the processing cost. That approach's drawback is that the whole network must be computed even on channels with no information, such as a green channel, when detecting a stop sign. It is crucial to take into consideration both calculation cost and meaningful information. Neural networks work by abstracting more details at each level until detection. For example, an array of colorful pixels in the input can later become the abstraction of a square. This abstraction is the best point to split the network, as each channel will abstract similar information for the same object, even though input distributions will vary.

The abstraction point of a NN cannot be readily determined. Each network architecture and each dataset will imply a different level of abstraction for the convolutional layers. Another drawback of not knowing the abstraction layer is that channel-specific patterns may be abstracted after the fusion layer, reducing efficiency for all channels if the fusion is achieved too soon in the network. Merging at the end of the network or having incorrectly sized features in the branches will result in higher computation and require a more considerable dataset extent.
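The modification of Sect. 3.3 can be sketched in PyTorch as a branch-and-trunk module: the first k layers are duplicated per modality, the branch outputs are concatenated at the fusion layer, and the remaining layers are shared. The layer counts and widths below are illustrative assumptions, not the backbone actually used in this work.

```python
# Illustrative branch-and-trunk backbone with intra-network concat fusion.
import torch
import torch.nn as nn

class BranchedBackbone(nn.Module):
    def __init__(self, branch_channels=(1, 1), branch_width=32, trunk_width=64):
        super().__init__()
        # one independent branch (layers 1..k) per input modality
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(c, branch_width, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(branch_width, branch_width, 3, padding=1), nn.ReLU(inplace=True),
            )
            for c in branch_channels
        ])
        # fusion layer at k: its input width is n_branches * branch_width
        fused = branch_width * len(branch_channels)
        self.trunk = nn.Sequential(
            nn.Conv2d(fused, trunk_width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(trunk_width, trunk_width, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, inputs):
        # inputs: one tensor per modality, all at the same spatial resolution
        feats = [branch(x) for branch, x in zip(self.branches, inputs)]
        return self.trunk(torch.cat(feats, dim=1))   # sync + concat point

visible = torch.randn(1, 1, 128, 128)   # e.g. one visible channel
thermal = torch.randn(1, 1, 128, 128)   # thermal channel
print(BranchedBackbone()((visible, thermal)).shape)   # torch.Size([1, 64, 128, 128])
```

A detection head such as SSD can then consume the trunk's feature maps, since only the spatial resolution of the branch outputs must match for the concatenation.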
3.4 Evaluating split layer

To propose a value for k, the network must first be trained with k = K, meaning the fusion is placed at the end. Once the data from that training has been acquired, there are two options. In the first option, a layer visualizer is used to display the weights of each layer to a neural network expert. That expert identifies the layer number with the most similar weights and uses this value as k.

The second, automatic option approximates the expert decision based on entropy. Four steps are needed to extract the information from the previous training. In the first step, the computer evaluates three statistics for each channel of all layers. The first statistic is the entropy, calculated with:

H(X) = -\frac{\sum_{n=1}^{N} \mathrm{logsoftmax}(X)_n \times e^{\mathrm{logsoftmax}(X)_n}}{N}    (4)

where N is the size of the sample, X the sampled value, and n an iterator over the samples. The second statistic is the mean, calculated with:

\bar{X} = \frac{\sum_{n=1}^{N} X_n}{N}    (5)

And finally, the variance, calculated with:

\mathrm{Var}(X) = E\left[(X - \bar{X})^2\right]    (6)

where E is the symbol for expected value. The difference between those values for each layer is then computed:

\sum_{i=2}^{C} S_i - S_{i-1}    (7)

where S_i is one of the aforementioned statistics. Then, the three statistics are joined into a single one with a weighted sum such as Minkowski:

\mathrm{MINKOWSKI} = \sqrt[p]{\omega_x \times |X_2 - X_1|^p + \omega_y \times |Y_2 - Y_1|^p + \omega_z \times |Z_2 - Z_1|^p}    (8)
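A hedged sketch of this automatic option is given below: per-layer entropy, mean, and variance are computed from the k = K training weights following Eqs. (4)–(6), the layer-to-layer differences of Eq. (7) are combined with the Minkowski sum of Eq. (8), and k is proposed where consecutive layers are most similar. The selection criterion (argmin), the weights ω, and the order p are assumptions for illustration.

```python
# Sketch of an entropy-driven proposal for the split layer k.
import torch
import torch.nn.functional as F

def layer_stats(weight):
    x = weight.flatten()
    logp = F.log_softmax(x, dim=0)
    entropy = -(logp * logp.exp()).sum() / x.numel()     # Eq. (4)
    return torch.stack((entropy, x.mean(), x.var()))      # (H, mean, var)

def propose_k(conv_weights, w=(1.0, 1.0, 1.0), p=2.0):
    stats = torch.stack([layer_stats(wt) for wt in conv_weights])  # [L, 3]
    diffs = (stats[1:] - stats[:-1]).abs()                          # Eq. (7)
    w = torch.tensor(w)
    minkowski = (w * diffs.pow(p)).sum(dim=1).pow(1.0 / p)          # Eq. (8)
    # assumption: merge where successive layers look most alike
    return int(torch.argmin(minkowski)) + 1

weights = [torch.randn(16, 3, 3, 3), torch.randn(32, 16, 3, 3),
           torch.randn(64, 32, 3, 3), torch.randn(64, 64, 3, 3)]
print(propose_k(weights))
```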
objects exhibit negligible temporal displacement. The tolerance in positional change should be constrained by the convolutional window size. Therefore, the parameters for fine-tuning and inference should be nearly identical.

3.6 Proposed implementation

forward layers, making them adapted for network fusion. The encoder-based network consists of five layers with a kernel of size three, followed by five convolutions with padding of two. The Single Shot Detector (SSD) [45] assumes the detection part of the network, as it can work with any backbone.

3.7 Proposed syncing method

Inherently merging layers in the network allows parallel or partial calculation of the network, making separate processing of sensors with different frequencies possible and uniformizing the calculation burden. The uniformization is done according to a valley-fill algorithm. Valley-fill algorithms distribute the processing over a period where the calculation cost would have been lower, averaging it; they aim to spread out and reduce computation peaks [59, 60]. Calculating layers 1 to k for each input with inputs at varying frequencies is possible. Henceforth, a syncing technique must be used [61].

As shown in Fig. 2, a regular syncing method that acts before the network introduces a lot of latency and processes all data simultaneously, thus requiring better computing equipment to keep both the latency and frequency at real-time levels.

Several possibilities are studied to reduce the processing cost, mainly three methods, each with three variants. Table 2 shows the specifics of each technique. Temporal calibration of sensors requires calibration methods that are robust to missing data and have minimal latency.

Table 2 Different syncing methods

Method          Variant      Cost    Latency    Missing data
Wait (A)        Parallel     High    Low        Failure
                Queue        Low     High       Failure
                First come   –       –          Failure
Fixed Freq (B)  Parallel     High    Moderate   Unaffected
                Queue        Low     High       Unaffected
                First come   Low^a   Moderate   Unaffected
Wait +          Parallel     High    Moderate   Latency
Max delay (C)   Queue        Low     High       Latency
                First come   Low^a   Low        Latency
^a May be moderate

Figure 2 illustrates the proposed methods with a visual representation. The purple circle signifies the input from the FLIR camera. The blue circle represents the input from the Mynteye P camera. The green line denotes the latency between the incoming data and the processing, represented by the yellow triangle. The crossed-out circles indicate data that has been disregarded, as more recent data on the same channel was received before processing. Using the most recent picture helps reduce the time difference between capture and processing. Figure 2b shows a theoretical event where the FLIR camera stopped working.

The first method is to wait for all data to be available before the processing starts. As in timeline A of Fig. 2a, during an optimal case the latency is short, but this first method fails if a sensor failure happens as in Fig. 2b:

• Its first variant is to process all data simultaneously once everything arrives.
• The second variant is to process each network branch one at a time but still wait for all the info. This method reduces the peak cost at the expense of latency.
• The third variant isn't available for this method, as we cannot process the information as it comes and still wait for all information to be available.

The second method aims to resolve the sensor failure issue using a fixed frequency for data processing, as depicted in the B timeline of Fig. 2a. With a fixed processing interval, missing a sensor will not impact the system since the period between data processing remains constant:
• The first variant of this method aims to moderate the latency, as seen in Fig. 2a. Still, the fixed frequency might create false positives for sensor failures and induce lags if it is not adequately timed with the sensors. It also has a high processing cost.
• Through the second variant, all the data is processed one at a time once the frequency has been reached, creating a high latency, but the cost stays low.
• The third variant processes incoming information immediately, and only the shared part of the network is processed at a fixed frequency. However, this first-come, first-processed approach may lead to redundant processing for a sensor. If a sensor's frequency allows it to arrive twice before the fixed-frequency trigger, two processing events will occur, even if only the most recent output is used.

The current paper proposes a third method to address the moderate latency and double-processing issues while handling sensor failures. This third synchronization method offers the advantages of both previous ways. Instead of using a fixed frequency, this method waits for the information like the first one. However, a max delay starts counting once the first sensor is read. The information is processed if this delay is reached before retrieving data from all the sensors. In Fig. 2a, this method is represented as the C timeline, in which the latency lines are at their maximum duration:

• The first variant processes all information after the trigger is activated, resulting in parallel processing and high cost. However, latency can be moderate in the worst case and low in the best case, since the delay is a fixed frequency if a sensor is missing.
• The second variant queues each branch once the trigger is activated. Like the other methods, the cost becomes low and the latency high.
• For the third variant, each branch is processed when data are available, reducing the peak cost, and the early exit induced when all sensors have data available reduces the latency to a minimum. The negative aspect of this method is that a missing sensor causes latency.

All syncing methods and their variants are presented in Table 2. It is important to note that, in contrast to the previous methods, each branch is processed separately. Except for the first variant, in which all data is processed after the firing of the syncing method, only the trunk would be processed after the firing of the syncing procedure. The method should be attached to a learning mechanism that turns off a missing sensor after a few iterations to reduce that lag and automatically re-enables it if it sends new data, fixing the potential latency of the last option. This step is present in the sync + concat step of the network, as seen in Fig. 1.
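The max-delay behaviour (method C, first-come variant) can be sketched as a small state machine: branches run as frames arrive, the trunk fires when every enabled sensor has reported or the maximum delay has elapsed, and a sensor that repeatedly misses the deadline is disabled until it produces data again. The frame representation, the delay value, and the miss threshold below are illustrative assumptions.

```python
# Toy max-delay synchronizer with sensor-failure handling.
import time

MAX_DELAY = 0.05     # seconds, counted from the first frame of a cycle (assumed)
MISS_LIMIT = 3       # consecutive misses before a sensor is disabled (assumed)

class MaxDelaySync:
    def __init__(self, sensors):
        self.enabled = {s: True for s in sensors}
        self.misses = {s: 0 for s in sensors}
        self.latest = {}             # most recent branch output per sensor
        self.first_arrival = None

    def on_frame(self, sensor, branch_output):
        # a newer frame on the same channel replaces the older, unprocessed one
        self.latest[sensor] = branch_output
        self.misses[sensor] = 0
        self.enabled[sensor] = True  # a failed sensor is re-enabled by new data
        if self.first_arrival is None:
            self.first_arrival = time.monotonic()
        return self._maybe_fire()

    def tick(self):
        # call periodically so the max delay can expire even with no new frames
        return self._maybe_fire()

    def _maybe_fire(self):
        if self.first_arrival is None:
            return None
        waiting = [s for s, on in self.enabled.items()
                   if on and s not in self.latest]
        expired = time.monotonic() - self.first_arrival >= MAX_DELAY
        if waiting and not expired:
            return None              # keep waiting for the remaining sensors
        for s in waiting:            # sensors that missed the deadline
            self.misses[s] += 1
            if self.misses[s] >= MISS_LIMIT:
                self.enabled[s] = False   # stop waiting for a failed sensor
        fused = dict(self.latest)    # the trunk runs on whatever is available
        self.latest.clear()
        self.first_arrival = None
        return fused

sync = MaxDelaySync(["visible", "thermal"])
sync.on_frame("visible", "features_v")          # thermal still pending -> None
print(sync.on_frame("thermal", "features_t"))   # both present -> trunk inputs
```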
3.8 Data preparation and collection

Due to the absence of public or accessible datasets for self-driving vehicles that contain both thermal and visible spectrum images, normal driving datasets that include both spectrums were used. The public FLIR ADAS dataset, which consists of 5322 pairs of visible and thermal images that are synchronized and labeled, was used for night, fog and rain conditions. As no public datasets were available for winter conditions, our own data were collected. A Mynteye S1030 camera and a FLIR ADK were mounted on a Kia Soul EV 2017 and driven around the UQTR campus and its vicinity for several hours. Data were recorded under various weather conditions in winter, and then one image every 5 min from the recording was extracted, resulting in 256 winter-based pairs. Optimal offline synchronization was applied to obtain the best matching pair of thermal and visible images. An example of the dataset image is shown in Fig. 3, where (a) is the visible spectrum, (b) is the original uncalibrated thermal image, (c) is the loosely calibrated thermal image, and (d) and (e) are the thermal images overlayed with 40% opacity on the visible image. Figure 3b demonstrates the importance of thermal imaging, as a pedestrian who is obscured by the building shadow in the visible spectrum is only visible in the thermal image.

Figure 4 shows six other pairs of thermal and visible images that illustrate various scenarios relevant to self-driving vehicles. In these images, road objects that are more than 15 m away are aligned correctly, while off-road objects and tall elements, such as lampposts, are displaced.
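One plausible reading of the offline synchronization step is nearest-timestamp matching: each thermal frame is paired with the closest visible frame and pairs above a tolerance are discarded. The exact procedure used for the dataset is not detailed here, so the following sketch is only an assumption-laden illustration; the tolerance value is arbitrary.

```python
# Pair thermal frames with the nearest visible frame by timestamp.
import bisect

def pair_frames(visible_ts, thermal_ts, tolerance=0.05):
    if not visible_ts:
        return []
    visible_ts = sorted(visible_ts)
    pairs = []
    for t in sorted(thermal_ts):
        i = bisect.bisect_left(visible_ts, t)
        candidates = visible_ts[max(0, i - 1):i + 1]
        best = min(candidates, key=lambda v: abs(v - t))
        if abs(best - t) <= tolerance:
            pairs.append((best, t))
    return pairs

print(pair_frames([0.00, 0.033, 0.066, 0.100], [0.01, 0.07, 0.25]))
# -> [(0.0, 0.01), (0.066, 0.07)]
```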
Fig. 4 Image/thermal pairs for the 256 image self-collected fusion dataset
3.9 Training

A special training procedure was required by this novel architecture. The original weights from pretraining on datasets such as MS-COCO were used for the modified state-of-the-art networks, and then a three-step training on fusion was applied. In the first step, the network was trained on the fusion dataset, which consisted of synchronized images from all

This method also has a high average latency compared to the first one. Finally, the proposed syncing process provided low latency and high fusion frequency during an optimal stage; it didn't waste any processing on failed runs nor ignore lots of readings, while only slowing down for missing sensors. The proposed method is a more robust algorithm for processing than the current widely used methods. The increased robustness demonstrated by this approach is particularly advantageous for autonomous vehicles. It ensures better reliability, especially since sensors are often exposed to harsh environmental conditions, like snowstorms, where the risk of sensor failure is significantly higher.

In Fig. 6, the red, green, and blue curves represent the processing of the color image. The purple curve represents the processing of the thermal image. The red vertical line represents the capture of the RGB image, and the purple vertical line represents the capture of the thermal image. The processing cost could be lowered from 8 GFOP to 4 GFOP, as visible in Fig. 6, using the selected syncing algorithm and a valley-fill algorithm. This figure shows the valley-fill algorithm in its ideal form to present the best outcome; in practice, the processing cost will be slightly higher depending on which implementation is chosen [59, 60]. The latency won't be affected in normal conditions as the sensors are desynchronized. In Fig. 6b, by using the wasted latency between the capture of the visible and the thermal image, no additional latency was introduced, and the cost was reduced by half. The trunk (layers k + 1 to K) is unaffected by both methods. By reducing the peak computational demand, the use of energy-efficient, low-power edge AI computers becomes feasible, which can reduce the vehicle's energy consumption.
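The peak-cost reduction attributed to the valley-fill scheduling can be illustrated with a toy load model: without valley-fill, both branches and the trunk fall into the same processing slot; with it, each branch occupies the otherwise idle gap after its own capture and only the trunk remains for the fusion instant. The GFOP numbers are made-up and only mirror the 8-to-4 trend of Fig. 6.

```python
# Toy peak-load comparison with and without valley-fill scheduling.
BRANCH = {"thermal": 2.0, "visible": 2.0}   # GFOP for layers 1..k of each branch
TRUNK = 4.0                                  # GFOP for the shared layers k+1..K

# everything waits for the last frame -> one slot carries all the work
no_valley_fill = [sum(BRANCH.values()) + TRUNK]

# each branch runs right after its own capture; the trunk runs at fusion time
valley_fill = [BRANCH["thermal"], BRANCH["visible"], TRUNK]

print(max(no_valley_fill), "GFOP peak without valley-fill")   # 8.0
print(max(valley_fill), "GFOP peak with valley-fill")          # 4.0
```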
A comparison of the proposed end-to-end fusion-to-detection network with different calibration qualities is presented in Fig. 7. The network is trained on 1000 images from the FLIR ADAS dataset and evaluated on another 1000 images. To simulate improper geometric calibration due to large sensor distances, reverse homographic transformations are applied on the FLIR ADAS dataset. The results show that imprecise calibration achieves a similar level of mAP as perfect calibration, while allowing for more flexibility in sensor placement.
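The calibration-degradation protocol described above can be sketched with OpenCV: a small random homography is applied to the thermal image so that the thermal/visible pair no longer aligns perfectly, emulating imprecise extrinsic calibration. The perturbation magnitude and the placeholder frame are arbitrary assumptions.

```python
# Degrade thermal/visible alignment with a small random homography.
import numpy as np
import cv2

def degrade_calibration(thermal, max_shift=8.0, seed=0):
    rng = np.random.default_rng(seed)
    h, w = thermal.shape[:2]
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    dst = (src + rng.uniform(-max_shift, max_shift, size=(4, 2))).astype(np.float32)
    H = cv2.getPerspectiveTransform(src, dst)   # homography from jittered corners
    return cv2.warpPerspective(thermal, H, (w, h))

thermal = np.zeros((512, 640), dtype=np.uint8)   # placeholder thermal frame
print(degrade_calibration(thermal).shape)         # (512, 640)
```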
The proposed end-to-end fusion-to-detection network is then compared with combinations of state-of-the-art fusion and state-of-the-art detectors to offer a fair comparison with state-of-the-art methods. The novel architecture has been trained using the 256-entry self-collected fusion dataset combined with MS COCO 2017 [62] and the FLIR ADAS dataset [63].

Two state-of-the-art fusions are presented: firstly, the DenseFuse [39] fusion, which was initially trained on the FLIR ADAS dataset [63]; secondly, the variational convolutional auto-encoder [38], used for its compelling results with minimal dataset size. The generative aspect of that method allows for generating higher-quality fused images compared to other deep-learning-based approaches. Since the proposed approach provides both fusion and detection without providing the intermediate image, it is impossible to compare with fusion methods alone. Three state-of-the-art detectors have been used. The Faster-RCNN [44] with the Resnet50 [50] backbone is a legacy detector. The SSD300 [45] provides a recent network with compelling aspects such as real-time operation and high mAP. Finally, YoloV8 [47, 48] is used as a modern implementation with high performance and low latency. The detectors under examination were equipped with models pre-trained on the Common Objects in Context (COCO) dataset. Fine-tuning was performed for 100 epochs on the new datasets to assess the performance of these detectors on the fusion datasets. The proposed entropy-driven choice of k was used to compare against the state of the art introduced at this section's beginning. The state-of-the-art fusion and detector combination results are then compared to the proposed entropy-based method in Fig. 8. The figure presents the networks evaluated on the FLIR dataset in Fig. 8a, the winter-based 256-image dataset provided by our team in Fig. 8b, and the PST-900 [64] dataset in Fig. 8c. The harsh winter-based dataset is significantly more challenging for detecting objects due to snow and night. This 256-image dataset annotated by our team contains snowstorms and other severe data, which are difficult to analyze by neural networks. The PST-900 dataset is taken in subterranean conditions for survivors' rescue. The PST-900 dataset is often taken in darkness, where the thermal image contains the primary information, unlike winter driving, where car lights make the visible image less challenging to understand.

DenseFuse performs slightly better than the VCAE method by half a percent. The SSD300 method using a state-of-the-art fusion performs the worst, at 15.8% in winter conditions and 30.1% for the FLIR ADAS dataset (Fig. 8). SSD is known in the literature to provide lower mAP than other state-of-the-art methods; however, it gives real-time results suitable for edge devices. This very low mAP indicates that SSD, unlike other state-of-the-art methods, is more sensitive to weather variations. The FasterRCNN detector performs slower and introduces latency, which could be fatal in the case of self-driving, but provides 38.4% on the FLIR dataset, 43.8% on PST-900, and 34.8% on the winter dataset (Fig. 8). For YoloV5, the mAP reaches 39.4% using the DenseFuse fusion and 23.6% on the winter dataset. YoloV8, being more precise than YOLOv5, gets 41.36% on the FLIR dataset, 36.6% on PST-900, and 24.21% on the winter dataset (Fig. 8). The end-to-end network using auto-encoders or VGG as backbones performs better than state-of-the-art detectors in all conditions by using the thermal information without the intermediary image. The fusion of thermal and visible images improves the mAP, as this fusion technique effectively detects objects obscured in shadows or otherwise imperceptible in the visible light spectrum.

5 Conclusion

This paper proposes an end-to-end fusion network with synchronization for detecting road obstacles using the fusion of thermal and visible images. Following SSD's path, our method integrates two sequential networks into an end-to-end one to reduce processing costs and increase detection performance. The fusion is applied by a branch-and-trunk approach with spectral-length-based branches and adaptive weights at the fusion layer. The proposed method also suggests using a maximum-delay syncing method that adds robustness for missing sensors while keeping the latency low. The modular branch-and-trunk approach allows for input adaptation without retraining. By using the network's modularity, partial training can reduce the bias created by dataset size during training. The proposed way of splitting the network allows for automatic branching for detection networks without any fast forward in the tensors. The fusion point is automatically selected by the proposed entropy-based method. The proposed network achieved 40.5% mAP on the challenging 256 image-pair heavily degraded winter dataset captured by our team and 44.3% on the public FLIR ADAS dataset, which contains slightly degraded conditions, using thermal images combined with visible spectrum images. It should be noted that the mAP of 44.3% by the proposed method is a significant performance jump in the aforementioned adverse condition, and including daytime thermal images in the dataset has shown a drastic improvement in the mAP reported by SOTA methods, as in the FLIR ADAS dataset. This paper proposes a novel end-to-end fusion approach for thermal and visible imaging to improve autonomous driving performance during
harsh weather like winter. It also offers a novel synchronization method for real-time low-latency synchronization that is vital in autonomous driving. That novel synchronization method is resilient to sensor failure, such as a camera disconnecting during vehicle operation. Our method enhances the robustness, efficiency, and performance of the detection system. It uses low-power edge AI computers that can handle harsh weather conditions and save battery. It also detects objects that are otherwise invisible in the visible spectrum. The proposed architecture can be easily applied to additional spectral lengths, such as ultraviolet, by adding branches and data while keeping the weights of existing branches. The architecture also applies to other tasks using neural networks with inputs of compatible types and a convolutional layer acting as the fusion layer. While the proposed network achieves a satisfactory mAP for self-driving in favorable weather conditions, it struggles in adverse conditions such as winter. In winter, the network surpasses the state-of-the-art methods that use conventional convolutional architectures and shows the feasibility of using attention mechanisms for detection tasks. It also demonstrates the potential of low-computation and high-robustness detection systems. Future work will explore how to enhance the detector's mAP in challenging weather conditions, such as winter scenarios, by employing Transformers as the backbone of the architecture, by applying the proposed architecture to LiDAR-based detection networks, and by fusing a state-of-the-art detection network with a point cloud-based network.

Acknowledgements This research was funded by the Natural Sciences and Engineering Research Council of Canada and the Canada Research Chair Program.

Author Contributions J.B. wrote the main manuscript. A.A., M.Z.A., S.K., and K.A. provided research supervision. J.B., M.O., and L.Z. contributed to the code and experiments. S.K. and K.A. worked on funding acquisition. All authors reviewed the manuscript.

Data availability No datasets were generated or analysed during the current study.

Declarations

Conflict of interest The authors declare no conflict of interest.

References

1. Nguyen, V.N., Jenssen, R., Roverso, D.: Ls-net: fast single-shot line-segment detector. Mach. Vis. Appl. (2021). https://doi.org/10.1007/s00138-020-01138-6
2. Murthy, C.B., Hashmi, M.F., Keskar, A.G.: Efficientlitedet: a real-time pedestrian and vehicle detection algorithm. Mach. Vis. Appl. 33(3), 47 (2022)
3. Yao, J., Huang, B., Yang, S., Xiang, X., Lu, Z.: Traffic sign detection and recognition under low illumination. Mach. Vis. Appl. 34(5), 75 (2023)
4. Boisclair, J., Kelouwani, S., Ayevide, F.K., Amamou, A., Alam, M.Z., Agbossou, K.: Attention transfer from human to neural networks for road object detection in winter. IET Image Proc. (2022). https://doi.org/10.1049/ipr2.12562
5. Zhao, Z., Zheng, P., Xu, S., Wu, X.: Object detection with deep learning: a review. IEEE Trans. Neural Netw. Learn. Syst. 30(11), 3212–3232 (2019). https://doi.org/10.1109/TNNLS.2018.2876865
6. Ji, K., Lei, W., Zhang, W.: A deep retinex network for underwater low-light image enhancement. Mach. Vis. Appl. 34(6), 122 (2023)
7. Malik, M., Majumder, S.: An integrated computer vision based approach for driving assistance to enhance visibility in all weather conditions. In: International and National Conference on Machines and Mechanisms
8. Ghamisi, P., Rasti, B., Yokoya, N., Wang, Q., Hofle, B., Bruzzone, L., Bovolo, F., Chi, M., Anders, K., Gloaguen, R., Atkinson, P.M., Benediktsson, J.A.: Multisource and multitemporal data fusion in remote sensing: a comprehensive review of the state of the art. IEEE Geosci. Remote Sens. Mag. 7(1), 6–39 (2019). https://doi.org/10.1109/MGRS.2018.2890023
9. Du, H., Hao, X., Ye, Y., He, L., Guo, J.: A camera style-invariant learning and channel interaction enhancement fusion network for visible-infrared person re-identification. Mach. Vis. Appl. 34(6), 117 (2023)
10. Watt, N., Plessis, M.C.: Neuro-augmented vision for evolutionary robotics. Mach. Vis. Appl. 34(6), 95 (2023)
11. Coenen, M., Schack, T., Beyer, D., Heipke, C., Haist, M.: Consinstancy: learning instance representations for semi-supervised panoptic segmentation of concrete aggregate particles. Mach. Vis. Appl. 33(4), 57 (2022)
12. Singha, A., Bhowmik, M.K.: Tu-vdn: Tripura university video dataset at night time in degraded atmospheric outdoor conditions for moving object detection. In: 2019 IEEE International Conference on Image Processing (ICIP), pp. 2936–2940. IEEE
13. Liu, Q., Lu, X., He, Z., Zhang, C., Chen, W.-S.: Deep convolutional neural networks for thermal infrared object tracking. Knowl. Based Syst. 134, 189–198 (2017)
14. Jonsson, P.: Remote sensor for winter road surface status detection. In: 2011 IEEE SENSORS, pp. 1285–1288. IEEE
15. Light, J., Parthasarathy, S., McIver, W.: Monitoring winter ice conditions using thermal imaging cameras equipped with infrared microbolometer sensors. Procedia Comput. Sci. 10, 1158–1165 (2012)
16. Fetzer, G.J., Sitter, D.N., Jr., Gugler, D., Ryder, W.L., Griffis, A.J., Miller, D., Gelbart, A., Bybee-Driscoll, S.: Ultraviolet, Infrared, and Near-infrared Lidar System and Method (2010)
17. Shopovska, I., Jovanov, L., Philips, W.: Deep visible and thermal image fusion for enhanced pedestrian visibility. Sensors (2019). https://doi.org/10.3390/s19173727
18. Chebrolu, K.N.R., Kumar, P.N.: Deep learning based pedestrian detection at all light conditions. In: Proceedings of the 2019 IEEE International Conference on Communication and Signal Processing, ICCSP 2019, pp. 838–842. https://doi.org/10.1109/ICCSP.2019.8698101
19. Bercier, E., Louvat, B., Harant, O., Balit, E., Bouvattier, J., Nacsa, L.: Far-infrared thermal camera: an effortless solution for improving adas detection robustness. In: Proceedings of SPIE—The International Society for Optical Engineering, vol. 11009. https://doi.org/10.1117/12.2520364
20. Hwang, S., Park, J., Kim, N., Choi, Y., So Kweon, I.: Multispectral pedestrian detection: benchmark dataset and baseline. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1037–1045
21. Yang, R., Zhu, Y., Wang, X., Li, C., Tang, J.: Learning target-oriented dual attention for robust rgb-t tracking. In: 2019 IEEE International Conference on Image Processing (ICIP), pp. 3975–3979. IEEE
22. Li, H., Wu, X.: Densefuse: a fusion approach to infrared and visible images. IEEE Trans. Image Process. 28(5), 2614–2623 (2019). https://doi.org/10.1109/TIP.2018.2887342
23. Huangfu, Y., Campbell, L., Habibi, S.: Temperature effect on thermal imaging and deep learning detection models. In: 2022 IEEE Transportation Electrification Conference & Expo (ITEC), pp. 185–189. IEEE (2022)
24. Erhan, D., Courville, A., Bengio, Y., Vincent, P.: Why does unsupervised pre-training help deep learning? In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 201–208. JMLR Workshop and Conference Proceedings (2010)
25. Tu, L., Qin, Z., Yang, L., Wang, F., Geng, J., Zhao, S.: Identifying the Lambertian property of ground surfaces in the thermal infrared region via field experiments. Remote Sens. 9(5), 481 (2017)
26. Yeong, D.J., Velasco-Hernandez, G., Barry, J., Walsh, J.: Sensor and sensor fusion technology in autonomous vehicles: a review. Sensors 21(6), 2140 (2021)
27. Li, Y., Jha, D.K., Ray, A., Wettergren, T.A.: Feature level sensor fusion for target detection in dynamic environments. In: 2015 American Control Conference (ACC), pp. 2433–2438. IEEE (2015)
28. Kandylakis, Z., Vasili, K., Karantzalos, K.: Fusing multimodal video data for detecting moving objects/targets in challenging indoor and outdoor scenes. Remote Sens. 11(4), 446 (2019)
29. Yang, Y., Lee, W., Osteen, P., Geneva, P., Zuo, X., Huang, G.: icalib: inertial aided multi-sensor calibration. In: VINS Workshop (2021)
30. Mirzaei, F.M.: Extrinsic and Intrinsic Sensor Calibration. PhD thesis, University of Minnesota (2013)
31. Ackermann, J.: Robustness against sensor failures. Automatica 20(2), 211–215 (1984). https://doi.org/10.1016/0005-1098(84)90027-X
32. Azarang, A., Manoochehri, H.E., Kehtarnavaz, N.: Convolutional autoencoder-based multispectral image fusion. IEEE Access 7, 35673–35683 (2019). https://doi.org/10.1109/ACCESS.2019.2905511
33. Guan, Q., Ren, S., Chen, L., Feng, B., Yao, Y.: A spatial-compositional feature fusion convolutional autoencoder for multivariate geochemical anomaly recognition. Comput. Geosci. 156, 104890 (2021). https://doi.org/10.1016/j.cageo.2021.104890
34. Kwarteng, P., Chavez, A.: Extracting spectral contrast in Landsat thematic mapper image data using selective principal component analysis. Photogramm. Eng. Remote. Sens. 55(1), 339–348 (1989)
35. Carper, W., Lillesand, T., Kiefer, R.: The use of intensity-hue-saturation transformations for merging spot panchromatic and multispectral image data. Photogramm. Eng. Remote. Sens. 56(4), 459–467 (1990)
36. Laben, C.A., Brower, B.V.: Process for enhancing the spatial resolution of multispectral imagery using pan-sharpening. Google Patents. US Patent 6,011,875 (2000)
37. Aiazzi, B., Baronti, S., Selva, M.: Improving component substitution pansharpening through multivariate regression of ms + pan data. IEEE Trans. Geosci. Remote Sens. 45(10), 3230–3239 (2007)
38. Ren, L., Pan, Z., Cao, J., Liao, J.: Infrared and visible image fusion based on variational auto-encoder and infrared feature compensation. Infrared Phys. Technol. 117, 103839 (2021). https://doi.org/10.1016/j.infrared.2021.103839
39. Li, H., Wu, X.-J.: Densefuse: a fusion approach to infrared and visible images. IEEE Trans. Image Process. 28(5), 2614–2623 (2018)
40. Lee, J., Shiotsuka, D., Nishimori, T., Nakao, K., Kamijo, S.: Gan-based lidar translation between sunny and adverse weather for autonomous driving and driving simulation. Sensors 22(14), 5287 (2022)
41. Ahmad, K., Pogorelov, K., Riegler, M., Conci, N., Halvorsen, P.: Cnn and gan based satellite and social media data fusion for disaster detection. In: MediaEval (2017)
42. Wang, C., Yang, G., Papanastasiou, G., Tsaftaris, S.A., Newby, D.E., Gray, C., Macnaught, G., MacGillivray, T.J.: Dicyc: Gan-based deformation invariant cross-domain information fusion for medical image synthesis. Inf. Fusion 67, 147–160 (2021)
43. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J., Chintala, S.: Pytorch: an imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems, vol. 32, pp. 8024–8035. Curran Associates, Inc. (2019)
44. Zhao, X., Li, W., Zhang, Y., Gulliver, T.A., Chang, S., Feng, Z.: A faster rcnn-based pedestrian detection system. In: 2016 IEEE 84th Vehicular Technology Conference (VTC-Fall), pp. 1–5 (2016)
45. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., Berg, A.C.: Ssd: single shot multibox detector. In: European Conference on Computer Vision, pp. 21–37. Springer, Berlin
46. Redmon, J., Farhadi, A.: Yolo9000: better, faster, stronger. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7263–7271
47. Terven, J., Cordova-Esparza, D.: A comprehensive review of yolo: From yolov1 to yolov8 and beyond. arXiv preprint arXiv:2304.00501 (2023)
48. Jocher, G., Chaurasia, J.Q.A.: Yolo by Ultralytics (2023)
49. Wang, C.-Y., Bochkovskiy, A., Liao, H.-Y.M.: YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv (2022). https://doi.org/10.48550/ARXIV.2207.02696
50. Kaiming, H., Xiangyu, Z., Shaoqing, R., Jian, S.: Deep residual learning for image recognition. arXiv:1512.03385 (2015)
51. Alex, K.: One weird trick for parallelizing convolutional neural networks. arXiv:1404.5997 (2014)
52. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)
53. Koonce, B.: Efficientnet. In: Convolutional Neural Networks with Swift for Tensorflow: Image Recognition and Dataset Categorization, pp. 109–123 (2021)
54. Xia, X., Xu, C., Nan, B.: Inception-v3 for flower classification. In: 2017 2nd International Conference on Image, Vision and Computing (ICIVC), pp. 783–787. IEEE (2017)
55. Tan, M., Chen, B., Pang, R., Vasudevan, V., Sandler, M., Howard, A., Le, Q.V.: Mnasnet: platform-aware neural architecture search for mobile. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2820–2828 (2019)
56. Karen, S., Andrew, Z.: Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556 (2014)
57. Soekhoe, D., Van Der Putten, P., Plaat, A.: On the impact of data set size in transfer learning using deep neural networks. In: Advances in Intelligent Data Analysis XV: 15th International Symposium, IDA 2016, Stockholm, Sweden, October 13–15, 2016, Proceedings 15, pp. 50–60. Springer, Berlin (2016)
58. Blything, R., Biscione, V., Vankov, I.I., Ludwig, C.J., Bowers, J.S.: The human visual system and cnns can both support robust online translation tolerance following extreme displacements. J. Vis. 21(2), 9–9 (2021)
59. Valentine, K., Temple, W.G., Zhang, K.M.: Intelligent electric vehicle charging: rethinking the valley-fill. J. Power Sources 196(24), 10717–10726 (2011)
60. Ma, Z.: Decentralized valley-fill charging control of large-population plug-in electric vehicles. In: 2012 24th Chinese Control and Decision Conference (CCDC), pp. 821–826. IEEE (2012)
61. Kaempchen, N., Dietmayer, K.: Data synchronization strategies for multi-sensor fusion. In: Proceedings of the IEEE Conference on Intelligent Transportation Systems, vol. 85, pp. 1–9
62. Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft coco: common objects in context. In: European Conference on Computer Vision, pp. 740–755. Springer (2014)
63. LLC, T.F.: Flir adas dataset. Online (2019)
64. Shivakumar, S.S., Rodrigues, N., Zhou, A., Miller, I.D., Kumar, V., Taylor, C.J.: Pst900: Rgb-thermal calibration, dataset and segmentation network. In: 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 9441–9447. IEEE (2020)

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Jonathan Boisclair received a B.Sc.A. in computer science in April of 2017. He completed the M.Sc. degree in 2019 from the Université du Québec à Trois-Rivières (UQTR), Trois-Rivières, QC, Canada. In September of 2019, he began a Ph.D. in mechanical engineering at Université du Québec à Trois-Rivières to improve his knowledge of applied artificial intelligence. His main research interests are artificial intelligence, autonomous driving, advanced driving techniques, and real computer intelligence.

Ali Amamou received the B.S. degree in Industrial Computing and Automatic Science from the National Institute of Applied Sciences and Technology, Tunis, Tunisia, in 2013 and the M.S. degree in Embedded Systems Science from Arts et Métiers ParisTech University, Aix-en-Provence, France, in 2014. Between 2015 and 2018, he completed his Ph.D. degree in electrical engineering at Université du Québec à Trois-Rivières (UQTR), Canada. In May 2018, he started a postdoctoral fellowship at the Hydrogen Research Institute. His main research interests are the optimization of energy systems for stationary and mobile applications, hybridization of energy sources for vehicular application, and eco-energy navigation of low-speed autonomous electric vehicles.

Sousso Kelouwani received the Ph.D. degree in robotics systems from the Ecole Polytechnique de Montreal, in 2011. He completed his Postdoctoral Internship on fuel cell hybrid electric vehicles with the University of Quebec at Trois-Rivières (UQTR), in 2012. He developed expertise in the optimization and intelligent control of vehicular applications. He has been a Full Professor of Mechatronics with the Department of Mechanical Engineering, since 2017, and a member of the Hydrogen Research Institute. He holds four patents in the U.S. and Canada. He has published more than 100 scientific articles. His research interests include optimizing energy systems for vehicle applications, advanced driver assistance techniques, and intelligent vehicle navigation taking into account Canadian climatic conditions. He is the holder of the Canada Research Chair in Energy Optimization of Intelligent Transport Systems and of the Noovelia Research Chair in Intelligent Navigation of Autonomous Industrial Vehicles. Prof. Kelouwani was co-president and president of the technical committee of the IEEE International Conferences on Vehicular Power and Propulsion in Chicago (USA, 2018) and in Hanoi (Vietnam, 2019). He is the winner of the Canada Governor General's Gold Medal, in 2003, and a member of the Order of Engineers of Quebec. In 2019, his team received the First Innovation Prize in partnership with DIVEL, awarded by the Association des Manufacturiers de la Mauricie et Centre-du-Québec for the development of an autonomous and natural navigation system. In 2017, he received the Environment Prize from the Gala des Grands Prix d'excellence en transport, the Association québécoise du Transport (AQTr), for the development of hydrogen range extenders for electric vehicles.

M. Zeshan Alam received his B.S. degree in Computer Engineering from COMSATS University, Pakistan, M.S. degree in Electrical and Electronics Engineering from the University of Bradford, UK, and Ph.D. in Electrical Engineering and Cyber-Systems from Istanbul Medipol University, Turkey. He worked at the University of Cambridge as a post-doctoral fellow, where his work focused on computer vision and machine learning models. He recently joined Brandon University, Canada, as an assistant professor while also working as a Computer Vision Consultant at Vimmerse INC. His research interests include immersive videos, computational imaging, computer vision, and machine learning modeling.