Lecture 8 Multi Sensor Part2
• RGB + Infrared
• RGB + depth
• RGB + LiDAR
• RGB + IMU
• RGB + language
RGB + Infrared for object detection
No person?
A single-modal RGB sensor cannot capture all objects, e.g., under poor illumination.
Thermal captures stronger signatures for objects that emit heat.
How about for objects that do not? Let’s fuse modalities.
RGB + Infrared for object detection
HDR scenes
RGB + Infrared for object detection
[1] Devaguptapu, et al. Borrow from Anywhere: Pseudo Multi-modal Object Detection in Thermal Imagery. CVPRW, 2019
[Figure: two input modalities, x1 and x2, feeding a fusion block]
Late fusion of RGB + Infrared
(a) Pooling
[Figure: modality x1 and modality x2 combined by a fusion step]
Dalal & Triggs. “Histograms of oriented gradients for human detection”. CVPR, 2005.
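The late-fusion baselines can be sketched in a few lines. This is a minimal sketch, assuming that pooling means concatenating the detections of the two single-modal detectors and that the pooled set is then cleaned up with greedy non-maximum suppression (NMS), which the results later list as a heuristic fusion method. The function names `iou` and `pool_and_nms`, the `[x1, y1, x2, y2]` box layout, and the 0.5 IoU threshold are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an (N, 4) array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def pool_and_nms(boxes_rgb, scores_rgb, boxes_ir, scores_ir, iou_thresh=0.5):
    """Pool detections from both modalities, then apply greedy NMS."""
    # Pooling: simply concatenate the two detection sets.
    boxes = np.concatenate([boxes_rgb, boxes_ir])
    scores = np.concatenate([scores_rgb, scores_ir])
    # NMS: visit boxes from highest to lowest score, dropping heavy overlaps.
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(best)
        rest = order[1:]
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_thresh]
    return boxes[keep], scores[keep]

# Hypothetical detections: one person seen by both sensors, one seen only in infrared.
boxes_rgb = np.array([[10.0, 10.0, 50.0, 50.0]])
scores_rgb = np.array([0.6])
boxes_ir = np.array([[12.0, 12.0, 52.0, 52.0], [100.0, 100.0, 140.0, 140.0]])
scores_ir = np.array([0.9, 0.7])
fused_boxes, fused_scores = pool_and_nms(boxes_rgb, scores_rgb, boxes_ir, scores_ir)
```

Note that NMS keeps only the higher-scoring of two overlapping detections, so agreement between the two sensors does not raise the fused score.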
Chen, Shi, Ye, Mertz, Ramanan, Kong. “Multimodal Object Detection via Probabilistic Ensembling”. ECCV, 2022
Probabilistic ensembling of RGB & Infrared detections
ProbEn is the optimal fusion strategy given the conditional independence assumption between modality x1 and modality x2:
\[
p(x_1 \mid y) = p(x_1 \mid x_2, y)
\]
Bayes rule:
\[
\begin{aligned}
p(y \mid x_1, x_2) &= \frac{p(x_1, x_2 \mid y)\, p(y)}{p(x_1, x_2)} \propto p(x_1, x_2 \mid y)\, p(y) \\
&\propto \frac{p(x_1 \mid y)\, p(y)\; p(x_2 \mid y)\, p(y)}{p(y)} \\
&\propto \frac{p(y \mid x_1)\, p(y \mid x_2)}{p(y)}
\end{aligned}
\]
ProbEn
• multiply single-modal probabilities
• divide by the class prior
• re-normalize
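The three ProbEn steps map directly onto a few lines of code. This is a minimal sketch of the score fusion only (not box fusion), assuming each detector outputs a class posterior p(y | x_m) for a matched detection. The function name `proben_fuse`, the uniform default prior, and the small `eps` for numerical safety are assumptions made for illustration. For M modalities the same independence assumption gives a product of posteriors divided by p(y)^(M-1), which reduces to the two-modality formula when M = 2.

```python
import numpy as np

def proben_fuse(posteriors, prior=None):
    """Fuse per-modality class posteriors under conditional independence.

    posteriors: (M, K) array, row m holds p(y | x_m) over K classes.
    prior:      (K,) class prior p(y); uniform if omitted (an assumption).
    """
    posteriors = np.asarray(posteriors, dtype=float)
    num_modalities, num_classes = posteriors.shape
    if prior is None:
        prior = np.full(num_classes, 1.0 / num_classes)
    prior = np.asarray(prior, dtype=float)
    eps = 1e-12  # avoids log(0); hypothetical choice
    # Step 1: multiply single-modal probabilities (sum of logs).
    log_fused = np.log(posteriors + eps).sum(axis=0)
    # Step 2: divide by the class prior, once per extra modality.
    log_fused -= (num_modalities - 1) * np.log(prior + eps)
    # Step 3: re-normalize so the fused scores sum to one.
    log_fused -= log_fused.max()
    fused = np.exp(log_fused)
    return fused / fused.sum()

# Hypothetical scores for classes [person, background].
rgb = [0.7, 0.3]
thermal = [0.8, 0.2]
fused = proben_fuse([rgb, thermal])
average = np.mean([rgb, thermal], axis=0)
```

With these hypothetical inputs, the fused person score is 0.56 / 0.62 ≈ 0.903, higher than either single-modal score and higher than the 0.75 given by averaging: two agreeing detectors reinforce each other rather than being smoothed toward their mean.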
Probabilistic late-fusion of RGB + Infrared
● ProbEn outperforms heuristic fusion methods, e.g., average fusion and NMS.
● ProbEn still improves even when the conditional independence assumption does not hold.
● ProbEn with off-the-shelf detectors achieves a 26% relative improvement!
[Figure: bar chart of Log-Average Miss Rate on the KAIST dataset (lower is better; axis ticks from 0 to 0.35). Bars: RGB, Thermal, MidFusion, Pooling, NMS, average fusion, ProbEn, ProbEn (3), RPN+BDT [CVPRW 2017], TC-DET [ECCV 2020], IATDNN [InfoFusion 2019], IAF RCNN [PR 2019], CIAN [InfoFusion 2019], MSDS [BMVC 2018], AR-CNN [ICCV 2019], MBNet [ECCV 2020], MLPD [RA-L 2021], GAFF [WACV 2021], ProbEn (3 w/ GAFF). Values as printed in the source: GAFF 0.65; ProbEn (3 w/ GAFF) 0.51.]
Chen, Shi, Ye, Mertz, Ramanan, Kong. “Multimodal Object Detection via Probabilistic Ensembling”. ECCV, 2022
Multi-modality / multi-sensor
• RGB + Infrared
• RGB + depth
• RGB + LiDAR
• RGB + IMU
• RGB + language
RGB + depth