Mixed frame-/event-driven fast pedestrian detection
Z Jiang, P Xia, K Huang, W Stechele… - … on Robotics and …, 2019 - ieeexplore.ieee.org
2019 International Conference on Robotics and Automation (ICRA), 2019
Pedestrian detection has attracted enormous research attention in the field of Intelligent Transportation Systems (ITS) because pedestrians are the most vulnerable traffic participants. So far, almost all pedestrian detection solutions have been based on conventional frame-based cameras, which perform poorly in scenarios with poor lighting conditions and high-speed motion. In this work, a Dynamic and Active Pixel Sensor (DAVIS), whose two channels concurrently output conventional gray-scale frames and asynchronous low-latency temporal contrast events of light intensity, was used for the first time to detect pedestrians in a traffic monitoring scenario. Data from the two camera channels were fed into Convolutional Neural Networks (CNNs), including three YOLOv3 models and three YOLO-tiny models, to obtain bounding boxes of pedestrians with their respective confidence maps. Furthermore, a confidence map fusion method combining the CNN-based detection results from both DAVIS channels was proposed to obtain higher accuracy. The experiments were conducted on a custom dataset collected on the TUM campus. Benefiting from the high speed, low latency and wide dynamic range of the event channel, our method achieved a higher frame rate and lower latency than methods using only a conventional camera. Additionally, it reached higher average precision by using the fusion approach.
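The abstract does not spell out how the confidence maps from the two channels are combined, so the following is only a minimal sketch of one plausible approach, not the authors' method. It assumes each detector returns boxes as `(x_min, y_min, x_max, y_max, confidence)`, rasterizes them into per-pixel confidence maps, and fuses the two maps with a weighted average. The function names, the box format, and the `frame_weight` parameter are all hypothetical.

```python
from typing import List, Tuple

# Assumed detection format: (x_min, y_min, x_max, y_max, confidence) in pixels.
Box = Tuple[int, int, int, int, float]
Grid = List[List[float]]


def confidence_map(boxes: List[Box], width: int, height: int) -> Grid:
    """Rasterize boxes into a map holding the highest confidence covering each pixel."""
    grid = [[0.0] * width for _ in range(height)]
    for x0, y0, x1, y1, conf in boxes:
        # Clip each box to the image bounds before filling it in.
        for y in range(max(0, y0), min(height, y1)):
            row = grid[y]
            for x in range(max(0, x0), min(width, x1)):
                if conf > row[x]:
                    row[x] = conf
    return grid


def fuse_maps(frame_map: Grid, event_map: Grid, frame_weight: float = 0.5) -> Grid:
    """Per-pixel weighted average of the two maps (an assumed fusion rule)."""
    w = frame_weight
    return [
        [w * f + (1.0 - w) * e for f, e in zip(f_row, e_row)]
        for f_row, e_row in zip(frame_map, event_map)
    ]


def mean_confidence(fused: Grid, box: Box) -> float:
    """Mean fused confidence inside a box, usable to rescore a candidate detection."""
    x0, y0, x1, y1 = box[:4]
    values = [fused[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(values) / len(values) if values else 0.0


# Toy example: both channels detect the same 2x2 region with different confidence.
frame_boxes = [(0, 0, 2, 2, 0.8)]
event_boxes = [(0, 0, 2, 2, 0.6)]
fused = fuse_maps(
    confidence_map(frame_boxes, 4, 4),
    confidence_map(event_boxes, 4, 4),
)
score = mean_confidence(fused, frame_boxes[0])
```

With equal weights, a region detected by both channels keeps a score between the two individual confidences, while a region seen by only one channel has its score halved. A real system would tune the weighting, for example favoring the event channel in low light.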