
2016 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI 2016)
Kongresshaus Baden-Baden, Germany, Sep. 19-21, 2016

UAV Based Target Tracking and Recognition*


Tian Xiang1, Fan Jiang1, Gongjin Lan1, Jiaming Sun1, Guocheng Liu1, Qi Hao1 and Cong Wang2

Abstract: In this paper, we develop a quadrotor UAV based target tracking and recognition system, which includes an intelligent gimbal sub-system for accurate camera positioning and fast image processing. A set of robust consensus-based algorithms are developed for object tracking, in addition to moving-background processing techniques. A neural network learning based database is used to improve target recognition performance. Moreover, a Geographic Information System (GIS) is used to provide geo-location, environmental, and contextual information for the tracked objects. Experimental and simulation results have demonstrated the robustness of the proposed target tracking and recognition framework.

*Supported by the Shenzhen Fundamental Research Program and Southern University of Science and Technology Research Committee.
1 Tian Xiang, Fan Jiang, Gongjin Lan, Jiaming Sun, Guocheng Liu and Qi Hao are with the Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, Guangdong 518055, China ([email protected], [email protected], [email protected], [email protected], [email protected], [email protected]).
2 Cong Wang is with the School of Automation Science and Engineering, South China University of Technology, Guangzhou, Guangdong 510641, China ([email protected]).

I. INTRODUCTION

In recent years, the quadrotor UAV has become increasingly widely used in both military and civil applications due to its small size, low cost, high maneuverability, and fast response [1], so that it can replace pilots and rescue teams in dangerous missions. Civil applications include aerial crop surveys, inspection of power lines and pipelines, forest fire detection and monitoring, search and rescue, aerial photography, etc. For military applications, UAVs can complete complex tasks such as transporting materials, battlefield surveillance, border patrol, electronic warfare, etc. With the recent advances in MEMS, artificial intelligence, digital communications, and sensing technology, UAVs have found applications in many more areas. Besides, computer vision techniques further enable intelligent UAV applications.

UAV based autonomous target identification and tracking poses many technical challenges. First, as targets are moving within a changing background, it is important to develop real-time algorithms for varying-background video processing. Second, since the resolution of human facial images is usually low due to the long distance between the UAV and the target, identifying targets with high accuracy from low-resolution facial images becomes a key problem in real-time aerial surveillance for law enforcement agencies. Third, target occlusion by the changing background is very difficult to handle, since it is hard to determine whether the target has left the camera Field of View (FoV) or is occluded by other objects. Besides, algorithms without full utilization of object and GIS databases result in limited identification and tracking performance.

In [2], a set of algorithms are developed to detect and track objects with pre-known shapes for UAVs. The scheme has proved useful for tracking indoor targets below certain heights. In [3], [4], [5], GIS databases are utilized in UAV navigation systems to generate tracking results in terms of global coordinates. The experimental results illustrate the improved tracking performance achieved with the help of GIS databases. A state-of-the-art consensus-based tracker capable of tracking deformable objects in real time is given in [6]. A real-time object tracking and identification system with a convolutional neural network running on an FPGA is proposed in [7]; its performance is robust to various environment and target conditions, but it is only suitable for near-distance objects.

In this paper, we propose a framework for real-time UAV based object tracking and recognition. Compared with most of the tracking systems described above, we develop a more complete and robust system with both a GIS environment database and a neural network based object database. Besides, a set of consensus-based algorithms are utilized for target tracking, which show better performance than other algorithms in the case of occlusion.

The contributions of this paper include:
1) Design a UAV based target tracking and recognition system with an intelligent gimbal sub-system;
2) Develop a consensus-based visual tracking algorithm to achieve robust target tracking, as well as image mosaicking techniques to deal with moving backgrounds;
3) Utilize a GIS database to provide environmental and contextual information, and improve tracking accuracy;
4) Utilize a neural network based object database for fast feature matching and high-accuracy target recognition.

This paper is organized as follows. Section II describes the whole system architecture and states the problems; Section III presents the target tracking and recognition framework, including image mosaicking, the consensus-based tracking algorithm, the GIS database, and the neural network based database; Section IV provides the results and discussions; Section V concludes the paper and outlines future work.
II. SYSTEM SETUP AND PROBLEM STATEMENT

A. UAV based Target Tracking and Recognition

Fig. 1 illustrates the UAV based target tracking and recognition system, which contains the UAV system (flight control + gimbal sub-system) and the ground station. The real-time object tracking and recognition scheme is performed within the UAV system, which implements a set of consensus-based algorithms for target tracking given the video images, and utilizes a database for feature matching as well as various sensors for controlling UAV flight and gimbal motions. On the other side, the ground station receives the video stream and the position and orientation information of the quadrotor and gimbal, and fuses them with the GIS database to achieve accurate target geo-locations.

Fig. 1. Target tracking and recognition system

Pixhawk (an open source autopilot module and software) is selected as the UAV's flight control platform [8]. A gimbal system with a 4K HD camera is used to capture the target images. In addition, NVIDIA's embedded computing board TX1 is selected as the image processing platform; it is OpenCV-accelerated and suitable for performing real-time target tracking and recognition tasks, which usually involve high computational complexity.

Fig. 2. Workflow of the complete system

The entire framework is shown in Fig. 2. The whole process can be divided into the following stages:
1) Image acquisition: under the instruction mode, the UAV points the onboard camera at the target and continuously acquires target images after receiving instructions on the target's initial position, texture, and size from the ground station; under the autonomous mode, the UAV automatically determines the targets of interest, tracks their movements, continuously acquires their images, and preprocesses the data, including white balance and de-noising.
2) Video mosaicking: to deal with the moving backgrounds in the UAV videos, consecutive images within a time window are aligned and stitched together into a panoramic image to obtain a static background image, such that the targets can be localized more accurately [9].
3) Target recognition: extract the feature points of the images and match those features with the target models stored in the database; if the matching is successful, continue the target tracking, otherwise re-localize the target in the images for subsequent processing.
4) Target tracking: implement the consensus-based algorithm to perform online target detection, localization, tracking, and prediction; estimate the motion and the shape of the target continuously.
5) Gimbal and UAV control: estimate the position and attitude of the UAV according to the predicted position of the target and data from the sensors, then control the motion of the gimbal and the UAV to keep the target in the vicinity of the image center.
6) Ground command center: the UAV transfers the target position to the ground station, which integrates it with the GIS database to determine the global coordinates of the target and to provide target trajectory prediction using geographical information such as roads or rivers; besides, the user can interact with the UAV through the ground command center, for example to select the target from the video monitor or to change the UAV flight course.
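Read together, these six stages form a per-frame loop on the onboard computer. The Python skeleton below is only an illustrative sketch of such a loop, not the authors' implementation; the collaborator objects (camera, mosaicker, recognizer, tracker, gimbal, ground link) and their methods are hypothetical stand-ins for the modules described in stages 1-6.

```python
import cv2  # assumed available on the onboard image-processing board


def tracking_loop(camera, mosaicker, recognizer, tracker, gimbal, ground_link):
    """Illustrative per-frame loop for stages 1-6; every collaborator is a stub."""
    while True:
        ok, frame = camera.read()                      # stage 1: image acquisition
        if not ok:
            break
        frame = cv2.GaussianBlur(frame, (3, 3), 0)     # simple de-noising placeholder

        background = mosaicker.update(frame)           # stage 2: video mosaicking
        bbox = tracker.update(frame, background)       # stage 4: consensus-based tracking

        if not recognizer.matches(frame, bbox):        # stage 3: verify target identity
            bbox = recognizer.relocalize(frame)        # re-localize on matching failure
            tracker.reset(frame, bbox)

        gimbal.point_at(bbox)                          # stage 5: keep target near image center
        ground_link.send(bbox, gimbal.pose())          # stage 6: report to the ground station
```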
B. Gimbal System

The development of the gimbal system follows these steps:
1) Choose an FPGA/DSP/GPU as the core processor; the flexibility and scalability of an FPGA greatly facilitate the verification of a variety of algorithms;
2) Use brushless PMSM motors to meet the SWaP-C (size, weight, power, and cost) requirements, and choose carbon fiber materials for the gimbal shell; use low-power devices to reduce power consumption and low-cost devices to reduce cost;
3) Employ the modular design principle to achieve system scalability, and choose standard data interfaces such as Ethernet;
4) Perform signal integrity design and robust design for the circuit to achieve high system reliability.

The hardware components of the gimbal system include the core processor, various sensors (IMU, barometer, GPS, camera), actuators (brushless motors), and a wireless transceiver, as shown in Fig. 3.

Fig. 3. The hardware components of the airborne gimbal system

Fig. 4 illustrates the gimbal system architecture, where
1) the core processor estimates the orientation and attitude of the quadrotor based on the data from the attitude control unit, and performs the target tracking and recognition algorithms based on images from the cameras;
2) the attitude control unit estimates the attitude using data fusion from different sensors, sends the estimation results to the core processor for further computation, and receives commands to control both the UAV and the gimbal motions and/or movements;
3) other modules, such as the sensor modules, the GPS module, and the cameras, provide information for the core processor.

Fig. 4. The hardware functional block architecture of the gimbal system

C. Problem Statement

The goal of this study is to develop a quadrotor UAV based target tracking and recognition system that can
1) provide an intelligent gimbal sub-system for target tracking and recognition;
2) track multiple objects in the case of occlusion;
3) use a GIS database to provide environmental and contextual information;
4) utilize a neural network based database for fast feature matching.

III. TARGET TRACKING AND RECOGNITION

A. Image Mosaicking

In order to align the sequential frames, there must be common features within the captured images that can be matched between neighboring image frames. The SIFT algorithm is used to implement the feature matching [10]; it is invariant to scale, orientation, and affine distortion, so it can produce more precise matching between successive video frames.

After finishing the feature matching between successive frames, the extracted feature points are used to compute a geometric transformation matrix for warping the images so that they can be aligned with the previous frames. Finally, consecutive aligned images are stitched together into a panoramic image. The process is described in Fig. 5.

Fig. 5. The process of image mosaicking
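As a concrete illustration of this step, the sketch below aligns the current frame to the previous one using SIFT matches and a RANSAC-estimated homography in OpenCV. It is a minimal example of the general approach (assuming an OpenCV build with SIFT available), not the authors' implementation; the ratio-test and reprojection thresholds are illustrative choices.

```python
import cv2
import numpy as np


def align_to_previous(prev_gray, curr_gray):
    """Warp curr_gray into the coordinate frame of prev_gray via SIFT + RANSAC homography."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(prev_gray, None)
    kp2, des2 = sift.detectAndCompute(curr_gray, None)

    # Match descriptors and keep unambiguous matches (Lowe's ratio test).
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(des2, des1, k=2)
    good = [m[0] for m in matches if len(m) == 2 and m[0].distance < 0.7 * m[1].distance]
    if len(good) < 4:
        return None                     # not enough correspondences for a homography

    src = np.float32([kp2[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp1[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

    # Robustly estimate the geometric transformation and warp the current frame.
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = prev_gray.shape
    return cv2.warpPerspective(curr_gray, H, (w, h))
```

Accumulating such pairwise alignments over a time window and blending the warped frames yields the panoramic background image used for target localization.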
B. Object Tracking

A robust and accurate object tracking scheme is central to the other image processing stages, such as target recognition and target geo-localization. In an urban environment, targets are frequently occluded by other objects, such as trees, buildings, etc. Targets themselves may have a certain 3D structure and are often deformed in 2D images, posing a serious challenge to the tracking algorithm. To address these challenges, we develop a robust visual tracking algorithm using consensus-based temporal learning.

1) The scheme uses the SIFT algorithm to extract the image feature points and their descriptors. After obtaining the target ROI from the operator, the algorithm uses the features inside the ROI as the initial target model.
2) For subsequent frames, target tracking is performed in an evaluate-and-update manner. The bounding box of the target is determined by taking a consensus vote on inliers, as in CMT [6]. Meanwhile, outliers in subsequent frames are also evaluated using a statistical Bayesian approach, to identify outliers that are in fact part of the target and add them back to the model. False inliers rejected by the consensus are also removed from the model. To minimize false matches, feature points with multiple occurrences in the image are removed from the matching process, as such points degrade the matching quality.
3) Target occlusion poses a challenge to target tracking in urban environments. Although our tracker is robust to partial occlusions, frequent total occlusion will severely disrupt the functions of subsequent video processing and target-following UAV flight control. To tackle this problem, an Extended Kalman Filter (EKF)-based state observer is employed on the output of the tracking algorithm. This ensures steady and continuous position and scale updates by using predictions, which is essential for a stable control system.
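A minimal sketch of such a state observer is given below, assuming a constant-velocity model on the bounding-box center and scale; with this linear model the update reduces to a standard Kalman filter, whereas the observer described above is an EKF. The noise parameters are illustrative, and passing None mimics coasting on predictions during total occlusion.

```python
import numpy as np


class BoxObserver:
    """Constant-velocity Kalman observer on the box center and scale (cx, cy, s).

    A simplified stand-in for the EKF-based observer described above.
    State vector: [cx, cy, s, vcx, vcy, vs].
    """

    def __init__(self, dt=1.0, q=1e-2, r=1.0):
        self.x = np.zeros(6)
        self.P = np.eye(6)
        self.F = np.eye(6)
        self.F[0, 3] = self.F[1, 4] = self.F[2, 5] = dt    # position/scale integrate velocity
        self.H = np.hstack([np.eye(3), np.zeros((3, 3))])  # only (cx, cy, s) are measured
        self.Q = q * np.eye(6)                             # process noise (illustrative)
        self.R = r * np.eye(3)                             # measurement noise (illustrative)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:3]                                  # predicted (cx, cy, s)

    def update(self, z):
        # No consensus from the tracker (e.g. total occlusion): coast on the prediction
        # so the gimbal/UAV control loop keeps receiving a smooth state estimate.
        if z is None:
            return self.x[:3]
        y = np.asarray(z, dtype=float) - self.H @ self.x   # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)           # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(6) - K @ self.H) @ self.P
        return self.x[:3]
```

Per frame, predict() would be called before the tracker runs and update() afterwards with the consensus result (or None when the vote fails), so the control loop always receives a smooth bounding-box estimate.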
C. Integration of UAV Images and the GIS Database

Since the position and orientation of the UAV's camera are not accurate enough, due to GPS deviation and measurement errors of the IMU sensors, UAVs usually cannot accurately localize targets in terms of global coordinates merely based on the target position within the images. In order to solve this problem, a method of matching the UAV camera images with a GIS database is proposed to help UAVs localize targets and predict target movements in terms of global coordinates and with a higher precision (Fig. 6).

The key is to extract features of the HD images captured by the UAV and match them with the images stored in the GIS database. Our scheme uses motion detection and estimation, as well as edge detection techniques, to extract image features, and then estimates the zoom ratio and camera angles of the images in terms of the data format of the GIS database. Based on both the GIS data and the target's position, we can predict the trajectories of targets, which will improve the range and accuracy of target tracking.

Fig. 6. Target global coordinates and static background

The ground station first receives a rough estimate of the target geo-location from the UAV based on the GPS and IMU readings, which can be used to estimate the camera height and angles, and then searches the GIS database to determine the GIS region where the tracked object is present. After the determination of the target GIS region, image transformation and registration are performed between the UAV video streams and the GIS database images. After successful image registration, the local coordinates of the target positions can be converted into global coordinates. Furthermore, the trajectories of targets can be better predicted by utilizing the geographic structures of the environment, such as roads, bridges, and rivers. The process is illustrated in Fig. 7 and Fig. 8. Given the huge size of the GIS database, it should be installed on the ground station. The resolution of the UAV video streams is also optimized to achieve a trade-off between image matching performance and transmission costs.

Fig. 7. Process of video scenario and GIS integration

Fig. 8. GIS as a service

The angle θ, altitude h, and 3D position (x1, y1, z1) of the UAV can be obtained from the onboard IMU and GPS module. Then the target location can be estimated in terms of those parameters by

x = x1
y = y1 - h tan θ        (1)
z = z1 - h

as shown in Fig. 9.

Fig. 9. The principle of geopointing
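The following sketch simply evaluates Eq. (1). It assumes θ is the camera tilt measured from the vertical, that the offset lies along the -y axis, and that the target sits on flat ground a height h below the UAV; these conventions are inferred from Fig. 9 and are assumptions rather than statements from the paper.

```python
import math


def geopoint(x1, y1, z1, h, theta):
    """Estimate the target position from the UAV pose using Eq. (1).

    (x1, y1, z1): UAV position from GPS/IMU; h: height above ground;
    theta: camera tilt in radians, assumed measured from the vertical with the
    offset along the -y axis (sign convention inferred from Fig. 9).
    """
    x = x1
    y = y1 - h * math.tan(theta)
    z = z1 - h
    return x, y, z


# Example: hovering 100 m above the ground, camera tilted 30 degrees from the vertical.
print(geopoint(0.0, 0.0, 120.0, 100.0, math.radians(30)))  # -> (0.0, -57.7..., 20.0)
```

For instance, a UAV flying 100 m above the ground with the camera tilted 30 degrees from the vertical would place the target roughly 57.7 m from the point directly below it.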
However, such an estimated position is not accurate, due to the errors caused by the sensors and GPS readings. As a result, the GIS database is required to provide more geographic information about the target surroundings, which can improve target tracking accuracy, provide target global coordinates, and predict the target trajectory with more geographical constraints.

D. Integration of a Neural Network Based Database

To improve the target recognition accuracy, which is degraded by target occlusion and by target obscuration at long measurement distances, we propose a scheme integrating both the GIS database and a neural network based target database, to achieve long-distance and high-precision target recognition performance for UAVs. The system setup is illustrated in Fig. 10.

Fig. 10. Integrating GIS and neural network based databases

First, edge detection and target segmentation are used to extract target features before feature normalization. Then, a neural network is trained to establish the target database, where each target has multiple images at different scales and perspectives. During the training, the weights are adjusted to construct the feature models for the corresponding targets, which maximizes the feature distances among different targets, reduces the feature distances for the same target, and updates the target database, as shown in Fig. 11.

Fig. 11. Building the target database

The ground station and the UAV implement two sets of neural networks, which are synchronized at a certain rate. First, the ground station implements a target database based on artificial neural networks and trains them offline. Then, the trained weights are uploaded to the neural network based classifier on the UAV, which performs the target recognition in real time. Meanwhile, new target images acquired by the UAV are also sent to the ground station to update the target database. The implementation of the neural network based target database is shown in Fig. 12.

Fig. 12. Offline and online training architecture of the neural network

Fig. 13 shows the target recognition probabilities of different vehicle images using convolutional neural networks. One challenge for UAV based target recognition is that target images are often acquired from a top perspective, so the number of feature points is reduced, and the feature differences among targets are reduced accordingly.

Fig. 13. Autonomous matching of targets using the neural network

The target classifier is first trained on the ImageNet [11] dataset, then fine-tuned on a pedestrian/vehicle dataset captured by ourselves. The proprietary dataset not only includes various kinds of targets, but also takes into account a variety of target poses and surroundings. The structure of the recognition neural network includes five convolution layers and two fully connected layers. Meanwhile, a similar neural network with three convolution layers is trained in parallel, which trades accuracy for speed.
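The paper specifies only the layer counts (five convolutional layers and two fully connected layers, plus a lighter three-convolution variant traded for speed). The PyTorch sketch below is one possible instantiation; the channel widths, kernel sizes, and 64x64 input resolution are assumptions, not the authors' architecture.

```python
import torch.nn as nn


class RecognitionCNN(nn.Module):
    """Five-conv / two-FC classifier in the spirit of Section III-D.

    Channel widths, kernel sizes, and the assumed 64x64 RGB input are illustrative.
    """

    def __init__(self, num_classes=10, channels=(32, 64, 128, 128, 256)):
        super().__init__()
        layers, in_ch = [], 3
        for out_ch in channels:                        # five convolution layers
            layers += [nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                       nn.ReLU(inplace=True),
                       nn.MaxPool2d(2)]
            in_ch = out_ch
        self.features = nn.Sequential(*layers)
        self.classifier = nn.Sequential(               # two fully connected layers
            nn.Flatten(),
            nn.Linear(channels[-1] * 2 * 2, 512),      # 64x64 input -> 2x2 after five pools
            nn.ReLU(inplace=True),
            nn.Linear(512, num_classes),
        )

    def forward(self, x):                              # x: (N, 3, 64, 64)
        return self.classifier(self.features(x))
```

In the workflow described above, such a network would be trained offline at the ground station (pre-trained on ImageNet-scale data and fine-tuned on the pedestrian/vehicle set), with its weights periodically pushed to the UAV-side copy.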
IV. EXPERIMENT RESULTS

The tracking algorithm without EKF estimation is tested on a real-world car-chase dataset. The dataset features many challenges, including similar objects, partial/total occlusion, and changes of target pose, as illustrated in Fig. 14. We compared the performance of our algorithm against CMT using two criteria: feature points tracked and total frames of tracking loss. The results are shown in Fig. 15 and Fig. 16. Total occlusion by land obstacles (trees, bridges, etc.) is marked in red.

Fig. 14. Types of challenges in the dataset

Fig. 15. Comparison of the number of features tracked using the proposed algorithm and CMT

Fig. 16. Comparison of the cumulative number of tracking losses between the proposed algorithm and CMT

From Fig. 15, we can see that the number of feature points tracked by our algorithm is consistently far larger than that of CMT. The speed of recovery from tracking loss due to occlusions is also faster. The unstable number of features is due to the rapid update of the model, which still needs better tuning in terms of the learning rate.

Fig. 16 shows the total number of frames that do not have a good target lock. The lower cumulative tracking loss indicates that our algorithm has a significantly lower probability of losing track under partial occlusions by nearby vehicles and road signs in urban environments, proving its robustness.

At the current stage, we are performing more experiments on human and vehicle tracking and recognition with the proposed UAV-based system. More results will be obtained in the near future. With the help of the GIS and neural network databases, the system performance has been much improved.

V. CONCLUSIONS

This paper presents a framework for UAV based target tracking and recognition. The intelligent gimbal system is designed to provide a platform for object tracking and recognition. Within our scheme, SIFT is used to extract the image feature points and descriptors, and a robust visual tracking algorithm using consensus-based temporal learning is applied to realize the tracking process. Besides, an Extended Kalman Filter (EKF)-based state observer is employed to address the target occlusion problem. The GIS database has been integrated within the scheme to improve the target localization and prediction accuracy, and to provide target global coordinates. The neural network target database has been integrated to improve target recognition performance. Experiment results demonstrate that the proposed algorithm can achieve better performance than CMT in various challenging situations.

ACKNOWLEDGMENT

This paper is partly supported by the Southern University of Science and Technology Research Committee (No. FRG-SUSTC1501A-29 and FRG-SUSTC1501A-44). We are also grateful for the precious help from our lab engineers Yunbo Yang and Miaolin Hou.

REFERENCES

[1] S. Islam, P. X. Liu, and A. El Saddik, "Robust control of four-rotor unmanned aerial vehicle with disturbance uncertainty," IEEE Transactions on Industrial Electronics, vol. 62, no. 3, pp. 1563-1571, 2015.
[2] K. Boudjit and C. Larbes, "Detection and implementation autonomous target tracking with a quadrotor AR.Drone," in 12th International Conference on Informatics in Control, Automation and Robotics (ICINCO), vol. 2, July 2015, pp. 223-230.
[3] D.-Y. Gu, C.-F. Zhu, J. Guo, S.-X. Li, and H.-X. Chang, "Vision-aided UAV navigation using GIS data," in 2010 IEEE International Conference on Vehicular Electronics and Safety (ICVES), July 2010, pp. 78-82.
[4] C.-F. Zhu, S.-X. Li, H.-X. Chang, and J.-X. Zhang, "Matching road networks extracted from aerial images to GIS data," in 2009 Asia-Pacific Conference on Information Processing (APCIP), vol. 2, July 2009, pp. 63-66.
[5] N. Rackliffe, H. A. Yanco, and J. Casper, "Using geographic information systems (GIS) for UAV landings and UGV navigation," in 2011 IEEE Conference on Technologies for Practical Robot Applications (TePRA), April 2011, pp. 145-150.
[6] G. Nebehay and R. Pflugfelder, "Clustering of static-adaptive correspondences for deformable object tracking," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 2784-2791.
[7] R. Ghosh, A. Mishra, G. Orchard, and N. V. Thakor, "Real-time object recognition and orientation estimation using an event-based camera and CNN," in 2014 IEEE Biomedical Circuits and Systems Conference (BioCAS), Oct. 2014, pp. 544-547.
[8] L. Meier, P. Tanskanen, F. Fraundorfer, and M. Pollefeys, "PIXHAWK: A system for autonomous flight using onboard computer vision," in 2011 IEEE International Conference on Robotics and Automation (ICRA), 2011, pp. 2992-2997.
[9] R. Patil, P. E. Rybski, T. Kanade, and M. M. Veloso, "People detection and tracking in high resolution panoramic video mosaic," in 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), vol. 2, 2004, pp. 1323-1328.
[10] Y. Ke and R. Sukthankar, "PCA-SIFT: A more distinctive representation for local image descriptors," in Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), vol. 2, 2004, pp. II-506.
[11] J. Deng, W. Dong, R. Socher, L. Li, K. Li, and F. Li, "ImageNet: A large-scale hierarchical image database," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009, pp. 248-255.
