Combined Major Project
(EC454)
ON
OBJECT DETECTION AND TRACKING SYSTEM
This project report is submitted to USICT in partial fulfillment of the
requirements for the degree of
“Bachelor of Technology in Electronics and Communication”
I have conformed to the norms and guidelines given in the Ethical Code of
Conduct of the Institute. All sources used for reference or citation have
been duly acknowledged.
Jatin Drall
CERTIFICATE
This is to certify that Mr. Jatin Drall has carried out his major project
work on “Object Detection and Tracking System”
using an Arduino kit, in the Electronics and Communication Engineering
Department of USICT, Dwarka, during the year 2023-2024 of Batch 2020-
2024. His work is approved for submission in partial fulfillment of the
requirements for the degree of “Bachelor of Technology.”
Date:
Tuesday, May 28, 2024
ACKNOWLEDGEMENT
I am very grateful to all the people who gave me an excellent opportunity to
develop this project, “Object Detection and Tracking System”, which
not only enhanced my knowledge but also allowed me to apply that knowledge
in a meaningful and practical manner.
I am heartily thankful to my mentor, Prof. MANSI JHAMB (Professor),
Department of ECE, USICT, whose encouragement, guidance and support kept
me highly motivated from the initial to the final level and enabled me to develop
an understanding of the project. Lastly, I offer my regards to all of those who
directly or indirectly supported me in any respect during the completion of this
project.
In this project, tracking approaches that employ a stable model can only accommodate small
changes in object appearance; they do not explicitly handle severe occlusions or continuous
appearance changes. Occlusion, either partial or full, can be classified into self-occlusion,
inter-object occlusion, and occlusion by the background scene structure. Self-occlusion occurs
when one part of the object occludes another, especially for articulated objects. Inter-object
occlusion occurs when two objects being tracked occlude each other, which is the common case in
surveillance video. Similarly, occlusion by the background occurs when a structure in the
background, e.g. a column or a divider, occludes the tracked objects. Generally, for inter-object
occlusion, multi-object trackers can exploit knowledge of the position and appearance of the
occluded and occluding objects to detect and resolve the occlusion. Partial occlusion of an object
by a scene structure is hard to detect, since it is difficult to differentiate between the object
changing its shape and the object getting occluded.
A common approach to handling full occlusions during tracking is to assume motion consistency
and, once an occlusion is detected, to keep predicting the object location until the object
reappears; the Kalman filter is a typical example of such a predictor. Occlusion can also
be resolved implicitly during the generation of object tracks. The chance of occlusion can further
be reduced by an appropriate selection of camera positions. For instance, if the cameras are
mounted to give a bird's-eye view of the scene, most occlusions can be eliminated. Multiple
cameras viewing the same scene can also be used to resolve object occlusions during tracking.
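A minimal sketch of predicting through a full occlusion with a constant-velocity Kalman filter is given below (Python, using NumPy). The state layout, noise values and the toy detection sequence are illustrative assumptions, not the exact implementation used in this project.

import numpy as np

# Minimal constant-velocity Kalman filter sketch (state = [x, y, vx, vy]).
# During full occlusion the correction step is skipped and the filter keeps
# predicting the object's position until it reappears.

dt = 1.0                                  # frame interval (assumption)
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], float)      # constant-velocity transition
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], float)       # only position (x, y) is measured
Q = np.eye(4) * 1e-2                      # process noise (illustrative value)
R = np.eye(2) * 1.0                       # measurement noise (illustrative value)


def predict(x, P):
    x = F @ x
    P = F @ P @ F.T + Q
    return x, P


def correct(x, P, z):
    innovation = z - H @ x
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)        # Kalman gain
    return x + K @ innovation, (np.eye(4) - K @ H) @ P


# Toy per-frame loop: None stands for a frame in which the object is occluded.
x, P = np.zeros(4), np.eye(4) * 10.0
for z in [(10, 10), (12, 11), None, None, (18, 14)]:
    x, P = predict(x, P)                  # always predict
    if z is not None:                     # correct only when the object is visible
        x, P = correct(x, P, np.asarray(z, float))
    print("estimated position:", x[:2])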
Multi-camera tracking methods have demonstrated superior tracking results compared to single-camera
trackers in the case of persistent occlusion between objects. In many situations it is not
possible to have overlapping camera views due to limited resources or large areas of interest.
Non-overlapping multi-camera tracking has to deal with sparse object observations.
Therefore, additional assumptions have to be made about the object speed and path in order to
obtain correspondences across cameras. Methods that establish object correspondence assume that
1) the cameras are stationary and 2) the object tracks within each camera are available. The
performance of these algorithms depends greatly on how closely the objects follow the established
paths and expected time intervals across cameras. Object appearance changes can be included in
the model update by introducing noise and transient models.
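A simple sketch of cross-camera correspondence for non-overlapping views follows (Python, using SciPy's Hungarian assignment). It assumes, as stated above, that the cameras are stationary and per-camera tracks are available; the transit-time prior, the colour feature and the cost weights are illustrative assumptions.

import numpy as np
from scipy.optimize import linear_sum_assignment

# Each exit from camera A is matched to an entry in camera B by combining a
# transit-time prior with an appearance distance; all values are illustrative.

EXPECTED_TRANSIT = 8.0     # expected travel time between the cameras (seconds)
TRANSIT_STD = 2.0          # spread of the transit-time prior
APPEARANCE_WEIGHT = 0.5

def cost(exit_track, entry_track):
    """Lower cost = more likely to be the same object."""
    dt = entry_track["t_enter"] - exit_track["t_exit"]
    time_cost = ((dt - EXPECTED_TRANSIT) / TRANSIT_STD) ** 2
    # Appearance distance between mean colour descriptors (toy feature).
    app_cost = np.linalg.norm(exit_track["color"] - entry_track["color"])
    return time_cost + APPEARANCE_WEIGHT * app_cost

# Toy tracks: objects leaving camera A and appearing later in camera B.
exits   = [{"id": 1, "t_exit": 0.0, "color": np.array([0.9, 0.1, 0.1])},
           {"id": 2, "t_exit": 1.0, "color": np.array([0.1, 0.1, 0.9])}]
entries = [{"id": 7, "t_enter": 9.5, "color": np.array([0.2, 0.1, 0.8])},
           {"id": 8, "t_enter": 8.2, "color": np.array([0.8, 0.2, 0.1])}]

C = np.array([[cost(e, n) for n in entries] for e in exits])
rows, cols = linear_sum_assignment(C)     # Hungarian algorithm
for r, c in zip(rows, cols):
    print(f"camera-A track {exits[r]['id']} -> camera-B track {entries[c]['id']}")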
Despite explicit modeling of noise and transient features, trackers often perform poorly, or even
lose track, when the performer suddenly turns around during an action and reveals a
completely different appearance that has not been learned before. A potential approach to
overcoming this limitation is to learn different views of the object and later use them during
tracking. In addition, a tracker that takes advantage of contextual information to incorporate
general constraints on the shape and motion of objects will usually perform better than one that
does not exploit this information. The capability to learn object models online may greatly
increase the applicability of a tracker. Unsupervised learning of object models for multiple non-
rigid moving objects from a single camera remains an unsolved problem. One interesting
direction that has largely been unexplored is the use of semi-supervised learning techniques for
modeling objects. These techniques (co-training, transductive SVMs, constrained graph cuts) do
not require prohibitive amounts of training data.
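The sketch below illustrates the simpler idea of learning multiple views of an object online by keeping a small gallery of appearance templates (Python). The histogram feature, the distance measure and the thresholds are illustrative assumptions; it does not implement the semi-supervised methods mentioned above.

import numpy as np

# Online multi-view appearance model: the tracker keeps a small gallery of
# histogram "views" of the object and adds a new view when the current
# appearance does not match any stored one (e.g. after the person turns around).

NUM_BINS = 16
NEW_VIEW_THRESHOLD = 0.5      # Bhattacharyya distance above which a view is "new"
MAX_VIEWS = 5

def color_histogram(patch):
    """Normalised grey-level histogram of an image patch (toy descriptor)."""
    hist, _ = np.histogram(patch, bins=NUM_BINS, range=(0, 255))
    return hist / max(hist.sum(), 1)

def bhattacharyya(h1, h2):
    return np.sqrt(max(0.0, 1.0 - np.sum(np.sqrt(h1 * h2))))

gallery = []                  # learned views of the tracked object

def update_model(patch):
    """Match the current patch to the gallery; learn it as a new view if needed."""
    h = color_histogram(patch)
    if not gallery:
        gallery.append(h)
        return "first view learned"
    d = min(bhattacharyya(h, g) for g in gallery)
    if d > NEW_VIEW_THRESHOLD and len(gallery) < MAX_VIEWS:
        gallery.append(h)     # previously unseen appearance, e.g. the back view
        return f"new view learned (distance {d:.2f})"
    return f"matched existing view (distance {d:.2f})"

# Toy usage: a bright patch followed by a much darker one (object turned around).
front = np.full((32, 32), 200, dtype=np.uint8)
back  = np.full((32, 32), 40, dtype=np.uint8)
print(update_model(front))
print(update_model(back))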
REFERENCES
a) vis, J., Bobick, A.: The representation and recognition of action using temporal templates.
In: Proc. IEEE Conf. on Computer Vision and Pattern Recognition, San Juan, Puerto Rico
(1997
b) Wren, C., Azarbayejani, A., Darrell, T., Pentland, A.: Pfinder: real-time tracking of the
human body. IEEE Transactions on Pattern Analysis and Machine Intelligence 19 (1997)
c) Leibe, B., Seemann, E., Schiele, B.: Pedestrian detection in crowded scenes. In: Proc.
IEEE Conf. on Computer Vision and Pattern Recognition, San Diego, CA (2005)
d) Kockelkorn, M., Luneburg, A., Scheffer, T.: Using transduction and multi-view learning to
answer emails. In: European Conf. on Principles and Practice of Knowledge Discovery in
Databases (2003)
e) Levin, A., Viola, P., Freund, Y.: Unsupervised improvement of visual detectors using co-
training. In: Proc. 9th Intl. Conf. on Computer Vision, Nice, France (2003)
f) Opelt, A., Pinz, A., Zisserman, A.: Incremental learning of object detectors using a visual shape
alphabet. In: Proc. IEEE Conf. on Computer Vision and Pattern Recognition, New York, NY
(2006)
g) Mikolajczyk, K., Schmid, C., Zisserman, A.: Human Detection Based on a Probabilistic
Assembly of Robust Part Detectors. In: Pajdla, T., Matas, J(G.) (eds.) ECCV 2004, Part I.
LNCS, vol. 3021, pp. 69–82. Springer, Heidelberg (2004)
RESEARCH PAPER
a) Porikli, F., Yilmaz, A.: Object Detection and Tracking System
WEBSITES
1) https://www.tutorialspoint.com
2) https://www.researchgate.net/