Object Detection Guide From A Computer Vision Expert (2024) - Viam
You may not be familiar with the term "object detection," but you’ve most likely come into
contact with it in one way or another. This field of computer vision is quietly working behind
the scenes in your everyday life.
Think about how easy it is to unlock your phone with just your face or scan your fruit at a
self-checkout in the grocery store—these conveniences are made possible by object
detection.
With object detection, you can see exactly where designated objects are within an image.
But what exactly is object detection, and how does it work?
https://www.viam.com/post/computer-vision-object-detection-guide 1/19
12/6/24, 4:59 PM Object detection guide from a computer vision expert (2024) | Viam
In this guide, I’ll walk you through the basics of object detection, how it’s used in everyday
life, the technology that makes it possible, and what you’ll need to know before diving in
yourself.
Whether you're curious about how things work or are someone who wants to see how this
technology can be useful in your projects or business, this blog is for you.
The designated objects of flowers and people being detected by an object detection model. Built with
Viam’s ML Model Service.
If you're looking for people, for example, the technology will scan the scene, draw boxes
around all the detected individuals, label them as "person," and provide a confidence score
indicating how certain it is that each detected object is actually a person.
This allows you to identify and analyze specific objects within a scene accurately and
efficiently.
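As a minimal sketch of what a detector hands back, the snippet below models each detection as a box, a label, and a confidence score, then keeps only the "person" results. The `Detection` class, its field names, and the sample values are assumptions made for illustration; they are not the output format of any particular library.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected object (hypothetical structure for illustration)."""
    label: str         # class name, e.g. "person"
    confidence: float  # 0.0 to 1.0, how certain the model is
    x_min: int         # bounding box corners in pixels
    y_min: int
    x_max: int
    y_max: int


def find_label(detections, label):
    """Return only the detections whose label matches."""
    return [d for d in detections if d.label == label]


# Made-up sample output for a single frame.
frame_detections = [
    Detection("person", 0.92, 40, 30, 120, 260),
    Detection("flower", 0.81, 200, 180, 240, 230),
    Detection("person", 0.67, 300, 50, 370, 270),
]

people = find_label(frame_detections, "person")
for d in people:
    print(f"{d.label} ({d.confidence:.0%}) at ({d.x_min}, {d.y_min})-({d.x_max}, {d.y_max})")
```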
Object detection is a way of identifying where specific objects are within an image or video. Built with
Viam’s ML Model Service.
To put it another way, imagine viewing the world through the lens of a camera. The camera
captures everything in its view, but object detection steps in to identify and highlight key
objects within the image.
This allows you to quickly and accurately focus on what matters most, giving you a clearer
picture of what the scene contains.
Image classification is a way of identifying what objects are within a single image or video. Built with Viam’s
ML Model Service.
Image classification, also referred to as object recognition, operates on the assumption that
there is a single object or class to be identified in an entire image. For example, an image
classification model might look at a picture and determine that it belongs to the class "dog,"
or "cat," or "fish," without specifying where within the image the object is located.
To learn more about image classification, head to our guide.
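To make the single-label idea concrete, here is a small sketch of the final step many classifiers use: turning raw scores into probabilities with softmax and picking one class for the whole image. The class list and the raw scores are invented for illustration, and no location is produced at any point.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, class_names):
    """Return the single most likely class and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return class_names[best], probs[best]


class_names = ["dog", "cat", "fish"]
raw_scores = [2.0, 0.5, -1.0]  # hypothetical model output for one image

label, probability = classify(raw_scores, class_names)
print(label, round(probability, 3))
```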
This image shows the difference between object detection and image classification.
In contrast, object detection not only identifies multiple objects within an image but also
determines their exact coordinates. For example, in an image of a living room, object
detection can simultaneously identify and locate dogs, cats, and fish, drawing bounding
boxes around each one.
This multi-object capability builds on the foundation of image classification by adding
spatial information.
This image shows the difference between object detection and image segmentation.
So, while image classification provides a general label for an entire image, object detection
adds the ability to pinpoint where multiple objects are within the scene, and image
segmentation offers even finer detail by classifying every pixel.
Each of these tasks plays a crucial role in the broader field of computer vision, enabling
machines to interpret and interact with the visual world in increasingly sophisticated ways.
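One way to see how segmentation is finer-grained than detection: a per-pixel mask can always be collapsed into a bounding box, but not the other way around. The sketch below assumes a tiny hand-made binary mask for a single object; `mask_to_box` is a hypothetical helper name.

```python
def mask_to_box(mask):
    """Return (x_min, y_min, x_max, y_max) covering every 1 pixel, or None."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, value in enumerate(row) if value]
    if not rows:
        return None
    return (min(cols), min(rows), max(cols), max(rows))


# Tiny 5x6 mask: 1 marks pixels that belong to the object, 0 is background.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

box = mask_to_box(mask)
pixel_count = sum(sum(row) for row in mask)
print(box, pixel_count)
```

The box covers 12 pixels while the mask marks only 8 of them, which is the extra detail segmentation provides.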
An apple being wrongly identified as a tomato due to the heuristics traditional object detection models
within computer vision relied on.
Similarly, one of the first facial detection models used on a digital camera, the Fujifilm
FinePix S6500fd, relied on an algorithm that identified facial features like eyes and nose
shadows based on light and dark patterns.
An image showing how a facial detection model looks from a camera’s viewpoint.
Showing Haar-like features, which are instrumental in the Viola-Jones algorithm, allowing cameras to
detect certain patterns that suggest facial regions.
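Here is a rough sketch of one Haar-like feature: the difference in brightness between two adjacent rectangles, computed quickly with an integral image (summed-area table). The 4x4 patch and the function names are assumptions for illustration; a full Viola-Jones detector evaluates many such features at many positions and scales.

```python
def integral_image(img):
    """Build a summed-area table with an extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x]
    return ii


def region_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), using 4 lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Haar-like feature: bottom half minus top half of a w x h window (h even)."""
    half = h // 2
    top = region_sum(ii, x, y, w, half)
    bottom = region_sum(ii, x, y + half, w, half)
    return bottom - top


# Toy 4x4 grayscale patch: a dark band (like eyes) above a brighter band (like cheeks).
patch = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]

ii = integral_image(patch)
feature = two_rect_feature(ii, 0, 0, 4, 4)
print(feature)
```

A large positive value means the lower half is much brighter than the upper half, which is the kind of light and dark pattern these detectors look for.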
Histogram of Oriented Gradients (HOG) - 2005: Used for detecting objects, like
pedestrians, by counting occurrences of gradient orientation in localized portions of
an image.
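The core of HOG can be sketched for a single cell: compute a gradient at each pixel, then build a histogram of gradient directions. This simplified version assumes 9 unsigned orientation bins and lets each pixel vote with its gradient magnitude; it leaves out the block normalization and vote interpolation that a full HOG descriptor includes.

```python
import math


def orientation_histogram(cell, bins=9):
    """Histogram of unsigned gradient orientations (0-180 degrees) for one cell.

    Each interior pixel votes with its gradient magnitude into one bin.
    """
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, len(cell) - 1):
        for x in range(1, len(cell[0]) - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # horizontal change
            gy = cell[y + 1][x] - cell[y - 1][x]  # vertical change
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# Toy cell with a vertical edge: dark on the left, bright on the right.
cell = [[0, 0, 100, 100] for _ in range(4)]

hist = orientation_histogram(cell)
print(hist)
```

All of the votes land in the first bin here because every gradient points horizontally across the edge.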
While groundbreaking at the time, these methods were inflexible and hard to use for
general-purpose detection tasks, such as identifying multiple different kinds of objects at
once.
This made the rise of deep learning-based object detection around 2014 all the more
important, as it brought greater flexibility and accuracy by learning features directly from
data.
A diagram showing how convolutional neural networks (CNNs) operate, similarly to the primary visual cortex
of the brain. (source)
Imagine a viewfinder moving across an image, detecting changes from light to dark and
learning shapes that define specific objects, like fur patterns or scales.
Through iterative training, the network starts by recognizing fine details and progressively
builds up to understanding the entire image.
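The "viewfinder" can be sketched as a small filter sliding across the image and producing a strong response wherever it sees a change from dark to light. In a real CNN the filter values are learned during training; the hand-written kernel and toy image below are assumptions for illustration.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding) and return the response map.

    Like most CNN layers, this applies the kernel without flipping it.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            total = 0
            for ky in range(kh):
                for kx in range(kw):
                    total += image[y + ky][x + kx] * kernel[ky][kx]
            row.append(total)
        output.append(row)
    return output


# Hand-written filter that responds to dark-to-light changes from left to right.
edge_kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

# Toy 4x5 image: dark on the left, bright on the right.
image = [[0, 0, 0, 9, 9] for _ in range(4)]

response = convolve2d(image, edge_kernel)
print(response)
```

The response is zero over the flat dark area and high where the window straddles the edge.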
For instance, you might show a neural network many images of rooms containing tomatoes,
apples, and people wearing red shirts. Over time, the network learns to distinguish between
the pixels of a tomato and those of an apple, developing a "certainty" of what constitutes a
tomato.
How does object detection work?
Before we explore the various object detection frameworks, it’s important to note that while
these frameworks are valuable, they’re not essential for creating an object detection model.
So, don’t be discouraged if you’re not familiar with them.
With that, let’s dive in!
Types of deep learning object detection model architectures
Object detection can be achieved through various model architectures, which can be broadly
categorized into single-stage and two-stage detectors. Both types of detectors use CNNs to
analyze images and pinpoint objects.
Let’s examine the details of each architecture type.
Diagram showing how single-stage detectors perform object detection in a single pass through the
network.
These models predict both the bounding boxes and the class probabilities directly from the
full images in one go, without needing a separate region proposal stage.
This makes them particularly suitable for real-time applications where speed is crucial.
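The "one go" idea can be sketched as a single loop over a grid of predictions, where every cell already carries a box and class probabilities. The tuple layout, the `decode_grid` name, and the sample numbers are assumptions for illustration; real single-stage detectors use their own output formats and usually add a step that removes duplicate boxes.

```python
def decode_grid(predictions, class_names, image_size, min_score=0.5):
    """Turn a grid of raw cell predictions into labeled boxes in one pass.

    predictions[row][col] = (objectness, cx, cy, w, h, class_probs) where
    cx, cy are offsets inside the cell (0-1) and w, h are fractions of the image.
    """
    rows, cols = len(predictions), len(predictions[0])
    cell_w, cell_h = image_size[0] / cols, image_size[1] / rows
    results = []
    for r in range(rows):
        for c in range(cols):
            objectness, cx, cy, w, h, class_probs = predictions[r][c]
            best = max(range(len(class_probs)), key=lambda i: class_probs[i])
            score = objectness * class_probs[best]
            if score < min_score:
                continue  # nothing confidently detected in this cell
            center_x = (c + cx) * cell_w
            center_y = (r + cy) * cell_h
            box_w, box_h = w * image_size[0], h * image_size[1]
            results.append({
                "label": class_names[best],
                "score": score,
                "box": (center_x - box_w / 2, center_y - box_h / 2,
                        center_x + box_w / 2, center_y + box_h / 2),
            })
    return results


# Hypothetical 2x2 grid; only the bottom-right cell contains an object.
empty = (0.0, 0.0, 0.0, 0.0, 0.0, [0.5, 0.5])
grid = [
    [empty, empty],
    [empty, (0.9, 0.5, 0.5, 0.25, 0.5, [0.1, 0.9])],
]
detections = decode_grid(grid, ["car", "person"], image_size=(200, 200))
print(detections)
```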
YOLO (You Only Look Once) family
YOLO models are a prime example of single-stage detectors, known for their ability to
detect objects in a single pass through the network.
Imagine a city struggling with traffic congestion, aiming to reduce gridlock, spot potential
accidents, and enhance overall safety. While various object detection models could be used,
YOLO models are particularly effective due to their fast processing speeds, allowing traffic
officers to detect and classify objects in real time.
This capability can significantly improve the traffic management system by enabling
dynamic adjustments to traffic light timings, immediate dispatch of emergency services,
and real-time information sharing with commuters.
A look at how YOLO object detection models work in a single pass through the network.
Iterative improvements on the original YOLO have led to versions like YOLOv2, YOLOv3,
YOLOv4, and YOLOv5, each enhancing performance and accuracy. While the most recent
version is YOLOv10, YOLOv8 is widely considered the most stable at the moment, as it’s
been tested extensively.
To learn more about YOLO models, check out our guide.
CenterNet
CenterNet identifies objects by detecting their centers and associated attributes. This
method simplifies the detection process and improves speed by focusing on the central
points of objects rather than scanning the entire image for edges and shapes.
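A minimal sketch of the center-finding step: given a heatmap where higher values mean "more likely an object center," keep the cells that exceed a threshold and beat all of their neighbors. The heatmap values, the threshold, and the `find_centers` name are made up for illustration; a full detector would also read attributes such as box size at each peak.

```python
def find_centers(heatmap, threshold=0.5):
    """Return (x, y, score) for cells that beat the threshold and all 8 neighbors."""
    height, width = len(heatmap), len(heatmap[0])
    peaks = []
    for y in range(height):
        for x in range(width):
            score = heatmap[y][x]
            if score < threshold:
                continue
            neighbors = [
                heatmap[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            ]
            if all(score > n for n in neighbors):
                peaks.append((x, y, score))
    return peaks


# Hypothetical center heatmap with two objects.
heatmap = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.2, 0.0, 0.0],
    [0.0, 0.2, 0.1, 0.3, 0.1],
    [0.0, 0.0, 0.3, 0.8, 0.2],
]
centers = find_centers(heatmap)
print(centers)
```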
A diagram showing how two-stage detectors work, passing images and videos through multiple
stages to detect objects. (source)
This two-step approach, while generally slower than single-stage detectors, tends to
provide higher accuracy, especially for more challenging detection tasks.
R-CNN family
R-CNN (2014): The original R-CNN (Region-based Convolutional Neural Network) was
the first notable algorithm of its kind to use a two-stage object detection framework.
First, it defines region proposals, and then classifies these regions independently.
While it was transformative for its ability to detect multiple objects within an image, it was
relatively slow due to its two-step process and high computational cost.
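The two-stage structure can be sketched with toy stand-ins for each stage: one function proposes candidate regions, and a second classifies every region on its own. Both stand-ins (sliding windows and a brightness rule) are assumptions chosen to keep the example runnable; real two-stage detectors use far more capable components for both steps.

```python
def propose_regions(image_width, image_height, window, stride):
    """Stage 1 stand-in: sliding windows as candidate regions (x, y, w, h)."""
    regions = []
    for y in range(0, image_height - window + 1, stride):
        for x in range(0, image_width - window + 1, stride):
            regions.append((x, y, window, window))
    return regions


def classify_region(image, region):
    """Stage 2 stand-in: label a region by its mean brightness (toy rule)."""
    x, y, w, h = region
    pixels = [image[j][i] for j in range(y, y + h) for i in range(x, x + w)]
    mean = sum(pixels) / len(pixels)
    return ("object", mean / 255) if mean > 128 else ("background", 1 - mean / 255)


def two_stage_detect(image, window=2, stride=2):
    """Run stage 1, then classify every proposed region independently."""
    height, width = len(image), len(image[0])
    detections = []
    for region in propose_regions(width, height, window, stride):
        label, score = classify_region(image, region)
        if label != "background":
            detections.append((region, label, score))
    return detections


# Toy 4x4 image with one bright object in the bottom-right corner.
image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]
found = two_stage_detect(image)
print(found)
```

Because the classifier runs once per proposed region, the cost grows with the number of proposals, which is one reason this style of detector tends to be slower.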
And, keep in mind that with Viam, you can build your own object detection model in under
an hour.
Additionally, if you already have data you’d like to train on, you can upload it to Viam’s app
in minutes. Head to our documentation on uploading a batch of data to learn more.
An image of the config tab within Viam's app, showing data capture as on.
Showing the training process of an object detection model in Viam. As you can see the image is being
tagged with a bounding box and then labeled as “dog.”
Showing the training process within Viam, this time with a close-up view.
When training new models in the Viam app, we use fine-tuning, a transfer learning
approach. This means you only need to label about a hundred images instead of hundreds
or thousands, making the process significantly faster and more resource-efficient.
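To give a feel for why fine-tuning needs so little data, here is a toy sketch of the general idea: the feature extractor stays frozen, and only a small "head" on top of it is trained. Everything here is an assumption for illustration (the hand-written `frozen_features` function stands in for a pre-trained network, and the dataset is four tiny patches); it is not how Viam trains models.

```python
import math


def frozen_features(image):
    """Stand-in for a pre-trained backbone: fixed, never updated during training."""
    flat = [p for row in image for p in row]
    mean = sum(flat) / len(flat)
    spread = max(flat) - min(flat)
    return [mean / 255, spread / 255]


def train_head(samples, labels, epochs=500, lr=0.5):
    """Fit only a small logistic-regression head on top of the frozen features."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for image, target in zip(samples, labels):
            feats = frozen_features(image)
            z = sum(w * f for w, f in zip(weights, feats)) + bias
            pred = 1 / (1 + math.exp(-z))
            error = pred - target  # gradient of the log loss with respect to z
            weights = [w - lr * error * f for w, f in zip(weights, feats)]
            bias -= lr * error
    return weights, bias


def predict(image, weights, bias):
    """Probability that the image belongs to class 1."""
    feats = frozen_features(image)
    z = sum(w * f for w, f in zip(weights, feats)) + bias
    return 1 / (1 + math.exp(-z))


# Tiny made-up dataset: bright patches are label 1, dark patches are label 0.
bright = [[[220, 230], [240, 250]], [[200, 210], [215, 225]]]
dark = [[[10, 20], [30, 40]], [[5, 15], [25, 35]]]
weights, bias = train_head(bright + dark, [1, 1, 0, 0])

print(round(predict([[235, 245], [225, 230]], weights, bias), 2))
print(round(predict([[12, 18], [22, 28]], weights, bias), 2))
```

Only three numbers are learned here (two weights and a bias), which is why a handful of labeled examples is enough once the features are already useful.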
After labeling your images, you can train your model with Viam in just a few minutes by
following the detailed instructions in our documentation.
If you’re looking to save time and deploy a model from another repository, like
HuggingFace, Model Zoo, or Kaggle, to your machine, this is totally doable with Viam. You
have a few options, including:
Deploying a pre-trained model another community member has published on the Viam
Registry. If you have one in mind, I’d look here first as it’s the easiest way to deploy it
onto your own machine.
Uploading a model to the Viam Registry yourself, making it private or public so others
can use it on their devices later.
Deploying a model that’s trained outside the Viam platform that’s already available on
your machine.
Just make sure the model you use is compatible with the Viam platform, which supports
TensorFlow Lite, TensorFlow, PyTorch, and ONNX model frameworks.
Within the Viam app, you can test if your model is working accurately.
If your model isn't performing reliably after deployment, you might need to make some
adjustments. You can try:
Adding and labeling more images in your dataset if you trained the model yourself.
This can boost accuracy.
Lowering the confidence threshold of the transform camera. Ideally, your ML model
should identify objects with high confidence, which usually depends on having a
robust dataset.
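The effect of the confidence threshold can be sketched in a few lines. The detections and threshold values below are invented, and `filter_by_confidence` is a hypothetical helper rather than a Viam setting; the point is only that lowering the threshold keeps more detections, including less certain ones.

```python
def filter_by_confidence(detections, threshold):
    """Keep (label, confidence) pairs at or above the threshold."""
    return [d for d in detections if d[1] >= threshold]


# Made-up model output for one frame: (label, confidence).
detections = [
    ("dog", 0.91),
    ("dog", 0.58),
    ("dog", 0.42),  # possibly a real dog the model is unsure about
    ("dog", 0.15),  # likely noise
]

for threshold in (0.8, 0.5, 0.3):
    kept = filter_by_confidence(detections, threshold)
    print(f"threshold {threshold}: {len(kept)} detection(s) kept")
```

A lower threshold recovers objects the model is unsure about, but it can also let false positives through, so it works best alongside a stronger dataset.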
Now that you know how to train and deploy an accurate object detection model, let's
explore its practical uses.
Quality assurance
Credit computer vision for ensuring your products arrive just as you expected. By
identifying defects on production lines—such as color inconsistencies, dents, or scratches
—object detection helps companies maintain high product quality and reduces the risk of
faulty products reaching the market.
Safety compliance
You might not realize it, but object detection is also making workplaces safer by keeping an
eye on compliance with safety protocols. Imagine you work on a construction site where
wearing hard hats and gloves is mandatory.
With object detection, companies can detect whether employees are wearing their helmets
and automatically alert the safety officer if someone isn't. This way, everyone stays safe and
sticks to the necessary precautions.
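One possible way to turn detections into a compliance check is to match each detected person with a hard hat near the top of their bounding box. The labels, the 25% "head region" rule, and the sample boxes are all assumptions for illustration; a production system would need something more robust.

```python
def center(box):
    """Center point of a box given as (x_min, y_min, x_max, y_max)."""
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)


def is_wearing_hat(person_box, hat_boxes, head_fraction=0.25):
    """True if any hat's center falls in the top portion of the person box.

    Image coordinates are assumed, so y grows downward and y_min is the top.
    """
    x_min, y_min, x_max, y_max = person_box
    head_bottom = y_min + (y_max - y_min) * head_fraction
    for hat in hat_boxes:
        cx, cy = center(hat)
        if x_min <= cx <= x_max and y_min <= cy <= head_bottom:
            return True
    return False


def people_without_hats(detections):
    """Return the boxes of detected people with no matching hard hat."""
    people = [box for label, box in detections if label == "person"]
    hats = [box for label, box in detections if label == "hardhat"]
    return [p for p in people if not is_wearing_hat(p, hats)]


# Made-up detections for one frame: (label, box).
detections = [
    ("person", (50, 40, 110, 240)),
    ("hardhat", (65, 40, 95, 70)),
    ("person", (200, 60, 260, 250)),
]
violations = people_without_hats(detections)
for box in violations:
    print(f"ALERT: person without a hard hat at {box}")
```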
A YOLOv8 hard hat detection model, trained on Viam, in use at a construction site.
Plant care
Object detection extends its benefits to hobbies and daily activities, including indoor
gardening. It can monitor your plants, detecting issues like browning leaves or pests.
By automating responses such as watering or spraying insecticides, it helps keep your
plants healthy with minimal effort, potentially saving your beloved fiddle leaf fig from
wilting.
Pet care
Pet care also benefits from object detection, allowing owners to automate and monitor
many aspects of their animals’ day-to-day activities.
It can detect when pets approach their food bowls and automatically dispense food,
ensuring they are fed on time. Additionally, it can monitor pets' movements to make sure
they get enough exercise, or recognize individual pets in multi-pet households, providing
personalized care and attention.
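The feeding logic might look something like the sketch below: dispense food when a detected pet's box overlaps a fixed "bowl zone," but no more than once per cooldown period. The `Feeder` class, the zone coordinates, and the one-hour cooldown are hypothetical choices for illustration, and the actual dispenser hardware call is left out.

```python
def boxes_overlap(a, b):
    """True if two (x_min, y_min, x_max, y_max) boxes share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


class Feeder:
    """Decides when to dispense food, at most once per cooldown window."""

    def __init__(self, bowl_zone, cooldown_seconds):
        self.bowl_zone = bowl_zone
        self.cooldown_seconds = cooldown_seconds
        self.last_fed = None  # time of the last dispense, in seconds

    def update(self, detections, now):
        """Return True if food should be dispensed for this frame."""
        pet_at_bowl = any(
            label == "dog" and boxes_overlap(box, self.bowl_zone)
            for label, box in detections
        )
        cooled_down = self.last_fed is None or now - self.last_fed >= self.cooldown_seconds
        if pet_at_bowl and cooled_down:
            self.last_fed = now
            return True
        return False


feeder = Feeder(bowl_zone=(100, 200, 160, 240), cooldown_seconds=3600)
near_bowl = [("dog", (90, 150, 150, 230))]
far_away = [("dog", (300, 150, 360, 230))]

print(feeder.update(far_away, now=0))      # dog is not at the bowl
print(feeder.update(near_bowl, now=10))    # dog arrives at the bowl
print(feeder.update(near_bowl, now=20))    # still within the cooldown
print(feeder.update(near_bowl, now=3700))  # cooldown has passed
```

The cooldown keeps a pet that lingers at the bowl from triggering a dispense on every frame.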
An object detection model being trained directly on Viam for logic that automatically gives the dog, Toast,
a treat whenever he’s spotted.
For retail, it could look at customer behavior and count the number of people in a store,
providing valuable data for optimizing store layouts and staffing levels.
The object detection model showing how it could be used with traffic flow management.
Autonomous vehicles
Self-driving cars rely heavily on object detection to identify and navigate around
pedestrians, vehicles, and obstacles, ensuring safe and efficient operation. With it, these
vehicles can interpret their surroundings accurately, making real-time decisions to avoid
collisions and navigate complex environments.
An object detection model within the Viam app interface, showing how people and cars can be identified
to help autonomous vehicles navigate safely.
About the author: Khari is a computer vision enthusiast interested in democratizing access
to robotics and technology. His work focuses on multiple object tracking, image detection,
and other machine learning applications of computer vision.
Reviewed by: Bijan Haney (Lead Engineer, CV Team at Viam) and Nick Hehr (Senior
Developer Advocate at Viam)