Unit 5 Int345
Range data
❑ Range data is a 2-1/2 D or 3-D representation of the scene.
❑ A range image d(i, j) records the distance d to the corresponding scene point (X, Y, Z) at each image pixel (i, j).
❑ Range data may also be provided as a set of 3-D scene points (a point cloud).
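
A minimal sketch of the first representation in Python: back-projecting a range image d(i, j) into a point cloud under an assumed pinhole camera model. The intrinsics (fx, fy, cx, cy) and the synthetic image are illustrative values, not from any particular sensor.

```python
import numpy as np

def range_image_to_point_cloud(d, fx, fy, cx, cy):
    """Back-project a range image d(i, j) into an (N, 3) point cloud.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy); here d holds the depth Z at each pixel.
    """
    rows, cols = d.shape
    j, i = np.meshgrid(np.arange(cols), np.arange(rows))  # pixel coordinates
    Z = d
    X = (j - cx) * Z / fx  # lateral offset from the column index
    Y = (i - cy) * Z / fy  # vertical offset from the row index
    pts = np.stack([X, Y, Z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]  # drop invalid (zero-depth) pixels

# Example: a flat synthetic 4x4 range image, 2 m from the camera
d = np.full((4, 4), 2.0)
print(range_image_to_point_cloud(d, fx=525.0, fy=525.0, cx=2.0, cy=2.0).shape)
```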
Imaging Techniques
■ Passive imaging
■ Stereo imaging
■ Active range sensing
o Time-of-flight sensors
o Triangulation-based sensors
o Structured light
Active range sensors
• Active range sensors are devices that use emitted energy, such
as light or sound, to measure the distance between the sensor
and an object.
• These sensors are commonly used in various applications,
including robotics, industrial automation, automotive systems,
and more.
• They work by emitting a signal, measuring the time it takes for
the signal to bounce off an object and return, and then using
this information to calculate the distance.
Time-of-Flight (ToF) Sensors
• ToF sensors use light, typically infrared, to measure the time it
takes for a light pulse to bounce off an object and return.
• They provide accurate distance measurements and are used in
applications like gesture recognition, augmented reality, and
indoor navigation.
Time-of-Flight Range Sensors
• The source and detector are collocated; a moving mirror scans the pulsed laser beam over the scene.
• t: time taken to travel the forward and return path; v: speed of light in the given medium.
• Distance: d = (v × t) / 2
• Laser-based time-of-flight range sensors are known as light detection and ranging (LIDAR) or laser radar (LADAR).
• Limitation: the shortest time interval the detector can resolve limits the minimum observable distance.
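
A minimal sketch of the range equation d = (v × t)/2 in Python; the speed-of-light constant and the example round-trip time are illustrative.

```python
V_LIGHT = 2.998e8  # approximate speed of light in air, m/s

def tof_distance(round_trip_time_s):
    """Range from a pulsed-laser round-trip time: d = (v * t) / 2."""
    return V_LIGHT * round_trip_time_s / 2.0

print(tof_distance(66.7e-9))  # a ~66.7 ns round trip -> roughly 10 m
# The limitation above, in numbers: a detector resolving no better than
# 1 ns cannot distinguish distances closer than about 15 cm.
```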
Triangulation-based Sensors
• Triangulation-based active range sensors are a category of
active sensors that determine the distance to an object by
measuring the angle and position of the reflected signal.
• Triangulation is a geometric principle used to calculate
distances by forming a triangle between the sensor, the object,
and the point of reflection.
• These sensors are widely used in various applications, including
industrial automation, 3D scanning, robotics, and more.
Triangulation-based Sensors
• The camera and light source are calibrated, and the beam follows a predetermined (known) scanning path.
• The projected ray strikes a scene point (X, Y, Z); the point is observed at pixel (i, j) in the camera, and (u, v) indexes the ray from the light source.
• Apply triangulation between the projected ray and the camera's line of sight to get the 3-D point.
Source: Yi-Chih Hsieh, Decoding structured light patterns for three-dimensional imaging systems, Pattern Recognition 34 (2001) 343-349.
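
A minimal worked example of the triangulation geometry, assuming the baseline between light source and camera and the two ray angles are known (the function and parameter names are illustrative, not from any sensor API).

```python
import math

def triangulate_depth(baseline_m, alpha_rad, beta_rad):
    """Perpendicular distance of the lit point from the sensor baseline.

    alpha is the beam's projection angle and beta the camera's viewing
    angle, both measured from the baseline joining source and camera.
    """
    return baseline_m / (1.0 / math.tan(alpha_rad) + 1.0 / math.tan(beta_rad))

# Example: 10 cm baseline, both rays at 60 degrees to the baseline
print(triangulate_depth(0.10, math.radians(60), math.radians(60)))  # ~0.087 m
```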
Imaging Principle
• Scan the ray along a predetermined, calibrated path, encoding the 3-D position of the projected ray.
• Each scene point (X, Y, Z) illuminated by ray (u, v) is observed at pixel (i, j) in the camera; triangulation then recovers its 3-D position.
Structured Light Sensors
• These sensors project a structured pattern (such as grids or
stripes) of light onto an object and use the deformation of the
pattern on the object's surface to calculate distance.
• They are commonly used in 3D scanning and industrial
metrology.
Structured Light
• Project a stripe or pattern, encoding the 3-D position of each projected ray.
• Get the 3-D positions of all the scene points lying on the projected stripe: each illuminated point (X, Y, Z) is observed at pixel (i, j), while (u, v) identifies its position in the projected pattern.
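
One common way to realize this decoding is ray-plane triangulation: each stripe corresponds to a calibrated plane of light, and intersecting a camera ray with that plane yields the 3-D point. The sketch below assumes a pinhole camera and a hypothetical calibrated stripe plane; all numeric values are illustrative.

```python
import numpy as np

def intersect_ray_with_light_plane(ray_dir, plane_n, plane_d):
    """Intersect a camera ray X = t * ray_dir with a stripe's light plane.

    The plane satisfies n . X + d = 0 in camera coordinates; projector
    calibration supplies (n, d) for every stripe in the pattern.
    """
    t = -plane_d / np.dot(plane_n, ray_dir)
    return t * ray_dir  # the 3-D scene point on the stripe

# Camera ray through pixel (i, j) under a pinhole model
f, cx, cy = 500.0, 320.0, 240.0
i, j = 200, 400
ray = np.array([(j - cx) / f, (i - cy) / f, 1.0])

# Hypothetical calibrated plane for one projected stripe
print(intersect_ray_with_light_plane(ray, np.array([0.8, 0.0, -0.6]), 1.2))
```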
Range Data Segmentation
• Range data segmentation is the task of segmenting (dividing) a range
image, an image containing depth information at each pixel, into
segments (regions) such that all points of the same surface belong to the
same region, different regions do not overlap, and the union of the
regions covers the entire image.
• The goal of range data segmentation is to identify and isolate
objects, obstacles, or distinct regions within the sensor's field of
view for further analysis and decision-making.
There have been two main approaches to the range segmentation problem:
region-based range segmentation and edge-based range segmentation.
1. Region-based range segmentation
• Region-based range segmentation algorithms can be further categorized into two major groups:
parametric model-based range segmentation algorithms and region-growing algorithms.
• Algorithms of the first group are based on assuming a parametric surface model and grouping data
points so that all of them can be considered as points of a surface from the assumed parametric
model (an instance of that model).
• Region-growing algorithms start by segmenting an image into initial regions. These regions are
then merged or extended using a region-growing strategy (see the sketch below). The initial
regions can be obtained by different methods, including iterative or random methods. A drawback
of algorithms in this group is that they generally produce distorted boundaries, because
segmentation is usually carried out at the region level rather than the pixel level.
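
A minimal region-growing sketch on a synthetic range image: regions grow over a simple depth-similarity test, whereas the algorithms described above typically also compare surface normals or fit parametric surfaces. The tolerance and test image are illustrative.

```python
import numpy as np
from collections import deque

def region_grow(depth, tol=0.05):
    """Label a range image by growing regions of similar depth.

    A neighbouring pixel joins the current region when its depth differs
    from the current pixel's by less than tol.
    """
    labels = np.zeros(depth.shape, dtype=int)
    next_label = 0
    for seed in np.ndindex(depth.shape):
        if labels[seed]:
            continue
        next_label += 1
        labels[seed] = next_label
        queue = deque([seed])
        while queue:
            r, c = queue.popleft()
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if (0 <= nr < depth.shape[0] and 0 <= nc < depth.shape[1]
                        and not labels[nr, nc]
                        and abs(depth[nr, nc] - depth[r, c]) < tol):
                    labels[nr, nc] = next_label
                    queue.append((nr, nc))
    return labels

# Two fronto-parallel surfaces at different depths yield two regions
img = np.ones((6, 6)); img[:, 3:] = 2.0
print(region_grow(img))
```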
2. Edge-based range segmentation
• Edge-based range segmentation algorithms are based on edge
detection and labeling edges using the jump boundaries
(discontinuities).
• They apply an edge detector to extract edges from a range image.
Once boundaries are extracted, edges with common properties are
clustered together.
• The segmentation procedure starts by detecting discontinuities using
zero-crossing and curvature values.
• The image is segmented at discontinuities to obtain an initial
segmentation.
• At the next step, the initial segmentation is refined by fitting quadratic
surfaces whose coefficients are estimated using the least-squares method.
• In general, a drawback of edge-based range segmentation algorithms is
that although they produce clean and well defined boundaries between
different regions, they tend to produce gaps between boundaries.
• In addition, for curved surfaces, discontinuities are smooth and hard to
locate, so these algorithms tend to under-segment the range image.
Although the range image segmentation problem has been studied for many
years, segmenting range images of curved surfaces satisfactorily remains
an open problem.
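
The following is a minimal stand-in for the jump-boundary (discontinuity) detection step, thresholding depth differences between neighbouring pixels; real implementations use the zero-crossing and curvature tests described above, and the threshold here is illustrative.

```python
import numpy as np

def jump_edges(depth, thresh=0.1):
    """Mark jump (depth-discontinuity) boundaries in a range image.

    A pixel is an edge when the depth difference to a neighbour
    exceeds thresh.
    """
    gy = np.abs(np.diff(depth, axis=0, prepend=depth[:1, :]))
    gx = np.abs(np.diff(depth, axis=1, prepend=depth[:, :1]))
    return np.maximum(gx, gy) > thresh

depth = np.ones((5, 5)); depth[:, 2:] = 1.5  # a depth step at column 2
print(jump_edges(depth).astype(int))
```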
Range Image Registration
Range image registration is the process of aligning and combining multiple range images,
which may be captured from different viewpoints or at different times, to create a single,
coherent 3D model. The key steps involved in range image registration include:
1. Feature Extraction: Extract features or keypoints from the range images. These
features serve as distinctive points that can be matched across different images.
2. Feature Matching: Match corresponding features across pairs of range images to establish
correspondences that constrain their relative pose (position and orientation).
3. Pose Estimation: Calculate the transformation (usually translation and rotation) that
aligns the range images accurately. Common algorithms for this purpose include
Iterative Closest Point (ICP) and its variants.
4. Global Registration: Combine the relative transformations to register all range images
into a common global coordinate system.
5. Refinement: Fine-tune the alignment to minimize any residual errors or misalignments.
6. Data Fusion: Merge the registered range images into a single, unified 3D model.
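
A minimal point-to-point ICP sketch with NumPy and SciPy illustrates steps 2-3: nearest-neighbour matching followed by a closed-form SVD pose estimate per iteration. Production variants add outlier rejection and convergence tests.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(src, dst, iters=20):
    """Align source points onto destination points (both (N, 3) arrays)."""
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(dst)
    cur = src.copy()
    for _ in range(iters):
        _, idx = tree.query(cur)                # nearest-neighbour matches
        matched = dst[idx]
        mu_s, mu_d = cur.mean(0), matched.mean(0)
        H = (cur - mu_s).T @ (matched - mu_d)   # cross-covariance
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1, 1, np.linalg.det(Vt.T @ U.T)])  # proper rotation
        R_step = Vt.T @ D @ U.T
        t_step = mu_d - R_step @ mu_s
        cur = cur @ R_step.T + t_step           # apply the incremental pose
        R, t = R_step @ R, R_step @ t + t_step  # accumulate the total pose
    return R, t

# Example: recover a small known rotation applied to a random cloud
rng = np.random.default_rng(1)
P = rng.normal(size=(200, 3))
c, s = np.cos(0.1), np.sin(0.1)
R_true = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
R_est, t_est = icp(P, P @ R_true.T + np.array([0.2, 0.0, 0.0]))
```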
Model Acquisition:
Model acquisition is the process of creating a 3D model from the registered range images. This representation
can take the form of a point cloud, a mesh, or other 3D data structures. The typical steps in model acquisition
include:
• Point Cloud Generation: Convert the registered range images into a 3D point cloud, where each point
corresponds to a 3D coordinate in space and includes depth information from the range data.
• Mesh Generation: In some applications, a 3D mesh is generated from the point cloud, often consisting of
interconnected triangles. This mesh can provide a more detailed and structured representation of the 3D
model.
• Texture Mapping: Apply color or texture information to the 3D model, typically using images or textures
captured in conjunction with the range data.
• Mesh Simplification: For real-time rendering and storage efficiency, reduce the complexity of the 3D
model by simplifying the mesh while preserving important geometric features.
• Texture Projection: Project textures from the original images onto the 3D model to enhance its appearance
and realism.
• Post-Processing: Perform various data cleaning, noise reduction, hole filling, and optimization tasks to
ensure that the 3D model is suitable for its intended application.
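
As a hedged sketch of the point-cloud-to-mesh steps, the following assumes the Open3D library is installed; the random points stand in for registered range data, and the parameter values (normal-estimation radius, Poisson depth, target triangle count) are illustrative, not tuned recommendations.

```python
import numpy as np
import open3d as o3d  # assumes Open3D is installed

points = np.random.rand(2000, 3)           # stand-in for registered range data
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(points)

# Normals are required by Poisson surface reconstruction
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))

# Mesh generation, then simplification for storage/rendering efficiency
mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=8)
mesh = mesh.simplify_quadric_decimation(target_number_of_triangles=5000)
```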
Object Recognition
• Object recognition is a computer vision technique for identifying
objects in images or videos. Object recognition is a key output
of deep learning and machine learning algorithms.
• When humans look at a photograph or watch a video, we can
readily spot people, objects, scenes, and visual details.
• The goal is to teach a computer to do what comes naturally to
humans: to gain a level of understanding of what an image
contains.
Object recognition is a key technology behind driverless cars, enabling them to recognize a stop sign or
to distinguish a pedestrian from a lamppost. It is also useful in a variety of applications such as disease
identification in bioimaging, industrial inspection, and robotic vision.
Object recognition typically consists of several key steps:
1.Object Detection:
• Localization: Determine the location and extent of objects within an image. This is often
done by drawing bounding boxes around the objects.
• Class Labeling: Assign a label or category to each detected object (e.g., "car," "person,"
"dog").
2. Feature Extraction: Extract distinctive features from the detected objects. Common features
include color, texture, shape, and key points.
3. Feature Representation: Transform the extracted features into a suitable format for further
analysis. This often involves creating feature vectors or descriptors that capture the essential
characteristics of the object.
4. Classification: Use machine learning algorithms, such as deep neural networks (e.g.,
CNNs), support vector machines (SVMs), or decision trees, to classify objects based on their
feature representations.
5. Recognition and Decision-Making: Determine the identity of recognized objects based on
the classification results. This may involve associating objects with known object categories or
labels.
6. Post-Processing: Enhance the recognition results by applying additional techniques, such
as non-maximum suppression to remove redundant bounding boxes or smoothing techniques
to improve object tracking.
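
The non-maximum suppression mentioned in step 6 can be sketched directly in NumPy; the boxes, scores, and IoU threshold below are illustrative inputs.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Non-maximum suppression over (N, 4) boxes given as (x1, y1, x2, y2).

    Keeps the highest-scoring box and discards overlapping detections
    whose IoU exceeds iou_thresh.
    """
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]            # best score first
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= iou_thresh]  # drop redundant boxes
    return keep

boxes = np.array([[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 140, 140]], float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # keeps the best of the two overlapping boxes
```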
How Object Recognition Works
• You can use a variety of approaches for object recognition.
Recently, techniques in machine learning and deep
learning have become popular approaches to object recognition
problems. Both techniques learn to identify objects in images,
but they differ in their execution.
Object Recognition Using Deep Learning
• Deep learning techniques have become a popular method for
doing object recognition. Deep learning models such as
convolutional neural networks, or CNNs, are used to
automatically learn an object’s inherent features in order to
identify that object.
• For example, a CNN can learn to identify differences between
cats and dogs by analyzing thousands of training images and
learning the features that make cats and dogs different.
There are two approaches to performing object recognition using deep
learning:
• Training a model from scratch: To train a deep network from scratch, you gather a very large
labeled dataset and design a network architecture that will learn the features and build the model.
The results can be impressive, but this approach requires a large amount of training data, and you
need to set up the layers and weights in the CNN.
• Using a pretrained deep learning model: Most deep learning applications use the transfer
learning approach, a process that involves fine-tuning a pretrained model. You start with an existing
network, such as AlexNet or GoogLeNet, and feed in new data containing previously unknown
classes. This method is less time-consuming and can provide a faster outcome because the model
has already been trained on thousands or millions of images.
Deep learning offers a high level of accuracy but requires a large amount of data to make accurate
predictions.
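
A minimal sketch of the pretrained-model approach, assuming PyTorch and torchvision are available. ResNet-18 is used here as the example backbone; the AlexNet or GoogLeNet networks mentioned above are handled the same way.

```python
import torch.nn as nn
from torchvision import models

# Start from a pretrained CNN and fine-tune only a new classification head
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False          # freeze the pretrained features

num_new_classes = 5                      # previously unknown classes
model.fc = nn.Linear(model.fc.in_features, num_new_classes)
# Training then proceeds as usual, updating only model.fc's weights.
```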
Machine learning techniques for object
recognition
• Machine learning techniques are also popular for object
recognition and offer different approaches than deep learning.
Common examples of machine learning techniques are:
• HOG feature extraction with an SVM machine learning model
• Bag-of-words models with features such as SURF and MSER
(Maximally Stable Extremal Regions)
• The Viola-Jones algorithm, which can be used to recognize a
variety of objects, including faces and upper bodies
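
The first of these, HOG features with an SVM, can be sketched with scikit-image and scikit-learn; the random images and labels below are stand-ins for a real labelled dataset.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
images = rng.random((20, 64, 64))        # stand-in grayscale image windows
labels = rng.integers(0, 2, size=20)     # binary object / non-object labels

# Extract a HOG descriptor per image window
features = np.array([
    hog(img, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    for img in images
])

clf = LinearSVC().fit(features, labels)  # train the SVM on HOG features
print(clf.predict(features[:3]))         # classify new windows
```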
Machine Learning vs. Deep Learning for
Object Recognition
• The main consideration to keep in mind when choosing
between machine learning and deep learning is whether you
have a powerful GPU and lots of labeled training images.
• If the answer to either of these questions is No, a machine
learning approach might be the best choice.
• Deep learning techniques tend to work better with more images,
and a GPU helps to decrease the time needed to train the
model.
Thank you!